KR102407730B1

KR102407730B1 - Method for anomaly behavior detection system using multiple machine learning models, and apparatus for the same

Info

Publication number: KR102407730B1
Application number: KR1020200070140A
Authority: KR
Inventors: 윤정한; 신혁기; 이우묘; 김형천
Original assignee: 한국전자통신연구원
Priority date: 2020-06-10
Filing date: 2020-06-10
Publication date: 2022-06-16
Also published as: KR20210153785A

Abstract

The described embodiment relates to a method for detecting abnormal situations using a plurality of machine learning models, the method comprising: preprocessing training data representing normal situations by dividing it into N pieces of distributed training data based on the number of features of the training data; generating N learning models by inputting the N pieces of distributed training data into N respective learners; generating N prediction errors (costs), each corresponding to the difference between the actual value and the predicted value produced by inputting the training data into the N learning models; redistributing the N prediction errors into M redistributed prediction errors for input into M prediction error analyzers; inputting the M redistributed prediction errors into the M respective prediction error analyzers and storing the prediction error patterns (cost trends) observed during a pattern collection time in a prediction error pattern database; and determining whether a situation is abnormal or normal by comparing the prediction error pattern corresponding to input data with the prediction error patterns in the prediction error pattern database, wherein N and M are each natural numbers of at least 2 and are set independently of each other.

Description

Anomaly detection method using a plurality of machine learning models and an apparatus therefor

The present invention relates to a technique that uses machine learning to learn information that changes over time and uses the prediction results to determine abnormal states.

Many techniques use machine learning to learn time-series information and then use the resulting learning model to determine whether newly input information is normal or abnormal. In the prior art, the learning model outputs a predicted value for the target under evaluation, and the cost, defined as the difference between the predicted value and the actual value, is compared with a predetermined threshold to decide between normal and abnormal.
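The prior-art threshold check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `threshold_detect` and the use of the absolute error as the cost are assumptions, since the text only says the cost is the difference between the predicted and actual values.

```python
import numpy as np

def threshold_detect(predicted, actual, threshold):
    """Return per-step costs and a boolean mask of steps judged abnormal.

    Assumption: the cost is the absolute prediction error at each time step.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    cost = np.abs(predicted - actual)
    # A step is judged abnormal when its cost exceeds the fixed threshold.
    return cost, cost > threshold

# Hypothetical values: only the last step deviates strongly from the prediction.
cost, abnormal = threshold_detect([1.0, 2.0, 3.0], [1.1, 2.0, 5.0], threshold=0.5)
```

With a single fixed threshold, every decision depends on one number, which is why the choice of threshold dominates the accuracy of this approach.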

In this prior art, the accuracy of anomaly determination depends on how the threshold separating normal from abnormal is set. When the normal and abnormal categories are clearly separated, the threshold is easy to choose. However, some normal situations produce a larger cost than some abnormal situations, and some abnormal situations produce a smaller cost than some normal situations, which can cause false positives (normal judged as abnormal) and missed detections (abnormal judged as normal).

To solve this problem, a method may be used that determines the threshold using the pattern of the difference between the predicted and actual values, that is, the prediction error pattern.

Techniques that detect anomalies by learning from normal data consist of three stages: preprocessing the normal data, building a prediction model by learning the preprocessed data, and determining whether an anomaly exists using the difference between the model's predicted values and the actual values (the cost trend). However, when the data to be learned has too many features, machine learning techniques often fail to produce a good learning model. To address this and improve learning and prediction performance, the number of features is often reduced, or the features are divided among a plurality of learning models whose prediction errors are then used to determine whether an anomaly has occurred.

However, judging anomalies with a plurality of learning models in this way can cause the following problems. First, if the data preprocessor of the background art distributes a different number of inputs to each learning model, the optimal data collection time for identifying prediction error patterns (the cost collection period for cost trend pattern analysis) may differ from model to model. The timing of anomaly judgments and the data on which they are based may then differ between learning models, making an integrated judgment difficult. Second, when the number of inputs applied to a single learning model is too small or too large, the nearest-neighbor distance used in the background art cannot accurately determine the similarity of prediction error patterns. With too many inputs, too many variations can occur even in normal situations, so it is difficult to collect all normal situations and false positives increase; with too few inputs, too few forms can be represented, so normal and abnormal become difficult to distinguish and missed detections increase. Third, when the training data is split up in the learning stage, each anomaly determiner judges anomalies using only the prediction errors it receives as input. This makes it difficult to judge anomalies using patterns across the separated features. For example, if an anomaly judgment depends on the relationship among features A and B assigned to learner 1, feature C assigned to learner 2, and feature D assigned to learner n, or if that relationship must be analyzed when an anomaly occurs, the analysis is difficult to perform unless a learning model grouped that way exists. A new learning model containing features A, B, C, and D would have to be configured and trained, which takes a long time, and a model composed of those features may not train well.

Accordingly, for methods that use a plurality of learning models and judge abnormal situations from prediction error patterns, a way of solving the above problems is needed.

Korean Patent No. 10-1888683 (2018.08.14)

An object of the present invention is to provide a method, and an apparatus therefor, for redistributing prediction errors in order to improve anomaly determination performance in an anomaly determination process that uses the patterns of the prediction errors generated by individual learning models (prediction models).

Another object of the present invention is to provide a method that maximizes the anomaly determination capability of the entire anomaly detection system by using a data preprocessor to maximize the learning and prediction capability of the learning models with normal data only, while redistributing the prediction errors through a prediction error distributor.

A further object of the present invention is to provide a method that can respond efficiently to new attack patterns and incident investigations by creating the prediction error group needed to detect a given attack pattern and adding a prediction error analyzer for it, without modifying or adding learning models for each attack pattern.

A method for detecting abnormal situations using a plurality of machine learning models according to an embodiment includes: preprocessing training data representing normal situations by dividing it into N pieces of distributed training data based on the number of features of the training data; generating N learning models by inputting the N pieces of distributed training data into N respective learners; generating N prediction errors (costs), each corresponding to the difference between the actual value and the predicted value produced by inputting the training data into the N learning models; redistributing the N prediction errors into M redistributed prediction errors for input into M prediction error analyzers; inputting the M redistributed prediction errors into the M respective prediction error analyzers and storing the prediction error patterns (cost trends) observed during a pattern collection time in a prediction error pattern database; and determining whether a situation is abnormal or normal by comparing the prediction error pattern corresponding to input data with the prediction error patterns in the prediction error pattern database, wherein N and M are each natural numbers of at least 2 and are set independently of each other.

Here, redistributing into the M redistributed prediction errors may include: determining the minimum number of features that a prediction error analyzer needs for prediction error analysis; and determining M, based on the total number of features included in the N prediction errors divided by the minimum number of features, so that each redistributed prediction error contains the same number of input features.
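One possible reading of this rule can be sketched as below. The text does not say how to handle totals that do not divide evenly, so this sketch assumes M is the largest value of at least 2 that divides the total feature count while keeping every group at or above the minimum size. The function name `choose_m` and the example counts are hypothetical.

```python
def choose_m(feature_counts, min_features):
    """Return (M, features_per_group) for equal-sized redistributed groups.

    feature_counts: number of features in each of the N prediction errors.
    min_features: minimum number of features one analyzer needs.
    Assumption: pick the largest M >= 2 that divides the total evenly.
    """
    if min_features < 1:
        raise ValueError("min_features must be at least 1")
    total = sum(feature_counts)
    upper = total // min_features      # largest M that keeps groups >= min_features
    for m in range(upper, 1, -1):      # count down to 2, since M must be at least 2
        if total % m == 0:             # equal group sizes need M to divide the total
            return m, total // m
    raise ValueError("no M >= 2 gives equal groups of at least min_features")

# Hypothetical case: N = 3 learners with 5, 7 and 12 features, analyzers need >= 5.
m, per_group = choose_m([5, 7, 12], min_features=5)
```

Note that M here is derived only from the feature counts and the analyzer's minimum, which is consistent with N and M being set independently of each other.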

Here, redistributing into the M redistributed prediction errors may further include: generating a set of prediction errors by randomly shuffling the N prediction errors; and redistributing the set of prediction errors into M redistributed prediction errors.
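A minimal sketch of random redistribution follows, assuming each of the N prediction errors is a NumPy array of shape (time steps, features) and that shuffling happens at the level of individual feature columns. The function name `redistribute` and the fixed seed are illustrative choices, not part of the source.

```python
import numpy as np

def redistribute(errors, m, seed=0):
    """Shuffle the feature columns of N prediction errors and split them into M groups.

    errors: list of N arrays, each shaped (time_steps, n_features_i).
    Returns the M redistributed arrays and the column indices in each group.
    """
    stacked = np.concatenate(errors, axis=1)     # one array with all feature columns
    rng = np.random.default_rng(seed)
    order = rng.permutation(stacked.shape[1])    # random order of the columns
    groups = np.array_split(order, m)            # M groups, sizes differ by at most 1
    return [stacked[:, g] for g in groups], groups

# Hypothetical case: N = 2 learners (3 and 5 features) redistributed into M = 2 groups.
errors = [np.zeros((4, 3)), np.ones((4, 5))]
parts, groups = redistribute(errors, m=2)
```

Keeping the returned column indices makes it possible to trace an analyzer's judgment back to the original features.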

Here, redistributing into the M redistributed prediction errors may further include: analyzing the features corresponding to the prediction errors in the set of prediction errors; grouping together related features, among the prediction error features, that are advantageous for determining whether an anomaly exists; and additionally generating a redistributed prediction error containing the related features.

Here, determining whether the situation is abnormal or normal may include: generating a target prediction error pattern by extracting the prediction error pattern corresponding to the input data over the pattern collection time; comparing the target prediction error pattern with the prediction error patterns in the prediction error pattern database; determining that the situation is normal if at least one of the prediction error patterns is judged similar to the target prediction error pattern; and determining that the situation is abnormal if none of the prediction error patterns is judged similar to the target prediction error pattern.

Here, comparing with the prediction error patterns in the prediction error pattern database may include: generating, from among the prediction error patterns, the neighbor prediction error pattern most similar to the target prediction error pattern; calculating a difference value between the target prediction error pattern and the neighbor prediction error pattern using a difference function; determining that the patterns are similar if the difference value is less than or equal to a difference threshold; and determining that they are not similar if the difference value is greater than the difference threshold.

Here, the difference function may be the Euclidean distance function.
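The nearest-neighbor comparison with a Euclidean difference function can be sketched as follows. The function name `is_similar`, the flattening of each pattern into a vector, and the example values are assumptions made for illustration.

```python
import numpy as np

def is_similar(target, stored_patterns, diff_threshold):
    """Compare a target pattern with its nearest stored pattern.

    Returns (similar, nearest_index, nearest_distance). The target is judged
    similar (normal) when the Euclidean distance to its nearest stored
    pattern is less than or equal to diff_threshold.
    """
    target = np.asarray(target, dtype=float).ravel()
    stored = np.asarray(stored_patterns, dtype=float)
    stored = stored.reshape(len(stored), -1)           # one flattened row per pattern
    distances = np.linalg.norm(stored - target, axis=1)
    nearest = int(np.argmin(distances))
    return bool(distances[nearest] <= diff_threshold), nearest, float(distances[nearest])

# Hypothetical database of two stored normal patterns and one target pattern.
stored = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
similar, index, distance = is_similar([1.0, 1.0, 2.0], stored, diff_threshold=1.0)
```

Because the nearest pattern has the smallest distance, testing it against the threshold gives the same answer as checking whether any stored pattern lies within the threshold.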

Here, comparing with the prediction error patterns in the prediction error pattern database may include: calculating difference values between the target prediction error pattern and each of the prediction error patterns using a difference function; determining that they are similar if at least one of the difference values is less than or equal to a difference threshold; and determining that they are not similar if all of the difference values are greater than the difference threshold.

Here, determining whether the situation is abnormal or normal may further include making a final determination of normal or abnormal by fusing the determination results of the M prediction error analyzers.
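The text does not specify the fusion rule. As one hedged possibility, the sketch below uses a vote count: the final result is abnormal when at least `min_votes` analyzers report abnormal. Both the function name `fuse_decisions` and the voting rule are assumptions, not the patent's stated method.

```python
def fuse_decisions(analyzer_flags, min_votes=1):
    """Fuse the M analyzers' results into one final decision.

    analyzer_flags: iterable of booleans, True meaning that analyzer judged abnormal.
    Assumption: the final result is abnormal when at least min_votes analyzers agree.
    """
    votes = sum(bool(flag) for flag in analyzer_flags)
    return votes >= min_votes

# With min_votes=1, a single abnormal analyzer is enough to raise an alarm.
final_abnormal = fuse_decisions([False, True, False])
```

A low `min_votes` favors fewer missed detections, while a higher value favors fewer false positives.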

Here, preprocessing by dividing into the N pieces of distributed training data may include: analyzing the features included in the training data; grouping together related features, among the features, that are optimized for learning; and generating one piece of distributed training data from the training data containing the related features.

An anomaly detection learning apparatus using a plurality of machine learning models according to an embodiment may include: a processor that preprocesses training data representing normal situations by dividing it into N pieces of distributed training data based on the number of features of the training data, generates N learning models by inputting the N pieces of distributed training data into N respective learners, generates N prediction errors (costs), each corresponding to the difference between the actual value and the predicted value produced by inputting the training data into the N learning models, redistributes the N prediction errors into M redistributed prediction errors for input into M prediction error analyzers, and inputs the M redistributed prediction errors into the M respective prediction error analyzers to store the prediction error patterns (cost trends) observed during a pattern collection time in a prediction error pattern database, wherein N and M are each at least 2 and are set independently of each other; and a memory that stores at least one of the N prediction errors and the M redistributed prediction errors.

Here, the processor may determine the minimum number of features that a prediction error analyzer needs for prediction error analysis, and may determine M, based on the total number of features included in the N prediction errors divided by the minimum number of features, so that each redistributed prediction error contains the same number of input features.

Here, the processor may generate a set of prediction errors by randomly shuffling the N prediction errors and redistribute the set of prediction errors into M redistributed prediction errors.

Here, the processor may analyze the features corresponding to the prediction errors in the set of prediction errors, group together related features, among the prediction error features, that are advantageous for determining whether an anomaly exists, and additionally generate a redistributed prediction error containing the related features.

Here, the processor may analyze the features included in the training data, group together related features, among the features, that are optimized for learning, and generate one piece of distributed training data from the training data containing the related features.

An anomaly detection determination apparatus using a plurality of machine learning models according to an embodiment may include: a processor that inputs input data into the anomaly detection learning apparatus, generates a target prediction error pattern by extracting the prediction error pattern corresponding to the input data over the pattern collection time, compares the target prediction error pattern with the prediction error patterns in the prediction error pattern database of the anomaly detection learning apparatus, determines that the situation is normal if at least one of the prediction error patterns is judged similar to the target prediction error pattern, and determines that the situation is abnormal if none of the prediction error patterns is judged similar to the target prediction error pattern; and a memory that stores the target prediction error pattern.

Here, the processor may generate, from among the prediction error patterns, the neighbor prediction error pattern most similar to the target prediction error pattern, calculate a difference value between the target prediction error pattern and the neighbor prediction error pattern using a difference function, determine that the patterns are similar if the difference value is less than or equal to a difference threshold, and determine that they are not similar if the difference value is greater than the difference threshold.

Here, the processor may compare the target prediction error pattern with the prediction error patterns in the prediction error pattern database by calculating difference values between the target prediction error pattern and each of the prediction error patterns using a difference function, determining that they are similar if at least one of the difference values is less than or equal to a difference threshold, and determining that they are not similar if all of the difference values are greater than the difference threshold.

Here, the processor may make a final determination of normal or abnormal by fusing the determination results of the M prediction error analyzers.

The present invention can provide a method, and an apparatus therefor, for redistributing prediction errors in order to improve anomaly determination performance in an anomaly determination process that uses the patterns of the prediction errors generated by individual learning models (prediction models).

In addition, the present invention can provide a method that maximizes the anomaly determination capability of the entire anomaly detection system by using a data preprocessor to maximize the learning and prediction capability of the learning models with normal data only, while redistributing the prediction errors through a prediction error distributor.

In addition, the present invention can provide a method that can respond efficiently to new attack patterns and incident investigations by creating the prediction error group needed to detect a given attack pattern and adding a prediction error analyzer for it, without modifying or adding learning models for each attack pattern.

FIG. 1 is a block diagram illustrating an example of an abnormal situation detection system 100 according to an embodiment.
FIG. 2(a) is a diagram showing the general concept of an abnormal situation detection method using predicted values.
FIG. 2(b) is a diagram showing the concept of an abnormal situation detection method using a threshold.
FIG. 3 is a diagram illustrating an example of the operation of the anomaly determiner 160 shown in FIG. 1.
FIG. 4 is a block diagram illustrating an example in which an abnormal situation detection system 400 according to an embodiment determines anomalies using a plurality of learning models.
FIG. 5 is a block diagram of an example of an anomaly detection learning apparatus using a plurality of machine learning models according to an embodiment.
FIG. 6 is a block diagram of an example of an anomaly detection determination apparatus using a plurality of machine learning models according to an embodiment.
FIG. 7 is a flowchart illustrating an example of the operation of an anomaly detection learning apparatus using a plurality of machine learning models according to an embodiment.
FIG. 8 is a flowchart illustrating an example of the operation of an anomaly detection determination apparatus using a plurality of machine learning models according to an embodiment.
FIG. 9 is a diagram showing the configuration of a computer system according to an embodiment.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

Although terms such as "first" and "second" are used to describe various components, the components are not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may also be a second component within the technical idea of the present invention.

The terminology used in this specification is for describing the embodiments and is not intended to limit the present invention. In this specification, singular forms include plural forms unless the context specifically states otherwise. As used in this specification, "comprises" or "comprising" means that the stated component or step does not exclude the presence or addition of one or more other components or steps.

Unless otherwise defined, all terms used in this specification may be interpreted with the meanings commonly understood by those of ordinary skill in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless expressly and specifically defined.

Hereinafter, a method for detecting abnormal situations using a plurality of machine learning models according to an embodiment, and an apparatus therefor, are described in detail with reference to FIGS. 1 to 8.

FIG. 1 is a block diagram illustrating an example of an abnormal situation detection system 100 according to an embodiment.

Referring to FIG. 1, the abnormal situation detection system 100 operates in two stages, learning and determination. In the learning stage, the system learns from training data 110 to produce a learning model. In the determination stage, the system receives input data 120, feeds it to the learning model, and uses the result to output an anomaly determination result 170. The abnormal situation detection system 100 may include a data preprocessor 130, a learner 140, a learning model 150, and an anomaly determiner 160.

The learning operation of the abnormal situation detection system 100 may be broadly divided into the following four steps.

In the first learning step, the learner 140 may generate the learning model 150 using a machine learning technique. Here, the learning model 150 is the result of learning time-series information using machine learning.

In the second learning step, a threshold for the cost (or cumulative cost; hereinafter 'cost') may be determined based on test data, that is, data for which the correct normal/abnormal label is known. By running a normal/abnormal prediction test on the test data, the cost threshold TH0 may be determined while ignoring situations that are exceptional from the learning model's point of view (normal with a high cost, or abnormal with a low cost).
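The text does not state how TH0 is computed. One hedged way to "ignore exceptional situations" is to pick the threshold that misclassifies the fewest labeled test samples, leaving the remaining exceptions to the later steps. The function name `choose_th0` and the selection criterion are assumptions for illustration.

```python
import numpy as np

def choose_th0(costs, is_abnormal):
    """Pick a cost threshold TH0 from labeled test data.

    costs: cost of each test sample; is_abnormal: True for abnormal samples.
    Assumption: choose the candidate (taken from the observed costs) that
    minimizes the number of misclassified samples, where cost > TH0 means abnormal.
    Returns (th0, number_of_remaining_exceptions).
    """
    costs = np.asarray(costs, dtype=float)
    labels = np.asarray(is_abnormal, dtype=bool)
    best_th, best_errors = None, None
    for th in np.unique(costs):                    # candidates in ascending order
        errors = int(np.sum((costs > th) != labels))
        if best_errors is None or errors < best_errors:
            best_th, best_errors = float(th), errors
    return best_th, best_errors

# Hypothetical test set: the last sample is abnormal but has a low cost.
th0, exceptions = choose_th0(
    [0.1, 0.2, 0.9, 0.3, 1.2, 1.5, 0.25],
    [False, False, False, False, True, True, True],
)
```

The samples that remain misclassified at the chosen TH0 are exactly the exceptional cases that the third and fourth learning steps handle through cost trends.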

In the third learning step, a threshold-over cost trend storing operation may be performed. If, as a result of the second learning step, the cost corresponding to test data representing a normal situation is greater than the threshold, the cost trend (pattern information) over a user-defined first time may be stored in the threshold-over cost trend database 230. A first function for calculating the difference between cost trends (diff function, e.g., Euclidean distance) may then be determined. Using this diff function, a threshold TH1 for the cost trend difference (the cost trend difference limit) may be determined.

In the fourth learning step, a threshold-under cost trend storing operation may be performed. If, as a result of the second learning step, the cost corresponding to test data representing an abnormal situation is less than the threshold, the cost trend (pattern information) over a user-defined second time may be stored in the threshold-under cost trend database 240. Here, the second time may be the same as or different from the first time of the third learning step. A second function for calculating the difference between cost trends (diff function, e.g., Euclidean distance) may then be determined. Using this diff function, a threshold TH2 for the cost trend difference (the cost trend difference limit) may be determined. The second diff function may be chosen to be different from the first diff function.

That is, through the four-step learning operation of the learner 140, the learning model 150, the threshold-over cost trend database, and the threshold-under cost trend database are generated, and the thresholds TH0, TH1, and TH2 that serve as the judgment criteria of the abnormality determiner 160 are determined.

The judgment process subsequently performed by the abnormality determiner 160 may consist of the following four steps.

In the first judgment step, the input data 120 monitored in real time is passed through the data preprocessor 130 and input to the learning model 150 to calculate a predicted value.

In the second judgment step, normal and abnormal are distinguished using the cost threshold TH0 determined in the second learning step.

In the third judgment step, if the input data to be judged (the target) is determined to be abnormal as a result of the second judgment step, the abnormal situation determination may proceed as follows. A first target cost trend, which is the cost trend of the target's measured values over a user-defined period (for example, the first time), is extracted. Then, among the threshold-over cost trends stored in the third learning step, the nearest neighbor (the first neighbor cost trend), that is, the cost trend most similar to the first target cost trend, is searched for using the first diff function. If the first diff function is the Euclidean distance function, the Euclidean distance between the first target cost trend and every cost trend in the threshold-over cost trend database 230 is calculated, and the cost trend with the smallest distance becomes the first neighbor cost trend. The cost trend difference between the first target cost trend and the retrieved nearest neighbor, that is, the Euclidean distance between the two cost trends, is then calculated. If the calculated value is greater than the threshold TH1 determined in the learning stage, the target may be finally determined to be abnormal. Conversely, if the calculated value is smaller than TH1, the target may be finally determined to be normal.

In the fourth judgment step, if the input data to be judged (the target) is determined to be normal as a result of the second judgment step, the abnormal situation determination may proceed as follows. A second target cost trend, which is the cost trend of the target's measured values over a user-defined period (for example, the second time), is extracted. Then, among the threshold-under cost trends stored in the fourth learning step, the nearest neighbor (the second neighbor cost trend), that is, the cost trend most similar to the second target cost trend, may be searched for using the second diff function. If the second diff function is the Euclidean distance function, the Euclidean distance between the second target cost trend and every cost trend in the threshold-under cost trend database is calculated, and the cost trend with the smallest distance becomes the nearest neighbor. The cost trend difference between the second target cost trend and the retrieved nearest neighbor, that is, the Euclidean distance between the two cost trends, is then calculated. If the calculated value is greater than the threshold TH2 determined in the learning stage, the target may be finally determined to be abnormal. Conversely, if the calculated value is smaller than TH2, the target may be finally determined to be normal.
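The two-stage judgment described in the second to fourth judgment steps can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names (`euclidean`, `nearest_distance`, `judge`) are hypothetical, both databases are assumed to be non-empty lists of equal-length cost trends, and a distance exactly equal to TH1 or TH2 is treated as normal, which the text leaves open.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length cost trends."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_distance(target, trends, diff=euclidean):
    """Distance from the target cost trend to its nearest neighbor in a database."""
    return min(diff(target, t) for t in trends)

def judge(cost, target_trend, over_db, under_db, th0, th1, th2):
    """Apply TH0 to the cost, then TH1 or TH2 to the nearest-neighbor distance."""
    if cost > th0:
        # Second step says "abnormal": compare with the threshold-over trends.
        distance, limit = nearest_distance(target_trend, over_db), th1
    else:
        # Second step says "normal": compare with the threshold-under trends.
        distance, limit = nearest_distance(target_trend, under_db), th2
    # Assumption: equality counts as normal; the text only covers > and <.
    return "abnormal" if distance > limit else "normal"
```

A different diff function can be passed to `nearest_distance` for each database, matching the statement that the first and second diff functions may differ.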

FIG. 2 is a diagram showing the concept of a conventional method for detecting an abnormal situation using machine learning.

FIG. 2(a) shows the concept of a general abnormal situation detection method using predicted values. Many techniques exist that learn time-series information using machine learning and use the learned result (the learning model) to determine whether newly input information is normal or abnormal. As shown in FIG. 3, the prior art uses the learning model to output a predicted value for the target being judged, compares the predicted value with the actually measured value, and detects anomalous behavior when the difference between the two is large. To obtain this difference, a function that calculates the difference between the predicted value and the measured value (a cost function) may be defined, as in FIG. 2(b).

FIG. 2(b) shows the concept of an abnormal situation detection method using a threshold, that is, a method that applies a threshold to the output of the difference function. Referring to FIG. 2(b), the cost function reduces the difference between the predicted value and the measured value, Diff(measured value, predicted value), to a single real number (the cost). A threshold is set as the criterion for judging normal/abnormal, and the situation is judged abnormal when the cost is greater than the threshold. When it is difficult to judge from a single cost, the situation may instead be judged abnormal when the cumulative cost over a user-defined period is greater than a user-defined threshold.
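The conventional threshold scheme of FIG. 2(b) can be sketched as below. This is only an illustration under stated assumptions: the squared-error cost, the class name `CumulativeCostDetector`, and the sliding-window form of the "user-defined period" are choices made here, not details given in the text.

```python
from collections import deque

def cost(actual, predicted):
    """Reduce the actual/predicted difference to one real number.
    Squared error is an assumption; the text does not fix the cost function."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

class CumulativeCostDetector:
    """Flags abnormal when the cost summed over the last `window` samples
    exceeds `threshold`. With window=1 this is the single-cost rule."""

    def __init__(self, threshold, window=1):
        self.threshold = threshold
        self.costs = deque(maxlen=window)  # keeps only the most recent costs

    def update(self, actual, predicted):
        self.costs.append(cost(actual, predicted))
        return sum(self.costs) > self.threshold
```

For example, with `threshold=1.0` and `window=3`, three consecutive costs of 0.25 stay below the threshold, while a following cost of 1.0 pushes the windowed sum to 1.5 and the sample is flagged.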

FIG. 3 is a diagram showing an example of the operation of the abnormality determiner 160 shown in FIG. 1.

FIG. 3 shows an example of the operation by which the abnormality determiner 160 determines an abnormality. First, training data representing a normal situation is input to the learner to generate a learning model, and a prediction error (cost) corresponding to the difference between the measured value and the predicted value obtained by inputting the training data into the learning model may be generated. The prediction errors corresponding to the training data representing the normal situation may then be stored in a prediction error pattern database. The prediction error pattern 300 corresponding to the input data is compared with the normal-situation prediction error patterns 310, 320, 330, and 340 in the prediction error pattern database to determine whether the situation is abnormal or normal.

That is, the abnormality determiner 160 extracts the prediction error pattern corresponding to the input data over the pattern collection time to generate a target prediction error pattern 300, and compares the target prediction error pattern with the prediction error patterns 310, 320, 330, and 340 in the prediction error pattern database. If at least one of the stored prediction error patterns is judged to be similar to the target prediction error pattern, the situation is judged to be normal; if none of them is judged to be similar, the situation may be judged to be abnormal.

In addition, to perform the comparison with the prediction error patterns in the prediction error pattern database, the abnormality determiner 160 may obtain, from among those patterns, the neighbor prediction error pattern most similar to the target prediction error pattern, and calculate a difference value between the target prediction error pattern and the neighbor prediction error pattern using the difference function. If the difference value is less than or equal to a difference threshold, the patterns are judged to be similar; if the difference value is greater than the difference threshold, they are judged not to be similar. The difference function may be the Euclidean distance function.

Accordingly, in FIG. 3, the second prediction error pattern 320 in the figure may be selected as the neighbor prediction error pattern most similar to the target prediction error pattern 300. The difference value between the target prediction error pattern 300 and the neighbor prediction error pattern 320 may then be calculated with the Euclidean distance function. If the difference value is greater than the difference threshold, the patterns are judged not to be similar and the final result is abnormal; if the difference value is smaller than the difference threshold, the patterns are judged to be similar and the final result is normal. The difference threshold is a value expressing the required degree of similarity and may be determined by the user or by the system.

FIG. 4 is a block diagram showing an example in which an abnormal situation detection system 400 according to an embodiment determines an abnormality using a plurality of learning models.

A method of determining an abnormality by learning from training data that represents a normal situation consists of three steps: preprocessing the normal data, developing a predictive model by learning the preprocessed data, and determining whether an abnormality exists using the difference between the predictive model's predicted values and the measured values (the cost trend).

If the data to be learned has too many features, machine learning techniques often fail to produce a good learning model. To address this and improve learning and prediction performance, the features of the data are in many cases reduced or split, the data is learned using a plurality of learning models, and the prediction errors of those models are used to determine whether an abnormality has occurred.

FIG. 4 shows a representative configuration of this approach. The purpose of the data preprocessor 440 is to enable the learning models to learn well and therefore predict well.

Referring to FIG. 4, training data 420 representing a normal situation is analyzed by a data analyzer 430. Based on the analysis result of the data analyzer 430, the data preprocessor 440 may divide the training data into N pieces of distributed training data. The N pieces of distributed training data are input to N learners 450-1 to 450-N and learned, and N learning models 452-1 to 452-N are produced as a result. For real-time input data 410, N abnormality determiners 460-1 to 460-N use the N learning models 452-1 to 452-N to generate N prediction errors 465-1 to 465-N, and determine abnormality based on those N prediction errors. An integrated determiner 470 then combines the N abnormality determination results to generate a final abnormality determination result 480.

Here, each learner generates a prediction error with the same or a similar dimension as its distributed training data. For example, a model trained on training data consisting of 5 features predicts those 5 features and generates a prediction error for each of them, so the prediction error also consists of 5 features.
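As a small illustration of why the prediction error has the same number of features as the model's input, consider the sketch below. The helper name `prediction_error` and the use of the absolute difference are assumptions made for this example; the text does not specify how the per-feature error is computed.

```python
def prediction_error(actual, predicted):
    """Per-feature prediction error: one value for each predicted feature.
    The absolute difference is an assumption made for illustration."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same number of features")
    return [abs(a - p) for a, p in zip(actual, predicted)]
```

A model that predicts 5 features therefore yields a 5-element error vector, and a model that predicts 20 features yields a 20-element one.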

Because the purpose of the data preprocessor 440 for the learning models is to make learning work well, it may distribute the training data so that each learner 451-1 to 451-N receives data with a different number of features. For example, learner 1 may perform learning and prediction on data consisting of 5 features, while learner 2 does so on data consisting of 20 features.

However, when abnormality is determined using prediction error patterns in a structure such as that of FIG. 4, the following problems may arise.

First, if the data preprocessor distributes a different number of inputs to each learning model, the optimal data collection time for identifying prediction error patterns (the cost collection period used for cost trend pattern analysis) may differ from one learning model to another. In that case, both the time at which an anomaly is judged and the data on which the judgment is based may differ between learning models, which makes an integrated judgment difficult.

Second, if the number of inputs applied to one learning model is too small or too large, it may be difficult to judge the similarity of prediction error patterns accurately from the distance to the most similar prediction error pattern (the nearest neighbor). If the number of inputs is too large, so many variations can occur even in normal situations that it is hard to collect all normal cases, and false detections (normal judged as abnormal) increase. If the number of inputs is too small, too few shapes can be expressed to tell normal from abnormal, and missed detections (abnormal judged as normal) may increase.

Third, when the training data is split for learning as in FIG. 4, each abnormality determiner judges abnormality using only the prediction error that is its own input. It may then be difficult to judge abnormality using patterns that span the separated features. For example, suppose a judgment is needed on the relationship among features A and B assigned to learner 1, feature C of learner 2, and feature D of learner n, or that the relationship among features A, B, C, and D is to be analyzed when an anomaly occurs. If no learning model groups the features that way, the analysis is hard to perform. A new learning model containing features A, B, C, and D would have to be configured and trained, which takes a long time, and a learning model composed of those features might not train well.

FIG. 5 is a block diagram of an example of an anomaly detection learning apparatus using a plurality of machine learning models according to an embodiment.

FIG. 5 shows a structure in which a prediction error distributor 540 is added to solve the problems of the structure of FIG. 4. Referring to FIG. 5, training data 500 representing a normal situation is analyzed by a data analyzer 510. Based on the analysis result of the data analyzer 510, a data preprocessor 515 may divide the training data into N pieces of distributed training data. The N pieces of distributed training data are input to N learners 520-1 to 520-N and learned, and N learning models are produced as a result. Up to this point the structure is the same as in FIG. 4.

Unlike FIG. 4, in FIG. 5 the prediction error distributor 540 redistributes the N prediction errors 530-1 to 530-N generated by the N learners 520-1 to 520-N into M prediction errors 550-1 to 550-M, and M prediction error pattern databases 560-1 to 560-M may likewise be generated.

That is, the data preprocessor 515 of the anomaly detection learning apparatus using a plurality of machine learning models according to the embodiment divides training data representing a normal situation into N pieces of distributed training data based on the number of features of the training data, and preprocesses them. The N pieces of distributed training data are input to the N learners 520-1 to 520-N, respectively, to generate N learning models, and N prediction errors 530-1 to 530-N corresponding to the differences between the measured values and the predicted values obtained by inputting the training data into the N learning models are generated. The prediction error distributor 540 redistributes the N prediction errors into M redistributed prediction errors 550-1 to 550-M so that they can be input to M prediction error analyzers. The M redistributed prediction errors are input to the respective M prediction error analyzers, and the prediction error patterns (cost trends) over the pattern collection time may be stored in the prediction error pattern databases 560-1 to 560-M. N and M are each at least 2 and may be set independently of each other.

The prediction error distributor 540 redistributes the prediction errors so that prediction error analysis can be performed effectively, which provides the following advantages.

First, making the number of features in each prediction error group the same allows abnormality to be judged over the same time window (w). For example, suppose the training data consists of 60 inputs and 20 learning models are built, each taking 5 inputs (the same input may be used by more than one learning model). If 5 features are too few for prediction error analysis, the prediction error distributor collects the 100 prediction errors (5 × 20) and redistributes them in groups of 10. Analysis and anomaly judgment based on prediction errors are thereafter performed on the redistributed prediction errors.
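The regrouping in this example can be sketched as follows. This is a minimal sketch under assumptions: the function name `redistribute` is hypothetical, each model's prediction error is represented as a plain list of per-feature values for one time step, and groups are formed in order, since the text does not prescribe a particular assignment here.

```python
def redistribute(errors_per_model, group_size):
    """Flatten N per-model error vectors and regroup them into M equal-size groups."""
    flat = [e for errors in errors_per_model for e in errors]
    if len(flat) % group_size != 0:
        raise ValueError("total number of error features must be divisible by group_size")
    return [flat[i:i + group_size] for i in range(0, len(flat), group_size)]

# 20 models with 5 error features each give 100 values, regrouped 10 at a time.
errors_per_model = [[0.0] * 5 for _ in range(20)]
groups = redistribute(errors_per_model, group_size=10)  # M = 10 groups of 10 features
```

Here N = 20 and M = 10 are independent: choosing a different `group_size` changes M without touching the learners.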

Second, randomly mixing the prediction error features before distributing them can favor statistical analysis of the prediction errors. When the prediction errors are independent of one another, it is easier to analyze their patterns statistically, for example by using the chi-square distribution. Because the prediction error features computed by a single learning model may be correlated with one another, mixing them randomly reduces the correlation between features within a group and can improve the performance of statistics-based abnormality judgment.
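One way to picture the random mixing and the chi-square idea is the sketch below. It rests on several assumptions not stated in the text: the permutation is drawn once with a fixed seed so that the learning and judgment stages share the same grouping, each error feature is standardized with a mean and standard deviation estimated from normal data, and the standardized errors are treated as independent and approximately normal, in which case their sum of squares approximately follows a chi-square distribution with as many degrees of freedom as the group has features. All function names are hypothetical.

```python
import random

def make_permutation(num_features, seed=0):
    """Draw one random feature order; reuse it at learning and judgment time."""
    order = list(range(num_features))
    random.Random(seed).shuffle(order)
    return order

def shuffle_and_group(flat_errors, order, group_size):
    """Reorder the flattened error features, then cut them into groups."""
    mixed = [flat_errors[i] for i in order]
    return [mixed[i:i + group_size] for i in range(0, len(mixed), group_size)]

def chi_square_statistic(group, means, stds):
    """Sum of squared standardized errors for one group of error features."""
    return sum(((e - m) / s) ** 2 for e, m, s in zip(group, means, stds))
```

The statistic for a group could then be compared against a chosen quantile of the chi-square distribution; how that quantile is selected is outside what the text describes.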

Third, if during operation of the anomaly detection system the relationship among certain features is found to be useful for deciding whether an abnormality exists, that relationship can be monitored simply by bundling those features into a new prediction error group and adding it to a prediction error analyzer, without building a new learner or learning model. The learning models can thus be configured to be optimal for predicting each feature, while as many prediction error analyzers as needed are used in the abnormality judgment stage, which also makes this approach useful for effectively ensembling the prediction results of the individual learning models.
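The idea of adding a new prediction error group without retraining can be sketched as below. This is an illustration only: `build_group` is a hypothetical helper, and it assumes the per-feature prediction errors already produced by the existing learning models are available in a dictionary keyed by feature name.

```python
def build_group(errors_by_feature, feature_names):
    """Assemble a new prediction error group from errors that already exist.
    No learner or learning model is created or retrained."""
    missing = [f for f in feature_names if f not in errors_by_feature]
    if missing:
        raise KeyError(f"no prediction error available for: {missing}")
    return [errors_by_feature[f] for f in feature_names]

# Errors for A and B might come from learner 1, C from learner 2, D from learner n.
errors_by_feature = {"A": 0.1, "B": 0.4, "C": 0.0, "D": 1.2, "E": 0.3}
abcd_group = build_group(errors_by_feature, ["A", "B", "C", "D"])
```

The resulting group can be handed to an additional prediction error analyzer in the same way as the redistributed groups.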

FIG. 6 is a block diagram of an example of an anomaly detection judgment apparatus using a plurality of machine learning models according to an embodiment.

FIG. 6 likewise shows a structure in which the prediction error distributor 540 is added to solve the problems of the structure of FIG. 4. Unlike FIG. 4, in FIG. 6 the judgment is made on a prediction error pattern basis by M prediction error pattern-based determiners 650-1 to 650-M, using the M prediction errors produced by the prediction error distributor 630 rather than the N prediction errors generated by the learners. An integrated determiner 660 then combines the M pattern-based judgment results to generate the final abnormality determination result.

That is, real-time input data 600 is input to the anomaly detection judgment apparatus using a plurality of machine learning models according to the embodiment. The input data is input to the N learning models of the N learners 610-1 to 610-N, producing N prediction errors 620-1 to 620-N. The N prediction errors 620-1 to 620-N pass through the prediction error distributor 630 and are redistributed into M redistributed prediction errors 640-1 to 640-M. From the M redistributed prediction errors 640-1 to 640-M, M target prediction error patterns are generated by extracting the prediction error patterns corresponding to the input data over the pattern collection time, and each of the M prediction error pattern-based determiners 650-1 to 650-M compares its target prediction error pattern with the prediction error patterns in the prediction error pattern database. If at least one of the stored prediction error patterns is judged to be similar to the target prediction error pattern, the situation is judged to be normal; if none of them is judged to be similar, the situation is judged to be abnormal. The integrated determiner 660 then integrates these judgment results to generate the final determination result.

The prediction error pattern determiner 650 may identify, among the stored prediction error patterns, the neighbor prediction error pattern most similar to the target prediction error pattern, and compute a difference value between the target prediction error pattern and the neighbor prediction error pattern using a difference function. If the difference value is less than or equal to a difference threshold, the patterns are determined to be similar; if the difference value is greater than the difference threshold, they are determined to be not similar. The difference function may be the Euclidean distance function.

Alternatively, the prediction error pattern determiner 650 may compare the target prediction error pattern with every prediction error pattern stored in the prediction error pattern database, computing a difference value for each stored pattern using the difference function. If at least one of the difference values is less than or equal to the difference threshold, the patterns are determined to be similar; if all of the difference values are greater than the difference threshold, they are determined to be not similar.
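The two comparison variants described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each prediction error pattern is a flat list of numbers of equal length, and the function names (`distances_to_stored`, `is_similar_nearest`, `is_similar_any`) are hypothetical.

```python
import math


def distances_to_stored(target, stored_patterns):
    """Euclidean distance from the target pattern to each stored pattern."""
    return [math.dist(target, pattern) for pattern in stored_patterns]


def is_similar_nearest(target, stored_patterns, threshold):
    """Variant 1: pick the nearest stored pattern, then threshold its distance."""
    distances = distances_to_stored(target, stored_patterns)
    if not distances:  # assumption: an empty database has nothing similar
        return False
    nearest = min(range(len(distances)), key=distances.__getitem__)
    return distances[nearest] <= threshold


def is_similar_any(target, stored_patterns, threshold):
    """Variant 2: similar if any stored pattern lies within the threshold."""
    return any(d <= threshold for d in distances_to_stored(target, stored_patterns))
```

Because the smallest distance is within the threshold exactly when at least one distance is, the two variants return the same verdict for the same threshold; they differ only in whether the nearest pattern is identified explicitly.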

FIG. 7 is a flowchart illustrating an example of the operation of an anomaly detection learning apparatus using a plurality of machine learning models according to an embodiment.

Referring to FIG. 7, training data representing a normal situation is input to the anomaly detection learning apparatus (S710).

The data preprocessor preprocesses the training data representing the normal situation by dividing it into N pieces of distributed training data based on the number of features of the training data (S720).
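A minimal sketch of step S720 follows. The text only says the split is based on the number of features; dividing the feature columns into contiguous, near-equal groups is an assumption made here for illustration, and `split_features` is a hypothetical name.

```python
def split_features(rows, n_groups):
    """Split each row's feature columns into n_groups contiguous column groups.

    rows: list of samples, each a list of feature values.
    Returns n_groups datasets, each holding the same samples restricted
    to one group of columns. Equal-width contiguous grouping is assumed.
    """
    n_features = len(rows[0])
    if not 2 <= n_groups <= n_features:
        raise ValueError("n_groups must be between 2 and the number of features")
    base, extra = divmod(n_features, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        width = base + (1 if g < extra else 0)  # spread the remainder over the first groups
        groups.append([row[start:start + width] for row in rows])
        start += width
    return groups
```

For example, five features split into two groups yield one dataset with three columns and one with two, each of which would be passed to its own learner.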

The N pieces of distributed training data are then input to the N learners, respectively, and the N learners generate N learning models (S730).

The N learners also input the training data into the N learning models and generate N prediction errors corresponding to the differences between the resulting predicted values and the actual values (S740).

The prediction error distributor then redistributes the N prediction errors into M redistributed prediction errors for input to the M prediction error analyzers (S750).
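The claims describe choosing M from the total number of features divided by the minimum number of features an analyzer needs, so that every redistributed prediction error carries the same number of features, and randomly mixing the N prediction errors before redistribution. The sketch below is one possible reading of that procedure, not the patented implementation: picking the largest M that divides the feature count evenly is an assumption, and the names `choose_m` and `redistribute` are hypothetical.

```python
import random


def choose_m(total_features, min_features):
    """Largest M >= 2 with equal-size groups of at least min_features each (assumed rule)."""
    if min_features < 1:
        raise ValueError("min_features must be positive")
    upper = total_features // min_features
    for m in range(upper, 1, -1):
        if total_features % m == 0:
            return m
    raise ValueError("no M >= 2 gives equal-size groups")


def redistribute(error_groups, min_features, seed=0):
    """Pool N prediction-error groups, shuffle, and cut into M equal groups.

    error_groups: N lists of (feature_name, error_series) pairs.
    """
    pooled = [item for group in error_groups for item in group]
    m = choose_m(len(pooled), min_features)
    random.Random(seed).shuffle(pooled)  # random mixing of the pooled errors
    size = len(pooled) // m
    return [pooled[i * size:(i + 1) * size] for i in range(m)]
```

Note that M here depends only on the pooled feature count and the analyzers' minimum, not on N, which matches the statement that N and M are set independently.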

The M redistributed prediction errors are then input to the M prediction error analyzers, respectively, and the prediction error patterns (cost trends) observed over the pattern collection time may be stored in the prediction error pattern database (S760). N and M are each at least 2 and may be set independently of each other.
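As a rough illustration of step S760, the sketch below turns a time series of prediction errors into fixed-length patterns. The text does not say how a pattern is cut from the pattern collection time; the sliding window with a stride of one step and the flattening of each window into a single list are assumptions, and `extract_patterns` is a hypothetical name.

```python
def extract_patterns(error_rows, window):
    """Cut a prediction-error time series into overlapping fixed-length patterns.

    error_rows: list of time steps, each a list of per-feature prediction errors.
    window: number of time steps in one pattern (the pattern collection time).
    Returns one flattened pattern per window position (stride of 1 assumed).
    """
    if window < 1 or len(error_rows) < window:
        raise ValueError("window must be between 1 and the series length")
    patterns = []
    for start in range(len(error_rows) - window + 1):
        flat = [value for row in error_rows[start:start + window] for value in row]
        patterns.append(flat)
    return patterns
```

Patterns produced this way from normal training data would populate the database; the same function applied to live data would produce the target patterns compared against it.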

FIG. 8 is a flowchart illustrating an example of the operation of an anomaly detection determination apparatus using a plurality of machine learning models according to an embodiment.

Referring to FIG. 8, real-time input data is input to the anomaly detection determination apparatus (S810).

A target prediction error pattern corresponding to the input data is generated (S820). Specifically, the input data is fed to the N learning models of the N learners, which produce N prediction errors. The N prediction errors pass through the prediction error distributor and are redistributed into M redistributed prediction errors. M target prediction error patterns are then generated by extracting, over the pattern collection time, the M prediction error patterns corresponding to the input data from the M redistributed prediction errors.

Each of the M prediction error pattern determiners then compares its target prediction error pattern with the prediction error patterns stored in the prediction error pattern database and determines whether they are similar (S830). If the target prediction error pattern is determined to be similar to the stored prediction error patterns, the situation is determined to be normal (S840). If the target prediction error pattern is determined to be not similar to the stored prediction error patterns, the situation is determined to be abnormal (S850). That is, if at least one of the stored prediction error patterns is determined to be similar to the target prediction error pattern, the situation is determined to be normal; if none of the stored prediction error patterns is determined to be similar to the target prediction error pattern, the situation is determined to be abnormal. The integrated determiner may also integrate the determination results of the individual prediction error pattern determiners to produce a final determination result.
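The text does not specify how the integrated determiner combines the M individual results. The sketch below assumes a simple vote in which the final result is abnormal once a configurable number of determiners report an abnormality (one by default); both the rule and the name `fuse_judgments` are assumptions made for illustration.

```python
def fuse_judgments(determiner_says_normal, min_abnormal_votes=1):
    """Combine M per-determiner verdicts into one final verdict.

    determiner_says_normal: iterable of booleans, True meaning "normal".
    min_abnormal_votes: number of abnormal verdicts needed to flag an anomaly
    (assumed fusion rule; the source leaves the rule open).
    """
    abnormal_votes = sum(1 for is_normal in determiner_says_normal if not is_normal)
    return "abnormal" if abnormal_votes >= min_abnormal_votes else "normal"
```

With the default of one vote, a single determiner that finds no similar stored pattern is enough to flag the input; raising `min_abnormal_votes` trades sensitivity for fewer false alarms.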

FIG. 9 is a diagram illustrating the configuration of a computer system according to an embodiment.

The anomaly detection learning apparatus or the anomaly detection determination apparatus using a plurality of machine learning models according to an embodiment may be implemented in a computer system 900, such as a computer-readable recording medium.

The computer system 900 may include one or more processors 910, a memory 930, a user interface input device 940, a user interface output device 950, and storage 960, which communicate with each other via a bus 920. The computer system 900 may further include a network interface 970 connected to a network 980. The processor 910 may be a central processing unit or a semiconductor device that executes programs or processing instructions stored in the memory 930 or the storage 960. The memory 930 and the storage 960 may be storage media including at least one of a volatile medium, a non-volatile medium, a removable medium, a non-removable medium, a communication medium, and an information delivery medium. For example, the memory 930 may include a ROM 931 or a RAM 932.

According to the embodiments described above, the present invention can provide a method, and an apparatus for the same, that redistributes prediction errors in order to improve anomaly determination performance when anomalies are determined from the patterns of prediction errors generated by individual learning models (prediction models).

In addition, the present invention can provide a method that responds efficiently to new attack patterns, incident investigations, and the like by creating the prediction error group needed to detect a given attack pattern and adding a prediction error analyzer for that group, without modifying or adding learning models for each attack pattern.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

100: anomaly detection system
110: training data
120: input data
130: data preprocessor
140: learner
150: learning model
160: anomaly determiner
170: anomaly determination result

Claims

An anomaly detection method using a plurality of machine learning models, each step of which is performed by a computing device, the method comprising:
preprocessing training data representing a normal situation by dividing the training data into N pieces of distributed training data based on the number of features of the training data;
generating N learning models by inputting the N pieces of distributed training data to N learners, respectively;
generating N prediction errors (costs) corresponding to differences between actual values and predicted values generated by inputting the training data into the N learning models;
redistributing the N prediction errors into M redistributed prediction errors for input to M prediction error analyzers;
inputting the M redistributed prediction errors to the M prediction error analyzers, respectively, and storing prediction error patterns (cost trends) over a pattern collection time in a prediction error pattern database; and
determining whether a situation is abnormal or normal by comparing a prediction error pattern corresponding to input data with the prediction error patterns included in the prediction error pattern database,
wherein N and M are each a natural number of at least 2 and are set independently of each other.

The method of claim 1,
wherein the redistributing into the M redistributed prediction errors comprises:
determining a minimum number of features required for prediction error analysis in the prediction error analyzers; and
determining M, based on a value obtained by dividing the total number of features included in the N prediction errors by the minimum number of features, such that each redistributed prediction error includes the same number of input features.

The method of claim 2,
wherein the redistributing into the M redistributed prediction errors further comprises:
generating a set of prediction errors by randomly mixing the N prediction errors; and
redistributing the set of prediction errors into the M redistributed prediction errors.

The method of claim 2,
wherein the redistributing into the M redistributed prediction errors further comprises:
analyzing features corresponding to the prediction errors in the set of prediction errors;
grouping, among the features corresponding to the prediction errors, related features that are advantageous for anomaly determination; and
additionally generating a redistributed prediction error that includes the related features.

The method of claim 2,
wherein the determining of whether the situation is abnormal or normal comprises:
generating a target prediction error pattern by extracting, over the pattern collection time, a prediction error pattern corresponding to the input data;
comparing the target prediction error pattern with the prediction error patterns included in the prediction error pattern database;
determining that the situation is normal when at least one of the prediction error patterns is determined to be similar to the target prediction error pattern; and
determining that the situation is abnormal when none of the prediction error patterns is determined to be similar to the target prediction error pattern.

The method of claim 5,
wherein the comparing with the prediction error patterns included in the prediction error pattern database comprises:
identifying, among the prediction error patterns, a neighbor prediction error pattern most similar to the target prediction error pattern;
calculating a difference value between the target prediction error pattern and the neighbor prediction error pattern using a difference function;
determining that the patterns are similar if the difference value is less than or equal to a difference threshold; and
determining that the patterns are not similar if the difference value is greater than the difference threshold.

The method of claim 6,
wherein the difference function is a Euclidean distance function.

The method of claim 5,
wherein the comparing with the prediction error patterns included in the prediction error pattern database comprises:
calculating difference values between the target prediction error pattern and the prediction error patterns using a difference function;
determining that the patterns are similar if at least one of the difference values is less than or equal to a difference threshold; and
determining that the patterns are not similar if all of the difference values are greater than the difference threshold.

The method of claim 5,
wherein the determining of whether the situation is abnormal or normal further comprises:
making a final determination of normal or abnormal by fusing the determination results of the M prediction error analyzers.

The method of claim 1,
wherein the preprocessing by dividing into the N pieces of distributed training data comprises:
analyzing the features included in the training data;
grouping, among the features, related features optimized for learning; and
generating, as one piece of distributed training data, the portion of the training data that includes the related features.

An anomaly detection learning apparatus using a plurality of machine learning models, comprising:
a processor configured to:
preprocess training data representing a normal situation by dividing the training data into N pieces of distributed training data based on the number of features of the training data,
generate N learning models by inputting the N pieces of distributed training data to N learners, respectively,
generate N prediction errors (costs) corresponding to differences between actual values and predicted values generated by inputting the training data into the N learning models,
redistribute the N prediction errors into M redistributed prediction errors for input to M prediction error analyzers, and
input the M redistributed prediction errors to the M prediction error analyzers, respectively, and store prediction error patterns (cost trends) over a pattern collection time in a prediction error pattern database,
wherein N and M are each at least 2 and are set independently of each other; and
a memory configured to store at least one of the N prediction errors and the M redistributed prediction errors.

The apparatus of claim 11,
wherein the processor is configured to:
determine a minimum number of features required for prediction error analysis in the prediction error analyzers, and
determine M, based on a value obtained by dividing the total number of features included in the N prediction errors by the minimum number of features, such that each redistributed prediction error includes the same number of input features.

The apparatus of claim 12,
wherein the processor is configured to:
generate a set of prediction errors by randomly mixing the N prediction errors, and
redistribute the set of prediction errors into the M redistributed prediction errors.

The apparatus of claim 12,
wherein the processor is configured to:
analyze features corresponding to the prediction errors in the set of prediction errors,
group, among the features corresponding to the prediction errors, related features that are advantageous for anomaly determination, and
additionally generate a redistributed prediction error that includes the related features.

The apparatus of claim 11,
wherein the processor is configured to:
analyze the features included in the training data,
group, among the features, related features optimized for learning, and
generate, as one piece of distributed training data, the portion of the training data that includes the related features.

An anomaly detection determination apparatus using a plurality of machine learning models, comprising:
a processor configured to:
input input data to an anomaly detection learning apparatus,
generate a target prediction error pattern by extracting, over a pattern collection time, a prediction error pattern corresponding to the input data,
compare the target prediction error pattern with prediction error patterns included in a prediction error pattern database of the anomaly detection learning apparatus,
determine that a situation is normal when at least one of the prediction error patterns is determined to be similar to the target prediction error pattern, and
determine that the situation is abnormal when none of the prediction error patterns is determined to be similar to the target prediction error pattern; and
a memory configured to store the target prediction error pattern.

The apparatus of claim 16,
wherein the processor is configured to:
identify, among the prediction error patterns, a neighbor prediction error pattern most similar to the target prediction error pattern,
calculate a difference value between the target prediction error pattern and the neighbor prediction error pattern using a difference function,
determine that the patterns are similar if the difference value is less than or equal to a difference threshold, and
determine that the patterns are not similar if the difference value is greater than the difference threshold.

The apparatus of claim 17,
wherein the difference function is a Euclidean distance function.

The apparatus of claim 16,
wherein the processor is configured to:
compare the target prediction error pattern with the prediction error patterns included in the prediction error pattern database,
calculate difference values between the target prediction error pattern and the prediction error patterns using a difference function,
determine that the patterns are similar if at least one of the difference values is less than or equal to a difference threshold, and
determine that the patterns are not similar if all of the difference values are greater than the difference threshold.

The apparatus of claim 16,
wherein the processor is configured to:
make a final determination of normal or abnormal by fusing the determination results of M prediction error analyzers.