KR20190078850A

KR20190078850A - Method for estimation on online multivariate time series using ensemble dynamic transfer models and system thereof

Info

Publication number: KR20190078850A
Application number: KR1020170180576A
Authority: KR
Inventors: 이기천; 임태훈; 최동근
Original assignee: (주)가디엘; 한양대학교 산학협력단
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2019-07-05
Also published as: KR102038703B1

Abstract

Provided are a method and a system for real-time multivariate time series prediction through a dynamic transition ensemble model. In the method, a real-time multivariate time series prediction system selects a predetermined number of models from a plurality of dynamic transition models and defines a dynamic transition ensemble model based on the selected models (the selection models). The system then receives real-time data and performs prediction in real time while updating the weight of each selection model included in the dynamic transition ensemble model, or the model coefficients of each selection model.

Description

The present invention relates to a method and system for real-time multivariate time series prediction through a dynamic transition ensemble model, and more particularly, to a method and system that can effectively perform real-time time series prediction in a multivariate environment by using an ensemble model that combines a plurality of dynamic transition models.

There have been attempts to predict specific events (e.g., errors or failures) in real time using various existing APM solutions, CCTV networks, and other time series prediction models.

However, in a multivariate environment where many independent factors affect system performance, conventional technologies do not go as far as prediction through real-time analysis and remain at the level of simple monitoring.

In addition, the VAR (Vector Autoregressive) model, known as a conventional multivariate time series prediction model, relies on a number of assumptions and predicts all variables at once with a single model. As a result, it has difficulty coping with varied situations and is poorly suited to predicting a single target variable from multivariate data.

Accordingly, the present invention provides a method and system that can increase prediction accuracy by using an ensemble model combining a plurality of dynamic transition models.

The invention also provides a method and system that can improve predictive power by designating a target response variable among many variables using real-time multivariate time series data.

The invention further provides a method and system that select models through offline learning and make highly accurate predictions online in real time.

According to one aspect of the present invention, a real-time multivariate time series prediction method through a dynamic transition ensemble model includes: a step in which a real-time multivariate time series prediction system selects a predetermined number of models from a plurality of dynamic transition models and defines a dynamic transition ensemble model based on the selected selection models; and a step in which the system receives real-time data and performs prediction in real time while updating the weight of each selection model included in the dynamic transition ensemble model or the model coefficients of each selection model.

The method may further include a step in which the real-time multivariate time series prediction system generates the plurality of dynamic transition models from learning data.

The step of generating the plurality of dynamic transition models from learning data may include generating the dynamic transition models by linearly combining, for a dependent variable Y, the autoregressive influence of Y and the influence of the independent variables x at the current time point.

The dynamic transition models may be generated by the following equation.

[Equation]

Here Y is the dependent variable, B is the backshift operator, d is the difference order, p is the autoregressive order, and I is the indicator function. The remaining symbols of the equation denote the autoregressive coefficient, the regression model constant, the first-order regression coefficient for the independent variable x, and the error term, and m is the number of independent variables.

In the step of generating the plurality of dynamic transition models from learning data, the real-time multivariate time series prediction system may generate a number of dynamic transition models given by an expression [Equation] in which q is the number of independent variables selected.

The step of selecting a predetermined number of the generated dynamic transition models and defining a dynamic transition ensemble model based on the selected selection models may select, as the selection models, the K models with the best prediction performance among the plurality of dynamic transition models.

The dynamic transition ensemble model may be defined by the following equation.

[Equation]

Here w is the weight of each selection model.

The real-time prediction step, in which the system performs prediction while updating the weight of each selection model included in the dynamic transition ensemble model or the model coefficients of each selection model, may update the weights using the following equation.

[Equation]

The real-time prediction step may update the model coefficients of each selection model through the RLS (recursive least squares) method while performing forward updates with a predetermined window size.
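As a rough illustration of such a coefficient update, the following Python sketch performs one step of standard recursive least squares with an exponential forgetting factor. It is a sketch under stated assumptions, not the patent's procedure: the text names RLS and a predetermined window size but gives no formula, so the function name `rls_update`, the forgetting factor `lam`, and the rule of thumb `lam = 1 - 1/W` for an effective window of about W samples are all hypothetical choices.

```python
def rls_update(theta, P, x, y, lam=0.99):
    """One recursive-least-squares step (hypothetical helper, pure Python).

    theta: current coefficient list; P: inverse-covariance matrix (list of lists);
    x: regressor vector for this time step; y: observed target.
    lam: forgetting factor (assumption; e.g. lam = 1 - 1/W for a window of ~W samples).
    Returns the updated (theta, P).
    """
    n = len(x)
    # P @ x
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Scalar normaliser: lam + x^T P x
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    # Prediction error made with the coefficients before the update
    err = y - sum(theta[i] * x[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * (P x)^T) / lam, valid because P stays symmetric
    new_P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return new_theta, new_P
```

A large initial `P` (for example a diagonal of 1e6) expresses low confidence in the starting coefficients, so the first few observations dominate the estimate.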

In the real-time prediction step, the weights at the current time point t may be obtained using the prediction errors at the previous time point (t-1).
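A minimal Python sketch of such an error-based weighting follows. It relies on the specification's description of the weight equation, under which each weight is inversely proportional to a model's sum of squared prediction errors and the weights sum to 1. The function names and the small constant `eps`, added here only to avoid division by zero, are assumptions and not part of the source.

```python
def squared_error_sum(actuals, predictions):
    """Sum of squared differences between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actuals, predictions))


def ensemble_weights(error_sums, eps=1e-12):
    """Weights inversely proportional to each model's squared-error sum.

    error_sums: one non-negative squared-error sum per selection model,
    computed from errors already observed (e.g. up to time t-1).
    eps: hypothetical guard against division by zero.
    """
    inverse = [1.0 / (e + eps) for e in error_sums]
    total = sum(inverse)
    # Normalise so that the weights sum to 1
    return [v / total for v in inverse]
```

A model whose recent errors are three times larger than another's receives one third of that model's weight before normalisation.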

A real-time multivariate time series prediction method through a dynamic transition ensemble model for solving the above technical problem includes: a step in which a real-time multivariate time series prediction system generates a plurality of dynamic transition models from learning data; and a step in which the system selects a predetermined number of the generated dynamic transition models and, using a dynamic transition ensemble model based on the selected selection models, receives real-time data and performs prediction in real time while updating the weight of each selection model or the model coefficients of each selection model.

The method may be implemented by a computer program that is installed in a data processing apparatus and recorded on a recording medium.

A real-time multivariate time series prediction system through a dynamic transition ensemble model for solving the above technical problem includes: a DB that stores information on each of a plurality of dynamic transition models; an ensemble model module that selects a predetermined number of the plurality of dynamic transition models and defines a dynamic transition ensemble model based on the selected selection models; and a control module that receives real-time data and performs prediction in real time while updating the weight of each selection model included in the dynamic transition ensemble model or the model coefficients of each selection model.

The system may further include a dynamic transition model module that generates the plurality of dynamic transition models by performing learning on learning data.

A real-time multivariate time series prediction system through a dynamic transition ensemble model for solving the above technical problem includes a processor and a memory storing a computer program executed by the processor, and the computer program, when executed by the processor, performs the method described above.

According to the technical idea of the present invention, prediction accuracy can be increased by using an ensemble model that combines a plurality of dynamic transition models. In addition, real-time prediction performance can be improved by updating the weights and/or model coefficients of each of the dynamic transition models in real time.

In addition, models are selected through offline learning, so that highly accurate predictions can be made online in real time.

In addition, predictive power can be improved by designating a target response variable among many variables using real-time multivariate time series data. The invention can therefore be used to predict failures in real time in conventional application performance management solutions or CCTV networks, and can also be applied in industries that need to analyze and predict from real-time multivariate time series data generated in server and network systems, such as log data or performance data from various databases and servers. Where an appropriate threshold can be set, it can also be applied to anomaly detection, and to fields such as diagnosing disease from data generated in the medical industry or detecting intrusions on a network.

A brief description of each drawing is provided so that the drawings cited in the detailed description of the invention can be more fully understood.
FIG. 1 shows an example of a real-time multivariate time series prediction system through a dynamic transition ensemble model according to the technical idea of the present invention.
FIG. 2 is a schematic flow chart for implementing a real-time multivariate time series prediction method through a dynamic transition ensemble model according to an embodiment of the present invention.
FIG. 3 and FIG. 4 show examples of learning data according to an embodiment of the present invention.
FIG. 5 shows an example of selection models selected from a plurality of dynamic transition models according to an embodiment of the present invention.
FIG. 6 shows, by way of example, the change in weight of each selection model in an ensemble model according to an embodiment of the present invention.
FIG. 7 shows prediction results according to the technical idea of the present invention.

The present invention can be variously modified and can have several embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. In describing the present invention, detailed descriptions of related known art are omitted where they are judged likely to obscure the gist of the invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by those terms. The terms are used only to distinguish one component from another.

The terminology used in this application is used only to describe specific embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this specification, terms such as "include" or "have" are intended to indicate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood not to preclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Also, in this specification, when one component 'transmits' data to another component, the component may transmit the data to the other component directly or through at least one further component. Conversely, when one component 'directly transmits' data to another component, the data is transmitted from the component to the other component without passing through any further component.

Hereinafter, the present invention will be described in detail with reference to the accompanying drawings, focusing on embodiments of the invention. The same reference numerals in the drawings denote the same members.

FIG. 1 shows an example of a real-time multivariate time series prediction system through a dynamic transition ensemble model according to the technical idea of the present invention.

Referring to FIG. 1, a real-time multivariate time series prediction system through a dynamic transition ensemble model (hereinafter, 'prediction system', 100) may be provided to implement the real-time multivariate time series prediction method through a dynamic transition ensemble model according to the technical idea of the present invention.

The prediction system 100 may be installed in client form on a given user's terminal, or may be a system installed on the server side. The prediction system 100 may be a system installed in a given data processing apparatus and implemented through the organic combination of software (e.g., an application) implementing the technical idea of the present invention and the hardware of the data processing apparatus that runs the software.

In accordance with the technical idea of the present invention, the prediction system 100 may be a system for predicting the time series data of a specific variable in an environment with many independent variables, that is, a multivariate environment. Compared with conventional multivariate time series prediction approaches (e.g., VAR) that predict all variables at once with a single model, the prediction system 100 is advantageous for predicting one or a small number of specific variables (e.g., a dependent variable influenced by the independent variables).

The learning data and models exemplified below in this specification are examples applied to an Application Performance Management system, and the scope of the present invention is not limited thereto.

The prediction system 100 includes a control module 110 and an ensemble model module 120.

The prediction system 100 may further include a DB 130 and/or a dynamic transition model module 140. Depending on the embodiment, some of the components described above may not be essential to implementing the invention, and the prediction system 100 may of course include more components than these.

The prediction system 100 may include the hardware resources and/or software needed to implement the technical idea of the present invention, and does not necessarily mean a single physical component or a single device. That is, the prediction system 100 may mean a logical combination of hardware and/or software provided to implement the technical idea of the present invention and, where necessary, may be implemented as a set of logical components that are installed on devices separated from one another and each perform their own function.

In this specification, a module may mean a functional and structural combination of hardware for carrying out the technical idea of the present invention and software for driving that hardware. For example, a module may mean a logical unit of given code and the hardware resources on which that code runs; a person of ordinary skill in the art will readily infer that it does not necessarily mean physically connected code or a single type of hardware.

The control module 110 may control the functions and/or resources of the other components included in the prediction system 100, for example, the ensemble model module 120, the DB 130, and/or the dynamic transition model module 140.

The prediction system 100 according to the technical idea of the present invention uses a dynamic transition ensemble model combining a plurality of dynamic transition models in order to remedy the problems of conventional multivariate time series prediction: it rests on many assumptions, it makes predicting a single specific variable from multivariate data difficult, and it is not robust across varied situations.

To this end, the ensemble model module 120 may select a predetermined number of models from a plurality of predefined dynamic transition models, and may define a dynamic transition ensemble model based on the selected dynamic transition models, that is, the selection models. The dynamic transition ensemble model may be defined by the following equation.

[Equation 1]

Here w is the weight of each selection model, and the remaining symbol in the equation denotes the k-th selection model.
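Read this way, the ensemble output is a weighted sum of the K selection models' individual predictions. The Python sketch below shows only that combination step. The function name `ensemble_predict` and its argument layout are hypothetical, and the form of a weighted sum is inferred from the symbol definitions rather than from the equation itself, which is not reproduced in this text.

```python
def ensemble_predict(weights, predictions):
    """Combine the selection models' predictions into one ensemble prediction.

    weights: one weight per selection model (assumed to sum to 1).
    predictions: the k-th entry is the k-th selection model's prediction
    for the same time point.
    """
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per selection model")
    return sum(w * p for w, p in zip(weights, predictions))
```

Because only the weights change between time steps, the same function can be reused at every step once the weights have been refreshed.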

The ensemble model module 120 may select K dynamic transition models from among the many dynamic transition models as the selection models. The ensemble model module 120 may select the K selection models in order of prediction performance, and various other selection schemes are also possible.

Information on the many dynamic transition models may be stored in the DB 130.

The DB 130 may further store the learning data for training the dynamic transition models, information on the dynamic transition models resulting from training (i.e., the model coefficients, errors, and the like that define each dynamic transition model), and information on the performance measurement results of each dynamic transition model.

The dynamic transition models may be multivariate time series models built through offline learning. That is, each dynamic transition model can operate as a multivariate time series model in its own right.

The dynamic transition models may be trained on learning data, and the coefficients that define each dynamic transition model may be determined from the training result.

The dynamic transition model module 140 may generate many such dynamic transition models. Information on the generated dynamic transition models and related information (e.g., learning data, performance) may of course be stored in the DB 130.

The dynamic transition model module 140 may generate the dynamic transition models by linearly combining an autoregressive influence part with the influence of the independent variables at the current time point.

These dynamic transition models may be generated by the following equation.

[Equation 2]

In Equation 2, one term expresses the autoregressive influence part and another expresses the influence of the independent variables at the current time point.

In addition, Y is the dependent variable, B is the backshift operator, d is the difference order, p is the autoregressive order, and I is the indicator function. The remaining symbols denote the autoregressive coefficient, the regression model constant, the first-order (simple linear) regression coefficient for an independent variable, and the error, and m is the number of independent variables.
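The equation itself is not reproduced in this text. One form consistent with the symbol definitions is an autoregressive model on the d-times differenced dependent variable with added regression terms for the current independent variables, sketched below in LaTeX. This is an assumed reconstruction, not the published equation: the Greek letters, the subscripts, and the use of the indicator I_j to switch the j-th independent variable in or out of a given model are all hypothetical.

```latex
% Assumed form: AR(p) on the d-times differenced Y, plus current-time regressors.
% \phi_i: autoregressive coefficients, \beta_0: regression model constant,
% \beta_j: first-order regression coefficient of x_j, \varepsilon_t: error term,
% I_j \in \{0, 1\}: indicator of whether x_j is included in this model.
\Bigl(1 - \sum_{i=1}^{p} \phi_i B^{i}\Bigr)(1 - B)^{d}\, Y_t
  = \beta_0 + \sum_{j=1}^{m} I_j\, \beta_j\, x_{j,t} + \varepsilon_t
```

Under this reading, the left-hand side carries the autoregressive influence and the sum on the right-hand side carries the influence of the independent variables at the current time point.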

According to one embodiment, data such as that shown in FIG. 3 may serve as the learning data.

FIG. 3 shows an example of learning data according to an embodiment of the present invention; such learning data may be learning data used in an application performance management solution.

As shown in FIG. 3, the response speed (response) in the learning data may be the dependent variable Y, and all of the remaining variables (e.g., Totla_mem, Free_mem, Used_mem, Usage, sys_cpu, user_cpu, count, http, javad, jdbc) may be independent variables (x). The dynamic transition ensemble model according to the technical idea of the present invention may thus use such learning data to predict the response speed as the independent variables take on their respective values and change over time.

The dynamic transition model module 140 may generate many dynamic transition models using Equation 2.

For example, the dynamic transition model module 140 may generate the number of dynamic transition models given by Equation 3, in which q is the number of independent variables selected.

[Equation 3]

That is, given the possible autoregressive orders (p), the possible difference orders (d), and q independent variables affecting the dependent variable, Equation 3 may be the number of all possible distinct instances of Equation 2. The combination term in Equation 3 may mean, when q independent variables are considered, the number of combinations for drawing 1 distinct variable from the total number of independent variables (m), up through the number of combinations for drawing q distinct variables. In other words, assuming that at most q independent variables affect the dependent variable, Equation 3 may be the number of all distinct instances of Equation 2 that can be generated.
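The count described here can be sketched in Python as follows. Equation 3 is not reproduced in this text, so the sketch assumes it is the product of the number of candidate autoregressive orders, the number of candidate difference orders, and the number of ways to choose between 1 and q of the m independent variables. The function and parameter names are hypothetical.

```python
from math import comb


def count_candidate_models(num_p_values, num_d_values, m, q):
    """Number of distinct candidate models under the assumed reading of Equation 3.

    num_p_values: how many autoregressive orders are tried (assumption).
    num_d_values: how many difference orders are tried (assumption).
    m: total number of independent variables; q: maximum number used per model.
    """
    if not 1 <= q <= m:
        raise ValueError("q must satisfy 1 <= q <= m")
    # C(m, 1) + C(m, 2) + ... + C(m, q): subsets of 1 to q independent variables
    variable_subsets = sum(comb(m, i) for i in range(1, q + 1))
    return num_p_values * num_d_values * variable_subsets
```

The count grows quickly with q, which is one reason a later selection of only the K best models is useful.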

Once many distinct dynamic transition models (e.g., the number given by Equation 3) have been generated and trained, each trained dynamic transition model may be stored in the DB 130.

According to one embodiment, the dynamic transition model module 140 may generate the dynamic transition models using part (e.g., 3,000) of the learning data (e.g., 4,000 records) and update the generated models using the remainder (e.g., 1,000). A measure for evaluating the performance of each model (e.g., RMSE: Root Mean Square Error) may also be stored in the DB 130.
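The split and the performance measure can be illustrated with a short Python sketch. It assumes the learning data is ordered in time and is split chronologically, which the text does not state explicitly. The function names are hypothetical.

```python
import math


def split_learning_data(rows, n_fit):
    """Split time-ordered rows into a fitting part and an update/evaluation part.

    Assumes rows are in chronological order, so the first n_fit rows
    (e.g. 3000 of 4000) build the model and the rest are held back.
    """
    return rows[:n_fit], rows[n_fit:]


def rmse(actuals, predictions):
    """Root mean square error between observed and predicted values."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals)
    return math.sqrt(mean_sq)
```

Evaluating each model only on rows it was not fitted on keeps the stored measure from rewarding overfitting.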

For example, the data shown in FIG. 5 lists the identification numbers of the dynamic transition models ranked from the top by the performance measure (e.g., RMSE). That is, the 1628th dynamic transition model ranks highest, with a performance measure of, e.g., 0.3679042.

The ensemble model module 120 may then select a predetermined number (e.g., K) of the distinct dynamic transition models stored in the DB 130 as the models to be used in the dynamic transition ensemble model, that is, as the selection models. The ensemble model module 120 may select the K dynamic transition models in order of best performance evaluation result. K may be chosen arbitrarily, or may be determined through repeated simulations as the value suited to the amount of learning data or other circumstances.
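Selecting the K best models from stored performance measures can be sketched in Python as below. The sketch assumes that a lower RMSE means better performance and that the measures are held in a dictionary keyed by model identification number. Both are assumptions, as is the function name.

```python
def select_top_k(rmse_by_model, k):
    """Return the identification numbers of the k models with the lowest RMSE.

    rmse_by_model: dict mapping a model id to its stored RMSE
    (assumes lower is better).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort by RMSE ascending, breaking ties by model id for a stable result
    ranked = sorted(rmse_by_model.items(), key=lambda item: (item[1], item[0]))
    return [model_id for model_id, _ in ranked[:k]]
```

For instance, with model 1628 at 0.3679042 from FIG. 5 and two made-up entries, `select_top_k({1628: 0.3679042, 7: 0.41, 42: 0.39}, 2)` keeps models 1628 and 42.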

The dynamic transition ensemble model defined by the ensemble model module 120 may then be as given in Equation 1 above.

Once the dynamic transition ensemble model has been defined, the control module 110 may receive real-time data and perform prediction in real time while updating the weight of each selected model included in the dynamic transition ensemble model and/or the model coefficients of each selected model. In other words, time series prediction can be performed in real time while the ensemble model itself transitions dynamically.

That the control module 110 updates the weight of each selected model may mean that the influence or performance of each selected model transitions dynamically in real time. Likewise, updating the model coefficients of each selected model (e.g., its autoregressive and regression coefficients) may mean that the parameters defining the selected model themselves transition dynamically in real time.

In this way, the control module 110 updates the dynamic transition ensemble model dynamically as real-time data arrive, which has the effect of improving prediction performance.

The control module 110 may update the weight of each selected model using the following equation.

[Equation 4]

Here, the predicted value is the value predicted by a dynamic transition model. The left-hand expression of Equation 4 means that the weight of a dynamic transition model is computed in inverse proportion to the sum of the squared differences between the actual data and the predicted values, taken from the time point following the current one up to the number of time steps to be predicted (T). The right-hand expression of Equation 4 means that the weights of the selected models sum to 1, each weight being the ratio of that model's weight to the total of all the weights.
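The weighting rule described for Equation 4 can be sketched as follows. The equation itself is not reproduced in this text, so this is one reading of the verbal description rather than the exact formula: each raw weight is the reciprocal of a model's sum of squared errors, and the raw weights are then normalized to sum to 1. The names `inverse_sse_weights` and `ensemble_forecast`, the small `eps` guard against division by zero, and the weighted-sum form of the ensemble are all assumptions.

```python
def inverse_sse_weights(actual, predictions_by_model, eps=1e-12):
    """Weight each model in inverse proportion to its sum of squared errors.

    actual: observed values over the evaluation horizon.
    predictions_by_model: dict mapping a model id to its predictions
        over the same horizon.
    eps: assumed guard so that a perfect model does not divide by zero.
    """
    raw = {}
    for model_id, predicted in predictions_by_model.items():
        sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
        raw[model_id] = 1.0 / (sse + eps)
    total = sum(raw.values())
    # Normalize so that the weights sum to 1.
    return {model_id: value / total for model_id, value in raw.items()}


def ensemble_forecast(weights, forecast_by_model):
    """Combine per-model forecasts as a weighted sum (assumed ensemble form)."""
    return sum(weights[m] * forecast_by_model[m] for m in weights)


actual = [1.0, 2.0, 3.0]
predictions = {"a": [1.0, 2.0, 4.0], "b": [1.0, 2.0, 5.0]}
weights = inverse_sse_weights(actual, predictions)
combined = ensemble_forecast(weights, {"a": 4.0, "b": 5.0})
```

In this toy case model "a" has a squared-error sum of 1 and model "b" a sum of 4, so "a" receives roughly four times the weight of "b".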

Meanwhile, the control module 110 may also update the model coefficients of each selected model. In this case, as is well known, the model coefficients of each selected model may be updated by the RLS (Recursive Least Squares) method while performing forward updates with a predetermined window size.
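A textbook recursive least squares update for a linear model is sketched below to show the kind of per-observation coefficient update involved. It is not the patent's exact procedure: the source describes forward updates with a predetermined window size, whereas this sketch substitutes an optional exponential forgetting factor (`forgetting`, where 1.0 gives plain RLS) for the window. The class name, the `delta` initialization, and the toy data are assumptions.

```python
class RecursiveLeastSquares:
    """Textbook RLS for y = theta . x, written with plain Python lists."""

    def __init__(self, n_features, delta=1000.0, forgetting=1.0):
        self.theta = [0.0] * n_features
        # P starts as delta * I; a large delta means weak prior knowledge.
        self.P = [[delta if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]
        self.forgetting = forgetting

    def predict(self, x):
        return sum(t * xi for t, xi in zip(self.theta, x))

    def update(self, x, y):
        """Incorporate one observation (x, y); return the prior error."""
        n = len(x)
        Px = [sum(self.P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.forgetting + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]
        error = y - self.predict(x)
        self.theta = [t + g * error for t, g in zip(self.theta, gain)]
        # P stays symmetric, so x^T P equals the transpose of P x.
        self.P = [[(self.P[i][j] - gain[i] * Px[j]) / self.forgetting
                   for j in range(n)] for i in range(n)]
        return error


# Toy stream generated by y = 3 + 2 * x1, with x = [1, x1].
model = RecursiveLeastSquares(n_features=2)
for x1 in range(20):
    model.update([1.0, float(x1)], 3.0 + 2.0 * x1)
```

After the loop, `model.theta` is close to `[3.0, 2.0]`. Each selected model would hold its own coefficient vector and be updated this way as each new observation arrives.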

Meanwhile, according to Equation 5, when the control module 110 predicts the time point immediately following the current one, the weight of the corresponding selected model at the current time point t would have to be obtained from a quantity at the next time point; but because data arrive sequentially online (in real time), that quantity does not yet exist. In this case, the weight may instead be obtained using the error at time point (t-1).

Similarly, when the current time point is t, the weight for a later time point (e.g., t+n) may be obtained using the error at time point (t-n).
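The lagged-error idea above can be sketched as follows. This is a simplified illustration with assumed details: the function name `lagged_weights` is hypothetical, and each weight is derived from a single squared error at time point (t-n) rather than from a sum over a horizon.

```python
def lagged_weights(squared_errors, t, n=1, eps=1e-12):
    """Weights for predicting time t+n, using errors observed at time t-n.

    squared_errors: dict mapping a model id to a list of squared prediction
        errors indexed by time point (only already-observed points).
    eps: assumed guard against division by zero.
    """
    lag = t - n
    if lag < 0:
        raise ValueError("no observed error is available at time t - n")
    raw = {m: 1.0 / (errs[lag] + eps) for m, errs in squared_errors.items()}
    total = sum(raw.values())
    return {m: value / total for m, value in raw.items()}


# Placeholder error histories for two models at time points 0, 1, 2.
history = {"a": [0.25, 1.0, 4.0], "b": [0.25, 4.0, 1.0]}
w_next = lagged_weights(history, t=2, n=1)   # uses errors at time 1
w_two = lagged_weights(history, t=2, n=2)    # uses errors at time 0
```

With n=1 the weights come from the errors at time point 1, where model "a" did better; with n=2 they come from time point 0, where both models performed equally.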

The real-time multivariate time series prediction method using the dynamic transition ensemble model according to the technical idea of the present invention may be summarized as shown in FIG. 2.

FIG. 2 is a schematic flowchart of a real-time multivariate time series prediction method using a dynamic transition ensemble model according to an embodiment of the present invention.

Referring to FIG. 2, the prediction system 100 according to the technical idea of the present invention may receive learning data and perform learning offline (S100, S110).

As a result, a plurality of dynamic transition models (e.g., the number given by Equation 3) may be generated as described above (S120). Each dynamic transition model may be defined by Equation 2.

The prediction system 100 may then select K of the plurality of dynamic transition models to construct a dynamic transition ensemble model (S130). The constructed dynamic transition ensemble model may be as given in Equation 1.

The prediction system 100 may then acquire input data in real time (online) (S140) and accordingly perform prediction while updating the weight and/or model coefficients of each selected model in real time (S150).

An embodiment according to the technical idea of the present invention is described below with reference to FIGS. 3 to 7.

FIGS. 3 and 4 show examples of learning data according to an embodiment of the present invention. FIG. 5 shows an example of the selected models chosen from among a plurality of dynamic transition models according to an embodiment of the present invention. FIG. 6 shows the change in the weight of each selected model in the ensemble model according to an embodiment of the present invention. FIG. 7 shows a prediction result according to the technical idea of the present invention.

According to an embodiment of the present invention, learning data of the kind shown in FIG. 3 were used to perform prediction for application to an application performance management solution, as described above. The dependent variable to be predicted was the response speed.

In practice, 4000 learning records were used; these are shown in FIG. 4.

Of these 4000 learning records, the prediction system 100 used 3000 to build a large number of dynamic transition models and used the remaining 1000 to update each of the dynamic transition models.

Then, as shown in FIG. 5, K (e.g., 40) selected models were chosen using the performance value (e.g., RMSE) of each dynamic transition model. FIG. 5 shows the models selected according to their t+6 prediction performance, together with the performance value of each selected model.

A dynamic transition ensemble model was then defined using the K selected models, and prediction was performed with it while the weights and model coefficients were updated.

Prediction began online once the first 200 real-time data points had arrived. As a result, it was confirmed that the weights of the K (40) selected models change over time, as shown in FIG. 6.

FIG. 7 shows the t+6 prediction results in this setting: the solid black line is the actual data, and the solid red line is the prediction from the dynamic transition ensemble model according to an embodiment of the present invention. As FIG. 7 shows, relatively good prediction performance is achieved.

The real-time multivariate time series prediction method using the dynamic transition ensemble model according to the technical idea of the present invention may be applied in industries that need analysis and prediction based on real-time multivariate time series data generated in server network systems, such as log data or performance data produced by various databases and servers. Where an appropriate threshold can be set, it may also be applied to anomaly detection, and it may further be applied to fields such as diagnosing disease from data generated in the medical industry or detecting intrusions on a network.

Meanwhile, the real-time multivariate time series prediction method using the dynamic transition ensemble model according to an embodiment of the present invention may be implemented in the form of computer-readable program instructions and stored in a computer-readable recording medium. A control program and a target program according to an embodiment of the present invention may likewise be stored in a computer-readable recording medium. Computer-readable recording media include all kinds of recording devices that store data readable by a computer system.

The program instructions recorded on the recording medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in the software field.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. A computer-readable recording medium may also be distributed over computer systems connected by a network, so that computer-readable code is stored and executed in a distributed manner.

Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed, using an interpreter or the like, by a device that processes information electronically, for example a computer.

The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

The foregoing description of the present invention is given by way of example, and a person of ordinary skill in the art to which the present invention pertains will understand that it can readily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in combined form.

The scope of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Claims

1. A real-time multivariate time series prediction method using a dynamic transition ensemble model, comprising:
selecting, by a real-time multivariate time series prediction system, a predetermined number of a plurality of dynamic transition models and defining a dynamic transition ensemble model based on the selected models; and
receiving, by the real-time multivariate time series prediction system, real-time data and performing prediction in real time while updating the weight of each of the selected models included in the dynamic transition ensemble model or the model coefficients of each of the selected models.

2. The method of claim 1, further comprising:
generating, by the real-time multivariate time series prediction system, the plurality of dynamic transition models through learning data.

3. The method of claim 2, wherein generating the plurality of dynamic transition models through learning data comprises:
generating each dynamic transition model as a linear combination of the autoregressive influence of the dependent variable Y and the influence of the independent variables x at the current time point on the dependent variable Y.

4. The method of claim 3, wherein the dynamic transition models are generated by the following equation.
[Mathematical Expression]

where Y is the dependent variable, B is the backshift operator, d is the differencing order, p is the autoregressive order, and I is the indicator function; the remaining symbols of the expression denote, respectively, the autoregressive coefficient, the regression model constant, the first-order regression coefficient for the independent variable x, and the error term; and m is the number of independent variables.

5. The method of claim 4, wherein generating the plurality of dynamic transition models through learning data comprises generating a number of dynamic transition models given by the following expression:

[Mathematical Expression]

(where q is the number of independent variables selected).

6. The method of claim 1, wherein selecting the predetermined number of the plurality of dynamic transition models and defining the dynamic transition ensemble model based on the selected models comprises:
selecting, as the selected models, the K models with the best prediction performance among the plurality of dynamic transition models.

7. The method of claim 1, wherein the dynamic transition ensemble model is defined by the following equation:

[Mathematical Expression]

where w is the weight of each selected model.

8. The method of claim 1, wherein performing prediction in real time while updating the weight of each of the selected models included in the dynamic transition ensemble model or the model coefficients of each of the selected models comprises:
updating the weights using the following equation:

[Mathematical Expression]

9. The method of claim 1, wherein performing prediction in real time while updating the weight of each of the selected models included in the dynamic transition ensemble model or the model coefficients of each of the selected models comprises:
updating the model coefficients of each of the selected models through an RLS method while performing forward updates using a predetermined window size.

10. The method of claim 1, wherein performing prediction in real time while updating the weight of each of the selected models included in the dynamic transition ensemble model or the model coefficients of each of the selected models comprises:
obtaining the weight at a current time point t using the prediction error at the previous time point (t-1).

11. A real-time multivariate time series prediction method using a dynamic transition ensemble model, comprising:
generating, by a real-time multivariate time series prediction system, a plurality of dynamic transition models through learning data; and
selecting, by the real-time multivariate time series prediction system, a predetermined number of the generated dynamic transition models and, using a dynamic transition ensemble model based on the selected models, performing prediction while updating the weight of each of the selected models or the model coefficients of each of the selected models.

12. A computer program installed in a data processing apparatus and recorded on a recording medium for performing the method according to any one of claims 1 to 11.

13. A real-time multivariate time series prediction system using a dynamic transition ensemble model, comprising:
a DB storing information on each of a plurality of dynamic transition models;
an ensemble model module that selects a predetermined number of the plurality of dynamic transition models and defines a dynamic transition ensemble model based on the selected models; and
a control module that receives real-time data and performs prediction in real time while updating the weight of each of the selected models included in the dynamic transition ensemble model or the model coefficients of each of the selected models.

14. The system of claim 13, further comprising:
a dynamic transition model module that performs learning through learning data to generate the plurality of dynamic transition models.

15. A system comprising:
a processor; and
a memory storing a computer program executed by the processor,
wherein the computer program, when executed by the processor, causes the method according to any one of claims 1 to 10 to be performed.