KR101941854B1

KR101941854B1 - System and method of estimating load with null data correction

Info

Publication number: KR101941854B1
Application number: KR1020180136852A
Authority: KR
Inventors: 문경훈
Original assignee: 문경훈
Priority date: 2018-11-08
Filing date: 2018-11-08
Publication date: 2019-01-24

Abstract

The present invention provides a system for predicting load through correction of null (unacquired) data. The load prediction system comprises: a data acquisition unit that acquires power data and determines whether the power data contains at least the minimum number of data points required for analysis; a preprocessing unit that determines whether second power data, obtained by removing part of the power data, meets or exceeds an available data ratio, and that generates the null data; and a load prediction unit that generates a data set according to trend changes in the second power data and performs load prediction with a first or a second prediction algorithm using that data set. By removing abnormal data from the power load data, detecting the points at which the pattern changes, and then removing outlier data, the load data is preprocessed so as to increase the accuracy of the load prediction algorithm.

Description

The present invention relates to a system and method for predicting load through correction of unacquired (null) data. More specifically, it relates to a method of generating data through correction of unacquired data and predicting load by period, and to a system using the method.

With recent advances in data analysis technology, predicting future conditions from data acquired by sensors attached to power or water-resource equipment has become increasingly important. For example, future power demand is predicted from power measured on distribution lines, or water quality is predicted when the dosage of water-treatment chemicals changes.

To secure the availability and performance of such predictions, the problem of missing values and outliers in the data must be solved. Missing (unmeasured) values and outliers in time series data are generally caused by sensor transmission errors or dropouts. When they are present in power or water time series data, they distort and bias the data and can even degrade the performance of the algorithms applied for analysis. Most conventional methods of power pattern analysis, however, either used only complete data free of missing values and outliers, or were applied without considering the effect of those values. An accurate and reliable approach to replacing missing values and outliers has therefore been needed for more accurate data analysis.

In this regard, one approach to load prediction in power systems and the like predicts future load using ARIMA (Autoregressive Integrated Moving Average) or AR (autoregressive) models. These models, however, cannot make use of spatial connection (location, affiliation) data and cannot estimate the complex periodicity needed for mid- and long-term prediction. They also fail to account for unacquired data (null data), for pattern changes in the time series data, and for outliers in the data.

The technical problem to be solved by the present invention is to provide a method of generating data through correction of unacquired data and predicting load by period, and a system using the method.

The present invention is also intended to provide algorithms for the entire process required to predict future load from load data measured by a data collection device.

Further, by enabling future load to be predicted with high accuracy, the present invention allows economical and efficient investment in and operation of power facilities.

The technical problems of the present invention are not limited to those mentioned above; other technical problems not mentioned here will be clearly understood by those skilled in the art from the description below.

To achieve these objects, a load prediction system using correction of unacquired data according to the present invention is provided. The load prediction system includes: a data acquisition unit that acquires power data and then determines whether the power data contains at least the minimum number of data points required for analysis; a preprocessing unit that determines whether second power data, obtained by removing part of the power data, meets or exceeds an available data ratio, and that generates the unacquired data (null data); and a load prediction unit that generates a data set according to trend changes in the second power data and performs load prediction with a first prediction algorithm or a second prediction algorithm using that data set. The system can thereby preprocess the load data so as to increase the accuracy of the load prediction algorithm, by removing abnormal data from the power load data, detecting the points at which the pattern changes, and then removing outlier data.

In one embodiment, the load prediction unit may perform long-term load prediction for the power system using the first prediction algorithm, and short-/mid-term load prediction for the power system using the second prediction algorithm.

In one embodiment, the long-term load prediction may be performed using the HOLSTM (High Order LSTM) algorithm, a modification of the LSTM (long short-term memory) algorithm, and the short-/mid-term load prediction may be performed using either the HOLSTM algorithm or the LSTM algorithm.

In one embodiment, the load prediction unit may dynamically change the learning rate during training for the short-/mid-term load prediction. Specifically, when the MMSE (Minimum Mean Squared Error) obtained during training is not updated for d iterations, the learning rate may be multiplied by gamma, so that learning rate decay is applied dynamically.
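The decay rule in this embodiment can be sketched roughly as follows. This is a minimal illustration and not the patent's implementation: the class name `PatienceDecay`, the default values of `gamma` and `d`, and the choice to reset the counter after each decay are all assumptions made for the example.

```python
class PatienceDecay:
    """Multiply the learning rate by gamma when the best error stalls for d steps."""

    def __init__(self, learning_rate, gamma=0.5, d=3):
        self.learning_rate = learning_rate
        self.gamma = gamma              # decay factor (assumed default)
        self.d = d                      # allowed steps without a new minimum (assumed default)
        self.best_error = float("inf")  # smallest mean squared error seen so far
        self.stale_steps = 0            # steps since the minimum was last updated

    def step(self, error):
        """Record one training step's error and return the learning rate to use next."""
        if error < self.best_error:
            self.best_error = error
            self.stale_steps = 0
        else:
            self.stale_steps += 1
            if self.stale_steps >= self.d:
                self.learning_rate *= self.gamma
                self.stale_steps = 0    # assumption: start counting again after a decay
        return self.learning_rate


scheduler = PatienceDecay(learning_rate=0.01, gamma=0.5, d=2)
rates = [scheduler.step(e) for e in [1.0, 0.8, 0.9, 0.85, 0.7]]
```

In this run the minimum error is not updated on the third and fourth steps, so the rate is halved once at the fourth step.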

In one embodiment, the preprocessing unit may obtain the second power data from the power data through an abnormal data removal process, a pattern change data removal process, and an outlier data removal process.

In one embodiment, in the pattern change data removal process, pattern clustering may be performed with reference to a switching point, that is, a point at which the difference between a predicted value (obtained by predicting the range of the load data) and the actually measured value exceeds a threshold. Pattern re-clustering may then be performed on the data before and after the switching point.

In one embodiment, to generate the unacquired data, the preprocessing unit may train the LSTM algorithm in such a way that the value predicted at the previous time step is fed back as input. In that case, when predicting load, the load prediction unit may likewise feed the value predicted at the previous time step back as input, in the same manner as during training.
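The feedback of predicted values can be illustrated with a small sketch. It shows only the data flow, under stated assumptions: `fill_unacquired` and `mean_predictor` are hypothetical names, gaps are marked with `None`, and the window mean stands in for a trained LSTM, which the source does not specify in this passage.

```python
def fill_unacquired(series, predict_next, window=3):
    """Fill None entries by predicting from the previous `window` values.

    Values filled at earlier gaps are reused as inputs for later gaps,
    mirroring the idea of feeding predictions back as input.
    """
    filled = list(series)
    for t, value in enumerate(filled):
        if value is None:
            if t < window:
                raise ValueError("not enough history before the first gap")
            filled[t] = predict_next(filled[t - window:t])
    return filled


def mean_predictor(history):
    # Hypothetical stand-in for a trained LSTM: predicts the mean of the window.
    return sum(history) / len(history)


result = fill_unacquired([10.0, 12.0, 14.0, None, None, 16.0], mean_predictor)
```

The second gap is predicted from a window that already contains the value generated for the first gap.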

In one embodiment, the load prediction unit may include: a time series trend extraction unit that extracts the trend values of the time series data of the second power data using the trimmed neighbor mean method, a moving average method corrected for outlier data; and a time series trend removal unit that calculates the residual (remainder), that is, the difference between the raw data of the second power data and the time series trend values.

In one embodiment, the load prediction unit may further include: a statistics acquisition unit that sums the residuals over specific time intervals to obtain statistics on the residuals; and a threshold determination unit that determines a threshold using one of a plurality of q% quantiles of the residuals calculated by the statistics acquisition unit. The data set can then be generated according to trend changes in the second power data on the basis of the determined threshold.
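A rough sketch of the residual statistics and the quantile-based threshold follows. The function names, the use of fixed-size blocks of samples as the "specific time intervals", and the inclusive quantile method are assumptions for illustration; the source does not state how the quantile is computed or which q is chosen.

```python
import statistics


def interval_residual_sums(residuals, interval):
    """Sum the residuals over consecutive blocks of `interval` samples."""
    return [
        sum(residuals[i:i + interval])
        for i in range(0, len(residuals), interval)
    ]


def quantile_threshold(values, q):
    """Return the q% quantile of `values`, with q an integer from 1 to 99."""
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return cut_points[q - 1]


sums = interval_residual_sums([1, 2, 3, 4, 5], interval=2)
threshold = quantile_threshold(list(range(101)), q=95)
```

Intervals whose residual sum exceeds the threshold would then be treated as trend changes when the data set is built.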

According to another aspect of the present invention, a load prediction method using correction of unacquired data is provided. The load prediction method includes: a data acquisition process of acquiring power data and then determining whether the power data contains at least the minimum number of data points required for analysis; a preprocessing process of determining whether second power data, obtained by removing part of the power data, meets or exceeds an available data ratio, and generating the unacquired data (null data); and a load prediction process of generating a data set according to trend changes in the second power data and performing load prediction with a first prediction algorithm or a second prediction algorithm using that data set.

In one embodiment, the load prediction process may include: a time series trend extraction process of extracting the trend values of the time series data of the second power data using the trimmed neighbor mean method, a moving average method corrected for outlier data; a time series trend removal process of calculating the residual (remainder), that is, the difference between the raw data of the second power data and the time series trend values; a statistics acquisition process of summing the residuals over specific time intervals to obtain statistics on the residuals; and a threshold determination process of determining a threshold using one of a plurality of q% quantiles of the residuals calculated in the statistics acquisition process. In one embodiment, the data set can then be generated according to trend changes in the second power data on the basis of the determined threshold.

According to the present invention, the load data can be preprocessed so as to increase the accuracy of the load prediction algorithm by removing abnormal data from the power load data, detecting the points at which the pattern changes, and then removing outlier data.

Also, according to the present invention, removing abnormal data and outlier data from the measured data lets the load prediction algorithm train on normal data, which can increase the accuracy of the predicted future load.

Also, according to the present invention, detecting the points at which the pattern changes in the measured data identifies both the points at which the data pattern changed because of load switching and the patterns that deviate from the usual pattern. Load prediction can then proceed with training data made up only of the usual pattern, which can increase the accuracy of the predicted future load.

Also, according to the present invention, removing abnormal data from the measured load data, detecting and restoring pattern changes, and then removing outlier data makes it possible to predict future load accurately by applying machine learning to the load data.

Also, according to the present invention, being able to predict future load, which can be regarded as future power demand, with high accuracy is of great help in operating power facilities.

FIG. 1 shows the detailed configuration of the load prediction system according to the present invention.
FIG. 2A shows a visualization of the available data ratio according to one embodiment of the present invention. FIG. 2B shows a visualization of the available data ratio according to another embodiment of the present invention.
FIG. 3 is a conceptual diagram of the series of processes performed by the preprocessing unit according to the present invention.
FIG. 4 shows the time series data before and after abnormal data removal in the abnormal data removal process according to the present invention.
FIG. 5A shows the basic data used for switching pattern detection according to the present invention. FIG. 5B shows an example of switching point detection according to the present invention.
FIG. 5C shows an example of pattern clustering according to the present invention. FIG. 5D shows an example of pattern re-clustering according to the present invention.
FIG. 6 shows the time series data before and after removal of pattern change data following abnormal pattern detection according to the present invention.
FIG. 7A is a conceptual diagram of the outlier data removal process according to the present invention. FIG. 7B is an example of applying the RNOF algorithm for abnormal data determination after outlier data removal according to the present invention.
FIG. 7C shows an example in which abnormal data determination cannot be made when the RNOF algorithm is applied.
FIG. 7D is a conceptual diagram of the principle of the trimmed neighbor mean method when neighboring values in the time series data differ from the mean.
FIG. 8 is a conceptual diagram of the unacquired data learning and prediction process according to the present invention.
FIG. 9 is a conceptual diagram of the trend change data removal process according to the present invention.
FIG. 10 is a conceptual diagram of the principle of the HOLSTM algorithm according to the present invention.
FIG. 11 shows the error as a function of the number of training iterations for the LSTM and HOLSTM algorithms of the present invention.
FIG. 12 shows the power load prediction results by period for the LSTM and HOLSTM algorithms of the present invention.
FIG. 13 shows the load prediction results when date information is used with the HOLSTM algorithm of the present invention.
FIG. 14 compares the prediction results obtained when the load pattern change point is not detected with those obtained with pattern change detection according to the present invention.
FIG. 15 is a flowchart of the load prediction method using correction of unacquired data according to the present invention.

The features and effects of the present invention described above will become clearer from the following detailed description taken together with the accompanying drawings, so that a person of ordinary skill in the art to which the invention pertains can readily practice its technical idea. The present invention can be modified in various ways and can take various forms; specific embodiments are illustrated in the drawings and described in detail in the text. This is not intended to limit the invention to the particular forms disclosed; it should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the invention. The terminology used herein serves only to describe particular embodiments and is not intended to limit the invention.

The present invention proposes a system and method for predicting load through correction of unacquired data. The system and method differ technically from the prior art in the following respects.

1) Spatial connection (location, affiliation) data not used → a recurrent neural network model, which can use multivariate data, can be used.

2) Model unable to estimate the complex periodicity needed for mid- and long-term prediction → the High Order Long Short-Term Memory (HOLSTM) recurrent neural network algorithm can be used. It was devised by modifying High Order Recurrent Neural Networks (HORNN), a recurrent neural network algorithm able to estimate the complex periodicity needed for mid- and long-term prediction.

3) Unacquired data not considered → a high-order LSTM recurrent neural network algorithm that learns the unacquired data and generates values for it can be provided.

4) Pattern changes not considered → the points at which the pattern of the load data changes can be detected using Simple Exponential Smoothing.

5) Outliers not considered → the Robust Neighbor Outlier Factor (RNOF) algorithm, devised to detect outliers using data from neighboring time points (e.g., t-3, t-2, t-1, t+1, t+2, t+3), can be used.

Hereinafter, the system and method for predicting load through correction of unacquired data according to the present invention are described in more detail with reference to the drawings.

FIG. 1 shows the detailed configuration of the load prediction system according to the present invention. As shown in FIG. 1, the load prediction system includes a data acquisition unit (100), a preprocessing unit (200), and a load prediction unit (300).

After acquiring power data, the data acquisition unit (100) determines whether the power data contains at least the minimum number of data points required for analysis. The preprocessing unit (200) determines whether second power data, obtained by removing part of the power data, meets or exceeds the available data ratio, and generates the unacquired data (null data). The load prediction unit (300) generates a data set according to trend changes in the second power data and performs load prediction with the first prediction algorithm or the second prediction algorithm using that data set. Here, the load prediction unit (300) may perform long-term load prediction for the power system using the first prediction algorithm, and short-/mid-term load prediction for the power system using the second prediction algorithm.

Here, the long-term load prediction may be performed using the HOLSTM (High Order LSTM) algorithm, a modification of the LSTM (long short-term memory) algorithm. The short-/mid-term load prediction, on the other hand, may be performed using either the HOLSTM algorithm or the LSTM algorithm. These algorithms are described in detail below.

The load prediction described above is not limited to long-term and short-/mid-term load prediction and can be varied according to the application. In this regard, Table 1 lists the prediction periods and uses associated with the load prediction method according to one embodiment of the present invention.

Table 1

- Ultra-short-term forecast. Period: within 1 hour. Use: short-term dispatch planning within 1 hour.
- Short-term forecast. Period: within one week. Use: securing power system stability; efficient power system operation (reduced generation cost).
- Mid- and long-term forecast. Period: monthly or yearly. Use: [generators] maintenance planning and construction planning; [power supply and demand] preparation for supply-demand emergencies (peak demand).

The operation of the data acquisition unit (100) is described in detail below. After acquiring data, the data acquisition unit determines whether there is enough data to be used for analysis.

For this determination, the available data ratio is calculated; the data is judged usable when the available data ratio is α% or higher.

- α is a user-specified parameter; its default value is 80%.

- A user-specified minimum number of data points may be used instead of the available data ratio.

- Available data ratio: total number of data points (before preprocessing) / number of data points that should have been acquired

- Number of data points that should have been acquired: acquisition interval * period
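The ratio check above can be sketched as follows. The function names are hypothetical, and the sketch assumes the "acquisition interval" is expressed as a sampling rate (samples per hour), so that multiplying it by the period in hours gives the expected number of data points; if the interval is instead the time between samples, the expected count would be the period divided by the interval.

```python
def available_data_ratio(acquired_count, period_hours, samples_per_hour):
    """Return the available data ratio in percent."""
    # Assumption: expected count = sampling rate * period.
    expected_count = period_hours * samples_per_hour
    return 100.0 * acquired_count / expected_count


def is_usable(acquired_count, period_hours, samples_per_hour, alpha=80.0):
    """Data is judged usable when the ratio is alpha percent or higher (default 80)."""
    ratio = available_data_ratio(acquired_count, period_hours, samples_per_hour)
    return ratio >= alpha


# One day of 15-minute data should contain 24 * 4 = 96 points.
ratio = available_data_ratio(acquired_count=72, period_hours=24, samples_per_hour=4)
usable = is_usable(acquired_count=72, period_hours=24, samples_per_hour=4)
```

With 72 of 96 expected points the ratio is 75%, which falls below the default α of 80%.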

The available data ratio can also be visualized. In this regard, FIG. 2A shows a visualization of the available data ratio according to one embodiment of the present invention, and FIG. 2B shows a visualization of the available data ratio according to another embodiment.

The operation of the preprocessing unit (200) is described in detail below. The preprocessing unit (200) may obtain the second power data from the power data through an abnormal data removal process, a pattern change data removal process, and an outlier data removal process. In the pattern change data removal process, pattern clustering may be performed with reference to a switching point, that is, a point at which the difference between a predicted value (obtained by predicting the range of the load data) and the actually measured value exceeds a threshold. Pattern re-clustering may then be performed on the data before and after the switching point.

The preprocessing unit (200) may perform the following processes. In this regard, FIG. 3 is a conceptual diagram of the series of processes performed by the preprocessing unit according to the present invention.

1) Abnormal data removal

2) Pattern change data removal

3) Outlier data removal

4) Available data determination

5) Unacquired data generation

First, regarding the abnormal data removal process, FIG. 4 shows the time series data before and after abnormal data removal. Here, abnormal data means values of 0 or below, or a fixed value that appears repeatedly. Because such data results from recording errors, it is removed.
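The abnormal data rule can be sketched as below. The function name, the `min_run` parameter (how many identical consecutive values count as a "repeated fixed value"), and the use of `None` to mark removed points are assumptions; the source does not give a run length. The sketch also assumes the input contains no missing values yet.

```python
def remove_abnormal(values, min_run=3):
    """Replace non-positive values and runs of identical values with None."""
    cleaned = list(values)
    n = len(values)
    start = 0
    while start < n:
        # Extend the run while the next value equals the first value of the run.
        end = start
        while end + 1 < n and values[end + 1] == values[start]:
            end += 1
        run_length = end - start + 1
        if values[start] <= 0 or run_length >= min_run:
            for i in range(start, end + 1):
                cleaned[i] = None
        start = end + 1
    return cleaned


cleaned = remove_abnormal([5.0, 0.0, 6.1, 7.0, 7.0, 7.0, 6.5, -1.0])
```

Here the zero, the negative value, and the run of three identical readings are all removed.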

Next, regarding the pattern change data removal process, FIG. 5A shows the basic data used for switching pattern detection, and FIG. 5B shows an example of switching point detection. For switching point detection, the range of the next load value is predicted using Simple Exponential Smoothing, and a switching point is detected when the difference between the predicted value and the actually measured value exceeds a set threshold.
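Switching point detection with Simple Exponential Smoothing can be sketched as follows. The smoothing factor `alpha`, the `threshold` value, and the choice to initialize the level with the first observation are assumptions for illustration; the source names the technique but does not give parameter values.

```python
def detect_switching_points(values, alpha=0.3, threshold=10.0):
    """Return indices where |measured - SES forecast| exceeds `threshold`."""
    switching = []
    level = values[0]  # assumption: initialize the level with the first observation
    for t in range(1, len(values)):
        forecast = level  # the SES one-step-ahead forecast is the current level
        if abs(values[t] - forecast) > threshold:
            switching.append(t)
        # Standard SES update: blend the new observation with the previous level.
        level = alpha * values[t] + (1 - alpha) * level
    return switching


load = [100, 101, 99, 100, 150, 151, 149, 100, 101]
points = detect_switching_points(load, alpha=0.5, threshold=30.0)
```

In this series the load jumps at index 4 and returns at index 7, and both jumps are flagged.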

FIG. 5C shows an example of pattern clustering, and FIG. 5D shows an example of pattern re-clustering. In pattern clustering, the data is divided into classes with reference to the switching points: data before the switching point is assigned to class A, data in the switching section to class B, and data after the switching point to class C. In pattern re-clustering, the initially assigned classes are reclassified according to their clusters. For example, a class C whose pattern is similar to that of class A can be reclassified as class A.
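The A/B/C labeling and the re-clustering step can be sketched as below. The similarity test used here (class means within a relative tolerance) is an assumption chosen for brevity; the source says only that class C is reclassified when its pattern is similar to that of class A, and does not define the similarity measure.

```python
from statistics import mean


def cluster_by_switching(values, start, end):
    """Label samples A before `start`, B in [start, end), and C from `end` on."""
    return ["A" if t < start else "B" if t < end else "C"
            for t in range(len(values))]


def recluster(values, labels, tolerance=0.1):
    """Relabel class C as A when the class means differ by at most `tolerance` (relative)."""
    mean_a = mean(v for v, label in zip(values, labels) if label == "A")
    mean_c = mean(v for v, label in zip(values, labels) if label == "C")
    if abs(mean_c - mean_a) <= tolerance * abs(mean_a):
        return ["A" if label == "C" else label for label in labels]
    return list(labels)


load = [100, 102, 98, 150, 152, 101, 99]
labels = cluster_by_switching(load, start=3, end=5)
relabeled = recluster(load, labels)
```

After re-clustering, only the switching section (class B) remains as a distinct pattern.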

Next, FIG. 6 shows the time series data before and after removal of pattern change data following abnormal pattern detection. As described above, pattern clustering may be performed with reference to the switching point at which the difference between the predicted value of the load data range and the actually measured value exceeds the threshold, and pattern re-clustering may be performed on the data before and after the switching point.

Next, FIG. 7A shows a conceptual diagram of the outlier data removal process according to the present invention. FIG. 7B shows an example of applying the RNOF algorithm for abnormal data determination after outlier data removal according to the present invention. Here, RNOF stands for Robust Neighbor Outlier Factor.

The lower plot of FIG. 7B shows the result of applying RNOF. The blue line represents the measured data and the orange line represents the trend line. RNOF is calculated as the absolute value of the measured data value minus the trend value. The threshold is determined automatically based on the residual detection rate, and data whose residual exceeds the threshold may be judged to be outlier data and removed.
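The residual test can be sketched as follows. This is a minimal illustration that assumes the trend values have already been computed; the function name `flag_outliers` and the sample numbers are hypothetical.

```python
def flag_outliers(measured, trend, threshold):
    """Flag points whose absolute residual |measured - trend| exceeds the threshold."""
    residuals = [abs(m - t) for m, t in zip(measured, trend)]
    return [r > threshold for r in residuals]


measured = [10, 11, 30, 12]
trend = [10, 10, 11, 12]
flags = flag_outliers(measured, trend, threshold=5)
print(flags)
```

Only the third point, whose residual is 19, is flagged in this sample.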

The RNOF algorithm for abnormal data determination after outlier data removal is described next. The RNOF algorithm determines a normal range by multiplying the boundary values of the data acquired in a specific period, that is, the maximum and minimum values, by constants c1 and c2. For example, data lying between 50% of the minimum value and 150% of the maximum value of the data acquired in a specific period is judged to be normal. Because this method does not take the time-varying characteristics of the data into account, the normal range may be wider than necessary, in which case abnormal data may be erroneously judged to be normal.

FIG. 7C shows an example in which abnormal data cannot be identified when the RNOF algorithm is applied. Suppose that, as in the figure, the maximum value of the data acquired in a specific period is 100 and the minimum value is 10. Then 150 and 5 are the values corresponding to 150% of the maximum value and 50% of the minimum value, respectively. Data falling within this range is judged to be normal, and data falling outside it is judged to be abnormal.

Such a method judges the data point A in FIG. 7C to be normal, although it is in fact abnormal: given the time-varying characteristics of the data, such a value has a low probability of occurring.

Conventional abnormal data determination methods do not consider the time-varying characteristics of the data. They are therefore likely to misjudge abnormal data as normal data.

Therefore, in order to account for the time-varying characteristics of the data, the present invention judges data to be abnormal when it falls outside the normal range defined relative to a trend value extracted using a moving average method.

To this end, as described above, the time series trend extraction unit of the load prediction unit 300 may extract the time series trend value of the time series data of the second power data according to the trimmed vicinity mean method, a moving average method corrected for outlier data.

In this regard, the time series trend is extracted using a moving average method. A moving average is the mean of the data at time t and the data at neighboring time points. The ordinary moving average includes neighboring data in the mean even when that data is outlier data, so when outlier data is present it fails to extract the normal time series trend. The present invention therefore uses a modified moving average method, referred to as the Trimmed Vicinity Mean Method. FIG. 7D is a conceptual diagram illustrating the principle of the trimmed vicinity mean method when neighboring values in the time series data differ from the mean.

FIG. 7D illustrates the trimmed vicinity mean calculation, that is, the operation of the time series trend extraction unit. When the vicinity mean at time t is calculated, data at neighboring time points is considered in addition to the data at time t.

The purpose of the trimmed vicinity mean is to keep the vicinity mean from being affected by outlier data. To this end, when the moving average at time t is computed, the data at time t itself is excluded, because if that value is an outlier it would distort the vicinity mean. In addition, a trimmed mean is used when computing the vicinity mean, which further reduces the influence of outlier values.

For example, suppose that the values considered when calculating the vicinity mean at time t are [-100, 1, 2, 500, 2, 3, 100] and that the value at time t is 500. Excluding the value at time t and the trimmed values leaves [1, 2, 2, 3], whose mean is 2.
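The worked example can be reproduced with a short sketch. The specification does not state how many values are trimmed; the code assumes that one lowest and one highest neighbor are discarded, which matches the numbers in the example. The function name `trimmed_vicinity_mean` and the `trim` parameter are hypothetical.

```python
def trimmed_vicinity_mean(window, center_index, trim=1):
    """Mean of the neighbors of window[center_index], after dropping the
    center value and the `trim` smallest and `trim` largest neighbors."""
    neighbours = [v for i, v in enumerate(window) if i != center_index]
    ordered = sorted(neighbours)
    kept = ordered[trim:len(ordered) - trim] if trim > 0 else ordered
    if not kept:
        raise ValueError("window too small for the requested trim")
    return sum(kept) / len(kept)


window = [-100, 1, 2, 500, 2, 3, 100]
trend_at_t = trimmed_vicinity_mean(window, center_index=3)
print(trend_at_t)  # 2.0
```

Neither the outlier at time t (500) nor the extreme neighbors (-100 and 100) influence the result.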

The time series trend removal unit of the load prediction unit 300 calculates the residual (remainder), which is the difference between the raw data of the second power data and the time series trend value.

The statistics acquisition unit of the load prediction unit 300 calculates the sum of the residuals at specific time intervals and obtains statistics on the residuals. Specifically, the statistics acquisition unit sums the residuals calculated by several methods. More than one method exists because different definitions of the neighboring time points yield different residuals.

For example, when the moving average at 2 p.m. is calculated, the neighboring values considered are the values at the following time points, assuming that up to two neighboring points on each side are considered.

1-hour interval: [12 p.m., 1 p.m., 2 p.m., 3 p.m., 4 p.m.]

24-hour interval: [2 p.m. two days before, 2 p.m. the day before, 2 p.m. on the day, 2 p.m. the next day, 2 p.m. two days after]

1-week interval: [2 p.m. two weeks before, 2 p.m. one week before, 2 p.m. on the day, 2 p.m. one week after, 2 p.m. two weeks after]
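The three neighborhood definitions differ only in the spacing between time points, as the following minimal sketch shows. The function name `vicinity_times`, the parameter `k` (number of neighbors on each side), and the sample date are hypothetical.

```python
from datetime import datetime, timedelta


def vicinity_times(t, step, k=2):
    """Return the 2k + 1 time points centered on t and spaced by `step`."""
    return [t + i * step for i in range(-k, k + 1)]


t = datetime(2018, 11, 8, 14)  # 2 p.m. on an arbitrary sample day
hourly = vicinity_times(t, timedelta(hours=1))
daily = vicinity_times(t, timedelta(hours=24))
weekly = vicinity_times(t, timedelta(weeks=1))
print(hourly[0], daily[0], weekly[0])
```

Each list contains five time points with time t in the middle, corresponding to the hourly, daily, and weekly rows of the list.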

The oil temperature acquired from a transformer IoT sensor is affected by the amount of power people consume. Because the power consumed, that is, the load, varies with time, the neighboring values may be defined at 1-hour, 24-hour, and 1-week intervals. (The external temperature of the transformer is affected by the weather and therefore also varies with time, so the same intervals as for the oil temperature may be applied.)

The threshold determination unit of the load prediction unit 300 determines the threshold using one of a plurality of q% quantiles of the residuals calculated by the statistics acquisition unit. A data set may then be generated according to the trend change in the second power data based on the determined threshold.

In this regard, the threshold acquisition unit calculates the q% quantile of the residuals calculated by the statistics acquisition unit using the bootstrap method. The bootstrap method estimates a statistic by drawing a sample with replacement of the same size as the data, computing the statistic on that sample, repeating this n times, and averaging the results.

For example, suppose the q% quantile of the residuals is to be found. A sample of the same size as the residual data is drawn with replacement and its q% quantile is calculated; call this q1. The procedure is then repeated, and the resulting quantiles are denoted q1, q2, ..., q1000, where the i in qi is an index. Averaging the 1000 quantiles obtained in this way gives the bootstrap q% quantile of the residuals.
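The bootstrap procedure can be sketched with the Python standard library. The specification does not say how the quantile of a finite sample is defined; the code assumes the nearest-rank definition. The function names, the fixed `seed`, and the sample residuals are hypothetical.

```python
import math
import random


def quantile(values, q):
    """Nearest-rank q% quantile of a non-empty list (an assumed definition)."""
    ordered = sorted(values)
    rank = math.ceil(q / 100 * len(ordered))
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]


def bootstrap_quantile(residuals, q, n_rounds=1000, seed=0):
    """Average the q% quantile over n_rounds resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(residuals)
    estimates = []
    for _ in range(n_rounds):
        sample = rng.choices(residuals, k=n)  # same size as the data, with replacement
        estimates.append(quantile(sample, q))
    return sum(estimates) / n_rounds


residuals = [0.1, 0.2, 0.3, 0.4, 10.0]
estimate = bootstrap_quantile(residuals, q=100, n_rounds=200)
print(estimate)
```

Repeating the call for q = 100, 99.9, ..., 99.0 would give the 11 candidate quantiles from which the threshold is later chosen.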

Statistics obtained with the bootstrap method are known to estimate the population statistic better than the plain sample statistic. The threshold acquisition unit sets q to 100, 99.9, 99.8, ..., 99.0 and calculates the quantiles for these 11 cases.

The threshold determination unit then decides which of the 11 quantiles to use.

Threshold determination method 1) The quantile whose value differs from the preceding quantile by c or more, where c is a user-specified parameter (default: 1).

Threshold determination method 2) The quantile whose value differs most from the preceding quantile, plus the standard deviation of the residual data multiplied by α, where α is a user-specified parameter (default: 1.5).
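The two rules can be sketched as follows. The specification leaves several details open, so the code makes explicit assumptions: the quantile values are ordered by increasing q (99.0 up to 100), method 1 returns the first quantile that jumps by at least c and falls back to the largest quantile when none does, and the population standard deviation is used in method 2. The function names and sample values are hypothetical.

```python
import statistics


def threshold_method_1(quantiles, c=1.0):
    """First quantile that exceeds its predecessor by at least c (assumed ordering)."""
    for prev, cur in zip(quantiles, quantiles[1:]):
        if cur - prev >= c:
            return cur
    return quantiles[-1]  # assumption: fall back to the largest quantile


def threshold_method_2(quantiles, residuals, alpha=1.5):
    """Quantile with the largest jump from its predecessor, plus alpha * std."""
    jumps = [cur - prev for prev, cur in zip(quantiles, quantiles[1:])]
    index = jumps.index(max(jumps)) + 1
    return quantiles[index] + statistics.pstdev(residuals) * alpha


quantiles = [1.0, 1.5, 3.0, 8.0]  # shortened sample list, ordered by increasing q
t1 = threshold_method_1(quantiles, c=1.0)
t2 = threshold_method_2(quantiles, residuals=[2.0, 2.0], alpha=1.5)
print(t1, t2)
```

In the sample, method 1 selects 3.0 (the first jump of at least 1) and method 2 selects 8.0 (the largest jump), to which nothing is added because the sample residuals have zero spread.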

This series of load prediction steps, which takes the time series trend into account, has the advantage of removing data whose change in value falls in the abnormal range. A sharp change in a time series value means an extreme deviation from the trend of the time series data. Data that deviates from the trend changes sharply relative to the preceding value, and because this is an abnormal change in value, such data can be regarded as outlier data.

Meanwhile, the available data determination may be performed not only in the data acquisition unit 100 but also in the preprocessing unit 200.

In this regard, the available data determination unit determines whether enough data remains for analysis after preprocessing.

For this determination, the available data ratio is calculated, and the data is judged to be usable if the available data ratio is α% or more.

- α is a user-specified parameter with a default value of 80%.

- Instead of the available data ratio, a user-specified minimum number of data points may be used.
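The availability check can be sketched as below. The sketch assumes that missing or removed values are represented as `None`; the function name `is_usable` and the parameter names are hypothetical.

```python
def is_usable(series, min_ratio=0.8, min_count=None):
    """Return True if enough non-null values remain for analysis.

    If min_count is given, it is used in place of the ratio test.
    """
    if not series:
        return False
    available = sum(1 for v in series if v is not None)
    if min_count is not None:
        return available >= min_count
    return available / len(series) >= min_ratio


print(is_usable([1.0, None, 3.0, 4.0, 5.0]))   # 4 of 5 values present
print(is_usable([1.0, None, None, 4.0, 5.0]))  # 3 of 5 values present
```

With the default ratio of 80%, the first series passes and the second does not.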

Regarding the generation of unacquired data, the preprocessing unit 200 may train the LSTM algorithm by feeding the value predicted at the previous time point back in as input in order to generate the unacquired data. Likewise, at load prediction time the load prediction unit 300 may feed the value predicted at the previous time point back in as input, in the same way as during training.

Specifically, in the unacquired data generation process, if enough data exists to be usable, data that was not acquired or that was removed during preprocessing is generated. The unacquired data learning algorithm is described in detail below.

To train the prediction algorithm on data containing unacquired values, a modified structure of the LSTM (Long Short-Term Memory) algorithm, one of the Recurrent Neural Network models, may be used. The present invention pursues two aims here: learning and prediction of unacquired data, and medium- and long-term prediction.

First, learning and prediction of unacquired data are as follows. FIG. 8 shows a conceptual diagram of the unacquired data learning and prediction process according to the present invention. The data used to train a prediction algorithm is normally assumed to contain no null (unacquired) values. Therefore, in order to generate the unacquired data, the LSTM algorithm is trained by feeding the value predicted at the previous time point back in as input.

At prediction time as well, if the input data contains null (unacquired) values, the data predicted at the previous time point is used as input, in the same way as during training.
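The feedback of predicted values into null positions can be sketched independently of the network itself. In the sketch below, `predict_next` is a hypothetical stand-in for the trained LSTM (here a simple mean of recent values), nulls are represented as `None`, and the `history` length is an assumption.

```python
def fill_nulls_with_predictions(series, predict_next, history=3):
    """Replace each None with the model's prediction from the preceding values,
    so later predictions may themselves depend on earlier predicted values."""
    filled = []
    for value in series:
        if value is None:
            if not filled:
                raise ValueError("cannot predict before any value is available")
            value = predict_next(filled[-history:])
        filled.append(value)
    return filled


def mean_predictor(recent):
    """Stand-in for a trained model: predicts the mean of the recent values."""
    return sum(recent) / len(recent)


filled = fill_nulls_with_predictions([1.0, 2.0, None, 4.0], mean_predictor, history=2)
print(filled)
```

The null at the third position is replaced by 1.5, the stand-in prediction from the two preceding values, and that filled value is then available as input for later steps.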

Meanwhile, the load prediction unit 300 may perform load prediction as follows. The load prediction unit 300 may include a time series trend extraction unit that extracts the time series trend value of the time series data of the second power data according to the trimmed vicinity mean method, a moving average method corrected for outlier data. It may also include a time series trend removal unit that calculates the residual (remainder) corresponding to the difference between the raw data of the second power data and the time series trend value.

The load prediction unit 300 may further include a statistics acquisition unit that calculates the sum of the residuals at specific time intervals and obtains statistics on the residuals. It may further include a threshold determination unit that determines the threshold using one of a plurality of q% quantiles of the residuals calculated by the statistics acquisition unit. A data set may then be generated according to the trend change in the second power data based on the determined threshold.

Accordingly, as described above, the load prediction unit 300 removes the trend change data and then proceeds with short-term, medium-term, or long-term prediction. The trend change data removal process is described first. FIG. 9 shows a conceptual diagram of the trend change data removal process according to the present invention. The trend change data removal unit detects whether the trend has changed and, if so, extracts the data after the trend change to form a new data set. That is, a post-trend-change data set is generated in addition to the full data set.

Trend changes are considered because, when the time series pattern has changed, using the data from before the change may degrade the performance of the prediction model. Therefore, when the prediction model is subsequently trained, a model is trained on the full data set and on the post-trend-change data set respectively, and the model with the lower validation loss is selected as the final model.
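The data set construction and model selection can be sketched as follows. The sketch assumes that the trend change has already been located at an index `change_index` and that a validation loss has been measured for the model trained on each data set; the function names and the loss values are hypothetical.

```python
def build_datasets(series, change_index=None):
    """Return the full data set and, if a trend change was detected,
    a second data set holding only the data after the change."""
    datasets = {"full": list(series)}
    if change_index is not None:
        datasets["post_change"] = list(series[change_index:])
    return datasets


def select_final_model(validation_losses):
    """Return the name of the data set whose model has the lowest validation loss."""
    return min(validation_losses, key=validation_losses.get)


datasets = build_datasets([10, 11, 10, 30, 31, 30], change_index=3)
# Hypothetical losses measured after training one model per data set.
best = select_final_model({"full": 0.42, "post_change": 0.17})
print(datasets["post_change"], best)
```

When no trend change is detected, only the full data set exists and its model is used.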

Next, the long-term prediction algorithm for load prediction is described. FIG. 10 is a conceptual diagram illustrating the principle of the HOLSTM algorithm according to the present invention. Long-term load prediction may be performed using the HOLSTM (High Order LSTM) algorithm, a modification of the LSTM (Long Short-Term Memory) algorithm. Short- and medium-term load prediction, on the other hand, may be performed using either the HOLSTM algorithm or the LSTM algorithm.

For long-term prediction, the HOLSTM (High Order LSTM) algorithm, a modification of the LSTM algorithm, may be used. HOLSTM is the proposed (inventive) method, which applies the High Order concept of HORNN (High Order Recurrent Neural Networks) to the long-term memory of the LSTM.

When calculating the long-term memory (cell) value, this method additionally uses the long-term memory (cell) values from two or more time points earlier, each multiplied by a remember gate (the red line in the figure), and thereby conveys long-term information effectively.

- remember gate: takes a value between 0 and 1 and determines how much of the long-term memory value is carried forward.

The load prediction algorithms at time t according to the LSTM and HOLSTM algorithms are as follows.

Accordingly, when calculating the long-term memory (cell) value, the HOLSTM algorithm also uses the long-term memory (cell) values from time t-2 and earlier, each multiplied by a remember gate (the red line in the figure), which has the advantage of conveying long-term information effectively.
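As a hedged illustration of the idea, the standard LSTM cell update can be written next to one possible high-order variant. The first line is the well-known LSTM update; the second is only an assumed form consistent with the description above. The symbols r_t^{(k)} for the remember gates, K for the highest order, and the retention of the forget-gate term are assumptions and are not taken from the specification.

```latex
% Standard LSTM cell update (f_t: forget gate, i_t: input gate, \tilde{c}_t: candidate)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t

% Assumed high-order variant: earlier cell states c_{t-k}, k >= 2, are added,
% each scaled by a remember gate r_t^{(k)} with values between 0 and 1
c_t = f_t \odot c_{t-1} + \sum_{k=2}^{K} r_t^{(k)} \odot c_{t-k} + i_t \odot \tilde{c}_t
```

In this form, setting every remember gate to 0 recovers the ordinary LSTM update.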

The short- and medium-term prediction algorithms are described next. For short- and medium-term prediction, the prediction model may be trained using the HOLSTM or LSTM algorithm used for long-term prediction. During training, learning rate decay may be applied by multiplying the learning rate by 0.8 every 5 or 10 iterations.

In this regard, the load prediction unit 300 may dynamically change the learning rate during training for short- and medium-term load prediction. When the MMSE (Minimum Mean Squared Error) obtained during training has not improved for d iterations, the learning rate may be multiplied by gamma, so that the learning rate decay is applied dynamically.
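Both schedules can be sketched in a few lines. The fixed schedule multiplies the learning rate by a factor every `step` iterations, and the dynamic schedule multiplies it by gamma once the best loss has not improved for d iterations. The function names, the default gamma of 0.8, and the reset of the counter after each decay are assumptions.

```python
def step_decay(initial_lr, iteration, step=5, factor=0.8):
    """Fixed schedule: multiply the learning rate by `factor` every `step` iterations."""
    return initial_lr * factor ** (iteration // step)


def plateau_decay(losses, initial_lr=0.01, gamma=0.8, d=3):
    """Dynamic schedule: multiply the learning rate by gamma whenever the best
    (minimum) loss has not improved for d iterations. Returns the learning
    rate in effect after each iteration."""
    lr = initial_lr
    best = float("inf")
    stale = 0
    history = []
    for loss in losses:
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= d:
                lr *= gamma
                stale = 0  # assumption: restart the count after each decay
        history.append(lr)
    return history


rates = plateau_decay([1.0, 0.9, 0.95, 0.96, 0.97, 0.5], initial_lr=1.0, gamma=0.5, d=3)
print(rates)
```

In the sample run, the loss fails to improve on 0.9 for three iterations, so the learning rate is halved at the fifth iteration.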

FIG. 11 shows the error as a function of the number of training iterations for the LSTM and HOLSTM algorithms of the present invention. Referring to FIG. 11, the HOLSTM algorithm has the advantage of reducing the error with relatively few training iterations.

FIG. 12 shows the power load prediction results by period for the LSTM and HOLSTM algorithms of the present invention. It can be seen that the LSTM repeats a one-week prediction pattern, whereas the HOLSTM predictions reflect a somewhat wider variety of periodicities.

FIG. 13 shows the load prediction result when date information is used with the HOLSTM algorithm of the present invention. As shown in FIG. 13, prediction performance improves further when date information such as public holidays and weekends is used. Thus, reflecting additional data (metadata) such as date information (public holidays, weekends) and temperature information yields more accurate load prediction results.

The load prediction system through correction of unacquired data according to the present invention has been described above. In this regard, FIG. 14 compares the prediction result when the load pattern change point is not detected with the prediction result when the pattern change is detected according to the present invention. As shown in FIG. 14(a), when the load pattern change point is not detected, the load is predicted inaccurately. In contrast, as shown in FIG. 14(b), when the load pattern change point is detected, the load can be predicted with high accuracy.

Load prediction through correction of unacquired data may be performed using the load prediction system described above, and the description given for the system applies equally to the load prediction method.

FIG. 15 shows a flowchart of the load prediction method through correction of unacquired data according to the present invention. Referring to FIG. 15, the load prediction method includes a data acquisition step (S100), a preprocessing step (S200), and a load prediction step (S300).

In the data acquisition step (S100), power data is acquired and it is then determined whether the power data contains at least the minimum number of data points required for analysis. In the preprocessing step (S200), it is determined whether second power data, obtained by removing some data from the power data, meets the available data ratio, and unacquired data (null data) is generated. In the load prediction step (S300), a data set is generated according to the trend change in the second power data, and load prediction is performed with the data set according to a first prediction algorithm or a second prediction algorithm.

As described above, the load prediction step may include a time series trend extraction step, a time series trend removal step, a statistics acquisition step, and a threshold determination step.

In the time series trend extraction step, the time series trend value of the time series data of the second power data may be extracted according to the trimmed vicinity mean method, a moving average method corrected for outlier data. In the time series trend removal step, the residual (remainder) corresponding to the difference between the raw data of the second power data and the time series trend value may be calculated. In the statistics acquisition step, the sum of the residuals may be calculated at specific time intervals to obtain statistics on the residuals. In the threshold determination step, the threshold may be determined using one of a plurality of q% quantiles of the residuals calculated by the statistics acquisition unit. A data set may then be generated according to the trend change in the second power data based on the determined threshold, and load prediction may be performed with the data set according to the first prediction algorithm or the second prediction algorithm.

The load prediction system and the load prediction method through correction of unacquired data according to the present invention have been described above. The technical effects of at least one embodiment of the present invention are as follows.

According to the present invention, abnormal data is removed from the power load data, the point at which the pattern changes is detected, and outlier data is then removed, so that the load data can be preprocessed in a way that increases the accuracy of the load prediction algorithm.

In a software implementation, the procedures and functions described herein, as well as the individual components, may be implemented as separate software modules. Each software module may perform one or more of the functions and operations described herein. The software code may be implemented as a software application written in a suitable programming language, stored in a memory, and executed by a controller or processor.

Claims

1. A load prediction system through correction of unacquired data, comprising:
a data acquisition unit that acquires power data and determines whether the power data contains at least a minimum number of data points required for analysis;
a preprocessing unit that determines whether second power data, obtained by removing some data from the power data, meets an available data ratio, and generates unacquired data (null data); and
a load prediction unit that generates a data set according to a trend change in the second power data and performs load prediction with the data set according to a first prediction algorithm or a second prediction algorithm.

2. The system according to claim 1,
wherein the load prediction unit
performs long-term load prediction of the power system using the first prediction algorithm, and
performs short- and medium-term load prediction of the power system using the second prediction algorithm.

3. The system according to claim 2,
wherein the long-term load prediction is performed using a HOLSTM (High Order LSTM) algorithm, which is a modification of the LSTM (Long Short-Term Memory) algorithm, and
the short- and medium-term load prediction is performed using the HOLSTM algorithm or the LSTM algorithm.

4. The system of claim 3,
wherein the load predictor dynamically changes the learning rate during training for the short/medium-term load prediction,
the learning rate being decayed by multiplying it by gamma when the minimum mean squared error (MMSE) obtained during training has not been updated for d times.
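
The decay rule in claim 4 can be sketched as a small scheduler. This is an illustration only: the class name `PlateauDecay` is hypothetical, and resetting the counter after each decay is an assumption that the claim does not state.

```python
class PlateauDecay:
    """Multiply the learning rate by gamma when the best MSE has not improved d times."""

    def __init__(self, lr: float, gamma: float, d: int) -> None:
        self.lr = lr
        self.gamma = gamma
        self.d = d
        self.best_mse = float("inf")  # minimum MSE seen so far
        self.stale = 0                # evaluations since the minimum was last updated

    def step(self, mse: float) -> float:
        """Record one evaluation and return the learning rate to use next."""
        if mse < self.best_mse:
            self.best_mse = mse
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.d:
                self.lr *= self.gamma
                self.stale = 0  # assumption: start counting again after a decay
        return self.lr
```

With `lr=0.1`, `gamma=0.5`, and `d=2`, two evaluations in a row that fail to improve on the best MSE halve the learning rate to 0.05.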

5. The system of claim 3,
wherein the preprocessor obtains the second power data through a process of removing abnormal data, a process of removing pattern change data, and a process of removing anomalous data from the power data.

6. The system of claim 5,
wherein, in the pattern change data removal process,
pattern clustering is performed on the basis of a transition time point at which the difference between the predicted value and the actual measured value of the load data exceeds a threshold,
and pattern re-clustering is performed on the data before and after the transition time point.
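
Locating the transition time point in claim 6 can be sketched as follows. The sketch only finds the points where the prediction error exceeds the threshold and splits the series there; the clustering algorithm itself is not specified in this section and is not implemented. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def find_transition_points(
    predicted: Sequence[float], actual: Sequence[float], threshold: float
) -> List[int]:
    """Indices where |predicted - actual| exceeds the threshold."""
    return [
        i for i, (p, a) in enumerate(zip(predicted, actual)) if abs(p - a) > threshold
    ]


def split_at_first_transition(
    predicted: Sequence[float], actual: Sequence[float], threshold: float
) -> Tuple[List[float], List[float]]:
    """Split the measured data into the segments before and after the first transition."""
    points = find_transition_points(predicted, actual, threshold)
    if not points:
        return list(actual), []
    t = points[0]
    return list(actual[:t]), list(actual[t:])
```

Each of the two returned segments could then be passed to whatever clustering routine is used for re-clustering.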

7. The system of claim 1,
wherein the preprocessor trains the LSTM algorithm by feeding the value predicted at the previous time point back in as input in order to generate the unacquired data,
and the load predictor, when predicting the load, feeds the value predicted at the previous prediction time point back in as input in the same manner as during training.
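
The feedback of predicted values described in claim 7 can be sketched as a recursive gap-filling loop. The one-step predictor here is a simple linear extrapolation standing in for a trained LSTM, which is an assumption made to keep the example self-contained; `fill_unacquired`, `linear_extrapolate`, and the `window` parameter are hypothetical names.

```python
from typing import Callable, List, Optional, Sequence


def linear_extrapolate(window: Sequence[float]) -> float:
    """Stand-in one-step predictor (NOT an LSTM): continue the last observed slope."""
    return window[-1] + (window[-1] - window[-2])


def fill_unacquired(
    series: Sequence[Optional[float]],
    predict_next: Callable[[Sequence[float]], float],
    window: int,
) -> List[float]:
    """Replace None entries with predictions, feeding each prediction back as input."""
    filled: List[float] = []
    for value in series:
        if value is None:
            if len(filled) < window:
                raise ValueError("not enough preceding values to predict from")
            # The window may already contain earlier predictions, not only measurements.
            value = predict_next(filled[-window:])
        filled.append(value)
    return filled
```

In a run of consecutive gaps, the second prediction is computed from the first predicted value, which mirrors the feedback the claim describes for both training and prediction.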

8. The system of claim 1,
wherein the load predictor comprises:
a time series trend extracting unit for extracting a time series trend value from the time series data of the second power data according to a truncated surrounding average method, which is a moving average method that corrects for abnormal data; and
a time series trend eliminator for calculating a residual corresponding to the difference between the raw data of the second power data and the time series trend value.
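
One possible reading of the truncated surrounding average in claim 8 is a centered moving average that discards the most extreme values in each window before averaging. The sketch below follows that reading; the `half_width` and `trim` parameters and the handling of short windows at the edges are assumptions.

```python
from typing import List, Sequence


def trimmed_moving_average(data: Sequence[float], half_width: int, trim: int) -> List[float]:
    """Centered moving average that drops the `trim` smallest and largest window values."""
    trend = []
    for i in range(len(data)):
        window = sorted(data[max(0, i - half_width): i + half_width + 1])
        # Only trim when values would remain afterwards (edge windows are shorter).
        if len(window) > 2 * trim:
            window = window[trim: len(window) - trim]
        trend.append(sum(window) / len(window))
    return trend


def residuals(data: Sequence[float], trend: Sequence[float]) -> List[float]:
    """Difference between the raw data and the trend value at each time point."""
    return [x - t for x, t in zip(data, trend)]
```

Because the spike is trimmed out of every window that contains it, the trend stays flat and the spike shows up entirely in the residual.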

9. The system of claim 8,
wherein the load predictor further comprises:
a statistic obtaining unit for calculating a sum of the residuals at specific time intervals and obtaining a statistic for the residuals; and
a threshold value determiner for determining a threshold using any one of a plurality of q-th percentiles of the residuals calculated by the statistic obtaining unit,
and generates a data set according to a trend change of the second power data based on the determined threshold.
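
The thresholding in claim 9 can be sketched as below. The sketch sums the residuals over fixed-length intervals, takes a nearest-rank q-th percentile of the absolute sums as the threshold, and flags the intervals that exceed it. Using absolute values, the nearest-rank percentile definition, and the function names are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def interval_sums(residuals: Sequence[float], interval: int) -> List[float]:
    """Sum of the residuals over consecutive blocks of `interval` samples."""
    return [sum(residuals[i: i + interval]) for i in range(0, len(residuals), interval)]


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank q-th percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def trend_change_intervals(
    residuals: Sequence[float], interval: int, q: float
) -> Tuple[float, List[int]]:
    """Return the threshold and the indices of intervals whose |sum| exceeds it."""
    sums = [abs(s) for s in interval_sums(residuals, interval)]
    threshold = percentile(sums, q)
    return threshold, [i for i, s in enumerate(sums) if s > threshold]
```

The flagged interval indices mark candidate trend changes at which the second power data could be divided into separate data sets.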

10. A method for predicting a load through correction of unacquired data, comprising:
a data acquiring step of acquiring power data and determining whether the amount of power data is equal to or greater than a minimum number of data necessary for analysis;
a preprocessing step of determining whether second power data, obtained by removing partial data from the power data, is equal to or higher than an available data rate, and generating null data; and
a load prediction step of generating a data set according to a trend change of the second power data and performing load prediction according to a first prediction algorithm or a second prediction algorithm using the data set.

11. The method of claim 10,
wherein the load prediction step includes:
a time series trend extracting step of extracting a time series trend value from the time series data of the second power data according to a truncated surrounding average method, which is a moving average method that corrects for abnormal data;
a time series trend removal step of calculating a residual corresponding to the difference between the raw data of the second power data and the time series trend value;
a statistic obtaining step of calculating a sum of the residuals at specific time intervals and obtaining a statistic for the residuals; and
a threshold value determination step of determining a threshold using any one of a plurality of q-th percentiles of the residuals calculated in the statistic obtaining step,
wherein a data set is generated according to a trend change of the second power data based on the determined threshold.