JP2016126718A

JP2016126718A - Time series data prediction device and time series data prediction method

Info

Publication number: JP2016126718A
Application number: JP2015002449A
Authority: JP
Inventors: 英朋境野; Hidetomo Sakaino
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-01-08
Filing date: 2015-01-08
Publication date: 2016-07-11

Abstract

PROBLEM TO BE SOLVED: To provide a time-series data prediction device and method that improve the accuracy of applying support vector regression to steep time-series data and, because cross-validation is not used, allow data to be predicted online.
SOLUTION: The time-series data prediction device comprises: a statistical analysis unit 203 that extracts the statistical properties of captured time-series data for each time interval; a kernel selection unit 204 that selects, for each time interval, a kernel function that matches the statistical properties extracted for that interval by the statistical analysis unit; and a data prediction unit 205 that predicts the time-series data by calculating its temporal change after the time interval of the captured time-series data, using a support vector regression model that uses the kernel function selected for each time interval by the kernel selection unit.
SELECTED DRAWING: Figure 2

Description

The present invention relates to a time-series data prediction apparatus and a time-series data prediction method, and more particularly to a time-series data prediction apparatus and method that use support vector regression.

Time-series data prediction is an important prediction tool, for example in forecasting power consumption in the electric power field. Beyond power consumption forecasting, it is used in a variety of fields, including monitoring of real environments in the energy, communication, and sensing fields, and image sensing. Many methods have been devised for predicting data that changes monotonically or smoothly.

Time-series data is generally predicted by approximating the curve obtained by plotting the time-varying data with a straight line or a curve. FIG. 1 shows examples of conventional time-series data prediction, taking power consumption as an example: FIG. 1(a) shows prediction of power consumption by linear regression analysis (see Non-Patent Document 1), and FIG. 1(b) shows prediction of power consumption by nonlinear multiple regression analysis. The solid lines in FIGS. 1(a) and 1(b) are the temporal change in power consumption measured by a power meter; the broken line in FIG. 1(a) is the prediction function obtained by linear regression analysis, and the broken line in FIG. 1(b) is the prediction function obtained by nonlinear multiple regression analysis. The temporal change in power consumption measured by a power meter often involves large fluctuations. This temporal change can be predicted by linear regression analysis (FIG. 1(a)). In the example of FIG. 1(a), the simplest first-order regression equation y = ax + b is applied to the power consumption y (in W) measured every minute, and the unknowns a and b can be estimated easily by the linear least-squares method. However, the prediction function obtained by linear multiple regression analysis differs greatly depending on how the influence of outliers in the data is taken into account, so the prediction error tends to become large.
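The least-squares estimate of a and b, and its sensitivity to outliers, can be illustrated with a minimal sketch. The readings below are hypothetical values chosen for illustration, not data from this document, and `fit_line` is an illustrative helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates of a and b in y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical readings: minute index and power consumption in watts.
minutes = [0, 1, 2, 3, 4]
watts = [100.0, 102.0, 104.0, 106.0, 108.0]
a, b = fit_line(minutes, watts)

# The same series with a single outlier in the last reading.
a_out, b_out = fit_line(minutes, [100.0, 102.0, 104.0, 106.0, 300.0])
```

With the clean series the fit recovers the slope and intercept of the underlying line; replacing one reading with an outlier changes both estimates substantially, which is the weakness described above.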

To reduce such prediction errors, there is a method, shown in FIG. 1(b), that estimates time-series data by nonlinear multiple regression analysis: weather factors such as air temperature are taken into account, and squared terms are used as variables of a linear multiple regression against the maximum power demand accumulated in the past (Non-Patent Document 2). However, prediction of time-series data by nonlinear multiple regression analysis as in FIG. 1(b) relies on past data accumulated under special conditions, such as those of an electric power company, and cannot be applied to a prediction model for power consumption in general.

For this reason, there has been demand for a more general-purpose time-series data prediction method, and for a time-series prediction model that works well with commonly available observation and measurement data. In particular, when time-series data that changes discontinuously or steeply is to be approximated by a straight line or a curve, it is clear that ordinary linear least squares, polynomial approximation based on smooth basis functions, and similar methods have mathematical limits in fitting.

Meanwhile, support vector regression (SVR), which is based on the support vector machine (SVM; see Non-Patent Document 3), has recently attracted attention for time-series data prediction. Even with little training data, support vector regression fits a regression model to time-series data more accurately than conventional methods, and prediction accuracy improves.

Non-Patent Document 1: P. P. Vaidyanathan, "The Theory of Linear Prediction", California Institute of Technology, 2008, [online], [retrieved December 2, 2012], Internet <URL: http://authors.library.caltech.edu/25063/1/S00086ED1V01Y200712SPR003.pdf>
Non-Patent Document 2: Takeshi Haida and Shoichi Muto, "Development of a Maximum Demand Forecasting Support System Based on Multiple Regression", Operations Research, vol. 41, pp. 476-480
Non-Patent Document 3: "Support Vector Machine" (サポートベクターマシン), Wikipedia, [online], [retrieved December 2, 2012], Internet <URL: http://ja.wikipedia.org/wiki/%E3%82%B5%E3%83%9D%E3%83%BC%E3%83%88%E3%83%99%E3%82%AF%E3%82%BF%E3%83%BC%E3%83%9E%E3%82%B7%E3%83%B3>
Non-Patent Document 4: Shotaro Akaho, "Support Vector Machine" (Institute of Statistical Mathematics open lecture "Frontiers of Kernel Methods: SVM, Nonlinear Data Analysis, Structured Data"), National Institute of Advanced Industrial Science and Technology, July 6-7, 2006, [online], [retrieved December 2, 2012], Internet <URL: http://www.ism.ac.jp/~fukumizu/ISM_lecture_2006/svm-ism.pdf>

When approximating time-series data, support vector regression requires a kernel function and its model parameters to be chosen. These are selected empirically or quasi-optimized by brute-force calculation. Even after quasi-optimization, however, the error often remains large when a kernel function is fitted to time-series data that changes steeply. In particular, cross-validation, one form of brute-force calculation, demands an enormous number of combination calculations. Cross-validation is therefore known to be effective for prior training or after-the-fact verification but unsuitable for real-time processing. Moreover, the search range of the assumed model parameters is set manually, and the search resolution depends on that manually set range, so optimization by cross-validation alone may be insufficient.
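To give a feel for the combinatorial cost, the following sketch counts the model fits a cross-validated grid search would need. The grid sizes, the regularization values, and the fold count are all hypothetical; the source gives no such figures.

```python
from itertools import product

# Hypothetical search grid; none of these values come from the source.
kernels = ["gaussian", "product", "power", "tanh"]
widths = [0.1 * k for k in range(1, 11)]      # 10 candidate kernel parameters
penalties = [10 ** k for k in range(-2, 3)]   # 5 candidate regularization values
folds = 5                                     # assumed number of cross-validation folds

# Every combination is trained once per fold.
grid = list(product(kernels, widths, penalties))
fits_needed = len(grid) * folds
```

Even this small grid requires 1000 training runs per update, and the count grows multiplicatively with each additional parameter or finer search resolution.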

The present invention therefore focuses on the statistical properties of time-series data and provides a time-series data prediction apparatus and method based on adaptive kernel prediction, in which a kernel function for support vector regression is selected for each time interval to predict the time-series data.

To achieve this object, a first aspect of the present invention is a time-series data prediction apparatus that predicts time-series data by support vector regression, comprising: a statistical analysis unit that extracts the statistical properties of captured time-series data for each time interval; a kernel selection unit that selects, for each time interval, a kernel function that matches the statistical properties of the time-series data extracted for that time interval by the statistical analysis unit; and a data prediction unit that predicts the time-series data by calculating the temporal change of the time-series data after the time interval of the captured time-series data, using a support vector regression model that uses the kernel function selected for each time interval by the kernel selection unit.

A second aspect of the present invention is the time-series data prediction apparatus of the first aspect, wherein the support vector regression model is a function in which the inner product in the basic prediction function of the support vector regression model is replaced with the selected kernel function.

A third aspect of the present invention is a time-series data prediction method for predicting time-series data by support vector regression, comprising the steps of: extracting, in a statistical analysis unit, the statistical properties of captured time-series data for each time interval; selecting, in a kernel selection unit, for each time interval, a kernel function that matches the statistical properties of the time-series data extracted for that time interval in the extracting step; and predicting, in a data prediction unit, the time-series data by calculating the temporal change of the time-series data after the time interval of the captured time-series data, using a support vector regression model that uses the kernel function selected for each time interval in the selecting step.

A fourth aspect of the present invention is the time-series data prediction method of the third aspect, wherein the support vector regression model is a function in which the inner product in the basic prediction function of the support vector regression model is replaced with the selected kernel function.

According to the present invention, the prediction accuracy of support vector regression on steep time-series data improves and, because cross-validation is not used, online data prediction becomes possible.

FIG. 1 shows examples of conventional time-series data prediction, taking power consumption as an example: (a) shows prediction of power consumption by linear regression analysis, and (b) shows prediction of power consumption by nonlinear multiple regression analysis.
FIG. 2 is a configuration diagram showing a time-series data prediction apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart showing a method of predicting time-series data with the time-series data prediction apparatus of FIG. 2.
FIG. 4 illustrates the support vector machine on which support vector regression is based: (a) shows the data to be classified arranged in real space, and (b) shows the data to be classified arranged in a high-dimensional feature space.
FIG. 5 shows the relationship between the variance parameter of the kernel function and the decision boundary when a Gaussian kernel function, one example of a kernel function, is used for support vector regression analysis: (a) shows the case of Gaussian spread δ = 0.5, (b) the case of δ = 1, and (c) the case of δ = 5.
FIG. 6 shows an example in which, in the method of FIG. 3, statistical features of observed power consumption time-series data are analyzed for each time interval to improve the accuracy of fitting the prediction function to portions where the time-series data changes steeply: (a) shows the temporal change of the time-series data, and (b) shows the change in the first derivative of the time-series data with respect to time.
FIG. 7 shows an experimental example comparing the prediction accuracy obtained with different kernel functions.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 2 is a configuration diagram showing a time-series data prediction apparatus 200 according to an embodiment of the present invention. The time-series data prediction apparatus 200 includes an observation data input unit 201 that captures time-series data, a data storage unit 202 that accumulates the captured time-series data, and a statistical analysis unit 203 that extracts the statistical properties of the accumulated time-series data for each time interval. The time-series data prediction apparatus 200 also includes a kernel selection unit 204 that selects a kernel function to fit the statistical properties of the accumulated time-series data, a data prediction unit 205 that predicts time-series data using the selected kernel function, and a display unit 206 that displays prediction results and the like on a screen.

FIG. 3 is a flowchart showing a method of predicting time-series data with the time-series data prediction apparatus 200. The method of FIG. 3 starts at step 301. In step 302, the observation data input unit 201 captures the time-series data to be observed. In step 303, the data storage unit 202 accumulates the time-series data captured by the observation data input unit 201. In step 304, the statistical analysis unit 203 extracts, for each time interval, the statistical properties of the temporal change of the time-series data accumulated in the data storage unit 202. In step 305, the kernel selection unit 204 selects, for each time interval, a kernel function for the support vector regression model that fits the statistical properties of the temporal change extracted for that interval by the statistical analysis unit 203. In step 306, the data prediction unit 205 uses the support vector regression model (prediction function) with the kernel function selected for each time interval by the kernel selection unit 204 to calculate the change at times after the time interval of the captured time-series data, and thereby predicts the temporal change of the time-series data. In step 307, the display unit 206 displays on the screen the temporal change of the time-series data predicted by the data prediction unit 205, and related results.

Next, support vector regression is described. FIG. 4 illustrates the support vector machine on which support vector regression is based: FIG. 4(a) shows the data to be classified arranged in real space, and FIG. 4(b) shows the data to be classified arranged in a high-dimensional feature space. A support vector machine makes linearly inseparable data linearly separable by transforming it into a high-dimensional feature space using a kernel function. For example, suppose that four data points (1, 1), (1, −1), (−1, −1), and (−1, 1) lie in the two-dimensional coordinate system (x, y) shown in FIG. 4(a). If (1, 1) and (−1, −1) are to be classified as one class and (−1, 1) and (1, −1) as another, no single straight line in the plane can serve as the class boundary, so it is easy to see that linear separation is impossible.

Here, as shown in FIG. 4(b), a new function is introduced, and the four points in the two-dimensional space (plane) (x, y) are projected into a three-dimensional space (x, y, z). In FIG. 4(b), the function φ(X) is used to convert the data (points) in the two-dimensional space into points in the three-dimensional space. The function φ(x) generally represents a mapping to a higher dimension and is referred to here as a kernel function. X is called the input space, and the transformed three-dimensional space F is called the feature space. Accordingly, a point (x₁, y₁) in the two-dimensional coordinate system is expressed in the feature space as (φ(x₁), φ(y₁)).

Once the data (points) of the two-dimensional space have been transformed into the three-dimensional space, the two classes can be separated by the hatched plane in FIG. 4(b). By raising the dimension of the data from two to three, the support vector machine makes linear separation possible for data that could not be linearly separated before.
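The four-point example can be checked with a small sketch. The source does not state which mapping φ is used in FIG. 4(b); the map (x, y) → (x, y, x·y) below is an assumption chosen because it separates these particular points with the plane z = 0.

```python
def phi(x, y):
    """Assumed feature map from the plane into 3-D space: (x, y) -> (x, y, x*y)."""
    return (x, y, x * y)

# The four points and their classes from the example above.
points = {
    (1, 1): "A", (-1, -1): "A",
    (1, -1): "B", (-1, 1): "B",
}

def classify(point):
    """Separate the mapped points with the plane z = 0 in feature space."""
    _, _, z = phi(*point)
    return "A" if z > 0 else "B"

predicted = {p: classify(p) for p in points}
```

Under this assumed mapping, both points of the first class land at z = 1 and both points of the second at z = −1, so a single plane separates them even though no single line does in the original plane.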

In this process, the kernel function does not compute the coordinates of the data in the feature space directly; it computes the inner products in the feature space that are needed for the separating plane. Computing only these inner products reduces the amount of calculation. This way of computing inner products is called the kernel trick.
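A minimal sketch of the kernel trick follows, using the squared inner product as the kernel. This kernel and its explicit feature map are a textbook illustration chosen here as an assumption; they are not taken from the source.

```python
import math

def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def phi(u):
    """Explicit feature map whose inner product equals (u . v) ** 2 in 2-D."""
    u1, u2 = u
    return (u1 * u1, math.sqrt(2) * u1 * u2, u2 * u2)

def kernel(u, v):
    """Same quantity computed in the input space, without building phi."""
    return dot(u, v) ** 2

u, v = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(u), phi(v))   # map to feature space, then take the inner product
implicit = kernel(u, v)          # kernel trick: stay in the input space
```

Both routes give the same value up to floating-point rounding, but the kernel route never constructs the higher-dimensional coordinates.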

FIG. 5 shows the relationship between the variance parameter of the kernel function and the decision boundary when a Gaussian kernel function, one example of a kernel function, is used for support vector regression analysis: FIG. 5(a) shows the case of Gaussian spread δ = 0.5, FIG. 5(b) the case of δ = 1, and FIG. 5(c) the case of δ = 5. In FIGS. 5(a) to 5(c), the data lie along the diagonal and form two broad classes, one at the lower left and one at the upper right. The figures show that as the variance of the Gaussian kernel function is increased ((a) → (b) → (c)), the separability of the classes changes. In FIGS. 5(a) to 5(c), the circled data points are the support vectors, which support the decision boundary. Because the separability of the data changes with the characteristics of the kernel function, the prediction accuracy can be expected to change with the choice of kernel function. The support vector regression model is therefore optimized by selecting a kernel function for each time interval of the temporal change in the time-series data and building the model accordingly.
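The effect of the spread can be seen numerically in a short sketch. The Gaussian form exp(−‖x − y‖² / (2·width²)) used here is a common convention and an assumption; the source does not give the exact normalization.

```python
import math

def gaussian_kernel(x, y, width):
    """Assumed Gaussian kernel: exp(-||x - y||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

# Two fixed points; only the spread changes, as in FIG. 5(a) to 5(c).
x, y = (0.0, 0.0), (1.0, 1.0)
similarity = {w: gaussian_kernel(x, y, w) for w in (0.5, 1.0, 5.0)}
```

For the same pair of points, the kernel value rises from roughly 0.02 at spread 0.5 to roughly 0.96 at spread 5: a narrow kernel treats the points as unrelated, while a wide one treats them as near neighbors, which is why the boundary changes shape with the spread.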

Next, the kernel function used in the support vector regression model that is fitted to the change in the time-series data for each time interval in step 305 is described. First, an outline is given of how the basic prediction function (support vector regression model) of support vector regression is determined (see Non-Patent Document 4). Let i = 1 to l index the past time interval (the time interval of the captured time-series data), let j be a time in the next time interval (a time interval after the one in which data was captured), and let x and y be vector representations of the time-series data in the input space. With b a bias and α a variable obtained in the course of the calculation, the basic prediction function F(y_j) is given by equation (1).
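Equation (1) is not reproduced in this text. A form consistent with the definitions above, assuming one coefficient α_i per past sample x_i, would be:

```latex
F(y_j) = \sum_{i=1}^{l} \alpha_i \,\langle x_i, y_j \rangle + b \tag{1}
```

This is the usual linear support vector regression predictor; the exact notation of the original equation may differ.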

In equation (1), the inner product ⟨x_i, y_j⟩ is not computed directly but is replaced by a kernel function K. Replacing the inner product with the kernel function K improves computational efficiency.
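The replacement of the inner product by a kernel can be sketched as a prediction function with a pluggable kernel argument. The function names, the coefficient values, and the sample vectors are hypothetical; in practice the coefficients and bias would come from training a support vector regression model.

```python
def dot(x, y):
    """Plain inner product, used as the default kernel."""
    return sum(a * b for a, b in zip(x, y))

def predict(y_j, past_points, alphas, bias, kernel=dot):
    """Evaluate sum_i alpha_i * K(x_i, y_j) + b for one query vector y_j."""
    return sum(a * kernel(x_i, y_j) for a, x_i in zip(alphas, past_points)) + bias

# Hypothetical values for illustration only.
xs = [(1.0, 0.0), (0.0, 1.0)]
alphas = [0.5, -0.25]
value = predict((2.0, 4.0), xs, alphas, bias=1.0)
```

Passing a different function as `kernel` changes the model without touching the rest of the predictor, which is what makes per-interval kernel selection straightforward to implement.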

In step 305, a kernel function is selected and fitted to the change in the time-series data for each time interval. The fitting is done by selecting the most suitable kernel function from those below, substituting it into the prediction function, and fitting the result to the change in the time-series data. In support vector regression, the following kernel function (a Gaussian kernel), equation (2), has commonly been used.
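Equation (2) is not reproduced in this text. A common form of the Gaussian kernel, assumed here (in particular the factor 2 in the denominator is a convention, not something the source confirms), is:

```latex
K(x_i, y_j) = \exp\!\left(-\frac{\lVert x_i - y_j \rVert^{2}}{2\sigma^{2}}\right) \tag{2}
```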

Here, σ is a shape variable.

Equations (3) to (5) below are further examples of kernel functions: a product type, a power type, and a tanh type, respectively. In equations (3) to (5), c, κ, and δ are adjustment parameters, real values that are estimated empirically.
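Equations (3) to (5) are not reproduced in this text. The sketch below uses common textbook variants of the three kernel types; how c, κ, and δ enter each formula is an assumption, as are the function names and default values.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def product_kernel(x, y, c=0.0):
    """Assumed product type: inner product plus an offset c."""
    return dot(x, y) + c

def power_kernel(x, y, c=1.0, kappa=2):
    """Assumed power type: (<x, y> + c) ** kappa."""
    return (dot(x, y) + c) ** kappa

def tanh_kernel(x, y, kappa=1.0, delta=0.0):
    """Assumed tanh type: tanh(kappa * <x, y> - delta)."""
    return math.tanh(kappa * dot(x, y) - delta)

# Hypothetical sample vectors.
x, y = (1.0, 2.0), (0.5, -1.0)
values = {
    "product": product_kernel(x, y),
    "power": power_kernel(x, y),
    "tanh": tanh_kernel(x, y),
}
```

Each function takes two vectors and returns a scalar, so any of them can stand in for the inner product in the basic prediction function.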

From equations (1) and (2) above, the prediction function of the present invention is finally derived as equation (6).
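Equation (6) is not reproduced in this text. Assuming the basic prediction function has the form F(y_j) = Σ α_i ⟨x_i, y_j⟩ + b and the Gaussian kernel has the common form exp(−‖x_i − y_j‖² / (2σ²)), neither of which the source confirms, substituting the kernel for the inner product would give:

```latex
F(y_j) = \sum_{i=1}^{l} \alpha_i \exp\!\left(-\frac{\lVert x_i - y_j \rVert^{2}}{2\sigma^{2}}\right) + b \tag{6}
```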

Equation (6) is the prediction function that uses the Gaussian kernel of equation (2). The data at the next time is indexed by j, and the prediction is expressed as a polynomial with terms up to the same time-interval length l that was used for learning. In this embodiment, from among kernel functions such as equations (2) to (5), the kernel function that approximates the temporal change of the time-series data is substituted into the basic prediction function (1), and the resulting prediction function is applied for each time interval.

Next, the extraction of the statistical properties of the time-series data for each time interval in step 304, and the selection of the kernel function in step 305, are described. FIG. 6 shows an example in which, in this embodiment, statistical features of observed power consumption time-series data are analyzed for each time interval to improve the accuracy of fitting the prediction function to portions where the time-series data changes steeply: FIG. 6(a) shows the temporal change of the time-series data, and FIG. 6(b) shows the change in the first derivative of the time-series data with respect to time. Computing the first derivative of the time-series data with respect to time yields values ranging from small to large. The first derivative takes both positive and negative values, but only its absolute value is used here. In FIG. 6(b), a large absolute value of the first derivative indicates that the time-series data is changing steeply, and a small value indicates that it is changing gradually. A different kernel function is applied to each of the time intervals K1 to K4 in FIG. 6(b). For each time interval, a kernel function that approximates the time-series data is selected and applied according to the average of the absolute values of the first derivative in that interval. Equation (3) is applied to K1, equation (5) to K2, equation (4) to K3, and equation (2) to K4.
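The per-interval analysis and selection can be sketched as follows. The source does not state the thresholds or which level of steepness maps to which kernel, so the rule table, the interval length, and the sample series below are all hypothetical.

```python
def mean_abs_diff(segment):
    """Mean absolute first difference, a discrete stand-in for |dy/dt|."""
    diffs = [abs(b - a) for a, b in zip(segment, segment[1:])]
    return sum(diffs) / len(diffs)

def split(series, length):
    """Cut the series into consecutive, non-overlapping intervals."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]

# Hypothetical thresholds and kernel labels; the source gives neither.
RULES = [(1.0, "gaussian"), (5.0, "power"), (20.0, "tanh")]
DEFAULT = "product"

def select_kernel(steepness):
    """Return the first kernel whose upper threshold exceeds the steepness."""
    for upper, name in RULES:
        if steepness < upper:
            return name
    return DEFAULT

# Hypothetical power readings: flat, moderately varying, then steep.
series = [10, 10, 11, 11, 10, 14, 18, 15, 10, 40, 5, 60]
choices = [select_kernel(mean_abs_diff(s)) for s in split(series, 4)]
```

Because the selection needs only one pass over each interval and a table lookup, it avoids the repeated training that cross-validation would require.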

[Example]
FIG. 7 shows an experimental example comparing the prediction accuracy obtained with different kernel functions. In FIG. 7, a to d on the horizontal axis are the prediction accuracy results obtained when each of the kernel functions (2) to (5) above is used alone, and e is the prediction accuracy result obtained with the time-series data prediction method of this embodiment. The vertical axis shows the prediction accuracy result, that is, the error between the correct data and the predicted data (measured as mean squared error); the smaller the error, the lower the value. When the kernel functions (2) to (5) were used alone, the prediction accuracy varied across a to d, whereas the mixed kernel approach of this embodiment gave the greatest improvement in prediction accuracy.
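The error measure on the vertical axis can be computed as in the sketch below. The numbers are made up purely to show the calculation and are not the experimental results of FIG. 7.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up values for illustration; not data from the experiment.
actual = [1.0, 2.0, 3.0, 4.0]
forecasts = {
    "single_kernel": [1.5, 2.5, 2.5, 3.0],
    "mixed_kernel": [1.0, 2.0, 3.5, 4.0],
}
errors = {name: mean_squared_error(actual, f) for name, f in forecasts.items()}
best = min(errors, key=errors.get)
```

A lower value means the forecast tracks the measured series more closely, which is how the bars in such a comparison are read.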

The present invention can be applied to monitoring of real environments, image sensing, and the like in the electric power, energy, communication, and sensing fields.

201 Observation data input unit
202 Data storage unit
203 Statistical analysis unit
204 Kernel selection unit
205 Data prediction unit
206 Display unit

Claims

1. A time-series data prediction apparatus for predicting time-series data by support vector regression, comprising:
a statistical analysis unit that extracts the statistical properties of captured time-series data for each time interval;
a kernel selection unit that selects, for each time interval, a kernel function that matches the statistical properties of the time-series data extracted for that time interval by the statistical analysis unit; and
a data prediction unit that predicts the time-series data by calculating the temporal change of the time-series data after the time interval of the captured time-series data, using a support vector regression model that uses the kernel function selected for each time interval by the kernel selection unit.

2. The time-series data prediction apparatus according to claim 1, wherein the support vector regression model is a function in which an inner product in a basic prediction function of the support vector regression model is replaced with the selected kernel function.

3. A time-series data prediction method for predicting time-series data by support vector regression, comprising the steps of:
extracting, in a statistical analysis unit, the statistical properties of captured time-series data for each time interval;
selecting, in a kernel selection unit, for each time interval, a kernel function that matches the statistical properties of the time-series data extracted for that time interval in the extracting step; and
predicting, in a data prediction unit, the time-series data by calculating the temporal change of the time-series data after the time interval of the captured time-series data, using a support vector regression model that uses the kernel function selected for each time interval in the selecting step.

4. The time-series data prediction method according to claim 3, wherein the support vector regression model is a function in which an inner product in a basic prediction function of the support vector regression model is replaced with the selected kernel function.