JP2012247998A

JP2012247998A - Time series data prediction apparatus, prediction method, prediction program and memory medium

Info

Publication number: JP2012247998A
Application number: JP2011119194A
Authority: JP
Inventors: Yoshiki Murakami; 好樹村上; Takenori Kobayashi; 武則小林; Yoshiro Hasegawa; 義朗長谷川; Hideo Kusano; 日出男草野
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2011-05-27
Filing date: 2011-05-27
Publication date: 2012-12-13
Anticipated expiration: 2031-05-27
Also published as: JP5777404B2

Abstract

PROBLEM TO BE SOLVED: To provide a time-series data prediction apparatus that can select past performance data appropriately and can predict appropriately even when the times at which the past performance data take their maximum or minimum values do not coincide. SOLUTION: The time-series data prediction apparatus comprises: a data selection unit 120 that selects, from past time-series data, data of days similar to a prediction target day; a probability density distribution creation unit 150 that calculates, from the data selected by the data selection unit, a probability density distribution that differs at each time; a random number generation unit 160 that generates a plurality of sets of random numbers in accordance with the per-time probability density distributions calculated by the probability density distribution creation unit; a correlation coefficient calculation unit 180 that calculates the correlation between times in the past time-series data; a random number conversion unit 210 that, based on the correlation calculated by the correlation coefficient calculation unit, converts the sets of random numbers generated by the random number generation unit into sets of random numbers that are correlated with each other; and a prediction result creation unit 230 that predicts values of the time-series data based on the sets of random numbers obtained by the conversion and creates a prediction result.

Description

Embodiments of the present invention relate to a time-series data prediction apparatus, a prediction method, a prediction program, and a storage medium for predicting future values of time-series data.

Data in which a quantity that changes with time, such as a stock price, a temperature, power demand, or the output of photovoltaic power generation, is arranged in order of time is called time-series data. Predicting future values of time-series data is important for making a profit in stock trading, planning sales of products whose sales vary with temperature, creating operation plans for generators, or creating power sales plans. For this reason, various methods for predicting time-series data have been proposed.

Patent Document 1: JP 2007-47996 A
Patent Document 2: Japanese Patent Laid-Open No. 5-18995
Patent Document 3: JP 2010-20442 A

As methods for predicting time-series data, Patent Document 1 discloses a technique for predicting power demand over a certain future period based on weather information. Patent Document 2 discloses a technique for predicting future total power demand from temperature and humidity data and past demand data. Patent Document 3 discloses a technique for forecasting demand, including its error, by stochastically generating a large number of demand scenarios.

FIG. 9 shows an example of demand scenarios created using the technique disclosed in Patent Document 3. FIG. 9(a) is an example of past performance data showing past demand. FIG. 9(b) shows the demand prediction result (demand scenarios) created from the past performance data. In the example of FIG. 9, the 24 demand values from 1:00 to 24:00 form one time series, and 100 demand scenarios, each a time series of demand that may occur in the future, are predicted from seven sets of past time-series data. Each demand scenario is realized with the same probability, and the scenarios are used to calculate total demand and to create power generation plans. As a result, the expected value of total demand, the expected value of power generation cost, and so on can be calculated. In this technique, the demand at each time is assumed to follow a normal distribution; the mean and variance of the demand at each time and the correlation between times are calculated, and random numbers that follow these means, variances, and correlations are generated to produce a large number of time series as future demand data. Any number of future time series can be created as needed; Patent Document 3 creates and displays 100 of them. In this example, the demand prediction result shown in FIG. 9(b) reflects the characteristics of the past performance data well.
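The conventional procedure described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of Patent Document 3: it assumes NumPy, and the array `past` is synthetic, hypothetical data standing in for seven days of hourly demand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical past performance data: 7 days x 24 hourly demand values.
hours = np.arange(1, 25)
base = 50.0 + 30.0 * np.exp(-((hours - 14) ** 2) / 30.0)
past = base + rng.normal(0.0, 3.0, size=(7, 24))

# Per-time mean and the 24 x 24 variance-covariance matrix across days.
mean = past.mean(axis=0)
cov = np.cov(past, rowvar=False)

# Draw 100 equally likely scenarios under the normality assumption.
scenarios = rng.multivariate_normal(mean, cov, size=100)
print(scenarios.shape)
```

Because every time point is modeled as a single normal distribution, a sketch of this kind can only produce one cluster of scenarios around the per-time means.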

FIG. 10 shows another example of demand scenarios created using the technique disclosed in Patent Document 3. FIG. 10(a) is an example of past performance data. Unlike the example of FIG. 9, the past performance data in FIG. 10 consist of two groups of data with different characteristics. FIG. 10(b) shows the demand prediction result (demand scenarios) created from the past performance data shown in FIG. 10(a). Because the demand at each time is assumed to be normally distributed, the prediction result in FIG. 10(b) differs greatly, as a whole, from the past performance.

In the example shown in FIG. 10(a), the past performance data consist of two groups of time-series data with different tendencies, and the time at which the time series reaches its maximum differs between the groups. As a result, the distribution of the past performance data at each time has two peaks and differs greatly from a normal distribution. In the example shown in FIG. 10(b), on the other hand, the demand prediction result at each time has a single-peaked distribution centered on the mean. This is because the distribution of the demand at each time was assumed to be normal.

Assuming a normal distribution as a first approximation is common practice even when the distribution at each time is not normal. What matters in the prediction result, however, is whether the time series, viewed as a 24-hour unit, correctly reproduces the past performance. When the past performance consists of two groups, the prediction result should also consist of two groups.

Among the predicted demand scenarios, the one with the highest probability of realization is a time series that roughly connects the mean values at each time, which is clearly unnatural when compared with the frequency of occurrence in the past performance data. The prior art therefore has the problem that the characteristics of the past performance data are not reproduced accurately. One could also argue that the method of selecting the past performance data is inadequate in the first place; that is, the problem is that the times at which the time series take their maximum or minimum values do not coincide. However, because it is difficult to make these times coincide exactly, prediction must be possible even when the times giving the maximum or minimum values differ.

Thus, in the prior art, the distribution of past performance data at each time cannot be reproduced accurately, which makes it difficult to create an accurate prediction model. A further problem is that no method for selecting the past performance data is defined.

The problem to be solved by the present invention is to provide a time-series data prediction apparatus that can predict appropriately even when the times giving the maximum or minimum values of the past performance data do not coincide, and that can select the past performance data appropriately.

The time-series data prediction apparatus according to the embodiment comprises: a data selection unit that selects, from past time-series data, data of days similar to a prediction target day; a probability density distribution creation unit that calculates, from the data selected by the data selection unit, a probability density distribution that differs at each time; a random number generation unit that generates a plurality of sets of random numbers in accordance with the per-time probability density distributions from the probability density distribution creation unit; a correlation coefficient calculation unit that calculates the correlation between times in the past time-series data; a random number conversion unit that, based on the correlation calculated by the correlation coefficient calculation unit, converts the sets of random numbers generated by the random number generation unit into sets of random numbers that are correlated with each other; and a prediction result creation unit that predicts values of the time-series data based on the sets of random numbers obtained by the conversion in the random number conversion unit and creates a prediction result.

FIG. 1 is a diagram showing an example of a prediction result of time-series data obtained by the time-series data prediction apparatus according to the first embodiment.
FIG. 2 is a diagram showing an example of the distribution of past demand used by the time-series data prediction apparatus according to the first embodiment.
FIG. 3 is a diagram showing an example of the frequency distribution function used by the time-series data prediction apparatus according to the first embodiment.
FIG. 4 is a diagram showing the configuration of the time-series data prediction apparatus according to the first embodiment.
FIG. 5 is a diagram showing the configuration of the time-series data prediction apparatus according to the second embodiment.
FIG. 6 is a diagram for explaining the method, used by the time-series data prediction apparatus according to the second embodiment, of selecting past performance data similar to a predicted value.
FIG. 7 is a flowchart showing the operation of the time-series data prediction apparatus according to the third embodiment.
FIG. 8 is a flowchart showing the operation of the time-series data prediction apparatus according to the fourth embodiment.
FIG. 9 is a diagram showing an example of demand scenarios created using the prior art.
FIG. 10 is a diagram showing another example of demand scenarios created using the prior art.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[First Embodiment]

This time-series data prediction apparatus predicts, as a plurality of scenarios, a set of temporally consecutive data such as the power demand over a certain future period, reproduces the operation patterns of generators and related equipment in that period as a plurality of scenarios, and predicts the total demand and sales in that period as well as the operation plans of the generators and related equipment, taking uncertainty into account.

FIG. 1 shows an example of a prediction result of time-series data obtained by the time-series data prediction apparatus according to the first embodiment. FIG. 1(a) shows the past performance data on which the prediction is based; it is the same as the conventional past performance data shown in FIG. 10(a). FIG. 1(b) shows the result of prediction by the time-series data prediction apparatus according to the first embodiment. Unlike FIG. 10(b), the demand scenarios in FIG. 1(b) are divided into two groups, which shows that the characteristics of the past performance data are reflected well.

Next, the algorithm used in the time-series data prediction apparatus according to the first embodiment is described concretely. As an example, the prediction period is one day and the prediction interval is one hour, so that a time series of 24 data points is predicted. Past performance data, 24 values per day, are collected for K days (K is an integer of 2 or more), which yields K demand values for the i-th time (i = 1, 2, ..., 24). First, the probability density distribution of these K demand values is obtained.

The distribution of the demand data at each time is characterized by its mean and variance. The mean and variance are parameters that do not depend on the shape of the distribution. For each time, the K past performance values are collected, and the mean $\mu_i$ and variance $v_i$ of the K values at the i-th time are calculated. Doing this for every time yields 24 means $\mu_i$ and 24 variances $v_i$. K must be 2 or more for the variance to be calculated.
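As a small illustration of this step, the per-time statistics can be computed column by column. The sketch assumes NumPy, synthetic data in the hypothetical array `past` (K days by 24 times), and the unbiased sample variance (`ddof=1`), which the text does not specify.

```python
import numpy as np

rng = np.random.default_rng(1)

K = 10  # number of selected past days; must be at least 2
past = rng.uniform(40.0, 90.0, size=(K, 24))  # synthetic demand, K x 24

mu = past.mean(axis=0)        # 24 per-time means
v = past.var(axis=0, ddof=1)  # 24 per-time sample variances (assumed ddof=1)

print(mu.shape, v.shape)
```

With `ddof=1` the divisor is K - 1, which is one way to see why K = 1 is not enough.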

However, the distribution of the demand data at each time cannot be expressed by the mean and variance alone; the shape of the distribution must be calculated for each time. FIG. 2(a) shows an example of the distribution of past demand at i = 12, that is, at 12:00, and FIG. 2(b) shows an example at i = 20, that is, at 20:00. In this example, the shape of the distribution differs greatly depending on the time. The demand at 12:00 has a shape relatively close to a normal distribution, whereas the demand at 20:00 differs greatly from a normal distribution and appears to have two peaks. The issue here is not that the shape differs from a normal distribution but that the shape differs from time to time. For example, if the shape of the 20:00 distribution were used for all times, every time would have a two-peaked distribution, and the distribution of the original data could not be reproduced.

In other words, what is needed is not to assume some fixed shape other than the normal distribution for the probability density distribution, but to determine, for each time, the probability density distribution that the demand data follow. To this end, random numbers whose probability density distribution is proportional to the frequency of occurrence of the past demand data are generated for each time. The frequencies sum to the number of past performance data, whereas the probability density distribution is a function normalized so that its integral is 1. Such random numbers can be generated by using the inverse function of the cumulative distribution function calculated from the frequency distribution.

However, because this inverse function must have a value at every point on the vertical axis (the cumulative frequency), the cumulative distribution function must be constructed as a continuous function. A concrete method for determining the probability density distribution for each time is described below.

In the selected past demand data, let $F_k$ be the frequency with which the demand $x$ at time $i$ lies between $X_k$ and $X_{k+1}$. The frequency distribution function $F$, which represents the distribution of the frequency of occurrence of the demand $x$, can then be expressed by equation (1) below.

$$F = F_k \quad (X_k \le x \le X_{k+1}) \quad \cdots (1)$$
This frequency distribution function $F$ approximates the shape of the probability density function of $x$. FIG. 3(a) shows an example of the frequency distribution function $F$ of $x$.
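The frequency distribution of equation (1) corresponds to an ordinary histogram. The sketch below assumes NumPy; the data in `x` are synthetic, and the number of bins (8) is an arbitrary illustrative choice. Indices are 0-based here, whereas the text counts from 1.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic two-peaked demand values observed at one time of day.
x = np.concatenate([rng.normal(50.0, 2.0, 30), rng.normal(70.0, 2.0, 30)])

# F[k] is the number of values between edges X[k] and X[k + 1].
F, X = np.histogram(x, bins=8)

print(F)
print(X)
```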

On the other hand, in order to define the cumulative distribution function $y$ of the demand $x$ as a continuous function, let $Y_0 = 0$ and

[Equation (2)]

and let

[Equation (3)]

From equation (3),

[Equation (4)]

is obtained. Therefore, by generating a uniform random number $y$ taking a value between $0$ and $Y_{L-1}$ and substituting it into equation (4), a random number between $X_1$ and $X_L$ that follows the frequency distribution function $F$ can be generated. The random numbers generated in this way have approximately the same mean, the same variance, and a similar distribution shape as the past demand data at time $i$, and thus approximately reproduce the probability of occurrence of the past demand data. The accuracy of the approximation of the probability distribution can be improved by increasing the number of past performance data and the number of intervals in which the frequency is counted.
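A minimal sketch of this inverse-function sampling is given below. It rests on assumptions that the text does not confirm: the cumulative values $Y_k$ are taken to be running sums of the frequencies $F_k$ at the interval edges, and the cumulative function is taken to be linear inside each interval, so that its inverse is a piecewise-linear interpolation. The function name `sample_from_frequency` and the synthetic data are hypothetical, and indices are 0-based.

```python
import numpy as np

def sample_from_frequency(data, n_samples, n_bins, rng):
    """Draw samples whose density follows the histogram of `data`."""
    # F[k]: count of values between edges X[k] and X[k + 1].
    F, X = np.histogram(data, bins=n_bins)
    # Assumed cumulative counts at the edges: Y[0] = 0, Y[-1] = len(data).
    Y = np.concatenate(([0.0], np.cumsum(F)))
    # Uniform random numbers on the cumulative axis.
    y = rng.uniform(0.0, Y[-1], size=n_samples)
    # Piecewise-linear inverse of the cumulative function: y -> x.
    return np.interp(y, Y, X)

rng = np.random.default_rng(3)
# Synthetic two-peaked demand, loosely imitating the 20:00 case.
demand_20h = np.concatenate(
    [rng.normal(50.0, 2.0, 100), rng.normal(70.0, 2.0, 100)]
)
samples = sample_from_frequency(demand_20h, 10_000, 20, rng)
print(samples.min(), samples.max(), samples.mean())
```

Interpolating between the interval edges is one way to give the inverse a value for every $y$, which is the continuity requirement mentioned above.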

Next, to take the correlation between times into account, the covariance between each pair of times is calculated. This is done by calculating the covariance $v_{1,2}$ between the K demand values at time 1 and the K demand values at time 2, the covariance $v_{1,3}$ between the K demand values at time 1 and the K demand values at time 3, and so on, up to the covariance $v_{23,24}$ between the K demand values at time 23 and the K demand values at time 24. This yields 276 covariances, which is the number of ways of choosing two different times out of 24. The characteristics of the past performance data are determined by these covariances together with the mean, the variance, and the shape of the probability density function calculated earlier for each time.

Next, the creation of demand scenarios is described. The demand is broken down hour by hour, the power generation amount at time $i$ is denoted by $d_i$, and the time series is considered as the vector obtained by arranging $d_1$ through $d_{24}$. Collecting the power generation data for each time over a plurality of days yields the data sets $\{d_i\}$. To take the relationship between the power generation amounts at different times into account, the variance-covariance matrix creation function creates the variance-covariance matrix $\Sigma$ with 24 × 24 components shown in equation (5). Here, $\mathrm{var}(d_i)$ is the variance of $\{d_i\}$, and $\mathrm{cov}(d_i, d_j)$ is the covariance of $\{d_i\}$ and $\{d_j\}$. To calculate the variances and covariances, demand data must be acquired for at least two days. Two days are sufficient, but the more days are used, the higher the prediction accuracy.

$$\Sigma = \begin{pmatrix} \mathrm{var}(d_1) & \mathrm{cov}(d_1, d_2) & \cdots & \mathrm{cov}(d_1, d_{24}) \\ \mathrm{cov}(d_2, d_1) & \mathrm{var}(d_2) & \cdots & \mathrm{cov}(d_2, d_{24}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(d_{24}, d_1) & \mathrm{cov}(d_{24}, d_2) & \cdots & \mathrm{var}(d_{24}) \end{pmatrix} \quad \cdots (5)$$
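The variance-covariance matrix and the count of distinct time pairs can be checked with a short sketch. It assumes NumPy, synthetic data in the hypothetical array `d` (days by 24 times), and the sample covariance with divisor (days - 1), which is NumPy's default and not something the text specifies.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(5)

days = 5  # at least 2 days are needed
d = 60.0 + rng.normal(0.0, 5.0, size=(days, 24))  # synthetic, days x 24

# Columns are the variables (times), rows are the observations (days).
Sigma = np.cov(d, rowvar=False)

# Number of distinct pairs of times, i.e. off-diagonal covariances.
n_pairs = comb(24, 2)

print(Sigma.shape, n_pairs)
```

The diagonal of `Sigma` holds the per-time variances, and the strictly upper triangle holds the pairwise covariances.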

Next, a method is described for creating L sets of random numbers that follow specified probability distributions and have the given mutual correlations. As an example, the generation of L sets of 24 mutually correlated random numbers is described.

First, as described above, the distribution of the past data is obtained for each time, and L random numbers following that distribution are generated. This is repeated for each of the 24 times, so that L random numbers are created for each of the 24 hours. At this stage, the set of random numbers at each time is uncorrelated with, and independent of, the sets at the other times. The result is a random number matrix with L rows and 24 columns (L × 24). Using the means and variances calculated earlier, each random number is then normalized by a change of variables so that its standard deviation is 1 and its mean is 0 (the mean is subtracted and the result is divided by the standard deviation). In this way, the normalized random number matrix $G$ is obtained.

Next, the 24 × 24 correlation coefficient matrix $R$ is Cholesky-decomposed into an upper triangular matrix $T_U$ and a lower triangular matrix $T_L$ as in equation (6). The correlation coefficient matrix is obtained by normalizing the variance-covariance matrix $\Sigma$.

$$R = T_U T_L \quad \cdots (6)$$
Then, by multiplying the normalized random number matrix $G$ from the right by the upper triangular matrix $T_U$, the correlated normalized random number matrix $G'$ can be created.

$$G' = G T_U \quad \cdots (7)$$
This matrix is multiplied from the right by the standard deviation matrix $S$, a matrix with 24 rows and 24 columns (24 × 24) that has the standard deviation of each time on its diagonal and 0 elsewhere, and the mean matrix $A$, a matrix with L rows and 24 columns (L × 24) in which the mean vector $\mu$ is repeated in each of the L rows, is added. The result is the random number matrix $G''$ with the desired means, variances, and correlations.

$$G'' = G' S + A \quad \cdots (8)$$
This matrix $G''$ is itself the prediction result for the time-series data, and each of the N row vectors (1 row, 24 columns) of $G''$ corresponds to one scenario.
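The chain from independent per-time random numbers to correlated scenarios can be sketched end to end as follows. The sketch is illustrative only and makes several assumptions: NumPy is used; `past` is synthetic data with two groups of days that peak at different times; the independent per-time draws are made by resampling each column of `past`, as a simple stand-in for the inverse-function sampling; and the triangular factors follow NumPy's convention, in which `np.linalg.cholesky` returns a lower triangular `T_L` with `R = T_L @ T_L.T`, so the upper triangular factor applied from the right is `T_U = T_L.T`.

```python
import numpy as np

rng = np.random.default_rng(4)
K, T, L = 60, 24, 50_000  # past days, times per day, scenarios
hours = np.arange(1, T + 1)

def synthetic_day(peak_hour):
    # Hypothetical daily demand curve with noise, peaking at peak_hour.
    curve = 50.0 + 30.0 * np.exp(-((hours - peak_hour) ** 2) / 20.0)
    return curve + rng.normal(0.0, 2.0, T)

# Two groups of days whose maxima occur at different times.
past = np.array([synthetic_day(12 if k % 2 == 0 else 18) for k in range(K)])

mu = past.mean(axis=0)
sd = past.std(axis=0, ddof=1)
R = np.corrcoef(past, rowvar=False)  # 24 x 24 correlation matrix

# Independent draws per time (stand-in sampler), then standardize.
G = np.column_stack([rng.choice(past[:, i], size=L) for i in range(T)])
G = (G - mu) / sd

T_L = np.linalg.cholesky(R)  # lower triangular, R = T_L @ T_L.T
T_U = T_L.T                  # upper triangular factor
G1 = G @ T_U                 # correlated matrix, as in equation (7)

S = np.diag(sd)              # 24 x 24 standard deviation matrix
A = np.tile(mu, (L, 1))      # L x 24 mean matrix
G2 = G1 @ S + A              # scenarios, as in equation (8)

print(G2.shape)
```

Each row of `G2` is one scenario. Comparing `np.corrcoef(G2, rowvar=False)` with `R` is a quick way to confirm that the transformation reproduces the target correlations up to sampling error.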

When the number of time divisions is large, or when the demand values are strongly correlated, numerical errors due to loss of significant digits become large, and the Cholesky decomposition of the correlation coefficient matrix $R$ may be difficult. In such cases, eigenvalue decomposition or singular value decomposition can be used as needed. The day can also be divided at arbitrary time intervals instead of into 24 one-hour intervals.
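One possible form of such a fallback is sketched below, assuming NumPy. The helper name `correlating_factor` is hypothetical, and clipping slightly negative eigenvalues to zero is a design choice of this sketch rather than something stated in the text.

```python
import numpy as np

def correlating_factor(R):
    """Return T with T.T @ T approximately R, so that G @ T has correlation R."""
    try:
        # Works only when R is numerically positive definite.
        return np.linalg.cholesky(R).T
    except np.linalg.LinAlgError:
        # Fallback: symmetric eigenvalue decomposition R = V diag(w) V.T.
        w, V = np.linalg.eigh(R)
        w = np.clip(w, 0.0, None)  # remove tiny negative eigenvalues
        return (V * np.sqrt(w)).T  # equals diag(sqrt(w)) @ V.T

# A perfectly correlated (singular) matrix, hard for Cholesky.
R_singular = np.ones((3, 3))
T_sing = correlating_factor(R_singular)

# A well-conditioned matrix for comparison.
R_regular = np.array([[1.0, 0.5], [0.5, 1.0]])
T_reg = correlating_factor(R_regular)

print(np.allclose(T_sing.T @ T_sing, R_singular))
print(np.allclose(T_reg.T @ T_reg, R_regular))
```

Unlike the Cholesky factor, the eigenvalue-based factor is not triangular, but it serves the same purpose when multiplied from the right onto the normalized random number matrix.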

By generating M sets of random numbers with the above method, M demand scenarios are obtained. The value of M is arbitrary and does not depend on the value of K or N. For example, even if past performance data are available for only two days, one million scenarios can be generated.

FIG. 4 is a diagram showing the configuration of the time-series data prediction apparatus according to the first embodiment. The time-series data prediction apparatus is implemented on a computer equipped with an input device, a display device, a storage device, a central processing unit, and the like, none of which are shown. The time-series data prediction apparatus includes a first past performance data storage unit 110, a data selection unit 120, a second past performance data storage unit 130, a mean/variance calculation unit 140, a probability density distribution creation unit 150, a random number generation unit 160, a random number set storage unit 170, a correlation coefficient calculation unit 180, a correlation coefficient matrix storage unit 190, a random number conversion unit 210, a random number set storage unit 220, a prediction result creation unit 230, a prediction result storage unit 240, a prediction result display unit 310, and a display device 320.

The first past performance data storage unit 110 stores past performance data. The past performance data stored in the first past performance data storage unit 110 are read by the data selection unit 120.

The data selection unit 120 selects, from the past performance data read from the first past performance data storage unit 110, the data of days similar to the prediction target day and sends them to the second past performance data storage unit 130. As the selection criterion, the data selection unit 120 uses, for example, days in the same month, days on the same day of the week, or days with the same weather.
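A selection rule of this kind might look like the following sketch. The record layout (dictionaries with `date`, `weather`, and `demand` keys), the function name `select_similar_days`, and the decision to combine the month, weekday, and optional weather criteria with a logical AND are all assumptions made for illustration; the text lists the criteria only as examples.

```python
from datetime import date

def select_similar_days(records, target_day, weather=None):
    """Keep records from the same month and weekday as target_day."""
    selected = []
    for rec in records:
        same_month = rec["date"].month == target_day.month
        same_weekday = rec["date"].weekday() == target_day.weekday()
        # The weather criterion is applied only when a value is given.
        same_weather = weather is None or rec["weather"] == weather
        if same_month and same_weekday and same_weather:
            selected.append(rec)
    return selected

# Hypothetical records; the demand values are placeholders.
records = [
    {"date": date(2011, 5, 6), "weather": "sunny", "demand": [0.0] * 24},
    {"date": date(2011, 5, 13), "weather": "rainy", "demand": [0.0] * 24},
    {"date": date(2011, 5, 14), "weather": "sunny", "demand": [0.0] * 24},
    {"date": date(2011, 4, 29), "weather": "sunny", "demand": [0.0] * 24},
]

target = date(2011, 5, 27)
by_calendar = select_similar_days(records, target)
by_weather = select_similar_days(records, target, weather="sunny")
print([r["date"].isoformat() for r in by_calendar])
print([r["date"].isoformat() for r in by_weather])
```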

The second past performance data storage unit 130 stores the selected past performance data sent from the data selection unit 120. The selected past performance data stored in the second past performance data storage unit 130 are read by the mean/variance calculation unit 140, the probability density distribution creation unit 150, and the correlation coefficient calculation unit 180.

The mean/variance calculation unit 140 reads the past performance data from the second past performance data storage unit 130, calculates the mean and variance for each time, and sends them to the random number generation unit 160. The probability density distribution creation unit 150 reads the past performance data from the second past performance data storage unit 130, calculates the probability density distribution for each time, and sends it to the random number generation unit 160.

The random number generation unit 160 generates random numbers based on the per-time means and variances sent from the mean/variance calculation unit 140 and the per-time probability density distributions sent from the probability density distribution creation unit 150, creates a set of random numbers for each time, and sends the sets to the random number set storage unit 170. The random number set storage unit 170 stores the per-time sets of random numbers sent from the random number generation unit 160. The per-time sets of random numbers stored in the random number set storage unit 170 are read by the random number conversion unit 210.

The correlation coefficient calculation unit 180 calculates the correlation coefficients (covariances) between times based on the past performance data read from the second past performance data storage unit 130, creates a correlation coefficient matrix, and sends it to the correlation coefficient matrix storage unit 190. The correlation coefficient matrix storage unit 190 stores the correlation coefficient matrix sent from the correlation coefficient calculation unit 180. The correlation coefficient matrix stored in the correlation coefficient matrix storage unit 190 is read by the random number conversion unit 210.

The random number conversion unit 210 uses the correlation coefficient matrix read from the correlation coefficient matrix storage unit 190 to convert the per-time sets of random numbers read from the random number set storage unit 170 into sets of correlated random numbers, and sends them to the random number set storage unit 220. The random number set storage unit 220 stores the sets of correlated random numbers sent from the random number conversion unit 210. The sets of correlated random numbers stored in the random number set storage unit 220 are read by the prediction result creation unit 230.

The prediction result creation unit 230 predicts the values of the time-series data based on the sets of random numbers read from the random number set storage unit 220, creates a prediction result, and sends it to the prediction result storage unit 240. The prediction result storage unit 240 stores the prediction result of the time-series data sent from the prediction result creation unit 230. The prediction result stored in the prediction result storage unit 240 is read by the prediction result display unit 310.

The prediction result display unit 310 converts the prediction result of the time-series data read from the prediction result storage unit 240 into display data and sends it to the display device 320. The display device 320 displays the prediction result of the time-series data based on the display data sent from the prediction result display unit 310.

Next, the operation of the time-series data prediction apparatus according to the first embodiment, configured as described above, will be described.

When the time-series data prediction apparatus starts operating, the data selection unit 120 selects, from the first past performance data storage unit 110, the past performance data of days similar to the prediction target day, and sends it to the second past performance data storage unit 130 in preparation. Next, the mean/variance calculation unit 140 reads the past performance data from the second past performance data storage unit 130, calculates the mean and variance for each time, and sends them to the random number generation unit 160. Similarly, the probability density distribution creation unit 150 reads the past performance data from the second past performance data storage unit 130, calculates a probability density distribution for each time, and sends it to the random number generation unit 160.

The random number generation unit 160 generates random numbers based on the per-time mean and variance sent from the mean/variance calculation unit 140 and the per-time probability density distribution sent from the probability density distribution creation unit 150, creates a set of random numbers for each time, and sends it to the random number set storage unit 170 for storage.
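As a rough illustration of generating per-time random numbers from the selected past data, the following sketch draws values by inverse-transform sampling from the empirical distribution observed at each time. This section does not specify how the random number generation unit 160 samples from the probability density distribution, so the sampling scheme, the function name `sample_from_history`, and the sample data are all assumptions made for illustration.

```python
import random

def sample_from_history(values, rng):
    """Draw one value from the empirical distribution of `values`.

    Assumption: inverse-transform sampling with linear interpolation
    between the sorted observations.
    """
    ordered = sorted(values)
    pos = rng.random() * (len(ordered) - 1)   # position along the sorted values
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

# Hypothetical past curves for four similar days, three times per day.
days = [
    [10.0, 12.0, 15.0],
    [11.0, 14.0, 16.0],
    [9.0, 11.0, 13.0],
    [10.5, 13.0, 14.0],
]
per_time = list(zip(*days))      # observations grouped by time
rng = random.Random(1)           # fixed seed so the sketch is repeatable

# One set of random numbers per draw, one value for each time.
random_sets = [
    [sample_from_history(per_time[i], rng) for i in range(len(per_time))]
    for _ in range(4)
]
```

At this stage the values at different times are drawn independently of one another; correlation between times is introduced afterwards by the conversion step.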

Meanwhile, the correlation coefficient calculation unit 180 calculates correlation coefficients (covariances) between times based on the past performance data read from the second past performance data storage unit 130, creates a correlation coefficient matrix, and sends it to the correlation coefficient matrix storage unit 190 for storage.
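The per-time statistics and the correlation coefficient matrix between times can be sketched as follows. This is a minimal illustration, assuming that each past day is a list of values sampled at the same times and that the sample covariance (denominator n - 1) is used; the function name `time_statistics` and the data are hypothetical.

```python
import math

def time_statistics(days):
    """Return per-time means, per-time variances, and the correlation matrix.

    `days` is a list of past daily curves, each a list with one value per time.
    Assumes at least two days and a non-zero variance at every time.
    """
    n, t = len(days), len(days[0])
    means = [sum(day[i] for day in days) / n for i in range(t)]
    # Sample covariance between every pair of times.
    cov = [
        [sum((day[i] - means[i]) * (day[j] - means[j]) for day in days) / (n - 1)
         for j in range(t)]
        for i in range(t)
    ]
    variances = [cov[i][i] for i in range(t)]
    # Correlation coefficient = covariance normalized by both standard deviations.
    corr = [
        [cov[i][j] / math.sqrt(variances[i] * variances[j]) for j in range(t)]
        for i in range(t)
    ]
    return means, variances, corr

days = [[10.0, 12.0, 15.0], [11.0, 14.0, 16.0], [9.0, 11.0, 13.0]]
means, variances, corr = time_statistics(days)
```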

Next, the random number conversion unit 210 uses the correlation coefficient matrix read from the correlation coefficient matrix storage unit 190 to convert the per-time sets of random numbers read from the random number set storage unit 170 into sets of correlated random numbers, and sends them to the random number set storage unit 220 for storage.
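One common way to turn independent random numbers into correlated ones is to multiply them by the Cholesky factor of the correlation matrix. This section does not state which transformation the random number conversion unit 210 applies, so the sketch below is an assumption: it uses a Cholesky factor and, for simplicity, independent standard normal inputs rather than the per-time distributions described in the text. The matrix, means, and standard deviations are made-up values.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L * L^T == a (a symmetric, positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def correlate(z, low):
    """Mix independent standard normals z into correlated ones: y = L z."""
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]

# Hypothetical correlation matrix between three times, with per-time mean and std.
corr = [[1.0, 0.8, 0.5],
        [0.8, 1.0, 0.7],
        [0.5, 0.7, 1.0]]
means = [10.0, 12.0, 15.0]
stds = [1.0, 1.5, 2.0]

low = cholesky(corr)
rng = random.Random(0)           # fixed seed so the sketch is repeatable
scenarios = []
for _ in range(5):
    z = [rng.gauss(0.0, 1.0) for _ in means]   # independent random numbers
    y = correlate(z, low)                      # correlated random numbers
    scenarios.append([m + s * v for m, s, v in zip(means, stds, y)])
```

Each entry of `scenarios` is one candidate time series whose values at neighboring times move together according to `corr`.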

Next, the prediction result creation unit 230 creates a prediction result of the time-series data and sends it to the prediction result storage unit 240 for storage. The prediction result display unit 310 converts the prediction result read from the prediction result storage unit 240 into display data and sends it to the display device 320. As a result, the display device 320 displays the prediction result of the time-series data based on the display data sent from the prediction result display unit 310.
[Second Embodiment]

FIG. 5 is a diagram showing the configuration of a time-series data prediction apparatus according to the second embodiment. In FIG. 5, some of the components of the time-series data prediction apparatus according to the first embodiment shown in FIG. 4 are simplified for convenience.

In this time-series data prediction apparatus, the data selection unit 120 uses the past performance data read from the first past performance data storage unit 110 to execute a known prediction process 1211 based on a conventional prediction method such as multiple regression analysis or a neural network, uses the result to execute a process 1212 that predicts part or all of the time-series data to be predicted, and then executes a process 1213 that selects, from all the past performance data, the past performance data whose values are close to the predicted values. The past performance data selected by process 1213 is sent to and stored in the second past performance data storage unit 130 as the past performance data used for prediction.

In FIG. 5, past performance data whose values are similar to the result predicted by the conventional prediction method is selected, and various measures can be used as the similarity criterion. For example, a dissimilarity such as that shown in equation (9) can be defined, and a predetermined number of days can be selected in ascending order of dissimilarity and judged to be similar.

Here, X_i is the predicted value of the prediction target time-series data at the i-th time, and x_{j,i} is the actual value at the i-th time in the j-th past performance data. w_i is the weighting coefficient for the i-th time. The maximum or minimum value of the time-series data can also be used as X_i.

FIG. 6 is a diagram for explaining the method of selecting past performance data similar to the predicted values described above. In FIG. 6, polylines A, B, C, and D are past performance data, and polyline Z connects the points (X_1 to X_n) of the prediction target time-series data that were predicted in advance. In this example, polyline B is the most similar past performance data. In the second embodiment, two or more sets of past performance data must be selected, so, for example, polylines B and C can be selected.

In addition, similarity can be judged by predicting in advance part or all of the time-series data, or its maximum or minimum value, and using the weighted sum of squares of the differences between this prediction result and the corresponding values of the past performance data. It is also possible to regard the values of the time-series data at each time as a position vector and select the data at a short distance. As the definition of distance, the Mahalanobis distance and others can be used in addition to the ordinary Euclidean distance. The dissimilarity shown in equation (9) can also be regarded as a kind of distance.
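The selection of similar days can be sketched as follows. Equation (9) is not reproduced in this text, so the sketch uses the weighted sum of squared differences mentioned in the paragraph above as the dissimilarity; that choice, the function names, and all numbers (which are not taken from FIG. 6) are assumptions made for illustration.

```python
def dissimilarity(predicted, actual, weights):
    """Weighted sum of squared differences between predicted and actual values."""
    return sum(w * (p - a) ** 2 for w, p, a in zip(weights, predicted, actual))

def select_similar(predicted, history, weights, count):
    """Return the names of the `count` past days with the smallest dissimilarity."""
    ranked = sorted(
        history,
        key=lambda name: dissimilarity(predicted, history[name], weights),
    )
    return ranked[:count]

predicted = [10.0, 14.0, 12.0]     # X_1..X_n, predicted in advance
weights = [1.0, 2.0, 1.0]          # w_i, here emphasizing the middle time
history = {                        # x_{j,i}: hypothetical past curves
    "A": [14.0, 18.0, 16.0],
    "B": [10.5, 14.0, 11.5],
    "C": [9.0, 13.0, 12.0],
    "D": [5.0, 8.0, 7.0],
}

similar = select_similar(predicted, history, weights, count=2)
print(similar)  # ['B', 'C']
```

With these made-up values the dissimilarities are 64, 0.5, 3, and 122 for A to D, so B is the closest day and B and C are the two days selected.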

For the similar days selected in this way, as in the first embodiment, the probability density distribution creation unit 150 creates a probability density distribution for each time and sends it to the mean/variance/covariance calculation unit 141. The mean/variance/covariance calculation unit 141 calculates the mean and variance at each time and the correlation coefficients (covariances) between times, and sends them to the random number generation unit 161.

The random number generation unit 161 creates a set of random numbers for each time based on the per-time mean and variance and the correlation coefficients (covariances) between times sent from the mean/variance/covariance calculation unit 141, and converts it into a set of correlated random numbers. It then predicts the values of the time-series data based on the set of correlated random numbers, creates a prediction result, and sends it to the prediction result storage unit 240. The subsequent operation is the same as in the first embodiment.
[Third Embodiment]

The configuration of the time-series data prediction apparatus according to the third embodiment is the same as that of the second embodiment shown in FIG. 5. Only the operation of the apparatus according to the third embodiment is described below.

FIG. 7 is a flowchart showing the operation of the time-series data prediction apparatus according to the third embodiment. The third embodiment is described using a one-day demand curve as an example of time-series data.

When the operation starts, the attributes of the prediction target day are first determined (step S11). Next, past performance data having the same attributes is selected (step S12). Here, an attribute is a property that characterizes past performance data, such as the year, month, day of the week, or weather. Using this past performance data, a reference predicted value in the time-series data is first predicted with a prediction method such as multiple regression analysis (step S13). The maximum or minimum value of the time-series data can be used as the reference predicted value. The predicted value at a reference time, or the value at the beginning or end of the day, can also be used. If the value at the end of the previous day is given as the value at the beginning of the day, similar days can be selected with continuity from the previous day taken into account. Next, similar days are selected from the past performance data using the maximum or minimum value obtained in step S13 (step S14). The same method as in the second embodiment can be used to select similar days.

Once the similar days are selected, a probability density distribution of demand at each time is created in the same manner as in the first embodiment (step S15), and then the mean and variance at each time and the correlations (covariances) between times are calculated for these similar days (step S16). Next, random numbers that reproduce these statistical properties are generated (step S17), and demand scenarios are then created (step S18). This makes it possible to perform prediction that takes the distribution into account.

Because time-series data can be predicted with the distribution taken into account as described above, various future phenomena can be predicted as expected values using these time-series scenarios.
[Fourth Embodiment]

The configuration of the time-series data prediction apparatus according to the fourth embodiment is the same as that of the first embodiment shown in FIG. 4. Only the operation of the apparatus according to the fourth embodiment is described below.

FIG. 8 is a flowchart showing the operation of the time-series data prediction apparatus according to the fourth embodiment, and shows the method of calculating the expected value mentioned above. This process starts after the prediction result creation unit 230 described in the first embodiment has created a prediction result. When the prediction result is created, the M prediction scenarios of the time-series data are first stored in the prediction result storage unit 240 (step S21). Next, the value of an evaluation amount p at time i and scenario number m is defined (step S22). Here, the evaluation amount is written as p = p(i, m).

Next, the scenario number m is initialized to 0 (step S23). The scenario number m is then incremented by 1 (step S24). Next, the evaluation amount p is evaluated at each time i of the m-th time-series data (step S25). It is then checked whether the scenario number m has reached M (step S26). If it is determined in step S26 that m has not reached M, the process returns to step S24 and the above processing is repeated.

On the other hand, when it is determined in step S26 that the scenario number m has reached M, the expected value is calculated (step S27). The expected value can be expressed by the following equation (10). Thereafter, the expected value <p(i)> of the evaluation amount p at each time is stored in a storage device (not shown) (step S28).
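The loop of steps S23 to S27 can be sketched as follows. Equation (10) is not reproduced in this text; the sketch assumes that the M scenarios are equally weighted, so that the expected value at time i is the arithmetic mean of p(i, m) over m. The function name `expected_values`, the callback `evaluate`, and the sample scenarios are hypothetical.

```python
def expected_values(scenarios, evaluate):
    """Average evaluate(i, scenario) over all scenarios, for each time i.

    Assumes at least one scenario and equal weight for every scenario.
    """
    count = len(scenarios)                 # M
    times = len(scenarios[0])
    totals = [0.0] * times
    m = 0                                  # step S23: initialize m
    while True:
        m += 1                             # step S24: increment m
        scenario = scenarios[m - 1]
        for i in range(times):             # step S25: evaluate p(i, m)
            totals[i] += evaluate(i, scenario)
        if m == count:                     # step S26: all scenarios done?
            break
    return [total / count for total in totals]   # step S27: expected value

# Two hypothetical scenarios; the evaluation amount is the value itself.
scenarios = [[1.0, 2.0], [3.0, 6.0]]
expected = expected_values(scenarios, lambda i, s: s[i])
print(expected)  # [2.0, 4.0]
```

Passing a different `evaluate` function corresponds to choosing a different evaluation amount p, for example a cost computed from the demand at time i.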

The expected value can simply be taken from the values of the time-series data themselves, for example the total of the time-series data over a certain period. Alternatively, the result of applying various operations to the values of the time-series data, or a value obtained by averaging such results, can be used. In this case, the apparatus can be configured to change the calculation method, more specifically to change the input parameters and recalculate, so that the expected value becomes a desirable value.

Specifically, if the prediction result is a set of scenarios for one day of power demand, the expected value of the total daily power demand can be obtained by calculating the total daily demand of each scenario and averaging the totals. If the prediction result is the power output of renewable energy generation such as solar power, an optimal control method or power trading plan can be examined. The expected value of the power output and the expected value of the revenue from power trading can also be predicted.
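The daily-total case described above amounts to summing each scenario over the day and then averaging the sums. A minimal sketch follows, assuming equally weighted scenarios; the function name and the demand values are made up for illustration.

```python
def expected_daily_total(scenarios):
    """Mean over scenarios of each scenario's total over the day."""
    totals = [sum(scenario) for scenario in scenarios]
    return sum(totals) / len(totals)

# Three hypothetical one-day demand scenarios (arbitrary units, three times per day).
demand_scenarios = [
    [30.0, 45.0, 40.0],
    [28.0, 50.0, 42.0],
    [32.0, 47.0, 38.0],
]
daily_expectation = expected_daily_total(demand_scenarios)
```

Here the scenario totals are 115, 120, and 117, so the expected daily total is their mean, 352 / 3.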

If the prediction result is the future sales volume of a product, appropriate inventory and purchase quantities can be examined. Furthermore, if the prediction result is the future movement of a stock price, the optimal timing for buying and selling the stock can be determined.

In this way, if time-series data can be predicted as scenarios that statistically correctly reproduce the past performance data, the range of applications is very wide and the benefit is large.

Several embodiments have been described above, but these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents.

110 First past performance data storage unit
120 Data selection unit
130 Second past performance data storage unit
140, 141 Mean/variance calculation unit
150 Probability density distribution creation unit
160, 161 Random number generation unit
170 Random number storage unit
180 Correlation coefficient calculation unit
190 Correlation coefficient matrix storage unit
210 Random number conversion unit
220 Random number storage unit
230 Prediction result creation unit
240 Prediction result storage unit
250 Prediction result display unit
320 Display device

Claims

1. A method for predicting time-series data, comprising:
a data selection step of selecting, from past time-series data, data of days similar to a prediction target date;
a probability density distribution creation step of calculating, from the data selected in the data selection step, a probability density distribution that differs for each time;
a random number generation step of generating a set of a plurality of random numbers in accordance with the per-time probability density distributions calculated in the probability density distribution creation step;
a correlation coefficient calculation step of calculating correlations between times in the past time-series data;
a random number conversion step of converting, based on the correlations calculated in the correlation coefficient calculation step, the set of random numbers generated in the random number generation step into a set of mutually correlated random numbers; and
a prediction result creation step of predicting values of the time-series data based on the set of random numbers obtained by the conversion in the random number conversion step and creating a prediction result.

2. The method for predicting time-series data according to claim 1, wherein the data selection step includes:
a prediction step of predicting part or all of the time-series data to be predicted from the past time-series data, using a prediction result obtained by a known prediction method; and
a selection step of selecting, from all the past performance data, the past performance data whose values are similar to the result predicted in the prediction step.

3. The method for predicting time-series data according to claim 2, wherein, in the prediction step, the maximum value and the minimum value of the time-series data are predicted in advance using multiple regression analysis.

4. The method for predicting time-series data according to claim 2, wherein, in the prediction step, the maximum value and the minimum value of the time-series data are predicted in advance using a neural network.

5. The method for predicting time-series data according to claim 2, wherein, in the selection step, whether values are similar to the result predicted in the prediction step is determined using a weighted sum of squares of the differences between the predicted result and the corresponding values of the past performance data, a Euclidean distance, or a Mahalanobis distance.

6. The method for predicting time-series data according to claim 1, further comprising a step of calculating, from the scenarios of the time-series data indicated by the prediction result created in the prediction result creation step, an expected value given by the total value of the time-series data over a certain period, or an expected value obtained by averaging the results of an operation performed on the data of each scenario over a certain period, and changing the calculation method so that the expected value becomes a desired value.

7. A prediction program for causing a computer to execute the method for predicting time-series data according to any one of claims 1 to 6.

8. A computer-readable storage medium storing a prediction program for causing a computer to execute the method for predicting time-series data according to any one of claims 1 to 6.

9. A time-series data prediction apparatus comprising:
a data selection unit that selects, from past time-series data, data of days similar to a prediction target date;
a probability density distribution creation unit that calculates, from the data selected by the data selection unit, a probability density distribution that differs for each time;
a random number generation unit that generates a set of a plurality of random numbers in accordance with the per-time probability density distributions from the probability density distribution creation unit;
a correlation coefficient calculation unit that calculates correlations between times in the past time-series data;
a random number conversion unit that converts, based on the correlations calculated by the correlation coefficient calculation unit, the set of random numbers generated by the random number generation unit into a set of mutually correlated random numbers; and
a prediction result creation unit that predicts values of the time-series data based on the set of random numbers obtained by the conversion in the random number conversion unit and creates a prediction result.