JP6397380B2

JP6397380B2 - Spatio-temporal variable prediction apparatus and program

Info

Publication number: JP6397380B2
Application number: JP2015151299A
Authority: JP
Inventors: 大川 真耶; 澤田 宏; 上田 修功
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Current assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Priority date: 2015-07-30
Filing date: 2015-07-30
Publication date: 2018-09-26
Anticipated expiration: 2035-07-30
Also published as: JP2017033198A

Description

The present invention relates to a spatio-temporal variable prediction apparatus and program.

As a conventional technique, there is a method that uses a kriging model to estimate or predict unobserved spatio-temporal variables from time-series data of a spatio-temporal variable. Suppose that N observed values {t_1, ..., t_N} are obtained at N locations {x_1, ..., x_N} in a space with geographical and temporal extent. The kriging model outputs the predicted value t_* at a new observation point x_* as a probability distribution.

The kriging model is based on the assumption that the more similar the input variables (position, time, etc.) are, the closer the observed values are. What counts as "similar" can be defined freely according to the problem setting, and is expressed by functions called the semivariogram and the covariogram. Typical semivariograms used in modeling include the nugget effect model, the spherical model, the exponential model, and the Gaussian model. As preprocessing for prediction, the semivariogram and covariogram are selected and their parameters are estimated using all the data. Model selection and parameter estimation use an "empirical variogram", which is a linear sum over the differences between pairs of spatio-temporal variables in all the data.
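The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming the classical binned estimator (half the mean squared difference per distance bin) for the empirical variogram and one common parameterization of the exponential model; the function names, the binning scheme, and the parameters `nugget`, `sill`, and `range_` are assumptions made for this sketch and do not come from the patent text.

```python
import math

def empirical_semivariogram(points, values, bin_width, n_bins):
    """Average 0.5 * (t_i - t_j)^2 over point pairs, grouped by separation distance."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])   # distance between the two inputs
            b = int(h // bin_width)               # index of the distance bin
            if b < n_bins:
                sums[b] += 0.5 * (values[i] - values[j]) ** 2
                counts[b] += 1
    # Bins with no pairs are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]

def exponential_model(h, nugget, sill, range_):
    """Exponential semivariogram: rises from the nugget towards the sill."""
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / range_))

gamma = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], 1.0, 3)
```

A candidate model such as `exponential_model` would then be fitted to the binned values, for example by weighted least squares as in the conventional procedure.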

A semivariogram model is fitted to the empirical variogram constructed from the data points according to the above definition, and model selection and parameter estimation are performed by weighted least squares. Simple kriging, universal kriging, block kriging, and others have been proposed as kriging models for prediction, and the model can be chosen freely according to the problem setting. All of these models rest on the assumption that the random field of the spatio-temporal variable has intrinsic stationarity, or a form of stationarity akin to it.

Intrinsic stationarity is the property that (1) the mean of the spatio-temporal variable is constant regardless of the input variables (time period or region), and (2) the variance of any pair of spatio-temporal variables depends only on the similarity of their input variables. Simple kriging is a model that assumes intrinsic stationarity of the random field of the spatio-temporal variable; the predicted value t_* of the spatio-temporal variable at a new input variable x_* follows a Gaussian distribution with the mean and variance given by the following equations.

Here, γ is the covariogram, and C is the N × N covariance matrix with elements γ(x_n, x_m) + β^{-1} δ_nm, where δ_nm is the Kronecker delta and β is a constant. Simple kriging can be regarded as a form of Gaussian process regression. In a Gaussian process, a kernel function is used in place of the covariogram. Stationarity condition (2) corresponds to the assumption that the definition of "similar" is uniquely determined.
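For orientation, the predictive distribution of standard zero-mean Gaussian process regression, to which simple kriging is compared here, can be written in this passage's notation as below. This is the textbook form rather than a transcription of the patent's own equation, and the symbols k and c are introduced only for this sketch.

```latex
\begin{align}
  m(x_*) &= \mathbf{k}^{\top} C^{-1} \mathbf{t}, \\
  \sigma^{2}(x_*) &= c - \mathbf{k}^{\top} C^{-1} \mathbf{k}, \\
  \mathbf{k} &= \bigl(\gamma(x_1, x_*), \dots, \gamma(x_N, x_*)\bigr)^{\top}, \qquad
  c = \gamma(x_*, x_*) + \beta^{-1}, \qquad
  \mathbf{t} = (t_1, \dots, t_N)^{\top}.
\end{align}
```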

As described above, the kriging model (or Gaussian process regression) predicts the value of a spatio-temporal variable on the assumption that the definition of "similar" is uniquely determined. In practice, however, the influence that a spatio-temporal variable receives from "similar" input variables should differ according to characteristics of the input variables such as region and time period. A mixture of Gaussian processes was therefore proposed, which relaxes the stationarity assumption of the kriging model and introduces local non-stationarity. A mixture of Gaussian processes is a model that mixes multiple single Gaussian processes and is expressed as a superposition of them. In a mixture of Gaussian processes, the predicted value t_* at a new input variable x_* follows a Gaussian distribution with the mean and variance given by the following equations.

p(z_* = r | x_*) is the probability that t_* was generated by the r-th component, and is called the responsibility. The data are classified into multiple clusters according to the characteristics of each input variable (region, time period, etc.), and for each cluster the semivariogram that best explains the data is selected and its parameters are estimated. The cluster to which each data point belongs and the parameters of each cluster are estimated simultaneously using the EM algorithm (see Non-Patent Documents 1 and 2).

Non-Patent Document 1: S. De Iaco, D. E. Myers, and D. Posa. "Space-time analysis using a general product-sum model." Statistics & Probability Letters, 52(1): 21-28, 2001.

Non-Patent Document 2: Benedikt Gräler, Lydia E. Gerharz, and Edzer J. Pebesma. "Spatio-temporal analysis and interpolation of PM10 measurements in Europe." Technical report, ETC/ACM, 2012.

As described above, the conventional mixture of Gaussian processes classifies the input data into clusters without considering the dimensions of the input variables or the differences in the nature of the individual input variables. In reality, however, clusters of spatio-temporal variables should be temporally and spatially correlated. The prior art therefore has the problem that it cannot accurately predict the distribution of real-world spatio-temporal variables that have temporal and spatial correlation.

The present invention has been made in view of the above, and its object is to provide a spatio-temporal variable prediction apparatus and program capable of accurately predicting the values of spatio-temporal variables having temporal and spatial correlation.

To achieve the above object, the spatio-temporal variable prediction apparatus according to the present invention predicts the value of a spatio-temporal variable for unobserved position information and time information, based on a set of observation data containing observed values of the spatio-temporal variable for input variables having position information and time information. The model for predicting the value of the spatio-temporal variable for an input variable is a hierarchical mixture of Gaussian processes, in which a plurality of Gaussian processes are mixed in a plurality of layers corresponding to spatial extent and temporal extent. The apparatus includes a learning unit that, based on the set of observation data, learns (a) for each of the plurality of Gaussian processes included in the model, the hyperparameters of its kernel function, which is a function defining the similarity between observation data, and (b) the responsibility, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data.

In the spatio-temporal variable prediction apparatus according to the present invention, the learning unit may include the following. A responsibility estimation unit estimates, based on the set of observation data and the hyperparameters of the kernel functions of the plurality of Gaussian processes, the unit responsibility, which is a parameter representing the contribution to each of the observation data of each of a plurality of units each consisting of a plurality of Gaussian processes, and the responsibility, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data. A Gaussian process parameter estimation unit estimates the hyperparameters of the kernel function of each of the plurality of Gaussian processes, based on the set of observation data and on the unit responsibilities and responsibilities estimated by the responsibility estimation unit for each of the observation data. An iteration determination unit repeats the estimation by the responsibility estimation unit and the estimation by the Gaussian process parameter estimation unit until a predetermined iteration termination condition is satisfied.

The spatio-temporal variable prediction apparatus according to the present invention may further include a spatio-temporal variable calculation unit. Based on an input variable having unobserved position information and time information, this unit estimates the responsibility, which is a parameter representing the contribution of each of the plurality of Gaussian processes to that input variable. It then predicts the value of the spatio-temporal variable for the input variable, based on the hyperparameters of the kernel functions learned by the learning unit for each of the plurality of Gaussian processes and on the estimated responsibility of each Gaussian process for the input variable.

The program of the present invention is a program for causing a computer to function as each unit constituting the above spatio-temporal variable prediction apparatus.

As described above, the spatio-temporal variable prediction apparatus and program of the present invention make it possible to accurately predict the values of spatio-temporal variables having temporal and spatial correlation. This effect is obtained by learning, from a set of observation data containing observed values of the spatio-temporal variable for input variables having position information and time information, (a) the hyperparameters of the kernel function of each of the plurality of Gaussian processes included in the prediction model, which is a hierarchical mixture of Gaussian processes mixed in a plurality of layers corresponding to spatial extent and temporal extent, and (b) the responsibility, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data.

FIG. 1 is a block diagram of the spatio-temporal variable prediction apparatus according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of a set of observation data.

FIG. 3 is a diagram showing a configuration example of the input unit and the output unit.

FIG. 4 is a flowchart showing the learning processing routine of the spatio-temporal variable prediction apparatus according to the embodiment of the present invention.

FIG. 5 is a flowchart showing the spatio-temporal variable calculation processing routine of the spatio-temporal variable prediction apparatus according to the embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Overview>

The embodiment of the present invention proposes a hierarchical mixture of Gaussian processes, an extension of the mixture of Gaussian processes. The hierarchical mixture of Gaussian processes is a model in which Gaussian processes are mixed hierarchically. In this model, the predicted value t_* at a new input variable x_* follows a Gaussian distribution with the mean given by the following equation.

p(z_*' = r' | z_* = r, x_*) is the probability that t_* was generated by the r-th component of the r'-th unit, and is called the responsibility. z and z' are latent variables corresponding to the clusters of the input variables. The cluster to which each data point belongs and the parameters of each cluster are estimated simultaneously using the EM algorithm.
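One way to picture a two-level mixture is as a weighted sum of per-process predictive means, with one set of weights choosing a group and a second set choosing a Gaussian process within that group. The sketch below assumes exactly that form, which is the usual one for hierarchical mixture-of-experts models; it is not a transcription of the patent's equation, and the function name, argument names, and numbers are hypothetical.

```python
def hierarchical_mean(group_weights, member_weights, member_means):
    """Combine per-GP predictive means using two levels of mixture weights.

    group_weights[g]      : weight of top-level group g (sums to 1)
    member_weights[g][m]  : weight of GP m inside group g (each row sums to 1)
    member_means[g][m]    : predictive mean of GP (g, m) at the query point
    """
    total = 0.0
    for g, w_group in enumerate(group_weights):
        for m, w_member in enumerate(member_weights[g]):
            total += w_group * w_member * member_means[g][m]
    return total

# Illustrative numbers only.
prediction = hierarchical_mean(
    [0.25, 0.75],
    [[0.5, 0.5], [1.0, 0.0]],
    [[10.0, 20.0], [40.0, 99.0]],
)
```

A Gaussian process whose weight is zero (here the one with mean 99.0) has no effect on the combined prediction.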

The spatio-temporal variable prediction apparatus according to the embodiment targets time-series data of spatio-temporal variables (population distribution, the speed and direction of pedestrian and traffic flows, reserves of mineral resources such as gold and diamonds, meteorological data such as precipitation, land prices, etc.) and can be applied flexibly according to the observation data. The following describes, as the embodiment, the case of estimating and predicting the distribution of the spatio-temporal variable at unobserved locations or in the future, given time-series data of the population density distribution as observation data.

<Configuration of the Spatio-temporal Variable Prediction Apparatus According to the Embodiment of the Present Invention>

Next, the configuration of the spatio-temporal variable prediction apparatus according to the embodiment is described. As shown in FIG. 1, the spatio-temporal variable prediction apparatus 100 can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and the programs for executing the processing routines described later. The apparatus 100 predicts the value of the spatio-temporal variable at unobserved position information and time information, based on a set of observation data containing observed values of the spatio-temporal variable for input variables having position information and time information. Functionally, as shown in FIG. 1, the apparatus 100 includes an operation unit 10, a population density information storage unit 12, an input unit 13, a calculation unit 14, a spatio-temporal variable calculation unit 26, and an output unit 28. The operation unit 10 and the calculation unit 14 are connected to the population density information storage unit 12.

The operation unit 10 accepts various user operations on the data stored in the population density information storage unit 12 described later. These operations include registering, modifying, and deleting the information stored in the population density information storage unit 12. Any input means may be used for the operation unit 10, such as a keyboard, a mouse, a menu screen, or a touch panel. The operation unit 10 can be implemented by a device driver for input means such as a mouse, or by control software for a menu screen.

The population density information storage unit 12 stores the set of observation data to be analyzed by the calculation unit 14 described later. In response to a request from the calculation unit 14, it reads out the set of observation data and sends it to the calculation unit 14.

The set of observation data stored in the population density information storage unit 12 is a set of pairs of an input variable x and a spatio-temporal variable t. In the embodiment, the input variable x comprises a position and a time, and the spatio-temporal variable t is the population density. The population density at a given position and time can be written as {x_i, t_i}. The population density information storage unit 12 is, for example, a Web server that holds Web pages or a database server equipped with a database.

FIG. 2 shows an example of a set of observation data. As shown in FIG. 2, for example, information representing a position ID and a time is stored as the input variable x, and information representing the population density is stored as the spatio-temporal variable t.
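A record layout of this kind could be represented as in the sketch below. The class name, field names, types, and sample values are all hypothetical and chosen only for illustration; the text specifies only that a position ID and a time form the input variable and that the population density is the observed value.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Observation:
    location_id: int            # position ID (part of the input variable x)
    time: datetime              # observation time (part of the input variable x)
    population_density: float   # observed spatio-temporal variable t

# Made-up sample records, not data from the patent.
observations = [
    Observation(1, datetime(2015, 7, 30, 9, 0), 120.5),
    Observation(2, datetime(2015, 7, 30, 9, 0), 87.0),
]
```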

The input unit 13 accepts an input variable having unobserved position information and time information.

In the embodiment, the population density, which is the spatio-temporal variable, is predicted for the position information and time information received by the input unit 13, based on the parameters obtained by the calculation unit 14 described later.

Any input means may be used for the input unit 13, such as a keyboard, a mouse, a menu screen, or a touch panel. The input unit 13 can be implemented by a device driver for input means such as a mouse, or by control software for a menu screen.

FIG. 3 shows a configuration example of the input unit 13 in the present embodiment. In this example, the input unit 13 and the output unit 28 described later are configured as a single screen.

As shown in FIG. 3, the input unit 13 accepts an input variable that includes position information on the point or region to be predicted and the time information for which the prediction is made. The output unit 28 displays, as the prediction result, the value of the spatio-temporal variable for the input variable output by the spatio-temporal variable calculation unit 26 described later.

The calculation unit 14 includes a learning unit 16, a responsibility parameter storage unit 22, and a Gaussian process parameter storage unit 24. As the result of learning, the calculation unit 14 calculates the hyperparameters of the Gaussian processes and the parameters representing the responsibility of each Gaussian process.

The principle of the embodiment is now described.

Suppose that the population density information storage unit 12 stores a set of observation data D = {x_i, t_i}_{i=1}^{N}, which is spatio-temporal variable data. Here x_i is a vector containing multiple variables such as position and time, and t_i is the population density.

The problems to be solved here are (1) predicting the population density value t_* at a new input variable x_* (an unobserved position and time) received by the input unit 13, and (2) predicting the population density values in a region that includes unobserved points.

The present embodiment focuses on problem (1). To obtain the predicted surface of the population density distribution, problem (1) is solved repeatedly while varying x_*.

Assume that the model followed by the predicted value t_* of the spatio-temporal variable can be described by a mixture of R Gaussian processes. Under this assumption, the expected value of t_* corresponding to a new input variable x_* can be written as in equation (1) below.

Here X is the set of input variables x (X = {x_i}_{i=1}^{N}); Θ = {θ^{rr'}}_{r,r'=1}^{R,R'} is the set of hyperparameters, where θ^{rr'} denotes the hyperparameters of the r-th Gaussian process of the r'-th unit; and z and z' are latent variables corresponding to the classes into which the data are classified according to characteristics of the input variables (region, time period, etc.). Assuming that x_* was generated by the r-th Gaussian process, the predicted value t_* follows a Gaussian distribution with the mean and variance given by the following equations.

Here k is the kernel function, k^{rr'} is the vector with elements k^{rr'}(x_*, X), and C^{rr'} is the N × N covariance matrix with elements k^{rr'}(x_n, x_m) + β^{-1} δ_nm + Ψ^{rr'}_nm. t is the vector with elements t_i. Ψ^{rr'} is the diagonal matrix defined by the following equation.

Here σ_n is a constant. When the i-th data point receives no contribution from the r-th Gaussian process, that is, when the responsibility p(z_i' = r', z_i = r | x_i) is 0, the (i, i) element of C^{rr'} diverges to infinity and the (i, i) component of (C^{rr'})^{-1} becomes 0. In other words, t_i does not affect the parameter estimation of the r-th Gaussian process of the r'-th unit.
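The effect described here can be checked numerically with a small sketch. It assumes that the diagonal term acts as a per-point noise variance that becomes very large when a point's responsibility approaches zero; the exact definition of Ψ is not reproduced in this text, and the squared-exponential kernel, the helper names, and the numbers below are illustrative assumptions.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalars (an illustrative choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_mean(xs, ts, noise, x_star):
    """Predictive mean k^T C^{-1} t with per-point noise on the diagonal of C."""
    n = len(xs)
    C = [[rbf(xs[i], xs[j]) + (noise[i] if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(C, ts)
    return sum(rbf(x_star, xs[i]) * alpha[i] for i in range(n))

# The middle point gets an enormous noise term, mimicking zero responsibility.
with_ignored_point = gp_mean([0.0, 1.0, 2.0], [1.0, 5.0, 1.2], [0.1, 1e12, 0.1], 1.0)
without_point = gp_mean([0.0, 2.0], [1.0, 1.2], [0.1, 0.1], 1.0)
```

The two predictions agree to within numerical precision, which is the sense in which a point with zero responsibility drops out of that Gaussian process.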

In conventional methods, the kernel function k(x_n, x_m) is fixed uniquely at the preprocessing stage of prediction. In the embodiment, by contrast, the kernel functions can be designed freely, and the parameters and weights of each kernel function are estimated automatically. The embodiment uses two representative covariograms commonly used in time-series geostatistical analysis of, for example, meteorological and environmental data.

Here, letting n_x be the number of dimensions of the input variable x, the following holds.

The kernel functions corresponding to the above covariograms can be written as follows.

The embodiment uses a new kernel function, given by the following equation, defined as a linear sum of the above kernel functions.
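A kernel built as a linear sum of base kernels can be sketched as below. Which two covariograms the patent uses is not recoverable from this text, so the exponential and Gaussian forms here are assumptions chosen because both are named earlier as typical models; the function names, weights, and scale parameters are likewise hypothetical.

```python
import math

def exponential_kernel(x, y, scale):
    """Exponential kernel: decays with distance."""
    return math.exp(-math.dist(x, y) / scale)

def gaussian_kernel(x, y, scale):
    """Gaussian kernel: decays with squared distance."""
    return math.exp(-(math.dist(x, y) / scale) ** 2)

def combined_kernel(x, y, weights, scales):
    """Weighted linear sum of the two base kernels."""
    return (weights[0] * exponential_kernel(x, y, scales[0])
            + weights[1] * gaussian_kernel(x, y, scales[1]))

# Inputs are (position, time) tuples here purely for illustration.
similarity = combined_kernel((0.0, 0.0), (1.0, 2.0), (0.3, 0.7), (1.0, 2.0))
```

With non-negative weights the sum of two valid kernels is again a valid kernel, which is why the weights can be learned alongside the scale parameters.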

The Gaussian process hyperparameters to be estimated here are the following.

Given the set of input variables X = {x_i}_{i=1}^{N} and the set of Gaussian process hyperparameters Θ = (θ^{11}, ..., θ^{RR'}), the likelihood function of the observation data set D can be written as follows.

Here t_{\i} denotes t with its i-th element removed.

In the embodiment, the EM algorithm is used to estimate the parameters Θ, z, and z' that maximize the likelihood function above. The EM algorithm estimates the parameters by steps [1] to [4] below. In the embodiment, the softmax functions shown in equations (4) and (5) below are adopted as the prior distribution of the responsibility, and the parameters of these softmax functions, which the spatio-temporal variable calculation unit 26 described later uses when predicting the spatio-temporal variable, are also calculated.
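A softmax prior over Gaussian processes can be sketched as follows. The sketch assumes a linear score (the inner product of a per-process parameter vector with the input) inside the softmax, which is a common choice but is not confirmed by this text; the function and argument names are hypothetical.

```python
import math

def softmax_gate(x, weights):
    """Return softmax probabilities over components for input vector x.

    weights[r] plays the role of a per-component parameter vector (assumed form).
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_gate([1.0, 2.0], [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]])
```

Because the output depends only on the input and the learned parameter vectors, such a function can be evaluated at an unobserved position and time, which is what prediction at a new input requires.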

[1] Select an initial value for each parameter.

[2] In the E step of the EM algorithm, for each observation in the observation data set, compute the following quantities for the input variable of that observation according to the equations below: the responsibility p(z_i = r | z_i' = r', X, t, θ^{rr'}) of the r-th Gaussian process of the r'-th unit, the responsibility p(z_i = r | X, t, θ^{rr'}) of the r-th Gaussian process, and the joint probability p(z_i = r, z_i' = r' | X, t, θ^{rr'}) of the r-th Gaussian process of the r'-th unit.

In equation (2) above, p(t_i | X, t, θ^{rr'}) is a Gaussian distribution with the mean μ_i^{rr'} and variance Σ_i^{rr'} given by the following equations.

[3] In the M step of the EM algorithm, compute the Gaussian process parameters Θ^{new} that maximize the likelihood function Q, based on the quantities computed for each observation in the observation data set: the responsibility p(z_i = r | z_i' = r', X, t, θ^{rr'}) of the r-th Gaussian process of the r'-th unit, the responsibility p(z_i = r | X, t, θ^{rr'}) of the r-th Gaussian process, and the joint probability p(z_i = r, z_i' = r' | X, t, θ^{rr'}) of the r-th Gaussian process of the r'-th unit. Here, the likelihood function Q is approximated in advance by Q~, given by the following equation.

Here, π_i^{rr'} is defined by the following equation.

The derivative of the objective function Q~ with respect to the Gaussian process hyperparameters θ^{rr'} can be written as follows.

Here θ^{rr'}_j is the j-th hyperparameter of the r-th Gaussian process. The Gaussian process parameters Θ that maximize the objective function Q~ can be obtained with a gradient method, a quasi-Newton method, or the like. The update equation can be written as follows.

The update equations for the parameters v^r and v^{rr'} of the softmax functions representing the responsibility can be written as follows.

Here η is a constant.

[4] Check whether the convergence condition of the EM algorithm is satisfied. If it is not, execute

and return to step [2]. If the convergence condition is satisfied, end the process.
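The control flow of steps [1] to [4] can be sketched on a deliberately simpler model. The example below runs EM on a one-dimensional mixture of two equal-weight Gaussians with a known standard deviation, not on Gaussian processes, so only the loop structure carries over; the function name, the convergence test on the log-likelihood, and the data are assumptions made for this illustration.

```python
import math

def em_two_gaussians(data, mu_init, sigma=1.0, tol=1e-8, max_iter=200):
    """EM for a 1-D mixture of two equal-weight Gaussians with known sigma.

    Only the two means are estimated.
    """
    mu = list(mu_init)                       # step [1]: initial values
    prev_ll = -math.inf
    for _ in range(max_iter):
        # Step [2], E step: responsibility of each component for each point.
        resp = []
        ll = 0.0
        for x in data:
            dens = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in mu]
            total = sum(dens)
            resp.append([d / total for d in dens])
            ll += math.log(0.5 * total / (sigma * math.sqrt(2 * math.pi)))
        # Step [4]: stop when the log-likelihood no longer changes.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
        # Step [3], M step: responsibility-weighted means.
        for r in range(2):
            weight = sum(row[r] for row in resp)
            mu[r] = sum(row[r] * x for row, x in zip(resp, data)) / weight
    return mu

means = em_two_gaussians([-2.1, -1.9, -2.0, 2.0, 1.9, 2.1], [-1.0, 1.0])
```

In the apparatus described here, the M step has no closed form and is replaced by numerical optimization of the approximated objective, but the alternation and the convergence check follow the same pattern.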

Accordingly, based on the set of observation data stored in the population density information storage unit 12, the learning unit 16 learns the hyperparameters of each kernel function, which is a function defining the similarity between observation data, and the responsibility, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data.

本発明の実施の形態におけるカーネル関数は、複数のガウス過程を、空間的な広がり及び時間的な広がりに対応する複数の階層で混合した階層混合ガウス過程でモデル化される。また、本発明の実施の形態におけるカーネル関数は、入力変数に対する時空間変数の値を予測するためのモデルに含まれる、複数のガウス過程の各々についての、観測データ同士の類似性を定義する関数である。 The kernel function in the embodiment of the present invention is modeled by a hierarchical mixed Gaussian process in which a plurality of Gaussian processes are mixed in a plurality of hierarchies corresponding to a spatial spread and a temporal spread. In addition, the kernel function in the embodiment of the present invention is a function that defines the similarity between observation data for each of a plurality of Gaussian processes included in a model for predicting the value of a spatiotemporal variable with respect to an input variable. It is.

The learning unit 16 includes a burden rate estimation unit 18, a Gaussian process parameter estimation unit 20, and an iteration determination unit 21.

The burden rate estimation unit 18 estimates, in accordance with equations (6) to (8) above, a unit burden rate, which is a parameter representing the contribution of each of a plurality of units, each composed of a plurality of Gaussian processes, to each of the observation data, and a burden rate, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data. The estimation is based on the set of observation data stored in the population density information storage unit 12 and on either the initial values of the hyperparameters of the kernel functions of the plurality of Gaussian processes or the values previously estimated by the Gaussian process parameter estimation unit 20.
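As a rough illustration of how a unit burden rate and a per-process burden rate relate to one another, the sketch below normalizes a joint weight over (unit, process) pairs with Bayes' rule. It is a generic two-level mixture computation and is not claimed to reproduce equations (6) to (8); the function name `hierarchical_burden_rates`, the array shapes, and the random priors and likelihoods are assumptions made only for this example.

```python
import numpy as np

def hierarchical_burden_rates(unit_prior, process_prior, likelihood):
    """Generic two-level posterior weights (hypothetical helper).

    unit_prior:    (N, U)    prior weight of each unit per observation
    process_prior: (N, U, R) prior weight of each process within a unit
    likelihood:    (N, U, R) likelihood of each observation under each process
    """
    joint = unit_prior[:, :, None] * process_prior * likelihood
    joint = joint / joint.sum(axis=(1, 2), keepdims=True)  # joint posterior over (unit, process)
    unit_rate = joint.sum(axis=2)                           # unit burden rate
    cond_rate = joint / unit_rate[:, :, None]               # burden rate within each unit
    return joint, unit_rate, cond_rate

# Toy inputs: 5 observations, 2 units, 3 processes per unit (hypothetical).
rng = np.random.default_rng(0)
N, U, R = 5, 2, 3
unit_prior = rng.dirichlet(np.ones(U), size=N)           # shape (N, U)
process_prior = rng.dirichlet(np.ones(R), size=(N, U))   # shape (N, U, R)
likelihood = rng.uniform(0.1, 1.0, size=(N, U, R))

joint, unit_rate, cond_rate = hierarchical_burden_rates(
    unit_prior, process_prior, likelihood
)
```

In this sketch the joint weight factorizes as the unit burden rate times the within-unit burden rate, which mirrors the three quantities the M step consumes.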

The Gaussian process parameter estimation unit 20 estimates, for each of the plurality of Gaussian processes, the hyperparameters of its kernel function in accordance with equations (9) to (10) and (13) above. The estimation is based on the set of observation data stored in the population density information storage unit 12 and on the unit burden rates and burden rates that the burden rate estimation unit 18 estimated for each of the observation data.

The Gaussian process parameter estimation unit 20 also estimates the parameters v^r and v^{rr'} of the Softmax function in accordance with equations (11) to (12) above, based on the set of observation data stored in the population density information storage unit 12.

The iteration determination unit 21 causes the estimation by the burden rate estimation unit 18 and the estimation by the Gaussian process parameter estimation unit 20 to be repeated until a predetermined iteration end condition is satisfied. When the iteration end condition is satisfied, the iteration determination unit 21 stores the Softmax function parameters v^r and v^{rr'} estimated by the Gaussian process parameter estimation unit 20 in the burden rate parameter storage unit 22, and stores the hyperparameters estimated by the Gaussian process parameter estimation unit 20 in the Gaussian process parameter storage unit 24.

The burden rate parameter storage unit 22 stores the Softmax function parameters v^r and v^{rr'} estimated by the Gaussian process parameter estimation unit 20.

The Gaussian process parameter storage unit 24 stores the hyperparameters of each Gaussian process kernel function estimated by the Gaussian process parameter estimation unit 20.

Based on the input variable having unobserved position information and time information received by the input unit 13 and on the Softmax function parameters stored in the burden rate parameter storage unit 22, the spatiotemporal variable calculation unit 26 estimates the burden rates, which are parameters representing the contribution of each of the plurality of Gaussian processes to the input variable.

Specifically, based on the input variable x_* and the Softmax function parameters v^r and v^{rr'} stored in the burden rate parameter storage unit 22, the spatiotemporal variable calculation unit 26 estimates the burden rates of the plurality of Gaussian processes for the input variable x_* in accordance with the following equations (14) to (15).

The spatiotemporal variable calculation unit 26 then predicts the value of the spatiotemporal variable for the input variable in accordance with equations (1) to (3) above, based on the estimated burden rates for the input variable x_* and the hyperparameters stored in the Gaussian process parameter storage unit 24.
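The two prediction stages described above (burden rates from Softmax parameters, then a burden-rate-weighted Gaussian process prediction) can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of equations (14) to (15) or (1) to (3): the Softmax logits are assumed to be linear in the input, every Gaussian process uses a squared-exponential kernel on the same training data, and only the predictive mean is combined. The names `softmax`, `rbf`, `gp_mean`, `predict`, `V_unit`, `V_proc`, and `hypers` are hypothetical.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(a - a.max())
    return e / e.sum()

def rbf(A, B, ell, sf):
    """Squared-exponential kernel between row sets A and B."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

def gp_mean(X, t, x_star, ell, sf, noise_var=0.1):
    """Predictive mean of a zero-mean GP at a single input x_star."""
    K = rbf(X, X, ell, sf) + noise_var * np.eye(len(X))
    k_star = rbf(x_star[None, :], X, ell, sf)[0]
    return k_star @ np.linalg.solve(K, t)

def predict(X, t, x_star, V_unit, V_proc, hypers):
    """Burden-rate-weighted combination of GP predictive means.

    V_unit: (U, D) and V_proc: (U, R, D) are assumed linear Softmax weights;
    hypers[u][r] = (length-scale, signal std) of process r in unit u.
    """
    unit_w = softmax(V_unit @ x_star)            # weight of each unit
    pred = 0.0
    for u in range(V_unit.shape[0]):
        proc_w = softmax(V_proc[u] @ x_star)     # weight of each process in unit u
        for r in range(V_proc.shape[1]):
            pred += unit_w[u] * proc_w[r] * gp_mean(X, t, x_star, *hypers[u][r])
    return pred

# Toy data: columns are (x, y, time), all scaled to comparable ranges (hypothetical).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(30, 3))
t = np.sin(3.0 * X[:, 0]) + X[:, 2]
V_unit = rng.normal(size=(2, 3))
V_proc = rng.normal(size=(2, 2, 3))
hypers = [[(0.3, 1.0), (1.0, 1.0)], [(0.5, 1.0), (2.0, 1.0)]]
x_star = np.array([0.5, 0.5, 1.1])   # unobserved location at a future time
density = predict(X, t, x_star, V_unit, V_proc, hypers)
```

Because the weights are products of two softmax outputs, they are non-negative and sum to one, so the prediction is a convex combination of the individual Gaussian process means.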

The output unit 28 outputs, as the result, the value of the spatiotemporal variable that the spatiotemporal variable calculation unit 26 predicted for the received input variable.

For example, as shown in FIG. 3 above, the output unit 28 outputs the population density, which is the predicted value of the spatiotemporal variable at the target point or region and time, together with the degree of congestion, which is information related to that population density.

<Operation of the spatiotemporal variable prediction apparatus according to the embodiment of the present invention>

Next, the operation of the spatiotemporal variable prediction apparatus 100 according to the embodiment of the present invention will be described.

<Learning processing routine>
First, when a set D of observation data is input from the operation unit 10, the spatiotemporal variable prediction apparatus 100 stores the set D in the population density information storage unit 12. The spatiotemporal variable prediction apparatus 100 then executes the learning processing routine shown in FIG. 4.

First, in step S100, the iteration variable j and each parameter are initialized.

Next, in step S102, the iteration determination unit 21 determines whether the predetermined iteration end condition is satisfied. The condition is evaluated on the iteration variable j: if j is less than a predetermined value N_iter, the process proceeds to step S104. If j is greater than or equal to N_iter, the process proceeds to step S110.

In step S104, the burden rate estimation unit 18 estimates the unit burden rate and the burden rate for each of the observation data in accordance with equations (6) to (8) above. The estimation is based on the set of observation data stored in the population density information storage unit 12 and on either the initial values of the hyperparameters of the kernel functions of the plurality of Gaussian processes set in step S100 or the hyperparameters estimated in the previous execution of step S106, described below.

In step S106, the Gaussian process parameter estimation unit 20 estimates, for each of the plurality of Gaussian processes, the hyperparameters of its kernel function in accordance with equations (9) to (10) and (13) above, based on the set of observation data stored in the population density information storage unit 12 and on the unit burden rates and burden rates estimated in step S104 for each of the observation data. The Gaussian process parameter estimation unit 20 also estimates the parameters of the Softmax function in accordance with equations (11) to (12) above, based on the set of observation data stored in the population density information storage unit 12.

In step S108, the iteration variable j is incremented by 1, and the process returns to step S102.

In step S110, the iteration determination unit 21 stores the Softmax function parameters estimated in step S106 in the burden rate parameter storage unit 22, stores the estimated Gaussian process hyperparameters in the Gaussian process parameter storage unit 24, and ends the learning processing routine.
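The control flow of the learning processing routine can be sketched as a fixed-count loop. The sketch assumes that the loop body runs while the iteration variable j is below N_iter and that the learned parameters are returned once j reaches N_iter. The estimation steps are passed in as callables; to keep the example runnable, a toy two-component Gaussian mixture with unit variances stands in for the burden rate estimation and the Gaussian process parameter estimation. All function names and the toy data are hypothetical.

```python
import numpy as np

def run_learning_routine(data, init_params, e_step, m_step, n_iter):
    """Fixed-count EM loop mirroring steps S100 to S110 (sketch)."""
    params = init_params                # S100: initialize parameters
    j = 0                               # S100: initialize iteration variable
    while j < n_iter:                   # S102: check the end condition
        rates = e_step(data, params)    # S104: estimate burden rates
        params = m_step(data, rates)    # S106: estimate parameters
        j += 1                          # S108: increment j
    return params                       # S110: hand back learned parameters

# Toy stand-ins (NOT the estimators of this specification).
def toy_e_step(data, means):
    """Responsibilities of two equal-weight, unit-variance Gaussians."""
    log_lik = -0.5 * (data[:, None] - means[None, :]) ** 2
    log_lik -= log_lik.max(axis=1, keepdims=True)
    resp = np.exp(log_lik)
    return resp / resp.sum(axis=1, keepdims=True)

def toy_m_step(data, resp):
    """Responsibility-weighted means."""
    return (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
means = run_learning_routine(data, np.array([-1.0, 1.0]), toy_e_step, toy_m_step, n_iter=20)
```

Passing the two estimation steps as arguments keeps the loop independent of the model, so the same skeleton applies when the callables are replaced by the hierarchical Gaussian process estimators.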

Next, the spatiotemporal variable calculation processing routine shown in FIG. 5 will be described.

The spatiotemporal variable prediction apparatus 100 executes the spatiotemporal variable calculation processing routine shown in FIG. 5 when the following conditions hold: the learning processing routine has been executed, the Softmax function parameter v^{rr'} is stored in the burden rate parameter storage unit 22, the Gaussian process hyperparameters are stored in the Gaussian process parameter storage unit 24, and an input variable containing the unobserved position information and time information to be predicted has been input.

<Spatiotemporal variable calculation processing routine>
In step S200, the input unit 13 receives an input variable containing unobserved position information and time information.

In step S202, the spatiotemporal variable calculation unit 26 reads the Softmax function parameters stored in the burden rate parameter storage unit 22 and the Gaussian process hyperparameters stored in the Gaussian process parameter storage unit 24.

In step S204, the spatiotemporal variable calculation unit 26 estimates, in accordance with equations (14) to (15) above, the burden rates, which are parameters representing the contribution of each of the plurality of Gaussian processes to the input variable, based on the input variable received in step S200 and the Softmax function parameters read in step S202. The spatiotemporal variable calculation unit 26 then predicts the population density, which is the value of the spatiotemporal variable for the input variable, in accordance with equations (1) to (3) above, based on the estimated burden rates for the input variable and the Gaussian process hyperparameters read in step S202.

The output unit 28 outputs, as the result, the population density predicted by the spatiotemporal variable calculation unit 26 as the value of the spatiotemporal variable for the input variable, and the spatiotemporal variable calculation processing routine ends.

As described above, the spatiotemporal variable prediction apparatus according to the embodiment of the present invention learns two things from a set of observation data, each of which has an observed value of the spatiotemporal variable for an input variable having position information and time information: the hyperparameters of each kernel function, and the burden rates. Each kernel function defines the similarity between observation data for one of the plurality of Gaussian processes included in the model for predicting the value of the spatiotemporal variable for an input variable, the model being a hierarchical mixture of Gaussian processes in which the plurality of Gaussian processes are mixed in a plurality of layers corresponding to spatial extent and temporal extent. Each burden rate is a parameter representing the contribution of one of the plurality of Gaussian processes to one of the observation data. By learning these, the apparatus can predict, with high accuracy, the value of a spatiotemporal variable having temporal and spatial correlation.

In addition, the value of a spatiotemporal variable having temporal and spatial correlation can be predicted accurately.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the above embodiment describes estimating or predicting the distribution of the spatiotemporal variable at unobserved points or in the future, given time series data of the population density distribution as observation data. The present invention can also be applied to time series data of various other spatiotemporal variables, such as population distribution, the speed and direction of pedestrian and traffic flows, reserves of mineral resources such as gold and diamonds, meteorological data such as precipitation, and land prices.

The embodiment of the present invention has been described using an example in which the spatiotemporal variable prediction apparatus 100 estimates each parameter and then uses the estimated parameters to predict the population density, which is the spatiotemporal variable corresponding to the input variable, but the invention is not limited to this. For example, the process of estimating each parameter and the process of predicting the spatiotemporal variable using the estimated parameters may be implemented as separate devices. In this case, the device that estimates each parameter includes the operation unit 10, the population density information storage unit 12, and the calculation unit 14, and the device that predicts the spatiotemporal variable using the estimated parameters includes the input unit 13, the spatiotemporal variable calculation unit 26, and the output unit 28.

The embodiment of the present invention has also been described using an example in which the Softmax function is adopted as the prior distribution of the burden rates and the parameters of the Softmax function are estimated, but the invention is not limited to this, and other functions may be used.

The spatiotemporal variable prediction apparatus 100 described above contains a computer system. When a WWW system is used, the term "computer system" also includes the environment for providing (or displaying) web pages.

Although this specification describes an embodiment in which the program is installed in advance, the program may also be provided stored on a computer-readable recording medium, or provided via a network.

DESCRIPTION OF SYMBOLS
10 Operation unit
12 Population density information storage unit
13 Input unit
14 Calculation unit
16 Learning unit
18 Burden rate estimation unit
20 Gaussian process parameter estimation unit
21 Iteration determination unit
22 Burden rate parameter storage unit
24 Gaussian process parameter storage unit
26 Spatiotemporal variable calculation unit
28 Output unit
100 Spatiotemporal variable prediction apparatus

Claims

A spatiotemporal variable prediction apparatus that predicts the value of a spatiotemporal variable for unobserved position information and time information, based on a set of observation data each having an observed value of the spatiotemporal variable for an input variable having position information and time information, the apparatus comprising:
a learning unit that learns, based on the set of observation data, hyperparameters of each kernel function and burden rates, wherein each kernel function is a function that defines the similarity between the observation data for one of a plurality of Gaussian processes included in a model for predicting the value of the spatiotemporal variable for the input variable, the model being a hierarchical mixture of Gaussian processes in which the plurality of Gaussian processes are mixed in a plurality of layers corresponding to spatial extent and temporal extent, and wherein each burden rate is a parameter representing the contribution of one of the plurality of Gaussian processes to one of the observation data.

The spatiotemporal variable prediction apparatus according to claim 1, wherein the learning unit includes:
a burden rate estimation unit that estimates, based on the set of observation data and the hyperparameters of each of the kernel functions of the plurality of Gaussian processes, a unit burden rate, which is a parameter representing the contribution of each of a plurality of units, each composed of a plurality of Gaussian processes, to each of the observation data, and a burden rate, which is a parameter representing the contribution of each of the plurality of Gaussian processes to each of the observation data;
a Gaussian process parameter estimation unit that estimates, for each of the plurality of Gaussian processes, the hyperparameters of its kernel function, based on the set of observation data and on the unit burden rate of each of the plurality of units and the burden rate of each of the plurality of Gaussian processes estimated by the burden rate estimation unit for each of the observation data; and
an iteration determination unit that repeats the estimation by the burden rate estimation unit and the estimation by the Gaussian process parameter estimation unit until a predetermined iteration end condition is satisfied.

The spatiotemporal variable prediction apparatus according to claim 1, further comprising a spatiotemporal variable calculation unit that estimates, based on a received input variable having unobserved position information and time information, burden rates that are parameters representing the contribution of each of the plurality of Gaussian processes to the input variable,
and that predicts the value of the spatiotemporal variable for the input variable based on the hyperparameters of each of the kernel functions learned by the learning unit for each of the plurality of Gaussian processes and on the estimated burden rates of the plurality of Gaussian processes for the input variable.

A program for causing a computer to function as each unit of the spatiotemporal variable prediction apparatus according to any one of claims 1 to 3.