JP5684055B2

JP5684055B2 - Time-related data analysis apparatus, method, and program

Info

Publication number: JP5684055B2
Application number: JP2011136377A
Authority: JP
Inventors: 勝彦石黒; 澤田　宏; 宏澤田; 冨岡　亮太; 亮太冨岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-06-20
Filing date: 2011-06-20
Publication date: 2015-03-11
Anticipated expiration: 2031-06-20
Also published as: JP2013003967A

Description

本発明は、時間関係データ解析装置、方法、及びプログラムに係り、特に、オブジェクト間の関係データの時系列を解析する時間関係データ解析装置、方法、及びプログラムに関する。 The present invention relates to a time relationship data analysis device, method, and program, and more particularly, to a time relationship data analysis device, method, and program for analyzing a time series of relationship data between objects.

大量のデータがインターネットなどを通じて入手可能な現在、そのような大規模データセットを用いた情報解析やビジネスが盛んに行われている。しかしながら、こういった大量のデータの内容を人間が把握するのは困難である。 Currently, a large amount of data is available through the Internet, and information analysis and business using such a large-scale data set are actively performed. However, it is difficult for humans to grasp the contents of such a large amount of data.

特に最近、人間関係や企業間の取引関係といった、オブジェクトとオブジェクトの間の関係を示す「関係データ」に注目が集まっている。このような関係データを用いれば、例えば友人同士の関係からより精度の高い購買アイテムの推薦を実現したりすることができる。 In particular, recently, attention has been focused on “relation data” indicating relationships between objects such as human relationships and business relationships between companies. By using such relationship data, for example, it is possible to realize purchase item recommendation with higher accuracy from the relationship between friends.

オブジェクトの間の関係データの解析手法はこれまでにも多くの方法が提案されてきている。多くの手法は、２者の間の関係を行列の形で表現できる事に着目して、その行列の構造化を行う。非特許文献１、非特許文献２は、２つのオブジェクト間の関係の有無の情報が与えられたときに、似たような接続先をもつオブジェクトのクラスタリングを行う。また、これらの関係データでは、多くの似た傾向（パターン）を持つ人間やアイテムが存在するため、関係データを行列として見たときにランクが低いという特徴があることが知られている。非特許文献６はこの性質を仮定として用いて部分的に観測された行列の未観測部分を推定する方法である。 Many methods for analyzing relational data between objects have been proposed so far. Many methods focus on the fact that the relationship between the two can be expressed in the form of a matrix, and structure the matrix. Non-Patent Document 1 and Non-Patent Document 2 perform clustering of objects having similar connection destinations when information on the presence / absence of a relationship between two objects is given. In addition, since there are many people and items having similar tendencies (patterns) in these relation data, it is known that the rank is low when the relation data is viewed as a matrix. Non-Patent Document 6 is a method for estimating an unobserved part of a partially observed matrix using this property as an assumption.

これらの手法で問題となるのは、例えば友人間の関係や企業間の取引は時間に応じて変化していく、ということである。したがって、さらに高精度な関係データ解析を実現するためには、一定のオブジェクト間の関係を複数の時間で観測し、その変化を含んだ時間関係データの解析手法が必要となる。そのためには、ただの行列データの解析ではなく、関係を表す行列が時間ごとに定義されるデータの解析手法が必要となる。時間関係データの解析手法はまだ多く提案されていない。非特許文献３、非特許文献４に記載の手法はそれぞれ非特許文献１、非特許文献２を時間変化する場合に適用できるように拡張した手法である。非特許文献３に記載の手法では、状態空間モデルを利用することで、関係が連続的に変化する状況を想定したモデル化を行っている。一方、非特許文献４はＨＭＭを利用することで急激な時間変化も対応可能な時間発展モデルを提案している。また非特許文献５に記載の手法は、時間関係データをテンソルとしてモデル化し、一種の自己回帰モデルを用いて時間変化をモデル化している。 The problem with these methods is that, for example, relationships between friends and transactions between companies change over time. Therefore, in order to realize more accurate relational data analysis, a technique for observing a relation between certain objects at a plurality of times and analyzing temporal relational data including the change is required. For this purpose, a method for analyzing data in which a matrix representing a relationship is defined for each time is required, not just analysis of matrix data. Many methods for analyzing time-related data have not yet been proposed. The methods described in Non-Patent Literature 3 and Non-Patent Literature 4 are methods that have been extended so that Non-Patent Literature 1 and Non-Patent Literature 2 can be applied when time-varying. In the method described in Non-Patent Document 3, modeling is performed assuming a situation in which the relationship changes continuously by using a state space model. On the other hand, Non-Patent Document 4 proposes a time evolution model that can cope with a rapid time change by using an HMM. The method described in Non-Patent Document 5 models time-related data as a tensor and models time changes using a kind of autoregressive model.

K. Nowicki and T. A. B. Snijders. Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96(455):1077−1087, 2001.K. Nowicki and T. A. B. Snijders. Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96 (455): 1077-1087, 2001. C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada, and N. Ueda. Learning systems of concepts with an infinite relational model. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006.C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada, and N. Ueda. Learning systems of concepts with an infinite relational model.In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), 2006. T. Yang, Y. Chi, S. Zhu, Y. Gong, and R. Jin. A Bayesian approach toward finding communities and their evolutions in dynamic social networks. In Proceedings of SIAM International Conference on Data Mining (SDM), 2009.T. Yang, Y. Chi, S. Zhu, Y. Gong, and R. Jin.A Bayesian approach toward finding communities and their evolutions in dynamic social networks.In Proceedings of SIAM International Conference on Data Mining (SDM), 2009. K. Ishiguro, T. Iwata, N. Ueda and J. Tenenbaum. Dynamic Infinite Relational Model for Time-varying Relational Data Analysis. Advances in Neural Information Processing Systems 23, 2010.K. Ishiguro, T. Iwata, N. Ueda and J. Tenenbaum. Dynamic Infinite Relational Model for Time-varying Relational Data Analysis. Advances in Neural Information Processing Systems 23, 2010. L. Xiong, X. Chen, T.-K. Huang, J. Schneider, and J. G. Carbonell. Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization. In Proceedings of SIAM International Conference on Data Mining (SDM), 2010.L. Xiong, X. Chen, T.-K. Huang, J. Schneider, and J. G. Carbonell.Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization.In Proceedings of SIAM International Conference on Data Mining (SDM), 2010. N. Srebro, J. D. M. Rennie, and T. S. Jaakkola. Maximum-Margin Matrix Factorization. In Advances in Neural Information Processing Systems 17, 2005.N. Srebro, J. D. M. Rennie, and T. S. Jaakkola. Maximum-Margin Matrix Factorization. In Advances in Neural Information Processing Systems 17, 2005.

以上の手法はそれぞれ以下のような問題点を持つ。 Each of the above methods has the following problems.

まず、非特許文献１、非特許文献２、及び非特許文献６に記載の手法では、時間関係データを解析することが不可能である、という問題がある。 First, the methods described in Non-Patent Document 1, Non-Patent Document 2, and Non-Patent Document 6 have a problem that it is impossible to analyze time-related data.

非特許文献３及び非特許文献５に記載の手法は、時間関係データをパターンに分割することが可能であるが、事前にパターンの数を決定する必要がある、という問題がある。 The methods described in Non-Patent Document 3 and Non-Patent Document 5 can divide time-related data into patterns, but have a problem that the number of patterns needs to be determined in advance.

非特許文献４に記載の手法は、パターンの数を自動的に決定することができる時間関係データ解析法である。しかしながら、オブジェクト間の関係の有無だけを解析の対象としている。したがって、例えば関係性の強弱の変化といった、より正確な関係の解析には不適切である。また、この手法では観測が行われてない時刻の推定は不可能である。たとえば５年毎に調査される産業連関表のような時間関係データが与えられた場合に、他の時期（３年目など）の関係の復元や、未来における関係の予測は原理上不可能である、という問題がある。 The method described in Non-Patent Document 4 is a time-related data analysis method that can automatically determine the number of patterns. However, only the presence or absence of the relationship between objects is the object of analysis. Therefore, it is unsuitable for analyzing a more accurate relationship, such as a change in the strength of the relationship. Moreover, it is impossible to estimate the time when no observation is made by this method. For example, when time-related data such as an input-output table surveyed every five years is given, it is impossible in principle to restore relationships in other periods (such as the third year) and predict future relationships. There is a problem that there is.

本発明は、上記の課題を解決するためになされたもので、時間変化するオブジェクト間の関係データを解析して、未観測のオブジェクト間の関係の強さを求めることができる時間関係データ解析装置、方法、及びプログラムを提供することを目的とする。 The present invention has been made in order to solve the above-described problem, and analyzes the relationship data between objects that change over time to obtain the strength of the relationship between unobserved objects. It is an object to provide a method and a program.

上記の目的を達成するために時間関係データ解析装置は、複数の第１オブジェクトの各々と複数の第２オブジェクトの各々との間の関係の強さを示す関係データの時系列を表わし、かつ、テンソルで表現される時間関係データを解析して、前記時間関係データを表わす未知のテンソルＸを求める時間関係データ解析装置であって、前記第１オブジェクトと前記第２オブジェクトとの間の関係の強さのＭ個の観測値からなる観測ベクトルｙ、正の実数である正則化定数λ、及びテンソルＸの各モードｋ（ｋ＝1、2、3）に対する相対的な正則化の強さを表わす正則化バランス定数γ_kを記憶した記憶手段と、前記記憶手段から、前記観測ベクトルｙ、前記正則化定数λ、前記正則化バランス定数γ_k（ｋ＝1、2、3）を取得し、以下の（Ｉ）式で表される目的関数を最小化する前記テンソルＸを求める解析手段と、を含んで構成されている。 In order to achieve the above object, the time relationship data analysis device represents a time series of relationship data indicating the strength of the relationship between each of the plurality of first objects and each of the plurality of second objects, and A time relationship data analysis apparatus that analyzes time relationship data expressed by a tensor and obtains an unknown tensor X representing the time relationship data, wherein the relationship between the first object and the second object is strong. Represents the strength of regularization relative to each mode k (k = 1, 2, 3) of the observation vector y consisting of M observation values, the regularization constant λ which is a positive real number, and the tensor X The storage means storing the regularization balance constant γ _k , and the observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) are acquired from the storage means, and Represented by the formula (I) Analyzing means for obtaining the tensor X that minimizes the objective function.

ただし、ｎ₁，ｎ₂，ｎ₃は、前記時間関係データのサイズを表す自然数であり、ｘは、前記テンソルＸの要素である。Ω（Ｘ）は、テンソルＸから前記観測ベクトルの観測値に対応するＭ個の要素の取り出す操作である。ｍａｔ₁は、n₁×n₂n₃を返すモード１行列化であり、ｍａｔ₂は、n₂×n₃n₁を返すモード２行列化であり、ｍａｔ₃は、n₃×n₁n₂を返すモード３行列化である。‖‖_trは、行列の特異値の線形和を表わす。 Here, n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the time-related data, and x is an element of the tensor X. Ω (X) is an operation for extracting M elements corresponding to the observation value of the observation vector from the tensor X. mat ₁ is a mode 1 matrix that returns n ₁ × n ₂ n ₃ , mat ₂ is a mode 2 matrix that returns n ₂ × n ₃ n ₁ , and mat ₃ is n ₃ × n ₁ n Mode 3 matrixing that returns ₂ . ‖‖ _tr represents the linear sum of the singular values of the matrix.

本発明に係る時間関係データ解析方法は、第１オブジェクトと第２オブジェクトとの間の関係の強さのＭ個の観測値からなる観測ベクトルｙ、正の実数である正則化定数λ、及びテンソルＸの各モードｋ（ｋ＝1、2、3）に対する相対的な正則化の強さを表わす正則化バランス定数γ_kを記憶した記憶手段を備えた時間関係データ解析装置において、複数の第１オブジェクトの各々と複数の第２オブジェクトの各々との間の関係の強さを示す関係データの時系列を表わし、かつ、テンソルで表現される時間関係データを解析して、前記時間関係データを表わす未知のテンソルＸを求める時間関係データ解析方法であって、前記解析手段によって、前記記憶手段から、前記観測ベクトルｙ、前記正則化定数λ、前記正則化バランス定数γ_k（ｋ＝1、2、3）を取得し、上記の（Ｉ）式で表される目的関数を最小化する前記テンソルＸを求めることを特徴としている。 The temporal relationship data analysis method according to the present invention includes an observation vector y composed of M observation values of the strength of the relationship between the first object and the second object, a regularization constant λ that is a positive real number, and a tensor. In a time-related data analysis apparatus including a storage unit that stores a regularization balance constant γ _k representing a relative regularization strength relative to each mode k (k = 1, 2, 3) of X. A time series of relation data indicating the strength of the relation between each of the objects and each of the plurality of second objects is represented, and the time relation data expressed by a tensor is analyzed to represent the time relation data a temporal relationship data analysis method for determining the unknown tensor X, by the analyzing means, from the storage unit, the observation vector y, the regularization constant lambda, the regularization balance constant γ _{k (k} = 1,2 3) Get the is characterized by determining said tensor X that minimizes the objective function represented by the above formula (I).

本発明によれば、前記解析手段によって、前記記憶手段から、前記観測ベクトルｙ、前記正則化定数λ、前記正則化バランス定数γ_k（ｋ＝1、2、3）を取得し、上記の（Ｉ）式で表される目的関数を最小化する前記テンソルＸを求める。 According to the present invention, the analysis unit obtains the observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) from the storage unit, I) Find the tensor X that minimizes the objective function expressed by the equation.

このように、時間変化するオブジェクト間の関係データを解析して、目的関数を最小化するように、時間関係データを表わす未知のテンソルＸを求めることによって、未観測のオブジェクト間の関係の強さを求めることができる。 In this way, by analyzing the relationship data between the objects that change over time and obtaining the unknown tensor X representing the time relationship data so as to minimize the objective function, the strength of the relationship between the unobserved objects is obtained. Can be requested.

本発明に係る前記記憶手段は、補助変数Ｚ_k（ｋ＝1、2、3）、前記テンソルＸの各要素の値ｘ、ラグランジュ乗数Ａ_k（ｋ＝1、2、3）、第１オブジェクト方向、第２オブジェクト方向、及び時間方向の各々のパターン基底Ｕ_k（ｋ＝1、2、3）、前記パターン基底Ｕk（ｋ＝1、2、3）に従って変化する前記第１オブジェクト、前記第２オブジェクト、及び時間の各ペアの間の関係の強さを表わす関係パターンＶ_k（ｋ＝1、2、3）を更に記憶し、前記解析手段は、各モードｋの補助係数Ｚｋ及びラグランジュ乗数Ａ_kに基づいて、以下の（ＩＩ）式に従ってテンソルの予測値^Ｘを算出し、算出されたテンソルの予測値^Ｘ及び前記観測ベクトルｙに基づいて、以下の（ＩＩＩ）式に従って前記テンソルＸの各要素の値ｘを更新して前記記憶手段に記憶する主変数更新手段と、前記テンソルＸ及び各モードｋの前記ラグランジュ乗数Ａ_kに基づいて、各モードｋについて、以下の（ＩＶ）式に従って、補助変数の予測値^Ｚ_kを算出し、算出された前記補助変数の予測値^Ｚ_ｋに基づいて、各モードｋについて、以下の（Ｖ）式に従って、特異値分解を行って、特異値が並んだ対角行列Ｓ_kを求めると共に、前記パターン基底Ｕ_k及び前記関係パターンＶ_kを更新して前記記憶手段に記憶し、前記求められた前記対角行列Ｓ_k、前記パターン基底Ｕ_k、及び前記関係パターンＶ_kに基づいて、以下の（ＶＩ）式に従って、前記補助変数Ｚ_kを更新して、前記記憶手段に記憶する補助変数更新手段と、前記テンソルＸ及び補助変数Ｚ_kに基づいて、各モードｋについて、以下の（ＶＩＩ）式に従って、前記ラグランジュ乗数Ａ_kを更新して前記記憶手段に記憶するラグランジュ乗数更新手段と、前記主変数更新手段による更新、前記補助変数更新手段による更新、前記ラグランジュ乗数更新手段による更新を繰り返すことで、所定の解析終了条件を満たした場合に、前記記憶手段に記憶されている前記テンソルＸを出力する解析終了条件判定手段と、を含むようにすることができる。 The storage means according to the present invention includes an auxiliary variable Z _k (k = 1, 2, 3), a value x of each element of the tensor X, a Lagrange multiplier A _k (k = 1, 2, 3), a first object A pattern base U _k (k = 1, 2, 3) in each of a direction, a second object direction, and a time direction, the first object changing according to the pattern base U _k (k = 1, 2, 3), A relationship pattern V _k (k = 1, 2, 3) representing the strength of the relationship between the two objects and each pair of time is further stored, and the analysis means includes an auxiliary coefficient Zk and a Lagrange multiplier for each mode k. Based on A _k , the predicted value of the tensor ^ X is calculated according to the following equation (II), and the tensor is calculated according to the following equation (III) based on the calculated predicted value of the tensor ^ X and the observed vector y: The value x of each element of X is updated and stored in the storage means. A main variable updating means for, based on the Lagrange multipliers A _k of the tensor X and modes k, for each mode k, in accordance with the following formula (IV), to calculate the predicted value ^ Z _k of auxiliary variables, calculates On the basis of the predicted value ^ Z _k of the auxiliary variable, for each mode k, singular value decomposition is performed according to the following equation (V) to obtain a diagonal matrix S _{k in} which singular values are arranged, The pattern base U _k and the relation pattern V _k are updated and stored in the storage means. Based on the obtained diagonal matrix S _k , the pattern base U _k , and the relation pattern V _k , the following According to the formula (VI), the auxiliary variable Z _k is updated and stored in the storage means, and on the basis of the tensor X and auxiliary variable Z _k , for each mode k, the following (VII) According to the formula A Lagrange multiplier updating means for updating the serial Lagrange multiplier A _k stored in the storage means, updated by the main variable updating means updating by the auxiliary variable updating means, by repeating the updating by the Lagrangian multiplier update means, a predetermined Analysis end condition determining means for outputting the tensor X stored in the storage means when the analysis end condition is satisfied.

ただし、￣Ω（Ｘ）は、前記テンソルＸから前記観測値でない要素の取り出す操作であり、ηは定数である。 However, ￣Ω (X) is an operation for extracting an element other than the observed value from the tensor X, and η is a constant.

本発明に係る時間関係データ解析装置は、前記解析手段によって求められたテンソルＸに基づいて、以下の（ＶＩＩＩ）式に従って、前記第１オブジェクトｉと前記第２オブジェクトｊとの間の関係における予測対象時刻ｔ’の関係データの予測値Ｘ’（ｉ，ｊ）を計算する関係データ予測手段を更に含むようにすることができる。 The temporal relationship data analysis apparatus according to the present invention predicts the relationship between the first object i and the second object j according to the following equation (VIII) based on the tensor X obtained by the analysis means. A relational data predicting means for calculating a predicted value X ′ (i, j) of the relational data at the target time t ′ can be further included.

ただし、予測対象時刻ｔ’の前後で観測値が観測された時刻を、Ｔ（ｋ）、Ｔ（ｋ＋１）とする。 However, the times when the observed values are observed before and after the prediction target time t ′ are T (k) and T (k + 1).

本発明に係るプログラムは、上記の時間関係データ解析装置の各手段としてコンピュータを機能させるためのプログラムである。 The program according to the present invention is a program for causing a computer to function as each means of the time-related data analysis apparatus.

以上説明したように、本発明の時間関係データ解析装置、方法、及びプログラムによれば、時間変化するオブジェクト間の関係データを解析して、目的関数を最小化するように、時間関係データを表わす未知のテンソルＸを求めることによって、未観測のオブジェクト間の関係の強さを求めることができる、という効果が得られる。 As described above, according to the time-related data analysis apparatus, method, and program of the present invention, the time-related data is represented so as to minimize the objective function by analyzing the relationship data between the objects that change with time. By obtaining the unknown tensor X, it is possible to obtain the effect that the strength of the relationship between unobserved objects can be obtained.

時間関係データの解析に関する模式図である。It is a schematic diagram regarding the analysis of time-related data. 本発明の実施の形態に係る時間関係データ解析装置の構成を示す概略図である。It is the schematic which shows the structure of the time-related data analysis apparatus which concerns on embodiment of this invention. モード１行列化及びモード１テンソル化を説明するための図である。It is a figure for demonstrating mode 1 matrixing and mode 1 tensorization. モード２行列化及びモード２テンソル化を説明するための図である。It is a figure for demonstrating mode 2 matrixing and mode 2 tensorization. モード３行列化及びモード３テンソル化を説明するための図である。It is a figure for demonstrating mode 3 matrix formation and mode 3 tensorization. 特異値分解によって得られた特異値を示すグラフである。It is a graph which shows the singular value obtained by singular value decomposition. 本発明の実施の形態に係る時間関係データ解析装置における時間関係データ解析処理ルーチンの内容を示すフローチャートである。It is a flowchart which shows the content of the time relation data analysis processing routine in the time relation data analysis apparatus which concerns on embodiment of this invention. 主変数を更新する処理の内容を示すフローチャートである。It is a flowchart which shows the content of the process which updates a main variable. 補助変数、パターン基底、及び関係パターンを更新する処理の内容を示すフローチャートである。It is a flowchart which shows the content of the process which updates an auxiliary variable, a pattern base, and a relationship pattern. ラグランジュ乗数を更新する処理の内容を示すフローチャートである。It is a flowchart which shows the content of the process which updates a Lagrange multiplier. ＫＫＴ条件の誤差を計算する処理の内容を示すフローチャートである。It is a flowchart which shows the content of the process which calculates the error of KKT conditions. 実験で得られた特異値を示すグラフである。It is a graph which shows the singular value obtained by experiment. 実験で得られた真の1995年における逆行列係数と推定された逆行列係数を示す図である。It is a figure which shows the inverse matrix coefficient in 1995 and the estimated inverse matrix coefficient obtained by experiment. 実験で得られたモード３に関する特異値分解に付随する第１成分の特異ベクトル及び特異値を示す図である。It is a figure which shows the singular vector and singular value of the 1st component accompanying the singular value decomposition regarding the mode 3 obtained by experiment. 実験で得られたモード３に関する特異値分解に付随する第２成分の特異ベクトル及び特異値を示す図である。It is a figure which shows the singular vector and singular value of the 2nd component accompanying the singular value decomposition regarding the mode 3 obtained by experiment.

以下、図面を参照して本発明の実施の形態を詳細に説明する。 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

＜概要＞
まず、本発明で提案する時間関係データ解析装置で用いる時間関係データについて説明する。図１は、提案する時間関係データ解析装置のコンセプトを模式的に表した図である。 <Overview>
First, time-related data used in the time-related data analysis apparatus proposed in the present invention will be described. FIG. 1 is a diagram schematically showing the concept of the proposed time-related data analysis apparatus.

図１において、左側にはユーザ（購買者）と商品の間の関係（買ったか否か）が時間軸に沿ってテンソルとして表現されている。黒い立方体は観測されている関係を表し、テンソルのそれ以外の要素は未知であるとする。図１の右側には左側の部分観測テンソルの低ランク分解を示している。中心を取り囲む３つの薄い直方体はそれぞれ商品、ユーザ、および時間変化を少数のパターンで表現したものであり、中心の小さい直方体はパターン間の相互作用を表現している。このように分解することにより、左側のテンソルの未知部分の推定も可能となる。 In FIG. 1, on the left side, the relationship between the user (purchaser) and the product (whether or not it is bought) is expressed as a tensor along the time axis. The black cube represents the observed relationship, and the other elements of the tensor are unknown. The right side of FIG. 1 shows a low rank decomposition of the left partial observation tensor. Three thin rectangular parallelepipeds surrounding the center represent products, users, and temporal changes with a small number of patterns, respectively, and a rectangular parallelepiped with a small center represents an interaction between patterns. By decomposing in this way, the unknown part of the left tensor can be estimated.

＜システム構成＞
次に、観測データを入力として、時間関係データを表わす未知のテンソルを求めて出力する時間関係データ解析装置に、本発明を適用した場合を例にして、本発明の実施の形態を説明する。 <System configuration>
Next, an embodiment of the present invention will be described by taking as an example the case where the present invention is applied to a time-related data analysis apparatus that obtains and outputs unknown tensors representing time-related data with observation data as an input.

図２に示すように、本実施の形態に係る時間関係データ解析装置は、時間関係データの観測値の入力を受け付ける入力部１と、観測された時間関係データの解析を行う解析部２と、変数や定数を記憶する記憶部３と、予測値計算部４と、解析結果を出力する出力部５と、を備えている。 As shown in FIG. 2, the time-related data analysis device according to the present embodiment includes an input unit 1 that receives input of observation values of time-related data, an analysis unit 2 that analyzes the observed time-related data, A storage unit 3 that stores variables and constants, a predicted value calculation unit 4, and an output unit 5 that outputs an analysis result are provided.

入力部１は、既知のビデオカメラ、マイクロフォン、記憶装置などの入力器により実現される。また、出力部５は、ディスプレイ、プリンタ、磁気ディスクなどで実装される。 The input unit 1 is realized by an input device such as a known video camera, microphone, or storage device. The output unit 5 is implemented by a display, a printer, a magnetic disk, or the like.

解析部２、記憶部３、及び予測値計算部４は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）と、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）と、後述する時間関係データ解析処理ルーチンを実行するためのプログラムを記憶したＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）とを備えたコンピュータで構成されている。 The analysis unit 2, the storage unit 3, and the predicted value calculation unit 4 include a CPU (Central Processing Unit), a RAM (Random Access Memory), and a ROM that stores a program for executing a time-related data analysis processing routine described later. (Read Only Memory).

解析部２は、初期設定部２１、変数計算部２２、及び解析終了条件判定部２７を備えている。変数計算部２２は、主変数更新部２３、補助変数更新部２４、ラグランジュ乗数更新部２５、及び誤差計算部２６を備えている。 The analysis unit 2 includes an initial setting unit 21, a variable calculation unit 22, and an analysis end condition determination unit 27. The variable calculation unit 22 includes a main variable update unit 23, an auxiliary variable update unit 24, a Lagrange multiplier update unit 25, and an error calculation unit 26.

初期設定部２１は、入力された観測データから、記憶部３の各要素に値を設定する。 The initial setting unit 21 sets a value for each element of the storage unit 3 from the input observation data.

変数計算部２２は、記憶部３に保存された情報を用いて各変数の更新を行う。 The variable calculation unit 22 updates each variable using information stored in the storage unit 3.

主変数更新部２３は、記憶部３に保存された情報を用いて主変数の更新を行う。 The main variable update unit 23 updates the main variable using information stored in the storage unit 3.

補助変数更新部２４では、記憶部３に保存された情報を用いて、補助変数の更新とパターン基底の更新と関係パターンの更新とを行う。 The auxiliary variable updating unit 24 uses the information stored in the storage unit 3 to update the auxiliary variable, the pattern base, and the relation pattern.

ラグランジュ乗数更新部２５は、記憶部３に保存された情報を用いてラグランジュ乗数の更新を行う。 The Lagrange multiplier update unit 25 uses the information stored in the storage unit 3 to update the Lagrange multiplier.

誤差計算部２６では、記憶部３に保存された情報を用いて誤差の計算を行う。 The error calculation unit 26 calculates an error using the information stored in the storage unit 3.

解析終了条件判定部２７では、変数計算部２２を用いて計算され記憶部３に記憶された数値をもちいて解析を終了してよいかどうかを判定する。 The analysis end condition determination unit 27 determines whether or not the analysis can be ended using the numerical values calculated using the variable calculation unit 22 and stored in the storage unit 3.

記憶部３は、定数３１、観測値３２、インデックス集合３３、主変数３４、補助変数３５、ラグランジュ乗数３６、誤差３７、パターン基底３８、関係パターン３９から構成される。 The storage unit 3 includes a constant 31, an observation value 32, an index set 33, a main variable 34, an auxiliary variable 35, a Lagrange multiplier 36, an error 37, a pattern base 38, and a relation pattern 39.

予測値計算部４では、指定された予測時刻t’における関係データの予測を行う。 The predicted value calculation unit 4 performs prediction of related data at the specified predicted time t ′.

＜データ構造＞
ｎ_１、ｎ_２、ｎ_３は想定する時間関係データのサイズを表す自然数である。関係データは２種類のオブジェクトの間の接続関係を表現する。本実施の形態では、関係を構成する２種類のオブジェクトの一方を起点オブジェクト、他方を終点オブジェクトと呼ぶ。例えば、図１の例では商品が起点オブジェクト、ユーザが終点オブジェクトとなる。また、本実施の形態では、ｎ_１、ｎ_２はそれぞれ関係の起点と終点のオブジェクト数とし、ｎ_３は観測が発生した時間の総ステップ数とする。また、可能なすべての要素数をＮとし、Ｎ＝ｎ_１×ｎ_２×ｎ_３とする。また、起点となるオブジェクトの軸をテンソルの１番目の次元、終点となるオブジェクトの軸をテンソルの２番目の次元、時間の軸をテンソルの３番目の次元と呼ぶ。 <Data structure>
n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the assumed time-related data. The relationship data represents a connection relationship between two types of objects. In the present embodiment, one of the two types of objects constituting the relationship is referred to as a starting object and the other is referred to as an end object. For example, in the example of FIG. 1, the product is a start object and the user is an end object. Further, in the present embodiment, n ₁ and n ₂ are the number of objects at the start and end of the relationship, respectively, and n ₃ is the total number of steps at the time when the observation occurred. Further, N is the number of all possible elements, and N = n ₁ × n ₂ × n ₃ . The axis of the object that is the starting point is called the first dimension of the tensor, the axis of the object that is the end point is called the second dimension of the tensor, and the axis of time is called the third dimension of the tensor.

Ｔはn₃個の観測された時間点の時刻を格納するベクトルである。Ｔの要素は昇順に並んでいるものとする。 T is a vector that stores the times of n ₃ observed time points. Assume that the elements of T are arranged in ascending order.

観測ベクトルｙは観測された関係データであり長さＭの縦ベクトルで表現される。Ｍはｙの要素数であり、ｙ_ｍはｙのｍ番目（１≦ｍ≦Ｍ）の観測値を表す。なお、本実施の形態では観測値ｙ_ｍはすべての実数を許容する。したがって、この値はある関係の強さを表すと理解できる。一方、上記の非特許文献３や非特許文献４では観測される関係データは１あるいは０の２値しか許容しないため、関係の有無しか表現できず、関係の強さを考慮した解析が不可能である。 The observation vector y is observed relational data and is expressed by a vertical vector of length M. M is the number of elements of y, _{y m} represents the observed value of the m-th y (1 ≦ m ≦ M) . Note that the observed values y _m in the present embodiment allows for all of the real numbers. Therefore, it can be understood that this value represents the strength of a certain relationship. On the other hand, in the non-patent document 3 and the non-patent document 4 described above, since the observed relational data only allows binary values of 1 or 0, only the presence or absence of the relation can be expressed, and the analysis considering the strength of the relation is impossible. It is.

Ｉ，Ｊ，Ｋはそれぞれ長さＭのインデックス集合３３で、それぞれのｍ番目の要素を並べた(i_m,j_m,k_m)が未知テンソルのｍ番目の観測された要素を指し示す。すなわちｍ番目の観測値はi_m番目の起点オブジェクトとj_m番目の終点オブジェクトの時刻k_mでの関係である。 I, J, and K are each an index set 33 of length M, and (i _m , j _m , k _m ) in which the m-th elements are arranged indicates the m-th observed element of the unknown tensor. That m th observation value is the relationship with i _m-th origin object and the j _m-th time k _m endpoint object.

また同様にI^c,J^c,K^cは、それぞれ長さ（Ｎ−Ｍ）のインデックス集合３３で、それぞれのｍ番目の要素の並べたもの(i^c _m,j^c _m,k^c _m)は未知テンソルのｍ番目の未観測要素を指し示す。 Similarly, I ^c , J ^c , and K ^c are index sets 33 each having a length (N−M) in which m-th elements are arranged (i ^c _m , j ^c _m , k ^c _m ). Indicates the mth unobserved element of the unknown tensor.

λは正則化定数とよばれる正の実数である。分解・解析計算における正則化項の効果の大きさを表す。 λ is a positive real number called a regularization constant. Represents the magnitude of the regularization term effect in decomposition / analysis calculations.

Ｍ次元の観測ベクトルｙは、観測値３２に格納される。 The M-dimensional observation vector y is stored in the observation value 32.

主変数３４にはｎ_１×ｎ_２×ｎ_３の要素からなる未知テンソルＸが格納される。初期値としてＸの各要素には適当な値を入力する。一般にはすべての値を０とすれば良い。未知テンソルＸは、上記図１に示すテンソルに相当するものであり、最終的には、要素(i,j,k)に、時刻kにおいて観測される起点オブジェクトiと終点オブジェクトjとの間の関係の強さを示す値が格納（上書き）されることになる。 The main variable 34 stores an unknown tensor X composed of n ₁ × n ₂ × n ₃ elements. An appropriate value is input to each element of X as an initial value. In general, all values should be 0. The unknown tensor X corresponds to the tensor shown in FIG. 1, and finally, the element (i, j, k) is between the start object i and the end object j observed at time k. A value indicating the strength of the relationship is stored (overwritten).

以下に観測要素の取り出しΩ、未観測要素の取り出し￣Ω、観測要素の復元Ω^T、モードk行列化mat_k、モードkテンソル化tensor_kと呼ぶ4つの操作を定義する。これらの操作は以下の数式では行列演算として表現されるが、実際には行列として保持する必要はない。まず、観測要素の取り出しΩは、引数としてｎ_１×ｎ_２×ｎ_３の要素からなるテンソルを取り、Ｍ次元ベクトルを返すもので、戻り値のｍ番目の要素は、引数であるテンソルの第(i_m,j_m,k_m)要素である。同様に、未観測要素の取り出し￣Ωは、引数としてｎ_１×ｎ_２×ｎ_３の要素からなるテンソルを取り、（Ｎ−Ｍ）次元ベクトルを返すもので、戻り値のｍ番目の要素は、引数であるテンソルの第(i^c _m,j^c _m,k^c _m)要素である。観測要素の復元Ω^Tは観測要素の取り出しΩの逆の操作として定義され、Ｍ次元ベクトルを引数としてｎ_１×ｎ_２×ｎ_３の要素からなるテンソルを返すもので、引数のｍ番目の要素は、戻り値であるテンソルの第(i_m,j_m,k_m)要素に格納される。観測値に対応しない戻り値の要素はすべてゼロである。モードk行列化mat-_kはｋ＝１,２,３の３種存在し、それぞれｎ_１×ｎ_２×ｎ_３の要素からなるテンソルを引数として、n₁×n₂n₃の行列、n₂×n₃n₁の行列、n₃×n₁n₂の行列を返すものである。具体的にはモード１行列化mat₁は図３Ａに示すように与えられたテンソルの第1モード、すなわち起点オブジェクトが縦に連続するようにベクトルを取り出し、横にならべて行列にしたものを生成する操作である。また、第１モードテンソル化tensor-₁はその逆の操作である。同様に、第２モード行列化mat-₂は図３Ｂに示すように終点オブジェクトが縦に連続するようにベクトルを取り出し、横にならべて行列にしたものを生成する操作であり、第２モードテンソル化tensor₂はその逆の操作である。第３モード行列化mat₃は図３Ｃのように時間点が縦に連続するようにベクトルを取り出し、横に並べて行列にしたものを生成する操作であり、第３モードテンソル化tensor-₃はその逆の操作である。 The following four operations are defined: observation element extraction Ω, unobserved element extraction ΩΩ, observation element restoration Ω ^T , mode k matrixization mat _k , and mode k tensorization tensor _k . These operations are expressed as matrix operations in the following mathematical formula, but actually do not need to be held as a matrix. First, the observation element extraction Ω takes a tensor consisting of n ₁ × n ₂ × n ₃ elements as an argument and returns an M-dimensional vector. The m-th element of the return value is the first of the tensors that are arguments. (i _m , j _m , k _m ) elements. Similarly, the unobserved element extraction ￣Ω takes a tensor consisting of n ₁ × n ₂ × n ₃ elements as an argument and returns a (N−M) -dimensional vector. The m-th element of the return value is , Is the (i ^c _m , j ^c _m , k ^c _m ) element of the argument tensor. The observation element restoration Ω ^T is defined as the inverse operation of the observation element extraction Ω, and returns a tensor consisting of n ₁ × n ₂ × n ₃ elements with an M-dimensional vector as an argument. The m th element of the argument Is stored in the (i _m , j _m , k _m ) element of the tensor that is the return value. Return elements that do not correspond to observations are all zero. There are three types of mode k matrix mat- _k , k = 1, 2, and 3, each of which has an n ₁ × n ₂ × n ₃ element tensor as an argument, an n ₁ × n ₂ n ₃ matrix, n Returns a ₂ × n ₃ n ₁ matrix and an n ₃ × n ₁ n ₂ matrix. Specifically, the mode 1 matrixing mat ₁ generates the first mode of the given tensor as shown in FIG. 3A, that is, the vector is extracted so that the starting objects are vertically continuous and arranged in a matrix. It is an operation to do. The first mode tensorization tensor- ₁ is the reverse operation. Similarly, the second mode matrixing mat- ₂ is an operation for taking out a vector so that the end point objects are vertically continuous as shown in FIG. 3B and generating a matrix arranged horizontally, and the second mode tensor. The tensor ₂ is the reverse operation. As shown in FIG. 3C, the third mode matrixing mat ₃ is an operation for extracting vectors such that time points are vertically continuous and generating a matrix arranged side by side, and the third mode tensorization tensor- ₃ is The reverse operation.

＜入力部１＞
入力部１では、解析にあたり事前に必要となる数値データの入力を受け付ける。入力されるデータは以下の通りである。 <Input unit 1>
The input unit 1 accepts input of numerical data required in advance for analysis. The input data is as follows.

ｎ_１、ｎ_２、ｎ_３、Ｎ、Ｍ、Ｉ，Ｊ，Ｋ，Ｉ^ｃ、Ｊ^ｃ，Ｋ^ｃ、ｙ、λ、γ₁、γ₂、γ₃、η、Ｌ，Ｅ、t’ n ₁ , n ₂ , n ₃ , N, M, I, J, K, I ^c , J ^c , K ^c , y, λ, γ ₁ , γ ₂ , γ ₃ , η, L, E, t ′

ｎ_１、ｎ_２、ｎ_３は想定する時間関係データのサイズを表す自然数である。可能なすべての要素数をＮとし、Ｎ＝ｎ_１×ｎ_２×ｎ_３とする。 n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the assumed time-related data. Let N be all possible elements and N = n ₁ × n ₂ × n ₃ .

ｙは観測された関係データであり長さMの縦ベクトルで表現される観測ベクトルである。Ｍはｙの要素数であり、ｙ_ｍはｍ番目の観測値を表す。 y is the observed relational data, and is an observation vector expressed by a vertical vector of length M. M is the number of elements of y, y _m denotes the m th observation.

Ｉ，Ｊ，Ｋは観測値のインデックスであり、Ｍ個の観測値ｙと未知テンソルＸの関係を表す。同様に（Ｎ−Ｍ）個の未観測値のインデックスであるＩ^ｃ、Ｊ^ｃ，Ｋ^ｃも入力される。 I, J, and K are indices of observation values, and represent the relationship between M observation values y and unknown tensors X. Similarly, I ^c , J ^c , and K ^c that are indices of (N−M) unobserved values are also input.

λは正則化定数とよばれる正の実数である。分解・解析計算における正則化項の効果の大きさを表す。γ₁、γ₂、γ₃、は未知テンソルＸの各モードｋに対する相対的な正則化の強さを表現する正の定数である。ここではこれらを正則化バランス定数と呼ぶ。また、正則化定数λをある定数倍して正則化バランス定数γ₁、γ₂、γ₃をその定数で割っても得られる解は同じであるため、規格化のためγ₁＋γ₂＋γ₃＝１と仮定する。 λ is a positive real number called a regularization constant. Represents the magnitude of the regularization term effect in decomposition / analysis calculations. γ ₁ , γ ₂ , and γ ₃ are positive constants representing the relative regularization strength of each mode k of the unknown tensor X. Here, these are called regularization balance constants. In addition, the solution obtained by multiplying the regularization constant λ by a certain constant and dividing the regularization balance constants γ ₁ , γ ₂ , γ ₃ by the constants is the same. Therefore, for normalization, γ ₁ + γ ₂ + γ ₃ = 1.

ηはステップサイズであり、正の実数である。分解・解析計算においてラグランジュ乗数Ａ_１、Ａ_２、Ａ_３の更新量を決める定数であり、主変数と補助変数の乖離に対するペナルティーの強さを表す。 η is a step size, which is a positive real number. It is a constant that determines the amount of update of the Lagrange multipliers A ₁ , A ₂ , A _{3 in} the decomposition / analysis calculation, and represents the strength of the penalty for the difference between the main variable and the auxiliary variable.

Ｌは変数計算部２２における、最大繰り返し回数を表す自然数である。繰り返し回数はｌで表す。 L is a natural number representing the maximum number of repetitions in the variable calculation unit 22. The number of repetitions is represented by l.

Ｅは解析終了条件判定部２７における、許容される誤差の上限を表す定数である。 E is a constant representing the upper limit of the allowable error in the analysis end condition determination unit 27.

t’は予測したい時刻のインデックスを表す実数である。なお、予測が不要な場合にはt’は入力しなくても良い。 t ′ is a real number representing the index of the time to be predicted. Note that t 'may not be input when prediction is unnecessary.

＜解析部２＞
解析部２は、入力された観測値や定数をもとにして、主変数である未知テンソルＸを求める。解析部２は、以下の（１）式で示される目的関数を、近似的に最小化することにより主変数Ｘを決定する。 <Analysis unit 2>
The analysis unit 2 obtains an unknown tensor X that is a main variable based on the input observation values and constants. The analysis unit 2 determines the main variable X by approximately minimizing an objective function expressed by the following equation (1).

ここで、上記（１）式の第１項は損失項であり、観測ベクトルyと、未知テンソルＸから観測値に対応する要素を取り出して得られるＭ次元ベクトルΩ(Ｘ)との差の二乗和として定義される。ただし、観測値に欠損値がある場合、第１項だけでは未知テンソルＸを一意に定めることができない。第２項は正則化項と呼ばれ、未知テンソルのランクが低いという仮定を表現する。ただし、ここで||Ｚ_k||_trはトレースノルムと呼ばれ、行列Ｚ_kの特異値の線形和として定義される。ここで未知テンソルが上記図１のように少ない数のパターンとそれらの間の相互作用に分解されることは、行列化操作mat-_kによって得られる行列が低ランクであることに対応することに注意する（例えば非特許文献８（T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455−500, 2009.）を参照）。なぜトレースノルム正則化が低ランク性の仮定に対応しているかを説明するために、全要素が観測されている場合で簡単のために第１モードに対応する正則化項のみ考える（つまりγ₁＝１,γ₂＝０,γ₃＝０）。この時、上記の最適化問題は、以下の（２）式のように書き直すことができる。 Here, the first term of the equation (1) is a loss term, and the square of the difference between the observation vector y and the M-dimensional vector Ω (X) obtained by extracting the element corresponding to the observation value from the unknown tensor X Defined as sum. However, when there is a missing value in the observed value, the unknown tensor X cannot be uniquely determined only by the first term. The second term is called the regularization term and expresses the assumption that the rank of the unknown tensor is low. Here, || Z _k || _tr is called a trace norm and is defined as a linear sum of singular values of the matrix Z _k . The fact that the unknown tensor is decomposed into a small number of patterns and the interactions between them as shown in FIG. 1 corresponds to the fact that the matrix obtained by the matrixing operation mat- _k has a low rank. Note (for example, see Non-Patent Document 8 (TG Kolda and BW Bader. Tensor decompositions and applications. SIAM Review, 51 (3): 455-500, 2009.)). To explain why trace norm regularization corresponds to the hypothesis of low rank, only regularization terms corresponding to the first mode are considered for simplicity when all elements are observed (ie, γ ₁ = 1, γ ₂ = 0, γ ₃ = 0). At this time, the above optimization problem can be rewritten as the following equation (2).

ここで、Ｘ、Ｙはそれぞれ、未知テンソルＸ、および観測ベクトルｙのモード１行列化によって得られる行列とする。上の最適化問題の解は、行列Ｙの特異値分解をＹ＝ＵＳＶ^Tとして、解析的に以下の（３）式のように書けることが知られている（例えば、非特許文献９（R. Tomioka, T. Suzuki, and M. Sugiyama. Augmented lagrangian methods for learning, selecting, and combining features. In S. Sra, S. Nowozin, and S. J. Wright, editors, Optimization for Machine Learning. MIT Press, 2011.）を参照）。 Here, X and Y are assumed to be matrices obtained by mode 1 matrixization of the unknown tensor X and the observation vector y, respectively. It is known that the solution of the above optimization problem can be analytically written as the following equation (3) where the singular value decomposition of the matrix Y is Y = USV ^T (for example, Non-Patent Document 9 (R Tomioka, T. Suzuki, and M. Sugiyama. Augmented lagrangian methods for learning, selecting, and combining features. In S. Sra, S. Nowozin, and SJ Wright, editors, Optimization for Machine Learning. MIT Press, 2011.) See).

ここでＳは行列Ｙの特異値が並んだ対角行列である。Ｕ、Ｖは各列が特異値に対応する特異ベクトルであるような行列である。また、ｍａｘの操作は行列の要素ごとに適用されるものとする。上記（３）式からこの最適化問題を解くと、図４に示すように、与えられた行列Ｙの特異値のうち正則化定数λ以下の特異値は雑音として切り捨てられ、雑音より十分大きい信号成分を取り出すことができることがわかる。また、上記の操作は観測値Ｙの両側から直交行列をかけても不変であり、これらの操作に対して頑健になっている。 Here, S is a diagonal matrix in which singular values of the matrix Y are arranged. U and V are matrices in which each column is a singular vector corresponding to a singular value. Further, the max operation is applied to each element of the matrix. When this optimization problem is solved from the above equation (3), as shown in FIG. 4, the singular values below the regularization constant λ among the singular values of the given matrix Y are discarded as noise, and the signal is sufficiently larger than the noise. It turns out that an ingredient can be taken out. Further, the above operation is invariant even when an orthogonal matrix is applied from both sides of the observed value Y, and is robust to these operations.

上記最適化問題は、たとえば非特許文献７（S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. http://www.stanford.edu/~boyd/papers/distr_opt_stat_learning_admm.html）のようなAlternating Direction Method of Multipliers (ADMM)法を用いて解くことができる。さらに、上記最適化問題は凸最適化問題（凸最適化に関する一般論は非特許文献１０（ S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press. 2004.）を参照）なので、ADMM法が収束すれば、それが上記の最適化問題の大域最適解である。ADMM法の収束は下で詳述するようにKKT条件の誤差を用いて判定する。これにより、テンソルを幾つのパターンに分解するか（次元）を指定する必要がなく、最適化を通してこれらを自動的に決定することが可能となる。 For example, Non-Patent Document 7 (S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. It can be solved using an Alternative Direction Method of Multipliers (ADMM) method such as www.stanford.edu/~boyd/papers/distr_opt_stat_learning_admm.html). Furthermore, since the above optimization problem is a convex optimization problem (refer to Non-Patent Document 10 (S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press. 2004.) for general theory on convex optimization), the ADMM method is If it converges, it is the global optimal solution of the above optimization problem. The convergence of the ADMM method is determined using the error of the KKT condition as described in detail below. Thus, it is not necessary to specify how many patterns (dimensions) the tensor is decomposed into, and these can be automatically determined through optimization.

＜初期設定部２１＞
初期設定部２１では、入力部１から入力された情報をもとに、記憶部３に記憶されている各要素の初期設定を行う。 <Initial setting unit 21>
The initial setting unit 21 performs initial setting of each element stored in the storage unit 3 based on information input from the input unit 1.

まず、入力部１よりｎ_１、ｎ_２、ｎ_３、Ｎ、Ｍ、ｙ、λ、γ₁、γ₂、γ₃、η、Ｌ，Ｅ、t’を定数３１に記憶する。また繰り返し回数ｌを１に初期化する。 First, n ₁ , n ₂ , n ₃ , N, M, y, λ, γ ₁ , γ ₂ , γ ₃ , η, L, E, and t ′ are stored in the constant 31 from the input unit 1. Also, the number of repetitions l is initialized to 1.

Ｍ次元の観測ベクトルｙを観測値３２に、観測値のインデックスＩ，Ｊ，Ｋおよび未観測値のインデックスＩ^ｃ、Ｊ^ｃ，Ｋ^ｃをインデックス集合３３に保存する。 The M-dimensional observation vector y is stored in the observation value 32, and the index I, J, K of the observation value and the index I ^c , J ^c , K ^c of the unobserved value are stored in the index set 33.

主変数３４にはｎ_１×ｎ_２×ｎ_３の要素からなる未知テンソルＸを記憶する。初期値としてＸの各要素には適当な値を入力する。一般にはすべての値を０とすれば良い。 The main variable 34 stores an unknown tensor X composed of n ₁ × n ₂ × n ₃ elements. An appropriate value is input to each element of X as an initial value. In general, all values should be 0.

- 補助変数３５にはＺ_１，Ｚ_２，Ｚ_３が記憶され、それぞれ大きさｎ_１×ｎ_２ｎ_３、ｎ_２×ｎ_３ｎ_１、ｎ_３×ｎ_１ｎ_２の行列として表現される。そのベクトル化をそれぞれｚ_１，ｚ_２，ｚ_３と書く。補助変数は変数計算部２２において主変数ｘに対する制約を緩和したものである。具体的にはＺ_kは最適化が進むに従ってテンソルＸのモードk行列化と一致していく量である。初期値として補助変数Ｚ_１，Ｚ_２，Ｚ_３には適当な値を入力する。一般にはすべての値を０とすれば良い。 -Z ₁ , Z ₂ , Z ₃ are stored in the auxiliary variable 35 and expressed as a matrix of size n ₁ × n ₂ n ₃ , n ₂ × n ₃ n _1, n ₃ × n ₁ n ₂ , respectively. . The vectorization is written as z ₁ , z ₂ , and z ₃ , respectively. The auxiliary variable is obtained by relaxing restrictions on the main variable x in the variable calculation unit 22. Specifically, Z _k is an amount that coincides with the mode k matrixing of the tensor X as optimization proceeds. Appropriate values are input to the auxiliary variables Z ₁ , Z ₂ , and Z ₃ as initial values. In general, all values should be 0.

ラグランジュ乗数３６には、ラグランジュ乗数Ａ₁、Ａ₂、Ａ₃が記憶され、ラグランジュ乗数Ａ₁、Ａ₂、Ａ₃は補助変数Ｚ₁、Ｚ_２、Ｚ_３と同じ次元を持つ行列である。ラグランジュ乗数は主変数Ｘと補助変数Ｚ_１、Ｚ_２、Ｚ_３を一致させる役割を持つ。初期値としてラグランジュ乗数Ａ₁、Ａ₂、Ａ₃には適当な値を入力する。一般にはすべての値を０とすれば良い。 The Lagrange multiplier 36, Lagrange multiplier A _1, A _2, A ₃ are stored, Lagrange multiplier A _1, A _2, A ₃ is a matrix with the same dimensions as the auxiliary variables _{_{_{Z 1, Z 2, Z 3}}} . The Lagrange multiplier has the role of making the main variable X coincide with the auxiliary variables Z ₁ , Z ₂ , and Z ₃ . As initial values, appropriate values are input to Lagrange multipliers A ₁ , A ₂ , and A ₃ . In general, all values should be 0.

誤差３７には、ε_１、ε_２、・・・、ε_Ｔが記憶され、ε_１、ε_２、・・・、ε_Ｔは誤差計算部２６において計算される誤差を表す。 The error _{_{37, ε 1, ε 2, ···}} , ε T is _{_{stored, ε 1, ε 2, ···}} , the epsilon _T represents the error calculated at the error calculating unit 26.

パターン基底３８には、パターン基底Ｕ_１、Ｕ_２、Ｕ_３が記憶される。パターン基底Ｕ_１、Ｕ_２、Ｕ_３はそれぞれｎ_１×ｒ_１、ｎ_２×ｒ_２、ｎ_３×ｒ_３の行列であり、変数計算部２２で得られた起点オブジェクト方向、終点オブジェクト方向、および時間方向のパターンを表現する。関係パターン３９には、関係パターンＶ_１、Ｖ_２、Ｖ_３が記憶される。関係パターンＶ_１、Ｖ_２、Ｖ_３はそれぞれｎ_２ｎ_３×ｒ_１、ｎ_３ｎ_１×ｒ_２、ｎ_１ｎ_２×ｒ_３の行列であり、パターン基底Ｕｋに従って変化する、終点オブジェクトおよび時間の間の関係の強さ、起点オブジェクトおよび時間の間の関係の強さ、及び起点オブジェクトおよび終点オブジェクトの間の関係の強さを表現する。ｒ_１、ｒ_２、ｒ_３はそれぞれパターンの要素数を表す。この値はテンソルの分解解析が進むにつれて自動的に決定される。初期値として、パターン基底Ｕ_１、Ｕ_２、Ｕ_３、関係パターンＶ_１、Ｖ_２、Ｖ_３、パターンの要素数ｒ_１、ｒ_２、ｒ_３には適当な値を入力する。一般にはパターン基底はすべての値を０とすれば良い。ｒ_１、ｒ_２、ｒ_３は適当な自然数を選択する。 In the pattern base 38, pattern bases U ₁ , U ₂ , U ₃ are stored. Pattern bases U ₁ , U ₂ , and U ₃ are matrices of n ₁ × r ₁ , n ₂ × r ₂ , and n ₃ × r ₃ , respectively, and the origin object direction, end object direction, And express the pattern in the time direction. In the relationship pattern 39, relationship patterns V ₁ , V ₂ , and V ₃ are stored. The relationship patterns V ₁ , V ₂ , V ₃ are matrices of n ₂ n ₃ × r ₁ , n ₃ n ₁ × r ₂ , n ₁ n ₂ × r ₃ , respectively, and are endpoint objects that change according to the pattern base Uk and It represents the strength of the relationship between time, the strength of the relationship between the origin object and time, and the strength of the relationship between the origin object and the end object. r ₁ , r ₂ , and r ₃ each represent the number of elements of the pattern. This value is automatically determined as the tensor decomposition analysis proceeds. As initial values, appropriate values are input to the pattern bases U ₁ , U ₂ , U ₃ , the relationship patterns V ₁ , V ₂ , V ₃ , and the number of pattern elements r ₁ , r ₂ , r ₃ . In general, all values of the pattern base should be zero. For r ₁ , r ₂ , and r _3, an appropriate natural number is selected.

＜変数計算部２２＞
変数計算部２２は、記憶部３に記憶された各要素を利用して、ＡＤＭＭ法に基づいた主変数の推定を行う。実際の計算では、主変数更新部２３、補助変数更新部２４、ラグランジュ乗数更新部２５、及び誤差計算部２６の各処理をこの順番で行う解析処理を繰り返し行う。 <Variable calculation unit 22>
The variable calculation unit 22 estimates the main variable based on the ADMM method using each element stored in the storage unit 3. In the actual calculation, the analysis processing for performing each processing of the main variable update unit 23, the auxiliary variable update unit 24, the Lagrange multiplier update unit 25, and the error calculation unit 26 in this order is repeatedly performed.

繰り返しごとに解析終了条件判定部２７で解析の終了を判定し、解析終了とならない場合は繰り返し回数ｔを１増加させ、再度解析処理を行う。 The analysis end condition determination unit 27 determines the end of the analysis every repetition, and if the analysis is not ended, the number of repetitions t is incremented by 1 and the analysis process is performed again.

＜主変数更新部２３＞
主変数更新部２３は、以下の（４）式および（５）を用いて主変数Ｘの更新を行う。 <Main variable update unit 23>
The main variable updating unit 23 updates the main variable X using the following equations (4) and (5).

まず、主変数更新部２３は、上記の（４）式を用いて主変数の予測値^Ｘを構成する。予測値^Ｘは、主変数と同じサイズのテンソルであり、各モードｋについて補助変数Ｚ_kからラグランジュ乗数Ａ_kを差し引き、テンソル化したものの平均である。 First, the main variable updating unit 23 constructs a predicted value ^ X of the main variable using the above equation (4). The predicted value ^ X is a tensor having the same size as the main variable, and is the average of the tensors _obtained by subtracting the Lagrange multipliers A _k from the auxiliary variables Z _k for each mode k.

次に、主変数更新部２３は、構成した主変数の予測値^Ｘを用いて、上記（５）式のように主変数Ｘを更新する。上記（５）式では、観測値の得られている要素については、新しい主変数の値が、その予測値と観測値を３λη：１の割合で内分したものに等しくなるように更新している。一方、観測値の得られていない要素については、新しい主変数の値がその予測値そのものとなるように更新している。得られた主変数Ｘを記憶部３の主変数３４に保存して、この主変数更新部２３の処理は終了である。 Next, the main variable updating unit 23 updates the main variable X as shown in the above equation (5) using the configured predicted value ^ X of the main variable. In the above equation (5), for the element for which the observed value is obtained, the new main variable value is updated so as to be equal to the predicted value and the observed value divided internally by a ratio of 3λη: 1. Yes. On the other hand, for elements for which no observed value is obtained, the value of the new main variable is updated so as to be the predicted value itself. The obtained main variable X is stored in the main variable 34 of the storage unit 3, and the processing of the main variable update unit 23 is completed.

＜補助変数更新部２４＞
補助変数更新部２４は、以下の（６）式、（７）式、および（８）式を用いて、補助変数Ｚ_１，Ｚ_２，Ｚ_３、パターン基底Ｕ_１、Ｕ_２、Ｕ_３、及び関係パターンＶ_１，Ｖ_２，Ｖ_３の更新を行う。 <Auxiliary variable update unit 24>
The auxiliary variable update unit 24 uses the following equations (6), (7), and (8) to calculate auxiliary variables Z ₁ , Z ₂ , Z ₃ , pattern bases U ₁ , U ₂ , U ₃ , And the relationship patterns V ₁ , V ₂ , V ₃ are updated.

まず、上記（６）式のように、記憶部３に保存されている量を用いて補助変数の予測値＾Ｚ_kを構成する。補助変数の予測値＾Ｚ_kは各モードｋについて主変数Ｘを行列化したものにラグランジュ乗数Ａkを足したものである。 First, as shown in the above equation (6), the predicted value ^ Z _k of the auxiliary variable is configured using the amount stored in the storage unit 3. The auxiliary variable predictive value _{{circumflex over (} Z) _{} k} is obtained by adding a Lagrange multiplier Ak to a matrix of the main variable X for each mode k.

次に、このように構成した補助変数の予測値＾Ｚ_kに対し、上記（７）式に従って特異値分解を行う。特異値分解を行うと、ｋ番目の予測値＾Ｚ_kに対し、Ｕ_ｋ、Ｓ_ｋ、Ｖ_ｋという行列が得られる。このうちＳ_ｋは特異値が並んだ対角行列である。Ｕ_ｋ、Ｖ_ｋはそれぞれの列が特異値に対応する特異ベクトルである。ただし、特異値及び特異ベクトルとしては、次の（８）式の操作のために、γ_ｋ／η以上の値を持つ特異値とそれに対応する特異ベクトルのみ計算すればよい。 Next, the singular value decomposition is performed on the predicted value _{{circumflex over} (Z) _{} k} of the auxiliary variable configured as described above according to the equation (7). When singular value decomposition is performed, U _k , S _k , and V _k matrices are obtained for the k-th predicted value ^ Z _k . Of these, S _k is a diagonal matrix in which singular values are arranged. U _k and V _k are singular vectors in which each column corresponds to a singular value. However, as the singular value and the singular vector, only the singular value having a value of γ _k / η or more and the corresponding singular vector need be calculated for the operation of the following equation (8).

最後に、上記（７）式で得られた特異値と特異ベクトルを用いて、上記（８）式に従って、補助変数Ｚ_kを更新する。また、新しいパターン基底Ｕ_ｋは、max関数によって０となる列を除いたものである。すなわち、パターンの要素数ｒ_ｋは特異値分解の結果によって繰り返しのたびに変動しうる。 Finally, the auxiliary variable Z _k is updated according to the above equation (8) using the singular value and the singular vector obtained by the above equation (7). Further, the new pattern base U _k is obtained by excluding the column that becomes 0 by the max function. That is, the number of elements r _k of the pattern may vary in each iteration by the results of a singular value decomposition.

得られた補助変数Ｚ_kをベクトル化して、記憶部３の補助変数３５に保存し、パターン基底Ｕ_kを記憶部３のパターン基底３８に保存し、さらに関係パターンＶ_kを記憶部３の関係パターン３９に保存して、この補助変数更新部２４の解析処理は終了となる。 The obtained auxiliary variable Z _k is vectorized and stored in the auxiliary variable 35 of the storage unit 3, the pattern base U _k is stored in the pattern base 38 of the storage unit 3, and the relation pattern V _k is further stored in the relationship of the storage unit 3. The data is stored in the pattern 39, and the analysis process of the auxiliary variable update unit 24 is finished.

＜ラグランジュ乗数更新部２５＞
ラグランジュ乗数更新部２５は、以下の（９）式を用いてラグランジュ乗数Ａ_１、Ａ_２、Ａ_３の更新を行う。 <Lagrange multiplier update unit 25>
The Lagrange multiplier updating unit 25 updates the Lagrange multipliers A ₁ , A ₂ , and A ₃ using the following equation (9).

上記（９）式では、ラグランジュ乗数Ａ_１、Ａ_２、Ａ_３を、主変数Ｘを行列化したものとそれぞれの補助変数Ｚ_１、Ｚ_２、Ｚ_３の乖離の量だけ更新する。得られたラグランジュ乗数を記憶部３のラグランジュ乗数３６に保存して、このラグランジュ乗数更新部２５の解析処理は終了である。 In the above equation (9), the Lagrange multipliers A ₁ , A ₂ , A ₃ are updated by the amount of divergence between the matrix of the main variable X and the auxiliary variables Z ₁ , Z ₂ , Z ₃ . The obtained Lagrangian multiplier is stored in the Lagrange multiplier 36 of the storage unit 3, and the analysis process of the Lagrange multiplier update unit 25 is completed.

＜誤差計算部２６＞
誤差計算部２６では、ＫＫＴ条件誤差を計算する。ＫＫＴ条件については、例えば、上記の非特許文献１０に記載されているものと同様であるため、説明を省略する。 <Error calculation unit 26>
The error calculator 26 calculates a KKT condition error. The KKT conditions are the same as those described in Non-Patent Document 10, for example, and will not be described.

ＫＫＴ条件の誤差ε_tは、以下の（１０）式に示すように、２つの誤差項ε₁およびε₂の最大値として計算される。 The error ε _{t of the} KKT condition is calculated as the maximum value of the _two error terms ε ₁ and ε ₂ as shown in the following equation (10).

ただし、正則化定数λが正の値の場合、誤差項ε₁は、以下の（１１）式に従って計算される。一方、正則化定数λがゼロの場合、誤差項ε₁は、以下の（１２）式に従って計算される。 However, when the regularization constant λ is a positive value, the error term ε ₁ is calculated according to the following equation (11). On the other hand, when the regularization constant λ is zero, the error term ε ₁ is calculated according to the following equation (12).

また、誤差項ε₂は、以下の（１３）式に従って計算される。 The error term ε ₂ is calculated according to the following equation (13).

得られた誤差εを、記憶部３の誤差３７の対応する箇所に保存して、この誤差計算部２６の解析処理は終了となる。 The obtained error ε is stored in a location corresponding to the error 37 of the storage unit 3, and the analysis process of the error calculation unit 26 is completed.

＜解析終了条件判定部２７＞
上記の主変数、補助変数、およびラグランジュ乗数の更新は反復して行われ、反復する毎に、主変数、補助変数、およびラグランジュ乗数の更新をした後、解析終了条件判定部２７は、ＫＫＴ条件の誤差および反復回数を用いて反復を終了するべきか判定する。具体的には、ＫＫＴ条件の誤差があらかじめ決められた定数ε以下であるか、又は反復回数が予め決められた回数Ｌに達した場合、解析終了条件を満たしたとして、反復を終了する。反復を継続する場合は反復回数tを１だけ増加させて主変数更新部２３による処理に戻る。 <Analysis End Condition Determination Unit 27>
The update of the main variable, auxiliary variable, and Lagrange multiplier is performed iteratively. After each update, the main variable, auxiliary variable, and Lagrange multiplier are updated, and then the analysis end condition determination unit 27 performs the KKT condition. Is used to determine if the iteration should be terminated. Specifically, if the error in the KKT condition is equal to or smaller than a predetermined constant ε or the number of iterations reaches a predetermined number L, the iteration is terminated assuming that the analysis termination condition is satisfied. When continuing the iteration, the number of iterations t is incremented by 1, and the processing returns to the main variable updating unit 23.

解析終了条件判定部２７で反復を終了したときの未知テンソルＸの要素(i,j,k)には、時刻kにおいて観測される起点オブジェクトiと終点オブジェクトjとの間の関係の強さを表す値が格納されることになる。これにより、雑音の影響を受けたり、観測できなかったりしたオブジェクト間の関係データを復元したデータ（未知テンソルＸ）を獲得することができる。また、未知テンソルの各軸（次元）に注目することで、各パターン（上記図１の例だと商品パターン、ユーザパターン、時間パターン）の時間変化を知ることができる。 In the element (i, j, k) of the unknown tensor X when the iteration is completed by the analysis end condition determination unit 27, the strength of the relationship between the start object i and the end object j observed at time k is set. The value to represent will be stored. Thereby, it is possible to acquire data (unknown tensor X) obtained by restoring relational data between objects that are affected by noise or that cannot be observed. In addition, by paying attention to each axis (dimension) of the unknown tensor, it is possible to know a temporal change of each pattern (a product pattern, a user pattern, and a time pattern in the example of FIG. 1).

さらに、任意の時刻の関係データの予測を行う場合には、解析終了条件判定部２７で反復を終了した後、以下の予測値計算部４を実行する。 Furthermore, when predicting related data at an arbitrary time, after the iteration is completed by the analysis end condition determination unit 27, the following predicted value calculation unit 4 is executed.

＜予測値計算部４＞
予測値計算部４では、まず入力部１で指定された予測時刻t’の前後の時刻Ｔ（ｋ）およびＴ（ｋ＋１）を、観測時刻を要素とするベクトルＴの中から検索する。これはＴの要素が昇順に並んでいるので２分探索を用いて容易に実行することができる。 <Predicted value calculation unit 4>
The predicted value calculation unit 4 first searches the vectors T whose elements are observation times for the times T (k) and T (k + 1) before and after the predicted time t ′ designated by the input unit 1. This can be easily performed using a binary search because the elements of T are arranged in ascending order.

前後の時刻に対応するインデックスｋが見つかった場合、これを用いて時刻ｔ’における予測値Ｘ’の各要素（ｉ，ｊ）を以下の（１４）式に従って算出し、予測値Ｘ’を構成する。 When the index k corresponding to the preceding and succeeding times is found, using this, each element (i, j) of the predicted value X ′ at the time t ′ is calculated according to the following equation (14), and the predicted value X ′ is configured. To do.

ここで、予測値Ｘ’はｎ₁×ｎ₂行列で、その要素(i,j)は、最適化の結果得られた主変数Ｘの要素(i,j,k)と要素(i,j,k+1)をＴ(k+1)−ｔ’：ｔ’−Ｔ(k)の割合で内分したものである。 Here, the predicted value X ′ is an n ₁ × n ₂ matrix, and its elements (i, j) are the elements (i, j, k) and (i, j) of the main variable X obtained as a result of optimization. , k + 1) is internally divided at a ratio of T (k + 1) -t ': t'-T (k).

ｔ’＜Ｔ（１）の場合は、主変数Ｘの要素(i,j,1)を予測値Ｘ’の要素(i,j)とする。また、ｔ’＞Ｔ（ｎ₃）の場合は主変数Ｘの要素(i,j,n₃)を予測値Ｘ’の要素(i,j)とする。 In the case of t ′ <T (1), the element (i, j, 1) of the main variable X is set as the element (i, j) of the predicted value X ′. When t ′> T (n ₃ ), the element (i, j, n ₃ ) of the main variable X is set as the element (i, j) of the predicted value X ′.

＜出力部５＞
出力部５は、最適化の結果得られた主変数であるテンソルＸ、誤差ε、予測値Ｘ’、およびパターン基底Ｕ₁,Ｕ₂,Ｕ₃を出力する。 <Output unit 5>
The output unit 5 outputs a tensor X, an error ε, a predicted value X ′, and pattern bases U ₁ , U ₂ , U ₃ that are main variables obtained as a result of optimization.

＜時間関係データ解析装置の作用＞
次に、本実施の形態に係る時間関係データ解析装置の作用について説明する。まず、各種の値が入力部１を介して時間関係データ解析装置に入力されると、時間関係データ解析装置において、図５に示す時間関係データ解析処理ルーチンが実行される。 <Operation of time-related data analyzer>
Next, the operation of the time-related data analysis apparatus according to this embodiment will be described. First, when various values are input to the time-related data analysis device via the input unit 1, the time-related data analysis processing routine shown in FIG. 5 is executed in the time-related data analysis device.

まず、ステップ１００において、入力部１により入力された、データサイズｎ_１、ｎ_２、ｎ_３、Ｎ、Ｍ、時刻ラベルＴ、観測ベクトルｙ、観測値のインデックスＩ，Ｊ，Ｋ、未観測値のインデックスＩ^ｃ、Ｊ^ｃ，Ｋ^ｃ、正則化定数λ、正則化バランス定数γ₁、γ₂、γ₃、ステップサイズη、最大繰り返し回数Ｌ，許容誤差の上限Ｅ、予測時刻t’を受け付けて、記憶部３に格納する。 First, in step 100, the data size n ₁ , n ₂ , n ₃ , N, M, time label T, observation vector y, observation index I, J, K, unobserved value input by the input unit 1 are input. Index I ^c , J ^c , K ^c , regularization constant λ, regularization balance constant γ ₁ , γ ₂ , γ ₃ , step size η, maximum number of iterations L, allowable error upper limit E, prediction time t ′ And stored in the storage unit 3.

そして、ステップ１０２において、記憶部３に記憶されている主変数Ｘ、補助変数Ｚ_１，Ｚ_２，Ｚ_３、パターン基底Ｕ_１、Ｕ_２、Ｕ_３、及びラグランジュ乗数Ａ₁、Ａ₂、Ａ₃をゼロに初期化する。ステップ１０４では、反復回数を示す変数ｌに初期値１を設定する。 In step 102, the main variable X, auxiliary variables Z ₁ , Z ₂ , Z ₃ , pattern bases U ₁ , U ₂ , U ₃ , and Lagrange multipliers A ₁ , A ₂ , A stored in the storage unit 3 are stored. Initialize ₃ to zero. In step 104, an initial value 1 is set to a variable l indicating the number of iterations.

次のステップ１０６では、記憶部３に記憶されている主変数Ｘを更新する処理を行い、ステップ１０８では、記憶部３に記憶されている補助変数Ｚ_１，Ｚ_２，Ｚ_３、パターン基底Ｕ_１、Ｕ_２、Ｕ_３、及び関係パターンＶ₁、Ｖ₂、Ｖ₃を更新する処理を行う。 In the next step 106, the main variable X stored in the storage unit 3 is updated. In step 108, auxiliary variables Z ₁ , Z ₂ , Z ₃ stored in the storage unit 3 and the pattern base U ₁ , U ₂ , U ₃ , and relationship patterns V ₁ , V ₂ , V ₃ are updated.

そして、ステップ１１０では、記憶部３に記憶されているラグランジュ乗数Ａ₁、Ａ₂、Ａ₃を更新する処理を行い、ステップ１１２において、ＫＫＴ条件の誤差εを計算する。 In step 110, a process of updating the Lagrange multipliers A ₁ , A ₂ , A ₃ stored in the storage unit 3 is performed. In step 112, an error ε of the KKT condition is calculated.

次のステップ１１４では、上記ステップ１１２で計算した誤差εが、許容誤差の上限Ｅより大きいか否かを判定する。誤差εが、許容誤差の上限Ｅ以下である場合には、解析終了条件を満たしたと判断し、ステップ１２０へ移行する。一方、誤差εが、許容誤差の上限Ｅより大きい場合には、ステップ１１６において、反復回数ｌが、最大繰り返し回数Ｌ未満であるか否かを判定する。反復回数ｌが、最大繰り返し回数Ｌに達した場合には、解析終了条件を満たしたと判断し、ステップ１２０へ移行する。一方、反復回数ｌが、最大繰り返し回数Ｌ未満である場合には、ステップ１１８へ移行し、反復回数ｌを１インクリメントして、上記ステップ１０６へ戻る。 In the next step 114, it is determined whether or not the error ε calculated in step 112 is larger than the upper limit E of the allowable error. When the error ε is equal to or less than the upper limit E of the allowable error, it is determined that the analysis end condition is satisfied, and the process proceeds to step 120. On the other hand, if the error ε is larger than the upper limit E of the allowable error, it is determined in step 116 whether or not the number of iterations 1 is less than the maximum number of iterations L. When the number of iterations l reaches the maximum number of iterations L, it is determined that the analysis end condition is satisfied, and the routine proceeds to step 120. On the other hand, if the number of iterations 1 is less than the maximum number of iterations L, the process proceeds to step 118, increments the number of iterations 1 by 1, and returns to step 106 above.

ステップ１２０では、記憶部３に記憶されている主変数のテンソルＸに基づいて、予測時刻ｔ’の関係データを予測する。そして、ステップ１２２において、最終的に得られた主変数Ｘ、誤差ε、予測値Ｘ’、およびパターン基底Ｕ₁,Ｕ₂,Ｕ₃を出力して、時間関係データ解析処理ルーチンを終了する。 In step 120, based on the tensor X of the main variable stored in the storage unit 3, the relationship data at the predicted time t ′ is predicted. In step 122, the finally obtained main variable X, error ε, predicted value X ′, and pattern bases U ₁ , U ₂ , U ₃ are output, and the time-related data analysis processing routine is terminated.

上記ステップ１０６の処理は、図６に示す処理ルーチンによって実現される。 The processing in step 106 is realized by the processing routine shown in FIG.

まず、ステップ１０６０において、補助変数Ｚ_１，Ｚ_２，Ｚ_３及びラグランジュ乗数Ａ₁、Ａ₂、Ａ₃に基づいて、上記（４）式に従って、主変数の予測値＾Ｘを構成する。そして、ステップ１０６２において、上記ステップ１０６０で構成された主変数の予測値＾Ｘに基づいて、上記ステップ（５）式に従って主変数Ｘを更新して、記憶部３の主変数３４に格納して、処理ルーチンを終了する。 First, in step 1060, based on the auxiliary variables Z ₁ , Z ₂ , Z ₃ and the Lagrange multipliers A ₁ , A ₂ , A ₃ , the predicted value ^ X of the main variable is constructed according to the above equation (4). In step 1062, the main variable X is updated according to the equation (5) based on the predicted value ^ X of the main variable configured in step 1060, and stored in the main variable 34 of the storage unit 3. The processing routine is terminated.

上記ステップ１０８の処理は、図７に示す処理ルーチンによって実現される。 The processing of step 108 is realized by the processing routine shown in FIG.

まず、ステップ１０８０において、モードを示す変数ｋを初期値０に設定し、ステップ１０８２において、ｋを１インクリメントする。 First, in step 1080, the variable k indicating the mode is set to an initial value 0, and in step 1082, k is incremented by one.

そして、ステップ１０８４において、主変数Ｘとラグランジュ乗数Ａ₁、Ａ₂、Ａ₃に基づいて、上記（６）式に従って、補助変数の予測値＾Ｚ_kを構成する。次のステップ１０８６では、上記ステップ１０８４で構成された補助変数の予測値＾Ｚ_kに対して、上記（７）式に従って特異値分解を計算し、ステップ１０８８において、上記ステップ１０８６での特異値分解の結果に基づいて、パターン基底Ｕ_ｋ、関係パターンＶ_kを更新して、記憶部３のパターン基底３８、関係パターン３９に格納する。 In step 1084, based on the main variable X and the Lagrange multipliers A ₁ , A ₂ , A ₃ , the auxiliary variable predicted value ^ Z _k is constructed according to the above equation (6). In the next step 1086, the singular value decomposition is calculated according to the above equation (7) for the predicted value ^ Z _k of the auxiliary variable configured in step 1084. In step 1088, the singular value decomposition in step 1086 is performed. Based on the result, the pattern base U _k and the relation pattern V _k are updated and stored in the pattern base 38 and the relation pattern 39 of the storage unit 3.

そして、ステップ１０９０において、上記ステップ１０８６での特異値分解で得られる行列Ｓ_ｋと上記ステップ１０８８で更新されたパターン基底Ｕ_ｋ、関係パターンV_ｋとに基づいて、上記（８）式に従って、補助変数Ｚ_kを更新して、記憶部３の補助変数３５に格納する。 Then, in step 1090, based on the matrix S _k obtained by the singular value decomposition in step 1086, the pattern base U _k updated in step 1088, and the relationship pattern V _k , the auxiliary is performed according to the above equation (8). The variable Z _k is updated and stored in the auxiliary variable 35 of the storage unit 3.

そして、ステップ１０９２では、変数ｋが３未満であるか否かを判定し、変数ｋが３未満であれば、上記ステップ１０８２へ戻り、ｋを１インクメントする。一方、変数ｋが３に到達した場合には、各モードｋについて、上記ステップ１０８２〜１０９０の処理を実行したと判断し、処理ルーチンを終了する。 In step 1092, it is determined whether or not the variable k is less than 3. If the variable k is less than 3, the process returns to step 1082 to increment k by 1. On the other hand, when the variable k reaches 3, it is determined that the processing of steps 1082 to 1090 has been executed for each mode k, and the processing routine is ended.

上記ステップ１１０の処理は、図８に示す処理ルーチンによって実現される。 The processing of step 110 is realized by the processing routine shown in FIG.

まず、ステップ１１００において、モードを示す変数ｋを初期値０に設定し、ステップ１１０２において、ｋを１インクリメントする。 First, in step 1100, a variable k indicating a mode is set to an initial value 0, and in step 1102, k is incremented by one.

そして、ステップ１１０４では、主変数Ｘと、補助係数Ｚ_kと、ラグランジュ乗数Ａ_ｋとに基づいて、上記（９）式に従ってラグランジュ乗数Ａ_ｋを更新して、記憶部３のラグランジュ乗数３６に格納する。 Then, stored in step 1104, the main variable X, and an auxiliary coefficient Z _k, based on the Lagrange multipliers _{A k,} update the Lagrange multiplier _{A k} according to the above (9), the Lagrange multiplier 36 of the storage section 3 To do.

そして、ステップ１１０６では、変数ｋが３未満であるか否かを判定し、変数ｋが３未満であれば、上記ステップ１１０２へ戻り、ｋを１インクメントする。一方、変数ｋが３に到達した場合には、各モードｋについて、上記ステップ１１０２〜１１０４の処理を実行したと判断し、処理ルーチンを終了する。 In Step 1106, it is determined whether or not the variable k is less than 3. If the variable k is less than 3, the process returns to Step 1102 and increments k by 1. On the other hand, when the variable k reaches 3, it is determined that the processing in steps 1102 to 1104 has been executed for each mode k, and the processing routine is terminated.

上記ステップ１１２の処理は、図９に示す処理ルーチンによって実現される。 The processing of step 112 is realized by the processing routine shown in FIG.

まず、ステップ１１２０において、正則化定数λが０より大きいか否かを判定する。正則化定数λが０より大きい場合には、ステップ１１２２において、主変数Ｘ、観測ベクトルｙ、正則化定数λ、ステップサイズη、ラグランジュ乗数Ａ₁、Ａ₂、Ａ₃に基づいて、上記（１１）式に従って、誤差項ε₁を計算し、ステップ１１２６へ移行する。一方、正則化定数λが０である場合には、ステップ１１２４において、ラグランジュ乗数Ａ₁、Ａ₂、Ａ₃に基づいて、上記（１２）式に従って、誤差項ε₁を計算し、ステップ１１２６へ移行する。 First, in step 1120, it is determined whether the regularization constant λ is greater than zero. If the regularization constant λ is greater than 0, in step 1122, the above (11) is determined based on the main variable X, the observation vector y, the regularization constant λ, the step size η, and the Lagrange multipliers A ₁ , A ₂ , A _3. ), The error term ε ₁ is calculated, and the process proceeds to step 1126. On the other hand, if the regularization constant λ is 0, in step 1124, the error term ε ₁ is calculated according to the above equation (12) based on the Lagrange multipliers A ₁ , A ₂ , A ₃ , and the flow goes to step 1126. Transition.

ステップ１１２６では、主変数Ｘ、及び補助変数Ｚ_１，Ｚ_２，Ｚ_３に基づいて、上記（１３）式に従って、誤差項ε₂を計算する。次のステップ１１２８では、上記ステップ１１２又は１１２４で計算した誤差項ε₁と、上記ステップ１１２６で計算した誤差項ε₂とに基づいて、上記（１０）式に従って、誤差εを計算し、処理ルーチンを終了する。 In step 1126, the error term ε ₂ is calculated according to the above equation (13) based on the main variable X and the auxiliary variables Z ₁ , Z ₂ , Z ₃ . In the next step 1128, based on the error term ε ₁ calculated in step 112 or 1124 and the error term ε ₂ calculated in step 1126, the error ε is calculated according to the above equation (10), and the processing routine is performed. Exit.

＜実験例＞
次に、本実施の形態で提案する時間関係データの解析手法を用いた実験の結果について説明する。実験では、総務省統計局が公開している産業連関表データ（逆行列係数）で５年ごとに１９８５〜２００５年の間のデータを利用した。なお、もとのデータは32業種で集計されている期間と34業種で集計されている期間があるため、32業種に集計しなおしたものを使用した。ここでは1995年のデータを予測すべきデータ（すなわちＴ（ｔ’）＝１９９５）とした。さらに残りの1985, 1990, 2000, 2005年のデータのうちランダムに選んだ20%の要素を欠損データとした。この設定のもとでn₁=32, n₂=32, n₃=4, N=4096, M=3277とした。また、正則化定数λ=0、正則化バランス定数γ₁=0,γ₂=0,γ₃=1, ステップサイズη=1, 許容誤差Ｅ=0.001, 最大繰り返し回数L=1000とした。ただし、アルゴリズムは最大繰り返し回数に達する前に60回程度の反復で許容誤差Eを下回る誤差を達成した。さらに未観測要素に対する誤差err1を以下の（１５）式のように定義する。 <Experimental example>
Next, the results of an experiment using the time-related data analysis method proposed in this embodiment will be described. In the experiment, data from 1985 to 2005 was used every five years in the input-output table data (inverse matrix coefficient) published by the Statistics Bureau of the Ministry of Internal Affairs and Communications. In addition, since the original data has a period aggregated in 32 industries and a period aggregated in 34 industries, the data recalculated in 32 industries was used. Here, the data for 1995 is assumed as data to be predicted (that is, T (t ′) = 1995). Furthermore, 20% of randomly selected elements from the remaining 1985, 1990, 2000, and 2005 data were used as missing data. Under this setting, n ₁ = 32, n ₂ = 32, n ₃ = 4, N = 4096, and M = 3277. Further, the regularization constant λ = 0, the regularization balance constant γ ₁ = 0, γ ₂ = 0, γ ₃ = 1, the step size η = 1, the allowable error E = 0.001, and the maximum number of repetitions L = 1000. However, the algorithm achieved an error below the allowable error E in about 60 iterations before reaching the maximum number of iterations. Further, an error err1 for the unobserved element is defined as the following equation (15).

err1=‖true(I^c,J^c,K^c) − X(I^c,J^c,K^c) ‖／‖X(I^c,J^c,K^c) ‖ ・・・（１５） err1 = ‖true (I ^c , J ^c , K ^c ) −X (I ^c , J ^c , K ^c ) ‖ / ‖X (I ^c , J ^c , K ^c ) ‖ (15)

ここで、Xtrueは真の逆行列係数、X(I^c,J^c,K^c) はテンソルXの未観測要素を（N−M）次元のベクトルに並べたものを表し、‖ ‖は通常のユークリッドノルムを表す。また、予測点t'における誤差err2を以下の（１６）式のように定義する。 Where Xtrue is the true inverse matrix coefficient, X (I ^c , J ^c , K ^c ) is the unobserved elements of tensor X arranged in a (N−M) dimensional vector, and ‖ ‖ is the usual Represents the Euclidean norm. Further, the error err2 at the prediction point t ′ is defined as the following equation (16).

err2=‖Xtrue(:,:,t')−X'(:)‖／‖Xtrue(:,:,t') ‖ ・・・（１６） err2 = ‖Xtrue (:,:, t ') − X' (:) ‖ / ‖Xtrue (:,:, t ') ‖ (16)

ここで、Xtrue(:,:,t')は真の1995年における逆行列係数を1列目、2列目、・・・、32列目の順で縦に連結し、1024次元のベクトルにしたものであり、X'(:)は推定された1995年における逆行列係数を同様にベクトルにしたものである。 Here, Xtrue (:,:, t ') is the true inverse matrix coefficient in 1995, vertically connected in the order of the 1st column, 2nd column, ..., 32nd column, to a 1024-dimensional vector. X ′ (:) is the same as the vector of the estimated inverse matrix coefficient in 1995.

実験結果は以下の通りである。まず、得られた解のランクはr₁=32, r₂=32, r₃=4であった。モード１とモード２に関しては正則化バランス定数γ₁=0, γ₂=0であるからこれは妥当な結果である。モード３に関してはフルランクであるものの、得られた特異値は図１０に示すように第１特異値に集中していることがわかる。従って低ランクの仮定はあてはまっていると考えられる。観測要素（８０％）のランダムな選択を１００回繰り返して得られた誤差は、未観測要素に関してerr1=0.11、補間に関する誤差err2=0.050であった。図１１に、真の1995年における逆行列係数と推定された逆行列係数を示す。ここでは縦軸にのみ、業種の名前を示しているが、横軸も縦軸と同じ業種に対応する。この図からも視覚的に推定結果が真の逆行列係数とよく一致していることがわかる。図１２、１３に、モード３に関する特異値分解（上記図１０）に付随する特異ベクトルを示す。上記図１２は第１成分を表わし、上記図１３は、第２成分を表わしている。また、図１２、１３の右側にはパターン基底Ｕ₃のそれぞれ１列目（第１特異ベクトル）および２列目（第２特異ベクトル）に対応する特異値を乗じたものを示す。なお、最適化の結果得られたＵ₃のそれぞれの列ベクトルは4つの要素を持ち、時間点1985、1990、2000、2005年に対応し、1995年のデータ（丸印で示す）は1990年および2000年のデータの内分から得られる予測値である。上記図１２、１３の左側には、Ｚ₃の第１、第２特異値に対応する特異ベクトル、すなわち関係パターンＶ_３の第１列、第２列をそれぞれ32×32行列に変形したものを示しており、図の右側に示すパターン基底が表現するダイナミクスに従って時間変化する逆行列係数の成分に対応する。上記図１２は産業界全体での連関が2000年まで年々減少し、2000年を境にして増加に転じたことを示している。上記図１３は農林水産業や鉄鋼などの伝統的な産業から他業種への影響が減少し、逆に通信・放送や対事業所サービスのような非伝統的な産業の他業種へ与える影響が強まっていることを表している。 The experimental results are as follows. First, the rank of the obtained solution was r ₁ = 32, r ₂ = 32, r ₃ = 4. For mode 1 and mode 2, this is a reasonable result because the regularization balance constants γ ₁ = 0 and γ ₂ = 0. Although it is full rank regarding mode 3, it turns out that the obtained singular value concentrates on the 1st singular value as shown in FIG. Therefore, it is considered that the low rank assumption is applied. The errors obtained by repeating the random selection of the observation element (80%) 100 times were err1 = 0.11 for the unobserved element and err2 = 0.050 for the interpolation. FIG. 11 shows the true inverse matrix coefficients in 1995 and the estimated inverse matrix coefficients. Here, only the vertical axis indicates the name of the business type, but the horizontal axis also corresponds to the same business type as the vertical axis. This figure also shows that the estimation result is in good agreement with the true inverse matrix coefficient. 12 and 13 show the singular vectors associated with the singular value decomposition (FIG. 10 above) relating to mode 3. FIG. FIG. 12 shows the first component, and FIG. 13 shows the second component. 12 and 13 are multiplied by singular values corresponding to the first column (first singular vector) and the second column (second singular vector) of the pattern base U ₃ , respectively. Note that each column vector of U ₃ obtained as a result of optimization has four elements, corresponding to time points 1985, 1990, 2000, 2005, and 1995 data (circled) is 1990 And predicted values obtained from internal data of 2000 data. On the left side of FIGS. 12 and 13, singular vectors corresponding to the first and second singular values of Z ₃ , that is, the first and second columns of the relationship pattern V ₃ are transformed into 32 × 32 matrices, respectively. It corresponds to the inverse matrix coefficient component that changes with time according to the dynamics represented by the pattern base shown on the right side of the figure. FIG. 12 above shows that the linkages in the entire industry decreased year by year until 2000, and began to increase after 2000. In Figure 13 above, the impact on other industries from traditional industries such as agriculture, forestry and fisheries and steel decreases, and conversely the impact on other industries in non-traditional industries such as communications / broadcasting and services for business establishments. It shows that it is getting stronger.

以上説明したように、本発明の実施の形態に係る時間関係データ解析装置によれば、時間変化するオブジェクト間の関係の強さを示す時間関係データを解析して、目的関数を最小化するように、時間関係データを表わす未知のテンソルＸを求めることによって、未観測のオブジェクト間の関係の強さを求めることができる。 As described above, according to the time-related data analysis apparatus according to the embodiment of the present invention, the time-related data indicating the strength of the relationship between time-varying objects is analyzed to minimize the objective function. In addition, the strength of the relationship between unobserved objects can be determined by determining the unknown tensor X representing the time relationship data.

また、時間変化する関係データを少数のパターンに分解して、内在するパターンを発見することができる。パターンの数は補助変数更新部２４の特異値分解において、γ_ｋ／η以上の値を持つ特異値とそれに対応する特異ベクトルのみ計算する処理により、予めパターン数を設定せずとも自動的に最適な値が求まる。 In addition, the inherent data can be found by decomposing the time-related relational data into a small number of patterns. In the singular value decomposition of the auxiliary variable update unit 24, the number of patterns is automatically optimized without setting the number of patterns in advance by processing only singular values having a value of γ _k / η or more and corresponding singular vectors. The correct value is obtained.

また、解析する関係データの時間発展を複数の時間的要因に分解することが可能である。また、解析する関係データを複数のパターンの重みづけとして表現するため、観測に失敗した未知数や雑音の重畳した観測データを高精度に復元・予測できる。 It is also possible to decompose the time evolution of the relational data to be analyzed into a plurality of time factors. In addition, since the relational data to be analyzed is expressed as a weight of a plurality of patterns, it is possible to restore / predict the observation data on which unknowns or noises that have failed to be observed are highly accurate.

なお、本発明は、上述した実施形態に限定されるものではなく、この発明の要旨を逸脱しない範囲内で様々な変形や応用が可能である。 Note that the present invention is not limited to the above-described embodiment, and various modifications and applications are possible without departing from the gist of the present invention.

例えば、上述の時間関係データ解析装置は、内部にコンピュータシステムを有しているが、「コンピュータシステム」は、ＷＷＷシステムを利用している場合であれば、ホームページ提供環境（あるいは表示環境）も含むものとする。 For example, the time-related data analysis apparatus described above has a computer system inside, but the “computer system” includes a homepage providing environment (or display environment) if a WWW system is used. Shall be.

また、本願明細書中において、プログラムが予めインストールされている実施形態として説明したが、当該プログラムを、コンピュータ読み取り可能な記録媒体に格納して提供することも可能である。 In the present specification, the embodiment has been described in which the program is installed in advance. However, the program can be provided by being stored in a computer-readable recording medium.

１入力部
２解析部
３記憶部
４予測値計算部
５出力部
２１初期設定部
２２変数計算部
２３主変数更新部
２４補助変数更新部
２５ラグランジュ乗数更新部
２６誤差計算部
２７解析終了条件判定部 DESCRIPTION OF SYMBOLS 1 Input part 2 Analysis part 3 Storage part 4 Predicted value calculation part 5 Output part 21 Initial setting part 22 Variable calculation part 23 Main variable update part 24 Auxiliary variable update part 25 Lagrange multiplier update part 26 Error calculation part 27 Analysis end condition determination part

Claims

Representing a time series of relational data indicating the strength of the relation between each of the plurality of first objects and each of the plurality of second objects, and analyzing the time relational data represented by a tensor, to analyze the time A time relational data analysis device for obtaining an unknown tensor X representing relational data,
Each mode k (k = 1) of an observation vector y consisting of M observation values of the strength of the relationship between the first object and the second object, a regularization constant λ that is a positive real number, and a tensor X , 2, 3) storing means for storing a regularization balance constant γ _k representing a relative regularization strength relative to
The observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) are acquired from the storage means, and the objective function expressed by the following equation (I) is minimized. Analyzing means for obtaining the tensor X to be converted;
A temporal relationship data analyzer comprising,
The storage means includes an auxiliary variable Z _k (k = 1, 2, 3), a value x of each element of the tensor X, a Lagrange multiplier A _k (k = 1, 2, 3), a first object direction, a second object direction, and each of the pattern base U _k in the time direction (k = 1, 2, 3), the first object that varies in accordance with said pattern base U _k (k = 1, 2, 3), the second object, And a relationship pattern V _k (k = 1, 2, 3) representing the strength of the relationship between each pair of time and
The analysis means includes
Based on the auxiliary coefficient Z _k and Lagrange multiplier A _k of each mode k, the predicted value of the tensor ^ X is calculated according to the following equation (II), and based on the calculated predicted value of the tensor ^ X and the observed vector y Main variable updating means for updating the value x of each element of the tensor X according to the following formula (III) and storing the value x in the storage means:
Based on the tensor X and the Lagrange multiplier A _k of each mode k, for each mode k, an auxiliary variable predicted value ^ Z _k is calculated according to the following equation (IV) , and the calculated auxiliary variable is predicted: Based on the value ^ Z _k , singular value decomposition is performed for each mode k according to the following equation (V) to obtain a diagonal matrix S _{k in} which singular values are arranged, and the pattern base U _k and the relationship Update the pattern V _k and store it in the storage means;
Based on the obtained diagonal matrix S _k , the pattern base U _k , and the relation pattern V _k , the auxiliary variable Z _k is updated according to the following equation (VI) and stored in the storage means. An auxiliary variable updating means for
Based on the tensor X and the auxiliary variable Z _k , for each mode k, the Lagrange multiplier update means for updating the Lagrange multiplier A _k and storing it in the storage means according to the following equation (VII) :
The tensor X stored in the storage means when the predetermined analysis end condition is satisfied by repeating the update by the main variable update means, the update by the auxiliary variable update means, and the update by the Lagrange multiplier update means. And an analysis end condition determination means for outputting
Time-related data analysis device.

Here, n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the time-related data, and x is an element of the tensor X. Ω (X) is an operation for extracting M elements corresponding to the observation value of the observation vector from the tensor X. mat ₁ is a mode 1 matrix that returns n ₁ × n ₂ n ₃ , mat ₂ is a mode 2 matrix that returns n ₂ × n ₃ n ₁ , and mat ₃ is n ₃ × n ₁ n Mode 3 matrixing that returns ₂ . ‖‖ _tr represents the linear sum of the singular values of the matrix.

However, ￣Ω (X) is an operation for extracting an element other than the observed value from the tensor X, and η is a constant.

Representing a time series of relational data indicating the strength of the relation between each of the plurality of first objects and each of the plurality of second objects, and analyzing the time relational data represented by a tensor, to analyze the time A time relational data analysis device for obtaining an unknown tensor X representing relational data,
Each mode k (k = 1) of an observation vector y consisting of M observation values of the strength of the relationship between the first object and the second object, a regularization constant λ that is a positive real number, and a tensor X , 2, 3) storing means for storing a regularization balance constant γ _k representing a relative regularization strength relative to
The observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) are acquired from the storage means, and the objective function expressed by the following equation (I) is minimized. Analyzing means for obtaining the tensor X to be converted;
Based on the tensor X obtained by the analyzing means, the predicted value X of the relationship data at the prediction target time t ′ in the relationship between the first object i and the second object j according to the following equation (VIII): 'Relational data prediction means for calculating (i, j);
Time-related data analysis device including

Here, n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the time-related data, and x is an element of the tensor X. Ω (X) is an operation for extracting M elements corresponding to the observation value of the observation vector from the tensor X. mat ₁ is a mode 1 matrix that returns n ₁ × n ₂ n ₃ , mat ₂ is a mode 2 matrix that returns n ₂ × n ₃ n ₁ , and mat ₃ is n ₃ × n ₁ n Mode 3 matrixing that returns ₂ . ‖‖ _tr represents the linear sum of the singular values of the matrix.

However, the times when the observed values are observed before and after the prediction target time t ′ are T (k) and T (k + 1).

An observation vector y consisting of M observations of the strength of the relationship between the first object and the second object, a regularization constant λ that is a positive real number, and each mode k (k = 1, 2) of the tensor X 3) In the time-related data analysis apparatus provided with storage means for storing the regularization balance constant γ _k representing the relative regularization strength relative to 3), each of the plurality of first objects and each of the plurality of second objects Time relationship data analysis method for representing an unknown tensor X representing the time relationship data by analyzing the time relationship data represented by a tensor and representing a time series of relationship data indicating the strength of the relationship between Because
The analysis unit obtains the observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) from the storage unit, and is expressed by the following equation (I): obtains the tensor X that minimizes the objective function that,
The storage means includes an auxiliary variable Z _k (k = 1, 2, 3), a value x of each element of the tensor X, a Lagrange multiplier A _k (k = 1, 2, 3), a first object direction, a second object direction, and each of the pattern base U _k in the time direction (k = 1, 2, 3), the first object that varies in accordance with said pattern base U _k (k = 1, 2, 3), the second object, And a relationship pattern V _k (k = 1, 2, 3) representing the strength of the relationship between each pair of time and
Obtaining the tensor X by the analyzing means
Based on the auxiliary coefficient Z _k and Lagrange multiplier A _k of each mode k, the main variable updating means calculates the predicted tensor value ^ X according to the following equation (II), and calculates the predicted tensor predicted value ^ X and Updating the value x of each element of the tensor X based on the observation vector y according to the following equation (III) and storing it in the storage means:
Based on the tensor X and the Lagrange multiplier A _k of each mode k, the auxiliary variable update means calculates the predicted value ^ Z _k of the auxiliary variable for each mode k according to the following equation (IV). On the basis of the predicted value ^ Z _k of the auxiliary variable , for each mode k, singular value decomposition is performed according to the following equation (V) to obtain a diagonal matrix S _{k in} which singular values are arranged, Updating the pattern base U _k and the relation pattern V _k and storing them in the storage means;
Based on the obtained diagonal matrix S _k , the pattern base U _k , and the relation pattern V _k , the auxiliary variable Z _k is updated according to the following equation (VI) and stored in the storage means. And steps to
The Lagrange multiplier updating means on the basis of the tensor X and auxiliary variables Z _k, for each mode k, the following (VII) wherein the step of storing in the storage means to update the Lagrange multiplier A _k,
When the analysis end condition determination unit repeats the update by the main variable update unit, the update by the auxiliary variable update unit, and the update by the Lagrange multiplier update unit, the storage unit when the predetermined analysis end condition is satisfied Outputting the tensor X stored in the time-related data analysis method.

Here, n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the time-related data, and x is an element of the tensor X. Ω (X) is an operation for extracting M elements corresponding to the observation value of the observation vector from the tensor X. mat ₁ is a mode 1 matrix that returns n ₁ × n ₂ n ₃ , mat ₂ is a mode 2 matrix that returns n ₂ × n ₃ n ₁ , and mat ₃ is n ₃ × n ₁ n Mode 3 matrixing that returns ₂ . ‖‖ _tr represents the linear sum of the singular values of the matrix.

However, ￣Ω (X) is an operation for extracting an element other than the observed value from the tensor X, and η is a constant.

An observation vector y consisting of M observations of the strength of the relationship between the first object and the second object, a regularization constant λ that is a positive real number, and each mode k (k = 1, 2) of the tensor X 3) In the time-related data analysis apparatus provided with storage means for storing the regularization balance constant γ _k representing the relative regularization strength relative to 3), each of the plurality of first objects and each of the plurality of second objects Time relationship data analysis method for expressing an unknown tensor X representing the time relationship data by analyzing time relationship data expressed by a tensor and representing a time series of relationship data indicating the strength of the relationship between Because
The analysis unit obtains the observation vector y, the regularization constant λ, and the regularization balance constant γ _k (k = 1, 2, 3) from the storage unit, and is expressed by the following equation (I): obtains the tensor X that minimizes the objective function that,
Based on the tensor X obtained by the analyzing means by the relation data predicting means, the prediction target time t ′ in the relation between the first object i and the second object j according to the following equation (VIII): A temporal relational data analysis method further comprising calculating a predicted value X '(i, j) of the relational data.

Here, n ₁ , n ₂ , and n ₃ are natural numbers representing the size of the time-related data, and x is an element of the tensor X. Ω (X) is an operation for extracting M elements corresponding to the observation value of the observation vector from the tensor X. mat ₁ is a mode 1 matrix that returns n ₁ × n ₂ n ₃ , mat ₂ is a mode 2 matrix that returns n ₂ × n ₃ n ₁ , and mat ₃ is n ₃ × n ₁ n Mode 3 matrixing that returns ₂ . ‖‖ _tr represents the linear sum of the singular values of the matrix.

However, the times when the observed values are observed before and after the prediction target time t ′ are T (k) and T (k + 1).

As each unit constituting the time relationship data analyzing device according to claim 1 or 2, the program for causing a computer to function.