JP2014182753A - PPR computing unit, method and program

Info

Publication number: JP2014182753A
Application number: JP2013058563A
Authority: JP
Inventors: Masaaki Nishino; Yoshihito Yasuda; Masaaki Nagata
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-03-21
Filing date: 2013-03-21
Publication date: 2014-09-29
Anticipated expiration: 2033-03-21
Also published as: JP5901558B2

Abstract

PROBLEM TO BE SOLVED: To reduce the number of operations required to compute PPR. SOLUTION: A score matrix creation unit 29 creates a score matrix. A ZDD construction unit 22 converts the binary matrix corresponding to an input adjacency matrix into a ZDD and stores the out-degree of each node. Based on the score matrix, a matrix composed of the plurality of personalized vectors received by an input unit 10, and the ZDD, a PPR calculation unit 30 calculates a score vector for each intermediate node, outputs a matrix as the multiplication result from the scores of the child nodes of the row nodes, and calculates the PPR scores using that matrix.

Description

The present invention relates to a PPR calculation device, method, and program, and more particularly to a PPR calculation device, method, and program that perform PPR calculation.

Personalized PageRank (hereinafter, PPR) is an algorithm that uses the structure of a graph to calculate weights reflecting the importance of each node of the graph. PPR is widely used, for example, to judge the importance of pages in information retrieval over Web pages and as a score for deciding which items to recommend in a recommender system.

To calculate PPR, the multiplication of an adjacency matrix representing the graph by a matrix representing the scores must be repeated (Non-Patent Document 1).

When the adjacency matrix is a binary matrix, it can also be represented using a zero-suppressed binary decision diagram (ZDD). A ZDD is a data structure that represents the target data as a binary decision graph, and its distinguishing feature is that it can represent the data in compressed form. Under certain conditions, the amount of computation can be reduced by representing the adjacency matrix as a ZDD and executing the multiplication required for the PPR calculation as dynamic programming over the ZDD structure (Non-Patent Document 2).

Non-Patent Document 1: Taher H. Haveliwala, "Topic-Sensitive PageRank", In Proceedings of the 11th World Wide Web Conference, 2002.
Non-Patent Document 2: Masaaki Nishino, Yoshihito Yasuda, Toru Kobayashi, "Efficient computation of set expansion using ZDD" (ZDDを用いた効率的な集合拡張の計算), Transactions of the Japanese Society for Artificial Intelligence, Vol. 27, No. 2, pp. 1-6, 2011.

However, because PPR is often applied to large-scale graph data, repeatedly multiplying the adjacency matrix, which is a large sparse matrix, by the matrix representing the scores takes time proportional to the number of non-zero elements of the sparse matrix, so the calculation time becomes long for large graphs. In particular, when PPR is calculated for a plurality of personalized vectors, the calculation time grows with the number of personalized vectors.

In addition, because Non-Patent Document 2 deals only with binary matrices, the column-normalized adjacency matrix required for the PPR calculation cannot be represented using a ZDD. Furthermore, Non-Patent Document 2 shows how to calculate the product of a binary matrix and a vector but does not describe how to calculate the product of two matrices, so it cannot be used directly for the PPR calculation.

The present invention has been made to solve the above problems, and its object is to provide a PPR calculation device, method, and program that can reduce the number of operations required to calculate PPR scores.

To achieve the above object, a PPR calculation device according to a first invention calculates an N-row, K-column score matrix representing PPR (Personalized PageRank) scores, based on an N-row, N-column adjacency matrix obtained for a directed graph representing the relationships among N pages and on an N-row, K-column matrix representing N-dimensional personalized vectors that store the weights of the pages prepared in advance for each of K topics. The device includes: initialization means for initializing all elements of the score matrix; PPR score calculation means for calculating, as the score matrix, a weighted sum of (i) the product of the adjacency matrix and either the score matrix initialized by the initialization means or the previously calculated score matrix and (ii) the matrix representing the personalized vectors for the K topics; and iteration determination means for repeating the calculation by the PPR score calculation means until a predetermined iteration end condition is satisfied. The PPR score calculation means includes: ZDD construction means for creating a binary matrix corresponding to the adjacency matrix by replacing the value of each non-zero element of the adjacency matrix with 1, and for converting the created binary matrix into a zero-suppressed binary decision diagram (ZDD) that includes terminal nodes indicating values, row nodes corresponding to the rows of the binary matrix, and intermediate nodes corresponding to the non-zero elements of the binary matrix; and score calculation means for calculating a K-dimensional score vector for each intermediate node, in order from the terminal nodes toward the row nodes of the ZDD, where the score vector of each target intermediate node is the sum of the score vectors already calculated for the intermediate nodes that are child nodes of the target intermediate node and the product of the element of the binary matrix corresponding to the target intermediate node, the K-dimensional vector of the score matrix to be multiplied, and the element of the adjacency matrix corresponding to the target intermediate node. For each row node, the PPR score calculation means obtains the score vector calculated by the score calculation means for the intermediate node that is a child node of the row node, creates an N-row, K-column matrix as the result of the product of the adjacency matrix and the score matrix, and calculates the weighted sum of the created N-row, K-column matrix and the matrix representing the personalized vectors for the K topics.

A PPR calculation method according to a second invention is a method performed in a PPR calculation device that includes initialization means, PPR score calculation means, and iteration determination means, and that calculates an N-row, K-column score matrix representing PPR (Personalized PageRank) scores, based on an N-row, N-column adjacency matrix obtained for a graph representing the relationships among N pages and on an N-row, K-column matrix representing N-dimensional personalized vectors that store the weights of the pages prepared in advance for each of K topics. The method includes: a step in which the initialization means initializes all elements of the score matrix; a step in which the PPR score calculation means calculates, as the score matrix, a weighted sum of (i) the product of the adjacency matrix and either the score matrix initialized by the initialization means or the previously calculated score matrix and (ii) the matrix representing the personalized vectors for the K topics; and a step in which the iteration determination means repeats the calculation by the PPR score calculation means until a predetermined iteration end condition is satisfied. The step of calculating by the PPR score calculation means includes: a step in which ZDD construction means creates a binary matrix corresponding to the adjacency matrix by replacing the value of each non-zero element of the adjacency matrix with 1, and converts the created binary matrix into a zero-suppressed binary decision diagram (ZDD) that includes terminal nodes indicating values, row nodes corresponding to the rows of the binary matrix, and intermediate nodes corresponding to the non-zero elements of the binary matrix; and a step in which score calculation means calculates a K-dimensional score vector for each intermediate node, in order from the terminal nodes toward the row nodes of the ZDD, where the score vector of each target intermediate node is the sum of the score vectors already calculated for the intermediate nodes that are child nodes of the target intermediate node and the product of the element of the binary matrix corresponding to the target intermediate node, the K-dimensional vector of the score matrix to be multiplied, and the element of the adjacency matrix corresponding to the target intermediate node. For each row node, the score vector calculated by the score calculation means for the intermediate node that is a child node of the row node is obtained, an N-row, K-column matrix is created as the result of the product of the adjacency matrix and the score matrix, and the weighted sum of the created N-row, K-column matrix and the matrix representing the personalized vectors for the K topics is calculated.

According to the first and second inventions, the initialization means initializes all elements of the score matrix; the PPR score calculation means calculates, as the score matrix, the weighted sum of the product of the adjacency matrix and either the initialized score matrix or the previously calculated score matrix, and the matrix representing the personalized vectors for the K topics; and the iteration determination means repeats the calculation by the PPR score calculation means until a predetermined iteration end condition is satisfied.

In the PPR score calculation means, the ZDD construction means creates a binary matrix corresponding to the adjacency matrix by replacing the value of each non-zero element of the adjacency matrix with 1, and converts the created binary matrix into a zero-suppressed binary decision diagram that includes terminal nodes indicating values, row nodes corresponding to the rows of the binary matrix, and intermediate nodes corresponding to the non-zero elements of the binary matrix. The score calculation means calculates a K-dimensional score vector for each intermediate node, in order from the terminal nodes toward the row nodes of the zero-suppressed binary decision diagram. For each target intermediate node, it calculates, as the score vector of that node, the sum of the score vectors already calculated for the intermediate nodes that are child nodes of the target intermediate node and the product of the element of the binary matrix corresponding to the target intermediate node, the K-dimensional vector of the score matrix to be multiplied, and the element of the adjacency matrix corresponding to the target intermediate node. For each row node, the PPR score calculation means obtains the score vector calculated by the score calculation means for the intermediate node that is a child node of the row node, creates an N-row, K-column matrix as the result of the product of the adjacency matrix and the score matrix, and calculates the weighted sum of the created N-row, K-column matrix and the matrix representing the personalized vectors for the K topics.

Thus, according to the present invention, when the product of the adjacency matrix and the score matrix is calculated, the binary matrix corresponding to the adjacency matrix is converted into a ZDD and the product of the adjacency matrix and the score matrix is calculated on that ZDD, which reduces the number of operations required to calculate the PPR scores.

In the present invention, the directed graph may be a directed graph representing the link relationships among the N pages, and the (j, i) element of the adjacency matrix may be 1 / (the out-degree of the i-th node) when the directed graph has an edge that starts at the i-th node and ends at the j-th node, and 0 when it has no such edge.

The program of the present invention is a program for causing a computer to function as each of the means constituting the above PPR calculation device.

As described above, according to the PPR calculation device, method, and program of the present invention, when the product of the adjacency matrix and the score matrix is calculated, the binary matrix corresponding to the adjacency matrix is converted into a ZDD and the product of the adjacency matrix and the score matrix is calculated on that ZDD, which reduces the number of operations required to calculate the PPR scores.

FIG. 1 is a schematic diagram showing the configuration of a PPR calculation device according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the directed graph of an input adjacency matrix.
FIG. 3 is a diagram showing an example of an input adjacency matrix.
FIG. 4 is a diagram showing an example of a binary matrix converted into a ZDD.
FIG. 5 is a diagram showing an example of an input adjacency matrix converted into a binary matrix.
FIG. 6 is a diagram showing an example of the simplified notation of a ZDD.
FIG. 7 is a diagram showing an example of node IDs assigned to a ZDD.
FIG. 8 is a diagram showing an example of the array representation of a ZDD.
FIG. 9 is a diagram showing an example of the array representation of a ZDD together with out-degrees.
FIG. 10 is a diagram showing an example of calculating the score vectors of intermediate nodes.
FIG. 11 is a diagram showing the PPR calculation processing routine in the PPR calculation device according to the embodiment of the present invention.
FIG. 12 is a diagram showing the PPR score calculation processing routine in the PPR calculation device according to the embodiment of the present invention.
FIG. 13 is a diagram showing the AX^{(t-1)} calculation processing routine in the PPR calculation device according to the embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<Configuration of the PPR calculation device>
A PPR calculation device according to an embodiment of the present invention will be described. As shown in FIG. 1, the PPR calculation device 100 according to the embodiment can be configured as a computer that includes a CPU, a RAM, and a ROM storing various data and a program for executing the PPR calculation processing routine described later. Functionally, as shown in FIG. 1, the PPR calculation device 100 includes an input unit 10, a calculation unit 20, and an output unit 50.

The input unit 10 receives, from an input device such as a keyboard, an N-row, N-column adjacency matrix A (FIG. 3) representing a directed graph with N nodes that indicates the link relationships among the N pages subject to PPR calculation (FIG. 2), and an N-row, K-column matrix U formed from the personalized vectors for the K topics. The input unit 10 may also accept input supplied from outside via a network or the like. This embodiment is described on the assumption that a 3-row, 3-column adjacency matrix A has been input.

Here, a personalized vector is a vector of per-page weights. For example, it is a vector whose weights indicate, page by page, which pages are important for the topic "news" or which pages are important for the topic "sports". The weights in each personalized vector are expressed as probabilities, so its elements sum to 1.

The (j, i) element A_{j,i} of the adjacency matrix A is A_{j,i} = 1/m_i when the target directed graph has an edge that starts at the i-th node and ends at the j-th node, and A_{j,i} = 0 otherwise. Here, m_i is the out-degree of node i.
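The definition of A can be sketched in a few lines. This is only an illustration: the function name `build_adjacency` and the 3-node edge list are hypothetical and are not taken from the figures.

```python
def build_adjacency(n, edges):
    """Return the n x n column-normalized adjacency matrix as nested lists.

    edges holds 1-indexed (i, j) pairs, each meaning an edge from node i to node j.
    """
    unique_edges = set(edges)          # ignore accidental duplicates
    out_degree = [0] * (n + 1)         # out_degree[i] = m_i (index 0 unused)
    for i, _ in unique_edges:
        out_degree[i] += 1
    a = [[0.0] * n for _ in range(n)]
    for i, j in unique_edges:
        a[j - 1][i - 1] = 1.0 / out_degree[i]   # A_{j,i} = 1 / m_i
    return a


# Hypothetical example graph: 1->1, 3->1, 3->2, 2->3
edges = [(1, 1), (3, 1), (3, 2), (2, 3)]
A = build_adjacency(3, edges)
for row in A:
    print(row)
```

Each column of the resulting matrix sums to 1 for every node that has at least one outgoing edge, which is the column normalization the description refers to.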

The calculation unit 20 includes a ZDD construction unit 22, a ZDD storage unit 24, a score array creation unit 26, an array storage unit 28, a PPR score matrix creation unit 29, a PPR calculation unit 30, and a matrix storage unit 32. The PPR score matrix creation unit 29 is an example of the initialization means, and the ZDD construction unit 22 and the PPR calculation unit 30 are examples of the PPR score calculation means, the ZDD construction means, the score calculation means, and the iteration determination means.

The ZDD construction unit 22 converts the N-row, N-column adjacency matrix A received by the input unit 10 into a ZDD data structure such as that shown in FIG. 4. Because the method described in Non-Patent Document 2 only shows how to construct a ZDD for a binary matrix, the unit first creates an N-row, N-column binary matrix D corresponding to the adjacency matrix A by replacing the value of each non-zero element of A with 1 (see FIG. 5). It then converts this binary matrix into the ZDD data structure and stores it in the ZDD storage unit 24 together with the out-degree of each node of the directed graph. The ZDD includes a terminal node for each of the values 0 and 1, row nodes corresponding to the rows of the binary matrix D, and intermediate nodes corresponding to the non-zero elements of D. In FIG. 4, F denotes a pointer to the start node, each node whose label begins with r (a row node) corresponds to a row of D, and each node whose label begins with e (an intermediate node) corresponds to a column of D. Because the LO-side link of every intermediate node leads to the 0-terminal node, these LO-side links are omitted and the diagram is drawn as in FIG. 6. The method described in Non-Patent Document 2 may be used to convert the binary matrix D into the ZDD data structure.

As shown in FIG. 7, node IDs are assigned to this diagram in order from the terminal nodes toward the start node. The diagram is then represented as a one-dimensional array that associates, for each node, the node ID, the label, the node ID at the destination of the HI-side link, and the node ID at the destination of the LO-side link, and this array is stored in the ZDD storage unit 24. The left side of FIG. 8 shows the array representation corresponding to FIGS. 6 and 7, and the right side of FIG. 8 shows the ZDD with the node ID written on each node. As in FIG. 8, array indices 1 and 2 are assigned to the two terminal nodes (the 0-terminal and the 1-terminal), the intermediate nodes follow from array index 3 onward, and the row nodes come after them. As shown in FIG. 9, the one-dimensional array of FIG. 8 may also be given a column that stores the out-degree of the directed-graph node corresponding to each ZDD node.
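The array layout described above can be sketched as follows. This is not the construction procedure of Non-Patent Document 2; it is a simplified illustration that builds the table directly from an assumed 3x3 binary matrix. The function name `build_zdd_table`, the tuple layout, and the ID order within each group of nodes are assumptions.

```python
def build_zdd_table(d):
    """Return rows (node_id, label, hi, lo); IDs 1 and 2 are the 0- and 1-terminals."""
    n = len(d)
    table = [(1, "0", None, None), (2, "1", None, None)]
    suffix_id = {(): 2}  # the empty column suffix is the 1-terminal
    row_cols = [tuple(j for j in range(1, n + 1) if row[j - 1] == 1) for row in d]
    # Intermediate nodes, created level by level from e_N up to e_1 so that every
    # child already has a smaller ID. Rows with identical suffixes share nodes.
    for j in range(n, 0, -1):
        for cols in row_cols:
            if j in cols:
                suffix = cols[cols.index(j):]
                if suffix not in suffix_id:
                    node_id = len(table) + 1
                    # HI -> node for the rest of the suffix, LO -> 0-terminal
                    table.append((node_id, "e%d" % j, suffix_id[suffix[1:]], 1))
                    suffix_id[suffix] = node_id
    # Row nodes, from r_N up to r_1: HI -> first intermediate node of the row,
    # LO -> the next row node (0-terminal for the last row).
    lo = 1
    for g in range(n, 0, -1):
        node_id = len(table) + 1
        table.append((node_id, "r%d" % g, suffix_id[row_cols[g - 1]], lo))
        lo = node_id
    return table


# Assumed example binary matrix
D = [[1, 0, 1],
     [0, 0, 1],
     [0, 1, 0]]
table = build_zdd_table(D)
for entry in table:
    print(entry)
```

For this matrix the e3 node is created once and reused by rows 1 and 2, so four non-zero elements are covered by three intermediate nodes.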

Specifically, the N-row, N-column binary matrix D corresponding to the adjacency matrix A received by the input unit 10 is expressed as a combination set, and the set S of symbols used in the combination set is S = {r1, ..., rN, e1, ..., eN}. Here, r1, ..., rN are symbols corresponding to the respective rows, and e1, ..., eN are symbols corresponding to the respective columns. Using these symbols, the combination set Z corresponding to the binary matrix D is expressed as the following equation (1), and this combination set is constructed as a ZDD.
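Equation (1) itself is not reproduced in this text. A form consistent with the symbol definitions given for it and with the example set {r1e1e3, r2e3, r3e2} would be the following; it is a reconstruction, not the published equation.

```latex
\[
Z \;=\; \bigcup_{g=1}^{N} \bigl\{\, r_g \, e_{a_g(1)} \, e_{a_g(2)} \cdots e_{a_g(b_g)} \,\bigr\}
\tag{1}
\]
```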

Here, a_g(l) denotes the index of the l-th element whose value is 1 among the components f_{g1}, ..., f_{gN} of the vector f_g of the g-th row, and b_g is the total number of elements whose value is 1 among f_{g1}, ..., f_{gN}. The indices satisfy 1 ≤ a_g(1) < ... < a_g(b_g) ≤ N. For example, the binary matrix D of FIG. 5 corresponds to the combination set {r1e1e3, r2e3, r3e2}. FIG. 4 shows this set represented as a ZDD.
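As a small illustration of this correspondence, the sketch below binarizes a column-normalized matrix and lists the terms of the combination set. The helper names `binarize` and `combination_set` are hypothetical, and the matrix values are an assumed example chosen so that the result matches the set quoted above.

```python
def binarize(a):
    """Replace every non-zero element with 1 to obtain the binary matrix D."""
    return [[1 if v != 0 else 0 for v in row] for row in a]


def combination_set(d):
    """Return one term r_g e_{a_g(1)} ... e_{a_g(b_g)} per row g of D."""
    terms = []
    for g, row in enumerate(d, start=1):
        cols = [j for j, v in enumerate(row, start=1) if v == 1]  # a_g(1) < ... < a_g(b_g)
        terms.append("r%d" % g + "".join("e%d" % j for j in cols))
    return terms


# Assumed column-normalized adjacency matrix
A = [[1.0, 0.0, 0.5],
     [0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
D = binarize(A)
terms = combination_set(D)
print(terms)  # ['r1e1e3', 'r2e3', 'r3e2']
```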

The constructed ZDD has several characteristic properties. First, the number of nodes of the ZDD is bounded above (the bounding expression is not reproduced in this text), and at most one node ever appears for each of the symbols r1, ..., rN. Second, for any two terms in the set, simplification by sharing a node occurs only at nodes corresponding to e1, ..., eN. In other words, structure is shared in the ZDD only when all of the symbols that come later in the ordering are common to both terms. In FIG. 4, the term corresponding to the first row of the binary matrix D of FIG. 5 is r1e1e3 and the term corresponding to the second row is r2e3, and the two terms share the node corresponding to e3.

The ZDD storage unit 24 stores an array such as that shown in FIG. 9, which represents as a one-dimensional array the ZDD of the binary matrix D corresponding to the adjacency matrix A, as constructed by the ZDD construction unit 22, together with the out-degree of the directed-graph node corresponding to each ZDD node.

The score array creation unit 26 creates a score array, whose elements hold the scores of the respective intermediate nodes, as a temporary storage area for intermediate results of the calculation, and stores it in the array storage unit 28. Holding the intermediate results requires a temporary area associated with each intermediate node. With each element being a K-dimensional vector, a one-dimensional array of as many elements (K-dimensional vectors) as there are intermediate nodes is created as the score array. This score array is denoted score.

The PPR score matrix creation unit 29 creates an N-row, K-column score matrix X for storing the PPR scores, corresponding to the input N-row, K-column matrix U formed from the personalized vectors for the K topics. The score matrix X has the same structure as the matrix U that aggregates the personalized vectors received by the input unit 10. Each element of X is initialized to 1/N, and X is stored in the matrix storage unit 32. With K personalized vectors to be calculated, both U and X are real matrices with N rows and K columns. Denoting their i-th column vectors by u_i and x_i, these represent the personalized vector for topic i and the corresponding scores, respectively.
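The initialization can be sketched as follows. The function names and the two-topic matrix U are hypothetical; the check that each column of U sums to 1 reflects the statement that the personalized-vector weights are probabilities.

```python
def check_personalized_matrix(u, tol=1e-9):
    """Verify that every column of the N x K matrix U sums to 1; return (N, K)."""
    n, k = len(u), len(u[0])
    for t in range(k):
        total = sum(u[j][t] for j in range(n))
        if abs(total - 1.0) > tol:
            raise ValueError("column %d sums to %r, expected 1" % (t + 1, total))
    return n, k


def init_score_matrix(n, k):
    """Create the N x K score matrix X with every element set to 1/N."""
    return [[1.0 / n for _ in range(k)] for _ in range(n)]


# Hypothetical personalized vectors for K = 2 topics over N = 3 pages
U = [[0.5, 0.0],
     [0.5, 0.0],
     [0.0, 1.0]]
N, K = check_personalized_matrix(U)
X = init_score_matrix(N, K)
print(X)
```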

The PPR calculation unit 30 updates the score matrix X according to the following equation (2), based on the matrix U that aggregates the personalized vectors received by the input unit 10, the score matrix X stored in the matrix storage unit 32, the ZDD of the binary matrix D obtained from the adjacency matrix A and stored in the ZDD storage unit 24, and the score array stored in the array storage unit 28. It repeats the update until a predetermined number of iterations V is reached or until a convergence condition is satisfied, and outputs the result to the output unit 50. For example, the convergence condition may be regarded as satisfied when the sum of the absolute values of the differences from the corresponding elements of the previous matrix X falls to or below a fixed value. Repeating a predetermined number of iterations and the convergence condition are both examples of the iteration end condition.

Here, X^(t) denotes the score matrix X at the t-th iteration, and c is an adjustment parameter satisfying c ∈ [0, 1].
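Equation (2) itself is not reproduced in this text. One form consistent with the surrounding description (a weighted sum of the product AX^(t−1) and the matrix U, controlled by c) would be the following. The assignment of c to the product term and (1 − c) to U is an assumption and may be the reverse in the original.

```latex
X^{(t)} = c\,A\,X^{(t-1)} + (1 - c)\,U, \qquad c \in [0, 1]
```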

The matrix product AX^(t−1) is computed by dynamic programming on the ZDD. Specifically, the PPR calculation unit 30 receives the ZDD of the binary matrix D corresponding to the adjacency matrix A, stored in the ZDD storage unit 24, and the score matrix X holding the PPR scores, stored in the matrix storage unit 32. Then, using the structure of that ZDD, it computes the score vector of each intermediate node, starting from node ID 3 and proceeding in ascending order of node ID, according to equation (3) below, weighting by the value of the corresponding element of the adjacency matrix A (the reciprocal of the out-degree of the corresponding node of the directed graph). Because the nodes with node ID 1 and 2 are terminal nodes, their score vectors are vectors whose elements are all 0. Each computed score vector is stored in the score array in the array storage unit 28 at the position associated with that intermediate node. FIG. 10 shows an example of computing the score vector of each intermediate node.

Here, p is a node ID value; w is the reciprocal of the out-degree of the directed-graph node corresponding to the node with node ID = p; and η(p) is a function that takes the label of the node with node ID = p and returns the subscript of that label. In the example of FIG. 9, η(3) = 3 and η(5) = 1. X^(t−1)_η(p) is the vector consisting of the elements of row η(p) of the N-row, K-column matrix X^(t−1), that is, the vector of elements of X^(t−1) to be multiplied by the element of the binary matrix D corresponding to the node with node ID = p. The score vector score[HI(p)] is obtained by reading from the score array stored in the array storage unit 28 the score vector for node ID = HI(p), the HI-side child node of the intermediate node being computed (node ID = p). Likewise, score[LO(p)] is obtained by reading the score vector for node ID = LO(p), the LO-side child node of that intermediate node.
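Equation (3) is likewise missing from this text. Going by the quantities named in the description and by the wording of the claims (the sum of the child nodes' score vectors and the weighted row of the score matrix), a plausible reconstruction is shown below. It is an assumption, not a transcription of the original formula.

```latex
\mathrm{score}[p] = \mathrm{score}[\mathrm{HI}(p)] + \mathrm{score}[\mathrm{LO}(p)] + w\,X^{(t-1)}_{\eta(p)}
```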

Once the score vectors of all intermediate nodes have been computed, the vectors AX^(t−1)_1 to AX^(t−1)_N are obtained by equation (4) below, and the N-row, K-column matrix AX^(t−1) formed by arranging these vectors as its rows is created and output as the result of the product operation.

Here, equation (4) is evaluated for each of the row nodes corresponding to the labels r1 to rN, so the vector AX^(t−1)_η(p) for each row node is the corresponding row of the matrix AX^(t−1). Each vector AX^(t−1)_η(p) is held in memory (not shown), and once it has been computed for all nodes corresponding to the labels r1 to rN, the vectors are combined into the N-row, K-column matrix AX^(t−1). The score matrix X^(t) is then computed from this matrix according to equation (2). When the computation of X^(t) has been performed the predetermined number of times V, or when the convergence condition is satisfied, the result is output to the output unit 50.
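The bottom-up pass described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the node table layout, the names `zdd_matvec`, `nodes`, and `out_degree`, the 0-based indices, and the three-node example graph are all hypothetical, and the intermediate-node update assumes equation (3) is the sum of both child scores plus the weighted row of X.

```python
def zdd_matvec(nodes, out_degree, X):
    """Compute A @ X by one pass over ZDD nodes in ascending node-ID order.

    nodes: dict node_id -> (kind, index, lo, hi), kind "E" (intermediate node,
           index = column of A) or "r" (row node, index = row of A).
           Node IDs 1 and 2 are the terminal nodes (assumed layout).
    """
    N, K = len(X), len(X[0])
    zero = [0.0] * K
    score = {1: zero, 2: zero}            # terminal nodes score all zeros
    AX = [[0.0] * K for _ in range(N)]
    for p in sorted(nodes):               # children always have smaller IDs
        kind, idx, lo, hi = nodes[p]
        if kind == "E":                   # assumed form of equation (3)
            w = 1.0 / out_degree[idx]
            score[p] = [score[hi][k] + score[lo][k] + w * X[idx][k]
                        for k in range(K)]
        else:                             # row node: read off one row of A @ X
            AX[idx] = list(score[hi])
    return AX


# Hypothetical graph with edges 0->1, 0->2, 1->2, 2->0 (0-based indices).
out_degree = [2, 1, 1]
nodes = {
    3: ("E", 0, 1, 2),   # column 0, both children terminal
    4: ("E", 2, 1, 2),   # column 2
    5: ("E", 1, 1, 3),   # column 1, HI child shares node 3
    6: ("r", 0, 1, 4),   # row 0 = {column 2}
    7: ("r", 1, 1, 3),   # row 1 = {column 0}
    8: ("r", 2, 1, 5),   # row 2 = {columns 0, 1}
}
X = [[0.2, 1.0], [0.3, 0.0], [0.5, 0.0]]
AX = zdd_matvec(nodes, out_degree, X)
```

In this example node 3 is shared by rows 1 and 2, so the weighted contribution of column 0 is computed once and reused, which is where the reduction in operations would come from.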

<Operation of the PPR arithmetic device>
Next, the operation of the PPR arithmetic device 100 according to the embodiment of the present invention will be described. First, the adjacency matrix A and the matrix U formed by collecting the personalized vectors are input through the input unit 10. The CPU then executes the program stored in the ROM of the PPR arithmetic device 100, whereby the PPR calculation processing routine shown in FIG. 11 is executed.

First, in step S100, the adjacency matrix A and the matrix U input through the input unit 10 are received.

Next, in step S102, a binary matrix D corresponding to the adjacency matrix A acquired in step S100 is created by replacing every non-zero element of A with 1, as shown in FIG. 5. The binary matrix D is then converted into a ZDD data structure comprising a terminal node for each of the values 0 and 1, a row node for each row of D, and an intermediate node for each non-zero element of D, and the ZDD is stored in the ZDD storage unit 24 together with the out-degree of each node of the directed graph.
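The first half of step S102 can be illustrated with a short Python sketch. The function name `binarize_adjacency` and the example matrix are hypothetical. The sketch assumes the convention given in the claims, where the (j, i) element of A is non-zero exactly when an edge runs from node i to node j, so the out-degree of node i equals the number of non-zero entries in column i. ZDD construction itself is not shown.

```python
def binarize_adjacency(A):
    """Return the binary matrix D and the out-degree of each graph node."""
    n = len(A)
    # Replace every non-zero element of A with 1.
    D = [[1 if A[j][i] != 0 else 0 for i in range(n)] for j in range(n)]
    # Column i of D lists the edges leaving node i.
    out_degree = [sum(D[j][i] for j in range(n)) for i in range(n)]
    return D, out_degree


# Hypothetical graph with edges 0->1, 0->2, 1->2, 2->0 (0-based indices).
A = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
D, out_degree = binarize_adjacency(A)
```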

Next, in step S104, based on the ZDD constructed in step S102 and the matrix U received in step S100, a score array is created: a one-dimensional array with one element per intermediate node, each element being a K-dimensional vector.

Next, in step S105, based on the matrix U received in step S100, the N-row, K-column score matrix X for storing the PPR scores is created and stored in the matrix storage unit 32.

Next, in step S106, the PPR scores are calculated based on the ZDD constructed in step S102, the matrix U received in step S100, and the score matrix X stored in the matrix storage unit 32.

Step S106 is realized by the PPR score calculation processing routine shown in FIG. 12.

First, in step S202, each element of the score matrix X stored in the matrix storage unit 32 is initialized to 1/N to give the score matrix X^(0).

Next, in step S204, the variable t is set to its initial value of 1.

Next, in step S206, the matrix AX^(t−1) is calculated based on the ZDD constructed in step S102 and the score matrix X stored in the matrix storage unit 32.

Step S206 is realized by the matrix AX^(t−1) calculation processing routine shown in FIG. 13.

First, in step S300, the variable p is set to its initial value of 3.

Next, in step S302, it is determined whether the label of the node with node ID = p is Eo (o = 1, ..., N). If the label is Eo, the node is judged to be an intermediate node and the process proceeds to step S304; otherwise, the node is judged to be a row node and the process proceeds to step S310.

Next, in step S304, the score vector score[HI(p)] associated with node ID = HI(p), the HI-side child node of the node with node ID = p, and the score vector score[LO(p)] associated with node ID = LO(p), its LO-side child node, are read from the score array stored in the array storage unit 28. The score vector of the intermediate node with node ID = p is then calculated according to equation (3) from score[HI(p)], score[LO(p)], the out-degree of the directed-graph node corresponding to the node with node ID = p (stored in the ZDD storage unit 24), and the vector X_η(p) corresponding to the subscript η(p) of the label of that node.

Next, in step S306, it is determined whether the value of the variable p is smaller than the maximum node number Y. If p is smaller than Y, the process proceeds to step S308; if p is equal to or greater than Y, the process proceeds to step S312.

Next, in step S308, the variable p is incremented by 1.

Next, in step S310, for the row node with node ID = p, the score vector of its HI-side child node (an intermediate node), stored in the score array in association with node ID = HI(p), is acquired. The subscript η(p) of the label of the row node is then obtained, and the acquired score vector is taken as the vector AX^(t−1)_η(p), that is, row η(p) of the N-row, K-column matrix that is the result of the product operation. Step S310 is repeated for all row nodes.

Next, in step S312, based on the vectors AX^(t−1)_1 to AX^(t−1)_N acquired in step S310, the N-row, K-column matrix AX^(t−1) formed by arranging these vectors as its rows is output.

In step S208 of FIG. 12, the score matrix X^(t) is calculated according to equation (2) based on the matrix AX^(t−1) acquired in step S312 and the matrix U received in step S100.

Next, in step S210, it is determined whether the value of the variable t is smaller than the prescribed number of iterations V. If t is less than V, the process proceeds to step S212; if t is equal to or greater than V, the process proceeds to step S214.

In step S212, the variable t is incremented by 1.

Next, in step S214, X^(t) calculated in step S208 is output to the output unit 50 as the PPR score calculation result, and the process ends.
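The loop of steps S202 to S214 can be sketched as follows. This is an illustration under assumptions rather than the routine of FIG. 12: the names `matmul` and `ppr_scores`, the default c = 0.85, and the example matrices are hypothetical; the update assumes equation (2) has the form X^(t) = c·AX^(t−1) + (1 − c)·U; and a plain dense product stands in for the ZDD-based product of step S206.

```python
def matmul(A, X):
    """Dense product A @ X on nested lists (stand-in for the ZDD-based product)."""
    n, k = len(X), len(X[0])
    return [[sum(A[j][i] * X[i][col] for i in range(n)) for col in range(k)]
            for j in range(len(A))]


def ppr_scores(A, U, c=0.85, V=100, tol=None):
    """Iterate X <- c*A*X + (1-c)*U starting from X = 1/N (assumed equation (2))."""
    N, K = len(U), len(U[0])
    X = [[1.0 / N] * K for _ in range(N)]                 # S202: X^(0)
    for t in range(1, V + 1):                             # S204, S210, S212
        AX = matmul(A, X)                                 # S206
        X_new = [[c * AX[j][k] + (1.0 - c) * U[j][k]      # S208
                  for k in range(K)] for j in range(N)]
        # Sum of absolute element-wise differences from the previous X.
        diff = sum(abs(X_new[j][k] - X[j][k])
                   for j in range(N) for k in range(K))
        X = X_new
        if tol is not None and diff <= tol:               # optional convergence test
            break
    return X                                              # S214


# Hypothetical column-stochastic adjacency matrix and two unit personalized vectors.
A = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
U = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 1.0]]
X = ppr_scores(A, U, c=0.85, V=300)
```

Passing a value for `tol` enables the convergence condition described earlier (sum of absolute differences at or below a fixed value) in addition to the iteration cap V.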

As described above, in the PPR arithmetic device according to the embodiment of the present invention, for the product of the adjacency matrix A and the score matrix X within the PPR score computation, the binary matrix corresponding to A is converted into a ZDD and the product is computed on that ZDD, which reduces the number of operations required to compute the PPR scores.

Reducing the number of operations required to compute Personalized PageRank also speeds up the processing.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, each personalized vector may be a unit vector in which a single element is 1 and all other elements are 0.
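As a small illustration of this variation, the matrix U for unit personalized vectors could be built as below. The helper name `unit_personalized_matrix` and the `seeds` parameter (one 0-based page index per topic) are hypothetical.

```python
def unit_personalized_matrix(N, seeds):
    """Build an N-row, K-column matrix whose k-th column is 1 at seeds[k], else 0."""
    return [[1.0 if j == s else 0.0 for s in seeds] for j in range(N)]


# Two topics over three pages: topic 0 is seeded at page 0, topic 1 at page 2.
U = unit_personalized_matrix(3, [0, 2])
```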

DESCRIPTION OF SYMBOLS
10 Input unit
20 Calculation unit
22 ZDD construction unit
24 ZDD storage unit
26 Score array creation unit
28 Array storage unit
29 Score matrix creation unit
30 PPR calculation unit
32 Matrix storage unit
34 Array storage unit
50 Output unit
100 PPR arithmetic device

Claims

A PPR arithmetic device that calculates an N-row, K-column score matrix representing PPR (Personalized PageRank) scores, based on an N-row, N-column adjacency matrix obtained for a directed graph representing relationships among N pages and an N-row, K-column matrix representing, for each of K topics, a previously prepared N-dimensional personalized vector storing a weight for each page, the device comprising:
initialization means for initializing all elements of the score matrix;
PPR score calculation means for calculating, as the score matrix, a weighted sum of (i) the product of the adjacency matrix and either the score matrix initialized by the initialization means or the previously calculated score matrix and (ii) the matrix representing the personalized vectors for the K topics; and
iteration determination means for repeating the calculation by the PPR score calculation means until a predetermined iteration end condition is satisfied,
wherein the PPR score calculation means includes:
ZDD construction means for creating a binary matrix corresponding to the adjacency matrix by replacing the value of each non-zero element of the adjacency matrix with 1, and converting the created binary matrix into a zero-suppressed binary decision diagram (ZDD) including terminal nodes indicating values, row nodes corresponding to the respective rows of the binary matrix, and intermediate nodes corresponding to the respective non-zero elements of the binary matrix; and
score calculation means for calculating a K-dimensional score vector for each of the intermediate nodes in order from the terminal nodes toward the row nodes of the ZDD, the score vector of each intermediate node being calculated as the sum of the score vectors already calculated for the intermediate nodes that are its child nodes and the product of the K-dimensional vector of the score matrix to be multiplied by the element of the binary matrix corresponding to that intermediate node and the element of the adjacency matrix corresponding to that intermediate node,
and wherein, for each of the row nodes, the PPR score calculation means acquires the score vector calculated by the score calculation means for the intermediate node that is a child node of the row node, creates an N-row, K-column matrix as the result of the product of the adjacency matrix and the score matrix, and calculates a weighted sum of the created N-row, K-column matrix and the matrix representing the personalized vectors for the K topics.

The PPR arithmetic device according to claim 1, wherein the directed graph represents link relationships among the N pages, and
the (j, i) element of the adjacency matrix is 1/(the out-degree of the i-th node) when the directed graph has an edge starting at the i-th node and ending at the j-th node, and is 0 when the directed graph has no such edge.

A PPR calculation method in a PPR arithmetic device that includes initialization means, PPR score calculation means, and iteration determination means and that calculates an N-row, K-column score matrix representing PPR (Personalized PageRank) scores, based on an N-row, N-column adjacency matrix obtained for a graph representing relationships among N pages and an N-row, K-column matrix representing, for each of K topics, a previously prepared N-dimensional personalized vector storing a weight for each page, the method comprising:
a step in which the initialization means initializes all elements of the score matrix;
a step in which the PPR score calculation means calculates, as the score matrix, a weighted sum of (i) the product of the adjacency matrix and either the score matrix initialized by the initialization means or the previously calculated score matrix and (ii) the matrix representing the personalized vectors for the K topics; and
a step in which the iteration determination means repeats the calculation by the PPR score calculation means until a predetermined iteration end condition is satisfied,
wherein the step of calculating by the PPR score calculation means includes:
a step in which ZDD construction means creates a binary matrix corresponding to the adjacency matrix by replacing the value of each non-zero element of the adjacency matrix with 1, and converts the created binary matrix into a zero-suppressed binary decision diagram (ZDD) including terminal nodes indicating values, row nodes corresponding to the respective rows of the binary matrix, and intermediate nodes corresponding to the respective non-zero elements of the binary matrix;
a step in which score calculation means calculates a K-dimensional score vector for each of the intermediate nodes in order from the terminal nodes toward the row nodes of the ZDD, the score vector of each intermediate node being calculated as the sum of the score vectors already calculated for the intermediate nodes that are its child nodes and the product of the K-dimensional vector of the score matrix to be multiplied by the element of the binary matrix corresponding to that intermediate node and the element of the adjacency matrix corresponding to that intermediate node; and
a step of acquiring, for each of the row nodes, the score vector calculated by the score calculation means for the intermediate node that is a child node of the row node, creating an N-row, K-column matrix as the result of the product of the adjacency matrix and the score matrix, and calculating a weighted sum of the created N-row, K-column matrix and the matrix representing the personalized vectors for the K topics.

The PPR calculation method according to claim 3, wherein the directed graph represents link relationships among the N pages, and
the (j, i) element of the adjacency matrix is 1/(the out-degree of the i-th node) when the directed graph has an edge starting at the i-th node and ending at the j-th node, and is 0 when the directed graph has no such edge.

A program for causing a computer to function as each of the means constituting the PPR arithmetic device according to claim 1 or 2.