JP5903376B2 - Information recommendation device, information recommendation method, and information recommendation program

Info

Publication number: JP5903376B2
Application number: JP2012270710A
Authority: JP
Inventors: 恭太堤田; 浩之戸田; 内山　匡; 匡内山
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-12-11
Filing date: 2012-12-11
Publication date: 2016-04-13
Anticipated expiration: 2032-12-11
Also published as: JP2014115911A

Description

The present invention relates to an information recommendation technique that recommends items by handling, across the board, the content and items (hereinafter collectively called items) of a plurality of services and domains (hereinafter collectively called domains).

As a conventional technique for recommending items across a plurality of domains, there is a method that achieves more accurate information recommendation by associating the metadata assigned to items in the different domains with one another (see, for example, Patent Document 1).

In addition, for a single domain, there is research that improves recommendation accuracy as follows. The user-item matrix, which represents the relationship between users and their item purchase and usage histories, is clustered to find a plurality of submatrices, each consisting of a user set and an item set with similar tendencies. A general collaborative filtering technique is then applied using only the submatrix to which the recommended user belongs (see, for example, Non-Patent Document 1).

Patent Document 1: JP 2012-150561 A

Non-Patent Document 1: Bin Xu, Jiajun Bu, Chun Chen, Deng Cai: An Exploration of Improving Collaborative Recommender Systems via User-Item Subgroups, Proceedings of the WWW2012, 2012.

The method of Patent Document 1 makes information recommendation across a plurality of domains possible. However, if the adjacency matrix representing the graph constructed by the recommendation system contains data that does not contribute to recommendation accuracy, the accuracy decreases.

The method of Non-Patent Document 1 can be expected to improve the accuracy of a recommendation system that handles a single domain. When clustering is applied to a plurality of domains, however, clusters tend to form within each domain. As a result, many items cannot be recommended across domains, and recommendation accuracy decreases.

To solve the above problem of Patent Document 1, one could cluster the adjacency matrix representing the graph as in Non-Patent Document 1 and use a submatrix consisting only of the item set belonging to a cluster strongly associated with the recommended user. Doing so removes items that do not contribute to recommendation accuracy, since such items tend to fall into different clusters. As noted above, however, clustering may drop the information that links domains, which may in turn reduce recommendation accuracy.

The present invention has been made in view of the above points, and its object is to provide an information recommendation technique that can recommend items across a plurality of domains with high accuracy.

To solve the above problem, the present invention is configured as an information recommendation device that recommends items to a recommended user, the device comprising:
submatrix extraction means that refers to an item history database containing user IDs, item IDs, and domain IDs, performs constrained clustering of items using the domain ID information as a constraint, acquires the item set belonging to a cluster strongly associated with the recommended user, and extracts a submatrix based on that item set; and
recommended item determination means that creates a partial adjacency matrix from the submatrix, calculates the degree of association between the recommended user and each item using the partial adjacency matrix, and determines the recommended items based on the calculated degrees of association.

The submatrix extraction means can be configured to perform the constrained clustering by evaluating the similarity of a pair of items belonging to different domains as larger than it would be without the constraint.

For example, the submatrix extraction means obtains a constraint vector in which, for each pair of items in the item history database, a pair with different domain IDs is set to 1 and a pair with the same domain ID is set to 0, and performs the constrained clustering using that constraint vector.

The recommended item determination means may be configured to calculate the degree of association between the recommended user and each item using the RWR (Random Walk with Restart) algorithm.

The present invention can also be configured as an information recommendation method executed by an information recommendation device that recommends items to a recommended user, and as an information recommendation program that causes a computer to function as each means of the information recommendation device.

According to the present invention, constrained clustering is performed so as to retain the information needed to span a plurality of domains. This makes it possible to extract and use, from the items of a plurality of domains, only the data that contributes to recommendation accuracy. Items can therefore be recommended across a plurality of domains with higher accuracy.

FIG. 1 is a configuration diagram of the information recommendation device in one embodiment of the present invention.
FIG. 2 is an image of the table of the item history database in one embodiment of the present invention.
FIG. 3 is a flowchart of the processing of the information recommendation device in one embodiment of the present invention.
FIG. 4 is an image of the submatrix in one embodiment of the present invention.
FIG. 5 is a flowchart of the constraint vector calculation used in the submatrix extraction process in one embodiment of the present invention.
FIG. 6 is a flowchart of the submatrix extraction process in one embodiment of the present invention.
FIG. 7 is a flowchart of the submatrix normalization process in one embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings. The embodiment described below is only an example, and the embodiments to which the present invention applies are not limited to it.

(Device configuration)
FIG. 1 is a configuration diagram of the information recommendation device in one embodiment of the present invention.

The information recommendation device shown in the figure has a recommended user ID acquisition unit 110, an item history database 120, a submatrix extraction unit 130, a submatrix normalization processing unit 140, a recommended item prediction processing unit 150, and an item presentation unit 160. Each functional unit is outlined below.

The recommended user ID acquisition unit 110 acquires the recommended user ID and outputs it to the item history database 120 and the recommended item prediction processing unit 150.

The item history database 120 constitutes storage means that stores the item history table. FIG. 2 shows an image of the table of the item history database in one embodiment of the present invention. As shown in FIG. 2, the item history database 120 is a database in which the user ID, the item ID, the domain ID, and the evaluation value given to the item can each be looked up.
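As an illustration only, the four columns of the item history table could be modeled in Python as below. The record type, the field names, the sample rows, and the two lookup helpers are hypothetical and are not taken from FIG. 2.

```python
from typing import NamedTuple


class HistoryRecord(NamedTuple):
    """One row of the item history table (field names are assumptions)."""
    user_id: str
    item_id: str
    domain_id: str
    rating: float


# Hypothetical sample rows spanning two domains.
HISTORY = [
    HistoryRecord("u1", "book_a", "books", 5.0),
    HistoryRecord("u1", "movie_x", "movies", 4.0),
    HistoryRecord("u2", "book_a", "books", 3.0),
    HistoryRecord("u2", "movie_y", "movies", 2.0),
]


def items_of_user(history, user_id):
    """Return the item IDs associated with one user, in first-seen order."""
    seen = []
    for rec in history:
        if rec.user_id == user_id and rec.item_id not in seen:
            seen.append(rec.item_id)
    return seen


def domain_of_item(history):
    """Map each item ID to its domain ID."""
    return {rec.item_id: rec.domain_id for rec in history}
```

The later processing steps need exactly these lookups: the items linked to the recommended user, and the domain of each item.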

The submatrix extraction unit 130 in FIG. 1 calculates the submatrix A, described later, using the data in the item history database 120. The submatrix normalization processing unit 140 normalizes the submatrix A and creates a partial adjacency matrix. The recommended item prediction processing unit 150 determines the items to recommend based on the recommended user ID, the list of item IDs associated with the recommended user ID, and the partial adjacency matrix. The item presentation unit 160 presents the recommended items output by the recommended item prediction processing unit 150. The submatrix normalization processing unit 140 and the recommended item prediction processing unit 150 may together be called the recommended item determination means.

The information recommendation device in the present embodiment can be realized, for example, by having one or more computers execute a program that describes the processing explained in this embodiment. That is, the functions of each unit of the information recommendation device (the recommended user ID acquisition unit, item history database, submatrix extraction unit, submatrix normalization processing unit, recommended item prediction processing unit, and item presentation unit) can be realized by executing programs corresponding to the processing of each unit, using hardware resources such as the CPU, memory, and hard disk built into the computer that constitutes the device. More specifically, the processing proceeds by repeating the following operation under program control: the data to be processed is read from memory, the CPU performs the computation, and the result is stored in memory. The program can be recorded on a computer-readable recording medium (portable memory or the like) for storage or distribution. The program can also be provided over a network, for example via the Internet or electronic mail.

The item history database 120 need not reside inside the information recommendation device. For example, the item history database 120 may be hosted on an external database server, and the information recommendation device may obtain the necessary data by accessing that server.

(Device operation)
FIG. 3 is a flowchart of the processing of the information recommendation device in one embodiment of the present invention. The process by which the information recommendation device shown in FIG. 1 presents recommended items to the recommended user is described along the flowchart of FIG. 3. Where appropriate, each step shown in FIG. 3 is explained in more detail with reference to further figures.

Step 210) First, as the recommended user ID acquisition process, the recommended user ID acquisition unit 110 acquires the ID of the recommended user and outputs it to the item history database 120 and the recommended item prediction processing unit 150. This process is performed, for example, at the same time as the user logs in to the service or system being used.

Step 220) The submatrix extraction unit 130 receives the user ID, item ID, domain ID, and evaluation value data from the item history database 120, generates the submatrix A, and outputs it to the submatrix normalization processing unit 140. FIG. 4 shows an image of the submatrix A in one embodiment of the present invention. As described later, the submatrix A is a matrix whose valid indices are the item IDs contained in the cluster that, among the item clusters obtained by constrained clustering, has the smallest distance to the vector of the recommended user.

An example procedure for constructing the submatrix A in one embodiment is described below in more detail using the flowcharts of FIGS. 5 and 6.

As shown in FIG. 5, first, the full user set $U = \{u_1, u_2, \ldots, u_{|U|}\}$ and the full item set $I = \{i_1, i_2, \ldots, i_{|I|}\}$ output from the item history database 120 are collected (step 510).

Next, the constraint vector $W = (w_{i,j})$ used in constrained clustering is obtained. Specifically, for every pair of items $i, j$ ($i \neq j$) in the full item set, $w_{i,j}$ is set to 1 if the two items have different domain IDs in the item history database 120, and to 0 otherwise (steps 530 to 537).
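The constraint vector calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `constraint_vector` and the dictionary representation keyed by item pairs are assumptions.

```python
from itertools import combinations


def constraint_vector(item_domains):
    """Build w[(i, j)] for every pair of distinct items.

    item_domains: dict mapping item ID -> domain ID (hypothetical layout).
    The value is 1 when the two items belong to different domains, else 0.
    Both (i, j) and (j, i) are stored so lookups work in either order.
    """
    w = {}
    for i, j in combinations(sorted(item_domains), 2):
        value = 1 if item_domains[i] != item_domains[j] else 0
        w[(i, j)] = value
        w[(j, i)] = value
    return w
```

For example, with two books and one movie, the book-movie pairs receive 1 and the book-book pair receives 0.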

Next, constrained clustering is performed using the constraint vector $W = (w_{i,j})$ created as described above, and the submatrix A is obtained. This process is described with reference to FIG. 6.

First, for every item, an item vector $x_i$ is prepared whose elements are the averages of each user's ratings of that item in the item history database 120 (step 610). K clusters C are prepared for the item vectors (step 620). The clusters are initialized by selecting K of the $x_i$ at random. The parameter K, which determines the number of clusters in the clustering result, takes a value such as K = 10 or K = 50. It is chosen according to the scale and use of the data: a larger value is used when the data is large or when finer differences between items matter.
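A minimal Python sketch of steps 610 and 620 follows. It assumes that the history is available as `(user_id, item_id, rating)` tuples, that a user with no rating for an item contributes 0.0, and that each initial cluster holds a single randomly chosen item; none of these details are specified in the text, and the function names are hypothetical.

```python
import random
from collections import defaultdict


def item_vectors(history, users):
    """Return {item_id: vector}, one element per user in `users`.

    Each element is the average of that user's ratings of the item;
    0.0 for users without a rating is an assumption of this sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    items = set()
    for user_id, item_id, rating in history:
        totals[(item_id, user_id)] += rating
        counts[(item_id, user_id)] += 1
        items.add(item_id)
    vectors = {}
    for item in sorted(items):
        vectors[item] = [
            totals[(item, u)] / counts[(item, u)] if (item, u) in counts else 0.0
            for u in users
        ]
    return vectors


def initial_clusters(vectors, k, seed=0):
    """Pick k distinct items at random as the initial clusters."""
    rng = random.Random(seed)
    return [[item] for item in rng.sample(sorted(vectors), k)]
```

The fixed `seed` only makes the sketch reproducible; the text asks for a random choice.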

To cluster by merging $x_i$ into its nearest cluster $C_k$, the neighborhood calculation uses the constraint vector $w_{i,j}$ and selects the cluster that maximizes

$$\lambda_1 \sum_{x_j \in C_k} w_{i,j}\,\mathrm{Sim}(x_i, x_j) + \lambda_2 \sum_{x_j \in C_k} \mathrm{Sim}(x_i, x_j)$$

Here, $k$ ($1 \le k \le K$) is the index that identifies the cluster C, and $\mathrm{Sim}(x_i, x_j)$ denotes the similarity between the vectors $x_i$ and $x_j$, for which any measure such as the Euclidean distance or the Jaccard coefficient can be used. With this expression, clustering can be performed while evaluating the similarity of a pair of items belonging to different domains as larger than it would be without the constraint.
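The cluster selection rule above can be sketched as follows. Cosine similarity is used for Sim purely as an assumption (the text allows any measure), and the function names, the dictionary layout of `w`, and the default weights are illustrative.

```python
import math


def cosine(x, y):
    """Cosine similarity; one possible choice for Sim (an assumption)."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


def cluster_score(i, cluster, vectors, w, lam1=0.85, lam2=0.15):
    """lam1 * sum(w_ij * Sim) + lam2 * sum(Sim) over the cluster members."""
    constrained = 0.0
    plain = 0.0
    for j in cluster:
        if j == i:
            continue  # w is defined only for i != j
        s = cosine(vectors[i], vectors[j])
        constrained += w[(i, j)] * s
        plain += s
    return lam1 * constrained + lam2 * plain


def nearest_cluster(i, clusters, vectors, w, lam1=0.85, lam2=0.15):
    """Index of the cluster with the largest score for item i."""
    scores = [cluster_score(i, c, vectors, w, lam1, lam2) for c in clusters]
    return max(range(len(clusters)), key=scores.__getitem__)
```

With identical item vectors, an item is pulled toward the cluster containing an item from another domain, because only that pair contributes to the first term.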

$\lambda_1$ and $\lambda_2$ are parameters that control the relative influence of the constrained similarity and the unconstrained similarity; real numbers such as $\lambda_1 = 0.85$ and $\lambda_2 = 0.15$ are used. The sum of $\lambda_1$ and $\lambda_2$ need not be 1. When $\lambda_1$ is large, as with $\lambda_1 = 0.85$ and $\lambda_2 = 0.15$, the constrained similarity has more influence, and clusters containing items from different domains form more readily. Using a submatrix generated from clusters calculated in this way makes accurate recommendation across a plurality of domains possible.

The constrained clustering process above is performed for all item vectors, yielding K clusters C as its result (steps 620 to 622).

Further, using the item ID set $I_a$ associated with the recommended user $a$ in the item history database 120, the vector $q_a$ of the recommended user is obtained, with each element $q_{a,j}$ given by

$$q_{a,j} = \frac{1}{|I_a|} \sum_{i \in I_a} \frac{x_{i,j}}{\sum_{j'} x_{i,j'}}$$

(step 630). In this expression, $x_{i,j}$ is the $j$-th element of the item vector $x_i$. Dividing by $\sum_{j'} x_{i,j'}$ normalizes $x_i$ so that its elements sum to 1.
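The user vector of step 630 can be sketched as below: each item vector of the user is scaled so that its elements sum to 1, and the scaled vectors are averaged. The function name is hypothetical, and skipping all-zero item vectors (to avoid division by zero) is an assumption not covered by the text.

```python
def user_vector(user_items, vectors):
    """Average of the sum-normalized item vectors of one user.

    user_items: item IDs associated with the user (I_a).
    vectors: dict item ID -> item vector x_i.
    """
    dim = len(next(iter(vectors.values())))
    q = [0.0] * dim
    for i in user_items:
        total = sum(vectors[i])
        if total == 0:
            continue  # assumption: an all-zero vector contributes nothing
        for j, value in enumerate(vectors[i]):
            q[j] += value / total
    n = len(user_items)
    return [v / n for v in q] if n else q
```

For items with vectors [1, 3] and [2, 2], the normalized vectors are [0.25, 0.75] and [0.5, 0.5], so the user vector is [0.375, 0.625].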

Finally, among the K clusters, the cluster $C_k$ with the smallest vector distance to $q_a$ is selected, and the matrix whose valid indices are the item IDs contained in that cluster is output as the submatrix $A = (a_{i,j})$ (step 640). That is, the item ID set belonging to the cluster strongly associated with the recommended user is acquired, and the submatrix is extracted based on that item ID set. Any measure, such as the Euclidean distance or the Jaccard coefficient, can be used for the distance between vectors. Each matrix element $a_{i,j}$ is set to $\mathrm{Sim}(x_i, x_j)$ as defined above.
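Step 640 can be sketched as follows. The text does not say how a cluster is represented as a vector, so this sketch assumes the centroid (mean of the member item vectors) and the Euclidean distance; the similarity function is passed in by the caller, and all names are illustrative.

```python
import math


def centroid(cluster, vectors):
    """Mean of the member item vectors (cluster representation is assumed)."""
    dim = len(vectors[cluster[0]])
    return [sum(vectors[i][d] for i in cluster) / len(cluster) for d in range(dim)]


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def closest_cluster(q, clusters, vectors):
    """Cluster whose centroid is nearest to the user vector q."""
    return min(clusters, key=lambda c: euclidean(q, centroid(c, vectors)))


def submatrix(cluster, vectors, sim):
    """Return (item_ids, A) with A[r][c] = sim(x_i, x_j) over the cluster."""
    items = sorted(cluster)
    a = [[sim(vectors[i], vectors[j]) for j in items] for i in items]
    return items, a
```

The returned `items` list records which item ID each row and column of A refers to, which is needed later to map relevance scores back to items.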

Step 230) The submatrix normalization processing unit 140 normalizes the submatrix. This normalization process is described with reference to the flowchart of FIG. 7.

In the normalization process, first, the submatrix $A = (a_{j,k})$ output from the submatrix extraction unit 130 is received (step 710).

Next, each element $a_{j,k}$ of the submatrix A is updated to its value divided by the sum of the values in its column:

$$a_{j,k} \leftarrow \frac{a_{j,k}}{\sum_{g} a_{g,k}}$$

where $g$ runs over the rows of A. The matrix obtained after updating all elements is output as the partial adjacency matrix A (steps 720 to 740). In other words, the partial adjacency matrix A is obtained by normalizing the submatrix A so that the values in each column sum to 1.
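The column normalization can be sketched as below. The column sums are computed once from the original values before any element is changed; leaving an all-zero column as zeros is an assumption, since the text does not cover that case.

```python
def normalize_columns(a):
    """Return a copy of matrix `a` whose columns each sum to 1.

    a: list of rows (list of lists of floats).
    """
    rows = len(a)
    cols = len(a[0]) if a else 0
    col_sums = [sum(a[g][k] for g in range(rows)) for k in range(cols)]
    return [
        [a[j][k] / col_sums[k] if col_sums[k] else 0.0 for k in range(cols)]
        for j in range(rows)
    ]
```

For the matrix [[1, 1], [3, 1]] the column sums are 4 and 2, giving [[0.25, 0.5], [0.75, 0.5]].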

Step 240) The recommended item prediction processing unit 150 receives the recommended user ID from the recommended user ID acquisition unit 110, the list of item IDs associated with the recommended user ID from the item history database 120, and the partial adjacency matrix from the submatrix normalization processing unit 140, and determines the items to recommend.

This calculation uses Random Walk with Restart (RWR), which obtains the degree of association between the acquired recommended user ID and each item ID and recommends the items with high values. Specifically, let $p^{(t)}$ be the column vector of item relevance at calculation step $t$, $\alpha$ a constant satisfying $0 < \alpha < 1$, $A$ the partial adjacency matrix, and $q$ the column vector in which the entries for the $n$ items associated with the recommended user in the item history database 120 have the value $1/n$. The calculation is then expressed by the following equation.

$$p^{(t+1)} = (1 - \alpha) A p^{(t)} + \alpha q$$

The calculation is repeated $t$ times on the partial adjacency matrix A, updating $p^{(t)}$ each time, and the item nodes with high relevance in the resulting column vector $p^{(t)}$ are output as the recommended items. Here $t$ is, for example, 30. The number of iterations is not limited to 30; any sufficiently large value will do.
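The RWR iteration can be sketched in plain Python as below. Starting from $p^{(0)} = q$, setting the entries of $q$ for all other items to 0, and the default $\alpha = 0.15$ are assumptions of this sketch; the text only requires $0 < \alpha < 1$. The helper `top_items` and its signature are likewise illustrative.

```python
def random_walk_with_restart(a, q, alpha=0.15, steps=30):
    """Iterate p <- (1 - alpha) * A p + alpha * q for a fixed number of steps.

    a: column-normalized partial adjacency matrix (list of rows).
    q: restart vector (1/n for the user's n items, 0 elsewhere is assumed).
    """
    n = len(q)
    p = list(q)  # assumption: start the walk from the restart vector
    for _ in range(steps):
        p = [
            (1 - alpha) * sum(a[j][k] * p[k] for k in range(n)) + alpha * q[j]
            for j in range(n)
        ]
    return p


def top_items(items, p, count):
    """Item IDs sorted by descending relevance, truncated to `count`."""
    ranked = sorted(zip(items, p), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:count]]
```

Because A is column-normalized and $q$ sums to 1, the relevance vector keeps summing to 1 at every step, so the scores can be compared directly across items.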

Step 250) As the recommended item presentation process, the item presentation unit 160 presents the recommended items output by the recommended item prediction processing unit 150, and the recommendation is complete.

The present invention is not limited to the above embodiment, and various modifications and applications are possible within the scope of the claims.

110 Recommended user ID acquisition unit
120 Item history database
130 Submatrix extraction unit
140 Submatrix normalization processing unit
150 Recommended item prediction processing unit
160 Item presentation unit

Claims

1. An information recommendation device that recommends items to a recommended user, comprising:
submatrix extraction means that refers to an item history database containing user IDs, item IDs, and domain IDs, performs constrained clustering of items using the domain ID information as a constraint, acquires the item set belonging to a cluster strongly associated with the recommended user, and extracts a submatrix based on that item set; and
recommended item determination means that creates a partial adjacency matrix from the submatrix, calculates the degree of association between the recommended user and each item using the partial adjacency matrix, and determines the recommended items based on the calculated degrees of association.

2. The information recommendation device according to claim 1, wherein the submatrix extraction means performs the constrained clustering by evaluating the similarity of a pair of items belonging to different domains as larger than it would be without the constraint.

3. The information recommendation device according to claim 2, wherein the submatrix extraction means obtains a constraint vector in which, for each pair of items in the item history database, a pair with different domain IDs is set to 1 and a pair with the same domain ID is set to 0, and performs the constrained clustering using the constraint vector.

4. The information recommendation device according to any one of claims 1 to 3, wherein the recommended item determination means calculates the degree of association between the recommended user and each item using an RWR (Random Walk with Restart) algorithm.

5. An information recommendation method executed by an information recommendation device that recommends items to a recommended user, comprising:
a submatrix extraction step of referring to an item history database containing user IDs, item IDs, and domain IDs, performing constrained clustering of items using the domain ID information as a constraint, acquiring the item set belonging to a cluster strongly associated with the recommended user, and extracting a submatrix based on that item set; and
a recommended item determination step of creating a partial adjacency matrix from the submatrix, calculating the degree of association between the recommended user and each item using the partial adjacency matrix, and determining the recommended items based on the calculated degrees of association.

6. An information recommendation program for causing a computer to function as each means of the information recommendation device according to any one of claims 1 to 4.