JP4934058B2

JP4934058B2 - Co-clustering apparatus, co-clustering method, co-clustering program, and recording medium recording the program

Info

Publication number: JP4934058B2
Application number: JP2008002218A
Authority: JP
Inventors: 修平桑田; 武士山田; 修功上田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2008-01-09
Filing date: 2008-01-09
Publication date: 2012-05-16
Anticipated expiration: 2028-01-09
Also published as: JP2009163615A

Description

The present invention relates to co-clustering, that is, clustering users and items simultaneously, based on a nonparametric Bayesian model (a Dirichlet process mixture model). The technique applies to item selection information (such as purchase history data) that indicates whether each user has selected each item, or how often.

In recent years information has become increasingly sophisticated and diverse, and techniques for efficiently extracting the necessary information from vast amounts of data are in demand. In this environment, for example, the spread of the Internet has made it easy to buy products from anywhere in the world and user preferences have diversified, so it is becoming ever more important for companies to understand user needs in detail.

One approach to understanding user needs is user clustering (segmentation), which divides the users into several user groups with similar purchase histories, based on the item purchase histories (purchase history data) of multiple users. Similarly, item clustering divides the items into item groups that tend to be purchased by the same users. With user clustering, finding what the users in one user group have in common helps in understanding user needs and preferences. With item clustering, finding what the items in one item group have in common helps in developing new products.

Beyond clustering only users or only items, co-clustering has also been considered. Co-clustering is an approach in which, when one of users and items is clustered, the clustering result for the other is used in turn, so as to obtain user-item blocks (hereinafter also simply "blocks") determined by a user class and an item class, and thereby to obtain more useful information for understanding consumer behavior.

FIG. 10 shows an example of co-clustering purchase history data. Consider purchase history data recording whether each of N users has purchased each of M items. The data are represented as a matrix R, as in the left panel of FIG. 10: the (i, j) element R_{i,j} of the N × M matrix R is 1 if user i (i = 1, 2, ..., N) has purchased item j (j = 1, 2, ..., M) and 0 otherwise.

The right panel of FIG. 10 shows the result of co-clustering the users and the items. The rows and columns have been sorted so that users in the same user class are adjacent and items in the same item class are adjacent. After co-clustering, the 1s and the 0s are each concentrated locally, and the matrix is visibly organized into blocks. The sorting always moves whole users (rows) or whole items (columns).
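The reordering described for FIG. 10 can be sketched as follows. This is a minimal illustration, not the patented procedure: the function name `sort_into_blocks`, the toy matrix, and the class labels are all hypothetical, and the class labels are assumed to be already known.

```python
import numpy as np

def sort_into_blocks(R, user_class, item_class):
    """Reorder whole rows and whole columns of R so that members of the
    same class are adjacent (hypothetical helper, for illustration only)."""
    R = np.asarray(R)
    row_order = np.argsort(user_class, kind="stable")  # permutation of users
    col_order = np.argsort(item_class, kind="stable")  # permutation of items
    return R[np.ix_(row_order, col_order)], row_order, col_order

# Toy purchase matrix: 4 users x 4 items (made-up data).
R = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])
user_class = np.array([0, 1, 0, 1])  # assumed class of each user
item_class = np.array([0, 1, 0, 1])  # assumed class of each item

blocked, row_order, col_order = sort_into_blocks(R, user_class, item_class)
print(blocked)
```

Because only permutations of rows and columns are applied, every user keeps the same purchase record; only the display order changes.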

Clustering users (or items) with the item (or user) classes added as a constraint yields user-item blocks defined by a user class and an item class, and the characteristics of the user classes and item classes can be read from these blocks. This method is expected to yield more detailed information than clustering users and items separately.

One method for co-clustering is the Infinite Relational Model (IRM) proposed by Kemp et al. (see Non-Patent Document 1).
The IRM first models a plausible generative process for the given data; that is, it hypothesizes that the given data were produced by such a process. It then searches, under this hypothesis, for the model that best explains the given data.

The data generation process assumed by the IRM is described below with reference to FIG. 11, using purchase history data. FIG. 11 illustrates the IRM data generation process. As in FIG. 10, an entry is 1 if user i purchases (has purchased) item j and 0 if not.

(IRM process 1)
[1-1] Partition the full set of users into multiple user classes. That is, first generate a set of user classes; each user then selects one user class k and belongs to it. The number of user classes K is not fixed; it varies and is determined according to the data.
[1-2] Partition the full set of items into multiple item classes. That is, first generate a set of item classes; each item then selects one item class s and belongs to it. The number of item classes S is likewise not fixed; it varies and is determined according to the data.

(IRM process 2)
[2-1] To each pair of a user class and an item class (that is, to each user-item block ks), assign one real value in the range 0 to 1 as the purchase probability specific to that block.
[2-2] Assume that a user belonging to a given user class purchases an item belonging to a given item class according to the purchase probability assigned to the corresponding user-item block.

The given purchase history data are then regarded as having been generated by the probabilistic model that represents the above process, and the model that best explains the data is learned from the data.
For example, the generative process states that each user and each item "belongs to some class", but it does not state the specific assignment to classes (that is, the clustering). For each data set there should be a most suitable assignment, the one that best explains the data. Searching for this assignment is what it means to co-cluster the data.

In general the number of possible assignments of user classes and item classes is enormous, so it is difficult to obtain the optimal clustering in one step. Clustering is therefore performed using the sampling described below. Sampling starts from a random clustering and successively improves it so that the posterior probability, which serves as the objective function, increases.

Here, a posterior probability is a kind of conditional probability: the probability of a variable when specific information is taken into account. A prior probability is the probability of that variable in the absence of that information. By Bayes' theorem, for example, the posterior probability is obtained by multiplying the prior probability by the likelihood (function) and normalizing. The posterior probability can serve as a measure of how good a model learning result is.

Continuing with the IRM: concretely, users and items are first clustered at random as the initial state. Next, the user class of each user and the item class of each item are updated one at a time so that the posterior probability for the given purchase history data (equation (5), described later) increases. Updating of class memberships stops when the posterior probability, which is the evaluation value, no longer changes (improves).

The probabilistic model assumed by the IRM and the evaluation formula for a co-clustering result are described below. First, the following quantities are observed as data.
・The number of users is N and the number of items is M.
・The matrix R represents the purchase history data. The (i, j) element R_{i,j} of the N × M matrix R is 1 if user i (i = 1, 2, ..., N) has purchased item j (j = 1, 2, ..., M) and 0 otherwise.

Given these quantities, the following probabilistic model is assumed.
(Prior probability of the user class assignment)
Z = {z_1, z_2, ..., z_N} denotes the class assignment of the users, and z_i = k means that user i belongs to user class k. Thus, if the number of user classes is K, z_i ∈ {1, 2, ..., K} (i = 1, 2, ..., N). Let n_k be the number of users belonging to user class k (the number of i with z_i = k; the user class size). Then Σ_{k=1}^{K} n_k = N (here Σ_{k=1}^{K} denotes the sum over k from 1 to K; likewise below), and the prior probability of the user class assignment Z is given by equation (1) (the user class assignment probability formula), where α > 0 is a constant (hyperparameter).
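Equation (1) is not reproduced in the text above. Assuming it takes the standard Chinese restaurant process form used for Dirichlet process mixture priors (an assumption, though one that uses only the quantities α, K, N, and n_k defined above), it would read:

```latex
P(Z;\alpha)
  = \frac{\alpha^{K}\,\prod_{k=1}^{K}(n_k-1)!}{\prod_{i=1}^{N}(\alpha+i-1)}
  = \alpha^{K}\,\frac{\Gamma(\alpha)}{\Gamma(\alpha+N)}\,\prod_{k=1}^{K}(n_k-1)!
```

Under this form the probability depends on Z only through the number of classes K and the class sizes n_k, not on the order of the users.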

The prior probability (distribution) can be regarded as a measure of how "good" a partition is before the purchase history data R are known (that is, prior to observing R). The value of P(Z; α) in equation (1) changes as the number of user classes K and the user class sizes n_k change. Intuitively, an assignment is considered better the larger each user class size n_k is and (when α < 1) the smaller the number of user classes K is. K is not fixed in advance; it is updated successively toward a value that suits the given data and is determined at the end.
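The intuition above can be checked numerically with a small sketch. It assumes that equation (1) has the standard Chinese restaurant process form α^K · Π_k (n_k − 1)! · Γ(α)/Γ(α + N), which is an assumption; the function name `log_crp_prior` and the toy assignments are hypothetical.

```python
from collections import Counter
from math import lgamma, log

def log_crp_prior(assignments, alpha):
    """log P(Z; alpha) under the ASSUMED Chinese restaurant process form:
    alpha^K * prod_k (n_k - 1)! * Gamma(alpha) / Gamma(alpha + N)."""
    sizes = Counter(assignments).values()   # class sizes n_k
    n = len(assignments)                    # number of users N
    k = len(sizes)                          # number of classes K
    return (k * log(alpha)
            + sum(lgamma(n_k) for n_k in sizes)   # lgamma(n) == log((n-1)!)
            + lgamma(alpha) - lgamma(alpha + n))

alpha = 0.5                      # alpha < 1, as in the text's intuition
one_big = [0, 0, 0, 0]           # K = 1, n = (4)
two_even = [0, 0, 1, 1]          # K = 2, n = (2, 2)
singletons = [0, 1, 2, 3]        # K = 4, n = (1, 1, 1, 1)

for name, z in [("one_big", one_big), ("two_even", two_even),
                ("singletons", singletons)]:
    print(name, log_crp_prior(z, alpha))
```

With α = 0.5, fewer and larger classes receive a higher prior, which is the corrective pressure against overly fine partitions that the text describes.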

Once the purchase history data R are known, the ideal co-clustering result is basically one in which each user-item block is clear-cut, consisting entirely of 1s or entirely of 0s. Pursuing that goal alone (optimizing only that as the objective function), however, drives the blocks to be as small as possible. In the extreme, if every user-item block contains exactly one element, every block is perfectly clear-cut. Such a state, in which every element is its own block, is unbalanced and undesirable as a partition into blocks. The prior probability P(Z; α) corrects for this, as does P(W; β), described below.

(Prior probability of the item class assignment)
Similarly, W = {w_1, w_2, ..., w_M} denotes the class assignment of the items, and w_j = s means that item j belongs to item class s. Thus, if the number of item classes is S, w_j ∈ {1, 2, ..., S} (j = 1, 2, ..., M). Let m_s be the number of items belonging to item class s (the number of j with w_j = s). Then Σ_{s=1}^{S} m_s = M, and the prior probability of the item class assignment W is given by equation (2) (the item class assignment probability formula). The number of item classes S is not fixed in advance; it is updated successively toward a value that suits the given data and is determined at the end. β > 0 is a constant (hyperparameter).

(Probability of the matrix R)
Further, Θ = {θ_{k,s} | k = 1, 2, ..., K, s = 1, 2, ..., S} is the set of purchase probabilities, one for each user-item block (k, s) determined by user class k and item class s. That is, a user belonging to user class k purchases an item belonging to item class s with probability θ_{k,s} and does not purchase it with probability 1 - θ_{k,s} (a "Bernoulli trial"). The probability of the observed data R given Z, W, and Θ is therefore expressed by equation (3). As noted above, the number of user classes K and the number of item classes S are not fixed but are updated successively.

The elements of R are assumed to be generated independently of one another. Equation (3) is the "goodness" (likelihood) of Θ given R.
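Equation (3) is not reproduced in the text above. Given the stated assumptions (independent Bernoulli trials with block-specific probability θ_{k,s}), a reconstruction consistent with the surrounding definitions would be:

```latex
P(R \mid Z, W, \Theta)
  = \prod_{i=1}^{N}\prod_{j=1}^{M}
    \theta_{z_i,w_j}^{\,R_{i,j}}\,
    \bigl(1-\theta_{z_i,w_j}\bigr)^{1-R_{i,j}}
```

Each factor equals θ_{z_i,w_j} when R_{i,j} = 1 and 1 - θ_{z_i,w_j} when R_{i,j} = 0.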

(Purchase probability)
θ_{k,s} is not taken to be simply a uniform random number between 0 and 1; it is usually assumed to be generated from a beta distribution (that is, a beta distribution is used as the prior distribution of θ_{k,s}). With beta distribution parameters γ = {γ_0, γ_1}, the prior distribution of θ_{k,s} is expressed by equation (4), where Γ(x) is the gamma function.
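Equation (4) is not reproduced in the text above. A beta density with parameters γ_0 and γ_1 has the following general form; which of the two parameters is attached to θ_{k,s} and which to 1 - θ_{k,s} cannot be determined from this text, so the pairing below (γ_1 with purchases, γ_0 with non-purchases) is an assumption:

```latex
P(\theta_{k,s};\gamma)
  = \frac{\Gamma(\gamma_0+\gamma_1)}{\Gamma(\gamma_0)\,\Gamma(\gamma_1)}\,
    \theta_{k,s}^{\,\gamma_1-1}\,\bigl(1-\theta_{k,s}\bigr)^{\gamma_0-1}
```

With γ_0 = γ_1 = 1 this reduces to the uniform distribution on the interval from 0 to 1.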

The above is the probabilistic model assumed in the IRM.
Under this model, the optimal co-clustering result is sought for the given purchase history data R. For the co-clustering result, namely the user class assignment Z and the item class assignment W, the posterior probability of Z and W given R is given by equation (5).
P(Z, W | R; α, β, γ)   ... equation (5)
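The generative model summarized above can be sketched as a forward sampler. This is an illustration under stated assumptions, not the patent's code: the class priors are assumed to follow a Chinese restaurant process, the beta prior is assumed to pair γ_1 with purchases and γ_0 with non-purchases, and the names `sample_crp` and `sample_irm` are hypothetical.

```python
import numpy as np

def sample_crp(n, concentration, rng):
    """Assign n objects to classes one by one (ASSUMED Chinese restaurant
    process): join class k with weight n_k, or open a new class with
    weight equal to the concentration parameter."""
    assignments, sizes = [], []
    for _ in range(n):
        weights = np.array(sizes + [concentration], dtype=float)
        k = int(rng.choice(len(weights), p=weights / weights.sum()))
        if k == len(sizes):
            sizes.append(1)      # open a new class
        else:
            sizes[k] += 1        # join an existing class
        assignments.append(k)
    return np.array(assignments)

def sample_irm(n_users, n_items, alpha, beta, gamma0, gamma1, seed=0):
    """Draw (R, Z, W, Theta) from the IRM-style generative process."""
    rng = np.random.default_rng(seed)
    z = sample_crp(n_users, alpha, rng)          # process [1-1]
    w = sample_crp(n_items, beta, rng)           # process [1-2]
    # Process [2-1]: one purchase probability per user-item block.
    theta = rng.beta(gamma1, gamma0, size=(z.max() + 1, w.max() + 1))
    # Process [2-2]: independent Bernoulli draws with block probability.
    R = rng.binomial(1, theta[z][:, w])
    return R, z, w, theta

R, z, w, theta = sample_irm(6, 5, alpha=1.0, beta=1.0,
                            gamma0=1.0, gamma1=1.0, seed=1)
print(R)
```

Note that the number of classes is not an input: it emerges from the sampling, which mirrors the statement that K and S are determined by the data rather than fixed in advance.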

The optimal co-clustering is then nothing other than the co-clustering result that maximizes the posterior probability given by equation (5) (not necessarily the strict maximum; the largest value among those computed suffices). In other words, equation (5) is an evaluation function expressing how well the co-clustering result Z, W fits the purchase history data R; the larger its value, the better the co-clustering result.

Equation (5) can be computed as follows. Under the IRM assumptions, and using equations (1), (2), (3), and (4), the probabilistic model (the joint distribution of the purchase history data) can be written as equation (6).
P(R, Θ, Z, W; α, β, γ) = P(R | Θ, Z, W) P(Z; α) P(W; β) P(Θ; γ)   ... equation (6)

The conditional distribution given the purchase history data R can be written as equation (7).
P(Z, W, Θ | R; α, β, γ)   ... equation (7)

By Bayes' theorem, equation (7) can be rewritten as equation (8).

The denominator P(R; α, β, γ) on the right-hand side of equation (8) is a constant, so only the numerator needs to be considered. It can be written as equations (9) and (10).

Because the beta distribution, which is the conjugate prior of the Bernoulli distribution, is used as the prior distribution of θ_{k,s} (∈ Θ), Θ can be integrated out as expressed in equations (11) and (12). That is, integrating both the left-hand side of equation (9) and equation (10) with respect to Θ, and then carrying out the integral in equation (11), eliminates Θ as in equation (12).
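Equations (8) to (12) are not reproduced in the text above. The chain of steps it describes can be sketched as follows; the exact correspondence to the original equation numbers is not recoverable from this text, the counts n^{(1)}_{k,s} and n^{(0)}_{k,s} (the numbers of 1s and 0s in block (k, s)) are symbols introduced here for illustration, and the pairing of γ_1 with purchases is an assumption.

```latex
\begin{align*}
P(Z,W,\Theta \mid R;\alpha,\beta,\gamma)
  &= \frac{P(R,\Theta,Z,W;\alpha,\beta,\gamma)}{P(R;\alpha,\beta,\gamma)} \\
  &\propto P(R \mid \Theta,Z,W)\,P(Z;\alpha)\,P(W;\beta)\,P(\Theta;\gamma), \\
P(Z,W \mid R;\alpha,\beta,\gamma)
  &= \int P(Z,W,\Theta \mid R;\alpha,\beta,\gamma)\,d\Theta \\
  &\propto P(Z;\alpha)\,P(W;\beta)
     \prod_{k=1}^{K}\prod_{s=1}^{S}
     \frac{B\bigl(\gamma_1+n^{(1)}_{k,s},\;\gamma_0+n^{(0)}_{k,s}\bigr)}
          {B(\gamma_1,\gamma_0)}
\end{align*}
```

Here B(a, b) = Γ(a)Γ(b)/Γ(a + b) is the beta function. The last line follows because, for each block, the product of Bernoulli likelihood terms and the beta prior is an unnormalized beta density in θ_{k,s}, whose integral is a beta function.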

The left-hand side of equation (11) is exactly P(Z, W | R; α, β, γ) of equation (5), which can thus be computed by equation (12). The terms of equation (12) can be computed from equations (1), (2), and (3), respectively.
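As a rough sketch of how such an evaluation value could be computed, the function below returns the logarithm of the collapsed posterior up to an additive constant. It is not the patent's implementation: it assumes Chinese restaurant process priors for Z and W, a beta prior that pairs γ_1 with purchases and γ_0 with non-purchases, and the standard Beta-Bernoulli marginal for each block. All function names are hypothetical.

```python
import numpy as np
from math import lgamma, log

def log_crp(labels, concentration):
    """log prior of a partition under the ASSUMED CRP form."""
    _, sizes = np.unique(labels, return_counts=True)
    return (len(sizes) * log(concentration)
            + sum(lgamma(int(n)) for n in sizes)
            + lgamma(concentration) - lgamma(concentration + len(labels)))

def log_beta_fn(a, b):
    """log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_collapsed_posterior(R, z, w, alpha, beta, gamma0, gamma1):
    """log P(Z, W | R) up to a constant, with Theta integrated out."""
    R, z, w = np.asarray(R), np.asarray(z), np.asarray(w)
    total = log_crp(z, alpha) + log_crp(w, beta)
    for k in np.unique(z):
        for s in np.unique(w):
            block = R[np.ix_(z == k, w == s)]   # entries of block (k, s)
            ones = int(block.sum())
            zeros = block.size - ones
            total += (log_beta_fn(gamma1 + ones, gamma0 + zeros)
                      - log_beta_fn(gamma1, gamma0))
    return total

R = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1]])
good = log_collapsed_posterior(R, [0, 1, 0, 1], [0, 1, 0, 1], 1, 1, 1, 1)
bad = log_collapsed_posterior(R, [0, 0, 1, 1], [0, 0, 1, 1], 1, 1, 1, 1)
single = log_collapsed_posterior(R, [0, 0, 0, 0], [0, 0, 0, 0], 1, 1, 1, 1)
print(good, bad, single)
```

On this toy matrix the assignment that produces pure blocks scores highest, while the single-block assignment is favored by the prior but penalized by the likelihood term.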

A distinguishing feature of the IRM is that the numbers of classes K and S, which most conventional (co-)clustering methods require in advance, need not be supplied; they are determined automatically from the given data within a statistical learning framework. The number of classes in real data such as purchase history data is usually unknown in advance, and with the IRM there is no need to determine it by exploratory search against an evaluation measure for the number of classes, as other conventional methods do.
Non-Patent Document 1: Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T. & Ueda, N., "Learning systems of concepts with an infinite relational model." In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pages 381-388, June 2006.

However, applying the IRM to purchase history data raises the following two problems.
(Problem 1)
The purchase history data R are sparse. That is, elements with R_{i,j} = 0 (not purchased) vastly outnumber elements with R_{i,j} = 1 (purchased), yet the IRM is a model that treats R_{i,j} = 1 and R_{i,j} = 0 on an equal footing. This makes purchase history data difficult to handle (the results are not accurate).

(Problem 2)
In the purchase history data R, an element with R_{i,j} = 0 does not necessarily mean "will not purchase"; rather, it should be interpreted as "unknown whether the user will purchase". For example, it is natural to suppose that a user may in the future purchase an item not yet purchased (an entry with R_{i,j} = 0 becomes R_{i,j} = 1). The purchase history data R can therefore be regarded as data with many missing values. The IRM, however, was not devised for data with missing values and cannot handle this situation well (the results are not accurate).

In summary, the IRM of the prior art is not necessarily a model well suited to purchase history data.

The present invention has been made to solve the problems described above. Its object is to co-cluster users and items based on a generative process (probabilistic model) better suited to data containing missing values, such as purchase history data (data in which an entry recorded as R_{i,j} = 0 in the purchase history data R may not truly be 0).

To solve the above problem, the present invention is a co-clustering apparatus that clusters item selection information, which indicates whether each user has selected each item or how often, by treating it as matrix information on the users and the items. The apparatus classifies each user in the matrix information into one of a plurality of user classes based on a predetermined user class assignment probability formula, and classifies each item in the matrix information into one of a plurality of item classes based on a predetermined item class assignment probability formula. For each user-item block uniquely identified by a combination of a user class and an item class, it gives a provisional selection probability, that is, a provisional probability under the probabilistic model that a user of that user class selects an item of that item class, based on a predetermined item selection probability formula. It clusters the matrix information both by user and by item by seeking to maximize the posterior probability of the matrix information calculated from the predetermined user class assignment probability formula, the predetermined item class assignment probability formula, and the predetermined item selection probability formula. Each user class is given a predetermined item class designation probability for each item class, and each item class is given a predetermined user class designation probability for each user class. The predetermined item selection probability formula calculates the provisional selection probability for each user-item block as the product of the predetermined item class designation probability and the predetermined user class designation probability, or as a value based on that product. The apparatus comprises: a storage unit that stores the item selection information and the matrix information; a user class update unit that receives the latest matrix information and updates the user class to which one or more users in the matrix information belong, based on the predetermined user class assignment probability formula and the provisional selection probability; an item class update unit that receives the latest matrix information and updates the item class to which one or more items in the matrix information belong, based on the predetermined item class assignment probability formula and the provisional selection probability; and a co-clustering result evaluation unit that calculates the posterior probability of the matrix information based on the latest matrix information.

According to this invention, the predetermined item selection probability formula calculates the provisional selection probability for each user-item block as the product of the predetermined item class designation probability and the predetermined user class designation probability, or as a value based on that product. The user classes are updated based on the predetermined user class assignment probability formula and the provisional selection probability, the item classes are updated based on the predetermined item class assignment probability formula and the provisional selection probability, and the posterior probability of the matrix information is calculated based on the latest matrix information. Users and items can thereby be co-clustered based on a generative process (probabilistic model) better suited to data containing missing values, such as purchase history data.
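The product form of the provisional selection probability can be illustrated with a small sketch. This section does not define the designation probabilities beyond what is stated above, so the array layout, the function name `provisional_selection_probability`, and the numbers are all hypothetical, and only the plain product (not the "value based on the product" variant) is shown.

```python
import numpy as np

def provisional_selection_probability(item_class_designation,
                                      user_class_designation):
    """Per-block provisional selection probability as an element-wise product.

    item_class_designation[k, s]: probability given to item class s for
        user class k (ASSUMED layout, shape K x S).
    user_class_designation[s, k]: probability given to user class k for
        item class s (ASSUMED layout, shape S x K).
    """
    a = np.asarray(item_class_designation, dtype=float)
    b = np.asarray(user_class_designation, dtype=float)
    if a.shape != b.T.shape:
        raise ValueError("expected shapes (K, S) and (S, K)")
    return a * b.T   # theta[k, s] = a[k, s] * b[s, k]

# Made-up values for K = 2 user classes and S = 2 item classes.
a = [[0.8, 0.2],
     [0.3, 0.7]]
b = [[0.9, 0.1],
     [0.4, 0.6]]
theta = provisional_selection_probability(a, b)
print(theta)
```

Since each factor lies between 0 and 1, the product for a block is large only when both sides designate each other with high probability.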

Preferably, the present invention further comprises a co-clustering end determination unit that causes the steps performed by the user class update unit, the item class update unit, and the co-clustering result evaluation unit to be repeated until a predetermined condition is satisfied and, when the predetermined condition is satisfied, stores in the storage unit the result of co-clustering the users and the items based on the latest matrix information.

According to this invention, storing the co-clustering result for the users and the items based on the latest matrix information when the predetermined condition is satisfied allows co-clustering to end at an appropriate time.

Preferably, in the present invention, the user class update unit selects one user to update from the latest matrix information, calculates a transfer probability for each existing user class and for a new user class to which the selected user could be moved, determines the destination user class for the user according to those transfer probabilities, and updates the information on the user classes based on that determination.

According to this invention, determining the destination user class of a user according to the transfer probabilities avoids a situation in which co-clustering falls into a deadlock (an inappropriate stalemate).

Preferably, in the present invention, the item class update unit selects one item to update from the latest matrix information, calculates a transfer probability for each existing item class and for a new item class to which the selected item could be moved, determines the destination item class for the item according to those transfer probabilities, and updates the information on the item classes based on that determination.

According to this invention, determining the destination item class of an item according to the transfer probabilities avoids a situation in which co-clustering falls into a deadlock (an inappropriate stalemate).
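Choosing a destination class "according to the transfer probabilities" can be sketched as a random draw rather than a deterministic pick of the largest value. The sketch below assumes the transfer probabilities are available as unnormalized log weights, one per existing class plus one for a new class; how those weights are computed is not shown, and the function name `choose_destination` is hypothetical.

```python
import numpy as np

def choose_destination(log_weights, rng):
    """Draw a destination class index from unnormalized log weights.

    The last entry is taken to represent a new class (an assumption of
    this sketch). Returns the chosen index and the normalized probabilities.
    """
    log_weights = np.asarray(log_weights, dtype=float)
    shifted = log_weights - log_weights.max()   # avoid overflow in exp
    probs = np.exp(shifted)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs)), probs

rng = np.random.default_rng(0)
# Two existing classes and one new class, with made-up weights 6 : 3 : 1.
dest, probs = choose_destination(np.log([6.0, 3.0, 1.0]), rng)
print(dest, probs)
```

Because lower-probability destinations are still chosen occasionally, the search can leave a configuration in which no single greedy move improves the evaluation value, which is one way to read the deadlock avoidance described above.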

The co-clustering program according to the present invention is characterized by being executed by a computer that constitutes the co-clustering apparatus described above. With this configuration, a computer on which the program is installed can realize each function based on the program.

Further, the computer-readable recording medium according to the present invention is characterized by having the co-clustering program described above recorded on it. With this configuration, a computer into which the recording medium is loaded can realize each function based on the program recorded on the medium.

According to the present invention, users and items can be co-clustered based on a generative process (probabilistic model) that is better suited to data containing missing values, such as purchase history data.

Hereinafter, the best mode for carrying out the present invention (hereinafter, the "embodiment") is described in detail with reference to the drawings (including, where appropriate, drawings other than the one cited). To aid understanding, the data generation process of this embodiment is described first, followed by the co-clustering apparatus.

<Data generation process>
This embodiment assumes a data generation process different from the one assumed by the IRM. FIG. 9 is an explanatory diagram of the data generation process assumed in this embodiment.

(Process 1 of this embodiment)
[1-1] All users are divided into a plurality of user classes. That is, a set of user classes (user class set) is first generated, and each user selects one user class k and belongs to it.
[1-2] All items are divided into a plurality of item classes. That is, a set of item classes (item class set) is first generated, and each item selects one item class s and belongs to it.
Process 1 of this embodiment is exactly the same as Process 1 of the IRM described above.

(Process 2 of this embodiment)
[2-1] Each user class is assigned probabilities (item class designation probabilities) expressing which item class it tends to select (designate). That is, each user class k is assigned one multinomial distribution over the item classes. With S denoting the number of item classes, this corresponds to assigning S non-negative real numbers θ_{k,s} (≥ 0) such that Σ_{s=1}^{S} θ_{k,s} = 1. Here, θ_{k,s} is the probability that a user belonging to user class k selects (designates) item class s; in the analogy of a die, it corresponds to the probability of rolling face s.

[2-2] Similarly, each item class is assigned probabilities (user class designation probabilities) expressing which user class it tends to select (designate). That is, each item class s is assigned one multinomial distribution over the user classes. With K denoting the number of user classes, this corresponds to assigning K non-negative real numbers φ_{s,k} (≥ 0) such that Σ_{k=1}^{K} φ_{s,k} = 1. Here, φ_{s,k} is the probability that an item belonging to item class s selects (designates) user class k.

[2-3] Each user selects (designates) an item class based on the item class designation probabilities of the user class to which the user belongs, and each item selects (designates) a user class based on the user class designation probabilities of the item class to which the item belongs. When the two selections match, the user is regarded as purchasing the item.
Process 2 of this embodiment differs greatly from Process 2 of the IRM described above.
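Processes 1 and 2 can be sketched as a small simulation. The following is only an illustration under stated assumptions: the function name `sample_purchases`, the 0-based class indices, and the choice to make one independent pair of draws for every (user, item) pair are not taken from the patent text.

```python
import random

def sample_purchases(z, w, theta, phi, seed=0):
    """Sample a 0/1 purchase matrix from the two-sided selection process.

    z[i]  : user class of user i (0-based, hypothetical encoding)
    w[j]  : item class of item j (0-based, hypothetical encoding)
    theta : theta[k][s] = probability that user class k designates item class s
    phi   : phi[s][k]   = probability that item class s designates user class k
    """
    rng = random.Random(seed)
    n_user_classes = len(theta)
    n_item_classes = len(phi)
    R = [[0] * len(w) for _ in z]
    for i, k in enumerate(z):
        for j, s in enumerate(w):
            # The user designates an item class using its class's theta row.
            chosen_item_class = rng.choices(range(n_item_classes), weights=theta[k])[0]
            # The item designates a user class using its class's phi row.
            chosen_user_class = rng.choices(range(n_user_classes), weights=phi[s])[0]
            # A purchase is recorded only when both designations match.
            if chosen_item_class == s and chosen_user_class == k:
                R[i][j] = 1
    return R
```

Under this sketch, the chance that R[i][j] becomes 1 is theta[z[i]][w[j]] * phi[w[j]][z[i]], which is the product of the two designation probabilities described in [2-3].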

As with the IRM, this embodiment regards the given purchase history data (item selection information) as having been generated from a probabilistic model that expresses the above process, and co-clusters the data accordingly. Specifically, as with the IRM, starting from a state in which users and items are clustered at random, the user class of each user and the item class of each item are updated one at a time to obtain the co-clustering result. Also as with the IRM, the number of classes can be determined automatically from the given data within a statistical learning framework.

Next, the probabilistic model assumed in this embodiment and the evaluation formula for a co-clustering result are described. As with the IRM, the following quantities are observed as data.
・The number of users is N and the number of items is M.
・The matrix R represents the purchase history data. The (i, j) element R_{i,j} of the N × M matrix R is 1 if user i (i = 1, 2, ..., N) has purchased item j (j = 1, 2, ..., M) and 0 otherwise.
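As a minimal sketch of how such a matrix might be assembled, the following builds R from a list of purchase records. The record format, a list of 0-based (user, item) pairs, and the function name `build_purchase_matrix` are assumptions made for illustration.

```python
def build_purchase_matrix(records, n_users, n_items):
    """Return the N x M 0/1 matrix R from 0-based (user, item) purchase records."""
    R = [[0] * n_items for _ in range(n_users)]
    for user, item in records:
        # Repeated purchases of the same item still give 1 in the binary setting.
        R[user][item] = 1
    return R
```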

Given these, the following probabilistic model is assumed.
(Prior probability of user class assignment, prior probability of item class assignment)
P(Z; α) and P(W; β) are exactly the same as in the IRM, and Equations (1) and (2) are used as they are.
(Probability of occurrence of the matrix R)
Given Z, W, Θ, and Φ, the probability of occurrence of the observed data R can be expressed as Equation (13).
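The image of Equation (13) is not reproduced in this text. Based on the description that follows it (a product over all elements of the factor (θφ)^R), the equation presumably has the form below. This is a reconstruction from the surrounding definitions, not the patent's own typesetting.

```latex
P(R \mid Z, W, \Theta, \Phi)
  = \prod_{i=1}^{N} \prod_{j=1}^{M}
    \left( \theta_{z_i, w_j} \, \phi_{w_j, z_i} \right)^{R_{i,j}}
```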

Equation (13) is the "goodness" (likelihood) of Θ and Φ given R. It is one of the features of this embodiment and differs greatly from the corresponding Equation (3) of the IRM. The assumption that a purchase occurs only when a user of user class z_i selects (designates) item class w_j and an item of item class w_j selects (designates) user class z_i is reflected in the factor (θ_{z_i,w_j} φ_{w_j,z_i})^{R_{i,j}} in Equation (13) (the item selection probability formula).

In this factor (θ_{z_i,w_j} φ_{w_j,z_i})^{R_{i,j}}, θ_{z_i,w_j} is the probability that a user belonging to user class z_i selects (designates) item class w_j (item class designation probability), and φ_{w_j,z_i} is the probability that an item belonging to item class w_j selects (designates) user class z_i (user class designation probability). In this model, therefore, the product θ_{z_i,w_j} φ_{w_j,z_i} is the probability that a user belonging to user class z_i purchases an item belonging to item class w_j (provisional selection probability).

R_{i,j} takes the value 1 for "purchased" and 0 for "not purchased". The value of (θ_{z_i,w_j} φ_{w_j,z_i})^{R_{i,j}} is therefore the product θ_{z_i,w_j} φ_{w_j,z_i} itself when the item is purchased, and 1 when it is not (any nonzero number raised to the power 0 is 1, and 0 raised to the power 0 may be treated as 1 for convenience). Consequently, Equation (13), which takes the product of (θ_{z_i,w_j} φ_{w_j,z_i})^{R_{i,j}} over all i and j, gives the probability of occurrence of the observed matrix R under this model.
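The product described for Equation (13) can be evaluated in log space, where unpurchased entries contribute nothing. The sketch below assumes 0-based class indices and strictly positive probabilities for every purchased entry; the function name `log_likelihood` is hypothetical.

```python
import math

def log_likelihood(R, z, w, theta, phi):
    """Log of the product over (i, j) of (theta[z_i][w_j] * phi[w_j][z_i]) ** R[i][j]."""
    total = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            if r:
                # Entries with R[i][j] == 0 give a factor of 1, i.e. 0 in log space.
                total += r * math.log(theta[z[i]][w[j]] * phi[w[j]][z[i]])
    return total
```

Because only nonzero entries are visited, the cost grows with the number of purchases rather than with N × M when R is stored sparsely.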

The provisional selection probability need not be the product itself; a value computed from the product, such as a power or a root of the product, may be used instead.

Computing the probability of occurrence of R in this way yields a data generation process that remains suitable when the elements with R_{i,j} = 0 (not purchased) vastly outnumber the elements with R_{i,j} = 1 (purchased), and when the purchase history data contains missing values.

(Probability that a user belonging to user class k selects (designates) item class s)
For Θ = {θ_{k,s} | k = 1, 2, ..., K, s = 1, 2, ..., S} (where θ_{k,s} ≥ 0 and Σ_{s=1}^{S} θ_{k,s} = 1), a Dirichlet distribution as in Equation (14) is used as the prior probability (distribution).

For simplicity, γ_k = γ is assumed below.

(Probability that an item belonging to item class s selects (designates) user class k)
Similarly, for Φ = {φ_{s,k} | s = 1, 2, ..., S, k = 1, 2, ..., K} (where φ_{s,k} ≥ 0 and Σ_{k=1}^{K} φ_{s,k} = 1), a Dirichlet distribution as in Equation (15) is used as the prior distribution.

For simplicity, η_s = η is assumed below.

Equations (14) and (15) above correspond to Equation (4) of the IRM. This completes the description of the probabilistic model assumed in this embodiment.

As with the IRM, consider finding the co-clustering result that maximizes Equation (5), the posterior probability of Z and W. To compute it, the probabilistic model of this embodiment (the joint distribution of the purchase history data) is first expressed as Equations (16) and (17).

Here, Θ = {θ_{k,s} | k = 1, 2, ..., K, s = 1, 2, ..., S} and Φ = {φ_{s,k} | s = 1, 2, ..., S, k = 1, 2, ..., K} are the parameters of the multinomial distributions, and α, β, γ, and η are hyperparameters (constants).

Because the Dirichlet distribution, the conjugate prior of the multinomial distribution, is used as the prior for Θ and Φ, Θ and Φ can in practice be integrated out as in Equations (18) and (19). That is, integrating the left-hand side of Equation (16) and both sides of Equation (17) with respect to Θ and Φ, and then carrying out the integration in Equation (18), eliminates Θ and Φ as shown in Equation (19).
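The conjugacy used here has a standard closed form: for one class with counts l_1, ..., l_S and a symmetric Dirichlet prior with parameter γ, the integral of Π_s θ_s^{l_s} against the prior equals Γ(Sγ) / Γ(Sγ + Σ_s l_s) × Π_s Γ(γ + l_s) / Γ(γ). The sketch below evaluates the logarithm of that quantity. It illustrates the general identity only; Equations (18) to (21) are not reproduced in this text and may group the terms differently. The function name is hypothetical.

```python
import math

def log_dirichlet_multinomial(counts, gamma):
    """Log of the integral of prod_s theta_s**counts[s] under a symmetric Dirichlet(gamma)."""
    n_categories = len(counts)
    total = sum(counts)
    # Ratio of normalising constants: Gamma(S*gamma) / Gamma(S*gamma + sum of counts).
    result = math.lgamma(n_categories * gamma) - math.lgamma(n_categories * gamma + total)
    for c in counts:
        # Per-category factor: Gamma(gamma + count) / Gamma(gamma).
        result += math.lgamma(gamma + c) - math.lgamma(gamma)
    return result
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow when the counts are large.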

This integral can therefore be solved analytically, and Θ and Φ need not be considered explicitly. Carrying out the integration and substituting the result gives the posterior probability of Z and W as in Equations (20) and (21).

Here, l(k, s) denotes the number of times users belonging to user class k have purchased items belonging to item class s, as given by Equation (22).

δ(x) is a function (delta function) that takes the value 1 when x is true and 0 otherwise.
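From the verbal definition, l(k, s) presumably equals Σ_i Σ_j R_{i,j} δ(z_i = k) δ(w_j = s); the image of Equation (22) is not reproduced here, so this form is a reconstruction. A minimal sketch that tabulates all block counts at once, with 0-based class indices and a hypothetical function name:

```python
def block_counts(R, z, w, n_user_classes, n_item_classes):
    """Return counts[k][s] = number of purchases by users of class k of items of class s."""
    counts = [[0] * n_item_classes for _ in range(n_user_classes)]
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            # Each purchase adds to the block given by the two class labels.
            counts[z[i]][w[j]] += r
    return counts
```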

In this embodiment, performing co-clustering is equivalent to finding the Z and W that maximize Equation (21).

<Co-clustering apparatus>
Next, the co-clustering apparatus of this embodiment is described. This embodiment uses a sampling algorithm. FIG. 1 is a functional block diagram schematically showing the configuration of the co-clustering apparatus according to this embodiment. The co-clustering apparatus 100 is a computer device comprising, for example, a CPU (Central Processing Unit), RAM (Random Access Memory), ROM (Read Only Memory), an HDD (Hard Disk Drive), and an input/output interface. As shown in FIG. 1, the co-clustering apparatus 100 includes computing means 1, storage means 2 (storage unit), input means 3 such as a keyboard and a mouse, and output means 4 such as a liquid crystal display, all connected by a bus line 5.

The computing means 1 includes a preprocessing unit 11, an initialization unit 12, a user class update unit 13, an item class update unit 14, a co-clustering result evaluation unit 15, a co-clustering end determination unit 16, and a memory 17 serving as a working area.
The storage means 2 includes a purchase history DB 21 that stores purchase history data (matrix information), a co-clustering result storage unit 22 that stores co-clustering results, and a program 23.

The program 23 includes the functional programs for preprocessing 231, initialization 232, user class update 233, item class update 234, co-clustering result evaluation 235, and co-clustering end determination 236.
The preprocessing unit 11, initialization unit 12, user class update unit 13, item class update unit 14, co-clustering result evaluation unit 15, and co-clustering end determination unit 16 of the computing means 1 are realized by loading the corresponding functional programs of the program 23 (preprocessing 231, initialization 232, user class update 233, item class update 234, co-clustering result evaluation 235, and co-clustering end determination 236) into the memory 17 and expanding them there.

The processing flow of the co-clustering apparatus 100 is now described with reference to FIG. 2, together with the detailed operation of each part of the computing means 1. FIG. 2 is a flowchart showing the processing flow of the co-clustering apparatus 100.

First, the preprocessing unit 11 performs preprocessing (step S1). Specifically, the preprocessing unit 11 creates the matrix R from the purchase history data stored in the purchase history DB 21. Here, R_{i,j}, the (i, j) element of R, is 1 if user i has purchased item j and 0 otherwise. The number of users is N and the number of items is M.

After step S1, the initialization unit 12 performs initialization (step S2). Specifically, the initialization unit 12 provisionally determines, in some suitable manner, the user classes Z = {z_i | i = 1, 2, ..., N} to which the users belong and the item classes W = {w_j | j = 1, 2, ..., M} to which the items belong. This also fixes the provisional number of user classes K and the provisional number of item classes S. For example, classes may be assigned at random, or each user and each item may be placed in its own class (in which case there are N user classes and M item classes).
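The two initialization strategies mentioned for step S2 can be sketched as follows. The function names, the 0-based class ids, and the relabelling that keeps the ids contiguous are illustrative assumptions rather than details from the patent.

```python
import random

def init_random(n, n_classes, seed=0):
    """Assign each of n objects to one of n_classes classes uniformly at random."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_classes) for _ in range(n)]
    # Relabel in order of first appearance so ids run 0..K-1 with no empty class.
    mapping = {}
    for label in labels:
        mapping.setdefault(label, len(mapping))
    return [mapping[label] for label in labels]

def init_singletons(n):
    """Place every object in its own class, giving n classes."""
    return list(range(n))
```

Calling `init_random(N, K0)` for users and `init_random(M, S0)` for items, with hypothetical starting sizes K0 and S0, would give the provisional Z and W; the provisional K and S are then `max(Z) + 1` and `max(W) + 1`.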

Next, the user class update unit 13 updates the user classes (step S3). Step S3 is described with reference to FIG. 3, a flowchart that breaks down the processing of step S3.

The user class update unit 13 first selects at random one user whose user class is to be updated, and lets i be the index of the selected user (step S31: update target user selection). Next, it calculates the probability that user i selects user class k^* (∈ {1, 2, ..., K, K+1}) (step S32: user class membership probability calculation), and determines the user class of user i based on the calculated probabilities (step S33: user class determination). Determining the destination user class from the (migration) probabilities in this way prevents co-clustering from falling into a deadlock (an undesirable stalemate).

Finally, z_i is updated with the determined user class (step S34: user class update). If this update leaves a user class empty, that class is deleted and the number of user classes decreases. Conversely, if the update creates a new user class, the number of user classes increases accordingly.
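Steps S33 and S34 can be sketched as below, with the migration probabilities of step S32 passed in as a list of unnormalised weights. How those weights are computed is deliberately left out, because Equations (27) and (28) are not reproduced in this text. The function name, the 0-based class ids, and the convention that the last weight belongs to the new class are assumptions.

```python
import random

def resample_assignment(labels, index, weights, rng):
    """Move labels[index] to a class drawn in proportion to `weights`.

    weights[k] for k < K is the weight of existing class k and weights[K] is
    the weight of a brand-new class. Returns the resulting number of classes.
    """
    old = labels[index]
    # Step S33: draw the destination class according to the migration weights.
    new = rng.choices(range(len(weights)), weights=weights)[0]
    # Step S34: record the new assignment.
    labels[index] = new
    # If the old class became empty, delete it and shift larger ids down by one.
    if old != new and old not in labels:
        for t, label in enumerate(labels):
            if label > old:
                labels[t] = label - 1
    return max(labels) + 1
```

Step S31 would precede the call, for example by choosing `index = rng.randrange(len(labels))`. The same routine could serve steps S43 and S44 for items, since nothing in it is specific to users.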

Step S32 is now described in detail. The probability that user i selects user class k^* (the user class assignment probability formula) is expressed by Equation (23), where k^* = 1, 2, ..., K, K+1.
P(z_i = k^* | R, Z_{-i}, W; α, β, γ, η) ... Equation (23)
With Z_{-i} = {z_1, z_2, ..., z_{i-2}, z_{i-1}, z_{i+1}, z_{i+2}, ..., z_{N-1}, z_N} (Z with z_i removed), Equation (23) can be expressed as Equations (24), (25), and (26).

Note that k^* ∈ {1, 2, ..., K, K+1}; that is, in addition to the K existing classes, the probability of selecting a new, (K+1)-th class is also calculated. When k^* is an existing class, the probability is given by Equation (27); when k^* is a new class, it is given by Equation (28).

When k^* is an existing class: [Equation (27)]

When k^* is a new class: [Equation (28)]

Here, n_{-i,k} denotes the number of users belonging to user class k when user i is removed from the purchase history data, and l(k, s), as defined in Equation (22), denotes the number of times users belonging to user class k have purchased items belonging to item class s. Further, l_{-i}(k, s) denotes l(k, s) when user i is removed from the purchase history data, and l_{+i}(k, s) denotes l(k, s) when user i is assigned to user class k.
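To illustrate the role of n_{-i,k}, the sketch below computes only the prior part of the migration probabilities, under the assumption that Equation (1) is the standard Chinese restaurant process prior of a Dirichlet process mixture: an existing class k receives n_{-i,k} / (N - 1 + α) and a new class receives α / (N - 1 + α). Equation (1) is not shown in this text, so this form is an assumption, and the likelihood factor involving l_{-i}(k, s) and l_{+i}(k, s) is omitted. The function name is hypothetical.

```python
def crp_prior_weights(labels, index, alpha):
    """Prior weights for reassigning labels[index]; the last entry is the new class."""
    n_classes = max(labels) + 1
    sizes = [0] * n_classes
    for t, label in enumerate(labels):
        if t != index:
            # Class sizes with the selected user removed, i.e. n_{-i,k}.
            sizes[label] += 1
    denom = len(labels) - 1 + alpha
    return [n / denom for n in sizes] + [alpha / denom]
```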

Returning to FIG. 2, after step S3 the item class update unit 14 updates the item classes (step S4). Step S4 is described with reference to FIG. 4, a flowchart that breaks down the processing of step S4.

The item class update unit 14 first selects at random one item whose item class is to be updated, and lets j be the index of the selected item (step S41: update target item selection). Next, it calculates the probability that item j selects item class s^* (∈ {1, 2, ..., S, S+1}) (step S42: item class membership probability calculation), and determines the item class of item j based on the calculated probabilities (step S43: item class determination). Determining the destination item class from the (migration) probabilities in this way prevents co-clustering from falling into a deadlock (an undesirable stalemate).

Finally, w_j is updated with the determined item class (step S44: item class update). If this update leaves an item class empty, that class is deleted and the number of item classes decreases. Conversely, if the update creates a new item class, the number of item classes increases accordingly.

Step S42 is now described in detail. The probability that item j selects item class s^* (the item class assignment probability formula) is expressed by Equation (29), where s^* = 1, 2, ..., S, S+1.
P(w_j = s^* | R, Z, W_{-j}; α, β, γ, η) ... Equation (29)

With W_{-j} = {w_1, w_2, ..., w_{j-2}, w_{j-1}, w_{j+1}, w_{j+2}, ..., w_{M-1}, w_M} (W with w_j removed) and R_j = {R_{1j}, R_{2j}, ..., R_{Nj}}, Equation (29) can be expressed as Equations (30), (31), and (32).

As with the user class update unit, s^* ∈ {1, 2, ..., S, S+1}; in addition to the S existing classes, the probability of selecting a new, (S+1)-th class is also calculated. When s^* is an existing class, the probability is given by Equation (33); when s^* is a new class, it is given by Equation (34).

When s^* is an existing class: [Equation (33)]

When s^* is a new class: [Equation (34)]

Here, m_{-j,s} denotes the number of items belonging to item class s when item j is removed from the purchase history data, l_{-j}(k, s) denotes l(k, s) when item j is removed from the purchase history data, and l_{+j}(k, s) denotes l(k, s) when item j is assigned to item class s.

Returning to FIG. 2, after step S4 the co-clustering result evaluation unit 15 evaluates the co-clustering result, that is, the user clustering result Z and the item clustering result W (step S5). For the evaluation, the posterior probability of Z and W is calculated from Equations (35) and (36).
The larger the value calculated by Equations (35) and (36), the better the clustering result is considered to be.

After step S5, the co-clustering end determination unit 16 performs the co-clustering end determination, that is, it decides whether to stop updating the user classes and item classes (step S6). For example, the updates may be stopped once the evaluation value of the co-clustering result (the posterior probability of Z and W) no longer changes, or once a number of updates specified in advance has been exceeded. A combination of these, or other conditions, may also be used for the end determination.
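One way to combine the two stopping criteria of step S6 is sketched below. The tolerance `tol`, the window length `patience`, and the function name are illustrative choices not specified in the patent, and `scores` is assumed to hold one evaluation value (for example, a log posterior) per completed update.

```python
def should_stop(scores, max_updates, tol=1e-6, patience=3):
    """Return True when the update budget is spent or the score has stopped changing."""
    if len(scores) >= max_updates:
        return True
    if len(scores) <= patience:
        # Too few evaluations so far to judge whether the score has flattened.
        return False
    recent = scores[-(patience + 1):]
    # Flat means the last `patience` changes all stayed within the tolerance.
    return max(recent) - min(recent) <= tol
```

Because the updates are sampled, the score can fluctuate from step to step; looking at a short window rather than a single difference is one way to avoid stopping on a chance repeat.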

If step S6 determines "continue (do not end)", the process returns to step S3.
If step S6 determines "end", the co-clustering end determination unit 16 stores the co-clustering result (Z, W) in the co-clustering result storage unit 22 (step S7). Step S6 allows co-clustering to be terminated at an appropriate time.

Thus, the co-clustering apparatus 100 of this embodiment can co-cluster users and items better than the conventional technique, based on a generative process (probabilistic model) better suited to data containing missing values, such as purchase history data. The differences from the conventional IRM in particular are summarized below.

The IRM gives equal weight to the event that a user purchases an item and the event that the user does not, and clusters users and items bidirectionally based on the Bernoulli distribution expressed by Equation (3).

In this embodiment, by contrast, the question is not whether a user does or does not purchase an item. Instead, users and items are assumed to select (designate) each other, with a purchase occurring when the selections match, and users and items are considered to select each other according to the multinomial distributions expressed in Equation (13).

Put differently, whereas the IRM clusters users and items bidirectionally based on a Bernoulli distribution for each user-item block, this embodiment clusters them bidirectionally based on multinomial distributions over the user classes and item classes.

Considering only the event that a user purchases an item (the event that the user and the item select (designate) each other) removes the need to consider the event that a user does not purchase an item. This makes it possible to handle the missing parts of the purchase history data, that is, entries for items that have not been purchased yet but will be purchased in the future.

By creating a co-clustering program to be executed by the computer constituting the co-clustering apparatus 100 and installing it on the computer, the computer can realize each function based on that program. The co-clustering program can also be recorded on various recording media such as a CD (Compact Disc) or DVD (Digital Versatile Disc).

An embodiment of the present invention has been described above, but the present invention is not limited to it and can be carried out within a scope that does not change its gist.
For example, the purchase history data to which the present invention is applied need not have matrix elements that are binary values such as 0 and 1; the elements may be integers of 0 or greater that represent selection frequency. In that case, each element of the matrix R indicates the number of purchases rather than purchased or not purchased (0 indicates that the item has not been purchased), and the number of purchases is taken into account in each probability calculation.

Besides purchase history data, the present invention can be applied to many kinds of data, such as users' Web page access history data (Web pages correspond to items), document-word data (documents correspond to users and the words appearing in them to items), and paper-author data (papers correspond to users and their authors to items). In other words, in the claims, "user" does not necessarily refer to a person and "item" does not necessarily refer to a thing; each may be interpreted as referring to something else as appropriate.

Although the above description presented a sampling-based method for co-clustering based on the model of this embodiment, a method based on another learning algorithm, such as a variational Bayes algorithm, can also be adopted.
Furthermore, to obtain co-clustering results Z, W with larger evaluation values (that is, to avoid falling into poor local solutions), an operation that stochastically changes the class assignments Z, W can be added between the user class and item class updates.

A plurality of users or items may also be moved to another class simultaneously.
In addition, the specific hardware and software configurations can be changed as appropriate without departing from the gist of the present invention.

<Example>
Next, the results of a comparison are described between executing "co-clustering by the co-clustering apparatus 100 of this embodiment" and executing "co-clustering by IRM (the conventional technique)" on real data (purchase history data).

(Data used)
The real data was "MovieLens", a data set consisting of a history of ratings that users actually gave to movies. A rating takes one of five values: "1", "2", "3", "4", or "5". A movie that a user rated was regarded as an item the user purchased, and a movie that the user did not rate was regarded as an item the user did not purchase, yielding purchase history data with 943 users and 1682 items. FIG. 5 shows the matrix R of this purchase history data. In FIG. 5, the vertical axis represents users and the horizontal axis represents items; a black dot represents R_{i,j} = 1 (the product was purchased) and a white dot represents R_{i,j} = 0 (the product was not purchased). The same applies to FIGS. 6 and 7.
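The construction of the binary matrix R described above can be sketched as follows. This is a minimal illustration, assuming the ratings are available as (user, item, rating) triples with 1-based IDs; the function name `build_purchase_matrix` and the toy data are hypothetical and do not come from the patent.

```python
from typing import Iterable, List, Tuple


def build_purchase_matrix(
    ratings: Iterable[Tuple[int, int, int]],
    num_users: int,
    num_items: int,
) -> List[List[int]]:
    """Return R with R[i][j] = 1 if user i+1 rated item j+1, else 0.

    The rating value itself (1 to 5) is ignored: any rating is treated
    as a purchase, and a missing rating as no purchase.
    """
    R = [[0] * num_items for _ in range(num_users)]
    for user_id, item_id, _rating in ratings:
        R[user_id - 1][item_id - 1] = 1
    return R


# Toy example (hypothetical data): 3 users, 4 items.
toy_ratings = [(1, 1, 5), (1, 3, 2), (3, 4, 4)]
R = build_purchase_matrix(toy_ratings, num_users=3, num_items=4)
```

For the data set described in the text, the same call would use `num_users=943` and `num_items=1682`.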

(Implementation method)
The hyperparameter values for each method were as follows.
This embodiment: α = β = 0.1, γ = η = 0.01 (k = 1, 2, ..., K; s = 1, 2, ..., S)
IRM: α = β = 0.1, γ = η = 0.01 (k = 1, 2, ..., K; s = 1, 2, ..., S)

In the initialization unit 12, Z and W were initialized by randomly dividing the users and the items into 100 classes each. In the co-clustering end determination unit 16, co-clustering was terminated when the evaluation value (the posterior probability (distribution) of Z and W) had not been updated for 10000 consecutive iterations.
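The initialization and stopping rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: "not updated" is read as "the best evaluation value so far did not improve", and `step` is a hypothetical callable standing in for one round of user class update, item class update, and evaluation. Neither function name comes from the patent.

```python
import random
from typing import Callable, List, Tuple


def random_init(n: int, num_classes: int, rng: random.Random) -> List[int]:
    """Assign each of n users (or items) to one of num_classes classes uniformly at random."""
    return [rng.randrange(num_classes) for _ in range(n)]


def run_until_stalled(step: Callable[[], float], patience: int) -> Tuple[float, int]:
    """Call step() until `patience` consecutive calls fail to improve the best score.

    Returns (best score seen, total number of calls made).
    """
    best = float("-inf")
    stalled = 0
    calls = 0
    while stalled < patience:
        score = step()
        calls += 1
        if score > best:
            best = score
            stalled = 0  # improvement: reset the stall counter
        else:
            stalled += 1
    return best, calls


rng = random.Random(0)
Z = random_init(943, 100, rng)   # user class assignments
W = random_init(1682, 100, rng)  # item class assignments

# Stand-in for the real update loop: a fixed sequence of evaluation values.
scores = iter([1.0, 2.0, 1.5, 1.0, 0.5])
best, calls = run_until_stalled(lambda: next(scores), patience=3)
```

In the experiment described in the text, `patience` would correspond to 10000.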

(Implementation results)
The results of applying IRM and the method of this embodiment to the purchase history data R are compared and evaluated qualitatively. FIG. 6 shows the result of applying the IRM method. FIG. 7 shows the result of applying the method of this embodiment.
In FIGS. 6 and 7, users and items are sorted by the user classes and item classes obtained by co-clustering, and class boundaries are drawn as solid lines. The classes are arranged from the upper left in descending order of number of elements.
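The sorting used for such a display can be sketched as follows. This is a minimal illustration, assuming class assignments are given as integer labels per user and per item; the function names and the tie-breaking rule (by class label, then by original index) are assumptions made for the sketch, not details from the patent.

```python
from collections import Counter
from typing import List


def order_by_class_size(assignments: List[int]) -> List[int]:
    """Return indices ordered so that members of larger classes come first.

    Equally sized classes are ordered by class label, and members of a
    class keep their original relative order.
    """
    sizes = Counter(assignments)
    return sorted(
        range(len(assignments)),
        key=lambda idx: (-sizes[assignments[idx]], assignments[idx], idx),
    )


def reorder_matrix(R: List[List[int]], Z: List[int], W: List[int]) -> List[List[int]]:
    """Permute rows by user class Z and columns by item class W, largest classes first."""
    rows = order_by_class_size(Z)
    cols = order_by_class_size(W)
    return [[R[i][j] for j in cols] for i in rows]


user_order = order_by_class_size([2, 0, 2, 1, 2, 0])
sorted_R = reorder_matrix([[1, 2, 3], [4, 5, 6]], Z=[0, 1], W=[1, 0, 0])
```

With this ordering the largest user class and the largest item class meet in the upper-left block, matching the layout described for FIGS. 6 and 7.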

The number of user classes / item classes obtained was 39/54 for the IRM method and 33/57 for the method of this embodiment. With either method, purchases (the black parts) are concentrated in units of user/item blocks. However, the IRM method lumps users and items with few purchases together into one huge user/item block (the upper-left part of FIG. 6), which shows that it is affected by the R_{i,j} = 0 entries that make up most of the purchase history data. In contrast, with the method of this embodiment, the R_{i,j} = 1 (black) parts are distributed across the whole matrix, as shown in FIG. 7, confirming that it is far less affected by R_{i,j} = 0 than the IRM method.

FIG. 8 shows some of the movie titles in each item class obtained by the method of this embodiment. As FIG. 8 shows, class 6 (item class 6) consists of hit films, class 29 (item class 29) of horror movies, and class 35 (item class 35) of movies for children and families, which indicates that the method of this embodiment yields distinctive item classes.

The IRM method does not yield clustering results of this accuracy. To restate, as shown in FIG. 6, the IRM method is strongly affected by R_{i,j} = 0 (not purchased), which produces a large white region for item class 1 at the left edge, and that region contains a miscellany of movie titles spanning multiple genres (specific titles are omitted). The most serious problem with the IRM method is the emergence of this large white item class region at the left edge. Because the region is a catch-all for movies with few accesses, movies of many genres end up mixed together. In contrast, as shown in FIG. 7, no such large white item class region arises with the method of this embodiment.

A functional block diagram schematically showing the configuration of the co-clustering apparatus according to this embodiment.
A flowchart showing the processing flow of the co-clustering apparatus.
A flowchart subdividing the processing of step S3.
A flowchart subdividing the processing of step S4.
A diagram showing the matrix R of the real data (purchase history data).
A diagram showing the result of applying the IRM method.
A diagram showing the result of applying the method of this embodiment.
A diagram showing some of the movie titles in each item class obtained by the method of this embodiment.
An explanatory diagram of the data generation process assumed in this embodiment.
A diagram showing an example of co-clustering of purchase history data.
An explanatory diagram of the IRM data generation process.

Explanation of symbols

1 Calculation means
2 Storage means
11 Preprocessing unit
12 Initialization unit
13 User class update unit
14 Item class update unit
15 Co-clustering result evaluation unit
16 Co-clustering end determination unit
21 Purchase history DB
22 Co-clustering result storage unit
23 Program
100 Co-clustering apparatus

Claims

A co-clustering apparatus that, in order to cluster item selection information, which indicates whether each user has selected each item or how frequently each user has selected it, as matrix information on the users and the items:
classifies each of the users in the matrix information into one of a plurality of user classes based on a predetermined user class allocation probability formula;
classifies each of the items in the matrix information into one of a plurality of item classes based on a predetermined item class allocation probability formula;
calculates, for each user/item block uniquely specified by a combination of a user class and an item class, a provisional selection probability, which is a provisional probability under the probability model that a user of the user class selects an item of the item class, based on a predetermined item selection probability formula; and
clusters the matrix information in both user units and item units by maximizing the posterior probability of the matrix information calculated from the predetermined user class allocation probability formula, the predetermined item class allocation probability formula, and the predetermined item selection probability formula,
wherein each user class is given a predetermined item class designation probability for each item class, each item class is given a predetermined user class designation probability for each user class, and the predetermined item selection probability formula calculates the provisional selection probability for each user/item block as the product of the predetermined item class designation probability and the predetermined user class designation probability, or as a value based on that product,
the co-clustering apparatus comprising:
a storage unit that stores the item selection information and the matrix information;
a user class update unit that receives the latest matrix information as input and updates the user class to which one or more of the users in the matrix information belong, based on the predetermined user class allocation probability formula and the provisional selection probability;
an item class update unit that receives the latest matrix information as input and updates the item class to which one or more of the items in the matrix information belong, based on the predetermined item class allocation probability formula and the provisional selection probability; and
a co-clustering result evaluation unit that calculates the posterior probability of the matrix information based on the latest matrix information.

The co-clustering apparatus according to claim 1, further comprising a co-clustering end determination unit that repeats the processing by the user class update unit, the item class update unit, and the co-clustering result evaluation unit until a predetermined condition is satisfied and, when the predetermined condition is satisfied, stores in the storage unit the result of co-clustering of the users and the items based on the latest matrix information.

The co-clustering apparatus according to claim 1 or 2, wherein the user class update unit:
selects one user to be updated from the latest matrix information;
calculates, for each existing user class and for a new user class, the probability of migrating the selected user to that class;
determines the user class to which the user is migrated according to those migration probabilities; and
updates information on the user classes based on the determination.

The co-clustering apparatus according to claim 1 or 2, wherein the item class update unit:
selects one item to be updated from the latest matrix information;
calculates, for each existing item class and for a new item class, the probability of migrating the selected item to that class;
determines the item class to which the item is migrated according to those migration probabilities; and
updates information on the item classes based on the determination.

A co-clustering method performed by a co-clustering apparatus that, in order to cluster item selection information, which indicates whether each user has selected each item or how frequently each user has selected it, as matrix information on the users and the items:
classifies each of the users in the matrix information into one of a plurality of user classes based on a predetermined user class allocation probability formula;
classifies each of the items in the matrix information into one of a plurality of item classes based on a predetermined item class allocation probability formula;
calculates, for each user/item block uniquely specified by a combination of a user class and an item class, a provisional selection probability, which is a provisional probability under the probability model that a user of the user class selects an item of the item class, based on a predetermined item selection probability formula; and
clusters the matrix information in both user units and item units by maximizing the posterior probability of the matrix information calculated from the predetermined user class allocation probability formula, the predetermined item class allocation probability formula, and the predetermined item selection probability formula,
wherein each user class is given a predetermined item class designation probability for each item class, each item class is given a predetermined user class designation probability for each user class, and the predetermined item selection probability formula calculates the provisional selection probability for each user/item block as the product of the predetermined item class designation probability and the predetermined user class designation probability, or as a value based on that product,
the co-clustering apparatus including a storage unit that stores the item selection information and the matrix information,
the co-clustering method comprising:
a user class update step in which a user class update unit receives the latest matrix information as input and updates the user class to which one or more of the users in the matrix information belong, based on the predetermined user class allocation probability formula and the provisional selection probability;
an item class update step in which an item class update unit receives the latest matrix information as input and updates the item class to which one or more of the items in the matrix information belong, based on the predetermined item class allocation probability formula and the provisional selection probability; and
a co-clustering result evaluation step in which a co-clustering result evaluation unit calculates the posterior probability of the matrix information based on the latest matrix information.

The co-clustering method according to claim 5, wherein a co-clustering end determination unit repeats the steps performed by the user class update unit, the item class update unit, and the co-clustering result evaluation unit until a predetermined condition is satisfied and, when the predetermined condition is satisfied, executes a step of storing in the storage unit the result of co-clustering of the users and the items based on the latest matrix information.

The co-clustering method according to claim 5 or 6, wherein, in the user class update step, the user class update unit:
selects one user to be updated from the latest matrix information;
calculates, for each existing user class and for a new user class, the probability of migrating the selected user to that class;
determines the user class to which the user is migrated according to those migration probabilities; and
updates information on the user classes based on the determination.

The co-clustering method according to claim 5 or 6, wherein, in the item class update step, the item class update unit:
selects one item to be updated from the latest matrix information;
calculates, for each existing item class and for a new item class, the probability of migrating the selected item to that class;
determines the item class to which the item is migrated according to those migration probabilities; and
updates information on the item classes based on the determination.

A co-clustering program to be executed by a computer constituting the co-clustering apparatus according to any one of claims 1 to 4.

A computer-readable recording medium on which the co-clustering program according to claim 9 is recorded.