JP6079270B2 - Information provision device

Info

Publication number: JP6079270B2
Application number: JP2013015016A
Authority: JP
Inventors: 弘紀水口; 大久寿居
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2013-01-30
Filing date: 2013-01-30
Publication date: 2017-02-15
Anticipated expiration: 2033-01-30
Also published as: JP2014146218A

Description

The present invention relates to an information providing apparatus, and more particularly to an information providing apparatus that provides topics for stimulating communication between users. The present invention also relates to an information providing program and an information providing method.

In virtual spaces, people communicate through a variety of tools: posting diary entries and other content on social networking sites and blogs, commenting on such posts, exchanging e-mail, and so on. To promote this kind of communication, there are systems that recommend topics to serve as subject matter for communication.

A topic that serves as subject matter for communication is considered to be one that both the recommendation target user and that user's communication partner like. Collaborative filtering is one technique for recommending such topics.

Non-Patent Document 1 describes a collaborative filtering method. First, an evaluation history is input. The evaluation history is history information recording each user's evaluations of the topics that user liked. An evaluation may be, for example, a rating given by the user on a five-level scale, or a two-level rating derived from log information indicating whether the user read the topic. Next, users whose preferences are close to those of the recommendation target user (hereinafter, preference-similar users) are found. Closeness of preference is computed as the similarity between user column vectors in the matrix representation of the evaluation history. Here, the matrix representation of the evaluation history is a matrix whose rows are topics, whose columns are users, and whose values are evaluation values. That is, a user column vector is a vector whose elements are that user's evaluation values for the individual topics. Cosine similarity, the Pearson correlation coefficient, or the like is used to compute the similarity. Finally, topics that preference-similar users rated highly and that the recommendation target user has not yet viewed are recommended to the recommendation target user.
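As a rough illustration of the collaborative filtering procedure described above, the following Python sketch computes cosine similarity between user column vectors and picks unseen topics from the most similar user. The function names, the use of a single nearest neighbor, and the small rating matrix are assumptions made for illustration; they are not taken from Non-Patent Document 1.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity of two rating vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recommend_by_cf(ratings, target, top_n=1):
    """ratings: matrix with topics as rows and users as columns."""
    target_col = ratings[:, target]
    # Similarity between the target user's column and every other user's column.
    sims = [cosine_similarity(target_col, ratings[:, u]) if u != target else -1.0
            for u in range(ratings.shape[1])]
    neighbor = int(np.argmax(sims))  # the most preference-similar user
    # Topics the neighbor rated that the target user has not yet evaluated.
    unseen = [t for t in range(ratings.shape[0])
              if ratings[t, neighbor] > 0 and target_col[t] == 0]
    unseen.sort(key=lambda t: -ratings[t, neighbor])
    return neighbor, unseen[:top_n]

# Hypothetical data: 4 topics (rows) rated by 3 users (columns).
ratings = np.array([[1, 1, 0],
                    [1, 1, 0],
                    [0, 1, 0],
                    [0, 0, 1]])
neighbor, unseen = recommend_by_cf(ratings, target=0)
```

Note that nothing in this procedure checks which region of interest a recommended topic belongs to, which is the limitation discussed next.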

Non-Patent Document 1: Yoshinori Hijikata, "1. Preference Extraction and Information Recommendation Technology (Basic Theory of Preference Extraction and Information Recommendation; <Special Feature> Capturing and Making Use of User Preferences: The Forefront of Preference Extraction Technology)", Joho Shori (Information Processing) 48(9), 957-965, 2007
Non-Patent Document 2: Thomas Hofmann, "Probabilistic Latent Semantic Indexing," Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval (SIGIR-99), 1999
Non-Patent Document 3: David M. Blei, Andrew Y. Ng, Michael I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research 3 (4-5), pp. 993-1022
Non-Patent Document 4: T. L. Griffiths et al., "Finding scientific topics," Proc. of the National Academy of Sciences of the United States of America, vol. 101, pp. 5228-5235, 2004
Non-Patent Document 5: Koide et al., "Document Clustering Using Relationships Between Words as Constraints", Proceedings of the Forum on Information Technology 8(2), 269-270
Non-Patent Document 6: Andrzejewski et al., "Incorporating domain knowledge into topic modeling via Dirichlet Forest priors", Proceedings of the 26th International Conference on Machine Learning (ICML 2009), 25-32

However, the technique described above has the following problems. First, because it cannot determine whether topics belong to the same region of interest, it recommends topics that interest the preference-similar user but not the recommendation target user. As a result, it cannot recommend topics that both the recommendation target user and the preference-similar user like, so the recommended topics do not serve as subject matter that stimulates communication.

In addition, a person's interests are divided into several regions of interest. Therefore, even if the recommendation target user and a preference-similar user share one region of interest, they do not necessarily share another. For example, suppose two users, user 1 and user 2, both give favorable ratings to topics related to politics. Even if user 2 also gives favorable ratings to topics related to entertainment, user 1 does not necessarily like entertainment topics.

As described above, collaborative filtering does not distinguish between regions of interest and recommends topics even when they belong to a different region of interest. This causes the problem that it cannot provide topics that both the recommendation target user and the preference-similar user like and that can therefore serve as subject matter for communication.

Accordingly, an object of the present invention is to provide an information providing apparatus, a program, and an information providing method that can solve the problem described above, namely that topics of interest to the user cannot be provided.

An information providing apparatus according to one aspect of the present invention comprises:
input means for inputting evaluation history information consisting of each user's evaluations of each topic;
dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix in which topics are rows, users are columns, and the users' evaluation values for the topics are the values, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
grouping means for associating, as one group for each vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
recommended topic candidate creation means for extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of those topics; and
topic providing means for providing, to a user to whom topics are to be provided, those recommended topic candidates that the user has not yet evaluated according to the evaluation history information.

A program according to another aspect of the present invention causes an information processing apparatus to implement:
input means for inputting evaluation history information consisting of each user's evaluations of each topic;
dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix in which topics are rows, users are columns, and the users' evaluation values for the topics are the values, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
grouping means for associating, as one group for each vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
recommended topic candidate creation means for extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of those topics; and
topic providing means for providing, to a user to whom topics are to be provided, those recommended topic candidates that the user has not yet evaluated according to the evaluation history information.

An information providing method according to another aspect of the present invention comprises:
inputting evaluation history information consisting of each user's evaluations of each topic;
creating, based on the evaluation history information, a two-dimensional matrix in which topics are rows, users are columns, and the users' evaluation values for the topics are the values, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
associating, as one group for each vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of those topics; and
providing, to a user to whom topics are to be provided, those recommended topic candidates that the user has not yet evaluated according to the evaluation history information.

Configured as described above, the present invention can provide topics of interest to the user.

FIG. 1 is a block diagram showing the configuration of the entire system including the topic recommendation device according to Embodiment 1 of the present invention.
FIG. 2 is a flowchart explaining the overall flow of operation of Embodiment 1 of the present invention.
FIG. 3 is a diagram showing an example of the evaluation history storage means shown in FIG. 1.
FIG. 4 is a diagram showing an example of the two-dimensional matrix created by the dimension compression means shown in FIG. 1.
FIG. 5 is a diagram showing examples of topic titles in the evaluation history storage means shown in FIG. 1.
FIG. 6 is a conceptual diagram explaining the compression method of the dimension compression means shown in FIG. 1.
FIG. 7 is a diagram showing an example of the output of the grouping means shown in FIG. 1.
FIG. 8 is a diagram showing an example of the output of the recommended topic candidate creation means shown in FIG. 1.
FIG. 9 is a block diagram showing the configuration of the entire system including the topic recommendation device according to Embodiment 2 of the present invention.
FIG. 10 is a flowchart explaining the overall flow of operation of Embodiment 2 of the present invention.
FIG. 11 is a diagram showing an example of the communication partner storage means shown in FIG. 9.
FIG. 12 is a block diagram showing the configuration of the entire system including the topic recommendation device according to Embodiment 3 of the present invention.
FIG. 13 is a flowchart explaining the overall flow of operation of Embodiment 3 of the present invention.
FIG. 14 is a diagram showing an example of the user model storage means shown in FIG. 12.
FIG. 15 is a diagram showing an example of the content model storage means shown in FIG. 12.
FIG. 16 is a block diagram showing the configuration of the entire system including the topic recommendation device according to Embodiment 4 of the present invention.
FIG. 17 is a flowchart explaining the overall flow of operation of Embodiment 4 of the present invention.
FIG. 18 is a block diagram showing an example of a computer that implements the topic recommendation device according to Embodiments 1 to 3 of the present invention.

<Embodiment 1>
[Device configuration]
Hereinafter, a topic recommendation device, a topic recommendation method, and a program according to Embodiment 1 of the present invention are described with reference to FIGS. 1 to 8 and FIG. 18.

First, the configuration of the topic recommendation device (information providing apparatus) according to Embodiment 1 is described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the topic recommendation device according to Embodiment 1 of the present invention.

As shown in FIG. 1, the topic recommendation device according to Embodiment 1 includes evaluation history storage means 121, input means 101, dimension compression means 102, grouping means 103, recommendation candidate creation means 104, and recommendation means 105. Each of these operates roughly as follows.

The evaluation history storage means 121 stores evaluation history information representing the history of each user's evaluations of each topic. Specifically, each record includes a user, a topic, and an evaluation. It may also include other items, such as the evaluation time.

The input means 101 accepts evaluation history information from the evaluation history storage means 121 and passes it to the dimension compression means 102.

The dimension compression means 102 converts the input evaluation history information into compressed vector representations of topics and of users. Specifically, it first creates from the evaluation history information a two-dimensional matrix in which topics are rows, users are columns, and evaluation values are the values. It then compresses the two-dimensional matrix by treating users who evaluated the same topic, together with topics evaluated by the same user, as one dimension. Finally, it outputs a compressed-dimension vector representation of each topic and of each user, whose elements are the compressed dimensions.

The two-dimensional matrix is compressed using a topic model. As the specific computation method, Probabilistic Latent Semantic Analysis (PLSA), described in Non-Patent Document 2, or Latent Dirichlet Allocation (LDA), described in Non-Patent Document 3, is used. In Embodiment 2, described later, a constraint is imposed so that communication partners are compressed into the same dimension.

A topic model takes as input a set of documents, each consisting of multiple words, and, based on the hypothesis that each document is generated from K latent topics, represents each document as a distribution over the K latent topics. A latent topic is a category that is latent in the documents, and each latent topic is represented as a distribution over words. In the word distribution of a latent topic, words that tend to appear in the same documents have high probability. A document is thus represented by a K-dimensional vector whose elements correspond to the K latent topics and whose values are the probabilities of those latent topics. A word is likewise represented by a K-dimensional vector whose elements correspond to the K latent topics and whose values are the probabilities of those latent topics.

In this embodiment, dimensions are compressed by a topic model in which topics play the role of documents, users play the role of words, and evaluation values play the role of word frequencies. That is, the word distribution of a latent topic becomes a distribution over users. As a result, users who evaluated the same topic (document) have high probability in the same latent topic. A topic is represented by a K-dimensional vector whose elements correspond to the K latent topics and whose values are the probabilities of those latent topics. A user is likewise represented by a K-dimensional vector whose elements correspond to the K latent topics and whose values are the probabilities of those latent topics.

The grouping means 103 receives the compressed-dimension vectors of the topics and of the users and, for each dimension, gathers into one group the topics whose value in that dimension is equal to or greater than a first threshold and the users whose value in that dimension is equal to or greater than a second threshold.
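The grouping rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary-based data layout, the example vectors, and the threshold value 0.5 are all hypothetical and do not come from the specification.

```python
def group_by_dimension(topic_vecs, user_vecs, first_threshold, second_threshold):
    """Return one group per compressed dimension.

    topic_vecs and user_vecs map an ID to its K-element compressed vector.
    """
    k = len(next(iter(topic_vecs.values())))
    groups = []
    for dim in range(k):
        groups.append({
            # Topics whose value in this dimension reaches the first threshold.
            "topics": [t for t, v in topic_vecs.items() if v[dim] >= first_threshold],
            # Users whose value in this dimension reaches the second threshold.
            "users": [u for u, v in user_vecs.items() if v[dim] >= second_threshold],
        })
    return groups

# Hypothetical two-dimensional vectors and thresholds, for illustration only.
topic_vecs = {"topic A": (0.9, 0.1), "topic B": (0.5, 0.5), "topic C": (0.2, 0.8)}
user_vecs = {"user P": (0.7, 0.3), "user Q": (0.4, 0.6)}
groups = group_by_dimension(topic_vecs, user_vecs, 0.5, 0.5)
```

Because the thresholds are applied per dimension, a topic or user with comparable values in several dimensions (such as "topic B" here) can belong to more than one group.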

The recommended topic candidate creation means 104 receives the groups and, for each user, extracts the groups to which the user belongs, extracts the topics associated with those groups, and orders the topics by their compressed-dimension scores to create the recommended topic candidates.
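A minimal sketch of this candidate-creation step follows. The function name and data layout are hypothetical, and the rule for a topic that appears in several of the user's groups (keep its highest score) is an assumption of this sketch, since the text above does not specify it.

```python
def create_candidates(groups, topic_vecs, user):
    """Order the topics of the user's groups by compressed-dimension score."""
    best_score = {}
    for dim, group in enumerate(groups):
        if user not in group["users"]:
            continue
        for topic in group["topics"]:
            score = topic_vecs[topic][dim]
            # Assumption: a topic found in several groups keeps its highest score.
            best_score[topic] = max(best_score.get(topic, 0.0), score)
    return sorted(best_score, key=best_score.get, reverse=True)

# Hypothetical groups (one per compressed dimension) and topic vectors.
groups = [
    {"topics": ["topic A", "topic B"], "users": ["user P"]},
    {"topics": ["topic B", "topic C"], "users": ["user P", "user Q"]},
]
topic_vecs = {"topic A": (0.9, 0.1), "topic B": (0.5, 0.5), "topic C": (0.2, 0.8)}
candidates = create_candidates(groups, topic_vecs, "user P")
```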

The recommendation means 105 receives the evaluation history information and the recommended topic candidates and, for each user who is to receive recommendations, recommends topics that the user has not yet evaluated. In doing so, it takes topics from the top of the ordered list produced by the recommended topic candidate creation means 104, up to a preset number of topics to recommend.
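The final filtering step might look like the following sketch. The function name, the record layout `(user_id, topic_id, value)`, and the sample data are hypothetical.

```python
def recommend(candidates, history, user, max_topics):
    """Return up to max_topics candidates the user has not yet evaluated.

    candidates: topic IDs already ordered by score, best first.
    history: list of (user_id, topic_id, value) evaluation records.
    """
    evaluated = {topic for u, topic, _ in history if u == user}
    unevaluated = [topic for topic in candidates if topic not in evaluated]
    return unevaluated[:max_topics]

# Hypothetical data: user P has already evaluated topic A.
history = [("user P", "topic A", 1), ("user Q", "topic C", 1)]
candidates = ["topic A", "topic C", "topic B"]
recommended = recommend(candidates, history, "user P", max_topics=1)
```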

[Device operation]
Next, the operation of the topic recommendation device according to Embodiment 1 is described in detail with reference to FIGS. 2 to 8. FIG. 2 is a flowchart explaining the overall flow of operation of Embodiment 1. In the following description, FIG. 1 is referred to as appropriate. In Embodiment 1, the topic recommendation method is carried out by operating the topic recommendation device. The description of the topic recommendation method in Embodiment 1 is therefore replaced by the following description of the operation of the topic recommendation device.

[Matrix representation of evaluation history]
First, the input means 101 accepts evaluation history information from the evaluation history storage means 121, and the dimension compression means 102 converts the evaluation history information into a matrix representation (step S1).

FIG. 3 shows an example of the evaluation history information stored in the evaluation history storage means 121. The evaluation history information includes a user ID, a topic ID, an evaluation value, and an evaluation time. The evaluation value may be an explicit rating assigned by the user, for example on a five-level scale, or an implicit value such as 1 assigned from a log showing that the user viewed the content. FIG. 3 assumes implicit evaluation: when a user has viewed the content, the evaluation value is 1. In the figures, "..." indicates omission (the same applies hereinafter).

The dimension compression means converts the evaluation history information into a two-dimensional matrix in which topics are rows, users are columns, and evaluation values are the values. FIG. 4 shows an example of the matrix representation. FIG. 4 expresses the topic IDs, user IDs, and evaluation values of the evaluation history information as rows, columns, and values, respectively. For example, it shows that user 1 gave topic 1 an evaluation value of 1. An evaluation value of 0 indicates that the user did not evaluate the topic.
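The conversion from evaluation records to a matrix can be sketched as follows. The function name, the tuple layout `(user_id, topic_id, value)`, and the sample records are hypothetical; the evaluation time is omitted because it is not needed for the matrix.

```python
import numpy as np

def build_matrix(history):
    """Build a topics-by-users matrix from (user_id, topic_id, value) records.

    history must be a list (it is traversed more than once).
    Unevaluated (topic, user) pairs are left at 0.
    """
    users = sorted({u for u, _, _ in history})
    topics = sorted({t for _, t, _ in history})
    user_index = {u: i for i, u in enumerate(users)}
    topic_index = {t: i for i, t in enumerate(topics)}
    matrix = np.zeros((len(topics), len(users)))
    for u, t, value in history:
        matrix[topic_index[t], user_index[u]] = value
    return matrix, topics, users

# Hypothetical implicit-evaluation records (value 1 means "viewed").
history = [("user1", "topic1", 1), ("user2", "topic1", 1), ("user2", "topic2", 1)]
matrix, topics, users = build_matrix(history)
```

IDs are ordered lexicographically here purely for reproducibility; any stable ordering of rows and columns would serve.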

FIG. 5 shows examples of the topics corresponding to the topic IDs. Topics 1 and 2 relate to politics, topics 3 and 4 relate to both politics and entertainment, and topics 5 and 6 relate to entertainment. Comparing these with the example matrix representation shows that users 1 and 2 evaluate topics in the politics region of interest, users 3 and 4 evaluate topics in two different regions of interest, politics and entertainment, and users 5 and 6 evaluate topics in the entertainment region of interest.

[Dimension compression of evaluation history]
Next, the dimension compression means 102 compresses the dimensions of the matrix representation (step S2). A topic model technique is used for the dimension compression. The PLSA-based technique is described below.

First, PLSA expresses the joint probability of topic x and user y in terms of latent topics z as follows.
P( x , y ) = P( x ) Σ_{ z ∈ Z } P( y | z ) P( z | x )
The joint probability P( x , y ) is the probability that user y evaluated topic x.

The parameters of the PLSA model can be estimated with the EM algorithm or a similar method. Using the result to obtain P( z | x ) and P( z | y ) gives, respectively, the probabilities that represent topic x in terms of the K latent topics and the probabilities that represent user y in terms of the K latent topics. That is, the compressed vector representation of a topic is the topic's probability distribution over latent topics: specifically, a vector whose elements correspond to the latent topics and whose values are P( z=k | x=d ) (the probability of the k-th latent topic for the d-th topic). The compressed vector representation of a user is the user's probability distribution over latent topics: a vector whose elements correspond to the latent topics and whose values are P( z=k | y=u ) (the probability of the k-th latent topic for the u-th user).
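For readers who want to see the EM procedure concretely, here is a minimal NumPy sketch. It is an assumption-laden illustration, not the specification's implementation: it uses the common symmetric PLSA parameterization P(x, y) = Σ_z P(z) P(x|z) P(y|z), random initialization, a fixed iteration count, and a hypothetical block-structured input matrix. P(z|x) and P(z|y) are recovered at the end by Bayes' rule.

```python
import numpy as np

def plsa(counts, k, iterations=100, seed=0):
    """Fit PLSA by EM. counts: topics-by-users matrix of evaluation values.

    Returns P(z|x) with shape (n_topics, k) and P(z|y) with shape (n_users, k).
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    n_topics, n_users = counts.shape
    eps = 1e-12
    p_z = np.full(k, 1.0 / k)
    p_x_given_z = rng.random((k, n_topics))
    p_x_given_z /= p_x_given_z.sum(axis=1, keepdims=True)
    p_y_given_z = rng.random((k, n_users))
    p_y_given_z /= p_y_given_z.sum(axis=1, keepdims=True)
    for _ in range(iterations):
        # E-step: posterior P(z | x, y), shape (k, n_topics, n_users).
        joint = p_z[:, None, None] * p_x_given_z[:, :, None] * p_y_given_z[:, None, :]
        posterior = joint / np.maximum(joint.sum(axis=0, keepdims=True), eps)
        # M-step: re-estimate parameters from posterior-weighted counts.
        weighted = counts[None, :, :] * posterior
        p_x_given_z = weighted.sum(axis=2)
        p_y_given_z = weighted.sum(axis=1)
        p_z = weighted.sum(axis=(1, 2))
        p_x_given_z /= np.maximum(p_x_given_z.sum(axis=1, keepdims=True), eps)
        p_y_given_z /= np.maximum(p_y_given_z.sum(axis=1, keepdims=True), eps)
        p_z /= p_z.sum()
    # Bayes' rule: P(z|x) is proportional to P(z) P(x|z); likewise for users.
    p_z_given_x = (p_z[:, None] * p_x_given_z).T
    p_z_given_x /= np.maximum(p_z_given_x.sum(axis=1, keepdims=True), eps)
    p_z_given_y = (p_z[:, None] * p_y_given_z).T
    p_z_given_y /= np.maximum(p_z_given_y.sum(axis=1, keepdims=True), eps)
    return p_z_given_x, p_z_given_y

# Hypothetical 4 topics x 4 users matrix with two blocks of co-evaluation.
counts = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
topic_vectors, user_vectors = plsa(counts, k=2)
```

Each row of `topic_vectors` and `user_vectors` is a K-element probability vector, which is the form the grouping step expects. The result depends on the random initialization, so the order of the latent topics is arbitrary.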

Topic modeling is performed in the same way with the LDA model. A method for solving the LDA model is described in Non-Patent Document 4.

Non-Patent Document 4 introduces the following method based on a Gibbs sampler. As described above, topic modeling by LDA is performed with topics as documents and users as words.

First, a latent topic is randomly assigned to each user. Next, r is sampled at random from a uniform distribution. Next, for each position at which a user appears, the value of Equation 1 below is computed. If the value of Equation 1 is equal to or greater than r, the latent topic number is incremented. Otherwise, the latent topic is assigned to that position. Finally, P( z | x ) and P( z | y ) are obtained from the assigned latent topic numbers.

Here, i denotes a position at which a user appears, j denotes the latent topic number assigned to that position, and K denotes the number of latent topics. The function Q is computed by Equation 2 below.

Here, the function q is given by Equation 3 below.

Here, the value of Equation 4 in Equation 3 is the number of occurrences of user number u, the user at position i, that are assigned latent topic number j, excluding position i.

The value of Equation 5 in Equation 3 is the number of positions other than position i that are assigned latent topic number j, regardless of user number.

The value of Equation 6 in Equation 3 is the number of positions in topic number d, the topic containing position i, other than position i, that are assigned latent topic number j.

The value of Equation 7 in Equation 3 is the number of positions in topic number d other than position i.
W is the number of users, K is the number of latent topics, and α and β are parameters of the prior distributions.
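Equations 1 to 7 are not reproduced in this text, so the following sketch should be read with care. It assumes the standard collapsed Gibbs sampling weight from Non-Patent Document 4, which matches the four counts described above: (n_uj + β) / (n_j + Wβ) × (n_dj + α) / (n_d + Kα), with all counts excluding position i. The last denominator does not depend on j and is dropped. The function name, default hyperparameters, inverse-CDF sampling via a cumulative sum, and the final normalized-count estimates of P(z|x) and P(z|y) are assumptions of this sketch.

```python
import numpy as np

def lda_gibbs(docs, n_users, k, alpha=0.1, beta=0.01, sweeps=200, seed=0):
    """Collapsed Gibbs sampling for LDA with topics as documents, users as words.

    docs[d] lists the user indices that evaluated topic d, one per evaluation.
    Returns P(z|x) with shape (n_topics, k) and P(z|y) with shape (n_users, k).
    """
    rng = np.random.default_rng(seed)
    n_user_z = np.zeros((n_users, k))   # occurrences of user u assigned to j
    n_doc_z = np.zeros((len(docs), k))  # positions in topic d assigned to j
    n_z = np.zeros(k)                   # all positions assigned to j
    assignments = []
    for d, doc in enumerate(docs):
        z_d = rng.integers(k, size=len(doc))  # random initial assignment
        assignments.append(z_d)
        for u, j in zip(doc, z_d):
            n_user_z[u, j] += 1
            n_doc_z[d, j] += 1
            n_z[j] += 1
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, u in enumerate(doc):
                j = assignments[d][i]
                # Remove position i from the counts.
                n_user_z[u, j] -= 1
                n_doc_z[d, j] -= 1
                n_z[j] -= 1
                # Unnormalized conditional weight for each latent topic.
                weights = (n_user_z[u] + beta) / (n_z + n_users * beta) * (n_doc_z[d] + alpha)
                # Draw r uniformly and walk the cumulative weights to pick j.
                r = rng.random() * weights.sum()
                j = min(int(np.searchsorted(np.cumsum(weights), r)), k - 1)
                assignments[d][i] = j
                n_user_z[u, j] += 1
                n_doc_z[d, j] += 1
                n_z[j] += 1
    p_z_given_x = (n_doc_z + alpha) / (n_doc_z + alpha).sum(axis=1, keepdims=True)
    p_z_given_y = (n_user_z + beta) / (n_user_z + beta).sum(axis=1, keepdims=True)
    return p_z_given_x, p_z_given_y

# Hypothetical input: topics 0 and 1 evaluated by users 0 and 1, topics 2 and 3 by users 2 and 3.
docs = [[0, 1], [0, 1], [2, 3], [2, 3]]
topic_vectors, user_vectors = lda_gibbs(docs, n_users=4, k=2)
```

The returned vectors are estimates from a single final sample; averaging over several samples after burn-in is a common refinement that this sketch omits.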

FIG. 6 is a conceptual diagram of how latent topics are generated in the topic model. The topic model assumes that a topic has several latent topics and that latent topics are evaluated by users. Therefore, topics evaluated by the same users have the same probability distribution over latent topics. Likewise, users who evaluated the same topics have the same probability distribution over latent topics. Accordingly, topics 1 and 2, topics 3 and 4, and topics 5 and 6 each have similar latent-topic probability distributions P(z|x). Users 1 and 2, users 3 and 4, and users 5 and 6 likewise each have similar latent-topic probability distributions P(z|y). For these distributions to be similar, the portions enclosed by dotted lines in FIG. 6, that is, the topics evaluated by the same users together with the users who evaluated the same topics, become the latent topics, and each latent-topic probability distribution is formed according to how much of each such portion is included.

潜在トピック数Kを２個とした場合、以下のように潜在トピックの確率分布が生成されたとする。
話題１の圧縮ベクトル表現＝P(z|x=話題１)＝（0.8,0.2）
話題２の圧縮ベクトル表現＝P(z|x=話題２)＝（0.7,0.3）
話題３の圧縮ベクトル表現＝P(z|x=話題３)＝（0.5,0.5）
話題４の圧縮ベクトル表現＝P(z|x=話題４)＝（0.5,0.5）
話題５の圧縮ベクトル表現＝P(z|x=話題５)＝（0.3,0.7）
話題６の圧縮ベクトル表現＝P(z|x=話題６)＝（0.2,0.8）
ユーザ１の圧縮ベクトル表現＝P(z|y=ユーザ１)＝（0.7,0.3）
ユーザ２の圧縮ベクトル表現＝P(z|y=ユーザ２)＝（0.8,0.2）
ユーザ３の圧縮ベクトル表現＝P(z|y=ユーザ３)＝（0.5,0.5）
ユーザ４の圧縮ベクトル表現＝P(z|y=ユーザ４)＝（0.5,0.5）
ユーザ５の圧縮ベクトル表現＝P(z|y=ユーザ５)＝（0.2,0.8）
ユーザ６の圧縮ベクトル表現＝P(z|y=ユーザ６)＝（0.3,0.7） When the number of latent topics K is 2, it is assumed that a probability distribution of latent topics is generated as follows.
Compressed vector representation of topic 1 = P (z | x = topic 1) = (0.8,0.2)
Compressed vector representation of topic 2 = P (z | x = topic 2) = (0.7,0.3)
Compressed vector representation of topic 3 = P (z | x = topic 3) = (0.5,0.5)
Compressed vector representation of topic 4 = P (z | x = topic 4) = (0.5, 0.5)
Compressed vector representation of topic 5 = P (z | x = topic 5) = (0.3,0.7)
Compressed vector representation of topic 6 = P (z | x = topic 6) = (0.2,0.8)
Compressed vector representation of user 1 = P (z | y = user 1) = (0.7,0.3)
Compressed vector representation of user 2 = P (z | y = user 2) = (0.8,0.2)
Compressed vector representation of user 3 = P (z | y = user 3) = (0.5,0.5)
Compressed vector representation of user 4 = P (z | y = user 4) = (0.5,0.5)
Compressed vector representation of user 5 = P (z | y = user 5) = (0.2,0.8)
Compressed vector representation of user 6 = P (z | y = user 6) = (0.3,0.7)

ここで、潜在トピック数Kは、あらかじめ決めておいてもよい。また、１ユーザの興味領域数と、１ユーザのコミュニケーション相手の数の平均と、全ユーザ数とから全体の興味領域数を推定し、この値を潜在トピック数としてもよい。ここで、全ユーザ数は評価履歴蓄積手段１２１を参照することで得ることができる。 Here, the number of latent topics K may be determined in advance. Alternatively, the overall number of interest areas may be estimated from the number of interest areas per user, the average number of communication partners per user, and the total number of users, and this estimate may be used as the number of latent topics. The total number of users can be obtained by referring to the evaluation history storage unit 121.

全体の興味領域数＝興味領域の異なり数÷１興味領域のユーザ数
興味領域の異なり数＝１ユーザの興味領域数×全ユーザ数
１興味領域のユーザ数＝コミュニケーション相手数平均＋１ Overall number of interest areas = number of different interest areas ÷ number of users per interest area
Number of different interest areas = number of interest areas per user × total number of users
Number of users per interest area = average number of communication partners + 1

例えば、１ユーザの興味領域数が２、１ユーザのコミュニケーション相手の数が３、全ユーザ数が６とすると、全体の興味領域数は以下のように計算される。全体の興味領域数を潜在トピック数とする。
全体の興味領域数＝2 × 6 ÷ ( 3 + 1 ) ＝ 3 For example, if the number of interest areas per user is 2, the number of communication partners per user is 3, and the total number of users is 6, the overall number of interest areas is calculated as follows and is used as the number of latent topics.
Total number of areas of interest = 2 × 6 ÷ (3 + 1) = 3
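The estimate above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `estimate_num_latent_topics` and its parameter names are assumptions introduced here, and rounding the result to an integer of at least 1 is also an assumption, since the text does not say how a non-integer estimate is handled.

```python
def estimate_num_latent_topics(interests_per_user, avg_partners, total_users):
    """Estimate the number of latent topics K from the three quantities in the text."""
    # Number of different interest areas = interest areas per user x total users
    different_interest_areas = interests_per_user * total_users
    # Users per interest area = average number of communication partners + 1
    users_per_interest_area = avg_partners + 1
    estimate = different_interest_areas / users_per_interest_area
    # Assumption: round to the nearest integer and never go below 1.
    return max(1, round(estimate))


k = estimate_num_latent_topics(interests_per_user=2, avg_partners=3, total_users=6)
print(k)  # 3, matching the worked example: 2 x 6 / (3 + 1)
```

Because only the total number of users changes as the service grows, re-running this calculation is enough to re-derive K.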

このように推定することで、ユーザの増加に応じて、潜在トピック数を自動的に調節することができる。また、ユーザが増加しても、１ユーザの興味領域数や１ユーザのコミュニケーション相手はそれほど変化しないと考えられるため有効である。 Estimating K in this way allows the number of latent topics to be adjusted automatically as the number of users grows. The estimate remains useful as users are added, because the number of interest areas per user and the number of communication partners per user are not expected to change much.

［グループ化］
次に、グループ化手段１０３が、話題の圧縮ベクトル表現とユーザの圧縮ベクトル表現を入力にグループを作成する（ステップＳ３）。ユーザの圧縮ベクトル表現と、話題の圧縮ベクトル表現を参照し、ベクトル要素毎に、特定の閾値以上の値を持つユーザと話題を同じグループとする。 [Grouping]
Next, the grouping means 103 creates a group with the compressed vector expression of the topic and the compressed vector expression of the user as input (step S3). The user's compressed vector expression and the topic's compressed vector expression are referred to, and for each vector element, the user and topic having a value equal to or greater than a specific threshold value are grouped together.

上述のユーザの圧縮ベクトル表現と話題の圧縮ベクトル表現とに対して、ユーザの閾値を0.5、話題の閾値を0.5とする。ユーザの圧縮ベクトル表現と話題の圧縮ベクトル表現との第１要素を参照すると、閾値以上のユーザは、ユーザ１、ユーザ２、ユーザ３、ユーザ４、閾値以上の話題は、話題１、話題２、話題３、話題４である。これを一つのグループとしてまとめる。第２要素についても同様にグループとしてまとめる。 For the above-described user compressed vector representation and topic compressed vector representation, the user threshold is set to 0.5 and the topic threshold is set to 0.5. Referring to the first element of the user's compressed vector expression and the topic's compressed vector expression, the users above the threshold are User 1, User 2, User 3, User 4, and the topics above the threshold are Topic 1, Topic 2, Topic 3 and Topic 4. Put this together as a group. The second element is similarly grouped together.
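The grouping step can be illustrated with the example vectors given earlier. The following is a minimal sketch, not the patented implementation: the names `build_groups`, `USER_VECTORS`, and `TOPIC_VECTORS`, and the dictionary layout of a group, are assumptions made for illustration.

```python
# Compressed vector representations from the example (K = 2).
TOPIC_VECTORS = {
    "topic1": (0.8, 0.2), "topic2": (0.7, 0.3), "topic3": (0.5, 0.5),
    "topic4": (0.5, 0.5), "topic5": (0.3, 0.7), "topic6": (0.2, 0.8),
}
USER_VECTORS = {
    "user1": (0.7, 0.3), "user2": (0.8, 0.2), "user3": (0.5, 0.5),
    "user4": (0.5, 0.5), "user5": (0.2, 0.8), "user6": (0.3, 0.7),
}


def build_groups(user_vectors, topic_vectors, user_threshold=0.5, topic_threshold=0.5):
    """Create one group per vector element (latent topic)."""
    num_dims = len(next(iter(user_vectors.values())))
    groups = []
    for k in range(num_dims):
        # Keep every user and topic whose k-th element reaches its threshold,
        # together with the probability value that put it in the group.
        users = {u: v[k] for u, v in user_vectors.items() if v[k] >= user_threshold}
        topics = {t: v[k] for t, v in topic_vectors.items() if v[k] >= topic_threshold}
        groups.append({"users": users, "topics": topics})
    return groups


groups = build_groups(USER_VECTORS, TOPIC_VECTORS)
for number, group in enumerate(groups, start=1):
    print(number, sorted(group["users"]), sorted(group["topics"]))
```

With both thresholds at 0.5, the first element yields users 1 to 4 with topics 1 to 4, and the second element yields users 3 to 6 with topics 3 to 6, so users 3 and 4 and topics 3 and 4 fall into both groups.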

生成されたグループの例を図７に示す。グループ、ユーザ、話題の組合せができる。ユーザは、カンマ区切りで複数のユーザを示し、「：」区切りで潜在トピックの確率値を示している。話題も、カンマ区切りで複数の話題を示し、「：」区切りで潜在トピックの確率値を示している。 An example of the generated group is shown in FIG. You can combine groups, users, and topics. The user indicates a plurality of users separated by commas, and indicates the probability value of the potential topic by separating “:”. The topics also indicate a plurality of topics separated by commas, and the probability values of latent topics are separated by “:”.

図７においては、ユーザ３とユーザ４はグループ１とグループ２の両方に所属している。話題３と話題４も、グループ１とグループ２の両方に所属している。図４と図５を参照すると、ユーザ３とユーザ４は政治と芸能関連の話題に興味を示しており、グループ１は政治関連、グループ２は芸能関連を示している。したがって、ユーザ３、ユーザ４、話題３、話題４は両方のグループに所属する。 In FIG. 7, user 3 and user 4 belong to both group 1 and group 2. Topic 3 and Topic 4 also belong to both Group 1 and Group 2. Referring to FIGS. 4 and 5, the user 3 and the user 4 are interested in topics related to politics and performing arts, the group 1 represents politics and the group 2 represents entertainment related. Therefore, user 3, user 4, topic 3, and topic 4 belong to both groups.

この例では、ユーザの閾値と、話題の閾値はあらかじめ決めておいてもよい。また、１ユーザの興味領域数とユーザの圧縮ベクトル表現とからユーザの閾値を決定し、ユーザの閾値と評価履歴情報と話題の圧縮ベクトル表現とから話題の閾値を決定しても良い。以下のように決定する。１ユーザが所属するグループ数の平均が、１ユーザの興味領域数と同じ程度となるまで、ユーザの閾値を２分探索していくことで決定する。ユーザの閾値を元に、ユーザをグループにまとめる。次に、評価履歴情報を参照し、各グループ内のユーザが評価した話題を取得する。次に、話題の圧縮ベクトル表現を参照し、前記話題の潜在トピックの確率値を取得し、最も低い値を閾値とする。 In this example, the user threshold and the topic threshold may be determined in advance. Alternatively, the user threshold may be determined from the number of interest areas per user and the users' compressed vector representations, and the topic threshold may be determined from the user threshold, the evaluation history information, and the topics' compressed vector representations, as follows. The user threshold is found by binary search, continuing until the average number of groups to which a user belongs is approximately equal to the number of interest areas per user. Users are then collected into groups according to the user threshold. Next, the evaluation history information is consulted to obtain the topics evaluated by the users in each group. Finally, the compressed vector representations of those topics are consulted to obtain their latent-topic probability values, and the lowest value is taken as the topic threshold.
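The threshold selection just described might look like the sketch below. Several points are assumptions rather than statements from the text: the search interval [0, 1], the tolerance `tol` and iteration cap `max_iter` that decide when the average is "approximately equal", the choice to compute one topic threshold per group, and the `HISTORY` table, which is hypothetical data and not the content of FIG. 4.

```python
USER_VECTORS = {
    "user1": (0.7, 0.3), "user2": (0.8, 0.2), "user3": (0.5, 0.5),
    "user4": (0.5, 0.5), "user5": (0.2, 0.8), "user6": (0.3, 0.7),
}
TOPIC_VECTORS = {
    "topic1": (0.8, 0.2), "topic2": (0.7, 0.3), "topic3": (0.5, 0.5),
    "topic4": (0.5, 0.5), "topic5": (0.3, 0.7), "topic6": (0.2, 0.8),
}
# Hypothetical evaluation history: user -> topics that the user evaluated.
HISTORY = {
    "user1": ["topic1"], "user2": ["topic1", "topic2"],
    "user3": ["topic3", "topic4"], "user4": ["topic3", "topic4"],
    "user5": ["topic5", "topic6"], "user6": ["topic6"],
}


def avg_groups_per_user(user_vectors, threshold):
    """Average number of vector elements per user that reach the threshold."""
    counts = [sum(1 for p in vec if p >= threshold) for vec in user_vectors.values()]
    return sum(counts) / len(counts)


def search_user_threshold(user_vectors, interests_per_user, tol=0.05, max_iter=30):
    """Binary-search the user threshold on the interval [0, 1]."""
    low, high = 0.0, 1.0
    mid = (low + high) / 2
    for _ in range(max_iter):
        mid = (low + high) / 2
        avg = avg_groups_per_user(user_vectors, mid)
        if abs(avg - interests_per_user) <= tol:
            break
        if avg > interests_per_user:
            low = mid   # too many memberships: raise the threshold
        else:
            high = mid  # too few memberships: lower the threshold
    return mid


def topic_thresholds(user_vectors, topic_vectors, history, user_threshold):
    """Per group, the lowest probability among topics evaluated by its users."""
    num_dims = len(next(iter(user_vectors.values())))
    result = []
    for k in range(num_dims):
        members = [u for u, vec in user_vectors.items() if vec[k] >= user_threshold]
        rated = {t for u in members for t in history.get(u, ())}
        result.append(min((topic_vectors[t][k] for t in rated), default=None))
    return result


print(search_user_threshold(USER_VECTORS, interests_per_user=2))
print(topic_thresholds(USER_VECTORS, TOPIC_VECTORS, HISTORY, user_threshold=0.5))
```

Because the membership count is a step function of the threshold, the target average may not be reachable exactly; the iteration cap keeps the search finite in that case.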

［推薦候補作成］
次に、推薦話題候補作成手段１０４が、各ユーザに対して推薦対象となる話題の候補を選定して、推薦話題候補を作成する（ステップＳ４）。 [Recommendation candidate creation]
Next, the recommended topic candidate creation means 104 selects a topic candidate to be recommended for each user and creates a recommended topic candidate (step S4).

推薦話題候補作成手段１０４は、グループ化手段１０３からグループ、ユーザ、話題の対応表の入力を受け、ユーザ毎に、所属するグループを抽出する。次に、当該グループに対応する話題を抽出する。当該話題が複数のグループから抽出された場合、その合計をスコアとする。次にスコア順に話題を並べ、当該スコアが高いものから順に推薦対象の話題候補とする。 The recommended topic candidate creation unit 104 receives the correspondence table of groups, users, and topics from the grouping unit 103 and, for each user, extracts the groups to which that user belongs. It then extracts the topics associated with those groups. When a topic is extracted from more than one group, the sum of its values over those groups is used as its score. The topics are then sorted by score and taken as recommendation candidates in descending order of score.

ここで、話題のスコアは以下のように数式で表すことができる。
話題iのスコア＝Σ_{k ∈ G} P( z=k | d=i )
（ただし、Gは話題iが所属するグループの集合を表す） Here, the topic score can be expressed by a mathematical formula as follows.
Topic i score = Σ_ {k ∈ G} P (z = k | d = i)
(G represents the set of groups to which topic i belongs.)

図８に、推薦話題候補作成手段１０４が作成した各ユーザに対する推薦対象となる話題候補の例を示す。これは、図７のグループ、ユーザ、話題の対応表を入力として、ユーザ１とユーザ２とユーザ３に対する推薦話題候補を作成した例である。ユーザ毎に、グループと話題候補を作成する。ここで、ユーザ３は２つのグループに所属し、話題３と話題４はグループ１とグループ２に所属している。したがって、話題３と話題４のスコアは、前述の数式を用いて以下のように計算する。 FIG. 8 shows an example of topic candidates to be recommended for each user created by the recommended topic candidate creation unit 104. This is an example in which recommended topic candidates for user 1, user 2, and user 3 are created using the correspondence table of groups, users, and topics in FIG. 7 as input. Create groups and topic candidates for each user. Here, the user 3 belongs to two groups, and the topic 3 and the topic 4 belong to the group 1 and the group 2. Therefore, the scores of Topic 3 and Topic 4 are calculated as follows using the above mathematical formula.
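For user 3, who belongs to both groups, the score formula gives 0.5 + 0.5 = 1.0 for topic 3 and likewise for topic 4, using the example vectors above. The sketch below reproduces that calculation. It is an illustration under assumptions: the `GROUPS` layout and the name `recommend_candidates` are introduced here, ties are broken by topic name, and the sum is taken over the groups the user belongs to (following the procedural description), which coincides with the formula for user 3.

```python
# Group / user / topic correspondence derived from the example vectors
# with both thresholds set to 0.5 (values are latent-topic probabilities).
GROUPS = {
    "group1": {
        "users": {"user1": 0.7, "user2": 0.8, "user3": 0.5, "user4": 0.5},
        "topics": {"topic1": 0.8, "topic2": 0.7, "topic3": 0.5, "topic4": 0.5},
    },
    "group2": {
        "users": {"user3": 0.5, "user4": 0.5, "user5": 0.8, "user6": 0.7},
        "topics": {"topic3": 0.5, "topic4": 0.5, "topic5": 0.7, "topic6": 0.8},
    },
}


def recommend_candidates(user, groups):
    """Return (topic, score) pairs for one user, highest score first."""
    scores = {}
    for group in groups.values():
        if user not in group["users"]:
            continue  # only groups the user belongs to contribute topics
        for topic, prob in group["topics"].items():
            # A topic extracted from several groups accumulates its values.
            scores[topic] = scores.get(topic, 0.0) + prob
    # Sort by descending score; break ties by topic name (an assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(recommend_candidates("user3", GROUPS)[:2])  # [('topic3', 1.0), ('topic4', 1.0)]
print(recommend_candidates("user1", GROUPS)[0])   # ('topic1', 0.8)
```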

［推薦話題決定］
次に、推薦手段１０５が、推薦話題候補を入力とし、評価履歴情報を参照し、ユーザに対して話題を推薦する（ステップＳ５）。 [Recommended topic decision]
Next, the recommendation means 105 receives a recommended topic candidate as input, refers to the evaluation history information, and recommends a topic to the user (step S5).

推薦手段１０５が、評価履歴情報を参照し、推薦話題候補のうち、ユーザがまだ評価していない話題を推薦対象話題として決定し、出力する。例えば、ユーザ１には、話題２を推薦対象話題とする。 The recommendation unit 105 refers to the evaluation history information, determines a topic that has not yet been evaluated by the user among the recommended topic candidates, and outputs it as a recommendation target topic. For example, for the user 1, the topic 2 is set as the recommendation target topic.
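The final filtering step is simple enough to show directly. In this sketch the function name `select_recommendations`, the optional `limit` parameter, and the sample candidate list and evaluated set for user 1 are all hypothetical; the figures that hold the real data are not reproduced in this text.

```python
def select_recommendations(candidates, evaluated_topics, limit=None):
    """Keep candidate topics, in order, that the user has not evaluated yet."""
    remaining = [topic for topic in candidates if topic not in evaluated_topics]
    return remaining if limit is None else remaining[:limit]


# Hypothetical data for user 1: candidates in score order, and evaluated topics.
candidates_for_user1 = ["topic1", "topic2"]
evaluated_by_user1 = {"topic1"}
print(select_recommendations(candidates_for_user1, evaluated_by_user1))  # ['topic2']
```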

［プログラム］
本発明の実施の形態１におけるプログラムは、コンピュータ（情報処理装置）に、図２に示すステップＳ１からステップＳ５の処理を実行させるプログラムであれば良い。あるいは、コンピュータに、図１に示す符号１０１〜１０５に示す各手段を実現させるプログラムであればよい。このプログラムをコンピュータにインストールし、実行することによって、本実施の形態１における話題推薦装置と話題推薦方法とを実現することができる。 [Program]
The program according to the first embodiment of the present invention may be any program that causes a computer (information processing apparatus) to execute the processing of steps S1 to S5 shown in FIG. 2. Alternatively, it may be any program that causes a computer to implement the means denoted by reference numerals 101 to 105 in FIG. 1. By installing this program on a computer and executing it, the topic recommendation device and the topic recommendation method according to the first embodiment can be realized.

本発明の実施の形態１の場合、コンピュータのＣＰＵ（Central Processing Unit）は、入力手段１０１、次元圧縮手段１０２、グループ化手段１０３、推薦候補話題作成手段１０４、及び、推薦手段１０５として機能し、処理を行なう。また、本実施の形態１では、評価履歴蓄積手段１２１は、コンピュータに備えられたハードディスク等の記憶装置に、これらを構成するデータファイルを格納することによって、又はこのデータファイルが格納された記録媒体をコンピュータと接続された読取装置に搭載することによって実現されている。 In the first embodiment of the present invention, the CPU (Central Processing Unit) of the computer functions as the input means 101, the dimension compression means 102, the grouping means 103, the recommended topic candidate creation means 104, and the recommendation means 105, and performs the processing. Also in the first embodiment, the evaluation history storage unit 121 is realized either by storing the data files that constitute it in a storage device such as a hard disk provided in the computer, or by mounting a recording medium that holds those data files in a reading device connected to the computer.

ここで、本実施の形態１におけるプログラムを実行することによって、話題推薦装置を実現するコンピュータについて図１８を用いて説明する。図１８は、本発明の実施の形態１における話題推薦装置を実現するコンピュータの一例を示すブロック図である。 Here, a computer that realizes the topic recommendation device by executing the program according to the first embodiment will be described with reference to FIG. 18. FIG. 18 is a block diagram illustrating an example of a computer that implements the topic recommendation device according to Embodiment 1 of the present invention.

図１８に示すように、コンピュータ１５０は、ＣＰＵ１１１と、メインメモリ１１２と、記憶装置１１３と、入力インターフェイス１１４と、表示コントローラ１１５と、データリーダ／ライタ１１６と、通信インターフェイス１１７とを備える。これらの各部は、バス１２１を介して、互いにデータ通信可能に接続される。 As shown in FIG. 18, the computer 150 includes a CPU 111, a main memory 112, a storage device 113, an input interface 114, a display controller 115, a data reader / writer 116, and a communication interface 117. These units are connected to each other via a bus 121 so that data communication is possible.

ＣＰＵ１１１は、記憶装置１１３に格納された、本実施の形態１におけるプログラム（コード）をメインメモリ１１２に展開し、これらを所定順序で実行することにより、各種の演算を実施する。メインメモリ１１２は、典型的には、ＤＲＡＭ（Dynamic Random Access Memory）等の揮発性の記憶装置である。また、本実施の形態１におけるプログラムは、コンピュータ読み取り可能な記録媒体１２０に格納された状態で提供される。なお、本実施の形態におけるプログラムは、通信インターフェイス１１７を介して接続されたインターネット上で流通するものであっても良い。 The CPU 111 performs various operations by expanding the program (code) in the first embodiment stored in the storage device 113 in the main memory 112 and executing them in a predetermined order. The main memory 112 is typically a volatile storage device such as a DRAM (Dynamic Random Access Memory). The program according to the first embodiment is provided in a state where it is stored in a computer-readable recording medium 120. Note that the program in the present embodiment may be distributed on the Internet connected via the communication interface 117.

また、記憶装置１１３の具体例としては、ハードディスクの他、フラッシュメモリ等の半導体記憶装置が挙げられる。入力インターフェイス１１４は、ＣＰＵ１１１と、キーボード及びマウスといった入力機器１１８との間のデータ伝送を仲介する。表示コントローラ１１５は、ディスプレイ装置１１９と接続され、ディスプレイ装置１１９での表示を制御する。データリーダ／ライタ１１６は、ＣＰＵ１１１と記録媒体１２０との間のデータ伝送を仲介し、記録媒体１２０からのプログラムの読み出し、及びコンピュータ１１０における処理結果の記録媒体１２０への書き込みを実行する。通信インターフェイス１１７は、ＣＰＵ１１１と、他のコンピュータとの間のデータ伝送を仲介する。 Specific examples of the storage device 113 include a hard disk and a semiconductor storage device such as a flash memory. The input interface 114 mediates data transmission between the CPU 111 and an input device 118 such as a keyboard and a mouse. The display controller 115 is connected to the display device 119 and controls display on the display device 119. The data reader / writer 116 mediates data transmission between the CPU 111 and the recording medium 120, and reads a program from the recording medium 120 and writes a processing result in the computer 110 to the recording medium 120. The communication interface 117 mediates data transmission between the CPU 111 and another computer.

また、記録媒体１２０の具体例としては、ＣＦ（Compact Flash）及びＳＤ（Secure Digital）等の汎用的な半導体記憶デバイス、フレキシブルディスク（Flexible Disk）等の磁気記憶媒体、又はＣＤ−ＲＯＭ（Compact Disk Read Only Memory）などの光学記憶媒体が挙げられる。 Specific examples of the recording medium 120 include general-purpose semiconductor storage devices such as CF (Compact Flash) and SD (Secure Digital), magnetic storage media such as a flexible disk, and optical storage media such as a CD-ROM (Compact Disk Read Only Memory).

［効果］
このように、評価履歴から同じユーザから評価された話題と、同じ話題を評価したユーザとを同じ次元に圧縮し、圧縮した次元を元に、ユーザ群と話題群をグループ化することで、圧縮次元を興味分野としたグループを作成することができ、ユーザが所属するグループ毎に推薦する話題を抽出することで、興味分野別の話題推薦を行うことができる。 [Effect]
In this way, topics evaluated by the same users and users who evaluated the same topics are compressed from the evaluation history into the same dimensions, and the users and topics are grouped on the basis of those compressed dimensions. This yields groups in which each compressed dimension corresponds to a field of interest, and extracting the topics to recommend for each group a user belongs to makes it possible to recommend topics per field of interest.

＜実施の形態２＞
［装置構成］
次に、本発明における実施の形態２について説明する。図９は、本実施の形態２における話題推薦装置を含むシステム全体の構成を示すブロック図である。本実施の形態２では、図１に示した実施の形態１と異なり、コミュニケーション相手蓄積手段２２１が備えられている点が異なる。さらに、次元圧縮手段１０２の動作が異なる。以下では、異なる点について主に説明する。 <Embodiment 2>
[Device configuration]
Next, a second embodiment of the present invention will be described. FIG. 9 is a block diagram illustrating the configuration of the entire system including the topic recommendation device according to the second embodiment. The second embodiment differs from the first embodiment shown in FIG. 1 in that a communication partner storage unit 221 is provided. The operation of the dimension compression means 102 also differs. The following description focuses on these differences.

コミュニケーション相手蓄積手段２２１は、ユーザとそのコミュニケーション相手を蓄積している。本データは、あらかじめ作成しておいてもよい。また、評価履歴蓄積手段１２１の情報を元に、評価した話題の似ているユーザ同士を、コミュニケーション相手として追加してもよい。評価した話題の似ているユーザは、共通の話題を提供することで新たにコミュニケーション相手となる可能性が高いユーザ同士であるため、これらのユーザを追加しても良い。 The communication partner storage unit 221 stores the user and the communication partner. This data may be created in advance. In addition, based on the information in the evaluation history storage unit 121, users who are similar in the evaluated topic may be added as communication partners. Since users who are similar to the evaluated topics are users who are likely to become new communication partners by providing a common topic, these users may be added.

次元圧縮手段１０２は、評価履歴情報を入力とし、コミュニケーション相手蓄積手段２２１を参照し、話題の圧縮ベクトル表現とユーザの圧縮ベクトル表現に変換する。具体的には、まず、上記評価履歴情報から、話題を行、ユーザを列、評価値を値とした２次元行列を作成する。次に、コミュニケーション相手蓄積手段２２１を参照し、同じ話題を評価したでコミュニケーション相手同士のユーザと、同じユーザから評価された話題とを一つの次元として２次元行列を圧縮する。次に、圧縮した次元を要素とする話題の圧縮次元ベクトル表現とユーザの圧縮次元ベクトル表現とを出力する。 The dimension compression means 102 receives the evaluation history information as input, refers to the communication partner storage unit 221, and converts the input into compressed vector representations of the topics and of the users. Specifically, it first creates from the evaluation history information a two-dimensional matrix with topics as rows, users as columns, and evaluation values as entries. Next, referring to the communication partner storage unit 221, it compresses the two-dimensional matrix by treating users who evaluated the same topics and are communication partners of one another, together with the topics evaluated by the same users, as a single dimension. It then outputs the compressed-dimension vector representations of the topics and of the users, whose elements are the compressed dimensions.

［装置動作］
次に、図１０のフローチャートを参照して本実施の形態２の話題推薦装置の動作について詳細に説明する。図１０は、本実施の形態２の全体の動作の流れを説明するフローチャートである。 [Device operation]
Next, the operation of the topic recommendation device of the second embodiment will be described in detail with reference to the flowchart of FIG. 10. FIG. 10 is a flowchart explaining the overall flow of operation of the second embodiment.

図１０に示すように、本実施の形態２では、実施の形態１と異なり、コミュニケーション相手を読み込むステップ（ステップＳ２１）が追加された点が異なる。また、行列を圧縮するステップ（ステップＳ２）の動作が異なる。以下では主に異なるステップを中心に説明する。 As shown in FIG. 10, the second embodiment is different from the first embodiment in that a step of reading a communication partner (step S21) is added. Further, the operation of the step of compressing the matrix (step S2) is different. The following description will mainly focus on the different steps.

まず、実施の形態１と同様に、入力手段１０１が、評価履歴情報を受け付け、次元圧縮手段１０２が、評価履歴情報を行列表現に変換する（ステップＳ１）。 First, as in the first embodiment, the input unit 101 receives the evaluation history information, and the dimension compression unit 102 converts the evaluation history information into a matrix representation (step S1).

［コミュニケーション相手読込］
次に、次元圧縮手段１０２が、コミュニケーション相手蓄積手段２２１を参照し、コミュニケーション相手情報を読み込む（ステップＳ２１）。コミュニケーション相手蓄積手段２２１は、コミュニケーション相手同士のユーザ関係が蓄積されている。図１１に例を示す。ユーザ１は、ユーザ２とユーザ３とがコミュニケーション相手であることがわかる。コミュニケーション相手情報は、（ユーザ１、ユーザ２）、（ユーザ１、ユーザ３）といったペア毎に読み込む。 [Read communication partner]
Next, the dimension compression unit 102 refers to the communication partner storage unit 221 and reads the communication partner information (step S21). The communication partner storage unit 221 stores the user relationships between communication partners. FIG. 11 shows an example, from which it can be seen that user 1 has user 2 and user 3 as communication partners. The communication partner information is read pair by pair, for example (user 1, user 2) and (user 1, user 3).
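One way to hold the pairs once they are read is a lookup from each user to that user's partners. The sketch below assumes the relationship is symmetric, which the text does not state explicitly, and `build_partner_map` is a name introduced here for illustration.

```python
from collections import defaultdict


def build_partner_map(pairs):
    """Turn (user, partner) pairs into a user -> set of partners lookup."""
    partners = defaultdict(set)
    for a, b in pairs:
        # Assumption: being communication partners is a symmetric relation.
        partners[a].add(b)
        partners[b].add(a)
    return dict(partners)


pairs = [("user1", "user2"), ("user1", "user3")]
partner_map = build_partner_map(pairs)
print(sorted(partner_map["user1"]))  # ['user2', 'user3']
```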

［行列を圧縮］
次に、実施の形態１と同様に、２次元行列を次元圧縮する（ステップＳ２）。ただし、次元圧縮時に、コミュニケーション相手情報を参照し、コミュニケーション相手同士が同じトピックとなる確率値を高くする制約条件を付けくわえて計算する。 [Compress matrix]
Next, as in the first embodiment, the two-dimensional matrix is dimensionally compressed (step S2). During compression, however, the communication partner information is consulted, and the calculation adds a constraint that raises the probability that communication partners are assigned the same topic.

PLSAの単語間の制約条件の手法である上記非特許文献５の手法を参考に、ユーザ間の制約条件とする。以下では、まず、上記非特許文献５の内容を説明し、次に、ユーザ間の制約条件に拡張する。 The constraint between users is modeled on the technique of Non-Patent Document 5, which introduces constraints between words into PLSA. In the following, the content of Non-Patent Document 5 is described first and is then extended to constraints between users.

非特許文献５では、トピックP(z)の確率分布に変更を加える。変更内容は、単語間に関係を元に、単語間に関係あるが別の潜在トピックに割り当てられた場合は、ペナルティをかけるための、ペナルティ項を追加する。
P(z) = Π P(z_i)×exp{-Σ_{<i,j>∈C} δ(z_i ≠ z_j) × w_ij } ÷ G In Non-Patent Document 5, the probability distribution P(z) over topics is modified. Based on the relationships between words, a penalty term is added so that a penalty is applied when related words are assigned to different latent topics.
P(z) = Π P(z_i) × exp{-Σ_{<i,j>∈C} δ(z_i ≠ z_j) × w_ij} ÷ G

ここで、z_iは、第i番目の位置にある単語の潜在トピック、Cは、i番目の単語とj番目の単語に関係を示す制約条件集合、δ（・）は、i番目の単語の潜在トピックとｊ番目の単語の潜在トピックが異なる場合に１となる関数、w_ijは、ペナルティ重み(>0)、Gは正規化項である。 Here, z_i is the latent topic of the word at the i-th position, C is the set of constraints indicating that the i-th word and the j-th word are related, δ(·) is a function that equals 1 when the latent topic of the i-th word differs from that of the j-th word, w_ij is a penalty weight (> 0), and G is a normalization term.

本実施形態では、単語間の関係をユーザ間の関係にあてはめ、ユーザ間がコミュニケーション相手だった場合、これを制約条件集合Cとする。コミュニケーション相手が別の潜在トピックに割り当てられた場合、ペナルティが科される。すなわち、本例では、（ユーザ１、ユーザ２）、（ユーザ１、ユーザ３）が別のトピックに割り当てられた場合、ペナルティが科される。 In the present embodiment, the relationship between words is replaced by the relationship between users: pairs of users who are communication partners form the constraint set C. A penalty is imposed when communication partners are assigned to different latent topics. In this example, a penalty is imposed when the users in the pair (user 1, user 2) or in the pair (user 1, user 3) are assigned to different topics.
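The effect of the penalty term on communication partners can be checked numerically. The sketch below evaluates only the unnormalized part of P(z), that is, everything except the division by G. The name `unnormalized_prior`, the single shared weight `weight` standing in for every w_ij, and the uniform two-topic prior are assumptions for illustration.

```python
import math


def unnormalized_prior(assignment, topic_prior, constraints, weight=1.0):
    """Product of P(z_i) times exp(-sum of weights of violated constraints)."""
    base = math.prod(topic_prior[z] for z in assignment.values())
    # delta(z_i != z_j) is 1 when the two constrained users have different topics.
    penalty = sum(weight for a, b in constraints if assignment[a] != assignment[b])
    return base * math.exp(-penalty)


constraints = [("user1", "user2"), ("user1", "user3")]  # communication partner pairs
topic_prior = {0: 0.5, 1: 0.5}                           # assumed uniform prior

same = unnormalized_prior({"user1": 0, "user2": 0, "user3": 0}, topic_prior, constraints)
split = unnormalized_prior({"user1": 0, "user2": 0, "user3": 1}, topic_prior, constraints)
print(same > split)  # True: separating user 1 and user 3 is penalized
```

With one violated pair and weight 1.0, the assignment that separates user 1 from user 3 is down-weighted by a factor of exp(-1) relative to the assignment that keeps all three users together.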

LDAでも同様に、単語間の制約条件を導入した上記非特許文献６の手法を用いることによって、トピックモデリングを行っても良い。非特許文献６でも、単語間に類義関係がある場合を制約条件とする。本実施形態でも同様に、コミュニケーション相手情報を制約条件として加える。 Similarly, in LDA, topic modeling may be performed by using the method of Non-Patent Document 6 in which a constraint condition between words is introduced. Even in Non-Patent Document 6, a case where there is a synonymous relationship between words is set as a constraint. Similarly, in this embodiment, communication partner information is added as a constraint condition.

また、以下のように、前出の非特許文献４の初期トピック割り当てと、関数qを変更することで、コミュニケーション相手を制約として入れても良い。 Further, as described below, the communication partner may be included as a restriction by changing the initial topic assignment and the function q of Non-Patent Document 4 described above.

まず、コミュニケーション相手同士をネットワークとし、ネットワークをトピック数Tでクラスタリングする。ネットワークのクラスタリングには、Newman法など既存手法を用いる。各ユーザにクラスタ番号をトピックとして割り当てる。次に、一様分布からランダムにrをサンプリングする。次に、ユーザの出現位置毎に、以下の数８式の値を計算する。数８式の値がr以上であれば、トピック番号をインクリメントする。そうでなければ、ユーザの出現位置にトピックを割り当てる。 First, communication partners are set as a network, and the network is clustered by the number of topics T. Existing methods such as Newman method are used for network clustering. Assign a cluster number to each user as a topic. Next, r is sampled randomly from the uniform distribution. Next, the value of the following equation 8 is calculated for each appearance position of the user. If the value of equation (8) is greater than or equal to r, the topic number is incremented. Otherwise, the topic is assigned to the user's appearance position.

ここで、非特許文献４の関数qを以下のように、ユーザ間の類似度を加えた関数で置き換える。
Here, the function q of Non-Patent Document 4 is replaced with a function to which the similarity between users is added as follows.

ここで、関数simは、２ユーザの類似度を表している。fuは、ユーザuのコミュニケーション相手ユーザを表している。Fuは、ユーザuのコミュニケーション相手ユーザ数を表している。ユーザ間の類似度である関数simは、ユーザの閲覧した話題をベクトルとするコサイン類似度等や、ユーザのコミュニケーション相手番号をベクトルとしたコサイン類似度等を用いる。 Here, the function sim represents the similarity between two users. fu represents the communication partner user of user u. Fu represents the number of communication partner users of user u. The function sim, which is the similarity between users, uses a cosine similarity or the like using a topic viewed by the user as a vector, a cosine similarity using a user's communication partner number as a vector, or the like.
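The two variants of sim mentioned above can both be computed with one cosine-similarity helper over sparse vectors. This sketch covers only sim itself; the modified function q is not reproduced in this text and is not reconstructed here. The helper names and the sample data are assumptions.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * vec_b.get(key, 0.0) for key, value in vec_a.items())
    norm_a = math.sqrt(sum(value * value for value in vec_a.values()))
    norm_b = math.sqrt(sum(value * value for value in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def as_binary_vector(items):
    """Represent a set of viewed topics (or partner IDs) as a 0/1 vector."""
    return {item: 1.0 for item in items}


# Hypothetical viewing data for two users.
viewed_u = as_binary_vector(["topic1", "topic2"])
viewed_v = as_binary_vector(["topic1"])
print(round(cosine_similarity(viewed_u, viewed_v), 4))  # 0.7071
```

Passing sets of communication partner IDs to `as_binary_vector` instead of viewed topics gives the second variant described in the text.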

このように初期トピック割り当てと、関数qを置きかえることで、コミュニケーション相手を制約条件としたトピック割り当てを行う。 In this way, topic assignment is performed by replacing the initial topic assignment and the function q with the communication partner as a constraint.

［ステップＳ３からステップＳ５］
次に図１０に示すように、ステップＳ３からステップＳ５が実行される。ただし、図１０に示すステップＳ３からステップＳ５は、図２に示すステップＳ３からステップＳ５と同様のステップであるため、ここでの説明は省略する。 [Step S3 to Step S5]
Next, as shown in FIG. 10, steps S3 to S5 are executed. However, steps S3 to S5 shown in FIG. 10 are the same as steps S3 to S5 shown in FIG. 2, so their description is omitted here.

［プログラム］
本発明の実施の形態２におけるプログラムは、コンピュータに、図１０に示すステップＳ１からステップＳ５を実行させるプログラムであれば良い。あるいは、コンピュータに、図９に示す符号１０１〜１０５に示す各手段を実現させるプログラムであればよい。このプログラムをコンピュータにインストールし、実行することによって、本実施の形態２における話題推薦装置と話題推薦方法とを実現することができる。 [Program]
The program according to the second embodiment of the present invention may be any program that causes a computer to execute steps S1 to S5 shown in FIG. 10. Alternatively, it may be any program that causes a computer to implement the means denoted by reference numerals 101 to 105 in FIG. 9. By installing this program on a computer and executing it, the topic recommendation device and the topic recommendation method according to the second embodiment can be realized.

本発明の実施の形態２の場合、コンピュータのＣＰＵ（Central Processing Unit）は、入力手段１０１、次元圧縮手段１０２、グループ化手段１０３、推薦候補話題作成手段１０４、及び、推薦手段１０５として機能し、処理を行なう。また、本実施の形態２では、評価履歴蓄積手段１２１、コミュニケーション相手蓄積手段２２１は、コンピュータに備えられたハードディスク等の記憶装置に、これらを構成するデータファイルを格納することによって、又はこのデータファイルが格納された記録媒体をコンピュータと接続された読取装置に搭載することによって実現されている。 In the second embodiment of the present invention, the CPU (Central Processing Unit) of the computer functions as the input means 101, the dimension compression means 102, the grouping means 103, the recommended topic candidate creation means 104, and the recommendation means 105, and performs the processing. Also in the second embodiment, the evaluation history storage unit 121 and the communication partner storage unit 221 are realized either by storing the data files that constitute them in a storage device such as a hard disk provided in the computer, or by mounting a recording medium that holds those data files in a reading device connected to the computer.

ここで、本実施の形態２におけるプログラムを実行することによって、話題推薦装置を実現するコンピュータについて図１８を用いて説明する。図１８は、本発明の実施の形態２における話題推薦装置を実現するコンピュータの一例を示すブロック図である。 Here, a computer that realizes the topic recommendation device by executing the program according to the second embodiment will be described with reference to FIG. 18. FIG. 18 is a block diagram illustrating an example of a computer that implements the topic recommendation device according to Embodiment 2 of the present invention.

ＣＰＵ１１１は、記憶装置１１３に格納された、本実施の形態２におけるプログラム（コード）をメインメモリ１１２に展開し、これらを所定順序で実行することにより、各種の演算を実施する。メインメモリ１１２は、典型的には、ＤＲＡＭ（Dynamic Random Access Memory）等の揮発性の記憶装置である。また、本実施の形態２におけるプログラムは、コンピュータ読み取り可能な記録媒体１２０に格納された状態で提供される。なお、本実施の形態におけるプログラムは、通信インターフェイス１１７を介して接続されたインターネット上で流通するものであっても良い。 The CPU 111 performs various calculations by developing the program (code) in the second embodiment stored in the storage device 113 in the main memory 112 and executing them in a predetermined order. The main memory 112 is typically a volatile storage device such as a DRAM (Dynamic Random Access Memory). The program according to the second embodiment is provided in a state where it is stored in a computer-readable recording medium 120. Note that the program in the present embodiment may be distributed on the Internet connected via the communication interface 117.

［効果］
このように、次元を圧縮する際に、コミュニケーション相手情報を制約条件にすることで、コミュニケーション相手同士を同じグループに所属しやすくする。これにより、あらかじめコミュニケーション相手がいる場合、コミュニケーション相手同士に興味のある話題を提示することができ、コミュニケーションを支援することができる。 [Effect]
In this way, using the communication partner information as a constraint when compressing the dimensions makes communication partners more likely to belong to the same group. As a result, when a user already has communication partners, topics of mutual interest can be presented to them, which supports their communication.

＜実施の形態３＞
次に、本発明における実施の形態３について説明する。図１２は、本実施の形態３における話題推薦装置を含むシステム全体の構成を示すブロック図である。本実施の形態３では、図９に示した実施の形態２と異なり、評価情報補完手段３０１と、ユーザモデル蓄積手段３２２と、コンテンツモデル蓄積手段３２１が備えられている点が異なる。以下では、異なる点について主に説明する。 <Embodiment 3>
Next, a third embodiment of the present invention will be described. FIG. 12 is a block diagram illustrating a configuration of the entire system including the topic recommendation device according to the third embodiment. The third embodiment is different from the second embodiment shown in FIG. 9 in that an evaluation information complementing unit 301, a user model storage unit 322, and a content model storage unit 321 are provided. Below, a different point is mainly demonstrated.

ユーザモデル蓄積手段３２２は、各ユーザ毎に、興味あるキーワードとその重みから成るユーザモデルが蓄積されている。本情報は、あらかじめ、ユーザから興味キーワードを取得しても良いし、評価履歴情報と、後述のコンテンツモデルから生成しても良い。 The user model storage unit 322 stores, for each user, a user model consisting of keywords of interest and their weights. This information may be obtained in advance by collecting interest keywords from the user, or it may be generated from the evaluation history information and the content model described later.

コンテンツモデル蓄積手段３２１は、話題毎に、話題内に存在するキーワードとその重みから成るコンテンツモデルが蓄積されている。本情報は、話題内に記述されている文書を形態素解析等により単語に分割し、その単語をキーワードとすることによって生成される。 The content model storage unit 321 stores, for each topic, a content model composed of keywords existing in the topic and their weights. This information is generated by dividing a document described in a topic into words by morphological analysis or the like and using the words as keywords.

評価情報補完手段３０１は、評価履歴情報の入力を受け、ユーザモデル蓄積手段３２２とコンテンツモデル蓄積手段３２１とを参照し、ユーザによって評価されていない話題、つまり、評価値が０である話題と、評価していないユーザとのペアについて、ユーザモデルとコンテンツモデルとから類似度を計算する。そして、類似度が特定の閾値以上であれば、ユーザによる話題の評価値を、評価履歴情報に追加する。 The evaluation information complementing unit 301 receives the evaluation history information and refers to the user model storage unit 322 and the content model storage unit 321. For each pair consisting of a topic that a user has not evaluated (that is, a topic whose evaluation value is 0) and the user who has not evaluated it, the unit calculates a similarity from the user model and the content model. If the similarity is equal to or greater than a specific threshold, an evaluation value of the topic by that user is added to the evaluation history information.

［装置動作］
次に、図１３のフローチャートを参照して本実施の形態３の話題推薦装置の動作について詳細に説明する。図１３は、本実施の形態３の全体の動作の流れを説明するフローチャートである。 [Device operation]
Next, the operation of the topic recommendation device of the third embodiment will be described in detail with reference to the flowchart of FIG. 13. FIG. 13 is a flowchart explaining the overall flow of operation of the third embodiment.

図１３に示すように、本実施の形態３では、実施の形態２と異なり、さらに、評価履歴情報を補完するステップが追加された点が異なる。以下では主に異なるステップを中心に説明する。 As shown in FIG. 13, the third embodiment is different from the second embodiment in that a step for supplementing the evaluation history information is added. The following description will mainly focus on the different steps.

［評価情報の補完］
評価情報補完手段３０１が、ユーザモデル蓄積手段３２２とコンテンツモデル蓄積手段３２１を参照し、評価履歴のないユーザと話題のペアについて評価情報を追加する（ステップＳ３１）。 [Complementing evaluation information]
The evaluation information complementing unit 301 refers to the user model storage unit 322 and the content model storage unit 321 and adds evaluation information for a user / topic pair with no evaluation history (step S31).

まず、評価履歴を参照し、評価履歴のないユーザと話題ペアを抽出する。ここでは、ユーザ１と話題２の評価履歴がないとする（図４参照）。次に、ユーザモデル蓄積手段３２２とコンテンツモデル蓄積手段３２１を参照し、ユーザ１のユーザモデルと、話題２のコンテンツモデルを取得する。 First, referring to the evaluation history, a user and topic pair with no evaluation history are extracted. Here, it is assumed that there is no evaluation history of user 1 and topic 2 (see FIG. 4). Next, the user model storage unit 322 and the content model storage unit 321 are referred to, and the user model of the user 1 and the content model of the topic 2 are acquired.

ユーザモデル蓄積手段３２２の例を図１４に示す。ユーザモデル蓄積手段３２２は、ユーザＩＤとそのユーザモデルと蓄積する。ユーザモデルとは、キーワードとその重みの集合である。キーワードは、ユーザが興味のあるキーワードである。例えば、ユーザ１は、「首相」や「経済」に興味がある。これらのキーワードは、評価履歴情報と後述のコンテンツモデルとを参照し、学習によって得ることもできる。例えば、ユーザが閲覧したコンテンツモデルを平均化したものをユーザモデルとする方法や、評価値で重みを付けて平均化する方法などがある。 An example of the user model storage unit 322 is shown in FIG. The user model storage unit 322 stores the user ID and its user model. A user model is a set of keywords and their weights. The keyword is a keyword that the user is interested in. For example, the user 1 is interested in “Prime Minister” and “Economy”. These keywords can also be obtained by learning with reference to evaluation history information and a content model described later. For example, there are a method of averaging a content model browsed by a user as a user model, a method of averaging by weighting with evaluation values, and the like.
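The two learning options mentioned above, plain averaging and averaging weighted by evaluation values, can be sketched with one function. The name `build_user_model`, the keyword weights, and the ratings below are hypothetical; they are not the contents of FIG. 14 or FIG. 15.

```python
def build_user_model(viewed_topics, content_models, ratings=None):
    """Average the content models of viewed topics, optionally rating-weighted."""
    totals = {}
    weight_sum = 0.0
    for topic in viewed_topics:
        # Without ratings every viewed topic counts equally (plain average).
        weight = ratings.get(topic, 1.0) if ratings else 1.0
        weight_sum += weight
        for keyword, keyword_weight in content_models[topic].items():
            totals[keyword] = totals.get(keyword, 0.0) + weight * keyword_weight
    if weight_sum == 0.0:
        return {}
    return {keyword: value / weight_sum for keyword, value in totals.items()}


# Hypothetical content models (keyword -> weight).
content_models = {
    "topic1": {"prime minister": 1.0},
    "topic3": {"prime minister": 0.5, "drama": 0.5},
}
plain = build_user_model(["topic1", "topic3"], content_models)
weighted = build_user_model(["topic1", "topic3"], content_models,
                            ratings={"topic1": 3, "topic3": 1})
print(plain)     # {'prime minister': 0.75, 'drama': 0.25}
print(weighted)  # {'prime minister': 0.875, 'drama': 0.125}
```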

コンテンツモデル蓄積手段３２１の例を図１５に示す。コンテンツモデル蓄積手段３２１は、話題ＩＤとそのコンテンツモデルとを蓄積する。コンテンツモデルとは、キーワードとその重みに集合である。キーワードは、話題に出現する単語や、話題のメタ情報などから得ることができる。 An example of the content model storage unit 321 is shown in FIG. The content model storage unit 321 stores the topic ID and its content model. A content model is a set of keywords and their weights. A keyword can be obtained from a word that appears in a topic, meta information of the topic, or the like.

次に、ユーザ１のユーザモデルと、話題２のコンテンツモデルとの類似度を計算する。類似度は、ユークリッド距離に基づく方法や、内積、コサイン類似度などを用いる。内積を用いた場合、以下のように計算される。
SIM(ユーザ１、話題２)＝0×0.5 ＋ 0.3 × 0.3 ＋ 0.1 × 0.1 ＋ …
各項は、「首相」、「財政」、「計画」など関する積である。 Next, the similarity between the user model of user 1 and the content model of topic 2 is calculated. As the similarity, a method based on the Euclidean distance, an inner product, a cosine similarity, or the like is used. When inner product is used, it is calculated as follows.
SIM(user 1, topic 2) = 0 × 0.5 + 0.3 × 0.3 + 0.1 × 0.1 + …
Each term is the product of the two models' weights for one keyword, such as “Prime Minister”, “finance”, or “plan”.
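The inner-product calculation above, together with the cosine alternative the text mentions, can be sketched over keyword-weight dictionaries. The function names are hypothetical. The sample weights reproduce only the three terms shown in the formula; which factor belongs to the user model and which to the content model is an assumption, and the terms elided by "…" are left out.

```python
import math


def inner_product(user_model, content_model):
    """Sum of weight products over the keywords the two models share."""
    return sum(
        weight * content_model[keyword]
        for keyword, weight in user_model.items()
        if keyword in content_model
    )


def cosine_similarity(user_model, content_model):
    """Inner product normalized by both vector lengths; 0.0 if either is empty."""
    norm_u = math.sqrt(sum(w * w for w in user_model.values()))
    norm_c = math.sqrt(sum(w * w for w in content_model.values()))
    if norm_u == 0 or norm_c == 0:
        return 0.0
    return inner_product(user_model, content_model) / (norm_u * norm_c)


# Assumed assignment of the factors in 0 x 0.5 + 0.3 x 0.3 + 0.1 x 0.1.
user1 = {"prime minister": 0.5, "finance": 0.3, "plan": 0.1}
topic2 = {"prime minister": 0.0, "finance": 0.3, "plan": 0.1}
sim = inner_product(user1, topic2)  # approximately 0.10 for these three terms
```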

次に、類似度が閾値を超えている場合、評価値を付与する。例えば、閾値0.5を超えている場合、評価値として類似度を付与する。 Next, if the similarity exceeds a threshold, an evaluation value is assigned. For example, if the similarity exceeds a threshold of 0.5, the similarity itself is assigned as the evaluation value.
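Step S31 as a whole (find the pairs with no evaluation, compute the similarity, keep it if it exceeds the threshold) might be sketched as below. The function name, the `(user_id, topic_id)` key layout, and the use of the inner product as the similarity are assumptions for illustration; the text allows other similarity measures as well.

```python
def complement_evaluations(history, user_models, content_models, threshold=0.5):
    """Return a copy of `history` with predicted evaluations filled in.

    history:        dict (user_id, topic_id) -> evaluation value
    user_models:    dict user_id -> dict keyword -> weight
    content_models: dict topic_id -> dict keyword -> weight
    """
    completed = dict(history)
    for user_id, user_model in user_models.items():
        for topic_id, content_model in content_models.items():
            if (user_id, topic_id) in history:
                continue  # an actual evaluation exists; leave it untouched
            # Inner product of the two keyword-weight vectors.
            similarity = sum(
                weight * content_model.get(keyword, 0.0)
                for keyword, weight in user_model.items()
            )
            if similarity > threshold:
                completed[(user_id, topic_id)] = similarity
    return completed
```

Pairs whose similarity does not exceed the threshold stay absent, so the matrix built later is denser only where the keyword models give reasonable support.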

［ステップＳ１からステップＳ５］
次に図１３に示すように、ステップＳ１からステップＳ５が実行される。ただし、図１３に示すステップＳ１からステップＳ５は、図２あるいは図１０に示すステップＳ１からステップＳ５と同様のステップであるため、ここでの説明は省略する。 [Step S1 to Step S5]
Next, as shown in FIG. 13, steps S1 to S5 are executed. Since steps S1 to S5 shown in FIG. 13 are the same as steps S1 to S5 shown in FIG. 2 or FIG. 10, their description is omitted here.

［プログラム］
本発明の実施の形態３におけるプログラムは、コンピュータに、図１３に示すステップＳ３１からステップＳ５を実行させるプログラムであれば良い。あるいは、コンピュータに、図１２に示す符号１０１〜１０５，３０１に示す各手段を実現させるプログラムであればよい。このプログラムをコンピュータにインストールし、実行することによって、本実施の形態３における話題推薦装置と話題推薦方法とを実現することができる。 [program]
The program according to Embodiment 3 of the present invention may be any program that causes a computer to execute steps S31 to S5 shown in FIG. 13. Alternatively, it may be any program that causes a computer to implement the means denoted by reference numerals 101 to 105 and 301 shown in FIG. 12. By installing this program on a computer and executing it, the topic recommendation device and the topic recommendation method according to Embodiment 3 can be realized.

本発明の実施の形態３の場合、コンピュータのＣＰＵ（Central Processing Unit）は、入力手段１０１、評価情報保管手段３０１、次元圧縮手段１０２、グループ化手段１０３、推薦候補話題作成手段１０４、及び、推薦手段１０５として機能し、処理を行なう。また、本実施の形態３では、評価履歴蓄積手段１２１およびコミュニケーション相手蓄積手段２２１、ユーザモデル蓄積手段３２２、コンテンツモデル蓄積手段３２１は、コンピュータに備えられたハードディスク等の記憶装置に、これらを構成するデータファイルを格納することによって、又はこのデータファイルが格納された記録媒体をコンピュータと接続された読取装置に搭載することによって実現されている。 In Embodiment 3 of the present invention, the CPU (Central Processing Unit) of the computer functions as the input means 101, the evaluation information complementing means 301, the dimension compression means 102, the grouping means 103, the recommended topic candidate creation means 104, and the recommendation means 105, and performs the processing. In Embodiment 3, the evaluation history storage means 121, the communication partner storage means 221, the user model storage means 322, and the content model storage means 321 are realized by storing the data files that constitute them in a storage device such as a hard disk provided in the computer, or by mounting a recording medium that stores these data files on a reading device connected to the computer.

本発明の実施の形態３における話題推薦装置を実現するコンピュータは、図１８に示した実施の形態１における話題推薦装置を実現するコンピュータと同様の構成である。 The computer that implements the topic recommendation device according to Embodiment 3 of the present invention has the same configuration as the computer, shown in FIG. 18, that implements the topic recommendation device according to Embodiment 1.

［効果］
トピックモデルによる次元圧縮手法は、評価履歴が非常に疎な場合、精度よく潜在トピックを推定できない。したがって、興味領域を精度よく特定することができず、ユーザとコミュニケーション相手との興味のある話題を提供することができない。このような構成をとることで、ユーザがまだ評価していない話題に対して、どのような評価を行うのかを予測し補完することで、精度よく興味領域を特定できる。これにより、ユーザとコミュニケーション相手との興味のある話題を提供することができる。 [effect]
When the evaluation history is very sparse, dimension compression based on a topic model cannot estimate the latent topics accurately. As a result, the regions of interest cannot be identified accurately, and topics that interest both the user and the communication partner cannot be provided. With the configuration described above, the evaluations a user would give to topics not yet evaluated are predicted and filled in, so the regions of interest can be identified accurately. This makes it possible to provide topics that interest both the user and the communication partner.

＜実施の形態４＞
次に、本発明における実施の形態４について説明する。図１６は、本実施の形態４における話題推薦装置を含むシステム全体の構成を示すブロック図である。本実施の形態４では、図１２に示す実施の形態３と異なり、話題推薦装置には、評価履歴選択手段４０１とがさらに備えられている。また、本実施の形態４は、推薦話題候補作成手段１０４の機能の点で、実施の形態３と異なっている。以下では異なる点について主に説明する。 <Embodiment 4>
Next, a fourth embodiment of the present invention will be described. FIG. 16 is a block diagram showing a configuration of the entire system including the topic recommendation device according to the fourth embodiment. In the fourth embodiment, unlike the third embodiment shown in FIG. 12, the topic recommendation device is further provided with an evaluation history selecting means 401. The fourth embodiment is different from the third embodiment in terms of the function of the recommended topic candidate creation unit 104. Hereinafter, the different points will be mainly described.

評価履歴選択手段４０１は、評価履歴情報の入力を受け、評価ユーザの少ない話題や評価話題数の少ないユーザを特定し、その話題とユーザ以外の評価履歴を選択する。具体的には、評価履歴情報に基づいて、多くのユーザに評価されていない不人気話題を選択する。また、多くの話題を評価していない未評価ユーザを選択する。次に、上記不人気話題と未評価ユーザの一方又は両方を設定し、これら以外つまり不人気話題以外の評価履歴や未評価ユーザ以外の評価履歴を選択する。 The evaluation history selection means 401 receives the evaluation history information, identifies topics evaluated by few users and users who have evaluated few topics, and selects the evaluation history other than that of those topics and users. Specifically, based on the evaluation history information, it picks out unpopular topics, which have not been evaluated by many users, and unrated users, who have not evaluated many topics. It then sets either or both of the unpopular topics and the unrated users, and selects the remaining evaluation history, that is, the evaluation history for topics other than the unpopular topics and for users other than the unrated users.

推薦話題候補作成手段１０４は、グループ化手段１０３の出力であるグループとユーザと話題の対応関係に基づいて、各ユーザ毎に、ユーザが所属する複数のグループを抽出する。そして、当該グループに対応する一つ以上の話題を抽出し、当該話題の圧縮次元のスコアと、各話題のコンテンツモデルとユーザのユーザモデルとの類似度から、各話題のスコアを決め、話題のスコアの順番で並べ、推薦候補話題を作成する。 The recommended topic candidate creation means 104 extracts, for each user, the groups to which the user belongs, based on the correspondence among groups, users, and topics output by the grouping means 103. It then extracts one or more topics corresponding to those groups, determines a score for each topic from the topic's compressed-dimension score and the similarity between the topic's content model and the user's user model, sorts the topics by score, and creates the recommended candidate topics.

［装置動作］
次に、図１７のフローチャートを参照して本実施の形態４の話題推薦装置の動作について詳細に説明する。図１７は、本実施の形態４の全体の動作の流れを説明するフローチャートである。 [Device operation]
Next, the operation of the topic recommendation device of Embodiment 4 will be described in detail with reference to the flowchart of FIG. 17. FIG. 17 is a flowchart illustrating the overall flow of operation of Embodiment 4.

図１７に示すように、本実施の形態４では、実施の形態３と異なり、さらに、評価情報を選択するステップＳ４１が追加された点が異なる。また、推薦候補話題を作成するステップＳ４も動作が異なる。以下では主に異なるステップを中心に説明する。 As shown in FIG. 17, the fourth embodiment is different from the third embodiment in that step S41 for selecting evaluation information is added. The operation of step S4 for creating a recommendation candidate topic is also different. The following description will mainly focus on the different steps.

まず、実施の形態３で説明したように、話題評価情報を補完する。 First, as described in the third embodiment, the topic evaluation information is supplemented.

［評価情報の選択］
次に、評価情報選択手段４０１が、評価履歴と補完された評価情報とに基づいて、評価履歴情報の選択を行う（ステップＳ４１）。 [Select Evaluation Information]
Next, the evaluation information selection means 401 selects evaluation history information based on the evaluation history and the supplemented evaluation information (step S41).

まず、評価履歴情報を入力に、評価ユーザの少ない話題を特定する。具体的には、特定の閾値以下のユーザ数からしか評価を受けていない話題を不人気話題として特定する。さらに、評価話題数の少ないユーザを特定する。具体的には、特定の閾値以下の話題数しか評価していないユーザを未評価ユーザとして特定する。 First, taking the evaluation history information as input, topics evaluated by few users are identified. Specifically, a topic that has been evaluated by no more than a given threshold number of users is identified as an unpopular topic. Users who have evaluated few topics are also identified. Specifically, a user who has evaluated no more than a given threshold number of topics is identified as an unrated user.

次に、上記不人気話題と上記未評価ユーザのどちらか又は両方を設定し、どちらか一方以外又は両方以外の評価履歴、つまり、不人気話題以外の評価履歴や未評価ユーザ以外の評価履歴を選択する。 Next, either or both of the unpopular topics and the unrated users are set, and the evaluation history other than that of the set items is selected, that is, the evaluation history for topics other than the unpopular topics and/or for users other than the unrated users.
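The selection in step S41 can be illustrated with a short sketch. The function name, parameter names, default thresholds, and the `(user_id, topic_id)` key layout are assumptions. The "at or below the threshold" comparison follows the wording of the text, and the two boolean flags model the choice of setting either or both kinds of item.

```python
from collections import Counter


def select_history(history, topic_threshold=1, user_threshold=1,
                   exclude_topics=True, exclude_users=True):
    """Drop evaluations that involve unpopular topics and/or unrated users.

    history: dict (user_id, topic_id) -> evaluation value
    Returns (selected_history, unpopular_topics, unrated_users).
    """
    users_per_topic = Counter(topic for (_user, topic) in history)
    topics_per_user = Counter(user for (user, _topic) in history)

    # "At or below the threshold" marks a topic unpopular / a user unrated.
    unpopular = {t for t, n in users_per_topic.items() if n <= topic_threshold}
    unrated = {u for u, n in topics_per_user.items() if n <= user_threshold}

    selected = {
        (user, topic): value
        for (user, topic), value in history.items()
        if not (exclude_topics and topic in unpopular)
        and not (exclude_users and user in unrated)
    }
    return selected, unpopular, unrated
```

Returning the excluded sets alongside the selected history keeps them available for step S4, where excluded topics and users can still be scored through the keyword models.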

［ステップＳ１からステップＳ３］
続いて、実施の形態３と同様に、行列表現に変換し、次元を圧縮し、グループ化を行う。 [Step S1 to Step S3]
Subsequently, as in the third embodiment, the data is converted into a matrix representation, the dimensions are compressed, and grouping is performed.
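As a rough illustration of the matrix conversion, the selected evaluation history can be laid out with topics as rows and users as columns, the orientation used throughout this document. The function name is hypothetical, and storing 0.0 for a missing evaluation is an assumption; the dimension compression and grouping steps themselves are not shown.

```python
def to_matrix(history):
    """Build a topic-by-user matrix from (user_id, topic_id) -> value pairs.

    Returns (topic_ids, user_ids, matrix), where matrix[i][j] is the
    evaluation of topic_ids[i] by user_ids[j], or 0.0 if there is none.
    """
    topic_ids = sorted({topic for (_user, topic) in history})
    user_ids = sorted({user for (user, _topic) in history})
    row = {topic: i for i, topic in enumerate(topic_ids)}
    col = {user: j for j, user in enumerate(user_ids)}

    matrix = [[0.0] * len(user_ids) for _ in topic_ids]
    for (user, topic), value in history.items():
        matrix[row[topic]][col[user]] = value
    return topic_ids, user_ids, matrix
```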

［ステップＳ４］
次に、推薦話題候補作成部１０４が、グループとユーザと話題の対応関係の入力を受け、ユーザモデル蓄積手段３２２とコンテンツモデル蓄積手段３２１とを参照し、各ユーザに推薦する話題の候補を作成する。その際、各話題のスコアは、各圧縮次元の値と、ユーザモデルとコンテンツモデルとの類似度とから計算する。具体的には全ての話題iについて以下のように計算する。
話題iのスコア＝w×Σ_{k ∈ G} P( z=k | d=i )＋ ( 1−w ）×SIM(ユーザj、話題i)
（ただし、Gは話題iが所属するグループの集合を表す） [Step S4]
Next, the recommended topic candidate creation unit 104 receives the correspondence among groups, users, and topics as input, refers to the user model storage unit 322 and the content model storage unit 321, and creates the candidate topics to recommend to each user. The score of each topic is calculated from the value of each compressed dimension and from the similarity between the user model and the content model. Specifically, the following is calculated for every topic i.
Score of topic i = w × Σ_{k ∈ G} P(z = k | d = i) + (1 - w) × SIM(user j, topic i)
(where G denotes the set of groups to which topic i belongs)

ここで、wは、潜在トピックの確率値を重視するか否かを示す重みであり、0以上1以下の範囲である。SIM(ユーザj、話題i)は、ユーザjと話題iの類似度を示し、ユーザモデルとコンテンツモデルから、ユークリッド距離や内積、コサイン類似度で計算される。 Here, w is a weight that indicates how much importance is placed on the latent-topic probability values, and lies in the range from 0 to 1 inclusive. SIM(user j, topic i) denotes the similarity between user j and topic i, and is calculated from the user model and the content model using the Euclidean distance, the inner product, the cosine similarity, or the like.
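The score formula of step S4 translates almost directly into code. In this sketch the function name and argument layout are hypothetical: `topic_probs` maps each latent topic k to P(z = k | d = i) for one topic i, `groups` is the set G, and `similarity` is SIM(user j, topic i) computed beforehand by whichever measure is chosen.

```python
def topic_score(topic_probs, groups, similarity, w=0.5):
    """Blend latent-topic probability mass with keyword-model similarity.

    score = w * sum_{k in G} P(z=k | d=i) + (1 - w) * SIM(user j, topic i)
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1 inclusive")
    latent_part = sum(topic_probs.get(k, 0.0) for k in groups)
    return w * latent_part + (1.0 - w) * similarity


# Illustrative values only: topic i belongs to groups 0 and 1.
score = topic_score({0: 0.7, 1: 0.2, 2: 0.1}, groups=[0, 1], similarity=0.4)
```

With w = 1 the score depends only on the latent-topic probabilities, and with w = 0 only on the keyword similarity, which is what lets topics and users excluded in step S41 still receive a score.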

［ステップＳ５］
続いて、実施の形態３と同様に推薦話題を特定し、推薦する。 [Step S5]
Subsequently, a recommended topic is identified and recommended as in the third embodiment.

［プログラム］
本発明の実施の形態４におけるプログラムは、コンピュータに、図１７に示すステップＳ３１からステップＳ５を実行させるプログラムであれば良い。あるいは、コンピュータに、図１６に示す符号１０１〜１０５，３０１，４０１に示す各手段を実現させるプログラムであればよい。このプログラムをコンピュータにインストールし、実行することによって、本実施の形態４における話題推薦装置と話題推薦方法とを実現することができる。 [program]
The program according to Embodiment 4 of the present invention may be any program that causes a computer to execute steps S31 to S5 shown in FIG. 17. Alternatively, it may be any program that causes a computer to implement the means denoted by reference numerals 101 to 105, 301, and 401 shown in FIG. 16. By installing this program on a computer and executing it, the topic recommendation device and the topic recommendation method according to Embodiment 4 can be realized.

本発明の実施の形態４の場合、コンピュータのＣＰＵ（Central Processing Unit）は、入力手段１０１、評価情報保管手段３０１、評価情報選択手段４０１、次元圧縮手段１０２、グループ化手段１０３、推薦候補話題作成手段１０４、及び、推薦手段１０５として機能し、処理を行なう。また、本実施の形態４では、評価履歴蓄積手段１２１およびコミュニケーション相手蓄積手段２２１、ユーザモデル蓄積手段３２２、コンテンツモデル蓄積手段３２１は、コンピュータに備えられたハードディスク等の記憶装置に、これらを構成するデータファイルを格納することによって、又はこのデータファイルが格納された記録媒体をコンピュータと接続された読取装置に搭載することによって実現されている。
本発明の実施の形態４における話題推薦装置を実現するコンピュータは、図１８に示した実施の形態１における話題推薦装置を実現するコンピュータと同様の構成である。 In Embodiment 4 of the present invention, the CPU (Central Processing Unit) of the computer functions as the input means 101, the evaluation information complementing means 301, the evaluation information selection means 401, the dimension compression means 102, the grouping means 103, the recommended topic candidate creation means 104, and the recommendation means 105, and performs the processing. In Embodiment 4, the evaluation history storage means 121, the communication partner storage means 221, the user model storage means 322, and the content model storage means 321 are realized by storing the data files that constitute them in a storage device such as a hard disk provided in the computer, or by mounting a recording medium that stores these data files on a reading device connected to the computer.
The computer that implements the topic recommendation device according to Embodiment 4 of the present invention has the same configuration as the computer, shown in FIG. 18, that implements the topic recommendation device according to Embodiment 1.

［効果］
トピックモデルによる次元圧縮手法は、評価履歴が非常に疎な場合、精度よく潜在トピックを推定できない。したがって、興味領域を精度よく特定することができず、ユーザとコミュニケーション相手との興味のある話題を提供することができない。このような構成をとることで、新規の話題で評価数の少ない話題や、新規のユーザで評価件数の少ないユーザがいる場合でも、これらを除外した行列を作成することで、精度よく興味領域を特定できる。これにより、ユーザとコミュニケーション相手との興味のある話題を提供することができる。 [effect]
When the evaluation history is very sparse, dimension compression based on a topic model cannot estimate the latent topics accurately. As a result, the regions of interest cannot be identified accurately, and topics that interest both the user and the communication partner cannot be provided. With the configuration described above, even when there are new topics with few evaluations or new users who have made few evaluations, a matrix that excludes them is created, so the regions of interest can be identified accurately. This makes it possible to provide topics that interest both the user and the communication partner.

また、推薦候補作成時に、圧縮した次元の値だけでなく、ユーザモデルとコンテンツモデルとからも話題のスコアを算出することで、除外した話題やユーザに対しても推薦することができる。これにより、徐々に評価数を増やすことができる。 In addition, because the topic score is calculated at candidate-creation time not only from the compressed-dimension values but also from the user model and the content model, recommendations can also be made for the excluded topics and users. As a result, the number of evaluations can be increased gradually.

＜付記＞
上記実施形態の一部又は全部は、以下の付記のようにも記載されうる。以下、本発明における情報提供装置、プログラム、情報提供方法の構成の概略を説明する。但し、本発明は、以下の構成に限定されない。 <Appendix>
Part or all of the above-described embodiment can be described as in the following supplementary notes. The outline of the configuration of the information providing apparatus, the program, and the information providing method in the present invention will be described below. However, the present invention is not limited to the following configuration.

（付記１）
各ユーザの各話題に対する評価から成る評価履歴情報を入力する入力手段と、
前記評価履歴情報に基づいて、話題を行とし、ユーザを列とし、ユーザによる話題の評価値を値とする２次元行列を作成し、同じ話題を評価したユーザと、同じユーザから評価された話題とを一つの次元に圧縮し、話題の圧縮次元ベクトル表現とユーザの圧縮次元ベクトル表現とを出力する次元圧縮手段と、
話題の圧縮次元ベクトル表現の要素値が第１の閾値以上である話題と、ユーザの圧縮次元ベクトル表現の要素値が第２の閾値以上であるユーザとを、ベクトルの要素毎に一つのグループとして対応づけるグループ化手段と、
前記グループに基づいて、ユーザ毎に当該ユーザが属するグループを抽出すると共に、当該グループに属する話題を抽出し、当該話題の圧縮次元ベクトル表現の要素値に基づいて推薦話題候補を選定する推薦話題候補作成手段と、
前記推薦話題候補のうち、前記評価履歴情報に基づいて話題の提供対象となるユーザがまだ評価していない話題を当該ユーザに提供する話題提供手段と、
を備えた情報提供装置。 (Appendix 1)
An information providing apparatus comprising:
input means for inputting evaluation history information consisting of each user's evaluations of each topic;
dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix whose rows are topics, whose columns are users, and whose values are the users' evaluation values of the topics, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
grouping means for associating, as one group per vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
recommended topic candidate creation means for extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the topics' compressed-dimension vector representations; and
topic providing means for providing a user who is the target of topic provision with those recommended topic candidates that, according to the evaluation history information, the user has not yet evaluated.
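The grouping, candidate selection, and topic provision recited in Appendix 1 can be sketched as follows, taking the compressed-dimension vectors as given. All names here are hypothetical. Ranking a topic by its largest element value among the groups it shares with the user is one possible reading of "based on the element values", chosen for illustration only.

```python
def build_groups(topic_vectors, user_vectors, topic_threshold, user_threshold):
    """Form one group per vector element: {k: (topic_set, user_set)}."""
    if not topic_vectors:
        return {}
    n_dims = len(next(iter(topic_vectors.values())))
    groups = {}
    for k in range(n_dims):
        topics = {t for t, vec in topic_vectors.items() if vec[k] >= topic_threshold}
        users = {u for u, vec in user_vectors.items() if vec[k] >= user_threshold}
        groups[k] = (topics, users)
    return groups


def recommend(user, groups, topic_vectors, evaluated):
    """Rank topics from the user's groups and drop those already evaluated."""
    scores = {}
    for k, (topics, users) in groups.items():
        if user not in users:
            continue
        for topic in topics:
            # Assumption: keep the largest element value seen for this topic.
            scores[topic] = max(scores.get(topic, 0.0), topic_vectors[topic][k])
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return [t for t in ranked if t not in evaluated]


# Illustrative two-dimensional vectors.
topic_vectors = {"t1": [0.9, 0.1], "t2": [0.6, 0.4], "t3": [0.1, 0.9]}
user_vectors = {"u1": [0.8, 0.2], "u2": [0.3, 0.7]}
groups = build_groups(topic_vectors, user_vectors, 0.5, 0.5)
candidates = recommend("u1", groups, topic_vectors, evaluated={"t1"})
```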

（付記２）
付記１に記載の情報提供装置であって、
前記次元圧縮手段が、前記評価履歴情報に基づいて、潜在トピックモデリングにより、話題に潜在的に存在するカテゴリである潜在トピックを要素として、話題の属する潜在トピックの確率と、ユーザの属する潜在トピックの確率とを計算し、前記話題の圧縮次元ベクトル表現と前記ユーザの圧縮次元ベクトル表現とを、話題の潜在トピックの確率とユーザの潜在トピックの確率とで表し、
前記グループ化手段が、第１の閾値以上の潜在トピック確率を持つ話題と、第２の閾値以上の潜在トピック確率を持つユーザとを、各潜在トピック毎に一つのグループとして対応づけ、
前記推薦話題候補作成手段が、ユーザ毎に当該ユーザが属するグループを抽出すると共に、当該グループに属する話題を抽出し、当該話題の潜在トピックの確率に基づいて推薦話題候補を選定する、
情報提供装置。 (Appendix 2)
The information providing apparatus according to Appendix 1, wherein
the dimension compression means calculates, based on the evaluation history information and by latent topic modeling, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, the latent topics being categories latently present in the topics and serving as the vector elements, and expresses the compressed-dimension vector representation of the topic and that of the user by the topic's latent topic probabilities and the user's latent topic probabilities, respectively;
the grouping means associates, as one group per latent topic, the topics having a latent topic probability equal to or greater than the first threshold with the users having a latent topic probability equal to or greater than the second threshold; and
the recommended topic candidate creation means extracts, for each user, the groups to which the user belongs, extracts the topics belonging to those groups, and selects recommended topic candidates based on the latent topic probabilities of those topics.

（付記３）
付記１又は２に記載の情報提供装置であって、
前記次元圧縮手段が、前記評価履歴情報と、各ユーザのコミュニケーション関係を示すコミュニケーション相手情報と、に基づいて、同じ話題を評価したコミュニケーション相手同士のユーザと、同じユーザから評価された話題とを一つの次元に圧縮し、話題の圧縮次元ベクトル表現とユーザの圧縮次元ベクトル表現とを出力する、
情報提供装置。 (Appendix 3)
The information providing apparatus according to Appendix 1 or 2, wherein
the dimension compression means compresses, based on the evaluation history information and on communication partner information indicating the communication relationships of each user, users who are communication partners of one another and evaluated the same topic, and topics evaluated by the same user, into a single dimension, and outputs the compressed-dimension vector representation of each topic and that of each user.

（付記４）
付記３に記載の情報提供装置であって、
前記次元圧縮手段が、前記評価履歴情報と、前記コミュニケーション相手情報と、に基づいて、コミュニケーション相手同士を条件として重みを加える潜在トピックモデリングにより、話題に潜在的に存在するカテゴリである潜在トピックを要素として、話題の属する潜在トピックの確率と、ユーザの属する潜在トピックの確率を計算し、前記話題の圧縮次元ベクトル表現と前記ユーザの圧縮次元ベクトル表現とを、話題の潜在トピックの確率とユーザの潜在トピックの確率とで表す、
情報提供装置。 (Appendix 4)
The information providing apparatus according to Appendix 3, wherein
the dimension compression means calculates, based on the evaluation history information and the communication partner information, and by latent topic modeling that adds weight on the condition that users are communication partners of one another, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, the latent topics being categories latently present in the topics and serving as the vector elements, and expresses the compressed-dimension vector representation of the topic and that of the user by the topic's latent topic probabilities and the user's latent topic probabilities, respectively.

（付記５）
付記１乃至４のいずれかに記載の情報提供装置であって、
ユーザの好むキーワードと重みとからなるユーザモデルと、話題内のキーワードと重みとからなるコンテンツモデルと、前記評価履歴情報とに基づいて、ユーザにて評価されていない話題と当該ユーザとの類似度を計算し、当該類似度に応じて、前記評価履歴情報に話題に対する評価値を補完する評価情報補完手段をさらに備えた、
情報処理装置。 (Appendix 5)
The information providing apparatus according to any one of Appendices 1 to 4, further comprising
evaluation information complementing means for calculating, based on a user model consisting of keywords the user prefers and their weights, a content model consisting of keywords in a topic and their weights, and the evaluation history information, the similarity between a user and a topic that the user has not evaluated, and for complementing the evaluation history information with an evaluation value for that topic according to the similarity.

（付記６）
付記１乃至５のいずれかに記載の情報提供装置であって、
前記評価履歴情報に基づいて、評価するユーザが予め設定された基準に基づいて少ない話題と評価している話題の数が予め設定された基準に基づいて少ないユーザとを特定し、当該話題及び／又は当該ユーザ以外の評価情報を前記次元圧縮手段にて２次元行列を作成する対象として選択する評価履歴選択手段をさらに備えた、
情報処理装置。 (Appendix 6)
The information providing apparatus according to any one of Appendices 1 to 5, further comprising
evaluation history selection means for identifying, based on the evaluation history information, topics evaluated by few users according to a preset criterion and users who have evaluated few topics according to a preset criterion, and for selecting the evaluation information other than that of those topics and/or those users as the input from which the dimension compression means creates the two-dimensional matrix.

（付記７）
付記１乃至６のいずれかに記載の情報提供装置であって、
前記推薦話題候補作成手段が、ユーザの好むキーワードと重みとからなるユーザモデルと、話題内のキーワードと重みとからなるコンテンツモデルとに基づいて、話題とユーザとの類似度を計算し、話題の圧縮次元表現の値と、話題とユーザとの類似度とから、各話題のスコアを算出して、当該スコアに基づいて推薦話題候補を選定する、
情報提供装置。 (Appendix 7)
The information providing apparatus according to any one of Appendices 1 to 6, wherein
the recommended topic candidate creation means calculates the similarity between a topic and a user based on a user model consisting of keywords the user prefers and their weights and a content model consisting of keywords in the topic and their weights, calculates a score for each topic from the value of the topic's compressed-dimension representation and the similarity between the topic and the user, and selects recommended topic candidates based on the scores.

（付記８）
情報処理装置に、
各ユーザの各話題に対する評価から成る評価履歴情報を入力する入力手段と、
前記評価履歴情報に基づいて、話題を行とし、ユーザを列とし、ユーザによる話題の評価値を値とする２次元行列を作成し、同じ話題を評価したユーザと、同じユーザから評価された話題とを一つの次元に圧縮し、話題の圧縮次元ベクトル表現とユーザの圧縮次元ベクトル表現とを出力する次元圧縮手段と、
話題の圧縮次元ベクトル表現の要素値が第１の閾値以上である話題と、ユーザの圧縮次元ベクトル表現の要素値が第２の閾値以上であるユーザとを、ベクトルの要素毎に一つのグループとして対応づけるグループ化手段と、
前記グループに基づいて、ユーザ毎に当該ユーザが属するグループを抽出すると共に、当該グループに属する話題を抽出し、当該話題の圧縮次元ベクトル表現の要素値に基づいて推薦話題候補を選定する推薦話題候補作成手段と、
前記推薦話題候補のうち、前記評価履歴情報に基づいて話題の提供対象となるユーザがまだ評価していない話題を当該ユーザに提供する話題提供手段と、
を実現させるためのプログラム。 (Appendix 8)
A program for causing an information processing apparatus to implement:
input means for inputting evaluation history information consisting of each user's evaluations of each topic;
dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix whose rows are topics, whose columns are users, and whose values are the users' evaluation values of the topics, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
grouping means for associating, as one group per vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
recommended topic candidate creation means for extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the topics' compressed-dimension vector representations; and
topic providing means for providing a user who is the target of topic provision with those recommended topic candidates that, according to the evaluation history information, the user has not yet evaluated.

（付記９）
各ユーザの各話題に対する評価から成る評価履歴情報を入力し、
前記評価履歴情報に基づいて、話題を行とし、ユーザを列とし、ユーザによる話題の評価値を値とする２次元行列を作成し、同じ話題を評価したユーザと、同じユーザから評価された話題とを一つの次元に圧縮し、話題の圧縮次元ベクトル表現とユーザの圧縮次元ベクトル表現とを出力し、
話題の圧縮次元ベクトル表現の要素値が第１の閾値以上である話題と、ユーザの圧縮次元ベクトル表現の要素値が第２の閾値以上であるユーザとを、ベクトルの要素毎に一つのグループとして対応づけ、
前記グループに基づいて、ユーザ毎に当該ユーザが属するグループを抽出すると共に、当該グループに属する話題を抽出し、当該話題の圧縮次元ベクトル表現の要素値に基づいて推薦話題候補を選定し、
前記推薦話題候補のうち、前記評価履歴情報に基づいて話題の提供対象となるユーザがまだ評価していない話題を当該ユーザに提供する、
情報提供方法。 (Appendix 9)
An information providing method comprising:
inputting evaluation history information consisting of each user's evaluations of each topic;
creating, based on the evaluation history information, a two-dimensional matrix whose rows are topics, whose columns are users, and whose values are the users' evaluation values of the topics, compressing users who evaluated the same topic and topics evaluated by the same user into a single dimension, and outputting a compressed-dimension vector representation of each topic and a compressed-dimension vector representation of each user;
associating, as one group per vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the topics' compressed-dimension vector representations; and
providing a user who is the target of topic provision with those recommended topic candidates that, according to the evaluation history information, the user has not yet evaluated.

（付記１０）
付記９に記載の情報提供方法であって、
前記評価履歴情報に基づいて、潜在トピックモデリングにより、話題に潜在的に存在するカテゴリである潜在トピックを要素として、話題の属する潜在トピックの確率と、ユーザの属する潜在トピックの確率とを計算し、前記話題の圧縮次元ベクトル表現と前記ユーザの圧縮次元ベクトル表現とを、話題の潜在トピックの確率とユーザの潜在トピックの確率とで表し、
第１の閾値以上の潜在トピック確率を持つ話題と、第２の閾値以上の潜在トピック確率を持つユーザとを、各潜在トピック毎に一つのグループとして対応づけ、
ユーザ毎に当該ユーザが属するグループを抽出すると共に、当該グループに属する話題を抽出し、当該話題の潜在トピックの確率に基づいて推薦話題候補を選定する、
情報提供方法。 (Appendix 10)
The information providing method according to Appendix 9, comprising:
calculating, based on the evaluation history information and by latent topic modeling, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, the latent topics being categories latently present in the topics and serving as the vector elements, and expressing the compressed-dimension vector representation of the topic and that of the user by the topic's latent topic probabilities and the user's latent topic probabilities, respectively;
associating, as one group per latent topic, the topics having a latent topic probability equal to or greater than the first threshold with the users having a latent topic probability equal to or greater than the second threshold; and
extracting, for each user, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the latent topic probabilities of those topics.

なお、上述したプログラムは、記憶装置に記憶されていたり、コンピュータが読み取り可能な記録媒体に記録されている。例えば、記録媒体は、フレキシブルディスク、光ディスク、光磁気ディスク、及び、半導体メモリ等の可搬性を有する媒体である。 Note that the above-described program is stored in a storage device or recorded on a computer-readable recording medium. For example, the recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk, and a semiconductor memory.

以上、上記実施形態等を参照して本願発明を説明したが、本願発明は、上述した実施形態に限定されるものではない。本願発明の構成や詳細には、本願発明の範囲内で当業者が理解しうる様々な変更をすることができる。 Although the present invention has been described with reference to the above-described embodiment and the like, the present invention is not limited to the above-described embodiment. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

以上のように、本発明によれば、興味分野を考慮して、ユーザにもコミュニケーション相手にも興味のある話題を提供することができるので、ユーザ間のコミュニケーションを活性化することができる。よって、本発明は、例えば、コミュニケーションツールにおける情報提供支援装置、または、情報提供支援装置をコンピュータに実現するプログラムといった用途に適用できる。 As described above, according to the present invention, it is possible to provide a topic that is interesting to both the user and the communication partner in consideration of the field of interest, so that communication between users can be activated. Therefore, the present invention can be applied to an application such as an information providing support device in a communication tool or a program for realizing the information providing support device on a computer.

１０１入力手段
１０２次元圧縮手段
１０３グループ化手段
１０４推薦話題候補作成手段
１０５推薦手段
１２１評価履歴蓄積手段
２２１コミュニケーション相手蓄積手段
３０１評価情報補完手段
３２１コンテンツモデル蓄積手段
３２２ユーザモデル蓄積手段
４０１評価情報選択手段
１１１ＣＰＵ
１１２メインメモリ
１１３記憶装置
１１４入力インターフェイス
１１５表示コントローラ
１１６データリーダ／ライタ
１１７通信インターフェイス
１１８入力機器
１１９ディスプレイ装置
１２０記録媒体
１２１バス
１５０コンピュータ
101 Input means
102 Dimension compression means
103 Grouping means
104 Recommended topic candidate creation means
105 Recommendation means
121 Evaluation history storage means
221 Communication partner storage means
301 Evaluation information complementing means
321 Content model storage means
322 User model storage means
401 Evaluation information selection means
111 CPU
112 Main memory
113 Storage device
114 Input interface
115 Display controller
116 Data reader/writer
117 Communication interface
118 Input device
119 Display device
120 Recording medium
121 Bus
150 Computer

Claims

An information providing apparatus comprising:
input means for inputting evaluation history information consisting of each user's evaluations of each topic;
dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix whose rows are topics, whose columns are users, and whose values are the users' evaluation values of the topics, calculating, for users who evaluated the same topic and topics evaluated by the same user, by latent topic modeling with latent topics as the elements, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
grouping means for associating, as one group per vector element, the topics whose compressed-dimension vector element value is equal to or greater than a first threshold with the users whose compressed-dimension vector element value is equal to or greater than a second threshold;
recommended topic candidate creation means for extracting, for each user and based on the groups, the groups to which the user belongs, extracting the topics belonging to those groups, and selecting recommended topic candidates based on the element values of the topics' compressed-dimension vector representations; and
topic providing means for providing a user who is the target of topic provision with those recommended topic candidates that, according to the evaluation history information, the user has not yet evaluated.

Input means for inputting evaluation history information consisting of each user's evaluation of each topic;
Dimension compression means for creating, based on the evaluation history information and on communication partner information indicating the communication relationships of each user, a two-dimensional matrix with topics as rows, users as columns, and the users' evaluation values of the topics as values; calculating, by latent topic modeling that applies a weight conditioned on being a communication partner when compressing into one dimension the communication-partner users who evaluated the same topic and the topics evaluated by the same user, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, with latent topics, which are categories latently present in the topics, as the elements; and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
Grouping means for associating, for each vector element, the topics whose element value in the compressed-dimension vector representation of the topic is equal to or greater than a first threshold and the users whose element value in the compressed-dimension vector representation of the user is equal to or greater than a second threshold, as one group;
Recommended topic candidate creating means for extracting, for each user and based on the groups, the group to which the user belongs, extracting the topics belonging to that group, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of the topics;
Topic providing means for providing a user, from among the recommended topic candidates and based on the evaluation history information, with topics that the user to whom the topics are provided has not yet evaluated;
An information providing apparatus comprising the above means.
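
As a rough illustration of how communication partner information might enter the two-dimensional matrix, the following Python sketch multiplies an evaluation value by a factor when a communication partner of the same user also evaluated the same topic. This is only one conceivable reading: the claim places the weighting inside the latent topic modeling and does not fix its form, so the multiplicative `partner_weight`, the function name, and the sample data are all assumptions.

```python
def build_weighted_matrix(evaluations, partners, topics, users, partner_weight=2.0):
    """Build a topics-by-users matrix of evaluation values.

    A cell is multiplied by partner_weight (an assumed, illustrative factor)
    when at least one communication partner of the user evaluated the topic.
    """
    matrix = []
    for topic in topics:
        row = []
        for user in users:
            value = evaluations.get((user, topic), 0.0)
            shared = any((p, topic) in evaluations for p in partners.get(user, ()))
            row.append(value * partner_weight if shared else value)
        matrix.append(row)
    return matrix

# Hypothetical data: u1 and u2 are communication partners; u3 has none.
evaluations = {("u1", "t1"): 1.0, ("u2", "t1"): 1.0, ("u3", "t2"): 1.0}
partners = {"u1": {"u2"}, "u2": {"u1"}}

matrix = build_weighted_matrix(evaluations, partners, ["t1", "t2"], ["u1", "u2", "u3"])
print(matrix)
```

The resulting matrix would then be passed to whatever latent topic model is used for dimension compression; that step is not shown here.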

The information providing device according to claim 1 or 2,
Further comprising evaluation information complementing means for calculating, based on a user model consisting of keywords the user likes and their weights, a content model consisting of keywords contained in a topic and their weights, and the evaluation history information, the similarity between a user and a topic that the user has not evaluated, and for complementing the evaluation value for that topic in the evaluation history information according to the similarity.
Information processing device.
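
The complementing step can be sketched in Python as follows. Cosine similarity between keyword-weight dictionaries, the `min_similarity` cutoff, and the choice to store the similarity itself as the complemented evaluation value are assumptions for illustration; the claim only states that the value is complemented according to the similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two keyword -> weight dictionaries."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def complement_evaluations(evaluations, user_models, content_models, min_similarity=0.5):
    """Fill in evaluation values for unevaluated (user, topic) pairs whose
    user model and content model are sufficiently similar (assumed rule)."""
    filled = dict(evaluations)
    for user, user_model in user_models.items():
        for topic, content_model in content_models.items():
            if (user, topic) in evaluations:
                continue  # existing evaluations are left untouched
            similarity = cosine(user_model, content_model)
            if similarity >= min_similarity:
                filled[(user, topic)] = similarity
    return filled

# Hypothetical keyword models.
user_models = {"u1": {"soccer": 1.0, "music": 1.0}}
content_models = {"t1": {"soccer": 1.0}, "t2": {"cooking": 1.0}}

filled = complement_evaluations({}, user_models, content_models)
print(filled)
```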

The information providing device according to any one of claims 1 to 3,
Further comprising evaluation history selection means for identifying, based on the evaluation history information, topics that have been evaluated by few users according to a preset criterion and users who have evaluated few topics according to a preset criterion, and for selecting the evaluation information other than that of the identified topics and/or users as the target from which the dimension compression means creates the two-dimensional matrix,
Information processing device.
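
A minimal Python sketch of this selection step is shown below. The minimum-count criteria and the function name are assumptions, and this version excludes both sparse topics and sparse users in a single pass, which is only one of the combinations the "and/or" wording allows.

```python
from collections import Counter

def select_evaluations(evaluations, min_users_per_topic=2, min_topics_per_user=2):
    """Keep only evaluations whose topic has enough evaluating users and
    whose user has evaluated enough topics (thresholds are assumed criteria)."""
    users_per_topic = Counter(topic for (_, topic) in evaluations)
    topics_per_user = Counter(user for (user, _) in evaluations)
    return {
        (user, topic): value
        for (user, topic), value in evaluations.items()
        if users_per_topic[topic] >= min_users_per_topic
        and topics_per_user[user] >= min_topics_per_user
    }

# Hypothetical history: t3 has a single evaluator, u3 evaluated a single topic.
evaluations = {
    ("u1", "t1"): 1, ("u2", "t1"): 1, ("u3", "t1"): 1,
    ("u1", "t2"): 1, ("u2", "t2"): 1,
    ("u1", "t3"): 1,
}

selected = select_evaluations(evaluations)
print(sorted(selected))
```

Counts are taken once on the original history, so removing a sparse user does not cascade into re-evaluating topic counts; an iterative variant would be a different design choice.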

The information providing device according to any one of claims 1 to 4,
Wherein the recommended topic candidate creating means calculates the similarity between a topic and a user based on a user model consisting of keywords the user likes and their weights and a content model consisting of keywords contained in the topic and their weights, calculates a score for each topic from the value of the compressed-dimension vector representation and the similarity between the topic and the user, and selects recommended topic candidates based on the score,
Information providing device.
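
The score-based selection can be sketched in Python as below. The claim does not say how the two quantities are combined, so the linear blend with a mixing parameter `alpha`, the `top_n` cutoff, and the sample values are assumptions chosen purely for illustration.

```python
def select_by_score(element_values, similarities, alpha=0.5, top_n=2):
    """Score each topic as a weighted sum of its compressed-dimension element
    value and its topic-user similarity (assumed combination rule), then
    return the top_n topics by score."""
    scores = {
        topic: alpha * value + (1.0 - alpha) * similarities.get(topic, 0.0)
        for topic, value in element_values.items()
    }
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Hypothetical values for one user and one group.
element_values = {"t1": 0.9, "t2": 0.6, "t3": 0.5}
similarities = {"t1": 0.1, "t2": 0.8, "t3": 0.4}

print(select_by_score(element_values, similarities))
```

With these sample values, a topic with a moderate element value but high keyword similarity outranks one with a high element value and low similarity, which is the kind of trade-off a combined score is meant to express.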

Input means for inputting evaluation history information consisting of each user's evaluation of each topic;
Dimension compression means for creating, based on the evaluation history information, a two-dimensional matrix with topics as rows, users as columns, and the users' evaluation values of the topics as values; calculating, by latent topic modeling applied to the users who evaluated the same topic and the topics evaluated by the same user, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, with latent topics as the elements; and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
Grouping means for associating, for each vector element, the topics whose element value in the compressed-dimension vector representation of the topic is equal to or greater than a first threshold and the users whose element value in the compressed-dimension vector representation of the user is equal to or greater than a second threshold, as one group;
Recommended topic candidate creating means for extracting, for each user and based on the groups, the group to which the user belongs, extracting the topics belonging to that group, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of the topics;
Topic providing means for providing a user, from among the recommended topic candidates and based on the evaluation history information, with topics that the user to whom the topics are provided has not yet evaluated;
A program for causing an information processing device to realize the above means.

Input means for inputting evaluation history information consisting of each user's evaluation of each topic;
Dimension compression means for creating, based on the evaluation history information and on communication partner information indicating the communication relationships of each user, a two-dimensional matrix with topics as rows, users as columns, and the users' evaluation values of the topics as values; calculating, by latent topic modeling that applies a weight conditioned on being a communication partner when compressing into one dimension the communication-partner users who evaluated the same topic and the topics evaluated by the same user, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, with latent topics, which are categories latently present in the topics, as the elements; and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
Grouping means for associating, for each vector element, the topics whose element value in the compressed-dimension vector representation of the topic is equal to or greater than a first threshold and the users whose element value in the compressed-dimension vector representation of the user is equal to or greater than a second threshold, as one group;
Recommended topic candidate creating means for extracting, for each user and based on the groups, the group to which the user belongs, extracting the topics belonging to that group, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of the topics;
Topic providing means for providing a user, from among the recommended topic candidates and based on the evaluation history information, with topics that the user to whom the topics are provided has not yet evaluated;
A program for causing an information processing device to realize the above means.

Inputting evaluation history information consisting of each user's evaluation of each topic;
Creating, based on the evaluation history information, a two-dimensional matrix with topics as rows, users as columns, and the users' evaluation values of the topics as values; calculating, by latent topic modeling applied to the users who evaluated the same topic and the topics evaluated by the same user, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, with latent topics as the elements; and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
Associating, for each vector element, the topics whose element value in the compressed-dimension vector representation of the topic is equal to or greater than a first threshold and the users whose element value in the compressed-dimension vector representation of the user is equal to or greater than a second threshold, as one group;
Extracting, for each user and based on the groups, the group to which the user belongs, extracting the topics belonging to that group, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of the topics;
Providing a user, from among the recommended topic candidates and based on the evaluation history information, with topics that the user to whom the topics are provided has not yet evaluated;
An information provision method comprising the above steps.

Inputting evaluation history information consisting of each user's evaluation of each topic;
Creating, based on the evaluation history information and on communication partner information indicating the communication relationships of each user, a two-dimensional matrix with topics as rows, users as columns, and the users' evaluation values of the topics as values; calculating, by latent topic modeling that applies a weight conditioned on being a communication partner when compressing into one dimension the communication-partner users who evaluated the same topic and the topics evaluated by the same user, the probability of each latent topic to which a topic belongs and the probability of each latent topic to which a user belongs, with latent topics, which are categories latently present in the topics, as the elements; and outputting a compressed-dimension vector representation of each topic expressed by the probabilities of the latent topics to which the topic belongs and a compressed-dimension vector representation of each user expressed by the probabilities of the latent topics to which the user belongs;
Associating, for each vector element, the topics whose element value in the compressed-dimension vector representation of the topic is equal to or greater than a first threshold and the users whose element value in the compressed-dimension vector representation of the user is equal to or greater than a second threshold, as one group;
Extracting, for each user and based on the groups, the group to which the user belongs, extracting the topics belonging to that group, and selecting recommended topic candidates based on the element values of the compressed-dimension vector representations of the topics;
Providing a user, from among the recommended topic candidates and based on the evaluation history information, with topics that the user to whom the topics are provided has not yet evaluated;
An information provision method comprising the above steps.