JP2013257668A

JP2013257668A - Interest analysis method, interest analyzer and program of the same

Info

Publication number: JP2013257668A
Application number: JP2012132387A
Authority: JP
Inventors: Masanari Fujita (藤田 将成); Tae Sato (佐藤 妙); Koji Ito (伊藤 浩二); Minoru Kobayashi (小林 稔)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-06-11
Filing date: 2012-06-11
Publication date: 2013-12-26
Anticipated expiration: 2032-06-11
Also published as: JP5723835B2

Abstract

PROBLEM TO BE SOLVED: To improve the precision of information recommendation by automatically extracting optimal context conditions.
SOLUTION: A global context/context ID setting part 117 collects context conditions related to content browsing. A divided context extraction processing part 116 divides an interest model into a table for each combination of the collected context conditions, extracts the tables to be updated from those per-combination tables on the basis of the relationships between the context conditions included in the combinations, and calculates a weight for each table to be updated on the basis of those relationships. An interest model update processing part 130 updates the user interest score for a concept in each table to be updated by using the weight and a feature score calculated from the content browsing history.

Description

The present invention relates to an interest analysis method, an interest analysis device, and a program therefor that analyze a user's interests with context taken into account, using the user's content browsing history and meta information indicating the concepts that represent the browsed content.

There is a demand for techniques that recommend appropriate services and content according to a user's actions and situation. To that end, techniques have been proposed for estimating a user's tastes and preferences from history information. For example, a book retail site estimates a user's interests from the history of book pages browsed within the site and recommends books accordingly. Among such methods, those that assume each content item carries meta information summarizing its contents, and that estimate the user's interests from how often concepts appear in the user's history, are known as content-based filtering (CBF) and are being studied in particular as memory-based methods.

Specifically, when a user browses a product of a particular brand (a product that holds a concept tag indicating the brand), content-based filtering presents other products of the same brand (products that hold the same concept tag). A memory-based method in this setting presents products of a particular brand if the past browsing history shows that the user browses that brand frequently. Among such techniques, there are methods that analyze the history by treating each entry as a selection from candidate items. These methods can also handle a change of context as a learning weight or as a switch between learning models (see, for example, Non-Patent Document 1 or 2).

Non-Patent Document 1: Kenta Oku, Shinsuke Nakajima, Jun Miyazaki, Shunsuke Uemura, "Proposal of a Context-Dependent Information Recommendation Method Based on Context-Aware SVM", Database Society of Japan, DBSJ Letters Vol.5, No.1, pp.1-4, June 2006
Non-Patent Document 2: Alexandros Karatzoglou, Xavier Amatriain, Linas Baltrunas, Nuria Oliver, "Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering", RecSys 2010: 79-86

In the prior art, however, an appropriate classification of contexts had to be defined manually in advance. Setting appropriate contexts therefore took considerable effort and was difficult. Moreover, even when contexts were set, if the setting was not appropriate, not enough history matching each context could be collected and appropriate recommendation results could not be obtained.

The present invention has been made in view of the above circumstances, and its object is to provide an interest analysis method, an interest analysis device, and a program therefor that automatically extract optimal context conditions and thereby make information recommendation more accurate.

To achieve the above object, a first aspect of the present invention is a method, a device, and a program that analyze a user's interests from the browsing history of content containing concepts, using an interest model that holds a user interest score for each of a plurality of concepts. A first content list, in which a plurality of content items were browsed as a list, and a second content list, in which the bodies of content items selected from the first content list were browsed, are grouped into clusters. For each cluster, let the first total be the total number of content items in the first content list, the first appearance count be the number of content items in the first content list in which the concept appears, the second total be the total number of content items in the second content list, and the second appearance count be the number of content items in the second content list in which the concept appears. Given the first total, the first appearance count, and the second total, a first probability that the number of content items in the second content list in which the concept appears is equal to or greater than the second appearance count, and a second probability that this number is equal to or less than the second appearance count, are calculated. A feature score is calculated from the first and second probabilities by the inverse of the cumulative distribution function of the standard normal distribution. Context conditions relating to the browsing of the content are collected; the interest model is divided into a table for each combination of the collected context conditions; the tables to be updated are extracted from the per-combination tables on the basis of the relationships between the context conditions included in the combinations; a weight is calculated for each table to be updated on the basis of those relationships; and the user interest score for the concept in each table to be updated is updated using the feature score and the weight.
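As a rough illustration of the feature score in the first aspect, the sketch below assumes that the number of detail-viewed items containing a concept follows a hypergeometric distribution (the second total is drawn from the first total, of which the first appearance count contain the concept). The text here does not say how the two probabilities are combined before the inverse normal CDF is applied, so averaging the lower tail with the complement of the upper tail, the clamp `eps`, and all function names are assumptions made for this example.

```python
from math import comb
from statistics import NormalDist


def tail_probabilities(first_total, first_count, second_total, second_count):
    """Return (P[X >= second_count], P[X <= second_count]).

    X is assumed hypergeometric: second_total items are drawn from
    first_total items, first_count of which contain the concept.
    """
    denom = comb(first_total, second_total)

    def pmf(k):
        # math.comb returns 0 when k exceeds n, so impossible outcomes add nothing.
        return (comb(first_count, k)
                * comb(first_total - first_count, second_total - k) / denom)

    ks = range(0, min(first_count, second_total) + 1)
    p_ge = sum(pmf(k) for k in ks if k >= second_count)
    p_le = sum(pmf(k) for k in ks if k <= second_count)
    return p_ge, p_le


def feature_score(p_ge, p_le, eps=1e-9):
    """Map the two tail probabilities to a Z value (assumed combination rule)."""
    p = (p_le + (1.0 - p_ge)) / 2.0
    p = min(max(p, eps), 1.0 - eps)  # inv_cdf requires 0 < p < 1
    return NormalDist().inv_cdf(p)


# Illustrative counts: 8 listed items, 4 with the concept, 3 viewed, 1 with the concept.
p_ge, p_le = tail_probabilities(8, 4, 3, 1)
z = feature_score(p_ge, p_le)
```

With these counts the concept is selected less often than chance would suggest (1 of 3 viewed versus 4 of 8 listed), so the resulting Z value is negative.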

According to the first aspect, context conditions can be extracted automatically from the way history data accumulates, and the degree of context fit (weight) to be used during learning can be determined automatically for the extracted context conditions. The automatic determination of context conditions reduces processing and operating costs, and analysis under appropriate context conditions that take diverse contexts into account makes information recommendation more accurate.

In a second aspect of the present invention, according to the first aspect, the interest model is divided into the per-combination tables on the basis of the amount of browsing history that matches the context conditions.
According to the second aspect, tying the context conditions to the amount of history avoids dividing the interest model by context conditions for which little history exists, so that accurate information recommendation can be achieved with minimal computer resources.
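The idea of the second aspect can be sketched as a simple filter: a combination of context conditions gets its own table only when enough history matches it. The threshold `min_records`, the tuple representation of a combination, and the function name are hypothetical; the actual extraction procedure is the one described later in the specification.

```python
from collections import Counter


def combinations_to_split(history_contexts, min_records):
    """Return the context combinations that have enough matching history.

    history_contexts: one tuple of context IDs per history record.
    min_records: hypothetical minimum number of records needed for a split.
    """
    counts = Counter(history_contexts)
    return {combo for combo, n in counts.items() if n >= min_records}


history = [("morning", "home")] * 5 + [("night", "office")]
selected = combinations_to_split(history, min_records=3)
```

Here only the ("morning", "home") combination is split out, because the single ("night", "office") record is too little to learn from.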

In a third aspect of the present invention, according to the first or second aspect, the context conditions at the time of a content request are collected, and an evaluation score for content is calculated using the table that matches those context conditions.
According to the third aspect, calculating the evaluation score of content with the user interest model that matches the user's situation at the time of the content request makes it possible to recommend content that matches the user's interests with high accuracy.

That is, the present invention can provide an interest analysis method, an interest analysis device, and a program therefor that automatically extract optimal context conditions and make information recommendation more accurate.

FIG. 1: Overall diagram of a system using the interest analysis device according to the present embodiment.
FIG. 2: Block diagram showing the functional configuration of each device in FIG. 1.
FIG. 3: Diagram showing an overview of the interest analysis processing using the browsing history.
FIG. 4: Diagram showing an overview of the interest analysis processing when context conditions are set.
FIG. 5: Diagram showing an example of content request data.
FIG. 6: Diagram showing an example of a content browsing operation on the client terminal.
FIG. 7: Diagram showing a data configuration example of the list browsing content list.
FIG. 8: Diagram showing a data configuration example of the detailed browsing content.
FIG. 9: Diagram showing a data configuration example of the presented content list.
FIG. 10: Diagram showing an example of the content database.
FIG. 11: Diagram showing an example of the user interest score database.
FIG. 12: Diagram showing an example of the context-specific history amount database.
FIG. 13: Diagram showing an example of the context/relevance definition database.
FIG. 14: Diagram showing the processing flow of the history information receiving unit.
FIG. 15: Diagram showing the processing flow of the learning target interest table selection processing unit.
FIG. 16: Diagram showing the processing flow of the global context/context ID setting unit.
FIG. 17: Diagram showing the processing flow of the interest model update processing unit.
FIG. 18: Diagram showing a data configuration example of the analysis parameter list.
FIG. 19: Schematic diagram for explaining the operation of the feature score calculation unit.
FIG. 20: Diagram showing the details of the feature score calculation processing.
FIG. 21: Diagram showing the details of the interest model update processing.
FIG. 22: Diagram showing the processing flow of the context history append processing unit.
FIG. 23: Diagram showing the processing flow of divided context extraction.
FIG. 24: Diagram showing a processing overview of the context division method.
FIG. 25: Diagram showing a specific example of the weight calculation processing based on context conditions.
FIG. 26: Diagram showing the processing flow of the content request receiving unit.
FIG. 27: Diagram showing the processing flow of the use interest table selection processing unit.
FIG. 28: Diagram showing the processing flow of the content evaluation processing unit.
FIG. 29: Diagram showing an example of the content score list.
FIG. 30: Diagram showing the details of the content evaluation processing.

Embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 is an overall diagram of a system using the interest analysis device according to the present embodiment. The system includes a client terminal 200, a content server 300, and an interest analysis device 100. The client terminal 200 and the content server 300 are connected via a communication network, as are the content server 300 and the interest analysis device 100. Through browsing operations on the client terminal 200, the user acquires desired content from the content server 300 and views it on the screen of the client terminal 200.

The client terminal 200 collects the history of content browsing by user operations. It transmits to the content server 300 a list browsing content list (first content list) of the content items browsed as a list and a detailed browsing content list (second content list) of the content items whose bodies were browsed from that list, together with terminal context information at the time of browsing. The terminal context information includes, for example, the measurement times and measurement results of sensors on the terminal, such as position information, acceleration, an earth-axis sensor, and a thermometer. The content server 300 forwards the list browsing content list and the detailed browsing content list, together with this terminal context information, to the interest analysis device 100 via the communication network.

The interest analysis device 100 analyzes the user's interests from the browsing history of content that contains concepts as meta information, using an interest model that holds a user interest score for each of a plurality of concepts. Specifically, it uses the list browsing content list and the detailed browsing content list to calculate a feature score and a user interest score for each concept appearing in the content, and thereby estimates the user's interests. On the basis of the user interest scores, the interest analysis device 100 generates, from the presented content list received from the content server 300, a list of content sorted according to the user's interests (sorted presented content list) and transmits it to the content server 300.

FIG. 2 is a block diagram showing the functional configuration of each device in FIG. 1. Each unit in FIG. 2 can be implemented, for example, by the CPU (Central Processing Unit) of each device and a control program executed in memory.
The interest analysis device 100 includes a history information receiving unit 110, a learning target interest table selection processing unit 113, a context history append processing unit 115, a divided context extraction processing unit 116, a global context/context ID setting unit 117, a content request receiving unit 121, a use interest table selection processing unit 124, an interest model update processing unit 130, a context/relevance definition database 131, a context-specific history amount database 132, an interest score database 140, a presented content list receiving unit 150, a content database 160, a content evaluation processing unit 170, and a sorted content score list transmission unit 180.

FIG. 3 shows an overview of the interest analysis processing that the interest analysis device 100 performs using the browsing history.
The history information receiving unit 110 receives the list browsing content list and the detailed browsing content list from the client terminal 200 via the content server 300. The list browsing content list is, for example, the list of content items for which the user viewed only the titles in a list view. The detailed browsing content list is the list of content items whose body (details) the user viewed. In FIG. 3, for example, the list browsing content list contains contents 1 to 8 and the detailed browsing content list contains contents 1, 3, and 4. The hatched content items in FIG. 3 indicate that concept B appears in contents 1, 6, 7, and 8.
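The four counts that feed the feature score can be read directly off the FIG. 3 example. The following is a minimal sketch with hypothetical function and variable names; only the content numbers come from the text.

```python
def cluster_counts(list_ids, detail_ids, ids_with_concept):
    """Return (first total, first appearance count, second total, second appearance count)."""
    first_total = len(list_ids)
    first_count = sum(1 for c in list_ids if c in ids_with_concept)
    second_total = len(detail_ids)
    second_count = sum(1 for c in detail_ids if c in ids_with_concept)
    return first_total, first_count, second_total, second_count


# FIG. 3 example: contents 1-8 listed, contents 1, 3, 4 viewed in detail,
# concept B appears in contents 1, 6, 7, 8.
list_view = [1, 2, 3, 4, 5, 6, 7, 8]
detail_view = [1, 3, 4]
concept_b = {1, 6, 7, 8}

counts = cluster_counts(list_view, detail_view, concept_b)  # (8, 4, 3, 1)
```

Concept B appears in half of the listed items but in only one of the three items viewed in detail, which is the kind of gap the feature score is meant to quantify.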

The interest model update processing unit 130 uses the list browsing content list and the detailed browsing content list to calculate a feature score (the Z value described later) for each concept from a statistical model of concept selection. It then uses the feature scores to update the user interest scores in the interest model, which is divided by context set ID as described later (one user interest table per context set ID).

The content evaluation processing unit 170 calculates the user's evaluation score for a content item by probabilistically combining the user interest scores of the concepts that appear in the content under evaluation. In the example of FIG. 3, the evaluation score of content 1 is obtained from the user interest scores of concepts E, F, and D, which appear in content 1.
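This passage does not define the probability combination, so the sketch below substitutes one well-known way of combining Z values, the weighted Stouffer method, purely as an illustration. Using the concept relevance as the weight, treating an unknown concept as Z = 0, and the function name are all assumptions, not the method of the specification.

```python
from math import sqrt
from statistics import NormalDist


def evaluation_score(interest_z, relevance):
    """Combine per-concept interest Z values into one score in (0, 1).

    interest_z: concept -> user interest score (assumed to be a Z value).
    relevance:  concept -> relevance of the concept to the content (assumed weight).
    """
    numerator = sum(w * interest_z.get(c, 0.0) for c, w in relevance.items())
    denominator = sqrt(sum(w * w for w in relevance.values()))
    if denominator == 0.0:
        return 0.5  # no usable concept: neutral score
    # Weighted Stouffer: the combined Z is again standard normal under the null.
    return NormalDist().cdf(numerator / denominator)


# Content 1 in FIG. 3 carries concepts E, F and D; the Z values are made up.
score = evaluation_score({"E": 1.0, "F": 1.0, "D": 1.0},
                         {"E": 1.0, "F": 1.0, "D": 1.0})
```

A score above 0.5 means the concepts in the content are, taken together, selected more often than chance by this user under the assumed model.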

FIG. 4 is a diagram showing an overview of the interest analysis processing when context conditions are set. The interest model update processing unit 130 automatically extracts the context set IDs, each indicating a set of context conditions (context IDs) to be analyzed, and the learning weight for each context set ID. FIG. 4 shows a simple case. One context set ID is prepared per hour (24 in total; for simplicity each is called a set, although it actually holds only a time-slot context ID), and learning is weighted according to the browsing time. For example, if a browsing history entry occurs at 10:35, the interest table for that time (MODEL10) is updated, and the interest tables close to it in time (for example, MODEL9 and MODEL11) are updated at the same time. An interest table is the set of records that share a context set ID.
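The hourly example can be sketched as follows. The weight values (1.0 for the matching hour, 0.5 for each neighbor), the additive update, and the names are assumptions for illustration; the actual weights and update rule are described later in the specification.

```python
from collections import defaultdict


def hour_weights(hour, neighbor_weight=0.5):
    """Weights for the matching hourly table and its two neighbors (wraps at midnight)."""
    return {
        hour % 24: 1.0,
        (hour - 1) % 24: neighbor_weight,
        (hour + 1) % 24: neighbor_weight,
    }


def update_tables(tables, browse_hour, concept_scores):
    """Add weight * feature score to each affected hourly interest table."""
    for table_hour, weight in hour_weights(browse_hour).items():
        for concept, z in concept_scores.items():
            tables[table_hour][concept] += weight * z


# tables[h][concept] plays the role of the interest table MODELh.
tables = defaultdict(lambda: defaultdict(float))
update_tables(tables, 10, {"B": -0.5})  # browsing at 10:35 touches hours 9, 10, 11
```

Updating the neighboring tables lets a table learn from history that falls just outside its own time slot, which matters when each slot sees little data.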

The content evaluation processing unit 170 evaluates content using the interest table that matches the content request time. For example, as shown in FIG. 4, a recommendation made at 12:10 is evaluated using the interest table MODEL12.
However, when there are many context dimensions that could be analyzed (time, place, temperature, day of the week, season, and so on), dividing the interest model by every combination would require an enormous amount of computation and computer resources. It is also very difficult for an operator to set a weight for every combination. A method is therefore described later that automatically extracts optimal context conditions from the way history data accumulates and automatically determines the degree of context fit (weight) used during learning for each context set ID, which consists of a combination of context conditions.
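Two small points from this passage can be made concrete: choosing the hourly table for a request, and how quickly the number of tables grows when every combination of context values gets its own table. The table naming and the cardinalities below are illustrative assumptions.

```python
from datetime import time
from math import prod


def table_for_request(request_time: time) -> str:
    """Pick the hourly interest table that matches the request time."""
    return f"MODEL{request_time.hour}"


def full_split_table_count(cardinalities):
    """Number of tables if the model is split by every combination of context values."""
    return prod(cardinalities.values())


chosen = table_for_request(time(12, 10))
n_tables = full_split_table_count({"hour": 24, "weekday": 7, "season": 4})
```

Even three coarse dimensions already yield hundreds of tables per user, which is why the specification extracts only the combinations worth analyzing.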

(Client terminal)
In FIG. 2, the client terminal 200 includes a history collection unit 210, a history information transmission unit 220, a content presentation unit 230, a content request transmission unit 240, and a terminal information collection unit 250.
The content request transmission unit 240 requests the content server 300 to present content in response to a user instruction (input). Specifically, it transmits content request data such as that shown in FIG. 5 to the content server 300. The content request data contains, for example, a client terminal ID (or user ID), a request time, and terminal context information. The request time may instead be added by the content server 300. The client terminal ID (or user ID) is a number or the like assigned uniquely to each terminal (or user) and matches the user ID in the user interest table of the interest score database 140 described later.
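A content request of this shape could be modeled as below. The field names, the dictionary used for terminal context information, and the helper that fills in a missing request time on the server side are hypothetical; FIG. 5 defines the actual layout.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class ContentRequest:
    client_terminal_id: str                      # or a user ID
    terminal_context: dict = field(default_factory=dict)
    request_time: Optional[datetime] = None      # may be left for the server to add


def stamp_request_time(request: ContentRequest, now: datetime) -> ContentRequest:
    """Fill in the request time when the terminal did not send one."""
    if request.request_time is None:
        request.request_time = now
    return request


req = ContentRequest("terminal-001", {"temperature_c": 21.5})
req = stamp_request_time(req, datetime(2012, 6, 11, 12, 10))
```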

FIG. 6 shows an example of a content browsing operation by the user on the client terminal 200.
On the basis of the sorted presented content list received from the content server 300, the content presentation unit 230 displays the content items as a list, starting from the top of the sort order, to the extent that the display screen size of the client terminal 200 allows.

In the example of FIG. 6, ten content items (contents 1 to 10) are displayed as a list. Content items lower in the sort order can be brought into view by a flick, a scroll bar operation, or the like. The list of content items actually displayed on the client terminal 200 in this way is the list browsing content list. In other words, not every content item in the sorted presented content list is necessarily displayed on the client terminal 200, so not every item is necessarily included in the list browsing content list. When the user selects the title of a content item in the list by a click or similar operation, the body (details) of the selected content (contents 3, 5, and 6 in FIG. 6) can be viewed. Content items whose details were viewed are included in the detailed browsing content list.

As described above, the history collection unit 210 collects the user's operation history and creates the list browsing content list and the detailed browsing content list. The history information transmission unit 220 transmits the lists created by the history collection unit 210 to the content server 300.

FIG. 7 shows a data configuration example of the list browsing content list for the case of FIG. 6. The list browsing content list has a cluster ID, a content ID, and a browsing time. The cluster ID is an identifier ("1" in FIG. 7) assigned uniquely to a list browsing content list and its detailed browsing content list. When the user browses list content displayed at a different time (time slot), a different cluster ID is assigned. Apart from time, a new cluster ID is assigned when there has been no operation for a certain period while the list is displayed, when the browsing user (user ID) is switched, when the list is narrowed down by a search on content genre or a similar criterion, or when the browsing mode is switched in the browsing application. The content ID is an identifier assigned uniquely to each content item in the list and matches the value held in the content database 160 described later.
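Cluster ID assignment might look like the sketch below, which covers only two of the listed triggers: an idle period and a change of user. The 30-minute idle limit, the event format, and the function name are assumptions; the search and browsing-mode triggers would add further conditions of the same kind.

```python
from datetime import datetime, timedelta


def assign_cluster_ids(events, idle_limit=timedelta(minutes=30)):
    """Assign a cluster ID to each (user_id, timestamp) event, sorted by time.

    A new cluster starts on the first event, on a user switch,
    or after an idle gap longer than idle_limit (hypothetical value).
    """
    cluster_ids = []
    current = 0
    prev_user, prev_time = None, None
    for user_id, ts in events:
        new_cluster = (
            prev_user is None
            or user_id != prev_user
            or ts - prev_time > idle_limit
        )
        if new_cluster:
            current += 1
        cluster_ids.append(current)
        prev_user, prev_time = user_id, ts
    return cluster_ids


events = [
    ("u1", datetime(2012, 6, 11, 10, 0)),
    ("u1", datetime(2012, 6, 11, 10, 5)),
    ("u1", datetime(2012, 6, 11, 11, 0)),   # 55 minutes idle: new cluster
    ("u2", datetime(2012, 6, 11, 11, 1)),   # user switch: new cluster
]
ids = assign_cluster_ids(events)
```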

FIG. 8 shows a data configuration example of the detailed browsing content list for the case of FIG. 6. Like the list browsing content list, the detailed browsing content list has a cluster ID, a content ID, and a browsing time. The cluster ID has the same value as in the list browsing content list ("1" in FIG. 8). In the detailed browsing content list, the content ID and the browsing time are the identifier of a content item that the user selected from the list and viewed in detail (contents 3, 5, and 6 in FIG. 8) and the time at which that item was viewed.

(Content server)
In FIG. 2, the content server 300 includes a content transmission processing unit 310, a sorted presented content list receiving unit 320, a presented content list transmission unit 330, a presented content list input unit 340, a history information transfer unit 350, and a content request transfer unit 360.

The history information transfer unit 350 forwards the list browsing content list and the detailed browsing content list received from the client terminal 200 to the interest analysis device 100 via the communication network.
The service operator inputs to the presented content list input unit 340 a presented content list, which lists the content to be presented to the client terminal 200 used by the user. The presented content list transmission unit 330 transmits the input presented content list to the interest analysis device 100 via the communication network.

FIG. 9 shows a data configuration example of the presented content list. The presented content list has a content ID, a concept ID/relevance list, a content body, and a content registration time. The content ID is a unique ID that the content server 300 assigns to each content item. The concept ID/relevance list stores pairs of the concept ID of a concept appearing in the content and a relevance W indicating how strongly that concept is related to the content. The concept ID/relevance list is set in advance for each content item. For example, content 1 (a sports article) has entries such as {concept ID of "baseball" = 1, relevance = 0.5}, {concept ID of "soccer" = 2, relevance = 0.8}, {concept ID of "golf" = 3, relevance = 0.6}, and so on.

The concept ID matches the value stored in the interest score database 140. The relevance takes a value from 0 to 1, for example, with a larger value indicating a stronger relationship. The relevance is either a value set by the service operator when the content is registered or a value calculated by a separate system.
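One entry of the presented content list could be represented as follows, using the sports article example from the text. The class and field names are hypothetical, and the range check assumes the 0-to-1 convention, which the text gives only as an example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PresentedContent:
    content_id: int
    concept_relevance: dict   # concept ID -> relevance W
    body: str
    registered_at: datetime

    def __post_init__(self):
        # Assumes the 0-to-1 relevance convention given as an example in the text.
        for concept_id, w in self.concept_relevance.items():
            if not 0.0 <= w <= 1.0:
                raise ValueError(f"relevance for concept {concept_id} out of range: {w}")


# Content 1 (sports article): baseball = 1, soccer = 2, golf = 3.
content1 = PresentedContent(
    content_id=1,
    concept_relevance={1: 0.5, 2: 0.8, 3: 0.6},
    body="sports article",
    registered_at=datetime(2012, 6, 11),
)
strongest = max(content1.concept_relevance, key=content1.concept_relevance.get)
```

In this entry the concept most strongly related to the article is soccer (concept ID 2).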

The sorted presented content list receiving unit 320 receives from the interest analysis device 100 a sorted presented content list, in which part or all of the presented content list has been sorted, together with a client terminal ID (or user ID). The content transmission processing unit 310 transmits the sorted presented content list to the client terminal 200 corresponding to that client terminal ID (or user ID).
The content request transfer unit 360 forwards the content request data (FIG. 5), which is the content presentation request from the content request transmission unit 240 of the client terminal 200, to the interest analysis device 100.

(Interest analysis device)
Next, each part of the interest analysis device 100 is described in detail.
[Content database 160]
FIG. 10 shows an example of the data structure of the content database 160. The content database 160 has a content table.

The content table stores the content ID, the concept ID / relevance list, the content body, and the content registration time. The values received by the presented content list receiving unit 150 are stored here.
[Interest score database 140]
FIG. 11 shows an example of the data structure of the interest score database 140. The interest score database 140 has a user interest table.

The user interest table stores the concept ID, context set ID, user ID (client terminal ID), TotalZ (user interest score), X, and Y. The definitions and calculation of TotalZ, X, and Y are described later. In other words, the data for each context set ID in the user interest table is created, for each user (terminal), per combination (set) of context conditions. This supports an analysis that assumes the user behaves in a characteristic way under that particular combination of contexts.

[Context-specific history amount database 132]
FIG. 12 shows an example of the data structure of the context-specific history amount database 132. The context-specific history amount database 132 has a context-specific user history amount table and an analysis target context set table.

The context-specific user history amount table includes a client terminal ID (user ID), a context ID, and a matching cluster ID list. The cluster IDs in the matching cluster ID list are the same values as the cluster IDs of the list browsing content list in FIG. 7 and are assigned by the history information receiving unit 110 described later. The matching cluster ID list indicates the group of cluster IDs of the browsing histories that match this context ID (context condition).

The analysis target context set table includes a user ID (client terminal ID), a context set ID, a matching context ID list, and an adjacent context set ID list. The matching context ID list indicates the group of context IDs of the context conditions that this context set must satisfy. The adjacent context set ID list indicates the group of context set IDs that should be learned at the same time when learning from the history of this context set, together with the weight w to use for each. For example: {context set ID 1, 0.1}, {context set ID 2, 0.5}, and so on.

[Divided context / relevance definition database 131]
FIG. 13 shows an example of the data structure of the divided context / relevance definition database 131. The divided context / relevance definition database 131 has a context ID table and a context relevance table.

The context ID table includes a context ID and a context condition. The context conditions include, for example, time-of-day conditions obtained by dividing 24 hours into eight parts (0:00 to 3:00, 3:00 to 6:00, and so on), temperature conditions obtained by dividing the temperature range into ten parts (0 degrees or lower, 0 to 5 degrees, ..., 30 degrees or higher), day-of-week conditions (Monday, Tuesday, ..., Sunday), and season or weather conditions (spring, summer, autumn, winter; sunny, rainy, cloudy).
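The time-of-day conditions above amount to mapping a timestamp to a fixed-width slot. A minimal sketch, assuming equal-width slots and a hypothetical label format (the function name and the label string are not from the specification):

```python
def time_slot_condition(hour: int, slots_per_day: int = 8) -> str:
    """Return the time-of-day condition that contains the given hour.

    slots_per_day=8 gives the 3-hour slots mentioned in the text.
    The "start:00-end:00" label format is an assumption for illustration.
    """
    if not 0 <= hour < 24:
        raise ValueError("hour must be in 0..23")
    if slots_per_day <= 0 or 24 % slots_per_day != 0:
        raise ValueError("slots_per_day must divide 24")
    width = 24 // slots_per_day
    start = (hour // width) * width
    return f"{start}:00-{start + width}:00"
```

For example, `time_slot_condition(4)` falls in the 3:00 to 6:00 slot, and `time_slot_condition(7, slots_per_day=12)` falls in a 2-hour slot from 6:00 to 8:00.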

The context relevance table includes a context ID, related context IDs, and the distances to the related context IDs. The context ID is a number or similar value uniquely assigned by the global context / context ID setting unit 117 described later. The related context IDs are the group of context IDs that are related to this context ID. For the distances, a numerical value indicating the distance between this context ID and each related context ID is stored as a pair of context ID and distance value; the larger the value, the weaker the relationship with this context ID.

[Presented content list receiving unit 150]
The presented content list receiving unit 150 receives a presented content list such as the one in FIG. 9 from the content server 300 and stores it in the content database 160 shown in FIG. 10.
[History information receiving unit 110]
FIG. 14 shows the processing flow of the history information receiving unit 110.

(Receiving history data)
The history information receiving unit 110 receives, from the history information transfer unit 350 of the content server 300 via the communication network, the client terminal ID (or user ID), the list browsing content list, the detailed browsing content list, the browsing time information, and terminal context information paired with each measurement time (detection results of sensors on the terminal, such as position information, acceleration, earth-axis sensor, and thermometer readings). It then assigns a cluster ID (a unique value) and outputs the data to the learning target interest table selection processing unit 113 (A-1).

[Learning target interest table selection processing unit 113]
FIG. 15 shows the processing flow of the learning target interest table selection processing unit 113. The unit receives the cluster ID, client terminal ID (or user ID), list browsing content list, detailed browsing content list, browsing time information, and terminal context information from the history information receiving unit 110 (A-1).

(Selecting the user interest tables to learn)
First, to obtain the user's context (situation) at the time the history was received, the learning target interest table selection processing unit 113 outputs the cluster ID, client terminal ID (or user ID), browsing time information, and terminal context information to the global context / context ID setting unit 117 (A-2). It then receives from that unit the client terminal ID (or user ID) and the group of context IDs that match the user's situation and the global context at the time the history was collected (A-3). Alternatively, the match against the server's current time may be used.

Once the context IDs are available, the learning target interest table selection processing unit 113 obtains, from the analysis target context set table of the context-specific history amount database 132, the list of context set IDs whose context conditions are satisfied, together with their w values. The "other" context condition is always included as an analysis target, with its weight set to the initial value w = 1. Besides reading precomputed w values from the database, the w values can also be calculated, starting from the input context conditions, by the "relationship weight calculation" process of the divided context extraction processing unit 116.
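The selection step can be sketched roughly as below. The table layout, the `OTHER` identifier, the weight of 1.0 for a directly matched context set, and the use of `max` when a set is reached more than once are all assumptions made for this example; the specification only states that matching context set IDs and their w values are read and that "other" is always included with w = 1.

```python
from typing import Dict, Set, Tuple

OTHER = "other"  # hypothetical ID for the always-analyzed "other" condition

# context set ID -> (matching context ID list, adjacent context set ID list with w)
ContextSetTable = Dict[str, Tuple[Set[int], Dict[str, float]]]


def select_learning_targets(matched_context_ids: Set[int],
                            context_sets: ContextSetTable) -> Dict[str, float]:
    """Return {context set ID: w} for the interest tables to be updated."""
    targets: Dict[str, float] = {OTHER: 1.0}  # "other" always learns with w = 1
    for set_id, (required_ids, adjacent) in context_sets.items():
        if required_ids <= matched_context_ids:  # every condition is satisfied
            targets[set_id] = 1.0  # assumption: a direct match gets full weight
            for adj_id, w in adjacent.items():
                # assumption: keep the largest weight if reached several ways
                targets[adj_id] = max(targets.get(adj_id, 0.0), w)
    return targets
```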

The learning target interest table selection processing unit 113 outputs the client terminal ID (or user ID), the list browsing content list, the detailed browsing content list, the context set IDs of the learning target user interest tables, and the w value for each of those context set IDs (A-4).

[Global context / context ID setting unit 117]
The global context / context ID setting unit 117 is triggered by input from the learning target interest table selection processing unit 113 (A-2) or input from the usage interest table selection processing unit 124 (C-2).

(Global context collection)
When the global context / context ID setting unit 117 receives the client terminal ID (or user ID), browsing time information (time information in the case of C-2), terminal context information, and cluster ID (omitted in the case of C-2), it collects the global context. For example, if the terminal context information includes position information such as a GPS history, it collects information such as temperature, humidity, and weather corresponding to that position and time from the Internet. It may also collect information on events that are currently attracting public attention from the Internet or similar sources, or information set by the operator as needed. If the terminal context information includes position information such as a GPS history, it may likewise collect information on events near the user's position from the Internet or similar sources, or information set by the operator as needed. In addition, it may collect update information from each user's Twitter account or blog, information such as season, day of the week, and public holidays, and user profile information set in advance, such as the user's age, sex, and occupation.

(DB read)
Based on the collected context information, the global context / context ID setting unit 117 reads from the divided context / relevance definition database 131 the group of context IDs that match the time of history collection (or the server's current time). In the case of A-2, it outputs the cluster ID and that group of context IDs to the learning target interest table selection processing unit 113 (A-3). In the case of C-2, it outputs that group of context IDs to the usage interest table selection processing unit 124.

[User interest model update unit 130]
FIG. 17 shows the processing flow of the user interest model update unit 130. The unit receives from the learning target interest table selection processing unit 113 the client terminal ID (or user ID), the list browsing content list, the detailed browsing content list, the context set IDs of the learning target user interest tables, and the w value for each of those context sets (A-4).

(Appearing concept extraction)
The user interest model update unit 130 extracts from the content database 160 the concept IDs that appear in each content item in the list browsing content list and the detailed browsing content list of the cluster ID. Specifically, the "concept ID" linked to each content ID in FIG. 7 and FIG. 8 is looked up in the content table of the content database 160 in FIG. 10. The unit generates the cluster data {cluster ID, list browsing content list, detailed browsing content list}, the content ID / concept ID association list {{content ID, {associated concept ID, ...}}, ...}, and the appearing concept list {concept ID}. The "content ID / concept ID association list" is the list of concept IDs found by searching on each content ID. The "appearing concept list" enumerates the concept IDs of all concepts that appear in the content items included in the list browsing content list and the detailed browsing content list.

(Analysis parameter extraction)
The user interest model update unit 130 counts the number of appearances of each concept in the "appearing concept list", extracts the analysis parameters needed to calculate the feature score, and generates an analysis parameter list.
FIG. 18 shows an example of the data structure of the analysis parameter list. For each cluster ID, the analysis parameter list holds the total number of content items S in the list browsing content list (first total), the total number of content items a in the detailed browsing content list (second total), and the values N and n calculated for each concept ID in the appearing concept list linked to the cluster ID. N (first appearance count) is the number of content items in the list browsing content list to which the concept ID is assigned. n (second appearance count) is the number of content items in the detailed browsing content list to which the concept ID is assigned. N and n are calculated for every concept ID in the appearing concept list, including the superordinate concepts added above.

FIG. 19(a) shows a schematic diagram of the analysis parameter extraction process. It illustrates a case in which 50 (= S) content items are displayed in a list and the user views the details of 10 (= a) of them. Of the 50 listed items, 15 (= N) are articles containing the concept "baseball", and of the 10 items the user viewed in detail, 5 (= n) contain the concept "baseball".
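The counting described above can be sketched as follows. The function and type names, and the representation of the content table as a plain dictionary from content ID to concept IDs, are assumptions for illustration; the sample data reproduces the "baseball" figures from FIG. 19(a) as described in the text.

```python
from typing import Dict, List, NamedTuple, Set


class AnalysisParams(NamedTuple):
    S: int  # items in the list browsing content list
    a: int  # items in the detailed browsing content list
    N: int  # listed items carrying the concept
    n: int  # detail-viewed items carrying the concept


def analysis_parameters(list_viewed: List[int],
                        detail_viewed: List[int],
                        concepts_of: Dict[int, Set[int]]) -> Dict[int, AnalysisParams]:
    """Compute (S, a, N, n) for every concept appearing in one cluster."""
    S, a = len(list_viewed), len(detail_viewed)
    appearing: Set[int] = set()
    for content_id in list_viewed + detail_viewed:
        appearing |= concepts_of.get(content_id, set())
    params = {}
    for concept in appearing:
        N = sum(1 for c in list_viewed if concept in concepts_of.get(c, set()))
        n = sum(1 for c in detail_viewed if concept in concepts_of.get(c, set()))
        params[concept] = AnalysisParams(S, a, N, n)
    return params


# 50 listed items, 15 about baseball; 10 viewed in detail, 5 about baseball.
BASEBALL, OTHER_TOPIC = 1, 2
concepts_of = {cid: ({BASEBALL} if cid < 15 else {OTHER_TOPIC}) for cid in range(50)}
list_viewed = list(range(50))
detail_viewed = [0, 1, 2, 3, 4, 20, 21, 22, 23, 24]
params = analysis_parameters(list_viewed, detail_viewed, concepts_of)
```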

(Feature score calculation)
The user interest model update unit 130 uses the analysis parameters S, a, N, and n to calculate a feature score Z for each concept ID. FIG. 20 shows the details of the feature score calculation. In FIG. 20, i is the identifier of the concept and j is the cluster ID. H1 (first probability) is the cumulative probability that, when the list browsing content list contains S items in total, N of which contain concept i, and a items are selected at random for detailed browsing, the number of detail-viewed items containing concept i is n or more. H2 (second probability) is the cumulative probability that, under the same conditions, the number of detail-viewed items containing concept i is n or less. In this embodiment the cumulative probabilities H1 and H2 are obtained from the hypergeometric distribution, but the method is not limited to this. Other distributions that may be used include the binomial distribution and the normal distribution.

As shown in FIG. 19(b), with the analysis parameters S, N, a, and n above, the probability that 5 or more of the 10 content items viewed by the user contain the concept "baseball" is 0.12. This value of 0.12 corresponds to the cumulative probability H1.
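The two tail probabilities can be checked with a few lines of standard-library Python. This is a sketch using the hypergeometric distribution named in the text; the function names are chosen for this example only.

```python
from math import comb


def hypergeom_pmf(k: int, S: int, N: int, a: int) -> float:
    """P(exactly k of the a detail-viewed items carry the concept)."""
    if k < max(0, a - (S - N)) or k > min(a, N):
        return 0.0
    return comb(N, k) * comb(S - N, a - k) / comb(S, a)


def h1(S: int, N: int, a: int, n: int) -> float:
    """Upper tail: probability that the count is n or more."""
    return sum(hypergeom_pmf(k, S, N, a) for k in range(n, min(a, N) + 1))


def h2(S: int, N: int, a: int, n: int) -> float:
    """Lower tail: probability that the count is n or less."""
    return sum(hypergeom_pmf(k, S, N, a) for k in range(0, n + 1))


print(round(h1(50, 15, 10, 5), 2))  # 0.12, the "baseball" example
print(round(h2(50, 15, 10, 0), 2))  # 0.02, the case n = 0
```

Both printed values agree with the figures quoted from FIG. 19(b) in the text.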

As an example of using the value of H2, consider the case where n is 0 with the analysis parameters above. In this case, the probability that the number of appearances is 0 or less is calculated. Specifically, this is the value of the item at 0 on the horizontal axis in FIG. 19(b), which is 0.02.
Then, as shown in FIG. 20, the user interest model update unit 130 uses the cumulative probabilities H1 and H2 calculated above to obtain the feature score Z through the inverse of the cumulative distribution function of the standard normal distribution. As shown in FIG. 19(c), Z is obtained from the inverse of the standard normal cumulative distribution function with H1 as the cumulative probability. When H2 is used as the cumulative probability, Z is obtained by making the sign of the value returned by the inverse function negative. The user interest model update unit 130 outputs the update target concept list {cluster ID, {concept ID, feature score = Z, weight = w}, ...}. The weight w is the w value for each update target interest table ID.
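The conversion from tail probability to Z can be sketched as below. FIG. 20 is not reproduced in this text, so the exact formula is an assumption: the sketch treats H1 as an upper-tail probability, so that a concept viewed more often than chance gets a positive score, and applies the sign flip described for H2. The clamping constant `_EPS` is also an addition for this example, to keep the inverse function defined when a probability is exactly 0 or 1.

```python
from statistics import NormalDist

_EPS = 1e-12  # assumption: clamp so inv_cdf never receives exactly 0 or 1
_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def _clamp(p: float) -> float:
    return min(max(p, _EPS), 1.0 - _EPS)


def z_from_h1(h1: float) -> float:
    """Feature score when H1 (count is n or more) is used.

    Assumed form: Z = inverse_cdf(1 - H1), i.e. a small H1 gives a large positive Z.
    """
    return _STD_NORMAL.inv_cdf(1.0 - _clamp(h1))


def z_from_h2(h2: float) -> float:
    """Feature score when H2 (count is n or less) is used.

    Assumed form: same inverse function with the sign made negative.
    """
    return -_STD_NORMAL.inv_cdf(1.0 - _clamp(h2))
```

Under these assumptions, the "baseball" example (H1 of about 0.125) gives a Z of roughly +1.15, and the n = 0 example (H2 of about 0.018) gives roughly -2.1.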

(DB update)
The interest model update processing unit 130 updates the user interest score (TotalZ) of each concept ID in the "update target concept list". FIG. 21 shows the details of this processing. For each concept that appeared in the content (appearing concept), the unit uses the user interest score update formula for concept i shown in FIG. 21 to obtain the user interest score TotalZ_{in} and the values X_{i(n-1)} and Y_{i(n-1)}. Then, for the record corresponding to the context set ID in the user interest table of the interest score database 140 in FIG. 11, it updates the values (TotalZ, X, Y) stored in the columns corresponding to the concept ID and the client terminal ID (user ID).

Here, X_{i(n-1)} is, for each concept ID (represented here by the identifier i), the sum of the squares of the weights w in the past update target concept lists (up to the previous update). Similarly, Y_{i(n-1)} is, for each concept ID, the sum of the products of the weight w and the feature score Z in the past update target concept lists.

X and Y thus hold intermediate results of the user interest score (TotalZ) calculation. When saving memory and storage is the priority, the method can be implemented by keeping just three real values per concept: TotalZ, X, and Y. When saving memory and storage is not a priority, all calculated feature scores Z for each concept and each cluster are stored instead, in which case X and Y do not need to be stored.

In FIG. 21, n is an identifier indicating which iteration of the update process is being performed. The series of processes for obtaining the user interest score TotalZ is carried out per cluster ID; counting each such run as one iteration, n indicates how many times the series has been run. i is the identifier of the concept ID. Z_{in} is the Z value used in each update for concept i. The value Z_{ij} above is the Z value for each pair of list browsing content list and detailed browsing content list, and Z_{ij} ∈ Z_{in}. The weight w_{in} is the weight used in each update for concept i and is the value set in the update target concept list.
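An incremental update that keeps only TotalZ, X, and Y per concept might look like the sketch below. The update formula of FIG. 21 is not reproduced in this text, so the combination rule used here, TotalZ = Y / sqrt(X) (a weighted Z-score combination), is an assumption that is merely consistent with the stated definitions of X as the running sum of w squared and Y as the running sum of w times Z. The class and function names are illustrative.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class InterestRow:
    """One (concept, context set, user) entry of the user interest table."""
    total_z: float = 0.0
    x: float = 0.0  # running sum of w**2
    y: float = 0.0  # running sum of w * Z


def update_interest(row: InterestRow, z: float, w: float) -> InterestRow:
    """Fold one cluster's feature score Z with weight w into the row."""
    row.x += w * w
    row.y += w * z
    # Assumed combination rule (FIG. 21 is not available in this text).
    row.total_z = row.y / sqrt(row.x) if row.x > 0.0 else 0.0
    return row


row = InterestRow()
update_interest(row, z=2.0, w=1.0)  # first cluster
update_interest(row, z=2.0, w=1.0)  # second cluster with the same evidence
```

With this rule, two clusters that each give Z = 2 at full weight combine to a TotalZ of about 2.83, while a cluster applied with a small w shifts the score only slightly.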

[Context history addition processing unit 115]
FIG. 22 shows the processing flow of the context history addition processing unit 115. The unit receives from the learning target interest table selection processing unit 113 the client terminal ID (or user ID), the group of context IDs that match the time of history collection (or the current time at the server), and the cluster ID (B-1).

(DB update)
Based on the input information, the context history addition processing unit 115 updates the context-specific user history amount table of the context-specific history amount database 132. Specifically, it appends the input cluster ID to the matching cluster ID list column corresponding to the client terminal ID (or user ID) and the context ID.
The context history addition processing unit 115 outputs the client terminal ID (or user ID) and the group of context IDs that match the time of history collection (or the current time at the server) to the divided context extraction processing unit 116 (B-2).

[Divided context extraction processing unit 116]
FIG. 23 shows the processing flow of the divided context extraction processing unit 116. The unit receives from the context history addition processing unit 115 the client terminal ID (or user ID) and the group of context IDs that match the time of history collection (or the current time at the server) (B-2).

(History amount threshold determination for each combination of context conditions)
For each client terminal ID (or user ID), the divided context extraction processing unit 116 refers to the context-specific user history amount table of the context-specific history amount database 132. For each context ID that was updated, it determines whether the history amount obtained when that context is combined with other contexts exceeds a predetermined threshold, and extracts the combinations whose history amount exceeds the threshold. Because the context-specific user history amount table in FIG. 12 stores a matching cluster ID list for each context ID, the history amount for a combination of contexts is obtained by counting the cluster IDs that are associated with all of the combined context IDs.

For example, for the context combination "8:00 to 10:00" AND "near a station" AND "weekday", the combination is extracted if its history amount (such as the number of browsing sessions) exceeds the threshold of 10. If the extracted combination is not in the analysis target context set table of the context-specific history amount database 132, it is treated as a new context set and assigned a context set ID.
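The counting rule above (the number of cluster IDs shared by every context in a combination) can be sketched as follows. The table is modeled as a dictionary from context ID to a set of cluster IDs, and the enumeration of combinations of size 2 up to a hypothetical `max_size` is an assumption for this example; the specification does not state how candidate combinations are enumerated.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def history_amount(table: Dict[int, Set[int]], context_ids: Iterable[int]) -> int:
    """Count cluster IDs associated with all of the given context IDs."""
    cluster_sets = [table.get(cid, set()) for cid in context_ids]
    if not cluster_sets:
        return 0
    return len(set.intersection(*cluster_sets))


def combinations_over_threshold(table: Dict[int, Set[int]],
                                matched_ids: Iterable[int],
                                threshold: int,
                                max_size: int = 3) -> List[Tuple[int, ...]]:
    """Return context ID combinations whose history amount exceeds the threshold."""
    ids = sorted(matched_ids)
    found = []
    for size in range(2, max_size + 1):  # assumption: sizes 2..max_size
        for combo in combinations(ids, size):
            if history_amount(table, combo) > threshold:
                found.append(combo)
    return found
```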

FIG. 24 outlines the context division method. Here a two-dimensional context condition is assumed: time is divided into 12 slots of 2 hours each, and the position information ID is an area serial number obtained by dividing the whole of Japan into a grid. Suppose three context conditions already exist, namely "4:00 to 6:00" AND "location x", "18:00 to 20:00" AND "location y", and "other" (the initial context condition), and that additional logs (history amount, number of browsing sessions, and so on) occur for "6:00 to 8:00" AND "location x". When the logs for "6:00 to 8:00" AND "location x" reach the threshold of 5 or more, "6:00 to 8:00" AND "location x" is split off as its own context condition. As a result, there are four context conditions: "4:00 to 6:00" AND "location x", "18:00 to 20:00" AND "location y", "6:00 to 8:00" AND "location x", and "other" (the initial context condition). FIG. 24 uses a two-dimensional vector to keep the explanation simple, but in practice the processing uses a multidimensional vector (time, location, temperature, weather, day of the week, season, and so on).

(Related context set ID extraction)
When the history amount threshold determination yields a new context set ID and its context IDs, the divided context extraction processing unit 116 refers to the context relevance table of the divided context / relevance definition database 131, extracts the context IDs related to each context ID in the new context set, and then extracts the context set IDs that contain the extracted context IDs. In extracting the related context IDs, the context relevance table is regarded as a graph, and multiple hops are made starting from the context ID in question. The traversal is bounded either by a hop count, such as 2 or 3, or by a threshold on the distance to the related context IDs accumulated over the hops.
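A bounded multi-hop traversal of the context relevance table could be sketched as below. The graph is modeled as a nested dictionary (context ID to related context ID to distance), both bounds are exposed as parameters, and keeping the smallest accumulated distance when a context is reachable by several paths is an assumption made for this example.

```python
from typing import Dict

Graph = Dict[int, Dict[int, float]]  # context ID -> {related context ID: distance}


def related_contexts(graph: Graph, start: int, max_hops: int = 2,
                     max_distance: float = float("inf")) -> Dict[int, float]:
    """Return {context ID: accumulated distance} reachable within the bounds."""
    best = {start: 0.0}
    frontier = {start: 0.0}
    for _ in range(max_hops):
        next_frontier: Dict[int, float] = {}
        for node, dist in frontier.items():
            for neighbor, step in graph.get(node, {}).items():
                total = dist + step
                if total > max_distance:
                    continue  # accumulated distance exceeds the threshold
                if total < best.get(neighbor, float("inf")):
                    best[neighbor] = total  # assumption: keep the shortest path
                    next_frontier[neighbor] = total
        frontier = next_frontier
    del best[start]
    return best
```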

The distance between each extracted context set ID and the new context set ID is calculated from the per-hop accumulated distances to the related context IDs obtained above, taken over all context IDs in the context set. It may be calculated as their sum, as their arithmetic mean, or, treating them as a vector, as a cosine similarity or the reciprocal of the Pearson distance.

(Relationship weight calculation)
When the related context set ID extraction yields the new context set ID, its context IDs, the group of related context set IDs, and the distance to each of those context set IDs, the divided context extraction processing unit 116 calculates the mutual relationship weights between the new context set ID and each related context set ID from the set-to-set distances calculated in "related context set ID extraction". The distance is converted into a weight by normalizing it to a value from 0 to 1. Normalization methods include taking the distance as the Z value of a normal distribution with mean 0 and an operator-set variance and using the resulting probability value or cumulative probability value. Other methods include using a logistic function or dividing by a sufficiently large number.
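Two of the normalization options mentioned above can be sketched as follows. The specification does not give the exact functional forms, so both functions, their parameter names, and the choice to make the weight decrease as distance grows (in line with a larger distance meaning a weaker relationship) are assumptions for illustration.

```python
from math import exp


def weight_logistic(distance: float, steepness: float = 1.0) -> float:
    """Logistic-style weight: 1.0 at distance 0, approaching 0 as distance grows.

    Computed as 2 * e / (1 + e) with e = exp(-steepness * distance), which is
    a rescaled logistic function and avoids overflow for large distances.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    e = exp(-steepness * distance)
    return 2.0 * e / (1.0 + e)


def weight_scaled(distance: float, large: float = 100.0) -> float:
    """Divide by a sufficiently large number and invert, clipped to [0, 1]."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return max(0.0, 1.0 - distance / large)
```

Either function maps a set-to-set distance into the 0 to 1 range expected for the w values stored in the adjacent context set ID list.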

(DB update)
The divided context extraction processing unit 116 updates the analysis target context set table of the context-specific history amount database 132 based on the new context set ID, its context IDs, and the relationship weights calculated above.
FIG. 25 shows a specific example of the weight calculation process based on context conditions. It illustrates how the fitness (weight) is calculated for the context condition "6:00 to 8:00" AND "place x" during learning. In CASE 1, data for "2:00 to 4:00" AND "place x" arrives. Its distance from the context condition "6:00 to 8:00" AND "place x" is 2, so the learning process is executed on the added history cluster with the result of the weight function w = f(2) as the weight. In CASE 2, data for "10:00 to 12:00" AND "place 4" arrives. Its distance from the context condition "6:00 to 8:00" AND "place x" is 3, so the learning process is executed on the added history cluster with the result of w = f(3) as the weight. In CASE 3, data for "6:00 to 8:00" AND "place y" arrives. Its distance from the context condition "6:00 to 8:00" AND "place x" exceeds the threshold (3 in this example), so the learning process is not executed.
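The three cases of FIG. 25 reduce to a threshold check followed by a call to the weight function. The sketch below assumes a hypothetical weight function `f` (the text does not define it) and a hypothetical distance of 4 for CASE 3, for which the text only says that the threshold is exceeded.

```python
def learning_weight(distance, threshold, f):
    """Return the weight to use for learning, or None when the distance
    exceeds the threshold and learning is skipped."""
    if distance > threshold:
        return None
    return f(distance)


def f(distance):
    # Hypothetical weight function, chosen only so the example runs.
    return 1.0 / (1.0 + distance)


THRESHOLD = 3  # threshold value used in the FIG. 25 example

case1 = learning_weight(2, THRESHOLD, f)  # CASE 1: distance 2, learn with w = f(2)
case2 = learning_weight(3, THRESHOLD, f)  # CASE 2: distance 3, learn with w = f(3)
case3 = learning_weight(4, THRESHOLD, f)  # CASE 3: assumed distance 4, no learning
```

Note that a distance equal to the threshold still triggers learning, which matches CASE 2 in the text.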

In short, extracting contexts automatically optimizes the context definition, that is, the choice of which situations (or combinations of situations) to analyze, and saves the labor of defining it. Tying the context definition to the amount of history avoids splitting off contexts that have little corresponding history, so more accurate information recommendation can be achieved with minimal computer resources.

[Content Request Receiving Unit 121]
FIG. 26 shows the processing flow of the content evaluation processing unit 170. The content request receiving unit 121 receives content request data such as that shown in FIG. 5, including the client terminal ID (or user ID), from the content request transfer unit 360.

(Receiving content request history data)
The content request receiving unit 121 receives, from the content request transfer unit 360 of the content server 300 via the communication network, the client terminal ID (or user ID), time information, and terminal context information (position information, acceleration, ground axis sensor, thermometer, and the measurement times and results of any other sensors on the terminal). It then outputs the client terminal ID (or user ID), time information, and terminal context information to the usage interest table selection processing unit 124 (C-1). This information may also be forwarded to the history information receiving unit 110 and used for learning. Alternatively, the history of content requests may be collected on the client terminal side, reported from the content request transmission unit 240 to the history information collection unit 210, and transmitted to the interest analysis apparatus 100.

[Usage Interest Table Selection Processing Unit 124]
FIG. 27 shows the processing flow of the usage interest table selection processing unit 124. The unit receives the client terminal ID (or user ID), time information, and terminal context information from the content request receiving unit 121 (C-1).

(Reading the matching context IDs)
The usage interest table selection processing unit 124 outputs the client terminal ID (or user ID), time information, and terminal context information to the global context/context ID setting unit 117 (C-2), and receives from it the group of context IDs that match the client terminal ID (or user ID) and the history collection time (or the current time at the server) (C-3).

(Determining the context set ID to use (user interest table selection))
Through the matching context ID read process above, the group of context IDs matching the client terminal ID (or user ID) and the history collection time (or the current time at the server) is input from the global context/context ID setting unit 117. Once all the context IDs are available, the usage interest table selection processing unit 124 obtains the context set IDs whose context conditions match from the analysis target context set table of the context-specific history amount database 132. If more than one context set ID is obtained, the context set with the largest number of context conditions is selected. The unit then outputs the client terminal ID (or user ID) and the context set ID corresponding to the user interest table to be used (C-4).
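The selection rule above (take the matching context set with the most context conditions) can be sketched as follows. The dictionary layout standing in for the analysis target context set table, the function name `select_context_set`, and the reading of "match" as "every condition in the set is among the currently matching context IDs" are assumptions for illustration.

```python
def select_context_set(context_sets, active_context_ids):
    """Pick the context set ID to use for the current request.

    `context_sets` is a hypothetical mapping {context_set_id: [context_id, ...]}.
    A set is treated as matching when all of its context conditions are among
    the active context IDs; among matching sets, the one with the most
    conditions (the most specific one) is returned.
    """
    active = set(active_context_ids)
    matching = [
        (set_id, conditions)
        for set_id, conditions in context_sets.items()
        if set(conditions) <= active
    ]
    if not matching:
        return None
    return max(matching, key=lambda item: len(item[1]))[0]


# Hypothetical table contents and request context.
context_sets = {
    "S1": ["t6-8"],
    "S2": ["t6-8", "place_x"],
    "S3": ["t6-8", "place_y"],
}
chosen = select_context_set(context_sets, ["t6-8", "place_x", "weekday"])
```

Here `S1` and `S2` both match, and `S2` is chosen because it has two conditions. How ties between equally specific sets are broken is not stated in the text; this sketch simply keeps the first one encountered.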

[Content Evaluation Processing Unit 170]
FIG. 28 shows the processing flow of the content evaluation processing unit 170. The unit receives a presentation content list in the format shown in FIG. 7, read from the content table of the content database 160, and, from the usage interest table selection processing unit 124, the client terminal ID (or user ID) and the context set ID that determines which records of the user interest table to use.

(Score evaluation)
The content evaluation processing unit 170 evaluates the content in the presentation content list using the records of the user interest table that match the context set ID to be used. It calculates an evaluation score for each content item being evaluated and generates a content score list such as that shown in FIG. 29. The content score list contains the content ID, evaluation score, content body, and content registration time.

FIG. 30 shows an example of how the evaluation score is calculated. With the content evaluation formula shown in FIG. 30, the evaluation score EntityZ_x for evaluated content x can be calculated from the user interest score TotalZ_i of concept i, the relevance W_i between content x and concept i (or the importance of concept i), and the set p of concept IDs that appear in content x. The concept identifier i corresponds to a concept ID in the set p.

The user interest scores (TotalZ) used in the calculation of FIG. 30 are read, for the concept IDs related to each content item, from the user interest table (FIG. 11) of the interest score database 140 using the client terminal ID (or user ID). In FIG. 30, when content 1, in which concepts K, B, and D appear, is the evaluated content, the evaluation score is calculated as EntityZ_x = 0.18 from the TotalZ and relevance W of concepts K, B, and D. When content 2, in which only concept B appears, is the evaluated content, the evaluation score is calculated as EntityZ_x = -0.3 from the TotalZ and W of concept B. Content 1, which has the larger EntityZ_x, is displayed with priority.

The evaluation score EntityZ_x can also be obtained by the methods of Modifications 1 to 3 below.
In Modification 1, EntityZ_x = MAX(TotalZ_i * W_i), where MAX(TotalZ_i * W_i) is a function that returns the maximum value of TotalZ_i * W_i over i ∈ p.

In Modification 2, when the value of MAX(TotalZ_i * W_i) exceeds a threshold, EntityZ_x is the return value of MAX(TotalZ_i * W_i), where MAX(TotalZ_i * W_i) is a function that returns the maximum value of TotalZ_i * W_i over i ∈ p. When the threshold is not exceeded, EntityZ_x is the result of the content evaluation formula of FIG. 30. The threshold is a value set by the service operator.

In Modification 3, only the i ∈ p for which TotalZ_i is positive are taken, and the value obtained by combining them with the content evaluation formula of FIG. 30 is used as EntityZ_x.
(Sorting content by score)
The content evaluation processing unit 170 sorts the content in the content score list in descending order of evaluation score EntityZ_x and outputs the sorted content score list to the sorted content score list transmission unit 180.
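Modifications 1 to 3 and the final sort can be sketched as below. The content evaluation formula of FIG. 30 is not reproduced in the text, so it is passed in as a `combine` callable; the `stand_in_combine` used in the example is a plain weighted sum chosen only so the code runs, and is not the FIG. 30 formula. The function names and the sample scores for `total_z` and `w` are likewise hypothetical.

```python
def modification1(total_z, w, p):
    """EntityZ_x = MAX(TotalZ_i * W_i) over the concept IDs i in p."""
    return max(total_z[i] * w[i] for i in p)


def modification2(total_z, w, p, threshold, combine):
    """Use the maximum when it exceeds the operator-set threshold,
    otherwise fall back to the base evaluation formula (`combine`)."""
    best = modification1(total_z, w, p)
    return best if best > threshold else combine(total_z, w, p)


def modification3(total_z, w, p, combine):
    """Apply the base formula only to concepts whose TotalZ_i is positive."""
    positive = [i for i in p if total_z[i] > 0]
    return combine(total_z, w, positive)


def sort_by_score(score_list):
    """Sort content score rows in descending order of evaluation score."""
    return sorted(score_list, key=lambda row: row["score"], reverse=True)


def stand_in_combine(total_z, w, p):
    # Placeholder for the FIG. 30 formula (an assumption, not the real formula).
    return sum(total_z[i] * w[i] for i in p)


# Hypothetical interest scores and relevance values for concepts K, B, D.
total_z = {"K": 1.0, "B": -0.5, "D": 0.25}
w = {"K": 0.5, "B": 1.0, "D": 1.0}
p = ["K", "B", "D"]

ranked = sort_by_score([
    {"content_id": "content2", "score": -0.3},
    {"content_id": "content1", "score": 0.18},
])
```

The two rows passed to `sort_by_score` reuse the EntityZ_x values quoted in the text for content 1 and content 2, so content 1 comes first in `ranked`.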

[Sorted Content Score List Transmission Unit 180]
The sorted content score list transmission unit 180 transmits the sorted content score list input from the content evaluation processing unit 170, together with the client terminal ID (or user ID), to the content server 300 via the communication network.

As described above, according to this embodiment, a list of the user's selection candidates is defined and the number of times each concept appears in the content selected from that list is analyzed. This takes the rarity of each concept's appearance into account and makes use of the history features of concepts that were not selected from the list, so the user's interests can be estimated with high accuracy.

In addition, context conditions (conditions under which trends are analyzed separately) can be extracted automatically from the way history data accumulates, and the context fitness (weight) used during learning can be determined automatically for the extracted context conditions.
When there are many context dimensions that could be analyzed (time, place, temperature, day of the week, season, and so on), computing every combination requires an enormous amount of calculation, and it is very difficult for the operator to set a weight for every combination. With the method of this embodiment, the automatic context condition determination function reduces processing and operating costs, and analysis under appropriate context conditions that take diverse contexts into account improves the accuracy of information recommendation.

Furthermore, calculating content evaluation scores with a user interest table that matches the user's situation at the time of the content request makes it possible to recommend content that matches the user's interests accurately.
The present invention is not limited to the embodiment exactly as described above; at the implementation stage, the constituent elements can be modified without departing from the gist of the invention. Various inventions can be formed by appropriately combining the constituent elements disclosed in the embodiment. For example, some constituent elements may be removed from the full set shown in the embodiment, and constituent elements from different embodiments may be combined as appropriate.

DESCRIPTION OF SYMBOLS
100 ... interest analysis apparatus, 200 ... client terminal, 300 ... content server, 110 ... history information receiving unit, 113 ... learning target interest table selection processing unit, 115 ... context history appending processing unit, 116 ... divided context extraction processing unit, 117 ... global context/context ID setting unit, 121 ... content request receiving unit, 124 ... usage interest table selection processing unit, 130 ... interest model update processing unit, 131 ... context/relevance definition database, 132 ... context-specific history amount database, 140 ... interest score database, 150 ... presentation content list receiving unit, 160 ... content database, 170 ... content evaluation processing unit, 180 ... sorted content score list transmission unit, 210 ... history information transmission unit, 220 ... history collection unit, 230 ... content presentation unit, 240 ... content request transmission unit, 250 ... terminal information collection unit, 310 ... content transmission processing unit, 320 ... sorted presentation content list receiving unit, 330 ... presentation content list transmission unit, 340 ... presentation content list input unit, 350 ... history information transfer unit, 360 ... content request transfer unit.

Claims

A method of analyzing a user's interest from a browsing history of content including concepts, using an interest model having a user interest score for each of a plurality of concepts, the method comprising:
clustering a first content list, in which a plurality of contents are browsed as a list, and a second content list, in which content bodies are browsed from the first content list;
for each cluster, where the total number of contents in the first content list is a first total number, the number of contents in the first content list in which the concept appears is a first appearance number, the total number of contents in the second content list is a second total number, and the number of contents in the second content list in which the concept appears is a second appearance number, calculating, under the condition of the first total number, the first appearance number, and the second total number, a first probability that the number of contents in the second content list in which the concept appears is greater than or equal to the second appearance number and a second probability that this number is less than or equal to the second appearance number, and calculating a feature score by an inverse function of a cumulative distribution function of a standard normal distribution based on the first probability and the second probability;
collecting context conditions relating to browsing of the content;
dividing the interest model into a table for each combination based on combinations of the collected context conditions;
extracting an update target table from the tables for each combination based on the relevance between the context conditions included in the combination;
calculating a weight for the update target table based on the relevance; and
updating the user interest score for the concept in the update target table using the feature score and the weight.

The interest analysis method according to claim 1, wherein the interest model is divided into a table for each combination based on an amount of browsing history that matches the context condition.

The interest analysis method according to claim 1, further comprising a step of collecting context conditions at the time of requesting content and calculating an evaluation score for the content using a table that matches the context condition.

An apparatus for analyzing a user's interest from a browsing history of content including concepts, using an interest model having a user interest score for each of a plurality of concepts, the apparatus comprising:
means for clustering a first content list, in which a plurality of contents are browsed as a list, and a second content list, in which content bodies are browsed from the first content list;
means for calculating, for each cluster, where the total number of contents in the first content list is a first total number, the number of contents in the first content list in which the concept appears is a first appearance number, the total number of contents in the second content list is a second total number, and the number of contents in the second content list in which the concept appears is a second appearance number, under the condition of the first total number, the first appearance number, and the second total number, a first probability that the number of contents in the second content list in which the concept appears is greater than or equal to the second appearance number and a second probability that this number is less than or equal to the second appearance number, and for calculating a feature score by an inverse function of a cumulative distribution function of a standard normal distribution based on the first probability and the second probability;
means for collecting context conditions relating to browsing of the content;
means for dividing the interest model into a table for each combination based on combinations of the collected context conditions;
means for extracting an update target table from the tables for each combination based on the relevance between the context conditions included in the combination;
means for calculating a weight for the update target table based on the relevance; and
means for updating the user interest score for the concept in the update target table using the feature score and the weight.

The interest analysis apparatus according to claim 4, wherein the interest model is divided into a table for each combination based on an amount of browsing history that matches the context condition.

6. The interest analysis apparatus according to claim 4, further comprising means for collecting context conditions at the time of requesting content and calculating an evaluation score for the content using a table that matches the context condition.

An interest analysis device program that causes a computer to function as each means constituting the interest analysis device according to any one of claims 4 to 6.