JP2008146610A

JP2008146610A - Method of recommendation to user on network, recommendation server, and program

Info

Publication number: JP2008146610A
Application number: JP2006336428A
Authority: JP
Inventors: Dorje Brody; ブローディドージェ; Meister Bernard; マイスターベルナルド; Julian Brody; ブローディジュリアン
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2006-12-13
Filing date: 2006-12-13
Publication date: 2008-06-26
Anticipated expiration: 2026-12-13
Also published as: JP4962950B2

Abstract

PROBLEM TO BE SOLVED: To provide a method for adjusting, for each user, the range over which recommendations are made to users on a network.

SOLUTION: A server 10 receives user characteristic data from the terminals of a plurality of users via a communication network and, according to the received data, maps the characteristics of the users to a probability space. In that probability space, the server calculates spherical distances between the users and, based on those distances, calculates attribute overlap index data indicating the degree to which the attributes of a specific user overlap with those of the other users. For the calculated attribute overlap index data, the server calculates a nonlinear average that depends on a parameter indicating the user's degree of risk avoidance, thereby creating a recommendation list for making recommendations to the specific user.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a method of making recommendations to users on a network, a recommendation server, and a program.

In recent years, as the Internet has spread through society, users of the Web and other network environments no longer merely browse information with a Web browser or other technical means. They also purchase products, exchange information with others who have similar attributes to form communities, and otherwise engage in activities similar to those of the real world.

In the Web and other network environments, it is relatively easy to record user activities such as browsing information, purchasing products, and exchanging information within a community. Specifically, for example, by accepting a user's membership registration on the Web, a server can store the basic attributes of that user, and can automatically collect and store, as a log (activity record), the registered user's activities on the Web, such as information browsing and product purchases.

Accordingly, in the Web and other network environments, services are provided that, based on registered basic attributes, logs (activity records), and the like, recommend products and services likely to interest a user (recommendation), send advertisements, or introduce users who appear to have similar attributes (social networking service; SNS).

For example, according to the technique described in Patent Document 1, a Web server can analyze a user's interests based on the user's purchase records and make product recommendations based on the result of that analysis.

More specifically, the Web server identifies popular products based on the past purchase records of all users, tallies how often users who purchased a popular product also purchased other products, and analyzes the purchase correlation between the popular products and the other products. For each popular product, it then recommends the other products whose purchase frequency is strongly correlated with it to users who have purchased that product.

Further, for example, according to the technique of Non-Patent Document 1, a Web server can make product recommendations based on users' purchase records and product evaluation records (ratings).

More specifically, the Web server generates a similar-product table, samples users who have purchased a popular product, and, based on those users' purchase records and product evaluation records (ratings), recommends products similar to those the users purchased or rated as strongly correlated products.

Patent Document 1: US Pat. No. 6,912,505
Non-Patent Document 1: Greg Linden, Brent Smith, and Jeremy York, "Amazon.com Recommendations: Item-to-Item Collaborative Filtering," Amazon.com, January/February 2003, IEEE Computer Society

However, the inventions described in Patent Document 1 and Non-Patent Document 1 raise several problems, because in making recommendations they take an approach that focuses first on products rather than on users, and because they rely on the "correlation" between a popular product and the other products purchased by the users who bought it.

The first problem is that, although correlation is generally considered able to take any value between -1 and 1, there are in practice many distributions for which it takes values only in a narrower range, for example only from -0.2 to +0.6. In such a situation, if a rule such as "users' attributes are judged to be close when the correlation is +0.7" is adopted, the degree of overlap of user attributes cannot be judged even when correlation is used.

The second problem is that, in correlation-based analysis, a negative correlation between users is generally discarded. If the correlation between two users has a large negative value, however, those users are clearly dependent on each other, so there is a limit to how well correlation can capture dependence.

The third problem is that correlation does not contain global information about the probability distributions that represent user attributes. Specifically, even when the attributes of two users overlap to a relatively large degree, correlation does not necessarily detect it. When the attributes of the users do not overlap at all, the correlation is zero and a correct judgment can be made; conversely, however, there are cases in which the correlation is zero even though the overlap of attributes is relatively large, so equating no correlation with no relationship leads to a wrong judgment. In particular, because correlation depends only on second-order moments, information about the tails of the probability distributions of users' behavioral characteristics does not appear in the result of a correlation-based analysis. For example, suppose the probability distributions of how often two users A and B purchase a product, taken as functions of the product's price, have a power-law tail for A and a Gaussian tail for B. These tails differ decisively: B is unlikely to buy an expensive product even if it is recommended, whereas A may buy it if A likes it. Because correlation does not depend on higher-order moments, it cannot provide the information needed to recognize such a difference.

The fourth problem is that correlation can express a linear relationship but not a nonlinear one. Specifically, when two random variables X and Y have the linear relationship Y = aX + b, correlation-based analysis gives meaningful information. When the two variables have a periodic relationship such as Y = sin(X) (for example, the relationship between the number of accesses to a network, Y, and time, X), correlation-based analysis cannot give meaningful information about the dependence: X and Y are strongly dependent, yet the correlation is zero.
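The contrast between the linear and the periodic case can be checked numerically. The following is a minimal sketch, not part of the patent: the helper `pearson`, the sampling window of fifty sine periods, and the coefficients 3.0 and 1.0 are all illustrative assumptions. Over a finite window the correlation of X and sin(X) is not exactly zero, only close to it, and its exact value depends on the window chosen.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# X sampled evenly over fifty full periods of the sine (an illustrative choice).
n = 50_000
xs = [100 * math.pi * i / n for i in range(n)]

linear = [3.0 * x + 1.0 for x in xs]    # Y = aX + b with a = 3, b = 1
periodic = [math.sin(x) for x in xs]    # Y = sin(X)

r_linear = pearson(xs, linear)      # essentially 1: correlation sees the dependence
r_periodic = pearson(xs, periodic)  # close to 0 although Y is a function of X

print(f"linear:   {r_linear:.4f}")
print(f"periodic: {r_periodic:.4f}")
```

In both cases Y is completely determined by X, yet only the linear case is visible to the correlation coefficient.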

Furthermore, the following problem exists whether or not correlation is used. The fifth problem is that conventional recommendation is product-based (product-oriented). For example, if a middle-aged man happens to buy a picture book for his five-year-old daughter, recommending only picture books for five-year-olds to him is of little use. It is preferable to recommend items purchased by people with similar interests.

It is therefore desirable to provide a method that is user-oriented rather than product-oriented and that, without using correlation, adjusts for each user the range over which recommendations are made to users on a network.

Accordingly, an object of the present invention is to provide a method, a recommendation server, and a program that adjust, for each user, the range over which recommendations are made to users on a network.

To achieve the above object, the present inventors conducted extensive research and completed the present invention. Specifically, the present invention provides the following.

(1) A method in which a server (server 10) makes a recommendation to a user of a terminal (terminal 20) connectable via a communication network (communication network 30), the method comprising:
receiving, via the communication network, from the terminals of a plurality of the users, user characteristic data including at least basic attribute data or log data of the plurality of users;
mapping the characteristics of the plurality of users to a probability space based on the received user characteristic data;
calculating a spherical distance between the respective users in the mapped probability space;
calculating, based on the calculated spherical distances, attribute overlap index data representing the degree of overlap of attributes between a specific user among the plurality of users and the other users;
generating a recommendation list for making a recommendation to the specific user by calculating, for the calculated attribute overlap index data, a nonlinear average that depends on a parameter representing the user's degree of risk avoidance; and
transmitting, based on the generated recommendation list, data for making a recommendation to the terminal of the specific user.

With this configuration of the present invention, the server receives, via the communication network, from the terminals of the plurality of users, user characteristic data including at least basic attribute data or log data of those users; maps the characteristics of the users to a probability space based on the received data; calculates the spherical distance between the respective users in that probability space; calculates, based on the calculated spherical distances, attribute overlap index data representing the degree of overlap of attributes between a specific user and the other users; generates a recommendation list for the specific user by calculating, for the attribute overlap index data, a nonlinear average that depends on a parameter representing the user's degree of risk avoidance; and, based on the generated list, can transmit data for making a recommendation to the terminal of the specific user.

Accordingly, the server calculates, based on the calculated spherical distances, the attribute overlap index data representing the degree of overlap of attributes between the specific user and the other users, generates a recommendation list for the specific user based on that data, and can transmit data for making a recommendation to the terminal of the specific user.

As a result, by calculating the attribute overlap index data, the server can make recommendations personalized for each user to whom recommendations are made.

The spherical distance used here approaches zero as the overlap between the distributions representing the users' attributes increases, and approaches its maximum value π/2 as the overlap decreases. That is, the spherical distance can take any value from 0 to π/2 and, unlike "correlation", its attainable range is not narrowed by the distributions of the user attributes. In this way, the principle of the present invention solves the first problem.
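The range property described above can be illustrated with a small numerical sketch. It assumes that the spherical distance is the standard Bhattacharyya angle for discrete distributions, that is, the arccosine of the sum of sqrt(p_i q_i); the patent's own equation is not reproduced in this text, so this definition, the function name `spherical_distance`, and the sample distributions are assumptions made for illustration only.

```python
import math

def spherical_distance(p, q):
    """Bhattacharyya angle between two discrete probability distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Bhattacharyya coefficient: inner product of the square-root vectors.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp against floating-point overshoot before taking the arccosine.
    return math.acos(min(1.0, max(0.0, bc)))

# Hypothetical user distributions over a small set of item categories.
same = spherical_distance([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])      # full overlap
disjoint = spherical_distance([1.0, 0.0], [0.0, 1.0])             # no overlap
partial = spherical_distance([0.7, 0.2, 0.1], [0.1, 0.2, 0.7])    # some overlap

print(same, partial, disjoint)
```

Identical distributions give a distance of essentially zero, distributions with no common support give π/2, and everything else falls in between, whatever the shape of the distributions.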

The value of the spherical distance reflects all of the input user characteristic data; no values are discarded, as negative values are with "correlation". In this way, the principle of the present invention solves the second problem.

In addition, because the spherical distance represents the degree of overlap of user attributes at a global level, essentially all moments contribute to it, and it can be said to contain comprehensive information. The spherical distance therefore stands in contrast to "correlation", which, in the example above where the purchase-frequency distributions of users A and B have a power-law tail and a Gaussian tail respectively with respect to price, cannot adequately represent the difference between the tails. In this way, the principle of the present invention solves the third problem.

Furthermore, under the concept of distance, the distance is determined independently of the form of dependence between the random variables that represent the usage characteristics of different users. The spherical distance can therefore express the degree of overlap of user attributes whether or not the dependence is linear, and it is free of the restriction, which applies to "correlation", that only linear dependence can be adequately expressed. In this way, the principle of the present invention solves the fourth problem.

The distance between users is determined from the products the users have purchased or searched for, but once that distance has been determined, recommendations are decided from the viewpoint of users rather than products. In this way, the principle of the present invention solves the fifth problem.

(2) The method according to (1), further comprising the server accepting a setting input of the parameter representing the user's degree of risk avoidance.

With this configuration of the present invention, the server can accept a setting input of the parameter representing the user's degree of risk avoidance.

Accordingly, by calculating the nonlinear average of the attribute overlap index data according to the degree of risk avoidance represented by the parameter, the server can make recommendations personalized for each user to whom recommendations are made.

As a result, when making a recommendation to a specific user, the server can adjust the range of the recommendation on a scale that represents the degree of overlap of attributes between the specific user and the other users.
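How a single parameter can move a nonlinear average across a scale of distances can be sketched as follows. The text does not give the patent's averaging formula, so this example substitutes the generalized (power) mean as a stand-in; the function `power_mean`, the use of `alpha` as the risk parameter, and the distance values are all assumptions for illustration, not the patented method.

```python
import math

def power_mean(values, alpha):
    """Generalized (power) mean of positive values; alpha = 0 gives the geometric mean."""
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    n = len(values)
    if alpha == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** alpha for v in values) / n) ** (1.0 / alpha)

# Hypothetical spherical distances from one user to three other users.
distances = [0.2, 0.6, 1.2]

low = power_mean(distances, -20)   # pulled toward the smallest distance
mid = power_mean(distances, 1)     # ordinary arithmetic mean
high = power_mean(distances, 20)   # pulled toward the largest distance

print(f"{low:.3f} {mid:.3f} {high:.3f}")
```

In this stand-in, varying `alpha` slides the average between the nearest and the farthest user, which is one way a per-user setting could widen or narrow the set of other users considered close enough to draw recommendations from.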

(3) The method according to (1) or (2), wherein, in the step of calculating the spherical distance, the server calculates the Bhattacharyya spherical distance as the spherical distance.

With this configuration of the present invention, the server can calculate the Bhattacharyya spherical distance as the spherical distance.

(4) The method according to (3), wherein the server calculates the Bhattacharyya spherical distance by the following equation:

[Equation not reproduced in the source text]

With this configuration of the present invention, the server can calculate the Bhattacharyya spherical distance by the equation given in (4).

(5) The method according to (4), wherein, in the step of calculating the attribute overlap index data, the server calculates the Bhattacharyya spherical distance as the attribute overlap index data.

With this configuration of the present invention, the server can calculate the Bhattacharyya spherical distance as the attribute overlap index data.

(6) The method according to (4), wherein, in the step of calculating the attribute overlap index data, the server calculates, based on the user characteristic data, a probability distribution indicating the behavior of the other users, and calculates as the attribute overlap index data that distribution multiplied by a weight calculated from the Bhattacharyya spherical distance.

With this configuration of the present invention, the server can calculate, based on the user characteristic data, a probability distribution indicating the behavior of the other users, and can calculate as the attribute overlap index data that distribution multiplied by a weight calculated from the Bhattacharyya spherical distance.

Accordingly, when making a recommendation to the specific user, the server calculates, as the attribute overlap index data, the probability distribution indicating the behavior of the other users multiplied by the weight calculated from the Bhattacharyya spherical distance, and can adjust the range of the recommendation on a scale that represents the degree of overlap of attributes, indicated by the attribute overlap index data, between the specific user and the other users.

(7) The method according to (6), wherein the server calculates the weight by

[Equation not reproduced in the source text]

and calculates the attribute overlap index data by

[Equation not reproduced in the source text]

With this configuration of the present invention, the server can calculate the weight and the attribute overlap index data by the equations given in (7).

(8) The method according to any one of (3) to (7), further comprising:
the server calculating, based on the calculated Bhattacharyya spherical distances, the relative distances from each user, taken as the center, to all the other users; and
classifying the plurality of users, based on the calculated relative distances, into a plurality of groups of users whose relative distances are close,
wherein, in the step of calculating the attribute overlap index data, the server calculates the attribute overlap index data for the other users classified into the same group as the specific user.

With this configuration of the present invention, the server calculates, based on the calculated Bhattacharyya spherical distances, the relative distances from each user, taken as the center, to all the other users; classifies the plurality of users, based on those relative distances, into a plurality of groups of users whose relative distances are close; and, in the step of calculating the attribute overlap index data, can calculate that data for the other users classified into the same group as the specific user.

Accordingly, the server can calculate the attribute overlap index data for the other users classified into the same group as the specific user, generate a recommendation list, and transmit data for making a recommendation to the terminal of the specific user.

Here, in the probability space representing the characteristics of the users, the relative distances are calculated individually with each user as the center, so they can express, from the standpoint of the user at the center, the degree of overlap of attributes with each of the other users.

Therefore, when calculating the attribute overlap index data, the server can restrict the calculation to the other users who are classified into the same group as the specific user and whose attributes overlap with it to a higher degree, generate the recommendation list, and transmit data for making a recommendation to the terminal of the specific user.

As a result, the server may be able to further improve the accuracy of the recommendation list.
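The grouping step can be pictured with a short sketch. The relative-distance equation is not reproduced in this text, so the example below works directly on raw pairwise distances and links any two users whose distance is at most a cutoff; the function `group_users`, the `threshold` parameter, and the distance matrix are hypothetical choices for illustration rather than the classification rule of the patent.

```python
def group_users(dist, threshold):
    """Group user indices so that users linked by a distance <= threshold share a group."""
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical symmetric matrix of spherical distances between four users.
dist = [
    [0.0, 0.3, 1.2, 1.3],
    [0.3, 0.0, 1.1, 1.4],
    [1.2, 1.1, 0.0, 0.2],
    [1.3, 1.4, 0.2, 0.0],
]
groups = group_users(dist, threshold=0.5)
print(groups)
```

With these sample values, users 0 and 1 end up in one group and users 2 and 3 in another, so the overlap index for user 0 would be computed only against user 1.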

(9) The method according to (8), wherein the server calculates the relative distance by

[Equation not reproduced in the source text]

With this configuration of the present invention, the server can calculate the relative distance by the equation given in (9).

Accordingly, the server can calculate the attribute overlap index data for the other users classified into the same group as the specific user, generate a recommendation list, and transmit data for making a recommendation to the terminal of the specific user.

As a result, the server can make recommendations based only on the other users classified into the same group as the specific user.

(10) A server that makes a recommendation to a user of a terminal connectable via a communication network, the server comprising:
means for receiving, via the communication network, from the terminals of a plurality of the users, user characteristic data including at least basic attribute data or log data of the plurality of users;
means for mapping the characteristics of the plurality of users to a probability space based on the received user characteristic data;
means for calculating a spherical distance between the respective users in the mapped probability space;
means for calculating, based on the calculated spherical distances, attribute overlap index data representing the degree of overlap of attributes between a specific user among the plurality of users and the other users;
means for generating a recommendation list for making a recommendation to the specific user by calculating, for the calculated attribute overlap index data, a nonlinear average that depends on a parameter representing the user's degree of risk avoidance; and
means for transmitting, based on the generated recommendation list, data for making a recommendation to the terminal of the specific user.

Accordingly, operating this server can be expected to provide the same effects as (1).

(11) A program that causes a server to make a recommendation to a user of a terminal connectable via a communication network, the program causing the server to execute:
receiving, via the communication network, from the terminals of a plurality of the users, user characteristic data including at least basic attribute data or log data of the plurality of users;
mapping the characteristics of the plurality of users to a probability space based on the received user characteristic data;
calculating a spherical distance between the respective users in the mapped probability space;
calculating, based on the calculated spherical distances, attribute overlap index data representing the degree of overlap of attributes between a specific user among the plurality of users and the other users;
generating a recommendation list for making a recommendation to the specific user by calculating, for the calculated attribute overlap index data, a nonlinear average that depends on a parameter representing the user's degree of risk avoidance; and
transmitting, based on the generated recommendation list, data for making a recommendation to the terminal of the specific user.

With this configuration of the present invention, the server receives, via the communication network, from the terminals of the plurality of users, user characteristic data including at least basic attribute data or log data of those users; maps the characteristics of the users to a probability space based on the received data; calculates the spherical distance between the respective users in that probability space; calculates, based on the calculated spherical distances, attribute overlap index data representing the degree of overlap of attributes between a specific user and the other users; generates a recommendation list for the specific user by calculating, for the attribute overlap index data, a nonlinear average that depends on a parameter representing the user's degree of risk avoidance; and, based on the generated list, can transmit data for making a recommendation to the terminal of the specific user.

According to the present invention, by calculating the attribute duplication index data, the server can generate a recommendation list personalized for each user who receives recommendations. Therefore, when making recommendations to a specific user, the server can adjust the range over which recommendations are made on a scale representing the degree of overlap of attributes between the specific user and the other users.

以下、本発明の実施形態について図面を参照して説明する。 Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 is a diagram showing the overall configuration of a system 1 according to an example of a preferred embodiment of the present invention. FIG. 2 is a diagram showing the configuration of the server 10 and the terminal 20. FIG. 3 is a flowchart showing the recommendation process performed by the server 10. FIG. 4 is a diagram showing the original Gaussian distributions used to explain the averaging of three Gaussian distributions by the server 10. FIG. 5 is a diagram comparing, for the three Gaussian distributions of FIG. 4, the average obtained when the value of α is very large (pessimistic) with the average obtained when it is very small (optimistic). FIG. 6 is a diagram showing the (non-linear) average of the distances from user a_1 to the other users as a function of a general α. FIG. 7 is a diagram comparing the non-linear average Θ_3(α) of the spherical distances of user a_3 with (Θ_31, Θ_32, Θ_34).

［システムの全体構成］ [System overall configuration]

図１は、本発明の好適な実施形態の一例に係るシステム１の全体構成を示す図である。 FIG. 1 is a diagram showing an overall configuration of a system 1 according to an example of a preferred embodiment of the present invention.

サーバ１０は、通信ネットワーク３０を介して、ユーザの端末２０と接続可能である。 The server 10 can be connected to the user terminal 20 via the communication network 30.

サーバ１０と端末２０の接続の形態としては、有線でも無線でもよい。 The connection form between the server 10 and the terminal 20 may be wired or wireless.

［サーバ１０のハードウェア構成］ [Hardware Configuration of Server 10]

図２は、図１で説明した本発明の好適な実施形態の一例に係るサーバ１０のハードウェア構成の一例を示す図である。サーバ１０は、制御部１０１を構成するＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）１０１０（マルチプロセッサ構成ではＣＰＵ１０１２等複数のＣＰＵが追加されてもよい）、バスライン１００５、通信Ｉ／Ｆ１０４０、メインメモリ１０５０、ＢＩＯＳ（ＢａｓｉｃＩｎｐｕｔＯｕｔｐｕｔＳｙｓｔｅｍ）１０６０、ＵＳＢポート１０９０、Ｉ／Ｏコントローラ１０７０、並びにキーボード及びマウス１１００等の入力手段や表示装置１０２２を備える。 FIG. 2 is a diagram illustrating an example of a hardware configuration of the server 10 according to an example of the preferred embodiment of the present invention described in FIG. The server 10 includes a central processing unit (CPU) 1010 (a plurality of CPUs such as a CPU 1012 may be added in a multiprocessor configuration), a bus line 1005, a communication I / F 1040, a main memory 1050, a BIOS ( Basic Input Output System) 1060, USB port 1090, I / O controller 1070, keyboard and mouse 1100 and other input means and display device 1022.

Ｉ／Ｏコントローラ１０７０には、テープドライブ１０７２、ハードディスク１０７４、光ディスクドライブ１０７６、半導体メモリ１０７８、等の記憶手段を接続することができる。 Storage means such as a tape drive 1072, a hard disk 1074, an optical disk drive 1076, and a semiconductor memory 1078 can be connected to the I / O controller 1070.

ＢＩＯＳ１０６０は、サーバ１０の起動時にＣＰＵ１０１０が実行するブートプログラムや、サーバ１０のハードウェアに依存するプログラム等を格納する。 The BIOS 1060 stores a boot program executed by the CPU 1010 when the server 10 is started up, a program depending on the hardware of the server 10, and the like.

記憶部１０７を構成するハードディスク１０７４は、サーバ１０がサーバとして機能するための各種プログラム及び本発明の機能を実行するプログラムを記憶しており、更に必要に応じて各種データベースを構成可能である。 The hard disk 1074 constituting the storage unit 107 stores various programs for the server 10 to function as a server and programs for executing the functions of the present invention, and can configure various databases as necessary.

光ディスクドライブ１０７６としては、例えば、ＤＶＤ−ＲＯＭドライブ、ＣＤ−ＲＯＭドライブ、ＤＶＤ−ＲＡＭドライブ、ＣＤ−ＲＡＭドライブを使用することができる。この場合は各ドライブに対応した光ディスク１０７７を使用する。光ディスク１０７７から光ディスクドライブ１０７６によりプログラム又はデータを読み取り、Ｉ／Ｏコントローラ１０７０を介してメインメモリ１０５０又はハードディスク１０７４に提供することもできる。また、同様にテープドライブ１０７２に対応したテープメディア１０７１を主としてバックアップのために使用することもできる。 As the optical disc drive 1076, for example, a DVD-ROM drive, a CD-ROM drive, a DVD-RAM drive, or a CD-RAM drive can be used. In this case, the optical disk 1077 corresponding to each drive is used. A program or data may be read from the optical disk 1077 by the optical disk drive 1076 and provided to the main memory 1050 or the hard disk 1074 via the I / O controller 1070. Similarly, the tape medium 1071 corresponding to the tape drive 1072 can be used mainly for backup.

サーバ１０に提供されるプログラムは、ハードディスク１０７４、光ディスク１０７７、又はメモリーカード等の記録媒体に格納されて提供される。このプログラムは、Ｉ／Ｏコントローラ１０７０を介して、記録媒体から読み出され、又は通信Ｉ／Ｆ１０４０を介してダウンロードされることによって、サーバ１０にインストールされ実行されてもよい。 The program provided to the server 10 is provided by being stored in a recording medium such as the hard disk 1074, the optical disk 1077, or a memory card. The program may be installed in the server 10 and executed by being read from the recording medium via the I / O controller 1070 or downloaded via the communication I / F 1040.

前述のプログラムは、内部又は外部の記憶媒体に格納されてもよい。ここで、記憶部１０７を構成する記憶媒体としては、ハードディスク１０７４、光ディスク１０７７、又はメモリーカードの他に、ＭＤ等の光磁気記録媒体、テープ媒体を用いることができる。また、専用通信回線やインターネットに接続されたサーバシステムに設けたハードディスク１０７４又は光ディスクライブラリー等の記憶装置を記録媒体として使用し、通信回線を介してプログラムをサーバ１０に提供してもよい。 The aforementioned program may be stored in an internal or external storage medium. Here, as a storage medium constituting the storage unit 107, a magneto-optical recording medium such as an MD or a tape medium can be used in addition to the hard disk 1074, the optical disk 1077, or the memory card. Further, a storage device such as a hard disk 1074 or an optical disk library provided in a server system connected to a dedicated communication line or the Internet may be used as a recording medium, and the program may be provided to the server 10 via the communication line.

ここで、表示装置１０２２は、ユーザにデータの入力を受け付ける画面を表示したり、サーバ１０による演算処理結果の画面を表示したりするものであり、ブラウン管表示装置（ＣＲＴ）、液晶表示装置（ＬＣＤ）等のディスプレイ装置を含む。 Here, the display device 1022 displays a screen for accepting data input to the user, or displays a screen of a calculation processing result by the server 10, and includes a cathode ray tube display device (CRT) and a liquid crystal display device (LCD). ) And the like.

ここで、入力手段は、ユーザによる入力の受け付けを行うものであり、キーボード及びマウス１１００等により構成してよい。 Here, the input means accepts input by the user, and may be configured by a keyboard, a mouse 1100, and the like.

また、通信Ｉ／Ｆ１０４０は、サーバ１０を専用ネットワーク又は公共ネットワークを介して端末と接続できるようにするためのネットワーク・アダプタである。通信Ｉ／Ｆ１０４０は、モデム、ケーブル・モデム及びイーサネット（登録商標）・アダプタを含んでよい。 The communication I / F 1040 is a network adapter for enabling the server 10 to be connected to a terminal via a dedicated network or a public network. The communication I / F 1040 may include a modem, a cable modem, and an Ethernet (registered trademark) adapter.

The above example has mainly described the server 10, but the functions described above can also be realized by installing a program on a computer and operating that computer as a server device. Accordingly, the functions realized by the server described as an embodiment of the present invention can also be realized by having the computer execute the above-described method, or by installing the above-described program on the computer and executing it.

[Hardware Configuration of User's Terminal 20]

Here, the user's terminal 20 may have the same configuration as the server 10 described above.

[Recommendation process]

サーバ１０は、図３に示すようにレコメンデーション処理を行う。 The server 10 performs a recommendation process as shown in FIG.

まず、制御部１０１は、通信ネットワーク３０を介して複数のユーザの端末２０から前記複数のユーザの基本属性データ又はログデータを少なくとも含んで構成するユーザ特性データを受信して記憶する（ステップＳ１０１）。 First, the control unit 101 receives and stores user characteristic data including at least the basic attribute data or log data of the plurality of users from the terminals 20 of the plurality of users via the communication network 30 (step S101). .

具体的には、例えば、前記ユーザの性別、年齢、職業、興味のある分野等を含む基本属性データ又は、商品・サービスの購入履歴、商品・サービスに対する評価記録（レイティング）等のデータを受け付ける。 Specifically, for example, basic attribute data including the user's gender, age, occupation, field of interest, or the like, data such as purchase history of products / services, evaluation records (ratings) for the products / services, and the like are received.

次に、制御部１０１は、受信した前記ユーザ特性データに基づいて、複数の前記ユーザの特性を確率空間にマッピングする（ステップＳ１０２）。 Next, the control unit 101 maps a plurality of user characteristics to a probability space based on the received user characteristic data (step S102).

Next, based on the received user characteristic data, the control unit 101 calculates the Bhattacharyya spherical distance between each pair of users in the probability space containing the characteristics of the plurality of users (step S103).
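The distance calculation of step S103 can be sketched as follows. This is a minimal illustration that assumes the standard definition of the Bhattacharyya spherical distance, namely the arccosine of the Bhattacharyya coefficient, Θ = arccos Σ_k √(ρ_m(k) ρ_n(k)). The function names and the sample distributions are hypothetical and do not come from the specification.

```python
import math


def bhattacharyya_spherical_distance(p, q):
    """Angle between sqrt(p) and sqrt(q), viewed as points on the unit sphere.

    Assumes p and q are normalized discrete distributions over the same categories.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    # Bhattacharyya coefficient: sum_k sqrt(p_k * q_k), in [0, 1] for normalized inputs.
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp against floating-point drift before taking the arccosine.
    coefficient = min(1.0, max(0.0, coefficient))
    return math.acos(coefficient)


def pairwise_distances(distributions):
    """Matrix of spherical distances between every pair of users."""
    n = len(distributions)
    return [
        [bhattacharyya_spherical_distance(distributions[i], distributions[j])
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical category distributions for three users (illustrative values only).
users = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
theta = pairwise_distances(users)
```

Under this definition, identical distributions are at distance 0 and distributions with no shared category are at the maximum distance π/2, so a smaller Θ means a larger overlap of attributes.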

次に、制御部１０１は、複数の前記ユーザのうち、特定のユーザとその他のユーザとの間の属性の重複度合いを表す属性重複指数データを計算する（ステップＳ１０４）。 Next, the control unit 101 calculates attribute duplication index data representing the degree of duplication of attributes between a specific user and other users among the plurality of users (step S104).

次に、制御部１０１は、計算した前記属性重複指数データについて、非線形平均を計算する（ステップＳ１０５）。 Next, the control unit 101 calculates a nonlinear average for the calculated attribute duplication index data (step S105).

Next, the control unit 101 generates a recommendation list for making recommendations to the specific user based on the calculated non-linear average of the attribute duplication index data (step S106).

ここで、非線形平均（α混合平均）について説明する。 Here, the nonlinear average (α mixed average) will be described.

To understand the effect of the variable α taking an extremely large or extremely small value when a non-linear average of probability distributions is taken with the α-mixed average, consider the following example, in which three Gaussian distributions are averaged. Suppose the original distributions are given as in FIG. 4. FIG. 5 then compares the linear average of these three distributions (solid line) with the average for a very large value of α (pessimistic; dotted line) and the average for a very small value of α (optimistic; one-dot chain line).

Now consider the following situation. Suppose that some product (or drug) is recommended only when the average of these three distributions does not exceed a given critical value for every x. If this critical value is 0.24 (the two-dot chain line in the figure), then in this example, when the value of α is very large, that is, when the degree of risk avoidance is very large, the probability value exceeds the critical value of 0.24 for x in the vicinity of the interval [0, 1], so the recommendation is rejected. Conversely, if the degree of risk avoidance (that is, the value of α) is not so large, the probability value does not exceed the critical value of 0.24 for any x, so the recommendation is adopted.
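The threshold test described above can be illustrated numerically. The sketch below assumes an α-mean of the form M_α(v) = (mean of v^r)^(1/r) with r = (1 − α)/2, applied pointwise and then renormalized so that the result integrates to 1. The specification's own formula is not reproduced in this text, so this form, the Gaussian parameters, and all function names are assumptions, chosen only so that a large α behaves pessimistically as described for FIG. 5.

```python
import math


def gaussian(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def alpha_mean(values, alpha):
    """Assumed alpha-mean of positive numbers with equal weights."""
    if alpha == 1:
        # Limiting case r -> 0: geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    r = (1.0 - alpha) / 2.0
    return (sum(v ** r for v in values) / len(values)) ** (1.0 / r)


def alpha_mixture_density(components, alpha, xs):
    """Pointwise alpha-mean of Gaussian densities, renormalized on the grid xs."""
    step = xs[1] - xs[0]
    raw = [alpha_mean([gaussian(x, m, s) for m, s in components], alpha) for x in xs]
    total = sum(raw) * step  # Riemann-sum approximation of the integral
    return [v / total for v in raw]


def recommend(density, critical_value):
    """Recommend only if the density never exceeds the critical value."""
    return max(density) <= critical_value


# Hypothetical components (mean, standard deviation); not the curves of FIG. 4.
components = [(-1.5, 1.0), (0.0, 1.0), (1.5, 1.0)]
xs = [-6.0 + 0.01 * i for i in range(1201)]

linear = alpha_mixture_density(components, -1, xs)      # r = 1: ordinary average
pessimistic = alpha_mixture_density(components, 9, xs)  # r = -4: close to the minimum
```

With these illustrative parameters the linear average stays below 0.24 everywhere, so `recommend(linear, 0.24)` accepts, while the large-α average concentrates on the region where all three distributions overlap, rises above 0.24 there, and is rejected.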

次に、制御部１０１は、生成した前記レコメンデーションリストに基づいて、前記特定のユーザの端末にレコメンデーションを行うためのデータを送信する（ステップＳ１０７）。 Next, based on the generated recommendation list, the control unit 101 transmits data for making a recommendation to the terminal of the specific user (step S107).

Here, the above-described recommendation process will be described using specific examples.
[Example 1]
Example of a recommendation based on a non-linear average of the Bhattacharyya spherical distance Θ

ここでは、ユーザ間の距離Θの非線形平均操作に基づいたユーザに対するレコメンデーションリスト（推薦リスト）を作成する例を考える。 Here, consider an example of creating a recommendation list (recommendation list) for users based on a nonlinear average operation of the distance Θ between users.

Suppose that the number of customers and the number of categories are both 4, and that the distribution functions ρ_n(k) representing the user attributes are given as follows.

[equation not reproduced]

The spherical distances in this case are obtained as follows.

[equation not reproduced]

The non-linear average Θ_n(α) of Θ is then obtained by

[equation not reproduced]

FIG. 7 shows Θ_3(α), obtained by focusing on user a_3, in comparison with (Θ_31, Θ_32, Θ_34). In FIG. 7, Θ_3(α) is shown by a solid line, Θ_31 by a one-dot chain line, Θ_32 by a dotted line, and Θ_34 by a two-dot chain line.

In this example, the user of the algorithm (for example, a recommendation service provider) needs to choose an appropriate value for the variable α representing the degree of risk avoidance. If a relatively non-conservative value α = −5 is chosen, then

[equation not reproduced]

so users whose distance is smaller than this value are judged to have interests close to those of user a_3, the user under consideration. In this case, the recommendation list is selected from the lists of user a_1 and user a_4. Conversely, when a relatively conservative value α = 10 is chosen,

[equation not reproduced]

so the only user whose distance is smaller than this value is a_1.
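The selection rule of Example 1 can be sketched as follows: take the non-linear average of the distances from the user of interest, then keep the users whose distance is below it. The α-mean form with exponent r = (1 − α)/2 is an assumption, since the specification's formula is not reproduced in this text. The distance values are hypothetical, chosen only to reproduce the qualitative outcome described above, and are not the values of the specification.

```python
import math


def alpha_mean(values, alpha):
    """Assumed alpha-mean: power mean with exponent r = (1 - alpha) / 2."""
    if alpha == 1:
        # Limiting case r -> 0: geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    r = (1.0 - alpha) / 2.0
    return (sum(v ** r for v in values) / len(values)) ** (1.0 / r)


def users_within_average(distances, alpha):
    """Users whose distance is smaller than the non-linear average distance."""
    threshold = alpha_mean(list(distances.values()), alpha)
    return sorted(user for user, d in distances.items() if d < threshold)


# Hypothetical spherical distances from user a3 to the other users.
distances_from_a3 = {"a1": 0.4, "a2": 1.2, "a4": 0.7}

non_conservative = users_within_average(distances_from_a3, -5)
conservative = users_within_average(distances_from_a3, 10)
```

Under this assumed form, a larger α pulls the average toward the smallest distance, so fewer users fall below it: here α = −5 keeps a1 and a4, while α = 10 keeps only a1.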
[Example 2]
An example in which a user (user) receives a recommendation by adjusting the degree of similarity of a recommendation list (recommendation list) based on a nonlinear average of spherical distances

非線形平均（α混合平均）の考えを用いて、ユーザ（利用者）が自らレコメンデーションリスト（推薦リスト）の類似性度合いを調整して推薦を受けるシステムに対する応用を考える。つまり、非常に保守的なユーザ（利用者）は、スライドパラメータの値を０にとることによって最も属性（趣味）の重複度が大きい他のユーザ（利用者）が購入した商品のレコメンデーション（推薦）を受け、或いはチャレンジ意欲の大きいユーザ（利用者）はスライドパラメータの値を１に近くとることによって、かなり属性（興味）の異なるユーザ（利用者）のリストよりレコメンデーション（推薦）を受けるというシステムを考える。 Using the idea of nonlinear average (α-mixed average), consider an application to a system in which a user (user) adjusts the similarity degree of a recommendation list (recommendation list) and receives a recommendation. In other words, highly conservative users (users) recommend (recommend) products purchased by other users (users) who have the highest degree of duplication of attributes (hobbies) by setting the slide parameter value to 0. ) Or a user (user) who has a strong willingness to take a challenge, recommends a recommendation (recommendation) from a list of users (users) with significantly different attributes (interests) by setting the slide parameter value close to 1. Think of a system.

Specifically, each user specifies a variable t that takes a value between 0 and 1. Based on this variable, the value of the parameter α of the non-linear average (α-mixed average) is chosen as

[equation not reproduced]

ここで、ユーザ（利用者）間の属性（興味）の重複度に関しては、一般的な商品やサービスに関しての購入数又は検索数によって得られた各ユーザ（利用者）のヒストグラムに基づいて得られる確率分布に対する重複度（＝球面距離）Θｍｎによって与える。 Here, the degree of duplication of attributes (interests) between users (users) is obtained based on the histogram of each user (user) obtained by the number of purchases or searches for general products and services. The degree of overlap (= spherical distance) for the probability distribution is given by Θmn.

When the n-th user selects some value t_n, α_n for this user is determined by the above formula. Based on this α_n, the following is calculated.

[equation not reproduced]

Next, the user whose distance is closest to the resulting value Θ(n), that is, the m*-th user determined by

[equation not reproduced]

is identified. Among the products and services in which that user (there may well be more than one such user) has a strong interest, those that the n-th user has not yet purchased or used are recommended.

Next, as a concrete example for understanding the above concept, suppose there are eight users in total. Focusing on user a_1, suppose the distances measured from this user to the other users are given as

[equation not reproduced]

Suppose further that user a_1 is somewhat conservative and has chosen t = 0.3 as the value of the variable t. The (non-linear) average of the distances from user a_1 to the other users, as a function of a general α, is as shown in FIG. 6 (solid curve).

FIG. 6 also shows the distances from user a_1 to each of the other users (a_2 to a_8), together with their linear average. Since user a_1 has chosen t = 0.3, the average distance obtained from this choice (that is, the critical value for a_1) is approximately 0.77. The user closest to this critical value is therefore a_3.

A concrete recommendation list can be produced in either of the following two ways.
(1) Use the lists of the users within the critical value. The users within the range not exceeding the critical value are a_2 and a_3, so products in these two users' lists that a_1 has not yet purchased are recommended according to the combined frequency for a_2 and a_3.
(2) Use the list of the user closest to the critical value. As already mentioned, the user closest to the critical value is a_3, so products that a_1 has not yet purchased are recommended from a_3's product list according to their frequency.
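The two list-building strategies of Example 2 can be sketched as follows. The mapping from the slider value t to α is not reproduced in this text, so `alpha_from_slider` below is a purely hypothetical linear mapping (t = 0 gives a large, conservative α). The α-mean form, the distances, and the purchase counts are likewise illustrative assumptions rather than values from the specification.

```python
import math
from collections import Counter


def alpha_mean(values, alpha):
    """Assumed alpha-mean: power mean with exponent r = (1 - alpha) / 2."""
    if alpha == 1:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    r = (1.0 - alpha) / 2.0
    return (sum(v ** r for v in values) / len(values)) ** (1.0 / r)


def alpha_from_slider(t, alpha_conservative=21.0, alpha_adventurous=-19.0):
    """Hypothetical mapping: t = 0 is most conservative, t = 1 most adventurous."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return alpha_conservative + t * (alpha_adventurous - alpha_conservative)


def rank_unpurchased(target_items, neighbour_purchases):
    """Items the target has not bought, ranked by combined neighbour frequency."""
    combined = Counter()
    for purchases in neighbour_purchases:
        combined.update(purchases)
    candidates = {i: c for i, c in combined.items() if i not in target_items}
    return sorted(candidates, key=lambda item: (-candidates[item], item))


# Hypothetical distances from user a1 and hypothetical purchase counts.
distances = {"a2": 0.5, "a3": 0.7, "a4": 0.9, "a5": 1.0,
             "a6": 1.1, "a7": 1.2, "a8": 1.3}
purchases = {
    "a1": {"book_A": 1},
    "a2": {"book_A": 2, "book_B": 1, "book_C": 3},
    "a3": {"book_B": 2, "book_D": 1},
}

critical = alpha_mean(list(distances.values()), alpha_from_slider(0.3))

# Strategy (1): every user within the critical value.
within = sorted(u for u, d in distances.items() if d <= critical)
list_1 = rank_unpurchased(purchases["a1"], [purchases[u] for u in within])

# Strategy (2): only the user closest to the critical value.
closest = min(distances, key=lambda u: abs(distances[u] - critical))
list_2 = rank_unpurchased(purchases["a1"], [purchases[closest]])
```

With these illustrative numbers the critical value is about 0.73, strategy (1) draws on a2 and a3, and strategy (2) draws on a3 alone, which mirrors the structure of the example even though the actual values differ.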
[Example 3]
Example of recommendation based on nonlinear average using Battachaya's spherical distance Θ

Here, an example of creating a recommendation list for a user by applying a non-linear averaging operation that uses the distance Θ between users is shown using the concrete case of book recommendation.

First, the input items for this example are defined. The total number of books is denoted by L. Because the actual number of books is very large, they need to be classified by type (category), such as popular novels, medical books, and history books. Let M be the number of categories. The categories are numbered from i = 1 to i = M, and the i-th category is called b_i. Suppose there are N customers (users) in total. These are likewise named as follows.

[equation not reproduced]

Next, the number of books in category b_k purchased by customer (user) a_n is denoted by C_n(k). Similarly, the number of books in category b_k searched for by customer a_n is denoted by D_n(k). Further, the total number of books purchased by customer a_n is given by

[equation not reproduced]

and the total number of books that customer a_n searched for (but did not purchase) is given by

[equation not reproduced]

Next, the relative weighting of purchased items to searched items is given by the ratio ζ : 1 − ζ. The value of the variable

[equation not reproduced]

must be chosen appropriately by the user of this algorithm (for example, a recommendation service provider) by weighing the importance of purchasing against the importance of searching.

Given the above definitions, the probability distribution over the book categories for customer (user) a_n is given by the following formula.

[equation not reproduced]

In this way, the distribution function ρ_n(k) representing the attributes of customer a_n is determined concretely.
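One way to build such a distribution from the purchase counts C_n(k) and search counts D_n(k) is sketched below. Because the formula itself is not reproduced in this text, the sketch assumes the simplest reading of the ratio ζ : 1 − ζ, namely a convex combination of the normalized purchase histogram and the normalized search histogram. The function name and the sample counts are hypothetical.

```python
def category_distribution(purchase_counts, search_counts, zeta):
    """Assumed form: rho(k) = zeta * C(k) / sum(C) + (1 - zeta) * D(k) / sum(D)."""
    if not 0.0 <= zeta <= 1.0:
        raise ValueError("zeta must lie in [0, 1]")
    if len(purchase_counts) != len(search_counts):
        raise ValueError("both histograms must cover the same categories")
    total_purchased = sum(purchase_counts)
    total_searched = sum(search_counts)
    if total_purchased == 0 or total_searched == 0:
        raise ValueError("each histogram needs at least one count")
    return [
        zeta * c / total_purchased + (1.0 - zeta) * d / total_searched
        for c, d in zip(purchase_counts, search_counts)
    ]


# Hypothetical counts for one customer over M = 4 book categories.
purchased = [3, 1, 0, 0]
searched = [2, 2, 4, 2]

# zeta = 0.75 weights purchases three times as heavily as searches.
rho = category_distribution(purchased, searched, zeta=0.75)
```

Because each histogram is normalized before mixing, the result sums to 1 for any ζ between 0 and 1, so it can be passed directly to a spherical distance calculation.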

As described above, the spherical distance between customers (users) a_n and a_m is determined by

[equation not reproduced]

Further, the weight function μ_n(m) is also set, as described above, to

[equation not reproduced]

Then, from the viewpoint centered on customer (user) a_n, the information on the distributions over book categories is integrated using the α-mixed average, in descending order of overlap in tastes. Specifically, this is given by

[equation not reproduced]

The recommendation list considered optimal is obtained by choosing, for some selected α, the book categories to recommend according to the probabilities P^n_α(k) (books that customer a_n has already purchased are, of course, excluded).
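The weighted integration and the category selection can be sketched as follows. Neither the weight function μ_n(m) nor the formula for P^n_α(k) is reproduced in this text, so both are assumptions here. The weight is taken to be proportional to π/2 − Θ_nm, so that closer users count more. The mixture is a weighted power mean with exponent r = (1 − α)/2 over the other users' distributions, renormalized to sum to 1. The small floor for zero probabilities and all sample values are also illustrative choices.

```python
import math
import random


def spherical_distance(p, q):
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.acos(min(1.0, max(0.0, coefficient)))


def overlap_weights(target, others):
    """Hypothetical weights: proportional to pi/2 minus the spherical distance."""
    raw = [math.pi / 2 - spherical_distance(target, o) for o in others]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("no other user overlaps with the target")
    return [w / total for w in raw]


def alpha_mixture(distributions, weights, alpha, floor=1e-12):
    """Assumed weighted alpha-mean per category, renormalized to sum to 1."""
    r = (1.0 - alpha) / 2.0
    mixed = []
    for k in range(len(distributions[0])):
        # Floor zero probabilities so negative exponents stay finite (assumption).
        values = [max(d[k], floor) for d in distributions]
        if alpha == 1:
            m = math.exp(sum(w * math.log(v) for w, v in zip(weights, values)))
        else:
            m = sum(w * v ** r for w, v in zip(weights, values)) ** (1.0 / r)
        mixed.append(m)
    total = sum(mixed)
    return [m / total for m in mixed]


# Hypothetical distributions over 4 categories; a3 is the customer of interest.
a3 = [0.1, 0.2, 0.3, 0.4]
others = [
    [0.4, 0.3, 0.2, 0.1],      # a1
    [0.25, 0.25, 0.25, 0.25],  # a2
    [0.1, 0.2, 0.4, 0.3],      # a4
]

weights = overlap_weights(a3, others)
p_mix = alpha_mixture(others, weights, alpha=21)  # alpha = 21 gives r = -10

# Draw categories to recommend according to the mixed probabilities.
rng = random.Random(0)
picks = rng.choices(range(len(p_mix)), weights=p_mix, k=5)
```

Books already purchased by the customer would still need to be filtered out of the chosen categories afterwards, as the text notes.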

Suppose that the number of customers and the number of product categories are both 4, and that the distribution functions ρ_n(k) representing the user attributes are given as follows (as in Example 1).

[equation not reproduced]

The spherical distances in this case are obtained as follows.

[equation not reproduced]

Similarly, calculating the weighting function μ_n(m) gives the following result.

[equation not reproduced]

Using these, the distribution P^3_21(k) needed to create the recommendation list for customer (user) a_3 is obtained with the value of α set to 21, giving the following result.

[equation not reproduced]

Embodiments of the present invention have been described above, but the present invention is not limited to these embodiments. The effects described in the embodiments merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

FIG. 1 is a diagram showing the overall configuration of the system 1 according to the present invention.
FIG. 2 is a diagram showing the configuration of the server 10 and the terminal 20 according to the present invention.
FIG. 3 is a flowchart showing the recommendation process performed by the server 10 according to the present invention.
FIG. 4 is a diagram showing the original Gaussian distributions used to explain the averaging of three Gaussian distributions by the server 10 according to the present invention.
FIG. 5 is a diagram comparing, for the three Gaussian distributions of FIG. 4, the average obtained when the value of α is very large (pessimistic) with the average obtained when it is very small (optimistic).
FIG. 6 is a diagram showing the (non-linear) average of the distances from user a_1 to the other users as a function of a general α.
FIG. 7 is a diagram comparing the non-linear average Θ_3(α) of the spherical distances of user a_3 with (Θ_31, Θ_32, Θ_34).

Explanation of symbols

1 System
10 Server
20 Terminal
30 Communication network

Claims

A method in which a server makes recommendations to a user of a terminal that can be connected via a communication network,
Receiving, via the communication network, user characteristic data comprising at least a plurality of basic attribute data or log data of the plurality of users from a plurality of terminals of the users;
Mapping a plurality of user characteristics to a probability space based on the received user characteristic data;
Calculating a spherical distance between each of the users in the mapped probability space;
Calculating attribute duplication index data representing a degree of duplication of attributes between a specific user and other users among the plurality of users based on the calculated spherical distance;
Generating a recommendation list for making recommendations to the specific user by calculating, for the calculated attribute duplication index data, a non-linear average that depends on a parameter representing the user's degree of risk avoidance; and
Transmitting data for making a recommendation to the terminal of the specific user based on the generated recommendation list.

The method according to claim 1, further comprising the step of receiving a setting input of a parameter representing the risk avoidance level of the user.

The method according to claim 1, wherein, in the step of calculating the spherical distance, the server calculates a Bhattacharyya spherical distance as the spherical distance.

The method according to claim 3, wherein the server calculates the Bhattacharyya spherical distance by

[equation not reproduced]

The method according to claim 4, wherein, in the step of calculating the attribute duplication index data, the server calculates the Bhattacharyya spherical distance as the attribute duplication index data.

The method according to claim 4, wherein, in the step of calculating the attribute duplication index data, the server calculates a probability distribution representing the behavior of the other users based on the user characteristic data, and calculates, as the attribute duplication index data, data obtained by multiplying that distribution by a weight calculated based on the Bhattacharyya spherical distance.

The method according to claim 6, wherein the server calculates the weight by

[equation not reproduced]

and calculates the attribute duplication index data by

[equation not reproduced]

The server calculates, based on the calculated Bhattacharyya spherical distances, a relative distance from all the other users centered on each of the users;
Further classifying the plurality of users into a plurality of groups having close relative distances based on the calculated relative distances;
In the step of calculating the attribute duplication index data, the server calculates the attribute duplication index data for the other users classified into the same group as the specific user. The method described.

The method according to claim 8, wherein the server calculates the relative distance by

[equation not reproduced]

A server that makes recommendations to a user of a terminal that can be connected via a communication network, the server comprising:
Means for receiving, from the plurality of user terminals, user characteristic data including at least a plurality of basic attribute data or log data of the plurality of users via the communication network;
Means for mapping a plurality of user characteristics to a probability space based on the received user characteristic data;
Means for calculating a spherical distance between each of the users in the mapped probability space;
Means for calculating attribute duplication index data representing the degree of duplication of attributes between a specific user and other users among the plurality of users based on the calculated spherical distance;
Means for generating a recommendation list for making recommendations to the specific user by calculating, for the calculated attribute duplication index data, a non-linear average that depends on a parameter representing the user's degree of risk avoidance; and
Means for transmitting data for making a recommendation to the terminal of the specific user based on the generated recommendation list.

A program for causing a server to make recommendations to a user of a terminal that can be connected via a communication network, the program causing the server to execute the steps of:
Receiving, from the plurality of user terminals, user characteristic data including at least a plurality of basic attribute data or log data of the plurality of users via the communication network;
Mapping a plurality of user characteristics to a probability space based on the received user characteristic data;
Calculating a spherical distance between each of the users in the mapped probability space;
Calculating attribute duplication index data representing a degree of duplication of attributes between a specific user and other users among the plurality of users based on the calculated spherical distance;
Generating a recommendation list for making recommendations to the specific user by calculating, for the calculated attribute duplication index data, a non-linear average that depends on a parameter representing the user's degree of risk avoidance; and
Transmitting data for making a recommendation to the terminal of the specific user based on the generated recommendation list.