JP2008244602A

JP2008244602A - Display of recommended information of broadcast program

Info

Publication number: JP2008244602A
Application number: JP2007079351A
Authority: JP
Inventors: Shunichi Tano; 俊一田野; Yuzo Koido; 祐三小井土; Hideto Yoshimura; 秀人吉村; Yasuo Masaki; 康生政木; Takehiro Onomatsu; 丈洋小野松
Original assignee: Funai Electric Co Ltd; University of Electro Communications NUC
Current assignee: Funai Electric Co Ltd; University of Electro Communications NUC
Priority date: 2007-03-26
Filing date: 2007-03-26
Publication date: 2008-10-09

Abstract

PROBLEM TO BE SOLVED: To provide a display device for recommended information on broadcast programs that can recommend viewing information more useful to the user.

SOLUTION: The display device comprises a means for inputting an evaluation (Good/No-Good) according to the taste of a viewer; a means for inputting an evaluation against a list of genres common to the viewers; a first totalization means that, for each broadcast program, acquires through a network the evaluations of the common genre list input by each viewer watching the same program and totalizes them; a second totalization means that, for each broadcast program, acquires through a network the evaluations according to the viewers' tastes and totalizes them; and a means for displaying the evaluation of each broadcast program totalized by the first and second totalization means.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a display device that displays recommendation information for broadcast programs such as television programs.

As society has become more information-oriented, the amount of information available to us has grown explosively. More information gives users more flexible choices and leads to a richer life. However, an excessive amount of information far exceeds a user's capacity to process it and instead becomes a hindrance.
For example, the digitalization of broadcasting has advanced in recent years and TV content has increased rapidly, so users can now watch programs that better match their needs. In the past, when there were few channels, a user could easily check what was being broadcast by zapping. Now that the number of channels has increased, merely checking what is being broadcast takes a great deal of time.
Searching an information magazine for programs to watch is also common, but such magazines mostly introduce programs aimed at a general audience and do not meet users' niche demands. A complete transition from terrestrial analog broadcasting to terrestrial digital broadcasting is also scheduled for 2011, and these problems are expected to become more pronounced in the future.

Information retrieval is a conventional technique for addressing this problem. Its purpose is to find necessary information efficiently within a vast amount of information. The search engine is a typical example: it searches the enormous amount of information scattered across the Internet and returns what was requested. However, unless the user enters an appropriate search expression, the desired information cannot be obtained. The search results are also often very large, in which case the user must narrow them down further to find the desired information. Information retrieval therefore depends heavily on the user's ability.

Information recommendation is a method for solving this problem of information retrieval. Information recommendation is a technique in which the system infers the user's preferences and requests, automatically searches for matching information, and presents it to the user. Unlike information retrieval, it does not depend on the user's ability, so the user can obtain the desired information without any conscious effort. Because it places little burden on the user, information recommendation technology is regarded as very important in a society overloaded with information.
However, conventional information recommendation methods mainly emphasize the user's preferences and do not consider the quality of information in depth. As a result, good recommendation results often cannot be obtained. The quality of information here means how valuable the information is to the user. However precious a piece of information may be, it has no value if it is completely useless to the user. Moreover, since values differ from user to user, information that is valuable to one user is not necessarily valuable to another. In other words, the quality of information is determined by each individual user.

Against this background, the present invention deals with information recommendation methods. In particular, it takes up the quality of information, which conventional recommendation methods have neglected, and seeks a way to improve recommendation results. As a preferred example, TV programs are taken as the target of information recommendation, in view of the coming digital broadcasting society.
Conventional TV program recommendation systems have aimed to infer programs that suit the user's preferences. However, from the viewpoint of the quality of information, that is, the value of program content, the recommendation system a user wants is one that recommends programs that both suit the user's preferences and are worth watching for that user.

The present invention has been made in view of the above conventional circumstances, and its object is to provide a display device for recommendation information on broadcast programs that makes it possible to recommend viewing information more useful to the user.

To solve the above problem, the present invention is characterized by comprising: a means for inputting, for broadcast program information, an evaluation (Good/No-Good) according to the viewer's preference; a means for inputting an evaluation against a list of genres common to the viewers; a first aggregation means that, for each broadcast program, acquires through a network the evaluations made against the common genre list by each of the viewers watching the same program and aggregates them; a second aggregation means that, for each broadcast program, acquires through a network the evaluations according to the viewers' preferences and aggregates them; and a display means that displays the evaluation of each broadcast program aggregated by the first and second aggregation means.

More preferably, the display means displays, as time-series information, the evaluation of the broadcast program being viewed as aggregated by the second aggregation means, together with the audience rating of that program.
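The two aggregation means described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the tuple layouts, the one-minute bucket size, and the sample votes are all assumptions made for illustration, and network acquisition is replaced by in-memory lists.

```python
from collections import Counter, defaultdict


def tally_genre_votes(votes):
    """First aggregation (hypothetical): count genre-list evaluations per program.

    votes: iterable of (program_id, genre) pairs, one per viewer selection.
    """
    totals = defaultdict(Counter)
    for program_id, genre in votes:
        totals[program_id][genre] += 1
    return totals


def tally_good_votes(votes, bucket_seconds=60):
    """Second aggregation (hypothetical): Good/No-Good counts per time bucket.

    votes: iterable of (program_id, seconds_from_start, is_good) triples.
    Bucketing by time yields the time series that could be shown
    next to the audience rating.
    """
    series = defaultdict(lambda: defaultdict(lambda: {"good": 0, "no_good": 0}))
    for program_id, seconds, is_good in votes:
        bucket = int(seconds // bucket_seconds)
        key = "good" if is_good else "no_good"
        series[program_id][bucket][key] += 1
    return series


# Made-up sample data standing in for evaluations received over the network.
genre_votes = [("p1", "drama"), ("p1", "drama"), ("p1", "comedy"), ("p2", "news")]
good_votes = [("p1", 10, True), ("p1", 50, False), ("p1", 70, True), ("p2", 5, True)]

genre_totals = tally_genre_votes(genre_votes)
good_series = tally_good_votes(good_votes)
```

Keeping the two tallies separate mirrors the text, where the display means shows both aggregated results for each program.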

The embodiment according to the present invention aims to improve the accuracy of program recommendation while taking the quality of programs into account. First, typical conventional recommendation methods are analyzed. Next, the effective use of individual users' information, which is an important factor in determining the quality of information, is described. Finally, the problems that arise because conventional recommendation methods do not consider program quality are discussed.

(Conventional information recommendation methods)
Information recommendation has been studied extensively [1], and the representative methods are content-based recommendation [2] and collaborative recommendation [3].
(References)
[1] G. Adomavicius and A. Tuzhilin: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions; IEEE Trans. KDE, Vol.17, No.6, pp.734-749, 2005.
[2] M. Balabanovic and Y. Shoham: Fab: Content-Based, Collaborative Recommendation; Comm. ACM, Vol.40, No.3, pp.66-72, 1997.
[3] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl: GroupLens: An Open Architecture for Collaborative Filtering of Netnews; Proc. of 1994 ACM Conference on Computer Supported Cooperative Work, pp.175-186, 1994.

Content-based recommendation is a method that recommends on the basis of the user's viewing history. The system learns from the viewing history what the user likes to watch and presents items that match those preferences. This method can therefore be expected to recommend information that is relatively close to what the user wants. However, information that falls outside the user's viewing history is never recommended, so the user cannot, for example, discover a preference that he or she has not yet noticed. FIG. 1 schematically shows content-based recommendation as applied to program recommendation. The large circle represents the preferences of viewer A as derived from viewer A's viewing history. Programs that fall within these preferences are recommended to viewer A, and relatively accurate recommendation results are obtained.
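As a rough illustration of content-based recommendation, the sketch below builds a preference profile from keyword counts in a viewing history and ranks candidate programs by overlap with that profile. The keyword representation, function names, and sample programs are assumptions for illustration only; the source does not specify how preferences are learned.

```python
from collections import Counter


def build_profile(history):
    """Count how often each keyword appears in the programs already watched."""
    profile = Counter()
    for program in history:
        profile.update(program["keywords"])
    return profile


def recommend_content_based(profile, candidates, top_n=2):
    """Rank candidates by the summed profile weight of their keywords."""
    def score(program):
        return sum(profile[k] for k in program["keywords"])

    ranked = sorted(candidates, key=score, reverse=True)
    # Programs with no overlap with the history are never recommended.
    return [p["title"] for p in ranked[:top_n] if score(p) > 0]


history = [
    {"title": "A", "keywords": ["drama", "mystery"]},
    {"title": "B", "keywords": ["drama"]},
]
candidates = [
    {"title": "C", "keywords": ["drama"]},
    {"title": "D", "keywords": ["sports"]},
    {"title": "E", "keywords": ["mystery", "drama"]},
]

recommended = recommend_content_based(build_profile(history), candidates)
```

In this example the sports program D scores zero and is never surfaced, which is the limitation noted above: nothing outside the viewing history can be recommended.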

Collaborative recommendation, on the other hand, is a method that makes recommendations by referring to the history of users whose preferences resemble one's own. It recommends information that suits the tastes of similar users and that one has not yet viewed. FIG. 2 schematically shows this. The preferences of viewer A and viewer B overlap, so their tastes are similar. Therefore, even for a program that is not included in viewer A's preferences, the system infers that viewer A will probably like it if it is included in viewer B's preferences, and recommends that program to viewer A. Because this method uses other users' viewing histories, recommendation works effectively when each user's preference information is rich and the user population is diverse. It also overcomes the drawback of content-based recommendation, namely the inability to discover new preferences.
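A minimal sketch of this idea follows, assuming each viewer's preferences are represented as a set of liked program IDs and that similarity is measured with the Jaccard index. Both choices, along with the threshold and the sample data, are illustrative assumptions rather than anything stated in the source.

```python
def jaccard(a, b):
    """Overlap of two sets of liked programs, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_collaborative(target, likes, min_similarity=0.3):
    """Recommend programs liked by similar viewers but unseen by the target."""
    own = likes[target]
    scores = {}
    for other, liked in likes.items():
        if other == target:
            continue
        similarity = jaccard(own, liked)
        if similarity < min_similarity:
            continue  # preferences do not overlap enough to trust this viewer
        for program in liked - own:
            scores[program] = scores.get(program, 0.0) + similarity
    # Highest score first; ties broken by program ID for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


likes = {
    "A": {"p1", "p2", "p3"},
    "B": {"p2", "p3", "p4"},
    "C": {"p9"},
}

for_a = recommend_collaborative("A", likes)
for_c = recommend_collaborative("C", likes)
```

Viewer A receives p4 through the overlap with viewer B, while viewer C, who shares nothing with the others, receives no recommendation at all.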

However, it has also been pointed out that collaborative recommendation has a boundary of information propagation called the "collaborative horizon" [4]. In collaborative recommendation, recommendations propagate from person to person. Users with similar preferences may nevertheless form a closed group, and such a group receives no recommendations from other users (see FIG. 3). Forced Social Filtering [4] has been proposed to solve this. It takes the approach of recommending items to a closed group on a trial basis, without regard to the group's preferences. This removes the boundary of information propagation, and collaborative recommendation becomes efficient when viewed across all users. From the viewpoint of an individual user, however, a trial recommendation is not guaranteed to succeed, so problems remain.
(References)
[4] Shunichi Tano, Tomoya Kine, Norihiko Ishitani: Activation method of digital information recommendation by global reinforcement learning; IEEJ Transactions C, Vol. 121-C, No.7, pp.1237-1245, 2001.

(Use of individual users' information)
In recent years, with the spread of Internet technology, many users have come to publish their own information on the Internet frequently. New attempts have accordingly emerged to obtain useful results from the information these users publish. One of them is research that takes messages posted by various users on Internet bulletin boards and uses them to annotate program content [5].
(References)
[5] Naoyuki Okamoto, Takao Takenouchi, Takahiro Kawamura, Akihiko Osuga, Mamoru Maekawa: Development of a viewing support agent that generates public opinion metadata for broadcast programs: extraction of atmosphere components from the net community and refinement by distribution among users; IEICE, Vol. J88-D-I, No.9, pp.1477-1486, 2005.

This Internet bulletin board is linked to programs being broadcast, and users watching a program post messages in real time. In this research, the frequency with which specific posted messages appear is calculated, and the topics and the level of excitement on the bulletin board at that moment are detected in association with the program (see FIG. 4). The data obtained in this way can be used as program metadata.
Most of the information that individual users publish freely is useless noise, but some of it is very valuable. This research extracts the valuable information from such information and uses it effectively.
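The frequency calculation described above might look like the following sketch. It assumes posts arrive as (seconds from program start, text) pairs, that "specific messages" are those containing a given keyword, and that excitement is flagged when a one-minute count reaches a threshold. None of these details are given in the source; they are placeholders for whatever the cited research actually uses.

```python
from collections import Counter


def posts_per_minute(posts, keyword):
    """Count posts containing the keyword in each one-minute window."""
    counts = Counter()
    for seconds, text in posts:
        if keyword in text:
            counts[seconds // 60] += 1
    return counts


def excited_minutes(counts, threshold):
    """Return the minutes whose post count reaches the threshold."""
    return sorted(minute for minute, count in counts.items() if count >= threshold)


# Made-up bulletin-board posts for a single program.
posts = [
    (5, "goal!"),
    (20, "goal goal"),
    (45, "what a goal"),
    (70, "boring"),
    (130, "goal"),
]

counts = posts_per_minute(posts, "goal")
peaks = excited_minutes(counts, threshold=2)
```

The resulting per-minute counts are the kind of data that could be attached to a program as metadata.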

(Problems of conventional information recommendation methods)
Many conventional information recommendation systems combine content-based recommendation and collaborative recommendation to exploit the advantages of both. Whatever the method, however, the purpose is to infer the information the user prefers. For TV program recommendation, the goal has been to infer programs that match the user's preferences as closely as possible, using the EPG (electronic program guide) data of the programs, the user's viewing history, and other users' viewing histories.
However, program recommendation based on user preferences looks only at a superficial outline of each program. It does not actually watch the program and judge whether the content is something the user really wants to see, so users are often disappointed by the recommended programs. Even a favorite program that the user watches every week is not interesting every time; some episodes feel dull, and in such cases the only way to judge is to actually watch the program.

To infer the programs a user wants to watch, more must be known about the user's viewing behavior. Users do not necessarily watch only programs that suit their preferences. They may watch a much-discussed program to keep up with public trends, or watch a program they would not normally watch so as to have something to talk about with friends. In doing so they may discover a new preference. For such users, inferring programs that suit their preferences is pointless. The programs the user is looking for and the programs the recommendation system tries to infer are simply different. This is the limit of conventional recommendation methods. The embodiment according to the present invention aims at program recommendation that can also handle such situations, and to that end it performs program recommendation in consideration of each user's requests.

(Approach of this embodiment)
In the present embodiment, the programs a user really wants to watch are inferred by performing conventional program recommendation and a new program recommendation that considers program quality at the same time.
These two recommendation methods are of different types, so the information they use also differs. Because information of a different nature must be handled differently, the information used by these methods is first classified. On that basis, a way of using information suited to each method is examined.
In the present embodiment, content-based recommendation is used as the conventional recommendation method. Other methods are also useful, but their essential purpose is the same as that of content-based recommendation, so they are not used in this embodiment.
As described above, a program recommendation method that considers program quality has two requirements. The first is that it must be able to respond to users' detailed requests. This can be handled by having the user decide, as necessary, the direction of the programs to be recommended. The second is that a program must actually be watched in order to know its true value. This can be solved through the cooperation of many users: if many other users watch a program with various sets of values and determine its worth, that information is a very useful reference for oneself as well. In particular, the evaluations of users whose values resemble one's own will be close to one's own evaluation and can be trusted. Therefore, program recommendation that considers program quality uses the result obtained by aggregating program evaluations from many users.

The above can be summarized as follows.
・Clarify what information is handled by the conventional recommendation method and by the recommendation method that considers program quality
・Use content-based recommendation as the conventional recommendation method
・Have the user decide the direction of program recommendation as necessary
・Have many users evaluate programs, and use the aggregated evaluations for program recommendation
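The last point, combined with the earlier remark that evaluations from users with similar values are the most trustworthy, can be sketched as a similarity-weighted aggregate. This is an illustrative assumption, not the method of the embodiment: ratings are encoded as +1 (Good) and -1 (No-Good), and similarity is the fraction of commonly rated programs on which two users agree.

```python
def agreement(a, b):
    """Fraction of commonly rated programs on which two users agree."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    return sum(a[p] == b[p] for p in common) / len(common)


def weighted_score(target, program, ratings):
    """Similarity-weighted mean of other users' +1/-1 ratings for a program."""
    numerator = 0.0
    denominator = 0.0
    for user, rated in ratings.items():
        if user == target or program not in rated:
            continue
        weight = agreement(ratings[target], rated)
        numerator += weight * rated[program]
        denominator += weight
    # With no similar raters there is no basis for a score.
    return numerator / denominator if denominator else 0.0


# Made-up ratings: +1 means Good, -1 means No-Good.
ratings = {
    "me": {"p1": 1, "p2": -1},
    "u1": {"p1": 1, "p2": -1, "p3": 1},
    "u2": {"p1": -1, "p2": 1, "p3": -1},
}

score = weighted_score("me", "p3", ratings)
```

Here u1 has always agreed with the target user and u2 never has, so only u1's Good rating for p3 contributes to the score.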

[Classification and use of information]
Next, the information used by the conventional recommendation method and by the new recommendation method that considers program quality is analyzed.
(Information used for program recommendation)
(Conventional program recommendation method)
The conventional program recommendation method recommends programs using the user's preferences. Summary data for the programs the user likes to watch is acquired from EPG (electronic program guide) data and accumulated, and the accumulated data comes to represent the user's preferences. The conventional program recommendation method can therefore be said to rely on EPG data.
EPG data is provided by specific organizations and is accurate and reliable. It is not the kind of information that individual users can publish freely. Such information is called formal information.

(New program recommendation method)
The new program recommendation method uses the evaluations of a program by many users in order to assess its quality. Unlike EPG data, this information can be published freely by users. Because users can publish it freely, however, it may include careless or false information. Such information is generally called informal information.

Regarding the information used for program recommendation, a clear difference has emerged: the conventional recommendation method uses formal information, whereas the new recommendation method uses informal information. That is, the conventional method evaluates program content with formal information, and the new method evaluates it with informal information. In the present embodiment, in order to evaluate program content quantitatively with these two methods, "genre", which represents the user's preferences and the outline of a program, and "value", which represents the quality and worth of a program, are defined and introduced as the two evaluation axes for program content. Because users' values differ, there are multiple indicators of a program's quality and worth, so multiple values can exist.
Note that informal information is not handled by conventional recommendation methods; it is actively introduced for the first time by the new program recommendation method that considers program quality. Informal information is therefore analyzed in the next section.

(Informal information)
Informal information is unofficial information conveyed through small talk, rumor, word of mouth, and the like. It is characterized as "sloppy", "candid" (reflecting real feelings), and "possibly false".
Formal information stands in contrast to informal information. It is official information conveyed through authoritative media such as TV, newspapers, and magazines. It is characterized as "accurate", "reflecting the public stance", and "credible".
Formal information has traditionally been used in a variety of systems because it is accurate and has a fixed form. Informal information, by contrast, has been avoided because it is unstructured and unreliable. However, informal information contains users' real opinions, and its utility value is high. As long as it is handled properly, its value can be brought out.

For example, the book "Wisdom of Crowds" presents many cases showing that, under certain conditions, the aggregate of many users' subjective opinions is surprisingly accurate [6]. Likewise, although users can freely add to and correct Wikipedia [7], an encyclopedia on the Internet, most of the information it carries is said to be correct. In this way, the value of informal information from many users can be extracted by aggregating it.
(References and reference URL)
[6] James Surowiecki, Odaka: "Minna no Iken wa Angai Tadashii" (Japanese edition of "The Wisdom of Crowds"), Kadokawa Shoten, 2006.
[7] "Wikipedia", http://ja.wikipedia.org/

Another reason why only formal information has been used until now is that there was no place where large amounts of informal information could be shared easily. The sharing of informal information is therefore considered next, along with the development of media, followed by an analysis of whether large amounts of informal information can be shared about TV programs.

Media developed from speech, gestures, and writing to printing, radio, telephone, and TV, and then to today's PCs, mobile phones, and the Internet. This development greatly improved the sharing of both formal and informal information. However, in the media from printing through TV, and in radio and TV especially, information issued by those in authority makes up the great majority. In other words, the advent of radio and TV dramatically increased the sharing of formal information only. At that stage, the sharing of informal information did not improve nearly as much. The Internet then spread, anyone became able to publish a Web page to the general public, and the sharing of informal information surged, outpacing that of formal information. As if to reinforce this trend, networks using P2P (Peer-to-Peer) technology appeared with the aim of protecting free dissemination of information. The Internet also absorbed the functions of letters, telephone, radio, and TV (e-mail, Internet TV, and so on), turning even existing media, through which mainly a limited few had published information, into means for individuals to publish information. Drawing in existing media, the Internet triggered an explosion of informal information [11][12].
As a result, countless pieces of informal information now exist on the Internet. Here, the sharing of informal information about TV programs in particular is examined. FIG. 5 shows, for each program genre, the ranking of the number of messages posted about TV programs in 2005 on a group of anonymous bulletin boards on the Internet. This group has a bulletin board for each major key broadcasting station, and many messages about broadcast programs are posted in real time. FIG. 5 aggregates the message counts of those bulletin boards.
(References)
[11] Shunichi Tano, Mitsuru Iwata, Tomonori Hashiyama: An example of informal information explosion and user interface utilizing it; Human Interface Symposium, pp.1175-1180, 2006.
[12] Yuzo Koido, Shunichi Tano, Mitsuru Iwata, Tomonori Hashiyama: Proposal of a new viewing support method and implementation of P2P aggregation algorithm of informal viewing information; Human Interface Symposium, 2006.

FIG. 5 shows that, at the high end, several hundred messages per minute are posted about a TV program. The number of posts is large for every program genre, which confirms that informal information about TV programs is abundant. It is therefore considered quite feasible to make use of informal information in program recommendation as well. Note, however, that the present embodiment does not use the informal information on Internet bulletin boards.

（インフォーマル情報の発信の活性化）
ユーザの発信するインフォーマル情報を利用する場合は、いかにユーザのインフォーマル情報の発信を活性化させるかが重要な問題となる。特に、すでに情報発信の活性化に成功している従来システムを情報源として利用しない場合は、この問題はやっかいである。本実施の形態では情報源にそのような既存のシステムは利用しない。そのようなシステムがこれからも存在するとは言い切れないからである。そこで本実施の形態では、一般的な視聴行動からインフォーマル情報を取得することを前提とする。 (Activation of transmission of informal information)
When informal information transmitted by users is to be used, how to stimulate users to transmit it becomes an important question. The question is especially troublesome when an existing system that has already succeeded in stimulating information transmission is not used as the information source. This embodiment does not use such an existing system as an information source, because there is no guarantee that such systems will continue to exist. Instead, this embodiment assumes that informal information is acquired from ordinary viewing behavior.

インフォーマル情報発生の活性化に関連する研究としては、インフォーマルコミュニケーションの支援がある。インフォーマル情報は特にインフォーマルコミュニケーションから大量に発生しやすい。このインフォーマルコミュニケーションは、情報の獲得や共有、知的創造活動の促進といった役割を果たすため従来から重要視されているものである。そのため、インフォーマルコミュニケーションを促す研究は数多くなされており、特に、相手との親近感とコミュニケーションのきっかけが重要な要素であると考えて、仮想的な対面環境を構築したり、対面環境でのインフォーマルコミュニケーションのきっかけを与える研究が多い。一方で、対面環境も支援せず、やりとりする情報もテキストデータのみという匿名掲示板で、対面環境以上に本音であると思われるやりとりが頻繁に行われているという事実もある。
本来は、これらを詳細に分析し、TV視聴という環境においてインフォーマル情報の発生を活発にする手法を検討すべきであるが、本実施の形態ではそこまでは扱わず、この問題に関しては今後の重要な課題とする。 Research related to activating the generation of informal information includes support for informal communication. Informal information tends to arise in large quantities from informal communication in particular. Informal communication has long been regarded as important because it plays a role in acquiring and sharing information and in promoting intellectual creative activity. For this reason, many studies have sought to encourage informal communication. In particular, many of them treat a sense of closeness to the other party and a trigger for communication as the key factors, and therefore build virtual face-to-face environments or provide triggers for informal communication in face-to-face settings. On the other hand, it is also a fact that on anonymous bulletin boards, which offer no face-to-face support and exchange only text data, exchanges that appear more candid than those in face-to-face settings take place frequently.
Ideally, these findings should be analyzed in detail to devise a method that stimulates the generation of informal information in the TV viewing environment. This embodiment does not go that far, however, and leaves the problem as an important issue for future work.

（番組推薦におけるバリュー）
以上までで、番組の質を考慮した番組推薦手法に、ユーザの番組に対する評価、すなわちインフォーマル情報が利用できることが分かった。しかし、ユーザのインフォーマル情報が具体的にどのような情報なのかはまだ分からない。そこで次に、ユーザの視聴行動から得られる具体的なインフォーマル情報を示し、それらの情報から得られるバリューについて説明する。 (Value in program recommendation)
The above has shown that users' evaluations of programs, that is, informal information, can be used in a program recommendation method that takes program quality into account. What this informal information concretely consists of, however, has not yet been made clear. The following therefore presents the specific informal information that can be obtained from users' viewing behavior and describes the values that can be derived from it.

ユーザの番組視聴行動から取得できる情報には以下のようなものが考えられる。
・どの番組を視聴しているか
・番組の視聴時間
・ブックマークしたかどうか
・録画、予約録画したかどうか
・録画番組の視聴回数
・番組に対する感想
・番組に対する感情
・番組に対するアノテーション The following can be considered as information that can be acquired from the program viewing behavior of the user.
・ Which program is being watched
・ Viewing time of the program
・ Whether the program was bookmarked
・ Whether the program was recorded or scheduled for recording
・ Number of times a recorded program was viewed
・ Impressions of the program
・ Emotions toward the program
・ Annotations on the program

すなわち、これらの情報から番組の質を評価することになる。例えば、視聴回数が多ければその番組はそれだけ面白く質が高いと評価できる。よって、番組の質を考慮した番組推薦を実現するためには、多数のユーザからこれらの情報を収集することになる。そして収集した情報を集約することで、ユーザは様々なバリューを得ることができる。その中でも番組推薦に利用しやすいバリューを以下に示す。
・番組の人気度
・番組の性質
・番組に対する期待度
・番組に対する満足度
・番組に対する感情 In other words, the quality of a program is evaluated from these pieces of information. For example, a program that is viewed many times can be judged to be correspondingly interesting and of high quality. Realizing program recommendation that takes program quality into account therefore involves collecting this information from a large number of users. By aggregating the collected information, users can obtain various values. The values that are easiest to use for program recommendation are listed below.
・ Popularity of the program
・ Nature of the program
・ Expectation for the program
・ Satisfaction with the program
・ Emotions toward the program

番組の人気度は、どれだけ多くのユーザが好んで番組を視聴しているかを表すものである。これは、番組の視聴率や番組の視聴回数などから計算できる。世間で話題の番組を見ておきたい、といった場合などに番組の人気度は有効である。
次の番組の性質は、ユーザからみてその番組がどのような特徴を持っているか、を表す。例えば、保守的な番組や急進的な番組、子供には相応しくない番組、といったものがここで言う番組の性質である。これらはユーザにその旨の情報を直接入力してもらうことになる。 The popularity of a program represents how many users like it. This can be calculated from the audience rating of the program and the number of times the program has been viewed. The popularity of programs is effective when you want to watch popular programs in the world.
Next, the nature of a program represents the characteristics the program has from the users' point of view. Examples include conservative programs, radical programs, and programs unsuitable for children. Users are asked to enter this information directly.

番組に対する期待度は、まだ放送されていない番組に対してユーザがどれだけ期待を寄せているかを表すものである。例えば、今夜から放送が開始される新番組は世間からどれだけ注目を浴びているのか、ということが番組の期待度から分かる。これはユーザの番組に対する直接入力以外に、その番組を予約録画したかどうかといった情報からも計算することができる。 The degree of expectation for a program represents how much users are looking forward to a program that has not yet been broadcast. For example, it shows how much public attention a new program starting tonight is attracting. Besides users' direct input about the program, it can also be calculated from information such as whether the program has been scheduled for recording.

番組に対する満足度は、ユーザが番組に対して満足したか、あるいは面白いと思ったかを表すものである。これは、番組に対してユーザが直接入力した、”満足した”または”不満であった”という情報から計算できる。 The degree of satisfaction with the program indicates whether the user is satisfied with the program or thought it was interesting. This can be calculated from "satisfied" or "dissatisfied" information entered directly by the user for the program.

最後の、番組に対する感情は、番組を視聴してユーザがどのような感情を抱いたかを表すものである。例えば、映画を見ていて感動のシーンであれば”感動した”という情報を入力し、恐怖を感じるシーンであれば”怖い”といった情報を入力する。表情認識技術を使えば、これらをシステムが自動的に取得することも可能である。取得した情報は、感動できる番組を見たいというユーザや、怖いシーンは苦手だから見たくないといったユーザに対して効果的に利用できる。
これらの具体的な利用方法については後述する。 Finally, the emotion toward a program represents what emotions the user felt while viewing it. For example, while watching a movie, the user enters “moved” during a moving scene and “scary” during a frightening scene. With facial expression recognition technology, the system could also acquire this information automatically. The acquired information can be used effectively for users who want to watch moving programs, or for users who do not want to watch scary scenes because they dislike them.
Specific usage of these will be described later.

（情報の集計方式）
以上では、ユーザの視聴行動から得られる具体的なインフォーマル情報と、それを集計することで得られるバリューの例を示した。
そこで次に、ユーザの感情といったインフォーマル情報を集計する際に問題となる、情報の集計方式について述べる。
従来から、情報の集計にはクライアント・サーバ方式が多く用いられてきた。それは、簡単に全ユーザの情報を集計できるからである。しかし、今回集計する情報はインフォーマル情報である。インフォーマル情報の、途中で中央管理機構が介在するとインフォーマル度が低下する恐れがあるという特徴は、クライアント・サーバ方式とインフォーマル情報の相性が悪いことを示している。また、クライアント・サーバ方式は正確な情報集計に適している方式であるが、そもそも”いいかげん”なインフォーマル情報を正確に集計する必要はない。
中央管理機構を介在させずに済む方法としてはP2P方式がある。P2P方式はユーザ同士が直接接続するブローカレスモデルである。途中に中央管理機構を介さないため、インフォーマル情報を扱う場合でもそのインフォーマル度の低下を防ぐことができる。また、分散的な性質も持ち合わせており、各ノードに負荷を分散させることができるというメリットもある。さらに、P2Pでは各ノードが連携しながら情報を集計することになるので、ある程度いいかげんな情報集計になることが多い。このことからも、P2Pとインフォーマル情報の相性が良いことがわかる。ただし、P2Pにも一部にクライアント・サーバ方式を取り入れたハイブリッドP2Pモデルが存在し、純粋なP2Pモデル(ピュアP2P)とは区別されている。 (Information aggregation method)
In the above, the specific informal information obtained from the viewing behavior of the user and the example of the value obtained by totaling the information are shown.
Then, next, the information summarization method which becomes a problem when summarizing informal information such as user's feelings will be described.
Conventionally, the client-server model has often been used to aggregate information, because it makes it easy to total the information of all users. However, the information aggregated here is informal information. One characteristic of informal information is that its degree of informality may drop when a central management mechanism intervenes along the way, which indicates that the client-server model and informal information are a poor match. In addition, although the client-server model is well suited to exact aggregation, there is no need to aggregate inherently “rough” informal information exactly in the first place.
The P2P model is a way to avoid a central management mechanism. It is a brokerless model in which users connect to each other directly. Because no central management mechanism sits in between, the degree of informality is preserved even when informal information is handled. P2P is also distributed by nature, which has the advantage of spreading the load across nodes. Furthermore, because the nodes in a P2P network aggregate information cooperatively, the aggregation is often somewhat rough. This, too, shows that P2P and informal information go well together. Note that there are also hybrid P2P models, which partially incorporate the client-server model and are distinguished from the pure P2P model (pure P2P).

このような理由から、本実施の形態ではインフォーマル情報と相性がいいP2P方式(ピュアP2P)を用いて情報の集計を行うことにする。
ここで、従来の推薦手法と本実施の形態で目指す新しい推薦手法の特徴を表１にまとめる。なお、新しい推薦手法は従来の推薦手法を内包するものである。 For this reason, in this embodiment, information is aggregated using a P2P method (pure P2P) that is compatible with informal information.
Here, Table 1 summarizes the characteristics of the conventional recommendation method and the new recommendation method aimed at in the present embodiment. The new recommendation method includes the conventional recommendation method.

（プライバシーの問題）
ユーザの視聴情報を扱う場合に問題となるのがプライバシーの問題である。一般的にユーザは自分の視聴履歴が誰かの手に渡ることを嫌がる傾向にある。すなわち、自分の視聴履歴が自動的に集計されることに不安を抱いているユーザもいる。
しかし、本実施の形態ではP2P通信でユーザ同士が協力して視聴情報を集計するため、特定の機関に自分の視聴履歴を把握されるということはない。もちろん、個人を特定できるような情報はそもそも送信しない。また、視聴情報を暗号化して集計を行えばセキュリティを高めることもできる。 (Privacy issue)
Privacy becomes an issue when handling users' viewing information. Users generally dislike having their viewing history pass into someone else's hands, and some are uneasy about their viewing history being aggregated automatically.
However, in the present embodiment users cooperate over P2P communication to aggregate viewing information, so no particular organization can learn an individual's viewing history. Naturally, information that could identify an individual is not transmitted in the first place. Security can be increased further by encrypting the viewing information before aggregation.

（情報集計アルゴリズム）
次に、インフォーマル情報の集計に適したアルゴリズムを提案し、その性能を分析する。 (Information aggregation algorithm)
Next, we propose an algorithm suitable for collecting informal information and analyze its performance.

（集計可能な情報）
インフォーマル情報の集計アルゴリズムを考案するために、まずはどのようなインフォーマル情報を集計するのか明らかにする。
例えば、各ユーザから取得する情報は例えば以下のようなものになる。
・どの番組を視聴しているか
・子供に見せたくないか
・満足したか(面白かったか)
・どんな感情を抱いたか
これらの情報はすべて数値で入力することができる。例えば、”どの番組を視聴しているか”ならば、視聴している番組には1、視聴していない番組には0を入力し、”子供に見せたくないか”や”満足したかどうか”ならば、番組に対してそれらの度合いを数値で入力する。”感情”についても、どんな感情をどの程度抱いたかを数値で入力すればよい。感情の種類には例えば”喜び”、”怒り”、”悲しみ”、”楽しみ”といったものが考えられる。 (Aggregable information)
To devise an informal information aggregation algorithm, we first clarify what kind of informal information is aggregated.
For example, the following information is acquired from each user.
・ Which program is being watched
・ Whether the user would not want children to see it
・ Whether the user was satisfied (found it interesting)
・ What emotions the user felt
All of this information can be entered numerically. For “which program is being watched”, enter 1 for a program being watched and 0 for a program not being watched. For “would not want children to see it” and “satisfied”, enter the degree for the program as a number. Likewise for “emotion”, enter as a number which emotion was felt and how strongly. Possible emotion types include “joy”, “anger”, “sadness”, and “fun”.

このように情報を数値で表すことによって、情報の集計はこれらの値を単純に加算するだけでよいことになる。ここで提案する情報集計アルゴリズムは、このような数値で表されたインフォーマル情報の集計を行うものとする。
なお、番組推薦で利用しやすいように、集計結果は比率として計算することにする。すなわち、番組Aを見ているのは全体の何%か(視聴率)、何%のユーザが番組Aに満足しているか、何%のユーザが番組に対して感動したか、といった情報を得ることを目的とする。そのため、ユーザの人数(サンプル数)も一緒に集計できるようにする。 By representing the information numerically in this way, the information can be aggregated simply by adding these values. The information aggregation algorithm proposed here performs the aggregation of the informal information represented by such numerical values.
To make the results easy to use for program recommendation, aggregated results are calculated as ratios. That is, the aim is to obtain information such as what percentage of all users are watching program A (the audience rating), what percentage of users are satisfied with program A, and what percentage of users were moved by the program. For this purpose, the number of users (the sample size) is aggregated together with the values.
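The numeric encoding and ratio calculation described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Tally` class, its field names, and the choice to express satisfaction as a value between 0 and 1 are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Running sums for one program (hypothetical structure for illustration)."""
    samples: int = 0        # number of users counted (sample size)
    watching: float = 0.0   # 1 per user watching the program, 0 otherwise
    satisfied: float = 0.0  # degree of satisfaction, assumed to lie in [0, 1]

    def add(self, other: "Tally") -> "Tally":
        # Aggregation is plain addition of each numeric field.
        return Tally(self.samples + other.samples,
                     self.watching + other.watching,
                     self.satisfied + other.satisfied)

    def ratios(self) -> dict:
        # Ratios are only meaningful once at least one sample was counted.
        if self.samples == 0:
            return {"viewing_rate": 0.0, "satisfaction_rate": 0.0}
        return {"viewing_rate": self.watching / self.samples,
                "satisfaction_rate": self.satisfied / self.samples}


# Four users: two are watching, and one of those two is satisfied.
users = [Tally(1, 1, 1.0), Tally(1, 1, 0.0), Tally(1, 0, 0.0), Tally(1, 0, 0.0)]
total = Tally()
for user in users:
    total = total.add(user)
print(total.ratios())
```

Carrying the sample count alongside the sums is what allows a ratio to be computed at any node without knowing the size of the whole network.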

（要件）
次に、番組推薦システムで求められる情報集計アルゴリズムの機能と性能における要件をまとめる。
まず必要最低限の機能は、上述したように、数値で表されたインフォーマル情報を番組ごとに常時集計できることである。ただし、集計結果である視聴率などは、数秒間隔でその値が分かるようにする。すなわち、番組の平均視聴率だけでなく視聴率の変遷も分かるように集計する。また、リアルタイムに変化する視聴率などの様子も番組推薦に利用できるため、リアルタイム集計も実現する。 (Requirements)
Next, the requirements for the function and performance of the information aggregation algorithm required in the program recommendation system are summarized.
First, the minimum required function is, as described above, the ability to continuously aggregate numerically expressed informal information for each program. Aggregated results such as the audience rating should be available at intervals of a few seconds. That is, the aggregation should reveal not only a program's average audience rating but also how the rating changes over time. Real-time aggregation is also to be supported, because ratings that change in real time can themselves be used for program recommendation.

次に想定するネットワークの規模であるが、現在日本でTVを所有している世帯数は数千万にも及ぶ。そこで、本実施の形態でも大規模なネットワークを想定しておく必要がある。よって、ここではノード数が数十万〜数百万程度の大規模ネットワークを想定することにする。もちろん、ネットワークの規模が小さくても機能しなくてはならない。また、常時安定して稼動するためにはネットワーク障害にも強くなければならない。 Next is the assumed network size, and the number of households that currently own TVs in Japan is tens of millions. Therefore, it is necessary to assume a large-scale network also in this embodiment. Therefore, a large-scale network having about hundreds of thousands to millions of nodes is assumed here. Of course, it must work even if the network is small. Also, in order to operate stably at all times, it must be resistant to network failures.

視聴率などの精度もできる限り確保しなければならない。現在の日本でのある世帯視聴率調査では、例えば調査地域によって標本数を200または600に設定している。標本の抽出は国勢調査のデータをもとにランダムサンプリングで行っているが、TV局関係者のいる世帯や病院、事務所などの施設は除外して視聴率の正当性も考慮している。これによって得られる視聴率の誤差は、視聴率が50%のときに±7%または±4%程度(信頼度95%)である。そこで、本実施の形態で考案する情報集計アルゴリズムも、集計結果の誤差を少なくとも±4%以下に抑えられる方式とする。
また、本実施の形態では各ユーザのTVがP2Pネットワークでつながることを考えており、その場合、ユーザからTV局関係者などを除外して情報を集めることは困難である。そのため正確なサンプリングを行うことはあきらめ、代わりに標本数を現行の調査よりも増やすことで集計結果の正当性低下を補う。そこで、考案する情報集計アルゴリズムは、ある程度多数のユーザの情報を集計できるようにする。 The accuracy such as audience rating must be secured as much as possible. In a current household audience rating survey in Japan, for example, the sample size is set to 200 or 600 depending on the survey area. Samples are extracted by random sampling based on data from the national census, but the legitimacy of the audience rating is also taken into account, excluding facilities such as households, hospitals, offices, etc. with TV station personnel. The error of the audience rating obtained by this is about ± 7% or ± 4% (reliability 95%) when the audience rating is 50%. Therefore, the information aggregation algorithm devised in the present embodiment is also a method that can suppress the error of the aggregation result to at least ± 4% or less.
Further, the present embodiment envisions each user's TV being connected through a P2P network, in which case it is difficult to exclude TV station personnel and the like from the users whose information is collected. Accurate sampling is therefore abandoned; instead, the sample size is made larger than in current surveys to compensate for the reduced validity of the results. Accordingly, the information aggregation algorithm devised here must be able to aggregate information from a reasonably large number of users.

以上をまとめると、以下のような機能及び性能が必要となる。
・番組ごとに視聴率のような比率が計算できる
・リアルタイムに集計できる
・ユーザ数が数十万〜数百万の大規模なネットワークでも機能する
・ネットワーク障害に強い
・視聴率などの誤差を±4%以下に抑えられる
・多数のユーザの情報を集計できる In summary, the following functions and performance are required.
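・ A ratio such as the audience rating can be calculated for each program
・ Aggregation can be performed in real time
・ It works even in large networks with hundreds of thousands to millions of users
・ It is resistant to network failures
・ Errors in the audience rating and similar ratios are kept to ±4% or less
・ Information from a large number of users can be aggregated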

（P2P情報集計アルゴリズム）
次に、以上で明らかにした要求をできる限り満たす情報集計アルゴリズムを提案する。
（単純な情報集計方法）
P2Pで多数のユーザの情報を集計する最も単純で基本的な方法は、1つのノード(ユーザ)が全ての集計を引き受けるという方法である（図６参照）。集計したいデータの数だけノードを選択し、後は選択したノードからデータをもらい集計すればよい。集計結果はブロードキャストすることで、全てのノードが集計結果を得ることができる。
しかし現実にこのような方法は利用できない。集計しているノードの負荷が大きすぎ、さらにそのノードが故障したらその時点で集計は失敗する。
そこで以下に、安定性や負荷の分散も考慮し、全てのノードが集計を行い且つお互いに協力する方法を提供する。 (P2P information aggregation algorithm)
Next, we propose an information aggregation algorithm that satisfies the requirements identified above as much as possible.
(Simple information aggregation method)
The simplest and most basic way to aggregate the information of many users over P2P is to have one node (user) take on all of the aggregation (see FIG. 6). That node selects as many nodes as the number of data items to be aggregated, obtains data from the selected nodes, and totals it. Broadcasting the result then lets every node obtain it.
In practice, however, this method is unusable. The load on the aggregating node is too high, and if that node fails, the aggregation fails at that point.
The following therefore presents a method in which all nodes perform aggregation and cooperate with each other, taking stability and load distribution into account.

（P2P情報集計アルゴリズムの概要）
提案する情報集計アルゴリズムの仕組みを図７に簡単に示す。ノード間の矢印は情報の流れを表している。 (Outline of P2P information aggregation algorithm)
The mechanism of the proposed information aggregation algorithm is simply shown in FIG. Arrows between nodes represent the flow of information.

基本的な考え方は次の通りである。各ユーザは定期的に自分の周りのノードから情報を集めてそれを集計している。すると、自分の周りのノードは、さらにその周りのノードから情報を集めて集計していることになる。そこで、自分の周りのノードからそれらの集計結果も一緒に取得する。例えば、図７ではノード1はノード2〜7から情報を集めている。同様にノード6はノード11〜12の情報を、ノード7はノード8〜10の情報を集めている。よってノード1は、ノード2〜7の情報に加えて、ノード6とノード7が集めたノード8〜12の情報(の集計結果)も同時に得ることができる。さらに言えば、ノード8がノード13〜16の情報を集めているので、ノード13〜16の情報(の集計結果)も同時に集めることができる。このとき、ノード1から見て自分自身の情報を0次情報、ノード2〜7の情報を1次情報、ノード8〜12の情報を2次情報、ノード13〜16の情報を3次情報と呼ぶことにする。もちろん、それ以上の4次情報、5次情報といったものもあり得る。なお、ここで言っているノードの情報とは、上述したような、各ユーザのインフォーマル情報を数値で表したもののことである。 The basic idea is as follows. Each user regularly collects information from nodes around him and aggregates it. Then, the surrounding nodes further collect and aggregate information from the surrounding nodes. Therefore, the totaling results are also acquired from the nodes around you. For example, in FIG. 7, node 1 collects information from nodes 2-7. Similarly, the node 6 collects information on the nodes 11 to 12, and the node 7 collects information on the nodes 8 to 10. Therefore, in addition to the information of the nodes 2 to 7, the node 1 can simultaneously obtain the information of the nodes 8 to 12 collected by the node 6 and the node 7 (the total result). Furthermore, since the node 8 collects the information of the nodes 13 to 16, the information of the nodes 13 to 16 (the total result) can be collected at the same time. At this time, as viewed from node 1, the information of itself is 0th order information, the information of nodes 2-7 is primary information, the information of nodes 8-12 is secondary information, and the information of nodes 13-16 is tertiary information. I will call it. Of course, there can be more quaternary information and quintic information. Note that the node information referred to here is a numerical value representing the informal information of each user as described above.

この方法によって、ネットワーク上の膨大な量の情報を効率的に集計することができる。ただし、他のノードからもらう集計結果がその時点ですでに集計されているデータであるため、集めたデータには時間的なずれが生じている。よって、1次情報よりも2次情報の方が古く、2次情報よりも3次情報の方が古い情報となってしまい、言い換えると、遠くのノードから集めた情報ほど古い情報になっている。 By this method, a huge amount of information on the network can be efficiently aggregated. However, since the aggregation results obtained from other nodes are data that has already been aggregated at that time, there is a time lag in the collected data. Therefore, the secondary information is older than the primary information, and the tertiary information is older than the secondary information. In other words, the information collected from a distant node is older. .

また、集めたデータに重複が生じることも考えられる。例えばノード6がノード8の情報も集めていたと仮定する。するとノード1は、ノード6が集めたノード8の情報と、ノード7が集めたノード8の情報を重複して受け取ることになる。この問題については後に述べる。 In addition, duplication may occur in the collected data. For example, assume that node 6 has also collected information on node 8. Then, the node 1 receives the information on the node 8 collected by the node 6 and the information on the node 8 collected by the node 7 in duplicate. This problem will be described later.

（送信リンクと受信リンク）
各ノードは、他のノードと送信リンク及び受信リンクを張ることができる(図７におけるノード間の矢印がそれに当たる)。送信リンクとは、あるノードが他のノードにデータを送信するときの伝送路である。受信リンクは、逆に他のノードからデータをもらうときに使用する伝送路のことである。よって、例えばノードAがノードBとの間に張った受信リンクは、ノードBにとってはノードAに対する送信リンクである。 (Sending link and receiving link)
Each node can establish transmission links and reception links with other nodes (the arrows between nodes in FIG. 7). A transmission link is the path over which a node transmits data to another node; a reception link, conversely, is the path over which a node receives data from another node. Thus, for example, a reception link that node A establishes with node B is, from node B's point of view, a transmission link to node A.

各ノードは、送信リンク及び受信リンクをあらかじめ決められた数だけ他のノードとの間に張ることができる。ただし自分から他の1つのノードに対して張れるリンクは1本のみとし、さらに、自分から他のノードに張ることができるのは受信リンクのみとする。送信リンクについては、他のノードが自分に対して受信リンクを張ることによって持つことができる。よって、他のノードが自分に対して受信リンクを張ろうとしなければ、全く送信リンクを持たないこともある。なお、どのノードと受信リンクを張るかは完全にランダムとする。また、すでに決められた数だけ送信リンクが張られているノードに対しては、受信リンクを張ることはできないとする。
各ノードは、定期的に自分が持っているデータを送信リンク先の全てのノードに送信する。逆に受信リンク先のノードにデータを要求することはせず、他のノードから受信リンクを通してデータが送られてくるのを待ち続けるのみとする。ただし、一定時間待ってもデータが送られてこない受信リンクは切断し、他のノードとの間に新たに受信リンクを張りなおす。 Each node can establish a predetermined number of transmission links and reception links with other nodes. However, a node may establish at most one link to any given node, and the only links a node may establish on its own initiative are reception links. A node acquires a transmission link when another node establishes a reception link to it. Consequently, a node may have no transmission links at all if no other node tries to establish a reception link to it. The node with which a reception link is established is chosen completely at random. A reception link cannot be established to a node that already has the predetermined number of transmission links.
Each node periodically transmits the data it holds to all nodes at the far end of its transmission links. Conversely, a node never requests data from the nodes at the far end of its reception links; it simply waits for data to arrive over those links. However, a reception link over which no data arrives within a certain time is disconnected, and a new reception link is established with another node.
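The link rules above can be sketched as a small simulation. This is a minimal sketch under stated assumptions: the function name `build_links`, the sequential order in which nodes pick partners, and the concrete limits (4 reception links, 6 transmission links) are hypothetical and chosen only for illustration; timeouts and re-linking are not modeled.

```python
import random


def build_links(num_nodes, max_recv, max_send, seed=0):
    """Let every node open reception links to randomly chosen other nodes.

    recv[a] holds the nodes that a receives from; send[b] holds the nodes
    that b sends to. Only reception links are opened actively, and a node
    that already has max_send transmission links refuses further links.
    """
    rng = random.Random(seed)
    recv = {i: set() for i in range(num_nodes)}
    send = {i: set() for i in range(num_nodes)}
    for a in range(num_nodes):
        candidates = [b for b in range(num_nodes) if b != a]
        rng.shuffle(candidates)  # partner choice is completely random
        for b in candidates:
            if len(recv[a]) >= max_recv:
                break
            if len(send[b]) >= max_send:
                continue  # b has no spare transmission link
            recv[a].add(b)  # a's reception link ...
            send[b].add(a)  # ... is b's transmission link
    return recv, send


recv, send = build_links(num_nodes=50, max_recv=4, max_send=6)
print(len(recv[0]), len(send[0]))
```

Because the links are stored as sets, a node can hold at most one reception link to any given partner, which mirrors the one-link-per-node rule.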

（データの集計）
周りのノードから集めたデータの集計計算の様子を図８に示す。
図８では、4つの領域に分かれた長方形は各ノードが持つデータを表し、それぞれの領域に0次情報〜3次情報が格納されている。集計計算は、集めたデータのn次情報同士を集計して集計結果のn+1次情報の領域へ格納することによって行う。集計結果の0次情報には最新の自分の情報を格納しておく。他のノードにデータを送信するときは、この集計結果のデータを送ることになる。もちろん、最初にネットワークに参加した時点では、まだ他のノードからデータを集めていないので、データを送信する際は0次情報(自分の情報)のみが格納されたデータを送信することになる。
なお、本実施の形態では0次情報〜n次情報を持つデータのことを”次数n+1のデータ”と呼ぶことにする。 (Data aggregation)
FIG. 8 shows how the data collected from the surrounding nodes is calculated.
In FIG. 8, each rectangle divided into four areas represents the data held by a node, and the areas store the 0th-order to 3rd-order information. Aggregation is performed by summing the n-th order information of the collected data and storing the sum in the (n+1)-th order area of the result. The 0th-order area of the result holds the node's own latest information. When a node transmits data to other nodes, it sends this aggregated result. Of course, when a node first joins the network it has not yet collected data from other nodes, so the data it transmits contains only the 0th-order information (its own information).
In the present embodiment, data having 0th order information to nth order information is referred to as “order n + 1 data”.
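The order-shifting aggregation step can be sketched as below. It is an illustrative sketch, not the embodiment's code: the function names `aggregate` and `ratio`, the representation of each order as a `(value_sum, samples)` pair, and the default of four areas (0th to 3rd order, as in FIG. 8) are assumptions.

```python
def aggregate(own, received, depth=4):
    """Build a node's new data from its own value and its neighbors' data.

    own      -- (value_sum, samples) for this node, stored as 0th-order info
    received -- list of neighbor data; each is a list of (value_sum, samples)
                pairs indexed by order
    The k-th order entries of all neighbors are summed into order k + 1.
    """
    result = [own]
    for k in range(depth - 1):
        value_sum = sum(d[k][0] for d in received if k < len(d))
        samples = sum(d[k][1] for d in received if k < len(d))
        result.append((value_sum, samples))
    return result


def ratio(data):
    """Overall ratio over all orders held by a node."""
    total_samples = sum(samples for _, samples in data)
    return sum(value for value, _ in data) / total_samples


# Two newly joined nodes hold only their own 0th-order information.
leaf_watching = [(1, 1)]
leaf_not_watching = [(0, 1)]
# node_b is watching and receives from both leaves.
node_b = aggregate((1, 1), [leaf_watching, leaf_not_watching])
# node_a is not watching and receives from two neighbors holding node_b's data.
node_a = aggregate((0, 1), [node_b, node_b])
print(node_a, ratio(node_a))
```

Each node only ever adds fixed-size pairs, so the message size stays constant no matter how many nodes the higher-order entries summarize.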

（データの遅延）
データの集計方法から明らかなように、集計したデータにはデータの遅延が生じている。基本的には、n次情報よりもn+1次情報の方が古いデータになっている。例えば、各ノードが1分ごとにデータを送信しているとすると、集計結果の0次情報には現在の情報(自分の情報)、1次情報には1分前の情報、2次情報には2分前の情報、3次情報には3分前の情報が含まれていることになる。すなわち、データの送信間隔をt [sec]とすれば、d次情報が得られるのはt×d [sec]後となる。
データ量はn次情報よりもn+1次情報の方が大きくなっているので、より多くのデータを使って精度の良い結果を得るには、より古いデータを使うことになる。よって提案した情報集計アルゴリズムでは、完全にリアルタイムな集計はできない。しかし、各ノードのデータ送信間隔を短くすれば、よりリアルタイム集計に近づけることは可能である。ただし、送信間隔が短くなればその分ネットワークトラフィックも増加することに注意する必要がある。 (Data delay)
As is clear from the data aggregation method, there is a data delay in the aggregated data. Basically, the n + 1st order information is older than the nth order information. For example, if each node sends data every minute, the 0th order information of the aggregation result is the current information (your information), the primary information is the information one minute ago, and the secondary information is Is the information two minutes ago, and the tertiary information contains the information three minutes ago. That is, if the data transmission interval is t [sec], d-th order information is obtained after t × d [sec].
Because the (n+1)-th order information covers more data than the n-th order information, obtaining a more accurate result from more data means using older data. The proposed algorithm therefore cannot aggregate in fully real time. Shortening each node's transmission interval brings it closer to real-time aggregation, but note that a shorter interval increases network traffic correspondingly.

（精度）
次に、提案した情報集計アルゴリズムを利用した場合に得られる結果の精度について考察する。
（収集データ量）
集計結果である視聴率などの精度は、収集したデータ量で決まる。より多くのデータを用いて計算すれば、その分誤差も小さくなる。よって、まず最初に提案したアルゴリズムで収集できるデータの量を見積もることにする。
収集できるデータ量はリンク数と次数で決定される。各ノードの送信リンク数及び受信リンク数を共にL、データの次数をdとすれば、集計結果の1次情報としてLノード分のデータが、集計結果の2次情報としてL^2ノード分のデータが、集計結果のd次情報としてL^dノード分のデータが収集できることになる。すなわち、収集できる全データ量は式(5・1)のようになる。 (Accuracy)
Next, we consider the accuracy of the results obtained when the proposed information aggregation algorithm is used.
(Amount of collected data)
The accuracy of the audience rating, which is the result of the aggregation, is determined by the amount of data collected. If the calculation is performed using more data, the error is reduced accordingly. Therefore, we first estimate the amount of data that can be collected by the proposed algorithm.
The amount of data that can be collected is determined by the number of links and the order. If the number of transmission links and the number of reception links of each node are both L and the order of the data is d, then data for L nodes is collected as the 1st-order information of the result, data for L^2 nodes as the 2nd-order information, and data for L^d nodes as the d-th order information. The total amount of data that can be collected is therefore given by Equation (5.1).

例えば、L=4、d=3とすれば、集計結果の3次情報として、4096ノード分の情報の集計結果を得られることになる。ただしこれは、全てのノードが受信リンクを4つ張っている状態での結果である。実際には、各ノードの受信リンク数および送信リンク数の最大値を共に4としてしまうと、受信リンクを全て張れないノードも出現する可能性がある。各ノードが受信リンクを張れば張るほど送信リンクに余裕のあるノードが減少するからである。よって、各ノードの送信リンク数の最大値を受信リンク数の最大値よりも大きな値に設定するなどの対策が必要である。 For example, if L = 4 and d = 3, the total result of information for 4096 nodes can be obtained as the tertiary information of the total result. However, this is a result when all nodes have four receiving links. Actually, if the maximum value of the number of received links and the number of transmitted links of each node is set to 4, there may be a node that does not have all the received links. This is because, as each node establishes a reception link, the number of nodes having a margin in the transmission link decreases. Therefore, it is necessary to take measures such as setting the maximum value of the number of transmission links of each node to a value larger than the maximum value of the number of reception links.

次に、視聴率の精度は収集データの量によってどの程度変化するか調査する。視聴率のような標本比率の誤差(信頼度95%)は式(5・2)から得られることが知られている。 Next, it is investigated how much the accuracy of the audience rating changes depending on the amount of collected data. It is known that a sample ratio error (95% reliability) such as audience rating can be obtained from equation (5.2).

式(5・2)を用いると、標本比率の誤差は表２のようになる。 Using equation (5 · 2), the error in sample ratio is as shown in Table 2.

現行の視聴率調査では標本数は600程度であるが、この表からそのときの視聴率の誤差は比較的大きいことが分かる。一方、標本数が4096であれば視聴率の誤差は現行の誤差の半分以下となることが分かる。
提案した情報集計アルゴリズムでは、パラメータdを大きくすれば収集できるデータの量も急激に増加するので、視聴率の誤差は十分小さく抑えることが可能である。 In the current audience rating survey, the number of samples is about 600, but this table shows that the audience rating error is relatively large. On the other hand, if the number of samples is 4096, it can be seen that the audience rating error is less than half of the current error.
In the proposed information aggregation algorithm, if the parameter d is increased, the amount of data that can be collected increases abruptly, so that the audience rating error can be kept sufficiently small.
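The relationship between sample size and error can be checked numerically. Equation (5.2) is not reproduced in this text, so the sketch below assumes the standard normal-approximation margin of error for a sample proportion at 95% confidence, which agrees with the figures of about ±7% and ±4% quoted earlier for sample sizes of 200 and 600. The function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the 95% confidence interval for a sample proportion.

    p -- observed ratio (for example, an audience rating of 0.5)
    n -- sample size
    z -- normal quantile; 1.96 corresponds to 95% confidence
    """
    return z * math.sqrt(p * (1.0 - p) / n)


for n in (200, 600, 4096):
    print(n, round(margin_of_error(0.5, n) * 100, 2), "%")
```

Because the error shrinks with the square root of the sample size, roughly four times as many samples are needed to halve it.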

（重複データ）
上記したように、この情報集計アルゴリズムではデータを重複して集計してしまう可能性がある。特に、ネットワークの規模が小さいとデータの重複も生じやすい。しかし、データが重複するからといって必ずしも集計結果がでたらめな値になるわけではない。例えばノードの総数を集計する場合はおかしな集計結果になってしまい、この情報集計アルゴリズムは役に立たないが、本実施の形態で求めたいデータは「何人中何人が番組を見ているか」や「何人中何人が番組に満足したか」といった”比率”のデータである。その場合、ランダムなネットワークを構築することで、提案した情報集計アルゴリズムでも正しい結果が得られると考えられる。これは、各ノードのデータが同程度に重複して集計されても、視聴率などの比率はほとんど変わらないためである。例えば100人中10人が番組を視聴している場合(視聴率10%)、重複して1000人分のデータを集めたとしても、結局そのうちの約100人が視聴していることになって視聴率はほぼ同じ10%となる。 (Duplicate data)
As described above, this information aggregation algorithm may count the same data more than once. Duplication is especially likely when the network is small. However, duplicated data does not necessarily make the aggregated result meaningless. If the goal were, for example, to count the total number of nodes, the result would be wrong and the algorithm would be of no use. The data sought in this embodiment, however, are ratios such as “how many out of how many people are watching the program” or “how many out of how many people were satisfied with the program”. In that case, building a random network is expected to yield correct results even with the proposed algorithm, because ratios such as the audience rating hardly change when the data of every node is duplicated to about the same degree. For example, if 10 out of 100 people are watching a program (audience rating 10%), then even if duplication inflates the collected data to 1000 records, about 100 of them will be viewers, and the audience rating remains about 10%.
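The 100-person example above can be reproduced with a few lines. This is an idealized sketch: it assumes every record is duplicated exactly the same number of times, whereas a real random network would only approximate this. The names `viewing_rate`, `population`, `duplicated`, and `skewed` are hypothetical.

```python
def viewing_rate(records):
    """Ratio of viewers among the collected records (1 = watching, 0 = not)."""
    return sum(records) / len(records)


# 100 people, 10 of whom are watching the program.
population = [1] * 10 + [0] * 90

# Every record counted ten times: 1000 records, 100 of them viewers.
duplicated = population * 10

# Counter-example: only the viewers' records are counted twice.
skewed = population + [1] * 10

print(viewing_rate(population), viewing_rate(duplicated), viewing_rate(skewed))
```

The ratio survives uniform duplication but shifts when duplication is uneven, which is why the text relies on a randomly constructed network.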

しかし、それでも多少は本来の視聴率よりも誤差が大きくなってしまうことは避けられない。そこで、式(5・2)にデータの重複率を導入して、データに重複がある場合の誤差を導く。
まず、重複を含めてS人分のデータを収集し、そのうちs人が番組を視聴していたと仮定する。次に、データSの内、重複して収集したデータをM、その残りをNと仮定する。さらに、N人の内のn人と、M人の内のm人が番組を視聴していたとすると、視聴率は式(5・3)のように表せる。 However, it is inevitable that the error becomes somewhat larger than the original audience rating. Therefore, the data duplication rate is introduced into Equation (5.2) to derive an error when there is duplication of data.
First, suppose that S records are collected, duplicates included, and that s of them come from users who were watching the program. Next, let M be the number of records within S that were collected redundantly, and let N be the remainder. If n of the N records and m of the M records come from users who were watching the program, the audience rating can be expressed as in Equation (5.3).

ここで、Rは重複データがない場合の視聴率、R’は重複データの視聴率、αは収集データの重複率である。以下、本論文ではデータの重複率をα＝M/N として説明する。重複率＝M/(N+M)でないことに注意する。 Here, R is the audience rating when there is no duplicate data, R' is the audience rating within the duplicate data, and α is the duplication rate of the collected data. Hereafter, the data duplication rate is taken to be α = M/N. Note that it is not M/(N+M).
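The image for Equation (5.3) is not reproduced in this text. Assuming R = n/N and R' = m/M, which is how the definitions above read, the rating over all S = N + M records would follow directly from those definitions:

```latex
\frac{s}{S} = \frac{n + m}{N + M}
            = \frac{R N + R' \alpha N}{N + \alpha N}
            = \frac{R + \alpha R'}{1 + \alpha},
\qquad R = \frac{n}{N},\quad R' = \frac{m}{M},\quad \alpha = \frac{M}{N}.
```

This is a reconstruction from the stated symbols, not a copy of the original equation. It reduces to R when α = 0 or when R' = R.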

よって視聴率の誤差Eは、式(5・2)及び式(5・3)から

と導くことができる。この式(5・4)を用いて具体的に誤差を計算すると表３のようになる。なお、図９はN=4096において式(5・4)をグラフ化したものである。重複率が0.1前後のときに最も誤差が大きくなっていることが分かる。 Therefore, the audience rating error E can be derived from Equations (5.2) and (5.3) as Equation (5.4). Table 3 shows the errors calculated with Equation (5.4). FIG. 9 plots Equation (5.4) for N = 4096. The error is largest when the duplication rate is around 0.1.

表３を見ると、データに重複があっても誤差はほとんど増えないことがわかる。よって、提案した情報集計アルゴリズムにおいて、データの重複率はさほど問題ではないことが分かる。 Looking at Table 3, it can be seen that the error hardly increases even if there is duplication in the data. Therefore, it can be seen that the data duplication rate is not a problem in the proposed information aggregation algorithm.

（トラフィック）
本システムは大規模なP2Pネットワーク上で動作させることを前提としている。そのため、情報を集計する際にネットワークを流れるメッセージ数がどの程度になるか、事前に見積もることが重要になる。
提案した情報集計アルゴリズムの場合、受信リンクの数をL、やり取りするデータのサイズをm [bit]、時間T [sec]後にd次情報を得られるとすると、各ノードのネットワークトラフィックは(5・5)式のようになる。 (traffic)
This system is premised on operating on a large-scale P2P network. Therefore, it is important to estimate in advance how many messages will flow through the network when collecting information.
In the proposed information aggregation algorithm, let L be the number of receiving links, m [bit] the size of the data exchanged, and T [sec] the time after which d-order information is obtained. The network traffic of each node is then given by Equation (5.5).

次に、このトラフィックを小さく抑えられるようなLとdの値を求める。
まず式(5・4)から、視聴率の誤差は視聴率が50%のときに最も大きくなることが分かる。ここで、そのときの誤差を1%以下程度に抑えることを考える。現行の視聴率調査での誤差が約4%であるので、誤差が1%以下ならば十分な精度と言える。そこで、式(5・4)から重複率が0.1で視聴率(50%)の誤差が1%になるサンプル数Nを求めてみると約14000になる。よって、重複データも考慮して約15000人分のデータを集められれば、誤差を1%以下に抑えられることになる。ただし、実際のネットワークでは、ノードの離脱などによって収集できるデータ量は理論値よりも小さくなるので、そのことも考慮して最大約30000人のデータを集められるようなパラメータを設定しておけば十分だと考えられる。
表４はd次情報として30000以上のデータを集められるような受信リンク数Lと次数dの組を示したものである。 Next, the values of L and d that can keep this traffic small are obtained.
First, Equation (5.4) shows that the audience rating error is largest when the audience rating is 50%. Consider keeping the error at that point to about 1% or less. Because the error of the current audience rating survey is about 4%, an error of 1% or less can be regarded as sufficiently accurate. Solving Equation (5.4) for the number of samples N at which the error is 1% for a duplication rate of 0.1 and an audience rating of 50% gives about 14000. Allowing for duplicate data, collecting data for about 15,000 people therefore keeps the error at 1% or less. In an actual network, however, the amount of data that can be collected falls below the theoretical value because nodes leave the network and for similar reasons. Taking this into account, it is considered sufficient to set the parameters so that data for up to about 30,000 people can be collected.
Table 4 shows a set of the received link number L and the order d that can collect data of 30000 or more as d-order information.

表４から、トラフィックが最も小さくなる受信リンク数Lの値は2または３である。しかし、受信リンク数Lが小さすぎると、受信リンクを1本張れないだけで収集データ量が急激に減少してしまう恐れがある。そこでそのようなリスクを分散させることができ、かつトラフィックも小さい、受信リンク数L=4、次数d=8の組を本情報集計アルゴリズムでは採用する。その場合、もし時間T=60 [sec]ならば、トラフィックは0.53m [bps]ということになる。 Table 4 shows that traffic is smallest when the number of receiving links L is 2 or 3. However, if L is too small, failing to establish even one receiving link may sharply reduce the amount of data collected. This information aggregation algorithm therefore adopts L = 4 receiving links and order d = 8, a combination that spreads this risk while keeping traffic low. In that case, if T = 60 [sec], the traffic is 0.53m [bps].
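Equation (5.5) and Table 4 are not reproduced in this text, so the following is only a sketch under two assumptions: that the d-order data count grows roughly as L^d, and that the per-node traffic is L·d·m/T. Both are inferred from the figures quoted above rather than taken from the original equation. The function names are hypothetical.

```python
def min_order(links: int, target: int) -> int:
    """Smallest order d with links**d >= target (assumed data-count model, links >= 2)."""
    d, count = 0, 1
    while count < target:
        count *= links
        d += 1
    return d

def traffic_coefficient(links: int, order: int, period_sec: float) -> float:
    """Per-node traffic in units of m [bps], assuming traffic = L * d * m / T."""
    return links * order / period_sec

TARGET = 30000   # number of records to collect, from the text
T_SEC = 60.0     # time T in seconds, from the text

table = {}
for links in range(2, 7):
    d = min_order(links, TARGET)
    table[links] = (d, traffic_coefficient(links, d, T_SEC))
    print(f"L={links}  d={d}  traffic={table[links][1]:.2f}m [bps]")
```

Under these assumptions L = 2 and L = 3 tie for the lowest traffic, and L = 4, d = 8 gives about 0.53m [bps], in line with the figures quoted above.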

（耐障害性）
本実施の形態に係る情報集計アルゴリズムは、基本的には常時ネットワークに繋がっているシステムを想定している。そのため、頻繁にネットワークから離脱したりするノードはそれほど多くないが、大規模なネットワークで常に安定して動作するためには、耐障害性に優れていることが求められる。
本実施の形態に係るアルゴリズムは、ピュアP2Pネットワークで動作する。そのため、ネットワーク障害には比較的強いものになっている。例えば、突然受信リンク先のノードがネットワークから離脱したとする。その場合、単に代わりのノードを探して受信リンクを張りなおせばよい。各ノードの送信リンク数の最大値を大きく設定すれば、受け付けられる受信リンクの数にも余裕ができて、代わりのノードは見つかりやすくなるはずである。仮に、代わりのノードを見つけるまでに時間がかかってしまった場合でも、集計したデータには古いデータも含まれているので(どれだけ古いかはデータの次数による)、現在得られた集計結果からある程度昔のデータを補間することも可能である。 (Fault tolerance)
The information aggregation algorithm according to the present embodiment basically assumes a system that is always connected to a network. Therefore, there are not so many nodes that frequently leave the network, but in order to always operate stably in a large-scale network, it is required to have excellent fault tolerance.
The algorithm according to the present embodiment operates on a pure P2P network and is therefore relatively robust against network failures. For example, suppose the node at the other end of a receiving link suddenly leaves the network. In that case, the node simply searches for a substitute and re-establishes the receiving link. If the maximum number of transmission links per node is set high, each node has spare capacity to accept receiving links, so a substitute should be easy to find. Even if finding a substitute takes time, the aggregated data also contains older data (how old depends on the order of the data), so data from somewhat earlier can be interpolated from the aggregation results currently available.

（P2P番組推薦システム）
次に、ジャンルとバリューを利用した番組推薦システムの実現方法について述べる。なお、推薦対象の番組は、放送中の番組、過去の番組、未放送の番組とする。 (P2P program recommendation system)
Next, a method for realizing a program recommendation system using genres and values will be described. The programs eligible for recommendation are programs being broadcast, past programs, and programs not yet broadcast.

（ジャンルによる番組推薦）
本実施の形態では、ジャンルを利用した番組推薦に従来の推薦手法であるコンテンツベース推薦を用いる。コンテンツベース推薦は、ユーザの視聴履歴からユーザの嗜好を抽出して、その嗜好にあった番組を推薦するというものである。 (Program recommendation by genre)
In the present embodiment, content-based recommendation which is a conventional recommendation method is used for program recommendation using a genre. Content-based recommendation is to extract a user's preference from the user's viewing history and recommend a program that meets that preference.

（ユーザの嗜好）
各ユーザは番組視聴に関して自分の嗜好を持っている。例えば、スポーツ番組を好んで視聴するユーザもいれば、ドキュメンタリー番組を好んで視聴するユーザもいる。コンテンツベース推薦では、このようなユーザの嗜好を利用して番組推薦をする。
本システムでは、ユーザの嗜好をワードリストで表現する(表５)。ワードリストは単語とその出現頻度の組から成り、これは番組のEPGを利用して作成する。具体的には、リモコンに”Goodボタン”と”No-Goodボタン”を用意し、Good/No-Goodボタンが押された番組のEPG情報のワード列を単語に分解して抜き出し、ワードリストに登録する。ワードリストは各単語の出現頻度の情報も持っており、GoodボタンによってEPGから抽出された単語は出現頻度を1増やし、No-Goodボタンによって抽出された単語は出現頻度を1減らす。このようにして作られるワードリストはユーザの嗜好そのものとなる。なお、Goodボタンはバリューの満足度の入力としても利用する。 (User preference)
Each user has his / her preference for viewing the program. For example, some users prefer to watch sports programs, while other users prefer to watch documentary programs. In content-based recommendation, such user preferences are used to recommend programs.
In this system, user preferences are expressed as a word list (Table 5). The word list consists of pairs of a word and its appearance frequency and is created from the EPG of each program. Specifically, a "Good button" and a "No-Good button" are provided on the remote control. When either button is pressed, the word string in the EPG information of the current program is split into words, and those words are registered in the word list. The word list also holds the appearance frequency of each word: a word extracted from the EPG by the Good button has its frequency increased by 1, and a word extracted by the No-Good button has its frequency decreased by 1. The word list built in this way represents the user's preferences. The Good button is also used to input satisfaction as a value.
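The word-list update can be sketched as follows. The function name is hypothetical, and splitting the EPG text on whitespace is an assumption made for brevity; Japanese EPG text would need a proper word segmenter.

```python
from collections import Counter

def update_word_list(word_list: Counter, epg_text: str, good: bool) -> None:
    """Add 1 to each EPG word for a Good press, subtract 1 for a No-Good press."""
    delta = 1 if good else -1
    for word in epg_text.split():     # assumption: whitespace tokenization
        word_list[word] += delta

prefs = Counter()
update_word_list(prefs, "soccer world cup live", good=True)
update_word_list(prefs, "soccer news digest", good=True)
update_word_list(prefs, "news variety", good=False)
print(prefs)
```

Words the user consistently likes accumulate positive frequencies, while disliked words go negative.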

ユーザの好きなジャンルの番組を推測するために、ユーザのワードリストと番組のEPG情報のワード列(をワードリストに変換したもの)との類似度を計算する。この類似度計算は、まず各単語間で類似度を計算し、その次にワードリスト間で類似度を計算するという2段階の手続きを踏む。
単語間の類似度の計算には、例えば株式会社日本電子化辞書研究所のEDR電子化辞書の概念辞書と英語単語辞書を利用する。概念辞書には概念の上下関係がツリー構造で記述されている。英語単語辞書には、英単語と概念の対応関係が記述されている。
単語間の類似度計算は、まず英語単語辞書で2つの単語の概念を引き、次に概念辞書のツリーにおけるそれら概念間の距離を計算する。この距離をNp、ツリーの高さをDとするとき、次の式(6・1)で単語間の類似度を計算できることが知られている。 In order to guess a program of a user's favorite genre, a similarity between the user's word list and a word string of EPG information of the program (which is converted into a word list) is calculated. This similarity calculation is a two-step procedure in which the similarity is first calculated between the words, and then the similarity is calculated between the word lists.
For the calculation of the similarity between words, for example, a concept dictionary and an English word dictionary of an EDR electronic dictionary of the Japan Electronic Dictionary Research Institute are used. The concept dictionary describes the hierarchical relationship of concepts in a tree structure. The English word dictionary describes the correspondence between English words and concepts.
To calculate the similarity between two words, the concepts of the two words are first looked up in the English word dictionary, and the distance between those concepts in the concept dictionary tree is then calculated. With this distance denoted Np and the height of the tree denoted D, it is known that the similarity between the words can be calculated by Equation (6.1).

次にワードリスト間の類似度の計算のために、まず長い方のワードリスト内の単語から、短い方のワードリスト内の各単語と類似度の高いものを抽出して、新たなワードリストを作成する。この新しいワードリストは短い方のワードリストと同じ長さである。そして、新しいワードリスト(各単語の出現回数の値と類似度をかけた値)と短い方のワードリスト(各単語の出現回数の値)をベクトルとしてその内積を計算し、そのなす角を求める。この角度が小さければ、その番組はユーザの嗜好に合っていることになる。 Next, to calculate the similarity between two word lists, a new word list is first created by extracting, from the longer word list, the word most similar to each word in the shorter word list. This new word list has the same length as the shorter one. The new word list (each word's number of occurrences multiplied by its similarity) and the shorter word list (each word's number of occurrences) are then treated as vectors, their inner product is calculated, and the angle between them is obtained. If this angle is small, the program matches the user's preferences.
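The two-step comparison might look like the sketch below. The word-similarity function is passed in as a parameter because Equation (6.1) is not reproduced here; the default `exact_match` is a stand-in, not the EDR-based measure, and all names are hypothetical.

```python
import math

def exact_match(a: str, b: str) -> float:
    """Placeholder word similarity: 1 for identical words, otherwise 0."""
    return 1.0 if a == b else 0.0

def word_list_angle(list_a: dict, list_b: dict, word_sim=exact_match) -> float:
    """Angle in radians between two word lists; a smaller angle means a closer match."""
    short, long_ = sorted((list_a, list_b), key=len)
    if not short:
        raise ValueError("word lists must not be empty")
    u, v = [], []
    for word, freq in short.items():
        # Pick the word in the longer list that is most similar to this one.
        best = max(long_, key=lambda w: word_sim(word, w))
        u.append(float(freq))                          # shorter list: frequency
        v.append(long_[best] * word_sim(word, best))   # new list: frequency x similarity
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return math.pi / 2                             # no usable overlap
    cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    return math.acos(max(-1.0, min(1.0, cos)))

user = {"soccer": 3, "news": -1}
program = {"soccer": 1, "league": 1, "live": 1}
angle = word_list_angle(user, program)
print(angle)
```

Returning a right angle when one vector is all zeros is a design choice of this sketch; the source does not say how that case is handled.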

（バリューによる番組推薦）
次に、番組推薦に利用するバリューについて詳細な説明を行う。
（バリューの種類）
以下のバリューが番組推薦に利用できる。
・番組の人気度(視聴率)
・番組の性質
・番組に対する期待度
・番組に対する満足度
・番組に対する感情 (Program recommendation by value)
Next, the value used for program recommendation will be described in detail.
(Value type)
The following values can be used for program recommendation.
・ Program popularity (audience rating)
・ Nature of the program
・ Expectation for the program
・ Satisfaction with the program
・ Emotion toward the program

このうち、人気度、期待度、満足度は1つの値として算出できるが、性質と感情は、複数の値を持ち得る。そこで、本論文では性質と感情の種類を以下のように定義する。
「性質」
・子供に見せたくない
「感情」
・嬉しい
・楽しい
・悲しい
・苛立ち
・恐怖
性質と感情については、これら1つ1つに対してその度合いを算出する。なお、人気度は番組視聴人数をサンプル数で割ることで算出し、それ以外の期待度、満足度、性質、感情は、集計した値を番組視聴人数で割ることで算出する。すなわち、番組を視聴していたユーザのうち何%が満足したかといった値が得られる。 Of these, popularity, expectation, and satisfaction can each be calculated as a single value, but nature and emotion can take multiple values. Therefore, in this paper, the types of nature and emotion are defined as follows.
"Nature"
・ Do not want children to see
"Emotion"
・ Happy
・ Fun
・ Sad
・ Irritated
・ Fear
For nature and emotion, the degree of each of these items is calculated individually. Popularity is calculated by dividing the number of viewers of the program by the number of samples. The other values (expectation, satisfaction, nature, and emotion) are calculated by dividing the aggregated count by the number of viewers of the program. This yields values such as the percentage of users watching the program who were satisfied.
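As a minimal sketch of the two divisions described above (the function name and the sample figures are made up for illustration):

```python
def compute_values(samples: int, viewers: int, counts: dict) -> dict:
    """Popularity = viewers / samples; every other value = count / viewers."""
    values = {"popularity": viewers / samples if samples else 0.0}
    for name, count in counts.items():
        values[name] = count / viewers if viewers else 0.0
    return values

values = compute_values(
    samples=1000,                                  # users whose data was collected
    viewers=200,                                   # of those, users watching the program
    counts={"satisfaction": 150, "fun": 80, "fear": 10},
)
print(values)
```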

（情報の入力）
バリューを求めるためには、通常の視聴履歴から得られる情報のほかに、各ユーザが番組に対してどの程度満足したか、番組に対してどんな感情を抱いたかといった情報を集計する必要がある。
このような情報の取得方法には、キーボードによるテキスト入力、あるいは笑い声などの音声や表情から感情を取得するといった方法が考えられる。しかし、表情認識は今の段階では技術的に難しく、音声入力はTVの音声とかぶってしまう。また、キーボードは誰もが使いこなせるわけではなく、そもそも受動的なメディアであるTVに対して積極性が求められるキーボードは向いていない。TVに向いている入力手段は、リモコンのようにだれでも気軽に入力できるものである。例えば、”Goodボタン”があれば番組に対する満足度を簡単に入力でき、”嬉しいボタン”や”悲しいボタン”をリモコンに用意すれば、番組に対する満足度や感情を簡単に入力することができる。どのボタンが何回押されたかという情報から、簡単にユーザのインフォーマル情報を数値化することもできる。よって、本実施の形態では情報の入力手段としてリモコンを想定する。情報入力の対象は、番組全体あるいは番組の一部分(ピンポイント)が考えられるが、本実施の形態では”この番組のこの瞬間”という風にピンポイントに情報を入力することにする。
ところで、近年TVの高機能が進み、リモコンのボタンも非常に複雑になってきている。通常のリモコンのほかに、必要最低限のボタンのみのリモコンを用意するケースもある。よって単純にボタンを増やすことは望ましくない。 (Enter information)
To obtain the values, it is necessary to aggregate, in addition to the information available from the ordinary viewing history, information such as how satisfied each user was with a program and what emotions the user felt toward it.
Possible ways of acquiring such information include text input from a keyboard and inferring emotion from sounds such as laughter or from facial expressions. However, facial expression recognition is technically difficult at present, and voice input is masked by the sound of the TV. In addition, not everyone is comfortable with a keyboard, and a keyboard, which demands active effort, does not suit TV, which is a passive medium in the first place. An input means suited to TV is one that anyone can use casually, like a remote control. For example, a "Good button" makes it easy to input satisfaction with a program, and "happy" and "sad" buttons on the remote control make it easy to input emotions about a program. The user's informal information can also be quantified easily from which button was pressed how many times. The present embodiment therefore assumes a remote control as the information input means. Information could be input for a whole program or for a part of it (pinpoint); in this embodiment, information is input pinpoint, as in "this moment of this program".
In recent years, TVs have gained more and more functions, and the buttons on remote controls have become very complicated. Some products ship, alongside the regular remote control, a second one with only the minimum necessary buttons. Simply adding more buttons is therefore undesirable.

（ユーザのプロフィール）
多数のユーザが番組を評価することによって得られるバリューは、全ユーザの総意を知るのに良い指標となるが、さらにユーザのプロフィールも同時に考慮するとより効果的な推薦が可能になる。例えば男性ユーザに関して、好みの異なる女性ユーザのデータを除外したほうが有効な結果を得られることがあるだろう。あるいは、同世代のユーザに人気のある番組を知りたいといった場合や、地元のユーザに人気のある番組を知りたいといった場合も、ユーザのプロフィールを利用することで知ることができる。
番組推薦に効果的に利用できるユーザプロフィールは以下のようなものである。
・年齢
・性別
・地域
情報集計では、10代の人気度・満足度・感情、20代の人気度・満足度・感情、・・・のように、プロフィールごとに情報を集計すればよい。ただし、集計データのサイズはプロフィールの長さ(次元数)がnならばn倍となる。
ここに挙げたプロフィール以外にも多数のプロフィールが考えられるが、これはユーザのプライバシーに深く関わる問題であり、特に個人を特定できるような情報は避けなければならない。 (User profile)
The value obtained by evaluating a program by a large number of users is a good index for knowing the consensus of all users, but more effective recommendation can be made by considering the user's profile at the same time. For example, for male users, it may be more effective to exclude female user data with different preferences. Alternatively, when a user wants to know a program popular with users of the same generation or wants to know a program popular with local users, the user's profile can be used.
The user profiles that can be effectively used for program recommendation are as follows.
・ Age
・ Gender
・ Region
In the information aggregation, information is aggregated separately for each profile value, for example popularity, satisfaction, and emotion for teens, popularity, satisfaction, and emotion for people in their twenties, and so on. Note that the size of the aggregated data grows n-fold if the profile length (number of dimensions) is n.
Many profiles other than those listed here can be considered, but this is a problem deeply related to the privacy of the user, and information that can identify an individual must be avoided.

（データレベルのクラスタリング）
上記では、年齢別、性別、地域別のバリューにも利用価値があることを述べた。ここではそれらに加え、ユーザの嗜好別バリューを算出する方法を説明する。
バリューによる番組推薦において、自分と嗜好が似ているユーザの情報を利用することは非常に有効である。なぜなら、ユーザにとって自分と嗜好が似ているユーザによる番組評価は、多様な嗜好のユーザによる番組評価よりも自分と似ており信頼できるからである。以下ではそのような情報の集計方法を検討する。 (Data level clustering)
The above described how values broken down by age, gender, and region are useful. This section additionally describes a method for calculating values broken down by user preference.
In value-based program recommendation, using information from users whose preferences resemble one's own is very effective, because program evaluations by such users are closer to one's own, and hence more trustworthy, than evaluations by users with diverse preferences. The following considers how to aggregate such information.

従来のP2Pファイル共有ソフトでは、ファイル共有の効率を高めるために、ネットワークレベルで嗜好のクラスタリングを行っているものがある。すなわち、嗜好の似た者同士をネットワーク的に近い位置に配置している。このようなクラスタリングを行えば、嗜好が似ているユーザから情報を集めることが可能である。
しかし、提案した情報集計アルゴリズムでは、このようなクラスタリングを行っても意味がない。なぜなら、この情報集計アルゴリズムはランダムなネットワークを前提としているからである。クラスタリングをしてもネットワークに偏りが生じてしまえば正しい集計結果は得られない。そこで本実施の形態では”共通ジャンルリスト”を導入する。共通ジャンルリストとは、あらかじめ決められた全ユーザに共通の嗜好ワードとその値の列である(表６参照)。例えば、”SF”や”サッカー”といった単語を共通ジャンルとして設定する。 Some conventional P2P file sharing software performs preference clustering at the network level to increase file sharing efficiency. That is, persons with similar preferences are arranged at positions close to each other in terms of network. By performing such clustering, it is possible to collect information from users with similar preferences.
However, such clustering is of no use with the proposed information aggregation algorithm, because the algorithm presupposes a random network. If clustering biases the network, correct aggregation results cannot be obtained. The present embodiment therefore introduces a "common genre list". The common genre list is a predetermined sequence of preference words, shared by all users, together with their values (see Table 6). For example, words such as "SF" and "soccer" are set as common genres.

各ユーザは共通ジャンルリストを以下のように作成する。
まず各ユーザは、自分の嗜好ワードリストから共通ジャンルに対する嗜好の度合いを0〜1の値に変換し、共通ジャンルリストの値として設定する。これを一時共通ジャンルリストと呼ぶことにする。次に、その値が大きいものは1、それ以外は0とする共通ジャンルリストを作成する(図１０参照)。すなわち、各ユーザは自分の好きな共通ジャンルの値を1に設定する。値を0か1に変換するのは、ユーザの共通ジャンルに対する嗜好をよりはっきりと区別させるためである。 Each user creates a common genre list as follows.
First, each user converts the degree of preference for each common genre, taken from the user's own preference word list, into a value from 0 to 1 and sets it as the value in the common genre list. This is called the temporary common genre list. Next, a common genre list is created in which entries with large values are set to 1 and all others to 0 (see FIG. 10). That is, each user sets the value of the common genres he or she likes to 1. The values are converted to 0 or 1 so that the user's preferences among the common genres are distinguished more sharply.
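A sketch of the two lists follows. The source does not say how word frequencies are mapped to the 0 to 1 range or what counts as a "large" value, so scaling by the largest positive frequency and a threshold of 0.5 are both assumptions, as are the genre names and function names.

```python
COMMON_GENRES = ["SF", "soccer", "drama"]   # hypothetical list shared by all users

def temporary_genre_list(word_list: dict, genres=COMMON_GENRES) -> dict:
    """Map the user's word frequencies for the common genres to values in [0, 1]."""
    top = max((word_list.get(g, 0) for g in genres), default=0)
    if top <= 0:
        return {g: 0.0 for g in genres}
    # Assumption: negative frequencies count as 0, then scale by the maximum.
    return {g: max(word_list.get(g, 0), 0) / top for g in genres}

def binarize(temp_list: dict, threshold: float = 0.5) -> dict:
    """Set genres with large values to 1 and all others to 0 (threshold is assumed)."""
    return {g: 1 if v >= threshold else 0 for g, v in temp_list.items()}

word_list = {"soccer": 8, "SF": 2, "drama": -3, "news": 5}
temp = temporary_genre_list(word_list)
common = binarize(temp)
print(temp, common)
```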

情報の集計はこの共通ジャンルリストの値を加算していくことで行う(図１１参照)。集計の方法はユーザプロフィールを利用する場合と同様である。各バリューの集計を、さらに共通ジャンルごとの集計に細かく分けて行う。例えば共通ジャンル1に関する視聴率、共通ジャンル2に関する視聴率、共通ジャンル1に関する満足度、・・・のように集計を行う。 The information is totaled by adding the values of the common genre list (see FIG. 11). The counting method is the same as when using a user profile. The aggregation of each value is further divided into tabulations for each common genre. For example, tabulation is performed such as audience rating for common genre 1, audience rating for common genre 2, satisfaction for common genre 1, and so on.

こうして得られた集計結果からは、番組Aを視聴しているユーザのうち、共通ジャンル1が好きなユーザは何人、共通ジャンル2が好きなユーザは何人、・・・といった情報を得ることができる。これはすなわち専門家によるバリューが得られるということでもある。例えば、ある映画についてSF映画好きなユーザはこの映画をどのように評価しているか、ということが分かる。
さらにこの集計結果を用いて、各ユーザの嗜好に即したバリューを得ることもできる。それには、ユーザの一時共通ジャンルリストを利用する。一時共通ジャンルリストはユーザの共通ジャンルに対する嗜好を表しているので、共通ジャンルリストの集計結果と一時共通ジャンルリストの内積をとることで、ユーザの嗜好を考慮したバリューが得られる。すなわち、自分と嗜好が似ているユーザの情報から算出するバリューを、このような方法で近似できる。
共通ジャンルリストの長さ(次元数)を増やせば、より精度の高いバリューが得られることになるが、その場合は集計データのサイズも増加することに注意する必要がある。共通ジャンルリストの次元数をmとすればデータサイズは元のデータ(ユーザプロフィールを除く)のm倍、ユーザプロフィールの次元数nも考慮すれば、データサイズはn+m倍になる。 From the aggregated results obtained in this way, one can learn, among the users watching program A, how many like common genre 1, how many like common genre 2, and so on. In other words, the value as judged by enthusiasts of a genre becomes available. For example, one can see how users who like SF movies rate a particular movie.
Furthermore, it is also possible to obtain a value in accordance with each user's preference by using the total result. To do this, the user's temporary common genre list is used. Since the temporary common genre list represents the user's preference for the common genre, by taking the inner product of the total result of the common genre list and the temporary common genre list, a value considering the user's preference can be obtained. That is, the value calculated from the information of the user whose preference is similar to the user can be approximated by such a method.
Increasing the length (number of dimensions) of the common genre list will provide more accurate value, but it should be noted that the size of the aggregate data also increases in that case. If the number of dimensions of the common genre list is m, the data size is m times the original data (excluding the user profile), and if the number n of dimensions of the user profile is also considered, the data size is n + m times.
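The per-genre aggregation and the inner product with the temporary common genre list can be sketched as below. The data layout is hypothetical, and dividing by the sum of the weights is an added normalization so the result stays between 0 and 1; the source only states that an inner product is taken.

```python
def aggregate(viewers):
    """viewers: iterable of (binary common genre list, satisfied flag) pairs."""
    totals = {}   # genre -> [viewers who like the genre, satisfied viewers among them]
    for genre_list, satisfied in viewers:
        for genre, flag in genre_list.items():
            entry = totals.setdefault(genre, [0, 0])
            entry[0] += flag
            entry[1] += flag * int(satisfied)
    return totals

def personalized_satisfaction(totals, temp_list):
    """Weight per-genre satisfaction ratios by the user's temporary genre list."""
    num = den = 0.0
    for genre, weight in temp_list.items():
        liked, satisfied = totals.get(genre, (0, 0))
        if liked:
            num += weight * satisfied / liked
            den += weight
    return num / den if den else 0.0

viewers = [
    ({"SF": 1, "soccer": 0}, True),
    ({"SF": 1, "soccer": 0}, True),
    ({"SF": 0, "soccer": 1}, False),
    ({"SF": 1, "soccer": 1}, False),
]
totals = aggregate(viewers)
sf_fan = personalized_satisfaction(totals, {"SF": 1.0, "soccer": 0.0})
soccer_fan = personalized_satisfaction(totals, {"SF": 0.0, "soccer": 1.0})
print(totals, sf_fan, soccer_fan)
```

In this made-up data the same program scores about 0.67 for an SF fan and 0 for a soccer fan, which is the kind of preference-dependent value the text describes.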

（集計データ）
以下のデータは、情報集計の際にやり取りする、放送中番組に関する集計データの一部である。ユーザプロフィール、共通ジャンルは考慮していない。各ユーザは自分の視聴履歴を15秒おきに記録するものとする。よって1分間に4つのデータが作られる。また、データ送信間隔を60秒、次数を3とすれば、3分前までのデータを集計することになるので、合計12個のデータを持つ。 (Aggregated data)
The following is part of the aggregated data for programs being broadcast that is exchanged during information aggregation. User profiles and common genres are not considered here. Each user records his or her viewing history every 15 seconds, so four data points are created per minute. If the data transmission interval is 60 seconds and the order is 3, data going back 3 minutes is aggregated, giving 12 data points in total.

＜番組ID＞ ABC
＜サンプル数＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜視聴人数＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜満足度＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜期待度＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜嬉しい＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜楽しい＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜悲しい＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜苛立ち＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜恐怖＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜子供に見せたくない＞ 1,2,3,4,5,6,7,8,9,10,11,12 <Program ID> ABC
<Number of samples> 1,2,3,4,5,6,7,8,9,10,11,12
<Number of viewers> 1,2,3,4,5,6,7,8,9,10,11,12
<Satisfaction> 1,2,3,4,5,6,7,8,9,10,11,12
<Expectation> 1,2,3,4,5,6,7,8,9,10,11,12
<Happy> 1,2,3,4,5,6,7,8,9,10,11,12
<Fun> 1,2,3,4,5,6,7,8,9,10,11,12
<Sad> 1,2,3,4,5,6,7,8,9,10,11,12
<Irritation> 1,2,3,4,5,6,7,8,9,10,11,12
<Fear> 1,2,3,4,5,6,7,8,9,10,11,12
<I don't want to show my kids> 1,2,3,4,5,6,7,8,9,10,11,12

12個の数字のデータが集計中のデータである(ここで示した数字に意味はない)。一番左のデータが3分前、一番右のデータが現在のデータである。次数を3としているので、3分以上前のデータは切り捨てている。集計データには、このようなデータが放送中の番組の数だけ含まれることになる。
受信リンク数を4に設定すれば、60秒間にこのような集計データを4つ受け取る。受け取った集計データは、同じ番組同士でデータを加算することによって集計を行う。この集計結果から番組の人気度、満足度、感情、性質の値を算出することになる。
放送中の番組ではなく録画番組に関するデータの場合は、3分間のデータではなく、番組の開始から終了までのデータを集計する必要がある。なぜなら、録画番組はバラバラに視聴されるからである。録画番組Aを30分前から視聴しているユーザもいれば、録画番組Aを視聴し始めたばかりのユーザもいるかもしれない。その場合、3分間のデータだけ集計しても意味がない。
なお、録画番組に関しては、その番組が放送中のときに視聴データをすでに集計しているはずである。よって、録画番組として集計して得たデータは、すでに持っている放送中に集計された集計結果に足し合わせていくことになる。これによって、例えば放送中のときは満足度が低かったが、数ヶ月経ってみたら満足度がとても高くなっていた、ということも起こり得る。
一方未放送の番組に関しては、そもそも期待度しか計算できないので、集計するデータは<サンプル数>と<期待度>のデータのみになる。
上に示したデータにユーザプロフィールと共通ジャンルを導入すると集計データは以下のようになる。 The twelve numbers in each row are the data being aggregated (the values shown here are placeholders with no meaning). The leftmost value is from 3 minutes ago and the rightmost is the current value. Because the order is 3, data older than 3 minutes is discarded. The aggregated data contains one such set of rows for each program being broadcast.
If the number of receiving links is set to 4, a node receives four such sets of aggregated data every 60 seconds. The received data is aggregated by adding the values for the same program together. The popularity, satisfaction, emotion, and nature values of the program are then calculated from the aggregated result.
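The per-program addition can be sketched as follows. The nested-dictionary layout and field names are hypothetical, and the sketch covers only the addition step, not how the window is advanced between transmissions.

```python
RECORD_INTERVAL = 15    # seconds between viewing-history records (from the text)
SEND_INTERVAL = 60      # seconds between data transmissions (from the text)
ORDER = 3               # order of the aggregated data (from the text)
WINDOW = ORDER * SEND_INTERVAL // RECORD_INTERVAL   # 12 data points per row

def merge(received):
    """Add up aggregated data from several links, program by program."""
    merged = {}
    for data in received:            # data: program_id -> field -> list of counts
        for program_id, fields in data.items():
            target = merged.setdefault(program_id, {})
            for field, series in fields.items():
                current = target.setdefault(field, [0] * WINDOW)
                target[field] = [a + b for a, b in zip(current, series)]
    return merged

msg1 = {"ABC": {"samples": [1] * 12, "viewers": [0] * 11 + [1]}}
msg2 = {"ABC": {"samples": [2] * 12, "viewers": [1] * 12},
        "XYZ": {"samples": [1] * 12}}
merged = merge([msg1, msg2])
print(merged["ABC"]["viewers"])
```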
For recorded programs, as opposed to programs being broadcast, data must be aggregated from the start to the end of the program rather than over a 3-minute window, because users watch recorded programs at different times. One user may have been watching recorded program A for 30 minutes while another has only just started. Aggregating only 3 minutes of data would then be meaningless.
For a recorded program, viewing data should already have been aggregated when the program was broadcast. The data aggregated for the recording is therefore added to the results already held from the broadcast. As a result, a program whose satisfaction was low during broadcast may, for example, turn out to have very high satisfaction several months later.
For programs not yet broadcast, only the expectation can be calculated in the first place, so the only data aggregated are <number of samples> and <expectation>.
If a user profile and a common genre are introduced into the data shown above, the aggregated data is as follows.

＜番組ID＞ ABC
＜サンプル数＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜視聴人数＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜男性＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜女性＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜１０代＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜２０代＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜共通ジャンル1：SF＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜共通ジャンル2：サッカー＞ 1,2,3,4,5,6,7,8,9,10,11,12
：
＜満足度＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜男性＞ 1,2,3,4,5,6,7,8,9,10,11,12
＜女性＞ 1,2,3,4,5,6,7,8,9,10,11,12
： <Program ID> ABC
<Number of samples> 1,2,3,4,5,6,7,8,9,10,11,12
<Number of viewers> 1,2,3,4,5,6,7,8,9,10,11,12
<Male> 1,2,3,4,5,6,7,8,9,10,11,12
<Women> 1,2,3,4,5,6,7,8,9,10,11,12
<Teens> 1,2,3,4,5,6,7,8,9,10,11,12
<20's> 1,2,3,4,5,6,7,8,9,10,11,12
<Common Genre 1: SF> 1,2,3,4,5,6,7,8,9,10,11,12
<Common Genre 2: Soccer> 1,2,3,4,5,6,7,8,9,10,11,12
:
<Satisfaction> 1,2,3,4,5,6,7,8,9,10,11,12
<Male> 1,2,3,4,5,6,7,8,9,10,11,12
<Women> 1,2,3,4,5,6,7,8,9,10,11,12
:

導入するプロフィール、共通ジャンルの数は、データのサイズ(トラフィック)がどれだけ増えるかを考慮して決める必要がある。録画番組のデータを集計する場合は、番組数が決まっている放送中の番組と違って番組数が無数になるため、特にデータのサイズに注意する必要がある。
そもそも無数に存在する録画番組をすべて集計することは不可能である。例えば、1万人分のデータを集めた場合、最悪1万個の録画番組データを集計することになる。このような場合は、データ集計中に番組数が一定の数に達したら番組データを切り詰める、という処理を行う。ユーザにとって必要な番組のデータは残し、必要でない番組のデータは捨ててしまうようにする。ユーザにとって必要な番組とは、何らかの理由でその番組が放送中のときに十分な視聴データを集められなかったものである。逆に必要でない番組は、すでに十分な視聴データを集められている番組である。
録画番組の集計データは、たとえ番組数を切り詰めても放送中番組の集計データに比べるとかなり大きくなってしまう。しかし放送中番組と違って、録画番組にはリアルタイム性は必要ない。よって、放送中番組と録画番組で同じデータ送信間隔を設定する必要はなく、放送中番組のデータ送信間隔は短く設定し、録画番組のデータ送信間隔は長く設定することで、トラフィックが抑えられ効率的な情報集計が可能になる。 The number of profiles and common genres to be introduced must be determined in consideration of how much the data size (traffic) will increase. When counting the data of recorded programs, the number of programs is innumerable unlike a broadcast program in which the number of programs is determined, so it is necessary to pay particular attention to the size of the data.
In the first place, it is impossible to aggregate data for every one of the countless recorded programs. For example, if data for 10,000 people is collected, data for as many as 10,000 recorded programs may have to be aggregated in the worst case. In such a case, the program data is truncated once the number of programs reaches a fixed limit during aggregation. Data for programs the user needs is kept, and data for programs the user does not need is discarded. Programs the user needs are those for which, for some reason, sufficient viewing data could not be collected during broadcast. Conversely, programs that are not needed are those for which sufficient viewing data has already been collected.
Even with the number of programs truncated, the aggregated data for recorded programs is considerably larger than that for programs being broadcast. Unlike programs being broadcast, however, recorded programs do not require real-time data. The same data transmission interval therefore need not be used for both: setting a short interval for programs being broadcast and a long interval for recorded programs keeps traffic down and allows efficient information aggregation.
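One possible reading of the truncation rule is sketched below. Using the number of samples already held as the measure of "sufficient viewing data" is an assumption, as are the function name and the limit value.

```python
def truncate_programs(samples_held: dict, max_programs: int) -> dict:
    """Keep the programs with the least viewing data collected so far.

    samples_held maps a program ID to the number of samples already
    aggregated for it (assumed measure of how much data is held).
    """
    ranked = sorted(samples_held, key=lambda pid: samples_held[pid])
    return {pid: samples_held[pid] for pid in ranked[:max_programs]}

held = {"A": 5000, "B": 12, "C": 300, "D": 40}
kept = truncate_programs(held, max_programs=2)
print(kept)
```

Programs B and D, which have the fewest samples, are kept, while the well-covered programs A and C are dropped from the recorded-program data.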

（画面インタフェース）
次に、実際のTV視聴シーンにおける番組推薦インタフェースについて説明する。
（ブラウジング）
ブラウジングとはユーザが情報の一覧を眺めながら欲しい情報を取り出すというものである。ブラウジングを効率的に行うには、情報の視覚化が有効である。
本システムでは、各番組ごとにその番組のジャンルとバリューをグラフで表示することで、ユーザはどの番組が自分の好みと合っていて、どの番組がどんなユーザから高い評価を得ているかということが一目で分かるようにする。その結果ユーザは簡単に見たい番組を探せるようになる。このようなブラウジング主体の表示は、本システムでは主に3つのシーンに分かれる。1つ目は番組視聴中の表示、2つ目は録画番組一覧での表示、3つ目は番組表閲覧時の表示である。 (Screen interface)
Next, a program recommendation interface in an actual TV viewing scene will be described.
(browsing)
Browsing means that the user picks out the desired information while looking over a list of information. Visualizing the information is effective for efficient browsing.
In this system, the genre and values of each program are displayed as graphs so that the user can see at a glance which programs match his or her preferences and which programs are rated highly by which kinds of users. As a result, the user can easily find a program to watch. This browsing-oriented display is used mainly in three scenes: the first is while viewing a program, the second is in the recorded program list, and the third is when viewing the program guide.

1つ目の番組視聴中の表示例を図１２に示す。また、現在試作中のプロトタイプシステムの画面を図１３に示す。番組放送画面では図１２のように、画面の右隅に視聴中の番組および他チャンネルの番組のジャンルとその瞬間のバリュー(ここでは人気度)を表示する。ユーザは現在の番組を視聴しながら、他チャンネルの番組のバリューが刻々と変化する様子を確認できる。さらに画面下には視聴中番組の人気度(視聴率)の変遷やその他のバリュー(満足度や感情など)を表示する。他のユーザが満足ボタンや感情ボタンなどを押せばその結果がほぼリアルタイムに画面下のグラフに反映され、他のユーザが今の番組に対してどのように思っているのかが確認できる。特にスポーツ番組などは、盛り上がったシーンでボタン入力することで他のユーザと一体感を感じることができる。また、もし今の番組に満足できないようであれば、他チャンネルのジャンルとバリューを確認して、自分の好みの番組でかつ他のユーザからの評価も高い番組を見つければよい。 A display example during viewing of the first program is shown in FIG. FIG. 13 shows a screen of a prototype system currently being prototyped. On the program broadcast screen, as shown in FIG. 12, the genre and the instantaneous value (popularity here) of the program being viewed and the program of other channels are displayed in the right corner of the screen. While viewing the current program, the user can check how the value of the program of the other channel changes every moment. Furthermore, the transition of the popularity (viewing rate) of the program being viewed and other values (satisfaction and emotions) are displayed at the bottom of the screen. If another user presses the satisfaction button or the emotion button, the result is reflected in the graph at the bottom of the screen almost in real time, so that it can be confirmed how the other user thinks about the current program. In particular, sports programs and the like can feel a sense of unity with other users by inputting buttons in a lively scene. If you are not satisfied with the current program, you can check the genre and value of other channels and find a program that you like and that is highly appreciated by other users.

2つ目の録画番組一覧での表示と3つ目の番組表閲覧時の表示は、番組視聴中に表示した情報を各番組に対して表示するだけである。すなわち、一覧の各番組の横にその番組のジャンルとバリューの棒グラフを表示し、一目で各番組のジャンルとバリューの値が分かるようにする。さらにどれかの番組を選択した場合には、その番組のバリューの変遷を画面下に表示することによって、ユーザはどの番組のどのシーンが面白いのかを簡単に見つけることができる。これによってユーザは本当に面白いシーンだけを”つまみ食い”することが可能になる。 The display in the second recorded program list and the display at the time of viewing the third program guide only display the information displayed during program viewing for each program. That is, a bar graph of the genre and value of the program is displayed next to each program in the list so that the genre and value of each program can be understood at a glance. Furthermore, when any program is selected, the transition of the value of the program is displayed at the bottom of the screen, so that the user can easily find out which scene of which program is interesting. This allows the user to “pinch” only really interesting scenes.

（番組推薦インタフェース）
ブラウジングは番組一覧の中から見たい番組を探すというものであった。しかし、ユーザに”こんな番組が見たい”という曖昧な要求がある場合は、それに即した番組推薦を行うべきである。そこで本実施の形態では図１４のような番組推薦インタフェースを提案する。 (Program recommendation interface)
Browsing means searching a program list for a program to watch. However, when the user has only a vague request such as "I want to watch this kind of program", the system should recommend programs that match it. This embodiment therefore proposes the program recommendation interface shown in FIG. 14.

ユーザは周囲に配置されたそれぞれのチェックボックスをチェックすることで、どんな番組を推薦して欲しいか指定する。図１４の例では、20代の男性の中で人気度、満足度、好み(ジャンル)、感情(楽しい)の度合いが高い番組を推薦してもらおうとしている。推薦結果の番組は中央にその評価の高い順に表示される。このようにしてユーザは単純なブラウジングよりも効率的に自分の見たい番組を見つけることができる。また、ここで設定した条件をブラウジングの画面表示に適用することも有効である。例えば、番組視聴中 (図１２)、録画番組一覧、番組表閲覧時におけるバリューとして、”20代の男性の中での人気度、満足度、感情(楽しい)の総合値”を表示すると、ブラウジング中でもユーザは効率的に自分の見たい番組を見つけることが出来るようになる。 The user specifies what kind of program should be recommended by ticking the check boxes arranged around the screen. In the example of FIG. 14, the user asks for programs that rate highly in popularity, satisfaction, preference (genre), and emotion (fun) among men in their twenties. The recommended programs are displayed in the center in descending order of evaluation. In this way the user can find a program to watch more efficiently than by simple browsing. It is also effective to apply the conditions set here to the browsing screens. For example, if the "total value of popularity, satisfaction, and emotion (fun) among men in their twenties" is displayed as the value while viewing a program (FIG. 12), in the recorded program list, and in the program guide, the user can efficiently find a program to watch even while browsing.
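The selection made in FIG. 14 can be read as a filter on a viewer group followed by a ranking. The following is a minimal sketch under assumptions that are not in the original: the record layout, the field names, the sample figures, and the use of a plain sum as the "total value" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProgramValues:
    title: str
    # values[group][value_name] -> score in [0, 1]; this layout is hypothetical
    values: dict


def recommend(programs, group, checked_values):
    """Rank programs by the sum of the checked values within one viewer group."""
    def score(program):
        group_values = program.values.get(group, {})
        return sum(group_values.get(name, 0.0) for name in checked_values)
    return sorted(programs, key=score, reverse=True)


# Made-up sample data for illustration only
programs = [
    ProgramValues("News", {"male_20s": {"popularity": 0.30, "satisfaction": 0.20, "fun": 0.10}}),
    ProgramValues("Comedy", {"male_20s": {"popularity": 0.25, "satisfaction": 0.40, "fun": 0.60}}),
    ProgramValues("Drama", {"male_20s": {"popularity": 0.20, "satisfaction": 0.35, "fun": 0.30}}),
]
ranked = recommend(programs, "male_20s", ["popularity", "satisfaction", "fun"])
print([p.title for p in ranked])
```

The same `recommend` call could supply the value shown on the browsing screens, since it only depends on the checked conditions.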

（シミュレーション実験）
次に、以上に提案した情報集計アルゴリズムのシミュレーション実験を行い、その性能について評価する。 (Simulation experiment)
Next, we conduct a simulation experiment of the information aggregation algorithm proposed above and evaluate its performance.

（実験の目的）
提案した情報集計アルゴリズムの評価をするために何十万人ものユーザで実験することは不可能である。そこで本実験では、シミュレーションによってトラフィックや各ノードにおける集計結果取得の様子および集計結果の精度などを調べることで情報集計アルゴリズムの評価を行う。そのために作成したシミュレータの概要を以下に示す。
・多数の仮想ノード間で提案した情報集計アルゴリズムと同等の処理を行う
・その際、各ノードは接続相手(受信リンク先ノード)を全ノードの中からランダムに選択する
・最大リンク数、次数、データ送信間隔は全ノードで共通である
・データの送信は全ノードが同期して一斉に行う
・データ送信間隔をシミュレーションの1ステップとする
・1ステップごとに決められた接続試行回数だけ他のノードに接続を試みる
・1ステップごとにノードの離脱率および参加率を設定できる
・1ステップごとにユーザの視聴行動を操作できる
(例えば番組Aを100人が視聴し、番組Bを200人が視聴している、という風に設定する)
・各ノードにおいて、1ステップごとに集計したデータから番組の各種バリューを算出することができる
このシミュレータは、情報集計の際のネットワークの分析と、各ノードにおけるバリューの算出可能性を確認できるように設計した。 (Purpose of the experiment)
It is not feasible to evaluate the proposed information aggregation algorithm in an experiment with hundreds of thousands of real users. In this experiment, therefore, the algorithm is evaluated by simulation, examining the traffic, how aggregation results are obtained at each node, and the accuracy of those results. An overview of the simulator created for this purpose follows.
・ Processing equivalent to the proposed information aggregation algorithm is performed among a large number of virtual nodes.
・ Each node randomly selects its connection partners (reception-link destination nodes) from among all nodes.
・ The maximum number of links, the degree, and the data transmission interval are common to all nodes.
・ All nodes transmit data simultaneously, in synchronization.
・ The data transmission interval is one simulation step.
・ In each step, a node attempts to connect to other nodes up to a predetermined number of connection attempts.
・ The node departure rate and participation rate can be set for each step.
・ The users' viewing behavior can be controlled for each step (for example, 100 people watch program A and 200 people watch program B).
・ Each node can calculate the various program values from the data aggregated in each step.
The simulator was designed so that the network can be analyzed during aggregation and the feasibility of calculating values at each node can be checked.

（接続リンク数）
上記したように、最大受信リンク数は4に設定した方が良いことが分かったが、最大送信リンク数は未定であった。最大受信リンク数と最大送信リンク数を共に4と設定すれば、理論的には全てのノードが受信リンクおよび送信リンクを4つずつ張れることになる。しかし実際には送信リンクに余裕があるノードをなかなか見つけられないノードが出てきてしまい、そういったノードは4つ全ての受信リンクを張ることは難しい。そうなると、それらのノードでは収集データ量が減少し、視聴率の精度の低下を招く。
この問題は、最大送信リンク数を最大受信リンク数よりも大きな値に設定することで解決できる。しかし逆に最大送信リンク数が大きすぎると、多数の送信リンクを保持するノードとほとんど送信リンクを保持しないノードが出てきて、ネットワークに偏りが生じる。このような偏りがあると、特定のノードのデータばかり多く重複して集計するといった事態が起きてしまう。これも視聴率の精度を低下させる要因である。よって、最大送信リンク数は最大受信リンク数よりも大きく、且つ出来るだけ小さい値が良いと考えられる。
以上のことを踏まえ、各ノードの受信リンク数が最大送信リンク数によってどのように変化するかを実験によって確かめた。図１５〜図１９は、ノード数を100000、各ノードの最大受信リンク数を4、他のノードへの接続試行回数を12(すなわち1ステップごとに12個のノードへ受信リンクを張ることを試みる)、としたときの各ノードの受信リンク数を調べたものである。値はシミュレーションを10ステップ行ったときの平均値として算出している。なお、1ステップごとに約1%のノードがネットワークから離脱およびネットワークへ参加する設定とした。 (Number of connected links)
As described above, it was found that the maximum number of reception links should be set to 4, but the maximum number of transmission links had not yet been determined. If both maxima are set to 4, then in theory every node can establish four reception links and four transmission links. In practice, however, some nodes have difficulty finding a node with spare transmission links, and such nodes struggle to establish all four reception links. The amount of data collected at those nodes then decreases, which lowers the accuracy of the audience rating.
This problem can be solved by setting the maximum number of transmission links higher than the maximum number of reception links. Conversely, if the maximum number of transmission links is too large, some nodes hold many transmission links while others hold almost none, and the network becomes unbalanced. With such an imbalance, the data of particular nodes is aggregated in duplicate many times, which also lowers the accuracy of the audience rating. The maximum number of transmission links should therefore be larger than the maximum number of reception links, yet as small as possible.
Based on the above, an experiment was run to see how the number of reception links at each node changes with the maximum number of transmission links. FIGS. 15 to 19 show the number of reception links at each node when the number of nodes is 100,000, the maximum number of reception links per node is 4, and the number of connection attempts to other nodes is 12 (that is, each node attempts to establish reception links to 12 nodes per step). The values are averages over 10 simulation steps. In each step, about 1% of the nodes were set to leave the network and about 1% to join it.

図１５〜図１９のグラフを見ると、最大送信リンク数を最大受信リンク数よりも1増やすだけで、受信リンクを4つ保持しているノードは急激に増加し、逆に受信リンクが3つ以下のノードは急激に減少していることが分かる。また、最大送信リンク数を6以上に設定してもそれほど大きな変化は見られなかった。この事から、最大送信リンク数は最大受信リンク数よりも1だけ多い5と設定するのが良いと考えられる。あるいは、通常は最大送信リンク数＝最大受信リンク数としておき、各ノードの接続失敗回数に応じて動的に最大送信リンク数を変化させることも考えられる。これは例えば、接続試行回数12回の中で10回他のノードに接続を試みても受信リンクが1しか張れなかったノードに対しては、通常最大送信リンク数4のところを5に増やして特別に接続を許可する、というものである。 The graphs in FIGS. 15 to 19 show that merely raising the maximum number of transmission links to one more than the maximum number of reception links sharply increases the number of nodes holding four reception links and sharply decreases the number of nodes holding three or fewer. Setting the maximum number of transmission links to 6 or more produced no further significant change. From this, it is considered best to set the maximum number of transmission links to 5, one more than the maximum number of reception links. Alternatively, the maximum number of transmission links could normally be kept equal to the maximum number of reception links and changed dynamically according to the number of connection failures at each node. For example, for a node that has used 10 of its 12 connection attempts and has established only one reception link, the usual maximum of 4 transmission links would be raised to 5 so that the connection is specially permitted.
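The effect of the maximum number of transmission links can be explored with a small simulation. The sketch below is not the simulator used for FIGS. 15 to 19: it runs a single step with no node departures, processes nodes one after another, uses far fewer nodes, and all function names are hypothetical. It only illustrates how a spare transmission link lets more nodes fill all four reception links.

```python
import random


def simulate_links(n_nodes, max_recv, max_send, attempts, seed=0):
    """Return, for each node, the number of reception links it established."""
    rng = random.Random(seed)
    recv = [set() for _ in range(n_nodes)]  # peers each node receives from
    send_count = [0] * n_nodes              # transmission links held by each node
    for node in range(n_nodes):
        for _ in range(attempts):
            if len(recv[node]) >= max_recv:
                break
            peer = rng.randrange(n_nodes)
            if peer == node or peer in recv[node]:
                continue                    # counted as a failed attempt
            if send_count[peer] < max_send:
                recv[node].add(peer)
                send_count[peer] += 1
    return [len(r) for r in recv]


def full_fraction(counts, max_recv):
    """Fraction of nodes holding the maximum number of reception links."""
    return sum(1 for c in counts if c == max_recv) / len(counts)


for max_send in (4, 5, 6):
    counts = simulate_links(2000, max_recv=4, max_send=max_send, attempts=12)
    print(max_send, round(full_fraction(counts, 4), 3))
```

The dynamic variant described above could be sketched by raising `max_send` for a peer only when the requesting node has nearly exhausted its attempts; that rule is left out here to keep the example short.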

（データ収集量）
次に、ノードのネットワークからの離脱やネットワークへの参加を考慮したときの各ノードの収集データ量を調べる。
図２０は、ノード数を100000、最大受信リンク数を4、最大送信リンク数を5、次数を8、他のノードへの接続試行回数を12としたときのシミュレーション結果である。縦軸が収集データ量、横軸がノード離脱率およびノード参加率を表している。ノード離脱率および参加率は、全ノードのうち何%のノードがネットワークから離脱または参加するかを表したものである。ここではシミュレーションを1ステップ進める度にノードの離脱および参加が起こるものとした。なお、ノード離脱率とノード参加率は同じ値に設定した。
図２０は、シミュレーションを10ステップ実行したときの、任意のノードの第8次データ量の平均を示している。このときの理想の収集データ量は4^8＝65536であるが、ノード離脱率1%で60000以上、ノード離脱率5%でも収集データ量が40000以上と、相当数のデータを収集できていることが分かる。 (Data collection amount)
Next, the amount of data collected at each node is examined when considering the departure of the node from the network and the participation in the network.
FIG. 20 shows simulation results when the number of nodes is 100,000, the maximum number of received links is 4, the maximum number of transmission links is 5, the degree is 8, and the number of connection attempts to other nodes is 12. The vertical axis represents the collected data amount, and the horizontal axis represents the node withdrawal rate and node participation rate. The node leave rate and participation rate represent what percentage of all nodes leave or join the network. Here, it is assumed that the node leaves and joins each time the simulation is advanced by one step. The node leaving rate and the node participation rate were set to the same value.
FIG. 20 shows the average amount of eighth-order data at an arbitrary node when the simulation is run for 10 steps. The ideal amount of collected data here is 4^8 = 65536; the amount collected is 60,000 or more at a node departure rate of 1% and 40,000 or more even at 5%, which shows that a substantial amount of data can be collected.
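As a rough check on these figures, writing K for the maximum number of reception links and d for the degree (shorthand introduced here), and using only the approximate lower bounds quoted in the text rather than values read from FIG. 20:

```latex
\[
K^{d} = 4^{8} = 65536, \qquad
\frac{60000}{65536} \approx 0.92, \qquad
\frac{40000}{65536} \approx 0.61
\]
```

That is, at least about 92% of the ideal amount is collected at a departure rate of 1%, and at least about 61% at 5%.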

（ネットワークトラフィック）
図２１に示したグラフは、上記と同じ条件でシミュレーションを行ったときの、1ステップ当たりの各ノードの平均データ送信回数である。
最大受信リンク数を4としているので各ノードは平均して4つの送信リンクを持つことになり、理論的には、各ノードは1ステップ当たり4回データを送信することになる。離脱率変えた場合の結果が図２１であるが、離脱率が増えてもデータ送信回数はそれほど減少していないため、安定してデータ集計が行われているということでもある。
シミュレーションの1ステップをs [sec]とし、データのサイズをm [bit]とすれば、各ノードに要求される通信帯域幅は約4m/s [bps]と見積もることができる。例えば、s=10 [sec]、m=100K [byte]ならば、通信量は320K [bps]となる。よって、データサイズが大きいとネットワーク回線を圧迫してしまう恐れがあるため、データを圧縮したり、必要な情報のみ集計するような工夫が必要である。 (Network traffic)
The graph shown in FIG. 21 shows the average data transmission count of each node per step when the simulation is performed under the same conditions as described above.
Since the maximum number of received links is 4, each node has four transmission links on average, and theoretically, each node transmits data four times per step. FIG. 21 shows the result when the withdrawal rate is changed. However, even if the withdrawal rate increases, the number of data transmissions does not decrease so much, so that data aggregation is performed stably.
If one simulation step is s [sec] and the data size is m [bit], the communication bandwidth required of each node can be estimated at about 4m/s [bps]. For example, if s = 10 [sec] and m = 100K [byte] (800K [bit]), the traffic is 320K [bps]. A large data size may therefore strain the network connection, so measures such as compressing the data or aggregating only the necessary information are needed.
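The estimate can be reproduced with a few lines of arithmetic. This is a sketch only; the function name is hypothetical, and it assumes 1K = 1000 and 8 bits per byte, which the text does not state explicitly.

```python
def required_bandwidth_bps(links, data_size_bits, step_seconds):
    """Approximate per-node bandwidth: `links` transmissions of one data unit per step."""
    return links * data_size_bits / step_seconds


# Example from the text: s = 10 s, m = 100 Kbytes
m_bits = 100 * 1000 * 8
bandwidth = required_bandwidth_bps(links=4, data_size_bits=m_bits, step_seconds=10)
print(bandwidth)  # 320000.0, i.e. 320 Kbps
```

Halving the data size or doubling the step length halves the required bandwidth, which is why compression and selective aggregation are suggested.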

（データ重複率）
続いて、ノード数によってデータの重複率がどのように変化するか調べる。
シミュレーションは、受信リンク数4、送信リンク数5、次数8、接続試行回数12、ノード離脱率(=ノード参加率)1%という条件で行う。重複率は第8次データに関する重複率を調べるものとする。集計結果の第8次データは8ホップ先のノードの情報を集めたものである。よって、特定のノードから受信リンクを辿って8ホップ先のノードがどれだけ重複しているかを調べれば、その特定ノードの第8次データのだいたいの重複率を見積もることができる。もしノードの離脱、参加がなくネットワークのリンク構造が変化しなければ、求めた重複率は正確なデータ重複率となる。図２２はノード数を変化させてシミュレーションを行ったときの重複率である。 (Data duplication rate)
Subsequently, it is examined how the data duplication rate changes depending on the number of nodes.
The simulation is run with 4 reception links, 5 transmission links, degree 8, 12 connection attempts, and a node departure rate (= node participation rate) of 1%. The duplication rate examined is that of the eighth-order data. The eighth-order data in the aggregate collects the information of nodes 8 hops away. Therefore, by following the reception links from a given node and checking how many of the nodes 8 hops away are duplicates, the duplication rate of that node's eighth-order data can be roughly estimated. If no nodes leave or join and the link structure of the network does not change, the rate obtained is the exact data duplication rate. FIG. 22 shows the duplication rate when the simulation is run with different numbers of nodes.
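The estimation procedure described here, following reception links hop by hop and counting repeated nodes, can be sketched as follows. The definition of the rate as one minus the share of distinct endpoints is an assumption, as are the function names; the sketch also ignores the transmission-link limit and node churn, and uses 6 hops on small networks to keep it quick.

```python
import random
from collections import Counter


def build_receive_links(n_nodes, k, rng):
    """Each node picks k distinct random peers (other than itself) as reception links."""
    links = []
    for node in range(n_nodes):
        peers = set()
        while len(peers) < k:
            peer = rng.randrange(n_nodes)
            if peer != node:
                peers.add(peer)
        links.append(list(peers))
    return links


def duplication_rate(links, start, hops):
    """Share of path endpoints `hops` away that repeat an already-counted node."""
    frontier = Counter({start: 1})         # node -> number of paths ending there
    for _ in range(hops):
        nxt = Counter()
        for node, paths in frontier.items():
            for peer in links[node]:
                nxt[peer] += paths
        frontier = nxt
    total = sum(frontier.values())          # equals k ** hops
    return 1 - len(frontier) / total


rng = random.Random(0)
for n in (500, 5000):
    links = build_receive_links(n, k=4, rng=rng)
    print(n, round(duplication_rate(links, start=0, hops=6), 3))
```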

この結果から、ノード数がn倍になるとデータ重複率は約1/n倍になっていることが分かる。よって、大規模なネットワークでは重複率はほとんど無視できることが分かった。本実施の形態では、集計結果としてデータ重複の影響をほとんど受けない比率のデータを求めていたが、大規模なネットワークではデータ重複がほとんど生じないことが確認できたため、そのようなネットワークでは比率以外のデータも取得できる可能性がある。 This result shows that when the number of nodes is multiplied by n, the data duplication rate falls to about 1/n. The duplication rate is therefore almost negligible in a large-scale network. In this embodiment, the aggregate was obtained as ratio data, which is hardly affected by data duplication; since it was confirmed that almost no duplication occurs in a large-scale network, it may also be possible to obtain data other than ratios in such a network.

（精度）
ここでは集計で得られた視聴率の精度を調べる。
まずは、各ノードで得られる視聴率の精度が、時間の経過によってどのように変化するかを確かめる。シミュレーションの条件は7.5節と同じである。ただしノード数は10万とし、ユーザの視聴状況も以下のように設定した。カッコ内の数値はその番組の真の視聴率である。
・番組A：5000人（視聴率5%）
・番組B：10000人（視聴率10%）
・番組C：20000人（視聴率20%）
・番組D：50000人（視聴率50%）
・未視聴：15000人
なお、ユーザはシミュレーションの途中で視聴番組を変更しないものとした。
このシミュレーション結果を表７、表８と図２３に示す。表７は、シミュレーションの任意のステップにおいて、ランダムに抽出した100ノードの各次数データによる視聴率の平均をまとめたものである。表８は、そのうちの第8次データについての基本的な統計量を計算したものである。また図２３は、100ノードの各次数のデータによる視聴率の標準偏差をグラフにしたものである。 (accuracy)
Here, the accuracy of the audience rating obtained by the aggregation is examined.
First, it is confirmed how the accuracy of the audience rating obtained at each node changes with the passage of time. The simulation conditions are the same as in Section 7.5. However, the number of nodes was 100,000, and the user's viewing situation was set as follows. The number in parentheses is the true audience rating of the program.
・ Program A: 5000 people (viewing rate 5%)
・ Program B: 10000 people (viewing rate 10%)
・ Program C: 20000 people (20% audience rating)
・ Program D: 50000 people (viewing rate 50%)
・ Not viewing: 15,000 people
Note that users do not change the program they are viewing during the simulation.
The simulation results are shown in Tables 7 and 8 and FIG. 23. Table 7 summarizes the average audience rating computed from the data of each degree at 100 randomly selected nodes, at an arbitrary step of the simulation. Table 8 gives basic statistics for the eighth-order data among them. FIG. 23 graphs the standard deviation of the audience rating computed from the data of each degree at the 100 nodes.

表７及び８から、ノードの離脱があっても第8次データを用いて計算すれば十分な精度の視聴率が得られることを確認できる。図２３では、次数の増加によって視聴率の精度が向上している様子が分かる。また、第6次データや第7次データでも良い精度の視聴率が得られているため、8ステップかけて第8次データを得なくても、6ステップまたは7ステップで視聴率を求めることもできる。 Tables 7 and 8 confirm that, even with node departures, an audience rating of sufficient accuracy is obtained when it is calculated from the eighth-order data. FIG. 23 shows that the accuracy of the audience rating improves as the degree increases. Because the sixth-order and seventh-order data also give audience ratings of good accuracy, the rating can be obtained in 6 or 7 steps instead of taking 8 steps to obtain the eighth-order data.
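One way to see why higher-degree data gives better accuracy is the standard error of a sample proportion. The following assumes, as a simplification that the source does not state, that degree-d data behaves like n = 4^d independent samples of a program whose true audience rating is p:

```latex
\[
\sigma = \sqrt{\frac{p(1-p)}{n}}, \qquad
p = 0.5:\quad
\sigma_{n=4^{6}} = \frac{0.5}{64} \approx 0.0078, \qquad
\sigma_{n=4^{8}} = \frac{0.5}{256} \approx 0.0020
\]
```

Under this assumption, each additional degree halves the standard deviation. Node departures and duplicated data reduce the effective n, so the deviations measured in FIG. 23 may be larger than these idealized values.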

（集計結果取得の様子）
本実施の形態で提案した情報集計アルゴリズムは、時間の経過とともに集計結果の精度が上がっていくという特長がある。ここでは、時間軸に沿ってユーザの視聴行動を制御し、そのような集計結果の変化を確認する。
シミュレーションの条件は、ノード数10000、最大受信リンク数4、最大送信リンク数5、次数8、接続試行回数12、ノード離脱率およびノード参加率1%、データ送信間隔10[sec]とする。また、ユーザの視聴行動は以下に示すように設定する。 (State of collecting results)
The information aggregation algorithm proposed in the present embodiment has a feature that the accuracy of the aggregation result increases with the passage of time. Here, the viewing behavior of the user is controlled along the time axis, and such a change in the counting result is confirmed.
The simulation conditions are as follows: the number of nodes is 10,000, the maximum number of received links is 4, the maximum number of transmitted links is 5, the degree is 8, the number of connection attempts is 12, the node leaving rate and the node participation rate are 1%, and the data transmission interval is 10 [sec]. The viewing behavior of the user is set as shown below.

＜1〜3ステップの番組視聴状況＞
・番組A：500人（視聴率5%）
・番組B：1000人（視聴率10%）
・番組C：2000人（視聴率20%）
・番組D：5000人（視聴率50%）
・未視聴：1500人 <1 to 3 steps program viewing status>
・ Program A: 500 (viewing rate 5%)
・ Program B: 1000 people (viewing rate 10%)
・ Program C: 2000 people (20% audience rating)
・ Program D: 5,000 people (viewing rate 50%)
・ Not watched: 1500 people

＜4〜5ステップの番組視聴状況＞
・番組A：1500人（視聴率15%）
・番組B：800人（視聴率8%）
・番組C：2500人（視聴率25%）
・番組D：4000人（視聴率40%）
・未視聴：1200人 <4-5 steps program viewing status>
・ Program A: 1500 people (15% audience rating)
・ Program B: 800 people (viewing rate 8%)
・ Program C: 2500 people (viewing rate 25%)
・ Program D: 4000 people (40% audience rating)
・ Not watched: 1200 people

＜6〜ステップの番組視聴状況＞
・番組A：2000人（視聴率20%）
・番組B：900人（視聴率9%）
・番組C：2200人（視聴率22%）
・番組D：3800人（視聴率38%）
・未視聴：1100人 <Program viewing status from step 6 onward>
・ Program A: 2000 people (20% audience rating)
・ Program B: 900 people (9% audience rating)
・ Program C: 2200 people (viewing rate 22%)
・ Program D: 3800 people (viewing rate 38%)
・ Not watched: 1100 people

上記において、カッコ内に示したのは、その時点での真の視聴率である。このシミュレーションの結果を図２４〜図２７に示す。図はシミュレーションの8ステップ目、10ステップ目、12ステップ目、14ステップ目に得られる任意のノードの集計結果をグラフで表したものである。横軸は時刻を表しており、右端が現在の時刻となる。また、グラフ中の黒い縦線は現在から6ステップ前の時刻、赤い縦線は現在から8ステップ前の時刻を示している。なお、紫で表示された”E”のグラフはTVを視聴していないユーザの割合である。 In the above, the figures in parentheses are the true audience ratings at that time. The results of this simulation are shown in FIGS. 24 to 27. The figures graph the aggregation results obtained at an arbitrary node at the 8th, 10th, 12th, and 14th steps of the simulation. The horizontal axis represents time, with the right end being the current time. The black vertical line in each graph marks the time 6 steps before the present, and the red vertical line marks the time 8 steps before the present. The graph labeled "E", drawn in purple, is the proportion of users who are not watching TV.

番組Dのグラフに注目する。番組Dの真の視聴率は、tを時間とすると、0≦t＜30のとき50%、30≦t＜50のとき40%、50≦tのとき38%である。
図２４〜図２７の各図において、現在から6ステップ前の黒い縦線よりも右側のデータは、サンプル数が少なく信頼できるデータではない。
図２４は、8ステップ古いt=0での視聴率は精度よく求められているが、6ステップ古いt=20の視聴率は誤差が1.5%以上と大きくなってしまっている。しかし、その2ステップ後の図２５では、誤差が0.2%程度と小さくなっていることが確認できる。同様に、図２５のt=40における視聴率も誤差が約2.7%と大きいが、その2ステップ後の図２６では、誤差はほとんど0である。図２６においては、図２４で30≦t＜70の視聴率がでたらめであったものが、誤差も小さくきれいなグラフになっている。これらの結果から、時間が経過するにつれて視聴率の精度が上昇するという本情報集計アルゴリズムの特徴が確認できる。 Pay attention to the graph of program D. The true audience rating of program D is 50% when 0 ≦ t <30, 40% when 30 ≦ t <50, and 38% when 50 ≦ t, where t is time.
In each of FIGS. 24 to 27, the data to the right of the black vertical line (6 steps before the present) is based on few samples and is not reliable.
In FIG. 24, the audience rating at t=0, which is 8 steps old, is obtained accurately, but the rating at t=20, which is 6 steps old, has a large error of 1.5% or more. In FIG. 25, two steps later, the error has shrunk to about 0.2%. Similarly, the rating at t=40 in FIG. 25 has a large error of about 2.7%, but in FIG. 26, two steps later, the error is almost zero. In FIG. 26, the ratings for 30≦t＜70, which were erratic in FIG. 24, form a clean graph with small errors. These results confirm the characteristic of this information aggregation algorithm that the accuracy of the audience rating rises as time passes.

さらに今回のシミュレーションでは、4ステップ目で一部のユーザがGoodボタンを押すように設定した。Goodボタンを押したユーザの人数は以下の通りである。
＜4ステップ目にGoodボタンを押したユーザ＞
・番組A：200人（13.3%）
・番組B：300人（37.5%）
・番組C：400人（16%）
・番組D：500人（12.5%） Furthermore, in this simulation, we set some users to press the Good button at the 4th step. The number of users who pressed the Good button is as follows.
<User who pressed the Good button at the 4th step>
・ Program A: 200 people (13.3%)
・ Program B: 300 people (37.5%)
・ Program C: 400 people (16%)
・ Program D: 500 people (12.5%)

上記において、カッコ内の数値は、各番組の視聴人数に対するGoodボタンを押したユーザの割合である。これらの数値が各番組に対する真の満足度になる。
シミュレーションによって得た集計結果を図２８〜図３１に示す。図は8ステップ目、10ステップ目、12ステップ目、14ステップ目に得られる任意のノードの集計結果をグラフにしたものである。横軸と赤、黒の縦線は上の視聴率のグラフと同様である。 In the above, the numerical value in parentheses is the ratio of users who pressed the Good button to the number of viewers of each program. These numbers are the true satisfaction for each program.
The aggregation results obtained by the simulation are shown in FIGS. 28 to 31. The figures graph the aggregation results obtained at an arbitrary node at the 8th, 10th, 12th, and 14th steps. The horizontal axis and the red and black vertical lines are the same as in the audience rating graphs above.

番組Bに注目すると、ステップを重ねるにつれて真の満足度である37.5%に近づいていることが分かる。しかし、最終的に得られた結果が36.430%と誤差1%ほどになってしまった。シミュレーション中は、1ステップごとに全ノードのうち1%のノードがネットワークからの離脱およびネットワークへの参加を繰り返しているため、たまたま自分の近くのノードが離脱してしまったと考えられる。しかし、それでも1%の誤差であれば十分な精度といえる。 Looking at program B, the value approaches the true satisfaction of 37.5% as the steps proceed. However, the final result was 36.430%, an error of about 1%. During the simulation, 1% of all nodes leave the network and 1% join it at every step, so it is likely that nodes near the observed node happened to leave. Even so, an error of 1% can be regarded as sufficient accuracy.
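The true satisfaction values and the error for program B follow directly from the viewer counts given above. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
viewers = {"A": 1500, "B": 800, "C": 2500, "D": 4000}  # viewers in steps 4 to 5
good = {"A": 200, "B": 300, "C": 400, "D": 500}         # Good presses at step 4

# True satisfaction: share of each program's viewers who pressed Good, in percent
true_satisfaction = {p: 100 * good[p] / viewers[p] for p in viewers}
print(true_satisfaction)

measured_b = 36.430  # final aggregated value for program B reported in the text
error_b = true_satisfaction["B"] - measured_b
print(round(error_b, 2))  # 1.07
```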

（実験のまとめ）
シミュレーション実験では、まず受信リンク数を出来る限りだけ多く保持できるような最大送信リンク数の値を調べた。その結果、最大送信リンク数は最大受信リンク数よりも1だけ大きい値が適していることが確認できた（図１５〜１９参照）。
次に、ノード離脱率を変化させたときのデータ収集量を調べた。その結果、ノード離脱率を高い値に設定しても十分な量のデータを集計できていることが確認できた。これは、常に安定してデータの集計が行えているということである。
次に、ノード離脱率を変化させたときの1ステップ当たりのデータ送信回数を調べた。結果は、ノード離脱率を高い値に設定してもデータ送信回数に大きな変化は見られなかった（図２１参照）。したがって、上記実験結果（図２０参照）と同じく、本情報集計アルゴリズムは耐障害性に優れており、安定した集計ができていると言える。
次に、ノード数を変化させたときのデータ重複率を見積もった。結果、大規模なネットワークではほとんど集計データの重複が生じないことが確認できた（図２２参照）。
次に、ノードの離脱を組み入れたときに実際にどの程度の精度で視聴率を求められるか確認した。結果、かなり高い精度で視聴率を求められることが確認できた（表７〜８、及び図２３参照）。
次に、シミュレーションを実行したときの集計の経過を確認し、集計結果の精度が時間の経過とともに変化する様子を確認した。シミュレーションではあるが、短いステップ数で十分な精度の視聴率を算出することに成功しており、大規模分散環境においても短時間で高い精度の視聴率を計算できる可能性を示せた（図２４〜３１参照）。 (Summary of experiment)
In the simulation experiment, first, the value of the maximum number of transmission links that can hold as many reception links as possible was examined. As a result, it was confirmed that a value that is 1 larger than the maximum number of received links is suitable for the maximum number of transmitted links (see FIGS. 15 to 19).
Next, the amount of data collected when the node departure rate was changed was examined. As a result, it was confirmed that a sufficient amount of data could be aggregated even if the node leaving rate was set to a high value. This means that data can always be counted stably.
Next, the number of data transmissions per step when the node leaving rate was changed was examined. As a result, there was no significant change in the number of data transmissions even when the node leaving rate was set to a high value (see FIG. 21). Therefore, like the above experimental result (see FIG. 20), it can be said that the present information aggregation algorithm is excellent in fault tolerance and can be stably aggregated.
Next, the data duplication rate when the number of nodes was changed was estimated. As a result, it was confirmed that there was almost no duplication of aggregated data in a large-scale network (see FIG. 22).
Next, we confirmed how accurately the audience rating can be obtained when incorporating node detachment. As a result, it was confirmed that the audience rating can be obtained with considerably high accuracy (see Tables 7 to 8 and FIG. 23).
Next, the progress of aggregation during a simulation run was examined, confirming how the accuracy of the aggregation results changes over time. Although only a simulation, it succeeded in calculating an audience rating of sufficient accuracy in a small number of steps, indicating that a highly accurate audience rating may be computable in a short time even in a large-scale distributed environment (see FIGS. 24 to 31).

次に、本実施の形態のまとめと今後の課題について述べる。
（まとめ）
本実施の形態では、従来の番組推薦手法で扱いきれていなかった「番組の質」も取り入れて、ユーザにより有益な番組を提示できるような新しい番組推薦手法の実現を目的とした。
最初に、番組推薦に利用する情報の分析を行った。その結果、従来から番組推薦に利用されてきたEPGがフォーマル情報であるのに対して、番組の質はユーザの正直な情報を集めたインフォーマル情報であることがわかった。そこで、コンテンツの評価軸としてフォーマル情報に対応する「ジャンル」と、インフォーマル情報に対応する「バリュー」を定義した。ジャンルはコンテンツの表面的な概要を表し、バリューは内面の質を表すものである。
バリューを算出するには、多数のユーザからインフォーマルな情報を集計しなければならない。そこで本実施の形態では、インフォーマル情報の集計に適したP2Pを利用して効率的に多数のユーザのインフォーマル情報を集計できる情報集計アルゴリズムを提案した。このアルゴリズムは大規模な分散環境において効率的に情報を集計できる。小、中規模なネットワーク環境においては集計データに重複が発生するが、集計結果を比率として得る場合には誤差がほとんど生じないことが分かった。
次に、バリューを用いたP2P番組推薦システムの実現方法を示し、TVの新しい視聴スタイルとなり得る画面インタフェースを提案した。
最後のシミュレーション実験では、情報集計アルゴリズムの性能を評価すると共に、視聴率などの集計結果の算出過程を観察し、実際の番組推薦システムにおける実現可能性を示した。 Next, a summary of the present embodiment and future problems will be described.
(Summary)
The aim of this embodiment was to realize a new program recommendation method that also takes into account "program quality", which conventional program recommendation methods could not handle, so that more useful programs can be presented to the user.
First, the information used for program recommendation was analyzed. The analysis showed that the EPG conventionally used for program recommendation is formal information, whereas program quality is informal information that gathers users' candid opinions. Accordingly, "genre", corresponding to formal information, and "value", corresponding to informal information, were defined as the axes for evaluating content. Genre represents the surface outline of the content, and value represents its inner quality.
To calculate value, informal information must be aggregated from a large number of users. This embodiment therefore proposed an information aggregation algorithm that uses P2P, which is well suited to aggregating informal information, to aggregate such information from many users efficiently. The algorithm can aggregate information efficiently in a large-scale distributed environment. In small and medium-sized networks the aggregated data contains duplicates, but it was found that almost no error arises when the aggregate is obtained as a ratio.
Next, the realization method of the P2P program recommendation system using value was shown, and the screen interface that could become a new viewing style of TV was proposed.
In the final simulation experiment, the performance of the information aggregation algorithm was evaluated, and the calculation process of the aggregation results such as audience rating was observed to show the feasibility of an actual program recommendation system.

（今後の課題）
提案したP2P番組推薦システムには、情報の入力に関していくつか課題が残る。まず１つ目は、情報の入力を全てリモコンのボタンに頼っていることである。瞬時に、且つ気軽で簡単に情報を入力する手段として、リモコンによるボタン入力は非常に優れている。しかし、単純に入力する情報の種類だけボタンを用意すれば、ボタンの数が増えすぎてしまい、ユーザは戸惑ってしまう。この問題を解決する１つの方法として、システムがユーザを観察して自動的に情報を入力することが考えられる。ユーザの感情をユーザの声や表情から取得する研究は多く行われている。このような技術が発展すれば、システムがユーザの気持ちを自動的に入力してくれるようになる。
2つ目の問題は、ユーザがTV視聴時にGoodボタンや感情ボタンを頻繁に入力するかという問題である。受動的なTVに対してユーザに情報を入力させるには、ユーザの情報入力に対するモチベーションを高めるような工夫が必要になる。例えば、情報入力をすることによって他のユーザとインタラクションができれば、ユーザのモチベーションも上がるかもしれない。この点に関してはさらに分析を行う必要がある。
3つ目の問題は、1台のTVを複数人で視聴する場合である。本実施の形態で提案したような情報推薦システムは、基本的に個人のユーザを対象とする。複数のユーザが同時に使用すると、ユーザの正確な嗜好が取得できなくなってしまうからである。また、他のユーザに自分の嗜好を知られてしまうというプライバシーの問題も生じる。これを解決する妥当な方法は、情報入力装置(リモコン)にユーザ機能を設けることだろう。これによってシステムは誰が情報を入力したか知ることができ、個々のユーザに対して処理を切り替えることができる。現在は、携帯端末で家電を操作できるようになりつつあるが、携帯端末は主に個人で携帯しているので、これを入力装置として利用することも考えられる。 (Future tasks)
The proposed P2P program recommendation system still has several open issues concerning information input. The first is that all information input relies on remote control buttons. Button input on a remote control is an excellent way to enter information instantly, casually, and easily. However, if one button is simply provided for each type of information to be entered, the number of buttons grows too large and confuses the user. One way to solve this problem is for the system to observe the user and enter information automatically. Much research has been done on obtaining a user's emotions from the user's voice and facial expressions. As such technology develops, the system will be able to enter the user's feelings automatically.
The second issue is whether users will actually press the Good button or emotion buttons frequently while watching TV. TV is a passive medium, so getting users to enter information requires some means of raising their motivation to do so. For example, if entering information lets a user interact with other users, the user's motivation may increase. This point needs further analysis.
The third issue arises when several people watch one TV. An information recommendation system such as the one proposed in this embodiment is basically aimed at individual users, because when several users use it at the same time, each user's exact preferences can no longer be obtained. There is also the privacy problem that a user's preferences become known to other users. A reasonable solution would be to provide a user function on the information input device (remote control). The system could then know who entered the information and switch processing for each user. It is now becoming possible to operate home appliances from mobile terminals, and since a mobile terminal is mainly carried by one individual, it could also be used as the input device.

次に、本発明に係る情報集計アルゴリズムの他例について説明する。
ここではP2P による情報集計の具体的なアルゴリズムを説明する。目的は、周囲数千人程度のユーザからいい加減に、かつ効率的に情報を集めることである。なお、ここでは集計する情報を「各ユーザが1 分間の間何チャンネルを見ていたか」というものとする。すなわち集計結果として得られる情報は、各チャンネルの1 分ごとの視聴人数である。 Next, another example of the information aggregation algorithm according to the present invention will be described.
Here, a specific algorithm for aggregating information by P2P is described. The aim is to gather information roughly but efficiently from around several thousand surrounding users. The information to be aggregated is taken to be "which channel each user was watching during each one-minute period". The information obtained as the aggregate is thus the number of viewers of each channel for each minute.

まず、各ノードが持つ情報は図３２に示すようなものである。（図３２中の丸付き数字については、以下、(1),(2)…と表す）
やり取りするデータの(1)、(2)、…には、各チャンネルごとに視聴人数をカウントしたものが入る。
ここで、d 次情報を持つデータを「次数d+1 のデータ」と呼ぶことにする。
データ集計のアルゴリズムは以下のようになる。
“起動時”
・サーバにつなぎ、自分のノードアドレスをサーバに登録する。
・サーバから、すでに登録済みのノードアドレスをランダムにX 個教えてもらう。
・そのうちK 個のノードアドレスを受信リンクに登録する。逆に、自分のノードアドレスを相手ノードの送信リンクに登録してもらう。（すなわち、他ノードから要求があるまで送信リンクは張らない。）
“送信”
・ 1 分ごとに“集計”で作成したデータを送信リンク先のノードへ送信する。
・このときデータのTTL を1 にセットする。
“転送”
受信リンク先のノードから受信したデータについて
・ 1 ホップ隣から来たデータ(TTL=1 のデータ)は、TTL を1 増やして送信リンク先ノードに転送する。
・ 2 ホップ隣から来たデータ(TTL=2 のデータ)は、“集計”へ。
“集計”
・自分の情報を、送信するデータの(1)にセットする。
・受信したデータの(1)を全て加算して、送信するデータの(2)へセットする。
・受信したデータの(2)を全て加算して、送信するデータの(3)へセットする。
：
・受信したデータのd 次情報を加算したものが、得られる集計結果である（図３３参照）。
ただし、d 次情報は1 分前のd-1 次情報を使って計算しているので、集計結果には最大d 分前の情報が含まれる。 First, information held by each node is as shown in FIG. (The numbers with circles in FIG. 32 are represented as (1), (2)...)
In (1), (2),... Of the data to be exchanged, the number of viewers is counted for each channel.
Here, data with d-order information is referred to as “data of order d + 1”.
I will call it.
The data aggregation algorithm is as follows.
“On startup”
・ Connect to the server and register your notebook address with the server.
・ Ask the server to randomly tell X already registered node addresses.
-Register K node addresses of them in the receiving link. On the other hand, your node address is registered in the transmission link of the partner node. (In other words, the transmission link is not established until there is a request from another node.)
“Send”
-Every minute, the data created by “Aggregation” is sent to the destination node.
• At this time, set the TTL of the data to 1.
"transfer"
About data received from the node at the receiving link destination ・ Data coming from 1 hop next (TTL = 1)
Increase the TTL by 1 and forward to the destination node.
・ Data coming from 2 hops next (TTL = 2 data) go to “Total”.
"Aggregate"
・ Set your information to (1) of the data to be sent.
-Add all the received data (1) and set it to (2) of the data to be transmitted.
・ Add all the received data (2) and set it to (3) of the data to be transmitted.
:
-The sum of the d-order information of the received data is the total result obtained (see FIG. 33).
However, since the d-order information is calculated using the d-1 order information one minute before, the aggregated result includes information up to d minutes ago.
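The “Transfer” and “Aggregate” steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the names `Counts`, `Slots`, `transfer` and `aggregate` are hypothetical, each slot is assumed to be a plain list of per-channel viewer counts, and networking, timing and link management are left out.

```python
from typing import List, Tuple

Counts = List[int]    # viewer count per channel (assumed representation)
Slots = List[Counts]  # slots[0] is (1), slots[1] is (2), ...


def transfer(ttl: int, max_ttl: int = 2) -> str:
    """Decide what to do with received data, following the "Transfer" step."""
    # Data that has not yet travelled max_ttl hops is passed straight through;
    # data that has is handed to the aggregation step.
    return "forward" if ttl < max_ttl else "aggregate"


def add(a: Counts, b: Counts) -> Counts:
    """Element-wise sum of two per-channel count lists."""
    return [x + y for x, y in zip(a, b)]


def aggregate(own: Counts, received: List[Slots], order: int) -> Tuple[Slots, Counts]:
    """Build the data to send next and the current aggregation result.

    own      -- this node's own information (goes into slot (1))
    received -- slot lists of all data accepted for aggregation
    order    -- number of slots per message (data of order d+1 has d+1 slots)
    """
    channels = len(own)
    outgoing: Slots = [list(own)]
    # Slot (i+1) of the outgoing data is the sum of slot (i) of the received data.
    for i in range(order - 1):
        total = [0] * channels
        for slots in received:
            total = add(total, slots[i])
        outgoing.append(total)
    # The result is the sum of the highest-order slot of the received data.
    result = [0] * channels
    for slots in received:
        result = add(result, slots[order - 1])
    return outgoing, result
```

In this sketch a node would call `aggregate` once per minute and send the returned `outgoing` slots with the TTL set to 1. Because slot (i+1) is built from slot (i) of data produced a minute earlier, the staleness noted above follows directly from the loop.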

This algorithm can efficiently collect information from K^{(d+1)·TTL} users. The size of the exchanged data depends, in principle, only on the order of the data.
The reason that data arriving from a node one hop away is passed straight through in “Transfer” is to prevent the same data from being aggregated more than once. Even so, it is easy to see that duplication cannot be prevented completely. This duplication rate is discussed later.
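As a quick check of this estimate, the short Python sketch below evaluates K^{(d+1)·TTL}. The function name `estimated_reach` is hypothetical, and the value ignores duplicate counting, so it is best read as an upper bound rather than a measured figure.

```python
def estimated_reach(k: int, order: int, ttl: int) -> int:
    """Estimate how many users' information is collected.

    k     -- number of receive links per node (K)
    order -- order of the data, i.e. d + 1
    ttl   -- number of hops data travels before it is aggregated
    """
    return k ** (order * ttl)


print(estimated_reach(4, 3, 2))  # prints 4096
```

With K = 4, TTL = 2 and data of order 3, which are the parameters of the simulation described later in this section, the estimate is 4096. That is in line with the roughly 4000 data items reported for that simulation.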

(Traffic)
This system is premised on operating on a large-scale P2P network. It is therefore important to estimate in advance how many messages will flow through the network when information is aggregated.
For the above algorithm, let the total number of nodes be N and the number of receive links be K, and suppose data is sent up to p hops away (TTL = p). The number of messages that flow in one time step (1 minute) can then be estimated by the following expression.

[Expression not reproduced in this text.]

Because the above algorithm can collect a large amount of information even when p is small, traffic can be kept reasonably low by setting the parameters appropriately.

(Simulation)
Next, the amount of data that the above algorithm aggregates more than once is examined by simulation.
FIG. 34 shows the simulation results when the number of receive links and the number of send links are both 4, the TTL is 2, and the order of the data is 3. It shows the data duplication rate for 5,000 to 12,000 nodes. The number of data items collected was approximately 4,000 regardless of the number of nodes.
This system assumes a large-scale network with hundreds of thousands or millions of nodes. Judging from the simulation results, almost no duplication is expected in such a large network. In any case, since the aim is to aggregate rough, informal information, there is little point in eliminating duplication completely.
Of course, in an early-stage network with few nodes, the duplication rate becomes a problem that cannot be ignored; this is left as future work.

(Efficiency)
In the above algorithm, all nodes do an equal share of the work. However, it is clearly more efficient to divide roles according to node performance. Consider, therefore, a node that skips the aggregation calculation for the n-th order information. In this case, the received data should already contain (n+1)-th order information, so the node can simply copy it and use it as its aggregation result for the n-th order information. However, because old information is being reused, the final aggregation result naturally contains correspondingly older information. As the number of nodes that skip the calculation increases, the information flowing through the network becomes dominated by old information. Countermeasures are therefore needed, such as probabilistically controlling the number of skipping nodes, or reducing the number of send links of skipping nodes to limit the circulation of old information. This will be analyzed in future work.

(P2P information recommendation system)
In this embodiment, the system is being implemented on a set-top box (STB) that supports digital broadcasting.
A specific way of realizing the system is described below.

(Program recommendation by genre)
To infer which programs a user likes, the system needs to know the user's preferences regarding programs. For this purpose, a “Good button” and a “No-Good button” are assigned on the STB remote control, and the user watching a program is asked to input whether or not they like it.
When either button is pressed, the system splits the text of the program's EPG information into words and registers them in the word list. For words that are already registered, the occurrence count is updated. (The Good button increases the count by 1, and the No-Good button decreases it by 1.)
The word list is a list of words and their occurrence counts, and its maximum size is 6000 words. In this way, the word list comes to represent the user's preferences directly. That is, words with high occurrence counts characterize the programs the user likes.
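The word list update described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `update_word_list` is hypothetical, the regular-expression tokenizer is a stand-in for whatever word segmentation the real system uses, and ignoring new words once the list is full is an assumed policy, since the text does not say what happens at the 6000-word limit.

```python
import re
from collections import Counter

MAX_WORDS = 6000  # maximum word list size stated in the text


def update_word_list(word_list: Counter, epg_text: str, good: bool,
                     max_words: int = MAX_WORDS) -> None:
    """Update the user's word list after a Good / No-Good button press."""
    delta = 1 if good else -1  # Good: +1, No-Good: -1
    # Assumed tokenizer: lower-case alphanumeric runs.
    for word in re.findall(r"[a-z0-9]+", epg_text.lower()):
        # Assumed policy: once the list is full, only existing words are updated.
        if word in word_list or len(word_list) < max_words:
            word_list[word] += delta
```

For example, pressing Good on a program whose EPG text is "Soccer World Cup soccer" raises the count of "soccer" by 2 and of "world" and "cup" by 1 each.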

(Matching user preferences and programs)
To infer programs in the genres a user likes, the similarity between the user's word list and the word string of a program's EPG information (converted into a word list) is calculated. This calculation is a two-step procedure: similarity is first calculated between individual words and then between the word lists.
The similarity between words is calculated using the concept dictionary and the English word dictionary of the EDR electronic dictionary. The concept dictionary describes the hierarchical relationships between concepts as a tree. The English word dictionary describes the correspondence between English words and concepts.
However, the English words actually used are limited to about 6000 high-frequency words so that they fit on the STB.
To calculate the similarity between two words, the concepts of the two words are first looked up in the English word dictionary, and then the distance between those concepts in the concept dictionary tree is calculated. With this distance denoted Np and the maximum height of the tree denoted D, the similarity between the words is calculated by equation (6.1) [Equation 6] given earlier.
Next, to calculate the similarity between word lists, a word with high similarity to each word in the shorter word list is first extracted from the longer word list to create a new word list. This new word list has the same length as the shorter one. Then, treating the new word list (the occurrence count of each word multiplied by its similarity) and the shorter word list (the occurrence count of each word) as vectors, their inner product is calculated and the angle between them is obtained. If this angle is small, the program matches the user's preferences.
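The word-list comparison described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: `word_list_angle` and `exact_match` are hypothetical names, the word-to-word similarity is passed in as a function because equation (6.1) and the EDR dictionaries are not reproduced here, and picking the single most similar word for each entry is an assumed reading of "a word with high similarity".

```python
import math
from typing import Callable, Dict


def exact_match(a: str, b: str) -> float:
    """Stand-in word similarity: 1.0 for identical words, else 0.0."""
    return 1.0 if a == b else 0.0


def word_list_angle(a: Dict[str, int], b: Dict[str, int],
                    word_sim: Callable[[str, str], float]) -> float:
    """Angle in radians between two word lists; smaller means more similar."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    v_short, v_new = [], []
    for word, count in short.items():
        # Pick the most similar word from the longer list (assumed rule).
        best = max(long_, key=lambda cand: word_sim(word, cand))
        v_short.append(count)
        # New list entry: occurrence count multiplied by similarity.
        v_new.append(long_[best] * word_sim(word, best))
    dot = sum(x * y for x, y in zip(v_short, v_new))
    norm = (math.sqrt(sum(x * x for x in v_short))
            * math.sqrt(sum(y * y for y in v_new)))
    if norm == 0:
        return math.pi / 2  # nothing in common: treat as orthogonal
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```

A caller would compute this angle between the user's word list and each program's word list and recommend the programs with the smallest angles.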

(Program recommendation by value)
Various kinds of value are conceivable depending on the information collected from users. For example, in this system the following can be used as values:
・Audience rating
・Viewing time
・Number of viewings
・Number of times the Good button and the No-Good button were pressed
・Whether the program was recorded
・Whether the program was bookmarked
These can be obtained easily using the information aggregation algorithm of Chapter 4.
By presenting these values and how they change over time for each program, users can learn about programs from various angles and choose programs from a viewpoint different from before.

(Clustering)
Clustering is a technique for grouping together people with similar preferences. Introducing it allows users to obtain even more useful information: in addition to the values derived from the general public, they can also obtain values derived from people whose preferences resemble their own. This is the same as a friend with similar tastes providing very useful information.
It is also very useful to obtain information from clusters other than one's own, because a cluster is a group of people who are deeply familiar with a field. For a field you do not know well, consulting the opinions of experts in that field is a sensible approach.
Introducing clustering into this system is not very difficult. Each node holds its own preferences as a word list, so it can use that list to calculate its preference similarity with other nodes and connect selectively to those with high similarity.
There is a problem, however. Once clustering is applied, a node is surrounded only by similar users and can collect information only from them. In other words, the values derived from the general public can no longer be obtained. The following two methods, for example, are conceivable.
Method 1: Manage the connection destination nodes with two node lists, a preference node list and a random node list.
Method 2: Extract the information of nodes similar to oneself from among the surrounding nodes.

(Method 1)
In Method 1, a clustered network and a non-clustered network exist side by side. The addresses of nodes with high preference similarity to one's own are registered in the preference node list, and the addresses of random nodes are registered in the random node list. The information aggregation algorithm is then run independently and in parallel on these two node lists.
This method allows flexible and accurate clustering, and it can collect information both from the general public and from people with similar preferences. However, because two instances of the information aggregation algorithm run in parallel, traffic and computation cost are doubled. In addition, information from other clusters cannot be obtained.
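The two node lists of Method 1 could be built along the following lines. This is a sketch under stated assumptions: the function name `build_node_lists` is hypothetical, the preference similarity of each candidate node is assumed to have been computed already (for example from the word lists), and both lists are assumed to hold at most k addresses.

```python
import random
from typing import Dict, List, Tuple


def build_node_lists(candidates: Dict[str, float], k: int,
                     rng: random.Random) -> Tuple[List[str], List[str]]:
    """Return (preference node list, random node list).

    candidates -- node address mapped to its preference similarity
    k          -- number of addresses to keep in each list (assumed)
    rng        -- random number generator used for the random node list
    """
    # Preference node list: the k candidates most similar to this node.
    preference = sorted(candidates, key=candidates.get, reverse=True)[:k]
    # Random node list: k candidates chosen without regard to similarity.
    random_list = rng.sample(sorted(candidates), min(k, len(candidates)))
    return preference, random_list
```

The aggregation algorithm would then run once per list, which is where the doubling of traffic and computation cost noted above comes from.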

(Method 2)
Method 2 requires introducing genres that are common to all users. Each user calculates, from their own preference word list, their degree of preference for each common genre and converts it to a value from 0 to 1 (the temporary common genre list). A common genre list is then created in which genres with high values are set to 1 and all others to 0 (see FIG. 35). That is, the common genres the user likes are marked 1. The number of common genres is on the order of several tens.
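The conversion from the temporary common genre list to the common genre list can be sketched in Python as follows. The function name `to_common_genre_list` and the cut-off of 0.5 are assumptions; the text only says that genres with "high" values become 1.

```python
from typing import Dict


def to_common_genre_list(temporary: Dict[str, float],
                         threshold: float = 0.5) -> Dict[str, int]:
    """Binarize per-genre preference degrees (0 to 1) into 1 / 0 flags."""
    # threshold is an assumed cut-off for what counts as a "high" value.
    return {genre: 1 if value >= threshold else 0
            for genre, value in temporary.items()}
```

For instance, preference degrees of 0.8 for sports and 0.2 for news would give a common genre list with sports set to 1 and news set to 0.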

Common genre lists are aggregated by adding them together among users who were watching the same program. This reveals, among the users who watched the program, the number who like common genre 1, the number who like common genre 2, and so on. In other words, information from the experts in common genre X can be obtained. This is equivalent to clustering users by common genre.
Information from people similar to oneself can be obtained approximately by multiplying the values of the aggregated common genre list by those of one's own temporary common genre list and summing the products.
Because Method 2 fixes the common genres in advance, it cannot, for example, respond when a new genre emerges, so flexible and highly accurate clustering is not possible.
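The aggregation of common genre lists and the approximate "people like me" score can be sketched as follows. Both function names are hypothetical, and the lists are assumed to be dictionaries keyed by genre name; in the actual system the addition would happen through the P2P aggregation rather than in one place.

```python
from typing import Dict, Iterable


def aggregate_genre_lists(lists: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Add up the common genre lists of users watching the same program."""
    total: Dict[str, int] = {}
    for genre_list in lists:
        for genre, flag in genre_list.items():
            total[genre] = total.get(genre, 0) + flag
    return total


def similar_viewer_score(aggregated: Dict[str, int],
                         temporary: Dict[str, float]) -> float:
    """Multiply aggregated counts by one's own preference degrees and sum."""
    return sum(aggregated.get(genre, 0) * weight
               for genre, weight in temporary.items())
```

A program watched mostly by users who like the genres that the current user also rates highly receives a high score, which approximates the opinion of similar viewers without connecting to them selectively.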

FIG. 1 is a schematic diagram showing the concept of content-based recommendation.
FIG. 2 is a schematic diagram showing the concept of collaborative recommendation.
FIG. 3 is a schematic diagram showing the concept of the collaborative horizon.
FIG. 4 is a graph showing the relationship between elapsed time and the level of excitement for bulletin board topics synchronized with a broadcast program.
FIG. 5 is a graph in which the ranking of the number of messages posted to bulletin boards for TV programs is tallied by program genre.
FIG. 6 is a schematic diagram showing an example of an information aggregation method.
FIG. 7 is a schematic diagram showing the mechanism of the information aggregation algorithm.
FIG. 8 is a schematic diagram showing an example of a data aggregation method.
FIG. 9 is a graph visualizing the error in the audience rating, taking duplicated data into account.
FIG. 10 is a schematic diagram showing an example of creating a common genre list.
FIG. 11 is a schematic diagram showing the concept of aggregating common genre lists.
FIG. 12 is a diagram schematically showing an example of a screen during program display.
FIG. 13 is a diagram showing an example of an actual screen during program display.
FIG. 14 is a diagram showing an example of a program recommendation interface.
FIG. 15 is a graph showing the relationship between the maximum number of send links and the number of nodes when the number of receive links is 0.
FIG. 16 is a graph showing the relationship between the maximum number of send links and the number of nodes when the number of receive links is 1.
FIG. 17 is a graph showing the relationship between the maximum number of send links and the number of nodes when the number of receive links is 2.
FIG. 18 is a graph showing the relationship between the maximum number of send links and the number of nodes when the number of receive links is 3.
FIG. 19 is a graph showing the relationship between the maximum number of send links and the number of nodes when the number of receive links is 4.
FIG. 20 is a graph showing an example of the relationship between the node leaving rate and the amount of collected data.
FIG. 21 is a graph showing an example of the relationship between the node leaving rate and the number of data transmissions.
FIG. 22 is a graph showing an example of the relationship between the number of nodes and the data duplication rate.
FIG. 23 is a graph showing the standard deviation of the audience rating obtained from data of each order for 100 nodes.
FIG. 24 is a graph showing an example of the aggregation result for the audience rating after 8 steps.
FIG. 25 is a graph showing an example of the aggregation result for the audience rating after 10 steps.
FIG. 26 is a graph showing an example of the aggregation result for the audience rating after 12 steps.
FIG. 27 is a graph showing an example of the aggregation result for the audience rating after 14 steps.
FIG. 28 is a graph showing an example of the aggregation result for satisfaction after 8 steps.
FIG. 29 is a graph showing an example of the aggregation result for satisfaction after 10 steps.
FIG. 30 is a graph showing an example of the aggregation result for satisfaction after 12 steps.
FIG. 31 is a graph showing an example of the aggregation result for satisfaction after 14 steps.
FIG. 32 is a schematic diagram showing an example of the exchanged data.
FIG. 33 is a schematic diagram showing an example of a data aggregation method.
FIG. 34 is a graph showing an example of the relationship between the number of nodes and the data duplication rate.
FIG. 35 is a schematic diagram showing an example of creating a common genre list.

Claims

1. A broadcast program recommendation information display device comprising:
means for inputting an evaluation (Good / No-Good) of broadcast program information according to the viewer's preference;
means for inputting an evaluation against a list of genres common to viewers;
first aggregation means for acquiring via a network, for each broadcast program, the evaluations against the common genre list input by the viewers who are viewing the same program, and aggregating them;
second aggregation means for acquiring via the network, for each broadcast program, the evaluations according to the viewers' preferences, and aggregating them; and
display means for displaying the evaluation of each broadcast program aggregated by the first and second aggregation means.

2. The broadcast program recommendation information display device according to claim 1, wherein the display means displays, as time-series information, the evaluation aggregated by the second aggregation means for the broadcast program being viewed, together with the audience rating of the program being viewed.