JP2021177362A

JP2021177362A - Information processing apparatus, information processing method, information processing program, and terminal apparatus

Info

Publication number: JP2021177362A
Application number: JP2020082880A
Authority: JP
Inventors: 宏彰寺岡; Hiroaki Teraoka; 圭祐明石; Keisuke Akashi
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2020-05-08
Filing date: 2020-05-08
Publication date: 2021-11-11
Anticipated expiration: 2040-05-08
Also published as: JP7260505B2

Abstract

To provide information useful to a user in response to a change in the user's emotions that occurs while viewing content. SOLUTION: An information processing apparatus according to an embodiment of the present invention has an acquisition unit and a specifying unit. The acquisition unit acquires information on the emotions of a user viewing content, the emotions being estimated from the user's facial expression as indicated by imaging information captured by the imaging means of the terminal device displaying the content. The specifying unit specifies an emotion point, which is a point in the content at which the user's emotions changed, by aggregating the estimation results acquired by the acquisition unit. SELECTED DRAWING: Figure 6

Description

Embodiments of the present invention relate to an information processing device, an information processing method, an information processing program, and a terminal device.

Conventionally, techniques are known that determine a user's psychological state and emotions based on biometric information acquired from the user and provide a service according to the determined psychological state and emotions. For example, an information terminal device has been disclosed that detects a user's emotions and processes a message the user is composing according to the detected emotions.

For example, when a user composes a message, the information terminal device measures the user's biometric information with a biosensor and uses the measured biometric information to calculate information indicating the user's psychological state and the strength of the user's emotions. The information terminal device then processes the message composed by the user based on the calculated information and transmits the processed message, thereby conveying the user's emotions.

Japanese Unexamined Patent Publication No. 2013-029928

However, the above prior art merely conveys the user's emotions to the recipient of the message; it cannot necessarily provide information meaningful to the user in response to a change in emotion that viewing content causes in the user.

The present application has been made in view of the above, and its object is to provide an information processing device, an information processing method, an information processing program, and a terminal device that can provide information meaningful to a user in response to a change in emotion that viewing content causes in the user.

The information processing device according to the present application comprises: an acquisition unit that acquires information on the emotions of a user viewing content, the emotions being estimated from the user's facial expression as indicated by imaging information captured by the imaging means of the terminal device displaying the content; and a specifying unit that specifies an emotion point, which is a point in the content at which the user's emotions changed, by aggregating the estimation results acquired by the acquisition unit.

According to one aspect of the embodiment, information meaningful to a user can be provided in response to a change in emotion that viewing content causes in the user.

FIG. 1 is a diagram showing an example of information processing by the information processing device according to the embodiment.
FIG. 2 is a diagram showing an example of the presentation process according to the embodiment.
FIG. 3 is a diagram showing an example of a display screen according to the embodiment.
FIG. 4 is a diagram showing a configuration example of the information processing system according to the embodiment.
FIG. 5 is a diagram showing a configuration example of the terminal device according to the embodiment.
FIG. 6 is a diagram showing a configuration example of the information processing device according to the embodiment.
FIG. 7 is a diagram showing an example of the imaging information storage unit according to the embodiment.
FIG. 8 is a diagram showing an example of the estimation information storage unit according to the embodiment.
FIG. 9 is a diagram showing an example of the overall aggregation result storage unit according to the embodiment.
FIG. 10 is a diagram showing an example of the emotion point storage unit according to the embodiment.
FIG. 11 is a diagram showing an example of the performer information storage unit according to the embodiment.
FIG. 12 is a flowchart showing the information processing executed by the information processing device according to the embodiment.
FIG. 13 is a hardware configuration diagram showing an example of a computer that realizes the functions of the information processing device.

Modes for implementing the information processing device, information processing method, information processing program, and terminal device according to the present application (hereinafter referred to as "embodiments") are described below with reference to the drawings. These embodiments do not limit the information processing device, information processing method, information processing program, or terminal device according to the present application. In the following embodiments, identical parts are given identical reference numerals, and duplicate descriptions are omitted.

[1. An example of information processing]
First, the information processing realized by the information processing device 100 according to the embodiment will be described. FIG. 1 is a diagram showing an example of information processing by the information processing device 100 according to the embodiment. The following description covers information processing that the terminal device 10 and the information processing device 100 perform in cooperation. In the present embodiment, the information processing device 100 executes the information processing program according to the embodiment and performs the information processing according to the embodiment in cooperation with the terminal device 10. It is also assumed that an application that is an information processing program according to the embodiment (hereinafter sometimes referred to as the "application AP") is installed on the terminal device 10.

Before describing FIG. 1, the information processing system 1 according to the embodiment will be described with reference to FIG. 4. FIG. 4 is a diagram showing a configuration example of the information processing system 1 according to the embodiment. As shown in FIG. 4, the information processing system 1 includes a terminal device 10, a content distribution device 30, and an information processing device 100. The terminal device 10, the content distribution device 30, and the information processing device 100 are connected via a network N so that they can communicate by wire or wirelessly. The information processing system 1 shown in FIG. 4 may include a plurality of terminal devices 10, a plurality of content distribution devices 30, and a plurality of information processing devices 100.

The terminal device 10 is an information processing device used by a user. The terminal device 10 is, for example, a smartphone, a tablet terminal, a notebook PC (Personal Computer), a desktop PC, a mobile phone, a PDA (Personal Digital Assistant), or a head-mounted display. In the present embodiment, the terminal device 10 is assumed to be a smartphone.

The terminal device 10 has two built-in cameras. One is the main camera, which is used to capture the scenery and people the user sees. The lens of the main camera therefore faces away from the user and is provided, for example, on the back of the terminal device 10.

The other is the in-camera (front-facing camera), which is used, for example, for video calls and face authentication. The lens of the in-camera is therefore provided facing the user, for example around the display (touch panel) of the terminal device 10. This allows the user to photograph himself or herself with the in-camera while checking on the display how the image captured through the in-camera lens (for example, an image of the user's own face) appears. In the present embodiment, this in-camera is an example of the imaging means.

As described above, the application AP is installed on the terminal device 10, and under the control of the application AP the terminal device 10 can dynamically image the user regardless of any operation by the user. For example, under the control of the application AP, the terminal device 10 dynamically captures the user's facial expression only while the user is viewing video content on a predetermined video site (referred to as "video site ST"). More specifically, the terminal device 10 dynamically captures the user's facial expression only while the user is viewing all, or any given item, of the video content distributed within the application AP.

Because the user is imaged dynamically, the terminal device 10, under the control of the application AP, asks the user for consent to being imaged, for example when the user visits the video site ST or views any video content on the video site ST. If the user permits being imaged (consent is obtained), the terminal device 10 images the user. If the user does not permit being imaged (consent is not obtained), the terminal device 10 does not image the user.
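The consent rule above, combined with the restriction that capture happens only during viewing, can be sketched as a simple gate. This is a minimal illustration, not the patent's implementation: the `ViewingSession` class, its field names, and `should_capture` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class ViewingSession:
    """Hypothetical per-user session state held by the application AP."""
    user_id: str
    video_id: str
    consent_given: bool  # result of the consent prompt (assumed to be stored)
    is_playing: bool     # True only while video content is being viewed


def should_capture(session: ViewingSession) -> bool:
    """Capture the user's face only with consent and only during playback."""
    return session.consent_given and session.is_playing


# User U1 consented and is watching VC1; a user who declined is never imaged.
print(should_capture(ViewingSession("U1", "VC1", True, True)))
print(should_capture(ViewingSession("U9", "VC1", False, True)))
```

Keeping the check in one function makes it straightforward to call before every capture attempt, so a withdrawn consent takes effect immediately.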

The content distribution device 30 is a server device, cloud system, or the like that distributes content. For example, the content distribution device 30 distributes video content to the terminal device 10 via the video site ST. Suppose that a user visits the video site ST and specifies, as a query, the title or category of the video content the user wants to view. In this case, the content distribution device 30 receives the query from the terminal device 10 and displays a list of the video content matching the received query on the video site ST.

The content distribution device 30 also distributes video content in a distribution form such as VOD (Video On Demand). For example, the content distribution device 30 distributes video content of various genres such as comedy programs, dramas, movies, and anime. The content distribution device 30 also performs live distribution over the Internet.

Here, the premise for the information processing according to the embodiment will be described. For example, a user watching a comedy program on the video site ST may want to find the funny points and watch only those. In that case the user has to find the funny points personally, for example by dragging the seek bar, which is tedious. Hence there is a need for a way to pick out and watch only the funny points.

One way to meet this need would be for a designated person (for example, someone with a discerning eye) to watch the comedy program and look for the funny points, but this work is also very tedious. If, instead, the act of laughing can be estimated from the emotions (facial expressions) of users watching the comedy program, the points at which more users laughed can be extracted as funny points and then presented to users who subsequently want to watch the program.

Some comedy programs have the performers (for example, individual comedians or comedy groups) compete on how funny their routines are and rank the performers based on users' reactions to each performance (that is, each routine). Such a program may actually accept votes from the users who watched it and rank the performers by the voting result. As one example, while the comedy program is being played, a dedicated tallying server displays a "voting button", for example below the area in which the program is played. A user who finds a performer's routine funny presses the voting button. The dedicated tallying server then totals the votes for each performer and gives a higher rank to performers with more votes.

However, the number of votes calculated by such a voting system does not always reflect funniness accurately. For example, if one user can vote multiple times for one act, that user could deliberately push a performer of no real interest up the ranking. Moreover, because voting is not a reflexive act but a deliberate one, some users may press the voting button dishonestly even though they do not actually find the routine funny. The number of votes is therefore not necessarily an accurate reflection of funniness.

On the other hand, users laugh reflexively at funny routines and do not react to unfunny ones (they do not laugh on purpose). Therefore, if it can be reliably determined that users laughed and the number of users who laughed can be totaled for each performer, that total can be said to reflect the users' genuine reaction (the feeling that something is funny) more accurately than the number of votes. This makes it possible to rank the performers more accurately.
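The idea of ranking performers by the number of users who laughed, rather than by votes, can be sketched as follows. This is an illustrative sketch under stated assumptions: the observation tuple layout, the performer names, and the threshold of 5 used to decide that a user "laughed" are assumptions made for the example, not details given at this point in the description.

```python
from collections import defaultdict

LAUGH_THRESHOLD = 5  # assumed cut-off for "the user laughed"


def rank_performers(observations, threshold=LAUGH_THRESHOLD):
    """Rank performers by the number of distinct users who laughed.

    observations: iterable of (user_id, performer, laugh_degree) tuples.
    Performers at whom nobody laughed do not appear in the result.
    """
    laughed = defaultdict(set)
    for user_id, performer, degree in observations:
        if degree >= threshold:
            # A set counts each user once, however often that user laughed.
            laughed[performer].add(user_id)
    counts = {performer: len(users) for performer, users in laughed.items()}
    # Most laughing users first; ties broken by performer name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


observations = [
    ("U1", "Duo A", 9), ("U1", "Duo A", 8),  # U1 laughs twice, counted once
    ("U2", "Duo A", 6),
    ("U1", "Duo B", 7),
    ("U2", "Duo B", 2),                      # below threshold, not counted
]
print(rank_performers(observations))  # [('Duo A', 2), ('Duo B', 1)]
```

Counting distinct users rather than laugh events mirrors the concern about repeated voting: one user cannot raise a performer's total by reacting many times.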

In view of the above premises and problems, the information processing device 100 according to the embodiment acquires information on the emotions of a user viewing content (for example, video content), the emotions being estimated from the user's facial expression as indicated by imaging information captured by the imaging means of the terminal device displaying the content. The information processing device 100 then aggregates the acquired estimation results to specify emotion points, which are points in the content at which the user's emotions changed.

An example of the information processing according to the embodiment is described below. In this example, the information processing device 100 performs the above procedure by executing the information processing program. The information processing device 100 is, for example, a server device or cloud system that performs the information processing according to the embodiment. The business operator that manages the content distribution device 30 and the information processing device 100 is referred to as "business owner X". The video site ST can therefore be regarded as content managed and operated by business owner X.

The example of FIG. 1 shows two users, U1 and U2, but this is only an example and the number of users is not limited to two. The terminal device 10 used by user U1 is referred to as terminal device 10-1, and the terminal device 10 used by user U2 is referred to as terminal device 10-2. When there is no need to distinguish the terminal devices by user, they are simply referred to as the terminal device 10. In the example of FIG. 1, both users U1 and U2 are assumed to have permitted being imaged by the in-camera while viewing video content on the video site ST. In other words, the terminal device 10-1 recognizes that it may image user U1 with the in-camera, and the terminal device 10-2 recognizes that it may image user U2 with the in-camera.

In this state, in the example of FIG. 1, user U1 is assumed to be viewing video content VC1 on the video site ST using the terminal device 10-1. Similarly, user U2 is assumed to be viewing video content VC2 on the video site ST using the terminal device 10-2. Both video content VC1 and video content VC2 are assumed to be comedy programs.

While user U1 is viewing the video content VC1, the terminal device 10-1 controls the in-camera and images the face (facial expression) of user U1 (step S1). For example, the terminal device 10-1 captures the facial expression of user U1 as a video (face video) while user U1 is viewing the video content VC1. The terminal device 10-1 then transmits the imaging information FDA1 obtained by imaging user U1 to the information processing device 100 (step S2). More specifically, the terminal device 10-1 analyzes the face video data obtained by imaging user U1 and thereby obtains imaging information FDA1 that includes the estimation result based on that analysis.

For example, the terminal device 10-1 estimates information on the emotions of user U1 based on the face video data (an example of imaging information). Specifically, based on the facial expression of user U1 shown in the face video data, the terminal device 10-1 estimates the emotion expression behavior of user U1 as the information on the emotions of user U1. For example, the terminal device 10-1 performs an estimation process that estimates the emotion expression behavior of user U1 by applying facial expression analysis to the face video data. The terminal device 10-1 also performs an estimation process that estimates the emotion expression behavior of user U1 by analyzing the user's pupils as shown in the face video data.

Here, emotion expression behavior means behavior that expresses emotions such as joy, anger, sorrow, and pleasure; examples include "laughing", "crying", and "being surprised". The following embodiments focus on the "laughing" behavior in particular, and the emotion expression behavior is hereinafter referred to as the "laughing behavior". In the example of FIG. 1, suppose the terminal device 10-1 estimates that user U1 has laughed. The terminal device 10-1 then estimates (calculates) a feature value indicating the degree of this laughing behavior (how hard the user laughed). For example, the terminal device 10-1 can estimate the feature value indicating the degree of the laughing behavior as a number from 1 to 10. If the laughter of user U1 is at the level of a faint smile, the terminal device 10-1 estimates a laugh degree of "2" as the feature value. If the laughter of user U1 is at the level of a hearty laugh, the terminal device 10-1 estimates a laugh degree of "9".

The terminal device 10-1 performs the above estimation process continuously while imaging user U1 at the very moment user U1 is viewing the video content VC1, that is, in real time. The terminal device 10-1 then transmits the imaging information FDA1 including the estimation result to the information processing device 100, for example every second. As one example, every second the terminal device 10-1 transmits to the information processing device 100 imaging information FDA1 that includes the time position (time code) corresponding to the playback time of the video content VC1, information indicating the emotion expression behavior, and the feature value of that emotion expression behavior. This real-time processing can be realized by communicating over a wireless communication network such as 5G (5th Generation).

Continuing the laughter example, because the terminal device 10-1 performs the above estimation process continuously while imaging user U1, it transmits to the information processing device 100, for example, imaging information FDA1 containing the time position "1 minute 53 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "0". It then transmits, for example, imaging information FDA1 containing the time position "1 minute 54 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "2", followed by imaging information FDA1 containing the time position "1 minute 55 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "9". Instead of transmitting the imaging information every second, the terminal device 10-1 may transmit it at any other time interval (for example, every 3 seconds).
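The per-second imaging information described above (time position, emotion expression behavior, feature value) can be pictured as a small record serialized for transmission. The sketch below is an assumption-laden illustration: the class name `ImagingInfo`, the field names, the use of JSON, and the accepted degree range of 0 to 10 (the examples above include a laugh degree of "0") are all choices made here, not details specified by the source.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ImagingInfo:
    """One estimation result sent from the terminal device (hypothetical layout)."""
    user_id: str
    video_id: str
    time_position_s: int  # playback position in seconds (time code)
    behavior: str         # emotion expression behavior, e.g. "laugh"
    degree: int           # feature value; 0..10 accepted in this sketch

    def __post_init__(self):
        if not 0 <= self.degree <= 10:
            raise ValueError("degree must be between 0 and 10")


def to_message(info: ImagingInfo) -> str:
    """Serialize one record as JSON (the wire format is an assumption)."""
    return json.dumps(asdict(info), sort_keys=True)


# The three records for user U1 from the example: 1:53, 1:54 and 1:55.
samples = [
    ImagingInfo("U1", "VC1", 1 * 60 + 53, "laugh", 0),
    ImagingInfo("U1", "VC1", 1 * 60 + 54, "laugh", 2),
    ImagingInfo("U1", "VC1", 1 * 60 + 55, "laugh", 9),
]
for sample in samples:
    print(to_message(sample))
```

Storing the time position as seconds keeps later comparisons and aggregation across users simple; a display format such as "1 minute 55 seconds" can be derived when needed.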

The above description of the terminal device 10-1 applies equally to the terminal device 10-2. Specifically, while user U2 is viewing the video content VC2, the terminal device 10-2 controls the in-camera and images the face (facial expression) of user U2 (step S1). For example, the terminal device 10-2 captures the facial expression of user U2 as a face video while user U2 is viewing the video content VC2. The terminal device 10-2 then transmits the imaging information FDA2 obtained by imaging user U2 to the information processing device 100 (step S2). Specifically, the terminal device 10-2 continuously performs the estimation process described for the terminal device 10-1 while imaging user U2 at the very moment user U2 is viewing the video content VC2, that is, in real time. The terminal device 10-2 then transmits the imaging information FDA2 including the estimation result to the information processing device 100, for example every second. As one example, every second the terminal device 10-2 transmits to the information processing device 100 imaging information FDA2 that includes the time position corresponding to the playback time of the video content VC2, information indicating the emotion expression behavior, and the feature value of that emotion expression behavior.

Because the terminal device 10-2 performs the above estimation process continuously while imaging user U2, it transmits to the information processing device 100, for example, imaging information FDA2 containing the time position "1 minute 53 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "0". It then transmits, for example, imaging information FDA2 containing the time position "1 minute 54 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "3", followed by imaging information FDA2 containing the time position "1 minute 55 seconds", the emotion expression behavior "laughing behavior", and the laugh degree "10".

Hereinafter, imaging information FDA1 and imaging information FDA2 may be referred to simply as imaging information FDA when there is no need to distinguish them. The information processing device 100 receives the imaging information FDA transmitted from the terminal device 10 (step S3). In other words, the information processing device 100 acquires the imaging information FDA from the terminal device 10. The information processing device 100 then stores the received imaging information FDA in the imaging information storage unit 121 (step S4). At this time, the information processing device 100 may also acquire attribute information on the user's attributes from the terminal device 10. Attribute information here means information such as the user's gender, age, interests and tastes, and region-related information including the user's place of residence and the user's location.

撮像情報記憶部１２１は、コンテンツを視聴中のユーザを、かかるコンテンツを表示している端末装置１０が有するインカメラ（撮像手段）で撮像することで得られる撮像情報ＦＤＡを記憶する。図１の例では、撮像情報記憶部１２１は、「ユーザＩＤ」、「動画ＩＤ」、「撮像情報」といった項目を有する。 The imaging information storage unit 121 stores the imaging information FDA obtained by imaging the user who is viewing the content with the in-camera (imaging means) of the terminal device 10 displaying the content. In the example of FIG. 1, the imaging information storage unit 121 has items such as "user ID", "moving image ID", and "imaging information".

「ユーザＩＤ」は、ユーザ又はユーザの端末装置１０を識別する識別情報を示す。「動画ＩＤ」は、ユーザが視聴する動画コンテンツであって、インカメラにて撮像されるユーザが視聴していた動画コンテンツを識別する識別情報を示す。「撮像情報」は、動画コンテンツを視聴中のユーザをインカメラで撮像することで得られる撮像情報であって、端末装置１０の推定処理による推定結果を含む撮像情報を示す。なお、撮像情報には、ユーザが撮像された顔動画のデータも含まれてよい。 The "user ID" is identification information that identifies the user or the user's terminal device 10. The "video ID" is identification information that identifies the video content the user was viewing while being imaged by the in-camera. The "imaging information" is the imaging information obtained by imaging, with the in-camera, the user who is viewing the video content, and it includes the estimation result of the estimation process performed by the terminal device 10. The imaging information may also include the data of the facial video in which the user was captured.

すなわち、図１に示す撮像情報記憶部１２１の例では、ユーザＩＤ「Ｕ１」によって識別されるユーザ（ユーザＵ１）が、動画ＩＤ「ＶＣ１」によって識別される動画コンテンツ（動画コンテンツＶＣ１）を閲覧中において、端末装置１０のインカメラによって撮像されることによって、ユーザＵ１の表情を含む撮像情報ＦＤＡ１が得られた例を示す。 That is, the example of the imaging information storage unit 121 shown in FIG. 1 indicates that imaging information FDA1 containing the facial expression of the user U1 was obtained when the user identified by the user ID "U1" (user U1) was imaged by the in-camera of the terminal device 10 while viewing the video content identified by the video ID "VC1" (video content VC1).
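The layout of the imaging information storage unit 121 described above can be pictured with a minimal Python sketch. The class names, field names, and the choice of seconds as the time unit are assumptions made for illustration; the specification only names the items "user ID", "video ID", and "imaging information".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EstimationRecord:
    """One estimation result sent by a terminal device (hypothetical layout)."""
    time_position_sec: int  # e.g. 115 for "1 minute 55 seconds"
    behavior: str           # emotion expression behavior, e.g. "laugh"
    degree: int             # laughter degree (feature amount)


@dataclass
class ImagingInfoStore:
    """Illustrative stand-in for the imaging information storage unit 121."""
    records: Dict[Tuple[str, str], List[EstimationRecord]] = field(default_factory=dict)

    def store(self, user_id: str, video_id: str, record: EstimationRecord) -> None:
        # Key each record by (user ID, video ID), mirroring the items in FIG. 1.
        self.records.setdefault((user_id, video_id), []).append(record)


# The three FDA2 examples from the text: 1:53, 1:54, and 1:55 for user U2.
store = ImagingInfoStore()
for sec, degree in [(113, 0), (114, 3), (115, 10)]:
    store.store("U2", "VC2", EstimationRecord(sec, "laugh", degree))
```

Keying by the pair of IDs is only one possible design; a relational table with the same three columns would serve equally well.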

次に、情報処理装置１００は、動画コンテンツにおいて笑う行動が行われた時間位置を特定する（ステップＳ５）。上記の通り、情報処理装置１００は、時間位置（例えば、１分５５秒）、感情表出行動（例えば、笑う行動）、笑い度（特徴量）（例えば、「９」）といった推定結果を含む撮像情報を端末装置１０（図１の例では、端末装置１０−１及び１０−２）から毎秒毎に受信する。このため、情報処理装置１００は、端末装置１０による推定結果（撮像情報）に基づいて、動画コンテンツにおいて笑う行動が行われた時間位置を特定する。例えば、情報処理装置１００は、特徴量である笑い度が所定の閾値（例えば、笑い度「５」）以上を示す時間位置を、動画コンテンツＶＣ１において、ユーザＵ１が笑う行動を行った時間位置として特定する。かかる例では、情報処理装置１００は、動画コンテンツＶＣ１の時間位置「ｔ２、ｔ２１、ｔ５１・・・」をユーザＵ１が笑う行動を行った時間位置として特定したとする。 Next, the information processing device 100 identifies the time positions in the video content at which a laughing behavior occurred (step S5). As described above, the information processing device 100 receives imaging information from the terminal device 10 (the terminal devices 10-1 and 10-2 in the example of FIG. 1) every second. This imaging information contains estimation results such as the time position (for example, 1 minute 55 seconds), the emotion expression behavior (for example, laughing behavior), and the laughter degree, which is a feature amount (for example, "9"). The information processing device 100 therefore identifies the time positions at which a laughing behavior occurred in the video content based on the estimation results (imaging information) from the terminal device 10. For example, the information processing device 100 identifies each time position at which the laughter degree is at or above a predetermined threshold (for example, a laughter degree of "5") as a time position at which the user U1 laughed in the video content VC1. In this example, assume that the information processing device 100 has identified the time positions "t2, t21, t51 ..." of the video content VC1 as the time positions at which the user U1 laughed.

また、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ１３、ｔ３１、ｔ５２・・・」をユーザＵ２が笑う行動を行った時間位置として特定したとする。 Further, it is assumed that the information processing device 100 specifies the time position "t13, t31, t52 ..." Of the moving image content VC2 as the time position where the user U2 performs the laughing action.
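The threshold test of step S5 can be sketched as follows. This is an illustrative sketch only: the function name, the tuple layout, and the behavior label "laugh" are assumptions, while the threshold of 5 and the sample values come from the examples in the text.

```python
from typing import Iterable, List, Tuple

LAUGH_THRESHOLD = 5  # example threshold given in the description


def laugh_time_positions(
    results: Iterable[Tuple[int, str, int]],
    threshold: int = LAUGH_THRESHOLD,
) -> List[int]:
    """Return the time positions (seconds) whose laughter degree meets the threshold.

    Each result is (time_position_sec, behavior, degree); this layout is hypothetical.
    """
    return [
        time_sec
        for time_sec, behavior, degree in results
        if behavior == "laugh" and degree >= threshold
    ]


# Per-second results for user U2 at 1:53, 1:54, and 1:55.
u2_results = [(113, "laugh", 0), (114, "laugh", 3), (115, "laugh", 10)]
print(laugh_time_positions(u2_results))  # [115]
```

Only the 1 minute 55 seconds sample is kept, because its laughter degree of 10 is the only one at or above 5.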

次に、情報処理装置１００は、端末装置１０により推定された感情表出行動と、ステップＳ５で特定した時間位置とを対応付けて、推定情報記憶部１２２に格納する（ステップＳ６）。推定情報記憶部１２２は、感情表出行動を推定した推定結果に関する情報を記憶する。図１の例では、推定情報記憶部１２２は、「動画ＩＤ」、「ユーザＩＤ」、「行動情報（笑う）」といった項目を有する。なお、情報処理装置１００は、感情表出行動として、笑う行動だけでなく、泣く行動や驚く行動等を推定する場合もある。このため、「行動情報」には、「泣く」や「驚く」といった項目も含まれてよい。 Next, the information processing device 100 associates the emotion expression behavior estimated by the terminal device 10 with the time position specified in step S5 and stores it in the estimated information storage unit 122 (step S6). The estimation information storage unit 122 stores information regarding the estimation result of estimating the emotion expression behavior. In the example of FIG. 1, the estimation information storage unit 122 has items such as "moving image ID", "user ID", and "behavior information (laughing)". The information processing device 100 may estimate not only a laughing behavior but also a crying behavior, a surprised behavior, and the like as emotional expression behaviors. Therefore, the "behavior information" may include items such as "crying" and "surprise".

また、情報処理装置１００は、画像解析等の従来技術を用いて、ユーザの顔動画から、かかるユーザの属性情報を推定してもよい。そして、情報処理装置１００は、ユーザの属性情報を「行動情報」と対応付けて推定情報記憶部１２２に格納してもよい。なお、情報処理装置１００は、予め端末装置１０からユーザの属性情報を取得している場合には、かかるユーザの「行動情報」と対応付けてユーザの属性情報を推定情報記憶部１２２に格納してもよい。 The information processing device 100 may also estimate the user's attribute information from the user's facial video by using a conventional technique such as image analysis. The information processing device 100 may then store the user's attribute information in the estimation information storage unit 122 in association with the "behavior information". If the information processing device 100 has acquired the user's attribute information from the terminal device 10 in advance, it may store that attribute information in the estimation information storage unit 122 in association with the user's "behavior information".

「動画ＩＤ」は、ユーザが視聴する動画コンテンツであって、インカメラにて撮像されるユーザが視聴している動画コンテンツを識別する識別情報を示す。「ユーザＩＤ」は、対応する動画コンテンツを視聴するユーザ又はユーザの端末装置を識別する識別情報を示す。「行動情報（笑い）」は、推定処理で推定された感情表出行動のうち、笑う行動が行われた時間位置を示す。 The "video ID" is the video content that the user watches, and indicates identification information that identifies the video content that the user is watching, which is captured by the in-camera. The "user ID" indicates identification information that identifies the user who views the corresponding moving image content or the terminal device of the user. "Behavioral information (laughter)" indicates the time position at which the laughing behavior was performed among the emotional expression behaviors estimated by the estimation processing.

上記例の通り、情報処理装置１００は、ユーザＵ１について、笑う行動は動画コンテンツＶＣ１の「ｔ２、ｔ２１、ｔ５１・・・」で行われたことを特定している。したがって、情報処理装置１００は、図１に示す推定情報記憶部１２２の例のように、動画ＩＤ「ＶＣ１」、ユーザＩＤ「Ｕ１」、行動情報（笑い）「ｔ２、ｔ２１、ｔ５１・・・」を対応付けて格納する。 As in the above example, the information processing device 100 has identified that the user U1 laughed at "t2, t21, t51 ..." of the video content VC1. Therefore, as in the example of the estimation information storage unit 122 shown in FIG. 1, the information processing device 100 stores the video ID "VC1", the user ID "U1", and the behavior information (laughter) "t2, t21, t51 ..." in association with one another.

また、上記例の通り、情報処理装置１００は、ユーザＵ２について、笑う行動は動画コンテンツＶＣ２の「ｔ１３、ｔ３１、ｔ５２・・・」で行われたことを特定している。したがって、情報処理装置１００は、図１に示す推定情報記憶部１２２の例のように、動画ＩＤ「ＶＣ２」、ユーザＩＤ「Ｕ２」、行動情報（笑い）「ｔ１３、ｔ３１、ｔ５２・・・」を対応付けて格納する。 Likewise, as in the above example, the information processing device 100 has identified that the user U2 laughed at "t13, t31, t52 ..." of the video content VC2. Therefore, as in the example of the estimation information storage unit 122 shown in FIG. 1, the information processing device 100 stores the video ID "VC2", the user ID "U2", and the behavior information (laughter) "t13, t31, t52 ..." in association with one another.

なお、推定情報記憶部１２２は、各ユーザが各動画コンテンツの中で行ったと推定される感情表出行動について、動画コンテンツの中で感情表出行動が行われた時間位置を記憶するため、ユーザ毎の集計結果を記憶する記憶部といえる。これに対して、後述する全体集計結果記憶部１２３は、ユーザ毎の集計結果をまとめて、全ユーザで見た場合はどうなるか集計し直した集計結果を記憶する。 Because the estimation information storage unit 122 stores, for each emotion expression behavior that each user is estimated to have performed in each video content, the time position in the video content at which that behavior occurred, it can be regarded as a storage unit that holds per-user aggregation results. In contrast, the overall aggregation result storage unit 123 described later combines the per-user aggregation results and stores the result of re-aggregating them across all users.

次に、情報処理装置１００は、ステップＳ５で特定した時間位置に基づいて、動画コンテンツの中で感情表出行動が行われた人数を集計する（ステップＳ７）。例えば、情報処理装置１００は、各動画コンテンツの中で笑う行動を行った人数である行動人数を、各動画コンテンツの時間位置毎に集計する。例えば、情報処理装置１００は、推定情報記憶部１２２に記憶される情報を用いて、かかる集計を行う。 Next, the information processing device 100 totals the number of people who have performed emotional expression actions in the moving image content based on the time position specified in step S5 (step S7). For example, the information processing device 100 totals the number of people who have performed a laughing action in each video content for each time position of each video content. For example, the information processing apparatus 100 uses the information stored in the estimation information storage unit 122 to perform such aggregation.
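The per-time-position count of step S7 can be sketched as follows, assuming the per-user data in the estimation information storage unit 122 is available as a mapping from (video ID, user ID) to the time positions at which that user laughed. The function name and the user "U3" are hypothetical and added only so that one time position has more than one laughing user.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (video ID, user ID) -> time positions at which that user laughed.
estimated: Dict[Tuple[str, str], List[str]] = {
    ("VC1", "U1"): ["t2", "t21", "t51"],
    ("VC1", "U3"): ["t2", "t51"],  # U3 is a hypothetical extra viewer
    ("VC2", "U2"): ["t13", "t31", "t52"],
}


def count_acting_users(
    per_user: Dict[Tuple[str, str], List[str]],
) -> Dict[str, Counter]:
    """Count, per video and per time position, how many users laughed there."""
    totals: Dict[str, Counter] = {}
    for (video_id, _user_id), positions in per_user.items():
        # set() ensures a user is counted at most once per time position.
        totals.setdefault(video_id, Counter()).update(set(positions))
    return totals


totals = count_acting_users(estimated)
print(totals["VC1"]["t2"])   # 2
print(totals["VC2"]["t13"])  # 1
```

The resulting per-video counters correspond to the rows that the text later places in the overall aggregation result storage unit 123.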

図１の例では、情報処理装置１００は、動画コンテンツＶＣ１の時間位置「ｔ１」では、所定期間の間に動画コンテンツＶＣ１を視聴したユーザの総数のうち、「１３５人」が笑う行動を行った（行動人数１３５人）との集計結果を得たものとする。また、情報処理装置１００は、動画コンテンツＶＣ１の時間位置「ｔ２」では、所定期間の間に動画コンテンツＶＣ１を視聴したユーザの総数のうち、「６９３人」が笑う行動を行った（行動人数６９３人）との集計結果を得たものとする。また、情報処理装置１００は、動画コンテンツＶＣ１の時間位置「ｔ３」では、所定期間の間に動画コンテンツＶＣ１を視聴したユーザの総数のうち、「８６人」が笑う行動を行った（行動人数８６人）との集計結果を得たものとする。 In the example of FIG. 1, in the information processing device 100, at the time position "t1" of the video content VC1, "135" out of the total number of users who watched the video content VC1 during a predetermined period acted to laugh. It is assumed that the aggregated result with (135 people in action) is obtained. Further, in the information processing device 100, at the time position "t2" of the video content VC1, "693" out of the total number of users who watched the video content VC1 during the predetermined period acted to laugh (the number of actions 693). It is assumed that the aggregated result with (person) is obtained. Further, in the information processing device 100, at the time position "t3" of the video content VC1, "86 people" out of the total number of users who watched the video content VC1 during the predetermined period acted to laugh (the number of actions 86). It is assumed that the aggregated result with (person) is obtained.

また、図１の例では、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ１」では、所定期間の間に動画コンテンツＶＣ２を視聴したユーザの総数のうち、「３２１人」が笑う行動を行った（行動人数３２１人）との集計結果を得たものとする。また、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ２」では、所定期間の間に動画コンテンツＶＣ２を視聴したユーザの総数のうち、「５９２人」が笑う行動を行った（行動人数５９２人）との集計結果を得たものとする。また、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ３」では、所定期間の間に動画コンテンツＶＣ２を視聴したユーザの総数のうち、「２９３人」が笑う行動を行った（行動人数２９３人）との集計結果を得たものとする。 Further, in the example of FIG. 1, at the time position "t1" of the video content VC2, the information processing device 100 makes "321 people" laugh out of the total number of users who watched the video content VC2 during a predetermined period. It is assumed that the total result of the visit (321 people in action) is obtained. Further, in the information processing device 100, at the time position "t2" of the video content VC2, "592 people" out of the total number of users who watched the video content VC2 during the predetermined period performed an action of laughing (number of actions 592). It is assumed that the aggregated result with (person) is obtained. Further, in the information processing device 100, at the time position "t3" of the video content VC2, "293 people" out of the total number of users who watched the video content VC2 during the predetermined period acted to laugh (the number of actions 293). It is assumed that the aggregated result with (person) is obtained.

次に、情報処理装置１００は、ステップＳ７での集計結果として、行動人数を全体集計結果記憶部１２３に格納する（ステップＳ８）。全体集計結果記憶部１２３は、所定期間の間において、各動画コンテンツを視聴したユーザの総数のうち、笑う行動を行ったユーザの人数である行動人数を、各動画コンテンツの時間位置毎に記憶する。図１の例では、全体集計結果記憶部１２３は、「動画ＩＤ」、「行動情報（笑う）」といった項目を有する。なお、情報処理装置１００は、感情表出行動として、笑う行動だけでなく、泣く行動や驚く行動等を推定する場合もある。このため、「行動情報」には、「泣く」や「驚く」といった項目も含まれてよい。 Next, the information processing apparatus 100 stores the number of active persons in the total aggregation result storage unit 123 as the aggregation result in step S7 (step S8). The total total result storage unit 123 stores the number of users who have performed a laughing action out of the total number of users who have viewed each video content during a predetermined period, for each time position of each video content. .. In the example of FIG. 1, the total total result storage unit 123 has items such as "moving image ID" and "behavior information (laughing)". The information processing device 100 may estimate not only a laughing behavior but also a crying behavior, a surprised behavior, and the like as emotional expression behaviors. Therefore, the "behavior information" may include items such as "crying" and "surprise".

「動画ＩＤ」は、ユーザが視聴する動画コンテンツであって、インカメラにて撮像されるユーザが視聴している動画コンテンツを識別する識別情報を示す。「行動情報（笑う）」に対応付けられる項目（「ｔ１」、「ｔ２」、「ｔ３」・・・）は、各動画コンテンツの時間位置を示し、所定期間の間、動画コンテンツを閲覧したユーザの総数うち、その時間位置において笑う行動を行ったユーザの人数である行動人数が入力される。 The "video ID" is the video content that the user watches, and indicates identification information that identifies the video content that the user is watching, which is captured by the in-camera. Items ("t1", "t2", "t3" ...) Associated with "behavior information (laughing)" indicate the time position of each video content, and the user who browsed the video content for a predetermined period of time. Of the total number of users, the number of users who performed a laughing action at that time position is input.

上記例の通り、情報処理装置１００は、動画コンテンツＶＣ１の時間位置「ｔ１」では行動人数「１３５人」、時間位置「ｔ２」では行動人数「６９３人」、時間位置「ｔ３」では行動人数「８６人」との集計結果を得ている。したがって、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ１」及び時間位置「ｔ１」に対応する入力欄に「１３５人」を入力する。また、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ１」及び時間位置「ｔ２」に対応する入力欄に「６９３人」を入力する。また、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ１」及び時間位置「ｔ３」に対応する入力欄に「８６人」を入力する。 As in the above example, the information processing device 100 has obtained the aggregation result that, for the video content VC1, the number of acting users is "135 people" at time position "t1", "693 people" at time position "t2", and "86 people" at time position "t3". Therefore, as in the example of the overall aggregation result storage unit 123 shown in FIG. 1, the information processing device 100 enters "135 people" in the field corresponding to the video ID "VC1" and the time position "t1". In the same way, it enters "693 people" in the field corresponding to the video ID "VC1" and the time position "t2", and "86 people" in the field corresponding to the video ID "VC1" and the time position "t3".

また、上記例の通り、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ１」では行動人数「３２１人」、時間位置「ｔ２」では行動人数「５９２人」、時間位置「ｔ３」では行動人数「２９３人」との集計結果を得ている。したがって、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ２」及び時間位置「ｔ１」に対応する入力欄に「３２１人」を入力する。また、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ２」及び時間位置「ｔ２」に対応する入力欄に「５９２人」を入力する。また、情報処理装置１００は、図１に示す全体集計結果記憶部１２３の例のように、動画ＩＤ「ＶＣ２」及び時間位置「ｔ３」に対応する入力欄に「２９３人」を入力する。 Likewise, as in the above example, the information processing device 100 has obtained the aggregation result that, for the video content VC2, the number of acting users is "321 people" at time position "t1", "592 people" at time position "t2", and "293 people" at time position "t3". Therefore, as in the example of the overall aggregation result storage unit 123 shown in FIG. 1, the information processing device 100 enters "321 people" in the field corresponding to the video ID "VC2" and the time position "t1". In the same way, it enters "592 people" in the field corresponding to the video ID "VC2" and the time position "t2", and "293 people" in the field corresponding to the video ID "VC2" and the time position "t3".

次に、情報処理装置１００は、ステップＳ８での集計結果、すなわち行動人数に基づいて、動画コンテンツに関する情報をユーザに提示する（ステップＳ９）。例えば、情報処理装置１００は、動画コンテンツの中で感情表出行動を行ったユーザの人数である行動人数であって、動画コンテンツの時間位置に応じて変化する行動人数の遷移を示すグラフを、かかる動画コンテンツとともに表示されるシークバーが示す時間位置に対応付けて提示する。 Next, the information processing device 100 presents information about the moving image content to the user based on the aggregation result in step S8, that is, the number of people acting (step S9). For example, the information processing device 100 is a graph showing the transition of the number of actions, which is the number of users who have performed emotional expression actions in the video content, and changes according to the time position of the video content. It is presented in association with the time position indicated by the seek bar displayed together with the moving image content.

上記の通り、全体集計結果記憶部１２３は、所定期間の間において、各動画コンテンツを視聴したユーザの総数のうち、笑う行動を行ったユーザの人数である行動人数を、各動画コンテンツの時間位置毎に記憶する。このようなことから、全体集計結果記憶部１２３に記憶される集計結果は、動画コンテンツの時間位置に応じて変化する行動人数の遷移と言い換えることもできる。したがって、ステップＳ９では、情報処理装置１００は、動画コンテンツの中で感情表出行動を行ったユーザの人数である行動人数であって、動画コンテンツの時間位置に応じて変化する行動人数の遷移を示すグラフが、動画コンテンツとともに表示されるシークバーが示す時間位置に対応付けて表示されるよう表示制御する。 As described above, the total total result storage unit 123 sets the number of actions, which is the number of users who performed a laughing action, out of the total number of users who watched each video content during a predetermined period, at the time position of each video content. Remember every time. For this reason, the total result stored in the total total result storage unit 123 can be rephrased as a transition of the number of active persons that changes according to the time position of the moving image content. Therefore, in step S9, the information processing device 100 changes the number of actions, which is the number of users who have performed emotional expression actions in the video content, and changes according to the time position of the video content. The display is controlled so that the graph shown is displayed in association with the time position indicated by the seek bar displayed together with the moving image content.

ここで、図２に実施形態に係る提示処理の一例を示す。図２では、ユーザＵ１が、動画コンテンツＶＣ２を閲覧する際を例に説明する。まず、端末装置１０は、ユーザＵ１の操作に応じて、動画サイトＳＴにおいて動画コンテンツＶＣ２をストリーミング配信させるための配信要求をコンテンツ配信装置３０に送信する（ステップＳ１０）。例えば、ユーザＵ１が動画サイトＳＴにおいて、動画コンテンツＶＣ２を示すクエリを指定したとすると、端末装置１０は、かかるクエリを含む配信要求をコンテンツ配信装置３０に送信する。 Here, FIG. 2 shows an example of the presentation process according to the embodiment. In FIG. 2, a case where the user U1 browses the moving image content VC2 will be described as an example. First, the terminal device 10 transmits a distribution request for streaming distribution of the video content VC2 on the video site ST to the content distribution device 30 in response to the operation of the user U1 (step S10). For example, if the user U1 specifies a query indicating the video content VC2 on the video site ST, the terminal device 10 transmits a distribution request including such a query to the content distribution device 30.

続いて、コンテンツ配信装置３０は、配信要求を受信すると、ユーザＵ１の端末装置１０から動画コンテンツＶＣ２の配信要求を受信した旨を情報処理装置１００に通知する（ステップＳ１１）。例えば、コンテンツ配信装置３０は、ユーザＩＤ「Ｕ１」と、動画ＩＤ「ＶＣ２」とを含む情報を情報処理装置１００に通知する。 Subsequently, when the content distribution device 30 receives the distribution request, the content distribution device 30 notifies the information processing device 100 that the distribution request for the video content VC2 has been received from the terminal device 10 of the user U1 (step S11). For example, the content distribution device 30 notifies the information processing device 100 of information including the user ID “U1” and the moving image ID “VC2”.

そして、情報処理装置１００は、コンテンツ配信装置３０から通知を受信すると、動画コンテンツＶＣ２の中で笑う行動を行ったユーザの人数である行動人数であって、動画コンテンツＶＣ２の時間位置に応じて変化する行動人数の遷移を示すグラフＧを生成する（ステップＳ１２）。具体的には、情報処理装置１００は、全体集計結果記憶部１２３にアクセスし、動画ＩＤ「ＶＣ２」に対応付けられる行動人数を取得する。より具体的には、情報処理装置１００は、動画ＩＤ「ＶＣ２」に対応付けられる行動人数として、動画コンテンツＶＣ２の時間位置の変化（例えば、時間位置ｔ１、ｔ２、ｔ３といった時間位置の変化）に応じて変化する行動人数を取得する。図１の例では、情報処理装置１００は、時間位置ｔ１では「３２１人」、時間位置ｔ２では「５９２人」、時間位置ｔ３では「２９３人」といった、時間位置の変化に応じて変化する行動人数の遷移（遷移情報）を取得する。 When the information processing device 100 receives the notification from the content distribution device 30, it generates a graph G showing the transition of the number of acting users, that is, the number of users who laughed in the video content VC2, which changes with the time position of the video content VC2 (step S12). Specifically, the information processing device 100 accesses the overall aggregation result storage unit 123 and acquires the number of acting users associated with the video ID "VC2". More specifically, it acquires, as the number of acting users associated with the video ID "VC2", the counts that change as the time position of the video content VC2 changes (for example, across the time positions t1, t2, and t3). In the example of FIG. 1, the information processing device 100 acquires the transition (transition information) of the number of acting users across time positions: "321 people" at time position t1, "592 people" at time position t2, and "293 people" at time position t3.

そして、情報処理装置１００は、取得した遷移情報に基づいて、グラフＧを生成する。例えば、情報処理装置１００は、横軸（Ｘ座標）を動画コンテンツＶＣ２の時間位置、縦軸（Ｙ座標）を行動人数として、各時間位置に対応する行動人数をプロットすることで、グラフＧを生成する。 Then, the information processing apparatus 100 generates the graph G based on the acquired transition information. For example, the information processing device 100 plots the graph G by plotting the number of actions corresponding to each time position, with the horizontal axis (X coordinate) as the time position of the moving image content VC2 and the vertical axis (Y coordinate) as the number of actions. Generate.
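As a rough illustration of how the transition information could be turned into plot data for the graph G, the sketch below sorts the counts by time position and returns (x, y) pairs. Representing the time positions t1, t2, and t3 as the integers 1, 2, and 3 is an assumption, as is the function name.

```python
from typing import Dict, List, Tuple


def build_graph_points(counts: Dict[int, int]) -> List[Tuple[int, int]]:
    """Return (time position, number of acting users) pairs ordered along the X axis."""
    return sorted(counts.items())


# Transition information for video content VC2 from the example of FIG. 1.
vc2_counts = {2: 592, 1: 321, 3: 293}
points = build_graph_points(vc2_counts)
print(points)  # [(1, 321), (2, 592), (3, 293)]
```

Any plotting layer can then draw these points with the time position on the horizontal axis and the number of acting users on the vertical axis.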

次に、情報処理装置１００は、ステップＳ１２で生成したグラフＧが動画コンテンツＶＣ２の再生箇所（時間位置）をユーザ側がコントロールすることができるシークバーＢＲ上に表示されるようコンテンツ配信装置３０に対して表示制御する（ステップＳ１３）。具体的には、情報処理装置１００は、グラフＧの横軸が示す時間位置、すなわち動画コンテンツＶＣ２の時間位置が、シークバーＢＲの時間位置に対応付けて表示されるようコンテンツ配信装置３０に対して表示制御する。例えば、情報処理装置１００は、端末装置１０がシークバーＢＲ上にグラフＧを表示するよう、端末装置１０に対してグラフＧを配信するようコンテンツ配信装置３０に指示する。また、情報処理装置１００は、グラフＧをコンテンツ配信装置３０に送信する。 Next, the information processing device 100 performs display control on the content distribution device 30 so that the graph G generated in step S12 is displayed on the seek bar BR, with which the user can control the playback position (time position) of the video content VC2 (step S13). Specifically, the information processing device 100 performs display control on the content distribution device 30 so that the time positions on the horizontal axis of the graph G, that is, the time positions of the video content VC2, are displayed in correspondence with the time positions of the seek bar BR. For example, the information processing device 100 instructs the content distribution device 30 to distribute the graph G to the terminal device 10 so that the terminal device 10 displays the graph G on the seek bar BR. The information processing device 100 also transmits the graph G to the content distribution device 30.

シークバーの時間位置は、動画コンテンツＶＣ２の時間位置に対応付けられる。例えば、ユーザＵ１は、シークバーを時間位置「３２分」のところに合わせた場合、動画コンテンツＶＣ２を再生時間「３２分」のところから視聴することができる。このような状態において、グラフＧの時間位置もシークバーの時間位置に対応付けられる。したがって、シークバーの時間位置「３２分」は、グラフＧの時間位置「３２分」に一致する。 The time position of the seek bar is associated with the time position of the moving image content VC2. For example, when the seek bar is set to the time position "32 minutes", the user U1 can watch the moving image content VC2 from the playback time "32 minutes". In such a state, the time position of the graph G is also associated with the time position of the seek bar. Therefore, the time position "32 minutes" of the seek bar coincides with the time position "32 minutes" of the graph G.
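One way to keep the graph G and the seek bar aligned is to map both through the same time-to-pixel function. The sketch below assumes a linear seek bar; the function name, the 64-minute duration, and the 640-pixel width are hypothetical values chosen for illustration.

```python
def time_to_seekbar_x(time_sec: float, duration_sec: float, bar_width_px: int) -> int:
    """Map a time position to a horizontal pixel offset on a linear seek bar."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    # Clamp so that out-of-range time positions stay on the bar.
    ratio = min(max(time_sec / duration_sec, 0.0), 1.0)
    return round(ratio * bar_width_px)


# The "32 minutes" position of a hypothetical 64-minute video on a 640 px bar.
x = time_to_seekbar_x(32 * 60, 64 * 60, 640)
print(x)  # 320
```

If the graph's X coordinates are computed with the same function, a point at 32 minutes on the graph sits directly above the 32-minute position of the seek bar.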

説明を戻す。コンテンツ配信装置３０は、情報処理装置１００からの表示制御に応じて、動画コンテンツＶＣ２をストリーミング配信する（ステップＳ１４）。例えば、コンテンツ配信装置３０は、動画コンテンツＶＣ２をストリーミング配信するにあたって、シークバーＢＲ上にグラフＧを表示するよう、端末装置１０に対してグラフＧを配信する。これにより、図２に示す端末装置１０の表示画面Ｄのように、シークバーＢＲ上にグラフＧが表示される。 Returning to the description, the content distribution device 30 streams the video content VC2 in accordance with the display control from the information processing device 100 (step S14). For example, when streaming the video content VC2, the content distribution device 30 distributes the graph G to the terminal device 10 so that the graph G is displayed on the seek bar BR. As a result, the graph G is displayed on the seek bar BR, as on the display screen D of the terminal device 10 shown in FIG. 2.

図２に示す表示画面Ｄの例によると、動画サイトＳＴに含まれる領域ＡＲ１内に、実際に動画コンテンツＶＣ２が再生表示される領域ＰＬ１が存在し、領域ＰＬ１内には動画コンテンツＶＣ２の再生を開始しるための再生ボタンＢＴ３が表示される。なお、領域ＰＬ１は、動画コンテンツＶＣ２の再生制御を行うプレーヤーＰＬ１と言い換えることができるものとする。プレーヤーＰＬ１は、例えば、ブラウザ上で動画コンテンツの再生制御を行うブラウザ版プレーヤー（ウェブプレーヤー）であってもよいし、アプリケーション（アプリＡＰ）としてのプレーヤー（アプリ版プレーヤー）であってもよい。また、予め、シークバーＢＲの時間位置のうち、最も行動人数が多い再生位置から選択された状態で動画コンテンツが再生されてもよい。また、ユーザに対して、最も行動人数が多い再生位置から動画コンテンツを再生するか否かを提示してもよい。 According to the example of the display screen D shown in FIG. 2, the area PL1 in which the video content VC2 is actually reproduced and displayed exists in the area AR1 included in the video site ST, and the video content VC2 is reproduced in the area PL1. The play button BT3 for starting is displayed. The area PL1 can be rephrased as the player PL1 that controls the reproduction of the moving image content VC2. The player PL1 may be, for example, a browser version player (web player) that controls playback of video content on a browser, or a player (application version player) as an application (application AP). In addition, the moving image content may be played in advance in a state of being selected from the playback positions with the largest number of active players among the time positions of the seek bar BR. In addition, the user may be presented with whether or not to play the video content from the playback position where the number of active players is the largest.

また、シークバーＢＲ上には、グラフＧが表示される。上記の通り、シークバーＢＲの時間位置と、グラフＧの時間位置とは一致している。また、グラフＧの縦軸は行動人数を示すため、ユーザＵ１は、他のユーザはおよそどの時間位置でよく笑っていたかをグラフＧを一目見て把握することができる。このため、ユーザＵ１は、例えば、動画コンテンツＶＣ２の中で面白いポイントだけピックアップして視聴した場合、例えば、グラフＧのピークに対応する時間位置にシークバーＢＲのカーソルを合わせることで、簡単に面白いポイントの箇所へと移動することができる。また、これにより、目利きの人が面白いポイントを探さなければならないといった面倒な作業を無くすことができる。 The graph G is also displayed on the seek bar BR. As described above, the time positions of the seek bar BR coincide with those of the graph G. Because the vertical axis of the graph G indicates the number of acting users, the user U1 can see at a glance roughly at which time positions other users laughed most. Therefore, when the user U1 wants to pick out and view only the interesting points in the video content VC2, for example, the user can easily jump to them by moving the cursor of the seek bar BR to a time position corresponding to a peak of the graph G. This also eliminates the tedious work of having a person with a trained eye search for the interesting points.

また、図３を用いて、所定の時間位置に感情を抽象化したマークを付した動画コンテンツＶＣ２を配信する場合の表示画面の例を説明する。図３は、実施形態に係る表示画面の一例を示す図である。ここで、コンテンツ配信装置３０は、情報処理装置１００からの表示制御に応じて、動画コンテンツＶＣ２の所定の時間位置に感情を抽象化したマークを付した動画コンテンツＶＣ２をストリーミング配信するものとして説明する。 Further, with reference to FIG. 3, an example of a display screen in the case of delivering the moving image content VC2 having a mark that abstracts emotions at a predetermined time position will be described. FIG. 3 is a diagram showing an example of a display screen according to the embodiment. Here, the content distribution device 30 will be described as streaming distribution of the video content VC2 having a mark that abstracts emotions at a predetermined time position of the video content VC2 in response to the display control from the information processing device 100. ..

図３に示す表示画面Ｔの例によると、動画サイトＳＴに含まれる領域ＡＲ２内に、実際に動画コンテンツＶＣ２が再生表示されるプレーヤーＰＬ２が表示される。また、図３に示す表示画面Ｔの例によると、シークバーＢＲ上にグラフＧが表示される。 According to the example of the display screen T shown in FIG. 3, the player PL2 in which the moving image content VC2 is actually reproduced and displayed is displayed in the area AR2 included in the moving image site ST. Further, according to the example of the display screen T shown in FIG. 3, the graph G is displayed on the seek bar BR.

ここで、動画コンテンツＶＣ２のうち、時間位置「ｔ２」で所定期間の間に動画コンテンツＶＣ２を視聴したユーザの総数のうち、最も多い「５９２人」が笑う行動を行ったとの集計結果を得たものとする。この場合、情報処理装置１００は、動画コンテンツＶＣ２のうち、時間位置「ｔ２」において笑った顔文字マークＭＲを付すようにコンテンツ配信装置３０に対して表示制御する。例えば、情報処理装置１００は、動画コンテンツＶＣ２の時間位置「ｔ２」において、笑った顔文字マークＭＲがプレーヤーＰＬ２の下方向からプレーヤーＰＬ２の中央付近に素早く飛出すような表示態様で表示制御する。この場合、笑った顔文字マークＭＲは、動画コンテンツＶＣ２に重畳されるように表示される。これにより、情報処理装置１００は、観客がいないリアルタイム配信においても、ユーザ間で一体感を演出したサービスの提供が可能となる。 Here, assume the aggregation result shows that, within the video content VC2, the time position "t2" is where the largest number of users laughed: "592 people" out of all users who viewed the video content VC2 during the predetermined period. In this case, the information processing device 100 performs display control on the content distribution device 30 so that a laughing emoticon mark MR is attached at the time position "t2" of the video content VC2. For example, the information processing device 100 controls the display so that, at the time position "t2" of the video content VC2, the laughing emoticon mark MR quickly pops up from below the player PL2 toward the vicinity of its center. In this case, the laughing emoticon mark MR is displayed superimposed on the video content VC2. This allows the information processing device 100 to provide a service that creates a sense of unity among users even in real-time distribution without an audience.

また、動画コンテンツＶＣ２は、予め、シークバーＢＲの時間位置のうち、最も行動人数が多い再生位置から選択された状態で再生されてもよい。また、ユーザに対して、最も行動人数が多い再生位置から動画コンテンツＶＣ２を再生するか否かを提示してもよい。例えば、図３の例では、グラフＧのうち、時間位置「ｔ２」に笑った顔文字マークＭＲが付されている。これにより、ユーザに対して、最も笑う行動を行った人数が多い再生位置である時間位置「ｔ２」から動画コンテンツＶＣ２を再生するように提示してもよい。 The video content VC2 may also be played with the playback position having the largest number of acting users preselected among the time positions of the seek bar BR. Alternatively, the user may be asked whether to play the video content VC2 from the playback position with the largest number of acting users. For example, in FIG. 3, the laughing emoticon mark MR is attached to the time position "t2" of the graph G. In this way, the user may be prompted to play the video content VC2 from the time position "t2", which is the playback position at which the most users laughed.
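Selecting the playback position with the largest number of acting users amounts to taking the maximum over the aggregated counts. A minimal sketch follows, assuming the counts for one video are held in a dictionary keyed by time position; the function name is hypothetical.

```python
from typing import Dict


def peak_time_position(counts: Dict[str, int]) -> str:
    """Return the time position with the largest number of acting users.

    If several positions tie, the one that appears first in the dictionary is returned.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    return max(counts, key=lambda position: counts[position])


# Aggregated counts for video content VC2 from the example of FIG. 1.
vc2_counts = {"t1": 321, "t2": 592, "t3": 293}
print(peak_time_position(vc2_counts))  # t2
```

The returned position could serve both as the preselected start position and as the place where the laughing emoticon mark MR is attached.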

なお、笑った顔文字マークＭＲを付す例に限定されなくともよく、感情を抽象化したマークの代わりに、笑い声や、効果音や、キャラクタを付してもよい。このように、情報処理装置１００は、動画コンテンツの盛り上りを演出できるような効果であれば如何なる情報を付すように表示制御してもよい。また、上記例では、ユーザの感情として、笑いについて例を挙げて説明したが、上記処理は、泣くや、驚く等の感情にも適用可能である。 It is not limited to the example of attaching the laughing emoticon mark MR, and instead of the mark that abstracts emotions, a laughing voice, a sound effect, or a character may be attached. As described above, the information processing apparatus 100 may display and control the display so as to add any information as long as the effect is such that the excitement of the moving image content can be produced. Further, in the above example, laughter has been described as an example of the user's emotions, but the above processing can also be applied to emotions such as crying and surprise.

以上、図１及び図２を用いて説明してきたように、実施形態に係る情報処理装置１００は、コンテンツ（例えば、動画コンテンツ）を視聴中のユーザを、かかるコンテンツを表示している端末装置１０が有するインカメラによって撮像された撮像情報が示すユーザの表情に基づいて推定されたユーザの感情に関する情報を取得する。そして、情報処理装置１００は、取得された推定結果を集計することにより、コンテンツの中でユーザの感情に変化が生じたポイントである感情ポイントを特定する。また、情報処理装置１００は、推定結果に基づいて、コンテンツに関する情報を提示する。これにより、実施形態にかかる情報処理装置１００は、コンテンツを視聴することでユーザに生じた感情の変化に応じて、ユーザにとって有意義な情報を提供することができる。 As described above with reference to FIGS. 1 and 2, the information processing device 100 according to the embodiment acquires information on a user's emotion estimated from the user's facial expression, as indicated by imaging information obtained when the in-camera of the terminal device 10 displaying content (for example, video content) images the user viewing that content. The information processing device 100 then aggregates the acquired estimation results to identify emotion points, which are the points in the content at which the user's emotion changed. The information processing device 100 also presents information about the content based on the estimation results. The information processing device 100 according to the embodiment can thereby provide information that is meaningful to the user in response to the emotional changes that viewing the content produces in the user.

〔２．端末装置の構成〕
次に、図５を用いて、実施形態にかかる端末装置１０について説明する。図５は、実施形態に係る端末装置１０の構成例を示す図である。図５に示すように、端末装置１０は、通信部１１と、表示部１２と、撮像部１３と、制御部１４とを有する。端末装置１０は、ユーザによって利用される情報処理装置である。 [2. Terminal device configuration]
Next, the terminal device 10 according to the embodiment will be described with reference to FIG. 5. FIG. 5 is a diagram showing a configuration example of the terminal device 10 according to the embodiment. As shown in FIG. 5, the terminal device 10 includes a communication unit 11, a display unit 12, an imaging unit 13, and a control unit 14. The terminal device 10 is an information processing device used by the user.

（通信部１１について）
通信部１１は、例えば、ＮＩＣ（Network Interface Card）等によって実現される。そして、通信部１１は、ネットワークＮと有線又は無線で接続され、例えば、コンテンツ配信装置３０や情報処理装置１００との間で情報の送受信を行う。 (About communication unit 11)
The communication unit 11 is realized by, for example, a NIC (Network Interface Card) or the like. Then, the communication unit 11 is connected to the network N by wire or wirelessly, and transmits / receives information to / from, for example, the content distribution device 30 and the information processing device 100.

（表示部１２について）
表示部１２は、各種情報を表示する表示デバイスであり、図２に示す表示画面Ｄに相当する。例えば、表示部１２には、タッチパネルが採用される。また、表示部１２は、例えば、撮像部１３によってレンズから取り込まれた映像を表示する。 (About display unit 12)
The display unit 12 is a display device that displays various types of information, and corresponds to the display screen D shown in FIG. 2. For example, a touch panel is adopted for the display unit 12. Further, the display unit 12 displays, for example, the image taken in through the lens by the imaging unit 13.

（撮像部１３について）
撮像部１３は、撮像素子を内蔵し、画像や動画を撮像するデバイスである。撮像素子は、ＣＣＤ(Charge Coupled Device)、ＣＭＯＳ（Complementary Metal Oxide Semiconductor）など何れでもよい。例えば、撮像部１３は、レンズから取り込んだ映像であって表示部１２に現在表示されている映像を静止画像として写真撮影したり、動画撮影したりすることができる。また、撮像部１３は、図１で説明したインカメラに相当するものとする。 (About the imaging unit 13)
The imaging unit 13 is a device that has a built-in image sensor and captures still images and moving images. The image sensor may be a CCD (Charge Coupled Device), a CMOS (Complementary Metal Oxide Semiconductor) sensor, or the like. For example, the imaging unit 13 can capture the image taken in through the lens and currently displayed on the display unit 12 as a still photograph or as a moving image. Further, the imaging unit 13 corresponds to the in-camera described with reference to FIG. 1.

（制御部１４について）
制御部１４は、ＣＰＵ（Central Processing Unit）やＭＰＵ（Micro Processing Unit）等によって、端末装置１０内部の記憶装置に記憶されている各種プログラムがＲＡＭ（Random Access Memory)を作業領域として実行されることにより実現される。また、制御部１４は、例えば、ＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）等の集積回路により実現される。また、制御部１４は、実施形態に係る情報処理プログラム（アプリＡＰ）により実行される処理部である。 (About control unit 14)
The control unit 14 is realized by a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like executing various programs stored in a storage device inside the terminal device 10, using a RAM (Random Access Memory) as a work area. The control unit 14 is also realized by, for example, an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). Further, the control unit 14 is a processing unit executed by the information processing program (application AP) according to the embodiment.

図５に示すように、制御部１４は、要求部１４ａと、同意情報受付部１４ｂと、表示制御部１４ｃと、カメラ制御部１４ｄと、取得部１４ｅと、推定部１４ｆ、送信部１４ｇとを有し、以下に説明する情報処理の機能や作用を実現又は実行する。なお、制御部１４の内部構成は、図５に示した構成に限られず、後述する情報処理を行う構成であれば他の構成であってもよい。また、制御部１４が有する各処理部の接続関係は、図５に示した接続関係に限られず、他の接続関係であってもよい。 As shown in FIG. 5, the control unit 14 includes a request unit 14a, a consent information reception unit 14b, a display control unit 14c, a camera control unit 14d, an acquisition unit 14e, an estimation unit 14f, and a transmission unit 14g, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 14 is not limited to the configuration shown in FIG. 5, and may be any other configuration that performs the information processing described later. Further, the connection relationship between the processing units included in the control unit 14 is not limited to the connection relationship shown in FIG. 5, and may be another connection relationship.

（要求部１４ａ）
要求部１４ａは、コンテンツ（例えば、動画コンテンツ）の配信を要求する。例えば、要求部１４ａは、コンテンツ配信装置３０に対して、コンテンツの配信を要求する。例えば、要求部１４ａは、コンテンツの配信を要求する配信要求をコンテンツ配信装置３０に送信する。図１の例では、端末装置１０は、ユーザＵ１の操作に応じて、動画サイトＳＴにおいて動画コンテンツをストリーミング配信させるための配信要求をコンテンツ配信装置３０に送信する。例えば、ユーザＵ１が動画サイトＳＴにおいて、動画コンテンツＶＣ２を示すクエリを指定したとすると、端末装置１０は、かかるクエリを含む配信要求をコンテンツ配信装置３０に送信する。また、要求部１４ａは、コンテンツ配信装置３０から配信されたコンテンツを受信する。 (Request unit 14a)
The requesting unit 14a requests the distribution of content (for example, moving image content). For example, the requesting unit 14a requests the content distribution device 30 to distribute the content. For example, the request unit 14a transmits a distribution request requesting the distribution of the content to the content distribution device 30. In the example of FIG. 1, the terminal device 10 transmits a distribution request for streaming distribution of the video content on the video site ST to the content distribution device 30 in response to the operation of the user U1. For example, if the user U1 specifies a query indicating the video content VC2 on the video site ST, the terminal device 10 transmits a distribution request including such a query to the content distribution device 30. In addition, the request unit 14a receives the content distributed from the content distribution device 30.

（同意情報受付部１４ｂ）
同意情報受付部１４ｂ、インカメラ（撮像部１３）によって撮像されることに同意するか否か（撮像されることを許可するか否か）を示す同意情報をユーザから受け付ける。図１の例では、同意情報受付部１４ｂは、動画サイトＳＴにおいて任意の動画コンテンツを閲覧している間だけインカメラ（撮像部１３）によって撮像されることに、同意するか否か（撮像されることを許可するか否か）を示す同意情報をユーザから受け付ける。例えば、同意情報受付部１４ｂは、動画サイトＳＴに表示される「同意ボタン」が押下された場合には、インカメラ（撮像部１３）によって撮像されることに同意する旨の同意情報を受け付ける。 (Consent Information Reception Department 14b)
The consent information reception unit 14b receives, from the user, consent information indicating whether or not the user agrees to be imaged (that is, permits imaging) by the in-camera (imaging unit 13). In the example of FIG. 1, the consent information reception unit 14b receives, from the user, consent information indicating whether or not the user agrees to be imaged by the in-camera (imaging unit 13) only while viewing arbitrary video content on the video site ST. For example, when the "consent button" displayed on the video site ST is pressed, the consent information reception unit 14b receives consent information indicating that the user agrees to be imaged by the in-camera (imaging unit 13).

（表示制御部１４ｃについて）
表示制御部１４ｃは、各種情報を端末装置１０の表示画面Ｄ（表示部１２）に表示させるための表示制御を行う。例えば、表示制御部１４ｃは、要求部１４ａによって受信された情報を表示画面Ｄに表示させる。例えば、表示制御部１４ｃは、動画サイトＳＴを表示画面Ｄに表示させる。また、表示制御部１４ｃは、動画コンテンツを表示画面Ｄに表示させる。例えば、図２の例では、要求部１４ａは、動画コンテンツＶＣ２を受信する。かかる場合、表示制御部１４ｃは、領域ＡＲ内にプレーヤーＰＬ１、グラフＧ、シークバーＢＲを表示させる。 (About display control unit 14c)
The display control unit 14c performs display control for displaying various information on the display screen D (display unit 12) of the terminal device 10. For example, the display control unit 14c causes the display screen D to display the information received by the request unit 14a. For example, the display control unit 14c displays the moving image site ST on the display screen D. Further, the display control unit 14c displays the moving image content on the display screen D. For example, in the example of FIG. 2, the requesting unit 14a receives the moving image content VC2. In such a case, the display control unit 14c displays the player PL1, the graph G, and the seek bar BR in the area AR.

（カメラ制御部１４ｄについて）
カメラ制御部１４ｄは、インカメラ（撮像部１３）を制御することによりユーザを撮像する。例えば、カメラ制御部１４ｄは、同意情報受付部１４ｂにより受け付けられた同意情報に従って、インカメラを制御する。例えば、カメラ制御部１４ｄは、同意情報受付部１４ｂにより撮像されることに同意する旨の同意情報が受け付けられた場合には、ユーザが動画サイトＳＴにおいて任意の動画コンテンツを閲覧している間だけインカメラを制御する。つまり、カメラ制御部１４ｄは、ユーザが動画サイトＳＴにおいて任意の動画コンテンツを閲覧している間だけユーザを撮像するようインカメラを制御する。 (About camera control unit 14d)
The camera control unit 14d images the user by controlling the in-camera (imaging unit 13). For example, the camera control unit 14d controls the in-camera according to the consent information received by the consent information reception unit 14b. For example, when the consent information reception unit 14b has received consent information indicating that the user agrees to be imaged, the camera control unit 14d controls the in-camera only while the user is browsing arbitrary video content on the video site ST. That is, the camera control unit 14d controls the in-camera so that the user is imaged only while the user is browsing arbitrary video content on the video site ST.

（取得部１４ｅについて）
取得部１４ｅは、コンテンツを視聴中のユーザを、コンテンツを表示している端末装置１０が有するインカメラで撮像することで得られる撮像情報（顔動画のデータ）を取得する。例えば、取得部１４ｅは、カメラ制御部１４ｄから撮像情報を取得する。 (About acquisition unit 14e)
The acquisition unit 14e acquires imaging information (face moving image data) obtained by imaging a user who is viewing the content with an in-camera included in the terminal device 10 displaying the content. For example, the acquisition unit 14e acquires imaging information from the camera control unit 14d.

また、例えば、取得部１４ｅは、コンテンツとして、動画コンテンツ又は画像コンテンツを視聴中のユーザを撮像することで得られる撮像情報を取得する。動画コンテンツは、お笑い番組、ドラマ、映画、アニメ等の様々なジャンルの動画コンテンツである。一方、画像コンテンツは、例えば、各種の電子書籍である。また、取得部１４ｅは、撮像情報として、ユーザの許諾が得られた場合にインカメラで撮像することで得られる撮像情報を取得する。例えば、取得部１４ｅは、撮像情報として、ユーザの許諾が得られた場合において、コンテンツが表示されている間、インカメラで撮像することで得られる撮像情報を取得する。 Further, for example, the acquisition unit 14e acquires imaging information obtained by imaging a user who is viewing moving image content or image content as the content. The moving image content is video content of various genres such as comedy programs, dramas, movies, and animations. The image content, on the other hand, is, for example, various electronic books. In addition, the acquisition unit 14e acquires, as the imaging information, imaging information obtained by imaging with the in-camera when the user's permission has been obtained. For example, when the user's permission has been obtained, the acquisition unit 14e acquires, as the imaging information, imaging information obtained by imaging with the in-camera while the content is displayed.

（推定部１４ｆについて）
推定部１４ｆは、図１のステップＳ２で説明した推定処理を行う。具体的には、推定部１４ｆは、取得部１４ｅにより取得された撮像情報が示すユーザの表情に基づいて、ユーザの感情に関する情報を推定する。例えば、推定部１４ｆは、撮像情報が示すユーザの表情に基づいて、ユーザの感情に関する情報として、ユーザの感情表出行動を推定する。感情表出行動は、感情を表す行動であり、面白いといった感情が生じた際に行う笑う行動、悲しいといった感情が生じた際に行う泣く行動、等である。また、例えば、推定部１４ｆは、コンテンツが再生されている再生中に（つまり、ユーザがコンテンツを視聴しているまさにその時、リアルタイムに）、ユーザの感情に関する情報を推定する。また、推定部１４ｆは、撮像情報が示すユーザの表情に基づいて、ユーザの感情に関する情報として、ユーザの感情表出行動の度合いを示す特徴量を推定する。 (About estimation unit 14f)
The estimation unit 14f performs the estimation process described in step S2 of FIG. 1. Specifically, the estimation unit 14f estimates information about the user's emotions based on the facial expression of the user indicated by the imaging information acquired by the acquisition unit 14e. For example, the estimation unit 14f estimates the user's emotion expression behavior as information on the user's emotions based on the user's facial expression indicated by the imaging information. Emotion expression behaviors are behaviors that express emotions, such as laughing when something is found funny and crying when something is found sad. Also, for example, the estimation unit 14f estimates information about the user's emotions during playback of the content (that is, in real time, at the very moment the user is viewing the content). Further, the estimation unit 14f estimates, as information on the user's emotions, a feature amount indicating the degree of the user's emotion expression behavior based on the user's facial expression indicated by the imaging information.

図１の例では、取得部１４ｅは、カメラ制御部１４ｄによる撮像で得られた顔動画のデータ（撮像情報の一例）を取得し、推定部１４ｆに送信する。そして、推定部１４ｆは、顔動画のデータ（撮像情報の一例）に基づいて、ユーザの感情に関する情報を推定する。具体的には、推定部１４ｆは、顔動画のデータが示すユーザの表情に基づいて、ユーザの感情に関する情報として、ユーザの感情表出行動を推定する。例えば、推定部１４ｆは、顔動画のデータについて表情解析することにより、ユーザの感情表出行動を推定する。 In the example of FIG. 1, the acquisition unit 14e acquires face moving image data (an example of imaging information) obtained by imaging under the control of the camera control unit 14d and transmits it to the estimation unit 14f. Then, the estimation unit 14f estimates information about the user's emotions based on the face moving image data (an example of imaging information). Specifically, the estimation unit 14f estimates the user's emotion expression behavior as information on the user's emotions based on the user's facial expression indicated by the face moving image data. For example, the estimation unit 14f estimates the user's emotion expression behavior by performing facial expression analysis on the face moving image data.

また、推定部１４ｆは、推定した感情放出行動の度合いを示す特徴量を推定する。例えば、推定部１４ｆは、感情放出行動として、「笑う行動」を推定した場合には、この笑う行動の度合い（どれだけ笑ったか笑いの程度を示す度合い）を示す特徴量を推定（算出）する。例えば、推定部１４ｆは、顔動画のデータが示すユーザの笑いが微笑レベルであるなら、笑う行動の度合いを示す特徴量として、笑い度「２」を推定する。一方、推定部１４ｆは、顔動画のデータが示すユーザの笑いが大笑いレベルであるなら、笑い度「９」を推定する。 In addition, the estimation unit 14f estimates a feature amount indicating the degree of the estimated emotion expression behavior. For example, when the estimation unit 14f estimates "laughing behavior" as the emotion expression behavior, it estimates (calculates) a feature amount indicating the degree of this laughing behavior (that is, how much the user laughed). For example, if the user's laughter indicated by the face moving image data is at the level of a smile, the estimation unit 14f estimates a laughter degree of "2" as the feature amount indicating the degree of laughing behavior. On the other hand, if the user's laughter indicated by the face moving image data is at the level of a loud laugh, the estimation unit 14f estimates a laughter degree of "9".

なお、推定部１４ｆは、上記例に限定されない。具体的には、推定部１４ｆは、取得部１４ｅにより取得された撮像情報が示すユーザの表情に基づいて、ユーザの属性情報を推定してもよい。例えば、推定部１４ｆは、画像解析等の従来技術を用いて、目や、鼻や、口の大きさ、眉毛の形、顔の皺又は髪の長さ等のユーザの属性を特徴付ける特徴情報を抽出する。そして、推定部１４ｆは、抽出された特徴情報に基づいて、ユーザの属性情報として、ユーザの年齢や、性別を推定してもよい。 The estimation unit 14f is not limited to the above example. Specifically, the estimation unit 14f may estimate the user's attribute information based on the facial expression of the user indicated by the imaging information acquired by the acquisition unit 14e. For example, the estimation unit 14f uses conventional techniques such as image analysis to extract feature information that characterizes the user's attributes, such as the size of the eyes, nose, and mouth, the shape of the eyebrows, facial wrinkles, or hair length. Then, the estimation unit 14f may estimate the user's age and gender as the user's attribute information based on the extracted feature information.

また、推定部１４ｆは、ユーザが動画コンテンツを閲覧しているまさにそのタイミング、つまり、リアルタイムで、ユーザが撮像されることに応じて、例えば、毎秒推定処理を連続的に行う。このため、後述する送信部１４ｇは、この推定部１４ｆによる推定処理の推定結果を含む情報を、例えば、毎秒毎に、情報処理装置１００に送信する。一例を示すと、送信部１４ｇは、動画コンテンツの再生時間に対応する時間位置（タイムコード）と、感情表出行動を示す情報と、その感情表出行動の特徴量とを含む情報（図1の例では、撮像情報ＦＤＡ１やＦＤＡ２）を毎秒、情報処理装置１００に送信する。つまり、送信部１４ｇは、ユーザが動画コンテンツを閲覧している間は、時間位置（タイムコード）と、感情表出行動を示す情報と、その感情表出行動の特徴量とを含む情報、つまり推定結果を遂次、情報処理装置１００に送信する。 Further, the estimation unit 14f performs the estimation process continuously, for example every second, as the user is imaged at the very moment the user is browsing the moving image content, that is, in real time. Therefore, the transmission unit 14g, which will be described later, transmits information including the estimation result of the estimation process by the estimation unit 14f to the information processing apparatus 100, for example, every second. As an example, the transmission unit 14g transmits to the information processing apparatus 100, every second, information (in the example of FIG. 1, the imaging information FDA1 and FDA2) that includes a time position (time code) corresponding to the playback time of the video content, information indicating the emotion expression behavior, and the feature amount of that emotion expression behavior. That is, while the user is browsing the video content, the transmission unit 14g successively transmits to the information processing apparatus 100 the estimation results, namely information including the time position (time code), the information indicating the emotion expression behavior, and the feature amount of that emotion expression behavior.
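
The per-second message described above can be sketched as follows. This is only an illustration, not the format used in the embodiment: the class name, the field names, and the choice of JSON are all assumptions, and the sample values are made up.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EstimationResult:
    """One per-second estimation result (all field names are hypothetical)."""
    video_id: str     # content being played, e.g. "VC2"
    user_id: str      # viewing user, e.g. "U1"
    time_code_s: int  # playback position in seconds (the "time code")
    behavior: str     # estimated emotion expression behavior, e.g. "laugh"
    degree: int       # feature amount, e.g. a laughter degree such as 2 or 9


def to_message(result: EstimationResult) -> str:
    """Serialize one estimation result as a JSON string for transmission."""
    return json.dumps(asdict(result), sort_keys=True)


# Illustrative values: a loud laugh at 2 min 35 s (155 s) into VC2.
message = to_message(EstimationResult("VC2", "U1", 155, "laugh", 9))
print(message)
```

Sending only this small record, rather than the face video itself, matches the case in which the estimation runs on the terminal device 10.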

（送信部１４ｇについて）
送信部１４ｇは、推定部１４ｆによる推定結果を送信する。具体的には、送信部１４ｇは、推定部１４ｆによる推定結果を含む情報を情報処理装置１００に送信する。図１の例では、送信部１４ｇは、撮像情報ＦＤＡ１を情報処理装置１００に送信する。また、送信部１４ｇは、撮像情報ＦＤＡ２を情報処理装置１００に送信する。 (About the transmitter 14g)
The transmission unit 14g transmits the estimation result by the estimation unit 14f. Specifically, the transmission unit 14g transmits information including the estimation result by the estimation unit 14f to the information processing device 100. In the example of FIG. 1, the transmission unit 14g transmits the imaging information FDA1 to the information processing device 100. Further, the transmission unit 14g transmits the image pickup information FDA2 to the information processing device 100.

なお、推定部１４ｆによる推定処理は、情報処理装置１００側で行われてもよい。この場合には、情報処理装置１００は、推定部１４ｆに対応する処理部を有することになる。また、この場合には、送信部１４ｇは、顔動画のデータを連続的に情報処理装置１００に送信する。 The estimation process by the estimation unit 14f may be performed on the information processing apparatus 100 side. In this case, the information processing device 100 has a processing unit corresponding to the estimation unit 14f. Further, in this case, the transmission unit 14g continuously transmits the face moving image data to the information processing apparatus 100.

〔３．情報処理装置の構成〕
次に、図６を用いて、実施形態にかかる情報処理装置１００について説明する。図６は、実施形態にかかる情報処理装置１００の構成例を示す図である。図６に示すように、情報処理装置１００は、通信部１１０と、記憶部１２０と、制御部１３０とを有する。情報処理装置１００は、例えば、実施形態にかかる情報処理を行うサーバ装置である。 [3. Information processing device configuration]
Next, the information processing apparatus 100 according to the embodiment will be described with reference to FIG. 6. FIG. 6 is a diagram showing a configuration example of the information processing apparatus 100 according to the embodiment. As shown in FIG. 6, the information processing device 100 includes a communication unit 110, a storage unit 120, and a control unit 130. The information processing device 100 is, for example, a server device that performs information processing according to the embodiment.

（通信部１１０について）
通信部１１０は、例えば、ＮＩＣ等によって実現される。そして、通信部１１０は、ネットワークＮと有線又は無線で接続され、例えば、端末装置１０やコンテンツ配信装置３０との間で情報の送受信を行う。 (About communication unit 110)
The communication unit 110 is realized by, for example, a NIC or the like. Then, the communication unit 110 is connected to the network N by wire or wirelessly, and transmits / receives information to / from, for example, the terminal device 10 or the content distribution device 30.

（記憶部１２０について）
記憶部１２０は、例えば、ＲＡＭ、フラッシュメモリ等の半導体メモリ素子又はハードディスク、光ディスク等の記憶装置によって実現される。記憶部１２０は、撮像情報記憶部１２１と、推定情報記憶部１２２と、全体集計結果記憶部１２３と、感情ポイント記憶部１２４と、出演者情報記憶部１２５とを有する。 (About storage unit 120)
The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM or a flash memory or a storage device such as a hard disk or an optical disk. The storage unit 120 includes an imaging information storage unit 121, an estimation information storage unit 122, an overall aggregation result storage unit 123, an emotion point storage unit 124, and a performer information storage unit 125.

（撮像情報記憶部１２１について）
撮像情報記憶部１２１は、コンテンツを視聴中のユーザを、コンテンツを表示している端末装置１０が有するインカメラで撮像することで得られる撮像情報を記憶する。ここで、図７に実施形態にかかる撮像情報記憶部１２１の一例を示す。図７の例では、撮像情報記憶部１２１は、「ユーザＩＤ」、「動画ＩＤ」、「撮像情報」といった項目を有する。撮像情報記憶部１２１については、図１で説明済みのため、説明を省略する。 (About the image pickup information storage unit 121)
The image pickup information storage unit 121 stores the image pickup information obtained by capturing the image of the user who is viewing the content with the in-camera included in the terminal device 10 displaying the content. Here, FIG. 7 shows an example of the imaging information storage unit 121 according to the embodiment. In the example of FIG. 7, the imaging information storage unit 121 has items such as "user ID", "moving image ID", and "imaging information". Since the image pickup information storage unit 121 has already been described with reference to FIG. 1, the description thereof will be omitted.

（推定情報記憶部１２２について）
推定情報記憶部１２２は、感情表出行動を推定した推定結果に関する情報を記憶する。また、推定情報記憶部１２２は、各ユーザが各動画コンテンツの中で行ったと推定される感情表出行動について、動画コンテンツの中で感情表出行動行われた時間位置を記憶するため、ユーザ毎の集計結果を記憶する記憶部といえる。ここで、図８に実施形態にかかる推定情報記憶部１２２の一例を示す。図１の例では、推定情報記憶部１２２は、「動画ＩＤ」、「ユーザＩＤ」、「行動情報」といった項目を有する。また、「行動情報」は、「笑う」、「泣く」、「驚く」といった項目を含む。 (About the estimated information storage unit 122)
The estimation information storage unit 122 stores information on the results of estimating emotion expression behavior. Because it stores, for the emotion expression behavior that each user is estimated to have performed in each video content, the time positions in the video content at which that behavior was performed, the estimation information storage unit 122 can be regarded as a storage unit that stores per-user aggregation results. Here, FIG. 8 shows an example of the estimation information storage unit 122 according to the embodiment. In the example of FIG. 8, the estimation information storage unit 122 has items such as "moving image ID", "user ID", and "behavior information". In addition, "behavior information" includes items such as "laughing", "crying", and "surprise".

「動画ＩＤ」は、ユーザが視聴する動画コンテンツであって、インカメラにて撮像されるユーザが視聴している動画コンテンツを識別する識別情報を示す。「ユーザＩＤ」は、対応する動画コンテンツを視聴するユーザ又はユーザの端末装置を識別する識別情報を示す。 The "video ID" is the video content that the user watches, and indicates identification information that identifies the video content that the user is watching, which is captured by the in-camera. The "user ID" indicates identification information that identifies the user who views the corresponding moving image content or the terminal device of the user.

「行動情報」に含まれる「笑い」は、推定処理で推定された感情表出行動のうち、笑う行動が行われた時間位置であって、対応する動画ＩＤが示す動画コンテンツの中で笑う行動が行われた時間位置を示す。また、「行動情報」に含まれる「笑い」は、後述する集計部１３２が、推定部１４ｆによる推定結果に基づいて、動画コンテンツにおいて笑う行動が行われたものとして特定した時間位置を示す。「行動情報」に含まれる「泣く」は、推定処理で推定された感情表出行動のうち、泣く行動が行われた時間位置であって、対応する動画ＩＤが示す動画コンテンツの中で泣く行動が行われた時間位置を示す。また、「行動情報」に含まれる「泣く」は、後述する集計部１３２が、推定部１４ｆによる推定結果に基づいて、動画コンテンツにおいて泣く行動が行われたものとして特定した時間位置を示す。「行動情報」に含まれる「驚く」は、推定処理で推定された感情表出行動のうち、驚く行動が行われた時間位置であって、対応する動画ＩＤが示す動画コンテンツの中で驚く行動が行われた時間位置を示す。また、「行動情報」に含まれる「驚く」は、後述する集計部１３２が、推定部１４ｆによる推定結果に基づいて、動画コンテンツにおいて驚く行動が行われたものとして特定した時間位置を示す。 "Laughing" included in "behavior information" indicates, among the emotion expression behaviors estimated by the estimation process, the time position at which the laughing action was performed within the video content indicated by the corresponding video ID. More specifically, it indicates the time position that the aggregation unit 132, which will be described later, identified as the position at which the laughing action was performed in the video content, based on the estimation result by the estimation unit 14f. "Crying" included in "behavior information" indicates, among the emotion expression behaviors estimated by the estimation process, the time position at which the crying action was performed within the video content indicated by the corresponding video ID. More specifically, it indicates the time position that the aggregation unit 132 identified as the position at which the crying action was performed in the video content, based on the estimation result by the estimation unit 14f. "Surprise" included in "behavior information" indicates, among the emotion expression behaviors estimated by the estimation process, the time position at which the surprised action was performed within the video content indicated by the corresponding video ID. More specifically, it indicates the time position that the aggregation unit 132 identified as the position at which the surprised action was performed in the video content, based on the estimation result by the estimation unit 14f.

すなわち、図８の例では、ユーザＵ１が動画コンテンツＶＣ１を閲覧している中で、笑う行動を行ったと推定されたとともに、動画コンテンツＶＣ１の再生時間の中の時間位置ｔ２、ｔ２１、ｔ５１において、この笑う行動が行われたことを特定された例を示す。 That is, the example of FIG. 8 shows that the user U1 was estimated to have performed a laughing action while browsing the video content VC1, and that this laughing action was identified as having been performed at the time positions t2, t21, and t51 within the playback time of the video content VC1.
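
A per-user table of this kind could be built from the per-second estimation results roughly as follows. This is a sketch under stated assumptions: the function name, the tuple layout, the behavior labels, and the `min_degree` cutoff are all hypothetical, and the source does not say how the aggregation unit 132 decides that a behavior was performed. The integer time positions 2, 21, and 51 merely stand in for the symbolic positions t2, t21, and t51.

```python
from collections import defaultdict

BEHAVIORS = ("laugh", "cry", "surprise")  # hypothetical labels


def build_behavior_info(estimates, min_degree=1):
    """Group per-second estimates into {(video_id, user_id): {behavior: [time, ...]}}.

    estimates: iterable of (video_id, user_id, time_position, behavior, degree).
    min_degree: assumed cutoff below which a behavior is treated as not performed.
    """
    table = defaultdict(lambda: {b: [] for b in BEHAVIORS})
    for video_id, user_id, time_position, behavior, degree in estimates:
        if behavior not in BEHAVIORS or degree < min_degree:
            continue  # skip unknown labels and results below the cutoff
        table[(video_id, user_id)][behavior].append(time_position)
    return dict(table)


# Illustrative input: user U1 watching VC1.
estimates = [
    ("VC1", "U1", 2, "laugh", 5),
    ("VC1", "U1", 21, "laugh", 3),
    ("VC1", "U1", 30, "laugh", 0),  # degree 0: treated as no laughing
    ("VC1", "U1", 51, "laugh", 9),
]
behavior_info = build_behavior_info(estimates)
print(behavior_info[("VC1", "U1")]["laugh"])
```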

なお、本実施形態では、時間位置は、ある１点の時間位置であってもよいし、時間の範囲であってもよい。例えば、時間位置「ｔ２」は、「２分３５秒」といった１点の時間位置であってもよいし、「２分３５秒〜２分３０秒」といった時間範囲であってもよい。また、時間位置が１点の時間位置を示す場合、かかる時間位置は、例えば、感情表出行動が開始された時間位置、感情表出行動が終了した時間位置、感情表出行動が開始された時間位置から感情表出行動が終了した時間位置までの時間範囲の中での中間時刻のいずれかであってもよい。 In the present embodiment, a time position may be a single point in time or a time range. For example, the time position "t2" may be a single point in time such as "2 minutes 35 seconds", or a time range such as "2 minutes 35 seconds to 2 minutes 30 seconds". When the time position is a single point in time, it may be, for example, any of the following: the time position at which the emotion expression behavior started, the time position at which it ended, or the midpoint of the time range from the start to the end of the emotion expression behavior.
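
The two representations of a time position, and the three ways of reducing a range to a single point, can be sketched as below. The class name, the field names, and the `mode` strings are assumptions made for illustration, and the 155 s to 160 s range is an arbitrary sample value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimePosition:
    """A time position in seconds; a single point has start_s == end_s."""
    start_s: float  # when the emotion expression behavior started
    end_s: float    # when the emotion expression behavior ended

    def representative(self, mode: str = "start") -> float:
        """Reduce the range to one point: its start, its end, or its midpoint."""
        if mode == "start":
            return self.start_s
        if mode == "end":
            return self.end_s
        if mode == "middle":
            return (self.start_s + self.end_s) / 2
        raise ValueError(f"unknown mode: {mode!r}")


point = TimePosition(155, 155)   # a single point at 2 min 35 s
span = TimePosition(155, 160)    # an illustrative five-second range
print(span.representative("middle"))
```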

（全体集計結果記憶部１２３について）
全体集計結果記憶部１２３は、所定期間の間において、各動画コンテンツを視聴したユーザの総数のうち、笑う行動を行ったユーザの人数である行動人数（笑う行動を行ったユーザの割合）を、各動画コンテンツの時間位置毎に記憶する。ここで、図９に実施形態にかかる全体集計結果記憶部１２３の一例を示す。図９の例では、全体集計結果記憶部１２３は、「動画ＩＤ」、「行動情報」といった項目を有する。また、「行動情報」は、「笑った人数（割合）」、「泣いた人数（割合）」、「驚いた人数（割合）」といった項目を含む。また、「笑った人数（割合）」、「泣いた人数（割合）」、「驚いた人数（割合）」それぞれには、動画コンテンツの時間位置を示す広告が対応付けられる。 (About the total total result storage unit 123)
The overall aggregation result storage unit 123 stores, for each time position of each video content, the number of users who performed a laughing action (or the ratio of such users) out of the total number of users who viewed that video content during a predetermined period. Here, FIG. 9 shows an example of the overall aggregation result storage unit 123 according to the embodiment. In the example of FIG. 9, the overall aggregation result storage unit 123 has items such as "moving image ID" and "behavior information". In addition, "behavior information" includes items such as "number of people who laughed (ratio)", "number of people who cried (ratio)", and "number of people who were surprised (ratio)". Each of "number of people who laughed (ratio)", "number of people who cried (ratio)", and "number of people who were surprised (ratio)" is associated with items indicating time positions in the video content.

「動画ＩＤ」は、ユーザが視聴する動画コンテンツであって、インカメラにて撮像されるユーザが視聴している動画コンテンツを識別する識別情報を示す。 The "video ID" is the video content that the user watches, and indicates identification information that identifies the video content that the user is watching, which is captured by the in-camera.

「笑った人数（割合）」に対応付けられる項目である時間位置（「ｔ１」、「ｔ２」、「ｔ３」・・・）は、各動画コンテンツの時間位置を示し、所定期間の間、動画コンテンツを閲覧したユーザの総数うち、その時間位置において笑う行動を行ったユーザの人数である行動人数（笑う行動を行ったユーザの割合）が入力される。「泣いた人数（割合）」に対応付けられる項目である時間位置（「ｔ１」、「ｔ２」、「ｔ３」・・・）は、各動画コンテンツの時間位置を示し、所定期間の間、動画コンテンツを閲覧したユーザの総数うち、その時間位置において泣く行動を行ったユーザの人数である行動人数（泣く行動を行ったユーザの割合）が入力される。「驚いた人数（割合）」に対応付けられる項目である時間位置（「ｔ１」、「ｔ２」、「ｔ３」・・・）は、各動画コンテンツの時間位置を示し、所定期間の間、動画コンテンツを閲覧したユーザの総数うち、その時間位置において驚く行動を行ったユーザの人数である行動人数（驚く行動を行ったユーザの割合）が入力される。 The time positions ("t1", "t2", "t3", ...) associated with "number of people who laughed (ratio)" indicate time positions in each video content. For each time position, the entry is the number of users (or the ratio of users) who performed a laughing action at that time position, out of the total number of users who browsed the video content during a predetermined period. The time positions ("t1", "t2", "t3", ...) associated with "number of people who cried (ratio)" likewise indicate time positions in each video content. For each time position, the entry is the number of users (or the ratio of users) who performed a crying action at that time position, out of the total number of users who browsed the video content during the predetermined period. The time positions ("t1", "t2", "t3", ...) associated with "number of people who were surprised (ratio)" also indicate time positions in each video content. For each time position, the entry is the number of users (or the ratio of users) who performed a surprised action at that time position, out of the total number of users who browsed the video content during the predetermined period.

すなわち、図９の例では、所定期間の間、動画コンテンツＶＣ１を閲覧したユーザの総数うち、時間位置ｔ１において笑う行動を行ったユーザの人数である行動人数が「１３５人」である例を示す。また、図９の例では、所定期間の間、動画コンテンツＶＣ１を閲覧したユーザの総数に対する、時間位置ｔ１において笑う行動を行ったユーザの人数の割合が「２０％」である例を示す。 That is, the example of FIG. 9 shows that, out of the total number of users who browsed the video content VC1 during a predetermined period, the number of users who performed a laughing action at the time position t1 is "135". The example of FIG. 9 also shows that the ratio of the number of users who performed a laughing action at the time position t1 to the total number of users who browsed the video content VC1 during the predetermined period is "20%".

また、図９の例では、所定期間の間、動画コンテンツＶＣ２を閲覧したユーザの総数うち、時間位置ｔ１において笑う行動を行ったユーザの人数である行動人数が「３２１人」である例を示す。また、図９の例では、所定期間の間、動画コンテンツＶＣ２を閲覧したユーザの総数に対する、時間位置ｔ１において笑う行動を行ったユーザの人数の割合が「５％」である例を示す。 Further, the example of FIG. 9 shows that, out of the total number of users who browsed the video content VC2 during a predetermined period, the number of users who performed a laughing action at the time position t1 is "321". The example of FIG. 9 also shows that the ratio of the number of users who performed a laughing action at the time position t1 to the total number of users who browsed the video content VC2 during the predetermined period is "5%".
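
The relationship between the head count and the ratio can be sketched as below. The function name and data layout are assumptions. The viewer total is not stated in the source: 675 is simply what 135 users at 20% implies if both figures refer to the same population (likewise, 321 users at 5% would imply 6,420 viewers of VC2).

```python
def aggregate_by_time(total_viewers, actors_by_time):
    """Return {time_position: (count, ratio)} for one video and one behavior.

    total_viewers: number of users who viewed the video in the period.
    actors_by_time: {time_position: set of user IDs who performed the behavior}.
    """
    if total_viewers <= 0:
        raise ValueError("total_viewers must be positive")
    return {
        t: (len(users), len(users) / total_viewers)
        for t, users in actors_by_time.items()
    }


# 135 laughing users out of an assumed 675 viewers of VC1 at time position t1.
laughers_at_t1 = {f"U{i}" for i in range(135)}
stats = aggregate_by_time(675, {"t1": laughers_at_t1})
count, ratio = stats["t1"]
print(count, f"{ratio:.0%}")
```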

（感情ポイント記憶部１２４について）
感情ポイント記憶部１２４は、コンテンツの中でユーザの感情に変化が生じたポイントである感情ポイントに関する情報を記憶する。ユーザは感情に変化が生じると、反射的にその感情を行動に表す、例えば、面白いといった感情が芽生えたときには、笑う行動を行う。例えば、悲しいといった感情が芽生えたときには、泣く行動を行う。例えば、驚きの感情が芽生えたときには、驚く行動を行う。このようなことから、感情ポイントは、面白ポイント、泣きポイント、驚きポイント等に分けられる。ここで、図１０に実施形態にかかる感情ポイント記憶部１２４の一例を示す。図１０に示すように、感情ポイント記憶部１２４は、感情ポイント記憶部１２４−１、１２４−２、１２４−３に分けられる。 (About emotion point storage unit 124)
The emotion point storage unit 124 stores information about emotion points, which are the points in the content at which the user's emotions changed. When a change in emotion occurs, the user reflexively expresses that emotion as an action. For example, a user who finds something funny laughs, a user who feels sad cries, and a user who feels surprised acts surprised. For this reason, emotion points can be divided into interesting points, crying points, surprise points, and the like. Here, FIG. 10 shows an example of the emotion point storage unit 124 according to the embodiment. As shown in FIG. 10, the emotion point storage unit 124 is divided into emotion point storage units 124-1, 124-2, and 124-3.

まず、感情ポイント記憶部１２４−１について説明する。感情ポイント記憶部１２４−１は、ユーザの感情ポイントに関する情報を記憶する。図１０の例では、感情ポイント記憶部１２４−１は、「動画ＩＤ」、「感情ポイント」といった項目を有する。また、「感情ポイント」は、「面白ポイント」、「泣きポイント」、「驚きポイント」といった項目を含む。 First, the emotion point storage unit 124-1 will be described. The emotion point storage unit 124-1 stores information about the user's emotion points. In the example of FIG. 10, the emotion point storage unit 124-1 has items such as “moving image ID” and “emotion point”. In addition, the "emotion point" includes items such as "interesting point", "crying point", and "surprise point".

「動画ＩＤ」は、ユーザによって視聴された動画コンテンツを識別する識別情報を示す。「面白ポイント」は、所定期間の間において、対応する動画コンテンツを視聴したユーザの総数のうち、笑う行動を行ったユーザの人数である行動人数に基づく数値が所定数以上（条件情報）であった時間位置を示す。かかる数値は、所定期間の間において、対応する動画コンテンツを視聴したユーザの総数のうち、笑う行動を行ったユーザの人数である行動人数そのもの、又は、所定期間の間において、対応する動画コンテンツを視聴したユーザの総数に対する、笑う行動を行ったユーザの人数の割合である。 The "video ID" indicates identification information that identifies the video content viewed by the user. The "interesting point" indicates a time position at which a numerical value based on the number of users who performed a laughing action, out of the total number of users who viewed the corresponding video content during a predetermined period, was equal to or greater than a predetermined number (the condition information). This numerical value is either the number of users who performed the laughing action itself, out of the total number of users who viewed the corresponding video content during the predetermined period, or the ratio of the number of users who performed the laughing action to the total number of users who viewed the corresponding video content during the predetermined period.

Accordingly, the "interesting point" indicates a time position at which the number of acting users, that is, the number of users who laughed among all users who viewed the corresponding video content during the predetermined period, was equal to or greater than a predetermined number. Alternatively, the "interesting point" indicates a time position at which the ratio of the number of users who laughed to the total number of users who viewed the corresponding video content during the predetermined period was equal to or greater than a predetermined ratio. In other words, the "interesting points" are the time positions, among those stored in the total aggregation result storage unit 123 shown in FIG. 9, that satisfy the above condition information. In the example of FIG. 10, the "interesting point" is assumed to indicate a time position at which the number of acting users was equal to or greater than the predetermined number.

The "crying point" indicates a time position at which a numerical value based on the number of acting users, that is, the number of users who cried among all users who viewed the corresponding video content during a predetermined period, was equal to or greater than a predetermined value (the condition information). This numerical value is either the number of acting users itself or the ratio of the number of users who cried to the total number of users who viewed the corresponding video content during the predetermined period.

Accordingly, the "crying point" indicates a time position at which the number of acting users, that is, the number of users who cried among all users who viewed the corresponding video content during the predetermined period, was equal to or greater than a predetermined number. Alternatively, the "crying point" indicates a time position at which the ratio of the number of users who cried to the total number of users who viewed the corresponding video content during the predetermined period was equal to or greater than a predetermined ratio. In other words, the "crying points" are the time positions, among those stored in the total aggregation result storage unit 123 shown in FIG. 9, that satisfy the above condition information. In the example of FIG. 10, the "crying point" is assumed to indicate a time position at which the number of acting users was equal to or greater than the predetermined number.

The "surprise point" indicates a time position at which a numerical value based on the number of acting users, that is, the number of users who showed surprise among all users who viewed the corresponding video content during a predetermined period, was equal to or greater than a predetermined value (the condition information). The "surprise point" follows the same pattern as the other emotion points, so further description is omitted.

Among the time positions indicating "emotion points", the time position with the largest number of acting users (or the highest ratio) is given a check mark. The "emotion points" are specified by the specifying unit 133, described later, and are input to the emotion point storage unit 124.
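The extraction of emotion points and the check-marked peak position described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `find_emotion_points`, its parameters, and the sample counts are assumptions introduced here and do not appear in the embodiment.

```python
from typing import Dict, List, Optional, Tuple


def find_emotion_points(
    counts: Dict[str, int],
    total_viewers: int,
    min_count: Optional[int] = None,
    min_ratio: Optional[float] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return (emotion points, peak position) for one video and one behavior.

    counts maps a time position to the number of acting users at that position.
    Exactly one condition is used: an absolute count or a ratio of all viewers.
    """
    if (min_count is None) == (min_ratio is None):
        raise ValueError("specify exactly one of min_count or min_ratio")

    points: List[str] = []
    for position, n in counts.items():
        if min_count is not None:
            qualifies = n >= min_count
        else:
            qualifies = total_viewers > 0 and n / total_viewers >= min_ratio
        if qualifies:
            points.append(position)

    # The position with the most acting users corresponds to the check mark.
    peak = max(points, key=lambda p: counts[p]) if points else None
    return points, peak


# Hypothetical laugh counts per time position for one video content.
laugh_counts = {"t1": 120, "t2": 40, "t31": 230, "t62": 150}
points, peak = find_emotion_points(laugh_counts, total_viewers=500, min_count=100)
print(points, peak)
```

Passing `min_ratio` instead of `min_count` switches the sketch to the ratio-based condition.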

That is, the example of FIG. 10 shows that, for the video content VC1, the time positions "t1, t31, t62 ..." are specified as interesting points and that, among these time positions, the time position t31 is specified as the one at which the largest number of users laughed.

The example of FIG. 10 also shows that, for the video content VC2, the time positions "t13, t55, t61 ..." are specified as interesting points and that, among these time positions, the time position t61 is specified as the one at which the largest number of users laughed.

なお、図８に示す推定情報記憶部１２２は、動画コンテンツ毎に各ユーザが感情表出行動を行った時間位置を記憶している。このため、推定情報記憶部１２２に記憶される時間位置は、各ユーザの感情ポイントともいえる。 The estimated information storage unit 122 shown in FIG. 8 stores the time position in which each user performs an emotional expression action for each moving image content. Therefore, the time position stored in the estimated information storage unit 122 can be said to be an emotional point of each user.

Next, the emotion point storage unit 124-2 will be described. The emotion point storage unit 124-2 stores, for each user age group, information about the emotion points of users in that age group. In the example of FIG. 10, the emotion point storage unit 124-2 has items such as "attribute (age group)", "attribute (gender)", "video ID", and "emotion point". The "emotion point" includes items such as "interesting point", "crying point", and "surprise point". Although FIG. 10 shows an example in which the attributes are "age group" and "gender", the emotion point storage unit 124 can also store emotion points for other attributes, for example, the user's interests and tastes, or regional information including the user's place of residence and location information. That is, the attributes used are not limited to those in the example of FIG. 10.

The "attribute (age group)" indicates the age group of the users who viewed the video content. The "attribute (age group)" may instead indicate the exact age of the users who viewed the video content. The "attribute (gender)" indicates the gender of the users who viewed the video content. The "video ID" is identification information that identifies the video content that users with the corresponding attributes are viewing while being captured by the in-camera.

The "interesting point" indicates a time position at which a numerical value based on the number of acting users, that is, the number of users who laughed among all users of the corresponding age group and gender who viewed the video content during a predetermined period, was equal to or greater than a predetermined value (the condition information). This numerical value is either the number of users of that age group and gender who laughed, or the ratio of that number to the total number of users of that age group and gender who viewed the video content during the predetermined period.

Accordingly, the "interesting point" indicates a time position at which the number of acting users, that is, the number of users of the corresponding age group and gender who laughed among all users of that age group and gender who viewed the video content during the predetermined period, was equal to or greater than a predetermined number. Alternatively, the "interesting point" indicates a time position at which the ratio of the number of users of that age group and gender who laughed to the total number of users of that age group and gender who viewed the video content during the predetermined period was equal to or greater than a predetermined ratio. In the example of FIG. 10, the "interesting point" is assumed to indicate a time position at which the number of acting users was equal to or greater than the predetermined number.

The "crying point" indicates a time position at which a numerical value based on the number of acting users, that is, the number of users of the corresponding age group and gender who cried among all users of that age group and gender who viewed the video content during a predetermined period, was equal to or greater than a predetermined value (the condition information). The "surprise point" indicates a time position at which a numerical value based on the number of acting users, that is, the number of users of that age group who showed surprise among all users of the corresponding age group and gender who viewed the video content during the predetermined period, was equal to or greater than a predetermined value (the condition information). The "crying point" and the "surprise point" follow the same pattern as the "interesting point", so further description is omitted.

That is, the example of FIG. 10 shows that, for the video content VC1, the time positions "t14, t21, t39 ..." are specified as interesting points and that the time position at which the largest number of users laughed is specified as the time position t31.

The example of FIG. 10 also shows that, for male users in their teens, the time positions "t13, t55, t61 ..." are specified as interesting points for the video content VC2, and that, for these male teenage users, the time position at which the largest number of users laughed is specified as the time position t21.

(About the performer information storage unit 125)
The performer information storage unit 125 stores information about emotional expression actions performed in response to performers (for example, TV personalities and comedians) appearing in video content. The contents of the performer information storage unit 125 are obtained, for example, by aggregating the information stored in the estimation information storage unit 122. FIG. 11 shows an example of the performer information storage unit 125 according to the embodiment. As shown in FIG. 11, the performer information storage unit 125 is divided into a performer information storage unit 125-1, a performer information storage unit 125-2, and so on.

First, the performer information storage unit 125-1 will be described. For each performer appearing in the video content VC1, the performer information storage unit 125-1 stores information about the number of users who performed an emotional expression action among the users who were viewing the video content VC1 while that performer was performing in it. In other words, the performer information storage unit 125-1 stores information such as how much the viewing users laughed at each performer appearing in the video content VC1. In the example of FIG. 11, the performer information storage unit 125-1 has items such as "video ID", "behavior information", and "performer". The "performer" item contains symbols (TR11, TR12, TR13, and so on) that conceptually represent information identifying each performer (for example, a name or group name).

The "video ID" is identification information that identifies the video content viewed by users. The "behavior information" indicates the emotional expression action performed by users during the corresponding video content.

また、動画コンテンツＶＣ１において、出演者「ＴＲ１１」及び行動情報「笑う」に対応付けられる数値「３０％」は、出演者「ＴＲ１１」が動画コンテンツＶＣ１の中で演じている際に笑う行動を行ったユーザの割合を示す。また、動画コンテンツＶＣ１において、出演者「ＴＲ１２」及び行動情報「笑う」に対応付けられる数値「５０％」は、出演者「ＴＲ１２」が動画コンテンツＶＣ１の中で演じている際に笑う行動を行ったユーザの割合を示す。また、動画コンテンツＶＣ１において、出演者「ＴＲ１３」及び行動情報「笑う」に対応付けられる数値「１５％」は、出演者「ＴＲ１３」が動画コンテンツＶＣ１の中で演じている際に笑う行動を行ったユーザの割合を示す。 Further, in the video content VC1, the numerical value "30%" associated with the performer "TR11" and the action information "laughing" performs the action of laughing when the performer "TR11" is performing in the video content VC1. Shows the percentage of users. Further, in the video content VC1, the numerical value "50%" associated with the performer "TR12" and the action information "laughing" performs the action of laughing when the performer "TR12" is performing in the video content VC1. Shows the percentage of users. Further, in the video content VC1, the numerical value "15%" associated with the performer "TR13" and the action information "laughing" performs the action of laughing when the performer "TR13" is performing in the video content VC1. Shows the percentage of users.
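One way such per-performer percentages could be derived is sketched below in Python. The sketch assumes that each performer's on-screen time positions are known and that a user counts as having laughed at a performer if the user laughed at any of those positions. The function name `laugh_ratio_by_performer` and the sample data are hypothetical, and the embodiment does not state how the ratio is actually computed.

```python
from typing import Dict, Set


def laugh_ratio_by_performer(
    appearances: Dict[str, Set[str]],
    laughs_by_user: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Ratio of viewers who laughed while each performer was performing.

    appearances maps a performer to the time positions where they perform.
    laughs_by_user maps every viewer to the time positions where they laughed.
    """
    total = len(laughs_by_user)
    ratios: Dict[str, float] = {}
    for performer, positions in appearances.items():
        # Count viewers with at least one laugh during the performer's scenes.
        laughed = sum(
            1 for user_positions in laughs_by_user.values()
            if user_positions & positions
        )
        ratios[performer] = laughed / total if total else 0.0
    return ratios


# Hypothetical data for one video content.
appearances = {"TR11": {"t1", "t2"}, "TR12": {"t3"}}
laughs_by_user = {"U1": {"t2"}, "U2": {"t3"}, "U3": set(), "U4": {"t3"}}
ratios = laugh_ratio_by_performer(appearances, laughs_by_user)
print(ratios)
```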

Next, the performer information storage unit 125-2 will be described. The performer information storage unit 125-2 covers a different video content from the performer information storage unit 125-1, and therefore different performers, but is otherwise substantially the same as the performer information storage unit 125-1.

例えば、動画コンテンツＶＣ２において、出演者「ＴＲ２１」及び行動情報「笑う」に対応付けられる数値「３％」は、出演者「ＴＲ２１」が動画コンテンツＶＣ２の中で演じている際に笑う行動を行ったユーザの割合を示す。また、動画コンテンツＶＣ２において、出演者「ＴＲ２２」及び行動情報「笑う」に対応付けられる数値「３％」は、出演者「ＴＲ２２」が動画コンテンツＶＣ２の中で演じている際に笑う行動を行ったユーザの割合を示す。また、動画コンテンツＶＣ２において、出演者「ＴＲ２３」及び行動情報「笑う」に対応付けられる数値「３％」は、出演者「ＴＲ２３」が動画コンテンツＶＣ２の中で演じている際に笑う行動を行ったユーザの割合を示す。 For example, in the video content VC2, the numerical value "3%" associated with the performer "TR21" and the action information "laughing" performs the action of laughing when the performer "TR21" is performing in the video content VC2. Shows the percentage of users. Further, in the video content VC2, the numerical value "3%" associated with the performer "TR22" and the action information "laughing" performs the action of laughing when the performer "TR22" is performing in the video content VC2. Shows the percentage of users. Further, in the video content VC2, the numerical value "3%" associated with the performer "TR23" and the action information "laughing" performs the action of laughing when the performer "TR23" is performing in the video content VC2. Shows the percentage of users.

図６に戻り、制御部１３０は、ＣＰＵやＭＰＵ等によって、情報処理装置１００内部の記憶装置に記憶されている各種プログラムがＲＡＭを作業領域として実行されることにより実現される。また、制御部１３０は、例えば、ＡＳＩＣやＦＰＧＡ等の集積回路により実現される。 Returning to FIG. 6, the control unit 130 is realized by executing various programs stored in the storage device inside the information processing device 100 by using the RAM as a work area by the CPU, MPU, or the like. Further, the control unit 130 is realized by, for example, an integrated circuit such as an ASIC or FPGA.

As shown in FIG. 6, the control unit 130 includes a receiving unit 131, an aggregation unit 132, a specifying unit 133, a presentation unit 134, and an editing unit 135, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 6 and may be any other configuration that performs the information processing described later. Likewise, the connection relationships among the processing units of the control unit 130 are not limited to those shown in FIG. 6 and may be other connection relationships.

(About the receiving unit 131)
The receiving unit 131 receives various kinds of information. Specifically, the receiving unit 131 receives information transmitted from the terminal device 10. For example, the receiving unit 131 receives information including the estimation results of the estimation processing performed by the terminal device 10. As described above, the estimation unit 14f performs the estimation processing in real time while the user is browsing the video content, and the transmission unit 14g sequentially transmits information including the estimation results to the information processing device 100 in real time. The receiving unit 131 therefore receives the information in real time while the user is browsing the video content. In the example of FIG. 1, the terminal device 10 receives the imaging information FDA1 and FDA2. The receiving unit 131 also receives, for example, the attribute information of the user. In this case, the receiving unit 131 may receive the user's attribute information from the terminal device 10, or from an external server having a storage unit in which attribute information is stored in advance for each user.

(About the aggregation unit 132)
The aggregation unit 132 performs aggregation processing that aggregates the estimation results produced by the estimation unit 14f. For example, based on those estimation results, the aggregation unit 132 specifies the time positions in the video content at which an emotional expression action was performed. The specifying unit 133 then counts, based on the specified time positions, the number of times an emotional expression action was performed in the video content. In the example of FIG. 1, the aggregation unit 132 specifies each time position at which the laughter degree, a feature amount, is equal to or greater than a predetermined threshold (for example, a laughter degree of "5") as a time position at which the user U1 laughed in the video content VC1. The aggregation unit 132 stores the specified time positions in the estimation information storage unit 122.
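The thresholding step in the FIG. 1 example can be illustrated with a short Python sketch. It assumes that the estimation result for one user and one video content is available as a mapping from time position to laughter degree. The function name `laughing_positions` and the sample values are hypothetical, and only the threshold of 5 comes from the description.

```python
from typing import Dict, List


def laughing_positions(
    laughter_by_position: Dict[str, float],
    threshold: float = 5.0,
) -> List[str]:
    """Return the time positions whose laughter degree meets the threshold."""
    return [
        position
        for position, degree in laughter_by_position.items()
        if degree >= threshold
    ]


# Hypothetical laughter degrees estimated for user U1 on video content VC1.
u1_vc1 = {"t1": 1.0, "t2": 6.5, "t3": 5.0, "t4": 4.9}
u1_laughs = laughing_positions(u1_vc1)
print(u1_laughs)
```

The resulting list corresponds to what would be stored per user and per video content in the estimation information storage unit 122.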

As a further example, the aggregation unit 132 aggregates, for each time position of each video content, the number of acting users, that is, the number of users who laughed during that video content. The aggregation unit 132 performs this aggregation using, for example, the information stored in the estimation information storage unit 122. The aggregation unit 132 also aggregates, for each time position of each video content, the ratio of the number of users who laughed during that video content to the total number of users who viewed it during a predetermined period, again using, for example, the information stored in the estimation information storage unit 122. The aggregation unit 132 stores the aggregation results in the total aggregation result storage unit 123.
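The per-time-position aggregation can be sketched in Python as follows, assuming the per-user laughing positions for one video content are available as sets. The function name `aggregate_by_position` and the sample data are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def aggregate_by_position(
    laughs_by_user: Dict[str, Set[str]],
) -> Dict[str, Tuple[int, float]]:
    """Map each time position to (number of acting users, ratio of all viewers).

    laughs_by_user must contain every viewer of the video content, including
    viewers who never laughed (mapped to an empty set).
    """
    total = len(laughs_by_user)
    counts = Counter(
        position
        for positions in laughs_by_user.values()
        for position in positions
    )
    # counts is empty when there are no viewers, so no division by zero occurs.
    return {position: (n, n / total) for position, n in counts.items()}


# Hypothetical per-user laughing positions for one video content.
laughs_by_user = {
    "U1": {"t2", "t21"},
    "U2": {"t2"},
    "U3": set(),
    "U4": {"t21", "t51"},
}
totals = aggregate_by_position(laughs_by_user)
print(totals)
```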

For each performer appearing in content, the aggregation unit 132 also aggregates information about the number of users who performed an emotional expression action among the users who were viewing the video content VC1 while that performer was performing in the content. For example, the aggregation unit 132 calculates the percentage of users who laughed while the performer was performing in the content by aggregating the information stored in the estimation information storage unit 122. The aggregation unit 132 stores the result of this aggregation in the performer information storage unit 125.

(About the specifying unit 133)
The specifying unit 133 specifies emotion points, which are the points in the content at which users' emotions changed, by aggregating the estimation results produced by the estimation unit 14f. As described with reference to FIG. 10, emotion points are divided into interesting points, crying points, surprise points, and the like.

For example, the specifying unit 133 specifies emotion points for each user by aggregating, for each user, the estimation results produced by the estimation unit 14f. The time positions stored for each user in the estimation information storage unit 122 can be regarded as that user's emotion points. The specifying unit 133 also specifies emotion points by counting how many times each of these per-user time positions appears. In other words, the specifying unit 133 specifies emotion points by counting the number of acting users, that is, the number of users who performed an emotional expression action in the content.

For example, the specifying unit 133 specifies, as an emotion point, a point at which a numerical value based on the number of acting users satisfies predetermined condition information. Specifically, when the content is video content, the specifying unit 133 specifies, as an emotion point, a time position within the playback time of the video content at which the numerical value based on the number of acting users satisfies the predetermined condition information. For example, the specifying unit 133 specifies, as an interesting point (an example of an emotion point), a time position at which the number of acting users, that is, the number of users who laughed among all users who viewed the corresponding video content during a predetermined period, is equal to or greater than a predetermined number. Alternatively, the specifying unit 133 specifies, as an interesting point, a time position at which the ratio of the number of users who laughed to the total number of users who viewed the corresponding video content during the predetermined period is equal to or greater than a predetermined ratio. The specifying unit 133 stores the specified emotion points in the emotion point storage unit 124-1.

The specifying unit 133 also specifies emotion points for each user attribute by aggregating the estimation results produced by the estimation unit 14f for each user attribute. The specifying unit 133 stores the emotion points specified in this way in the emotion point storage unit 124-2.
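A minimal Python sketch of the per-attribute variant follows. It assumes each user has an (age group, gender) attribute pair and uses an absolute count as the condition. The function name `emotion_points_by_attribute`, the attribute labels, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, List, Set, Tuple

Attribute = Tuple[str, str]  # (age group, gender); labels are illustrative


def emotion_points_by_attribute(
    laughs_by_user: Dict[str, Set[str]],
    attributes: Dict[str, Attribute],
    min_count: int,
) -> Dict[Attribute, List[str]]:
    """Group users by attribute, then threshold the per-position laugh counts."""
    counts: DefaultDict[Attribute, Counter] = defaultdict(Counter)
    for user, positions in laughs_by_user.items():
        # Each position in the set adds one acting user for this attribute.
        counts[attributes[user]].update(positions)
    return {
        attribute: sorted(p for p, n in counter.items() if n >= min_count)
        for attribute, counter in counts.items()
    }


# Hypothetical viewers of one video content.
attributes = {
    "U1": ("10s", "male"),
    "U2": ("10s", "male"),
    "U3": ("20s", "female"),
}
laughs_by_user = {"U1": {"t13", "t55"}, "U2": {"t13"}, "U3": {"t55"}}
by_attribute = emotion_points_by_attribute(laughs_by_user, attributes, min_count=2)
print(by_attribute)
```

Note that `sorted` orders the position labels as strings here. A real implementation would more likely sort by numeric playback time.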

The specifying unit 133 may also specify emotion points for each user based on the result of aggregating the estimation results for each user and on the attribute information of each user. For example, the specifying unit 133 may specify a user's emotion points by referring to the emotion points of other users whose age is the same as or similar to that of the user. The specifying unit 133 may likewise refer to the emotion points of other users whose gender is the same as or similar to that of the user, or of other users whose interests and tastes are the same as or similar to those of the user.

(About the presentation unit 134)
The presentation unit 134 presents information about content based on the estimation results produced by the estimation unit 14f. For example, when a user browses content, the presentation unit 134 presents information about that content based on the estimation results obtained for it. As described with reference to FIG. 2, when the content is video content, the presentation unit 134 performs display control so that a graph showing the transition of the number of acting users, that is, the number of users who performed an emotional expression action in the content, which varies with the time position of the video content, is displayed (presented) in association with the time positions indicated by the seek bar displayed together with the content. For example, in the example of FIG. 2, the presentation unit 134 performs the processing of steps S13 and S14.

また、提示部１３４は、コンテンツの中で感情表出行動を行ったユーザの人数である行動人数に基づいて、コンテンツに順位付けを行う。そして、提示部１３４は、付与した順位情報に基づいて、ランキング形式でコンテンツを提示する。例えば、提示部１３４は、順位の高い上位所定数のコンテンツを人気コンテンツランキングとしてユーザに提示する。この点について、図９の例を用いて説明する。 In addition, the presentation unit 134 ranks the content based on the number of users who have performed emotional expression behavior in the content. Then, the presentation unit 134 presents the content in the ranking format based on the given ranking information. For example, the presentation unit 134 presents a predetermined number of high-ranked contents to the user as a popular content ranking. This point will be described with reference to the example of FIG.

In the example of FIG. 9, the total aggregation result storage unit 123 stores, for each time position of each video content, the number of acting users, that is, the number of users who laughed among all users who viewed that video content during a predetermined period. The presentation unit 134 therefore ranks the content based on the numbers of acting users stored in the total aggregation result storage unit 123. For example, the presentation unit 134 calculates, for each time position of each video content, the ratio of the number of users who laughed to the total number of users who viewed the video content. This calculation may instead be performed by the aggregation unit 132.

次に、提示部１３４は、各動画コンテンツから最も高い割合を抽出する。図９の例では、動画コンテンツＶＣ１については時間位置ｔ２の「４６％」、動画コンテンツＶＣ２については時間位置ｔ１の「５％」といった具合である。そして、提示部１３４は、例えば、この割合がより高い上位５つの動画コンテンツを提示対象の動画コンテンツとして決定するとともに、割合が高い動画コンテンツほど高い順位を付与する。図９で不図示であるが、説明の便宜上、提示部１３４は、動画コンテンツＶＣ５「１位」、動画コンテンツＶＣ１「２位」、動画コンテンツＶＣ４「３位」、動画コンテンツＶＣ２「４位」、動画コンテンツＶＣ３「５位」、といった順位付けを行ったものとする。 Next, the presentation unit 134 extracts the highest ratio from each moving image content. In the example of FIG. 9, the video content VC1 has a time position t2 of “46%”, and the video content VC2 has a time position t1 of “5%”. Then, for example, the presentation unit 134 determines the top five video contents having a higher ratio as the video contents to be presented, and assigns a higher rank to the video contents having a higher ratio. Although not shown in FIG. 9, for convenience of explanation, the presentation unit 134 has video content VC5 “1st place”, video content VC1 “2nd place”, video content VC4 “3rd place”, video content VC2 “4th place”, and the like. It is assumed that the video content VC3 is ranked as "5th place".
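The ranking step can be sketched in Python as follows: take the highest ratio of each video content, sort in descending order, and keep the top entries. The function name `rank_videos` is hypothetical. The ratios for VC1 (46% at t2) and VC2 (5% at t1) follow the FIG. 9 example, while every other value is invented for illustration.

```python
from typing import Dict, List, Tuple


def rank_videos(
    ratios_by_video: Dict[str, Dict[str, float]],
    top_n: int = 5,
) -> List[Tuple[int, str, float]]:
    """Rank video contents by their highest per-position laugh ratio."""
    # Take the peak ratio of each video, skipping videos without data.
    peaks = {
        video: max(ratios.values())
        for video, ratios in ratios_by_video.items()
        if ratios
    }
    ordered = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, video, ratio)
        for rank, (video, ratio) in enumerate(ordered[:top_n], start=1)
    ]


ratios_by_video = {
    "VC1": {"t1": 0.20, "t2": 0.46},
    "VC2": {"t1": 0.05, "t2": 0.03},
    "VC5": {"t1": 0.60},  # hypothetical value, not shown in FIG. 9
}
ranking = rank_videos(ratios_by_video)
print(ranking)
```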

The presentation unit 134 then presents the five ranked video contents to users in a form such as "This week's popular video ranking". For example, when the user U1 accesses the video site ST, the presentation unit 134 displays "This week's popular video ranking" on a predetermined page of the video site ST. If the ranking contains a video content that interests the user U1, the user U1 can jump to its video viewing page by selecting it.

As another example, the presentation unit 134 can recommend content suited to each user based on the estimation results produced by the estimation unit 14f. Specifically, the presentation unit 134 recommends content suited to a user based on the emotion points identified for that user by aggregating, per user, the estimation results produced by the estimation unit 14f. Here, an emotion point is a point in a content at which the user's emotion changed. For example, the presentation unit 134 recommends content personalized for each user as a "video list for you" or the like. This is described with reference to the example of FIG. 8.

In the example of FIG. 8, the estimation information storage unit 122 stores, for each video content, the time positions at which each user performed an emotion expression action. The time positions stored in the estimation information storage unit 122 can therefore be regarded as each user's emotion points. Accordingly, the presentation unit 134 analyzes what the video content shows at these emotion points. Taking the user U1 as an example, the presentation unit 134 analyzes who was performing at time position t2 of the video content VC1, what kind of performance that performer was giving, and so on. The presentation unit 134 analyzes time positions t21 and t51 of the video content VC1 in the same way. Based on the analysis results, the presentation unit 134 learns the tendencies of the user U1 with respect to video content, for example which genres of video content the user U1 likes, which performers the user U1 likes, and which kinds of performance (for example, comedy routines) the user U1 likes.

Here, as a simple example, assume that the presentation unit 134 has learned that the user U1 "tends to prefer programs in which multiple groups take turns performing manzai comedy". Suppose that the user U1 then accesses the video site ST. In that case, the presentation unit 134 displays a "video list for you" on a predetermined page of the video site ST. Here, the content distribution device 30 stores the various contents to be distributed to users in its storage unit. The presentation unit 134 therefore accesses the storage unit of the content distribution device 30 and selects video contents that are "programs in which multiple groups take turns performing manzai comedy". The presentation unit 134 then instructs the content distribution device 30 to distribute the selected video contents to the user U1. For example, the presentation unit 134 has the selected video contents distributed so that they are displayed as the "video list for you".
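
One way to picture this learning and selection step is a simple tag-counting sketch in Python. It assumes that the content analysis at each emotion point has already been reduced to a list of tags; the tag names, the catalog entries (VC7 to VC9), and the functions `learn_preference` and `recommend` are all hypothetical, since the description does not specify how the analysis or the matching is carried out.

```python
from collections import Counter


def learn_preference(emotion_points, tags_at):
    """Count how often each tag appears at the user's emotion points."""
    counts = Counter()
    for content_id, position in emotion_points:
        counts.update(tags_at.get((content_id, position), []))
    return counts


def recommend(catalog, preference, top_n=3):
    """Return ids of catalog items whose tags best match the preference."""
    scored = []
    for item in catalog:
        score = sum(preference[tag] for tag in item["tags"])
        if score > 0:
            scored.append((score, item["id"]))
    # Highest score first; ties are broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [content_id for _, content_id in scored[:top_n]]


# Emotion points of user U1 in VC1 (time positions from the FIG. 8 example).
u1_points = [("VC1", "t2"), ("VC1", "t21"), ("VC1", "t51")]

# Hypothetical analysis output: tags describing the content at each point.
tags_at = {
    ("VC1", "t2"): ["manzai", "multi-group"],
    ("VC1", "t21"): ["manzai"],
    ("VC1", "t51"): ["manzai", "multi-group"],
}

# Hypothetical catalog held by the content distribution device 30.
catalog = [
    {"id": "VC7", "tags": ["manzai", "multi-group"]},
    {"id": "VC8", "tags": ["drama"]},
    {"id": "VC9", "tags": ["manzai"]},
]

preference = learn_preference(u1_points, tags_at)
video_list_for_you = recommend(catalog, preference)
print(video_list_for_you)
```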

As a result, the information processing device 100 can recommend content that the user U1 is likely to enjoy. For example, when the user U1 has not decided what to watch but would like to watch something if it looks interesting, the user U1 can easily watch video content that matches his or her preferences without actively searching for it. In other words, the information processing device 100 can recommend content suited to the user without requiring troublesome operations from the user.

(About the editing unit 135)
The editing unit 135 edits video content. Specifically, the editing unit 135 edits content based on the emotion points identified by the specifying unit 133. In this embodiment, editing may include generating new content. For example, the editing unit 135 extracts, from each content, the partial contents corresponding to its emotion points, and generates new content by combining the extracted partial contents. This is described with reference to the example of FIG. 10.

In the example of the emotion point storage unit 124-1 shown in FIG. 10, the interesting points of the video content VC1 are time positions t2, t31, and t62, and the interesting points of the video content VC2 are time positions t13, t55, and t61.

In this case, the editing unit 135 extracts from the video content VC1 the partial content around time position t2, the partial content around time position t31, and the partial content around time position t62. For example, the editing unit 135 extracts the portion of the video content VC1 from time position t0 to t4 as the partial content VC11 around time position t2, the portion from time position t29 to t33 as the partial content VC12 around time position t31, and the portion from time position t60 to t64 as the partial content VC13 around time position t62.

Likewise, the editing unit 135 extracts from the video content VC2 the partial content around time position t13, the partial content around time position t55, and the partial content around time position t61. For example, the editing unit 135 extracts the portion of the video content VC2 from time position t11 to t15 as the partial content VC21 around time position t13, the portion from time position t53 to t57 as the partial content VC22 around time position t55, and the portion from time position t59 to t63 as the partial content VC23 around time position t61.

The editing unit 135 then generates a new video content VC11-21 by combining (joining together) the partial contents VC11, VC12, VC13, VC21, VC22, and VC23 extracted as described above. The video content VC11-21 can thus be regarded as a video content made up only of interesting points. The presentation unit 134 may present the video content VC11-21 in response to access from a user.
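
The extraction and joining described above can be sketched in Python as follows. The sketch assumes a symmetric window of two time positions on each side of an emotion point, which matches ranges such as t0 to t4 around t2 and t11 to t15 around t13 in the text. The function names and the representation of a partial content as a `(content_id, start, end)` tuple are hypothetical, and no actual video processing is performed.

```python
def clip_windows(points, margin=2):
    """Return (start, end) windows centered on each emotion point."""
    windows = []
    for t in sorted(points):
        # Clamp at 0 so that a point near the beginning stays in range.
        windows.append((max(0, t - margin), t + margin))
    return windows


def build_highlight(points_by_content, margin=2):
    """Concatenate the windows of every content into one edit list."""
    edit_list = []
    for content_id, points in points_by_content.items():
        for start, end in clip_windows(points, margin):
            edit_list.append((content_id, start, end))
    return edit_list


# Interesting points from the FIG. 10 example, as integer time positions.
interesting_points = {"VC1": [2, 31, 62], "VC2": [13, 55, 61]}

# Edit list standing in for the new video content VC11-21.
vc11_21 = build_highlight(interesting_points)
for content_id, start, end in vc11_21:
    print(f"{content_id}: t{start} to t{end}")
```

An edit list like this could be handed to any video tool that cuts and concatenates segments; overlapping windows within one content are not merged here, which a fuller version would probably want to handle.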

This allows the information processing device 100 to dynamically generate video content that users can enjoy more. The information processing device 100 can also sell the video content VC11-21 to a predetermined business operator. Although the above example shows editing based on interesting points, the information processing device 100 may perform the same kind of editing based on crying points or surprise points, or may edit by mixing interesting points, crying points, and surprise points.

The editing unit 135 may also attach a mark that abstracts an emotion to a predetermined time position of a video content. For example, assume the aggregation result shows that, of all users who viewed the video content VC1 during a predetermined period, the largest number, "693", laughed at time position "t2" (number of acting users: 693). In this case, the editing unit 135 may attach a mark such as a laughing emoticon at time position "t2". The laughing emoticon mark is then displayed superimposed on the video content VC1. The mark is not limited to this example; instead of a mark that abstracts an emotion, laughter, a sound effect, or a character may be added. In this way, the editing unit 135 may attach any information that helps convey the excitement of the video content. Although the above example uses laughter as the user's emotion, this editing process is also applicable to emotions such as crying and surprise.

[4. Processing procedure]
Next, the information processing procedure according to the embodiment is described with reference to FIG. 12. FIG. 12 is a flowchart showing the information processing executed by the information processing device 100 according to the embodiment. The example of FIG. 12 shows the procedure of information processing performed by the terminal device 10 and the information processing device 100 in cooperation. The terminal device 10 and the information processing device 100 perform this information processing by executing the information processing program according to the embodiment. In the example of FIG. 12, the video content viewed by the user is the video content VC1.

First, the consent information receiving unit 14b of the terminal device 10 determines, based on the consent information received from the user, whether the user has permitted imaging (step S101). If the user has not permitted imaging (step S101; No), the consent information receiving unit 14b ends the process without imaging the user. If the consent information receiving unit 14b determines that the user has permitted imaging (step S101; Yes), the camera control unit 14d determines whether viewing of the video content VC1 has started (step S102). If viewing of the video content VC1 has not started (step S102; No), the camera control unit 14d waits until it starts. If viewing of the video content VC1 has started (step S102; Yes), the camera control unit 14d images the user (step S103).

Because the camera control unit 14d continues imaging while the user is viewing the video content VC1, the estimation unit 14f performs an estimation process based on the data captured by the camera control unit 14d (face video data): it estimates the user's emotion expression action and a feature amount indicating the degree of that action (step S104). For example, the estimation unit 14f performs this estimation process every second, in real time while the user is viewing the video content VC1, until the user finishes viewing it. The transmission unit 14g then transmits information including the estimation result from the estimation unit 14f to the information processing device 100 every second (step S105).

When the receiving unit 131 receives the imaging information, the aggregation unit 132 identifies, based on the estimation results produced by the estimation unit 14f, the time positions in the video content VC1 at which an emotion expression action was performed (step S206). For example, the aggregation unit 132 identifies a time position at which the feature amount is equal to or greater than a predetermined threshold as a time position in the video content VC1 at which the user performed the corresponding emotion expression action.
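
The threshold test of step S206 can be illustrated with a short Python sketch. The record layout (`time_position`, `action`, `feature`), the sample values, and the threshold of 5 are assumptions for illustration only; the description says only that the feature amount is compared with a predetermined threshold.

```python
LAUGH_THRESHOLD = 5  # hypothetical "predetermined threshold"


def action_positions(estimates, action, threshold):
    """Return time positions where the given action meets the threshold."""
    return [
        record["time_position"]
        for record in estimates
        if record["action"] == action and record["feature"] >= threshold
    ]


# Hypothetical per-second estimation results for one user viewing VC1.
estimates = [
    {"time_position": 1, "action": "laugh", "feature": 2},
    {"time_position": 2, "action": "laugh", "feature": 7},
    {"time_position": 3, "action": "laugh", "feature": 5},
    {"time_position": 4, "action": "cry", "feature": 9},
]

laugh_positions = action_positions(estimates, "laugh", LAUGH_THRESHOLD)
print(laugh_positions)
```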

Next, the aggregation unit 132 performs various aggregations based on the estimation results from the estimation unit 14f and the time positions identified in step S206 (step S207). For example, the aggregation unit 132 aggregates the number and ratio of users who laughed, and stores the results in the overall aggregation result storage unit 123.

Next, the specifying unit 133 identifies emotion points (for example, interesting points) (step S208). For example, the specifying unit 133 refers to the overall aggregation result storage unit 123 and identifies as an interesting point each time position in a video content at which at least a predetermined number of users laughed, or at which at least a predetermined ratio of users laughed.
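
Step S208 can be sketched in Python as a filter over the aggregated counts. The function `emotion_points`, the total of 1500 viewers, the thresholds, and all counts except the 693 laughing users at t2 mentioned in the description are hypothetical values chosen for illustration.

```python
def emotion_points(counts, total_viewers, min_count=None, min_ratio=None):
    """Return time positions that meet the count or the ratio condition."""
    if total_viewers <= 0:
        return []
    points = []
    for position, count in sorted(counts.items()):
        by_count = min_count is not None and count >= min_count
        by_ratio = min_ratio is not None and count / total_viewers >= min_ratio
        if by_count or by_ratio:
            points.append(position)
    return points


# Number of laughing users per time position of VC1 (mostly hypothetical).
laugh_counts = {1: 120, 2: 693, 31: 520, 45: 80, 62: 610}
TOTAL_VIEWERS = 1500  # hypothetical total for the predetermined period

points_by_count = emotion_points(laugh_counts, TOTAL_VIEWERS, min_count=500)
points_by_ratio = emotion_points(laugh_counts, TOTAL_VIEWERS, min_ratio=0.4)
print(points_by_count, points_by_ratio)
```

The two conditions can give different results for the same data, so the choice between a head count and a ratio matters most when contents differ widely in audience size.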

In this state, the receiving unit 131 determines whether access from a user has been received (step S209). For example, the receiving unit 131 determines whether a distribution request for a video content has been received from a user. If no access has been received (step S209; No), the receiving unit 131 waits until one is received. If the receiving unit 131 has received access (step S209; Yes), the presentation unit 134 generates content (the content to be presented) corresponding to the video content that the user is about to watch (here, the video content VC2) (step S210). For example, as described with reference to FIG. 2, the presentation unit 134 accesses the overall aggregation result storage unit 123, acquires transition information, and, based on the acquired transition information, generates a graph G showing the transition of the number of acting users over the video content VC2.

Next, the presentation unit 134 presents the graph G to the user who sent the distribution request (step S211). For example, the information processing device 100 performs display control on the content distribution device 30 so that the graph G is displayed on the seek bar BR, with which the user can control the playback position (time position) of the video content VC2. For example, the information processing device 100 instructs the content distribution device 30 to distribute the graph G to the terminal device 10 so that the terminal device 10 displays the graph G on the seek bar BR. The information processing device 100 also transmits the graph G to the content distribution device 30.

[5. Modifications]
The terminal device 10 and the information processing device 100 according to the above embodiment may be implemented in various forms other than the above embodiment. Other embodiments of the terminal device 10 and the information processing device 100 are therefore described below.

[5-1. Content]
The above embodiment shows an example in which the content subject to information processing by the information processing device 100 is video content, but the content may instead be image content. That is, the information processing device 100 according to the embodiment acquires information on the user's emotion estimated based on the facial expression of the user shown in imaging information, where the imaging information is captured, while the user is viewing the image content, by the imaging means of the terminal device 10 displaying that image content. The information processing device 100 then aggregates the acquired estimation results to identify emotion points, which are points in the image content at which the user's emotion changed.

When the content is image content (for example, an electronic book), the information processing device 100 identifies as an emotion point each page of the image content for which a numerical value based on the number of acting users satisfies predetermined condition information.

In this case, the aggregation unit 132 aggregates, for each page of each image content, the number of acting users, that is, the number of users who laughed at that image content. For example, the aggregation unit 132 aggregates the number of acting users using the information stored in the estimation information storage unit 122. The aggregation unit 132 also aggregates, for each page of each image content, the ratio of the number of users who laughed at the image content to the total number of users who viewed it during a predetermined period. For example, the aggregation unit 132 aggregates this ratio using the information stored in the estimation information storage unit 122. The aggregation unit 132 stores the aggregation results in the overall aggregation result storage unit 123.

The specifying unit 133 identifies as an emotion point each point at which a numerical value based on the number of acting users satisfies predetermined condition information. Specifically, the specifying unit 133 identifies as an emotion point each page of the image content for which that value satisfies the predetermined condition information. For example, the specifying unit 133 identifies as an interesting point each page for which the number of acting users, that is, the number of users who laughed out of all users who viewed the corresponding image content during a predetermined period, is equal to or greater than a predetermined number. The specifying unit 133 also identifies as an interesting point each page for which the ratio of the number of users who laughed to the total number of users who viewed the corresponding image content during a predetermined period is equal to or greater than a predetermined ratio.

This allows the information processing device 100 to provide information that is meaningful to the user in response to the changes in emotion caused by viewing content. For example, the information processing device 100 can provide new image content that gathers only the interesting-point pages, or can present a graph showing which pages are points of laughter when a user is about to view the image content.

[5-2. Terminal device]
The above embodiment shows an example in which the terminal device 10 estimates information on emotion, but the estimation process executed by the terminal device 10 may instead be performed on the information processing device 100 side. Conversely, in addition to the estimation process, the terminal device 10 may perform the aggregation process performed by the aggregation unit 132 of the information processing device 100, the identification process performed by the specifying unit 133, and so on.

[5-3. Estimating the degree of concentration]
The above embodiment shows an example in which the estimation unit 14f performs an estimation process that estimates, as information on the user's emotion, an emotion expression action and its feature amount. However, the estimation unit 14f may instead estimate, as information on the user's emotion, the user's degree of concentration on the content based on the facial expression of the user shown in the imaging information. For example, it is precisely because a user is concentrating on video content that the user performs emotion expression actions such as laughing, crying, or being surprised. The user's degree of concentration on content can therefore be regarded as information on the user's emotion. The estimation unit 14f can estimate the degree of concentration (an index value indicating how concentrated the user is) using the same technique as the estimation process described so far. An example is described below, referring to the example of FIG. 1 as appropriate.

For example, by performing facial expression analysis on the face video data, the estimation unit 14f determines and measures the viewing mode, such as the user's facial expression and at which time positions of the video content VC1 the user paid attention to the video content VC1 (or to the screen of the terminal device 10 on which the video content VC1 is displayed). Based on the result, the estimation unit 14f estimates the degree of concentration, for example, every second.

The transmission unit 14g transmits information including this estimation result (the degree of concentration) to the information processing device 100, for example, every second. As one example, the transmission unit 14g transmits, every second, information including the time position (time code) corresponding to the playback time of the video content and the degree of concentration. That is, while the user is viewing the video content, the transmission unit 14g sequentially transmits information including the time position (time code) and the degree of concentration to the information processing device 100. For example, the transmission unit 14g transmits information such as time position "1 minute 53 seconds" with degree of concentration "10", then time position "1 minute 54 seconds" with degree of concentration "8", and then time position "1 minute 55 seconds" with degree of concentration "7".
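
The per-second messages described above could, for instance, be serialized as JSON. The following Python sketch is only one possible encoding: the field names `content_id`, `time_position`, and `concentration`, the `m:ss` time format, and the use of JSON itself are assumptions, since the description does not define a message format. The three samples reuse the time positions and degrees of concentration given in the text.

```python
import json


def format_time_position(seconds):
    """Format a playback offset in seconds as m:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:d}:{secs:02d}"


def build_message(content_id, seconds, concentration):
    """Build one per-second message as a JSON string."""
    return json.dumps({
        "content_id": content_id,
        "time_position": format_time_position(seconds),
        "concentration": concentration,
    })


# (playback offset in seconds, degree of concentration) from the text:
# 1 min 53 s -> 10, 1 min 54 s -> 8, 1 min 55 s -> 7.
samples = [(113, 10), (114, 8), (115, 7)]
messages = [build_message("VC1", s, c) for s, c in samples]
for message in messages:
    print(message)
```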

Here, if the video content VC1 is an advertising video, the information processing device 100 can measure the advertising effect based on the degree of concentration received from the terminal device 10. Based on the measured advertising effect, it can analyze what kind of advertisement distribution is effective and feed the analysis results back to the advertiser.

[5-4. Content distribution device]
The above embodiment shows an example in which the content distribution device 30 distributes various contents, but the information processing device 100 may distribute content itself by having the functions of the content distribution device 30. In this case, the information processing device 100 has a storage unit that stores the various contents received from business operators (for example, content providers).

[5-5. Audio information]
The above embodiment shows an example in which the information processing device 100 estimates information on the user's emotion based on the user's facial expression shown in the imaging information. However, the information processing device 100 may instead acquire audio information collected by a sound collecting means (for example, a microphone) of the terminal device 10 and estimate information on the user's emotion based on the acquired audio information.

For example, the information processing device 100 acquires the user's laughter collected by the microphone of the terminal device 10. Because the user's laughter has been acquired, the information processing device 100 may estimate the information on the user's emotion to be "laughter". In doing so, the information processing device 100 analyzes the user's audio information using conventional techniques such as audio analysis.

The modifications are not limited to the above examples. For example, the information processing device 100 may acquire audio information collected by the sound collecting means of the terminal device 10 and estimate information on the user's emotion by combining the acquired audio information with the imaging information.

Furthermore, the information processing device 100 is not limited to audio information and may estimate information on the user's emotion based on, for example, sensing information acquired from a gyro sensor and an acceleration sensor that detect movement of the terminal device 10, or sensing information on the user's biometric information such as the user's heart rate and body temperature.

[6. Hardware configuration]
The terminal device 10, the content distribution device 30, and the information processing device 100 according to the embodiment described above are realized by, for example, a computer 1000 configured as shown in FIG. 13. The information processing device 100 is used as the example below. FIG. 13 is a hardware configuration diagram showing an example of the computer 1000 that realizes the functions of the information processing device 100. The computer 1000 has a CPU 1100, a RAM 1200, a ROM 1300, an HDD 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

The CPU 1100 operates based on a program stored in the ROM 1300 or the HDD 1400 and controls each unit. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 starts up, programs that depend on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via the communication network 50 and passes it to the CPU 1100, and transmits data generated by the CPU 1100 to other devices via the communication network 50.

The CPU 1100 controls output devices such as a display and a printer, and input devices such as a keyboard and a mouse, via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600, and outputs generated data to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in the recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the information processing device 100 according to the embodiment, the CPU 1100 of the computer 1000 realizes the functions of the control unit 130 by executing a program loaded onto the RAM 1200. The HDD 1400 stores the data held in the storage unit 120. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them; as another example, it may acquire these programs from another device via the communication network 50.

Similarly, when the computer 1000 functions as the terminal device 10, the CPU 1100 of the computer 1000 realizes the functions of the control unit 14 by executing a program loaded onto the RAM 1200.

[7. Others]
Of the processes described in the above embodiments, all or part of a process described as being performed automatically can also be performed manually, and all or part of a process described as being performed manually can also be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified.

Each component of each illustrated device is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to the illustrated one; all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The embodiments described above can be combined as appropriate to the extent that their processing contents do not contradict each other.

The term "unit" (section, module, unit) used above can be read as "means", "circuit", or the like. For example, the estimation unit can be read as estimation means or an estimation circuit.

[8. Effects]
As described above, the information processing device 100 according to the embodiment includes a receiving unit 131 (an example of an acquisition unit) and an identification unit 133. The receiving unit 131 acquires information about the emotions of a user viewing content, the emotions being estimated based on the user's facial expression indicated by imaging information captured by the imaging means of the terminal device 10 displaying the content. The identification unit 133 identifies emotion points, which are points in the content at which the user's emotions changed, by aggregating the estimation results acquired by the receiving unit 131.

As a result, because the information processing device 100 according to the embodiment identifies emotion points, which are points in the content at which the user's emotions changed, by aggregating the estimation results, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the receiving unit 131 acquires information about the user's emotions estimated in real time based on the user's facial expression, and the identification unit 133 identifies emotion points for each user in real time by aggregating, for each user, the estimation results acquired in real time by the receiving unit 131.

As a result, because the information processing device 100 according to the embodiment identifies emotion points for each user in real time by aggregating, for each user, the estimation results acquired in real time, it can provide, in real time, information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the receiving unit 131 acquires attribute information about the user's attributes, and the identification unit 133 identifies emotion points for each user based on the aggregation result obtained by aggregating, for each user, the estimation results acquired by the receiving unit 131, and on the attribute information of each user.

As a result, because the information processing device 100 according to the embodiment identifies emotion points for each user based on the per-user aggregation result of the acquired estimation results and on the attribute information of each user, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the identification unit 133 identifies emotion points by counting the number of acting users, that is, the number of users who performed an emotion-expressing action in the content.

As a result, because the information processing device 100 according to the embodiment identifies emotion points by counting the number of acting users, that is, the number of users who performed an emotion-expressing action in the content, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the identification unit 133 identifies, as an emotion point, a point at which a numerical value based on the number of acting users satisfies predetermined condition information.

As a result, because the information processing device 100 according to the embodiment identifies, as an emotion point, a point at which a numerical value based on the number of acting users satisfies the predetermined condition information, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, when the content is video content, the identification unit 133 identifies, as an emotion point, a time position within the playback time of the video content at which a numerical value based on the number of acting users satisfies the predetermined condition information.

As a result, because the information processing device 100 according to the embodiment, when the content is video content, identifies as an emotion point a time position within the playback time at which a numerical value based on the number of acting users satisfies the predetermined condition information, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.
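
The time-position rule above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each estimation result arrives as a `(user_id, time_sec)` pair for an emotion-expressing action, that time positions are grouped into fixed-width bins, and that the "predetermined condition information" is a simple minimum number of distinct acting users. The function name, the bin width, and the threshold are hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def find_emotion_points(events, bin_seconds=5, min_users=3):
    """Return (bin_start_sec, acting_user_count) for bins meeting the threshold.

    events: iterable of (user_id, time_sec) pairs, one per emotion-expressing
    action (hypothetical input format).
    """
    users_per_bin = defaultdict(set)
    for user_id, time_sec in events:
        # Map the playback position to the start of its time bin.
        bin_start = int(time_sec // bin_seconds) * bin_seconds
        # A set counts each user at most once per bin.
        users_per_bin[bin_start].add(user_id)

    # Assumed condition: at least min_users distinct users acted in the bin.
    return sorted(
        (bin_start, len(users))
        for bin_start, users in users_per_bin.items()
        if len(users) >= min_users
    )


events = [("u1", 12.0), ("u2", 13.5), ("u3", 14.9), ("u1", 13.0), ("u4", 40.0)]
points = find_emotion_points(events)
```

Counting distinct users rather than raw events keeps one user who laughs repeatedly from dominating a bin. Other conditions, such as a ratio to the total number of viewers, would fit the same structure.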

In the information processing device 100 according to the embodiment, when the content is image content, the identification unit 133 identifies, as an emotion point, a page of the image content for which a numerical value based on the number of acting users satisfies the predetermined condition information.

As a result, because the information processing device 100 according to the embodiment, when the content is image content, identifies as an emotion point a page for which a numerical value based on the number of acting users satisfies the predetermined condition information, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

The information processing device 100 according to the embodiment further includes an editing unit 135 that edits the content based on the emotion points.

As a result, because the information processing device 100 according to the embodiment edits the content based on the emotion points, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the editing unit 135 extracts, from the content, partial content corresponding to the emotion points, and generates new content by combining the extracted partial content.

As a result, because the information processing device 100 according to the embodiment extracts partial content corresponding to the emotion points from the content and generates new content by combining the extracted partial content, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.
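
One way to picture the extraction step is the following sketch. It assumes that each emotion point is a time position in seconds and that the partial content is a fixed window around that position. The window lengths and the function name are hypothetical, and the embodiment does not state how the extent of a partial content is chosen.

```python
def clip_ranges(emotion_points, duration, before=5.0, after=5.0):
    """Return merged (start_sec, end_sec) ranges around each emotion point.

    The before/after window is an illustrative assumption.
    """
    # Build one window per emotion point, clamped to the content length.
    ranges = sorted(
        (max(0.0, t - before), min(duration, t + after)) for t in emotion_points
    )

    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of duplicating.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


segments = clip_ranges([10.0, 14.0, 58.0], duration=60.0)
```

Concatenating the returned ranges in order would yield a digest of the content. Merging overlapping windows avoids playing the same scene twice when two emotion points are close together.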

In the information processing device 100 according to the embodiment, the editing unit 135 extracts, for each of the contents, partial content corresponding to that content's emotion points, and generates new content by combining the extracted partial contents.

As a result, because the information processing device 100 according to the embodiment extracts partial content corresponding to the emotion points of each of the contents and generates new content by combining the extracted partial contents, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

The information processing device 100 according to the embodiment further includes a presentation unit 134 that presents information about the content based on the estimation results acquired by the receiving unit 131.

As a result, because the information processing device 100 according to the embodiment presents information about the content based on the acquired estimation results, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, when a user browses content, the presentation unit 134 presents information about the content based on the estimation results estimated for that content.

As a result, because the information processing device 100 according to the embodiment presents, when a user browses content, information about the content based on the estimation results estimated for that content, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, when the content is video content, the presentation unit 134 performs display control so that a graph showing the transition of the number of acting users is displayed in association with the time positions indicated by the seek bar displayed together with the content. Here, the number of acting users is the number of users who performed an emotion-expressing action in the content, and it changes according to the time position in the video content.

As a result, because the information processing device 100 according to the embodiment, when the content is video content, performs display control so that a graph showing the transition of the number of acting users over the time positions of the video content is displayed in association with the time positions indicated by the seek bar displayed together with the content, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.
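
The association between the graph and the seek bar can be sketched as a mapping from time positions to horizontal positions on the bar. The sketch assumes, hypothetically, that the seek bar is divided into `bar_width` equal columns and that the per-time counts of acting users are available as a dictionary. Neither the function name nor this data layout comes from the embodiment.

```python
def graph_columns(counts_by_time, duration, bar_width):
    """Return one graph height per seek-bar column.

    counts_by_time: {time_sec: acting_user_count} (hypothetical format).
    Each column takes the largest count among the time positions it covers.
    """
    heights = [0] * bar_width
    for time_sec, count in counts_by_time.items():
        # Same linear mapping a seek bar uses: time -> horizontal position.
        column = min(bar_width - 1, int(time_sec / duration * bar_width))
        heights[column] = max(heights[column], count)
    return heights


heights = graph_columns({0: 1, 30: 5, 59: 2}, duration=60, bar_width=6)
```

Because the graph and the seek bar share the same time-to-position mapping, a peak in the graph sits directly above the seek-bar position a user would drag to in order to reach that scene.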

In the information processing device 100 according to the embodiment, the presentation unit 134 presents contents in a ranking format based on ranking information in which the contents are ranked according to the number of acting users, that is, the number of users who performed an emotion-expressing action in each content.

As a result, because the information processing device 100 according to the embodiment presents contents in a ranking format based on ranking information in which the contents are ranked according to the number of acting users, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.
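
As a rough illustration of the ranking information, the following sketch orders contents by their number of acting users. The input format, the function name, and the tie-breaking rule (content ID in ascending order) are assumptions made for this example only.

```python
def rank_contents(acting_users_by_content):
    """Return (rank, content_id, acting_user_count), highest count first.

    acting_users_by_content: {content_id: acting_user_count} (hypothetical).
    """
    ordered = sorted(
        acting_users_by_content.items(),
        # Sort by count descending, then by content ID for a stable order.
        key=lambda item: (-item[1], item[0]),
    )
    return [
        (rank, content_id, count)
        for rank, (content_id, count) in enumerate(ordered, start=1)
    ]


ranking = rank_contents({"video_a": 3, "video_b": 10, "video_c": 3})
```

A separate ranking could be built per emotion type (for example, one for "laughter") by passing counts restricted to that emotion.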

In the information processing device 100 according to the embodiment, the presentation unit 134 recommends content suited to the user based on the estimation results acquired by the receiving unit 131.

As a result, because the information processing device 100 according to the embodiment recommends content suited to the user based on the acquired estimation results, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

In the information processing device 100 according to the embodiment, the presentation unit 134 recommends content suited to the user based on emotion points identified for that user by aggregating, for each user, the estimation results acquired by the receiving unit 131, the emotion points being points in the content at which the user's emotions changed.

As a result, because the information processing device 100 according to the embodiment recommends content suited to the user based on the emotion points identified for that user by aggregating the acquired estimation results for each user, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.

The terminal device 10 according to the embodiment includes an acquisition unit 14e, an estimation unit 14f, and a transmission unit 14g. The acquisition unit 14e acquires imaging information obtained by imaging, with the imaging means, a user who is viewing content. The estimation unit 14f estimates information about the user's emotions based on the user's facial expression indicated by the imaging information acquired by the acquisition unit 14e. The transmission unit 14g transmits the estimation result estimated by the estimation unit 14f to the information processing device 100.

As a result, because the terminal device 10 according to the embodiment transmits to the information processing device 100 the estimation result about the user's emotions estimated based on the user's facial expression indicated by the acquired imaging information, it can provide information meaningful to the user in response to changes in the user's emotions caused by viewing the content.
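
The flow through the acquisition unit 14e, the estimation unit 14f, and the transmission unit 14g can be sketched as below. Every name in the code is hypothetical: the camera, the expression estimator, and the network sender are injected as plain callables so that the sketch stays runnable without a real camera or model, and the fields of the result record are assumptions rather than a format defined in the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class EstimationResult:
    """Hypothetical record sent to the information processing device."""
    user_id: str
    content_id: str
    position: float  # time position (or page) in the content
    emotion: str


class TerminalPipeline:
    def __init__(
        self,
        capture: Callable[[], Any],                 # role of acquisition unit 14e
        estimate: Callable[[Any], str],             # role of estimation unit 14f
        send: Callable[[EstimationResult], None],   # role of transmission unit 14g
    ) -> None:
        self.capture = capture
        self.estimate = estimate
        self.send = send

    def step(self, user_id: str, content_id: str, position: float) -> EstimationResult:
        frame = self.capture()           # imaging information of the viewer
        emotion = self.estimate(frame)   # emotion estimated from the expression
        result = EstimationResult(user_id, content_id, position, emotion)
        self.send(result)                # only the estimation result is sent
        return result


sent = []
pipeline = TerminalPipeline(
    capture=lambda: "frame-bytes",       # stand-in for a camera frame
    estimate=lambda frame: "laughter",   # stand-in for an expression model
    send=sent.append,                    # stand-in for a network call
)
result = pipeline.step("u1", "video_a", 12.5)
```

In this arrangement the captured frame stays on the terminal and only the estimated emotion is handed to the sender, which mirrors the description of the transmission unit 14g transmitting the estimation result.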

Embodiments of the present application have been described above in detail with reference to the drawings. These are examples, and the present invention can be carried out in other forms with various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention section.

1 Information processing system
10 Terminal device
12 Display unit
13 Imaging unit
14 Control unit
14a Request unit
14b Consent information reception unit
14c Display control unit
14d Camera control unit
14e Acquisition unit
14f Estimation unit
14g Transmission unit
30 Content distribution device
100 Information processing device
120 Storage unit
121 Imaging information storage unit
122 Estimation information storage unit
123 Overall aggregation result storage unit
124 Emotion point storage unit
125 Performer information storage unit
130 Control unit
131 Receiving unit
132 Aggregation unit
133 Identification unit
134 Presentation unit
135 Editing unit

Claims

An information processing device comprising:
an acquisition unit that acquires information about the emotions of a user viewing content, the emotions being estimated based on the user's facial expression indicated by imaging information captured by imaging means of a terminal device displaying the content; and
an identification unit that identifies an emotion point, which is a point in the content at which the user's emotions changed, by aggregating the estimation results acquired by the acquisition unit.

The information processing device according to claim 1, wherein
the acquisition unit acquires information about the user's emotions estimated in real time based on the user's facial expression, and
the identification unit identifies the emotion point for each user in real time by aggregating, for each user, the estimation results acquired in real time by the acquisition unit.

The information processing device according to claim 2, wherein
the acquisition unit acquires attribute information about the user's attributes, and
the identification unit identifies the emotion point for each user based on the aggregation result obtained by aggregating, for each user, the estimation results acquired by the acquisition unit, and on the attribute information of each user.

The information processing device according to any one of claims 1 to 3, wherein
the identification unit identifies the emotion point by counting the number of acting users, which is the number of users who performed an emotion-expressing action in the content.

The information processing device according to claim 4, wherein
the identification unit identifies, as the emotion point, a point at which a numerical value based on the number of acting users satisfies predetermined condition information.

The information processing device according to claim 5, wherein
when the content is video content, the identification unit identifies, as the emotion point, a time position within the playback time of the video content at which the numerical value based on the number of acting users satisfies the predetermined condition information.

The information processing device according to claim 5 or 6, wherein
when the content is image content, the identification unit identifies, as the emotion point, a page of the image content for which the numerical value based on the number of acting users satisfies the predetermined condition information.

The information processing device according to any one of claims 1 to 7, further comprising an editing unit that edits the content based on the emotion point.

The information processing device according to claim 8, wherein
the editing unit extracts, from the content, partial content corresponding to the emotion point, and generates new content by combining the extracted partial content.

The information processing device according to claim 9, wherein
the editing unit extracts partial content corresponding to the emotion point of each of the contents, and generates new content by combining the extracted partial contents.

The information processing device according to any one of claims 1 to 10, further comprising a presentation unit that presents information about the content based on the estimation results acquired by the acquisition unit.

The information processing device according to claim 11, wherein
when the user browses the content, the presentation unit presents information about the content based on the estimation results estimated for the content.

The information processing device according to claim 12, wherein
when the content is video content, the presentation unit performs display control so that a graph showing the transition of the number of acting users, which is the number of users who performed an emotion-expressing action in the content and which changes according to the time position in the video content, is displayed in association with the time position indicated by a seek bar displayed together with the content.

The information processing device according to any one of claims 11 to 13, wherein
the presentation unit presents the content in a ranking format based on ranking information in which the content is ranked according to the number of acting users, which is the number of users who performed an emotion-expressing action in the content.

The information processing device according to any one of claims 11 to 14, wherein
the presentation unit recommends content suited to the user based on the estimation results acquired by the acquisition unit.

The information processing device according to claim 15, wherein
the presentation unit recommends content suited to the user based on an emotion point that is identified for the user by aggregating, for each user, the estimation results acquired by the acquisition unit, and that is a point in the content at which the user's emotions changed.

An information processing method executed by a computer, comprising:
an acquisition step of acquiring information about the emotions of a user viewing content, the emotions being estimated based on the user's facial expression indicated by imaging information captured by imaging means of a terminal device displaying the content; and
an identification step of identifying an emotion point, which is a point in the content at which the user's emotions changed, by aggregating the estimation results acquired in the acquisition step.

An information processing program causing a computer to execute:
an acquisition procedure of acquiring information about the emotions of a user viewing content, the emotions being estimated based on the user's facial expression indicated by imaging information captured by imaging means of a terminal device displaying the content; and
an identification procedure of identifying an emotion point, which is a point in the content at which the user's emotions changed, by aggregating the estimation results acquired in the acquisition procedure.

A terminal device comprising:
an acquisition unit that acquires imaging information obtained by imaging, with imaging means, a user who is viewing content;
an estimation unit that estimates information about the user's emotions based on the user's facial expression indicated by the imaging information acquired by the acquisition unit; and
a transmission unit that transmits the estimation result estimated by the estimation unit to an information processing device.