JP2011205353A

JP2011205353A - Viewing situation recognition device, and viewing situation recognition system

Info

Publication number: JP2011205353A
Application number: JP2010069994A
Authority: JP
Inventors: Noriyuki Hata; 紀行畑
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2010-03-25
Filing date: 2010-03-25
Publication date: 2011-10-13

Abstract

PROBLEM TO BE SOLVED: To allow a viewer to objectively experience his or her own reactions to a program or content being viewed. SOLUTION: A voice signal processing unit 12 of a viewing situation recognition device 1 acquires the voice of a viewer US by applying echo cancellation to the sound collection signal of a microphone MC. The voice signal processing unit 12 analyzes the voice of the viewer US and derives feature information from the characteristics of the voice. The voice signal processing unit 12 also acquires the broadcast voice signal corresponding to the feature information (the feature-time broadcast voice signal). A control unit 10 generates individual reaction data from the feature information and the feature-time broadcast voice signal and transmits the generated reaction data to a server device 3. The server device 3 generates analysis data on viewing situations from the individual reaction data of a plurality of viewers and transmits the generated analysis data to the control unit 10 of the viewing situation recognition device 1. Based on the analysis data, the control unit 10 displays an analysis result image on a display unit 20 and emits an additional sound emission signal from a speaker SP.

Description

The present invention relates to a viewing situation recognition device and a viewing situation recognition system that recognize a viewing situation, such as a viewer's reactions.

Various devices that recognize a viewer's viewing situation have been devised. For example, Patent Document 1 acquires and records the viewing situation of a viewer according to remote control operations. Patent Document 2 acquires the viewing situation of each viewer and detects whether other viewers are watching the same program. If viewers watching the same program are detected, their voices are combined and emitted to each of those viewers.

Patent Document 1: JP 2001-275057 A
Patent Document 2: JP 2007-134808 A

However, although the technique described in Patent Document 1 can acquire the situation of each viewer individually, it cannot use the viewing situations of other viewers to provide additional services.

Patent Document 2 offers such an additional service by giving multiple viewers of the same program the simulated experience of watching it together. However, the viewer merely hears the voices of others watching the same program and cannot experience how his or her own reaction to the program differs from the reactions of other viewers.

Accordingly, an object of the present invention is to provide a viewing situation recognition device and a viewing situation recognition system that let a viewer objectively experience his or her own reactions to the program or content being viewed.

The present invention relates to a viewing situation recognition device. The viewing situation recognition device includes an audio signal processing unit, a control unit, a display unit, and a sound emitting unit. The audio signal processing unit generates individual reaction data indicating the viewer's reaction to the viewing target medium, based on the viewer's voice collected by a sound collecting unit and on the audio of the viewing target medium. The control unit acquires analysis data obtained from the individual reaction data and generates an analysis result image and an analysis result sound based on the analysis data. The display unit displays the analysis result image. The sound emitting unit emits the analysis result sound.

In this configuration, the viewer's individual reaction is analyzed from audio alone, and an objective positioning of the viewer's reaction to the viewing target medium is obtained. The viewer can then be made to experience this analysis result.

In the viewing situation recognition device of the present invention, the audio signal processing unit acquires the level and the voice pattern of the viewer's voice and identifies the viewer's reaction from that voice level and voice pattern. When the audio signal processing unit detects a specific reaction, it generates individual reaction data that includes the detection information of that reaction and information on the audio of the viewing target medium corresponding to the time of detection.

This configuration gives a concrete method for generating the individual reaction data. Taking laughter as an example of a viewer reaction, the voice level becomes relatively high and the voice pattern becomes characteristic and distinct from conversation. By acquiring the voice level and voice pattern, the reaction can therefore be identified from these features. Attaching the audio of the viewing target medium at the time such a feature appears makes it possible to associate the characteristic reaction with the point in the viewing target medium at which it occurred.

In the viewing situation recognition device of the present invention, the audio signal processing unit acquires the viewer's voice by applying, to the sound collection signal of the sound collecting unit, echo cancellation processing based on the analysis result sound signal and the audio of the viewing target medium.

In this configuration, the echo cancellation processing allows the viewer's voice to be acquired more accurately, even while the audio of the viewing target medium or the analysis result sound based on the analysis result is being emitted. More accurate individual reaction data can therefore be generated.

The present invention also relates to a viewing situation recognition system that includes the viewing situation recognition device described above. The viewing situation recognition system consists of the viewing situation recognition device and a server. The server includes a communication unit that receives individual reaction data from the control unit of the viewing situation recognition device and transmits analysis data to the control unit. The server also includes a statistical information analysis unit that generates the analysis data by performing statistical processing on a plurality of individual reaction data for the same viewing target medium.

This configuration describes a viewing situation recognition system that includes the viewing situation recognition device described above, and describes how the server generates the analysis data. Statistically processing a plurality of individual reaction data in this way yields analysis data that shows where each viewer's individual reaction data stands objectively.

In the viewing situation recognition system of the present invention, the server includes a recording medium on which the analysis data is updated and recorded. The statistical information analysis unit of the server executes the statistical processing with the updated and recorded analysis data included.

For example, when a viewing target medium is played back from a content server, multiple viewers do not necessarily watch it at the same time. To handle such cases, each time individual reaction data is input, the analysis data is read out, updated, and recorded again. As a result, analysis data that shows where each viewer's individual reaction data stands objectively is obtained even when the viewers do not watch at the same time.
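
The read-update-record cycle described above can be sketched as follows. This is only an illustrative sketch: the `LaughStats` record, the `store` dictionary standing in for the recording medium, and the `on_reaction` function are hypothetical names that do not appear in the source, and the choice of stored quantities is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LaughStats:
    """Hypothetical per-program analysis record kept on the recording medium."""
    viewers: int = 0                 # individual reaction records received so far
    laughing_viewers: int = 0        # records that contained a laughter feature
    total_laugh_seconds: float = 0.0

    def update(self, laughed: bool, laugh_seconds: float) -> None:
        """Fold one newly received individual reaction record into the stats."""
        self.viewers += 1
        if laughed:
            self.laughing_viewers += 1
            self.total_laugh_seconds += laugh_seconds

    @property
    def laugh_ratio(self) -> float:
        """Share of viewers so far whose record contained laughter."""
        return self.laughing_viewers / self.viewers if self.viewers else 0.0


# Stand-in for the server's recording medium (assumption: keyed by program).
store: dict[str, LaughStats] = {}


def on_reaction(program_id: str, laughed: bool, laugh_seconds: float) -> LaughStats:
    stats = store.setdefault(program_id, LaughStats())  # read out (or create)
    stats.update(laughed, laugh_seconds)                # update
    store[program_id] = stats                           # record again
    return stats
```

Because only running totals are stored, a record that arrives hours after the first viewer's record is folded in the same way as one that arrives simultaneously.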

According to the present invention, a viewer can do more than simply watch a program or content: the viewer can objectively experience his or her own reactions while watching it.

FIG. 1 is a diagram showing the viewing situation recognition device according to the first embodiment and the viewer-side system configuration connected to it.
FIG. 2 is a block diagram showing the configuration of the audio signal processing unit 12 shown in FIG. 1.
FIG. 3 is a diagram showing the configuration of the server in the viewing situation recognition system according to the first embodiment.
FIG. 4 is a flowchart showing the viewing situation recognition flow according to the first embodiment.
FIG. 5 is a diagram for explaining specific viewing situations.
FIG. 6 is a diagram showing the viewing situation recognition device according to the second embodiment and the viewer-side system configuration connected to it.
FIG. 7 is a diagram showing the configuration of the server in the viewing situation recognition system according to the second embodiment.
FIG. 8 is a flowchart showing the viewing situation recognition flow according to the second embodiment.

A viewing situation recognition device according to a first embodiment of the present invention, and a viewing situation recognition system including that device, will be described with reference to the drawings. FIG. 1 is a block diagram showing the viewer-side configuration including the viewing situation recognition device 1 of this embodiment. FIG. 2 is a block diagram showing the specific configuration of the audio signal processing unit 12 shown in FIG. 1. FIG. 3 is a configuration diagram of the server device 3 in the viewing situation recognition system of this embodiment.

In the viewing situation recognition system, the configuration on the viewer US side, which includes the viewing situation recognition device 1 as shown in FIG. 1, and the server device 3 shown in FIG. 3 are connected by a network 900. A plurality of viewer-side configurations (viewing situation recognition devices 1) are connected to the server device 3 via the network 900.

As shown in FIG. 1, the viewing situation recognition device 1 and a television device 2 are installed on the viewer US side. The television device 2 receives a television broadcast signal (terrestrial digital broadcasting is used as the example here), displays the broadcast video on its screen, and emits the broadcast audio from its built-in speaker. The television device 2 also outputs the broadcast audio signal to the viewing situation recognition device 1.

The viewing situation recognition device 1 includes a control unit 10, a sound collection control unit 11, an audio signal processing unit 12, a sound emission control unit 13, and a display unit 20.

The control unit 10 performs overall control of the viewing situation recognition device 1 and also controls communication with the server device 3 via the network 900. The control unit 10 generates individual reaction data based on the analysis result of the collected sound produced by the audio signal processing unit 12 and transmits it to the server device 3. The control unit 10 generates an analysis result image based on the analysis data received from the server device 3 and outputs it to the display unit 20. The control unit 10 also generates, based on the analysis data, an additional sound emission signal corresponding to the "analysis result sound" and outputs it to the audio signal processing unit 12. In outline, the analysis data is data obtained by statistical processing of the plurality of individual reaction data aggregated at the server device 3. The specific method of producing the analysis data is explained later, together with the description of the server device 3.

The control unit 10 also periodically transmits, for a predetermined length of time, the broadcast audio signal acquired from the television device 2 to the server device 3.

A microphone array MCA, in which a plurality of microphones MC are arranged in a predetermined pattern, is connected to the sound collection control unit 11. The sound collection control unit 11 generates a sound collection directivity signal having a predetermined sound collection directivity from the sound collection signals of the individual microphones MC. Here, for example, if the unit has a function that forms a plurality of sound collection directivity signals and detects the speaker direction from them, it can form a sound collection directivity signal whose direction of maximum sensitivity is the speaker direction. If such directivity control is not performed, there is no need to use a microphone array MCA made up of a plurality of microphones MC, and a single microphone may be used.
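
As an illustration of forming several sound collection directivity signals and keeping the one aimed at the speaker, the sketch below uses delay-and-sum beamforming with integer sample delays. The source does not specify the beamforming method, so delay-and-sum, the energy-based selection rule, and the names `delay_and_sum` and `pick_direction` are all assumptions made for this example.

```python
def delay_and_sum(channels, delays):
    """Average the microphone channels after delaying channel k by delays[k] samples."""
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            if 0 <= t - d < n:       # samples outside the buffer count as silence
                acc += ch[t - d]
        out.append(acc / len(channels))
    return out


def pick_direction(channels, candidates):
    """candidates maps a direction label to per-channel delays.

    Returns (label, beam) for the candidate whose beam carries the most energy,
    which is taken here as the estimate of the speaker direction.
    """
    beams = {label: delay_and_sum(channels, d) for label, d in candidates.items()}
    label = max(beams, key=lambda k: sum(v * v for v in beams[k]))
    return label, beams[label]
```

When the delays match the true arrival-time differences, the channels add coherently and the beam energy peaks, which is why the highest-energy candidate is used as the direction estimate.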

A speaker array SPA, in which a plurality of speakers SP are arranged in a predetermined pattern, is connected to the sound emission control unit 13. The sound emission control unit 13 receives the additional sound emission signal from the control unit 10, generates a sound emission drive signal for each speaker SP so that the additional sound emission signal is emitted with a predetermined sound emission directivity, and outputs the drive signals to the respective speakers SP.

The audio signal processing unit 12 includes a first echo cancellation unit 121, a second echo cancellation unit 122, and a collected sound analysis unit 123.

The first echo cancellation unit 121 sets a first adaptive parameter for the echo path formed by sound emission from the speaker array SPA and sound collection by the microphone array MCA. The first echo cancellation unit 121 generates a first pseudo echo signal by multiplying the additional sound emission signal by the first adaptive parameter, and subtracts the first pseudo echo signal from the sound collection directivity signal supplied by the sound collection control unit 11. This suppresses the echo component of the additional sound emission signal contained in the sound collection directivity signal.

The second echo cancellation unit 122 sets a second adaptive parameter for the echo path formed by sound emission from the built-in speaker of the television device 2 and sound collection by the microphone array MCA. The second echo cancellation unit 122 generates a second pseudo echo signal by multiplying the broadcast audio signal by the second adaptive parameter, and subtracts the second pseudo echo signal from the sound collection directivity signal output by the first echo cancellation unit 121 after the first stage of echo cancellation. This suppresses the echo component of the broadcast audio signal contained in the sound collection directivity signal after the first stage of echo cancellation.

Performing the echo cancellation processing described above suppresses the echo components of both the additional sound emission signal and the broadcast audio signal in the sound collection directivity signal. As a result, an audio signal consisting only of the sounds uttered by the viewer US is input to the collected sound analysis unit 123.
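
The two cascaded stages can be sketched as below. The source only states that each stage forms a pseudo echo from a reference signal and an adaptive parameter and subtracts it; the normalized LMS (NLMS) adaptive FIR filter used here, the tap count, the step size `mu`, and the function names are assumptions chosen for illustration, not the method of the source.

```python
def nlms_cancel(mic, ref, taps=8, mu=0.1, eps=1e-8):
    """Subtract an adaptively estimated echo of `ref` from `mic` (NLMS FIR filter)."""
    w = [0.0] * taps      # adaptive parameters: running estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        pseudo_echo = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - pseudo_echo                      # residual after subtraction
        norm = sum(b * b for b in buf) + eps     # input power for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


def two_stage_cancel(mic, additional_ref, broadcast_ref):
    """Stage 1 removes the additional sound emission echo, stage 2 the broadcast echo."""
    stage1 = nlms_cancel(mic, additional_ref)    # role of first echo cancellation unit 121
    return nlms_cancel(stage1, broadcast_ref)    # role of second echo cancellation unit 122
```

Each stage treats the other source's echo and the viewer's voice as uncorrelated disturbance, so what remains after the second stage approximates the viewer's voice alone.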

Based on the audio signal after the echo cancellation described above, the collected sound analysis unit 123 detects features of that audio signal. It outputs to the control unit 10, as the analysis result, the feature information and the broadcast audio signal corresponding to the time at which the feature was detected (referred to as the feature-time broadcast audio signal).

The specific method of detecting the feature information is as follows.

The collected sound analysis unit 123 stores in advance the time-axis and frequency-axis characteristics of various characteristic utterances (for example, laughter). These characteristics may be stored as general characteristics when the viewing situation recognition device 1 is manufactured. Alternatively, at the start of use, the viewer US may have the viewing situation recognition device 1 pick up his or her own characteristic utterances, and the resulting characteristics may be stored.

The collected sound analysis unit 123 analyzes the level, the change in level over time, and the frequency characteristics of the input audio signal after echo cancellation. Not all of these characteristics need to be acquired; it is sufficient to obtain at least one. However, the more characteristics are analyzed, the more accurate the feature analysis result becomes.

Using this information, the collected sound analysis unit 123 analyzes the features of the input audio signal after echo cancellation by comparing them with the stored characteristics. For laughter, for example, the level on the time axis is higher overall than the normal voice level, and high-level and low-level intervals alternate at short intervals. On the frequency axis, the frequency of the laughter is significantly higher than other frequencies. Because such characteristics are stored for each feature, the collected sound analysis unit 123 compares the characteristics of each feature with the analyzed characteristics of the input audio signal after echo cancellation and detects the one that matches. For example, if the analyzed characteristics of the audio signal after echo cancellation match the characteristics of laughter to a degree at or above a predetermined threshold, the audio signal is determined to be laughter, and feature information indicating laughter is output.
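
A minimal sketch of the time-axis part of this test is given below. It is a simplified heuristic rather than the stored-characteristic comparison of the source: the frame length, the equal weighting of loudness and burstiness, the default threshold, and the names `frame_levels`, `laughter_score`, and `is_laughter` are all assumptions, and the frequency-axis comparison is omitted.

```python
import math


def frame_levels(samples, frame=160):
    """RMS level of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame]) / frame)
        for i in range(0, len(samples) - frame + 1, frame)
    ]


def laughter_score(samples, normal_level, frame=160):
    """Score in [0, 1] combining overall loudness with rapid high/low alternation."""
    levels = frame_levels(samples, frame)
    if len(levels) < 2:
        return 0.0
    mean_level = sum(levels) / len(levels)
    # Laughter is louder overall than the normal voice level.
    loud = 1.0 if mean_level > normal_level else 0.0
    # Laughter alternates between high-level and low-level intervals.
    high = [lv > mean_level for lv in levels]
    flips = sum(a != b for a, b in zip(high, high[1:]))
    burstiness = flips / (len(levels) - 1)
    return 0.5 * loud + 0.5 * burstiness


def is_laughter(samples, normal_level, threshold=0.7):
    """Declare laughter when the score reaches the (assumed) threshold."""
    return laughter_score(samples, normal_level) >= threshold
```

In practice the score would be compared against characteristics stored per feature, and adding a frequency-axis term would follow the source's remark that analyzing more characteristics gives a more accurate result.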

The display unit 20 consists of a liquid crystal display or the like and displays the analysis result image.

The server device 3 includes a communication unit 31, a broadcast signal receiving unit 32, and a statistical information analysis unit 33. The communication unit 31 receives individual reaction data from the control unit 10 of each viewing situation recognition device 1 via the network 900, and transmits analysis data to the control unit 10 of each viewing situation recognition device 1 via the network 900. The broadcast signal receiving unit 32 receives broadcast signals and outputs the broadcast audio signals to the statistical information analysis unit 33. The broadcast signal receiving unit 32 receives all receivable broadcast signals and outputs them to the statistical information analysis unit 33.

The statistical information analysis unit 33 generates the analysis data by referring to the broadcast audio signals and performing statistical processing on the feature information from the plurality of individual reaction data. Specifically, the statistical information analysis unit 33 extracts the feature-time broadcast audio signal from each individual reaction data. This feature-time broadcast audio signal serves as synchronization information. That is, the statistical information analysis unit 33 associates with one another the individual reaction data that have the same feature-time broadcast audio signal.

The statistical information analysis unit 33 compares each feature-time broadcast audio signal with the broadcast audio signals received by the broadcast signal receiving unit 32 and detects the channel and program corresponding to each feature-time broadcast audio signal.
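
One way to picture this comparison is a sliding normalized correlation between the short feature-time snippet and each channel's received audio. The source does not say how the comparison is performed, so the correlation measure, the `min_score` threshold, and the names `best_match` and `identify_channel` are assumptions for illustration.

```python
import math


def best_match(snippet, stream):
    """Highest normalized correlation of `snippet` against any window of `stream`."""
    n = len(snippet)
    s_norm = math.sqrt(sum(v * v for v in snippet))
    best = 0.0
    for off in range(len(stream) - n + 1):
        win = stream[off:off + n]
        w_norm = math.sqrt(sum(v * v for v in win))
        if s_norm == 0.0 or w_norm == 0.0:
            continue  # silence carries no information to match on
        corr = sum(a * b for a, b in zip(snippet, win)) / (s_norm * w_norm)
        best = max(best, corr)
    return best


def identify_channel(snippet, channels, min_score=0.9):
    """Return the name of the channel whose audio best matches, or None."""
    scored = {name: best_match(snippet, audio) for name, audio in channels.items()}
    name = max(scored, key=scored.get, default=None)
    return name if name is not None and scored[name] >= min_score else None
```

Individual reaction data whose snippets resolve to the same channel and the same offset can then be treated as reactions to the same moment of the same program.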

The statistical information analysis unit 33 reads the feature information from the associated individual reaction data and classifies it. The statistical information analysis unit 33 also obtains, from the associated individual reaction data, the frequency of occurrence of each kind of feature information.

This processing yields analysis data on the viewing situation such as the following.

Because the number of individual reaction data that contain laughter feature information can be obtained for a given channel and program, the general laughter degree for that program (the proportion of viewers who are laughing) is obtained. From the duration of the laughter feature information in each individual reaction data, the individual laughter degree of each viewer (how much that viewer is laughing) is obtained. Since the individual laughter degree of each viewer is available, a comparison between the average individual laughter degree of the other viewers and the viewer's own individual laughter degree is also obtained. In addition, the audience rating of the program is obtained from the frequency of feature-time broadcast audio signals for the same time.
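
The quantities listed above can be computed along the following lines. This is a sketch under stated assumptions: the input is taken to be laughter duration in seconds per viewer for one program, `total_viewers` is assumed to be known separately (for example from the broadcast audio signals the devices send), and the function name `analyze` and the result layout are hypothetical.

```python
from statistics import mean


def analyze(reactions, total_viewers):
    """Compute the general and individual laughter degrees for one program.

    reactions: {viewer_id: laugh_seconds}; a value of 0 means no laughter detected.
    total_viewers: number of viewers watching the program (assumed known).
    """
    laughing = [v for v, secs in reactions.items() if secs > 0]
    # General laughter degree: proportion of viewers who are laughing.
    general = len(laughing) / total_viewers if total_viewers else 0.0

    per_viewer = {}
    for viewer, secs in reactions.items():
        others = [s for v, s in reactions.items() if v != viewer]
        per_viewer[viewer] = {
            "individual": secs,                             # own laughter degree
            "others_mean": mean(others) if others else 0.0, # average of the others
        }
    return general, per_viewer
```

For example, with `reactions = {"a": 6.0, "b": 2.0, "c": 0.0}` and `total_viewers = 4`, half of the audience is laughing, and viewer `a` laughs for 6 seconds against an average of 1 second among the others.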

The statistical information analysis unit 33 generates, as the analysis data, the various analysis results on the viewing situation of each viewer and of the general public that are obtained by statistically processing the information contained in these individual reaction data, and outputs the analysis data to the communication unit 31.

The analysis data is transmitted to the viewing situation recognition device 1 via the communication unit 31 and the network 900. As described above, the viewing situation recognition device 1 displays an analysis result image based on the analysis data on the display unit 20 and emits additional sound based on the analysis data from the speaker array SPA. The viewer US can thus watch while objectively grasping the viewing situation.

Next, the processing of the viewing situation recognition device 1 and the server device 3 described above will be explained with reference to a flowchart. FIG. 4 is a flowchart showing the viewing situation recognition flow of this embodiment.

First, when the viewer US starts viewing, the viewing situation recognition device 1 acquires the broadcast audio signal and transmits a portion of it, of a predetermined length of time, to the server device 3 (S101). If the television broadcast is a digital broadcast, the signal is sent to the server device 3 in units of, for example, one transport stream packet (TSP).

On receiving the broadcast audio signal (S201), the server device 3 identifies the channel and program being viewed (S202).

The viewing situation recognition device 1 acquires the voice of the viewer US through echo cancellation processing and performs the feature analysis described above (S102). The viewing situation recognition device 1 generates individual reaction data from the feature analysis result and the feature-time broadcast audio signal and transmits it to the server device 3 (S103).

On receiving the individual reaction data (S203), the server device 3 generates analysis data on the viewing situation as described above, using that individual reaction data together with the individual reaction data received at substantially the same time from other viewing situation recognition devices (S204). The server device 3 transmits the analysis data to the viewing situation recognition device 1 (S205).

On receiving the analysis data (S104), the viewing situation recognition device 1 displays an analysis result image on the display unit 20 based on the analysis data and emits the additional sound emission signal from the speakers SP (S105).

The viewing situation recognition device 1 repeats the above processing at predetermined intervals until viewing ends (S106: No), and terminates the series of processes when viewing ends (S106: Yes).

With the configuration and processing described above, viewing as shown in FIG. 5 becomes possible. FIG. 5 is a diagram for explaining specific viewing situations. FIG. 5(A) shows the case where the laughter degree of the general public is lower than that of the viewer US, and FIG. 5(B) shows the case where the laughter degree of the general public and that of the viewer US are both high and roughly equal.

When the viewer US is laughing a great deal, as in FIG. 5(A), individual reaction data corresponding to that high laughter degree is transmitted from the viewing situation recognition device 1 to the server device 3. The server device 3 generates analysis data using the individual reaction data from many other viewers as well. If most of the other viewers are not laughing much, the analysis data indicates a low laughter degree. This analysis data is transmitted to the viewing situation recognition device 1, which displays an analysis result image and emits an additional sound emission signal based on the low laughter degree. As shown in FIG. 5(A), the screen of the display unit 20 shows the laughter degree of the viewer US alongside the laughter degree of the general public derived from the analysis data. The viewer US can thus see how his or her own sensibility differs from that of the public, and can also learn how the public rates the program being viewed. If the audience rating is displayed at the same time, as shown in FIG. 5(A), the viewer can learn the public's evaluation of the program from yet another perspective. Furthermore, if quiet laughter is emitted as the additional sound emission signal, as shown in FIG. 5(A), the viewer can experience the difference by ear as well.

In the case of FIG. 5(B) as well, the viewer US is laughing heavily, so individual reaction data corresponding to that high laughter level is transmitted from the viewing situation recognition device 1 to the server device 3. The server device 3 generates analysis data using this data together with individual reaction data from many other viewers. Because many other viewers are also laughing heavily, the resulting analysis data indicates a high laughter level. This analysis data is transmitted to the viewing situation recognition device 1, which displays an analysis result image and emits an additional sound emission signal based on the high-laughter analysis data. Here too, as shown in FIG. 5(B), the screen of the display unit 20 shows both the viewer US's own laughter level and the general public's laughter level derived from the analysis data. The viewer US can thus see that the public is laughing in the same way, which fosters a sense of closeness among viewers and toward the program. Furthermore, as shown in FIG. 5(B), if loud laughter is emitted by means of the additional sound emission signal, the viewer can also experience this closeness to other viewers and to the program aurally.
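The two cases of FIG. 5 can be illustrated with a minimal sketch. It assumes, purely for illustration, that laughter levels are normalized to the range 0 to 1, that the volume of the additional laughter simply follows the public's level, and that a fixed tolerance decides whether the two levels "substantially match". The names `Feedback`, `build_feedback`, and `tolerance` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    own_level: float      # viewer's own laughter level, shown on the display unit
    public_level: float   # public laughter level taken from the analysis data
    laugh_volume: float   # gain of the additional laughter sound (assumed 0..1)
    in_sync: bool         # True when both levels substantially match


def build_feedback(own_level: float, public_level: float,
                   tolerance: float = 0.2) -> Feedback:
    """Derive what to display and how loudly to emit laughter.

    Assumption: both levels are clamped to [0, 1] and the emitted laughter
    volume tracks the public level (quiet in FIG. 5(A), loud in FIG. 5(B)).
    """
    own = min(max(own_level, 0.0), 1.0)
    public = min(max(public_level, 0.0), 1.0)
    return Feedback(
        own_level=own,
        public_level=public,
        laugh_volume=public,
        in_sync=abs(own - public) <= tolerance,
    )
```

With a high own level and a low public level, the sketch yields a quiet laughter volume and `in_sync == False`, which corresponds to FIG. 5(A); with both levels high it yields a loud volume and `in_sync == True`, which corresponds to FIG. 5(B).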

The laughter emitted as described above may be laughter stored in advance in the viewing situation recognition device 1. Alternatively, if the individual reaction data includes the audio signal in which the feature was detected, the laughter of other viewers can be emitted based on that audio signal. In this way the viewer obtains not only an understanding of his or her own viewing situation but also a greater sense of presence.

Next, a viewing situation recognition system according to a second embodiment will be described with reference to the drawings. FIG. 6 shows the viewing situation recognition device according to the second embodiment and the viewer-side system configuration connected to it. FIG. 7 shows the configuration of the server in the viewing situation recognition system according to the second embodiment.

The first embodiment described the case where a broadcast signal is received in real time. This embodiment describes the case of viewing content recorded on a medium such as a DVD or content streamed over a network. The following description takes streamed content as the example. Only the points that differ from the first embodiment are described; description of identical configurations and identical processing is omitted.

As shown in FIG. 6, in the viewer US side configuration of this embodiment, a content reproduction device 4 is installed together with the viewing situation recognition device 1 and the television device 2. The content reproduction device 4 acquires content data via the network 900. The content reproduction device 4 decodes the content data to generate a content video signal and a content audio signal, and supplies them to the external input terminal of the television device 2. The content reproduction device 4 also outputs the content data and the content audio signal to the viewing situation recognition device 1.

The viewing situation recognition device 1 transmits part of the content data to the server device 3A in place of the broadcast audio signal from the television device 2. The viewing situation recognition device 1 also generates individual reaction data based on the content audio signal in place of the broadcast audio signal, and transmits it to the server device 3A.

As shown in FIG. 7, the server device 3A includes a communication unit 31, a statistical information analysis unit 33A, a source information analysis unit 34, and a recording medium 35. The communication unit 31 exchanges individual reaction data, content data, and analysis data with each viewing situation recognition device 1 via the network 900.

The source information analysis unit 34 decodes the content data received from the communication unit 31 and compares it with the content parameters recorded in the content parameter recording unit 351 of the recording medium 35. Content parameters are information such as the type and substance of the content made up of the content data, and the time position within the content to which the acquired content data belongs.

When the source information analysis unit 34 detects the content parameter that matches the received content data, it performs control so that the analysis data recorded for that content parameter in the statistical analysis data recording unit 352 of the recording medium 35 is output to the statistical information analysis unit 33A.

The statistical analysis data recording unit 352 holds analysis data for each content parameter and outputs the analysis data to the statistical information analysis unit 33A in response to the output control from the source information analysis unit 34.

Based on the individual reaction data received via the communication unit 31 and the analysis data from the statistical analysis data recording unit 352, the statistical information analysis unit 33A updates and generates analysis data on the viewing situation in the same manner as in the first embodiment. As a result, analysis data for a given piece of content can be obtained even when multiple viewers view that content at different times.
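The update described above can be sketched as follows. The specification does not state which statistic is used, so this sketch assumes a running mean of laughter levels, keyed by a hypothetical content identifier and time segment that stand in for the content parameter. The names `AnalysisData` and `update_analysis` and the dictionary used as the recording unit are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class AnalysisData:
    """Stored analysis data for one content parameter (assumed form)."""
    count: int = 0            # number of reactions folded in so far
    mean_level: float = 0.0   # running mean of the laughter level

    def update(self, levels):
        # Incremental mean: earlier viewers' reactions stay in the result
        # without having to keep every individual reaction.
        for level in levels:
            self.count += 1
            self.mean_level += (level - self.mean_level) / self.count
        return self


def update_analysis(store: dict, content_id: str, segment: int, levels):
    """Read stored data for the content parameter, update it, keep it stored.

    `store` plays the role of the statistical analysis data recording unit;
    the (content_id, segment) key is a hypothetical content parameter.
    """
    data = store.setdefault((content_id, segment), AnalysisData())
    return data.update(levels)
```

Because the stored value carries its own sample count, reactions received on different days for the same segment are merged with the same weight as reactions received together, which is one way to obtain the behavior described for viewers who watch at different times.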

As in the first embodiment, the analysis data is transmitted to the viewing situation recognition device 1 via the communication unit 31 and the network 900. As described above, the viewing situation recognition device 1 displays an analysis result image based on the analysis data on the display unit 20, generates an additional sound emission signal based on the analysis data, and emits it from the speaker array SPA.

Next, the processing of the viewing situation recognition device 1 and the server device 3A described above will be explained with reference to a flowchart. FIG. 8 is a flowchart showing the viewing situation recognition flow of this embodiment.

First, when the viewer US plays back content and starts viewing, the viewing situation recognition device 1 acquires content data and transmits it to the server device 3 (S301).

Upon receiving the content data (S401), the server device 3A compares it with the content parameters stored in advance and identifies the content (S402).

The viewing situation recognition device 1 acquires the voice of the viewer US through echo cancellation processing and performs feature analysis as described above (S302). The viewing situation recognition device 1 generates individual reaction data from the feature analysis result and the content audio signal, and transmits it to the server device 3.

Upon receiving the individual reaction data (S403), the server device 3A generates analysis data on the viewing situation as described above, using that individual reaction data together with individual reaction data received at substantially the same time from other viewing situation recognition devices. In doing so, the server device 3A reads the analysis data recorded on the recording medium 35 (S404) and updates and generates the analysis data using the recorded analysis data as a base (S405). The server device 3 transmits the analysis data to the viewing situation recognition device 1 (S406) and records the updated analysis data (S407).

Upon receiving the analysis data (S304), the viewing situation recognition device 1 displays an analysis result image on the display unit 20 and emits an additional sound emission signal from the speaker SP, both based on the analysis data (S305).

The viewing situation recognition device 1 repeats the above processing at predetermined intervals until viewing ends (S306: No), and terminates the series of processing when viewing ends (S306: Yes).
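The device-side flow of FIG. 8 can be outlined in a short sketch. Network transport, echo cancellation, and feature analysis are outside its scope, so they are replaced by caller-supplied functions and precomputed samples. The function name `run_session`, its parameters, and the dictionary layout of the reaction data are all hypothetical.

```python
from typing import Callable, Iterable, Tuple


def run_session(
    samples: Iterable[Tuple[float, str]],
    send_content: Callable[[], None],
    exchange: Callable[[dict], dict],
    present: Callable[[dict], None],
) -> int:
    """Run one viewing session and return the number of completed rounds.

    samples      : one (laughter level, content audio tag) pair per interval;
                   stands in for the result of feature analysis (S302)
    send_content : transmits content data to the server (S301)
    exchange     : sends individual reaction data and returns the analysis
                   data received in reply (S304)
    present      : displays the result image and emits the sound (S305)
    """
    send_content()                      # S301: once, when viewing starts
    rounds = 0
    for level, audio_tag in samples:    # exhausting samples models S306: Yes
        reaction = {"level": level, "content_audio": audio_tag}
        analysis = exchange(reaction)   # reply from the server
        present(analysis)               # S305
        rounds += 1
    return rounds
```

Passing the transport and presentation steps in as functions keeps the loop itself independent of how the server is reached, so the same outline applies to a broadcast source or a streamed one.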

As described above, with the configuration and processing of this embodiment, even when multiple viewers view a single piece of content separately and at different times, each viewer can watch while objectively grasping both the other viewers' and his or her own viewing situation.

The above description used television broadcasts and video content as examples, but the configuration and processing described above can also be applied to radio broadcasts and audio content.

The above description also showed an example in which individual reactions are detected from sound, but by installing a camera or the like, individual reactions may be detected from video in addition to sound.

1: viewing situation recognition device, 2: television device, 3, 3A: server device, 4: content reproduction device, 10: control unit, 11: sound collection control unit, 12: audio signal processing unit, 121: first echo cancellation unit, 122: second echo cancellation unit, 123: collected sound analysis unit, 13: sound emission control unit, 20: display unit, 31: communication unit, 32: broadcast signal reception unit, 33, 33A: statistical information analysis unit, 34: source information analysis unit, 35: recording medium, 351: content parameter recording unit, 352: statistical analysis data recording unit, 900: network, MC: microphone, MCA: microphone array, SP: speaker, SPA: speaker array

Claims

A viewing situation recognition device comprising:
an audio signal processing unit that generates individual reaction data indicating a viewer's reaction to a viewing target medium, based on the voice of the viewer collected by a sound collecting unit and the sound of the viewing target medium;
a control unit that acquires analysis data obtained based on the individual reaction data and generates an analysis result image and an analysis result sound based on the analysis data;
a display unit that displays the analysis result image; and
a sound emitting unit that emits the analysis result sound.

The viewing situation recognition device according to claim 1,
wherein the audio signal processing unit
acquires the voice level and voice pattern of the viewer and identifies the viewer's reaction from the voice level and voice pattern, and,
when a specific reaction is detected, generates the individual reaction data including detection information of the reaction and audio information of the viewing target medium corresponding to the time of detection.

The viewing situation recognition device according to claim 1 or 2,
wherein the audio signal processing unit
acquires the voice of the viewer by applying, to the sound collection signal of the sound collecting unit, echo cancellation processing based on the analysis result sound signal and the sound of the viewing target medium.

A viewing situation recognition system comprising:
the viewing situation recognition device according to any one of claims 1 to 3; and
a server having a communication unit that receives the individual reaction data from the control unit of the viewing situation recognition device and transmits the analysis data to the control unit, and a statistical information analysis unit that generates the analysis data by performing statistical processing based on a plurality of pieces of individual reaction data for the same viewing target medium.

The viewing situation recognition system according to claim 4,
wherein the server includes a recording medium on which the analysis data is updated and recorded, and
the statistical information analysis unit executes the statistical processing including the updated and recorded analysis data.