JP5381434B2 - Content processing device

Info

Publication number: JP5381434B2
Application number: JP2009164815A
Authority: JP
Inventors: 尚司谷内田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-07-13
Filing date: 2009-07-13
Publication date: 2014-01-08
Anticipated expiration: 2029-07-13
Also published as: JP2011023811A

Description

The present invention relates to a content processing device, a content processing method, a content processing program, and a content distribution system that realize an audio-added content distribution service, which distributes content together with user comments on that content.

With the spread of broadband networks such as xDSL (Digital Subscriber Line) and FTTH (Fiber To The Home), content distribution services over IP (Internet Protocol) networks have become widespread. Until now, audio information other than the regular program audio, such as multilingual audio and sub-audio, has been provided by content producers.

Recently, as disclosed in Patent Document 1, a content sharing service has been proposed in which a viewer creates content with information such as text added to the video screen and views that content simultaneously with other viewers.

Further, as disclosed in Patent Document 2, a service has been proposed in which viewers watch the same content simultaneously, communicate by voice using a VoIP (Voice over Internet Protocol) function, and share spoken comments.

Patent Document 1: JP 2006-41886 A
Patent Document 2: JP 2007-274227 A

The techniques of Patent Documents 1 and 2 require viewers to watch the same content at the same time. Therefore, when viewers individually request and play content, as in VOD (Video On Demand), the participants must agree on a schedule in advance and must watch the TV at that time. Consequently, a member who happened to be unable to participate could neither listen to the conversation afterwards nor join the commenting later.

In particular, with the technique of Patent Document 2, the content and the conversation can be shared during simultaneous viewing, but the comments cannot be kept in synchronization with the content. Also, when a comment such as text is written onto the video screen, it is difficult to insert the comment at the desired timing.

A method of inserting a comment based on the time elapsed since the start of the content is conceivable for VOD. However, when comment data is inserted while viewing, for example, an IP multicast broadcast, the start time of the content is unknown, so the comment cannot be inserted at the appropriate time. Furthermore, when broadcast content is later offered per program as VOD (a so-called catch-up viewing service), no method has been proposed for keeping the comment insertion time information consistent when broadcast content with inserted comments is edited into program units and then viewed.

An object of the present invention is to provide a content processing device, a content processing method, a content processing program, and a content distribution system that solve the problems described above.

The content processing device of the present invention realizes an audio-added content distribution service that distributes content together with comments on the content. It comprises video playback means for outputting the video of the content, and audio switching means for switching the audio of the content to the comment and outputting it when the time information of the audio separated from the content matches the time information of the comment. The time information of the comment is assigned in advance based on the time information of the content, and the comment is received in response to a user request.
The content distribution system of the present invention includes the content processing device described above.

The content processing method of the present invention realizes an audio-added content distribution service that distributes content together with comments on the content. It comprises a video playback step of outputting the video of the content, and an audio switching step of switching the audio of the content to the comment and outputting it when the time information of the audio separated from the content matches the time information of the comment. The time information of the comment is assigned in advance based on the time information of the content, and the comment is received in response to a user request.

The content processing program of the present invention realizes an audio-added content distribution service that distributes content together with comments on the content. It causes a content processing device to execute video playback processing that outputs the video of the content, and audio switching processing that switches the audio of the content to the comment and outputs it when the time information of the audio separated from the content matches the time information of the comment. The time information of the comment is assigned in advance based on the time information of the content, and the comment is received in response to a user request.

According to the present invention, each viewer can freely choose when to view content from the content distribution service. In addition, when comments have been entered for the content being viewed, the viewer can hear each comment at the video timing intended by the speaker.

FIG. 1 is a block diagram schematically showing the content distribution system according to the first embodiment of the present invention.
FIG. 2 is a block diagram schematically showing the content processing device according to the first embodiment of the present invention.
FIG. 3 is a block diagram showing the content processing device according to the first embodiment of the present invention in detail.
FIG. 4 is a timing chart showing the flow of the content processing method according to the first embodiment of the present invention.
FIG. 5 is a flowchart showing the operation of the content processing program according to the first embodiment of the present invention.
FIG. 6 is a block diagram showing the content processing device according to the second embodiment of the present invention in detail.
FIG. 7 is a timing chart showing the flow of the content processing method according to the second embodiment of the present invention.
FIG. 8 is a flowchart showing the operation of the content processing program according to the second embodiment of the present invention.
FIG. 9 is a block diagram schematically showing the content distribution system according to the fourth embodiment of the present invention.
FIG. 10 is a block diagram showing the content processing device according to the fourth embodiment of the present invention in detail.

A content processing device, a content processing method, a content processing program, and a content distribution system according to embodiments of the present invention are described below. The present invention is not limited to these embodiments. For clarity, the following description and the drawings are simplified where appropriate.

<First Embodiment>
The content processing device according to the present embodiment can be suitably used in the audio-added content distribution system shown in FIG. 1.

For convenience, the configuration of the audio-added content distribution system is described first. The audio-added content distribution system of the present embodiment distributes content over an IP network and also distributes comments on that content. In the system shown in FIG. 1, viewers A and B are connected to the IP network, but the number of viewers is not limited to two.

The content server 1 is a server that distributes content services such as a VOD service, and it distributes content S101 in response to a viewing request from viewer A. The content S101 is delivered over the IP network 2 to the content processing device (STB: Set Top Box) 4 in viewer A's home. The content processing device 4 decodes the content, and the viewing device 5 outputs the video and audio. The viewing device 5 may be an ordinary television screen or a monitor such as that of a personal computer. While watching the monitor screen, viewer A enters comments such as impressions and opinions through the microphone (input unit) 6. A comment may also be, for example, sub-audio for visually impaired viewers or dubbed audio in a foreign language that is not broadcast.

As described in detail later, the content processing device 4 digitizes the comment information (audio information) input from the microphone 6, attaches the time information of the content currently being decoded, encodes the result together with content information, and then transmits it to the comment server 3 through the IP network 2. For example, when the distributed content is MPEG-2 TS, attaching the PTS (Presentation Time Stamp) of the Audio stream or Video stream of the MPEG-2 TS as the time information allows the comment to be synchronized with the video of the content. The content information only needs to identify the distributed content, so it may be information about the content being played on the viewing device 5 (for example, the title of the content, or the time at which recording of the content started together with channel information). The clock used to digitize the audio information can be synchronized with the time information by using the clock with which the content is being decoded. The digitized comment information may be transmitted to the comment server 3 uncompressed, or it may be compression-encoded before transmission. When the comment information is compression-encoded, using the same compression encoding scheme as the audio of the content being played, with the code amount controlled to the same compression rate, simplifies the processing in the comment server 3.
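The time-stamping step described above can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the `CommentPacket` container and the `tag_comment` helper are hypothetical names, and the codec label is only a placeholder. The one fixed fact used is that MPEG-2 PTS values are 33-bit counters driven by a 90 kHz clock.

```python
from dataclasses import dataclass

PTS_CLOCK_HZ = 90_000   # MPEG-2 PTS values count ticks of a 90 kHz clock
PTS_MODULUS = 1 << 33   # PTS is a 33-bit counter and wraps around


@dataclass(frozen=True)
class CommentPacket:
    """Hypothetical container for one chunk of recorded comment audio."""
    content_id: str   # identifies the content (e.g. title, or channel plus start time)
    pts: int          # PTS of the content being decoded when the chunk was captured
    codec: str        # audio coding scheme used for the comment (placeholder label)
    payload: bytes    # digitized, optionally compressed, comment audio


def pts_to_seconds(pts: int) -> float:
    """Convert a PTS tick count to seconds."""
    return (pts % PTS_MODULUS) / PTS_CLOCK_HZ


def tag_comment(content_id: str, current_pts: int, codec: str, audio: bytes) -> CommentPacket:
    """Attach the PTS of the content currently being decoded to a comment chunk."""
    return CommentPacket(content_id, current_pts % PTS_MODULUS, codec, audio)
```

Because the PTS comes from the content stream itself rather than from wall-clock time, the same tag stays valid whenever another viewer later plays the same stream.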

The comment server 3 receives, from the content processing device 4 in viewer A's home, the comment information S102 with the time information attached, together with the content information. Using the content information, it queries the content server 1 for detailed information S103 about the content (for example, the time at which recording of the content started and channel information, the video coding scheme and coding rate, and the audio coding scheme and coding rate). Based on the detailed information S103, the comment server 3 compresses the received comment information S102 into the same audio format as the content S101 and stores it. From the detailed-information query issued by the comment server 3, the content server 1 records that comment information exists for the content S101.
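The storage role of the comment server can be pictured with a small in-memory sketch. The class and method names below are hypothetical, and the conversion into the content's audio format described above is not modeled; the sketch only shows comments being kept per content and ordered by their time information.

```python
from collections import defaultdict


class CommentStore:
    """In-memory stand-in for the comment server's storage (hypothetical)."""

    def __init__(self):
        self._by_content = defaultdict(list)   # content_id -> [(pts, audio), ...]

    def add(self, content_id, pts, audio):
        """Store one comment chunk and keep each content's list ordered by PTS."""
        chunks = self._by_content[content_id]
        chunks.append((pts, audio))
        chunks.sort(key=lambda chunk: chunk[0])

    def has_comments(self, content_id):
        """Report whether any comment exists for the content."""
        return bool(self._by_content.get(content_id))

    def fetch(self, content_id):
        """Return a copy of the stored (pts, audio) chunks for the content."""
        return list(self._by_content.get(content_id, []))
```

A `has_comments` style check corresponds to the point in the description where the content server learns that comment information exists for S101.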

When the content processing device 4 in viewer B's home requests playback of the content S101 from the content server 1, the content server 1 notifies that content processing device 4 that a comment from viewer A exists. The notification can be presented through an application such as a Widget, or by showing a pop-up "comment available" indication on the screen of the viewing device 5. In addition, when a comment exists, the corresponding program entry on the GUI (Graphical User Interface) of the VOD selection screen displayed on the viewing device 5 is marked so that the viewer can tell that the program has a comment.

If viewer B requests it, the comment server 3 distributes the comment information S104 when the content server 1 distributes the content S101. The content processing device 4 in viewer B's home receives and decodes the content S101 and the comment information S104, plays the content S101, and outputs it to the viewing device 5. At the same time, it outputs the comment information S104 to the viewing device 5 at the timing at which the decoding time information of the content S101 matches the time information of the comment information S104. Viewer B can thus hear the comment on the content at the timing at which viewer A spoke. Viewer B can request the comment by, for example, selecting a button (selection unit), provided by the content processing device 4, that is linked to the comment on the GUI of the VOD selection screen (not shown).

In this audio-added content distribution system, the content processing device 4 of the present embodiment is used in viewer B's home.
As shown in FIG. 2, the content processing device 4 includes at least a video playback unit 42 and an audio switching unit 44. In response to viewer B's request, the video playback unit 42 outputs to the viewing device 5 the video of the content S101 transmitted from the content server 1 via the IP network 2. When the time information of the audio separated from the content S101 matches the time information of the comment information S104, which is transmitted from the comment server 3 via the IP network 2 in response to viewer B's request, the audio switching unit 44 switches the audio of the content S101 to the comment information S104 and outputs the comment information S104 to the viewing device 5. In other words, the content processing device 4 stores a content processing program that causes it to execute two processes: outputting the video of the content S101, and, when the time information of the audio separated from the content S101 matches the time information of the comment information S104 transmitted in response to viewer B's request, switching the audio of the content S101 to the comment information S104 and outputting it. The content processing device 4 carries out these processes based on that program.
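The switching rule of the audio switching unit 44 can be sketched as a simple per-frame selection. This is an illustration under assumptions, not the patented implementation: it assumes that comment audio is framed on the same PTS values as the content audio, and the function name and tuple layout are hypothetical.

```python
def select_audio(content_frames, comment_frames):
    """Choose, frame by frame, which audio to output.

    content_frames: iterable of (pts, data) for the content's audio stream
    comment_frames: dict mapping pts -> comment audio data
    Yields (pts, source, data), where source is "comment" when a comment
    frame carries the same PTS as the content frame, else "content".
    """
    for pts, data in content_frames:
        if pts in comment_frames:
            yield pts, "comment", comment_frames[pts]
        else:
            yield pts, "content", data


# Example: a comment covering the second and third frames replaces the content audio there.
content = [(0, b"c0"), (1920, b"c1"), (3840, b"c2"), (5760, b"c3")]
comment = {1920: b"k1", 3840: b"k2"}
output = list(select_audio(content, comment))
```

Exact PTS equality is the simplest reading of "matches"; a real decoder would more likely compare against a small tolerance, which is left out here.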

Here, the comment information S104 (S102) is stored in the comment server 3 with time information already assigned, based on the time information of the content S101 as viewed in viewer A's home, as described above. In response to viewer B's request, the comment information S104 is transmitted from the comment server 3 to the content processing device 4 in viewer B's home. The time information assigned to the comment information S104 is the time information of the content S101; for example, when the distributed content is MPEG-2 TS, the PTS of the Audio stream or Video stream of the MPEG-2 TS is used.

With this content processing device 4, content processing method, content processing program, and content distribution system, the desired content is received from the content server 1 in response to the viewer's request, so each viewer can freely choose when to view the content. Furthermore, when comments have been entered for the content being viewed, the viewer (viewer B) can hear each comment at the video timing intended by the speaker (viewer A).

The content processing device 4 is preferably configured as shown in FIG. 3. Data S400 of various services is input to the communication I/F unit 40 of the content processing device 4 through the IP network. The communication I/F unit 40 selects the content stream S401 and inputs it to the stream separation unit 41. The communication I/F unit 40 also informs the viewer of the presence or absence of a comment, either by using TCP messages to indicate on the screen of the viewing device 5 that a comment exists or by using an application such as a Widget.

The stream separation unit 41 analyzes the header information of the content stream S401 and separates the stream into a video stream S402, clock information S403, and an audio stream S404. The clock information S403 is input to the clock control unit 43, which generates the operating clock of the content processing device 4 from it. The video stream S402 is decoded by the video decoding unit (video playback unit) 42 based on the video clock signal S406 generated by the clock control unit 43 and is output as a video signal S405. The audio stream S404 is decoded by the audio decoding unit (audio switching unit) 44 based on the audio clock signal S407 generated by the clock control unit 43 and is output as an audio signal S409. The audio signal S409 does not necessarily have to be decoded by the audio decoding unit 44; it may instead be output as a compressed digital stream, for example in a format compliant with the AES/EBU or S/PDIF standard.

Viewer B requests a comment by, for example, operating a button (selection unit), provided in advance by the content processing device 4, that is linked to the comment on the GUI of the VOD selection screen. When the comment information S412 is to be output as the audio signal S409 during content playback in response to viewer B's request, the comment information S412 received by the communication I/F unit 40 is input to the audio decoding unit 44. The audio decoding unit 44 thus receives both the audio stream S404 and the comment information S412. It decodes the comment information S412 and, when the time stamp (time information) of the comment information S412 matches that of the audio stream S404, outputs the comment information S412 as the audio signal S409.

The CPU (Central Processing Unit) 46 controls the content processing device 4 as a whole, for example causing the communication I/F unit 40 to select the content desired by the user from among the various incoming information and to exchange comment input and output with the servers.

Using this content processing device 4, the content processing method is carried out as shown in FIG. 4.
First, the content processing device 4 in viewer B's home sends a content viewing request to the content server 1 (C11). In response, the content server 1 distributes the content to the content processing device 4 (C12). The content server 1 also sends the content processing device 4 information indicating that a comment from viewer A is stored (C13). The content processing device 4 in viewer B's home starts decoding the received content and, as described above, informs viewer B of the presence or absence of a comment using, for example, an application such as a Widget. The content processing device 4 transmits viewer B's comment request to the comment server 3 via the IP network 2 (C14). In response, the comment server 3 distributes viewer A's comment to the content processing device 4 (C15). When the time information of the audio separated from the content matches the time information of viewer A's comment, the content processing device 4 switches the audio of the content to viewer A's comment and outputs it. As a result, viewer B can view the content and hear the comment in synchronization with it (C16).

More specifically, the content processing device 4 carries out the processing flow shown in FIG. 5 based on the content processing program.
The communication I/F unit 40 of the content processing device 4 analyzes the RTP (Real-time Transport Protocol) / UDP (User Datagram Protocol) packet header of each input IP packet (F101). Only packets of the stream requested by viewer B are passed on to the processing of F103; any other packet data is discarded, and processing moves on to the next packet (F102).

Next, the communication I/F unit 40 of the content processing device 4 determines, from the sequence number in the RTP packet header, whether the packet data is a content stream packet or a comment packet. For a content packet, processing proceeds to F104; for a comment packet, it proceeds to F110 (F103).
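Steps F101 to F103 can be sketched as header parsing followed by routing. The fixed 12-byte RTP header layout used below follows the RTP specification; everything else is an assumption for illustration. In particular, identifying the requested stream by SSRC is a choice made for this sketch, and because the description does not spell out how the sequence number distinguishes content from comment packets, that rule is passed in as a caller-supplied predicate rather than guessed.

```python
import struct

RTP_HEADER_LEN = 12  # fixed RTP header, without CSRC entries or extensions


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed part of an RTP header (F101)."""
    if len(packet) < RTP_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:RTP_HEADER_LEN])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "payload_type": b1 & 0x7F,   # low seven bits of the second byte
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def route_packet(packet: bytes, wanted_ssrcs, is_comment) -> str:
    """Return "discard", "content", or "comment" for one packet.

    wanted_ssrcs: set of SSRC values of the requested streams (assumption)
    is_comment:   predicate on the parsed header (hypothetical rule)
    """
    header = parse_rtp_header(packet)
    if header["ssrc"] not in wanted_ssrcs:
        return "discard"                                    # F102
    return "comment" if is_comment(header) else "content"   # F103
```

Keeping the classification rule as a parameter makes it possible to plug in whatever convention a deployment actually uses without touching the parsing code.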

Next, the stream separation unit 41 of the content processing device 4 analyzes the TS header and obtains the PID information needed to separate the TS packets into video packets, audio packets, and time information such as the PCR (Program Clock Reference) (F104).

Next, the stream separation unit 41 of the content processing device 4 uses the PID information to separate the PCR, the video packets, and the audio packets, and processing proceeds to F106, F107, and F110, respectively (F105).
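The PID-based separation in F104 and F105 can be sketched as follows. The 188-byte packet length, the 0x47 sync byte, and the 13-bit PID position are standard MPEG-2 TS facts; the `pid_map` argument and the PID values used with it are hypothetical, since a real receiver would learn them from the stream's program tables. Extracting the PCR itself from the adaptation field is not shown.

```python
TS_PACKET_LEN = 188   # every MPEG-2 TS packet is 188 bytes long
TS_SYNC_BYTE = 0x47   # first byte of every TS packet


def ts_pid(packet: bytes) -> int:
    """Return the 13-bit PID from a TS packet header."""
    if len(packet) != TS_PACKET_LEN or packet[0] != TS_SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demux(packets, pid_map):
    """Group TS packets by role according to pid_map (PID -> role name).

    Packets whose PID is not in pid_map are ignored.
    """
    out = {name: [] for name in pid_map.values()}
    for packet in packets:
        name = pid_map.get(ts_pid(packet))
        if name is not None:
            out[name].append(packet)
    return out
```

Each resulting group would then be handed to the corresponding stage: the clock control unit 43, the video decoding unit 42, or the audio decoding unit 44.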

The PCR is input to the clock control unit 43 of the content processing device 4, which generates a reference clock from it (F106). The video decoding unit 42 decodes the separated video data and outputs a video signal (F107). The audio decoding unit 44 selects whether to decode the comment information or the audio information of the content (F110), then decodes and outputs the selected one (F108).

<Second Embodiment>
The content processing device of the first embodiment is configured to receive audio-added content properly, but it preferably also includes means for entering comments on the content.

That is, as shown in FIG. 6, the content processing device preferably includes an input unit 6, such as a microphone, for entering comments, and an audio encoding unit 45. The comment information S410 input from the input unit 6 is encoded by the audio encoding unit 45 into comment information S411, which is transmitted from the communication I/F unit 40 to the comment server 3 via the IP network 2. The content processing device can thus not only receive audio-added content but also add comments to that content, and it can do so at the intended timing within the content being played.

The audio stream information S408 of the stream currently being received is input to the audio encoding unit 45, which encodes the comment information S410 based on it. Here, the audio stream information S408 consists of the coding scheme, the coding rate, the time information (PTS), and similar parameters of the audio stream being decoded by the audio decoding unit 44.

The audio encoding unit 45 continues encoding for as long as speech is being uttered. It is therefore preferable to provide an echo canceller so that sound output by the content processing device is not picked up. This allows the comment to be captured more accurately.

With this content processing device, the audio-added content processing method shown in FIG. 7 can be realized. The following description assumes the system shown in FIG. 1.

先ず、視聴者Ａ宅のコンテンツ処理装置４は、視聴者Ａの要求に基づいて、コンテンツサーバ１にコンテンツ視聴要求を出力する（Ｃ１００）。コンテンツサーバ１は、当該視聴要求に基づいて、所望のコンテンツの配信を視聴者Ａ宅のコンテンツ処理装置４に対して開始する（Ｃ１０１）。 First, the content processing device 4 at the viewer A's home outputs a content viewing request to the content server 1 based on the request from the viewer A (C100). Based on the viewing request, the content server 1 starts distributing desired content to the content processing device 4 at the viewer A's home (C101).

次に、視聴者Ａ宅のコンテンツ処理装置４は、コンテンツをデコードし視聴者Ａが視聴可能な状態とする（Ｃ１０２）。この時、視聴者Ａが視聴中にコメントを発言した際、視聴者Ａ宅のコンテンツ処理装置４の入力部６は当該コメントを取得する。そして、当該コンテンツ処理装置４のオーディオ符号化部４５は、コメントに再生中のコンテンツの時間情報を付加してエンコードし、コメントサーバ３に送信する（Ｃ１０３）。 Next, the content processing device 4 at viewer A's home decodes the content so that viewer A can view it (C102). When viewer A speaks a comment during viewing, the input unit 6 of the content processing device 4 at viewer A's home captures the comment. The audio encoding unit 45 of the content processing device 4 then adds the time information of the content being played back to the comment, encodes it, and transmits it to the comment server 3 (C103).

次に、コメントサーバ３は、コメントを受信するとコンテンツサーバ１にコメント蓄積通知を送信する（Ｃ１０４）。コンテンツサーバ１は、コメント蓄積通知から当該コンテンツに視聴者Ａのコメントが残されたことを管理する。 Next, upon receiving the comment, the comment server 3 transmits a comment accumulation notification to the content server 1 (C104). Based on this notification, the content server 1 records that viewer A has left a comment on the content.
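The bookkeeping that the content server 1 performs on receiving a comment accumulation notification (C104), and the "comments exist" information it later returns with a viewing request (C107), can be sketched as follows. The class, method, and field names are hypothetical; the patent describes the behavior but not an interface.

```python
from typing import Dict, List, Set


class ContentCatalog:
    """Hypothetical record of which content items have comments stored for them."""

    def __init__(self) -> None:
        self._commenters: Dict[str, Set[str]] = {}

    def on_comment_stored(self, content_id: str, viewer_id: str) -> None:
        # Corresponds to receiving a comment accumulation notification (C104).
        self._commenters.setdefault(content_id, set()).add(viewer_id)

    def viewing_response(self, content_id: str) -> Dict[str, object]:
        # Corresponds to telling a later viewer that comments exist (C107).
        commenters: List[str] = sorted(self._commenters.get(content_id, set()))
        return {
            "content_id": content_id,
            "has_comments": bool(commenters),
            "commenters": commenters,
        }
```

Keeping only a flag and a list of commenters on the content server leaves the comment audio itself on the comment server 3, which matches the division of roles in the sequence above.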

一方、視聴者Ｂ宅のコンテンツ処理装置４は、視聴者Ｂの要求に基づいて、コンテンツサーバ１にコンテンツ視聴要求を出力する（Ｃ１０５）。コンテンツサーバ１は、当該コンテンツ視聴要求に基づいて、所望のコンテンツを視聴者Ｂ宅のコンテンツ処理装置４に対して配信する（Ｃ１０６）。更に、コンテンツサーバ１は、視聴者Ａのコメントが蓄積されていることを示す情報を視聴者Ｂ宅のコンテンツ処理装置４に配信する（Ｃ１０７）。 On the other hand, the content processing apparatus 4 at the viewer B's home outputs a content viewing request to the content server 1 based on the request from the viewer B (C105). Based on the content viewing request, the content server 1 distributes the desired content to the content processing device 4 at the viewer B home (C106). Furthermore, the content server 1 distributes information indicating that the comments of the viewer A are accumulated to the content processing device 4 of the viewer B home (C107).

視聴者Ｂ宅のコンテンツ処理装置４は、コンテンツを受信しデコードを開始すると共にコメントがあることを視聴者Ｂに伝える。視聴者Ｂ宅のコンテンツ処理装置４は、視聴者Ｂからのコメント要求を、ＩＰネットワーク２を介してコメントサーバ３に送信する（Ｃ１０８）。これにより、コメントサーバ３は、視聴者Ｂ宅のコンテンツ処理装置４にコメントサーバ３からＩＰネットワーク２を介して視聴者Ａのコメントを配信する（Ｃ１０９）。視聴者Ｂ宅のコンテンツ処理装置４は、コンテンツから分離した音声の時間情報と、視聴者Ａのコメントの時間情報と、が一致すると、コンテンツの音声を視聴者Ａのコメントに切り替えて出力する。その結果、視聴者Ｂ宅では、コンテンツを視聴すると共に、当該コンテンツと同期するコメントを視聴することが可能となる（Ｃ１１０）。 The content processing device 4 at viewer B's home receives the content, starts decoding it, and informs viewer B that a comment exists. It then transmits viewer B's comment request to the comment server 3 via the IP network 2 (C108). In response, the comment server 3 distributes viewer A's comment to the content processing device 4 at viewer B's home via the IP network 2 (C109). When the time information of the audio separated from the content matches the time information of viewer A's comment, the content processing device 4 at viewer B's home switches the audio output from the content audio to viewer A's comment. As a result, viewer B can watch the content and hear the comment synchronized with it (C110).
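The switching rule at step C110 can be illustrated with a minimal sketch. It assumes that content audio and comment audio are both divided into frames and that a comment frame carries exactly the same time information as the content frame it replaces; the function name and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple


def select_audio(
    content_frames: List[Tuple[int, str]],
    comment_frames: Dict[int, str],
) -> List[Tuple[int, str, str]]:
    """For each content frame, output the comment frame when the time information matches.

    content_frames: (pts, samples) pairs in playback order.
    comment_frames: mapping from pts to comment samples.
    Returns (pts, source, samples) triples, where source is "comment" or "content".
    """
    output = []
    for pts, samples in content_frames:
        if pts in comment_frames:
            output.append((pts, "comment", comment_frames[pts]))
        else:
            output.append((pts, "content", samples))
    return output
```

With content frames at time values 0, 1920, and 3840 and a single comment frame at 1920, only the middle frame is taken from the comment; playback returns to the content audio as soon as no comment frame matches.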

ちなみに、本実施の形態のコンテンツ処理装置４も、コンテンツ処理プログラムに基づいて動作し、第１の実施の形態のコンテンツ処理装置４と略同様に動作する。但し、本実施の形態のコンテンツ処理装置４は、図８に示すように、コメント情報が入力された場合、コンテンツ処理装置４のオーディオ符号化部４５が、コメント情報とクロック情報とオーディオ出力時間情報とをエンコードし、通信Ｉ／Ｆ部４０を介してコメントサーバ３に送信する（Ｆ１０９）。 Incidentally, the content processing device 4 of the present embodiment also operates based on a content processing program, in substantially the same manner as the content processing device 4 of the first embodiment. However, as shown in FIG. 8, when comment information is input, the audio encoding unit 45 of the content processing device 4 encodes the comment information, the clock information, and the audio output time information, and transmits them to the comment server 3 via the communication I/F unit 40 (F109).

＜第３の実施の形態＞ <Third Embodiment>
第１及び第２の実施の形態では、コンテンツの音声をコメントに切り替えて出力しているが、この限りでない。 In the first and second embodiments, the audio of the content is switched to the comment for output, but the invention is not limited to this.

即ち、オーディオ復号化部４４は、視聴者Ｂ宅のスピーカがサラウンド設定である場合、背面スピーカのチャンネルの出力を音声ストリームからコメント情報に切り替える構成であることが好ましい。これにより、前面スピーカからはコンテンツの音声を聞きつつ、背面スピーカからコメントを聞くことが可能となる。 That is, it is preferable that the audio decoding unit 44 is configured to switch the output of the channel of the rear speaker from the audio stream to the comment information when the speaker of the viewer B's home is in the surround setting. As a result, it is possible to hear a comment from the rear speaker while listening to the audio of the content from the front speaker.

つまり、図１に示すように、視聴者Ｂ宅のスピーカがサラウンドの設定である場合には背面スピーカ７よりコメント情報Ｓ１０４をデコードしたコメントＳ１０５を出力し、前面スピーカからはコンテンツの音声を出力することが可能となる。これにより、視聴者ＢはコンテンツＳ１０１を視聴しながら、視聴者Ａのコメントを楽しむことが可能となる。この時、コメントサーバ３では配信されるコンテンツの音声データが５．１ｃｈサラウンド音声もしくは７．１ｃｈサラウンド音声といった複数チャンネルの音声の場合、コメント情報を背面チャンネルの圧縮フォーマットに従った符号量に制御する。コンテンツ処理装置４では、背面チャンネル情報をコメント情報に切り替えて出力する。これにより、視聴者Ｂは背面スピーカ７からコメントＳ１０５を聞くことができる。コメントとコンテンツの音声を前後のチャンネルに分割して出力することにより、人間の聴覚特性を利用して、コンテンツそのものの音声と後から付け加えたコメントとを区別することが可能となる。 That is, as shown in FIG. 1, when the speakers at viewer B's home are set up for surround sound, the comment S105 obtained by decoding the comment information S104 can be output from the rear speaker 7 while the audio of the content is output from the front speakers. Viewer B can thus enjoy viewer A's comment while watching the content S101. In this case, when the audio data of the distributed content is multi-channel audio such as 5.1ch or 7.1ch surround sound, the comment server 3 controls the code amount of the comment information to match the compression format of the rear channels. The content processing device 4 switches the rear channel information to the comment information and outputs it, so that viewer B hears the comment S105 from the rear speaker 7. Splitting the comment and the content audio between the rear and front channels makes use of human auditory characteristics, allowing the listener to distinguish the audio of the content itself from the comment added afterwards.
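The rear-channel substitution can be sketched for a single decoded frame of 5.1 audio. The channel labels (FL, FR, C, LFE, SL, SR) and the dictionary layout are assumptions for illustration; the patent only distinguishes front and rear channels.

```python
from typing import Dict, List, Optional

# Hypothetical channel labels for a 5.1 layout; the patent does not name them.
FRONT_CHANNELS = ("FL", "FR", "C", "LFE")
REAR_CHANNELS = ("SL", "SR")


def route_comment_to_rear(
    frame: Dict[str, List[int]],
    comment: Optional[List[int]],
) -> Dict[str, List[int]]:
    """Replace the rear channels of one frame with the comment samples, if any.

    The front channels keep the content audio. The input frame is not modified.
    """
    if comment is None:
        return frame
    out = dict(frame)
    for channel in REAR_CHANNELS:
        out[channel] = list(comment)
    return out
```

Because only the rear channels are overwritten, the content audio from the front speakers is never interrupted, which is the difference from the full audio switch of the first and second embodiments.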

＜第４の実施の形態＞ <Fourth Embodiment>
第３の実施の形態では、コンテンツサーバ１から配信されたコンテンツに対してコメントを付加するべく、当該コンテンツサーバ１からコンテンツを受信しているが、この限りでない。 In the third embodiment, the content is received from the content server 1 so that a comment can be added to content distributed from that server, but the invention is not limited to this.

即ち、本実施の形態のコンテンツ処理装置４は、図９に示す音声付加コンテンツ配信システムにおいて好適に用いられる。当該音声付加コンテンツ配信システムは、図１に示す音声付加コンテンツ配信システムの機能に加えて、放送局８から放送コンテンツが放送されたり、見逃した当該放送コンテンツが配信されたりする構成とされている。 That is, the content processing apparatus 4 of the present embodiment is preferably used in the audio-added content distribution system shown in FIG. In addition to the functions of the audio additional content distribution system shown in FIG. 1, the audio additional content distribution system is configured such that broadcast content is broadcast from the broadcast station 8, or the missed broadcast content is distributed.

図９に示す音声付加コンテンツ配信システムは、放送局８からの放送コンテンツＳ１０６が各家庭に放送されると共に、コンテンツサーバ１に放送コンテンツごとに蓄積される。視聴者Ａ宅のコンテンツ処理装置４は、アンテナで放送コンテンツＳ１０６を受信し、その後は、上記実施の形態で説明したコンテンツサーバ１からのビデオコンテンツと略同様に、放送コンテンツをデコードし、視聴装置５にて映像・音声を出力する。視聴装置５は通常のＴＶ画面でも、ＰＣ等のモニタでも構わない。 In the audio-added content distribution system shown in FIG. 9, the broadcast content S106 from the broadcast station 8 is broadcast to each home and is stored in the content server 1 for each broadcast content. The content processing device 4 at the viewer A's home receives the broadcast content S106 with an antenna, and then decodes the broadcast content in substantially the same manner as the video content from the content server 1 described in the above embodiment, and the viewing device 5 outputs video / audio. The viewing device 5 may be a normal TV screen or a monitor such as a PC.

視聴者Ａは、モニタ画面を見ながら感想や意見といったコメントを、入力部（マイク）６より入力する。コメントとしては例えば視覚障害者向けの副音声や放送されていない外国語の吹き替え音声でも良い。視聴者Ａ宅のコンテンツ処理装置４は、オーディオ符号化部４５が入力部６より入力されたコメント情報（音声情報）をデジタル化して、さらに現在デコード中の時間情報を付加し放送コンテンツ情報と共にエンコードして、ＩＰネットワーク２を介してコメントサーバ３に送信する。配信されるコンテンツがＭＰＥＧ−２ＴＳである場合には、ＭＰＥＧ−２ＴＳのＡｕｄｉｏストリームもしくはＶｉｄｅｏストリームのＰＴＳを時間情報として付加することでコメントと映像とを同期させることができる。つまり、放送局８からの放送コンテンツはＭＰＥＧ−２ＴＳであるため、第１の実施の形態などと同様に、ＡｕｄｉｏストリームもしくはＶｉｄｅｏストリームのＰＴＳを時間情報として付加する。 Viewer A inputs comments, such as impressions and opinions, through the input unit (microphone) 6 while watching the monitor screen. A comment may also be, for example, sub-audio for visually impaired viewers or dubbed audio in a foreign language that is not broadcast. In the content processing device 4 at viewer A's home, the audio encoding unit 45 digitizes the comment information (audio information) input from the input unit 6, adds the time information of the content currently being decoded, encodes it together with the broadcast content information, and transmits it to the comment server 3 via the IP network 2. When the distributed content is MPEG-2 TS, the comment and the video can be synchronized by adding the PTS of the MPEG-2 TS audio stream or video stream as the time information. In other words, because the broadcast content from the broadcast station 8 is MPEG-2 TS, the PTS of the audio stream or video stream is added as the time information, as in the first embodiment.
また、放送コンテンツ情報は、放送された放送コンテンツが特定できる情報であれば良いため、放送コンテンツのチャンネル情報であれば良い。尚、コメント情報をデジタル化する際のクロックは、放送コンテンツをデコードしているクロックを用いることで、時間情報と同期させる。デジタル化されたコメント情報は、非圧縮のままコメントサーバ３に送信しても良いし、圧縮符号化してコメントサーバ３に送信しても良い。また、コメント情報を圧縮符号化する場合には、放送コンテンツの音声情報と同じ圧縮符号化方式を用い、同じ圧縮率に符号量制御することにより、コメントサーバ３の処理を簡略化することができる。 The broadcast content information only needs to identify the broadcast content that was aired, so the channel information of the broadcast content is sufficient. The clock used to digitize the comment information is the clock used to decode the broadcast content, which keeps the comment synchronized with the time information. The digitized comment information may be transmitted to the comment server 3 uncompressed, or it may be compression-encoded first. When the comment information is compression-encoded, using the same compression encoding scheme as the audio of the broadcast content and controlling the code amount to the same compression rate simplifies processing at the comment server 3.
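As background for the PTS-based time information: in MPEG-2 systems the PTS is a 33-bit counter expressed in units of a 90 kHz clock, so it wraps around after roughly 26.5 hours. The helper names below are hypothetical; the sketch only shows the arithmetic a device might use when comparing the time information of a comment with that of the content.

```python
PTS_CLOCK_HZ = 90_000     # PTS values count ticks of a 90 kHz clock
PTS_MODULUS = 1 << 33     # PTS is a 33-bit field and wraps around


def pts_to_seconds(pts: int) -> float:
    """Convert a PTS value to seconds since the counter last passed zero."""
    return (pts % PTS_MODULUS) / PTS_CLOCK_HZ


def pts_delta(earlier: int, later: int) -> int:
    """Ticks elapsed from `earlier` to `later`, allowing for one wraparound."""
    return (later - earlier) % PTS_MODULUS
```

For example, a PTS of 900000 corresponds to 10 seconds, and a comment that starts one second before the counter wraps and ends one second after it spans 180000 ticks.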

コメントサーバ３は、視聴者Ａ宅のコンテンツ処理装置４から時間情報が付加されたコメント情報と放送コンテンツ情報とを受信してデコードし、コンテンツ情報からコンテンツサーバ１に放送コンテンツの詳細情報を問い合わせし、視聴者Ａ宅のコンテンツ処理装置４から受信したコメント情報を放送コンテンツＳ１０６と同じ音声フォーマットに圧縮し蓄積する。 The comment server 3 receives and decodes the comment information to which the time information is added and the broadcast content information from the content processing device 4 at the viewer A's home, and inquires the content server 1 for detailed information on the broadcast content from the content information. The comment information received from the content processing device 4 at the viewer A's home is compressed and stored in the same audio format as the broadcast content S106.

コンテンツサーバ１は、コメントサーバ３のコンテンツ詳細情報（Ｓ１０３）問い合わせによって、放送コンテンツＳ１０６にコメントがあることを記憶する。コンテンツサーバ１は、ＶＯＤサービス等のコンテンツサービスを配信するサーバであり、視聴者Ｂの視聴要求により放送コンテンツＳ１０６をビデオコンテンツＳ１０１として配信する。視聴者Ｂ宅のコンテンツ処理装置４からコンテンツサーバ１に対してビデオコンテンツＳ１０１（放送コンテンツＳ１０６）の再生要求があると、コンテンツサーバ１は視聴者Ｂ宅のコンテンツ処理装置４に視聴者Ａのコメントがあることを告知する。視聴者Ｂからの要求があれば、コンテンツサーバ１がビデオコンテンツＳ１０１を配信する際に、コメントサーバがコメント情報Ｓ１０４を配信する。 The content server 1 stores that there is a comment in the broadcast content S106 in response to the detailed content information (S103) inquiry from the comment server 3. The content server 1 is a server that distributes a content service such as a VOD service, and distributes broadcast content S106 as video content S101 in response to a viewer B's viewing request. When the content processing device 4 at the viewer B's home requests the content server 1 to reproduce the video content S101 (broadcast content S106), the content server 1 sends the comment of the viewer A to the content processing device 4 at the viewer B's home. Announce that there is. If there is a request from the viewer B, the comment server distributes the comment information S104 when the content server 1 distributes the video content S101.

視聴者Ｂ宅のコンテンツ処理装置４は、上記実施の形態と同様に、ビデオコンテンツＳ１０１とコメント情報Ｓ１０４とを受信し、ビデオコンテンツＳ１０１を再生し視聴装置５に出力すると共に、ビデオコンテンツＳ１０１の復号時間情報とコメント情報Ｓ１０４の時間情報とが一致したタイミングで、コメント情報Ｓ１０４を視聴装置５に出力する。これにより、視聴者Ｂは、視聴者Ａが放送局８から受信してコメントを残した放送コンテンツＳ１０６を、所望のタイミングで視聴することができ、しかも視聴者Ａが意図したタイミングでコメントを聞くことが可能となる。 As in the above embodiments, the content processing device 4 at viewer B's home receives the video content S101 and the comment information S104, plays back the video content S101 and outputs it to the viewing device 5, and outputs the comment information S104 to the viewing device 5 at the moment when the decoding time information of the video content S101 matches the time information of the comment information S104. Viewer B can thus watch, at any desired time, the broadcast content S106 that viewer A received from the broadcast station 8 and commented on, and can hear the comment at the timing viewer A intended.
勿論、視聴者Ｂは、コンテンツ処理装置４を用いて当該放送コンテンツに対してコメントを付加することもできる。 Of course, viewer B can also add a comment to the broadcast content using the content processing device 4.

本実施の形態のコンテンツ処理装置について図１０を用いて動作を説明する。放送局８からの放送Ｓ１０６を受信する放送受信部４７で視聴者が選択した放送チャンネルのストリームＳ４１４を出力する。ストリーム分離部４１は、ビデオストリームＳ４０２とクロック情報Ｓ４０３とオーディオＳ４０４とに分離する。ビデオ復号化部４２は、映像ストリームＳ４０２を復号し、ビデオ信号Ｓ４０５を出力する。オーディオ復号化部４４は、オーディオストリームＳ４０４を復号し、オーディオ信号Ｓ４０９を出力すると共に、オーディオ符号化方式や符号化レートなどのオーディオ符号化情報Ｓ４０８を出力する。クロック制御部４３は、クロック情報Ｓ４０３に基づいて基準クロックを生成し、ビデオ信号とオーディオ信号の出力タイミングを合わせると共に、オーディオ符号化部４５に符号化したコメントデータの時間情報を出力させる。 The operation of the content processing apparatus of the present embodiment is described with reference to FIG. 10. The broadcast receiving unit 47, which receives the broadcast S106 from the broadcast station 8, outputs the stream S414 of the broadcast channel selected by the viewer. The stream separation unit 41 separates this stream into the video stream S402, the clock information S403, and the audio stream S404. The video decoding unit 42 decodes the video stream S402 and outputs a video signal S405. The audio decoding unit 44 decodes the audio stream S404, outputs an audio signal S409, and also outputs audio encoding information S408 such as the audio encoding scheme and encoding rate. The clock control unit 43 generates a reference clock based on the clock information S403, aligns the output timing of the video signal and the audio signal, and causes the audio encoding unit 45 to output the time information of the encoded comment data.

本発明は上記実施の形態に限られたものではなく、趣旨を逸脱しない範囲で適宜変更することが可能である。 The present invention is not limited to the above-described embodiment, and can be changed as appropriate without departing from the spirit of the present invention.

１コンテンツサーバ
２ＩＰネットワーク
３コメントサーバ
４コンテンツ処理装置
５視聴装置
６入力部（マイク）
７背面スピーカ
８放送局
４０通信Ｉ／Ｆ部
４１ストリーム分離部
４２ビデオ復号化部（映像再生部）
４３クロック制御部
４４オーディオ復号化部
４５オーディオ符号化部
４６ＣＰＵ
４７放送受信部

DESCRIPTION OF SYMBOLS
1 Content server
2 IP network
3 Comment server
4 Content processing device
5 Viewing device
6 Input unit (microphone)
7 Rear speaker
8 Broadcast station
40 Communication I/F unit
41 Stream separation unit
42 Video decoding unit (video playback unit)
43 Clock control unit
44 Audio decoding unit
45 Audio encoding unit
46 CPU
47 Broadcast receiving unit

Claims

1. A content processing apparatus that realizes an audio-added content distribution service that distributes content and distributes comments on the content, the apparatus comprising:
video playback means for outputting video of the content; and
audio switching means for switching the audio of the content to the comment and outputting it when the time information of the audio separated from the content matches the time information of the comment, the comment having been given time information in advance based on the time information of the content and having been received based on a user request.

2. The content processing apparatus according to claim 1, further comprising:
input means for a user to input a comment during playback of the content;
audio encoding means for encoding the comment based on audio time information separated from the content being played back and audio encoding scheme information of the content being played back; and
transmitting means for transmitting the user's comment to a comment server.

3. The content processing apparatus according to claim 2, further comprising broadcast content receiving means for receiving broadcast content from a broadcast station, and video content receiving means for receiving video content from a content server via a network, wherein
the video playback means plays back the video of the received broadcast content or the video of the received video content, and
the audio encoding means encodes the comment based on audio time information separated from the broadcast content or video content being played back and audio encoding scheme information of the broadcast content or video content being played back.

4. The content processing apparatus according to claim 3, wherein the video content receiving means receives, based on a user request, broadcast content stored in the content server as the video content.

5. The content processing apparatus according to claim 3, wherein the audio encoding means encodes the comment input from the input means according to a code amount corresponding to the rear channels of the audio information of the video content.

6. A content distribution system comprising the content processing apparatus according to claim 1.

7. A content processing method that realizes an audio-added content distribution service that distributes content and distributes comments on the content, the method comprising:
a video playback step of outputting video of the content; and
an audio switching step of switching the audio of the content to the comment and outputting it when the time information of the audio separated from the content matches the time information of the comment, the comment having been given time information in advance based on the time information of the content and having been received based on a user request.

8. The content processing method according to claim 7, further comprising:
an input step in which a user inputs a comment during playback of the content;
an audio encoding step of encoding the comment based on audio time information separated from the content being played back and audio encoding scheme information of the content being played back; and
a transmitting step of transmitting the user's comment to a comment server.

9. The content processing method according to claim 8, wherein
in the video playback step, the video of broadcast content received from a broadcast station or the video of video content received from a content server via a network is played back, and
in the audio encoding step, the comment is encoded based on audio time information separated from the broadcast content or video content being played back and audio encoding scheme information of the broadcast content or video content being played back.

10. The content processing method according to claim 9, wherein, in the video playback step, the video of broadcast content stored in the content server and received as the video content based on a user request is played back.

11. The content processing method according to claim 9, wherein, in the audio encoding step, the comment input by the user is encoded according to a code amount corresponding to the rear channels of the audio information of the video content.

12. A content processing program that realizes an audio-added content distribution service that distributes content and distributes comments on the content, the program causing a content processing device to execute:
a video playback process of outputting video of the content; and
an audio switching process of switching the audio of the content to the comment and outputting it when the time information of the audio separated from the content matches the time information of the comment, the comment having been given time information in advance based on the time information of the content and having been received based on a user request.

13. The content processing program according to claim 12, further causing the content processing device to execute:
an input process in which a user inputs a comment during playback of the content;
an audio encoding process of encoding the comment based on audio time information separated from the content being played back and audio encoding scheme information of the content being played back; and
a transmission process of transmitting the user's comment to a comment server.

14. The content processing program according to claim 13, wherein
in the video playback process, the video of broadcast content received from a broadcast station or the video of video content received from a content server via a network is played back, and
in the audio encoding process, the comment is encoded based on audio time information separated from the broadcast content or video content being played back and audio encoding scheme information of the broadcast content or video content being played back.

15. The content processing program according to claim 14, wherein, in the video playback process, the video of broadcast content stored in the content server and received as the video content based on a user request is played back.

16. The content processing program according to claim 14, wherein, in the audio encoding process, the comment input by the user is encoded according to a code amount corresponding to the rear channels of the audio information of the video content.