JP2012060467A

JP2012060467A - Video content reproducing device, video content reproducing method, and computer program for the method

Info

Publication number: JP2012060467A
Application number: JP2010202297A
Authority: JP
Inventors: Hiroyasu Ito; 博康伊藤
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-09-09
Filing date: 2010-09-09
Publication date: 2012-03-22

Abstract

PROBLEM TO BE SOLVED: To provide a video content reproducing device capable of performing closed-captioned fast-forward reproduction of closed-captioned content distributed from the outside via a network.

SOLUTION: A video content reproducing device 110 designates a caption start time of content and a reproduction range from that caption start time, and obtains the content, including caption data and video data, from a distribution server 100. It then calculates the caption start time of the content to be obtained next on the basis of a caption start time 301 and a reproduction time 302 included in the caption data, and designates that caption start time and the reproduction time to obtain the next content from the distribution server 100. It also calculates a fast-forward reproduction time for the video frame in the video data that corresponds to the caption of the caption data, on the basis of the number of characters of the caption extracted from the caption data, and displays the caption data and the video frame as a still image for the fast-forward reproduction time.

Description

The present invention relates to a video content playback apparatus, a video content playback method, and a computer program, and is particularly suitable for playing back content with captions.

An apparatus has been disclosed that displays captions together with video during fast-forward playback so that the viewer can grasp the content of, for example, a recorded broadcast program in a short time (see Patent Document 1). The apparatus described in Patent Document 1 displays caption character strings one after another and, in step with each switch, also displays a still image cut out from the video data. Specifically, it first extracts the caption character strings from the video data and uses each caption character string, together with the still image corresponding to it, as summary content. It then determines the timing for switching between the caption character strings and still images that make up the summary content on the basis of the caption character strings. In this way, the technique described in Patent Document 1 achieves quick-view playback with captions.

Patent Document 1: JP 2009-76970 A

However, the technique described in Patent Document 1 must first read all of the content data, analyze the caption information, and create summary content according to the captions. Consequently, for content delivered from a server to the content playback apparatus via a network (in particular, content that cannot be recorded), the summary content cannot be created inside the content playback apparatus, and quick-view playback with captions is not possible.
The present invention has been made in view of this problem, and its object is to enable quick-view playback with captions of captioned content distributed from outside via a network.

The video content playback apparatus of the present invention acquires content having video data and caption data from an external device via a network and plays it back. The apparatus comprises: requesting means for specifying a time position in the content and a playback time range from that time position, and requesting the external device, via the network, to provide the content in the specified range; acquisition means for acquiring the content requested by the requesting means from the external device via the network; output means for outputting, as a still image, a caption based on the caption data included in the content acquired by the acquisition means, together with the video frame corresponding to that caption among the video frames obtained from the video data included in the acquired content; and determination means for determining the time position of the content to be acquired next, based on the caption display start timing and the caption playback time obtained from the caption data included in the acquired content.

According to the present invention, the caption display start timing and the caption playback time are specified, content in the specified range is sequentially acquired from the external device, and each caption and the video frame corresponding to it are sequentially output as a still image. Quick-view playback with captions can therefore be performed on captioned content distributed from outside via a network.

FIG. 1 is a diagram showing the configuration of the video content playback apparatus according to the first embodiment.
FIG. 2 is a diagram showing the relationship between video data and caption data when content is played back normally.
FIG. 3 is a diagram showing the data structure of caption data.
FIG. 4 is a sequence diagram explaining content acquisition processing.
FIG. 5 is a diagram showing the content data transmitted from the distribution server.
FIG. 6 is a flowchart explaining the flow of content acquisition processing.
FIG. 7 is a diagram showing the relationship between video data and caption data when quick-view playback with captions is performed.
FIG. 8 is a flowchart explaining the flow of quick-view playback processing with captions.
FIG. 9 is a diagram showing the configuration of the video content playback apparatus according to the second embodiment.
FIG. 10 is a flowchart explaining the flow of content request processing.
FIG. 11 is a sequence diagram explaining content acquisition processing.
FIG. 12 is a diagram showing, in table form, the playback timing of video data and caption data when quick-view playback with captions is performed.
FIG. 13 is a flowchart explaining the flow of quick-view playback processing with captions.

(First Embodiment)
First, a first embodiment of the present invention will be described.
FIG. 1 is a block diagram showing an example of the configuration of a video content playback apparatus.
The video content playback apparatus 110 is connected to the distribution server 100 via the network 101 so that the two can communicate with each other.
The network communication unit 111 requests content from the distribution server 100 via the network 101. The network communication unit 111 also receives the content transmitted from the distribution server 100 via the network 101. The content received by the network communication unit 111 is separated into video data and caption data by the demultiplexer 112.

The video data separated by the demultiplexer 112 is decoded by the video decoder 113. The video decoder 113 decompresses video data compressed in a compression format such as MPEG-2 Video.
The caption data separated by the demultiplexer 112 is decoded by the caption decoder 114. The caption decoder 114 extracts the encoded caption text data from the caption data and decodes it. The caption decoder 114 also extracts the caption start time (the time at which display of the caption starts), the caption playback time, and the number of caption characters from the caption data, and outputs them to the quick-view playback control unit 116.
The video/caption blending unit 115 superimposes the caption text data output from the caption decoder 114 on the video data output from the video decoder 113, and outputs the video and caption together as video output. An image based on this video output is displayed on a display device connected to the video content playback apparatus 110.

The quick-view playback control unit 116 calculates the start time of the content to be acquired next, based on the caption start time and the caption playback time input from the caption decoder 114. The quick-view playback control unit 116 also controls quick-view playback based on the caption start time and the number of caption characters. The processing of the quick-view playback control unit 116 is described in detail with reference to FIG. 2 and the subsequent figures.

FIG. 2 is a diagram showing an example of the relationship between video data and caption data when content recorded on the distribution server 100 is played back normally (at 1x speed). Specifically, FIG. 2(a) conceptually shows an example of the playback timing of video data and caption data, and FIG. 2(b) shows an example of the data structure of the content.
FIG. 2(a) shows that caption data J0 is played back together with video data V0 to V1, caption data J1 together with video data V2 to V6, and caption data J2 together with video data V7 to V11.
In FIG. 2(b), the caption data to be played back together with video data is placed immediately before the corresponding video data block. Specifically, when video data V0 to V1 and caption data J0 are played back together as shown in FIG. 2(a), the data is arranged in the order caption data J0, video data V0, video data V1. The ordering of video data V2 to V6 and caption data J1, and of video data V7 to V11 and caption data J2, is determined in the same way.

FIG. 3 is a diagram showing an example of the data structure of caption data. FIG. 3(a) shows an example of the data structure format of caption data. As shown in FIG. 3(a), caption data has a caption start time 301, a caption playback time 302, and caption text data 303. FIGS. 3(b), 3(c), and 3(d) show specific examples of the data structures corresponding to caption data J0, J1, and J2 shown in FIG. 2, respectively. For example, FIG. 3(b) shows that the caption start time is 0:0:0 (hours:minutes:seconds; the same applies below), the caption playback time is 2 seconds, and the caption text data is 『客「ごめんください」』 (Customer: "Hello, is anyone there?"). The caption data shown in FIGS. 3(c) and 3(d) differs from the example in FIG. 3(b) only in its specific values and caption text data.
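The three fields of FIG. 3(a) can be pictured as a small record type. The following is a minimal sketch only: the class name, field names, and in-memory representation are assumptions made for illustration, and the actual encoded caption format is not specified in this description.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class CaptionData:
    """One caption record mirroring fields 301 to 303 of FIG. 3(a) (hypothetical names)."""

    start_time: timedelta     # caption start time 301
    playback_time: timedelta  # caption playback time 302
    text: str                 # caption text data 303

    @property
    def char_count(self) -> int:
        # Number of caption characters, as used for the quick-view playback time.
        return len(self.text)


# FIG. 3(b): caption J0 starts at 0:0:0 and is shown for 2 seconds.
j0 = CaptionData(timedelta(seconds=0), timedelta(seconds=2), "客「ごめんください」")
print(j0.char_count)
```

Counting the caption text of J0 character by character, including the brackets, gives 10, which matches the character count quoted for J0 in the discussion of FIG. 7(b).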

Next, an example of content acquisition processing in the video content playback apparatus 110 will be described with reference to the sequence diagram of FIG. 4.
The video content playback apparatus 110 requests from the distribution server 100 one second of content, from start time 0:0:0 to end time 0:0:1 (step S401). One second of content is requested because this embodiment takes as its example specifying the time in 1-second units with TimeSeekRange of the HTTP GET command when requesting content. The unit in which content is requested is not limited to this, however, and the same applies to the rest of the sequence.
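As an illustration of the request in step S401, the sketch below builds the header for a one-second time-seek request. The description only names "TimeSeekRange" and the 1-second unit; the full header name `TimeSeekRange.dlna.org` and the `npt=<start>-<end>` value syntax are assumptions based on the DLNA time-seek convention, and the function name is hypothetical.

```python
def time_seek_headers(start_s: int, length_s: int = 1) -> dict[str, str]:
    """Build HTTP GET headers asking for `length_s` seconds of content from `start_s`.

    Assumption: DLNA-style header name and npt=<start>-<end> value syntax.
    """
    if start_s < 0 or length_s <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end_s = start_s + length_s
    return {"TimeSeekRange.dlna.org": f"npt={start_s}-{end_s}"}


# Step S401: request the range 0:0:0 to 0:0:1.
print(time_seek_headers(0))
```

The returned dictionary can be passed as extra headers to whichever HTTP client is in use; no request is actually sent here.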

In response to the request made in step S401, the distribution server 100 transmits the content data corresponding to 0:0:0 through 0:0:1 to the video content playback apparatus 110 (step S402). FIG. 5(a) shows the content data received by the video content playback apparatus 110 in step S402. In the example of FIG. 5(a), the video content playback apparatus 110 receives caption data J0 and video data V0 as the one second of content data from 0:0:0 to 0:0:1. Here, video data Vn (n = 0, 1, ...) represents the video data played back in one second. For example, if one second of video consists of 30 frames, each video data Vn represents 30 frames of video data.

The video content playback apparatus 110 calculates the next caption start time from the received caption data J0. As shown in FIG. 3(b), the start time of caption data J0 is 0:0:0 and its playback time is 2 seconds, so the video content playback apparatus 110 calculates the next caption start time as 0:0:2 using equation (1) below.
Next caption start time = current caption start time + current caption playback time ... (1)
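Equation (1) can be checked with a few lines of code. This is a minimal sketch; the function names are hypothetical, and the start times and playback times are those given for captions J0, J1, and J2 in FIG. 3.

```python
from datetime import timedelta


def next_caption_start(current_start: timedelta, playback_time: timedelta) -> timedelta:
    """Equation (1): next start = current start + current playback time."""
    return current_start + playback_time


def format_hms(t: timedelta) -> str:
    """Render a time position in the h:m:s notation used in the description."""
    total = int(t.total_seconds())
    return f"{total // 3600}:{total % 3600 // 60}:{total % 60}"


# J0 (starts 0:0:0, 2 s), J1 (starts 0:0:2, 5 s), J2 (starts 0:0:7, 5 s).
start = timedelta(0)
for playback_seconds in (2, 5, 5):
    start = next_caption_start(start, timedelta(seconds=playback_seconds))
    print(format_hms(start))
```

Run as written, this prints 0:0:2, 0:0:7, and 0:0:12, the three next caption start times derived in the sequence of FIG. 4.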

Next, the video content playback apparatus 110 requests from the distribution server 100 one second of content, from the start time 0:0:2 calculated in step S402 to 0:0:3 (step S403). In response to the request made in step S403, the distribution server 100 transmits the content data corresponding to 0:0:2 through 0:0:3 to the video content playback apparatus 110 (step S404). FIG. 5(b) shows the content data received by the video content playback apparatus 110 in step S404. In the example of FIG. 5(b), the video content playback apparatus 110 receives caption data J1 and video data V2 as the one second of content data from 0:0:2 to 0:0:3. The video content playback apparatus 110 calculates the next caption start time from the received caption data J1. As shown in FIG. 3(c), the start time of caption data J1 is 0:0:2 and its playback time is 5 seconds, so the video content playback apparatus 110 calculates the next caption start time as 0:0:7 using equation (1).

Similarly, the video content playback apparatus 110 requests from the distribution server 100 one second of content, from the start time 0:0:7 calculated in step S404 to 0:0:8 (step S405). In response to the request made in step S405, the distribution server 100 transmits the content data corresponding to 0:0:7 through 0:0:8 to the video content playback apparatus 110 (step S406). FIG. 5(c) shows the content data received by the video content playback apparatus 110 in step S406. In the example of FIG. 5(c), the video content playback apparatus 110 receives caption data J2 and video data V7 as the one second of content data from 0:0:7 to 0:0:8. The video content playback apparatus 110 calculates the next caption start time from the received caption data J2. As shown in FIG. 3(d), the start time of caption data J2 is 0:0:7 and its playback time is 5 seconds, so the video content playback apparatus 110 calculates the next caption start time as 0:0:12 using equation (1).

Next, an example of the flow of content acquisition processing in the video content playback apparatus 110 will be described with reference to the flowchart of FIG. 6.
First, the quick-view playback control unit 116 requests one second of content from the start of the content from the distribution server 100 via the network communication unit 111 (step S601). That is, the quick-view playback control unit 116 requests from the distribution server 100 one second of content, from start time 0:0:0 to end time 0:0:1.
Next, the demultiplexer 112 separates the content that the network communication unit 111 received from the distribution server 100 into video data and caption data, and outputs the caption data to the caption decoder 114 (step S602).

Next, the caption decoder 114 extracts the caption start time, the caption playback time, and the number of caption characters from the caption data, and outputs them to the quick-view playback control unit 116 (step S603).
Next, the quick-view playback control unit 116 calculates the next caption start time based on the caption start time and the caption playback time. The quick-view playback control unit 116 then takes the next caption start time as the start time of the next content, and requests one second of content from that start time from the distribution server 100 via the network communication unit 111 (step S604). The processing of steps S602 to S604 is then repeated until the end of the content. When steps S602 to S604 have been performed up to the end of the content, the processing of the flowchart in FIG. 6 ends.
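The loop of FIG. 6 can be sketched as a generator that jumps from one caption start time to the next. This is an illustration under stated assumptions, not the apparatus itself: the `fetch` callback is a hypothetical stand-in for the network request, demultiplexing, and caption decoding, and the in-memory "server" and its placeholder caption texts exist only to make the example runnable.

```python
from typing import Callable, Iterator, Optional, Tuple

# (caption start in seconds, caption playback time in seconds, caption text)
Caption = Tuple[int, int, str]


def acquire_captions(
    fetch: Callable[[int, int], Optional[Caption]],
    request_len_s: int = 1,
) -> Iterator[Caption]:
    """Yield captions by repeating steps S601 to S604 of FIG. 6.

    `fetch(start, end)` is a hypothetical callback; it returns None once the
    requested range lies past the end of the content.
    """
    start = 0  # S601: the first request begins at 0:0:0
    while True:
        caption = fetch(start, start + request_len_s)  # request, then S602/S603
        if caption is None:  # end of content
            return
        yield caption
        cap_start, cap_playback, _text = caption
        if cap_playback <= 0:
            raise ValueError("caption playback time must be positive")
        start = cap_start + cap_playback  # S604: equation (1)


# Stand-in server holding the timing of captions J0 to J2 (texts are placeholders).
_CAPTIONS = {0: (0, 2, "J0"), 2: (2, 5, "J1"), 7: (7, 5, "J2")}


def fake_fetch(start: int, end: int) -> Optional[Caption]:
    return _CAPTIONS.get(start)


print([c[0] for c in acquire_captions(fake_fetch)])
```

With the stand-in data, the requests start at 0, 2, and 7 seconds, and the loop stops when the request at 12 seconds returns nothing.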

Next, an example of quick-view playback processing with captions in the video content playback apparatus 110 will be described. Quick-view playback with captions uses the acquired caption data (for example, J0) and video data (for example, V0), and plays back a predetermined frame included in the video data as a still image, together with the caption, for a time that depends on the number of caption characters. The playback time of this still image is calculated from the number of caption characters using equation (2) below.
Playback time (seconds) = INT((number of caption characters + 9) ÷ 10) ... (2)
Here, INT(n) is the integer part of n.
Equation (2) is only one example of how the still-image playback time may be determined; the playback time can also be calculated using other values or by referring to a table.
As the predetermined frame included in the video data, the frame whose PTS (Presentation Time Stamp) matches that of the caption data, or the first frame in the video data, is selected and determined as the frame to be used for the still image.
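Equation (2) amounts to rounding the character count up to the next multiple of ten and allowing one second per ten characters. A minimal sketch follows; the function name and the `chars_per_second` parameter are hypothetical, the latter added only to reflect the remark that other values may be used.

```python
def still_display_seconds(char_count: int, chars_per_second: int = 10) -> int:
    """Equation (2): INT((char_count + 9) / 10) when chars_per_second is 10.

    For non-negative integers this equals the ceiling of char_count / 10.
    """
    if char_count < 0 or chars_per_second <= 0:
        raise ValueError("char_count must be >= 0 and chars_per_second > 0")
    return (char_count + chars_per_second - 1) // chars_per_second


for n in (1, 10, 11, 20, 21):
    print(n, still_display_seconds(n))
```

A 10-character caption is therefore shown for 1 second, while an 11-character caption is shown for 2 seconds. Note that a caption with zero characters would get a display time of zero under this formula.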

FIG. 7 is a diagram showing an example of the relationship between video data and caption data when quick-view playback with captions is performed. Specifically, FIG. 7(a) conceptually shows an example of the playback timing of video data and caption data, and FIG. 7(b) shows the playback timing of video data and caption data in table form.
FIG. 7(a) shows that frame F0 of video data V0 is played back as a still image together with caption data J0. Likewise, frame F0 of video data V2 is played back as a still image with caption data J1, and frame F0 of video data V7 with caption data J2. Here, frame F0 denotes the first frame in video data Vn (n = 0, 1, ...).

FIG. 7(b) shows that frame F0 of video data V0 is played back as a still image together with caption J0, and that the playback time is 1 second because caption J0 has 10 characters (see equation (2)). Likewise, for caption J1 (J2), frame F0 of video data V2 (V7) is played back as a still image for 3 seconds (2 seconds).

Next, an example of the flow of quick-view playback processing with captions in the video content playback apparatus 110 will be described with reference to the flowchart of FIG. 8.
The quick-view playback control unit 116 acquires the caption start time, the caption playback time, and the number of caption characters from the caption decoder 114 (step S801), and calculates the quick-view playback time of the content (caption) from the acquired number of caption characters using equation (2) (step S802).
Next, the quick-view playback control unit 116 instructs the video decoder 113 to decode a predetermined frame of the video data (for example, frame F0 of video data V0) and output it as a still image (step S803). The quick-view playback control unit 116 then instructs the caption decoder 114 to decode the caption data (for example, J0) and output it as a caption (step S804). The video/caption blending unit 115 superimposes the caption text data output from the caption decoder 114 in step S804 on the video data output from the video decoder 113 in step S803, and outputs the result as video.

Next, the quick-view playback control unit 116 determines whether the quick-view playback time has elapsed (step S805). If the quick-view playback time has elapsed, the process proceeds to step S809, described later; if not, the process proceeds to step S806.
In step S806, the quick-view playback control unit 116 determines whether the caption data of the next content has already been acquired by the caption decoder 114. If the caption data of the next content has already been acquired, the process returns to step S805; if not, the process proceeds to step S807.

The quick-view playback control unit 116 acquires the next caption start time, caption playback time, and number of caption characters from the caption decoder 114 (step S807), and calculates the quick-view playback time of the next content from the acquired number of caption characters using equation (2) (step S808). The process then returns to step S805.
If the determination in step S805 is that the quick-view playback time of the content has elapsed, the process proceeds to step S809. The quick-view playback control unit 116 then determines whether the caption data of the next content has already been acquired (step S809). If the caption data of the next content has already been acquired, the quick-view playback time calculated in step S808 is used as the quick-view playback time checked in step S805, and the process returns to step S803. If, on the other hand, the quick-view playback time of the content has elapsed but the caption data of the next content has not been acquired (No in step S809), the quick-view playback control unit 116 determines that the content has ended and ends the processing of the flowchart in FIG. 8.
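The overall effect of the FIG. 8 flow can be sketched as a simple sequential loop. This is a deliberately simplified illustration: it does not model the polling in steps S805 to S808 that prefetches the next caption while the current still image is on screen, and the function name, the `show` and `wait` callbacks, the frame labels, and the caption texts are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def quick_view_play(
    items: Iterable[Tuple[str, str]],
    show: Callable[[str, str], None],
    wait: Callable[[int], None],
) -> None:
    """Show each (frame, caption text) pair as a still image for its computed time."""
    for frame_id, text in items:
        seconds = (len(text) + 9) // 10  # S802 / S808: equation (2)
        show(frame_id, text)             # S803 and S804: still frame plus caption
        wait(seconds)                    # S805: hold until the time has elapsed
    # Leaving the loop corresponds to "No" in S809: no next caption, content ended.


# Placeholder texts whose lengths (10, 25, 15) give the 1 s, 3 s, and 2 s
# display times listed for J0, J1, and J2; the real caption texts differ.
log: List[Tuple[str, object]] = []
quick_view_play(
    [("V0/F0", "a" * 10), ("V2/F0", "b" * 25), ("V7/F0", "c" * 15)],
    show=lambda frame, text: log.append(("show", frame)),
    wait=lambda seconds: log.append(("wait", seconds)),
)
print(log)
```

Injecting `show` and `wait` keeps the sketch testable without a display or real delays; a real player would pass a rendering call and a timer instead.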

As described above, in this embodiment the video content playback apparatus 110 specifies a caption start time, which is an example of a time position in the content, and one second, which is an example of a playback time range. The video content playback apparatus 110 acquires the content in the specified range, including caption data and video data, from the distribution server 100, which is an example of an external device. The video content playback apparatus 110 calculates the caption start time of the content to be acquired next based on the caption start time 301 and the playback time 302 of the caption data included in the content. The video content playback apparatus 110 then specifies that caption start time and one second, and acquires the next content from the distribution server 100. This processing is repeated until the content ends. The video content playback apparatus 110 extracts the number of caption characters from the caption data and, based on that number, calculates the quick-view playback time of the video frame in the video data that corresponds to the caption of that caption data. The video content playback apparatus 110 displays the caption data of the acquired content and the video frame corresponding to its caption as a still image for the calculated quick-view playback time.
It is therefore possible to acquire from the distribution server 100 the desired content, chosen with the captions taken into account and in step with the caption playback timing, and to perform quick-view playback with captions.

(Second Embodiment)
Next, a second embodiment of the present invention will be described. In the first embodiment, the description took as an example the case where, when acquiring content from the distribution server 100, the apparatus requests content that includes the video frame corresponding to the caption start time, determines the still-image presentation time (quick-view playback time) according to the number of caption characters, and performs quick-view playback. Depending on the content, however, the number of caption characters may be small even for a long scene, such as a scene in which music is playing. In such cases, presenting intermediate scenes as still images according to the length of the original scene, and not only the number of caption characters, makes the content easier to follow during quick-view playback. The present embodiment thus differs from the first embodiment mainly in the processing that results from displaying a different number of frames (still images) for one caption during quick-view playback. In the description of this embodiment, parts identical to those of the first embodiment are therefore given the same reference numerals as in FIGS. 1 to 8, and their detailed description is omitted.

FIG. 9 is a block diagram showing an example of the configuration of the video content playback apparatus 901. Compared with the video content playback apparatus 110 shown in FIG. 1, the video content playback apparatus 901 additionally includes a network bandwidth measurement unit 904 and a bit rate calculation unit 905, and part of the processing of the quick-view playback control unit 906 differs.
The network bandwidth measurement unit 904 measures the network bandwidth when content is received from the distribution server 100 and notifies the quick-view playback control unit 906 of the result. The bit rate calculation unit 905 calculates the bit rate of the content. For example, for MPEG-2 TS content, the bit rate can be calculated from the PCR values. The quick-view playback control unit 906 issues content acquisition requests via the network communication unit 111. Each request is based on the subtitle start time and subtitle playback time obtained from the subtitle data of the content, the network bandwidth obtained from the network bandwidth measurement unit 904, and the bit rate obtained from the bit rate calculation unit 905.
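As a rough illustration of the PCR-based calculation mentioned above, the sketch below estimates the bit rate from two PCR samples and the number of transport stream packets between them. It relies on the MPEG-2 Systems definitions (188-byte packets, a 27 MHz PCR made of a 33-bit base and a 9-bit extension); the function names are hypothetical, and the description does not say how the bit rate calculation unit 905 actually does this.

```python
PCR_CLOCK_HZ = 27_000_000        # MPEG-2 system clock frequency
TS_PACKET_BYTES = 188            # size of one transport stream packet
PCR_MODULO = (1 << 33) * 300     # 33-bit base (90 kHz) times 300, then wraps


def pcr_ticks(base: int, extension: int) -> int:
    """Combine the PCR base and extension fields into 27 MHz ticks."""
    return base * 300 + extension


def bitrate_from_pcr(pcr_first: int, pcr_second: int,
                     packets_between: int) -> float:
    """Estimate the bit rate in bits per second.

    packets_between is the number of TS packets from the packet that
    carries pcr_first up to the packet that carries pcr_second.
    """
    delta = (pcr_second - pcr_first) % PCR_MODULO  # tolerate one wrap-around
    if delta == 0:
        raise ValueError("PCR samples must differ")
    return packets_between * TS_PACKET_BYTES * 8 * PCR_CLOCK_HZ / delta


# 1000 packets between two PCRs that are 0.1 s (2,700,000 ticks) apart.
rate = bitrate_from_pcr(pcr_ticks(0, 0), pcr_ticks(9_000, 0), 1000)
```

Here 1000 packets of 188 bytes in 0.1 seconds correspond to 15,040,000 bits per second.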

Next, an example of the flow of content request processing in the video content playback apparatus 901 will be described with reference to the flowchart of FIG. 10.
First, the quick-view playback control unit 906 requests, from the distribution server 100 via the network communication unit 111, content that includes the video frame corresponding to the beginning of a subtitle (step S1001). The quick-view playback control unit 906 then acquires the content corresponding to the request from the distribution server 100 via the network communication unit 111.
Next, the quick-view playback control unit 906 compares the bit rate of the content acquired from the distribution server 100 with the network bandwidth, and determines whether the bit rate of the content is less than twice the network bandwidth (step S1002). If the bit rate of the content is less than twice the network bandwidth, the processing of the flowchart of FIG. 10 ends. If the bit rate of the content is twice the network bandwidth or more, the process proceeds to step S1003.

The quick-view playback control unit 906 determines whether the bit rate of the content is less than four times the network bandwidth (step S1003). If the bit rate of the content is four times the network bandwidth or more, the process proceeds to step S1006, described later. If the bit rate of the content is less than four times the network bandwidth, the process proceeds to step S1004.
The quick-view playback control unit 906 determines whether the subtitle playback time obtained from the subtitle decoder 114 is 8 seconds or more (step S1004). If the subtitle playback time is less than 8 seconds, the processing of the flowchart of FIG. 10 ends. If the subtitle playback time is 8 seconds or more, the process proceeds to step S1005.

The quick-view playback control unit 906 requests the distribution server 100, via the network communication unit 111, to provide intermediate content lying between the start and the end of subtitle playback, and ends the processing of the flowchart of FIG. 10 (step S1005). The quick-view playback control unit 906 obtains the number of still images to be played during playback of one subtitle, and the interval between them, from equations (3) and (4) below.
Number of still images = INT(subtitle playback time ÷ 4) ... (3)
Still image interval = INT(subtitle playback time ÷ number of still images) ... (4)
For example, when the subtitle playback time is 10 seconds, equations (3) and (4) give
Number of still images = INT(10 ÷ 4) = 2
Still image interval = INT(10 ÷ 2) = 5
As a result, the quick-view playback control unit 906 acquires, as intermediate content, the content at 5 seconds after the still image start time (the time at which still image display starts), once. The number of still images, 2, includes the first frame, so the intermediate content is acquired only once.
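Equations (3) and (4) can be checked with a short sketch. The function name `still_plan` and its `divisor` parameter are hypothetical, and INT is assumed to truncate toward zero, which for non-negative whole seconds equals Python's floor division.

```python
from typing import List, Tuple


def still_plan(playback_time: int, divisor: int) -> Tuple[int, int, List[int]]:
    """Return (number of stills, interval, intermediate fetch offsets).

    Offsets are seconds after the still image start time. The first
    still is the frame already fetched at the subtitle start, so only
    count - 1 intermediate fetches are needed.
    """
    count = playback_time // divisor          # equation (3)
    if count == 0:
        return 0, 0, []
    interval = playback_time // count         # equation (4)
    offsets = [interval * k for k in range(1, count)]
    return count, interval, offsets


count, interval, offsets = still_plan(10, 4)
```

For a 10-second subtitle this yields two still images, a 5-second interval, and a single intermediate fetch at offset 5, in agreement with the worked example.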

As described above, when the determination in step S1003 finds that the bit rate of the content is four times the network bandwidth or more, the process proceeds to step S1006. The quick-view playback control unit 906 determines whether the subtitle playback time obtained from the subtitle decoder 114 is 4 seconds or more (step S1006). If the subtitle playback time is less than 4 seconds, the processing of the flowchart of FIG. 10 ends. If it is 4 seconds or more, the process proceeds to step S1007. The quick-view playback control unit 906 then requests the distribution server 100, via the network communication unit 111, to provide intermediate content lying between the start and the end of subtitle playback, and ends the processing of the flowchart of FIG. 10 (step S1007). In step S1007, as in step S1005, the number of still images to be played during playback of one subtitle, and the interval between them, are obtained from equations (5) and (6) below.

Number of still images = INT(subtitle playback time ÷ 2) ... (5)
Still image interval = INT(subtitle playback time ÷ number of still images) ... (6)
For example, when the subtitle playback time is 10 seconds, equations (5) and (6) give
Number of still images = INT(10 ÷ 2) = 5
Still image interval = INT(10 ÷ 5) = 2
As a result, the quick-view playback control unit 906 acquires, as intermediate content, the content at every 2 seconds from the still image start time, four times in total.
Taking the subtitle data J1 and J2 shown in FIGS. 3(c) and 3(d) as an example, the subtitle playback time 302 is 5 seconds, so when the bit rate of the content is four times the network bandwidth or more, the processing of step S1007 is performed. Equations (5) and (6) then give the number of still images to be played during playback of one subtitle, and the interval between them, as
Number of still images = INT(5 ÷ 2) = 2
Still image interval = INT(5 ÷ 2) = 2
Therefore, for subtitle J1, the content at 0:0:4 is acquired in addition to the content at the subtitle playback time 0:0:2, and for subtitle J2, the content at 0:0:9 is acquired in addition to the content at the subtitle playback time 0:0:7.
The description so far has used, as examples, thresholds of twice and four times the network bandwidth for the bit rate of the content, and still image counts of one quarter and one half of the subtitle playback time. These figures are for illustration only, and the invention is not limited to them.
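The branching of steps S1002 to S1007 can be condensed into one small function. This is a sketch that follows the thresholds exactly as the text states them (twice and four times the network bandwidth, 8 and 4 seconds); the function name `select_divisor` is hypothetical, and its return value is the divisor that the text uses in equation (3) or (5), or `None` when no intermediate content is requested.

```python
from typing import Optional


def select_divisor(bit_rate: float, bandwidth: float,
                   playback_time: float) -> Optional[int]:
    """Pick the divisor for the still image count, following FIG. 10.

    Returns 4 (equation (3)), 2 (equation (5)), or None when the
    flowchart ends without requesting intermediate content.
    """
    if bit_rate < 2 * bandwidth:              # step S1002
        return None
    if bit_rate < 4 * bandwidth:              # step S1003
        # Step S1004: only subtitles of 8 seconds or more qualify.
        return 4 if playback_time >= 8 else None
    # Step S1006: only subtitles of 4 seconds or more qualify.
    return 2 if playback_time >= 4 else None


# J1 and J2 have a 5-second playback time; with the bit rate at four
# times the bandwidth, step S1007 applies and equation (5) is used.
divisor_j1 = select_divisor(bit_rate=8.0, bandwidth=2.0, playback_time=5)
```

Both rates must be in the same unit (for example, bits per second). Because the text notes that the thresholds are only examples, a fuller version would probably take them as parameters.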

FIG. 11 is a sequence diagram illustrating an example of content acquisition processing in the video content playback apparatus 901.
The sequence shown in FIG. 11 is the sequence shown in FIG. 4 with steps S1101 and S1102, which acquire the content at 0:0:4, added between steps S404 and S405, and with steps S1103 and S1104, which acquire the content at 0:0:9, added after step S406.

FIG. 12 is a diagram showing, in table form, an example of the playback timing of video data and subtitle data during quick-view playback with subtitles.
The playback times of the subtitles J0, J1, and J2 are the same as those shown in FIG. 7(b) for the first embodiment. In the first embodiment, however, one video frame is displayed for one subtitle, whereas in the present embodiment, as described above, a plurality of video frames can be played back as still images for one subtitle. For subtitle J1, frame F0 of video data V4 is displayed in addition to frame F0 of video data V2, so each video frame is played for 1.5 seconds of the 3-second playback time of subtitle J1. Likewise, for subtitle J2, frame F0 of video data V2 and frame F0 of video data V9 are each played for 1 second of the 2-second playback time of subtitle J2.

Next, an example of the flow of quick-view playback processing with subtitles in the video content playback apparatus 901 will be described with reference to the flowchart of FIG. 13.
The quick-view playback control unit 906 obtains the subtitle start time, the subtitle playback time, and the number of subtitle characters from the subtitle decoder 114 (step S1301) and, based on the obtained number of subtitle characters, calculates the quick-view playback time of the content (subtitle) using equation (2) (step S1302).
Next, the quick-view playback control unit 906 calculates the number of still images to be played during playback of one subtitle using equation (3) or equation (5) (step S1303). This calculation is based on the subtitle playback time obtained from the subtitle decoder 114, the network bandwidth obtained from the network bandwidth measurement unit 904, and the bit rate obtained from the bit rate calculation unit 905.
Next, based on the quick-view playback time of the subtitle calculated in step S1302 and the number of still images calculated in step S1303, the quick-view playback control unit 906 calculates the playback time of each still image to be played during playback of one subtitle using equation (7) below (step S1304).
Playback time of each still image = quick-view playback time of subtitle ÷ number of still images ... (7)
For example, for subtitle J1 shown in FIG. 12, equation (7) gives
Playback time of each still image = 3 ÷ 2 = 1.5

Next, the quick-view playback control unit 906 instructs the subtitle decoder 114 to decode the subtitle data (for example, J1) and output it as a subtitle (step S1305). The quick-view playback control unit 906 then instructs the video decoder 113 to decode a predetermined frame of the video data (for example, frame F0 of video data V2) and output it as a still image (step S1306). The video/subtitle blending unit 115 superimposes the subtitle text data output from the subtitle decoder 114 in step S1305 on the video data output from the video decoder 113 in step S1306, and outputs the result as video.

Next, the quick-view playback control unit 906 waits until the still image playback time calculated in step S1304 has elapsed (step S1307). When the still image playback time has elapsed, the quick-view playback control unit 906 determines whether the quick-view playback time of the subtitle calculated in step S1302 has elapsed (step S1308). If the quick-view playback time of the subtitle has not elapsed, the process returns to step S1306. There, the quick-view playback control unit 906 instructs the video decoder 113 to decode a predetermined frame of the next video data (for example, frame F0 of video data V4) and output it as a still image (step S1306).
If the quick-view playback time of the subtitle calculated in step S1302 has elapsed, the processing of the flowchart of FIG. 13 ends.
To process the next subtitle data, the flowchart of FIG. 13 may, for example, be executed again.
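The timing produced by steps S1304 to S1308 can be sketched as a schedule rather than a real-time loop. The function name `still_schedule` and the string frame labels are hypothetical, and the quick-view playback time is passed in directly because equation (2) is defined elsewhere in the description.

```python
from typing import List, Tuple


def still_schedule(quick_view_time: float,
                   frames: List[str]) -> List[Tuple[str, float, float]]:
    """Return (frame, start offset, duration) for one subtitle.

    Each still image gets an equal share of the subtitle's quick-view
    playback time, as in equation (7).
    """
    if not frames:
        return []
    each = quick_view_time / len(frames)      # equation (7), step S1304
    schedule = []
    elapsed = 0.0
    for frame in frames:                      # step S1306: show next still
        schedule.append((frame, elapsed, each))
        elapsed += each                       # step S1307: wait for it
        if elapsed >= quick_view_time:        # step S1308: subtitle done?
            break
    return schedule


# Subtitle J1 from FIG. 12: 3 seconds shared by two still images.
j1 = still_schedule(3.0, ["V2/F0", "V4/F0"])
```

For J1 this gives two entries of 1.5 seconds each, starting at offsets 0.0 and 1.5. A player would replace the accumulated `elapsed` value with a real timer and keep the subtitle from step S1305 on screen for the whole duration.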

As described above, in the present embodiment, intermediate content within the playback period of a subtitle is acquired according to the bit rate of the content and the network bandwidth, and is played back together with the subtitle. Therefore, not just the single video frame corresponding to a subtitle but a plurality of video frames can be shown in quick-view playback, which makes the content of quick-view playback with subtitles easier to follow.

Note that, for example, executing the processing of steps S601, S604, S1001, S1005, and S1007 realizes an example of the requesting step and the acquisition step, and executing the processing of steps S604, S1005, and S1007 realizes an example of the determination step. Executing the processing of steps S802, S808, and S1302 realizes an example of the video output time calculation step, and executing the processing of steps S803, S804, S1305, and S1306 realizes an example of the output step. Executing the processing of steps S1002 and S1003 realizes an example of the comparison step, and executing the processing of steps S1005 and S1007 realizes an example of the video output calculation step.

The embodiments described above are merely specific examples of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on their account. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

(Other Embodiments)
The present invention is also realized by executing the following processing. First, software (a computer program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or any of various storage media. A computer (or a CPU, MPU, or the like) of that system or apparatus then reads and executes the computer program.

100 Distribution server; 110, 901 Video content playback apparatus

Claims

1. A video content playback apparatus that acquires content including video data and subtitle data from an external device via a network and plays back the content, comprising:
request means for designating a time position in the content and a playback time range from the time position, and requesting the external device to provide the content in the designated range;
acquisition means for acquiring the content requested by the request means from the external device via the network;
output means for outputting as video, as a still image, a subtitle based on the subtitle data included in the content acquired by the acquisition means together with, among the video frames obtained from the video data included in that content, the video frame corresponding to the subtitle; and
determination means for determining the time position of the content to be acquired next, based on the subtitle display start timing and the subtitle playback time obtained from the subtitle data included in the content acquired by the acquisition means.

2. The video content playback apparatus according to claim 1, further comprising video output time calculation means for calculating a video output time based on the number of subtitle characters obtained from the subtitle data included in the content acquired by the acquisition means,
wherein the output means outputs the still image as video for the time calculated by the video output time calculation means.

3. The video content playback apparatus according to claim 1, wherein the time position in the content includes a time at which subtitle display in the content starts.

4. The video content playback apparatus according to claim 1, further comprising video output calculation means for calculating, based on the subtitle playback time obtained from the subtitle data included in the content acquired by the acquisition means, the number of still images to be output as video when one subtitle is played back and the interval at which the still images are output as video,
wherein the determination means determines the time position of the content to be acquired next based on the subtitle display start timing obtained from the subtitle data included in the content acquired by the acquisition means and on the number and interval of still images calculated by the video output calculation means.

5. The video content playback apparatus according to claim 4, further comprising comparison means for comparing the bit rate of the content acquired by the acquisition means with the bandwidth of the network,
wherein the video output calculation means calculates the number and interval of the still images according to the result of the comparison by the comparison means.

6. The video content playback apparatus according to claim 4 or 5, wherein the video output calculation means calculates the number and interval of the still images when the subtitle playback time obtained from the subtitle data included in the content acquired by the acquisition means is longer than a predetermined time.

7. A video content playback method for acquiring content including video data and subtitle data from an external device via a network and playing back the content, comprising:
a requesting step of designating a time position in the content and a playback time range from the time position, and requesting the external device to provide the content in the designated range;
an acquisition step of acquiring the content requested in the requesting step from the external device via the network;
an output step of outputting as video, as a still image, a subtitle based on the subtitle data included in the content acquired in the acquisition step together with, among the video frames obtained from the video data included in that content, the video frame corresponding to the subtitle; and
a determination step of determining the time position of the content to be acquired next, based on the subtitle display start timing and the subtitle playback time obtained from the subtitle data included in the content acquired in the acquisition step.

8. The video content playback method according to claim 7, further comprising a video output time calculation step of calculating a video output time based on the number of subtitle characters obtained from the subtitle data included in the content acquired in the acquisition step,
wherein, in the output step, the still image is output as video for the time calculated in the video output time calculation step.

9. The video content playback method according to claim 7, wherein the time position in the content includes a time at which subtitle display in the content starts.

10. The video content playback method according to claim 7, further comprising a video output calculation step of calculating, based on the subtitle playback time obtained from the subtitle data included in the content acquired in the acquisition step, the number of still images to be output as video when one subtitle is played back and the interval at which the still images are output as video,
wherein, in the determination step, the time position of the content to be acquired next is determined based on the subtitle display start timing obtained from the subtitle data included in the content acquired in the acquisition step and on the number and interval of still images calculated in the video output calculation step.

11. The video content playback method according to claim 10, further comprising a comparison step of comparing the bit rate of the content acquired in the acquisition step with the bandwidth of the network,
wherein, in the video output calculation step, the number and interval of the still images are calculated according to the result of the comparison in the comparison step.

12. The video content playback method according to claim 10 or 11, wherein, in the video output calculation step, the number and interval of the still images are calculated when the subtitle playback time obtained from the subtitle data included in the content acquired in the acquisition step is longer than a predetermined time.

13. A computer program for acquiring content including video data and subtitle data from an external device via a network and playing back the content, the computer program causing a computer to execute:
a requesting step of designating a time position in the content and a playback time range from the time position, and requesting the external device to provide the content in the designated range;
an acquisition step of acquiring the content requested in the requesting step from the external device via the network;
an output step of outputting as video, as a still image, a subtitle based on the subtitle data included in the content acquired in the acquisition step together with, among the video frames obtained from the video data included in that content, the video frame corresponding to the subtitle; and
a determination step of determining the time position of the content to be acquired next, based on the subtitle display start timing and the subtitle playback time obtained from the subtitle data included in the content acquired in the acquisition step.