JP3979566B2

JP3979566B2 - Time-varying text information segmentation device with moving images

Info

Publication number: JP3979566B2
Application number: JP2001326227A
Authority: JP
Inventors: 茂之酒澤; 悟史宮地; 幸一高木; 泰利渡辺; 康弘滝嶋; 正裕和田
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2001-10-24
Filing date: 2001-10-24
Publication date: 2007-09-19
Anticipated expiration: 2021-10-24
Also published as: JP2003134477A

Description

[0001]
[Technical field of the invention]
The present invention relates to a time-varying text information dividing device with moving images, and in particular to one that, when dividing content composed of moving images, audio, and character information that changes over time in synchronization with them (time-varying text), can divide the time-varying text without causing the viewer a sense of incongruity.
[0002]
[Prior art]
The SMIL language exists as a format for expressing moving images, audio, and time-varying text (that is, telop (a product name)). Here, time-varying text means character information that changes with time; for example, character information that flows from the left to the right of the screen.
[0003]
These formats presuppose that content, once created, is distributed as is, and no content division technique that divides time-varying text while taking into account its relationship with the moving images and audio has been proposed.
[0004]
In the case of a live broadcast or the like, a user who starts viewing partway through begins receiving from the head of the divided data, so that head data must be in a format that can be decoded on its own. For example, in the case of moving image data, the head data must be the head of a key frame (an intra-frame-coded image frame). Audio data must be divided at a boundary between audio frames that can each be decoded on their own. Time-varying text is represented as chunks of data, each decodable on its own, and each chunk has a temporal extent. For example, when an actor's lines in a drama are expressed as time-varying text, the lines the actor speaks in one breath form one chunk, and those lines are displayed only while the actor is speaking.
[0005]
[Problems to be solved by the invention]
However, a receiving device with limited channel capacity and storage capacity, such as a mobile phone, cannot download the entire content at once; it must download and play the content in subdivided pieces. In addition, as described above, in the case of a live broadcast or the like, a user who starts viewing partway through begins receiving from the head of the divided data.
[0006]
When content is divided in such cases, the division has been performed only by focusing on the key frames of the moving image (intra-frame-coded image frames), that is, by dividing at the head of a key frame. As a result, the relationship between the moving image and the time-varying text at the division point is undefined, and the viewer feels a sense of incongruity.
[0007]
More specifically, when content is divided at a moving-image key frame, audio data can be divided at intervals shorter than one key frame period (for example, one GOP period), so the difference between the time at which its head data should be output and the display time of the moving image is sufficiently small. Time-varying text, however, can be divided only at intervals considerably longer than a moving image frame, so the time at which the time-varying text should be displayed and the display time of the first moving-image frame diverge greatly. For example, even though a moving image of an actor speaking a line is displayed, the dialogue data may be attached to the earlier segment, leaving no dialogue data in the later segment, so that the line is not displayed.
[0008]
The present invention has been made in view of the above-described prior art, and its object is to provide a time-varying text information dividing device with moving images that can divide time-varying text without causing the viewer a sense of incongruity.
[0009]
[Means for Solving the Problems]
To achieve the above object, the present invention is a time-varying text information dividing device with moving images that divides the time-varying text of content composed of moving images, audio, and character information that changes over time in synchronization with them (hereinafter referred to as time-varying text). The device comprises dividing means that divides the time-varying text in synchronization with the head of a moving-image key frame serving as the division point. The dividing means extracts the time-varying text displayed at the time of the division point, expands it, in units of a certain time resolution, into static time-varying text that involves no temporal change, and divides the time-varying text by delimiting the start time and end time of the static time-varying text at the time of the division point.
[0011]
According to the above feature, the time-varying text can be divided in synchronization with the dividing point of the moving image, and the time-varying text can be divided without a sense of discomfort to the viewer.
[0012]
[Embodiments of the invention]
Hereinafter, the present invention will be described in detail with reference to the drawings. FIG. 1 is a block diagram showing a schematic configuration of a moving image reproducing apparatus including the present invention.
[0013]
As shown in the figure, when a content file with time-varying text is input to the audio / video / time-varying text separating unit 1, the separating unit 1 separates the content file into audio data, moving image data, and time-varying text data. The separated audio data 2a, moving image data 3a, and time-varying text data 4a are input to the division processing unit 5. When there is a division processing instruction 7, they undergo the division processing according to the present invention; when there is no such instruction, they pass through unchanged. In either case they are sent to the audio / video / time-varying text multiplexing unit 6. The multiplexing unit 6 multiplexes the audio, video, and time-varying text and outputs them in the same format as the original content file with time-varying text.
[0014]
FIG. 2 is a conceptual diagram of the audio data 2a, the moving image data 3a, and the time-varying text data 4a.
[0015]
As shown in the figure, the audio data 2a is composed of the coded frames A1, A2, A3, ... produced at encoding time.
[0016]
The moving image data 3a is composed of an image sequence, and key frames (intra-frame-coded image frames) are periodically inserted into the image sequence. Reference numerals 11 to 14 in the figure indicate the heads of the key frames. The period K from one key frame to the next is the key frame period (one GOP period). Note that the key frames are inserted by a moving image coding apparatus (not shown).
[0017]
Further, the time-varying text data 4a is composed of a sequence of data chunks (for example, one chunk per spoken line) T1, T2, T3, .... As is apparent from the figure, the time-varying text data chunks T1, T2, ... have a larger temporal extent than the moving image data and audio data. In addition, since time-varying text is inserted according to the semantic content of the moving image (for example, the dialogue in a drama), the boundaries of the time-varying text data do not necessarily coincide with the key frame positions 11, 12, 13, ... of the moving image data.
[0018]
In time-varying text, various modifications are applied to the character information. The modifications include (1) those with temporal behavior and (2) those without. Modifications with temporal behavior include scroll, blink, and wipe. Scroll makes a character string flow across the display, blink makes the character string flash on and off, and wipe changes the character color progressively along the character string (for example, karaoke lyrics). Modifications without temporal behavior include character color, background color, and underlining.
[0019]
Next, the configuration and operation of the division processing unit 5 that is the main part of the present invention will be described. FIG. 3 is a block diagram showing a configuration of the time-varying text information dividing device with moving images in the division processing unit 5. The moving image-attached time-varying text information dividing device 20 includes a divided time-varying text searcher 21 and a character modification divider 22.
[0020]
When the division instruction signal 7 is input at an arbitrary time T', the division processing unit 5 divides the audio data 2a, the moving image data 3a, and the time-varying text data 4a at the division point T shown in FIG. 2. The data at the division point of the moving image data 3a must be in a format that can be decoded on its own, so the data must be divided such that its head data is a key frame (an intra-frame-coded image frame). For this reason, the division point T is aligned with the head of a key frame.
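As an illustration of aligning the division point with a key frame, the following minimal sketch picks a key frame start time for a requested time T'. The text does not say whether T is taken before or after T'; choosing the first key frame at or after T' is an assumption made here, and the function name and sample times are hypothetical.

```python
from bisect import bisect_left

def align_to_key_frame(t_request, key_frame_times):
    """Return the first key frame start time at or after t_request.

    key_frame_times must be sorted in ascending order.
    Assumption: the division point is moved forward to the next key frame;
    the source only states that it coincides with the head of a key frame.
    Returns None when no key frame starts at or after t_request.
    """
    idx = bisect_left(key_frame_times, t_request)
    if idx == len(key_frame_times):
        return None
    return key_frame_times[idx]

# Hypothetical key frame heads (seconds), one every 0.5 s.
key_frames = [0.0, 0.5, 1.0, 1.5]
division_point = align_to_key_frame(0.7, key_frames)
```

A design that instead snaps backward to the preceding key frame would use `bisect_right` and step one entry back; either choice satisfies the requirement that the head data be a key frame.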
[0021]
Since the division point T falls in the middle of the coded frame A2, the audio data 2a is divided at the boundary between the coded frames A1 and A2; A1 is stored in the pre-division file and A2 in the post-division file.
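The audio rule can be sketched as follows: a coded frame that contains the division point goes whole into the later file, so the cut lands on the frame boundary just before T. The tuple layout, frame times, and function name are assumptions for illustration only.

```python
def split_audio_frames(frames, t_split):
    """Split coded audio frames at the frame boundary at or before t_split.

    frames: list of (start, end, name) tuples sorted by start time.
    A frame that ends at or before t_split goes to the pre-division file;
    any frame still running at t_split goes whole to the post-division file.
    """
    before = [f for f in frames if f[1] <= t_split]
    after = [f for f in frames if f[1] > t_split]
    return before, after

# Hypothetical frame times (seconds); T = 1.4 falls in the middle of A2.
audio = [(0.0, 1.0, "A1"), (1.0, 2.0, "A2"), (2.0, 3.0, "A3")]
pre_file, post_file = split_audio_frames(audio, 1.4)
```

Here `pre_file` holds A1 and `post_file` starts with A2, matching the example in the text.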
[0022]
Next, regarding the time-varying text data 4a, when it is divided in two within the data chunk T2, as shown for example in FIG. 2, it must be complemented so that consistency is maintained before and after the division. The modification of the character information must also keep its continuity across the division, and character information must not be lost even when reception first begins at the division point. In this embodiment, therefore, the division processing is performed as follows.
[0023]
The time-varying text information dividing device 20 with moving images includes a divided time-varying text searcher 21 that searches for the time-varying text to be divided, and a character modification divider 22 that divides the time-varying text and its modification.
[0024]
First, the operation of the divided time-varying text searcher 21 will be described with reference to the flowchart of FIG. 4. As a premise, let the division time be T, and let the start time and end time of each chunk T1, T2, ... of the time-varying text data 4a be ti and ti', respectively. In step S1, the index i identifying a chunk of time-varying text data is set to 0, and in step S2, the chunk Ti of the time-varying text data 4a is read. In step S3, it is determined whether the division time T satisfies ti < T < ti'. If not, the process proceeds to step S4, where it is determined whether all of the time-varying text data has been examined. If not, the process proceeds to step S5, where 1 is added to i, and then returns to step S2 to repeat the above processing. When the determination in step S3 is affirmative, the data chunk Ti is the division target, so the process proceeds to step S6, where the chunk is held as the time-varying text to be processed.
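The search in steps S1 to S6 can be sketched as a simple linear scan. The `Chunk` type, its field names, and the sample data are assumptions introduced for illustration; only the condition ti < T < ti' and the loop order come from the description.

```python
from typing import NamedTuple, Optional, Sequence

class Chunk(NamedTuple):
    start: float  # ti
    end: float    # ti'
    text: str

def find_chunk_to_split(chunks: Sequence[Chunk], t_split: float) -> Optional[Chunk]:
    """Return the chunk whose display interval strictly contains t_split."""
    i = 0                                      # S1: start with the first chunk
    while i < len(chunks):                     # S4: stop once every chunk is examined
        chunk = chunks[i]                      # S2: read chunk Ti
        if chunk.start < t_split < chunk.end:  # S3: ti < T < ti'
            return chunk                       # S6: hold it as the text to process
        i += 1                                 # S5: move to the next chunk
    return None  # T falls between chunks or on a boundary: nothing to split

# Hypothetical chunk times in seconds.
chunks = [Chunk(0.0, 3.0, "T1"), Chunk(4.0, 17.0, "T2"), Chunk(18.0, 20.0, "T3")]
target = find_chunk_to_split(chunks, 8.4)
```

Because the comparison is strict, a division time that coincides exactly with a chunk boundary returns `None`, which is consistent with there being no chunk that needs to be cut in that case.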
[0025]
Next, the operation of the character modification divider 22 will be described. The operation procedure of the character modification divider 22 is as follows.
(1) Time-varying text with temporal behavior is expanded into time-varying text without temporal behavior (hereinafter referred to as static time-varying text).
(2) The static time-varying text at the division time is divided into two at the division time T.
[0026]
The procedures (1) and (2) will be specifically described. In (1), the state of the time-varying text is seen at a certain time resolution, and the static time-varying text in that state is sequentially output.
[0027]
Now, as shown in FIG. 5, assume that the time-varying text data T2 is, for example, 「ニューヨークで大事件が発生しました」 ("A major incident has occurred in New York"). If five characters of this text are displayed at a time and the text advances by one character per second, it can be expressed, as shown in the figure, as "scroll from second 0 to second 12, 1 character/second, window of 5 characters, ニューヨークで大事件が発生しました". In effect, as shown in the figure, a window W that shows five characters can be regarded as moving by one character every second in the direction of arrow a. The time resolution is set to, for example, half the blink period in the case of blink, and the one-character movement speed of the wipe in the case of wipe.
[0028]
When this is expanded into the static time-varying text of (1), as shown in FIG. 5, the result is second 0: 「ニューヨー」, second 1: 「ューヨーク」, second 2: 「ーヨークで」, ..., second 12: 「生しました」. In this case, the time resolution is 1 second.
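The scroll expansion in this example can be sketched as a sliding window over the string. The function name, the tuple layout, and the assumption that each static text lasts exactly one time-resolution unit are illustrative choices, not details given in the description.

```python
def expand_scroll(text, window, start=0.0, resolution=1.0):
    """Expand a scrolling text into static (start, end, visible_text) entries.

    window: number of characters visible at once.
    resolution: seconds per one-character shift (the time resolution).
    Assumption: each static text is shown for one resolution unit.
    """
    steps = len(text) - window + 1
    return [
        (start + k * resolution, start + (k + 1) * resolution, text[k:k + window])
        for k in range(steps)
    ]

static_texts = expand_scroll("ニューヨークで大事件が発生しました", window=5)
for begin, end, visible in static_texts[:3]:
    print(begin, end, visible)
```

With the 17-character string and a 5-character window, the sketch yields 13 static texts, indexed from second 0 to second 12 as in the example.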
[0029]
Next, in (2), a process of dividing the static time-varying text into two at the division time T is performed.
[0030]
If the static time-varying text of the data chunk T2 is represented as shown in FIG. 6, the division point T corresponds to some position in that static time-varying text. Suppose the division point T corresponds to second 4, 「ークで大事」. In process (2), the static time-varying text to which the division point T belongs is first searched for; when, for example, second 4 「ークで大事」 is found as a result, it is then divided at the division point T. The search for the static time-varying text to which the division point T belongs can use the same processing as in FIG. 4.
[0031]
This process makes the division point T both the end time of the pre-division time-varying text and the start time of the post-division time-varying text. In the above example, of the time-varying text 「ークで大事」 spanning seconds 4 to 5, the portion from time 4 to T becomes the final time-varying text of the pre-division segment, and the portion from time T to 5 becomes the first time-varying text of the post-division segment.
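A minimal sketch of step (2) follows: the static text that straddles T is emitted twice, once ending at T and once starting at T, so neither segment loses the characters on screen at the cut. The tuple layout, function name, and the value T = 4.4 are assumptions for illustration.

```python
def split_static_texts(static_texts, t_split):
    """Divide a list of (start, end, text) static texts at t_split.

    Entries wholly before t_split go to the first segment, entries wholly
    after go to the second, and the entry containing t_split is cut so that
    t_split is its end time in the first segment and its start time in the
    second.
    """
    before, after = [], []
    for start, end, text in static_texts:
        if end <= t_split:
            before.append((start, end, text))
        elif start >= t_split:
            after.append((start, end, text))
        else:
            before.append((start, t_split, text))
            after.append((t_split, end, text))
    return before, after

# Three consecutive static texts from the scroll example; T = 4.4 is hypothetical.
texts = [(3.0, 4.0, "ヨークで大"), (4.0, 5.0, "ークで大事"), (5.0, 6.0, "クで大事件")]
first_segment, second_segment = split_static_texts(texts, 4.4)
```

A viewer who starts receiving at T therefore sees 「ークで大事」 from T until second 5, while the earlier segment shows the same text from second 4 until T.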
[0032]
Through the above division processing, the time-varying text data 4a is complemented so that consistency is maintained before and after the division, the modification of the character information keeps its continuity across the division, and character information is not lost even when reception first begins at the division point. The division is also consistent with the moving-image key frame at the division point T.
[0033]
[Effects of the invention]
As is apparent from the above description, according to the present invention, time-varying data can be divided at an arbitrary time in synchronization with moving image data and audio data. For this reason, the time-varying text can be divided without a sense of incongruity to the viewer.
[0034]
In addition, even when a receiving device with limited channel capacity and storage capacity, such as a mobile phone, downloads and plays content in subdivided pieces, the time-varying text at the division point can be presented to the viewer without a sense of incongruity.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of a playback apparatus to which the present invention is applied.
FIG. 2 is a conceptual diagram of audio data, moving image data, and time-varying text data.
FIG. 3 is a block diagram illustrating a configuration of a time-varying text information dividing apparatus with moving images according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the operation of the divided time-varying text searcher of FIG. 3.
FIG. 5 is an explanatory diagram of time-varying text and static time-varying text.
FIG. 6 is an explanatory diagram of the operation of the character modification divider of FIG. 3.
[Explanation of symbols]
1 ... file classification part, 2 ... audio decoding part, 3 ... moving image decoding part, 4 ... time-varying text decoding part, 5 ... division processing part, 6 ... image display part, 20 ... time-varying text information dividing device with moving images, 21 ... divided time-varying text searcher, 22 ... character modification divider.

Claims

A time-varying text information dividing device with moving images that divides the time-varying text of content composed of moving images, audio, and character information that changes over time in synchronization with them (hereinafter referred to as time-varying text), the device comprising:
dividing means for dividing the time-varying text in synchronization with the head of a key frame of the moving image, the key frame serving as a division point,
wherein the dividing means extracts the time-varying text displayed at the time of the division point, expands the time-varying text, in units of a certain time resolution, into static time-varying text that involves no temporal change, and divides the time-varying text by delimiting the start time and end time of the static time-varying text at the time of the division point.

The time-varying text information dividing device according to claim 1, wherein the time-varying text is subjected to at least one of a modification with temporal behavior and a modification without temporal behavior.