JP2022098735A

JP2022098735A - Subtitle generation device and subtitle generation program

Info

Publication number: JP2022098735A
Application number: JP2020212304A
Authority: JP
Inventors: Daisuke Miyajima; Akiya Fukumoto; Kazuhide Takahashi; Keigo Kobuchi
Original assignee: Play Inc
Current assignee: Play Inc
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2022-07-04
Anticipated expiration: 2040-12-22
Also published as: JP7201656B2

Abstract

PROBLEM TO BE SOLVED: To provide a subtitle generation device capable of generating a subtitle file that displays subtitle text appropriately when subtitle data is distributed in real time via a telecommunication line. SOLUTION: A subtitle generation device includes a subtitle extraction unit that extracts a subtitle text from externally acquired subtitle data and extracts a first display start time of the subtitle text, an end time setting unit that sets a first display end time later than the first display start time, a split subtitle generation unit that generates split subtitle data in which the first display start time and the first display end time are associated with the subtitle text, a data output unit that outputs the generated split subtitle data, and a start time setting unit that sets the first display end time as a second display start time of the subtitle text. The end time setting unit sets a second display end time later than the second display start time. The split subtitle generation unit generates split subtitle data in which the second display start time and the second display end time are associated with the subtitle text. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a subtitle generation device and a subtitle generation program.

A video processing device is known that obtains, from the broadcast wave of a program including video and subtitles, video data for displaying the video and subtitle data for displaying the subtitles, and outputs a video signal for displaying the video of the obtained video data together with the subtitles of the obtained subtitle data (see, for example, Patent Document 1). In particular, Patent Document 1 describes displaying the next subtitle contiguously when the sum of the number of characters of the subtitle displayed in a predetermined display area and the number of characters of the subtitle in the next obtained subtitle data is at most a predetermined number of characters. It also describes extracting from the broadcast wave the display time for which a subtitle should remain displayed and extending that display time according to the number of characters in the subtitle.

Patent Document 1: Japanese Unexamined Patent Publication No. 2010-268076

As described above, the technique shown in Patent Document 1 extracts video data, subtitle data, and a display time from the broadcast wave of a program including video and subtitles. Now consider a case in which data with subtitle data superimposed on video data is received and the subtitle data is output in real time as a WebVTT format subtitle file. In a WebVTT format subtitle file, a display start time and a display end time are specified for each subtitle text. If a WebVTT format subtitle file contains a subtitle text for which only the display start time is specified and the display end time is not, a typical player ignores that subtitle text and does not display it. Therefore, when, for example, broadcast data is encoded and converted in real time and distributed over a telecommunication line such as the Internet, the next subtitle data has not yet been received at the time a given piece of subtitle data is received, so the display end time of the subtitle text cannot be determined. A subtitle file generated in real time as the subtitle data is received then cannot display the subtitle text properly, and the video data and the subtitle data fall out of synchronization.

The present disclosure has been made to solve this problem. Its object is to provide a subtitle generation device and a subtitle generation program that, when data with subtitle data superimposed on video data is received and the subtitle data is output in real time as a subtitle file for distribution over a telecommunication line, can generate a subtitle file in which the video data and the subtitle data are synchronized and the subtitle text is displayed appropriately.

The subtitle generation device according to the present disclosure includes: a subtitle extraction unit that extracts, from externally acquired subtitle data, a subtitle text and a first display start time of the subtitle text; an end time setting unit that sets a first display end time, which is a time later than the first display start time; a split subtitle generation unit that generates split subtitle data in which the first display start time and the first display end time are associated with the subtitle text extracted by the subtitle extraction unit; a data output unit that outputs the split subtitle data generated by the split subtitle generation unit; and a start time setting unit that sets the first display end time as a second display start time of the subtitle text. The end time setting unit sets a second display end time, which is a time later than the second display start time, and the split subtitle generation unit generates split subtitle data in which the second display start time and the second display end time are associated with the subtitle text extracted by the subtitle extraction unit.

The subtitle generation program according to the present disclosure causes a computer of a subtitle generation device to function as: a subtitle extraction unit that extracts, from externally acquired subtitle data, a subtitle text and a first display start time of the subtitle text; an end time setting unit that sets a first display end time, which is a time later than the first display start time; a split subtitle generation unit that generates split subtitle data in which the first display start time and the first display end time are associated with the subtitle text extracted by the subtitle extraction unit; a data output unit that outputs the split subtitle data generated by the split subtitle generation unit; and a start time setting unit that sets the first display end time as a second display start time of the subtitle text. The program further causes the end time setting unit to set a second display end time, which is a time later than the second display start time, and causes the split subtitle generation unit to generate split subtitle data in which the second display start time and the second display end time are associated with the subtitle text extracted by the subtitle extraction unit.

According to the subtitle generation device and the subtitle generation program of the present disclosure, when data with subtitle data superimposed on video data is received and the subtitle data is output in real time as a subtitle file for distribution over a telecommunication line, it is possible to generate a subtitle file in which the video data and the subtitle data are synchronized and the subtitle text is displayed appropriately.

FIG. 1 is a block diagram showing the functional configuration of the subtitle generation device according to Embodiment 1.
FIG. 2 is a diagram illustrating an example of a WebVTT format subtitle file.
FIG. 3 is a diagram illustrating an example of a WebVTT format subtitle file generated by the subtitle generation device according to Embodiment 1.
FIG. 4 is a diagram illustrating an example of a WebVTT format subtitle file generated by the subtitle generation device according to Embodiment 1.
FIG. 5 is a flow diagram showing a processing example of the subtitle generation device according to Embodiment 1.
FIG. 6 is a flow diagram showing a processing example of the subtitle generation device according to Embodiment 1.
FIG. 7 is a flow diagram showing a processing example of the subtitle generation device according to Embodiment 1.

Modes for implementing the subtitle generation device and the subtitle generation program according to the present disclosure will be described with reference to the attached drawings. In the figures, identical or corresponding parts are given the same reference numerals, and duplicate descriptions are simplified or omitted as appropriate. In the following description, for convenience, the positional relationship of each structure may be expressed with reference to the illustrated state. The present disclosure is not limited to the following embodiments. Within a range that does not depart from the gist of the present disclosure, the embodiments may be freely combined, and any component of any embodiment may be modified or omitted.

Embodiment 1.
Embodiment 1 of the present disclosure will be described with reference to FIGS. 1 to 7. FIG. 1 is a block diagram showing the functional configuration of the subtitle generation device. FIG. 2 is a diagram illustrating an example of a WebVTT format subtitle file. FIGS. 3 and 4 are diagrams illustrating examples of WebVTT format subtitle files generated by the subtitle generation device. FIGS. 5 to 7 are flow diagrams showing processing examples of the subtitle generation device.

As shown in FIG. 1, the subtitle generation device 10 according to this embodiment includes a subtitle extraction unit 11, an end time setting unit 12, a start time setting unit 13, a split subtitle generation unit 14, and a data output unit 15. Each of these units is implemented using electronic circuits and processes electrical signals that represent information.

As hardware, the subtitle generation device 10 may be composed of one or more computers each including a processor and a memory. The processor is also referred to as a CPU (Central Processing Unit), a central processing unit, a processing unit, an arithmetic unit, a microprocessor, a microcomputer, or a DSP. Examples of the memory include non-volatile or volatile semiconductor memories such as RAM, ROM, flash memory, EPROM, and EEPROM, as well as magnetic disks, flexible disks, optical disks, compact discs, MiniDiscs, and DVDs.

A program, as software, is stored in the memory of the subtitle generation device 10. The processor executes the program stored in the memory so that the subtitle generation device 10 carries out preset processing, and the functions of the units described below are realized through the cooperation of hardware and software. That is, the program stored in the memory of the subtitle generation device 10 is a subtitle generation program that causes the computer of the subtitle generation device 10 to function as each of the units described below.

The subtitle extraction unit 11 takes in a broadcast signal input from the outside and extracts subtitle data from the acquired broadcast signal. The broadcast signal is transmitted to the subtitle generation device 10 over, for example, SDI (Serial Digital Interface). SDI is a standard interface used in broadcasting equipment. The format of the broadcast signal is based on the standards established by ARIB (Association of Radio Industries and Businesses). The subtitle data is also superimposed on the input broadcast signal in accordance with the ARIB specifications. The subtitle data is stored in the vertical blanking area of HD-SDI or SD-SDI, and the subtitle extraction unit 11 extracts this subtitle data. The subtitle data may instead be stored in another area of the broadcast signal.

The interface of the broadcast signal input to the subtitle generation device 10 is not limited to SDI. For example, the broadcast signal input to the subtitle generation device 10 may instead be one sent over an IP (Internet Protocol) network using RTP (Real-time Transport Protocol) or the like. When RTP is used, the video and audio data included in the broadcast signal are encoded in the MPEG2-TS (Moving Picture Experts Group 2 Transport Stream) format using, for example, a real-time encoder. When RTP is used, subtitle data such as that specified by ARIB is likewise superimposed on the broadcast signal. In this case, the subtitle extraction unit 11 extracts the subtitle data from the elementary stream of the MPEG2-TS.

In this way, the subtitle extraction unit 11 acquires subtitle data from the outside. The subtitle extraction unit 11 then extracts the subtitle text from the acquired subtitle data. The subtitle data also includes information that specifies the timing at which display of the subtitle text starts. The subtitle extraction unit 11 extracts this information from the acquired subtitle data as the first display start time of the subtitle text. In other words, the subtitle extraction unit 11 extracts the first display start time of the subtitle text from the acquired subtitle data.

The end time setting unit 12 sets a first display end time for the subtitle text extracted by the subtitle extraction unit 11. The first display end time is a time later than the first display start time extracted by the subtitle extraction unit 11. The split subtitle generation unit 14 generates split subtitle data in which the first display start time extracted by the subtitle extraction unit 11 and the first display end time set by the end time setting unit 12 are associated with the subtitle text extracted by the subtitle extraction unit 11.

The start time setting unit 13 sets a second display start time for the subtitle text extracted by the subtitle extraction unit 11. The second display start time is the same time as the first display end time set by the end time setting unit 12. The end time setting unit 12 sets a second display end time for the subtitle text extracted by the subtitle extraction unit 11. The second display end time is a time later than the second display start time set by the start time setting unit 13. The split subtitle generation unit 14 then generates split subtitle data in which the second display start time set by the start time setting unit 13 and the second display end time set by the end time setting unit 12 are associated with the subtitle text extracted by the subtitle extraction unit 11.

In this way, the split subtitle generation unit 14 generates split subtitle data in which the display time of the subtitle text extracted by the subtitle extraction unit 11 is divided into a period from the first display start time to the first display end time and a period from the second display start time to the second display end time. The first display end time and the second display start time are the same time. The subtitle data is therefore split without any gap in the display of the subtitle text. After the second display end time, splitting continues in the same manner so that the display of the subtitle text is not interrupted. That is, the display end time of each piece of split subtitle data is the same time as the display start time of the piece of split subtitle data that immediately follows it.
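The chaining rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the names `SplitCue` and `split_cues`, the integer millisecond representation, and the fixed `count` argument are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class SplitCue:
    """One piece of split subtitle data (hypothetical structure)."""
    start_ms: int  # display start time in milliseconds
    end_ms: int    # display end time in milliseconds
    text: str      # subtitle text, identical in every piece


def split_cues(text: str, first_start_ms: int, interval_ms: int, count: int) -> Iterator[SplitCue]:
    """Yield `count` back-to-back cues that all carry the same text."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive so that each end time is after its start time")
    start = first_start_ms
    for _ in range(count):
        end = start + interval_ms   # end time setting: a time later than the start time
        yield SplitCue(start, end, text)
        start = end                 # start time setting: next start equals previous end


# Assumed example values: first start at 20 s, 5 s per piece, three pieces.
cues = list(split_cues("♪ (theme song)", first_start_ms=20_000, interval_ms=5_000, count=3))
```

Because each start time is copied from the previous end time, consecutive cues share a boundary and the displayed text never has a gap.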

The end time setting unit 12 sets the time interval between the first display start time and the first display end time based on, for example, the timing result of a timer unit 16 included in the subtitle generation device 10. Similarly, the end time setting unit 12 sets the time interval between the second display start time and the second display end time based on, for example, the timing result of the timer unit 16. The end time setting unit 12 sets the first display end time and the second display end time so that the interval from the first display start time to the first display end time is equal to the interval from the second display start time to the second display end time.

In this case, each display time, that is, the interval from the first display start time to the first display end time and the interval from the second display start time to the second display end time, is preferably determined in consideration of, for example, the encoding delay of the video data on which the subtitle data is superimposed, the decoding delay of the video data in the player, and the rendering delay of the video data in the player. This makes it easy to align the display timing of the video data and the subtitle data.

The data output unit 15 outputs the split subtitle data generated by the split subtitle generation unit 14. That is, the data output unit 15 outputs the split subtitle data in which the first display start time and the first display end time are associated with the subtitle text. The data output unit 15 also outputs the split subtitle data in which the second display start time and the second display end time are associated with the subtitle text. Likewise, for times after the second display end time, whenever there is split subtitle data generated by the split subtitle generation unit 14, the data output unit 15 outputs that split subtitle data.

The file format of the output data is, for example, the WebVTT (Web Video Text Tracks) format. Next, the structure of a WebVTT format subtitle file will be described with reference to FIG. 2, which shows an example of such a file. "WEBVTT" on the first line is header information indicating that the file is in WebVTT format. The second line is a blank line.

The data on the third line and the data on the fourth line form a pair. The third line gives the display start time and the display end time of the subtitle text. The fourth line is the content of the subtitle text displayed between the display start time and the display end time given on the third line. Specifically, "00:00:05.000" before the "-->" on the third line indicates that the display start time is 0 hours 0 minutes 5.000 seconds. Likewise, "00:00:10.000" after the "-->" on the third line indicates that the display end time is 0 hours 0 minutes 10.000 seconds. These times are relative and are based on, for example, the playback time of the video over which the subtitle data is displayed. The fourth line, "Today is sunny.", is the subtitle text displayed from 00:00:05.000 to 00:00:10.000.

Similarly, after the blank fifth line, the sixth and seventh lines form a pair. They indicate that the subtitle text "Tomorrow's weather will be cloudy." is displayed from the display start time 00:00:11.000 to the display end time 00:00:16.000. After the blank eighth line, the ninth and tenth lines form a pair. They indicate that the subtitle text "♪ (theme song)" is displayed from the display start time 00:00:20.000 to the display end time 00:01:20.000. After the blank eleventh line, the twelfth and thirteenth lines form a pair. They indicate that the subtitle text "Now, the next news." is displayed from the display start time 00:01:22.000 to the display end time 00:01:25.000.
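The thirteen-line file described for FIG. 2 can be reconstructed and read back with a short parser. The sketch below is illustrative only: `SAMPLE` uses English renderings of the subtitle strings, and `TIMING`, `to_ms`, and `parse_cues` are hypothetical names. It handles only the simple `hh:mm:ss.mmm --> hh:mm:ss.mmm` layout described here, not the full WebVTT syntax (cue identifiers, cue settings, or NOTE blocks).

```python
import re

# The file of FIG. 2 as described in the text (subtitle strings in English).
SAMPLE = """WEBVTT

00:00:05.000 --> 00:00:10.000
Today is sunny.

00:00:11.000 --> 00:00:16.000
Tomorrow's weather will be cloudy.

00:00:20.000 --> 00:01:20.000
♪ (theme song)

00:01:22.000 --> 00:01:25.000
Now, the next news.
"""

# Timing line: start time, the "-->" separator, end time.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cues(vtt: str) -> list[tuple[int, int, str]]:
    """Return (start_ms, end_ms, text) for every well-formed cue block."""
    lines = vtt.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT header line")
    cues = []
    # Cue blocks are separated by blank lines.
    for block in "\n".join(lines[1:]).strip().split("\n\n"):
        block_lines = block.strip().splitlines()
        if not block_lines:
            continue
        match = TIMING.fullmatch(block_lines[0].strip())
        if match is None:
            continue  # skip blocks whose timing line lacks a start or an end time
        fields = match.groups()
        cues.append((to_ms(*fields[:4]), to_ms(*fields[4:]), "\n".join(block_lines[1:])))
    return cues


parsed = parse_cues(SAMPLE)
```

Note that a block whose timing line has no end time is skipped by this sketch, which mirrors the player behavior the description is concerned with.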

The data output unit 15 may output two or more pieces of split subtitle data as one subtitle file, or may output each of them as a separate subtitle file. When outputting two or more pieces of split subtitle data as one subtitle file, the data output unit 15 outputs a single subtitle file that includes at least the split subtitle data in which the first display start time and the first display end time are associated with the subtitle text and the split subtitle data in which the second display start time and the second display end time are associated with the subtitle text.

FIG. 3 shows an example in which the data output unit 15 outputs two or more pieces of split subtitle data as one subtitle file. The example is part of a WebVTT format subtitle file. In this example, the subtitle text "Tomorrow's weather will be cloudy." is first displayed from the display start time 00:00:11.000 to the display end time 00:00:16.000. Then, within the illustrated range, the subtitle text "♪ (theme song)" is split into five pieces of split subtitle data of 5 seconds each.

That is, the first piece of split subtitle data displays the subtitle text "♪ (theme song)" from the display start time 00:00:20.000 to the display end time 00:00:25.000. The second piece displays the same subtitle text "♪ (theme song)" from 00:00:25.000 to 00:00:30.000. The third piece displays the same subtitle text from 00:00:30.000 to 00:00:35.000. The fourth piece displays the same subtitle text from 00:00:35.000 to 00:00:40.000. The fifth piece displays the same subtitle text from 00:00:40.000 to 00:00:45.000. From 00:00:45.000 onward, split subtitle data is likewise generated for every 5 seconds of display time.
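To make the FIG. 3 layout concrete, the following sketch serializes the cues described above into WebVTT text. It is an assumption-laden illustration rather than the device's actual output routine: `format_timestamp` and `render_webvtt` are hypothetical helper names, times are held as integer milliseconds, and the subtitle strings are English renderings.

```python
def format_timestamp(ms: int) -> str:
    """Format milliseconds as the hh:mm:ss.mmm timestamp used in cue timing lines."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def render_webvtt(cues: list[tuple[int, int, str]]) -> str:
    """Serialize (start_ms, end_ms, text) cues as one WebVTT file."""
    parts = ["WEBVTT", ""]  # header line followed by a blank line
    for start_ms, end_ms, text in cues:
        parts.append(f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}")
        parts.append(text)
        parts.append("")    # blank line closes each cue block
    return "\n".join(parts)


# One ordinary cue, then the theme-song text split into five 5-second pieces.
cues = [(11_000, 16_000, "Tomorrow's weather will be cloudy.")]
cues += [(start, start + 5_000, "♪ (theme song)") for start in range(20_000, 45_000, 5_000)]
vtt = render_webvtt(cues)
print(vtt)
```

Every cue written this way carries both a start and an end time, even though the true end of the theme-song subtitle is not yet known when the first piece is written.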

FIG. 4 shows an example in which the single WebVTT format subtitle file shown in FIG. 3 is divided into multiple WebVTT format subtitle files. In this example, the subtitle file is divided to match video distributed according to the HLS (HTTP Live Streaming) standard. EXT-INF is a tag that specifies the length of an HLS segment. In the illustrated example, EXT-INF is 10 seconds. The single WebVTT format subtitle file shown in FIG. 3 is divided into multiple subtitle files to match this HLS segment length. In the example of FIG. 3, the display time of one piece of split subtitle data is 5 seconds. The file is therefore divided so that every two pieces of split subtitle data form one WebVTT format subtitle file, matching the HLS segment length of 10 seconds.

Specifically, the first subtitle file consists of the first piece of split subtitle data, which displays the subtitle text "♪ (theme song)" from 00:00:20.000 to 00:00:25.000, and the second piece, which displays it from 00:00:25.000 to 00:00:30.000. The second subtitle file consists of the third piece of split subtitle data, which displays the subtitle text "♪ (theme song)" from 00:00:30.000 to 00:00:35.000, and the fourth piece, which displays it from 00:00:35.000 to 00:00:40.000. The third subtitle file consists of the fifth piece of split subtitle data, which displays the subtitle text "♪ (theme song)" from 00:00:40.000 to 00:00:45.000, and the sixth piece, which displays it from 00:00:45.000 to 00:00:50.000.

In the illustrated example, each subtitle file is given a file name that includes a serial number. Specifically, the files are named webvtt_2.vtt, webvtt_3.vtt, and webvtt_4.vtt, respectively.
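The grouping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the device's implementation: the helper names (`format_timestamp`, `render_webvtt`, `files_by_segment`) are hypothetical, and deriving the serial number in the file name from the segment index (start time divided by the segment length) is an assumption that happens to reproduce the file names in the example.

```python
SEGMENT_MS = 10_000  # HLS segment length from the example (10 seconds)


def format_timestamp(ms: int) -> str:
    """Render milliseconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def render_webvtt(cues) -> str:
    """Render (start_ms, end_ms, text) tuples as the text of one WebVTT file."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)


def files_by_segment(cues, segment_ms: int = SEGMENT_MS) -> dict:
    """Group cues by the segment in which they start; one file per segment.

    Assumption: the serial number in the file name is the segment index.
    """
    groups = {}
    for cue in cues:
        groups.setdefault(cue[0] // segment_ms, []).append(cue)
    return {
        f"webvtt_{index}.vtt": render_webvtt(group)
        for index, group in sorted(groups.items())
    }


# Six 5-second pieces of divided subtitle data covering 00:00:20 to 00:00:50.
example_cues = [
    (20_000 + 5_000 * i, 25_000 + 5_000 * i, "♪ (theme song)") for i in range(6)
]
example_files = files_by_segment(example_cues)
```

With the six cues of the example, each resulting file holds two cues, one file per 10-second segment.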

As illustrated in FIG. 2, a WebVTT subtitle file specifies a display start time and a display end time for each subtitle text. If a WebVTT subtitle file contains subtitle text for which only the display start time is specified and the display end time is not, a typical player ignores that subtitle text and does not display it.
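To make the requirement concrete, the following sketch checks whether a cue timing line carries both a start and an end timestamp. It is a simplified, hypothetical check (`has_complete_timing` is not part of any player or library); the pattern covers the common `hh:mm:ss.ttt` and `mm:ss.ttt` forms and tolerates trailing cue settings, but it is not a full WebVTT parser.

```python
import re

# Simplified WebVTT timestamp: optional hours, then mm:ss.ttt.
_TS = r"(?:\d{2,}:)?[0-5]\d:[0-5]\d\.\d{3}"
# A timing line needs a start timestamp, "-->", and an end timestamp.
_TIMING = re.compile(rf"^({_TS})[ \t]+-->[ \t]+({_TS})(?:[ \t].*)?$")


def has_complete_timing(line: str) -> bool:
    """Return True if the line has both a display start and a display end time."""
    return _TIMING.match(line.strip()) is not None
```

A line such as `00:00:20.000 -->` fails this check, which corresponds to the cue that a typical player would drop.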

Now consider a case in which subtitle data is transmitted superimposed on video data, and these data are received and the subtitle data is output in real time as a WebVTT subtitle file. In such a case, with the prior art, the next subtitle data has not yet been received at the point when a given piece of subtitle data is received, so the display end time of the subtitle text may not be determinable. As a result, a WebVTT subtitle file generated in real time as the subtitle data is received will contain subtitle text whose display end time is not specified. That subtitle text cannot be displayed properly, and the video data and the subtitle data may fall out of synchronization.

In contrast, the subtitle generation device 10 according to the present disclosure, configured as described above, sets a display end time for subtitle text whose display end time cannot be determined when the subtitle data is received, and further generates divided subtitle data such that display of that subtitle text starts again after the set display end time and thus continues. Consequently, even subtitle text whose display end time cannot be determined when the subtitle data is received can be output as a WebVTT subtitle file in which the display end time is specified. Therefore, when subtitle data is transmitted superimposed on video data, and these data are received and the subtitle data is output in real time as a WebVTT subtitle file for distribution over a telecommunication line such as the Internet, the video data and the subtitle data can be synchronized so that the subtitle text is displayed appropriately.
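The relationship between consecutive pieces of divided subtitle data can be sketched as follows. This is an illustrative model only: `DividedCue`, `first_cue`, and `continue_cue` are hypothetical names, and a fixed interval is assumed for setting each display end time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DividedCue:
    start_ms: int  # display start time
    end_ms: int    # display end time
    text: str      # subtitle text


def first_cue(start_ms: int, text: str, interval_ms: int) -> DividedCue:
    """Assign a provisional display end time to text whose end is unknown."""
    return DividedCue(start_ms, start_ms + interval_ms, text)


def continue_cue(previous: DividedCue, interval_ms: int) -> DividedCue:
    """Restart display of the same text: the new start is the previous end."""
    return DividedCue(previous.end_ms, previous.end_ms + interval_ms, previous.text)
```

Because each continuation starts exactly where the previous cue ends, the text stays on screen without a gap while every cue still has a defined end time.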

Furthermore, by outputting two or more pieces of divided subtitle data as a single subtitle file, the file can easily be divided into a plurality of subtitle files later as the need arises. Thus, for example, in HLS live distribution, segment files of subtitle data can easily be generated to match the HLS segment length of the video data.

Next, an example of the processing flow of the subtitle generation device 10 configured as described above is explained with reference to the flow chart of FIG. 5. First, in step S11, the subtitle extraction unit 11 acquires subtitle data from the outside and extracts the subtitle text from the acquired subtitle data. The subtitle extraction unit 11 also extracts from the acquired subtitle data the time at which the subtitle text should be displayed, and uses it as the display start time. The time at which the subtitle text should be displayed is, for example, the PTS (Presentation Time Stamp) included in the subtitle data. When the subtitle extraction unit 11 acquires the next subtitle data, the display start time of the next subtitle data is used as the display end time of the current subtitle data. After step S11, the subtitle generation device 10 performs the process of step S12.

In step S12, the subtitle extraction unit 11 checks whether the acquired subtitle data contains the control code "TIME" or "CS" (clear screen). If the subtitle data contains neither "TIME" nor "CS", the subtitle generation device 10 next performs the process of step S13.

In step S13, the timer unit 16 starts timing. If the timer unit 16 is already timing when step S13 is executed, the timer unit 16 is first stopped and reset, and then starts timing again.

After step S13, the subtitle generation device 10 returns to step S11 and continues processing with the next subtitle data. In parallel with this processing, the end time setting unit 12 monitors the elapsed time measured by the timer unit 16 (step S14) and determines whether the elapsed time has reached the set time (Expired).

When the elapsed time measured by the timer unit 16 reaches the set time, the end time setting unit 12 sets the display end time. If the next subtitle data has not arrived by the time the elapsed time reaches the set time, the subtitle text is duplicated, and the start time setting unit 13 sets a new display start time for the duplicated subtitle text. The subtitle generation device 10 then returns to step S13, where the timer unit 16 starts timing again.

On the other hand, if in step S14 the next subtitle data arrives before the elapsed time measured by the timer unit 16 reaches the set time, the end time setting unit 12 sets the display end time at that point. The subtitle generation device 10 then performs the process of step S15.

In step S15, the divided subtitle generation unit 14 generates divided subtitle data based on the subtitle text and display start time acquired in step S11 and on the subtitle text, display start time, and display end time set in steps S13 and S14. The data output unit 15 then outputs the divided subtitle data generated by the divided subtitle generation unit 14 as a subtitle file.

On the other hand, if in step S12 the subtitle data contains the control code "TIME", the end time setting unit 12 sets the display end time of the subtitle text in accordance with the control code "TIME". If in step S12 the subtitle data contains the control code "CS", the end time setting unit 12 sets the time at which the screen is cleared by the control code "CS" as the display end time of the subtitle text. The subtitle generation device 10 then performs the process of step S15. When the process of step S15 is complete, the series of processes ends.

In the processing example shown in FIG. 5, depending on the timing at which subtitle data is received and processed, a plurality of pieces of subtitle data whose output is pending may exist at the same time. In that case, the order can be preserved by processing the pending subtitle data on a FIFO (first in, first out) basis.
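The flow of FIG. 5 can be approximated offline as shown below. This is a sketch under several assumptions rather than the device's actual logic: it replays a pre-recorded list of arrivals instead of running a real timer, the function name `generate_cues` and the tuple layout are hypothetical, arrivals are assumed to be sorted by display start time, and an explicit end time stands in for the "TIME" and "CS" control codes.

```python
from collections import deque


def generate_cues(arrivals, interval_ms, stream_end_ms):
    """Replay subtitle arrivals and emit (start_ms, end_ms, text) cues.

    arrivals: (start_ms, text, explicit_end_ms or None), sorted by start_ms.
    interval_ms: timer duration after which a provisional end time is set.
    stream_end_ms: end of the replayed stream, used to close the last text.
    """
    pending = deque(arrivals)  # FIFO keeps the original subtitle order
    cues = []
    while pending:
        start, text, explicit_end = pending.popleft()
        if explicit_end is not None:
            # End time known from the data itself (S12 leads directly to S15).
            cues.append((start, explicit_end, text))
            continue
        next_start = pending[0][0] if pending else stream_end_ms
        # Timer expires before the next subtitle arrives: close the cue and
        # restart the same text from the end time just set.
        while start + interval_ms < next_start:
            cues.append((start, start + interval_ms, text))
            start += interval_ms
        # The next subtitle arrives (or the stream ends): its start time
        # becomes the display end time of the current text.
        cues.append((start, next_start, text))
    return cues
```

For example, text arriving at 20 s followed by new text at 32 s, with a 5-second timer, yields cues for 20 to 25 s, 25 to 30 s, and 30 to 32 s for the first text.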

In the configuration example described above, the display end time of the subtitle text is set at fixed time intervals based on the timing result of the timer unit 16, and divided subtitle data is generated accordingly. That is, if the display end time of subtitle text cannot be determined when the subtitle text is acquired, the timer unit 16 starts timing, and the display end time of the subtitle text is set when the timer unit 16 has measured a fixed time. However, the method of setting the display end time of the divided subtitle data is not limited to this.

The end time setting unit 12 may set the display end time of the divided subtitle data based on a cue signal input from the outside. That is, the end time setting unit 12 sets one or both of the first display end time and the second display end time based on a cue signal input from the outside. As a specific example, an SCTE-35 signal is used as the cue signal. The SCTE-35 signal specifies the start and end of a program and the start and end of an advertisement insertion. SCTE is an abbreviation of Society of Cable and Telecommunications Engineers.

In this case, for example, the value of the unique_program_id field included in the splice_insert() message of the SCTE-35 signal can be used as a program ID. If the value of the unique_program_id field, that is, the program ID, has not changed, the signal is judged to indicate an advertisement insertion within the same program; the end time setting unit 12 sets the display end time of the subtitle text at the timing of the SCTE-35 signal, and the divided subtitle generation unit 14 generates the divided subtitle data. If, on the other hand, the value of the unique_program_id field, that is, the program ID, has changed, it is judged that one program has ended and another has started. When the program ID has changed, the output destination of the subtitle file written by the data output unit 15 (for example, the directory and the subtitle file name) is changed to match the program, and the counter of the divided subtitle files output by the data output unit 15 is reset. Through this processing, subtitle files can be output on a per-program basis.

A processing example of the subtitle generation device 10 in this case is explained with reference to the flow chart of FIG. 6. When an SCTE-35 signal is input to the subtitle generation device 10 as an external cue signal, first, in step S21, the subtitle generation device 10 acquires the value of the unique_program_id field of the SCTE-35 signal, that is, the program ID.

In the following step S22, the program ID acquired in step S21 is compared with the program ID acquired when the previous SCTE-35 signal was received, and it is determined whether the program ID has changed. The value of the program ID acquired when the previous SCTE-35 signal was received is held, for example, in the memory of the subtitle generation device 10. If the program ID has not changed, the subtitle generation device 10 next performs the process of step S23.

In step S23, subtitle data is acquired from the outside and a WebVTT subtitle file is output, for example by the processing shown in the flow chart of FIG. 5. After step S23, the subtitle generation device 10 returns to step S21 and continues processing.

On the other hand, if the program ID has changed in step S22, the subtitle generation device 10 next performs the process of step S24. In step S24, the output directory and the subtitle file name used by the data output unit 15 are changed to match the program, and the counter of the divided subtitle files output by the data output unit 15 is reset. After step S24, the subtitle generation device 10 returns to step S21 and continues processing.
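The bookkeeping of steps S21 to S24 can be sketched as a small state holder. The class and method names (`SubtitleOutput`, `on_cue_signal`, `next_path`), the directory layout, and the file naming are all hypothetical, and the program ID is passed in as an already-decoded integer; parsing the SCTE-35 message itself is outside the scope of this sketch.

```python
from pathlib import PurePosixPath


class SubtitleOutput:
    """Tracks the current program and numbers the subtitle files within it."""

    def __init__(self, root: str = "subs"):
        self.root = PurePosixPath(root)
        self.program_id = None  # program ID seen with the previous cue signal
        self.counter = 0        # counter of divided subtitle files

    def on_cue_signal(self, program_id: int) -> bool:
        """Handle one cue signal; return True if the program changed (S22)."""
        changed = program_id != self.program_id
        if changed:
            # S24: switch the output destination and reset the file counter.
            self.program_id = program_id
            self.counter = 0
        return changed

    def next_path(self) -> str:
        """Return the path for the next subtitle file of the current program."""
        path = self.root / f"program_{self.program_id}" / f"webvtt_{self.counter}.vtt"
        self.counter += 1
        return str(path)
```

In this sketch the very first cue signal is treated as a program change, since no previous program ID is held yet.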

The above describes the case in which the value of the unique_program_id field of the SCTE-35 signal is used as the program ID, but the method of identifying the program is not limited to this. For example, another identifier, such as the splice_event_id field included in the splice_insert() message of the SCTE-35 signal, may be used as the program ID. The program may also be uniquely identified by a combination of a plurality of identifiers rather than a single identifier, and that combination may be used as the program ID.

The subtitle generation device 10 may be configured to output divided subtitle files aligned with HLS segments. That is, if the acquired subtitle data contains subtitle text that is displayed beyond the length of an HLS segment, the device generates divided subtitle data in which the display time of that subtitle text is divided, and outputs WebVTT subtitle files.

A processing example of the subtitle generation device 10 in this case is explained with reference to the flow chart of FIG. 7. First, in step S31, the subtitle extraction unit 11 acquires subtitle data from the outside. The subtitle extraction unit 11 then extracts the subtitle text and the display time of the subtitle text from the acquired subtitle data.

In the following step S32, the subtitle generation device 10 determines, for the subtitle data acquired in step S31, whether there is subtitle text whose display starts or ends within the HLS segment time. Three outcomes are possible. In the first case, there is no subtitle text to be displayed within the HLS segment time. In the second case, there is subtitle text whose display both starts and ends within the HLS segment time.

In the third case, there is subtitle text whose display starts within the HLS segment time but does not end within it, that is, subtitle text that continues to be displayed beyond the HLS segment time. The third case also includes the situation in which the next subtitle data has not arrived at the subtitle generation device 10 even after the HLS segment time has elapsed, so that the display end time of the subtitle text could not be determined.
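The three outcomes of step S32 can be expressed as a small classifier for a single subtitle text. This is a hedged sketch: the names `SegmentCase` and `classify` are hypothetical, times are in milliseconds, and an unknown display end time is represented by `None`.

```python
from enum import Enum
from typing import Optional


class SegmentCase(Enum):
    NONE = 1       # first case: nothing to display in this segment
    CONTAINED = 2  # second case: display starts and ends in this segment
    SPILLS = 3     # third case: display continues past the segment end


def classify(start_ms: Optional[int], end_ms: Optional[int],
             seg_start_ms: int, seg_end_ms: int) -> SegmentCase:
    """Classify one subtitle text against one HLS segment [seg_start, seg_end)."""
    if start_ms is None or not (seg_start_ms <= start_ms < seg_end_ms):
        return SegmentCase.NONE
    if end_ms is not None and end_ms <= seg_end_ms:
        return SegmentCase.CONTAINED
    # The end time is unknown, or lies beyond the end of the segment.
    return SegmentCase.SPILLS
```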

In the first case, that is, when there is no subtitle text to be displayed within the HLS segment time, the subtitle generation device 10 next performs the process of step S33. In step S33, the data output unit 15 outputs a WebVTT subtitle file with empty content. After step S33, the subtitle generation device 10 returns to step S31 and continues processing.

In the second case in step S32, that is, when there is subtitle text whose display both starts and ends within the HLS segment time, the subtitle generation device 10 next performs the process of step S34. In step S34, the divided subtitle generation unit 14 generates subtitle data from the subtitle text and display time acquired in step S31. The data output unit 15 then outputs the generated subtitle data as a WebVTT subtitle file. If a subtitle file awaiting output already exists, the data output unit 15 appends the subtitle data to that file. After step S34, the subtitle generation device 10 performs the process of step S35.

In step S35, the data output unit 15 generates an m3u8 file for subtitles. The subtitle m3u8 file defines the playlist of subtitle files. If the subtitle m3u8 file already exists, the data output unit 15 appends the subtitle file output this time to the subtitle m3u8 file. After step S35, the subtitle generation device 10 returns to step S31 and continues processing.
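A subtitle playlist of the kind generated in step S35 might be built as follows. This is a minimal sketch, not the format the device necessarily emits: the function name `build_subtitle_playlist` is hypothetical, the choice of tags is an assumption, and every segment is given the same duration. Note that the HLS specification spells the per-segment duration tag `#EXTINF`.

```python
def build_subtitle_playlist(file_names, segment_seconds: int,
                            first_sequence: int) -> str:
    """Build the text of a live (open-ended) subtitle media playlist."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{segment_seconds}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for name in file_names:
        lines.append(f"#EXTINF:{segment_seconds:.3f},")  # segment duration
        lines.append(name)                               # subtitle file URI
    return "\n".join(lines) + "\n"
```

Appending a newly output subtitle file then amounts to adding one more `#EXTINF` line and one URI line; no end-of-list tag is written here because the sketch models a live stream.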

On the other hand, in the third case in step S32, that is, when there is subtitle text that continues to be displayed beyond the HLS segment time, the subtitle generation device 10 next performs the process of step S36. In step S36, the end time setting unit 12 first sets the end time of the HLS segment as the display end time of the subtitle text, and the divided subtitle generation unit 14 generates divided subtitle data with the display end time thus set. The start time setting unit 13 then sets the end time of that HLS segment as the display start time of the subtitle text, and the end time setting unit 12 sets, for example, the end time of the next HLS segment as the display end time of the subtitle text. The divided subtitle generation unit 14 generates divided subtitle data with the display start time and display end time thus set. In this way, the subtitle text is duplicated, and divided subtitle data divided in accordance with the HLS segment time is generated. After step S36, the subtitle generation device 10 performs the process of step S34.

As a result, the display times in the divided WebVTT subtitle files are divided in alignment with the HLS segments. A subtitle file can therefore be generated that displays subtitles appropriately even when playback starts from an HLS segment partway through the stream.
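The division performed in step S36 can be sketched as cutting one display interval at every HLS segment boundary. The function name `split_at_segments` is hypothetical, times are in milliseconds, segments are assumed to start at multiples of the segment length, and the display end time is assumed to be known by the time the function is called.

```python
def split_at_segments(start_ms: int, end_ms: int, text: str, segment_ms: int):
    """Split one cue into pieces that never cross an HLS segment boundary."""
    pieces = []
    while start_ms < end_ms:
        # End of the segment that contains the current start time.
        boundary = (start_ms // segment_ms + 1) * segment_ms
        piece_end = min(boundary, end_ms)
        pieces.append((start_ms, piece_end, text))
        # The next piece starts exactly where this one ended.
        start_ms = piece_end
    return pieces
```

For instance, text displayed from 7 s to 24 s with 10-second segments becomes three pieces covering 7 to 10 s, 10 to 20 s, and 20 to 24 s, so each segment's file carries the text on its own.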

Note that the data output unit 15 does not necessarily have to output an empty WebVTT subtitle file in step S33. More specifically, whether an empty WebVTT subtitle file should be output in the first case, that is, when there is no subtitle text to be displayed within the HLS segment time, depends on the specification of the subtitle m3u8 file generated in step S35. If, in the first case, an empty WebVTT subtitle file is listed in the subtitle m3u8 file in step S35, the empty WebVTT subtitle file must be output in step S33. If, in the first case, no empty WebVTT subtitle file is listed in the subtitle m3u8 file in step S35, the empty WebVTT subtitle file may or may not be output in step S33.

10 Subtitle generation device
11 Subtitle extraction unit
12 End time setting unit
13 Start time setting unit
14 Divided subtitle generation unit
15 Data output unit
16 Timer unit

Claims

A subtitle generation device comprising:
a subtitle extraction unit that extracts, from subtitle data acquired from the outside, a subtitle text and a first display start time of the subtitle text;
an end time setting unit that sets a first display end time, which is a time after the first display start time;
a divided subtitle generation unit that generates divided subtitle data in which the first display start time and the first display end time are associated with the subtitle text extracted by the subtitle extraction unit;
a data output unit that outputs the divided subtitle data generated by the divided subtitle generation unit; and
a start time setting unit that sets the first display end time as a second display start time of the subtitle text,
wherein the end time setting unit sets a second display end time, which is a time after the second display start time, and
the divided subtitle generation unit generates divided subtitle data in which the second display start time and the second display end time are associated with the subtitle text extracted by the subtitle extraction unit.

The subtitle generation device according to claim 1, wherein the end time setting unit sets the first display end time and the second display end time so that the interval from the first display start time to the first display end time is equal to the interval from the second display start time to the second display end time.

The subtitle generation device according to claim 1 or 2, wherein the end time setting unit sets one or both of the first display end time and the second display end time by time counting by a timer.

The subtitle generation device according to claim 1, wherein the end time setting unit sets one or both of the first display end time and the second display end time based on a cue signal input from the outside.

The subtitle generation device according to any one of claims 1 to 4, wherein the data output unit outputs a single subtitle file that includes the divided subtitle data in which the first display start time and the first display end time are associated with the subtitle text and the divided subtitle data in which the second display start time and the second display end time are associated with the subtitle text.

A subtitle generation program that causes a computer of a subtitle generation device to function as:
a subtitle extraction unit that extracts, from subtitle data acquired from the outside, a subtitle text and a first display start time of the subtitle text;
an end time setting unit that sets a first display end time, which is a time after the first display start time;
a divided subtitle generation unit that generates divided subtitle data in which the first display start time and the first display end time are associated with the subtitle text extracted by the subtitle extraction unit;
a data output unit that outputs the divided subtitle data generated by the divided subtitle generation unit; and
a start time setting unit that sets the first display end time as a second display start time of the subtitle text,
the program further causing the end time setting unit to set a second display end time, which is a time after the second display start time, and
causing the divided subtitle generation unit to generate divided subtitle data in which the second display start time and the second display end time are associated with the subtitle text extracted by the subtitle extraction unit.