JP7070564B2

JP7070564B2 - Information processing equipment, information recording media, information processing methods, and programs

Info

Publication number: JP7070564B2
Application number: JP2019519529A
Authority: JP
Inventors: 幸一内村; 徹知念; 光行畠中
Original assignee: Sony Corp; Sony Group Corp
Current assignee: Sony Corp; Sony Group Corp
Priority date: 2017-05-24
Filing date: 2018-04-25
Publication date: 2022-05-18
Anticipated expiration: 2038-04-25
Also published as: JPWO2018216424A1; WO2018216424A1

Description

The present disclosure relates to an information processing device, an information recording medium, an information processing method, and a program. More specifically, it relates to an information processing device, an information recording medium, an information processing method, and a program that take as input data in the MMT (MPEG Media Transport) format, which is being standardized as a future data transmission standard for broadcast waves and the like, record that data on media, and make it playable.

Standardization is currently underway to enable data transmission of high-quality images, such as broadcasts of 4K and 8K images. As part of this work, a data distribution method using the MMT (MPEG Media Transport) format is under study.

The MMT format defines a data transfer method (transport format) for transmitting data via broadcast waves or networks. The data includes the encoded data that makes up content, such as images (video), audio, and subtitles, together with control information (SI: Signaling Information) consisting of various management information such as control information and attribute information.

The MMT format is scheduled to be used for broadcasting next-generation content such as 4K images and high dynamic range (HDR) images.

The MPEG-2 TS format is currently in wide use as a transmission format for images (video), audio, subtitles, and the like, and as a data recording format for media.
The BDMV and BDAV standards (formats) are widely used as recording and playback application standards (formats) compatible with the MPEG-2 TS format.

BDMV and BDAV are application standards for data recording and playback that mainly use the BD (Blu-ray (registered trademark) Disc). These standards are not limited to BDs, however, and can also be applied to data recording and playback using other media such as flash memory and hard disks.
A data recording and playback configuration using a BD is described in, for example, Patent Document 1 (Japanese Unexamined Patent Publication No. 2011-023071).

BDMV is an application standard developed for BD-ROMs on which content such as movies is recorded in advance, and it is widely used mainly on non-rewritable BD-ROMs such as packaged content.
BDAV, on the other hand, is a standard developed mainly for data recording and playback using rewritable BD-RE discs, write-once BD-R discs, and the like. BDAV is used, for example, to record and play back video shot by a user with a video camera or to record and play back television broadcasts.

To record content distributed in the MMT format described above on an information recording medium (media) and play it back from the media with a playback application compatible with the BDAV format, the data must be recorded according to the BDAV format.

Discussions are currently underway on a configuration that extends the BDAV format so that MMT format data can be recorded and played back.
For example, consider the case where an information processing device such as a television receives distribution data conforming to the MMT format from a broadcasting station or the like and records the received data on a recording medium such as a BD, flash memory, or HDD (hard disk). The discussion is moving toward recording the image, audio, and subtitle data and the control information (SI) on the media as a sequence of packets that store data conforming to the MMT format.

Specifically, the discussion is moving toward recording on the media a sequence of MMTP (MMT Protocol) packets, or a sequence of TLV (Type Length Value) packets, which are higher-level packets that carry MMTP packets.

MMTP packets and TLV packets store the playback data, namely images, audio, and subtitles, as well as control information (SI: Signaling Information) consisting of various management information.
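The idea of a recorded packet sequence can be illustrated with a generic type-length-value framing. The sketch below is a minimal illustration only: the 1-byte type and 2-byte big-endian length header, the type values, and the function names `pack_tlv` and `iter_tlv` are assumptions made for this example and are not the header layout of the broadcast TLV packet, which this description does not specify.

```python
import struct
from typing import Iterator, Tuple

# Hypothetical header for illustration: 1-byte type, 2-byte big-endian length.
# This is NOT the broadcast TLV packet header.
HEADER = struct.Struct(">BH")


def pack_tlv(packet_type: int, payload: bytes) -> bytes:
    """Frame one payload as type + length + value."""
    return HEADER.pack(packet_type, len(payload)) + payload


def iter_tlv(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Walk a recorded byte sequence and yield (type, value) pairs in order."""
    offset = 0
    while offset < len(stream):
        if offset + HEADER.size > len(stream):
            raise ValueError("truncated header")
        packet_type, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        if offset + length > len(stream):
            raise ValueError("truncated value")
        yield packet_type, stream[offset:offset + length]
        offset += length
```

Because each packet carries its own length, a reader can skip packets it does not need (for example, jump over media payloads to reach control information) without decoding them.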

For example, suppose MMTP packets or TLV packets storing content such as images, audio, and subtitles are recorded on a recording medium such as a BD (Blu-ray (registered trademark) Disc), flash memory, or HDD (hard disk). To play back the content from the media with a playback application compatible with the BDAV format described above, the data must be recorded according to the BDAV format.

The BDAV format defines database files, such as playlist files and clip information files, as playback control information files. A BDAV-compatible playback application refers to these playback control information files (database files) when it executes data playback.
MMT format data must therefore also be played back using the playback control information recorded in these playlist files and clip information files.
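How a playback application might consult these database files can be sketched as follows. This is a simplified model under stated assumptions: the class names, the fields `stream_file`, `clip_name`, `in_time`, and `out_time`, and the function `resolve` are hypothetical and do not reproduce the BDAV syntax.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ClipInfo:
    stream_file: str  # hypothetical: file holding the recorded packet sequence


@dataclass(frozen=True)
class PlayItem:
    clip_name: str    # hypothetical: key of the clip information file to use
    in_time: int      # playback start, in arbitrary ticks
    out_time: int     # playback end, in arbitrary ticks


@dataclass(frozen=True)
class PlayList:
    items: List[PlayItem]


def resolve(playlist: PlayList, clips: Dict[str, ClipInfo]) -> List[Tuple[str, int, int]]:
    """Turn a playlist into an ordered list of (stream file, start, end) to play."""
    plan = []
    for item in playlist.items:
        clip = clips[item.clip_name]  # KeyError if the clip information is missing
        plan.append((clip.stream_file, item.in_time, item.out_time))
    return plan
```

The point of the sketch is the indirection: the player never scans the stream to decide what to play; it reads the playlist, follows it to the clip information, and only then opens the recorded stream.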

However, data distributed in the MMT format does not include the playlist files or clip information files defined by the BDAV format.
Therefore, to record MMT format data on media and play back the content with a BDAV-compatible application, the playlist files and clip information files defined by the BDAV format must be generated and recorded on the media.
At present, however, no concrete procedure for this processing has been established.

Patent Document 1: Japanese Unexamined Patent Publication No. 2011-023071

The present disclosure has been made in view of, for example, the problems described above. Its object is to provide an information processing device, an information recording medium, an information processing method, and a program that take distribution data conforming to the MMT format as input, generate the database files defined by the BDAV format, record them on media, and make the content recorded on the media playable by using those database files.

A first aspect of the present disclosure is an information processing device having a data processing unit that takes MMT format data as input and generates recording data according to the BDAV format or the SPAV format, each of which is a data recording format for an information recording medium.
The data processing unit executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data.
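The recording-side behavior of the first aspect can be sketched as below. This is a minimal illustration, not the disclosed file syntax: the enum `AudioCodec`, the numeric values in `AUDIO_ID`, the field name `audio_identification`, and the function `build_clip_info` are all hypothetical names chosen for this example.

```python
from enum import Enum
from typing import Dict, Union


class AudioCodec(Enum):
    MPEG4_AAC_LC = "aac_lc"  # (a) MPEG4 AAC LC (LATM/LOAS)
    MPEG4_ALS = "als"        # (b) MPEG4 ALS (LATM/LOAS)


# Hypothetical identification values; the real values are defined by the format.
AUDIO_ID: Dict[AudioCodec, int] = {
    AudioCodec.MPEG4_AAC_LC: 0x01,
    AudioCodec.MPEG4_ALS: 0x02,
}


def build_clip_info(clip_name: str, codec: AudioCodec) -> Dict[str, Union[str, int]]:
    """Build a database record that carries the audio identification information."""
    return {
        "clip_name": clip_name,
        "audio_identification": AUDIO_ID[codec],
    }
```

Writing the identifier once at recording time means a later reader of the database file can tell the two codecs apart without inspecting the audio stream itself.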

A second aspect of the present disclosure is an information processing device having a data processing unit that executes playback of data recorded on an information recording medium.
The information recording medium stores MMT format data recorded according to the BDAV format or the SPAV format.
The data processing unit is configured to play back the MMT format data recorded on the information recording medium by using the information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format.
The data processing unit acquires from the database file audio identification information that indicates whether the audio data to be played back is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data,
and decodes the audio data according to the acquired information.
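The playback-side behavior of the second aspect amounts to a lookup followed by a dispatch. The sketch below assumes a database record with a hypothetical field `audio_identification` whose values 0x01 and 0x02 stand for the two codecs; these values, the placeholder decoder functions, and `select_decoder` are illustrative assumptions, not part of the BDAV or SPAV definitions.

```python
from typing import Callable, Dict, Mapping


def decode_aac_lc(data: bytes) -> str:
    # Placeholder standing in for an MPEG4 AAC LC decoder.
    return f"AAC-LC decoder received {len(data)} bytes"


def decode_als(data: bytes) -> str:
    # Placeholder standing in for an MPEG4 ALS decoder.
    return f"ALS decoder received {len(data)} bytes"


# Hypothetical identification values mapped to decoders.
DECODERS: Dict[int, Callable[[bytes], str]] = {
    0x01: decode_aac_lc,
    0x02: decode_als,
}


def select_decoder(clip_info: Mapping[str, int]) -> Callable[[bytes], str]:
    """Pick the decoder from the identifier stored in the database record."""
    audio_id = clip_info["audio_identification"]
    if audio_id not in DECODERS:
        raise ValueError(f"unknown audio identification: {audio_id:#x}")
    return DECODERS[audio_id]
```

The decoder is chosen before any audio payload is read, which is what lets the player start decoding without first probing the stream.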

A third aspect of the present disclosure is an information recording medium on which recording data conforming to the BDAV format or the SPAV format has been recorded.
The medium has a configuration in which audio identification information is recorded in a database file defined by the BDAV format or the SPAV format, the audio identification information making it possible to identify whether the audio data recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data.
The medium thereby enables a playback device to identify the type of audio data to be played back according to the audio identification information recorded in the database file.

A fourth aspect of the present disclosure is an information recording medium on which recording data conforming to the BDAV format or the SPAV format is to be recorded.
The medium has a database file, defined by the BDAV format or the SPAV format, that contains audio identification information making it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data.
The medium thereby enables a playback device to identify the type of audio data to be played back according to the audio identification information in the database file.

A fifth aspect of the present disclosure is an information processing method executed in an information processing device.
The information processing device has a data processing unit that takes MMT format data as input and generates recording data according to the BDAV format or the SPAV format, each of which is a data recording format for an information recording medium.
In the method, the data processing unit executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data.

A sixth aspect of the present disclosure is an information processing method executed in an information processing device.
The information processing device has a data processing unit that executes playback of data recorded on an information recording medium.
The information recording medium stores MMT format data recorded according to the BDAV format or the SPAV format.
In the method, the data processing unit plays back the MMT format data recorded on the information recording medium by using the information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format.
During playback, the data processing unit acquires from the database file audio identification information that indicates whether the audio data to be played back is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data,
and decodes the audio data according to the acquired information.

A seventh aspect of the present disclosure is a program that causes an information processing device to execute information processing.
The information processing device has a data processing unit that takes MMT format data as input and generates recording data according to the BDAV format or the SPAV format, each of which is a data recording format for an information recording medium.
The program causes the data processing unit to execute a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data.

An eighth aspect of the present disclosure is a program that causes an information processing device to execute information processing.
The information processing device has a data processing unit that executes playback of data recorded on an information recording medium.
The information recording medium stores MMT format data recorded according to the BDAV format or the SPAV format.
The program causes the data processing unit to play back the MMT format data recorded on the information recording medium by using the information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format.
During playback, the program causes the data processing unit to acquire from the database file audio identification information that indicates whether the audio data to be played back is
(a) MPEG4 AAC LC (LATM/LOAS) encoded audio data, or
(b) MPEG4 ALS (LATM/LOAS) encoded audio data,
and to decode the audio data according to the acquired information.

The program of the present disclosure can be provided, for example, by a storage medium or a communication medium that supplies it in a computer-readable format to an information processing device or computer system capable of executing various program codes. Providing the program in a computer-readable format allows processing according to the program to be realized on the information processing device or computer system.

Further objects, features, and advantages of the present disclosure will become apparent from the more detailed description based on the embodiments of the present disclosure described later and the accompanying drawings. In this specification, a system is a logical set of a plurality of devices, and the constituent devices are not limited to those housed in the same enclosure.

According to the configuration of one embodiment of the present disclosure, it is possible to quickly confirm whether MMT format audio data recorded on media is MPEG4 AAC LC audio data or MPEG4 ALS audio data.
Specifically, for example, a clip information file is generated that records audio identification information indicating whether the MMT format audio data to be recorded on the media is MPEG4 AAC LC audio data or MPEG4 ALS audio data, and the file is recorded on the media. Based on the audio identification information recorded in the clip information file, a playback device identifies whether the audio data to be played back is MPEG4 AAC LC audio data or MPEG4 ALS audio data, and can decode and play it back quickly.
The effects described in this specification are merely examples and are not limiting, and there may be additional effects.

The drawings, in order, are as follows.
A diagram explaining a usage configuration example of an information processing device that executes the processing of the present disclosure.
A diagram explaining the MMT format.
A diagram explaining an example of an image data storage configuration according to the MMT format.
A diagram explaining the BDAV format.
A diagram explaining an example of data playback processing according to the BDAV format.
A diagram explaining the MPEG-2 TS format.
A diagram explaining the MMT format.
A diagram explaining the SPAV format.
A diagram explaining an example of processing in which data received from a broadcasting station or the like is recorded on an information recording medium (media) as an MMTP packet sequence, which is MMT format data.
A diagram explaining an example of processing in which data received from a broadcasting station or the like is recorded on an information recording medium (media) as a TLV packet sequence storing MMTP packets, which are MMT format data.
A diagram explaining an example of processing in which MMT format data is recorded as BDAV format data.
A diagram showing the data structure (syntax) of the clip information file.
A diagram showing the data structure (syntax) of the program information [ProgramInfo()] of the clip information file.
A diagram explaining an example of the data structure of the coding information of the clip information file.
A diagram explaining an example of recording audio information data in the clip information file.
A diagram explaining an example of the data structure of the coding information of a clip information file in which identification information of MMT format audio data is recorded.
A diagram explaining an example of the MMT format audio identification information recorded in the clip information file.
A diagram explaining a configuration example of an audio data storage packet.
A diagram explaining an example of the data structure of the coding information of a clip information file in which identification information of MMT format audio data is recorded.
A diagram explaining an example of the MMT format audio identification information recorded in the clip information file.
A diagram showing the data structure (syntax) of the MMT package table (MPT).
A diagram explaining specific examples of the asset type (asset_type) recorded in the MMT package table (MPT).
A diagram showing the audio attribute information recorded in the MMT package table (MPT).
A diagram showing specific examples of the (M1) stream content information recorded in the MMT package table (MPT).
A diagram showing specific examples of the (M2) component type recorded in the MMT package table (MPT).
A diagram showing specific examples of the (M3) sampling frequency recorded in the MMT package table (MPT).
A diagram explaining a configuration example of an information processing device that executes data recording processing on an information recording medium (media).
A flowchart explaining the processing sequence of data recording processing on an information recording medium (media).
A flowchart explaining the processing sequence of data recording processing on an information recording medium (media).
A flowchart explaining the processing sequence of data recording processing on an information recording medium (media).
A diagram explaining a configuration example of an information processing device that executes data playback processing from an information recording medium (media).
A flowchart explaining the processing sequence of data playback processing from an information recording medium (media).
A flowchart explaining the processing sequence of data playback processing from an information recording medium (media).
A diagram explaining a hardware configuration example of an information processing device applied to the processing of the present disclosure.

The information processing device, information recording medium, information processing method, and program of the present disclosure are described in detail below with reference to the drawings. The description follows these items.
1. Configuration example of the communication system
2. MMT (MPEG Media Transport) format
3. BDAV format and SPAV format
4. Processing when recording MMT format data according to the BDAV format
5. Audio data according to the MMT format
6. Examples of recording audio identification information in the clip information file
6-1. (Example 1) Example in which audio identification information recorded in an audio data storage packet is acquired and recorded in the clip information file
6-2. (Example 2) Example in which audio identification information recorded in the MMT package table (MPT) is acquired and recorded in the clip information file
7. Configuration and processing of an information processing device that executes data recording processing on an information recording medium
8. Configuration and processing of an information processing device that executes data playback processing from an information recording medium
9. Configuration example of the information processing device
10. Summary of the configuration of the present disclosure

[1. Configuration example of the communication system]
First, with reference to FIG. 1, an example of a communication system is described as one usage configuration of an information processing device that executes the processing of the present disclosure.
The information processing device 30 shown in FIG. 1 accepts media such as a BD (Blu-ray (registered trademark) Disc), flash memory, or hard disk (HDD). It executes data recording onto the mounted media, data playback from the mounted media, data copying to other media, and the like.

The data that the information processing device 30 records on the media is, for example, transmission content provided by a transmission device 20 such as a broadcasting station (broadcast server) 21 or a data distribution server 22. A specific example is a broadcast program provided by a television station.
This transmission content is sent from the transmission device 20 to the information processing device 30 via broadcast waves or a network such as the Internet.

The information processing device 30 is, for example, a recording and playback device 31, a television 32, a PC 33, or a mobile terminal 34. These information processing devices accept various media such as a BD (Blu-ray (registered trademark) Disc) 41, an HDD (hard disk) 42, and a flash memory 43, and execute data recording onto the media, editing of the data recorded on the media, data playback from the media, data copying between media, and the like.

Data transmission from the transmission device 20 to the information processing device 30 is executed according to the MMT (MPEG Media Transport) format.
The MMT format defines the data transfer method (transport format) used when encoded data that makes up content, such as images (video), audio, and subtitles, is transmitted via broadcast waves or a network.

The transmission device 20 encodes the content data, generates a data file containing the coded data and its metadata, stores the generated coded data in MMTP (MMT Protocol) packets defined in MMT, and transmits them via broadcast waves or a network.

The data that the transmission device 20 provides to the information processing device 30 consists of reproduction target data such as images, audio, and subtitles, together with control information (SI: Signaling Information) composed of guidance information such as a program guide, notification information, and various management information such as control messages.

[2. About the MMT (MPEG Media Transport) format]
As described above, data transmission from the transmission device 20 to the information processing device 30 is executed according to the MMT (MPEG Media Transport) format.
The MMT format will be described with reference to FIG. 2 and subsequent figures.

FIG. 2 is a diagram showing the stack model of the MMT format.
In the MMT stack model shown in FIG. 2, the lowest layer is the physical layer (PHY). The physical layer is divided into a broadcast (Broadcasting) layer that performs broadcast-related processing and a broadband (Broadband) layer that performs network-related processing.
MMT thus enables processing that uses two communication networks, a broadcast system and a network system.

Above the physical layer (PHY) is the TLV (Type Length Value) layer. TLV is a format specification layer that defines a multiplexing method for IP packets. A plurality of IP packets are multiplexed and transmitted as TLV packets. TLV-SI is the transmission layer for control information (SI), such as control messages, that follows the TLV format.

The control information (SI) is composed of setting information required for receiving and reproducing content (programs) on the information processing device 30 side, guidance information such as a program guide, notification information, control information, and management information.
The control information (SI) stored in TLV packets generated in the TLV layer is TLV-SI, which mainly consists of control information related to reception processing.
The control information (SI) stored in MMTP packets, which are packets generated according to the MMT protocol (MMTP), is the MMT-SI shown in the uppermost layer, which mainly consists of control information related to reproduction control.

A UDP/IP layer is set on the TLV layer.
Strictly speaking, the UDP/IP layer can be divided into an IP layer and a UDP layer; it is the layer that defines transmission in which a UDP packet is stored in the payload of an IP packet.
An MMT layer and a File delivery method layer are set on the UDP/IP layer.
Two methods can be used together: storing MMTP packets in IP packets for transmission, and transmitting data as IP packets using the File delivery method, a data transmission method that does not use MMTP packets.

The following layers are set on the MMT layer.
A layer of image (Video) data, which is coded image data according to HEVC (High Efficiency Video Coding), an image coding standard.

A layer of audio (Audio) data encoded according to one of the following audio coding standards:
(1) coded audio data according to MPEG4 AAC LC (Advanced Audio Coding Low Complexity), or
(2) coded audio data according to MPEG4 ALS (Audio Lossless Coding).

Furthermore, the following data are each stored in MMTP packets and transmitted:
subtitle (Subtitle) data, which is coded subtitle data according to TTML (Timed Text Markup Language), a subtitle coding standard,
control information (MMT-SI) transmitted using MMTP packets, and
various applications described according to HTML5 (HyperText Markup Language 5).

The control information (MMT-SI) is control information (signaling information) transmitted in MMTP packets. It is composed of setting information required for reproducing content (programs) on the information processing device 30 side and various management information such as guidance information (for example, a program guide), notification information, and control information.

Note that the time information (NTP: Network Time Protocol) is absolute time information, and is stored directly in UDP packets and transmitted.
Other data services (Data service) that distribute data, content download (Content download, etc.), and the like can be distributed using a file delivery method (File delivery method) that differs from MMT.

As shown in FIG. 2, images, audio, subtitles, control information (MMT-SI) composed of various management information such as notification information and control information, and applications are transmitted in MMTP packets.

A specific configuration example of the MMTP packet will be described with reference to FIG. 3.
FIG. 3 shows the following four types of data configuration examples.
(a) MPU (Media Presentation Unit)
(b) MMTP payload
(c) MMTP packet
(d) TLV packet

(d) The TLV packet is the packet transmitted via broadcast waves or a network, and the header information of a UDP header, an IP header, and a TLV header is set in it. A separate TLV packet is set for each data type.

That is, one type of data is stored in the TLV payload of one TLV packet. Specifically, for example, an image (V), audio (A), a subtitle (S), or control information (SI) composed of various management information is stored individually.
The control information (SI) includes control information stored in MMTP packets (MMT-SI) and control information transmitted in TLV packets (TLV-SI), and these are stored in separate TLV packets.

An example of the TLV payload, which is the payload of the TLV packet, is the MMTP packet shown in FIG. 3(c).
The MMTP packet shown in FIG. 3(c) is composed of an MMTP header and an MMTP payload.

One type of data is stored in the MMTP payload of one MMTP packet. Specifically, for example, any one of an image (V), audio (A), a subtitle (S), or control information stored in MMTP packets (MMT-SI) is stored in an individual MMTP packet.

FIGS. 3(a) and 3(b) show the detailed structure of the image data stored in the MMTP payload of the MMTP packet shown in FIG. 3(c).
FIG. 3(b) shows only those MMTP packets of FIG. 3(c) whose MMTP payload is image data (V).
The MMTP payload shown in FIG. 3(b) is composed of a header and a data unit.

As shown in FIG. 3(a), the data unit stores image data and the following parameters.
AU Delimiter (Access Unit Delimiter)
SPS (Sequence Parameter Set)
PPS (Picture Parameter Set)
SEIs (Supplemental Enhancement Information)
These parameters are used for image display.

The MPU (Media Presentation Unit) shown in FIG. 3(a) is one data processing unit of reproduction target data such as images, audio, and subtitles in the MMT format. The example shown in FIG. 3(a) is an MPU of image data, and is the same unit as a GOP (Group of Pictures), the unit of encoding and decoding processing.

In this way, image data, for example, is divided into the parameters defined in the MMT format and the image configuration data as shown in FIG. 3(a), stored in the MMTP payload shown in FIG. 3(b), and configured as the MMTP packet shown in FIG. 3(c).
Further, the MMTP packet is set as the payload of the TLV packet shown in FIG. 3(d), and the TLV packet is transmitted via broadcast waves or a network.

For audio, subtitles, and MMT-SI data as well, MMTP packets and TLV packets are set per data type and transmitted.
TLV-SI is stored in TLV packets and transmitted without being stored in MMTP packets.
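The nesting described for FIG. 3 (data unit, MMTP payload, MMTP packet, TLV packet, with TLV-SI bypassing the MMTP level) can be pictured with a minimal Python sketch. All class, field, and function names here are hypothetical, and the real header layouts are not modelled; the sketch only illustrates which container wraps which.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical, simplified model of the nesting in FIG. 3.
# Header fields are placeholders and do not follow the real bit layouts.

@dataclass
class MmtpPayload:
    data_type: str    # "V", "A", "S" or "MMT-SI": one type per payload
    data_unit: bytes  # e.g. parameters (SPS, PPS, ...) or coded picture data

@dataclass
class MmtpPacket:
    packet_id: int
    payload: MmtpPayload

@dataclass
class TlvPacket:
    # Either one MMTP packet, or raw bytes standing in for TLV-SI,
    # which is carried directly in the TLV payload.
    payload: Union[MmtpPacket, bytes]

def wrap(data_type: str, data: bytes, packet_id: int = 0) -> TlvPacket:
    """Wrap one piece of data in the containers used for its data type."""
    if data_type == "TLV-SI":
        return TlvPacket(data)
    if data_type not in {"V", "A", "S", "MMT-SI"}:
        raise ValueError(f"unknown data type: {data_type}")
    return TlvPacket(MmtpPacket(packet_id, MmtpPayload(data_type, data)))

video = wrap("V", b"\x00\x00\x01", packet_id=0x100)
signalling = wrap("TLV-SI", b"\x01\x02")
print(isinstance(video.payload, MmtpPacket))  # True
print(isinstance(signalling.payload, bytes))  # True
```

Keeping one data type per packet, as the text describes, means a receiver in this model can route a packet by inspecting only its outermost payload.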

[3. About the BDAV format and the SPAV format]
Next, the BDAV format and the SPAV format, which are recording data formats used when distribution content according to the above-mentioned MMT format is recorded on and reproduced from media such as a BD (Blu-ray (registered trademark) Disc), a flash memory, or an HDD (hard disk), will be described with reference to FIG. 4 and subsequent figures.

For example, when content such as images, audio, and subtitles is reproduced from media such as a BD (Blu-ray (registered trademark) Disc), a flash memory, or an HDD, reproduction control information and index information for reproducing the content are required. Reproduction control information and index information are generally called database files.
This reproduction control information and index information differ depending on the reproduction application that executes the reproduction processing of the data recorded on the media.

As described above, the current recording/reproduction application standards (= data recording formats) include the BDMV and BDAV standards. These application standards were established as data recording/reproduction application standards that mainly use a BD (Blu-ray (registered trademark) Disc).

BDMV and BDAV are application standards, and data recording formats (standards), for data recording and reproduction that mainly use BDs. However, these standards are not limited to BDs and can also be applied to data recording and reproduction using other media such as flash memory.
BDMV is an application standard developed for BD-ROMs on which, for example, movie content is recorded in advance, and is widely used mainly for non-rewritable BD-ROMs carrying packaged content.
BDAV, on the other hand, is a standard developed for application to data recording and reproduction processing that mainly uses rewritable BD-RE discs, write-once BD-R discs, and the like. BDAV is used, for example, to record and reproduce video shot by a user with a video camera or the like, and to record and reproduce television broadcasts.

In order to reproduce content from media on which distribution content according to the above-mentioned MMT format is recorded by using a reproduction application that supports the BDAV format, the data must be recorded according to the BDAV format.
As described above, the BDAV format defines playlist files, clip information files, and the like as recording files for reproduction control information, and a BDAV-compatible reproduction application executes data reproduction processing using the information recorded in these reproduction control information files (database files).

FIG. 4 is a diagram showing an example of the directory structure of data recorded on the information recording medium (media) 40 according to the BDAV format.
As shown in FIG. 4, files storing various management information, reproduction control information, and reproduction target data are set in the directory.

The management information files are composed of, for example, the info file (info), the menu file (menu), the mark file (mark), and the like shown in FIG. 4. These mainly store management information for titles shown to the user.
As reproduction control information files, for example, the following files are recorded:
playlist files (playlist), and
clip information files (clipinf).
Further, clip AV stream files (stream) are recorded as reproduction data storage files.

The playlist file is a file that defines the reproduction order and the like of content according to the program information of the reproduction program designated by a title, and has, for example, designation information for a clip information file in which reproduction position information and the like are recorded.
The clip information file is a file designated by the playlist file, and has reproduction position information and the like for the clip AV stream file.

The clip AV stream file is a file that stores the AV stream data to be reproduced and management information. It is composed of packets that store data such as the images, audio, and subtitles to be reproduced, as well as management information.
The management information defined in the MPEG-2 TS format and recorded in the clip AV stream file includes, for example, PSI/SI (Program Specific Information/Service Information).

Conventional broadcast data and network distribution data are MPEG-2 TS format data composed of TS (Transport Stream) packets, but future data including high-definition images such as 4K and 8K images is expected to be MMT format data composed of the MMTP packets described above.

FIG. 4 shows the following two types of stream files as clip AV stream files (stream):
a stream file (nnnnn.m2ts) consisting of MPEG-2 TS format data composed of TS packets, and
MMT format data (nnnnn.mmtv) composed of MMTP packets.

The directory example shown in FIG. 4 applies to a setting in which, when the data received by the information processing device 30 is MPEG-2 TS format data, it is recorded on the media as MPEG-2 TS format data as it is, and when the received data is MMT format data, it is recorded on the media as MMT format data.
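As a rough illustration of this record-as-received setting, the following Python sketch picks the stream file name from the received format. The function name is hypothetical, and it is an assumption of this sketch that "nnnnn" stands for a five-digit clip number and that the stream files sit in a directory named "stream"; only the two extensions come from the file names shown for FIG. 4.

```python
from pathlib import PurePosixPath

# Extensions taken from the two stream file names shown for FIG. 4.
EXTENSIONS = {"MPEG-2 TS": ".m2ts", "MMT": ".mmtv"}

def clip_stream_path(clip_number: int, received_format: str) -> PurePosixPath:
    """Build the stream file path for a clip, keeping the received format as-is."""
    if received_format not in EXTENSIONS:
        raise ValueError(f"unsupported format: {received_format}")
    # Assumption: "nnnnn" is a five-digit number and "stream" is the directory name.
    return PurePosixPath("stream") / f"{clip_number:05d}{EXTENSIONS[received_format]}"

print(clip_stream_path(1, "MMT"))        # stream/00001.mmtv
print(clip_stream_path(2, "MPEG-2 TS"))  # stream/00002.m2ts
```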

Two types of processing are conceivable for generating the clip AV stream file when MMT format data is recorded on media:
(1) converting the MMT format to the MPEG-2 TS format to generate the clip AV stream file, or
(2) generating a clip AV stream file consisting of a sequence of packets that store data according to the MMT format.

Currently, discussions are proceeding in the direction of process (2) above, that is, generating a clip AV stream file consisting of a sequence of packets that store data according to the MMT format and recording it on the media.
Specifically, discussions are proceeding in the direction of recording the data as a packet sequence of MMTP (MMT Protocol) packets, or of TLV (Type Length Value) packets, which are the higher-level packets of the MMTP packets.
A specific example will be described in detail later.

The management information files, playlist files, and clip information files are files that store management information applied to the reproduction processing of the images, audio, subtitles, and other reproduction data stored in the clip AV stream file. These files store reproduction control information, attribute information of the reproduction data, and the like, and are called database files.

The sequence for reproducing content recorded on the information recording medium is as follows.
(a) First, the reproduction application designates a specific playlist from the management information file.
(b) The clip information file specified in the selected playlist is selected and, according to the data recorded in the clip information file, the AV stream (reproduction data such as images and audio) or commands are read from the clip AV stream file, and reproduction of the AV stream or execution of the commands is started.
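Steps (a) and (b) amount to following two references in turn. The Python sketch below shows that chain using in-memory dictionaries as stand-ins for the database files; the dictionary layout, keys, and values are all hypothetical and are not the on-disc file structures.

```python
# Hypothetical in-memory stand-ins for the database files; names and values are made up.
playlists = {"00001": {"clip": "00001"}}
clip_infos = {"00001": {"stream_file": "00001.m2ts", "start_offset": 0}}

def start_playback(playlist_name):
    """Follow playlist -> clip information -> clip AV stream, as in steps (a) and (b)."""
    playlist = playlists[playlist_name]        # (a) designate a playlist
    clip_info = clip_infos[playlist["clip"]]   # (b) select the clip information file
    # The clip information tells the player which stream file to read and where to start.
    return clip_info["stream_file"], clip_info["start_offset"]

print(start_playback("00001"))  # ('00001.m2ts', 0)
```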

FIG. 5 is a diagram explaining the correspondence among the following data recorded on the information recording medium (media) 40:
playlist files,
clip information files, and
clip AV stream files.

The AV stream consisting of the actual reproduction target data, such as images, audio, and subtitles, is recorded as a clip AV stream (Clip AV Stream) file, and a playlist (PlayList) file and a clip information (Clip Information) file are defined as the management information and reproduction control information files for these AV streams.

As shown in FIG. 5, these categories of files can be divided into the following two layers:
a playlist layer containing playlist (PlayList) files, and
a clip layer consisting of clip AV stream (Clip AV Stream) files and clip information (Clip Information) files.

One clip information (Clip Information) file is associated with one clip AV stream (Clip AV Stream) file. Such a pair is regarded as one object and is collectively called a clip (Clip) or a clip file.
Detailed information on the data contained in the clip AV stream file, for example management information such as an EP map recording the I-picture position information of MPEG data, is recorded in the clip information file.

The clip AV stream (Clip AV Stream) file is composed of TS packets when it is MPEG-2 TS format data.
When it is MMT format data, it is composed of MMTP packets.

The clip information (Clip Information) file stores management information for acquiring, for example, the reproduction start position of the data stored in the clip AV stream file. An example is data that associates data positions in the byte sequence of the clip AV stream file with reproduction time positions, such as entry points (EP), which are reproduction start points when the data is laid out on the time axis.

The playlist has information that points to the access points corresponding to the reproduction start position and reproduction end position of a clip (Clip) by time stamps, which are information on the time axis.
For example, by referring to the clip information file on the basis of a time stamp indicating the elapsed reproduction time from the start of the content, it is possible to acquire the data read position in the clip AV stream file, that is, the address serving as the reproduction start point.
The clip information file (Clip Information file) is used to find, from this time stamp, the address at which decoding of the stream in the clip AV stream file should start.
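The time-stamp-to-address lookup described here can be sketched as a search over a sorted table of entry points. The table contents, the use of seconds and byte offsets as units, and the function name below are assumptions made for illustration; the actual EP map encoding is not shown in this text.

```python
import bisect

# Hypothetical EP map: (presentation time in seconds, byte offset) pairs, sorted by time.
ep_map = [(0.0, 0), (0.5, 48_000), (1.0, 101_000), (1.5, 150_500)]

def find_decode_start(ep_map, timestamp):
    """Return the byte offset of the last entry point at or before timestamp."""
    times = [t for t, _ in ep_map]
    i = bisect.bisect_right(times, timestamp) - 1
    if i < 0:
        raise ValueError("timestamp precedes the first entry point")
    return ep_map[i][1]

print(find_decode_start(ep_map, 1.2))  # 101000
```

Choosing the entry point at or before the requested time reflects that decoding has to begin at a position where a picture can be decoded on its own, such as the I-picture positions the EP map records.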

In this way, the playlist (PlayList) file has information designating reproduction sections of the reproducible data contained in the clip (= clip information file + clip AV stream file) layer.
One or more play items (PlayItem) are set in the playlist (PlayList) file, and each play item has information designating a reproduction section of the reproducible data contained in the clip (= clip information file + clip AV stream file) layer.
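A playlist that holds one or more play items, each naming a clip and a reproduction section, can be modelled as follows. The field names and the use of seconds for the section boundaries are assumptions of this sketch, not the recorded syntax of the PlayList file.

```python
from dataclasses import dataclass, field

@dataclass
class PlayItem:
    clip_name: str   # identifies a clip (clip information file + clip AV stream file)
    in_time: float   # start of the reproduction section, in seconds (assumed unit)
    out_time: float  # end of the reproduction section

@dataclass
class PlayList:
    items: list = field(default_factory=list)

    def total_duration(self) -> float:
        """Sum the lengths of all reproduction sections in play order."""
        return sum(item.out_time - item.in_time for item in self.items)

playlist = PlayList([PlayItem("00001", 0.0, 30.0), PlayItem("00002", 10.0, 25.0)])
print(playlist.total_duration())  # 45.0
```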

As described above, the clip AV stream (Clip AV Stream) file storing the reproduction target data is composed of TS packets when it is conventional MPEG-2 TS format data.
When it is MMT format data, as is expected for high-definition image data such as 4K and 8K images whose use is expected to expand, it is composed of MMTP packets.
The MMT format and the MPEG-2 TS format will be described with reference to FIGS. 6 and 7.

First, the MPEG-2 TS format will be described with reference to FIG. 6.
The MPEG-2 TS format defines a data storage format (container format) for coded data and the like, used when coded data constituting content, such as images (Video), audio (Audio), and subtitles (Subtitle), and management information (PSI/SI) are stored on a recording medium (media) or transmitted via broadcast waves or a network.

The MPEG-2 TS format is standardized in ISO 13818-1, and is used, for example, for data recording on a BD (Blu-ray (registered trademark) Disc), digital broadcasting, and the like.

FIGS. 6(a) to 6(c) are diagrams showing the structure of MPEG-2 TS format data.
FIG. 6(a), shown at the bottom, shows the overall structure of MPEG-2 TS format data.
As shown in FIG. 6(a), MPEG-2 TS format data is composed of a plurality of elementary streams (Elementary stream).
An elementary stream is a unit set for one kind of data, for example images, audio, or subtitles.

As shown in FIG. 6(b), one elementary stream (Elementary stream) is composed of one or more PES (Packetized Elementary Stream) packets.
Specifically, one elementary stream is composed of one or more PES packets that have payload type (Payload_type) = 0x0 and the same packet identifier (Packet_id).

As shown in FIG. 6(c), one PES packet is composed of one or more TS packets.
Specifically, one PES packet is composed of one or more TS packets that have payload type (Payload_type) = 0x0 and the same packet identifier (Packet_id).
Unlike the MMTP packet described above, the TS packet has a fixed length: the packet size of one TS packet is fixed at 188 bytes.
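Because every TS packet is 188 bytes, a recorded stream can be walked at a fixed stride. The Python sketch below does this and extracts each packet's identifier. The sync byte value 0x47 and the 13-bit PID position come from the MPEG-2 TS standard rather than from this text, and the helper names and synthetic test packets are made up for the example.

```python
TS_PACKET_SIZE = 188  # fixed packet size stated in the text
SYNC_BYTE = 0x47      # from the MPEG-2 TS standard, not from this text

def iter_ts_pids(data: bytes):
    """Yield the 13-bit PID of each 188-byte TS packet in data."""
    if len(data) % TS_PACKET_SIZE:
        raise ValueError("length is not a multiple of 188 bytes")
    for off in range(0, len(data), TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            raise ValueError(f"missing sync byte at offset {off}")
        # PID: low 5 bits of byte 1 followed by all 8 bits of byte 2.
        yield ((pkt[1] & 0x1F) << 8) | pkt[2]

def make_packet(pid: int) -> bytes:
    """Build a synthetic packet: 4-byte header plus zero padding (illustration only)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))

stream = make_packet(0x100) + make_packet(0x101)
print([hex(p) for p in iter_ts_pids(stream)])  # ['0x100', '0x101']
```

A fixed stride is what makes this loop possible without reading any length field; variable-length MMTP packets need their size carried explicitly.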

Next, the MMT (MPEG Media Transport) format will be described with reference to FIG. 7.
The MMT format was described earlier with reference to FIG. 3; the explanatory diagram of FIG. 7 presents it in a way that makes its correspondence with the MPEG-2 TS format of FIG. 6 easy to see.

As described above, the MMT format defines a data transfer method (transport format) for transmitting coded data that constitutes content, such as images (Video), audio (Audio), and subtitles (Subtitle), via broadcast waves or a network.
FIG. 7 is a diagram explaining the MMT format, which is a file format defined in ISO/IEC 23008-1.

FIGS. 7(a) to 7(c) show the structure of MMT format data.
FIG. 7(a), shown at the bottom, shows the overall structure of MMT format data.
As shown in FIG. 7(a), MMT format data is composed of a plurality of media presentation units (MPU: Media Presentation Unit).
An MPU is a unit set for one kind of data, for example images, audio, or subtitles. In the case of images, for example, one MPU corresponds to one GOP (Group of Pictures), which is one unit of MPEG compressed images.

As shown in FIG. 7(b), one MPU is composed of one or more media fragment units (MFU: Media Fragment Unit).
Specifically, one MPU is composed of one or more MFUs that have payload type (Payload_type) = 0x0 (MPU) and the same packet identifier (Packet_id).

As shown in FIG. 7(c), one MFU consists of one or more MMTP packets.
Specifically, one MFU consists of one or more MMTP packets that have payload type (Payload_type) = 0x0 (MPU) and share the same packet identifier (Packet_id).
MMTP packets are variable-length and can be set to various packet sizes.
Each MMTP packet consists of a header (MMTP header), which stores attribute information and the like, and a payload (MMTP payload), which stores the actual coded image data and the like.
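The grouping rule above (packets with Payload_type = 0x0 that share a Packet_id belong together) can be sketched as follows. This is a minimal illustration, not the format definition: the class name `MmtpPacket`, its field names, and the helper `group_by_packet_id` are hypothetical, and MFU/MPU boundaries within a group are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

PAYLOAD_TYPE_MPU = 0x0  # Payload_type value named in the text


@dataclass
class MmtpPacket:
    """Hypothetical in-memory view of one MMTP packet."""
    packet_id: int     # Packet_id
    payload_type: int  # Payload_type
    header: bytes      # MMTP header (attribute information)
    payload: bytes     # MMTP payload (coded data)


def group_by_packet_id(packets: Iterable[MmtpPacket]) -> Dict[int, List[MmtpPacket]]:
    """Collect MPU-type packets that share a Packet_id, keeping arrival order."""
    groups: Dict[int, List[MmtpPacket]] = {}
    for pkt in packets:
        if pkt.payload_type == PAYLOAD_TYPE_MPU:
            groups.setdefault(pkt.packet_id, []).append(pkt)
    return groups
```

Packets with any other payload type (for example, signaling information) are simply left out of the grouping in this sketch.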

A format similar to the BDAV format is the SPAV format. As described above, BDMV and BDAV are application standards for data recording and reproduction mainly using BDs. In contrast, the SPAV format is an application standard for data recording and reproduction mainly on hard disks.
However, both the BDAV format and the SPAV format can be used for recording and reproduction on various media such as BDs, flash memory, and HDDs.

SPAV format data can be recorded and reproduced by the same processing as data recording and reproduction in the BDAV format. However, the file naming of the SPAV format partly differs from that of the BDAV format.
FIG. 8 shows an example of the directory structure of the SPAV format.

As in the BDAV format described above with reference to FIG. 4, the SPAV format directory shown in FIG. 8 contains files storing various management information, playback control information, and data to be reproduced.

The management information files consist of, for example, the info file (INFO), menu file (MENU), and mark file (MARK) shown in FIG. 8. These mainly store management information on the titles shown to the user.
In addition, the following files, for example, are recorded as playback control information files:
Playlist file (PLAYLIST),
Clip information file (CLIPINF).
Further, a clip AV stream file (STREAM) is recorded as a playback data storage file.

As shown in FIG. 8, the directory names and file extensions of the SPAV format differ from those of the BDAV format described with reference to FIG. 4.
However, the data stored in each file and the role of each file are the same as in the BDAV format.

The following embodiments describe processing examples in which MMT format data is recorded and reproduced as BDAV format data, but they are equally applicable to recording and reproducing MMT format data as SPAV format data.

[4. Processing when recording MMT format data according to the BDAV format]
Next, processing for recording MMT format data according to the BDAV format will be described.

As described above, the MMT format is a data distribution format to be used for 4K images and the like that broadcasting stations and others plan to distribute, and it follows the protocol stack described with reference to FIG. 3.
The BDAV format, on the other hand, is a data recording format for media and, as described with reference to FIG. 4, defines database files including playback control information files such as playlist files and clip information files.
The BDAV format is both a data recording format and a data recording/playback application standard: data recorded on media according to the BDAV format is reproduced by a BDAV-compatible playback application.

Therefore, in order to record content distributed in the MMT format on media and reproduce it from the media with a BDAV-compatible playback application, the data must be recorded according to the BDAV format.

As mentioned above, provisions for extending the BDAV format so that MMT format data can be recorded and reproduced are currently under discussion.

As described above, a clip AV stream file for recording MMT format data on media can be generated in either of two ways:
(1) converting the MMT format into the MPEG-2 TS format to generate the clip AV stream file, or
(2) generating a clip AV stream file consisting of a sequence of packets that store data according to the MMT format.

Discussion is currently moving toward approach (2) above, that is, generating a clip AV stream file consisting of a sequence of packets that store data according to the MMT format and recording it on media.
Specifically, the direction under discussion is to record the data as a sequence of MMTP (MMT Protocol) packets, or of TLV (Type Length Value) packets, which are the higher-level packets that carry MMTP packets.

For example, when an information processing apparatus such as a recorder receives distribution data in the MMT format transmitted by a broadcasting station or the like and records it on media such as a BD, flash memory, or HDD (hard disk), the image, audio, and subtitle data, the management information (SI), and other data are recorded as they are, as a sequence of packets storing data according to the MMT format.

That is, the sequence of packets storing data according to the MMT format is recorded in a clip AV stream file [02001.mmtv, etc.] defined in the BDAV format shown in FIG. 4 or the SPAV format shown in FIG. 8.
For the playlist file and clip information file, which are the playback control information files corresponding to the clip AV stream file [02001.mmtv, etc.] storing MMT format data, the recording apparatus generates files in which control information for MMT format data is set and records them on the media.

Specifically, a clip AV stream file storing MMT format data can be recorded as a sequence of either of two packet types: MMTP (MMT Protocol) packets, or TLV (Type Length Value) packets, which are the higher-level packets of MMTP packets.

Specific examples of the recording structure of a clip AV stream file storing MMT format data will be described with reference to FIGS. 9 and 10.
FIG. 9 illustrates a processing example in which an MMTP (MMT Protocol) packet sequence according to the MMT format is recorded on recording media such as a BD, flash memory, or HDD (hard disk).

FIG. 9 shows the following three items of data.
(A) TLV packet sequence, which is the broadcast distribution data
(B) One TLV packet processed as received/reproduced data
(C) MMTP packet sequence proposed as a structure of media recording data

(A) The TLV packet sequence, which is the broadcast distribution data, is a sequence of TLV packets having the MMT (MPEG Media Transport) format described above with reference to FIG. 2.
This TLV packet sequence is transmitted from a transmission apparatus 20 such as a broadcasting station.

(B) The one TLV packet processed as received/reproduced data is a single TLV packet that an information processing apparatus 30 such as a television or recorder receives and reproduces. It shows the detailed structure of one of the TLV packets constituting the TLV packet sequence shown in (A).
It is a TLV packet having the MMT (MPEG Media Transport) format described above with reference to FIG. 2.

(C) The MMTP packet sequence shown as a structure of media recording data is one currently proposed as recording data for media.
As can be seen from the dotted lines showing the correspondence with FIG. 9(B), an MMTP packet recorded on the media is the MMTP packet that forms part of the data of a TLV packet, and consists of the following elements.
(a) MMTP packet header (MMTP_packet_header)
(b) MMTP packet data (MMTP_packet_data) (= payload)

The MMTP packet data (MMTP_packet_data) (= payload) in turn consists of the following elements.
(b1) MMTP payload header (MMTP_payload_header)
(b2) MMTP payload data (MMTP_payload_data)

One structure currently proposed as recording data for an information recording medium (media) is the one shown in FIG. 9(C), in which only the MMTP packets contained in the TLV packets are extracted and recorded in a single sequence.

FIG. 10 illustrates a processing example in which, instead of MMTP (MMT Protocol) packets, a sequence of the higher-level TLV packets containing the MMTP packets is recorded on recording media such as a BD, flash memory, or HDD (hard disk).

Like FIG. 9, FIG. 10 shows the following three items of data.
(A) TLV packet sequence, which is the broadcast distribution data
(B) One TLV packet processed as received/reproduced data
(C) TLV packet sequence proposed as a structure of media recording data

(A) and (B) are the same data as described with reference to FIG. 9.
(C) The TLV packet sequence shown as a structure of media recording data is another example currently proposed as recording data for media.
As can be seen from the dotted lines showing the correspondence with FIG. 10(B), a TLV packet recorded on the media is a TLV packet containing an MMTP packet, and consists of the following elements.
(a) TLV packet header (TLV_header)
(b) TLV packet data (TLV_data) (= payload)
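As a rough illustration of the header-plus-data structure listed above, the following sketch splits a byte buffer into TLV packets. The 4-byte header layout used here (one fixed byte, one packet-type byte, and a 16-bit big-endian data length) is an assumption for illustration and is not specified in this document; the function name `iter_tlv_packets` is likewise hypothetical. Extracting the MMTP packet nested inside each TLV packet, as in the FIG. 9 recording structure, is not shown.

```python
import struct
from typing import Iterator, Tuple

# Assumed header layout (not given in this document):
# 1 fixed byte, 1 packet-type byte, 2-byte big-endian data length.
TLV_HEADER_LEN = 4


def iter_tlv_packets(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (packet_type, TLV_data) for each complete TLV packet in buf."""
    pos = 0
    while pos + TLV_HEADER_LEN <= len(buf):
        _fixed, packet_type, length = struct.unpack_from(">BBH", buf, pos)
        start = pos + TLV_HEADER_LEN
        end = start + length
        if end > len(buf):
            break  # truncated final packet: stop rather than yield partial data
        yield packet_type, buf[start:end]
        pos = end
```

A recorder following the FIG. 10 structure would write each header and its data unchanged, so a splitter like this is only needed when the packets have to be inspected individually.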

Thus, MMT format data is expected to be recorded on media such as BDs as one of the following packet sequences:
the sequence of MMTP (MMT Protocol) packets described with reference to FIG. 9, or
the sequence of TLV (Type Length Value) packets described with reference to FIG. 10.

When MMT format data is recorded on media in the arrangement shown in FIG. 9 or FIG. 10 and is reproduced by a BDAV-compatible playback application, reproduction uses the BDAV playback control information files, namely the playlist file and the clip information file.
As noted above, the BDAV format is both a data recording format and a data recording/playback application standard, and data recorded on media according to the BDAV format is reproduced by a BDAV-compatible playback application.

The BDAV format defines database files specific to it, such as the playlist file and the clip information file, which are playback control information files, and a BDAV-compatible playback application executes data reproduction using the information recorded in these playback control information files (database files).

As described above, database files such as the playlist file and clip information file defined in the BDAV format were originally specified as files that can be generated from distribution data in the MPEG-2 TS format.
Consequently, if distribution data in the MMT format, which differs from the MPEG-2 TS format, is simply recorded in playlist and clip information files as defined in the current BDAV format, the result may be data that current BDAV-compatible playback applications cannot use.

To record MMT format data on media and reproduce the content with a BDAV-compatible application, a playlist file and a clip information file adapted to MMT format data must be generated and recorded on the media when the MMT format data is recorded.
Likewise, when MMT format data recorded on media is reproduced, the playlist file and clip information file adapted to MMT format data must be used.

Specifically, as shown in FIG. 11, a clip information file (nnnnn.clpi) 52 and a playlist file (nnnnn.rpls) 53 corresponding to the MMT format data storage clip AV stream file 51 must be generated, recorded on the media (BD, flash memory, HDD, etc.), and used for reproduction.

When the MMT format data storage clip AV stream file 51 recorded on the media is reproduced, the reproduction can be performed using the clip information file (nnnnn.clpi) 52 and the playlist file (nnnnn.rpls) 53 corresponding to that file.

However, as described above, data distributed by a broadcasting station in the MMT format is not structured to include all the data to be recorded in the playlist file and clip information file defined in the BDAV format.
Moreover, the MMT format data storage clip AV stream file 51 has a data format different from that of MPEG-2 TS format data, so correct reproduction cannot be performed with a playlist file or clip information file that has the same data format as those for MPEG-2 TS format data.

Therefore, to record this MMT format data on media and reproduce the content with a BDAV-compatible application, it is necessary to generate, and record on the media, a playlist file and a clip information file that have a data format specific to MMT format data and can control reproduction of the MMT format data storage clip AV stream file 51.

[5. Audio data according to the MMT format]
As described above with reference to FIGS. 9 and 10, when MMT format data distributed from a broadcasting station is recorded on media such as a BD as BDAV or SPAV format data, the playback data such as images, audio, and subtitles and data such as control information (SI) are recorded as, for example, an MMTP packet sequence or a TLV packet sequence.
These packet sequences are recorded as a clip AV stream file defined in the BDAV format or the SPAV format.

Also, as described with reference to FIG. 11, reproducing the playback data in the clip AV stream file 51, which consists of an MMTP or TLV packet sequence storing the MMT format data recorded on the media, requires the clip information file (nnnnn.clpi) 52 and the playlist file (nnnnn.rpls) 53 corresponding to that MMT format data storage clip AV stream file 51.

However, the clip information file and playlist file defined in the current BDAV and SPAV formats were originally specified as playback control information for MPEG-2 TS format data, and cannot be used as they are as playback control information for MMT format data.
Therefore, to record MMT format data on media and reproduce the content with a BDAV-compatible application, processing is required to generate the playlist file and clip information file defined by the BDAV format and record them on the media.

As described above with reference to FIG. 2, MMT format audio data distributed from, for example, a broadcasting station includes data according to the following two different audio coding standards.
(1) Coded audio data according to MPEG4 AAC LC (Advanced Audio Coding Low Complexity),
(2) Coded audio data according to MPEG4 ALS (Audio Lossless Coding).
The audio data distributed from a broadcasting station conforms to at least one of these two audio coding standards.

MPEG4 AAC LC is an improved coding format based on MPEG2 AAC, and is an audio coding format that achieves reduced coding delay, low-bit-rate coding, bit-rate scalability, and the like.
MPEG4 ALS, on the other hand, is an audio coding format that applies lossless (reversible) compression, which causes no data loss, and enables reproduction of high-quality audio data.

The two types of coded audio data described above are distributed in the MMT format as LATM/LOAS streams.
LATM (Low Overhead Audio Transport Multiplex) is a specification of multiplexing (ISO/IEC 14496-3), and
LOAS (Low Overhead Audio Stream) is a specification of the transfer method (ISO/IEC 14496-3).

That is, a coded audio data stream distributed in the MMT format is one of the following two streams.
(1) MPEG4 AAC LC (LATM/LOAS)
(2) MPEG4 ALS (LATM/LOAS)

For a single program, two settings are possible:
a setting in which only one type of audio data, either MPEG4 AAC LC or MPEG4 ALS, is distributed, and
a setting in which both MPEG4 AAC LC and MPEG4 ALS audio data are distributed so that the receiving side can select between them.

A transmission apparatus such as a broadcasting station stores audio data in MMTP packets and TLV packets and distributes it, as described above with reference to FIG. 3.
Whether the audio data stored in a packet is AAC LC coded or ALS coded audio data is recorded in, for example, an attribute data recording area within the packet storing the audio data, or in the MMT package table (MPT), which is control information (SI) for the MMT format.

A receiving apparatus such as a television that receives broadcast waves and reproduces them can refer to this in-packet data or to the MMT package table (MPT) to determine whether the data to be reproduced is AAC LC coded or ALS coded audio data, and decode it accordingly.

When MMT format data distributed from a broadcasting station or the like is recorded on media such as a BD as BDAV or SPAV format data, the playback data such as images, audio, and subtitles and data such as control information (SI) are recorded as an MMTP packet sequence or a TLV packet sequence.
These packet sequences are recorded as a clip AV stream file defined in the BDAV format or the SPAV format.

As described above with reference to FIGS. 4 and 5, an information processing apparatus (playback apparatus) that reproduces recorded data from media follows the procedure below.
(a) First, the playback application designates a specific playlist from the management information file.
(b) The clip information file specified in the selected playlist is selected and, according to the data recorded in the clip information file, the AV stream (playback data such as images and audio) or commands are read from the clip AV stream file, and reproduction of the AV stream or execution of the commands is started.
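The two-step lookup in this procedure (playlist, then clip information file, then clip AV stream file) can be sketched as follows. This is only an illustration of the reference chain: the classes `PlayList` and `ClipInfo`, their fields, and the function `resolve_stream` are hypothetical and do not reflect the actual BDAV file syntax.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ClipInfo:
    """Hypothetical stand-in for a clip information file."""
    stream_file: str  # clip AV stream file that this clip information describes


@dataclass
class PlayList:
    """Hypothetical stand-in for a playlist file."""
    clip_info_name: str  # clip information file referenced by the playlist


def resolve_stream(playlist_name: str,
                   playlists: Dict[str, PlayList],
                   clip_infos: Dict[str, ClipInfo]) -> str:
    """Follow playlist -> clip information -> clip AV stream file."""
    playlist = playlists[playlist_name]              # step (a): designate a playlist
    clip_info = clip_infos[playlist.clip_info_name]  # step (b): select its clip information
    return clip_info.stream_file                     # file from which the AV stream is read
```

For example, with a playlist entry pointing at a clip information entry whose `stream_file` is `02001.mmtv`, `resolve_stream` returns that stream file name.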

However, the playlist file and clip information file defined in the current BDAV and SPAV formats record no information for distinguishing between the two different types of coded audio data for the MMT format described above (MPEG4 AAC LC and MPEG4 ALS).

Consequently, unless the playback apparatus acquires an audio data storage packet from the clip AV stream file and refers to the data recorded in that packet, or refers to the MMT package table (MPT) recorded as control information (SI) in the clip AV stream file, it cannot determine whether the audio data in the packet is
(1) MPEG4 AAC LC coded audio data, or
(2) MPEG4 ALS coded audio data.

These two types of coded audio data must be decoded by different processes; that is, each must be decoded with the codec for its audio coding.
Therefore, when audio data recorded on media is reproduced, the codec to apply cannot be determined without referring to the data recorded in the audio data storage packets or to the MMT package table (MPT), which increases the delay before reproduction starts.

Also, as described above, both MPEG4 AAC LC and MPEG4 ALS audio data may be distributed as the audio data for a single program.
When both types of audio data are recorded on media, the user (viewer) may be asked to select the audio to be reproduced before content reproduction from the media starts.

Even in this case, however, the presence of two types of audio data cannot be confirmed without acquiring audio data storage packets from the clip AV stream file and referring to the data recorded in them, or referring to the MMT package table (MPT), which is the control information (SI) recorded in the clip AV stream file.
Such processing takes a considerable amount of time.

[6. Example of recording audio identification information in the clip information file]
As one configuration example that solves the problems described above, an example in which audio identification information is recorded in the clip information file is described below.

In the configuration described below, when a clip AV stream file recorded on media such as a BD is in the MMT format, the clip information file, which is the playback control information file for that clip AV stream file, records audio identification information indicating whether the audio data reproduced using the clip information file is
(1) MPEG4 AAC LC coded audio data, or
(2) MPEG4 ALS coded audio data.
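The benefit of such identification information is that the playback apparatus can pick a decoder from the clip information file alone, without first scanning packets or the MPT. A minimal sketch of that decision follows; the dictionary key `audio_coding`, the enum `AudioCoding`, and the returned decoder labels are all hypothetical names chosen for illustration and are not the field names used in the clip information file.

```python
from enum import Enum


class AudioCoding(Enum):
    MPEG4_AAC_LC = "MPEG4 AAC LC"
    MPEG4_ALS = "MPEG4 ALS"


def select_decoder(clip_info: dict) -> str:
    """Choose a decoder from audio identification info in the clip information.

    'audio_coding' is a hypothetical key standing in for the recorded
    audio identification information.
    """
    coding = clip_info.get("audio_coding")
    if coding is AudioCoding.MPEG4_AAC_LC:
        return "aac_lc_decoder"
    if coding is AudioCoding.MPEG4_ALS:
        return "als_decoder"
    # No identification recorded: fall back to inspecting the audio
    # packets or the MPT in the clip AV stream file (the slow path).
    return "scan_stream"
```

The fallback branch corresponds to the situation with current clip information files, where the codec can only be found by reading the stream itself.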

First, the data structure (syntax) of the clip information file is described with reference to FIG. 12.
As described above, the clip information file records information about the data reproduced using it, specifically, for example, playback position information for the clip AV stream file.

FIG. 12 shows the data structure (syntax) of one clip information file.
The clip information file records information about the playback data associated with the clip information. As shown in FIG. 12, the following items, for example, are recorded.
Clip information [ClipInfo()] 101,
Sequence information [SequenceInfo()] 102,
Program information [ProgramInfo()] 103.

The clip information [ClipInfo()] 101 records attribute information of the AV stream file corresponding to the clip information file.
The sequence information [SequenceInfo()] 102 records information about the playback sequence of the playback target data stored in the AV stream file corresponding to this clip information file.

The program information [ProgramInfo()] 103 records information about programs (program_sequence), including definition information for the playback sections and time sections of the clip AV stream played back using the clip information file.
Details of the information recorded in the program information [ProgramInfo()] 103 are described with reference to FIG. 13 and the subsequent figures.

FIG. 13 is a diagram showing the data structure (syntax) of the program information [ProgramInfo()].
The main data recorded in the program information [ProgramInfo()] is described below.

(a) The number of program sequences [num_of_program_sequences] 111 records the number of program sequences (program_sequence) included in the clip information file.
(b) The SPN program sequence start address [SPN_program_sequence_start[i]] 112 records the relative address at which the program sequence starts in the AV stream file.
(c) The program map PID [program_map_PID[i]] 113 records the identifier (PID) of the packet that stores the map of the program sequence (program_sequence).

(d) The stream PID [stream_PID] 114 records the identifier (PID) of the packet that stores the stream played back according to the program sequence of this clip information file.
(e) The stream coding information [StreamCodingInfo] 115 records coding (encoding) information for the stream to be played back.
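As a reading aid, the ProgramInfo() fields listed in (a) to (e) can be modeled as a small data structure. The following Python sketch is illustrative only: the class names, the grouping of streams under each program sequence, and the use of plain integers are assumptions made here, and the actual bit widths and field order are those defined in FIG. 13, which this sketch does not reproduce.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StreamEntry:
    """One stream referenced by a program sequence (hypothetical grouping)."""
    stream_pid: int            # (d) stream_PID
    stream_coding_type: int    # part of (e) StreamCodingInfo


@dataclass
class ProgramSequence:
    spn_program_sequence_start: int   # (b) relative start address in the AV stream file
    program_map_pid: int              # (c) PID of the packet holding the program map
    streams: List[StreamEntry] = field(default_factory=list)


@dataclass
class ProgramInfo:
    program_sequences: List[ProgramSequence] = field(default_factory=list)

    @property
    def num_of_program_sequences(self) -> int:
        # (a) derived from the list so the count cannot drift out of sync
        return len(self.program_sequences)
```

Deriving the count from the list, rather than storing it separately, is a design choice of this sketch and not something the clip information file format requires.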

In this way, the clip information file records the various information required to play back the data stored in the clip AV stream file associated with it as the playback target.

FIG. 14 is a diagram showing the data structure (syntax) of the stream coding information [StreamCodingInfo] set in the program information [ProgramInfo()] recording area of the clip information file.
The following data is recorded in the stream coding information [StreamCodingInfo]:
(1) Image stream coding information 121,
(2) Audio stream coding information 122.

Various encoding modes (coding types (stream_coding_type)) are permitted for the image streams and audio streams recorded on the information recording medium, and an identifier is defined in advance for each coding type, for both images and audio.
For images, type identifiers such as 0x01, 0x02, and 0x1B are assigned according to the coding type (encoding type).
For audio, type identifiers such as 0x03, 0x04, 0x0F, 0x80, and 0x81 are assigned according to the coding type (encoding type).

For example, the assignments are as follows:
coding type 0x01 is an MPEG-1 image stream,
0x02 is an MPEG-2 image stream,
0x03 is an MPEG-1 audio stream.

The MMT format uses HEVC coding, an image coding type that is not defined in the current BDAV format.
A new coding type identifier will therefore be assigned to this new coding type.

Similarly, for audio data, the MMT format uses encoded audio data that is not defined in the current BDAV format, namely:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data.

However, in ISO/IEC 13818-1, the MPEG-2 standard, the coding type identifier (stream_coding_type) for these two types of encoded audio data is the same; that is, both are assigned the same identifier.
Specifically, ISO/IEC 13818-1 sets coding type = 0x11 for all encoded audio data to which the above-mentioned LATM (Low Overhead Audio Transport Multiplex) is applied as the multiplexing process specified in ISO/IEC 14496-3.

The encoded audio data delivered in the MMT format is the following:
(1) MPEG-4 AAC LC (LATM/LOAS)
(2) MPEG-4 ALS (LATM/LOAS)
Both of these are encoded audio data to which LATM (Low Overhead Audio Transport Multiplex) is applied as the multiplexing process, and the same identifier [0x11] is set for both.

Therefore, when [0x11] is set as the coding type (stream_coding_type) in the clip information file recorded on the medium, a playback device that plays back BDAV format data can determine that the audio data is one of the following two types:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data,
but it cannot identify which of the two it is.
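The ambiguity described above can be made concrete with a small lookup table. The following Python sketch lists only the coding type identifiers that the text names explicitly; the table name and helper functions are hypothetical, and identifiers not mentioned in the text are deliberately left out rather than guessed.

```python
# Coding type identifiers named in the text; all other values are omitted.
STREAM_CODING_TYPES = {
    0x01: ("MPEG-1 image stream",),
    0x02: ("MPEG-2 image stream",),
    0x03: ("MPEG-1 audio stream",),
    # Both LATM-multiplexed formats share the same identifier.
    0x11: ("MPEG-4 AAC LC (LATM/LOAS)", "MPEG-4 ALS (LATM/LOAS)"),
}


def candidates(stream_coding_type: int) -> tuple:
    """Return every format that the given identifier may denote."""
    return STREAM_CODING_TYPES.get(stream_coding_type, ())


def is_ambiguous(stream_coding_type: int) -> bool:
    """True when the identifier alone does not pin down a single format."""
    return len(candidates(stream_coding_type)) > 1
```

With this table, `is_ambiguous(0x11)` is true while `is_ambiguous(0x03)` is false, which is the situation that the additional identification information is meant to resolve.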

As shown in FIG. 14, the following information is recorded in the audio stream coding information 122 set in the program information [ProgramInfo()] recording area of the current clip information file:
(2a) Audio type (audio_presentation_type) 131
(2b) Sampling frequency (sampling_frequency) 132

Recording examples of these data items are described with reference to FIG. 15.
The audio type data in FIG. 15(a) uses bit values 0 to 15 to identify various audio types such as mono channel, stereo, and surround.
The sampling frequency data in FIG. 15(b) uses bit values 0 to 15 to identify sampling frequencies from 48 to 192 kHz.
For these audio data attributes, data delivered in the MMT format also falls within the range defined by the current BDAV format, so no particular change is required.

However, with these data items alone, it is not possible to identify whether audio data for which [0x11] is set as the coding type (stream_coding_type) is
(1) MPEG-4 AAC LC encoded audio data, or
(2) MPEG-4 ALS encoded audio data.

Configuration examples of a new clip information file that solves the above problem are described below.
The following two embodiments are described in order.
(Embodiment 1) An embodiment in which the audio identification information is acquired from an audio data storage packet
(Embodiment 2) An embodiment in which the audio identification information is acquired from the MMT package table (MPT)

[6-1. (Embodiment 1) Embodiment in which audio identification information recorded in an audio data storage packet is acquired and recorded in the clip information file]
First, as Embodiment 1, an embodiment is described in which the audio identification information recorded in an audio data storage packet is acquired and recorded in the clip information file.

That is, this embodiment acquires, from the audio data storage packet, identification information indicating whether the audio data is
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data, or
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data,
and records it in the clip information file.

FIG. 16 is a diagram showing the data structure (syntax) of the stream coding information [StreamCodingInfo] in the program information [ProgramInfo()] of the clip information file according to Embodiment 1.

The stream coding information [StreamCodingInfo] shown in FIG. 16 is configured so that it is possible to identify whether the audio data controlled by this clip information file is
(1) MPEG-4 AAC LC encoded audio data, or
(2) MPEG-4 ALS encoded audio data.

As in the structure described above with reference to FIG. 14, the following data is recorded in the stream coding information [StreamCodingInfo] shown in FIG. 16:
(1) Image stream coding information 121,
(2) Audio stream coding information 122.

In the stream coding information [StreamCodingInfo] shown in FIG. 16, an audio data attribute information recording area 130 is set for
coding type (stream_coding_type) = [0x11].

As described above, coding type = [0x11] is a type identifier associated with both of the following two types of encoded audio data:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data.
The type identifier [0x11] alone therefore cannot identify which of (1) and (2) the encoded audio data is.

The following information is recorded in the audio data attribute information recording area 130 for coding type [0x11] in the stream coding information [StreamCodingInfo] shown in FIG. 16:
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Audio object type (audioObjectType) 133

Of these, data items (a) and (b), that is,
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
are the same as the data described above with reference to FIG. 15.

In the case of MMT format data received from a transmission device such as a broadcasting station, this information is recorded, for example, in the MMT package table (MPT), which is control information (SI) for the MMT format.
A recording device that records MMT format data on a medium such as a BD reads data items (a) and (b) from the MMT package table (MPT) and records them in the clip information file.
A specific example of the data recorded in the MMT package table (MPT) is described later.

As described with reference to FIGS. 14 and 15, the audio attribute information (a) and (b), that is,
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
is also set for audio data with coding types other than [0x11], that is, for audio data with coding types such as [0x03] to [0x84].

As shown in FIG. 16, for audio data of coding type [0x11], the following audio attribute information is additionally recorded:
(c) Audio object type (audioObjectType) 133

This audio object type (audioObjectType) 133 is used as the identifier, that is, the audio identification information, that distinguishes the following two types of encoded audio data covered by coding type = [0x11]:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data.

FIG. 17 shows a specific example of the data recorded as the audio object type (audioObjectType) 133.
Audio object type (audioObjectType) = 2 means that the encoded audio data of this coding type = [0x11] is MPEG-4 AAC LC (LATM/LOAS) encoded audio data.

On the other hand, audio object type (audioObjectType) = 36 means that the encoded audio data of this coding type = [0x11] is MPEG-4 ALS (LATM/LOAS) encoded audio data.

By referring to the recorded audio object type (audioObjectType), a playback device that plays back data recorded on a medium such as a BD can identify whether the audio data controlled by this clip information file is
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data, or
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.

In the case of MMT format data received from a transmission device such as a broadcasting station, this audio object type (audioObjectType) information is recorded in the audio data storage packet (MMTP packet / TLV packet).
A recording device that records MMT format data on a medium such as a BD reads the audio object type (audioObjectType) from the audio data storage packet received from the transmission device and records it in the clip information file.

An example of the configuration of an audio data storage packet that conforms to the MMT format and is transmitted by a transmission device such as a broadcasting station is described with reference to FIG. 18.
FIG. 18 shows an audio data storage packet 140 conforming to the MMT format.
As shown in FIG. 18, the packet storage data 141 of the audio data storage packet 140 consists of an encoded audio data stream and attribute information of the stream data.

The encoded audio data stream stored in the audio data storage packet 140 is one of the following:
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data,
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.

The audio data storage packet 140 also records attribute information for the audio data stored in the packet.
As shown in FIG. 18, the audio object type (audioObjectType) 142 is recorded as part of this audio data attribute information.

The audio object type (audioObjectType) 142 is set as follows.
When the data stored in the packet is MPEG-4 AAC LC (LATM/LOAS) encoded audio data, audio object type (audioObjectType) = 2 is set.
When the data stored in the packet is MPEG-4 ALS (LATM/LOAS) encoded audio data, audio object type (audioObjectType) = 36 is set.
These are the values described above with reference to FIG. 17.

A recording device that receives MMT format data transmitted from a transmission device such as a broadcasting station and records it on a medium such as a BD reads the value (2 or 36) of the audio object type (audioObjectType) from the received audio data storage packet and records this value in the audio object type 133 recording area of the clip information file shown in FIG. 16.
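The recording step described above can be sketched in Python as follows. The text does not state where in the packet's attribute information the audio object type sits, so this sketch assumes that the attribute information carries an AudioSpecificConfig as defined in ISO/IEC 14496-3, in which audioObjectType is a 5-bit field and the value 31 is an escape followed by a 6-bit extension (32 + extension). The function names and the dictionary used to stand in for the clip information recording area are hypothetical.

```python
def read_audio_object_type(audio_specific_config: bytes) -> int:
    """Read audioObjectType from the start of an AudioSpecificConfig (assumed layout)."""
    if len(audio_specific_config) < 2:
        raise ValueError("need at least 2 bytes of configuration data")
    bits = int.from_bytes(audio_specific_config[:2], "big")
    aot = bits >> 11                     # first 5 bits
    if aot == 31:                        # escape: 6-bit extension follows
        aot = 32 + ((bits >> 5) & 0x3F)
    return aot


def record_audio_object_type(audio_attrs: dict, audio_specific_config: bytes) -> dict:
    """Return a copy of the audio attribute record with audioObjectType filled in."""
    aot = read_audio_object_type(audio_specific_config)
    if aot not in (2, 36):               # only the two values named in the text
        raise ValueError(f"unexpected audioObjectType: {aot}")
    updated = dict(audio_attrs)
    updated["audioObjectType"] = aot
    return updated
```

The escape handling matters here because 36 does not fit in 5 bits, so an ALS stream would be signaled through the extension field under this assumed layout.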

By checking whether the recorded audio object type (audioObjectType) is "2" or "36", a playback device that plays back data recorded on a medium such as a BD can identify whether the audio data controlled by this clip information file is
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data, or
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.

That is, the playback device reads the value of the audio object type (audioObjectType) from the audio attribute information recording area for coding type = [0x11] in the stream coding information [StreamCodingInfo] set in the program information [ProgramInfo()] recording area of the clip information file.

If audio object type (audioObjectType) = 2, the playback device determines that the audio data of this coding type = [0x11] is MPEG-4 AAC LC (LATM/LOAS) encoded audio data.
If, on the other hand, audio object type (audioObjectType) = 36, it determines that the audio data of this coding type = [0x11] is MPEG-4 ALS (LATM/LOAS) encoded audio data.
Based on this determination, the playback device can begin preparing to decode the encoded audio data, for example by configuring the codec.
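The determination made by the playback device can be sketched as a small function. This is a minimal illustration under the values given in the text (coding type 0x11, audio object type 2 or 36); the function name and the returned labels are hypothetical, and a real player would configure a decoder rather than return a string.

```python
LATM_CODING_TYPE = 0x11


def determine_audio_format(stream_coding_type: int, audio_object_type: int) -> str:
    """Resolve the audio format for a coding type 0x11 stream from audioObjectType."""
    if stream_coding_type != LATM_CODING_TYPE:
        raise ValueError("audioObjectType is only recorded for coding type 0x11")
    if audio_object_type == 2:
        return "MPEG-4 AAC LC (LATM/LOAS)"
    if audio_object_type == 36:
        return "MPEG-4 ALS (LATM/LOAS)"
    raise ValueError(f"unsupported audioObjectType: {audio_object_type}")
```

Raising an error for any other value, instead of falling back to a default codec, keeps an unexpected identifier from being decoded with the wrong settings.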

For a single piece of content such as one program, both of the following types of audio data (audio streams) may be delivered:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data.
This is the case, for example, when the user (viewer) is allowed to select which of the two to play back.

A recording device that receives multiple sets of audio data (audio streams) in this way from a transmission device such as a broadcasting station and records them on a medium such as a BD generates the stream coding information [StreamCodingInfo] shown in FIG. 16 individually for each audio stream and stores all of them in one clip information file.

In this case, a playback device that plays back data recorded on a medium such as a BD finds entries with both
audio object type (audioObjectType) = 2 and
audio object type (audioObjectType) = 36,
and can therefore determine that both of the following types of audio data can be selectively played back by applying this clip information file:
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data,
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.

In this case, the playback device displays, for example, guidance information on the display unit informing the user (viewer) that the following two types of audio data can be played back:
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data,
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.
The playback device can then have the user enter a selection and play back the audio that the user selected.
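The selection flow above can be sketched as follows. The sketch assumes that each audio stream has its own StreamCodingInfo entry in one clip information file, as the text describes; the class, the field names other than those in the text, and the index-based selection are hypothetical simplifications of a real user interface.

```python
from dataclasses import dataclass
from typing import List

AUDIO_LABELS = {
    2: "MPEG-4 AAC LC (LATM/LOAS)",
    36: "MPEG-4 ALS (LATM/LOAS)",
}


@dataclass(frozen=True)
class AudioStreamCodingInfo:
    stream_pid: int
    stream_coding_type: int
    audio_object_type: int


def selectable_streams(entries: List[AudioStreamCodingInfo]) -> List[AudioStreamCodingInfo]:
    """Keep only coding type 0x11 entries whose audio object type is recognized."""
    return [e for e in entries
            if e.stream_coding_type == 0x11 and e.audio_object_type in AUDIO_LABELS]


def guidance(entries: List[AudioStreamCodingInfo]) -> List[str]:
    """Labels to show the viewer, in the order the entries were recorded."""
    return [AUDIO_LABELS[e.audio_object_type] for e in selectable_streams(entries)]


def select(entries: List[AudioStreamCodingInfo], choice: int) -> AudioStreamCodingInfo:
    """Return the entry the viewer picked from the guidance list."""
    return selectable_streams(entries)[choice]
```

For example, with one entry per format, `guidance` yields both labels and `select(entries, 1)` returns the second entry, whose stream PID identifies the packets to decode.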

[6-2. (Embodiment 2) Embodiment in which audio identification information recorded in the MMT package table (MPT) is acquired and recorded in the clip information file]
Next, as Embodiment 2, an embodiment is described in which the audio identification information recorded in the MMT package table (MPT), which is control information (SI) for the MMT format, is acquired and recorded in the clip information file.

That is, this embodiment acquires, from the MMT package table (MPT), identification information indicating whether the audio data is
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data, or
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data,
and records it in the clip information file.

FIG. 19 is a diagram showing the data structure (syntax) of the stream coding information [StreamCodingInfo] in the program information [ProgramInfo()] of the clip information file according to Embodiment 2.

The new stream coding information [StreamCodingInfo] shown in FIG. 19 is configured so that it is possible to identify whether the audio data controlled by this clip information file is
(1) MPEG-4 AAC LC encoded audio data, or
(2) MPEG-4 ALS encoded audio data.

In the stream coding information [StreamCodingInfo] shown in FIG. 19, as in the data structure of FIG. 16 described for Embodiment 1, an audio data attribute information recording area 130 is set for
coding type (stream_coding_type) = [0x11].

The following information is recorded in the audio data attribute information recording area 130 for coding type [0x11] in the stream coding information [StreamCodingInfo] shown in FIG. 19:
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135

As described above, in the case of MMT format data received from a transmission device such as a broadcasting station, this information is recorded, for example, in the MMT package table (MPT), which is control information (SI) for the MMT format.
A recording device that records MMT format data on a medium such as a BD reads data items (a) and (b) from the MMT package table (MPT) and records them in the clip information file.
A specific example of the data recorded in the MMT package table (MPT) is described later.

As shown in FIG. 19, for audio data of coding type [0x11], the following audio attribute information is additionally recorded:
(c) Stream content (stream_content) 135

In Embodiment 2, this stream content (stream_content) 135 is used as the identifier, that is, the audio identification information, that distinguishes the following two types of encoded audio data covered by coding type = [0x11]:
(1) MPEG-4 AAC LC encoded audio data,
(2) MPEG-4 ALS encoded audio data.

FIG. 20 shows a specific example of the data recorded as the stream content (stream_content) 135.
Stream content (stream_content) = 0x3 means that the encoded audio data of this coding type = [0x11] is MPEG-4 AAC LC (LATM/LOAS) encoded audio data.

On the other hand, stream content (stream_content) = 0x4 means that the encoded audio data of this coding type = [0x11] is MPEG-4 ALS (LATM/LOAS) encoded audio data.

By referring to the recorded stream content (stream_content), a playback device that plays back data recorded on a medium such as a BD can identify whether the audio data controlled by this clip information file is
(1) MPEG-4 AAC LC (LATM/LOAS) encoded audio data, or
(2) MPEG-4 ALS (LATM/LOAS) encoded audio data.
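In Embodiment 2 the lookup is keyed on stream_content rather than on the audio object type. A minimal Python sketch, using only the two values given for FIG. 20, might look like the following; the enumeration and function name are hypothetical.

```python
from enum import Enum


class AudioCodec(Enum):
    AAC_LC = "MPEG-4 AAC LC (LATM/LOAS)"
    ALS = "MPEG-4 ALS (LATM/LOAS)"


# stream_content values given in the text for coding type 0x11.
STREAM_CONTENT_TO_CODEC = {
    0x3: AudioCodec.AAC_LC,
    0x4: AudioCodec.ALS,
}


def codec_from_stream_content(stream_content: int) -> AudioCodec:
    """Map a recorded stream_content value to the audio format it identifies."""
    try:
        return STREAM_CONTENT_TO_CODEC[stream_content]
    except KeyError:
        raise ValueError(f"unrecognized stream_content: {stream_content:#x}") from None
```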

In the case of MMT format data received from a transmission device such as a broadcasting station, this stream content (stream_content) information is recorded in the MMT package table (MPT), which is control information (SI) for the MMT format.
A recording device that records MMT format data on a medium such as a BD reads the stream content (stream_content) from the MMT package table (MPT) received from the transmission device and records it in the clip information file.

FIG. 21 is a diagram showing the data structure (syntax) of the MMT package table (MPT).
The MMT package table (MPT: MMT Package Table) defined in the MMT format is a table that records detailed attribute information (asset descriptors) for each data type (asset type), such as images, audio, and subtitles.

The MMT package table (MPT: MMT Package Table) 280 is stored in an MMTP packet with packet ID = 0x0000, so the information processing device can identify the packet that stores the MMT package table (MPT) from its packet ID.
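Locating the MPT by packet ID can be sketched as a simple filter. In this Python sketch each received MMTP packet is represented as a (packet_id, payload) pair; that representation and the function name are assumptions made for illustration, and parsing the MMTP header to obtain the packet ID is outside the scope of the sketch.

```python
from typing import Iterable, List, Tuple

MPT_PACKET_ID = 0x0000  # packet ID given in the text for MPT-carrying packets


def mpt_payloads(packets: Iterable[Tuple[int, bytes]]) -> List[bytes]:
    """Return the payloads of packets whose packet ID marks them as carrying the MPT."""
    return [payload for packet_id, payload in packets if packet_id == MPT_PACKET_ID]
```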

As shown in FIG. 21, the MMT package table (MPT) includes the following data recording areas:
Asset type (asset_type) 151,
Asset descriptor (asset_descriptors_byte) 152.

The asset type (asset_type) 151 is an area for recording an identifier for each data type, such as image, audio, or subtitle. An asset is a unit of data processing that shares common attributes; images, audio, subtitles, and so on are each set as separate assets.
FIG. 22 shows specific examples of the asset type (asset_type) recorded in the MMT package table (MPT).

As shown in FIG. 22, the asset types (asset_type) recorded in the MPT include, for example, the following:
hvc1: HEVC image
mp4a: Audio
stpp: Subtitles, etc.
aapp: Application
One of these type identifiers, for example, is recorded in the asset type (asset_type) 151 field of the MPT shown in FIG. 21.
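As an illustration of how a recorder might interpret the asset type field, the sketch below maps the four identifiers listed above to data categories. The dictionary and function names are hypothetical; only the four codes and their meanings come from the text, and any other code is simply reported as unknown.

```python
# Asset type identifiers listed in the text (FIG. 22) and their meanings.
ASSET_TYPES = {
    "hvc1": "HEVC image",
    "mp4a": "audio",
    "stpp": "subtitles",
    "aapp": "application",
}


def describe_asset_type(code: str) -> str:
    """Return the data category for an asset_type code, or 'unknown'."""
    return ASSET_TYPES.get(code, "unknown")


def is_audio_asset(code: str) -> bool:
    """True when the asset carries audio, i.e. the asset type is 'mp4a'."""
    return code == "mp4a"
```

A recorder that only needs audio attribute information could use `is_audio_asset` to decide whether the asset descriptor should be parsed as an audio component descriptor.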

The asset descriptor (asset_descriptors_byte) 152 field, shown in the lower part of the MPT in FIG. 21, records various attribute information appropriate to each asset type (for example, image, audio, or subtitle), such as resolution information in the case of an image.

FIG. 23 shows the audio attribute information recorded in the MMT package table (MPT) described with reference to FIG. 21.
That is, FIG. 23 shows an example of the data recorded as the asset descriptor (asset_descriptors_byte) 152 when the asset type (asset_type) 151 in the MPT shown in FIG. 21 is the audio type identifier (mp4a). The data shown in FIG. 23 is the data structure (syntax) of the audio component descriptor (MH-Audio_Component_Descriptor).

As shown in FIG. 23, the audio component descriptor (MH-Audio_Component_Descriptor) records the following audio attribute information:
(M1) Stream content information
(M2) Component type
(M3) Sampling frequency

An information processing apparatus that records data on an information recording medium (media) reads this audio attribute information from the MMT package table (MPT) and uses it as data to be recorded in the clip information file on the medium.

That is, the following items are recorded in the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] described above with reference to FIG. 19:
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135

The following items in the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] shown in FIG. 16, described above as Embodiment 1,
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
can likewise be recorded using the following data from the audio component descriptor (MH-Audio_Component_Descriptor) of the MMT package table (MPT) shown in FIG. 23:
(M2) Component type
(M3) Sampling frequency

Specific examples of the following data recorded in the MMT package table (MPT) shown in FIG. 23 are described below with reference to FIG. 24 and subsequent figures:
(M1) Stream content information 161
(M2) Component type 162
(M3) Sampling frequency 163

FIG. 24 shows a specific example of (M1) stream content information 161 recorded in the MMT package table (MPT).
The MMT package table (MPT) stores one of the values 0x0 to 0xF as stream content information, and each value indicates a corresponding encoding mode.

Specifically, the settings are as follows.
(1) Stream content (stream_content) = 0x2 means that the audio stream corresponding to this asset is an audio stream whose encoding method is not specified.
(2) Stream content (stream_content) = 0x3 means that the audio stream corresponding to this asset is an MPEG4 AAC (MPEG4 AAC LC (LATM/LOAS)) audio stream.
(3) Stream content (stream_content) = 0x4 means that the audio stream corresponding to this asset is an MPEG4 ALS (MPEG4 ALS (LATM/LOAS)) audio stream.
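The three settings above can be captured in a small lookup, shown here as a hedged sketch. The enum and function names are hypothetical; the values 0x2, 0x3, and 0x4 and the 0x0 to 0xF range are taken from the text, and every other value in that range is reported as "OTHER" because the text does not describe it.

```python
from enum import IntEnum


class StreamContent(IntEnum):
    """stream_content values described in the text (FIG. 24)."""
    UNSPECIFIED = 0x2  # encoding method not specified
    MPEG4_AAC = 0x3    # MPEG4 AAC LC (LATM/LOAS)
    MPEG4_ALS = 0x4    # MPEG4 ALS (LATM/LOAS)


def describe_stream_content(value: int) -> str:
    """Name the encoding indicated by a stream_content value.

    Values in 0x0..0xF that the text does not describe yield "OTHER";
    values outside that range are rejected.
    """
    if not 0x0 <= value <= 0xF:
        raise ValueError("stream_content must be in the range 0x0 to 0xF")
    try:
        return StreamContent(value).name
    except ValueError:
        return "OTHER"
```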

For example, a recording device that records MMT format data on a medium such as a BD reads this "stream content (stream_content)" from the MMT package table (MPT), which is control information (SI) defined for the MMT format and received from a transmitting device such as a broadcasting station, and records it in the clip information file.
Note that an asset in the MMT format corresponds to a stream in the MPEG format.

For example, suppose the value of the stream content (stream_content) recorded in the MMT package table (MPT) for one asset in the MMT format is 0x3. In this case, 0x3 is recorded in (c) stream content (stream_content) 135 of the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] shown in FIG. 19, which is the recording area for the attribute information of the audio stream corresponding to this asset.

On the other hand, suppose the value of the stream content (stream_content) recorded in the MMT package table (MPT) for one asset in the MMT format is 0x4. In this case, 0x4 is recorded in (c) stream content (stream_content) 135 of the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] shown in FIG. 19, which is the recording area for the attribute information of the audio stream corresponding to this asset.

In this way, a recording device that records MMT format data on a medium such as a BD stores the stream content (stream_content) information for each asset (stream) in the clip information file and records it on the medium.
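The per-asset recording described above might look like the following sketch. Both dataclasses are hypothetical stand-ins for the MPT audio component descriptor and the StreamCodingInfo audio area. Copying (M2) and (M3) without conversion is an assumption made for illustration; the text only states that stream_content values such as 0x3 and 0x4 are recorded as they are.

```python
from dataclasses import dataclass
from typing import Iterable, List

AUDIO_CODING_TYPE = 0x11  # coding type of the audio attribute area in the text


@dataclass(frozen=True)
class AudioComponentDescriptor:
    """Hypothetical holder for the (M1) to (M3) fields read from the MPT."""
    stream_content: int  # (M1)
    component_type: int  # (M2)
    sampling_rate: int   # (M3)


@dataclass(frozen=True)
class AudioStreamCodingInfo:
    """Hypothetical model of the audio area 130 of StreamCodingInfo."""
    coding_type: int
    audio_presentation_type: int  # (a) 131
    sampling_frequency: int       # (b) 132
    stream_content: int           # (c) 135


def to_stream_coding_info(desc: AudioComponentDescriptor) -> AudioStreamCodingInfo:
    """Build one StreamCodingInfo entry from one asset's MPT descriptor.

    Assumption: all three values are copied unchanged.
    """
    return AudioStreamCodingInfo(
        coding_type=AUDIO_CODING_TYPE,
        audio_presentation_type=desc.component_type,
        sampling_frequency=desc.sampling_rate,
        stream_content=desc.stream_content,
    )


def build_clip_audio_entries(
    descriptors: Iterable[AudioComponentDescriptor],
) -> List[AudioStreamCodingInfo]:
    """Create one entry per asset (stream), in the order the assets are given."""
    return [to_stream_coding_info(d) for d in descriptors]
```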

That is, the playback device reads the stream content (stream_content) value from the audio attribute information recording area for coding type [0x11] in the stream coding information [StreamCodingInfo] set in the program information [ProgramInfo()] recording area of the clip information file.

If stream content (stream_content) = 0x3, the playback device determines that the audio data of this coding type [0x11] is
(1) MPEG4 AAC LC (LATM/LOAS) coded audio data.
If, on the other hand, stream content (stream_content) = 0x4, the playback device determines that the audio data of this coding type [0x11] is
(2) MPEG4 ALS (LATM/LOAS) coded audio data.
Based on this determination, the playback device can promptly begin preparing to decode the coded audio data, for example by configuring the codec.

A recording device that receives multiple audio data items (audio streams) from a transmitting device such as a broadcasting station and records them on a medium such as a BD generates the stream coding information [StreamCodingInfo] shown in FIG. 19 separately for each audio data item (audio stream) and stores them all in a single clip information file.

In this case, a playback device that plays back data recorded on a medium such as a BD can determine from
stream content (stream_content) = 0x3 and
stream content (stream_content) = 0x4
that two types of audio data, namely
(1) MPEG4 AAC LC (LATM/LOAS) coded audio data and
(2) MPEG4 ALS (LATM/LOAS) coded audio data,
can be selectively played back using this clip information file.
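A playback-side reading of the clip information file could be sketched as below. Representing each StreamCodingInfo entry as a plain dictionary with `coding_type` and `stream_content` keys is an assumption made for illustration; the text does not define an in-memory layout.

```python
from typing import Dict, Iterable, List

AUDIO_CODING_TYPE = 0x11

# stream_content values that the text associates with a specific codec.
CODEC_BY_STREAM_CONTENT = {
    0x3: "MPEG4 AAC LC (LATM/LOAS)",
    0x4: "MPEG4 ALS (LATM/LOAS)",
}


def selectable_audio_codecs(stream_coding_infos: Iterable[Dict[str, int]]) -> List[str]:
    """List the audio codecs a player could select from one clip information file.

    Entries with another coding type, or with a stream_content value that the
    text does not map to a codec, are skipped. Order of first appearance is kept.
    """
    codecs: List[str] = []
    for info in stream_coding_infos:
        if info.get("coding_type") != AUDIO_CODING_TYPE:
            continue
        codec = CODEC_BY_STREAM_CONTENT.get(info.get("stream_content"))
        if codec is not None and codec not in codecs:
            codecs.append(codec)
    return codecs
```

With one entry for 0x3 and one for 0x4, the function reports both codecs, which corresponds to the selective playback case described above.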

Next, specific examples of the following two items shown in FIG. 23 and recorded in the MMT package table (MPT) are described with reference to FIGS. 25 and 26:
(M2) Component type 162
(M3) Sampling frequency 163

FIG. 25 shows a specific example of (M2) component type 162 recorded in the MMT package table (MPT).
The MMT package table (MPT) stores one of the bit values 00000 to 11111 as component type information 162, and each bit value is associated with an audio output type (component type) such as monaural or stereo.

FIG. 26 shows a specific example of (M3) sampling frequency 163 recorded in the MMT package table (MPT).
The MMT package table (MPT) stores one of the bit values 000 to 111 as sampling frequency information 163, and each bit value is associated with a sampling frequency from 16 to 48 kHz.

In this way, the MMT package table (MPT) included in the control information (signaling information (MMT-SI)) of the MMT format data records the following specific information:
(M1) Stream content information
(M2) Component type
(M3) Sampling frequency

Specifically, this information is used to record each of the following items in the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] shown in FIG. 19:
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135

Note that the following items in the audio data attribute information recording area 130 for coding type [0x11] of the stream coding information [StreamCodingInfo] of Embodiment 1, described above with reference to FIG. 16,
(a) Audio type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
are also recorded in the clip information file on the information recording medium (media) using the data recorded in the MMT package table (MPT).

As described in Embodiments 1 and 2 above, an information processing apparatus that inputs MMT format data and records it on an information recording medium (media) as BDAV format data acquires audio identification information indicating whether the audio data is
(1) MPEG4 AAC LC (LATM/LOAS) coded audio data or
(2) MPEG4 ALS (LATM/LOAS) coded audio data
as per-stream (per-asset) information from the audio data storage packet or the MMT package table (MPT), and records it in the clip information file.

The audio identification information mentioned above is one of the following:
the audio object type (audioObjectType) recorded in the audio data storage packet, described in Embodiment 1, or
the stream content (stream_content) recorded in the MMT package table (MPT), described in Embodiment 2.

In the above embodiments, the audio object type (audioObjectType) value (2 or 36) recorded in the audio data storage packet described in Embodiment 1, or the stream content (stream_content) value (0x3 or 0x4) recorded in the MMT package table (MPT), is recorded in the clip information file as it is. However, any pair of different values that distinguishes the two types of audio data may be used, and various settings are possible.
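Because any two distinct values suffice, a recorder could normalize both sources to one identifier of its own. The sketch below shows one such arrangement. The enum, the recorded values 0 and 1, and the function names are all hypothetical; only the source values (2 and 36 for audioObjectType, 0x3 and 0x4 for stream_content) come from the text.

```python
from enum import Enum


class AudioCodec(Enum):
    AAC_LC = "MPEG4 AAC LC (LATM/LOAS)"
    ALS = "MPEG4 ALS (LATM/LOAS)"


# Source values given in the text for each embodiment.
CODEC_BY_AUDIO_OBJECT_TYPE = {2: AudioCodec.AAC_LC, 36: AudioCodec.ALS}
CODEC_BY_STREAM_CONTENT = {0x3: AudioCodec.AAC_LC, 0x4: AudioCodec.ALS}

# Hypothetical values written to the clip information file; any two
# distinct values would serve the same purpose.
RECORDED_VALUE = {AudioCodec.AAC_LC: 0, AudioCodec.ALS: 1}


def recorded_value_from_audio_object_type(audio_object_type: int) -> int:
    """Embodiment 1 source: map audioObjectType (2 or 36) to the recorded value."""
    return RECORDED_VALUE[CODEC_BY_AUDIO_OBJECT_TYPE[audio_object_type]]


def recorded_value_from_stream_content(stream_content: int) -> int:
    """Embodiment 2 source: map stream_content (0x3 or 0x4) to the recorded value."""
    return RECORDED_VALUE[CODEC_BY_STREAM_CONTENT[stream_content]]
```

With this arrangement a player needs to understand only one identifier, regardless of which source the recorder used. Unsupported source values raise `KeyError` in this sketch.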

In Embodiments 1 and 2 described above, the MMT format definition data "audio object type (audioObjectType)" is recorded as it is in the clip information file shown in FIG. 16, and likewise the MMT format definition data "stream content (stream_content)" is recorded as it is in the clip information file shown in FIG. 19.
These data may also be expressed differently. For example, a newly defined field such as "AAC_AC_ALS_coding_type" may be set, and a value that distinguishes the two types of audio data (AAC LC and ALS) may be recorded in that field.

[7. Configuration and processing of an information processing apparatus that executes data recording processing on an information recording medium]
Next, the configuration and processing of an information processing apparatus that executes data recording processing on an information recording medium are described with reference to FIG. 27 and subsequent figures.

As described above, the information processing apparatus of the present disclosure records input data conforming to the MMT format on an information recording medium such as a BD or flash memory as BDAV format data or SPAV format data.
The information processing apparatus 300 records a clip AV stream file, together with database files such as a playlist and a clip information file, on the information recording medium (recording medium) 320.
The database files, such as the playlist and the clip information file, record image attribute information, audio attribute information, playback control information, and the like for the content recorded on the medium.

The information recorded in the database files, such as the playlist and the clip information file, is acquired from the packets that store playback stream data such as images and audio in the MMT format input data, and from the various information recording tables that make up TLV-SI and MMT-SI, which are the control information (SI) defined in the MMT format.

Specifically, for example, various information is acquired from the MMT package table (MPT) included in MMT-SI, and information about the content recorded on the medium is recorded in database files defined in the BDAV format, such as the playlist and the clip information file.

The following describes the process of generating an information recording medium on which such a playlist and clip information file are recorded, specifically the configuration and processing sequence of an information processing apparatus that records data on an information recording medium such as a BD.

FIG. 27 shows the configuration of the information processing apparatus 300, which executes data recording processing on an information recording medium such as a BD.
The information processing apparatus 300 records BDAV format data or SPAV format data on the information recording medium (recording medium) 320. That is, it records a clip AV stream file consisting of image and audio data, and database files such as a playlist file and a clip information file, which are playback control information files.

The data input unit 301 inputs the MMT format data 331 to be recorded on the information recording medium 320, that is, MMT format data 331 including image data, audio data, subtitle data, and the like.
The data input unit 301 comprises, for example, a receiving unit that receives data transmitted from a broadcasting station, a content server, or the like that transmits the MMT format data 331, or a media reading unit that reads data from a medium on which the MMT format data 331 is recorded.

The MMT format data 331 input from the data input unit 301 conforms to the data format described above with reference to FIG. 2, and includes, for example, high-definition image data such as HEVC images, MPEG4 AAC LC (LATM/LOAS) coded audio data, and MPEG4 ALS (LATM/LOAS) coded audio data.

The MMT format data 331 is stored in the storage unit 304 under the control of the control unit 303.
The user input unit 302 receives, for example, a request to start recording data on the information recording medium 320.

When a data recording start request is input from the user input unit 302, this input triggers the MMT format data 331 stored in the storage unit 304 to be input to the demultiplexer (DeMUX) 305.
The demultiplexer (DeMUX) 305 acquires, from the MMT format data 331, the packets storing image, audio, subtitle, and other data, as well as auxiliary information such as the signaling information (TLV-SI, MMT-SI) storing notification information, control information, and the like. It classifies the packets by data type and inputs each packet, according to its type, to the subtitle data generation unit 311, the image data generation unit 312, the audio data generation unit 313, or the auxiliary information generation unit 314 of the recording data generation unit 306.
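The classification performed by the demultiplexer can be sketched as a simple routing step. Representing each packet as a `(kind, payload)` tuple and naming the destinations with strings are assumptions made purely for illustration; the unit numbers follow the text, but nothing here reflects an actual packet parser.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Destination unit for each data type, following the unit numbers in the text.
ROUTES = {
    "subtitle": "subtitle_data_generation_unit_311",
    "image": "image_data_generation_unit_312",
    "audio": "audio_data_generation_unit_313",
    "signaling": "auxiliary_information_generation_unit_314",
}


def demultiplex(packets: Iterable[Tuple[str, bytes]]) -> Dict[str, List[bytes]]:
    """Group packet payloads by the generation unit that should receive them.

    Each packet is a hypothetical (kind, payload) pair. Packets of an
    unrecognized kind are dropped in this sketch.
    """
    routed: Dict[str, List[bytes]] = defaultdict(list)
    for kind, payload in packets:
        unit = ROUTES.get(kind)
        if unit is None:
            continue
        routed[unit].append(payload)
    return dict(routed)
```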

The subtitle data generation unit 311 acquires subtitle data from the MMT format data 331 that was input by the data input unit 301 and stored in the storage unit 304, and generates data for storage in a stream file as defined in the BDAV format.
The image data generation unit 312 acquires image data from the same MMT format data 331 and generates data for storage in a stream file as defined in the BDAV format.
The audio data generation unit 313 acquires audio data from the same MMT format data 331 and generates data for storage in a stream file as defined in the BDAV format.

The auxiliary information generation unit 314 acquires attribute information and playback control information for the playback data from the MMT format data 331 that was input by the data input unit 301 and stored in the storage unit 304, and generates the data to be stored in the playlist file and the clip information file, which are the database files defined in the BDAV format.

Specifically, for example, it acquires the data to be stored in the playlist file and the clip information file from the packets that store playback stream data such as images and audio in the MMT format, from the MMT package table (MPT), and so on, and generates the playlist file and the clip information file.

The multiplexer (MUX) 315 receives the subtitle, image, and audio data converted by the subtitle data generation unit 311, the image data generation unit 312, and the audio data generation unit 313, and generates a stream file storing these data.

As described above, two types of processing are assumed for the clip AV stream file when MMT format data is recorded on a medium:
(1) converting the MMT format to the MPEG-2 TS format to generate the clip AV stream file, or
(2) generating a clip AV stream file consisting of a sequence of packets that store data conforming to the MMT format (an MMTP packet sequence or a TLV packet sequence).
The recording data generation unit 306 generates the clip AV stream file according to either processing mode (1) or (2).

The database file generation unit 316 generates database files, such as the playlist file and the clip information file, that record the various information acquired by the auxiliary information generation unit 314 from the signaling information (TLV-SI, MMT-SI) of the MMT format data 331.

As described above, the clip information file records, as attribute information for the audio data, audio identification information that makes it possible to distinguish the following two types of coded audio data:
(1) MPEG4 AAC LC (LATM/LOAS)
(2) MPEG4 ALS (LATM/LOAS)
Specifically, a clip information file containing the recorded data shown in FIG. 16 or FIG. 19 is generated.

The recording data 332, which includes the stream file data generated by the recording data generation unit 306 and database files such as the playlist file and the clip information file, is output to and recorded on the information recording medium 320 by the recording unit 306 via the drive 307 under the control of the control unit 303.

Next, the sequence of the data recording processing on the information recording medium 320 executed by the information processing apparatus 300 shown in FIG. 27 is described with reference to the flowchart shown in FIG. 28.

The processing according to the flow shown in FIG. 28 can be executed, for example, under the control of a data processing unit (control unit) that includes a CPU with a program execution function, according to a program stored in the storage unit of the information processing apparatus.
The processing of each step in the flow of FIG. 28 is described below in order.

(Step S101)
First, in step S101, the information processing apparatus 300 inputs the MMT format data to be recorded via the data input unit 301.
This data includes image data, audio data, subtitle data, and signaling information (TLV-SI, MMT-SI) storing notification information, control information, and the like.

(Step S102)
Next, in step S102, the information processing apparatus 300 acquires the data to be played back, such as images and audio, from the input MMT format data and generates an AV stream file conforming to the BDAV format.

(Step S103)
Next, in step S103, the information processing apparatus 300 generates a clip information file using the constituent data of the input MMT format data.
Specifically, as described in Embodiments 1 and 2 above, it acquires data from the packets that store playback stream data such as images and audio in the MMT format, from the MMT package table (MPT), and so on, and generates the clip information file.
The details of the clip information file generation processing in step S103 are described later with reference to FIGS. 29 and 30.

(Step S104)
Next, in step S104, the information processing apparatus 300 generates a playlist file using the constituent data of the input MMT format data.

(Step S105)
Next, in step S105, the information processing apparatus 300 generates the other database files using the input MMT format data.

(Step S106)
Next, in step S106, the information processing apparatus 300 generates BDAV format data using the generated AV stream file and database files.

(Step S107)
Next, in step S107, the information processing apparatus 300 records the BDAV format data generated in step S106 on the information recording medium (media).
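The order of steps S101 to S107 can be summarized in a short sketch. Every data structure here is a hypothetical placeholder: dictionaries stand in for the AV stream file, the database files, and the BDAV format data, and a list stands in for the recording medium. Only the order of the steps and the inputs each step draws on follow the text.

```python
from typing import Any, Dict, List


def record_mmt_to_media(mmt_data: Dict[str, Any], media: List[Dict[str, Any]]) -> List[str]:
    """Run the S101 to S107 sequence on placeholder data.

    `mmt_data` is assumed to hold 'streams' (playback data) and 'signaling'
    (TLV-SI / MMT-SI content). Returns the executed step labels in order.
    """
    steps: List[str] = []

    # S101: input the MMT format data to be recorded.
    streams = list(mmt_data["streams"])
    signaling = dict(mmt_data["signaling"])
    steps.append("S101")

    # S102: generate the AV stream file from the playback data.
    av_stream_file = {"type": "clip_av_stream", "streams": streams}
    steps.append("S102")

    # S103: generate the clip information file from the MMT data.
    clip_info_file = {"type": "clip_info", "audio": signaling.get("audio", [])}
    steps.append("S103")

    # S104: generate the playlist file.
    playlist_file = {"type": "playlist", "items": len(streams)}
    steps.append("S104")

    # S105: generate the other database files.
    other_db_files = [{"type": "other_database"}]
    steps.append("S105")

    # S106: assemble BDAV format data from the stream and database files.
    bdav_data = {
        "stream": av_stream_file,
        "database": [clip_info_file, playlist_file, *other_db_files],
    }
    steps.append("S106")

    # S107: record the BDAV format data on the medium.
    media.append(bdav_data)
    steps.append("S107")
    return steps
```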

Next, the detailed sequence of the clip information file generation processing executed in step S103 is described with reference to the flowcharts shown in FIGS. 29 and 30.

FIG. 29 shows the detailed sequence of the clip information file generation processing according to Embodiment 1 described above, in which the audio identification information is acquired from the audio data storage packet.
FIG. 30 shows the detailed sequence of the clip information file generation processing according to Embodiment 2 described above, in which the audio identification information is acquired from the MMT package table (MPT).

First, the detailed sequence of the clip information file generation processing according to Embodiment 1, in which the audio identification information is acquired from the audio data storage packet, is described with reference to FIG. 29.
The processing of each step in the flow shown in FIG. 29 is described below in order.

（ステップＳ１２１）
まず、ステップＳ１２１において、情報処理装置３００は、クリップＡＶストリームファイルに記録する音声データ格納パケットを取得する。
この音声データ格納パケットは、先に図１８を参照して説明した音声データ格納パケット１４０に相当する。
図１８に示すように、音声データ格納パケット１４０のパケット格納データ１４１は、符号化音声データストリームと、このストリームデータの属性情報によって構成される。(Step S121)
First, in step S121, the information processing apparatus 300 acquires an audio data storage packet to be recorded in the clip AV stream file.
This voice data storage packet corresponds to the voice data storage packet 140 described above with reference to FIG. 18.
As shown in FIG. 18, the packet storage data 141 of the voice data storage packet 140 is composed of a coded voice data stream and attribute information of the stream data.

音声データ格納パケット１４０に格納されている符号化音声データストリームは、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれかの音声データである。The coded audio data stream stored in the audio data storage packet 140 is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
that is, one of these two types of audio data.

さらに、音声データ格納パケット１４０には、パケットに格納された音声データの属性情報が記録される。
図１８に示すように、音声データ属性情報の一部として、
音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１４２が記録されている。Further, the voice data storage packet 140 records the attribute information of the voice data stored in the packet.
As shown in FIG. 18, the voice object type (audioObjectType) 142 is recorded as part of the audio data attribute information.

（ステップＳ１２２）
次に、ステップＳ１２２において、情報処理装置３００は、取得した音声データ格納パケットから音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）の設定値を取得する。
図１８に示すように、音声データ属性情報の一部として、
音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１４２が記録されている。(Step S122)
Next, in step S122, the information processing apparatus 300 acquires the set value of the voice object type (audioObjectType) from the acquired voice data storage packet.
As shown in FIG. 18, the voice object type (audioObjectType) 142 is recorded as part of the audio data attribute information.

図１８に示す音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１４２の設定値は、以下の設定である。
パケットに格納されたデータが、ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データである場合、音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）＝２である。
一方、パケットに格納されたデータが、ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データである場合、音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）＝３６である。The setting value of the audio object type (audioObjectType) 142 shown in FIG. 18 is the following setting.
When the data stored in the packet is MPEG4 AAC LC (LATM / LOAS) coded audio data, the audio object type (audioObjectType) = 2.
On the other hand, when the data stored in the packet is MPEG4 ALS (LATM / LOAS) coded audio data, the audio object type (audioObjectType) = 36.
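
The mapping between the audioObjectType setting and the codec can be sketched as a small lookup. This is only an illustrative sketch: the values 2 and 36 come from the text above, while the names `AudioCodec` and `codec_from_audio_object_type` are hypothetical and do not appear in the disclosure.

```python
from enum import Enum


class AudioCodec(Enum):
    """Hypothetical labels for the two coded audio formats named in the text."""
    AAC_LC = "MPEG-4 AAC LC (LATM/LOAS)"
    ALS = "MPEG-4 ALS (LATM/LOAS)"


# audioObjectType settings stated in the description: 2 = AAC LC, 36 = ALS.
AUDIO_OBJECT_TYPE_TO_CODEC = {2: AudioCodec.AAC_LC, 36: AudioCodec.ALS}


def codec_from_audio_object_type(audio_object_type: int) -> AudioCodec:
    """Return the codec for an audioObjectType value, or raise ValueError."""
    try:
        return AUDIO_OBJECT_TYPE_TO_CODEC[audio_object_type]
    except KeyError:
        raise ValueError(
            f"unsupported audioObjectType: {audio_object_type}"
        ) from None
```

Any other audioObjectType value is rejected here because the text only discusses these two formats; a real device might handle further values differently.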

（ステップＳ１２３）
次に、ステップＳ１２３において、情報処理装置３００は、制御情報（ＳＩ）格納パケットから、ＭＭＴパッケージテーブル（ＭＰＴ）を取得する。(Step S123)
Next, in step S123, the information processing apparatus 300 acquires the MMT package table (MPT) from the control information (SI) storage packet.

ＭＭＴパッケージテーブル（ＭＰＴ）は、先に図２１～図２６を参照して説明した様々なデータを格納したテーブルである。
例えば、音声データについては、アセット（ストリーム）単位の属性情報として、図２３～図２６を参照して説明した
（Ｍ１）ストリームコンテンツ情報１６１、
（Ｍ２）コンポーネントタイプ１６２、
（Ｍ３）サンプリング周波数１６３、
これらの各データが記録されている。The MMT package table (MPT) is a table that stores various data described above with reference to FIGS. 21 to 26.
For example, for audio data, the following items described with reference to FIGS. 23 to 26 are recorded as attribute information for each asset (stream):
(M1) Stream content information 161,
(M2) Component type 162,
(M3) Sampling frequency 163.

（ステップＳ１２４）
次に、ステップＳ１２４において、情報処理装置３００は、ＭＰＴ（ＭＭＴパッケージテーブル）から、アセットタイプが音声であるアセット記述子を選択する。(Step S124)
Next, in step S124, the information processing apparatus 300 selects an asset descriptor whose asset type is voice from the MPT (MMT package table).

（ステップＳ１２５）
次に、ステップＳ１２５において、情報処理装置３００は、ＭＰＴのアセットタイプ＝音声のアセット記述子（ａｓｓｅｔ＿ｄｅｓｃｒｉｐｔｏｒ）から、以下の音声属性情報を取得する。
（Ｍ２）コンポーネントタイプ１６２、
（Ｍ３）サンプリング周波数１６３。(Step S125)
Next, in step S125, the information processing apparatus 300 acquires the following voice attribute information from the MPT asset type = voice asset descriptor (asset_descriptor).
(M2) Component type 162,
(M3) Sampling frequency 163.

（ステップＳ１２６）
次に、ステップＳ１２６において、情報処理装置３００は、音声データ格納パケットから取得した音声属性情報、すなわち、図１８に示す、
音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１４２、
さらに、ＭＰＴから取得した音声属性情報、すなわち、図２３～図２６を参照して説明した、
（Ｍ２）コンポーネントタイプ１６２、
（Ｍ３）サンプリング周波数１６３、
これらの各情報を、クリップ情報ファイルに設定したストリームコーディングタイプ＝０ｘ１１以下の記録フィールドに記録する。(Step S126)
Next, in step S126, the information processing apparatus 300 takes the voice attribute information acquired from the voice data storage packet, that is, the
audio object type (audioObjectType) 142
shown in FIG. 18, and the voice attribute information acquired from the MPT, that is, the following items described with reference to FIGS. 23 to 26:
(M2) Component type 162,
(M3) Sampling frequency 163,
and records each of these pieces of information in the recording fields under stream coding type = 0x11 set in the clip information file.

すなわち、先に図１６を参照して説明した以下の各データ
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１３３
これらの各情報を、クリップ情報ファイルに設定したストリームコーディングタイプ＝０ｘ１１以下の記録フィールドに記録する。
That is, the following data described above with reference to FIG. 16:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Audio object type (audioObjectType) 133
Each of these pieces of information is recorded in the recording fields under stream coding type = 0x11 set in the clip information file.

これらの処理により、所定の音声属性情報が記録されたクリップ情報ファイルが生成される。
ＢＤ等のメディアに記録されたデータの再生を行う再生装置は、クリップ情報ファイルに記録された音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）を参照することで、このクリップ情報ファイルの再生制御対象となる音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを識別することが可能となる。By these processes, a clip information file in which predetermined audio attribute information is recorded is generated.
By referring to the audio object type (audioObjectType) recorded in the clip information file, a playback device that reproduces data recorded on media such as a BD can identify whether the audio data subject to playback control by this clip information file is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.
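
The flow of FIG. 29 (steps S121 to S126) can be sketched as follows. This is a minimal sketch under stated assumptions: the classes `AudioPacket`, `MptAudioAsset`, and `ClipInfoAudioFields` and the function name are hypothetical stand-ins, and the sketch assumes that the component type from the MPT is copied into audio_presentation_type without conversion, which this passage does not spell out.

```python
from dataclasses import dataclass


@dataclass
class AudioPacket:
    """Hypothetical stand-in for voice data storage packet 140."""
    audio_object_type: int


@dataclass
class MptAudioAsset:
    """Hypothetical stand-in for an audio asset descriptor read from the MPT."""
    component_type: int
    sampling_frequency: int


@dataclass
class ClipInfoAudioFields:
    """Fields recorded under stream coding type = 0x11 (FIG. 16 layout)."""
    audio_presentation_type: int
    sampling_frequency: int
    audio_object_type: int


def build_clip_info_fields_example1(
    packet: AudioPacket, asset: MptAudioAsset
) -> ClipInfoAudioFields:
    """Combine the packet attribute (S122) with the MPT attributes (S125)."""
    return ClipInfoAudioFields(
        # Assumption: the component type is stored as-is in this field.
        audio_presentation_type=asset.component_type,
        sampling_frequency=asset.sampling_frequency,
        audio_object_type=packet.audio_object_type,
    )
```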

次に、図３０を参照して、先に説明した（実施例２）、すなわち、ＭＭＴパッケージテーブル（ＭＰＴ）から音声識別情報を取得する実施例に従ったクリップ情報ファイル生成処理の詳細シーケンスについて説明する。
図３０に示すフローの各ステップの処理について、順次、説明する。
Next, with reference to FIG. 30, a detailed sequence of the clip information file generation process according to the above-described (Example 2), that is, the embodiment of acquiring voice identification information from the MMT package table (MPT), will be described.
The processing of each step of the flow shown in FIG. 30 will be sequentially described.

（ステップＳ１３１）
まず、ステップＳ１３１において、情報処理装置３００は、制御情報（ＳＩ）格納パケットから、ＭＭＴパッケージテーブル（ＭＰＴ）を取得する。(Step S131)
First, in step S131, the information processing apparatus 300 acquires the MMT package table (MPT) from the control information (SI) storage packet.

（ステップＳ１３２）
次に、ステップＳ１３２において、情報処理装置３００は、ＭＰＴ（ＭＭＴパッケージテーブル）から、アセットタイプが音声であるデータを選択する。(Step S132)
Next, in step S132, the information processing apparatus 300 selects data whose asset type is voice from the MPT (MMT package table).

（ステップＳ１３３）
次に、ステップＳ１３３において、情報処理装置３００は、ＭＰＴのアセットタイプ＝音声のアセット記述子（ａｓｓｅｔ＿ｄｅｓｃｒｉｐｔｏｒ）から、以下の音声属性情報を取得する。
（Ｍ１）ストリームコンテンツ情報１６１、
（Ｍ２）コンポーネントタイプ１６２、
（Ｍ３）サンプリング周波数１６３。(Step S133)
Next, in step S133, the information processing apparatus 300 acquires the following voice attribute information from the MPT asset type = voice asset descriptor (asset_descriptor).
(M1) Stream content information 161
(M2) Component type 162,
(M3) Sampling frequency 163.

（ステップＳ１３４）
次に、ステップＳ１３４において、情報処理装置３００は、ＭＰＴから取得した音声属性情報、すなわち、図２３～図２６を参照して説明した、
（Ｍ１）ストリームコンテンツ情報１６１、
（Ｍ２）コンポーネントタイプ１６２、
（Ｍ３）サンプリング周波数１６３、
これらの各情報を、クリップ情報ファイルに設定したストリームコーディングタイプ＝０ｘ１１以下の記録フィールドに記録する。(Step S134)
Next, in step S134, the information processing apparatus 300 takes the voice attribute information acquired from the MPT, that is, the following items described with reference to FIGS. 23 to 26:
(M1) Stream content information 161,
(M2) Component type 162,
(M3) Sampling frequency 163,
and records each of these pieces of information in the recording fields under stream coding type = 0x11 set in the clip information file.

すなわち、先に図１９を参照して説明した以下の各データ
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）ストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）１３５
これらの各情報を、クリップ情報ファイルに設定したストリームコーディングタイプ＝０ｘ１１以下の記録フィールドに記録する。
That is, the following data described above with reference to FIG. 19:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135
Each of these pieces of information is recorded in the recording fields under stream coding type = 0x11 set in the clip information file.

これらの処理により、所定の音声属性情報が記録されたクリップ情報ファイルが生成される。
具体的には、ＢＤ等のメディアに記録されたデータの再生を行う再生装置は、クリップ情報ファイルに記録されたストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）を参照することで、このクリップ情報ファイルの再生制御対象となる音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを識別することが可能となる。By these processes, a clip information file in which predetermined audio attribute information is recorded is generated.
Specifically, by referring to the stream content (stream_content) recorded in the clip information file, a playback device that reproduces data recorded on media such as a BD can identify whether the audio data subject to playback control by this clip information file is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.
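
The flow of FIG. 30 (steps S131 to S134) can be sketched in a similar way. The dictionary layout used for the MPT, the `"audio"` marker for the asset type, and the function name are hypothetical, chosen only for illustration. As in the text, all three recorded values come from the MPT asset descriptor, and the sketch assumes they are copied without conversion.

```python
def audio_fields_from_mpt(mpt_assets):
    """Return one record per audio asset found in the MPT.

    mpt_assets is a hypothetical list of dicts of the form
    {"asset_type": ..., "asset_descriptor": {...}}.
    """
    records = []
    for asset in mpt_assets:
        # Step S132: keep only assets whose asset type is audio.
        if asset["asset_type"] != "audio":
            continue
        # Step S133: read the audio attributes from the asset descriptor.
        descriptor = asset["asset_descriptor"]
        # Step S134: arrange them as the fields recorded under
        # stream coding type = 0x11 (FIG. 19 layout).
        records.append(
            {
                "audio_presentation_type": descriptor["component_type"],
                "sampling_frequency": descriptor["sampling_frequency"],
                "stream_content": descriptor["stream_content"],
            }
        )
    return records
```

Unlike Example 1, this variant never opens the voice data storage packet, since stream_content from the MPT serves as the voice identification information.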

［８．情報記録媒体からのデータ再生処理を実行する情報処理装置の構成と処理について］
次に、図３１以下を参照して情報記録媒体からのデータ再生処理を実行する情報処理装置の構成と処理について説明する。[8. Configuration and processing of information processing equipment that executes data reproduction processing from information recording media]
Next, the configuration and processing of the information processing apparatus that executes the data reproduction processing from the information recording medium will be described with reference to FIGS. 31 and below.

再生処理を実行する情報処理装置は、装置に装着した情報記録媒体に記録されたデータの読み取り、再生処理を実行する。 The information processing apparatus that executes the reproduction process reads the data recorded on the information recording medium attached to the apparatus and executes the reproduction process.

図３１は、ＢＤ等の情報記録媒体５１０に記録されたデータの再生処理を実行する情報処理装置４００の構成を示す図である。
情報処理装置４００は、図３１に示す情報記録媒体（記録メディア）５１０に記録されたデータを読み取り、出力装置（表示部＋スピーカ）５２０に出力する。なお、出力装置５２０は、例えばテレビ等であり、ディスプレイ、スピーカ等を備えた表示装置である。
なお、情報処理装置４００は先に図２７を参照して説明したデータ記録を行う情報処理装置３００と同一の装置である場合もある。すなわち、データ記録再生の両機能を有する情報処理装置である。FIG. 31 is a diagram showing a configuration of an information processing apparatus 400 that executes a reproduction process of data recorded on an information recording medium 510 such as a BD.
The information processing device 400 reads the data recorded on the information recording medium (recording medium) 510 shown in FIG. 31 and outputs the data to the output device (display unit + speaker) 520. The output device 520 is, for example, a television or the like, and is a display device including a display, a speaker, and the like.
The information processing device 400 may be the same device as the information processing device 300 that performs data recording described above with reference to FIG. 27. That is, it is an information processing device having both functions of data recording and reproduction.

情報記録媒体（記録メディア）５１０は、図２８～図３０を参照して説明した処理によって生成されたクリップＡＶストリームファイルと、プレイリスト、クリップ情報ファイル等のデータベースが記録された記録媒体である。 The information recording medium (recording medium) 510 is a recording medium in which a clip AV stream file generated by the processes described with reference to FIGS. 28 to 30 and a database such as a playlist and a clip information file are recorded.

なお、ＭＭＴフォーマット対応のクリップＡＶストリームファイルの再生に利用するクリップ情報ファイルには、音声データに関する各ストリーム（アセット）単位の音声属性情報として、各音声ストリームが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを識別することが可能とした音声識別情報が記録されている。
The clip information file used for playing back a clip AV stream file in the MMT format records, as audio attribute information for each audio stream (asset), voice identification information that makes it possible to identify whether each audio stream is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.

すなわち、先に図１６を参照して説明した以下の各データ、
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１３３
これらのデータが記録されている。
That is, the following data described above with reference to FIG. 16 are recorded:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Audio object type (audioObjectType) 133

または、先に図１９を参照して説明した以下の各データ、
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）ストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）１３５
これらのデータが記録されている。
Alternatively, the following data described above with reference to FIG. 19 are recorded:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135

図１６に示す例では、音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１３３、
図１９に示す例では、ストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）１３５、
これらのデータが、音声ストリームが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを示す音声識別情報である。
The audio object type (audioObjectType) 133 in the example shown in FIG. 16 and the stream content (stream_content) 135 in the example shown in FIG. 19 are voice identification information indicating whether the audio stream is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.

制御部４０１は、例えば、ユーザ入力部４０２からの再生指示情報の入力に基づいて、情報記録媒体５１０の記録データを、記録再生部４０４、ドライブ４０３を介して読み取り、データバッファとしての記憶部４０５に格納し、その格納データを再生処理部４０６に出力する。 For example, based on reproduction instruction information input from the user input unit 402, the control unit 401 reads the recorded data of the information recording medium 510 via the recording / reproducing unit 404 and the drive 403, stores it in the storage unit 405 serving as a data buffer, and outputs the stored data to the reproduction processing unit 406.

また、制御部４０１は、ユーザ入力部４０２からの録画コンテンツリスト表示指示情報の入力に基づいて、情報記録媒体５１０の記録データに基づいて、先に図９を参照して説明した録画コンテンツリストを生成して、出力装置（表示部）５２０に出力する。 Further, in response to recorded content list display instruction information input from the user input unit 402, the control unit 401 generates the recorded content list described above with reference to FIG. 9 from the recorded data of the information recording medium 510, and outputs it to the output device (display unit) 520.

再生処理部４０６は、制御部４０１の制御の下、情報記録媒体５１０から読み出された再生データ、すなわち、画像、音声、字幕等の各データを格納したクリップＡＶストリームファイルから各データを取得して再生データを生成する。 Under the control of the control unit 401, the reproduction processing unit 406 acquires each piece of data from the reproduction data read from the information recording medium 510, that is, from a clip AV stream file storing data such as images, audio, and subtitles, and generates playback data.

デマルチプレクサ（ＤｅＭＵＸ）４１１は、画像、音声、字幕、さらに、プレイリストファイル、クリップ情報ファイル等の各データを格納したデータ格納パケットを取得し、データ種別のパケットに分類し、各パケットを、データ種類に応じて、字幕データ生成部４１２、画像データ生成部４１３、音声データ生成部４１４、補助情報生成部４１５に出力する。 The demultiplexer (DeMUX) 411 acquires data storage packets containing data such as images, audio, and subtitles, as well as playlist files and clip information files, classifies the packets by data type, and outputs each packet, according to its data type, to the subtitle data generation unit 412, the image data generation unit 413, the audio data generation unit 414, or the auxiliary information generation unit 415.
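
The routing performed by the demultiplexer can be sketched as a table lookup. This is only an illustration: the packet kind labels, the unit names used as dictionary values, and the `demultiplex` function are hypothetical, and real packets would be classified from their header fields rather than from a ready-made label.

```python
from collections import defaultdict

# Hypothetical packet kinds mapped to the generation unit that receives them.
ROUTES = {
    "subtitle": "subtitle_data_generation_unit_412",
    "image": "image_data_generation_unit_413",
    "audio": "audio_data_generation_unit_414",
    "playlist": "auxiliary_information_generation_unit_415",
    "clip_info": "auxiliary_information_generation_unit_415",
}


def demultiplex(packets):
    """Group (kind, payload) pairs by destination unit, preserving order."""
    queues = defaultdict(list)
    for kind, payload in packets:
        unit = ROUTES.get(kind)
        if unit is None:
            raise ValueError(f"unknown packet kind: {kind!r}")
        queues[unit].append(payload)
    return dict(queues)
```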

字幕データ生成部４１２、画像データ生成部４１３、音声データ生成部４１４は、パケットに格納されたデータの復号処理等を実行し、復号データを出力データ生成部４１６に出力する。
出力データ生成部４１６は、字幕、画像、音声の各データを、入出力インタフェース４０７を介して出力装置（表示部＋スピーカ）５２０に出力する。The subtitle data generation unit 412, the image data generation unit 413, and the audio data generation unit 414 execute the decoding process of the data stored in the packet and output the decoded data to the output data generation unit 416.
The output data generation unit 416 outputs each data of subtitles, images, and voices to the output device (display unit + speaker) 520 via the input / output interface 407.

なお、情報記録媒体５１０は、再生対象データを格納したストリームファイルとして、
ＭＰＥＧ－２ＴＳフォーマットデータを格納したストリームファイルと、
ＭＭＴフォーマットデータを格納したストリームファイルを有する場合がある。
情報処理装置４００は、各フォーマット対応のプレイリストファイルとクリップ情報ファイルを選択適用して、ＭＰＥＧ－２ＴＳフォーマットデータを格納したストリームファイル、および、ＭＭＴフォーマットデータを格納したストリームファイルの再生処理を実行することになる。
Note that the information recording medium 510 may hold, as stream files storing data to be reproduced, both
a stream file storing MPEG-2 TS format data and
a stream file storing MMT format data.
The information processing apparatus 400 selects and applies the playlist file and clip information file corresponding to each format to play back the stream file storing MPEG-2 TS format data and the stream file storing MMT format data.

補助情報生成部４１５は、例えば、プレイリストファイル、クリップ情報ファイルに格納されたデータを取得して録画コンテンツリストを生成し、生成リストが出力装置（表示部＋スピーカ）５２０に出力される。 The auxiliary information generation unit 415 acquires, for example, the data stored in the playlist file and the clip information file to generate a recorded content list, and the generated list is output to the output device (display unit + speaker) 520.

出力装置（表示部＋スピーカ）５２０は、情報処理装置４００から入力する字幕、画像、音声等の各データ、さらに録画コンテンツリストを、出力装置（表示部＋スピーカ）５２０を介して出力する。 The output device (display unit + speaker) 520 outputs each data such as subtitles, images, and sounds input from the information processing device 400, and a recorded content list via the output device (display unit + speaker) 520.

次に、図３１に示す情報処理装置４００が実行する情報記録媒体５１０からのＭＭＴフォーマットデータの再生処理に際して、再生対象となる音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを識別する音声識別処理シーケンスについて、図３２に示すフローチャートを参照して説明する。
Next, a voice identification processing sequence, which identifies whether the audio data to be reproduced is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
when the information processing apparatus 400 shown in FIG. 31 reproduces MMT format data from the information recording medium 510, will be described with reference to the flowchart shown in FIG. 32.

図３２に示すフローに従った処理は、例えば情報処理装置４００の記憶部に格納されたプログラムに従って、プログラム実行機能を有するＣＰＵを備えたデータ処理部（制御部）の制御の下で実行することができる。 The processing according to the flow shown in FIG. 32 can be executed, for example, under the control of a data processing unit (control unit) equipped with a CPU having a program execution function, according to a program stored in the storage unit of the information processing apparatus 400.

なお、図３２に示すフローに従った処理を実行する情報処理装置は、図３１に示す情報処理装置４００であり、情報記録媒体（記録メディア）５１０を装着し、装着した情報記録媒体５１０に記録されたデータを読み取り、出力装置（表示部＋スピーカ）５２０に出力する。なお、出力装置５２０は、例えばテレビ等であり、ディスプレイ、スピーカ等を備えた表示装置である。 The information processing device that executes the processing according to the flow shown in FIG. 32 is the information processing device 400 shown in FIG. 31, to which an information recording medium (recording medium) 510 is attached. It reads the data recorded on the attached information recording medium 510 and outputs the data to the output device (display unit + speaker) 520. The output device 520 is, for example, a television or the like, and is a display device including a display, a speaker, and the like.

情報記録媒体（記録メディア）５１０は、図２８～図３１を参照して説明した処理によって生成されたクリップＡＶストリームと、プレイリスト、クリップ情報ファイル等のデータベースが記録された記録媒体である。
以下、図３２のフローに示す各ステップの処理について、順次、説明する。The information recording medium (recording medium) 510 is a recording medium in which a clip AV stream generated by the processes described with reference to FIGS. 28 to 31 and a database such as a playlist and a clip information file are recorded.
Hereinafter, the processing of each step shown in the flow of FIG. 32 will be sequentially described.

（ステップＳ２０１）
まず、情報処理装置４００の制御部４０１は、ステップＳ２０１において、クリップ情報ファイルを取得する。
なお、ここで選択するクリップＡＶストリームファイルは、ＭＭＴフォーマットデータを格納したクリップＡＶストリームファイルの再生に利用するクリップＡＶストリームファイルであるものとする。(Step S201)
First, the control unit 401 of the information processing apparatus 400 acquires the clip information file in step S201.
The clip information file selected here is assumed to be a clip information file used for playing back a clip AV stream file storing MMT format data.

（ステップＳ２０２）
次に、情報処理装置４００の制御部４０１は、ステップＳ２０２において、クリップ情報ファイルに記録されたストリームコーディングタイプ＝０ｘ１１の以下の記録フィールドから、再生対象音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
上記いずれの符号化音声データであるかを示す音声識別情報を取得する。(Step S202)
Next, in step S202, the control unit 401 of the information processing apparatus 400 acquires, from the recording fields under stream coding type = 0x11 recorded in the clip information file, voice identification information indicating whether the audio data to be reproduced is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.

これは、具体的には、図１６、または図１９に示すクリップ情報ファイルのプログラム情報［ＰｒｏｇｒａｍＩｎｆｏ（）］のストリームコーディング情報［ＳｔｒｅａｍＣｏｄｉｎｇＩｎｆｏ］の記録データに含まれる以下の情報を取得する処理である。 Specifically, this is a process of acquiring the following information included in the recorded data of the stream coding information [StreamCodingInfo] of the program information [ProgramInfo()] of the clip information file shown in FIG. 16 or FIG. 19.

クリップ情報ファイルの記録データが、図１６に示すデータ構成を有する場合は、
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１３３
これらのデータを取得する。
When the recorded data of the clip information file has the data structure shown in FIG. 16, the following data are acquired:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Audio object type (audioObjectType) 133

一方、クリップ情報ファイルの記録データが、図１９に示すデータ構成を有する場合は、
（ａ）音声タイプ（ａｕｄｉｏ＿ｐｒｅｓｅｎｔａｔｉｏｎ＿ｔｙｐｅ）１３１
（ｂ）サンプリング周波数（ｓａｍｐｌｉｎｇ＿ｆｒｅｑｕｅｎｃｙ）１３２
（ｃ）ストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）１３５
これらのデータを取得する。
On the other hand, when the recorded data of the clip information file has the data structure shown in FIG. 19, the following data are acquired:
(a) Voice type (audio_presentation_type) 131
(b) Sampling frequency (sampling_frequency) 132
(c) Stream content (stream_content) 135

（ステップＳ２０３）
次に、情報処理装置４００の制御部４０１は、ステップＳ２０３において、ステップＳ２０２でクリップ情報ファイルから読み取った音声識別情報に基づいて、再生対象となる音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを判定する。(Step S203)
Next, in step S203, the control unit 401 of the information processing apparatus 400 determines, based on the voice identification information read from the clip information file in step S202, whether the audio data to be reproduced is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.

再生対象音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、ステップＳ２０４に進む。
一方、再生対象音声データが、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、ステップＳ２０５に進む。
If the audio data to be played is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
the process proceeds to step S204.
On the other hand, if the audio data to be played is
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
the process proceeds to step S205.

（ステップＳ２０４）
再生対象音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、情報処理装置４００の制御部４０１は、ステップＳ２０４において、ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ対応のコーデックを適用した復号、再生処理を実行する。(Step S204)
If the audio data to be played is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
the control unit 401 of the information processing apparatus 400 executes, in step S204, decoding and reproduction processing using a codec that supports MPEG4 AAC LC (LATM / LOAS) coded audio data.

（ステップＳ２０５）
一方、再生対象音声データが、
（１）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、情報処理装置４００の制御部４０１は、ステップＳ２０５において、ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ対応のコーデックを適用した復号、再生処理を実行する。(Step S205)
On the other hand, if the audio data to be played is
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
the control unit 401 of the information processing apparatus 400 executes, in step S205, decoding and reproduction processing using a codec that supports MPEG4 ALS (LATM / LOAS) coded audio data.

このように、クリップ情報ファイルには、再生対象となる音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを示す音声識別情報が記録されており、この情報を参照することで、正しい復号処理、再生処理を迅速に実行することが可能となる。
In this way, the clip information file records voice identification information indicating whether the audio data to be reproduced is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
and by referring to this information, the correct decoding and reproduction processing can be executed quickly.
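
The identification and codec selection of FIG. 32 (steps S202 to S205) can be sketched as follows. This is a sketch under stated assumptions: the function names and the dictionary form of the StreamCodingInfo data are hypothetical. The audioObjectType values 2 and 36 are those given earlier in the description. Because this passage does not give the stream_content values, the sketch takes that mapping as a caller-supplied argument instead of guessing it.

```python
AAC_LC = "MPEG-4 AAC LC (LATM/LOAS)"
ALS = "MPEG-4 ALS (LATM/LOAS)"

# audioObjectType settings given in the description: 2 = AAC LC, 36 = ALS.
AUDIO_OBJECT_TYPE_TO_CODEC = {2: AAC_LC, 36: ALS}


def identify_codec(stream_coding_info, stream_content_to_codec=None):
    """Steps S202 and S203: read the identification field, name the codec."""
    if "audioObjectType" in stream_coding_info:  # FIG. 16 layout
        table = AUDIO_OBJECT_TYPE_TO_CODEC
        key = stream_coding_info["audioObjectType"]
    elif "stream_content" in stream_coding_info:  # FIG. 19 layout
        if stream_content_to_codec is None:
            raise ValueError("a stream_content mapping must be supplied")
        table = stream_content_to_codec
        key = stream_coding_info["stream_content"]
    else:
        raise ValueError("no voice identification information is recorded")
    if key not in table:
        raise ValueError(f"unrecognized identification value: {key}")
    return table[key]


def select_decoder(codec, decoders):
    """Steps S204 and S205: pick the decoder registered for the codec."""
    return decoders[codec]
```

Reading the identifier from the clip information file lets the player choose a decoder before it opens the stream itself, which is the benefit the text describes.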

次に、１つの番組等のコンテンツに対して、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これら２種類の音声データが設定され、ユーザ（視聴者）による選択再生を可能としたコンテンツの再生シーケンス例について図３３に示すフローチャートを参照して説明する。Next, for the content such as one program,
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
An example of a reproduction sequence of content in which these two types of audio data are set and can be selectively reproduced by a user (viewer) will be described with reference to the flowchart shown in FIG. 33.

図３３に示すフローに従った処理は、例えば情報処理装置４００の記憶部に格納されたプログラムに従って、プログラム実行機能を有するＣＰＵを備えたデータ処理部（制御部）の制御の下で実行することができる。
以下、図３３のフローに示す各ステップの処理について、順次、説明する。
The processing according to the flow shown in FIG. 33 can be executed, for example, under the control of a data processing unit (control unit) equipped with a CPU having a program execution function, according to a program stored in the storage unit of the information processing apparatus 400.
Hereinafter, the processing of each step shown in the flow of FIG. 33 will be sequentially described.

（ステップＳ３０１）
まず、情報処理装置４００の制御部４０１は、ステップＳ３０１において、クリップ情報ファイルを取得する。
なお、ここで選択するクリップＡＶストリームファイルは、ＭＭＴフォーマットデータを格納したクリップＡＶストリームファイルの再生に利用するクリップＡＶストリームファイルであるものとする。(Step S301)
First, the control unit 401 of the information processing apparatus 400 acquires the clip information file in step S301.
The clip information file selected here is assumed to be a clip information file used for playing back a clip AV stream file storing MMT format data.

（ステップＳ３０２～Ｓ３０３）
次に、情報処理装置４００の制御部４０１は、ステップＳ３０２～Ｓ３０３において、クリップ情報ファイルに記録されたストリームコーディングタイプ＝０ｘ１１の以下の記録フィールドの記録情報に基づいて、再生対象音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
上記の２種類の符号化音声データを選択再生可能であるか否かを確認する。(Steps S302 to S303)
Next, in steps S302 to S303, the control unit 401 of the information processing apparatus 400 checks, based on the information recorded in the recording fields under stream coding type = 0x11 in the clip information file, whether the two types of coded audio data,
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, and
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
can be selectively played back.

図１６に示す例では、音声オブジェクトタイプ（ａｕｄｉｏＯｂｊｅｃｔＴｙｐｅ）１３３、
図１９に示す例では、ストリームコンテンツ（ｓｔｒｅａｍ＿ｃｏｎｔｅｎｔ）１３５、
これらのデータは、音声ストリーム単位で記録されており、１つの番組等のコンテンツに複数の音声ストリームが設定されている場合、各音声ストリーム単位で、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを示す音声識別情報が記録されている。
The audio object type (audioObjectType) 133 in the example shown in FIG. 16 and the stream content (stream_content) 135 in the example shown in FIG. 19 are recorded for each audio stream. When multiple audio streams are set for content such as one program, voice identification information indicating whether the stream is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data
is recorded for each audio stream.

ステップＳ３０２～Ｓ３０３では、クリップ情報ファイルの記録データを参照して、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
上記の２種類の符号化音声データを選択再生可能であるか否かを確認する。In steps S302 to S303, referring to the recorded data of the clip information file,
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
It is confirmed whether or not the above two types of coded voice data can be selectively played back.

２種類の符号化音声データを選択再生可能であることが確認された場合は、ステップＳ３０４に進む。
２種類の符号化音声データを選択再生可能でなく、いずれか一方のみ再生可能であることが確認された場合は、ステップＳ３０５に進む。If it is confirmed that the two types of coded voice data can be selectively played back, the process proceeds to step S304.
If it is confirmed that the two types of coded audio data cannot be selectively reproduced and only one of them can be reproduced, the process proceeds to step S305.

（ステップＳ３０４）
２種類の符号化音声データを選択再生可能であることが確認された場合は、ステップＳ３０４において、情報処理装置４００の制御部４０１は、ＭＰＥＧ４ＡＡＣＬＣ符号化音声データと、ＭＰＥＧ４ＡＬＳ符号化音声データの２つの音声データを、ユーザ（視聴者）に選択させるための表示情報を提示して、ユーザによる選択情報を入力する。(Step S304)
When it is confirmed that the two types of coded audio data can be selectively played back, in step S304, the control unit 401 of the information processing apparatus 400 presents display information that lets the user (viewer) choose between the MPEG4 AAC LC coded audio data and the MPEG4 ALS coded audio data, and receives the user's selection.

（ステップＳ３０５）
ステップＳ３０４において、ユーザからの選択情報を入力した場合、または、
ステップＳ３０３で、２種類の符号化音声データを選択再生可能でなく、いずれか一方のみ再生可能であることが確認された場合は、ステップＳ３０５に進む。(Step S305)
When the selection information from the user is input in step S304, or
If it is confirmed in step S303 that the two types of coded audio data cannot be selectively reproduced and only one of them can be reproduced, the process proceeds to step S305.

情報処理装置４００の制御部４０１は、ステップＳ３０５において、再生可能な１つの音声データ、またはユーザによって選択された音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
これらのいずれの音声データであるかを判定する。
In step S305, the control unit 401 of the information processing apparatus 400 determines whether the single reproducible audio data, or the audio data selected by the user, is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data, or
(2) MPEG4 ALS (LATM / LOAS) coded audio data.

再生可能な１つの音声データ、またはユーザによって選択された音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、ステップＳ３０６に進む。
一方、再生可能な１つの音声データ、またはユーザによって選択された音声データが、
（２）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、ステップＳ３０７に進む。
If the single reproducible audio data, or the audio data selected by the user, is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
the process proceeds to step S306.
On the other hand, if it is
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
the process proceeds to step S307.

（ステップＳ３０６）
再生可能な１つの音声データ、またはユーザによって選択された音声データが、
（１）ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、情報処理装置４００の制御部４０１は、ステップＳ３０６において、ＭＰＥＧ４ＡＡＣＬＣ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ対応のコーデックを適用した復号、再生処理を実行する。(Step S306)
If the single reproducible audio data, or the audio data selected by the user, is
(1) MPEG4 AAC LC (LATM / LOAS) coded audio data,
the control unit 401 of the information processing apparatus 400 executes, in step S306, decoding and reproduction processing using a codec that supports MPEG4 AAC LC (LATM / LOAS) coded audio data.

（ステップＳ３０７）
一方、再生可能な１つの音声データ、またはユーザによって選択された音声データが、
（１）ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ、
である場合は、情報処理装置４００の制御部４０１は、ステップＳ３０７において、ＭＰＥＧ４ＡＬＳ（ＬＡＴＭ／ＬＯＡＳ）符号化音声データ対応のコーデックを適用した復号、再生処理を実行する。(Step S307)
On the other hand, if the single reproducible audio data, or the audio data selected by the user, is
(2) MPEG4 ALS (LATM / LOAS) coded audio data,
the control unit 401 of the information processing apparatus 400 executes, in step S307, decoding and reproduction processing using a codec that supports MPEG4 ALS (LATM / LOAS) coded audio data.
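
The selection flow of FIG. 33 (steps S302 to S307) can be sketched as follows. This is an illustrative sketch only: the stream list format, the `ask_user` callback, and the decoder names are hypothetical. The sketch assumes that the per-stream voice identification information has already been read from the clip information file and converted into a codec label.

```python
AAC_LC = "MPEG-4 AAC LC"
ALS = "MPEG-4 ALS"


def choose_audio_stream(streams, ask_user):
    """Pick the (stream_id, codec) pair to play.

    streams is a list of (stream_id, codec) pairs, one per audio stream.
    ask_user is called only when both codecs are available (step S304).
    """
    if not streams:
        raise ValueError("no audio stream is recorded")
    codecs = {codec for _, codec in streams}
    if {AAC_LC, ALS} <= codecs:
        # Steps S302 to S304: both types are selectable, so ask the viewer.
        wanted = ask_user((AAC_LC, ALS))
        if wanted not in codecs:
            raise ValueError(f"invalid selection: {wanted!r}")
    else:
        # Only one type is reproducible: use it without prompting.
        wanted = streams[0][1]
    return next(stream for stream in streams if stream[1] == wanted)


def decoder_for(codec):
    """Steps S305 to S307: map the chosen codec to a (hypothetical) decoder."""
    return {AAC_LC: "aac_lc_decoder", ALS: "als_decoder"}[codec]
```

For example, `choose_audio_stream([("a1", AAC_LC), ("a2", ALS)], lambda options: ALS)` selects the ALS stream, while a list holding a single stream is returned without calling `ask_user` at all.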

［９．情報処理装置の構成例について］
次に、情報記録媒体に対するデータ記録、情報記録媒体からのデータ再生を実行する情報処理装置として適用可能な情報処理装置のハードウェア構成例について、図３４を参照して説明する。[9. About the configuration example of the information processing device]
Next, a hardware configuration example of an information processing device applicable as an information processing device for executing data recording on an information recording medium and data reproduction from the information recording medium will be described with reference to FIG. 34.

The CPU (Central Processing Unit) 601 functions as a data processing unit that executes various processes according to programs stored in the ROM (Read Only Memory) 602 or the storage unit 608. For example, it executes processing according to the sequences described in the above embodiments. The RAM (Random Access Memory) 603 stores the programs executed by the CPU 601 and associated data. The CPU 601, the ROM 602, and the RAM 603 are connected to one another by a bus 604.

The CPU 601 is connected to an input/output interface 605 via the bus 604. An input unit 606, consisting of various switches, a keyboard, a mouse, a microphone, and the like, and an output unit 607, consisting of a display, a speaker, and the like, are connected to the input/output interface 605. The CPU 601 executes various processes in response to commands input from the input unit 606 and outputs the processing results to, for example, the output unit 607.

The storage unit 608 connected to the input/output interface 605 is, for example, a hard disk, and stores the programs executed by the CPU 601 and various data. The communication unit 609 functions as a transmission/reception unit for data communication over networks such as the Internet or a local area network, and also as a transmission/reception unit for broadcast waves, and communicates with external devices.

The drive 610 connected to the input/output interface 605 drives a removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory such as a memory card, and records or reads data.

[10. Summary of the configurations of the present disclosure]
The embodiments of the present disclosure have been described in detail above with reference to specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present disclosure. In other words, the present invention has been disclosed by way of example and should not be construed restrictively. The claims should be taken into account when determining the gist of the present disclosure.

The technology disclosed in this specification can take the following configurations.
(1) An information processing apparatus including a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium,
wherein the data processing unit
executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

(2) The information processing apparatus according to (1), wherein the data processing unit
records the audio identification information in a clip information file.

(3) The information processing apparatus according to (1) or (2), wherein the data processing unit
extracts the audio identification information from an audio data storage packet and records it in the database file.

(4) The information processing apparatus according to (1) or (2), wherein the data processing unit
extracts the audio object type (audioObjectType) recorded in an audio data storage packet and records it in the database file as the audio identification information.

(5) The information processing apparatus according to (1) or (2), wherein the data processing unit
extracts the audio identification information from the MMT package table (MPT) and records it in the database file.

(6) The information processing apparatus according to (1) or (2), wherein the data processing unit
extracts the stream content (stream_content) recorded in the MMT package table (MPT) and records it in the database file as the audio identification information.

(7) The information processing apparatus according to any one of (1) to (6), wherein the data processing unit
records, in an attribute information recording area for a single shared coding type (stream_coding_type) within the stream coding information [StreamCodingInfo] recording area of the clip information file, audio identification information that makes it possible to identify whether the audio data is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

(8) The information processing apparatus according to (7), wherein the shared coding type (stream_coding_type) is 0x11.

(9) An information processing apparatus including a data processing unit that executes reproduction processing of data recorded on an information recording medium,
wherein the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format,
the data processing unit is configured to execute reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and
the data processing unit
acquires, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and executes decoding processing of the audio data in accordance with the acquired information.

(10) The information processing apparatus according to (9), wherein the data processing unit
acquires the audio identification information from a clip information file.

(11) The information processing apparatus according to (9) or (10), wherein the data processing unit
acquires the audio identification information from an attribute information recording area for a single shared coding type (stream_coding_type) within the stream coding information [StreamCodingInfo] recording area of the clip information file.

(12) The information processing apparatus according to (11), wherein the shared coding type (stream_coding_type) is 0x11.

(13) The information processing apparatus according to any one of (9) to (12), wherein the data processing unit,
upon determining, on the basis of the audio identification information acquired from the database file, that the playable audio data includes both of the following two types,
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, and
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
outputs display information that lets the user select the audio data to be reproduced.

(14) The information processing apparatus according to (13), wherein the data processing unit
receives the user's selection made in response to the display information and selects the audio data to be reproduced in accordance with the received selection information.

(15) An information recording medium on which recording data conforming to the BDAV format or the SPAV format is recorded,
the information recording medium having a configuration in which audio identification information that makes it possible to identify whether the audio data recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data
is recorded in a database file defined by the BDAV format or the SPAV format,
whereby a playback apparatus can identify the type of the audio data to be reproduced in accordance with the audio identification information recorded in the database file.

(16) An information recording medium on which recording data conforming to the BDAV format or the SPAV format is to be recorded,
the information recording medium having a database file defined by the BDAV format or the SPAV format that contains audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
whereby a playback apparatus can identify the type of the audio data to be reproduced in accordance with the audio identification information in the database file.

(17) An information processing method executed in an information processing apparatus,
wherein the information processing apparatus includes a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium, and
the data processing unit
executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

(18) An information processing method executed in an information processing apparatus,
wherein the information processing apparatus includes a data processing unit that executes reproduction processing of data recorded on an information recording medium,
the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format, and
the data processing unit
executes reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and,
during the reproduction processing, acquires, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and executes decoding processing of the audio data in accordance with the acquired information.

(19) A program that causes an information processing apparatus to execute information processing,
wherein the information processing apparatus includes a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium, and
the program causes the data processing unit to
execute a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

(20) A program that causes an information processing apparatus to execute information processing,
wherein the information processing apparatus includes a data processing unit that executes reproduction processing of data recorded on an information recording medium,
the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format, and
the program causes the data processing unit to
execute reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and,
during the reproduction processing, acquire, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and execute decoding processing of the audio data in accordance with the acquired information.
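As an illustration of the recording-side configurations (4), (7), and (8), the following minimal sketch maps an extracted audioObjectType to audio identification information and stores it under the shared stream_coding_type 0x11. The audioObjectType values 2 (AAC LC) and 36 (ALS) are taken from the MPEG-4 Audio object type table; the `StreamCodingInfo` class, its `audio_identification` field, and the function name are hypothetical and do not reflect the actual BDAV or SPAV file layout.

```python
from dataclasses import dataclass

# Shared coding type named in configurations (8) and (12).
STREAM_CODING_TYPE_SHARED = 0x11

# audioObjectType values from the MPEG-4 Audio object type table.
AOT_AAC_LC = 2
AOT_ALS = 36

# Hypothetical labels used here as the audio identification information.
AUDIO_IDENTIFICATION = {AOT_AAC_LC: "AAC_LC", AOT_ALS: "ALS"}


@dataclass
class StreamCodingInfo:
    """Simplified, hypothetical stand-in for one StreamCodingInfo entry."""
    stream_coding_type: int
    audio_identification: str


def build_stream_coding_info(audio_object_type: int) -> StreamCodingInfo:
    """Create the entry a recorder might write for one audio stream."""
    if audio_object_type not in AUDIO_IDENTIFICATION:
        raise ValueError(f"unsupported audioObjectType: {audio_object_type}")
    return StreamCodingInfo(
        stream_coding_type=STREAM_CODING_TYPE_SHARED,
        audio_identification=AUDIO_IDENTIFICATION[audio_object_type],
    )
```

Because both formats share one coding type, the separate identification field is what lets a player tell them apart without parsing the audio packets.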

The series of processes described in the specification can be executed by hardware, by software, or by a combination of both. When the processes are executed by software, a program recording the processing sequence can be installed and executed in memory in a computer built into dedicated hardware, or installed and executed on a general-purpose computer capable of executing various processes. For example, the program can be recorded in advance on a recording medium. Besides being installed on a computer from a recording medium, the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed on a recording medium such as a built-in hard disk.

The various processes described in the specification may be executed not only in chronological order as described, but also in parallel or individually, depending on the processing capacity of the apparatus executing them or as needed. In this specification, a system is a logical collection of a plurality of apparatuses, and the constituent apparatuses are not necessarily in the same housing.

As described above, the configuration of one embodiment of the present disclosure makes it possible to quickly determine whether MMT format audio data recorded on a medium is MPEG4 AAC LC audio data or MPEG4 ALS audio data.
Specifically, for example, a clip information file is generated that records audio identification information indicating whether the MMT format audio data to be recorded on the medium is MPEG4 AAC LC audio data or MPEG4 ALS audio data, and the file is recorded on the medium. On the basis of the audio identification information recorded in the clip information file, the playback apparatus identifies whether the audio data to be reproduced is MPEG4 AAC LC audio data or MPEG4 ALS audio data, and can decode and reproduce it quickly.
This configuration makes it possible to quickly determine whether MMT format audio data recorded on a medium is MPEG4 AAC LC audio data or MPEG4 ALS audio data.
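On the playback side, the identification step summarized above, together with the user selection of configurations (13) and (14), could look roughly like the sketch below. It assumes the audio identification information has already been read from the clip information file as a list of labels. The function name `select_audio`, the labels "AAC_LC" and "ALS", and the `ask_user` callback are hypothetical.

```python
from typing import Callable, Sequence


def select_audio(
    available: Sequence[str],
    ask_user: Callable[[Sequence[str]], str],
) -> str:
    """Return the audio type to reproduce.

    `available` holds the identification labels read from the clip
    information file. The user is asked only when both types exist.
    """
    kinds = sorted(set(available))
    if not kinds:
        raise ValueError("no playable audio data")
    if len(kinds) == 1:
        # Only one type is recorded, so no selection screen is needed.
        return kinds[0]
    # Both types are recorded: present the choice and validate the answer.
    choice = ask_user(kinds)
    if choice not in kinds:
        raise ValueError(f"invalid selection: {choice}")
    return choice
```

Passing the prompt in as a callback keeps the decision logic independent of any particular display or input device.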

20 Transmitter
21 Broadcast server
22 Data distribution server
30 Information processing apparatus
31 Recording/playback apparatus
32 TV
33 PC
34 Mobile terminal
40 Information recording medium (media)
41 BD
42 HDD
43 Flash memory
300 Information processing apparatus
301 Data input unit
302 User input unit
303 Control unit
304 Storage unit
305 Demultiplexer
306 Recording data generation unit
307 Recording unit
308 Drive
311 Subtitle data generation unit
312 Image data generation unit
313 Audio data generation unit
314 Auxiliary information generation unit
315 Multiplexer
316 Database file generation unit
320 Information recording medium
400 Information processing apparatus
401 Control unit
402 User input unit
403 Drive
404 Recording/playback unit
405 Storage unit
406 Playback processing unit
407 Input/output I/F
411 Demultiplexer
412 Subtitle data generation unit
413 Image data generation unit
414 Audio data generation unit
415 Auxiliary information generation unit
416 Output data generation unit
510 Information recording medium
520 Output device (display unit + speaker)
601 CPU
602 ROM
603 RAM
604 Bus
605 Input/output interface
606 Input unit
607 Output unit
608 Storage unit
609 Communication unit
610 Drive
611 Removable media

Claims

1. An information processing apparatus comprising a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium,
wherein the data processing unit
executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

2. The information processing apparatus according to claim 1, wherein the data processing unit
records the audio identification information in a clip information file.

3. The information processing apparatus according to claim 1, wherein the data processing unit
extracts the audio identification information from an audio data storage packet and records it in the database file.

4. The information processing apparatus according to claim 1, wherein the data processing unit
extracts the audio object type (audioObjectType) recorded in an audio data storage packet and records it in the database file as the audio identification information.

5. The information processing apparatus according to claim 1, wherein the data processing unit
extracts the audio identification information from the MMT package table (MPT) and records it in the database file.

6. The information processing apparatus according to claim 1, wherein the data processing unit
extracts the stream content (stream_content) recorded in the MMT package table (MPT) and records it in the database file as the audio identification information.

7. The information processing apparatus according to claim 1, wherein the data processing unit
records, in an attribute information recording area for a single shared coding type (stream_coding_type) within the stream coding information [StreamCodingInfo] recording area of the clip information file, audio identification information that makes it possible to identify whether the audio data is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

8. The information processing apparatus according to claim 7, wherein the shared coding type (stream_coding_type) is 0x11.

9. An information processing apparatus comprising a data processing unit that executes reproduction processing of data recorded on an information recording medium,
wherein the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format,
the data processing unit is configured to execute reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and
the data processing unit
acquires, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and executes decoding processing of the audio data in accordance with the acquired information.

10. The information processing apparatus according to claim 9, wherein the data processing unit
acquires the audio identification information from the clip information file.

11. The information processing apparatus according to claim 9, wherein the data processing unit
acquires the audio identification information from an attribute information recording area for a single shared coding type (stream_coding_type) within the stream coding information [StreamCodingInfo] recording area of the clip information file.

12. The information processing apparatus according to claim 11, wherein the shared coding type (stream_coding_type) is 0x11.

13. The information processing apparatus according to claim 9, wherein the data processing unit,
upon determining, on the basis of the audio identification information acquired from the database file, that the playable audio data includes both of the following two types,
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, and
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
outputs display information that lets the user select the audio data to be reproduced.

14. The information processing apparatus according to claim 13, wherein the data processing unit
receives the user's selection made in response to the display information and selects the audio data to be reproduced in accordance with the received selection information.

15. An information processing method executed in an information processing apparatus,
wherein the information processing apparatus includes a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium, and
the data processing unit
executes a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

16. An information processing method executed in an information processing apparatus,
wherein the information processing apparatus includes a data processing unit that executes reproduction processing of data recorded on an information recording medium,
the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format, and
the data processing unit
executes reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and,
during the reproduction processing, acquires, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and executes decoding processing of the audio data in accordance with the acquired information.

17. A program that causes an information processing apparatus to execute information processing,
wherein the information processing apparatus includes a data processing unit that receives MMT format data as input and generates recording data conforming to the BDAV format or the SPAV format, which are data recording formats for an information recording medium, and
the program causes the data processing unit to
execute a process of recording, in a database file defined by the BDAV format or the SPAV format, audio identification information that makes it possible to identify whether the audio data to be recorded on the information recording medium is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data.

18. A program that causes an information processing apparatus to execute information processing,
wherein the information processing apparatus includes a data processing unit that executes reproduction processing of data recorded on an information recording medium,
the information recording medium stores MMT format data recorded in accordance with the BDAV format or the SPAV format, and
the program causes the data processing unit to
execute reproduction processing of the MMT format data recorded on the information recording medium by using information recorded in a playlist file and a clip information file, which are database files defined by the BDAV format or the SPAV format, and,
during the reproduction processing, acquire, from the database file, audio identification information indicating whether the audio data to be reproduced is
(a) MPEG4 AAC LC (LATM/LOAS) coded audio data, or
(b) MPEG4 ALS (LATM/LOAS) coded audio data,
and execute decoding processing of the audio data in accordance with the acquired information.