JP2002329372A

JP2002329372A - Audio information recording medium and reproducing device

Info

Publication number: JP2002329372A
Application number: JP2002067305A
Authority: JP
Inventors: Hitoshi Otomo; 仁大友; Hidenori Mimura; 英紀三村; Junichi Uota; 潤一魚田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2002-03-12
Filing date: 2002-03-12
Publication date: 2002-11-15
Anticipated expiration: 2018-06-26
Also published as: JP3473850B2

Abstract

PROBLEM TO BE SOLVED: To realize a DVD audio data structure that keeps the transfer rate within the specified standard while offering high-sound-quality specifications. SOLUTION: The recording medium carries a data structure comprising a first audio data sequence (samples) obtained by digitizing some of the channels of a multi-channel audio signal at a first sampling frequency and a first number of quantization bits, a second audio data sequence (samples) obtained by digitizing the other channels of the multi-channel audio signal at a second sampling frequency and a second number of quantization bits, and header data that includes timing data for synchronizing the first and second audio data sequences (samples).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to an audio information recording medium and a reproducing apparatus. It is particularly effective when applied to a method of recording digital audio signals on a high-density recording medium such as an optical disc, and to a reproducing apparatus for such a medium.

[0002]

[Description of the Related Art] In recent years, a high-density recording optical disc has been developed that can record a main video signal, a plurality of types of sub-picture signals accompanying the main video signal, and audio signals of a plurality of channels. This high-density recording optical disc is called a DVD. Hereinafter, this technology is referred to as DVD video.

[0003] A technology called DVD audio is also being developed by applying this DVD video technology. DVD audio is being developed as a dedicated audio technology and aims at high sound quality.

[0004]

[Problems to Be Solved by the Invention] In developing DVD audio, there is a demand that its standard be realized in a form as similar as possible to the audio data structure standard of DVD video.

[0005] Accordingly, an object of the present invention is to provide a digital audio recording medium and a reproducing apparatus that realize a DVD audio standard with high-sound-quality specifications while making as much use as possible of the audio data structure standard of DVD video.

[0006]

[Means for Solving the Problems] To achieve the above object, according to the present invention, data that is read and processed by using a computer control device has at least an audio object and management information. In the audio object, a first channel group among audio signals of a plurality of channels is digitized at a first sampling frequency and with a first number of quantization bits to form a first audio data sequence (sample sequence), and a second channel or a second channel group among the audio signals of the plurality of channels is digitized at a second sampling frequency and with a second number of quantization bits to form a second audio data sequence (sample sequence). The first and second sample sequences are paired frame by frame and arranged as frames, and the frame arrangement is accommodated in packs, each including a header, to form a pack sequence. A set of the packs is managed as a cell, and a set of the cells is managed as a program. The management information has an audio title set information management table and a program chain information table, both of which are necessary for reproducing the audio signals of the plurality of channels. The program chain information table includes program chain general information, a program information table containing the entry cell number of each of a plurality of programs, and a cell playback information table indicating the playback position of the cells of each program. The header includes the quantization word length and first audio sampling frequency information for the samples of the first channel group, and the quantization word length and second audio sampling frequency information for the samples of the second channel group.

[0007]

[Embodiments of the Invention] Embodiments of the present invention will be described below with reference to the drawings.

[0008] Before describing the present invention, the recording format of the audio signal defined in the DVD video standard will be described.

[0009] First, the arrangement of data according to the linear PCM method in the data recording method of the present invention will be described. For linear PCM data, the number of quantization bits may be chosen freely from, for example, 16 bits, 20 bits, and 24 bits. The audio modes are monaural, stereo, 3-channel, 4-channel, 5-channel, 6-channel, 7-channel, and 8-channel.

[0010] Assume now that there are audio signals of eight channels, A to H. These are sampled at a sampling frequency of 48 kHz or 96 kHz and then quantized. The description below takes 20 quantization bits as an example.

[0011] FIG. 1(A) shows how the audio signals A to H of up to eight channels are each sampled. Each sample is assumed to be quantized to, for example, 20 bits. The figure also shows each 20-bit sample divided into a main word and an extra word.

[0012] The main word of each channel is denoted by an uppercase letter, An to Hn, and the extra word by a lowercase letter, an to hn. The suffix n (n = 0, 1, 2, 3, ...) indicates the sample order. Here, the main word is 16 bits and the extra word is 4 bits.

[0013] The samples are created so that signal A is A0 a0, A1 a1, A2 a2, A3 a3, A4 a4, ...; signal B is B0 b0, B1 b1, B2 b2, B3 b3, B4 b4, ...; signal C is C0 c0, C1 c1, C2 c2, C3 c3, C4 c4, ...; and signal H is H0 h0, H1 h1, H2 h2, H3 h3, H4 h4, ....

[0014] Next, FIG. 1(B) shows, as a sample sequence, the arrangement format of the above words when they are recorded on a recording medium.

[0015] First, each sample of 20 (= M) bits is divided into a main word of 16 (= m1) bits on the MSB side and an extra word of 4 (= m2) bits on the LSB side. Next, the 0th (= 2n-th) main words of all channels are arranged together. These are followed by the 1st (= (2n+1)-th) main words of all channels, arranged together. These are followed by the 0th (= 2n-th) extra words of all channels, arranged together, and then by the 1st (= (2n+1)-th) extra words of all channels, arranged together (where n = 0, 1, 2, ...).
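
The splitting and ordering described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `split_sample` and `arrange_two_pair` are hypothetical, samples are treated as unsigned bit patterns, and each word is kept as a separate integer rather than packed into a byte stream.

```python
def split_sample(sample, total_bits=20, main_bits=16):
    """Split one M-bit sample into (main word, extra word)."""
    extra_bits = total_bits - main_bits
    main = sample >> extra_bits                 # MSB-side main word (16 bits)
    extra = sample & ((1 << extra_bits) - 1)    # LSB-side extra word (4 bits)
    return main, extra


def arrange_two_pair(samples_even, samples_odd, total_bits=20, main_bits=16):
    """Order two sample instants (2n and 2n+1) of all channels as
    [main words 2n, main words 2n+1, extra words 2n, extra words 2n+1].

    samples_even / samples_odd: one sample per channel, in channel order.
    """
    even = [split_sample(s, total_bits, main_bits) for s in samples_even]
    odd = [split_sample(s, total_bits, main_bits) for s in samples_odd]
    return [
        [m for m, _ in even],   # main words of sample 2n
        [m for m, _ in odd],    # main words of sample 2n+1
        [e for _, e in even],   # extra words of sample 2n
        [e for _, e in odd],    # extra words of sample 2n+1
    ]
```

For two channels, `arrange_two_pair([0x12345, 0xABCDE], [0x00001, 0xFFFFF])` yields the main words of both instants first, followed by the extra words of both instants.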

[0016] Here, a group consisting of the main words of all channels is defined as one main sample, and a group consisting of the extra words of all channels as one extra sample. FIG. 1(B) shows the arrangement A0 to H0 (main sample S0), A1 to H1 (main sample S1), a0 to h0 (extra sample e0), a1 to h1 (extra sample e1), and so on. One such set is referred to as four samples, or as a two-pair sample.

[0017] With this format, a simple model (for example, a model operating in a 16-bit mode) can perform data reproduction by handling only the main word of one of the channels or, for stereo, only the main words of two channels. A high-end model (for example, a model operating in a 20-bit mode) can perform data reproduction by handling the main words together with the corresponding extra words.
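
As a rough sketch of the two reproduction paths, the following Python function reads one channel out of a two-pair sample either as 16-bit main words only or as full 20-bit values. The function name `reproduce` and the list-of-lists representation of a two-pair sample are assumptions made for illustration; they are not taken from the specification.

```python
def reproduce(group, channel, high_end, extra_bits=4):
    """Return the two samples of one channel from a two-pair sample.

    group: [main words 2n, main words 2n+1, extra words 2n, extra words 2n+1],
           each entry being a per-channel list (hypothetical representation).
    high_end: False for a simple 16-bit model, True for a 20-bit model.
    """
    main_even, main_odd, extra_even, extra_odd = group
    if not high_end:
        # Simple model: use the main words only and ignore the extra words.
        return [main_even[channel], main_odd[channel]]
    # High-end model: rejoin each main word with its extra word.
    return [
        (main_even[channel] << extra_bits) | extra_even[channel],
        (main_odd[channel] << extra_bits) | extra_odd[channel],
    ]
```

The simple model never touches the extra words, which is what allows it to skip that region of the data entirely.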

[0018] FIG. 1(C) shows the arrangement of the samples using the specific bit counts of the main samples and extra samples.

[0019] Dividing a 20-bit quantized linear PCM code into a 16-bit main word and a 4-bit extra word in this way makes the following possible. When a model operating in the 16-bit mode handles the sample arrangement, it can easily discard the unneeded portion by processing the extra sample area in 8-bit units. This is because two extra samples amount to 4 bits × 8 channels plus 4 bits × 8 channels, and this data can be processed (discarded) in 8-bit units eight times in succession.

[0020] This property of the data arrangement is not limited to this embodiment. Whether the number of channels is odd or the extra word is 8 bits, the total number of bits in two consecutive extra samples is always an integer multiple of 8 bits. A simple model that reproduces only the main words can therefore skip the extra samples by executing the 8-bit discard operation n times in succession, where n depends on the mode.
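
The byte-alignment property can be checked with a short calculation. The sketch below assumes extra words of 4 or 8 bits and 1 to 8 channels, as in the modes described earlier; the function name `discard_count` is hypothetical.

```python
def discard_count(channels, extra_bits):
    """Number of 8-bit discard operations needed to skip two consecutive
    extra samples (one extra word per channel in each extra sample)."""
    total_bits = 2 * channels * extra_bits
    if total_bits % 8 != 0:
        raise ValueError("two extra samples are not byte-aligned")
    return total_bits // 8


# Every combination of 1..8 channels and 4- or 8-bit extra words is byte-aligned.
for channels in range(1, 9):
    for extra_bits in (4, 8):
        discard_count(channels, extra_bits)
```

With 8 channels and 4-bit extra words the count is 8, which matches the eight consecutive discards mentioned for the 20-bit example.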

[0021] The data in the state of FIG. 1(B) could simply be modulated and recorded on the recording medium (on a track of an optical disc). When it is recorded together with other control information and video information, however, it is preferable to record it in a form whose timing is easy to manage, so that data handling and synchronization are easier. For this reason, the data is framed, the frames are grouped, and the result is packetized as follows.

[0022] FIG. 1(D) shows an audio frame sequence. A unit of data with a fixed playback time of 1/600 second is defined as one frame. Either 80 or 160 samples are assigned to one frame. When the audio signal is sampled at 48 kHz, one sample lasts 1/48000 second, and one frame lasts (1/48000) × 80 samples = 1/600 second. When the sampling frequency is 96 kHz, one sample lasts 1/96000 second, and (1/96000) × 160 samples = 1/600 second. One frame thus consists of 80 or 160 samples.

[0023] FIG. 2 shows the relationship between one frame and one GOF (group of frames). One frame is 80 or 160 samples and holds 1/600 second of data, and one GOF consists of 20 frames. One GOF therefore corresponds to a period of (1/600) second × 20 = 1/30 second, which matches the television frame rate of 30 frames per second. A succession of such GOFs constitutes the audio stream. Defining the GOF unit in this way is useful for synchronizing the audio stream with the video signal.
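
The frame and GOF timing can be verified with exact rational arithmetic. This is a minimal sketch; the constant and function names are illustrative only.

```python
from fractions import Fraction

FRAME_DURATION = Fraction(1, 600)   # seconds per audio frame
FRAMES_PER_GOF = 20                 # frames in one group of frames


def samples_per_frame(fs_hz):
    """Samples of one channel that fit in one 1/600-second frame."""
    count = fs_hz * FRAME_DURATION
    if count.denominator != 1:
        raise ValueError("sampling frequency does not divide into whole frames")
    return int(count)


def gof_duration():
    """Duration of one GOF in seconds, as an exact fraction."""
    return FRAME_DURATION * FRAMES_PER_GOF
```

Here `samples_per_frame(48000)` gives 80, `samples_per_frame(96000)` gives 160, and `gof_duration()` is exactly 1/30 second.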

[0024] Further, because the frames are recorded on the same recording medium as other control signals and video signals, they are distributed into packets. The relationship between the packets and the frames is described below.

[0025] FIG. 3(A) shows the relationship between the packets and the frames.

[0026] NV is a navigation pack, which contains a pack header, a packet header, a PCI_PKT (presentation control packet), and a DSI_PKT (data search information packet). The data of the DSI_PKT is data search information. V denotes a video object pack, A an audio object pack, and S a sub-picture object pack. One pack is defined as 2048 bytes. One pack contains one packet and consists of a pack header, a packet header, and the packet. The DSI_PKT data describes information for controlling each piece of data during playback, such as the start address and end address of each pack.

[0027] FIG. 3(B) shows only the audio packs. In practice, the DSI_PKT, video packs V, and audio packs A are interleaved as shown in FIG. 3(A), but FIG. 3(B) extracts the audio packs A to make the relationship between frames and packs easier to see. The standard of this system specifies that the information placed between one DSI_PKT and the next DSI_PKT amounts to about 0.5 second when played back. Since one frame is 1/600 second as described above, the number of audio frames between one DSI_PKT and the next is about 300 (0.5 second ÷ 1/600 second). The data amount (D) of one frame depends on the sampling frequency (fs), the number of channels (N), and the number of quantization bits (m).

[0028] When fs = 48 kHz, D = 80 × N × m bits; when fs = 96 kHz, D = 160 × N × m bits.
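
To make the formula concrete, the following sketch evaluates D for a few modes and converts it to bytes so that it can be compared with the 2048-byte pack size. The names `frame_bits` and `frame_bytes` are illustrative, not taken from the specification.

```python
PACK_BYTES = 2048                                # pack size given in the description
SAMPLES_PER_FRAME = {48_000: 80, 96_000: 160}    # samples per 1/600-second frame


def frame_bits(fs_hz, channels, bits_per_sample):
    """Data amount D of one frame in bits: samples x channels x bits."""
    return SAMPLES_PER_FRAME[fs_hz] * channels * bits_per_sample


def frame_bytes(fs_hz, channels, bits_per_sample):
    """Data amount of one frame in bytes (always whole, since 80 is a multiple of 8)."""
    return frame_bits(fs_hz, channels, bits_per_sample) // 8
```

For example, a 48 kHz, 8-channel, 20-bit frame is 1600 bytes and fits inside one pack, a 96 kHz, 8-channel, 24-bit frame is 3840 bytes and spans more than one pack, and a 48 kHz monaural 16-bit frame is only 160 bytes, so several fit in one pack.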

[0029] One frame therefore does not necessarily correspond to one pack: a pack may hold several frames, or less than one frame. That is, as shown in FIG. 3(B), the start of a frame may fall in the middle of a pack. The position of the frame start is described in the pack header, as a data count (timing) from the pack header or from the DSI_PKT. Accordingly, when playing back the recording medium, the reproducing apparatus extracts the frames from the audio packets, extracts the data of the channels to be reproduced, and supplies it to the audio decoder for decoding.

[0030] FIG. 4(A) shows the above data arrangement in generalized form as the relationship between the main word (16 bits) and the extra word (4 bits) in the 20-bit mode, and FIG. 4(B) shows the relationship between the main word (16 bits) and the extra word (8 bits) in the 24-bit mode.

[0031] As shown in FIGS. 4(A) and 4(B), a main sample and an extra sample form one pair, two such pairs form one unit, and the frames and packs described above are built from integer multiples of that unit.

[0032] As described above, it is possible to obtain an arrangement method for recording or transmitting multi-channel linear PCM data that both simple and high-end models can reproduce, together with a medium and a processing apparatus for it.

[0033] The standard of this system specifies that the amount of information placed between one DSI_PKT and the next DSI_PKT be such that it takes about 0.5 second to play back.

[0034] One pack consists of a pack header, a packet header, and a packet data part. The pack header and the packet header describe the information needed to reproduce the audio, such as the size of the audio pack, a presentation time stamp for timing the playback output with the video, a channel (stream) identification code, the number of quantization bits, the sampling frequency, and the start and end addresses of the data. The audio is inserted into the packet in units of the two-pair sample shown in FIGS. 1(A) to 1(C), which consists of two main samples and two extra samples.

[0035] FIG. 5 shows an audio pack in enlarged form. In the data part of this audio pack, the start of a two-pair sample (A0 to H0, A1 to H1) is aligned with the start of the data area, and the data that follows is arranged in two-pair sample units. The number of bytes in one pack is fixed at 2048. The samples, on the other hand, are variable-length data, so 2048 bytes is not necessarily an integer multiple of the byte length of a two-pair sample. The maximum byte length of one pack may therefore differ from the byte length of (two-pair sample × integer). In that case, the data is arranged so that the pack byte length ≥ the byte length of (two-pair sample × integer), and the leftover part of the pack is handled as follows. If the remainder of the pack is 7 bytes or less, stuffing bytes are inserted in the pack header; if it exceeds 7 bytes, a padding packet is inserted at the end of the pack (see the hatched parts in FIG. 5).

[0036] Audio information in this pack format is easy to handle during playback.

[0037] The reason is that the first audio data of each pack is always the start of a two-pair sample, that is, a main sample, which makes timed playback processing easy, since the reproducing apparatus takes in and processes data pack by pack. If a sample of audio data straddled two packs, both packs would have to be taken in and the audio data joined before decoding, which would complicate processing. With this method, the first audio data of each pack is always the start of a two-pair sample and the audio data is grouped pack by pack, so timing needs to be established for only one pack at a time and processing is easy. In addition, because the data is delimited in packet units, the authoring system (support system) becomes simpler, and so does the data processing software.

[0038] In particular, during special playback and the like, video data may be intermittently thinned out or interpolated. Because the audio data can be handled in packet units, playback timing is comparatively easy to control in such cases, and the decoder software does not become complicated.

[0039] In the system above, each sample is divided into the upper 16 bits and the lower 4 bits, but the data need not be in this form. Any sampled linear PCM audio data will do.

[0040] For example, if the data length of the extra sample is taken to be 0, the data sequence becomes a succession of main samples, which is the ordinary data format. Since there are then no extra samples, there is no need to use the two-pair sample as the unit, and packetization can be performed in units of main samples.

[0041] FIG. 6 lists the sizes of linear PCM data when it is placed in packets in two-pair sample units as described above. The table is divided into monaural, stereo, and multichannel sections, and each section gives, for each number of quantization bits, the maximum number of samples that fits in one packet. Because the unit is the two-pair sample, the number of samples in a packet is always even. More channels mean more bytes per sample, so fewer samples fit in one packet. For 16 quantization bits in monaural, the table shows that one packet holds 1004 samples in 2008 bytes, with 5 stuffing bytes and no padding bytes. The first packet, however, has 2 stuffing bytes, because 3 bytes of attribute information may be added to the header of the first packet.

[0042] For 24 quantization bits in stereo mode, the table shows that the first packet has 6 bytes of stuffing and the subsequent packets have 9 bytes of padding.
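
The stuffing and padding figures quoted from FIG. 6 are consistent with the 7-byte rule described for FIG. 5, as the sketch below tries to show. Several values here are assumptions rather than statements from the text: the 2013-byte space for audio data in an ordinary packet is inferred from 2008 data bytes plus 5 stuffing bytes, the 2010-byte space in the first packet follows from the 3 extra attribute bytes, and the 2004 data bytes for 24-bit stereo assume 167 two-pair samples of 12 bytes each. The function name `fill_strategy` is hypothetical.

```python
STUFFING_LIMIT = 7              # leftovers up to 7 bytes are absorbed by stuffing

CAPACITY_NORMAL = 2013          # assumed: 2008 data bytes + 5 stuffing bytes
CAPACITY_FIRST = CAPACITY_NORMAL - 3   # first packet carries 3 attribute bytes


def fill_strategy(capacity_bytes, data_bytes):
    """Decide how the leftover space after the audio data is filled."""
    rest = capacity_bytes - data_bytes
    if rest < 0:
        raise ValueError("audio data does not fit in the packet")
    if rest == 0:
        return ("none", 0)
    if rest <= STUFFING_LIMIT:
        return ("stuffing", rest)   # small gap: stuffing bytes
    return ("padding", rest)        # larger gap: padding packet


# 16-bit monaural: 1004 samples x 2 bytes = 2008 data bytes.
mono16_normal = fill_strategy(CAPACITY_NORMAL, 2008)
mono16_first = fill_strategy(CAPACITY_FIRST, 2008)

# 24-bit stereo: assumed 167 two-pair samples x 12 bytes = 2004 data bytes.
stereo24_normal = fill_strategy(CAPACITY_NORMAL, 2004)
stereo24_first = fill_strategy(CAPACITY_FIRST, 2004)
```

Under these assumptions the sketch yields 5 and 2 stuffing bytes for 16-bit monaural, and 9 padding bytes and 6 stuffing bytes for 24-bit stereo, in line with the values quoted from the table.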

[0043] FIG. 7 shows an outline of the pack header of an audio pack.

[0044] The pack header begins with a pack start code (4 bytes), followed by the system clock reference (SCR). The SCR indicates the time at which this pack is to be taken in: when the value of the SCR is smaller than the value of the reference time inside the apparatus, the pack carrying this SCR is taken into the audio buffer. The pack header also describes the program multiplex rate in 3 bytes and the stuffing length in 1 byte. By referring to this stuffing length, the control circuit can determine the read address of the control information.

[0045] FIG. 8 shows the contents of the packet header of an audio packet. The packet header contains a packet start code prefix signalling the start of the packet, a stream ID indicating what kind of data the packet carries, and data indicating the length of the packet stream. It also describes various information about the packetized elementary stream (PES), such as a flag indicating whether copying is prohibited or permitted, a flag indicating whether the information is original or copied, and the length of the packet header. It further describes a presentation time stamp (PTS) for synchronizing the output timing of this packet with other video and sub-pictures. In the first packet of the first field in each video object, information such as a flag indicating whether a buffer is described and the buffer size is also given. The header additionally has 0 to 7 stuffing bytes.

[0046] The header further has a substream ID, which indicates that the stream is an audio stream, whether it is linear PCM or another compression method, and the audio stream number. It also describes the number of audio frames whose first byte is located in this packet. In addition, it describes a pointer to the first audio frame in the packet to be reproduced at the time indicated by the PTS, that is, to the first byte of the first access unit. This pointer is expressed as a byte number counted from the last byte of this information, and it indicates the first byte address of that audio frame. The header also describes an audio emphasis flag indicating whether high-frequency emphasis has been applied, a mute flag for muting when the audio frame data is all zeros, and the number of the first frame to be accessed in the audio frame group (GOF). Finally, it describes the quantization word length (that is, the number of quantization bits), the sampling frequency, the number of channels, dynamic range control information, and so on.

[0047] The header information above is analyzed by a decoder control unit (not shown) in the audio decoder. The decoder control unit switches the signal processing circuit of the decoder to the signal processing mode that matches the audio data currently being taken in.

[0048] Information similar to the above header information is also described in the video manager, so once it has been read at the start of playback it need not be read again as long as the same substream is being reproduced. The mode information needed to reproduce the audio is nevertheless described in the header of every packet so that, for example, when the packet sequence is transmitted over a communication channel, the receiving terminal can recognize the audio mode no matter when reception starts. It also allows the audio information to be reproduced even when the audio decoder takes in only the packs.

[0049] In the DVD video described above, the maximum transfer rate of audio data is 6.144 Mbps, and the maximum transfer rate of all audio data streams combined is 9.8 Mbps. The attributes (sampling frequency fs, number of quantization bits Qb, number of channels, and so on) of all channels within one stream are identical. These restrictions are defined in the DVD video standard.

[0050] Because of these restrictions, high-sound-quality specifications cannot be realized for multichannel audio such as surround sound (for example, six channels such as R, L, C, SL, and SW in one stream).

[0051] That is, under these restrictions the sampling frequency fs and the number of quantization bits Qb must be the same for all channels. Achieving high sound quality (for example, fs = 96 kHz) therefore means applying it to every channel alike, so the transfer rate grows and exceeds the specified value.

【００５２】例えば、サンプリング周波数ｆｓ，量子化
ビット数Ｑｂでの１チャンネル（ｃｈ）毎の転送レート
は、純粋にオーディオデータ部分だけで、９６ｋＨｚ，２４ｂｉｔであると２．３０４Ｍｂｐｓ／
ｃｈ９６ｋＨｚ，２０ｂｉｔであると１．９２Ｍｂｐｓ／ｃ
ｈ９６ｋＨｚ，１６ｂｉｔであると１．５３６Ｍｂｐｓ／
ｃｈ４８ｋＨｚ，２４ｂｉｔであると１．１５２Ｍｂｐｓ／
ｃｈ４８ｋＨｚ，２０ｂｉｔであると０．９６Ｍｂｐｓ／ｃ
ｈ４８ｋＨｚ，１６ｂｉｔであると０．７６Ｍｂｐｓ／ｃ
ｈであるから、上記のＤＶＤビデオの規格による制約条件
で実現できる高音質仕様は、４８ｋHz、２０ｂｉｔ，６
チャンネルまで（この場合のオーディオに関する転送レ
ートは０．９６×６＝５．７６Ｍｂｐｓ＜６．１４４）
であり、それ以上の仕様は対応不可能である。For example, the transfer rate per channel (ch) at sampling frequency fs and quantization bit number Qb, counting only the audio data itself, is:
96 kHz, 24 bit: 2.304 Mbps/ch
96 kHz, 20 bit: 1.92 Mbps/ch
96 kHz, 16 bit: 1.536 Mbps/ch
48 kHz, 24 bit: 1.152 Mbps/ch
48 kHz, 20 bit: 0.96 Mbps/ch
48 kHz, 16 bit: 0.768 Mbps/ch
Therefore the highest sound quality specification that can be realized under the constraints of the DVD video standard is 48 kHz, 20 bit, 6 channels (the audio transfer rate in this case is 0.96 × 6 = 5.76 Mbps < 6.144 Mbps), and no higher specification can be supported.
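The per-channel figures above follow directly from multiplying the sampling frequency by the quantization bit number. The following is a minimal sketch of that arithmetic, assuming raw linear PCM payload only (pack and packet headers are ignored); the function and constant names are illustrative and do not come from the specification.

```python
# Illustrative sketch: raw linear PCM rate per channel = fs * Qb.
DVD_VIDEO_AUDIO_LIMIT_BPS = 6_144_000  # 6.144 Mbps audio limit quoted in the text


def channel_rate_bps(fs_hz: int, bits: int) -> int:
    """Raw PCM payload rate of one channel in bit/s (headers ignored)."""
    return fs_hz * bits


def stream_rate_bps(fs_hz: int, bits: int, channels: int) -> int:
    """Rate of a stream in which every channel shares the same fs and Qb."""
    return channel_rate_bps(fs_hz, bits) * channels


def fits_dvd_video(fs_hz: int, bits: int, channels: int) -> bool:
    """True if a uniform-attribute stream stays within the audio limit."""
    return stream_rate_bps(fs_hz, bits, channels) <= DVD_VIDEO_AUDIO_LIMIT_BPS


if __name__ == "__main__":
    for fs in (96_000, 48_000):
        for bits in (24, 20, 16):
            mbps = channel_rate_bps(fs, bits) / 1_000_000
            print(f"{fs // 1000} kHz, {bits} bit: {mbps} Mbps/ch, "
                  f"6 ch fits: {fits_dvd_video(fs, bits, 6)}")
```

With six channels, 48 kHz / 20 bit is the largest combination in the table that passes the check, which matches the conclusion of the paragraph.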

【００５３】そこで、本発明では、ＤＶＤビデオ規格に
おけるオーディオデータ構造のタイプをできるだけ残し
たまま、高音質の音声信号仕様をもつＤＶＤオーディオ
規格のデータ構造を工夫するものである。Therefore, in the present invention, the data structure of the DVD audio standard having a high-quality audio signal specification is devised while keeping the audio data structure type in the DVD video standard as much as possible.

【００５４】以下、本発明の基本的な概念を、ＤＶＤビ
デオ規格とＤＶＤオーディオ規格を比較して説明する。Hereinafter, the basic concept of the present invention will be described by comparing the DVD video standard and the DVD audio standard.

【００５５】ＤＶＤオーディオにおけるオーディオパッ
クの大きさは、ＤＶＤビデオと同じように２０４８バイ
トとする。また、量子化ビット数もＤＶＤビデオにおけ
るオーディオ仕様と同様にＱｂ＝１６ｂｉｔ又は２０ｂ
ｉｔ又は２４ｂｉｔとする。しかし、ＤＶＤオーディオ
では、同時に転送するリニアＰＣＭオーディオストリー
ムを１本に限定する。つまり、ＤＶＤビデオでは、ビデ
オオブジェクトとして映画のコンテンツが集録されてい
る場合、各種言語をオーディオストリームの各チャンネ
ルに割り当て、ストリームの切換え選択を可能としてい
る。しかしＤＶＤオーディオでは、基本的に音楽のコン
テンツを対象としているので、各ストリーム毎の切換え
選択を必ずしも行う必要はないので、全チャンネルを同
時の再生して出力するように用いることができ、つまり
１本化できる。本発明のシステムではこのように同時に
転送するリニアＰＣＭオーディオストリームを１本化す
るようにしている。The size of an audio pack in DVD audio is 2048 bytes, the same as in DVD video. The quantization bit number is also Qb = 16, 20 or 24 bits, as in the audio specification of DVD video. In DVD audio, however, the number of linear PCM audio streams transferred simultaneously is limited to one. In DVD video, when movie content is recorded as a video object, various languages are assigned to the channels of the audio streams so that streams can be switched and selected. DVD audio, by contrast, basically targets music content, so switching between streams is not necessarily required; all channels can be reproduced and output at the same time, which means they can be unified into a single stream. The system of the present invention thus unifies the simultaneously transferred linear PCM audio streams into one.

【００５６】次にＤＶＤオーディオにおける最大転送レ
ートを６．１４４Ｍｂｐｓから９．６Ｍｂｐｓに増加さ
せた。先に述べたようにＤＶＤビデオの全体のデータス
トリームをみると、ビデオデータ、サブピクチャーデー
タ、オーディオデータ、ナビゲーションデータ等の各パ
ックが時分割多重されて伝送されている。このような伝
送データ全体を含めて最大転送レートが、９．６Ｍｂｐ
ｓに制限されている。このため、オーディオデータに関
して、６．１４４Ｍｂｐｓ以上の転送レートに上げるこ
とは困難である。しかし、ＤＶＤオーディオに関して
は、ＤＶＤビデオに比較し、若干の制御データ以外はす
べてオーディオデータであるために、オーディオデータ
の量を多くでき転送レートを増大することができる。Next, the maximum transfer rate in DVD audio was increased from 6.144 Mbps to 9.6 Mbps. As described earlier, in the overall data stream of DVD video, packs of video data, sub-picture data, audio data, navigation data and so on are time-division multiplexed for transmission. The maximum transfer rate for all of this transmitted data together is limited to 9.6 Mbps, which makes it difficult to raise the transfer rate of the audio data above 6.144 Mbps. In DVD audio, however, everything other than a small amount of control data is audio data, so the amount of audio data can be enlarged and the transfer rate increased.

【００５７】上記のようにＤＶＤオーディオにおける最
大転送レートを増大したので、図２で説明したような１
オーディオフレームの中のサンプル数を、ＤＶＤビデオ
の場合の半分にする。よって、サンプリング周波数ｆｓ
に対してサンプル数を、ｆｓ＝４８ｋＨｚまたは４４．１ｋＨｚでは４０個／フ
レームｆｓ＝９６ｋＨｚまたは８８．２ｋＨｚでは８０個／フ
レームｆｓ＝１９２ｋＨｚまたは１７６．４ｋＨｚでは１６０
個／フレームとした。（なおＤＶＤビデオでは、４４．１ｋＨｚ、８
８．２ｋＨｚ、１７６．４ｋＨｚ及び１９２ｋＨｚはサ
ポートしていない。）これは、１オーディオフレームの
中に最低１個のオーディオパックが入り、オーディオフ
レームが必ずプレゼンテーションタイムスタンプ（ＰＴ
Ｓ）データ（再生時のシステムタイムスタンプと同期さ
せるためのデータ）を持つようにするためである。Since the maximum transfer rate of DVD audio has been increased as described above, the number of samples in one audio frame, as explained with FIG. 2, is set to half that of DVD video. Accordingly, the number of samples for each sampling frequency fs is:
fs = 48 kHz or 44.1 kHz: 40 samples/frame
fs = 96 kHz or 88.2 kHz: 80 samples/frame
fs = 192 kHz or 176.4 kHz: 160 samples/frame
(Note that DVD video does not support 44.1 kHz, 88.2 kHz, 176.4 kHz or 192 kHz.) This is so that at least one audio pack falls within one audio frame and every audio frame always carries presentation time stamp (PTS) data (data for synchronizing with the system time stamp during reproduction).

【００５８】ここで、さらに、ＤＶＤオーディオでは、
ＤＶＤビデオを凌ぐ高音質音声仕様を実現するためにス
ケーラブル方式を採用する。即ち、今まで１ストリーム
内の全チャンネルがサンプリング周波数ｆｓ、Ｑｂに関
して同一属性であったのに対して、１ストリームの中に
異なる属性を持つチャンネルを認めることとした。Furthermore, DVD audio adopts a scalable method in order to realize a high-quality audio specification that surpasses DVD video. That is, whereas until now all channels in one stream had identical attributes with respect to the sampling frequency fs and the quantization bit number Qb, channels with different attributes are now allowed within one stream.

【００５９】これは、例えばＲ（右チャンネル），Ｌ
（左チャンネル），サラウンド用のＣ（中央チャンネ
ル），ＳＲ（後方右チャンネル），ＳＬ（後方左チャン
ネル），ＳＷ（低域チャンネル）の６チャンネルのうち
すべてのチャンネルを高音質（高いサンプリング周波数
ｆｓ）にする必要がなく、メインとなるチャンネル（例
えばＲ，Ｌ）を高音質（例えばｆｓ＝９６ｋＨｚ）と
し、他のサブとなるチャンネル（Ｃ，ＳＲ，ＳＬ，Ｓ
Ｗ）を現状の音質（ｆｓ＝４８ｋＨｚ）としても、全体
としては十分高音質とすることができると言う事実に基
づくものである。This is based on the fact that, of the six channels R (right channel), L (left channel), surround C (center channel), SR (rear right channel), SL (rear left channel) and SW (low-frequency channel), not every channel needs to have high sound quality (a high sampling frequency fs). Even if the main channels (e.g., R and L) are given high sound quality (e.g., fs = 96 kHz) and the other, sub channels (C, SR, SL, SW) keep the current sound quality (fs = 48 kHz), the sound quality as a whole is still sufficiently high.

【００６０】ここでスケーラブル方式を用いたオーディ
オシステムの概念を簡単に説明すると、次のようにな
る。Here, the concept of the audio system using the scalable method will be briefly described as follows.

【００６１】オーディオに関しての１つのチャンネル群
の信号の最大転送（伝送）レートは、６．１４４Ｍｂｐ
ｓ以下，１ストリームの信号の転送レート合計である最
大転送レートは９．８Ｍｂｐｓ以下となることを目標と
している。チャンネル群とは、例えばステレオのＲ，Ｌ
チャンネル（メインの２チャンネル）を含むデジタル信
号のことである。またＣ，ＳＲ，ＳＬ，ＳＷのまとまっ
たストリームも１つのチャンネル群である。The maximum transfer (transmission) rate of the signal of one channel group for audio is targeted to be 6.144 Mbps or less, and the maximum transfer rate, which is the total of the transfer rates of the signals in one stream, is targeted to be 9.8 Mbps or less. A channel group is, for example, a digital signal containing the stereo R and L channels (the two main channels). A stream grouping C, SR, SL and SW is likewise one channel group.

【００６２】次に、記録媒体に記録する信号として、例
えば６チャンネルのオーディオ信号を記録する場合につ
いて説明する。ここで言う６チャンネルは、例えば上記
したサラウンド方式におけるＲ，Ｌ，Ｃ，ＳＲ，ＳＬ，
ＳＷであり各チャンネルに対応した信号が作られてされ
ている。Ｒ，Ｌをメインチャンネルとし、他をサブチャ
ンネルとして区別することも可能である。そして、各チ
ャンネルの信号が再生されて、それぞれがスピーカに供
給されれば立体的な音響効果を得るものである。ここ
で、この発明の方式では、上記の６チャンネルを、第１
のチャンネル群と第２のチャンネル群として生成する。
この場合、第１のチャンネル群を構成するチャンネルと
しては重要度の高いＲ，Ｌとし、第２のチャンネル群を
構成するチャンネルとしてはＣ，ＳＲ，ＳＬ，ＳＷを選
択する。ここで、第１のチャンネル群の信号は、サンプ
リング周波数ｆｓが高く、第２のチャンネル群の信号
は、ｆｓ／２のサンプリング周波数（整数分の１）とさ
れる。Next, the case of recording, for example, a six-channel audio signal on the recording medium will be described. The six channels here are, for example, R, L, C, SR, SL and SW of the surround system described above, and a signal is produced for each channel. R and L may be distinguished as main channels and the others as sub channels. When the signal of each channel is reproduced and supplied to its own speaker, a three-dimensional sound effect is obtained. In the method of the present invention, these six channels are generated as a first channel group and a second channel group. In this case, the highly important R and L are selected as the channels of the first channel group, and C, SR, SL and SW as the channels of the second channel group. The signals of the first channel group have a high sampling frequency fs, and the signals of the second channel group have a sampling frequency of fs/2 (an integer fraction of fs).

【００６３】図９には、第１のチャンネル群の信号の処
理系統と、第２のチャンネル群の信号の処理系統とを具
体的に示している。FIG. 9 specifically shows a signal processing system of the first channel group and a signal processing system of the second channel group.

【００６４】アナログ信号源１０には、サラウンド方式
におけるＲ，Ｌ，Ｃ，ＳＲ，ＳＬ，ＳＷチャンネルの信
号が用意され、サンプリング部１１に供給される。サン
プリング部１１は、各チャンネルの信号を９６ｋＨｚで
サンプリングし、サンプリング信号は、量子化部１２に
入力されて、２４ビットに量子化され，ＰＣＭ（パルス
コード変調）信号に変換される。In the analog signal source 10, signals of R, L, C, SR, SL, and SW channels in the surround system are prepared and supplied to the sampling unit 11. The sampling section 11 samples the signal of each channel at 96 kHz, and the sampled signal is input to the quantization section 12, quantized to 24 bits, and converted into a PCM (pulse code modulation) signal.

【００６５】次にＣ，ＳＲ，ＳＬ，ＳＷチャンネルの各
信号は、サンプリング周波数を９６ｋＨｚからその１／
２の４８ｋＨｚに周波数変換される。９６ｋＨｚのＲ，
Ｌチャンネルの信号は、位相合せ部１４に入力されて、
サンプル間の位相の対応がとれるように位相合わせされ
る。実際には、周波数変換部１３の遅延量と同じ遅延量
が、位相合せ部１４に設定されている。次に、遅延され
た９６ｋＨｚのＲ，Ｌチャンネルの信号は、フレーム化
部１５に入力されて、所定のサンプル数毎にフレーム化
される。Next, the signals of the C, SR, SL and SW channels are frequency-converted from the 96 kHz sampling frequency to half of it, 48 kHz. The 96 kHz R and L channel signals are input to the phase matching unit 14 and phase-aligned so that the samples correspond in phase. In practice, the phase matching unit 14 is set to the same delay as the frequency conversion unit 13. The delayed 96 kHz R and L channel signals are then input to the framing unit 15 and framed every predetermined number of samples.

【００６６】また周波数数変換後の４８ｋＨｚのＣ，Ｓ
Ｒ，ＳＬ，ＳＷチャンネルの各信号は、フレーム化部１
６に入力されて、所定のサンプル数毎にフレーム化され
る。フレーム化された各信号は、パケット化部１７に入
力されて、所定のフォーマットのパケットに変換され
る。そして９６ｋＨｚ系のストリーム（第１の属性Ａｔ
ｒ１のストリーム）と、４８ｋＨｚ系のストリーム（第
２の属性Ａｔｒ２のストリーム）とが得られる。しかし
この２つのストリームは、パケットヘッダに識別子（Ｉ
Ｄ）が付されて識別されている。この２つのチャンネル
群のパケットは、更にパック化されて、マルチプレック
スされて出力され、記録処理部（図示せず）を介してデ
ィスク１８に記録される。The frequency-converted 48 kHz signals of the C, SR, SL and SW channels are input to the framing unit 16 and framed every predetermined number of samples. The framed signals are input to the packetizing unit 17 and converted into packets of a predetermined format. This yields a 96 kHz stream (the stream with the first attribute Atr1) and a 48 kHz stream (the stream with the second attribute Atr2). The two streams are distinguished by an identifier (ID) attached to the packet header. The packets of the two channel groups are further packed, multiplexed and output, and recorded on the disk 18 via a recording processing unit (not shown).
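The recording-side split into two channel groups can be sketched as follows. This is only an illustration under stated assumptions: the channel names are used as dictionary keys, and the frequency conversion is replaced by plain decimation (keeping every second sample), whereas a real converter such as the frequency conversion unit 13 would low-pass filter before reducing the rate. The names `split_into_groups` and `decimate_by_two` are hypothetical.

```python
MAIN_CHANNELS = ("L", "R")              # first channel group, kept at 96 kHz
SUB_CHANNELS = ("C", "SR", "SL", "SW")  # second channel group, reduced to 48 kHz


def decimate_by_two(samples):
    """Stand-in for the frequency conversion: keep every second sample.

    A real converter would apply an anti-aliasing filter first.
    """
    return list(samples[::2])


def split_into_groups(pcm):
    """pcm maps a channel name to its list of 96 kHz samples.

    Returns (group1, group2): group1 holds the unchanged main channels,
    group2 holds the sub channels at half the sampling frequency.
    """
    group1 = {ch: list(pcm[ch]) for ch in MAIN_CHANNELS}
    group2 = {ch: decimate_by_two(pcm[ch]) for ch in SUB_CHANNELS}
    return group1, group2


if __name__ == "__main__":
    source = {ch: list(range(8)) for ch in MAIN_CHANNELS + SUB_CHANNELS}
    g1, g2 = split_into_groups(source)
    print(len(g1["L"]), len(g2["C"]))
```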

【００６７】上記ディスク１８に記録された信号が再生
される場合には、次のような処理が行われる。When the signal recorded on the disk 18 is reproduced, the following processing is performed.

【００６８】ディスク１８から光学的に読み出された信
号は、エラー訂正や復調などを行う復調部（図示せず）
を介してパケット処理部２１に入力される。このパケッ
ト処理部２１においては、パケットヘッダの識別子を参
照してチャンネル群を識別する。この識別により第１の
チャンネル群のパケットと、第２のチャンネル群のパケ
ットを識別することができ、各チャンネル群の信号が振
り分けられる。いわゆるデマルチプレックスされる。第
１のチャンネル群の信号は、フレーム処理部２２に入力
されて、フレームの解除が行われ、Ｒ，Ｌチャンネルの
信号として出力される。また、第２のチャンネル群の信
号は、フレーム処理部２３に入力され、フレームの解除
が行われ、Ｃ，ＳＲ，ＳＬ，ＳＷチャンネルの各信号と
して出力される。The signal optically read from the disk 18 is input to the packet processing unit 21 via a demodulation unit (not shown) that performs error correction, demodulation and the like. The packet processing unit 21 identifies the channel group by referring to the identifier in the packet header. This distinguishes packets of the first channel group from packets of the second channel group, and the signals are sorted by channel group, that is, demultiplexed. The signal of the first channel group is input to the frame processing unit 22, where the framing is released, and is output as the R and L channel signals. The signal of the second channel group is input to the frame processing unit 23, where the framing is released, and is output as the C, SR, SL and SW channel signals.
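The sorting of packets by the identifier in the packet header can be pictured with the short sketch below. The identifier values and the `(stream_id, payload)` packet representation are assumptions made for illustration; the text does not give the actual ID values or header layout.

```python
GROUP1_ID = 0x01  # hypothetical identifier of the first channel group
GROUP2_ID = 0x02  # hypothetical identifier of the second channel group


def demultiplex(packets):
    """Sort (stream_id, payload) tuples into the two channel groups.

    Packets carrying any other identifier are ignored.
    Returns (group1_payloads, group2_payloads) in arrival order.
    """
    groups = {GROUP1_ID: [], GROUP2_ID: []}
    for stream_id, payload in packets:
        if stream_id in groups:
            groups[stream_id].append(payload)
    return groups[GROUP1_ID], groups[GROUP2_ID]


if __name__ == "__main__":
    mixed = [(GROUP1_ID, b"a"), (GROUP2_ID, b"b"), (GROUP1_ID, b"c")]
    print(demultiplex(mixed))
```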

【００６９】ここで、Ｒ，Ｌチャンネルの信号は、位相
合せ部２４に入力され、Ｃ，ＳＲ，ＳＬ，ＳＷチャンネ
ルの各信号は、４８ｋＨｚから９６ｋＨｚにサンプル周
波数を変換（アップコンバート）するための周波数変換
部２５に入力される。Here, the R and L channel signals are input to the phase matching unit 24, and the C, SR, SL and SW channel signals are input to the frequency conversion unit 25, which converts (up-converts) the sample frequency from 48 kHz to 96 kHz.

【００７０】位相合せされ、かつサンプル周波数が同じ
になったＲ，Ｌチャンネルの信号と、Ｃ，ＳＲ，ＳＬ，
ＳＷチャンネルの各信号とは、９６ｋＨｚのデジタルア
ナログ変換部２６に入力され、ＰＣＭ復号され、アナロ
グ信号に変換されて出力されることになる。The phase-aligned R and L channel signals and the C, SR, SL and SW channel signals, now at the same sample frequency, are input to the 96 kHz digital-to-analog converter 26, PCM-decoded, converted into analog signals and output.

【００７１】以上の処理により、高品位のＲ，Ｌチャン
ネルの信号、及びＣ，ＳＲ，ＳＬ，ＳＷチャンネルの各
信号を再生することができる。With the above processing, high-quality R and L channel signals and C, SR, SL and SW channel signals can be reproduced.

【００７２】この発明においては、上記のように１フレ
ーム内のサンプルデータが、再生したときに１／６００
秒となるようなサンプル数に設定されている。このため
に、９６ｋＨｚ系のストリーム（第１のチャンネル群）
と、４８ｋＨｚ系のストリーム（第２のチャンネル群）
との１フレーム内のサンプル数が異なることになる。図
１０には、フレーム内に存在するサンプルを、第１のチ
ャンネル群と第２のチャンネル群とで比較して示してい
る。位相合せ部１４では、第１のチャンネル群と第２の
チャンネル群の位相合せを行いフレームを作成してい
る。In the present invention, as described above, the number of samples in one frame is set so that the sample data in the frame lasts 1/600 second when reproduced. Consequently the number of samples per frame differs between the 96 kHz stream (first channel group) and the 48 kHz stream (second channel group). FIG. 10 compares the samples present in a frame for the first channel group and the second channel group. The phase matching unit 14 aligns the phases of the first and second channel groups when the frames are created.

【００７３】そして、フレーム化部１５、１６において
は、第１と第２のチャンネル群の対応するフレーム（時
間的に同一時刻で再生されるべきフレーム）の先頭に、
同一のプレゼンテーションタイムスタンプを付加してい
る。この結果、再生時において、フレーム処理部２２、
２３においてフレーム解除を行い、デジタルアナログ変
換部に供給する場合、各フレームの解除タイミングは、
同一のプレゼンテーションタイムスタンプを有するフレ
ームを同時に解除すればよい。The framing units 15 and 16 add the same presentation time stamp to the heads of corresponding frames of the first and second channel groups (frames to be reproduced at the same time). As a result, when the frame processing units 22 and 23 release the frames and supply them to the digital-to-analog converter during reproduction, frames having the same presentation time stamp need only be released at the same time.
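The synchronization rule described above, releasing frames of the two channel groups together when they carry the same presentation time stamp, can be sketched as a simple pairing step. The `Frame` record and its fields are hypothetical and only stand in for whatever the frame processing units 22 and 23 actually hold; the PTS unit is left abstract.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    pts: int        # presentation time stamp (unit left abstract here)
    group: int      # 1 = first channel group, 2 = second channel group
    samples: list   # frame payload


def pair_frames_by_pts(frames):
    """Return (group-1 frame, group-2 frame) pairs that share a PTS.

    Pairs are returned in ascending PTS order; a frame whose counterpart
    has not arrived is left out.
    """
    by_pts = {}
    for frame in frames:
        by_pts.setdefault(frame.pts, {})[frame.group] = frame
    return [
        (slot[1], slot[2])
        for _, slot in sorted(by_pts.items())
        if 1 in slot and 2 in slot
    ]


if __name__ == "__main__":
    received = [
        Frame(pts=150, group=2, samples=[0] * 80),
        Frame(pts=0, group=1, samples=[0] * 160),
        Frame(pts=0, group=2, samples=[0] * 80),
        Frame(pts=150, group=1, samples=[0] * 160),
    ]
    for f1, f2 in pair_frames_by_pts(received):
        print(f1.pts, len(f1.samples), len(f2.samples))
```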

【００７４】上記したように、ＤＶＤオーディオでは、
本来ならば１オーディオストリームを構成するチャンネ
ル群を２つの属性グループＡｔｒ１，Ａｔｒ２に分ける
ことができることとした。ここで属性とは、標本化周波
数ｆｓ、量子化ビット数Ｑｂ，チャンネル数などがあ
る。勿論１ストリーム中の全チャンネルの属性が同一の
場合は、２つの属性グループに分けなくても良い。As described above, DVD audio allows the channels that would ordinarily form one audio stream to be divided into two attribute groups, Atr1 and Atr2. The attributes here include the sampling frequency fs, the quantization bit number Qb, the number of channels and so on. Of course, if all channels in one stream have the same attributes, they need not be divided into two attribute groups.

【００７５】上記の例のように、サラウンド６チャンネ
ルの場合を整理すると次のようになる。The arrangement of the case of six surround channels as in the above example is as follows.

【００７６】チャンネル群Ｒ，Ｌの属性（Ａｔｒ１）と
して、ｆｓ＝９６ｋＨｚ，Ｑｂ＝２４ｂｉｔ，チャンネ
ル群Ｃ，ＳＲ，ＳＬ，ＳＷ属性（Ａｔｒ２）として、ｆ
ｓ＝４８ｋＨｚ，Ｑｂ＝２４ｂｉｔの２種類が存在する
ことになる。するとこの場合の転送レートは、２．３０
４×２＋１．１１５２×４＝９．２１６Ｍｂｐｓとなっ
て、上述した最大転送レート９．８Ｍｂｐｓを満足する
ことになる。よって、スケーラブル方式を導入すること
により、高音質の音声信号仕様をもつオーディオデータ
構造を得ることができる。Two kinds of attributes then exist: fs = 96 kHz and Qb = 24 bits as the attribute (Atr1) of the channel group R, L, and fs = 48 kHz and Qb = 24 bits as the attribute (Atr2) of the channel group C, SR, SL, SW. The transfer rate in this case is 2.304 × 2 + 1.152 × 4 = 9.216 Mbps, which satisfies the maximum transfer rate of 9.8 Mbps mentioned above. Introducing the scalable method therefore yields an audio data structure with a high-quality audio signal specification.
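The total for the mixed-attribute stream can be checked with a few lines of arithmetic. This is a sketch assuming raw PCM payload only; the tuple layout `(fs, bits, channels)` and the function names are illustrative choices, and the 9.8 Mbps ceiling is the value quoted in the text.

```python
MAX_STREAM_RATE_BPS = 9_800_000  # 9.8 Mbps ceiling quoted in the text


def group_rate_bps(fs_hz: int, bits: int, channels: int) -> int:
    """Raw PCM rate of one channel group in bit/s."""
    return fs_hz * bits * channels


def scalable_stream_rate_bps(groups) -> int:
    """Sum the rates of all channel groups, each given as (fs, bits, channels)."""
    return sum(group_rate_bps(fs, bits, ch) for fs, bits, ch in groups)


ATR1 = (96_000, 24, 2)  # R, L
ATR2 = (48_000, 24, 4)  # C, SR, SL, SW

if __name__ == "__main__":
    total = scalable_stream_rate_bps([ATR1, ATR2])
    print(total / 1_000_000, "Mbps, within limit:", total <= MAX_STREAM_RATE_BPS)
```

For comparison, six channels that all use 96 kHz / 24 bit would need 13.824 Mbps, which is why the attributes are split between the two groups.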

【００７７】上記の説明では、第１チャンネル群、第２
チャンネル群における属性としてサンプリング周波数ｆ
ｓ、量子化ビット数Ｑｂを含めて考慮した。In the above description, the sampling frequency fs and the quantization bit number Qb were considered as the attributes of the first and second channel groups.

【００７８】この発明の方式では、各チャンネル群の属
性としてはサンプリング周波数が群で異なり、量子化ビ
ット数が同じの場合、サンプリング周波数が同じで量子
化ビット数が異なる場合、サンプリング周波数が同じ、
量子化ビット数も同じの場合、サンプリング周波数も異
なり、量子化ビット数も異なる場合等種々の組み合わせ
において、要は、上述した最大転送レート９．８Ｍｂｐ
ｓを満足するストリームを構成してもよい。In the method of the present invention, the attributes of the channel groups may be combined in various ways: different sampling frequencies with the same quantization bit number, the same sampling frequency with different quantization bit numbers, the same sampling frequency and the same quantization bit number, or different sampling frequencies and different quantization bit numbers. In short, any stream that satisfies the maximum transfer rate of 9.8 Mbps mentioned above may be configured.

【００７９】図１１は、ケース１であり、第１チャンネ
ル群、第２チャンネル群における属性Ａｒｔ１，Ａｒｔ
２としてサンプリング周波数ｆｓ＝９６ｋＨｚとｆｓ＝
４８ｋＨｚとを示している。FIG. 11 shows case 1, with sampling frequencies fs = 96 kHz and fs = 48 kHz as the attributes Atr1 and Atr2 of the first and second channel groups.

【００８０】図１２は、ケース２の場合を示している。
この場合は、第１チャンネル群の属性Ａｔｒ１として、
ｆｓ＝９６ｋＨｚ，第２チャンネル群の属性（Ａｔｒ
２）として、ｆｓ＝４８ｋＨｚの場合を示している。FIG. 12 shows case 2. Here the attribute Atr1 of the first channel group is fs = 96 kHz and the attribute Atr2 of the second channel group is fs = 48 kHz.

【００８１】図１３は、ケース３の場合を示している。
この場合は、第１チャンネル群の属性Ａｔｒ１として、
ｆｓ＝９６ｋＨｚ，第２チャンネル群の属性Ａｔｒ２と
して、ｆｓ＝４８ｋＨｚの場合を示している。FIG. 13 shows case 3. Here the attribute Atr1 of the first channel group is fs = 96 kHz and the attribute Atr2 of the second channel group is fs = 48 kHz.

【００８２】上記のように１ストリームの中に異なる属
性のチャンネル群が存在する場合、この発明の方式で
は、データ構造として、次のようなデータ構造とする。When a channel group having different attributes exists in one stream as described above, the data structure of the present invention has the following data structure.

【００８３】図１４のデータ構造は、図１１のケース１
に対応するもので、第１チャンネル群における属性Ａｒ
ｔ１として、サンプリング周波数ｆｓ＝９６ｋＨｚ、量
子化ビット数Ｑｓ＝１６ｂｉｔを採用し、第２チャンネ
ル群における属性Ａｒｔ２として、ｆｓ＝４８ｋＨｚ、
量子化ビット数Ｑｓ＝１６ｂｉｔを採用した例である。
また、このデータ構造は、上記のスケーラブル方式に加
え、ＤＶＤビデオのサンプル配列構造に類似するをデー
タ構造を構築している。The data structure of FIG. 14 corresponds to case 1 of FIG. 11. The attribute Atr1 of the first channel group is a sampling frequency fs = 96 kHz with a quantization bit number Qb = 16 bits, and the attribute Atr2 of the second channel group is fs = 48 kHz with Qb = 16 bits. In addition to the scalable method described above, this data structure is built to resemble the sample arrangement structure of DVD video.

【００８４】即ち、４サンプルＳ4n、Ｓ4n+1、Ｓ4n+2、
Ｓ4n+4 が第１属性のメインサンプル、そして２サンプ
ルＳ2n、Ｓ2n+1 が第２属性のメインサンプルである。
この場合は、量子化ビット数が１６ビットのためにエキ
ストラサンプルは、存在しない。That is, the four samples S4n, S4n+1, S4n+2 and S4n+3 are main samples of the first attribute, and the two samples S2n and S2n+1 are main samples of the second attribute. Because the quantization bit number is 16 bits, there are no extra samples in this case.

【００８５】この例は、サンプリング周波数の関係で、
第１のチャンネル群の４サンプルに対して、第２のチャ
ンネル群の２サンプルが対応することになる。メインと
なる第１チャンネル群に関しては４サンプルが基本とな
り、第２チャンネル群も加えると全体では６サンプルが
基本となる。In this example, because of the relationship between the sampling frequencies, two samples of the second channel group correspond to four samples of the first channel group. The main, first channel group is organized in units of four samples; with the second channel group included, the overall unit is six samples.

【００８６】即ち、図１４のデータ構造は、複数チャン
ネルのうち少なくとも２つのチャンネルである第１のチ
ャンネル群の信号を第１の周波数でサンプルリングし、
他のチャンネルである第２のチャンネル群の信号を第２
の周波数でサンプルリングしたものであり、前記第１の
周波数でサンプリングされた第１のチャンネル群の各チ
ャンネルのサンプルのＳ４ｎ番目、Ｓ４ｎ＋１番目、Ｓ
４ｎ＋２番目、Ｓ４ｎ＋３番目を順次配列し、この次
に、前記第２の周波数でサンプリングされた第２のチャ
ンネル具の各チャンネルのサンプルのＳ２ｎ番目、Ｓ２
ｎ＋１番目を順次配列している（但し、ｎ＝０，１，
２，… ）。That is, in the data structure of FIG. 14, the signals of a first channel group consisting of at least two of the plural channels are sampled at a first frequency, and the signals of a second channel group consisting of the other channels are sampled at a second frequency. The S4n-th, S4n+1-th, S4n+2-th and S4n+3-th samples of each channel of the first channel group, sampled at the first frequency, are arranged in order, followed by the S2n-th and S2n+1-th samples of each channel of the second channel group, sampled at the second frequency (where n = 0, 1, 2, ...).
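The ordering described for FIG. 14 can be written out as a small interleaving routine. This is a sketch under simplifying assumptions: each list element stands for one "sample" in the sense of the text (the data of all channels of that group at one sampling instant), and the function name is hypothetical.

```python
def interleave_fig14(group1, group2):
    """Arrange samples as S4n, S4n+1, S4n+2, S4n+3 (group 1), then S2n, S2n+1 (group 2).

    group1 holds samples taken at the first (higher) frequency and group2
    samples taken at half that frequency, so group1 must be twice as long.
    """
    if len(group1) % 4 != 0 or len(group1) != 2 * len(group2):
        raise ValueError("expected 4 group-1 samples for every 2 group-2 samples")
    out = []
    for n in range(len(group1) // 4):
        out.extend(group1[4 * n:4 * n + 4])  # four samples of the first attribute
        out.extend(group2[2 * n:2 * n + 2])  # two samples of the second attribute
    return out


if __name__ == "__main__":
    a = [f"A{i}" for i in range(8)]  # first channel group
    b = [f"B{i}" for i in range(4)]  # second channel group
    print(interleave_fig14(a, b))
```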

【００８７】図１５のデータ構造は、図１２のケース２
に対応するもので、第１チャンネル群における属性Ａｒ
ｔ１として、サンプリング周波数ｆｓ＝９６ｋＨｚ、量
子化ビット数Ｑｓ＝２４ｂｉｔを採用し、第２チャンネ
ル群における属性Ａｒｔ２として、ｆｓ＝９６ｋＨｚ、
量子化ビット数Ｑｓ＝２０ｂｉｔを採用した例である。
この場合は、２対サンプルＳ2n、Ｓ2n+1 、ｅ2n、ｅ2n
+1 が第１属性のメインサンプルとエキストラサンプル
を含み、他の２対サンプルＳ2n、Ｓ2n+1 、ｅ2n、ｅ2n
+1 が第２属性のメインサンプルであり、全体は４対サ
ンプルが基本となる。第１属性のｅ2n、ｅ2n+1 が第２
属性のエキストラサンプルある。The data structure of FIG. 15 corresponds to case 2 of FIG. 12. The attribute Atr1 of the first channel group is a sampling frequency fs = 96 kHz with a quantization bit number Qb = 24 bits, and the attribute Atr2 of the second channel group is fs = 96 kHz with Qb = 20 bits. In this case, two pairs of samples S2n, S2n+1, e2n, e2n+1 comprise the main samples and extra samples of the first attribute, and the other two pairs of samples S2n, S2n+1, e2n, e2n+1 belong to the second attribute, S2n and S2n+1 being its main samples and e2n and e2n+1 its extra samples. The overall unit is four pairs of samples.

【００８８】即ち、図１５のデータ構造は、複数チャン
ネルのうち少なくとも２つのチャンネルである第１チャ
ンネル群の信号を第１の周波数でサンプルリングし、他
のチャンネルである第２チャンネル群の信号を第２の周
波数でサンプルリングしたものであり、更にサンプルデ
ータを、ＭＳＢ側のｍ1 ビットのメインワードとＬＳＢ
側のｍ2 ビットのエキストラワードとに分けている。That is, in the data structure of FIG. 15, the signals of a first channel group consisting of at least two of the plural channels are sampled at a first frequency, and the signals of a second channel group consisting of the other channels are sampled at a second frequency. Each sample data item is further divided into a main word of m1 bits on the MSB side and an extra word of m2 bits on the LSB side.
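The division of a sample into an MSB-side main word and an LSB-side extra word amounts to a shift and a mask. The sketch below assumes unsigned sample values and a 16-bit main word (m1 = 16), so that m2 is 8 for 24-bit samples, 4 for 20-bit samples and 0 for 16-bit samples; the value of m1 and the function names are assumptions made for illustration.

```python
def split_word(sample: int, bits: int, main_bits: int = 16):
    """Split an unsigned `bits`-wide sample into (main word, extra word).

    The main word is the upper `main_bits` bits (m1); the extra word is the
    remaining lower bits (m2 = bits - main_bits).
    """
    extra_bits = bits - main_bits
    if extra_bits < 0:
        raise ValueError("sample is narrower than the main word")
    if not 0 <= sample < (1 << bits):
        raise ValueError("sample does not fit in the stated bit width")
    main = sample >> extra_bits
    extra = sample & ((1 << extra_bits) - 1)
    return main, extra


def join_word(main: int, extra: int, bits: int, main_bits: int = 16) -> int:
    """Inverse of split_word: rebuild the full-width sample."""
    return (main << (bits - main_bits)) | extra


if __name__ == "__main__":
    main, extra = split_word(0xABCDEF, 24)
    print(hex(main), hex(extra), hex(join_word(main, extra, 24)))
```

With a 16-bit quantization the extra word is empty, which is consistent with the statement for FIG. 14 that no extra samples exist in that case.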

【００８９】そして、第１のチャンネル群の各チャンネ
ルの（２ｎ）番目のサンプルデータのメインワードをま
とめてメインサンプルＳ２n として配置し、この次に第
１のチャンネル群の各チャンネルの（２ｎ＋１）番目の
サンプルデータのメインワードをまとめてメインサンプ
ルＳ２n+１とし配置し、この次に前記第１のチャンネル
群の各チャンネルの（２ｎ）番目のサンプルデータのエ
キストラワードをまとめてエキストラサンプルｅ2nとし
て配置し、この次に前記第１のチャンネル群の各チャン
ネルの（２ｎ＋１）番目のサンプルデータのエキストラ
ワードをまとめてエキストラサンプルｅ2n+1として配置
している。Then, the main words of the (2n)-th sample data of each channel of the first channel group are collected and arranged as main sample S2n; next, the main words of the (2n+1)-th sample data of each channel of the first channel group are collected and arranged as main sample S2n+1; next, the extra words of the (2n)-th sample data of each channel of the first channel group are collected and arranged as extra sample e2n; and next, the extra words of the (2n+1)-th sample data of each channel of the first channel group are collected and arranged as extra sample e2n+1.

【００９０】そして更にこの次に前記第２のチャンネル
群の各チャンネルの（２ｎ）番目のサンプルデータのメ
インワードをまとめてメインサンプルＳ２n として配置
し、この次に第２のチャンネル群の各チャンネルの（２
ｎ＋１）番目のサンプルデータのメインワードをまとめ
てメインサンプルＳ２n+１とし配置し、この次に前記第
２のチャンネル群の各チャンネルの（２ｎ）番目のサン
プルデータのエキストラワードをまとめてエキストラサ
ンプルｅ2nとして配置し、この次に前記第２のチャンネ
ル群の各チャンネルの（２ｎ＋１）番目のサンプルデー
タのエキストラワードをまとめてエキストラサンプルｅ
2n+1として配置している（但し、ｎ＝０，１，２，…
）。Following this, the main words of the (2n)-th sample data of each channel of the second channel group are collected and arranged as main sample S2n; next, the main words of the (2n+1)-th sample data of each channel of the second channel group are collected and arranged as main sample S2n+1; next, the extra words of the (2n)-th sample data of each channel of the second channel group are collected and arranged as extra sample e2n; and next, the extra words of the (2n+1)-th sample data of each channel of the second channel group are collected and arranged as extra sample e2n+1 (where n = 0, 1, 2, ...).
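The arrangement described for FIG. 15 places, for each n, the two main samples and two extra samples of the first channel group, followed by the same four items of the second channel group. The sketch below illustrates that ordering only; the list-based representation and the function name are assumptions, and each list element stands for one already-assembled main or extra sample.

```python
def layout_fig15(g1_main, g1_extra, g2_main, g2_extra):
    """Emit S2n, S2n+1, e2n, e2n+1 for group 1, then the same for group 2.

    All four lists must have the same even length, because both channel
    groups are sampled at the same frequency in this case.
    """
    length = len(g1_main)
    if length % 2 != 0 or any(
        len(seq) != length for seq in (g1_extra, g2_main, g2_extra)
    ):
        raise ValueError("all inputs must share the same even length")
    out = []
    for n in range(length // 2):
        for main, extra in ((g1_main, g1_extra), (g2_main, g2_extra)):
            out.extend(main[2 * n:2 * n + 2])   # main samples S2n, S2n+1
            out.extend(extra[2 * n:2 * n + 2])  # extra samples e2n, e2n+1
    return out


if __name__ == "__main__":
    print(layout_fig15(["M0", "M1"], ["x0", "x1"], ["N0", "N1"], ["y0", "y1"]))
```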

【００９１】図１６のデータ構造は、図１３のケース３
に対応するもので、第１チャンネル群における属性Ａｒ
ｔ１として、サンプリング周波数ｆｓ＝４８ｋＨｚ、量
子化ビット数Ｑｓ＝１６ｂｉｔを採用し、第２チャンネ
ル群の属性Ａｒｔ２として、ｆｓ＝４８ｋＨｚ、量子化
ビット数Ｑｓ＝１６ｂｉｔを採用した例である。The data structure of FIG. 16 corresponds to case 3 of FIG. 13. In this example the attribute Atr1 of the first channel group is a sampling frequency fs = 48 kHz with a quantization bit number Qb = 16 bits, and the attribute Atr2 of the second channel group is fs = 48 kHz with Qb = 16 bits.

【００９２】この場合は、Ｓ4n、Ｓ4n+2 が第１属性の
メインサンプル、ｅ4n、ｅ4n+2 が第１属性のエキスト
ラサンプル、Ｓ4n、Ｓ4n+2 が第２属性のメインサンプ
ル、ｅ4n、ｅ4n+2 が第１属性のエキストラサンプルで
ある。第１、第２のチャンネル群はそれぞれ２対サンプ
ルが基本となり、全体では４ついサンプルが基本とな
る。In this case, S4n and S4n+2 are main samples of the first attribute, e4n and e4n+2 are extra samples of the first attribute, S4n and S4n+2 are main samples of the second attribute, and e4n and e4n+2 are extra samples of the second attribute. The first and second channel groups are each organized in units of two pairs of samples, and the overall unit is four pairs of samples.

【００９３】即ち、図１６のデータ構造は、複数チャン
ネルのうち少なくとも２つのチャンネルである第１チャ
ンネル群の信号を第１の周波数でサンプルリングし、他
のチャンネルである第２チャンネル群の信号を第２の周
波数でサンプルリングしたものであり、更にサンプルデ
ータを、ＭＳＢ側のｍ1 ビットのメインワードとＬＳＢ
側のｍ2 ビットのエキストラワードとに分けている。That is, in the data structure of FIG. 16, the signals of a first channel group consisting of at least two of the plural channels are sampled at a first frequency, and the signals of a second channel group consisting of the other channels are sampled at a second frequency. Each sample data item is further divided into a main word of m1 bits on the MSB side and an extra word of m2 bits on the LSB side.

【００９４】そして、第１のチャンネル群の各チャンネ
ルの（４ｎ）番目のサンプルデータのメインワードをま
とめてメインサンプルＳ４n として配置し、この次に第
１のチャンネル群の各チャンネルの（４ｎ＋２）番目の
サンプルデータのメインワードをまとめてメインサンプ
ルＳ４n+２とし配置し、この次に前記第１のチャンネル
群の各チャンネルの（４ｎ）番目のサンプルデータのエ
キストラワードをまとめてエキストラサンプルｅ４n と
して配置し、この次に前記第１のチャンネル群の各チャ
ンネルの（４ｎ＋２）番目のサンプルデータのエキスト
ラワードをまとめてエキストラサンプルｅ４n+２として
配置している。Then, the main words of the (4n)-th sample data of each channel of the first channel group are collected and arranged as main sample S4n; next, the main words of the (4n+2)-th sample data of each channel of the first channel group are collected and arranged as main sample S4n+2; next, the extra words of the (4n)-th sample data of each channel of the first channel group are collected and arranged as extra sample e4n; and next, the extra words of the (4n+2)-th sample data of each channel of the first channel group are collected and arranged as extra sample e4n+2.

【００９５】更にこの次に前記第２のチャンネル群の各
チャンネルの（４ｎ）番目のサンプルデータのメインワ
ードをまとめてメインサンプルＳ４n として配置し、こ
の次に第２のチャンネル群の各チャンネルの（４ｎ＋
２）番目のサンプルデータのメインワードをまとめてメ
インサンプルＳ４n+２とし配置し、この次に前記第２の
チャンネル群の各チャンネルの（４ｎ）番目のサンプル
データのエキストラワードをまとめてエキストラサンプ
ルｅ４n として配置し、この次に前記第２のチャンネル
群の各チャンネルの（４ｎ＋２）番目のサンプルデータ
のエキストラワードをまとめてエキストラサンプルｅ４
n+２として配置している（但し、ｎ＝０，１，２，…
）。Following this, the main words of the (4n)-th sample data of each channel of the second channel group are collected and arranged as main sample S4n; next, the main words of the (4n+2)-th sample data of each channel of the second channel group are collected and arranged as main sample S4n+2; next, the extra words of the (4n)-th sample data of each channel of the second channel group are collected and arranged as extra sample e4n; and next, the extra words of the (4n+2)-th sample data of each channel of the second channel group are collected and arranged as extra sample e4n+2 (where n = 0, 1, 2, ...).

【００９６】図１７のデータ構造は、図１１のケース１
に対応するが、更にこの場合は、量子化ビット数も第１
と第２のチャンネル群では異なる。第１チャンネル群に
おける属性Ａｒｔ１として、サンプリング周波数ｆｓ＝
９６ｋＨｚ、量子化ビット数Ｑｓ＝２０ｂｉｔを採用
し、第２チャンネル群における属性Ａｒｔ２として、ｆ
ｓ＝４８ｋＨｚ、量子化ビット数Ｑｓ＝２４ｂｉｔを採
用した例である。また、このデータ構造は、上記のスケ
ーラブル方式に加え、ＤＶＤビデオのサンプル配列構造
に類似するをデータ構造を構築している。The data structure of FIG. 17 corresponds to case 1 of FIG. 11, but here the quantization bit number also differs between the first and second channel groups. In this example the attribute Atr1 of the first channel group is a sampling frequency fs = 96 kHz with a quantization bit number Qb = 20 bits, and the attribute Atr2 of the second channel group is fs = 48 kHz with Qb = 24 bits. In addition to the scalable method described above, this data structure is built to resemble the sample arrangement structure of DVD video.

【００９７】即ち、４サンプルＳ4n、Ｓ4n+1、Ｓ4n+2、
Ｓ4n+3 が第１属性のメインサンプル、そして２サンプ
ルＳ2n、Ｓ2n+1 が第２属性のメインサンプルである。
この場合は、第１チャンネル群には、エキストラサンプ
ルｅ4n、ｅ4n+1、ｅ4n+2、ｅ4n+3 が存在し，第２チャ
ンネル群には、エキストラサンプルｅ2n、ｅ2n+1が存在
する。この場合も第１のチャンネル群は４対サンプルが
基本となり、これに対応する第２チャンネル群は２対サ
ンプルが基本となり、全体では６対サンプルが基本とな
る。That is, the four samples S4n, S4n+1, S4n+2 and S4n+3 are main samples of the first attribute, and the two samples S2n and S2n+1 are main samples of the second attribute. In this case the first channel group has extra samples e4n, e4n+1, e4n+2 and e4n+3, and the second channel group has extra samples e2n and e2n+1. Here too the first channel group is organized in units of four pairs of samples and the corresponding second channel group in units of two pairs of samples, so the overall unit is six pairs of samples.

【００９８】上記のようなデータ構造とすることによ
り、ＤＶＤビデオのオーディオデータ構造のタイプをで
きるだけ残したまま、所定の伝送レートを満足した、高
音質の音声信号仕様をもつＤＶＤオーディオのデータ構
造を得ることができる。With the data structure described above, a DVD audio data structure with a high-quality audio signal specification that satisfies the prescribed transmission rate can be obtained while retaining as much as possible of the type of audio data structure used in DVD video.

【００９９】この発明はは特徴あるデータ構造を提供す
るものであるが、その中でも特に特徴的なところは、２
つの属性のうち一方のサンプリング周波数ｆｓは、他方
のサンプリング周波数の倍数となることである。２つの
属性のグループのチャンネル数又は量子化ビット数が異
なるだけであるならば、ＤＶＤビデオの規格の考え方を
応用してチャンネル数及び又は量子化ビット数の違うデ
ータ構造に対応することができるからである。例えば、
図４に示したデータ構造において、メインサンプル部及
びエキストラサンプル部に続く次のデータの属性情報に
おけるチャンネル数及び又は量子化ビット数を変更（切
換え）て記録しておけば良いからである。The present invention provides a distinctive data structure, and its most characteristic feature is that the sampling frequency fs of one of the two attributes is a multiple of the sampling frequency of the other. If the two attribute groups differed only in the number of channels or the quantization bit number, data structures with different channel counts and/or quantization bit numbers could be handled simply by applying the concepts of the DVD video standard. For example, in the data structure shown in FIG. 4, it would suffice to change (switch) the number of channels and/or the quantization bit number recorded in the attribute information of the next data following the main sample part and the extra sample part.

【０１００】本発明は上記したデータ構造において、次
のような思想も含むものである。The present invention includes the following concept in the above data structure.

【０１０１】即ち、図１１には属性Ａｒｔ１の第１チャ
ンネル群と属性Ａｒｔ２の第２チャンネル群の各サンプ
ルの同期すべき時刻の対応を示し、４n 、４n+1 ，４n+
2 ，４n+3 ，４n+4 と、２n 、２n+1 と符号を付してい
る。この図からわかるように４サンプルが１まとまりで
ある。従って、４サンプルを１まとまりとして取り扱う
ようにし、図１８に示すように、属性Ａｒｔ１の２サン
プルと、属性Ａｒｔ２の２サンプルを連続して配置し、
この次に属性Ａｒｔ１の２サンプルを配置してもよい。
このデータ構造は、図１４のデータ構造の変形に相当す
る。That is, FIG. 11 shows the correspondence of the times at which the samples of the first channel group (attribute Atr1) and the second channel group (attribute Atr2) should be synchronized, labelled 4n, 4n+1, 4n+2, 4n+3, 4n+4 and 2n, 2n+1. As the figure shows, four samples form one unit. The four samples may therefore be handled as one unit and, as shown in FIG. 18, two samples of attribute Atr1 and two samples of attribute Atr2 may be arranged consecutively, followed by two more samples of attribute Atr1. This data structure is a modification of the data structure of FIG. 14.
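The FIG. 18 variant differs from FIG. 14 only in where the second channel group's samples are placed within each unit. The sketch below shows one reading of that ordering (two samples of Atr1, two of Atr2, then the remaining two of Atr1); the list representation and the function name are assumptions, and each element stands for one sample in the sense of the text.

```python
def interleave_fig18(group1, group2):
    """Arrange each unit as S4n, S4n+1 (Atr1), S2n, S2n+1 (Atr2), S4n+2, S4n+3 (Atr1)."""
    if len(group1) % 4 != 0 or len(group1) != 2 * len(group2):
        raise ValueError("expected 4 group-1 samples for every 2 group-2 samples")
    out = []
    for n in range(len(group1) // 4):
        out.extend(group1[4 * n:4 * n + 2])      # first two Atr1 samples
        out.extend(group2[2 * n:2 * n + 2])      # two Atr2 samples
        out.extend(group1[4 * n + 2:4 * n + 4])  # last two Atr1 samples
    return out


if __name__ == "__main__":
    a = [f"A{i}" for i in range(8)]  # first channel group
    b = [f"B{i}" for i in range(4)]  # second channel group
    print(interleave_fig18(a, b))
```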

【０１０２】図１９さらに他の実施の形態によるデータ
構造であり、このデータ構造は、図１６のデータ構造の
変形に相当する。即ち、４サンプルＳ4n、Ｓ4n+1、Ｓ4n
+2、Ｓ4n+3 が第１属性のメインサンプル、そして２サ
ンプルＳ2n、Ｓ2n+1 が第２属性のメインサンプルであ
る。この場合は、第１チャンネル群には、エキストラサ
ンプルｅ4n、ｅ4n+1、ｅ4n+2、ｅ4n+3 が存在し，第２
チャンネル群には、エキストラサンプルｅ2n、ｅ2n+1
が存在する。この場合も第１のチャンネル群は４対サン
プルが基本となり、これに対応する第２チャンネル群は
２対サンプルが基本となり、全体では６対サンプルが基
本となる。FIG. 19 shows a data structure according to still another embodiment, which corresponds to a modification of the data structure of FIG. 16. That is, the four samples S4n, S4n+1, S4n+2, and S4n+3 are main samples of the first attribute, and the two samples S2n and S2n+1 are main samples of the second attribute. In this case, extra samples e4n, e4n+1, e4n+2, and e4n+3 exist in the first channel group, and extra samples e2n and e2n+1 exist in the second channel group. Here too, the first channel group is based on four sample pairs and the corresponding second channel group on two sample pairs, so the whole is based on six sample pairs.

【０１０３】ここでこのデータ構造は、４対サンプルと
して、第１チャンネル群のＳ4n、Ｓ4n+1、ｅ4n、ｅ4n+
1、第２チャンネル群のＳ2n、Ｓ2n+1 、ｅ2n、ｅ2n+1
をまとめている。そしてこの次に、第１チャンネル群の
２対サンプルＳ4n+2、Ｓ4n+3、ｅ4n+2、ｅ4n+3を配列し
ている。Here, this data structure groups S4n, S4n+1, e4n, and e4n+1 of the first channel group together with S2n, S2n+1, e2n, and e2n+1 of the second channel group as four sample pairs. Following these, the two sample pairs S4n+2, S4n+3, e4n+2, and e4n+3 of the first channel group are arranged.

【０１０４】上記したサンプルの単位を考える場合、次
のように理解することもできる。即ち、属性Ａｔｒ１と
属性Ａｔｒ２におけるサンプリング周波数ｆｓが同じで
あった場合（例えば、図１２や図１３、図１５や図１６
に示したようなケース）、同一時間経過に後におけるサ
ンプルの数は、属性Ａｔｒ１と属性Ａｔｒ２側のチャン
ネル群では同じサンプル数である。このような場合は、
ＤＶＤビデオ規格で取り扱われるのと同様に２サンプル
１単位方式でデータを捕らえるようにしてもよい。The sample unit described above can also be understood as follows. When the sampling frequency fs is the same for attribute Atr1 and attribute Atr2 (for example, the cases shown in FIGS. 12, 13, 15, and 16), the number of samples after the same elapsed time is the same for the channel group of attribute Atr1 and the channel group of attribute Atr2. In such a case, the data may be handled in units of two samples, in the same way as in the DVD video standard.

【０１０５】更に又この発明のデータ構造は、次のよう
に理解することができる。即ち１つのまとまり、つまり
１単位を成すサンプル数は、２，４，６が基本となって
いる。そこで汎用性を持たせるために、２，４，６の最
小公倍数である１２サンプル、あるいは１２対サンプル
を１単位として、データを取り扱うようにしてもよい。Furthermore, the data structure of the present invention can be understood as follows. The number of samples forming one group, that is, one unit, is basically 2, 4, or 6. To provide versatility, data may therefore be handled with 12 samples, or 12 sample pairs, as one unit, 12 being the least common multiple of 2, 4, and 6.
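
The choice of 12 as a common unit can be checked with a short calculation. The following is only an illustrative sketch in Python; the names UNIT_SIZES and COMMON_UNIT are hypothetical and do not appear in the specification.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


# Basic unit sizes named in the text (samples or sample pairs per unit).
UNIT_SIZES = (2, 4, 6)

# A unit size that every basic unit divides evenly (hypothetical name).
COMMON_UNIT = reduce(lcm, UNIT_SIZES)
```

Because every basic unit size divides COMMON_UNIT, a player that always handles data in blocks of this size never has to split a 2-, 4-, or 6-sample unit.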

【０１０６】上記したように、１単位のサンプル数は種
々のケースが可能であるが、いずれのケースにおいて
も、オーディオパックのデータエリアに対しては、この
１単位毎に埋めていき、オーディオパックの残余の部分
が１単位に満たない場合には、ビデオ規格の場合と同様
にスタッフィングバイトやパディングパケットを充填す
るようにしている。As described above, the number of samples per unit can take various values. In every case, the data area of the audio pack is filled one unit at a time, and when the remaining part of the audio pack is smaller than one unit, it is filled with stuffing bytes or a padding packet, as in the video standard.

【０１０７】図２０には、１単位に満たないエリア（斜
線部）が生じために、パディングパケットを挿入した例
を示している。１単位に満たないエリアとは、所定サン
プル数以下あるいは所定対サンプル数以下のデータ量の
エリアを言う。所定サンプル数あるいは所定対サンプル
数とは２、４、６、１２などである。このオーディオパ
ックは２０４８バイトであり、必ずプレゼンテーション
タイムスタンプ（ＰＴＳ）を持つように構成される。FIG. 20 shows an example in which a padding packet is inserted because an area of less than one unit (the shaded area) has arisen. An area of less than one unit is an area whose data amount is equal to or less than the predetermined number of samples, or equal to or less than the predetermined number of sample pairs. The predetermined number of samples or sample pairs is 2, 4, 6, 12, or the like. This audio pack is 2048 bytes and is configured to always carry a presentation time stamp (PTS).
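
The unit-by-unit filling rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name fill_pack, the 2000-byte data area, and the 72-byte unit (12 samples x 2 channels x 3 bytes) are hypothetical example values, not figures taken from the specification.

```python
from typing import Tuple


def fill_pack(data_area_bytes: int, unit_bytes: int) -> Tuple[int, int]:
    """Return (whole_units, leftover_bytes) for one audio pack.

    Only whole units are written into the data area; the leftover bytes
    are the part that would be covered by stuffing bytes or a padding packet.
    """
    if data_area_bytes < 0 or unit_bytes <= 0:
        raise ValueError("sizes must be positive")
    whole_units, leftover = divmod(data_area_bytes, unit_bytes)
    return whole_units, leftover


# Hypothetical example: a 2000-byte data area and a 72-byte unit.
units, padding = fill_pack(2000, 72)
```

With these example values, 27 whole units fit and the remaining 56 bytes are left for stuffing or padding.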

【０１０８】また、上記した各図において属性Ａｒｔ
１，属性Ａｒｔ２のデータの配列において、必ずしもこ
の配列に限定されるものではなく、逆の配列であっても
良いことは勿論である。この配列は、取り決めにより各
種変更してもよい。In each of the drawings described above, the arrangement of the data of attribute Art1 and attribute Art2 is not necessarily limited to the one shown, and the order may of course be reversed. This arrangement may be changed in various ways by agreement.

【０１０９】又、上記の説明では、サンプリング周波数
として９６ｋＨｚと４８ｋＨｚを示したが、これに限ら
ず８８．２ｋＨｚと４４．１ｋＨｚのサンプリング周波
数でも良く、２つのサンプリング周波数の関係が一方が
他方の２倍の関係であるならば常に本発明は適用が可能
である。更に汎用性を持たせて、２つの周波数の関係が
一方が他方の整数倍の関係にあれば、容易に本発明を応
用することができるものである。In the above description, sampling frequencies of 96 kHz and 48 kHz were used, but the invention is not limited to these; sampling frequencies of 88.2 kHz and 44.1 kHz may also be used, and the present invention is always applicable as long as one of the two sampling frequencies is twice the other. More generally, the present invention can easily be applied whenever one of the two frequencies is an integer multiple of the other.

【０１１０】さらにまた、上記の説明では、１ストリー
ム内でチャンネルの属性を２種類としたが、３種類以上
でも本発明の適用範囲である。Furthermore, the above description uses two types of channel attributes within one stream, but three or more types are also within the scope of the present invention.

【０１１１】上記の説明は、データ構造について説明し
たが本発明は、更に上記データ構造を有する記録媒体及
び記録媒体に対する記録方法及び装置、さらには記録媒
体からのデータ再生方法及び装置、データの伝送方式に
も適用できるものである。The above description has dealt with the data structure, but the present invention is also applicable to a recording medium having this data structure, to a method and apparatus for recording on the recording medium, to a method and apparatus for reproducing data from the recording medium, and to a data transmission system.

【０１１２】次に、ＤＶＤオーディオ情報が記録される
光学式ディスクの全体的なデータ構造と、上述したオー
ディオパックとの関係を簡単に説明する。Next, the relationship between the overall data structure of an optical disc on which DVD audio information is recorded and the above-described audio pack will be briefly described.

【０１１３】図２１は、ＤＶＤオーディオゾーンの記録
内容（オーディオ・オンリータイトル・オーディオ・オ
ブジェクトセット；ＡＯＴＴ＿ＡＯＢＳ）のデータ構造
の一例を示す。FIG. 21 shows an example of the data structure of the recorded contents of the DVD audio zone (audio only title audio object set; AOTT_AOBS).

【０１１４】ＡＯＴＴ＿ＡＯＢＳは、１以上のオーディ
オオブジェクトＡＯＴＴ＿ＡＯＢ＃ｎの集まりを定義し
ている。各ＡＯＴＴ＿ＡＯＢは１以上のオーディオセル
ＡＴＳ＿Ｃ＃ｎの集まりを定義している。そして、１以
上のセルＡＴＳ＿Ｃ＃ｎの集まりによってプログラムが
構成され、１以上のプログラムの集まりによってプログ
ラムチェーン（ＰＧＣ）が構成される。このＰＧＣは、
オーディオタイトルの全体あるいは一部を差し示すため
の論理的なユニットを構成する。AOTT_AOBS defines a set of one or more audio objects AOTT_AOB#n. Each AOTT_AOB defines a set of one or more audio cells ATS_C#n. A program is formed by a set of one or more cells ATS_C#n, and a program chain (PGC) is formed by a set of one or more programs. This PGC constitutes a logical unit for indicating the whole or a part of an audio title.

【０１１５】この例では、各オーディオセルＡＴＳ＿Ｃ
＃が２０４８バイトサイズのオーディオパックＡ＿ＰＣ
Ｋの集合で構成されている。これらのパックは、データ
転送処理を行う際の最小単位となる。また、論理上の処
理を行う最小単位はセル単位であり、論理上の処理はこ
のセル単位で行なわれる。In this example, each audio cell ATS_C# is composed of a set of 2048-byte audio packs A_PCK. These packs are the minimum unit for data transfer processing. The minimum unit for logical processing is the cell, and logical processing is performed cell by cell.
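
The containment relationships described above (object set, objects, cells, packs on the data side; program chain, programs, cells on the logical side) can be modeled roughly as below. This is a hedged Python sketch: the class and method names are hypothetical, and only the 2048-byte pack size and the nesting itself come from the text.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

PACK_SIZE = 2048  # bytes per audio pack, as stated in the text


@dataclass
class AudioPack:  # A_PCK: minimum unit of data transfer
    data: bytes = bytes(PACK_SIZE)

    def __post_init__(self) -> None:
        if len(self.data) != PACK_SIZE:
            raise ValueError("an audio pack must be exactly 2048 bytes")


@dataclass
class AudioCell:  # ATS_C#n: minimum unit of logical processing
    number: int
    packs: List[AudioPack] = field(default_factory=list)


@dataclass
class AudioObject:  # AOTT_AOB#n: one or more cells
    cells: List[AudioCell] = field(default_factory=list)


@dataclass
class Program:  # one or more cells
    cells: List[AudioCell] = field(default_factory=list)


@dataclass
class ProgramChain:  # PGC: one or more programs
    programs: List[Program] = field(default_factory=list)

    def packs_in_play_order(self) -> Iterator[AudioPack]:
        """Yield packs cell by cell, in program order."""
        for program in self.programs:
            for cell in program.cells:
                yield from cell.packs
```

The same AudioCell instances can be referenced both from an AudioObject (where the data is stored) and from a Program (how it is played), which mirrors the separation between recorded objects and the program chain.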

【０１１６】図２２は、ＤＶＤオーディオゾーンのプロ
グラムチェーン情報ＡＴＳ＿ＰＧＣＩにより、セルがア
クセスされる場合を説明する図である。FIG. 22 is a view for explaining a case where a cell is accessed by the program chain information ATS_PGCI of the DVD audio zone.

【０１１７】ＡＴＳ＿ＰＧＣＩ内のプログラム＃１に関
するセル再生情報により、ＡＯＢのセルＡＴＳ＿Ｃ＃
１、ＡＴＳ＿Ｃ＃２が再生される。According to the cell playback information for program #1 in ATS_PGCI, cells ATS_C#1 and ATS_C#2 of the AOB are reproduced.

【０１１８】１つのＰＧＣを１本のオペラに例えれば、
このＰＧＣを構成する複数のセルはそのオペラ中の種々
なシーンの音楽あるいは歌唱部分に対応すると解釈可能
である。このＰＧＣの中身（あるいはセルの中身）は、
ディスクに記録される内容を制作するソフトウエアプロ
バイダにより決定される。すなわち、プロバイダは、Ａ
ＴＳ内のプログラムチェーン情報ＡＴＳ＿ＰＧＣＩに書
き込まれたセル再生情報ＡＴＳ＿Ｃ＿ＰＢＩを用いて、
ＡＯＴＴ＿ＡＯＢＳを構成するセルを意図通りに再生さ
せることができる。If one PGC is likened to an opera, the cells that make up the PGC can be interpreted as corresponding to the music or singing portions of the various scenes in that opera. The contents of the PGC (or the contents of the cells) are determined by the software provider that produces the content recorded on the disc. That is, the provider can have the cells constituting AOTT_AOBS reproduced as intended by using the cell playback information ATS_C_PBI written in the program chain information ATS_PGCI in the ATS.

【０１１９】次に、上記した第１チャンネル群、第２チ
ャンネル群の各種の取り決めが、管理データ上で具体的
にどのように行われているかを説明することにする。Next, it will be described how the various rules of the first channel group and the second channel group are concretely performed on the management data.

【０１２０】図２３は、ＤＶＤオーディオゾーン内のオ
ーディオタイトルセット（ＡＴＳ）の記録内容を説明す
る図である。FIG. 23 is a view for explaining the recorded contents of an audio title set (ATS) in the DVD audio zone.

【０１２１】オーディオタイトルセットＡＴＳは、オー
ディオタイトルセット情報ＡＴＳＩと、オーディオ・オ
ンリータイトル用オーディオオブジェクトセットＡＯＴ
Ｔ＿ＡＯＢＳと、オーディオタイトルセット情報のバッ
クアップＡＴＳＩ＿ＢＵＰとで構成されている。The audio title set ATS consists of the audio title set information ATSI, the audio object set for audio-only titles AOTT_AOBS, and a backup of the audio title set information, ATSI_BUP.

【０１２２】オーディオタイトルセット情報ＡＴＳＩ
は、オーディオタイトルセット管理テーブルＡＴＳＩ＿
ＭＡＴおよびオーディオタイトルセットプログラムチェ
ーン情報テーブルＡＴＳ＿ＰＧＣＩＴを含んでいる。The audio title set information ATSI includes the audio title set management table ATSI_MAT and the audio title set program chain information table ATS_PGCIT.

【０１２３】そして、オーディオタイトルセットプログ
ラムチェーン情報テーブルＡＴＳ＿ＰＧＣＩＴは、オー
ディオタイトルセットプログラムチェーン情報テーブル
情報ＡＴＳ＿ＰＧＣＩＴＩと、オーディオタイトルセッ
トプログラムチェーン情報サーチポインタＡＴＳ＿ＰＧ
ＣＩ＿ＳＲＰと、１以上のオーディオタイトルセットプ
ログラムチェーン情報ＡＴＳ＿ＰＧＣＩとを含んでい
る。The audio title set program chain information table ATS_PGCIT includes the audio title set program chain information table information ATS_PGCITI, the audio title set program chain information search pointers ATS_PGCI_SRP, and one or more pieces of audio title set program chain information ATS_PGCI.

【０１２４】図２４は、図２５のオーディオタイトルセ
ット情報管理テーブルＡＴＳＩ＿ＭＡＴの記録内容を示
す。FIG. 24 shows the recorded contents of the audio title set information management table ATSI_MAT of FIG. 25.

【０１２５】すなわち、このオーディオタイトルセット
情報管理テーブルＡＴＳＩ＿ＭＡＴには、オーディオタ
イトルセット識別子（ＡＴＳＩ＿ＩＤ）；オーディオタ
イトルセットのエンドアドレス（ＡＴＳ＿ＥＡ）；オー
ディオタイトルセット情報のエンドアドレス（ＡＴＳＩ
＿ＥＡ）；採用されたオーディオ規格のバージョン番号
（ＶＥＲＮ）；オーディオタイトルセット情報管理テー
ブルのエンドアドレス（ＡＴＳＩ＿ＭＡＴ＿ＥＡ）；オ
ーディオ・オンリータイトルＡＯＴＴ用ビデオタイトル
セットＶＴＳのスタートアドレス（ＶＴＳ＿ＳＡ）；オ
ーディオ・オンリータイトル用オーディオオブジェクト
セットのスタートアドレス（ＡＯＴＴ＿ＡＯＢＳ＿Ｓ
Ａ）またはオーディオ・オンリータイトル用ビデオオブ
ジェクトセットのスタートアドレス（ＡＯＴＴ＿ＶＯＢ
Ｓ＿ＳＡ）；オーディオタイトルセット用プログラムチ
ェーン情報テーブルのスタートアドレス（ＡＴＳ＿ＰＧ
ＣＩＴ＿ＳＡ）；オーディオ・オンリータイトル用オー
ディオオブジェクトセットの属性（ＡＯＴＴ＿ＡＯＢＳ
＿ＡＴＲ）またはオーディオ・オンリータイトル用ビデ
オオブジェクトセットの属性（ＡＯＴＴ＿ＶＯＢＳ＿Ａ
ＴＲ）＃０〜＃７；オーディオタイトルセットデータミ
ックス係数（ＡＴＳ＿ＤＭ＿ＣＯＥＦＴ）＃０〜＃１
５；オーディオタイトルセットのスチル画属性（ＡＴＳ
＿ＳＰＣＴ＿ＡＴＲ）；その他の予約エリアが設けられ
ている。That is, the audio title set information management table ATSI_MAT contains: the audio title set identifier (ATSI_ID); the end address of the audio title set (ATS_EA); the end address of the audio title set information (ATSI_EA); the version number of the adopted audio standard (VERN); the end address of the audio title set information management table (ATSI_MAT_EA); the start address of the video title set VTS for the audio-only title AOTT (VTS_SA); the start address of the audio object set for audio-only titles (AOTT_AOBS_SA) or the start address of the video object set for audio-only titles (AOTT_VOBS_SA); the start address of the program chain information table for the audio title set (ATS_PGCIT_SA); the attributes of the audio object set for audio-only titles (AOTT_AOBS_ATR) or the attributes of the video object set for audio-only titles (AOTT_VOBS_ATR) #0 to #7; the audio title set data mix coefficients (ATS_DM_COEFT) #0 to #15; the still picture attribute of the audio title set (ATS_SPCT_ATR); and other reserved areas.

【０１２６】上記ＡＯＴＴ用ＶＴＳのスタートアドレス
ＶＴＳ＿ＳＡには、ＡＴＳがＡＯＴＴ＿ＡＯＢＳを持た
ないときは、ＡＯＴＴのために用いられるＶＴＳＴＴ＿
ＶＯＢＳを含むビデオタイトルセットＶＴＳのスタート
アドレスが書き込まれる。ＡＴＳがＡＯＴＴ＿ＡＯＢＳ
を持つときは「００００００００ｈ」がこのＶＴＳ＿Ｓ
Ａに書き込まれる。ビデオ情報も記録されることがある
からである。When the ATS has no AOTT_AOBS, the start address of the video title set VTS that includes the VTSTT_VOBS used for the AOTT is written in VTS_SA, the start address of the VTS for AOTT. When the ATS has an AOTT_AOBS, "00000000h" is written in VTS_SA. This is because video information may also be recorded.

【０１２７】上記ＡＯＴＴ＿ＡＯＢＳ＿ＳＡには、ＡＴ
ＳがＡＯＴＴ＿ＡＯＢＳを持つときは、ＡＴＳの最初の
論理ブロックからの相対論理ブロック数でもって、ＡＯ
ＴＴ＿ＡＯＢＳのスタートアドレスが書き込まれる。一
方、ＡＴＳがＡＯＴＴ＿ＡＯＢＳを持たないときは、Ａ
ＯＴＴ＿ＶＯＢＳ＿ＳＡには、ビデオタイトルセットの
ためのビデオオブジェクト（ＶＴＳＴＴ＿ＶＯＢＳ）の
スタートアドレスが、ＡＴＳのために用いられるＶＴＳ
ＴＴ＿ＶＯＢＳを含むＶＴＳの最初の論理ブロックから
の相対論理ブロック数でもって、書き込まれる。When the ATS has an AOTT_AOBS, the start address of the AOTT_AOBS is written in AOTT_AOBS_SA as the relative logical block number from the first logical block of the ATS. On the other hand, when the ATS has no AOTT_AOBS, the start address of the video object set for the video title set (VTSTT_VOBS) is written in AOTT_VOBS_SA as the relative logical block number from the first logical block of the VTS that includes the VTSTT_VOBS used for the ATS.

【０１２８】上記ＡＴＳ＿ＰＧＣＩＴ＿ＳＡには、ＡＴ
ＳＩの最初の論理ブロックからの相対論理ブロック数で
もって、ＡＴＳ＿ＰＧＣＩＴのスタートアドレスが書き
込まれる。The start address of ATS_PGCIT is written in ATS_PGCIT_SA as the relative logical block number from the first logical block of the ATSI.

【０１２９】上記オーディオタイトルセットのための属
性情報であるＡＯＴＴ＿ＡＯＢＳ＿ＡＴＲまたは、ビデ
オタイトルセットの属性情報であるＡＯＴＴ＿ＶＯＢ＿
ＡＲＴは、＃０から＃７まで８つ用意されている。ＡＴ
ＳがＡＯＴＴ＿ＡＯＢＳを持つときは、ＡＴＳに記録さ
れたＡＯＴＴ＿ＡＯＢの属性がＡＯＴＴ＿ＡＯＢＳ＿Ａ
ＴＲに書き込まれる。一方、ＡＴＳがＡＯＴＴ＿ＡＯＢ
Ｓを持たないときは、ＡＯＴＴ＿ＶＯＢ＿ＡＲＴには、
ＡＴＳ内のＡＯＴＴ＿ＶＯＢのために用いられるＶＯＢ
内のオーディオストリームの属性が書き込まれる。この
ＡＯＴＴ＿ＡＯＢＳ＿ＡＴＲまたはＡＯＴＴ＿ＶＯＢ＿
ＡＲＴには、採用されたサンプリング周波数（４４〜１
９２ｋＨｚ）および量子化ビット数（１６〜２４ビッ
ト）が書き込まれている。AOTT_AOBS_ATR, the attribute information for the audio title set, or AOTT_VOB_ART, the attribute information for the video title set, is provided in eight instances, #0 to #7. When the ATS has an AOTT_AOBS, the attributes of the AOTT_AOB recorded in the ATS are written in AOTT_AOBS_ATR. On the other hand, when the ATS has no AOTT_AOBS, the attributes of the audio stream in the VOB used for the AOTT_VOB in the ATS are written in AOTT_VOB_ART. The adopted sampling frequency (44 to 192 kHz) and number of quantization bits (16 to 24 bits) are written in this AOTT_AOBS_ATR or AOTT_VOB_ART.

【０１３０】更にこの部分には、チャンネルアサインメ
ントが記述されている。Further, in this part, a channel assignment is described.

【０１３１】チャンネルアサインメントは、この属性に
より特定されたビデオオブジェクトに含まれるオーディ
オストリームの各チャンネルの割り当て情報が記述され
ている。この割り当て情報の内容は、マルチチャンネル
の構成に応じている。このチャンネル割り当て情報は、
後述する図２６のようになっている。この割り当て情報
は、後述するオーディオパケットヘッダにも記述されて
いる。The channel assignment describes the assignment information of each channel of the audio stream included in the video object specified by this attribute. The content of this assignment information depends on the multichannel configuration, and is as shown in FIG. 26, described later. The assignment information is also described in the audio packet header, described later.

【０１３２】上記ＡＴＳ＿ＤＭ＿ＣＯＥＦＴは、ＡＣ−
３やＤＴＳ等のようなマルチチャネル出力（５．１チャ
ネル出力）を持つオーディオデータを２チャネル出力に
ミックスダウンする際の係数を示すもので、ＡＴＳ内に
記録された１以上のＡＯＴＴ＿ＡＯＢでのみ使用され
る。ＡＴＳがＡＯＴＴ＿ＡＯＢＳを持たないときは、１
６個（＃０〜＃１５）あるＡＴＳ＿ＤＭ＿ＣＯＥＦＴそ
れぞれの全ビットに、「０ｈ」が書き込まれる。この１
６個（＃０〜＃１５）のＡＴＳ＿ＤＭ＿ＣＯＥＦＴのた
めのエリアは定常的に設けられている。The ATS_DM_COEFT indicates the coefficients used when audio data having multichannel output (5.1-channel output), such as AC-3 or DTS, is mixed down to 2-channel output, and is used only for the one or more AOTT_AOBs recorded in the ATS. When the ATS has no AOTT_AOBS, "0h" is written in all bits of each of the 16 ATS_DM_COEFTs (#0 to #15). The area for these 16 ATS_DM_COEFTs (#0 to #15) is always provided.

【０１３３】上記ＡＴＳ＿ＳＰＣＴ＿ＡＴＲは、ＡＯＴ
Ｔ＿ＡＯＢＳ内の各スチル画のためのスチル画ストリー
ムの属性を示す。ＡＯＴＴ＿ＡＯＢＳにスチル画がない
ときは、ＡＴＳ＿ＳＰＣＴ＿ＡＴＲには「００００ｈ」
が書き込まれる。このスチル画の各フィールドは、ＡＯ
ＴＴ＿ＡＯＢＳ内の各スチル画のビデオストリームに記
録された情報に合わせてある。The ATS_SPCT_ATR indicates the attributes of the still picture stream for each still picture in AOTT_AOBS. When AOTT_AOBS contains no still picture, "0000h" is written in ATS_SPCT_ATR. Each field of this still picture attribute matches the information recorded in the video stream of each still picture in AOTT_AOBS.

【０１３４】各ＡＴＳ＿ＳＰＣＴ＿ＡＴＲは１６ビット
で構成され、ＭＳＢ側の２ビット（ビットｂ１５〜ｂ１
４）はビデオ圧縮モード（ＭＰＥＧ２等）を表し、次の
２ビット（ビットｂ１３〜ｂ１２）はＴＶシステム（Ｎ
ＴＳＣ、ＰＡＬ、ＳＥＣＡＭ等）を表し、次の２ビット
（ビットｂ１１〜ｂ１０）は画像のアスペクト比（４：
３、１６：９等）を表し、次の２ビット（ビットｂ９〜
ｂ８）は表示モード（４：３サイズのＴＶモニタにおけ
る４：３表示、１６：９表示、レターボックス表示等）
を表している。次の２ビット（ビットｂ７〜ｂ６）は将
来に備えての予約ビットである。次の３ビット（ビット
ｂ５〜ｂ３）は、スチル画の解像度（ＮＴＳＣシステム
における水平７２０本ｘ垂直４８０本、ＰＡＬシステム
における水平７２０本ｘ垂直５７６本等）を表してい
る。ＬＳＢ側の最後の３ビット（ビットｂ２〜ｂ０）
も、将来に備えての予約ビットである。Each ATS_SPCT_ATR consists of 16 bits. The two bits on the MSB side (bits b15 to b14) indicate the video compression mode (MPEG2, etc.), the next two bits (b13 to b12) indicate the TV system (NTSC, PAL, SECAM, etc.), the next two bits (b11 to b10) indicate the picture aspect ratio (4:3, 16:9, etc.), and the next two bits (b9 to b8) indicate the display mode (4:3 display, 16:9 display, letterbox display, etc. on a 4:3 TV monitor). The next two bits (b7 to b6) are reserved for future use. The next three bits (b5 to b3) indicate the still picture resolution (720 horizontal x 480 vertical in the NTSC system, 720 horizontal x 576 vertical in the PAL system, etc.). The last three bits on the LSB side (b2 to b0) are also reserved for future use.
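
The 16-bit layout described above can be unpacked with shifts and masks. The following Python sketch is illustrative only: the function name and dictionary keys are hypothetical, and it returns the raw field codes because the text does not give the code-to-value tables (for example, which code means NTSC).

```python
from typing import Dict


def decode_spct_atr(value: int) -> Dict[str, int]:
    """Split a 16-bit ATS_SPCT_ATR word into its raw bit fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("ATS_SPCT_ATR is a 16-bit value")
    return {
        "video_compression_mode": (value >> 14) & 0b11,  # b15-b14
        "tv_system": (value >> 12) & 0b11,               # b13-b12
        "aspect_ratio": (value >> 10) & 0b11,            # b11-b10
        "display_mode": (value >> 8) & 0b11,             # b9-b8
        "reserved_b7_b6": (value >> 6) & 0b11,           # b7-b6
        "resolution": (value >> 3) & 0b111,              # b5-b3
        "reserved_b2_b0": value & 0b111,                 # b2-b0
    }
```

A value of 0000h, which the text says is written when there is no still picture, decodes to all-zero fields.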

【０１３５】図２５は、オーディオタイトルセット情報
ＡＴＳＩに含まれるオーディオタイトルセットプログラ
ムチェーン情報テーブルＡＴＳ＿ＰＧＣＩＴの内容を説
明する図である（このＡＴＳ＿ＰＧＣＩＴの記録位置は
ＡＴＳＩ＿ＭＡＴのＡＴＳ＿ＰＧＣＩＴ＿ＳＡに書き込
まれている）。FIG. 25 is a diagram for explaining the contents of the audio title set program chain information table ATS_PGCIT included in the audio title set information ATSI (the recording position of this ATS_PGCIT is written in ATS_PGCIT_SA of ATSI_MAT).

【０１３６】このＡＴＳ＿ＰＧＣＩＴは、前述したよう
に、オーディオタイトルセットプログラムチェーン情報
テーブル情報ＡＴＳ＿ＰＧＣＩＴＩと、オーディオタイ
トルセットプログラムチェーン情報サーチポインタＡＴ
Ｓ＿ＰＧＣＩ＿ＳＲＰと、オーディオタイトルセットプ
ログラムチェーン情報ＡＴＳ＿ＰＧＣＩとを含んでい
る。As described above, this ATS_PGCIT includes the audio title set program chain information table information ATS_PGCITI, the audio title set program chain information search pointers ATS_PGCI_SRP, and the audio title set program chain information ATS_PGCI.

【０１３７】上記ＡＴＳ＿ＰＧＣＩ＿ＳＲＰは１以上の
オーディオタイトルセット用プログラムチェーン情報サ
ーチポインタ（ＡＴＳ＿ＰＧＣＩ＿ＳＲＰ＃１〜ＡＴＳ
＿ＰＧＣＩ＿ＳＲＰ＃ｊ）を含み、上記ＡＴＳ＿ＰＧＣ
ＩはＡＴＳ＿ＰＧＣＩ＿ＳＲＰと同数のオーディオタイ
トルセット用プログラムチェーン情報（ＡＴＳ＿ＰＧＣ
Ｉ＃１〜ＡＴＳ＿ＰＧＣＩ＃ｊ）を含んでいる。The ATS_PGCI_SRP includes one or more program chain information search pointers for the audio title set (ATS_PGCI_SRP#1 to ATS_PGCI_SRP#j), and the ATS_PGCI includes the same number of pieces of program chain information for the audio title set (ATS_PGCI#1 to ATS_PGCI#j) as there are ATS_PGCI_SRPs.

【０１３８】各ＡＴＳ＿ＰＧＣＩは、オーディオタイト
ルセット用プログラムチェーンＡＴＳ＿ＰＧＣの再生を
制御するナビゲーションデータとして機能する。Each ATS_PGCI functions as navigation data for controlling the reproduction of the audio title set program chain ATS_PGC.

【０１３９】ここで、ＡＴＳ＿ＰＧＣは、オーディオ・
オンリータイトルＡＯＴＴを定義する単位であり、ＡＴ
Ｓ＿ＰＧＣＩと１以上のセル（ＡＯＴＴ＿ＡＯＢＳ内の
セルまたはＡＯＴＴのオブジェクトとして用いられるＡ
ＯＴＴ＿ＶＯＢＳ内のセル）とから構成される。Here, ATS_PGC is the unit that defines an audio-only title AOTT, and is composed of an ATS_PGCI and one or more cells (cells in AOTT_AOBS, or cells in the AOTT_VOBS used as objects of the AOTT).

【０１４０】各ＡＴＳ＿ＰＧＣＩは、オーディオタイト
ルセット用プログラムチェーンの一般情報（ＡＴＳ＿Ｐ
ＧＣ＿ＧＩ）と、オーディオタイトルセット用プログラ
ム情報テーブル（ＡＴＳ＿ＰＧＣＩＴ）と、オーディオ
タイトルセット用セル再生情報テーブル（ＡＴＳ＿Ｃ＿
ＰＢＩＴ）を含んでいる。Each ATS_PGCI includes the general information of the program chain for the audio title set (ATS_PGC_GI), the program information table for the audio title set (ATS_PGCIT), and the cell playback information table for the audio title set (ATS_C_PBIT).

【０１４１】上記ＡＴＳ＿ＰＧＣＩＴは１以上のオーデ
ィオタイトルセット用プログラム情報（ＡＴＳ＿ＰＧＩ
＃１〜ＡＴＳ＿ＰＧＩ＃ｋ）を含み、上記ＡＴＳ＿Ｃ＿
ＰＢＩＴはＡＴＳ＿ＰＧＩと同数のオーディオタイトル
セット用セル再生情報（ＡＴＳ＿Ｃ＿ＰＢＩ＃１〜ＡＴ
Ｓ＿Ｃ＿ＰＢＩ＃ｋ）を含んでいる。The ATS_PGCIT includes one or more pieces of program information for the audio title set (ATS_PGI#1 to ATS_PGI#k), and the ATS_C_PBIT includes the same number of pieces of cell playback information for the audio title set (ATS_C_PBI#1 to ATS_C_PBI#k) as there are ATS_PGIs.

【０１４２】図２６には、チャンネルの割り当て情報
と、この情報により分類された第１チャンネル群と第２
チャンネル群の分類を示している。図２４のＡＴＳＩ＿
ＭＡＴには、オーディオオブジェクトの属性情報が記述
され、その中にチャンネルアサインメントが存在すると
説明したが、そのチャンネルアサインメントが図２６に
示すデータである。FIG. 26 shows the channel assignment information and the classification into the first channel group and the second channel group that follows from it. As explained above, the ATSI_MAT of FIG. 24 describes the attribute information of the audio object, and a channel assignment is included in that attribute information; that channel assignment is the data shown in FIG. 26.

【０１４３】０００００ｂの場合は、モノラルを意味
し、００００１ｂの場合は、第１チャンネル群にＬ，Ｒ
（ステレオ）チャンネルが存在することを意味し、００
０１０ｂの場合は、第１チャンネル群にＬｆ，Ｒｆ（レ
フトフロント、ライトフロント）チャンネル、第２チャ
ンネル群にＳ（サラウンド）が存在することを意味す
る。０００１１ｂの場合は、第１チャンネル群にＬｆ，
Ｒｆ（レフトフロント、ライトフロント）、第２チャン
ネル群にＬｓ，Ｒｓ（レフトサラウンド、ライトサラウ
ンド）が存在することを意味する。００１００ｂの場合
は、第１チャンネル群にＬｆ，Ｒｆ（レフトフロント、
ライトフロント）、第２チャンネル群にＬＦＥ（低域周
波数効果）が存在することを意味する。００１０１ｂの
場合は、第１チャンネル群にＬｆ，Ｒｆ（レフトフロン
ト、ライトフロント）、第２チャンネル群にＬＦＥ（低
域周波数効果）、Ｓ（サラウンド）が存在することを意
味する。００１１０ｂの場合は、第１チャンネル群にＬ
ｆ，Ｒｆ（レフトフロント、ライトフロント）、第２チ
ャンネル群にＬＦＥ（低域周波数効果）、Ｌｓ，Ｒｓ
（レフトサラウンド、ライトサラウンド）が存在するこ
とを意味する。００１１１ｂの場合は、第１チャンネル
群にＬｆ，Ｒｆ（レフトフロント、ライトフロント）、
第２チャンネル群にＣ（センター）が存在することを意
味する。０１０００ｂの場合は、第１チャンネル群にＬ
ｆ，Ｒｆ（レフトフロント、ライトフロント）、第２チ
ャンネル群にＣ（センター）、Ｓ（サラウンド）が存在
することを意味する。０１００１ｂの場合は、第１チャ
ンネル群にＬｆ，Ｒｆ（レフトフロント、ライトフロン
ト）、第２チャンネル群にＣ（センター）、Ｌｓ，Ｒｓ
（レフトサラウンド、ライトサラウンド）が存在するこ
とを意味する。０１０１０ｂの場合は、第１チャンネル
群にＬｆ，Ｒｆ（レフトフロント、ライトフロント）、
第２チャンネル群にＣ（センター）、ＬＦＥ（低域周波
数効果）が存在することを意味する。０１０１１ｂの場
合は、第１チャンネル群にＬｆ，Ｒｆ（レフトフロン
ト、ライトフロント）、第２チャンネル群にＣ（センタ
ー）、ＬＦＥ（低域周波数効果）、Ｓ（サラウンド）が
存在することを意味する。０１１００ｂの場合は、第１
チャンネル群にＬｆ，Ｒｆ（レフトフロント、ライトフ
ロント）、第２チャンネル群にＣ（センター）、ＬＦＥ
（低域周波数効果）、Ｌｓ，Ｒｓ（レフトサラウンド、
ライトサラウンド）が存在することを意味する。０１１
０１ｂの場合は、第１チャンネル群にＬｆ，Ｒｆ（レフ
トフロント、ライトフロント）、Ｃ（センター）、第２
チャンネル群にＳ（サラウンド）が存在することを意味
する。０１１１０ｂの場合は、第１チャンネル群にＬ
ｆ，Ｒｆ（レフトフロント、ライトフロント）、Ｃ（セ
ンター）、第２チャンネル群にＬｓ，Ｒｓ（レフトサラ
ウンド、ライトサラウンド）が存在することを意味す
る。０１１１１ｂの場合は、第１チャンネル群にＬｆ，
Ｒｆ（レフトフロント、ライトフロント）、Ｃ（センタ
ー）、第２チャンネル群にＬＦＥ（低域周波数効果）が
存在することを意味する。１００００ｂの場合は、第１
チャンネル群にＬｆ，Ｒｆ（レフトフロント、ライトフ
ロント）、Ｃ（センター）、第２チャンネル群にＬＦＥ
（低域周波数効果）、Ｓ（サラウンド）が存在すること
を意味する。１０００１ｂの場合は、第１チャンネル群
にＬｆ，Ｒｆ（レフトフロント、ライトフロント）、Ｃ
（センター）、第２チャンネル群にＬＦＥ（低域周波数
効果）、Ｌｓ，Ｒｓ（レフトサラウンド、ライトサラウ
ンド）が存在することを意味する。１００１０ｂの場合
は、第１チャンネル群にＬｆ，Ｒｆ（レフトフロント、
ライトフロント）、Ｌｓ，Ｒｓ（レフトサラウンド、ラ
イトサラウンド）、第２チャンネル群にＬＦＥ（低域周
波数効果）が存在することを意味する。１００１１ｂの
場合は、第１チャンネル群にＬｆ，Ｒｆ（レフトフロン
ト、ライトフロント）、Ｌｓ，Ｒｓ（レフトサラウン
ド、ライトサラウンド）、第２チャンネル群にＣ（セン
ター）が存在することを意味する。00000b means monaural, and 00001b means that the L and R (stereo) channels exist in the first channel group. In the remaining codes, Lf and Rf denote left front and right front, C center, S surround, Ls and Rs left surround and right surround, and LFE low frequency effect. 00010b: first channel group Lf, Rf; second channel group S. 00011b: first channel group Lf, Rf; second channel group Ls, Rs. 00100b: first channel group Lf, Rf; second channel group LFE. 00101b: first channel group Lf, Rf; second channel group LFE, S. 00110b: first channel group Lf, Rf; second channel group LFE, Ls, Rs. 00111b: first channel group Lf, Rf; second channel group C. 01000b: first channel group Lf, Rf; second channel group C, S. 01001b: first channel group Lf, Rf; second channel group C, Ls, Rs. 01010b: first channel group Lf, Rf; second channel group C, LFE. 01011b: first channel group Lf, Rf; second channel group C, LFE, S. 01100b: first channel group Lf, Rf; second channel group C, LFE, Ls, Rs. 01101b: first channel group Lf, Rf, C; second channel group S. 01110b: first channel group Lf, Rf, C; second channel group Ls, Rs. 01111b: first channel group Lf, Rf, C; second channel group LFE. 10000b: first channel group Lf, Rf, C; second channel group LFE, S. 10001b: first channel group Lf, Rf, C; second channel group LFE, Ls, Rs. 10010b: first channel group Lf, Rf, Ls, Rs; second channel group LFE. 10011b: first channel group Lf, Rf, Ls, Rs; second channel group C.

【０１４４】１０１００ｂの場合は、第１チャンネル群
にＬｆ，Ｒｆ（レフトフロント、ライトフロント）、Ｌ
ｓ，Ｒｓ（レフトサラウンド、ライトサラウンド）、第
２チャンネル群にＣ（センター）、ＬＦＥ（低域周波数
効果）が存在することを意味する。10100b means that Lf, Rf (left front, right front) and Ls, Rs (left surround, right surround) exist in the first channel group, and C (center) and LFE (low frequency effect) exist in the second channel group.
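
The 21 channel assignment codes listed above lend themselves to a lookup table. The Python sketch below transcribes the codes as given in the text; the names CHANNEL_ASSIGNMENT and channel_groups are hypothetical, and placing the monaural channel in the first group is an assumption, since the text only says that 00000b means monaural.

```python
from typing import Dict, Tuple

Groups = Tuple[Tuple[str, ...], Tuple[str, ...]]

# code -> (first channel group, second channel group)
CHANNEL_ASSIGNMENT: Dict[int, Groups] = {
    0b00000: (("M",), ()),  # monaural; group placement is an assumption
    0b00001: (("L", "R"), ()),
    0b00010: (("Lf", "Rf"), ("S",)),
    0b00011: (("Lf", "Rf"), ("Ls", "Rs")),
    0b00100: (("Lf", "Rf"), ("LFE",)),
    0b00101: (("Lf", "Rf"), ("LFE", "S")),
    0b00110: (("Lf", "Rf"), ("LFE", "Ls", "Rs")),
    0b00111: (("Lf", "Rf"), ("C",)),
    0b01000: (("Lf", "Rf"), ("C", "S")),
    0b01001: (("Lf", "Rf"), ("C", "Ls", "Rs")),
    0b01010: (("Lf", "Rf"), ("C", "LFE")),
    0b01011: (("Lf", "Rf"), ("C", "LFE", "S")),
    0b01100: (("Lf", "Rf"), ("C", "LFE", "Ls", "Rs")),
    0b01101: (("Lf", "Rf", "C"), ("S",)),
    0b01110: (("Lf", "Rf", "C"), ("Ls", "Rs")),
    0b01111: (("Lf", "Rf", "C"), ("LFE",)),
    0b10000: (("Lf", "Rf", "C"), ("LFE", "S")),
    0b10001: (("Lf", "Rf", "C"), ("LFE", "Ls", "Rs")),
    0b10010: (("Lf", "Rf", "Ls", "Rs"), ("LFE",)),
    0b10011: (("Lf", "Rf", "Ls", "Rs"), ("C",)),
    0b10100: (("Lf", "Rf", "Ls", "Rs"), ("C", "LFE")),
}


def channel_groups(code: int) -> Groups:
    """Look up the two channel groups for a 5-bit assignment code."""
    try:
        return CHANNEL_ASSIGNMENT[code]
    except KeyError:
        raise ValueError(f"assignment code {code:05b}b is not listed") from None
```

For example, channel_groups(0b10100) returns the front and surround pairs as the first group and center plus LFE as the second group.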

【０１４５】また、図２４で示した属性情報には、つま
りＡＯＴＴ＿ＡＯＢＳ＿ＡＴＲまたはＡＯＴＴ＿ＶＯＢ
＿ＡＲＴには、採用されたサンプリング周波数（４４〜
１９２ｋＨｚ）および量子化ビット数（１６〜２４ビッ
ト）が書き込まれている。In addition, the attribute information shown in FIG. 24, that is, AOTT_AOBS_ATR or AOTT_VOB_ART, contains the adopted sampling frequency (44 to 192 kHz) and number of quantization bits (16 to 24 bits).

【０１４６】次に、オーディオパックについて、更に詳
しく説明することにする。Next, the audio pack will be described in more detail.

【０１４７】図２７にはオーディオパックの基本的な構
成を示している。FIG. 27 shows the basic structure of an audio pack.

【０１４８】Ａ＿ＰＫＴのデータ構成は、パックヘッ
ダ、パケットヘッダ、サブストリームＩＤ、ＩＳＲＣ，
プライベートヘッダー長、第１のアクセスユニットポイ
ンタ、オーディオデータ情報、０〜７バイトのスタッフ
ィングバイト、リニアＰＣＭオーディオデータの領域が
設定されている。The data structure of A_PKT provides areas for the pack header, packet header, substream ID, ISRC, private header length, first access unit pointer, audio data information, 0 to 7 stuffing bytes, and linear PCM audio data.

【０１４９】パケットヘッダのサイズとしては、次のよ
うな規則が適用されている。即ち、Ａ＿ＰＫＴがオーデ
ィオオブジェクト内の最初のパケットであれサイズは１
７バイトであり、オーディオフレームの最初データを含
まない場合には９バイト、そうでなければ１４バイトで
ある。The following rules apply to the size of the packet header. If A_PKT is the first packet in the audio object, the size is 17 bytes; if it does not contain the first data of an audio frame, the size is 9 bytes; otherwise it is 14 bytes.
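
The size rule above is a three-way decision. As a minimal Python sketch (the function and parameter names are hypothetical, and the order in which the conditions are tested is an assumption drawn from the order in which the text states them):

```python
def packet_header_size(first_in_object: bool, has_frame_start: bool) -> int:
    """Packet header size in bytes for an audio packet A_PKT.

    first_in_object: the packet is the first one in the audio object.
    has_frame_start: the packet contains the first data of an audio frame.
    """
    if first_in_object:
        return 17
    if not has_frame_start:
        return 9
    return 14
```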

【０１５０】リニアＰＣＭのオーディオパケットは、パ
ケットヘッダ、プライベートヘッダ、オーディオデータ
で構成される。パケットヘッダ及びプライベートヘッダ
の内容は、図２８、図２９に示すような構成である。A linear PCM audio packet consists of a packet header, a private header, and audio data. The contents of the packet header and the private header are structured as shown in FIGS. 28 and 29.

【０１５１】図２８はパケットヘッダであり、記述順に
各データを述べると、パケットスタートコード、ストリ
ームｉｄ，ＰＥＳパケット長、''０１'' 、ＰＥＳスク
ランブル制御情報、ＰＥＳプライオリティー、データ整
列インジケータ、コピーライト、オリジナルか又はコピ
ーか、ＰＴＳ＿ＤＴＳフラッグ、ＥＳＣＲ＿フラッグ、
ＥＳ＿レートフラッグ、ＤＳＭトリックモードフラッ
グ、付加的なコピーフラッグ、ＰＥＳ＿ＣＲＣフラッ
グ、ＰＥＳ拡張フラッグ、ＰＥＳヘッダー長がある。そ
して、次にこのパケットの再生時刻を示すプレゼンテー
ションタイムスタンプ（ＰＴＳ）の記述領域が５バイト
確保されている。次にＰＥＳプライベートデータフラッ
グ、パックヘッダーフィールドフラッグ、プログラムパ
ケット順カウンターフラッグ、Ｐ＿ＳＴＤバッファーフ
ラッグ、第２ＰＥＳ拡張フラッグ、''０１''、Ｐ＿ＳＴ
Ｄバッファースケール、Ｐ＿ＳＴＤバッファサイズ情報
が記述されている。FIG. 28 shows the packet header. In order of description, the fields are: packet start code, stream id, PES packet length, "01", PES scrambling control information, PES priority, data alignment indicator, copyright, original or copy, PTS_DTS flag, ESCR flag, ES rate flag, DSM trick mode flag, additional copy flag, PES_CRC flag, PES extension flag, and PES header length. Next, five bytes are reserved as the description area for the presentation time stamp (PTS), which indicates the playback time of this packet. Following that are the PES private data flag, pack header field flag, program packet sequence counter flag, P_STD buffer flag, second PES extension flag, "01", P_STD buffer scale, and P_STD buffer size information.

【０１５２】図２９には、プライベートパケットを示し
ている。FIG. 29 shows a private packet.

【０１５３】記述順に各データを述べると以下のように
なる。サブストリームｉｄ、予約、ＩＳＲＣ番号、ＩＳ
ＲＣデータ、プライベートヘッダ長、先頭のアクセスユ
ニットポインタ、オーディオ強調フラッグ、予約、予
約、ダウンミックスコード、第１の量子化ワード長、第
２の量子化ワード長、第１のオーディオサンプリング周
波数、第２のオーディオサンプリング周波数、予約、マ
ルチチャンネルタイプ、予約、チャンネルアサインメン
ト、ダイナミックレンジ制御情報、スタッフィングバイ
トである。In order of description, the fields are as follows: substream id, reserved, ISRC number, ISRC data, private header length, first access unit pointer, audio emphasis flag, reserved, reserved, downmix code, first quantization word length, second quantization word length, first audio sampling frequency, second audio sampling frequency, reserved, multichannel type, reserved, channel assignment, dynamic range control information, and stuffing bytes.

【０１５４】各フィールド項目を説明すると次の通りで
ある。Each field item will be described as follows.

【０１５５】サブストリームｉｄには、リニアＰＣＭオ
ーディオデータであるることを示す１０１０００００ｂ
が記述される。静止画制御のために用いられるＩＳＲＣ
番号には、記録されているＩＳＲＣデータのレンジを示
す番号１から１２が記述される。ＩＳＲＣデータは、Ｉ
ＳＲＣ番号により特定されたデータが記述されている。
プライベートヘッダ長としては、このフィールドの最後
のバイトからの論理ブロック数で長さが示されている。
先頭のアクセスユニットポインタには、このフィールド
の最後のバイトからの論理ブロック数で、最初にアクセ
スするユニットの先頭バイトのアドレスが示されてい
る。オーディオ強調フラッグは、第１のサンプリング周
波数が９６ｋＨｚ又は８８．２ｋＨｚのときは強調オ
フ、また第２のサンプリング周波数が９６ｋＨｚ又は８
８．２ｋＨｚのときも強調オフが記述される。強調オフ
は０、強調オンは１が記述される。ダウンミックスコー
ドには、オーディオサンプルのダウンミックスのための
係数テーブルが指示されている。テーブル番号が０００
ｂから１１１１ｂで示されている。The substream id contains 10100000b, indicating linear PCM audio data. The ISRC number, which is used for still picture control, contains a number from 1 to 12 indicating the range of the recorded ISRC data. The ISRC data field contains the data specified by the ISRC number. The private header length is expressed as the number of logical blocks from the last byte of this field. The first access unit pointer gives the address of the first byte of the unit to be accessed first, as the number of logical blocks from the last byte of this field. For the audio emphasis flag, emphasis off is described when the first sampling frequency is 96 kHz or 88.2 kHz, and likewise when the second sampling frequency is 96 kHz or 88.2 kHz. Emphasis off is written as 0 and emphasis on as 1. The downmix code designates the coefficient table for downmixing the audio samples; the table number is given as 000b to 1111b.

【０１５６】第１の量子化ワード長には、第１チャンネ
ル群の量子化されたオーディオサンプルのワード長が記
述され、００００ｂのときは１６ビット、０００１ｂの
ときは２０ビット、００１０ｂのときは２４ビットを意
味する。The first quantization word length describes the word length of the quantized audio samples of the first channel group: 0000b means 16 bits, 0001b means 20 bits, and 0010b means 24 bits.

【０１５７】第２の量子化ワード長には、第２チャンネ
ル群の量子化されたオーディオサンプルのワード長が記
述され、００００ｂのときは１６ビット、０００１ｂの
ときは２０ビット、００１０ｂのときは２４ビットを意
味する。１１１１ｂのときは特定していないことを意味
し、例えば第２チャンネル群が存在しないようなときで
ある。The second quantization word length describes the word length of the quantized audio samples of the second channel group: 0000b means 16 bits, 0001b means 20 bits, and 0010b means 24 bits. 1111b means that the value is not specified, for example when no second channel group exists.

【０１５８】第１のオーディオサンプリング周波数に
は、第１チャンネル群のオーディオサンプルの周波数を
記述している。００００ｂは４８ｋＨｚ、０００１ｂは
９６ｋＨｚ、１０００ｂは４４．１ｋＨｚ、１００１ｂ
は８８．２ｋＨｚを意味する。The first audio sampling frequency describes the sampling frequency of the audio samples of the first channel group: 0000b means 48 kHz, 0001b means 96 kHz, 1000b means 44.1 kHz, and 1001b means 88.2 kHz.

【０１５９】第２のオーディオサンプリング周波数に
は、第２チャンネル群のオーディオサンプルの周波数を
記述している。００００ｂは４８ｋＨｚ、０００１ｂは
９６ｋＨｚ、１０００ｂは４４．１ｋＨｚ、１００１ｂ
は８８．２ｋＨｚを意味する。１１１１ｂのときは特定
していないことを意味し、例えば第２チャンネル群が存
在しないようなときである。The second audio sampling frequency describes the sampling frequency of the audio samples of the second channel group: 0000b means 48 kHz, 0001b means 96 kHz, 1000b means 44.1 kHz, and 1001b means 88.2 kHz. 1111b means that the value is not specified, for example when no second channel group exists.
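
The word-length and sampling-frequency codes given above can be decoded with two small tables. This Python sketch is illustrative: the function name decode_group and its signature are hypothetical, and treating a second group as absent when either of its fields is 1111b is an assumption (the text only says that 1111b means "not specified").

```python
from typing import Optional, Tuple

WORD_LENGTH_BITS = {0b0000: 16, 0b0001: 20, 0b0010: 24}
SAMPLING_HZ = {0b0000: 48_000, 0b0001: 96_000, 0b1000: 44_100, 0b1001: 88_200}
UNSPECIFIED = 0b1111  # allowed only for the second channel group


def decode_group(
    word_code: int, freq_code: int, second_group: bool = False
) -> Optional[Tuple[int, int]]:
    """Return (bits per sample, sampling frequency in Hz), or None if unspecified."""
    if second_group and UNSPECIFIED in (word_code, freq_code):
        return None  # assumption: no second channel group is present
    try:
        return WORD_LENGTH_BITS[word_code], SAMPLING_HZ[freq_code]
    except KeyError:
        raise ValueError("code not listed in the text") from None
```

For instance, a first group coded (0010b, 0001b) decodes to 24-bit samples at 96 kHz, and a second group coded (1111b, 1111b) decodes to None.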

【０１６０】マルチチャンネルタイプには、オーディオ
サンプルのマルチチャンネル構造のタイプが記述され
る。００００ｂはタイプ１であり、他は予約である。The multichannel type describes the type of multichannel structure of the audio samples. 0000b is type 1, and the other values are reserved.

【０１６１】チャンネルアサインメントは、チャンネル
割り当ての様子が記述され、先の図２６で述べた通りで
ある。The channel assignment describes how the channels are assigned, as explained above with reference to FIG. 26.

[0162] The dynamic range control information is control information for compressing the dynamic range. In the 8-bit word, the upper 3 bits represent an integer X and the lower 5 bits represent an integer Y.

[0163] The linear gain is $G = 2^{4-(X+Y/30)}$ ($0 \le X \le 7$, $0 \le Y \le 29$). Expressed in dB, $G = 24.082 - 6.0206X - 0.2007Y$ ($0 \le X \le 7$, $0 \le Y \le 29$).
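The two gain expressions can be checked against each other numerically. The sketch below splits the 8-bit dynamic range control word into X (upper 3 bits) and Y (lower 5 bits) and evaluates both forms. The function names are hypothetical, and rejecting Y values above 29 is an assumption based on the stated range.

```python
import math


def split_drc_word(word):
    """Split the 8-bit control word into X (upper 3 bits) and Y (lower 5 bits)."""
    x = (word >> 5) & 0b111
    y = word & 0b11111
    if y > 29:
        # Assumption: values outside the stated range 0..29 are rejected.
        raise ValueError("Y must be in the range 0..29")
    return x, y


def linear_gain(x, y):
    """Linear gain G = 2^(4 - (X + Y/30))."""
    return 2.0 ** (4 - (x + y / 30))


def gain_db(x, y):
    """Gain in dB using the rounded coefficients given in the text."""
    return 24.082 - 6.0206 * x - 0.2007 * y


# The dB form is 20*log10 of the linear form, up to coefficient rounding.
for x in range(8):
    for y in range(30):
        assert abs(20 * math.log10(linear_gain(x, y)) - gain_db(x, y)) < 0.01
```

With X = 0 and Y = 0 the linear gain is 16 (about 24.08 dB), and with X = 4 and Y = 0 it is exactly 1 (0 dB), so each step of X changes the level by about 6.02 dB and each step of Y by about 0.2 dB.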

[0164] During disc playback, the system control unit reads the attribute information indicating the assignment of the channel groups described above, the first and second quantization word lengths of the audio data, the first and second audio sampling frequencies, and so on. This allows the data of the first channel group and of the second channel group to be extracted and the playback timing to be synchronized. In other words, this header information can be used as synchronization information.

[0165] Next, the playback system for a DVD audio disc recorded as described above is explained in more detail.

[0166] FIG. 30 shows in more detail the signal path of the reproducing apparatus for the audio stream. Data recorded on the optical disc is read by the optical head unit 533 and output as a high-frequency signal. The high-frequency signal (read signal) input to the system processing unit 504 is fed to the synchronization detector 601, which detects the synchronization signal added to the recorded data and generates a timing signal. The read signal, with the synchronization signal removed by the synchronization detector 601, is input to the 8-16 demodulator 602, which demodulates each 16 bits into 8 bits, yielding an 8-bit data stream. The demodulated data is input to the error correction circuit 603 and subjected to error correction. The error-corrected data is input to the demultiplexer 605 via the track buffer 604. The demultiplexer 605 identifies audio packs, real-time data, and so on based on the stream ID and outputs each pack to the corresponding decoder.

[0167] Audio packs are taken into the audio buffer 611. The pack header and packet header of each audio pack are read by the control circuit 612, which recognizes the contents of the audio pack: the start code, stuffing length, packet start code, stream ID, and so on. It also recognizes the packet length, the substream ID, the first access point, the number of audio quantization bits, the sampling frequency, and, from the channel assignment, the channel groups.

[0168] Once this information is recognized, the control circuit 612 can identify the contents of the linear PCM data packet and determine the decoding method. The control circuit 612 can also determine the extraction addresses of the playback audio data within the packet stored in the audio buffer 611. The audio buffer 611 is therefore controlled by the control circuit 612 and can output the samples described earlier, for example S0, S1, e0, e1, S2, S3, ..., to the decoder 613. The control circuit 612 recognizes at least the number of quantization bits, the sampling frequency, and the channel assignment, and on the basis of this information it extracts the data and sets the decode mode of the decoder 613. The samples are supplied to the decoder 613, which performs channel processing and decoding.

[0169] FIG. 31 shows a specific configuration example of the decoder 613. Samples are supplied to the input terminal 710 and distributed channel by channel by the switch 711 under the control of the control circuit 612. An L or Lf signal (including its extra word) is routed to buffer memory 713; an R or Rf signal (including its extra word) to buffer memory 714; a C signal (including its extra word, if present) to buffer memory 715; an Ls signal (including its extra word, if present) to buffer memory 716; and an Rs signal (including its extra word, if present) to buffer memory 717. In addition, an S signal is routed to buffer memory 718 and an LFE signal to buffer memory 719.
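The distribution performed by the switch 711 amounts to a lookup from channel name to buffer memory. The sketch below illustrates this under stated assumptions: samples are represented as hypothetical `(channel, value)` pairs, and the LFE signal is mapped to buffer memory 719 on the assumption that each of the seven signal types has its own buffer among 713 to 719.

```python
from collections import defaultdict

# Channel name -> buffer memory reference numeral, following the description.
# Assumption: LFE uses buffer 719, one buffer per signal type.
BUFFER_FOR_CHANNEL = {
    "L": 713, "Lf": 713,
    "R": 714, "Rf": 714,
    "C": 715,
    "Ls": 716,
    "Rs": 717,
    "S": 718,
    "LFE": 719,
}


def route_samples(samples):
    """Distribute (channel, value) pairs into per-buffer lists, preserving order."""
    buffers = defaultdict(list)
    for channel, value in samples:
        buffers[BUFFER_FOR_CHANNEL[channel]].append(value)
    return dict(buffers)
```

Because L and Lf (and likewise R and Rf) share a buffer, a stream uses one name or the other depending on the channel assignment, never both at once.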

[0170] The outputs of the buffer memories 713 to 719 are input to frame processing units 813 to 819, respectively, where they are organized into frames. The outputs of frame processing units 813, 814, 815, 816, and 817 are supplied to phase matching units 723, 724, 725, 726, and 727, respectively. The outputs of frame processing units 815, 816, and 817 can alternatively be supplied via the switch 820 to frequency converters 821, 822, and 823, respectively. The outputs of frame processing units 818 and 819 are supplied to frequency converters 824 and 825.

[0171] The phase matching units align the final phase of the first channel group signals with that of the second channel group signals when the second channel group undergoes frequency conversion. The outputs of the phase matching units 723 to 727 and of the frequency converters 821 to 825 are supplied to the selector 730. In accordance with the channel assignment information shown in FIG. 26, the selector 730 selects the signals of the corresponding channels and supplies each to the corresponding digital-to-analog converter 731, 732, 733, 734, 735, 736, or 737.

[0172] In the embodiment above, the samples of the second channel group are frequency-converted before output, but they may of course be converted to analog without frequency conversion. In that case, the phase matching circuits on the first channel group side may be omitted.

[0173] Next, the form in which the audio information described above is recorded on the optical disc is briefly explained.

[0174] As shown in FIGS. 32(A) to 32(D), enlarging part of the recording surface of the optical disc 10 reveals rows of pits. A set of these pits constitutes a sector, so a sequence of sectors is formed along the track of the optical disc. The sectors are read continuously by the optical head, and the audio packs are played back in real time.

[0175] Next, one sector is described, for example a sector in which audio information is written. As shown in FIGS. 33(A) and 33(B), one sector consists of 13 × 2 frames, and a synchronization code is added to each frame. The drawing shows the frames arranged two-dimensionally, but on the track they are recorded in order starting from the first frame. In terms of the synchronization codes shown in the figure, the order is SY0, SY5, SY1, SY5, SY2, SY5, and so on.

[0176] In each frame shown in the figure, the synchronization code is 32 bits and the data is 1456 bits, where 32 bits = 16 bits × 2 and 1456 bits = 16 bits × 91. These expressions indicate that 16-bit modulation codes are recorded, because 8-bit data is modulated into 16 bits when it is recorded on the optical disc. The sector information also includes the modulated error correction codes.
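The frame arithmetic can be worked through in a few lines. The sketch below only restates the figures given for the physical sector (13 × 2 frames, a 32-bit synchronization code and 1456 data bits per frame, 16 recorded bits per 8-bit byte); the variable names are hypothetical.

```python
CODE_WORD_BITS = 16          # each 8-bit data byte is recorded as a 16-bit code
SYNC_BITS = 32               # synchronization code per frame
DATA_BITS = 1456             # modulated data per frame
FRAMES_PER_SECTOR = 13 * 2   # frames in one physical sector

# 32 bits = 16 bits x 2: the sync code occupies two code-word lengths.
sync_code_words = SYNC_BITS // CODE_WORD_BITS

# 1456 bits = 16 bits x 91: each frame carries 91 bytes after demodulation.
data_bytes_per_frame = DATA_BITS // CODE_WORD_BITS

# Bytes carried by one physical sector after demodulation.
bytes_per_sector = FRAMES_PER_SECTOR * data_bytes_per_frame

# Recorded bits per physical sector, sync codes included.
recorded_bits_per_sector = FRAMES_PER_SECTOR * (SYNC_BITS + DATA_BITS)
```

One physical sector therefore yields 26 × 91 = 2366 bytes once the 16-bit codes have been demodulated back to 8 bits.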

[0177] FIG. 34(A) shows one recording sector obtained by demodulating the 16-bit data of the physical sector above into 8 bits. The data size of this recording sector is (172 + 10) bytes × (12 + 1) lines. A 10-byte error correction code is added to each line. There is also one line of error correction code; as described later, this code functions as the column-direction error correction code when twelve lines' worth have been gathered.
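The recording sector size can be cross-checked against the physical sector, which consists of 13 × 2 frames of 1456 modulated data bits each. The sketch below is only an arithmetic check with hypothetical variable names.

```python
ROW_DATA_BYTES = 172     # data bytes per line
ROW_PARITY_BYTES = 10    # error correction code appended to each line
DATA_ROWS = 12           # data lines per recording sector
PARITY_ROWS = 1          # one additional line of error correction code

row_bytes = ROW_DATA_BYTES + ROW_PARITY_BYTES      # (172 + 10)
rows = DATA_ROWS + PARITY_ROWS                     # (12 + 1)
recording_sector_bytes = row_bytes * rows

# Cross-check: 26 frames, each carrying 1456 / 16 = 91 demodulated bytes.
frames = 13 * 2
bytes_per_frame = 1456 // 16
assert recording_sector_bytes == frames * bytes_per_frame

# Portion left once the parity bytes and the parity line are set aside.
data_only_bytes = ROW_DATA_BYTES * DATA_ROWS
```

Both routes give 2366 bytes, so each 182-byte line of the recording sector corresponds to exactly two 91-byte frames.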

[0178] Removing the error correction codes from the data of one recording sector yields the data block shown in FIG. 34(B): 2048 bytes of main data, preceded by a 6-byte sector ID, a 2-byte ID error detection code, and 6 bytes of copyright management information, and followed by a 4-byte error detection code.

[0179] The 2048 bytes of data above constitute the one pack described earlier. From the start of the pack, the pack header, the packet header, and the audio data are written in that order. The pack header and packet header thus carry the various pieces of guide information for processing the audio data.

[0180] As described above, one packet in which audio samples are arranged is allocated to and recorded in one sector of the disc. The audio decoder can therefore reproduce linear PCM data correctly even from the information in a single sector. This is because the data is allocated so that the audio data in a pack always starts at the beginning of a main sample, and because the pack header and packet header describe control information sufficient for the audio decoder to process the audio data.

[0181] Next, the error correction code block (ECC block) is described.

[0182] As shown in FIGS. 35(A) and 35(B), an ECC block is formed from a set of 16 of the recording sectors described above. FIG. 35(A) shows 16 data sectors of 12 rows × 172 bytes each (FIG. 34(B)) gathered together. A 16-byte outer-code parity (PO) is added to each column, and a 10-byte inner-code parity (PI) is added to each row. Before recording, as shown in FIG. 35(B), the 16 rows of outer-code parity (PO) are distributed one row to each sector. As a result, one recording sector is configured as 13 (= 12 + 1) rows of data. In FIG. 35(A), B0,0, B0,1, ... denote byte addresses. In FIG. 35(B), the numbers 0 to 15 attached to the blocks each denote one recording sector. On the recording tracks of the disc described above, audio packs and management information are arranged, together with optional still-picture information and real-time information.
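The ECC block dimensions and the redistribution of the outer-code parity can be illustrated as follows. This is a sketch under assumptions: rows are taken to be 172 bytes long, matching the (172 + 10)-byte lines of the recording sector in FIG. 34(A); the 16 bytes of PO per column are treated as 16 parity rows; and each sector is assumed to receive one PO row placed after its 12 data rows, which is what produces 13-row recording sectors. No parity values are computed, and all names are hypothetical.

```python
SECTORS = 16            # recording sectors per ECC block
ROWS_PER_SECTOR = 12    # data rows per sector
ROW_BYTES = 172         # assumed data bytes per row
PO_ROWS = 16            # 16 bytes of PO per column = 16 parity rows
PI_BYTES = 10           # inner-code parity bytes appended to each row

data_rows = SECTORS * ROWS_PER_SECTOR     # rows before PO is added
total_rows = data_rows + PO_ROWS          # rows after PO is added
total_cols = ROW_BYTES + PI_BYTES         # columns after PI is added


def interleaved_row_order():
    """Row labels in recording order: 12 data rows, then one PO row, per sector."""
    order = []
    for sector in range(SECTORS):
        first = sector * ROWS_PER_SECTOR
        order.extend(("data", first + r) for r in range(ROWS_PER_SECTOR))
        order.append(("PO", sector))  # the PO row assigned to this sector
    return order


rows_per_recording_sector = len(interleaved_row_order()) // SECTORS
```

Under these assumptions the block is 208 rows by 182 bytes, and dividing the interleaved rows among the 16 sectors gives the 13 (= 12 + 1) rows per recording sector stated above.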

[0183] The invention has been described as a data structure recorded on or reproduced from a disc, but the same data structure can readily be used for data transmission over a communication system. The invention naturally also covers the data structure itself and apparatuses that transmit, transfer, or receive such a data structure. Furthermore, although the description above concerns a method and apparatus that handle sampled audio signals, the invention can of course also be applied to signals other than audio signals, provided the data must be reproduced and output simultaneously and is used in the same transfer or transmission system.

[0184]

[Effects of the Invention] As described above, the invention makes the greatest possible use of the audio data structure standard of DVD video and can provide an audio information recording medium and apparatus whose data structure realizes a DVD audio standard with high-sound-quality specifications.

[Brief description of the drawings]

FIG. 1 is an explanatory diagram showing the data sample structure and sample arrangement of DVD video related to the present invention.

FIG. 2 is an explanatory diagram showing an example arrangement of packs in DVD video and the structure of an audio pack within that arrangement.

FIG. 3 is an explanatory diagram showing in detail the structure of an audio pack in DVD video.

FIG. 4 is an explanatory diagram showing a table of example in-packet data sizes for linear PCM data.

FIG. 5 is an explanatory diagram showing an example of generating audio packs in DVD video.

FIG. 6 is a diagram showing a table of linear PCM data sizes in DVD video.

FIG. 7 is a diagram showing the pack header of an audio pack.

FIG. 8 is a diagram showing the packet header of an audio pack.

FIG. 9 is a diagram for explaining the basic configuration of a disc recording/reproducing apparatus that adopts scalability.

FIG. 10 is a diagram explaining, with a sample example, the principle of scalability applied to the present invention.

FIG. 11 is a diagram explaining, with another sample example, the principle of scalability applied to the present invention.

FIG. 12 is a diagram explaining, with still another sample example, the principle of scalability applied to the present invention.

FIG. 13 is a diagram explaining, with yet another sample example, the principle of scalability applied to the present invention.

FIG. 14 is a diagram showing an example of a data sample structure according to the present invention.

FIG. 15 is a diagram showing another example of the data sample structure according to the present invention.

FIG. 16 is a diagram showing still another example of the data sample structure according to the present invention.

FIG. 17 is a diagram showing yet another example of the data sample structure according to the present invention.

FIG. 18 is a diagram showing another example of the data sample structure according to the present invention.

FIG. 19 is a diagram showing another example of the data sample structure according to the present invention.

FIG. 20 is a diagram showing a simplified structure of an audio pack according to the present invention.

FIG. 21 is an explanatory diagram showing hierarchically the relationship between an audio object set and audio packs according to the present invention.

FIG. 22 is a diagram for explaining the relationship between the cells of an audio title set and the program chain information according to the present invention.

FIG. 23 is an explanatory diagram showing the arrangement of logical data on a disc on which DVD audio according to the present invention is recorded.

FIG. 24 is an explanatory diagram showing the contents of an audio title set information management table according to the present invention.

FIG. 25 is an explanatory diagram showing the information that constitutes the audio title set program chain information search pointer of FIG. 23.

FIG. 26 is a diagram for explaining a channel assignment table according to the present invention.

FIG. 27 is a diagram showing the structure of an audio pack according to the present invention.

FIG. 28 is an explanatory diagram showing the contents of the packet header of the audio pack of FIG. 27.

FIG. 29 is an explanatory diagram showing the contents of the private packet header of the audio pack of FIG. 27.

FIG. 30 is a diagram showing the configuration of a disc reproducing apparatus according to the present invention.

FIG. 31 is a diagram showing an example of the internal configuration of the decoder of FIG. 30.

FIG. 32 is an explanatory diagram showing a disc, a pit row, a sector row, and a physical sector.

FIG. 33 is a diagram showing the contents of a physical sector.

FIG. 34 is a diagram showing the configuration of a recording sector.

FIG. 35 is a diagram showing the configuration of an error correction code block.

[Explanation of symbols]

10: disc; 533: optical head unit; 502: system CPU; 504: system processing unit; 505: data RAM.

Continuation of the front page: (72) Inventor: Junichi Uota, c/o Toshiba Corp. Yanagicho Works, 70 Yanagicho, Saiwai-ku, Kawasaki-shi, Kanagawa. F-terms (reference): 5C052 AA01 AB05 CC01 DD06 5C053 FA24 FA25 GB02 GB06 GB11 GB21 GB32 JA05 JA23 5D044 AB05 BC02 CC04 DE13 DE44 DE53 FG18 GK08 GK14 5D110 AA14 AA27 DA03 DA11 DA17 DC05 DE01

Claims

[Claims]

1. An audio information recording medium readable by a computer-controlled system, in which the data read and processed under computer control includes at least an audio object and management information, wherein: the audio object uses a first audio data sequence (sample sequence) obtained by digitizing a first channel group among audio signals of a plurality of channels with a first sampling frequency and a first quantization bit number, and a second audio data sequence (sample sequence) obtained by digitizing a second channel or a second channel group among the audio signals of the plurality of channels with a second sampling frequency and a second quantization bit number; the first and second sample sequences are paired in frame units to form a frame arrangement; the frame arrangement is housed in packs each including a header, forming a pack sequence; a set of the packs is managed as a cell, and a set of the cells is managed as a program; the management information includes an audio title set information management table and a program chain information table necessary for reproducing the audio signals of the plurality of channels; the program chain information table has program chain general information, a program information table including the entry cell number of each of a plurality of programs, and a cell reproduction information table indicating the reproduction position of the cells of each program; and the header includes the quantization word length of the samples of the first channel group and information on the first audio sampling frequency, and the quantization word length of the samples of the second channel group and information on the second audio sampling frequency.

2. The audio information recording medium according to claim 1, wherein the header further includes dynamic range information for setting a dynamic range of reproduced audio.

3. A data processing device for an audio information recording medium in which the recorded data includes at least an audio object and management information, wherein: the audio object uses a first audio data sequence (sample sequence) obtained by digitizing a first channel group among audio signals of a plurality of channels with a first sampling frequency and a first quantization bit number, and a second audio data sequence (sample sequence) obtained by digitizing a second channel or a second channel group among the audio signals of the plurality of channels with a second sampling frequency and a second quantization bit number; the first and second sample sequences are paired in frame units to form a frame arrangement; the frame arrangement is housed in packs each including a header, forming a pack sequence; a set of the packs is managed as a cell, and a set of the cells is managed as a program; the management information has an audio title set information management table and a program chain information table necessary for reproducing the audio signals of the plurality of channels; the program chain information table has program chain general information, a program information table including the entry cell number of each of a plurality of programs, and a cell reproduction information table indicating the reproduction position of the cells of each program; and the header includes the quantization word length of the samples of the first channel group and information on the first audio sampling frequency, and the quantization word length of the samples of the second channel group and information on the second audio sampling frequency; the device comprising: means for reading the data of the recording medium; means for obtaining the pack sequence from the data obtained by the reading; and a control unit that acquires, from the header of a pack of the pack sequence, the information on the quantization word length of the samples of the first channel group and the first audio sampling frequency, and the information on the quantization word length of the samples of the second channel group and the second audio sampling frequency, and obtains the reproduction timing of the audio data in the pack so as to synchronize the data of the first and second channel groups.