JP2000214894A - Sound coding device, record medium, sound decoding device, sound transmitting method and transmission medium

Info

Publication number: JP2000214894A
Application number: JP11325957A
Authority: JP
Inventors: Yoshiaki Tanaka; 美昭田中; Shoji Ueno; 昭治植野; Norihiko Fuchigami; 徳彦渕上
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1998-11-16
Filing date: 1999-11-16
Publication date: 2000-08-04
Anticipated expiration: 2019-11-16
Also published as: JP3387084B2

Abstract

PROBLEM TO BE SOLVED: To improve decoding efficiency on the reproducing side when multi-channel audio signals are coded at a variable compression rate. SOLUTION: The audio data area in an audio packet consists of a plurality of PPCM access units, and each PPCM access unit consists of PPCM sync information and a sub-packet. The PPCM sync information includes the number of samples per packet (40, 80, or 160, depending on the sampling frequency), a data-rate identifier ('0' for VBR (variable bit rate), indicating that the data in the sub-packet are compressed data), the sampling frequency and the number of quantization bits, channel allocation information, and the like.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an audio encoding device for compressing multi-channel audio signals, a recording medium, an audio decoding device, an audio transmission method, and a transmission medium.

[0002]

[Description of the Related Art] As a method for compressing audio signals, the present inventors proposed a predictive coding method in an earlier application (Japanese Patent Application No. Hei 9-289159). In that method, for a one-channel original digital audio signal, a plurality of predictors with different characteristics each calculate a linear prediction value of the current signal from past signals in the time domain; a prediction residual is calculated for each predictor from the original digital audio signal and these linear prediction values; and the minimum prediction residual is selected.

[0003] The above method achieves a certain degree of compression when the original digital audio signal has a sampling frequency of 96 kHz and about 20 quantization bits. Recent DVD-Audio discs, however, use twice that sampling frequency (192 kHz), and the number of quantization bits tends to be 24. In addition, in multi-channel audio the sampling frequency and the number of quantization bits may differ from channel to channel.

[0004]

[Problems to Be Solved by the Invention] A compression scheme such as predictive coding has a variable compression rate (VBR: variable bit rate), so when a multi-channel audio signal is predictively coded, the amount of data per channel varies greatly over time. Moreover, such data are transmitted as a single data stream rather than in parallel for each channel. The reproducing (decoding) side must therefore be able to reproduce (present) such a variable-length data stream with the channels synchronized.

[0005] An object of the present invention is therefore to provide an audio encoding device, a recording medium, an audio decoding device, an audio transmission method, and a transmission medium that can improve decoding efficiency on the reproducing side when a multi-channel audio signal is coded at a variable compression rate.

[0006]

[Means for Solving the Problems] To achieve the above object, the present invention comprises the following means 1) to 9).

[0007]

1) An audio encoding device comprising: compression means for compressing a multi-channel audio signal having a given sampling frequency and number of quantization bits, channel by channel, by a predictive coding method; and means for formatting into a data structure having a sub-packet containing the per-channel data compressed by the compression means and a synchronization information part containing the sampling frequency and the number of quantization bits.

2) The audio encoding device according to claim 1, wherein the formatting of the data structure adds a header containing SCR information corresponding to the sub-packet and the synchronization information part.

3) A recording medium on which is recorded a multi-channel audio signal having a given sampling frequency and number of quantization bits, compressed channel by channel by a predictive coding method and formatted into a data structure having a sub-packet containing the compressed per-channel data and a synchronization information part containing the sampling frequency and the number of quantization bits.

4) The recording medium according to claim 3, wherein the formatting of the data structure adds a header containing SCR information corresponding to the sub-packet and the synchronization information part.

5) An audio decoding device for decoding a data structure in which a multi-channel audio signal having a given sampling frequency and number of quantization bits is compressed channel by channel by a predictive coding method, the data structure having a sub-packet containing the compressed per-channel data and a synchronization information part containing the sampling frequency and the number of quantization bits, the device comprising: means for separating the data structure into the sub-packet and the synchronization information part; decompression means for decompressing the compressed data in the sub-packet channel by channel; and means for converting the decompressed audio data into an analog audio signal on the basis of the sampling frequency and the number of quantization bits in the synchronization information part.

6) An audio decoding device for decoding a data structure in which a multi-channel audio signal having a given sampling frequency and number of quantization bits is compressed channel by channel by a predictive coding method, the data structure having a sub-packet containing the compressed per-channel data, a synchronization information part containing the sampling frequency and the number of quantization bits, and a header containing SCR information corresponding to the sub-packet and the synchronization information part, the device comprising: first separating means for separating the SCR information contained in the header; a buffer for holding the sub-packet and the synchronization information part on the basis of the separated SCR information; second separating means for separating the sub-packet and the synchronization information part held in the buffer; decompression means for decompressing the compressed data in the sub-packet channel by channel on the basis of an identifier in the synchronization information part; and means for converting the decompressed audio data into an analog audio signal on the basis of the sampling frequency and the number of quantization bits in the synchronization information part.

7) An audio transmission method characterized by transmitting, over a communication line, packets of a data structure in which a multi-channel audio signal having a given sampling frequency and number of quantization bits is compressed channel by channel by a predictive coding method and formatted into a data structure having a sub-packet containing the compressed per-channel data and a synchronization information part containing the sampling frequency and the number of quantization bits.

8) The audio transmission method according to claim 7, wherein the formatting of the data structure adds a header containing SCR information corresponding to the sub-packet and the synchronization information part.

9) A transmission medium characterized by transmitting packets of a data structure in which a multi-channel audio signal having a given sampling frequency and number of quantization bits is compressed channel by channel by a predictive coding method and formatted into a data structure having a sub-packet containing the compressed per-channel data and a synchronization information part containing the sampling frequency and the number of quantization bits.

[0008]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing a first embodiment of an audio encoding device and an audio decoding device according to the present invention. FIG. 2 is a block diagram showing the encoding unit of FIG. 1 in detail. FIG. 3 is an explanatory diagram showing the bitstream encoded by the encoding unit of FIGS. 1 and 2. FIG. 4 is an explanatory diagram showing the format of a DVD pack. FIG. 5 is an explanatory diagram showing the format of a DVD audio pack. FIG. 6 is an explanatory diagram showing in detail the format of the audio data area of FIG. 5. FIG. 7 is a block diagram showing the decoding unit of FIG. 1 in detail. FIG. 8 is a timing chart showing the write/read timing of the input buffer of FIG. 7. FIG. 9 is an explanatory diagram showing the amount of compressed data per access unit. FIG. 10 is an explanatory diagram showing access units and presentation units.

[0009] The following four systems, for example, are known as multi-channel systems.

(1) 4-channel system: as in Dolby Surround, three front channels (L, C, R) plus one rear channel (S), four channels in total.
(2) 5-channel system: as in Dolby AC-3 without the SW channel, three front channels (L, C, R) plus two rear channels (SL, SR), five channels in total.
(3) 6-channel system: six channels (L, C, R, SW (Lfe), SL, SR), as in DTS (Digital Theater System) or Dolby AC-3.
(4) 8-channel system: as in SDDS (Sony Dynamic Digital Sound), six front channels (L, LC, C, RC, R, SW) plus two rear channels (SL, SR), eight channels in total.

[0010] The 6-channel (6ch) mix & matrix circuit 1' on the encoding side shown in FIG. 1 takes, as an example of a multi-channel signal, 6ch PCM data consisting of front left (Lf), center (C), front right (Rf), surround left (Ls), surround right (Rs), and Lfe (Low Frequency Effect). Using the following equation (1), it converts these into two channels "1" and "2" for the front group and four channels "3" to "6" for the other group. It outputs the two channels "1" and "2" to the first encoding unit 2'-1 and the four channels "3" to "6" to the second encoding unit 2'-2.

"1" = Lf + Rf
"2" = Lf - Rf
"3" = C - (Ls + Rs)/2
"4" = Ls + Rs
"5" = Ls - Rs
"6" = Lfe - a × C, where 0 ≤ a ≤ 1 … (1)

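As a minimal sketch of equation (1) and its inverse, the following Python functions operate on one integer sample per channel. The function names and the default a = 0.5 are hypothetical, and the integer rounding (floor of (Ls + Rs)/2 and floor of a × C) is an assumption, since the text does not say how fractional values are handled. Because the inverse recovers C before it needs a × C, the same rounding can be repeated on both sides.

```python
import math

def matrix_encode(lf, c, rf, ls, rs, lfe, a=0.5):
    """Equation (1): map (Lf, C, Rf, Ls, Rs, Lfe) to channels "1".."6"."""
    if not 0 <= a <= 1:
        raise ValueError("a must satisfy 0 <= a <= 1")
    ch1 = lf + rf
    ch2 = lf - rf
    ch4 = ls + rs
    ch5 = ls - rs
    ch3 = c - ch4 // 2              # assumed rounding: floor((Ls + Rs) / 2)
    ch6 = lfe - math.floor(a * c)   # assumed rounding: floor(a * C)
    return ch1, ch2, ch3, ch4, ch5, ch6

def matrix_decode(ch1, ch2, ch3, ch4, ch5, ch6, a=0.5):
    """Inverse of matrix_encode; returns (Lf, C, Rf, Ls, Rs, Lfe)."""
    lf = (ch1 + ch2) // 2           # ch1 + ch2 = 2 * Lf, so the division is exact
    rf = (ch1 - ch2) // 2
    ls = (ch4 + ch5) // 2
    rs = (ch4 - ch5) // 2
    c = ch3 + ch4 // 2              # undo the same floor used by the encoder
    lfe = ch6 + math.floor(a * c)
    return lf, c, rf, ls, rs, lfe
```

With these rounding choices, `matrix_decode(*matrix_encode(...))` returns the original six samples for integer input.
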
[0011] The first and second encoding units 2'-1 and 2'-2, which make up the encoding unit 2', predictively code the PCM data of the two channels "1" and "2" and of the four channels "3" to "6", respectively, channel by channel, as shown in detail in FIG. 2. They transmit the predictively coded data to the decoding side as a bitstream such as that shown in FIG. 3, via a recording medium 5 or a communication medium 6 such as a satellite link or a telephone line. On the decoding side, the first and second decoding units 3'-1 and 3'-2, which make up the decoding unit 3', decode the predictively coded data of the two front-group channels "1" and "2" and of the four other-group channels "3" to "6", respectively, into PCM data channel by channel, as shown in detail in FIG. 7.

[0012] Next, the mix & matrix circuit 4' restores the original six channels (Lf, C, Rf, Ls, Rs, Lfe) on the basis of equation (1). It also generates stereo 2ch data (L, R) from these original six channels and coefficients mij (i = 1, 2; j = 1, 2, …, 6) according to the following equation (2).

L = m11·Lf + m12·Rf + m13·C + m14·Ls + m15·Rs + m16·Lfe
R = m21·Lf + m22·Rf + m23·C + m24·Ls + m25·Rs + m26·Lfe … (2)
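
Equation (2) is a 2 × 6 weighted sum, which can be sketched as follows. The function name is hypothetical, and the coefficient values in the usage note are purely illustrative; the text does not give values for mij.

```python
def downmix_to_stereo(samples, m):
    """Equation (2).

    samples: (Lf, Rf, C, Ls, Rs, Lfe), in the order j = 1..6 of equation (2).
    m: two rows of six coefficients, m[0] = (m11..m16), m[1] = (m21..m26).
    """
    if len(samples) != 6 or len(m) != 2 or any(len(row) != 6 for row in m):
        raise ValueError("expected 6 samples and a 2 x 6 coefficient matrix")
    left = sum(mij * s for mij, s in zip(m[0], samples))
    right = sum(mij * s for mij, s in zip(m[1], samples))
    return left, right
```

For example, with the illustrative matrix `[[1.0, 0.0, 0.5, 0.5, 0.0, 0.0], [0.0, 1.0, 0.5, 0.0, 0.5, 0.0]]`, the center channel is split equally between L and R and each surround channel is folded into its own side at half level.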

[0013] The encoding units 2'-1 and 2'-2 are described in detail with reference to FIG. 2. The PCM data of each channel "1" to "6" are stored frame by frame in a one-frame buffer 10. The sample data of each channel "1" to "6" for one frame are applied to prediction circuits 13D1, 13D2, and 15D1 to 15D4, respectively, and the first sample data of each frame of each channel "1" to "6" (stored in a restart header, described later) are applied to the unpacking circuit 8 and the formatting circuit 19. The sampling frequency (fs) and the number of quantization bits (Qb) used when the PCM data were A/D-converted are applied to the packing circuit 18 and the formatting circuit 19. For the PCM data of each channel "1" to "6", the prediction circuits 13D1, 13D2, and 15D1 to 15D4 each use a plurality of predictors (not shown) with different characteristics to calculate a plurality of linear prediction values of the current signal from past signals in the time domain, and then calculate a prediction residual for each predictor from the original PCM data and these linear prediction values. The buffer/selectors 14D1, 14D2, and 16D1 to 16D4 that follow temporarily store the prediction residuals calculated by the prediction circuits 13D1, 13D2, and 15D1 to 15D4, respectively, and select the minimum prediction residual for each subframe specified by the selection signal/DTS (decoding time stamp) generator 17.
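
The per-subframe selection described above can be sketched as follows. This is an illustration under stated assumptions, not the circuit of FIG. 2: the predictor bank is hypothetical (the text only says the predictors have different characteristics), and "minimum prediction residual" is interpreted here as the smallest peak absolute residual over the subframe.

```python
# Hypothetical predictor bank; each predictor maps the past samples to a prediction.
PREDICTORS = (
    lambda h: 0,                                             # predict zero
    lambda h: h[-1],                                         # repeat the previous sample
    lambda h: 2 * h[-1] - h[-2] if len(h) >= 2 else h[-1],   # linear extrapolation
)

def residuals_for(samples, history, predictor):
    """Residual (actual minus predicted) for each sample of one subframe."""
    past = list(history)
    out = []
    for x in samples:
        out.append(x - predictor(past))
        past.append(x)
    return out

def select_predictor(samples, history):
    """Return (predictor index, residuals) with the smallest peak |residual|.

    history must hold at least one past sample; samples must be non-empty.
    """
    best_index, best_residuals, best_peak = None, None, None
    for index, predictor in enumerate(PREDICTORS):
        res = residuals_for(samples, history, predictor)
        peak = max(abs(r) for r in res)
        if best_peak is None or peak < best_peak:
            best_index, best_residuals, best_peak = index, res, peak
    return best_index, best_residuals
```

The returned index plays the role of the predictor selection flag, and the peak residual determines how many bits the subframe needs when it is packed.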

[0014] The selection signal/DTS generator 17 applies a bit-count flag for the prediction residual to the packing circuit 18 and the formatting circuit 19. It also applies to the formatting circuit 19 a predictor selection flag indicating the predictor with the minimum prediction residual, the correlation coefficient a in equation (1), and a DTS indicating the time at which the decoding side takes stream data out of the input buffer 22a (FIG. 7). The packing circuit 18 packs the six channels of prediction residuals selected by the buffer/selectors 14D1, 14D2, and 16D1 to 16D4 into the specified number of bits, on the basis of the bit-count flag specified by the selection signal/DTS generator 17. The PTS generator 17c generates a PTS (presentation time stamp) indicating the time at which the decoding side takes PCM data out of the output buffer 110 (FIG. 7) and outputs it to the formatting circuit 19.
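
Packing residuals into a specified number of bits can be sketched as below. The two's-complement representation, the big-endian bit order, and the zero padding to a byte boundary are assumptions; the text specifies only that a bit-count flag selects the width.

```python
def bits_needed(residuals):
    """Smallest two's-complement width that holds every residual."""
    return max((r if r >= 0 else ~r).bit_length() + 1 for r in residuals)

def pack_residuals(residuals, nbits):
    """Pack signed residuals into nbits each, most significant bit first."""
    lo, hi = -(1 << (nbits - 1)), (1 << (nbits - 1)) - 1
    mask = (1 << nbits) - 1
    acc = 0
    for r in residuals:
        if not lo <= r <= hi:
            raise ValueError(f"{r} does not fit in {nbits} bits")
        acc = (acc << nbits) | (r & mask)
    total = nbits * len(residuals)
    pad = (-total) % 8                      # zero bits up to the next byte boundary
    return (acc << pad).to_bytes((total + pad) // 8, "big")

def unpack_residuals(data, nbits, count):
    """Inverse of pack_residuals for a known width and sample count."""
    acc = int.from_bytes(data, "big")
    total = len(data) * 8
    mask = (1 << nbits) - 1
    out = []
    for i in range(count):
        v = (acc >> (total - nbits * (i + 1))) & mask
        out.append(v - (1 << nbits) if v >> (nbits - 1) else v)  # sign-extend
    return out
```

The decoder needs the same width and count to unpack, which is why the bit-count flag travels in the stream alongside the residual data.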

[0015] The formatting circuit 19 that follows formats the data into user data as shown in FIGS. 3 to 6. The user data (sub-packet) shown in FIG. 3 consists of a variable-rate bitstream (substream) BS0 containing the predictively coded data of the two front-group channels "1" and "2", a variable-rate bitstream (substream) BS1 containing the predictively coded data of the four other-group channels "3" to "6", and a bitstream header (restart header) placed before substreams BS0 and BS1. One frame of substreams BS0 and BS1 multiplexes the following:

- a frame header;
- the first sample data of one frame for each channel "1" to "6";
- a predictor selection flag for each subframe of each channel "1" to "6";
- a bit-count flag for each subframe of each channel "1" to "6";
- the prediction residual data sequence (variable number of bits) for each channel "1" to "6";
- the coefficient a for channel "6".

With such predictive coding, a compression ratio of 71% can be achieved when the original signal has, for example, a sampling frequency (fs) of 96 kHz, 24 quantization bits (Qb), and six channels.

[0016] When the variable-rate bitstream data predictively coded by the encoding units 2'-1 and 2'-2 shown in FIG. 2 are recorded on a DVD-Audio disc as an example of a recording medium, they are packed into the audio (A) pack shown in FIG. 4. This pack consists of 2034 bytes of user data (A packet, V packet) to which is added a 14-byte pack header made up of 4 bytes of pack start information, 6 bytes of SCR (System Clock Reference) information, 3 bytes of mux rate information, and 1 byte of stuffing (1 pack = 2048 bytes in total). By setting the SCR information, which is a time stamp, to "1" in the first pack and making it continuous within the same title, the timing of the A packs within a title can be managed.
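
The byte budget of the pack (4 + 6 + 3 + 1 = 14 header bytes plus 2034 bytes of user data = 2048) can be checked with a small sketch. The field encoding here is an assumption: each field is stored as a plain big-endian integer, which does not reproduce the marker-bit layout of a real DVD pack header, and the default start code and the 0xFF stuffing byte are likewise placeholders.

```python
PACK_SIZE = 2048
HEADER_SIZE = 14          # 4 (start) + 6 (SCR) + 3 (mux rate) + 1 (stuffing)
USER_DATA_SIZE = 2034

def build_pack(scr, mux_rate, user_data, start_code=b"\x00\x00\x01\xba"):
    """Assemble one 2048-byte pack from simplified header fields and user data."""
    if len(start_code) != 4:
        raise ValueError("pack start information must be 4 bytes")
    if len(user_data) != USER_DATA_SIZE:
        raise ValueError("user data must be exactly 2034 bytes")
    header = (start_code
              + scr.to_bytes(6, "big")        # simplified: plain 48-bit integer
              + mux_rate.to_bytes(3, "big")   # simplified: plain 24-bit integer
              + b"\xff")                      # assumed stuffing byte value
    return header + user_data

def split_pack(pack):
    """Return (scr, mux_rate, user_data) from a pack built by build_pack."""
    if len(pack) != PACK_SIZE:
        raise ValueError("a pack is exactly 2048 bytes")
    scr = int.from_bytes(pack[4:10], "big")
    mux_rate = int.from_bytes(pack[10:13], "big")
    return scr, mux_rate, pack[HEADER_SIZE:]
```

A sequence of packs for one title would then carry SCR values that start at 1 and increase continuously, as the paragraph above describes.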

[0017] As shown in detail in FIG. 5, the A packet of compressed PCM consists of a packet header of 9 to 22 bytes, a compressed-PCM private header, and 1 to 2015 bytes of audio data (compressed PCM) in the format shown in FIG. 3. The DTS and PTS are set in the packet header of FIG. 5 (specifically, the PTS in bytes 10 to 14 and the DTS in bytes 15 to 19 of the packet header). The private header of compressed PCM consists of:

- a 1-byte substream ID;
- a 2-byte UPC/EAN-ISRC (Universal Product Code/European Article Number-International Standard Recording Code) number and UPC/EAN-ISRC data;
- a 1-byte private header length;
- a 2-byte first access unit pointer;
- 4 bytes of audio data information (ADI);
- 0 to 7 stuffing bytes.

[0018] Within the ADI, a forward access unit search pointer for searching for the access unit one second later and a backward access unit search pointer for searching for the access unit one second earlier are each set as one byte. Specifically, the forward access unit search pointer is set in the first byte of the ADI and the backward access unit search pointer in the eighth byte. Because the ADI is thus reduced to 4 bytes for compressed PCM, up to 2015 bytes of audio data can be stored.

[0019] The audio data area in the compressed PCM (PPCM) audio packet shown in FIG. 5 consists of a plurality of PPCM access units, as shown in FIG. 6, and each PPCM access unit consists of PPCM sync information and a sub-packet. The sub-packet in the first PPCM access unit consists of a directory, substream "BS0", a CRC (1 or 2 bytes), substream "BS1", a CRC, and extra information, and substreams "BS0" and "BS1" consist only of PPCM blocks. The sub-packets in the second and subsequent PPCM access units likewise consist of a directory, substream "BS0", a CRC, substream "BS1", a CRC, and extra information, and substreams "BS0" and "BS1" consist of a restart header and PPCM blocks. The extra information has at least a size adjustment function. That is, when the incoming data are fixed-rate (CBR), the number of samples per packet is set to 40, 80, or 160 according to the sampling frequency fs, as described above. Depending on the number of samples so determined, the data length per packet may not match the sub-packet size; to match it to the sub-packet size, the size is adjusted by appending, for example, 0, 0, …. Text data or the like can also be used as this size-adjustment data.
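
The size adjustment performed by the extra information can be sketched as follows. The function name and the policy of truncating text that does not fit are assumptions for illustration.

```python
def pad_subpacket(payload, target_size, text=b""):
    """Append extra information so that payload reaches target_size bytes.

    Optional text data is used first; any remaining room is filled with zeros.
    """
    room = target_size - len(payload)
    if room < 0:
        raise ValueError("payload is larger than the sub-packet size")
    extra = text[:room]                  # assumed policy: truncate text that does not fit
    extra += bytes(room - len(extra))    # fill the rest with 0, 0, ...
    return payload + extra
```

For example, `pad_subpacket(b"abc", 8)` appends five zero bytes, while `pad_subpacket(b"abc", 8, b"hi")` appends the two text bytes followed by three zero bytes.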

[0020] The PPCM sync information (hereinafter also called synchronization information) includes the following:

- Number of samples per packet: 40, 80, or 160 is selected according to the sampling frequency fs.
- Data-rate identifier: "0" when the data rate is VBR (indicating that the data in the sub-packet are VBR compressed data), "1" when it is CBR (indicating that the data in the sub-packet are fixed-rate).
- Sampling frequency fs and number of quantization bits Qb.
- Channel assignment information.
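
The fields of the PPCM sync information can be modeled as a small record. The class and field names are hypothetical, and the mapping from fs to samples per packet is an assumption: the text says only that 40, 80, or 160 is chosen according to the sampling frequency.

```python
from dataclasses import dataclass

# Assumed mapping from sampling frequency (Hz) to samples per packet.
SAMPLES_PER_PACKET = {48_000: 40, 96_000: 80, 192_000: 160}

@dataclass(frozen=True)
class PPCMSyncInfo:
    samples_per_packet: int   # 40, 80, or 160
    rate_id: int              # 0 = VBR compressed data, 1 = CBR (fixed rate)
    fs: int                   # sampling frequency in Hz
    qb: int                   # number of quantization bits
    channel_assignment: int   # opaque channel assignment code

    def __post_init__(self):
        if self.samples_per_packet not in (40, 80, 160):
            raise ValueError("samples per packet must be 40, 80, or 160")
        if self.rate_id not in (0, 1):
            raise ValueError("rate identifier must be 0 (VBR) or 1 (CBR)")

    @property
    def is_vbr(self):
        return self.rate_id == 0

def make_sync_info(fs, qb, channel_assignment, vbr=True):
    """Build the sync information for one access unit."""
    return PPCMSyncInfo(SAMPLES_PER_PACKET[fs], 0 if vbr else 1, fs, qb, channel_assignment)
```

A decoder can branch on `is_vbr` to pick its processing path and hand `fs` and `qb` to the D/A stage.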

[0021] Next, the decoding units 3'-1 and 3'-2 are described with reference to FIG. 7. The variable-rate bitstream data BS0 and BS1 in the above format are separated by the deformatting circuit 21. The first sample data of one frame and the predictor selection flags for each channel "1" to "6" are applied to prediction circuits 24D1, 24D2, and 23D1 to 23D4, respectively, and the bit-count flags for each channel "1" to "6" are applied to the unpacking circuit 22. The SCR, the DTS, and the prediction residual data sequence are applied to the input buffer 22a, and the PTS is applied to the output buffer 110. The identifier indicating whether the data rate is VBR or CBR is applied to each of the predictors 24D1, 24D2, 23D1, 23D2, 23D3, and 23D4, which determine and run the input/output data processing program corresponding to the identifier. For VBR, the processing program must be switched and the input data loaded every time, so processing takes time; for CBR, the rate is fixed, so the processing program need not be switched and processing is faster. The sampling frequency fs and the number of quantization bits Qb are applied to the D/A converter 102. The plurality of predictors (not shown) in the prediction circuits 24D1, 24D2, and 23D1 to 23D4 have the same characteristics as the plurality of predictors in the encoding-side prediction circuits 13D1, 13D2, and 15D1 to 15D4, and the predictor with the same characteristics is selected by the predictor selection flag.

[0022] The deformatting circuit 21 first separates the audio packet from the audio pack and then separates the stream data (prediction residual data sequence) from the audio packet to extract bitstreams BS0 and BS1. The SCR is also extracted and, as shown in FIG. 8, the data are taken into and accumulated in the input buffer 22a access unit by access unit, according to the timing given by the SCR. The data amount of one access unit corresponds, for example, to (1/96 kHz) seconds when fs = 96 kHz, but is variable in length, as shown in detail in FIGS. 9 and 10(a). The stream data accumulated in the input buffer 22a are read out in FIFO order on the basis of the DTS and applied to the unpacking circuit 22.
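
The write and read timing of the input buffer can be illustrated with a small simulation: access units enter the buffer when the clock reaches their SCR and leave in FIFO order once it reaches their DTS. The class names and the integer tick clock are assumptions for illustration; FIG. 8 is not reproduced here.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class AccessUnit:
    scr: int        # time at which the unit is written into the input buffer
    dts: int        # time at which the unit is read out for decoding
    payload: bytes  # variable-length compressed data

class InputBuffer:
    def __init__(self):
        self._fifo = deque()

    def write_due(self, pending, clock):
        """Move units whose SCR has been reached from pending into the buffer."""
        while pending and pending[0].scr <= clock:
            self._fifo.append(pending.popleft())

    def read_due(self, clock):
        """Pop, in FIFO order, the payloads whose DTS has been reached."""
        out = []
        while self._fifo and self._fifo[0].dts <= clock:
            out.append(self._fifo.popleft().payload)
        return out

def simulate(units, ticks):
    """Return a log of (clock, payloads read at that clock)."""
    pending = deque(sorted(units, key=lambda u: u.scr))
    buf = InputBuffer()
    log = []
    for clock in range(ticks):
        buf.write_due(pending, clock)
        log.append((clock, buf.read_due(clock)))
    return log
```

Because the payload lengths differ from unit to unit, the buffer occupancy varies over time even though one unit is written and one is read per tick in the steady state.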

[0023] The unpacking circuit 22 separates the prediction residual data sequence of each channel "1" to "6" on the basis of the bit-count flags and outputs them to the prediction circuits 24D1, 24D2, and 23D1 to 23D4, respectively. In each of the prediction circuits 24D1, 24D2, and 23D1 to 23D4, the current prediction residual data for the channel from the unpacking circuit 22 are added to the previous prediction value produced by the one internal predictor selected by the predictor selection flag, giving the current prediction value. The PCM data of each sample are then calculated with the first sample data of the frame as the reference and accumulated in the output buffer 110. The PCM data accumulated in the output buffer 110 are read out and output on the basis of the PTS, so that the variable-length access units shown in FIG. 10(a) are decompressed and the fixed-length presentation units shown in FIG. 10(b) are output.
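
The reconstruction step can be sketched as follows: starting from the first sample of the frame, each PCM sample is the selected predictor's output plus the transmitted residual. The predictor bank is hypothetical and stands in for whatever bank the encoder used; the only requirement stated in the text is that both sides use predictors with the same characteristics.

```python
# Hypothetical predictor bank; it must match the bank used on the encoding side.
PREDICTORS = (
    lambda h: 0,                                             # predict zero
    lambda h: h[-1],                                         # repeat the previous sample
    lambda h: 2 * h[-1] - h[-2] if len(h) >= 2 else h[-1],   # linear extrapolation
)

def decode_frame(first_sample, residuals, predictor_index):
    """Rebuild PCM samples from the first sample, the residuals, and the selection flag."""
    predictor = PREDICTORS[predictor_index]
    pcm = [first_sample]
    for r in residuals:
        pcm.append(predictor(pcm) + r)   # prediction from past output plus residual
    return pcm
```

For instance, `decode_frame(10, [2, 0, 0], 2)` yields `[10, 12, 14, 16]`: the first residual corrects the initial guess, after which linear extrapolation tracks the ramp exactly.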

[0024] The PCM data are converted into an analog signal by the D/A converter 102 on the basis of the sampling frequency fs and the number of quantization bits Qb in the PPCM sync information. At the same time, when the CBR identifier is detected in the PPCM sync information, the position of the extra data is detected from the directory, and extra data for size adjustment, such as 0, 0, … data or text data, are detected, the following is done. If the extra data are text data, they are supplied from the unpacking circuit 22 to a text data decoding circuit (not shown), decoded there, extracted as text data, and output through the output buffer 110. If the extra data are 0, 0, … data, no processing is performed. If no text data decoder circuit is provided, this processing is skipped. When search playback is instructed via the operation unit 101, the control unit 100 reproduces the access unit on the basis of the forward access unit search pointer (one second ahead) and the backward access unit search pointer (one second back) shown in FIG. 5. The search pointers may point two seconds ahead and two seconds back instead of one second ahead and one second back.

[0025] When the variable-rate bit stream data predictively coded by the encoding units 2'-1 and 2'-2 shown in FIG. 2 is transmitted over a network, the encoding side packetizes the data for transmission as shown in FIG. 11 (step S41), then adds a packet header (step S42), and then sends the packet out onto the network (step S43).
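
The three encoder-side steps of FIG. 11 can be sketched as follows. This is only an illustration, not the packet format of the embodiment: the 6-byte header (a 16-bit sequence number and a 32-bit payload length), the payload size limit, and the names `packetize`, `add_header`, `send` and `transmit` are all assumptions, and a Python list stands in for the network.

```python
import struct
from typing import List

HEADER_FORMAT = ">HI"  # assumed header: 16-bit sequence number, 32-bit payload length
MAX_PAYLOAD = 2048     # assumed payload limit per packet, in bytes


def packetize(bitstream: bytes, max_payload: int = MAX_PAYLOAD) -> List[bytes]:
    """Step S41: split the variable-rate bit stream into transmission payloads."""
    return [bitstream[i:i + max_payload] for i in range(0, len(bitstream), max_payload)]


def add_header(seq: int, payload: bytes) -> bytes:
    """Step S42: prepend the (assumed) packet header to one payload."""
    return struct.pack(HEADER_FORMAT, seq & 0xFFFF, len(payload)) + payload


def send(network: List[bytes], packet: bytes) -> None:
    """Step S43: hand the packet to the network (a list stands in for it here)."""
    network.append(packet)


def transmit(bitstream: bytes, network: List[bytes]) -> None:
    """Run steps S41 to S43 for a whole bit stream."""
    for seq, payload in enumerate(packetize(bitstream)):
        send(network, add_header(seq, payload))
```

A receiver built on the same assumptions would strip the first 6 bytes of each packet and concatenate the payloads in sequence order to recover the bit stream.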

[0026] On the decoding side, as shown in FIG. 12A, the header is removed (step S51), the data is restored (step S52), and the data is stored in memory to await decoding (step S53). When decoding is performed, as shown in FIG. 12B, deformatting is carried out (step S61), input/output control of the input buffer 22a is performed (step S62), and unpacking is performed (step S63). At this point, if a search playback instruction has been given, the search pointer is decoded. Next, a predictor is selected based on the flag and decoding is performed (step S64), input/output control of the output buffer 110 is performed (step S65), the original multi-channel signal is restored (step S66), and it is output (step S67). These steps are then repeated.
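
Step S64, in which a predictor is chosen according to the transmitted flag and the residuals are turned back into PCM samples, can be sketched as below. The three fixed predictors, the sum-of-magnitudes cost used by the encoder, and the function names are assumptions made for illustration; the predictors actually used by the prediction circuits are not specified in this passage. In line with the description, the first sample of the frame is carried as-is and every later sample is the predicted value plus the residual. Frames are assumed to be non-empty.

```python
from typing import Callable, List, Tuple

# Hypothetical predictors: each maps the already-decoded history to a prediction
# of the next sample. The history always holds at least one sample when called.
PREDICTORS: List[Callable[[List[int]], int]] = [
    lambda h: 0,                                           # no prediction
    lambda h: h[-1],                                       # previous sample
    lambda h: 2 * h[-1] - h[-2] if len(h) > 1 else h[-1],  # linear extrapolation
]


def encode_frame(pcm: List[int]) -> Tuple[int, int, List[int]]:
    """Try every predictor and keep the one whose residuals are smallest.

    Returns (predictor flag, first sample, residuals of the remaining samples).
    """
    best = None
    for flag, predict in enumerate(PREDICTORS):
        residuals = [pcm[i] - predict(pcm[:i]) for i in range(1, len(pcm))]
        cost = sum(abs(r) for r in residuals)
        if best is None or cost < best[0]:
            best = (cost, flag, residuals)
    _, flag, residuals = best
    return flag, pcm[0], residuals


def decode_frame(flag: int, first_sample: int, residuals: List[int]) -> List[int]:
    """Rebuild PCM: start from the first sample, then add each residual to the prediction."""
    predict = PREDICTORS[flag]
    pcm = [first_sample]
    for r in residuals:
        pcm.append(predict(pcm) + r)
    return pcm
```

Because the decoder feeds the predictor the same history the encoder used, the reconstruction is exact whichever predictor the flag names.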

[0027] In the above embodiment, the 2 channels "1" and "2" relating to the front group are obtained by the conversion

  "1" = Lf + Rf
  "2" = Lf - Rf

and are then predictively coded. Alternatively, the multi-channels may be downmixed by equation (2) to generate stereo 2ch data (L, R), which is then converted by the following equation (1)' and predictively coded (second embodiment):

  "1" = L + R
  "2" = L - R
  "3" to "5" are the same
  "6" = Lfe - C   ... (1)'

In this case, the mix & matrix circuit 4' on the decoding side can generate channel L by adding channels "1" and "2", and channel R by subtracting them.
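
A small numerical check of the sum/difference conversion in equation (1)' may help. The sketch assumes integer PCM samples and uses hypothetical function names. Note that the plain sum "1" + "2" equals 2L and the difference "1" - "2" equals 2R, so the sketch halves both results to recover L and R, a scaling step the text above does not spell out.

```python
def to_sum_diff(left: int, right: int) -> tuple:
    """Forward conversion of equation (1)': "1" = L + R, "2" = L - R."""
    return left + right, left - right


def from_sum_diff(ch1: int, ch2: int) -> tuple:
    """Inverse conversion: ch1 + ch2 = 2L and ch1 - ch2 = 2R, both always even."""
    return (ch1 + ch2) // 2, (ch1 - ch2) // 2
```

Since both 2L and 2R are even, the integer division loses nothing and the round trip is exact.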

[0028] As a third embodiment, as shown in FIG. 13, instead of the 2 channels "1" and "2", the multi-channels may be downmixed by equation (2) to generate stereo 2ch data (L, R), and this stereo 2ch (L, R) and the 4 channels "3" to "6" may be predictively coded. In the second and third embodiments, the front left (Lf) and front right (Rf) channels are not transmitted to the decoding side, so the decoding side generates them using equations (1) and (2).

[0029] Next, a fourth embodiment will be described with reference to FIGS. 14, 15 and 16. In the above embodiments, one group of correlated signals "1" to "6" is predictively coded. In the fourth embodiment, a plurality of groups of correlated signals are generated and predictively coded, and the predictively coded data of the group with the highest compression ratio is selected. Also, in this embodiment the coding within one group is not split into 2ch for the front group and 4ch for the other group, as it is in the preceding embodiments; instead, a single unified coding process is performed. FIG. 14 is drawn to correspond to FIG. 1 described above. FIG. 15 shows the detailed blocks of the encoding unit; in this embodiment, n correlation circuits 1-1 to 1-n are provided on the mix & matrix circuit 1' side. These n correlation circuits 1-1 to 1-n convert, for example, 6ch (Lf, C, Rf, Ls, Rs, Lfe) PCM data into n kinds of 6ch signals "1" to "6" with different correlations.

[0030] For example, the first correlation circuit 1-1 performs the following conversion:

  "1" = Lf
  "2" = C - (Ls + Rs) / 2
  "3" = Rf - Lf
  "4" = Ls - a × Lfe
  "5" = Rs - b × Rf
  "6" = Lfe

The n-th correlation circuit 1-n performs the following conversion:

  "1" = Lf + Rf
  "2" = C - Lf
  "3" = Rf - Lf
  "4" = Ls - Lf
  "5" = Rs - Lf
  "6" = Lfe - C
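
The conversion of the first correlation circuit 1-1 and its inverse on the decoding side can be sketched as follows. The function names, the sample values and the coefficient values are assumptions, and so is the rounding: the sketch uses floor rounding for the division by 2 and for the products with a and b, applied identically in both directions, so that integer samples are recovered exactly. The description does not state how rounding is handled.

```python
import math


def correlate_1(lf, c, rf, ls, rs, lfe, a, b):
    """Forward conversion of correlation circuit 1-1 (floor rounding assumed)."""
    return (
        lf,                        # "1" = Lf
        c - (ls + rs) // 2,        # "2" = C - (Ls + Rs) / 2
        rf - lf,                   # "3" = Rf - Lf
        ls - math.floor(a * lfe),  # "4" = Ls - a x Lfe
        rs - math.floor(b * rf),   # "5" = Rs - b x Rf
        lfe,                       # "6" = Lfe
    )


def decorrelate_1(ch1, ch2, ch3, ch4, ch5, ch6, a, b):
    """Inverse conversion: undo each step in an order where its inputs are known."""
    lf = ch1
    lfe = ch6
    rf = ch3 + lf
    ls = ch4 + math.floor(a * lfe)
    rs = ch5 + math.floor(b * rf)
    c = ch2 + (ls + rs) // 2
    return lf, c, rf, ls, rs, lfe
```

The inverse has to recover Lf and Lfe first, then Rf, then Ls and Rs, and only then C, because each later channel depends on the ones restored before it.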

[0031] A prediction circuit 15 and a buffer/selector 16 are provided for each of the correlation circuits 1-1 to 1-n, and the correlation selection signal generator 17b selects the group with the highest compression ratio based on the data amount of the minimum prediction residuals of each group. At this time, the formatting circuit 19 adds the selection flag (the correlation circuit selection flag and the correlation coefficients a and b of that correlation circuit) and multiplexes it.
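
The selection made by the correlation selection signal generator 17b can be pictured as picking, among the candidate groups, the one whose residual data is smallest. In the sketch below, the cost measure (a rough bit count of the residual magnitudes) and all names are assumptions; the description only states that the choice is based on the data amount of the minimum prediction residuals.

```python
from typing import Dict, List


def residual_bits(residuals: List[int]) -> int:
    """Rough data amount: one sign bit plus the magnitude bits of each residual."""
    return sum(1 + abs(r).bit_length() for r in residuals)


def select_group(groups: Dict[int, List[List[int]]]) -> int:
    """Return the flag of the group (correlation circuit) with the least data.

    `groups` maps a correlation circuit flag to the per-channel residual lists
    produced for that group.
    """
    return min(groups, key=lambda flag: sum(residual_bits(ch) for ch in groups[flag]))
```

The returned flag corresponds to the correlation circuit selection flag that the formatting circuit multiplexes into the stream.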

[0032] FIG. 16 shows the data area corresponding to FIG. 6 described above. In this embodiment, the substream "BS1" is not used, and the data area consists of the substream "BS0" only.

[0033] On the decoding side shown in FIG. 17, n correlation circuits 4-1 to 4-n (or a single correlation circuit, not shown, whose coefficients a and b can be changed) are provided to match the correlation circuits 1-1 to 1-n on the encoding side. If the n groups of prediction circuits shown in FIG. 15 have the same configuration, the decoding device, as shown in FIG. 17, need not provide prediction circuits for n groups; prediction circuits for one group suffice. Based on the selection flag transmitted from the encoding device, one of the correlation circuits 4-1 to 4-n is selected, or the coefficients a and b are set, to restore the original 6ch (Lf, C, Rf, Ls, Rs, Lfe). The multi-channels are also downmixed by equation (2) to generate stereo 2ch data (L, R).

[0034] In the first embodiment above, one kind of correlated signals "1" to "6" is predictively coded. Alternatively, both this group of signals "1" to "6" and the group of original signals (Lf, C, Rf, Ls, Rs, Lfe) may be predictively coded, and the group with the higher compression ratio selected.

[0035]

[Effects of the Invention] As described above, according to the present invention, data is formatted into a data structure having a subpacket containing the compressed data and a synchronization information section containing its sampling frequency and number of quantization bits. Decoding efficiency on the playback side can therefore be improved when a multi-channel audio signal is coded at a variable compression ratio.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a first embodiment of an audio encoding device and an audio decoding device according to the present invention.

FIG. 2 is a block diagram showing the encoding unit of FIG. 1 in detail.

FIG. 3 is an explanatory diagram showing a bit stream encoded by the encoding unit of FIGS. 1 and 2.

FIG. 4 is an explanatory diagram showing the format of a DVD pack.

FIG. 5 is an explanatory diagram showing the format of a DVD audio pack.

FIG. 6 is an explanatory diagram showing the format of the audio data area of FIG. 5 in detail.

FIG. 7 is a block diagram showing the decoding unit of FIG. 1 in detail.

FIG. 8 is a timing chart showing the write/read timing of the input buffer of FIG. 7.

FIG. 9 is an explanatory diagram showing the amount of compressed data for each access unit.

FIG. 10 is an explanatory diagram showing access units and presentation units.

FIG. 11 is a flowchart showing an audio transmission method.

FIG. 12 is a flowchart showing an audio transmission method.

FIG. 13 is a block diagram showing an audio encoding device and an audio decoding device according to a third embodiment.

FIG. 14 is a block diagram showing a fourth embodiment of the audio encoding device and the audio decoding device according to the present invention.

FIG. 15 is a block diagram showing the audio encoding device of the fourth embodiment.

FIG. 16 is an explanatory diagram of another embodiment, corresponding to FIG. 6.

FIG. 17 is a block diagram showing the audio decoding device of the fourth embodiment.

[Explanation of symbols]

1': 6ch mix & matrix circuit
13D1, 13D2, 15D1 to 15D4: prediction circuits (constitute the compression means together with the buffer/selectors 14D1, 14D2, 16D1 to 16D4)
14D1, 14D2, 16D1 to 16D4: buffer/selectors
17: selection signal/DTS generator (timing generating means)
17c: PTS generator (timing generating means)
19: formatting circuit (formatting means)
21: deformatting circuit (separating means)
22: unpacking circuit
22a: input buffer
24D1, 24D2, 23D1 to 23D4: prediction circuits (expansion means)
100: control unit
102: D/A converter
110: output buffer

Claims

[Claims]

1. An audio encoding apparatus comprising: compression means for compressing a multi-channel audio signal having a certain sampling frequency and number of quantization bits by a predictive coding method for each channel; and formatting means for formatting into a data structure having a subpacket including the data for each channel compressed by the compression means, and a synchronization information section including the sampling frequency and the number of quantization bits.

2. The audio encoding apparatus according to claim 1, wherein the data structure is formatted by adding a header including SCR information corresponding to the subpacket and the synchronization information section.

3. A recording medium on which a multi-channel audio signal having a certain sampling frequency and number of quantization bits is compressed by a predictive coding method for each channel, formatted into a data structure having a subpacket including the compressed data for each channel and a synchronization information section including the sampling frequency and the number of quantization bits, and recorded.

4. The recording medium according to claim 3, wherein the data structure is formatted by adding a header including SCR information corresponding to a sub-packet and a synchronization information part.

5. An audio decoding apparatus for decoding a data structure in which a multi-channel audio signal having a certain sampling frequency and number of quantization bits is compressed by a predictive coding method for each channel, the data structure having a subpacket including the compressed data for each channel and a synchronization information section including the sampling frequency and the number of quantization bits, the apparatus comprising: means for separating the data structure into the subpacket and the synchronization information section; expansion means for decompressing the compressed data in the subpacket for each channel; and means for converting the expanded audio data into an analog audio signal based on the sampling frequency and the number of quantization bits in the synchronization information section.

6. An audio decoding apparatus for decoding a data structure in which a multi-channel audio signal having a certain sampling frequency and number of quantization bits is compressed by a predictive coding method for each channel, the data structure having a subpacket including the compressed data for each channel, a synchronization information section including the sampling frequency and the number of quantization bits, and a header including SCR information corresponding to the subpacket and the synchronization information section, the apparatus comprising: first separating means for separating the SCR information included in the header; a buffer for holding the subpacket and the synchronization information section based on the separated SCR information; second separating means for separating the subpacket and the synchronization information section held in the buffer; expansion means for decompressing the compressed data in the subpacket for each channel based on the identifier in the synchronization information section; and means for converting the expanded audio data into an analog audio signal based on the sampling frequency and the number of quantization bits in the synchronization information section.

7. An audio transmission method comprising: compressing a multi-channel audio signal having a certain sampling frequency and number of quantization bits by a predictive coding method for each channel; formatting the result into a data structure having a subpacket including the compressed data for each channel and a synchronization information section including the sampling frequency and the number of quantization bits; and transmitting a packet having the data structure via a communication line.

8. The audio transmission method according to claim 7, wherein the data structure is formatted by adding a header including SCR information corresponding to a subpacket and a synchronization information part.

9. A transmission medium for transmitting a packet in which a multi-channel audio signal having a certain sampling frequency and number of quantization bits is compressed by a predictive coding method for each channel and formatted into a data structure having a subpacket including the compressed data for each channel and a synchronization information section including the sampling frequency and the number of quantization bits.