JP3606458B2

JP3606458B2 - Audio signal transmission method and audio decoding method

Info

Publication number: JP3606458B2
Application number: JP2001265425A
Authority: JP
Inventors: 徳彦渕上; 昭治植野; 美昭田中
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1998-10-13
Filing date: 2001-09-03
Publication date: 2005-01-05
Anticipated expiration: 2019-10-13
Also published as: JP2002169598A

Description

【０００１】
【発明の属する技術分野】
本発明は、音声信号を予測符号化して圧縮するための音声符号化方法により符号化された音声信号を伝送する音声信号伝送方法及びその音声信号を復号する音声復号方法に関する。
【０００２】
【従来の技術】
音声信号を予測符号化する方法として、本発明者は先の出願（特願平９−２８９１５９号）において１チャネル（チャンネル）の原デジタル音声信号に対して、特性が異なる複数の予測器により時間領域における過去の信号から現在の信号の複数の線形予測値を算出し、原デジタル音声信号と、この複数の線形予測値から予測器毎の予測残差を算出し、この複数の予測残差の最小値を選択する方法を提案している。
【０００３】
【発明が解決しようとする課題】
しかしながら、上記方法では原デジタル音声信号がサンプリング周波数＝９６ｋＨｚ、量子化ビット数＝２０ビット程度の場合に、ある程度の圧縮効果を得ることができるが、近年のＤＶＤオーディオディスクでは、この２倍のサンプリング周波数（＝１９２ｋＨｚ）が使用され、また、量子化ビット数も２４ビットが使用される傾向があるので、圧縮率を改善する必要がある。
【０００４】
そこで本発明は、音声信号を予測符号化する場合に圧縮率を改善することができる音声符号化方法により符号化された音声信号の伝送方法及び復号方法を提供することを目的とする。
【０００５】
【課題を解決するための手段】
本発明は上記目的を達成するために、以下の１）及び２）に記載の手段よりなる。
すなわち、
【０００６】
１）同一サンプリング周波数の第１及び第２の２系統の音声信号をマトリクス演算して互いに相関ある２つの相関チャネルに変換するステップと、
前記ステップにより変換された２つの相関チャネルを含む音声信号を、チャネル毎に、入力される音声信号に応答して先頭サンプル値を得ると共に、特性が異なる複数の線形予測方法により時間領域の過去から現在の信号の線形予測値がそれぞれ予測され、その予測される線形予測値と前記音声信号とから得られる予測残差が最小となるような線形予測方法を選択して予測符号化するステップと、
ヘッダ情報と、圧縮ＰＣＭプライベートヘッダ及びオーディオ圧縮ＰＣＭデータ部を含むユーザデータと、を含んだデータ構造にすると共に、前記オーディオ圧縮ＰＣＭデータ部を複数のアクセスユニットにより構成し、前記ステップにより選択された各チャネルの先頭サンプル値と予測残差と線形予測方法を含む予測符号化データを、前記アクセスユニット内に配置されるサブパケット内に格納し、前記音声信号のＵＰＣ／ＥＡＮ−ＩＳＲＣ番号及びＵＰＣ／ＥＡＮ−ＩＳＲＣデータを前記圧縮ＰＣＭプライベートヘッダ内に配置するステップからなる音声符号化方法により符号化された音声信号を伝送する音声信号伝送方法であって、
前記選択された先頭サンプル値と予測残差と線形予測方法とを含む予測符号化データと前記音声信号のＵＰＣ／ＥＡＮ−ＩＳＲＣ番号及びＵＰＣ／ＥＡＮ−ＩＳＲＣデータとをパケット化して伝送することを特徴とする音声信号伝送方法。
２）同一サンプリング周波数の第１及び第２の２系統の音声信号をマトリクス演算して互いに相関ある２つの相関チャネルに変換するステップと、
前記ステップにより変換された２つの相関チャネルを含む音声信号を、チャネル毎に、入力される音声信号に応答して先頭サンプル値を得ると共に、特性が異なる複数の線形予測方法により時間領域の過去から現在の信号の線形予測値がそれぞれ予測され、その予測される線形予測値と前記音声信号とから得られる予測残差が最小となるような線形予測方法を選択して予測符号化するステップと、
ヘッダ情報と、圧縮ＰＣＭプライベートヘッダ及びオーディオ圧縮ＰＣＭデータ部を含むユーザデータと、を含んだデータ構造にすると共に、前記オーディオ圧縮ＰＣＭデータ部を複数のアクセスユニットにより構成し、前記ステップにより選択された各チャネルの先頭サンプル値と予測残差と線形予測方法を含む予測符号化データを、前記アクセスユニット内に配置されるサブパケット内に格納し、前記音声信号のＵＰＣ／ＥＡＮ−ＩＳＲＣ番号及びＵＰＣ／ＥＡＮ−ＩＳＲＣデータを前記圧縮ＰＣＭプライベートヘッダ内に配置するステップからなる音声符号化方法により符号化されたデータから元の音声信号を復号する音声復号方法であって、
前記選択された先頭サンプル値と予測残差と線形予測方法を含む予測符号化データから予測値を算出するステップと、
この算出された予測値から前記第１の複数チャネルのデジタル音声信号を復元するステップと、
からなる音声復号方法。
【０００７】
【発明の実施の形態】
以下、図面を参照して本発明の実施の形態を説明する。図１は本発明に係る音声信号伝送方法を適用した音声符号化装置とそれに対応した音声復号装置の第１の実施形態を示すブロック図、図２は図１のエンコーダを詳しく示すブロック図、図３は図２のマルチプレクサにより多重化される１フレームのフォーマットを示す説明図、図４はＤＶＤのパックのフォーマットを示す説明図、図５はＤＶＤのオーディオパックのフォーマットを示す説明図、図６は図１のデコーダを詳しく示すブロック図である。
【０００８】
図１に示すチャネル相関回路Ａは加算回路１ａと減算回路１ｂを有する。加算回路１ａは各チャネル（以下、ｃｈ）が例えばサンプリング周波数＝１９２ｋＨｚ、量子化ビット数＝２４ビットのステレオ２ｃｈ信号Ｌ、Ｒの和信号（Ｌ＋Ｒ）を算出して和ｃｈ用１ｃｈロスレス・エンコーダ２Ｄ１に出力し、減算回路１ｂは差信号（Ｌ−Ｒ）を算出して差ｃｈ用１ｃｈロスレス・エンコーダ２Ｄ２に出力する。エンコーダ２Ｄ１、２Ｄ２は図２に詳しく示すように、それぞれ和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）の差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）を予測符号化して記録媒体や通信媒体を介して伝送する。
【０００９】
そして、復号側では、図６に詳しく示すようにデコーダ３Ｄ１、３Ｄ２がそれぞれ各ｃｈの予測符号化データを和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）に復号し、次いでチャネル相関回路Ｂがこの和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）をステレオ２ｃｈ信号Ｌ、Ｒに復元する。
【００１０】
図２を参照してエンコーダ２Ｄ１、２Ｄ２について詳しく説明する。和信号（Ｌ＋Ｒ）と差信号（Ｌ−Ｒ）は１フレーム毎に１フレームバッファ１０に格納される。そして、１フレームの各サンプル値（Ｌ＋Ｒ）、（Ｌ−Ｒ）がそれぞれ差分演算回路１１Ｄ１、１１Ｄ２に印加され、今回と前回の差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）、すなわち差分ＰＣＭ（ＤＰＣＭ）データが算出される。また、各フレームの先頭サンプル値（Ｌ＋Ｒ）、（Ｌ−Ｒ）がマルチプレクサ１９に印加される。
【００１１】
差分演算回路１１Ｄ１により算出された差分Δ（Ｌ＋Ｒ）は、予測係数が異なる複数の予測器１２ａ−１〜１２ａ−ｎと減算器１３ａ−１〜１３ａ−ｎに印加される。そして、予測器１２ａ−１〜１２ａ−ｎではそれぞれ各予測係数に基づいて差分Δ（Ｌ＋Ｒ）の各予測値が算出され、減算器１３ａ−１〜１３ｂ−ｎではそれぞれこの各予測値と差分Δ（Ｌ＋Ｒ）の各予測残差が算出される。バッファ・選択器１６Ｄ１はこの複数の予測残差を一時記憶して、選択信号生成器１７により指定されたサブフレーム毎に最小の予測残差を選択し、パッキング回路１８に出力する。なお、このサブフレームはフレームの数十分の１程度のサンプル長であり、一例として１フレームを８０サブフレームとする。ここで、予測器１２ａ−１〜１２ａ−ｎと減算器１３ａ−１〜１３ａ−ｎは和信号ｃｈの予測回路１５Ｄ１を構成し、また、この予測回路１５Ｄ１とバッファ・選択器１６Ｄ１は和信号ｃｈの予測符号化回路を構成している。
【００１２】
同様に、差分演算回路１１Ｄ２により算出された差分Δ（Ｌ−Ｒ）は、予測係数が異なる複数の予測器１２ｂ−１〜１２ｂ−ｎと減算器１３ｂ−１〜１３ｂ−ｎに印加される。そして、予測器１２ｂ−１〜１２ｂ−ｎではそれぞれ各予測係数に基づいて差分Δ（Ｌ−Ｒ）の各予測値が算出され、減算器１３ｂ−１〜１３ｂ−ｎではそれぞれこの各予測値と差分Δ（Ｌ−Ｒ）の各予測残差が算出される。バッファ・選択器１６Ｄ２はこの複数の予測残差を一時記憶して、選択信号生成器１７により指定されたサブフレーム毎に最小の予測残差を選択し、パッキング回路１８に出力する。予測器１２ｂ−１〜１２ｂ−ｎと減算器１３ｂ−１〜１３ｂ−ｎは差信号ｃｈの予測回路１５Ｄ２を構成し、また、この予測回路１５Ｄ２とバッファ・選択器１６Ｄ２は差信号ｃｈの予測符号化回路を構成している。
【００１３】
選択信号生成器１７は予測残差のビット数フラグ（５ビット）をパッキング回路１８とマルチプレクサ１９に対して印加し、また、予測残差が最小の予測器を示す予測器選択フラグ（その数ｎが２〜９個として３ビット）をマルチプレクサ１９に対して印加する。パッキング回路１８はバッファ・選択器１６Ｄ１、１６Ｄ２により選択された２ｃｈ分の予測残差を、選択信号生成器１７により指定されたビット数フラグに基づいて指定ビット数でパッキングする。
【００１４】
続くマルチプレクサ１９は図３に示すように１フレーム分に対して
・フレームヘッダ（４０ビット）と、
・和信号ｃｈ（Ｌ＋Ｒ）の１フレームの先頭サンプル値（２５ビット）と、
・差信号ｃｈ（Ｌ−Ｒ）の１フレームの先頭サンプル値（２５ビット）と、
・和信号ｃｈ（Ｌ＋Ｒ）のサブフレーム毎の予測器選択フラグ（３ビット×８０）と、
・差信号ｃｈ（Ｌ−Ｒ）のサブフレーム毎の予測器選択フラグ（３ビット×８０）と、
・和信号ｃｈ（Ｌ＋Ｒ）のサブフレーム毎のビット数フラグ（５ビット×８０）と、
・差信号ｃｈ（Ｌ−Ｒ）のサブフレーム毎のビット数フラグ（５ビット×８０）と、
・和信号ｃｈ（Ｌ＋Ｒ）の予測残差データ列（可変ビット数）と、
・差信号ｃｈ（Ｌ−Ｒ）の予測残差データ列（可変ビット数）とを
アクセスユニットとして多重化し、可変レートビットストリームとして出力する。上記予測残差データ列はサブパケットを構成する。このような予測符号化によれば、原信号が例えばサンプリング周波数＝１９２ｋＨｚ、量子化ビット数＝２４ビット、２チャネルの場合、５９％の圧縮率を実現することができる。
【００１５】
また、この可変レートビットストリームデータをＤＶＤオーディオディスクに記録する場合には、図４に示す圧縮ＰＣＭのオーディオ（Ａ）パックにパッキングされる。このパックは２０３４バイトのユーザデータ（Ａパケット、Ｖパケット）に対して４バイトのパックスタート情報と、６バイトのＳＣＲ（ＳｙｓｔｅｍＣｌｏｃｋＲｅｆｅｒｅｎｃｅ：システム時刻基準参照値）情報と、３バイトのＭｕｘレート（ｒａｔｅ）情報と１バイトのスタッフィングの合計１４バイトのパックヘッダが付加されて構成されている（１パック＝合計２０４８バイト）。この場合、タイムスタンプであるＳＣＲ情報を、ＡＣＢユニット内の先頭パックでは「１」として同一タイトル内で連続とすることにより同一タイトル内のＡパックの時間を管理することができる。
【００１６】
圧縮ＰＣＭのＡパケットは図５に詳しく示すように、１７、９又は１４バイトのパケットヘッダと、プライベートヘッダと、図３に示すフォーマットの１ないし２０１５バイトのオーディオ圧縮ＰＣＭデータにより構成されている。圧縮ＰＣＭのプライベートヘッダは、
・１バイトのサブストリームＩＤと、
・２バイトのＵＰＣ／ＥＡＮ−ＩＳＲＣ（ＵｎｉｖｅｒｓａｌＰｒｏｄｕｃｔＣｏｄｅ／ＥｕｒｏｐｅａｎＡｒｔｉｃｌｅＮｕｍｂｅｒ−ＩｎｔｅｒｎａｔｉｏｎａｌＳｔａｎｄａｒｄＲｅｃｏｒｄｉｎｇＣｏｄｅ）番号、及びＵＰＣ／ＥＡＮ−ＩＳＲＣデータと、
・１バイトのプライベートヘッダ長と、
・２バイトの第１アクセスユニットポインタと、
・４バイトのオーディオデータ情報（ＡＤＩ）と、
・０〜７バイトのスタッフィングバイトとに、
より構成されている。
このように圧縮ＰＣＭのＡパケットのＡＤＩは、４バイトに選定され、通常の非圧縮のＰＣＭのＡパケットのＡＤＩよりも４バイトだけ短くされている。したがってオーディオデータは４バイト分増加させることができる。
【００１７】
次に図６を参照してデコーダ３Ｄ１、３Ｄ２について説明する。図３に示したフォーマットの可変レートビットストリームデータは、デマルチプレクサ２１によりフレームヘッダに基づいて分離される。そして、和信号ｃｈ（Ｌ＋Ｒ）及び差信号ｃｈ（Ｌ−Ｒ）の１フレームの先頭サンプル値はそれぞれ累積演算回路２５ａ、２５ｂに印加され、和信号ｃｈ（Ｌ＋Ｒ）及び差信号ｃｈ（Ｌ−Ｒ）の予測器選択フラグはそれぞれ予測器（２４ａ−１〜２４ａ−ｎ）、（２４ｂ−１〜２４ｂ−ｎ）の各選択信号として印加され、和信号ｃｈ（Ｌ＋Ｒ）及び差信号ｃｈ（Ｌ−Ｒ）のビット数フラグと予測残差データ列はアンパッキング回路２２に印加される。ここで、予測器（２４ａ−１〜２４ａ−ｎ）、（２４ｂ−１〜２４ｂ−ｎ）はそれぞれ、符号化側の予測器（１２ａ−１〜１２ａ−ｎ）、（１２ｂ−１〜１２ｂ−ｎ）と同一の特性であり、予測器選択フラグにより同一特性のものが選択される。
【００１８】
アンパッキング回路２２は和信号ｃｈ（Ｌ＋Ｒ）及び差信号ｃｈ（Ｌ−Ｒ）の予測残差データ列をビット数フラグ毎に基づいて分離してそれぞれ加算回路２３ａ、２３ｂに出力する。加算回路２３ａ、２３ｂではそれぞれ、アンパッキング回路２２からの和信号ｃｈ（Ｌ＋Ｒ）及び差信号ｃｈ（Ｌ−Ｒ）の今回の予測残差データと、予測器（２４ａ−１〜２４ａ−ｎ）、（２４ｂ−１〜２４ｂ−ｎ）の内、予測器選択フラグにより選択された各１つにより予測された前回の予測値が加算されて今回の予測値が算出される。この今回の予測値は、図２に示す差分回路１１ａ、１１ｂによりそれぞれ算出された差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）すなわちＤＰＣＭデータであり、予測器（２４ａ−１〜２４ａ−ｎ）、（２４ｂ−１〜２４ｂ−ｎ）と累積演算回路２５ａ、２５ｂに印加される。
【００１９】
累積演算回路２５ａ、２５ｂはそれぞれ、１フレームの先頭サンプル値に対して差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）をサンプル毎に累積加算して和信号ｃｈ（Ｌ＋Ｒ）、差信号ｃｈ（Ｌ−Ｒ）の各ＰＣＭデータを出力する。この和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）は図１に示すように加算回路４ａにより２Ｌ信号が算出されるとともに、減算回路４ｂにより２Ｒ信号が算出される。そして、２Ｌ信号と２Ｒ信号がそれぞれ割り算器５ａ、５ｂにより１／２に割り算され、元のステレオ２チャネル信号Ｌ、Ｒが復元される。
【００２０】
次に図７、図８を参照して第２の実施形態について説明する。上記の実施形態では、和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）の各差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）、すなわちＤＰＣＭデータのみを予測符号化するように構成されているが、この第２の実施形態では和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）すなわちＰＣＭデータ、又はその各差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）すなわちＤＰＣＭデータを選択的に予測符号化するように構成されている。
【００２１】
このため図７に示す符号化装置では、図２に示す構成に対して和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）をそれぞれ予測符号化するための予測回路１５Ａ、１５Ｓとバッファ・選択器１６Ａ、１６Ｓが追加されている。また、選択信号生成器１７はバッファ・選択器１６Ａ、１６Ｓによりそれぞれ選択された和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）と、バッファ・選択器１６Ｄ１、１６Ｄ２によりそれぞれ選択された差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）の各予測残差の最小値に基づいて、ＰＣＭデータとＤＰＣＭデータのどちらが圧縮率が高いか否かを判断し、高い方のデータを選択する。このとき、そのＰＣＭ／ＤＰＣＭの選択フラグ（予測回路選択フラグ）を追加して多重化する。
【００２２】
ここで、図７に示す和信号（Ｌ＋Ｒ）の予測回路１５Ａと差分Δ（Ｌ＋Ｒ）の予測回路１５Ｄ１が同一の構成であり、また、差信号（Ｌ−Ｒ）の予測回路１５Ｓと差分Δ（Ｌ−Ｒ）の予測回路１５Ｄ２が同一の構成である場合、復号装置では図８に示すようにＰＣＭデータとＤＰＣＭデータの両方の予測回路を設ける必要はなく、１つのデータ分の予測回路でよい。そして、符号化装置から伝送された予測回路選択フラグに基づいてセレクタ２６ａ、２６ｂにより、ＤＰＣＭデータの場合には累積演算回路２５ａ、２５ｂの出力を選択し、ＰＣＭデータの場合には加算回路２３ａ、２３ｂの出力を選択する。
【００２３】
第３の実施形態では図９に示すように、原信号Ｌ、Ｒ（ＰＣＭデータ）と、和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）（ＰＣＭデータ）と、その各差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）（ＤＰＣＭデータ）の３グループの１つを選択的に予測符号化するように構成されている。
【００２４】
このため図９に示す符号化装置では、図７に示す構成に対して原信号Ｌ、Ｒをそれぞれ予測符号化するための予測回路１５Ｌ、１５Ｒとバッファ・選択器１６Ｌ、１６Ｒが追加されている。また、選択信号生成器１７はバッファ・選択器１６Ｌ、１６Ｒにより選択された原信号Ｌ、Ｒと、バッファ・選択器１６Ａ、１６Ｓにより選択された和信号（Ｌ＋Ｒ）、差信号（Ｌ−Ｒ）と、バッファ・選択器１６Ｄ１、１６Ｄ２により選択された各差分Δ（Ｌ＋Ｒ）、Δ（Ｌ−Ｒ）の各予測残差の最小値に基づいて圧縮率が高いグループのデータを選択する。このとき、その選択フラグ（予測回路選択フラグ）を追加して多重化する。
【００２５】
また、図９に示す３グループの予測回路が同一の構成である場合、復号装置では図１０に示すように３グループ分の予測回路を設ける必要はなく、１つのグループ分の予測回路でよい。そして、符号化装置から伝送された予測回路選択フラグに基づいて、ＤＰＣＭデータの場合には累積演算回路２５ａ、２５ｂの出力を選択し、ＰＣＭデータの場合には加算回路２３ａ、２３ｂの出力を選択してチャネル相関回路Ｂにより原信号Ｌ、Ｒを復元する。そして、更にセレクタ２７ａ、２７ｂにより原信号Ｌ、Ｒのグループの場合には加算回路２３ａ、２３ｂの出力を選択し、他の場合にはチャネル相関回路Ｂの出力を選択する
【００２６】
また、符号化側により予測符号化された可変レートビットストリームデータをネットワークを介して伝送する場合には、符号化側では図１１に示すように伝送用にパケット化し（ステップＳ４１）、次いでパケットヘッダを付与し（ステップＳ４２）、次いでこのパケットをネットワーク上に送り出す（ステップＳ４３）。復号側では図１２に示すようにヘッダを除去し（ステップＳ５１）、次いでデータを復元し（ステップＳ５２）、次いでこのデータをメモリに格納して復号を待つ（ステップＳ５３）。
【００２７】
図５に示す圧縮ＰＣＭ（ＰＰＣＭ）のオーディオ（Ａ）パケットの図３と異なる態様を図１３に示す。この異なる態様では、圧縮ＰＣＭ（ＰＰＣＭ）のオーディオ（Ａ）パケットにおけるオーディオデータエリアは、図１３に示すように複数のＰＰＣＭアクセスユニットにより構成され、ＰＰＣＭアクセスユニットはＰＰＣＭシンク情報とサブパケットにより構成されている。最初のＰＰＣＭアクセスユニット内のサブパケットは、ディレクトリと、ビットトリームＢＳ０と、ＣＲＣと、エクストラ情報により構成され、ビットストリームＢＳ０はＰＰＣＭブロックのみにより構成されている。２番目以降のＰＰＣＭアクセスユニット内のサブパケットは、ディレクトリを除いてビットストリームＢＳ０と、ＣＲＣと、エクストラ情報により構成され、フレーム先頭のサブストリームＢＳ０はリスタートヘッダとＰＰＣＭブロック（フレーム先頭サンプル値を含む）により構成されている。
【００２８】
ＰＰＣＭシンク情報（以下、同期情報ともいう）は次の情報を含む。
・１パケット当たりのサンプル数：サンプリング周波数ｆｓに応じて４０、８０又は１６０が選択される。
・データレート：ＶＢＲの場合には「０」（サブパケット内のデータが圧縮データであることを示す識別子）
・サンプリング周波数ｆｓ及び量子化ビット数Ｑｂ
・チャネル割り当て情報
リスタートヘッダはフレーム毎にチャネル相関回路Ａ（加算回路と減算回路を有すること）を明記した情報を有している。図１３に示したフォーマットの可変レートビットストリームデータは、図６のデマルチプレクサ２１以下の構成からなるデコーダ３Ｄ１、３Ｄ２により元の２チャネルオーディオ信号に復号される。
【００２９】
図１４は、本発明に係る音声符号化装置及び音声復号装置の第２の実施形態を示すブロック図である。図１４に示すチャネル相関回路Ａ−１は加算回路１ａと減算回路１ｂを有する。加算回路１ａはステレオ２ｃｈ信号Ｌ、Ｒの和信号（Ｌ＋Ｒ）を算出し、この和信号（Ｌ＋Ｒ）を割り算器５ａにより１／２に割り算してから、ロスレス・エンコーダ２Ｄに出力し、減算回路１ｂは差信号（Ｌ−Ｒ）を算出し、この差信号（Ｌ−Ｒ）を割り算器５ｂにより１／２に割り算してから、ロスレス・エンコーダ２Ｄに出力する。ロスレス・エンコーダ２Ｄは、１／２（Ｌ＋Ｒ）と１／２（Ｌ−Ｒ）を用いてこれらを多重化して多重化信号２５０を作る。多重化信号２５０はロスレス・デコーダ３Ｄによりデコードされて、元の１／２（Ｌ＋Ｒ）と１／２（Ｌ−Ｒ）が得られ、これらが、チャネル相関回路Ｂ−１を構成する加算回路４ａと減算回路４ｂにそれぞれ与えられ、出力信号としてステレオ２ｃｈのＬ信号とＲ信号が得られる。なお、ロスレス・エンコーダ２Ｄとロスレス・エンコーダ２Ｄにおける一連の動作である、差分の算出、予測値の算出、最小予測残差の選択、最小予測残差を用いた予測値の算出などは、第１の実施の形態と同様に行われる。図１３に示したフォーマットの可変レートビットストリームデータは、図１のチャネル相関回路を用いたか、図１４のチャネル相関回路を用いたかを例えばＰＰＣＭアクセスユニットのリスタートヘッダに格納した識別子で識別するようにしているので、いずれであっても確実にデコードできる。なお、フレーム毎のロスレス圧縮を例に説明したが、固定ではなく、区間は可変の長さにしてもよい。
【００３０】
【発明の効果】
以上説明したように本発明によれば、特に、同一サンプリング周波数の第１及び第２の２系統の音声信号をマトリクス演算して互いに相関あるチャネルに変換した２つの相関信号を、チャネル毎に入力される音声信号に応答して先頭サンプル値を得ると共に、時間領域に過去の信号から予測される現在の信号の複数の予測値の中でその予測残差が最小となる線形予測方式によりロスレス圧縮するようにして圧縮率の改善を図った音声信号を伝送し、不都合なく音声信号を復号できる。
【図面の簡単な説明】
【図１】本発明を適用した音声符号化装置とそれに対応する音声復号装置の第１の実施形態を示すブロック図である。
【図２】図１のエンコーダを詳しく示すブロック図である。
【図３】図２のマルチプレクサにより多重化される１フレームのフォーマットを示す説明図である。
【図４】ＤＶＤのパックのフォーマットを示す説明図である。
【図５】ＤＶＤのオーディオパックのフォーマットを示す説明図である。
【図６】図１のデコーダを詳しく示すブロック図である。
【図７】第２の実施形態のエンコーダを示すブロック図である。
【図８】第２の実施形態のデコーダを示すブロック図である。
【図９】第３の実施形態のエンコーダを示すブロック図である。
【図１０】第３の実施形態のデコーダを示すブロック図である。
【図１１】音声伝送方法を示すフローチャートである。
【図１２】音声伝送方法を示すフローチャートである。
【図１３】図５に示す圧縮ＰＣＭ（ＰＰＣＭ）のオーディオ（Ａ）パケットの図３と異なる態様を示すフォーマット説明図である。
【図１４】本発明を適用した音声符号化装置とそれに対応した音声復号装置の第２の実施形態を示すブロック図である。
【符号の説明】
１ａ、４ａ加算回路（加算手段）
１ｂ、４ｂ減算回路（減算手段）
５ａ、５ｂ割り算器
１１Ｄ１差分演算回路（第１の差分演算手段）
１１Ｄ２差分演算回路（第２の差分演算手段）
１２ａ−１〜１２ａ−ｎ予測器（減算器１３ａ−１〜１３ａ−ｎ、バッファ・選択器１６Ｄ１と共に第１の予測符号化手段を構成する。）
１２ｂ−１〜１２ｂ−ｎ予測器（減算器１３ｂ−１〜１３ｂ−ｎ、バッファ・選択器１６Ｄ２と共に第２の予測符号化手段を構成する。）
１３ａ−１〜１３ａ−ｎ，１３ｂ−１〜１３ｂ−ｎ減算器
１６Ｄ１，１６Ｄ２，１６Ａ，１６Ｓ，１６Ｌ，１６Ｒバッファ・選択器
１５Ａ予測回路（バッファ・選択器１６Ａと共に第３の予測符号化手段を構成する。）
１５Ｓ予測回路（バッファ・選択器１６Ｓと共に第４の予測符号化手段を構成する。）
１５Ｌ予測回路（バッファ・選択器１６Ｌと共に第５の予測符号化手段を構成する。）
１５Ｒ予測回路（バッファ・選択器１６Ｒと共に第６の予測符号化手段を構成する。）
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an audio signal transmission method for transmitting an audio signal encoded by an audio encoding method that compresses the audio signal by predictive coding, and to an audio decoding method for decoding that audio signal.
[0002]
[Prior art]
As a method for predictively encoding an audio signal, the present inventors proposed the following method in an earlier application (Japanese Patent Application No. 9-289159). For a one-channel original digital audio signal, a plurality of predictors having different characteristics calculate a plurality of linear prediction values of the current signal from past signals in the time domain. A prediction residual is then calculated for each predictor from the original digital audio signal and the plurality of linear prediction values, and the minimum of these prediction residuals is selected.
[0003]
[Problems to be solved by the invention]
However, the above method achieves a certain degree of compression when the original digital audio signal has a sampling frequency of about 96 kHz and about 20 quantization bits, whereas recent DVD-Audio discs tend to use twice that sampling frequency (192 kHz) and 24 quantization bits, so the compression rate needs to be improved.
[0004]
Therefore, an object of the present invention is to provide a transmission method and a decoding method for an audio signal encoded by an audio encoding method that can improve the compression rate when the audio signal is predictively encoded.
[0005]
[Means for Solving the Problems]
In order to achieve the above object, the present invention comprises the means described in 1) and 2) below.
That is,
[0006]
1) An audio signal transmission method for transmitting an audio signal encoded by an audio encoding method consisting of the following steps:
a step of converting first and second audio signals of two systems having the same sampling frequency into two mutually correlated correlation channels by a matrix operation;
a step of predictively encoding the audio signal including the two correlation channels converted in the preceding step by, for each channel, obtaining a head sample value in response to the input audio signal, predicting linear prediction values of the current signal from past signals in the time domain with each of a plurality of linear prediction methods having different characteristics, and selecting the linear prediction method that minimizes the prediction residual obtained from the predicted linear prediction value and the audio signal; and
a step of forming a data structure including header information and user data that includes a compressed PCM private header and an audio compressed PCM data portion, configuring the audio compressed PCM data portion from a plurality of access units, storing predictive encoded data including the head sample value, prediction residual, and linear prediction method of each channel selected in the preceding step in a subpacket arranged in the access unit, and placing the UPC/EAN-ISRC number and UPC/EAN-ISRC data of the audio signal in the compressed PCM private header;
the audio signal transmission method being characterized in that the predictive encoded data including the selected head sample value, prediction residual, and linear prediction method, together with the UPC/EAN-ISRC number and UPC/EAN-ISRC data of the audio signal, are packetized and transmitted.
2) An audio decoding method for decoding the original audio signal from data encoded by an audio encoding method consisting of the following steps:
a step of converting first and second audio signals of two systems having the same sampling frequency into two mutually correlated correlation channels by a matrix operation;
a step of predictively encoding the audio signal including the two correlation channels converted in the preceding step by, for each channel, obtaining a head sample value in response to the input audio signal, predicting linear prediction values of the current signal from past signals in the time domain with each of a plurality of linear prediction methods having different characteristics, and selecting the linear prediction method that minimizes the prediction residual obtained from the predicted linear prediction value and the audio signal; and
a step of forming a data structure including header information and user data that includes a compressed PCM private header and an audio compressed PCM data portion, configuring the audio compressed PCM data portion from a plurality of access units, storing predictive encoded data including the head sample value, prediction residual, and linear prediction method of each channel selected in the preceding step in a subpacket arranged in the access unit, and placing the UPC/EAN-ISRC number and UPC/EAN-ISRC data of the audio signal in the compressed PCM private header;
the audio decoding method comprising a step of calculating a prediction value from the predictive encoded data including the selected head sample value, prediction residual, and linear prediction method, and a step of restoring the first plural-channel digital audio signals from the calculated prediction value.
[0007]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described below with reference to the drawings. FIG. 1 is a block diagram showing a first embodiment of an audio encoding apparatus to which the audio signal transmission method according to the present invention is applied and the corresponding audio decoding apparatus; FIG. 2 is a block diagram showing the encoder of FIG. 1 in detail; FIG. 3 is an explanatory diagram showing the format of one frame multiplexed by the multiplexer of FIG. 2; FIG. 4 is an explanatory diagram showing the format of a DVD pack; FIG. 5 is an explanatory diagram showing the format of a DVD audio pack; and FIG. 6 is a block diagram showing the decoder of FIG. 1 in detail.
[0008]
The channel correlation circuit A shown in FIG. 1 has an adder circuit 1a and a subtractor circuit 1b. The adder circuit 1a calculates the sum signal (L + R) of stereo two-channel signals L and R, each channel (hereinafter, ch) having, for example, a sampling frequency of 192 kHz and 24 quantization bits, and outputs it to the 1ch lossless encoder 2D1 for the sum ch. The subtractor circuit 1b calculates the difference signal (L - R) and outputs it to the 1ch lossless encoder 2D2 for the difference ch. As shown in detail in FIG. 2, the encoders 2D1 and 2D2 predictively encode the differences Δ(L + R) and Δ(L - R) of the sum signal (L + R) and the difference signal (L - R), respectively, and transmit them via a recording medium or a communication medium.
[0009]
On the decoding side, as shown in detail in FIG. 6, the decoders 3D1 and 3D2 decode the predictive encoded data of each ch into the sum signal (L + R) and the difference signal (L - R), respectively, and the channel correlation circuit B then restores the stereo two-channel signals L and R from the sum signal (L + R) and the difference signal (L - R).
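The sum/difference conversion performed by channel correlation circuit A and its inverse in channel correlation circuit B can be sketched as follows. This is only a minimal illustration on integer PCM samples; the function names `to_sum_diff` and `from_sum_diff` are assumptions introduced here and do not appear in the specification.

```python
def to_sum_diff(left, right):
    """Sketch of channel correlation circuit A: form (L + R) and (L - R)."""
    sums = [l + r for l, r in zip(left, right)]
    diffs = [l - r for l, r in zip(left, right)]
    return sums, diffs


def from_sum_diff(sums, diffs):
    """Sketch of channel correlation circuit B.

    (L + R) + (L - R) = 2L and (L + R) - (L - R) = 2R are always even,
    so halving them with integer division is exact.
    """
    left = [(s + d) // 2 for s, d in zip(sums, diffs)]
    right = [(s - d) // 2 for s, d in zip(sums, diffs)]
    return left, right


if __name__ == "__main__":
    l, r = [100, -5, 7], [98, 3, -7]
    s, d = to_sum_diff(l, r)
    print(from_sum_diff(s, d) == (l, r))
```

Because the doubled values are always even, the round trip loses no information, which is what allows the two correlated channels to be coded losslessly.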
[0010]
The encoders 2D1 and 2D2 will be described in detail with reference to FIG. 2. The sum signal (L + R) and the difference signal (L - R) are stored frame by frame in a one-frame buffer 10. The sample values (L + R) and (L - R) of one frame are then applied to the difference calculation circuits 11D1 and 11D2, respectively, which calculate the differences Δ(L + R) and Δ(L - R) between the current and previous samples, that is, differential PCM (DPCM) data. In addition, the head sample values (L + R) and (L - R) of each frame are applied to the multiplexer 19.
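As a rough illustration of what the difference calculation circuits produce, the sketch below splits one frame into its head sample value and the sample-to-sample differences (DPCM data). The name `dpcm_encode` is hypothetical, and a frame is assumed to be a non-empty list of integers.

```python
def dpcm_encode(frame):
    """Return (head sample value, list of current-minus-previous differences)."""
    head = frame[0]
    deltas = [cur - prev for prev, cur in zip(frame, frame[1:])]
    return head, deltas


if __name__ == "__main__":
    print(dpcm_encode([10, 12, 9, 14]))  # prints (10, [2, -3, 5])
```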
[0011]
The difference Δ(L + R) calculated by the difference calculation circuit 11D1 is applied to a plurality of predictors 12a-1 to 12a-n having different prediction coefficients and to subtractors 13a-1 to 13a-n. The predictors 12a-1 to 12a-n each calculate a prediction value of the difference Δ(L + R) based on their respective prediction coefficients, and the subtractors 13a-1 to 13a-n each calculate the prediction residual between that prediction value and the difference Δ(L + R). The buffer/selector 16D1 temporarily stores the plurality of prediction residuals, selects the minimum prediction residual for each subframe designated by the selection signal generator 17, and outputs it to the packing circuit 18. Each subframe has a sample length on the order of one part in several tens of a frame; as an example, one frame is divided into 80 subframes. Here, the predictors 12a-1 to 12a-n and the subtractors 13a-1 to 13a-n constitute the prediction circuit 15D1 for the sum signal ch, and the prediction circuit 15D1 and the buffer/selector 16D1 constitute the predictive encoding circuit for the sum signal ch.
[0012]
Similarly, the difference Δ(L - R) calculated by the difference calculation circuit 11D2 is applied to a plurality of predictors 12b-1 to 12b-n having different prediction coefficients and to subtractors 13b-1 to 13b-n. The predictors 12b-1 to 12b-n each calculate a prediction value of the difference Δ(L - R) based on their respective prediction coefficients, and the subtractors 13b-1 to 13b-n each calculate the prediction residual between that prediction value and the difference Δ(L - R). The buffer/selector 16D2 temporarily stores the plurality of prediction residuals, selects the minimum prediction residual for each subframe designated by the selection signal generator 17, and outputs it to the packing circuit 18. The predictors 12b-1 to 12b-n and the subtractors 13b-1 to 13b-n constitute the prediction circuit 15D2 for the difference signal ch, and the prediction circuit 15D2 and the buffer/selector 16D2 constitute the predictive encoding circuit for the difference signal ch.
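The per-subframe choice among several predictors might look like the sketch below. The specification does not give the prediction coefficients or the exact criterion for the "minimum" residual, so both are assumptions here: three simple fixed predictors are used, and the predictor whose largest absolute residual in the subframe is smallest is chosen. All names are hypothetical.

```python
# Hypothetical predictor bank; each takes (x[n-2], x[n-1]) and predicts x[n].
PREDICTORS = [
    lambda h: 0,                  # predict zero
    lambda h: h[1],               # repeat the previous sample
    lambda h: 2 * h[1] - h[0],    # linear extrapolation
]


def residuals(signal, predict, start, stop):
    """Prediction residuals for samples start..stop-1 (history padded with zeros)."""
    out = []
    for n in range(start, stop):
        history = (signal[n - 2] if n >= 2 else 0,
                   signal[n - 1] if n >= 1 else 0)
        out.append(signal[n] - predict(history))
    return out


def select_per_subframe(signal, subframe_len):
    """For each subframe, return (predictor index, residuals) with the smallest peak residual."""
    choices = []
    for start in range(0, len(signal), subframe_len):
        stop = min(start + subframe_len, len(signal))
        candidates = [residuals(signal, p, start, stop) for p in PREDICTORS]
        best = min(range(len(candidates)),
                   key=lambda i: max(abs(r) for r in candidates[i]))
        choices.append((best, candidates[best]))
    return choices


if __name__ == "__main__":
    for index, res in select_per_subframe([0, 1, 2, 3, 4, 5, 6, 7], 4):
        print(index, res)
```

The returned index plays the role of the predictor selection flag, and the selected residuals are what would be passed on for packing.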
[0013]
The selection signal generator 17 applies a bit number flag (5 bits) for the prediction residual to the packing circuit 18 and the multiplexer 19, and applies to the multiplexer 19 a predictor selection flag indicating the predictor with the smallest prediction residual (3 bits, assuming the number of predictors n is 2 to 9). The packing circuit 18 packs the two channels of prediction residuals selected by the buffer/selectors 16D1 and 16D2 using the number of bits designated by the bit number flag from the selection signal generator 17.
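One way to picture the bit number flag and the packing step is the sketch below. It assumes residuals are stored as two's-complement fields of equal width within a subframe; the specification does not state the representation, so this is an illustrative assumption, and the names `bits_needed`, `pack`, and `unpack` are hypothetical.

```python
def bits_needed(values):
    """Smallest two's-complement width that holds every value in the list."""
    return max((v if v >= 0 else ~v).bit_length() + 1 for v in values)


def pack(values, width):
    """Concatenate the values as fixed-width two's-complement fields (first value first)."""
    mask = (1 << width) - 1
    word = 0
    for v in values:
        word = (word << width) | (v & mask)
    return word


def unpack(word, width, count):
    """Inverse of pack(): recover `count` signed values of `width` bits each."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    out = []
    for i in reversed(range(count)):
        raw = (word >> (i * width)) & mask
        out.append(raw - (1 << width) if raw & sign_bit else raw)
    return out


if __name__ == "__main__":
    residuals = [3, -4, 0, 1]
    width = bits_needed(residuals)          # would be carried in the bit number flag
    packed = pack(residuals, width)
    print(width, unpack(packed, width, len(residuals)) == residuals)
```

A 5-bit flag can carry widths up to 31, which is enough for residuals of 24-bit material under this assumed representation.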
[0014]
As shown in FIG. 3, the subsequent multiplexer 19 multiplexes, for each frame,
- a frame header (40 bits),
- the head sample value of one frame of the sum signal ch (L + R) (25 bits),
- the head sample value of one frame of the difference signal ch (L - R) (25 bits),
- the predictor selection flag for each subframe of the sum signal ch (L + R) (3 bits × 80),
- the predictor selection flag for each subframe of the difference signal ch (L - R) (3 bits × 80),
- the bit number flag for each subframe of the sum signal ch (L + R) (5 bits × 80),
- the bit number flag for each subframe of the difference signal ch (L - R) (5 bits × 80),
- the prediction residual data string of the sum signal ch (L + R) (variable number of bits), and
- the prediction residual data string of the difference signal ch (L - R) (variable number of bits)
as an access unit, and outputs the result as a variable-rate bitstream. The prediction residual data strings constitute a subpacket. With this predictive coding, a compression rate of 59% can be achieved when the original signal has, for example, a sampling frequency of 192 kHz, 24 quantization bits, and two channels.
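The fixed part of the frame format in FIG. 3, that is, everything except the variable-length residual data, can be totalled as a quick check. The constant names below are introduced only for this illustration; the bit counts are the ones listed above.

```python
FRAME_HEADER_BITS = 40
HEAD_SAMPLE_BITS = 25        # per channel
PREDICTOR_FLAG_BITS = 3      # per subframe, per channel
BIT_NUMBER_FLAG_BITS = 5     # per subframe, per channel
SUBFRAMES_PER_FRAME = 80
CHANNELS = 2                 # sum ch and difference ch


def fixed_overhead_bits():
    """Bits per frame that do not depend on the residual data."""
    per_channel = HEAD_SAMPLE_BITS + SUBFRAMES_PER_FRAME * (
        PREDICTOR_FLAG_BITS + BIT_NUMBER_FLAG_BITS)
    return FRAME_HEADER_BITS + CHANNELS * per_channel


if __name__ == "__main__":
    print(fixed_overhead_bits())  # prints 1370
```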
[0015]
When this variable-rate bitstream data is recorded on a DVD-Audio disc, it is packed into compressed PCM audio (A) packs as shown in FIG. 4. Each pack consists of 2034 bytes of user data (A packet, V packet) to which a 14-byte pack header is added, made up of 4 bytes of pack start information, 6 bytes of SCR (System Clock Reference) information, 3 bytes of Mux rate information, and 1 byte of stuffing (1 pack = 2048 bytes in total). Here, the time of the A packs within a title can be managed by setting the SCR information, which is a time stamp, to "1" in the first pack of an ACB unit and making it continuous within the same title.
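The pack size follows directly from the byte counts in the specification (2034 bytes of user data plus a 14-byte pack header), as this small check shows. The constant names are illustrative only.

```python
PACK_START_BYTES = 4
SCR_BYTES = 6
MUX_RATE_BYTES = 3
STUFFING_BYTES = 1
USER_DATA_BYTES = 2034

pack_header_bytes = PACK_START_BYTES + SCR_BYTES + MUX_RATE_BYTES + STUFFING_BYTES
pack_bytes = pack_header_bytes + USER_DATA_BYTES

if __name__ == "__main__":
    print(pack_header_bytes, pack_bytes)  # prints 14 2048
```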
[0016]
As shown in detail in FIG. 5, the compressed PCM A packet is composed of a packet header of 17, 9, or 14 bytes, a private header, and 1 to 2015 bytes of audio compressed PCM data in the format shown in FIG. 3. The compressed PCM private header is composed of
- a 1-byte substream ID,
- a 2-byte UPC/EAN-ISRC (Universal Product Code / European Article Number - International Standard Recording Code) number and UPC/EAN-ISRC data,
- a 1-byte private header length,
- a 2-byte first access unit pointer,
- 4 bytes of audio data information (ADI), and
- 0 to 7 stuffing bytes.
Thus the ADI of the compressed PCM A packet is set to 4 bytes, which is 4 bytes shorter than the ADI of an ordinary uncompressed PCM A packet. The audio data can therefore be increased by 4 bytes.
[0017]
Next, the decoders 3D1 and 3D2 will be described with reference to FIG. 6. The variable-rate bitstream data in the format shown in FIG. 3 is separated by the demultiplexer 21 based on the frame header. The head sample values of one frame of the sum signal ch (L + R) and the difference signal ch (L - R) are applied to the accumulation circuits 25a and 25b, respectively; the predictor selection flags of the sum signal ch (L + R) and the difference signal ch (L - R) are applied as selection signals to the predictors (24a-1 to 24a-n) and (24b-1 to 24b-n), respectively; and the bit number flags and prediction residual data strings of the sum signal ch (L + R) and the difference signal ch (L - R) are applied to the unpacking circuit 22. Here, the predictors (24a-1 to 24a-n) and (24b-1 to 24b-n) have the same characteristics as the encoding-side predictors (12a-1 to 12a-n) and (12b-1 to 12b-n), respectively, and the predictor with the same characteristics is selected by the predictor selection flag.
[0018]
The unpacking circuit 22 separates the prediction residual data strings of the sum signal ch (L + R) and the difference signal ch (L - R) according to each bit number flag and outputs them to the adder circuits 23a and 23b, respectively. In the adder circuits 23a and 23b, the current prediction residual data of the sum signal ch (L + R) and the difference signal ch (L - R) from the unpacking circuit 22 is added to the previous prediction value predicted by the one predictor selected by the predictor selection flag from among the predictors (24a-1 to 24a-n) and (24b-1 to 24b-n), yielding the current prediction value. This current prediction value is the difference Δ(L + R) or Δ(L - R), that is, the DPCM data calculated by the difference calculation circuits 11D1 and 11D2 shown in FIG. 2, and is applied to the predictors (24a-1 to 24a-n) and (24b-1 to 24b-n) and to the accumulation circuits 25a and 25b.
[0019]
The accumulation circuits 25a and 25b cumulatively add the differences Δ(L + R) and Δ(L - R), sample by sample, to the head sample value of one frame and output the PCM data of the sum signal ch (L + R) and the difference signal ch (L - R), respectively. From the sum signal (L + R) and the difference signal (L - R), as shown in FIG. 1, the adder circuit 4a calculates a 2L signal and the subtractor circuit 4b calculates a 2R signal. The 2L signal and the 2R signal are then halved by the dividers 5a and 5b, respectively, restoring the original stereo two-channel signals L and R.
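The accumulation and channel restoration described here can be sketched as follows. The function names are hypothetical, and the inputs are assumed to be the already-reconstructed differences for one frame of each channel.

```python
from itertools import accumulate


def accumulate_frame(head, deltas):
    """Rebuild PCM samples by cumulatively adding the differences to the head sample value."""
    return list(accumulate(deltas, initial=head))


def restore_stereo(sum_head, sum_deltas, diff_head, diff_deltas):
    """Rebuild (L + R) and (L - R), then form 2L and 2R and halve them."""
    sums = accumulate_frame(sum_head, sum_deltas)
    diffs = accumulate_frame(diff_head, diff_deltas)
    left = [(s + d) // 2 for s, d in zip(sums, diffs)]    # (2L) / 2
    right = [(s - d) // 2 for s, d in zip(sums, diffs)]   # (2R) / 2
    return left, right


if __name__ == "__main__":
    # L = [5, 7, 4], R = [1, 2, 6] gives sums [6, 9, 10] and diffs [4, 5, -2].
    print(restore_stereo(6, [3, 1], 4, [1, -7]))  # prints ([5, 7, 4], [1, 2, 6])
```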
[0020]
Next, a second embodiment will be described with reference to FIGS. 7 and 8. The above embodiment is configured to predictively encode only the differences Δ(L + R) and Δ(L - R) of the sum signal (L + R) and the difference signal (L - R), that is, the DPCM data. The second embodiment is instead configured to selectively predictively encode either the sum signal (L + R) and the difference signal (L - R), that is, the PCM data, or their differences Δ(L + R) and Δ(L - R), that is, the DPCM data.
[0021]
For this purpose, the encoding apparatus shown in FIG. 7 adds to the configuration shown in FIG. 2 prediction circuits 15A and 15S and buffer/selectors 16A and 16S for predictively encoding the sum signal (L + R) and the difference signal (L - R), respectively. The selection signal generator 17 determines whether the PCM data or the DPCM data gives the higher compression rate, based on the minimum prediction residuals of the sum signal (L + R) and the difference signal (L - R) selected by the buffer/selectors 16A and 16S, respectively, and of the differences Δ(L + R) and Δ(L - R) selected by the buffer/selectors 16D1 and 16D2, respectively, and selects the data with the higher compression rate. The PCM/DPCM selection flag (prediction circuit selection flag) is then added and multiplexed.
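A simple way to picture the PCM/DPCM decision is to compare how many bits each set of residuals would need. The cost measure below (field width times sample count, with two's-complement fields) and the tie-breaking in favour of DPCM are assumptions for this sketch; the specification only says the data with the higher compression rate is selected. All names are hypothetical.

```python
def bit_width(values):
    """Smallest two's-complement width that holds every value in the list."""
    return max((v if v >= 0 else ~v).bit_length() + 1 for v in values)


def choose_pcm_or_dpcm(pcm_residuals, dpcm_residuals):
    """Return (flag, residuals) for whichever residual set packs into fewer bits."""
    pcm_cost = bit_width(pcm_residuals) * len(pcm_residuals)
    dpcm_cost = bit_width(dpcm_residuals) * len(dpcm_residuals)
    if pcm_cost < dpcm_cost:
        return "PCM", pcm_residuals
    return "DPCM", dpcm_residuals


if __name__ == "__main__":
    flag, chosen = choose_pcm_or_dpcm([1, -1, 0, 1], [9, -8, 7, 3])
    print(flag)  # prints PCM
```

The returned flag corresponds to the prediction circuit selection flag that the encoder multiplexes into the stream so that the decoder knows which path to take.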
[0022]
Here, when the prediction circuit 15A for the sum signal (L + R) and the prediction circuit 15D1 for the difference Δ(L + R) shown in FIG. 7 have the same configuration, and the prediction circuit 15S for the difference signal (L - R) and the prediction circuit 15D2 for the difference Δ(L - R) have the same configuration, the decoding apparatus does not need prediction circuits for both the PCM data and the DPCM data, as shown in FIG. 8; a prediction circuit for one kind of data suffices. Based on the prediction circuit selection flag transmitted from the encoding apparatus, the selectors 26a and 26b select the outputs of the accumulation circuits 25a and 25b in the case of DPCM data, and the outputs of the adder circuits 23a and 23b in the case of PCM data.
[0023]
In the third embodiment, as shown in FIG. 9, one of three groups is selectively predictively encoded: the original signals L and R (PCM data); the sum signal (L + R) and the difference signal (L - R) (PCM data); and their differences Δ(L + R) and Δ(L - R) (DPCM data).
[0024]
For this purpose, the encoding apparatus shown in FIG. 9 adds to the configuration shown in FIG. 7 prediction circuits 15L and 15R and buffer/selectors 16L and 16R for predictively encoding the original signals L and R, respectively. The selection signal generator 17 selects the group of data with the higher compression rate, based on the minimum prediction residuals of the original signals L and R selected by the buffer/selectors 16L and 16R, of the sum signal (L + R) and the difference signal (L - R) selected by the buffer/selectors 16A and 16S, and of the differences Δ(L + R) and Δ(L - R) selected by the buffer/selectors 16D1 and 16D2. The selection flag (prediction circuit selection flag) is then added and multiplexed.
[0025]
In addition, when the three groups of prediction circuits shown in FIG. 9 have the same configuration, the decoding apparatus does not need prediction circuits for three groups, as shown in FIG. 10; prediction circuits for one group suffice. Based on the prediction circuit selection flag transmitted from the encoding apparatus, the outputs of the accumulation circuits 25a and 25b are selected in the case of DPCM data and the outputs of the adder circuits 23a and 23b are selected in the case of PCM data, and the original signals L and R are restored by the channel correlation circuit B. Further, the selectors 27a and 27b select the outputs of the adder circuits 23a and 23b in the case of the group of original signals L and R, and select the output of the channel correlation circuit B in the other cases.
[0026]
When the variable-rate bitstream data predictively encoded on the encoding side is transmitted via a network, the encoding side packetizes it for transmission as shown in FIG. 11 (step S41), then adds a packet header (step S42), and then sends the packet out onto the network (step S43). As shown in FIG. 12, the decoding side removes the header (step S51), then restores the data (step S52), and then stores the data in memory to await decoding (step S53).
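Steps S41 to S43 and S51 to S53 can be sketched as below. The specification does not define the packet header, so the 4-byte header used here (a sequence number and a payload length) is purely an assumption for illustration, as are the function names.

```python
import struct

# Hypothetical header: 16-bit sequence number, 16-bit payload length (big-endian).
_HEADER = struct.Struct(">HH")


def packetize(stream, payload_size):
    """S41/S42: split the bitstream into payloads and prepend a header to each."""
    packets = []
    for seq, offset in enumerate(range(0, len(stream), payload_size)):
        payload = stream[offset:offset + payload_size]
        packets.append(_HEADER.pack(seq, len(payload)) + payload)
    return packets


def depacketize(packets):
    """S51/S52: strip the headers and reassemble the payloads in sequence order."""
    parts = []
    for packet in packets:
        seq, length = _HEADER.unpack_from(packet)
        parts.append((seq, packet[_HEADER.size:_HEADER.size + length]))
    return b"".join(payload for _, payload in sorted(parts))


if __name__ == "__main__":
    data = bytes(range(10))
    packets = packetize(data, 4)
    # Packets may arrive out of order; the sequence number restores the order.
    print(depacketize(list(reversed(packets))) == data)
```

The reassembled bytes would then be held in memory until the decoder consumes them (step S53).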
[0027]
FIG. 13 shows a variant, different from FIG. 3, of the compressed PCM (PPCM) audio (A) packet shown in FIG. 5. In this variant, the audio data area in the compressed PCM (PPCM) audio (A) packet is composed of a plurality of PPCM access units, as shown in FIG. 13, and each PPCM access unit is composed of PPCM sync information and a subpacket. The subpacket in the first PPCM access unit is composed of a directory, a bitstream BS0, a CRC, and extra information, and the bitstream BS0 is composed only of PPCM blocks. The subpackets in the second and subsequent PPCM access units omit the directory and are composed of a bitstream BS0, a CRC, and extra information; the sub-stream BS0 at the head of a frame is composed of a restart header and PPCM blocks (including the frame head sample value).
[0028]
The PPCM sync information (hereinafter also referred to as synchronization information) includes the following information.
- Number of samples per packet: 40, 80, or 160 is selected according to the sampling frequency fs.
- Data rate: "0" in the case of VBR (an identifier indicating that the data in the subpacket is compressed data)
- Sampling frequency fs and number of quantization bits Qb
- Channel assignment information
The restart header carries, for each frame, information specifying the channel correlation circuit A (that is, that it has an adder circuit and a subtractor circuit). The variable-rate bitstream data in the format shown in FIG. 13 is decoded into the original two-channel audio signal by the decoders 3D1 and 3D2, which consist of the demultiplexer 21 of FIG. 6 and the components that follow it.
[0029]
FIG. 14 is a block diagram showing a second embodiment of the audio encoding apparatus and audio decoding apparatus according to the present invention. The channel correlation circuit A-1 shown in FIG. 14 has an adder circuit 1a and a subtractor circuit 1b. The adder circuit 1a calculates the sum signal (L + R) of the stereo two-channel signals L and R, and this sum signal (L + R) is halved by the divider 5a before being output to the lossless encoder 2D. The subtractor circuit 1b calculates the difference signal (L - R), and this difference signal (L - R) is halved by the divider 5b before being output to the lossless encoder 2D. The lossless encoder 2D uses 1/2(L + R) and 1/2(L - R) and multiplexes them to create a multiplexed signal 250. The multiplexed signal 250 is decoded by the lossless decoder 3D to obtain the original 1/2(L + R) and 1/2(L - R), which are supplied to the adder circuit 4a and the subtractor circuit 4b constituting the channel correlation circuit B-1, respectively, and the L and R signals of the stereo two channels are obtained as output signals. The series of operations in the lossless encoder 2D and the lossless decoder 3D, such as calculating the differences, calculating the prediction values, selecting the minimum prediction residual, and calculating the prediction value using the minimum prediction residual, is performed in the same manner as in the first embodiment. For the variable-rate bitstream data in the format shown in FIG. 13, whether the channel correlation circuit of FIG. 1 or that of FIG. 14 was used is identified by, for example, an identifier stored in the restart header of the PPCM access unit, so the data can be decoded reliably in either case. Although lossless compression has been described frame by frame as an example, the interval need not be fixed and may be of variable length.
[0030]
[Effects of the invention]
As described above, according to the present invention, first and second audio signals of the same sampling frequency are converted by a matrix operation into two mutually correlated channels, and each channel is losslessly compressed using the linear prediction method that yields the smallest prediction residual among a plurality of predicted values of the current signal, each predicted from past signals in the time domain. An audio signal with an improved compression ratio can therefore be transmitted, and the audio signal can be decoded without any problem.
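The selection of the predictor with the smallest residual, and the matching reconstruction on the decoding side, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the coefficient sets in `PREDICTORS`, the padding of short histories with the first sample, and the sum-of-absolute-values cost are all hypothetical choices, since the text only says that the predictors differ in characteristics.

```python
# Hypothetical predictor set: each tuple holds coefficients applied to the
# most recent samples (index 0 multiplies the latest sample).
PREDICTORS = [(1,), (2, -1), (3, -3, 1)]


def predict(coeffs, history):
    """Linear prediction of the next sample from past samples."""
    # Assumption: when history is shorter than the predictor order,
    # pad it on the old side with the first sample value.
    padded = [history[0]] * (len(coeffs) - len(history)) + history
    return sum(c * padded[-1 - i] for i, c in enumerate(coeffs))


def residuals(coeffs, samples):
    """Prediction residuals for every sample after the first one."""
    return [samples[n] - predict(coeffs, samples[:n])
            for n in range(1, len(samples))]


def encode_frame(samples):
    """Return (first sample value, chosen predictor index, residuals)."""
    best = min(range(len(PREDICTORS)),
               key=lambda k: sum(abs(e) for e in residuals(PREDICTORS[k], samples)))
    return samples[0], best, residuals(PREDICTORS[best], samples)


def decode_frame(first, method, res):
    """Rebuild the frame from the first sample, predictor index, residuals."""
    out = [first]
    for e in res:
        out.append(predict(PREDICTORS[method], out) + e)
    return out


frame = [10, 12, 14, 16, 18]
first, method, res = encode_frame(frame)
assert decode_frame(first, method, res) == frame
```

For the steadily rising frame above, the second predictor (linear extrapolation) leaves the smallest residuals, so only the first sample, the predictor index, and a few small values need to be stored.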
[Brief description of the drawings]
FIG. 1 is a block diagram showing a first embodiment of a speech encoding apparatus to which the present invention is applied and a speech decoding apparatus corresponding to the speech encoding apparatus.
FIG. 2 is a block diagram showing in detail the encoder of FIG. 1.
FIG. 3 is an explanatory diagram showing the format of one frame multiplexed by the multiplexer of FIG. 2.
FIG. 4 is an explanatory diagram showing a DVD pack format.
FIG. 5 is an explanatory diagram showing the format of a DVD audio pack.
FIG. 6 is a block diagram showing in detail the decoder of FIG. 1.
FIG. 7 is a block diagram illustrating an encoder according to a second embodiment.
FIG. 8 is a block diagram illustrating a decoder according to a second embodiment.
FIG. 9 is a block diagram illustrating an encoder according to a third embodiment.
FIG. 10 is a block diagram illustrating a decoder according to a third embodiment.
FIG. 11 is a flowchart illustrating an audio transmission method.
FIG. 12 is a flowchart showing an audio transmission method.
FIG. 13 is an explanatory diagram showing a format, different from that of FIG. 3, of the audio (A) packet of the compressed PCM (PPCM) shown in FIG. 5.
FIG. 14 is a block diagram showing a second embodiment of a speech encoding apparatus to which the present invention is applied and a speech decoding apparatus corresponding to the speech encoding apparatus.
[Explanation of symbols]
1a, 4a Addition circuit (addition means)
1b, 4b Subtraction circuit (subtraction means)
5a, 5b Divider
11D1 Difference calculation circuit (first difference calculation means)
11D2 Difference calculation circuit (second difference calculation means)
12a-1 to 12a-n Predictors (constituting the first predictive encoding means together with the subtractors 13a-1 to 13a-n and the buffer/selector 16D1)
12b-1 to 12b-n Predictors (constituting the second predictive encoding means together with the subtractors 13b-1 to 13b-n and the buffer/selector 16D2)
13a-1 to 13a-n, 13b-1 to 13b-n Subtractors
16D1, 16D2, 16A, 16S, 16L, 16R Buffer/selector
15A Prediction circuit (constituting the third predictive encoding means together with the buffer/selector 16A)
15S Prediction circuit (constituting the fourth predictive encoding means together with the buffer/selector 16S)
15L Prediction circuit (constituting the fifth predictive encoding means together with the buffer/selector 16L)
15R Prediction circuit (constituting the sixth predictive encoding means together with the buffer/selector 16R)

Claims

An audio signal transmission method for transmitting an audio signal encoded by an audio encoding method, the audio encoding method comprising:
a step of converting first and second audio signals of two systems having the same sampling frequency into two mutually correlated correlation channels by a matrix operation;
a step of predictively encoding the audio signal including the two correlation channels converted by the preceding step by, for each channel, obtaining a first sample value in response to the input audio signal, predicting linear prediction values of the current signal from past signals in the time domain with each of a plurality of linear prediction methods having different characteristics, and selecting the linear prediction method that minimizes the prediction residual obtained from the predicted linear prediction value and the audio signal; and
a step of forming a data structure that includes header information and user data including a compressed PCM private header and an audio compressed PCM data portion, configuring the audio compressed PCM data portion from a plurality of access units, storing predictive encoded data including the first sample value, the prediction residual, and the linear prediction method of each channel selected by the preceding step in a subpacket arranged in the access unit, and placing the UPC/EAN-ISRC number and the UPC/EAN-ISRC data of the audio signal in the compressed PCM private header,
wherein the predictive encoded data including the selected first sample value, prediction residual, and linear prediction method, and the UPC/EAN-ISRC number and the UPC/EAN-ISRC data of the audio signal, are packetized and transmitted.

An audio decoding method for decoding an original audio signal from data encoded by an audio encoding method, the audio encoding method comprising:
a step of converting first and second audio signals of two systems having the same sampling frequency into two mutually correlated correlation channels by a matrix operation;
a step of predictively encoding the audio signal including the two correlation channels converted by the preceding step by, for each channel, obtaining a first sample value in response to the input audio signal, predicting linear prediction values of the current signal from past signals in the time domain with each of a plurality of linear prediction methods having different characteristics, and selecting the linear prediction method that minimizes the prediction residual obtained from the predicted linear prediction value and the audio signal; and
a step of forming a data structure that includes header information and user data including a compressed PCM private header and an audio compressed PCM data portion, configuring the audio compressed PCM data portion from a plurality of access units, storing predictive encoded data including the first sample value, the prediction residual, and the linear prediction method of each channel selected by the preceding step in a subpacket arranged in the access unit, and placing the UPC/EAN-ISRC number and the UPC/EAN-ISRC data of the audio signal in the compressed PCM private header,
the audio decoding method comprising:
a step of calculating a prediction value from the predictive encoded data including the selected first sample value, prediction residual, and linear prediction method; and
a step of restoring the first plurality of channels of digital audio signals from the calculated prediction value.