JP3597077B2 - Audio signal decoding device

Info

Publication number: JP3597077B2
Application number: JP12749899A
Authority: JP
Inventors: 高志三浦
Original assignee: 株式会社ハドソン
Priority date: 1999-03-31
Filing date: 1999-03-31
Publication date: 2004-12-02
Anticipated expiration: 2019-03-31
Also published as: JP2000286713A

Description

【０００１】
【発明の属する技術分野】
本発明は、オーディオ信号の復号化装置、特にサブバンド合成フィルタバンクにおける演算処理を高能率化したオーディオ信号復号化装置に関する。
【０００２】
【従来の技術】
オーディオ信号の符号化は、近年のＩＳＤＮおよびＶＬＳＩ技術の発展によって急速に注目を集めている。特に、ディジタルオーデイオ、ディジタル衛星放送（ＤＳＢ）、オーディオ信号の蓄積、遠隔会議、マルチメディア応用等でその重要性が増加している。
【０００３】
ＭＰＥＧ（ＭｏｖｉｎｇＰｉｃｔｕｒｅＥｘｐｅｒｔＧｒｏｕｐ）において、ビデオ信号とともにオーディオ信号を圧縮する世界標準符号化方式が検討され、オーディオ信号の符号化技術については、ＭＰＥＧ／オーディオ（国際規格ＩＳＯ／ＩＥＣ１１１７２−３）に代表される符号化アルゴリズムが決定され、公開されている。ＭＰＥＧオーディオ符号化技術については、社団法人テレビジョン学会編、オーム社発行の「ＭＰＥＧ」に詳しく説明されている。また、ＩＥＥＥマルチメディアジャーナル１９９５年夏号には、デイビス・パン著「ＭＰＥＧ／オーディオデータ圧縮チュートリアル」にＭＰＥＧオーディオ圧縮理論が解説されている。
【０００４】
ＭＰＥＧ／オーディオのアルゴリズムはレイヤ１，レイヤ２，レイヤ３の３種類のアルゴリズムから構成され、レイヤ１からレイヤ３の順で複雑になるが、アルゴリズムが３２帯域のサブバンド符号化に基づいており、チャンネル数が２、サンプリング周波数が３２，４４．１，４８ｋＨｚのいずれかである点で共通している（以下、ＭＰＥＧ／オーディオ・レイヤ３のアルゴリズムを特にＭＰ３と呼ぶ）。ここで、サブバンド符号化とは、聴覚特性を利用して信号を複数のサブバンドに分割して、それぞれのバンドを異なる量子化特性で量子化する符号化方式である。
【０００５】図１０に、サブバンド符号化方式に基づいたＭＰＥＧ／オーディオアルゴリズムの基本ブロック図を示す。入力されたオーディオ信号は、入力を複数のサブバンドに分けるフィルタバンク１１を通る。同時に、入力されたオーディオ信号は、各サブバンドのマスキング閾値に対する信号エネルギーの比率を決定する心理聴覚モデル１２を通る。心理聴覚モデル１２に基づいたビット割当てに従って量子化・符号化手段１３により量子化および符号化した後、必要に応じ図示しないアンシラリデータ（利用者が任意に定義できるデータ）を合わせて、フォーマット手段１４によりビットストリーム中にフォーマットする。このように符号化されたビットストリームは、ＣＤ−ＲＯＭ等の蓄積媒体に格納されたり、通信回線等の伝送路で伝送される。復号は、フレーム分解手段２１においてまず付加されたアンシラリデータを分離してビットストリームを分解する。次いで、サイド情報として送られたビット割当てに基づいて復号・逆量子化手段２２により復号、逆量子化を行いサブバンド値を復元する。最後に、サブバンド合成フィルタバンク２３によりサブバンド値を合成しオーディオ信号を再現する。
【０００６】
ところで、復号化におけるサブバンド合成フィルタバンクは、周波数成分に分割された信号をもとの時系列オーディオ信号にもどす機能を有しており、この機能を実現するために、離散コサイン変換（ＤＣＴ）、変形離散コサイン変換（ＭＤＣＴ）といった三角関数で記述される変換処理を行っている。
【０００７】
【発明が解決しようとする課題】
しかしながら、ＭＰ３の規格書に規定された変換処理をそのまま実行すると、角度の計算に要する加算・乗算も考慮に入れた場合の演算処理量は、６４個の出力を得るために６１４４回の乗算と６０８０回の加算が必要であり、計算負荷が大きかった。このため、従来のＭＰ３復号化による機器、例えばＭＰ３プレーヤでは、低価格で実用的な性能を得ることが困難であった。
【０００８】
そこで本発明の目的は、こうした従来技術の欠点を解決し、サブバンド合成フィルタバンクにおける演算量を大幅に削減でき、もって復号化処理をより高速化できるオーディオ信号の復号化装置を提供することにある。
【０００９】
【課題を解決するための手段】
上記課題を解決するために、本発明は、符号化されたビットストリームを分解するフレーム分解手段と、ビット割当てに基づいて復号、逆量子化を行いｎ個のサブバンド値を復元する復号・逆量子化手段と、得られたサブバンド値を合成しオーディオ信号を再現するサブバンド合成フィルタバンクとからなり、サブバンド合成フィルタバンクは、ｎ／４個のサブバンド入力を離散コサイン変換（ＤＣＴ）し出力する第１及び第２のフィルタバンクと、ｎ／４個のサブバンド入力を離散サイン変換（ＤＳＴ）し出力する第３及び第４のフィルタバンクを備え、ｎ個のサブバンド値を４分割して第１のフィルタバンク、第２のフィルタバンク、第３のフィルタバンクおよび第４のフィルタバンクへそれぞれ入力し、第１のフィルタバンクの出力Ｑ_１（ｎ／４，ｋ）（ｋ＝０，１，２，...，ｎ／２−１。以下同じ）と第２のフィルタバンクの出力Ｑ_２（ｎ／４，ｋ）との差Ｐ_１（ｋ）を求める第１の処理と、第３のフィルタバンクの出力Ｑ_３（ｎ／４，ｋ）と第４のフィルタバンクの出力Ｑ_４（ｎ／４，ｋ）との和をとり所定の係数を乗じＰ_２（ｋ）を求める第２の処理と、Ｐ_１（ｋ）及びＰ_２（ｋ）に基づきｎ×２個の合成フィルタ出力Ｘ（ｎ，ｐ）（ｐ＝０，１， ... ，６３）を求める第３の処理とを行うことを特徴とするオーディオ信号復号化装置を提供する。
【００１０】
この場合において、前記復号・逆量子化手段がｎ＝３２個のサブバンド値を復元するものであり、第１の処理が、
【００１１】
【数４】

【００１２】
であり、第２の処理が、
【００１３】
【数５】

【００１４】
であり、第３の処理が、次のフィルタ出力Ｘ（３２，ｐ）（ｐ＝０，１， ... ，６３）
【００１５】
【数６】

【００１６】
を求めるものであることが好ましい。
【００１７】
【発明の実施の形態】
本発明に係るオーディオ信号復号化装置について実施の形態の全体構成を説明する前に、まずその主要部をなすサブバンド合成フィルタバンクにおけるアルゴリズムを詳細に説明する。
【００１８】
今、ＭＰＥＧ／オーディオのレイヤ３（ＭＰ３）におけるサブバンド合成フィルタバンクを高速化することを例にとり、そのための高速ＤＣＴアルゴリズムの適用を検討する。
【００１９】
［サブバンド合成フィルタバンクを４つの項で記述する］
ＭＰ３のサブバンド合成フィルタバンクでは入力の個数が３２で出力の個数が６４である。入力の個数をｎ（＝３２）と表記したときの変換式は次式で表される。
【００２０】
【数７】

【００２１】
ここで、次の記号を定義する。
【００２２】
【数８】

【００２３】
式（２）を式（１）に適用すると、次式の様に表すことができる。
【００２４】
【数９】

【００２５】
式（３）はサイズ３２のいわゆるタイプＩＩのＤＣＴに類似しているので、高速ＤＣＴのアルゴリズムを用いることで、サイズ２まで縮小して演算を高速化することが可能と推定される。
【００２６】
そこで、式（１）を偶数項と奇数項に分割する。
【００２７】
【数１０】

【００２８】
式（４）は、式（２）の定義及び三角関数の公式に従い、かつ、ｘ_ｎ＝０およびｘ_−１＝０の関係より、次式のように変形できる。
【００２９】
【数１１】

【００３０】
式（５）に着目し、（−１）^ｍを削除するために、ｍが偶数の項と奇数の項に分割して表すと次の通りである。
【００３１】
【数１２】

【００３２】
次に、式（６）における最後の２項に着目すると、三角関数の公式に従い、かつ、ｘ_−２＝０およびｘ_−１＝０より、次式の様に変形できる。
【００３３】
【数１３】

【００３４】
ここで、次のように定義を行う。
【００３５】
【数１４】

【００３６】
【数１５】

【００３７】
【数１６】

【００３８】
【数１７】

【００３９】
式（８）から式（１１）を用いると、式（６）のＭＰ３の合成フィルタバンクは次のように表せる。
【００４０】
【数１８】

【００４１】
式（１２）の４つの項は、それぞれ、サイズ８のＤＣＴまたはＤＳＴである。以下、４項それぞれについて、本発明に従い計算負荷がどのように削減されるかを説明する。
【００４２】
［タイプＩのＤＣＴ］
タイプＩのＤＣＴは次の通り定義される。
【００４３】
【数１９】

【００４４】
ここで、式（１３）の２つの項を次のように定義する。
【００４５】
【数２０】

【００４６】
【数２１】

【００４７】
そうすると、式（１３）は次のように表すことができる。
【００４８】
【数２２】

【００４９】
式（１６）をｋ＝０，１，．．．，２ｎ−１に対して計算するときには、三角関数の周期性を利用して式（１５），（１６）をｋ＝０，１，．．．，ｎ／２について計算した結果を用いることで計算負荷を削減することができる。
【００５０】
すなわち、（１６）は、次のように表すことができる。
【００５１】
【数２３】

【００５２】
式（１４），（１５）はサイズｎ／２のタイプＩのＤＣＴであるので、再帰的に繰り返してサイズ２のＤＣＴまで縮約することができる。また、式（１７）を用いて、サイズ８のＤＣＴをｋ＝０，１，．．．，１６まで計算することができる。
【００５３】
式（８）にもどって、式（８）のＱ_１（ｎ／４，ｋ）は、サイズ８のＤＣＴである。ｋ＝０，１，．．．１６に対するＱ_１（ｎ／４，ｋ）を用いてＱ_１（ｎ／４，０）からＱ_１（ｎ／４，６３）までを計算する方法を以下説明する。
【００５４】
ここでは、ｎ＝３２となるが、式（１７）によってｋ＝０，１，．．．，１６までが計算済みである。式（１２）の４つの項すべてに掛けられる係数とＱ_１（ｎ／４，ｋ）について、三角関数の周期性を利用すると、次のように表すことができる。
【００５５】
【数２４】

【００５６】
すなわち、Ｑ_１（ｎ／４，ｋ）に関しては、ｋ＝０，１，．．．，１６に対する値を計算すればｋ＝０，１，．．．，６３に対応する値が得られるので計算負荷が４分の１で済むことを示している。しかも、これらの値はｋ＝０，１，．．．，８に対するＧ_１（８、ｋ）とＨ_１（８、ｋ）によって求めることができ、最終的にはこれらはｋ＝０，１に対するＧ_１（２、ｋ）とＨ_１（２、ｋ）によって求めることができるので、相当量の計算負荷を削減できることを示している。
【００５７】
［タイプＩＩのＤＣＴ］
タイプＩＩのＤＣＴは次の通り定義される。
【００５８】
【数２５】

【００５９】
ここで、式（１９）の２つの項を次のように定義する。
【００６０】
【数２６】

【００６１】
【数２７】

【００６２】
そうすると、式（１９）は次のように表すことができる。
【００６３】
【数２８】

【００６４】
式（２２）をｋ＝０，１，．．．，２ｎ−１に対して計算するときには、三角関数の周期性を利用して式（２０），（２１）をｋ＝０，１，．．．，ｎ／２について計算した結果を用いることで計算負荷を削減することができる。すなわち、式（２２）は、次のように表すことができる。
【００６５】
【数２９】

【００６６】
式（２０），（２１）はそれぞれサイズｎ／２のタイプＩとタイプＩＩのＤＣＴであるので、再帰的に繰り返してサイズ２のＤＣＴまで縮約することができる。また、式（２３）を用いて、サイズ８のＤＣＴをｋ＝０，１，．．．，１６まで計算することができる。
【００６７】
式（９）にもどって、式（９）のＱ_２（ｎ／４，ｋ）は、サイズ８のタイプＩＩのＤＣＴである。ｋ＝０，１，．．．１６に対するＱ_２（ｎ／４，ｋ）を用いてＱ_２（ｎ／４，０）からＱ_２（ｎ／４，６３）までを計算する方法を以下説明する。
【００６８】
ここでは、ｎ＝３２となるが、式（１７）によってｋ＝０，１，．．．，１６までが計算済みであり、式（１２）の４つの項すべてに掛けられる係数についても計算済みである。Ｑ_２（ｎ／４，ｋ）について、三角関数の周期性を利用すると、次のように表すことができる。
【００６９】
【数３０】

【００７０】
すなわち、Ｑ_２（ｎ／４，ｋ）に関しても、最終的にはこれらはｋ＝０，１に対するＧ_２（２、ｋ）とＨ_２（２、ｋ）によって求めることができるので、相当量の計算負荷を削減できることを示している。
【００７１】
［タイプＩのＤＳＴ］
タイプＩのＤＳＴは次の通り定義される。
【００７２】
【数３１】

【００７３】
ここで、式（２５）の２つの項を次のように定義する。
【００７４】
【数３２】

【００７５】
【数３３】

【００７６】
そうすると、式（２５）は次のように表すことができる。
【００７７】
【数３４】

【００７８】
式（２８）をｋ＝０，１，．．．，２ｎ−１に対して計算するときには、三角関数の周期性を利用して式（２６），（２７）をｋ＝０，１，．．．，ｎ／２について計算した結果を用いることで計算負荷を削減することができる。すなわち、式（２８）は、次のように表すことができる。
【００７９】
【数３５】

【００８０】
式（２６），（２７）はサイズｎ／２のタイプＩのＤＳＴであるので、再帰的に繰り返してサイズ２のＤＳＴまで縮約することができる。また、式（２９）を用いて、サイズ８のＤＳＴをｋ＝０，１，．．．，１６まで計算することができる。
【００８１】
式（１０）にもどって、式（１０）のＱ_３（ｎ／４，ｋ）は、サイズ８のＤＳＴである。ｋ＝０，１，．．．１６に対するＱ_３（ｎ／４，ｋ）を用いてＱ_３（ｎ／４，０）からＱ_３（ｎ／４，６３）までを計算する方法を以下説明する。
【００８２】
ここでは、ｎ＝３２となるが、式（２９）によってｋ＝０，１，．．．，１６までが計算済みであり、式（１２）の４つの項すべてに掛けられる係数についても計算済みである。Ｑ_３（ｎ／４，ｋ）について、三角関数の周期性を利用すると、次のように表すことができる。
【００８３】
【数３６】

【００８４】
すなわち、Ｑ_３（ｎ／４，ｋ）に関しても、最終的にはこれらはｋ＝０，１，２に対するＧ_３（２、ｋ）とＨ３（２、ｋ）によって求めることができるので、相当量の計算負荷を削減できることを示している。
【００８５】
［タイプＩＩのＤＳＴ］
タイプＩＩのＤＳＴは次の通り定義される。
【００８６】
【数３７】

【００８７】
ここで、式（３１）の２つの項を次のように定義する。
【００８８】
【数３８】

【００８９】
【数３９】

【００９０】
そうすると、式（３１）は次のように表すことができる。
【００９１】
【数４０】

【００９２】
式（３４）をｋ＝０，１，．．．，２ｎ−１に対して計算するときには、三角関数の周期性を利用して式（３２），（３３）をｋ＝０，１，．．．，ｎ／２について計算した結果を用いることで計算負荷を削減することができる。すなわち、式（３４）は、次のように表すことができる。
【００９３】
【数４１】

【００９４】
式（３２），（３３）はそれぞれサイズｎ／２のタイプＩＩとタイプＩのＤＳＴであるので、再帰的に繰り返してサイズ２のＤＳＴまで縮約することができる。また、式（３５）を用いて、サイズ８のＤＳＴをｋ＝０，１，．．．，１６まで計算することができる。
【００９５】
式（１１）にもどって、式（１１）のＱ_４（ｎ／４，ｋ）は、サイズ８のタイプＩＩのＤＳＴである。ｋ＝０，１，．．．１６に対するＱ_４（ｎ／４，ｋ）を用いてＱ_４（ｎ／４，０）からＱ_４（ｎ／４，６３）までを計算する方法を以下説明する。
【００９６】
ここでは、ｎ＝３２となるが、式（３５）によってｋ＝０，１，．．．，１６までが計算済みであり、式（１２）の４つの項すべてに掛けられる係数についても計算済みである。Ｑ_４（ｎ／４，ｋ）について、三角関数の周期性を利用すると、次のように表すことができる。
【００９７】
【数４２】

【００９８】
すなわち、Ｑ_４（ｎ／４，ｋ）に関しても、最終的にはこれらはｋ＝０，１に対するＧ_４（２、ｋ）とＨ_４（２、ｋ）によって求めることができるので、相当量の計算負荷を削減できることを示している。
【００９９】
以上の演算処理をまとめると、ＭＰ３のサブバンド合成フィルタバンクの各出力を計算する方法は次のように簡略表記ができる。
【０１００】
１．２つの式を定義する。
【０１０１】
【数４３】

【０１０２】
２．サブバンド合成フィルタバンクの出力Ｘ（３２，ｋ）は、次の通りである。
【０１０３】
【数４４】

【０１０４】
次に、本発明に係るオーディオ信号復号化装置について好ましい実施の形態を図面を参照しつつ説明する。
【０１０５】
本実施の形態の全体構成は、図１０に示す従来の装置と同様であり、符号化されたビットストリームを分解するフレーム分解手段２１と、ビット割当てに基づいて復号、逆量子化を行いｎ個のサブバンド値を復元する復号・逆量子化手段２２と、得られたサブバンド値を合成しオーディオ信号を再現するサブバンド合成フィルタバンク２３を有している。
【０１０６】
図１に本実施の形態における主要部であるサブバンド合成フィルタバンクのシグナルフロー図を示す。図１に示すようにサブバンド合成フィルタバンクは第１のフィルタバンクＱ１、第２のフィルタバンクＱ２、第３のフィルタバンクＱ３及び第４のフィルタバンクＱ４を備えている。第１のフィルタバンクＱ１及び第２のフィルタバンクＱ２は、ｎ／４＝８個（ｎ＝３２）のサブバンド入力を離散コサイン変換（ＤＣＴ）し１６個の出力をする。また、第３のフィルタバンクＱ３及び第４のフイルタバンクＱ４は、ｎ／４＝８個のサブバンド入力を離散サイン変換（ＤＳＴ）し１６個の出力をする。
【０１０７】
ここで、ｎ＝３２個のサブバンド値を４分割して第１のフィルタバンクＱ１、第２のフィルタバンクＱ２、第３のフィルタバンクＱ３および第４のフィルタバンクＱ４へそれぞれ入力するデータは、次式に従い前処理を行ったものである。
【０１０８】

【０１０９】
図２から図５は、フィルタバンクＱ１からＱ４までのシグナルフロー図である。
【０１１０】
ここで、図１から図５のシグナルフロー図の表記方法を補足する。
【０１１１】
まずデータの表示は、図６に示すように、数字と丸印と直線で表しており、数字はデータの番号を表し、処理の流れは左から右に進む。
【０１１２】
データの処理として、加算は図７のように示される。図７例では、データｋはデータｉとデータｊを加算することを意味する。
【０１１３】
符号の付加は、図８に示すように２つのデータの丸印を接続する直線の下もしくは横に付された符号で表しており、図８の例ではｊ＝−ｉとなる。
【０１１４】
乗算は、図９に示すように、２つのデータの丸印を接続する直線の下または上に付された数式（１／２，Ｃ６４−２等）で表しており、左側の丸印のデータにその数式の値を掛ける。図９の例ではｊ＝ａ＊ｉとなる。
【０１１５】
最後に、Ｃ６４−２，Ｃ４−１等で表した数式を一般的にＣａ−ｂと表記すると、
Ｃａ−ｂ＝１／［２ｃｏｓ〔（ｂ／ａ）＊π）〕］
を意味している。
【０１１６】
図１から図５に示す本実施の形態のシグナルフローから明らかな通り、本実施の形態の処理によれば、７９回の乗算と２２３回の加算により６４個のＭＰ３サブバンド合成フィルタバンクの出力を得ることができる。
【０１１７】
従来のサブバンド合成フィルタバンクの処理は、図１１に示すように６４個の処理Ｒ（ｋ）（ｋ＝０，１，．．．６３）により構成されており、６４個の処理Ｒ（ｋ）の入力データは共通であった。このときの処理Ｒ（ｋ）のシグナルフローは、図１２に示した通りであり、６４個の出力を得るためには、６１４４回の乗算と６０８０回の加算が必要であった。
【０１１８】
これに対し、本実施の形態によれば、従来のＲ（ｋ）に相当する処理を４個しか使用せず、必要とされる計算量も少ない。したがって、サブバンド合成フィルタバンクにおける計算負荷を大幅に軽減することができる。
【０１１９】
本発明は、上記した実施の形態に限定されるものではなく、特許請求の範囲に記載した技術的思想の範囲内において、種々の変更が可能なのはいうまでもない。
【０１２０】
【発明の効果】
以上説明したように、本発明によるオーディオ信号の復号化装置によれば、サブバンド合成フィルタバンクにおける演算量を大幅に削減でき、もって復号化処理をより高速化できるので、低価格で実用的な性能を有するＭＰ３プレーヤ等の機器に利用できる工業的意義は極めて大である。
【図面の簡単な説明】
【図１】本発明の主要部であるサブバンド合成フィルタバンクのシグナルフロー図。
【図２】フィルタバンクＱ１のシグナルフロー図。
【図３】フィルタバンクＱ２のシグナルフロー図。
【図４】フィルタバンクＱ３のシグナルフロー図。
【図５】フィルタバンクＱ４のシグナルフロー図。
【図６】データの表示を表記するシグナルフローの説明図。
【図７】加算のデータの処理を表記するシグナルフローの説明図。
【図８】符号の付加の処理を表記するシグナルフローの説明図。
【図９】乗算の処理を表記するシグナルフローの説明図。
【図１０】サブバンド符号化方式に基づいたＭＰＥＧ／オーディオアルゴリズムの基本ブロック図。
【図１１】従来のサブバンド合成フィルタバンクの処理を示す説明図。
【図１２】従来のフィルタバンク処理Ｒ（ｋ）のシグナルフロー図。
【符号の説明】
２１フレーム分解手段
２２復号・逆量子化手段
２３サブバンド合成フィルタバンク

[0001]
[Technical Field of the Invention]
The present invention relates to an audio signal decoding device, and more particularly to an audio signal decoding device in which the arithmetic processing in the subband synthesis filter bank is made highly efficient.
[0002]
[Prior art]
Coding of audio signals has been rapidly receiving attention due to the recent development of ISDN and VLSI technologies. In particular, its importance is increasing in digital audio, digital satellite broadcasting (DSB), storage of audio signals, teleconferencing, multimedia applications, and the like.
[0003]
In MPEG (Moving Picture Expert Group), an international standard coding scheme for compressing audio signals together with video signals has been studied, and for audio coding, coding algorithms typified by MPEG/Audio (international standard ISO/IEC 11172-3) have been determined and published. MPEG audio coding technology is described in detail in "MPEG", edited by The Institute of Television Engineers of Japan and published by Ohmsha. In addition, the theory of MPEG audio compression is explained in the "MPEG/Audio Data Compression Tutorial" by Davis Pan in the Summer 1995 issue of the IEEE Multimedia Journal.
[0004]
The MPEG/Audio algorithm consists of three algorithms, Layer 1, Layer 2, and Layer 3, which increase in complexity from Layer 1 to Layer 3. All three share the following: the algorithm is based on 32-band subband coding, the number of channels is 2, and the sampling frequency is 32, 44.1, or 48 kHz (hereinafter, the MPEG/Audio Layer 3 algorithm is specifically called MP3). Here, subband coding is a coding scheme that exploits auditory characteristics by dividing the signal into a plurality of subbands and quantizing each band with different quantization characteristics.
[0005]
FIG. 10 shows a basic block diagram of the MPEG/Audio algorithm based on the subband coding scheme. The input audio signal passes through a filter bank 11 that divides the input into a plurality of subbands. At the same time, the input audio signal passes through a psychoacoustic model 12 that determines the ratio of signal energy to the masking threshold for each subband. After quantization and encoding by the quantization/encoding means 13 in accordance with the bit allocation based on the psychoacoustic model 12, the result is combined as necessary with ancillary data (data that the user can define arbitrarily; not shown) and formatted into a bit stream by the formatting means 14. The bit stream encoded in this way is stored on a storage medium such as a CD-ROM or transmitted over a transmission path such as a communication line. In decoding, the frame decomposing means 21 first separates the added ancillary data and decomposes the bit stream. Next, the decoding/inverse quantization means 22 performs decoding and inverse quantization based on the bit allocation sent as side information to restore the subband values. Finally, the subband synthesis filter bank 23 synthesizes the subband values to reproduce the audio signal.
[0006]
Incidentally, the subband synthesis filter bank used in decoding has the function of returning a signal that has been divided into frequency components to the original time-series audio signal. To realize this function, it performs transform processing described by trigonometric functions, such as the discrete cosine transform (DCT) and the modified discrete cosine transform (MDCT).
[0007]
[Problems to be solved by the invention]
However, if the transform processing specified in the MP3 standard is executed as it is, then, counting also the additions and multiplications needed to compute the angles, 6144 multiplications and 6080 additions are required to obtain 64 outputs, which is a heavy computational load. For this reason, it has been difficult for conventional devices based on MP3 decoding, for example MP3 players, to achieve practical performance at low cost.
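As a point of reference for the cost described above, the direct form of the matrixing step can be sketched as follows. The kernel cos((16 + i)(2k + 1)π/64) is the one commonly given for the ISO/IEC 11172-3 synthesis filter bank; that it coincides exactly with the transform this text refers to is an assumption, and the function name `matrixing_direct` is hypothetical. The sketch counts only the coefficient-by-sample products and the accumulations, so its totals are smaller than the 6144 and 6080 quoted above, which also include the angle arithmetic.

```python
import math


def matrixing_direct(s):
    """Directly evaluate 64 outputs from 32 subband samples.

    Assumed kernel (not taken from this text): cos((16 + i)(2k + 1) * pi / 64).
    Returns (outputs, multiplications, additions), where the counters cover
    only the coefficient * sample products and the running sums.
    """
    if len(s) != 32:
        raise ValueError("expected 32 subband samples")
    outputs = []
    mults = 0
    adds = 0
    for i in range(64):
        acc = 0.0
        for k in range(32):
            coeff = math.cos((16 + i) * (2 * k + 1) * math.pi / 64)
            term = coeff * s[k]
            mults += 1
            if k == 0:
                acc = term          # first term needs no addition
            else:
                acc += term
                adds += 1
        outputs.append(acc)
    return outputs, mults, adds


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 31
    out, mults, adds = matrixing_direct(impulse)
    print(mults, adds)
```

Every output touches all 32 inputs, which is what makes the direct form expensive: the work grows as the product of the input and output counts.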
[0008]
Accordingly, an object of the present invention is to solve these drawbacks of the prior art and to provide an audio signal decoding device that can greatly reduce the amount of computation in the subband synthesis filter bank and thereby speed up the decoding process.
[0009]
[Means for Solving the Problems]
To solve the above problem, the present invention provides an audio signal decoding device comprising: frame decomposing means for decomposing an encoded bit stream; decoding/inverse quantization means for performing decoding and inverse quantization based on bit allocation to restore n subband values; and a subband synthesis filter bank for synthesizing the obtained subband values to reproduce an audio signal. The subband synthesis filter bank includes first and second filter banks that each apply a discrete cosine transform (DCT) to n/4 subband inputs and output the result, and third and fourth filter banks that each apply a discrete sine transform (DST) to n/4 subband inputs and output the result. The n subband values are divided into four and input to the first, second, third, and fourth filter banks, respectively. The device performs a first process of obtaining the difference P₁(k) between the output Q₁(n/4, k) of the first filter bank (k = 0, 1, 2, ..., n/2−1; the same applies hereinafter) and the output Q₂(n/4, k) of the second filter bank; a second process of obtaining P₂(k) by taking the sum of the output Q₃(n/4, k) of the third filter bank and the output Q₄(n/4, k) of the fourth filter bank and multiplying it by a predetermined coefficient; and a third process of obtaining n×2 synthesis filter outputs X(n, p) (p = 0, 1, ..., 63) based on P₁(k) and P₂(k).
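The first and second processes described in this paragraph can be sketched as below. This is only an illustration under stated assumptions: the function name `combine_filter_banks` is hypothetical, the four filter bank outputs are taken as already computed lists of equal length, and the "predetermined coefficient" is passed in by the caller because its values are given by equations that are not reproduced in this text. The third process is omitted for the same reason.

```python
def combine_filter_banks(q1, q2, q3, q4, coeff):
    """Combine four filter bank outputs into P1(k) and P2(k).

    P1(k) = Q1(k) - Q2(k)               (first process: difference)
    P2(k) = coeff[k] * (Q3(k) + Q4(k))  (second process: scaled sum)

    coeff is a caller-supplied placeholder for the predetermined
    coefficient; its actual values are not specified here.
    """
    lengths = {len(q1), len(q2), len(q3), len(q4), len(coeff)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")
    p1 = [a - b for a, b in zip(q1, q2)]
    p2 = [c * (a + b) for a, b, c in zip(q3, q4, coeff)]
    return p1, p2


if __name__ == "__main__":
    p1, p2 = combine_filter_banks(
        [3.0, 1.0], [1.0, 1.0], [1.0, 2.0], [1.0, 2.0], [0.5, 0.25]
    )
    print(p1, p2)
```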
[0010]
In this case, the decoding/inverse quantization means restores n = 32 subband values, and the first process is
[0011]
(Equation 4)

[0012]
the second process is
[0013]
(Equation 5)

[0014]
and the third process obtains the following filter outputs X(32, p) (p = 0, 1, ..., 63):
[0015]
(Equation 6)

[0016]
It is preferable that the three processes take these forms.
[0017]
[Embodiments of the Invention]
Before describing the overall configuration of the embodiment of the audio signal decoding apparatus according to the present invention, first, an algorithm in a subband synthesis filter bank which is a main part thereof will be described in detail.
[0018]
Now, taking as an example the speeding up of the subband synthesis filter bank in MPEG/Audio Layer 3 (MP3), the application of a fast DCT algorithm for that purpose is examined.
[0019]
[Expressing the subband synthesis filter bank in four terms]
In the MP3 subband synthesis filter bank, the number of inputs is 32 and the number of outputs is 64. Writing the number of inputs as n (= 32), the transform is given by the following equation.
[0020]
(Equation 7)

[0021]
Here, the following symbols are defined.
[0022]
(Equation 8)

[0023]
When equation (2) is applied to equation (1), it can be expressed as the following equation.
[0024]
(Equation 9)

[0025]
Since equation (3) resembles a so-called type II DCT of size 32, it is presumed that a fast DCT algorithm can be used to reduce the transform down to size 2 and thereby speed up the computation.
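The resemblance can be made concrete by comparing two commonly used forms. The type II DCT of size N is usually written as the first line below, and the matrixing step of the ISO/IEC 11172-3 synthesis filter bank as the second. Whether equations (1) and (3) use exactly this indexing and notation is an assumption, since they are not reproduced in this text; the symbols C, V, S, and i are illustrative.

```latex
\begin{align}
C_k &= \sum_{m=0}^{N-1} x_m \cos\!\left[\frac{(2m+1)\,k\,\pi}{2N}\right], & k &= 0,\dots,N-1,\\
V_i &= \sum_{m=0}^{31} S_m \cos\!\left[\frac{(2m+1)\,(16+i)\,\pi}{64}\right], & i &= 0,\dots,63.
\end{align}
```

With N = 32 the denominators agree (2N = 64), so the second kernel is the first with the frequency index k replaced by 16 + i and extended to 64 values. That shift and extension is the sense in which the transform is similar to, but not identical with, a size-32 type II DCT.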
[0026]
Therefore, equation (1) is divided into even and odd terms.
[0027]
(Equation 10)

[0028]
Equation (4) can be transformed into the following equation by using the definition in equation (2) and trigonometric identities, together with the relations xₙ = 0 and x₋₁ = 0.
[0029]
(Equation 11)

[0030]
Focusing on equation (5), in order to eliminate (−1)^m, the sum is split into the terms with even m and the terms with odd m, as follows.
[0031]
(Equation 12)

[0032]
Next, focusing on the last two terms of equation (6), they can be transformed as follows by using trigonometric identities together with x₋₂ = 0 and x₋₁ = 0.
[0033]
(Equation 13)

[0034]
Here, the following definitions are made.
[0035]
[Equation 14]

[0036]
[Equation 15]

[0037]
(Equation 16)

[0038]
[Equation 17]

[0039]
Using Expressions (8) to (11), the synthesis filter bank of MP3 in Expression (6) can be expressed as follows.
[0040]
(Equation 18)

[0041]
Each of the four terms in equation (12) is a DCT or DST of size 8. The following describes how the computational load is reduced according to the present invention for each of the four terms.
[0042]
[Type I DCT]
Type I DCT is defined as follows.
[0043]
[Equation 19]

[0044]
Here, two terms of the equation (13) are defined as follows.
[0045]
(Equation 20)

[0046]
[Equation 21]

[0047]
Then, equation (13) can be expressed as follows.
[0048]
(Equation 22)

[0049]
When equation (16) is computed for k = 0, 1, ..., 2n−1, the computational load can be reduced by exploiting the periodicity of the trigonometric functions and reusing the results of computing equations (14) and (15) for k = 0, 1, ..., n/2.
[0050]
That is, equation (16) can be expressed as follows.
[0051]
(Equation 23)

[0052]
Equations (14) and (15) are type I DCTs of size n/2 and can therefore be reduced recursively down to size-2 DCTs. In addition, using equation (17), the size-8 DCT can be computed for k = 0, 1, ..., 16.
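The kind of even/odd decomposition described here can be sketched numerically. The kernel cos(πmk/(2n)) used below is an assumed form chosen for illustration, since the definitions in equations (13) to (15) are not reproduced in this text, and the names `cos_sum` and `cos_sum_split` are hypothetical. The sketch splits a size-n cosine sum into an even-index half and an odd-index half; the odd half becomes a cosine sum over pre-added neighbouring inputs, scaled by 1/(2cos(πk/(2n))), and the result is checked against direct evaluation.

```python
import math


def cos_sum(x, k):
    """Direct evaluation of sum_m x[m] * cos(pi * m * k / (2n)), n = len(x)."""
    n = len(x)
    return sum(x[m] * math.cos(math.pi * m * k / (2 * n)) for m in range(n))


def cos_sum_split(x, k):
    """Same value via an even/odd split of the input index (assumed kernel).

    Even part: sum_j x[2j] * cos(pi * j * k / n), a half-size sum.
    Odd part:  uses 2*cos(A)*cos(B) = cos(A + B) + cos(B - A) to rewrite
               sum_j x[2j+1] * cos(pi * (2j+1) * k / (2n)) as a half-size
               cosine sum over y[j] = x[2j+1] + x[2j-1], divided by
               2 * cos(pi * k / (2n)). Valid only where that cosine is nonzero.
    """
    n = len(x)
    if n % 2 != 0:
        raise ValueError("length must be even")
    half = n // 2

    def at(i):
        # Out-of-range inputs are treated as zero (x[-1] = x[n+1] = 0).
        return x[i] if 0 <= i < n else 0.0

    even = sum(x[2 * j] * math.cos(math.pi * j * k / n) for j in range(half))
    y = [at(2 * j + 1) + at(2 * j - 1) for j in range(half + 1)]
    odd_scaled = sum(y[j] * math.cos(math.pi * j * k / n) for j in range(half + 1))
    scale = 1.0 / (2.0 * math.cos(math.pi * k / (2 * n)))
    return even + scale * odd_scaled


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, -2.0, 1.0]
    for k in range(8):
        print(k, cos_sum(x, k), cos_sum_split(x, k))
```

Both halves have the same cosine form at half the size, which is what allows the split to be applied again until only trivially small sums remain.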
[0053]
Returning to equation (8), Q₁(n/4, k) in equation (8) is a size-8 DCT. A method of computing Q₁(n/4, 0) through Q₁(n/4, 63) from Q₁(n/4, k) for k = 0, 1, ..., 16 is described below.
[0054]
Here n = 32, and the values for k = 0, 1, ..., 16 have already been computed by equation (17). Using the periodicity of the trigonometric functions, the coefficient applied to all four terms of equation (12) and Q₁(n/4, k) can be expressed as follows.
[0055]
(Equation 24)

[0056]
That is, for Q₁(n/4, k), computing the values for k = 0, 1, ..., 16 yields the values for k = 0, 1, ..., 63, so the computational load is reduced to one quarter. Moreover, these values can be obtained from G₁(8, k) and H₁(8, k) for k = 0, 1, ..., 8, which in turn can ultimately be obtained from G₁(2, k) and H₁(2, k) for k = 0, 1, so a considerable amount of computation is saved.
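The saving from periodicity can be illustrated with a small sketch. Assume, for illustration only, a size-8 cosine sum with kernel cos(πmk/16); the patent's own expression is not reproduced in this text, and the function names below are hypothetical. Such a sum has period 32 in k and is mirror-symmetric about k = 16, so a table for k = 0, ..., 16 determines every value for k = 0, ..., 63.

```python
import math


def q_direct(x, k):
    """Assumed size-8 cosine sum: sum_m x[m] * cos(pi * m * k / 16)."""
    return sum(x[m] * math.cos(math.pi * m * k / 16) for m in range(8))


def q_table(x):
    """Evaluate the sum only for k = 0, ..., 16 (17 values)."""
    return [q_direct(x, k) for k in range(17)]


def q_from_table(table, k):
    """Recover the value for any k >= 0 from the 17-entry table.

    Period 32:   Q(k + 32) = Q(k)
    Reflection:  Q(32 - k) = Q(k)
    """
    r = k % 32
    if r > 16:
        r = 32 - r
    return table[r]


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, -2.0, 1.0]
    table = q_table(x)
    worst = max(abs(q_from_table(table, k) - q_direct(x, k)) for k in range(64))
    print(len(table), worst)
```

Only 17 of the 64 values are actually computed; the rest are table lookups, which is roughly the one-quarter reduction the text describes.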
[0057]
[Type II DCT]
Type II DCT is defined as follows.
[0058]
(Equation 25)

[0059]
Here, two terms of the equation (19) are defined as follows.
[0060]
(Equation 26)

[0061]
[Equation 27]

[0062]
Then, equation (19) can be expressed as follows.
[0063]
[Equation 28]

[0064]
When equation (22) is computed for k = 0, 1, ..., 2n−1, the computational load can be reduced by exploiting the periodicity of the trigonometric functions and reusing the results of computing equations (20) and (21) for k = 0, 1, ..., n/2. That is, equation (22) can be expressed as follows.
[0065]
(Equation 29)

[0066]
Equations (20) and (21) are a type I DCT and a type II DCT of size n/2, respectively, and can therefore be reduced recursively down to size-2 DCTs. Also, using equation (23), the size-8 DCT can be computed for k = 0, 1, ..., 16.
[0067]
Returning to equation (9), Q₂(n/4, k) in equation (9) is a size-8 type II DCT. A method of computing Q₂(n/4, 0) through Q₂(n/4, 63) from Q₂(n/4, k) for k = 0, 1, ..., 16 is described below.
[0068]
Here n = 32, and the values for k = 0, 1, ..., 16 have already been computed by equation (23); the coefficient applied to all four terms of equation (12) has also been computed. Using the periodicity of the trigonometric functions, Q₂(n/4, k) can be expressed as follows.
[0069]
[Equation 30]

[0070]
That is, Q₂(n/4, k) can likewise ultimately be obtained from G₂(2, k) and H₂(2, k) for k = 0, 1, so a considerable amount of computation is saved.
[0071]
[Type I DST]
Type I DST is defined as follows.
[0072]
[Equation 31]

[0073]
Here, two terms of the equation (25) are defined as follows.
[0074]
(Equation 32)

[0075]
[Equation 33]

[0076]
Then, equation (25) can be expressed as follows.
[0077]
(Equation 34)

[0078]
When equation (28) is computed for k = 0, 1, ..., 2n−1, the computational load can be reduced by exploiting the periodicity of the trigonometric functions and reusing the results of computing equations (26) and (27) for k = 0, 1, ..., n/2. That is, equation (28) can be expressed as follows.
[0079]
(Equation 35)

[0080]
Equations (26) and (27) are type I DSTs of size n/2 and can therefore be reduced recursively down to size-2 DSTs. Also, using equation (29), the size-8 DST can be computed for k = 0, 1, ..., 16.
[0081]
Returning to equation (10), Q₃(n/4, k) in equation (10) is a size-8 DST. A method of computing Q₃(n/4, 0) through Q₃(n/4, 63) from Q₃(n/4, k) for k = 0, 1, ..., 16 is described below.
[0082]
Here n = 32, and the values for k = 0, 1, ..., 16 have already been computed by equation (29); the coefficient applied to all four terms of equation (12) has also been computed. Using the periodicity of the trigonometric functions, Q₃(n/4, k) can be expressed as follows.
[0083]
[Equation 36]

[0084]
That is, Q₃(n/4, k) can likewise ultimately be obtained from G₃(2, k) and H₃(2, k) for k = 0, 1, 2, so a considerable amount of computation is saved.
[0085]
[Type II DST]
Type II DST is defined as follows.
[0086]
(Equation 37)

[0087]
Here, two terms of the equation (31) are defined as follows.
[0088]
[Equation 38]

[0089]
[Equation 39]

[0090]
Then, equation (31) can be expressed as follows.
[0091]
(Equation 40)

[0092]
When equation (34) is computed for k = 0, 1, ..., 2n−1, the computational load can be reduced by exploiting the periodicity of the trigonometric functions and reusing the results of computing equations (32) and (33) for k = 0, 1, ..., n/2. That is, equation (34) can be expressed as follows.
[0093]
(Equation 41)

[0094]
Equations (32) and (33) are a type II DST and a type I DST of size n/2, respectively, and can therefore be reduced recursively down to size-2 DSTs. Also, using equation (35), the size-8 DST can be computed for k = 0, 1, ..., 16.
[0095]
Returning to equation (11), Q₄(n/4, k) in equation (11) is a size-8 type II DST. A method of computing Q₄(n/4, 0) through Q₄(n/4, 63) from Q₄(n/4, k) for k = 0, 1, ..., 16 is described below.
[0096]
Here n = 32, and the values for k = 0, 1, ..., 16 have already been computed by equation (35); the coefficient applied to all four terms of equation (12) has also been computed. Using the periodicity of the trigonometric functions, Q₄(n/4, k) can be expressed as follows.
[0097]
(Equation 42)

[0098]
That is, Q₄(n/4, k) can likewise ultimately be obtained from G₄(2, k) and H₄(2, k) for k = 0, 1, so a considerable amount of computation is saved.
[0099]
To summarize the above arithmetic processing, the method of calculating each output of the subband synthesis filter bank of MP3 can be simply described as follows.
[0100]
1. Define two expressions.
[0101]
[Equation 43]

[0102]
2. The output X (32, k) of the sub-band synthesis filter bank is as follows.
[0103]
[Equation 44]

[0104]
Next, a preferred embodiment of an audio signal decoding device according to the present invention will be described with reference to the drawings.
[0105]
The overall configuration of the present embodiment is the same as that of the conventional device shown in FIG. 10, and includes frame decomposing means 21 for decomposing an encoded bit stream, decoding/inverse quantization means 22 for performing decoding and inverse quantization based on bit allocation to restore n subband values, and a subband synthesis filter bank 23 for synthesizing the obtained subband values to reproduce the audio signal.
[0106]
FIG. 1 shows a signal flow diagram of the subband synthesis filter bank, which is the main part of the present embodiment. As shown in FIG. 1, the subband synthesis filter bank includes a first filter bank Q1, a second filter bank Q2, a third filter bank Q3, and a fourth filter bank Q4. The first filter bank Q1 and the second filter bank Q2 each apply a discrete cosine transform (DCT) to n/4 = 8 (n = 32) subband inputs and produce 16 outputs. Likewise, the third filter bank Q3 and the fourth filter bank Q4 each apply a discrete sine transform (DST) to n/4 = 8 subband inputs and produce 16 outputs.
[0107]
Here, the data obtained by dividing the n = 32 subband values into four and input to the first filter bank Q1, the second filter bank Q2, the third filter bank Q3, and the fourth filter bank Q4 have been pre-processed according to the following equations.
[0108]

[0109]
FIGS. 2 to 5 are signal flow diagrams of the filter banks Q1 to Q4, respectively.
[0110]
Here, the notation used in the signal flow diagrams of FIGS. 1 to 5 is explained.
[0111]
First, as shown in FIG. 6, data are represented by a number, a circle, and a straight line; the number is the data index, and processing flows from left to right.
[0112]
As for data processing, addition is shown as in FIG. 7. In the example of FIG. 7, data k is the sum of data i and data j.
[0113]
Sign attachment is represented, as shown in FIG. 8, by a sign written below or beside the straight line connecting the circles of two data items; in the example of FIG. 8, j = −i.
[0114]
Multiplication is represented, as shown in FIG. 9, by an expression (1/2, C64-2, etc.) written below or above the straight line connecting the circles of two data items; the data at the left circle is multiplied by the value of that expression. In the example of FIG. 9, j = a * i.
[0115]
Finally, writing expressions such as C64-2 and C4-1 generically as Ca-b, this notation means
Ca-b = 1 / [2 cos((b / a) * π)].
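The coefficient notation can be evaluated directly from this definition. The following is a minimal sketch; the function name `c_coeff` is hypothetical and simply transcribes Ca-b = 1 / [2 cos((b / a) * π)].

```python
import math


def c_coeff(a, b):
    """Return Ca-b = 1 / (2 * cos((b / a) * pi)) as defined in the text."""
    denom = 2.0 * math.cos((b / a) * math.pi)
    if denom == 0.0:
        raise ZeroDivisionError("cos((b / a) * pi) is zero")
    return 1.0 / denom


if __name__ == "__main__":
    # The two coefficients named in the text as examples.
    print("C4-1  =", c_coeff(4, 1))
    print("C64-2 =", c_coeff(64, 2))
```

For example, C4-1 equals 1/(2 cos(π/4)), which is 1/√2. In an implementation these values would typically be tabulated once, so each occurrence in the signal flow costs a single multiplication.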
[0116]
As is clear from the signal flows of the present embodiment shown in FIGS. 1 to 5, the processing of the present embodiment obtains the 64 outputs of the MP3 subband synthesis filter bank with 79 multiplications and 223 additions.
[0117]
The conventional subband synthesis filter bank processing consists of 64 processes R(k) (k = 0, 1, ..., 63), as shown in FIG. 11, and the input data are common to all 64 processes R(k). The signal flow of each process R(k) is as shown in FIG. 12, and 6144 multiplications and 6080 additions were required to obtain the 64 outputs.
[0118]
On the other hand, according to the present embodiment, only four processes corresponding to the conventional R (k) are used, and the required amount of calculation is small. Therefore, the calculation load on the subband synthesis filter bank can be significantly reduced.
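The size of the reduction follows from the operation counts stated in the text (6144 multiplications and 6080 additions conventionally, 79 and 223 in the present embodiment). The short sketch below only does that arithmetic; the function name `reduction_factors` is hypothetical, and treating a multiplication and an addition as equal cost in the total is a simplifying assumption.

```python
def reduction_factors(before, after):
    """Return how many times fewer operations 'after' needs than 'before'."""
    total_before = before["mul"] + before["add"]
    total_after = after["mul"] + after["add"]
    return {
        "mul": before["mul"] / after["mul"],
        "add": before["add"] / after["add"],
        "total": total_before / total_after,  # assumes equal cost per operation
    }


# Operation counts as stated in the text for 64 outputs.
CONVENTIONAL = {"mul": 6144, "add": 6080}
EMBODIMENT = {"mul": 79, "add": 223}

if __name__ == "__main__":
    factors = reduction_factors(CONVENTIONAL, EMBODIMENT)
    for name, value in factors.items():
        print(f"{name}: about {value:.1f} times fewer")
```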
[0119]
The present invention is not limited to the above-described embodiment, and it goes without saying that various changes can be made within the scope of the technical idea described in the claims.
[0120]
[Effects of the Invention]
As described above, the audio signal decoding device according to the present invention can greatly reduce the amount of computation in the subband synthesis filter bank and thereby speed up the decoding process. It can therefore be used in devices such as MP3 players that offer practical performance at low cost, and its industrial significance is extremely large.
[Brief description of the drawings]
FIG. 1 is a signal flow diagram of a subband synthesis filter bank which is a main part of the present invention.
FIG. 2 is a signal flow diagram of a filter bank Q1.
FIG. 3 is a signal flow diagram of a filter bank Q2.
FIG. 4 is a signal flow diagram of a filter bank Q3.
FIG. 5 is a signal flow diagram of a filter bank Q4.
FIG. 6 is an explanatory diagram of the signal flow notation for data.
FIG. 7 is an explanatory diagram of the signal flow notation for addition.
FIG. 8 is an explanatory diagram of the signal flow notation for sign attachment.
FIG. 9 is an explanatory diagram of the signal flow notation for multiplication.
FIG. 10 is a basic block diagram of an MPEG / audio algorithm based on a subband encoding scheme.
FIG. 11 is an explanatory diagram showing processing of a conventional subband synthesis filter bank.
FIG. 12 is a signal flow diagram of conventional filter bank processing R (k).
[Explanation of reference numerals]
21 frame decomposing means
22 decoding/inverse quantization means
23 subband synthesis filter bank

Claims

1. An audio signal decoding device comprising: frame decomposing means for decomposing an encoded bit stream; decoding/inverse quantization means for performing decoding and inverse quantization based on bit allocation to restore n subband values; and a subband synthesis filter bank for synthesizing the obtained subband values to reproduce an audio signal, wherein the subband synthesis filter bank includes first and second filter banks that each apply a discrete cosine transform (DCT) to n/4 subband inputs and output the result, and third and fourth filter banks that each apply a discrete sine transform (DST) to n/4 subband inputs and output the result; the n subband values are divided into four and input to the first filter bank, the second filter bank, the third filter bank, and the fourth filter bank, respectively; and the device performs a first process of obtaining the difference P₁(k) between the output Q₁(n/4, k) of the first filter bank (k = 0, 1, 2, ..., n/2−1; the same applies hereinafter) and the output Q₂(n/4, k) of the second filter bank, a second process of obtaining P₂(k) by taking the sum of the output Q₃(n/4, k) of the third filter bank and the output Q₄(n/4, k) of the fourth filter bank and multiplying it by a predetermined coefficient, and a third process of obtaining n×2 synthesis filter outputs X(n, p) (p = 0, 1, ..., 63) based on P₁(k) and P₂(k).

2. The audio signal decoding device according to claim 1, wherein the decoding/inverse quantization means restores n = 32 subband values, the first process is:

the second process is:

and the third process obtains the following filter outputs X(32, p) (p = 0, 1, ..., 63):
