JP2004147214A

JP2004147214A - Filter bank circuit and its design assisting method

Info

Publication number: JP2004147214A
Application number: JP2002311682A
Authority: JP
Inventors: Naohiko Shimizu; 清水　尚彦; Tomoaki Kamiyama; 神山　智章
Original assignee: Tokai University
Current assignee: Tokai University
Priority date: 2002-10-25
Filing date: 2002-10-25
Publication date: 2004-05-20

Abstract

PROBLEM TO BE SOLVED: To reduce power consumption while still performing the IMDCT (inverse modified discrete cosine transform) operation.

SOLUTION: In a filter bank circuit 1 having an IMDCT module 2 and a COS table 3, when the IMDCT module 2 performs an operation using the COS table 3, it uses the operation result for one part of the values of the COS table 3, either as is or with its sign inverted, as the operation result for another part of the values that is periodically related to that part.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、フィルタバンク回路及びその設計支援方法に関する。更に詳述すると、本発明はＭＰ３デコーダおよびエンコーダに適したフィルタバンク回路及びその設計支援方法に関する。
【０００２】
【従来の技術】
近年普及している音声圧縮技術の１つとして、ＭＰＥＧ１オーディオのレイヤ３（ＩＳＯ／ＩＥＣ　１１１７２−３）として定められているＭＰ３が挙げられる。ＭＰ３は、非常に高音質でありながら、もとのＰＣＭデータの１０分の１以下にまで圧縮ができる等の多くの利点がある。
【０００３】
パーソナルコンピュータをＭＰ３のエンコーダやデコーダとして機能させる場合、エンコードやデコード処理に際して一般に６４ビットの浮動小数点演算が行なわれる（非特許文献１参照）。この場合の浮動小数点形式は、符号部１ビット、指数部１１ビット、仮数部５２ビットで構成される。
【０００４】
また、ＭＰ３のデコーダとして機能する専用のハードウェアを構成し、デコード処理に際して３２ビットの整数演算を行なった報告がなされている（非特許文献２参照）。
【０００５】
また、ポータブルタイプＭＰ３プレーヤのデコーダあるいはエンコーダとして、ＩＭＤＣＴ（Ｉｎｖｅｒｓｅ　Ｍｏｄｉｆｉｅｄ　Ｄｉｓｃｒｅｔｅ　Ｃｏｓｉｎｅ　Ｔｒａｎｓｆｏｒｍ）の演算を行うフィルタバンク回路を搭載したものがある。このＩＭＤＣＴ演算では、離散コサイン変換を利用した手法で音楽データを圧縮したり圧縮したデータを復元したりする。そして、離散コサイン変換では複雑なコサインの計算をする必要があるので、ＣＯＳテーブルを利用して計算量の低減を図っている（非特許文献１参照）。
【０００６】
【非特許文献１】
インターネット＜ＵＲＬ：ｆｔｐ：／／ｆｔｐ．ｕｎｔ．ｕｎｉ−ｈａｎｎｏｖｅｒ．ｄｅ／ｐｕｂ／ＭＰＥＧ／ａｕｄｉｏ／ｍｐｅｇ２／ｓｏｆｔｗａｒｅ／ｔｅｃｈｎｉｃａｌ＿ｒｅｐｏｒｔ／ｄｉｓｔ１０．ｔａｒ．ｇｚ＞
【非特許文献２】
インターネット＜ＵＲＬ：ｈｔｔｐ：／／ｗｗｗ．ｍａｒｓ．ｏｒｇ／ｈｏｍｅ／ｒｏｂ／ｐｒｏｊ／ｍｐｅｇ／＞
【０００７】
【発明が解決しようとする課題】
しかしながら、ＭＰ３では高音質高圧縮を実現するために非常に多くの計算を必要とする。また、特にデコーダではリアルタイムでの動作を必ず要求される。ソフトウェアによってＭＰ３のエンコーダやデコーダを実現する場合、非常に高速なプロセッサ必要となる。高速なプロセッサは消費電力が非常に大きく、高価である。一方、ＭＰ３のエンコーダやデコーダを専用のハードウェアとして実現する場合、ソフトウェアで実現する場合と比較すれば、必要な処理速度を満足しつつ、回路サイズの縮小化および低消費電力化を図ることはできるが、ポータブルタイプのＭＰ３機器等においては更なる小型化、低消費電力化が渇望されている。ＭＰ３のエンコーダやデコーダのハードウェア化にあたっては、ＩＭＤＣＴ処理が特に演算量が多く、更なる小型化、低消費電力化を実現する際の隘路になっている。従来のＩＭＤＣＴ処理ではＣＯＳテーブルの全範囲のデータを使用して乗算および加算を行っているので、演算量が非常に多い。
【０００８】
そこで、本発明は、ＩＭＤＣＴ演算を行いながらも小型化及び低消費電力化を図ることができるフィルタバンク回路を提供することを目的とする。
【０００９】
【課題を解決するための手段】
かかる目的を達成するため、本願発明者が鋭意検討したところ、ＣＯＳテーブルには周期性があるため、その基本パターンのデータに対する演算結果をそのまま若しくは符号を反転させることにより他の周期的な部分の演算結果として利用可能なことを見出した。
【００１０】
かかる知見に基づく請求項１記載の発明は、ＩＭＤＣＴモジュールと、ＣＯＳテーブルとを備えるフィルタバンク回路において、ＩＭＤＣＴモジュールは、ＣＯＳテーブルを利用して演算を行う際に、ＣＯＳテーブルの一部の値に対する演算結果を、その一部の値に対して周期性を有する関係にある他部の値の演算結果としてそのまま若しくは符号を反転させて使用するようにしている。
【００１１】
したがって、ＣＯＳテーブルの一部の値に対する演算結果を他部の値の演算結果としてそのまま若しくは符号を反転させて利用しているので、従来のようにＣＯＳテーブルの全範囲のデータを使用して乗算および加算を行う場合に比べてＩＭＤＣＴ処理における重複演算を削減することができる。
【００１２】
また、本願発明者が、浮動小数点形式の仮数部のビット数を変化させて、当該変化させた各浮動小数点形式での演算に基づく音質を調べたところ、従来用いられている浮動小数点形式の仮数部のビット数よりも遥かに少ないビット数であっても、要求仕様を満足する音質を得られることを知見するに至った。
【００１３】
かかる知見に基づく請求項２記載の発明は、ＩＭＤＣＴモジュールを備えるフィルタバンク回路において、ＩＭＤＣＴモジュールは浮動小数点演算器を備え、浮動小数点演算器における仮数部のビット数は、要求音質を満足する最小ビット数または当該最小ビット数に余裕分のビット数を加えたビット数に、定められるものとしている。したがって、このように仮数部のビット数を決定することによって、要求音質を満足する範囲で、仮数部のビット数を従来の浮動小数点形式よりも削減することができる。
【００１４】
また、請求項３記載の発明は、請求項２記載のフィルタバンク回路において、浮動小数点演算形式が、符号部１ビット、指数部５ビット、仮数部１０ビットで構成されるものとしている。したがって、要求音質を満足させつつ、浮動小数点演算形式のビット数を従来よりも削減することができる。
【００１５】
また、請求項４記載の発明は、請求項２または３記載のフィルタバンク回路において、ＣＯＳテーブルを更に備え、ＩＭＤＣＴモジュールは、ＣＯＳテーブルを利用して演算を行う際に、ＣＯＳテーブルの一部の値に対する演算結果を、一部の値に対して周期性を有する関係にある他部の値の演算結果としてそのまま若しくは符号を反転させて使用するようにしている。したがって、従来のようにＣＯＳテーブルの全範囲のデータを使用して乗算および加算を行う場合に比べてＩＭＤＣＴ処理における重複演算を削減することができる。
【００１６】
また、請求項５記載の発明は、ＩＭＤＣＴモジュールを備えるフィルタバンク回路において、ＩＭＤＣＴモジュールは浮動小数点演算器を備え、浮動小数点演算における仮数部の演算後に仮数部の上位桁が０となり正規化のために仮数部を２ビット以上シフトする場合に、シフト前における仮数部の最下位ビットの１つ下位のビットが１となり、更に下位のビットが０となるように、シフトを行なうようにしている。この場合、仮数部の正規化に起因する最大誤差の大きさを半分にできる。
【００１７】
また、請求項６記載の発明は、請求項２から４のいずれかに記載のフィルタバンク回路において、浮動小数点演算における仮数部の演算後に仮数部の上位桁が０となり正規化のために仮数部を２ビット以上シフトする場合に、シフト前における仮数部の最下位ビットの１つ下位のビットが１となり、更に下位のビットが０となるように、シフトを行なう行なうようにしている。この場合、浮動小数点演算形式のビット数を従来よりも削減しつつ、仮数部の正規化に起因する最大誤差の大きさを半分にできる。
【００１８】
また、請求項７記載の発明は、請求項５または６記載のフィルタバンク回路において、浮動小数点演算における仮数部の演算後に仮数部の上位桁が０となり正規化のために仮数部を１ビットシフトする場合に、シフト前における仮数部の最下位ビットを反転したビットを、シフト後における仮数部の最下位ビットに挿入するようにしている。この場合、正規化補正値をある程度散らばらせることができ、全体としての演算誤差の累積を低減できる。
【００１９】
また、請求項８記載のフィルタバンク回路の設計支援方法は、ＩＭＤＣＴモジュールを備えるフィルタバンク回路において、ＩＭＤＣＴモジュールは浮動小数点演算器を備え、浮動小数点演算器における仮数部のビット数を任意のビット数に制限すると共に、当該制限したビット数に対応する音質の評価値を求めて、要求音質を満足する最小ビット数を求めるようにしている。したがって、当該最小ビット数に基づいて仮数部のビット数を決定することによって、要求音質を満足する範囲で、仮数部のビット数を従来の浮動小数点形式よりも削減することができる。
【００２０】
【発明の実施の形態】
以下、本発明の構成を図面に示す最良の形態に基づいて詳細に説明する。
【００２１】
図１〜図５に本発明に係るフィルタバンク回路１の実施形態の一例を示している。このフィルタバンク回路１は、ＩＭＤＣＴモジュール２と、ＣＯＳテーブル３とを備えるものである。そして、ＩＭＤＣＴモジュール２は、ＣＯＳテーブル３を利用して演算を行う際に、ＣＯＳテーブル３の一部の値に対する演算結果を、その一部の値に対して周期性を有する関係にある他部の値の演算結果としてそのまま若しくは符号を反転させて使用するようにしている。このため、従来のようにＣＯＳテーブル３の全範囲のデータを使用して乗算および加算を行う場合に比べて、ＩＭＤＣＴ処理における重複演算を削減して、回路サイズを縮小化でき消費電力を小さくすることができる。
【００２２】
本実施形態のフィルタバンク回路１は、上述したＩＭＤＣＴモジュール２と、ＣＯＳテーブル３を含む係数テーブル４と、入力メモリ５と、出力メモリ６とを備えている。各部の構成の一例を表１に示す。
【表１】

【００２３】
例えば本実施形態では、フィルタバンク回路１をＭＰ３用デコーダのＩＭＤＣＴ処理用回路として使用している。但し、フィルタバンク回路１はＭＰ３用エンコーダに使用しても良い。エンコーダも定数が異なるのみでデコーダと同一のフィルタバンク回路１を利用することができる。
【００２４】
ＭＰ３のデコードはグラニュール単位で行なわれる。１グラニュールは、１８サンプルから成る３２個のサブバンドから構成されている。ＩＭＤＣＴ処理は、基本的に各サブバンド単位で処理が行なわれる。ＩＭＤＣＴ処理は、図１５に示すように、大きくＩＤＣＴ、窓関数処理、オーバーラップ　アディションの３つの部分に分けることができる。
【００２５】
ＩＭＤＣＴ処理で必要になる三角関数の演算については、全て係数テーブル４に格納した値を用いる。係数テーブル４はＣＯＳテーブル３とＳＩＮテーブル７とを有する。各テーブル３，７は、事前に計算を行い配列に格納している。
【００２６】
サブバンドがロングブロックのときに用いられるＣＯＳテーブル３は数式１のように定義して配列に格納されている。
【数１】

【００２７】
サブバンドがショートブロックのときに用いられるＣＯＳテーブル３は数式２のように定義して配列に格納されている。
【数２】

【００２８】
実際にメモリに載せるときは、各ｉ毎に別々のメモリに格納されている。
【００２９】
そして、窓関数処理（ＷＩＮＤＯＷ処理）に用いられるＳＩＮテーブル７は、ブロックタイプにより全く異なるものとなっている。よって、数式３に示すように、各ブロックタイプ毎に配列を作成し格納している。
【数３】
ＳＩＮ［ブロックタイプ］［ｉ］
【００３０】
ＳＩＮテーブル７は、ＣＯＳテーブル３と違って処理には何も変更が無いので、ｉに対する値をそのまま格納している。
【００３１】
ＩＭＤＣＴモジュール２は、ステート（ハードウェアの回路の状態）の遷移でＩＭＤＣＴ処理を行うものとしている。そのため、パイプライン化などのような高速化には向かないものの、複数のステートでリソースを共有できるので、回路規模の点で有利になる。また、ＩＭＤＣＴ処理は、処理速度に十分余裕がある場合には入出力メモリ６を読み書きする間は止めるようにしても良い。
【００３２】
ＩＭＤＣＴ処理のおおまかな状態遷移を図２に示す。このフィルタバンク回路１では、基本的に各サブバンド単位でＩＭＤＣＴ処理を行うようにしている。まず、サブバンドセレクトで一つのサブバンドを選択し（ステップ１）、ブロックタイプがロングブロックかショートブロックかを判断する（ステップ２）。ロングブロックであればＤＣＴ１８（ステップ３）、ショートブロックであればＤＣＴ１２（ステップ４）の処理をそれぞれ行う。
【００３３】
ここで、ＤＣＴ１８およびＤＣＴ１２の処理について説明する。ＩＭＤＣＴモジュール２では、最初の処理として入力に対してＣＯＳテーブル３の乗算を行う。例えばロングブロックの場合は、数式４に示す計算を行う。尚、Ｎ［ｋ］は、ハフマン復号（逆量子化後の入力データビット流）を表す。
【数４】

【００３４】
ここで、このロングブロックのＣＯＳテーブル３には数式５に示す周期性がある。
【数５】

【００３５】
この周期性からＭＩＤ［ｉ］には数式６のような関係がある。
【数６】

【００３６】
この関係より、ｉ＝９〜２６の間でのＭＩＤ［ｉ］さえ求めれば、残りはコピーか符号の反転とコピーで同じ処理ができる。これにより、乗算および加算ともに５０％ずつ削減できる。また、ＣＯＳテーブル３のサイズは半分で済む。よって、演算量およびテーブルの削減を図ることができるので、回路サイズの縮小化および消費電力の低減を図ることができる。
【００３７】
ショートブロックの場合も同様の関係が成り立つ。このＭＩＤ［ｉ］は数式７のようになる。
【数７】

【００３８】
よって、この場合も乗算および加算をやはり５０％ずつ削減でき、テーブルサイズを半分にすることができる。
【００３９】
ＤＣＴ１８の状態遷移を図３に示す。これは、ロングブロックのときに行われるＩＤＣＴおよび窓関数処理（ＷＩＮＤＯＷ処理）の部分である。最初に、出力の９から２６番目に当たる部分のＩＤＣＴ処理のみを行う。これは、図３のテーブルセレクト（ステップ６）、乗算加算（ステップ７）、メモリへの格納（ＭＥＭ　ＳＴ）（ステップ８）のループに相当する。ＣＯＳテーブル３は、通常３６×１８で構成されるが、本発明ではテーブルを半減できることから１８×１８で十分となっている。このループで、入力とＣＯＳ係数の乗算および総和が行われる。この結果は、出力メモリ６に格納される。この出力メモリ６は出力にも使われるが、この処理中では一時作業用（テンポラリ）としても用いる。これにより、メモリが節約されている。
【００４０】
この段階では、出力全体の半分しか求まっていない。しかし、上述した通りコピーで代用可能であるので、まだ値の入っていない出力メモリ６のフィールドヘコピーしていく（ステップ９）。このとき、出力の前半部分にコピーするときは符号の反転が必要になる。最後に窓関数処理（ＷＩＮＤＯＷ処理）が行なわれる（ステップ１０）。ここでは特に特殊なことは行っていない。
【００４１】
また、ＤＣＴ１２の状態遷移を図４に示す。これは、ショートブロック用のＩＤＣＴであり、多少の違いはあるものの基本的にはＤＣＴ１８と同じ構成が取られている。
【００４２】
まず、ショートブロックではサブバンドを更に３つに分けて扱う。ここでは、この範囲をサブサブバンド（ＳＳＢ）と呼ぶ。本実施形態では、入力の読み込み方でＳＳＢの切替えを行っている。具体的には、次のような処理を行なう。即ち、入力データ流は、ＳＳＢ０，ＳＳＢ１，ＳＳＢ２，ＳＳＢ０，ＳＳＢ１，ＳＳＢ２，…の順で並んでいる。そこで、この入力データ流を格納したバッファをそのまま利用して後続の処理を行う。例えば、ＳＳＢ１のデータを取得するには、ＳＳＢ０のデータアドレスにデータ幅を加えて取り出す。ＳＳＢ２のデータを取得するには、ＳＳＢ０のデータアドレスにデータ幅の二倍の値を加えて取り出す。このような処理を行なうことにより、バッファの内容を他のメモリ領域にコピーして並べ替えたりする場合よりも、メモリの必要量が少なくなる。これにより、無駄な記憶素子を使わないようにしている。
【００４３】
ＳＳＢが選択されると（ステップ１１）、出力の３から８番目に当たる部分のＩＤＣＴ処理のみを行う。これは、図４のテーブルセレクト（ステップ１２）、乗算加算（ステップ１３）、メモリへの格納（ＭＥＭ　ＳＴ）（ステップ１４）のループに相当する。この結果を出力メモリ６に格納しておくが、このときメモリに図５のようにデータを書き込む。これは後の加算処理（ステップ１７）や窓関数処理（ＷＩＮＤＯＷ処理）などでの手軽さを考慮してのことである。また、まだ値の入っていない出力メモリ６のフィールドヘもコピーしていく（ステップ１５）。この後、各ＳＳＢ毎に窓関数処理（ＷＩＮＤＯＷ処理）が行なわれる（ステップ１６）。
【００４４】
全てのＳＳＢのＩＤＣＴが終ると、図４の加算処理が行われる（ステップ１７）。このとき、ＳＳＢ０、ＳＳＢ２は動かさずに、ＳＳＢ１の値を対応するところに足し加えていく。加算が終了すると、ＳＳＢ１のあったところにゼロが書き込まれる。
【００４５】
以上のようにしてＤＣＴステートから抜けると、図２に示すように一つ前のグラニュールのデータと重ね合わせるオーバーラップアディションを行う（ステップ５）。この処理はブロックタイプには依存しないので、流れが一つにまとまる。一つのサブバンドの処理としてはこれで完結する。あとは、１グラニュール分のサブバンドだけこの処理を繰り返す。すべてのサブバンドを処理し終えると、ＩＭＤＣＴモジュール２の処理は終了する。
【００４６】
また、本実施形態のフィルタバンク回路１のＩＭＤＣＴモジュール２は、浮動小数点演算器８を備えている。そして、浮動小数点演算器８における仮数部のビット数は、要求音質を満足する最小ビット数、または当該最小ビット数に余裕分のビット数を加えたビット数に、設定するようにしている。仮数部のビット数を減少させれば、浮動小数点演算の精度は落ちるが、演算速度は高まり、必要な回路サイズは縮小化でき、消費電力も低減できる。したがって、必要とされる演算精度が定められているならば、その精度を満たす、最も省コストなデータフォーマットを採用することが望ましい。そこで、例えば本実施形態では、要求音質を満足する最小ビット数を求めるために、浮動小数点演算器８における仮数部のビット数を任意のビット数に制限して、当該制限したビット数に対応する音質の評価値を求めるようにしている。
【００４７】
浮動小数点演算器８における仮数部のビット数を任意のビット数に制限する方法としては、実際に仮数部のビット数を制限したハードウェアを作製する方法、既存のハードウェアを用いてソフトウェアでの処理により仮数部のビット数を制限するシミュレーションを行なう方法、が考えられるが、後者の方が前者に比較して一般に安価且つ簡易であり好ましい。例えば、本実施形態では、ベースとなるＭＰ３デコード環境として、京都大学の提供する協調シミュレーション環境（第１９回パルテノン研究会資料集　ＡＳＩＣデザインコンテスト規定課題　ＭＰ３デコーダのＩＭＤＣＴ処理のハードウェア化にアナウンスされ、コンテストＷＥＢサイトで配布方法が指示されるＭＰ３デコーダＨＷ／ＳＷ協調検証環境である。参考ＵＲＬは、ｈｔｔｐ：／／ｗｗｗ．ｋｅｃｌ．ｎｔｔ．ｃｏ．ｊｐ／ｐａｒｔｈｅｎｏｎ／ｈｔｍｌ／ｃｏｎｔｅｓｔ／ｍｐ３＿０１／ｉｎｄｅｘ．ｈｔｍｌ）を用いる。このシミュレーション環境は、論理シミュレータＳＥＣＯＮＤＳと、Ｃ言語で用いるライブラリｒｕｎｓｅｃｏｎｄｓとを用いて、ハードウェアとソフトウェアを協調動作させることのできるものである。例えばこのシミュレーション環境内で、全ての浮動少数点演算の直後に、以下に説明する処理を施すことによって、仮数部のビット数を任意のＮビットに制限することができる。
【００４８】
浮動小数点数を、Ｃ言語で用いられるｆｒｅｘｐ関数を用いて、仮数部と指数部とに分離する。このとき、仮数部は０．５以上１．０未満に正規化される。そして、当該分離された仮数部に対して数式８から数式１０に示す演算を行い、Ｎビットに制限した仮数部を求める。ここで、浮動小数点数をｆとし、仮数部をｆ_ｆとし、指数部をｆ_ｅとし、制限するビット数をＮとし、Ｎビットに制限した仮数部をｆ_ｆ’’’とする。上記の処理は、正負毎にそれぞれ行なう。上記の処理を全ての浮動小数点演算の直後に施すことによって、ほぼ正確に仮数部のビット数をＮビットに制限することができる。尚、数式８および数式９中の「ｉｎｔ（　）」とは、（　）内の値を整数型に変換する関数である。また、数式９中の「＆」は論理積演算を表し、「０ｘ０００１」は論理値の１を表す。数式９中の「＆（０ｘ０００１）」の演算は、整数型に変換するときに四捨五入をするように有効桁の一つ下の桁に論理値の１を足すものである。
【００４９】
【数８】
ｆ_ｆ’＝ｉｎｔ（ｆ_ｆ×２^Ｎ＋１）
【数９】
ｆ_ｆ”＝ｉｎｔ（ｆ_ｆ×２^Ｎ＋２）＆（０ｘ０００１）
【数１０】
ｆ_ｆ’’’＝ｆ_ｆ’＋ｆ_ｆ”
【００５０】
そして、仮数部のビット数Ｎを様々に変化させた場合の各演算結果に基づく出力データのそれぞれについて、音質の評価値を求める。音質評価方法としては、例えば、ＭＰ３デコーダの規格に定められているＣｏｍｐｌｉａｎｃｅ　Ｔｅｓｔを用いる。Ｃｏｍｐｌｉａｎｃｅ　Ｔｅｓｔは、リファレンスのデコードデータと、設計したデコーダの出力とを、数式１１を用いて比較するものである。
【００５１】
【数１１】

ここで、
Ｎ：サンプル数
ｔ_ｉ：リファレンスデータ
ｒ_ｉ：デコードデータ
【００５２】
リファレンスデータｔ_ｉ，デコードデータｒ_ｉは−１．０から＋１．０の間で振幅を取るように調整される。数式１１により得られる評価値がＭＰ３デコーダの性能を示す。この値は表２に示す３段階に分けられる。ＭＰ３デコーダの要求仕様は、Ｌｉｍｉｔｅｄ　Ａｃｃｕｒａｃｙ以上の音質である。
【表２】

【００５３】
仮数部のビット数Ｎと、ビット数Ｎに対応する音質の評価値との関係を、図１６に示す。ＭＰ３デコーダの要求仕様であるＬｉｍｉｔｅｄ　Ａｃｃｕｒａｃｙ以上の音質を満たすには、ＩＭＤＣＴ処理における浮動小数点演算の仮数部には、９ビット以上が必要であることが分かる。また、仮数部が１５ビットを越えるあたりからは、音質の評価値が飽和することが分かる。これは、ＩＭＤＣＴ処理以外のＭＰ３の演算処理（例えば、逆量子化処理、アンチエイリアス処理など）における誤差が、ＩＭＤＣＴ処理の精度に対し充分でないためと考えられる。
【００５４】
浮動小数点演算器８における仮数部のビット数は、要求音質を満足する最小ビット数である９ビットであっても良いが、最小ビット数である９ビットに、余裕分のビット数を加えたビット数に設定することがより好ましい。ビット数に余裕を持たせることにより、演算器の計算誤差による精度不足を補うことができる。ここで、メモリのビット幅は８の倍数が多いため、ＣＰＵの構造も８の倍数で数値を取り扱うことが多い。そこで、浮動小数点形式のデータフォーマット全体でのビット数が８の倍数となるように、余裕分のビット数を決定することが好ましい。また、仮数部は暗黙の１を有するようにしても良い。即ち、仮数部は、数値「１．Ｍ」の「Ｍ」を表現するようにし、仮数部の最上位桁は必ず１であることを保証する。暗黙の１を有することにより、仮数部のビット数は事実上１ビット増える。例えば本実施形態では、余裕分として１ビットを加えて仮数部を１０ビットとすると共に、仮数部は暗黙の１を有するようにしている。したがって、仮数部は、事実上１１ビットの精度を有し、要求音質を満足する最小ビット数である９ビットに対して、２ビットの余裕を有する。
【００５５】
また、本願発明者等が種々実験した結果、ＭＰ３でエンコードした１００曲程度の音楽データや音声データを統計的に処理したデータ流を集計したところ、指数部は、０から−４１の間に分布することが知見された。また、本願発明者等が種々実験した結果、指数部は０から−３１の範囲をカバーすれば、ＭＰ３デコーダの要求仕様であるＬｉｍｉｔｅｄ　Ａｃｃｕｒａｃｙ以上の音質を満たすことが知見された。指数部が０から−３１の間に分布するデータ以外のデータに関しては、飽和演算を行うことで、多少の誤差は出ても聴覚上は区別できない程度のものと考えられる。ここでいう飽和演算とは、予め定めた最大値（例えば２の０乗）を超える大きなデータは全て当該最大値（例えば２の０乗）として扱い、予め定めた最小値（例えば２の−３１乗）を下回る小さなデータは全て当該最小値（例えば２の−３１乗）とみなす演算のことを意味する。そこで、本実施形態では、指数部を５ビットとし、指数の絶対値を取ったものを指数部に格納する。
【００５６】
本実施形態における浮動小数点形式のデータフォーマットの例を表３に示す。
【表３】

【００５７】
【実施例】
上述したフィルタバンク回路１全体をハードウェア記述言語の一つであるＳＦＬ（Ｓｔｒｕｃｔｕｒｅｄ　Ｆｕｎｃｔｉｏｎ　ｄｅｓｃｒｉｐｔｉｏｎ　Ｌａｎｇｕａｇｅ）を用いて記述した。尚、本実施例では、論理合成プログラムによるハードウェアの生成までは行なわず、シミュレーションによってフィルタバンク回路１の動作確認を行なった。ここで、ＩＭＤＣＴモジュール２内で使われる浮動小数点演算器８は、できるだけ回路を小さくするため、浮動小数点加算回路と浮動小数点乗算回路において演算処理に必要なモジュールを共有して用いるように設計した。演算回路の大きな部分を占める入出力レジスタや加算器等を共有するため、乗算器と加算器を統合した。この浮動小数点演算器８の構成の一例を図６に示す。この浮動小数点演算器８の主な構成要素は、Ｉ／Ｏレジスタ９として１７ビットレジスタを３つ、１１ビット加算器１０を１つ、１１ビットシフタ１１を１つ、４ビットインクリメンタ１２を１つとした。
【００５８】
これらの構成要素を共有して用いることにより、余分なゲートを消費しないようにした。もちろん浮動小数点の乗加算を行うので、これだけのリーソスでは１クロックで計算することはできない。この浮動小数点演算器８では、乗算に約１４クロック、加算に約６クロックを要する。この処理の流れを図７に示す。
【００５９】
この浮動小数点演算器８の合成結果は、回路サイズを表すゲート数が２６０１であるのに対して遅延時間が１９．７７ｎｓであった。ここでは、入力遅延は考えていない。したがって、従来の加算乗算回路と比べると、ゲート数は削減できたと考えられる。
【００６０】
表４に使用しているメモリを示す。尚、「ｒ６４＿８」とは、ＳＦＬ言語処理系が標準で用意しているメモリのライブラリであり、６４ワード、８ビット幅のメモリブロックを表している。
【表４】

【００６１】
データを１６ビットで構成しているので、８ビットのメモリを組み合わせて使用している。使われていない領域が多くなっているが、それはこれ以上小さなメモリが無かったためであり、実際にはもっと少なく済むと考えられる。また、実装時に無駄な部分を切れば済む。
【００６２】
本設計では、回路でゲート数が増えるより、メモリの方がコストがかからないという考えに基づき設計を行っているので、メモリサイズの縮小はさほど行わなかった。
【００６３】
入力メモリ５および出力メモリ６の構成を図８に示す。また、入力メモリ５の利用方法を図９に示す。入力データをＩＮ［ｓｂ］［ｓｓ］と表したとき、各サブバンドＩＮ［ｓｂ］毎に別々のメモリを用いるようにした。入力メモリ５では１８エントリのみ利用する。エントリは一つのアドレスで示されるデータ記録領域を指しており、即ち、入力メモリ５では、０番地（二進数では００００００）から１７番地（二進数では０１０００１）を利用する。また、出力メモリ６の利用方法を図１０に示す。出力データをＯＵＴ［ｓｂ］［ｓｓ］と表したとき、各サブバンドＯＵＴ［ｓｂ］毎に別々のメモリを用いるようにした。さらに、メモリ領域の前半は出力とワークとして、後半の４６番地（二進数では１０１１１０）から６３番地（二進数では１１１１１１）はオーバーラップアディションの記憶領域とした。
【００６４】
さらに、ＣＯＳテーブルメモリ１３の構成を図１１に示す。ここでは、数式６で示したように最低限必要なだけのテーブル、すなわち９〜２６までのみ用意している。また、ＣＯＳテーブルメモリ１３の利用方法を図１２に示す。テーブルをＣＯＳ［ｉ］［ｋ］と表すと、ＩＤＣＴ時に使われる単位であるＣＯＳ［ｉ］毎に、別々のメモリを割り当てている。尚、ロング用ＣＯＳテーブルと、ショート用ＣＯＳテーブルとは、別々のメモリに格納しているが、格納の形式はｋの範囲が異なるだけで同一である。
【００６５】
そして、ＳＩＮテーブルメモリ１４の構成を図１３に、ＳＩＮテーブルメモリ１４の利用方法を図１４に示す。テーブルをＳＩＮ［ブロックタイプ］［ｋ］と表すと、ＳＩＮ［ブロックタイプ］毎に別々のメモリを割り当てている。
【００６６】
以上のように　ＳＦＬで記述したフィルタバンク回路１の合成結果を表５に示す。但し、入力遅延は考えない。尚、表５中の回路サイズはゲート数で表している。
【表５】

【００６７】
また、ＳＦＬで記述したフィルタバンク回路１について、上述した京都大学の提供する協調シミュレーション環境を用いて、シミュレーションを行い、その結果をｃｏｍｐｌｉａｎｃｅ　ｔｅｓｔによって評価した。結果を表６に示す。
【表６】

【００６８】
この結果より、フィルタバンク回路１はｌｉｍｉｔｅｄ　ａｃｃｕｒａｃｙを十分に満たしていることが分かった。
【００６９】
さらに、フィルタバンク回路１のｃｏｍｐｌｉａｎｃｅ　ｔｅｓｔを行ったときのクロック数を表７に示す。
【表７】

【００７０】
４８ｋＨｚステレオの信号には、１秒間に約１６６グラニュール分のデータが含まれる。よって、約６ｍｓ以内に１グラニュールを処理しなければならない。今回設計した回路では、１グラニュールの処理に表７に示すように、１１２０３４クロックを要する。また、このクロック数には入出力メモリ５，６の分のクロックが入っていない。これを加えると、１グラニュールの処理には、約１２００００クロック程かかると考えられる。
【００７１】
よって、この回路は１秒間に約４８０グラニュール分のデータを処理できることになる。この性能は４８ｋＨｚステレオの信号を処理するのに十分な性能である。
【００７２】
この値はオーバースペックであるものの、消費電力の点から考えると非常に有効である。消費電力には動作周波数が効いて来る。この回路を４８ｋＨｚのデータで用いるには２０ＭＨｚ程度で十分であるので、無駄な電力消費が避けられると考えられる。
【００７３】
なお、上述の形態は本発明の好適な形態の一例ではあるがこれに限定されるものではなく本発明の要旨を逸脱しない範囲において種々変形実施可能である。例えば、本実施形態ではフィルタバンク回路１をＭＰ３用デコーダに使用しているが、これには限られずエンコーダに使用しても良い。
【００７４】
また、上述のフィルタバンク回路１は、ＣＯＳテーブル３の周期性を利用して演算量を削減する特徴点と、浮動小数点形式のビット数を削減する特徴点との双方を備えるものであったが、いずれか一方の特徴点のみを備えるものであっても良い。
【００７５】
また、桁落ち（絶対値が近い正数と負数の加算によって有効桁数が減少すること）による誤差を低減するために、浮動小数点演算において、次のような処理を行なうようにしても良い。即ち、仮数部の加算等の演算後に仮数部の上位桁が０となり正規化のために仮数部を左シフトする場合であって、２ビット以上の左シフトを行なう場合には、当該左シフト前における仮数部の最下位ビット（ＬＳＢ）の１つ下位のビットを’１’とし、更に下位のビットを’０’として、当該左シフトを行なうようにしても良い。換言すれば、２ビット以上の左シフトを行なうにあたり、１ビット目のシフト時には、最下位ビットに’１’を挿入し、以後のシフト時には、最下位ビットに’０’を挿入する。これにより、最大誤差の大きさを半分にでき、左シフト時にすべて’０’を挿入する従来の方法よりも、期待誤差を小さくできる。例えば、表３に示す浮動小数点形式で表現される二つの数値があって、一方が正であり、他方が負であるとする。そして、当該一方の仮数部が「１０１０１０１０１１」であり、当該他方の仮数部が「１０１０１０１０１０」であるとする。この二つの数値の指数部の値は等しいとする。この二つの数値の仮数部の加算結果は「０００００００００１」となる。左シフト時にすべて’０’を挿入する従来の方法で仮数部の正規化の処理を行うと、仮数部に暗黙の１があるため、仮数部は「００００００００００」となり、指数部から１０が減算される。しかし、本来の仮数部の値は「００００００００００」から「１１１１１１１１１１」の間にあるはずであり、このままでは最大誤差が「１１１１１１１１１１」となってしまう。そこで、仮数部の加算後に仮数部の上位桁が０となり正規化のために仮数部を２ビット以上左シフトする場合に、当該左シフト前における仮数部の最下位ビットの１つ下位のビットを’１’とし、更に下位のビットを’０’として、当該左シフトを行なう。これにより、正規化後の仮数部は「１０００００００００」となる。従って、最大誤差の大きさを半分にできる。
【００７６】
また、仮数部の加算等の演算後に仮数部の上位桁が０となり正規化のために仮数部を左シフトする場合であって、１ビットの左シフトを行なう場合には、当該左シフト前における仮数部の最下位ビット（ＬＳＢ）を反転したビットを、当該左シフト後における仮数部の最下位ビットに挿入するようにしても良い。この場合、正規化補正値をある程度散らばらせることができ、全体としての演算誤差の累積が低減することが期待できる。例えば、表３に示す浮動小数点形式で表現される二つの数値があって、一方が正であり、他方が負であるとする。そして、当該一方の仮数部が「１１１０１０１０１１」であり、当該他方の仮数部が「０１１０１０１０１０」であるとする。この二つの数値の指数部の値は等しいとする。この二つの数値の仮数部の加算結果は「１００００００００１」となり、仮数部の暗黙の１を考慮すると、正規化のために１ビットの左シフトが必要になる。正規化後の仮数部は、「００００００００１０」から「００００００００１１」の間の数が正しい数値となる。ここで、左シフト時に必ず’１’または’０’のどちらかを挿入することは、出力データの偏りをもたらし、好ましくないと思われる。そこで、当該左シフト前における仮数部の最下位ビットを反転したビットを、当該左シフト後における仮数部の最下位ビットに挿入することで、正規化補正値をある程度散らばらせることができる。これにより、全体としての演算誤差の累積が低減することが期待できる。
【００７７】
尚、上述した仮数部シフト時の処理は、上述の実施形態で説明したＣＯＳテーブル３の周期性を利用して演算量を削減する特徴点と、浮動小数点形式のビット数を削減する特徴点の一方または双方と組み合わせて行なっても良いが、浮動小数点演算器を備えるフィルタバンク回路において、上述した仮数部シフト時の処理だけを独立して行なうようにしても良い。
【００７８】
【発明の効果】
以上説明したように、請求項１記載のフィルタバンク回路によれば、ＣＯＳテーブルの一部の値に対する演算結果を他部の値の演算結果としてそのまま若しくは符号を反転させて利用しているので、従来のようにＣＯＳテーブルの全範囲のデータを使用して乗算および加算を行う場合に比べてＩＭＤＣＴ処理における重複演算を削減して、回路サイズ及び消費電力を小さくすることができる。
【００７９】
請求項２及び３記載のフィルタバンク回路および請求項５記載のフィルタバンク回路の設計支援方法によれば、要求音質を満足する範囲で、仮数部のビット数を従来の浮動小数点形式よりも削減することができ、回路サイズ及び消費電力を小さくすることができる。
【００８０】
請求項５及び６記載のフィルタバンク回路によれば、仮数部の正規化に起因する最大誤差の大きさを半分にできる。
【００８１】
請求項７記載のフィルタバンク回路によれば、正規化補正値をある程度散らばらせることができ、全体としての演算誤差の累積を低減できる。
【００８２】
請求項４記載のフィルタバンク回路によれば、ＣＯＳテーブルの全範囲のデータを使用して乗算および加算を行う場合に比べてＩＭＤＣＴ処理における重複演算を削減でき、且つ、仮数部のビット数を従来の浮動小数点形式よりも削減でき、更なる回路サイズの縮小化及び低消費電力化を実現できる。
【図面の簡単な説明】
【図１】本発明に係るフィルタバンク回路の一例を示すブロック図である。
【図２】ＩＭＤＣＴモジュールでの状態遷移の一例を示すフローチャートである。
【図３】ロングブロックの処理の状態遷移の一例を示すフローチャートである。
【図４】ショートブロックの処理の状態遷移の一例を示すフローチャートである。
【図５】ショートブロックの処理におけるメモリへのデータの書き込み方法の一例を示す図である。
【図６】浮動小数点演算器の一例を示すブロック図である。
【図７】浮動小数点演算器での乗算および加算の処理の一例を示すフローチャートである。
【図８】入力メモリおよび出力メモリの構成の一例を示す図である。
【図９】入力メモリの利用方法の一例を示す図である。
【図１０】出力メモリの利用方法の一例を示す図である。
【図１１】ＣＯＳテーブルメモリの構成の一例を示す図である。
【図１２】ＣＯＳテーブルメモリの利用方法の一例を示す図である。
【図１３】ＳＩＮテーブルメモリの構成の一例を示す図である。
【図１４】ＳＩＮテーブルメモリの利用方法の一例を示す図である。
【図１５】ＩＭＤＣＴ処理の一例を示すフローチャートである。
【図１６】仮数部のビット数Ｎと、ビット数Ｎに対応する音質の評価値との関係の一例を示すグラフである。
【符号の説明】
１　フィルタバンク回路
２　ＩＭＤＣＴモジュール
３　ＣＯＳテーブル
８　浮動小数点演算器
[0001]
[Technical Field of the Invention]
The present invention relates to a filter bank circuit and a design support method thereof. More specifically, the present invention relates to a filter bank circuit suitable for an MP3 decoder and an encoder and a design support method thereof.
[0002]
[Prior art]
One of the audio compression techniques that have become widespread in recent years is MP3, defined as Layer 3 of MPEG-1 Audio (ISO/IEC 11172-3). MP3 has many advantages; for example, it can compress audio to one tenth or less of the original PCM data while retaining very high sound quality.
[0003]
When a personal computer functions as an MP3 encoder or decoder, 64-bit floating point arithmetic is generally performed during encoding and decoding processing (see Non-Patent Document 1). The floating-point format in this case is composed of one bit for the sign part, 11 bits for the exponent part, and 52 bits for the mantissa part.
[0004]
It has also been reported that dedicated hardware functioning as an MP3 decoder was built and that 32-bit integer arithmetic was used for the decoding process (see Non-Patent Document 2).
[0005]
Further, some portable MP3 players include, as part of their decoder or encoder, a filter bank circuit that performs the IMDCT (Inverse Modified Discrete Cosine Transform) operation. The IMDCT operation compresses music data, or restores compressed data, by a method based on the discrete cosine transform. Because the discrete cosine transform requires complicated cosine calculations, a COS table is used to reduce the amount of calculation (see Non-Patent Document 1).
[0006]
[Non-patent document 1]
Internet <URL: ftp://ftp.unt.uni-hannover.de/pub/MPEG/audio/mpeg2/software/technical_report/dist10.tar.gz>
[Non-patent document 2]
Internet <URL: http://www.mars.org/home/rob/proj/mpeg/>
[0007]
[Problems to be solved by the invention]
However, MP3 requires a very large number of calculations to achieve high sound quality and high compression. A decoder in particular must always operate in real time. When an MP3 encoder or decoder is implemented in software, a very fast processor is required, and fast processors consume a great deal of power and are expensive. When an MP3 encoder or decoder is instead implemented as dedicated hardware, the circuit size and power consumption can be reduced relative to a software implementation while still meeting the required processing speed; even so, further miniaturization and lower power consumption are strongly desired for portable MP3 devices and the like. In implementing MP3 encoders and decoders in hardware, the IMDCT processing involves a particularly large amount of computation and is the bottleneck to further miniaturization and lower power consumption. Conventional IMDCT processing performs multiplication and addition using the data of the entire range of the COS table, so the amount of computation is very large.
[0008]
Therefore, an object of the present invention is to provide a filter bank circuit that can achieve downsizing and low power consumption while performing IMDCT operation.
[0009]
[Means for Solving the Problems]
To achieve this object, the inventors of the present application conducted intensive studies and found that, because the COS table has periodicity, the result of the operation on the data of its basic pattern can be used, either as is or with its sign inverted, as the result of the operation for the other periodic parts.
[0010]
The invention according to claim 1, based on this finding, is a filter bank circuit including an IMDCT module and a COS table, wherein, when performing an operation using the COS table, the IMDCT module uses the operation result for one part of the values of the COS table, either as is or with its sign inverted, as the operation result for another part of the values that is periodically related to that part.
[0011]
Because the operation result for one part of the values of the COS table is thus reused, as is or with its sign inverted, as the operation result for another part, duplicate operations in the IMDCT processing can be reduced compared with the conventional case in which multiplication and addition are performed using the data of the entire range of the COS table.
[0012]
Further, the inventors varied the number of bits in the mantissa of the floating-point format and examined the sound quality obtained with each of the resulting formats. They found that sound quality satisfying the required specification can be obtained even with far fewer mantissa bits than in the floating-point formats conventionally used.
[0013]
The invention according to claim 2, based on this finding, is a filter bank circuit including an IMDCT module, wherein the IMDCT module includes a floating-point arithmetic unit, and the number of mantissa bits in the floating-point arithmetic unit is set to the minimum number of bits that satisfies the required sound quality, or to that minimum plus a margin of additional bits. Determining the number of mantissa bits in this way allows it to be reduced below that of conventional floating-point formats while still satisfying the required sound quality.
[0014]
The invention according to claim 3 is the filter bank circuit according to claim 2, wherein the floating-point arithmetic format consists of a 1-bit sign part, a 5-bit exponent part, and a 10-bit mantissa part. The number of bits in the floating-point arithmetic format can therefore be reduced compared with the conventional format while satisfying the required sound quality.
[0015]
The invention according to claim 4 is the filter bank circuit according to claim 2 or 3, further including a COS table, wherein, when performing an operation using the COS table, the IMDCT module uses the operation result for one part of the values of the COS table, either as is or with its sign inverted, as the operation result for another part of the values that is periodically related to that part. Duplicate operations in the IMDCT processing can therefore be reduced compared with the conventional case in which multiplication and addition are performed using the data of the entire range of the COS table.
[0016]
The invention according to claim 5 is a filter bank circuit including an IMDCT module, wherein the IMDCT module includes a floating-point arithmetic unit, and wherein, when the high-order digits of the mantissa become 0 after a mantissa operation in floating-point arithmetic and the mantissa is shifted by two or more bits for normalization, the shift is performed so that the bit one position below the least significant bit of the pre-shift mantissa is 1 and all bits below that are 0. In this case, the magnitude of the maximum error caused by normalization of the mantissa can be halved.
[0017]
The invention according to claim 6 is the filter bank circuit according to any one of claims 2 to 4, wherein, when the high-order digits of the mantissa become 0 after a mantissa operation in floating-point arithmetic and the mantissa is shifted by two or more bits for normalization, the shift is performed so that the bit one position below the least significant bit of the pre-shift mantissa is 1 and all bits below that are 0. In this case, the magnitude of the maximum error caused by normalization of the mantissa can be halved while the number of bits in the floating-point arithmetic format is reduced compared with the conventional format.
[0018]
The invention according to claim 7 is the filter bank circuit according to claim 5 or 6, wherein, when the high-order digit of the mantissa becomes 0 after a mantissa operation in floating-point arithmetic and the mantissa is shifted by one bit for normalization, the inverse of the least significant bit of the pre-shift mantissa is inserted as the least significant bit of the post-shift mantissa. In this case, the normalization correction values can be scattered to some extent, and the accumulation of calculation errors as a whole can be reduced.
[0019]
The filter bank circuit design support method according to claim 8 applies to a filter bank circuit including an IMDCT module that includes a floating-point arithmetic unit. The method limits the number of mantissa bits in the floating-point arithmetic unit to an arbitrary number of bits, obtains the sound quality evaluation value corresponding to that limited number of bits, and thereby finds the minimum number of bits that satisfies the required sound quality. By determining the number of mantissa bits on the basis of this minimum, the number of mantissa bits can be reduced below that of conventional floating-point formats while still satisfying the required sound quality.
[0020]
[Best Mode for Carrying Out the Invention]
Hereinafter, the configuration of the present invention will be described in detail based on the best mode shown in the drawings.
[0021]
FIGS. 1 to 5 show an example of an embodiment of the filter bank circuit 1 according to the present invention. The filter bank circuit 1 includes an IMDCT module 2 and a COS table 3. When performing an operation using the COS table 3, the IMDCT module 2 uses the operation result for one part of the values of the COS table 3, either as is or with its sign inverted, as the operation result for another part of the values that is periodically related to that part. Compared with the conventional case in which multiplication and addition are performed using the data of the entire range of the COS table 3, duplicate operations in the IMDCT processing are therefore reduced, so both the circuit size and the power consumption can be reduced.
[0022]
The filter bank circuit 1 of the present embodiment includes the above-described IMDCT module 2, a coefficient table 4 including a COS table 3, an input memory 5, and an output memory 6. Table 1 shows an example of the configuration of each unit.
[Table 1]

[0023]
For example, in the present embodiment, the filter bank circuit 1 is used as an IMDCT processing circuit of an MP3 decoder. However, the filter bank circuit 1 may be used for an MP3 encoder. The encoder can also use the same filter bank circuit 1 as the decoder except for the constants.
[0024]
MP3 decoding is performed in units of granules. One granule is made up of 32 subbands of 18 samples. The IMDCT process is basically performed on a subband basis. As shown in FIG. 15, the IMDCT processing can be roughly divided into three parts: IDCT, window function processing, and overlap addition.
[0025]
All calculations of trigonometric functions required in the IMDCT process use the values stored in the coefficient table 4. The coefficient table 4 has a COS table 3 and a SIN table 7. Each of the tables 3 and 7 is calculated in advance and stored in an array.
[0026]
The COS table 3 used when the subband is a long block is defined as in Equation 1 and stored in an array.
[Equation 1]

[0027]
The COS table 3 used when the subband is a short block is defined as in Equation 2 and stored in an array.
[Equation 2]

[0028]
When actually placed in memory, the table is stored in a separate memory for each i.
[0029]
The SIN table 7 used for the window function processing (WINDOW processing) differs completely depending on the block type. Therefore, as shown in Equation 3, a separate array is created and stored for each block type.
[Equation 3]
SIN[block type][i]
[0030]
Unlike the COS table 3, the SIN table 7 involves no change to the processing, so the value for each i is stored as is.
[0031]
The IMDCT module 2 carries out the IMDCT processing through transitions between states (states of the hardware circuit). Although this approach is not suited to speed-up techniques such as pipelining, resources can be shared among multiple states, which is advantageous in terms of circuit scale. If the processing speed has sufficient margin, the IMDCT processing may also be paused while the input/output memory 6 is being read or written.
[0032]
FIG. 2 shows the rough state transitions of the IMDCT processing. The filter bank circuit 1 basically performs the IMDCT processing subband by subband. First, subband select chooses one subband (step 1), and it is determined whether the block type is a long block or a short block (step 2). For a long block, DCT18 processing is performed (step 3); for a short block, DCT12 processing is performed (step 4).
[0033]
Here, the DCT18 and DCT12 processing will be described. As its first operation, the IMDCT module 2 multiplies the input by the COS table 3. For a long block, for example, the calculation shown in Equation 4 is performed. Note that N[k] denotes the Huffman-decoded data (the input data stream after inverse quantization).
[Equation 4]

[0034]
Here, the COS table 3 for the long block has the periodicity shown in Equation 5.
[Equation 5]

[0035]
From this periodicity, MID[i] satisfies the relationship shown in Equation 6.
[Equation 6]

[0036]
From this relationship, once MID[i] has been obtained for i = 9 to 26, the remaining values can be produced by copying, or by copying with the sign inverted. Multiplications and additions are thereby each reduced by 50%, and the COS table 3 needs only half its original size. Because both the amount of computation and the table size are reduced, the circuit size and the power consumption can be reduced.
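The symmetry described above can be checked numerically. The following is a minimal sketch, not the patent's implementation: it assumes the long-block table is the standard ISO/IEC 11172-3 IMDCT kernel cos(π/72 · (2i + 19)(2k + 1)), because Equations 1, 5, and 6 are not reproduced in this text. The index mappings 17 - i and 53 - i are derived from that assumption, and the names `cos_long` and `mid_full` are illustrative only.

```python
import math

N_OUT = 36  # outputs per long block
N_IN = 18   # inputs per long block

def cos_long(i, k):
    # Assumed standard long-block IMDCT kernel (n = 36); not taken from the patent text.
    return math.cos(math.pi / 72.0 * (2 * i + 19) * (2 * k + 1))

def mid_full(x):
    # Direct evaluation over the whole 36 x 18 table (the conventional approach).
    return [sum(x[k] * cos_long(i, k) for k in range(N_IN)) for i in range(N_OUT)]

x = [math.sin(0.7 * k) + 0.1 * k for k in range(N_IN)]  # arbitrary test input
mid = mid_full(x)

# Outputs 0..8 mirror outputs 17..9 with the sign inverted.
first_ok = all(math.isclose(mid[i], -mid[17 - i], abs_tol=1e-9) for i in range(9))
# Outputs 27..35 mirror outputs 26..18 unchanged.
last_ok = all(math.isclose(mid[i], mid[53 - i], abs_tol=1e-9) for i in range(27, 36))
```

Under this assumption, both checks hold, which is consistent with the statement that only i = 9 to 26 needs to be computed.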
[0037]
A similar relationship holds for a short block. In this case MID[i] is as shown in Equation 7.
[Equation 7]

[0038]
Therefore, in this case as well, multiplications and additions can each be reduced by 50%, and the table size can be halved.
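A short-block version of the same idea can be sketched as follows. This is an illustration under stated assumptions rather than the circuit described here: the kernel cos(π/24 · (2i + 7)(2k + 1)) is the standard 12-point IMDCT definition, assumed because Equations 2 and 7 are not reproduced in this text. The choice to compute outputs 3 to 8 follows the description; the mirror indices 5 - i and 17 - i are derived from the assumed kernel. `dct12_half` and `dct12_full` are hypothetical names.

```python
import math

def cos_short(i, k):
    # Assumed standard short-block IMDCT kernel (n = 12).
    return math.cos(math.pi / 24.0 * (2 * i + 7) * (2 * k + 1))

def dct12_half(x):
    out = [0.0] * 12
    for i in range(3, 9):            # compute only outputs 3..8
        out[i] = sum(x[k] * cos_short(i, k) for k in range(6))
    for i in range(3):               # outputs 0..2: copy with sign inversion
        out[i] = -out[5 - i]
    for i in range(9, 12):           # outputs 9..11: plain copy
        out[i] = out[17 - i]
    return out

def dct12_full(x):
    # Reference: evaluate all 12 outputs directly.
    return [sum(x[k] * cos_short(i, k) for k in range(6)) for i in range(12)]

x = [0.5, -1.25, 2.0, 0.75, -0.5, 1.0]  # arbitrary test input
max_err = max(abs(a - b) for a, b in zip(dct12_half(x), dct12_full(x)))
```

The half computation uses 6 × 6 table entries instead of 12 × 6, matching the 50% reduction stated above.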
[0039]
FIG. 3 shows the state transitions of DCT18. This is the part that performs the IDCT and the window function processing (WINDOW processing) for a long block. First, the IDCT is performed only for outputs 9 to 26. This corresponds to the loop of table select (step 6), multiply-add (step 7), and store to memory (MEM ST) (step 8) in FIG. 3. The COS table 3 normally has 36 × 18 entries, but because the table can be halved in the present invention, 18 × 18 entries suffice. In this loop, the inputs are multiplied by the COS coefficients and the products are summed. The result is stored in the output memory 6. The output memory 6 holds the final output, but during this processing it also serves as temporary working storage, which saves memory.
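As a rough check of the savings, the operation counts for the full and halved tables can be tallied. This sketch assumes one multiplication per table entry and 17 additions per 18-term output sum; the copies and sign inversions that fill the remaining outputs are not counted. The variable names are illustrative.

```python
ROWS_FULL = 36   # outputs when the whole table is used
ROWS_HALF = 18   # outputs actually computed (i = 9..26)
COLS = 18        # inputs per long block

full_mults = ROWS_FULL * COLS          # 648 multiplications
half_mults = ROWS_HALF * COLS          # 324 multiplications
full_adds = ROWS_FULL * (COLS - 1)     # 612 additions
half_adds = ROWS_HALF * (COLS - 1)     # 306 additions

mult_saving = 1 - half_mults / full_mults   # fraction of multiplications saved
add_saving = 1 - half_adds / full_adds      # fraction of additions saved
table_entries_saved = full_mults - half_mults  # COS entries no longer stored
```

Both savings come out to one half, in line with the 50% figures given for the long block.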
[0040]
At this stage, only half of the total output has been determined. However, as described above, since copying is possible, copying is performed to a field of the output memory 6 that does not yet contain a value (step 9). At this time, when copying to the first half of the output, the sign needs to be inverted. Finally, a window function process (WINDOW process) is performed (step 10). Nothing special is done here.
[0041]
FIG. 4 shows the state transition of the DCT 12. This is an IDCT for a short block, and basically has the same configuration as that of the DCT 18 with some differences.
[0042]
First, in a short block, the subband is further divided into three. Each of these ranges is here called a sub-subband (SSB). In the present embodiment, the SSB is switched by changing how the input is read. Specifically, the input data stream is arranged in the order SSB0, SSB1, SSB2, SSB0, SSB1, SSB2, and so on, and the subsequent processing uses the buffer storing the input data stream as it is. For example, to acquire the data of SSB1, the data is read from the address obtained by adding the data width to the data address of SSB0. To acquire the data of SSB2, the data is read from the address obtained by adding twice the data width to the data address of SSB0. This processing requires less memory than copying the contents of the buffer to another memory area and rearranging them, so no storage elements are wasted.
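The in-place addressing of the SSBs amounts to a strided read of the interleaved buffer. The sketch below illustrates this under two assumptions that the text does not state explicitly: addresses are counted in units of one data word (width = 1), and consecutive samples of the same SSB lie three data widths apart. The name read_ssb is hypothetical.

```python
def read_ssb(buffer, ssb, count=6, base=0, width=1):
    """Read the samples of sub-subband `ssb` (0, 1 or 2) directly from the
    interleaved input buffer, without copying it to a rearranged area.

    base  : address of the first SSB0 sample
    width : data width (address step between adjacent samples)
    count : samples per SSB (6 for an 18-sample short-block subband)
    """
    start = base + ssb * width     # SSB1: +1 width, SSB2: +2 widths
    stride = 3 * width             # assumed spacing between samples of one SSB
    return [buffer[start + k * stride] for k in range(count)]
```

For example, with a buffer holding the values 0 to 17, read_ssb(buffer, 1) returns the entries at addresses 1, 4, 7, 10, 13 and 16.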
[0043]
When the SSB is selected (step 11), the IDCT processing is performed only for the 3rd to 8th elements of the output. This corresponds to the loop of table selection (step 12), multiplication and addition (step 13), and storage in memory (MEM ST) (step 14) in FIG. 4. The result is stored in the output memory 6. At this time, data is written into the memory as shown in FIG. 5, which simplifies the subsequent addition processing (step 17) and window function processing (WINDOW processing). Next, the results are copied to the fields of the output memory 6 that do not yet contain a value (step 15). Thereafter, window function processing (WINDOW processing) is performed for each SSB (step 16).
[0044]
When the IDCT of all SSBs is completed, the addition processing of FIG. 4 is performed (step 17). At this time, the value of SSB1 is added to the corresponding position without moving SSB0 and SSB2. When the addition is completed, zero is written where SSB1 was present.
[0045]
After exiting from the DCT state as described above, overlap addition, which superimposes the data on that of the immediately preceding granule, is performed as shown in FIG. 2 (step 5). Since this processing does not depend on the block type, the two flows merge here. This completes the processing for one subband. The processing is then repeated for all the subbands of one granule. When all subbands have been processed, the processing of the IMDCT module 2 ends.
[0046]
Further, the IMDCT module 2 of the filter bank circuit 1 according to the present embodiment includes a floating-point arithmetic unit 8. The number of bits of the mantissa in the floating-point arithmetic unit 8 is set to the minimum number of bits that satisfies the required sound quality, or to that minimum number plus a margin number of bits. Reducing the number of mantissa bits lowers the precision of the floating-point operations, but it increases the operation speed and reduces the required circuit size and the power consumption. Therefore, once the required calculation accuracy has been determined, it is desirable to adopt the least costly data format that satisfies that accuracy. In the present embodiment, in order to find the minimum number of bits that satisfies the required sound quality, the number of bits of the mantissa in the floating-point arithmetic unit 8 is limited to an arbitrary number of bits, and an evaluation value of the sound quality corresponding to the limited number of bits is obtained.
[0047]
Two methods of limiting the number of bits of the mantissa in the floating-point arithmetic unit 8 to an arbitrary number of bits are conceivable: actually manufacturing hardware in which the number of mantissa bits is limited, or performing a simulation in which the number of mantissa bits is limited by software processing on existing hardware. The latter is generally cheaper and simpler than the former and is therefore preferable. For example, in the present embodiment, the co-simulation environment provided by Kyoto University is used as the base MP3 decoding environment. This is the MP3 decoder HW/SW co-verification environment whose distribution method was indicated on the contest web site of the 19th Parthenon Study Group ASIC Design Contest, for which the hardware implementation of the IMDCT processing of an MP3 decoder was announced as the design task (reference URL: http://www.kecl.ntt.co.jp/parthenon/html/contest/mp3_01/index.html). This simulation environment allows hardware and software to cooperate using the logic simulator SECONDS and the library runseconds, which is used from the C language. In this simulation environment, the number of bits of the mantissa can be limited to an arbitrary N bits by performing the processing described below immediately after every floating-point operation.
[0048]
A floating-point number is separated into a mantissa and an exponent using the frexp function of the C language. At this time, the mantissa is normalized to 0.5 or more and less than 1.0. Then, the operations shown in Equations 8 to 10 are performed on the separated mantissa to obtain a mantissa restricted to N bits. Here, the floating-point number is f, its mantissa is f_f, its exponent is f_e, the number of bits to be restricted to is N, and the mantissa restricted to N bits is f_f'''. This processing is performed separately for positive and negative values. By performing it immediately after every floating-point operation, the number of bits of the mantissa can be limited to N bits almost exactly. Note that “int()” in Equations 8 and 9 is a function that converts the value in parentheses to an integer type. In Equation 9, “&” represents a bitwise AND operation, and “0x0001” represents the value 1. The “& (0x0001)” operation in Equation 9 extracts the digit immediately below the last retained digit, and adding it in Equation 10 rounds the value when it is converted to an integer type.
[0049]
(Equation 8)
$$ f_f' = \mathrm{int}\left(f_f \times 2^{N+1}\right) $$
(Equation 9)
$$ f_f'' = \mathrm{int}\left(f_f \times 2^{N+2}\right) \mathbin{\&} \mathrm{0x0001} $$
(Equation 10)
$$ f_f''' = f_f' + f_f'' $$
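Equations 8 to 10 can be sketched in software as follows. This is a minimal illustration, not the code of the co-simulation environment. The final step, which rescales the integer f_f''' and recombines it with the exponent f_e using ldexp, is an assumption, since the text gives only the three equations. The name limit_mantissa is hypothetical.

```python
import math

def limit_mantissa(f, n_bits):
    """Restrict the mantissa of f to n_bits, following Equations 8 to 10.

    The magnitude is processed and the sign restored afterwards, which is
    one way to handle positive and negative values separately.
    """
    sign = -1.0 if f < 0 else 1.0
    f_f, f_e = math.frexp(abs(f))                    # 0.5 <= f_f < 1.0 (or 0.0)
    kept = int(f_f * 2 ** (n_bits + 1))              # Equation 8
    round_bit = int(f_f * 2 ** (n_bits + 2)) & 0x0001  # Equation 9
    limited = kept + round_bit                       # Equation 10
    # Assumption: scale back by 2^-(N+1) and reapply the exponent.
    return sign * math.ldexp(limited, f_e - (n_bits + 1))
```

Calling such a function on the result of every floating-point addition and multiplication in a software model emulates an arithmetic unit with a shortened mantissa, so the sound quality can be evaluated for various N without building hardware.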
[0050]
Then, an evaluation value of the sound quality is obtained for each output data based on each calculation result when the number N of bits of the mantissa is variously changed. As a sound quality evaluation method, for example, Compliance Test defined in the MP3 decoder standard is used. The Compliance Test compares the decoded data of the reference with the output of the designed decoder using Expression 11.
[0051]
[Equation 11]

where
N: number of samples
t_i: reference data
r_i: decoded data
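The evaluation value can be computed as in the sketch below. Expression 11 is not reproduced in this text, so the sketch assumes that it is the root-mean-square (RMS) of the difference between the reference data and the decoded data over N samples. The function name rms_difference is hypothetical, and no accuracy thresholds are hard-coded because Table 2 is likewise not reproduced.

```python
import math

def rms_difference(reference, decoded):
    """RMS of (t_i - r_i) over N samples.

    Assumption: this is the quantity defined by Expression 11.
    Both sequences are expected to be scaled to the range -1.0 .. +1.0.
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    return math.sqrt(sum((t - r) ** 2 for t, r in zip(reference, decoded)) / n)
```

A smaller value indicates that the output of the designed decoder is closer to the reference.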
[0052]
The reference data t_i and the decoded data r_i are scaled so that their amplitudes lie between -1.0 and +1.0. The evaluation value obtained by Expression 11 indicates the performance of the MP3 decoder. This value is classified into the three levels shown in Table 2. The required specification of the MP3 decoder is sound quality equal to or higher than Limited Accuracy.
[Table 2]

[0053]
FIG. 16 shows the relationship between the number N of bits of the mantissa and the corresponding evaluation value of the sound quality. It can be seen that the mantissa of the floating-point operations in the IMDCT processing needs 9 bits or more in order to achieve sound quality equal to or higher than Limited Accuracy, which is the required specification of the MP3 decoder. It can also be seen that the evaluation value saturates once the mantissa exceeds about 15 bits. This is probably because the precision of the MP3 arithmetic processing other than the IMDCT processing (e.g., inverse quantization and anti-aliasing) is not sufficient relative to the precision of the IMDCT processing, so that the errors of that other processing become dominant.
[0054]
The number of bits of the mantissa in the floating-point arithmetic unit 8 may be 9 bits, which is the minimum number of bits that satisfies the required sound quality, but it is more preferably set to the number obtained by adding a margin number of bits to this minimum of 9 bits. Providing a margin makes it possible to compensate for a loss of accuracy caused by calculation errors of the arithmetic unit. Since the bit width of memories is often a multiple of 8, CPU architectures also often handle numerical values in multiples of 8 bits. It is therefore preferable to choose the margin so that the total number of bits of the floating-point data format is a multiple of 8. The mantissa may also have an implicit 1. That is, the mantissa represents the "M" of the numerical value "1.M", and the most significant digit is guaranteed always to be 1. Having an implicit 1 effectively increases the precision of the mantissa by one bit. For example, in the present embodiment, one bit is added as a margin to make the mantissa 10 bits, and the mantissa has an implicit 1. The mantissa therefore has an effective precision of 11 bits, which is a margin of 2 bits over the 9 bits that are the minimum satisfying the required sound quality.
[0055]
Also, the inventors of the present application statistically processed the data streams of about 100 pieces of MP3-encoded music data and audio data and found that the exponent is distributed between 0 and -41. Further experiments by the inventors showed that if the exponent covers the range of 0 to -31, the sound quality satisfies the quality required of the MP3 decoder, that is, Limited Accuracy or higher. For data whose exponent falls outside the range of 0 to -31, a saturation operation is performed; even if this introduces a slight error, the error is considered to be inaudible. Here, the saturation operation means that all data larger than a predetermined maximum value (for example, 2 to the power of 0) is treated as that maximum value, and all data smaller than a predetermined minimum value (for example, 2 to the power of -31) is treated as that minimum value. Therefore, in the present embodiment, the exponent part is set to 5 bits, and the absolute value of the exponent is stored in the exponent part.
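The 16-bit format of the present embodiment (1 sign bit, 5 exponent bits holding the absolute value of the exponent, and 10 mantissa bits with an implicit 1) can be modelled as in the sketch below. Several details are assumptions made only for illustration: the field order sign, exponent, mantissa; the exponent convention value = 1.M × 2^(-E); round-to-nearest when packing; saturation of too-large values to the largest code with exponent 0; and mapping of zero to the smallest magnitude. The names encode and decode are hypothetical.

```python
import math

EXP_BITS, MAN_BITS = 5, 10
EXP_MAX = (1 << EXP_BITS) - 1        # stored |exponent| ranges over 0..31
MAN_ONE = 1 << MAN_BITS              # 1024

def encode(x):
    """Pack x into a 16-bit word, saturating the exponent to 0..-31."""
    sign = 1 if x < 0 else 0
    a = abs(x)
    if a == 0.0:
        exp, frac = -EXP_MAX, 0      # assumption: zero saturates to the minimum
    else:
        m, e = math.frexp(a)         # a = m * 2**e with 0.5 <= m < 1
        exp = e - 1                  # a = (2m) * 2**exp with 1 <= 2m < 2
        frac = int((2 * m - 1) * MAN_ONE + 0.5)
        if frac == MAN_ONE:          # rounding carried into the next power of two
            frac, exp = 0, exp + 1
        if exp > 0:                  # too large: saturate to the maximum
            exp, frac = 0, MAN_ONE - 1
        elif exp < -EXP_MAX:         # too small: saturate to the minimum
            exp, frac = -EXP_MAX, 0
    return (sign << (EXP_BITS + MAN_BITS)) | (-exp << MAN_BITS) | frac

def decode(word):
    """Unpack a 16-bit word produced by encode()."""
    sign = -1.0 if (word >> (EXP_BITS + MAN_BITS)) & 1 else 1.0
    e = (word >> MAN_BITS) & EXP_MAX
    frac = word & (MAN_ONE - 1)
    return sign * math.ldexp(1 + frac / MAN_ONE, -e)
```

Storing the absolute value of the exponent means that 5 bits cover exactly the range 0 to -31 without a bias constant.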
[0056]
Table 3 shows an example of the data format of the floating-point format in the present embodiment.
[Table 3]

[0057]
【Example】
The entire filter bank circuit 1 described above was described in SFL (Structured Function Description Language), which is one of the hardware description languages. In the present embodiment, the operation of the filter bank circuit 1 was confirmed by simulation, without generating hardware with the logic synthesis program. Here, the floating-point arithmetic unit 8 used in the IMDCT module 2 was designed so that the modules required for the arithmetic processing are shared between the floating-point addition circuit and the floating-point multiplication circuit, in order to make the circuit as small as possible. The multiplier and the adder were integrated so as to share the input/output registers and the adder, which occupy a large part of the arithmetic circuit. FIG. 6 shows an example of the configuration of the floating-point arithmetic unit 8. Its main components are three 17-bit registers serving as the I/O registers 9, one 11-bit adder 10, one 11-bit shifter 11, and one 4-bit incrementer 12.
[0058]
By using these components in common, an extra gate is not consumed. Of course, since floating-point multiplication and addition are performed, it is not possible to calculate in one clock with such resources. In the floating point arithmetic unit 8, about 14 clocks are required for multiplication and about 6 clocks are required for addition. FIG. 7 shows the flow of this processing.
[0059]
The result of synthesis of the floating-point arithmetic unit 8 has a delay time of 19.77 ns while the number of gates representing the circuit size is 2601. Here, the input delay is not considered. Therefore, it is considered that the number of gates could be reduced as compared with the conventional addition / multiplication circuit.
[0060]
Table 4 shows the memories used. Note that "r64_8" is a memory library provided as standard by the SFL language processing system and represents a memory block of 64 words × 8 bits.
[Table 4]

[0061]
Since the data consists of 16 bits, 8-bit memories are used in combination. The large amount of unused space arises because no smaller memory was available; less memory is likely to be needed in practice, and the unused portions can simply be removed at the time of implementation.
[0062]
In the present design, the memory size is not reduced much because the design is based on the idea that the memory is less expensive than the increase in the number of gates in the circuit.
[0063]
FIG. 8 shows the configuration of the input memory 5 and the output memory 6. FIG. 9 shows how to use the input memory 5. When the input data is represented as IN [sb] [ss], a separate memory is used for each sub-band IN [sb]. The input memory 5 uses only 18 entries. The entry points to a data recording area indicated by one address, that is, in the input memory 5, addresses 0 (000000 in binary) to 17 (010001 in binary) are used. FIG. 10 shows how to use the output memory 6. When the output data is expressed as OUT [sb] [ss], a separate memory is used for each sub-band OUT [sb]. Further, the first half of the memory area is used as output and work, and the second half of address 46 (101110 in binary) to 63 (111111 in binary) are storage areas for overlap addition.
[0064]
FIG. 11 shows the configuration of the COS table memory 13. Here, as shown in Equation 6, only the minimum necessary tables, that is, only 9 to 26 are prepared. FIG. 12 shows how to use the COS table memory 13. When the table is expressed as COS [i] [k], a separate memory is allocated for each COS [i] which is a unit used in IDCT. Note that the long COS table and the short COS table are stored in separate memories, but the storage format is the same except for the range of k.
[0065]
FIG. 13 shows the configuration of the SIN table memory 14, and FIG. 14 shows how to use the SIN table memory 14. When the table is expressed as SIN [block type] [k], a different memory is allocated for each SIN [block type].
[0066]
Table 5 shows the synthesis result of the filter bank circuit 1 described in SFL as described above. However, input delay is not considered. Note that the circuit size in Table 5 is represented by the number of gates.
[Table 5]

[0067]
In addition, the filter bank circuit 1 described in SFL was simulated using the co-simulation environment provided by Kyoto University, and the result was evaluated by a compliance test. Table 6 shows the results.
[Table 6]

[0068]
From this result, it was found that the filter bank circuit 1 sufficiently satisfies the limited accuracy.
[0069]
Table 7 shows the number of clocks when the compliance test of the filter bank circuit 1 is performed.
[Table 7]

[0070]
A 48 kHz stereo signal contains data for about 166 granules per second. Thus, one granule must be processed within about 6 ms. In the circuit designed this time, 112034 clocks are required for processing one granule, as shown in Table 7. This number does not include the clocks for the input/output memories 5 and 6. When these are taken into account, processing one granule is considered to take about 120,000 clocks.
[0071]
Therefore, this circuit can process about 480 granules of data per second. This performance is sufficient for processing a 48 kHz stereo signal.
[0072]
Although this performance exceeds the requirement, it is very beneficial in terms of power consumption, because the operating frequency affects power consumption. Since an operating frequency of about 20 MHz suffices when this circuit is used with 48 kHz data, wasteful power consumption is considered to be avoided.
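The figures in paragraphs [0070] to [0072] follow from simple arithmetic, sketched below. The sketch relies on the fact that an MP3 granule holds 576 samples per channel; the function names are illustrative only.

```python
SAMPLES_PER_GRANULE = 576  # samples per channel in one MP3 granule

def granules_per_second(sample_rate_hz, channels):
    """Number of granules that must be processed each second."""
    return sample_rate_hz * channels / SAMPLES_PER_GRANULE

def required_clock_hz(sample_rate_hz, channels, clocks_per_granule):
    """Lowest clock frequency that keeps up with real-time decoding."""
    return sample_rate_hz * channels * clocks_per_granule / SAMPLES_PER_GRANULE
```

For 48 kHz stereo, granules_per_second(48000, 2) is about 166.7, so each granule must finish within about 6 ms, and required_clock_hz(48000, 2, 120000) is 20 MHz, which matches the operating frequency quoted above.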
[0073]
The above-described embodiment is an example of a preferred embodiment of the present invention, but is not limited thereto, and various modifications can be made without departing from the spirit of the present invention. For example, although the filter bank circuit 1 is used in the MP3 decoder in the present embodiment, the present invention is not limited to this, and the filter bank circuit 1 may be used in an encoder.
[0074]
Further, the above-described filter bank circuit 1 has both a feature point of reducing the amount of calculation using the periodicity of the COS table 3 and a feature point of reducing the number of bits in the floating-point format. Alternatively, only one of the characteristic points may be provided.
[0075]
Further, in order to reduce errors due to cancellation of digits (the loss of significant digits caused by adding a positive number and a negative number whose absolute values are close to each other), the following processing may be performed in the floating-point arithmetic. That is, when the upper digits of the mantissa become 0 after an operation such as addition of mantissas and the mantissa is left-shifted for normalization, the left shift may be performed so that the bit one position below the least significant bit (LSB) of the mantissa before the shift is set to “1” and the bits below it are set to “0”. In other words, when a left shift of 2 bits or more is performed, “1” is inserted into the least significant bit at the first 1-bit shift, and “0” is inserted into the least significant bit at the subsequent shifts. As a result, the magnitude of the maximum error is halved, and the expected error becomes smaller than in the conventional method, in which only “0”s are inserted at the time of the left shift. For example, suppose there are two numerical values represented in the floating-point format shown in Table 3, one positive and the other negative. Assume that the mantissa of one is “1010101011”, that the mantissa of the other is “1010101010”, and that the exponents of the two values are equal. The result of adding the mantissas of these two values is “0000000001”. When the mantissa is normalized by the conventional method of inserting only “0”s at the time of the left shift, the mantissa becomes “0000000000”, because the mantissa has an implicit 1, and 10 is subtracted from the exponent. However, the true value of the mantissa lies somewhere between “0000000000” and “1111111111”, so if this value is left as it is, the maximum error is “1111111111”.
Therefore, when the upper digits of the mantissa become 0 after the addition of the mantissas and the mantissa is left-shifted by 2 bits or more for normalization, the left shift is performed so that the bit one position below the least significant bit of the mantissa before the shift is set to “1” and the bits below it are set to “0”. As a result, the mantissa after normalization is “1000000000”, and the magnitude of the maximum error is halved.
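This normalization rule can be sketched as follows. The sketch works on an 11-bit significand (the 10-bit mantissa plus the implicit leading 1) held in an integer. Inserting “0” for a 1-bit shift is an assumption, since the rule above is stated for shifts of 2 bits or more. The name normalize_fill_half is hypothetical.

```python
SIG_BITS = 11  # 10-bit mantissa plus the implicit leading 1

def normalize_fill_half(sig, sig_bits=SIG_BITS):
    """Left-shift sig until its top bit is set and return (normalized, shift).

    For a shift of 2 bits or more, the first inserted bit is 1 and the
    remaining inserted bits are 0, i.e. the vacated field is filled with
    its midpoint instead of all zeros.
    """
    if sig <= 0 or sig.bit_length() > sig_bits:
        raise ValueError("sig must be a non-zero significand of at most sig_bits bits")
    shift = sig_bits - sig.bit_length()
    fill = (1 << (shift - 1)) if shift >= 2 else 0
    return (sig << shift) | fill, shift
```

For the example above, the significand 0b00000000001 needs a 10-bit shift and becomes 0b11000000000, whose lower 10 bits are the stored mantissa “1000000000”.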
[0076]
In addition, when the upper digit of the mantissa becomes 0 after an operation such as addition of mantissas and the mantissa is left-shifted for normalization, a bit obtained by inverting the least significant bit (LSB) of the mantissa before the left shift may be inserted into the least significant bit of the mantissa after the left shift. In this case, the correction values introduced by normalization are scattered to some extent, and the accumulation of calculation errors as a whole can be expected to decrease. For example, suppose there are two numerical values represented in the floating-point format shown in Table 3, one positive and the other negative. Assume that the mantissa of one is “1110101011”, that the mantissa of the other is “0110101010”, and that the exponents of the two values are equal. The result of adding the mantissas of these two values is “1000000001”, and, taking into account the implicit 1 of the mantissa, a 1-bit left shift is required for normalization. The correct value of the mantissa after normalization is a number between “0000000010” and “0000000011”. Always inserting “1”, or always inserting “0”, at the time of the left shift is considered to bias the output data, which is not preferable. Therefore, by inserting the inverted least significant bit of the mantissa before the left shift into the least significant bit of the mantissa after the left shift, the correction values are scattered to some extent, and the accumulation of calculation errors as a whole can be expected to decrease.
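The inverted-LSB rule for a 1-bit shift can be sketched in the same way. The sketch again uses an 11-bit significand (10-bit mantissa plus the implicit leading 1). Shifts other than 1 bit are zero-filled here, which is an assumption made only to keep the example short. The name normalize_invert_lsb is hypothetical.

```python
def normalize_invert_lsb(sig, sig_bits=11):
    """Left-shift sig until its top bit is set and return (normalized, shift).

    For a 1-bit shift, the inserted bit is the inverse of the least
    significant bit before the shift, so the inserted value varies with
    the data instead of always being 0 or always being 1.
    """
    if sig <= 0 or sig.bit_length() > sig_bits:
        raise ValueError("sig must be a non-zero significand of at most sig_bits bits")
    shift = sig_bits - sig.bit_length()
    out = sig << shift
    if shift == 1:
        out |= (sig & 1) ^ 1   # insert the inverted LSB
    return out, shift
```

For the example above, the significand 0b01000000001 has an LSB of 1, so “0” is inserted and the stored mantissa becomes “0000000010”.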
[0077]
The processing at the time of the mantissa shift described above may be combined with either or both of the two features described in the above embodiment, namely the feature of reducing the amount of calculation by using the periodicity of the COS table 3 and the feature of reducing the number of bits of the floating-point format. Alternatively, in a filter bank circuit including a floating-point arithmetic unit, the processing at the time of the mantissa shift may be performed on its own.
[0078]
【Effect of the Invention】
As described above, according to the filter bank circuit of the first aspect, the operation result for a part of the values of the COS table is used, as it is or with its sign inverted, as the operation result for another part of the values. Compared with the conventional case, in which multiplication and addition are performed using the data of the entire range of the COS table, redundant operations in the IMDCT processing are reduced, and the circuit size and power consumption can be reduced.
[0079]
According to the filter bank circuit of the second and third aspects and the filter bank circuit design support method of the eighth aspect, the number of bits of the mantissa is reduced compared with the conventional floating-point format, within a range that satisfies the required sound quality, so that the circuit size and power consumption can be reduced.
[0080]
According to the filter bank circuit of the fifth and sixth aspects, the magnitude of the maximum error resulting from the normalization of the mantissa can be reduced to half.
[0081]
According to the filter bank circuit of the seventh aspect, the normalized correction values can be scattered to some extent, and the accumulation of calculation errors as a whole can be reduced.
[0082]
According to the filter bank circuit of the fourth aspect, redundant operations in the IMDCT processing are reduced compared with the case where multiplication and addition are performed using the data of the entire range of the COS table, and the number of bits of the mantissa is also reduced, so that a further reduction in circuit size and power consumption can be realized.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an example of a filter bank circuit according to the present invention.
FIG. 2 is a flowchart illustrating an example of a state transition in the IMDCT module.
FIG. 3 is a flowchart illustrating an example of a state transition of processing of a long block.
FIG. 4 is a flowchart illustrating an example of a state transition of processing of a short block.
FIG. 5 is a diagram illustrating an example of a method of writing data to a memory in processing of a short block.
FIG. 6 is a block diagram illustrating an example of a floating-point arithmetic unit.
FIG. 7 is a flowchart illustrating an example of multiplication and addition processing in a floating-point arithmetic unit.
FIG. 8 is a diagram showing an example of a configuration of an input memory and an output memory.
FIG. 9 is a diagram illustrating an example of a method of using an input memory.
FIG. 10 is a diagram illustrating an example of a method of using an output memory.
FIG. 11 is a diagram illustrating an example of a configuration of a COS table memory.
FIG. 12 is a diagram illustrating an example of a method of using a COS table memory.
FIG. 13 is a diagram illustrating an example of a configuration of a SIN table memory.
FIG. 14 is a diagram illustrating an example of a method of using a SIN table memory.
FIG. 15 is a flowchart illustrating an example of an IMDCT process.
FIG. 16 is a graph showing an example of the relationship between the number N of bits of a mantissa and an evaluation value of sound quality corresponding to the number N of bits.
[Explanation of symbols]
1 Filter bank circuit
2 IMDCT module
3 COS table
8 Floating point arithmetic unit

Claims

A filter bank circuit comprising an IMDCT module and a COS table, wherein the IMDCT module, when performing an operation using the COS table, uses the operation result for a part of the values of the COS table, as it is or with its sign inverted, as the operation result for another part of the values that has a periodic relationship with that part.

A filter bank circuit comprising an IMDCT module, wherein the IMDCT module includes a floating-point arithmetic unit, and the number of bits of the mantissa in the floating-point arithmetic unit is set to the minimum number of bits that satisfies a required sound quality, or to the number of bits obtained by adding a margin number of bits to that minimum number of bits.

3. The filter bank circuit according to claim 2, wherein the floating-point arithmetic format is composed of one bit for a sign part, five bits for an exponent part, and ten bits for a mantissa part.

A COS table, wherein the IMDCT module, when performing an operation using the COS table, associates an operation result with respect to a part of the COS table with a periodicity with respect to the part of the value. 4. The filter bank circuit according to claim 2, wherein the result of operation of the value of the other part is used as it is or with its sign inverted.

In the filter bank circuit including the IMDCT module, the IMDCT module includes a floating-point arithmetic unit, and the upper-order digit of the mantissa becomes 0 after the operation of the mantissa in the floating-point operation, and the mantissa is shifted by 2 bits or more for normalization. 2. The filter bank circuit according to claim 1, wherein the shift is performed such that the one lower bit of the least significant bit of the mantissa part before the shift becomes 1 and the lower bit becomes 0.

When the upper digit of the mantissa becomes 0 after the operation of the mantissa in the floating-point operation and the mantissa is shifted by two or more bits for normalization, one bit lower than the least significant bit of the mantissa before the shift The filter bank circuit according to any one of claims 2 to 4, wherein the shift is performed such that the value becomes 1 and the lower bit becomes 0.

When the high-order digit of the mantissa part becomes 0 after the operation of the mantissa part in the floating-point operation and the mantissa part is shifted by one bit for normalization, the least significant bit of the mantissa part before the shift is inverted, 7. The filter bank circuit according to claim 5, wherein the data is inserted into a least significant bit of the mantissa after the shift.

A design support method for a filter bank circuit including an IMDCT module, the IMDCT module including a floating-point arithmetic unit, wherein the number of bits of the mantissa in the floating-point arithmetic unit is limited to an arbitrary number of bits, an evaluation value of the sound quality corresponding to the limited number of bits is obtained, and the minimum number of bits that satisfies a required sound quality is thereby obtained.