JP4024185B2 - Digital data encoding device

Info

Publication number: JP4024185B2
Application number: JP2003189179A
Authority: JP
Inventors: 修藤井
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2003-07-01
Filing date: 2003-07-01
Publication date: 2007-12-19
Anticipated expiration: 2023-07-01
Also published as: JP2005026940A

Abstract

PROBLEM TO BE SOLVED: To provide a digital data encoding device that uses a single kind of quantizer to reduce the quantization noise of the maximum spectrum in each frequency band at the time of quantization, and that makes the post-quantization spectrum levels in each frequency band, other than the maximum spectrum, lower than conventional quantized values. SOLUTION: A digital data encoding device 1 comprises a time-frequency conversion part 3 for converting an audio signal to the frequency domain; a scale factor generation part 9 for generating, for each of a plurality of frequency bands, a scale factor related to the maximum spectrum of the converted spectra; a quantization bit number calculation part 6 for calculating the number of quantization bits for each frequency band; and a quantization part 7 for quantizing the spectra by using the scale factors and the numbers of quantization bits. The device is further provided with a spectrum data correction part 10, which corrects the spectra by multiplying them by constants calculated on the basis of the scale factors and the numbers of quantization bits. COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、ミニディスクなどの記録媒体に楽音や音声等のデジタルデータを記録するにあたって、前記楽音や音声等に適応して各周波数帯域のスペクトルに対するビット割当てを行い、データ量を圧縮するデジタルデータ符号化装置に関するものである。
【０００２】
【従来の技術】
従来より、楽音や音声等のデジタルデータを高能率で圧縮符号化する方法として、ミニディスクで用いられているＡＴＲＡＣ(Adaptive TRansform Acoustic Coding)が上げられる。このＡＴＲＡＣでは、高能率で圧縮するために、入力デジタルデータは、複数の周波数帯域（以下、適宜サブバンドと呼ぶ）に分割された後、可変長の時間単位でブロック化される。そして、このブロック化されたデジタルデータは、ＭＤＣＴ（Modified Discrete Cosine Transform）処理によりスペクトルデータに変換され、さらに、聴覚心理特性を利用して割当てられたビット数で各スペクトルデータがそれぞれ符号化される。
【０００３】
前記の圧縮符号化に適応することができる聴覚心理特性には、等ラウドネス特性やマスキング効果が挙げられる。等ラウドネス特性とは、同じ音圧レベルの音であっても、人間が感じ取る音の大きさが周波数によって変化することを表すものである。従って、人間が感じ取ることができる音の大きさである最小可聴限が、その音の周波数によって変化することを表している。
【０００４】
一方、マスキング効果には、同時マスキング効果と経時マスキング効果があり、同時マスキング効果とは、複数の周波数成分の音が同時に発生しているときに、ある音が別の音を聴き取り難くさせる現象をいう。また、経時マスキング効果とは、大きな音の時間軸方向の前後では、別の音が聞き取り難くなる現象をいう。
【０００５】
このような聴覚心理特性を利用したビット割り当て法、例えば、反復法と呼ばれる割り当て法では、入力デジタルデータに適応した実際のビット割り当てを以下のようにして行っている。
【０００６】
先ず、各周波数帯域のパワーＳを求め、そのパワーＳによる他の周波数帯域に対するマスキング閾値Ｍを求める。次に、このマスキング閾値Ｍと、各周波数帯域をｎビットで量子化したときの量子化雑音パワーＮ（ｎ）とから、マスキング閾値対雑音比ＭＮＲ（ｎ）＝Ｍ／Ｎ（ｎ）を求める。続いて、そのマスキング閾値対雑音比ＭＮＲ（ｎ）が最小となる周波数帯域にビット割り当てを行った後、このマスキング閾値対雑音比ＭＮＲ（ｎ）を更新し、再び最小の周波数帯域にビット割り当てが行われる。
【０００７】
このようにして、聴覚心理特性を利用したビット割り当てにより、各周波数帯域の量子化ビット数（量子化語長）が算出される。また、周波数帯域毎のスペクトルの最大振幅レベルに関連するスケールファクタが算出され、このスケールファクタと前記量子化ビット数に基づいて各周波数帯域のスペクトルデータが量子化され、その後、圧縮データに符号化される。
【０００８】
また、この符号化処理で符号化されたデータを復号化する装置において、逆量子化時には、量子化ビット数とスケールファクタ、及び、量子化時に求められた量子化係数から各周波数帯域毎にスペクトルデータに展開される。そして、展開されたスペクトルデータは可変長の時間単位でブロック化され、逆ＭＤＣＴ処理が施されて時間領域のデータに変換される。さらに、この変換された時間領域のデータは、複数の周波数帯域（サブバンド）を合成した後、楽音や音声等のデジタルデータに復元される。
【０００９】
ところで、量子化時における最大量子化雑音を低減する方法として、量子化ビット数に対応する一定の係数を所定の周波数間隔内の全てのサンプルに掛ける方法、即ち、ＷＬ（量子化ビット数）がＸより小さい場合に、ｋ＝（２^WL[n]-２）/（２^WL[n]-1）をユニット（量子化を行う周波数帯域単位）内の全てのスペクトルサンプルに掛け、修正スペクトルサンプルを求め、さらに、修正された量子化器で量子化を行う方法がある（例えば、特許文献１参照）。
【００１０】
また、量子化器そのものを強化する方法、即ち、量子化を行う周波数帯域単位内のスペクトルの絶対値の最大値に係数ｋ（０＜ｋ＜１）を掛けてスケールファクタを算出し、スペクトルサンプルを修正しないで量子化する方法もある（例えば、特許文献２参照）。
【００１１】
【特許文献１】
特許第３１５０４７５号公報
【特許文献２】
特開平１１−１７７４３５号公報
【００１２】
【発明が解決しようとする課題】
しかしながら、特許文献１に記載の従来技術では、最大量子化雑音を低減させることは可能であるが、量子化を行う周波数帯域単位のスペクトル分布によっては、修正を行わない方が真値に近いスペクトルサンプルがあり、これらのスペクトルサンプルに対しては効果が逆になるという問題があった。また、量子化ビット数に対応する２種類の量子化器が必要となるという問題もあった。
【００１３】
また、特許文献２に記載の従来技術では、量子化を行う周波数帯域のスペクトルデータの絶対値の最大値に係数ｋ（０＜ｋ＜１）を掛けてスケールファクタを算出している為、スペクトル分布によっては正規化値（ＳＤ（ｍ）/ＳＦ）が１を越える、即ち、オーバーフローが発生する場合があるという問題があった。
【００１４】
本発明は、上記の問題点に鑑み、１種類の量子化器を用いて、量子化時における各周波数帯域の最大スペクトルの量子化雑音を低減するとともに、最大スペクトルを除く量子化後の周波数帯域のスペクトルレベルを従来の量子化値と比較して低くすることのできるデジタルデータ符号化装置を提供することを目的とする。
【００１５】
【課題を解決するための手段】
上記目的を達成するために本発明は、時間領域のデジタルデータを周波数領域に変換したスペクトルを複数の周波数帯域に分割し、各周波数帯域毎に最大スペクトルに関連するスケールファクタを生成し、聴覚心理特性を利用して前記周波数帯域毎の量子化ビット数を算出し、前記スケールファクタと量子化ビット数を用いて前記スペクトルを量子化した量子化係数を算出して符号化するデジタルデータ符号化装置において、量子化される前記スペクトルに前記スケールファクタと量子化ビット数とに基づいて算出した定数を乗算して修正するスペクトルデータ修正部を設けたものである。
【００１６】
このようにすると、入力されたデジタルデータの量子化時において、前記各周波数帯域のスペクトルの量子化誤差を低減して符号化することができ、この符号化されたデジタルデータの復号化時の量子化雑音を低減することができる。
【００１７】
また、例えば、時間領域のデジタルデータを周波数領域に変換する時間周波数変換部と、該時間周波数変換部により変換されたスペクトルを複数の周波数帯域に分割し各周波数帯域毎に最大スペクトルに関連するスケールファクタを生成するスケールファクタ生成部と、聴覚心理特性を利用して前記周波数帯域毎の量子化ビット数を算出する量子化ビット数算出部と、前記スケールファクタと量子化ビット数を用いて前記スペクトルを量子化した量子化係数を算出して符号化する量子化部を備えたデジタルデータ符号化装置において、前記量子化部により量子化される前記スペクトルに前記スケールファクタと前記量子化ビット数とに基づいて算出した定数を乗算して修正するスペクトルデータ修正部を設けると良い。
【００１８】
このようにすると、入力されたデジタルデータの量子化時において、前記各周波数帯域のスペクトルの量子化誤差を低減して符号化することができ、この符号化されたデジタルデータの復号化時の量子化雑音を低減することができる。
【００１９】
また、例えば、前記スペクトルデータ修正部が、前記スペクトルを前記スケールファクタで正規化した仮正規化係数を算出する仮正規化部と、前記量子化ビット数に応じた比較閾値を算出する比較閾値算出部と、前記仮正規化係数と比較閾値に基づいた定数を算出する定数算出部と、前記スペクトルに前記定数を乗算して重み付けするスペクトルデータ重み付け部を備えていると良い。
【００２０】
このようにすると、前記各周波数帯域のスペクトルの大きさに応じた定数を算出し、前記スペクトルに前記定数を乗算して重み付けすることにより前記スペクトルの量子化時における量子化誤差を低減することができる。
【００２１】
また、例えば、前記時間周波数変換部で変換されたスペクトルに前記定数算出部で算出される定数を乗算した修正スペクトルを前記量子化部で量子化した量子化係数が逆量子化されたときのスペクトル振幅は、前記スペクトルに前記定数を乗算していない無修正スペクトルを前記量子化部で量子化した量子化係数が逆量子化されたときのスペクトル振幅を超えないようになっていると、逆量子化された前記各周波数帯域のスペクトルレベルを従来の量子化値よりも低くして、量子化雑音を低減することができる。
【００２２】
また、例えば、前記定数算出部により算出される定数が、前記各周波数帯域の最大スペクトルに乗算される第１の定数と、前記各周波数帯域の最大スペクトル以外のスペクトルに乗算される第２の定数とから成ると、１種類の量子化器を用いて、量子化時における各周波数帯域の最大スペクトルの量子化雑音を低減できるとともに、最大スペクトルを除く量子化後の周波数帯域のスペクトルレベルを、従来の量子化値と比較して低くすることができる。
【００２３】
また、例えば、前記スケールファクタ生成部が、前記定数算出部により算出される定数に応じて前記スケールファクタを減算して前記量子化部に与えるスケールファクタ減算部を備えていると、前記スペクトルが正規化された正規化値が１を越える、即ち、オーバーフローの発生を防止することができる。
【００２４】
また、例えば、前記量子化ビット数が所定のビット数より小なるときに、前記スペクトルデータ修正部が前記スペクトルに前記定数を乗算して修正すると、前記スペクトルデータ修正部の演算量が低減し、デジタルデータ符号化装置の演算量を効果的に削減できる。
【００２５】
また、例えば、前記量子化ビット数が所定のビット数より小なるときに、前記定数算出部が前記定数を算出し、前記スペクトルデータ重み付け部が前記スペクトルに前記定数を乗算して重み付けすると、前記定数算出部及び前記スペクトルデータ重み付け部の演算量が低減し、デジタルデータ符号化装置の演算量を効果的に削減できる。
【００２６】
【発明の実施の形態】
以下に、本発明の実施形態を図面を参照して説明する。図１は、本発明の実施形態に係るデジタルデータ符号化装置の電気的構成を示すブロック図であり、説明の理解を深めるために復号化装置も合わせて示している。以下、ミニディスク等で利用するATRAC(Adaptive TRanceform Acoustic Coding)方式で行われる符号化復号化処理を図１を用いて説明する。図１において、１はデジタルデータ符号化装置、１５は復号化装置、１０はミニディスク等の記録メディアを示す。
【００２７】
デジタルデータ符号化装置１は、周波数帯域分割部２、時間周波数変換部３、帯域毎のパワー算出部４、マスキング算出部５、量子化ビット数算出部６、量子化部７、パッキング部８、スケールファクタ生成部９、及び、スペクトルデータ修正部１０を備えている。また、復号化装置１５は、アンパッキング部１１、逆量子化部１２、周波数時間変換部１３、周波数帯域合成部１４を備えている。
【００２８】
次に、このような構成のデジタルデータ符号化装置１の符号化処理を説明する。デジタルデータ符号化装置１の入力端には、４４．１ｋＨｚでサンプリングされたデジタルデータであるオーディオ信号が入力される。この入力オーディオ信号は、周波数帯域分割部３において、帯域分割フィルタであるＱＭＦ（Quadrature Mirror Filter）によって複数の周波数帯域（サブバンドフレーム）に分割される。例えば、０〜５．５ｋＨｚの低帯域サブバンドフレームＳＢ１と、５．５〜１１ｋＨｚの中帯域サブバンドフレームＳＢ２と、１１〜２２ｋＨｚの高帯域サブバンドフレームＳＢ３の３帯域である。
【００２９】
次に、時間周波数変換部３は、周波数帯域分割部２で得られたサブバンドフレーム単位毎にＭＤＣＴ(Modified Discrete Cosine Tranceform)処理を施すことで、入力オーディオ信号を周波数成分のＭＤＣＴ係数(スペクトルデータ）に変換する。このときのＭＤＣＴ処理によって得られる変換データＸｍ（ｋ）は次の（１）式で示される。
【００３０】
【数１】

【００３１】
尚、上式中の変数ｍはフレーム番号を表しており、関数ｘm（ｉ）は入力信号を表している。また、関数ｈ（ｉ）は順変換用窓関数を表している。
【００３２】
そして、帯域毎のパワー算出部４は、時間周波数変換部３で得られたＭＤＣＴ係数を、更に、ｉ個の各周波数帯域のスペクトルパワーＳｉ（ｉ＝１，２，…，Ｉ、例えばＩ＝２５）に変換する。ここで、前記周波数帯域には臨界帯域（単位Bark）等が用いられる。このようにして得られた各スペクトルパワーに対して、マスキング算出部５では、聴覚心理特性を用いてマスキングカーブが作成され、量子化ビット数算出部６では、マスキング算出部５で作成されたマスキングカーブと各スペクトルパワーによりビット割当処理が行われる。
【００３３】
また、スケールファクタ生成部９において、時間周波数変換部３で変換された各周波数帯域のＭＤＣＴ係数の絶対最大値から約２ｄＢ毎にスケールファクタが算出される。また、スペクトルデータ修正部１０において、このスケールファクタにより前記ＭＤＣＴ係数が仮正規化された仮正規化係数と量子化ビット数算出部６のビット割当処理から求められた各周波数帯域の量子化ビット数とから求められた定数が前記ＭＤＣＴ係数に乗算されて重み付けされたＭＤＣＴ係数が生成される。そして、この重み付けされたＭＤＣＴ係数と一部が減算されたスケールファクタとが量子化部７に与えられる。尚、このスケールファクタ生成部９とスペクトルデータ修正部１０の構成及び動作についての詳細は後述する。
【００３４】
そして、量子化部７は、量子化ビット数算出部６のビット割当処理から求められた各周波数帯域の量子化ビット数と、スケールファクタ生成部９から与えられたスケールファクタとから、スペクトルデータ修正部１０から与えられたスペクトルデータを次の（２）式によって量子化した量子化係数ＭＫ（ｍ）を算出する。
【００３５】
【数２】

【００３６】
尚、上式中の変数ｍはＭＤＣＴ係数のインデックス、ｉは量子化周波数帯域のインデックス、Ｋ（ｍ）はＭＤＣＴ係数、ＷＬ（ｉ）は量子化ビット数、ＳＦ（ｉ）はスケールファクタを表しており、Ｒｏｕｎｄは小数点以下を四捨五入する関数である。
【００３７】
そして、量子化部７で量子化された、量子化係数、量子化ビット数、スケールファクタはフレーム情報とともにパッキング部８でパッキング、符号化され、記録メディア２０に記録される。
【００３８】
次に、復号化装置１５の復号化処理について説明する。上述のように記録メディア２０に記録された符号化データは、フレーム情報から量子化係数、量子化ビット数、スケールファクタがアンパッキング部１１でアンパッキングされる。そして、逆量子化部１２において、この量子化係数、量子化ビット数、スケールファクタが逆量子化され、ＩＭＤＣＴ(Inverse Modified Discrete Cosine Transform)処理を施す周波数時間変換部１３に入力される。この周波数時間変換部１３に入力されるＩＭＤＣＴ入力信号Ｉ（ｍ）は、次の（３）式によって逆量子化される。
【００３９】
【数３】

【００４０】
尚、上式中の変数ｍはＩＭＤＣＴ入力信号のインデックス、ｉは逆量子化周波数帯域のインデックス、ＭＫ（ｍ）は量子化係数、ＷＬ（ｉ）は量子化ビット数、ＳＦ（ｉ）はスケールファクタを表している。
【００４１】
逆量子化部１２で逆量子化され、再び、スペクトルに復元されたスペクトルデータは、サブバンドフレーム毎に周波数時間変換部１３でＩＭＤＣＴ処理が施され、時間軸のデータに変換される。この周波数時間変換部１３のＩＭＤＣＴ処理による逆変換データｙｍ（ｉ）は次の（４）式で示される。
【００４２】
【数４】

【００４３】
尚、上式中の変数ｍはフレーム番号、ｆ（ｉ）は逆変換用窓関数、Ｘｍ（ｋ）は変換データを表している。
【００４４】
更に、周波数帯域合成部１４において、逆変換された周波数サブバンドフレームは、帯域合成フィルタであるＩＱＭＦ（Inverse Quadrature Mirror Filter）によって帯域合成され、オーディオデータに復号化される。
【００４５】
次に、量子化部７に与えられるスケールファクタの生成方法と、同じく量子化部７に与えられるＭＤＣＴ係数（スペクトルデータ）の修正方法を説明する。図２は、スケールファクタ生成部９とスペクトルデータ修正部１０の電気的構成を示すブロック図である。図２において、図１と同一の部分には同一の符号を付し、その詳細な説明を省略する。
【００４６】
スケールファクタ生成部９は、スケールファクタ算出部２１とスケールファクタ減算部２２とから構成されている。また、スペクトルデータ修正部１０は、仮正規化部２３、定数算出部２４、比較閾値算出部２５、及び、スペクトルデータ重み付け部２６から構成されている。
【００４７】
先ず、図１に示す時間周波数変換部３において、上記（１）式によって変換されたＭＤＣＴ係数（スペクトルデータ）は、スケールファクタ算出部９において、各周波数帯域のＭＤＣＴ係数の絶対最大値から約２ｄＢ毎にスケールファクタが算出される。即ち、各周波数帯域のＭＤＣＴ係数の絶対最大値をＫmax（ｉ）、その時のスケールファクタをＳＦ（ｉ）とすれば、ＳＦ（ｉ）×２^-1/3＝＜Ｋmax（ｉ）＜ＳＦ（ｉ）となるようなＳＦ（ｉ）が算出される。
【００４８】
また、仮正規化部２３においては、変換されたＭＤＣＴ係数Ｋ（ｍ）とスケールファクタＳＦ（ｉ）から、各周波数帯域毎に上記（２）式のＫ（ｍ）／ＳＦ（ｉ）部分（仮正規化係数）が算出される。このとき−１＜Ｋ（ｍ）／ＳＦ（ｉ）＜１である。また、帯域毎のパワー算出部４、マスキング算出部５、量子化ビット数算出部６においては、上述したように、聴覚心理特性を利用してマスキングカーブを作成し、ビット割当処理が行われる。そして、比較閾値算出部２５においては、ビット割当処理から求めた各周波数帯域の量子化ビット数より、最終的に量子化される値とその中点を算出する。例えば、量子化ビット数ＷＬ（ｉ）＝２の時は、最終的に量子化される値が｛−１，０，１｝に対して｛−１，−０．５，０，０．５，１｝が算出される。これらは、予めテーブルＲＯＭに記憶しておいてもよい。
【００４９】
そして、定数算出部２４においては、仮正規化値Ｋ（ｍ）／ＳＦ（ｉ）が比較閾値のどの領域に入るかを比較する。これにより、量子化雑音による振幅が真値と比較して増加するのか減少するのかを判断する。例えば、ＷＬ（ｉ）＝２の時は、−１＜Ｋ（ｍ）／ＳＦ（ｉ）＝＜−０．５、または、０．５＜Ｋ（ｍ）／ＳＦ（ｉ）＝＜１ならば量子化雑音による振幅が真値より増加し、−０．５＜Ｋ（ｍ）／ＳＦ（ｉ）＜０．５ならば量子化雑音による振幅が真値より減少する。ここで、スケールファクタの算出に用いたＭＤＣＴ係数の絶対最大値Ｋmax（ｉ）は、量子化雑音による振幅が真値より必ず増加するので比較対象から除外する。
【００５０】
更に、Ｋmax（ｉ）とＳＦ（ｉ）との差分、即ち、ＳＦ（ｉ）−Ｋmax（ｉ）と、Ｋmax（ｉ）とＳＦ（ｉ）×２^-1/3との差分、即ち、Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3とを比較し、以下のような条件により、Ｋmax（ｉ）に掛ける定数ｆmax（ｉ）と、Ｋmax（ｉ）以外のＭＤＣＴ係数に掛ける定数ｆexmax（ｉ）とを算出する。
【００５１】
先ず、ＳＦ（ｉ）−Ｋmax（ｉ）＞＝Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3、且つ、Ｑｕｐ（ｉ）＞＝Ｑｄｏｗｎ（ｉ）ならば、ｆmax（ｉ）＝２^-1/3、ｆexmax（ｉ）＝ＭＩＮ（２^-1/3，（２^WL(i)-1−２）／（２^WL(i)-1−１））とする。
【００５２】
また、ＳＦ（ｉ）−Ｋmax（ｉ）＞＝Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3、且つ、Ｑｕｐ（ｉ）＜Ｑｄｏｗｎ（ｉ）ならば、ｆmax（ｉ）＝２^-1/3，ｆexmax（ｉ）＝２^-1/3とする。
【００５３】
また、ＳＦ（ｉ）−Ｋmax（ｉ）＜Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3、且つ、Ｑｕｐ（ｉ）＞＝Ｑｄｏｗｎ（ｉ）ならば、ｆmax（ｉ）＝１、ｆexmax（ｉ）＝ＭＡＸ（２^-1/3，（２^WL(i)-1−２）／（２^WL(i)-1−１））とする。
【００５４】
また、ＳＦ（ｉ）−Ｋmax（ｉ）＜Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3、且つ、Ｑｕｐ（ｉ）＜Ｑｄｏｗｎ（ｉ）ならば、ｆmax（ｉ）＝１、ｆexmax（ｉ）＝２^-1/3とする。尚、ＭＩＮ（ｘ，ｙ）は引数の最小値を返す関数であり、ＭＡＸ（ｘ，ｙ）は引数の最大値を返す関数である。
【００５５】
ここで、Ｑｕｐ（ｉ）は量子化雑音による振幅が真値より増加する場合の量子化誤差のパワーまたはエネルギ値であり、Ｑｄｏｗｎ（ｉ）は量子化雑音による振幅が真値より減少する場合の量子化誤差のパワーまたはエネルギ値である。または、単純に、真値より増加または減少する場合のＭＤＣＴ係数の本数であってもよい。図３に、ＷＬ（ｉ）＝２、ＳＦ（ｉ）＝２⁵、周波数帯域ｉのスペクトル本数が６のときのＭＤＣＴ係数の一例を示している。この例の場合、量子化誤差パワーＱｕｐ（ｉ）＝ＳＡ（２）²＋ＳＡ（３）²、量子化誤差パワーＱｄｏｗｎ（ｉ）＝ＳＡ（１）²＋ＳＡ（３）²＋ＳＡ（４）²＋ＳＡ（５）²＋ＳＡ（６）²として算出する。
【００５６】
図４は、定数算出部２４における、Ｋmax（ｉ）に掛ける定数ｆmax（ｉ）と、Ｋmax（ｉ）以外のＭＤＣＴ係数に掛ける定数ｆexmax（ｉ）との算出動作を示すフローチャートである。先ず、量子化雑音による振幅が真値より増加する場合の誤差のパワーである量子化誤差パワーＱｕｐ（ｉ）を上述のようにして、即ち、最終的に量子化される値の中点以上にあるスペクトルとＳＦ（ｉ）との差分の２乗和を算出する（ステップＰ１）。次に、量子化雑音による振幅が真値より減少する場合の誤差のパワーである量子化誤差パワーＱｄｏｗｎ（ｉ）を上述のようにして、即ち、Ｋmax（ｉ）以外のスペクトルと最終的に量子化される値との差分の２乗和を算出する（ステップＰ２）。
【００５７】
次に、Ｋmax（ｉ）とＳＦ（ｉ）との差分、即ち、ＳＦ（ｉ）−Ｋmax（ｉ）と、Ｋmax（ｉ）とＳＦ（ｉ）×２^-1/3との差分、即ち、Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3とを比較する（ステップＰ３）。そして、比較した結果、ＳＦ（ｉ）−Ｋmax（ｉ）＞＝Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3の場合、Ｑｕｐ（ｉ）とＱｄｏｗｎ（ｉ）を比較し（ステップＰ４）、Ｑｕｐ（ｉ）＞＝Ｑｄｏｗｎ（ｉ）の場合、ｆmax（ｉ）とｆexmax（ｉ）を、ｆmax（ｉ）＝２^-1/3、ｆexmax（ｉ）＝ＭＩＮ（２^-1/3，（２^WL(i)-1−２）／（２^WL(i)-1−１））と決定する（ステップＰ６）。一方、ステップＰ４において、Ｑｕｐ（ｉ）＜Ｑｄｏｗｎ（ｉ）の場合、ｆmax（ｉ）とｆexmax（ｉ）を、ｆmax（ｉ）＝２^-1/3、ｆexmax（ｉ）＝２^-1/3と決定する（ステップＰ７）。
【００５８】
また、ステップＰ３での比較結果が、ＳＦ（ｉ）−Ｋmax（ｉ）＜Ｋmax（ｉ）−ＳＦ（ｉ）×２^-1/3の場合、Ｑｕｐ（ｉ）とＱｄｏｗｎ（ｉ）を比較し（ステップＰ５）、Ｑｕｐ（ｉ）＞＝Ｑｄｏｗｎ（ｉ）の場合、ｆmax（ｉ）とｆexmax（ｉ）を、ｆmax（ｉ）＝１、ｆexmax（ｉ）＝ＭＡＸ（２^-1/3，（２^WL(i)-1−２）／（２^WL(i)-1−１））と決定する（ステップＰ８）。一方、ステップＰ５において、Ｑｕｐ（ｉ）＜Ｑｄｏｗｎ（ｉ）の場合、ｆmax（ｉ）とｆexmax（ｉ）を、ｆmax（ｉ）＝１、ｆexmax（ｉ）＝２^-1/3と決定する（ステップＰ９）。
【００５９】
そして、このようにして求めた定数ｆmax（ｉ）とｆexmax（ｉ）を、スペクトルデータ重み付け部２６において、個々のＭＤＣＴ係数（スペクトルデータ）に掛けて重み付けを行う。また、スケールファクタ減算部２２では、ｆmax（ｉ）＝２^-1/3となるｉの周波数帯域のスケールファクタＳＦ（ｉ）を、スケールファクタ算出部２１で算出したスケールファクタから１分解能だけ減算する。そして、量子化部７において、量子化ビット数算出部６で算出された量子化ビット数ＷＬ（ｉ）と、スペクトルデータ重み付け部２６で重み付けされたＭＤＣＴ係数（スペクトルデータ）Ｋ’（ｍ）と、スケールファクタ減算部２２で減算されたスケールファクタＳＦ’（ｉ）とを用いて、上述の（２）式により量子化係数が算出され、それを符号化した符号化データが出力される。
【００６０】
また、演算量を削減することや、効果の程度から、聴覚心理上比較的重要でないと判断される場合、即ち、量子化ビット数ＷＬ（ｉ）が小なる場合、例えば、ＷＬ（ｉ）＜４の場合にのみ、上記実施例を実施することが好ましい。
【００６１】
このようにして、デジタルデータ符号化装置１により、入力されたオーディオ信号の符号化を行うと、１つの量子化部７だけであるにも拘わらず、各周波数帯域のＭＤＣＴ係数の絶対最大値Ｋmax（ｉ）の量子化誤差を低減できるとともに、各周波数帯域のＭＤＣＴ係数の絶対最大値Ｋmax（ｉ）以外のＭＤＣＴ係数の量子化誤差も低減でき、復号化されたオーディオ信号の量子化雑音を低減して音質を向上させることができる。
【００６２】
尚、上記の実施形態では、デジタルデータ符号化装置１により符号化されるデジタルデータをオーディオ信号とした例を挙げて説明したが、オーディオ信号に限定されず、他のデジタルデータの場合であっても適用可能であることは言うまでもない。
【００６３】
【発明の効果】
以上のように、本発明によれば、時間領域のデジタルデータを周波数領域に変換する時間周波数変換部と、該時間周波数変換部により変換されたスペクトルを複数の周波数帯域に分割し各周波数帯域毎に最大スペクトルに関連するスケールファクタを生成するスケールファクタ生成部と、聴覚心理特性を利用して前記周波数帯域毎の量子化ビット数を算出する量子化ビット数算出部と、前記スケールファクタと量子化ビット数を用いて前記スペクトルを量子化した量子化係数を算出して符号化する量子化部を備えたデジタルデータ符号化装置において、前記スペクトルを前記スケールファクタで正規化した仮正規化係数を算出する仮正規化部と、前記量子化ビット数に応じた比較閾値を算出する比較閾値算出部と、前記仮正規化係数と比較閾値に基づいた定数を算出する定数算出部と、前記スペクトルに前記定数を乗算して重み付けするスペクトルデータ重み付け部を設けたので、入力されたデジタルデータの量子化時において、前記各周波数帯域のスペクトルの量子化誤差を低減して符号化することができ、この符号化されたデジタルデータの復号化時の量子化雑音を低減することができる。
【００６４】
また、本発明によれば、前記時間周波数変換部で変換されたスペクトルに前記定数算出部で算出される定数を乗算した修正スペクトルを前記量子化部で量子化した量子化係数が逆量子化されたときのスペクトル振幅は、前記スペクトルに前記定数を乗算していない無修正スペクトルを前記量子化部で量子化した量子化係数が逆量子化されたときのスペクトル振幅を超えないようになっており、また、前記定数は、前記各周波数帯域の最大スペクトルに乗算される第１の定数と、前記各周波数帯域の最大スペクトル以外のスペクトルに乗算される第２の定数とから成る定数としたので、１種類の量子化器を用いて、量子化時における各周波数帯域の最大スペクトルの量子化雑音を低減できるとともに、最大スペクトルを除く量子化後の周波数帯域のスペクトルレベルを、従来の量子化値と比較して低くすることができる。
【００６５】
また、本発明によれば、量子化ビット数が小なる時にのみ前記スペクトルの修正を施すことで、デジタルデータ符号化装置の演算量を効果的に削減できる。
【図面の簡単な説明】
【図１】は、本発明の実施形態に係るデジタルデータ符号化装置の電気的構成を示すブロック図である。
【図２】は、図１に示すデジタルデータ符号化装置のスケールファクタ生成部とスペクトルデータ修正部の電気的構成を示すブロック図である。
【図３】は、ＭＤＣＴ係数の一例を示す図である。
【図４】は、図２に示す定数算出部の定数算出動作を示すフローチャートである。
【符号の説明】
１デジタルデータ符号化装置
２周波数帯域分割部
３時間周波数変換部
４帯域毎のパワー算出部
５マスキング算出部
６量子化ビット算出部
７量子化部
８パッキング部
９スケールファクタ生成部
１０スペクトルデータ修正部
１１アンパッキング部
１２逆量子化部
１３周波数時間変換部
１４周波数帯域合成部
１５復号化装置
２０記録メディア
２１スケールファクタ算出部
２２スケールファクタ減算部
２３仮正規化部
２４定数算出部
２５比較閾値算出部
２６スペクトルデータ重み付け部

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a digital data encoding device that, when recording digital data such as musical sound or voice on a recording medium such as a MiniDisc, allocates bits to the spectrum of each frequency band adaptively to the musical sound or voice and thereby compresses the amount of data.
[0002]
[Prior art]
Conventionally, ATRAC (Adaptive TRansform Acoustic Coding), which is used in the MiniDisc, is one method for compressing and encoding digital data such as musical sound and voice with high efficiency. In ATRAC, in order to compress with high efficiency, the input digital data is divided into a plurality of frequency bands (hereinafter referred to as subbands where appropriate) and then blocked in variable-length time units. The blocked digital data is converted into spectrum data by MDCT (Modified Discrete Cosine Transform) processing, and each item of spectrum data is then encoded with the number of bits allocated to it using psychoacoustic characteristics.
[0003]
Examples of the psychoacoustic characteristics that can be exploited in this compression encoding include the equal-loudness characteristic and the masking effect. The equal-loudness characteristic expresses the fact that the loudness perceived by humans varies with frequency even for sounds of the same sound pressure level. Accordingly, the minimum audible threshold, the quietest sound that humans can perceive, also varies with the frequency of the sound.
[0004]
The masking effect, on the other hand, includes simultaneous masking and temporal masking. Simultaneous masking is the phenomenon in which, when sounds with multiple frequency components occur at the same time, one sound makes another difficult to hear. Temporal masking is the phenomenon in which other sounds become difficult to hear immediately before and after a loud sound on the time axis.
[0005]
In a bit allocation method using such psychoacoustic characteristics, for example, an allocation method called an iterative method, actual bit allocation adapted to input digital data is performed as follows.
[0006]
First, the power S of each frequency band is obtained, and the masking threshold M that this power S imposes on the other frequency bands is obtained. Next, the masking-threshold-to-noise ratio MNR(n) = M / N(n) is obtained from the masking threshold M and the quantization noise power N(n) that results when each frequency band is quantized with n bits. Subsequently, a bit is allocated to the frequency band in which MNR(n) is smallest, MNR(n) is updated, and a bit is again allocated to the frequency band with the smallest value.
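The iterative allocation described above can be sketched as a greedy loop. This is only an illustrative sketch, not the allocation used by any actual ATRAC encoder: the function name `allocate_bits`, the bit budget `total_bits`, the cap `max_bits`, and in particular the noise model (each added bit divides the noise power by 4, starting from the band power) are assumptions introduced here.

```python
def allocate_bits(band_power, mask_threshold, total_bits, max_bits=16):
    """Greedy bit allocation: repeatedly give one bit to the band with the lowest MNR.

    Assumed noise model (not from the source): N(n) = S / 4**n, i.e. roughly
    6 dB less quantization noise per added bit.
    """
    n_bands = len(band_power)
    bits = [0] * n_bands

    def mnr(i):
        noise = band_power[i] / (4.0 ** bits[i])
        if noise == 0.0:
            return float("inf")  # a silent band never needs more bits
        return mask_threshold[i] / noise  # MNR(n) = M / N(n)

    for _ in range(total_bits):
        candidates = [i for i in range(n_bands) if bits[i] < max_bits]
        if not candidates:
            break
        worst = min(candidates, key=mnr)  # band with the smallest MNR
        bits[worst] += 1                  # allocate one bit, MNR is re-evaluated next pass
    return bits


if __name__ == "__main__":
    print(allocate_bits([100.0, 10.0, 1.0], [1.0, 1.0, 1.0], total_bits=6))
```

Under this assumed model the loudest band relative to its masking threshold receives bits first, and a band already at or below its threshold may receive none.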
[0007]
In this manner, the number of quantization bits (quantization word length) of each frequency band is calculated by bit allocation using psychoacoustic characteristics. In addition, a scale factor related to the maximum amplitude level of the spectrum is calculated for each frequency band, the spectrum data of each frequency band is quantized based on this scale factor and the number of quantization bits, and the result is then encoded into compressed data.
[0008]
In a device that decodes data encoded by this encoding process, at the time of inverse quantization the spectrum data is expanded for each frequency band from the number of quantization bits, the scale factor, and the quantization coefficients obtained at the time of quantization. The expanded spectrum data is then blocked in variable-length time units and subjected to inverse MDCT processing to be converted into time-domain data. Finally, the plurality of frequency bands (subbands) of the converted time-domain data are synthesized, and the result is restored to digital data such as musical sound or voice.
[0009]
As a method of reducing the maximum quantization noise at the time of quantization, there is a method of multiplying all samples within a predetermined frequency interval by a fixed coefficient corresponding to the number of quantization bits. That is, when WL (the number of quantization bits) is smaller than X, all spectrum samples in the unit (the frequency band unit in which quantization is performed) are multiplied by k = (2^WL[n] - 2) / (2^WL[n] - 1) to obtain corrected spectrum samples, which are then quantized with a modified quantizer (see, for example, Patent Document 1).
[0010]
There is also a method of strengthening the quantizer itself, that is, calculating the scale factor by multiplying the maximum absolute value of the spectrum within the frequency band unit to be quantized by a coefficient k (0 < k < 1), and quantizing the spectrum samples without correcting them (see, for example, Patent Document 2).
[0011]
[Patent Document 1]
Japanese Patent No. 3150475
[Patent Document 2]
Japanese Patent Laid-Open No. 11-177435
[0012]
[Problems to be solved by the invention]
However, although the prior art described in Patent Document 1 can reduce the maximum quantization noise, depending on the spectrum distribution of the frequency band unit being quantized there are spectrum samples that would be closer to their true values without correction, and for these samples the effect is reversed. A further problem is that two kinds of quantizers corresponding to the number of quantization bits are required.
[0013]
In the prior art described in Patent Document 2, the scale factor is calculated by multiplying the maximum absolute value of the spectrum data in the frequency band to be quantized by a coefficient k (0 <k <1). Depending on the distribution, there is a problem that the normalized value (SD (m) / SF) exceeds 1, that is, overflow may occur.
[0014]
In view of the above problems, an object of the present invention is to provide a digital data encoding device that, using a single kind of quantizer, reduces the quantization noise of the maximum spectrum of each frequency band at the time of quantization and makes the post-quantization spectrum levels of the frequency band, other than the maximum spectrum, lower than the conventional quantized values.
[0015]
[Means for Solving the Problems]
To achieve the above object, the present invention provides a digital data encoding device that divides a spectrum, obtained by converting time-domain digital data into the frequency domain, into a plurality of frequency bands, generates for each frequency band a scale factor related to the maximum spectrum, calculates the number of quantization bits for each frequency band using psychoacoustic characteristics, and calculates and encodes quantization coefficients obtained by quantizing the spectrum using the scale factor and the number of quantization bits, wherein the device is provided with a spectrum data correction unit that corrects the spectrum to be quantized by multiplying it by a constant calculated based on the scale factor and the number of quantization bits.
[0016]
In this way, when the input digital data is quantized, the spectrum of each frequency band can be encoded with reduced quantization error, and the quantization noise that appears when the encoded digital data is decoded can be reduced.
[0017]
Also, for example, in a digital data encoding device including a time-frequency conversion unit that converts time-domain digital data into the frequency domain, a scale factor generation unit that divides the spectrum converted by the time-frequency conversion unit into a plurality of frequency bands and generates for each frequency band a scale factor related to the maximum spectrum, a quantization bit number calculation unit that calculates the number of quantization bits for each frequency band using psychoacoustic characteristics, and a quantization unit that calculates and encodes quantization coefficients obtained by quantizing the spectrum using the scale factor and the number of quantization bits, it is preferable to provide a spectrum data correction unit that corrects the spectrum to be quantized by the quantization unit by multiplying it by a constant calculated based on the scale factor and the number of quantization bits.
[0018]
In this way, when the input digital data is quantized, the spectrum of each frequency band can be encoded with reduced quantization error, and the quantization noise that appears when the encoded digital data is decoded can be reduced.
[0019]
Further, for example, the spectrum data correction unit preferably includes a temporary normalization unit that calculates a temporary normalization coefficient by normalizing the spectrum with the scale factor, a comparison threshold calculation unit that calculates a comparison threshold according to the number of quantization bits, a constant calculation unit that calculates a constant based on the temporary normalization coefficient and the comparison threshold, and a spectrum data weighting unit that weights the spectrum by multiplying it by the constant.
[0020]
In this way, a constant corresponding to the magnitude of the spectrum of each frequency band is calculated, and the spectrum is weighted by multiplying it by the constant, so that the quantization error at the time of quantizing the spectrum can be reduced.
[0021]
Further, for example, if the spectrum amplitude obtained by dequantizing the quantization coefficient of the corrected spectrum (the spectrum converted by the time-frequency conversion unit, multiplied by the constant calculated by the constant calculation unit, and then quantized by the quantization unit) does not exceed the spectrum amplitude obtained by dequantizing the quantization coefficient of the uncorrected spectrum (the spectrum quantized by the quantization unit without being multiplied by the constant), the dequantized spectrum level of each frequency band becomes lower than the conventional quantized value, and quantization noise can be reduced.
[0022]
Further, for example, if the constant calculated by the constant calculation unit consists of a first constant that is multiplied by the maximum spectrum of each frequency band and a second constant that is multiplied by the spectra other than the maximum spectrum of each frequency band, a single kind of quantizer suffices to reduce the quantization noise of the maximum spectrum of each frequency band at the time of quantization, and the post-quantization spectrum levels of the frequency band, other than the maximum spectrum, can be made lower than the conventional quantized values.
[0023]
Further, for example, if the scale factor generation unit includes a scale factor subtraction unit that decrements the scale factor according to the constant calculated by the constant calculation unit and supplies the result to the quantization unit, the normalized value of the spectrum can be prevented from exceeding 1, that is, overflow can be prevented.
[0024]
Further, for example, if the spectrum data correction unit corrects the spectrum by multiplying it by the constant when the number of quantization bits is smaller than a predetermined number of bits, the amount of calculation in the spectrum data correction unit decreases, and the amount of calculation of the digital data encoding device can be effectively reduced.
[0025]
Further, for example, if the constant calculation unit calculates the constant and the spectrum data weighting unit weights the spectrum by multiplying it by the constant when the number of quantization bits is smaller than a predetermined number of bits, the amount of calculation in the constant calculation unit and the spectrum data weighting unit decreases, and the amount of calculation of the digital data encoding device can be effectively reduced.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described below with reference to the drawings. FIG. 1 is a block diagram showing the electrical configuration of a digital data encoding device according to an embodiment of the present invention; a decoding device is also shown to aid understanding. The encoding and decoding processes performed in the ATRAC (Adaptive TRansform Acoustic Coding) method used in the MiniDisc and the like will be described below with reference to FIG. 1. In FIG. 1, 1 denotes the digital data encoding device, 15 the decoding device, and 20 a recording medium such as a MiniDisc.
[0027]
The digital data encoding apparatus 1 includes a frequency band division unit 2, a time frequency conversion unit 3, a power calculation unit 4 for each band, a masking calculation unit 5, a quantization bit number calculation unit 6, a quantization unit 7, a packing unit 8, A scale factor generation unit 9 and a spectrum data correction unit 10 are provided. In addition, the decoding device 15 includes an unpacking unit 11, an inverse quantization unit 12, a frequency time conversion unit 13, and a frequency band synthesis unit 14.
[0028]
Next, the encoding process of the digital data encoding device 1 having this configuration will be described. An audio signal, which is digital data sampled at 44.1 kHz, is input to the input terminal of the digital data encoding device 1. In the frequency band division unit 2, the input audio signal is divided into a plurality of frequency bands (subband frames) by a QMF (Quadrature Mirror Filter), which is a band division filter. For example, there are three bands: a low-band subband frame SB1 of 0 to 5.5 kHz, a mid-band subband frame SB2 of 5.5 to 11 kHz, and a high-band subband frame SB3 of 11 to 22 kHz.
[0029]
Next, the time frequency conversion unit 3 performs MDCT (Modified Discrete Cosine Transform) processing on each subband frame obtained by the frequency band division unit 2, thereby converting the input audio signal into MDCT coefficients (spectrum data), which are frequency components. The transformed data Xm(k) obtained by this MDCT processing is expressed by the following equation (1).
[0030]
[Expression 1]

[0031]
In the above equation, the variable m represents the frame number, and the function xm(i) represents the input signal. The function h(i) represents the forward-transform window function.
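The image of equation (1) is not reproduced in this text. For orientation only, a commonly used form of the MDCT written with the symbols defined above is given here; the block size N (N coefficients computed from 2N windowed input samples) and the exact phase term are assumptions, and the normalisation in the original equation may differ.

```latex
X_m(k) = \sum_{i=0}^{2N-1} h(i)\, x_m(i)\,
         \cos\!\left[ \frac{\pi}{N} \left( i + \frac{1}{2} + \frac{N}{2} \right)
                      \left( k + \frac{1}{2} \right) \right],
\qquad k = 0, 1, \ldots, N-1
```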
[0032]
The power calculation unit 4 for each band then converts the MDCT coefficients obtained by the time frequency conversion unit 3 into the spectrum power Si of each of I frequency bands (i = 1, 2, ..., I; for example, I = 25). Here, critical bands (in units of Bark) or the like are used as the frequency bands. For the spectrum powers thus obtained, the masking calculation unit 5 creates a masking curve using psychoacoustic characteristics, and the quantization bit number calculation unit 6 performs bit allocation processing based on the masking curve created by the masking calculation unit 5 and each spectrum power.
[0033]
Further, the scale factor generation unit 9 calculates a scale factor, in steps of about 2 dB, from the absolute maximum value of the MDCT coefficients of each frequency band converted by the time frequency conversion unit 3. The spectrum data correction unit 10 obtains a constant from the temporary normalization coefficient, which is the MDCT coefficient provisionally normalized by this scale factor, and from the number of quantization bits of each frequency band obtained in the bit allocation processing of the quantization bit number calculation unit 6, and multiplies the MDCT coefficients by this constant to generate weighted MDCT coefficients. The weighted MDCT coefficients and the scale factors, some of which have been decremented, are then supplied to the quantization unit 7. Details of the configuration and operation of the scale factor generation unit 9 and the spectrum data correction unit 10 will be described later.
[0034]
Then, the quantization unit 7 uses the number of quantization bits of each frequency band obtained from the bit allocation processing of the quantization bit number calculation unit 6 and the scale factor supplied from the scale factor generation unit 9 to quantize the spectrum data supplied from the spectrum data correction unit 10 according to the following equation (2), yielding the quantization coefficient MK(m).
[0035]
[Expression 2]

[0036]
In the above equation, the variable m is the index of the MDCT coefficient, i is the index of the quantization frequency band, K(m) is the MDCT coefficient, WL(i) is the number of quantization bits, and SF(i) is the scale factor. Round is a function that rounds to the nearest integer, with halves rounded up.
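The image of equation (2) is not reproduced in this text. A reconstruction consistent with the symbols defined above, and with the description elsewhere in this document that WL(i) = 2 yields the normalized output levels {-1, 0, 1}, would be the following; it should be read as an assumed form rather than a verbatim copy of the original equation.

```latex
MK(m) = \mathrm{Round}\!\left( \frac{K(m)}{SF(i)} \left( 2^{\,WL(i)-1} - 1 \right) \right)
```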
[0037]
Then, the quantization coefficient, the number of quantization bits, and the scale factor quantized by the quantization unit 7 are packed and encoded by the packing unit 8 together with the frame information, and are recorded on the recording medium 20.
[0038]
Next, the decoding process of the decoding device 15 will be described. From the encoded data recorded on the recording medium 20 as described above, the unpacking unit 11 unpacks the quantization coefficients, the number of quantization bits, and the scale factor using the frame information. The inverse quantization unit 12 then performs inverse quantization using the quantization coefficients, the number of quantization bits, and the scale factor, and the result is input to the frequency time conversion unit 13, which performs IMDCT (Inverse Modified Discrete Cosine Transform) processing. The IMDCT input signal I(m) input to the frequency time conversion unit 13 is obtained by inverse quantization according to the following equation (3).
[0039]
[Expression 3]

[0040]
In the above equation, the variable m is the index of the IMDCT input signal, i is the index of the inverse quantization frequency band, MK(m) is the quantization coefficient, WL(i) is the number of quantization bits, and SF(i) is the scale factor.
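The images of equations (2) and (3) are not reproduced in this text. The sketch below shows one quantize/dequantize round trip under assumed forms: MK(m) = Round(K(m) / SF(i) × (2^(WL(i)-1) - 1)) and I(m) = MK(m) × SF(i) / (2^(WL(i)-1) - 1). The function names, the WL ≥ 2 restriction, and the round-half-away-from-zero behaviour are illustrative assumptions, not taken from the source.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, halves away from zero (assumed meaning of Round)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(k, sf, wl):
    """Assumed form of equation (2): quantize coefficient k with scale factor sf and wl bits."""
    if wl < 2:
        raise ValueError("this sketch assumes at least 2 quantization bits")
    levels = 2 ** (wl - 1) - 1          # largest quantizer output magnitude
    return round_half_away(k / sf * levels)


def dequantize(mk, sf, wl):
    """Assumed form of equation (3): recover a spectrum amplitude from code mk."""
    levels = 2 ** (wl - 1) - 1
    return mk * sf / levels


if __name__ == "__main__":
    sf, wl = 32.0, 2                    # SF = 2**5 and WL = 2 give output levels {-1, 0, 1}
    for k in (20.0, 10.0, -20.0):
        mk = quantize(k, sf, wl)
        print(k, "->", mk, "->", dequantize(mk, sf, wl))
```

With these assumed forms and WL = 2, a coefficient of 20 decodes to 32 (larger than the true value) while a coefficient of 10 decodes to 0 (smaller than the true value), which illustrates how coarse word lengths can move decoded amplitudes in either direction.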
[0041]
The spectrum data that has been inversely quantized by the inverse quantization unit 12 and restored to a spectrum is subjected to IMDCT processing by the frequency time conversion unit 13 for each subband frame, and is converted to time axis data. The inversely converted data ym(i) obtained by the IMDCT process of the frequency time conversion unit 13 is expressed by the following equation (4).
[0042]
[Expression 4]

[0043]
In the above equation, the variable m represents the frame number, f(i) represents the inverse transformation window function, and Xm(k) represents the transformation data.
[0044]
Further, in the frequency band synthesizing unit 14, the inversely converted frequency subband frame is band synthesized by an IQMF (Inverse Quadrature Mirror Filter) which is a band synthesis filter, and decoded into audio data.
[0045]
Next, a method for generating the scale factor given to the quantization unit 7 and a method for correcting the MDCT coefficients (spectral data) given to the quantization unit 7 will be described. FIG. 2 is a block diagram showing an electrical configuration of the scale factor generation unit 9 and the spectrum data correction unit 10. In FIG. 2, the same parts as those in FIG. 1 are denoted by the same reference numerals, and detailed description thereof is omitted.
[0046]
The scale factor generation unit 9 includes a scale factor calculation unit 21 and a scale factor subtraction unit 22. The spectrum data correction unit 10 includes a temporary normalization unit 23, a constant calculation unit 24, a comparison threshold calculation unit 25, and a spectrum data weighting unit 26.
[0047]
First, for the MDCT coefficients (spectral data) converted by the above equation (1) in the time-frequency conversion unit 3 shown in FIG. 1, the scale factor calculation unit 21 calculates a scale factor, in steps of about 2 dB, from the absolute maximum value of the MDCT coefficients in each frequency band. That is, if the absolute maximum value of the MDCT coefficients in each frequency band is Kmax(i) and the scale factor at that time is SF(i), then SF(i) is calculated such that SF(i) × 2^{-1/3} ≤ Kmax(i) < SF(i).
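The inequality above can be illustrated with a short Python sketch. It assumes that the permitted scale factors lie on the grid 2^{n/3} for integer n, which is suggested by the 2^{-1/3} step (about 2 dB) but is not stated explicitly in this description. The function name is hypothetical.

```python
import math

def scale_factor(kmax: float) -> float:
    """Return the smallest value 2^(n/3) that is strictly greater than kmax (kmax > 0).

    Assumes scale factors lie on a 2^(n/3) grid. Values of kmax that sit
    exactly on a non-integer grid point may be affected by floating-point rounding.
    """
    n = math.floor(3 * math.log2(kmax)) + 1
    return 2.0 ** (n / 3)

kmax = 30.0
sf = scale_factor(kmax)
step_db = 20 * math.log10(2.0 ** (-1.0 / 3.0))  # size of one step in dB

# Check the condition SF(i) * 2^(-1/3) <= Kmax(i) < SF(i).
print(sf, sf * 2.0 ** (-1.0 / 3.0) <= kmax < sf, round(step_db, 2))
```

For Kmax(i) = 30 this selects SF(i) = 2^5 = 32, and one grid step corresponds to roughly −2.01 dB.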
[0048]
Further, the temporary normalization unit 23 uses the converted MDCT coefficient K(m) and the scale factor SF(i) to calculate, for each frequency band, the K(m)/SF(i) portion of the above equation (2) (the temporary normalization coefficient). At this time, −1 < K(m)/SF(i) < 1. In addition, as described above, the power calculation unit 4 for each band, the masking calculation unit 5, and the quantization bit number calculation unit 6 create a masking curve using psychoacoustic characteristics and perform bit allocation processing. Then, the comparison threshold calculation unit 25 calculates the final quantized values and their midpoints from the number of quantization bits in each frequency band obtained from the bit allocation process. For example, when the number of quantization bits WL(i) = 2, the final quantized values are {−1, 0, 1}, and the set {−1, −0.5, 0, 0.5, 1}, which adds their midpoints, is calculated. These values may be stored in advance in a table ROM.
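As a minimal sketch of the comparison threshold calculation, the following Python function generates the normalized quantized values together with their midpoints. It assumes that the quantized values are uniformly spaced at multiples of 1 / (2^{WL(i)-1} − 1), which matches the WL(i) = 2 example above; the function name is hypothetical.

```python
def comparison_thresholds(wl: int) -> list:
    """Return the normalized quantized values and their midpoints for wl >= 2 bits.

    Even multiples of the half step are quantized values and odd multiples are
    midpoints, assuming a uniform quantizer.
    """
    levels = 2 ** (wl - 1) - 1
    return [j / (2 * levels) for j in range(-2 * levels, 2 * levels + 1)]

print(comparison_thresholds(2))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Because the result depends only on WL(i), the lists can be precomputed for each bit count, which corresponds to the table ROM option mentioned above.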
[0049]
Then, the constant calculation unit 24 determines which region of the comparison thresholds the temporary normalization coefficient K(m)/SF(i) falls into. Thereby, it is determined whether quantization noise makes the amplitude larger or smaller than the true value. For example, when WL(i) = 2, if −1 < K(m)/SF(i) ≤ −0.5 or 0.5 < K(m)/SF(i) ≤ 1, the amplitude increases from the true value due to quantization noise, and if −0.5 < K(m)/SF(i) < 0.5, the amplitude decreases from the true value due to quantization noise. Here, the absolute maximum value Kmax(i) of the MDCT coefficients used for the calculation of the scale factor is excluded from the comparison, because its amplitude always increases from the true value due to quantization noise.
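The region test can be sketched in Python as follows. The sketch generalizes the WL(i) = 2 example to any bit count by comparing the magnitude of the quantized value with the magnitude of the input. Both this generalization and the treatment of a value lying exactly on a midpoint (rounded away from zero) are assumptions, and the function name is hypothetical.

```python
import math

def noise_direction(x: float, wl: int) -> str:
    """Classify a temporary normalization coefficient x = K(m)/SF(i).

    Returns "up" if quantization increases the amplitude, "down" if it
    decreases it, and "exact" if x already equals a quantized value.
    """
    levels = 2 ** (wl - 1) - 1
    # Nearest quantized value, with midpoints rounded away from zero (assumed).
    q = math.copysign(math.floor(abs(x) * levels + 0.5), x) / levels
    if abs(q) > abs(x):
        return "up"
    if abs(q) < abs(x):
        return "down"
    return "exact"

for x in (0.7, 0.3, -0.6):
    print(x, noise_direction(x, 2))
```

For WL(i) = 2, the values 0.7 and −0.6 are classified as increasing and 0.3 as decreasing, in line with the regions stated above.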
[0050]
Further, the difference between Kmax(i) and SF(i), that is, SF(i) − Kmax(i), is compared with the difference between Kmax(i) and SF(i) × 2^{-1/3}, that is, Kmax(i) − SF(i) × 2^{-1/3}, and a constant fmax(i) to be multiplied by Kmax(i) and a constant fexmax(i) to be multiplied by the MDCT coefficients other than Kmax(i) are calculated under the following conditions.
[0051]
First, if SF(i) − Kmax(i) ≥ Kmax(i) − SF(i) × 2^{-1/3} and Qup(i) ≥ Qdown(i), then fmax(i) = 2^{-1/3} and fexmax(i) = MIN(2^{-1/3}, (2^{WL(i)-1} − 2) / (2^{WL(i)-1} − 1)).
[0052]
If SF(i) − Kmax(i) ≥ Kmax(i) − SF(i) × 2^{-1/3} and Qup(i) < Qdown(i), then fmax(i) = 2^{-1/3} and fexmax(i) = 2^{-1/3}.
[0053]
If SF(i) − Kmax(i) < Kmax(i) − SF(i) × 2^{-1/3} and Qup(i) ≥ Qdown(i), then fmax(i) = 1 and fexmax(i) = MAX(2^{-1/3}, (2^{WL(i)-1} − 2) / (2^{WL(i)-1} − 1)).
[0054]
If SF(i) − Kmax(i) < Kmax(i) − SF(i) × 2^{-1/3} and Qup(i) < Qdown(i), then fmax(i) = 1 and fexmax(i) = 2^{-1/3}. Here, MIN(x, y) is a function that returns the smaller of its arguments, and MAX(x, y) is a function that returns the larger of its arguments.
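The four conditions above can be written compactly as a selection function. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and WL(i) ≥ 2 is assumed so that the denominator 2^{WL(i)-1} − 1 is nonzero.

```python
def select_constants(sf: float, kmax: float, qup: float, qdown: float, wl: int):
    """Return (fmax, fexmax) for one frequency band according to the four conditions."""
    c = 2.0 ** (-1.0 / 3.0)          # one scale factor step, 2^(-1/3)
    levels = 2 ** (wl - 1) - 1       # requires wl >= 2
    ratio = (levels - 1) / levels    # (2^(WL-1) - 2) / (2^(WL-1) - 1)

    # True when Kmax(i) is at least as close to SF(i) * 2^(-1/3) as to SF(i).
    if (sf - kmax) >= (kmax - sf * c):
        fmax = c
        fexmax = min(c, ratio) if qup >= qdown else c
    else:
        fmax = 1.0
        fexmax = max(c, ratio) if qup >= qdown else c
    return fmax, fexmax

print(select_constants(32.0, 26.0, 5.0, 3.0, 2))
print(select_constants(32.0, 31.0, 5.0, 3.0, 4))
```

Note that for WL(i) = 2 the ratio (2^{WL(i)-1} − 2) / (2^{WL(i)-1} − 1) equals 0, so the first condition yields fexmax(i) = 0 and the third yields fexmax(i) = 2^{-1/3}.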
[0055]
Here, Qup(i) is the power or energy value of the quantization error for the coefficients whose amplitude increases from the true value due to quantization noise, and Qdown(i) is the power or energy value of the quantization error for the coefficients whose amplitude decreases from the true value due to quantization noise. Alternatively, Qup(i) and Qdown(i) may simply be the numbers of MDCT coefficients whose amplitude increases or decreases from the true value. FIG. 3 shows an example of MDCT coefficients when WL(i) = 2, SF(i) = 2^5, and the number of spectra in the frequency band i is six. In this example, the quantization error powers are calculated as Qup(i) = SA(2)² + SA(3)² and Qdown(i) = SA(1)² + SA(3)² + SA(4)² + SA(5)² + SA(6)².
[0056]
FIG. 4 is a flowchart showing the operation of the constant calculation unit 24 for calculating the constant fmax(i) to be multiplied by Kmax(i) and the constant fexmax(i) to be multiplied by the MDCT coefficients other than Kmax(i). First, the quantization error power Qup(i), which is the error power when the amplitude increases from the true value due to quantization noise, is calculated as described above, that is, as the sum of squares of the differences between SF(i) and the spectra that lie above the midpoint of the final quantized values (step P1). Next, the quantization error power Qdown(i), which is the error power when the amplitude decreases from the true value due to quantization noise, is calculated as described above, that is, as the sum of squares of the differences between the spectra other than Kmax(i) and the values to which they are finally quantized (step P2).
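Steps P1 and P2 can be sketched as follows. This Python example is one interpretation rather than the procedure of FIG. 4 itself: it assumes a uniform quantizer with 2^{WL(i)-1} − 1 positive levels, takes each error as the difference between a coefficient and its dequantized value, and assigns the squared error to Qup(i) or Qdown(i) according to whether the decoded magnitude is larger or smaller than the original. The function name is hypothetical.

```python
import math

def error_powers(coeffs, sf: float, wl: int):
    """Return (qup, qdown) for one band, excluding the absolute maximum Kmax(i)."""
    levels = 2 ** (wl - 1) - 1
    imax = max(range(len(coeffs)), key=lambda m: abs(coeffs[m]))
    qup = qdown = 0.0
    for m, k in enumerate(coeffs):
        if m == imax:
            continue  # Kmax(i) is excluded from the comparison
        mk = math.copysign(math.floor(abs(k) / sf * levels + 0.5), k)
        decoded = mk * sf / levels           # value the decoder would restore
        err2 = (decoded - k) ** 2
        if abs(decoded) > abs(k):
            qup += err2                      # amplitude increased by quantization
        elif abs(decoded) < abs(k):
            qdown += err2                    # amplitude decreased by quantization
    return qup, qdown

# Band with SF(i) = 32 and WL(i) = 2; 30.0 is Kmax(i) and is skipped.
print(error_powers([30.0, 20.0, 4.0, -6.0], 32.0, 2))  # (144.0, 52.0)
```

In this example, 20.0 is decoded as 32 (error 12, counted in Qup), while 4.0 and −6.0 are decoded as 0 (errors 4 and 6, counted in Qdown).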
[0057]
Next, the difference between Kmax(i) and SF(i), that is, SF(i) − Kmax(i), and the difference between Kmax(i) and SF(i) × 2^{-1/3}, that is, Kmax(i) − SF(i) × 2^{-1/3}, are compared (step P3). If the comparison gives SF(i) − Kmax(i) ≥ Kmax(i) − SF(i) × 2^{-1/3}, Qup(i) and Qdown(i) are compared (step P4). If Qup(i) ≥ Qdown(i), fmax(i) and fexmax(i) are determined as fmax(i) = 2^{-1/3} and fexmax(i) = MIN(2^{-1/3}, (2^{WL(i)-1} − 2) / (2^{WL(i)-1} − 1)) (step P6). On the other hand, if Qup(i) < Qdown(i) in step P4, fmax(i) and fexmax(i) are determined as fmax(i) = 2^{-1/3} and fexmax(i) = 2^{-1/3} (step P7).
[0058]
Further, if the comparison in step P3 gives SF(i) − Kmax(i) < Kmax(i) − SF(i) × 2^{-1/3}, Qup(i) and Qdown(i) are compared (step P5). If Qup(i) ≥ Qdown(i), fmax(i) and fexmax(i) are determined as fmax(i) = 1 and fexmax(i) = MAX(2^{-1/3}, (2^{WL(i)-1} − 2) / (2^{WL(i)-1} − 1)) (step P8). On the other hand, if Qup(i) < Qdown(i) in step P5, fmax(i) and fexmax(i) are determined as fmax(i) = 1 and fexmax(i) = 2^{-1/3} (step P9).
[0059]
Then, the constants fmax(i) and fexmax(i) obtained in this way are multiplied by the individual MDCT coefficients (spectrum data) in the spectrum data weighting unit 26 to perform weighting. Further, when fmax(i) = 2^{-1/3}, the scale factor subtraction unit 22 lowers the scale factor calculated by the scale factor calculation unit 21 by one resolution step. The quantization unit 7 then calculates the quantization coefficient by the above-described equation (2), using the number of quantization bits WL(i) calculated by the quantization bit number calculation unit 6, the MDCT coefficient (spectrum data) K′(m) weighted by the spectrum data weighting unit 26, and the scale factor SF′(i) adjusted by the scale factor subtraction unit 22, and outputs encoded data obtained by encoding the quantization coefficient.
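The weighting and scale factor adjustment can be illustrated with the following Python sketch. It assumes that one resolution step of the scale factor corresponds to multiplication by 2^{-1/3}, which follows from the scale factor condition stated earlier but is not spelled out in the text. The function name and the tolerance used to test fmax(i) are hypothetical.

```python
def apply_correction(coeffs, sf: float, fmax: float, fexmax: float):
    """Return (weighted coefficients K'(m), adjusted scale factor SF'(i)) for one band."""
    c = 2.0 ** (-1.0 / 3.0)
    imax = max(range(len(coeffs)), key=lambda m: abs(coeffs[m]))
    # fmax weights the absolute maximum coefficient; fexmax weights all others.
    weighted = [k * (fmax if m == imax else fexmax) for m, k in enumerate(coeffs)]
    # Assumption: lowering by one resolution step means multiplying SF(i) by 2^(-1/3).
    sf_adjusted = sf * c if abs(fmax - c) < 1e-12 else sf
    return weighted, sf_adjusted

c = 2.0 ** (-1.0 / 3.0)
weighted, sf_adjusted = apply_correction([26.0, 20.0, 4.0], 32.0, c, c)
print(weighted, sf_adjusted)
```

Under these assumptions, when fmax(i) = 2^{-1/3} the ratio K′max(i)/SF′(i) equals Kmax(i)/SF(i), so the maximum coefficient keeps its quantized value while being decoded against the smaller SF′(i), about 25.4 here, which is closer to the true value 26 than SF(i) = 32.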
[0060]
Further, in order to reduce the amount of calculation, or in view of how significant the effect is judged to be psychoacoustically, it is preferable to apply the above embodiment only when the number of quantization bits WL(i) is small, for example, when WL(i) < 4.
[0061]
In this manner, when the input audio signal is encoded by the digital data encoding device 1, even though only one quantization unit 7 is provided, the quantization error of the absolute maximum value Kmax(i) of the MDCT coefficients in each frequency band can be reduced, and the quantization error of the MDCT coefficients other than Kmax(i) in each frequency band can also be reduced. As a result, the quantization noise of the decoded audio signal is reduced and the sound quality can be improved.
[0062]
In the above embodiment, an example in which the digital data encoded by the digital data encoding device 1 is an audio signal has been described. However, the present invention is not limited to audio signals, and it goes without saying that it is also applicable to other digital data.
[0063]
【The invention's effect】
As described above, according to the present invention, a digital data encoding device includes: a time-frequency conversion unit that converts time-domain digital data into the frequency domain; a scale factor generation unit that divides the spectrum converted by the time-frequency conversion unit into a plurality of frequency bands and generates, for each frequency band, a scale factor related to the maximum spectrum; a quantization bit number calculation unit that calculates the number of quantization bits for each frequency band using psychoacoustic characteristics; and a quantization unit that calculates and encodes a quantization coefficient obtained by quantizing the spectrum using the scale factor and the number of quantization bits. The device further includes a temporary normalization unit that calculates a temporary normalization coefficient obtained by normalizing the spectrum by the scale factor, a comparison threshold calculation unit that calculates a comparison threshold according to the number of quantization bits, a constant calculation unit that calculates a constant based on the temporary normalization coefficient and the comparison threshold, and a spectrum data weighting unit that multiplies the spectrum by the constant to weight it. Therefore, when the input digital data is quantized, encoding can be performed while reducing the quantization error of the spectrum in each frequency band, and quantization noise can be reduced when the encoded digital data is decoded.
[0064]
Further, according to the present invention, the constant is chosen so that the spectrum amplitude obtained by inversely quantizing the quantization coefficient of the corrected spectrum (the spectrum converted by the time-frequency conversion unit, multiplied by the constant calculated by the constant calculation unit, and quantized by the quantization unit) does not exceed the spectrum amplitude obtained by inversely quantizing the quantization coefficient of the uncorrected spectrum that is not multiplied by the constant. The constant consists of a first constant multiplied by the maximum spectrum of each frequency band and a second constant multiplied by the spectra other than the maximum spectrum of each frequency band. Therefore, using one type of quantizer, the quantization noise of the maximum spectrum in each frequency band can be reduced at the time of quantization, and the spectrum levels in each frequency band after quantization, excluding the maximum spectrum, can be made lower than the conventional quantized values.
[0065]
Further, according to the present invention, the amount of calculation of the digital data encoding device can be effectively reduced by performing the spectrum correction only when the number of quantization bits is small.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an electrical configuration of a digital data encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an electrical configuration of a scale factor generation unit and a spectrum data correction unit of the digital data encoding device shown in FIG. 1;
FIG. 3 is a diagram illustrating an example of an MDCT coefficient.
FIG. 4 is a flowchart showing a constant calculation operation of a constant calculation unit shown in FIG.
[Explanation of symbols]
1 Digital data encoding device
2 Frequency band divider
3 Time frequency converter
4 Power calculation unit for each band
5 Masking calculator
6 Quantization bit calculation unit
7 Quantization part
8 Packing part
9 Scale factor generator
10 Spectral data correction unit
11 Unpacking part
12 Inverse quantization part
13 Frequency time converter
14 Frequency band synthesizer
15 Decoding device
20 Recording medium
21 Scale factor calculator
22 Scale factor subtraction unit
23 Temporary normalization part
24 Constant calculator
25 Comparison threshold calculation unit
26 Spectral data weighting unit

Claims

A digital data encoding device in which a spectrum obtained by converting time-domain digital data into the frequency domain is divided into a plurality of frequency bands, a scale factor related to the maximum spectrum is generated for each frequency band, the number of quantization bits is calculated for each frequency band using psychoacoustic characteristics, and a quantization coefficient obtained by quantizing the spectrum using the scale factor and the number of quantization bits is calculated and encoded,
the digital data encoding device being characterized by comprising a spectrum data correction unit that corrects the spectrum, before the quantization, by multiplying the spectrum by a constant calculated based on the scale factor and the number of quantization bits.

A digital data encoding device including: a time-frequency conversion unit that converts time-domain digital data into the frequency domain; a scale factor generation unit that divides the spectrum converted by the time-frequency conversion unit into a plurality of frequency bands and generates a scale factor related to the maximum spectrum for each frequency band; a quantization bit number calculation unit that calculates the number of quantization bits for each frequency band using psychoacoustic characteristics; and a quantization unit that calculates and encodes a quantization coefficient obtained by quantizing the spectrum using the scale factor and the number of quantization bits,
the digital data encoding device being characterized by comprising a spectrum data correction unit that corrects the spectrum, before it is quantized by the quantization unit, by multiplying the spectrum by a constant calculated based on the scale factor and the number of quantization bits.

The digital data encoding device according to claim 2, wherein the spectrum data correction unit includes: a temporary normalization unit that calculates a temporary normalization coefficient obtained by normalizing the spectrum with the scale factor; a comparison threshold calculation unit that calculates a comparison threshold according to the number of quantization bits; a constant calculation unit that calculates a constant based on the temporary normalization coefficient and the comparison threshold;
and a spectrum data weighting unit that multiplies the spectrum by the constant to weight the spectrum.

The digital data encoding device according to claim 3, wherein the spectrum amplitude obtained by inversely quantizing the quantization coefficient that results when the quantization unit quantizes the corrected spectrum, that is, the spectrum converted by the time-frequency conversion unit multiplied by the constant calculated by the constant calculation unit, does not exceed the spectrum amplitude obtained by inversely quantizing the quantization coefficient that results when the quantization unit quantizes the uncorrected spectrum that is not multiplied by the constant.

The digital data encoding device according to claim 3, wherein the constant calculated by the constant calculation unit consists of a first constant that is multiplied by the maximum spectrum of each frequency band and a second constant that is multiplied by the spectra other than the maximum spectrum of each frequency band.

The digital data encoding device, wherein the scale factor generation unit includes a scale factor subtraction unit that lowers the scale factor according to the constant calculated by the constant calculation unit and supplies the lowered scale factor to the quantization unit.

The digital data encoding device according to claim 1, wherein, when the number of quantization bits is smaller than a predetermined number of bits, the spectrum data correction unit corrects the spectrum by multiplying it by the constant.

The digital data encoding device according to any one of claims 3 to 6, wherein the constant calculation unit calculates the constant when the number of quantization bits is smaller than a predetermined number of bits, and the spectrum data weighting unit multiplies the spectrum by the constant to weight the spectrum.