JP3942882B2

JP3942882B2 - Digital signal encoding apparatus and digital signal recording apparatus having the same

Info

Publication number: JP3942882B2
Application number: JP2001376308A
Authority: JP
Inventors: 修藤井
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2001-12-10
Filing date: 2001-12-10
Publication date: 2007-07-11
Anticipated expiration: 2021-12-10
Also published as: JP2003177797A

Description

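[0001]
[Technical Field of the Invention]
The present invention relates to a digital signal encoding apparatus that, when recording a digital signal such as music or speech on a recording medium such as a MiniDisc, compresses the amount of data by allocating bits to the spectrum of each frequency band adaptively to the material being recorded.
[0002]
[Prior Art]
A conventional method for high-efficiency compression coding of digital signals such as music and speech is ATRAC (Adaptive Transform Acoustic Coding), which is used in the MiniDisc. To achieve high-efficiency compression, ATRAC divides the digital signal into a plurality of frequency bands (subbands), blocks each band into coding units of variable time length, applies MDCT (Modified Discrete Cosine Transform) processing to convert the blocks into spectral signals, and then encodes each spectral signal with a number of bits allocated by exploiting psychoacoustic characteristics.
[0003]
Psychoacoustic characteristics applicable to this compression coding include the equal-loudness characteristic and the masking effect. The equal-loudness characteristic expresses the fact that, even for sounds of the same sound pressure level, the loudness perceived by a human varies with frequency. Accordingly, the equal-loudness characteristic also indicates that the minimum audible limit, that is, the smallest sound level a human can perceive, varies with frequency.
[0004]
The masking effect, on the other hand, comprises simultaneous masking and temporal masking. Simultaneous masking is the phenomenon in which, when sounds of several frequency components occur at the same time, one sound makes another hard to hear. Temporal masking is the phenomenon in which sounds immediately before and after a loud sound on the time axis are masked.
[0005]
The bit allocation method must also use an algorithm that exploits these psychoacoustic characteristics while balancing the required sound quality against the available hardware capability.
[0006]
For example, a bit allocation method known as the iterative method allocates bits adaptively to the input digital signal as follows. First, the power S of each frequency band is obtained, and the masking threshold M that this power S imposes on the other frequency bands is obtained. Next, from this masking threshold M and the quantization noise power N(n) obtained when each frequency band is quantized with n bits, the masking-threshold-to-noise ratio MNR(n) = M/N(n) is obtained. A bit is then allocated to the frequency band whose MNR(n) is smallest, MNR(n) is updated, and a bit is again allocated to the band with the smallest value.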
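[0007]
[Problems to Be Solved by the Invention]
If a signal that changes greatly over time is input momentarily while a signal that changes little over time is being input, the quantization error at the same frequency varies between adjacent frames, and this variation may be perceived as an abnormal sound. It is perceived as an abnormal sound particularly when the quantization error varies at a peak frequency, which is itself not subject to masking.
[0008]
Signals of such different types require bits to be distributed according to their energy distribution; if this is not done properly, the abnormal sound described above occurs.
[0009]
Furthermore, because the iterative method described above allocates bits within a single frame (the unit time of compression processing), it can calculate the optimum number of quantization bits within that frame but cannot properly reflect signal changes in the preceding and following frames in the bit allocation. In particular, when compression is performed at a fixed bit rate, a difference in signal energy components between adjacent frames causes fluctuation (variation) of the quantization error at the same frequency.
[0010]
The present invention has been made in view of the above circumstances, and its object is to provide a digital signal encoding apparatus that reduces perceptible degradation of sound quality when encoding a signal that changes greatly over time and is input momentarily while a signal that changes little over time is being input.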
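[0011]
[Means for Solving the Problems]
To solve the above problems, the digital signal encoding apparatus of the present invention is a digital signal encoding apparatus that converts a digital signal into spectral data for each of a plurality of predetermined frequency bands and encodes the spectral data of each frequency band with a bit allocation amount given to that band, the apparatus comprising: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0012]
In this configuration, when the bit allocation amount of a given frame is calculated by the bit allocation amount calculation means, the quantization error of that bit allocation amount is calculated by the first quantization error calculation means. The quantization error of the bit allocation amount of the frame that follows is calculated in the same way. Taking these two consecutive frames as the previous frame and the current frame, respectively, the bit allocation amount correction means corrects the bit allocation amount of the current frame on the basis of that of the previous frame. The final bit allocation amount is thereby obtained, and its quantization error is calculated by the second quantization error calculation means.
[0013]
During correction by the bit allocation amount correction means, the correction is made so that the difference between the quantization error of the bit allocation amount of the current frame and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than the predetermined value. As a result, even when encoding a signal that changes greatly over time and is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed.
[0014]
The above digital signal encoding apparatus preferably comprises maximum value extraction means for extracting the maximum value of the power, energy, or scale factor of the spectral data, with the bit allocation amount correction means correcting the difference in the frequency band to which the extracted maximum value belongs. In such a configuration, once the maximum value of the spectral data has been extracted by the maximum value extraction means, the bit allocation amount correction means performs the above correction of the bit allocation amount for that maximum value. Variation of the quantization error at the peak frequency is thereby suppressed.
[0015]
Here, the frequency of the frequency band to which the maximum value of the power, energy, or scale factor of the spectral data belongs is called the peak frequency. At signal levels at or above the minimum audible limit, the peak frequency is not masked and is therefore audible, so it is the frequency at which fluctuation (variation) of the quantization error is most readily perceived as an abnormal sound. Suppressing the variation of the quantization error at the peak frequency as described above therefore suppresses variation of the quantization error at the same frequency, compared with conventional bit allocation, in any of the following: a bit allocation method using the masking-threshold-to-noise ratio, a bit allocation method using the signal-to-noise ratio, and a bit allocation method using both ratios.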
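[0016]
To solve the above problems, another digital signal encoding apparatus of the present invention is a digital signal encoding apparatus that converts a digital signal into spectral data for each of a plurality of predetermined frequency bands, obtains from the magnitude of the spectrum of each frequency band the masking-threshold-to-noise ratio of each frequency band for each assumed number of bits, and encodes the spectral data with bit allocation amounts given, for each number of bits, in order starting from the frequency band whose masking-threshold-to-noise ratio is smallest, the apparatus comprising: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; non-masking frequency band extraction means for extracting the quantization error for non-masking frequency bands; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that, for the quantization error of the non-masking frequency bands, the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0017]
In this configuration, when the bit allocation amount of a given frame is calculated by the bit allocation amount calculation means, the quantization error of that bit allocation amount is calculated by the first quantization error calculation means. The non-masking frequency band extraction means then uses psychoacoustics to extract that quantization error for the non-masking frequency bands. The quantization error for the non-masking frequency bands of the bit allocation amount of the frame that follows is calculated in the same way. Taking these two consecutive frames as the previous frame and the current frame, respectively, the bit allocation amount correction means corrects the bit allocation amount of the current frame on the basis of that of the previous frame. The final bit allocation amount is thereby obtained, and its quantization error is calculated by the second quantization error calculation means.
[0018]
During correction by the bit allocation amount correction means, the correction is made so that the difference between the quantization error for the non-masking frequency bands of the bit allocation amount of the current frame and the quantization error for the non-masking frequency bands of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than the predetermined value. As a result, even when encoding a signal that changes greatly over time and is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed.
[0019]
The digital signal recording apparatus of the present invention is a digital signal recording apparatus that encodes an input digital signal by a predetermined encoding process and records it on a recording medium, and is characterized by including any of the above digital signal encoding apparatuses to perform that encoding process. In this configuration, each of the above digital signal encoding apparatuses suppresses variation of the quantization error at the same frequency between adjacent frames, so that even if a signal that changes greatly over time is input while a signal that changes little over time is being recorded, a signal with little of the above degradation of sound quality caused by quantization error can be recorded.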
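[0020]
[Embodiments of the Invention]
One embodiment of the present invention is described below with reference to FIGS. 1 to 11.
[0021]
First, the MiniDisc apparatus according to this embodiment is described.
[0022]
As shown in FIG. 2, in this MiniDisc apparatus, which serves as a digital signal recording apparatus, a digital audio signal (the digital signal) is input serially from input terminal 1, for example as an optical signal. The optical signal is converted into an electrical signal by photoelectric element 2 and then input to digital PLL (Phase-Locked Loop) circuit 3.
[0023]
Digital PLL circuit 3 extracts a clock from the input digital audio signal and reproduces multi-bit data corresponding to the sampling frequency and the number of quantization bits. This multi-bit data is digital data sampled at the sampling rate of its signal source (44.1 kHz for compact disc, 48 kHz for digital audio tape recorders, and 32 kHz for satellite broadcasting (A mode)). Frequency conversion circuit 4 therefore converts the sampling rate of the multi-bit data output from digital PLL circuit 3 to 44.1 kHz, the rate specified by the MiniDisc standard.
[0024]
Audio compression circuit 5 compression-encodes the input digital audio data using the ATRAC method described above. The encoded digital audio data is sent to signal processing circuit 7 via shock-proof memory controller 6. Shock-proof memory 8, controlled by shock-proof memory controller 6, absorbs the difference between the transfer rate of the digital audio data output from audio compression circuit 5 and the transfer rate of the digital audio data input to signal processing circuit 7. It is also provided to protect the digital audio data by bridging interruptions of the playback signal caused by disturbances such as vibration during playback.
[0025]
Signal processing circuit 7 functions as both an encoder and a decoder. As an encoder, it encodes the input digital audio data into a serial magnetic-field modulation signal and supplies it to head drive circuit 9. As a decoder, it decodes the serial signal from RF amplifier 13 (described later) into digital audio data and supplies it to shock-proof memory controller 6.
During recording, head drive circuit 9 moves recording head 10 to a predetermined recording position on MiniDisc 11 and generates a magnetic field corresponding to the magnetic-field modulation signal. In this state, the predetermined recording position on MiniDisc 11 is irradiated with laser light from optical pickup 12. A magnetization pattern corresponding to the magnetic field is thereby formed on MiniDisc 11.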
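[0026]
Optical pickup 12 reads from MiniDisc 11 a serial signal corresponding to the magnetization pattern. This serial signal is amplified by high-frequency amplifier (hereinafter, RF amplifier) 13 and then decoded into digital audio data by signal processing circuit 7. After the effects of disturbances are removed by shock-proof memory controller 6 and shock-proof memory 8, the digital audio data is sent to audio decompression circuit 14.
[0027]
Audio decompression circuit 14 performs the inverse of the ATRAC compression coding (decompression decoding) and restores full-bit digital audio data. The restored digital audio data is converted into an analog audio signal by digital-to-analog conversion circuit (hereinafter, D/A conversion circuit) 15 and output externally from output terminal 16.
[0028]
The serial signal amplified by RF amplifier 13 is also input to servo circuit 17. Servo circuit 17 sends a control signal to driver circuit 18 according to the reproduced serial signal and feedback-controls the rotational speed of spindle motor 19 via driver circuit 18. This feedback control allows MiniDisc 11 to be rotated at a constant linear velocity.
[0029]
Servo circuit 17 also feedback-controls the rotational speed of feed motor 20 via driver circuit 18. This feedback control makes it possible to control the displacement of optical pickup 12 in the radial direction of MiniDisc 11, that is, to perform tracking control. Servo circuit 17 further performs focusing control of optical pickup 12 via driver circuit 18.
[0030]
Power is supplied from a power supply circuit (not shown) to signal processing circuit 7, optical pickup 12, RF amplifier 13, servo circuit 17, driver circuit 18, and so on. This power supply operation and the signal processing operations described below are all centrally managed by system control microcomputer 21. Input device 22, used for entering track titles, selecting tracks, adjusting sound quality, and the like, is connected to system control microcomputer 21.
[0031]
Next, the digital data encoding process in audio compression circuit 5, which serves as the digital signal encoding apparatus of this embodiment, is described. Before that, the ATRAC encoding and decoding process used with MiniDisc 11 and the like is described.
[0032]
As shown in FIG. 3, audio compression circuit 5 has spectrum conversion unit 51 and bit allocation processing unit 52.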
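[0033]
During encoding, spectrum conversion unit 51 divides the audio signal (multi-bit data) sampled at 44.1 kHz into a plurality of frequency bands (subband frames) using a QMF (Quadrature Mirror Filter), which is a band-splitting filter. Spectrum conversion unit 51 also performs the MDCT processing described above on each subband frame to generate the MDCT coefficients (spectral data) of the frequency components of each band. This MDCT processing is expressed by equation (1).
[0034]
$$X_m(k) = \sum_{i} x_m(i)\,h(i)\cos\!\left(\frac{\pi}{M}\left(k+\frac{1}{2}\right)\left(i+\frac{M}{2}+\frac{1}{2}\right)\right) \qquad (1)$$
In equation (1), k = 0, 1, ..., M-1, and
m: block number,
x_m(i): input signal,
h(i): window function for the forward transform,
X_m(k): transformed data.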
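As a minimal sketch of equation (1), the following Python evaluates the sum directly. The text does not state the summation range, so the block length of 2M samples per M coefficients is an assumption taken from the usual MDCT definition, and the helper names `mdct` and `sine_window` are hypothetical. The sine window is only one common choice for h(i); the example itself uses h(i) = 1 so that a single basis function maps onto a single coefficient.

```python
import math


def mdct(block, window):
    """Evaluate equation (1) directly for one block (O(M^2), for illustration)."""
    m = len(block) // 2  # assumed: 2M input samples give M coefficients
    return [
        sum(
            block[i] * window[i]
            * math.cos(math.pi / m * (k + 0.5) * (i + m / 2 + 0.5))
            for i in range(2 * m)
        )
        for k in range(m)
    ]


def sine_window(length):
    """Sine window, one common choice for h(i) (an assumption, not from the text)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


# A block holding the k = 3 basis function excites only coefficient 3
# when the window is rectangular (h(i) = 1).
M = 8
basis = [math.cos(math.pi / M * (3 + 0.5) * (i + M / 2 + 0.5)) for i in range(2 * M)]
coeffs = mdct(basis, [1.0] * (2 * M))
```

A production encoder would use a fast algorithm based on the FFT; the direct sum is shown only because it mirrors the equation term by term.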
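[0035]
Bit allocation processing unit 52 converts the MDCT coefficients into the spectral power Si of each of I frequency bands (i = 1, 2, ..., I; for example, I = 25) and performs bit allocation on each spectral power as described below. Critical bands (unit: Bark) or the like are used as the frequency bands for the spectral power Si. A critical band is a characteristic portion of the wideband audio spectrum within which particular psychoacoustic regularities, such as frequency selectivity and the masking threshold, hold.
[0036]
Bit allocation processing unit 52 is described in detail below.
[0037]
As shown in FIG. 1, bit allocation processing unit 52 comprises power calculation unit 52a, SNR calculation unit 52b, primary quantization bit number calculation unit 52c, quantization noise calculation unit 52d, secondary quantization bit number calculation unit 52e, and quantization noise storage unit 52f.
[0038]
Power calculation unit 52a is provided for each band. It divides the MDCT coefficients obtained by the MDCT processing into frequency bands such as critical bands and calculates the spectral power Si of each band from the sum of squares of the MDCT coefficients belonging to that band. Here, power means energy per unit time.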
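The sum-of-squares step of power calculation unit 52a can be sketched as follows. The function name `band_powers` and the representation of the band layout as a list of coefficient index boundaries are assumptions made for illustration; the text does not specify how band membership is stored.

```python
def band_powers(mdct_coeffs, band_edges):
    """Spectral power S_i per band: sum of squared MDCT coefficients.

    band_edges holds coefficient indices; band b covers
    mdct_coeffs[band_edges[b]:band_edges[b + 1]] (hypothetical layout).
    """
    return [
        sum(c * c for c in mdct_coeffs[lo:hi])
        for lo, hi in zip(band_edges, band_edges[1:])
    ]


# Three illustrative bands over five coefficients.
powers = band_powers([1.0, -2.0, 3.0, 4.0, 0.5], [0, 2, 4, 5])
```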
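[0039]
SNR calculation unit 52b calculates the signal-to-noise ratio SNRi(n) = Si/Ni(n) from the spectral power Si and the quantization noise power Ni(n) obtained when Si is quantized with n bits. Because SNRi(n) is statistically a constant determined by the characteristics of the signal, it may be obtained in advance by statistical processing.
[0040]
Primary quantization bit number calculation unit 52c, which serves as the bit allocation amount calculation means, calculates the number of quantization bits by the iterative method described above on the basis of the desired bit rate and SNRi(n). Here, the number of quantization bits is calculated with the masking threshold M of the iterative method replaced by the signal S.
[0041]
Quantization noise calculation unit 52d, which serves as the first quantization error calculation means, determines the quantization noise power Ni(n) of the current frame from the n obtained by the above processing.
[0042]
Secondary quantization bit number calculation unit 52e, which serves as the bit allocation amount correction means, obtains the absolute value of the difference between the quantization noise power Ni(n) of the previous frame stored in quantization noise storage unit 52f and the quantization noise power Ni(n) of the current frame calculated by quantization noise calculation unit 52d. It then determines the frequency bands i to be corrected so that this absolute value becomes smaller than a predetermined value and, for those bands, corrects the number of quantization bits calculated by primary quantization bit number calculation unit 52c.
[0043]
Quantization noise storage unit 52f, which serves as the second quantization error calculation means, calculates the quantization noise power Ni(n) from the final number of quantization bits n of each frequency band calculated by secondary quantization bit number calculation unit 52e and stores it as the value for the previous frame. Quantization noise storage unit 52f supplies the stored previous-frame quantization noise power Ni(n) to secondary quantization bit number calculation unit 52e so that the latter can obtain the above difference.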
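[0044]
In bit allocation processing unit 52 configured as above, allocation proceeds as follows.
[0045]
First, as shown in FIG. 4, at time t1, that is, for the initial frame, secondary quantization bit number calculation unit 52e performs no bit number calculation, and the n from primary quantization bit number calculation unit 52c becomes the final number of quantization bits. Quantization noise storage unit 52f then treats the frame at time t1 as the previous frame, calculates the quantization noise power Nit1(n) of that frame from the final number of quantization bits n of each frequency band, and stores it.
[0046]
At time t2, that is, in processing the frame following time t1, the same processing as for the initial frame is performed by power calculation unit 52a, SNR calculation unit 52b, primary quantization bit number calculation unit 52c, and quantization noise calculation unit 52d, and the quantization noise power Nit2'(n) is calculated. Secondary quantization bit number calculation unit 52e first obtains the difference between the quantization noise power Nit1(n) at time t1 and the quantization noise power Nit2'(n) at time t2. In FIG. 4, the total power over all bands at time t1 (= Σsit1) and that at time t2 (= Σsit2') satisfy Σsit1 < Σsit2'. At a fixed bit rate, therefore, Nit1(n) < Nit2'(n) generally holds in each frequency band.
[0047]
Next, secondary quantization bit number calculation unit 52e, referring for example to the frequency band and power of Si, corrects the difference |Nit2'(n) - Nit1(n)| for the values of i from 0 to 25 so that |Nit2'(n) - Nit1(n)| < 12 dB (the predetermined value). In the example of FIG. 5, for the frame at time t2, the bit allocation of the four subband frames SB1 to SB4 is increased in the low range and reduced in the high range. In this correction, it is more preferable to weight the bit allocation of the bands being corrected according to psychoacoustic characteristics and signal power.
[0048]
As described above, when bit allocation processing unit 52 corrects, in secondary quantization bit number calculation unit 52e, the bit allocation amount (number of quantization bits) calculated by primary quantization bit number calculation unit 52c, it does so in such a way that the difference between the previous-frame quantization noise power (quantization error) calculated and stored by quantization noise storage unit 52f and the current-frame quantization noise power (quantization error) calculated by quantization noise calculation unit 52d becomes smaller than the predetermined value. As a result, even when a signal that changes greatly over time is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed.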
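The second-pass correction can be illustrated with a small sketch. Everything numerical here other than the 12 dB limit is an assumption: the noise model (noise level equals band power minus about 6.02 dB per bit), the 16-bit ceiling, and the names `noise_db` and `correct_bits` are hypothetical. The sketch also ignores the fixed-rate bit budget, which a real encoder would restore by moving bits between bands as in the FIG. 5 example.

```python
DB_PER_BIT = 6.02  # assumed noise reduction per added quantization bit


def noise_db(power_db, bits):
    """Assumed model of the quantization noise level N_i(n) in dB."""
    return power_db - DB_PER_BIT * bits


def correct_bits(power_db, first_pass_bits, prev_noise_db,
                 limit_db=12.0, max_bits=16):
    """Adjust each band's bit count until the frame-to-frame noise
    difference |N_t2'(n) - N_t1(n)| is below limit_db."""
    corrected = []
    for s, n, prev in zip(power_db, first_pass_bits, prev_noise_db):
        # Noise rose too far above the previous frame: add bits.
        while noise_db(s, n) - prev >= limit_db and n < max_bits:
            n += 1
        # Noise fell too far below the previous frame: remove bits.
        while prev - noise_db(s, n) >= limit_db and n > 0:
            n -= 1
        corrected.append(n)
    return corrected


# Previous-frame noise of 20 dB in every band; the first band got louder.
bits_t2 = correct_bits([60.0, 40.0, 30.0], [4, 4, 6], [20.0, 20.0, 20.0])
final_noise = [noise_db(s, n) for s, n in zip([60.0, 40.0, 30.0], bits_t2)]
```

Because one bit changes the modelled noise by about 6 dB while the allowed window is 24 dB wide, the two loops cannot undo each other; they stop only at the assumed limits of 0 and `max_bits`.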
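[0049]
Next, another bit allocation processing unit 52 is described.
[0050]
As shown in FIG. 6, this bit allocation processing unit 52 comprises power calculation unit 52a, quantization noise calculation unit 52d, secondary quantization bit number calculation unit 52e, and quantization noise storage unit 52f of the bit allocation processing unit 52 shown in FIG. 1, and additionally comprises masking calculation unit 52g, minimum audible limit synthesis unit 52h, SMR calculation unit 52i, MNR calculation unit 52j, primary quantization bit number calculation unit 52k, and non-masking region extraction unit 52m.
[0051]
Masking calculation unit 52g calculates the masking threshold from the spectral power Si by known means. For example, using psychoacoustic model 1 of MPEG-1 gives the following expressions.
[0052]
$$V_f = \begin{cases}
17\,(dz+1) - (0.4\,X[z(i)] + 6) & -3 \le dz < -1 \\
(0.4\,X[z(i)] + 6)\,dz & -1 \le dz < 0 \\
-17\,dz & 0 \le dz < 1 \\
-(dz-1)\,(17 - 0.15\,X[z(i)]) - 17 & 1 \le dz < 8 \\
-\infty & dz < -3,\ dz \ge 8
\end{cases} \quad [\mathrm{dB}]$$
Here dz = z[j] - z[i] (in Bark) and X[z(i)] = 10 log10 Si, where Bark is the unit of the critical-band scale.
[0053]
Vf from the above expressions is calculated for each i (critical band index), and for overlapping frequencies the largest Vf is selected; this gives the masking threshold. Several other known methods exist for calculating the masking threshold, so the calculation is not limited to the above.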
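The piecewise masking function and the "take the largest contribution" rule might be sketched as below. The second branch is written with the factor dz, as in MPEG-1 psychoacoustic model 1, which also makes the function continuous at dz = -1 and dz = 0. Adding the masker level X to Vf to obtain a threshold in dB, and omitting the masking-index offset of the MPEG-1 model, are simplifying assumptions; `vf` and `masking_threshold_db` are hypothetical names.

```python
def vf(dz, x_db):
    """Masking function Vf in dB for a Bark distance dz from a masker of level x_db."""
    if -3 <= dz < -1:
        return 17 * (dz + 1) - (0.4 * x_db + 6)
    if -1 <= dz < 0:
        return (0.4 * x_db + 6) * dz
    if 0 <= dz < 1:
        return -17 * dz
    if 1 <= dz < 8:
        return -(dz - 1) * (17 - 0.15 * x_db) - 17
    return float("-inf")  # outside the range in which the masker has any effect


def masking_threshold_db(band_bark, band_power_db):
    """For each band j, keep the largest contribution over all maskers i.

    Contribution of masker i at band j is taken as X_i + Vf(z_j - z_i, X_i)
    (an assumption; the masking-index term is omitted).
    """
    return [
        max(x + vf(zj - zi, x) for zi, x in zip(band_bark, band_power_db))
        for zj in band_bark
    ]


# One strong masker at 1 Bark and two quiet neighbours above it.
thresholds = masking_threshold_db([1.0, 2.0, 3.0], [60.0, 0.0, 0.0])
```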
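[0054]
Minimum audible limit synthesis unit 52h combines a minimum audible limit characteristic, expressed for example by the following equation, with the masking threshold obtained by masking calculation unit 52g to determine the final masking threshold Mi for each frequency band, as shown in FIG. 7. The minimum audible limit characteristic may be stored in advance in a table ROM.
[0055]
$$lt(f) = -0.6 \times 3.64 \times \left(\frac{f}{1000}\right)^{-0.8} + 6.5 \times \exp\!\left(-0.6\left(\frac{f}{1000} - 3.3\right)^{2}\right) - 10^{-3} \times \left(\frac{f}{1000}\right)^{4} \qquad (2)$$
Here f is the frequency (Hz).
SMR calculation unit 52i, taking i as the index of each frequency band, calculates over all frequency bands the ratio SMRi = Si/Mi of the spectral power Si obtained by power calculation unit 52a to the masking threshold Mi of each frequency band obtained by minimum audible limit synthesis unit 52h.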
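A sketch of the synthesis step follows. Note two assumptions. First, the threshold-in-quiet function below is the widely cited Terhardt approximation, whose signs and leading coefficient differ from equation (2) as printed in this text, so it should be read as a stand-in rather than as the patent's formula. Second, "combining" the two curves is modelled as taking the larger value in dB per band, which the text does not specify. The names `threshold_in_quiet_db` and `final_threshold_db` are hypothetical.

```python
import math


def threshold_in_quiet_db(f_hz):
    """Commonly cited approximation of the minimum audible limit in dB SPL.

    Stand-in for equation (2); not a transcription of the printed formula.
    """
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def final_threshold_db(masking_db, quiet_db):
    """Final threshold M_i per band: the larger of the two curves (assumed rule)."""
    return [max(m, q) for m, q in zip(masking_db, quiet_db)]


centres_hz = [100.0, 1000.0, 3300.0]
quiet = [threshold_in_quiet_db(f) for f in centres_hz]
m_i = final_threshold_db([10.0, 10.0, 10.0], quiet)
```

With these illustrative numbers the quiet threshold dominates only in the lowest band, where hearing is least sensitive.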
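[0056]
MNR calculation unit 52j calculates the ratio SNRi(n) = Si/Ni(n) of the spectral power Si of each frequency band to the quantization noise power Ni(n) obtained when that Si is quantized with n bits, and from the ratio of SNRi(n) to SMRi obtains the ratio of masking threshold to quantization noise power, MNRi(n) = SNRi(n)/SMRi. Because SNR(n) is statistically determined by the characteristics of the signal, it may be obtained beforehand by statistical processing.
[0057]
Primary quantization bit number calculation unit 52k allocates the number of quantization bits of each frequency band on the basis of MNRi(n) obtained by MNR calculation unit 52j, as follows. The number of bits n is increased from 0; each time, MNRi(n) of every frequency band is calculated, and bits are allocated in order starting from the band whose MNRi(n) is smallest. Each time the number of quantization bits n is updated, a bit is likewise allocated to the band whose MNRi(n) is smallest, and allocation continues until the predetermined number of allocatable bits for the bit rate is reached. In other words, bits are allocated in order starting from the frequency band in which the spectral power Si exceeds the threshold Mi by the largest margin.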
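The greedy loop described here can be sketched in the dB domain, where MNRi(n) = SNRi(n) - SMRi. The SNR model of about 6.02 dB per bit, the per-band ceiling, and the simplification that one step costs one bit regardless of how many spectral lines the band contains are all assumptions; `allocate_bits` is a hypothetical name.

```python
def allocate_bits(smr_db, total_bits, max_bits=16, db_per_bit=6.02):
    """Give one bit at a time to the band with the smallest MNR_i(n)."""
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        open_bands = [i for i in range(len(bits)) if bits[i] < max_bits]
        if not open_bands:
            break  # every band is already at the assumed ceiling
        # MNR_i(n) in dB = SNR_i(n) - SMR_i, with SNR_i(n) assumed ~ 6.02 * n.
        worst = min(open_bands, key=lambda i: db_per_bit * bits[i] - smr_db[i])
        bits[worst] += 1
    return bits


# The band that exceeds its threshold by the largest margin is served first.
allocation = allocate_bits([20.0, 5.0, -10.0], total_bits=4)
```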
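[0058]
Non-masking region extraction unit 52m, which serves as the non-masking frequency band extraction means, extracts the non-masking regions (non-masking frequency bands) psychoacoustically on the basis of the ratio SMRi. Specifically, a frequency band whose SMRi exceeds 1 is a non-masking frequency band, and one whose SMRi is 1 or less is a masking frequency band; the unit therefore tests SMRi > 1 for each frequency band to find the non-masking frequency bands.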
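As a brief sketch of this test, using linear (not dB) power values and a hypothetical function name, with band indices counted from 0 rather than from 1 as in the text:

```python
def non_masking_bands(power, threshold):
    """Indices of bands with SMR_i = S_i / M_i greater than 1."""
    return [i for i, (s, m) in enumerate(zip(power, threshold)) if s / m > 1]


# Band 0 exceeds its threshold; band 2 sits exactly on it and counts as masked.
audible = non_masking_bands([4.0, 1.0, 9.0], [2.0, 2.0, 9.0])
```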
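[0059]
Here, secondary quantization bit number calculation unit 52e applies the correction only to the non-masking frequency bands: for i = 0, ..., 25, any n for which |Nit2'(n) - Nit1(n)| > 12 dB is corrected so that |Nit2'(n) - Nit1(n)| < 12 dB.
[0060]
The quantization bits removed or added by the correction are balanced within the masking frequency bands SiM (hatched area) shown in FIG. 8.
[0061]
Thus, like the bit allocation processing unit 52 of FIG. 1, this bit allocation processing unit 52 corrects, in secondary quantization bit number calculation unit 52e, the bit allocation amount (number of quantization bits) calculated by primary quantization bit number calculation unit 52k, but applies the correction only to the non-masking frequency bands extracted by non-masking region extraction unit 52m. Consequently, for sources such as music and speech, which contain many non-masking frequency band components and for which the use of psychoacoustic characteristics is therefore preferable, the perceptible degradation of sound quality heard as an abnormal sound caused by variation of the quantization error can be reduced.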
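[0062]
Next, yet another bit allocation processing unit 52 is described.
[0063]
As shown in FIG. 9, this bit allocation processing unit 52, like the bit allocation processing unit 52 shown in FIG. 1, comprises power calculation unit 52a, SNR calculation unit 52b, primary quantization bit number calculation unit 52c, quantization noise calculation unit 52d, secondary quantization bit number calculation unit 52e, and quantization noise storage unit 52f, and further comprises maximum power band extraction unit 52n.
[0064]
Maximum power band extraction unit 52n, which serves as the maximum value extraction means, extracts the maximum spectral power Max(Si) from the spectral powers Si calculated by power calculation unit 52a. Specifically, it extracts Max(Si) by finding the index i of the largest Si among the spectral powers Si (i = 1, 2, ..., I).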
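[0065]
When extracting the maximum energy, described later, maximum power band extraction unit 52n extracts the index i of the largest energy Ei among the energies Ei (i = 1, 2, ..., I). When extracting the maximum scale factor, described later, it extracts the index i of the largest scale factor SFi among the scale factors SFi (i = 1, 2, ..., I). The scale factor represents the scale (magnitude) of the spectral data and is generally calculated by coding the absolute value of the largest spectral component within the frequency unit being quantized.
[0066]
Here, secondary quantization bit number calculation unit 52e applies the correction only to the maximum spectral power Max(Si): if the difference |Nit2'(n) - Nit1(n)| satisfies |Nit2'(n) - Nit1(n)| > 12 dB, the difference is corrected so that |Nit2'(n) - Nit1(n)| < 12 dB. When the maximum energy or the maximum scale factor of the spectral data is extracted instead, the number of quantization bits is corrected in the same way only for that maximum.
[0067]
The quantization bits removed or added by this correction are balanced using the quantization bits of the bands other than the maximum power band SiE (hatched area) shown in FIG. 10.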
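The peak-band variant can be sketched in two steps: find the band with the largest power, then change only that band's bit count and compensate in the other bands so the frame total stays fixed. The text does not say which of the other bands give up or receive bits, so the round-robin order used here is an assumption, and `peak_band` and `rebalance` are hypothetical names. Indices are 0-based.

```python
def peak_band(values):
    """Index of the largest power, energy, or scale factor (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def rebalance(bits, peak, new_peak_bits):
    """Set bits[peak] and keep the frame total constant by taking bits from,
    or returning bits to, the other bands one at a time (assumed order)."""
    out = list(bits)
    delta = new_peak_bits - out[peak]  # bits the other bands must give up
    out[peak] = new_peak_bits
    others = [i for i in range(len(out)) if i != peak]
    while delta != 0:
        changed = False
        for i in others:
            if delta > 0 and out[i] > 0:
                out[i] -= 1
                delta -= 1
                changed = True
            elif delta < 0:
                out[i] += 1
                delta += 1
                changed = True
            if delta == 0:
                break
        if not changed:
            raise ValueError("not enough bits in the other bands")
    return out


first_pass = [4, 6, 3, 2]
peak = peak_band([1.0, 9.0, 2.0, 1.0])
raised = rebalance(first_pass, peak, 8)
lowered = rebalance(first_pass, peak, 5)
```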
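[0068]
Thus, like the bit allocation processing unit 52 of FIG. 1, this bit allocation processing unit 52 corrects, in secondary quantization bit number calculation unit 52e, the bit allocation amount (number of quantization bits) calculated by primary quantization bit number calculation unit 52c, but applies the correction only to the maximum spectral power (peak frequency) extracted by maximum power band extraction unit 52n. Variation of the quantization error at the peak frequency is thereby suppressed. The peak frequency is a collective term for the frequency of the frequency band to which the maximum value of the power, energy, or index (scale factor) of the spectral data belongs.
[0069]
Because the peak frequency is not affected by masking (although it may be affected by the minimum audible limit), it is a psychoacoustically important frequency. That is, at signal levels at or above the minimum audible limit, the peak frequency is not masked and is therefore audible, so it is the frequency at which fluctuation (variation) of the quantization error is most readily perceived as an abnormal sound.
[0070]
Suppressing the variation of the quantization error at the peak frequency therefore makes it possible to suppress variation of the quantization error at the same frequency, compared with conventional bit allocation, in any of the following: a bit allocation method using the masking-threshold-to-noise ratio, a bit allocation method using the signal-to-noise ratio, and a bit allocation method using both ratios.
[0071]
Furthermore, because the MiniDisc apparatus of this embodiment includes audio compression circuit 5 containing the bit allocation processing unit 52 of FIG. 1, FIG. 6, or FIG. 9, it can compression-encode digital audio data with the variation of the quantization error suppressed as described above. Therefore, even if a signal that changes greatly over time is input while a signal that changes little over time is being recorded, a signal with little degradation of sound quality caused by quantization error can be recorded.
[0072]
Although the digital signal encoding apparatus of the present invention is applied to a MiniDisc apparatus in this embodiment, it can of course also be applied to other apparatuses that require similar encoding.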
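[0073]
[Effects of the Invention]
As described above, the digital signal encoding apparatus of the present invention comprises: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0074]
With this configuration, during correction by the bit allocation amount correction means, the correction is made so that the difference between the quantization error of the bit allocation amount of the current frame and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than the predetermined value. Therefore, even when a signal that changes greatly over time is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed. This has the effect of reducing the perceptible degradation of sound quality heard as an abnormal sound caused by that variation of the quantization error.
[0075]
When the above digital signal encoding apparatus comprises maximum value extraction means for extracting the maximum value of the power, energy, or scale factor of the spectral data, and the bit allocation amount correction means corrects the difference in the frequency band to which the extracted maximum value belongs, variation of the quantization error is suppressed at the peak frequency, that is, the frequency of the frequency band to which the maximum value of the power, energy, or scale factor of the spectral data belongs. As a result, variation of the quantization error at the same frequency is suppressed, compared with conventional bit allocation, in any of the following: a bit allocation method using the masking-threshold-to-noise ratio, a bit allocation method using the signal-to-noise ratio, and a bit allocation method using both ratios. This has the effect of reducing perceptible degradation of sound quality caused by changes over time.
[0076]
Another digital signal encoding apparatus of the present invention comprises: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; non-masking frequency band extraction means for extracting the quantization error for non-masking frequency bands; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that, for the quantization error of the non-masking frequency bands, the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0077]
With this configuration, during correction by the bit allocation amount correction means, the correction is made so that the difference between the quantization error for the non-masking frequency bands of the bit allocation amount of the current frame and the quantization error for the non-masking frequency bands of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than the predetermined value. Therefore, even when a signal that changes greatly over time is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed. This has the effect that, for sources such as music and speech for which the use of psychoacoustic characteristics is preferable, the perceptible degradation of sound quality heard as an abnormal sound caused by variation of the quantization error can be reduced.
[0078]
The digital signal recording apparatus of the present invention is a digital signal recording apparatus that encodes an input digital signal by a predetermined encoding process and records it on a recording medium, and includes any of the above digital signal encoding apparatuses to perform that encoding process.
[0079]
Because each of the above digital signal encoding apparatuses suppresses variation of the quantization error at the same frequency between adjacent frames, a signal with little of the above degradation of sound quality caused by quantization error can be recorded even if a signal that changes greatly over time is input while a signal that changes little over time is being recorded. This has the effect of providing a digital signal recording apparatus capable of high-quality recording.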
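[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the configuration of the bit allocation processing unit of the audio compression circuit in a MiniDisc apparatus according to one embodiment of the present invention.
[FIG. 2] A block diagram showing the configuration of the MiniDisc apparatus.
[FIG. 3] A block diagram showing the configuration of the audio compression circuit.
[FIG. 4] A diagram showing the spectral power of each frequency band obtained by the power calculation unit of the bit allocation processing unit.
[FIG. 5] A diagram showing the bit allocation to each frequency band by the bit allocation processing unit.
[FIG. 6] A block diagram showing the configuration of another bit allocation processing unit.
[FIG. 7] A diagram showing the spectral power of each frequency band obtained by the power calculation unit of the bit allocation processing unit of FIG. 6.
[FIG. 8] A diagram showing the bit allocation to each frequency band by the bit allocation processing unit of FIG. 6.
[FIG. 9] A block diagram showing the configuration of yet another bit allocation processing unit.
[FIG. 10] A diagram showing the spectral power of each frequency band obtained by the power calculation unit of the bit allocation processing unit of FIG. 9.
[FIG. 11] A diagram showing the bit allocation to each frequency band by the bit allocation processing unit of FIG. 9.
[Explanation of Reference Signs]
5 Audio compression circuit (digital signal encoding apparatus)
51 Spectrum conversion unit
52 Bit allocation processing unit
52a Power calculation unit
52c Primary quantization bit number calculation unit (bit allocation amount calculation means)
52d Quantization noise calculation unit (first quantization error calculation means)
52e Secondary quantization bit number calculation unit (bit allocation amount correction means)
52f Quantization noise storage unit (second quantization error calculation means)
52m Non-masking region extraction unit (non-masking frequency band extraction means)
52n Maximum power band extraction unit (maximum value extraction means)
[0001]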
BACKGROUND OF THE INVENTION
The present invention relates to a digital signal encoding apparatus that, when recording a digital signal such as music or speech on a recording medium such as a MiniDisc, compresses the amount of data by allocating bits to the spectrum of each frequency band adaptively to the material being recorded.
[0002]
[Prior art]
A conventional method for high-efficiency compression coding of digital signals such as music and speech is ATRAC (Adaptive Transform Acoustic Coding), which is used in the MiniDisc. To achieve high-efficiency compression, ATRAC divides the digital signal into a plurality of frequency bands (subbands), blocks each band into coding units of variable time length, applies MDCT (Modified Discrete Cosine Transform) processing to convert the blocks into spectral signals, and then encodes each spectral signal with a number of bits allocated by exploiting psychoacoustic characteristics.
[0003]
Psychoacoustic characteristics applicable to this compression coding include the equal-loudness characteristic and the masking effect. The equal-loudness characteristic expresses the fact that, even for sounds of the same sound pressure level, the loudness perceived by a human varies with frequency. Accordingly, the equal-loudness characteristic also indicates that the minimum audible limit, that is, the smallest sound level a human can perceive, varies with frequency.
[0004]
The masking effect, on the other hand, comprises simultaneous masking and temporal masking. Simultaneous masking is the phenomenon in which, when sounds of several frequency components occur at the same time, one sound makes another hard to hear. Temporal masking is the phenomenon in which sounds immediately before and after a loud sound on the time axis are masked.
[0005]
In addition, the bit allocation method needs to adopt an algorithm that takes into account the balance between the required sound quality level and the usable hardware capability using the above psychoacoustic characteristics.
[0006]
For example, a bit allocation method known as the iterative method allocates bits adaptively to the input digital signal as follows. First, the power S of each frequency band is obtained, and the masking threshold M that this power S imposes on the other frequency bands is obtained. Next, from this masking threshold M and the quantization noise power N(n) obtained when each frequency band is quantized with n bits, the masking-threshold-to-noise ratio MNR(n) = M/N(n) is obtained. A bit is then allocated to the frequency band whose MNR(n) is smallest, MNR(n) is updated, and a bit is again allocated to the band with the smallest value.
[0007]
[Problems to be solved by the invention]
If a signal that changes greatly over time is input momentarily while a signal that changes little over time is being input, the quantization error at the same frequency varies between adjacent frames, and this variation may be perceived as an abnormal sound. It is perceived as an abnormal sound particularly when the quantization error varies at a peak frequency, which is itself not subject to masking.
[0008]
For different types of signals as described above, it is necessary to allocate bits according to the energy distribution. If this is not performed appropriately, the above-described abnormal noise is generated.
[0009]
Furthermore, because the iterative method described above allocates bits within a single frame (the unit time of compression processing), it can calculate the optimum number of quantization bits within that frame but cannot properly reflect signal changes in the preceding and following frames in the bit allocation. In particular, when compression is performed at a fixed bit rate, a difference in signal energy components between adjacent frames causes fluctuation (variation) of the quantization error at the same frequency.
[0010]
The present invention has been made in view of the above circumstances, and its object is to provide a digital signal encoding apparatus that reduces perceptible degradation of sound quality when encoding a signal that changes greatly over time and is input momentarily while a signal that changes little over time is being input.
[0011]
[Means for Solving the Problems]
To solve the above problems, the digital signal encoding apparatus of the present invention is a digital signal encoding apparatus that converts a digital signal into spectral data for each of a plurality of predetermined frequency bands and encodes the spectral data of each frequency band with a bit allocation amount given to that band, the apparatus comprising: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0012]
In the above configuration, when the bit allocation amount of a certain frame is calculated by the bit allocation amount calculation unit, the quantization error of the bit allocation amount is calculated by the first quantization error calculation unit. Also, the quantization error of the bit allocation amount of the frame following that frame is calculated in the same manner. The following two frames are set as a previous frame and a current frame, respectively, and the bit allocation amount correcting unit corrects the bit allocation amount of the current frame based on the bit allocation amount of the previous frame. As a result, the final bit allocation amount is obtained. Then, the quantization error of this bit allocation amount is calculated by the second quantization error calculation means.
[0013]
During correction by the bit allocation amount correction means, the correction is made so that the difference between the quantization error of the bit allocation amount of the current frame and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than the predetermined value. As a result, even when encoding a signal that changes greatly over time and is input momentarily while a signal that changes little over time is being input, variation of the quantization error at the same frequency between adjacent frames is suppressed.
[0014]
The above digital signal encoding apparatus preferably comprises maximum value extraction means for extracting the maximum value of the power, energy, or scale factor of the spectral data, with the bit allocation amount correction means correcting the difference in the frequency band to which the extracted maximum value belongs. In such a configuration, once the maximum value of the spectral data has been extracted by the maximum value extraction means, the bit allocation amount correction means performs the above correction of the bit allocation amount for that maximum value. Variation of the quantization error at the peak frequency is thereby suppressed.
[0015]
Here, the frequency of the frequency band to which the maximum value of the power, energy, or scale factor of the spectral data belongs is called the peak frequency. At signal levels at or above the minimum audible limit, the peak frequency is not masked and is therefore audible, so it is the frequency at which fluctuation (variation) of the quantization error is most readily perceived as an abnormal sound. Suppressing the variation of the quantization error at the peak frequency as described above therefore suppresses variation of the quantization error at the same frequency, compared with conventional bit allocation, in any of the following: a bit allocation method using the masking-threshold-to-noise ratio, a bit allocation method using the signal-to-noise ratio, and a bit allocation method using both ratios.
[0016]
To solve the above problems, another digital signal encoding apparatus of the present invention is a digital signal encoding apparatus that converts a digital signal into spectral data for each of a plurality of predetermined frequency bands, obtains from the magnitude of the spectrum of each frequency band the masking-threshold-to-noise ratio of each frequency band for each assumed number of bits, and encodes the spectral data with bit allocation amounts given, for each number of bits, in order starting from the frequency band whose masking-threshold-to-noise ratio is smallest, the apparatus comprising: bit allocation amount calculation means for calculating, for each frequency band, the bit allocation amount of each of temporally consecutive frames; first quantization error calculation means for calculating the quantization error of the bit allocation amount calculated by the bit allocation amount calculation means; non-masking frequency band extraction means for extracting the quantization error for non-masking frequency bands; bit allocation amount correction means for correcting the bit allocation amount of the current frame on the basis of the bit allocation amount, calculated by the bit allocation amount calculation means, of the previous frame immediately preceding the current frame; and second quantization error calculation means for calculating the quantization error of the final bit allocation amount obtained by the bit allocation amount correction means, wherein the bit allocation amount correction means performs the correction so that, for the quantization error of the non-masking frequency bands, the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means becomes smaller than a predetermined value.
[0017]
In this configuration, when the bit allocation amount of a given frame is calculated by the bit allocation amount calculation means, the quantization error of that bit allocation amount is calculated by the first quantization error calculation means. The non-masking frequency band extraction means then uses psychoacoustics to extract that quantization error for the non-masking frequency bands. The quantization error for the non-masking frequency bands of the bit allocation amount of the frame that follows is calculated in the same way. Taking these two consecutive frames as the previous frame and the current frame, respectively, the bit allocation amount correction means corrects the bit allocation amount of the current frame on the basis of that of the previous frame. The final bit allocation amount is thereby obtained, and its quantization error is calculated by the second quantization error calculation means.
[0018]
At the time of correction by the bit allocation amount correction means, the quantization error for the non-masking frequency band of the bit allocation amount of the current frame and the non-masking frequency band of the bit allocation amount of the previous frame calculated by the second quantization error calculation means The difference from the quantization error is corrected to be smaller than a predetermined value. As a result, even when a signal having a large temporal change inputted at the time of inputting a signal having a small temporal change is encoded, fluctuations in the quantization error of the same frequency between adjacent frames are suppressed.
[0019]
The digital signal recording apparatus of the present invention is a digital signal recording apparatus that encodes an input digital signal by a predetermined encoding process and records it on a recording medium, and is characterized by including any of the above digital signal encoding apparatuses to perform that encoding process. In this configuration, each of the above digital signal encoding apparatuses suppresses variation of the quantization error at the same frequency between adjacent frames, so that even if a signal that changes greatly over time is input while a signal that changes little over time is being recorded, a signal with little of the above degradation of sound quality caused by quantization error can be recorded.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the present invention will be described with reference to FIGS. 1 to 11 as follows.
[0021]
First, the minidisk device according to the present embodiment will be described.
[0022]
As shown in FIG. 2, in this minidisc device as a digital signal recording device, a digital audio signal as a digital signal input from the input terminal 1 is serially input as an optical signal, for example. This optical signal is converted into an electric signal by the photoelectric element 2 and then input to a digital PLL circuit (Phase-Locked-Loop) 3.
[0023]
The digital PLL circuit 3 extracts a clock from the input digital audio signal and reproduces multi-bit data corresponding to the sampling frequency and the number of quantization bits. This multi-bit data is digital data sampled at a sampling rate (44.1 kHz for a compact disc, 48 kHz for a digital audio tape recorder, 32 kHz for satellite broadcasting (A mode)) corresponding to each signal source. Therefore, the multi-bit data output from the digital PLL circuit 3 is converted by the frequency conversion circuit 4 to a sampling rate of 44.1 kHz corresponding to the mini-disc standard.
[0024]
The audio compression circuit 5 performs compression encoding of the digital audio data input by the above-described ATRAC system. The encoded digital audio data is sent to the signal processing circuit 7 via the shock proof memory controller 6. The shock proof memory 8 controlled by the shock proof memory controller 6 absorbs the difference between the transfer speed of the digital audio data output from the audio compression circuit 5 and the transfer speed of the digital audio data input to the signal processing circuit 7. In addition, it is provided to protect the digital audio data by interpolating the interruption of the reproduction signal due to disturbance such as vibration during reproduction.
[0025]
The signal processing circuit 7 has functions as an encoder and a decoder. The function as an encoder encodes the input digital audio data into a serial magnetic field modulation signal and gives it to the head drive circuit 9. The function as a decoder is to decode a serial signal from an RF amplifier 13 (to be described later) into digital audio data, and give it to the shock proof memory controller 6.
The head drive circuit 9 moves the recording head 10 to a predetermined recording position on the mini disk 11 and generates a magnetic field corresponding to the magnetic field modulation signal during recording. In this state, a predetermined recording position on the mini disk 11 is irradiated with laser light from the optical pickup 12. As a result, a magnetization pattern corresponding to the magnetic field is formed on the mini disk 11.
[0026]
The optical pickup 12 reads a serial signal corresponding to the above magnetization pattern from the mini disk 11. The serial signal is amplified by a high frequency amplifier (hereinafter referred to as an RF amplifier) 13 and then decoded into digital audio data by a signal processing circuit 7. The digital audio data is sent to the audio decompression circuit 14 after the influence of disturbance is removed by the shock proof memory controller 6 and the shock proof memory 8.
[0027]
The audio decompression circuit 14 performs the inverse transform processing (decompression decoding) of the compression encoding by the ATRAC method, and demodulates full-bit digital audio data. The demodulated digital audio data is converted into an analog audio signal by a digital/analog conversion circuit (hereinafter referred to as a D/A conversion circuit) 15 and output from the output terminal 16 to the outside.
[0028]
The serial signal amplified by the RF amplifier 13 is also input to the servo circuit 17. The servo circuit 17 sends a control signal to the driver circuit 18 according to the reproduced serial signal, and feedback-controls the rotational speed of the spindle motor 19 via the driver circuit 18. By such feedback control, the mini disk 11 can be rotated at a constant linear velocity.
[0029]
The servo circuit 17 also feedback-controls the rotational speed of the feed motor 20 via the driver circuit 18. By such feedback control, shift control of the optical pickup 12 with respect to the radial direction of the mini disk 11, that is, tracking control can be performed. Further, the servo circuit 17 also performs focusing control of the optical pickup 12 via the driver circuit 18.
[0030]
The signal processing circuit 7, the optical pickup 12, the RF amplifier 13, the servo circuit 17, the driver circuit 18, and the like are supplied with power from a power supply circuit (not shown) and are centrally managed by the system control microcomputer 21. The system control microcomputer 21 is connected to an input device 22 for performing song name input, music selection operation, sound quality adjustment operation, and the like.
[0031]
Next, the digital data encoding process in the above-described audio compression circuit 5, which is the digital signal encoding apparatus of the present embodiment, will be described. Before that, the encoding/decoding processing by the above-described ATRAC used for the mini disc 11 and the like will first be described.
[0032]
As shown in FIG. 3, the audio compression circuit 5 includes a spectrum conversion unit 51 and a bit allocation processing unit 52.
[0033]
At the time of encoding, the spectrum conversion unit 51 divides an audio signal (multi-bit data) sampled at a sampling frequency of 44.1 kHz into a plurality of frequency bands (subband frames) using a band division filter QMF (Quadrature Mirror Filter). The spectrum conversion unit 51 then performs the above-described MDCT processing in units of the divided subband frames, and generates MDCT coefficients (spectrum data) of the frequency components in each band. The MDCT processing at this time is expressed by the following equation (1).
[0034]
$$X_m(k) = \sum_{i} x_m(i)\, h(i) \cos\left(\frac{\pi}{M}\left(k + \frac{1}{2}\right)\left(i + \frac{M}{2} + \frac{1}{2}\right)\right) \tag{1}$$
In equation (1), $k = 0, 1, \ldots, M-1$, and
$m$: block number,
$x_m(i)$: input signal,
$h(i)$: forward transform window function,
$X_m(k)$: transform data.
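As a rough illustration of equation (1), the following Python sketch evaluates the transform directly for one block. The text gives neither the summation range nor the window h(i), so the block length of 2M samples and the sine window used here are assumptions, and the function names `mdct` and `sine_window` are hypothetical.

```python
import math

def mdct(x, h):
    """Direct evaluation of equation (1) for one block.

    x: input samples x_m(i); h: window h(i), same length as x.
    Assumes the block holds 2*M samples (not stated in the text).
    """
    n = len(x)
    assert n == len(h) and n % 2 == 0
    M = n // 2
    out = []
    for k in range(M):
        acc = 0.0
        for i in range(n):
            acc += x[i] * h[i] * math.cos(
                math.pi / M * (k + 0.5) * (i + M / 2 + 0.5))
        out.append(acc)
    return out

def sine_window(n):
    # A common MDCT window choice; an assumption, not taken from the text.
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]
```

A block of 2M samples yields M coefficients, so `mdct(x, sine_window(len(x)))` returns half as many values as it receives.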
[0035]
The bit allocation processing unit 52 converts the MDCT coefficients into spectrum powers Si (i = 1, 2, ..., I; for example, I = 25), one for each of the frequency bands, and performs the bit allocation process as follows. Critical bands (unit: Bark) are used as the frequency bands for this spectrum power Si. A critical band is a characteristic section of a wideband audio spectrum within which specific psychoacoustic regularities, such as frequency selectivity and the masking threshold, hold.
[0036]
Hereinafter, the bit allocation processing unit 52 will be described in detail.
[0037]
As shown in FIG. 1, the bit allocation processing unit 52 includes a power calculation unit 52a, an SNR calculation unit 52b, a primary quantization bit number calculation unit 52c, a quantization noise calculation unit 52d, a secondary quantization bit number calculation unit 52e, and a quantization noise storage unit 52f.
[0038]
The power calculation unit 52a, which is provided for each band, divides the MDCT coefficients obtained by the above-described MDCT processing into frequency bands such as critical bands, and calculates the above-mentioned spectrum power Si for each band as the sum of squares of the MDCT coefficients belonging to that band. Here, power refers to energy per unit time.
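The sum-of-squares calculation can be sketched as follows. The band layout is passed in as index pairs because the text does not give the band boundaries; `band_powers` and `band_edges` are hypothetical names.

```python
def band_powers(coeffs, band_edges):
    """Spectrum power S_i: sum of squared MDCT coefficients in each band.

    band_edges: list of (start, stop) index pairs into coeffs; the actual
    band boundaries are not given in the text and must be supplied.
    """
    return [sum(c * c for c in coeffs[a:b]) for a, b in band_edges]
```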
[0039]
The SNR calculation unit 52b calculates the signal-to-noise ratio SNRi(n) = Si / Ni(n) from the spectrum power Si and the quantization noise power Ni(n) that results when the spectrum power Si is quantized with n bits. Since this SNRi(n) is statistically a constant that depends on the characteristics of the signal, it may be obtained in advance by statistical processing.
[0040]
The primary quantization bit number calculation unit 52c, as a bit allocation amount calculation unit, calculates the number of quantization bits using the above-described iterative method based on a desired bit rate and the above SNRi(n). Here, the number of quantization bits is calculated with the masking threshold M of the above iterative method replaced by the signal power S.
[0041]
The quantization noise calculation unit 52d as the first quantization error calculation means determines the quantization noise power Ni (n) from n obtained in the above process in the current frame.
[0042]
The secondary quantization bit number calculation unit 52e, as the bit allocation amount correcting unit, obtains the absolute value of the difference between the quantization noise power Ni(n) of the previous frame stored in the quantization noise storage unit 52f and the quantization noise power Ni(n) of the current frame calculated by the quantization noise calculation unit 52d. It then performs a correction for each frequency band index i so that this absolute value becomes smaller than a predetermined value, and, based on the index i, corrects the number of quantization bits calculated by the primary quantization bit number calculation unit 52c.
[0043]
The quantization noise storage unit 52f, as the second quantization error calculation unit, calculates and stores the quantization noise power Ni(n) of the previous frame from the final quantization bit number n of each frequency band calculated by the secondary quantization bit number calculation unit 52e. The quantization noise storage unit 52f supplies the stored quantization noise power Ni(n) of the previous frame to the secondary quantization bit number calculation unit 52e, which uses it to obtain the above-mentioned difference.
[0044]
In the bit allocation processing unit 52 configured as described above, allocation processing is performed as follows.
[0045]
First, as shown in FIG. 4, at time t1, that is, for the initial frame, the secondary quantization bit number calculation unit 52e does not perform its bit number calculation processing, and the bit number n calculated by the primary quantization bit number calculation unit 52c becomes the final number of quantization bits. Next, treating the frame at time t1 as the previous frame, the quantization noise storage unit 52f calculates and stores the quantization noise power Nit1(n) of the frame at time t1 from the final quantization bit number n of each frequency band.
[0046]
In the processing of the next frame at time t2, that is, the frame following time t1, processing similar to that of the initial frame is performed by the power calculation unit 52a, the SNR calculation unit 52b, the primary quantization bit number calculation unit 52c, and the quantization noise calculation unit 52d, and the quantization noise power Nit2′(n) is calculated. In the secondary quantization bit number calculation unit 52e, the difference between the quantization noise power Nit1(n) at time t1 and the quantization noise power Nit2′(n) at time t2 is first obtained. In FIG. 4, the power of the entire band at time t1 (= ΣSit1) and the power of the entire band at time t2 (= ΣSit2′) satisfy ΣSit1 < ΣSit2′. Therefore, at a fixed bit rate, the relationship Nit1(n) < Nit2′(n) generally holds in each frequency band.
[0047]
Next, the secondary quantization bit number calculation unit 52e corrects, for example with reference to the frequency band and power of Si, the difference represented by |Nit2′(n) − Nit1(n)| so that |Nit2′(n) − Nit1(n)| < 12 dB (predetermined value) for each value of i between 0 and 25. In the example shown in FIG. 5, for the frame at time t2, the bit allocation amounts of the four subband frames SB1 to SB4 are corrected such that the low band bit allocation amount is increased and the high band bit allocation amount is reduced. In this correction, it is more preferable to weight the correction of the bit allocation amount of each frequency band according to the psychoacoustic characteristics and the signal power.
[0048]
As described above, when the secondary quantization bit number calculation unit 52e corrects the bit allocation amount (quantization bit number) calculated by the primary quantization bit number calculation unit 52c, the bit allocation processing unit 52 performs the correction so that the difference between the quantization noise power (quantization error) of the previous frame calculated and stored by the quantization noise storage unit 52f and the quantization noise power (quantization error) of the current frame calculated by the quantization noise calculation unit 52d becomes smaller than a predetermined value. Thereby, even when a signal having a large temporal change is input momentarily while a signal having a small temporal change is being input, fluctuations in the quantization error of the same frequency between adjacent frames are suppressed.
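The correction step might be sketched as below. This is a simplified illustration under stated assumptions, not the circuit itself: quantization noise is modeled as falling by 6.02 dB per added bit (an assumed rule of thumb; the text says only that SNRi(n) may be obtained statistically), each band is adjusted independently, and the redistribution needed to hold a fixed total bit budget is omitted. The names `noise_db`, `correct_bits`, and `max_bits` are hypothetical.

```python
import math

DB_PER_BIT = 6.02   # assumed noise reduction per quantization bit
LIMIT_DB = 12.0     # the predetermined value used in the text's example

def noise_db(power, bits):
    # Quantization noise power in dB under the assumed 6.02 dB/bit model.
    return 10.0 * math.log10(power) - DB_PER_BIT * bits

def correct_bits(powers, primary_bits, prev_noise_db, max_bits=16):
    """Adjust each band's bit count until the frame-to-frame change in
    quantization noise power is below LIMIT_DB (or a bit bound is hit)."""
    corrected = []
    for s, n, prev in zip(powers, primary_bits, prev_noise_db):
        while noise_db(s, n) - prev >= LIMIT_DB and n < max_bits:
            n += 1          # noise rose too much: spend more bits
        while prev - noise_db(s, n) >= LIMIT_DB and n > 0:
            n -= 1          # noise fell too much: give bits back
        corrected.append(n)
    return corrected
```

For example, a band whose noise would jump from 0 dB to about 24 dB with one bit is raised to three bits, which brings the change under 12 dB.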
[0049]
Next, another bit allocation processing unit 52 will be described.
[0050]
As shown in FIG. 6, this bit allocation processing unit 52 includes, in addition to the power calculation unit 52a, the quantization noise calculation unit 52d, the secondary quantization bit number calculation unit 52e, and the quantization noise storage unit 52f of the bit allocation processing unit 52 described above, a masking calculation unit 52g, a minimum audible limit combining unit 52h, an SMR calculation unit 52i, an MNR calculation unit 52j, a primary quantization bit number calculation unit 52k, and a non-masking region extraction unit 52m.
[0051]
The masking calculation unit 52g calculates a masking threshold from the above spectrum power Si by a known means. For example, if psychoacoustic model 1 of MPEG-1 is used, the following equations are obtained.
[0052]
$$V_f = \begin{cases} 17\,(dz + 1) - (0.4\,X[z(i)] + 6) & (-3 \le dz < -1) \\ (0.4\,X[z(i)] + 6)\,dz & (-1 \le dz < 0) \\ -17\,dz & (0 \le dz < 1) \\ -(dz - 1)\,(17 - 0.15\,X[z(i)]) - 17 & (1 \le dz < 8) \\ -\infty & (dz < -3,\ dz \ge 8) \end{cases} \quad [\mathrm{dB}]$$
where $dz = z[j] - z[i]$ is expressed in Bark, $X[z(i)] = 10 \log_{10} S_i$, and Bark represents the unit of the critical band.
[0053]
The masking threshold is obtained by calculating Vf in each of the above formulas for each i (critical band index) and selecting the maximum Vf for overlapping frequencies. As a method for calculating the masking threshold, there are some other known methods, and the method is not limited to the above method.
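A minimal sketch of this calculation follows. The second segment is written as (0.4·X + 6)·dz, the form that joins the neighbouring segments continuously at dz = −1 and dz = 0. Adding the masker level to Vf before taking the maximum is an assumption of this sketch, since the text speaks only of selecting the maximum Vf; `vf` and `masking_threshold_db` are hypothetical names.

```python
import math

def vf(dz, x_db):
    """Masking function Vf in dB for a masker of level x_db (dB),
    evaluated dz Bark away from the masker."""
    if -3 <= dz < -1:
        return 17 * (dz + 1) - (0.4 * x_db + 6)
    if -1 <= dz < 0:
        return (0.4 * x_db + 6) * dz
    if 0 <= dz < 1:
        return -17 * dz
    if 1 <= dz < 8:
        return -(dz - 1) * (17 - 0.15 * x_db) - 17
    return -math.inf

def masking_threshold_db(levels_db, z):
    """For each band j, take the largest contribution x_i + Vf over all
    maskers i. Adding the masker level x_i is an assumption of this sketch."""
    return [max(x + vf(zj - zi, x) for x, zi in zip(levels_db, z)) for zj in z]
```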
[0054]
The minimum audible limit combining unit 52h combines the minimum audible limit characteristic, expressed by the following equation or the like, with the masking threshold obtained by the masking calculation unit 52g, and determines the final masking threshold Mi for each frequency band as shown in FIG. The minimum audible limit characteristic may be stored in a table ROM in advance.
[0055]
$$lt(f) = 3.64\,(f/1000)^{-0.8} - 6.5\,\exp\!\left(-0.6\,(f/1000 - 3.3)^2\right) + 10^{-3}\,(f/1000)^4 \tag{2}$$
Here, f is the frequency (Hz).
The SMR calculation unit 52i calculates, over all frequency bands, the ratio SMRi = Si / Mi between the spectrum power Si obtained by the power calculation unit 52a and the masking threshold Mi of each frequency band obtained by the minimum audible limit combining unit 52h, where i is the index of each frequency band.
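The minimum audible limit can be sketched with the widely used closed-form approximation of the threshold in quiet, in dB with f in Hz. How unit 52h combines it with the masking threshold is not spelled out in the text, so taking the higher of the two per band is an assumption; `threshold_in_quiet_db` and `combine_db` are hypothetical names.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Minimum audible limit lt(f) in dB for a frequency f_hz > 0 (Hz)."""
    k = f_hz / 1000.0
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

def combine_db(masking_db, quiet_db):
    # Assumed combination rule: keep the higher of the two thresholds per band.
    return [max(m, q) for m, q in zip(masking_db, quiet_db)]
```

The curve is high at low frequencies, dips below 0 dB around 3 to 4 kHz where hearing is most sensitive, and rises again toward high frequencies.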
[0056]
The MNR calculation unit 52j calculates, for each frequency band, the ratio SNRi(n) = Si / Ni(n) between the spectrum power Si and the quantization noise power Ni(n) that results when the band is quantized with n bits, and obtains the ratio MNRi(n) = SNRi(n) / SMRi between the masking threshold and the quantization noise power from this SNRi(n) and the above-mentioned SMRi. The ratio SNRi(n) is statistically a characteristic that depends on the nature of the signal, and may be obtained by statistical processing.
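Written out with the definitions of SNRi(n) and SMRi, and assuming Si > 0, the spectrum power cancels, so MNRi(n) is the masking threshold-to-noise ratio M / N(n) used by the iterative method described in the prior art:

$$\mathrm{MNR}_i(n) = \frac{\mathrm{SNR}_i(n)}{\mathrm{SMR}_i} = \frac{S_i / N_i(n)}{S_i / M_i} = \frac{M_i}{N_i(n)}$$

In decibels the division becomes a subtraction: MNRi(n) [dB] = SNRi(n) [dB] − SMRi [dB].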
[0057]
Based on the ratio MNRi(n) between the masking threshold and the quantization noise power obtained by the MNR calculation unit 52j, the primary quantization bit number calculation unit 52k assigns the quantization bit number of each frequency band as follows. The number of bits n is increased from 0, and each time the ratio MNRi(n) in each frequency band is calculated and bits are allocated to the frequency band in which the ratio MNRi(n) is smallest. Every time the number of quantization bits n is updated, bits are again allocated to the frequency band in which the ratio MNRi(n) is smallest, and this is repeated until the predetermined number of allocatable bits determined by the bit rate is reached. That is, bits are allocated sequentially, starting from the frequency band in which the spectrum power Si exceeds the threshold Mi by the largest amount.
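The loop described here can be sketched as a greedy allocation working in decibels, where MNRi(n) = SNRi(n) − SMRi. The 6.02 dB-per-bit SNR model, the one-bit-per-band allocation unit, and the names `allocate_bits`, `db_per_bit`, and `max_bits` are assumptions for illustration; the text leaves SNRi(n) to statistical processing.

```python
def allocate_bits(smr_db, total_bits, db_per_bit=6.02, max_bits=16):
    """Greedy allocation: repeatedly give one bit to the band whose
    MNR_i(n) = SNR_i(n) - SMR_i (in dB) is currently smallest."""
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        # Bands that have reached the per-band cap are no longer candidates.
        candidates = [i for i, b in enumerate(bits) if b < max_bits]
        if not candidates:
            break
        worst = min(candidates, key=lambda i: db_per_bit * bits[i] - smr_db[i])
        bits[worst] += 1
    return bits
```

With SMR values of 20 dB, 5 dB, and −10 dB and a budget of five bits, the band furthest above its threshold receives four bits, the next receives one, and the masked band receives none.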
[0058]
The non-masking region extraction unit 52m, as a non-masking frequency band extracting unit, extracts a non-masking region (non-masking frequency band) using psychoacoustics, based on the above-described ratio SMRi. Specifically, a frequency band in which the ratio SMRi exceeds 1 is a non-masking frequency band, and a frequency band in which the ratio SMRi is 1 or less is a masking frequency band. The unit therefore evaluates SMRi > 1 for each frequency band to obtain the non-masking frequency bands.
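The SMRi > 1 test amounts to a simple filter over the bands. In this sketch the band indices are 0-based and `non_masking_bands` is a hypothetical name.

```python
def non_masking_bands(powers, thresholds):
    """Indices i with SMR_i = S_i / M_i > 1, i.e. bands whose spectrum
    power exceeds the final masking threshold."""
    return [i for i, (s, m) in enumerate(zip(powers, thresholds)) if s / m > 1]
```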
[0059]
Here, only for the non-masking frequency bands in which |Nit2′(n) − Nit1(n)| > 12 dB, the secondary quantization bit number calculation unit 52e corrects the bit allocation, for i = 0, ..., 25, until |Nit2′(n) − Nit1(n)| < 12 dB.
[0060]
The number of quantization bits reduced or increased by the correction is adjusted within the masking frequency band SiM (shaded portion) shown in FIG.
[0061]
As described above, this bit allocation processing unit 52 corrects the bit allocation amount (quantization bit number) calculated by the primary quantization bit number calculation unit 52k by means of the secondary quantization bit number calculation unit 52e, similarly to the bit allocation processing unit 52 described above, but corrects only the non-masking frequency bands extracted by the non-masking region extraction unit 52m. This makes it possible to reduce the deterioration in sound quality that can be perceived as abnormal noise caused by fluctuations in quantization error, for sources such as music and speech for which the use of psychoacoustic characteristics is preferable because they contain many components in the non-masking frequency band.
[0062]
Next, still another bit allocation processing unit 52 will be described.
[0063]
As shown in FIG. 9, this bit allocation processing unit 52 includes a power calculation unit 52a, a primary quantization bit number calculation unit 52c, a quantization noise calculation unit 52d, a secondary quantization bit number calculation unit 52e, a quantization noise storage unit 52f, and a maximum power band extraction unit 52n.
[0064]
The maximum power band extraction unit 52n, as the maximum value extraction unit, extracts the spectrum power maximum value Max(Si) from the above-described spectrum power Si calculated by the power calculation unit 52a. Specifically, the maximum power band extraction unit 52n extracts the index i at which Si is largest among the spectrum powers Si (i = 1, 2, ..., I), and thereby extracts the spectrum power maximum value Max(Si).
[0065]
Note that, when extracting the maximum value of energy described later, the maximum power band extraction unit 52n extracts the index i of the largest energy Ei from the energies Ei (i = 1, 2, ..., I). Likewise, when extracting the maximum value of the scale factor described later, it extracts the index i of the largest scale factor SFi from the scale factors SFi (i = 1, 2, ..., I). The scale factor represents the scale (magnitude) of the spectrum data, and is generally calculated by encoding the absolute value of the maximum spectrum in each quantized frequency unit.
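Whichever quantity is used (power Si, energy Ei, or scale factor SFi), the extraction reduces to finding the index of the largest entry. A minimal sketch, with the hypothetical name `peak_band` and 0-based indices rather than the 1-based indices of the text:

```python
def peak_band(values):
    """Index i of the largest value (power S_i, energy E_i or scale
    factor SF_i); the first such index is returned on ties."""
    return max(range(len(values)), key=values.__getitem__)
```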
[0066]
Here, only for the frequency band of the above spectrum power maximum value Max(Si), the secondary quantization bit number calculation unit 52e corrects the bit allocation, when the difference represented by |Nit2′(n) − Nit1(n)| satisfies |Nit2′(n) − Nit1(n)| > 12 dB, so that |Nit2′(n) − Nit1(n)| < 12 dB. When the maximum value of the energy or of the scale factor of the spectrum data is extracted instead, the quantization bit number is corrected in the same way only for that maximum.
[0067]
The number of quantization bits reduced or increased by this modification is adjusted using the number of quantization bits in a band other than the maximum power band SiE (shaded portion) shown in FIG.
[0068]
As described above, this bit allocation processing unit 52 corrects the bit allocation amount (quantization bit number) calculated by the primary quantization bit number calculation unit 52c by means of the secondary quantization bit number calculation unit 52e, similarly to the bit allocation processing unit 52 described above, but performs the correction only for the spectrum power maximum value (peak frequency) extracted by the maximum power band extraction unit 52n. Thereby, the fluctuation (variation) of the quantization error of the peak frequency is suppressed. The peak frequency is a general term for the frequencies in the frequency band to which the maximum value of the power, energy, or index (scale factor) of the spectrum data belongs.
[0069]
Since the peak frequency is not affected by masking (although it may be affected by the minimum audible limit), it is a psychoacoustically important frequency. That is, at a signal level equal to or higher than the minimum audible limit, the peak frequency is audible without being masked. Therefore, when fluctuation (variation) of the quantization error occurs, the peak frequency is the frequency at which it is most easily perceived as an abnormal sound.
[0070]
Therefore, by suppressing the fluctuation of the quantization error of the peak frequency, fluctuations in the quantization error at the same frequency can be suppressed, compared with the case where the conventional bit allocation method is used, in any of the bit allocation method using the masking threshold-to-noise ratio, the bit allocation method using the signal-to-noise ratio, and the bit allocation method using both the masking threshold-to-noise ratio and the signal-to-noise ratio.
[0071]
In addition, since the minidisk device according to the present embodiment includes the audio compression circuit 5 including the bit allocation processing unit 52 of FIGS. 1, 6, and 9, fluctuations in quantization error are suppressed as described above. The digital audio data can be compressed and encoded. Therefore, even when a signal having a large temporal change is input when recording a signal having a small temporal change, a signal with little deterioration in sound quality due to a quantization error can be recorded.
[0072]
The digital signal encoding apparatus of the present invention is applied to the mini disk apparatus in the embodiment, but it is needless to say that it can be applied to other apparatuses that require similar encoding.
[0073]
[Effects of the invention]
As described above, the digital signal encoding apparatus of the present invention includes: bit allocation amount calculating means for calculating, for each frequency band, the bit allocation amount of each temporally continuous frame; first quantization error calculating means for calculating a quantization error of the bit allocation amount calculated by the bit allocation amount calculating means; bit allocation amount correcting means for correcting the bit allocation amount of the current frame, calculated by the bit allocation amount calculating means, based on the bit allocation amount of the previous frame immediately before the current frame; and second quantization error calculating means for calculating a quantization error of the final bit allocation amount obtained by the bit allocation amount correcting means. The bit allocation amount correcting means corrects the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculating means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculating means so that the difference becomes smaller than a predetermined value.
[0074]
Thereby, at the time of correction by the bit allocation amount correction means, the difference between the quantization error of the bit allocation amount of the current frame and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means is It is corrected so as to be smaller than a predetermined value. Therefore, even when a signal with a large temporal change is input instantaneously when a signal with a small temporal change is input, fluctuations in quantization error of the same frequency between adjacent frames are suppressed. Therefore, it is possible to reduce the deterioration of sound quality that can be perceived as an abnormal sound generated by the fluctuation of the quantization error.
[0075]
The digital signal encoding apparatus further includes maximum value extracting means for extracting the maximum value of the power, energy, or scale factor of the spectrum data, and the bit allocation amount correcting means corrects the difference in the frequency band to which the extracted maximum value belongs. This suppresses the fluctuation of the quantization error of the peak frequency, that is, the frequency of the frequency band to which the maximum value of the power, energy, or scale factor of the spectrum data belongs. As a result, whichever of the bit allocation method using the masking threshold-to-noise ratio, the bit allocation method using the signal-to-noise ratio, and the bit allocation method using both the masking threshold-to-noise ratio and the signal-to-noise ratio is used, the fluctuation of the quantization error of the same frequency is suppressed compared with the case where the conventional bit allocation method is used. Therefore, perceivable deterioration in sound quality due to a change with time can be reduced.
[0076]
Another digital signal encoding apparatus of the present invention includes: bit allocation amount calculating means for calculating, for each frequency band, the bit allocation amount of each temporally continuous frame; first quantization error calculating means for calculating a quantization error of the bit allocation amount calculated by the bit allocation amount calculating means; non-masking frequency band extracting means for extracting the quantization error for a non-masking frequency band; bit allocation amount correcting means for correcting the bit allocation amount of the current frame, calculated by the bit allocation amount calculating means, based on the bit allocation amount of the previous frame immediately before the current frame; and second quantization error calculating means for calculating a quantization error of the final bit allocation amount obtained by the bit allocation amount correcting means. The bit allocation amount correcting means corrects the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculating means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculating means so that the difference in quantization error in the non-masking frequency band becomes smaller than a predetermined value.
[0077]
As a result, at the time of correction by the bit allocation amount correction means, the quantization error for the non-masking frequency band of the bit allocation amount of the current frame and the non-masking of the bit allocation amount of the previous frame calculated by the second quantization error calculation means The difference from the quantization error for the frequency band is corrected to be smaller than a predetermined value. Therefore, even when a signal with a large temporal change is input instantaneously when a signal with a small temporal change is input, fluctuations in quantization error of the same frequency between adjacent frames are suppressed. Therefore, it is possible to reduce deterioration in sound quality that can be perceived as abnormal sound generated by variation in quantization error, for a source that preferably uses auditory psychological characteristics such as music and voice.
[0078]
A digital signal recording apparatus according to the present invention encodes an input digital signal by a predetermined encoding process and records it on a recording medium, and includes one of the digital signal encoding devices described above.
[0079]
Since each digital signal encoding device described above suppresses variation in quantization error of the same frequency between adjacent frames, even when a signal having a large temporal change is input when a signal having a small temporal change is recorded, It is possible to record a signal with little deterioration in sound quality due to quantization error. Therefore, it is possible to provide a digital signal recording apparatus capable of recording with high sound quality.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a bit allocation processing unit of an audio compression circuit in a minidisk device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of the mini-disc device.
FIG. 3 is a block diagram showing a configuration of the audio compression circuit.
FIG. 4 is a diagram illustrating spectrum power of each frequency band obtained by a power calculation unit in the bit allocation processing unit.
FIG. 5 is a diagram illustrating bit allocation to each frequency band by the bit allocation processing unit.
FIG. 6 is a block diagram showing a configuration of another bit allocation processing unit.
FIG. 7 is a diagram illustrating spectrum power of each frequency band obtained by a power calculation unit in the bit allocation processing unit of FIG. 6.
FIG. 8 is a diagram illustrating bit allocation to each frequency band by the bit allocation processing unit of FIG. 6.
FIG. 9 is a block diagram showing a configuration of still another bit allocation processing unit.
FIG. 10 is a diagram illustrating spectrum power of each frequency band obtained by a power calculation unit in the bit allocation processing unit of FIG. 9.
FIG. 11 is a diagram illustrating bit allocation to each frequency band by the bit allocation processing unit of FIG. 9.
[Explanation of symbols]
5 Audio compression circuit (digital signal encoding device)
51 Spectrum conversion unit
52 Bit allocation processing unit
52a Power calculation unit
52c Primary quantization bit number calculation unit (bit allocation amount calculation means)
52d Quantization noise calculation unit (first quantization error calculation means)
52e Secondary quantization bit number calculation unit (bit allocation amount correcting means)
52f Quantization noise storage unit (second quantization error calculation means)
52m Non-masking region extraction unit (non-masking frequency band extraction means)
52n Maximum power band extraction unit (maximum value extraction means)

Claims

A digital signal encoding apparatus that converts a digital signal into spectrum data for each of a plurality of predetermined frequency bands and encodes the spectrum data of each frequency band with a bit allocation amount given to that band, the apparatus comprising:
bit allocation amount calculation means for calculating the bit allocation amount of each frame that is temporally continuous for each frequency band;
first quantization error calculation means for calculating a quantization error of the bit allocation amount calculated by the bit allocation amount calculation means;
bit allocation amount correcting means for correcting the bit allocation amount of the current frame based on the bit allocation amount of the previous frame immediately before the current frame calculated by the bit allocation amount calculating means; and
second quantization error calculation means for calculating a quantization error of the final bit allocation amount obtained by the bit allocation amount correction means,
wherein the bit allocation amount correction means corrects the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means so that the difference becomes smaller than a predetermined value.

The digital signal encoding apparatus according to claim 1, further comprising maximum value extracting means for extracting the maximum value of the power, energy, or scale factor of the spectrum data,
wherein the bit allocation amount correcting means corrects the difference in the frequency band to which the extracted maximum value belongs.

A digital signal encoding apparatus that converts a digital signal into spectrum data for each of a plurality of predetermined frequency bands, obtains the masking threshold-to-noise ratio of each frequency band for each assumed number of bits from the magnitude of the spectrum of each frequency band, and encodes the spectrum data with a bit allocation amount given sequentially, each time a number of bits is assumed, starting from the frequency band in which the masking threshold-to-noise ratio is smallest, the apparatus comprising:
bit allocation amount calculation means for calculating the bit allocation amount of each frame that is temporally continuous for each frequency band;
first quantization error calculation means for calculating a quantization error of the bit allocation amount calculated by the bit allocation amount calculation means;
non-masking frequency band extracting means for extracting the quantization error for a non-masking frequency band;
bit allocation amount correcting means for correcting the bit allocation amount of the current frame based on the bit allocation amount of the previous frame immediately before the current frame calculated by the bit allocation amount calculating means; and
second quantization error calculation means for calculating a quantization error of the final bit allocation amount obtained by the bit allocation amount correction means,
wherein the bit allocation amount correction means corrects the difference between the quantization error of the bit allocation amount of the current frame calculated by the first quantization error calculation means and the quantization error of the bit allocation amount of the previous frame calculated by the second quantization error calculation means so that the difference in quantization error in the non-masking frequency band becomes smaller than a predetermined value.

A digital signal recording apparatus that encodes an input digital signal by a predetermined encoding process and records it on a recording medium,
the digital signal recording apparatus comprising the digital signal encoding apparatus according to any one of claims 1 to 3 for performing the encoding process.