JPH06291679A

JPH06291679A - Threshold value control quantization determining method for audio signal

Info

Publication number: JPH06291679A
Application number: JP24368992A
Authority: JP
Inventors: Dou Fui Chiyuu; フィチュウ・ドゥ
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1992-09-11
Filing date: 1992-09-11
Publication date: 1994-10-18
Anticipated expiration: 2013-03-18
Also published as: JP2729013B2

Abstract

PURPOSE: To obtain a threshold control quantization determination method for audio signals that performs bit allocation at low cost while keeping the quality of the reconstructed signal high. CONSTITUTION: A filter bank 101 generates subband samples from an input audio signal, a maximum value is found from the subband samples, and it is quantized by a first quantizing means 103. In the process in which the quantized maximum value is normalized and quantized by a second quantizing means 108, the average energy of the subband is calculated (110), a division coefficient is found (111) by a specified operation based on the average energy, and the second quantization is performed according to the division coefficient.

Description

Detailed Description of the Invention

【０００１】[0001]

FIELD OF THE INVENTION The present invention relates to efficient information coding of digital audio signals for transmission or for digital storage media.

【０００２】[0002]

DESCRIPTION OF THE RELATED ART Compression algorithms developed in recent years for high-fidelity audio are mainly frequency-domain coders. These coders decompose the input digital audio signal into the frequency domain in order to extract its spectral characteristics and remove redundancy and irrelevancy. In most of these designs the encoder may be considerably more complex, but the decoder must be simple. The encoder's bit allocation scheme plays a crucial role and largely determines the coder's complexity, the extent of redundancy removal, and the quality of the reconstructed output.

[0003] Subband coding (SBC), which uses feedforward quantization with psychoacoustic modeling for audio signal coding, forms the core method of the audio coding standard drafted by ISO/WG11/MPEG (Moving Picture Experts Group). The bit allocation procedure used in the subband coding algorithm is based on the masking threshold of the human ear, and the reconstructed output has been found to be transparent at bit rates of 128 kbit/s per channel and above.

[0004] FIG. 6 shows a block diagram of a subband coder that uses a perceptual bit allocation procedure. First, the filter bank 101 filters the input audio signal x into subbands, yielding subband samples S_k. From these samples, the maximum value determination means 102 finds the maximum value Smax at each prescribed time interval. The 6-bit uniform quantizer (first quantizer) 103 then quantizes Smax, producing the quantized Smax. In block 104 the quantized Smax is inverted, and the multiplier 105 multiplies the samples S_k by the result. This normalization takes place before the samples enter the second quantizer 108, which produces the S_k for transmission. The second-quantization step used for each subband is determined by the output of block 107, which computes the bit allocation. Meanwhile, the input audio signal x is sent along a separate path to block 106, where block 106a first applies a fast Fourier transform (FFT) and decomposes the signal into spectral components for each time interval or frame. Block 106b then computes the masking threshold based on the simultaneous masking of tone against noise and of noise against tone. As a result, a mask value is obtained for each subband, and these mask values are used in the bit allocation procedure of block 107.

[0005] FIG. 7 is a flow chart outlining the processing of blocks 106 and 107. Step S71 is the FFT process and corresponds to block 106a. Step S72 corresponds to block 106b, where the mask values are obtained. Steps S73, S74, S75, S76, S77, S78, and S79 are the processing steps within block 107. The quantization noise is calculated from the mask values and the quantized Smax (step S73), and the noise-to-mask ratio is calculated from that quantization noise (step S74). The resulting noise-to-mask ratio is then used in an iterative procedure that determines the number of bits allocated to each subband. The steps of the iteration are as follows.

[0006] [1] Find the frequency band i with the maximum noise-to-mask ratio (NMR) (step S75).

[0007] [2] Increment the bit allocation L_i of frequency band i (step S76).

[0008] [3] Recalculate NMR_i (step S77).

[0009] [4] Calculate the remaining bits B, where B = B - M (step S78).

[0010] [5] If B is greater than or equal to M, repeat steps [1] through [5] above; otherwise, end the allocation (step S79).
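The prior-art loop in steps [1] to [5] can be sketched as follows. This is only an illustrative sketch, not the standard's procedure: the function name and parameters are hypothetical, the text does not say how NMR_i is recalculated (the sketch assumes each added bit lowers the noise by about 6.02 dB, the usual figure for a uniform quantizer), and M is assumed to be the number of samples per band. The sketch also tests B against M before the first increment so that it never overspends.

```python
def nmr_bit_allocation(nmr_db, total_bits, samples_per_band, db_per_bit=6.02):
    """Greedy bit allocation driven by the noise-to-mask ratio (steps [1]-[5])."""
    nmr = list(nmr_db)           # working copy of NMR per band, in dB
    bits = [0] * len(nmr)        # bits per sample allocated to each band
    cost = samples_per_band      # M: bits consumed by one increment (assumed)
    remaining = total_bits       # B: bits still available
    while remaining >= cost:                             # [5] continue while B >= M
        i = max(range(len(nmr)), key=nmr.__getitem__)    # [1] band with largest NMR
        bits[i] += 1                                     # [2] increment its allocation
        nmr[i] -= db_per_bit                             # [3] recalculate NMR_i (assumed model)
        remaining -= cost                                # [4] B = B - M
    return bits
```

Note that this loop still needs the mask values from the FFT-based model of blocks 106a and 106b, which is the expensive part the text goes on to criticize.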

[0011] In addition, the paper by T. A. Ramstad entitled "Subband coder with a simple adaptive bit allocation algorithm - a possible candidate for digital mobile telephony," Proceedings of ICASSP'82, Paris, pp. 203-206, May 1985, describes a simple bit allocation method based on the relative standard deviations of the subband signals. The bit allocation in this method follows a simple majority principle: the band with the largest standard deviation receives the first bit, and its standard deviation is then reduced by a linear factor. Another variation of the bit allocation procedure appears in the patent specification by Y. Matsui entitled "Method of encoding digital audio signals." FIG. 8 shows its block diagram. Block 201 is a level calculation block in which the level of the subband signal is determined. Block 202 is a logarithm calculation block, and block 203 is a scale factor calculation block. Block 204 holds the weighted values of a logarithm table, and block 205 contains the weighting table for the subbands. The bit allocation calculation is performed in block 206 based on linearly weighted ratios of the logarithms of the levels.

【００１２】[0012]

PROBLEMS TO BE SOLVED BY THE INVENTION However, although the perceptual bit allocation procedure described above achieved transparency in tests at bit rates of 128 kbit/s and above, its complexity is a drawback. The FFT and masking-threshold calculations are computationally intensive and demand very large processing power, which limits the feasibility of a low-cost implementation.

[0013] The simple bit allocation methods, on the other hand, fall short in that they do not allocate bits with the human auditory system in mind. First, the allocation is based on standard deviation rather than on the energy within a band, which relates directly to perceived loudness. Second, the high-frequency subbands receive the same weighting as the low-frequency subbands, so bits are distributed evenly even to high-frequency subbands where the ear is less sensitive.

[0014] This can be seen, for example, from the results in Table 1 for a sample frame of an audio signal under perceptual bit allocation and simple bit allocation. The required bit allocation, calculated as the minimum allocation needed to guarantee that the quantization noise stays below the mask, is listed alongside for comparison. These bit allocations were obtained from a subband codec with 32 equal-width bands operating at a bit rate of 192 kbit/s per channel. As the results show, the number of bits allocated by the simple bit allocation procedure is less than the required number for bands 0, 1, and 2. There is therefore audible noise in these three subbands, and the probability of hearing noise in this frame of the audio signal is 3/32. With the perceptual allocation, by contrast, the allocated bits always exceed the required number.

【００１５】[0015]

【表１】 [Table 1]

[0016] In view of these problems of the prior art, an object of the present invention is to provide a threshold control quantization determination method for audio signals that can allocate bits at low cost while maintaining good quality of the reconstructed signal.

【００１７】[0017]

MEANS FOR SOLVING THE PROBLEMS The invention of claim 1 is a threshold control quantization determination method for an audio signal in which, in the process of generating subband samples from an input audio signal by a filter bank, obtaining a maximum value from the subband samples, quantizing the maximum value by first quantizing means, normalizing the quantized maximum value, and quantizing the result by second quantizing means, the average energy of the subband samples is calculated, a division coefficient is obtained by a predetermined operation based on the average energy, and the second quantization is performed according to the division coefficient.

[0018] The invention of claim 2 is a threshold control quantization determination method for an audio signal that, in determining the quantization of a digital audio signal having spectral and temporal structure and represented by a plurality of frames each consisting of spectral quantization components, comprises the steps of: finding the peak value of the components of a frame within a prescribed time interval; removing components in the frame by comparing the peak value with the hearing threshold of the human auditory system; calculating the average energy of the frame within the time interval; and changing the quantization step of the frame according to a predetermined relationship with the average energy.

【００１９】[0019]

OPERATION The present invention calculates the average energy of the subband samples, obtains a division coefficient by a predetermined operation based on that average energy, and performs the second quantization according to the division coefficient. Simple bit allocation can therefore be performed while the quality of the reconstructed signal is also kept high.

【００２０】[0020]

DESCRIPTION OF THE PREFERRED EMBODIMENTS The present invention will be described below with reference to the drawings showing its embodiments.

[0021] FIG. 1 is a block diagram for implementing the threshold control quantization determination method according to a first embodiment of the present invention. Blocks bearing the same numbers as in the conventional example of FIG. 6 have the same functions. Blocks 109 and 110 are power estimation blocks that calculate the average energy of the subband samples, and block 111 performs bit allocation based on that average energy (power).

[0022] FIG. 2 is a flow chart of the algorithm in the block diagram configured as above. The power estimation blocks 109 and 110 correspond to step S1, and block 111 corresponds to steps S2 to S9. The processing steps in FIG. 2 are detailed below.

[0023] [1] The energy σ_i is calculated for each subband i over a block of input subband samples S_k, i.e., σ_i = Σ S_k^2. The result is obtained as a decibel value (step S1).

[0024] [2] The quantization noise N_i in dB, assuming a zero-bit allocation, is calculated from the maximum value Smax for each subband (step S2).

[0025] [3] A check is made as to whether the hearing threshold Q_i exceeds the noise N_i (step S3). This hearing threshold is static, so obtaining it adds no calculation time.

[0026] [4] If the threshold is exceeded in step [3], the subband energy σ_i is set to a low dummy value (step S4). For example, it is set to a value 100 dB lower, so that this subband is the least likely to be allocated bits during the iterative process.

[0027] [5] Search for the subband i with the maximum average energy (step S5).

[0028] [6] Increment the bit allocation of the subband by L_i (step S6).

[0029] [7] Reduce the average energy by γ_n, where different γ_n may be used for different clusters of subbands (step S7). These values γ_n are optimized to obtain the best subjective sound quality. For example, γ is given the value 6 for subbands 0 to 4 and the value 8 for subbands 5 to 31.

[0030] [8] Calculate the number of bits used (step S8), i.e., M = kL_i, where k is the number of time samples in the subband and L_i is the incremental number of bits for subband i.

[0031] [9] When the number of remaining bits becomes smaller than the next possible number of bits, the bit allocation ends. Otherwise, steps S5, S6, S7, S8, and S9 above are repeated (step S9).
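Steps S3 to S9 above can be sketched as a short greedy loop. This is a minimal sketch under stated assumptions, not the patented implementation: the function names and parameters are hypothetical, the per-band energies σ_i (step S1) and zero-bit noise values N_i (step S2) are taken as ready-made dB inputs because the text does not give their exact formulas, the increment L_i is assumed to be 1 bit, and the γ values are the example values 6 and 8 quoted in step [7]. No per-band upper limit on bits is modeled.

```python
def gamma_for(band):
    """Example reduction values from step [7]: 6 for bands 0-4, 8 for bands 5-31."""
    return 6.0 if band <= 4 else 8.0

def threshold_controlled_allocation(energy_db, noise0_db, hearing_db,
                                    total_bits, k, step=1):
    """Energy-driven greedy allocation following steps S3-S9."""
    n = len(energy_db)
    energy = list(energy_db)
    for i in range(n):
        if hearing_db[i] > noise0_db[i]:   # S3: hearing threshold exceeds the noise
            energy[i] -= 100.0             # S4: push the band 100 dB down (dummy value)
    bits = [0] * n
    remaining = total_bits
    cost = k * step                        # S8: M = k * L_i bits per increment
    while remaining >= cost:               # S9: stop when the next increment does not fit
        i = max(range(n), key=energy.__getitem__)   # S5: band with maximum energy
        bits[i] += step                    # S6: increment its allocation by L_i
        energy[i] -= gamma_for(i)          # S7: reduce its energy by gamma_n
        remaining -= cost
    return bits
```

Unlike the FIG. 7 procedure, nothing in this loop depends on an FFT or a computed masking threshold; the only perceptual inputs are the static hearing threshold and the fixed γ values.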

[0032] FIGS. 3 and 4 are flow charts of the second and third embodiments of the present invention, which take calculation time into account. Because the most likely applications of this bit allocation procedure are cost-sensitive products, alternatives to the iterative procedure are provided for cases where the processor's computing speed is limited.

[0033] FIG. 3 shows the bit allocation procedure for the case where the available calculation time can be estimated accurately and converted into the number of bits that can be allocated iteratively. The remaining bits are used as a "seed" bit allocation for the subbands. These "seed" values can be allocated in direct proportion to the subband energies, or in weighted proportions that reflect the ear's sensitivity to different bands. If the calculation time cannot be estimated, a count is needed in the iterative loop to indicate whether the calculation time has expired. When it has, the remaining bits are allocated in the same way as the "seed" bit allocation described above.

[0034] In FIG. 3, steps S1, S2, S3, and S4 are the same as steps S1, S2, S3, and S4 in FIG. 2. Step S10 represents the determination of the seed ratio, step S11 represents the allocation by weighted energy ratios, and step S12 corresponds to the iteration of steps S5, S6, S7, S8, and S9 in FIG. 2. The processing of step S10, which determines the seed ratio for the information flow, is detailed below.

[0035] 1. Calculate the time t that the processor spends on one iteration of the loop.

[0036] 2. Given the processor constraints, determine the allowable time limit T.

[0037] 3. Given the time constraint T, calculate the allowable number of loops N.

[0038] 4. Assuming that one bit is allocated per loop, N is the number of bits available. N is the seed ratio for the information flow.

[0039] In the next step S11, the initial allocation L_i of each subband is given by L_i = (w_i σ_i / Σ w_i σ_i) × N, where w_i is the weight given to subband i.
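The weighted proportional split of step S11 can be illustrated with a few lines of code. This is a sketch only: the function name is hypothetical, the energies are assumed to be non-negative values, and rounding down to whole bits is an assumption, since the text does not say how fractional results are handled.

```python
def seed_allocation(energies, weights, n_bits):
    """Initial allocation L_i = (w_i * sigma_i / sum(w_j * sigma_j)) * N."""
    weighted = [w * e for w, e in zip(weights, energies)]   # w_i * sigma_i
    total = sum(weighted)                                    # sum over all subbands
    if total <= 0:
        return [0] * len(energies)       # nothing to distribute against
    # Rounding down (assumed) can leave a few bits unassigned.
    return [int(n_bits * x / total) for x in weighted]
```

With equal weights the split is directly proportional to energy; unequal weights shift the split toward the bands the weights favor, which is how the text suggests reflecting the ear's sensitivity.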

[0040] In FIG. 4, step S13 represents the processing of steps S1 to S4 in FIG. 2, and step S14 represents the processing of steps S5 to S8 in FIG. 2. Step S15 checks whether the available number of bits has been used up. Step S16 checks whether the calculation time has expired by counting the number of iteration loops. If the calculation time has not expired, processing returns to step S14 and repeats. If it has expired, the remaining bits are allocated based on the ratios of the remaining energies (step S17). This processing is the same as step S11 in FIG. 3, except that the average energy values σ_i used have been modified non-linearly by the iterative process.
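The FIG. 4 flow, a loop bounded by an iteration count followed by a proportional fallback, might look like the sketch below. It is an illustration under assumptions rather than the patent's implementation: the names are hypothetical, a single γ value and a 1-bit increment are used for brevity, the input energies are assumed to be the output of step S13, negative energies are clamped to zero before the proportional split, and the split is rounded down.

```python
def bounded_allocation(energy, total_bits, k, max_loops, gamma=6.0):
    """Iterate at most max_loops times (S14-S16), then split the rest by energy (S17)."""
    energy = list(energy)                 # energies after the threshold check (S13)
    bits = [0] * len(energy)
    remaining = total_bits
    loops = 0
    while remaining >= k and loops < max_loops:   # S15: bits left, S16: time left
        i = max(range(len(energy)), key=energy.__getitem__)   # S14: strongest band
        bits[i] += 1
        energy[i] -= gamma                # energy reduced by the iteration
        remaining -= k
        loops += 1
    spare = remaining // k                # whole increments still affordable
    positive = [max(e, 0.0) for e in energy]      # clamp (assumed) for the ratio
    total = sum(positive)
    if spare > 0 and total > 0:           # S17: proportional to remaining energy
        for i, e in enumerate(positive):
            bits[i] += int(spare * e / total)
    return bits
```

Because the fallback rounds down, a few bits can remain unallocated; a real encoder would need some rule for them, which the text does not specify.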

[0041] The reconstructed output showed subjective sound quality almost as good as that of the perceptual coder. Measured by the probability of noise, the γ values give a probability of zero for some sequences. For sequences tested at 128 kbit/s, it is possible to deliver output with an average probability of 3.5%. Table 1 shows the bit allocation of the present invention for the same frame of the audio signal discussed above. All the energies are divided across the subbands using only a single γ value. Improvements are observed in subbands 1 and 2, and the probability of noise has now dropped to 1/32. FIG. 5 compares the bit allocations for another frame of the audio signal. The result of the present invention, labeled "threshold," matches the perceptual allocation more closely than the simple bit allocation does. The simple bit allocation shows that noise may occur in subbands 0, 1, and 2, whereas the result of the threshold control bit allocation satisfies the minimum allocation required by masking theory. The present invention has been found to be most effective at bit rates of 128 kbit/s and above.

[0042] As described above, according to the present invention, the simplicity of simple bit allocation is retained while its shortcomings are corrected by taking the human auditory system into account so as to remove more irrelevancy. First, the peak value of the subband samples is found at each prescribed time interval. Second, psychoacoustically irrelevant subbands are removed by applying the hearing threshold and comparing it with the peak value. This often narrows the bandwidth of the audio signal. The hearing threshold is the lowest sound pressure level of a pure tone audible to the human ear, and it is a function of frequency. Third, the average energy of each subband is calculated and used as the basis for allocation. Finally, bits are allocated to the subbands iteratively. Each time an increment of bits is allocated to a subband, a non-linear value, assigned per group of subbands, is applied as the division coefficient of the average energy. This takes into account that the ear's sensitivity differs with frequency and sound pressure level.

[0043] The number of bits to allocate to each subband is calculated using the average energy and peak value within the band, the worst-case value of the threshold in quiet, and the non-linear reduction coefficient. This bit allocation technique does not affect the structure of the encoding algorithm, so the decoding algorithm can also remain unchanged. To improve the quality of the reconstructed output, the non-linear reduction coefficient can be weighted differently for different clusters of subbands. Furthermore, if calculation time is constrained, an initial or "seed" value can first be calculated for each subband before iterating.

[0044] The strength of the present invention thus lies in its simplicity and in achieving quality comparable to that of a coder using perceptual bit allocation. According to the assessment of the ISO/MPEG audio subgroup, an encoder with the configuration of FIG. 6 cannot be implemented on a single DSP because of the complexity of the algorithm. Cost therefore becomes a significant issue and deters the use of such encoders in consumer products that do not require very low bit rates. The present invention is highly significant as a low-cost alternative because of its simplicity and its effectiveness in achieving good sound quality.

[0045] In the above embodiments, the blocks performing each process are implemented in dedicated hardware, but the same functions may of course be implemented in software on a computer instead.

【００４６】[0046]

EFFECTS OF THE INVENTION As is clear from the above, the present invention calculates the average energy of the subband samples, obtains a division coefficient by a predetermined operation based on that average energy, and performs the second quantization according to the division coefficient. It therefore has the advantage that bits can be allocated at low cost while the quality of the reconstructed signal is kept good.

[Brief description of drawings]

FIG. 1 is a block diagram for implementing the threshold control quantization determination method according to a first embodiment of the present invention.

FIG. 2 is a flow chart showing the processing flow of the first embodiment.

FIG. 3 is a flow chart showing the processing flow of the threshold control quantization determination method according to a second embodiment of the present invention.

FIG. 4 is a flow chart showing the processing flow of the threshold control quantization determination method according to a third embodiment of the present invention.

FIG. 5 is a diagram showing a comparison of bit allocation results.

FIG. 6 is a block diagram for implementing a conventional quantization determination method.

FIG. 7 is a flow chart showing the processing flow of the conventional quantization determination method of FIG. 6.

FIG. 8 is a block diagram for implementing another conventional quantization determination method.

[Explanation of symbols]

101 Filter bank
102 Maximum value determination means
103 First quantizer
108 Second quantizer
110 Power estimation means
111 Bit allocation means

Claims

[Claims]

1. A threshold control quantization determination method for an audio signal, wherein, in a process of generating subband samples from an input audio signal by a filter bank, obtaining a maximum value from the subband samples, quantizing the maximum value by first quantizing means, normalizing the quantized maximum value, and quantizing the result by second quantizing means, an average energy of the subband samples is calculated, a division coefficient is obtained by a predetermined operation based on the average energy, and the second quantization is performed according to the division coefficient.

2. A threshold control quantization determination method for an audio signal, for determining the quantization of a digital audio signal having spectral and temporal structure and represented by a plurality of frames each consisting of spectral quantization components, the method comprising the steps of: finding a peak value of the components of said frame within a defined time interval; removing components in the frame by comparing the peak value with the hearing threshold of the human auditory system; calculating the average energy of the frame within the time interval; and changing a quantization step of the frame according to a predetermined relationship with the average energy.

3. The method of claim 2, wherein the component is set to 0 when the peak value is less than the hearing threshold.

4. The threshold control quantization determination method for an audio signal according to claim 2, wherein the quantization step of the components is determined by iteratively selecting the frame having the highest average energy and increasing its quantization step while decreasing its average energy non-linearly.

5. The threshold control quantization determination method for an audio signal according to claim 4, wherein the quantization is improved by applying different non-linear reductions of the average energy to different groups of frames.

6. The threshold control quantization determination method for an audio signal according to claim 4, wherein the determination of the quantization step is made more efficient by first assigning each frame, before iterating, a step derived in proportion to the average energy of the frame.