JP3465341B2 - Audio signal encoding method

Info

Publication number: JP3465341B2
Application number: JP09154594A
Authority: JP
Inventors: Robert Heddle (ロバートヘドル)
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1994-04-28
Filing date: 1994-04-28
Publication date: 2003-11-10
Anticipated expiration: 2018-11-10
Also published as: JPH07295594A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an audio signal encoding method, and more particularly to an audio signal encoding method that applies encoding accompanied by compression processing in which the spectral components of a digital audio signal are compressed into data of a fixed number of bits.

[0002]

[Prior Art] Digital audio, widely known through the compact disc and similar media, has become popular with consumers because a high-quality reproduced signal is easily obtained. However, uncompressed digital audio signals such as those on compact discs require a large amount of data storage, which is a drawback with respect to media size and/or transmission bandwidth requirements. Compressed digital audio systems have therefore been developed that remove redundant data by exploiting the characteristics of human hearing, that is, psychoacoustic characteristics. Many recently announced digital audio media and devices use such compressed digital audio systems to reduce the amount of audio data.

[0003] Such a compressed digital audio system generally first performs a time-frequency transform on the audio data. This transform is usually based on band splitting, transform coding, or a combination of the two. The transformed data are grouped into so-called quantization units. In general, all data within a single quantization unit are requantized with the same word length and, when block floating quantization is used, with the same scale factor. Data compression is achieved by using word lengths shorter than that of the original input data. The word length of each quantization unit is usually variable. Psychoacoustic techniques are used to allocate the available bits among the quantization units so as to minimize audible degradation caused by requantization, such as quantization noise.

[0004] Methods of grouping spectral coefficients are disclosed in the specifications and drawings of, for example, Japanese Patent Application Nos. 5-183322, 5-241189, and 5-275218, previously proposed by the present applicant. These systems provide a method of extracting the tonal (tonality) components of a signal and quantizing them separately. Usually a small number of frequency-domain spectral coefficients, for example about five, are determined to be a tone. The spectral coefficients of this tone can be quantized accurately without using many bits. The remaining, so-called noisy spectrum is then quantized with lower precision. The signal obtained by combining the tonal and noisy parts in the decoder has higher sound quality than prior-art systems using the same bit rate, particularly for highly tonal input signals.

[0005]

[Problem to Be Solved by the Invention] The encoder must decide how many bits to allocate to the tonal part and to the noisy part of each input signal. If this decision is inappropriate, the maximum improvement in sound quality is not realized.

[0006] The present invention has been made in view of these circumstances, and its object is to provide an audio signal encoding method that achieves the best possible bit allocation between the tonal and noisy components of a signal.

[0007] Another object of the present invention is to provide a bit allocation method for an audio encoder that encodes a signal using both tonal and noisy components. The bit allocation method analyzes the input signal for tonal and noisy components using various spectral measurements and, based on these measurements, allocates the available bits so as to obtain the best possible sound quality.

[0008]

[Means for Solving the Problem] The audio signal encoding method according to the present invention obtains a plurality of spectra by applying a predetermined transform, for example time-frequency analysis or an orthogonal transform, to an input audio signal, and encodes the spectra by allocating bits to each according to auditory characteristics. The method solves the above problem by identifying tonal components among the plurality of spectra, extracting tones from the identified tonal components, and quantizing them with a predetermined number of bits; separating, from the remaining spectra, frequency regions having high power density and quantizing them as power quantization units with a predetermined number of bits; and quantizing the spectra that then remain with the remaining available bits. In the usual case, the number of allocated bits is largest for the tones, next largest for the power quantization units, and smallest for the remaining spectra.

[0009] The extracted tones are preferably classified as either normal tones or large tones, with large tones quantized using more bits than normal tones. As described later, a large tone is one whose R_tot, the ratio of the tone energy to the total energy, is at or above a predetermined value (for example, 0.18 or more); all other tones are normal tones.

[0010] The remaining spectra are preferably quantized with the remaining bits using the same word length.

[0011] Tones may be extracted by first finding a local peak for each predetermined frequency range or predetermined number of spectra, for example every 25 spectra, and, when two local peaks lie within a predetermined number of spectra of each other, for example within 4 spectra, extracting only the single tone with the larger power as a tone. Tones may also be extracted based on the mean/median ratio of the spectra, or based on the tonality of the spectra.

[0012] The present invention also provides an audio signal encoding method having an input digital audio signal; a time-frequency transform that converts the signal into coefficients partitioned in time and frequency; a number representing the maximum number of bits that can be allocated to a set of the coefficients; and an optimum bit allocation method that divides the coefficients into a tonal part and a noisy part and allocates bits between these two parts.

[0013]

[Operation] Tonal components in the spectrum of the input audio signal are extracted and quantized; high-power frequency regions are then separated as power quantization units and quantized with enough bits that quantization noise is relatively inaudible; and finally the remaining spectra are quantized with the remaining bits. This gives an efficient bit distribution in which more bits go to the psychoacoustically more important parts, in order of perceptual importance, and realizes high-quality audio signal encoding at a low bit rate.

[0014] Classifying tones into normal tones and large tones, and allocating more bits to the large tones, further improves efficiency and sound quality.

[0015] Quantizing the remaining spectra with the same word length simplifies the bit allocation process.

[0016] Furthermore, extracting only single, non-overlapping tones as tones allows highly accurate tone detection. Extracting tones based on the mean/median ratio of the spectra, or on the tonality of the spectra, prevents false tone detection and allows appropriate tone detection.

[0017]

[Embodiments] Several preferred embodiments of the present invention are described below with reference to the drawings.

[0018] FIG. 1 schematically shows the general configuration of an audio encoder/decoder system to which the audio signal encoding method according to the present invention is applied.

[0019] In FIG. 1, an input audio signal supplied to input terminal 10 is sent to audio encoder 11, which operates according to psychoacoustic principles so as to minimize audible degradation of sound quality. The encoded data output from audio encoder 11 are transmitted, or recorded and reproduced, via a transmission or recording medium and sent to audio decoder 12. Audio decoder 12 reconstructs the audio signal from the input encoded data and outputs it from output terminal 16.

[0020] FIG. 2 is a block diagram showing a specific example of the psychoacoustic audio encoder 11.

[0021] In FIG. 2, time-frequency analysis block 21 first divides the input audio signal into time-frequency coefficients. These frequency components are then quantized in spectrum quantization block 23 using a block floating algorithm. The word length and scale factor of each block of time-frequency coefficients are determined by the bit allocation algorithm of bit allocation control block 22. Finally, the quantized spectral coefficients from spectrum quantization block 23 are taken from output terminal 13, the bit allocation table parameters from bit allocation control block 22 are taken from output terminal 14, and both are, for example, written to a recording medium or transmitted.

[0022] So that the quantization noise generated in spectrum quantization block 23 is inaudible, or only minimally audible, to the human ear, the bit allocation algorithm of bit allocation control block 22 is based on a psychoacoustic model supplied by psychoacoustic model output block 24. This model draws on the minimum audible curve, equal-loudness curves, simultaneous masking characteristics, temporal masking characteristics, and the like.

[0023] Time-frequency analysis block 21 is generally based on subband coding, transform coding, or a combination of the two. FIG. 3 shows an example of time-frequency analysis block 21 for such a combination (hybrid) of the two.

[0024] In FIG. 3, a quadrature mirror filter (hereinafter QMF) 31 first splits the input signal from input terminal 10 into a high-frequency band and a low-frequency band. The low-frequency band is then split further by a second QMF 32. The high-frequency band is delayed by delay circuit 33 to compensate for the processing delay of the second QMF 32. The bands may span, in order from the high side, for example 11 to 22 kHz, 5.5 to 11 kHz, and 0 to 5.5 kHz. The signal in each band is converted into frequency-domain spectral coefficients by modified discrete cosine transform (MDCT) blocks 34, 35, and 36, respectively, and taken out via output terminals 37, 38, and 39.

[0025] The spectrum quantization block 23 of the audio encoder system to which this embodiment is applied must quantize both the tonal part of the supplied spectral coefficients and the remaining part. These two parts are a tonal part, consisting of a variable number of small groups of spectra whose positions in the frequency domain may vary, and a set of larger groups (quantization units) whose bands and regions in the frequency domain are fixed.

[0026] FIG. 4 shows a block diagram of a specific example of spectrum quantization block 23. In FIG. 4, tonal component extraction circuit 41 extracts tonal components from the spectral coefficients supplied from input terminal 40. The extracted tones are then quantized by tonal component quantization circuit 42, and the remaining spectra are quantized, quantization unit by quantization unit, by noisy component quantization circuit 43.

[0027] Next, specific examples of the tonal and noisy components in the spectral coefficients described above are explained with reference to FIG. 5, which shows the spectral distribution of an audio signal transformed into the frequency domain by MDCT blocks 34, 35, and 36.

[0028] In the example of FIG. 5, the spectral coefficients drawn with broken lines are the tonal parts TC_A, TC_B, TC_C, and TC_D, and those drawn with solid lines are the noisy part. Because the tonal components are concentrated in a small number of spectral signals, as in FIG. 5, quantizing them accurately does not require many bits overall.

[0029] For the noisy components shown by solid lines in FIG. 5, the broken-line tonal components have been removed from the original spectrum in the band of each quantization unit, for example the five bands b1 to b5. The normalization coefficient of each quantization unit is therefore small, so the quantization noise produced remains small even with few bits. In practice the entire spectrum of the audio signal is divided into, for example, about 25 bands, following so-called critical bandwidths, which take human hearing into account and become wider toward higher frequencies.

[0030] The bit allocation control block 22 of FIG. 2 must then distribute all of the allocatable bits between the tonal and noisy components in a psychoacoustically meaningful way.

[0031] FIG. 6 is a flowchart schematically showing the main part of an example of the bit allocation method used in the audio signal encoding method of this embodiment. The flowchart shows the order in which bits are allocated to each part.

[0032] First, in step S62, the tonal components of the signal are identified, and tones are extracted and quantized. These tones are called local tones because they are detected with a local energy criterion, applied per local frequency range defined as a predetermined number of spectra. A local tone is defined as a group of spectra that contains most of the signal energy within the local frequency range. Each tone is classified as either a normal tone or a large tone. A large tone has more energy than a normal tone and is therefore quantized with extra bits per spectrum. The tonality of the signal is also calculated: if the total energy contained in the tone components makes up most of the total energy of the signal, the signal is regarded as tonal. When the signal is regarded as tonal, extra bits are allocated to the tones.

[0033] Once the tone components have been detected and extracted, the remaining important spectral components are identified by isolating the high-power frequency regions of the signal and quantizing them with reasonable precision. This power can be quantized either as tones or as so-called power quantization units. In this algorithm, power tones are detected first, in step S63, and power quantization units are then detected in step S64. If a single tone that the preceding method did not detect as a tone contains a significant proportion of the signal power, it may additionally be extracted, both so that it can be quantized accurately and so that the remaining spectra in its quantization unit can be quantized more accurately. If a single quantization unit has a high power density, that is, power per spectrum, it is classified as a power quantization unit and quantized with enough bits that quantization noise is relatively inaudible.

[0034] Finally, in step S65, the remaining spectra are quantized as accurately as possible. Note that at this point only unmasked spectra are considered. Normally the spectra are quantized by simply allocating the remaining bits to the quantization units in order of power, but in some cases tones may also be extracted from quantization units that contain many spectra. Doing so lowers the scale factor of the remaining spectra, which are then quantized more accurately at the same word length. Such tones are called scale factor tones.
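The usual case in step S65, handing the leftover bits to quantization units in order of power, can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout, the function name `allocate_remaining`, the single shared `word_length` (suggested by paragraph [0010]), and the choice to skip a unit that does not fit rather than stop are all hypothetical, not taken from the patent.

```python
def allocate_remaining(units, bits_left, word_length):
    """Greedy sketch: give unmasked units a common word length, strongest first.

    units: list of (unit_id, power, n_spectra) for unmasked quantization units.
    Returns (allocation, bits_left), where allocation maps unit_id -> word length.
    """
    allocation = {}
    # Visit units from highest to lowest power.
    for unit_id, _power, n_spectra in sorted(units, key=lambda u: u[1], reverse=True):
        cost = word_length * n_spectra  # every spectrum in the unit uses word_length bits
        if cost > bits_left:
            continue  # assumption: skip this unit, a smaller one may still fit
        allocation[unit_id] = word_length
        bits_left -= cost
    return allocation, bits_left
```

For instance, `allocate_remaining([("a", 10.0, 4), ("b", 50.0, 8), ("c", 5.0, 2)], 30, 3)` serves unit "b" first, cannot afford "a", and spends the rest on "c".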

[0035] FIG. 7 is a flowchart showing a specific example of the main stages of the bit allocation method used in this embodiment.

[0036] In the first step, S72, the masking threshold is calculated before any bits are allocated. The masking threshold is calculated by summing the power within each critical band and then applying a masker spreading function. The spreading function determines the masking contributed by adjacent critical bands; an example is shown in FIG. 8. Signal components that fall below the final masking function (the hatched regions in the figure) are inaudible and need not be encoded.
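The masking calculation of step S72 can be illustrated with a small sketch. The spreading weights below are hypothetical placeholders and are not the curve of FIG. 8, which is not reproduced in the text; the function name `masking_threshold` and the representation of the spreading function as a dictionary of band offsets are likewise assumptions made only for illustration.

```python
def masking_threshold(band_powers, spread):
    """Spread each band's power into neighbouring bands and sum the contributions.

    band_powers: linear power already summed within each critical band.
    spread: dict mapping band offset (masked band minus masker band) to a
            linear weight; offsets not listed contribute nothing.
    """
    n = len(band_powers)
    threshold = [0.0] * n
    for masker, power in enumerate(band_powers):
        for offset, weight in spread.items():
            target = masker + offset
            if 0 <= target < n:
                threshold[target] += power * weight
    return threshold


# Hypothetical weights chosen for readability, NOT the values of FIG. 8.
EXAMPLE_SPREAD = {-1: 0.125, 0: 0.5, 1: 0.25}
```

A component in a band would then be treated as masked when its power lies below the threshold computed for that band.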

[0037] After the masking calculation, the algorithm proceeds to step S73, detection of so-called local tones. A local tone is defined as a group of spectra that contains most of the local energy of the signal. A tone is defined as a group of, for example, 5 spectra, and the local frequency range as, for example, 25 spectra. The ratio of the signal power in these two ranges is calculated and is called the local rate R_loc. In FIG. 9, step S92 calculates the local power P_loc over, for example, 25 spectra; step S93 calculates the tone power P_tone over, for example, 5 spectra; and step S94 calculates the local rate as their ratio, R_loc = P_tone / P_loc.
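Steps S92 to S94 amount to two windowed sums and a division. The sketch below assumes that both windows are centred on the same spectrum and are clipped at the edges of the array; the text does not say how the windows are positioned, so that placement, like the function name `local_rate`, is an assumption.

```python
def local_rate(power, center, tone_width=5, local_width=25):
    """Return R_loc = P_tone / P_loc for a candidate tone centred on `center`.

    power: per-spectrum power values.
    """
    def window_sum(width):
        half = width // 2
        lo = max(0, center - half)
        hi = min(len(power), center + half + 1)
        return sum(power[lo:hi])

    p_loc = window_sum(local_width)   # step S92: power of the local range
    p_tone = window_sum(tone_width)   # step S93: power of the tone group
    return p_tone / p_loc if p_loc > 0 else 0.0  # step S94
```

With 25 spectra of power 1 and a single spectrum of power 100 at index 12, `local_rate(power, 12)` evaluates 104/124, which is above the example threshold of 0.75; a flat spectrum gives 5/25.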

[0038] The tone threshold is set to, for example, 0.75, and any group of, for example, 5 spectra that contains 75% of the energy of the local frequency range is provisionally flagged as a tone. Note, however, that a single high-energy spectrum can cause several neighboring groups to be classified as tones. Therefore only non-overlapping tones are allowed, that is, tones whose center spectra are at least 4 spectra apart. If two or more tones lie within 4 spectra of each other, only the tone with the largest local rate R_loc is classified as a tone.
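The thresholding and overlap rule can be sketched as a greedy selection. The strongest-first ordering is one possible way to make the tone with the largest R_loc win a conflict, and treating a spacing of exactly 4 spectra as acceptable is an interpretation of "at least 4 spectra apart"; both, along with the function name `select_tones`, are assumptions.

```python
def select_tones(candidates, threshold=0.75, min_spacing=4):
    """Pick non-overlapping tones from (center_index, r_loc) candidates.

    Returns the accepted centre indices in ascending order.
    """
    flagged = [(c, r) for c, r in candidates if r >= threshold]
    accepted = []
    # Strongest candidates first, so the largest R_loc wins any conflict.
    for center, _rate in sorted(flagged, key=lambda cr: cr[1], reverse=True):
        if all(abs(center - kept) >= min_spacing for kept in accepted):
            accepted.append(center)
    return sorted(accepted)
```

For example, candidates centred at 10 (0.80) and 12 (0.90) conflict, so only 12 survives, while a separate candidate at 20 (0.76) is kept.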

[0039] The algorithm above depends only on a ratio of powers and does not consider how power is distributed within the tone spectra. A tone therefore does not necessarily contain a single tone peak; it may contain five spectra of roughly equal magnitude, each holding, for example, 15% of the power of the local frequency range. In that case the spectra are not a true tone and are better quantized as an ordinary quantization unit.

[0040] To detect this condition, the mean/median rate R_mean_med is calculated as in the flowchart of FIG. 10. It is the ratio of the mean of the local spectra to the median of the local spectra. The median is the value of middle magnitude among the values of the quantization unit. Step S102 of FIG. 10 calculates the median of, for example, the 25 local spectra: when the 25 spectra are sorted in ascending order, the median is the 13th value. Step S103 calculates the mean, generally the arithmetic mean, of, for example, the 25 local spectra. Step S104 then uses the median and mean from steps S102 and S103 to calculate the mean/median value.

[0041] The mean/median rate R_mean_med is thus defined as the ratio of the local mean to the local median. It is calculated for each tone selected by the power-based method described above. A tone whose mean/median rate R_mean_med is, for example, 2.5 or less is regarded as a falsely detected tone and is included in the tonal component of quantization.
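The mean/median test of FIG. 10 can be written with Python's standard `statistics` module. Whether the "local spectra" values are powers or magnitudes is not stated, so the input is simply a list of non-negative values; the function names and the handling of a zero median are assumptions for illustration.

```python
from statistics import mean, median


def mean_median_rate(local_spectra):
    """R_mean_med: mean of the local spectra divided by their median."""
    med = median(local_spectra)            # step S102 (13th of 25 sorted values)
    if med == 0:
        return float("inf")                # assumption: treat a zero median as maximally peaky
    return mean(local_spectra) / med       # steps S103 and S104


def is_false_tone(local_spectra, limit=2.5):
    """True when the rate is at or below the example limit of 2.5."""
    return mean_median_rate(local_spectra) <= limit
```

A flat local range gives a rate of 1 and is flagged, whereas 24 values of 1 plus a single value of 100 give a rate of 4.96 and are not.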

[0042] Returning to FIG. 7, with local tone detection in step S73 complete, the next step, S74, calculates the tonality of the signal. Specifically, if the sum of the energy in the local tones is, for example, 98% or more of the total energy, the signal is regarded as tonal.
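The tonality decision of step S74 is a single ratio test. A minimal sketch, assuming the energies of the accepted local tones are already available as a list; the function name `is_tonal_signal` is hypothetical.

```python
def is_tonal_signal(tone_energies, total_energy, threshold=0.98):
    """True when the local tones hold at least `threshold` of the total energy."""
    if total_energy <= 0:
        return False  # assumption: an empty or silent block is not tonal
    return sum(tone_energies) / total_energy >= threshold
```

Two tones holding 60 and 39 units out of a total of 100 make the signal tonal; 50 and 30 do not.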

[0043] The process then proceeds to step S75, where bits are allocated to the local tones. The number of bits allocated depends on the tonality of the signal, calculated as above, and on the proportion of the total energy that each tone contains. More specifically, the total energy is calculated as the sum of the energy of all spectra. The tone energy is the same as before. The total rate R_tot is defined as the ratio of the tone energy to the total energy.

[0044] The calculation of the total rate R_tot is described with reference to FIG. 11. Step S112 calculates the total power P_tot of all spectra, step S113 calculates the tone power P_tone of the 5 spectra, and step S114 obtains the total rate as R_tot = P_tone / P_tot.

[0045] If the total rate R_tot obtained in this way is 0.18 (18% of the total energy) or more, the tone is regarded as a large tone; otherwise it is a normal tone. Each tone is allocated, for example, the following number of bits:

Large tone: tonal component 5 bits, non-tonal component 4 bits
Normal tone: tonal component 4 bits, non-tonal component 3 bits

These values are one specific example of the number of bits allocated to each spectrum within a tone.
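The classification and the example bit table can be combined in a short lookup. The sketch reads "tonal component" and "non-tonal component" in the table as "signal judged tonal" and "signal judged non-tonal" in step S74; that reading, and the names `tone_bits` and `BITS_PER_SPECTRUM`, are assumptions.

```python
LARGE_TONE_RATE = 0.18  # example threshold on R_tot

# (tone kind, signal judged tonal?) -> example bits per spectrum
BITS_PER_SPECTRUM = {
    ("large", True): 5,
    ("large", False): 4,
    ("normal", True): 4,
    ("normal", False): 3,
}


def tone_bits(tone_power, total_power, signal_is_tonal):
    """Classify a tone by R_tot = P_tone / P_tot and look up its word length."""
    r_tot = tone_power / total_power if total_power > 0 else 0.0
    kind = "large" if r_tot >= LARGE_TONE_RATE else "normal"
    return kind, BITS_PER_SPECTRUM[(kind, signal_is_tonal)]
```

A tone holding 20% of the total power in a non-tonal signal comes out as a large tone with 4 bits per spectrum, and one holding 10% in a tonal signal as a normal tone with 4 bits.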

[0046] Next, power tones are searched for in step S76 of FIG. 7. A power tone is defined as a tone that contains, for example, 40% or more of the total energy of the signal. The total energy is calculated as the sum of the energy of all spectra. The tone energy is calculated over 5 spectra, as described above. The total rate R_tot2 used here is defined as the ratio of the tone energy to the total energy. This is the same kind of test as the one described above for distinguishing large tones from normal tones.

[0047] Note that this energy calculation is applied only to spectra that the local tone detection method has already determined to be tone spectra. Spectra with a total rate R_tot2 of, for example, 0.4 or more are provisionally detected as power tones. As before, a single high-energy spectrum can cause several neighboring groups to be classified as tones, so only non-overlapping power tones are accepted, that is, tones whose center spectra are, for example, at least 4 spectra apart. If two or more tones lie within 4 spectra of each other, only the tone with the largest total rate R_tot2 is classified as a power tone.

【００４８】次のステップＳ７７においては、このよう
にして検出されたパワートーンにビットを割り当てる。
パワートーンは必ず大トーンとして量子化する。もし何
らかのパワートーンを検出した場合には、前述のトーナ
リティ検出で非トーンとしてそのブロックにフラグを立
てるが、トーナリティ測定のためにはローカルトーンに
信号エネルギの例えば９８％が含まれる必要がある点に
に注意されたい。従ってパワートーンは例えば４ビット
に量子化される。In the next step S77, bits are assigned to the power tones thus detected. A power tone is always quantized as a large tone. If any power tone is detected, the block will have been flagged as non-tonal by the tonality detection described above; note that the tonality measure requires the local tones to contain, for example, 98% of the signal energy. The power tone is therefore quantized with, for example, 4 bits.

【００４９】次に、本アルゴリズムによれば、ステップ
Ｓ７８に進んでいわゆるパワー量子化ユニットを検出
し、ステップＳ７９でビットを割り当てる。量子化ユニ
ットはトーンとして量子化されなかった主要な信号成分
を含む。パワー量子化ユニットとして量子化されない量
子化ユニットは一般に重要でないスペクトル成分または
マスクした信号成分を含む。本アルゴリズムでは、例え
ば少なくとも３ビットまたは４ビットでパワー量子化ユ
ニットを量子化しようとする。Next, the algorithm proceeds to step S78, where so-called power quantization units are detected, and bits are assigned to them in step S79. A power quantization unit contains major signal components that were not quantized as tones. Quantization units that are not quantized as power quantization units generally contain insignificant spectral components or masked signal components. The algorithm attempts to quantize power quantization units with, for example, at least 3 or 4 bits.

【００５０】図７に示したように、パワー量子化ユニッ
トの選択は反復的に行なう。本アルゴリズムでは、ステ
ップＳ８０に示すように、これ以上パワー量子化ユニッ
トが見つからなくなるまで、または別のパワー量子化ユ
ニットに割り当てるのに充分なビットがなくなるまで、
繰り返してパワー量子化ユニットを検出し、また必要な
だけビットを割り当てる。As shown in FIG. 7, the selection of power quantization units is performed iteratively. As indicated at step S80, the algorithm repeatedly detects power quantization units and allocates the required bits until no more power quantization units are found or until there are not enough bits left to allocate to another power quantization unit.

【００５１】各繰り返し毎に、すでに検出したパワー量
子化ユニットを除いて、各量子化ユニットのパワー強度
を計算する。量子化ユニットのパワー密度Ｄ_QUは、トー
ンスペクトルを除く量子化ユニットの全スペクトルのパ
ワーＰ_QUを加算し量子化ユニット内のスペクトル総数で
除算することにより計算される。パワー密度を残りのス
ペクトルについて計算し、これを先に検出したパワー量
子化ユニットとトーンとして分類したスペクトル以外の
全スペクトルと定義する。At each iteration, the power density of each quantization unit is calculated, excluding the power quantization units already detected. The power density D_QU of a quantization unit is calculated by summing the power P_QU of all spectra in the quantization unit except the tone spectra and dividing by the total number of spectra in the quantization unit. The power density is also calculated for the remaining spectrum, which is defined as all spectra other than those in previously detected power quantization units and those classified as tones.
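The definition of D_QU can be sketched as below. Note that, following the text, tone spectra are left out of the power sum but the divisor is still the total number of spectra in the unit. The function name, the half-open `(start, stop)` index range and the `tone_indices` set are assumptions made for illustration.

```python
def unit_power_density(spec, unit_range, tone_indices):
    """Power density D_QU of one quantization unit.

    spec: spectral amplitudes of the whole block.
    unit_range: (start, stop) half-open index range of the unit.
    tone_indices: set of indices already classified as tone spectra.
    """
    start, stop = unit_range
    # P_QU: power of the unit's spectra, skipping tone spectra.
    p_qu = sum(spec[i] * spec[i] for i in range(start, stop)
               if i not in tone_indices)
    # Divide by the total number of spectra in the unit, tones included.
    return p_qu / (stop - start)
```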

【００５２】図１２には、残りのスペクトルのパワー密
度Ｄ_remの計算を示すフローチャートを示している。FIG. 12 is a flowchart showing the calculation of the power density D_rem of the remaining spectrum.

【００５３】この図１２において、最初のステップＳ１
２２ではスペクトルの総数をＮ_SPECとし、ステップＳ１
２３、Ｓ１２４、Ｓ１２５ではそれぞれ初期値設定とし
て、残りパワーＰ_remを０．０とし、スペクトル番号ｉ
を０とし、残りスペクトルのカウント値ｎを０としてい
る。次のステップＳ１２６では、ｉ番目のスペクトルsp
ec[ｉ]は上記パワー量子化ユニットに分類されるか否か
を判別し、ＹＥＳのときはステップＳ１３０に、ＮＯの
ときはステップＳ１２７に、それぞれ進んでいる。ステ
ップＳ１２７では、ｉ番目のスペクトルspec[ｉ]は上記
トーンに分類されるか否かを判別し、ＹＥＳのときはス
テップＳ１３０に、ＮＯのときはステップＳ１２８にそ
れぞれ進んでいる。ステップＳ１２８では、残りパワー
Ｐ_remをそれまでの残りパワーＰ_remといまのスペクト
ルのパワーspec[ｉ]×spec[ｉ]との和に置き換えて、ス
テップＳ１２９に進み、残りスペクトルのカウント値ｎ
をインクリメントしている。次のステップＳ１３０で
は、上記ｉが上記総数Ｎ_SPECに達したか否かを判別して
おり、ＮＯのときはステップＳ１３２に進んでｉをイン
クリメントして上記ステップＳ１２６に戻り、ＹＥＳの
ときはステップＳ１３１に進んで残りのスペクトルのパ
ワー密度Ｄ_remをＰ_rem／ｎにより求めて出力してい
る。In FIG. 12, the total number of spectra is set to N_SPEC in the first step S122. In steps S123, S124, and S125, initial values are set: the remaining power P_rem is set to 0.0, the spectrum index i is set to 0, and the count n of remaining spectra is set to 0. In the next step S126, it is determined whether the i-th spectrum spec[i] belongs to a power quantization unit; if YES, the process proceeds to step S130, and if NO, to step S127. In step S127, it is determined whether the i-th spectrum spec[i] is classified as a tone; if YES, the process proceeds to step S130, and if NO, to step S128. In step S128, the remaining power P_rem is replaced with the sum of its previous value and the power spec[i] × spec[i] of the current spectrum, and the process proceeds to step S129, where the count n of remaining spectra is incremented. In the next step S130, it is determined whether i has reached the total number N_SPEC. If NO, the process proceeds to step S132, where i is incremented, and returns to step S126; if YES, the process proceeds to step S131, where the power density D_rem of the remaining spectrum is obtained as P_rem / n and output.

【００５４】すなわちＤ_remは、これらの残りのスペク
トルの総パワーＰ_remを計算したのち、パワー計算に含
めたスペクトル数ｎで除算して求めている。この数ｎは
一般にスペクトル総数Ｎ_SPECと等しくないことに留意さ
れたい。That is, D_rem is obtained by calculating the total power P_rem of these remaining spectra and then dividing it by the number n of spectra included in the power calculation. Note that n is generally not equal to the total number of spectra N_SPEC.
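The flowchart of FIG. 12 amounts to a single pass over the spectrum. A minimal Python sketch follows; the per-spectrum boolean lists `in_power_unit` and `is_tone` and the guard for n = 0 are assumptions of this sketch and do not appear in the text.

```python
def remaining_power_density(spec, in_power_unit, is_tone):
    """Power density D_rem of the remaining spectrum (FIG. 12).

    spec: spectral amplitudes; in_power_unit / is_tone: per-spectrum flags.
    """
    p_rem = 0.0  # S123: remaining power
    n = 0        # S125: count of remaining spectra
    for i in range(len(spec)):              # S124, S130, S132: loop over i
        if in_power_unit[i] or is_tone[i]:  # S126, S127: skip classified spectra
            continue
        p_rem += spec[i] * spec[i]          # S128: accumulate power
        n += 1                              # S129: count this spectrum
    # S131: D_rem = P_rem / n (returning 0.0 for n == 0 is an assumption)
    return p_rem / n if n > 0 else 0.0
```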

【００５５】本アルゴリズムにおいては、それぞれの量
子化ユニットを検査してパワー量子化ユニットの要件に
適合するかを調べる。検査結果に従ってパワー量子化ユ
ニットにビットを割り当てる。The algorithm examines each quantization unit to determine whether it meets the requirements for a power quantization unit, and allocates bits to power quantization units according to the result.

【００５６】ここで図１３は、上記検出とビット割り当
ての具体例を表わすフローチャートを示している。FIG. 13 is a flow chart showing a concrete example of the above detection and bit allocation.

【００５７】この図１３において、まずステップＳ１４
２では、密度レートＲ_dens、すなわち、上記残りスペク
トルパワー密度Ｄ_remに対する量子化ユニットのパワー
密度Ｄ_QUの比を計算する。また次のステップＳ１４３で
は、トータルレートＲ_tot2を計算する。これは、全ての
トーンおよびパワー量子化ユニットを含めた、スペクト
ル全体の総エネルギＰ_totに対する、量子化ユニット内
のエネルギＰ_QUの比と定義される。In FIG. 13, the density rate R_dens, that is, the ratio of the power density D_QU of the quantization unit to the remaining spectrum power density D_rem, is first calculated in step S142. In the next step S143, the total rate R_tot2 is calculated. It is defined as the ratio of the energy P_QU in the quantization unit to the total energy P_tot of the whole spectrum, including all tones and power quantization units.

【００５８】次のステップＳ１４４では、上記密度レー
トＲ_densが例えば１４より大きいか否かを判別してお
り、ＹＥＳのときはステップＳ１４５に、ＮＯのときは
ステップＳ１４７にそれぞれ進んでいる。ステップＳ１
４５では、上記トータルレートＲ_tot2が例えば０．１８
よりも大きいか否かを判別しており、ＹＥＳのときはス
テップＳ１４７に進んで量子化ユニットに４ビットを割
り当て、ＮＯのときはステップＳ１４８に進んで量子化
ユニットに３ビットを割り当てている。ステップＳ１４
６では、上記トータルレートＲ_tot2が例えば０．３５よ
りも大きいか否かを判別しており、ＹＥＳのときはステ
ップＳ１４８に進んで量子化ユニットに３ビットを割り
当て、ＮＯのときはステップＳ１４９に進んでいる。ス
テップＳ１４９では、残りパワーレートＲ_remをＰ_QU／
Ｐ_remにより計算し、ステップＳ１５０にてこのＲ_rem
が例えば０．９より大きいか否かを判別し、ＹＥＳのと
きはステップＳ１４８に進んで量子化ユニットに３ビッ
トを割り当て、ＮＯのときは処理を終了している。In the next step S144, it is determined whether the density rate R_dens is larger than, for example, 14; if YES, the process proceeds to step S145, and if NO, to step S146. In step S145, it is determined whether the total rate R_tot2 is larger than, for example, 0.18; if YES, the process proceeds to step S147, where 4 bits are allocated to the quantization unit, and if NO, to step S148, where 3 bits are allocated to the quantization unit. In step S146, it is determined whether the total rate R_tot2 is larger than, for example, 0.35; if YES, the process proceeds to step S148, where 3 bits are allocated to the quantization unit, and if NO, to step S149. In step S149, the remaining power rate R_rem is calculated as P_QU / P_rem, and in step S150 it is determined whether R_rem is larger than, for example, 0.9; if YES, the process proceeds to step S148, where 3 bits are allocated to the quantization unit, and if NO, the process ends.

【００５９】すなわち、この図１３の具体例では、上記
密度レートＲ_densが１４より大きくかつ上記トータルレ
ートＲ_tot2が０．１８より大きい場合には、量子化ユニ
ットに４ビットを割り当てる。これ以外の場合で、上記
Ｒ_densが１４より大きくかつ上記Ｒ_tot2が０．１８以下
の場合、又は上記Ｒ_densが１４以下でかつ上記Ｒ_tot2が
０．３５より大きい場合には３ビットを割り当てる。最
後に、上述のいずれの例にも当てはまらない場合には、
上記Ｒ_remをＰ_QU／Ｐ_remにより計算し、このＲ_remが
０．９より大きい場合には量子化ユニットに３ビットを
割り当て、それ以外では、パワー量子化ユニットとして
検出せずに次の量子化ユニットを検査する。図７に図示
したように、パワー量子化ユニットとして量子化ユニッ
トが検出されなくなるまで、この反復ループを繰返す。That is, in the specific example of FIG. 13, when the density rate R_dens is larger than 14 and the total rate R_tot2 is larger than 0.18, 4 bits are allocated to the quantization unit. Otherwise, when R_dens is larger than 14 and R_tot2 is 0.18 or less, or when R_dens is 14 or less and R_tot2 is larger than 0.35, 3 bits are allocated. Finally, when none of these cases applies, R_rem is calculated as P_QU / P_rem; if R_rem is larger than 0.9, 3 bits are allocated to the quantization unit, and otherwise the quantization unit is not detected as a power quantization unit and the next quantization unit is examined. As shown in FIG. 7, this loop is repeated until no further quantization unit is detected as a power quantization unit.
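The decision rules of this specific example can be written as one small function. The sketch below uses the example thresholds from the text (14, 0.18, 0.35, 0.9); the function name, the argument list and the use of `None` for "not a power quantization unit" are assumptions, and the inputs are assumed to be positive so that no division by zero occurs.

```python
from typing import Optional


def power_unit_bits(d_qu: float, d_rem: float,
                    p_qu: float, p_tot: float, p_rem: float) -> Optional[int]:
    """Return 4 or 3 bits for a power quantization unit, else None."""
    r_dens = d_qu / d_rem  # density rate
    r_tot2 = p_qu / p_tot  # total rate
    if r_dens > 14:
        # Dense unit: 4 bits if it also carries much of the total energy.
        return 4 if r_tot2 > 0.18 else 3
    if r_tot2 > 0.35:
        return 3
    r_rem = p_qu / p_rem   # remaining power rate
    if r_rem > 0.9:
        return 3
    return None            # not detected as a power quantization unit
```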

【００６０】次に、もし残りのビットがあれば、残りの
ビットを用いて、可能な方法により残りのマスクされて
いないスペクトルを量子化する。ここで、信号の重要な
スペクトル成分が、トーンまたはパワー量子化ユニット
として識別されたものと仮定する。本アルゴリズムにお
いては、ステップＳ８１に示すように、例えば少なくと
も２ビットを残りのマスクされていないスペクトルに割
り当てることを目標としている。しかし、これを行うた
めに十分なビット数が残っていない場合も有り得るの
で、本アルゴリズムでは量子化ユニットのパワー密度の
順番に従って、各量子化ユニットにビットを配分してい
る。Next, any remaining bits are used to quantize the remaining unmasked spectra as far as possible. At this point, the significant spectral components of the signal are assumed to have been identified as tones or power quantization units. As indicated at step S81, the algorithm aims to allocate, for example, at least 2 bits to the remaining unmasked spectra. However, there may not be enough bits left to do this, so the algorithm distributes bits to the quantization units in order of their power density.

【００６１】パワー密度は、最初にマスクされていない
量子化ユニットパワーを計算し、次にパワーを量子化ユ
ニット内のスペクトル数で除算することにより計算す
る。パワー計算にはトーンスペクトルとして検出されな
かったマスキングされないスペクトルのみを含む。マス
キングされないスペクトルは大きさがその量子化ユニッ
トのマスキング閾値より大きなスペクトルからなる。さ
らにパワー強度の減少する順番に量子化ユニットを並べ
替える。本アルゴリズムは、量子化ユニットパワーが０
以上のマスキングされないスペクトルを有し、これまで
にビット割り当てしていない、すなわちパワー量子化ユ
ニットとしてビットを割り当てていない各量子化ユニッ
トに対して、それぞれ２ビットの割り当てを試みる。The power density is calculated by first calculating the unmasked power of the quantization unit and then dividing that power by the number of spectra in the quantization unit. The power calculation includes only unmasked spectra that were not detected as tone spectra. Unmasked spectra are those whose magnitude is greater than the masking threshold of their quantization unit. The quantization units are then sorted in order of decreasing power density. The algorithm attempts to allocate 2 bits to each quantization unit that has unmasked spectra with a quantization unit power of 0 or more and that has not yet been allocated bits, that is, has not been allocated bits as a power quantization unit.
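A rough sketch of this stage is given below. Several details are assumptions rather than statements of the text: the cost of a unit is taken as the word length times its number of spectra (side information is ignored), units with no unmasked power are skipped, a unit that does not fit the budget is passed over rather than ending the loop, and the caller is assumed to pass only units that were not already allocated as power quantization units. All names are hypothetical.

```python
def unmasked_power_density(spec, threshold, tone_indices):
    """Power of unmasked, non-tone spectra divided by the unit's spectrum count."""
    power = sum(x * x for i, x in enumerate(spec)
                if abs(x) > threshold and i not in tone_indices)
    return power / len(spec)


def allocate_two_bits(units, bits_left, word_length=2):
    """Try to give word_length bits to each unit, densest units first.

    units: dict mapping a unit name to (spec, masking_threshold, tone_indices).
    Returns (allocation, bits_left), where allocation maps names to word lengths.
    """
    densities = {name: unmasked_power_density(*u) for name, u in units.items()}
    allocation = {}
    # Sort in order of decreasing power density, as described in the text.
    for name in sorted(densities, key=densities.get, reverse=True):
        cost = word_length * len(units[name][0])  # assumed cost model
        if densities[name] > 0.0 and cost <= bits_left:
            allocation[name] = word_length
            bits_left -= cost
    return allocation, bits_left
```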

【００６２】さらにビットが残っている場合、本アルゴ
リズムは非パワー量子化ユニットの量子化の改善を試み
る。これは、ステップＳ８２に示すように、剰余ビット
（２から３ビット）を量子化ユニットに加えることによ
り、またはいわゆるスケールファクタトーンを割り当て
ることにより行なう。スケールファクタトーンを用い
て、最大値を排除することにより量子化ユニット内のス
ケールファクタを減少させる。つまり残りの値はさらに
ビットを付加しなくともより正確に量子化されることに
なる。本アルゴリズムでは、剰余ビットの追加とスケー
ルファクタトーンの割り当てのどちらを行なうかの判断
を、いずれかの動作を実行するのに必要なビット数に従
って、またどちらの動作の方が少ない量子化エラーを生
成するかによって行なっている。If bits still remain, the algorithm attempts to improve the quantization of the non-power quantization units. As indicated at step S82, this is done either by adding extra bits to a quantization unit (going from 2 to 3 bits) or by assigning a so-called scale factor tone. A scale factor tone reduces the scale factor of a quantization unit by removing its maximum value, so that the remaining values are quantized more accurately without additional bits. The algorithm chooses between adding extra bits and assigning a scale factor tone according to the number of bits each operation requires and which operation produces the smaller quantization error.

【００６３】最後に、本アルゴリズムでは、ステップＳ
８３において、残りビットがあれば残りの量子化ユニッ
トに割り当てる。最初に３ビットパワー量子化ユニット
に割り当て（４ビットに高品質化）、次いでマスキング
していないスペクトルを有する他の量子化ユニットに割
り当ててから、最後に全ての残りの量子化ユニットに割
り当てる。Finally, in step S83, any bits still remaining are allocated to the remaining quantization units: first to the 3-bit power quantization units (upgrading them to 4 bits), then to the other quantization units that contain unmasked spectra, and finally to all remaining quantization units.

【００６４】本アルゴリズムが何らかの残りビットと共
に最終段階にたどり着くことは希であり、本アルゴリズ
ムのこの部分は特に必須なものではなく、一般に、極度
にトーン性の高い信号の場合に到達するのみである。The algorithm rarely reaches this final stage with any bits remaining; this part of the algorithm is not essential and is generally reached only for extremely tonal signals.

【００６５】なお、本発明は上記実施例のみに限定され
るものではなく、例えば、分割帯域数やトーン性識別の
ためのスペクトルの本数、あるいは各割当ビット数等は
上記具体的な数値に限定されず、ビットレートや音質等
を考慮して適宜設定すればよいことは勿論である。The present invention is not limited to the embodiment described above. For example, the number of divided bands, the number of spectra used for tonality identification, and the numbers of allocated bits are not limited to the specific values given above and may of course be set as appropriate in consideration of the bit rate, sound quality, and the like.

【００６６】[0066]

【発明の効果】本発明に係るオーディオ信号符号化方法
によれば、入力オーディオ信号のスペクトルの内のトー
ン性成分を抽出して量子化し、次に高パワー周波数領域
をパワー量子化ユニットとして分離して相対的に量子化
雑音が聴取されないようなビット数で量子化し、最後に
残ったスペクトルを残ったビット数で量子化することに
より、聴感上で重要な部分から順に多くのビットを割り
当てるような効率的なビット配分による符号化が行え、
低いビットレートで高品質のオーディオ信号符号化を実
現できる。According to the audio signal encoding method of the present invention, tonal components in the spectrum of the input audio signal are extracted and quantized; high-power frequency regions are then separated as power quantization units and quantized with a number of bits at which quantization noise is relatively inaudible; and finally the remaining spectra are quantized with the remaining bits. Bits are thus allocated efficiently, with more bits going to the perceptually more important parts first, so that high-quality audio signal encoding can be realized at a low bit rate.

【００６７】また、トーンを通常トーンと大トーンとに
分類し、大トーンにより多くのビットを割り当てること
により、さらに符号化効率を上げることができ、同じビ
ットレートでは音質を高めることができる。Further, by classifying the tones into normal tones and large tones and allocating more bits to the large tones, the coding efficiency can be further improved, and the sound quality can be enhanced at the same bit rate.

【００６８】また、残りのスペクトルを同一ワード長で
量子化することにより、ビット配分処理を簡略化でき
る。Further, the bit allocation process can be simplified by quantizing the remaining spectrum with the same word length.

【００６９】さらに、所定数スペクトル範囲に２つのロ
ーカルピークが存在する場合には、パワーの大きな方の
単一のトーンをトーンとして抽出することにより、重複
したトーン検出が回避できる。また、スペクトルの平均
値／中央値の比に基づきトーンを抽出することや、上記
スペクトルのトーナリティに基づきトーンを抽出するこ
とにより、トーンの誤検出を防止でき、適切なトーン検
出が行える。Furthermore, when two local peaks exist within a predetermined range of spectra, extracting only the one with the larger power as a tone avoids duplicate tone detection. In addition, extracting tones based on the mean/median ratio of the spectrum, or based on the tonality of the spectrum, prevents erroneous tone detection and enables appropriate tone detection.

[Brief description of drawings]

【図１】本発明に係るオーディオ信号符号化方法の実施
例が適用されるオーディオエンコーダ／デコーダシステ
ムの概略構成を示すブロック図である。FIG. 1 is a block diagram showing a schematic configuration of an audio encoder / decoder system to which an embodiment of an audio signal encoding method according to the present invention is applied.

【図２】本発明に係るオーディオ信号符号化方法の実施
例が適用されるオーディオエンコーダの一例を示すブロ
ック図である。FIG. 2 is a block diagram showing an example of an audio encoder to which the embodiment of the audio signal encoding method according to the present invention is applied.

【図３】図２のエンコーダ内の時間−周波数分析ブロッ
ク２１の構造の一例を示すブロック回路図である。FIG. 3 is a block circuit diagram showing an example of the structure of the time-frequency analysis block 21 in the encoder of FIG. 2.

【図４】本発明に係るオーディオ信号符号化方法の実施
例に用いられるスペクトル量子化ブロックを示すブロッ
ク回路図である。FIG. 4 is a block circuit diagram showing a spectrum quantization block used in an embodiment of an audio signal coding method according to the present invention.

【図５】本発明の実施例のオーディオ信号符号化方法に
おけるオーディオ信号のスペクトルのトーン性部分とノ
イズ性部分の一例を説明するための図である。FIG. 5 is a diagram for explaining an example of a tone characteristic portion and a noise characteristic portion of the spectrum of the audio signal in the audio signal encoding method according to the embodiment of the present invention.

【図６】本発明の実施例に用いられるビット割り当て方
法の概要を示すフローチャートである。FIG. 6 is a flowchart showing an outline of a bit allocation method used in an embodiment of the present invention.

【図７】本発明の実施例に用いられるビット割り当て方
法の動作の一例を示すフローチャートである。FIG. 7 is a flowchart showing an example of the operation of the bit allocation method used in the embodiment of the present invention.

【図８】典型的なマスキング展開関数を表わすグラフで
ある。FIG. 8 is a graph showing a typical masking expansion function.

【図９】上記図７のビット割り当て方法の動作における
トーン測定のためのローカルレート計算の一例を示すフ
ローチャートである。FIG. 9 is a flowchart showing an example of the local rate calculation for tone measurement in the operation of the bit allocation method of FIG. 7.

【図１０】上記図７のビット割り当て方法の動作におけ
るトーン測定のための平均／中央値レートの計算の一例
を示すフローチャートである。FIG. 10 is a flowchart showing an example of the mean/median rate calculation for tone measurement in the operation of the bit allocation method of FIG. 7.

【図１１】上記図７のビット割り当て方法の動作におけ
るトーン測定のためのトータルレート計算の一例を示す
フローチャートである。FIG. 11 is a flowchart showing an example of the total rate calculation for tone measurement in the operation of the bit allocation method of FIG. 7.

【図１２】上記図７のビット割り当て方法の動作におけ
るパワー量子化ユニット検出に用いる残りスペクトルの
パワー密度計算の一例を示すフローチャートである。FIG. 12 is a flowchart showing an example of the power density calculation of the remaining spectrum used for power quantization unit detection in the operation of the bit allocation method of FIG. 7.

【図１３】上記図７のビット割り当て方法の動作におけ
る各種検査結果に応じたビット割当の具体例を示すフロ
ーチャートである。FIG. 13 is a flowchart showing a specific example of bit allocation according to the various test results in the operation of the bit allocation method of FIG. 7.

[Explanation of symbols]

１１オーディオエンコーダ２１時間−周波数分析ブロック２２ビット割当制御ブロック２３スペクトル量子化ブロック２４音響心理学的モデル出力ブロック４１トーン性成分抽出回路４２トーン性成分量子化回路４３ノイズ性成分量子化回路
11: audio encoder
21: time-frequency analysis block
22: bit allocation control block
23: spectrum quantization block
24: psychoacoustic model output block
41: tonal component extraction circuit
42: tonal component quantization circuit
43: noise component quantization circuit

(58) Fields searched (Int.Cl.⁷, DB name): G10L 19/02

Claims

(57) [Claims]

1. An audio signal encoding method in which an input audio signal is subjected to a predetermined transform to obtain a plurality of spectra, and bits are allocated to each spectrum according to auditory characteristics for encoding, characterized in that: tonal components are identified in the plurality of spectra, and tones are extracted from the identified tonal components and quantized with a predetermined number of bits; from the remaining spectra of the plurality of spectra, frequency regions having a high power density are separated and quantized with a predetermined number of bits as power quantization units; and the remaining spectra are quantized using the remaining available bits.

2. The audio signal encoding method according to claim 1, wherein each extracted tone is classified as either a normal tone or a large tone, and the large tone is quantized with a larger number of bits than the normal tone.

3. The audio signal encoding method according to claim 1, wherein the quantization using the remaining available bits is performed with the same word length.

4. The audio signal encoding method according to claim 1, wherein, in extracting the tones, local peaks are obtained, and when two of the obtained local peaks lie within a predetermined range of spectra, the one having the larger power is extracted as a tone.

5. The audio signal encoding method according to claim 1, wherein the tone extraction is performed based on the mean/median ratio of the spectrum.

6. The audio signal encoding method according to claim 1, wherein the tone extraction is performed based on the tonality of the spectrum.