JP3227945B2

JP3227945B2 - Encoding device

Info

Publication number: JP3227945B2
Application number: JP27938393A
Authority: JP
Inventors: 健三赤桐
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1993-11-09
Filing date: 1993-11-09
Publication date: 2001-11-12
Anticipated expiration: 2016-11-12
Also published as: JPH07135470A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an encoding device that reduces the bit rate, for use in stereo or so-called multi-surround sound systems such as motion picture film projection systems, video tape recorders, and video disc players.

[0002]

[Prior Art] There are various methods and devices for high-efficiency encoding of audio, speech, and similar signals. One example is transform coding, a blocked frequency-band division scheme in which a time-domain audio signal is divided into blocks per unit time, the time-axis signal of each block is converted (orthogonally transformed) into a signal on the frequency axis and divided into a plurality of frequency bands, and each band is encoded. Another is sub-band coding (SBC), a non-blocked frequency-band division scheme in which a time-domain audio signal is divided into a plurality of frequency bands and encoded without being blocked per unit time. High-efficiency encoding methods and devices that combine sub-band coding and transform coding have also been considered. In that case, for example, the signal is first divided into bands by sub-band coding, the signal of each band is then orthogonally transformed into a frequency-domain signal by transform coding, and each orthogonally transformed band is encoded.

[0003] Band division filters used in the sub-band coding scheme described above include, for example, the QMF (Quadrature Mirror Filter), which is described in R. E. Crochiere, "Digital coding of speech in subbands," Bell Syst. Tech. J., Vol. 55, No. 8, 1976. In addition, Joseph H. Rothweiler, "Polyphase Quadrature Filters - A new subband coding technique," ICASSP 83, Boston, describes methods and devices for equal-bandwidth filter division such as the polyphase quadrature filter.

[0004] Examples of the orthogonal transform described above include transforms in which the input audio signal is divided into blocks of a predetermined unit time (frames) and a fast Fourier transform (FFT), discrete cosine transform (DCT), modified DCT (MDCT), or the like is applied to each block to convert the time axis to the frequency axis. The MDCT is described in J. P. Princen and A. B. Bradley (Univ. of Surrey, Royal Melbourne Inst. of Tech.), "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987.

[0005] Furthermore, the frequency division widths used when quantizing the band-divided frequency components may take human auditory characteristics into account. That is, the audio signal may be divided into a plurality of bands (for example, 25 bands) whose bandwidths grow wider toward higher frequencies, generally called critical bands. When the data of each band is encoded, either a predetermined bit allocation or an adaptive bit allocation is used for each band. For example, when MDCT coefficient data obtained by MDCT processing is encoded with such a bit allocation, the MDCT coefficient data of each band obtained by the MDCT processing of each block is encoded with an adaptively allocated number of bits.

[0006] The following two bit allocation methods and devices are known. In IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977, bits are allocated on the basis of the signal magnitude in each band. M. A. Krasner (MIT), "The critical band coder -- digital encoding of the perceptual requirements of the auditory system," ICASSP 1980, describes a method and device that use auditory masking to obtain the signal-to-noise ratio required in each band and perform a fixed bit allocation.

[0007]

[Problems to Be Solved by the Invention] However, these bit allocation techniques assume that reproduction (decoding) is performed at a certain fixed bit rate on the reproduction (decoder) side, and they therefore cause marked degradation of sound quality when reproduction is performed at a bit rate below that fixed rate. For example, when part of the bits produced by encoding is diverted on the encoder side to another data transfer, decoding uses a bit rate lower than the one used at encoding. Because the known bit allocation techniques expect reproduction at the encoding bit rate, reproduction at the lower rate results in marked degradation of sound quality.

[0008] Moreover, where playback devices that reproduce at a low bit rate are already in use, for example, even if a system with better sound quality using a higher bit rate is introduced, the existing playback devices that reproduce at the low bit rate cannot reproduce it satisfactorily.

[0009] In other words, conventional bit allocation techniques have no backward compatibility.

[0010] Accordingly, an object of the present invention is to provide an encoding device that can keep such sound quality degradation to a minimum and that also has backward compatibility.

[0011] Furthermore, when information obtained by encoding speech, audio, or similar signals is recorded on a storage medium that uses a storage device such as a so-called IC card, the storage device is expensive, so longer recording times are desired, together with minimal sound quality degradation. Another object of the present invention is therefore to provide an encoding device that, when recording on a storage medium using an expensive storage device, can extend the recording time beyond the initial setting by appropriately reducing the bit rate of encoded information that has already been recorded or is being recorded, while minimizing the resulting sound quality degradation.

[0012]

[0013]

[Means for Solving the Problems] The present invention has been proposed to achieve the objects described above. The encoding device of the present invention quantizes time-domain samples or frequency-domain samples of an input signal with a predetermined number of bits, and comprises: first quantization means for quantizing the time-domain or frequency-domain samples; quantization error calculation means for calculating the quantization error of the first quantization means; second quantization means for further quantizing the quantization error; and encoding means for encoding the output of the first quantization means and the output of the second quantization means.
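As an illustration only, the arrangement of a first quantizer, a quantization error calculation, and a second quantizer can be sketched in Python as follows. The uniform quantizer, the step sizes, and every function name are assumptions made for the example and are not taken from the specification.

```python
def quantize(x, step):
    """Uniform quantizer (assumed for illustration): integer code for x."""
    return int(round(x / step))


def dequantize(code, step):
    """Reconstruct the value represented by an integer code."""
    return code * step


def two_stage_encode(samples, step1, step2):
    """Return (first-stage codes, second-stage codes) for a list of samples."""
    base_codes, enh_codes = [], []
    for x in samples:
        c1 = quantize(x, step1)            # first quantization
        err = x - dequantize(c1, step1)    # quantization error of the first stage
        c2 = quantize(err, step2)          # second quantization of that error
        base_codes.append(c1)
        enh_codes.append(c2)
    return base_codes, enh_codes


def two_stage_decode(base_codes, enh_codes, step1, step2):
    """Decode with both outputs, or with the first output only (enh_codes=None)."""
    out = []
    for i, c1 in enumerate(base_codes):
        x = dequantize(c1, step1)
        if enh_codes is not None:
            x += dequantize(enh_codes[i], step2)
        out.append(x)
    return out
```

A decoder that receives only the first-stage codes can still reconstruct a coarser signal, which is the property that the backward-compatibility objective relies on; a decoder that receives both outputs reconstructs the signal to the finer step.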

[0014] In the encoding device of the present invention, the output bit rate of the second quantization means may be made constant per fixed unit of time, or the output bit rates of both the first and second quantization means may be made constant per fixed unit of time. In these cases, the encoding device further comprises first block floating means, which groups the time-domain or frequency-domain samples into blocks of a plurality of samples and applies block floating, and second block floating means, which applies block floating to the output of the quantization error calculation means. The second block floating means derives its block floating scale factor from the scale factor of the first block floating means. The first quantization means quantizes the output of the first block floating means, and the second quantization means quantizes the output of the second block floating means.

[0015] Alternatively, in the encoding device of the present invention, the second block floating means derives its block floating scale factor from the scale factor of the first block floating means and the word length of the first quantization means.
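One way such a derivation could work is sketched below in Python. It assumes a symmetric uniform quantizer with 2^(W-1) - 1 positive levels and takes the second scale factor to be the largest error the first stage can leave; this rule, like all the names used, is an assumption for illustration rather than the formula of the specification. The point of the sketch is that the decoder can recompute the second scale factor from values it already has.

```python
def quantize_block(block, scale_factor, word_length):
    """Block floating: normalize by scale_factor, quantize to word_length bits."""
    levels = 2 ** (word_length - 1) - 1
    codes = [round(x / scale_factor * levels) for x in block]
    recon = [c / levels * scale_factor for c in codes]
    return codes, recon


def derived_scale_factor(sf1, word_length1):
    """Assumed rule: second scale factor = worst-case first-stage error."""
    levels = 2 ** (word_length1 - 1) - 1
    return sf1 / (2 * levels)


def encode_block(block, sf1, wl1, wl2):
    """Two-stage block floating; only sf1 and the word lengths are side info."""
    codes1, recon1 = quantize_block(block, sf1, wl1)
    errors = [x - r for x, r in zip(block, recon1)]
    sf2 = derived_scale_factor(sf1, wl1)   # not transmitted, derived
    codes2, _ = quantize_block(errors, sf2, wl2)
    return codes1, codes2


def decode_block(codes1, codes2, sf1, wl1, wl2):
    """The decoder recomputes sf2 from sf1 and wl1 in the same way."""
    sf2 = derived_scale_factor(sf1, wl1)
    l1 = 2 ** (wl1 - 1) - 1
    l2 = 2 ** (wl2 - 1) - 1
    return [c1 / l1 * sf1 + c2 / l2 * sf2 for c1, c2 in zip(codes1, codes2)]
```

Because the second scale factor is derived rather than stored, the second quantizer output adds no scale factor side information under this assumption.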

[0016] In the above cases, the sample data within each small block subdivided in time and frequency is quantized with the same block floating and the same word length throughout that small block. To obtain the samples in the small blocks subdivided in time and frequency, a non-blocked frequency analysis such as filtering is performed first, and its output is then subjected to a blocked frequency analysis such as an orthogonal transform. Making the frequency bandwidths of the non-blocked frequency analysis equal in at least the two lowest bands helps to reduce cost. Making the bandwidth of the non-blocked frequency analysis wider toward higher frequencies, at least in the highest band, is important for exploiting auditory effects based on critical bands. Furthermore, in the blocked frequency analysis, adaptively changing the block size according to the temporal characteristics of the input signal enables processing best suited to those characteristics. Changing the block size independently for each of at least two output bands of the non-blocked frequency analysis is effective for preventing mutual interference between frequency components and processing each band component optimally and independently.

[0017] Determining the bit allocation given to each channel from that channel's scale factors or maximum sample value requires only simple calculation and is therefore effective for reducing computation. In addition, varying the bit allocation given to each channel according to temporal changes in the amplitude information represented by the channel's scale factors is also useful for lowering the bit rate.

[0018] Furthermore, the encoding device of the present invention comprises first quantization means for quantizing the time-domain or frequency-domain samples, quantization error calculation means for calculating the quantization error of the first quantization means, second quantization means for further quantizing the quantization error, and recording means for recording the output samples of the first quantization means and the output samples of the second quantization means separately from each other within one sync block. Alternatively, the encoding device of the present invention comprises first quantization means for quantizing the time-domain or frequency-domain samples, quantization error calculation means for calculating the quantization error of the first quantization means, second quantization means for further quantizing the quantization error, and recording means for recording the output samples of the first quantization means and the output samples of the second quantization means alternately, in time order or frequency order, within one sync block.
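The two sync block arrangements can be pictured with the following Python sketch. It models a sync block as a flat list of already quantized values; the function names and the list representation are hypothetical and serve only to show what can be cut off in one piece in each arrangement.

```python
def pack_separate(base, enh):
    """Sync block with all first-stage outputs followed by all second-stage outputs."""
    return list(base) + list(enh)


def pack_interleaved(base, enh):
    """Sync block with first- and second-stage outputs alternating per sample,
    in time order or frequency order."""
    block = []
    for b, e in zip(base, enh):
        block.extend((b, e))
    return block


def keep_base_only(separate_block, n_base):
    """Separate layout: drop the whole second-stage part as one tail."""
    return separate_block[:n_base]


def keep_low_bands(interleaved_block, n_kept):
    """Interleaved layout in frequency order: drop the tail, which limits the
    frequency band while keeping both stages for the retained samples."""
    return interleaved_block[:2 * n_kept]
```

In both layouts the part to discard for a lower bit rate is a contiguous tail of the sync block; the layouts differ in whether the tail is the refinement data or the upper part of the band.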

[0019]

[Operation] According to the present invention, when time-domain or frequency-domain samples of an input signal are quantized with a predetermined number of bits, the first quantization means quantizes the samples, the second quantization means further quantizes the quantization error of the first quantization means as calculated by the quantization error calculation means, and the encoding means encodes the output of the first quantization means and the output of the second quantization means.

[0020] Making the output bit rate of the second quantization means constant per fixed unit of time, or making the output bit rates of the first and second quantization means constant per fixed unit of time, is effective for simplifying the method of recording on recording media such as discs and tapes.

[0021] In these cases, it is effective for raising encoding efficiency that the encoding device further comprises first block floating means, which groups the time-domain or frequency-domain samples into blocks of a plurality of samples and applies block floating, and second block floating means, which applies block floating to the output of the quantization error calculation means; that the second block floating means derives its block floating scale factor from the scale factor of the first block floating means; and that the first quantization means quantizes the output of the first block floating means while the second quantization means quantizes the output of the second block floating means.

[0022] Furthermore, to obtain the samples in small blocks subdivided in time and frequency, a non-blocked frequency analysis such as filtering is performed and its output is then subjected to a blocked frequency analysis such as an orthogonal transform. This allows quantization noise to be generated in a way that takes auditory masking into account in both the time domain and the frequency domain, yielding a frequency analysis that is perceptually favorable. Making the frequency bandwidths of the non-blocked frequency analysis equal in at least the two lowest bands helps to reduce cost. Making the bandwidth of the non-blocked frequency analysis wider toward higher frequencies, at least in the highest band, allows auditory effects based on critical bands to be exploited efficiently. In the blocked frequency analysis, adaptively changing the block size according to the temporal characteristics of the input signal enables processing best suited to those characteristics. Changing the block size independently for each of at least two output bands of the non-blocked frequency analysis is effective for preventing mutual interference between frequency components and processing each band component optimally and independently.

[0023] Providing first quantization means for quantizing the time-domain or frequency-domain samples, quantization error calculation means for calculating the quantization error of the first quantization means, second quantization means for further quantizing the quantization error, and recording means that records the output samples of the first quantization means and the output samples of the second quantization means separately from each other within one sync block is effective in that the portion of the bit stream to be removed for reproduction at a reduced bit rate can be removed in one piece.

[0024] Providing first quantization means for quantizing the time-domain or frequency-domain samples, quantization error calculation means for calculating the quantization error of the first quantization means, second quantization means for further quantizing the quantization error, and recording means that records the output samples of the first quantization means and the output samples of the second quantization means alternately, in time order or frequency order, within one sync block is effective in that the portion of the bit stream to be removed for reproduction at a reduced bit rate can be removed in one piece in a manner that limits the frequency band.

[0025]

[Embodiments] A high-efficiency encoding device (encoder) that is an embodiment of the encoding device according to the present invention, and an example of the corresponding high-efficiency decoding device (decoder), are described below with reference to the drawings.

[0026] In this embodiment, an input digital signal such as an audio PCM signal is encoded with high efficiency using sub-band coding (SBC), adaptive transform coding (ATC), and adaptive bit allocation (APC-AB). This technique is described with reference to FIG. 1.

[0027] In the specific high-efficiency encoding device of this embodiment shown in FIG. 1, the input digital signal is divided into a plurality of frequency bands by filters or the like, an orthogonal transform is applied to each frequency band, and the resulting frequency-axis spectral data is encoded with bits allocated adaptively to each so-called critical band, which reflects the human auditory characteristics described later. In the high range, bands obtained by further dividing the critical bands are used. The non-blocking frequency division by filters or the like may of course use equal division widths.

[0028] Furthermore, in this embodiment of the invention, the block size (block length) is changed adaptively according to the input signal before the orthogonal transform, and floating processing is performed in units of critical bands or, in the high range, in blocks obtained by further subdividing the critical bands. A critical band is a frequency band defined in consideration of human auditory characteristics: it is the bandwidth of narrow-band noise that masks a pure tone when the noise has the same intensity as the tone and lies near its frequency. Critical bands become wider toward higher frequencies; the entire 0 to 20 kHz frequency range is divided into, for example, 25 critical bands.

[0029] That is, in FIG. 1, an audio PCM signal of, for example, 0 to 22 kHz is supplied to input terminal 10. This input signal is divided by band division filter 11, for example a so-called QMF, into a 0 to 11 kHz band and an 11 to 22 kHz band, and the 0 to 11 kHz signal is further divided by band division filter 12, likewise a so-called QMF or the like, into a 0 to 5.5 kHz band and a 5.5 to 11 kHz band.
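The two-stage band division can be illustrated with the short Python sketch below. It substitutes the 2-tap Haar filter pair, the simplest filter pair with the quadrature mirror property, for the longer QMFs a practical encoder would use, so the band separation is far cruder than in a real device; the function names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def qmf_split(x):
    """Split an even-length signal into decimated low and high half-bands
    (2-tap Haar pair, assumed here as a stand-in for a real QMF)."""
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) * _S for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) * _S for i in range(half)]
    return low, high


def qmf_merge(low, high):
    """Inverse of qmf_split: rebuild the full-band signal."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) * _S)
        x.append((l - h) * _S)
    return x


def three_band_split(x):
    """First split into lower and upper halves (filter 11), then split the
    lower half again (filter 12), giving low, mid, and high bands."""
    lower_half, high = qmf_split(x)
    low, mid = qmf_split(lower_half)
    return low, mid, high


def three_band_merge(low, mid, high):
    """Undo three_band_split."""
    return qmf_merge(qmf_merge(low, mid), high)
```

Each split halves the sample rate of its outputs, so the low and mid bands carry a quarter of the input samples each and the high band carries half.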

[0030] The 11 to 22 kHz signal from band division filter 11 is sent to MDCT (Modified Discrete Cosine Transform) circuit 13, an example of an orthogonal transform circuit; the 5.5 to 11 kHz signal from band division filter 12 is sent to MDCT circuit 14; and the 0 to 5.5 kHz signal from band division filter 12 is sent to MDCT circuit 15, so that each is subjected to MDCT processing. Each of MDCT circuits 13, 14, and 15 performs MDCT processing with the block size determined by the block decision circuit 19, 20, or 21 provided for its band.

[0031] Specific examples of the block sizes used in MDCT circuits 13, 14, and 15, as determined by block decision circuits 19, 20, and 21, are shown in A and B of FIG. 2. A of FIG. 2 shows the case of a long orthogonal transform block size (the orthogonal transform block size in long mode), and B of FIG. 2 shows the case of a short orthogonal transform block size (the orthogonal transform block size in short mode).

[0032] In the specific example of FIG. 2, each of the three filter outputs has two orthogonal transform block sizes. For the low-band 0 to 5.5 kHz signal and the mid-band 5.5 to 11 kHz signal, one block contains 128 samples when the long block length is used (A of FIG. 2) and 32 samples when the short block is selected (B of FIG. 2). For the high-band 11 to 22 kHz signal, one block contains 256 samples with the long block length (A of FIG. 2) and 32 samples when the short block is selected (B of FIG. 2). When the short block is selected, the orthogonal transform blocks of all bands thus have the same number of samples, which raises the time resolution toward higher frequencies and also reduces the number of window types used for blocking.
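The effect of these block sizes on time resolution can be checked with a small calculation. The Python sketch below assumes that the 0 to 22 kHz input is sampled at 44 kHz and that each filter output is critically decimated, giving band sample rates of 11 kHz, 11 kHz, and 22 kHz; neither figure is stated in the text, and the names are hypothetical.

```python
# Assumed sample rates after critical decimation of a 44 kHz input.
BAND_RATE_HZ = {"low": 11000, "mid": 11000, "high": 22000}

# Samples per orthogonal transform block, from the FIG. 2 example.
BLOCK_SAMPLES = {
    "long": {"low": 128, "mid": 128, "high": 256},
    "short": {"low": 32, "mid": 32, "high": 32},
}


def block_duration_ms(mode, band):
    """Time span covered by one transform block in the given mode and band."""
    return 1000.0 * BLOCK_SAMPLES[mode][band] / BAND_RATE_HZ[band]


if __name__ == "__main__":
    for mode in ("long", "short"):
        for band in ("low", "mid", "high"):
            print(mode, band, round(block_duration_ms(mode, band), 2), "ms")
```

Under these assumptions the long blocks span the same time in all three bands, while a short block in the high band spans half the time of a short block in the low or mid band, which is the sense in which time resolution rises toward higher frequencies.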

[0033] The information indicating the block sizes determined by block decision circuits 19, 20, and 21 is sent to adaptive bit allocation encoding circuits 16, 17, and 18, described later, and is also output from output terminals 23, 25, and 27.

[0034] Returning to FIG. 1, the frequency-domain spectral data or MDCT coefficient data obtained by MDCT processing in MDCT circuits 13, 14, and 15 is grouped by so-called critical band or, in the high range, by bands obtained by further dividing the critical bands, and is sent to adaptive bit allocation encoding circuits 16, 17, and 18.

[0035] Adaptive bit allocation encoding circuits 16, 17, and 18 requantize (normalize and quantize) each item of spectral data (or MDCT coefficient data) according to the block size information and the number of bits allocated to each critical band or, in the high range, to each band obtained by further dividing the critical bands.

[0036] The data encoded by adaptive bit allocation encoding circuits 16, 17, and 18 is taken out via output terminals 22, 24, and 26. These circuits also determine scale factors, which indicate the signal magnitude with respect to which normalization was performed, and bit length information, which indicates the bit length used for quantization; these are output at the same time from output terminals 22, 24, and 26.

[0037] From the outputs of MDCT circuits 13, 14, and 15 in FIG. 1, the energy of each critical band or, in the high range, of each band obtained by further dividing the critical bands is obtained, for example by calculating the root mean square of the amplitude values within the band. The scale factors themselves may of course be used for the subsequent bit allocation instead. In that case no additional energy calculation is required, which saves hardware. The peak value, mean value, or the like of the amplitude values may also be used in place of the energy of each band.
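A minimal Python sketch of the per-band measures mentioned here follows. The band edges are passed in as coefficient indices; the edge values used in any real encoder, and the function names, are assumptions of the example.

```python
import math


def band_rms(coeffs, band_edges):
    """Root mean square of the coefficient amplitudes in each band.
    band_edges lists the start index of each band plus one final end index."""
    result = []
    for start, end in zip(band_edges, band_edges[1:]):
        band = coeffs[start:end]
        result.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return result


def band_peak(coeffs, band_edges):
    """Peak absolute amplitude in each band, a cheaper alternative measure."""
    return [
        max(abs(c) for c in coeffs[start:end])
        for start, end in zip(band_edges, band_edges[1:])
    ]
```

The peak measure is close to what a block floating scale factor already records, which is why reusing the scale factors avoids a separate energy calculation.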

[0038] Next, a specific bit allocation method in the adaptive bit allocation circuit that performs the above bit allocation is described using the bit allocation strategy shown in FIG. 3.

【００３９】本実施例では、ステップＳＴ１の総ビット
配分から、第１に、チャネル当たり１２８ｋｂｐｓの基
礎ビット配分（ステップＳＴ２）と、第２に、６４ｋｂ
ｐｓの付加ビット配分（ステップＳＴ３）との２つを求
める。In the present embodiment, two allocations are derived from the total bit allocation of step ST1: first, a basic bit allocation of 128 kbps per channel (step ST2), and second, an additional bit allocation of 64 kbps (step ST3).

【００４０】このうち基礎ビット配分は、更にビット配
分(1) （ステップＳＴ４）と、ビット配分(2) （ステッ
プＳＴ５）とに分割使用される。Of these, the basic bit allocation is further divided into bit allocation (1) (step ST4) and bit allocation (2) (step ST5).

【００４１】先ず、ステップＳＴ１からステップＳＴ２
への上記基礎ビット配分の手法について説明する。ここ
ではスケールファクタの周波数領域の分布をみて適応的
にビット配分を行なう。First, the method of the basic bit allocation from step ST1 to step ST2 will be described. Here, bits are allocated adaptively according to the distribution of the scale factors in the frequency domain.

【００４２】最初に、ビット配分(1) に使うべきビット
量を確定する。そのためには信号情報のスペクトル情報
のうちトーナリティ情報を使用する。ここでのトーナリ
ティの指標としては、信号スペクトルの隣接値間の差の
絶対値の和を信号スペクトル数で割った値を用いてい
る。なお、より簡単な指標としては、図４に示すよう
に、いわゆるブロックフローティングの為のブロック毎
のスケールファクタにおける隣接スケールファクタ指標
の間の差の平均値を用いることができる。このスケール
ファクタ指標は、概略スケールファクタの対数値に対応
している。First, the number of bits to be used for bit allocation (1) is determined. For this purpose, tonality information, which is part of the spectral information of the signal, is used. The tonality index used here is the sum of the absolute values of the differences between adjacent values of the signal spectrum, divided by the number of spectral values. As a simpler index, as shown in FIG. 4, the average of the differences between adjacent scale factor indices of the per-block scale factors used for so-called block floating can be used. The scale factor index roughly corresponds to the logarithm of the scale factor.

【００４３】実施例では、ビット配分(1) に使うべきビ
ット量をこのトーナリティを表す値に対応させて最大８
０ｋｂｐｓ、最小１０ｋｂｐｓと設定している。このト
ーナリティ計算は次の式のように行う。In the embodiment, the number of bits used for bit allocation (1) is set between a maximum of 80 kbps and a minimum of 10 kbps according to the value representing the tonality. The tonality is calculated by the following equation.

【００４４】T=(1/(WLmax*(N-1)))(ΣABS(SFn-SFn-1))
WLmax ：ワードレングス最大値＝１６
SFn ：スケールファクタ指標で概略ピーク値の対数に対応している。
ｎ：ブロックフローティングバンド番号
Ｎ：ブロックフローティングバンドの数

$$T = \frac{1}{WL_{\max}\,(N-1)} \sum_{n} \left| SF_{n} - SF_{n-1} \right|$$

WLmax: maximum word length (= 16)
SFn: scale factor index, which roughly corresponds to the logarithm of the peak value
n: block floating band number
N: number of block floating bands
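The tonality calculation can be sketched in Python as follows. The sketch assumes that the sum runs over the N − 1 pairs of adjacent block floating bands, which is what the 1/(N − 1) normalization suggests; the function and parameter names are hypothetical.

```python
WL_MAX = 16  # maximum word length stated in the text


def tonality(scale_factor_indices, wl_max=WL_MAX):
    """Tonality index T from the scale factor indices SF_n of one block.

    Assumption: the sum covers every pair of adjacent bands (N - 1 terms).
    """
    n_bands = len(scale_factor_indices)
    if n_bands < 2:
        return 0.0  # no adjacent pair to compare
    pairs = zip(scale_factor_indices, scale_factor_indices[1:])
    total = sum(abs(cur - prev) for prev, cur in pairs)
    return total / (wl_max * (n_bands - 1))
```

A flat set of scale factor indices gives T = 0, while indices that alternate between 0 and WLmax give T = 1.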

【００４５】このようにして求められたトーナリティ指
標Ｔとビット配分(1) の配分量とは、図５に示すように
対応付けられる。ここでのビット配分(1) はスケ−ルフ
ァクタに依存した周波数、時間領域上の配分がなされ
る。The tonality index T obtained in this way and the amount of bits for bit allocation (1) are associated with each other as shown in FIG. 5. Bit allocation (1) distributes bits in the frequency and time domains depending on the scale factors.
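FIG. 5 is not reproduced in this text, so the actual mapping from T to the bit amount is not known here. Purely as an illustration, the sketch below assumes a clamped linear mapping between the stated minimum of 10 kbps and maximum of 80 kbps. The function name, the linear shape, its increasing direction, and the T range of 0 to 1 are all assumptions, not values taken from the patent.

```python
def allocation1_kbps(t, t_min=0.0, t_max=1.0, lo=10.0, hi=80.0):
    """Map a tonality index to the bit amount for bit allocation (1).

    Hypothetical clamped linear mapping; only the 10 and 80 kbps limits
    come from the text.
    """
    t = min(max(t, t_min), t_max)  # clamp T into the assumed range
    return lo + (hi - lo) * (t - t_min) / (t_max - t_min)
```

Under this assumption the bits left over for bit allocation (2) would be the basic allocation minus `allocation1_kbps(t)`.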

【００４６】このようにしてビット配分(1) に使用され
るビット量が決定されたら、次にビット配分(1) で使わ
れなかったビットについての配分すなわちビット配分
(2) に移る。ここでは多種のビット配分が行われるが以
下に２つの例を示す。Once the number of bits used for bit allocation (1) has been determined in this way, the process moves on to the allocation of the bits not used in bit allocation (1), that is, bit allocation (2). Various kinds of bit allocation are possible here; two examples are given below.

【００４７】第１に、全てのサンプル値に対する均一配
分を行う。この場合のビット配分に対する量子化雑音ス
ペクトル（ビット配分(2) の均一配分のノイズスペクト
ル）を図６に示す。これによれば、全周波数帯域で均一
の雑音レベル低減が行える。First, bits are allocated uniformly to all sample values. FIG. 6 shows the quantization noise spectrum for this bit allocation (the noise spectrum for the uniform allocation of bit allocation (2)). With this allocation, the noise level is reduced uniformly over the entire frequency band.

【００４８】第２に、信号情報の周波数スペクトル及び
レベルに対する依存性を持たせた聴覚的な効果を得るた
めのビット配分を行う。この場合のビット配分に対する
量子化雑音スペクトル（信号情報の周波数スペクトル及
びレベルに対する依存性を持たせた聴覚的な効果を得る
ためのビット配分によるノイズスペクトル）の一例を図
７に示す。この例では情報信号のスペクトルに依存させ
たビット配分を行っていて、特に情報信号のスペクトル
の低域側にウエイトをおいたビット配分を行い高域側に
比して起きる低域側でのマスキング効果の減少を補償し
ている。これは隣接臨界帯域間でのマスキングを考慮し
てスペクトルの低域側を重視したマスキングカーブの非
対象性に基づいている。Second, bits are allocated so as to obtain an auditory effect that depends on the frequency spectrum and level of the signal information. FIG. 7 shows an example of the quantization noise spectrum for this bit allocation. In this example the bit allocation depends on the spectrum of the information signal; in particular, it is weighted toward the low-frequency side of the spectrum to compensate for the masking effect being weaker on the low-frequency side than on the high-frequency side. This is based on the asymmetry of the masking curve, which, taking masking between adjacent critical bands into account, gives more weight to the low-frequency side of the spectrum.

【００４９】なお、図８はビット配分(2) の均一配分の
時のビット配分（割当）を示す図であり図６に対応した
ビット配分を表している。図９は信号情報の周波数スペ
クトル及びレベルに対応する依存性を持たせた聴覚的な
効果を得るためのビット配分を示す図であり図７に対応
したビット配分を表している。また、図６，図７の図中
Ｓは信号スペクトルを、ＮＬ１はビット配分(1) による
雑音レベルを、ＮＬ２はビット配分(2) による雑音レベ
ルを示している。図８，図９の図中ＡＱ１はビット配分
(1) のビット量を、図中ＡＱ２はビット配分(2) のビッ
ト量を示している。FIG. 8 shows the bit allocation for the uniform allocation of bit allocation (2) and corresponds to FIG. 6. FIG. 9 shows the bit allocation for obtaining an auditory effect that depends on the frequency spectrum and level of the signal information and corresponds to FIG. 7. In FIGS. 6 and 7, S denotes the signal spectrum, NL1 denotes the noise level resulting from bit allocation (1), and NL2 denotes the noise level resulting from bit allocation (2). In FIGS. 8 and 9, AQ1 denotes the bit amount of bit allocation (1), and AQ2 denotes the bit amount of bit allocation (2).

【００５０】次に基礎ビット配分の別の手法を説明す
る。この場合の適応ビッ
ト配分回路の動作を図１０で説明するとＭＤＣＴ係数の
大きさが各ブロックごとに求められ、そのＭＤＣＴ係数
が入力端子８０１に供給される。当該入力端子８０１に
供給されたＭＤＣＴ係数は、帯域毎のエネルギ算出回路
８０３に与えられる。帯域毎のエネルギ算出回路８０３
では、クリティカルバンドまたは高域においてはクリテ
ィカルバンドを更に再分割したそれぞれの帯域に関する
信号エネルギを算出する。帯域毎のエネルギ算出回路８
０３で算出されたそれぞれの帯域に関するエネルギは、
エネルギ依存ビット配分回路８０４に供給される。Next, another method of basic bit allocation will be described. The operation of the adaptive bit allocation circuit in this case is described with reference to FIG. 10. The magnitude of the MDCT coefficients is obtained for each block, and the MDCT coefficients are supplied to the input terminal 801. The MDCT coefficients supplied to the input terminal 801 are passed to the per-band energy calculation circuit 803. The per-band energy calculation circuit 803 calculates the signal energy of each critical band or, in the high band, of each band obtained by further subdividing the critical band. The energy of each band calculated by the per-band energy calculation circuit 803 is supplied to the energy-dependent bit allocation circuit 804.

【００５１】エネルギ依存ビット配分回路８０４では、
使用可能総ビット発生回路８０２からの使用可能総ビッ
ト、本実施例では１２８Ｋｂｐｓの内のある割合を用い
て白色の量子化雑音を作り出すようなビット配分を行
う。このとき、入力信号のトーナリティが高いほど、す
なわち入力信号のスペクトルの凸凹が大きいほど、この
ビット量が上記１２８Ｋｂｐｓに占める割合が増加す
る。なお、入力信号のスペクトルの凸凹を検出するに
は、隣接するブロックのブロックフローティング係数の
差の絶対値の和を指標として使う。そして、求められた
使用可能なビット量につき、各帯域のエネルギの対数値
に比例したビット配分を行う。The energy-dependent bit allocation circuit 804 uses a certain proportion of the total available bits from the total available bit generation circuit 802, 128 kbps in this embodiment, to perform a bit allocation that produces white quantization noise. The higher the tonality of the input signal, that is, the more uneven its spectrum, the larger the share of the 128 kbps taken by this bit amount. The unevenness of the input signal spectrum is detected using, as an index, the sum of the absolute values of the differences between the block floating coefficients of adjacent blocks. The bit amount determined in this way is then allocated in proportion to the logarithm of the energy of each band.
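The last step, allocating a given bit amount in proportion to the logarithm of each band's energy, might look like the sketch below. The function name, the base-10 logarithm, the clamping of non-positive log values to zero, and the uniform fallback when every weight is zero are assumptions added so the example is well defined; the text only states the proportionality.

```python
import math


def energy_dependent_bits(band_energies, bit_budget):
    """Split bit_budget across bands in proportion to log10(energy).

    Assumptions: energies at or below 1 get zero weight, and an all-zero
    weight vector falls back to a uniform split.
    """
    weights = [max(math.log10(e), 0.0) if e > 0 else 0.0 for e in band_energies]
    total = sum(weights)
    if total == 0:
        return [bit_budget / len(band_energies)] * len(band_energies)
    return [bit_budget * w / total for w in weights]
```

For band energies of 10, 100, and 1000 and a budget of 60 bits, the weights are 1, 2, and 3, so the bands receive about 10, 20, and 30 bits.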

【００５２】聴覚許容雑音レベルに依存したビット配分
算出回路８０５は、まず上記クリティカルバンド毎に分
割されたスペクトルデータに基づき、いわゆるマスキン
グ効果等を考慮した各クリティカルバンド毎の許容ノイ
ズ量を求め、次に聴覚許容雑音スペクトルを与えるよう
に使用可能総ビットからエネルギ依存ビットを引いたビ
ット分が配分される。このようにして求められたエネル
ギ依存ビットと聴覚許容雑音レベルに依存したビットは
加算されて、図１の適応ビット配分符号化回路１６、１
７、１８によって各クリティカルバンド毎若しくは高域
においてはクリティカルバンドを更に複数帯域に分割し
た帯域に割り当てられたビット数に応じて各スペクトル
データ（あるいはＭＤＣＴ係数データ）が再量子化され
るようになっている。このようにして符号化されたデー
タは、図１の出力端子２２、２４、２６を介して取り出
される。The bit allocation calculation circuit 805, which depends on the auditory permissible noise level, first obtains the permissible noise amount for each critical band, taking the so-called masking effect and the like into account, based on the spectrum data divided into critical bands. It then allocates the bits remaining after subtracting the energy-dependent bits from the total available bits so as to produce the auditory permissible noise spectrum. The energy-dependent bits and the bits dependent on the auditory permissible noise level obtained in this way are added, and the adaptive bit allocation encoding circuits 16, 17, and 18 of FIG. 1 requantize each spectrum data (or MDCT coefficient data) according to the number of bits allocated to each critical band or, in the high band, to each band obtained by further dividing the critical band into a plurality of bands. The data encoded in this way are taken out via the output terminals 22, 24, and 26 of FIG. 1.

【００５３】さらに詳しく上記聴覚許容雑音スペクトル
依存のビット配分回路８０５中の聴覚許容雑音スペクト
ル算出回路について説明すると、ＭＤＣＴ回路１３、１
４、１５で得られたＭＤＣＴ係数が当該ビット配分回路
８０５中の許容雑音スペクトル算出回路に与えられる。To describe in more detail the auditory permissible noise spectrum calculation circuit in the bit allocation circuit 805 that depends on the auditory permissible noise spectrum: the MDCT coefficients obtained by the MDCT circuits 13, 14, and 15 are supplied to the permissible noise spectrum calculation circuit in the bit allocation circuit 805.

【００５４】図１１は上記許容雑音スペクトル算出回路
をまとめて説明した一具体例の概略構成を示すブロック
回路図である。この図１１において、入力端子５２１に
は、ＭＤＣＴ回路１３、１４、１５からの周波数領域の
スペクトルデータが供給されている。FIG. 11 is a block circuit diagram showing a schematic configuration of one specific example in which the allowable noise spectrum calculating circuit is collectively described. In FIG. 11, spectrum data in the frequency domain is supplied to the input terminal 521 from the MDCT circuits 13, 14, and 15.

【００５５】この周波数領域の入力データは、帯域毎の
エネルギ算出回路５２２に送られて、上記クリティカル
バンド（臨界帯域）毎のエネルギが、例えば当該バンド
内での各振幅値２乗の総和を計算すること等により求め
られる。この各バンド毎のエネルギの代わりに、振幅値
のピーク値、平均値等が用いられることもある。このエ
ネルギ算出回路５２２からの出力として、例えば各バン
ドの総和値のスペクトルは、一般にバークスペクトルと
称されている。図１２はこのような各クリティカルバン
ド毎のバークスペクトルＳＢを示している。ただし、こ
の図１２では、図示を簡略化するため、上記クリティカ
ルバンドのバンド数を１２バンド（Ｂ1〜Ｂ12）で表現
している。The input data in the frequency domain are sent to the per-band energy calculation circuit 522, where the energy of each critical band is obtained, for example, by calculating the sum of the squares of the amplitude values in the band. Instead of the energy of each band, the peak value, the average value, or the like of the amplitude values may be used. The spectrum of the per-band sum values output from the energy calculation circuit 522 is generally called a Bark spectrum. FIG. 12 shows such a Bark spectrum SB for each critical band. To simplify the illustration, however, FIG. 12 represents the critical bands with 12 bands (B1 to B12).

【００５６】ここで、上記バークスペクトルＳＢのいわ
ゆるマスキングに於ける影響を考慮するために、該バー
クスペクトルＳＢに所定の重み付け関数を掛けて加算す
るような畳込み（コンボリューション）処理を施す。こ
のため、上記帯域毎のエネルギ算出回路５２２の出力す
なわち該バークスペクトルＳＢの各値は、畳込みフィル
タ回路５２３に送られる。該畳込みフィルタ回路５２３
は、例えば、入力データを順次遅延させる複数の遅延素
子と、これら遅延素子からの出力にフィルタ係数（重み
付け関数）を乗算する複数の乗算器（例えば各バンドに
対応する２５個の乗算器）と、各乗算器出力の総和をと
る総和加算器とから構成されるものである。Here, to take into account the influence of the Bark spectrum SB on so-called masking, a convolution process is applied in which the Bark spectrum SB is multiplied by a predetermined weighting function and the products are summed. For this purpose, the output of the per-band energy calculation circuit 522, that is, each value of the Bark spectrum SB, is sent to the convolution filter circuit 523. The convolution filter circuit 523 consists of, for example, a plurality of delay elements that sequentially delay the input data, a plurality of multipliers (for example, 25 multipliers, one per band) that multiply the outputs of the delay elements by filter coefficients (the weighting function), and a summing adder that sums the multiplier outputs.

【００５７】なお、上記マスキングとは、人間の聴覚上
の特性により、ある信号によって他の信号がマスクされ
て聞こえなくなる現象をいうものであり、このマスキン
グ効果には、時間領域のオーディオ信号による時間軸マ
スキング効果と、周波数領域の信号による同時刻マスキ
ング効果とがある。これらのマスキング効果により、マ
スキングされる部分にノイズがあったとしても、このノ
イズは聞こえないことになる。このため、実際のオーデ
ィオ信号では、このマスキングされる範囲内のノイズは
許容可能なノイズとされる。Masking is a phenomenon in which, owing to the characteristics of human hearing, one signal masks another so that the latter becomes inaudible. Masking effects include the temporal masking effect caused by a time-domain audio signal and the simultaneous masking effect caused by a frequency-domain signal. Because of these masking effects, any noise in the masked portion is inaudible. In an actual audio signal, therefore, noise within the masked range is treated as permissible noise.

【００５８】また、上記畳込みフィルタ回路５２３の各
乗算器の乗算係数（フィルタ係数）の一具体例を示す
と、任意のバンドに対応する乗算器Ｍの係数を１とする
とき、乗算器Ｍ−１で係数０．１５を、乗算器Ｍ−２で
係数０．００１９を、乗算器Ｍ−３で係数０．００００
０８６を、乗算器Ｍ＋１で係数０．４を、乗算器Ｍ＋２
で係数０．０６を、乗算器Ｍ＋３で係数０．００７を各
遅延素子の出力に乗算することにより、上記バークスペ
クトルＳＢの畳込み処理が行われる。ただし、Ｍは１〜
２５の任意の整数である。A specific example of the multiplication coefficients (filter coefficients) of the multipliers of the convolution filter circuit 523 is as follows. With the coefficient of the multiplier M corresponding to an arbitrary band set to 1, the outputs of the delay elements are multiplied by 0.15 at multiplier M−1, 0.0019 at multiplier M−2, 0.0000086 at multiplier M−3, 0.4 at multiplier M+1, 0.06 at multiplier M+2, and 0.007 at multiplier M+3, whereby the Bark spectrum SB is convolved. Here, M is an arbitrary integer from 1 to 25.
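The convolution with these coefficients can be sketched as below. The text does not spell out the tap orientation of the delay line, so the sketch assumes that the coefficient listed for multiplier M+k is the weight with which the energy of band M spreads into band M+k (stronger spreading toward higher bands), and that contributions falling outside the band range are dropped. The names `SPREAD` and `convolve_bark` are hypothetical.

```python
# Coefficients from the text, keyed by band offset k relative to band M.
SPREAD = {-3: 0.0000086, -2: 0.0019, -1: 0.15, 0: 1.0, 1: 0.4, 2: 0.06, 3: 0.007}


def convolve_bark(sb):
    """Spread each Bark-band energy into its neighbours.

    Assumption: SPREAD[k] weights the contribution of band m to band m + k;
    contributions beyond the first or last band are discarded.
    """
    out = [0.0] * len(sb)
    for m, energy in enumerate(sb):
        for offset, coef in SPREAD.items():
            k = m + offset
            if 0 <= k < len(out):
                out[k] += coef * energy
    return out
```

If the opposite tap orientation is intended, swapping the signs of the offsets in `SPREAD` gives the mirrored result.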

【００５９】次に、上記畳込みフィルタ回路５２３の出
力は引算器５２４に送られる。該引算器５２４は、上記
畳込んだ領域での後述する許容可能なノイズレベルに対
応するレベルαを求めるものである。なお、当該許容可
能なノイズレベル（許容ノイズレベル）に対応するレベ
ルαは、後述するように、逆コンボリューション処理を
行うことによって、クリティカルバンドの各バンド毎の
許容ノイズレベルとなるようなレベルである。Next, the output of the convolution filter circuit 523 is sent to the subtractor 524. The subtractor 524 obtains a level α corresponding to the permissible noise level (described later) in the convolved domain. As described later, the level α corresponding to the permissible noise level is the level that becomes the permissible noise level of each critical band after deconvolution.

【００６０】ここで、上記引算器５２４には、上記レベ
ルαを求めるるための許容関数（マスキングレベルを表
現する関数）が供給される。この許容関数を増減させる
ことで上記レベルαの制御を行っている。当該許容関数
は、次に説明するような（ｎ−ａｉ）関数発生回路５２
５から供給されているものである。Here, the subtractor 524 is supplied with an allowable function (a function expressing the masking level) for obtaining the level α. The level α is controlled by increasing or decreasing this allowable function. The allowable function is supplied from the (n − ai) function generation circuit 525 described next.

【００６１】すなわち、許容ノイズレベルに対応するレ
ベルαは、クリティカルバンドのバンドの低域から順に
与えられる番号をｉとすると、次の式で求めることがで
きる。 α＝Ｓ−（ｎ−ａｉ）この式において、ｎ，ａは定数でａ＞０、Ｓは畳込み処
理されたバークスペクトルの強度であり、式中(n-ai)が
許容関数となる。例としてｎ＝３８，ａ＝−0.5 を用い
ることができる。That is, the level α corresponding to the permissible noise level can be obtained by the following equation, where i is the number assigned to the critical bands in order from the lowest band: $\alpha = S - (n - a\,i)$. In this equation, n and a are constants with a > 0, S is the intensity of the convolved Bark spectrum, and (n − ai) is the allowable function. As an example, n = 38 and a = −0.5 can be used.
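As a small worked example, the equation can be evaluated directly. The defaults below are simply the example values quoted in the text (n = 38, a = −0.5); the text also states a > 0, so the sketch keeps a as a parameter rather than committing to either reading. The function name is hypothetical, and S is assumed to be in the same logarithmic units as n.

```python
def allowed_level(s, i, n=38.0, a=-0.5):
    """Level alpha = S - (n - a*i) for critical band number i.

    s: intensity of the convolved Bark spectrum in band i.
    n, a: constants of the allowable function (defaults are the example
          values quoted in the text).
    """
    return s - (n - a * i)
```

For S = 60 in band i = 4, the allowable function is 38 − (−0.5 × 4) = 40, so α = 20.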

【００６２】このようにして、上記レベルαが求めら
れ、このデータは、割算器５２６に伝送される。当該割
算器５２６では、上記畳込みされた領域での上記レベル
αを逆コンボリューションするためのものである。した
がって、この逆コンボリューション処理を行うことによ
り、上記レベルαからマスキングスレッショールドが得
られるようになる。すなわち、このマスキングスレッシ
ョールドが許容ノイズスペクトルとなる。なお、上記逆
コンボリューション処理は、複雑な演算を必要とする
が、本実施例では簡略化した割算器５２６を用いて逆コ
ンボリューションを行っている。The level α is obtained in this way, and this data is sent to the divider 526. The divider 526 deconvolves the level α in the convolved domain. By this deconvolution, a masking threshold is obtained from the level α; this masking threshold is the permissible noise spectrum. Deconvolution normally requires complicated computation, but in this embodiment it is performed in simplified form using the divider 526.

【００６３】次に、上記マスキングスレッショールド
は、合成回路５２７を介して減算器５２８に伝送され
る。ここで、当該減算器５２８には、上記帯域毎のエネ
ルギ検出回路５２２からの出力、すなわち前述したバー
クスペクトルＳＢが、遅延回路５２９を介して供給され
ている。したがって、この減算器５２８で上記マスキン
グスレッショールドとバークスペクトルＳＢとの減算演
算が行われることで、図１３に示すように、上記バーク
スペクトルＳＢは、当該マスキングスレッショールドＭ
Ｓのレベルで示すレベル以下がマスキングされることに
なる。なお、上記遅延回路５２９は、上記合成回路５２
７以前の各回路での遅延量を考慮してエネルギ検出回路
５２２からのバークスペクトルＳＢを遅延させるために
設けられている。Next, the masking threshold is sent to the subtractor 528 via the synthesis circuit 527. The subtractor 528 is also supplied, via the delay circuit 529, with the output of the per-band energy detection circuit 522, that is, the Bark spectrum SB described above. The subtractor 528 performs a subtraction between the masking threshold and the Bark spectrum SB, so that, as shown in FIG. 13, the portion of the Bark spectrum SB at or below the level of the masking threshold MS is masked. The delay circuit 529 is provided to delay the Bark spectrum SB from the energy detection circuit 522 by the amount of delay in the circuits preceding the synthesis circuit 527.

【００６４】当該減算器５２８からの出力は、許容雑音
補正回路５３０を介し、出力端子５３１を介して取り出
され、例えば配分ビット数情報が予め記憶されたＲＯＭ
等（図示せず）に送られる。このＲＯＭ等は、上記減算
回路５２８から許容雑音補正回路５３０を介して得られ
た出力（上記各バンドのエネルギと上記ノイズレベル設
定手段の出力との差分のレベル）に応じ、各バンド毎の
配分ビット数情報を出力する。The output of the subtractor 528 is taken out via the permissible noise correction circuit 530 and the output terminal 531 and sent to, for example, a ROM (not shown) in which allocated bit number information is stored in advance. According to the output obtained from the subtractor 528 via the permissible noise correction circuit 530 (the level of the difference between the energy of each band and the output of the noise level setting means), this ROM outputs the allocated bit number information for each band.

【００６５】このようにしてエネルギ依存ビットと聴覚
許容雑音レベルに依存したビットは加算されてその配分
ビット数情報が上記適応ビット配分符号化回路１６、１
７、１８に送られることで、ここでＭＤＣＴ回路１３、
１４、１５からの周波数領域の各スペクトルデータがそ
れぞれのバンド毎に割り当てられたビット数で量子化さ
れるわけである。The energy-dependent bits and the bits dependent on the auditory permissible noise level are added in this way, and the resulting allocated bit number information is sent to the adaptive bit allocation encoding circuits 16, 17, and 18, where each spectrum data in the frequency domain from the MDCT circuits 13, 14, and 15 is quantized with the number of bits allocated to its band.

【００６６】すなわち要約すれば、上記適応ビット配分
符号化回路１６、１７、１８では、上記クリティカルバ
ンドの各バンド帯域毎（クリティカルバンド毎）若しく
は高域においては当該クリティカルバンドを更に複数帯
域に分割した帯域のエネルギ若しくはピーク値と、上記
ノイズレベル設定手段の出力との差分のレベルに応じて
配分されたビット数で上記各バンド毎のスペクトルデー
タを量子化することになる。In summary, the adaptive bit allocation encoding circuits 16, 17, and 18 quantize the spectrum data of each band with the number of bits allocated according to the level of the difference between the energy or peak value of each critical band (or, in the high band, of each band obtained by further dividing the critical band into a plurality of bands) and the output of the noise level setting means.

【００６７】ところで、上述した合成回路５２７での合
成の際には、最小可聴カーブ発生回路５３２から供給さ
れる図１４に示すような人間の聴覚特性であるいわゆる
最小可聴カーブＲＣを示すデータと、上記マスキングス
レッショールドＭＳとを合成することができる。この最
小可聴カーブにおいて、雑音絶対レベルがこの最小可聴
カーブ以下ならば該雑音は聞こえないことになる。この
最小可聴カーブは、コーディングが同じであっても例え
ば再生時の再生ボリュームの違いで異なるものとなが、
現実的なディジタルシステムでは、例えば１６ビットダ
イナミックレンジへの音楽のはいり方にはさほど違いが
ないので、例えば４ｋＨｚ付近の最も耳に聞こえやすい
周波数帯域の量子化雑音が聞こえないとすれば、他の周
波数帯域ではこの最小可聴カーブのレベル以下の量子化
雑音は聞こえないと考えられる。したがって、このよう
に例えばシステムの持つダイナミックレンジの４ｋＨｚ
付近の雑音が聞こえない使い方をすると仮定し、この最
小可聴カーブＲＣとマスキングスレッショールドＭＳと
を共に合成することで許容ノイズレベルを得るようにす
ると、この場合の許容ノイズレベルは、図１４中の斜線
で示す部分までとすることができるようになる。なお、
本実施例では、上記最小可聴カーブの４ｋＨｚのレベル
を、例えば２０ビット相当の最低レベルに合わせてい
る。また、この図１４は、信号スペクトルＳＳも同時に
示している。When the synthesis circuit 527 performs its synthesis, data representing the so-called minimum audible curve RC, a characteristic of human hearing shown in FIG. 14 and supplied from the minimum audible curve generation circuit 532, can be combined with the masking threshold MS. Noise whose absolute level is below this minimum audible curve is inaudible. Even for the same coding, the minimum audible curve differs depending on, for example, the playback volume. In a practical digital system, however, the way music fits into, for example, a 16-bit dynamic range does not vary much, so if quantization noise in the most audible frequency band around 4 kHz is inaudible, quantization noise below the level of the minimum audible curve in the other frequency bands can be considered inaudible as well. Therefore, assuming the system is used so that noise around 4 kHz at the system's dynamic range is inaudible, and the permissible noise level is obtained by combining the minimum audible curve RC and the masking threshold MS, the permissible noise level can extend up to the hatched portion in FIG. 14. In this embodiment, the level of the minimum audible curve at 4 kHz is matched to the lowest level corresponding to, for example, 20 bits. FIG. 14 also shows the signal spectrum SS.

【００６８】また、上記許容雑音補正回路５３０では、
補正情報出力回路５３３から送られてくる例えば等ラウ
ドネスカーブの情報に基づいて、上記減算器５２８から
の出力における許容雑音レベルを補正している。ここ
で、等ラウドネスカーブとは、人間の聴覚特性に関する
特性曲線であり、例えば１ｋＨｚの純音と同じ大きさに
聞こえる各周波数での音の音圧を求めて曲線で結んだも
ので、ラウドネスの等感度曲線とも呼ばれる。またこの
等ラウドネス曲線は、図１４に示した最小可聴カーブＲ
Ｃと略同じ曲線を描くものである。この等ラウドネス曲
線においては、例えば４ｋＨｚ付近では１ｋＨｚのとこ
ろより音圧が８〜１０ｄＢ下がっても１ｋＨｚと同じ大
きさに聞こえ、逆に、５０Ｈｚ付近では１ｋＨｚでの音
圧よりも約１５ｄＢ高くないと同じ大きさに聞こえな
い。このため、上記最小可聴カーブのレベルを越えた雑
音（許容ノイズレベル）は、この等ラウドネス曲線に応
じたカーブで与えられる周波数特性を持つようにするの
が良いことがわかる。このようなことから、上記等ラウ
ドネス曲線を考慮して上記許容ノイズレベルを補正する
ことは、人間の聴覚特性に適合していることがわかる。The permissible noise correction circuit 530 corrects the permissible noise level in the output of the subtractor 528 based on, for example, equal loudness curve information sent from the correction information output circuit 533. The equal loudness curve is a characteristic curve of human hearing obtained, for example, by finding the sound pressure at each frequency that sounds as loud as a 1 kHz pure tone and connecting the points; it is also called an equal loudness sensitivity curve. It follows roughly the same shape as the minimum audible curve RC shown in FIG. 14. On the equal loudness curve, for example, a sound around 4 kHz is heard as loud as one at 1 kHz even when its sound pressure is 8 to 10 dB lower, whereas a sound around 50 Hz must be about 15 dB higher in sound pressure than at 1 kHz to be heard as equally loud. It is therefore desirable that noise exceeding the level of the minimum audible curve (the permissible noise level) have a frequency characteristic given by a curve corresponding to the equal loudness curve. Correcting the permissible noise level with the equal loudness curve taken into account is thus well matched to human hearing.

【００６９】以上述べた聴覚許容雑音レベルに依存した
スペクトル形状を使用可能総ビット１２８Ｋｂｐｓの内
のある割合を用いるビット配分でつくる。この割合は入
力信号のトーナリティが高くなるほど減少する。The above-described spectral shape depending on the permissible noise level is created by bit allocation using a certain percentage of the total available bits of 128 Kbps. This ratio decreases as the tonality of the input signal increases.

【００７０】次に２つのビット配分手法の間でのビット
量分割手法について説明する。図１０に戻って、ＭＤＣ
Ｔ回路出力が供給される入力端子８０１からの信号は、
スペクトルの滑らかさ算出回路８０８にも与えられ、こ
こでスペクトルの滑らかさが算出される。本実施例で
は、信号スペクトルの絶対値の隣接値間の差の絶対値の
和を信号スペクトルの絶対値の和で割った値を、上記ス
ペクトルの滑らかさとして算出している。Next, the method of dividing the bit amount between the two bit allocation methods will be described. Returning to FIG. 10, the signal from the input terminal 801, to which the MDCT circuit output is supplied, is also given to the spectral smoothness calculation circuit 808, where the smoothness of the spectrum is calculated. In this embodiment, the sum of the absolute values of the differences between adjacent absolute values of the signal spectrum, divided by the sum of the absolute values of the signal spectrum, is calculated as the spectral smoothness.
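The smoothness measure described here translates almost directly into code. In the sketch below the function name and the convention of returning 0 for an all-zero spectrum are assumptions; note that, as the following paragraph of the text explains, a larger value means a less smooth spectrum.

```python
def spectral_smoothness_index(spectrum):
    """Sum of |difference of adjacent magnitudes| divided by sum of magnitudes.

    Returns 0.0 for an all-zero spectrum (assumed convention).
    """
    mags = [abs(x) for x in spectrum]
    total = sum(mags)
    if total == 0:
        return 0.0
    diffs = sum(abs(cur - prev) for prev, cur in zip(mags, mags[1:]))
    return diffs / total
```

A flat spectrum such as `[1, 1, 1, 1]` yields 0, while a spiky one such as `[0, 4, 0, 4]` yields 12 / 8 = 1.5.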

【００７１】上記スペクトルの滑らかさ算出回路８０８
の出力は、ビット分割率決定回路８０９に与えられ、こ
こでエネルギ依存のビット配分と、聴覚許容雑音スペク
トルによるビット配分間のビット分割率とが決定され
る。ビット分割率はスペクトルの滑らかさ算出回路８０
８の出力値が大きいほど、スペクトルの滑らかさが無い
と考えて、エネルギ依存のビット配分よりも、聴覚許容
雑音スペクトルによるビット配分に重点をおいたビット
配分を行う。ビット分割率決定回路８０９は、それぞれ
エネルギ依存のビット配分及び聴覚許容雑音スペクトル
によるビット配分の大きさをコントロールするマルチプ
ライヤ８１１及び８１２に対してコントロール出力を送
る。ここで、仮にスペクトルが滑らかであり、エネルギ
依存のビット配分に重きをおくように、マルチプライヤ
８１１へのビット分割率決定回路８０９の出力が０．８
の値を取ったとき、マルチプライヤ８１２へのビット分
割率決定回路８０９の出力は１−０．８＝０．２とす
る。これら２つのマルチプライヤの出力はアダー８０６
で足し合わされて最終的なビット配分情報となって、出
力端子８０７から出力される。The output of the spectral smoothness calculation circuit 808 is supplied to the bit division ratio determination circuit 809, which determines the bit division ratio between the energy-dependent bit allocation and the bit allocation based on the auditory permissible noise spectrum. The larger the output value of the spectral smoothness calculation circuit 808, the less smooth the spectrum is taken to be, and the more the bit allocation is weighted toward the allocation based on the auditory permissible noise spectrum rather than the energy-dependent allocation. The bit division ratio determination circuit 809 sends control outputs to the multipliers 811 and 812, which control the magnitudes of the energy-dependent bit allocation and the bit allocation based on the auditory permissible noise spectrum, respectively. For example, if the spectrum is smooth and the output of the bit division ratio determination circuit 809 to the multiplier 811 takes the value 0.8 so as to weight the energy-dependent bit allocation, its output to the multiplier 812 is 1 − 0.8 = 0.2. The outputs of the two multipliers are summed by the adder 806 to give the final bit allocation information, which is output from the output terminal 807.
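The role of the two multipliers and the adder can be sketched as a weighted sum of the two per-band allocations. How circuit 809 derives the ratio from the smoothness value is not specified in the text, so the ratio is taken here as an input; the function and parameter names are hypothetical.

```python
def combine_allocations(energy_bits, noise_bits, energy_weight):
    """Blend two per-band bit allocations with weights r and 1 - r.

    energy_bits: energy-dependent allocation (input of multiplier 811).
    noise_bits: permissible-noise-dependent allocation (input of multiplier 812).
    energy_weight: ratio r sent to multiplier 811; 1 - r goes to multiplier 812.
    """
    if not 0.0 <= energy_weight <= 1.0:
        raise ValueError("energy_weight must lie between 0 and 1")
    if len(energy_bits) != len(noise_bits):
        raise ValueError("allocations must cover the same bands")
    noise_weight = 1.0 - energy_weight
    return [energy_weight * e + noise_weight * n
            for e, n in zip(energy_bits, noise_bits)]
```

With the example ratio from the text, `combine_allocations(energy, noise, 0.8)` weights the energy-dependent allocation by 0.8 and the other by 0.2.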

【００７２】このときのビット配分の様子を図１５、図
１６に示す。また、これに対応する量子化雑音の様子を
図１７、図１８に示す。図１５は信号のスペクトルが割
合平坦である場合を示しており、図１６は信号スペクト
ルが高いトーナリティを示す場合を示している。また、
図１５及び図１６の図中ＱＳは信号レベル依存分のビッ
ト量を示し、図中ＱＮは聴覚許容雑音レベル依存のビッ
ト割当分のビット量を示している。図１７及び図１８の
図中Ｌは信号レベルを示し、図中ＮＳは信号レベル依存
分による雑音低下分を、図中ＮＮは聴覚許容雑音レベル
依存のビット割当分による雑音低下分を示している。FIGS. 15 and 16 show the bit allocation in this case, and FIGS. 17 and 18 show the corresponding quantization noise. FIG. 15 shows the case where the signal spectrum is relatively flat, and FIG. 16 shows the case where the signal spectrum exhibits high tonality. In FIGS. 15 and 16, QS denotes the bit amount of the signal-level-dependent allocation, and QN denotes the bit amount of the allocation dependent on the auditory permissible noise level. In FIGS. 17 and 18, L denotes the signal level, NS denotes the noise reduction due to the signal-level-dependent allocation, and NN denotes the noise reduction due to the allocation dependent on the auditory permissible noise level.

【００７３】先ず、信号のスペクトルが、割合平坦であ
る場合を示す図１５において、聴覚許容雑音レベルに依
存したビット配分は、全帯域に渡り大きい信号雑音比を
取るために役立つ。しかし低域及び高域では比較的少な
いビット配分が使用されている。これは聴覚的にこの帯
域の雑音に対する感度が小さいためである。信号エネル
ギレベルに依存したビット配分の分は量としては少ない
が、ホワイトな雑音スペクトルを生じるように、この場
合には中低域の信号レベルの高い周波数領域に重点的に
配分されている。First, in FIG. 15, which shows the case where the signal spectrum is relatively flat, the bit allocation dependent on the auditory permissible noise level serves to obtain a large signal-to-noise ratio over the entire band. Relatively few bits are allocated in the low and high bands, however, because hearing is less sensitive to noise in these bands. The bit allocation dependent on the signal energy level is small in amount, but in this case it is concentrated in the low-to-mid frequency region where the signal level is high, so as to produce a white noise spectrum.

【００７４】これに対して、図１６に示すように、信号
スペクトルが高いトーナリティを示す場合には、信号エ
ネルギレベルに依存したビット配分量が多くなり、量子
化雑音の低下は極めて狭い帯域の雑音を低減するために
使用される。聴覚許容雑音レベルに依存したビット配分
の集中はこれよりもきつくない。In contrast, when the signal spectrum exhibits high tonality, as shown in FIG. 16, the bit allocation dependent on the signal energy level becomes larger, and the resulting reduction in quantization noise is used to lower the noise in an extremely narrow band. The bit allocation dependent on the auditory permissible noise level is less sharply concentrated than this.

【００７５】図１６に示すように、この両者のビット配
分の和により、孤立スペクトル入力信号での特性の向上
が達成される。As shown in FIG. 16, the improvement of the characteristics in the isolated spectrum input signal is achieved by the sum of the bit allocation of these two.

【００７６】以上の様にして得られた基礎ビット配分
に、次のようにして前記付加ビット配分（ステップＳＴ
３）部分を付け加える。The additional bit allocation (step ST3) is added to the basic bit allocation obtained as described above, in the following manner.

【００７７】次に、図１９を用いて基礎ビット配分と付
加ビット配分部分の分離及び再生時の結合について説明
する。Next, the separation of the basic bit allocation portion and the additional bit allocation portion and the coupling at the time of reproduction will be described with reference to FIG.

【００７８】先ず、図１９の構成の入力端子９００に
は、図１のＭＤＣＴ回路１３，１４，１５の出力である
ＭＤＣＴ係数が供給されるとする。すなわち、図１９は
図１の適応ビット割当符号化回路に含まれるものであ
る。First, it is assumed that the MDCT coefficients which are the outputs of the MDCT circuits 13, 14, 15 of FIG. 1 are supplied to the input terminal 900 having the configuration of FIG. That is, FIG. 19 is included in the adaptive bit allocation encoding circuit of FIG.

【００７９】この図１９において、上記入力端子９００
に供給されたＭＤＣＴ係数（ＭＤＣＴサンプル）は正規
化回路９０５によって複数サンプル毎に、ブロックにつ
いての正規化処理すなわちブロックフローティングが施
される。この時どの程度のブロックフローティングが行
われたかを示す係数としてスケールファクタが得られ
る。In FIG. 19, the MDCT coefficients (MDCT samples) supplied to the input terminal 900 are subjected by the normalization circuit 905 to block-wise normalization, that is, block floating, for every group of several samples. A scale factor is obtained at this point as a coefficient indicating how much block floating was applied.

【００８０】次段の第１の量子化器(quantizer) ９０１
は、前記基礎ビット配分で与えられた各サンプル語長で
量子化を行なう。この時、量子化雑音を少なくするため
には四捨五入による量子化が行われる。The first quantizer 901 in the next stage quantizes each sample with the word length given by the basic bit allocation. Quantization by rounding is used here to reduce the quantization noise.

【００８１】次に、上記正規化回路９０５の出力と上記
量子化器９０１の出力が差分器９０２に送られる。すな
わち、当該差分器９０２では、量子化器９０１の入力と
出力の差（量子化誤差）が取られる。この差分器９０２
からの出力は、さらに正規化回路９０６を介して第２の
量子化器９０３に送られる。Next, the output of the normalization circuit 905 and the output of the quantizer 901 are sent to the difference unit 902, which takes the difference between the input and output of the quantizer 901 (the quantization error). The output of the difference unit 902 is then sent to the second quantizer 903 via the normalization circuit 906.

【００８２】当該第２の量子化器９０３では、例えば２
ビットが各サンプル毎に使用される。この時のフローテ
ィング係数は、第１の量子化器９０１で用いられたフロ
ーティング係数と語長から自動的に決定される。In the second quantizer 903, for example, 2 bits are used for each sample. The floating coefficient used here is determined automatically from the floating coefficient and word length used in the first quantizer 901.

【００８３】すなわち、この図１９の構成のエンコーダ
側では、第１の量子化器９０１で用いられた語長が例え
ばＮビットであったときには（２＊＊Ｎ）で第２の量子
化器９０３で用いられるフローティング係数が得られ
る。That is, on the encoder side of the configuration of FIG. 19, when the word length used in the first quantizer 901 is, for example, N bits, the floating coefficient used in the second quantizer 903 is obtained as (2**N).

【００８４】また、上記付加ビット配分のための第２の
量子化器９０３では、上記基礎ビット配分のための第１
の量子化器９０１と同じように四捨五入処理を含むビッ
ト配分を行う。このようにして２つの量子化により、２
つのビット配分に分けられる。The second quantizer 903 for the additional bit allocation performs bit allocation including rounding, in the same way as the first quantizer 901 for the basic bit allocation. The two quantizations thus divide the bits into two bit allocations.

[0085] Even if the word length for the additional bit allocation is not fixed, the scale factor of the additional bit allocation can be calculated, as described above, from the scale factor and the word length of the basic bit allocation, so the decoder needs only the word length. In this embodiment the word length of the additional bit allocation is fixed at 2 bits, so not even that word length is needed. In this way, efficient quantization is achieved in which the outputs of the quantizers 901 and 903 are each rounded.

[0086] If the output bit rates of the quantizers 901 and 903 are both fixed, the system can be simplified when recording on media such as disks and tapes. Alternatively, both may be variable while their total is kept constant. Of course, the output bit rate of only one of the quantizers may be fixed.

[0087] In the configuration (decoder) corresponding to the configuration (encoder) of FIG. 19, inverse normalization circuits 908 and 907 are provided, which perform inverse normalization corresponding to the normalization circuits 905 and 906. The outputs of the circuits 908 and 907 are added by an adder 904, and the sum is taken from an output terminal 910.
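The decoder-side addition described above can be sketched as follows. This is an illustration only: the function name, the integer-code model of the quantizer outputs, and the rule `scale2 = scale1 / 2**n_bits` are assumptions, not details given in the text. Passing `None` for the additional code models playback at the lower bit rate.

```python
def two_stage_decode(q1, q2, scale1, n_bits, extra_bits=2):
    """Inverse-normalize both codes and add them.

    q2 may be None when the additional bits were dropped.
    """
    scale2 = scale1 / 2 ** n_bits                 # assumed rule, not transmitted
    base = q1 * scale1 / 2 ** (n_bits - 1)        # inverse normalization, basic part
    if q2 is None:
        return base                               # basic bit allocation only
    extra = q2 * scale2 / 2 ** (extra_bits - 1)   # inverse normalization, additional part
    return base + extra                           # adder

full = two_stage_decode(2, 1, scale1=1.0, n_bits=4)
reduced = two_stage_decode(2, None, scale1=1.0, n_bits=4)
print(full, reduced)
```

The same function serves both bit rates, which is the property the two-part allocation is meant to provide.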

[0088] FIG. 20 shows a basic example of a high-efficiency decoding device for decoding a signal that has been encoded in this way.

[0089] In FIG. 20, the quantized MDCT coefficients of each band are supplied to decoder input terminals 122, 124, and 126, and the block size information that was used is supplied to input terminals 123, 125, and 127. The decoding circuits 116, 117, and 118 undo the bit allocation using the adaptive bit allocation information.

[0090] Next, the IMDCT circuits 113, 114, and 115 convert the frequency-domain signals into time-domain signals. These sub-band time-domain signals are then decoded into a full-band signal by the IQMF circuits 112 and 111.

[0091] That is, the 128 kbps of the basic bit allocation and the 64 kbps of the additional bit allocation are each decoded by the decoding circuits 116, 117, and 118. After the two parts have been decoded separately, their time-domain samples are added to give high-accuracy samples.

[0092] Of course, in FIG. 20 it is also possible to compute the basic bit allocation output and the additional bit allocation output separately at the outputs of the IMDCT circuits 113, 114, and 115, combine them, and then send the result to the IQMF circuits 112 and 111.

[0093] Furthermore, the decoding circuits 116, 117, and 118 may add the basic bit allocation and the additional bit allocation after undoing the normalization, and the sum may then be subjected to IMDCT and IQMF processing to obtain the final output.
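The alternative decoding orders described above give the same result because the IMDCT and the IQMF synthesis are linear operations. As a worked statement, writing X_b and X_a for the inverse-normalized basic and additional coefficients (symbols introduced here for illustration), assuming both parts use the same block size, and ignoring finite-precision arithmetic:

```latex
\mathrm{IQMF}\bigl(\mathrm{IMDCT}(X_b + X_a)\bigr)
  = \mathrm{IQMF}\bigl(\mathrm{IMDCT}(X_b) + \mathrm{IMDCT}(X_a)\bigr)
  = \mathrm{IQMF}\bigl(\mathrm{IMDCT}(X_b)\bigr)
  + \mathrm{IQMF}\bigl(\mathrm{IMDCT}(X_a)\bigr)
```

The left-hand side corresponds to adding before the IMDCT, the middle expression to adding between the IMDCT and the IQMF, and the right-hand side to adding the fully decoded time-domain samples.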

[0094] The media used in the embodiment of the present invention are those on which a signal encoded by the high-efficiency encoding device of the embodiment described above is recorded, or over which it is transmitted. Examples of recording media include disk-shaped recording media such as optical disks, magneto-optical disks, and magnetic disks on which the encoded signal is recorded, tape-shaped recording media such as magnetic tape on which the encoded signal is recorded, and semiconductor memories and IC cards on which the encoded signal is recorded. Examples of transmission media include electric wires, optical cables, and radio waves.

[0095] Data on the media used in the embodiment of the present invention is arranged as shown in FIG. 21. That is, one sync block consists of sync information, sub information (scale factors and word lengths), the basic bit allocation, and the additional bit allocation.

[0096] In this case, time-domain or frequency-domain samples are quantized, and each sample is individually decomposed into at least two words by at least one quantization function that further quantizes the quantization error of the preceding stage. Recording or transmitting these words within one sync block, separated by quantizer output, and then decoding and reproducing them is effective in that the bit string portion to be removed for playback at a reduced bit rate can be removed in one piece.
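The separated arrangement described above can be sketched as follows. The dictionary layout, the field names, and the `n_basic` count are hypothetical conveniences for illustration; the actual bit-level format of FIG. 21 is not reproduced here.

```python
def pack_separated(sync, sub_info, basic_words, extra_words):
    """Build a sync block with all basic words first, then all additional words."""
    return {
        "sync": sync,
        "sub": sub_info,                      # scale factors and word lengths
        "payload": list(basic_words) + list(extra_words),
        "n_basic": len(basic_words),          # hypothetical helper field
    }

def drop_additional(block):
    """Lower the bit rate with a single cut at the end of the basic part."""
    cut = dict(block)
    cut["payload"] = block["payload"][: block["n_basic"]]
    return cut

block = pack_separated("SYNC", {"sf": [3], "wl": [4]}, [5, -2, 7], [1, 0, -1])
reduced_block = drop_additional(block)
print(reduced_block["payload"])
```

Since the additional words form one contiguous run at the end of the payload, removing them requires no per-sample parsing.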

[0097] As another method, time-domain or frequency-domain samples are quantized, and each sample is individually decomposed into at least two words by at least one quantization function that further quantizes the quantization error of the preceding stage. Recording or transmitting the quantizer outputs alternately, in frequency order or time order, within one sync block, and then decoding and reproducing from the time-domain or frequency-domain samples, is effective in that, for playback at a reduced bit rate, the bit string portion to be removed can be removed in one piece in a way that limits the frequency band.
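The alternating arrangement described above can be sketched in the same style. The function names and the list-of-integers payload are assumptions for illustration; the words are taken to be in ascending frequency order, so cutting the tail of the payload removes the highest-frequency coefficients.

```python
def pack_interleaved(basic_words, extra_words):
    """Alternate basic and additional words in frequency order: b0, a0, b1, a1, ..."""
    payload = []
    for b, a in zip(basic_words, extra_words):
        payload += [b, a]
    return payload

def band_limit(payload, keep_coeffs):
    """Keep both words of the lowest keep_coeffs coefficients; drop the rest in one cut."""
    return payload[: 2 * keep_coeffs]

def unpack_interleaved(payload):
    """Split the payload back into basic and additional words."""
    return payload[0::2], payload[1::2]

payload = pack_interleaved([1, 2, 3], [10, 20, 30])
limited = band_limit(payload, keep_coeffs=2)
print(limited, unpack_interleaved(limited))
```

Compared with the separated arrangement, the retained coefficients keep their full two-stage precision, and the bit-rate reduction is paid for in bandwidth instead.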

[0098] The bit arrangements described above can be applied in particular to the so-called Mini Disc, which uses a magneto-optical disk or an optical disk, as well as to magnetic tape media, communication media, and the like.

[0099]

[Effects of the Invention]

As is clear from the above description, the present invention provides the following effects.
(1) When decoding at a bit rate lower than the one used for encoding, for example when the encoder side diverts some of the encoded bits to another data transfer, sound quality degradation is kept to a minimum.
(2) Where playback devices that reproduce at a low bit rate are already in use, a higher-quality system using a higher bit rate can be introduced while remaining backward compatible with those existing low-bit-rate playback devices.
(3) When recording on a storage medium that uses an expensive storage device such as an IC card, and the recording time is to be extended beyond its initial setting, the bit rate of encoded information already recorded or being recorded can be reduced as appropriate to extend the recording time while minimizing the resulting sound quality degradation.
(4) A high-quality decoder can be built from several inexpensive, commonly used decoders that perform bit allocation at a lower bit rate, so no new decoder LSI has to be developed and the goal can be achieved at low cost.

[Brief description of the drawings]

FIG. 1 is a block circuit diagram showing a configuration example of a high-efficiency encoding device according to an embodiment of the present invention.

FIG. 2 is a diagram showing the frequency and time division of a signal in the device of the embodiment.

FIG. 3 is a diagram showing the bit allocation strategy of the embodiment.

FIG. 4 is a diagram for explaining a method of calculating tonality from scale factors.

FIG. 5 is a diagram for explaining a method of obtaining the bit allocation amount of bit allocation (1) from the tonality.

FIG. 6 is a diagram showing the noise spectrum for uniform allocation in bit allocation (2).

FIG. 7 is a diagram showing an example of the noise spectrum in bit allocation (2) for a bit allocation that obtains an auditory effect by depending on the frequency spectrum and level of the information signal.

FIG. 8 is a diagram showing uniform allocation in bit allocation (2).

FIG. 9 is a diagram showing, for bit allocation (2), a bit allocation method that obtains an auditory effect by depending on the frequency spectrum and level of the information signal.

FIG. 10 is a block circuit diagram showing a configuration example of the basic bit allocation function of the embodiment of the present invention.

FIG. 11 is a block circuit diagram showing a configuration example of the auditory masking threshold calculation function of the embodiment of the present invention.

FIG. 12 is a diagram showing masking by each critical band signal.

FIG. 13 is a diagram showing the masking threshold due to each critical band signal.

FIG. 14 is a diagram showing the information spectrum, the masking threshold, and the minimum audible limit.

FIG. 15 is a diagram showing signal-level-dependent and auditory-permissible-noise-level-dependent bit allocation for an information signal with a flat signal spectrum.

FIG. 16 is a diagram showing signal-level-dependent and auditory-permissible-noise-level-dependent bit allocation for an information signal whose spectrum has high tonality.

FIG. 17 is a diagram showing the quantization noise level for an information signal with a flat signal spectrum.

FIG. 18 is a diagram showing the quantization noise level for an information signal with high tonality.

FIG. 19 is a block circuit diagram showing a specific configuration for dividing the bits into the basic bit allocation and the additional bit allocation.

FIG. 20 is a block circuit diagram showing a configuration example of a high-efficiency decoding device used in the embodiment of the present invention.

FIG. 21 is a diagram showing a configuration example of the bit arrangement on the media used in the embodiment of the present invention.

[Explanation of symbols]

10 ... high-efficiency encoding circuit input terminal
11, 12 ... band division filters
13, 14, 15 ... MDCT circuits
16, 17, 18 ... adaptive bit allocation encoding circuits
19, 20, 21 ... block size determination circuits
22, 24, 26 ... encoded output terminals
23, 25, 27 ... block size information output terminals
122, 124, 126 ... encoded input terminals
123, 125, 127 ... block size information input terminals
116, 117, 118 ... adaptive bit allocation decoding circuits
113, 114, 115 ... IMDCT circuits
112, 111 ... IQMF circuits
110 ... high-efficiency decoding circuit output terminal
520 ... permissible noise calculation circuit
521 ... permissible noise calculation circuit input terminal
522 ... per-band energy detection circuit
523 ... convolution filter circuit
524 ... subtractor
525 ... n-ai function generation circuit
526 ... divider
527 ... synthesis circuit
528 ... subtractor
530 ... permissible noise correction circuit
532 ... minimum audible curve generation circuit
533 ... correction information output circuit
801 ... input terminal for the MDCT circuit output
802 ... total available bits generation circuit
803 ... per-band energy calculation circuit
804 ... energy-dependent bit allocation circuit
805 ... auditory-permissible-noise-level-dependent bit allocation circuit
806 ... adder
807 ... output terminal for the bit allocation amount of each band
808 ... spectral smoothness calculation circuit
809 ... bit division ratio determination circuit
811, 812 ... multipliers

Continued from the front page: (58) Fields searched (Int. Cl.7, DB name): H03M 7/30, G10L 13/00, G11B 20/10 301

Claims

(57) [Claims]

1. An encoding device for quantizing a time-domain sample or a frequency-domain sample of an input signal with a predetermined number of bits, comprising: first quantization means for quantizing the time-domain sample or the frequency-domain sample; quantization error calculation means for calculating a quantization error of the first quantization means; second quantization means for further quantizing the quantization error; and encoding means for encoding the output of the first quantization means and the output of the second quantization means.

2. The encoding apparatus according to claim 1, wherein an output bit rate of said second quantization means is set to a constant bit rate in a fixed time unit.

3. The encoding apparatus according to claim 2, wherein the output bit rates of said first quantization means and said second quantization means are set to a fixed bit rate in fixed time units.

4. The encoding device according to claim 1, further comprising: first block floating means for dividing the time-domain samples or frequency-domain samples into blocks of a plurality of samples and applying block floating processing; and second block floating means for applying block floating processing to the output of the quantization error calculation means, wherein the second block floating means sets the scale factor for its block floating with reference to the first block floating means, the first quantization means quantizes the output of the first block floating means, and the second quantization means quantizes the output of the second block floating means.

5. The encoding device according to claim 4, wherein the second block floating means obtains the scale factor for the block floating based on the scale factor in the first block floating means and the word length in the first quantization means.

6. An encoding device which quantizes and encodes a time-domain sample or a frequency-domain sample of an input signal with a predetermined number of bits and records the encoded signal on a recording medium, comprising: first quantization means for quantizing the time-domain sample or the frequency-domain sample; quantization error calculation means for calculating a quantization error of the first quantization means; second quantization means for further quantizing the quantization error; and recording means for recording, within one sync block, the output samples of the first quantization means and the output samples of the second quantization means separately from each other.

7. An encoding device which quantizes and encodes a time-domain sample or a frequency-domain sample of an input signal with a predetermined number of bits and records the encoded signal on a recording medium, comprising: first quantization means for quantizing the time-domain sample or the frequency-domain sample; quantization error calculation means for calculating a quantization error of the first quantization means; second quantization means for further quantizing the quantization error; and recording means for recording, within one sync block, the output samples of the first quantization means and the output samples of the second quantization means alternately in time order or frequency order.