JPH04302531A

JPH04302531A - High-efficiency encoding device for digital data

Info

Publication number: JPH04302531A
Application number: JP9118491A
Authority: JP
Inventors: Kiyouya Tsutsui; 京弥筒井; Osamu Shimoyoshi; 下吉　修
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1991-03-29
Filing date: 1991-03-29
Publication date: 1992-10-26

Abstract

PURPOSE: To obtain an optimum encoded output by selecting one of the outputs of a plurality of orthogonal transforming means on the basis of those outputs. CONSTITUTION: Input digital audio data are divided into a plurality of bands whose bandwidth widens toward higher frequencies, and blocks consisting of a plurality of samples are formed in each divided band. Coefficient data are obtained by orthogonal transformation, for example fast Fourier transformation, of each block in each band, and are then encoded with an adaptively allocated number of bits. Encoders 2-5, which orthogonally transform the input digital data with mutually different block lengths, are provided, and a selecting circuit 6 selects one of the outputs of the encoders 2-5 on the basis of the outputs of their orthogonal transforming means. That is, only one of the outputs is sent out through the switching operation of a changeover switch 7 in accordance with the select signal from the selecting circuit 6.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a high-efficiency encoding device for digital data that encodes input digital data by so-called high-efficiency encoding.

[0002]

[Prior Art] There are various techniques for high-efficiency encoding of audio, voice and similar signals. Examples include band division coding (sub-band coding: SBC), in which an audio signal or the like on the time axis is divided into a plurality of frequency bands and encoded, and so-called transform coding, in which a signal on the time axis is converted (orthogonally transformed) into a signal on the frequency axis, divided into a plurality of frequency bands, and encoded band by band. A high-efficiency encoding technique that combines band division coding and transform coding has also been considered. In that case, for example, the signal is first divided into bands by band division coding, the signal of each band is orthogonally transformed into a signal on the frequency axis, and each orthogonally transformed band is then encoded. One example of such an orthogonal transformation divides the input audio signal into blocks of a predetermined unit time (frames) and applies a fast Fourier transform (FFT) to each block, thereby converting the time axis into the frequency axis. The band division may also take human auditory characteristics into account. That is, the audio signal may be divided into a plurality of bands (for example, 25 bands) whose bandwidth widens toward higher frequencies, generally called critical bands. The data of each band are then encoded either with a predetermined bit distribution for each band or with adaptive bit allocation for each band. For example, when the FFT coefficient data obtained by the FFT processing are encoded with bit allocation, the FFT coefficient data of each band obtained by the FFT processing of each block are encoded with an adaptively allocated number of bits.

[0003]

[Problems to be Solved by the Invention] In the encoding described above, when the input audio signal is divided into, for example, a plurality of bands and an orthogonal transformation such as a fast Fourier transform (FFT) is then performed on each band (that is, when frequency analysis is performed in each band), the signal of each band is usually divided into blocks of a predetermined time unit (frame unit), and the orthogonal transformation is performed on each of these per-band blocks.

[0004] The coefficient data (FFT coefficient data) obtained by this orthogonal transformation are then encoded, and the number of bits used in this encoding is allocated to each block of the frame unit.

[0005] An input audio signal, however, is not always a stationary signal with little fluctuation in level and the like; its level and other properties change in various ways. For example, the signal may have a large temporal variation in peak level within the frame (a transiently changing signal). In an audio signal such as the strike of a percussion instrument, the struck portion is such a transiently changing signal.

[0006] When an audio signal whose characteristics change in this way, for example between stationary and transient, is uniformly orthogonally transformed in blocks of a fixed frame length as described above and the transformed data are encoded, the encoding can hardly be called well adapted to the nature of the signal, and the sound quality after decoding is not necessarily good to the ear.

[0007] The present invention has been proposed in view of these circumstances. Its object is to provide a high-efficiency encoding device for digital data that permits highly efficient compression encoding better adapted to the nature (characteristics) of the input audio signal and that yields a decoded signal of good perceived quality.

[0008]

[Means for Solving the Problems] The high-efficiency encoding device for digital data of the present invention has been proposed to achieve the above object. It divides input digital data into blocks of a plurality of samples, performs an orthogonal transformation on each block to obtain coefficient data, and encodes the coefficient data with an adaptively allocated number of bits. The device has a plurality of orthogonal transform means that orthogonally transform the input digital data with mutually different block lengths and, on the basis of the outputs of these orthogonal transform means, selects only one of their outputs. The input digital data supplied to each orthogonal transform means can be, for example, the data of each band obtained by division into so-called critical bandwidths. The orthogonal transformation can be, for example, so-called fast Fourier transform (FFT) processing, in which case the FFT coefficient data are what is encoded. In making the selection, for example, only the output of the orthogonal transform means for which the number of allocated bits needed to encode that output (the number of bits necessary to achieve a specified sound quality) is smallest is selected. That is, the number of bits per frame output from the device after the orthogonally transformed coefficient data have been encoded is a predetermined fixed number, but because adaptive bit allocation is performed in each encoding process, the number of bits actually required to encode the frame may differ from that fixed number. Therefore, if, for example, the number of bits actually required is smaller than the number predetermined for the frame, the surplus bits can be used to encode the signal more accurately, so selecting the output that requires the fewest bits yields the best encoded output. Likewise, even if the number of bits actually required exceeds the number predetermined for the frame in every encoding process, selecting the output that requires the fewest bits yields the encoded output with the least degradation due to encoding.

[0009]

[Operation] According to the present invention, the input digital data are orthogonally transformed with a plurality of block lengths, and only one of the outputs of the orthogonal transform means is selected on the basis of those outputs. If, for example, the output that requires the fewest bits for encoding is selected, the optimum encoded output can be obtained.

[0010]

[Embodiment] Embodiments to which the present invention is applied are described below with reference to the drawings. As shown in FIG. 1, the high-efficiency encoding device for digital data of this embodiment divides input digital data such as audio into a plurality of bands whose bandwidth widens toward higher frequencies, forms blocks consisting of a plurality of samples in each divided band, performs an orthogonal transformation, for example a fast Fourier transform (FFT), on each block of each band to obtain coefficient data (FFT coefficient data), and encodes the coefficient data with an adaptively allocated number of bits. The device is provided with a plurality of encoders (for example, four encoders 2 to 5), each having orthogonal transform means that orthogonally transform the input digital data of each band with mutually different block lengths, and, on the basis of the outputs of the orthogonal transform means of the encoders 2 to 5, selects only one of the outputs of the encoders 2 to 5 (that is, one of the outputs of the orthogonal transform means). In other words, the block lengths of the blocks that are FFT-processed in each band differ from encoder to encoder, and a selection circuit 6 selects only one of the outputs of the encoders 2 to 5 on the basis of those outputs. Specifically, a changeover switch 7 is switched according to the selection signal from the selection circuit 6, so that only one of the outputs of the encoders 2 to 5 is output.

[0011] In the selection performed by the selection circuit 6, for example, only the output of the encoder whose encoding process requires the fewest bits among the encoders 2 to 5 is selected. That is, the number of bits per frame output from the encoders 2 to 5 is a predetermined fixed number, but in each encoding process the number of bits actually required to encode the frame is obtained by adaptive bit allocation that takes into account the masking effect and the like described later. The selection picks only the output of the encoder for which this actually required number of bits is smallest. Consequently, if the number of bits required for the encoding is, for example, smaller than the number predetermined for the frame, the surplus bits can be used for better encoding, and selecting the encoder output that requires the fewest bits yields the best encoded output. Likewise, even if the number of bits actually required exceeds the number predetermined for the frame in every one of the encoders 2 to 5, selecting the output that requires the fewest bits yields the encoded output with the least degradation due to encoding.
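The selection rule described here can be sketched in a few lines. This is a minimal illustration, not the circuit itself: the function names, the frame budget and the per-encoder bit demands are all hypothetical values chosen for the example.

```python
from typing import Sequence


def select_encoder(required_bits: Sequence[int]) -> int:
    """Return the index of the candidate that needs the fewest bits.

    Ties are resolved in favour of the earliest candidate (an assumption;
    the text does not say how ties are handled).
    """
    if not required_bits:
        raise ValueError("at least one encoder output is required")
    return min(range(len(required_bits)), key=lambda i: required_bits[i])


def spare_bits(required: int, frame_budget: int) -> int:
    """Positive: bits left over for refinement. Negative: bits that must be cut."""
    return frame_budget - required


# Hypothetical bit demands reported by encoders 2, 3, 4 and 5 for one frame.
demands = [1180, 960, 1020, 1300]
chosen = select_encoder(demands)          # index 1, i.e. encoder 3
surplus = spare_bits(demands[chosen], 1024)  # hypothetical fixed frame budget
```

Whether the chosen candidate leaves a surplus or a deficit, the same rule applies: the smallest demand gives either the most spare bits or the smallest cut.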

[0012] Specifically, in each of the encoders 2 to 5 of this embodiment, as shown for example in FIG. 2, input digital data such as audio or voice supplied through the input terminal 1 are divided by quadrature mirror filters (QMFs) 41 and 42 so that the bandwidth widens toward higher frequencies, with the division into so-called critical bands in mind (for example, broadly into three bands). Fast Fourier transform (FFT) circuits 43, 44 and 45 form blocks consisting of a plurality of samples in each divided band and perform an orthogonal transformation (conversion from the time axis to the frequency axis) by fast Fourier transform on each block to obtain coefficient data (FFT coefficient data). The output of the fast Fourier transform circuit 43 corresponds to, for example, two high-range critical bands, the output of the fast Fourier transform circuit 44 to, for example, three mid-range critical bands, and the output of the fast Fourier transform circuit 45 to, for example, 20 low-range critical bands. The FFT coefficient data of each band from the fast Fourier transform circuits 43, 44 and 45 are encoded by encoding circuits 46, 47 and 48 with an adaptively allocated number of bits. That is, the encoding circuits 46, 47 and 48 of this embodiment encode the FFT coefficient data of the three bands with an adaptively allocated number of bits based on human auditory characteristics.

[0013] To perform this band division, the input terminal 1 in FIG. 2 (the input terminal of each of the encoders 2 to 5) is supplied with digital data (0 to 22.1 kHz) obtained by sampling an analog audio signal or the like (for example, 1024 samples). The QMFs 41 and 42 divide the digital data broadly into three bands (0 to 5.5 kHz, 5.5 kHz to 11.0 kHz, and 11.0 kHz to 22.1 kHz) so that the bandwidth widens toward higher frequencies. The QMF 41 splits the 0 to 22.1 kHz digital data in two, giving an 11.0 kHz to 22.1 kHz output and a 0 to 11.0 kHz output; the former is sent to the fast Fourier transform circuit 43 and the latter to the QMF 42. The QMF 42 splits the 0 to 11.0 kHz output in two again, giving a 5.5 kHz to 11.0 kHz output, which is sent to the fast Fourier transform circuit 44, and a 0 to 5.5 kHz output, which is sent to the fast Fourier transform circuit 45.
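The two-stage split can be checked with a short calculation of the band edges. The sketch below only computes frequency ranges; it does not implement the filters. It assumes an upper limit of 22.05 kHz (half of a 44.1 kHz sampling rate, which the text appears to round to 22.1 kHz), and the function names are illustrative.

```python
def split_band(low_hz: float, high_hz: float) -> tuple:
    """Split a frequency range into equal lower and upper halves."""
    mid_hz = (low_hz + high_hz) / 2.0
    return (low_hz, mid_hz), (mid_hz, high_hz)


def three_band_edges(upper_limit_hz: float) -> dict:
    """Band edges produced by two cascaded half-band splits."""
    # First split, corresponding to the role of QMF 41.
    lower_half, high = split_band(0.0, upper_limit_hz)
    # Second split of the lower half, corresponding to the role of QMF 42.
    low, mid = split_band(*lower_half)
    return {"low": low, "mid": mid, "high": high}


# Assumed upper limit: 22.05 kHz.
edges = three_band_edges(22050.0)
for name, (lo, hi) in edges.items():
    print(f"{name}: {lo / 1000:.2f} to {hi / 1000:.2f} kHz")
```

Each split halves the remaining lower range, so the high band is twice as wide as the middle band and four times the width that a further split would give, which is how the bandwidth comes to widen toward higher frequencies.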

[0014] Each of the fast Fourier transform circuits 43, 44 and 45 forms one frame B from a plurality of samples (for example, 1024 samples) of the supplied band data and applies Fourier transform processing to the blocks of each frame B to obtain FFT coefficient data. As described above, however, the block lengths used for FFT processing in the fast Fourier transform circuits 43, 44 and 45 differ among the encoders 2 to 5.

[0015] In the encoder 2, for example, as shown in FIG. 3, the FFT block lengths of the bands are made equal. That is, in the encoder 2 the FFT block length bH in the fast Fourier transform circuit 43 for the high range of 11.0 kHz to 22.1 kHz, the block length bM in the fast Fourier transform circuit 44 for the middle range of 5.5 kHz to 11.0 kHz, and the FFT block length bL in the fast Fourier transform circuit 45 for the low range of 0 to 5.5 kHz are all equal to the length of the frame B of the predetermined unit time.

[0016] In the encoder 3, as shown in FIG. 4, the FFT block length is shortened in the high range. That is, relative to the block length bL in the fast Fourier transform circuit 45 for the low range and the block length bM in the fast Fourier transform circuit 44 for the middle range, the block length in the fast Fourier transform circuit 43 for the high range is, for example, 1/2 of the low-range (or mid-range) block length bL (or bM). In the illustrated example, the high-range block is divided into blocks of lengths bH1 and bH2.

[0017] In the encoder 4, as shown in FIG. 5, the FFT block length is shortened in the middle and high ranges. That is, taking the low-range block length as bL, the middle range uses block lengths bM1 and bM2 that are, for example, 1/2 of the low-range length, and the high range uses block lengths bH1, bH2, bH3 and bH4 that are, for example, 1/4 of the low-range length (1/2 of the mid-range length).

[0018] In the encoder 5, as shown in FIG. 6, the block length is short in the high and middle ranges and long in the low range. That is, taking the low-range block length as bL, the high and middle ranges use block lengths bH1, bH2, bH3, bH4 and bM1, bM2, bM3, bM4 that are, for example, 1/4 of the low-range length.
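The four block-length patterns described for the encoders 2 to 5 can be summarized as a small lookup table. The sketch below records, for each encoder, how many blocks each band is divided into within one frame B, using the example ratios given in the text; the table and function names are illustrative only.

```python
from fractions import Fraction

# Number of FFT blocks per frame B in each band, using the example ratios
# given for FIG. 3 (encoder 2) through FIG. 6 (encoder 5).
SUBBLOCKS = {
    "encoder 2": {"low": 1, "mid": 1, "high": 1},
    "encoder 3": {"low": 1, "mid": 1, "high": 2},
    "encoder 4": {"low": 1, "mid": 2, "high": 4},
    "encoder 5": {"low": 1, "mid": 4, "high": 4},
}


def block_lengths(encoder: str) -> dict:
    """Block length of each band as a fraction of the frame length."""
    return {band: Fraction(1, count) for band, count in SUBBLOCKS[encoder].items()}


for name in SUBBLOCKS:
    lengths = block_lengths(name)
    print(name, {band: str(length) for band, length in lengths.items()})
```

In every pattern the low range keeps the full frame length, and only the middle and high ranges are subdivided.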

[0019] The block length of the high range (and middle range) is made shorter than that of the low range in FIGS. 4 to 6, and the low-range block length is kept long, for the following reasons. The frequency analysis ability (frequency resolution) of human hearing is generally not very high in the high range but is high in the low range. Because frequency resolution must therefore be secured in the low range, the FFT block length cannot in practice be made very short there, and so the low-range block length is kept long. In addition, the stationary interval is generally long for low-range signals and short for high-range signals, so shortening the block length (raising the time resolution) in the high range (and middle range) is effective. For these reasons, this embodiment shortens the block length in the high range (and middle range) and lengthens it in the low range.
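The trade-off between frequency resolution and time resolution follows directly from the block length: for an N-sample block at sampling rate fs, the FFT bin spacing is fs/N and the block lasts N/fs. The sketch below works this out under the assumption of a 44.1 kHz sampling rate (suggested by the 0 to 22.1 kHz range but not stated in the text) and ignores any sample-rate reduction after band splitting. `resolutions` is an illustrative helper.

```python
def resolutions(block_samples: int, sample_rate_hz: float) -> tuple:
    """Return (FFT bin spacing in Hz, block duration in milliseconds)."""
    if block_samples <= 0:
        raise ValueError("block_samples must be positive")
    bin_spacing_hz = sample_rate_hz / block_samples
    duration_ms = 1000.0 * block_samples / sample_rate_hz
    return bin_spacing_hz, duration_ms


ASSUMED_RATE_HZ = 44100.0  # assumption, not given in the text

for n in (1024, 512, 256):
    spacing, duration = resolutions(n, ASSUMED_RATE_HZ)
    print(f"N={n}: bin spacing {spacing:.1f} Hz, block duration {duration:.2f} ms")
```

Quartering the block length makes the bins four times coarser and the blocks four times shorter, which is acceptable in the high range where hearing resolves frequency less finely.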

[0020] This embodiment is thus configured to satisfy simultaneously the frequency-axis resolution and the time-axis resolution that hearing requires. In the low range (0 to 5.5 kHz) the number of samples processed is increased to raise the frequency resolution, and in the high range (11.0 kHz to 22.1 kHz) the time resolution is raised. In some cases the time resolution is also raised in the middle range (5.5 kHz to 11.0 kHz).

[0021] Shortening the FFT block length in the high range (and middle range) as described above is also effective in view of the characteristics (nature) of the input audio signal. That is, it is effective to change the FFT block length according to whether the input audio signal is, for example, a transient signal or a stationary signal. For a stationary signal, it is effective to make the block lengths of the bands equal, as shown in FIG. 3; for a transient signal, it is effective to shorten the FFT block lengths in the high and middle ranges, as shown in FIG. 6. Shortening the FFT block length for a transient signal allows many bits to be allocated, during encoding, to the block within the frame B that has a large peak level (the transient portion), while the number of bits for the other blocks is reduced. The encoder can thus follow temporal changes in the spectrum and give bits only to the blocks that truly need them in each band of the frame B. For a stationary signal, on the other hand, signals with similar spectra in the individual blocks within the frame B need not be encoded redundantly.
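To illustrate why shorter blocks help with a transient, the following sketch splits one frame into sub-blocks and measures the energy of each. The frame contents and the helper name are hypothetical, and energy is used only as a simple stand-in for the peak level mentioned in the text.

```python
def subblock_energies(frame, n_blocks):
    """Energy (sum of squares) of each of n_blocks equal parts of a frame."""
    if n_blocks <= 0 or len(frame) % n_blocks != 0:
        raise ValueError("frame length must be a positive multiple of n_blocks")
    size = len(frame) // n_blocks
    return [
        sum(x * x for x in frame[i * size:(i + 1) * size])
        for i in range(n_blocks)
    ]


# Hypothetical transient: silence followed by a sudden onset in the last quarter.
frame = [0.0] * 768 + [1.0] * 256

whole = subblock_energies(frame, 1)     # one long block sees only the total
quarters = subblock_energies(frame, 4)  # short blocks localize the onset
```

With one long block the onset is smeared over the whole frame, whereas with four short blocks all of the energy falls in the last block, so bits could be concentrated there and withheld from the silent blocks.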

[0022] The FFT block lengths of the bands are not limited to the examples of FIGS. 3 to 6. Various block-length patterns are conceivable, such as making the block length even shorter in the high range or also shortening the low-range block length.

[0023] In the encoders 2 to 5, the fast Fourier transform circuits 43, 44 and 45 of FIG. 2 perform FFT processing with a different block length for each band as described above, and the resulting FFT coefficient data are then sent to the encoding circuits 46, 47 and 48.

[0024] In each of the encoders 2 to 5 of this embodiment, the following configuration is used to perform the encoding by adaptive bit allocation in the encoding circuits 46, 47 and 48.

[0025] Each of the encoders 2 to 5 of this embodiment has a primary bit allocation number determining circuit 60, which determines the number of bits actually required to encode the FFT coefficient data within the frame B from the fast Fourier transform circuits 43, 44 and 45 (the number of bits necessary to achieve a specified sound quality), and a bit number correction circuit 61, which distributes additional bits or reduces bits so that the primary bit number determined by the primary bit allocation number determining circuit 60 matches the final bit number predetermined for the frame B. The encoding circuits 46, 47 and 48 therefore encode the FFT coefficient data with the number of bits obtained by correcting the primary bit number in the bit number correction circuit 61 (that is, the final bit number).

[0026] The primary bit number determined by the primary bit allocation number determining circuit 60 is determined, for example, in consideration of the so-called masking effect described below.

[0027] Masking relates to human auditory characteristics. Human hearing generally exhibits what is called a masking effect, which includes a temporal masking effect and a simultaneous masking effect. The simultaneous masking effect is one in which a quiet sound (or noise) occurring at the same time as a loud sound is masked by the loud sound and becomes inaudible. The temporal masking effect is one in which quiet sounds (noise) shortly before or after a loud sound are masked by the loud sound and become inaudible. Within temporal masking, masking that follows the loud sound in time is called forward masking, and masking that precedes it is called backward masking. Owing to the characteristics of human hearing, forward masking lasts a long time (for example, about 100 msec), whereas backward masking lasts only a short time (for example, about 5 msec). The level of the masking effect (masking amount) is about 20 dB for forward masking and about 30 dB for backward masking.

【００２８】したがって、このマスキング効果を上記フ
レームＢ内でのビット割当ての際に考慮すれば、より最
適なビット割当てが可能になる。すなわち、マスキング
される部分の信号に対してはビット数を少なくしても聴
感上何ら悪影響がないため、このマスキングされる部分
のビット数を減らすことができ、少ないビット数で効果
的な符号化が可能となる。なお、上記マスキング効果に
おけるマスキング量は、例えば上記臨界帯域毎のエネル
ギの総和を求め、この臨界帯域毎のエネルギに基づいて
求められる。また、ある臨界帯域の信号による他の臨界
帯域（或いは当該ある臨界帯域自身）の他の時間へのマ
スキング量を求めるようにすることも可能である。この
マスキング量に基づいて各帯域毎の許容可能なノイズレ
ベルが求められ、更にこの各帯域毎の許容可能なノイ
ズレベルに基づいて上記符号化の割当ビット数を決める
ことができる。Therefore, taking this masking effect into account when allocating bits within the frame B allows a better bit allocation. That is, reducing the number of bits for a masked portion of the signal has no audible adverse effect, so the bits for that portion can be reduced and efficient encoding becomes possible with a small number of bits. The masking amount is obtained, for example, by calculating the total energy of each critical band and deriving the amount from that per-band energy. It is also possible to obtain the amount by which a signal in one critical band masks another critical band (or that same critical band) at other times. The allowable noise level for each band is determined from this masking amount, and the number of bits allocated for the encoding can then be determined from the allowable noise level of each band.
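The chain described above (energy per critical band, then masking amount, then allowable noise level, then allocated bits) can be sketched as follows. This is a minimal illustration, not the method of the patent: the spreading model in `allowable_noise_db`, its `offset_db` and `slope_db` parameters, and the use of roughly 6.02 dB of signal-to-noise ratio per quantizer bit are all assumptions introduced here, and every function name is hypothetical.

```python
import math

DB_PER_BIT = 6.02  # assumed: approximate SNR gain per bit of a uniform quantizer


def band_energy_db(coeffs):
    """Total energy (sum of squares) of one critical band, in dB."""
    energy = sum(c * c for c in coeffs)
    return 10.0 * math.log10(energy) if energy > 0 else float("-inf")


def allowable_noise_db(energies_db, offset_db=15.0, slope_db=10.0):
    """Allowable noise level per band under an assumed spreading model.

    Each band j masks band i at a level of its own energy minus offset_db,
    reduced by slope_db for every band of distance between i and j.
    """
    n = len(energies_db)
    return [
        max(energies_db[j] - offset_db - slope_db * abs(i - j) for j in range(n))
        for i in range(n)
    ]


def primary_bits(energies_db, noise_db):
    """Bits per band needed to keep quantization noise under the allowable level."""
    bits = []
    for energy, noise in zip(energies_db, noise_db):
        if energy == float("-inf"):
            bits.append(0)  # an empty band needs no bits
            continue
        bits.append(max(0, math.ceil((energy - noise) / DB_PER_BIT)))
    return bits
```

With band energies of 60, 30, and 50 dB and the assumed defaults, the weak middle band lies below the noise level allowed by its neighbours and receives no bits, while the two strong bands receive 3 bits each.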

【００２９】上述のようにして１次ビット割当数決定回
路６０で決定された１次ビット数は、上記ビット数補正
回路６１に送られる。当該ビット数補正回路６１では、
上記１次ビット割当数決定回路６０で決定された１次ビ
ット数を、当該フレームＢで予め定められている上記最
終ビット数に合わせるためのビット配分又はビット削減
が行われる。The primary bit number determined by the primary bit allocation number determining circuit 60 as described above is sent to the bit number correction circuit 61. The bit number correction circuit 61 performs bit distribution or bit reduction to match the primary bit number determined by the primary bit allocation number determining circuit 60 to the final bit number predetermined for the frame B.

【００３０】ここで、上記１次ビット割当数決定回路６
０及び上記ビット数補正回路６１で行われる１次ビット
数の決定及びビット配分又はビット削減処理のフローチ
ャートを図７に示す。FIG. 7 shows a flowchart of the determination of the primary bit number and the bit distribution or bit reduction processing performed by the primary bit allocation number determining circuit 60 and the bit number correction circuit 61.

【００３１】すなわち、このフローチャートにおいて、
ステップＳ１では、上記１次ビット割当数決定回路６０
で決定された１次ビット割当数、すなわち上記マスキン
グ計算等に基づいて求められる上記符号化回路４６，４
７，４８での符号化の際に実際に必要な上記１次ビット
数（使用ビット数）を変数ｎｓｕｍ０に代入する。ステ
ップＳ２では、この変数ｎｓｕｍ０が上記ビット数補正
回路６１に送られ、当該ビット数補正回路６１でこの変
数ｎｓｕｍ０を、更に変数ｎｓｕｍに代入する。That is, in step S1 of this flowchart, the primary bit allocation number determined by the primary bit allocation number determining circuit 60, namely the primary bit number (number of bits used) actually required for encoding in the encoding circuits 46, 47, and 48 as obtained from the masking calculation and the like, is assigned to the variable nsum0. In step S2, this variable nsum0 is sent to the bit number correction circuit 61, which assigns it to the variable nsum.

【００３２】ステップＳ３では、この変数ｎｓｕｍが上
記フレームＢで予め定められている最終ビット数を示す
数ｎｌｉｍｉｔよりも小さい（少ない）か、或いは、こ
の数ｎｌｉｍｉｔ以上かの判断がなされる。変数ｎｓｕ
ｍが上記数ｎｌｉｍｉｔよりも小さいときはステップＳ
４に進み、変数ｎｓｕｍが上記数ｎｌｉｍｉｔ以上のと
きはステップＳ５に進む。In step S3, it is determined whether the variable nsum is smaller than the number nlimit, which indicates the final number of bits predetermined for the frame B, or is greater than or equal to nlimit. When nsum is smaller than nlimit, the process proceeds to step S4; when nsum is greater than or equal to nlimit, the process proceeds to step S5.

【００３３】ここで、上記変数ｎｓｕｍが上記最終ビッ
ト数を示す数ｎｌｉｍｉｔよりも少ない場合、ビット数
が余ることになる。このため、上記ステップＳ４では、
この余ったビット数（上記変数ｎｓｕｍと上記最終ビッ
ト数を示す数ｎｌｉｍｉｔとの差に応じたビット数）を
上記フレームＢ内で更に配分する。このビット配分の際
には、より音質が良くなるように各帯域或いはブロック
に対して配分する。その後、このビット配分がなされた
ビット数を、再びステップＳ３に戻す。If the variable nsum is less than the number nlimit indicating the final number of bits, bits are left over. Therefore, in step S4, these surplus bits (a number of bits corresponding to the difference between nsum and nlimit) are further distributed within the frame B. They are distributed to the bands or blocks in a way that improves the sound quality. The resulting number of bits is then returned to step S3.

【００３４】上記ステップＳ５では、上記変数ｎｓｕｍ
が上記最終ビット数を示す数ｎｌｉｍｉｔよりも大きい
（多い）か、或いは、該数ｎｌｉｍｉｔ以下かの判断が
なされる。上記変数ｎｓｕｍが上記最終ビット数を示す
数ｎｌｉｍｉｔ以下のときは、上記ステップＳ３との関
連からこの変数ｎｓｕｍが上記数ｎｌｉｍｉｔと等しく
なり、処理を終了する。また、変数ｎｓｕｍが上記最終
ビット数を示す数ｎｌｉｍｉｔよりも大きいときはステ
ップＳ６に進む。In step S5, it is determined whether the variable nsum is larger than the number nlimit indicating the final number of bits, or is less than or equal to nlimit. When nsum is less than or equal to nlimit, it follows from step S3 that nsum equals nlimit, and the process ends. When nsum is larger than nlimit, the process proceeds to step S6.

【００３５】ここで、上記変数ｎｓｕｍが上記最終ビッ
ト数を示す数ｎｌｉｍｉｔよりも多い場合、ビット数が
不足していることになる。このため、上記ステップＳ６
では、この不足ビットを上記変数ｎｓｕｍに応じたビッ
ト数から削減する処理を行う。このビット削減の際には
、音質劣化に影響の少ない帯域或いはブロックから削減
するようにする。その後、このビット削減がなされたビ
ット数を、再びステップＳ５に戻す。If the variable nsum is greater than the number nlimit indicating the final number of bits, there is a shortage of bits. Therefore, in step S6, the excess is removed from the number of bits represented by nsum. Bits are removed first from the bands or blocks where the reduction has the least effect on sound quality. The reduced number of bits is then returned to step S5.

【００３６】上記フローチャートによって、ビット数の
補正がなされ、この補正後のビット数で上記符号化回路
４６，４７，４８での符号化が行われる。According to the above flowchart, the number of bits is corrected, and the encoding circuits 46, 47, and 48 perform encoding using the corrected number of bits.
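The flow of steps S1 to S6 can be sketched as a pair of loops. This is a minimal sketch under stated assumptions: the text says only that surplus bits go where they improve sound quality and that bits are removed where the loss matters least, so the per-band `importance` weights and the two ranking rules below are hypothetical stand-ins for those criteria, and the function name is invented for illustration. Only `nsum0`, `nsum`, and `nlimit` come from the description.

```python
def correct_bit_counts(primary_bits, importance, nlimit):
    """Adjust per-band bit counts so that their total equals nlimit.

    primary_bits: non-negative bits per band from the primary allocation
    importance:   hypothetical positive perceptual weight per band (assumption)
    nlimit:       final number of bits fixed for the frame
    Returns (corrected bits per band, nsum0).
    """
    if not primary_bits or nlimit < 0 or len(importance) != len(primary_bits):
        raise ValueError("need matching non-empty inputs and nlimit >= 0")
    bits = list(primary_bits)
    bands = range(len(bits))

    nsum0 = sum(bits)  # S1: bits actually required by the primary allocation
    nsum = nsum0       # S2: working copy used by the correction stage

    while nsum < nlimit:  # S3: surplus bits remain
        # S4: give one bit to the band assumed to benefit most
        k = max(bands, key=lambda i: importance[i] / (bits[i] + 1))
        bits[k] += 1
        nsum += 1

    while nsum > nlimit:  # S5: more bits requested than the frame allows
        # S6: take one bit from the band assumed to suffer least
        k = min((i for i in bands if bits[i] > 0),
                key=lambda i: importance[i] / bits[i])
        bits[k] -= 1
        nsum -= 1

    return bits, nsum0  # at this point nsum == nlimit
```

The function returns `nsum0` alongside the corrected allocation because the uncorrected primary bit count is the quantity later reported to the selection circuit 6.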

【００３７】上述した各符号化回路４６，４７，４８の
符号化データが合成回路５０に送られる。また、上記１
次ビット割当数決定回路６０で決定された上記１次ビッ
ト数の情報（定められた音質を実現するのに実際に必要
なビット数を示すビット数情報）も、上記合成回路５０
に送られるようになっている。当該合成回路５０で各帯
域のデータが合成され、その合成データの内の上記符号
化データは出力端子５２から出力され、上記１次ビット
数情報は出力端子５３から出力されるようになっている
。The encoded data from each of the encoding circuits 46, 47, and 48 is sent to a synthesis circuit 50. The information on the primary bit number determined by the primary bit allocation number determining circuit 60 (bit number information indicating the number of bits actually required to achieve a predetermined sound quality) is also sent to the synthesis circuit 50. The synthesis circuit 50 combines the data of the bands; of the combined data, the encoded data is output from an output terminal 52 and the primary bit number information is output from an output terminal 53.

【００３８】図２の上記出力端子５２，５３からの出力
が、図１の各エンコーダ２〜５からそれぞれ出力されて
いる。各エンコーダ２〜５からの上記符号化データはそ
れぞれ図１の上記切換スイッチ７に送られ、上記１次ビ
ット数情報は上記選択回路６に送られる。Each of the encoders 2 to 5 in FIG. 1 outputs the signals from the output terminals 52 and 53 of FIG. 2. The encoded data from each of the encoders 2 to 5 is sent to the changeover switch 7 in FIG. 1, and the primary bit number information is sent to the selection circuit 6.

【００３９】ここで、上記選択回路６では、上述したよ
うに、各エンコーダ２〜５から送られてきた各１次ビッ
ト数情報に基づいて、該各エンコーダ２〜５の出力の内
から、符号化に必要なビット数（定められた音質を実現
するのに必要なビット数）が最も少なくなるエンコーダ
の出力のみを選択するための選択処理がなされ、この選
択信号が、上記切換スイッチ７に送られる。これにより
、当該切換スイッチ７では、供給されている各エンコー
ダ２〜５の出力の中から、上記選択信号に基づいた１つ
の符号化出力のみを出力する切り換え処理がなされる。したがって、本実施例装置の出力端子８からは、上記選
択された符号化出力のみが出力されるようになる。As described above, the selection circuit 6 uses the primary bit number information sent from each of the encoders 2 to 5 to select, from among the outputs of the encoders 2 to 5, only the output of the encoder that requires the fewest bits for encoding (the number of bits needed to achieve the specified sound quality), and sends the resulting selection signal to the changeover switch 7. The changeover switch 7 then switches so as to output only the one encoded output indicated by the selection signal from among the supplied outputs of the encoders 2 to 5. Consequently, only the selected encoded output is output from the output terminal 8 of the device of this embodiment.
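The roles of the selection circuit 6 and the changeover switch 7 reduce to picking the candidate with the smallest primary bit count. The sketch below assumes each encoder reports its result as a small record; the `EncoderOutput` type, its field names, and `select_encoder_output` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EncoderOutput:
    encoder_id: int    # which of the encoders 2 to 5 produced this output
    primary_bits: int  # bits needed to reach the specified sound quality
    encoded: bytes     # encoded data after bit number correction


def select_encoder_output(outputs: Sequence[EncoderOutput]) -> EncoderOutput:
    """Return the output of the encoder that needs the fewest primary bits."""
    if not outputs:
        raise ValueError("at least one encoder output is required")
    # min() plays the part of the selection circuit; returning a single
    # record plays the part of the changeover switch.
    return min(outputs, key=lambda o: o.primary_bits)
```

When two encoders report the same primary bit count, `min` keeps the one that appears first. The description does not say how ties are resolved, so this behaviour is an arbitrary choice.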

【００４０】本実施例装置の上記符号化出力を、上記１
次ビット割当情報に基づいて図示を省略する復号化装置
で復号化処理することにより、得られる音声の音質は最
適のものとなる。When the encoded output of the device of this embodiment is decoded by a decoding device (not shown) on the basis of the primary bit allocation information, the resulting audio has optimal sound quality.

【００４１】上述のようなことから、本実施例のディジ
タルデータの高能率符号化装置においては、複数のエン
コーダ２〜５の出力から最適な符号化がなされたエンコ
ーダの出力のみを選択して符号化データとして得るよう
にしているため、この符号化データを復号化して音声に
変換すれば、最良の音質が得られるようになる。また、
本実施例装置では、実際に符号化した後の出力を選んで
いるため、上記ＦＦＴ処理でのブロックのブロック長は
、最適なブロック長のものが選ばれていることになり、
かつ、このブロック長の選択も容易となっている。更に、本実施例装置は、いわゆるコンパクトディスク（
ＣＤ）等のパッケージメディアに記録するデータの符号
化装置に適用することができる。この場合、一般ユーザ
はデコーダ（プレーヤ）のみが必要で符号化装置は必要
ないので、この符号化装置の規模は問題とされず本実施
例装置は特に有効となる。As described above, the high-efficiency digital data encoding device of this embodiment selects, from the outputs of the plurality of encoders 2 to 5, only the output of the encoder that achieved the optimal encoding and uses it as the encoded data, so decoding this encoded data and converting it to sound yields the best sound quality. Moreover, because the device selects among outputs that have actually been encoded, the block length used in the FFT processing is the optimal one, and the block length is also easy to select. Furthermore, the device of this embodiment can be applied to an encoding device for data to be recorded on package media such as so-called compact discs (CDs). In this case, general users need only a decoder (player) and not an encoding device, so the scale of the encoding device is not an issue and the device of this embodiment is particularly effective.

【００４２】また、上述した実施例装置においては、各
エンコーダ２〜５にそれぞれＦＦＴ処理ブロック長の異
なる高速フーリエ変換回路４３〜４５を配した構成とし
ているが、本発明の他の例として例えばエンコーダを１
つのみとし、該エンコーダ内に、各帯域毎に上述したよ
うなそれぞれＦＦＴ処理ブロック長の異なるＦＦＴ処理
を行う高速フーリエ変換回路を配し、これら各帯域毎の
高速フーリエ変換回路の出力のうちから上述したような
符号化に実際に必要なビット数が最も少なくなる高速フ
ーリエ変換回路の出力のみを選択するような構成とする
ことも可能である。In the embodiment described above, the encoders 2 to 5 are each provided with fast Fourier transform circuits 43 to 45 having different FFT processing block lengths. As another example of the present invention, however, a single encoder may be used in which fast Fourier transform circuits performing FFT processing with the different block lengths described above are provided for each band, and in which, from among the outputs of the fast Fourier transform circuits for each band, only the output of the circuit that minimizes the number of bits actually required for encoding is selected.

【００４３】すなわち、この例の高能率符号化装置の場
合、図１のようにエンコーダが複数個配されるのではな
く、１つのエンコーダのみとされる。この例の場合のエ
ンコーダ内の構成を図２の構成と比較して説明すると、
各域帯域に対応する高速フーリエ変換回路は図２のよう
にそれぞれ１つ（各フーリエ変換回路４３，４４，４５
）ではなく、図３〜図６における各帯域部分での各ブロ
ック長の種類数と対応した複数の高速フーリエ変換回路
が配される。例えば、高域ではそれぞれブロック長の異
なる３つの高速フーリエ変換回路が配され、中域では各
ブロック長の異なる３つの高速フーリエ変換回路が、低
域では１つの高速フーリエ変換回路が配される。換言す
れば、高域では図３のブロック長ｂＨ　と図４のブロッ
ク長ｂＨ１，ｂＨ２と図５及び図６のブロック長ｂＨ１
，ｂＨ２，ｂＨ３，ｂＨ４とに対応する３種類のブロッ
ク長で高域のデータを各々ＦＦＴ処理する３つの高速フ
ーリエ変換回路が配され、中域では図３及び図４のブロ
ック長ｂＭ　と図５のブロック長ｂＭ１，ｂＭ２と図６
のブロック長ｂＭ１，ｂＭ２，ｂＭ３，ｂＭ４とに対応
する３種類のブロック長で該中域のデータを各々ＦＦＴ
処理する３つの高速フーリエ変換回路が配され、低域で
は図３〜図６のブロック長ｂＬ　に対応する１種類のブ
ロック長で該低域のデータをＦＦＴ処理する高速フーリ
エ変換回路が配される。このような各高速フーリエ変換
回路の出力を前記選択回路６に送り、これら各高速フー
リエ変換回路の出力の内、各帯域毎に１つの高速フーリ
エ変換回路の出力のみを選択して符号化を行うよう
にすれば、前述の図１と同様の処理が可能となる。この
例の場合も、前述同様の効果を得ることが可能であると
共に、構成の簡略化も可能となる。That is, the high-efficiency encoding device of this example has only one encoder rather than the plurality of encoders shown in FIG. 1. Compared with the configuration of FIG. 2, the encoder in this example does not have a single fast Fourier transform circuit per band (the fast Fourier transform circuits 43, 44, and 45); instead, each band has as many fast Fourier transform circuits as there are block length types for that band in FIGS. 3 to 6. For example, three fast Fourier transform circuits with different block lengths are provided for the high band, three with different block lengths for the middle band, and one for the low band. In other words, the high band has three fast Fourier transform circuits that each apply FFT processing to the high-band data using one of three block lengths, corresponding to the block length bH in FIG. 3, the block lengths bH1 and bH2 in FIG. 4, and the block lengths bH1, bH2, bH3, and bH4 in FIGS. 5 and 6. The middle band has three fast Fourier transform circuits that each apply FFT processing to the middle-band data using one of three block lengths, corresponding to the block length bM in FIGS. 3 and 4, the block lengths bM1 and bM2 in FIG. 5, and the block lengths bM1, bM2, bM3, and bM4 in FIG. 6. The low band has one fast Fourier transform circuit that applies FFT processing to the low-band data using the single block length bL of FIGS. 3 to 6. If the outputs of these fast Fourier transform circuits are sent to the selection circuit 6, and only one fast Fourier transform circuit output per band is selected and encoded, the same processing as in FIG. 1 becomes possible. This example provides the same effects as described above and also allows the configuration to be simplified.

【００４４】なお、上記直交変換は上述した高速フーリ
エ変換に限らず例えば離散的余弦変換等をも適用するこ
とができる。Note that the orthogonal transform is not limited to the fast Fourier transform described above; a discrete cosine transform, for example, may also be used.

【００４５】[0045]

【発明の効果】本発明のディジタルデータの高能率符号
化装置においては、複数の直交変換手段の出力に基づい
て各直交変換手段の出力のうちの１つの出力のみを選択
するようにしており、例えば、符号化の際の割当ビット
数が最も少なくなる直交変換手段の出力のみを選択する
ことで、入力ディジタルデータの特性及び人間の聴覚特
性（オーディオデータの場合）に応じた最適な符号化出
力を得ることができるようになり、したがって、この符
号化出力を復号化して音声に変換すれば、最良の音質が
得られるようになる。また、本発明の装置は、いわゆる
コンパクトディスク（ＣＤ）等のパッケージメディアに
記録するデータの符号化装置に適用する場合特に有効で
ある。[Effects of the Invention] The high-efficiency digital data encoding device of the present invention selects only one of the outputs of the plurality of orthogonal transform means, based on the outputs of those means. For example, by selecting only the output of the orthogonal transform means that requires the fewest allocated bits for encoding, an optimal encoded output can be obtained that suits the characteristics of the input digital data and, for audio data, human auditory characteristics. Decoding this encoded output and converting it to sound therefore yields the best sound quality. The device of the present invention is also particularly effective when applied to an encoding device for data to be recorded on package media such as so-called compact discs (CDs).

[Brief explanation of drawings]

【図１】本発明実施例のディジタルデータの高能率符号
化装置の概略構成を示すブロック回路図である。FIG. 1 is a block circuit diagram showing a schematic configuration of a highly efficient digital data encoding device according to an embodiment of the present invention.

【図２】エンコーダの具体例の概略構成を示すブロック
図である。FIG. 2 is a block diagram showing a schematic configuration of a specific example of an encoder.

【図３】ブロック長が各帯域で同じ場合のＦＦＴ処理ブ
ロックを示す図である。FIG. 3 is a diagram showing FFT processing blocks when the block length is the same in each band.

【図４】高域のブロック長がフレームの１／２の場合の
ＦＦＴ処理ブロックを示す図である。FIG. 4 is a diagram showing an FFT processing block when the high-frequency block length is 1/2 of a frame.

【図５】高域のブロック長がフレームの１／４で、中域
のブロック長がフレームの１／２の場合のＦＦＴ処理ブ
ロックを示す図である。FIG. 5 is a diagram showing FFT processing blocks when the block length of the high band is 1/4 of the frame and the block length of the middle band is 1/2 of the frame.

【図６】高域のブロック長がフレームの１／４で、中域
のブロック長がフレームの１／４の場合のＦＦＴ処理ブ
ロックを示す図である。FIG. 6 is a diagram showing an FFT processing block when the block length of the high band is 1/4 of the frame and the block length of the middle band is 1/4 of the frame.

【図７】１次ビット割当数決定及びビット数補正を説明
するためのフローチャートである。FIG. 7 is a flowchart for explaining primary bit allocation number determination and bit number correction.

[Explanation of symbols]

２〜５・・・エンコーダ
６・・・選択回路
７・・・切換スイッチ
４１，４２・・・ＱＭＦ
４３，４４，４５・・・高速フーリエ変換回路
４６，４７，４８・・・符号化回路

2 to 5 ... encoders
6 ... selection circuit
7 ... changeover switch
41, 42 ... QMF
43, 44, 45 ... fast Fourier transform circuits
46, 47, 48 ... encoding circuits

Claims

[Claims]

Claim 1: A high-efficiency encoding device for digital data that forms input digital data into blocks of a plurality of samples, performs an orthogonal transform on each block to obtain coefficient data, and encodes the coefficient data with an adaptively allocated number of bits, the device comprising a plurality of orthogonal transform means for orthogonally transforming the input digital data with mutually different block lengths, characterized in that only one of the outputs of the orthogonal transform means is selected based on the respective outputs of the plurality of orthogonal transform means.