JP2002268693A

JP2002268693A - Audio encoding device

Info

Publication number: JP2002268693A
Application number: JP2001069083A
Authority: JP
Inventors: Tetsuro Wada; 哲朗和田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2001-03-12
Filing date: 2001-03-12
Publication date: 2002-09-20

Abstract

PROBLEM TO BE SOLVED: To improve the compression efficiency of the auxiliary information used for encoding. SOLUTION: A processing section determination unit 5 sets the plurality of processing sections on the frequency axis used when encoding transform coefficients so that there are fewer of them than the prescribed minimum unit sections. An optimal encoding processing unit 3 encodes the transform coefficients calculated by an orthogonal transform unit 1, for each processing section set by the determination unit 5, according to an index calculated by an auditory analysis unit 2 for adaptively controlling the amount of quantization noise generated, and outputs the encoded transform coefficients together with the accompanying auxiliary information used in the encoding.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an audio encoding device that encodes audio signals efficiently, and in particular to an audio encoding device that performs adaptive processing for each of a plurality of frequency sections.

[0002]

2. Description of the Related Art

Audio coding includes subband coding, which divides an audio signal into a plurality of frequency bands on the time axis and encodes them, and transform coding, which orthogonally transforms an audio signal onto the frequency axis, divides it into a plurality of band sections, and encodes them. There is also high-efficiency coding that combines the two.

[0003] FIG. 18 is a block diagram showing the configuration of a conventional audio encoding device that performs transform coding as specified in the MPEG-2 AAC standard (ISO/IEC 13818-7). In the figure, 101 is an orthogonal transform unit that orthogonally transforms the input audio signal according to the transform block size output from the auditory analysis unit 102 and calculates transform coefficients; 102 is an auditory analysis unit that analyzes the input audio signal based on human auditory characteristics and calculates an index, such as a masking threshold or an allowable noise level, for adaptively controlling the amount of quantization noise generated.

[0004] Also in FIG. 18, 103 is an optimal encoding processing unit that, based on the index calculated by the auditory analysis unit 102 for adaptively controlling the amount of quantization noise generated, encodes the input transform coefficients for each of a plurality of band sections, which are prescribed minimum unit sections held in advance, and outputs the encoded transform coefficients together with the accompanying auxiliary information used in the encoding; 104 is a multiplexing unit that multiplexes the encoded transform coefficients and the accompanying auxiliary information; 201 is an input terminal for the audio signal; and 202 is an output terminal that outputs the multiplexed encoded bitstream.

[0005] Next, the operation will be described. The audio signal input from input terminal 201 is framed, for example with 1024 samples as one time frame, and is orthogonally transformed frame by frame by the orthogonal transform unit 101. Orthogonal transforms used in the orthogonal transform unit 101 include the modified discrete cosine transform (MDCT), in which the transform is performed so that 50% of the input samples overlap between successive transform blocks. Although the time frame is taken as 1024 samples here, a 128-sample time frame may be used to raise the time resolution, and in some cases the two are used as long and short transform blocks that can be switched adaptively.

[0006] The audio signal is also input to the auditory analysis unit 102, which analyzes it using a model of the signal's characteristics and of human hearing. One property of hearing is the masking effect, a phenomenon in which one sound makes other sounds inaudible. Recent audio coding exploits this property with adaptive control that keeps the quantization noise generated during encoding from being perceived by the human ear. The auditory analysis unit 102 calculates, based on auditory characteristics, an index for adaptively controlling the amount of quantization noise generated. This index is called a masking threshold, an allowable noise level, or the like.

[0007] The transform coefficients obtained by the MDCT in the orthogonal transform unit 101 are input to the optimal encoding processing unit 103. The optimal encoding processing unit 103 divides the input transform coefficients into a plurality of band sections that it holds in advance and, based on the index calculated by the auditory analysis unit 102 for adaptively controlling the amount of quantization noise generated, performs adaptive encoding for each band section while taking the amount of quantization noise generated into account. These band sections are the prescribed minimum unit sections held in advance by the optimal encoding processing unit 103. For example, they are a plurality of band sections modeled on the critical bands, that is, narrow in bandwidth at low frequencies and wide at high frequencies.

[0008] In a given band section, a scaling coefficient for normalizing the transform coefficients belonging to that section is determined, and the coefficients are normalized by it. They are then quantized with the allocated bits. The quantized transform coefficients are encoded by entropy coding such as Huffman coding. The information used during encoding, such as the scaling coefficients, the bit allocation for quantization, and the code table selected for Huffman coding, is multiplexed by the multiplexing unit 104 for each band section as auxiliary information, together with the main information (the transform coefficient information), and is output from output terminal 202 as an encoded bitstream. The output data is stored on a recording medium or sent over a transmission path.

[0009] The multiplexing unit 104 multiplexes all of the individual pieces of auxiliary information obtained as a result of the adaptive processing performed per band section by the optimal encoding processing unit 103. For example, the MPEG-2 AAC standard (ISO/IEC 13818-7) sets 49 band sections for the long transform block and specifies, for each type of auxiliary information, how the per-section auxiliary information is multiplexed.

[0010] As an example, the method of multiplexing the Huffman code table information is described here. FIG. 19 shows an example of the Huffman code tables selected for each band section, as output from the optimal encoding processing unit 103. In FIG. 19, (a) is the index of the band section sb, and (b) is the index cb[sb] of the Huffman code table selected for that band section. Twelve Huffman code tables are provided, with indices 0 through 11.

[0011] The multiplexing unit 104 multiplexes this Huffman code table information as follows. First, for the Huffman code table index cb[sb], it finds cases where the same index is selected in consecutive band sections and counts the length of each run. In FIG. 19, len in (c) is this run length. Then, starting from band section sb0, it multiplexes the information in the order 1) the Huffman code table index cb[sb], 2) its run length len. For band sections that continue with the same code table, multiplexing is omitted; information is multiplexed again starting from the band section where a different code table is selected.

[0012] FIG. 20 shows the order in which the multiplexing unit 104 multiplexes the Huffman code table information. That is, for the case of FIG. 19, the information is multiplexed in the order shown in FIG. 20. When the same Huffman code table index continues, the entries are thus formally merged into a single piece of information. In the case of FIG. 19, counting cb[sb] and len as one pair, the number of pairs finally multiplexed is 41. Compared with the case where no two consecutive sections among the original 49 band sections select the same Huffman code table, this is a reduction of eight pairs' worth of information.
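The run-length grouping described in paragraphs [0011] and [0012] can be sketched as follows. This is a minimal illustration, not the standard's bitstream syntax: the function name `run_length_pairs` and the sample indices are assumptions made for the example, since the contents of FIG. 19 are not reproduced in this text.

```python
from itertools import groupby


def run_length_pairs(cb):
    """Collapse runs of equal code table indices into (cb, len) pairs.

    cb is a list holding the Huffman code table index chosen for each
    band section, in order of increasing band section index.
    """
    return [(index, len(list(run))) for index, run in groupby(cb)]


# Hypothetical per-section table indices (not the values of FIG. 19).
cb = [1, 1, 3, 3, 3, 0, 5, 5]
pairs = run_length_pairs(cb)
# Eight band sections collapse into four (cb, len) pairs.
```

The fewer distinct runs there are, the fewer pairs need to be multiplexed, which is why identical table choices in neighboring sections reduce the auxiliary information.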

[0013] As the next example, the method of multiplexing the scaling coefficient information is described. FIG. 21 shows an example of the scaling coefficients selected for each band section, as output from the optimal encoding processing unit 103. In FIG. 21, (a) is the index of the band section sb, and (b) is the scaling coefficient sf[sb] selected for that band section. (c) is the difference diff[sb] between the scaling coefficient of band section sb and that of the immediately preceding band section. The difference diff[sb] is obtained by equation (1):

$$\mathit{diff}[sb] = \mathit{sf}[sb] - \mathit{sf}[sb-1] \tag{1}$$

where diff[0] = 0.
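Equation (1) can be sketched in a few lines. The function names and sample values below are illustrative assumptions. Because diff[0] is fixed to 0, rebuilding the coefficients requires the first scaling coefficient from elsewhere; how it is conveyed is not described in this text, so the sketch simply passes it in as a parameter.

```python
def scale_factor_diffs(sf):
    """Return diff[sb] = sf[sb] - sf[sb-1], with diff[0] = 0 as in equation (1)."""
    if not sf:
        return []
    return [0] + [sf[sb] - sf[sb - 1] for sb in range(1, len(sf))]


def rebuild_scale_factors(first_sf, diffs):
    """Invert scale_factor_diffs, given the first scaling coefficient."""
    sf = [first_sf]
    for d in diffs[1:]:
        sf.append(sf[-1] + d)
    return sf


# Hypothetical scaling coefficients for five band sections.
sf = [60, 62, 62, 59, 59]
diffs = scale_factor_diffs(sf)
```

Neighboring sections that share a scaling coefficient produce a difference of 0, the value for which a table of the kind described for FIG. 22 uses its shortest code.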

[0014] As the information on the scaling coefficients, the multiplexing unit 104 Huffman-codes this difference diff[sb] and multiplexes it for each band section. FIG. 22 shows an example of the Huffman code table used to Huffman-code diff[sb].

[0015] In FIG. 22, (a) is the difference diff and (b) is the code length, length, of the Huffman code corresponding to that difference. The Huffman code table for scaling coefficients assigns short codes to differences with small absolute values and long codes to differences with large absolute values. Consequently, when a different scaling coefficient is selected for each band section, and moreover the absolute value of diff[sb] is large in many band sections, the number of bits needed for the scaling coefficient information increases.

[0016]

SUMMARY OF THE INVENTION

Problems to Be Solved by the Invention

Because the conventional audio encoding device is configured as described above, the optimal encoding processing unit 103 always performs adaptive processing for each of the smallest band sections, and auxiliary information such as Huffman code tables and scaling coefficients often differs from section to section. As a result, even though the multiplexing unit 104 has techniques for reducing the bits needed to multiplex the auxiliary information, they yield little benefit, and the efficiency of information compression suffers. In addition, particularly when encoding at low bit rates, the auxiliary information occupies a relatively larger share of the bits than the main information, which degrades quality.

[0017] This invention was made to solve the problems described above, and its object is to obtain an audio encoding device with improved efficiency in compressing the auxiliary information.

[0018]

Means for Solving the Problems

An audio encoding device according to this invention comprises: an auditory analysis unit that analyzes the input audio signal based on human auditory characteristics and calculates an index, such as a masking threshold or an allowable noise level, for adaptively controlling the amount of quantization noise generated; an orthogonal transform unit that orthogonally transforms the input audio signal according to the transform block size output from the auditory analysis unit and calculates transform coefficients; a processing section determination unit that sets the plurality of processing sections on the frequency axis used when encoding the transform coefficients so that there are fewer of them than the predetermined minimum unit sections; an optimal encoding processing unit that, based on the index calculated by the auditory analysis unit for adaptively controlling the amount of quantization noise generated, encodes the transform coefficients calculated by the orthogonal transform unit for each of the processing sections set by the processing section determination unit, and outputs the encoded transform coefficients and the accompanying auxiliary information used in the encoding; and multiplexing means that multiplexes the transform coefficients and auxiliary information output by the optimal encoding processing unit.

[0019] In the audio encoding device according to this invention, the processing section determination unit sets the processing sections by merging the prescribed n minimum unit sections k at a time (k < n) into single processing sections.

[0020] In the audio encoding device according to this invention, the processing section determination unit sets the processing sections so that the number of transform coefficients belonging to each processing section is uniform.

[0021] In the audio encoding device according to this invention, the auditory analysis unit calculates the index for adaptively controlling the amount of quantization noise generated for each of the plurality of processing sections set by the processing section determination unit.

[0022] In the audio encoding device according to this invention, the processing section determination unit calculates the power of the transform coefficients in each minimum unit section from the transform coefficients calculated by the orthogonal transform unit, and sets the processing sections by merging into the same processing section those minimum unit sections whose calculated powers differ by no more than a predetermined threshold.
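One way to read the power-based merging of paragraph [0022] is sketched below. Several details are assumptions, because this text does not fix them: power is taken as the sum of squared coefficients, each section is compared with its immediate lower-frequency neighbor, and only adjacent sections are merged. The function names are hypothetical.

```python
def band_power(coeffs):
    """Power of one minimum unit section, taken here as the sum of squares."""
    return sum(c * c for c in coeffs)


def group_by_power(band_powers, threshold):
    """Merge adjacent sections whose powers differ by at most threshold.

    Returns a list of processing sections, each a list of the minimum
    unit section indices it contains.
    """
    if not band_powers:
        return []
    groups = [[0]]
    for sb in range(1, len(band_powers)):
        if abs(band_powers[sb] - band_powers[sb - 1]) <= threshold:
            groups[-1].append(sb)   # similar power: same processing section
        else:
            groups.append([sb])     # power jump: start a new processing section
    return groups


# Hypothetical per-section powers.
powers = [10.0, 10.5, 14.0, 14.2, 14.1, 3.0]
sections = group_by_power(powers, threshold=1.0)
```

The same grouping function would apply to the variants that use the per-section maximum power or the spectral power from the auditory analysis, with only the input values changing.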

[0023] In the audio encoding device according to this invention, the processing section determination unit detects the maximum transform coefficient power in each minimum unit section from the transform coefficients calculated by the orthogonal transform unit, and sets the processing sections by merging into the same processing section those minimum unit sections whose detected maxima differ by no more than a predetermined threshold.

[0024] In the audio encoding device according to this invention, the processing section determination unit calculates the spectral power in each minimum unit section from the spectrum obtained when the auditory analysis unit analyzes the audio signal, and sets the processing sections by merging into the same processing section those minimum unit sections whose calculated spectral powers differ by no more than a predetermined threshold.

[0025] In the audio encoding device according to this invention, the processing section determination unit sets the processing sections according to an externally supplied encoding bit rate, so that the lower the encoding bit rate, the fewer the sections, and the higher the encoding bit rate, the more the sections.

[0026] The audio encoding device according to this invention further comprises an information amount determination unit that determines the number of bits required for the transform coefficients and for the auxiliary information output by the optimal encoding processing unit and, when the ratio of the bits required for the auxiliary information to the total number of bits exceeds a predetermined threshold, instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections in order to reduce the bits required for the auxiliary information.
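The decision made by the information amount determination unit in paragraph [0026] amounts to a ratio test. The following sketch assumes that the total is the sum of the main (transform coefficient) bits and the auxiliary bits; the function name and the threshold value are illustrative, not taken from the source.

```python
def needs_fewer_sections(main_bits, side_bits, threshold):
    """Return True when auxiliary information takes too large a share of the bits.

    main_bits: bits needed for the encoded transform coefficients
    side_bits: bits needed for the auxiliary information
    threshold: maximum tolerated share of side_bits in the total (0..1)
    """
    total = main_bits + side_bits
    if total == 0:
        return False
    return side_bits / total > threshold


# Hypothetical frame: 300 of 1000 bits are auxiliary information.
regroup = needs_fewer_sections(main_bits=700, side_bits=300, threshold=0.25)
```

When the function returns True, the processing section determination unit would be asked to produce a coarser set of processing sections and the frame would be encoded again.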

[0027] In the audio encoding device according to this invention, the information amount determination unit instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections when the ratio of the bits required for the auxiliary information to the total number of bits exceeds a predetermined threshold defined for each encoding bit rate.

[0028]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of this invention is described below.

Embodiment 1.

FIG. 1 is a block diagram showing the configuration of an audio encoding device according to Embodiment 1 of this invention. In the figure, 1 is an orthogonal transform unit that orthogonally transforms the input audio signal according to the transform block size output from the auditory analysis unit 2 and calculates transform coefficients; 2 is an auditory analysis unit that analyzes the input audio signal based on human auditory characteristics and calculates an index, such as a masking threshold or an allowable noise level, for adaptively controlling the amount of quantization noise generated.

[0029] Also in FIG. 1, 3 is an optimal encoding processing unit that, based on the index calculated by the auditory analysis unit 2 for adaptively controlling the amount of quantization noise generated, encodes the transform coefficients calculated by the orthogonal transform unit 1 for each processing section set by the processing section determination unit 5, and outputs the encoded transform coefficients and the accompanying auxiliary information used in the encoding; 4 is a multiplexing unit that multiplexes the transform coefficients and auxiliary information output by the optimal encoding processing unit 3; 5 is a processing section determination unit that sets the plurality of processing sections on the frequency axis used when the optimal encoding processing unit 3 encodes the transform coefficients so that there are fewer of them than the band sections, which are the prescribed minimum unit sections on the frequency axis held in advance by the optimal encoding processing unit 3; 9 is an input terminal for the audio signal; and 10 is an output terminal that outputs the multiplexed encoded bitstream.

[0030] Next, the operation will be described. The audio signal input from input terminal 9 is orthogonally transformed by the orthogonal transform unit 1 with, for example, 1024 samples as one time frame. Here, when the long transform block is selected and a 2048-point MDCT is performed, 1024 transform coefficients are output.
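The relation between frame length, block length and coefficient count can be illustrated with a direct MDCT. This is a sketch under stated assumptions: it uses the textbook MDCT definition with no analysis window (a practical encoder applies one), it is O(N^2) rather than a fast algorithm, and the function names are hypothetical. A block of 2N samples yields N coefficients, and successive blocks advance by N samples so that they overlap by 50%.

```python
import math


def overlapped_blocks(samples, frame):
    """Yield blocks of 2*frame samples that advance by frame samples (50% overlap)."""
    for start in range(0, len(samples) - 2 * frame + 1, frame):
        yield samples[start:start + 2 * frame]


def mdct(block):
    """Direct, unwindowed MDCT: 2N input samples -> N transform coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


# Small stand-in sizes: frame = 8 here plays the role of 1024 in the text.
signal = [float(i) for i in range(32)]
blocks = list(overlapped_blocks(signal, frame=8))
coeffs = mdct(blocks[0])
```

With `frame=1024` the same code produces 2048-sample blocks and 1024 coefficients per block, matching the long transform block described above.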

[0031] The auditory analysis unit 2 analyzes the audio signal in units of time frames. It calculates indices for adaptively controlling the amount of quantization noise generated, such as the power of the audio signal, a spectrum analysis by FFT (fast Fourier transform), or masking characteristics. The FFT in the auditory analysis unit 2 is a 2048-point FFT, matching the resolution of the MDCT, and the analysis centers on frequency-axis characteristics based on the resulting 1024 spectral values. This is so that the analysis matches the adaptive processing in the subsequent optimal encoding processing unit 3, which is performed in sections on the frequency axis.

[0032] The optimal encoding processing unit 3 initially performs adaptive processing for each of the predetermined band sections that it holds in advance. FIG. 2 is an example of the band sections, the prescribed minimum unit sections held in advance by the optimal encoding processing unit 3; it shows a section definition table for dividing the 1024 transform coefficients output from the orthogonal transform unit 1 into a plurality of bands. In FIG. 2, (a) is the index of the band section sb, and (b) is the number of transform coefficients contained in that band section. In FIG. 2 there are 49 band sections in total; sections at lower frequencies contain fewer transform coefficients, and the count grows toward higher frequencies. This is because the sections model the critical bands of human hearing.

[0033] FIG. 3 is a block diagram showing the configuration of the optimal encoding processing unit 3. The optimal encoding processing unit 3 consists of a normalization unit 301 that scales the transform coefficients, a quantization unit 302 that quantizes the normalized transform coefficients, a Huffman coding unit 303 that Huffman-codes the quantized transform coefficients, and a rate/distortion control unit 304 that exercises overall control over the normalization unit 301, the quantization unit 302, and the Huffman coding unit 303.

[0034] The normalization unit 301 examines the magnitudes of the transform coefficients belonging to a band section, determines the scaling coefficient sf[sb] for normalization accordingly, and then normalizes each transform coefficient in the band section with that scaling coefficient. The quantization unit 302 quantizes the transform coefficients normalized by the normalization unit 301 with the number of bits allocated to that band section. The normalization unit 301 and the quantization unit 302 may operate separately, or simultaneously as expressed by equation (2):

$$\mathit{xq} = \operatorname{int}\!\left(\left(\operatorname{abs}(\mathit{coeff}) \cdot 2^{(1/4)\,\alpha}\right)^{3/4} + \beta\right) \tag{2}$$

[0035] In equation (2), xq is the quantized value of the transform coefficient, coeff is the transform coefficient, α is a parameter combining the scaling coefficient and the bit allocation, and β is a correction value for quantization; α is an integer. int() is a function that truncates the fractional part to give an integer, and abs() is a function that takes the absolute value. The fractional part discarded by int() in equation (2) is the quantization error, and it is what degrades encoding quality.
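Equation (2) translates directly into code. The sketch below follows the formula as written in this document; the function name and the sample value of β are assumptions for illustration. Because the formula takes the absolute value, the sign of the coefficient would have to be carried separately, which this text does not describe.

```python
def quantize(coeff, alpha, beta):
    """Quantize one transform coefficient per equation (2).

    coeff: transform coefficient (sign is discarded by abs)
    alpha: integer parameter combining scaling coefficient and bit allocation
    beta:  correction value added before truncation
    """
    return int((abs(coeff) * 2 ** (alpha / 4)) ** 0.75 + beta)


# beta = 0.4 is an arbitrary illustrative value.
xq_small_alpha = quantize(16.0, alpha=0, beta=0.4)
xq_large_alpha = quantize(16.0, alpha=4, beta=0.4)
```

Raising α by 4 doubles the scaled magnitude before the 3/4 power is applied, so larger α gives larger quantized values and a finer effective step, at the cost of more bits for the Huffman codes.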

[0036] Note that α is normally expressed together with a term γ related to bit allocation, as in equation (3); below, for simplicity, it is treated as if equation (4) held.

$$\alpha = \mathit{sf}[sb] - \gamma \quad (\gamma\ \text{an integer}) \tag{3}$$

$$\alpha = \mathit{sf}[sb] \tag{4}$$

[0037] The quantized transform coefficients are grouped several at a time and replaced with Huffman codes by the Huffman coding unit 303. A Huffman code is a code whose code length is determined according to the probability of occurrence. The Huffman coding unit 303 is provided with a plurality of usable Huffman code tables. It tries the replacement with every Huffman code table, selects as the optimal Huffman code table cb[sb] the one for which the resulting Huffman codes need the fewest bits, and takes the Huffman codes produced with that table as the final Huffman codes.
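The exhaustive table selection of paragraph [0037] can be sketched by modeling each Huffman code table as a mapping from symbol to code length. The tables, symbols and function name below are hypothetical; real tables code groups of coefficients, and the rule of skipping a table that lacks a symbol is an assumption of this sketch.

```python
def pick_code_table(symbols, tables):
    """Try every table and return (index, bits) of the cheapest usable one.

    symbols: quantized values (or grouped symbols) of one section
    tables:  list of dicts mapping symbol -> Huffman code length in bits
    """
    best_index, best_bits = None, None
    for index, lengths in enumerate(tables):
        if any(s not in lengths for s in symbols):
            continue  # this table cannot represent the section
        bits = sum(lengths[s] for s in symbols)
        if best_bits is None or bits < best_bits:
            best_index, best_bits = index, bits
    return best_index, best_bits


# Two hypothetical tables: table 0 favors symbol 0, table 1 is flat.
tables = [{0: 1, 1: 3, 2: 3}, {0: 2, 1: 2, 2: 2}]
mostly_zero = pick_code_table([0, 0, 0, 1, 2], tables)
mostly_large = pick_code_table([2, 2, 1, 2], tables)
```

Sections with different statistics end up with different tables, which is exactly the situation in which the (cb, len) run-length coding of the table indices saves little.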

[0038] For the series of processes performed by the normalization unit 301, the quantization unit 302, and the Huffman coding unit 303, the rate/distortion control unit 304 adjusts the scaling coefficient of band sections with large quantization error so as to reduce the quantization error generated there. In selecting such band sections, the quantization error amounts are not always simply compared; the selection may instead be based on the difference relative to the masking threshold. Adjusting the scaling coefficient in this way makes the quantized values larger and therefore inevitably increases the bits needed for the Huffman codes, so the control trades off quantization error against bit count so that coding efficiency is ultimately improved. This control is not applied to particular band sections only. It is applied across all band sections, adjusting the overall quantization error of all band sections and keeping the bits needed to Huffman-code the transform coefficients of all bands within the specified encoding bit rate.

【００３９】図１の処理区分決定部５は、最適符号化処
理部３が持つ図２に示した規定の最小単位区分である帯
域区分より区分数が少ない処理区分テーブルを持つもの
とする。この処理区分テーブルとして、図２の４９個の
帯域区分をｋ個ずつに再区分化した処理区分テーブルを
考える。ここで、ｋ＝２，３，４，・・・，２５であ
り、帯域区分をｎ個とするとｋ＜ｎである。例えば、ｋ
＝２の場合の処理区分を図４に示す。図４は帯域区分を
２個ずつに再区分化した処理区分テーブルを示す図であ
る。図４における（ａ）は、図２における帯域区分ｓｂ
を２個ずつにまとめた新たな処理区分ｎｂである。図４
における（ｂ）は該処理区分ｎｂに属する帯域区分ｓｂ
を示しており、各々の処理区分ｎｂにおいて２個の帯域
区分を含んでいる。また、図４における（ｃ）は処理区
分ｎｂに属する変換係数の総数を示している。処理区分
決定部５は、図４に示すこの処理区分ｎｂの処理区分テ
ーブルを最適符号化処理部３に対して与える。The processing section determination unit 5 in FIG. 1 holds processing section tables that have fewer sections than the band sections shown in FIG. 2, which are the prescribed minimum unit sections held by the optimal encoding processing unit 3. As such tables, consider those obtained by regrouping the 49 band sections of FIG. 2 k at a time. Here k = 2, 3, 4, ..., 25, and k < n, where n is the number of band sections. FIG. 4 shows the processing sections for k = 2, that is, a processing section table in which the band sections are regrouped two at a time. In FIG. 4, (a) is the new processing section nb formed by combining the band sections sb of FIG. 2 two at a time, (b) lists the band sections sb belonging to each processing section nb, each of which contains two band sections, and (c) is the total number of transform coefficients belonging to each processing section nb. The processing section determination unit 5 gives this processing section table for nb, shown in FIG. 4, to the optimal encoding processing unit 3.
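
A minimal sketch of how a processing section table like the one in FIG. 4 could be built. The function names and the band widths used in the usage note are assumptions; the actual widths are those of FIG. 2, which are not reproduced here. How a leftover band section is handled when the number of band sections is not a multiple of k is not stated in the text; the sketch places it in a final, shorter processing section.

```python
from typing import List, Sequence


def build_processing_sections(num_bands: int, k: int) -> List[List[int]]:
    """Group band section indices 0..num_bands-1 into runs of k consecutive indices."""
    if not 2 <= k < num_bands:
        raise ValueError("k must satisfy 2 <= k < num_bands")
    return [list(range(start, min(start + k, num_bands)))
            for start in range(0, num_bands, k)]


def coefficients_per_section(sections: Sequence[Sequence[int]],
                             band_widths: Sequence[int]) -> List[int]:
    """Column (c) of the table: total transform coefficients in each processing section."""
    return [sum(band_widths[sb] for sb in section) for section in sections]
```

For example, `build_processing_sections(49, 2)` yields 25 processing sections, the first being `[0, 1]`. With a hypothetical width of 4 coefficients per band section, `coefficients_per_section` reports 8 coefficients for each full processing section.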

【００４０】最適符号化処理部３は、処理区分決定部５
から与えられた処理区分ｎｂに基づき、それぞれの帯域
区分におけるスケーリング係数ｓｆ［ｓｂ］及びハフマ
ン符号テーブルｃｂ［ｓｂ］の最適解を求める。このと
き、処理区分ｎｂに属する２つの帯域区分において、ス
ケーリング係数ｓｆ［ｓｂ］の値は同一であるように制
御し、処理区分ｎｂに属する２つの帯域区分において、
ハフマン符号テーブルｃｂ［ｓｂ］の値は同一であるよ
うに制御する。すなわち、１つの処理区分、つまり２つ
の連続する帯域区分に属する変換係数の大小を調べ、そ
の値に応じて正規化及び量子化を行うための係数ｓｆ
［ｓｂ］、ｓｆ［ｓｂ＋１］を決定し、帯域区分内の変
換係数を正規化及び量子化する。このとき、ｓｆ［ｓ
ｂ］＝ｓｆ［ｓｂ＋１］である。また、得られた量子化
後の変換係数に対して、１つの処理区分、つまり２つの
連続する帯域区分毎に、ハフマン符号化に必要なビット
量が最も少なくなるようなハフマン符号テーブルｃｂ
［ｓｂ］、ｃｂ［ｓｂ＋１］を選択する。ここでｃｂ
［ｓｂ］＝ｃｂ［ｓｂ＋１］である。Based on the processing sections nb given by the processing section determination unit 5, the optimal encoding processing unit 3 finds the optimal scaling coefficient sf[sb] and Huffman code table cb[sb] for each band section. In doing so, it constrains the two band sections belonging to a processing section nb to have the same scaling coefficient sf[sb] and the same Huffman code table cb[sb]. That is, it examines the magnitudes of the transform coefficients belonging to one processing section, in other words two consecutive band sections, determines from them the coefficients sf[sb] and sf[sb+1] used for normalization and quantization, and normalizes and quantizes the transform coefficients in those band sections. Here sf[sb] = sf[sb+1]. Then, for the quantized transform coefficients, it selects for each processing section, in other words each pair of consecutive band sections, the Huffman code tables cb[sb] and cb[sb+1] that minimize the bits needed for Huffman encoding. Here cb[sb] = cb[sb+1].
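
The constraint that all band sections in a processing section share one value can be pictured as deciding one value per processing section and copying it to every member band section. The sketch below assumes that reading; `expand_to_bands` is a hypothetical helper, and how the per-section value itself is chosen is outside its scope.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def expand_to_bands(sections: Sequence[Sequence[int]],
                    section_values: Sequence[T]) -> List[T]:
    """Copy each per-processing-section value to all band sections it contains."""
    if len(sections) != len(section_values):
        raise ValueError("one value is required per processing section")
    num_bands = sum(len(section) for section in sections)
    per_band: List[T] = [section_values[0]] * num_bands
    for section, value in zip(sections, section_values):
        for sb in section:
            per_band[sb] = value
    return per_band
```

The same helper serves for both sf[sb] and cb[sb], since both are held constant within a processing section.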

【００４１】次に、多重化部４の補助情報を多重化する
動作について詳述する。まず、ハフマン符号テーブル情
報について考える。図５はｋ＝２の場合の処理区分ｎｂ
を適用して最適符号化処理部３によって得られたハフマ
ン符号テーブルの結果を示す図であり、図５における
（ａ）は帯域区分のインデックス、（ｂ）は帯域区分別
のハフマン符号テーブルを示している。多重化部４は、
ハフマン符号テーブルの情報を以下の手順で多重する。Next, the operation by which the multiplexing unit 4 multiplexes the auxiliary information is described in detail. First, consider the Huffman code table information. FIG. 5 shows the Huffman code tables obtained by the optimal encoding processing unit 3 when the processing sections nb for k = 2 are applied; (a) in FIG. 5 is the band section index and (b) is the Huffman code table for each band section. The multiplexing unit 4 multiplexes the Huffman code table information by the following procedure.

【００４２】まず、図５における（ｂ）のｃｂ［ｓｂ］
について、連続する帯域区分ｓｂにおいて同じハフマン
符号テーブルが選択されているケースを見つけ、その連
続数ｌｅｎを見つける。図５における（ｂ）の場合、そ
の連続数ｌｅｎは図５における（ｃ）の通りとなる。次
にインデックスの小さな帯域区分ｓｂ０から情報を多重
し始める。ｓｂ０からｓｂ７までの連続する８個の帯域
区分において同じハフマン符号テーブル「１１」が選択
されており、この場合、ｃｂ［０］＝１１及びｌｅｎ＝
８の情報を多重する。続いて、ｓｂ１からｓｂ７までの
帯域区分については情報を多重せず、次はｓｂ８から情
報の多重を再開する。すなわち、ｓｂ８からｓｂ９まで
２個の連続する帯域区分において、同じハフマン符号テ
ーブル「１０」が選択されており、ｃｂ［８］＝１０及
びｌｅｎ＝２の情報を多重する。図６は、このようにし
て、多重化部４がハフマン符号テーブルの情報を多重し
た結果を示す図である。First, for cb[sb] in (b) of FIG. 5, runs of consecutive band sections sb for which the same Huffman code table was selected are found, together with the run length len of each. For (b) of FIG. 5, the run lengths len are as shown in (c) of FIG. 5. Multiplexing then starts from the band section with the smallest index, sb0. The same Huffman code table "11" is selected for the eight consecutive band sections sb0 to sb7, so the information cb[0] = 11 and len = 8 is multiplexed. No information is multiplexed for band sections sb1 to sb7, and multiplexing resumes at sb8. The same Huffman code table "10" is selected for the two consecutive band sections sb8 and sb9, so cb[8] = 10 and len = 2 are multiplexed. FIG. 6 shows the result of the multiplexing unit 4 multiplexing the Huffman code table information in this way.

【００４３】このように、同じハフマン符号テーブルが
選択された連続する帯域区分について、共通なハフマン
符号テーブル情報ｃｂ［ｓｂ］の値、帯域区分の連続数
ｌｅｎの値、の順に情報を多重し、同じハフマン符号テ
ーブルが続く（ｌｅｎ−１）個のｓｂについてはハフマ
ン符号テーブルの情報を多重しないことを繰り返す。In this way, for each run of consecutive band sections with the same Huffman code table, the common Huffman code table value cb[sb] and then the run length len are multiplexed, in that order, and no Huffman code table information is multiplexed for the remaining (len - 1) band sections of the run. This is repeated for every run.
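
The procedure amounts to run-length coding of cb[sb]. A minimal sketch, assuming the multiplexed information can be modeled as a list of (cb, len) pairs; the function names are hypothetical and the bit-level syntax is not modeled.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def run_length_pairs(cb: Sequence[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal table ids into (table id, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(cb)]


def expand_pairs(pairs: Sequence[Tuple[int, int]]) -> List[int]:
    """Decoder side: rebuild the per-band-section table ids from the pairs."""
    return [value for value, length in pairs for _ in range(length)]
```

Using only the values quoted in the text (table 11 for sb0 to sb7, table 10 for sb8 and sb9), `run_length_pairs([11] * 8 + [10] * 2)` gives `[(11, 8), (10, 2)]`, so two pairs replace ten per-band-section entries.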

【００４４】以上の手順によれば、ほとんどの処理区分
ｎｂにおいて、同じハフマン符号テーブルが選択された
帯域区分の連続数は最低２帯域以上になることが保証さ
れ、多重すべきハフマン符号テーブルｃｂ［ｓｂ］及び
帯域区分の連続数ｌｅｎの情報の総数は、帯域区分毎に
ハフマン符号テーブルを決定する場合に比較して減少す
る。With the above procedure, in almost every processing section nb the run of band sections sharing a Huffman code table is guaranteed to be at least two band sections long, so the total number of cb[sb] and len items to be multiplexed is smaller than when a Huffman code table is determined for each band section individually.

【００４５】次に、スケーリング係数情報について考え
る。図７は帯域区分ｋ＝２の場合の処理区分ｎｂを適用
して最適符号化処理部３によって得られたスケーリング
係数の結果を示す図であり、図７における（ａ）は帯域
区分のインデックス、（ｂ）は帯域区分別のスケーリン
グ係数を示している。多重化部４は、スケーリング係数
の情報を以下の手順で多重する。Next, consider the scaling coefficient information. FIG. 7 shows the scaling coefficients obtained by the optimal encoding processing unit 3 when the processing sections nb for k = 2 are applied; (a) in FIG. 7 is the band section index and (b) is the scaling coefficient for each band section. The multiplexing unit 4 multiplexes the scaling coefficient information by the following procedure.

【００４６】まず、図７における（ｂ）のｓｆ［ｓｂ］
について、隣接する２つの帯域区分間のスケーリング係
数のインデックスの差分ｄｉｆｆ［ｓｂ］を、上記
（１）式により求める。図７における（ｃ）はその結果
である。スケーリング係数を帯域区分毎に決定する場合
と比較して差分が０となる場合が増えている。これは、
最低２帯域以上の連続した帯域区分で同じスケーリング
係数が選択されることに加えて、処理区分あたりに含ま
れる変換係数の数が相対的に多くなり、スケーリング係
数の大小関係が平滑化されるためである。First, for sf[sb] in (b) of FIG. 7, the difference diff[sb] between the scaling coefficient indices of two adjacent band sections is obtained from equation (1) above. The result is shown in (c) of FIG. 7. Compared with determining a scaling coefficient for each band section, the difference is 0 more often. This is because the same scaling coefficient is selected over at least two consecutive band sections and, in addition, each processing section contains relatively more transform coefficients, which smooths the variation in scaling coefficient magnitude.

【００４７】この差分ｄｉｆｆ［ｓｂ］に対して、図２
２に示したスケーリング係数のためのハフマン符号テー
ブルによってハフマン符号化を行う。図２２に示したハ
フマン符号テーブルは、差分が０の場合に最も符号長の
短い符号を割り当てており、さらに、差分の絶対値が小
さい場合に、より短い符号長の符号を割り当てているた
め、図７における（ｃ）の結果においては、帯域区分毎
にスケーリング係数を調整する場合に比較して、差分ｄ
ｉｆｆの情報を多重するために必要なビット量は減少す
る。This difference diff[sb] is Huffman-encoded using the Huffman code table for scaling coefficients shown in FIG. 22. That table assigns the shortest code to a difference of 0 and shorter codes to differences with smaller absolute values. For the result in (c) of FIG. 7, the number of bits needed to multiplex the diff information is therefore smaller than when the scaling coefficient is adjusted for each band section.
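
The effect on the differential coding can be illustrated with a small sketch. Several things here are assumptions: the difference is taken as sf[sb] minus sf[sb-1], which may differ in sign convention from equation (1); the code-length rule in `diff_bits` is a made-up stand-in for the table of FIG. 22 that only preserves its stated property (shortest code for 0, longer codes for larger absolute values); and the scaling coefficient values are invented.

```python
from typing import List, Sequence


def scale_factor_diffs(sf: Sequence[int]) -> List[int]:
    """Differences between adjacent band sections (assumed: current minus previous)."""
    return [cur - prev for prev, cur in zip(sf, sf[1:])]


def diff_bits(diffs: Sequence[int]) -> int:
    """Hypothetical code lengths: 1 bit for a difference of 0, 2*|d| + 1 bits otherwise."""
    return sum(1 + 2 * abs(d) for d in diffs)


# Invented values: one scaling coefficient per band section ...
per_band = [10, 11, 12, 12, 9, 8]
# ... versus one per processing section of two band sections.
shared = [10, 10, 12, 12, 9, 9]
```

For these invented values the shared case has three zero differences instead of one and costs 15 bits instead of 17 under the stand-in code lengths. Real savings depend on the signal and on the actual table.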

【００４８】このように、ｋ＝２として帯域区分を２つ
ずつまとめた処理区分ｎｂを新たに設定して、その処理
区分ｎｂを用いてスケーリング係数やハフマン符号テー
ブルを決定することにより、それらの補助情報に必要な
ビット量は大きく減少する。すなわち、より低ビットレ
ートでの符号化が可能となり、あるいは、減少した補助
情報のビット量を変換係数の量子化のために、あらため
て割り当てることによって、同一の符号化ビットレート
においては、より符号化品質を向上させることが可能と
なる。In this way, by newly setting processing sections nb that combine the band sections two at a time (k = 2) and determining the scaling coefficients and Huffman code tables per processing section nb, the bits needed for this auxiliary information are greatly reduced. Encoding at a lower bit rate thus becomes possible. Alternatively, by reallocating the bits saved on auxiliary information to the quantization of the transform coefficients, encoding quality can be improved at the same encoding bit rate.

【００４９】このように、処理区分決定部５は最適符号
化処理部３が予め保有している規定の最小単位区分であ
る帯域区分より少ない区分数の処理区分を最適符号化処
理部３に与え、最適符号化処理部３はその処理区分を用
いて最適符号化処理を行う。この一連の処理を、処理区
分決定部５が有する全てのパターンについて繰り返し実
行し、最適符号化処理部３は、異なる処理区分のパター
ンに対する結果を保存しておき、最終的に最も符号化効
率の良い結果が得られるケースを選定し、その結果の情
報を多重化部に出力する。この選定の方法としては、全
帯域区分における量子化誤差量が最も少ない場合を選択
する方法や、主情報及び補助情報を合わせたトータルで
のビット量が最も少ない場合を選択する方法があり、も
ちろん、この両者をトレードオフの関係として、最適な
場合を選択するものであっても良い。In this way, the processing section determination unit 5 gives the optimal encoding processing unit 3 processing sections that are fewer in number than the band sections, which are the prescribed minimum unit sections that the optimal encoding processing unit 3 holds in advance, and the optimal encoding processing unit 3 performs the optimal encoding process using those processing sections. This series of processes is repeated for every pattern held by the processing section determination unit 5. The optimal encoding processing unit 3 stores the result for each processing section pattern, finally selects the case that gives the best coding efficiency, and outputs the information for that result to the multiplexing unit. The selection may pick the case with the smallest quantization error over all band sections, or the case with the smallest total bit amount for main and auxiliary information combined. It may of course also treat the two as a trade-off and pick the best compromise.
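
The selection step can be sketched as a minimum search over stored results. The weighted sum used to express the trade-off is an assumption made for illustration; the text only says that the two criteria may be traded off. `Result`, `select_best`, and the numbers in the usage note are hypothetical.

```python
from typing import NamedTuple, Sequence


class Result(NamedTuple):
    k: int              # band sections per processing section for this pattern
    quant_error: float  # total quantization error over all band sections
    total_bits: int     # main information plus auxiliary information


def select_best(results: Sequence[Result],
                error_weight: float = 1.0,
                bit_weight: float = 0.0) -> Result:
    """Pick the stored result with the smallest weighted cost."""
    return min(results,
               key=lambda r: error_weight * r.quant_error + bit_weight * r.total_bits)


# Invented results for three processing section patterns.
results = [Result(2, 5.0, 1000), Result(3, 6.0, 900), Result(4, 9.0, 880)]
```

Setting `bit_weight` to 0 selects by quantization error only, setting `error_weight` to 0 selects by total bits only, and intermediate weights express a compromise.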

【００５０】多重化部４は、最適符号化処理部３から最
終結果として得られる主情報及び補助情報を規定の文法
に従って多重化し、出力端子１０へ出力する。出力端子
１０から出力されるデータは伝送路を介して伝送された
り、記録媒体へ記録されたりする。The multiplexing unit 4 multiplexes the main information and auxiliary information obtained as the final result from the optimal encoding processing unit 3 according to the prescribed syntax and outputs the result to the output terminal 10. The data output from the output terminal 10 is transmitted over a transmission path or recorded on a recording medium.

【００５１】以上のように、この実施の形態１によれ
ば、最適符号化処理部３において、予め保有している規
定の最小単位区分である帯域区分より少ない区分数の処
理区分を適用することにより、帯域区分毎に存在する補
助情報に必要なビット量を削減することが可能となり、
全体としてデータ圧縮効率を高めることができるという
効果が得られる。すなわち、量子化誤差量を同水準に維
持しつつ符号化を行う場合には、補助情報量のビット
量、すなわち全体の情報のビット量を抑えることができ
る。また、同一の符号化ビットレートで符号化する場合
には、補助情報に必要なビット量を抑えた分、その他の
情報に使用することにより、より高品質な符号化が可能
となる。As described above, according to Embodiment 1, the optimal encoding processing unit 3 applies processing sections that are fewer in number than the band sections, which are the prescribed minimum unit sections held in advance. This reduces the bits needed for the auxiliary information that exists for each band section and thus raises the overall data compression efficiency. That is, when encoding while keeping the quantization error at the same level, the bit amount of the auxiliary information, and hence of the information as a whole, can be reduced. When encoding at the same encoding bit rate, the bits saved on auxiliary information can be used for other information, which enables higher-quality encoding.

【００５２】なお、処理区分決定部５で有する処理区分
テーブルは、帯域区分をｋ（ｋ＝２，３，４，・・・）
個ずつまとめた処理区分テーブルや、臨界帯域をモデル
化した処理区分テーブルだけではなく、１つの処理区分
内に等しい数の変換係数を持つように帯域区分をまとめ
た処理区分テーブルであっても良い。また、上記では変
換符号化の例をあげて説明したが、帯域分割符号化であ
っても良い。The processing section tables held by the processing section determination unit 5 are not limited to tables that group the band sections k at a time (k = 2, 3, 4, ...) or tables that model the critical bands. They may also be tables that group the band sections so that each processing section contains an equal number of transform coefficients. Although transform coding was used as the example above, band division coding may be used instead.

【００５３】実施の形態２．図８はこの発明の実施の形
態２によるオーディオ符号化装置の構成を示すブロック
図である。図において、２１は入力されたオーディオ信
号を分析し処理区分決定部５から与えられる処理区分毎
にマスキングしきい値や信号パワー等の指標を算出する
聴覚分析部で、その他の構成は実施の形態１の図１と同
等である。上記実施の形態１では、処理区分決定部５に
よって設定された処理区分は、最適符号化処理部３に対
してのみ与えられるものであったが、同時に聴覚分析部
２１に対しても与えられるものであっても良い。Embodiment 2. FIG. 8 is a block diagram showing the configuration of an audio encoding device according to Embodiment 2 of the present invention. In the figure, 21 is an auditory analysis unit that analyzes the input audio signal and calculates indices such as the masking threshold and signal power for each processing section given by the processing section determination unit 5; the rest of the configuration is the same as FIG. 1 of Embodiment 1. In Embodiment 1, the processing sections set by the processing section determination unit 5 were given only to the optimal encoding processing unit 3, but they may also be given to the auditory analysis unit 21 at the same time.

【００５４】次に動作について説明する。聴覚分析部２
１は、入力端子９からの入力されたオーディオ信号を分
析し、通常は、予め保有している規定の帯域区分毎にマ
スキングしきい値や信号パワー等の指標を算出して、最
適符号化処理部３に対して出力する。図９は、ある帯域
区分におけるマスキングしきい値算出方法の一例を示す
図である。帯域区分をｓｂｎとすると、ｓｂｎ内に含ま
れる信号スペクトルの最大ＳＰｍａｘを検出する（図９
（ａ））。次に、信号スペクトルによるマスキング効果
を計算する。図９（ｂ）の斜線部がマスクされる範囲を
示している。さらに、ｓｂｎ内で最小となるしきい値Ｔ
Ｈｍｉｎを求める。これを帯域区分内のマスキングしき
い値の代表値とする。また、最大信号スペクトルと最小
マスキングしきい値との差分を信号対マスク比として指
標化する。Next, the operation is described. The auditory analysis unit 21 analyzes the audio signal input from the input terminal 9 and normally calculates indices such as the masking threshold and signal power for each of the prescribed band sections held in advance, and outputs them to the optimal encoding processing unit 3. FIG. 9 shows an example of how the masking threshold is calculated for a band section. For a band section sbn, the maximum SPmax of the signal spectrum contained in sbn is detected (FIG. 9(a)). Next, the masking effect of the signal spectrum is calculated; the hatched area in FIG. 9(b) is the masked range. Then the threshold THmin that is the minimum within sbn is found and used as the representative masking threshold of the band section. The difference between the maximum signal spectrum and the minimum masking threshold is used as an index, the signal-to-mask ratio.

【００５５】これに対して、処理区分決定部５から処理
区分ｎｂを与えられた場合は、その処理区分単位で指標
を算出する。図１０はある処理区分におけるマスキング
しきい値算出方法の一例を示す図である。図１０に示す
ように、例えば、ｎ番目の帯域区分ｓｂｎ及びｎ＋１番
目のｓｂｎ＋１をまとめたｍ番目の処理区分ｎｂｍを考
えると、２つの帯域区分内に含まれる信号スペクトルの
最大ＳＰｍａｘを検出し、次にｓｂｎ及びｓｂｎ＋１内
で最小となるしきい値ＴＨｍｉｎを求める。これを処理
区分内のマスキングしきい値の代表値とし、また、最大
信号スペクトルと最小マスキングしきい値との差分を信
号対マスク比として指標化する。In contrast, when processing sections nb are given by the processing section determination unit 5, the indices are calculated per processing section. FIG. 10 shows an example of how the masking threshold is calculated for a processing section. As shown in FIG. 10, consider for example the m-th processing section nbm, which combines the n-th band section sbn and the (n+1)-th band section sbn+1. The maximum SPmax of the signal spectrum contained in the two band sections is detected, and then the threshold THmin that is the minimum within sbn and sbn+1 is found. This is used as the representative masking threshold of the processing section, and the difference between the maximum signal spectrum and the minimum masking threshold is used as the signal-to-mask ratio index.
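
The index calculation is the same for a band section and a processing section; only the set of spectral bins changes. A minimal sketch, assuming the spectrum and the masking threshold are already available per bin on a logarithmic (dB) scale so that the ratio is a simple difference. The function name and all numbers are hypothetical, and the masking model that produces the threshold is not shown.

```python
from typing import Sequence, Tuple


def signal_to_mask_ratio(spectrum_db: Sequence[float],
                         threshold_db: Sequence[float],
                         bins: Sequence[int]) -> Tuple[float, float, float]:
    """Return (SPmax, THmin, SPmax - THmin) over the given spectral bins."""
    sp_max = max(spectrum_db[i] for i in bins)
    th_min = min(threshold_db[i] for i in bins)
    return sp_max, th_min, sp_max - th_min


# Invented per-bin values: bins 0-2 form sbn, bins 3-5 form sbn+1.
spectrum_db = [60.0, 72.0, 65.0, 58.0, 80.0, 61.0]
threshold_db = [40.0, 45.0, 42.0, 38.0, 50.0, 41.0]
```

Evaluating bins 0 to 2 gives the index for sbn alone, while evaluating bins 0 to 5 gives the index for the processing section nbm that combines sbn and sbn+1.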

【００５６】以上のように、この実施の形態２によれ
ば、処理区分決定部５から、同時に聴覚分析部２１と最
適符号化処理部３に対して同じ処理区分を与えることに
より、その処理区分に最適な指標が得られ、その指標を
用いて最適符号化処理が行われることになり、符号化品
質の向上あるいは圧縮効率の向上が可能となるという効
果が得られる。As described above, according to Embodiment 2, the processing section determination unit 5 gives the same processing sections to both the auditory analysis unit 21 and the optimal encoding processing unit 3. Indices suited to those processing sections are thereby obtained and used in the optimal encoding process, which makes it possible to improve encoding quality or compression efficiency.

【００５７】実施の形態３．上記実施の形態１及び実施
の形態２では、処理区分決定部５においては、予め用意
しておいた複数の処理区分テーブルの中からその処理区
分を選択するものであったが、オーディオ信号の特性に
応じて処理区分テーブルを絞り込む、あるいは生成する
ものであっても良い。Embodiment 3. In Embodiments 1 and 2, the processing section determination unit 5 selects the processing sections from several processing section tables prepared in advance. Alternatively, the processing section tables may be narrowed down, or generated, according to the characteristics of the audio signal.

【００５８】図１１はこの発明の実施の形態３によるオ
ーディオ符号化装置の構成を示すブロック図であり、図
において、５１は直交変換部１から得られるオーディオ
信号の特性に基づいて新たな処理区分を設定する処理区
分決定部であり、その他の構成は実施の形態１の図１と
同等である。この実施の形態では、オーディオ信号の特
性情報を直交変換部１の出力から得るものである。最適
符号化処理部３で符号化の対象となるデータは、直交変
換部１から出力される変換係数そのものであり、帯域区
分毎の適応処理を行うことは、すなわち帯域区分内に含
まれる変換係数に応じた処理を行うことに等しい。そこ
で、変換係数のパワー分布を基準に処理区分を決定す
る。以下にその具体的方法を示す。FIG. 11 is a block diagram showing the configuration of an audio encoding device according to Embodiment 3 of the present invention. In the figure, 51 is a processing section determination unit that sets new processing sections based on the characteristics of the audio signal obtained from the orthogonal transform unit 1; the rest of the configuration is the same as FIG. 1 of Embodiment 1. In this embodiment, the characteristic information of the audio signal is obtained from the output of the orthogonal transform unit 1. The data encoded by the optimal encoding processing unit 3 is the transform coefficients output by the orthogonal transform unit 1, so performing adaptive processing for each band section amounts to performing processing suited to the transform coefficients contained in that band section. The processing sections are therefore determined from the power distribution of the transform coefficients. The specific method is described below.

【００５９】次に動作について説明する。処理区分決定
部５１は、まず、直交変換部１から得られる変換係数を
基に帯域区分毎の変換係数のパワーを算出する。すなわ
ち、帯域区分毎に、各帯域区分に含まれる全体の変換係
数のパワーを算出する。次に、帯域区分毎の変換係数の
パワーについて、隣接する帯域区分間の差分値を求め
る。図１２は隣接する帯域区分間の変換係数のパワーの
差分値を示す図であり、ｎ−１，ｎ，ｎ＋１番目の帯域
区分の変換係数のパワーを、それぞれＰｗｎ−１、Ｐｗ
ｎ、Ｐｗｎ＋１としたときの様子を示している。Next, the operation is described. The processing section determination unit 51 first calculates the transform coefficient power of each band section from the transform coefficients obtained from the orthogonal transform unit 1; that is, for each band section it calculates the total power of the transform coefficients contained in that band section. Next, it obtains the difference in transform coefficient power between adjacent band sections. FIG. 12 shows these differences for the case where the transform coefficient powers of the (n-1)-th, n-th, and (n+1)-th band sections are Pw_{n-1}, Pw_{n}, and Pw_{n+1}, respectively.

【００６０】図１２における差分１及び差分２は、次の（５）式で表せる。
差分１＝｜Ｐｗｎ−Ｐｗｎ−１｜
差分２＝｜Ｐｗｎ＋１−Ｐｗｎ｜　（５）
ここで、あらかじめ定めたしきい値ＴＨｐｗと差分を比較し、差分値がしきい値ＴＨｐｗよりも小さければ、その隣接する２つの帯域区分を一つの処理区分とする。例えば、次の（６）式、
差分１＜ＴＨｐｗかつ差分２＞ＴＨｐｗ　（６）
である場合には、ｎ−１及びｎ番目の帯域区分を１つの処理区分とし、ｎ＋１番目の帯域区分は別の処理区分とする。
Difference 1 and difference 2 in FIG. 12 are given by equation (5):
$$\text{Difference 1} = |Pw_{n} - Pw_{n-1}|, \qquad \text{Difference 2} = |Pw_{n+1} - Pw_{n}| \tag{5}$$
Each difference is compared with a predetermined threshold TH_{pw}. If the difference is smaller than TH_{pw}, the two adjacent band sections are combined into one processing section. For example, when condition (6) holds,
$$\text{Difference 1} < TH_{pw} \ \text{and} \ \text{Difference 2} > TH_{pw} \tag{6}$$
the (n-1)-th and n-th band sections form one processing section and the (n+1)-th band section belongs to a different processing section.
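
The power calculation and the threshold test can be sketched as below. This is one possible reading, offered as an assumption: when several consecutive differences are below the threshold, the sketch chains all of those band sections into a single processing section, which the text does not spell out. The names `band_powers`, `merge_by_power`, and `th_pw`, and all numeric values in the usage note, are hypothetical.

```python
from typing import List, Sequence


def band_powers(coeffs: Sequence[float], band_edges: Sequence[int]) -> List[float]:
    """Total power (sum of squares) of the coefficients in each band section.

    band_edges[i] and band_edges[i + 1] delimit band section i.
    """
    return [sum(c * c for c in coeffs[lo:hi])
            for lo, hi in zip(band_edges, band_edges[1:])]


def merge_by_power(powers: Sequence[float], th_pw: float) -> List[List[int]]:
    """Join adjacent band sections whose power difference is below th_pw."""
    if not powers:
        return []
    sections = [[0]]
    for sb in range(1, len(powers)):
        if abs(powers[sb] - powers[sb - 1]) < th_pw:
            sections[-1].append(sb)   # same level: extend the current processing section
        else:
            sections.append([sb])     # level changes: start a new processing section
    return sections
```

For powers `[10.0, 10.5, 20.0, 20.2, 20.1, 5.0]` and a threshold of 1.0, the sketch produces the processing sections `[[0, 1], [2, 3, 4], [5]]`.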

【００６１】処理区分決定部５１は、上記の処理を全て
の帯域区分について行い、最終的な処理区分を決定す
る。なお、ここでは隣接する２つの帯域区分間の差分を
用いたが、３つ以上の帯域区分間の相対関係を用いて判
定しても良い。また、帯域区分毎の変換係数のパワーを
算出する代わりに、この帯域区分における変換係数のパ
ワーの最大値を検出し、これを帯域区分毎の変換係数の
パワーの代用としても良い。The processing section determination unit 51 performs the above process for all band sections and determines the final processing sections. Although the difference between two adjacent band sections is used here, the decision may instead be based on the relative relationship among three or more band sections. Also, instead of calculating the transform coefficient power of each band section, the maximum transform coefficient power within the band section may be detected and used in its place.

【００６２】図１３はこの発明の実施の形態３によるオ
ーディオ符号化装置の他の構成を示すブロック図であ
り、図において、５１は聴覚分析部２から得られるオー
ディオ信号の特性を基に新たな処理区分を設定する処理
区分決定部であり、その他の構成は実施の形態１の図１
と同等である。この実施の形態では、オーディオ信号の
特性情報を聴覚分析部２の出力から得るものである。FIG. 13 is a block diagram showing another configuration of the audio encoding device according to Embodiment 3 of the present invention. In the figure, 51 is a processing section determination unit that sets new processing sections based on the characteristics of the audio signal obtained from the auditory analysis unit 2; the rest of the configuration is the same as FIG. 1 of Embodiment 1. In this configuration, the characteristic information of the audio signal is obtained from the output of the auditory analysis unit 2.

【００６３】聴覚分析部２ではＦＦＴによるスペクトラ
ム解析が行われており、この結果から得られる個々のス
ペクトルのパワーを変換係数のパワーとみなし、帯域区
分毎のスペクトルのパワーを算出し、隣接する帯域区分
間の差分を求め、その値としきい値を比較判定し、しき
い値以下であればこれらの帯域区分を同一の処理区分と
してまとめる。これを全ての帯域区分について繰り返し
行い、最終的な処理区分を決定する。The auditory analysis unit 2 performs spectrum analysis by FFT. The power of each spectral line obtained from this analysis is regarded as transform coefficient power: the spectral power of each band section is calculated, the difference between adjacent band sections is obtained and compared with a threshold, and if the difference is at or below the threshold, those band sections are combined into the same processing section. This is repeated for all band sections to determine the final processing sections.

【００６４】以上のように、実施の形態３によれば、符
号化の対象となる変換係数から帯域区分毎の変換係数の
パワーの分布を算出し、変換係数のパワーが同じレベル
の帯域区分を同一の処理区分にまとめることによって、
処理区分内には一様な変換係数が揃うことになり、量子
化効率を向上させることができるという効果が得られ
る。As described above, according to Embodiment 3, the distribution of transform coefficient power over the band sections is calculated from the transform coefficients to be encoded, and band sections whose transform coefficient power is at the same level are combined into the same processing section. Each processing section then contains uniform transform coefficients, which improves quantization efficiency.

【００６５】また、実施の形態３によれば、聴覚分析部
２によりオーディオ信号を分析した際に得られるスペク
トルから帯域区分毎のスペクトルのパワーの分布を算出
し、スペクトルのパワーが同じレベルの帯域区分を同一
の処理区分にまとめることによって、処理区分内には一
様なスペクトルが揃うことになり、量子化効率を向上さ
せることができるという効果が得られる。Also according to Embodiment 3, the distribution of spectral power over the band sections is calculated from the spectrum obtained when the auditory analysis unit 2 analyzes the audio signal, and band sections whose spectral power is at the same level are combined into the same processing section. Each processing section then contains a uniform spectrum, which improves quantization efficiency.

【００６６】なお、オーディオ信号の特性情報は、入力
端子９から処理区分決定部５に対して直接オーディオ信
号を入力し、処理区分決定部５の内部に解析部を設ける
ものであっても良いが、上記のように、もともとオーデ
ィオ信号の特性を解析している処理部からその結果を得
るように構成することによって、装置規模やコストを下
げることができる。The characteristic information of the audio signal could also be obtained by feeding the audio signal directly from the input terminal 9 to the processing section determination unit 5 and providing an analysis unit inside it. However, obtaining the result from a processing unit that already analyzes the characteristics of the audio signal, as described above, reduces the scale and cost of the device.

【００６７】また、予め用意されている処理区分テーブ
ルを参照する場合、その数が多い場合には、それらのケ
ース全てを実行するには膨大な演算量が必要となるが、
上記のような構成とすることで、オーディオ信号の特性
を考慮することによって処理区分テーブルを絞り込むこ
とが可能となり、処理区分テーブルを格納するメモリサ
イズや演算量を低減することができる。When processing section tables prepared in advance are used and there are many of them, running every case requires an enormous amount of computation. With the configuration described above, the processing section tables can be narrowed down by taking the characteristics of the audio signal into account, which reduces both the memory needed to store the tables and the amount of computation.

【００６８】さらには、処理区分を上記のように一意に
設定し、設定した処理区分により少数のバリエーション
の処理区分の候補を設定し、上記実施の形態１及び実施
の形態２のように、最適符号化処理部３で繰り返し実行
して最適な処理区分を選択しても同様の効果が得られ
る。演算量の削減は、即ち装置規模及びコストの縮小に
つながる。Furthermore, the same effect is obtained if the processing sections are first set uniquely as described above, a small number of variations on them are set as candidate processing sections, and, as in Embodiments 1 and 2, the optimal encoding processing unit 3 runs repeatedly over the candidates to select the optimal processing sections. Reducing the amount of computation in turn reduces the scale and cost of the device.

【００６９】実施の形態４．図１４はこの発明の実施の
形態４によるオーディオ符号化装置の構成を示すブロッ
ク図である。図において、５２は外部から符号化ビット
レート情報を得て処理区分を設定する処理区分決定部
で、９１は符号化ビットレート情報を入力する制御端子
であり、その他の構成は実施の形態１の図１と同等であ
る。Embodiment 4. FIG. 14 is a block diagram showing the configuration of an audio encoding device according to Embodiment 4 of the present invention. In the figure, 52 is a processing section determination unit that obtains encoding bit rate information from outside and sets the processing sections, and 91 is a control terminal through which the encoding bit rate information is input; the rest of the configuration is the same as FIG. 1 of Embodiment 1.

【００７０】図１５は処理区分決定部５２の構成を示す
ブロック図である。処理区分決定部５２は、処理区分テ
ーブルを複数ずつに分けたＮ個の処理区分テーブル群
（１）５０１、処理区分テーブル群（２）５０２、・・
・、処理区分テーブル群（Ｎ）５０Ｎを持ち、また、Ｎ
個の処理区分テーブル群の中から特定の処理区分テーブ
ル群を選択するテーブル群選択部５１０及び切り替えの
ための切替器５１１で構成される。FIG. 15 is a block diagram showing the configuration of the processing section determination unit 52. The processing section determination unit 52 holds N processing section table groups, each containing several processing section tables: processing section table group (1) 501, processing section table group (2) 502, ..., processing section table group (N) 50N. It also comprises a table group selection unit 510, which selects a particular group from the N processing section table groups, and a switch 511 for switching between them.

【００７１】次に動作について説明する。ここで、簡単
のため、図２に示す帯域区分において、ｋ個（ｋ＝２〜
７）ずつをまとめて１つの処理区分とする６個の処理区
分テーブルを考える。さらに、ｋ＝２，３の場合の処理
区分テーブルを第１の処理区分テーブル群とし、ｋ＝
４，５の場合の処理区分テーブルを第２の処理区分テー
ブル群とし、ｋ＝６，７の場合の処理区分テーブルを第
３の処理区分テーブル群とする。すなわち、第１の処理
区分テーブル群には比較的処理区分数が多いテーブルが
含まれ、第３の処理区分テーブル群には比較的処理区分
数が少ないテーブルが含まれている。第２の処理区分テ
ーブル群にはその中間のテーブルが含まれている。ここ
で、処理区分数が少ないテーブルほど、補助情報に必要
なビット量が少なくなることは前述の通りである。Next, the operation is described. For simplicity, consider six processing section tables that group the band sections of FIG. 2 k at a time (k = 2 to 7) into processing sections. Let the tables for k = 2 and 3 form the first processing section table group, those for k = 4 and 5 the second group, and those for k = 6 and 7 the third group. The first group thus contains tables with a relatively large number of processing sections, the third group contains tables with a relatively small number, and the second group contains tables in between. As stated earlier, the fewer processing sections a table has, the fewer bits the auxiliary information needs.

【００７２】テーブル群選択部５１０は、外部からの符
号化ビットレート情報を受け、予め処理区分テーブル群
の数と同じ数に区分された符号化ビットレートの範囲に
対応付けられた処理区分テーブル群を、切替器５１１に
よって選択する。この対応付けでは、ビットレートが低
いほど処理区分数が少なく、ビットレートが高いほど処
理区分数が多いというように対応付けられている。The table group selection unit 510 receives the encoding bit rate information from outside. The encoding bit rate is divided in advance into as many ranges as there are processing section table groups, and each range is associated with one group. The unit uses the switch 511 to select the group associated with the range that contains the given bit rate. The association is such that the lower the bit rate, the fewer the processing sections, and the higher the bit rate, the more the processing sections.

【００７３】例えば、入力された符号化ビットレート情
報が、３つの符号化ビットレート区分のうち最も低いビ
ットレート区分の範囲にあると仮定すると、テーブル群
選択部５１０は、切替器５１１によって第３の処理区分
テーブル群を選択する。これは、生成する符号化ストリ
ームに占める補助情報の割合を相対的に低く抑えるため
である。そして、第３の処理区分テーブル群に属するｋ
＝６，７の場合のそれぞれの処理区分テーブルを最適符
号化処理部３に対して与える。For example, if the input encoding bit rate falls in the lowest of the three bit rate ranges, the table group selection unit 510 uses the switch 511 to select the third processing section table group. This keeps the share of auxiliary information in the generated encoded stream relatively low. The processing section tables for k = 6 and k = 7, which belong to the third group, are then each given to the optimal encoding processing unit 3.
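
The mapping from bit rate to table group can be sketched as a range lookup. The grouping of k values follows the example in the text, but the two range boundaries (64 kbit/s and 128 kbit/s) are invented for illustration, as are the names `TABLE_GROUPS`, `BITRATE_BOUNDARIES`, and `select_table_group`.

```python
import bisect
from typing import Tuple

# k values per table group, ordered from the lowest bit rate range upward:
# third group (k = 6, 7), second group (k = 4, 5), first group (k = 2, 3).
TABLE_GROUPS = [(6, 7), (4, 5), (2, 3)]

# Hypothetical range boundaries in bit/s; a rate equal to a boundary
# falls in the higher range.
BITRATE_BOUNDARIES = [64_000, 128_000]


def select_table_group(bitrate: int) -> Tuple[int, ...]:
    """Return the k values of the table group associated with the bit rate range."""
    index = bisect.bisect_right(BITRATE_BOUNDARIES, bitrate)
    return TABLE_GROUPS[index]
```

Under these assumed boundaries, a 48 kbit/s input selects the third group (k = 6, 7) and a 256 kbit/s input selects the first group (k = 2, 3). The optimal encoding process is then run only for the tables in the selected group.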

【００７４】最適符号化処理部３では、ｋ＝６，７の２
つの処理区分について最適符号化処理を行い、結果とし
て符号化効率の良い方を選定し、その処理区分テーブル
を用いた場合の各種情報を多重化部４に出力する。この
動作については上記実施の形態で述べた通りである。The optimal encoding processing unit 3 performs the optimal encoding process for the two processing section tables, k = 6 and k = 7, selects whichever gives the better coding efficiency, and outputs the information obtained with that table to the multiplexing unit 4. This operation is as described in the embodiments above.

【００７５】また、符号化ビットレートが高い場合に
は、テーブル群選択部５１０は、切替器５１１によって
第１の処理区分テーブル群を選択する。これは、処理区
分数が多い処理区分テーブルを適用し帯域区分毎の適応
処理を図った結果、補助情報に必要なビット量が多くな
ったとしても、符号化ビットレートが高いため、１符号
化フレームに相当する全体のビット量も多く、生成する
符号化ストリームに占める補助情報の割合は低いと判断
できるからである。When the coding bit rate is high, the table group selection unit 510 selects the first processing section table group by means of the switch 511. This is because, even if applying a processing section table with many processing sections and performing adaptive processing for each band section increases the bit amount required for the auxiliary information, the high coding bit rate means that the total bit amount corresponding to one coded frame is also large, so the proportion of auxiliary information in the generated coded stream can be judged to be low.
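The selection rule of the table group selection unit 510 can be sketched as a range lookup. This is only an illustration: the range boundaries, the names `RATE_BOUNDARIES`, `TABLE_GROUPS_LOW_TO_HIGH` and `select_table_group`, and the k values of the first and second groups are assumptions; only k = 6, 7 for the third group comes from the text.

```python
from bisect import bisect_right

# Hypothetical boundaries (bit/s) splitting the coding bit rate into three ranges.
RATE_BOUNDARIES = [64_000, 128_000]

# Table groups ordered from the lowest bit rate range to the highest.
# Only k = 6, 7 for the third group is taken from the text; the other
# k values are placeholders chosen for illustration.
TABLE_GROUPS_LOW_TO_HIGH = [
    ("group 3", [6, 7]),  # lowest range: largest k, fewest processing sections
    ("group 2", [4, 5]),
    ("group 1", [2, 3]),  # highest range: smallest k, most processing sections
]


def select_table_group(bit_rate):
    """Return (group name, k values) for the range containing bit_rate."""
    index = bisect_right(RATE_BOUNDARIES, bit_rate)
    return TABLE_GROUPS_LOW_TO_HIGH[index]
```

Under these assumed boundaries, a 48 kbit/s input falls in the lowest range and receives the third group, while a 256 kbit/s input receives the first group.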

【００７６】以上のように、この実施の形態４によれ
ば、符号化ビットレートの情報を考慮して処理区分を決
定することによって、補助情報に必要なビット量を調節
し、生成される符号化ストリームに占める補助情報の割
合を一様にすることが可能になり、特に低ビットレート
符号化の場合に、変換係数等の主情報に必要なビット量
を確保することができ、符号化品質の劣化を防ぐことが
できるという効果が得られる。また、処理区分テーブル
を絞り込むことによって、最適符号化処理部３での実行
処理回数を減らすことができ、装置規模、コストの縮小
を図ることができるという効果が得られる。さらに、処
理負荷が抑えられることによって、入力されたオーディ
オ信号に対して実時間処理が可能となるという効果が得
られる。As described above, according to Embodiment 4, determining the processing sections in consideration of the coding bit rate information makes it possible to adjust the bit amount required for the auxiliary information and to make the proportion of auxiliary information in the generated coded stream uniform. In low bit rate encoding in particular, the bit amount required for main information such as transform coefficients can be secured, which prevents degradation of coding quality. Further, narrowing down the processing section tables reduces the number of processing runs executed in the optimum encoding processing unit 3, which reduces device scale and cost. In addition, because the processing load is suppressed, the input audio signal can be processed in real time.

【００７７】実施の形態５．図１６はこの発明の実施の
形態５によるオーディオ符号化装置の構成を示すブロッ
ク図である。図において、６は、最適符号化処理部３の
処理の結果得られる主情報及び補助情報に必要なそれぞ
れのビット量を求め、補助情報に必要なビット量の全体
のビット量に占める割合を算出し、算出した結果に応じ
て、処理区分決定部５３に処理区分を決定させる情報量
判定部で、５３は情報量判定部６からの指示に基づき処
理区分を設定する処理区分決定部であり、その他の構成
は実施の形態１の図１と同等である。Embodiment 5. FIG. 16 is a block diagram showing the configuration of an audio encoding device according to Embodiment 5 of the present invention. In the figure, reference numeral 6 denotes an information amount determination unit that obtains the bit amounts required for the main information and the auxiliary information resulting from the processing of the optimum encoding processing unit 3, calculates the ratio of the bit amount required for the auxiliary information to the total bit amount, and, according to the result, causes the processing section determination unit 53 to determine the processing sections. Reference numeral 53 denotes a processing section determination unit that sets the processing sections based on instructions from the information amount determination unit 6. The other components are the same as those in FIG. 1 of Embodiment 1.

【００７８】次に動作について説明する。処理区分決定
部５３及び情報量判定部６以外は、これまで説明した通
りの動作をするものであり、ここでは説明を省略する。
情報量判定部６は、最適符号化処理部３の処理の結果得
られる主情報及び補助情報に必要なそれぞれのビット量
Ｑｍ及びＱｓを求め、ビット量Ｑｓの全体のビット量に
占める割合Ｒｑ＝Ｑｓ／（Ｑｍ＋Ｑｓ）を算出する。こ
こで、全体のビット量とは、符号化ビットレートに対し
て１符号化フレーム時間に相当するビット量Ｑａに等し
い（Ｑａ＝Ｑｍ＋Ｑｓ）。全体のビット量は符号化フレ
ーム毎に固定であっても可変であっても良い。Next, the operation will be described. The components other than the processing section determination unit 53 and the information amount determination unit 6 operate as described above, so their description is omitted here. The information amount determination unit 6 obtains the bit amounts Qm and Qs required for the main information and the auxiliary information, respectively, resulting from the processing of the optimum encoding processing unit 3, and calculates the ratio of Qs to the total bit amount, Rq = Qs / (Qm + Qs). Here, the total bit amount equals the bit amount Qa corresponding to one coded frame time at the coding bit rate (Qa = Qm + Qs). The total bit amount may be fixed or variable from one coded frame to the next.

【００７９】次に情報量判定部６は、ビット量Ｑｓの全
体のビット量に占める割合Ｒｑと予め用意されたしきい
値Ｒｔｈ１と比較し、Ｒｑ＞Ｒｔｈ１であれば相対的に
補助情報のビット量が多く、変換係数等の主情報のビッ
ト量が少ないと判断して、処理区分決定部５３に対して
補助情報のビット量が少なくなるように調整する指示を
出す。すなわち、処理区分決定部５３が、処理区分の数
がより少ない処理区分テーブルを選択して最適符号化処
理部３に処理区分情報を与えるような制御情報を与え
る。Ｒｑ≦Ｒｔｈ１であれば、最適符号化処理部３の処
理結果を最終結果として、そのまま多重化部４へ各種情
報を出力する。Next, the information amount determination unit 6 compares the ratio Rq of the bit amount Qs to the total bit amount with a threshold Rth1 prepared in advance. If Rq > Rth1, it judges that the bit amount of the auxiliary information is relatively large and that of the main information such as transform coefficients is small, and instructs the processing section determination unit 53 to adjust so that the bit amount of the auxiliary information decreases. That is, it supplies control information that causes the processing section determination unit 53 to select a processing section table with fewer processing sections and to supply that processing section information to the optimum encoding processing unit 3. If Rq ≦ Rth1, the processing result of the optimum encoding processing unit 3 is taken as the final result, and the various information is output as is to the multiplexing unit 4.

【００８０】処理区分決定部５３は、情報量判定部６か
ら処理区分の数を減少させるように指示を受けた場合、
改めて処理区分を設定し、最適符号化処理部３に与え
る。例えば、図２の帯域区分に対して、ｋ（ｋ＝２，
３，４，・・・）個ずつにまとめた処理区分テーブルを
持ち、情報量判定部６からの指示に対して、順次それぞ
れの処理区分を与える。最適符号化処理部３は、この処
理区分を用いて再び処理区分毎の適応処理を実行する。
そして、情報量判定部６は再び判定を行う。この一連の
処理はＲｑ≦Ｒｔｈ１を満足するまで繰り返し行われ
る。When the processing section determination unit 53 is instructed by the information amount determination unit 6 to reduce the number of processing sections, it sets the processing sections anew and supplies them to the optimum encoding processing unit 3. For example, it holds processing section tables that group the band sections of FIG. 2 k at a time (k = 2, 3, 4, ...), and supplies these processing sections one after another in response to instructions from the information amount determination unit 6. The optimum encoding processing unit 3 uses the supplied processing sections to execute the adaptive processing for each processing section again, and the information amount determination unit 6 then repeats its determination. This sequence is repeated until Rq ≦ Rth1 is satisfied.
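The feedback loop between the information amount determination unit 6, the processing section determination unit 53 and the optimum encoding processing unit 3 can be sketched as follows. This is a minimal sketch under stated assumptions: `encode` stands in for the optimum encoding processing unit and returns the pair (Qm, Qs) for a given k, `toy_encode` is an invented model with a fixed 1000-bit frame, and keeping the last result when the tables run out is an assumption, since the text does not say what happens in that case.

```python
def side_info_ratio(qm, qs):
    """Rq = Qs / (Qm + Qs): share of the frame spent on auxiliary information."""
    return qs / (qm + qs)


def repartition_until_ok(encode, rth1, k_values=(1, 2, 3, 4)):
    """Try successively coarser groupings until Rq <= Rth1.

    encode(k) is a hypothetical stand-in for the optimum encoding
    processing unit; it must return (Qm, Qs) in bits for grouping size k.
    """
    for k in k_values:
        qm, qs = encode(k)
        rq = side_info_ratio(qm, qs)
        if rq <= rth1:
            break
    # Assumption: if no k satisfies the condition, the last result is kept.
    return k, rq


def toy_encode(k, frame_bits=1000, side_bits_at_k1=441):
    """Invented model: auxiliary bits shrink as more bands are grouped."""
    qs = side_bits_at_k1 // k
    return frame_bits - qs, qs
```

With the toy model and Rth1 = 0.2, the loop rejects k = 1 (Rq = 0.441) and k = 2 (Rq = 0.22) and stops at k = 3 (Rq = 0.147).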

【００８１】このように、情報量判定部６において補助
情報に必要なビット量の全体に占める割合を求め、その
結果に応じて処理区分を再区分化することによって補助
情報量を削減するので、変換係数等の主情報に必要なビ
ット量をより多く確保することができ、全体としてデー
タ圧縮効率を高めることができるという効果が得られ
る。特に、最適符号化処理部３の処理結果を基に補助情
報のビット量の判定を行うというフィードバック型の構
成をとることによって、補助情報量を調整するための、
より正確な判定がなされる。As described above, the information amount determination unit 6 obtains the proportion of the total taken up by the bit amount required for the auxiliary information and, according to the result, the processing sections are repartitioned to reduce the amount of auxiliary information. A larger bit amount can therefore be secured for main information such as transform coefficients, and data compression efficiency is improved overall. In particular, the feedback configuration, in which the bit amount of the auxiliary information is judged from the processing result of the optimum encoding processing unit 3, allows a more accurate determination for adjusting the amount of auxiliary information.

【００８２】また、この実施の形態では、情報量判定部
６は、外部からの符号化ビットレート情報に応じてその
判定条件を変更するものであっても良い。補助情報のビ
ット量Ｑｓは、設定する符号化ビットレートに依らず、
その変動範囲には上限及び下限がある。In this embodiment, the information amount determination unit 6 may also change its determination condition according to coding bit rate information supplied from outside. Regardless of the coding bit rate that is set, the range over which the bit amount Qs of the auxiliary information varies has an upper limit and a lower limit.

【００８３】例えば、ハフマン符号テーブルの情報につ
いては、極端な場合、全ての帯域区分において同じハフ
マン符号テーブルが選択された場合に最小となり、（４
ｂｉｔ＋５ｂｉｔ）×２＝１８ｂｉｔ、全ての帯域区分
において同じハフマン符号テーブルが連続して選択され
ることが無い場合に最大となり、（４ｂｉｔ＋５ｂｉ
ｔ）×４９＝４４１ｂｉｔとなる。ここでは、１１種類
のハフマン符号テーブルを表すために必要なビット数を
４ｂｉｔ、同じハフマン符号テーブルが選択されている
帯域の連続数を表すために必要なビット数を、３２個以
下の連続数を示すことができる５ｂｉｔ、また、帯域区
分の数を４９個と仮定している。For example, in the extreme cases, the Huffman code table information is smallest when the same Huffman code table is selected in all band sections, namely (4 bit + 5 bit) × 2 = 18 bit, and largest when the same Huffman code table is never selected in consecutive band sections, namely (4 bit + 5 bit) × 49 = 441 bit. Here it is assumed that 4 bits are needed to identify one of 11 Huffman code tables, that 5 bits, which can express a run of up to 32, are needed to represent the number of consecutive bands in which the same Huffman code table is selected, and that there are 49 band sections.
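The two bounds can be reproduced with a few lines of arithmetic. The constant and function names are illustrative, and reading the factor 2 in the minimum as ceil(49 / 32) runs is an interpretation of the stated 32-band run limit rather than something the text spells out.

```python
import math

TABLE_ID_BITS = 4    # enough to identify one of 11 Huffman code tables
RUN_LENGTH_BITS = 5  # run of consecutive bands sharing a table, up to 32
MAX_RUN = 32
NUM_BANDS = 49


def table_info_bits(num_runs):
    """Bits needed to signal num_runs (table id, run length) pairs."""
    return (TABLE_ID_BITS + RUN_LENGTH_BITS) * num_runs


# Same table everywhere: 49 bands still need ceil(49 / 32) = 2 runs.
min_runs = math.ceil(NUM_BANDS / MAX_RUN)
qs_min = table_info_bits(min_runs)   # 18 bits

# Table changes at every band: one run per band section.
qs_max = table_info_bits(NUM_BANDS)  # 441 bits
```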

【００８４】図１７は符号化ビットレート、全体のビッ
ト量、補助情報の上限における割合の関係を示す図であ
る。上記のＱｓの最大値をＱｓｍａｘとすると、符号化
ビットレート、全体のビット量Ｑａ、補助情報の上限に
おける割合Ｑｓｍａｘ／Ｑａの関係は図１７に示す通り
となる。ここでは符号化フレーム周期を２１．３３ｍｓ
としている。FIG. 17 is a diagram showing the relationship among the coding bit rate, the total bit amount, and the proportion of auxiliary information at its upper limit. Letting Qsmax be the maximum value of Qs above, the relationship among the coding bit rate, the total bit amount Qa, and the proportion Qsmax/Qa of auxiliary information at its upper limit is as shown in FIG. 17. Here, the coded frame period is 21.33 ms.

【００８５】図１７に示す通り、符号化ビットレートが
高くなるに応じてＱｓｍａｘ／Ｑａの値は小さくなる。
すなわち、Ｑｓｍａｘの全体に占める割合が低くなり、
言い換えれば主情報のビット量は十分確保されるという
ことになる。従って、Ｒｑを算出し、この比が符号化ビ
ットレート毎に定められたしきい値Ｒｔｈ２（＝Ｑｓｍ
ａｘ／Ｑａ）との関係Ｒｑ＞Ｒｔｈ２を満たす場合にの
み、処理区分決定部５３に対して再区分化を行うような
指示を与え、条件を満たさない場合には再区分化を行わ
ないような指示を与える。As shown in FIG. 17, the value of Qsmax/Qa decreases as the coding bit rate increases. That is, Qsmax accounts for a smaller proportion of the total; in other words, a sufficient bit amount is secured for the main information. Accordingly, Rq is calculated, and only when it satisfies Rq > Rth2 with respect to the threshold Rth2 (= Qsmax/Qa) defined for each coding bit rate is the processing section determination unit 53 instructed to repartition; when the condition is not satisfied, it is instructed not to repartition.
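The bit-rate-dependent threshold can be sketched as below. The function names are illustrative, and using the 441-bit Huffman table maximum from the earlier example as Qsmax is an assumption; the values actually plotted in FIG. 17 are not reproduced here.

```python
FRAME_PERIOD_S = 0.02133  # coded frame period of 21.33 ms, from the text
QS_MAX_BITS = 441         # assumption: Qsmax taken from the Huffman table example


def frame_bits(bit_rate):
    """Qa: total bits available for one coded frame at bit_rate (bit/s)."""
    return bit_rate * FRAME_PERIOD_S


def rth2(bit_rate):
    """Threshold Rth2 = Qsmax / Qa for the given coding bit rate."""
    return QS_MAX_BITS / frame_bits(bit_rate)


def needs_repartition(qm, qs, bit_rate):
    """True when Rq = Qs / (Qm + Qs) exceeds Rth2 for this bit rate."""
    rq = qs / (qm + qs)
    return rq > rth2(bit_rate)
```

Because Qa grows in proportion to the bit rate, `rth2` falls as the bit rate rises, which matches the trend the text attributes to FIG. 17.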

【００８６】処理区分決定部５３は、情報量判定部６か
らの指示を受け、再区分化が必要な場合は処理区分を新
たに設定して最適符号化処理部３に出力する。再区分化
が必要でない場合には新たに処理区分を設定することは
しない。なお、処理区分決定部５３での処理区分を設定
する動作については、上記実施の形態で説明したものを
適用する。すなわち、帯域区分をｋ個ずつにまとめた処
理区分の中から選択するものであっても良いし、入力さ
れたオーディオ信号を分析することによって処理区分を
決定するものであっても良い。The processing section determination section 53 receives an instruction from the information amount determination section 6 and, if re-partitioning is necessary, newly sets a processing section and outputs it to the optimal encoding processing section 3. If re-partitioning is not required, no new processing section is set. The operation described in the above embodiment is applied to the operation of setting the processing division in the processing division determining unit 53. In other words, the processing section may be selected from the processing sections obtained by grouping the band sections into k pieces, or the processing section may be determined by analyzing the input audio signal.

【００８７】以上のように、この実施の形態５によれ
ば、情報量判定部６を設けることによって、補助情報の
占める割合が大きく変換係数等の主情報に必要なビット
数が不足するような低ビットレートでの符号化の場合に
のみ、処理区分化を行って符号化効率を向上させること
ができると共に、低ビットレートではない符号化の場合
には、帯域毎の適応制御結果を優先して採用することに
より、無駄な処理区分の見直しにおける処理の実行を避
けることができるという効果が得られる。As described above, according to Embodiment 5, providing the information amount determination unit 6 allows grouping into processing sections to be performed, improving coding efficiency, only for low bit rate encoding, where auxiliary information accounts for a large proportion and the number of bits needed for main information such as transform coefficients would otherwise be insufficient. For encoding that is not at a low bit rate, the adaptive control result for each band is adopted preferentially, which avoids executing needless reviews of the processing sections.

【００８８】[0088]

【発明の効果】以上のように、この発明によれば、変換
係数を符号化する際の周波数軸上での複数の処理区分
を、既定の最小単位区分より少ない区分数になるよう設
定する処理区分決定部と、処理区分決定部が設定した複
数の処理区分毎に、算出された変換係数を符号化し、符
号化された変換係数及び符号化の際に使用した付随する
補助情報を出力する最適符号化処理部とを備えたことに
より、帯域区分毎に存在する補助情報に必要なビット量
を削減することが可能となり、全体としてデータ圧縮効
率を高めることができるという効果がある。As described above, the present invention includes a processing section determination unit that sets a plurality of processing sections on the frequency axis, used when encoding the transform coefficients, so that their number is smaller than that of the predetermined minimum unit sections, and an optimum encoding processing unit that encodes the calculated transform coefficients for each of the processing sections set by the processing section determination unit and outputs the encoded transform coefficients together with the accompanying auxiliary information used in the encoding. This makes it possible to reduce the bit amount required for the auxiliary information that exists for each band section, and to improve data compression efficiency overall.

【００８９】この発明によれば、処理区分決定部が、規
定のｎ個の最小単位区分に対してｋ個（ｋ＜ｎ）ずつの
最小単位区分を１つにまとめて処理区分として設定する
ことにより、帯域区分毎に存在する補助情報に必要なビ
ット量を削減することが可能となり、全体としてデータ
圧縮効率を高めることができるという効果がある。According to the present invention, the processing section determination unit groups the prescribed n minimum unit sections k at a time (k < n) and sets each group as one processing section. This makes it possible to reduce the bit amount required for the auxiliary information that exists for each band section, and to improve data compression efficiency overall.

【００９０】この発明によれば、処理区分決定部が、処
理区分に属する変換係数の数が一様になるように処理区
分を設定することにより、帯域区分毎に存在する補助情
報に必要なビット量を削減することが可能となり、全体
としてデータ圧縮効率を高めることができるという効果
がある。According to the present invention, the processing section determining section sets the processing section so that the number of transform coefficients belonging to the processing section becomes uniform, so that the bits necessary for the auxiliary information existing for each band section are set. The amount can be reduced, and the data compression efficiency can be increased as a whole.

【００９１】この発明によれば、聴覚分析部が、処理区
分決定部が設定した複数の処理区分毎に、量子化雑音の
発生量を適応的に制御するための指標を算出することに
より、その処理区分に最適な指標が得られ、その指標を
用いて最適符号化処理が行われることになり、符号化品
質の向上あるいは圧縮効率の向上が可能となるという効
果がある。According to the present invention, the auditory analysis section calculates an index for adaptively controlling the amount of quantization noise generated for each of the plurality of processing sections set by the processing section determination section. An index that is optimal for the processing section is obtained, and optimal encoding processing is performed using the index. This has the effect of improving encoding quality or compressing efficiency.

【００９２】この発明によれば、処理区分決定部が、算
出された変換係数から最小単位区分毎の変換係数のパワ
ーを算出し、算出された変換係数のパワーの差分が所定
のしきい値内にある最小単位区分を同一の処理区分にま
とめることにより処理区分を設定することで、処理区分
内には一様な変換係数が揃うことになり、量子化効率を
向上させることができるという効果がある。According to the present invention, the processing section determination unit calculates the power of the transform coefficients for each minimum unit section from the calculated transform coefficients, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in calculated power is within a predetermined threshold. Each processing section therefore contains uniform transform coefficients, which improves quantization efficiency.

【００９３】この発明によれば、処理区分決定部が、算
出された変換係数から最小単位区分毎に変換係数のパワ
ーの最大値を検出し、検出された変換係数のパワーの最
大値の差分が所定のしきい値内にある最小単位区分を同
一の処理区分にまとめることにより処理区分を設定する
ことで、処理区分内には一様な変換係数が揃うことにな
り、量子化効率を向上させることができるという効果が
ある。According to the present invention, the processing section determination unit detects the maximum power of the transform coefficients for each minimum unit section from the calculated transform coefficients, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in detected maximum power is within a predetermined threshold. Each processing section therefore contains uniform transform coefficients, which improves quantization efficiency.

【００９４】この発明によれば、処理区分決定部が、聴
覚分析部によりオーディオ信号を分析した際に得られる
スペクトルから最小単位区分毎のスペクトルのパワーを
算出し、算出されたスペクトルのパワーの差分が所定の
しきい値内にある最小単位区分を同一の処理区分にまと
めることにより処理区分を設定することで、処理区分内
には一様なスペクトルが揃うことになり、量子化効率を
向上させることができるという効果がある。According to the present invention, the processing section determination unit calculates the spectral power for each minimum unit section from the spectrum obtained when the auditory analysis unit analyzes the audio signal, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in calculated spectral power is within a predetermined threshold. Each processing section therefore contains a uniform spectrum, which improves quantization efficiency.

【００９５】この発明によれば、処理区分決定部が、外
部から与えられる符号化ビットレートに応じて、符号化
ビットレートが低いほど区分数を少なく、符号化ビット
レートが高いほど区分数を多くなるように処理区分を設
定することにより、補助情報に必要なビット量を調節
し、生成される符号化ストリームに占める補助情報の割
合を一様にすることが可能になり、特に低ビットレート
符号化の場合に、変換係数等の主情報に必要なビット量
を確保することができ、符号化品質の劣化を防ぐことが
できるという効果がある。According to the present invention, the processing section determination unit sets the processing sections according to an externally supplied coding bit rate so that the number of sections is smaller the lower the coding bit rate and larger the higher the coding bit rate. This makes it possible to adjust the bit amount required for the auxiliary information and to make the proportion of auxiliary information in the generated coded stream uniform. In low bit rate encoding in particular, the bit amount required for main information such as transform coefficients can be secured, which prevents degradation of coding quality.

【００９６】この発明によれば、最適符号化処理部によ
り出力された変換係数及び補助情報に必要なそれぞれの
ビット量を求め、補助情報に必要なビット量の全体のビ
ット量に対する割合が所定のしきい値より多い場合に、
補助情報に必要なビット量を少なくするために、より少
ない区分数になるように複数の処理区分を再設定するよ
う処理区分決定部に指示する情報量判定部を備えたこと
により、補助情報の占める割合が大きく変換係数等の主
情報に必要なビット数が不足するような低ビットレート
での符号化の場合にのみ、処理区分化を行って符号化効
率を向上させることができると共に、低ビットレートで
はない符号化の場合には、帯域毎の適応制御結果を優先
して採用することにより、無駄な処理区分の見直しにお
ける処理の実行を避けることができるという効果があ
る。According to the present invention, an information amount determination unit is provided that obtains the bit amounts required for the transform coefficients and the auxiliary information output by the optimum encoding processing unit and, when the ratio of the bit amount required for the auxiliary information to the total bit amount exceeds a predetermined threshold, instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections so as to reduce the bit amount required for the auxiliary information. As a result, grouping into processing sections is performed, improving coding efficiency, only for low bit rate encoding, where auxiliary information accounts for a large proportion and the number of bits needed for main information such as transform coefficients would otherwise be insufficient. For encoding that is not at a low bit rate, the adaptive control result for each band is adopted preferentially, which avoids executing needless reviews of the processing sections.

【００９７】この発明によれば、情報量判定部が、補助
情報に必要なビット量の全体のビット量に対する割合が
符号化ビットレートごとに定められた所定のしきい値よ
り多い場合に、より少ない区分数になるように複数の処
理区分を再設定するよう処理区分決定部に指示すること
により、補助情報の占める割合が大きく変換係数等の主
情報に必要なビット数が不足するような低ビットレート
での符号化の場合にのみ、処理区分化を行って符号化効
率を向上させることができると共に、低ビットレートで
はない符号化の場合には、帯域毎の適応制御結果を優先
して採用することにより、無駄な処理区分の見直しにお
ける処理の実行を避けることができるという効果があ
る。According to the present invention, the information amount determination unit instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections when the ratio of the bit amount required for the auxiliary information to the total bit amount exceeds a predetermined threshold defined for each coding bit rate. As a result, grouping into processing sections is performed, improving coding efficiency, only for low bit rate encoding, where auxiliary information accounts for a large proportion and the number of bits needed for main information such as transform coefficients would otherwise be insufficient. For encoding that is not at a low bit rate, the adaptive control result for each band is adopted preferentially, which avoids executing needless reviews of the processing sections.

[Brief description of the drawings]

【図１】この発明の実施の形態１によるオーディオ符
号化装置の構成を示すブロック図である。FIG. 1 is a block diagram illustrating a configuration of an audio encoding device according to Embodiment 1 of the present invention.

【図２】この発明の実施の形態１による、変換係数を
複数の帯域に区分するための区分規定テーブルを示す図
である。FIG. 2 is a diagram showing a partition definition table for partitioning a transform coefficient into a plurality of bands according to the first embodiment of the present invention.

【図３】この発明の実施の形態１による最適符号化処
理部の構成を示すブロック図である。FIG. 3 is a block diagram showing a configuration of an optimal encoding processing unit according to Embodiment 1 of the present invention.

【図４】この発明の実施の形態１による帯域区分を２
個ずつに再区分化した処理区分テーブルを示す図であ
る。FIG. 4 is a diagram showing a processing section table in which the band sections are regrouped two at a time, according to Embodiment 1 of the present invention.

【図５】この発明の実施の形態１による帯域区分が２
個の場合の処理区分ｎｂを適用して最適符号化処理部に
よって得られたハフマン符号テーブルの結果を示す図で
ある。FIG. 5 is a diagram showing the Huffman code table results obtained by the optimum encoding processing unit when the processing sections nb for the two-band-section case are applied, according to Embodiment 1 of the present invention.

【図６】この発明の実施の形態１による多重化部がハ
フマン符号テーブルの情報を多重した結果を示す図であ
る。FIG. 6 is a diagram illustrating a result of multiplexing information of a Huffman code table by the multiplexing unit according to the first embodiment of the present invention.

【図７】この発明の実施の形態１による帯域区分が２
個の場合の処理区分ｎｂを適用して最適符号化処理部に
よって得られたスケーリング係数の結果を示す図であ
る。FIG. 7 is a diagram showing the scaling coefficient results obtained by the optimum encoding processing unit when the processing sections nb for the two-band-section case are applied, according to Embodiment 1 of the present invention.

【図８】この発明の実施の形態２によるオーディオ符
号化装置の構成を示すブロック図である。FIG. 8 is a block diagram showing a configuration of an audio encoding device according to Embodiment 2 of the present invention.

【図９】この発明の実施の形態２による、ある帯域区
分におけるマスキングしきい値算出方法の一例を示す図
である。FIG. 9 is a diagram illustrating an example of a method of calculating a masking threshold in a certain band section according to the second embodiment of the present invention.

【図１０】この発明の実施の形態２による、ある処理
区分におけるマスキングしきい値算出方法の一例を示す
図である。FIG. 10 is a diagram showing an example of a masking threshold calculation method in a certain processing section according to the second embodiment of the present invention.

【図１１】この発明の実施の形態３によるオーディオ
符号化装置の構成を示すブロック図である。FIG. 11 is a block diagram showing a configuration of an audio encoding device according to Embodiment 3 of the present invention.

【図１２】この発明の実施の形態３による隣接する帯
域区分間の変換係数のパワーの差分値を示す図である。FIG. 12 is a diagram showing power difference values of transform coefficients between adjacent band sections according to Embodiment 3 of the present invention.

【図１３】この発明の実施の形態３によるオーディオ
符号化装置の他の構成を示すブロック図である。FIG. 13 is a block diagram showing another configuration of the audio encoding device according to the third embodiment of the present invention.

【図１４】この発明の実施の形態４によるオーディオ
符号化装置の構成を示すブロック図である。FIG. 14 is a block diagram showing a configuration of an audio encoding device according to Embodiment 4 of the present invention.

【図１５】この発明の実施の形態４による処理区分決
定部の構成を示すブロック図である。FIG. 15 is a block diagram showing a configuration of a processing division determination unit according to Embodiment 4 of the present invention.

【図１６】この発明の実施の形態５によるオーディオ
符号化装置の構成を示すブロック図である。FIG. 16 is a block diagram showing a configuration of an audio encoding device according to Embodiment 5 of the present invention.

【図１７】この発明の実施の形態５による、符号化ビ
ットレート、全体のビット量、補助情報の上限における
割合の関係を示す図である。FIG. 17 is a diagram illustrating a relationship among an encoding bit rate, an overall bit amount, and a ratio of an upper limit of auxiliary information according to Embodiment 5 of the present invention.

【図１８】従来のオーディオ符号化装置の構成を示す
ブロック図である。FIG. 18 is a block diagram illustrating a configuration of a conventional audio encoding device.

【図１９】従来の最適符号化処理部から出力された、
帯域区分毎に選択されたハフマン符号テーブルの様子の
一例を示す図である。FIG. 19 is a diagram showing an example of the Huffman code tables selected for each band section, as output from a conventional optimum encoding processing unit.

【図２０】従来の多重化部によるハフマン符号テーブ
ルの情報の多重される順序を示す図である。FIG. 20 is a diagram illustrating the order in which information of the Huffman code table is multiplexed by a conventional multiplexing unit.

【図２１】従来の最適符号化処理部から出力された、
帯域区分毎に選択されたスケーリング係数の様子の一例
を示す図である。FIG. 21 is a diagram showing an example of the scaling coefficients selected for each band section, as output from a conventional optimum encoding processing unit.

【図２２】従来における、ｄｉｆｆ［ｓｂ］をハフマ
ン符号化する際に使用するハフマン符号テーブルの例を
示す図である。FIG. 22 is a diagram illustrating an example of a conventional Huffman code table used for Huffman coding of diff [sb].

[Explanation of symbols]

１直交変換部、２聴覚分析部、３最適符号化処理
部、４多重化部、５処理区分決定部、６情報量判定
部、９入力端子、１０出力端子、２１聴覚分析部、
５１処理区分決定部、５２処理区分決定部、５３
処理区分決定部、９１制御端子、３０１正規化部、
３０２量子化部、３０３ハフマン符号化部、３０４
レート／歪み制御部、５０１処理区分テーブル群
（１）、５０２処理区分テーブル群（２）、５０Ｎ
処理区分テーブル群（Ｎ）、５１０テーブル群選択
部、５１１切替器。1: orthogonal transform unit; 2: auditory analysis unit; 3: optimum encoding processing unit; 4: multiplexing unit; 5: processing section determination unit; 6: information amount determination unit; 9: input terminal; 10: output terminal; 21: auditory analysis unit; 51: processing section determination unit; 52: processing section determination unit; 53: processing section determination unit; 91: control terminal; 301: normalization unit; 302: quantization unit; 303: Huffman encoding unit; 304: rate/distortion control unit; 501: processing section table group (1); 502: processing section table group (2); 50N: processing section table group (N); 510: table group selection unit; 511: switch.

Claims

[Claims]

1. An audio encoding device comprising: an auditory analysis unit that analyzes an input audio signal based on human auditory characteristics and calculates an index for adaptively controlling the amount of quantization noise generated, such as a masking threshold or an allowable noise amount; an orthogonal transform unit that orthogonally transforms the input audio signal according to the transform block size output from the auditory analysis unit to calculate transform coefficients; a processing section determination unit that sets a plurality of processing sections on the frequency axis, used when encoding the transform coefficients, so that their number is smaller than that of predetermined minimum unit sections; an optimum encoding processing unit that, based on the index calculated by the auditory analysis unit, encodes the transform coefficients calculated by the orthogonal transform unit for each of the plurality of processing sections set by the processing section determination unit, and outputs the encoded transform coefficients and the accompanying auxiliary information used in the encoding; and a multiplexing unit that multiplexes the transform coefficients and the auxiliary information output by the optimum encoding processing unit.

2. The audio encoding device according to claim 1, wherein the processing section determination unit groups prescribed n minimum unit sections k at a time (k < n) and sets each group as one processing section.

3. The audio encoding apparatus according to claim 1, wherein the processing section determination section sets the processing section so that the number of transform coefficients belonging to the processing section becomes uniform.

4. The audio encoding device, wherein the auditory analysis unit calculates the index for adaptively controlling the amount of quantization noise generated for each of the plurality of processing sections set by the processing section determination unit.

5. The audio encoding device according to claim 1, wherein the processing section determination unit calculates the power of the transform coefficients for each minimum unit section from the transform coefficients calculated by the orthogonal transform unit, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in calculated power is within a predetermined threshold.

6. The audio encoding device according to claim 1, wherein the processing section determination unit detects the maximum power of the transform coefficients for each minimum unit section from the transform coefficients calculated by the orthogonal transform unit, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in detected maximum power is within a predetermined threshold.

7. The audio encoding device according to claim 1, wherein the processing section determination unit calculates the spectral power for each minimum unit section from the spectrum obtained when the auditory analysis unit analyzes the audio signal, and sets the processing sections by grouping into the same processing section those minimum unit sections whose difference in calculated spectral power is within a predetermined threshold.

8. The audio encoding device according to claim 1, wherein the processing section determination unit sets the processing sections according to an externally supplied coding bit rate so that the number of sections is smaller the lower the coding bit rate and larger the higher the coding bit rate.

9. The audio encoding device according to claim 1, further comprising an information amount determination unit that obtains the bit amounts required for the transform coefficients and the auxiliary information output by the optimum encoding processing unit and, when the ratio of the bit amount required for the auxiliary information to the total bit amount exceeds a predetermined threshold, instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections so as to reduce the bit amount required for the auxiliary information.

10. The audio encoding device according to claim 9, wherein the information amount determination unit instructs the processing section determination unit to reset the plurality of processing sections to a smaller number of sections when the ratio of the bit amount required for the auxiliary information to the total bit amount exceeds a predetermined threshold defined for each coding bit rate.