JP2013195706A

JP2013195706A - Audio coding device, audio coding method, computer program for audio coding, audio decoding device, audio decoding method, and computer program for audio decoding

Info

Publication number: JP2013195706A
Application number: JP2012062767A
Authority: JP
Inventors: Masanao Suzuki; 政直鈴木; Yohei Kishi; 洋平岸; Shunsuke Takeuchi; 俊輔武内; Miyuki Shirakawa; 美由紀白川; Akira Nakagawa; 章中川
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2012-03-19
Filing date: 2012-03-19
Publication date: 2013-09-30
Anticipated expiration: 2032-03-19
Also published as: JP5990954B2

Abstract

PROBLEM TO BE SOLVED: To provide an audio coding device capable of reducing the amount of bits assigned to prediction coefficients. SOLUTION: An audio coding device 1 calculates the pair of first and second prediction coefficients that minimizes the error between a signal of a third channel and the linear sum of a value obtained by multiplying a signal of a first channel by the first prediction coefficient and a value obtained by multiplying a signal of a second channel by the second prediction coefficient. If one of the first and second prediction coefficients does not influence the minimum value of the error, or if the value of one of the prediction coefficients in the error-minimizing pair lies outside the quantization value range that includes the plurality of quantization values defined by the codebook for that prediction coefficient, the audio coding device selects a codebook for the other of the first and second prediction coefficients. For the prediction coefficient for which a codebook is selected, the device obtains, from the plurality of quantization values defined in that codebook, the quantization value that minimizes the error.

Description

The present invention relates to, for example, an audio encoding device, an audio encoding method, and a computer program for audio encoding. The present invention also relates to, for example, an audio decoding device, an audio decoding method, and a computer program for audio decoding.

Audio signal encoding methods have been developed for compressing the data amount of multi-channel audio signals having three or more channels. One such method is the MPEG Surround method standardized by the Moving Picture Experts Group (MPEG). In the MPEG Surround method, for example, a 5.1-channel (5.1ch) audio signal to be encoded is time-frequency transformed, and the resulting frequency signals are downmixed to first produce a three-channel frequency signal. The three-channel frequency signal is then downmixed again to calculate a frequency signal corresponding to a two-channel stereo signal. The frequency signal corresponding to the stereo signal is encoded by the Advanced Audio Coding (AAC) encoding method and the Spectral Band Replication (SBR) encoding method.
In addition, in the MPEG Surround method, spatial information representing the spread or localization of sound is calculated when the 5.1ch signal is downmixed to a three-channel signal and when the three-channel signal is downmixed to a two-channel signal, and this spatial information is encoded. Thus, in the MPEG Surround method, the stereo signal generated by downmixing the multi-channel audio signal is encoded together with spatial information that has a relatively small data amount. As a result, the MPEG Surround method achieves higher compression efficiency than encoding the signal of each channel of the multi-channel audio signal independently.

In the MPEG Surround method, prediction coefficients (channel prediction coefficients) are used to encode the spatial information calculated when the stereo frequency signal is generated (see, for example, Patent Document 1). A prediction coefficient is a coefficient for predictively encoding the signal of one of the three channels based on the signals of the other two channels. One prediction coefficient is calculated for each of the other two channels. For each of the two prediction coefficients, a plurality of quantized values and their corresponding index values are stored in a table called a codebook, and the quantized value closest to each prediction coefficient is selected. The index value corresponding to the selected quantized value is then variable-length encoded. The codebook is used to improve bit efficiency. Because the encoder and the decoder share a predetermined codebook (or a codebook created by a common method), the encoder can send more important information to the decoder with a small number of bits. The decoder reproduces the signal of one of the three channels based on the prediction coefficients. For this reason, the encoder needs to select the optimal prediction coefficients from the codebook.

Patent Document 1: Japanese National Publication of International Patent Application (Tokuhyo) No. 2008-517338

For each of the two prediction coefficients, as the number of quantized values increases, the number of codes corresponding to the quantized values increases accordingly. As the number of codes increases, the average length of the individual codes also becomes longer in order to maintain orthogonality between the codes, and as a result the amount of bits allocated to the two prediction coefficients must be increased. Therefore, to increase the compression efficiency of audio data, there is a need for a technique that can reduce the amount of bits allocated to the two prediction coefficients.

Accordingly, an object of the present specification is to provide an audio encoding device capable of reducing the amount of bits allocated to prediction coefficients, and an audio decoding device that decodes an audio signal encoded by such an audio encoding device.

According to one embodiment, there is provided an audio encoding device that predictively encodes a signal of a third channel among a plurality of channels included in an audio signal, based on a signal of a first channel and a signal of a second channel among the plurality of channels, a first prediction coefficient by which the first channel signal is multiplied, and a second prediction coefficient by which the second channel signal is multiplied. The audio encoding device includes the following units.
A minimum error prediction coefficient calculation unit calculates the pair of first values of the first and second prediction coefficients at which the error between the third channel signal and its predicted value is minimized, where the predicted value is the linear sum of the value obtained by multiplying the first channel signal by the first prediction coefficient and the value obtained by multiplying the second channel signal by the second prediction coefficient.
A codebook selection unit selects a codebook for the other of the first and second prediction coefficients when one of the first and second prediction coefficients does not affect the minimum value of the error, or when the first value of one of the prediction coefficients in the error-minimizing pair lies outside the range of quantization values that includes the plurality of quantization values defined in the codebook for that prediction coefficient. On the other hand, when both the first and second prediction coefficients affect the minimum value of the error, and the first value of the first prediction coefficient and the first value of the second prediction coefficient in the error-minimizing pair each lie within the range of quantization values that includes the plurality of quantization values defined in the codebook for the respective prediction coefficient, the codebook selection unit selects a codebook for each of the first and second prediction coefficients.
A prediction coefficient encoding unit obtains, for each prediction coefficient for which a codebook has been selected, the quantization value that minimizes the error among the plurality of quantization values defined in that codebook, and obtains an encoded prediction coefficient by encoding that quantization value.

According to another embodiment, there is provided an audio decoding device that decodes an audio signal from encoded audio data. The encoded audio data stores, in accordance with a predetermined data format: encoded channel signal data in which the signals of first and second channels among a plurality of channels included in the audio signal are encoded; encoded prediction coefficients in which first and second prediction coefficients, used to predict the signal of a third channel among the plurality of channels from the first and second channel signals, are encoded; and codebook selection information indicating which codebook has been selected from a first codebook that defines a plurality of quantization values for the first prediction coefficient and a second codebook that defines a plurality of quantization values for the second prediction coefficient.
The audio decoding device includes: a separation unit that extracts the encoded channel signal data, the encoded prediction coefficients, and the codebook selection information from the encoded audio data in accordance with the data format; a channel signal decoding unit that reproduces the first and second channel signals by decoding the encoded channel signal data; a prediction coefficient decoding unit that reproduces the first and second prediction coefficients by identifying, among the plurality of quantization values defined in the codebook that the codebook selection information indicates as selected from the first and second codebooks, the quantization value corresponding to the encoded prediction coefficient; and a prediction decoding unit that obtains a first value by multiplying the reproduced first prediction coefficient by the first channel signal, obtains a second value by multiplying the reproduced second prediction coefficient by the second channel signal, and reproduces the sum of the first value and the second value as the third channel signal.

The objects and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory, and are not restrictive of the invention as claimed.

The audio encoding device and the audio decoding device disclosed in this specification can reduce the amount of bits allocated to prediction coefficients.

FIG. 1 is a schematic configuration diagram of an audio encoding device according to one embodiment.
A diagram showing an example of a quantization table for similarity.
A diagram showing an example of a table that relates index difference values to similarity codes.
A diagram showing an example of a quantization table for intensity differences.
A diagram showing an example of a data format in which an encoded audio signal is stored.
A configuration diagram of the prediction encoding unit.
A conceptual diagram of a parabolic-cylinder-shaped prediction error distribution, with the prediction coefficients c₁ and c₂ and the prediction error d plotted on mutually orthogonal coordinate axes.
A conceptual diagram of an elliptic-paraboloid-shaped prediction error distribution, with the prediction coefficients c₁ and c₂ and the prediction error d plotted on mutually orthogonal coordinate axes.
A conceptual diagram showing, for the case where the distribution shape of the prediction error d is elliptic, the positional relationship between the prediction coefficients c_1min and c_2min at which the prediction error d is minimized and the ranges of values of the prediction coefficients c₁ and c₂.
A diagram showing, for the case where the distribution shape of the prediction error d is elliptic, the relationship between the point of contact of the prediction error surface with the range of quantization values of the prediction coefficients defined in the codebooks, and the codebook that is selected.
An operation flowchart of the codebook selection process for the case where the distribution shape of the prediction error d is elliptic.
A diagram showing, for the case where the distribution shape of the prediction error d is parabolic, the relationship between the prediction coefficients c_1min and c_2min corresponding to the minimum value of the prediction error d and the ranges of values of the prediction coefficients.
An operation flowchart of the codebook selection process for the case where the distribution shape of the prediction error d is parabolic.
A diagram showing an example of a codebook that stores quantization values of a prediction coefficient.
An operation flowchart of the audio encoding process.
A schematic configuration diagram of an audio decoding device according to one embodiment.
An operation flowchart of the audio decoding process executed by the audio decoding device.

Hereinafter, an audio encoding device according to one embodiment will be described with reference to the drawings. This audio encoding device predicts the frequency signal of one of three channels as the linear sum of the values obtained by multiplying the frequency signals of the other two channels by prediction coefficients. The audio encoding device then estimates the distribution shape of the prediction error between the frequency signal of the predicted channel and that linear sum, and determines, for each prediction coefficient, whether or not to use a codebook according to the distribution shape and the prediction coefficients corresponding to the minimum value of the prediction error.
In the present embodiment, the multi-channel audio signal to be encoded is a 5.1ch audio signal.

FIG. 1 is a schematic configuration diagram of an audio encoding device 1 according to one embodiment. As shown in FIG. 1, the audio encoding device 1 includes a time-frequency conversion unit 11, a first downmix unit 12, a second downmix unit 13, a prediction encoding unit 14, a spatial information encoding unit 15, a channel signal encoding unit 16, and a multiplexing unit 17.

Each of these units of the audio encoding device 1 is formed as a separate circuit. Alternatively, these units may be implemented in the audio encoding device 1 as a single integrated circuit in which the circuits corresponding to the respective units are integrated. Furthermore, each of these units may be a functional module realized by a computer program executed on a processor of the audio encoding device 1.

The time-frequency conversion unit 11 converts the time-domain signal of each channel of the multi-channel audio signal input to the audio encoding device 1 into a frequency signal of that channel by performing a time-frequency transform frame by frame.
In the present embodiment, the time-frequency conversion unit 11 converts the signal of each channel into a frequency signal using the Quadrature Mirror Filter (QMF) filter bank given by the following equation.

Here, n is a variable representing time, and denotes the n-th time position when one frame of the audio signal is divided into 128 equal parts in the time direction. The frame length can be, for example, any value from 10 to 80 msec. k is a variable representing the frequency band, and denotes the k-th frequency band when the frequency range of the frequency signal is divided into 64 equal parts. QMF(k,n) is the QMF for outputting the frequency signal at time n and frequency k. The time-frequency conversion unit 11 generates the frequency signal of a channel by multiplying one frame of the input audio signal of that channel by QMF(k,n).
Note that the time-frequency conversion unit 11 may instead convert the signal of each channel into a frequency signal using another time-frequency transform, such as the fast Fourier transform, the discrete cosine transform, or the modified discrete cosine transform.
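The per-frame transform described above can be sketched roughly as follows. This is a minimal illustration only: the kernel form exp(jπ/128·(k+0.5)(2n+1)) is an assumption, since the equation itself is not reproduced in this text, and the names `qmf_kernel` and `time_to_frequency` are hypothetical.

```python
import cmath
import math

NUM_BANDS = 64    # frequency bands per frame (from the text)
NUM_SLOTS = 128   # time positions per frame (from the text)


def qmf_kernel(k: int, n: int) -> complex:
    """Assumed complex-exponential form of QMF(k, n); not confirmed by the text."""
    return cmath.exp(1j * math.pi / 128.0 * (k + 0.5) * (2 * n + 1))


def time_to_frequency(frame):
    """Multiply one frame of NUM_SLOTS real samples by QMF(k, n).

    Returns spectrum[k][n], the frequency signal of band k at time n.
    """
    if len(frame) != NUM_SLOTS:
        raise ValueError("expected one frame of 128 samples")
    return [
        [frame[n] * qmf_kernel(k, n) for n in range(NUM_SLOTS)]
        for k in range(NUM_BANDS)
    ]
```

Each output value is complex, which is why the later downmix steps treat real and imaginary parts separately.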

Each time the time-frequency conversion unit 11 calculates the frequency signals of the channels for a frame, it outputs those frequency signals to the first downmix unit 12.

Each time the first downmix unit 12 receives the frequency signals of the channels, it downmixes them to generate frequency signals of a left channel, a center channel, and a right channel. For example, the first downmix unit 12 calculates the frequency signals of these three channels according to the following equations.

Here, L_Re(k,n) represents the real part of the left front channel frequency signal L(k,n), and L_Im(k,n) represents its imaginary part. SL_Re(k,n) represents the real part of the left rear channel frequency signal SL(k,n), and SL_Im(k,n) represents its imaginary part. L_in(k,n) is the left channel frequency signal generated by the downmix. L_inRe(k,n) represents the real part of the left channel frequency signal, and L_inIm(k,n) represents its imaginary part.
Similarly, R_Re(k,n) represents the real part of the right front channel frequency signal R(k,n), and R_Im(k,n) represents its imaginary part. SR_Re(k,n) represents the real part of the right rear channel frequency signal SR(k,n), and SR_Im(k,n) represents its imaginary part. R_in(k,n) is the right channel frequency signal generated by the downmix. R_inRe(k,n) represents the real part of the right channel frequency signal, and R_inIm(k,n) represents its imaginary part.
Furthermore, C_Re(k,n) represents the real part of the center channel frequency signal C(k,n), and C_Im(k,n) represents its imaginary part. LFE_Re(k,n) represents the real part of the low-frequency effects (deep bass) channel frequency signal LFE(k,n), and LFE_Im(k,n) represents its imaginary part. C_in(k,n) is the center channel frequency signal generated by the downmix. C_inRe(k,n) represents the real part of the center channel frequency signal C_in(k,n), and C_inIm(k,n) represents its imaginary part.
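The channel pairing described above (L with SL, R with SR, C with LFE) can be illustrated with a short sketch. It assumes that each downmixed channel is the unweighted sum of its two source channels, with real and imaginary parts added separately; the actual equations are not reproduced in this text, so any weighting they contain is not reflected here. The function name `first_downmix` is hypothetical.

```python
def first_downmix(L, SL, R, SR, C, LFE):
    """Downmix six frequency-domain channels to left, right, and center.

    Each argument is a list of complex values, one per (k, n) point.
    Assumption: plain addition of each channel pair, no gain factors.
    Adding complex numbers adds the real and imaginary parts separately,
    e.g. L_inRe = L_Re + SL_Re and L_inIm = L_Im + SL_Im.
    """
    l_in = [a + b for a, b in zip(L, SL)]
    r_in = [a + b for a, b in zip(R, SR)]
    c_in = [a + b for a, b in zip(C, LFE)]
    return l_in, r_in, c_in
```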

Furthermore, as spatial information between the frequency signals of the two channels being downmixed, the first downmix unit 12 calculates, for each frequency band, the intensity difference between the frequency signals, which is information representing the localization of sound, and the similarity between the frequency signals, which is information representing the spread of sound. The spatial information calculated by the first downmix unit 12 is an example of three-channel spatial information. In the present embodiment, the first downmix unit 12 calculates the intensity difference CLD_L(k) and the similarity ICC_L(k) of frequency band k for the left channel according to the following equations.

Here, N is the number of sample points in the time direction included in one frame; in this embodiment, N is 128. e_L(k) is the autocorrelation value of the left front channel frequency signal L(k,n), and e_SL(k) is the autocorrelation value of the left rear channel frequency signal SL(k,n). e_LSL(k) is the cross-correlation value between the left front channel frequency signal L(k,n) and the left rear channel frequency signal SL(k,n).
Similarly, the first downmix unit 12 calculates the intensity difference CLD_R(k) and the similarity ICC_R(k) of frequency band k for the right channel according to the following equations.

e_R(k) is the autocorrelation value of the right front channel frequency signal R(k,n), and e_SR(k) is the autocorrelation value of the right rear channel frequency signal SR(k,n). e_RSR(k) is the cross-correlation value between the right front channel frequency signal R(k,n) and the right rear channel frequency signal SR(k,n).
Furthermore, the first downmix unit 12 calculates the intensity difference CLD_C(k) of frequency band k for the center channel according to the following equation.

e_C(k) is the autocorrelation value of the center channel frequency signal C(k,n), and e_LFE(k) is the autocorrelation value of the low-frequency effects (deep bass) channel frequency signal LFE(k,n).
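As a hedged illustration of how an intensity difference and a similarity could be derived from the autocorrelation and cross-correlation values defined above, the sketch below uses definitions that are common for this kind of spatial parameter: energy e_X = Σ|X(k,n)|², cross-correlation e_XY = Σ X(k,n)·Y*(k,n), CLD = 10·log10(e_X/e_Y), and ICC = Re(e_XY)/√(e_X·e_Y). These formulas are assumptions, because the equations are not reproduced in this text, and all function names are hypothetical.

```python
import math


def band_energy(x):
    """Autocorrelation value of one band: sum over n of |x(k, n)|^2."""
    return sum(abs(v) ** 2 for v in x)


def cross_correlation(x, y):
    """Cross-correlation value of one band: sum over n of x(k, n) * conj(y(k, n))."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def cld_db(x, y):
    """Assumed intensity difference in dB; undefined if either band has zero energy."""
    return 10.0 * math.log10(band_energy(x) / band_energy(y))


def icc(x, y):
    """Assumed similarity: normalized real part of the cross-correlation."""
    return cross_correlation(x, y).real / math.sqrt(band_energy(x) * band_energy(y))
```

With these definitions the similarity is +1 for two signals that differ only by a positive gain and -1 when one is the sign-inverted copy of the other.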

Each time the first downmix unit 12 generates the three-channel frequency signals, it outputs them to the second downmix unit 13 and outputs the spatial information to the spatial information encoding unit 15.

The second downmix unit 13 generates the left frequency signal of the stereo frequency signal by downmixing the left channel frequency signal and the center channel frequency signal received from the first downmix unit 12. The second downmix unit 13 also generates the right frequency signal of the stereo frequency signal by downmixing the right channel frequency signal and the center channel frequency signal received from the first downmix unit 12.
For example, the second downmix unit 13 generates the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) of the stereo frequency signal according to the following equation. The second downmix unit 13 further calculates, according to the following equation, the center channel signal C_p0(k,n) that is used to select the prediction coefficients contained in the codebooks.

Here, L_in(k,n), R_in(k,n), and C_in(k,n) are the left channel, right channel, and center channel frequency signals generated by the first downmix unit 12, respectively. The left frequency signal L_p0(k,n) is a combination of the frequency signals of the left front, left rear, center, and low-frequency effects channels of the original multi-channel audio signal. Similarly, the right frequency signal R_p0(k,n) is a combination of the frequency signals of the right front, right rear, center, and low-frequency effects channels of the original multi-channel audio signal.

The second downmix unit 13 outputs the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) of the stereo frequency signal to the channel signal encoding unit 16. The second downmix unit 13 also outputs the center channel frequency signal C_p0(k,n), together with the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n), to the prediction encoding unit 14.

For each frequency band, the prediction encoding unit 14 obtains prediction coefficients C₁(k) and C₂(k) that approximately represent C_p0(k,n) using L_p0(k,n) and R_p0(k,n). The prediction encoding unit 14 then obtains prediction coefficient codes idxc₁(k) and idxc₂(k) by variable-length encoding the index values corresponding to the quantized values of the prediction coefficients C₁(k) and C₂(k), and outputs the prediction coefficient codes idxc₁(k) and idxc₂(k) to the spatial information encoding unit 15. The prediction encoding unit 14 further outputs, for each frequency band, codebook selection information indicating the codebooks used to obtain the prediction coefficients C₁(k) and C₂(k) to the multiplexing unit 17. Details of the prediction encoding unit 14 will be described later.
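To make the idea of approximating C_p0(k,n) from L_p0(k,n) and R_p0(k,n) concrete, the following sketch finds the unquantized pair of coefficients that minimizes the squared prediction error in one frequency band. It is a sketch under stated assumptions, not the procedure of the prediction encoding unit 14: it assumes real-valued coefficients and an ordinary least-squares solution, and the names `inner`, `min_error_coefficients`, and `prediction_error` are hypothetical.

```python
def inner(x, y):
    """Real part of the sum over n of x(n) * conj(y(n))."""
    return sum((a * b.conjugate()).real for a, b in zip(x, y))


def min_error_coefficients(l0, r0, c0, eps=1e-12):
    """Return real (c1, c2) minimizing sum |c0 - c1*l0 - c2*r0|^2.

    Returns None when the normal equations are singular, i.e. the minimum
    is not attained at a single point in the (c1, c2) plane.
    The absolute tolerance eps is an illustrative choice.
    """
    ll, rr, lr = inner(l0, l0), inner(r0, r0), inner(l0, r0)
    lc, rc = inner(l0, c0), inner(r0, c0)
    det = ll * rr - lr * lr
    if abs(det) <= eps:
        return None
    c1 = (lc * rr - lr * rc) / det
    c2 = (rc * ll - lr * lc) / det
    return c1, c2


def prediction_error(l0, r0, c0, c1, c2):
    """Squared error between c0 and the linear sum c1*l0 + c2*r0."""
    return sum(abs(c - c1 * l - c2 * r) ** 2 for l, r, c in zip(l0, r0, c0))
```

The singular case corresponds to an error surface without a unique minimum point, which is the kind of situation in which one coefficient may not affect the minimum error.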

The spatial information encoding unit 15 encodes the spatial information received from the first downmix unit 12. The spatial information encoding unit 15 then generates an MPEG Surround code (hereinafter referred to as an MPS code) by multiplexing the encoded spatial information with the prediction coefficient codes idxc₁(k) and idxc₂(k) received from the prediction encoding unit 14.

The spatial information encoding unit 15 refers to a quantization table that maps similarity values in the spatial information to index values. By referring to this table, the spatial information encoding unit 15 determines, for each frequency band, the index value whose representative value is closest to each similarity ICC_i(k) (i = L, R, 0). The quantization table is stored in advance in a memory of the spatial information encoding unit 15.

FIG. 2 shows an example of a quantization table for the similarity. In the quantization table 200 shown in FIG. 2, each cell in the upper row 210 holds an index value, and each cell in the lower row 220 holds the representative similarity value for the index value in the same column. The similarity can take values in the range from -0.99 to +1. For example, when the similarity for frequency band k is 0.6, the representative value for index value 3 is the closest to it in the quantization table 200. The spatial information encoding unit 15 therefore sets the index value for frequency band k to 3.
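The nearest-representative lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table values in `ICC_TABLE` and the function name `quantize_to_index` are assumptions made for this sketch (FIG. 2 itself is not reproduced here), chosen only so that they span -0.99 to +1 and place 0.6 nearest to index 3, as the text states.

```python
# Illustrative similarity quantization table: list position = index value,
# entry = representative similarity. The values are an assumption for this
# sketch and are not copied from FIG. 2.
ICC_TABLE = [1.0, 0.937, 0.84118, 0.60092, 0.36764, 0.0, -0.589, -0.99]


def quantize_to_index(value, table):
    """Return the index whose representative value is closest to `value`."""
    return min(range(len(table)), key=lambda i: abs(table[i] - value))


# Similarity 0.6 for some frequency band k.
icc_index = quantize_to_index(0.6, ICC_TABLE)
```

The same lookup applies to any scalar quantization table of this kind; only the table contents change.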

Next, for each frequency band, the spatial information encoding unit 15 obtains the difference between the index values of adjacent bands along the frequency direction. For example, if the index value for frequency band k is 3 and the index value for frequency band (k-1) is 0, the spatial information encoding unit 15 sets the index difference for frequency band k to 3.
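As a rough sketch of this differencing step, the function below subtracts each band's index from that of the band above it in frequency order. How the first band is handled is not stated in this passage, so the `first_reference` parameter (and the function name) is an assumption of this example.

```python
def index_differences(indices, first_reference=0):
    """Differences between consecutive band indices along the frequency axis.

    The first band is differenced against `first_reference`; that choice is
    an assumption of this sketch, not something the text specifies.
    """
    diffs = []
    previous = first_reference
    for idx in indices:
        diffs.append(idx - previous)
        previous = idx
    return diffs


# Band (k-1) has index 0 and band k has index 3, so the difference for k is 3.
example = index_differences([0, 3])
```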

The spatial information encoding unit 15 refers to an encoding table that maps index difference values to similarity codes. By referring to this table, the spatial information encoding unit 15 determines, for each frequency band of the similarity ICC_i(k) (i = L, R), the similarity code idxicc_i(k) (i = L, R) for the index difference value. The encoding table is stored in advance in a memory of the spatial information encoding unit 15. The similarity code can be a variable-length code, such as a Huffman code or an arithmetic code, in which difference values that occur more frequently are assigned shorter codes.

FIG. 3 shows an example of a table relating index difference values to similarity codes. In this example, the similarity code is a Huffman code. In the encoding table 300 shown in FIG. 3, each cell in the left column holds an index difference value, and each cell in the right column holds the similarity code for the index difference value in the same row. For example, when the index difference value for the similarity ICC_L(k) of frequency band k is 3, the spatial information encoding unit 15 refers to the encoding table 300 and sets the similarity code idxicc_L(k) for the similarity ICC_L(k) of frequency band k to "111110".
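The table lookup can be illustrated with a small prefix-free code. In the sketch below, only the entry mapping a difference of 3 to "111110" comes from the text; every other entry of `SIMILARITY_CODE_TABLE`, and both function names, are placeholders invented for illustration and do not reproduce FIG. 3.

```python
# Hypothetical prefix-free code table: index difference -> bit string.
# Only the entry for 3 ("111110") is taken from the text; the other
# entries are placeholders for this sketch.
SIMILARITY_CODE_TABLE = {
    0: "0",
    1: "10",
    -1: "110",
    2: "1110",
    -2: "11110",
    3: "111110",
    -3: "111111",
}


def encode_differences(diffs, table):
    """Concatenate the variable-length codes of a sequence of differences."""
    return "".join(table[d] for d in diffs)


def is_prefix_free(codes):
    """Check that no code word is a prefix of another (needed for decoding)."""
    ordered = sorted(codes)
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))
```

Shorter code words are given to small differences here on the assumption that they occur more often, which is the property the text attributes to the variable-length code.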

The spatial information encoding unit 15 also refers to a quantization table that maps intensity difference values to index values. By referring to this table, the spatial information encoding unit 15 determines, for each frequency band, the index value whose representative value is closest to the intensity difference CLD_j(k) (j = L, R, C). The spatial information encoding unit 15 then obtains, for each frequency band, the difference between the index values of adjacent bands along the frequency direction. For example, if the index value for frequency band k is 2 and the index value for frequency band (k-1) is 4, the spatial information encoding unit 15 sets the index difference for frequency band k to -2.

FIG. 4 shows an example of a quantization table for the intensity difference. In the quantization table 400 shown in FIG. 4, each cell in rows 410, 430, and 450 holds an index value, and each cell in rows 420, 440, and 460 holds the representative intensity difference value for the index value in the same column of rows 410, 430, and 450, respectively.
For example, when the intensity difference CLD_L(k) for frequency band k is 10.8 dB, the representative value for index value 5 is the closest to CLD_L(k) in the quantization table 400. The spatial information encoding unit 15 therefore sets the index value for CLD_L(k) to 5.

The spatial information encoding unit 15 refers to an encoding table that maps index difference values to intensity difference codes. By referring to this table, the spatial information encoding unit 15 determines the intensity difference code idxcld_j(k) (j = L, R, C) for the difference value of each frequency band k of the intensity difference CLD_j(k). Like the similarity code, the intensity difference code can be a variable-length code, such as a Huffman code or an arithmetic code, in which difference values that occur more frequently are assigned shorter codes.
The quantization table and the encoding table are stored in advance in a memory of the spatial information encoding unit 15.

The spatial information encoding unit 15 generates the MPS code using the similarity code idxicc_i(k), the intensity difference code idxcld_j(k), and the prediction coefficient code idxc_m(k). For example, the spatial information encoding unit 15 generates the MPS code by arranging the similarity code idxicc_i(k), the intensity difference code idxcld_j(k), and the prediction coefficient code idxc_m(k) in a predetermined order. This predetermined order is described in, for example, ISO/IEC 23003-1:2007.
The spatial information encoding unit 15 outputs the generated MPS code to the multiplexing unit 17.

The channel signal encoding unit 16 encodes the stereo frequency signal output from the second downmix unit 13. For this purpose, the channel signal encoding unit 16 includes an SBR encoding unit 161, a frequency-time conversion unit 162, and an AAC encoding unit 163.

Each time it receives a stereo frequency signal, the SBR encoding unit 161 encodes, for each channel, the high-frequency component of the stereo frequency signal, that is, the component contained in the high frequency band, according to the SBR encoding method. The SBR encoding unit 161 thereby generates an SBR code.
For example, as disclosed in Japanese Laid-open Patent Publication No. 2008-224902, the SBR encoding unit 161 replicates the low-frequency component of the frequency signal of each channel that is strongly correlated with the high-frequency component to be SBR encoded. The low-frequency component is the component of the frequency signal of each channel contained in a low frequency band below the high frequency band that contains the high-frequency component to be encoded by the SBR encoding unit 161, and it is encoded by the AAC encoding unit 163 described later. The SBR encoding unit 161 then adjusts the power of the replicated high-frequency component so that it matches the power of the original high-frequency component. The SBR encoding unit 161 also treats as auxiliary information any part of the original high-frequency component that differs so much from the low-frequency component that it cannot be approximated by replicating the low-frequency component. The SBR encoding unit 161 then encodes, by quantization, the information representing the positional relationship between the low-frequency component used for replication and the corresponding high-frequency component, the power adjustment amount, and the auxiliary information.
The SBR encoding unit 161 outputs the SBR code, which is the information encoded as described above, to the multiplexing unit 17.

Each time it receives a stereo frequency signal, the frequency-time conversion unit 162 converts the stereo frequency signal of each channel into a time-domain stereo signal. For example, when the time-frequency conversion unit 11 uses a QMF filter bank, the frequency-time conversion unit 162 performs frequency-time conversion of the stereo frequency signal of each channel using the complex QMF filter bank given by the following equation.

Here, IQMF(k,n) is a complex QMF with time n and frequency k as variables.

When the time-frequency conversion unit 11 uses another time-frequency transform, such as a fast Fourier transform, a discrete cosine transform, or a modified discrete cosine transform, the frequency-time conversion unit 162 uses the inverse of that transform.
The frequency-time conversion unit 162 outputs the stereo signal of each channel, obtained by frequency-time conversion of the frequency signal of each channel, to the AAC encoding unit 163.

Each time it receives the stereo signal of each channel, the AAC encoding unit 163 generates an AAC code by encoding the low-frequency component of the signal of each channel according to the AAC encoding method. For this purpose, the AAC encoding unit 163 can use, for example, the technique disclosed in Japanese Laid-open Patent Publication No. 2007-183528. Specifically, the AAC encoding unit 163 regenerates a stereo frequency signal by applying a discrete cosine transform to the received stereo signal of each channel. The AAC encoding unit 163 then calculates the perceptual entropy (PE) from the regenerated stereo frequency signal. The PE represents the amount of information needed to quantize a block so that the listener does not perceive noise. The PE takes a large value for sounds whose signal level changes in a short time, such as the attack of a percussion instrument. The AAC encoding unit 163 therefore shortens the window for blocks with a relatively large PE value and lengthens the window for blocks with a relatively small PE value. For example, a short window contains 256 samples, and a long window contains 2048 samples. The AAC encoding unit 163 converts the stereo signal of each channel into a set of MDCT coefficients by applying a modified discrete cosine transform (MDCT) to the stereo signal of each channel using a window of the determined length.
The AAC encoding unit 163 then quantizes the set of MDCT coefficients and variable-length codes the quantized set of MDCT coefficients.
The AAC encoding unit 163 outputs the variable-length coded set of MDCT coefficients and related information, such as quantization coefficients, to the multiplexing unit 17 as the AAC code.
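The window-length decision described above amounts to a threshold test on the PE. The sketch below shows only that decision. The window lengths are the example values from the text, whereas the threshold argument and the function name are assumptions of this example, since the text does not say how "relatively large" is judged.

```python
SHORT_WINDOW = 256   # samples, example value from the text
LONG_WINDOW = 2048   # samples, example value from the text


def choose_window_length(perceptual_entropy, threshold):
    """Pick a short window for high-PE (transient) blocks, else a long one.

    `threshold` is a hypothetical tuning parameter; the text does not
    specify how the decision boundary is set.
    """
    if perceptual_entropy > threshold:
        return SHORT_WINDOW
    return LONG_WINDOW
```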

The multiplexing unit 17 multiplexes the AAC code, the SBR code, and the MPS code by arranging them in a predetermined order. The multiplexing unit 17 then outputs the encoded audio signal generated by this multiplexing.
FIG. 5 shows an example of a data format in which the encoded audio signal is stored. In this example, the encoded audio signal is created according to the MPEG-4 ADTS (Audio Data Transport Stream) format.
In the encoded data sequence 500 shown in FIG. 5, the AAC code is stored in the data block 510. The SBR code and the MPS code are stored in part of the block 520, which holds the FILL element of the ADTS format. The block 520 also stores the codebook selection information obtained by the prediction encoding unit 14.

Next, the prediction encoding unit 14 is described in detail. FIG. 6 is a configuration diagram of the prediction encoding unit 14. The prediction encoding unit 14 includes a prediction error shape determination unit 141, a minimum error prediction coefficient calculation unit 142, a codebook selection unit 143, and a prediction coefficient encoding unit 144.

Based on the frequency signals of the channels received from the second downmix unit 13, the prediction error shape determination unit 141 determines the distribution shape of the prediction error between C_p0(k,n) and the predicted value C'_p0(k,n) of the center-channel frequency signal calculated from the prediction coefficients C₁(k) and C₂(k). This distribution shape is used both to select a codebook and to minimize the prediction error d.

The inventors of the present application found that the distribution shape of the prediction error between the predicted value C'_p0(k,n) and the center channel signal C_p0(k,n) is either a parabolic cylinder or an elliptic paraboloid. The following therefore first explains why the distribution shape of the prediction error takes one of these two forms.

From the left-channel prediction coefficient C₁(k) and the right-channel prediction coefficient C₂(k), the predicted value C'_p0(k,n) of the center channel signal and the prediction error d(k) are defined by the following equations.

Expanding and rearranging equation (10) expresses it as a quadratic curve in the prediction coefficients C₁(k) and C₂(k), as in the following equation. In what follows, for simplicity, the prediction coefficients C₁(k) and C₂(k) and the prediction error d(k) are written simply as c₁, c₂, and d, respectively. Likewise, the left-channel frequency signal L_p0(k,n), the right-channel frequency signal R_p0(k,n), and the center channel signal C_p0(k,n) are written simply as L₀, R₀, and C₀, respectively.

In equation (11), the function Re(x) returns the real part of x, and the function Im(x) returns the imaginary part of x.
In equation (11), the coefficients of the terms in c₁ and c₂ and the constant term are defined as follows.

Using these definitions, equation (11) is expressed as follows.
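The equations referred to in this passage are not reproduced in this text. As a hedged reconstruction from the surrounding description (a least-squares error between the center signal and a real-weighted sum of the left and right signals), one form consistent with the later statements about α, β, γ, δ, and ε is sketched below. The helper f is taken from the later use of f(L₀,L₀); the assignment of δ, ε, and ζ and their signs is an assumption of this sketch and may differ from the original equation (12).

```latex
\begin{align*}
C'_{p0}(k,n) &= C_1(k)\,L_{p0}(k,n) + C_2(k)\,R_{p0}(k,n), \\
d(k) &= \sum_{(k,n)} \bigl| C_{p0}(k,n) - C'_{p0}(k,n) \bigr|^2
  && \text{(cf. (10))} \\[4pt]
d &= f(L_0,L_0)\,c_1^2 + 2 f(L_0,R_0)\,c_1 c_2 + f(R_0,R_0)\,c_2^2 \\
  &\quad - 2 f(L_0,C_0)\,c_1 - 2 f(R_0,C_0)\,c_2 + f(C_0,C_0),
  && \text{(cf. (11))} \\
f(x,y) &= \sum_{(k,n)} \bigl( \mathrm{Re}(x)\,\mathrm{Re}(y) + \mathrm{Im}(x)\,\mathrm{Im}(y) \bigr) \\[4pt]
\alpha &= f(L_0,L_0),\quad \beta = f(L_0,R_0),\quad \gamma = f(R_0,R_0), \\
\delta &= -f(L_0,C_0),\quad \varepsilon = -f(R_0,C_0),\quad \zeta = f(C_0,C_0)
  && \text{(cf. (12))} \\[4pt]
d &= \alpha c_1^2 + 2\beta c_1 c_2 + \gamma c_2^2 + 2\delta c_1 + 2\varepsilon c_2 + \zeta
  && \text{(cf. (13))}
\end{align*}
```

With this form, γ = 0 forces R₀ = 0 and hence ε = 0, and α = 0 forces L₀ = 0 and hence δ = 0, which agrees with the argument given for the parabola condition.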

In general, a quadratic curve is a parabola, a hyperbola, a pair of parallel straight lines, or an ellipse. The conditions under which the quadratic curve takes each of these forms are described below.
For example, when the coefficients in equation (13) satisfy the following condition, the quadratic curve represented by equation (13) is a parabola.

When the coefficients in equation (13) satisfy the following condition, the quadratic curve represented by equation (13) is a hyperbola.

When the coefficients in equation (13) satisfy the following condition, the quadratic curve represented by equation (13) is a pair of parallel straight lines.

When the coefficients in equation (13) satisfy the following condition, the quadratic curve represented by equation (13) is an ellipse.

Because of the properties of the left frequency signal L_p0(k,n), the right frequency signal R_p0(k,n), and the center-channel frequency signal C_p0(k,n), the coefficients in equation (13) never satisfy the condition for a parabola or the condition for a hyperbola.
First, the reason why the coefficients in equation (13) never satisfy the parabola condition in equation (14) is explained.

In equation (14), suppose γ = 0. When γ = 0, the following equation implies that the right frequency signal R_p0(k,n) = 0 for all (k,n).

Consequently, the following equation gives ε = 0.

By the same calculation, assuming α = 0 gives δ = 0. Therefore, the condition in equation (14) is never satisfied.

Next, the reason why the coefficients in equation (13) never satisfy the hyperbola condition in equation (15) is explained. Equation (14) can be expanded as follows.

By the Cauchy-Schwarz inequality, equation (20) satisfies the following relation.

Therefore, the hyperbola condition in equation (15) is never satisfied.
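The Cauchy-Schwarz step can be written out explicitly. The sketch below assumes that α, β, and γ denote f(L₀,L₀), f(L₀,R₀), and f(R₀,R₀), where f sums Re(x)Re(y) + Im(x)Im(y) over all (k,n), and that the hyperbola condition requires β² − αγ to be positive. Those readings are inferred from the surrounding text, since the original equations are not reproduced here.

```latex
\begin{align*}
\beta^2 - \alpha\gamma
  &= f(L_0,R_0)^2 - f(L_0,L_0)\,f(R_0,R_0) \\
  &\le 0
  \qquad\text{since } f(L_0,R_0)^2 \le f(L_0,L_0)\,f(R_0,R_0)
  \text{ (Cauchy-Schwarz)} .
\end{align*}
```

Equality holds exactly when L₀ and R₀ are linearly dependent over the reals, that is, when one of them is zero everywhere or R₀ = s·L₀ for some real s. A strictly positive β² − αγ therefore cannot occur.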

Therefore, the quadratic curve forming a cross section of the distribution shape of the prediction error d is either a pair of parallel straight lines or an ellipse. When the pair of parallel straight lines is regarded as a quadric surface over the prediction coefficients c₁ and c₂, the distribution shape of the prediction error d is a parabolic cylinder (hereinafter, the parabolic type). When the ellipse is regarded as a quadric surface over the prediction coefficients c₁ and c₂, the distribution shape of the prediction error d is an elliptic paraboloid (hereinafter, the elliptic type).

Accordingly, based on the left frequency signal L_p0(k,n), the right frequency signal R_p0(k,n), and the center-channel frequency signal C_p0(k,n), the prediction error shape determination unit 141 determines which of the conditions in equation (16) and equation (17) is satisfied. If the condition in equation (16) is satisfied, that is, if (β²-αγ) = 0, the prediction error shape determination unit 141 determines that the distribution shape of the prediction error d is the parabolic type. This ultimately corresponds to one of the following two cases.
・At least one of the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) is 0 in all frequency bands, that is, the left channel or the right channel is silent.
・The inner product of the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) is 0, that is, the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) are in phase or in opposite phase.

On the other hand, if the condition in equation (17) is satisfied, that is, if (β²-αγ) ≠ 0, the prediction error shape determination unit 141 determines that the distribution shape of the prediction error d is the elliptic type.
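The shape decision can be sketched numerically as a test on β² − αγ. In the example below, α, β, and γ are taken to be f(L₀,L₀), f(L₀,R₀), and f(R₀,R₀) with f the real inner product of the complex frequency samples. That reading, the relative tolerance `rel_tol`, and the function names are assumptions of this sketch (floating-point data rarely yields an exact zero, so some tolerance is needed in practice).

```python
def f(x, y):
    """Sum of Re(x)Re(y) + Im(x)Im(y) over all samples of one band."""
    return sum(a.real * b.real + a.imag * b.imag for a, b in zip(x, y))


def classify_error_shape(left, right, rel_tol=1e-9):
    """Return "parabolic" if beta^2 - alpha*gamma is (numerically) zero,
    otherwise "elliptic". `rel_tol` is a hypothetical tolerance."""
    alpha = f(left, left)
    beta = f(left, right)
    gamma = f(right, right)
    discriminant = beta * beta - alpha * gamma
    if abs(discriminant) <= rel_tol * alpha * gamma:
        return "parabolic"
    return "elliptic"


# Right channel is a real multiple (s = 2) of the left channel.
left = [1 + 1j, 2 - 1j]
shape = classify_error_shape(left, [2 * v for v in left])
```

A silent channel gives α·γ = 0 and β = 0, so the test also returns "parabolic" in that case.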

The prediction error shape determination unit 141 notifies the minimum error prediction coefficient calculation unit 142 and the codebook selection unit 143 of the result of determining the distribution shape of the prediction error d.

Based on the result of determining the distribution shape of the prediction error d, the minimum error prediction coefficient calculation unit 142 calculates the prediction coefficients c₁ and c₂ that minimize the prediction error.

First, the formulas for calculating the prediction coefficients c₁ and c₂ when the distribution shape of the prediction error d is the parabolic type are described. When the condition in equation (16) is satisfied, one of the following conditions (i) to (iii) is satisfied.

Here, the case where condition (iii) of equation (22) is satisfied is described. Condition (iii) can be expressed as the following equation.

Here, s is an arbitrary real number. Substituting equation (23) into each term of equation (11) allows the prediction error d to be expressed as the following equation.

In equation (24), (c₁+sc₂) is a linear expression in c₁ and c₂. Replacing (c₁+sc₂) in equation (24) with a variable z, and replacing the values uniquely determined by the left frequency signal L_p0(k,n), the right frequency signal R_p0(k,n), and the center-channel frequency signal C_p0(k,n) with constants A, B, C, and D, equation (24) can be expressed as the following general equation of a parabola.

From equation (11), f(L₀,L₀) is the sum over (k,n) of the squared absolute value of the left frequency signal L_p0(k,n), so it is always positive. Consequently, in equation (24), the parabolic-cylinder distribution of the prediction error, plotted with the prediction coefficients c₁ and c₂ and the prediction error d on mutually orthogonal coordinate axes, has a minimum with respect to the c₁-c₂ plane.
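The reduction to a one-variable parabola can be worked through under two assumptions that the text implies but whose equations are not reproduced here: that condition (iii) means R₀ = s·L₀ for a real s, and that d is the squared error between C₀ and c₁L₀ + c₂R₀, with f(x,y) summing Re(x)Re(y) + Im(x)Im(y) over all (k,n). The sketch does not attempt to match the constants A, B, C, and D of the original equation.

```latex
\begin{align*}
c_1 L_0 + c_2 R_0 &= (c_1 + s\,c_2)\,L_0 = z\,L_0, \qquad z = c_1 + s\,c_2, \\
d &= f(L_0,L_0)\,z^2 - 2 f(L_0,C_0)\,z + f(C_0,C_0) \\
  &= f(L_0,L_0)\left( z - \frac{f(L_0,C_0)}{f(L_0,L_0)} \right)^{2}
     + f(C_0,C_0) - \frac{f(L_0,C_0)^2}{f(L_0,L_0)} .
\end{align*}
```

Because f(L₀,L₀) > 0, d is smallest along the straight line c₁ + s·c₂ = f(L₀,C₀)/f(L₀,L₀) in the c₁-c₂ plane, and every point on that line gives the same minimum error.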

FIG. 7 is a conceptual diagram of a parabolic-cylinder prediction error distribution in which the prediction coefficients c₁ and c₂ and the prediction error d are plotted on mutually orthogonal coordinate axes. In FIG. 7, the coordinate axes correspond to the prediction coefficients c₁ and c₂ and the prediction error d, respectively. The three-dimensional graph 700 represents the distribution of the prediction error d. As the three-dimensional graph 700 shows, the minimum of the prediction error d lies on a straight line in the c₁-c₂ plane, and the prediction error d increases parabolically with distance from that line. For the parabolic type, the minimum lies on the straight line expressed by the following equation.

When L_p0(k,n) = 0 for all (k,n), f(L₀,L₀) is 0 rather than positive. In that case, however, condition (i) of equation (22) is satisfied. Therefore, whenever condition (iii) of equation (22) is satisfied, f(L₀,L₀) is always positive.

When condition (i) of equation (22) is satisfied, the minimum of the parabolic prediction error lies on the straight line expressed by the following equation.

In this case, the prediction coefficient c₁ can take any value. On the other hand, when condition (ii) of equation (22) is satisfied, the minimum of the parabolic prediction error lies on the straight line expressed by the following equation.

In this case, the prediction coefficient c₂ can take any value.

Therefore, when the distribution shape of the prediction error d is the parabolic type, the minimum error prediction coefficient calculation unit 142 calculates the prediction coefficients c_1min and c_2min that minimize the prediction error d according to whichever of conditions (i) to (iii) of equation (22) is satisfied. That is, when condition (i) of equation (22) is satisfied, the minimum error prediction coefficient calculation unit 142 calculates c_1min and c_2min according to equation (27). When condition (ii) of equation (22) is satisfied, it calculates c_1min and c_2min according to equation (28). When condition (iii) of equation (22) is satisfied, it calculates c_1min and c_2min according to equation (26).

When the distribution shape of the prediction error d is the elliptic type, the prediction error d is minimized where the partial derivatives of equation (10) with respect to the prediction coefficients c₁ and c₂ are both 0. Therefore, when the distribution shape of the prediction error d is the elliptic type, the minimum error prediction coefficient calculation unit 142 calculates the prediction coefficients c_1min and c_2min that minimize the prediction error d according to the following equation.
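Setting both partial derivatives to zero gives a 2×2 linear system, which can be solved in closed form. The sketch below assumes real-valued coefficients and the squared-error definition d = Σ|C₀ − c₁L₀ − c₂R₀|², with f the real inner product of the complex samples. The function names are illustrative, and the closed form is derived here from that definition rather than copied from the original equation, which is not reproduced in this text.

```python
def f(x, y):
    """Sum of Re(x)Re(y) + Im(x)Im(y) over all samples of one band."""
    return sum(a.real * b.real + a.imag * b.imag for a, b in zip(x, y))


def min_error_coefficients(left, right, center):
    """Solve the normal equations for the elliptic case.

    alpha*c1 + beta*c2  = f(L0, C0)
    beta*c1  + gamma*c2 = f(R0, C0)
    """
    alpha, beta, gamma = f(left, left), f(left, right), f(right, right)
    p, q = f(left, center), f(right, center)
    det = alpha * gamma - beta * beta
    if det <= 0:
        # Parabolic case: the minimum is a line, not a single point.
        raise ValueError("error surface is not an elliptic paraboloid")
    c1 = (gamma * p - beta * q) / det
    c2 = (alpha * q - beta * p) / det
    return c1, c2


# Synthetic check: a center signal built as 0.5*L + 0.3*R.
L0 = [1 + 2j, 3 - 1j, -2 + 0.5j]
R0 = [0.5 - 1j, 2 + 2j, 1 - 3j]
C0 = [0.5 * l + 0.3 * r for l, r in zip(L0, R0)]
c1_min, c2_min = min_error_coefficients(L0, R0, C0)
```

When the center signal is an exact real-weighted sum of the two downmix channels, the recovered coefficients equal the weights up to rounding and the minimum error is zero.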

FIG. 8 is a conceptual diagram of an elliptic-paraboloid prediction error distribution in which the prediction coefficients c₁ and c₂ and the prediction error d are plotted on mutually orthogonal coordinate axes. In FIG. 8, the coordinate axes correspond to the prediction coefficients c₁ and c₂ and the prediction error d, respectively. The three-dimensional graph 800 represents the distribution of the prediction error d. As the three-dimensional graph 800 shows, the minimum of the prediction error d is a single point in the c₁-c₂ plane, and the prediction error d increases elliptically with distance from that point.

The minimum error prediction coefficient calculation unit 142 outputs the prediction coefficients c_1min and c_2min at which the prediction error d takes its minimum value to the codebook selection unit 143.

The codebook selection unit 143 selects, from among the codebooks prepared in advance for the respective prediction coefficients, the codebook to be used for determining the quantized values of the prediction coefficients c₁ and c₂. The selection is based on the distribution shape of the prediction error d and on the prediction coefficients c_1min and c_2min at which the prediction error d takes its minimum value.

As described above, as the number of quantized values of the prediction coefficients c₁ and c₂ increases, the number of codes assigned to the prediction coefficients c₁ and c₂ also increases. To maintain the orthogonality of the codes, the average code length grows as the number of codes grows, and the coding efficiency decreases as a result. The range of quantized values of the prediction coefficients c₁ and c₂ defined in the codebook is therefore limited. Consequently, the values of the prediction coefficients c_1min and c_2min at which the prediction error d is minimized may fall outside the range of quantized values of c₁ and c₂ defined in the codebook. In such a case, assigning a code to every quantized value within that range merely increases the number of unused codes and is redundant. In the present embodiment, therefore, when one of the prediction coefficients c_1min and c_2min falls outside the range of quantized values defined in the codebook, or when one of them does not affect the prediction error d, the codebook selection unit 143 selects only the codebook for the other prediction coefficient.

First, the case where the distribution shape of the prediction error d is elliptical will be described.
FIG. 9 is a conceptual diagram showing the positional relationship between the prediction coefficients c_1min and c_2min at which the prediction error d is minimized and the range of quantized values of the prediction coefficients c₁ and c₂, for the case where the distribution shape of the prediction error d is elliptical. In FIG. 9, the range 900 of quantized values of the prediction coefficients c₁ and c₂ is drawn on the c₁-c₂ plane, whose mutually orthogonal coordinate axes are the prediction coefficients c₁ and c₂. Here, c_1t and c_1b represent the upper limit and the lower limit, respectively, of the quantized values of the prediction coefficient c₁ defined in the codebook. Likewise, c_2t and c_2b represent the upper limit and the lower limit, respectively, of the quantized values of the prediction coefficient c₂ defined in the codebook.

Each of the prediction error surfaces 901 to 905 is an example of a prediction error surface representing the distribution of the prediction error d. For example, the prediction error surface 901, for which the prediction coefficient c_1min is larger than the upper limit of the quantized values of the prediction coefficient c₁ that are actually encoded, touches the range 900 at a point 911 on the straight line along the upper limit c_1t of the prediction coefficient c₁. For the prediction coefficient c₁, therefore, the value actually encoded is the upper limit c_1t, and the codebook needs to define only the prediction coefficient c₂. Similarly, the prediction error surfaces 902 to 904, for which at least one of the prediction coefficients c_1min and c_2min lies outside the range 900 of quantized values defined in the codebook, touch the range 900 at points 912 to 914 on the boundary of the range 900. In contrast, the prediction error surface 905, for which both of the prediction coefficients c_1min and c_2min lie within the range 900, touches the range 900 at (c_1min, c_2min) itself.

FIG. 10 is a diagram showing the relationship between the selected codebook and the contact point between the prediction error surface and the range of quantized values of the prediction coefficients defined in the codebook, for the case where the distribution shape of the prediction error d is elliptical. FIG. 10 is a view of the c₁-c₂ plane shown in FIG. 9 as seen from above. When the contact point 1001 between the prediction error surface and the range 1000 of quantized values of the prediction coefficients c₁ and c₂ lies on the straight line along the upper limit c_1t of the quantized values of the prediction coefficient c₁, the selected codebook contains only a plurality of quantized values of the prediction coefficient c₂, as indicated by the range 1010. When the contact point 1002 between the prediction error surface and the range 1000 lies on the straight line along the lower limit c_2b of the quantized values of the prediction coefficient c₂, the selected codebook contains only a plurality of quantized values of the prediction coefficient c₁, as indicated by the range 1011. When the minimum of the prediction error surface lies within the range 1000, the selected codebook contains the quantized values of both prediction coefficients c₁ and c₂ within the range 1000.
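The relationship in FIG. 10 can be sketched as a small decision function. This is a simplification, not the procedure of the embodiment: it classifies by which of c_1min and c_2min is out of range, whereas the text decides by the boundary on which the contact point lies. The function name, the string labels, and the numeric ranges used in the usage note are hypothetical, and the case where both coefficients are out of range is deliberately left unhandled.

```python
def select_codebooks_elliptic(c1_min, c2_min, c1_range, c2_range):
    """Return the set of coefficients whose codebook is selected.

    c1_range and c2_range are (lower, upper) limits of the quantized values.
    Assumption: exactly one coefficient may be out of range; the case where
    both are out of range is not modelled in this sketch.
    """
    c1_in = c1_range[0] <= c1_min <= c1_range[1]
    c2_in = c2_range[0] <= c2_min <= c2_range[1]
    if c1_in and c2_in:
        return {"c1", "c2"}      # minimum lies inside the range
    if not c1_in and c2_in:
        return {"c2"}            # c1 is pinned to its lower or upper limit
    if c1_in and not c2_in:
        return {"c1"}            # c2 is pinned to its lower or upper limit
    raise NotImplementedError("both coefficients out of range")
```

For instance, with hypothetical limits of (-2.0, 3.0) for both coefficients, a minimum at (3.4, 1.0) would select only the codebook for c₂.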

FIG. 11 is an operation flowchart of the codebook selection process for the case where the distribution shape of the prediction error d is elliptical. The codebook selection unit 143 selects the codebook to be used according to this operation flowchart for each frequency band of each frame in which the distribution shape of the prediction error d has been determined to be elliptical.
The codebook selection unit 143 determines whether the pair of prediction coefficients (c_1min, c_2min) at which the prediction error d is minimized lies within the ranges of quantized values defined in the codebooks of the respective prediction coefficients (step S101).
When at least one of the pair of prediction coefficients (c_1min, c_2min) falls outside the range of quantized values of the prediction coefficients (step S101: No), the codebook selection unit 143 obtains the contact point between the prediction error surface and the range of quantized values of the prediction coefficients (step S102). To do so, the codebook selection unit 143 sets, in equation (10), whichever of the prediction coefficients c_1min and c_2min is outside the range of quantized values to the nearer of the upper and lower limits of that range. For example, when the prediction coefficient c_1min is below the lower limit of the range of quantized values, the codebook selection unit 143 sets the prediction coefficient c₁ to the lower limit c_1b. The codebook selection unit 143 then varies the other prediction coefficient over the values it can take, and takes as the contact point the pair of prediction coefficients (c_1c, c_2c) at which the prediction error d is minimized.
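The contact-point search of step S102 can be sketched as follows, again under the assumption (not taken from equation (10)) that the prediction error is a real-valued least-squares sum d = Σ (t − c₁·x₁ − c₂·x₂)². Under that assumption, fixing one coefficient makes d a one-dimensional convex quadratic in the other, so its constrained minimum is the unconstrained minimum clamped to the allowed range. The function names and the `stats` tuple of sums (Σx₁², Σx₁x₂, Σx₂², Σx₁t, Σx₂t) are hypothetical.

```python
def clamp(v, lo, hi):
    """Limit v to the closed interval [lo, hi]."""
    return max(lo, min(hi, v))


def contact_point(c_min, c1_range, c2_range, stats):
    """Return (c1c, c2c) for an assumed least-squares error surface.

    stats = (a11, a12, a22, b1, b2) are the sums described in the lead-in.
    If both coefficients are out of range, only c1 is pinned first; that
    ordering is an assumption of this sketch.
    """
    c1_min, c2_min = c_min
    (c1b, c1t), (c2b, c2t) = c1_range, c2_range
    a11, a12, a22, b1, b2 = stats
    if not (c1b <= c1_min <= c1t):
        c1c = clamp(c1_min, c1b, c1t)                    # nearer limit of c1
        c2c = clamp((b2 - a12 * c1c) / a22, c2b, c2t)    # best c2 on that line
    elif not (c2b <= c2_min <= c2t):
        c2c = clamp(c2_min, c2b, c2t)                    # nearer limit of c2
        c1c = clamp((b1 - a12 * c2c) / a11, c1b, c1t)    # best c1 on that line
    else:
        c1c, c2c = c1_min, c2_min                        # already inside
    return c1c, c2c
```

Clamping an out-of-range value to the interval always yields the nearer of the two limits, which matches the rule stated in the text.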

On the other hand, when both members of the pair of prediction coefficients (c_1min, c_2min) lie between the upper and lower limits of the quantized values defined in the codebook in step S101 (step S101: Yes), the codebook selection unit 143 selects the codebooks for both prediction coefficients c₁ and c₂ (step S106).
After step S104, S105, or S106, the codebook selection unit 143 ends the codebook selection process.

Next, selection of the codebook for the case where the distribution shape of the prediction error d is parabolic will be described.
FIG. 12 is a diagram showing the relationship between the pair of prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d and the range of quantized values of the prediction coefficients, for the case where the distribution shape of the prediction error d is parabolic.
The straight line 1201 represents the pairs of prediction coefficients (c_1min, c_2min) calculated according to equation (27) when condition (i) of equation (22) is satisfied. In this case, the prediction coefficient c_1min is arbitrary and the straight line 1201 is parallel to the axis of the prediction coefficient c₁, so only the prediction coefficient c₂ needs to be encoded. The codebook selection unit 143 therefore needs to select only the codebook for the prediction coefficient c₂. The straight line 1202, in turn, represents the pairs of prediction coefficients (c_1min, c_2min) calculated according to equation (28) when condition (ii) of equation (22) is satisfied. In this case, the prediction coefficient c_2min is arbitrary and the straight line 1202 is parallel to the axis of the prediction coefficient c₂, so only the prediction coefficient c₁ needs to be encoded. The codebook selection unit 143 therefore needs to select only the codebook for the prediction coefficient c₁.

The straight line 1203 represents the pairs of prediction coefficients (c_1min, c_2min) calculated according to equation (26) when condition (iii) of equation (22) is satisfied. In this case, the prediction coefficient c_2min changes according to the prediction coefficient c_1min. The codebook selection unit 143 therefore selects the codebooks for both prediction coefficients c₁ and c₂.

FIG. 13 is an operation flowchart of the codebook selection process for the case where the distribution shape of the prediction error d is parabolic. The codebook selection unit 143 selects the codebook to be used according to this operation flowchart for each frequency band of each frame in which the distribution shape of the prediction error d has been determined to be parabolic.
The codebook selection unit 143 determines whether the straight line L representing the relationship between the prediction coefficients (c_1min, c_2min) that minimize the prediction error d is parallel to the axis of the prediction coefficient c₂, that is, whether it was calculated according to equation (28) (step S201). When the straight line L is parallel to the axis of the prediction coefficient c₂ (step S201: Yes), the codebook selection unit 143 selects only the codebook for the prediction coefficient c₁ (step S202).

On the other hand, when the straight line L is not parallel to the axis of the prediction coefficient c₂ (step S201: No), the codebook selection unit 143 determines whether the straight line L is parallel to the axis of the prediction coefficient c₁, that is, whether it was calculated according to equation (27) (step S203). When the straight line L is parallel to the axis of the prediction coefficient c₁ (step S203: Yes), the codebook selection unit 143 selects only the codebook for the prediction coefficient c₂ (step S204).

On the other hand, when the straight line L is not parallel to the axis of the prediction coefficient c₁ (step S203: No), the codebook selection unit 143 selects the codebooks for both prediction coefficients c₁ and c₂ (step S205).
After step S202, S204, or S205, the codebook selection unit 143 ends the codebook selection process.
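Steps S201 to S205 can be sketched as below. The representation of the straight line L as coefficients (p, q) of p·c₁ + q·c₂ = r is an assumption made for illustration; the embodiment instead identifies the case by which of equations (26) to (28) produced the line. The function name and return labels are hypothetical.

```python
def select_codebooks_parabolic(p, q, eps=1e-12):
    """Select codebooks from the direction of the line p*c1 + q*c2 = r.

    q == 0 means c1 is fixed and c2 is free, so L is parallel to the c2 axis.
    p == 0 means c2 is fixed and c1 is free, so L is parallel to the c1 axis.
    """
    if abs(p) < eps and abs(q) < eps:
        raise ValueError("p and q must not both be zero")
    if abs(q) < eps:          # S201: Yes -> S202
        return {"c1"}
    if abs(p) < eps:          # S203: Yes -> S204
        return {"c2"}
    return {"c1", "c2"}       # S203: No  -> S205
```

The order of the two tests mirrors the flowchart: the check against the c₂ axis comes first, then the check against the c₁ axis.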

The codebook selection unit 143 notifies the prediction coefficient encoding unit 144 of codebook selection information representing the selected codebook for each frequency band. The codebook selection information is represented by, for example, 2 bits. A value of '11' indicates that the codebooks for both prediction coefficients have been selected. A value of '01' indicates that the codebook for the prediction coefficient c₁ has been selected, and a value of '10' indicates that the codebook for the prediction coefficient c₂ has been selected.

The prediction coefficient encoding unit 144 encodes the prediction coefficients according to the selected codebook. For example, the prediction coefficient encoding unit 144 selects, from among the plurality of quantized values of the prediction coefficient contained in the codebook, the quantized value that minimizes the prediction error d. The prediction coefficient encoding unit 144 then obtains the index value corresponding to the selected quantized value.

FIG. 14 is a diagram showing an example of a codebook that stores quantized values of a prediction coefficient. As shown in FIG. 14, in the codebook 1400 each pair of rows forms one set representing quantized values of the prediction coefficient. The numerical values in the cells of rows 1410, 1420, 1430, 1440, and 1450, which are labeled "idx" in the leftmost column, represent index values. The numerical values in the cells of rows 1415, 1425, 1435, 1445, and 1455, which are labeled "C[idx]" in the leftmost column, represent the quantized values of the prediction coefficient corresponding to the index value directly above. For example, cell 1401 stores '-20' as an index value, and cell 1402 stores the quantized value '-2.0' of the prediction coefficient corresponding to the index value '-20'.

For example, when the prediction coefficient c₁ for the frequency band k is 1.21, the quantized value corresponding to the index value '12' is the one closest to c₁ in the codebook 1400. The prediction coefficient encoding unit 144 therefore sets the index value for c₁ to '12'.
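The nearest-value lookup in this example can be sketched as follows. The table is hypothetical: the text only states that index -20 maps to -2.0 and that index 12 is nearest to 1.21, so the uniform step of 0.1 and the upper index of 30 are assumptions made to have something runnable.

```python
# Hypothetical codebook: C[idx] = idx / 10 for idx = -20 .. 30.
# Only C[-20] = -2.0 is stated in the text; step and upper end are assumed.
CODEBOOK = {idx: idx / 10 for idx in range(-20, 31)}


def nearest_index(c, codebook=CODEBOOK):
    """Return the index whose quantized value is closest to c."""
    return min(codebook, key=lambda idx: abs(codebook[idx] - c))
```

With this table, `nearest_index(1.21)` yields 12, in line with the example in the text, and a value below the table's range maps to the lowest index.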

In the following, the method of determining the quantized values of the prediction coefficients and the corresponding index values is described separately for the case where the distribution shape of the prediction error d is elliptical and the case where it is parabolic.
First, the method of determining the quantized values and index values for the case where the distribution shape of the prediction error d is elliptical will be described.

When both members of the pair of prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d lie within the ranges of quantized values defined in the codebooks, the prediction coefficient encoding unit 144 refers to the corresponding codebook for each of c_1min and c_2min and obtains the closest quantized value. The prediction coefficient encoding unit 144 then refers to the codebook to determine the index value corresponding to the quantized value of each prediction coefficient.

On the other hand, when the pair of prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d falls outside the range of quantized values defined in the codebook, the prediction coefficient encoding unit 144 obtains, based on equation (10), the contact point (c_1c, c_2c) between the prediction error surface and the boundary of the range of quantized values of the prediction coefficients. Alternatively, the prediction coefficient encoding unit 144 may receive the contact point (c_1c, c_2c) from the codebook selection unit 143.

When only the codebook for the prediction coefficient c₂ is selected, that is, when the contact point (c_1c, c_2c) lies on the straight line along the lower limit or the upper limit of the prediction coefficient c₁, the prediction coefficient encoding unit 144 refers to the codebook for the prediction coefficient c₂. The prediction coefficient encoding unit 144 then selects the quantized value closest to the prediction coefficient c_2c at the contact point and obtains the index value corresponding to the selected quantized value. When c_1min is larger than the upper limit c_1t of the range of quantized values of the prediction coefficient c₁, the prediction coefficient encoding unit 144 obtains, for the prediction coefficient c₁, the index value corresponding to the upper limit c_1t.
On the other hand, when c_1min is smaller than the lower limit c_1b of the range of quantized values of the prediction coefficient c₁, the prediction coefficient encoding unit 144 obtains, for the prediction coefficient c₁, the index value corresponding to the lower limit c_1b.
Alternatively, when c_1min is larger than the upper limit c_1t of the range of quantized values of the prediction coefficient c₁, the prediction coefficient encoding unit 144 may set the index value of the prediction coefficient c₁ to the same index value as that of the adjacent frequency band. When c_1min is smaller than the lower limit c_1b, the prediction coefficient encoding unit 144 may set the index value of the prediction coefficient c₁ to the index value of the adjacent frequency band plus 1.

Similarly, when only the codebook for the prediction coefficient c₁ is selected, that is, when the contact point (c_1c, c_2c) lies on the straight line along the lower limit or the upper limit of the prediction coefficient c₂, the prediction coefficient encoding unit 144 refers to the codebook for the prediction coefficient c₁. The prediction coefficient encoding unit 144 then selects the quantized value closest to the prediction coefficient c_1c at the contact point and obtains the index value corresponding to the selected quantized value.
For the prediction coefficient c₂, the prediction coefficient encoding unit 144 obtains the index value corresponding to whichever of the upper limit c_2t and the lower limit c_2b of the range of quantized values of the prediction coefficient c₂ is closer to c_2min.
Alternatively, when c_2min is larger than the upper limit c_2t of the range of quantized values of the prediction coefficient c₂, the prediction coefficient encoding unit 144 may set the index value of the prediction coefficient c₂ to the same index value as that of the adjacent frequency band. When c_2min is smaller than the lower limit c_2b, the prediction coefficient encoding unit 144 may set the index value of the prediction coefficient c₂ to the index value of the adjacent frequency band plus 1.

Next, the method of determining the quantized values and index values for the case where the distribution shape of the prediction error d is parabolic will be described.
When the straight line L representing the relationship between the prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d has been calculated according to equation (28) and is parallel to the axis of the prediction coefficient c₂, the prediction coefficient encoding unit 144 refers to the codebook for the prediction coefficient c₁. The prediction coefficient encoding unit 144 then selects the quantized value closest to c_1min and obtains the index value corresponding to the selected quantized value. Because the prediction coefficient c_2min is arbitrary, the prediction coefficient encoding unit 144 omits the encoded value of the prediction coefficient c₂ and therefore sets no index value for c₂. Alternatively, the prediction coefficient encoding unit 144 may set a suitable index value for the prediction coefficient c₂. For example, it sets the index value of the prediction coefficient c₂ to the same index value as that obtained for the adjacent frequency band.

When the straight line L representing the relationship between the prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d has been calculated according to equation (27) and is parallel to the axis of the prediction coefficient c₁, the prediction coefficient encoding unit 144 refers to the codebook for the prediction coefficient c₂. The prediction coefficient encoding unit 144 then selects the quantized value closest to c_2min and obtains the index value corresponding to the selected quantized value. Because the prediction coefficient c_1min is arbitrary, the prediction coefficient encoding unit 144 omits the encoded value of the prediction coefficient c₁ and therefore sets no index value for c₁. Alternatively, the prediction coefficient encoding unit 144 may set a suitable index value for the prediction coefficient c₁. For example, it sets the index value of the prediction coefficient c₁ to the same index value as that obtained for the adjacent frequency band.

Furthermore, when the straight line L representing the relationship between the prediction coefficients (c_1min, c_2min) corresponding to the minimum value of the prediction error d has been calculated according to equation (26), the prediction coefficient encoding unit 144 selects an arbitrary point on the straight line L that lies within the range of quantized values of the prediction coefficients c₁ and c₂ defined in the codebooks. The prediction coefficient encoding unit 144 then selects the quantized value of c₁ and the quantized value of c₂ closest to the selected point and obtains the index value corresponding to each of those quantized values.

The straight line L and the range of quantized values of the prediction coefficients c₁ and c₂ may not overlap. In this case, the prediction coefficient encoding unit 144 finds, among the four points formed by combining the upper and lower limits of the quantized values of the prediction coefficients c₁ and c₂, the point closest to the straight line L. The prediction coefficient encoding unit 144 then obtains the index values corresponding to the quantized values of c₁ and c₂ at that closest point. Since the method of calculating the distance between a point and the straight line L is well known, a detailed description is omitted.
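The nearest-corner search can be sketched with the standard point-to-line distance |p·c₁ + q·c₂ − r| / √(p² + q²). Writing the straight line L as p·c₁ + q·c₂ = r is an assumption made for illustration, as are the function name and the ranges in the usage note.

```python
import math
from itertools import product


def nearest_corner(line, c1_range, c2_range):
    """Return the corner of the quantization range closest to the line.

    line = (p, q, r) describes p*c1 + q*c2 = r.
    c1_range and c2_range are (lower, upper) limits of the quantized values.
    """
    p, q, r = line
    norm = math.hypot(p, q)
    if norm == 0:
        raise ValueError("p and q must not both be zero")
    corners = product(c1_range, c2_range)   # the four limit combinations
    return min(corners, key=lambda pt: abs(p * pt[0] + q * pt[1] - r) / norm)
```

For the hypothetical line c₁ + c₂ = 10 and limits (-2, 3) on both axes, the closest corner is the upper-upper corner (3, 3).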

For each prediction coefficient, the prediction coefficient encoding unit 144 obtains the difference between the index values of frequency bands adjacent in the frequency direction. For example, when the index value of the prediction coefficient c₁ for the frequency band k is '2' and the index value of the prediction coefficient c₁ for the frequency band (k-1) is '4', the prediction coefficient encoding unit 144 sets the index difference value of the prediction coefficient c₁ for the frequency band k to '-2'. For a prediction coefficient for which no codebook is selected in the adjacent frequency band, however, the prediction coefficient encoding unit 144 may obtain the difference from the index value of the nearest frequency band in which a codebook is selected.
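A minimal sketch of the differential step follows. How the first frequency band is handled is not stated in this passage, so the hypothetical function below returns differences only for bands k ≥ 1 and leaves the first band to the caller.

```python
def index_differences(indices):
    """Return idx[k] - idx[k-1] for k = 1 .. len(indices) - 1.

    indices holds the index values of one prediction coefficient,
    ordered by frequency band.
    """
    return [cur - prev for prev, cur in zip(indices, indices[1:])]
```

With index 4 in band (k-1) and index 2 in band k, the difference for band k is -2, as in the example above.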

The prediction coefficient encoding unit 144 refers to a coding table that gives the correspondence between index difference values and prediction coefficient codes. By referring to the coding table, the prediction coefficient encoding unit 144 determines, for each frequency band, the prediction coefficient code idxc_m(k) (m = 1, 2) for the difference value. Like the similarity code, the prediction coefficient code can be a variable-length code, such as a Huffman code or an arithmetic code, in which difference values that occur more frequently are given shorter codes. In particular, when the index value of a prediction coefficient for which no codebook is selected is set to the index value of the adjacent frequency band or to that value plus 1, the difference value becomes 0 or 1, both of which occur frequently. The prediction coefficient code for a prediction coefficient for which no codebook is selected can therefore be made short.
The codebook and the coding table for each prediction coefficient are stored in advance in a memory of the prediction encoding unit 14.
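The contents of the coding table are not given in the text. As an illustration of why frequent difference values end up with short codes, the sketch below builds a Huffman table from a made-up sample of difference values; the function name and the sample data are hypothetical, and a real encoder would use a fixed table shared with the decoder.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from observed symbols."""
    freq = Counter(symbols)
    if not freq:
        raise ValueError("no symbols given")
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, t0 = heapq.heappop(heap)      # two lightest subtrees
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]


# Made-up difference values in which 0 and 1 dominate.
table = huffman_code([0] * 8 + [1] * 4 + [-2] + [3])
```

In this sample the difference value 0 receives a 1-bit code and 1 a 2-bit code, while the rare values -2 and 3 receive 3-bit codes.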

The prediction coefficient encoding unit 144 outputs the prediction coefficient code idxc_m(k) (m=1,2) to the spatial information encoding unit 15.
The prediction coefficient encoding unit 144 may also variable-length encode the codebook selection information, using, for example, a Huffman code or an arithmetic code. It then outputs the encoded codebook selection information to the multiplexing unit 17.

According to a modification, the prediction coefficient encoding unit 144 may obtain the prediction coefficient code for each frequency band by variable-length encoding the index value of that band itself. In this case, a prediction coefficient whose index value was determined without using a codebook can only take the index value corresponding to either the upper limit or the lower limit of that prediction coefficient. Because only two index values are possible, the prediction coefficient code can again be short.

FIG. 15 is an operation flowchart of the audio encoding process. The flowchart in FIG. 15 represents the processing for one frame of the multi-channel audio signal. While it continues to receive the multi-channel audio signal, the audio encoding device 1 repeats the audio encoding procedure shown in FIG. 15 for every frame.

The time-frequency conversion unit 11 converts the signal of each channel into a frequency signal (step S301). The time-frequency conversion unit 11 outputs the frequency signal of each channel to the first downmix unit 12.

Next, the first downmix unit 12 downmixes the frequency signals of the channels to generate frequency signals for three channels: right, left, and center. The first downmix unit 12 also calculates spatial information for each of the right, left, and center channels (step S302). The first downmix unit 12 outputs the three-channel frequency signals to the second downmix unit 13 and outputs the spatial information to the spatial information encoding unit 15.

The second downmix unit 13 downmixes the three-channel frequency signals received from the first downmix unit 12 to generate a stereo frequency signal and a center channel signal for predictive encoding (step S303). The second downmix unit 13 outputs the stereo frequency signal to the channel signal encoding unit 16. It also outputs the center channel signal, together with the stereo frequency signal, to the prediction encoding unit 14.

The prediction error shape determination unit 141 of the prediction encoding unit 14 determines the distribution shape of the prediction error based on the stereo frequency signal and the center channel signal (step S304). The minimum error prediction coefficient calculation unit 142 of the prediction encoding unit 14 then calculates, according to the distribution shape of the prediction error, the set of prediction coefficients (c_1min,c_2min) at which the prediction error is minimized (step S305). Next, the codebook selection unit 143 of the prediction encoding unit 14 determines whether the distribution shape of the prediction error is elliptical (step S306). If the distribution shape is elliptical (step S306-Yes), the codebook selection unit 143 executes the codebook selection process for the elliptical type according to the operation flow shown in FIG. 11 (step S307). If the distribution shape is parabolic (step S306-No), the codebook selection unit 143 executes the codebook selection process for the parabolic type according to the operation flow shown in FIG. 13 (step S308).

After step S307 or S308, the prediction coefficient encoding unit 144 of the prediction encoding unit 14 encodes the prediction coefficients c₁ and c₂ according to the selected codebooks (step S309). The prediction encoding unit 14 then passes the encoded prediction coefficients to the spatial information encoding unit 15. The prediction coefficient encoding unit 144 may also encode the codebook selection information and output the encoded codebook selection information to the multiplexing unit 17.

The spatial information encoding unit 15 encodes the spatial information received from the first downmix unit 12 and generates an MPS code by multiplexing the encoded spatial information with the encoded prediction coefficients (step S310). The spatial information encoding unit 15 then outputs the MPS code to the multiplexing unit 17.

Meanwhile, the channel signal encoding unit 16 AAC-encodes the low-frequency components of the received stereo frequency signal of each channel, and SBR-encodes the high-frequency components that are not AAC-encoded (step S311). The channel signal encoding unit 16 then outputs the SBR code and the AAC code to the multiplexing unit 17.

Finally, the multiplexing unit 17 generates the encoded audio signal by multiplexing the generated SBR code, AAC code, MPS code, and codebook selection information (step S312).
The multiplexing unit 17 outputs the encoded audio signal, and the audio encoding device 1 ends the encoding process.
Note that the audio encoding device 1 may execute the process of step S311 in parallel with the processes of steps S304 to S310. Alternatively, the audio encoding device 1 may execute the process of step S311 before performing the processes of steps S304 to S310.

As described above, this audio encoding device encodes the two prediction coefficients used to express the predicted value of the center channel frequency signal in terms of the two stereo frequency signals. In doing so, among the prediction coefficients that minimize the prediction error, the device uses a codebook for encoding only those coefficients that fall within the range of quantization values defined by the codebook or that affect the prediction error. The device can therefore reduce the number of combinations of prediction coefficient codes, and with it the number of bits allocated to those codes. For example, suppose that the codebook for each of the two prediction coefficients has 51 quantization values. If the prediction encoding unit 14 directly encodes the index value corresponding to each quantization value, there are 2601 combinations of the two prediction coefficient codes in total. In contrast, if no codebook is selected for one of the prediction coefficients and that coefficient is not encoded at all, as when the prediction error shape is parabolic, only 51 prediction coefficient codes are needed. Even when the prediction error shape is elliptical, for the prediction coefficient whose codebook is not selected it is enough to know whether it equals the upper limit or the lower limit of the quantization values, so only 102 combinations of the two prediction coefficient codes are needed in total. As a result, this audio encoding device can reduce the total amount of encoded data for the multi-channel audio signal. Alternatively, the device can reallocate the bits saved on the prediction coefficient codes to another code, for example the AAC code, and thereby improve the reproduced sound quality without increasing the total amount of encoded data for the multi-channel audio signal.
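The combination counts in the preceding paragraph can be checked with a few lines of arithmetic. The bit counts printed here assume a fixed-length code per combination, which is an assumption made only for this illustration; the device itself uses variable-length codes, so actual bit counts depend on the code tables.

```python
CODEBOOK_SIZE = 51  # quantization values per codebook, as in the example above


def fixed_length_bits(num_combinations: int) -> int:
    """Bits needed to give each combination a distinct fixed-length code."""
    return max(1, (num_combinations - 1).bit_length())


cases = {
    "both codebooks selected": CODEBOOK_SIZE * CODEBOOK_SIZE,
    "parabolic, one coefficient not encoded": CODEBOOK_SIZE,
    "elliptical, one coefficient at a limit": CODEBOOK_SIZE * 2,
}

for name, count in cases.items():
    print(f"{name}: {count} combinations, {fixed_length_bits(count)} bits")
```

Under this fixed-length assumption, the three cases need 12, 6, and 7 bits per frequency band respectively.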

The present invention is not limited to the embodiment described above. According to a modification, when the prediction encoding unit 14 selects a codebook for only one of the prediction coefficients, it may output to the multiplexing unit 17, together with the codebook selection information, boundary information indicating whether the prediction coefficient whose codebook was not selected takes the upper limit or the lower limit of the quantization values. For example, one bit is allocated to the boundary information; the bit is '1' when the prediction coefficient takes the upper limit and '0' when it takes the lower limit. The prediction encoding unit 14 may output this boundary information to the multiplexing unit 17 only when the codebook of exactly one prediction coefficient is selected. The prediction encoding unit 14 may also variable-length encode the codebook selection information and the boundary information before outputting them to the multiplexing unit 17. In this modification, the prediction coefficient encoding unit 144 of the prediction encoding unit 14 need not output a prediction coefficient code for the prediction coefficient whose codebook was not selected.

According to still another embodiment, the channel signal encoding unit of the audio encoding device may encode the stereo frequency signal according to another encoding scheme. For example, the channel signal encoding unit may encode the entire frequency signal according to the AAC encoding scheme. In this case, the SBR encoding unit is omitted from the audio encoding device shown in FIG. 1.

The multi-channel audio signal to be encoded is not limited to a 5.1ch audio signal. For example, the audio signal to be encoded may have any plurality of channels, such as 3ch, 3.1ch, or 7.1ch. In this case as well, the audio encoding device calculates the frequency signal of each channel by time-frequency converting the audio signal of that channel, and generates three-channel frequency signals by downmixing the frequency signals of the channels. When predictively encoding one of the three channels using the frequency signals of the other two, the audio encoding device selects the codebooks in the same way as in the embodiment described above.

Next, an audio decoding device that decodes audio data encoded by the audio encoding device according to the above embodiment or its modifications is described.
FIG. 16 is a schematic configuration diagram of an audio decoding device according to one embodiment. The audio decoding device 2 includes a separation unit 21, a channel signal decoding unit 22, a codebook selection information decoding unit 23, a prediction coefficient decoding unit 24, a prediction decoding unit 25, a spatial information decoding unit 26, an upmix unit 27, and a frequency-time conversion unit 28.

From the data stream containing the encoded audio signal, the separation unit 21 extracts, according to the data format in which the encoded audio signal is stored, the channel signal codes such as the AAC code and the SBR code, the MPS code, and the encoded codebook selection information. The separation unit 21 further separates the spatial information code and the prediction coefficient code from the MPS code. The separation unit 21 then outputs the channel signal codes to the channel signal decoding unit 22 and the encoded codebook selection information to the codebook selection information decoding unit 23. It also outputs the prediction coefficient code to the prediction coefficient decoding unit 24 and the spatial information code to the spatial information decoding unit 26.

The channel signal decoding unit 22 decodes the received channel signal codes. It does so by performing the inverse of the encoding process carried out by the channel signal encoding unit 16 of the audio encoding device 1, thereby reproducing the signal of each channel of the stereo frequency signal. Specifically, for the AAC code, the channel signal decoding unit 22 executes the AAC decoding process to reproduce the low-frequency components of the left and right channels. It then time-frequency converts these low-frequency components to obtain the low-frequency components of the left and right channel frequency signals.
For the SBR code, the channel signal decoding unit 22 executes the SBR decoding process to decode the high-frequency components of the left and right channel frequency signals. It then combines the low-frequency and high-frequency components for each channel to reproduce the left frequency signal L_p0(k,n) and the right frequency signal R_p0(k,n) of the stereo frequency signal. The channel signal decoding unit 22 outputs the reproduced stereo frequency signal to the prediction decoding unit 25.

The codebook selection information decoding unit 23 decodes the encoded codebook selection information. For example, it refers to a reference table that represents the correspondence between each value of the codebook selection information and the Huffman code assigned to that value, and decodes the codebook selection information value corresponding to each Huffman code. The codebook selection information decoding unit 23 then notifies the prediction coefficient decoding unit 24 of the decoded codebook selection information. The reference table is stored in advance, for example, in a memory of the codebook selection information decoding unit 23.
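Table-driven decoding of a prefix code, as used for the codebook selection information, can be sketched as below. The table contents and the value names `both`, `c1_only`, and `c2_only` are hypothetical; this document does not give the actual values or codewords of the reference table.

```python
from typing import Dict, List

# Hypothetical reference table: Huffman codeword -> selection information value.
HYPOTHETICAL_TABLE: Dict[str, str] = {
    "0": "both",      # codebooks selected for both c1 and c2
    "10": "c1_only",  # codebook selected only for c1
    "11": "c2_only",  # codebook selected only for c2
}


def decode_prefix_codes(bits: str, table: Dict[str, str]) -> List[str]:
    """Decode a bit string by matching codewords of a prefix-free table."""
    values: List[str] = []
    current = ""
    for bit in bits:
        current += bit
        if current in table:
            # A prefix-free table guarantees the first match is the codeword.
            values.append(table[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return values


# One value per frequency band, in band order.
print(decode_prefix_codes("01011", HYPOTHETICAL_TABLE))
```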

For each frequency band, the prediction coefficient decoding unit 24 reproduces the index value corresponding to the prediction coefficient code by referring to a table that represents the correspondence between the prediction coefficient codes of the left and right channels and the index values of the prediction coefficients. For each frequency band, the prediction coefficient decoding unit 24 then reads, from its own memory, the codebook selected according to the notified codebook selection information, and identifies, among the plurality of quantization values defined in that codebook, the quantization value of the prediction coefficient corresponding to the index value. For the prediction coefficient code of a prediction coefficient for which no codebook is selected, the prediction coefficient decoding unit 24 refers to a table representing the correspondence between that prediction coefficient code and the upper or lower limit of the quantization values of the prediction coefficient, and obtains that upper or lower limit.

When the prediction coefficient code is a Huffman code of the index difference between adjacent frequency bands, the prediction coefficient decoding unit 24 reproduces the index difference by referring to a table that represents the correspondence between difference values and Huffman codes. It then reproduces the index value of each frequency band by successively adding the differences band by band, and determines the quantization value of the prediction coefficient corresponding to each index value by referring to the codebook.
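The successive addition of index differences followed by a codebook lookup can be sketched as follows. The codebook here (51 values from -2.0 to 3.0 in steps of 0.1, indexed from 0) and the reference value for the lowest band are assumptions for illustration only; the real codebooks are defined elsewhere in the specification.

```python
from itertools import accumulate
from typing import List, Sequence

# Hypothetical codebook: 51 quantization values, -2.0 to 3.0 in steps of 0.1.
CODEBOOK: List[float] = [round(-2.0 + 0.1 * i, 1) for i in range(51)]


def decode_indices(diffs: Sequence[int], first_reference: int = 0) -> List[int]:
    """Recover per-band index values by cumulatively summing differences."""
    return list(accumulate(diffs, initial=first_reference))[1:]


def dequantize(indices: Sequence[int], codebook: Sequence[float]) -> List[float]:
    """Map each index value to its quantization value in the codebook."""
    values = []
    for idx in indices:
        if not 0 <= idx < len(codebook):
            raise ValueError(f"index {idx} is outside the codebook")
        values.append(codebook[idx])
    return values


indices = decode_indices([4, -2])
print(indices)
print(dequantize(indices, CODEBOOK))
```

This is the inverse of differencing along the frequency direction: a band with difference -2 after a band with index 4 is restored to index 2.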

In a modification, for a prediction coefficient for which no codebook is selected, the index value of a frequency band is set either to the same value as the index value of the adjacent frequency band (when the point of contact with the range of quantization values is the upper limit of the quantization values) or to that value plus 1 (when the point of contact is the lower limit of the quantization values). In this case, if the index difference is 0, the prediction coefficient decoding unit 24 reproduces the upper limit of the quantization values for that prediction coefficient. If the index difference is 1, it reproduces the lower limit of the quantization values for that prediction coefficient.
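The convention in this modification (difference 0 means the upper limit, difference 1 means the lower limit) can be written as a small round trip. The function names and the limit values -2.0 and 3.0 are hypothetical and serve only as an illustration.

```python
def assign_boundary_index(previous_index: int, at_upper_limit: bool) -> int:
    """Encoder side: reuse the adjacent band's index for the upper limit,
    or that index plus 1 for the lower limit."""
    return previous_index if at_upper_limit else previous_index + 1


def boundary_value(diff: int, lower: float, upper: float) -> float:
    """Decoder side: map the index difference back to a limit value."""
    if diff == 0:
        return upper
    if diff == 1:
        return lower
    raise ValueError("only differences 0 and 1 are valid for an uncoded coefficient")


# Hypothetical limits of the quantization value range.
LOWER, UPPER = -2.0, 3.0
previous = 7
for at_upper in (True, False):
    diff = assign_boundary_index(previous, at_upper) - previous
    print(at_upper, diff, boundary_value(diff, LOWER, UPPER))
```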

Furthermore, a prediction coefficient for which no codebook is selected in any frequency band is one for which the distribution shape of the prediction error is parabolic and which may take an arbitrary value. The prediction coefficient decoding unit 24 may therefore set that prediction coefficient to any value, for example 0.
The prediction coefficient decoding unit 24 outputs the quantization values of the prediction coefficients of the left and right channels for each frequency band to the prediction decoding unit 25.

For each frequency band, the prediction decoding unit 25 reproduces the predicted value C'_p0(k,n) of the center channel frequency signal by computing the linear sum of the values obtained by multiplying the frequency signal of each channel in the stereo frequency signal by the quantization value of the corresponding prediction coefficient. The prediction decoding unit 25 then upmixes the left channel frequency signal L_p0(k,n) and the right channel frequency signal R_p0(k,n) of the stereo frequency signal together with the predicted value C'_p0(k,n) of the center channel frequency signal. In this way, the prediction decoding unit 25 reproduces the three-channel frequency signals L_in(k,n), R_in(k,n), and C_in(k,n) that were obtained by downmixing the original 5.1ch signal.
For each frequency band, the prediction decoding unit 25 outputs the reproduced frequency signals L_in(k,n), R_in(k,n), and C_in(k,n) to the upmix unit 27.
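The linear sum that yields the predicted center channel value can be sketched for one frequency band as follows. The pairing of c1 with the left channel and c2 with the right channel, and the representation of frequency samples as Python complex numbers, are assumptions of this sketch; the subsequent upmix to L_in, R_in, and C_in is not shown because its formula does not appear in this part of the document.

```python
from typing import List, Sequence


def predict_center(left: Sequence[complex], right: Sequence[complex],
                   c1: float, c2: float) -> List[complex]:
    """Return C'_p0 = c1 * L_p0 + c2 * R_p0 for each time sample n of a band."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    return [c1 * l + c2 * r for l, r in zip(left, right)]


# One band, two time samples, with illustrative coefficient values.
left_band = [1 + 1j, 0.5 + 0j]
right_band = [2 - 1j, 0 - 0.5j]
print(predict_center(left_band, right_band, c1=0.5, c2=0.25))
```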

The spatial information decoding unit 26 decodes the spatial information code received from the separation unit 21. For example, when the index difference between adjacent frequency bands has been Huffman-encoded for each of the similarity and intensity difference codes, the spatial information decoding unit 26 reproduces the index difference by referring to a table that represents the correspondence between difference values and Huffman codes. It then reproduces the index value of each frequency band by successively adding the differences band by band. Finally, it refers to a table representing the correspondence between index values and the quantization values of the similarity or the intensity difference, and determines the quantization values of the similarity and the intensity difference corresponding to each index value.
The spatial information decoding unit 26 outputs the quantization values of the spatial information for each frequency band to the upmix unit 27.

For each frequency band, the upmix unit 27 upmixes the three-channel frequency signals L_in(k,n), R_in(k,n), and C_in(k,n) based on the spatial information, thereby reproducing the frequency signal of each channel of the 5.1ch audio signal. The upmix unit 27 then outputs the reproduced frequency signal of each channel to the frequency-time conversion unit 28.

The frequency-time conversion unit 28 reproduces the 5.1ch audio signal by frequency-time converting the frequency signal of each channel. The audio decoding device 2 then outputs the reproduced audio signal to, for example, a speaker.

FIG. 17 is an operation flowchart of the audio decoding process executed by the audio decoding device 2. The audio decoding device 2 reproduces the audio signal frame by frame according to the following operation flow.

The separation unit 21 extracts the SBR code, the AAC code, the spatial information code, the prediction coefficient code, and the codebook selection information (step S401).
The channel signal decoding unit 22 reproduces the stereo frequency signal by decoding the SBR code and the AAC code received from the separation unit 21 (step S402).
The codebook selection information decoding unit 23 decodes the codebook selection information (step S403).
The prediction coefficient decoding unit 24 reproduces the prediction coefficients using the selected codebooks (step S404).
The prediction decoding unit 25 reproduces the center channel frequency signal based on the stereo frequency signal and the prediction coefficients (step S405). The prediction decoding unit 25 then upmixes the stereo frequency signal and the center channel frequency signal to reproduce the three-channel frequency signals that were obtained by downmixing the original 5.1ch frequency signals.

Meanwhile, the spatial information decoding unit 26 reproduces the spatial information by decoding the encoded spatial information received from the separation unit 21 (step S406). The upmix unit 27 then reproduces the 5.1ch frequency signals by upmixing the three-channel frequency signals based on the spatial information (step S407).
The frequency-time conversion unit 28 frequency-time converts the frequency signal of each channel to reproduce the 5.1ch audio signal (step S408).
The audio decoding device then ends the audio decoding process.

A computer program that causes a computer to implement the functions of the units of the audio encoding device according to the above embodiment or its modifications may be provided in a form stored on a recording medium such as a semiconductor memory, a magnetic recording medium, or an optical recording medium. Similarly, a computer program that causes a computer to implement the functions of the units of the audio decoding device according to the above embodiment or its modifications may be provided in a form stored on a recording medium such as a semiconductor memory, a magnetic recording medium, or an optical recording medium.

The audio encoding device according to the above embodiment or its modifications is implemented in various devices used to transmit or record audio signals, such as computers, video signal recorders, and video transmission devices. The audio decoding device according to the above embodiment or its modifications is implemented in various devices used to reproduce audio signals, such as computers and video signal players.

All examples and specific terms recited herein are intended for pedagogical purposes, to aid the reader in understanding the invention and the concepts contributed by the inventors to furthering the art. They are to be construed as being without limitation to such specifically recited examples and conditions, and the organization of such examples in this specification does not relate to a showing of the superiority or inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations can be made to them without departing from the spirit and scope of the invention.

DESCRIPTION OF SYMBOLS
1 Audio encoding device
11 Time-frequency conversion unit
12 First downmix unit
13 Second downmix unit
14 Prediction encoding unit
141 Prediction error shape determination unit
142 Minimum error prediction coefficient calculation unit
143 Codebook selection unit
144 Prediction coefficient encoding unit
15 Spatial information encoding unit
16 Channel signal encoding unit
161 SBR encoding unit
162 Frequency-time conversion unit
163 AAC encoding unit
17 Multiplexing unit
2 Audio decoding device
21 Separation unit
22 Channel signal decoding unit
23 Codebook selection information decoding unit
24 Prediction coefficient decoding unit
25 Prediction decoding unit
26 Spatial information decoding unit
27 Upmix unit
28 Frequency-time conversion unit

Claims

An audio encoding device that predictively encodes a signal of a third channel among a plurality of channels included in an audio signal, based on a signal of a first channel and a signal of a second channel among the plurality of channels, a first prediction coefficient by which the signal of the first channel is multiplied, and a second prediction coefficient by which the signal of the second channel is multiplied, the audio encoding device comprising:
a minimum error prediction coefficient calculation unit that calculates a set of first values of the first and second prediction coefficients at which an error between the signal of the third channel and a predicted value of the signal of the third channel is minimized, the predicted value being a linear sum of a value obtained by multiplying the signal of the first channel by the first prediction coefficient and a value obtained by multiplying the signal of the second channel by the second prediction coefficient;
a codebook selection unit that selects a codebook for the other of the first and second prediction coefficients when either one of the first and second prediction coefficients does not affect the minimum value of the error, or when the first value of one of the prediction coefficients included in the set is not within a range of quantization values including a plurality of quantization values defined in a codebook for that prediction coefficient, and that selects the codebook for each of the first and second prediction coefficients when both the first and second prediction coefficients affect the minimum value of the error and the first value of the first prediction coefficient and the first value of the second prediction coefficient included in the set are each within the range of quantization values including the plurality of quantization values defined in the codebook for the respective prediction coefficient; and
a prediction coefficient encoding unit that, for each of the first and second prediction coefficients for which the codebook is selected, obtains a quantization value that minimizes the error from among the plurality of quantization values defined in that codebook,
and obtains an encoded prediction coefficient by encoding that quantization value.
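
As an illustration only, the flow recited in the claim above (least-squares minimum, codebook selection, quantization) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the signals are assumed real-valued, a single hypothetical codebook is shared by both coefficients, an out-of-range coefficient is pinned to the nearest end of the range, and the degenerate case in which one coefficient does not affect the minimum is not handled. All function names are hypothetical.

```python
def correlations(l, r, c0):
    """Sums that define the quadratic error surface E(c1, c2)."""
    ll = sum(x * x for x in l)
    rr = sum(y * y for y in r)
    lr = sum(x * y for x, y in zip(l, r))
    lc = sum(x * z for x, z in zip(l, c0))
    rc = sum(y * z for y, z in zip(r, c0))
    cc = sum(z * z for z in c0)
    return ll, rr, lr, lc, rc, cc


def error(c1, c2, s):
    """E(c1, c2) = sum((c0 - c1*l - c2*r)**2), expanded in terms of the sums."""
    ll, rr, lr, lc, rc, cc = s
    return (cc - 2 * c1 * lc - 2 * c2 * rc
            + c1 * c1 * ll + 2 * c1 * c2 * lr + c2 * c2 * rr)


def encode_coefficients(l, r, c0, codebook, eps=1e-12):
    """Return (selected codebooks, quantized c1, quantized c2)."""
    s = correlations(l, r, c0)
    ll, rr, lr, lc, rc, _ = s
    det = ll * rr - lr * lr
    if det <= eps:
        # Assumption: the degenerate surface is outside this sketch.
        raise ValueError("degenerate error surface: not handled in this sketch")
    # Unconstrained minimum from the normal equations (Cramer's rule).
    c1 = (lc * rr - rc * lr) / det
    c2 = (rc * ll - lc * lr) / det
    lo, hi = min(codebook), max(codebook)
    if not lo <= c1 <= hi:
        # c1 is out of range: pin it to the range limit, search c2 only.
        selected = ("c2",)
        cand1, cand2 = [min(max(c1, lo), hi)], codebook
    elif not lo <= c2 <= hi:
        selected = ("c1",)
        cand1, cand2 = codebook, [min(max(c2, lo), hi)]
    else:
        selected = ("c1", "c2")
        cand1, cand2 = codebook, codebook
    # Exhaustive search for the candidate pair with the smallest error.
    q1, q2 = min(((a, b) for a in cand1 for b in cand2),
                 key=lambda p: error(p[0], p[1], s))
    return selected, q1, q2
```

The returned tuple names the codebooks that were selected, so a caller can see that only one quantization index needs to be transmitted when the other coefficient is pinned to a range limit.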

The audio encoding device according to claim 1, further comprising an error distribution shape determination unit that determines, based on the signals of the first, second, and third channels, whether the distribution shape of the error is an elliptic paraboloid or a parabolic cylinder,
wherein the minimum error prediction coefficient calculation unit calculates the set of first values of the first and second prediction coefficients that minimizes the error in accordance with an elliptic paraboloid equation having the first and second prediction coefficients as variables when the distribution shape of the error is an elliptic paraboloid, and in accordance with a parabolic cylinder equation having the first and second prediction coefficients as variables when the distribution shape of the error is a parabolic cylinder.
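
For orientation, the two error shapes named in this claim can be related to a single discriminant. The notation below is an assumption introduced for illustration (real-valued signals $l$, $r$, $c_0$ for the first, second, and third channels, and the correlation sum $f$); it is not taken from the claims.

```latex
\begin{aligned}
E(c_1, c_2) &= \sum_k \bigl(c_0[k] - c_1\, l[k] - c_2\, r[k]\bigr)^2 \\
            &= f(c_0,c_0) - 2c_1 f(l,c_0) - 2c_2 f(r,c_0)
               + c_1^2 f(l,l) + 2c_1 c_2 f(l,r) + c_2^2 f(r,r),
\qquad f(x,y) = \sum_k x[k]\, y[k], \\[4pt]
D &= f(l,l)\, f(r,r) - f(l,r)^2 \;\ge\; 0 \quad \text{(Cauchy--Schwarz)}. \\[4pt]
D > 0 &: \ \text{elliptic paraboloid, with the unique minimum at} \\
c_1 &= \frac{f(l,c_0)\, f(r,r) - f(r,c_0)\, f(l,r)}{D}, \qquad
c_2  = \frac{f(r,c_0)\, f(l,l) - f(l,c_0)\, f(l,r)}{D}. \\[4pt]
D = 0 &: \ \text{parabolic cylinder (if } f(l,l) + f(r,r) > 0\text{), with the minimum attained along the line} \\
      &\quad f(l,l)\, c_1 + f(l,r)\, c_2 = f(l,c_0), \qquad f(l,r)\, c_1 + f(r,r)\, c_2 = f(r,c_0).
\end{aligned}
```

When $D = 0$ the two normal equations are linearly dependent, so they describe one line (or hold for every value of one coefficient if the corresponding channel has zero energy), which is why a single coefficient can then be left without a codebook.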

The audio encoding device according to claim 2, wherein, when the distribution shape of the error is an elliptic paraboloid and the first value of the first prediction coefficient included in the set is larger than the upper limit of the range of quantization values for the first prediction coefficient, the prediction coefficient encoding unit obtains, in accordance with the elliptic paraboloid equation representing the distribution shape of the error, a second value of the second prediction coefficient that minimizes the error when the first prediction coefficient is set to the upper limit of the range of quantization values, selects the quantization value closest to the second value from among the plurality of quantization values defined in the selected codebook for the second prediction coefficient, and encodes the selected quantization value.
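
The constrained step in this claim has a closed form when the signals are real-valued: with the first coefficient held fixed, the error is a parabola in the second coefficient. The sketch below assumes that setting and uses hypothetical function names and a hypothetical codebook; it is an illustration, not the claimed implementation.

```python
def conditional_c2(l, r, c0, c1_fixed):
    """Value of c2 that minimizes sum((c0 - c1_fixed*l - c2*r)**2)."""
    rr = sum(y * y for y in r)
    lr = sum(x * y for x, y in zip(l, r))
    rc = sum(y * z for y, z in zip(r, c0))
    if rr == 0:
        raise ValueError("second channel has no energy: c2 is undetermined")
    # Setting dE/dc2 = 0 gives rc - c1_fixed*lr - c2*rr = 0.
    return (rc - c1_fixed * lr) / rr


def nearest_quantization_value(codebook, value):
    """Codebook entry closest to value."""
    return min(codebook, key=lambda q: abs(q - value))
```

Because the one-dimensional error is symmetric about its minimum, the closest codebook entry is also the entry with the smallest error, which is why a nearest-value search suffices here.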

The audio encoding device according to claim 2, wherein, when the distribution shape of the error is an elliptic paraboloid and the codebook for the first prediction coefficient is not selected, the prediction coefficient encoding unit encodes, in accordance with the first value of the first prediction coefficient included in the set, information representing either the upper limit or the lower limit of the range of quantization values for the first prediction coefficient.

The audio encoding device according to any one of claims 1 to 4, wherein, when the distribution shape of the error is a parabolic cylinder and the first prediction coefficient does not affect the minimum value of the error, the prediction coefficient encoding unit encodes the quantization value closest to the first value of the second prediction coefficient included in the set, from among the plurality of quantization values defined in the selected codebook for the second prediction coefficient.

An audio encoding method for predictively encoding a signal of a third channel among a plurality of channels included in an audio signal, based on a signal of a first channel and a signal of a second channel among the plurality of channels, a first prediction coefficient by which the signal of the first channel is multiplied, and a second prediction coefficient by which the signal of the second channel is multiplied, the audio encoding method comprising:
calculating a set of first values of the first and second prediction coefficients at which an error between the signal of the third channel and a predicted value of the signal of the third channel is minimized, the predicted value being a linear sum of a value obtained by multiplying the signal of the first channel by the first prediction coefficient and a value obtained by multiplying the signal of the second channel by the second prediction coefficient;
selecting a codebook for the other of the first and second prediction coefficients when either one of the first and second prediction coefficients does not affect the minimum value of the error, or when the first value of one of the prediction coefficients included in the set is not within a range of quantization values including a plurality of quantization values defined in a codebook for that prediction coefficient, and selecting the codebook for each of the first and second prediction coefficients when both the first and second prediction coefficients affect the minimum value of the error and the first value of the first prediction coefficient and the first value of the second prediction coefficient included in the set are each within the range of quantization values including the plurality of quantization values defined in the codebook for the respective prediction coefficient; and
obtaining, for each of the first and second prediction coefficients for which the codebook is selected, a quantization value that minimizes the error from among the plurality of quantization values defined in that codebook,
and obtaining an encoded prediction coefficient by encoding that quantization value.

An audio encoding computer program that causes a computer to predictively encode a signal of a third channel among a plurality of channels included in an audio signal, based on a signal of a first channel and a signal of a second channel among the plurality of channels, a first prediction coefficient by which the signal of the first channel is multiplied, and a second prediction coefficient by which the signal of the second channel is multiplied, the program causing the computer to execute:
calculating a set of first values of the first and second prediction coefficients at which an error between the signal of the third channel and a predicted value of the signal of the third channel is minimized, the predicted value being a linear sum of a value obtained by multiplying the signal of the first channel by the first prediction coefficient and a value obtained by multiplying the signal of the second channel by the second prediction coefficient;
selecting a codebook for the other of the first and second prediction coefficients when either one of the first and second prediction coefficients does not affect the minimum value of the error, or when the first value of one of the prediction coefficients included in the set is not within a range of quantization values including a plurality of quantization values defined in a codebook for that prediction coefficient, and selecting the codebook for each of the first and second prediction coefficients when both the first and second prediction coefficients affect the minimum value of the error and the first value of the first prediction coefficient and the first value of the second prediction coefficient included in the set are each within the range of quantization values including the plurality of quantization values defined in the codebook for the respective prediction coefficient; and
obtaining, for each of the first and second prediction coefficients for which the codebook is selected, a quantization value that minimizes the error from among the plurality of quantization values defined in that codebook,
and obtaining an encoded prediction coefficient by encoding that quantization value.

An audio decoding device that decodes an audio signal from encoded audio data that contains, in accordance with a predetermined data format, encoded channel signal data obtained by encoding signals of first and second channels among a plurality of channels included in the audio signal, encoded prediction coefficients obtained by encoding first and second prediction coefficients for predicting a signal of a third channel based on the signals of the first and second channels, and codebook selection information representing the codebook selected from among a first codebook that defines a plurality of quantization values for the first prediction coefficient and a second codebook that defines a plurality of quantization values for the second prediction coefficient, the audio decoding device comprising:
a separation unit that extracts, in accordance with the data format, the encoded channel signal data, the encoded prediction coefficients, and the codebook selection information from the encoded audio data;
a channel signal decoding unit that reproduces the signals of the first and second channels by decoding the encoded channel signal data;
a prediction coefficient decoding unit that reproduces the first and second prediction coefficients by identifying the quantization value corresponding to the encoded prediction coefficient from among the plurality of quantization values defined in whichever of the first and second codebooks the codebook selection information indicates as selected; and
a predictive decoding unit that obtains a first value by multiplying the signal of the first channel by the reproduced first prediction coefficient, obtains a second value by multiplying the signal of the second channel by the reproduced second prediction coefficient,
and reproduces the sum of the first value and the second value as the signal of the third channel.
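
The decoder-side steps recited above (codebook lookup, then a weighted sum) can be illustrated with a short sketch. Everything here is hypothetical: the function names, the flag and index arguments standing in for the codebook selection information and encoded prediction coefficient, and the `fallback` value used for a coefficient whose codebook was not selected. Real-valued signals are assumed.

```python
def decode_coefficient(selected, index, codebook, fallback):
    """Look up the quantization value if this codebook was selected.

    `fallback` is a hypothetical stand-in for whatever value the
    bitstream signals for a coefficient that has no codebook index.
    """
    return codebook[index] if selected else fallback


def predict_third_channel(l, r, c1, c2):
    """Reproduce the third channel as c1 * l + c2 * r, sample by sample."""
    return [c1 * x + c2 * y for x, y in zip(l, r)]
```

For instance, with a codebook whose entry 25 is 0.5, a selected first coefficient and an unselected second coefficient that falls back to 3.0 would reproduce the third channel as `0.5 * l + 3.0 * r`.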

An audio decoding method for decoding an audio signal from encoded audio data that contains, in accordance with a predetermined data format, encoded channel signal data obtained by encoding signals of first and second channels among a plurality of channels included in the audio signal, encoded prediction coefficients obtained by encoding first and second prediction coefficients for predicting a signal of a third channel based on the signals of the first and second channels, and codebook selection information representing the codebook selected from among a first codebook that defines a plurality of quantization values for the first prediction coefficient and a second codebook that defines a plurality of quantization values for the second prediction coefficient, the audio decoding method comprising:
extracting, in accordance with the data format, the encoded channel signal data, the encoded prediction coefficients, and the codebook selection information from the encoded audio data;
reproducing the signals of the first and second channels by decoding the encoded channel signal data;
reproducing the first and second prediction coefficients by identifying the quantization value corresponding to the encoded prediction coefficient from among the plurality of quantization values defined in whichever of the first and second codebooks the codebook selection information indicates as selected; and
obtaining a first value by multiplying the signal of the first channel by the reproduced first prediction coefficient, obtaining a second value by multiplying the signal of the second channel by the reproduced second prediction coefficient,
and reproducing the sum of the first value and the second value as the signal of the third channel.

An audio decoding computer program that causes a computer to decode an audio signal from encoded audio data that contains, in accordance with a predetermined data format, encoded channel signal data obtained by encoding signals of first and second channels among a plurality of channels included in the audio signal, encoded prediction coefficients obtained by encoding first and second prediction coefficients for predicting a signal of a third channel based on the signals of the first and second channels, and codebook selection information representing the codebook selected from among a first codebook that defines a plurality of quantization values for the first prediction coefficient and a second codebook that defines a plurality of quantization values for the second prediction coefficient, the program causing the computer to execute:
extracting, in accordance with the data format, the encoded channel signal data, the encoded prediction coefficients, and the codebook selection information from the encoded audio data;
reproducing the signals of the first and second channels by decoding the encoded channel signal data;
reproducing the first and second prediction coefficients by identifying the quantization value corresponding to the encoded prediction coefficient from among the plurality of quantization values defined in whichever of the first and second codebooks the codebook selection information indicates as selected; and
obtaining a first value by multiplying the signal of the first channel by the reproduced first prediction coefficient, obtaining a second value by multiplying the signal of the second channel by the reproduced second prediction coefficient,
and reproducing the sum of the first value and the second value as the signal of the third channel.