JP6220701B2

JP6220701B2 - Sample sequence generation method, encoding method, decoding method, apparatus and program thereof

Info

Publication number: JP6220701B2
Application number: JP2014037101A
Authority: JP
Inventors: 健弘守谷; 優鎌本; 登原田; 弘和亀岡; 亮介杉浦
Original assignee: Nippon Telegraph and Telephone Corp; University of Tokyo NUC
Current assignee: Nippon Telegraph and Telephone Corp; University of Tokyo NUC
Priority date: 2014-02-27
Filing date: 2014-02-27
Publication date: 2017-10-25
Anticipated expiration: 2034-02-27
Also published as: JP2015161810A

Description

The present invention relates to signal processing technology such as sound signal coding, and in particular to a technique for generating, from a frequency-domain sample sequence derived from a sound signal, a sequence in which the frequency-domain spacing of the sample points of that sample sequence has been expanded or contracted.

Adaptive coding of orthogonal transform coefficients in the frequency domain, such as DFT (discrete Fourier transform) or MDCT (modified discrete cosine transform) coefficients, is known as a method for coding sound signals at low bit rates (for example, about 10 kbit/s to 20 kbit/s). For example, MPEG USAC (Unified Speech and Audio Coding), a standardized technology, has a TCX (transform coded excitation) coding mode, in which the MDCT coefficients are normalized frame by frame, quantized, and then variable-length coded (see, for example, Non-Patent Document 1).

FIG. 1 shows a configuration example of a conventional TCX-based encoding device. Each part of FIG. 1 is described below.

<Frequency domain transform unit 11>
A time-domain sound signal is input to the frequency domain transform unit 11. The sound signal is, for example, a speech signal or an acoustic signal.

The frequency domain transform unit 11 converts the input time-domain sound signal, in units of frames of a predetermined time length, into an N-point MDCT coefficient sequence X(0),X(1),…,X(N-1) in the frequency domain. N is a positive integer.

The resulting MDCT coefficient sequence X(0),X(1),…,X(N-1) is output to the envelope normalization unit 14.

<Linear prediction analysis unit 12>
A time-domain sound signal is input to the linear prediction analysis unit 12.

The linear prediction analysis unit 12 generates linear prediction coefficients α₁,α₂,…,α_p by performing linear prediction analysis on the sound signal input in units of frames. It also encodes the generated linear prediction coefficients α₁,α₂,…,α_p to generate a linear prediction coefficient code. An example of the linear prediction coefficient code is LSP (Line Spectrum Pairs). p is an integer of 2 or more.

The linear prediction analysis unit 12 also generates quantized linear prediction coefficients ^α₁,^α₂,…,^α_p, which are the linear prediction coefficients corresponding to the generated linear prediction coefficient code.

The generated quantized linear prediction coefficients ^α₁,^α₂,…,^α_p are output to the power spectrum envelope sequence generation unit 13. The generated linear prediction coefficient code is output to the decoding device.

<Power spectrum envelope sequence generation unit 13>
The quantized linear prediction coefficients ^α₁,^α₂,…,^α_p generated by the linear prediction analysis unit 12 are input to the power spectrum envelope sequence generation unit 13.

The power spectrum envelope sequence generation unit 13 uses the quantized linear prediction coefficients ^α₁,^α₂,…,^α_p to generate the smoothed power spectrum envelope sequence ^W_γ(0),^W_γ(1),…,^W_γ(N-1) defined by equation (P1). Here, exp(·) is the exponential function with Napier's number as its base, where · is a real number; j is the imaginary unit; and σ² is the prediction residual energy. γ is a positive constant not greater than 1. It is a coefficient that blunts the amplitude peaks and valleys of the power spectrum envelope sequence W(0),W(1),…,W(N-1) defined by equation (P1'), in other words, a coefficient that smooths the power spectrum envelope sequence.
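Equations (P1) and (P1') do not appear in this text. As a hedged reconstruction only, the usual all-pole envelope with bandwidth expansion is consistent with the symbols defined above (exp, j, σ², γ and the quantized coefficients). The 1/(2π) factor, the sign convention inside the sum, and the frequency sampling ω_k for bin k are assumptions, not taken from the source:

```latex
% Assumed form of (P1): smoothed envelope, each coefficient scaled by gamma^n
\hat{W}_\gamma(k) = \frac{\sigma^2}{2\pi}\,
  \frac{1}{\left| 1 + \sum_{n=1}^{p} \hat{\alpha}_n \gamma^{n} \exp(-j n \omega_k) \right|^{2}},
  \qquad k = 0, 1, \ldots, N-1

% Assumed form of (P1'): the unsmoothed envelope, i.e. the case gamma = 1
W(k) = \frac{\sigma^2}{2\pi}\,
  \frac{1}{\left| 1 + \sum_{n=1}^{p} \hat{\alpha}_n \exp(-j n \omega_k) \right|^{2}}
```

Under this form, choosing γ < 1 moves the poles of the all-pole model toward the origin, which flattens the peaks of the envelope; γ = 1 gives back the unsmoothed envelope.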

The generated smoothed power spectrum envelope sequence ^W_γ(0),^W_γ(1),…,^W_γ(N-1) is output to the envelope normalization unit 14.

<Envelope normalization unit 14>
The MDCT coefficient sequence X(0),X(1),…,X(N-1) generated by the frequency domain transform unit 11 and the smoothed power spectrum envelope sequence ^W_γ(0),^W_γ(1),…,^W_γ(N-1) output by the power spectrum envelope sequence generation unit 13 are input to the envelope normalization unit 14.

The envelope normalization unit 14 generates the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) by normalizing each coefficient X(i) of the MDCT coefficient sequence by the square root of the corresponding value ^W_γ(i) of the smoothed power spectrum envelope sequence. That is, X_N(i) = X(i)/sqrt(^W_γ(i)) [i=0,1,…,N-1], where sqrt(·) denotes the square root of ·.
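The normalization rule X_N(i) = X(i)/sqrt(^W_γ(i)) can be sketched in a few lines. This is a minimal illustration, not the codec's implementation; the function names and the sample values are hypothetical.

```python
import math


def normalize_by_envelope(mdct, envelope):
    """Return X_N(i) = X(i) / sqrt(W(i)) for every coefficient index i."""
    if len(mdct) != len(envelope):
        raise ValueError("both sequences must have the same length N")
    return [x / math.sqrt(w) for x, w in zip(mdct, envelope)]


def denormalize_by_envelope(normalized, envelope):
    """Inverse operation, as a decoder would apply it: X(i) = X_N(i) * sqrt(W(i))."""
    return [x * math.sqrt(w) for x, w in zip(normalized, envelope)]


# Hypothetical 4-point frame: large low-frequency coefficients, decaying envelope.
X = [8.0, -6.0, 3.0, 1.0]
W = [16.0, 9.0, 4.0, 1.0]
X_N = normalize_by_envelope(X, W)
X_back = denormalize_by_envelope(X_N, W)
```

After normalization the coefficients have a much flatter magnitude profile than the input, which is the property the encoder relies on.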

The generated normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) is output to the encoding unit 15.

Here, to achieve quantization with small perceptual distortion, the envelope normalization unit 14 normalizes the MDCT coefficient sequence X(0),X(1),…,X(N-1) frame by frame using the smoothed power spectrum envelope sequence ^W_γ(0),^W_γ(1),…,^W_γ(N-1), which is a blunted version of the power spectrum envelope sequence.

As a result, the generated normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) does not have as large an amplitude slope or as much amplitude unevenness as the input MDCT coefficient sequence X(0),X(1),…,X(N-1), but it retains a magnitude relationship similar to that of the power spectrum envelope sequence of the input sound signal. That is, it has somewhat larger amplitudes in the region of coefficients corresponding to low frequencies, and it has a fine structure caused by the pitch period.

<Encoding unit 15>
The normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) generated by the envelope normalization unit 14 is input to the encoding unit 15.

The encoding unit 15 generates a code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1).

The generated code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) is output to the decoding device.

Each coefficient of the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) is divided by a gain (global gain) g, and the results are quantized to integer values to give the quantized normalized coefficient sequence X_Q(0),X_Q(1),…,X_Q(N-1). The code obtained by encoding this sequence is called the integer signal code. In the technique of Non-Patent Document 1, the encoding unit 15 determines a gain g such that the number of bits of the integer signal code is no greater than the allocated bit count B, which is the number of bits allocated in advance, and is as large as possible. The encoding unit 15 then generates a gain code corresponding to the determined gain g and the integer signal code corresponding to the determined gain g.
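The gain search described above can be illustrated with a bisection over g. This is a sketch under stated assumptions: `estimate_bits` is a hypothetical stand-in cost (it grows with the magnitude of each integer) rather than the variable-length code actually used by USAC, and the rounding rule and search bounds are likewise illustrative.

```python
def quantize(coeffs, gain):
    """Divide by the gain and round each magnitude to the nearest integer."""
    out = []
    for x in coeffs:
        mag = int(abs(x) / gain + 0.5)
        out.append(mag if x >= 0 else -mag)
    return out


def estimate_bits(quantized):
    """Hypothetical bit cost: 1 bit for zero, more bits for larger magnitudes."""
    return sum(2 * abs(v).bit_length() + 1 for v in quantized)


def search_gain(coeffs, budget_bits, iterations=30):
    """Find (approximately) the smallest gain whose bit cost fits the budget.

    The cost is non-increasing in the gain, so the smallest feasible gain
    also gives the largest bit count that does not exceed the budget.
    """
    lo = 1e-6
    hi = max(max(abs(x) for x in coeffs), lo) * 4.0  # quantizes everything to 0
    if estimate_bits(quantize(coeffs, hi)) > budget_bits:
        raise ValueError("budget too small even for an all-zero sequence")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if estimate_bits(quantize(coeffs, mid)) <= budget_bits:
            hi = mid  # feasible: try a smaller gain
        else:
            lo = mid  # too many bits: the gain must grow
    return hi


X_N = [3.2, -1.1, 0.4, 0.05, -2.7, 0.9]  # hypothetical normalized coefficients
B = 20                                   # hypothetical allocated bit count
g = search_gain(X_N, B)
X_Q = quantize(X_N, g)
```

Bisection works here only because the assumed cost is monotone in g; a real entropy coder may need a more careful search.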

The generated gain code and integer signal code are output to the decoding device as the code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1).

As described above, conventional TCX-based coding normalizes the MDCT coefficient sequence using a smoothed power spectrum envelope sequence, in which the power spectrum envelope has been blunted, and then encodes the normalized MDCT coefficient sequence. This coding method is adopted in the above-mentioned MPEG-4 USAC and elsewhere.

M. Neuendorf, et al., “MPEG Unified Speech and Audio Coding - The ISO/MPEG Standard for High-Efficiency Audio Coding of all Content Types,” AES 132nd Convention, Budapest, Hungary, 2012.

A power spectrum envelope obtained from linear prediction coefficients represents the original spectrum at a resolution of roughly (number of signal samples)/(linear prediction order), and this resolution is uniform across the frequency domain. In other words, the power spectrum envelope sequence used to normalize the MDCT coefficient sequence in conventional TCX-based coding such as MPEG USAC consists of envelope values discretized at equal intervals along the frequency axis, that is, at a uniform resolution in the frequency direction (hereinafter also called "linear discretization").

In ordinary speech and music signals, energy is often concentrated in a particular frequency region (for example, the low-frequency region), and the power spectrum envelope tends to vary strongly in the region where energy is concentrated. If envelope values discretized at a uniform resolution over the entire frequency range are used, the frequency resolution is insufficient in the region where energy is concentrated, and the resulting power spectrum envelope sequence may fail to represent the amplitude variations of the original power spectrum envelope with sufficient accuracy. When the MDCT coefficient sequence is normalized with such a power spectrum envelope sequence, the difference between the MDCT coefficient sequence and the power spectrum envelope sequence grows where the resolution is insufficient, the values of the normalized MDCT coefficient sequence vary more widely, and coding efficiency may fall.

Increasing the prediction order raises the resolution even under linear discretization, but it also increases the amount of parameter information, which may lower coding efficiency. Increasing the order only for particular frames may complicate processing, because processing must remain continuous across frames.

This is not limited to coding. In sound signal processing in general, using a sample sequence obtained by discretizing the sound signal at a non-uniform resolution in the frequency direction as the frequency-domain sample sequence derived from the sound signal can improve the accuracy of the signal processing.

In view of this technical background, an object of the present invention is to provide a technique that improves coding efficiency with only a small increase in the amount of computation. A further object is to provide a technique that improves the accuracy of signal processing other than coding with only a small increase in the amount of computation.

A sample sequence generation method according to one aspect of the present invention (hereinafter, the first sample sequence generation method) has a sample sequence generation step that takes a frequency-domain sample sequence derived from a sound signal for each predetermined time interval as ξ(0),ξ(1),…,ξ(N-1) and, using a predetermined N×N sparse matrix U, generates ~ξ(0),~ξ(1),…,~ξ(N-1) defined by the following equation.

The sparse matrix U is a matrix in which only components near the diagonal are nonzero, and the number of nonzero elements in its upper triangular part is smaller than the number of nonzero elements in its lower triangular part. The sample point sequence corresponding to ξ(0),ξ(1),…,ξ(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.
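A banded matrix of this kind can be applied without ever storing its zeros. The sketch below is an illustration only: it assumes U holds linear-interpolation weights at hypothetical warped positions that never exceed the row index, which is one way to obtain a matrix with more nonzero elements below the diagonal than above it. The patent's actual U is shown in its figures and is not reproduced here.

```python
def interpolation_rows(positions):
    """Build sparse rows of (column, weight) pairs by linear interpolation.

    positions[i] is the (hypothetical) fractional input index read by row i.
    """
    rows = []
    for pos in positions:
        k = int(pos)
        frac = pos - k
        row = [(k, 1.0 - frac)]
        if frac > 0.0:
            row.append((k + 1, frac))
        rows.append(row)
    return rows


def apply_sparse(rows, xi):
    """Compute the product U * xi using only the stored nonzero entries."""
    return [sum(w * xi[k] for k, w in row) for row in rows]


def count_triangles(rows):
    """Count nonzero entries strictly above and strictly below the diagonal."""
    upper = sum(1 for i, row in enumerate(rows) for k, w in row if k > i and w != 0)
    lower = sum(1 for i, row in enumerate(rows) for k, w in row if k < i and w != 0)
    return upper, lower


# Hypothetical warp for N = 4: low indices are sampled more densely.
U_rows = interpolation_rows([0.0, 0.5, 1.25, 3.0])
xi = [4.0, 2.0, 1.0, 0.5]
xi_warped = apply_sparse(U_rows, xi)
upper_count, lower_count = count_triangles(U_rows)
```

Each output sample costs at most two multiply-adds here, which is the kind of small computational overhead a near-diagonal sparse matrix allows.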

A sample sequence generation method according to one aspect of the present invention (hereinafter, the second sample sequence generation method) has a sample sequence generation step that takes a frequency-domain sample sequence derived from a sound signal for each predetermined time interval as ξ(0),ξ(1),…,ξ(N-1) and, using a predetermined M×N non-negative matrix U, generates ~ξ(0),~ξ(1),…,~ξ(M-1) defined by the following equation.

With U[i,k] denoting the component in row i and column k of the non-negative matrix U, and with g_i defined for each i=1,2,…,M by the following expression,

the matrix satisfies g₁<g₂<…<g_N, and the sample point sequence corresponding to ξ(0),ξ(1),…,ξ(N-1) or to ~ξ(0),~ξ(1),…,~ξ(M-1) is a sample point sequence equally spaced in the frequency direction.
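The expression defining g_i is missing from this text. A later figure description refers to the centroid position of the transformation matrix U, so the sketch below assumes, without confirmation from the source, that g_i is the weighted centroid of the column indices of row i, i.e. the sum of k·U[i,k] over k divided by the sum of U[i,k] over k, with columns numbered from 1. The example matrix is hypothetical.

```python
def row_centroids(matrix):
    """Assumed definition: g_i = sum_k k * U[i,k] / sum_k U[i,k], k starting at 1."""
    centroids = []
    for row in matrix:
        total = sum(row)
        if total <= 0:
            raise ValueError("each row needs a positive total weight")
        weighted = sum(k * u for k, u in enumerate(row, start=1))
        centroids.append(weighted / total)
    return centroids


def strictly_increasing(values):
    """True when every value is smaller than the one that follows it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical 3x6 non-negative matrix: later rows cover higher, wider bands.
U = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.25, 0.5, 0.25],
]
g = row_centroids(U)
ordered = strictly_increasing(g)
```

Under this reading, the ordering condition says that successive output samples are centred on successively higher input frequencies, so the transform preserves frequency order while allowing the spacing to vary.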

A sample sequence generation method according to one aspect of the present invention (hereinafter, the third sample sequence generation method) has a sample sequence generation step that takes a frequency-domain sample sequence derived from a sound signal in a predetermined time interval as ~η(0),~η(1),…,~η(N-1) and, using a predetermined N×N sparse matrix V, generates η(0),η(1),…,η(N-1) defined by the following equation.

The sparse matrix V is a matrix in which only components near the diagonal are nonzero, and the number of nonzero elements in its upper triangular part is larger than the number of nonzero elements in its lower triangular part. The sample point sequence corresponding to η(0),η(1),…,η(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

An encoding method according to one aspect of the present invention has: a stretched pseudo power spectrum sequence generation step of using the first sample sequence generation method, with a sequence corresponding to the power of the sample sequence obtained by converting the sound signal for each predetermined time interval into the frequency domain taken as ξ(0),ξ(1),…,ξ(N-1), to generate ~ξ(0),~ξ(1),…,~ξ(N-1) as the stretched pseudo power spectrum sequence ~Y(0),~Y(1),…,~Y(N-1); a linear prediction analysis step of, with p a positive integer, performing linear prediction analysis on ~X(0),~X(1),…,~X(N-1), the sequence obtained by converting the stretched pseudo power spectrum sequence ~Y(0),~Y(1),…,~Y(N-1) into the time domain, to generate quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; a stretched power spectrum envelope sequence generation step of generating the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1), a frequency-domain sequence corresponding to the quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; an inverse stretch transform step of using the third sample sequence generation method, with the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) taken as ~η(0),~η(1),…,~η(N-1), to generate η(0),η(1),…,η(N-1) as the power spectrum envelope sequence W(0),W(1),…,W(N-1); an envelope normalization step of generating a normalized frequency-domain sample sequence by normalizing the frequency-domain sample sequence X(0),X(1),…,X(N-1) using the power spectrum envelope sequence W(0),W(1),…,W(N-1); and an encoding step of encoding the normalized frequency-domain sample sequence to generate a code corresponding to the normalized frequency-domain sample sequence.

A decoding method according to one aspect of the present invention has: a stretched linear prediction coefficient decoding step of, with p a positive integer, decoding an input stretched linear prediction coefficient code to generate quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; a stretched power spectrum envelope sequence generation step of generating the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1), a frequency-domain sequence corresponding to the quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; an inverse stretch transform step of using the third sample sequence generation method, with the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) taken as ~η(0),~η(1),…,~η(N-1), to generate η(0),η(1),…,η(N-1) as the power spectrum envelope sequence W(0),W(1),…,W(N-1); a decoding step of decoding an input code corresponding to the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1) to generate the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1); and an envelope denormalization step of generating the frequency-domain sample sequence X(0),X(1),…,X(N-1) by denormalizing the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1) using the power spectrum envelope sequence W(0),W(1),…,W(N-1).

A sample sequence generation device according to one aspect of the present invention (hereinafter, the first sample sequence generation device) includes a sample sequence generation unit that takes a frequency-domain sample sequence derived from a sound signal for each predetermined time interval as ξ(0),ξ(1),…,ξ(N-1) and, using a predetermined N×N sparse matrix U, generates ~ξ(0),~ξ(1),…,~ξ(N-1) defined by the following equation.

The sparse matrix U is a matrix in which only components near the diagonal are nonzero, and the number of nonzero elements in its upper triangular part is smaller than the number of nonzero elements in its lower triangular part. The sample point sequence corresponding to ξ(0),ξ(1),…,ξ(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

A sample sequence generation device according to one aspect of the present invention (hereinafter, the third sample sequence generation device) includes a sample sequence generation unit that takes a frequency-domain sample sequence derived from a sound signal for each predetermined time interval as ~η(0),~η(1),…,~η(M-1) and, using a predetermined N×N sparse matrix V, generates η(0),η(1),…,η(N-1) defined by the following equation.

An encoding device according to one aspect of the present invention includes: a stretched pseudo power spectrum sequence generation unit, which is a first sample sequence generation device, that takes a sequence corresponding to the power of the sample sequence obtained by converting the sound signal for each predetermined time interval into the frequency domain as ξ(0),ξ(1),…,ξ(N-1) and generates ~ξ(0),~ξ(1),…,~ξ(N-1) as the stretched pseudo power spectrum sequence ~Y(0),~Y(1),…,~Y(N-1); a linear prediction analysis unit that, with p a positive integer, performs linear prediction analysis on ~X(0),~X(1),…,~X(N-1), the sequence obtained by converting the stretched pseudo power spectrum sequence ~Y(0),~Y(1),…,~Y(N-1) into the time domain, to generate quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; a stretched power spectrum envelope sequence generation unit that generates the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1), a frequency-domain sequence corresponding to the quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; an inverse stretch transform unit, which is a second sample sequence generation device, that takes the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) as ~η(0),~η(1),…,~η(N-1) and generates η(0),η(1),…,η(N-1) as the power spectrum envelope sequence W(0),W(1),…,W(N-1); an envelope normalization unit that generates a normalized frequency-domain sample sequence by normalizing the frequency-domain sample sequence X(0),X(1),…,X(N-1) using the power spectrum envelope sequence W(0),W(1),…,W(N-1); and an encoding unit that encodes the normalized frequency-domain sample sequence to generate a code corresponding to the normalized frequency-domain sample sequence.

A decoding device according to one aspect of the present invention includes: a stretched linear prediction coefficient decoding unit that, with p a positive integer, decodes an input stretched linear prediction coefficient code to generate quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; a stretched power spectrum envelope sequence generation unit that generates the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1), a frequency-domain sequence corresponding to the quantized stretched linear prediction coefficients ^β₁,^β₂,…,^β_p; an inverse stretch transform unit, which is a second sample sequence generation device, that takes the stretched power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) as ~η(0),~η(1),…,~η(N-1) and generates η(0),η(1),…,η(N-1) as the power spectrum envelope sequence W(0),W(1),…,W(N-1); a decoding unit that decodes an input code corresponding to the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1) to generate the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1); and an envelope denormalization unit that generates the frequency-domain sample sequence X(0),X(1),…,X(N-1) by denormalizing the normalized frequency-domain sample sequence X_N(0),X_N(1),…,X_N(N-1) using the power spectrum envelope sequence W(0),W(1),…,W(N-1).

Coding efficiency can be improved. Alternatively, the accuracy of signal processing can be improved.

Block diagram for explaining an example of a conventional encoding device.
Block diagram for explaining an example of the encoding device of the first and second embodiments.
Flowchart for explaining an example of the encoding method of the first and second embodiments.
Diagram for explaining an example of the transformation matrix U.
Diagram for explaining an example of the transformation matrix U.
Block diagram for explaining an example of the decoding device of the first and second embodiments.
Flowchart for explaining an example of the decoding method of the first and second embodiments.
Block diagram for explaining an example of the encoding device of the third embodiment.
Flowchart for explaining an example of the encoding method of the third embodiment.
Block diagram for explaining an example of the decoding device of the third embodiment.
Flowchart for explaining an example of the decoding method of the third embodiment.
Block diagram for explaining an example of the encoding device of the fourth embodiment.
Flowchart for explaining an example of the encoding method of the fourth embodiment.
Block diagram for explaining an example of the decoding device of the fourth embodiment.
Flowchart for explaining an example of the decoding method of the fourth embodiment.
Diagram for explaining an example of the transformation matrix U.
Diagram for explaining the property of the centroid positions in the transformation matrix U of FIG. 16.
Diagram for explaining an example of the transformation matrix V.

[Technical background]
First, the technical background is described using the encoding process described under the prior art as an example.

In one example of the present invention, when a power spectrum envelope sequence is used, the power spectrum envelope sequence, which is a sequence of discrete coefficients, is generated by discretization at a nonlinear resolution in the frequency direction. Frequency regions where the amplitude of the power spectrum envelope varies widely are discretized at a fine resolution, and frequency regions where it varies little are discretized at a coarse resolution. This reduces the variation in the values of the normalized MDCT coefficient sequence and raises coding efficiency.

For example, the discretization interval in a frequency region where energy is concentrated is made smaller than the discretization interval in other frequency regions. In other words, the resolution in a frequency region where energy is concentrated is made higher than the resolution in other frequency regions.

Conventionally, the power spectrum envelope has been expressed by discretization with linear resolution in the frequency direction. That is, when the full frequency range runs from F = 0 Hz to F = 100 Hz and is discretized at 11 sample points with N = 10, each of the 11 frequencies in the table below is used as a sample point, and the power spectrum envelope is expressed by the sequence of power spectrum envelope values corresponding to these 11 sample points.

That is, the power spectrum envelope has been expressed using a sample point sequence in which adjacent sample points are equally spaced in the frequency domain (at 10 Hz intervals in the example above).

Discretizing the sample point sequence used to express the power spectrum envelope sequence so that adjacent sample points are equally spaced in the frequency domain (at 10 Hz intervals in the example above) is called “linear discretization”.

In contrast, in one example of the present invention, the power spectrum envelope is expressed by discretization with nonlinear resolution in the frequency direction. For example, each of the 11 frequencies in the following table is used as a sample point, and the power spectrum envelope is expressed by the sequence of power spectrum envelope values corresponding to these 11 sample points.

In this example, adjacent sample points are more closely spaced in the low frequency region, and the spacing between adjacent sample points widens toward higher frequencies. For example, the spacing in the lowest frequency region, that is, the spacing in the frequency domain between the sample point at index 0 and the sample point at index 1, is 1 Hz, whereas the spacing in the highest frequency region, that is, the spacing between the sample point at index 9 and the sample point at index 10, is 30 Hz.

Discretizing the sample point sequence used to express the power spectrum envelope sequence so that adjacent sample points are not equally spaced in the frequency direction is called “nonlinear discretization”.

In the following, to distinguish these differences in frequency resolution, a sequence of sample points equally spaced in the frequency direction is also called a “linear discretized sample point sequence”, and a sequence of sample points unequally spaced in the frequency direction is also called a “nonlinear discretized sample point sequence”. Adjacent sample points of a linear discretized sample point sequence are equally spaced in frequency, whereas adjacent sample points of a nonlinear discretized sample point sequence are not. Each sample point in a linear discretized sample point sequence is also called a “linear discretized sample point”, and each sample point in a nonlinear discretized sample point sequence is also called a “nonlinear discretized sample point”.

In addition, the sequence of powers of the input sound signal corresponding to the sample points of the linear discretized sample point sequence is also called the “power spectrum sequence”, and the sequence of powers of the input sound signal corresponding to the sample points of the nonlinear discretized sample point sequence is also called the “stretched pseudo power spectrum sequence”.

In the example above, the sample points of the nonlinear discretized sample point sequence are more closely spaced at lower frequencies, but this property is not essential; for example, the spacing may be narrower in the mid frequency region than in the low frequency region. In short, in a nonlinear discretized sample point sequence it suffices that the spacing between adjacent sample points be non-uniform.
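
The distinction between the two kinds of discretization can be illustrated with a short Python sketch. The function names and the non-uniform grid are illustrative assumptions; in particular, the non-uniform frequencies are made-up values and are not the frequencies of the table referred to above.

```python
def linear_sample_points(f_min, f_max, n_intervals):
    """Return n_intervals + 1 equally spaced frequencies from f_min to f_max."""
    step = (f_max - f_min) / n_intervals
    return [f_min + i * step for i in range(n_intervals + 1)]


def spacings(points):
    """Spacing between each pair of adjacent sample points."""
    return [b - a for a, b in zip(points, points[1:])]


def is_linear_discretization(points, tol=1e-9):
    """True when all adjacent spacings are equal (within tol)."""
    gaps = spacings(points)
    return all(abs(g - gaps[0]) <= tol for g in gaps)


# Linear discretization of 0 Hz to 100 Hz with N = 10 (11 points, 10 Hz apart).
linear = linear_sample_points(0.0, 100.0, 10)

# Hypothetical nonlinear discretization: 0, 1, 4, 9, ..., 100 Hz.
# These values are invented for illustration only.
nonlinear = [float(i * i) for i in range(11)]
```

With these definitions, `is_linear_discretization(linear)` holds and `is_linear_discretization(nonlinear)` does not, which is the only property the text requires of a nonlinear discretized sample point sequence.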

[First embodiment]
(Encoding of the first embodiment)
FIG. 2 shows a configuration example of the encoding device of the first embodiment. As shown in FIG. 2, the encoding device of the first embodiment includes, for example, a frequency domain transform unit 21, a stretched pseudo power spectrum sequence generation unit 22, a linear prediction analysis unit 23, a stretched power spectrum envelope sequence generation unit 24, an inverse stretch transform unit 25, an envelope normalization unit 26, and an encoding unit 27. FIG. 3 shows an example of the processing steps of the encoding method of the first embodiment realized by this encoding device.

Each unit in FIG. 2 is described below.

<Frequency domain transform unit 21>
A time domain sound signal is input to the frequency domain transform unit 21. Examples of the sound signal are a digital speech signal and a digital acoustic signal.

The frequency domain transform unit 21 converts the input time domain sound signal, frame by frame with a predetermined frame length, into an N-point MDCT coefficient sequence X(0), X(1), …, X(N-1) in the frequency domain (step E1). N is a positive integer.

The resulting MDCT coefficient sequence X(0), X(1), …, X(N-1) is output to the envelope normalization unit 26.

Unless otherwise noted, the subsequent processing is performed frame by frame.

Each sample point corresponding to the MDCT coefficient sequence X(0), X(1), …, X(N-1) here is a linear discretized sample point. That is, adjacent sample points of the sample point sequence corresponding to the MDCT coefficient sequence X(0), X(1), …, X(N-1) are equally spaced in frequency. In other words, for i = 0, 1, …, N-2, the spacing between the frequency corresponding to index i and the frequency corresponding to index i+1 in the MDCT coefficient sequence is constant.

<Stretched pseudo power spectrum sequence generation unit 22>
The MDCT coefficient sequence X(0), X(1), …, X(N-1) produced by the frequency domain transform unit 21 is input to the stretched pseudo power spectrum sequence generation unit 22.

The stretched pseudo power spectrum sequence generation unit 22 first generates the power spectrum sequence Y(0), Y(1), …, Y(N-1), which is the sequence of the squared values (powers) of the coefficients of the MDCT coefficient sequence X(0), X(1), …, X(N-1). That is, Y(i) = X(i)² (i = 0, 1, …, N-1).

The stretched pseudo power spectrum sequence generation unit 22 then generates the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) by interpolating or linearly transforming the power spectrum sequence Y(0), Y(1), …, Y(N-1) (step E2).

The generated stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) is output to the linear prediction analysis unit 23.

Here, the sample point sequence corresponding to the power spectrum sequence Y(0), Y(1), …, Y(N-1) is a linear discretized sample point sequence, and the sample point sequence corresponding to the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) is a nonlinear discretized sample point sequence.

In other words, the frequencies corresponding to the indices 0, 1, …, N-1 of the power spectrum sequence Y(0), Y(1), …, Y(N-1) are equally spaced, whereas the frequencies corresponding to the indices 0, 1, …, N-1 of the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) are unequally spaced.

When the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) is generated by interpolation, the stretched pseudo power spectrum sequence generation unit 22 performs, for example, the following processing. Let f be a frequency lying between adjacent sample points of the sample point sequence corresponding to the power spectrum sequence Y(0), Y(1), …, Y(N-1). An interpolated curve is obtained on the assumption that the power spectrum value at f follows the sinc function (sinc(f) = sin(f)/f). The value of this curve at each sample point (frequency) of the nonlinear discretized sample point sequence is then taken as the stretched pseudo power spectrum value, which yields the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1). In this case, the nonlinear discretized sample point sequence is assumed to be given in advance.
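
The interpolation route can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the procedure of the embodiment: positions are measured in units of the uniform sample spacing (so the k-th linear sample sits at position k), the sinc kernel is therefore scaled by π so that the curve passes through the original samples, and the function names are hypothetical. The sketch does not clip the result, although sinc interpolation of a power spectrum can produce negative values.

```python
import math


def power_spectrum(mdct_coefficients):
    """Y(i) = X(i)^2 for each MDCT coefficient X(i)."""
    return [x * x for x in mdct_coefficients]


def sinc(x):
    """sin(pi x) / (pi x), with the limit value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def warp_by_sinc_interpolation(power, positions):
    """Evaluate the sinc-interpolated curve through power[k] (at position k)
    at each requested, possibly fractional, position."""
    return [
        sum(p * sinc(t - k) for k, p in enumerate(power))
        for t in positions
    ]
```

Calling `warp_by_sinc_interpolation(power_spectrum(x), positions)` with a predetermined list of non-uniform positions gives one value per nonlinear discretized sample point; at integer positions the original power values are reproduced.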

When the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) is obtained by linear transformation, the stretched pseudo power spectrum sequence generation unit 22 generates it by multiplying the vector consisting of the power spectrum sequence Y(0), Y(1), …, Y(N-1) from the left by a predetermined transformation matrix U.

In other words, the stretched pseudo power spectrum sequence generation unit 22 generates the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) defined by the following equation.
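
Read together with the preceding paragraph, which states that the vector of power spectrum values is multiplied from the left by U, the relationship can be written as below. The column-vector layout is an assumption made for this restatement.

```latex
\begin{pmatrix} \tilde{Y}(0) \\ \tilde{Y}(1) \\ \vdots \\ \tilde{Y}(N-1) \end{pmatrix}
= U
\begin{pmatrix} Y(0) \\ Y(1) \\ \vdots \\ Y(N-1) \end{pmatrix}
```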

Here, the transformation matrix U is a matrix that approximates the mapping from the linear discretized sample point sequence to the nonlinear discretized sample point sequence. This transformation converts a sample point sequence equally spaced in the frequency direction into one unequally spaced in the frequency direction; because it in effect stretches and compresses the frequency spacing between adjacent sample points, it is called the “stretch transformation”.

The frequency of a given nonlinear discretized sample point in the nonlinear discretized sample point sequence after the stretch transformation (hereinafter, the “post-stretch frequency”) can be approximated by a weighted sum of the frequencies of one or more linear discretized sample points that, among the linear discretized sample points of the sequence before the stretch transformation, have frequencies close to the post-stretch frequency. In other words, the post-stretch frequency can be approximated by a weighted sum of the frequencies of one or more linear discretized sample points in the neighborhood of the linear discretized sample point whose frequency is closest to the post-stretch frequency.

Each row of the transformation matrix U corresponds to one nonlinear discretized sample point of the nonlinear discretized sample point sequence after the stretch transformation, and each column of U corresponds to one linear discretized sample point of the linear discretized sample point sequence. That is, each row of U is the sequence of weights applied to the frequencies of the linear discretized sample points in order to express the frequency of the nonlinear discretized sample point corresponding to that row.

In principle, when the post-stretch frequency is approximated by a weighted sum of the frequencies of one or more linear discretized sample points in the neighborhood of the linear discretized sample point closest in frequency, the weights may take negative values. However, if the transformation matrix U is constructed so as to contain negative values, post-processing becomes necessary in order to generate the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) accurately. In the present invention, therefore, all elements of the transformation matrix U are, for example, non-negative (that is, U is a non-negative matrix). This makes it possible to generate an accurate stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) without post-processing.

Furthermore, the weights applied to the frequencies of discretized sample points far from the post-stretch frequency can be expected to be small, so treating such small elements as 0 has little effect on the accuracy of the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1). Accordingly, in each row of the transformation matrix U, only the elements near the column corresponding to the linear discretized sample point closest in frequency to the nonlinear discretized sample point of that row may, for example, be set to non-zero values, with all remaining elements set to 0. Elements of U that have a value other than 0 are called “elements corresponding under stretching”. The transformation matrix U can then be described as, for example, a band matrix (sparse matrix) that has non-zero values only in the neighborhood of the elements corresponding under stretching and is 0 elsewhere.
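
A matrix with both properties (non-negative and banded) can be built by hand, as the Python sketch below illustrates. This is only an assumption-laden example: the document states that U is obtained in advance by learning or the like, whereas the sketch fills each row with the two linear-interpolation weights around a target position. The function name and the quadratic warp used to choose the positions are hypothetical.

```python
import math


def interpolation_matrix(positions, n_cols):
    """Build a non-negative banded matrix whose row i holds the two
    linear-interpolation weights around positions[i].

    positions are 0-based and must lie in [0, n_cols - 1]; n_cols >= 2.
    """
    rows = []
    for t in positions:
        lo = min(int(math.floor(t)), n_cols - 2)  # left neighbour column
        frac = t - lo                             # distance from left neighbour
        row = [0.0] * n_cols
        row[lo] = 1.0 - frac
        row[lo + 1] = frac
        rows.append(row)
    return rows


# Hypothetical warp: positions crowd together at low indices.
n = 8
positions = [(n - 1) * (i / (n - 1)) ** 2 for i in range(n)]
U = interpolation_matrix(positions, n)
```

Each row of this `U` has at most two non-zero entries, all entries are non-negative, and every row sums to 1, so multiplying it by a power spectrum cannot produce negative values.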

Performing the transformation with every value of the matrix can require a large amount of computation, but making the transformation matrix U sparse in this way allows the stretched pseudo power spectrum to be obtained with little computation. It suffices to store separately the starting sample point of the non-zero elements in the matrix and to perform only the small number of operations that begin at that sample point.
Thus, by making the transformation matrix U a non-negative matrix or a sparse matrix, the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) can be obtained accurately with little computation. When the transformation matrix U is a sparse matrix, it may contain negative elements.
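
The idea of storing the starting position of the non-zero elements and computing only from there can be sketched as follows. The storage layout, one (start, weights) pair per row, and the function names are assumptions chosen for illustration; the small matrix is made up.

```python
def to_banded(matrix):
    """Store each row as (start, weights): the first non-zero column and the
    run of values up to and including the last non-zero column."""
    banded = []
    for row in matrix:
        nonzero = [k for k, v in enumerate(row) if v != 0.0]
        if not nonzero:
            banded.append((0, []))
        else:
            banded.append((nonzero[0], row[nonzero[0]:nonzero[-1] + 1]))
    return banded


def apply_banded(banded, y):
    """Multiply by the banded matrix, touching only the stored weights."""
    return [
        sum(w * y[start + j] for j, w in enumerate(weights))
        for start, weights in banded
    ]


def apply_dense(matrix, y):
    """Reference implementation using every element of the matrix."""
    return [sum(u * v for u, v in zip(row, y)) for row in matrix]


# Made-up 4 x 4 example.
U_example = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.25, 0.75, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
Y_example = [4.0, 8.0, 16.0, 2.0]
```

For this example both `apply_dense(U_example, Y_example)` and `apply_banded(to_banded(U_example), Y_example)` return the same four values, while the banded version performs 6 multiplications instead of 16.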

The transformation matrix U can be obtained in advance, for example by learning, from the correlation between the nonlinear discretized sample point sequence and the linear discretized sample point sequence. How U is obtained is described later.

The degree to which the spacing between adjacent sample points of the nonlinear discretized sample point sequence is stretched can be set arbitrarily. The generalized logarithm below can be used as an example that narrows the spacing between sample points in the low frequency region and widens it in the high frequency region. The “spacing between adjacent sample points” is also called the “discretization width”.
S_λ(ω_i) can be used as an example of a function expressing the relationship with the frequency ω_i corresponding to the sample point at index i of the linear discretized sample point sequence. For example, the following relationship holds between the frequency S_λ(ω_i), which corresponds to the sample point at the same index i in the nonlinear discretized sample point sequence, and the frequency ω_i.

S_λ(ω_i) is a function called the generalized logarithmic function. λ is a constant that determines the degree of nonlinear stretching, that is, the degree to which the spacing between the sample points of the nonlinear discretized sample point sequence is stretched. λ, the degree of the stretch transformation, can be designed to suit the properties of the input signal. When λ = 0, S_λ(ω_i) is the logarithmic function.
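
As an illustration of a one-parameter family with these properties, the sketch below uses the Box-Cox form (x^λ − 1)/λ, which tends to log x as λ approaches 0 and is affine for λ = 1. This form is an assumption standing in for S_λ; the expression used in the document may differ in scaling or offset. The sketch works in index units rather than Hz, adds a hypothetical offset of 1 so that the logarithm is defined at index 0, and rescales the result so that the end points are preserved.

```python
import math


def generalized_log(x, lam):
    """Box-Cox style generalised logarithm; equals log(x) when lam == 0."""
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def warped_points(n, lam, offset=1.0):
    """Warp the uniform grid 0, 1, ..., n-1 with the generalised logarithm and
    rescale so that the first and last points remain 0 and n-1."""
    lo = generalized_log(offset, lam)
    hi = generalized_log(offset + n - 1, lam)
    scale = (n - 1) / (hi - lo)
    return [scale * (generalized_log(offset + i, lam) - lo) for i in range(n)]
```

In this sketch `warped_points(n, 1.0)` returns the uniform grid unchanged, and values of λ close to 0 give points close to those of the purely logarithmic warp `warped_points(n, 0.0)`.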

FIG. 4 shows examples of the relationship between the transformation matrix U and λ. FIGS. 4(a), 4(b), and 4(c) show the value of each element of U by color, with the vertical axis giving the index corresponding to the rows of U and the horizontal axis giving the index corresponding to the columns of U. White represents elements whose value is 0, and black represents elements whose value is greater than 0. The transformation matrix U for λ = 1 leaves the linear discretized sample point sequence unchanged and is a diagonal matrix in which only the diagonal elements have the value 1. These figures show that as λ increases, the shape of the curve formed by the non-zero elements of U, in other words the curve approximating the elements with values greater than 0, bends nonlinearly further below the diagonal.

Now consider a two-dimensional plane defined so that the vertical axis is the frequency of the sample points of the linear discretized sample point sequence, increasing from top to bottom, and the horizontal axis is the frequency of the sample points of the nonlinear discretized sample point sequence, increasing from left to right.

The shape of the curve formed by the non-zero elements of the transformation matrix U can also be said to correspond to the curve obtained by interpolating the points that result when, for each element of U, the frequency of the linear discretized sample point and the frequency of the nonlinear discretized sample point corresponding to the element's indices are mapped onto this two-dimensional plane. This curve is hereinafter also called the “stretch curve”. In other words, the stretch curve is the curve obtained by interpolating the points onto which the “elements corresponding under stretching” are mapped. Here, the linear discretized sample point corresponding to an element's index is the linear discretized sample point corresponding to the element's column, and the nonlinear discretized sample point corresponding to an element's index is the nonlinear discretized sample point corresponding to the element's row.

The example above showed a transformation in which the resolution of the low frequency region is higher than that of the high frequency region; in other words, a transformation in which the spacing between nonlinear discretized sample points in the low frequency region is narrower than the spacing between nonlinear discretized sample points in the high frequency region. This is, however, only one example.

Because the resolution, or degree of stretching, corresponds to the slope along the sample points at which the non-zero elements of the transformation matrix lie, it is also possible to use, for example, a nonlinear discretized sample point sequence in which the frequency region midway between the low and high frequency regions contains, near its center, a region whose resolution is higher and a region whose resolution is lower than that of the other frequency regions. In this case, the transformation matrix U is, for example, one in which only the elements near a stretch curve such as that shown in FIG. 5 are non-zero.

In any case, taking the transformation matrix U to be an M×N matrix, the stretch curve of U is, in the two-dimensional plane in which the rows are taken as the horizontal axis and the columns as the vertical axis, a curve that decreases monotonically from the point [1,1] corresponding to the top-left element toward the point [M,N] corresponding to the bottom-right element. In other words, the transformation matrix U has the following property.

The center of gravity g_i of the i-th row of the transformation matrix U is defined by the following expression.

Here, U[i,k] denotes the (i,k) element of the transformation matrix U, and row and column indices start at 1. The sequence of the centers of gravity of the rows of U, g₁, g₂, …, g_N, then satisfies the relationship g₁ < g₂ < … < g_N. M and N are each integers of 3 or more. In this embodiment, M = N.
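
The property can be checked numerically with a short Python sketch. It assumes the usual weighted-mean definition of a center of gravity, g_i = Σ_k k·U[i,k] / Σ_k U[i,k] with 1-based column index k; this form is an assumption consistent with the surrounding description. The function names and the small matrix are illustrative only.

```python
def row_centroids(matrix):
    """Centre of gravity of each row, using 1-based column indices.

    Assumes every row has a non-zero sum.
    """
    centroids = []
    for row in matrix:
        total = sum(row)
        weighted = sum((k + 1) * v for k, v in enumerate(row))
        centroids.append(weighted / total)
    return centroids


def strictly_increasing(values):
    """True when each value is smaller than the next."""
    return all(a < b for a, b in zip(values, values[1:]))


# Made-up 4 x 4 non-negative banded matrix.
U_small = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.25, 0.75, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
g = row_centroids(U_small)
```

For `U_small` the centers of gravity are 1, 1.5, 2.75 and 4, so `strictly_increasing(g)` holds; swapping two rows breaks the ordering.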

In speech signals and acoustic signals, energy is often concentrated in the low or mid frequency region. A stretched pseudo power spectrum may therefore be used that corresponds to a nonlinear discretized sample point sequence in which the frequency spacing between nonlinear discretized sample points in the low or mid frequency region is narrower than that in the high frequency region. In this case, for each row of the transformation matrix U, let the stretch distance of that row's index be the distance between the diagonal element and the non-zero element of the row that is farthest from the diagonal element (this distance expresses how far apart the column indices are). The average stretch distance over the first half of all indices then tends to be larger than the average stretch distance over the second half. The non-zero elements take positive values. Alternatively, the transformation matrix U is a sparse matrix in which the number of non-negative elements contained in its upper triangular part is smaller than the number of non-negative elements contained in its lower triangular part. FIG. 16 shows an example of the transformation matrix U. In this example, N = M = 16.
FIG. 17 plots the centers of gravity g_i of the transformation matrix U of FIG. 16 on a two-dimensional plane with the row number on the horizontal axis and the value of g_i on the vertical axis. As FIG. 17 shows, the value of g_i increases monotonically as the row number increases.

The stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) corresponds to a nonlinear discretization of the power spectrum of the signal obtained by converting the input sound signal into the frequency domain.

In this way, the power spectrum is expressed on the basis of sample points with nonlinear discretization intervals, such that the discretization interval in the frequency region where the energy of the frequency domain signal obtained from the input sound signal is concentrated is narrower than the discretization interval in the other frequency regions.

<Linear prediction analysis unit 23>
The stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) generated by the stretched pseudo power spectrum sequence generation unit 22 is input to the linear prediction analysis unit 23.

Using the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1), the linear prediction analysis unit 23 performs linear prediction analysis on ~X(0), ~X(1), …, ~X(N-1), defined by the following equation, to generate stretched linear prediction coefficients β₁, β₂, …, β_p. It then encodes the generated stretched linear prediction coefficients β₁, β₂, …, β_p to generate a stretched linear prediction coefficient code and the quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p, which are the quantized stretched linear prediction coefficients corresponding to that code (step E3).

The generated quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p are output to the stretched power spectrum envelope sequence generation unit 24.

The generated stretched linear prediction coefficient code is transmitted to the decoding device.

To generate the stretched linear prediction coefficient code, the linear prediction analysis unit 23 first regards the frequency intervals between the sample points corresponding to the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) as uniform and performs an operation equivalent to an inverse FFT. This yields the stretched correlation function signal sequence ~X(0), ~X(1), …, ~X(N-1), which is the time-domain signal sequence corresponding to ~Y(0), ~Y(1), …, ~Y(N-1). The linear prediction analysis unit 23 then performs linear prediction analysis on the stretched correlation function signal sequence ~X(0), ~X(1), …, ~X(N-1) to generate the stretched linear prediction coefficients β₁, β₂, …, β_p. Finally, it encodes the generated stretched linear prediction coefficients β₁, β₂, …, β_p to generate the stretched linear prediction coefficient code. As a result, the quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p corresponding to the stretched linear prediction coefficient code are also obtained.
The stretched linear prediction coefficients are the linear prediction coefficients corresponding to the time-domain signal obtained when the sample points of the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) are regarded as uniformly spaced in the frequency direction.
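The two steps above (inverse transform of the power spectrum, then linear prediction analysis) can be sketched as follows. This is a minimal illustration, not the patent's implementation. It assumes that the "operation equivalent to an inverse FFT" can be represented by the real part of an inverse DFT over uniformly spaced samples, that the analysis uses the Levinson-Durbin recursion, and that the predictor is written as A(z) = 1 + Σ β_p z^-p. The function names are hypothetical.

```python
import math


def autocorrelation_from_power_spectrum(y_tilde, num_lags):
    """Real part of the inverse DFT of y_tilde for lags 0..num_lags-1.

    The samples are treated as uniformly spaced over one period (assumption).
    """
    n = len(y_tilde)
    return [
        sum(y * math.cos(2.0 * math.pi * k * i / n) for i, y in enumerate(y_tilde)) / n
        for k in range(num_lags)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + sum_p beta_p z^-p.

    Returns (beta, prediction_error). r[0] must be positive.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# A flat stretched pseudo power spectrum needs no prediction: beta is ~0.
y_tilde = [1.0] * 16
x_tilde = autocorrelation_from_power_spectrum(y_tilde, 3)
beta, err = levinson_durbin(x_tilde, 2)
```

Only lags 0 to p of ~X are needed for a predictor of order p, so the sketch computes just those lags rather than the full sequence.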

The linear prediction analysis unit 23 generates the stretched linear prediction coefficient code using, for example, a conventional encoding technique. Conventional encoding techniques include, for example, a technique that uses a code corresponding to the linear prediction coefficients themselves as the prediction coefficient code, a technique that converts the linear prediction coefficients into LSP parameters and uses a code corresponding to the LSP parameters as the prediction coefficient code, and a technique that converts the linear prediction coefficients into PARCOR coefficients and uses a code corresponding to the PARCOR coefficients as the prediction coefficient code. Applying any of these conventional techniques with the stretched linear prediction coefficients in place of the linear prediction coefficients yields the stretched linear prediction coefficient code, that is, the code corresponding to the stretched linear prediction coefficients.
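As an illustration of the PARCOR option mentioned above, the following sketch converts prediction coefficients to reflection (PARCOR) coefficients with the standard step-down recursion and back with the step-up recursion. The predictor convention A(z) = 1 + Σ β_j z^-j and the sign convention of the PARCOR coefficients are assumptions. The quantization and code assignment that a real encoder would apply afterwards are not shown.

```python
def lpc_to_parcor(beta):
    """Step-down recursion: prediction coefficients -> reflection coefficients."""
    a = list(beta)
    parcor = [0.0] * len(a)
    for i in range(len(a), 0, -1):
        k = a[i - 1]                        # last coefficient of the order-i predictor
        if abs(k) >= 1.0:
            raise ValueError("unstable predictor: |k| >= 1")
        parcor[i - 1] = k
        denom = 1.0 - k * k
        # Coefficients of the order-(i-1) predictor.
        a = [(a[j] - k * a[i - 2 - j]) / denom for j in range(i - 1)]
    return parcor


def parcor_to_lpc(parcor):
    """Step-up recursion: reflection coefficients -> prediction coefficients."""
    a = []
    for k in parcor:
        a = [a[j] + k * a[len(a) - 1 - j] for j in range(len(a))] + [k]
    return a


beta = [0.35, -0.3]
k = lpc_to_parcor(beta)          # approximately [0.5, -0.3]
restored = parcor_to_lpc(k)      # approximately the original beta
```

PARCOR coefficients of a stable predictor lie strictly between -1 and 1, which is one reason they are convenient to quantize.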

<Stretched power spectrum envelope sequence generation unit 24>
The stretched power spectrum envelope sequence generation unit 24 receives the quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p generated by the linear prediction analysis unit 23.

The stretched power spectrum envelope sequence generation unit 24 generates the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) by transforming the quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p into the frequency domain (step E4).

The generated stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) is output to the inverse stretch transform unit 25.

Using the quantized stretched linear prediction coefficients ^β₁, ^β₂, …, ^β_p, the stretched power spectrum envelope sequence generation unit 24 generates, as the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), for example the stretched unsmoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1) defined by the following equation (2), or the stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) defined by the following equation (3).

In other words, one example of the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) is the stretched unsmoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1) defined by equation (2) or the stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) defined by equation (3).

Here, the correction coefficient γ is a predetermined constant not greater than 1. It is a coefficient that blunts the peaks and valleys in the amplitude of the stretched unsmoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1), in other words a coefficient that smooths that sequence. It can be regarded as the same γ that is used to smooth the power spectrum envelope sequence in conventional encoding without the stretch transform, that is, the γ in equation (P1) above.
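The role of γ can be illustrated with the usual all-pole envelope computed from linear prediction coefficients. Equations (2) and (3) themselves are not reproduced in this text, so the sketch below is only an assumed form: the envelope is taken as 1 / |1 + Σ ^β_p γ^p e^{-jωp}|², the frequency grid ω_n = πn/N is an assumption, and any constant gain factor is omitted. Setting γ = 1 corresponds to the unsmoothed envelope and γ < 1 to the smoothed one.

```python
import cmath
import math


def stretched_envelope(beta_hat, num_bins, gamma=1.0):
    """Evaluate 1 / |1 + sum_p beta_p * gamma^p * e^{-j w p}|^2 on num_bins points.

    The grid w_n = pi * n / num_bins and the unit gain are assumptions.
    """
    envelope = []
    for n in range(num_bins):
        w = math.pi * n / num_bins
        a = 1.0 + sum(
            b * gamma ** p * cmath.exp(-1j * w * p)
            for p, b in enumerate(beta_hat, start=1)
        )
        envelope.append(1.0 / abs(a) ** 2)
    return envelope


w_o = stretched_envelope([-0.9], 8)               # unsmoothed (gamma = 1)
w_g = stretched_envelope([-0.9], 8, gamma=0.5)    # smoothed
# Smoothing reduces the ratio between the largest and smallest envelope values.
ratio_o = max(w_o) / min(w_o)
ratio_g = max(w_g) / min(w_g)
```

Multiplying the p-th coefficient by γ^p moves the poles of the all-pole model toward the origin, which is what flattens the peaks of the envelope.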

Note that the sample point sequence corresponding to the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), in other words the sequence of frequencies corresponding to the indices 0, 1, …, N-1 of that sequence, is a nonlinearly discretized sample point sequence.

In this way, the stretched power spectrum envelope sequence generation unit 24 generates a stretched power spectrum envelope sequence. This is a sequence obtained by discretizing, at non-uniform intervals in the frequency direction, an envelope obtained by smoothing the power spectrum envelope of the frequency-domain sample sequence derived from the sound signal in each predetermined time interval.

In this example, the frequency-domain sample sequence derived from the sound signal in each predetermined time interval is the MDCT coefficient sequence X(0), X(1), …, X(N-1). A frequency-domain sample sequence other than the MDCT coefficient sequence X(0), X(1), …, X(N-1) may also be used.
Where the frequency spacing of the sample points is narrow, the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) is represented at fine resolution, so it can represent fine variations in the peaks and valleys of the power spectrum envelope amplitude. Conversely, where the frequency spacing of the sample points is wide, it is represented at coarse resolution, so only rough variations of the power spectrum envelope are represented. In general, the power spectrum envelope of a speech or audio signal varies strongly where energy is concentrated and only slightly elsewhere. Therefore, the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) is taken to be the sequence of powers of the input sound signal in the frequency domain at a nonlinearly discretized sample point sequence in which adjacent sample points are spaced more narrowly in the frequency region where energy is concentrated than in other frequency regions. This gives a discrete sequence that represents the variation of the peaks and valleys of the power spectrum amplitude more accurately with a limited number of sample points. In other words, nonlinear discretization that represents the frequency region where energy is concentrated at a finer resolution than other frequency regions yields a discrete sequence that represents the power spectrum more accurately.

When the MDCT coefficient sequence is normalized using spectral envelope values calculated from the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) obtained in this way, the variation in magnitude of the normalized MDCT coefficient sequence becomes small, so the sequence can be encoded efficiently.

<Inverse stretch transform unit 25>
The inverse stretch transform unit 25 receives the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) generated by the stretched power spectrum envelope sequence generation unit 24.

The inverse stretch transform unit 25 converts the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), by interpolation or by a linear transform, into the power spectrum envelope sequence W(0), W(1), …, W(N-1) corresponding to the linearly discretized sample point sequence (step E5).

The resulting power spectrum envelope sequence W(0), W(1), …, W(N-1) is output to the envelope normalization unit 26.

When the power spectrum envelope sequence W(0), W(1), …, W(N-1) is obtained by interpolation, the inverse stretch transform unit 25 performs, for example, the following processing. In the same way as the stretched pseudo power spectrum sequence generation unit 22, the inverse stretch transform unit 25 obtains a curve that interpolates the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) with a sinc function (an envelope that smoothly connects the stretched power spectrum envelope values). It then takes, as the power spectrum envelope sequence W(0), W(1), …, W(N-1), the sequence of power spectrum envelope values on that curve at the frequencies corresponding to the sample points of the linearly discretized sample point sequence.
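The resampling idea can be sketched as follows. Note the substitution: the text describes sinc interpolation, whereas this sketch uses piecewise-linear interpolation to stay short. The nonlinear frequency grid in the example (a squared ramp) and the function name are hypothetical and not taken from the patent.

```python
from bisect import bisect_right


def resample_to_uniform(freqs_nonlinear, values, num_out):
    """Piecewise-linear resampling onto num_out (>= 2) equally spaced frequencies.

    freqs_nonlinear must be strictly increasing; the uniform grid spans
    its first and last entries.
    """
    lo, hi = freqs_nonlinear[0], freqs_nonlinear[-1]
    out = []
    for i in range(num_out):
        f = lo + (hi - lo) * i / (num_out - 1)
        # Index of the interval [f0, f1] containing f, clamped to a valid range.
        j = min(max(bisect_right(freqs_nonlinear, f) - 1, 0), len(freqs_nonlinear) - 2)
        f0, f1 = freqs_nonlinear[j], freqs_nonlinear[j + 1]
        t = (f - f0) / (f1 - f0)
        out.append((1.0 - t) * values[j] + t * values[j + 1])
    return out


# Hypothetical nonlinear grid: denser at low frequencies.
freqs = [(n / 7) ** 2 for n in range(8)]
w_tilde = [2.0 * f + 1.0 for f in freqs]        # envelope that is linear in frequency
w = resample_to_uniform(freqs, w_tilde, 5)      # values on the uniform grid
```

An envelope that is linear in frequency is reproduced exactly by this scheme, which makes it a convenient sanity check. A sinc-based interpolator would differ in how it treats curvature between sample points.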

When the power spectrum envelope sequence W(0), W(1), …, W(N-1) is obtained by a linear transform, the inverse stretch transform unit 25 generates it by multiplying the vector consisting of the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) from the left by a predetermined transformation matrix V.

In other words, the inverse stretch transform unit 25 generates the power spectrum envelope sequence W(0), W(1), …, W(N-1) as the product of V and the column vector (~W(0), ~W(1), …, ~W(N-1))ᵀ.

Here, the transformation matrix V is a matrix that approximates the inverse of the transform performed by the transformation matrix U, that is, a matrix that approximates the mapping from the nonlinearly discretized sample point sequence to the linearly discretized sample point sequence. This transform converts a sequence of non-uniformly spaced sample points into a sequence of uniformly spaced sample points. Because it stretches and contracts the sample point spacing in the opposite sense to the "stretch transform" described above, it is called the "inverse stretch transform". Note, however, that V is not necessarily the inverse matrix of U.

The transformation matrix V is, for example, a band matrix (sparse matrix) that has nonzero values only in the vicinity of the elements corresponding under the inverse stretch and zeros elsewhere.

A given sample point of the linearly discretized sample point sequence after the inverse stretch transform (hereinafter, the "post-inverse-stretch frequency") can be approximated by a weighted sum of the frequencies of one or more sample points of the nonlinearly discretized sample point sequence before the transform whose frequencies are close to the post-inverse-stretch frequency. In other words, the post-inverse-stretch frequency can be approximated by a weighted sum of the frequencies of one or more nonlinearly discretized sample points in the neighborhood of the nonlinearly discretized sample point whose frequency is closest to it.

Each row of the transformation matrix V corresponds to a sample point of the linearly discretized sample point sequence after the inverse stretch transform, and each column corresponds to a sample point of the nonlinearly discretized sample point sequence. That is, each row of V is a sequence of weights on the frequencies of the nonlinearly discretized sample points, used to express the frequency of the linearly discretized sample point corresponding to that row.

Because of this property, each row of V is set, for example, so that only the elements near the column corresponding to the nonlinearly discretized sample point whose frequency is closest to that of the row's linearly discretized sample point are nonzero, and the remaining elements are 0. This is because nonlinearly discretized sample points whose frequencies are far from that of the linearly discretized sample point have very little influence and can be regarded as 0.
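Because only a short run of each row is nonzero, the product of V and a vector can skip the zeros entirely. The sketch below stores each row as a start column plus its nonzero weights. This storage layout and the example weights are illustrative assumptions, not the patent's data format.

```python
def banded_matvec(rows, x):
    """Multiply a band matrix by the vector x.

    rows[i] = (first_column, weights) holds only the nonzero run of row i
    (0-based columns); everything outside the run is an implicit zero.
    """
    return [
        sum(w * x[start + k] for k, w in enumerate(weights))
        for start, weights in rows
    ]


# Hypothetical 4x4 band matrix V with at most two nonzero weights per row.
v_rows = [
    (0, [1.0]),
    (0, [0.5, 0.5]),
    (1, [0.25, 0.75]),
    (3, [1.0]),
]
w_tilde = [4.0, 2.0, 6.0, 8.0]
w = banded_matvec(v_rows, w_tilde)   # [4.0, 3.0, 5.0, 8.0]
```

The cost is proportional to the number of stored weights rather than to the full matrix size, which is the saving the band structure is meant to provide.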

The "elements corresponding under the inverse stretch" mentioned above are, among the elements of the row of V corresponding to a post-inverse-stretch frequency, the elements in the columns corresponding to the nonlinearly discretized sample points used to approximate that frequency. That is, in each row of V they are the elements near the column corresponding to the nonlinearly discretized sample point whose frequency is closest to the post-inverse-stretch frequency of that row.

Taking the transformation matrix V to be an N×M matrix, the stretch curve of V is, in the two-dimensional plane in which the rows are regarded as the horizontal axis and the columns as the vertical axis, a curve that decreases monotonically from the point [1,1], corresponding to the top-left component, toward the point [N,M], corresponding to the bottom-right component. In other words, V has the following property.
Define the centroid g_i' of the i-th row of V as the mean column index of that row weighted by its elements, where V[i,k] denotes the (i,k) element of V and the row and column indices start from 1. Then the sequence of row centroids g₁', g₂', …, g_M' of V satisfies g₁' < g₂' < … < g_M'.
However, V is a sparse matrix in which only the components along a stretch curve that bends in the opposite direction to the stretch curve of the transformation matrix U (the inverse stretch curve) have nonzero values. That is, in the two-dimensional plane in which the rows are regarded as the horizontal axis and the columns as the vertical axis, the stretch curve of U and the stretch curve of V are approximately line-symmetric with respect to the straight line connecting the point [1,1], corresponding to the top-left component, and the point [N,M], corresponding to the bottom-right component.
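The centroid property can be checked numerically. The defining equation is not reproduced in this text, so the sketch assumes the usual weighted mean, g_i' = Σ_k k·V[i,k] / Σ_k V[i,k], with 1-based column index k. The small matrix is a made-up example.

```python
def row_centroids(v):
    """Weighted mean column index (1-based) of each row of a dense matrix.

    Every row must have a positive sum.
    """
    return [
        sum(k * w for k, w in enumerate(row, start=1)) / sum(row)
        for row in v
    ]


def centroids_increasing(v):
    """True if g_1' < g_2' < ... holds for the rows of v."""
    g = row_centroids(v)
    return all(a < b for a, b in zip(g, g[1:]))


# Hypothetical 3x4 band matrix.
v = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.2, 0.8],
]
g = row_centroids(v)            # approximately [1.0, 2.5, 3.8]
ok = centroids_increasing(v)    # True
```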

For example, if the transformation matrix U is a sparse matrix whose upper triangular part contains fewer nonzero components than its lower triangular part, then the inverse transformation matrix V is a sparse matrix whose upper triangular part contains more nonzero components than its lower triangular part. In this case, for each row of V, let the stretch distance of that row index be the distance between the diagonal element and the nonzero element of the row that is farthest from the diagonal element, where the distance is the difference in column index. Then the average stretch distance over the first half of the indices tends to be smaller than the average stretch distance over the second half. FIG. 18 shows an example of the transformation matrix V. In this example, N = M = 16.

Like the transformation matrix U, the transformation matrix V may be a non-negative matrix. In this case too, the row centroids of V satisfy the above property g₁' < g₂' < … < g_M'.

<Envelope normalization unit 26>
The envelope normalization unit 26 receives the MDCT coefficient sequence X(0), X(1), …, X(N-1) produced by the frequency domain transform unit 21 and the power spectrum envelope sequence W(0), W(1), …, W(N-1) produced by the inverse stretch transform unit 25.

Using the power spectrum envelope sequence W(0), W(1), …, W(N-1), the envelope normalization unit 26 normalizes the MDCT coefficient sequence X(0), X(1), …, X(N-1), which is a frequency-domain sample sequence, to generate a normalized frequency-domain sample sequence (step E6). In this example, the normalized frequency-domain sample sequence is the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1).

The generated normalized frequency-domain sample sequence is output to the encoding unit 27.

For example, for i = 0, 1, …, N-1, the envelope normalization unit 26 generates each coefficient X_N(i) of the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) by dividing the coefficient X(i) of the MDCT coefficient sequence X(0), X(1), …, X(N-1) by the square root of the envelope value W(i) of the power spectrum envelope sequence W(0), W(1), …, W(N-1). That is, X_N(i) = X(i)/sqrt(W(i)) for i = 0, 1, …, N-1, where sqrt(x) denotes the square root of a real number x.
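The normalization X_N(i) = X(i)/sqrt(W(i)) is a one-line operation. A minimal sketch, with a hypothetical function name and made-up input values, follows.

```python
import math


def normalize_by_envelope(x, w):
    """Return X_N(i) = X(i) / sqrt(W(i)); every W(i) must be positive."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return [xi / math.sqrt(wi) for xi, wi in zip(x, w)]


x = [4.0, -9.0, 1.0]     # MDCT coefficients (made-up values)
w = [4.0, 9.0, 0.25]     # power spectrum envelope values (made-up values)
x_n = normalize_by_envelope(x, w)   # [2.0, -3.0, 2.0]
```

Because W is a power envelope, dividing by its square root puts it on the same amplitude scale as the coefficients, so the normalized coefficients have a roughly flat magnitude.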

Note that the power spectrum envelope sequence W(0), W(1), …, W(N-1) is derived from the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1).

It can therefore also be said that the envelope normalization unit 26 generates the normalized frequency-domain sample sequence by normalizing the MDCT coefficient sequence X(0), X(1), …, X(N-1), which is a frequency-domain sample sequence, on the basis of the stretched power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1).

<Encoding unit 27>
The encoding unit 27 receives the normalized frequency-domain sample sequence generated by the envelope normalization unit 26. In this example, the normalized frequency-domain sample sequence is the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1).

The encoding unit 27 encodes the normalized frequency-domain sample sequence to generate a code corresponding to it (step E7).

The generated code is output to the decoding device.

The encoding unit 27 generates the code corresponding to the normalized frequency-domain sample sequence in the same way as in the conventional technique, for example.

That is, each coefficient of the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) is divided by a gain (global gain) g, the results are quantized to integer values, and the code obtained by encoding the resulting quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1) is taken as the integer signal code. The encoding unit 27 determines the gain g such that the number of bits of the integer signal code is no greater than the allocated bit count B, which is the number of bits allocated in advance, and is as large as possible. The encoding unit 27 then generates a gain code corresponding to the determined gain g and the integer signal code corresponding to the determined gain g. In this case, the gain code and the integer signal code together form the code corresponding to the normalized frequency-domain sample sequence.
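The gain determination can be sketched as a search for the smallest gain whose integer signal code still fits in B bits. The text does not specify the search procedure or the variable-length code. The bisection on a logarithmic scale and the bit-count model (an Elias-gamma code plus a sign bit) below are stand-ins chosen only so that the example runs, and all function names are hypothetical.

```python
def quantize(x_norm, gain):
    """Divide by the gain and round to integers."""
    return [round(v / gain) for v in x_norm]


def count_bits(q):
    """Stand-in bit count: Elias-gamma code of |v| + 1, plus a sign bit if nonzero."""
    return sum(2 * (abs(v) + 1).bit_length() - 1 + (1 if v else 0) for v in q)


def search_gain(x_norm, budget_bits, iterations=60):
    """Smallest gain (up to search precision) whose code fits in budget_bits."""
    hi = 2.0 * max(abs(v) for v in x_norm) + 1.0    # quantizes every value to 0
    if count_bits(quantize(x_norm, hi)) > budget_bits:
        raise ValueError("budget too small even for an all-zero sequence")
    lo = hi * 1e-9
    if count_bits(quantize(x_norm, lo)) <= budget_bits:
        return lo
    for _ in range(iterations):
        mid = (lo * hi) ** 0.5                      # bisect on a logarithmic scale
        if count_bits(quantize(x_norm, mid)) <= budget_bits:
            hi = mid                                # fits: try a smaller gain
        else:
            lo = mid                                # too many bits: increase the gain
    return hi


x_norm = [10.0, -3.2, 0.4, 7.7]
g = search_gain(x_norm, budget_bits=20)
used = count_bits(quantize(x_norm, g))   # never exceeds 20
```

The search relies on the bit count not increasing as the gain grows, which holds for this stand-in code because larger gains give quantized values of smaller magnitude.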

[Example of how to obtain the transformation matrix U and the inverse transformation matrix V]
The transformation matrix U, which implements the linear transform from the linearly discretized sample point sequence to the nonlinearly discretized sample point sequence, and the inverse transformation matrix V, which implements the linear transform from the nonlinearly discretized sample point sequence to the linearly discretized sample point sequence, can be obtained in advance by learning, for example. Here, all elements of U and V are constrained to be non-negative, because the vectors before and after the transform are power spectra or their envelopes and are therefore all positive.

Obtaining the stretched pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) from the power spectrum sequence Y(0), Y(1), …, Y(N-1) by a linear transform has the advantage that unintended transformations can be prevented, compared with obtaining the stretched pseudo power spectrum by interpolation with a sinc function or the like.

First, the following are prepared as learning data: a set Y of power spectrum sequences (Ym(0), …, Ym(N-1)) (m = 1, 2, …, M) represented on a linearly discretized sample point sequence consisting of M linearly discretized sample points, and a set ~Y of stretched pseudo power spectrum sequences (~Ym(0), …, ~Ym(N-1)) (m = 1, 2, …, M) represented on a nonlinearly discretized sample point sequence and obtained by interpolating the former with a sinc function or the like.

Then, starting from a band matrix U with suitable initial values set in advance, the transformation matrix U is learned by repeatedly updating its components so that the distance between UY and ~Y is minimized.

As for the distance, the linear prediction coefficients are those that minimize the Itakura-Saito distance between the squared envelope and the original power spectrum, so the Itakura-Saito distance is also a good choice of distance measure for learning. That is, the stretch transformation matrix is learned using the function defined by the following equation as the objective function, where D_IS(a|b) denotes the Itakura-Saito distance of b with a as the reference.

The inverse transformation matrix V may be learned so that the Itakura-Saito distance between Y and VUY is minimized, that is, using the function defined by the following equation as the objective function.
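The two objective functions are not reproduced in this text. Under the assumption that D_IS(a|b) is the standard Itakura-Saito divergence with a as the reference spectrum, and that the distance is summed over the M training sequences, they could be written as follows. The symbols J_U and J_V are introduced here for illustration only.

```latex
J_U(U) = \sum_{m=1}^{M} D_{IS}\bigl(\tilde{Y}_m \,\big|\, U Y_m\bigr),
\qquad
J_V(V) = \sum_{m=1}^{M} D_{IS}\bigl(Y_m \,\big|\, V U Y_m\bigr),
\qquad
D_{IS}(a \mid b) = \sum_{n} \left( \frac{a(n)}{b(n)} - \log \frac{a(n)}{b(n)} - 1 \right)
```

Here Y_m and \tilde{Y}_m denote the m-th training vectors of the sets Y and ~Y. The Itakura-Saito divergence is not symmetric, so the argument order matters, and the order shown is an assumption.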

This optimization problem can be solved, for example, by the auxiliary function method. For example, the optimal solution can be approached by updating the elements of U and V with the following update equations until a predetermined condition is satisfied.

Here, U_i,j^(p) and V_i,j^(p) denote the (i,j) component of the transformation matrix U and the (i,j) component of the inverse transformation matrix V, respectively, obtained by the p-th iteration. Also, Y_j,k = Y_k(j), that is, the j-th element of the k-th pseudo power spectrum sequence of the learning data Y. Similarly, ~Y_i,k = ~Y_k(i), the i-th element of the k-th pseudo power spectrum sequence of the learning data ~Y.

From these update equations, any element of U or V whose initial value is 0 remains 0 after learning, so it need not be computed, and the optimization is carried out within this constraint. Moreover, because U and V are band matrices, a transform in which the positions of the nonzero samples are specified in advance greatly reduces the amount of computation for the actual transform. In learning as well, the many zero-valued elements need not be learned, so learning can be performed at low cost. The amount of computation for the transform and for learning can be adjusted further by changing the bandwidth used when setting the initial values.
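The zero-preserving behaviour can be seen in a small sketch. The patent's own update equations are not reproduced in this text, so the code below uses a standard auxiliary-function (majorization-minimization) multiplicative update for the Itakura-Saito objective with Y held fixed. It is an assumed stand-in and may differ from the patent's rule. Columns of `y` and `y_tilde` are training vectors, matching Y_j,k = Y_k(j). The matrices are made-up values.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def is_divergence(target, model):
    """Sum over all entries of t/m - log(t/m) - 1 (entries must be positive)."""
    return sum(t / m - math.log(t / m) - 1.0
               for t_row, m_row in zip(target, model)
               for t, m in zip(t_row, m_row))


def update_u(u, y, y_tilde):
    """One multiplicative update of U for fixed Y; zero entries of U stay zero."""
    model = matmul(u, y)
    new_u = [row[:] for row in u]
    for i in range(len(u)):
        for j in range(len(u[0])):
            if u[i][j] == 0.0:
                continue                     # outside the band: nothing to learn
            num = sum(y_tilde[i][k] * y[j][k] / model[i][k] ** 2 for k in range(len(y[0])))
            den = sum(y[j][k] / model[i][k] for k in range(len(y[0])))
            new_u[i][j] = u[i][j] * math.sqrt(num / den)
    return new_u


u0 = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]           # band-shaped initial values
y = [[1.0, 2.0], [3.0, 1.0], [2.0, 4.0]]          # two training vectors (columns)
y_tilde = [[2.0, 1.5], [2.5, 3.0]]                # their stretched targets
before = is_divergence(y_tilde, matmul(u0, y))
u1 = update_u(u0, y, y_tilde)
after = is_divergence(y_tilde, matmul(u1, y))     # smaller than before
```

Each update multiplies an element by a positive factor, so an element initialized to 0 can never become nonzero. This is why the band shape chosen at initialization is kept throughout learning.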

(Decoding in the first embodiment)
FIG. 6 shows a configuration example of a decoding device corresponding to the encoding device of the first embodiment. As shown in FIG. 6, the decoding device of the first embodiment includes, for example, a stretched linear prediction coefficient decoding unit 31, a stretched power spectrum envelope sequence generation unit 32, an inverse stretch transform unit 33, a decoding unit 34, an envelope denormalization unit 35, and a time domain transform unit 36. FIG. 7 shows an example of the steps of the decoding method of the first embodiment implemented by this decoding device.

In the decoding device, the MDCT coefficients are reconstructed by processing in the reverse order of the encoding processing performed by the encoding device.

The decoding device receives at least the code corresponding to the normalized frequency-domain sample sequence and the stretched linear prediction coefficient code output by the encoding device. The following description takes as an example the case where the code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) is input as the code corresponding to the normalized frequency-domain sample sequence.

Each unit in FIG. 6 is described below.

＜伸縮線形予測係数復号部３１＞ <Expansion/contraction linear prediction coefficient decoding unit 31>
伸縮線形予測係数復号部３１には、符号化装置が出力した伸縮線形予測係数符号が入力される。 The expansion/contraction linear prediction coefficient code output by the encoding device is input to the expansion/contraction linear prediction coefficient decoding unit 31.

伸縮線形予測係数復号部３１は、フレームごとに、入力された伸縮線形予測係数符号を例えば従来的な復号技術によって復号して量子化伸縮線形予測係数^β₁,^β₂,…, ^β_pを生成する（ステップＤ１）。 For each frame, the expansion/contraction linear prediction coefficient decoding unit 31 decodes the input expansion/contraction linear prediction coefficient code by, for example, a conventional decoding technique to generate the quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p (step D1).

生成された量子化伸縮線形予測係数^β₁,^β₂,…, ^β_pは、伸縮パワースペクトル包絡系列生成部３２に出力される。 The generated quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p are output to the expansion/contraction power spectrum envelope sequence generation unit 32.

ここで、従来的な復号技術とは、例えば、線形予測係数符号が量子化された線形予測係数に対応する符号である場合に線形予測係数符号を復号して量子化された線形予測係数を得る技術、線形予測係数符号が量子化されたLSPパラメータに対応する符号である場合に線形予測係数符号を復号して量子化されたLSPパラメータを得る技術などである。また、量子化された線形予測係数と量子化されたLSPパラメータは互いに変換可能なものであり、入力された予測係数符号と後段での処理において必要な情報に応じて、変換処理を行なえばよいのは周知である。以上から、上記の線形予測係数符号の復号処理と必要に応じて行なう上記の変換処理とを包含したものが「従来的な復号技術による復号」ということになる。なお、ここでは入力される線形予測係数符号が伸縮線形予測係数符号であるが、処理は従来的な復号処理と同様である。 Here, a conventional decoding technique is, for example, a technique that decodes the linear prediction coefficient code to obtain quantized linear prediction coefficients when the code corresponds to quantized linear prediction coefficients, or a technique that decodes the linear prediction coefficient code to obtain quantized LSP parameters when the code corresponds to quantized LSP parameters. Quantized linear prediction coefficients and quantized LSP parameters can be converted into each other, and it is well known that a conversion may be performed according to the input prediction coefficient code and the information required in subsequent processing. Accordingly, "decoding by a conventional decoding technique" covers both the decoding of the linear prediction coefficient code and the conversion performed as necessary. Although the linear prediction coefficient code input here is an expansion/contraction linear prediction coefficient code, the processing is the same as conventional decoding.

＜伸縮パワースペクトル包絡系列生成部３２＞ <Expansion/contraction power spectrum envelope sequence generation unit 32>
伸縮パワースペクトル包絡系列生成部３２には、伸縮線形予測係数復号部３１が生成した量子化伸縮線形予測係数^β₁,^β₂,…,^β_pが入力される。 The quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p generated by the expansion/contraction linear prediction coefficient decoding unit 31 are input to the expansion/contraction power spectrum envelope sequence generation unit 32.

伸縮パワースペクトル包絡系列生成部３２は、量子化伸縮線形予測係数^β₁,^β₂,…,^β_pを用いて、符号化装置の伸縮パワースペクトル包絡系列生成部２４と同様の処理により、非線形離散化サンプル点列に基づいて表現される伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)を生成する（ステップＤ２）。 Using the quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p, the expansion/contraction power spectrum envelope sequence generation unit 32 generates, by the same processing as the expansion/contraction power spectrum envelope sequence generation unit 24 of the encoding device, the expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) expressed on the nonlinear discretized sample point sequence (step D2).

生成された伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)は、逆伸縮変換部３３に出力される。 The generated expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) is output to the inverse expansion/contraction transform unit 33.

＜逆伸縮変換部３３＞ <Inverse expansion/contraction transform unit 33>
逆伸縮変換部３３には、伸縮パワースペクトル包絡系列生成部３２が生成した伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)が入力される。 The expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1) generated by the expansion/contraction power spectrum envelope sequence generation unit 32 is input to the inverse expansion/contraction transform unit 33.

逆伸縮変換部３３は、符号化装置の逆伸縮変換部２５と同様の処理により、非線形離散化サンプル点列に基づいて表現される伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)を線形離散化サンプル点列に基づいて表現されるパワースペクトル包絡系列W(0),W(1),…,W(N-1)に変換する（ステップＤ３）。 By the same processing as the inverse expansion/contraction transform unit 25 of the encoding device, the inverse expansion/contraction transform unit 33 converts the expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1), expressed on the nonlinear discretized sample point sequence, into the power spectrum envelope sequence W(0),W(1),…,W(N-1), expressed on the linear discretized sample point sequence (step D3).

変換されたパワースペクトル包絡系列W(0),W(1),…,W(N-1)は、包絡逆正規化部３５に出力される。 The converted power spectrum envelope sequence W(0),W(1),…,W(N-1) is output to the envelope denormalization unit 35.

＜復号部３４＞ <Decoding unit 34>
復号部３４には、符号化装置が出力した正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)に対応する符号が入力される。 The code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) output by the encoding device is input to the decoding unit 34.

復号部３４は、フレームごとに、入力された正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)に対応する符号を復号して正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)を生成する（ステップＤ４）。 For each frame, the decoding unit 34 decodes the input code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) to generate the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) (step D4).

生成された正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)は、包絡逆正規化部３５に出力される。 The generated normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) is output to the envelope denormalization unit 35.

例えば、符号化装置でライス符号化を用いた場合には、復号部３４は、ライス符号化に対応した復号処理により符号を復号する。 For example, when Rice coding is used in the encoding device, the decoding unit 34 decodes the code by a decoding process corresponding to Rice coding.

正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)に対応する符号として利得符号及び整数信号符号が入力された場合には、復号部３４は、整数信号符号を復号することにより得られる量子化正規化済係数系列X_Q(0),X_Q(1),…,X_Q(N-1)に利得符号により特定される利得を乗じることにより正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)を生成する。 When a gain code and an integer signal code are input as the code corresponding to the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1), the decoding unit 34 generates the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) by multiplying the quantized normalized coefficient sequence X_Q(0),X_Q(1),…,X_Q(N-1), obtained by decoding the integer signal code, by the gain specified by the gain code.
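The decoding path just described (Rice decoding followed by multiplication by the gain) can be sketched as follows. The bit layout is an assumption made only for illustration and is not specified in this text: a unary quotient of "1" bits terminated by "0", followed by r remainder bits, with a zigzag mapping for signed integers. The encoder is included solely so that the round trip can be checked; all function names are hypothetical.

```python
def rice_encode(values, r):
    """Encode signed integers as a bit string (assumed layout, see lead-in)."""
    bits = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1      # zigzag map to a non-negative integer
        q, rem = u >> r, u & ((1 << r) - 1)
        bits.append("1" * q + "0")               # unary quotient
        if r:
            bits.append(format(rem, "0{}b".format(r)))  # r-bit remainder
    return "".join(bits)


def rice_decode(bits, r, count):
    """Decode `count` signed integers from a bit string produced by rice_encode."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                  # skip the terminating "0"
        rem = int(bits[pos:pos + r], 2) if r else 0
        pos += r
        u = (q << r) | rem
        out.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))  # undo zigzag
    return out


def dequantize(x_q, gain):
    """X_N(i) = gain * X_Q(i): scale the decoded integers by the decoded gain."""
    return [gain * v for v in x_q]


x_q = [3, -1, 0, 2, -4]
code = rice_encode(x_q, 1)
decoded = rice_decode(code, 1, len(x_q))
x_n = dequantize(decoded, 0.5)
```

In an actual decoder the Rice parameter and the gain would come from the bitstream or from side information rather than being fixed as here.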

＜包絡逆正規化部３５＞ <Envelope denormalization unit 35>
包絡逆正規化部３５には、逆伸縮変換部３３が変換したパワースペクトル包絡系列W(0),W(1),…,W(N-1)及び復号部３４が生成した正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)が入力される。 The power spectrum envelope sequence W(0),W(1),…,W(N-1) converted by the inverse expansion/contraction transform unit 33 and the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) generated by the decoding unit 34 are input to the envelope denormalization unit 35.

包絡逆正規化部３５は、パワースペクトル包絡系列W(0),W(1),…,W(N-1)を用いて、正規化された周波数領域のサンプル列である正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)を逆正規化することにより、MDCT係数列X(0),X(1),…,X(N-1)を生成する（ステップＤ５）。 Using the power spectrum envelope sequence W(0),W(1),…,W(N-1), the envelope denormalization unit 35 denormalizes the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1), which is a normalized frequency domain sample sequence, to generate the MDCT coefficient sequence X(0),X(1),…,X(N-1) (step D5).

生成されたMDCT係数列X(0),X(1),…,X(N-1)は、時間領域変換部３６に出力される。 The generated MDCT coefficient sequence X(0),X(1),…,X(N-1) is output to the time domain conversion unit 36.

例えば、包絡逆正規化部３５は、i=0,1,…,N-1として、正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)の各係数X_N(i)に、パワースペクトル包絡系列W(0),W(1),…,W(N-1)の各包絡値W(i)の平方根を乗じることによりMDCT係数列X(0),X(1),…,X(N-1)を生成する。すなわち、i=0,1,…,N-1として、X(i)=X_N(i)*sqrt(W(i))である。ここで、xを実数としてsqrt(x)はxの平方根を表す。 For example, for i=0,1,…,N-1, the envelope denormalization unit 35 generates the MDCT coefficient sequence X(0),X(1),…,X(N-1) by multiplying each coefficient X_N(i) of the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1) by the square root of the corresponding envelope value W(i) of the power spectrum envelope sequence W(0),W(1),…,W(N-1). That is, X(i)=X_N(i)*sqrt(W(i)) for i=0,1,…,N-1. Here, sqrt(x) denotes the square root of a real number x.
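The relation X(i)=X_N(i)*sqrt(W(i)) is simple enough to state directly in code. The sketch below also includes the encoder-side division so that the round trip can be checked; the function names and the numeric values are hypothetical and chosen only so that the square roots are exact.

```python
import math


def denormalize(x_n, w):
    """Decoder side: X(i) = X_N(i) * sqrt(W(i))."""
    return [xi * math.sqrt(wi) for xi, wi in zip(x_n, w)]


def normalize(x, w):
    """Encoder side counterpart: X_N(i) = X(i) / sqrt(W(i))."""
    return [xi / math.sqrt(wi) for xi, wi in zip(x, w)]


w = [4.0, 1.0, 0.25, 9.0]        # hypothetical power spectrum envelope values
x_n = [0.5, -2.0, 4.0, 1.0]      # hypothetical normalized MDCT coefficients
x = denormalize(x_n, w)
```

Because W is a power envelope, its square root plays the role of an amplitude envelope, which is why the coefficients are scaled by sqrt(W(i)) rather than by W(i).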

したがって、包絡逆正規化部３５は、伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)に基づいて周波数領域のサンプル列である正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)を逆正規化することにより、周波数領域サンプル列を生成しているとも言える。 Therefore, it can also be said that the envelope denormalization unit 35 generates the frequency domain sample sequence by denormalizing the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1), which is a frequency domain sample sequence, on the basis of the expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1).

＜時間領域変換部３６＞ <Time domain conversion unit 36>
時間領域変換部３６には、包絡逆正規化部３５が生成したMDCT係数列X(0),X(1),…,X(N-1)が入力される。 The MDCT coefficient sequence X(0),X(1),…,X(N-1) generated by the envelope denormalization unit 35 is input to the time domain conversion unit 36.

時間領域変換部３６は、フレームごとに、包絡逆正規化部で得た「MDCT係数列」を時間領域に変換してフレーム単位の音信号（復号音信号）を得る（ステップＤ６）。 For each frame, the time domain conversion unit 36 converts the “MDCT coefficient sequence” obtained by the envelope denormalization unit into the time domain to obtain a sound signal (decoded sound signal) in units of frames (step D6).

［第二実施形態］ [Second embodiment]
（第二実施形態の符号化） (Encoding of the second embodiment)
第二実施形態の符号化装置の構成例は、図２に示した第一実施形態の符号化装置の構成例と同様である。 The configuration example of the encoding device of the second embodiment is the same as that of the encoding device of the first embodiment shown in FIG. 2.

以下、第一実施形態と異なる部分を中心に説明する。第一実施形態と同様の部分については説明を省略する。 Hereinafter, a description will be given centering on differences from the first embodiment. Description of the same parts as those in the first embodiment is omitted.

第二実施形態の符号化装置は、MDCT係数列X(0),X(1),…,X(N-1)を正規化する際に用いるパワースペクトル包絡系列が異なる。すなわち、非線形離散化サンプル点列に対応する伸縮パワースペクトル包絡を平滑化するときの平滑化方法が異なる。言い換えれば、伸縮平滑化パワースペクトル包絡系列生成部２４による伸縮パワースペクトル包絡系列~W(0),~W(1),…,~W(N-1)の生成方法が異なる。 The encoding device of the second embodiment differs in the power spectrum envelope sequence used to normalize the MDCT coefficient sequence X(0),X(1),…,X(N-1). That is, it differs in the method used to smooth the expansion/contraction power spectrum envelope corresponding to the nonlinear discretized sample point sequence. In other words, it differs in how the expansion/contraction smoothed power spectrum envelope sequence generation unit 24 generates the expansion/contraction power spectrum envelope sequence ~W(0),~W(1),…,~W(N-1).

第一実施形態の符号化装置では、伸縮パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)の一例である伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)は、線形離散化サンプル点列に対応するパワースペクトル包絡系列を平滑化する従来と同様の方法で、非線形離散化サンプル点列に対応する伸縮疑似パワースペクトル包絡系列を平滑化することにより生成されている。 In the encoding device of the first embodiment, the expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1), which is an example of the expansion/contraction power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1), is generated by smoothing the expansion/contraction pseudo power spectrum envelope sequence corresponding to the nonlinear discretized sample point sequence, using the same conventional method that is used to smooth a power spectrum envelope sequence corresponding to a linear discretized sample point sequence.

このようにして生成された非線形離散化サンプル点列に対応する伸縮パワースペクトル包絡系列を線形離散化サンプル点列に対応するパワースペクトル包絡系列に逆伸縮変換すると、線形離散化サンプル点列中の解像度の高い周波数領域では伸縮パワースペクトル包絡系列に対して施した平滑化の効果が相殺され、線形離散化サンプル点列に対応するパワースペクトル包絡系列に逆伸縮変換したときにピークの形が大きく残ってしまうことがある。その結果、ピークの形が大きく残ったパワースペクトル包絡系列を用いて正規化MDCT係数列を求めて符号化することになるので、平滑化効果が十分に得られず、符号化の効率が低下してしまうことがある。 When the expansion/contraction power spectrum envelope sequence generated in this way for the nonlinear discretized sample point sequence is inversely transformed into a power spectrum envelope sequence corresponding to the linear discretized sample point sequence, the smoothing applied to the expansion/contraction power spectrum envelope sequence is cancelled out in the frequency regions where the linear discretized sample point sequence has high resolution, and pronounced peaks may remain after the inverse expansion/contraction transform. As a result, the normalized MDCT coefficient sequence is obtained and encoded using a power spectrum envelope sequence in which pronounced peaks remain, so a sufficient smoothing effect is not obtained and the encoding efficiency may decrease.

このため、第二実施形態の伸縮パワースペクトル包絡系列生成部２４は、逆伸縮変換後のパワースペクトル包絡系列が、線形離散化サンプル点列で均一な解像度で表現されたパワースペクトル包絡を平滑化したときの平滑化パワースペクトル（従来の平滑化パワースペクトル）を近似するものとなるように、非線形離散化サンプル点列の伸縮の度合いg(k)に応じて平滑化の効果を補正する。 For this reason, the expansion/contraction power spectrum envelope sequence generation unit 24 of the second embodiment corrects the smoothing effect according to the degree of expansion/contraction g(k) of the nonlinear discretized sample point sequence, so that the power spectrum envelope sequence after the inverse expansion/contraction transform approximates the smoothed power spectrum (the conventional smoothed power spectrum) obtained by smoothing a power spectrum envelope expressed at uniform resolution on the linear discretized sample point sequence.

具体的には、第二実施形態の伸縮平滑化パワースペクトル包絡系列生成部２４は、伸縮パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)として、式（３’）により定義される伸縮パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)を生成する（ステップＥ４）。生成された伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)は、伸縮パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)として逆伸縮変換部２５に出力される。 Specifically, the expansion/contraction smoothed power spectrum envelope sequence generation unit 24 of the second embodiment generates, as the expansion/contraction power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1), the expansion/contraction power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) defined by equation (3') (step E4). The generated expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) is output to the inverse expansion/contraction transform unit 25 as the expansion/contraction power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1).

式（３’）により定義される伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)は、g(k)(k=0,1,…,N-1)により補正された伸縮平滑化パワースペクトル包絡系列であることから、式（３’）により定義される伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)のことを、補正された伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)とも呼ぶ。 Because the expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) defined by equation (3') is corrected by g(k) (k=0,1,…,N-1), it is also referred to as the corrected expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1).

式（３）と式（３’）とを比較すると、補正係数γⁿにg(k)が乗じられている点が異なる。g(k)は、非線形離散化サンプル点列の線形離散化サンプル点列からの伸縮の度合いに対応する値であり、例えば以下のように定義される。 Comparing equation (3) and equation (3 ′), the difference is that the correction coefficient γ ⁿ is multiplied by g (k). g (k) is a value corresponding to the degree of expansion and contraction of the nonlinear discretized sample point sequence from the linear discretized sample point sequence, and is defined as follows, for example.

f(k)は、非線形離散化サンプル点列でｋ番目のインデックスに対応するサンプル点の周波数の、線形離散化サンプル点列での相対的な周波数位置を表すものである。したがって、式（３’）は、逆伸縮変換の際に間隔を縮める周波数では、補正値γを等価的に小さくすることを意味する。なお、非線形離散化サンプル点列から線形離散化サンプル点列への逆変換を実現する行列を逆変換行列Vとすれば、f(k)(k=0,1,…,N-1)は例えば以下のように表される。 f(k) represents the relative frequency position, in the linear discretized sample point sequence, of the frequency of the sample point corresponding to the k-th index of the nonlinear discretized sample point sequence. Equation (3') therefore means that the correction value γ is made equivalently smaller at frequencies where the interval is shortened by the inverse expansion/contraction transform. If the matrix that realizes the inverse transform from the nonlinear discretized sample point sequence to the linear discretized sample point sequence is denoted as the inverse transformation matrix V, then f(k) (k=0,1,…,N-1) is expressed, for example, as follows.

なお、逆変換行列Vは、変換行列Uと同様に帯行列で表現できる。逆変換行列Vも変換行列Uと同様に、非線形離散化サンプル点列と線形離散化サンプル点列との相関関係から予め学習などにより求めておくことができる。求め方は第一実施形態で説明したものと同様である。なお、離散信号における周波数の非線形変換は不可逆な演算であるため、必ずしもVがUの逆行列の関係にあるものではない。 The inverse transformation matrix V can be expressed by a band matrix like the transformation matrix U. Similarly to the transformation matrix U, the inverse transformation matrix V can be obtained in advance by learning or the like from the correlation between the non-linear discretized sample point sequence and the linear discretized sample point sequence. The method of obtaining is the same as that described in the first embodiment. Note that since the nonlinear conversion of the frequency of a discrete signal is an irreversible operation, V does not necessarily have an inverse matrix relationship of U.

（第二実施形態の復号） (Decoding of the second embodiment)
第二実施形態の復号装置の構成例は、図６に示した第一実施形態の復号装置の構成例と同様である。 The configuration example of the decoding device of the second embodiment is the same as that of the decoding device of the first embodiment shown in FIG. 6.

第二実施形態の復号装置は、伸縮パワースペクトル包絡系列生成部３２が、ステップＤ２において、伸縮パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)として、式（３’）により定義される補正された伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)を生成する部分で第一実施形態の復号装置と異なる。 The decoding device of the second embodiment differs from that of the first embodiment in that, in step D2, the expansion/contraction power spectrum envelope sequence generation unit 32 generates, as the expansion/contraction power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1), the corrected expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) defined by equation (3').

第二実施形態の復号装置は、他の部分については、第一実施形態の復号装置と同様である。 The decoding device of the second embodiment is the same as the decoding device of the first embodiment with respect to other parts.

［第三実施形態］ [Third embodiment]
（第三実施形態の符号化） (Encoding of the third embodiment)
第三実施形態の符号化装置の構成例を図８に示す。第三実施形態の符号化装置は、図８に示すように、周波数領域変換部２１と、伸縮疑似パワースペクトル系列生成部２２と、線形予測分析部２３と、伸縮平滑化パワースペクトル包絡系列生成部２１３と、第二逆伸縮変換部２８と、伸縮非平滑化パワースペクトル包絡系列生成部２９と、第一逆伸縮変換部２１０と、包絡正規化部２６と、符号化部２７とを例えば備えている。この符号化装置により実現される第三実施形態の符号化方法の各処理の例を図９に示す。 A configuration example of the encoding device of the third embodiment is shown in FIG. 8. As shown in FIG. 8, the encoding device of the third embodiment includes, for example, a frequency domain transform unit 21, an expansion/contraction pseudo power spectrum sequence generation unit 22, a linear prediction analysis unit 23, an expansion/contraction smoothed power spectrum envelope sequence generation unit 213, a second inverse expansion/contraction transform unit 28, an expansion/contraction unsmoothed power spectrum envelope sequence generation unit 29, a first inverse expansion/contraction transform unit 210, an envelope normalization unit 26, and an encoding unit 27. An example of each process of the encoding method of the third embodiment realized by this encoding device is shown in FIG. 9.

以下、第一実施形態及び第二実施形態と異なる部分を中心に説明する。第一実施形態及び第二実施形態と同様の部分については説明を省略する。 Hereinafter, a description will be given centering on differences from the first embodiment and the second embodiment. Description of the same parts as those in the first embodiment and the second embodiment is omitted.

第一実施形態及び第二実施形態の符号化装置では、MDCT係数列X(0),X(1),…,X(N-1)を符号化する際に、平滑化されたパワースペクトル包絡の情報のみを用いた。これに対して、第三実施形態の符号化装置は、MDCT係数列を符号化する際に、平滑化されたパワースペクトル包絡の平方根を、MDCT係数を割り算するために用い、それに加えて、平滑化していない周波数領域のパワースペクトル包絡と平滑化されたパワースペクトル包絡のサンプルごとの比の情報を量子化対象のMDCT係数の振幅を推定する補助情報として用いる。 The encoding devices of the first and second embodiments use only the information of the smoothed power spectrum envelope when encoding the MDCT coefficient sequence X(0),X(1),…,X(N-1). In contrast, when encoding the MDCT coefficient sequence, the encoding device of the third embodiment uses the square root of the smoothed power spectrum envelope to divide the MDCT coefficients and, in addition, uses the per-sample ratio between the unsmoothed frequency domain power spectrum envelope and the smoothed power spectrum envelope as auxiliary information for estimating the amplitude of the MDCT coefficients to be quantized.

＜伸縮平滑化パワースペクトル包絡系列生成部２１３＞ <Expansion/contraction smoothed power spectrum envelope sequence generation unit 213>
伸縮平滑化パワースペクトル包絡系列生成部２１３には、線形予測分析部２３が生成した量子化伸縮線形予測係数^β₁,^β₂,…,^β_pが入力される。 The quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p generated by the linear prediction analysis unit 23 are input to the expansion/contraction smoothed power spectrum envelope sequence generation unit 213.

伸縮平滑化パワースペクトル包絡系列生成部２１３は、第一実施形態又は第二実施形態の伸縮平滑化パワースペクトル包絡系列生成部２４と同様の処理により、量子化伸縮線形予測係数^β₁,^β₂,…,^β_pを用いて、非線形離散化サンプル点列に対応する伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)を生成する（ステップＥ４）。 By the same processing as the expansion/contraction smoothed power spectrum envelope sequence generation unit 24 of the first or second embodiment, the expansion/contraction smoothed power spectrum envelope sequence generation unit 213 uses the quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p to generate the expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) corresponding to the nonlinear discretized sample point sequence (step E4).

生成された伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)は、第二逆伸縮変換部２８に出力される。 The generated expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) is output to the second inverse expansion/contraction transform unit 28.

伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)は、式（３）又は式（３’）で表される。 The expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) is expressed by equation (3) or equation (3').

＜第二逆伸縮変換部２８＞ <Second inverse expansion/contraction transform unit 28>
第二逆伸縮変換部２８には、伸縮平滑化パワースペクトル包絡系列生成部２１３が生成した伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)が入力される。 The expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1) generated by the expansion/contraction smoothed power spectrum envelope sequence generation unit 213 is input to the second inverse expansion/contraction transform unit 28.

そして、第二逆伸縮変換部２８は、第一実施形態及び第二実施形態の逆伸縮変換部２５と同様の処理により、非線形離散化サンプル点列に対応する伸縮平滑化パワースペクトル包絡系列~W_γ(0),~W_γ(1),…,~W_γ(N-1)を、線形離散化サンプル点列に対応する平滑化パワースペクトル包絡系列W_γ(0),W_γ(1),…,W_γ(N-1)に変換する（ステップＥ８）。 By the same processing as the inverse expansion/contraction transform unit 25 of the first and second embodiments, the second inverse expansion/contraction transform unit 28 then converts the expansion/contraction smoothed power spectrum envelope sequence ~W_γ(0),~W_γ(1),…,~W_γ(N-1), corresponding to the nonlinear discretized sample point sequence, into the smoothed power spectrum envelope sequence W_γ(0),W_γ(1),…,W_γ(N-1), corresponding to the linear discretized sample point sequence (step E8).

変換された平滑化パワースペクトル包絡系列W_γ(0),W_γ(1),…,W_γ(N-1)は、包絡正規化部２６及び符号化部２７に出力される。 The converted smoothed power spectrum envelope sequence W_γ(0),W_γ(1),…,W_γ(N-1) is output to the envelope normalization unit 26 and the encoding unit 27.

＜伸縮非平滑化パワースペクトル包絡系列生成部２９＞ <Expansion/contraction unsmoothed power spectrum envelope sequence generation unit 29>
伸縮非平滑化パワースペクトル包絡系列生成部２９には、線形予測分析部２３が生成した量子化伸縮線形予測係数^β₁,^β₂,…,^β_pが入力される。 The quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p generated by the linear prediction analysis unit 23 are input to the expansion/contraction unsmoothed power spectrum envelope sequence generation unit 29.

伸縮非平滑化パワースペクトル包絡系列生成部２９は、量子化伸縮線形予測係数^β₁,^β₂,…,^β_pを用いて、式（２）により定義される平滑化前の伸縮非平滑化パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)を生成する（ステップＥ９）。 Using the quantized expansion/contraction linear prediction coefficients ^β₁,^β₂,…,^β_p, the expansion/contraction unsmoothed power spectrum envelope sequence generation unit 29 generates the expansion/contraction unsmoothed power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1), that is, the sequence before smoothing, defined by equation (2) (step E9).

生成された伸縮非平滑化パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)は、第一逆伸縮変換部２１０に出力される。 The generated expansion/contraction unsmoothed power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1) is output to the first inverse expansion/contraction transform unit 210.

＜第一逆伸縮変換部２１０＞ <First inverse expansion/contraction transform unit 210>
第一逆伸縮変換部２１０には、伸縮非平滑化パワースペクトル包絡系列生成部２９が生成した伸縮非平滑化パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)が入力される。 The expansion/contraction unsmoothed power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1) generated by the expansion/contraction unsmoothed power spectrum envelope sequence generation unit 29 is input to the first inverse expansion/contraction transform unit 210.

第一逆伸縮変換部２１０は、非線形離散化サンプル点列に対応する伸縮非平滑化パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)に基づいて、線形離散化サンプル点列に対応する非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)を生成する（ステップＥ１０）。 On the basis of the expansion/contraction unsmoothed power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1) corresponding to the nonlinear discretized sample point sequence, the first inverse expansion/contraction transform unit 210 generates the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) corresponding to the linear discretized sample point sequence (step E10).

生成された非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)は、符号化部２７に出力される。 The generated unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) is output to the encoding unit 27.

第一逆伸縮変換部２１０は、第一実施形態の逆伸縮変換部２５と同様にして、補間や線形変換により非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)を生成する。 In the same manner as the inverse expansion/contraction transform unit 25 of the first embodiment, the first inverse expansion/contraction transform unit 210 generates the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) by interpolation or by a linear transform.

線形変換により非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)を生成する場合は、第一逆伸縮変換部２１０は、例えば第一実施形態で説明した逆変換行列Vを用いて、以下の式により非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)を生成する。 When the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) is generated by a linear transform, the first inverse expansion/contraction transform unit 210 generates it by the following equation, using, for example, the inverse transformation matrix V described in the first embodiment.
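The equation referred to here is not reproduced in this text. One reading that is consistent with the surrounding description, assuming that V is an N×N matrix applied to the warped-domain sequence arranged as a column vector, would be the following; this is a sketch of that assumption, not the patent's own formula.

```latex
\begin{pmatrix} W_o(0) \\ W_o(1) \\ \vdots \\ W_o(N-1) \end{pmatrix}
= V
\begin{pmatrix} \tilde{W}_o(0) \\ \tilde{W}_o(1) \\ \vdots \\ \tilde{W}_o(N-1) \end{pmatrix}
```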

補間により非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)を生成する場合、第一逆伸縮変換部２１０は例えば以下の処理を行う。第一逆伸縮変換部２１０は、伸縮疑似パワースペクトル系列生成部２２と同様に、伸縮非平滑化パワースペクトル包絡系列~W_o(0),~W_o(1),…,~W_o(N-1)をsinc関数により補間した曲線（伸縮パワースペクトル包絡をなめらかにつないだ包絡）を求める。そして、その曲線上で線形離散化サンプル点列の各離散化サンプル点に対応する周波数のパワースペクトル包絡値の系列を非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)として得る。 When the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) is generated by interpolation, the first inverse expansion/contraction transform unit 210 performs, for example, the following processing. Like the expansion/contraction pseudo power spectrum sequence generation unit 22, the first inverse expansion/contraction transform unit 210 obtains a curve (an envelope that smoothly connects the expansion/contraction power spectrum envelope) by interpolating the expansion/contraction unsmoothed power spectrum envelope sequence ~W_o(0),~W_o(1),…,~W_o(N-1) with the sinc function. It then takes, as the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1), the sequence of power spectrum envelope values on that curve at the frequencies corresponding to the discretized sample points of the linear discretized sample point sequence.
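The interpolation route can be sketched as below. The sketch assumes that the location of each linear sample point is already known as a fractional index on the warped axis; the list `positions` that carries this mapping, and the function names, are hypothetical and not taken from the original text.

```python
import math


def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)


def sinc_interpolate(samples, position):
    """Value at fractional index `position` of the sinc-interpolated curve."""
    return sum(s * sinc(position - k) for k, s in enumerate(samples))


def resample(samples, positions):
    """Read the interpolated curve at every requested fractional index."""
    return [sinc_interpolate(samples, p) for p in positions]


warped = [1.0, 4.0, 2.0, 3.0]            # hypothetical warped-domain envelope values
positions = [0.0, 0.4, 1.1, 3.0]         # hypothetical warped-axis locations of linear points
linear = resample(warped, positions)
```

At integer positions the curve passes through the original samples. Between samples a sinc-interpolated curve can overshoot, so a practical implementation of a power envelope would presumably guard against non-positive values; that safeguard is outside this sketch.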

＜包絡正規化部２６＞ <Envelope normalization unit 26>
包絡正規化部２６には、周波数領域変換部２１が変換したMDCT係数列X(0),X(1),…,X(N-1)及び第二逆伸縮変換部２８が変換した平滑化パワースペクトル包絡系列W_γ(0),W_γ(1),…,W_γ(N-1)が入力される。 The MDCT coefficient sequence X(0),X(1),…,X(N-1) converted by the frequency domain transform unit 21 and the smoothed power spectrum envelope sequence W_γ(0),W_γ(1),…,W_γ(N-1) converted by the second inverse expansion/contraction transform unit 28 are input to the envelope normalization unit 26.

包絡正規化部２６は、第一実施形態及び第二実施形態の包絡正規化部２６と同様にして、平滑化パワースペクトル包絡系列W_γ(0),W_γ(1),…,W_γ(N-1)に基づいて周波数領域のサンプル列であるMDCT係数列X(0),X(1),…,X(N-1)を正規化することにより、正規化された周波数領域サンプル列を生成する（ステップＥ６）。正規化された周波数領域サンプル列とは、この例では正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)である。例えば、i=0,1,…,N-1として、X_N(i)=X(i)/sqrt(W_γ(i))とする。 In the same manner as the envelope normalization unit 26 of the first and second embodiments, the envelope normalization unit 26 generates a normalized frequency domain sample sequence by normalizing the MDCT coefficient sequence X(0),X(1),…,X(N-1), which is a frequency domain sample sequence, on the basis of the smoothed power spectrum envelope sequence W_γ(0),W_γ(1),…,W_γ(N-1) (step E6). In this example, the normalized frequency domain sample sequence is the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1). For example, X_N(i)=X(i)/sqrt(W_γ(i)) for i=0,1,…,N-1.

＜符号化部２７＞ <Encoding unit 27>
符号化部２７には、包絡正規化部２６が生成した正規化された周波数領域サンプル列と、第二逆伸縮変換部２８が変換した平滑化パワースペクトル包絡系列W_γ(0),W_γ(1),…,W_γ(N-1)と、第一逆伸縮変換部２１０が生成した非平滑化パワースペクトル包絡系列W_o(0),W_o(1),…,W_o(N-1)とが入力される。この例では、正規化された周波数領域サンプル列は、正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)である。 The normalized frequency domain sample sequence generated by the envelope normalization unit 26, the smoothed power spectrum envelope sequence W_γ(0),W_γ(1),…,W_γ(N-1) converted by the second inverse expansion/contraction transform unit 28, and the unsmoothed power spectrum envelope sequence W_o(0),W_o(1),…,W_o(N-1) generated by the first inverse expansion/contraction transform unit 210 are input to the encoding unit 27. In this example, the normalized frequency domain sample sequence is the normalized MDCT coefficient sequence X_N(0),X_N(1),…,X_N(N-1).

符号化部２７は、まず、正規化MDCT係数列X_N(0),X_N(1),…,X_N(N-1)の各係数を利得（グローバルゲイン）gで割り算し、その結果を量子化した整数値による系列である量子化正規化済係数系列X_Q(0),X_Q(1),…,X_Q(N-1)を符号化して得られる符号を整数信号符号とする。量子化正規化済係数系列X_Q(0),X_Q(1),…,X_Q(N-1)は例えば可変長符号化される。符号化部２７は、この整数信号符号のビット数が、予め配分されたビット数である配分ビット数B以下、かつ、なるべく大きな値となるような利得gを決定する。そして、符号化部２７は、この決定された利得gに対応する利得符号と、この決定された利得gに対応する整数信号符号とを生成する。この場合、利得符号と整数信号符号とが、正規化された周波数領域サンプル列に対応する符号となる。 The encoding unit 27 first divides each coefficient of the normalized MDCT coefficient sequence X _N (0), X _N (1),..., X _N (N−1) by a gain (global gain) g, and the result The code obtained by encoding the quantized normalized coefficient sequence X _Q (0), X _Q (1), ..., X _Q (N-1), which is a sequence of integer values obtained by quantizing To do. Quantized normalized coefficient series X _Q (0), X _Q (1),..., X _Q (N−1) are, for example, variable length encoded. The encoding unit 27 determines a gain g such that the number of bits of the integer signal code is equal to or smaller than the allocated bit number B, which is the number of bits allocated in advance, and as large as possible. Then, the encoding unit 27 generates a gain code corresponding to the determined gain g and an integer signal code corresponding to the determined gain g. In this case, the gain code and the integer signal code are codes corresponding to the normalized frequency domain sample sequence.
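The gain decision described above (a code length no greater than B, yet as large as possible) can be sketched as a bisection over the gain. The sketch relies on assumptions not stated in the original: a Rice code with a fixed parameter and zigzag sign mapping serves as the bit-count model, the search interval is fixed, and the budget is large enough for the all-zero quantization to fit. All names are hypothetical.

```python
import math


def rice_bits(values, r):
    """Bit count of a Rice code with parameter r (zigzag sign mapping assumed)."""
    total = 0
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1
        total += (u >> r) + 1 + r        # unary quotient + terminator + remainder
    return total


def quantize(x_n, gain):
    """X_Q(i): normalized coefficient divided by the gain, rounded to an integer."""
    return [int(round(v / gain)) for v in x_n]


def search_gain(x_n, budget_bits, r=1, lo=1e-3, hi=1e3, iterations=40):
    """Bisection (on a log scale) for the smallest gain whose code fits the budget.

    Assumes the code fits at `hi`; a smaller gain gives finer quantization and
    therefore a bit count that is never smaller.
    """
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if rice_bits(quantize(x_n, mid), r) <= budget_bits:
            hi = mid                     # fits: try a finer quantization
        else:
            lo = mid                     # too many bits: coarsen
    return hi


x_n = [3.2, -1.1, 0.4, 7.9, -2.5, 0.0, 1.3, -0.7]   # hypothetical normalized coefficients
B = 40                                               # hypothetical allocated bit count
g = search_gain(x_n, B)
used = rice_bits(quantize(x_n, g), 1)
```

Choosing the smallest gain that still fits keeps the quantization as fine as the bit budget allows, which corresponds to a bit count that is "as large as possible" within B.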

When variable-length coding the quantized normalized coefficient sequence X_Q(0), X_Q(1), …, X_Q(N-1), the encoding unit 27 improves compression efficiency by changing how the variable-length code allocates information according to an estimate of the amplitude of each MDCT coefficient to be quantized. The estimate is obtained from the non-smoothed power spectrum envelope sequence (the values before smoothing) and the smoothed power spectrum envelope sequence (the values after smoothing).

Here, the amplitude of the MDCT coefficient X_Q(k) to be quantized can be estimated as the square root of the non-smoothed power spectrum envelope coefficient W_o(k) divided by the smoothed power spectrum envelope coefficient W_γ(k), that is, sqrt(W_o(k)/W_γ(k)). This value is used as the estimate of the amplitude of X_Q(k).

For example, when Rice coding is used as the variable-length coding method, a value proportional to the logarithm of the estimated amplitude of the MDCT coefficient X_Q(k) to be quantized is used as the Rice parameter for Rice coding the quantized normalized coefficient X_Q(k). In other words, the Rice parameter for Rice coding X_Q(k) is selected based on the per-sample ratio of the non-smoothed frequency-domain power spectrum envelope to the smoothed power spectrum envelope.
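The per-sample Rice parameter selection described above can be sketched as follows. The proportionality constant `scale`, the base-2 logarithm, the rounding, and the clamp to non-negative parameters are assumptions made for illustration; the text only states that the parameter is proportional to the logarithm of the amplitude estimate.

```python
import numpy as np

def rice_parameters(w_o, w_gamma, scale=1.0):
    """Amplitude estimates sqrt(W_o(k)/W_gamma(k)) and per-sample Rice parameters."""
    amp = np.sqrt(np.asarray(w_o, dtype=float) / np.asarray(w_gamma, dtype=float))
    # Assumed mapping: round(scale * log2(amp)), clamped so the parameter is >= 0.
    r = np.rint(scale * np.log2(np.maximum(amp, 1.0))).astype(int)
    return amp, r
```

Because the decoder can compute both envelopes from the decoded prediction coefficients, a rule of this kind needs no side information for the Rice parameters.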

Instead of switching the Rice parameter for every sample k, the encoder may switch it once per group of several samples. In that case, for example, a value proportional to the logarithm of the mean of the estimated amplitudes of the MDCT coefficients X_Q(k) over the group is used as the Rice parameter for Rice coding the quantized normalized coefficients X_Q(k) in that group.
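The group-wise variant can be sketched in the same way. The group length `block`, the constant `scale`, and the rounding and clamping rule are hypothetical choices made here, not values from the text.

```python
import numpy as np

def block_rice_parameters(amp, block=4, scale=1.0):
    """One Rice parameter per group of `block` samples, from the mean amplitude estimate."""
    amp = np.asarray(amp, dtype=float)
    r = np.empty(len(amp), dtype=int)
    for start in range(0, len(amp), block):
        seg = slice(start, start + block)
        mean_amp = float(np.mean(amp[seg]))
        # Assumed mapping: round(scale * log2(mean)), clamped to be non-negative.
        r[seg] = int(np.rint(scale * np.log2(max(mean_amp, 1.0))))
    return r
```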

(Decoding of the third embodiment)
FIG. 10 shows a configuration example of the decoding device of the third embodiment. As shown in FIG. 10, the decoding device includes, for example, a stretched linear prediction coefficient decoding unit 31, a stretched smoothed power spectrum envelope sequence generation unit 312, a second inverse stretch transform unit 39, a stretched non-smoothed power spectrum envelope sequence generation unit 37, a first inverse stretch transform unit 38, a decoding unit 34, an envelope denormalization unit 35, and a time domain transform unit 36. FIG. 11 shows an example of each process of the decoding method of the third embodiment realized by this decoding device.

The following description focuses on the parts that differ from the decoding device of the first embodiment; description of the parts that are the same as in the first embodiment is omitted.

<Stretched non-smoothed power spectrum envelope sequence generation unit 37>
The stretched non-smoothed power spectrum envelope sequence generation unit 37 receives the quantized stretched linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the stretched linear prediction coefficient decoding unit 31.

Using the quantized stretched linear prediction coefficients ^β_1, ^β_2, …, ^β_p, the stretched non-smoothed power spectrum envelope sequence generation unit 37 generates the stretched non-smoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1) defined by equation (2) (step D7).

The generated stretched non-smoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1) is output to the first inverse stretch transform unit 38.

<First inverse stretch transform unit 38>
The first inverse stretch transform unit 38 receives the stretched non-smoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1) generated by the stretched non-smoothed power spectrum envelope sequence generation unit 37.

By the same processing as the first inverse stretch transform unit 210 of the encoding device of the third embodiment, the first inverse stretch transform unit 38 generates the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1), which corresponds to the linearly discretized sample point sequence, from the stretched non-smoothed power spectrum envelope sequence ~W_o(0), ~W_o(1), …, ~W_o(N-1), which corresponds to the nonlinearly discretized sample point sequence (step D8).

The generated non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) is output to the decoding unit 34.

<Stretched smoothed power spectrum envelope sequence generation unit 312>
The stretched smoothed power spectrum envelope sequence generation unit 312 receives the quantized stretched linear prediction coefficients ^β_1, ^β_2, …, ^β_p generated by the stretched linear prediction coefficient decoding unit 31.

Using the quantized stretched linear prediction coefficients ^β_1, ^β_2, …, ^β_p, the stretched smoothed power spectrum envelope sequence generation unit 312 generates, by the same processing as the stretched smoothed power spectrum envelope sequence generation unit 24 of the encoding device, the stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) expressed on the nonlinearly discretized sample point sequence (step D2).

The generated stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) is output to the second inverse stretch transform unit 39.

<Second inverse stretch transform unit 39>
The second inverse stretch transform unit 39 is the same as the inverse stretch transform unit 25 of the first and second embodiments and the second inverse stretch transform unit 28 of the third embodiment.

The second inverse stretch transform unit 39 receives the stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) generated by the stretched smoothed power spectrum envelope sequence generation unit 312.

In the same manner as the inverse stretch transform unit 25 of the first and second embodiments and the second inverse stretch transform unit 28 of the third embodiment, the second inverse stretch transform unit 39 converts the stretched smoothed power spectrum envelope sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1), which corresponds to the nonlinearly discretized sample point sequence, into the smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1), which corresponds to the linearly discretized sample point sequence (step D9).

The converted smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) is output to the decoding unit 34 and the envelope denormalization unit 35.

<Decoding unit 34>
The decoding unit 34 receives the code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) output by the encoding device, the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) converted by the first inverse stretch transform unit 38, and the smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) converted by the second inverse stretch transform unit 39.

The normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1) that the decoding unit 34 generates as described below is output to the envelope denormalization unit 35.

When a gain code and an integer signal code are input as the code corresponding to the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1), the decoding unit 34 first decodes the integer signal code by the decoding process that corresponds to the encoding process used by the encoding unit 27 of the encoding device to obtain the integer signal code, and thereby obtains a "gain-normalized normalized MDCT coefficient sequence". For example, when the encoding device used Rice coding, the decoding unit 34 decodes the integer signal code by the decoding process corresponding to Rice coding.

In doing so, the decoding unit 34 selects the Rice parameter by the same criterion as the encoding unit 27 of the encoding device of the third embodiment. That is, it performs Rice decoding using, as the Rice parameter of the quantized normalized coefficient X_Q(k), a value proportional to the logarithm of the estimated amplitude of the MDCT coefficient X_Q(k). As in the encoding unit 27 of the encoding device, this amplitude is estimated as the square root of the non-smoothed power spectrum envelope coefficient W_o(k) generated by the first inverse stretch transform unit 38 of the decoding device divided by the smoothed power spectrum envelope coefficient W_γ(k) generated by the second inverse stretch transform unit 39 of the decoding device.

Next, the decoding unit 34 multiplies the obtained "gain-normalized normalized MDCT coefficient sequence" by the gain specified by the gain code to generate the normalized MDCT coefficient sequence X_N(0), X_N(1), …, X_N(N-1).
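The two decoding steps above (Rice decoding with per-sample parameters, then multiplication by the gain) can be sketched as follows. The bit layout (unary quotient of ones closed by a zero, followed by the remainder bits), the zigzag mapping of signed integers, and the use of a text bit string are assumptions chosen for readability and are not taken from the embodiment. In a real decoder `params` would be derived from W_o(k) and W_γ(k) by the same rule as in the encoder.

```python
import numpy as np

def zigzag(v):
    """Map a signed integer to a non-negative one (assumed convention)."""
    return 2 * v if v >= 0 else -2 * v - 1

def rice_encode(values, params):
    """Encode integers with per-sample Rice parameters into a '0'/'1' string."""
    out = []
    for v, r in zip(values, params):
        u = zigzag(int(v))
        out.append("1" * (u >> r) + "0")          # unary quotient and terminator
        if r > 0:
            out.append(format(u & ((1 << r) - 1), f"0{r}b"))  # r remainder bits
    return "".join(out)

def rice_decode(bits, params):
    """Inverse of rice_encode; params must match the encoder's parameters."""
    pos, out = 0, []
    for r in params:
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                                   # skip the terminating zero
        rem = int(bits[pos:pos + r], 2) if r > 0 else 0
        pos += r
        u = (q << r) | rem
        out.append(u // 2 if u % 2 == 0 else -(u + 1) // 2)
    return out

def decode_normalized(bits, params, gain):
    """Rice-decode the integer signal code, then multiply by the decoded gain."""
    return gain * np.array(rice_decode(bits, params), dtype=float)
```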

The subsequent processing by the decoding device, that is, the processing of the envelope denormalization unit 35 and the time domain transform unit 36, is the same as in the first and second embodiments, so its description is omitted.

[Fourth embodiment]
(Encoding of the fourth embodiment)
FIG. 12 shows a configuration example of the encoding device of the fourth embodiment. As shown in FIG. 12, the encoding device includes, for example, a frequency domain transform unit 21, a stretched pseudo power spectrum sequence generation unit 22, a linear prediction analysis unit 23, a stretched non-smoothed power spectrum envelope sequence generation unit 29, a first inverse stretch transform unit 210, a higher-order linear prediction analysis unit 211, a smoothed power spectrum envelope sequence generation unit 212, an envelope normalization unit 26, and an encoding unit 27. FIG. 13 shows an example of each process of the encoding method of the fourth embodiment realized by this encoding device.

The encoding device of the second embodiment generated, as the stretched power spectrum envelope sequence, a stretched smoothed power spectrum envelope sequence corrected by a value g(k) corresponding to the degree of stretching of the nonlinearly discretized sample point sequence relative to the linearly discretized sample point sequence.

As an alternative to the correction of the second embodiment, the fourth embodiment first obtains the power spectrum envelope corresponding to the linearly discretized sample point sequence, then performs a higher-order linear prediction analysis only on a specific partial frequency region to obtain a finer power spectrum envelope, and smooths that envelope. This achieves an effect equivalent to that of the second embodiment.

The following description focuses on the parts that differ from the third embodiment; description of the parts that are the same as in the third embodiment is omitted.

The description below takes as an example the case of obtaining an effect equivalent to smoothing the stretched pseudo power spectrum envelope corresponding to a nonlinearly discretized sample point sequence whose resolution is higher in the low-frequency region than in the high-frequency region.

The processing of the frequency domain transform unit 21, the stretched pseudo power spectrum sequence generation unit 22, the linear prediction analysis unit 23, the stretched non-smoothed power spectrum envelope sequence generation unit 29, and the first inverse stretch transform unit 210 is the same as in the third embodiment, so its description is omitted.

<Higher-order linear prediction analysis unit 211>
The higher-order linear prediction analysis unit 211 receives the part of the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) converted by the first inverse stretch transform unit 210 that corresponds to the predetermined frequency region whose resolution is to be increased. When that region is the low-frequency region, the higher-order linear prediction analysis unit 211 receives the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(I_L), where I_L is an integer from 1 to N-2 inclusive.

Using the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(I_L), which is the low-frequency part of W_o(0), W_o(1), …, W_o(N-1), the higher-order linear prediction analysis unit 211 performs a higher-order linear prediction analysis, that is, a linear prediction analysis with a prediction order p' higher than the prediction order p of the linear prediction analysis unit 23, and generates the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' (step E11). Here p' is an integer greater than p, that is, p' > p.
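One possible way to carry out a linear prediction analysis on an envelope segment is sketched below. This section does not state which analysis method the unit uses, so the route shown here (autocorrelation by inverse Fourier transform of the power spectrum, followed by the Levinson-Durbin recursion) is an assumption. Further assumptions: the segment is treated as a one-sided power spectrum sampled uniformly from 0 to π, and the prediction polynomial is written as A(z) = 1 + Σ a_n z^{-n}. The function names are hypothetical.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + sum_n a_n z^-n from autocorrelation r."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # prediction error energy
    return a[1:], err

def lpc_from_power_envelope(w_seg, order):
    """Linear prediction coefficients of the given order from a power envelope segment."""
    # irfft treats w_seg as the non-negative-frequency half of a symmetric spectrum,
    # so the result is the corresponding (real) autocorrelation sequence.
    r = np.fft.irfft(np.asarray(w_seg, dtype=float))[: order + 1]
    return levinson_durbin(r, order)
```

Under these assumptions, calling `lpc_from_power_envelope(w_o[: i_l + 1], p_prime)` with a hypothetical order `p_prime` greater than p would play the role of step E11.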

The generated linear prediction coefficients ~α_1, ~α_2, …, ~α_p' are output to the smoothed power spectrum envelope sequence generation unit 212.

Note that the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' are used only for smoothing the spectral envelope in the low-frequency region. They need not be quantized as parameters or output to the decoding device.

Because this example assumes a nonlinear stretch transform that raises the resolution of the low-frequency region, the higher-order linear prediction analysis unit 211 performs the higher-order linear prediction analysis using the low-frequency part W_o(0), W_o(1), …, W_o(I_L) of the non-smoothed power spectrum envelope sequence corresponding to the linearly discretized sample point sequence.

To obtain an effect equivalent to a nonlinear stretch transform that raises the resolution of a middle frequency region, the higher-order linear prediction analysis unit 211 may instead perform the higher-order linear prediction analysis using the mid-frequency part W_o(I_ML), W_o(I_ML+1), …, W_o(I_MH-1), W_o(I_MH) (1 < I_ML < I_MH < N-2) of the non-smoothed power spectrum envelope sequence corresponding to the linearly discretized sample point sequence.

In short, the higher-order linear prediction analysis unit 211 performs the higher-order linear prediction analysis using the part of the non-smoothed power spectrum envelope sequence that corresponds to the predetermined frequency region whose resolution is to be increased.

<Smoothed power spectrum envelope sequence generation unit 212>
The smoothed power spectrum envelope sequence generation unit 212 receives the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' generated by the higher-order linear prediction analysis unit 211 and the part of the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) converted by the first inverse stretch transform unit 210 that corresponds to the frequency region other than the predetermined frequency region whose resolution is to be increased. In this example, that part is W_o(I_L+1), W_o(I_L+2), …, W_o(N-1).

The smoothed power spectrum envelope sequence generation unit 212 generates the smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) as follows: for each sample point k inside the predetermined frequency region whose resolution is to be increased, it calculates W_γ(k) using the linear prediction coefficients ~α_1, ~α_2, …, ~α_p'; for each sample point k outside that region, it calculates W_γ(k) as a smoothed value of W_o(k) (step E12).

The generated smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) is output to the envelope normalization unit 26 and the encoding unit 27.

Specifically, for each sample point k inside the predetermined frequency region whose resolution is to be increased, the smoothed power spectrum envelope sequence generation unit 212 calculates the smoothed power spectrum envelope W_γ(k) from the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' by the following equation. An example of the sample points k inside that region is k = 0, 1, …, I_L.

Also specifically, the smoothed power spectrum envelope sequence generation unit 212 generates a smoothed power spectrum envelope sequence by smoothing, with the correction coefficient γ in the same way as the conventional smoothing method, the non-smoothed power spectrum envelope sequence corresponding to the frequency region other than the predetermined frequency region whose resolution is to be increased.

An example of the non-smoothed power spectrum envelope sequence corresponding to that other frequency region is W_o(I_L+1), W_o(I_L+2), …, W_o(N-1). In this case, the generated smoothed power spectrum envelope sequence is W_γ(I_L+1), W_γ(I_L+2), …, W_γ(N-1).
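The way the two bands are combined in step E12 can be sketched as follows. The equation used inside the high-resolution band is not reproduced in this text, so the sketch substitutes a generic all-pole envelope of the form 1/|1 + Σ a_n γ^n e^{-jωn}|², which is an assumption, as are the frequency grid of the low band, the default value of `gamma`, and the function names. The part outside the band is assumed to have been smoothed elsewhere and is passed in as `w_gamma_rest`.

```python
import numpy as np

def lpc_power_envelope(coeffs, gamma, n_points):
    """Generic all-pole envelope 1/|1 + sum_n a_n gamma^n e^{-j w n}|^2 (assumed form)."""
    coeffs = np.asarray(coeffs, dtype=float)
    n = np.arange(1, len(coeffs) + 1)
    omega = np.pi * np.arange(n_points) / n_points   # assumed grid on [0, pi)
    a_z = 1.0 + np.exp(-1j * np.outer(omega, n)) @ (coeffs * gamma ** n)
    return 1.0 / np.abs(a_z) ** 2

def stitch_smoothed_envelope(alpha_hi, w_gamma_rest, i_l, gamma=0.92):
    """W_gamma(0..I_L) from the higher-order coefficients, W_gamma(I_L+1..N-1) as given."""
    low = lpc_power_envelope(alpha_hi, gamma, i_l + 1)
    return np.concatenate([low, np.asarray(w_gamma_rest, dtype=float)])
```

The point of the sketch is the index split: samples k = 0, …, I_L come from the higher-order coefficients, and samples k = I_L+1, …, N-1 come from the separately smoothed envelope.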

Note that the smoothed power spectrum envelope sequence corresponding to the frequency region other than the predetermined frequency region whose resolution is to be increased may instead be generated by the same processing as the stretched smoothed power spectrum envelope sequence generation unit 213 and the second inverse stretch transform unit 28 of the third embodiment.

That is, first, the stretched smoothed power spectrum coefficient sequence ~W_γ(0), ~W_γ(1), …, ~W_γ(N-1) is generated from the quantized linear prediction coefficients ^β_1, ^β_2, …, ^β_p by equation (3'). This stretched smoothed power spectrum coefficient sequence is then converted into the smoothed power spectrum envelope sequence corresponding to the linearly discretized sample point sequence. The part of the converted sequence that corresponds to the frequency region other than the predetermined frequency region whose resolution is to be increased serves as the smoothed power spectrum envelope sequence for that other region.

The encoding device of the fourth embodiment may include a stretched smoothed power spectrum envelope sequence generation unit 213 and a second inverse stretch transform unit 28 for performing this processing.

The subsequent processing by the encoding device, that is, the processing of the envelope normalization unit 26 and the encoding unit 27, is the same as in the third embodiment, so its description is omitted.

(Decoding of the fourth embodiment)
FIG. 14 shows a configuration example of the decoding device of the fourth embodiment. As shown in FIG. 14, the decoding device includes, for example, a stretched linear prediction coefficient decoding unit 31, a stretched non-smoothed power spectrum envelope sequence generation unit 37, a first inverse stretch transform unit 38, a higher-order linear prediction analysis unit 310, a smoothed power spectrum envelope sequence generation unit 311, a decoding unit 34, an envelope denormalization unit 35, and a time domain transform unit 36. FIG. 15 shows an example of each process of the decoding method of the fourth embodiment realized by this decoding device.

The following description focuses on the parts that differ from the decoding device of the third embodiment; description of the parts that are the same as in the third embodiment is omitted.

The processing of the stretched linear prediction coefficient decoding unit 31, the stretched non-smoothed power spectrum envelope sequence generation unit 37, the first inverse stretch transform unit 38, the decoding unit 34, the envelope denormalization unit 35, and the time domain transform unit 36 is the same as in the third embodiment.

<Higher-order linear prediction analysis unit 310>
The higher-order linear prediction analysis unit 310 receives the part of the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) converted by the first inverse stretch transform unit 38 that corresponds to the predetermined frequency region whose resolution is to be increased.

By the same processing as the higher-order linear prediction analysis unit 211 of the encoding device, the higher-order linear prediction analysis unit 310 uses the non-smoothed power spectrum envelope sequence corresponding to the predetermined frequency region whose resolution is to be increased to perform a higher-order linear prediction analysis, that is, a linear prediction analysis with a prediction order p' higher than the prediction order p of the linear prediction analysis unit 23, and generates the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' (step D10).

The generated linear prediction coefficients ~α_1, ~α_2, …, ~α_p' are output to the smoothed power spectrum envelope sequence generation unit 311.

When the predetermined frequency region whose resolution is to be increased is the low-frequency region, the non-smoothed power spectrum envelope sequence corresponding to that region is, for example, W_o(0), W_o(1), …, W_o(I_L).

<Smoothed power spectrum envelope sequence generation unit 311>
The smoothed power spectrum envelope sequence generation unit 311 receives the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' generated by the higher-order linear prediction analysis unit 310 and the part of the non-smoothed power spectrum envelope sequence W_o(0), W_o(1), …, W_o(N-1) converted by the first inverse stretch transform unit 38 that corresponds to the frequency region other than the predetermined frequency region whose resolution is to be increased.

In the same manner as the smoothed power spectrum envelope sequence generation unit 212 of the encoding device, the smoothed power spectrum envelope sequence generation unit 311 generates the smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) using the linear prediction coefficients ~α_1, ~α_2, …, ~α_p' and the non-smoothed power spectrum envelope sequence corresponding to the frequency region other than the predetermined frequency region whose resolution is to be increased (step D11).

The generated smoothed power spectrum envelope sequence W_γ(0), W_γ(1), …, W_γ(N-1) is output to the decoding unit 34 and the envelope denormalization unit 35.

The subsequent processing by the decoding device, that is, the processing of the decoding unit 34, the envelope denormalization unit 35, and the time domain transform unit 36, is the same as in the third embodiment, so its description is omitted.

[Sample sequence generation apparatus and method]
The expansion/contraction pseudo power spectrum sequence generation unit 22 described above can be regarded as a sample sequence generation apparatus including a sample sequence generation unit that, from a frequency-domain sample sequence ξ(0), ξ(1), …, ξ(N-1) derived from a sound signal, generates a sample sequence ~ξ(0), ~ξ(1), …, ~ξ(M-1) defined by the following formula using a predetermined M×N non-negative matrix U. In other words, the expansion/contraction pseudo power spectrum sequence generation step (step E2) performed by the expansion/contraction pseudo power spectrum sequence generation unit 22 can be regarded as a sample sequence generation method including a sample sequence generation step that, from a frequency-domain sample sequence ξ(0), ξ(1), …, ξ(N-1) derived from a sound signal, generates a sample sequence ~ξ(0), ~ξ(1), …, ~ξ(M-1) defined by the following formula using a predetermined M×N non-negative matrix U. For example, N = M.
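The formula referred to above is not reproduced in this text. As a minimal sketch, assuming that the generation step is a plain matrix-vector product in which each output sample ~ξ(i) is the sum over k of U[i,k]·ξ(k), the step could look as follows. The function name `generate_sample_sequence` and the 3×4 example matrix are hypothetical and chosen only for illustration.

```python
def generate_sample_sequence(U, xi):
    """Apply an M x N non-negative matrix U to a length-N sequence xi.

    Assumption: the output is the matrix-vector product, i.e.
    xi_tilde[i] = sum_k U[i][k] * xi[k].
    """
    n = len(xi)
    if any(len(row) != n for row in U):
        raise ValueError("every row of U must have N entries")
    if any(u < 0 for row in U for u in row):
        raise ValueError("U must be non-negative")
    return [sum(u * x for u, x in zip(row, xi)) for row in U]


# Hypothetical example with M = 3 and N = 4.
U = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
xi = [4.0, 2.0, 6.0, 8.0]
xi_tilde = generate_sample_sequence(U, xi)
print(xi_tilde)
```

Each row of the example matrix has at most two non-zero entries, so the middle output sample is simply the average of two neighbouring input samples.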

Here, the non-negative matrix U corresponds to the conversion matrix U described in the section <Expansion/contraction pseudo power spectrum sequence generation unit 22>. As noted there, the non-negative matrix U is in many cases a sparse matrix, so U may be taken to be a sparse matrix here. The conditions on the sparse matrix U are the same as the conditions on the conversion matrix U described in that section, so their description is omitted here.

The sample sequence ξ(0), ξ(1), …, ξ(N-1) may be any frequency-domain sample sequence derived from a sound signal. For example, the sample sequence ξ(0), ξ(1), …, ξ(N-1) may be [1] the MDCT coefficient sequence X(0), X(1), …, X(N-1), [2] the sequence of absolute values of the MDCT coefficients |X(0)|, |X(1)|, …, |X(N-1)|, or [3] the sequence X(0)^q, X(1)^q, …, X(N-1)^q of the q-th powers of the absolute values of the MDCT coefficients. Further, letting X'(i) (i = 0, 1, …, N-1) denote a frequency-domain feature other than the MDCT coefficient that replaces the MDCT coefficient X(i) (i = 0, 1, …, N-1), the sample sequence ξ(0), ξ(1), …, ξ(N-1) may be [4] X'(0), X'(1), …, X'(N-1), [5] |X'(0)|, |X'(1)|, …, |X'(N-1)|, or [6] X'(0)^q, X'(1)^q, …, X'(N-1)^q. The sample sequence ξ(0), ξ(1), …, ξ(N-1) may also be [7] the power spectrum sequence of the sound signal or [8] the power spectrum envelope sequence of the sound signal. Furthermore, the sample sequence ξ(0), ξ(1), …, ξ(N-1) may be [9] corrected values of [1] to [7], in other words, a sequence obtained by smoothing any of [1] to [7]. Here, smoothing means processing that reduces the unevenness in the magnitude (amplitude) of the values of a sequence while maintaining the magnitude relationship among those values.
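The candidate sequences [2] and [3] and the notion of smoothing used for [9] can be illustrated with a short sketch. The text does not specify a particular smoothing operation; the power-law compression used in `compress` below is only an assumed example of an operation that keeps the ordering of a non-negative sequence while reducing its amplitude variation. All names and sample values are hypothetical.

```python
def compress(values, exponent=0.5):
    """Assumed smoothing: raise each non-negative value to a power below 1.

    This is monotonic, so the ordering of the values is kept, while the
    spread between large and small values shrinks.
    """
    return [v ** exponent for v in values]


def same_ordering(a, b):
    """True if every pairwise magnitude relationship in a also holds in b."""
    n = len(a)
    return all((a[i] <= a[j]) == (b[i] <= b[j])
               for i in range(n) for j in range(n))


X = [3.0, -4.0, 0.5, -1.0]           # hypothetical MDCT coefficients, as in [1]
abs_X = [abs(x) for x in X]          # absolute values, as in [2]
q = 2
pow_X = [a ** q for a in abs_X]      # q-th powers of absolute values, as in [3]
smoothed = compress(pow_X)           # one possible smoothed sequence, as in [9]

spread_before = max(pow_X) / min(pow_X)
spread_after = max(smoothed) / min(smoothed)
print(same_ordering(pow_X, smoothed), spread_before, spread_after)
```

In this sketch the ratio between the largest and smallest value drops after compression while every pairwise ordering is preserved, which is the property the definition of smoothing above asks for.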

When the sample sequence consists only of non-negative values, as in [2] to [9], the approximation accuracy of the second sample sequence described later improves.

Similarly, the sample sequence ~ξ(0), ~ξ(1), …, ~ξ(M-1) may be any frequency-domain sample sequence derived from a sound signal. Examples of such frequency-domain sample sequences are the same as those given above.

Likewise, the inverse expansion/contraction conversion unit 25, the inverse expansion/contraction conversion unit 33, the first inverse expansion/contraction conversion unit 210, the second inverse expansion/contraction conversion unit 28, the first inverse expansion/contraction conversion unit 38, and the second inverse expansion/contraction conversion unit 39 described above can each be regarded as a sample sequence generation apparatus including a sample sequence generation unit that, from a frequency-domain sample sequence ~η(0), ~η(1), …, ~η(M-1) derived from a sound signal, generates a sample sequence η(0), η(1), …, η(N-1) defined by the following formula using a predetermined N×M non-negative matrix V. In other words, the inverse expansion/contraction conversion step performed by the inverse expansion/contraction conversion unit 25 (step E5), by the inverse expansion/contraction conversion unit 33 (step D3), by the first inverse expansion/contraction conversion unit 210 (step E10), by the second inverse expansion/contraction conversion unit 28 (step E8), by the first inverse expansion/contraction conversion unit 38 (step D8), and by the second inverse expansion/contraction conversion unit 39 (step D9) can each be regarded as a sample sequence generation method including a sample sequence generation step that, from a frequency-domain sample sequence ~η(0), ~η(1), …, ~η(M-1) derived from a sound signal, generates a sample sequence η(0), η(1), …, η(N-1) defined by the following formula using a predetermined N×M non-negative matrix V. For example, N = M.

Here, the non-negative matrix V corresponds to the conversion matrix V described in the section <Inverse expansion/contraction conversion unit 25>. As noted there, the non-negative matrix V is in many cases a sparse matrix, so V may be taken to be a sparse matrix here. The conditions on the sparse matrix V are the same as the conditions on the conversion matrix V described in that section, so their description is omitted here.

The sample sequence ~η(0), ~η(1), …, ~η(M-1) and the sample sequence η(0), η(1), …, η(N-1) may each be any frequency-domain sample sequence derived from a sound signal. Examples of such frequency-domain sample sequences are the same as those given above.

In signal processing such as the encoding and decoding of a sound signal, it is sometimes necessary to convert a sample sequence expressed at one frequency-domain resolution (also called the first sample sequence) into a sample sequence expressed at a different resolution (also called the second sample sequence).

One example of the first sample sequence is a sample sequence expressed on a linear discrete sample point sequence, and the corresponding example of the second sample sequence is a sample sequence expressed on a nonlinear discrete sample point sequence.

In this example, if all elements of U are non-zero, the computation using U approximates the second sample sequence, which is expressed on the nonlinear discrete sample point sequence, more accurately, but the amount of computation increases. If U is instead a sparse matrix, the second sample sequence can be approximated with a small amount of computation while reasonable approximation accuracy is maintained, even though the second sample sequence is inherently nonlinear.
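The trade-off described above can be sketched with a sparse U whose rows hold only two non-zero weights each. The sketch assumes linear interpolation between the two linear sample points nearest to each nonlinear sample point, and a quadratic warping function that places the nonlinear points more densely at low frequencies. Both choices, and the names `warped_positions`, `build_sparse_u`, and `apply_sparse`, are hypothetical illustrations and are not taken from the text.

```python
def warped_positions(n):
    """Hypothetical warping: positions of n nonlinear sample points,
    measured in units of linear bins, denser at low frequencies."""
    return [(n - 1) * (i / (n - 1)) ** 2 for i in range(n)]


def build_sparse_u(n):
    """Build U as a list of rows; each row stores (column, weight) pairs.

    Row i corresponds to the i-th nonlinear sample point and has non-zero,
    non-negative weights only at the nearest linear sample points.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    rows = []
    for g in warped_positions(n):
        lo = min(int(g), n - 2)          # linear bin just below g
        frac = g - lo                    # distance from that bin, in [0, 1]
        pairs = ((lo, 1.0 - frac), (lo + 1, frac))
        rows.append([(k, w) for k, w in pairs if w > 0.0])
    return rows


def apply_sparse(rows, xi):
    """Multiply the sparse matrix by xi, touching only non-zero entries."""
    return [sum(w * xi[k] for k, w in row) for row in rows]


N = 8
U_rows = build_sparse_u(N)
ramp = [float(k) for k in range(N)]      # xi(k) = k on the linear grid
resampled = apply_sparse(U_rows, ramp)
nonzeros = sum(len(row) for row in U_rows)
print(nonzeros, N * N)
```

With this layout the product needs at most 2N multiplications instead of the N² required by a fully populated matrix, at the cost of approximating each nonlinear sample from only its two nearest neighbours.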

Another example of the first sample sequence is a sample sequence expressed on a nonlinear discrete sample point sequence, and the corresponding example of the second sample sequence is a sample sequence expressed on a linear discrete sample point sequence.

In this other example, if all elements of V are non-zero, the computation using V approximates the second sample sequence, which is expressed on the linear discrete sample point sequence, more accurately, but the amount of computation increases. If V is instead a sparse matrix, the second sample sequence can be approximated with a small amount of computation while reasonable approximation accuracy is maintained, even though the first sample sequence is inherently nonlinear.
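The reverse direction can be sketched in the same way: a sparse V maps samples given on nonlinear sample points back onto the linear grid. The sketch assumes linear interpolation between the two nonlinear sample points that bracket each linear sample point, with weights clamped so that they stay non-negative. The function names and the example positions are hypothetical.

```python
from bisect import bisect_right


def build_sparse_v(positions, n):
    """Build V as rows of (column, weight) pairs.

    positions: strictly increasing frequencies of the nonlinear sample
    points, in units of linear bins. Row k corresponds to linear bin k.
    """
    if len(positions) < 2:
        raise ValueError("at least two nonlinear sample points are needed")
    last = len(positions) - 1
    rows = []
    for k in range(n):
        # Index of the nonlinear point at or just below linear bin k.
        j = min(max(bisect_right(positions, k) - 1, 0), last - 1)
        span = positions[j + 1] - positions[j]
        frac = min(max((k - positions[j]) / span, 0.0), 1.0)
        pairs = ((j, 1.0 - frac), (j + 1, frac))
        rows.append([(c, w) for c, w in pairs if w > 0.0])
    return rows


def apply_sparse(rows, values):
    """Multiply the sparse matrix by values, touching only non-zero entries."""
    return [sum(w * values[c] for c, w in row) for row in rows]


# Hypothetical nonlinear sample points and samples of f(x) = 2x + 1 on them.
positions = [0.0, 0.5, 2.0, 3.0]
eta_tilde = [2.0 * p + 1.0 for p in positions]
V_rows = build_sparse_v(positions, 4)
eta = apply_sparse(V_rows, eta_tilde)
print(eta)
```

Because the example samples lie on a straight line, the interpolated values on the linear grid reproduce that line up to rounding; for general spectra the result is only an approximation whose quality depends on how many neighbouring points each row of V uses.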

Note that the computation using the sparse matrices U and V in the sample sequence generation apparatus and method can also be used in signal processing other than the encoding and decoding of sound signals.

[Modifications, etc.]
The processes described for the sample sequence generation method, the encoding method, the decoding method, and the corresponding apparatuses need not be executed in time series in the order described; they may also be executed in parallel or individually, depending on the processing capability of the apparatus that executes them or as needed.

When the steps of the sample sequence generation method are implemented by a computer, the processing content of each step is described by a program. Executing this program on the computer implements each step of the sample sequence generation method on the computer.

Similarly, when the steps of the encoding method are implemented by a computer, the processing content of each step is described by a program. Executing this program on the computer implements each step of the encoding method on the computer.

Similarly, when the steps of the decoding method are implemented by a computer, the processing content of each step is described by a program. Executing this program on the computer implements each step of the decoding method on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The steps of the encoding method and the decoding method may be implemented by executing a predetermined program on a computer, or at least part of their processing content may be implemented in hardware.

It goes without saying that other modifications may be made as appropriate without departing from the spirit of the present invention.

DESCRIPTION OF SYMBOLS
11 Frequency domain conversion unit
12 Linear prediction analysis unit
13 Power spectrum envelope sequence generation unit
14 Envelope normalization unit
15 Encoding unit
21 Frequency domain conversion unit
22 Expansion/contraction pseudo power spectrum sequence generation unit
23 Linear prediction analysis unit
24 Expansion/contraction power spectrum envelope sequence generation unit
25 Inverse expansion/contraction conversion unit
26 Envelope normalization unit
27 Encoding unit
28 Second inverse expansion/contraction conversion unit
29 Expansion/contraction non-smoothed power spectrum envelope sequence generation unit
210 First inverse expansion/contraction conversion unit
211 High-order linear prediction analysis unit
212 Smoothed power spectrum envelope sequence generation unit
31 Expansion/contraction linear prediction coefficient decoding unit
32 Expansion/contraction power spectrum envelope sequence generation unit
33 Inverse expansion/contraction conversion unit
34 Decoding unit
35 Envelope denormalization unit
36 Time domain conversion unit
37 Expansion/contraction non-smoothed power spectrum envelope sequence generation unit
38 First inverse expansion/contraction conversion unit
39 Second inverse expansion/contraction conversion unit
310 High-order linear prediction analysis unit
311 Smoothed power spectrum envelope sequence generation unit

Claims

A sample sequence generation method comprising a sample sequence generation step of generating, from a frequency-domain sample sequence ξ(0), ξ(1), …, ξ(N-1) derived from a sound signal in a predetermined time interval, a sequence ~ξ(0), ~ξ(1), …, ~ξ(N-1) defined by the following equation using a predetermined N×N sparse matrix U,

wherein the sparse matrix U is a matrix that has non-zero values only in components near the diagonal and in which the number of non-zero elements in the upper triangular part is smaller than the number of non-zero elements in the lower triangular part, and
the sample point sequence corresponding to ξ(0), ξ(1), …, ξ(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

The sample sequence generation method according to claim 1, wherein
the sample point sequence corresponding to ~ξ(0), ~ξ(1), …, ~ξ(N-1) is a nonlinearly discretized sample point sequence in which the frequency intervals between adjacent sample points are unequal,
the sample points of the linearly discretized sample point sequence are referred to as linearly discretized sample points, and the sample points of the nonlinearly discretized sample point sequence are referred to as nonlinearly discretized sample points,
each row of the sparse matrix U corresponds to a nonlinearly discretized sample point of the nonlinearly discretized sample point sequence, and each column of the sparse matrix U corresponds to a linearly discretized sample point of the linearly discretized sample point sequence, and
the sparse matrix U is a sparse matrix in which each row has non-zero values only in elements near the column corresponding to the linearly discretized sample point whose frequency is closest to the frequency of the nonlinearly discretized sample point corresponding to that row, and all other elements are 0.

Including a sample sequence generation step of generating, from a frequency-domain sample sequence ξ(0), ξ(1), …, ξ(N-1) derived from a sound signal for each predetermined time interval, a sequence ~ξ(0), ~ξ(1), …, ~ξ(M-1) defined by the following equation using a predetermined M×N non-negative matrix U,

The non-negative matrix U has a component of the i-th row and k-th column as U [i, k], and for each i = 1, 2,.

where g₁ < g₂ < … < g_N is satisfied, and
the sample point sequence corresponding to ξ(0), ξ(1), …, ξ(N-1) or to ~ξ(0), ~ξ(1), …, ~ξ(M-1) is a sample point sequence equally spaced in the frequency direction.
Sample sequence generation method.

The sample sequence generation method according to claim 1 or 2, wherein
all elements of the sparse matrix U are non-negative.

A sample sequence generation method comprising a sample sequence generation step of generating, from a frequency-domain sample sequence ~η(0), ~η(1), …, ~η(N-1) derived from a sound signal in a predetermined time interval, a sequence η(0), η(1), …, η(N-1) defined by the following equation using a predetermined N×N sparse matrix V,

wherein the sparse matrix V is a matrix that has non-zero values only in components near the diagonal and in which the number of non-zero elements in the upper triangular part is larger than the number of non-zero elements in the lower triangular part, and
the sample point sequence corresponding to η(0), η(1), …, η(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

The sample sequence generation method according to claim 5, wherein
the sample point sequence corresponding to ~η(0), ~η(1), …, ~η(N-1) is a nonlinearly discretized sample point sequence in which the frequency intervals between adjacent sample points are unequal,
the sample points of the linearly discretized sample point sequence are referred to as linearly discretized sample points, and the sample points of the nonlinearly discretized sample point sequence are referred to as nonlinearly discretized sample points,
each column of the sparse matrix V corresponds to a nonlinearly discretized sample point of the nonlinearly discretized sample point sequence, and each row of the sparse matrix V corresponds to a linearly discretized sample point of the linearly discretized sample point sequence, and
the sparse matrix V is a sparse matrix in which each row has non-zero values only in elements near the column corresponding to the nonlinearly discretized sample point whose frequency is closest to the frequency of the linearly discretized sample point corresponding to that row, and all other elements are 0.

The sample sequence generation method according to claim 5 or 6, wherein
all elements of the sparse matrix V are non-negative.

An encoding method comprising:
an expansion/contraction pseudo power spectrum sequence generation step of, by the sample sequence generation method according to claim 1 or 2, taking as ξ(0), ξ(1), …, ξ(N-1) a sequence corresponding to the power of a sample sequence obtained by converting the sound signal in the predetermined time interval into the frequency domain, and generating ~ξ(0), ~ξ(1), …, ~ξ(N-1) as an expansion/contraction pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1);
a linear prediction analysis step of, with p being a positive integer, performing linear prediction on a sequence ~X(0), ~X(1), …, ~X(N-1) obtained by converting the expansion/contraction pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) into the time domain, to generate quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an expansion/contraction power spectrum envelope sequence generation step of generating an expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), which is a frequency-domain sequence corresponding to the quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an inverse expansion/contraction conversion step of, by the sample sequence generation method according to any one of claims 5 to 7, taking the expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) as ~η(0), ~η(1), …, ~η(N-1), and generating η(0), η(1), …, η(N-1) as a power spectrum envelope sequence W(0), W(1), …, W(N-1);
an envelope normalization step of generating a normalized frequency-domain sample sequence by normalizing the frequency-domain sample sequence X(0), X(1), …, X(N-1) using the power spectrum envelope sequence W(0), W(1), …, W(N-1); and
an encoding step of encoding the normalized frequency-domain sample sequence to generate a code corresponding to the normalized frequency-domain sample sequence.

The encoding method according to claim 8, wherein,
in the sample point sequence corresponding to the expansion/contraction power spectrum envelope sequence, the frequency interval between adjacent sample points in the low-frequency region is narrower than the frequency interval between adjacent sample points in the high-frequency region.

A decoding method comprising:
an expansion/contraction linear prediction coefficient decoding step of, with p being a positive integer, decoding an input expansion/contraction linear prediction coefficient code to generate quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an expansion/contraction power spectrum envelope sequence generation step of generating an expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), which is a frequency-domain sequence corresponding to the quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an inverse expansion/contraction conversion step of, by the sample sequence generation method according to any one of claims 5 to 7, taking the expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) as ~η(0), ~η(1), …, ~η(N-1), and generating η(0), η(1), …, η(N-1) as a power spectrum envelope sequence W(0), W(1), …, W(N-1);
a decoding step of decoding an input code corresponding to a normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1) to generate the normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1); and
an envelope denormalization step of generating a frequency-domain sample sequence X(0), X(1), …, X(N-1) by denormalizing the normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1) using the power spectrum envelope sequence W(0), W(1), …, W(N-1).

A sample sequence generation device comprising a sample sequence generation unit that generates, from a frequency-domain sample sequence ξ(0), ξ(1), …, ξ(N-1) derived from a sound signal for each predetermined time interval, a sequence ~ξ(0), ~ξ(1), …, ~ξ(N-1) defined by the following equation using a predetermined N×N sparse matrix U,

wherein the sparse matrix U is a matrix that has non-zero values only in components near the diagonal and in which the number of non-zero elements in the upper triangular part is smaller than the number of non-zero elements in the lower triangular part, and
the sample point sequence corresponding to ξ(0), ξ(1), …, ξ(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

A sample sequence generation device comprising a sample sequence generation unit that generates, from a frequency-domain sample sequence ~η(0), ~η(1), …, ~η(N-1) derived from a sound signal for each predetermined time interval, a sequence η(0), η(1), …, η(N-1) defined by the following equation using a predetermined N×N sparse matrix V,

wherein the sparse matrix V is a matrix that has non-zero values only in components near the diagonal and in which the number of non-zero elements in the upper triangular part is larger than the number of non-zero elements in the lower triangular part, and
the sample point sequence corresponding to η(0), η(1), …, η(N-1) is a linearly discretized sample point sequence in which the frequency intervals between adjacent sample points are equal.

An encoding device comprising:
an expansion/contraction pseudo power spectrum sequence generation unit, which is the sample sequence generation device of claim 11, that takes as ξ(0), ξ(1), …, ξ(N-1) a sequence corresponding to the power of a sample sequence obtained by converting the sound signal for each predetermined time interval into the frequency domain, and generates ~ξ(0), ~ξ(1), …, ~ξ(N-1) as an expansion/contraction pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1);
a linear prediction analysis unit that, with p being a positive integer, performs linear prediction on a sequence ~X(0), ~X(1), …, ~X(N-1) obtained by converting the expansion/contraction pseudo power spectrum sequence ~Y(0), ~Y(1), …, ~Y(N-1) into the time domain, to generate quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an expansion/contraction power spectrum envelope generation unit that generates an expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), which is a frequency-domain sequence corresponding to the quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an inverse expansion/contraction conversion unit, which is a sample sequence generation device of …, that takes the expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) as ~η(0), ~η(1), …, ~η(N-1), and generates η(0), η(1), …, η(N-1) as a power spectrum envelope sequence W(0), W(1), …, W(N-1);
an envelope normalization unit that generates a normalized frequency-domain sample sequence by normalizing the frequency-domain sample sequence X(0), X(1), …, X(N-1) using the power spectrum envelope sequence W(0), W(1), …, W(N-1); and
an encoding unit that encodes the normalized frequency-domain sample sequence to generate a code corresponding to the normalized frequency-domain sample sequence.

A decoding device comprising:
an expansion/contraction linear prediction coefficient decoding unit that, with p being a positive integer, decodes an input expansion/contraction linear prediction coefficient code to generate quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an expansion/contraction power spectrum envelope generation unit that generates an expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1), which is a frequency-domain sequence corresponding to the quantized expansion/contraction linear prediction coefficients ^β₁, ^β₂, …, ^β_p;
an inverse expansion/contraction conversion unit, which is a sample sequence generation device of …, that takes the expansion/contraction power spectrum envelope sequence ~W(0), ~W(1), …, ~W(N-1) as ~η(0), ~η(1), …, ~η(N-1), and generates η(0), η(1), …, η(N-1) as a power spectrum envelope sequence W(0), W(1), …, W(N-1);
a decoding unit that decodes an input code corresponding to a normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1) to generate the normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1); and
an envelope denormalization unit that generates a frequency-domain sample sequence X(0), X(1), …, X(N-1) by denormalizing the normalized frequency-domain sample sequence X_N(0), X_N(1), …, X_N(N-1) using the power spectrum envelope sequence W(0), W(1), …, W(N-1).

A program for causing a computer to execute each step of the sample sequence generation method according to any one of claims 1 to 7.