JPH08185199A - Voice coding device

Info

Publication number: JPH08185199A
Application number: JP7000300A
Authority: JP
Inventors: Kazunori Ozawa; 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1995-01-05
Filing date: 1995-01-05
Publication date: 1996-07-16
Anticipated expiration: 2015-01-31
Also published as: JP3003531B2

Abstract

PURPOSE: To provide a voice coding device that achieves good sound quality even at a low bit rate. CONSTITUTION: The voice coding device comprises a frame dividing section 110 that divides a speech signal into frames of a predetermined length, a spectrum parameter calculating section 200 that obtains spectrum parameters from the speech signal, an adaptive codebook section 500 that cuts out the past excitation signal at a given delay and performs pitch prediction, and an excitation quantizing section 350. The adaptive codebook section predicts the delay in the adaptive codebook from past quantized difference values and quantizes the resulting prediction difference.

Description

Detailed Description of the Invention

[0001] [Field of Industrial Application] The present invention relates to a voice coding device for encoding a speech signal with high quality at a low bit rate.

[0002] [Prior Art] A known method for encoding speech signals with high efficiency is CELP (Code Excited Linear Predictive Coding), described, for example, in M. Schroeder and B. Atal, "Code-excited linear prediction: High quality speech at very low bit rates" (Proc. ICASSP, pp. 937-940, 1985) (Reference 1) and in Kleijn et al., "Improved speech quality and efficient vector quantization in SELP" (Proc. ICASSP, pp. 155-158, 1988) (Reference 2). In this conventional method, the transmitting side extracts, for each frame (for example, 20 ms), spectrum parameters representing the spectral characteristics of the speech signal by linear prediction (LPC) analysis. Each frame is further divided into subframes (for example, 5 ms). For each subframe, the adaptive codebook parameters (a delay parameter corresponding to the pitch period, and a gain parameter) are extracted based on the past excitation signal, and the speech signal of the subframe is pitch-predicted using the adaptive codebook. For the excitation signal obtained by pitch prediction, the optimum excitation code vector is selected from an excitation codebook (a vector quantization codebook) consisting of predetermined kinds of noise signals, and the optimum gain is calculated, thereby quantizing the excitation signal. The excitation code vector is selected so as to minimize the error power between the signal synthesized from the selected code vector and the residual signal. The index indicating the selected code vector, the gain, the spectrum parameters and the adaptive codebook parameters are then combined by a multiplexer and transmitted. A description of the receiving side is omitted.

[0003] [Problems to be Solved by the Invention] In the conventional method, the delay parameter of the adaptive codebook is determined for each subframe and transmitted independently. For speech, for example, the delay lies in the range of 16 to 140 samples, but to obtain sufficient accuracy for female voices and other speech with a short pitch period, the delay must be expressed in fractional-sample steps rather than integer-sample steps. Consequently, at least 8 bits per subframe are needed to represent the delay, and if one frame contains four subframes, 32 bits per frame are needed. With a frame length of 40 ms, this amounted to a transmission rate of 1.6 kb/s.
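The bit budget described above can be checked with a little arithmetic. The sketch below is illustrative only: the half-sample step and the function names are assumptions, not taken from the patent. It shows that the 125 integer delays from 16 to 140 fit in 7 bits while a half-sample grid over the same range needs 8, and that 8 bits spent every 5 ms (the subframe length of the conventional example) corresponds to 1.6 kb/s.

```python
import math


def delay_bits(min_delay: float, max_delay: float, step: float) -> int:
    """Bits needed to index every delay value on a uniform grid."""
    n_values = int(round((max_delay - min_delay) / step)) + 1
    return math.ceil(math.log2(n_values))


def delay_bit_rate(bits_per_subframe: int, subframe_ms: float) -> float:
    """Bits per second spent on the delay parameter."""
    return bits_per_subframe * 1000.0 / subframe_ms


print(delay_bits(16, 140, 1.0))   # integer-sample grid
print(delay_bits(16, 140, 0.5))   # half-sample grid (assumed step)
print(delay_bit_rate(8, 5.0))     # 8 bits per 5 ms subframe, in bit/s
```

The same `delay_bit_rate` helper can be reused to see how the rate falls when fewer bits per subframe are spent on the delay, which is the saving the invention aims at.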

[0004] Therefore, to transmit a speech signal with good quality at 4 kb/s or less, the information needed to transmit the delay had to be reduced. However, simply reducing the number of bits per subframe narrows the range of pitch variation that can be represented or leaves the precision insufficient, causing a significant degradation in sound quality.

[0005] The present invention solves the above problem and allows the delay to be transmitted with a small number of bits, so that a speech signal can be encoded with good quality at 4 kb/s or less.

[0006] [Means for Solving the Problems] According to the first invention, there is provided a voice coding device comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a spectrum parameter calculating section that obtains spectrum parameters from the speech signal, an adaptive codebook section that cuts out the past excitation signal at a given delay and performs pitch prediction, and an excitation quantizing section that quantizes the excitation signal, characterized in that the adaptive codebook section predicts the delay in the adaptive codebook from past quantized difference values and quantizes the difference obtained by the prediction.

[0007] According to the second invention, there is provided a voice coding device comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a spectrum parameter calculating section that obtains spectrum parameters from the speech signal, an adaptive codebook section that cuts out the past excitation signal at a given delay and performs pitch prediction, and an excitation quantizing section that quantizes the excitation signal, characterized in that the adaptive codebook section predicts the delay in the adaptive codebook from past quantized difference values and, based on the difference obtained by the prediction, decides whether to quantize the difference or to quantize the delay itself.

[0008] According to the third invention, there is provided a voice coding device comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates a feature quantity from the speech signal and discriminates a mode, a spectrum parameter calculating section that obtains spectrum parameters from the speech signal, an adaptive codebook section that cuts out the past excitation signal at a given delay and performs pitch prediction, and an excitation quantizing section that quantizes the excitation signal, characterized in that, in a predetermined mode, the adaptive codebook section predicts the delay in the adaptive codebook from past quantized difference values and quantizes the difference obtained by the prediction.

[0009] According to the fourth invention, there is provided a voice coding device comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates a feature quantity from the speech signal and discriminates a mode, a spectrum parameter calculating section that obtains spectrum parameters from the speech signal, an adaptive codebook section that cuts out the past excitation signal at a given delay and performs pitch prediction, and an excitation quantizing section that quantizes the excitation signal, characterized in that, in a predetermined mode, the adaptive codebook section predicts the delay in the adaptive codebook from past quantized difference values and, based on the difference obtained by the prediction, decides whether to quantize the difference or to quantize the delay itself.

[0010] [Embodiments] FIG. 1 is a block diagram showing an embodiment of the voice coding device according to the first invention.

[0011] In the figure, a speech signal is input from the input terminal 100. The frame division circuit 110 divides the speech signal into frames (for example, 40 ms), and the subframe division circuit 120 divides the speech signal of each frame into subframes shorter than the frame (for example, 8 ms).

[0012] In the spectrum parameter calculation circuit 200, a window longer than the subframe length (for example, 24 ms) is applied to the speech signal of at least one subframe to cut out the speech, and spectrum parameters of a predetermined order (for example, P = 10) are calculated. Well-known methods such as LPC analysis or Burg analysis can be used to calculate the spectrum parameters; Burg analysis is used here. Details of Burg analysis are given in Nakamizo, "Signal Analysis and System Identification" (Corona Publishing, 1988), pp. 82-87 (Reference 3), and are therefore omitted. The spectrum parameter calculation circuit further converts the linear prediction coefficients $\alpha_i$ ($i = 1, \ldots, 10$) calculated by the Burg method into LSP parameters, which are well suited to quantization and interpolation. For the conversion from linear prediction coefficients to LSPs, see Sugamura et al., "Speech information compression by the line spectrum pair (LSP) speech analysis-synthesis method" (Transactions of the Institute of Electronics and Communication Engineers of Japan, J64-A, pp. 599-606, 1981) (Reference 4). For example, the linear prediction coefficients obtained by the Burg method in the 1st, 3rd and 5th subframes are converted into LSP parameters, the LSPs of the 2nd and 4th subframes are obtained by linear interpolation, and the LSPs of the 2nd and 4th subframes are converted back into linear prediction coefficients. The linear prediction coefficients $\alpha_{il}$ ($i = 1, \ldots, 10$; $l = 1, \ldots, 5$) of the 1st to 5th subframes are output to the perceptual weighting circuit 230, and the LSP of the 5th subframe is output to the spectrum parameter quantization circuit 210.

[0013] The spectrum parameter quantization circuit 210 efficiently quantizes the LSP parameters of a predetermined subframe. In the following, vector quantization is used as the quantization method, and the LSP parameters of the 5th subframe are quantized. Well-known techniques can be used for vector quantization of LSP parameters. Specific methods are described, for example, in Japanese Patent Laid-Open No. 4-171500 (Japanese Patent Application No. 2-297600) (Reference 5), Japanese Patent Laid-Open No. 4-363000 (Japanese Patent Application No. 3-261925) (Reference 6), Japanese Patent Laid-Open No. 5-6199 (Japanese Patent Application No. 3-155049) (Reference 7), and T. Nomura et al., "LSP Coding Using VQ-SVQ With Interpolation in 4.075 kbps M-LCELP Speech Coder" (Proc. Mobile Multimedia Communications, pp. B.2.5, 1993) (Reference 8), so a description is omitted here.

[0014] The spectrum parameter quantization circuit 210 also restores the LSP parameters of the 1st to 4th subframes from the LSP parameters quantized in the 5th subframe. Here, the LSPs of the 1st to 4th subframes are restored by linear interpolation between the quantized LSP parameters of the 5th subframe of the current frame and the quantized LSP of the 5th subframe of the immediately preceding frame. After one code vector that minimizes the error power between the LSP before quantization and the LSP after quantization has been selected, the LSPs of the 1st to 4th subframes can be restored by linear interpolation. To improve performance further, a plurality of candidate code vectors that minimize the error power may be selected, the cumulative distortion evaluated for each candidate, and the pair of candidate and interpolated LSPs that minimizes the cumulative distortion chosen. For details, see, for example, Japanese Patent Application No. 5-8737 (Reference 9).
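The linear interpolation between the previous and current quantized 5th-subframe LSPs can be sketched as follows. This is a minimal illustration, not the patent's implementation: the evenly spaced weights $l/5$ and the function name `interpolate_lsp` are assumptions, since the text only states that linear interpolation is used.

```python
def interpolate_lsp(prev_q, curr_q, n_subframes=5):
    """Return one LSP vector per subframe, from subframe 1 to n_subframes.

    prev_q: quantized LSP of the last subframe of the previous frame.
    curr_q: quantized LSP of the last subframe of the current frame.
    The weight l / n_subframes is an assumed, evenly spaced choice.
    """
    lsps = []
    for l in range(1, n_subframes + 1):
        w = l / n_subframes
        lsps.append([(1.0 - w) * p + w * c for p, c in zip(prev_q, curr_q)])
    return lsps


# Toy two-dimensional LSP vectors; real vectors would have order P = 10.
for vec in interpolate_lsp([0.0, 1.0], [1.0, 2.0]):
    print(vec)
```

With these weights the last returned vector equals the current quantized LSP, so only the 5th-subframe index needs to be transmitted.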

[0015] The LSPs of the 1st to 4th subframes restored as described above and the quantized LSP of the 5th subframe are converted, subframe by subframe, into linear prediction coefficients $\alpha'_{il}$ ($i = 1, \ldots, 10$; $l = 1, \ldots, 5$) and output to the impulse response calculation circuit 310. The index representing the code vector of the quantized LSP of the 5th subframe is output to the multiplexer 400.

[0016] Instead of linear interpolation, a set of LSP interpolation patterns corresponding to a predetermined number of bits (for example, 2 bits) may be prepared. For each pattern, the LSPs of the 1st to 4th subframes are restored, and the pair of code vector and interpolation pattern that minimizes the cumulative distortion is selected. This increases the transmitted information by the number of bits of the interpolation pattern, but allows the temporal variation of the LSP within the frame to be represented more precisely. The interpolation patterns may be trained in advance using LSP training data, or predetermined patterns may be stored. As predetermined patterns, those described, for example, in T. Taniguchi et al., "Improved CELP speech coding at 4 kb/s and below" (Proc. ICSLP, pp. 41-44, 1992) (Reference 10) can be used. To improve performance still further, after an interpolation pattern has been selected, an error signal between the true LSP value and the interpolated LSP value may be obtained in a predetermined subframe, and this error signal may additionally be represented by an error codebook.
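The joint selection of code vector and interpolation pattern can be sketched as an exhaustive search over both. The sketch below makes several assumptions that are not in the patent: each pattern is represented as four interpolation weights, the distortion is a plain squared error summed over all five subframes, and the names are hypothetical.

```python
def select_pattern(true_lsps, prev_q, candidates, patterns):
    """Pick (candidate index, pattern index) with least cumulative distortion.

    true_lsps:  unquantized LSP vectors of subframes 1..5.
    prev_q:     quantized 5th-subframe LSP of the previous frame.
    candidates: candidate quantized LSP vectors for the 5th subframe.
    patterns:   each pattern holds 4 weights for subframes 1..4 (assumed form).
    """
    best = None
    for ci, cand in enumerate(candidates):
        for pi, weights in enumerate(patterns):
            # Distortion of the directly quantized 5th subframe.
            dist = sum((t - c) ** 2 for t, c in zip(true_lsps[-1], cand))
            # Distortion of the interpolated subframes 1..4.
            for l, w in enumerate(weights):
                interp = [(1.0 - w) * p + w * c for p, c in zip(prev_q, cand)]
                dist += sum((t - x) ** 2 for t, x in zip(true_lsps[l], interp))
            if best is None or dist < best[0]:
                best = (dist, ci, pi)
    return best[1], best[2]
```

With four patterns the pattern index costs 2 bits per frame, matching the example in the text; the pattern contents themselves would come from training or from a stored table.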

[0017] The perceptual weighting circuit 230 receives from the spectrum parameter calculation circuit 200 the unquantized linear prediction coefficients $\alpha_{il}$ ($i = 1, \ldots, 10$; $l = 1, \ldots, 5$) for each subframe, applies perceptual weighting to the speech signal of the subframe in accordance with Reference 1, and outputs the perceptually weighted signal $x_w(n)$.

[0018] The response signal calculation circuit 240 receives the linear prediction coefficients $\alpha_{il}$ for each subframe from the spectrum parameter calculation circuit 200, and the quantized, interpolated and restored linear prediction coefficients $\alpha'_{il}$ for each subframe from the spectrum parameter quantization circuit 210. Using the stored filter memory values, it calculates the response signal for one subframe with the input signal set to $d(n) = 0$, and outputs it to the subtractor 235. The response signal $x_z(n)$ is expressed by the following equation.

[0019] [Equation 1]

[0020] Here, $\gamma$ is a weighting coefficient that controls the amount of perceptual weighting, and takes the same value as in equation (3) below.

[0021] The subtractor 235 subtracts the response signal from the perceptually weighted signal for one subframe according to the following equation, and outputs $x'_w(n)$ to the adaptive codebook circuit 500.

[0022]

$$x'_w(n) = x_w(n) - x_z(n) \qquad (2)$$

The impulse response calculation circuit 310 calculates a predetermined number of points $L$ of the impulse response $h_w(n)$ of the weighting filter whose z-transform is expressed by the following equation, and outputs it to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

[0023] [Equation 2]

[0024] The configuration of the adaptive codebook circuit 500 is shown in FIG. 2. In FIG. 2, the delay calculation unit 510 receives the past excitation signal $v(n)$, the output signal $x'_w(n)$ of the subtractor 235, and the impulse response $h_w(n)$ from terminals 501, 502 and 503, respectively, and determines the delay $T$ corresponding to the pitch so as to minimize the following equation.

[0025] [Equation 3]

[0026] Here,

$$y_w(n-T) = v(n-T) * h_w(n) \qquad (5)$$

and the symbol $*$ denotes convolution.

[0027] The gain $\beta$ is obtained according to the following equation.

[0028] [Equation 4]
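The delay and gain computation described above can be illustrated with a closed-loop search. Because the equation images are not reproduced in this text, the sketch assumes the form commonly used in CELP coders: for each candidate $T$, form $y_w(n-T)$ by equation (5) and keep the $T$ that maximizes $\left(\sum_n x'_w(n)\, y_w(n-T)\right)^2 / \sum_n y_w^2(n-T)$, which is equivalent to minimizing the squared error with the optimal gain $\beta = \sum_n x'_w(n)\, y_w(n-T) / \sum_n y_w^2(n-T)$. Integer delays only, the periodic extension for delays shorter than the subframe, and all function names are assumptions.

```python
def convolve_truncated(x, h):
    """First len(x) samples of the convolution x * h."""
    return [
        sum(h[k] * x[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(x))
    ]


def search_delay(past_excitation, target, h_w, t_min, t_max):
    """Return (T, beta) for the best integer delay in [t_min, t_max].

    past_excitation: v(n) for n < 0, most recent sample last.
    target:          x'_w(n) for the current subframe.
    h_w:             weighting-filter impulse response.
    """
    n_sub = len(target)
    hist = len(past_excitation)
    best = None
    for T in range(t_min, t_max + 1):
        # v(n - T); repeated periodically when T < subframe length (assumed).
        seg = [past_excitation[hist - T + (n % T)] for n in range(n_sub)]
        y = convolve_truncated(seg, h_w)
        corr = sum(a * b for a, b in zip(target, y))
        energy = sum(b * b for b in y)
        if energy <= 0.0:
            continue
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, T, corr / energy)
    return best[1], best[2]
```

A fractional-delay search would evaluate the same criterion on an interpolated excitation; that refinement is left out of the sketch.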

[0029] To improve the accuracy of delay extraction for female and children's voices, the delay may be determined as a fractional sample value rather than an integer sample value. For a specific method, see, for example, P. Kroon et al., "Pitch predictors with high temporal resolution" (Proc. ICASSP, pp. 661-664, 1990) (Reference 11).

[0030] The delay prediction unit 520 receives the delay $T$, the quantized difference value of the delay of the past subframe from the subframe delay unit 540, and the prediction coefficient from the prediction coefficient codebook 525, and performs MA (moving average) prediction of the delay of the current subframe. As an example, the case where the quantized value of one past subframe is used for the prediction is given by the following equation.

[0031]

$$T_h = \eta\, e_h^{\,l-1} \qquad (7)$$

Here, $\eta$ is a fixed prediction coefficient stored in the prediction coefficient codebook. The difference quantization unit 530 calculates the difference according to the following equation.

[0032]

$$e^l = T - T_h \qquad (8)$$

The difference value $e^l$ is quantized with a predetermined number of quantization bits to obtain the quantized value $e_h^l$, which is output to the delay restoration unit 550 and to the subframe delay unit 540. The index representing the quantized value $e_h^l$ is output from terminal 505.

[0033] The delay restoration unit 550 restores and outputs the delay $T'$ according to the following equation.

[0034]

$$T' = T_h + e_h^l \qquad (9)$$

The pitch prediction unit 560 performs pitch prediction according to the following equation and outputs the adaptive codebook prediction residual signal $z(n)$ from terminal 504.

[0035]

$$z(n) = x'_w(n) - \beta\, v(n-T') * h_w(n) \qquad (10)$$

This completes the description of the adaptive codebook circuit 500.
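Equations (7) to (9) can be traced with a short encoder and decoder pair. This is a sketch under stated assumptions: the patent only says the difference is quantized with a predetermined number of bits, so the uniform scalar quantizer, the step size, the bit count, the value of $\eta$ and the function names below are all hypothetical.

```python
def encode_delay(T, prev_eq, eta=0.5, step=0.5, bits=8):
    """Encode one subframe delay; returns (index, e_q, T_restored).

    prev_eq is the quantized difference of the previous subframe.
    """
    T_h = eta * prev_eq                       # equation (7)
    e = T - T_h                               # equation (8)
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    index = max(lo, min(hi, round(e / step)))  # assumed uniform quantizer
    e_q = index * step
    return index, e_q, T_h + e_q              # equation (9)


def decode_delay(index, prev_eq, eta=0.5, step=0.5):
    """Mirror of the encoder; returns (e_q, T_restored)."""
    T_h = eta * prev_eq
    e_q = index * step
    return e_q, T_h + e_q


prev_enc = prev_dec = 0.0
for T in [50.0, 52.0, 51.5, 49.0]:
    idx, prev_enc, T_enc = encode_delay(T, prev_enc)
    prev_dec, T_dec = decode_delay(idx, prev_dec)
    print(T, idx, T_enc, T_dec)
```

Because the prediction uses only quantized differences, which the decoder also holds, encoder and decoder stay in step without the delay itself ever being transmitted.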

[0036] The excitation quantization circuit 350 is described for the case where an excitation codebook is searched. The excitation signal is quantized by searching the code vectors stored in the excitation codebook 351. In the search, the best excitation code vector $c_j(n)$ is selected so as to minimize the following equation. A single best code vector may be selected, or two or more code vectors may be selected provisionally and narrowed down to one during gain quantization. Here, two or more code vectors are selected.

[0037] [Equation 5]

[0038] When the above equation is to be applied to only some of the excitation code vectors, a plurality of excitation code vectors may be preselected in advance and the equation applied to the preselected excitation code vectors.

[0039] The gain quantization circuit 365 reads gain code vectors from the gain codebook 355 and, for the selected excitation code vectors, selects the combination of excitation code vector and gain code vector that minimizes the following equation.

[0040] [Equation 6]

[0041] Here, $\beta'_k$ and $\gamma'_k$ are the elements of the $k$-th code vector in the two-dimensional gain codebook stored in the gain codebook 355. The indexes representing the selected excitation code vector and gain code vector are output to the multiplexer 400.
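The final choice among the preselected excitation code vectors and the gain code vectors can be sketched as a joint exhaustive search. The equation image is not reproduced in this text, so the error measure below is an assumption in the usual CELP form, $\sum_n \left[x'_w(n) - \beta'_k\, y_a(n) - \gamma'_k\, y_{c_j}(n)\right]^2$, where $y_a$ and $y_{c_j}$ are the adaptive-codebook vector and the excitation code vector after filtering by $h_w(n)$. The function and argument names are hypothetical.

```python
def select_gain_and_codevector(target, y_adaptive, filtered_candidates,
                               gain_codebook):
    """Return (j, k) minimizing the assumed squared-error criterion.

    target:              x'_w(n) for the subframe.
    y_adaptive:          v(n - T) * h_w(n).
    filtered_candidates: list of (j, c_j(n) * h_w(n)) for preselected vectors.
    gain_codebook:       list of (beta_k, gamma_k) pairs.
    """
    best = None
    for j, y_c in filtered_candidates:
        for k, (beta_k, gamma_k) in enumerate(gain_codebook):
            err = sum(
                (x - beta_k * ya - gamma_k * yc) ** 2
                for x, ya, yc in zip(target, y_adaptive, y_c)
            )
            if best is None or err < best[0]:
                best = (err, j, k)
    return best[1], best[2]
```

Keeping two or more excitation candidates until this stage lets the gain quantization error influence which code vector is finally sent.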

[0042] The weighting signal calculation circuit 360 receives the output parameters of the spectrum parameter calculation circuit and the respective indexes, reads out the code vectors corresponding to the indexes, and first obtains the driving excitation signal $v(n)$ according to the following equation.

[0043]

$$v(n) = \beta'_k\, v(n-T) + \gamma'_k\, c_j(n) \qquad (13)$$

Next, using the output parameters of the spectrum parameter calculation circuit 200 and of the spectrum parameter quantization circuit 210, the response signal $s_w(n)$ is calculated for each subframe according to the following equation and output to the response signal calculation circuit 240.

[0044] [Equation 7]

[0045] This completes the description of the embodiment corresponding to the first invention.

[0046] FIG. 3 is a block diagram showing an embodiment of the second invention. The second invention differs from the first in the operation of the adaptive codebook circuit 600, which is therefore described with reference to FIG. 4. In FIG. 4, components bearing the same numbers as in FIG. 2 operate in the same way as in FIG. 2, and their description is omitted.

[0047] The discrimination unit 610 receives the predicted delay value $T_h$ output by the delay prediction unit 520 and the delay $T$ of the current subframe from the delay calculation unit 510, and obtains the error according to the following equation.

[0048]

$$e^l = T - T_h \qquad (15)$$

For example, the absolute value of the error $e^l$ is compared with a predetermined threshold. A prediction discrimination signal is generated that indicates that prediction is used when the value is smaller than the threshold and that prediction is not used when it is larger, and this signal is output to the switches 620₁ and 620₂ and to terminal 506.

[0049] The switch 620₁ receives the prediction discrimination signal and is set to the upper position when prediction is not used and to the lower position when prediction is used. It thereby outputs to the pitch prediction unit 560 the delay $T$ from the delay calculation unit 510 when prediction is not used, and the delay $T'$ from the delay restoration unit 550 when prediction is used. The switch 620₂ receives the prediction discrimination signal and outputs to terminal 505 the index corresponding to the delay $T$ when prediction is not used, and the index of the difference quantization when prediction is used.
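The decision made by the discrimination unit and the two switches can be summarized in a few lines. This is only a sketch: the function name and return convention are hypothetical, and treating an error exactly equal to the threshold as "no prediction" is an assumption, since the text does not specify that case.

```python
def choose_delay_coding(T, T_h, threshold):
    """Return (use_prediction, value_to_quantize) for one subframe.

    use_prediction plays the role of the prediction discrimination signal;
    the second element is what would be quantized and indexed.
    """
    e = T - T_h                        # equation (15)
    use_prediction = abs(e) < threshold
    value = e if use_prediction else T
    return use_prediction, value


print(choose_delay_coding(52.0, 50.0, threshold=4.0))   # small error
print(choose_delay_coding(90.0, 50.0, threshold=4.0))   # large pitch jump
```

The flag must be sent alongside the index (terminal 506 in FIG. 4) so that the decoder knows how to interpret the index.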

[0050] This completes the description.

[0051] FIG. 5 is a block diagram showing an embodiment of the third invention. Components bearing the same numbers as in FIG. 1 operate in the same way as in FIG. 1, and their description is omitted. In FIG. 5, the mode discrimination circuit 700 receives the perceptually weighted signal frame by frame from the perceptual weighting circuit 230 and outputs mode discrimination information. Here, a feature quantity of the current frame is used for mode discrimination, for example the pitch prediction gain. The pitch prediction gain is calculated, for example, by the following equation.

[0052] [Equation 8]

[0053] Here, $T$ is the optimum delay that maximizes the prediction gain.

[0054] The pitch prediction gain is compared with a plurality of predetermined thresholds and classified into one of a plurality of modes. The number of modes can be, for example, four.
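Classifying a gain against an ordered list of thresholds is a simple lookup. In the sketch below the three threshold values, the dB unit and the function name are hypothetical; the text only states that several predetermined thresholds yield, for example, four modes.

```python
import bisect


def classify_mode(pitch_gain_db, thresholds=(1.0, 4.0, 7.0)):
    """Map a pitch prediction gain to a mode number 0..len(thresholds).

    Thresholds must be sorted in ascending order; the values used here
    are placeholders, not figures from the patent.
    """
    return bisect.bisect_right(thresholds, pitch_gain_db)


for gain in (0.5, 2.0, 5.0, 9.0):
    print(gain, classify_mode(gain))
```

Three thresholds give four modes, with mode 0 corresponding to the weakest pitch correlation under this ordering.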

[0055] The mode discrimination circuit 700 outputs the mode discrimination information to the adaptive codebook circuit 800.

[0056] The configuration of the adaptive codebook circuit 800 is shown in FIG. 6. Components bearing the same numbers as in FIGS. 2 and 4 function in the same way as in FIGS. 2 and 4, and their description is omitted. In FIG. 6, the switches 820₁ and 820₂ receive the mode discrimination information from terminal 801 and switch delay prediction on or off according to the mode.

[0057] The operation of the pitch prediction unit 860 is also changed according to the mode information. For example, the adaptive codebook circuit may be left unused in a predetermined mode only (for example, mode 0). To do so, the gain $\beta$ is set to 0 when equation (10) is evaluated in the pitch prediction unit 860.

[0058] FIG. 7 is a block diagram showing an embodiment of the fourth invention. Components bearing the same numbers as in FIGS. 1, 3 and 5 operate in the same way, and their description is omitted. In FIG. 7, the operation of the adaptive codebook circuit 900 differs, and its configuration is shown in FIG. 8. In FIG. 8, components bearing the same numbers as in FIGS. 4 and 6 operate in the same way, and their description is omitted. In FIG. 8, mode information is input from terminal 901 and passed to the discrimination unit 910. The discrimination unit 910 evaluates the prediction residual in predetermined modes and outputs a discrimination signal indicating prediction or no prediction to the switches 620₁ and 620₂. In modes other than the predetermined ones, no prediction is used.

[0059] This completes the description of the embodiments of the present invention.

[0060] The present invention is not limited to the embodiments described above, and various modifications are possible.

[0061] In the adaptive codebook circuit, the delay prediction unit 520 may use higher-order prediction, in which the delay is predicted from the quantized difference values of a plurality of past frames. With prediction order L, the prediction uses the following equation.

[0062]

[Equation 9]
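The structure of such a higher-order predictive delay quantizer can be sketched as follows. Equation 9 is not reproduced in this text; the sketch assumes that the predicted delay is a linear combination of the last L quantized differences. The fixed offset `base`, the uniform 5-bit quantizer, the coefficient values, and all function names are illustrative assumptions rather than details from the patent.

```python
from collections import deque

def make_state(order):
    """History of the last `order` quantized differences, newest first."""
    return deque([0.0] * order, maxlen=order)

def predict_delay(coeffs, state, base=40.0):
    # Assumed form: base + sum over l of coeffs[l] * q_diff(n - 1 - l).
    return base + sum(c * d for c, d in zip(coeffs, state))

def quantize_diff(diff, step=1.0, bits=5):
    """Uniform scalar quantizer; returns (index, reconstructed difference)."""
    levels = 1 << bits
    index = int(round(diff / step)) + levels // 2
    index = max(0, min(levels - 1, index))
    return index, (index - levels // 2) * step

def encode_delay(delay, coeffs, state):
    pred = predict_delay(coeffs, state)
    index, q_diff = quantize_diff(delay - pred)
    state.appendleft(q_diff)  # predict from quantized values only
    return index, pred + q_diff

def decode_delay(index, coeffs, state, step=1.0, bits=5):
    pred = predict_delay(coeffs, state)
    q_diff = (index - (1 << bits) // 2) * step
    state.appendleft(q_diff)
    return pred + q_diff

# Usage: encoder and decoder keep separate but identical histories.
coeffs = (0.5, 0.25)  # order L = 2, hypothetical coefficients
enc_state, dec_state = make_state(2), make_state(2)
for delay in (40, 42, 43, 41):
    index, enc_value = encode_delay(delay, coeffs, enc_state)
    dec_value = decode_delay(index, coeffs, dec_state)
    assert dec_value == enc_value
```

Because the predictor is driven only by quantized differences, the decoder can rebuild exactly the same prediction from the transmitted indices, which is why only the 5-bit index needs to be sent.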

[0063] The prediction coefficient codebook may also be switched for each mode.

[0064] The excitation codebook of the excitation quantization circuit may use other well-known configurations, such as a multistage configuration or a sparse configuration.

[0065] The excitation codebook in the excitation quantization circuit may also be switched using the mode discrimination information.

[0066] The excitation quantization circuit has been described with an example that searches an excitation codebook, but it may instead search for multipulses, that is, a plurality of pulses differing in position and amplitude. In that case, the amplitudes and positions of the multipulses are chosen to minimize the following expression.

[0067]

[Equation 10]

[0068] Here, g_j and m_j denote the amplitude and position of the j-th multipulse, respectively, and k is the number of multipulses.
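One way to carry out such a search is the sequential (greedy) procedure sketched below. Equation 10 is not reproduced in this text; the sketch assumes the usual multipulse criterion, the squared error between a target signal and the sum of k impulse responses shifted to positions m_j and scaled by g_j. The greedy strategy, the function name, and the argument names are assumptions chosen for illustration; the patent does not state which search procedure is used.

```python
import numpy as np

def multipulse_search(target, h, num_pulses):
    """Greedy search for pulse positions m_j and amplitudes g_j.

    Assumed criterion:
        D = sum_n ( target[n] - sum_j g_j * h[n - m_j] )^2
    One pulse is placed per iteration at the position that removes the
    most error, and its contribution is subtracted from the residual.
    """
    target = np.asarray(target, dtype=float)
    h = np.asarray(h, dtype=float)
    n = len(target)
    # Column m holds h shifted to position m, truncated at the subframe end.
    shifted = np.zeros((n, n))
    for m in range(n):
        length = min(len(h), n - m)
        shifted[m:m + length, m] = h[:length]
    energy = np.sum(shifted ** 2, axis=0)
    residual = target.copy()
    positions, amplitudes = [], []
    for _ in range(num_pulses):
        corr = shifted.T @ residual
        score = np.where(energy > 0.0,
                         corr ** 2 / np.maximum(energy, 1e-12), 0.0)
        m = int(np.argmax(score))
        if energy[m] <= 0.0:
            break  # degenerate impulse response: nothing to place
        g = corr[m] / energy[m]  # optimal amplitude for this position
        positions.append(m)
        amplitudes.append(float(g))
        residual -= g * shifted[:, m]
    return positions, amplitudes, float(residual @ residual)
```

A joint optimization of all amplitudes after the positions are fixed would lower the error further; the sequential version is shown only because it is short.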

[0069]

[Effects of the Invention] As described above, according to the present invention, the speech coding apparatus predicts the delay from past quantized difference values, so the number of bits required to represent the delay can be reduced from, for example, 8 bits to about 5 bits per subframe. In terms of the delay information transmitted per second, this is a reduction from 1.6 kb/s to 1 kb/s. It therefore becomes easy to lower the overall speech coding rate to 4 kb/s or below, and even at the lower rate, better sound quality than with the conventional method is obtained.
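The rates quoted above follow from the 5 ms subframe length given as an example in the description of the prior art: 200 subframes per second at 8 bits each gives 1.6 kb/s, and at 5 bits each gives 1 kb/s. A short check, with a hypothetical helper name:

```python
def delay_bit_rate(bits_per_subframe, subframe_ms=5.0):
    """Bits per second spent on the delay parameter."""
    subframes_per_second = 1000.0 / subframe_ms
    return bits_per_subframe * subframes_per_second

print(delay_bit_rate(8))  # 1600.0 bit/s, i.e. 1.6 kb/s
print(delay_bit_rate(5))  # 1000.0 bit/s, i.e. 1 kb/s
```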

[Brief description of drawings]

FIG. 1 is a diagram showing an embodiment of the first invention.

FIG. 2 is a diagram showing the configuration of the adaptive codebook circuit 500.

FIG. 3 is a diagram showing an embodiment of the second invention.

FIG. 4 is a diagram showing the configuration of the adaptive codebook circuit 600.

FIG. 5 is a diagram showing an embodiment of the third invention.

FIG. 6 is a diagram showing the configuration of the adaptive codebook circuit 800.

FIG. 7 is a diagram showing an embodiment of the fourth invention.

FIG. 8 is a diagram showing the configuration of the adaptive codebook circuit 900.

[Explanation of symbols]

110 frame division circuit
120 subframe division circuit
200 spectrum parameter calculation circuit
210 spectrum parameter quantization circuit
211 LSP codebook
230 weighting circuit
235 subtraction circuit
240 response signal calculation circuit
500, 600, 800, 900 adaptive codebook circuit
310 impulse response calculation circuit
350 excitation quantization circuit
351 excitation codebook
355 gain codebook
360 weighting signal calculation circuit
365 gain quantization circuit
400 multiplexer
510 delay calculation unit
520 delay prediction unit
525 prediction coefficient codebook
530 difference quantization unit
540 subframe delay unit
550 delay restoration unit
560, 860 pitch prediction unit
620₁, 620₂, 820₁, 820₂ switch circuit
700 mode discrimination circuit

Claims

[Claims]

1. A speech coding apparatus comprising a frame division unit that divides a speech signal into predetermined frame units, a spectrum parameter calculation unit that obtains spectrum parameters from the speech signal, an adaptive codebook unit that cuts out a past excitation signal by a delay and performs pitch prediction, and an excitation quantization unit that quantizes the excitation signal, characterized in that the adaptive codebook unit predicts the delay in the adaptive codebook from past quantized difference values and quantizes the difference obtained by the prediction.

2. A speech coding apparatus comprising a frame division unit that divides a speech signal into predetermined frame units, a spectrum parameter calculation unit that obtains spectrum parameters from the speech signal, an adaptive codebook unit that cuts out a past excitation signal by a delay and performs pitch prediction, and an excitation quantization unit that quantizes the excitation signal, characterized in that the adaptive codebook unit predicts the delay in the adaptive codebook from past quantized difference values and determines, based on the difference obtained by the prediction, whether to quantize the difference or to quantize the delay.

3. A speech coding apparatus comprising a frame division unit that divides a speech signal into predetermined frame units, a mode discrimination unit that calculates a feature amount from the speech signal and determines a mode, a spectrum parameter calculation unit that obtains spectrum parameters from the speech signal, an adaptive codebook unit that cuts out a past excitation signal by a delay and performs pitch prediction, and an excitation quantization unit that quantizes the excitation signal, characterized in that, in a predetermined mode, the adaptive codebook unit predicts the delay in the adaptive codebook from past quantized difference values and quantizes the difference obtained by the prediction.

4. A speech coding apparatus comprising a frame division unit that divides a speech signal into predetermined frame units, a mode discrimination unit that calculates a feature amount from the speech signal and determines a mode, a spectrum parameter calculation unit that obtains spectrum parameters from the speech signal, an adaptive codebook unit that cuts out a past excitation signal by a delay and performs pitch prediction, and an excitation quantization unit that quantizes the excitation signal, characterized in that, in a predetermined mode, the adaptive codebook unit predicts the delay in the adaptive codebook from past quantized difference values and determines, based on the difference obtained by the prediction, whether to quantize the difference or to quantize the delay.

5. The speech coding apparatus according to claim 1, 2, 3 or 4, wherein the excitation quantization unit quantizes the excitation signal by searching an excitation codebook composed of a plurality of types of code vectors.