JPH0667696A

JPH0667696A - Speech encoding method

Info

Publication number: JPH0667696A
Application number: JP4244038A
Authority: JP
Inventors: Atsushi Matsumoto; 淳松本; Keiichi Katayanagi; 恵一片柳; Masayuki Nishiguchi; 正之西口
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1992-08-21
Filing date: 1992-08-21
Publication date: 1994-03-11

Abstract

PURPOSE: To reduce the amount of computation by detecting the analysis output of a past speech signal that has the shortest distance to the analysis output of the current input speech signal, and encoding the current input speech on the basis of this detection output. CONSTITUTION: A speech signal S(n), sampled at a sampling frequency fs of, for example, 8 kHz and converted into a digital signal, is input to input terminal 9. A short-term prediction inverse filter 10 inverse-filters the input signal to estimate the sound generated deep in the throat, and its residual output r(n) is supplied to subtractor 8; the output of subtractor 8 is supplied as an output difference e(n) to energy calculation unit (Σ()²) 11. Energy calculation unit 11 calculates the energy of the output difference e(n) between the residual output r'(n) and the residual output r(n) from short-term prediction inverse filter 10, and a dynamic code vector of dynamic codebook 1 is then selected, for example via terminal 12, so as to minimize the energy of e(n).

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a low-bit-rate speech coding method, and more particularly to a speech coding method that analyzes and encodes the current input speech signal by exploiting its correlation with past speech signals.

[0002]

[Prior Art] In recent years, code excited linear prediction (CELP) schemes such as vector sum excited linear prediction (VSELP) have been proposed as speech coding methods for low bit rates, that is, 4.8 to 9.6 kbps.

[0003] The technical details of VSELP are described in Japanese National Publication (Tokuhyo) No. H2-502135 by Motorola Incorporated, "Digital Speech Coder Having Improved Vector Excitation Source," and in Ira A. Gerson and Jasiuk, "VECTOR SUM EXCITED LINEAR PREDICTION (VSELP): SPEECH CODING AT 8 KBPS," paper presented at the Int. Conf. on Acoustics, Speech and Signal Processing, April 1990.

[0004] A speech coding method using VSELP achieves high-quality speech transmission at a low bit rate by means of a codebook search based on analysis by synthesis. In a speech coding apparatus (speech coder) to which the VSELP-based speech coding method is applied, speech is encoded by exciting a time-varying linear filter, into which pitch and noise predictors that shape the characteristics of the input speech signal are introduced, with code vectors selected from those stored in codebooks.

[0005] Specifically, for each frame of speech, the speech coder applies each code vector to the filter to form a speech signal, compares it with the original input speech signal, and detects the difference output. This difference output is weighted by passing it through a weighting filter based on human hearing. The goal is to select the code vector that produces the weighted difference output with the minimum energy for the current frame.

[0006] FIG. 5 is a functional block diagram showing the schematic configuration of a speech coder to which the VSELP-based speech coding method described above is applied. This speech coder has three codebooks in total: a dynamic codebook 51, in which the analysis outputs of past speech signals are stored as a plurality of code vectors, and a first fixed codebook 52 and a second fixed codebook 53, which store code vectors related to noise components.

[0007] A speech signal S(n), which has been sampled at a sampling frequency of, for example, f_s = 8 kHz, band-limited by a band-pass filter (BPF, not shown), and converted into a digital signal by an A/D converter, is input to input terminal 60. This speech signal S(n) is supplied to subtractor 59. Subtractor 59 is also supplied with the speech S'(n) synthesized from the code vectors selected from the three codebooks. The output of subtractor 59 is supplied as an output difference e(n) to energy calculation unit 61 (Σ()²), which calculates the energy of e(n). A dynamic code vector of dynamic codebook 51 is then selected, for example via terminal 50, so as to minimize the energy of e(n). Likewise, for first fixed codebook 52 and second fixed codebook 53, a first fixed code vector and a second fixed code vector that minimize the energy of e(n) are selected.

[0008] In other words, when the three codebooks are searched for their code vectors, the condition is that the energy of the output difference e(n) between the input speech signal S(n) and the speech S'(n), which is formed by combining the dynamic code vector stored in dynamic codebook 51, the first fixed code vector stored in first fixed codebook 52, and the second fixed code vector stored in second fixed codebook 53, be minimized.

[0009] First, the selection of the dynamic code vector stored in dynamic codebook 51 is described below.

[0010] Assuming, for example, a sampling frequency of 8 kHz and a speech frame of 40 samples, dynamic codebook 51 holds, for example, 128 dynamic code vectors. In this case, synthesis filter 58 processes all 128 code vectors. Subtractor 59 computes the output difference e(n) between the speech S'(n) output by synthesis filter 58 and the input speech S(n), and supplies it to energy calculation unit 61, which calculates the energy of e(n). The optimum index j_opt, which specifies the optimum dynamic code vector, is then searched for via terminal 50 so as to minimize that energy.
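The conventional search described above can be sketched as follows. This is a minimal illustration, not the coder's actual implementation: it assumes an all-pole synthesis filter with hypothetical coefficients `a`, and it omits the gain β and the perceptual weighting filter. The function names `synthesize` and `search_by_synthesis` are introduced here for illustration only.

```python
from typing import List, Sequence, Tuple


def synthesize(excitation: Sequence[float], a: Sequence[float]) -> List[float]:
    """All-pole synthesis filter: y[n] = x[n] + sum_k a[k] * y[n-1-k] (zero initial state)."""
    y: List[float] = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y


def search_by_synthesis(
    target: Sequence[float],
    codebook: Sequence[Sequence[float]],
    a: Sequence[float],
) -> Tuple[int, float]:
    """Filter every code vector and return (index, energy) of the smallest error energy."""
    best_index, best_energy = -1, float("inf")
    for j, vec in enumerate(codebook):
        synth = synthesize(vec, a)  # one full filtering pass per candidate
        energy = sum((s - y) ** 2 for s, y in zip(target, synth))
        if energy < best_energy:
            best_index, best_energy = j, energy
    return best_index, best_energy
```

Note that every candidate requires its own pass through the synthesis filter, which is the cost the invention sets out to avoid.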

[0011] Here, since synthesis filter 58 is generally a 10th-order IIR filter, 20 multiply-accumulate operations are performed per sample. These 20 multiply-accumulate operations are therefore carried out for, say, 40 samples, and the whole process is repeated for each of the 128 code vectors.

[0012] Here, a code vector from dynamic codebook 51 is multiplied by a coefficient β in multiplier 54 and then supplied to adder 57. A code vector from first fixed codebook 52 is multiplied by a coefficient γ₁ in multiplier 55 and supplied to adder 57, and a code vector from second fixed codebook 53 is multiplied by a coefficient γ₂ in multiplier 56 and supplied to adder 57. Adder 57 sums the products from multipliers 54, 55, and 56 and supplies the sum to synthesis filter 58.
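The multiplier and adder stage described above amounts to a weighted sum of three code vectors. A minimal sketch, in which the function name `mix_excitation` and the example values are hypothetical:

```python
from typing import List, Sequence


def mix_excitation(
    dynamic_vec: Sequence[float],
    fixed_vec1: Sequence[float],
    fixed_vec2: Sequence[float],
    beta: float,
    gamma1: float,
    gamma2: float,
) -> List[float]:
    """Scale each code vector by its coefficient and sum them sample by sample."""
    return [
        beta * d + gamma1 * f1 + gamma2 * f2
        for d, f1, f2 in zip(dynamic_vec, fixed_vec1, fixed_vec2)
    ]
```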

[0013] In FIG. 5, the code vectors of first fixed codebook 52 and second fixed codebook 53, enclosed by the broken line, are likewise processed by synthesis filter 58 as described above. In this case each codebook has 64 code vectors, so the computation is repeated 64 times.

[0014]

[Problems to Be Solved by the Invention] In a speech coder to which the VSELP-based speech coding method is applied, filtering every code vector of each codebook as described above requires a very large amount of computation. That is, for every vector in the search range of the dynamic codebook, the distance between the filtered vector and the original speech must be calculated, and the amount of computation is considerably larger than with an analytical method. This part accounts for a considerably larger share of the total processing time of the VSELP speech coding method than any other part.

[0015] The present invention has been made in view of the above circumstances, and its object is to provide a speech coding method that can reduce the amount of computation.

[0016]

[Means for Solving the Problems] The speech coding method according to the present invention, which analyzes and encodes the current input speech signal by exploiting its correlation with past speech signals, solves the above problem by comprising: a step of analyzing the current input speech signal; a detection step of detecting the analysis output of a past speech signal that has the shortest distance to the analysis output of the input speech signal; and an encoding step of encoding the current input speech on the basis of the detection output of the detection step.

[0017] A speech coding method according to another invention is a coding method that searches a codebook in which the analysis outputs of past speech signals are stored as a plurality of code vectors, and that performs encoding by exploiting the correlation with the current input speech signal. It solves the above problem by comprising: a step of analyzing the current input speech signal; a step of directly searching the codebook for the code vector that has the shortest distance to the analysis output of the input speech signal; and a step of encoding the current input speech signal using the index of the code vector obtained by the search.

[0018] A speech coding method according to yet another invention is a coding method that searches a codebook in which the analysis outputs of past speech signals are stored as a plurality of code vectors, and that performs encoding by exploiting the correlation with the current input speech signal. It solves the above problem by comprising: a step of analyzing the current input speech signal; a first search step of directly searching the codebook for the code vector that has the shortest distance to the analysis output of the input speech signal; a second search step of indirectly searching, among the code vectors in the neighborhood of the code vector obtained in the first search step (including that code vector itself), for the code vector whose correlation with the input speech signal is optimal; and a step of encoding the current input speech signal using the index of the code vector obtained in the second search step.

[0019]

[Operation] Because the analysis output of a past speech signal that has the shortest distance to the analysis output of the current input speech signal is detected, and the current input speech is encoded on the basis of this detection output, the amount of computation can be reduced.

[0020]

[Embodiments] Embodiments of the speech coding method according to the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the schematic configuration of an encoding apparatus (speech coder) that is a first embodiment to which the speech coding method of the present invention is applied.

[0021] In FIG. 1, this speech coder has three codebooks in total: a dynamic codebook 1, in which the analysis outputs of past speech signals are stored as a plurality of code vectors, and a first fixed codebook 2 and a second fixed codebook 3, which store code vectors related to noise components. Dynamic codebook 1 is a codebook that changes over time, and holds a plurality of code vectors based on analysis data from a fixed period in the past. First fixed codebook 2 and second fixed codebook 3 hold noise components in a fixed state.

[0022] A speech signal S(n), sampled at a sampling frequency of, for example, f_s = 8 kHz and converted into a digital signal by an A/D converter (not shown), is input to input terminal 9. This speech signal S(n) is supplied to short-term prediction inverse filter 10, which has the inverse characteristic of synthesis filter 58 of the conventional example described above.

[0023] Short-term prediction inverse filter 10 applies inverse filtering to the input speech and thereby estimates, so to speak, the sound produced deep in the throat. The output of short-term prediction inverse filter 10 is referred to as the residual output r(n). This residual output r(n) is supplied to subtractor 8.
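The relationship between an inverse filter and the synthesis filter it inverts can be sketched as follows. This is an illustration under assumptions: the predictor coefficients `a` and the sign convention are hypothetical, and both filters start from a zero state. Passing a signal through the synthesis filter and then through the inverse filter returns the original excitation, which is why r(n) can be compared directly with codebook entries.

```python
from typing import List, Sequence


def inverse_filter(s: Sequence[float], a: Sequence[float]) -> List[float]:
    """FIR inverse filter: r[n] = s[n] - sum_k a[k] * s[n-1-k]."""
    r: List[float] = []
    for n in range(len(s)):
        pred = sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        r.append(s[n] - pred)
    return r


def synthesis_filter(r: Sequence[float], a: Sequence[float]) -> List[float]:
    """All-pole synthesis filter: s[n] = r[n] + sum_k a[k] * s[n-1-k]."""
    s: List[float] = []
    for n in range(len(r)):
        pred = sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        s.append(r[n] + pred)
    return s
```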

[0024] Here, a code vector from dynamic codebook 1 is multiplied by a coefficient β in multiplier 4 and then supplied to adder 7. A code vector from first fixed codebook 2 is multiplied by a coefficient γ₁ in multiplier 5 and then supplied to adder 7. A code vector from second fixed codebook 3 is multiplied by a coefficient γ₂ in multiplier 6 and then supplied to adder 7. Adder 7 then supplies a residual output r'(n) to subtractor 8.

[0025] That is, subtractor 8 is also supplied with the residual output r'(n) corresponding to the code vectors of the three codebooks. The output of subtractor 8 is supplied as an output difference e(n) to energy calculation unit (Σ()²) 11.

[0026] Energy calculation unit 11 calculates the energy of the output difference e(n) between the residual output r'(n) and the residual output r(n) from short-term prediction inverse filter 10. In the first embodiment, a dynamic code vector of dynamic codebook 1 is then selected, for example via terminal 12, so as to minimize the energy of e(n).

[0027] In other words, to search dynamic codebook 1 for a dynamic code vector, it suffices to search the dynamic code vectors stored in dynamic codebook 1 so that the energy of the output difference e(n) between the residual output r'(n) and the residual output r(n) from short-term prediction inverse filter 10 is minimized.

[0028] Assuming, for example, a sampling frequency of 8 kHz and a speech frame of 40 samples, dynamic codebook 1 holds, for example, 128 dynamic code vectors. However, the first embodiment does not use the synthesis filter used in the conventional example. Subtractor 8 therefore only has to compare the residual output r'(n) corresponding to each of the 128 code vectors with the residual output r(n) from short-term prediction inverse filter 10. That is, only 128 comparison operations are performed for each of the 40 samples.

[0029] That is, the first embodiment compares the residual output r'(n) corresponding to each of the 128 code vectors with the residual output r(n) from short-term prediction inverse filter 10, and encodes the current input speech signal by searching, via terminal 12, for the optimum dynamic code vector as the optimum index J_opt that minimizes the energy of the output difference e(n), and extracting it.
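The residual-domain search of the first embodiment can be sketched as follows. This is a minimal illustration that assumes each candidate r'(n) is already available as a gain-scaled vector; the function name `search_residual_domain` is hypothetical. No filtering is performed per candidate; each one costs only a subtraction and an energy sum.

```python
from typing import Sequence, Tuple


def search_residual_domain(
    r: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> Tuple[int, float]:
    """Return (index, energy) of the candidate r' that minimizes sum_n (r[n] - r'[n])^2."""
    best_index, best_energy = -1, float("inf")
    for j, r_prime in enumerate(candidates):
        energy = sum((x - y) ** 2 for x, y in zip(r, r_prime))
        if energy < best_energy:
            best_index, best_energy = j, energy
    return best_index, best_energy
```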

[0030] Next, the operation of short-term prediction inverse filter 10 is described with reference to FIG. 2. In FIG. 2, the number of samples per frame of the speech signal S(n) input from input terminal 9 of FIG. 1 is denoted N_S. As stated above, in this embodiment the sampling frequency f_s is 8 kHz and N_S is, for example, 40. The size of dynamic codebook 1 (the number of dynamic code vectors available for selection) is denoted N_c; in this embodiment it is, for example, 128, as stated above. The input speech signal to be analyzed is denoted S(n), where 0 ≤ n < N_S. The state of dynamic codebook 1 at that time (time n), that is, the dynamic code vector to be extracted, is denoted J(n), where 0 ≤ n < N_c. The residual waveform produced by short-term prediction inverse filter 10 at any given time is denoted r(n), where 0 ≤ n < N_S.

[0031] The spectrum of speech normally has an envelope with pronounced formants, as shown in FIG. 2A. Applying a fast Fourier transform (FFT) to this envelope yields a spectrum like that shown in FIG. 2B, in which the spacing between peaks corresponds to the pitch. When this spectrum is input to short-term prediction inverse filter 10, a residual output r(n) whose envelope has been reduced, as shown in FIG. 2C, is obtained. That is, it is a waveform resembling white noise, in which a little of the pitch remains and the redundancy due to formant synthesis has been removed.

[0032] Dynamic codebook 1 stores dynamic code vectors J(n) like that shown in FIG. 2D, which closely resemble the residual output r(n) shown in FIG. 2C. In the first embodiment, subtractor 8 therefore compares the residual output r(n) with the residual output r'(n) corresponding to the dynamic code vector J(n).

[0033] Here, the residual output r(n) can be regarded as an N_S-dimensional vector, according to the number of samples N_S in one frame; call it r. Likewise, let c_J denote the vector formed by taking N_S samples starting from index J of dynamic codebook 1. The distance (degree of similarity) between r and c_J is then computed, and the pitch in this frame can be extracted from the index J_opt for which the distance is shortest. The gain of c_J does not matter here.
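A gain-independent distance search of this kind can be sketched as follows. The sketch models dynamic codebook 1 as a flat buffer `history` of past samples and takes c_J as the slice of N_S samples starting at index J; the buffer layout, the indexing direction, and the function names are assumptions made for illustration. For each candidate, the best scalar gain t is found by least squares, so only the shape of c_J affects the distance.

```python
from typing import List, Sequence


def candidate(history: Sequence[float], j: int, n_s: int) -> List[float]:
    """c_J: N_S consecutive samples of the codebook buffer starting at index J."""
    return list(history[j : j + n_s])


def gain_free_distance(r: Sequence[float], c: Sequence[float]) -> float:
    """min over t of |r - t*c|^2, using the least-squares gain t = (r.c)/|c|^2."""
    cc = sum(x * x for x in c)
    if cc == 0.0:
        return sum(x * x for x in r)  # an all-zero candidate cannot reduce the error
    t = sum(x * y for x, y in zip(r, c)) / cc
    return sum((x - t * y) ** 2 for x, y in zip(r, c))


def search_pitch(r: Sequence[float], history: Sequence[float], n_c: int) -> int:
    """Index J in [0, n_c) whose candidate is closest to r regardless of gain."""
    n_s = len(r)
    return min(
        range(n_c),
        key=lambda j: gain_free_distance(r, candidate(history, j, n_s)),
    )
```

The buffer must hold at least `n_c + len(r) - 1` samples so that every candidate slice is full length.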

[0034] That is, the pitch J in this frame can be extracted when |r − t c_J| is minimal, that is, when |r − t c_J|² is minimal, so that

[0035]

[Equation 1]

[0036] holds. Here, since r is fixed, the pitch can be determined by selecting the index J_opt that maximizes (r · c_J)² / |c_J|².
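The step from minimizing |r − t c_J|² to maximizing (r · c_J)² / |c_J|² follows from a standard least-squares argument, sketched below. The image of Equation 1 is not reproduced in this text, so this is a reconstruction of the reasoning under the assumption that t is a scalar gain; it is not necessarily the exact form printed in the publication.

```latex
\begin{align}
|r - t\,c_J|^2 &= |r|^2 - 2t\,(r \cdot c_J) + t^2\,|c_J|^2, \\
\frac{\partial}{\partial t}\,|r - t\,c_J|^2 = 0
  &\;\Longrightarrow\; t = \frac{r \cdot c_J}{|c_J|^2}, \\
\min_t\,|r - t\,c_J|^2 &= |r|^2 - \frac{(r \cdot c_J)^2}{|c_J|^2}.
\end{align}
```

Because |r|² does not depend on J, the minimum over J is reached where the subtracted term (r · c_J)² / |c_J|² is largest.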

[0037] Next, the difference in the amount of computation between the conventional VSELP speech coder described above and the first embodiment is explained concretely. First, the amount of computation of the VSELP speech coder is described. The VSELP speech coder synthesizes sound using synthesis filter 58. The filtering sequence z_J(n) for synthesizing speech is therefore given by

[0038]

[Equation 2]

[0039] where h(n − i) is the impulse response of synthesis filter 58 at sample (n − i).

[0040] Furthermore, z_J(n) satisfies z_J(n) = z_{J−1}(n − 1) + r(−J) h(n), where 1 ≤ n ≤ N − 1, and z_J(0) = r(−J) h(0).

[0041] In the VSELP described above, z_J(n) accounts for 40 × 128. Next, in the VSELP speech coder, the power G_J of the sound resynthesized from dynamic codebook 51 is

[0042]

[Equation 3]

[0043] where b'_J(n) is the output obtained after the code vector output b_J(n) from the dynamic codebook has passed through synthesis filter H(z). This power G_J can also be expressed as

[0044]

[Equation 4]

[0045] In the conventional VSELP this amounts to 20 × 128.

[0046] The inner product C_J of the dynamic codebook and the original speech is

[0047]

[Equation 5]

[0048] which amounts to 40 × 128 in the VSELP speech coder. The impulse response h(n) of the synthesis filter amounts to 10 × 20, because the synthesis filter is a 10th-order IIR filter and 20 samples are used.

[0049] The amount of computation of the speech coder of this embodiment is compared with that of the conventional VSELP speech coder in Table 1 below.

[0050]

[Table 1]

[0051] That is, because the first embodiment does not use a synthesis filter, the filtering sequence z_J(n) for synthesizing speech is 0, and the impulse response h(n) of the synthesis filter is also 0.

[0052] As for the sound power G_J, in the VSELP speech coder it is the power of the sound synthesized from the codebook, which is complicated to compute, whereas here it is simply the power of the dynamic codebook. This is the magnitude of c_J described above, that is, |c_J|², which amounts to 1 × 128 in the first embodiment.

[0053] The inner product C_J of the dynamic codebook and the original speech is the product of the squared magnitudes of r and c_J, and amounts to 40 × 128 in this embodiment as well.

[0054] However, because the first embodiment uses short-term prediction inverse filter 10, the filtering sequence count for this filter is obtained as 40 × 128. The conventional VSELP speech coder does not use an inverse filter, so this is 0.

[0055] Accordingly, comparing the amounts of computation of the VSELP speech coder and the speech coder of the first embodiment gives 13000 versus 5648; the amount of computation of the speech coder of this embodiment is less than half that of the conventional VSELP speech coder.

[0056] Next, a speech coder according to the second embodiment is described. FIG. 3 is a functional block diagram showing the schematic configuration of the speech coder of the second embodiment. In FIG. 3, this speech coder has three codebooks in total: a dynamic codebook 21, a first fixed codebook 22, and a second fixed codebook 23.

[0057] A speech signal S(n), sampled at a sampling frequency of, for example, f_s = 8 kHz and converted into a digital signal by an A/D converter (not shown), is input to input terminal 29. This speech signal S(n) is supplied to subtractor 30. Subtractor 30 is also supplied with the zero-input response from synthesis filter 33, to which a zero input is applied at terminal 34. This zero-input response is formed with the new LPC coefficients as the zero input passes through synthesis filter 33. Subtractor 30 subtracts the zero-input response from the speech signal S(n) and supplies the result, P(n), to short-term prediction inverse filter 31. A clear-filter signal that clears the filter state to 0 is supplied to short-term prediction inverse filter 31 from input terminal 32; after one frame of the input speech signal S(n) has been filtered, the filter state is cleared to 0 and initialized. The output of short-term prediction inverse filter 31 is referred to as the residual output r(n). This residual output r(n) is supplied to subtractor 28.

[0058] Here, the dynamic code vector is multiplied by a coefficient β in multiplier 24 and then supplied to adder 27. A code vector from first fixed codebook 22 is multiplied by a coefficient γ₁ in multiplier 25 and then supplied to adder 27. A code vector from second fixed codebook 23 is multiplied by a coefficient γ₂ in multiplier 26 and then supplied to adder 27. The output of adder 27 is supplied to subtractor 28 as a residual output r'(n).

[0059] Subtractor 28 supplies the difference between the residual output r(n) and the residual output r'(n) to energy calculation unit 35 as an output difference e(n). Energy calculation unit 35 calculates the energy of the output difference e(n) between the residual output r'(n) and the residual output r(n) from short-term prediction inverse filter 31. The second embodiment then encodes the current input speech signal by searching, via terminal 36, for the index J_opt of the dynamic code vector of dynamic codebook 21 that minimizes the energy of e(n), and extracting it.

[0060] Like the first embodiment, the second embodiment does not place the synthesis filter used in the conventional example in the stage preceding the subtractor. The subtractor 28 therefore only has to compare the residual outputs r'(n) corresponding to the 128 code vectors with the residual output r(n) from the short-term prediction inverse filter 31. In other words, only 128 comparison operations are performed for each of the 40 samples of data.

[0061] In the first embodiment, the filter state is left as it is after one frame has been filtered and the next frame is then processed; that is, the filter state is retained throughout and used repeatedly. In the second embodiment, by contrast, the filter state is cleared to 0 after one frame has been filtered, and only then is the next frame processed.
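The difference between retaining and clearing the filter state can be seen in a small sketch of a prediction-error (inverse) filter. The class name `InverseFilter`, the sign convention r(n) = s(n) − Σ a_k·s(n−1−k), and the first-order coefficient are all assumptions for illustration; the patent does not give the filter equations in this section.

```python
class InverseFilter:
    """FIR prediction-error filter: r(n) = s(n) - sum_k a[k] * s(n-1-k)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.state = [0.0] * len(self.coeffs)  # most recent past sample first

    def clear(self):
        """Reset the filter memory to zero (second-embodiment behaviour)."""
        self.state = [0.0] * len(self.coeffs)

    def process(self, frame):
        out = []
        for s in frame:
            pred = sum(a * p for a, p in zip(self.coeffs, self.state))
            out.append(s - pred)
            self.state = ([s] + self.state)[:len(self.coeffs)]
        return out


coeffs = [0.5]                       # hypothetical first-order predictor
frames = [[1.0, 1.0], [1.0, 1.0]]    # two identical toy frames

keep = InverseFilter(coeffs)         # state carried across frames
retained = [keep.process(f) for f in frames]

reset = InverseFilter(coeffs)        # state cleared after every frame
cleared = []
for f in frames:
    cleared.append(reset.process(f))
    reset.clear()

print(retained, cleared)
```

With the state retained, the first sample of the second frame is predicted from the previous frame; with the state cleared, every frame starts from the same zero memory.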

[0062] This is effective, for example, for achieving matching with the speech decoder that is used together with this speech coder in a communication system. Normally there is no problem on the encoder side, because the raw speech is input to it, but the decoder resynthesizes the speech without knowing the original raw speech. In that situation, holding the filter state throughout as in the first embodiment causes a problem: when the filter coefficients are updated while the internal state of the previous frame remains, adverse effects arise unless the zero-input response is subtracted from the input speech signal S(n). For this reason, a zero input is passed through the synthesis filter 33, the result is subtracted from the original speech signal, and the difference is input to the short-term prediction inverse filter 31.

[0063] In the second embodiment, the zero-input response is subtracted from the input speech to remove the influence of the previous frame, and the residual r(n) is then obtained by the short-term prediction inverse filter 31, so the subtraction result P(n) can be used effectively. Furthermore, because the internal state of the filter is updated with quantized data, exactly the same state as on the decoder side is reproduced, which makes it easier to match the encoder and the decoder.
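The zero-input response subtraction can be sketched as follows, assuming an all-pole synthesis filter of the form y(n) = x(n) + Σ a_k·y(n−1−k). That filter form, the function names `zero_input_response` and `remove_ringing`, and the numeric values are assumptions for illustration and are not taken from the patent.

```python
def zero_input_response(coeffs, past_outputs, length):
    """Output of an all-pole filter fed with zeros, starting from a given memory.

    coeffs       : new LPC coefficients a[0..p-1] (assumed sign convention)
    past_outputs : filter memory left by the previous frame, most recent first
    length       : number of samples to generate
    """
    state = list(past_outputs)[:len(coeffs)]
    zir = []
    for _ in range(length):
        y = sum(a * p for a, p in zip(coeffs, state))  # input x(n) is 0
        zir.append(y)
        state = ([y] + state)[:len(coeffs)]
    return zir


def remove_ringing(frame, zir):
    """P(n) = S(n) - zero-input response, as formed by the subtractor 30."""
    return [s - z for s, z in zip(frame, zir)]


zir = zero_input_response([0.5], [2.0], 3)   # hypothetical memory and order
p = remove_ringing([1.0, 1.0, 1.0], zir)
print(zir, p)
```

After this subtraction, P(n) no longer contains the part of the signal that the previous frame's filter memory would have produced, so the inverse filter can start the frame from a cleared state.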

[0064] Like the first embodiment, the second embodiment can keep the amount of computation to half or less. A detailed explanation of the reduction in computation is omitted.

[0065] Next, a speech coder according to a third embodiment of the present invention will be described. FIG. 4 is a functional block diagram showing the schematic configuration of the speech coder of the third embodiment. In FIG. 4, this speech coder has three codebooks in total: a dynamic codebook 41, a first fixed codebook 42, and a second fixed codebook 43.

[0066] A speech signal S(n), sampled at a sampling frequency of, for example, f_s = 8 kHz and converted into a digital signal by an A/D converter (not shown), is input to the input terminal 48. This speech signal S(n) is supplied to the short-term prediction inverse filter 49 and becomes the residual output r(n). The residual output r(n) is supplied to the subtractor 50.

[0067] Here, the dynamic code vector is multiplied by the coefficient β in the multiplier 44 and then supplied to the adder 47. The code vector of the first fixed codebook 42 is multiplied by the coefficient γ₁ in the multiplier 45 and then supplied to the adder 47. Further, the code vector of the second fixed codebook 43 is multiplied by the coefficient γ₂ in the multiplier 46 and then supplied to the adder 47. The output of the adder 47 is supplied to the subtractor 50 and the synthesis filter 52 as the residual output r'(n).

[0068] The subtractor 50 supplies the difference between the residual output r(n) and the residual output r'(n) to the energy calculation unit 51 as the output difference e(n). The energy calculation unit 51 calculates the energy of the output difference e(n). The third embodiment then searches for, and extracts from the terminal 55, for example, the index J_opt of the dynamic code vector in the dynamic codebook 41 that minimizes the energy of the output difference e(n).

[0069] Next, the code vector of this index J_opt and the code vectors in its vicinity are passed through the synthesis filter 52 to obtain the speech S'(n). The speech S'(n) is supplied to the subtractor 53, which is also supplied with the input speech S(n). The output of the subtractor 53 is supplied to the energy calculation unit 54 as the output difference E(n), and the energy calculation unit 54 calculates the energy of E(n). The third embodiment then encodes the current input speech signal by searching, from the terminal 55, for the optimum index J'_opt of the optimum dynamic code vector in the dynamic codebook 41 that minimizes the energy of the output difference E(n), and extracting that optimum index J'_opt.

[0070] As described above, the third embodiment first directly searches (first search step) for the index J_opt of the dynamic code vector that minimizes the energy of the output difference e(n) between the residual output r(n) from the short-term prediction inverse filter 49 and the residual output r'(n) corresponding to the dynamic code vector. It then supplies the code vector of this index J_opt and the code vectors in its vicinity to the synthesis filter 52 to synthesize the speech S'(n), indirectly searches (second search step) for the index J'_opt of the dynamic code vector in the dynamic codebook that minimizes the energy of the output difference E(n) between this speech S'(n) and the input speech S(n), and encodes the current input speech signal by extracting that index.

[0071] The third embodiment can therefore perform a rough index search in the first search step and then search the vicinity of the result rigorously in the second search step. Consequently, an accurate code vector index can be found even with a reduced amount of computation.
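A minimal sketch of this coarse-then-fine search is given below. It rests on several assumptions not stated in the patent: gains and fixed codebooks are omitted, "vicinity" is taken to mean neighbouring indices within a hypothetical `radius`, and the synthesis filter is an all-pole filter y(n) = x(n) + Σ a_k·y(n−1−k) started from zero state.

```python
def energy(v):
    return sum(x * x for x in v)


def diff(a, b):
    return [x - y for x, y in zip(a, b)]


def synthesize(coeffs, excitation):
    """All-pole synthesis from zero state (assumed filter form)."""
    state, out = [0.0] * len(coeffs), []
    for x in excitation:
        y = x + sum(a * p for a, p in zip(coeffs, state))
        out.append(y)
        state = ([y] + state)[:len(coeffs)]
    return out


def two_stage_search(s, r, codebook, coeffs, radius):
    """Return (J_opt, J'_opt) for input speech s and its residual r."""
    # First search step: direct comparison in the residual domain.
    j_opt = min(range(len(codebook)),
                key=lambda j: energy(diff(r, codebook[j])))
    # Second search step: synthesize only J_opt and its neighbours.
    lo, hi = max(0, j_opt - radius), min(len(codebook) - 1, j_opt + radius)
    j_opt2 = min(range(lo, hi + 1),
                 key=lambda j: energy(diff(s, synthesize(coeffs, codebook[j]))))
    return j_opt, j_opt2


# Toy data: s is the synthesis of r = [1.0, 1.0] with coefficient 0.9.
coeffs = [0.9]
s, r = [1.0, 1.9], [1.0, 1.0]
codebook = [[0.7, 1.27], [1.0, 0.62], [-1.0, -1.0]]
result = two_stage_search(s, r, codebook, coeffs, radius=1)
print(result)
```

In this toy case the residual-domain search selects index 1, while the speech-domain refinement over the neighbouring indices selects index 0, showing how the second step can correct the rough first choice while running the synthesis filter on only a few candidates.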

[0072]

[Effects of the Invention] The speech coding method according to the present invention analyzes and encodes the current input speech signal by utilizing its correlation with past speech signals. In this method, a detection step detects the analysis output of the past speech signal that has the shortest distance from the analysis output of the step of analyzing the current input speech signal, and an encoding step encodes the current input speech on the basis of the detection output of the detection step. The amount of computation can therefore be halved compared with the conventional method.

[0073] A speech coding method according to another invention searches a codebook in which analysis outputs of past speech signals are stored as a plurality of code vectors and performs coding by utilizing the correlation with the current input speech signal. In this method, a search step directly searches for the code vector in the codebook that has the shortest distance from the analysis output of the step of analyzing the current input speech signal, and an encoding step encodes the current input speech signal using the index of the code vector obtained in the search step. The amount of computation can therefore be halved compared with the conventional method.

[0074] A speech coding method according to yet another invention searches a codebook in which analysis outputs of past speech signals are stored as a plurality of code vectors and performs coding by utilizing the correlation with the current input speech signal. In this method, a first search step directly searches for the code vector in the codebook that has the shortest distance from the analysis output of the step of analyzing the current input speech signal; a second search step indirectly searches, among the code vector obtained in the first search step and the code vectors in its vicinity, for the code vector whose correlation with the input speech signal is optimal; and an encoding step encodes the current input speech signal using the index of the code vector obtained in the second search step. The amount of computation can therefore be reduced compared with the conventional method.

[Brief description of drawings]

FIG. 1 is a block diagram showing the schematic configuration of the speech coder of the first embodiment.

FIG. 2 is a characteristic diagram of the short-term prediction inverse filter used in the first embodiment.

FIG. 3 is a block diagram showing the schematic configuration of the speech coder of the second embodiment.

FIG. 4 is a block diagram showing the schematic configuration of the speech coder of the third embodiment.

FIG. 5 is a functional block diagram of a speech coder using VSELP.

[Explanation of symbols]

1: dynamic codebook
10: short-term prediction inverse filter
11: energy calculation unit

Claims

[Claims]

1. A speech coding method for analyzing and encoding a current input speech signal by utilizing a correlation with a past speech signal, the method comprising: a step of analyzing the current input speech signal; a detection step of detecting the analysis output of the past speech signal that has the shortest distance from the analysis output of the analyzing step; and an encoding step of encoding the current input speech on the basis of the detection output of the detection step.

2. A speech coding method that searches a codebook in which analysis outputs of past speech signals are stored as a plurality of code vectors and performs coding by utilizing a correlation with a current input speech signal, the method comprising: a step of analyzing the current input speech signal; a search step of directly searching for the code vector in the codebook that has the shortest distance from the analysis output of the input speech signal; and a step of encoding the current input speech signal using the index of the code vector obtained in the search step.

3. A speech coding method that searches a codebook in which analysis outputs of past speech signals are stored as a plurality of code vectors and performs coding by utilizing a correlation with a current input speech signal, the method comprising: a step of analyzing the current input speech signal; a first search step of directly searching for the code vector in the codebook that has the shortest distance from the analysis output of the input speech signal; a second search step of indirectly searching, among the code vector obtained in the first search step and the code vectors in its vicinity, for the code vector whose correlation with the input speech signal is optimal; and a step of encoding the current input speech signal using the index of the code vector obtained in the second search step.