JPH06202699A

JPH06202699A - Speech encoding device and speech decoding device, and speech encoding and decoding method

Info

Publication number: JPH06202699A
Application number: JP5240135A
Authority: JP
Inventors: Tadashi Yamaura; 正山浦; Masaya Takahashi; 真哉高橋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1992-09-29
Filing date: 1993-09-27
Publication date: 1994-07-22
Anticipated expiration: 2015-03-21
Also published as: JP3024455B2

Abstract

PURPOSE: To prevent, with only a small increase in the amount of transmitted information, the loss of sound quality caused by disturbed periodicity in synthesized speech, in speech encoding and decoding devices that separate an input speech signal into spectral envelope information and excitation signal information and vector-quantize the excitation signal. CONSTITUTION: The encoding side is provided with periodic pulse excitation generating means 21, which generates an excitation vector consisting of a periodic pulse train uniquely determined by an adaptive excitation vector output from an adaptive excitation codebook that holds the excitation signal obtained in the preceding frame, and with multi-vector adaptive excitation encoding means 20, which generates an excitation signal by applying individual gains to the adaptive excitation vectors and to their corresponding periodic pulse excitation vectors, and which determines and encodes the pitch information and gains so that the resulting synthesized speech has minimum distortion relative to the input speech. The decoding side is provided with periodic pulse excitation generating means 23, identical to that on the encoding side, and with multi-vector adaptive excitation decoding means 22, which generates the adaptive excitation vectors and periodic pulse excitation vectors from the transmitted codes.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech encoding/decoding device that is used, for example, when speech is digitally transmitted or stored, and that vector-quantizes the excitation signal information of the speech.

[0002]

[Prior Art] FIG. 10 shows a conventional speech encoding/decoding device that separates speech into spectral envelope information and excitation signal information and vector-quantizes the excitation signal information. FIG. 10 is similar to the device described in W. B. Kleijn, D. J. Krasinski, and R. H. Ketchum, "Improved Speech Quality and Efficient Vector Quantization in SELP" (ICASSP '88, pp. 155-158, 1988). In the figure, 1 is an encoding unit, 2 is a decoding unit, 3 is a transmission path, 4 is input speech, and 5 is output speech. 6 is spectrum analysis means, 7 is spectrum parameter encoding means, 8 is adaptive excitation encoding means, and 9 and 14 are adaptive excitation codebooks. 10 is driving excitation encoding means, and 11 and 16 are driving excitation codebooks. 12 and 17 are excitation vector generation means, 13 is adaptive excitation decoding means, 15 is driving excitation decoding means, 18 is spectrum parameter decoding means, and 19 is a synthesis filter.

[0003] The operation of the conventional speech encoding/decoding device is described below, starting with the encoding unit 1. The spectrum parameter analysis means 6 analyzes the input speech 4 and extracts spectrum parameters. The spectrum parameter encoding means 7 quantizes the spectrum parameters, outputs the corresponding code to the decoding unit 2 via the transmission path 3, and outputs the quantized spectrum parameters to the adaptive excitation encoding means 8 and the driving excitation encoding means 10. The adaptive excitation codebook 9 stores the excitation signal obtained in the preceding frame and, using pitch information as the code, outputs an adaptive excitation vector obtained by repeating that excitation signal at the pitch period within the current frame. FIG. 11 shows an example of the excitation signal in the preceding frame and the adaptive excitation vector obtained from it. When the pitch period is shorter than the frame length, the adaptive excitation vector is the excitation signal repeated at the pitch period within the current frame, as shown in FIG. 11(b).
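The pitch-period repetition described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the list-based buffer, and the choice of the most recent samples as the repeated cycle are assumptions made for the example.

```python
from typing import List


def adaptive_excitation_vector(
    past_excitation: List[float], pitch_period: int, frame_len: int
) -> List[float]:
    """Build one adaptive excitation vector by pitch-period repetition.

    past_excitation: excitation samples of preceding frames, oldest first.
    pitch_period:    candidate pitch period in samples (the transmitted code).
    frame_len:       number of samples in the current frame.
    """
    if not 1 <= pitch_period <= len(past_excitation):
        raise ValueError("pitch_period must lie within the stored excitation")
    # Assumed convention: the most recent `pitch_period` samples form one cycle.
    cycle = past_excitation[-pitch_period:]
    # Repeat the cycle until the frame is filled (the FIG. 11(b) case when
    # pitch_period < frame_len); otherwise only its first samples are used.
    return [cycle[n % pitch_period] for n in range(frame_len)]
```

Calling this once per candidate pitch period yields the M candidate vectors a_i that the encoder then evaluates.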

[0004] The adaptive excitation encoding means 8 synthesizes M synthesized speech vectors A_i (i = 1, ..., M) using the M adaptive excitation vectors a_i (i = 1, ..., M) input from the adaptive excitation codebook 9 and the spectrum parameters input from the spectrum parameter encoding means 7. It then obtains the inter-vector distance D_i between each A_i and the input speech vector X, which is cut out from the input speech 4 frame by frame, for example according to equation (1). Here, the gain β_i is determined so that the distance D_i is minimized, for example according to equation (2). (Note that t denotes transposition.)

[0005]

[Equation 1]
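The equation image is not reproduced in this text. As a hedged reading only, a form consistent with the surrounding description (a distance D_i between X and the gain-scaled A_i, a gain β_i that minimizes it, and the transpose t) would be the following, assuming a squared Euclidean distance:

```latex
D_i = \lVert X - \beta_i A_i \rVert^{2} \tag{1}

\beta_i = \frac{X^{t} A_i}{A_i^{t} A_i} \tag{2}
```

Under that assumption, (2) follows from setting the derivative of (1) with respect to β_i to zero.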

[0006] Next, the vector A_I that minimizes this inter-vector distance is searched for, and the code I and the code of the gain β_qI, obtained by quantizing the gain β_I, are output to the decoding unit 2 via the transmission path 3. The selected adaptive excitation vector a_I and its gain β_qI are output to the excitation vector generation means 12, and the error vector X' = X − β_qI A_I is output to the driving excitation encoding means 10.
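The search just described can be sketched as below. The sketch assumes a squared-error distance and takes the synthesized vectors A_i as already computed; `search_codebook`, `dot`, and the identity default for `quantize_gain` are hypothetical names introduced for illustration, not taken from the patent.

```python
from typing import Callable, List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def search_codebook(
    x: Sequence[float],
    synthesized: Sequence[Sequence[float]],
    quantize_gain: Callable[[float], float] = lambda g: g,  # placeholder quantizer
) -> Tuple[int, float, List[float]]:
    """Return (index I, quantized gain, error vector X') for the best candidate."""
    best_index, best_gain, best_dist = -1, 0.0, float("inf")
    for i, a in enumerate(synthesized):
        energy = dot(a, a)
        if energy == 0.0:
            continue  # an all-zero candidate cannot reduce the distance
        gain = dot(x, a) / energy            # gain that minimizes the distance
        dist = dot(x, x) - gain * dot(x, a)  # equals ||x - gain * a||^2
        if dist < best_dist:
            best_index, best_gain, best_dist = i, gain, dist
    if best_index < 0:
        raise ValueError("no usable candidate vector")
    q_gain = quantize_gain(best_gain)
    best = synthesized[best_index]
    error = [xi - q_gain * ai for xi, ai in zip(x, best)]
    return best_index, q_gain, error
```

The same routine fits the driving excitation search, with the error vector X' in place of X and the vectors C_j in place of A_i.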

[0007] The driving excitation codebook 11 stores N driving excitation vectors generated, for example, from random noise. The driving excitation encoding means 10 synthesizes N synthesized speech vectors C_j (j = 1, ..., N) using the driving excitation vectors c_j (j = 1, ..., N) input from the driving excitation codebook 11 and the spectrum parameters input from the spectrum parameter encoding means. It then obtains the inter-vector distance D_j between each C_j and the error vector X' input from the adaptive excitation encoding means 8, for example according to equation (3). Here, the gain γ_j is determined so that the distance D_j is minimized, for example according to equation (4).

[0008]

[Equation 2]
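The equation image is not reproduced in this text. As a hedged reading only, a form consistent with the description of equations (3) and (4), again assuming a squared Euclidean distance, would be:

```latex
D_j = \lVert X' - \gamma_j C_j \rVert^{2} \tag{3}

\gamma_j = \frac{X'^{\,t} C_j}{C_j^{t} C_j} \tag{4}
```

Here X' is the error vector passed on from the adaptive excitation search, and (4) is the minimizer of (3) under the stated assumption.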

[0009] Next, the vector C_J that minimizes this inter-vector distance is searched for, and the code J and the code of the gain γ_qJ, obtained by quantizing the gain γ_J, are output to the decoding unit 2 via the transmission path 3. The selected driving excitation vector c_J and its gain γ_qJ are output to the excitation vector generation means 12.

[0010] The excitation vector generation means 12 generates the excitation vector β_qI a_I + γ_qJ c_J from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation encoding means 8 and the driving excitation vector c_J and its gain γ_qJ input from the driving excitation encoding means 10, and outputs it to the adaptive excitation codebook 9.
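The excitation generation and the feedback into the adaptive excitation codebook can be sketched as follows. The fixed-length history buffer is an assumption made for the example; the patent text only states that the generated excitation vector is output to the codebook.

```python
from typing import List, Sequence


def generate_excitation(
    a: Sequence[float], beta_q: float, c: Sequence[float], gamma_q: float
) -> List[float]:
    """Excitation vector beta_q * a + gamma_q * c, sample by sample."""
    if len(a) != len(c):
        raise ValueError("vectors must have the same length")
    return [beta_q * ai + gamma_q * ci for ai, ci in zip(a, c)]


def update_adaptive_codebook(
    history: List[float], excitation: List[float]
) -> List[float]:
    """Append the new frame and keep the history at its original length
    (assumed buffering policy: the oldest samples are discarded)."""
    combined = history + excitation
    return combined[len(combined) - len(history):]
```

Because the decoder builds the same excitation from the same codes, running the same update on both sides keeps the two adaptive excitation codebooks identical.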

[0011] Next, the operation of the decoding unit 2 is described. Based on the code I of the adaptive excitation vector input from the encoding unit 1, the adaptive excitation decoding means 13 reads the adaptive excitation vector a_I from the adaptive excitation codebook 14, which stores the same adaptive excitation vectors as the adaptive excitation codebook 9. It also decodes the gain β_qI from the code of the gain for the adaptive excitation vector input from the encoding unit 1, and outputs the adaptive excitation vector a_I and its gain β_qI to the excitation vector generation means 17. Likewise, based on the code J of the driving excitation vector input from the encoding unit 1, the driving excitation decoding means 15 reads the driving excitation vector c_J from the driving excitation codebook 16, which stores the same driving excitation vectors as the driving excitation codebook 11. It also decodes the gain γ_qJ from the code of the gain for the driving excitation vector input from the encoding unit 1, and outputs the driving excitation vector c_J and its gain γ_qJ to the excitation vector generation means 17.

[0012] The excitation vector generation means 17 generates the excitation vector β_qI a_I + γ_qJ c_J from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation decoding means 13 and the driving excitation vector c_J and its gain γ_qJ input from the driving excitation decoding means 15, and outputs it to the adaptive excitation codebook 14 and the synthesis filter 19.

[0013] The spectrum parameter decoding means 18 decodes the spectrum parameters based on the code of the spectrum parameters input from the encoding unit 1 and outputs them to the synthesis filter 19. The synthesis filter 19 synthesizes the output speech 5 using the excitation vector input from the excitation vector generation means 17 and the spectrum parameters input from the spectrum parameter decoding means 18.

[0014]

[Problems to be Solved by the Invention] Speech signals consist of voiced and unvoiced sounds, which are characterized by whether the excitation signal has pulse-like components at the pitch period or is white noise with no periodicity. How well the pitch periodicity of voiced sounds is reproduced has a large effect on the quality of the synthesized speech. In the conventional speech encoding/decoding device described above, the pulse-like components at the pitch period of the excitation signal in voiced sounds were generated by using the adaptive excitation vector formed from the excitation signal of the preceding frame. However, because the adaptive excitation vector consists of multiple samples, it cannot actively generate the pulse-like components needed for voiced sounds. As a result, an appropriate excitation cannot be generated in transitions from unvoiced to voiced sounds, and the sound quality deteriorates. Moreover, even when pulse-like components are generated, the excitation is generated together with the other samples, so the optimum pulse sequence is not necessarily obtained; this leads to a loss of pitch periodicity and a deterioration in sound quality.

[0015] In addition, the excitation signal is generated as a linear sum of the adaptive excitation vector and a driving excitation vector that does not take pitch periodicity into account. Even if the adaptive excitation vector produces pitch periodicity, reducing the amount of transmitted information requires lengthening the frame length, that is, the vector length, or reducing the size of the driving excitation codebook. In that case the pitch periodicity of the generated excitation signal is disturbed and the sound quality deteriorates.

[0016] Furthermore, the conventional speech encoding/decoding device quantizes the gain for the driving excitation vector regardless of the power of the input speech. Because the range of gains to be quantized is large, quantizing with a small number of bits increases the quantization distortion and degrades the quality of the synthesized speech. Changing the quantization of the driving excitation vector gain according to the power of the input speech would require transmitting power information in addition, which is inefficient.

[0017] The present invention has been made to solve these problems. In addition to the adaptive excitation vector, it generates an excitation vector consisting of a periodic pulse train that can be determined uniquely from that adaptive excitation vector and uses it to generate the excitation signal, and it uses the driving excitation vector repeatedly in synchronization with the pitch period, so that the pitch periodicity of the excitation signal is little disturbed even with a small amount of transmitted information. It also changes the quantization of the driving excitation vector gain according to the power information of the adaptive excitation signal, so that the quantization distortion of the excitation signal is small even with a small amount of transmitted information. The object is thereby to synthesize high-quality decoded speech.

[0018]

[Means for Solving the Problems] A speech encoding device according to the present invention comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; periodic pulse excitation generating means that generates a periodic pulse excitation vector determined as the principal component of each adaptive excitation vector in the adaptive excitation codebook; and multi-vector adaptive excitation encoding means that determines the adaptive excitation vector, the periodic pulse excitation vector corresponding to that adaptive excitation vector, and the gains for the adaptive excitation vector and the periodic pulse excitation vector that minimize the distortion between the input speech and the decoded speech generated with the linear sum of the adaptive excitation vector and its corresponding periodic pulse excitation vector as the excitation signal, and that encodes the adaptive excitation vector and the gains for the adaptive excitation vector and the periodic pulse excitation vector.
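One way to picture the periodic pulse excitation vector and the two gains is sketched below. The text here does not say how the "principal component" is extracted, so the peak-picking rule in `periodic_pulse_vector` is purely an assumption for illustration, as are all function names. `joint_gains` assumes a squared-error distortion and would be applied to the synthesized (filtered) versions of the two vectors.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def periodic_pulse_vector(adaptive: Sequence[float], pitch_period: int) -> List[float]:
    """Derive a pulse train from an adaptive excitation vector.

    Assumed rule (not stated in the source): take the largest-magnitude sample
    in the first pitch cycle, then place unit pulses with its sign at that
    position and at every following pitch period.
    """
    if pitch_period < 1 or not adaptive:
        raise ValueError("need a non-empty vector and a positive pitch period")
    first_cycle = adaptive[:pitch_period]
    peak = max(range(len(first_cycle)), key=lambda n: abs(first_cycle[n]))
    sign = -1.0 if first_cycle[peak] < 0 else 1.0
    pulses = [0.0] * len(adaptive)
    for n in range(peak, len(adaptive), pitch_period):
        pulses[n] = sign
    return pulses


def joint_gains(
    x: Sequence[float], a: Sequence[float], p: Sequence[float]
) -> Tuple[float, float]:
    """Gains (g_a, g_p) minimizing ||x - g_a * a - g_p * p||^2 (normal equations)."""
    aa, pp, ap = dot(a, a), dot(p, p), dot(a, p)
    xa, xp = dot(x, a), dot(x, p)
    det = aa * pp - ap * ap
    if det == 0.0:
        raise ValueError("vectors are linearly dependent")
    return (xa * pp - xp * ap) / det, (xp * aa - xa * ap) / det
```

Because the pulse train is derived from the adaptive excitation vector alone, a decoder applying the same rule needs no extra position code; only the additional gain is transmitted.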

[0019] A decoding device according to the invention of claim 2 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; periodic pulse excitation generating means that generates a periodic pulse excitation vector determined as the principal component of the decoded adaptive excitation vector; and multi-vector adaptive excitation decoding means that decodes the adaptive excitation vector and the gains for the adaptive excitation vector and its corresponding periodic pulse excitation vector, applies the respective gains to the adaptive excitation vector and the periodic pulse excitation vector, and adds them.

[0020] An encoding device according to the invention of claim 3 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; adaptive excitation encoding means that determines the adaptive excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the adaptive excitation vector as the excitation signal, and encodes the adaptive excitation vector and its gain; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch position extraction means that extracts, as pitch positions, a plurality of feature points spaced at pitch period intervals in the input speech, and encodes the pitch positions; pitch synchronization means that aligns a predetermined position of each driving excitation vector in the driving excitation codebook with the pitch positions and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period; and driving excitation encoding means that determines the pitch-synchronized driving excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector as the excitation signal, and encodes the driving excitation vector corresponding to that pitch-synchronized driving excitation vector and the gain for the pitch-synchronized driving excitation vector.
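The pitch synchronization step can be pictured with the sketch below. It is one possible reading, not the patent's stated procedure: it assumes the repeated unit is the first pitch period of the driving excitation vector and that the "predetermined position" is an index `anchor` inside that unit. Both choices, and the function name, are hypothetical.

```python
from typing import List, Sequence


def pitch_synchronized_vector(
    driving: Sequence[float],
    pitch_positions: Sequence[int],
    pitch_period: int,
    frame_len: int,
    anchor: int = 0,
) -> List[float]:
    """Repeat a driving excitation vector at the pitch period so that its
    sample `anchor` falls on every pitch position in the frame."""
    if pitch_period < 1 or len(driving) < pitch_period:
        raise ValueError("driving vector must cover one pitch period")
    if not 0 <= anchor < pitch_period:
        raise ValueError("anchor must lie inside one pitch period")
    if not pitch_positions:
        raise ValueError("at least one pitch position is required")
    # The positions are spaced one pitch period apart, so the first one
    # fixes the phase of the whole pulse grid.
    first = pitch_positions[0]
    cycle = driving[:pitch_period]
    return [cycle[(n - first + anchor) % pitch_period] for n in range(frame_len)]
```

With `anchor=0`, the first sample of the driving excitation vector lands on each pitch position, so the periodic structure of the added excitation follows that of the speech.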

[0021] An encoding device according to the invention of claim 4 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; adaptive excitation encoding means that determines the adaptive excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the adaptive excitation vector as the excitation signal, and encodes the adaptive excitation vector and its gain; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means that takes a plurality of points spaced at pitch period intervals as pitch positions, aligns a predetermined position of each driving excitation vector in the driving excitation codebook with the pitch positions, and generates a pitch-synchronized driving excitation vector by repeating the driving excitation vector at the pitch period; and driving excitation encoding means that determines the pitch-synchronized driving excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector as the excitation signal, and encodes the driving excitation vector corresponding to that pitch-synchronized driving excitation vector, the pitch positions, and the gain for the pitch-synchronized driving excitation vector.

[0022] A decoding device according to the invention of claim 5 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; adaptive excitation decoding means that decodes the adaptive excitation vector and the gain for that adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means that decodes the pitch positions, aligns a predetermined position of the decoded driving excitation vector with the pitch positions, and generates a pitch-synchronized driving excitation vector by repeating the driving excitation vector at the pitch period; driving excitation decoding means that decodes the driving excitation vector and the gain for the pitch-synchronized driving excitation vector; and excitation vector generation means that decodes the excitation vector from the adaptive excitation vector, the gain for the adaptive excitation vector, the pitch-synchronized driving excitation vector, and the gain for the pitch-synchronized driving excitation vector.

[0023] An encoding device according to the invention of claim 6 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; pitch position extraction means that extracts, as pitch positions, a plurality of feature points spaced at pitch period intervals in each adaptive excitation vector in the adaptive excitation codebook; adaptive excitation encoding means that determines the adaptive excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the adaptive excitation vector as the excitation signal, and encodes the adaptive excitation vector and its gain; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means that aligns a predetermined position of each driving excitation vector in the driving excitation codebook with the pitch positions obtained by the pitch position extraction means and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period; and driving excitation encoding means that determines the pitch-synchronized driving excitation vector and its gain that minimize the distortion between the input speech and the decoded speech generated with the linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector as the excitation signal, and encodes the driving excitation vector corresponding to that pitch-synchronized driving excitation vector and the gain for the pitch-synchronized driving excitation vector.

[0024] A decoding device according to the invention of claim 7 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of the preceding frame; adaptive excitation decoding means that decodes the adaptive excitation vector and the gain for that adaptive excitation vector; pitch position extraction means that extracts, as pitch positions, a plurality of feature points spaced at pitch period intervals in the decoded adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means that aligns a predetermined position of the decoded driving excitation vector with the pitch positions obtained by the pitch position extraction means and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period; driving excitation decoding means that decodes the driving excitation vector and the gain for the pitch-synchronized driving excitation vector; and excitation vector generation means that decodes the excitation vector from the adaptive excitation vector, the gain for the adaptive excitation vector, the pitch-synchronized driving excitation vector, and the gain for the pitch-synchronized driving excitation vector.

[0025] An encoding/decoding method according to the invention of claim 8 provides a first adaptive excitation codebook, which stores adaptive excitation vectors that are information on the excitation signal of the preceding frame, on the transmitting side and a corresponding second adaptive excitation codebook on the receiving side, and provides a first driving excitation codebook, which stores a plurality of excitation signals prepared in advance as driving excitation vectors, on the transmitting side and a corresponding second driving excitation codebook on the receiving side. On the transmitting side, the method selects the optimum adaptive excitation vector for the input speech from the first adaptive excitation codebook and determines its gain; extracts a plurality of pitch positions from the input speech or from the adaptive excitation vector; creates pitch-synchronized driving excitation vectors, each made periodic by aligning a predetermined position of a driving excitation vector in the driving excitation codebook with the pitch positions, selects the optimum pitch-synchronized driving excitation vector, and determines its gain; and encodes and transmits the adaptive excitation vector, the pitch-synchronized driving excitation vector, and the gains for the adaptive excitation vector and the pitch-synchronized driving excitation vector. On the receiving side, the method uses the received codes to generate the adaptive excitation vector from the second adaptive excitation codebook and to decode the gain of the adaptive excitation vector; extracts a plurality of pitch positions from the received codes or from the adaptive excitation vector; generates the driving excitation vector from the second driving excitation codebook; creates the pitch-synchronized driving excitation vector, made periodic by aligning a predetermined position of the driving excitation vector with the pitch positions, and decodes the gain of the pitch-synchronized driving excitation vector; and decodes the excitation vector from the adaptive excitation vector, the pitch-synchronized driving excitation vector, and their respective gains.

【００２６】The encoding apparatus according to the invention of claim 9 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of preceding frames; adaptive excitation encoding means that finds the adaptive excitation vector, and the gain for it, that minimize the distortion between the input speech and the decoded speech generated using the adaptive excitation vector as the excitation signal, and that encodes this adaptive excitation vector and its gain; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; control means that selects and designates, from a plurality of gain quantization tables prepared in advance, the gain quantization table to be used for the driving excitation vector, based on power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain; and driving excitation encoding means that finds the driving excitation vector, and the gain for it, that minimize the distortion between the input speech and the decoded speech generated using the linear sum of the adaptive excitation vector and the driving excitation vector as the excitation signal, and that encodes this driving excitation vector and its gain using the designated gain quantization table.

【００２７】The decoding apparatus according to the invention of claim 10 comprises: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of preceding frames; adaptive excitation decoding means that decodes the adaptive excitation vector and the gain for it; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; control means that selects and designates, from a plurality of gain quantization tables prepared in advance, the gain quantization table to be used for the driving excitation vector, based on power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain; and driving excitation decoding means that decodes the driving excitation vector and the gain for it using the designated gain quantization table.

【００２８】The encoding and decoding method according to the invention of claim 11 provides, on the transmitting side, a first adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of preceding frames, and, on the receiving side, a corresponding second adaptive excitation codebook. It likewise provides, on the transmitting side, a first driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors, and, on the receiving side, a corresponding second driving excitation codebook. On the transmitting side, the optimum adaptive excitation vector for the input speech is selected from the first adaptive excitation codebook and its gain is determined; the gain quantization table for the driving excitation vector is changed according to power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain; the optimum driving excitation vector is selected from the first driving excitation codebook and its gain is determined; and the adaptive excitation vector, the driving excitation vector, and the gains for both are encoded and transmitted. On the receiving side, an adaptive excitation vector is generated from the second adaptive excitation codebook according to the received code and its gain is decoded; the gain quantization table for the driving excitation vector is changed according to power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain; a driving excitation vector is generated from the second driving excitation codebook and its gain is decoded; and the excitation vector is decoded from the adaptive excitation vector, the driving excitation vector, and their respective gains.

【００２９】[0029]

【Operation】In the speech encoding apparatus of this invention, a periodic pulse excitation vector is newly generated for each adaptive excitation vector, the two are each multiplied by a gain and combined, and the vectors and gains that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 2, an excitation vector having periodicity is generated from the transmitted code and gain, and the excitation vector is decoded from it and the adaptive excitation vector. In the speech encoding apparatuses of the inventions of claims 3 and 4, the driving excitation vector is repeated periodically, and the vector and gain that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 5, the driving excitation vector is generated periodically according to the transmitted code, and the excitation vector is decoded from it and the adaptive excitation vector. In the speech encoding apparatus of the invention of claim 6, a feature point determined by the pitch period in the adaptive excitation vector is extracted as the pitch position, the driving excitation vector is repeated periodically on that basis, and the vector and gain that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 7, the pitch position is extracted from the transmitted code, the received spectrum parameters, and the gain; the driving excitation vector is generated periodically on that basis; and the excitation vector is decoded from it and the adaptive excitation vector.

【００３０】In the speech encoding and decoding method of the invention of claim 8, the pitch is extracted from the adaptive excitation vector in the frame, and a pitch-synchronized vector synchronized with this pitch is generated, encoded, and transmitted together with the adaptive excitation vector code; on the receiving side, a pitch-synchronized vector synchronized with the pitch is generated and added to the adaptive excitation vector to decode the excitation vector. In the speech encoding apparatus of the invention of claim 9, power information of the adaptive excitation signal is extracted from the adaptive excitation vector and its gain, the gain quantization table for the driving excitation vector is changed accordingly, and the vector and gain that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 10, power information of the adaptive excitation signal is extracted from the transmitted adaptive excitation vector code and its gain, the gain quantization table for the driving excitation vector is changed accordingly, the gain for the driving excitation vector is decoded with that table, and the excitation vector is decoded using this gain. In the speech encoding and decoding method of the invention of claim 11, power information of the adaptive excitation signal is extracted from the adaptive excitation vector and its gain, the gain quantization table for the driving excitation vector is changed accordingly, and the vector and gain that minimize the distortion relative to the input speech are selected, encoded, and transmitted; on the receiving side, the gain quantization table for the driving excitation vector is changed according to the power information of the adaptive excitation signal, the gain for the driving excitation vector is decoded with that table, and the excitation vector is decoded using this gain.
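The table-switching idea of claims 9 to 11 can be sketched as follows. This is only an illustration under stated assumptions: the power thresholds, the table contents, and the function names are hypothetical, and the text does not say how many tables exist or how power maps to a table. The point it shows is that encoder and decoder can both derive the table index from the already-decoded adaptive excitation (vector and quantized gain), so no extra bits are needed to signal the choice.

```python
def adaptive_power(adaptive_vector, gain):
    """Mean power of the adaptive excitation signal gain * a."""
    return sum((gain * s) ** 2 for s in adaptive_vector) / len(adaptive_vector)


def select_gain_table(power, thresholds, tables):
    """Pick the table of the first power band containing `power`.

    `thresholds` must be ascending and len(tables) == len(thresholds) + 1.
    """
    for k, limit in enumerate(thresholds):
        if power < limit:
            return k, tables[k]
    return len(thresholds), tables[-1]


def quantize_gain(gain, table):
    """Return (code, quantized gain) for the nearest table entry."""
    code = min(range(len(table)), key=lambda n: abs(table[n] - gain))
    return code, table[code]


# Hypothetical values, chosen only to make the example concrete.
THRESHOLDS = [0.01, 1.0]
TABLES = [
    [0.0, 0.5, 1.0, 2.0],    # used when the adaptive excitation is weak
    [0.0, 0.25, 0.5, 1.0],   # used for intermediate power
    [0.0, 0.1, 0.2, 0.4],    # used when the adaptive excitation is strong
]
```

Because the decoder repeats `adaptive_power` and `select_gain_table` on the same decoded quantities, it recovers the same table before looking up the transmitted gain code.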

【００３１】[0031]

【Embodiments】

Embodiment 1. FIG. 1 is a block diagram of an embodiment of the invention of claim 1. Parts in FIG. 1 that are the same as those in FIG. 10 are given the same reference numerals, and their description is omitted. The new parts in FIG. 1 are the multiple-vector adaptive excitation encoding means 20, the periodic pulse excitation generating means 21 and 23, and the multiple-vector adaptive excitation decoding means 22. The division of roles between the vocal-tract part, consisting of the spectrum analysis means 6 and the spectrum parameter encoding means 7, and the vocal-cord part, consisting of the multiple-vector adaptive excitation encoding means 20, the driving excitation encoding means 10, and the excitation vector generating means 12, is the same as in the conventional example. This embodiment, however, improves the vocal-cord part, in particular its pitch periodicity, and thereby improves the quality of the synthesized speech.

【００３２】The operation of this embodiment is described below, beginning with the encoding unit 1. The periodic pulse excitation generating means 21 takes the adaptive excitation vector a_i input from the adaptive excitation codebook 9 and generates from it a periodic pulse excitation vector p_i consisting of a periodic unit impulse train, in which only the dimensions where a_i takes its maximum amplitude are 1 and all other dimensions are 0. It outputs p_i to the multiple-vector adaptive excitation encoding means 20. The adaptive excitation codebook 9 stores the sampled sequence values input up to the previous frame, and from these the multiple-vector adaptive excitation encoding means 20 selects the vector that produces the synthesized speech closest to the input speech of the current frame. The vectors stored in the adaptive excitation codebook 9, however, do not preserve periodicity sufficiently. Newly creating the periodic pulse excitation vector p_i reinforces the periodicity.

【００３３】FIG. 2 shows an example of an adaptive excitation vector (a) and the periodic pulse excitation vector (b) generated from it in this embodiment. That is, among the many stored vector elements, those with the maximum amplitude are normalized to unit amplitude and output. This makes it possible to emphasize the periodicity that is important for speech reproduction.
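The generation of a periodic pulse excitation vector from an adaptive excitation vector can be sketched as below. This is a minimal illustration, not the patented implementation: it assumes the pitch period is known as an integer number of samples, and it places unit pulses at the largest-magnitude sample and at every pitch-period multiple away from it inside the frame. The function name and parameters are hypothetical.

```python
def periodic_pulse_vector(adaptive_vector, pitch_period):
    """Unit impulse train aligned with the largest-magnitude sample.

    Pulses are placed at the peak position and at every multiple of
    `pitch_period` before and after it that falls inside the frame.
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    n = len(adaptive_vector)
    # Index of the sample with the maximum absolute amplitude.
    peak = max(range(n), key=lambda k: abs(adaptive_vector[k]))
    first = peak % pitch_period          # earliest pulse position in the frame
    pulses = [0.0] * n
    for k in range(first, n, pitch_period):
        pulses[k] = 1.0
    return pulses
```

For an eight-sample frame with a pitch period of four samples and a peak at index 1, the result has pulses at indices 1 and 5, which corresponds to the two-pulse case of FIG. 2(b).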

【００３４】The multiple-vector adaptive excitation encoding means 20 uses the spectrum parameters input from the spectrum parameter encoding means 7 to synthesize the synthesized speech vectors A_i and P_i from, respectively, the adaptive excitation vector a_i input from the adaptive excitation codebook 9 and the periodic pulse excitation vector p_i input from the periodic pulse excitation generating means 21. It then obtains the inter-vector distance D_i from the input speech vector X, cut out of the input speech 4 frame by frame, according to, for example, equation (5). Here the gains β_Ai and β_Pi are determined so that the distance D_i is minimized, according to, for example, equations (6) and (7).

【００３５】[0035]

【数３】 [Equation 3]
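The equation image for equations (5) to (7) is not reproduced in this text. One form consistent with the surrounding description, assuming a squared Euclidean distance and gains obtained by setting the partial derivatives of D_i to zero, is sketched below; the assignment of equation numbers and the exact notation are assumptions.

```latex
\begin{align}
D_i &= \left\lVert X - \beta_{Ai} A_i - \beta_{Pi} P_i \right\rVert^2 \tag{5} \\
\beta_{Ai} &= \frac{(X \cdot A_i)(P_i \cdot P_i) - (X \cdot P_i)(A_i \cdot P_i)}
                   {(A_i \cdot A_i)(P_i \cdot P_i) - (A_i \cdot P_i)^2} \tag{6} \\
\beta_{Pi} &= \frac{(X \cdot P_i)(A_i \cdot A_i) - (X \cdot A_i)(A_i \cdot P_i)}
                   {(A_i \cdot A_i)(P_i \cdot P_i) - (A_i \cdot P_i)^2} \tag{7}
\end{align}
```

Equations (6) and (7) are the solution of the 2×2 normal equations for the least-squares fit of X by A_i and P_i.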

【００３６】Next, the pair of vectors A_I and P_I that minimizes this inter-vector distance is searched for, and its code I and the codes of the gains β_qAI and β_qPI, obtained by quantizing the gains β_AI and β_PI, are output to the decoding unit 2 via the transmission path 3. The selected adaptive excitation vector a_I and its gain β_qAI, and the periodic pulse excitation vector p_I and its gain β_qPI, are output to the excitation vector generating means 12, and the error vector X' = X - β_qAI A_I - β_qPI P_I is output to the driving excitation encoding means 10. What is new compared with the conventional technique is that the single vector A_I is replaced by the pair of vectors A_I and P_I. The operation of the driving excitation encoding means 10, which further quantizes, encodes, and transmits the vector difference, is the same in this embodiment as in the conventional example; it transmits the codes of J and γ_qJ to the decoding unit 2.
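The search for the pair A_I, P_I can be sketched as follows. This is an illustration under assumptions: the candidates are taken to be already-synthesized vector pairs, the distance is a squared Euclidean distance, both gains are solved jointly by least squares, and gain quantization is left out. All names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def optimal_gains(x, a_syn, p_syn):
    """Least-squares gains for x ~ beta_a * a_syn + beta_p * p_syn."""
    aa, pp, ap = dot(a_syn, a_syn), dot(p_syn, p_syn), dot(a_syn, p_syn)
    xa, xp = dot(x, a_syn), dot(x, p_syn)
    det = aa * pp - ap * ap
    if abs(det) < 1e-12:                 # vectors (nearly) parallel or zero
        beta_a = xa / aa if aa > 0 else 0.0
        return beta_a, 0.0               # fall back to a single gain
    return (xa * pp - xp * ap) / det, (xp * aa - xa * ap) / det


def distance(x, a_syn, p_syn, beta_a, beta_p):
    """Squared error between x and the gain-weighted sum."""
    return sum((xi - beta_a * ai - beta_p * pi) ** 2
               for xi, ai, pi in zip(x, a_syn, p_syn))


def search_pair(x, candidates):
    """candidates: list of (a_syn, p_syn). Returns (index, beta_a, beta_p, dist)."""
    best = None
    for i, (a_syn, p_syn) in enumerate(candidates):
        beta_a, beta_p = optimal_gains(x, a_syn, p_syn)
        d = distance(x, a_syn, p_syn, beta_a, beta_p)
        if best is None or d < best[3]:
            best = (i, beta_a, beta_p, d)
    return best
```

The error vector passed on to the driving excitation stage would then be the residual of the selected pair, computed with the quantized rather than the unquantized gains.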

【００３７】Next, the operation of the decoding unit 2 is described. The multiple-vector adaptive excitation decoding means 22 outputs the code I of the adaptive excitation vector, input from the encoding unit 1, to the adaptive excitation codebook 14. The adaptive excitation codebook 14 outputs the adaptive excitation vector a_I corresponding to the code I to the multiple-vector adaptive excitation decoding means 22 and to the periodic pulse excitation generating means 23. The periodic pulse excitation generating means 23 generates the periodic pulse excitation vector p_I from the adaptive excitation vector a_I input from the adaptive excitation codebook 14 and outputs it to the multiple-vector adaptive excitation decoding means 22.

【００３８】The multiple-vector adaptive excitation decoding means 22 outputs to the excitation vector generating means 17 the adaptive excitation vector a_I input from the adaptive excitation codebook 14, the periodic pulse excitation vector p_I input from the periodic pulse excitation generating means 23, and the gains β_qAI and β_qPI decoded from the respective gain codes input from the encoding unit 1. That is, the set β_qAI, a_I, β_qPI, p_I is supplied to the excitation vector generating means.

【００３９】The excitation vector generating means 17 generates the excitation vector β_qAI a_I + β_qPI p_I + γ_qJ c_J from the adaptive excitation vector a_I, the periodic pulse excitation vector p_I, and their gains β_qAI and β_qPI, input from the multiple-vector adaptive excitation decoding means 22, and from the driving excitation vector c_J and its gain γ_qJ, input from the driving excitation decoding means 15, which is its other input. The generated excitation vector is output to the adaptive excitation codebook 14 and to the synthesis filter 19. The output of the synthesis filter 19 is the synthesized, reconstructed speech output.
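The excitation vector formed by the excitation vector generating means 17 is a gain-weighted sum of three vectors. A minimal sketch, with a hypothetical function name and plain Python lists standing in for the frame-length vectors:

```python
def excitation_vector(a, p, c, beta_a, beta_p, gamma):
    """Return beta_a * a + beta_p * p + gamma * c, sample by sample.

    a: adaptive excitation vector, p: periodic pulse excitation vector,
    c: driving excitation vector; the gains are the decoded (quantized) ones.
    """
    if not (len(a) == len(p) == len(c)):
        raise ValueError("all vectors must have the same length")
    return [beta_a * ai + beta_p * pi + gamma * ci
            for ai, pi, ci in zip(a, p, c)]
```

The same vector is fed both to the synthesis filter and back into the adaptive excitation codebook, which is how the codebook comes to hold the excitation of preceding frames.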

【００４０】Embodiment 2. In Embodiment 1 above, the periodic pulse excitation vector is generated considering only the pitch periodicity within the frame being encoded. However, the pitch period of the adaptive excitation vector that is cut out, selected, and encoded may, for example, take a value twice the actual pitch period of the input speech, so the pulse train produced by the periodic pulse excitation vectors is not necessarily periodic across frames. In such cases, the periodic pulse excitation vector may be generated taking the pitch periodicity across frames into account as well, for example by obtaining an average pitch period in advance from the information of preceding frames and generating the periodic pulse excitation vector from a pulse train with that average pitch period.

【００４１】Embodiment 3. In Embodiment 1 above, all pulses of the periodic pulse excitation vector have the same amplitude. This may be changed so that each periodic pulse excitation vector takes a different amplitude, for example by varying the amplitude linearly, increasing or decreasing it successively from the previous frame according to the power of the speech.

【００４２】Embodiment 4. In Embodiment 1 above, the pulse positions of the periodic pulse excitation vector are obtained from the maximum-amplitude point of the adaptive excitation vector. Alternatively, the periodic pulse excitation vector may be set so that the inner product of the two synthesized speech vectors, obtained by synthesizing the adaptive excitation vector and the periodic pulse excitation vector respectively, is maximized.

【００４３】Embodiment 5. In Embodiment 1 above, the adaptive excitation vector and the periodic pulse excitation vector are used as they are to generate the synthesized speech vectors, and the inter-vector distance between the resulting vector and the input speech vector is obtained. Alternatively, for example, the amplitude of the adaptive excitation vector at the pulse positions of the periodic pulse excitation vector may be set to 0, so that the adaptive excitation vector and the periodic pulse excitation vector are orthogonalized and the correlation between them is removed, and the synthesized speech vector may be generated with the periodicity carried exclusively by the periodic pulse excitation vector. The inter-vector distance between the resulting vector and the input speech vector may then be obtained.
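The orthogonalization described in Embodiment 5 can be sketched in a few lines: zeroing the adaptive excitation vector wherever the pulse vector is non-zero makes the inner product of the two vectors exactly zero. The function name is hypothetical.

```python
def zero_at_pulses(adaptive_vector, pulse_vector):
    """Copy of the adaptive vector with samples at pulse positions set to 0."""
    return [0.0 if p != 0 else a
            for a, p in zip(adaptive_vector, pulse_vector)]
```

Note that this removes correlation between the excitation vectors themselves; the synthesized speech vectors obtained from them are generally still correlated, which is the case Embodiment 6 addresses.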

【００４４】Embodiment 6. In Embodiment 5 above, the adaptive excitation vector and the periodic pulse excitation vector are orthogonalized to remove their correlation. Instead, the correlation between the vectors may be removed by orthogonalizing the synthesized speech vectors obtained by synthesizing each of them.

【００４５】Embodiment 7. In Embodiment 1 above, the periodic pulse excitation vector is formed as a single vector. Instead, the gain of each pulse of the periodic pulse excitation vector may be set separately; for example, when there are two pulses as in FIG. 2(b), the gain of the temporally earlier pulse and the gain of the later pulse may be set to different values.

【００４６】Embodiment 8. FIG. 3 is a block diagram showing an embodiment of the inventions of claims 3 and 5. Parts in FIG. 3 that are the same as those in FIG. 1 are given the same reference numerals, and their description is omitted. In FIG. 3, 24 is adaptive excitation encoding means, 25 is pitch position extracting means, 26 is driving excitation encoding means, 27 and 28 are pitch synchronizing means, and 29 is excitation vector generating means.

【００４７】The operation of the embodiment shown in FIG. 3 is described below, beginning with the encoding unit 1. The adaptive excitation encoding means 24 synthesizes the synthesized speech vector A_i using the adaptive excitation vector a_i input from the adaptive excitation codebook 9 and the spectrum parameters input from the spectrum parameter encoding means 7. It then searches for the synthesized speech vector A_I and its gain β_I that minimize the distortion relative to the input speech vector X cut out of the input speech 4 frame by frame, and outputs the code I and the code of the gain β_qI, obtained by quantizing β_I, to the decoding unit 2 via the transmission path 3. It outputs the selected adaptive excitation vector a_I and its gain β_qI to the excitation vector generating means 12. It also forms the error vector X' = X - β_qI A_I, outputs X' and the adaptive excitation vector a_I to the driving excitation encoding means 26, and outputs the pitch period corresponding to a_I to the pitch position extracting means 25.

【００４８】The pitch position extracting means 25 creates pulse trains of the pitch period using the pitch period input from the adaptive excitation encoding means 24. It searches for the pulse train that, when used as the excitation to generate a synthesized speech vector with the spectrum parameters input from the spectrum parameter encoding means 7, minimizes the distortion relative to the input speech vector X cut out of the input speech 4 frame by frame. It then outputs the pulse positions of that train to the pitch synchronizing means 27 as the pitch positions.

【００４９】FIG. 4 shows an operation block diagram for the pitch position search. In this operation, the leading position of the pitch-period pulse train is set successively, starting from the head of the frame, to values shifted in steps of, for example, 1/128 of the frame duration, obtained by dividing the cut-out time of one frame into 128 equal parts. Among the pulse trains with shifted leading positions, the one whose synthesized output is closest to the input speech is selected. In this way, a point that is uniquely determined with respect to the pitch structure of the speech is obtained regardless of how the input speech is framed, so the periodicity can be made clear.
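The pitch position search can be sketched as an exhaustive search over the start offset of the pulse train. This is an illustration under assumptions: the step is one sample (for a 128-sample frame this equals 1/128 of the frame), offsets are limited to one pitch period, the synthesis filter is a zero-state all-pole filter with hypothetical coefficients `lpc`, and an optimal gain is fitted for each candidate before the distortion is measured.

```python
def synthesize(excitation, lpc):
    """All-pole filter: y[n] = e[n] - sum_k lpc[k] * y[n-1-k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, coeff in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= coeff * out[n - 1 - k]
        out.append(acc)
    return out


def pulse_train(length, start, pitch_period):
    """Unit pulses at start, start + pitch_period, ... inside the frame."""
    train = [0.0] * length
    for k in range(start, length, pitch_period):
        train[k] = 1.0
    return train


def find_pitch_position(x, pitch_period, lpc):
    """Return the start offset whose synthesized pulse train best matches x."""
    best_start, best_dist = 0, float("inf")
    for start in range(min(pitch_period, len(x))):
        y = synthesize(pulse_train(len(x), start, pitch_period), lpc)
        yy = sum(v * v for v in y)
        xy = sum(a * b for a, b in zip(x, y))
        gain = xy / yy if yy > 0 else 0.0      # least-squares gain for this offset
        dist = sum((a - gain * b) ** 2 for a, b in zip(x, y))
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start
```

Because the offset is chosen by matching the speech itself, the resulting pitch position does not depend on where the frame boundary happens to fall relative to the pitch pulses.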

【００５０】The driving excitation codebook 11 stores N driving excitation vectors obtained, for example, by training with the Linde, Buzo, Gray (LBG) method on short-term prediction residual signals cut out with a pitch length in synchronization with the pitch positions of speech. The pitch position at which each driving excitation vector was cut out is set as the pitch synchronization position of that driving excitation vector. The pitch synchronizing means 27 aligns the pitch synchronization position of the driving excitation vector c_j input from the driving excitation codebook 11 with the pitch position input from the pitch position extracting means 25, creates a pitch-synchronized driving excitation vector c_j' that repeats with the pitch period, and outputs it to the driving excitation encoding means 26. FIG. 5 shows an example of a pitch-synchronized driving excitation vector in this embodiment. The driving excitation encoding means 26 orthogonalizes the pitch-synchronized driving excitation vector c_j' input from the pitch synchronizing means 27 with respect to the adaptive excitation vector a_I input from the adaptive excitation encoding means 24, according to, for example, equation (8), to create the orthogonalized pitch-synchronized driving excitation vector c_j''.
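The pitch synchronization step, in which a codebook vector is aligned with the pitch position and repeated with the pitch period, can be sketched as below. The sketch assumes integer sample positions, a synchronization position smaller than the pitch period, and zero padding when the codebook vector is shorter than one pitch period; the function and parameter names are hypothetical.

```python
def pitch_synchronize(code_vector, sync_pos, pitch_pos, pitch_period, frame_len):
    """Repeat code_vector every pitch_period samples across the frame.

    The sample code_vector[sync_pos] lands on pitch_pos, pitch_pos + pitch_period,
    and so on. Positions not covered by the code vector are filled with 0.
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    out = []
    for n in range(frame_len):
        # Position inside one pitch cycle, measured so that pitch_pos maps to sync_pos.
        idx = (n - pitch_pos + sync_pos) % pitch_period
        out.append(code_vector[idx] if idx < len(code_vector) else 0.0)
    return out
```

With a three-sample code vector whose synchronization position is its second sample, a pitch position of 2, and a pitch period of 4, the second sample appears at frame positions 2 and 6.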

【００５１】[0051]

【数４】 [Equation 4]
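The equation image for equation (8) is not reproduced in this text. A plausible form, assuming a Gram-Schmidt projection of the pitch-synchronized driving excitation vector (written c_j' here) away from the adaptive excitation vector a_I, is sketched below. Whether the original orthogonalizes in the excitation domain, as shown, or in the synthesized speech domain is an assumption.

```latex
\begin{equation}
c_j'' = c_j' - \frac{c_j' \cdot a_I}{a_I \cdot a_I}\, a_I \tag{8}
\end{equation}
```

With this form, the inner product of c_j'' and a_I is zero, so the driving excitation gain can be chosen without disturbing the fit already achieved by the adaptive excitation.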

【００５２】Next, a synthesized speech vector C_j'' is synthesized from the orthogonalized pitch-synchronized driving excitation vector c_j''. The inter-vector distance D_j from the error vector X' input from the adaptive excitation encoding means 24 is then obtained according to, for example, equation (9). Here the gain γ_j is determined so as to minimize the inter-vector distance D_j, according to, for example, equation (10).

【００５３】[0053]

【数５】 [Equation 5]
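The equation image for equations (9) and (10) is not reproduced in this text. One form consistent with the description, assuming a squared Euclidean distance and writing X' for the error vector and C_j'' for the synthesized speech vector of the orthogonalized pitch-synchronized driving excitation vector, is sketched below; the notation is an assumption.

```latex
\begin{align}
D_j &= \left\lVert X' - \gamma_j C_j'' \right\rVert^2 \tag{9} \\
\gamma_j &= \frac{X' \cdot C_j''}{C_j'' \cdot C_j''} \tag{10}
\end{align}
```

Equation (10) follows from setting the derivative of D_j with respect to γ_j to zero.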

【００５４】Next, the vector C_J'' that minimizes this inter-vector distance D_j is searched for, and its code J and the code of γ_qJ, obtained by quantizing the gain γ_J, are output to the decoding unit 2 via the transmission path 3. At the same time, the selected orthogonalized pitch-synchronized driving excitation vector c_J'' and its gain γ_qJ are output to the excitation vector generating means 12. Because the driving excitation vector was conventionally not synchronized with the pitch, it disturbed the periodicity of the generated excitation signal; synchronizing it with the pitch improves the periodicity of the excitation signal.

【００５５】Next, the operation of the decoding unit 2 is described. The pitch synchronizing means 28 creates a pitch-synchronized driving excitation vector c_J' by repeating the driving excitation vector c_J input from the driving excitation codebook 16 in synchronization with the pitch position input from the encoding unit 1, and outputs it to the driving excitation decoding means 15.

【００５６】The excitation vector generating means 29 first orthogonalizes the pitch-synchronized driving excitation vector c_J' input from the driving excitation decoding means 15 with respect to the adaptive excitation vector a_I input from the adaptive excitation decoding means 13, generating the orthogonalized pitch-synchronized driving excitation vector c_J''. Next, from the adaptive excitation vector a_I and the gain β_qI input from the adaptive excitation decoding means 13, and from the orthogonalized pitch-synchronized driving excitation vector c_J'' and the gain γ_qJ input from the driving excitation decoding means 15, it generates the excitation vector β_qI a_I + γ_qJ c_J'' and outputs it to the adaptive excitation codebook 14 and the synthesis filter 19. The output of the synthesis filter 19 is the synthesized, reconstructed speech output.
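The decoder-side steps of this paragraph, orthogonalizing the pitch-synchronized driving excitation vector against the adaptive excitation vector and then forming the gain-weighted sum, can be sketched as follows. The Gram-Schmidt projection used here is an assumption about the orthogonalization, and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(c, a):
    """Remove from c its component along a (Gram-Schmidt step)."""
    aa = dot(a, a)
    if aa == 0.0:
        return list(c)                   # nothing to project out
    scale = dot(c, a) / aa
    return [ci - scale * ai for ci, ai in zip(c, a)]


def decode_excitation(a, c_sync, beta_q, gamma_q):
    """Return beta_q * a + gamma_q * c_orth for one frame."""
    c_orth = orthogonalize(c_sync, a)
    return [beta_q * ai + gamma_q * ci for ai, ci in zip(a, c_orth)]
```

Since the decoder holds the same a_I and the same pitch-synchronized vector as the encoder, it reproduces the same orthogonalized vector without any additional transmitted information.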

【００５７】Embodiment 9. FIG. 6 is a block diagram showing an embodiment of the inventions of claims 4 and 5. Parts in FIG. 6 that are the same as those in FIG. 3 are given the same reference numerals, and their description is omitted. In FIG. 6, 30 is driving excitation encoding means.

【００５８】 The embodiment shown in FIG. 6 will be described below. The adaptive excitation encoding means 24 synthesizes the synthesized speech vector A_i using the adaptive excitation vector a_i input from the adaptive excitation codebook 9 and the spectrum parameters input from the spectrum parameter encoding means 7. It then searches for the vector A_I, and its gain β_I, that minimize the distortion relative to the input speech vector X cut out from the input speech 4 for each frame. It outputs the code I and the code of the gain β_qI, obtained by quantizing the gain β_I, to the decoding unit 2 via the transmission line 3, and outputs the selected adaptive excitation vector a_I and its gain β_qI to the excitation vector generation means 12. It also sets the error vector X■ to X■ = X − β_qI A_I and outputs X■, the adaptive excitation vector a_I, and the pitch period corresponding to a_I to the driving excitation encoding means 30.

【００５９】 The driving excitation encoding means 30 first uses the pitch period input from the adaptive excitation encoding means 24 to output a pitch position k to the pitch synchronization means 27. The pitch position k denotes the positions determined by the pitch period, starting from the k-th point from the head of the frame. The pitch synchronization means 27 aligns the pitch synchronization position of the driving excitation vector c_j input from the driving excitation codebook 11 with the pitch position k input from the driving excitation encoding means 30, creates the pitch-synchronized driving excitation vector c_jk■ that repeats at the pitch period, and outputs it to the driving excitation encoding means 30. The driving excitation encoding means 30 then orthogonalizes the pitch-synchronized driving excitation vector c_jk■ input from the pitch synchronization means 27 against the adaptive excitation vector a_I input from the adaptive excitation encoding means 24, for example according to equation (11), to create the orthogonalized pitch-synchronized driving excitation vector c_jk■.
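The pitch synchronization step can be sketched as below. This is an illustrative reading, not the patent's exact procedure: it assumes that one pitch period of the codebook vector, starting at a predetermined sync position, is tiled over the frame so that the sync sample lands on the pitch position and on every position one pitch period later. The function name `pitch_synchronize`, the `sync_pos` parameter, and the 0-based indexing are assumptions made for the example (the text counts positions from 1).

```python
def pitch_synchronize(code_vector, period, pitch_pos, sync_pos=0):
    """Tile one pitch period of code_vector so sample sync_pos lands on pitch_pos."""
    n = len(code_vector)
    # One period of the codebook vector, starting at the assumed sync position.
    segment = [code_vector[(sync_pos + m) % n] for m in range(period)]
    # Sample i of the frame lies (i - pitch_pos) samples after a pitch position.
    return [segment[(i - pitch_pos) % period] for i in range(n)]


# Toy frame of 8 samples with a pitch period of 3 and pitch position 1.
c_j = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
c_jk = pitch_synchronize(c_j, period=3, pitch_pos=1)
```

In this toy case the first sample of `c_j` appears at indices 1, 4, and 7, so the resulting vector is periodic with the pitch period regardless of the codebook entry chosen.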

【００６０】[0060]

【数６】 [Equation 6]

【００６１】 Next, the synthesized speech vector C_jk■ is synthesized from the orthogonalized pitch-synchronized driving excitation vector c_jk■. The inter-vector distance D_jk to the error vector X■ input from the adaptive excitation encoding means 24 is then obtained, for example according to equation (12). Here the gain γ_jk is determined so as to minimize the inter-vector distance D_jk, for example according to equation (13).
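Equations (12) and (13) are not reproduced in this text. As an assumption only, a common least-squares form that matches the description (a squared distance, with the gain chosen to minimize it) is shown below. Here $X'$ stands for the error vector and $C_{jk}$ for the synthesized speech vector; this notation is chosen for the sketch and may differ from the original equations.

```latex
D_{jk} = \| X' - \gamma_{jk} C_{jk} \|^2,
\qquad
\frac{\partial D_{jk}}{\partial \gamma_{jk}} = 0
\;\Longrightarrow\;
\gamma_{jk} = \frac{\langle X', C_{jk} \rangle}{\| C_{jk} \|^2},
\qquad
D_{jk}^{\min} = \| X' \|^2 - \frac{\langle X', C_{jk} \rangle^2}{\| C_{jk} \|^2}
```

Under this form, minimizing the distance is equivalent to maximizing the normalized correlation term, which is why a search only needs the two inner products per candidate.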

【００６２】[0062]

【数７】 [Equation 7]

【００６３】 Next, this inter-vector distance D_jk is obtained for every combination of the driving excitation vector code j (j = 1, ..., N) and the pitch position k (k = 1, ..., L), and the vector C_JK■ that minimizes D_jk is searched for. The code J, the pitch position K, and the code of γ_qJK, obtained by quantizing the gain γ_JK, are output to the decoding unit 2 via the transmission line 3. At the same time, the selected orthogonalized pitch-synchronized driving excitation vector c_JK■ and its gain γ_qJK are output to the excitation vector generation means 12. Because the pitch position is chosen as the one that minimizes the distortion of the synthesized speech, the quality of the synthesized speech improves.
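The exhaustive search over codes and pitch positions can be sketched as follows. This is a hedged illustration: it assumes the synthesized vectors have already been computed for every (j, k) pair, assumes a squared-error distance with a least-squares gain, and uses 0-based indices where the text counts from 1. The function name `search_code_and_pitch_position` is hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def search_code_and_pitch_position(target, synthesized):
    """Return (j, k, gain, distance) minimizing ||target - gain * synthesized[j][k]||^2."""
    best = None
    target_energy = dot(target, target)
    for j, row in enumerate(synthesized):
        for k, vec in enumerate(row):
            energy = dot(vec, vec)
            if energy == 0.0:
                continue  # an all-zero candidate cannot reduce the error
            corr = dot(target, vec)
            gain = corr / energy                 # assumed least-squares gain
            dist = target_energy - gain * corr   # residual error at that gain
            if best is None or dist < best[3]:
                best = (j, k, gain, dist)
    return best


# Toy example: 2 codes x 2 pitch positions, 2-sample vectors.
error_vector = [1.0, 2.0]
candidates = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 2.0], [1.0, 1.0]],
]
J, K, gamma_JK, D_JK = search_code_and_pitch_position(error_vector, candidates)
```

The cost of this search grows with N × L, which is the price of letting the pitch position be chosen by the distortion criterion rather than fixed in advance.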

【００６４】 Embodiment 10. FIG. 7 is a block diagram showing an embodiment of the invention of claim 6. In FIG. 7, parts identical to those in FIG. 6 are given the same reference numerals, and their description is omitted. In FIG. 7, 31 is an adaptive excitation encoding means, and 32 and 33 are pitch position extraction means.

【００６５】 The operation of the embodiment shown in FIG. 7 will be described below. First, the operation of the encoding unit 1 will be described. The pitch position extraction means 32 synthesizes the synthesized speech vector A_i using the adaptive excitation vector a_i input from the adaptive excitation codebook 9 and the spectrum parameters input from the spectrum parameter encoding means 7. Next, using as excitation a pulse train with the pitch period corresponding to the adaptive excitation vector a_i, it generates synthesized speech vectors with the spectrum parameters and searches for the pulse train whose distortion relative to the synthesized speech vector A_i is smallest. It then outputs the position of that pulse train as the pitch position to the adaptive excitation encoding means 31.
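The pulse-train search described above can be sketched as follows. This is a simplified illustration under stated assumptions: the synthesis filter is modeled as an all-pole filter with coefficients `lpc`, the distortion is a squared error with a least-squares gain, and positions are 0-based. The names `synthesize` and `extract_pitch_position` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def synthesize(excitation, lpc):
    """All-pole filter: y[n] = x[n] - sum_m lpc[m] * y[n - 1 - m]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for m, coeff in enumerate(lpc):
            if n - 1 - m >= 0:
                acc -= coeff * out[n - 1 - m]
        out.append(acc)
    return out


def extract_pitch_position(target, period, lpc):
    """Return the pulse-train phase whose synthesized output best matches target."""
    n = len(target)
    target_energy = dot(target, target)
    best_pos, best_dist = 0, None
    for pos in range(min(period, n)):
        # Unit pulses at pos, pos + period, pos + 2 * period, ...
        pulses = [1.0 if (i - pos) % period == 0 else 0.0 for i in range(n)]
        synth = synthesize(pulses, lpc)
        corr = dot(target, synth)
        dist = target_energy - corr * corr / dot(synth, synth)
        if best_dist is None or dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos


# Toy check: a target built from pulses at phase 2 should be found at phase 2.
lpc = [-0.5]
reference = synthesize([1.0 if (i - 2) % 4 == 0 else 0.0 for i in range(12)], lpc)
pitch_position = extract_pitch_position(reference, period=4, lpc=lpc)
```

Since the search uses only the target vector, the pitch period, and the filter coefficients, any side that holds those three inputs obtains the same position.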

【００６６】 The adaptive excitation encoding means 31 first uses as excitation a pulse train with the pitch period corresponding to the adaptive excitation vector a_i input from the adaptive excitation codebook 9, generates synthesized speech vectors with the spectrum parameters input from the spectrum parameter encoding means 7, and searches for the pulse train whose distortion relative to the input speech vector X, cut out from the input speech 4 for each frame, is smallest. The position of that pulse train is taken as the pitch position in the input speech. The processing below is performed only when the pitch position in the adaptive excitation vector a_i, input from the pitch position extraction means 32, lies within a certain distance (for example, 5 samples) of the pitch position in the input speech; otherwise the adaptive excitation vector a_i is excluded from the encoding candidates.

【００６７】 The adaptive excitation encoding means 31 next generates a pitch pulse vector p_i that is 1 only at the pitch positions in the adaptive excitation vector a_i and 0 elsewhere. Using the spectrum parameters, it synthesizes the synthesized speech vectors A_i and P_i from the adaptive excitation vector a_i and the pitch pulse vector p_i. It then obtains the inter-vector distance D_i to the input speech vector X, for example according to equation (14). Here min(x, y) is a function that returns the smaller of x and y. The gains β_Ai and β_Pi are determined so as to minimize the distance D_i, for example according to equations (15) and (16).

【００６８】[0068]

【数８】 [Equation 8]

【００６９】 Next, the vector A_I that minimizes this inter-vector distance is searched for, and its code I and the code of the gain β_qAI, obtained by quantizing its gain β_AI, are output to the decoding unit 2 via the transmission line 3. The adaptive excitation vector a_I and its gain β_qAI are output to the excitation vector generation means 12. Furthermore, the pitch position in the adaptive excitation vector a_I, input from the pitch position extraction means 32, is output to the pitch synchronization means 27. The error vector X■ is set to X■ = X − β_qAI A_I, and X■ and the adaptive excitation vector a_I are also output to the driving excitation encoding means 26.

【００７０】 Next, the operation of the decoding unit 2 will be described. The pitch position extraction means 33 extracts the pitch position from the adaptive excitation vector a_I input from the adaptive excitation codebook 14 and the spectrum parameters input from the spectrum parameter decoding means 18, and outputs this pitch position to the pitch synchronization means 28. Because the pitch position is extracted from the adaptive excitation vector, which is a quantized form of the input speech, and from the spectrum parameters, the decoding unit can derive the same pitch position on its own. The pitch position therefore need not be transmitted, and the amount of transmitted information is reduced without degrading the quality of the synthesized speech.

【００７１】 Embodiment 11. In Embodiment 10 above, the adaptive excitation encoding means uses a fixed distance threshold between the pitch position extracted from the adaptive excitation vector and the pitch position obtained from the input speech as the criterion for deciding whether an adaptive excitation vector is an encoding candidate. This threshold may instead be made variable, for example according to the pitch length.

【００７２】 Embodiment 12. In Embodiment 10 above, the adaptive excitation encoding means encodes only those adaptive excitation vectors for which the difference between the pitch position extracted from the adaptive excitation vector and the pitch position obtained from the input speech is within a certain distance. In frames where no adaptive excitation vector satisfies this criterion, the distance criterion may be relaxed or dropped.
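The candidate screening of Embodiment 10, together with the fallback of Embodiment 12, can be sketched as follows. This is a minimal illustration: the function name `screen_candidates` is hypothetical, the distance is taken as a plain absolute difference in samples, and the fallback shown simply drops the criterion (relaxing it instead would be an equally valid reading).

```python
def screen_candidates(vector_pitch_positions, speech_pitch_position, max_distance=5):
    """Return indices of adaptive excitation vectors kept as encoding candidates."""
    kept = [
        index
        for index, position in enumerate(vector_pitch_positions)
        if abs(position - speech_pitch_position) <= max_distance
    ]
    if not kept:
        # Embodiment 12: no vector meets the criterion, so drop it for this frame.
        kept = list(range(len(vector_pitch_positions)))
    return kept


# Pitch positions extracted from three codebook vectors; speech pitch position 10.
candidates = screen_candidates([3, 12, 20], speech_pitch_position=10)
# A frame where nothing is within 5 samples falls back to all vectors.
fallback = screen_candidates([30, 40], speech_pitch_position=10)
```

Passing a `max_distance` that depends on the pitch length would correspond to the variable threshold of Embodiment 11.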

【００７３】 Embodiment 13. FIG. 8 is a block diagram showing an embodiment of the invention of claim 9. In FIG. 8, parts identical to those in FIG. 10 are given the same reference numerals, and their description is omitted. In FIG. 8, 34 and 36 are control means, 35 is a driving excitation encoding means, and 37 is a driving excitation decoding means.

【００７４】 The operation of the embodiment shown in FIG. 8 will be described below. First, the operation of the encoding unit 1 will be described. The control means 34 obtains the power P of the adaptive excitation signal, P = ‖β_qI a_I‖², from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation encoding means 8. It then compares P with, for example, preset thresholds P₁, P₂, and P₃, selects from several gain quantization tables prepared in advance for the driving excitation vector the one actually used for quantization, for example according to the conditions below, and outputs the selected table to the driving excitation encoding means 35.

P < P₁: select quantization table 1
P₁ ≦ P < P₂: select quantization table 2
P₂ ≦ P < P₃: select quantization table 3
P₃ ≦ P: select quantization table 4
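The table selection rule can be sketched as follows. This is an illustrative sketch: the function names `adaptive_power` and `select_gain_table` and the threshold values are assumptions, since the text gives no numeric thresholds.

```python
import bisect


def adaptive_power(gain, vector):
    """Power of the adaptive excitation signal, P = ||gain * vector||^2."""
    return sum((gain * x) ** 2 for x in vector)


def select_gain_table(power, thresholds):
    """Map power to a table number 1..len(thresholds) + 1.

    thresholds must be ascending, e.g. [P1, P2, P3]. A power equal to a
    threshold falls into the higher table, matching P1 <= P < P2 -> table 2.
    """
    return bisect.bisect_right(thresholds, power) + 1


# Hypothetical thresholds P1, P2, P3, chosen only for illustration.
thresholds = [1.0, 4.0, 9.0]
power = adaptive_power(0.5, [1.0, 1.0])
table_number = select_gain_table(power, thresholds)
```

The selection depends only on the adaptive excitation vector and its quantized gain, both of which the decoder also holds, so the table number itself never has to be sent.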

【００７５】 Each quantization table is designed in advance, using for example the correlation shown in FIG. 9 between the power of the adaptive excitation signal and the gain for the driving excitation vector, so that quantization is efficient under its own condition. As FIG. 9 shows, switching the quantization of the driving excitation vector gain according to the power of the adaptive excitation signal narrows the range of gains that each table must cover, which raises quantization efficiency.

【００７６】 The driving excitation encoding means 35 synthesizes the synthesized speech vector C_j using the driving excitation vector c_j input from the driving excitation codebook 11 and the spectrum parameters input from the spectrum parameter encoding means. It then obtains the inter-vector distance D_j to the error vector X■ input from the adaptive excitation encoding means 8, for example according to equation (17). Here the gain γ_j is determined so as to minimize the distance D_j, for example according to equation (18).

【００７７】[0077]

【数９】 [Equation 9]

【００７８】 Next, the vector C_J that minimizes this inter-vector distance is searched for. Its code J, and the code of the gain γ_qJ obtained by quantizing the gain γ_J with the gain quantization table for the driving excitation vector input from the control means 34, are output to the decoding unit 2 via the transmission line 3. The selected driving excitation vector c_J and its gain γ_qJ are output to the excitation vector generation means 12.

【００７９】 Next, the operation of the decoding unit 2 will be described. The control means 36 obtains the power P of the adaptive excitation signal from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation decoding means 13. From the same set of prepared gain quantization tables for the driving excitation vector as the control means 34, and using the same thresholds and condition test as the control means 34, it selects the gain quantization table to be used for inverse quantization and outputs the selected table to the driving excitation decoding means 37.

【００８０】 The driving excitation decoding means 37 reads the driving excitation vector c_J from the driving excitation codebook 16 based on the code J of the driving excitation vector input from the encoding unit 1. Using the gain quantization table input from the control means 36, it decodes the gain γ_qJ from the code of the gain for the driving excitation vector input from the encoding unit 1, and outputs the driving excitation vector c_J and its gain γ_qJ to the excitation vector generation means 17.
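The gain coding round trip can be sketched as follows, assuming the control means on each side has already selected the same table. The nearest-entry quantizer, the function names `quantize_gain` and `dequantize_gain`, and the table values are all assumptions for illustration; the text does not specify how the tables are searched or what they contain.

```python
def quantize_gain(gain, table):
    """Encoder side: return the index of the table entry closest to gain."""
    return min(range(len(table)), key=lambda index: abs(table[index] - gain))


def dequantize_gain(code, table):
    """Decoder side: look the transmitted code up in the same table."""
    return table[code]


# Hypothetical gain quantization table shared by encoder and decoder.
selected_table = [0.1, 0.2, 0.4, 0.8]
gain_code = quantize_gain(0.33, selected_table)          # sent over the line
decoded_gain = dequantize_gain(gain_code, selected_table)  # recovered at decoder
```

Only `gain_code` crosses the transmission line; a table with a narrower gain range gives a finer step for the same number of code bits.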

【００８１】[0081]

[Effects of the Invention] As described above, according to the present invention, the speech encoding apparatus comprises an adaptive excitation codebook that stores adaptive excitation vectors, periodic pulse excitation generating means that generates periodic pulse excitation vectors, and multiple-vector adaptive excitation encoding means that adds the two together and encodes them, and the speech decoding apparatus comprises means corresponding to those of the encoding apparatus. High-quality speech with little disturbance of pitch periodicity can therefore be synthesized with only a small addition of transmitted information.

【００８２】 According to the inventions of claims 3 to 5, the speech encoding apparatus comprises an adaptive excitation codebook, pitch position extraction means, a driving excitation codebook that stores a plurality of excitation signals, pitch synchronization means that repeats the driving excitation vector at the pitch period, adaptive excitation encoding means, and driving excitation encoding means, and the speech decoding apparatus comprises means corresponding to those of the encoding apparatus. High-quality speech with little disturbance of pitch periodicity can therefore be synthesized with only a small addition of transmitted information.

【００８３】 According to the inventions of claims 6 and 7, the speech encoding apparatus comprises an adaptive excitation codebook, pitch position extraction means that extracts the pitch position from the adaptive excitation vector, a driving excitation codebook, pitch synchronization means that repeats the driving excitation vector at the pitch period, adaptive excitation encoding means, and driving excitation encoding means, and the speech decoding apparatus comprises means corresponding to those of the encoding apparatus. High-quality speech with little disturbance of pitch periodicity can therefore be synthesized without increasing the transmitted information.

【００８４】 According to the invention of claim 8, a method of extracting the periodic pitch and a method of generating a periodic pitch-synchronized vector on the receiving side are provided, so high-quality transmission with little disturbance of pitch periodicity becomes possible with only a small addition of transmitted information.

【００８５】 According to the inventions of claims 9 and 10, the speech encoding apparatus comprises an adaptive excitation codebook, a driving excitation codebook, adaptive excitation encoding means, control means that switches the gain quantization table for the driving excitation vector according to the power information of the adaptive excitation signal, and driving excitation encoding means, and the speech decoding apparatus comprises means corresponding to those of the encoding apparatus. High-quality speech with small quantization distortion can therefore be synthesized without increasing the transmitted information.

【００８６】 Furthermore, according to the invention of claim 11, a method of switching the quantization of the driving excitation vector gain to an efficient one according to the power information of the adaptive excitation signal, and a method of switching to the corresponding quantization on the receiving side, are provided, so high-quality transmission with small quantization distortion becomes possible without increasing the transmitted information.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing Embodiment 1 of the present invention.

FIG. 2 is an explanatory diagram showing an example of a periodic pulse excitation vector of the present invention.

FIG. 3 is a configuration diagram showing Embodiment 8 of the present invention.

FIG. 4 is an explanatory diagram showing pitch position extraction of the present invention.

FIG. 5 is an explanatory diagram showing an example of a pitch-synchronized driving excitation vector of the present invention.

FIG. 6 is a configuration diagram showing Embodiment 9 of the present invention.

FIG. 7 is a configuration diagram showing Embodiment 10 of the present invention.

FIG. 8 is a configuration diagram showing Embodiment 13 of the present invention.

FIG. 9 is an explanatory diagram showing the relationship between the power of the adaptive excitation signal and the gain for the driving excitation vector.

FIG. 10 is a configuration diagram showing a conventional speech encoding and decoding apparatus.

FIG. 11 is an explanatory diagram showing an example of an adaptive excitation vector.

[Explanation of symbols]

1: Encoding unit
2: Decoding unit
3: Transmission line
4: Input speech
5: Output speech
6: Spectrum analysis means
7: Spectrum parameter encoding means
8: Adaptive excitation encoding means
9, 14: Adaptive excitation codebook
10: Driving excitation encoding means
11, 16: Driving excitation codebook
12, 17: Excitation vector generation means
13: Adaptive excitation decoding means
15: Driving excitation decoding means
18: Spectrum parameter decoding means
19: Synthesis filter
20: Multiple-vector adaptive excitation encoding means
21, 23: Periodic pulse excitation generation means
22: Multiple-vector adaptive excitation decoding means
24: Adaptive excitation encoding means
25: Pitch position extraction means
26: Driving excitation encoding means
27, 28: Pitch synchronization means
29: Excitation vector generation means
30: Driving excitation encoding means
31: Adaptive excitation encoding means
32, 33: Pitch position extraction means
34, 36: Control means
35: Driving excitation encoding means
37: Driving excitation decoding means

Claims

[Claims]

1. A speech encoding apparatus that encodes input speech by separating it into spectral envelope information and excitation signal information for each frame divided at regular intervals, comprising: an adaptive excitation codebook that stores an adaptive excitation vector, which is information on the excitation signal of a preceding frame; periodic pulse excitation generating means for generating a periodic pulse excitation vector determined as the main component of each adaptive excitation vector in the adaptive excitation codebook; and multiple-vector adaptive excitation encoding means for obtaining the adaptive excitation vector, the periodic pulse excitation vector corresponding to that adaptive excitation vector, and the gains for the adaptive excitation vector and the periodic pulse excitation vector that minimize the distortion between the input speech and decoded speech generated using, as the excitation signal, a linear sum of the adaptive excitation vector and the corresponding periodic pulse excitation vector, and for encoding the adaptive excitation vector and the gains for the adaptive excitation vector and the periodic pulse excitation vector.

2. A speech decoding apparatus that decodes spectral envelope information and excitation signal information encoded for each frame and generates decoded speech using the decoded parameters, comprising: an adaptive excitation codebook that stores an adaptive excitation vector, which is information on the excitation signal of a preceding frame; periodic pulse excitation generating means for generating a periodic pulse excitation vector determined as the main component of the decoded adaptive excitation vector; and multiple-vector adaptive excitation decoding means for decoding the adaptive excitation vector and the gains for the adaptive excitation vector and for the periodic pulse excitation vector corresponding to that adaptive excitation vector, and for applying the respective gains to the adaptive excitation vector and the periodic pulse excitation vector.

3. A speech encoding apparatus that encodes input speech by separating it into spectral envelope information and excitation signal information for each frame divided at regular intervals, comprising: an adaptive excitation codebook that stores an adaptive excitation vector, which is information on the excitation signal of a preceding frame; adaptive excitation encoding means for obtaining the adaptive excitation vector, and the gain for it, that minimize the distortion between the input speech and decoded speech generated using the adaptive excitation vector as the excitation signal, and for encoding the adaptive excitation vector and the gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of prepared excitation signals as driving excitation vectors; pitch position extracting means for extracting, as pitch positions, a plurality of feature points arranged at pitch-period intervals in the input speech, and for encoding the pitch position; pitch synchronization means for aligning a predetermined position of each driving excitation vector of the driving excitation codebook with the pitch position and generating a pitch-synchronized driving excitation vector by repeating the driving excitation vector at the pitch period; and driving excitation encoding means for obtaining the pitch-synchronized driving excitation vector, and the gain for it, that minimize the distortion between the input speech and decoded speech generated using, as the excitation signal, a linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector, and for encoding the driving excitation vector corresponding to that pitch-synchronized driving excitation vector and the gain for the pitch-synchronized driving excitation vector.

4. A speech encoding apparatus that encodes input speech by separating it into spectral envelope information and excitation signal information for each frame divided at regular intervals, comprising: an adaptive excitation codebook that stores an adaptive excitation vector, which is information on the excitation signal of a preceding frame; adaptive excitation encoding means for obtaining the adaptive excitation vector, and the gain for it, that minimize the distortion between the input speech and decoded speech generated using the adaptive excitation vector as the excitation signal, and for encoding the adaptive excitation vector and the gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of prepared excitation signals as driving excitation vectors; pitch synchronization means for taking a plurality of points arranged at pitch-period intervals as pitch positions, aligning a predetermined position of each driving excitation vector of the driving excitation codebook with the pitch position, and generating a pitch-synchronized driving excitation vector by repeating the driving excitation vector at the pitch period; and driving excitation encoding means for obtaining the pitch-synchronized driving excitation vector, and the gain for it, that minimize the distortion between the input speech and decoded speech generated using, as the excitation signal, a linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector, and for encoding the driving excitation vector corresponding to that pitch-synchronized driving excitation vector, the pitch position, and the gain for the pitch-synchronized driving excitation vector.

5. A speech decoding apparatus that decodes spectral envelope information and excitation signal information encoded for each frame and generates decoded speech using the decoded parameters, comprising: an adaptive excitation codebook that stores an adaptive excitation vector, which is information on the excitation signal of a preceding frame; adaptive excitation decoding means for decoding the adaptive excitation vector and the gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of prepared excitation signals as driving excitation vectors; pitch synchronization means for decoding the pitch position, aligning a predetermined position of the decoded driving excitation vector with the pitch position, and generating a pitch-synchronized driving excitation vector by repeating the driving excitation vector at the pitch period; driving excitation decoding means for decoding the driving excitation vector and the gain for the pitch-synchronized driving excitation vector; and excitation vector generating means for decoding the excitation vector from the adaptive excitation vector and its gain and from the pitch-synchronized driving excitation vector and its gain.

6. A speech coding apparatus for coding input speech by dividing it into spectral envelope information and excitation signal information for each frame divided at regular intervals, comprising: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame; pitch position extracting means for extracting, as pitch positions, a plurality of feature points arranged at pitch-period intervals in each adaptive excitation vector in the adaptive excitation codebook; adaptive excitation coding means for obtaining the adaptive excitation vector that minimizes distortion between the input speech and the decoded speech generated using the adaptive excitation vector as the excitation signal, together with a gain for the adaptive excitation vector, and for encoding the adaptive excitation vector and the gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means for aligning a predetermined position of each driving excitation vector of the driving excitation codebook with the pitch position obtained by the pitch position extracting means, and repeating the driving excitation vector at the pitch period to generate a pitch-synchronized driving excitation vector; and driving excitation coding means for obtaining the pitch-synchronized driving excitation vector that minimizes distortion between the input speech and the decoded speech generated using a linear sum of the adaptive excitation vector and the pitch-synchronized driving excitation vector as the excitation signal, together with a gain for the pitch-synchronized driving excitation vector, and for encoding the driving excitation vector corresponding to the pitch-synchronized driving excitation vector and the gain for the pitch-synchronized driving excitation vector.
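
One possible reading of the pitch position extracting means of claim 6 is sketched below in Python. The claim does not state which feature points are used; this example assumes, purely for illustration, that the feature point is the sample of largest absolute amplitude in the adaptive excitation vector, and that the remaining pitch positions lie at multiples of the pitch period from it. All names are hypothetical.

```python
def extract_pitch_positions(adaptive_vector, pitch_period):
    """List sample indices spaced pitch_period apart that pass through the peak.

    Assumption: the feature point is the largest-magnitude sample.
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    if not adaptive_vector:
        return []
    # Index of the largest-magnitude sample (first one if there are ties).
    peak = max(range(len(adaptive_vector)),
               key=lambda n: abs(adaptive_vector[n]))
    # Walk back to the earliest position on the same pitch grid.
    first = peak % pitch_period
    return list(range(first, len(adaptive_vector), pitch_period))


positions = extract_pitch_positions(
    [0.1, 0.0, 0.2, 0.0, 0.1, 0.0, 0.9, 0.0, 0.1, 0.0, 0.3, 0.0],
    pitch_period=4)
print(positions)
```

Because the positions are derived only from the adaptive excitation vector and the pitch period, a decoder holding the same adaptive excitation vector could repeat the computation without receiving the positions explicitly.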

7. A speech decoding apparatus for decoding spectral envelope information and excitation signal information encoded for each frame and generating decoded speech using the decoded parameters, comprising: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame; adaptive excitation decoding means for decoding an adaptive excitation vector and a gain for the adaptive excitation vector; pitch position extracting means for extracting, as pitch positions, a plurality of feature points arranged at pitch-period intervals in the decoded adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; pitch synchronization means for aligning a predetermined position of the decoded driving excitation vector with the pitch position obtained by the pitch position extracting means, and repeating the driving excitation vector at the pitch period to generate a pitch-synchronized driving excitation vector; driving excitation decoding means for decoding the driving excitation vector and a gain for the pitch-synchronized driving excitation vector; and excitation vector generation means for decoding an excitation vector from the adaptive excitation vector and the gain for the adaptive excitation vector, and from the pitch-synchronized driving excitation vector and the gain for the pitch-synchronized driving excitation vector.

8. A speech coding/decoding method for transmitting input speech by dividing it into spectral envelope information and excitation signal information for each frame divided at regular intervals, wherein: a first adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame, is provided on the transmitting side, and a corresponding second adaptive excitation codebook is provided on the receiving side; a first driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors is provided on the transmitting side, and a corresponding second driving excitation codebook is provided on the receiving side; the transmitting side selects the optimum adaptive excitation vector for the input speech from the first adaptive excitation codebook and determines its gain, extracts a plurality of pitch positions from the input speech or the adaptive excitation vector, creates pitch-synchronized driving excitation vectors by synchronizing a predetermined position of each driving excitation vector of the first driving excitation codebook with the pitch positions, selects the optimum pitch-synchronized driving excitation vector and determines its gain, and encodes and transmits the adaptive excitation vector, the pitch-synchronized driving excitation vector, and the gains for the adaptive excitation vector and the pitch-synchronized driving excitation vector; and the receiving side generates an adaptive excitation vector from the second adaptive excitation codebook according to the received code, decodes a plurality of pitch positions from the received code or the adaptive excitation vector, generates a driving excitation vector from the second driving excitation codebook, creates a pitch-synchronized driving excitation vector by synchronizing a predetermined position of the driving excitation vector with the pitch positions, decodes the gain of the pitch-synchronized driving excitation vector, and decodes the excitation vector from the adaptive excitation vector, the pitch-synchronized driving excitation vector, and their respective gains.
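
The transmitting-side step of claim 8 in which the optimum pitch-synchronized driving excitation vector is selected and its gain determined can be sketched as a least-squares codebook search. The claim only requires that the optimum vector be selected; the squared-error criterion, the sequential search against a residual target, and the `synthesize` callable standing in for the synthesis filter are conventional analysis-by-synthesis choices assumed here for illustration, and all names are hypothetical.

```python
def search_codebook(target, candidates, synthesize=lambda v: v):
    """Return (index, gain, distortion) of the best candidate vector.

    target     : signal still to be matched (e.g. input speech minus the
                 contribution of the adaptive excitation), an assumption.
    candidates : pitch-synchronized driving excitation vectors to try.
    synthesize : hypothetical stand-in for the synthesis filter; the
                 default leaves the vector unchanged.
    """
    best = None
    for index, candidate in enumerate(candidates):
        y = synthesize(candidate)
        energy = sum(s * s for s in y)
        if energy == 0.0:
            continue  # an all-zero candidate cannot reduce the distortion
        correlation = sum(t * s for t, s in zip(target, y))
        gain = correlation / energy  # least-squares optimal gain
        distortion = sum((t - gain * s) ** 2 for t, s in zip(target, y))
        if best is None or distortion < best[2]:
            best = (index, gain, distortion)
    if best is None:
        raise ValueError("no usable candidate vector")
    return best


index, gain, distortion = search_codebook(
    [2.0, 0.0, -2.0],
    [[0.0, 1.0, 0.0], [1.0, 0.0, -1.0]])
print(index, gain, distortion)
```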

9. A speech coding apparatus for coding input speech by dividing it into spectral envelope information and excitation signal information for each frame divided at regular intervals, comprising: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame; adaptive excitation coding means for obtaining the adaptive excitation vector that minimizes distortion between the input speech and the decoded speech generated using the adaptive excitation vector as the excitation signal, together with a gain for the adaptive excitation vector, and for encoding the adaptive excitation vector and the gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; control means for selecting and designating, from among a plurality of gain quantization tables prepared in advance, the gain quantization table to be used for the driving excitation vector according to power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain; and driving excitation coding means for obtaining the driving excitation vector that minimizes distortion between the input speech and the decoded speech generated using a linear sum of the adaptive excitation vector and the driving excitation vector as the excitation signal, together with a gain for the driving excitation vector, and for encoding the driving excitation vector and, using the designated gain quantization table, the gain for the driving excitation vector.
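
By way of example only, the control means of claim 9 might select a gain quantization table as in the following Python sketch. The table contents, the power thresholds, the mean-square definition of power, and the rule that higher adaptive excitation power maps to a table with larger gains are all hypothetical values and assumptions chosen for the illustration; the claim specifies only that the table is selected according to the power information.

```python
# Hypothetical gain quantization tables, ordered from low to high power.
GAIN_TABLES = [
    [0.0, 0.05, 0.1, 0.2],
    [0.0, 0.25, 0.5, 1.0],
    [0.0, 1.0, 2.0, 4.0],
]
# Hypothetical power thresholds separating the three tables.
POWER_THRESHOLDS = [0.01, 1.0]


def adaptive_power(adaptive_vector, adaptive_gain):
    """Mean-square power of the gain-scaled adaptive excitation signal."""
    return sum((adaptive_gain * s) ** 2 for s in adaptive_vector) / len(adaptive_vector)


def select_gain_table(power):
    """Pick a table by counting how many thresholds the power reaches."""
    return GAIN_TABLES[sum(power >= t for t in POWER_THRESHOLDS)]


def encode_gain(gain, table):
    """Index of the table entry nearest to the driving excitation gain."""
    return min(range(len(table)), key=lambda i: abs(table[i] - gain))


def decode_gain(index, table):
    """Look the quantized gain back up in the same table."""
    return table[index]


table = select_gain_table(adaptive_power([1.0, -1.0, 1.0, -1.0], 2.0))
code = encode_gain(1.8, table)
print(code, decode_gain(code, table))
```

Since the table is derived from the adaptive excitation vector and its gain, both of which are available to the decoder, the selection in this sketch needs no additional transmitted information.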

10. A speech decoding apparatus for decoding spectral envelope information and excitation signal information encoded for each frame and generating decoded speech using the decoded parameters, comprising: an adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame; adaptive excitation decoding means for decoding an adaptive excitation vector and a gain for the adaptive excitation vector; a driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors; control means for designating the gain quantization table to be used for the driving excitation vector based on power information of the adaptive excitation signal obtained from the adaptive excitation vector and the gain for the adaptive excitation vector; and driving excitation decoding means for decoding the driving excitation vector and, using the designated gain quantization table, the gain for the driving excitation vector.

11. A speech coding/decoding method for transmitting input speech by dividing it into spectral envelope information and excitation signal information for each frame divided at regular intervals, wherein: a first adaptive excitation codebook that stores adaptive excitation vectors, which are information on the excitation signal of a preceding frame, is provided on the transmitting side, and a corresponding second adaptive excitation codebook is provided on the receiving side; a first driving excitation codebook that stores a plurality of excitation signals prepared in advance as driving excitation vectors is provided on the transmitting side, and a corresponding second driving excitation codebook is provided on the receiving side; the transmitting side selects the optimum adaptive excitation vector for the input speech from the first adaptive excitation codebook and determines its gain, changes the gain quantization table of the driving excitation vector according to power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain, selects the optimum driving excitation vector from the first driving excitation codebook and determines its gain, and encodes and transmits the adaptive excitation vector, the driving excitation vector, and the gains for the adaptive excitation vector and the driving excitation vector; and the receiving side generates an adaptive excitation vector from the second adaptive excitation codebook according to the received code, decodes the gain of the adaptive excitation vector, changes the gain quantization table of the driving excitation vector according to power information of the adaptive excitation signal obtained from the adaptive excitation vector and its gain, generates a driving excitation vector from the second driving excitation codebook, decodes the gain of the driving excitation vector, and decodes the excitation vector from the adaptive excitation vector, the driving excitation vector, and their respective gains.