JP3024455B2

JP3024455B2 - Audio encoding device and audio decoding device

Info

Publication number: JP3024455B2
Application number: JP5240135A
Authority: JP
Inventors: 正山浦; 真哉高橋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1992-09-29
Filing date: 1993-09-27
Publication date: 2000-03-21
Anticipated expiration: 2015-03-21
Also published as: JPH06202699A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a speech encoding/decoding apparatus that vector-quantizes the excitation signal information of speech, and is used, for example, when speech is transmitted or stored digitally.

[0002]

[Prior Art] FIG. 8 shows a conventional speech encoding/decoding apparatus that separates speech into spectral envelope information and excitation signal information and vector-quantizes the excitation signal information. FIG. 8 is similar to the apparatus shown in W. B. Kleijn, D. J. Krasinski, and R. H. Ketchum, "Improved Speech Quality and Efficient Vector Quantization in SELP" (ICASSP'88, pp. 155-158, 1988). In the figure, 1 is an encoding unit, 2 is a decoding unit, 3 is a transmission path, 4 is input speech, and 5 is output speech. 6 is spectrum analysis means, 7 is spectrum parameter encoding means, 8 is adaptive excitation encoding means, and 9 and 14 are adaptive excitation codebooks. 10 is driving excitation encoding means, and 11 and 16 are driving excitation codebooks. 12 and 17 are excitation vector generation means, 13 is adaptive excitation decoding means, 15 is driving excitation decoding means, 18 is spectrum parameter decoding means, and 19 is a synthesis filter.

[0003] The operation of the conventional speech encoding/decoding apparatus is described below. First, the operation of the encoding unit 1 is described. The spectrum analysis means 6 analyzes the input speech 4 and extracts spectrum parameters. The spectrum parameter encoding means 7 quantizes the spectrum parameters, outputs the corresponding code to the decoding unit 2 via the transmission path 3, and outputs the quantized spectrum parameters to the adaptive excitation encoding means 8 and the driving excitation encoding means 10. The adaptive excitation codebook 9 stores the excitation signal obtained in the preceding frame and, taking the pitch information as the code, outputs an adaptive excitation vector obtained by repeating that excitation signal at the pitch period in the current frame. FIG. 9 shows an example of the excitation signal in the preceding frame and the adaptive excitation vector obtained from it. When the pitch period is shorter than the frame length, the adaptive excitation vector is the excitation signal repeated at the pitch period within the current frame, as shown in FIG. 9(b).
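As an illustration of the repetition described in paragraph [0003], the following Python sketch builds one adaptive excitation vector from the stored excitation. The function name, the list representation, and the convention that the stored excitation ends at the sample just before the current frame are assumptions made for this example, not details taken from the patent.

```python
def adaptive_excitation_vector(past_excitation, pitch_period, frame_length):
    """Repeat the last pitch period of the past excitation over one frame.

    Hypothetical helper: past_excitation is a list whose final element is
    the sample immediately preceding the current frame.
    """
    if not 0 < pitch_period <= len(past_excitation):
        raise ValueError("pitch period must lie within the stored excitation")
    last_period = past_excitation[-pitch_period:]
    # If pitch_period < frame_length the index wraps, so the segment repeats;
    # otherwise only the first frame_length samples of the segment are used.
    return [last_period[n % pitch_period] for n in range(frame_length)]
```

Each candidate pitch period yields one candidate vector, which is how a single stored excitation signal can serve as a codebook indexed by pitch.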

[0004] The adaptive excitation encoding means 8 synthesizes M synthesized speech vectors A_i (i = 1, ..., M) using the M adaptive excitation vectors a_i (i = 1, ..., M) input from the adaptive excitation codebook 9 and the spectrum parameters input from the spectrum parameter encoding means 7. It then obtains the inter-vector distance D_i between each A_i and the input speech vector X, which is cut out frame by frame from the input speech 4, for example according to equation (1). Here, the gain β_i is determined so that the distance D_i is minimized, for example according to equation (2). (Note that t denotes transposition.)

[0005]

[Math. 1] (equations (1) and (2); the equation image is not reproduced in this text)
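The image holding equations (1) and (2) did not survive extraction. A form consistent with the surrounding description (a distance between X and the gain-scaled A_i, and the gain that minimizes it, with t denoting transposition) would be the following. This is a reconstruction under the assumption that the distance is the squared Euclidean norm, not the patent's own typesetting.

```latex
D_i = \lVert X - \beta_i A_i \rVert^2 \qquad (1)

\beta_i = \frac{X^t A_i}{A_i^t A_i} \qquad (2)
```

Under that assumption, (2) follows from setting the derivative of (1) with respect to β_i to zero.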

[0006] Next, the vector A_I that minimizes this inter-vector distance is searched for. Its code I and the code of the gain β_qI, obtained by quantizing the gain β_I, are output to the decoding unit 2 via the transmission path 3. In addition, the selected adaptive excitation vector a_I and its gain β_qI are output to the excitation vector generation means 12, and the error vector X' = X - β_qI A_I is output to the driving excitation encoding means 10.
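The search in paragraphs [0004] to [0006] can be sketched in Python as follows. The sketch assumes a squared Euclidean distance and the least-squares gain, takes the synthesized vectors as already computed, and omits gain quantization. The function names are hypothetical, and indices are 0-based here whereas the patent counts from 1.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, synthesized):
    """Return (index, gain, distance) for the best gain-scaled match.

    Assumed criterion: distance = |target - gain * vec|^2 with the gain
    chosen per vector to minimize that distance.
    """
    best = None
    for i, vec in enumerate(synthesized):
        energy = dot(vec, vec)
        if energy == 0:
            continue  # an all-zero vector cannot be scaled to fit
        corr = dot(target, vec)
        gain = corr / energy
        distance = dot(target, target) - gain * corr
        if best is None or distance < best[2]:
            best = (i, gain, distance)
    return best


def error_vector(target, vec, quantized_gain):
    """X' = X - gain * A, the target left for the next codebook stage."""
    return [x - quantized_gain * a for x, a in zip(target, vec)]
```

The same routine could serve the driving excitation stage by passing the error vector X' as the target and the vectors C_j as the candidates.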

[0007] The driving excitation codebook 11 stores N driving excitation vectors generated, for example, from random noise. The driving excitation encoding means 10 synthesizes N synthesized speech vectors C_j (j = 1, ..., N) using the driving excitation vectors c_j (j = 1, ..., N) input from the driving excitation codebook 11 and the spectrum parameters input from the spectrum parameter encoding means 7. It then obtains the inter-vector distance D_j between each C_j and the error vector X' input from the adaptive excitation encoding means 8, for example according to equation (3). Here, the gain γ_j is determined so that the distance D_j is minimized, for example according to equation (4).

[0008]

[Math. 2] (equations (3) and (4); the equation image is not reproduced in this text)
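The image holding equations (3) and (4) is likewise missing. By analogy with the description of the adaptive excitation stage, and again assuming a squared Euclidean distance, a consistent form would be the following reconstruction, with the error vector X' taking the place of X.

```latex
D_j = \lVert X' - \gamma_j C_j \rVert^2 \qquad (3)

\gamma_j = \frac{X'^t C_j}{C_j^t C_j} \qquad (4)
```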

[0009] Next, the vector C_J that minimizes this inter-vector distance is searched for. Its code J and the code of the gain γ_qJ, obtained by quantizing the gain γ_J, are output to the decoding unit 2 via the transmission path 3, and the selected driving excitation vector c_J and its gain γ_qJ are output to the excitation vector generation means 12.

[0010] The excitation vector generation means 12 generates the excitation vector β_qI a_I + γ_qJ c_J from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation encoding means 8 and the driving excitation vector c_J and its gain γ_qJ input from the driving excitation encoding means 10, and outputs it to the adaptive excitation codebook 9.
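A minimal Python sketch of the excitation generation in paragraph [0010] follows. The linear combination is taken from the text. The way the adaptive excitation codebook absorbs the new frame (a fixed-length buffer that drops its oldest samples) is an assumption for illustration, and both function names are hypothetical.

```python
def generate_excitation(a, beta_q, c, gamma_q):
    """Excitation vector beta_q * a + gamma_q * c, sample by sample."""
    return [beta_q * ai + gamma_q * ci for ai, ci in zip(a, c)]


def update_adaptive_memory(memory, excitation):
    """Append the new frame and keep the buffer length unchanged.

    Assumed behavior: the oldest samples are discarded; memory is non-empty.
    """
    return (memory + excitation)[-len(memory):]
```

Because the decoder applies the same update to its own codebook, both sides stay in step without transmitting the excitation itself.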

[0011] Next, the operation of the decoding unit 2 is described. Based on the code I of the adaptive excitation vector input from the encoding unit 1, the adaptive excitation decoding means 13 reads the adaptive excitation vector a_I from the adaptive excitation codebook 14, which stores the same adaptive excitation vectors as the adaptive excitation codebook 9. It also decodes the gain β_qI from the code of the gain for the adaptive excitation vector input from the encoding unit 1, and outputs the adaptive excitation vector a_I and its gain β_qI to the excitation vector generation means 17. Likewise, based on the code J of the driving excitation vector input from the encoding unit 1, the driving excitation decoding means 15 reads the driving excitation vector c_J from the driving excitation codebook 16, which stores the same driving excitation vectors as the driving excitation codebook 11. It also decodes the gain γ_qJ from the code of the gain for the driving excitation vector input from the encoding unit 1, and outputs the driving excitation vector c_J and its gain γ_qJ to the excitation vector generation means 17.

[0012] The excitation vector generation means 17 generates the excitation vector β_qI a_I + γ_qJ c_J from the adaptive excitation vector a_I and its gain β_qI input from the adaptive excitation decoding means 13 and the driving excitation vector c_J and its gain γ_qJ input from the driving excitation decoding means 15, and outputs it to the adaptive excitation codebook 14 and the synthesis filter 19.

[0013] The spectrum parameter decoding means 18 decodes the spectrum parameters based on the spectrum parameter code input from the encoding unit 1 and outputs them to the synthesis filter 19. The synthesis filter 19 synthesizes the output speech 5 using the excitation vector input from the excitation vector generation means 17 and the spectrum parameters input from the spectrum parameter decoding means 18.
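The text does not say what form the spectrum parameters or the synthesis filter 19 take. As one common possibility, the Python sketch below assumes linear prediction coefficients driving an all-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] z^-k. That choice, the sign convention, and the function name are assumptions, not statements about the patent.

```python
def synthesis_filter(excitation, lpc, state=None):
    """All-pole synthesis: s[n] = e[n] - sum_k lpc[k-1] * s[n-k].

    state holds previous outputs, most recent first, so that filtering can
    continue across frame boundaries. Returns (output, new_state).
    """
    order = len(lpc)
    history = list(state) if state is not None else [0.0] * order
    out = []
    for e in excitation:
        s = e - sum(a * h for a, h in zip(lpc, history))
        out.append(s)
        if order:
            history = [s] + history[:-1]  # shift in the newest output
    return out, history
```

Returning the filter state lets a caller process consecutive frames without discontinuities at the frame edges.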

[0014]

[Problems to Be Solved by the Invention] Speech signals comprise voiced and unvoiced sounds, which are characterized respectively by an excitation signal that has pulse-like components at the pitch period and by one that is white noise with no periodicity. How well the pitch periodicity of voiced sounds is reproduced has a large effect on the quality of the synthesized speech. In the conventional speech encoding/decoding apparatus described above, the pulse-like components at the pitch period of the excitation signal in voiced sounds are generated by using the adaptive excitation vector made from the excitation signal of the preceding frame. However, because the adaptive excitation vector consists of a plurality of samples, it cannot actively generate the pulse-like components that voiced sounds require. Consequently, a suitable excitation cannot be generated in transitions from unvoiced to voiced sounds, and sound quality degrades. Moreover, even when pulse-like components are generated, the excitation is generated together with the other samples, so the optimal pulse sequence is not necessarily obtained. This leads to a loss of pitch periodicity and likewise degrades sound quality.

[0015] Furthermore, the excitation signal is generated as a linear sum of the adaptive excitation vector and a driving excitation vector that takes no account of pitch periodicity. Even if the adaptive excitation vector produces pitch periodicity, reducing the amount of transmitted information requires lengthening the frame, that is, the vector length, or reducing the size of the driving excitation codebook. In that case the pitch periodicity of the generated excitation signal is disturbed and sound quality degrades.

[0016] The present invention has been made to solve these problems. Its object is to synthesize high-quality decoded speech in which the pitch periodicity of the excitation signal is little disturbed even at a small amount of transmitted information. It does so by generating, in addition to the adaptive excitation vector, an excitation vector consisting of a periodic pulse train that can be determined uniquely from that adaptive excitation vector and using it to generate the excitation signal, and by using the driving excitation vector repeatedly in synchronization with the pitch period.

[0017]

[Means for Solving the Problems] A speech encoding apparatus according to the present invention comprises spectrum analysis encoding means, an adaptive excitation codebook, periodic pulse excitation generation means, multi-vector adaptive excitation encoding means, and excitation vector generation means, and encodes input speech for each frame divided at fixed intervals. The spectrum analysis encoding means extracts and encodes spectrum parameters, which are the spectral envelope information of the input speech. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame. The periodic pulse excitation generation means outputs a periodic pulse excitation vector determined as the principal component of the adaptive excitation vector output from the adaptive excitation codebook. The multi-vector adaptive excitation encoding means determines and encodes the adaptive excitation vector, the first gain, and the second gain so as to minimize the distortion between the input speech and the decoded speech generated from the encoded spectrum parameters, one of the plurality of adaptive excitation vectors, a first gain applied to that adaptive excitation vector, the periodic pulse excitation vector, and a second gain applied to that periodic pulse excitation vector. The excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the periodic pulse excitation vector.
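Paragraph [0017] states only that the periodic pulse excitation vector is determined as the principal component of the adaptive excitation vector. The Python sketch below shows one way such a vector could be derived without extra transmitted information: take the largest-magnitude sample in the first pitch period and keep only the samples at that position and at every pitch period after it. This rule and the function name are assumptions for illustration, not the patent's stated procedure.

```python
def periodic_pulse_vector(adaptive_vector, pitch_period):
    """Keep one pulse per pitch period, zeroing all other samples.

    Assumed rule: the pulse position is the largest-magnitude sample in the
    first pitch period of the adaptive excitation vector.
    """
    if pitch_period <= 0 or not adaptive_vector:
        raise ValueError("need a positive pitch period and a non-empty vector")
    first_period = adaptive_vector[:pitch_period]
    peak = max(range(len(first_period)), key=lambda n: abs(first_period[n]))
    pulses = [0.0] * len(adaptive_vector)
    for n in range(peak, len(adaptive_vector), pitch_period):
        pulses[n] = adaptive_vector[n]
    return pulses
```

Because the pulse train depends only on the adaptive excitation vector and the pitch period, a decoder holding the same codebook could rebuild it from the adaptive excitation code alone.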

[0018] A decoding apparatus according to the invention of claim 2 comprises spectrum parameter decoding means, multi-vector adaptive excitation decoding means, an adaptive excitation codebook, periodic pulse excitation generation means, excitation vector generation means, and a synthesis filter, and is a speech decoding apparatus that generates a decoded signal from a spectrum parameter code, an adaptive excitation vector code, and a gain code encoded for each frame divided at fixed intervals. The spectrum parameter decoding means receives the spectrum parameter code and decodes the spectrum parameters from it. The multi-vector adaptive excitation decoding means receives the adaptive excitation vector code and the gain code, outputs the adaptive excitation vector code to the adaptive excitation codebook, and decodes the first gain and the second gain from the gain code. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame, and outputs, from among the stored adaptive excitation vectors, the one corresponding to the adaptive excitation vector code output by the multi-vector adaptive excitation decoding means. The periodic pulse excitation generation means outputs a periodic pulse excitation vector determined as the principal component of the adaptive excitation vector output from the adaptive excitation codebook. The excitation vector generation means adds at least the adaptive excitation vector to which the first gain has been applied and the periodic pulse excitation vector to which the second gain has been applied, and outputs the sum as the excitation signal. The synthesis filter generates the decoded signal from the excitation signal using the decoded spectrum parameters.

[0019] A speech encoding apparatus according to the invention of claim 3 comprises spectrum analysis encoding means, pitch position extraction means, an adaptive excitation codebook, a driving excitation codebook, pitch synchronization means, adaptive excitation encoding means, driving excitation encoding means, and excitation vector generation means, and encodes input speech for each frame divided at fixed intervals. The spectrum analysis encoding means extracts and encodes spectrum parameters, which are the spectral envelope information of the input speech. The pitch position extraction means extracts from the input speech, as pitch positions, a plurality of feature points lined up at pitch-period intervals, and encodes them. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame. The driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position. The pitch synchronization means aligns the pitch synchronization position of each driving excitation vector with the encoded pitch positions and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period. The adaptive excitation encoding means generates synthesized speech vectors of the adaptive excitation vectors based on the encoded spectrum parameters, determines the adaptive excitation vector and the first gain so as to minimize the distortion between the input speech and the synthesized speech vector of the adaptive excitation vector to which the first gain is applied, outputs the code of the determined adaptive excitation vector, and encodes the first gain. The driving excitation encoding means generates synthesized speech vectors of the pitch-synchronized driving excitation vectors based on the encoded spectrum parameters. Taking as the decoded signal vector the sum of the synthesized speech vector of the determined adaptive excitation vector with the determined first gain applied and the synthesized speech vector of the pitch-synchronized driving excitation vector with a second gain applied, it determines the synthesized speech vector of the pitch-synchronized driving excitation vector and the second gain so as to minimize the distortion between the input speech and the decoded signal vector, outputs the code of the determined pitch-synchronized driving excitation vector, and encodes the determined second gain. The excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the pitch-synchronized driving excitation vector.
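The pitch synchronization step of paragraph [0019] (aligning a driving excitation vector's pitch synchronization position with a pitch position, then repeating the vector at the pitch period) can be sketched in Python as follows. The sketch assumes that only the first pitch period of the codebook vector is repeated and that positions are 0-based sample indices; these conventions and the function name are illustrative assumptions.

```python
def pitch_synchronize(c, sync_pos, pitch_pos, pitch_period, frame_length):
    """Repeat one period of c so that c[sync_pos] lands on pitch_pos.

    c            driving excitation vector (at least pitch_period samples)
    sync_pos     pitch synchronization position within c, 0 <= sync_pos < pitch_period
    pitch_pos    any one pitch position in the frame (sample index)
    """
    if not 0 < pitch_period <= len(c):
        raise ValueError("pitch period must fit inside the codebook vector")
    if not 0 <= sync_pos < pitch_period:
        raise ValueError("sync position must fall within one pitch period")
    segment = c[:pitch_period]
    # Python's % returns a non-negative result, so samples before pitch_pos
    # wrap around correctly as well.
    return [segment[(n - pitch_pos + sync_pos) % pitch_period]
            for n in range(frame_length)]
```

With this alignment the sample marked by the synchronization position recurs at every pitch position in the frame, so the driving excitation contributes at the pitch period instead of disturbing it.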

[0020] A speech encoding apparatus according to the invention of claim 4 comprises spectrum analysis encoding means, an adaptive excitation codebook, a driving excitation codebook, pitch synchronization means, adaptive excitation encoding means, driving excitation encoding means, and excitation vector generation means, and encodes input speech for each frame divided at fixed intervals. The spectrum analysis encoding means extracts and encodes spectrum parameters, which are the spectral envelope information of the input speech. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame. The driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position. The pitch synchronization means aligns the pitch synchronization position of a driving excitation vector with pitch positions, which are a plurality of points lined up at pitch-period intervals on the time axis, and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period. The adaptive excitation encoding means generates synthesized speech vectors of the adaptive excitation vectors based on the encoded spectrum parameters, determines the adaptive excitation vector and the first gain so as to minimize the distortion between the input speech and the synthesized speech vector of the adaptive excitation vector to which the first gain is applied, outputs the code of the determined adaptive excitation vector, and encodes the determined first gain. The driving excitation encoding means generates synthesized speech vectors of the pitch-synchronized driving excitation vectors based on the encoded spectrum parameters. Taking as the decoded signal vector the sum of the synthesized speech vector of the determined adaptive excitation vector with the determined first gain applied and the synthesized speech vector of the pitch-synchronized driving excitation vector with a second gain applied, it determines the synthesized speech vector of the pitch-synchronized driving excitation vector, the pitch position, and the second gain so as to minimize the distortion between the input speech and the decoded signal vector, outputs the code of the determined pitch-synchronized driving excitation vector, and encodes the determined pitch position and second gain. The excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the pitch-synchronized driving excitation vector.

[0021] A speech decoding apparatus according to the invention of claim 5 comprises spectrum parameter decoding means, adaptive excitation decoding means, an adaptive excitation codebook, a driving excitation codebook, pitch synchronization means, driving excitation decoding means, excitation vector generation means, and a synthesis filter, and generates a decoded signal from a spectrum parameter code, an adaptive excitation vector code, a driving excitation vector code, a gain code, and a pitch position code encoded for each frame divided at fixed intervals. The spectrum parameter decoding means receives the spectrum parameter code and decodes the spectrum parameters from it. The adaptive excitation decoding means receives the adaptive excitation vector code and the code of the first gain to be applied to that adaptive excitation vector, outputs the adaptive excitation vector code to the adaptive excitation codebook, and decodes the first gain from the first gain code. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame, and outputs, from among the stored adaptive excitation vectors, the one corresponding to the adaptive excitation vector code output by the adaptive excitation decoding means. The driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position, and outputs to the pitch synchronization means, from among the stored driving excitation vectors, the one corresponding to the driving excitation vector code output by the driving excitation decoding means. The pitch synchronization means receives the pitch position code, decodes the pitch position from it, aligns the pitch synchronization position with the decoded pitch position, and generates a pitch-synchronized driving excitation vector in which the driving excitation vector input from the driving excitation codebook is repeated at the pitch period. The driving excitation decoding means receives the driving excitation vector code and the code of the second gain to be applied to that driving excitation vector, outputs the driving excitation vector code to the driving excitation codebook, and decodes the second gain from the second gain code. The excitation vector generation means outputs an excitation signal that is the sum of the adaptive excitation vector to which the first gain has been applied and the pitch-synchronized driving excitation vector to which the second gain has been applied. The synthesis filter generates the decoded signal from the excitation signal using the decoded spectrum parameters.

[0022] A speech encoding apparatus according to the invention of claim 6 comprises spectrum analysis encoding means, an adaptive excitation codebook, pitch position extraction means, a driving excitation codebook, pitch synchronization means, adaptive excitation encoding means, driving excitation encoding means, and excitation vector generation means, and encodes input speech for each frame divided at fixed intervals. The spectrum analysis encoding means extracts and encodes spectrum parameters, which are the spectral envelope information of the input speech signal. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame. The pitch position extraction means extracts from the adaptive excitation vector, as pitch positions, a plurality of feature points lined up at pitch-period intervals. The driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position. The pitch synchronization means aligns the pitch synchronization position of each driving excitation vector with the extracted pitch positions and generates a pitch-synchronized driving excitation vector in which the driving excitation vector is repeated at the pitch period. The adaptive excitation encoding means generates synthesized speech vectors of the adaptive excitation vectors based on the encoded spectrum parameters, determines the adaptive excitation vector and the first gain so as to minimize the distortion between the input speech and the synthesized speech vector of the adaptive excitation vector to which the first gain is applied, outputs the code of the determined adaptive excitation vector, and encodes the first gain. The driving excitation encoding means generates synthesized speech vectors of the pitch-synchronized driving excitation vectors based on the encoded spectrum parameters. Taking as the decoded signal vector the sum of the synthesized speech vector of the determined adaptive excitation vector with the determined first gain applied and the synthesized speech vector of the pitch-synchronized driving excitation vector with a second gain applied, it determines the synthesized speech vector of the pitch-synchronized driving excitation vector and the second gain so as to minimize the distortion between the input speech and the decoded signal vector, outputs the code of the determined pitch-synchronized driving excitation vector, and encodes the determined second gain. The excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the pitch-synchronized driving excitation vector.
【００２３】[0023] The speech decoding apparatus according to the invention of claim 7 comprises spectrum parameter decoding means, adaptive excitation decoding means, an adaptive excitation codebook, pitch position extracting means, a driving excitation codebook, pitch synchronization means, driving excitation decoding means, excitation vector generating means, and a synthesis filter. It generates a decoded signal from the code of the spectrum parameters, the code of the adaptive excitation vector, the code of the driving excitation vector, and the codes of the gains, all of which are encoded for each frame of fixed length. The spectrum parameter decoding means receives the code of the spectrum parameters and decodes the spectrum parameters from it. The adaptive excitation decoding means receives the code of the adaptive excitation vector and the code of the first gain to be applied to that vector, outputs the code of the adaptive excitation vector to the adaptive excitation codebook, and decodes the first gain from its code. The adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generating means in preceding frames, and outputs, from among the stored vectors, the adaptive excitation vector corresponding to the code output by the adaptive excitation decoding means. The pitch position extracting means extracts, as pitch positions, a plurality of feature points arranged at pitch-period intervals in the adaptive excitation vector. The driving excitation codebook stores a plurality of driving excitation vectors, each having a predetermined pitch synchronization position, and outputs to the pitch synchronization means the stored driving excitation vector corresponding to the code output by the driving excitation decoding means. The pitch synchronization means aligns the pitch synchronization position of the driving excitation vector with the extracted pitch positions and generates a pitch-synchronous driving excitation vector in which the driving excitation vector input from the driving excitation codebook is repeated at the pitch period. The driving excitation decoding means receives the code of the driving excitation vector and the code of the second gain to be applied to that vector, outputs the code of the driving excitation vector to the driving excitation codebook, and decodes the second gain from its code. The excitation vector generating means outputs an excitation signal that is the sum of the adaptive excitation vector multiplied by the first gain and the pitch-synchronous driving excitation vector multiplied by the second gain. The synthesis filter generates the decoded signal from the excitation signal using the decoded spectrum parameters.

【００２４】[0024]

[Operation] In the speech encoding apparatus of this invention, a periodic pulse excitation vector is newly generated for each adaptive excitation vector, each of the two is multiplied by a gain and synthesized, and the vectors and gains that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 2, an excitation vector having periodicity is generated from the transmitted codes and gains, and the excitation vector is decoded from it and the adaptive excitation vector. In the speech encoding apparatus of the inventions of claims 3 and 4, the driving excitation vector is repeated periodically, and the vector and gain that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 5, the driving excitation vector is generated periodically from the transmitted code, and the excitation vector is decoded from it and the adaptive excitation vector. In the speech encoding apparatus of the invention of claim 6, feature points determined by the pitch period are extracted from the adaptive excitation vector as pitch positions, the driving excitation vector is repeated periodically according to them, and the vector and gain that minimize the distortion relative to the input speech are selected and encoded. In the speech decoding apparatus of the invention of claim 7, the pitch positions are extracted using the transmitted codes, the received spectrum parameters, and the gains; the driving excitation vector is generated periodically according to them; and the excitation vector is decoded from it and the adaptive excitation vector.

【００２５】[0025]

[Embodiments] Embodiment 1. FIG. 1 is a block diagram of an embodiment of the invention of claim 1. In FIG. 1, parts identical to those in FIG. 8 bear the same reference numerals, and their description is omitted. The new parts in FIG. 1 are multi-vector adaptive excitation coding means 20, periodic pulse excitation generating means 21 and 23, and multi-vector adaptive excitation decoding means 22. The division of roles between the vocal tract part, consisting of spectrum analysis means 6 and spectrum parameter coding means 7, and the vocal cord part, consisting of multi-vector adaptive excitation coding means 20, driving excitation coding means 10, and excitation vector generating means 12, is the same as in the conventional example. This embodiment improves the vocal cord part, in particular its pitch periodicity, and thereby the quality of the synthesized speech.

【００２６】[0026] The operation of this embodiment is described below, starting with encoding unit 1. From each adaptive excitation vector a_i input from adaptive excitation codebook 9, periodic pulse excitation generating means 21 generates a periodic pulse excitation vector p_i, which is a periodic unit impulse train that is 1 only in the dimensions where a_i takes its maximum amplitude and 0 in all other dimensions, and outputs p_i to multi-vector adaptive excitation coding means 20. Adaptive excitation codebook 9 stores the sampled sequence values of the input up to the previous frame, and multi-vector adaptive excitation coding means 20 selects from them the vector that yields the synthesized speech closest to the input speech of the current frame. The vectors stored in adaptive excitation codebook 9, however, do not preserve periodicity sufficiently. Newly creating the periodic pulse excitation vector p_i reinforces the periodicity.

【００２７】[0027] FIG. 2 shows an example of an adaptive excitation vector (a) and the periodic pulse excitation vector (b) generated from it in this embodiment. In other words, among the many stored vectors, the one having the largest amplitude is output in unit-amplitude form. This emphasizes the periodicity that is important for speech reproduction.
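The generation of the periodic pulse excitation vector can be illustrated with a minimal Python sketch. It assumes that the pitch period of the adaptive excitation vector is already known in samples and that pulses are placed at the largest-magnitude sample and at every pitch period before and after it within the frame. The function name `periodic_pulse_vector` and its parameters are hypothetical and do not come from the patent.

```python
def periodic_pulse_vector(a, pitch_period):
    """Return a unit impulse train aligned with the largest-magnitude sample of a.

    a            : adaptive excitation vector (list of floats)
    pitch_period : assumed pitch period in samples (hypothetical parameter)
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    # Index of the sample with the largest absolute amplitude.
    peak = max(range(len(a)), key=lambda n: abs(a[n]))
    # Earliest position in the frame that has the same phase as the peak.
    start = peak % pitch_period
    p = [0.0] * len(a)
    for n in range(start, len(a), pitch_period):
        p[n] = 1.0
    return p


a = [0.1, -0.2, 0.9, 0.3, -0.1, 0.2, 0.8, 0.1, 0.0, -0.3]
p = periodic_pulse_vector(a, pitch_period=4)
```

In this example the peak lies at index 2, so with a pitch period of 4 the pulses fall at indices 2 and 6.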

【００２８】[0028] Using the spectrum parameters input from spectrum parameter coding means 7, multi-vector adaptive excitation coding means 20 synthesizes the synthesized speech vectors A_i and P_i from the adaptive excitation vector a_i input from adaptive excitation codebook 9 and the periodic pulse excitation vector p_i input from periodic pulse excitation generating means 21, respectively. It then obtains the inter-vector distance D_i to the input speech vector X, which is cut out frame by frame from input speech 4, for example according to equation (5). Here the gains β_Ai and β_Pi are determined so that the distance D_i is minimized, for example according to equations (6) and (7).

【００２９】[0029]

【数３】 (Equation 3)
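Equations (5) to (7) are not reproduced in this text. As a hedged illustration only, the following Python sketch assumes that the distance is the squared Euclidean error between X and β_A·A + β_P·P and that the two gains are the closed-form least-squares solution of the resulting 2×2 normal equations. The function names `joint_gains` and `distance` are hypothetical, and the patent's exact formulas may differ.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def joint_gains(X, A, P):
    """Gains (beta_a, beta_p) minimizing ||X - beta_a*A - beta_p*P||^2."""
    aa, pp, ap = dot(A, A), dot(P, P), dot(A, P)
    xa, xp = dot(X, A), dot(X, P)
    det = aa * pp - ap * ap
    if abs(det) < 1e-12:
        raise ValueError("A and P are (nearly) collinear; gains are not unique")
    beta_a = (xa * pp - xp * ap) / det
    beta_p = (xp * aa - xa * ap) / det
    return beta_a, beta_p


def distance(X, A, P, beta_a, beta_p):
    """Squared error between X and the weighted sum of A and P."""
    return sum((x - beta_a * a - beta_p * p) ** 2 for x, a, p in zip(X, A, P))


A = [1.0, 0.0, 1.0, 0.0]
P = [0.0, 1.0, 0.0, 0.0]
X = [2.0, 3.0, 2.0, 0.0]          # equals 2*A + 3*P
beta_a, beta_p = joint_gains(X, A, P)
d = distance(X, A, P, beta_a, beta_p)
```

An encoder following this sketch would evaluate the distance for every candidate index i and keep the index with the smallest value.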

【００３０】[0030] Next, the pair of vectors A_I, P_I that minimizes this inter-vector distance is searched for, and the code I and the codes of the gains β_qAI, β_qPI obtained by quantizing the gains β_AI, β_PI are output to decoding unit 2 via transmission path 3. In addition, the selected adaptive excitation vector a_I with its gain β_qAI and the periodic pulse excitation vector p_I with its gain β_qPI are output to excitation vector generating means 12, and the error vector X' = X − β_qAI A_I − β_qPI P_I is output to driving excitation coding means 10. What is new compared with the conventional apparatus is that the single vector A_I is replaced by the pair of vectors A_I and P_I. The operation of driving excitation coding means 10, which further quantizes this vector difference and transmits its code, is the same in this embodiment as in the conventional example: it transmits the code J and the code of γ_qJ to decoding unit 2.

【００３１】[0031] Next, the operation of decoding unit 2 is described. Multi-vector adaptive excitation decoding means 22 outputs the adaptive excitation vector code I received from encoding unit 1 to adaptive excitation codebook 14. Adaptive excitation codebook 14 outputs the adaptive excitation vector a_I corresponding to code I to multi-vector adaptive excitation decoding means 22 and to periodic pulse excitation generating means 23. Periodic pulse excitation generating means 23 generates the periodic pulse excitation vector p_I from the adaptive excitation vector a_I input from adaptive excitation codebook 14 and outputs it to multi-vector adaptive excitation decoding means 22.

【００３２】[0032] Multi-vector adaptive excitation decoding means 22 outputs to excitation vector generating means 17 the adaptive excitation vector a_I input from adaptive excitation codebook 14, the periodic pulse excitation vector p_I input from periodic pulse excitation generating means 23, and the gains β_qAI and β_qPI decoded from the respective gain codes received from encoding unit 1. That is, the set β_qAI, a_I, β_qPI, p_I is supplied to the excitation vector generating means.

【００３３】[0033] Excitation vector generating means 17 generates the following excitation vector from the adaptive excitation vector a_I, the periodic pulse excitation vector p_I, and their respective gains β_qAI and β_qPI input from multi-vector adaptive excitation decoding means 22, together with its other input, the driving excitation vector c_J and its gain γ_qJ input from driving excitation decoding means 15:

β_qAI a_I + β_qPI p_I + γ_qJ c_J

The generated excitation vector is output to adaptive excitation codebook 14 and to synthesis filter 19. The output of synthesis filter 19 is the synthesized, reconstructed speech output.
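The decoder-side combination above can be sketched in a few lines of Python. The weighted sum follows the expression in the text. The synthesis filter is an assumption: the sketch treats the spectrum parameters as direct-form linear prediction coefficients of an all-pole filter with zero initial state, which this text does not specify. The names `build_excitation` and `all_pole_synthesis` are hypothetical.

```python
def build_excitation(a, p, c, beta_a, beta_p, gamma):
    """Excitation vector beta_a*a + beta_p*p + gamma*c, sample by sample."""
    return [beta_a * ai + beta_p * pi + gamma * ci
            for ai, pi, ci in zip(a, p, c)]


def all_pole_synthesis(excitation, lpc):
    """Assumed synthesis filter: y[n] = e[n] - sum_k lpc[k-1] * y[n-k].

    lpc holds hypothetical direct-form coefficients; the filter starts
    from a zero state, so memory from earlier frames is ignored here.
    """
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, coeff in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= coeff * y[n - k]
        y.append(acc)
    return y


a = [1.0, 0.0, 0.0]
p = [0.0, 1.0, 0.0]
c = [0.0, 0.0, 1.0]
excitation = build_excitation(a, p, c, beta_a=0.5, beta_p=2.0, gamma=-1.0)
speech = all_pole_synthesis(excitation, lpc=[-0.5])
```

In a real decoder the same excitation vector would also be written back into the adaptive excitation codebook so that later frames can refer to it.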

【００３４】[0034] Embodiment 2. In Embodiment 1, the periodic pulse excitation vector is generated considering only the pitch periodicity within the frame being encoded. However, the pitch period of the adaptive excitation vector that is cut out, selected, and encoded may, for instance, take a value twice the actual pitch period of the input speech, so the pulse train produced by the periodic pulse excitation vector is not necessarily periodic across frames. In that case, the periodic pulse excitation vector may be generated with the pitch periodicity between frames also taken into account, for example by computing an average pitch period from the information of preceding frames and generating the periodic pulse excitation vector from a pulse train with that average pitch period.

【００３５】[0035] Embodiment 3. In Embodiment 1, all pulses of the periodic pulse excitation vector have the same amplitude. Instead, the pulses may be set to take different amplitudes, for example by varying the amplitude linearly so that it increases or decreases progressively from the previous frame according to the power of the speech.

【００３６】[0036] Embodiment 4. In Embodiment 1, the pulse positions of the periodic pulse excitation vector are obtained from the maximum-amplitude point of the adaptive excitation vector. Alternatively, the periodic pulse excitation vector may be chosen so that the inner product of the synthesized speech vectors obtained by synthesizing the two vectors is maximized.

【００３７】[0037] Embodiment 5. In Embodiment 1, the synthesized speech vectors are generated using the adaptive excitation vector and the periodic pulse excitation vector as they are, and the inter-vector distance between the result and the input speech vector is obtained. Alternatively, the amplitude of the adaptive excitation vector at the pulse positions of the periodic pulse excitation vector may, for example, be set to 0. This orthogonalizes the adaptive excitation vector and the periodic pulse excitation vector and removes the correlation between them, so that the periodicity of the resulting synthesized speech vector is carried exclusively by the periodic pulse excitation vector. The inter-vector distance between that result and the input speech vector may then be obtained.
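The zeroing step of Embodiment 5 can be checked with a short Python sketch. It only illustrates that setting the adaptive excitation vector to 0 at the pulse positions makes the inner product of the two excitation vectors vanish. The function name `zero_at_pulses` is hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def zero_at_pulses(a, p):
    """Copy of a with every sample at a pulse position of p set to 0."""
    return [0.0 if pi != 0.0 else ai for ai, pi in zip(a, p)]


a = [0.1, -0.2, 0.9, 0.3, -0.1, 0.2, 0.8, 0.1]
p = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
a_orth = zero_at_pulses(a, p)
correlation = dot(a_orth, p)   # inner product of the modified pair
```

Note that this makes the excitation vectors orthogonal, not necessarily the synthesized speech vectors obtained after filtering them.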

【００３８】[0038] Embodiment 6. In Embodiment 5, the adaptive excitation vector and the periodic pulse excitation vector are orthogonalized to remove their correlation. Instead, the correlation may be removed by orthogonalizing the synthesized speech vectors obtained by synthesizing each of them.

【００３９】[0039] Embodiment 7. In Embodiment 1, the periodic pulse excitation vector is formed as a single vector. Instead, the gain of each pulse of the periodic pulse excitation vector may be set individually. For example, in the two-pulse case of FIG. 2(b), the gain of the earlier pulse and the gain of the later pulse may be set to different values.

【００４０】[0040] Embodiment 8. FIG. 3 is a block diagram showing an embodiment of the inventions of claims 3 and 5. In FIG. 3, parts identical to those in FIG. 1 bear the same reference numerals, and their description is omitted. In FIG. 3, 24 is adaptive excitation coding means, 25 is pitch position extracting means, and 26 is driving excitation coding means; 27 and 28 are pitch synchronization means, and 29 is excitation vector generating means.

【００４１】[0041] The operation of the embodiment shown in FIG. 3 is described below, starting with encoding unit 1. Adaptive excitation coding means 24 synthesizes the synthesized speech vector A_i using the adaptive excitation vector a_i input from adaptive excitation codebook 9 and the spectrum parameters input from spectrum parameter coding means 7. It then searches for the synthesized speech vector A_I and its gain β_I that minimize the distortion relative to the input speech vector X cut out frame by frame from input speech 4. It outputs the code I and the code of the gain β_qI, obtained by quantizing the gain β_I, to decoding unit 2 via transmission path 3, and outputs the selected adaptive excitation vector a_I and its gain β_qI to excitation vector generating means 12. It also sets the error vector X' to X' = X − β_qI A_I, outputs X' and the adaptive excitation vector a_I to driving excitation coding means 26, and outputs the pitch period corresponding to the adaptive excitation vector a_I to pitch position extracting means 25.

【００４２】[0042] Pitch position extracting means 25 creates pulse trains of the pitch period using the pitch period input from adaptive excitation coding means 24. Using each pulse train as the excitation and the spectrum parameters input from spectrum parameter coding means 7, it generates a synthesized speech vector and searches for the pulse train that minimizes the distortion relative to the input speech vector X cut out frame by frame from input speech 4. It then outputs the pulse positions of that train to pitch synchronization means 27 as the pitch positions.

【００４３】[0043] FIG. 4 is a block diagram of the operation during the pitch position search. In this operation, the position of the first pulse of the pitch-period pulse train is set successively, starting from the head of the frame and shifting by a fixed step each time; for example, the duration of one frame is divided into 128 equal parts and the position is shifted by 1/128 of the frame duration at each step. The pulse train whose synthesized output is closest to the input speech is then selected from among these shifted pulse trains. This yields points that are uniquely determined by the pitch structure of the speech, regardless of how the input speech is divided into frames, so the periodicity is made clear.
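The search described above can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: the spectrum parameters are direct-form coefficients of an all-pole synthesis filter with zero initial state, the shift step is one sample, the candidate offsets span one pitch period, and each candidate is compared after applying its own optimal gain. All function names are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def all_pole_synthesis(excitation, lpc):
    """Assumed filter: y[n] = e[n] - sum_k lpc[k-1] * y[n-k], zero initial state."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, coeff in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= coeff * y[n - k]
        y.append(acc)
    return y


def pulse_train(length, pitch_period, offset):
    """Unit pulses at offset, offset + pitch_period, ... within the frame."""
    return [1.0 if n >= offset and (n - offset) % pitch_period == 0 else 0.0
            for n in range(length)]


def search_pitch_position(X, lpc, pitch_period):
    """Offset of the first pulse whose synthesized train is closest to X."""
    best_offset, best_dist = None, float("inf")
    xx = dot(X, X)
    for offset in range(min(pitch_period, len(X))):
        S = all_pole_synthesis(pulse_train(len(X), pitch_period, offset), lpc)
        ss = dot(S, S)
        if ss == 0.0:
            continue
        # Squared error left after scaling S by its least-squares gain.
        dist = xx - dot(X, S) ** 2 / ss
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return best_offset


lpc = [-0.5]
target = [3.0 * v for v in all_pole_synthesis(pulse_train(16, 5, 2), lpc)]
found = search_pitch_position(target, lpc, pitch_period=5)
```

Here the target is built from a pulse train whose first pulse is at sample 2, so the search is expected to recover that offset.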

【００４４】[0044] Driving excitation codebook 11 stores N driving excitation vectors obtained, for example, by training with the Linde, Buzo, Gray method (LBG method) on short-term prediction residual signals cut out at pitch length in synchronization with the pitch positions of speech. The pitch position at which each driving excitation vector was cut out is set as the pitch synchronization position of that driving excitation vector. Pitch synchronization means 27 aligns the pitch synchronization position of the driving excitation vector c_j input from driving excitation codebook 11 with the pitch positions input from pitch position extracting means 25, creates the pitch-synchronous driving excitation vector c_j' that repeats at the pitch period, and outputs it to driving excitation coding means 26. FIG. 5 shows an example of the pitch-synchronous driving excitation vector in this embodiment. Driving excitation coding means 26 orthogonalizes the pitch-synchronous driving excitation vector c_j' input from pitch synchronization means 27 with respect to the adaptive excitation vector a_I input from adaptive excitation coding means 24, for example according to equation (8), to create the orthogonalized pitch-synchronous driving excitation vector c_j''.

【００４５】[0045]

【数４】 (Equation 4)
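Equation (8) is not reproduced in this text. The following Python sketch illustrates the two steps described above under stated assumptions: the driving excitation vector is copied so that its synchronization sample lands on each pitch position, and the orthogonalization is taken to be a Gram-Schmidt projection that removes the component along a_I. Both function names and all parameters are hypothetical, and the patent's actual equation (8) may differ.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def pitch_synchronize(c, sync_pos, first_pitch_pos, pitch_period, frame_len):
    """Repeat c so that c[sync_pos] lands on every pitch position in the frame."""
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    out = [0.0] * frame_len
    # Start one period early so a copy beginning before the frame can fill its head.
    q = first_pitch_pos - pitch_period
    while q - sync_pos < frame_len:
        for m in range(min(len(c), pitch_period)):
            n = q - sync_pos + m
            if 0 <= n < frame_len:
                out[n] = c[m]
        q += pitch_period
    return out


def orthogonalize(c, a):
    """Remove from c its projection onto a (assumed form of the orthogonalization)."""
    aa = dot(a, a)
    if aa == 0.0:
        return list(c)
    scale = dot(c, a) / aa
    return [ci - scale * ai for ci, ai in zip(c, a)]


c_sync = pitch_synchronize([1.0, 2.0, 3.0], sync_pos=1, first_pitch_pos=2,
                           pitch_period=4, frame_len=10)
c_orth = orthogonalize([2.0, 0.0, 1.0], [1.0, 1.0, 0.0])
```

In the first call the synchronization sample (value 2.0) lands on samples 2 and 6, which are the pitch positions for a period of 4 starting at sample 2.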

【００４６】[0046] Next, the synthesized speech vector C_j'' is synthesized from the orthogonalized pitch-synchronous driving excitation vector c_j''. The inter-vector distance D_j to the error vector X' input from adaptive excitation coding means 24 is then obtained, for example according to equation (9). Here the gain γ_j is determined so that the inter-vector distance D_j is minimized, for example according to equation (10).

【００４７】[0047]

【数５】 (Equation 5)

【００４８】[0048] Next, the vector C_J'' that minimizes this inter-vector distance D_j is searched for, and the code J and the code of γ_qJ, obtained by quantizing the gain γ_J, are output to decoding unit 2 via transmission path 3. At the same time, the selected orthogonalized pitch-synchronous driving excitation vector c_J'' and its gain γ_qJ are output to excitation vector generating means 12. Conventionally the driving excitation vector was not synchronized with the pitch and therefore disturbed the periodicity of the generated excitation signal; synchronizing it with the pitch improves the periodicity of the excitation signal.
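Equations (9) and (10) are not reproduced in this text. As a hedged sketch, the Python code below assumes that D_j is the squared Euclidean error between X' and γ_j·C_j'' and that γ_j is the closed-form least-squares gain; it then picks the index with the smallest distance. The function name `search_codebook` is hypothetical, and quantization of the gain is omitted.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def search_codebook(x_err, synthesized):
    """Return (J, gamma_J, D_J) for the candidate closest to x_err.

    x_err       : error vector X' left after the adaptive excitation stage
    synthesized : list of candidate synthesized speech vectors C_j''
    """
    best = None
    xx = dot(x_err, x_err)
    for j, C in enumerate(synthesized):
        cc = dot(C, C)
        if cc == 0.0:
            continue                      # an all-zero candidate cannot reduce the error
        xc = dot(x_err, C)
        gamma = xc / cc                   # least-squares gain for this candidate
        dist = xx - gamma * xc            # equals ||x_err - gamma*C||^2
        if best is None or dist < best[2]:
            best = (j, gamma, dist)
    return best


x_err = [1.0, 2.0, 3.0]
candidates = [[1.0, 0.0, 0.0], [2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]
J, gamma_J, D_J = search_codebook(x_err, candidates)
```

In this example the second candidate is an exact multiple of the error vector, so it is selected with a gain of 0.5 and zero remaining distance.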

【００４９】[0049] Next, the operation of decoding unit 2 is described. From the driving excitation vector c_J input from driving excitation codebook 16, pitch synchronization means 28 creates the pitch-synchronous driving excitation vector c_J', which is repeated in synchronization with the pitch positions received from encoding unit 1, and outputs it to driving excitation decoding means 15.

【００５０】[0050] Excitation vector generating means 29 first orthogonalizes the pitch-synchronous driving excitation vector c_J' input from driving excitation decoding means 15 with respect to the adaptive excitation vector a_I input from adaptive excitation decoding means 13, generating the orthogonalized pitch-synchronous driving excitation vector c_J''. Next, from the adaptive excitation vector a_I and the gain β_qI input from adaptive excitation decoding means 13, and from the orthogonalized pitch-synchronous driving excitation vector c_J'' and the gain γ_qJ input from driving excitation decoding means 15, it generates the excitation vector

β_qI a_I + γ_qJ c_J''

and outputs it to adaptive excitation codebook 14 and to synthesis filter 19. The output of synthesis filter 19 is the synthesized, reconstructed speech output.

【００５１】[0051] Embodiment 9. FIG. 6 is a block diagram showing an embodiment of the inventions of claims 4 and 5. In FIG. 6, parts identical to those in FIG. 3 bear the same reference numerals, and their description is omitted. In FIG. 6, 30 is driving excitation coding means.

【００５２】[0052] The embodiment shown in FIG. 6 is described below. Adaptive excitation coding means 24 synthesizes the synthesized speech vector A_i using the adaptive excitation vector a_i input from adaptive excitation codebook 9 and the spectrum parameters input from spectrum parameter coding means 7. It then searches for the vector A_I and its gain β_I that minimize the distortion relative to the input speech vector X cut out frame by frame from input speech 4. It outputs the code I and the code of the gain β_qI, obtained by quantizing the gain β_I, to decoding unit 2 via transmission path 3, and outputs the selected adaptive excitation vector a_I and its gain β_qI to excitation vector generating means 12. It also sets the error vector X' to X' = X − β_qI A_I and outputs X', the adaptive excitation vector a_I, and the pitch period corresponding to a_I to driving excitation coding means 30.

【００５３】[0053] Driving excitation coding means 30 first outputs a pitch position k to pitch synchronization means 27, using the pitch period input from adaptive excitation coding means 24. The pitch position k denotes the positions determined by the pitch period, starting from the k-th point from the head of the frame. Pitch synchronization means 27 aligns the pitch synchronization position of the driving excitation vector c_j input from driving excitation codebook 11 with the pitch position k input from driving excitation coding means 30, creates the pitch-synchronous driving excitation vector c_jk' that repeats at the pitch period, and outputs it to driving excitation coding means 30. Driving excitation coding means 30 orthogonalizes the pitch-synchronous driving excitation vector c_jk' input from pitch synchronization means 27 with respect to the adaptive excitation vector a_I input from adaptive excitation coding means 24, for example according to equation (11), to create the orthogonalized pitch-synchronous driving excitation vector c_jk''.

【００５４】[0054]

【数６】 (Equation 6)

【００５５】次に、前記直交化ピッチ同期駆動音源ベク
トルｃ_jk”より合成音声ベクトルＣ_jk”を合成する。そ
して、前記適応音源符号化手段２４から入力された誤差
ベクトルＸ’ とのベクトル間距離Ｄ_jkを、例えば式（1
2）に従って求める。ここで、ゲインγ_jkはベクトル間
距離Ｄ_jkを最小になるように、例えば式（13）に従って
決定する。Next, a synthesized speech vector C_jk'' is synthesized from the orthogonalized pitch-synchronous driving excitation vector c_jk''. The inter-vector distance D_jk between it and the error vector X' input from the adaptive excitation coding means 24 is then obtained, for example according to equation (12). Here, the gain γ_jk is determined so as to minimize the inter-vector distance D_jk, for example according to equation (13).
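Equations (12) and (13) are not reproduced in this text. As a hedged illustration only, if the distance is assumed to be the squared Euclidean error between X' and the gain-scaled synthesized vector (an assumption, not necessarily the patent's exact form), the minimizing gain follows from setting the derivative with respect to γ_jk to zero:

```latex
\begin{aligned}
D_{jk} &= \lVert X' - \gamma_{jk}\, C''_{jk} \rVert^{2}, \\
\gamma_{jk} &= \frac{\langle X',\, C''_{jk} \rangle}{\langle C''_{jk},\, C''_{jk} \rangle}
\qquad \text{(from } \partial D_{jk} / \partial \gamma_{jk} = 0 \text{)}, \\
\min_{\gamma_{jk}} D_{jk} &= \lVert X' \rVert^{2}
 - \frac{\langle X',\, C''_{jk} \rangle^{2}}{\langle C''_{jk},\, C''_{jk} \rangle}.
\end{aligned}
```

Under this assumption, minimizing D_jk over the candidates is equivalent to maximizing the normalized squared correlation in the last line.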

【００５６】[0056]

【数７】 (Equation 7)

【００５７】次に、このベクトル間距離Ｄ_jkを駆動音源
ベクトルの符号ｊ（ｊ＝１， ...，Ｎ）とピッチ位置ｋ
（ｋ＝１， ...，Ｌ）の全ての組み合わせに対して求
め、ベクトル間距離Ｄ_jkが最小になるベクトルＣ_JK”を
探索し、この符号Ｊ、ピッチ位置Ｋ、及びゲインγ_JKを
量子化したγ_qJKの符号を伝送路３を介して復号化部２
に出力する。同時に、選択された直交化ピッチ同期駆動
音源ベクトルｃ_JK”及びそのゲインγ_qJKを音源ベクト
ル生成手段１２に出力する。このことは、ピッチ位置を
合成音声の歪み最小となるものとして抽出することによ
り、合成音声の音質を向上させることになる。Next, the inter-vector distance D_jk is obtained for all combinations of the driving excitation vector code j (j = 1, ..., N) and the pitch position k (k = 1, ..., L), and the vector C_JK'' that minimizes D_jk is searched for. The code J, the pitch position K, and the code of γ_qJK, obtained by quantizing the gain γ_JK, are output to the decoding unit 2 via the transmission path 3. At the same time, the selected orthogonalized pitch-synchronous driving excitation vector c_JK'' and its gain γ_qJK are output to the excitation vector generation means 12. Because the pitch position is selected as the one that minimizes the distortion of the synthesized speech, the sound quality of the synthesized speech is improved.
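The exhaustive search over codes j and pitch positions k can be sketched as below. The all-pole filter convention, the squared-error distance, and every name here are assumptions made for illustration; gain quantization is omitted.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def synthesize(excitation, lpc):
    """All-pole synthesis with zero initial state (assumed convention):
    s[n] = e[n] - sum_m lpc[m] * s[n - 1 - m]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for m, coef in enumerate(lpc):
            if n - 1 - m >= 0:
                acc -= coef * out[n - 1 - m]
        out.append(acc)
    return out


def search(candidates, target, lpc):
    """Return (distance, J, K, gain) for the best (j, k) candidate.

    candidates maps (j, k) to an orthogonalized pitch-synchronous
    excitation vector; target plays the role of the error vector X'.
    """
    best = None
    target_energy = dot(target, target)
    for (j, k), vec in candidates.items():
        synth = synthesize(vec, lpc)
        energy = dot(synth, synth)
        if energy == 0.0:
            continue  # a silent candidate cannot reduce the error
        corr = dot(target, synth)
        gain = corr / energy                      # least-squares gain
        dist = target_energy - corr * corr / energy
        if best is None or dist < best[0]:
            best = (dist, j, k, gain)
    return best


# Hypothetical toy data: three candidates and a one-tap filter.
lpc = [-0.5]
candidates = {
    (1, 1): [1.0, 0.0, 0.0, 0.0],
    (1, 2): [0.0, 1.0, 0.0, 0.0],
    (2, 1): [1.0, 0.0, 1.0, 0.0],
}
target = [2.0 * s for s in synthesize(candidates[(1, 2)], lpc)]
best = search(candidates, target, lpc)
```

Here the target is exactly twice the synthesized second candidate, so the search selects J = 1, K = 2 with a gain of 2 and zero distance.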

【００５８】実施例１０．図７はこの発明の請求項６の発明の一実施例を示す構成
図である。図７において、図６と同一の部分については
同一の符号を付し、説明を省略する。図７において、３
１は適応音源符号化手段、３２、３３はピッチ位置抽出
手段である。Embodiment 10. FIG. 7 is a block diagram showing an embodiment of the invention according to claim 6. In FIG. 7, parts identical to those in FIG. 6 are denoted by the same reference numerals, and their description is omitted. In FIG. 7, 31 is an adaptive excitation coding means, and 32 and 33 are pitch position extracting means.

【００５９】以下、図７に示した本発明の一実施例の動
作について説明する。まず、符号化部１の動作について
説明する。ピッチ位置抽出手段３２は、適応音源符号帳
９より入力される適応音源ベクトルａ_iと、スペクトル
パラメータ符号化手段７より入力されるスペクトルパラ
メータを用いて合成音声ベクトルＡ_iを合成する。次に
前記適応音源ベクトルａ_iに対応するピッチ周期のパル
ス列を音源として、前記スペクトルパラメータを用いて
合成音声ベクトルを生成したときに、前記合成音声ベク
トルＡ_iとの歪みが最小となるパルス列を探索する。そ
して、そのパルス位置をピッチ位置として適応音源符号
化手段３１に出力する。The operation of the embodiment of the present invention shown in FIG. 7 is described below, beginning with the encoding unit 1. The pitch position extracting means 32 synthesizes a synthesized speech vector A_i from the adaptive excitation vector a_i input from the adaptive excitation codebook 9, using the spectrum parameter input from the spectrum parameter coding means 7. Next, taking pulse trains with the pitch period corresponding to the adaptive excitation vector a_i as the excitation and generating synthesized speech vectors from them with the same spectrum parameter, it searches for the pulse train whose synthesized speech vector has the least distortion relative to A_i. The position of that pulse train is output to the adaptive excitation coding means 31 as the pitch position.
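The pulse-train search performed by the pitch position extracting means can be sketched as follows. The filter convention, the gain-normalized squared-error distortion, and the function names are assumptions for illustration and are not taken from the patent.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def synthesize(excitation, lpc):
    """All-pole synthesis with zero initial state (assumed convention)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for m, coef in enumerate(lpc):
            if n - 1 - m >= 0:
                acc -= coef * out[n - 1 - m]
        out.append(acc)
    return out


def pulse_train(k, pitch, frame_len):
    """Unit pulses at k, k + pitch, k + 2 * pitch, ... within the frame."""
    return [1.0 if (n - k) % pitch == 0 else 0.0 for n in range(frame_len)]


def extract_pitch_position(reference, pitch, lpc):
    """Return the phase k whose synthesized pulse train best matches
    `reference` (the synthesized adaptive excitation vector A_i)."""
    frame_len = len(reference)
    ref_energy = dot(reference, reference)
    best_k, best_dist = None, None
    for k in range(min(pitch, frame_len)):
        synth = synthesize(pulse_train(k, pitch, frame_len), lpc)
        energy = dot(synth, synth)
        if energy == 0.0:
            continue
        corr = dot(reference, synth)
        dist = ref_energy - corr * corr / energy  # distortion at the best gain
        if best_dist is None or dist < best_dist:
            best_k, best_dist = k, dist
    return best_k


# Hypothetical check: a reference built from a pulse train at phase 2.
lpc = [-0.5]
reference = synthesize(pulse_train(2, 4, 10), lpc)
k_found = extract_pitch_position(reference, pitch=4, lpc=lpc)
```

Because the search uses only the adaptive excitation vector and the spectrum parameter, the same routine could run unchanged in a decoder, which is the property this embodiment relies on.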

【００６０】適応音源符号化手段３１は、まず前記適応
音源符号帳９より入力される適応音源ベクトルａ_iに対
応するピッチ周期のパルス列を音源として、前記スペク
トルパラメータ符号化手段７より入力されるスペクトル
パラメータを用いて合成音声ベクトルを生成したとき
に、入力音声４からフレーム毎に切り出した入力音声ベ
クトルＸとの歪みが最小となるパルス列を探索する。そ
して、そのパルス位置を入力音声におけるピッチ位置と
する。そして、前記ピッチ位置抽出手段３２より入力さ
れる適応音源ベクトルａ_iにおけるピッチ位置が、前記
入力音声におけるピッチ位置よりある距離、例えば５サ
ンプル、以内にある場合のみ以下の処理を行い、それ以
外の場合は前記適応音源ベクトルａ_iを符号化の対象か
ら外す。The adaptive excitation coding means 31 first takes pulse trains with the pitch period corresponding to the adaptive excitation vector a_i input from the adaptive excitation codebook 9 as the excitation, generates synthesized speech vectors from them using the spectrum parameter input from the spectrum parameter coding means 7, and searches for the pulse train with the least distortion relative to the input speech vector X cut out of the input speech 4 for each frame. The position of that pulse train is taken as the pitch position of the input speech. The processing below is performed only when the pitch position of the adaptive excitation vector a_i, input from the pitch position extracting means 32, lies within a certain distance (for example, 5 samples) of the pitch position of the input speech; otherwise the adaptive excitation vector a_i is excluded from the encoding candidates.

【００６１】適応音源符号化手段３１は、次に、前記適
応音源ベクトルａ_iにおけるピッチ位置のみ１でその他
は０であるピッチパルスベクトルｐ_iを生成する。そし
て、前記スペクトルパラメータを用い、前記適応音源ベ
クトルａ_iと前記ピッチパルスベクトルｐ_iより合成音声
ベクトルＡ_i、Ｐ_iを合成する。また、入力音声ベクトル
Ｘとのベクトル間距離Ｄ_iを例えば式（14）に従って求
める。ここで、ｍｉｎ（ｘ，ｙ）はｘ、ｙのうち値の小
さい方を選択する関数であり、選択結果が出力される。
また、ゲインβ_Ai、β_Piは距離Ｄ_iが最小になるよう
に、例えば式（15）、式（16）に従って決定する。Next, the adaptive excitation coding means 31 generates a pitch pulse vector p_i that is 1 only at the pitch position of the adaptive excitation vector a_i and 0 elsewhere. Using the spectrum parameter, it synthesizes synthesized speech vectors A_i and P_i from the adaptive excitation vector a_i and the pitch pulse vector p_i. It also obtains the inter-vector distance D_i from the input speech vector X, for example according to equation (14). Here, min(x, y) is a function that returns the smaller of x and y. The gains β_Ai and β_Pi are determined so as to minimize the distance D_i, for example according to equations (15) and (16).
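Equations (14) to (16) are not reproduced in this text, and the exact role of min(x, y) in equation (14) cannot be recovered from it. As a loosely related sketch only, the code below builds a pitch pulse vector and solves the ordinary joint least-squares problem for two gains; this is an assumed stand-in, not the patent's formula, and all names are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def pitch_pulse_vector(k, pitch, frame_len):
    """1 at the pitch positions k, k + pitch, ..., and 0 elsewhere."""
    return [1.0 if (n - k) % pitch == 0 else 0.0 for n in range(frame_len)]


def two_gain_fit(x, a_syn, p_syn):
    """Gains (beta_a, beta_p) minimizing |x - beta_a*a_syn - beta_p*p_syn|^2,
    returned together with the resulting squared distance."""
    aa, pp, ap = dot(a_syn, a_syn), dot(p_syn, p_syn), dot(a_syn, p_syn)
    xa, xp = dot(x, a_syn), dot(x, p_syn)
    det = aa * pp - ap * ap
    if abs(det) < 1e-12:
        # Degenerate case (collinear or zero vectors): use a_syn alone.
        beta_a = xa / aa if aa > 0.0 else 0.0
        beta_p = 0.0
    else:
        # Solve the 2x2 normal equations by Cramer's rule.
        beta_a = (xa * pp - xp * ap) / det
        beta_p = (xp * aa - xa * ap) / det
    resid = [xi - beta_a * ai - beta_p * pi
             for xi, ai, pi in zip(x, a_syn, p_syn)]
    return beta_a, beta_p, dot(resid, resid)


# Hypothetical synthesized vectors A_i and P_i, and a target X = 2*A_i + 3*P_i.
A_i = [1.0, 0.0, 1.0, 0.0]
P_i = [0.0, 1.0, 0.0, 0.0]
X = [2.0, 3.0, 2.0, 0.0]
beta_a, beta_p, dist = two_gain_fit(X, A_i, P_i)
```

In the toy data the target is an exact combination of the two vectors, so the fit recovers gains of 2 and 3 with zero distance.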

【００６２】[0062]

【数８】 (Equation 8)

【００６３】次に、このベクトル間距離が最小となるベ
クトルＡ_I を探索し、その符号Ｉ及びそのゲインβ_AIを
量子化したゲインβ_qAI の符号を伝送路３を介して復号
化部２に出力する。また、前記適応音源ベクトルａ_Iと
そのゲインβ_qAIを、音源ベクトル生成手段１２に出力
する。更に、前記ピッチ位置抽出手段３２より入力され
た前記適応音源ベクトルａ_Iにおけるピッチ位置を、ピ
ッチ同期化手段２７に出力する。また誤差ベクトルＸ’
を、Ｘ’＝Ｘ−β_qAIＡ_I とし、このＸ’ と前記適応音源ベクトルａ_Iを駆動音源
符号化手段２６にも出力する。Next, the vector A_I that minimizes this inter-vector distance is searched for, and its code I and the code of the gain β_qAI, obtained by quantizing the gain β_AI, are output to the decoding unit 2 via the transmission path 3. The adaptive excitation vector a_I and its gain β_qAI are output to the excitation vector generation means 12. Further, the pitch position of the adaptive excitation vector a_I, input from the pitch position extracting means 32, is output to the pitch synchronization means 27. The error vector X' is set to X' = X − β_qAI A_I, and X' and the adaptive excitation vector a_I are also output to the driving excitation coding means 26.

【００６４】次に、復号化部２の動作について説明す
る。ピッチ位置抽出手段３３は、適応音源符号帳１４よ
り入力される適応音源ベクトルａ_Iと、スペクトルパラ
メータ復号化手段１８より入力されるスペクトルパラメ
ータよりピッチ位置を抽出する。そして、このピッチ位
置をピッチ同期化手段２８に出力する。このことは、ピ
ッチ位置を入力音声を量子化したものである適応音源ベ
クトルとスペクトルパラメータより抽出することにより
復号化部においても一意に求めることが可能であり、ピ
ッチ位置を伝送する必要がなく、合成音声の音質の劣化
を起こさずに伝送情報量の削減することになる。Next, the operation of the decoding unit 2 is described. The pitch position extracting means 33 extracts the pitch position from the adaptive excitation vector a_I input from the adaptive excitation codebook 14 and the spectrum parameter input from the spectrum parameter decoding means 18, and outputs it to the pitch synchronization means 28. Because the pitch position is extracted from the adaptive excitation vector, which is a quantized representation of the input speech, and from the spectrum parameter, the decoding unit can derive it uniquely as well. The pitch position therefore need not be transmitted, and the amount of transmitted information is reduced without degrading the quality of the synthesized speech.

【００６５】実施例１１．上記実施例１０では、適応音源符号化手段において、適
応音源ベクトルより抽出したピッチ位置と、入力音声よ
り求まるピッチ位置間の距離基準を固定としているが、
例えばピッチ長に応じて可変としてもよい。Embodiment 11. In Embodiment 10, the adaptive excitation coding means uses a fixed distance criterion between the pitch position extracted from the adaptive excitation vector and the pitch position obtained from the input speech; the criterion may instead be made variable, for example according to the pitch length.

【００６６】実施例１２．上記実施例１０では、適応音源符号化手段において、適
応音源ベクトルより抽出したピッチ位置と入力音声より
求まるピッチ位置の差が、ある距離以内のもののみ符号
化の対象としているが、全ての適応音源ベクトルがこの
基準を満たさないフレームにおいては、この距離基準を
緩和、あるいは廃止してもよい。Embodiment 12. In Embodiment 10, the adaptive excitation coding means encodes only those adaptive excitation vectors for which the difference between the pitch position extracted from the adaptive excitation vector and the pitch position obtained from the input speech is within a certain distance. In frames where no adaptive excitation vector satisfies this criterion, the distance criterion may be relaxed or dropped.
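The candidate pruning of Embodiment 10 and the fallback of Embodiment 12 can be sketched as below. The function name, the dictionary layout, and the choice to drop the criterion entirely (rather than relax it gradually) are assumptions for illustration.

```python
def select_candidates(candidate_positions, target_position, max_distance=5):
    """Return the codes of adaptive excitation vectors kept for encoding.

    candidate_positions maps each code i to the pitch position extracted
    from a_i; target_position is the pitch position of the input speech.
    """
    kept = [i for i, pos in candidate_positions.items()
            if abs(pos - target_position) <= max_distance]
    if kept:
        return kept
    # Embodiment 12 (as assumed here): if no vector meets the criterion,
    # drop it and keep every candidate.
    return list(candidate_positions)


# Hypothetical pitch positions for three codebook entries.
positions = {0: 3, 1: 12, 2: 8}
kept_normal = select_candidates(positions, target_position=6)
kept_fallback = select_candidates(positions, target_position=30)
```

A pitch-dependent threshold in the spirit of Embodiment 11 could be obtained by passing a max_distance computed from the pitch period instead of the fixed default.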

【００６７】[0067]

【発明の効果】以上のようにこの発明によれば、音声符
号化装置に、スペクトルパラメータを抽出して符号化す
るスペクトルパラメータ分析符号化手段と、適応音源ベ
クトルを複数生成して記憶する適応音源符号帳と、周期
パルス音源ベクトルを生成する周期パルス音源生成手段
と、入力音声との歪みが最小になるような適応音源ベク
トルと、適応音源ベクトル及び周期パルス音源ベクトル
に付与するゲインを決定して符号化する複数ベクトル適
応音源符号化手段と、音源信号を生成して出力する音源
ベクトル生成手段を備えたので、また、音声復号化装置
には符号化装置に対応する手段を備えたので、僅かの伝
送情報の追加のみで、ピッチ周期性の乱れが少ない高品
質の音声を合成できる効果がある。As described above, according to the present invention, the speech encoding device is provided with spectrum parameter analysis coding means for extracting and encoding a spectrum parameter, an adaptive excitation codebook for generating and storing a plurality of adaptive excitation vectors, periodic pulse excitation generation means for generating a periodic pulse excitation vector, multi-vector adaptive excitation coding means for determining and encoding the adaptive excitation vector that minimizes the distortion relative to the input speech together with the gains applied to the adaptive excitation vector and the periodic pulse excitation vector, and excitation vector generation means for generating and outputting an excitation signal, and the speech decoding device is provided with the corresponding means. As a result, high-quality speech with little disturbance of pitch periodicity can be synthesized with only a small addition to the transmitted information.

【００６８】また請求項３ないし請求項５の発明によれ
ば、音声符号化装置に、スペクトルパラメータを抽出し
て符号化するスペクトルパラメータ分析符号化手段と、
ピッチ位置抽出手段と、適応音源符号帳と、予め定めれ
たピッチ同期位置を有する駆動音源ベクトルを複数記憶
する駆動音源符号帳と、駆動音源ベクトルをピッチ周期
で繰り返すピッチ同期化手段と、適応音源符号化手段
と、入力音声と復号信号ベクトルの歪みが最小となるピ
ッチ同期駆動音源ベクトルと第２ゲインを決定し符号化
する駆動音源符号化手段と、音源信号を生成して出力す
る音源ベクトル生成手段とを備えたので、また音声復号
化装置には、符号化装置に対応する手段を備えたので、
僅かの伝送情報の追加のみで、ピッチ周期性の乱れが少
ない高品質の音声を合成できる効果がある。According to the inventions of claims 3 to 5, the speech encoding device is provided with spectrum parameter analysis coding means for extracting and encoding a spectrum parameter, pitch position extracting means, an adaptive excitation codebook, a driving excitation codebook storing a plurality of driving excitation vectors each having a predetermined pitch synchronization position, pitch synchronization means for repeating the driving excitation vector at the pitch period, adaptive excitation coding means, driving excitation coding means for determining and encoding the pitch-synchronous driving excitation vector and the second gain that minimize the distortion between the input speech and the decoded signal vector, and excitation vector generation means for generating and outputting an excitation signal, and the speech decoding device is provided with the corresponding means. As a result, high-quality speech with little disturbance of pitch periodicity can be synthesized with only a small addition to the transmitted information.

【００６９】また請求項６及び請求項７の発明によれ
ば、音声符号化装置は、スペクトルパラメータを抽出し
て符号化するスペクトルパラメータ分析符号化手段と、
適応音源ベクトルからピッチ位置を抽出するピッチ位置
抽出手段と、適応音源符号帳と、予め定めれたピッチ同
期位置を有する駆動音源ベクトルを複数記憶する駆動音
源符号帳と、駆動音源ベクトルをピッチ周期で繰り返す
ピッチ同期化手段と、適応音源符号化手段と、入力音声
と復号信号ベクトルの歪みが最小となるピッチ同期駆動
音源ベクトルと第２ゲインを決定し符号化する駆動音源
符号化手段と、音源信号を生成して出力する音源ベクト
ル生成手段とを備えたので、また音声復号化装置には、
符号化装置に対応する手段を備えたので、伝送情報の増
加無しに、ピッチ周期性の乱れが少ない高品質の音声を
合成できる効果がある。According to the inventions of claims 6 and 7, the speech encoding device is provided with spectrum parameter analysis coding means for extracting and encoding a spectrum parameter, pitch position extracting means for extracting the pitch position from the adaptive excitation vector, an adaptive excitation codebook, a driving excitation codebook storing a plurality of driving excitation vectors each having a predetermined pitch synchronization position, pitch synchronization means for repeating the driving excitation vector at the pitch period, adaptive excitation coding means, driving excitation coding means for determining and encoding the pitch-synchronous driving excitation vector and the second gain that minimize the distortion between the input speech and the decoded signal vector, and excitation vector generation means for generating and outputting an excitation signal, and the speech decoding device is provided with the corresponding means. As a result, high-quality speech with little disturbance of pitch periodicity can be synthesized without any increase in transmitted information.

[Brief description of the drawings]

【図１】この発明の実施例１を示す構成図である。FIG. 1 is a configuration diagram showing a first embodiment of the present invention.

【図２】この発明の周期パルス音源ベクトルの例を示す
説明図である。FIG. 2 is an explanatory diagram showing an example of a periodic pulse sound source vector according to the present invention.

【図３】この発明の実施例８を示す構成図である。FIG. 3 is a configuration diagram showing an eighth embodiment of the present invention.

【図４】この発明のピッチ位置抽出を示す説明図であ
る。FIG. 4 is an explanatory diagram showing pitch position extraction according to the present invention.

【図５】この発明のピッチ同期駆動音源ベクトルの例を
示す説明図である。FIG. 5 is an explanatory diagram showing an example of a pitch synchronous drive sound source vector according to the present invention.

【図６】この発明の実施例９を示す構成図である。FIG. 6 is a configuration diagram showing a ninth embodiment of the present invention.

【図７】この発明の実施例１０を示す構成図である。FIG. 7 is a configuration diagram showing a tenth embodiment of the present invention.

【図８】従来の音声符号化復号化装置を示す構成図であ
る。FIG. 8 is a configuration diagram showing a conventional speech encoding / decoding device.

【図９】適応音源ベクトルの例を示す説明図である。FIG. 9 is an explanatory diagram showing an example of an adaptive sound source vector.

[Explanation of symbols]

１符号化部２復号化部３伝送路４入力音声５出力音声６スペクトル分析手段７スペクトルパラメータ符号化手段８適応音源符号化手段９、１４適応音源符号帳１０駆動音源符号化手段１１、１６駆動音源符号帳１２、１７音源ベクトル生成手段１３適応音源復号化手段１５駆動音源復号化手段１８スペクトルパラメータ復号化手段１９合成フィルタ２０複数ベクトル適応音源符号化手段２１、２３周期パルス音源生成手段２２複数ベクトル適応音源復号化手段２４適応音源符号化手段２５ピッチ位置抽出手段２６駆動音源符号化手段２７、２８ピッチ同期化手段２９音源ベクトル生成手段３０駆動音源符号化手段３１適応音源符号化手段３２、３３ピッチ位置抽出手段
1: encoding unit
2: decoding unit
3: transmission path
4: input speech
5: output speech
6: spectrum analysis means
7: spectrum parameter coding means
8: adaptive excitation coding means
9, 14: adaptive excitation codebook
10: driving excitation coding means
11, 16: driving excitation codebook
12, 17: excitation vector generation means
13: adaptive excitation decoding means
15: driving excitation decoding means
18: spectrum parameter decoding means
19: synthesis filter
20: multi-vector adaptive excitation coding means
21, 23: periodic pulse excitation generation means
22: multi-vector adaptive excitation decoding means
24: adaptive excitation coding means
25: pitch position extracting means
26: driving excitation coding means
27, 28: pitch synchronization means
29: excitation vector generation means
30: driving excitation coding means
31: adaptive excitation coding means
32, 33: pitch position extracting means

Claims

(57) [Claims]

1. A speech encoding device comprising spectrum analysis encoding means, an adaptive excitation codebook, periodic pulse excitation generation means, multi-vector adaptive excitation encoding means, and excitation vector generation means, and encoding input speech for each frame divided at regular intervals, wherein: the spectrum analysis encoding means extracts and encodes a spectrum parameter which is spectrum envelope information of the input speech; the adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in preceding frames; the periodic pulse excitation generation means outputs a periodic pulse excitation vector determined as a main component of the adaptive excitation vector output from the adaptive excitation codebook; the multi-vector adaptive excitation encoding means determines and encodes the adaptive excitation vector, a first gain, and a second gain so as to minimize the distortion between the input speech and decoded speech generated from the encoded spectrum parameter, any one of the plurality of adaptive excitation vectors, the first gain applied to that adaptive excitation vector, the periodic pulse excitation vector, and the second gain applied to the periodic pulse excitation vector; and the excitation vector generation means generates and outputs an excitation signal from the determined and encoded adaptive excitation vector, first gain, and second gain, and the periodic pulse excitation vector.

2. An apparatus comprising a spectrum parameter decoding means, a multi-vector adaptive excitation decoding means, an adaptive excitation codebook, a periodic pulse excitation generation means, an excitation vector generation means, and a synthesis filter, and encodes each frame divided at regular intervals. A speech decoding apparatus for generating a decoded signal from the obtained code of the spectrum parameter, the code of the adaptive excitation vector, and the code of the gain, wherein the spectrum parameter decoding means receives the code of the spectrum parameter, The multi-vector adaptive excitation decoding means receives the code of the adaptive excitation vector and the code of the gain, and outputs the code of the adaptive excitation vector to the adaptive excitation codebook. Decoding the first gain and the second gain from the code, Are generated and stored in the preceding frame from the excitation signal output by the excitation vector generation means, and output from the stored adaptive excitation vector by the multi-vector adaptive excitation decoding means. The periodic pulse excitation generating means outputs a periodic pulse excitation vector determined as a main component of the adaptive excitation vector output from the adaptive excitation codebook, and outputs an excitation vector corresponding to the code of the adaptive excitation vector. The generating means adds at least the adaptive excitation vector to which the first gain has been added and the periodic pulse excitation vector to which the second gain has been added, and outputs the sum as an excitation signal. The synthesis filter uses the decoded spectral parameters. And a speech decoding apparatus for generating a decoded signal from the sound source signal.

3. A speech encoding device comprising spectrum analysis encoding means, pitch position extracting means, an adaptive excitation codebook, a driving excitation codebook, pitch synchronization means, adaptive excitation encoding means, driving excitation encoding means, and excitation vector generation means, and encoding input speech for each frame divided at regular intervals, wherein: the spectrum analysis encoding means extracts and encodes a spectrum parameter which is spectrum envelope information of the input speech; the pitch position extracting means extracts, from the input speech, a plurality of feature points arranged at pitch-period intervals as a pitch position and encodes it; the adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in preceding frames; the driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position; the pitch synchronization means aligns the pitch synchronization position of each driving excitation vector with the encoded pitch position and generates a pitch-synchronous driving excitation vector by repeating the driving excitation vector at the pitch period; the adaptive excitation encoding means generates a synthesized speech vector of the adaptive excitation vector based on the encoded spectrum parameter, determines the adaptive excitation vector and a first gain so as to minimize the distortion between the input speech and the synthesized speech vector of the adaptive excitation vector to which the first gain is applied, outputs the code of the determined adaptive excitation vector, and encodes the determined first gain; the driving excitation encoding means generates a synthesized speech vector of the pitch-synchronous driving excitation vector based on the encoded spectrum parameter and, taking as a decoded signal vector the sum of the synthesized speech vector of the determined adaptive excitation vector to which the determined first gain is applied and the synthesized speech vector of the pitch-synchronous driving excitation vector to which a second gain is applied, determines the synthesized speech vector of the pitch-synchronous driving excitation vector and the second gain so as to minimize the distortion between the input speech and the decoded signal vector, outputs the code of the determined pitch-synchronous driving excitation vector, and encodes the determined second gain; and the excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the pitch-synchronous driving excitation vector.

4. An apparatus comprising spectrum analysis coding means, adaptive excitation codebook, driving excitation codebook, pitch synchronization means, adaptive excitation coding means, driving excitation coding means, excitation vector generation means, and divided at regular intervals. A speech encoding apparatus for encoding an input speech for each frame, wherein the spectrum analysis encoding means extracts and encodes a spectrum parameter which is spectrum envelope information of the input speech, A plurality of adaptive excitation vectors are generated and stored from the excitation signal output by the excitation vector generation means, and the driving excitation codebook stores a plurality of driving excitation vectors having a predetermined pitch synchronization position. Adjusts the pitch synchronization position of the driving sound source vector to the pitch position, which is a plurality of points arranged at a pitch cycle interval on the time axis, Generating a pitch-synchronous drive excitation vector obtained by repeating the dynamic excitation vector at a pitch cycle, the adaptive excitation encoding means generates a synthesized speech vector of the adaptive excitation vector based on the encoded spectral parameters, and The adaptive excitation vector and the first gain are determined so that the distortion of the synthesized speech vector of the adaptive excitation vector to which the first gain is added is minimized, and the code of the determined adaptive excitation vector is output, and the determined adaptive excitation vector is output. The first excitation is encoded, and the driving excitation encoding means generates a synthesized speech vector of the pitch synchronous excitation vector based on the encoded spectral parameters, and provides the determined first gain. The synthesized speech vector of the determined adaptive sound source vector and the pitch synchronous drive sound source vector to which the second gain is added. 
When the sum of the synthesized speech vectors is used as the decoded signal vector, the synthesized speech vector, pitch position, and second gain of the pitch synchronous drive excitation vector are determined so that the distortion of the input speech and the decoded signal vector is minimized. Outputting the code of the determined pitch-synchronous drive excitation vector and encoding the determined pitch position and the second gain, wherein the excitation vector generating means includes at least the determined and encoded adaptive excitation vector and the first A speech encoding device for generating and outputting a sound source signal from a gain, a second gain, and the pitch synchronized drive sound source vector.

5. An apparatus comprising spectrum parameter decoding means, adaptive excitation decoding means, adaptive excitation codebook, driving excitation codebook, pitch synchronization means, driving excitation decoding means, excitation vector generation means, synthesis filter, and a fixed interval. An audio decoding device that generates a decoded signal from a code of a spectrum parameter encoded for each frame divided into a code of an adaptive excitation vector, a code of a drive excitation vector, a code of a gain, and a code of a pitch position, The spectrum parameter decoding means receives a code of the spectrum parameter, decodes the spectrum parameter from the code of the spectrum parameter, and the adaptive excitation decoding means determines the code of the adaptive excitation vector and the first gain to be given to the adaptive excitation vector. When a code is input and the code of the adaptive excitation vector is output to the adaptive excitation codebook, In addition, the first gain is decoded from the code of the first gain, and the adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in the preceding frame, An adaptive excitation vector corresponding to the code of the adaptive excitation vector output by the adaptive excitation decoding means is output from the stored adaptive excitation vectors, and the driving excitation codebook has a predetermined pitch synchronization position. A plurality of drive excitation vectors are stored, and a drive excitation vector corresponding to the code of the drive excitation vector output by the drive excitation vector decoding unit is output from the stored drive excitation vectors to the pitch synchronization unit. 
The pitch synchronization means receives the code of the pitch position, decodes the pitch position from the code, and The pitch synchronization position is adjusted to the calculated pitch position, and a pitch synchronization driving excitation vector is generated by repeating the driving excitation vector input from the driving excitation codebook at a pitch cycle. A code and a code of a second gain to be given to the driving excitation vector are input, and the code of the driving excitation vector is output to the driving excitation codebook, and a second gain is decoded from the code of the second gain, The excitation vector generation means outputs an excitation signal which is the sum of the adaptive excitation vector to which the first gain is applied and the pitch-synchronous drive excitation vector to which the second gain is applied, and the synthesis filter converts the decoded spectrum parameter to A speech decoding apparatus for generating a decoded signal from the sound source signal using the speech decoding apparatus.

6. A speech encoding device comprising spectrum analysis encoding means, an adaptive excitation codebook, pitch position extracting means, a driving excitation codebook, pitch synchronization means, adaptive excitation encoding means, driving excitation encoding means, and excitation vector generation means, and encoding input speech for each frame divided at regular intervals, wherein: the spectrum analysis encoding means extracts and encodes a spectrum parameter which is spectrum envelope information of the input speech; the adaptive excitation codebook generates and stores a plurality of adaptive excitation vectors from the excitation signal output by the excitation vector generation means in preceding frames; the pitch position extracting means extracts, from the adaptive excitation vector, a plurality of feature points arranged at pitch-period intervals as a pitch position; the driving excitation codebook stores a plurality of driving excitation vectors each having a predetermined pitch synchronization position; the pitch synchronization means aligns the pitch synchronization position of each driving excitation vector with the extracted pitch position and generates a pitch-synchronous driving excitation vector by repeating the driving excitation vector at the pitch period; the adaptive excitation encoding means generates a synthesized speech vector of the adaptive excitation vector based on the encoded spectrum parameter, determines the adaptive excitation vector and a first gain so as to minimize the distortion between the input speech and the synthesized speech vector of the adaptive excitation vector to which the first gain is applied, outputs the code of the determined adaptive excitation vector, and encodes the first gain; the driving excitation encoding means generates a synthesized speech vector of the pitch-synchronous driving excitation vector based on the encoded spectrum parameter and, taking as a decoded signal vector the sum of the synthesized speech vector of the determined adaptive excitation vector to which the determined first gain is applied and the synthesized speech vector of the pitch-synchronous driving excitation vector to which a second gain is applied, determines the synthesized speech vector of the pitch-synchronous driving excitation vector and the second gain so as to minimize the distortion between the input speech and the decoded signal vector, outputs the code of the determined pitch-synchronous driving excitation vector, and encodes the determined second gain; and the excitation vector generation means generates and outputs an excitation signal from at least the determined and encoded adaptive excitation vector, first gain, and second gain, and the pitch-synchronous driving excitation vector.

7. Spectral parameter decoding means, adaptive excitation decoding means, adaptive excitation codebook, pitch position extraction means, driving excitation codebook, pitch synchronization means, driving excitation decoding means, excitation vector generation means, synthesis filter With
An audio decoding apparatus for generating a decoded signal from a code of a spectral parameter encoded for each frame divided at a fixed interval, a code of an adaptive excitation vector, a code of a driving excitation vector, and a code of a gain, comprising: The coding means receives the code of the spectrum parameter and decodes the spectrum parameter from the code of the spectrum parameter. The adaptive excitation decoding means sets the code of the adaptive excitation vector and the code of the first gain to be given to the adaptive excitation vector. The adaptive excitation codebook is input and outputs the code of the adaptive excitation vector to the adaptive excitation codebook, and decodes the first gain from the code of the first gain. While generating and storing a plurality of adaptive sound source vectors from the output sound source signal, An adaptive excitation vector corresponding to the code of the adaptive excitation vector output by the adaptive excitation decoding means is output from the stored adaptive excitation vectors, and the pitch position extracting means is arranged at a pitch cycle interval from the adaptive excitation vector. A plurality of feature points are extracted as pitch positions. The driving excitation codebook stores a plurality of driving excitation vectors having a predetermined pitch synchronization position, and decodes the driving excitation vector from among the stored driving excitation vectors. A drive excitation vector corresponding to the sign of the drive excitation vector output by the synchronization means is output to the pitch synchronization means, and the pitch synchronization means adjusts the pitch synchronization position of each of the drive excitation vectors to the extracted pitch position. And the pitch obtained by repeating the driving excitation vector input from the driving excitation codebook at the pitch cycle. 
A driving excitation vector is generated, and the driving excitation decoding unit receives the code of the driving excitation vector and the code of the second gain to be given to the driving excitation vector, and outputs the code of the driving excitation vector to the driving excitation codebook. At the same time, the second gain is decoded from the code of the second gain, and the excitation vector generation means is the sum of the adaptive excitation vector to which the first gain is applied and the pitch synchronous excitation excitation vector to which the second gain is applied. A speech decoding device that outputs a sound source signal, and a synthesis filter generates a decoded signal from the sound source signal using the decoded spectrum parameter.