JPH0981194A

JPH0981194A - Method and device for voice coding and decoding

Info

Publication number: JPH0981194A
Application number: JP7239993A
Authority: JP
Inventors: Masami Akamine; 政巳赤嶺
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-09-19
Filing date: 1995-09-19
Publication date: 1997-03-28
Anticipated expiration: 2015-09-19
Also published as: JP3332132B2

Abstract

PROBLEM TO BE SOLVED: To provide a speech coding method that can reproduce a high-quality speech signal with a small amount of computation even at a low coding bit rate of about 4 kbps or less. SOLUTION: In this speech coding method, a drive signal generated from drive vectors obtained from drive vector codebooks is supplied to a weighted synthesis filter 113 to generate a synthesized speech signal, the drive vector that minimizes the distortion of the synthesized speech signal is searched for, and coding parameters representing the drive vector and the filter coefficients of the synthesis filter 113 are output. The coder has a plurality of adaptive codebooks 110 and 120, a noise codebook 121, and a signal classifier 130. When the signal classifier 130 classifies the input speech signal as periodic, or as periodic and stationary, the drive signal is generated using drive vectors obtained from the adaptive codebooks 110 and 120. Otherwise, the drive signal is generated using a drive vector obtained from the noise codebook 121.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a speech encoding/decoding method and a speech encoding/decoding apparatus capable of reproducing high-quality speech at a low bit rate.

[0002]

[Description of the Related Art] CELP (Code Excited Linear Prediction) is known as an effective method for encoding telephone-band speech at a transmission rate of about 4 kbps. The method is broadly divided into two processes: obtaining, from input speech divided into frames, a speech synthesis filter that models the vocal tract; and obtaining the drive vector that serves as the drive signal, that is, the input signal of this filter. The latter process passes the drive vectors stored in a codebook through the speech synthesis filter one at a time, calculates the distortion of each resulting synthesized speech signal vector, that is, its error relative to the input speech signal, and searches for the drive vector that minimizes this distortion. This process is called closed-loop search, and it is very effective for reproducing good sound quality even at a coding bit rate of about 8 kbps.

[0003] However, when the coding bit rate falls to about 4 kbps or less, this method can no longer reproduce speech of sufficient quality. Another problem is that the closed-loop search requires a large amount of computation. These problems are described in detail below.

[0004] The CELP method is described in detail in M. R. Schroeder and B. S. Atal, "Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates", Proc. ICASSP, pp. 937-940, 1985, and in W. S. Kleijn, D. J. Krasinski et al., "Improved Speech Quality and Efficient Vector Quantization in SELP", Proc. ICASSP, pp. 155-158, 1988.

[0005] An outline of the CELP method is described with reference to FIG. 7. A speech signal is input to the input terminal 460 in frame units. The linear prediction analysis unit 450 analyzes this input speech signal to obtain the filter coefficients of the weighted synthesis filter 430. At the same time, the input speech signal is supplied to the perceptual weighting unit 440 to obtain a weighted input speech signal. The zero-state response of the weighted synthesis filter 430 is subtracted from this weighted input speech signal to generate the target vector 480.

[0006] Next, drive vectors are read out one at a time from the adaptive codebook 411, multiplied by a gain in the gain circuit 421, and input to the weighted synthesis filter 430 as a drive signal, which generates a synthesized speech signal vector. The evaluation unit 470 evaluates the distortion of this synthesized speech signal vector, that is, its difference from the target vector 480. The adaptive codebook 411 is searched for the drive vector that makes this distortion smaller, and the best one is taken as the first drive vector. A second drive vector is then searched for in the noise codebook 412 in the same way, taking the effect of the first drive vector into account. Finally, the first and second drive vectors are multiplied by their optimum gains in the gain circuits 421 and 422, respectively, to generate the drive signal. This drive signal is used to update the contents of the adaptive codebook 411 in preparation for the input of the speech signal of the next frame.
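The last two steps of the paragraph above (forming the drive signal from the two gain-scaled drive vectors, then updating the adaptive codebook with it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited CELP coder: the function names `build_drive_signal` and `update_adaptive_codebook`, and the modelling of the adaptive codebook as a fixed-length buffer of past drive samples, are introduced here for the example.

```python
def build_drive_signal(v1, g1, v2, g2):
    """Gain-scale the two drive vectors and add them sample by sample."""
    return [g1 * a + g2 * b for a, b in zip(v1, v2)]


def update_adaptive_codebook(history, drive_signal):
    """Shift the newest drive signal into the past-drive buffer.

    Assumption: the adaptive codebook is a fixed-length buffer whose last
    element is the most recent sample; the oldest samples are discarded.
    """
    return (list(history) + list(drive_signal))[len(drive_signal):]


history = [1, 2, 3, 4, 5, 6]
drive = build_drive_signal([1, 0], 2, [0, 1], 3)
history = update_adaptive_codebook(history, drive)
```

After the update, the buffer has the same length as before and ends with the newest drive samples, so the next search can cut candidate vectors out of it.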

[0007] As described above, the CELP method performs a closed-loop search: for every drive vector stored in the adaptive codebook 411 and the noise codebook 412, it carries out filtering by the weighted synthesis filter 430 and calculates the distortion of the synthesized speech signal, and it searches for the drive vector that minimizes this distortion. With the CELP method, a speech signal of relatively good quality can be reproduced at a coding bit rate of about 8 kbps. At a low coding bit rate of about 4 kbps or less, however, the interval at which the contents of the adaptive codebook 411 and the noise codebook 412 are updated becomes longer and fewer bits are allocated, so the quality of the reproduced speech signal deteriorates.
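The closed-loop search described above can be sketched as follows. This is a simplified illustration rather than the coder of FIG. 7: the names `synthesize` and `closed_loop_search` are hypothetical, a plain all-pole filter 1/A(z) with zero initial state stands in for the weighted synthesis filter, and the gain is taken to be the per-vector least-squares optimum.

```python
def synthesize(excitation, a):
    """All-pole filter 1/A(z), A(z) = 1 + sum_i a[i-1] z^-i, zero initial state."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                y -= ai * out[n - i]
        out.append(y)
    return out


def closed_loop_search(codebook, target, a):
    """Filter every candidate and keep the one with the smallest distortion.

    Returns (index, gain, distortion), or None if every candidate filters
    to an all-zero vector.
    """
    best = None
    for idx, vec in enumerate(codebook):
        s = synthesize(vec, a)
        energy = sum(v * v for v in s)
        if energy == 0.0:
            continue  # a silent candidate cannot match the target
        gain = sum(t * v for t, v in zip(target, s)) / energy
        dist = sum((t - gain * v) ** 2 for t, v in zip(target, s))
        if best is None or dist < best[2]:
            best = (idx, gain, dist)
    return best


codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, -1.0, 0.0, 0.0]]
target = synthesize([0.0, 2.0, 0.0, 0.0], [-0.5])
best = closed_loop_search(codebook, target, [-0.5])
```

Every candidate costs one filtering pass plus a few inner products, which is why the amount of computation grows with the codebook size.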

[0008] To explain this problem in more detail: in the configuration of FIG. 7, the drive signal source for the weighted synthesis filter 430 has a two-stage structure consisting of the adaptive codebook 411 as the first stage and the noise codebook 412 as the second stage. When the input speech signal is periodic, the two codebooks share roles: the adaptive codebook 411 represents the periodic component of the drive signal, and the noise codebook 412 represents the residual component of the drive signal that the adaptive codebook 411 alone could not represent.

[0009] This configuration works well at a coding bit rate of about 8 kbps and yields a high-quality reproduced speech signal. At a low coding bit rate of about 4 kbps or less, however, the interval at which the contents of the adaptive codebook 411 and the noise codebook 412 are updated (generally called a subframe) becomes longer and fewer bits are allocated, so the adaptive codebook 411 can no longer represent the periodic component of the drive signal with sufficient accuracy. As a result, a periodic component remains in the residual that the adaptive codebook 411 cannot represent, and because the noise codebook 412 cannot represent this component either, the quality of the reproduced speech deteriorates.

[0010] The CELP method also has the problem that the closed-loop search requires an enormous amount of computation. To address this, R. C. Rose and T. P. Barnwell III devised a method that reduces the computation required for the closed-loop search of drive vectors. This method is called SEV (Self Excited Vocoder), and its operation and performance are described in R. C. Rose and T. P. Barnwell III, "Quality Comparison of Low Complexity 4800 bps Self Excited and Code Excited Vocoders", Proc. ICASSP '87, pp. 1637-1640, 1987. The SEV method is briefly described below.

[0011] FIG. 8 is a block diagram of the drive signal search unit in an SEV speech encoder. As a comparison with FIG. 7 shows, the major difference between the CELP method and the SEV method lies in how the drive signal of the synthesis filter is constructed. The CELP method constructs the drive signal using an adaptive codebook and a noise codebook, whereas the SEV method has a simple structure in which the drive signal is constructed using only one or two adaptive codebooks 500 and 510 (called long-term predictors in the literature). It also exploits the special structure of the adaptive codebook to reduce the amount of computation. Each drive vector in the adaptive codebooks 500 and 510 is formed from the past drive signal by shifting it by a delay corresponding to the pitch period, so the elements of successive drive vectors overlap. By exploiting this overlap, the output of the weighted synthesis filter 520 for each drive vector is calculated recursively, which reduces the amount of computation to about 1/20.
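One way such a recursion can work is sketched below. The details are an assumption made for illustration, since the text does not give the SEV recursion itself: the filter is represented by a truncated impulse response `h` with zero initial state, lags are at least one subframe long, and the names `convolve_truncated` and `filtered_vectors_recursive` are hypothetical. Because the candidate for lag L+1 is the candidate for lag L delayed by one sample with one older sample prepended, its filtered output follows from the previous output with one multiply-add per sample instead of a full convolution.

```python
def convolve_truncated(v, h):
    """Zero-state filter output over one subframe; requires len(h) >= len(v)."""
    return [sum(h[k] * v[n - k] for k in range(n + 1)) for n in range(len(v))]


def filtered_vectors_recursive(past, h, n_len, min_lag, max_lag):
    """Filtered candidate vectors for every lag in [min_lag, max_lag].

    past[-1] is the most recent drive sample; min_lag >= n_len is assumed.
    """
    start = len(past) - min_lag
    y = convolve_truncated(past[start:start + n_len], h)  # full cost, once
    table = {min_lag: y}
    for lag in range(min_lag + 1, max_lag + 1):
        first = past[len(past) - lag]  # the one new (older) sample
        # y_new(0) = h(0)*first ; y_new(n) = y_old(n-1) + h(n)*first
        y = [h[0] * first] + [y[n - 1] + h[n] * first for n in range(1, n_len)]
        table[lag] = y
    return table


past = [float(v) for v in range(1, 13)]
h = [1.0, 0.5, 0.25, 0.125]
table = filtered_vectors_recursive(past, h, 4, 4, 8)
```

In this sketch only the first lag pays for a full convolution; every further lag costs a number of operations proportional to the subframe length.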

[0012] However, the SEV method always uses the adaptive codebooks to create the drive signal from the past history, regardless of the nature of the input speech signal. It therefore cannot generate an appropriate drive signal where the input speech signal has little correlation with the past signal, as in an unvoiced section, or where the nature of the speech changes, as in the rising and falling parts of the signal, the transition from an unvoiced section to a voiced section, and the transition from a voiced section to an unvoiced section. As a result, the sound quality is degraded even further than with the CELP method.

[0013]

[Problems to Be Solved by the Invention] As described above, among conventional speech coding methods, the CELP method not only cannot reproduce a high-quality speech signal at a low coding bit rate of about 4 kbps or less but also requires a large amount of computation. The SEV method reduces the amount of computation, but the quality of the reproduced speech signal is degraded even further than with the CELP method.

[0014] An object of the present invention is to provide a speech encoding/decoding method and a speech encoding/decoding apparatus capable of reproducing a high-quality speech signal with a small amount of computation even at a low coding bit rate of about 4 kbps or less.

[0015]

[Means for Solving the Problems] To solve the above problems, the present invention provides a speech coding method that generates a drive signal using drive vectors obtained from a plurality of drive vector codebooks, supplies the drive signal to a synthesis filter whose filter coefficients are determined on the basis of an analysis of the input speech signal, searches the drive vector codebooks for the drive vector that minimizes the distortion of the synthesized speech signal output from the synthesis filter, and outputs coding parameters representing at least the drive vector and the filter coefficients of the synthesis filter. The method is characterized in that at least two adaptive codebooks and at least one noise codebook are prepared as the drive vector codebooks; when the input speech signal is periodic, or periodic and stationary, the drive signal is generated using drive vectors obtained from at least two adaptive codebooks; and when the input speech signal is aperiodic, or periodic and non-stationary, the drive signal is generated using a drive vector obtained from at least the noise codebook.

[0016] A speech coding apparatus according to the present invention comprises: at least two adaptive codebooks and at least one noise codebook prepared as drive vector codebooks; classification means for classifying the input speech signal according to whether it is periodic, or according to whether it is periodic and stationary; and drive signal generation means for generating the drive signal using drive vectors obtained from at least two adaptive codebooks when the classification means classifies the input speech signal as periodic, or as periodic and stationary, and for generating the drive signal using a drive vector obtained from at least the noise codebook when the input speech signal is classified as aperiodic, or as periodic and non-stationary.

[0017] Thus, in the present invention, the drive signal of the synthesis filter is formed as a linear combination of drive vectors stored in multi-stage drive vector codebooks. The input speech signal is classified according to whether it is periodic, or according to whether it is periodic and stationary, and when it is classified as periodic, or as periodic and stationary, the drive signal is constructed using at least two stages of adaptive codebooks.

[0018] As described above, in the conventional CELP method, when the input speech signal is periodic, the first-stage adaptive codebook represents the periodic component of the drive signal and the second-stage noise codebook represents the residual component that the adaptive codebook alone could not represent. At a low coding bit rate of about 4 kbps or less, the interval at which the codes of the adaptive codebook and the noise codebook are updated becomes longer and fewer bits are allocated. Consequently, the adaptive codebook can no longer represent a periodic signal with sufficient accuracy, a periodic component remains in the residual of the adaptive codebook, and the noise codebook cannot represent this remaining periodic component. This was the cause of the degraded quality of the reproduced speech signal.

[0019] In the present invention, by contrast, when the input speech signal is periodic, or periodic and stationary, the first adaptive codebook in the first stage represents the periodic component of the drive signal, and even if a periodic component remains in the first-stage residual because of the reduced coding bit rate, the second adaptive codebook in the second stage can represent it. A high-quality reproduced speech signal can therefore be obtained even at a low coding bit rate.

[0020] On the other hand, when the input speech signal is aperiodic, or periodic and non-stationary, the present invention generates the drive signal using at least the noise codebook and at most one adaptive codebook. When the input speech signal is not periodic, or is periodic but not stationary, and therefore has little correlation with the past drive signal, the noise codebook is used without relying heavily on adaptive codebooks, which exploit correlation with the past drive signal. A good reproduced speech signal can therefore be obtained.

[0021] Thus, in the present invention, the drive signal is switched between a configuration based on multiple stages of adaptive codebooks and a configuration based on the noise codebook, depending on whether the input speech signal is periodic, or on whether it is periodic and stationary. Because a drive signal suited to the nature of the input speech signal can be generated, a reproduced speech signal of good quality can be obtained even at a low coding bit rate.

[0022] A speech decoding method according to the present invention decodes the original speech signal from the coding parameters obtained by the speech coding method described above. It receives coding parameters representing at least the drive vector and the filter coefficients of the synthesis filter, generates a drive signal using drive vectors obtained from multiple stages of drive vector codebooks, and decodes the speech signal by supplying the drive signal to a synthesis filter whose filter coefficients are determined on the basis of the coding parameters. The method is characterized in that at least two adaptive codebooks and at least one noise codebook are prepared as the drive vector codebooks, and the drive signal is generated using either drive vectors obtained from at least two adaptive codebooks or a drive vector obtained from at least the noise codebook, as selected on the basis of the coding parameters.
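The decoder-side drive signal generation described above can be sketched as follows. This is an illustration under assumptions, not the claimed decoder: the parameter layout (a first-stage lag, a boolean flag selecting the second-stage codebook, a second-stage lag or index, and two gains) and the names `adaptive_vector` and `decode_drive_signal` are hypothetical, and lags shorter than the subframe are handled by periodic repetition, which the text does not specify.

```python
def adaptive_vector(history, lag, n_len):
    """Cut n_len samples out of the past drive signal, `lag` samples back."""
    buf = list(history)
    start = len(buf)
    for n in range(n_len):
        buf.append(buf[start + n - lag])  # repeats periodically if lag < n_len
    return buf[start:]


def decode_drive_signal(history, n_len, lag1, second_is_adaptive,
                        second_param, g1, g2, noise_codebook):
    """Rebuild one subframe of the drive signal and the updated history."""
    v1 = adaptive_vector(history, lag1, n_len)
    if second_is_adaptive:
        v2 = adaptive_vector(history, second_param, n_len)  # second_param = lag
    else:
        v2 = noise_codebook[second_param]                   # second_param = index
    drive = [g1 * a + g2 * b for a, b in zip(v1, v2)]
    new_history = (list(history) + drive)[n_len:]
    return drive, new_history


noise_codebook = [[1, -1], [0, 1]]
drive, history = decode_drive_signal([1, 2, 3, 4, 5], 2, 2, True, 4, 1, 2, noise_codebook)
```

The decoder performs no search: it only looks up the vectors named by the received parameters, so its cost is small compared with the encoder.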

[0023] A speech decoding apparatus according to the present invention decodes the original speech signal from the coding parameters obtained by the speech coding apparatus described above. It receives coding parameters representing at least the drive vector and the filter coefficients of the synthesis filter, generates a drive signal using drive vectors obtained from multiple stages of drive vector codebooks, and decodes the speech signal by supplying the drive signal to a synthesis filter whose filter coefficients are determined on the basis of the coding parameters. The apparatus comprises at least two adaptive codebooks and at least one noise codebook prepared as drive vector codebooks, and drive signal generation means for generating the drive signal using either drive vectors obtained from at least two adaptive codebooks or a drive vector obtained from at least the noise codebook, as selected on the basis of the coding parameters.

[0024]

BEST MODE FOR CARRYING OUT THE INVENTION

(First Embodiment) FIG. 1 is a block diagram showing the configuration of a speech coding apparatus according to a first embodiment of the present invention. The apparatus has a speech signal input terminal 100, a buffer 101, an LPC analysis unit 102, an LSP quantizer 103, a subframe division circuit 104, a weighting filter 105, a first adaptive codebook 110 constituting the first-stage drive signal source, a gain circuit 111, an adder 112, a weighted synthesis filter 113, a subtractor 114, a second adaptive codebook 120 and a noise codebook 121 constituting the second-stage drive signal source, a switch 122, a gain circuit 123, a gain codebook 124, a signal classifier 130, search units 140, 141, and 142 for searching the codebooks 110, 120, 121, and 124, a multiplexer 150, and a coding parameter output terminal 151.

[0025] The signal classifier 130 classifies the input speech signal according to whether it is periodic, or according to whether it is periodic and stationary, and outputs a classification decision signal. The switch 122 selects either the second adaptive codebook 120 or the noise codebook 121 as the second-stage drive signal source on the basis of the classification decision signal from the signal classifier 130.

[0026] Next, the operation of the speech coding apparatus of this embodiment is described. A digitized speech signal is input from the input terminal 100, divided into sections of fixed length (for example, 20 ms) called frames, and stored in the buffer 101. The LPC analysis unit 102 performs linear prediction analysis, frame by frame, on the input speech signal read out of the buffer 101 and calculates linear prediction coefficients a_i (i = 1, ..., p), which are parameters representing the spectral envelope of the input speech signal. These coefficients are input to the LSP quantizer 103.

[0027] The LSP quantizer 103 converts the linear prediction coefficients into LSP (line spectrum pair) parameters and quantizes them with a predetermined number of bits. The quantized LSP parameters are input to the multiplexer 150. They are also decoded and converted back into linear prediction coefficients, yielding the quantized linear prediction coefficients aq_i (i = 1, ..., p). Well-known methods can be used for the linear prediction analysis, for obtaining the LSP parameters, and for quantizing the LSP parameters. The quantized linear prediction coefficients are supplied to the weighting filter 105 and the weighted synthesis filter 113.
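One well-known method of linear prediction analysis is the autocorrelation method solved with the Levinson-Durbin recursion. The text does not say which method the LPC analysis unit 102 uses, so the following is only a sketch of that common choice; the function names are hypothetical, windowing and the LSP conversion are omitted, and the frame is assumed to have non-zero energy. The sign convention matches A(z) = 1 + Σ a_i z^-i.

```python
def autocorrelation(frame, order):
    """r[k] = sum_n frame[n] * frame[n-k] for k = 0..order."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return ([a_1..a_p], residual energy) for A(z) = 1 + sum_i a_i z^-i.

    Assumes r[0] > 0 (a frame with non-zero energy).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= (1.0 - k * k)                # prediction error energy shrinks
    return a[1:], err


# Autocorrelation of a first-order process s(n) = 0.5 s(n-1) + e(n)
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this input the recursion recovers a_1 = -0.5 and a_2 = 0, that is, A(z) = 1 - 0.5 z^-1, which is the predictor expected for that process.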

[0028] Meanwhile, the input speech signal read out of the buffer 101 frame by frame is divided by the subframe division circuit 104 into a plurality of sections per frame, called subframes, and is input to the weighting filter 105 subframe by subframe to become the target vector for the drive signal search. The search for the drive signal that drives the weighted synthesis filter 113 is then performed subframe by subframe as follows.

[0029] The speech coding apparatus of this embodiment has a two-stage structure in which the first adaptive codebook 110 is the first-stage drive signal source and either the second adaptive codebook 120 or the noise codebook 121 is the second-stage drive signal source. Drive vectors are searched for sequentially in the codebook of each stage. This search procedure is described with reference to the flowchart shown in FIG. 3.

[0030] First, the first adaptive codebook 110 in the first stage is searched for a drive vector. When a speech signal is input from the input terminal 100 (step S1), then before the adaptive codebook 110 is searched, the weighting filter 105 applies perceptual weighting to the input speech signal, and the influence of the weighted synthesis filter 113 carried over from the previous subframe is subtracted to generate the target vector X (step S2). The first adaptive codebook 110 is then searched (step S3). The first adaptive codebook 110 is used to represent the periodic component (pitch) of the speech signal, and each drive vector e(n) stored in it is created by cutting out a subframe-length segment of the past drive signal, as expressed by the following equation.

[0031]

$$e(n) = e(n - L), \quad n = 1, \dots, N \qquad (1)$$

where L is the lag and N is the subframe length. The search unit 140 searches the first adaptive codebook 110 by finding, according to a well-known method, the lag that minimizes the distortion of the synthesized speech signal vector obtained by passing the drive vector e through the weighted synthesis filter 113, that is, the error of the synthesized speech signal vector relative to the target vector X. The lag can be expressed in units of integer samples or fractional samples.
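Equation (1) can be read as a recursion over the drive signal buffer, which the following sketch makes concrete for integer lags. The function name `adaptive_vector` and the list-based buffer are assumptions for illustration. Applying the recursion sample by sample means that a lag shorter than the subframe simply repeats the last L samples periodically; the text does not state how that case is handled, so this behaviour is an assumed convention, and fractional lags are not covered.

```python
def adaptive_vector(history, lag, n_len):
    """Build e(n) = e(n - L) for one subframe of n_len samples.

    history[-1] is the most recent past drive sample; 1 <= lag <= len(history).
    """
    buf = list(history)
    start = len(buf)
    for n in range(n_len):
        # each new sample copies the sample `lag` positions earlier, which
        # may itself be a sample generated in this loop when lag < n_len
        buf.append(buf[start + n - lag])
    return buf[start:]


long_lag = adaptive_vector([1, 2, 3, 4, 5], 5, 3)   # plain cut-out
short_lag = adaptive_vector([1, 2, 3, 4, 5], 2, 5)  # periodic repetition
```

With the lag equal to 5 the result is the segment [1, 2, 3] of the past signal, and with the lag equal to 2 the last two samples repeat as [4, 5, 4, 5, 4].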

[0032] The weighting filter 105 and the weighted synthesis filter 113 can be constructed by well-known methods. As an example, letting W(z) and H_w(z) be their respective transfer functions, they can be expressed as

$$W(z) = \frac{A(z/r)}{A(z/r')} \qquad (2)$$

$$H_w(z) = \frac{W(z)}{A_q(z)} \qquad (3)$$

$$A(z) = 1 + \sum_{i=1}^{p} a_i z^{-i} \qquad (4)$$

$$A_q(z) = 1 + \sum_{i=1}^{p} \mathit{aq}_i z^{-i} \qquad (5)$$
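Replacing z by z/r in A(z) scales the i-th coefficient by r^i, so the weighting filter W(z) = A(z/r) / A(z/r') can be realized directly from the linear prediction coefficients. The sketch below illustrates this with a direct-form filter and zero initial state; the function names and the default values r = 0.9 and r' = 0.4 are assumptions chosen for the example, since the text gives no values for r and r'.

```python
def bandwidth_expand(a, r):
    """Coefficients of A(z/r): a_i becomes a_i * r**i for i = 1..p."""
    return [ai * r ** i for i, ai in enumerate(a, start=1)]


def pole_zero_filter(x, num, den):
    """y(n) = x(n) + sum_i num_i x(n-i) - sum_i den_i y(n-i), zero initial state."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i, b in enumerate(num, start=1):
            if n - i >= 0:
                acc += b * x[n - i]
        for i, c in enumerate(den, start=1):
            if n - i >= 0:
                acc -= c * y[n - i]
        y.append(acc)
    return y


def weighting_filter(x, a, r=0.9, r_prime=0.4):
    """Apply W(z) = A(z/r) / A(z/r') to one block of samples."""
    return pole_zero_filter(x, bandwidth_expand(a, r), bandwidth_expand(a, r_prime))


impulse_response = weighting_filter([1.0, 0.0, 0.0], [-0.5], r=1.0, r_prime=0.5)
```

When r equals r', the numerator and denominator cancel and the filter passes the signal unchanged, which is a convenient sanity check on an implementation.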

[0033] Next, the search unit 141 searches the second-stage codebook. First, the signal classifier 130 classifies the input speech signal to be encoded according to whether it is periodic, or according to whether it is periodic and stationary (step S4). Specifically, the classification is performed by a known voiced/unvoiced decision method, by such a method combined with a pitch period continuity decision, or by an autocorrelation method, for example.

[0034] FIG. 2 shows an example of the waveforms of voiced and unvoiced sections of a speech signal. As the figure shows, in the transition from an unvoiced section to a voiced section and in the transition from a voiced section to an unvoiced section, the signal is periodic but non-stationary, with non-uniform pitch period and amplitude changes; outside these transition parts, it is in a periodic and stationary state. In the simplest case, the signal classifier 130 may classify the input speech signal as periodic or not by deciding whether it is in a voiced or an unvoiced section. Alternatively, it may further decide whether a voiced section is stationary and so classify the signal according to whether it is periodic and stationary.
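A classifier of this kind can be sketched with a normalized autocorrelation for the periodicity decision and a pitch-lag continuity check for the stationarity decision. This is one possible realization, not the design of the signal classifier 130: the function names, the thresholds `periodic_thresh` and `lag_tol`, and the use of the previous subframe's lag as the continuity reference are all assumptions made for illustration.

```python
def normalized_autocorr(x, lag):
    """Correlation between x(n) and x(n - lag), normalized to [-1, 1]."""
    idx = range(lag, len(x))
    num = sum(x[n] * x[n - lag] for n in idx)
    e1 = sum(x[n] ** 2 for n in idx)
    e2 = sum(x[n - lag] ** 2 for n in idx)
    if e1 == 0.0 or e2 == 0.0:
        return 0.0
    return num / (e1 * e2) ** 0.5


def classify(x, min_lag, max_lag, prev_lag=None,
             periodic_thresh=0.7, lag_tol=0.15):
    """Return (periodic, stationary, best_lag) for one block of samples."""
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: normalized_autocorr(x, lag))
    periodic = normalized_autocorr(x, best_lag) >= periodic_thresh
    # stationary only if the pitch lag stayed close to the previous one
    stationary = bool(periodic and prev_lag is not None
                      and abs(best_lag - prev_lag) <= lag_tol * prev_lag)
    return periodic, stationary, best_lag


pattern = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.0]
voiced_like = pattern * 8                       # exactly periodic, period 10
decision = classify(voiced_like, 5, 15, prev_lag=10)
```

A periodic block whose lag matches the previous lag would select the second adaptive codebook in this sketch; any other outcome would select the noise codebook.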

[0035] If step S4 classifies the input speech signal as periodic, or as periodic and stationary, the second adaptive codebook 120 is searched as the second-stage codebook (step S5); otherwise, the noise codebook 121 is searched (step S6). The second adaptive codebook 120 is searched in the same way as the first adaptive codebook 110, and the noise codebook 121 is searched as in the conventional CELP method. Well-known techniques such as overlapping codebooks, backward filtering, and preselection can be used to reduce the amount of computation.
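The branch between steps S5 and S6 can be sketched as below. The search criterion shown, maximizing the squared correlation with the target divided by the energy of the filtered vector, is the usual CELP criterion and is assumed here; the function names and the convention that candidate vectors have already been passed through the weighted synthesis filter are likewise illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, filtered_vectors):
    """Index of the vector y maximizing (t.y)^2 / (y.y).

    This is equivalent to minimizing ||t - g*y||^2 with the optimal gain g.
    """
    best_index, best_score = -1, -1.0
    for index, y in enumerate(filtered_vectors):
        energy = dot(y, y)
        if energy == 0.0:
            continue
        score = dot(target, y) ** 2 / energy
        if score > best_score:
            best_index, best_score = index, score
    return best_index


def search_second_stage(is_periodic, target, adaptive_filtered, noise_filtered):
    """Step S5 (second adaptive codebook) or step S6 (noise codebook)."""
    if is_periodic:
        return "adaptive", search_codebook(target, adaptive_filtered)
    return "noise", search_codebook(target, noise_filtered)
```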

[0036] Finally, the gain codebook 124 is searched (step S7). Each representative vector of the gain codebook 124 has as its elements the gains to be multiplied by the drive vectors found in the first-stage and second-stage codebooks. The gain circuits 111 and 123 multiply the drive vector found in the first-stage first adaptive codebook 110, and the drive vector found in the second-stage second adaptive codebook 120 or noise codebook 121 and taken out through the switch 122, by the respective gains found in the gain codebook 124. The adder 112 sums the two products, and the resulting drive vector is passed through the weighted synthesis filter 113 to obtain a synthesized speech signal vector. The search unit 142 searches the gain codebook 124 in a closed loop, by a well-known method, so as to minimize the distortion of this vector (its error with respect to the target vector).
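A closed-loop gain search of this kind can be sketched as an exhaustive comparison of squared errors. The sketch assumes that each gain codebook entry is a pair (g1, g2) and that `y1` and `y2` are the two selected drive vectors after the weighted synthesis filter; these names and the exhaustive strategy are illustrative, since the specification only refers to a well-known method.

```python
def search_gain_codebook(target, y1, y2, gain_codebook):
    """Return (index, error) of the gain pair minimizing ||t - g1*y1 - g2*y2||^2."""
    best_index, best_error = -1, float("inf")
    for index, (g1, g2) in enumerate(gain_codebook):
        error = sum((t - g1 * a - g2 * b) ** 2
                    for t, a, b in zip(target, y1, y2))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error
```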

[0037] The index of the first adaptive codebook 110 (the first index), the index of the second adaptive codebook 120 or the noise codebook 121 (the second index), and the index of the gain codebook 124 (the third index), found by the search units 140, 141, and 142 as described above, are output as coding parameters through the multiplexer 150, together with the quantized parameters from the LSP quantizer 103 and the classification decision signal from the signal classifier 130. The coding parameters output from the multiplexer 150 are sent to a transmission line or a storage medium.

[0038] Thus, in the speech coding apparatus of this embodiment, when the signal classifier 130 classifies the input speech signal as periodic, or as periodic and stationary, the first adaptive codebook 110 represents the periodic component of the drive signal of the weighted synthesis filter 113. Even if a periodic component remains in the residual obtained from the subtractor 114 because of a reduced coding bit rate, the second adaptive codebook 120 can represent it. As a result, the speech decoding apparatus can obtain a high-quality reproduced speech signal even at a low coding bit rate.

[0039] When the signal classifier 130 classifies the input speech signal as aperiodic, or as periodic and non-stationary, the drive signal is generated using the noise codebook 121, or using the adaptive codebook 110 and the noise codebook 121. The speech decoding apparatus can thus obtain a good reproduced speech signal without relying heavily on adaptive codebooks, which exploit the correlation with past drive signals.

[0040] Next, the speech decoding apparatus according to this embodiment is described. FIG. 4 is a block diagram showing its configuration. The apparatus comprises an input terminal 160 for coded data, a demultiplexer 161, an LSP dequantizer 162, a first adaptive codebook 170, a gain circuit 171, an adder 172, a synthesis filter 173, a post filter 174, a second adaptive codebook 180, a noise codebook 181, a switch 182, a gain circuit 183, a gain codebook 184, and an output terminal 190 for the reproduced speech signal.

[0041] The coding parameters output from the speech coding apparatus shown in FIG. 1 are input to the input terminal 160 via a transmission line or a storage medium. They are fed to the demultiplexer 161, which separates and decodes the LSP parameters quantized by the LSP quantizer 103 of FIG. 1, the index of the first adaptive codebook 110 (the first index), the index of the second adaptive codebook 120 or the noise codebook 121 (the second index), the index of the gain codebook 124 (the third index), and the classification decision signal from the signal classifier 130.

[0042] Among the outputs of the demultiplexer 161, the quantized LSP parameters are input to the LSP dequantizer 162, which restores the LSP parameters. The first index is input to the first adaptive codebook 170, the second index to the second adaptive codebook 180 and the noise codebook 181, the third index to the gain codebook 184, and the classification decision signal to the switch 182.

[0043] The first adaptive codebook 170, the second adaptive codebook 180, the noise codebook 181, and the gain codebook 184 are configured in the same way as the first adaptive codebook 110, the second adaptive codebook 120, the noise codebook 121, and the gain codebook 124 in the speech coding apparatus shown in FIG. 1.

[0044] The speech decoding apparatus operates as follows. The first drive vector, obtained by looking up the first adaptive codebook 170 with the first index output from the demultiplexer 161, is multiplied in the gain circuit 171 by a gain looked up in the gain codebook 184 with the third index, and is then input to the adder 172. Meanwhile, the second drive vector, obtained by looking up the second adaptive codebook 180 or the noise codebook 181 with the second index, is selected by the switch 182 according to the classification decision signal. It is likewise multiplied, in the gain circuit 183, by a gain looked up in the gain codebook 184 with the third index, and is then input to the adder 172.
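The decoder-side construction of the drive vector can be sketched as follows. The function and argument names are hypothetical, and the gain pair is assumed to be one entry of the gain codebook 184 addressed by the third index.

```python
def build_excitation(first_vec, second_adaptive_vec, noise_vec,
                     use_second_adaptive, gains):
    """Combine the two decoded drive vectors into one excitation vector."""
    g1, g2 = gains
    # Role of the switch 182: the classification decision signal picks the source.
    second_vec = second_adaptive_vec if use_second_adaptive else noise_vec
    # Roles of the gain circuits 171 and 183 and of the adder 172.
    return [g1 * a + g2 * b for a, b in zip(first_vec, second_vec)]
```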

[0045] The drive vector formed by the adder 172 is supplied as a drive signal to the synthesis filter 173, whose transfer characteristic is controlled by the LSP parameters from the LSP dequantizer 162, and the synthesis filter 173 generates a synthesized speech signal vector. This vector is input to the post filter 174, whose transfer characteristic is likewise controlled by the LSP parameters. There it undergoes well-known processing to improve the subjective quality of the reproduced speech, such as pitch enhancement, formant enhancement, high-frequency enhancement, and gain adjustment, and is then output from the output terminal 190 as the reproduced speech signal. As described above, this reproduced speech signal is of high quality.
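The synthesis step can be sketched as an all-pole recursion. The sketch assumes that the restored LSP parameters have already been converted to coefficients of a polynomial of the form 1 + Σ aq_i z^-i, as in equation (5), and that the filter is 1/Aq(z); the conversion from LSP parameters and the post filter are omitted, and the function name and state handling are illustrative.

```python
def synthesize(excitation, aq, state=None):
    """Pass the excitation through 1/Aq(z), with Aq(z) = 1 + sum_i aq[i-1] z^-i.

    `state` holds the most recent outputs, newest first, so that the filter
    memory can be carried from one subframe to the next.
    """
    history = list(state) if state is not None else [0.0] * len(aq)
    out = []
    for e in excitation:
        y = e - sum(c * h for c, h in zip(aq, history))
        out.append(y)
        history = [y] + history[:-1]
    return out, history
```

Calling `synthesize` once per subframe and passing the returned `history` back as `state` keeps the output continuous across subframe boundaries.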

[0046] (Second Embodiment) FIG. 5 is a block diagram of a speech coding apparatus according to a second embodiment. Parts corresponding to those in FIG. 1 carry the same reference numerals, and the description focuses on the differences from the first embodiment. The second embodiment differs from the first in how the codebook serving as the second-stage drive signal source is selected. In the first embodiment, the input speech signal was analyzed to classify its nature, and the codebook used for the second-stage drive signal was switched between the second adaptive codebook 120 and the noise codebook 121.

[0047] In the second embodiment, by contrast, the first stage has two codebooks as drive signal sources: a first adaptive codebook 210 and a fixed codebook 211, whose stored drive vectors never change. One of the codebooks 210 and 211 is selected by a minimum-distortion criterion at the synthesized speech signal level. From the result, the input speech signal is classified, as before, according to whether it is periodic, or whether it is periodic and stationary, and either the second adaptive codebook 120 or the noise codebook 121 is selected accordingly as the second-stage drive signal source. In other words, the first embodiment selects the second-stage codebook in an open loop, whereas the second embodiment selects it in a closed loop.

[0048] The operation of this embodiment is as follows. First, LPC analysis and LSP parameter quantization are performed frame by frame, exactly as in the first embodiment. Next, the search unit 140 searches the first adaptive codebook 210 as in the first embodiment, and searches the fixed codebook 211 in the same way as the noise codebook 121 is searched in the first embodiment. The drive vectors found in the adaptive codebook 210 and the fixed codebook 211 are taken out through the switch 212.

[0049] The results of searching the first adaptive codebook 210 and the fixed codebook 211 are then compared. Specifically, the drive vector found in the first adaptive codebook 210 is multiplied in the gain circuit 111 by a gain found in the gain codebook 124 and passed through the weighted synthesis filter 113, and the distortion of the resulting synthesized speech vector is obtained. The same is done for the drive vector found in the fixed codebook 211. The two distortions are compared, and the drive vector giving the smaller distortion is adopted as the drive vector from the first-stage drive signal source. The index of the first adaptive codebook 210 or the fixed codebook 211 obtained by the search unit 140 in this search is supplied to the multiplexer 150 and the signal classifier 230.

[0050] From the index input from the search unit 140, the signal classifier 230 determines whether the first adaptive codebook 210 or the fixed codebook 211 was selected as the first-stage drive signal source, and thereby classifies the input speech signal according to whether it is periodic, or whether it is periodic and stationary. If the adaptive codebook 210 was selected, that is, if the input speech signal is periodic, or periodic and stationary, the signal classifier 230 controls the switch 122 so that the second adaptive codebook 120 is selected as the second-stage drive signal source; otherwise, it controls the switch 122 so that the noise codebook 121 is selected.
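The closed-loop selection of the second embodiment can be sketched as follows. The squared-error distortion measure, the string labels, and the tie-breaking rule in favor of the adaptive codebook are assumptions made for illustration; `adaptive_y` and `fixed_y` stand for the best candidate from each first-stage codebook after the weighted synthesis filter.

```python
def distortion(target, y, gain):
    """Squared error between the target and the gain-scaled filtered vector."""
    return sum((t - gain * v) ** 2 for t, v in zip(target, y))


def select_first_stage(target, adaptive_y, adaptive_gain, fixed_y, fixed_gain):
    """Pick the first-stage source that gives the smaller distortion."""
    d_adaptive = distortion(target, adaptive_y, adaptive_gain)
    d_fixed = distortion(target, fixed_y, fixed_gain)
    return "adaptive" if d_adaptive <= d_fixed else "fixed"


def second_stage_source(first_stage_choice):
    """Role of the signal classifier 230: map the first-stage choice to a codebook."""
    return "second_adaptive" if first_stage_choice == "adaptive" else "noise"
```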

[0051] The codebook serving as the second-stage drive signal source is searched as in the first embodiment. The signal classifier 230 may also be configured to select the second-stage codebook by taking pitch continuity into account, in addition to whether the first adaptive codebook 210 was selected as the first-stage drive signal source.

[0052] The speech decoding apparatus corresponding to the speech coding apparatus of FIG. 5 is the same as the speech decoding apparatus of FIG. 4, so its description is omitted.

(Third Embodiment) In the first and second embodiments, the drive signal source consists of two codebook stages, but three or more stages may be used. FIG. 6 is a block diagram showing the configuration of a speech coding apparatus according to a third embodiment, in which the drive signal source has three stages. As shown, this embodiment adds to the configuration of the first embodiment in FIG. 1 another noise codebook 330 as the third-stage drive signal source, a gain circuit 331 that multiplies its output by a gain found in the gain codebook 124, and a search unit 143 that searches the noise codebook 330.

[0053] The codebook search in this embodiment proceeds as follows. The search unit 140 first searches the first adaptive codebook 110, the first-stage drive signal source. The search unit 141 then searches the second adaptive codebook 120 or the noise codebook 121, the second-stage drive signal source. Finally, the search unit 143 searches the noise codebook 330, the third-stage drive signal source.

[0054] In the embodiments above, the input speech signal is classified as periodic (or periodic and stationary) or not in one of the following ways: the input speech signal is fed to the signal classifier 130 and classified by a voiced/unvoiced decision or the like, as shown in FIG. 1; or it is classified indirectly according to whether the first adaptive codebook 210 was selected in the first stage, as shown in FIG. 5, or whether the second adaptive codebook 120 was selected in the second stage, as shown in FIG. 6. The classification is not limited to these methods.

[0055] For example, the input speech signal may be classified as periodic (or periodic and stationary) or not by evaluating its power. Alternatively, it may be determined whether the gain applied by the gain circuit 111 to the drive vector read from the first adaptive codebook 110 exceeds a predetermined threshold, and the input speech signal may be classified as periodic, or as periodic and stationary, when the threshold is exceeded. Various other modifications are possible.
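The gain-threshold variant can be sketched as below. The specification only says that the gain of the gain circuit 111 is compared with a predetermined threshold; the least-squares gain formula and the threshold value of 0.5 used here are illustrative assumptions.

```python
def adaptive_gain(target, filtered_vec):
    """Least-squares gain g minimizing ||target - g * filtered_vec||^2."""
    energy = sum(v * v for v in filtered_vec)
    if energy == 0.0:
        return 0.0
    return sum(t * v for t, v in zip(target, filtered_vec)) / energy


def is_periodic_by_gain(gain, threshold=0.5):
    """Classify as periodic when the adaptive-codebook gain exceeds the threshold."""
    return gain > threshold
```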

【００５６】[0056]

[Effects of the Invention] As described above, the present invention classifies the input speech signal according to whether it is periodic, or whether it is periodic and stationary. When the signal is classified as periodic, or as periodic and stationary, the drive signal is built from a two-stage arrangement of first and second adaptive codebooks, so that periodic components that the first adaptive codebook could not sufficiently remove are represented by the second adaptive codebook. When the signal is classified as aperiodic, or as periodic and non-stationary, the drive signal is built using a noise codebook, which represents signals with little correlation. As a result, the drive signal always suits the nature of the input speech signal, and speech of good quality can be reproduced even at a low bit rate of about 4 kbps.

[0057] In addition, because the drive signal in the present invention is built from adaptive codebooks, the special structure of an adaptive codebook can be exploited to reduce the amount of computation, which has been a problem in the CELP method. As in the SEV method, each drive vector in an adaptive codebook is formed from the past drive signal by shifting it by a delay corresponding to the pitch period, so the elements of successive drive vectors overlap. Exploiting this overlap, the synthesis filter output for each drive vector can be computed recursively, which greatly reduces the amount of computation.
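The recursive computation can be sketched as follows, under several assumptions that the specification does not spell out: the filter is represented by a truncated impulse response `h` of at least `n` samples, the filter starts from zero state, and every lag is at least as long as the vector length `n`, so that no repetition of the past signal is needed. All names are illustrative.

```python
def convolve_zero_state(c, h):
    """Direct zero-state filtering: y[n] = sum_k h[k] * c[n - k]."""
    return [sum(h[k] * c[n - k] for k in range(n + 1)) for n in range(len(c))]


def adaptive_vector(past, lag, n):
    """Drive vector for a given lag: n samples starting `lag` samples back."""
    start = len(past) - lag
    return past[start:start + n]


def filtered_vectors_recursive(past, h, min_lag, max_lag, n):
    """Filtered adaptive-codebook vectors for all lags, computed recursively."""
    y = convolve_zero_state(adaptive_vector(past, min_lag, n), h)
    result = {min_lag: y}
    for lag in range(min_lag + 1, max_lag + 1):
        # The vector for lag+1 is the vector for lag shifted right by one
        # sample, with one new sample s in front, so
        # y_new[0] = h[0]*s and y_new[i] = y_old[i-1] + h[i]*s.
        s = past[len(past) - lag]
        y = [h[0] * s] + [y[i - 1] + h[i] * s for i in range(1, n)]
        result[lag] = y
    return result
```

Each additional lag then costs about n multiply-adds, whereas a direct convolution costs on the order of n²/2 per lag.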

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a speech coding apparatus according to a first embodiment of the present invention.

FIG. 2 is a diagram showing example waveforms of voiced and unvoiced sections of a speech signal.

FIG. 3 is a flowchart showing the main processing flow in the first embodiment.

FIG. 4 is a block diagram showing the configuration of a speech decoding apparatus according to the first embodiment.

FIG. 5 is a block diagram showing the configuration of a speech coding apparatus according to a second embodiment of the present invention.

FIG. 6 is a block diagram showing the configuration of a speech coding apparatus according to a third embodiment of the present invention.

FIG. 7 is a block diagram showing a typical configuration of a conventional CELP speech coding apparatus.

FIG. 8 is a block diagram showing the configuration of the main part of a conventional SEV speech coding apparatus.

[Explanation of symbols]

100: speech signal input terminal
101: buffer
102: LPC analysis unit
103: LSP quantizer
104: subframe division circuit
105: weighting filter
110: first adaptive codebook
111: gain circuit
112: adder
113: weighted synthesis filter
114: subtractor
120: second adaptive codebook
121: noise codebook
122: switch
123: gain circuit
124: gain codebook
130: signal classifier
140 to 143: search units
150: multiplexer
151: coding parameter output terminal
160: coding parameter input terminal
161: demultiplexer
162: LSP dequantizer
170: first adaptive codebook
171: gain circuit
173: synthesis filter
174: post filter
180: second adaptive codebook
181: noise codebook
182: switch
183: gain circuit
210: first adaptive codebook
211: fixed codebook
212: switch
230: signal classifier
330: noise codebook
331: gain circuit

Claims

[Claims]

1. A speech coding method in which a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, the drive signal is supplied to a synthesis filter whose filter coefficients are determined based on an analysis result of an input speech signal, the drive vector codebooks are searched for a drive vector that reduces the distortion of the synthesized speech signal output from the synthesis filter, and coding parameters representing at least information on the drive vector and the filter coefficients of the synthesis filter are output, wherein: at least two adaptive codebooks and at least one noise codebook are provided as the drive vector codebooks; when the input speech signal is periodic, or periodic and stationary, the drive signal is generated using drive vectors obtained from at least the two adaptive codebooks; and when the input speech signal is aperiodic, or periodic and non-stationary, the drive signal is generated using a drive vector obtained from at least the noise codebook.

2. A speech coding apparatus in which a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, the drive signal is supplied to a synthesis filter whose filter coefficients are determined based on an analysis result of an input speech signal, the drive vector codebooks are searched for a drive vector that reduces the distortion of the synthesized speech signal output from the synthesis filter, and coding parameters representing at least information on the drive vector and the filter coefficients of the synthesis filter are output, the apparatus comprising: at least two adaptive codebooks and at least one noise codebook provided as the drive vector codebooks; classifying means for classifying the input speech signal according to whether it is periodic, or whether it is periodic and stationary; and drive signal generating means for generating the drive signal using drive vectors obtained from at least the two adaptive codebooks when the classifying means classifies the input speech signal as periodic, or as periodic and stationary, and for generating the drive signal using drive vectors obtained from at least one adaptive codebook and the noise codebook when the input speech signal is classified as aperiodic, or as periodic and non-stationary.

3. A speech coding method in which a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, the drive signal is supplied to a synthesis filter whose filter coefficients are determined based on an analysis result of an input speech signal, the drive vector codebooks are searched for a drive vector that reduces the distortion of the synthesized speech signal output from the synthesis filter, and coding parameters representing at least information on the drive vector and the filter coefficients of the synthesis filter are output, wherein: at least two adaptive codebooks, at least one fixed codebook, and at least one noise codebook are provided as the drive vector codebooks; in a first step, whichever of the first adaptive codebook and the fixed codebook gives the smaller distortion of the synthesized speech signal is selected; in a second step, the input speech signal is classified according to whether it is periodic, or whether it is periodic and stationary, by determining whether the first adaptive codebook or the fixed codebook was selected in the first step; when the input speech signal is periodic, or periodic and stationary, the drive signal is generated using drive vectors obtained from at least the two adaptive codebooks; and when the input speech signal is aperiodic, or periodic and non-stationary, the drive signal is generated using a drive vector obtained from at least the noise codebook.

4. A speech coding apparatus in which a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, the drive signal is supplied to a synthesis filter whose filter coefficients are determined based on an analysis result of an input speech signal, the drive vector codebooks are searched for a drive vector that reduces the distortion of the synthesized speech signal output from the synthesis filter, and coding parameters representing at least information on the drive vector and the filter coefficients of the synthesis filter are output, the apparatus comprising: at least two adaptive codebooks, at least one fixed codebook, and at least one noise codebook provided as the drive vector codebooks; means for selecting, in a first stage, whichever of the first adaptive codebook and the fixed codebook gives the smaller distortion of the synthesized speech signal; classifying means for classifying, in a second stage, the input speech signal according to whether it is periodic, or whether it is periodic and stationary, by determining whether the first adaptive codebook or the fixed codebook was selected; and drive signal generating means for generating the drive signal using drive vectors obtained from at least the two adaptive codebooks when the classifying means classifies the input speech signal as periodic, or as periodic and stationary, and for generating the drive signal using drive vectors obtained from at least one adaptive codebook and the noise codebook when the input speech signal is classified as aperiodic, or as periodic and non-stationary.

5. A speech decoding method in which coding parameters representing at least a drive vector and the filter coefficients of a synthesis filter are input, a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, and the drive signal is supplied to the synthesis filter, whose filter coefficients are determined based on the coding parameters, to decode a speech signal, wherein: at least two adaptive codebooks and at least one noise codebook are provided as the drive vector codebooks; and the drive signal is generated using either drive vectors obtained from at least the two adaptive codebooks or a drive vector obtained from at least the noise codebook, selected based on the coding parameters.

6. A speech decoding apparatus in which coding parameters representing at least a drive vector and the filter coefficients of a synthesis filter are input, a drive signal is generated using drive vectors obtained from a plurality of drive vector codebooks, and the drive signal is supplied to the synthesis filter, whose filter coefficients are determined based on the coding parameters, to decode a speech signal, the apparatus comprising: at least two adaptive codebooks and at least one noise codebook provided as the drive vector codebooks; and drive signal generating means for generating the drive signal using either drive vectors obtained from at least the two adaptive codebooks or a drive vector obtained from at least the noise codebook, selected based on the coding parameters.