JPH07271397A

JPH07271397A - Voice encoding device

Info

Publication number: JPH07271397A
Application number: JP6065265A
Authority: JP
Inventors: Ko Amada; 皇天田; Masami Akamine; 政巳赤嶺; Kimio Miseki; 公生三関; Susumu Kanba; 進神庭; Masahiro Oshikiri; 正浩押切
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1994-04-01
Filing date: 1994-04-01
Publication date: 1995-10-20
Anticipated expiration: 2018-12-02
Also published as: JP3471889B2

Abstract

PURPOSE: To provide a speech coding device in which drive vectors are preselected with high precision, so that the number of drive vector candidates can be reduced while the quality of the coded speech is maintained, and the amount of computation in the final selection is reduced. CONSTITUTION: The device comprises a drive vector generating section 101, a synthesis filter 102 which receives the generated drive vectors and generates synthesized speech vectors, a preselection section 104 which selects at least one drive vector from the generated drive vectors, and a final selection section 105 which selects an optimum drive vector from the drive vectors selected by the section 104. The section 104 selects the drive vectors that make larger the value obtained by weighting the magnitude of the inner product of a target vector, obtained from the input speech, and the synthesized speech vector with a weighting function that takes the drive vector as a parameter.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a linear-predictive-analysis speech coding apparatus, and in particular to a speech coding apparatus, such as CELP, that feeds a plurality of drive vectors into a synthesis filter, compares the resulting synthesized speech vectors with the input speech under perceptual weighting, and searches a codebook for the drive vector that minimizes the distortion.

[0002]

[Prior Art] CELP (Code Excited Linear Prediction) is one of the effective methods for coding telephone-band speech at a transmission rate of about 4 kbps. Processing in CELP is broadly divided into two parts: obtaining a speech synthesis filter that models the vocal tract from input speech divided into frames, and obtaining the drive vector that serves as the input signal to this filter. The latter requires a process called codebook search, in which the drive vectors stored in a codebook are passed one at a time through the speech synthesis filter and the synthesized speech is compared with the input speech. This process requires a large amount of computation. The present invention relates to reducing the amount of computation in the codebook search.

[0003] The CELP method is described in detail in, for example, M. R. Schroeder and B. S. Atal, "Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates", Proc. ICASSP, pp. 937-940, 1985, and W. S. Kleijin, D. J. Krasinski et al., "Improved Speech Quality and Efficient Vector Quantization in SELP", Proc. ICASSP, pp. 155-158, 1988.

[0004] FIG. 9 is a block diagram outlining a speech coding apparatus based on the CELP method. The codebook search is described first. Two kinds of drive vectors are provided, stored in adaptive codebook 901 and noise codebook 902. Adaptive codebook 901 is a variable codebook that stores past drive vectors, whereas noise codebook 902 is a fixed codebook that stores a plurality of predetermined patterns. The input speech vector R supplied to terminal 906 is analyzed by linear prediction analysis unit 907 to obtain the characteristics of synthesis filter 908, after which one optimum drive vector is selected from each of adaptive codebook 901 and noise codebook 902.

[0005] Specifically, the drive vectors taken one at a time from codebooks 901 and 902 are passed through synthesis filter 908, and the resulting output (the synthesized speech vector) is compared with the input speech vector R. Codebooks 901 and 902 are searched in this way for the drive vector that produces the optimum synthesized speech vector, that is, the one closest to the input speech vector R.

[0006] Next, the drive vector search method is described using equations. Let yi be the i-th drive vector taken from codebooks 901 and 902, Yi the synthesized speech vector obtained by passing yi through synthesis filter 908, and R the input speech vector. It is common to search for the drive vector yi that minimizes the sum of squared differences between Yi and R,

E = |R − αYi|²  (1)

where α is the optimum gain when yi is selected; in FIG. 9 it is applied by gain circuits 903 and 904. The optimum gain α is found by setting the partial derivative of this expression with respect to α to zero. Substituting it into equation (1) and rearranging gives

E = |R|² − (R, Yi)² / |Yi|²  (2)

Since the first term of this equation is a constant that does not depend on the drive vector, finding the optimum drive vector under the optimum gain α is equivalent to taking the second term of equation (2),

(R, Yi)² / |Yi|²  (3)

as the evaluation expression and finding the yi that maximizes it. Because this evaluation expression is computed for every candidate, the codebook search is the most computation-intensive part of the entire CELP method.
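The step from equation (1) to equation (2) is only summarized in the text. A short derivation sketch, using the same symbols and the inner-product notation (R, Y_i) of the description, could read as follows:

```latex
\begin{align*}
E &= \lVert R - \alpha Y_i \rVert^2
   = \lVert R \rVert^2 - 2\alpha\,(R, Y_i) + \alpha^2 \lVert Y_i \rVert^2 \\
\frac{\partial E}{\partial \alpha} &= -2\,(R, Y_i) + 2\alpha \lVert Y_i \rVert^2 = 0
   \;\Longrightarrow\; \alpha = \frac{(R, Y_i)}{\lVert Y_i \rVert^2} \\
E\big|_{\alpha\ \text{optimal}} &= \lVert R \rVert^2 - \frac{(R, Y_i)^2}{\lVert Y_i \rVert^2}
\end{align*}
```

The last line is equation (2); minimizing it over i is the same as maximizing expression (3).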

[0007] A technique has therefore been proposed in which a simplified evaluation expression is used to choose N candidates (1 < N < M) from a codebook of M drive vector candidates, and the evaluation expression above is then used to narrow these N candidates down to the single optimum one. This technique is described in detail in JP-A-5-100697. The first stage, choosing from the M drive vectors obtained from the codebook the N that make the evaluation expression

E = (R, Yi)²  (4)

large, is called preselection. The second stage, narrowing the N candidates down to one using expression (3), is called main selection. With this technique the computationally expensive main selection need only be performed on the N candidates chosen by preselection, so the amount of computation is greatly reduced compared with performing main selection on all M codebook candidates without preselection.
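The conventional two-stage search described above can be sketched as follows. This is a minimal illustration, not the implementation of the cited publication: the helper names (`dot`, `synthesize`, `two_stage_search`), the representation of the synthesis filter by a truncated impulse response `h`, and the zero-state filtering are all assumptions made for the example.

```python
def dot(a, b):
    """Inner product (a, b) of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def synthesize(h, y):
    """Assumed synthesis filter: zero-state convolution of y with impulse response h,
    truncated to the length of y."""
    return [
        sum(h[k] * y[t - k] for k in range(min(t, len(h) - 1) + 1))
        for t in range(len(y))
    ]


def two_stage_search(R, codebook, h, n_keep):
    """Preselect n_keep candidates with eq. (4), then pick the best with eq. (3).

    Assumes no synthesized vector is all zeros (eq. (3) divides by its power).
    """
    synth = [synthesize(h, y) for y in codebook]
    # Preselection, eq. (4): rank by the squared inner product only.
    ranked = sorted(range(len(codebook)),
                    key=lambda i: dot(R, synth[i]) ** 2, reverse=True)
    pre = ranked[:n_keep]
    # Main selection, eq. (3): normalize by the synthesized-vector power.
    best = max(pre, key=lambda i: dot(R, synth[i]) ** 2 / dot(synth[i], synth[i]))
    return pre, best


pre, best = two_stage_search([1.0, 1.0],
                             [[1.0, 0.0], [0.0, 3.0], [2.0, 2.0]],
                             [1.0], 2)
```

Only `n_keep` candidates reach the division in expression (3); the remaining candidates are rejected on the cheaper expression (4).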

[0008] This conventional preselection method is considered to rest on the assumption that the power of Yi, the denominator of expression (3), is approximately constant. In practice, however, the power of the drive vector yi is not necessarily constant, and even if it were, the gain of the synthesis filter depends on the drive vector and is not a constant. The assumption that the power |Yi|² is constant is therefore unreasonable, and this lowers the accuracy of the preselection.

[0009] In recent years the noise codebook proposed for classical CELP has been used less often, and structured noise codebooks are frequently used instead in order to reduce computation and memory and to obtain higher-quality coded speech. Examples include the overlapping codebook, in which drive vectors of one frame length are cut out of a single long noise signal so that each overlaps the preceding drive vector; the pitch-synchronous codebook, in which drive vectors are made periodic based on the pitch information obtained in the adaptive vector search; and the adaptive-density codebook, in which a single drive vector is used with a varying number of zeros inserted between its samples.

[0010] Because of their structure, these structured codebooks make it difficult to keep the drive vector power constant. Under these conditions, using the conventional preselection method, which assumes |Yi|² is constant, lowers the accuracy of preselection and consequently degrades the quality of the coded speech. If the number of candidates kept by preselection is increased to preserve the quality of the coded speech, the amount of computation in the main selection increases.

[0011]

[Problems to Be Solved by the Invention] As described above, when drive vectors are preselected using a structured codebook, the drive vector power is not constant, so the conventional preselection method does not necessarily select drive vectors accurately.

[0012] An object of the present invention is to provide a speech coding apparatus that enables highly accurate preselection of drive vectors, so that the number of drive vector candidates passed to the main selection can be reduced while the quality of the coded speech is maintained, thereby lowering the amount of computation in the main selection, which accounts for most of the computation required for coding.

[0013]

[Means for Solving the Problems] The present invention concerns a speech coding apparatus having drive vector generating means for generating drive vectors, synthesizing means for receiving the drive vectors generated by the drive vector generating means and generating synthesized speech vectors, preselection means for selecting at least one drive vector from the drive vectors generated by the drive vector generating means, and main selection means for selecting an optimum drive vector from the drive vectors selected by the preselection means. Its essential feature is that the preselection means is configured to preselect drive vectors using an evaluation expression weighted by a weight coefficient based on the drive vector power.

[0014] That is, in the first invention the preselection means is configured to select the drive vectors that make larger the value obtained by weighting the magnitude of the inner product of the target vector, obtained from the input speech divided into predetermined unit periods, and the synthesized speech vector with a weighting function that takes the drive vector generated by the drive vector generating means as a parameter.

[0015] Expressed as an equation: when the M drive vectors yi (i = 1, …, M) generated by the drive vector generating means are each input to the synthesizing means in order to find the drive vector whose synthesized speech vector is closest to the target vector R, the preselection means chooses as preselection candidates the N (1 < N < M) vectors yi that make large the value of the preselection evaluation expression

E = W(yi) (R, Yi)²  (5)

which uses a weight coefficient W(yi) having yi as a parameter. Here Yi is the output obtained by inputting yi to the speech synthesizing means.
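Equation (5) can be illustrated with a small sketch in which the weight function is passed in as a parameter. The function names, the impulse-response model of the synthesizing means, and the toy data are assumptions for illustration; `inverse_power` corresponds to the choice of the reciprocal drive-vector power that the description discusses later, and a weight of 1 reproduces the conventional expression (4).

```python
def dot(a, b):
    """Inner product (a, b) of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def synthesize(h, y):
    """Assumed synthesis filter: zero-state convolution of y with impulse response h."""
    return [
        sum(h[k] * y[t - k] for k in range(min(t, len(h) - 1) + 1))
        for t in range(len(y))
    ]


def preselect(R, codebook, h, n_keep, weight):
    """Return the indices of the n_keep drive vectors with the largest
    E = W(y_i) * (R, Y_i)^2, eq. (5), best first."""
    scores = []
    for i, y in enumerate(codebook):
        Y = synthesize(h, y)
        scores.append((weight(y) * dot(R, Y) ** 2, i))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scores[:n_keep]]


def inverse_power(y):
    """One possible weight: reciprocal of the drive-vector power (assumes y is non-zero)."""
    return 1.0 / dot(y, y)


R = [1.0, 0.0]
codebook = [[1.0, 0.0], [10.0, 10.0], [0.0, 1.0]]
unweighted = preselect(R, codebook, [1.0], 2, lambda y: 1.0)
weighted = preselect(R, codebook, [1.0], 2, inverse_power)
```

In this toy codebook the high-power vector at index 1 ranks first under the unweighted criterion, while the weighted criterion ranks index 0 first, which is also the vector that maximizes expression (3).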

[0016] In the second invention, where the drive vector generating means has a codebook and generates a drive vector by cutting out from the codebook the one drive vector designated by a given index, the preselection means is configured to select at least one drive vector that makes larger the value obtained by weighting the magnitude of the inner product of the target vector, obtained from the input speech divided into predetermined unit periods, and the synthesized speech vector with a weighting function that takes as parameters the index and the past drive vectors stored in the codebook of the drive vector generating means.

[0017] Expressed as an equation: let C be the group of drive vectors stored in the codebook and i the index. When searching for the drive vector whose synthesized speech vector is closest to the target vector R, the preselection means chooses as preselection candidates the drive vectors that make large the value of the preselection evaluation expression

E = W(C, i) (R, Yi)²  (6)

which uses a weight coefficient W(C, i) having C and i as parameters.

[0018] Further, in the third invention, after the target vector has been obtained from the input speech divided into predetermined unit periods and an optimum synthesized speech vector has been found, an orthogonalized vector is obtained by orthogonalizing the synthesized speech vector generated by the synthesizing means with respect to the optimum synthesized speech vector. The preselection means is configured to select the drive vectors that make larger the value obtained by weighting the magnitude of the inner product of this orthogonalized vector and the target vector with a weight coefficient that takes the drive vector generated by the drive vector generating means as a parameter.

[0019] Expressed as an equation: suppose an optimum synthesized speech vector X approximating the target vector R has already been found, and the drive vector producing a second synthesized speech vector approximating R is to be searched for among the M drive vectors yi (i = 1, …, M) generated by the drive vector generating means. The synthesized speech vector Yi of yi is orthogonalized with respect to X to obtain the orthogonalized vector Yvi, and then the N (1 < N < M) vectors yi that make large the value of the preselection evaluation expression

E = W(yi) (R, Yvi)²  (7)

which uses a weight coefficient W(yi) having yi as a parameter, are chosen as preselection candidates.
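The description does not specify how Yi is orthogonalized with respect to X. A minimal sketch, assuming a single Gram-Schmidt projection step, is shown below; the function names and the numeric values are hypothetical.

```python
def dot(a, b):
    """Inner product (a, b) of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(Y, X):
    """Remove from Y its component along X (one Gram-Schmidt step, assumed here).

    Assumes X is not the zero vector.
    """
    c = dot(Y, X) / dot(X, X)
    return [y - c * x for y, x in zip(Y, X)]


def score_eq7(R, Y, X, w):
    """Preselection score E = W * (R, Yv)^2 of eq. (7), with the weight w supplied by the caller."""
    Yv = orthogonalize(Y, X)
    return w * dot(R, Yv) ** 2


Yv = orthogonalize([3.0, 4.0], [1.0, 0.0])
E = score_eq7([1.0, 2.0], [3.0, 4.0], [1.0, 0.0], 0.5)
```

After the projection, Yv has no component along X, so the score in equation (7) measures only what the second vector adds beyond the contribution already captured by X.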

[0020]

[Operation] The preselection evaluation expression of the present invention,

E = W(yi) (R, Yi)²  (8)

can be said to be more accurate than expression (4), the conventional preselection evaluation expression without the weight W(yi), on the following grounds.

[0021] The denominator of expression (3), the main-selection evaluation expression, is the power of the synthesized speech vector. Using the drive vector yi and the synthesis filter gain G(yi), it can be written

|Yi|² = G(yi)² |yi|²  (9)

G(yi) takes different values depending on yi, but if the spectral shapes of the yi are nearly the same it can be assumed constant. In practice codebooks are often composed of noise sequences and the like, and the spectral distribution usually does not differ much between drive vectors. It is therefore realistic to assume that G(yi) is constant, and writing G = G(yi) as a constant, equation (9) becomes

|Yi|² = G² |yi|²  (10)

This equation shows that the power of the synthesized speech vector Yi can be estimated by multiplying the power of the drive vector yi by the square of the constant G, which corresponds to the gain of the synthesis filter. If it is further assumed that the power of the drive vectors in the codebook is a constant value y², equation (9) becomes

|Yi|² = G² y²  (11)

and |Yi|² is a constant. As a result, the values of evaluation expression (3) can be compared approximately using the numerator alone. The conventional method is thought to rest mainly on this assumption, comparing the values of the evaluation expression with the denominator |Yi|² treated as a constant. However, as noted in the prior-art section, given the recent situation in which structured codebooks are used, this assumption lowers the accuracy of the preselection evaluation expression. In the evaluation expression of the present invention, by contrast, setting the weight coefficient W(yi) to the reciprocal of the power of the drive vector yi, 1/|yi|², brings the drive vector power into the evaluation expression, and the accuracy of the expression improves accordingly. When the reciprocal of the power is difficult to obtain, using an estimate of it still gives better accuracy than the conventional method, which treats it as a constant. The conventional method can also be regarded as the special case of the present invention in which W(yi) = 1.
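The link between the weighted preselection expression and the main-selection expression (3) can be made explicit. The following is a short sketch, assuming the weight is the reciprocal drive-vector power and that the constant-gain relation (10) holds:

```latex
\begin{align*}
W(y_i) &= \frac{1}{\lVert y_i \rVert^2}, \qquad
\lVert Y_i \rVert^2 = G^2 \lVert y_i \rVert^2 \\
E &= W(y_i)\,(R, Y_i)^2
   = \frac{(R, Y_i)^2}{\lVert y_i \rVert^2}
   = G^2 \, \frac{(R, Y_i)^2}{\lVert Y_i \rVert^2}
\end{align*}
```

Under these assumptions the weighted expression differs from expression (3) only by the constant factor G², so both rank the candidates identically. The preselection is then only as inaccurate as the constant-gain assumption itself.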

[0022] The number N of candidates chosen by preselection is closely tied to how accurately the evaluation expression is simplified. With an accurate simplification, N can be small, and the computation required in the main selection is correspondingly small. If the expression is oversimplified and accuracy is lost, N must be made large to maintain the quality of the coded speech, and the computation in the main selection increases as a result. The key point of preselection is therefore how far the evaluation expression can be simplified without losing accuracy.

[0023] With the evaluation expression of the present invention, the accuracy of preselection improves as described above, so the number of candidates passed to the main selection can be reduced while the quality of the coded speech is maintained, and the computation in the main selection can be lowered. Since the main selection accounts for most of the computation of the whole coding apparatus, the computation of the whole apparatus is greatly reduced as a result.

[0024]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0025] (Embodiment 1) FIG. 1 shows the configuration of a speech coding apparatus according to one embodiment of the present invention. The apparatus consists broadly of drive vector generation unit 101, synthesis filter 102, preselection unit 104, and main selection unit 108. The input speech vector R is supplied to input terminal 103. Preselection unit 104 consists of weight coefficient derivation unit 105, evaluation expression calculation unit 106, and evaluation unit 107.

[0026] The drive vector yi generated by drive vector generation unit 101 is passed through synthesis filter 102 to obtain the synthesized speech vector Yi. The drive vector yi is also input to weight coefficient derivation unit 105, which yields the weight coefficient W(yi). Evaluation expression calculation unit 106 computes and outputs the value of the evaluation expression formed from the synthesized speech vector Yi, the weight coefficient W(yi) having yi as a parameter, and the input speech vector R,

E = W(yi) (R, Yi)²

that is, the magnitude of the inner product of R and Yi weighted by W(yi). Evaluation unit 107 finds, among the drive vectors yi, the several that make the value of E larger, and outputs their indices i as preselection candidates.

[0027] The evaluation expression E is more accurate than the conventional preselection evaluation expression, which is not multiplied by the weight coefficient W(yi). Specific ways of determining W(yi) are described in Embodiment 3 onward.

[0028] The preselection candidates chosen by preselection unit 104 in this way are narrowed down to one candidate by main selection unit 108, and the optimum drive vector X is obtained as output 109. As an exception, for purposes such as delayed decision, the main selection may narrow the candidates but still leave several of them.

[0029] (Embodiment 2) FIG. 2 shows the configuration of a speech coding apparatus according to another embodiment of the present invention. This embodiment differs from that of FIG. 1 in that the input speech vector R is passed through deconvolution unit 201 before being input to evaluation expression calculation unit 106.

[0030] Let H be the matrix representing the filtering performed by synthesis filter 102; then Yi = H yi. The inner product contained in the evaluation expression E can therefore be written

(R, Yi) = R^t H yi = (H^t R, yi)

This shows that if H^t R is computed once before the drive vector search begins, the inner product (R, Yi) can be obtained during the search simply by taking the inner product of that vector with yi. No filtering is then needed during the search, and the amount of computation is reduced further.
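The identity (R, H yi) = (H^t R, yi) can be checked numerically. The sketch below assumes H is the lower-triangular Toeplitz matrix built from a truncated impulse response, a common way to model zero-state synthesis filtering; the description itself does not state the form of H, and all names and values are illustrative.

```python
def dot(a, b):
    """Inner product (a, b) of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def filter_matrix(h, n):
    """n x n lower-triangular Toeplitz matrix with H[t][k] = h[t - k] (assumed form of H)."""
    return [
        [h[t - k] if 0 <= t - k < len(h) else 0.0 for k in range(n)]
        for t in range(n)
    ]


def matvec(M, v):
    """Matrix-vector product M v."""
    return [dot(row, v) for row in M]


def transpose(M):
    """Matrix transpose."""
    return [list(col) for col in zip(*M)]


H = filter_matrix([1.0, 0.5], 3)
R = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

direct = dot(R, matvec(H, y))        # filter y first, then correlate with R
HtR = matvec(transpose(H), R)        # computed once, before the search
shortcut = dot(HtR, y)               # per-candidate cost: one inner product
```

`HtR` depends only on R and the filter, so it is shared by every candidate; each candidate then costs one inner product instead of one filtering operation plus one inner product.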

[0031] In FIG. 2, the input speech vector R is input to deconvolution unit 201, which generates the deconvolved input speech vector H^t R. This deconvolved input speech vector H^t R is input to evaluation expression calculation unit 106. The drive vector yi, meanwhile, is input directly to evaluation expression calculation unit 106 and weight coefficient derivation unit 105. Evaluation expression calculation unit 106 computes (H^t R, yi), which is equivalent to (R, Yi), so preselection unit 104 yields the same preselection candidates i as in FIG. 1. Compared with Embodiment 1, the computation is reduced by the amount of the filtering by the synthesis filter that is no longer performed. This method of reducing computation can also be applied to the embodiments that follow.

[0032] (Embodiment 3) FIG. 3 shows the configuration of a speech coding apparatus according to a third embodiment of the present invention. In this embodiment, drive vector generation unit 101 consists of index generation unit 301, adaptive codebook 302, and drive vector cutout unit 303. Adaptive codebook 302 stores past drive vectors. Drive vector cutout unit 303 cuts the drive vector yi out of the past drive vectors stored in adaptive codebook 302, based on the pitch period corresponding to the index i output by index generation unit 301, and outputs it. The drive vector yi passes through synthesis filter 102 and is input to evaluation expression calculation unit 106 as a synthesized speech vector.

[0033] Meanwhile, the past drive vector group C stored in adaptive codebook 302 and the index i are input to weight coefficient derivation unit 105, which outputs the weight coefficient W(C, i) having these values as parameters. Evaluation expression calculation unit 106 computes and outputs the evaluation expression

E = W(C, i) (R, Yi)²

from the synthesized speech vector Yi, the weight coefficient W(C, i), and the input speech vector R. The subsequent processing is the same as in Embodiment 1.

[0034] FIGS. 4(a) and 4(b) show an example of adaptive codebook 302 and of the weight coefficient W(C, i). When the past drive vector group supplied from adaptive codebook 302 has the waveform shown in FIG. 4(a), weight coefficient derivation unit 105 creates the weighting function W(C, i) shown in FIG. 4(b) from this waveform, based on the reciprocal of the average power of each block.

[0035] When the index i is given, the corresponding pitch period is determined, and once the pitch period is determined, the time position in the drive vector group stored in adaptive codebook 302 from which drive vector cutout unit 303 cuts out the drive vector is also determined. The weight coefficient W(C, i) is obtained from the weight coefficient graph of FIG. 4(b), with this cut-out time as the parameter. Since the contents of adaptive codebook 302 do not change during the drive vector search, if the weight coefficient graph of FIG. 4(b) is prepared before the search begins, the weight coefficient W(C, i) can be obtained from the index i by table lookup during the search.
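The table lookup described above might be sketched as follows. The block length, the mapping from a pitch lag to a cut-out position (taken here as the buffer length minus the lag), the small floor that avoids division by zero, and all function names are assumptions for illustration; the description only states that the weights derive from the reciprocal of the per-block average power.

```python
def build_weight_table(past, block_len):
    """One weight per block of the past excitation: reciprocal of the block's mean power.

    Built once per frame, before the search starts.
    """
    table = []
    for start in range(0, len(past), block_len):
        block = past[start:start + block_len]
        mean_power = sum(x * x for x in block) / len(block)
        table.append(1.0 / max(mean_power, 1e-12))  # floor is an assumed safeguard
    return table


def cut_start(past_len, pitch_lag):
    """Assumed mapping: the drive vector is cut out starting pitch_lag samples from the end."""
    return past_len - pitch_lag


def lookup_weight(table, block_len, start_pos):
    """W(C, i) for a drive vector whose cut-out begins at start_pos: a plain table lookup."""
    return table[start_pos // block_len]


past = [1.0, 1.0, 2.0, 2.0]
table = build_weight_table(past, 2)
w_short_lag = lookup_weight(table, 2, cut_start(len(past), 1))
w_long_lag = lookup_weight(table, 2, cut_start(len(past), 4))
```

During the search no power is computed per candidate; each index costs one integer division and one array read.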

[0036] Although this embodiment uses adaptive codebook 302, which changes along with the input speech vector R, in drive vector generation unit 101, it has the advantage that the weight coefficient is obtained easily, with no computation needed during the search within a frame.

[0037] (Embodiment 4) FIG. 5 shows the configuration of the speech coding apparatus according to this embodiment. In this embodiment, the drive vector generation unit 101 consists of an adaptive codebook 501 storing the past drive vector group, a noise codebook 502 storing a plurality of fixed vectors, and gain circuits 503 and 504 that multiply the vectors obtained from the codebooks 501 and 502 by gains. The speech synthesis unit is a weighted synthesis filter 506, which combines a recursive filter using the prediction coefficients obtained by linear prediction analysis of the input speech with a perceptual weighting filter. The target vector is the weighted input speech vector R, obtained by taking the input speech of the current frame weighted by the perceptual weighting filter and subtracting from it the zero-input response of the weighted synthesis filter in its internal state immediately after the previous frame was processed.
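The construction of the target vector can be sketched as follows. This is a simplified illustration, not the apparatus itself: the weighted synthesis filter is modeled as a single all-pole recursive filter with hypothetical coefficients `a`, and the filter memory is passed explicitly as `state`. The function names and the state layout are assumptions made for the example.

```python
def all_pole_filter(x, a, state):
    """Recursive filter y[n] = x[n] - sum_k a[k] * y[n-1-k].

    a: coefficients a[0..p-1] (p >= 1); state: last p outputs, most recent first.
    Returns (output, new_state).
    """
    hist = list(state)
    out = []
    for sample in x:
        y = sample - sum(c * h for c, h in zip(a, hist))
        out.append(y)
        hist = [y] + hist[:-1]
    return out, hist


def target_vector(weighted_speech, a, state):
    """Subtract the zero-input response of the filter from the weighted speech."""
    zir, _ = all_pole_filter([0.0] * len(weighted_speech), a, state)
    return [s - z for s, z in zip(weighted_speech, zir)]
```

Removing the zero-input response means that what remains can be matched by the filter's zero-state response to a candidate drive vector, so every candidate can be synthesized from a cleared filter memory.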

[0038] First, the adaptive codebook 501 is searched for a drive vector. The noise codebook 502 is not used at this stage, so it is regarded as disconnected from the circuit. The preliminary selection unit 104 first takes in the weighted input speech vector R, and then takes in the drive vector xi from the adaptive codebook 501 and the synthesized speech vector Xi obtained by passing xi through the weighted synthesis filter 506. At this time the gain of the gain circuit 503 is fixed to a constant (usually 1). Using the weighting coefficient W(xi), the preliminary selection unit 104 calculates the evaluation formula E = W(xi)(R, Xi)² (12) and keeps the several drive vectors with the largest values as preliminary selection candidates. How many drive vectors to keep as preliminary selection candidates depends on the size of the adaptive codebook 501, the required coded speech quality, and so on, but about 4 to 16 candidates are often enough to obtain sufficient quality. The weighting coefficient W(xi) is given by the following equation, which is the reciprocal of the power of the synthesized speech vector. Here xi(n) denotes the n-th element of xi, and L denotes the frame length.

【００３９】[0039]

[Equation 1]

The preliminary selection candidates chosen in this way by the preliminary selection unit 104 are narrowed down to one candidate by the main selection unit 108, and the first optimum drive vector x is obtained as the output. As an exception, when delayed decision or a similar technique is to be used, the main selection still narrows the candidates but may leave more than one.
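A minimal sketch of this preliminary selection step is given below. It ranks candidates by E = W(x)(R, X)² and keeps the best few. Several details are assumptions for illustration only: the weighted synthesis filter is replaced by a convolution with a hypothetical impulse response `h`, the weight is taken as the reciprocal of the power of the drive vector x (the equation image is not available here), and `eps` guards against division by zero.

```python
def inner(u, v):
    """Inner product (u, v)."""
    return sum(p * q for p, q in zip(u, v))


def synthesize(x, h):
    """Zero-state response: x convolved with impulse response h, cut to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]


def preselect(R, candidates, h, num_keep, eps=1e-12):
    """Return the indices of the num_keep candidates with the largest E.

    E = W(x) * (R, X)^2, with W(x) assumed to be 1 / power(x).
    """
    scored = []
    for idx, x in enumerate(candidates):
        X = synthesize(x, h)
        W = 1.0 / (inner(x, x) + eps)
        scored.append((W * inner(R, X) ** 2, idx))
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:num_keep]]
```

With `R = [1, 0]` and an identity filter, the candidate `[10, 10]` has a larger raw inner product than `[1, 0]`, but the weight ranks `[1, 0]` first, which is the effect the weighting is meant to achieve.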

[0040] Next, an orthogonalized search is performed on the noise codebook 502. The adaptive codebook 501 is not used at this stage, so it is regarded as disconnected from the circuit. The preliminary selection unit 104 holds the weighted input speech vector R taken in earlier and the minimum-distortion synthesized speech vector X, obtained by passing the optimum drive vector x found in the adaptive codebook 501 through the weighted synthesis filter 506. It then receives the drive vector yi from the noise codebook 502 and the synthesized speech vector Yi obtained by passing yi through the weighted synthesis filter 506. The gain of the gain circuit 504 is normally fixed to a constant (usually 1). Using the weighting coefficient W(yi), the preliminary selection unit 104 evaluates the preliminary selection formula for the orthogonalized search, E = W(yi)(R, Yvi)² (14), and keeps the several drive vectors with the largest values as preliminary selection candidates. Yvi is the synthesized speech vector Yi orthogonalized with respect to X; specifically, Yvi = Yi - {(Yi, X) / |X|²} X (15). The weighting coefficient W(yi) is given by the reciprocal of the power of the drive vector, as shown in the following equation.

【００４１】[0041]

[Equation 2]

Since the drive vectors in the noise codebook 502 are fixed, the above equation for W(yi) need not be evaluated for every frame. The calculation of W(yi) itself can be avoided by (1) holding the values in advance as table data, (2) calculating them only once when the coding apparatus is initialized, or (3) designing the codebook so that the drive vectors all have the same power. Methods (1) and (2) can be used with any kind of fixed codebook, but they need enough memory to store W(yi) for every drive vector. Method (3) needs no memory, but with some codebook structures it is difficult to equalize the power in advance, so it cannot always be used.
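The orthogonalized preliminary selection together with method (2) can be sketched as follows. The function names, the `eps` guard, and the choice to pass already-synthesized vectors `synthesized` are assumptions made for the example; the weights are the reciprocal of the drive vector power, as stated in the text, and are computed once rather than per frame.

```python
def inner(u, v):
    """Inner product (u, v)."""
    return sum(p * q for p, q in zip(u, v))


def orthogonalize(Y, X):
    """Yv = Y - ((Y, X) / |X|^2) X, the component of Y orthogonal to X."""
    scale = inner(Y, X) / inner(X, X)   # X must be non-zero
    return [y - scale * x for y, x in zip(Y, X)]


def precompute_weights(codebook, eps=1e-12):
    """Method (2): W(y) = 1 / power(y), computed once at initialization."""
    return [1.0 / (inner(y, y) + eps) for y in codebook]


def preselect_orthogonal(R, X, synthesized, weights, num_keep):
    """Indices of the num_keep largest E = W(y_i) * (R, Yv_i)^2."""
    scored = []
    for idx, Y in enumerate(synthesized):
        Yv = orthogonalize(Y, X)
        scored.append((weights[idx] * inner(R, Yv) ** 2, idx))
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:num_keep]]
```

Only the orthogonalization depends on the current frame (through X); the weights are read from the precomputed list.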

[0042] Finally, the preliminary selection candidates chosen by the preliminary selection unit 104 are narrowed down to one candidate by the main selection unit 108, and the second optimum drive vector yi is obtained.

[0043] (Embodiment 5) In this embodiment, E = W(yvi)(R, Yvi)² is used as the preliminary selection formula for the orthogonalized search of Embodiment 4. If yvi and its synthesis filter output Yvi are replaced by xi and its synthesis filter output Xi, this formula becomes identical to equation (12), so it is in essence the general preliminary selection formula described above. The preliminary selection formula for the orthogonalized search in Embodiment 4 approximates the parameter of the weighting coefficient by yi where yvi should properly be used. This embodiment uses yvi itself, which improves the accuracy accordingly.

[0044] (Embodiment 6) In this embodiment, an overlapping codebook is used in place of the noise codebook of Embodiment 4. The overlapping codebook is a case in which method (3) of Embodiment 4 cannot be used to calculate W(yi), but the structure of the codebook can be exploited to calculate W(yi) with little computation.

[0045] FIG. 6 shows the structure of the drive vectors of the overlapping codebook. The k-th drive vector yk consists of the first L - S elements of the (k-1)-th drive vector yk-1, with S new elements added at the head. If the power Qk-1 of yk-1 has already been obtained, Qk is easily obtained from the relationship in the following equation.

【００４６】[0046]

[Equation 3]

That is, given Qk-1, the power Qk is obtained by calculating the power of 2S elements and performing two additions or subtractions, and W(yk) is obtained by taking the reciprocal of Qk. In practice S = 2 and L = 80 or so. Computing the power of one drive vector therefore requires the power of 80 elements when the overlapping structure is not exploited, but only 4 elements when it is, which reduces the amount of calculation. Note that, if memory is available, methods (1) and (2) of Embodiment 4 are often the more practical choice.
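The recursive power update for an overlapping codebook can be sketched as below. The storage layout is an assumption made for the example: all vectors are windows of one shared `buffer`, with vector k starting S samples before vector k-1, so that each vector is S new samples followed by the first L - S samples of its predecessor. The function name is hypothetical.

```python
def overlapping_powers(buffer, L, S):
    """Powers Q_k of all vectors of an overlapping codebook, computed recursively.

    Assumed layout: vector k is buffer[start_k : start_k + L] with
    start_k = (K - 1 - k) * S, i.e. each vector begins S samples earlier.
    """
    K = (len(buffer) - L) // S + 1                      # number of vectors
    start = (K - 1) * S
    Q = [sum(v * v for v in buffer[start:start + L])]   # Q_0 computed directly
    for k in range(1, K):
        prev_start = start
        start -= S
        # Power of the S samples newly added at the head of y_k.
        added = sum(v * v for v in buffer[start:start + S])
        # Power of the last S samples of y_{k-1}, which y_k no longer contains.
        dropped = sum(v * v for v in buffer[prev_start + L - S:prev_start + L])
        Q.append(Q[-1] + added - dropped)
    return Q
```

Each update touches 2S samples regardless of L, which matches the count given in the text (4 elements for S = 2). The weighting coefficient for vector k is then the reciprocal of `Q[k]`.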

[0047] (Embodiment 7) This embodiment describes a method of calculating W(yi) efficiently during the adaptive codebook search of Embodiment 4 by applying the method described in Embodiment 6.

[0048] FIGS. 7 and 8 show the adaptive codebook and the drive vectors obtained from it. The adaptive codebook stores the past drive vector group. When the pitch period T is shorter than the frame length L (T < L), the section of length T cut out from the adaptive codebook is repeated until the frame length is reached, as shown in FIG. 7. When T > L, the section of length L starting at position T is taken out unchanged, as shown in FIG. 8.

[0049] To calculate the power of a drive vector obtained from such an adaptive codebook when T < L, the power P_T of one period is calculated once and added up as many times as the period repeats (twice in the figure). At the end of the vector (the part marked T0 in the figure), however, there is a section shorter than one period, which must be calculated separately. Alternatively, this section can be left out of the power calculation: the average power per sample is taken to be P_T / T, and (P_T / T) × L is regarded as the power of the drive vector. Once the power has been obtained, its reciprocal is the weighting coefficient W(yi). Furthermore, P_T and P_{T+1} are related by P_{T+1} = P_T + C_{T+1}² (18), so once W(yi) has been calculated for period T, W(yi) for T + 1 is easily obtained from this equation. When L < T, as is clear from FIG. 3, the situation is exactly that of an overlapping codebook with shift amount S = 1. However, because the contents of the codebook change, methods (1) and (2) of Embodiment 4 cannot be used, and the method using the relationship of equation (17) described in Embodiment 6 is effective.
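The power calculations for the case T < L can be sketched as follows. The layout is an assumption: `past` holds the adaptive codebook with the most recent sample last, the period is the last T samples, and C_{T+1} in equation (18) is taken to be the sample T + 1 positions back. The function names are hypothetical.

```python
def period_power(past, T):
    """P_T: power of the most recent T samples of the adaptive codebook."""
    return sum(s * s for s in past[-T:])


def exact_power(past, T, L):
    """Power of the length-L vector built by repeating the last T samples (T < L)."""
    period = past[-T:]
    repeats, rest = divmod(L, T)   # rest is the length of the partial period T0
    return repeats * period_power(past, T) + sum(s * s for s in period[:rest])


def approx_power(past, T, L):
    """Approximation (P_T / T) * L, which ignores the partial period."""
    return period_power(past, T) / T * L


def next_period_power(P_T, past, T):
    """Recursion P_{T+1} = P_T + C_{T+1}^2 for the next pitch candidate."""
    return P_T + past[-(T + 1)] ** 2
```

For example, with `past = [3, 1, 2, 4, 5]`, T = 2 and L = 5, the repeated vector is `[4, 5, 4, 5, 4]`: the exact power is 98, while the approximation gives 102.5. Stepping to T = 3 adds only the square of one further sample.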

【００５０】[0050]

[Effects of the Invention] As described above, according to the present invention, the power of the drive vector is taken into account in the evaluation formula used for preliminary selection of drive vectors, so the accuracy of the preliminary selection is higher than with the conventional method. As a result, the number of drive vector candidates passed to the main selection can be reduced while the quality of the coded speech is maintained, which lowers the amount of calculation in the main selection. Since the main selection accounts for most of the calculation in the coding apparatus as a whole, a large reduction in the total amount of calculation can be expected.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 1.

FIG. 2 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 2, in which the amount of calculation is reduced by using deconvolution.

FIG. 3 is a block diagram showing the configuration of the speech coding apparatus using an adaptive codebook according to Embodiment 3.

FIG. 4 is a diagram showing how the weighting coefficient is obtained in Embodiment 3.

FIG. 5 is a block diagram showing the configuration of the speech coding apparatus according to Embodiment 4.

FIG. 6 is a diagram showing the structure of the overlapping codebook.

FIG. 7 is a diagram showing the structure of the adaptive codebook.

FIG. 8 is a diagram showing the structure of the adaptive codebook.

FIG. 9 is a schematic diagram of the CELP coding method.

[Explanation of symbols]

101 ... Drive vector generation unit
102 ... Synthesis filter
103 ... Input terminal
104 ... Preliminary selection unit
105 ... Weighting coefficient deriving unit
106 ... Evaluation formula calculation unit
107 ... Evaluation unit
108 ... Main selection unit
109 ... Optimum drive vector
201 ... Deconvolution operation unit
301 ... Index generation unit
302 ... Adaptive codebook
303 ... Drive vector cutout unit
501 ... Adaptive codebook
502 ... Noise codebook
503 ... Gain circuit
504 ... Gain circuit
506 ... Weighted synthesis filter

Continuation of front page

(72) Inventor: Susumu Kanba, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa

(72) Inventor: Masahiro Oshikiri, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A speech coding apparatus comprising: drive vector generating means for generating drive vectors; synthesizing means for receiving the drive vectors generated by the drive vector generating means and generating synthesized speech vectors; preliminary selection means for selecting at least one drive vector from the drive vectors generated by the drive vector generating means; and main selection means for selecting an optimum drive vector from the drive vectors selected by the preliminary selection means, wherein the preliminary selection means selects a drive vector that makes larger the value obtained by weighting the magnitude of the inner product of a target vector, obtained from input speech divided into predetermined unit periods, and the synthesized speech vector with a weighting function having the drive vector generated by the drive vector generating means as a parameter.

2. A speech coding apparatus comprising: drive vector generating means having a codebook, for extracting from the codebook and generating one drive vector designated by a predetermined index; synthesizing means for receiving the drive vector generated by the drive vector generating means and generating a synthesized speech vector; preliminary selection means for selecting at least one drive vector from the drive vectors generated by the drive vector generating means; and main selection means for selecting an optimum drive vector from the drive vectors selected by the preliminary selection means, wherein the preliminary selection means selects at least one drive vector that makes larger the value obtained by weighting the magnitude of the inner product of a target vector, obtained from input speech divided into predetermined unit periods, and the synthesized speech vector with a weighting function having the drive vector group stored in the codebook and the index as parameters.

3. A speech coding apparatus comprising: drive vector generating means for generating drive vectors; synthesizing means for receiving the drive vectors generated by the drive vector generating means and generating synthesized speech vectors; preliminary selection means for selecting at least one drive vector from the drive vectors generated by the drive vector generating means; and main selection means for selecting an optimum drive vector from the drive vectors selected by the preliminary selection means, wherein the preliminary selection means, after a target vector obtained from input speech divided into predetermined unit periods and an optimum synthesized speech vector have been obtained, selects a drive vector that makes larger the value obtained by weighting the magnitude of the inner product of the target vector and an orthogonalized vector, obtained by orthogonalizing the synthesized speech vector generated by the synthesizing means with respect to the optimum synthesized speech vector, with a weighting coefficient having the drive vector generated by the drive vector generating means as a parameter.