JPH06202700A

JPH06202700A - Speech encoding device

Info

Publication number: JPH06202700A
Application number: JP3121806A
Authority: JP
Inventors: Hideo Osawa; 英男大沢
Original assignee: Japan Radio Co Ltd
Current assignee: Japan Radio Co Ltd
Priority date: 1991-04-25
Filing date: 1991-04-25
Publication date: 1994-07-22

Abstract

PURPOSE: To provide a speech encoding device that yields high-quality reproduced speech regardless of the bit rate. CONSTITUTION: The device is equipped with a 1st code book 11 that contains a group of singular value vectors and a 2nd code book 12 that contains a group of residual vectors. The difference between the output of a multiplier 13 and the speech conversion vector obtained by the unitary conversion of a unitary converting circuit 14 is found and stored in an error evaluating circuit 16. The error evaluating circuit 16 searches for the combination of singular value vector and residual vector that gives the smallest of the differences obtained over all combinations of the singular value vectors and residual vectors in the two code books.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech encoding device, and more particularly to an adaptive predictive coding device using singular value decomposition.

[0002]

[Prior Art] Conventionally, there is a speech coding method based on vector quantization, in which the parameters of the speech signal to be encoded are treated as a vector, the optimum vector pattern is selected from a codebook prepared in advance, and block coding is performed. In the adaptive predictive coding method, the main parameters to be encoded are the linear prediction coefficients and the residual signal. Because the residual signal is noise-like and can be treated as a random-number sequence, several tens of samples of the residual signal are grouped together, regarded as one vector, registered in a codebook, and block coded. FIG. 2 shows a speech encoding device that adopts this method. The device exploits the fact that the speech production mechanism can be approximated by a model consisting of a sound source and vocal tract characteristics. That is, using the sound source X and the synthesis filter H, the speech signal Y is written as

[0003]

[Equation 1] Y = H · X

[0004] The device makes use of the fact that the speech signal can be expressed in this form.

[0005] The operation of the speech encoding device shown in FIG. 2 is described below. In this device, the input speech applied to the input terminal 21 is linearly analyzed by the analysis circuit 22 to obtain linear prediction coefficients, and the synthesis filter 23 is constructed from those coefficients. The subtractor 25 then obtains the difference between the input speech and the signal produced by passing one of the residual vectors registered in the codebook 24 through the synthesis filter 23. The squarer 26 squares this difference and feeds it to the error evaluation circuit 27. The error evaluation circuit 27 stores the value and instructs the codebook to output the next residual vector. This operation is repeated, for one input signal, over all residual vectors registered in the codebook. Finally, the error evaluation circuit 27 finds the smallest of the stored values and outputs the index (number) of the corresponding noise signal. The prediction coefficients obtained earlier by the analysis circuit and this index are multiplexed by the multiplexing circuit 28 and sent to the transmission line. The decoder has the same codebook 24 and can synthesize the speech signal from the received prediction coefficients and index.
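The exhaustive search of FIG. 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `synthesize` and `search_codebook`, the zero initial filter state, and all numeric values are assumptions made for the example.

```python
def synthesize(h, x):
    """Pass residual vector x through a synthesis filter with impulse response h.

    Assumes zero initial state and len(h) >= len(x), so this equals
    multiplying x by a lower-triangular matrix built from h.
    """
    n = len(x)
    return [sum(h[k] * x[t - k] for k in range(t + 1)) for t in range(n)]


def search_codebook(speech, h, codebook):
    """Return (index, error) of the residual vector with the least squared error."""
    best_index, best_error = -1, float("inf")
    for index, residual in enumerate(codebook):
        synthetic = synthesize(h, residual)            # synthesis filter 23
        error = sum((s - y) ** 2                       # subtractor 25, squarer 26
                    for s, y in zip(speech, synthetic))
        if error < best_error:                         # error evaluation circuit 27
            best_index, best_error = index, error
    return best_index, best_error


# Hypothetical toy data: a 3-tap impulse response and a 3-entry codebook.
h = [1.0, 0.5, 0.25]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, -1.0, 0.0]]
speech = synthesize(h, codebook[2])   # input that entry 2 reproduces exactly
index, error = search_codebook(speech, h, codebook)
print(index, error)
```

Each candidate costs one full filtering pass, which is the computational burden the next paragraph refers to.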

[0006] In the speech encoding device of FIG. 2, however, the amount of computation needed to find the optimum residual signal is very large. A method using singular value decomposition has therefore been proposed to address this problem. As stated above, the mechanism of speech production is given by

[0007]

[Equation 2] Y = H · X

[0008] The synthesis filter H is a lower triangular matrix and can be written as follows.

[0009]

[Equation 3]

[0010] Here, h(0), h(1), ..., h(N−1) is the impulse response of the synthesis filter. When the synthesis filter H is diagonalized by singular value decomposition, Equation 3 can be written as Equation 4.
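The image for Equation 3 is not reproduced in this text. Given that H is described as a lower triangular matrix built from the impulse response h(0), ..., h(N−1), it presumably has the usual convolution-matrix layout shown below; this layout is an assumption, not a copy of the original figure.

```latex
H =
\begin{pmatrix}
h(0)   & 0      & \cdots & 0      \\
h(1)   & h(0)   & \cdots & 0      \\
\vdots & \vdots & \ddots & \vdots \\
h(N-1) & h(N-2) & \cdots & h(0)
\end{pmatrix}
```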

[0011]

[Equation 4]
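The image for Equation 4 is also missing here. Judging from Equation 5 and from the definitions of d_1, ..., d_N, U, and V in the surrounding text, it presumably states the singular value decomposition of H in the following form (a reconstruction under that assumption):

```latex
H = U D V^{t},
\qquad
D = \operatorname{diag}(d_1, d_2, \dots, d_N)
```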

[0012] Here, d_1, d_2, ..., d_N are called the singular values, and U and V are the unitary matrices used for the diagonalization. Rewriting Equation 1 using Equation 4 gives Equation 5.

[0013]

[Equation 5] U^t Y = D V^t X

[0014] Here, letting U^t Y = θ and V^t X = η, we obtain

[0015]

[Equation 6] θ = D η

[0016] Note that θ is the signal obtained by unitary transformation of the speech signal (the speech conversion vector). Because D, which holds the singular values, is a diagonal matrix, each element can be expressed independently:

[0017]

[Equation 7] θ = d η

[0018] In this way, the convolution that passes the residual vector through the synthesis filter is converted into an element-wise product-sum operation. (In general, linear prediction coefficients are used in place of the singular values d.)
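The chain from Equation 1 to Equation 7 can be written out step by step. The sketch below assumes real-valued matrices, so that the unitary matrix U satisfies U^t U = I; the element index n is added for illustration.

```latex
\begin{aligned}
Y &= H X = U D V^{t} X \\
U^{t} Y &= D V^{t} X && \text{(left-multiply by } U^{t},\ U^{t} U = I\text{)} \\
\theta &= D \eta && \text{(with } \theta = U^{t} Y,\ \eta = V^{t} X\text{)} \\
\theta(n) &= d_n \, \eta(n), && n = 1, \dots, N
\end{aligned}
```

Multiplying a candidate vector by the lower triangular N × N matrix H takes N(N+1)/2 multiplications, whereas the element-wise form takes N.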

[0019]

[Problems to Be Solved by the Invention] In the conventional speech encoding device, however, encoding considers only the quantization error of the residual vector; the quantization error of the singular values (in general, the linear prediction coefficients) is not taken into account. In other words, when Equation 7 is rewritten to include the quantization errors, it becomes θ = (d + Δd)(η + Δη). Conventionally, the residual vector with the smallest Δη is selected from the codebook and Δd is ignored. When the number of quantization bits is large, Δd can be regarded as sufficiently small and ignored. When the number of quantization bits is small, that is, at low bit rates, Δd cannot be ignored, and the reproduced speech is degraded. An object of the present invention is to provide a speech encoding device that yields high-quality reproduced speech regardless of the bit rate.
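Expanding the product makes the neglected terms explicit. This is plain algebra on the expression above, shown per element:

```latex
(d + \Delta d)(\eta + \Delta\eta)
  = d\,\eta
  + d\,\Delta\eta
  + \eta\,\Delta d
  + \Delta d\,\Delta\eta
```

If Δd is taken as zero, only the term d·Δη separates the product from d·η, which is what a search over the residual codebook alone addresses. When Δd is not negligible, the terms η·Δd and Δd·Δη remain regardless of which residual vector is chosen.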

[0020]

[Means for Solving the Problems] According to the present invention, there is provided a speech encoding device comprising: a unitary conversion circuit that applies a unitary transformation to a speech signal and outputs the unitary-transformed speech signal; a first codebook that stores a group of singular value vectors; a second codebook that stores a group of residual vectors; a multiplier that outputs the product of a specific singular value vector output from the first codebook and a specific residual vector output from the second codebook; a subtractor that obtains the difference between the unitary-transformed speech signal and the product; and an error evaluation circuit that, each time the difference is input, designates the specific singular value vector and the specific residual vector to the first and second codebooks, stores the differences corresponding to all combinations of singular value vectors and residual vectors, searches for the combination of the specific singular value vector and the specific residual vector corresponding to the difference with the smallest absolute value, and outputs the indices of that singular value vector and residual vector.

[0021]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows a block diagram of one embodiment of the present invention. The speech encoding device of this embodiment has a first codebook 11, a second codebook 12, a multiplier 13, a unitary conversion circuit 14, a subtractor 15, and an error evaluation circuit 16. The first codebook 11 stores singular value vectors that have been learned in advance from a large number of speech samples and optimally constructed. Likewise, the second codebook 12 stores residual vectors that have been learned in advance from a large number of speech samples and optimally constructed.

[0022] The input speech applied to the input terminal 17 is fed to the unitary conversion circuit 14 and converted into the speech conversion vector θ(n) (1 ≤ n ≤ N, where n is the element number). Meanwhile, a specific singular value vector and a specific residual vector are read from the first codebook 11 and the second codebook 12, respectively, and the multiplier 13 computes the synthesized speech conversion vector θk(n); that is, it evaluates θk(n) = di(n) × ηj(n), where i and j are the codebook indices. The subtractor 15 then obtains the difference {θ(n) − θk(n)} between the speech conversion vector θ(n) and the synthesized speech conversion vector θk(n). The error evaluation circuit 16 squares this difference and stores the result as the error ek, where ek = Σ{θ(n) − θk(n)}^2. When one combination of singular value vector and residual vector has been processed, the error evaluation circuit 16 instructs the first and second codebooks to output the next combination.

[0023] When the speech conversion vector θ(n) has been compared in this way with the synthesized speech conversion vectors θk(n) for all combinations of singular value vector and residual vector (I × J combinations, where I and J are the numbers of values taken by i and j, respectively), the error evaluation circuit 16 searches the I × J errors ek for the smallest value and identifies the corresponding combination of singular value vector and residual vector. It then outputs the search result (the indices i and j) to the output terminal 18.
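The joint search over both codebooks can be sketched as below. It is an illustrative reading of the embodiment, not the patent's circuit: the function name `joint_search` and the toy codebooks are hypothetical, and the speech conversion vector θ is assumed to have been produced already by the unitary conversion circuit 14.

```python
def joint_search(theta, singular_codebook, residual_codebook):
    """Return (i, j, error) minimizing sum_n (theta(n) - d_i(n) * eta_j(n))^2."""
    best = (None, None, float("inf"))
    for i, d in enumerate(singular_codebook):          # first codebook 11
        for j, eta in enumerate(residual_codebook):    # second codebook 12
            # Multiplier 13 forms d_i(n) * eta_j(n); subtractor 15 takes the
            # difference from theta(n); the squared differences are summed.
            error = sum((t - dn * en) ** 2
                        for t, dn, en in zip(theta, d, eta))
            if error < best[2]:                        # error evaluation circuit 16
                best = (i, j, error)
    return best


# Hypothetical toy codebooks with I = 2 and J = 3 entries of length N = 3.
singular_codebook = [[2.0, 1.0, 0.5], [1.5, 1.0, 0.25]]
residual_codebook = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5], [-1.0, 2.0, 0.0]]
theta = [1.4, 0.1, -0.3]

i, j, error = joint_search(theta, singular_codebook, residual_codebook)
print(i, j, error)
```

All I × J combinations are visited, and each one costs N multiplications for the product plus N for the squaring.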

[0024] In this embodiment, the computation is carried out for all combinations of the singular value vectors and residual vectors stored in the two codebooks, but because it consists only of simple product-sum, subtraction, and squaring operations, the amount of computation is small.

[0025]

[Effects of the Invention] As described above, according to the present invention, in an adaptive predictive coding method using singular value decomposition, not only the residual vectors but also the singular value vectors are stored in codebooks and the most suitable vectors are selected from them, which prevents the speech quality degradation that would otherwise arise depending on the bit rate.

[Brief description of drawings]

FIG. 1 is a block diagram of an embodiment of the present invention.

FIG. 2 is a block diagram of a conventional speech encoding device.

[Explanation of symbols]

11, 12 Codebook
13 Multiplier
14 Unitary conversion circuit
15 Subtractor
16 Error evaluation circuit
17 Input terminal
18 Output terminal
21 Input terminal
22 Analysis circuit
23 Synthesis filter
24 Codebook
25 Subtractor
26 Squarer
27 Error evaluation circuit
28 Multiplexing circuit

Claims

[Claims]

1. A speech encoding device comprising: a unitary conversion circuit that applies a unitary transformation to a speech signal and outputs a speech conversion vector; a first codebook that stores a group of singular value vectors; a second codebook that stores a group of residual vectors; a multiplier that outputs the product of a specific singular value vector output from the first codebook and a specific residual vector output from the second codebook; a subtractor that obtains the difference between the speech conversion vector and the product; and an error evaluation circuit that designates the specific singular value vector and the specific residual vector to the first and second codebooks, stores the differences corresponding to all combinations of singular value vectors and residual vectors, searches for the combination of the specific singular value vector and the specific residual vector corresponding to the difference with the smallest absolute value, and outputs the indices of that singular value vector and residual vector.