JPH07212239A

JPH07212239A - Method and device for quantizing vector-wise line spectrum frequency

Info

Publication number: JPH07212239A
Application number: JP5333161A
Authority: JP
Inventors: Rin Daniel; ダニエル・リン; Swaminasan Kumar; クマー・スワミナサン
Original assignee: Hughes Aircraft Co
Current assignee: Raytheon Co
Priority date: 1993-12-27
Filing date: 1993-12-27
Publication date: 1995-08-11

Abstract

PURPOSE: To provide a robust and consistent device that achieves high-performance vector quantization of short-term parameters, expressed as line spectral frequencies, across a variety of speakers and handsets. CONSTITUTION: The device comprises a plurality of specialized codebooks 11-14 of quantized line spectral frequency vectors; means for searching each specialized codebook for candidate line spectral frequency vectors similar to an input unquantized line spectral frequency vector; means 15-18 for calculating a distortion for each candidate vector using the input unquantized vector; and means 19 for selecting the quantized line spectral frequency vector from among the candidate vectors from each codebook using the distortion calculated for each candidate vector.

Description

Detailed Description of the Invention

[0001]

FIELD OF THE INVENTION

The present invention relates to digital voice communication systems and, more particularly, to a line spectral frequency vector quantizer for code-excited linear prediction (CELP) speech encoders. The quantizer is efficient in its use of bits, robust in performance across speakers and handsets, moderate in complexity, and has an effective and simple built-in transmission error detection scheme.

[0002]

DESCRIPTION OF THE RELATED ART

Devices of this kind are commonly called "codecs", short for coder/decoder. The present invention is particularly suited to digital cellular networks, but it can be used to advantage in any product line that requires speech compression for telecommunications. North American cellular systems span everything from the current analog frequency modulation (FM) form to digital systems. The Telecommunications Industry Association (TIA) has adopted a standard that uses a full-rate 8.0 kbps vector sum excited linear prediction (VSELP) speech coder, convolutional coding for error protection, differential quadrature phase shift keying (QPSK) modulation, and a time division multiple access (TDMA) scheme. This is expected to triple the traffic-carrying capacity of the cellular system. To increase capacity by a further factor of two, the TIA has begun the process of evaluating and then selecting a half-rate codec. For the purposes of the TIA technology assessment, the half-rate codec together with its error protection has an overall bit rate of 6.4 kbps and is restricted to a frame size of 40 ms. The codec is expected to have voice quality comparable to the full-rate standard over a wide range of conditions. These conditions include various speakers, handset effects, background noise, and channel conditions.

[0003] Codebook excited linear prediction (CELP) is a technique for low-rate speech coding. The basic technique consists of searching a codebook of randomly distributed excitation vectors for the vector that, when filtered through pitch and linear predictive coding (LPC) short-term synthesis filters, produces the output sequence closest to the input sequence. To do this, every candidate excitation vector in the codebook must be filtered through both the pitch and LPC synthesis filters to produce a candidate output sequence that can then be compared with the input sequence. This makes CELP a very computationally intensive algorithm; a typical codebook has 1024 entries, each 40 samples long. In addition, a perceptual error weighting filter is usually employed, which adds to the computational load. Fast digital signal processors help to run very complex algorithms such as CELP in real time, but the problem of poor quality at low bit rates persists. For a codec to be built into telecommunications equipment, its voice quality needs to be comparable to that of the 8.0 kbps digital cellular standard.

[0004]

PROBLEM TO BE SOLVED BY THE INVENTION

There are many representations of the short-term predictor parameters. A commonly used one is the set of line spectral frequencies. Quantization of line spectral frequencies has been the subject of several studies, and both scalar and vector quantizers have been designed for this purpose. Scalar quantizers typically require 36 to 40 bits to encode ten line spectral frequencies while meeting the other objectives of robust performance across speakers and handsets, moderate complexity, and built-in error detection capability. Efficiency in terms of the bits required is therefore sacrificed. Vector quantizers, on the other hand, achieve efficiency in terms of bits, but they lack built-in error detection capability, pay the price of speaker-dependent or handset-dependent performance, and often do so at the cost of high complexity.

[0005] It is therefore an object of the present invention to provide a line spectral frequency vector quantizer that is robust and consistent and that provides good quantization performance for short-term parameters, expressed as line spectral frequencies, across a variety of speakers and handsets.

[0006] Another object of the present invention is to provide an efficient line spectral frequency vector quantizer that uses a minimal number of bits, has moderate complexity, and has a built-in error detection capability for combating transmission errors.

[0007]

SUMMARY OF THE INVENTION

According to the present invention, there is provided a line spectral frequency vector quantizer that classifies the unquantized line spectral frequencies into four categories, using a different vector quantization table for each category, followed by a final selection of the optimal category to achieve robust performance across speakers and handsets. Within each category, the invention uses split vector quantization followed by a two-stage constrained search procedure that yields, at moderate complexity, an ordered set of quantized line spectral frequencies "close" to the unquantized set. An efficient and simple transmission error detection scheme at the receiver is made possible by the split nature of the vector quantization and by the constrained search procedure. The invention achieves all of the desired objectives using only 26 bits to encode ten line spectral frequencies.

[0008]

DESCRIPTION OF THE PREFERRED EMBODIMENT

The foregoing and other objects, aspects, and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings. The subject of the invention is an improvement in the vector quantization of speech signals. The literature (for example, Shannon's early "A Mathematical Theory of Communication", Bell System Technical Journal, vol. 27, 1948) shows that the most economical coding of information requires a bit rate no greater than the entropy of the source, and that this rate is achieved by coding large groups, or vectors, of samples rather than individual samples. This has been done using a codebook. To transmit a vector, only the index (i.e., the address) of the codebook entry is transmitted. The receiver has its own copy of the codebook and uses the address to recover the transmitted vector. A vector quantization codebook is not exhaustive; it contains a small but representative sample of the vectors actually encountered in the data to be encoded. Therefore, to transmit a sequence of samples, the closest matching codebook entry is selected and its address is transmitted. Vector quantization has the advantage of reducing the bit rate, but signal distortion arises from the mismatch between the actual vector and the selected codebook entry.
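The codebook mechanism described above can be sketched in a few lines. This is a minimal illustration, not the quantizer of the invention: the four-entry codebook, its values, and the function names `encode` and `decode` are all assumptions made for the example, and plain squared error is used as the closeness measure.

```python
# Illustrative only: a tiny 2-bit codebook of 3-dimensional vectors.
# The entries are made-up numbers, not values from any real table.
CODEBOOK = [
    (0.10, 0.20, 0.30),
    (0.15, 0.30, 0.45),
    (0.20, 0.35, 0.60),
    (0.25, 0.45, 0.70),
]

def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vector, codebook=CODEBOOK):
    """Transmitter side: return the index (address) of the closest entry."""
    return min(range(len(codebook)),
               key=lambda j: squared_error(vector, codebook[j]))

def decode(index, codebook=CODEBOOK):
    """Receiver side: look the index up in its own copy of the codebook."""
    return codebook[index]

x = (0.16, 0.31, 0.44)
idx = encode(x)                      # only this index is transmitted
x_hat = decode(idx)                  # the receiver's reconstruction
distortion = squared_error(x, x_hat) # nonzero: the mismatch described above
```

Only `idx` crosses the channel, which is where the bit-rate saving comes from; `distortion` is the price paid for representing `x` by the nearest stored entry.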

[0009] In constructing the codebook, the short-term predictor filter coefficients of a speech frame of 10 to 30 milliseconds (ms) duration are obtained using conventional linear prediction analysis. A tenth-order model is very common. These short-term tenth-order model parameters are updated at intervals of 10 to 30 ms. Quantization of these parameters is usually carried out in a domain where the spectral distortion introduced by the quantization process is perceived to be minimal for a given number of bits. One such domain is the line spectral frequency domain. A valid set of line spectral frequencies is necessarily an ordered set of monotonically increasing frequencies. The complexity of converting the short-term predictor parameters to line spectral frequencies depends on the degree of resolution required. Little loss in performance is observed when this vector quantization scheme is used with a resolution of 40 Hz. The ten line spectral frequencies are quantized and encoded using 26 bits by the vector quantizer according to the preferred embodiment of the invention. After quantization, the ten quantized line spectral frequencies are converted back to short-term predictor filter coefficients.

[0010] Split vector quantization is used to reduce the size of the codebook. The 26 bits used by the vector quantizer comprise 24 bits for encoding the spectral frequencies and 2 bits for encoding the optimal category. If the ten spectral frequencies are represented as a vector with ordered elements x₁, x₂, x₃, x₄, x₅, x₆, x₇, x₈, x₉, x₁₀, from the lowest value x₁ to the highest value x₁₀, then a codebook representing the set of all vectors encoded with 24 bits would have 2²⁴ entries. By splitting the vector into three ordered subsets (x₁, x₂, x₃), (x₄, x₅, x₆), (x₇, x₈, x₉, x₁₀), referred to here as a 3-3-4 split vector, each encoded with 8 bits, only three codebooks of 2⁸ = 256 entries each are required, which greatly reduces the cost of constructing the codebooks.
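The saving from splitting can be checked with simple arithmetic. The short sketch below only restates the numbers given in the text (24 bits unsplit versus three 8-bit codebooks); the helper name `codebook_entries` is an assumption for illustration.

```python
def codebook_entries(bits_per_split):
    """Number of entries in each of the separate per-split codebooks."""
    return [2 ** b for b in bits_per_split]

unsplit_entries = 2 ** 24                    # one codebook for the whole vector
split_entries = codebook_entries([8, 8, 8])  # 3-3-4 split, 8 bits per split
total_split = sum(split_entries)

print(unsplit_entries)  # 16777216
print(split_entries)    # [256, 256, 256]
print(total_split)      # 768
```

The same 24 bits are transmitted in both cases; what changes is how many entries must be designed, stored, and searched.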

[0011] In its basic form, the vector quantization scheme used by the present invention employs four separate vector quantization tables, one for each of the following categories of unquantized line spectral frequencies (LSFs).

[0012]
1. IRS-filtered voiced LSF vectors
2. IRS-filtered unvoiced LSF vectors
3. Non-IRS-filtered voiced LSF vectors
4. Non-IRS-filtered unvoiced LSF vectors

The IRS filter is a linear-phase FIR (finite impulse response) filter that is used to model the high-pass filtering effect of the handset transducer; its magnitude response conforms to the CCITT recommendation. For the first and third categories, 3-4-3 split vector quantization is performed using 8-bit, 10-bit, and 6-bit codebooks. For the second and fourth categories, 3-3-4 split vector quantization is performed using 7-bit, 8-bit, and 9-bit codebooks. Two bits are used to encode the optimal category. In total, therefore, 26 bits are used to encode the ten line spectral frequencies.
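The bit budget above can be written down as a small table and checked. In the sketch below, the split sizes and bit widths are taken from the text; everything else is an assumption made for illustration, including the category numbering 0 to 3, the label strings, and in particular the order in which `pack` lays the fields out in the 26-bit word (the text does not specify a bitstream layout).

```python
# Split layouts and bit allocations as stated in the text.
# Labels and numbering are hypothetical, chosen for this sketch.
CATEGORIES = {
    0: {"name": "IRS-filtered voiced",       "split": (3, 4, 3), "bits": (8, 10, 6)},
    1: {"name": "IRS-filtered unvoiced",     "split": (3, 3, 4), "bits": (7, 8, 9)},
    2: {"name": "non-IRS-filtered voiced",   "split": (3, 4, 3), "bits": (8, 10, 6)},
    3: {"name": "non-IRS-filtered unvoiced", "split": (3, 3, 4), "bits": (7, 8, 9)},
}
CATEGORY_BITS = 2  # enough to label four categories

def total_bits(category):
    """Bits for the three split indices plus the category label."""
    return sum(CATEGORIES[category]["bits"]) + CATEGORY_BITS

def pack(category, indices):
    """Pack category and three split indices into one integer (assumed layout)."""
    word = category
    for index, width in zip(indices, CATEGORIES[category]["bits"]):
        assert 0 <= index < (1 << width)
        word = (word << width) | index
    return word

def unpack(word):
    """Inverse of pack: read the category first, then the three indices."""
    category = word >> 24
    indices = []
    shift = 24
    for width in CATEGORIES[category]["bits"]:
        shift -= width
        indices.append((word >> shift) & ((1 << width) - 1))
    return category, tuple(indices)
```

Because the category is read first, the receiver knows which bit widths to apply to the remaining 24 bits, so the two layouts can share one fixed-length word.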

[0013] Referring now to the drawings, vector quantization begins by determining, from the vector quantization table corresponding to each category, the ordered set of line spectral frequencies "closest" to the unquantized line spectral frequencies. This is accomplished in a two-stage procedure. In the first stage, shown in FIG. 1, the first split vector entry that is closest, in the weighted mean square sense, to the unquantized line spectral frequencies is determined from the corresponding vector quantization table. Specifically, the first split vector, denoted here as (/x₁), is supplied to each of the codebooks 11, 12, 13, 14 for IRS-filtered voiced LSF vectors, IRS-filtered unvoiced LSF vectors, non-IRS-filtered voiced LSF vectors, and non-IRS-filtered unvoiced LSF vectors, respectively. The output of each codebook is subjected to the distortion calculation of blocks 15, 16, 17, 18, respectively. The calculation of the distortion d is a mean square calculation,

[Equation 1]

where j = 1, ..., N and N is the number of elements in the codebook. Each codebook search and distortion calculation thus produces a distortion measure d, which is supplied to selector 19. The index and codebook that produce the lowest distortion, i.e., that represent the closest choice, are stored for the first split vector (/X). The second and third closest choices are also stored.
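The first-stage search for one split and one category can be sketched as follows. This is an illustration under stated assumptions, not the patent's Equation 1: the weighting is not given in this text, so uniform weights are assumed, and the codebook values and the names `weighted_mse` and `best_candidates` are invented for the example.

```python
import heapq

def weighted_mse(target, entry, weights):
    """Weighted squared error between a split vector and a codebook entry."""
    return sum(w * (t - e) ** 2 for w, t, e in zip(weights, target, entry))

def best_candidates(target, codebook, weights, keep=3):
    """Return the `keep` lowest-distortion (distortion, index) pairs."""
    scored = ((weighted_mse(target, entry, weights), j)
              for j, entry in enumerate(codebook))
    return heapq.nsmallest(keep, scored)

# Made-up 4-entry codebook for a 3-element first split (values in Hz).
codebook = [(250, 400, 700), (300, 500, 900), (280, 460, 820), (350, 600, 1000)]
target = (285, 470, 830)
weights = (1.0, 1.0, 1.0)  # assumed uniform; the actual weighting is not given here

top3 = best_candidates(target, codebook, weights)
```

In the scheme described above this search would be repeated for each of the four category codebooks, with the three retained candidates per split passed on to the second stage.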

[0014] The entries of the second split vector are searched in a similar manner to yield the best three candidates. However, a constraint is imposed on the search so that at least one of the nine combinations of candidates is an ordered set. This is a very mild constraint, with little penalty in either performance or complexity. The entries of the third split vector are searched in a similar manner subject to the same constraint, and the three best candidates from this search are also stored. In the specific example of three split vectors with three candidates each, there are 27 candidate combinations in all. Of these, the M combinations that form ordered sets are retained, where the imposed constraint ensures that at least one such combination exists and M is less than 27.
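The pruning from 27 combinations down to the M ordered ones can be sketched as below. The candidate values are made up, "ordered" is taken to mean strictly increasing (an assumption; the text says monotonically increasing), and the search-time constraint that guarantees at least one ordered combination is not implemented here: the sketch only filters candidates that are already given.

```python
from itertools import product

def is_ordered(lsf):
    """True if the frequencies are strictly increasing."""
    return all(a < b for a, b in zip(lsf, lsf[1:]))

def ordered_combinations(cands1, cands2, cands3):
    """Enumerate all candidate triples; keep those forming an ordered vector."""
    kept = []
    for c1, c2, c3 in product(cands1, cands2, cands3):
        full = c1 + c2 + c3  # concatenate the three split vectors
        if is_ordered(full):
            kept.append(full)
    return kept

# Made-up candidates for a 3-3-4 split (Hz); three per split, 27 triples in all.
cands1 = [(250, 400, 700), (280, 460, 820), (300, 500, 900)]
cands2 = [(850, 1200, 1500), (950, 1300, 1600), (1000, 1400, 1700)]
cands3 = [(1650, 2100, 2600, 3200), (1800, 2300, 2800, 3300),
          (1900, 2400, 2900, 3400)]

survivors = ordered_combinations(cands1, cands2, cands3)
M = len(survivors)  # 21 of the 27 triples are ordered for these values
```

Since each stored entry is itself ordered, a combination can only fail at the two boundaries between adjacent splits, which is why the constraint costs so little.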

[0015] In the second stage, shown in FIG. 2, the M combinations of complete quantized line spectral frequency vectors are compared with the unquantized line spectral frequencies using a cepstral distortion measure, and the optimal quantized line spectral frequency vector is determined. As noted above, only combinations that yield an ordered set of line spectral frequencies are considered in the second stage. At least one such combination exists because of the constraint imposed during the search of the second and third split vectors. The mean square error distortion measure used in the first stage is computationally simple and is therefore suitable for pruning the list of candidate quantized LSF vectors, which is then searched in the second stage using the more effective but computationally more complex cepstral distortion measure. In FIG. 2, the M candidate combinations, represented by blocks 21, 22, 23, 24, are input to cepstral distortion calculator 25.

[0016] The cepstrum is sometimes computed using a double Fourier transform. Specifically, the speech spectrum is the product of the excitation spectrum, which for voiced speech consists of a series of harmonics, and the vocal tract transfer function, which is a relatively smooth function of frequency. If the logarithm of the spectrum is taken, the excitation and transfer function components become additive, and transforming the log spectrum again gives the following.

[0017]

[Equation 2]

Such a transform was named the "cepstrum" (by reversing the first four letters of "spectrum") in the literature (D. P. Bogert, "The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe cracking", Proc. Symp. Time Series Analysis, pp. 209-243, 1963).
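The step from a product of spectra to a sum of log spectra, followed by a second transform, can be written out in standard textbook notation. The symbols S, E, and H (speech, excitation, and vocal tract spectra) are chosen here for illustration; this is the usual definition of the real cepstrum and is not a reproduction of the patent's Equation 2.

```latex
S(\omega) = E(\omega)\,H(\omega)
\quad\Longrightarrow\quad
\log\lvert S(\omega)\rvert = \log\lvert E(\omega)\rvert + \log\lvert H(\omega)\rvert ,
\qquad
c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log\lvert S(\omega)\rvert \, e^{j\omega n}\, d\omega .
```

Because the second transform is linear, the cepstrum of the speech signal is the sum of an excitation part and a vocal tract part, which is what makes the low-order coefficients c_n a useful description of the spectral envelope.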

[0018] Computing the cepstrum by the double Fourier transform is computationally complex. The preferred embodiment of the present invention therefore uses a cepstral distance d_cep computed in a mean square form, similar to the distortion calculation of FIG. 1. Specifically, cepstral distortion calculator 25 performs the following calculation.

[0019]

[Equation 3]

where c_i are the cepstral coefficients, as described in the literature (A. H. Gray, Jr. and J. D. Markel, "Distance Measures for Speech Processing", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, no. 5, October 1976, p. 380). The output of block 25 is the candidate combination with the smallest distortion, denoted here as index I, index J, index K. In the final stage shown in FIG. 2, therefore, the optimal category is determined as the one whose optimal quantized LSF vector has the lowest cepstral distortion when compared with the original unquantized LSF vector. The twin objectives of good performance and moderate complexity are met by this two-stage procedure.
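A second-stage comparison of this kind can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the cepstral coefficients are obtained from LPC coefficients with the standard LPC-to-cepstrum recursion, the distance is a plain sum of squared differences truncated to a fixed number of coefficients (Equation 3 is not reproduced here), and the conversion from LSFs to LPC coefficients is omitted. All function names and the example coefficient values are hypothetical.

```python
def lpc_to_cepstrum(a, n_coeffs):
    """Cepstrum of 1/A(z), with A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p."""
    p = len(a)
    c = []
    for n in range(1, n_coeffs + 1):
        a_n = a[n - 1] if n <= p else 0.0
        # Standard recursion: c_n = -a_n - (1/n) * sum_{k<n} k * c_k * a_{n-k}
        acc = sum(k * c[k - 1] * a[n - k - 1]
                  for k in range(1, n) if n - k <= p)
        c.append(-a_n - acc / n)
    return c

def cepstral_distance(c_ref, c_test):
    """Squared-error distance between two cepstral coefficient vectors."""
    return sum((r - t) ** 2 for r, t in zip(c_ref, c_test))

def select_best(c_ref, candidates):
    """Pick the (label, cepstrum) candidate closest to the reference."""
    return min(candidates, key=lambda item: cepstral_distance(c_ref, item[1]))

# Toy second-order filters standing in for unquantized and quantized spectra.
c_ref = lpc_to_cepstrum([-0.9, 0.2], 8)
candidates = [
    ("category 0", lpc_to_cepstrum([-0.85, 0.15], 8)),
    ("category 1", lpc_to_cepstrum([-0.5, 0.4], 8)),
]
best_label, _ = select_best(c_ref, candidates)
```

In the scheme described above, the same minimum would be taken first over the M ordered combinations within a category and then across the four categories.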

[0020] The complete quantized LSF vector is obtained from the optimal indices for the three split vectors by concatenating the corresponding vector entries. Such a vector is guaranteed to be ordered. If the quantized LSF vector reconstructed at the receiver is not ordered, an error in the transmission of the three optimal vector quantization indices is detected. This simple error detection capability of the vector quantization scheme is made possible by the constraint imposed during the search and is made effective by the three-way split.
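The receiver-side check can be sketched as below. The two-entry codebooks and their values are made up, and "not ordered" is taken to mean that some frequency is greater than or equal to its successor (an assumption); the function names are hypothetical.

```python
def reconstruct(indices, codebooks):
    """Concatenate the codebook entries addressed by the received indices."""
    lsf = []
    for index, codebook in zip(indices, codebooks):
        lsf.extend(codebook[index])
    return lsf

def transmission_error_detected(lsf):
    """Flag an error when the rebuilt LSF vector is not strictly increasing."""
    return any(a >= b for a, b in zip(lsf, lsf[1:]))

# Made-up two-entry codebooks for a 3-3-4 split (Hz).
codebooks = [
    [(250, 400, 700), (300, 500, 900)],
    [(850, 1200, 1500), (950, 1300, 1600)],
    [(1650, 2100, 2600, 3200), (1400, 2000, 2500, 3100)],
]

good = reconstruct((0, 0, 0), codebooks)
bad = reconstruct((1, 0, 0), codebooks)  # e.g. first index corrupted in transit
```

Here `bad` fails the check because 900 Hz is followed by 850 Hz at the boundary between the first and second splits. A corrupted index that happens to yield another ordered vector would pass unnoticed, so the check detects some transmission errors rather than all of them.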

[0021] In the preferred embodiment of the present invention, the vector quantization tables are designed using a variation of the standard LBG algorithm with a large database of 50 or more speakers. The average log spectral distortion over all frequencies from 0 to 4 kHz, measured using this vector quantization scheme, is 1.3 dB across both the IRS-filtered speech database and the non-IRS-filtered speech database.

[0022] FIG. 3 is a block diagram showing a specific implementation according to the preferred embodiment of the present invention. Split-band portion 31 of the LSFs, a 3-4-3 split vector quantization, is used for voiced speech, i.e., for IRS-filtered and non-IRS-filtered voiced LSF vectors. Split-band portion 32 of the LSFs, a 3-3-4 split vector quantization, is used for unvoiced speech, i.e., for IRS-filtered and non-IRS-filtered unvoiced LSF vectors.

[0023] Split-band portion 31 undergoes the codebook search procedure shown in FIG. 4A, while split-band portion 32 undergoes the codebook search shown in FIG. 4B. Referring first to FIG. 4A, the first split vector w₁, w₂, w₃ is searched in codebook 41 to produce combination candidates w₁*, w₂*, w₃*; the second split vector w₄, w₅, w₆, w₇ is searched in codebook 42 to produce combination candidates w₄*, w₅*, w₆*, w₇*; and the third split vector w₈, w₉, w₁₀ is searched in codebook 43 to produce combination candidates w₈*, w₉*, w₁₀*, generally as described with reference to FIG. 1. The combination candidates undergo a cepstral distance calculation in block 44 to produce the cepstral distance δ_cep for the voiced portion. Similarly, in FIG. 4B, the first split vector w₁, w₂, w₃ is searched in codebook 45 to produce combination candidates w₁**, w₂**, w₃** (the superscript ** is used to distinguish these from the combination candidates produced by codebook 41); the second split vector w₄, w₅, w₆ is searched in codebook 46 to produce combination candidates w₄**, w₅**, w₆**; and the third split vector w₇, w₈, w₉, w₁₀ is searched in codebook 47 to produce combination candidates w₇**, w₈**, w₉**, w₁₀**. The combination candidates undergo a cepstral distance calculation in block 48 to produce the cepstral distance *δ_cep for the unvoiced portion.

[0024] Although the present invention has been described in terms of a single preferred embodiment, those skilled in the art will recognize that the invention can be practiced with modification within the scope of the appended claims.

[Brief description of drawings]

FIG. 1 is a functional block diagram illustrating the codebook search procedure at the transmitter for selecting candidate indices of the split vectors to be transmitted.

FIG. 2 is a functional block diagram illustrating the constrained selection procedure for selecting the lowest-distortion split vector sequence for transmission.

FIG. 3 is a block diagram showing a preferred implementation of the invention using 3-4-3 and 3-3-4 split vector quantization for the voiced and unvoiced speech portions.

FIG. 4 shows a block diagram of the search procedure for the 3-4-3 voiced speech portion and a block diagram of the search procedure for the 3-3-4 unvoiced speech portion.

Continuation of front page: (72) Inventor: Kumar Swaminathan, 18211 Lost Knife Circle, No. 202, Gaithersburg, Maryland 20879, United States

Claims

[Claims]

1. A method for quantizing a line spectral frequency vector in a digital communication system, characterized by: receiving an unquantized line spectral frequency vector; searching each of a plurality of specialized codebooks for candidate quantized line spectral frequency vectors similar to the unquantized vector; computing a distortion measure for each candidate vector using the unquantized vector; and selecting a quantized line spectral frequency vector from among the candidate vectors from each codebook using the distortion measure computed for each candidate vector.

2. A device for quantizing a line spectral frequency vector, comprising: a plurality of specialized codebooks of quantized line spectral frequency vectors; means for searching each specialized codebook for candidate line spectral frequency vectors similar to an input unquantized line spectral frequency vector; means for calculating a distortion measure for each candidate vector using the input unquantized vector; and means for selecting a quantized line spectral frequency vector from among the candidate vectors from each codebook using the distortion measure calculated for each candidate vector.