JP2007072264A

JP2007072264A - Speech quantization method, speech quantization device, and program

Info

Publication number: JP2007072264A
Application number: JP2005260450A
Authority: JP
Inventors: Akitoshi Kataoka; 章俊片岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-09-08
Filing date: 2005-09-08
Publication date: 2007-03-22

Abstract

PROBLEM TO BE SOLVED: To reproduce high-quality wideband speech over a telephone-band communication channel by transmitting only a few bits of additional information.
SOLUTION: A wideband LSP parameter, which is the envelope information of speech, is quantized by: taking a certain length of the speech signal as the frame length and analyzing both the wideband speech signal and the corresponding telephone-band speech signal, which is limited to a band narrower than that of the wideband signal; calculating the LSP parameter of the telephone-band speech signal and the LSP parameter of the wideband speech signal; using the LSP parameter of the telephone-band speech signal to select several candidates from a wideband LSP parameter codebook provided for quantizing the LSP parameter of the wideband speech signal; and selecting, from these candidates, the one nearest to the LSP parameter of the input wideband speech signal.
COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a method, an apparatus, and a program for quantizing LSP parameters, which represent the speech spectral envelope required when speech is encoded.

Web conferences and multipoint conferences can now be held easily on a personal computer, even in the office. However, most of this communication still uses conventional telephone-band speech. Several wideband coding methods have already been standardized and are available, but although they are used in some services, wideband speech is rarely used in general. One reason is the problem of interoperability with existing equipment.
That is, to communicate with a conventional terminal, the signal must be a telephone-band speech signal as defined by a standard such as G.711. In multipoint communication as well, wideband coding cannot be used if even one site has an existing terminal that supports only telephone-band speech signals. At present there are many telephone-band terminals in use, and replacing all of them with terminals that support wideband speech signals is not feasible. It is therefore desirable that a terminal can connect to conventional terminals and, when both ends support wideband, obtain wideband speech.

Non-Patent Document 1 and Non-Patent Document 2 propose band extension methods that extend a telephone-band speech signal (3.4 kHz band) to a wideband speech signal (7 kHz band). These methods generate the wideband speech signal solely from the transmitted telephone-band speech signal. In band extension, it is very important to derive the spectral envelope of the wideband speech signal from the spectral envelope (LSP parameters) of the telephone-band speech. These methods use mapping to convert the spectral envelope of the speech signal from telephone band to wideband. That is, a codebook of telephone-band spectral envelopes and a codebook of wideband spectral envelopes, both trained on the same speech segments, are prepared in advance, and the transmitting side sends only telephone-band speech information. The receiver extracts the spectral envelope information from the transmitted telephone-band speech and selects the closest code from the telephone-band envelope codebook. Since this code represents the envelope of the telephone-band speech signal, the code in the wideband codebook associated with it is used as the spectral envelope of the wideband speech signal.
Non-Patent Document 1: Yuki Yoshida (吉田由紀), Masanobu Abe (阿部匡伸), "A method for generating wideband speech from narrowband speech by codebook mapping," Trans. IEICE (D-II), Vol. J78-D-II, No. 3, pp. 391-399, March 1995.
Non-Patent Document 2: Yoshihisa Nakatoh (中藤良久), Mineo Tsushima (津島峰生), Takeshi Norimatsu (則松武志), "Bandwidth extension of band-limited speech by spectral linear mapping," Trans. IEICE (D-II), Vol. J83-D-II, No. 11, pp. 2246-2254, Nov. 2000.

With the band extension methods proposed in Non-Patent Documents 1 and 2, once the spectral envelope of the telephone band is determined, the wideband spectral envelope is uniquely determined as well. However, even when the shape of the spectral envelope within the telephone band is the same, the band above it is not necessarily uniquely determined, so these methods cannot achieve high quality.
An object of the present invention is to propose a speech quantization method and apparatus that can connect to terminals that handle only the telephone band, that naturally allow wideband-capable terminals to communicate with each other using wideband speech signals, and that in particular allow the wideband speech signal to be decoded with high quality.

In a system in which telephone-band speech signals are exchanged between a transmitting side and a receiving side, the transmitting side takes a certain length as the frame length and analyzes the wideband speech signal and the corresponding telephone-band speech signal to calculate the LSP parameters of each. A wideband LSP parameter codebook for quantizing the LSP parameters of the wideband speech signal is prepared in advance. First, the distortion between the LSP parameters of the telephone-band speech signal and each candidate (vector) in the wideband LSP parameter codebook is calculated, and the candidates are narrowed down to a few. Then, from among these shortlisted candidates, the one closest to the LSP parameters of the input wideband speech signal is selected, and the selected candidate vector is quantized into a code of only a few bits.

Because the LSP parameters of the telephone-band speech signal can also be obtained on the receiving side, the receiver can compute the same distortions against the wideband LSP parameter codebook. When choosing the entry closest to the LSP parameters of the input wideband speech, the choice is made only among the very few wideband LSP parameters that were shortlisted in advance, in ascending order of distortion against the telephone-band LSP parameters, from the many stored in the wideband LSP parameter codebook. The number of bits needed for quantization is therefore reduced.

According to the present invention, the LSP parameters of the wideband speech signal are represented efficiently on the basis of the LSP parameters of the telephone-band speech signal, so they can be quantized with fewer bits than if they were quantized directly. Moreover, although only a few bits are used, additional bit information is transmitted as wideband information, so the receiving side can decode the wideband information from it. As a result, the reproduced speech is of higher quality than with conventional methods that decode the wideband speech signal from the telephone-band speech signal alone.

The speech quantization apparatus according to the present invention can be implemented in hardware, but in practice the best mode is to install the speech quantization program according to the present invention on a computer and have the computer function as the speech quantization apparatus.
To have a computer function as the speech quantization apparatus, the speech quantization program according to the present invention is installed on the computer and interpreted by the computer's CPU. This builds the following units on the computer: a wideband speech analysis unit that calculates the LSP parameters of the wideband speech signal, taking a certain length of the speech signal as the frame length; a telephone-band speech analysis unit that calculates, with the same kind of framing, the LSP parameters of the telephone-band speech signal, which is limited to a band narrower than that of the wideband speech signal; a wideband LSP parameter codebook for quantizing the LSP parameters of the wideband speech signal; a search unit that selects several candidate vectors from the wideband LSP parameter codebook using the LSP parameters of the telephone-band speech signal; and a wideband LSP quantization unit that identifies, among the candidates selected by the search unit, the vector closest to the LSP parameters of the wideband speech signal and quantizes the LSP parameters of the wideband speech signal on the basis of that vector. With these units, the computer functions as the speech quantization apparatus.

An embodiment of the speech quantization apparatus according to the present invention, and of a decoding apparatus that decodes the code it produces, is described below.
FIG. 1 shows the speech quantization apparatus 100, and FIG. 2 shows the decoding apparatus 200. Both the speech quantization apparatus 100 and the decoding apparatus 200 consist of a telephone-band processing part and a wideband processing part. The speech quantization apparatus 100 band-limits the wideband speech signal with the low-pass filter 11 to a bandwidth of, for example, 3.4 kHz, downsamples it with the downsampler 12, and then encodes it with the telephone-band quantization unit 13.

The wideband quantization unit 15 encodes the information needed to generate the high-band speech, using the wideband speech signal and the decoded speech signal obtained by decoding the code produced by the telephone-band quantization unit 13.
The telephone-band code produced by the telephone-band quantization unit 13 and the extension code produced by the wideband quantization unit 15 are packetized by the packetization unit 16 and sent as transmission data.
The decoding apparatus 200 separates the received packet signal into the telephone-band code and the extension code in the packet separation unit 17. The telephone-band code is decoded by the telephone-band decoding unit 18 to produce decoded speech. The wideband decoding unit 20 generates the high-band speech signal from the decoded speech of the telephone-band decoding unit 18 and the extension code. The high-pass filter 19B extracts only the high band from the generated signal, which is then added to the telephone-band speech signal upsampled by the upsampler 19A to obtain the wideband speech signal.

FIG. 3 shows the configuration of the wideband quantization unit 15 used in the speech quantization apparatus 100. The processing flow is as follows. The telephone-band speech signal decoded by the telephone-band decoding unit 21 is LPC-analyzed by the telephone-band speech analysis unit 23 to obtain its LSP parameters. Likewise, the wideband speech signal is LPC-analyzed by the wideband speech analysis unit 22 to obtain the LSP parameters of the wideband speech signal. Using both sets of LSP parameters, the wideband LSP quantization unit 24 quantizes the wideband LSP parameters and generates an index (index) as extension information.
FIG. 4 shows the configuration of the wideband decoding unit 20 of the decoding apparatus 200. The processing flow is as follows.

The telephone-band speech signal is LPC-analyzed by the telephone-band speech analysis unit 32 to obtain its LSP parameters. The wideband LSP generation unit 33 then obtains the wideband LSP parameters from the transmitted index and the LSP parameters of the telephone-band speech signal. The wideband LSP parameters are converted into LPC coefficients and set in the inverse filter 35 and the synthesis filter 37. The telephone-band speech signal is upsampled by the upsampler 34 and passed through the inverse filter 35 to produce a telephone-band residual signal. Next, the high-band residual signal generation unit 36 generates a high-band residual signal from the telephone-band residual signal by non-linear processing, and the synthesis filter 37 generates the high-band speech from it. This high-band speech signal is extracted by the high-pass filter 38 and added to the telephone-band signal decoded by the telephone-band decoding unit 18 shown in FIG. 2 to produce the wideband signal.
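The path through the inverse filter 35, the high-band residual signal generation unit 36, and the synthesis filter 37 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which non-linear processing is used, so full-wave rectification with mean removal is assumed here; the LPC coefficients are taken as given rather than converted from LSP parameters; and all function names are hypothetical.

```python
def inverse_filter(signal, lpc):
    """FIR analysis filter A(z) = 1 + a1*z^-1 + ...; returns the residual."""
    out = []
    for n in range(len(signal)):
        acc = signal[n]
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        out.append(acc)
    return out


def synthesis_filter(residual, lpc):
    """All-pole filter 1/A(z); undoes inverse_filter for the same coefficients."""
    out = []
    for n in range(len(residual)):
        acc = residual[n]
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def highband_residual(residual):
    """Assumed non-linearity: full-wave rectification, then mean removal."""
    rectified = [abs(r) for r in residual]
    mean = sum(rectified) / len(rectified)
    return [r - mean for r in rectified]


def generate_highband(upsampled_speech, wideband_lpc):
    """Inverse filter -> non-linear processing -> synthesis filter."""
    residual = inverse_filter(upsampled_speech, wideband_lpc)
    excitation = highband_residual(residual)
    return synthesis_filter(excitation, wideband_lpc)
```

The output of generate_highband would still have to pass through a high-pass filter, corresponding to the high-pass filter 38, before being added to the telephone-band signal; that step is left out of the sketch.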

FIG. 5 shows the details of the wideband LSP quantization unit 24 shown in FIG. 3. As shown in FIG. 5, the LSP search unit 42 in the wideband LSP quantization unit 24 uses the LSP parameter Y, obtained by analyzing the telephone-band speech signal, and the LSP parameter X, obtained by analyzing the wideband speech signal, to select the best-matching code from the wideband LSP parameter codebook 41. The low-order part of each candidate vector Zi (M candidates) in the wideband LSP parameter codebook 41 corresponds to the LSP parameter Y of the telephone-band speech signal. For example, the LSP parameter Y of the telephone-band speech signal is an 8th-order vector and the LSP parameter X of the wideband speech signal is a 16th-order vector.

First, to select from the candidate vectors Zi of the wideband LSP parameter codebook 41 those whose shape is close to the LSP parameter Y of the telephone-band speech signal, the distortion between Y and every candidate (M in total) is calculated. Next, the N candidates with the smallest distortion are selected from all the candidates. Because a codebook equivalent to the wideband LSP parameter codebook 41 is also provided on the receiving side, the receiver can perform this same step. The distortion between each of the top N candidate vectors and the LSP parameter of the actually input wideband speech signal is then calculated, and the candidate with the smallest distortion is transmitted as index. For example, with M = 128 candidates (7 bits) in the wideband LSP parameter codebook 41, preselection using the telephone-band LSP parameter means that not all 7 bits need to be sent as the index; transmitting an index over N = 8 to 16 candidates (3 to 4 bits) is sufficient.
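The two-step selection on the transmitting side can be sketched as follows. This is a minimal illustration, not the patented implementation: squared error is assumed as the distortion measure (the text does not name one), Y is compared against the low-order part of each codebook vector, and the function names are hypothetical.

```python
import math


def distortion(a, b):
    """Squared error over the shorter vector, so Y is matched to the low-order part."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def preselect(codebook, y, n):
    """Indices of the n codebook vectors closest to the telephone-band LSP y."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: (distortion(codebook[i], y), i))
    return ranked[:n]


def encode(codebook, y, x, n):
    """Position within the shortlist of the vector closest to the wideband LSP x."""
    shortlist = preselect(codebook, y, n)
    return min(range(len(shortlist)),
               key=lambda k: distortion(codebook[shortlist[k]], x))


def index_bits(n):
    """Bits needed to address one of n shortlisted candidates."""
    return math.ceil(math.log2(n))
```

Only the position within the shortlist is sent, which is where the saving comes from: index_bits(128) is 7, while index_bits(8) and index_bits(16) are 3 and 4.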

That is, using the transmitted index and the decoded telephone-band signal, the receiving side can extract from the wideband LSP parameter codebook in the wideband LSP generation unit 33 (FIG. 4) the same wideband LSP parameter that the transmitting side selected. The extracted wideband LSP parameter is identical to the one the transmitting side judged closest to the LSP parameter of the actually input wideband speech signal. Decoding the high-band component with this wideband LSP parameter therefore yields a consistently high-quality wideband signal. As a result, the present invention makes it possible to reproduce a high-quality wideband signal merely by adding an index of a few bits to the transmission.
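The receiving-side lookup can be sketched as follows. The sketch assumes squared error as the distortion measure and a deterministic tie-break by codebook position, neither of which is specified in the text; the function names are hypothetical.

```python
def distortion(a, b):
    """Squared error over the shorter vector (low-order part of the codebook entry)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def shortlist(codebook, y, n):
    """Rebuild the same top-n shortlist the transmitter built from y."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: (distortion(codebook[i], y), i))
    return ranked[:n]


def decode(codebook, y_decoded, index, n):
    """Return the wideband LSP vector addressed by the received index."""
    return codebook[shortlist(codebook, y_decoded, n)[index]]
```

The lookup only matches the transmitter's choice if both ends derive Y from the same decoded telephone-band signal, which is consistent with the transmitter in FIG. 3 analyzing the output of the telephone-band decoding unit 21 rather than the original input.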

One method of calculating the distortion between each candidate vector Zi(j) of the wideband LSP parameter codebook 41 and the LSP parameter vector Y(j) of the telephone-band signal is shown in FIG. 6. The distortion between Zi(j) and Y(j) is calculated element by element, starting from the lowest order, the distortions are summed, and the N candidate vectors with the smallest sums are extracted. This search method is called the ordered search method. Expressed as a formula, the ordered search method is as follows.
[Equation not reproduced in this text.]
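A minimal sketch of the ordered search, assuming squared error as the per-element distortion (the measure is not stated in the surviving text) and using hypothetical function names:

```python
def ordered_distortion(z, y):
    """Pair Z_i(j) with Y(j) for each j, lowest order first, and sum the errors."""
    return sum((z[j] - y[j]) ** 2 for j in range(len(y)))


def ordered_search(codebook, y, n):
    """Indices of the n candidates with the smallest summed distortion."""
    scored = sorted((ordered_distortion(z, y), i) for i, z in enumerate(codebook))
    return [i for _, i in scored[:n]]
```

Because element j of the candidate is always paired with element j of Y, the cost per candidate grows linearly with the order of Y.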

Another method of calculating the distortion between each candidate vector Zi(j) of the wideband LSP parameter codebook 41 and the LSP parameter vector Y(j) of the telephone-band signal is shown in FIG. 7. For each element of the telephone-band LSP parameters, the distortion is computed against every element of the candidate vector Zi(j), and the candidate-vector element giving the smallest distortion is paired with it. The sum of these smallest pairwise distortions over all elements of the telephone-band LSP parameters is taken as the distortion of that candidate vector. The same calculation is performed for every candidate vector, and the N candidates with the smallest distortion are selected. This selection method is called the minimum-distortion search method. Expressed as a formula, the minimum-distortion search method is as follows.
[Equation not reproduced in this text.]
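A minimal sketch of the minimum-distortion search, again assuming squared error as the per-element distortion and using hypothetical function names:

```python
def min_distortion(z, y):
    """For each element of y, take its closest element anywhere in z, then sum."""
    return sum(min((zk - yj) ** 2 for zk in z) for yj in y)


def min_distortion_search(codebook, y, n):
    """Indices of the n candidates with the smallest summed minimum distortion."""
    scored = sorted((min_distortion(z, y), i) for i, z in enumerate(codebook))
    return [i for _, i in scored[:n]]
```

Unlike the ordered search, the pairing here does not depend on position: a candidate such as [100, 300, 700, 4000] still scores well against Y = [305, 695], because 300 and 700 are matched to the two elements of Y even though they sit at the second and third positions.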

The wideband LSP parameter codebook 41 may be a multistage codebook rather than a single-stage one. In that case, the information for the second and subsequent stages is transmitted. For example, a two-stage configuration is expressed by the following formula.
[Equation not reproduced in this text.]
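One possible reading of the two-stage configuration is sketched below. Everything specific here is an assumption, since the formula is missing from the text: the reconstructed vector is taken to be the sum of a first-stage vector and a second-stage correction (the usual multistage vector quantization form), the first stage is chosen from the telephone-band LSP Y alone so that the receiver can repeat the choice, and only the second-stage index is sent. All names are hypothetical.

```python
def sq_dist(a, b):
    """Squared error over the shorter of the two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def first_stage(stage1, y):
    """Chosen from the telephone-band LSP alone, so no bits are sent for it."""
    return min(range(len(stage1)), key=lambda i: sq_dist(stage1[i], y))


def encode_two_stage(stage1, stage2, y, x):
    """Index of the second-stage vector that best corrects the first stage."""
    base = stage1[first_stage(stage1, y)]
    residual = [xv - bv for xv, bv in zip(x, base)]
    return min(range(len(stage2)), key=lambda k: sq_dist(stage2[k], residual))


def decode_two_stage(stage1, stage2, y, index2):
    """Sum of the first-stage vector and the transmitted second-stage vector."""
    base = stage1[first_stage(stage1, y)]
    return [bv + rv for bv, rv in zip(base, stage2[index2])]
```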

The wideband LSP parameters may also be quantized not from the current frame alone but with weighting applied to the codebook output of the previous frame. For example, the LSP parameter Ω_n of the n-th frame is quantized with first-order interframe prediction according to the following equation.
Ω_n = G_0 · C_n + G_1 · C_{n-1}
Here G_i are fixed prediction coefficients, G_0 being the value for the current frame and G_1 the value for the previous frame, and C_n is the output vector of the LSP codebook.
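The first-order interframe prediction can be sketched as a codebook search in which each candidate C_n is combined with the previous frame's output C_{n-1} before being compared with the target. Squared error as the distortion measure and the function names are assumptions; the text gives no values for G_0 and G_1.

```python
def predict(g0, g1, c_now, c_prev):
    """Omega_n = G_0 * C_n + G_1 * C_(n-1), element by element."""
    return [g0 * a + g1 * b for a, b in zip(c_now, c_prev)]


def quantize_frame(codebook, x, c_prev, g0, g1):
    """Index of the codebook vector whose predicted output is closest to x."""
    def err(i):
        omega = predict(g0, g1, codebook[i], c_prev)
        return sum((o - t) ** 2 for o, t in zip(omega, x))
    return min(range(len(codebook)), key=err)
```

The decoder would apply predict with the same coefficients and its own stored C_{n-1}, so both ends need to keep the previous frame's codebook output.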
The speech quantization apparatus according to the present invention described above can be implemented in hardware. However, the simplest way to realize it, and the best mode, is to install the speech quantization program according to the present invention on a computer and have the computer's CPU interpret and execute it, so that the computer functions as the speech quantization apparatus.

The speech quantization program according to the present invention is written in a computer-readable programming language and recorded on a computer-readable recording medium such as a magnetic disk or CD-ROM. It is installed on a computer from such a recording medium or over a communication line.

The invention can be used in the field of teleconferencing equipment and in the field of IP telephony over the Internet.

FIG. 1 is a block diagram for explaining the speech quantization apparatus according to the present invention as a whole.
FIG. 2 is a block diagram for explaining the decoding unit that decodes the code quantized by the speech quantization apparatus according to the present invention.
FIG. 3 is a block diagram outlining the wideband quantization unit used in the speech quantization apparatus according to the present invention.
FIG. 4 is a block diagram for explaining the configuration of the wideband decoding unit used in the decoding unit shown in FIG. 2.
FIG. 5 is a block diagram for explaining the search unit used in the speech quantization apparatus according to the present invention.
FIG. 6 is a diagram for explaining an example of the search method used in the search unit shown in FIG. 5.
FIG. 7 is a diagram for explaining another example of the search method used in the search unit shown in FIG. 5.

Explanation of symbols

100 Speech quantization apparatus
11 Low-pass filter
12 Downsampler
13 Telephone-band quantization unit
14 Delay unit
15 Wideband quantization unit
16 Packetization unit
17 Packet separation unit
18 Telephone-band decoding unit
19A Upsampler
19B High-pass filter
20 Wideband decoding unit
21 Telephone-band decoding unit
22 Wideband speech analysis unit
23 Telephone-band speech analysis unit
24 Wideband LSP quantization unit
31 Telephone-band decoding unit
32 Telephone-band speech analysis unit
33 Wideband LSP generation unit
34 Upsampler
35 Inverse filter
36 High-band residual signal generation unit
37 Synthesis filter
38 High-pass filter
41 Wideband LSP parameter codebook
42 LSP search unit

Claims

1. A speech quantization method characterized by: analyzing, with a certain length of the speech signal taken as a frame length, a wideband speech signal and a corresponding telephone-band speech signal limited to a band narrower than that of the wideband speech signal; calculating the LSP parameters of the telephone-band speech signal and the LSP parameters of the wideband speech signal; selecting several candidates from a wideband LSP parameter codebook, which is provided for quantizing the LSP parameters of the wideband speech signal, by using the LSP parameters of the telephone-band speech signal; and selecting, from among the candidates selected from the wideband LSP parameter codebook, the candidate closest to the LSP parameters of the input wideband speech signal, thereby quantizing the LSP parameters, which are the envelope information of the speech.

2. The speech quantization method according to claim 1, wherein, when candidate vectors are selected from the wideband LSP parameter codebook using the LSP parameter vector of the telephone-band speech signal, the distortion between the vectors is calculated element by element in order from the lowest order, the sum of the distortions is obtained, and a predetermined number N of candidates are selected in ascending order of the sum.

3. The speech quantization method according to claim 1, wherein, when the search unit selects candidate vectors from the wideband LSP parameter codebook using the LSP parameter vector of the telephone-band speech signal: for each element of the LSP parameters of the telephone-band speech signal, the distortion is obtained against all elements of a candidate vector of the wideband LSP parameter codebook; the candidate-vector element giving the smallest distortion for each element of the LSP parameters of the telephone-band speech signal is paired with that element; the sum of the distortions of these pairs over all elements of the LSP parameters of the telephone-band speech signal is taken as the distortion of that candidate vector; the same calculation is performed for each candidate vector; and a predetermined number N of candidates are selected in ascending order of the sum of distortions.

4. A speech quantization apparatus comprising:
a wideband speech analysis unit that calculates the LSP parameters of a wideband speech signal, with a certain length of the speech signal taken as a frame length;
a telephone-band speech analysis unit that calculates, with a certain length of the speech signal taken as a frame length, the LSP parameters of a telephone-band speech signal limited to a band narrower than that of the wideband speech signal;
a wideband LSP parameter codebook for quantizing the LSP parameters of the wideband speech signal;
a search unit that selects several candidate vectors from the wideband LSP parameter codebook using the LSP parameters of the telephone-band speech signal; and
a wideband quantization unit that identifies, from among the candidate vectors selected by the search unit, the vector closest to the LSP parameters of the wideband speech signal, and quantizes the LSP parameters of the wideband speech signal on the basis of that vector.

5. The speech quantization apparatus according to claim 4, wherein, when the search unit selects candidate vectors from the wideband LSP parameter codebook using the LSP parameter vector of the telephone-band speech signal, it calculates the distortion between the vectors element by element in order from the lowest order, obtains the sum of the distortions, and selects a predetermined number N of candidates in ascending order of the sum.

6. The speech quantization apparatus according to claim 4, wherein, when the search unit selects candidate vectors from the wideband LSP parameter codebook using the LSP parameter vector of the telephone-band speech signal: for each element of the LSP parameters of the telephone-band speech signal, it obtains the distortion against all elements of a candidate vector of the wideband LSP parameter codebook; it pairs with each element of the LSP parameters of the telephone-band speech signal the candidate-vector element giving the smallest distortion; it takes the sum of the distortions of these pairs over all elements of the LSP parameters of the telephone-band speech signal as the distortion of that candidate vector; it performs the same calculation for each candidate vector; and it selects a predetermined number N of candidates in ascending order of the sum of distortions.

7. A program written in a computer-readable programming language that causes a computer to function as the speech quantization apparatus according to claim 4.