JP2001134298A

JP2001134298A - Speech encoding device and speech decoding device, and speech encoding/decoding system

Info

Publication number: JP2001134298A
Application number: JP2000253549A
Authority: JP
Inventors: Kazutoshi Yasunaga; 和敏安永; Toshiyuki Morii; 利幸森井
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1999-08-24
Filing date: 2000-08-24
Publication date: 2001-05-18

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of computation in the code search on the encoding side while obtaining high-quality synthesized speech on the decoding side, in a speech encoding device, a speech decoding device, and a speech encoding/decoding system that employ a pulse spreading codebook. SOLUTION: The pulse spreading codebook A 101 of the speech encoding device is configured differently from the pulse spreading codebook B 111 of the speech decoding device. The spreading pattern storage section A 102 on the encoding side stores spreading patterns in which every other sample has been replaced with zero, and the code search is computed with these patterns. The spreading pattern storage section B 112 on the decoding side stores, as spreading patterns, fixed waveforms that, for example, were obtained through learning and occur frequently in the noise excitation target, and decoding is performed with these patterns. This yields an encoding device with a reduced amount of computation and a decoding device capable of producing high-quality synthesized speech.

Description

DETAILED DESCRIPTION OF THE INVENTION

【０００１】[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech encoding device, a speech decoding device, and a speech encoding/decoding system for efficiently encoding and decoding speech information.

【０００２】[0002]

2. Description of the Related Art

Many low-bit-rate speech encoding and decoding devices based on the CELP scheme have been developed. A representative reference on the CELP scheme is "Code Excited Linear Prediction: High Quality Speech at Low Bit Rate", M. R. Schroeder et al, Proc. ICASSP '85, pp. 937-940 (Reference 1). A CELP encoding/decoding device divides the input speech into fixed time intervals (hereinafter also called frames) and encodes and decodes the speech signal frame by frame.

[0003] First, an outline of a CELP speech encoding device is described with reference to FIG. 6. In the CELP speech encoding device of FIG. 6, the linear prediction analysis unit 12 first performs linear prediction analysis on the input speech 11 to calculate linear prediction coefficients, and outputs them to the linear prediction coefficient encoding unit 13. The linear prediction coefficient encoding unit 13 then encodes (vector-quantizes) the linear prediction coefficients and outputs the resulting quantization index (hereinafter, the linear prediction code) to the code output unit 24 and the linear prediction code decoding unit 14. The linear prediction code decoding unit 14 decodes (dequantizes) the linear prediction code obtained by the linear prediction coefficient encoding unit 13 and outputs the result to the synthesis filter 15. The synthesis filter 15 is configured as an all-pole synthesis filter whose coefficients are given by the decoded linear prediction code obtained by the linear prediction code decoding unit 14. Next, the vector addition unit 22 adds the vector obtained by multiplying the adaptive excitation vector selected from the adaptive codebook 17 by the adaptive excitation gain 19 and the vector obtained by multiplying the noise excitation vector selected from the noise codebook 18 by the noise excitation gain 20, thereby generating the driving excitation vector. The distortion calculation unit 16 then calculates the distortion (Equation 1) between the input speech 11 and the output vector produced when the synthesis filter 15 is driven by this driving excitation vector, and outputs the distortion ER to the code specifying unit 23.

【０００４】[0004]

【数１】 (Equation 1)

[0005] In Equation 1, u is the input speech vector in the frame being processed, H is the impulse response matrix of the synthesis filter, ga is the adaptive excitation gain, gc is the noise excitation gain, p is the adaptive excitation vector, and c is the noise excitation vector.
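The image for Equation 1 is not reproduced in this text. Based on the symbol definitions above and the usual CELP analysis-by-synthesis criterion, it presumably takes the following form (a reconstruction under that assumption, not the original rendering):

```latex
ER = \left\| u - \left( g_a H p + g_c H c \right) \right\|^2
```

Here $g_a$ and $g_c$ correspond to ga and gc in the text, and $\|\cdot\|^2$ is the squared Euclidean norm over the samples of the frame.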

[0006] Here, the adaptive codebook 17 is a buffer (dynamic memory) that stores the driving excitation vectors of the past several frames. The adaptive excitation vector selected from the adaptive codebook 17 is used to represent the periodic component of the linear prediction residual vector, which is obtained by passing the input speech through the inverse of the synthesis filter.

[0007] The noise codebook 18, on the other hand, is a static memory that stores multiple kinds of fixed vectors. The noise excitation vector selected from the noise codebook 18 is used to represent the aperiodic component newly added to the linear prediction residual vector in the current frame (the component that remains after the periodicity, that is, the adaptive excitation vector component, is removed from the linear prediction residual vector). Because it stores multiple kinds of fixed vectors, the noise codebook is also called a fixed codebook. The adaptive excitation vector weighting unit 19 and the noise excitation vector weighting unit 20 multiply the adaptive excitation vector selected from the adaptive codebook 17 and the noise excitation vector selected from the noise codebook 18 by the adaptive excitation gain and the noise excitation gain, respectively, read from the weight codebook 21. The weight codebook 21 is a static memory that stores multiple sets, each consisting of an adaptive excitation gain to be applied to the adaptive excitation vector and a noise excitation gain to be applied to the noise excitation vector.

[0008] The code specifying unit 23 selects the combination of indices of the three codebooks (adaptive codebook, noise codebook, and weight codebook) that minimizes the distortion ER of Equation 1 calculated by the distortion calculation unit 16. The code specifying unit 23 then outputs the codebook indices selected when the distortion is minimized to the code output unit 24 as the adaptive excitation code, the noise excitation code, and the weight code, respectively. Finally, the code output unit 24 combines the linear prediction code obtained by the linear prediction coefficient encoding unit 13 with the adaptive excitation code, noise excitation code, and weight code specified by the code specifying unit 23 into a single code (bit information) representing the input speech in the current frame, and outputs it to the decoding device. The code specifying unit 23 sometimes specifies the adaptive excitation code, noise excitation code, and weight code after each fixed-length frame has been divided into shorter intervals called subframes. In this specification, however, frames and subframes are not distinguished, and the term frame is used for both.

[0009] Next, an outline of a CELP speech decoding device is described with reference to FIG. 7. In the CELP decoding device of FIG. 7, the code input unit 31 first receives the code (the bit information representing the speech signal in a frame) specified by the CELP speech encoding device (FIG. 6) and separates it into four codes: the linear prediction code, the adaptive excitation code, the noise excitation code, and the weight code. It then outputs the linear prediction code to the linear prediction coefficient decoding unit 32, the adaptive excitation code to the adaptive codebook 33, the noise excitation code to the noise codebook 34, and the weight code to the weight codebook 35. The linear prediction coefficient decoding unit 32 decodes the linear prediction code input from the code input unit 31 to obtain the decoded linear prediction code, and outputs it to the synthesis filter 39.

[0010] The synthesis filter 39 is configured as an all-pole synthesis filter whose coefficients are given by the decoded linear prediction code obtained by the linear prediction coefficient decoding unit 32. The adaptive codebook 33 outputs the adaptive excitation vector corresponding to the adaptive excitation code input from the code input unit 31. The noise codebook 34 outputs the noise excitation vector corresponding to the noise excitation code input from the code input unit 31. The weight codebook 35 reads out the adaptive excitation gain and the noise excitation gain corresponding to the weight code input from the code input unit 31, and outputs them to the adaptive excitation gain multiplication unit 36 and the noise excitation gain multiplication unit 37, respectively.

[0011] The adaptive excitation gain multiplication unit 36 multiplies the adaptive excitation vector output from the adaptive codebook 33 by the adaptive excitation gain output from the weight codebook 35, and the noise excitation gain multiplication unit 37 multiplies the noise excitation vector output from the noise codebook 34 by the noise excitation gain output from the weight codebook 35. The vector addition unit 38 adds the output vectors of the adaptive excitation gain multiplication unit 36 and the noise excitation gain multiplication unit 37 to generate the driving excitation vector. The synthesis filter 39 is then driven by this driving excitation vector and outputs the synthesized speech 40 for the received frame.
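The decoding steps above can be illustrated with a minimal Python sketch. The function name `synthesize_frame`, the argument layout, and the sign convention of the predictor coefficients are assumptions made for illustration; the source does not specify them.

```python
def synthesize_frame(a, p, c, ga, gc, memory=None):
    """Decode one frame: build the driving excitation, then run an all-pole filter.

    a      -- decoded predictor coefficients (assumed convention:
              s[n] = e[n] + sum_i a[i] * s[n-1-i])
    p, c   -- adaptive and noise excitation vectors for the frame
    ga, gc -- adaptive and noise excitation gains read from the weight codebook
    memory -- last len(a) output samples of the previous frame, newest first
    """
    order = len(a)
    past = list(memory) if memory is not None else [0.0] * order
    # Vector addition: driving excitation e = ga * p + gc * c.
    excitation = [ga * pi + gc * ci for pi, ci in zip(p, c)]
    speech = []
    for e in excitation:
        # All-pole synthesis: current output depends on past outputs only.
        s = e + sum(a[i] * past[i] for i in range(order))
        speech.append(s)
        past = ([s] + past)[:order]  # shift the filter memory
    return excitation, speech, past
```

In a complete decoder the returned excitation would also be written back into the adaptive codebook buffer for use in later frames; that bookkeeping is omitted here.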

[0012] In a CELP speech encoding/decoding device such as the above, the distortion ER of Equation 1 must be kept small to obtain high-quality synthesized speech. Ideally, the combination of adaptive excitation code, noise excitation code, and weight code would be specified in a closed loop so as to minimize Equation 1. However, minimizing the distortion ER of Equation 1 in a closed loop requires too much computation, so the three codes are generally specified in an open loop.

[0013] Specifically, the adaptive codebook search is performed first. The adaptive codebook search vector-quantizes the periodic component of the prediction residual vector, obtained by passing the input speech through the inverse filter, using the adaptive excitation vectors output from the adaptive codebook, which stores the driving excitation vectors of past frames. The entry number of the adaptive excitation vector whose periodic component is closest to that of the linear prediction residual vector is specified as the adaptive excitation code. The adaptive codebook search also provisionally determines the ideal adaptive excitation gain.
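As a rough illustration of the adaptive codebook search, the Python sketch below scans candidate lags in a buffer of past excitation and keeps the one that best matches the residual. The matching criterion (squared correlation divided by energy, evaluated directly on the residual), the restriction to lags no shorter than the frame, and all names are assumptions for illustration; the source only states that the closest periodic component is chosen and that an ideal gain is obtained at the same time.

```python
def search_adaptive_codebook(residual, past_excitation, min_lag, max_lag):
    """Return (lag, ideal_gain) of the best-matching past excitation segment."""
    n = len(residual)
    best_lag, best_gain, best_score = -1, 0.0, -1.0
    # Assumption: only lags >= frame length, so each candidate lies fully in the buffer.
    for lag in range(max(min_lag, n), max_lag + 1):
        start = len(past_excitation) - lag
        if start < 0:
            break  # the buffer does not reach back this far
        p = past_excitation[start:start + n]   # candidate adaptive excitation vector
        energy = sum(x * x for x in p)
        if energy == 0.0:
            continue
        corr = sum(r * x for r, x in zip(residual, p))
        score = corr * corr / energy            # larger score means smaller error
        if score > best_score:
            # corr / energy is the least-squares (ideal) gain for this candidate.
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain
```

The returned lag plays the role of the adaptive excitation code, and the returned gain is the provisional ideal adaptive excitation gain.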

[0014] The noise codebook search is performed next. The noise codebook search vector-quantizes the component obtained by removing the periodic component from the linear prediction residual vector of the frame being processed, that is, the component obtained by subtracting the adaptive excitation vector component from the linear prediction residual vector (hereinafter also called the noise excitation target), using the multiple noise excitation vector candidates stored in the noise codebook. This search specifies, as the noise excitation code, the entry number of the noise excitation vector that encodes the noise excitation target with the least distortion. The noise codebook search also provisionally determines the ideal noise gain.

[0015] Finally, the weight codebook search is performed. The weight codebook search encodes (vector-quantizes) the two-element vector consisting of the ideal adaptive gain provisionally obtained in the adaptive codebook search and the ideal noise gain provisionally obtained in the noise codebook search, using the gain candidate vectors stored in the weight codebook (each a two-element vector consisting of an adaptive excitation gain candidate and a noise excitation gain candidate), so that the distortion is minimized. The entry number of the gain candidate vector selected here is output to the code output unit as the weight code.
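The weight codebook search can be sketched in Python as a nearest-neighbor search over two-element gain vectors. The distortion measure used here (squared Euclidean distance between gain pairs) and the function name are assumptions for illustration; the source only says the distortion is minimized.

```python
def search_weight_codebook(ideal_ga, ideal_gc, weight_codebook):
    """Return the index of the (ga, gc) candidate closest to the ideal gain pair."""
    best_index, best_error = -1, float("inf")
    for index, (ga, gc) in enumerate(weight_codebook):
        # Assumed distortion: squared distance in the two-dimensional gain space.
        error = (ideal_ga - ga) ** 2 + (ideal_gc - gc) ** 2
        if error < best_error:
            best_index, best_error = index, error
    return best_index
```

The returned index corresponds to the weight code that is passed to the code output unit.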

[0016] Of the general code search steps in the CELP speech encoding device described above, the noise codebook search (the step that specifies the noise excitation code after the adaptive excitation code has been specified) is now described in more detail.

[0017] As described above, in a general CELP encoding device the linear prediction code and the adaptive excitation code have already been specified by the time the noise codebook search is performed. Let H be the impulse response matrix of the synthesis filter formed from the already-specified linear prediction code, p the adaptive excitation vector corresponding to the adaptive excitation code, and ga the ideal adaptive excitation gain (provisional value) obtained when the adaptive excitation code was specified. The distortion ER of Equation 1 can then be rewritten as Equation 2 below.

【００１８】[0018]

【数２】 (Equation 2)

[0019] Here, the vector v in Equation 2 is the noise excitation target given by Equation 3 below, which uses the input speech signal u in the frame, the impulse response matrix H of the synthesis filter (already determined), the adaptive excitation vector p (already determined), and the ideal adaptive excitation gain ga (provisional value).

【００２０】[0020]

【数３】 (Equation 3)
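The images for Equations 2 and 3 are not reproduced in this text. Given the description above (v is formed from u, H, p, and ga, and the distortion is indexed by the entry number k), they presumably read as follows; this is a reconstruction under that assumption:

```latex
\begin{aligned}
ER_k &= \left\| v - g_c H c_k \right\|^2 && \text{(Equation 2)} \\
v &= u - g_a H p && \text{(Equation 3)}
\end{aligned}
```

Substituting Equation 3 into Equation 2 gives $\| u - g_a H p - g_c H c_k \|^2$, which matches the squared-error form of Equation 1 with c written as $c_k$.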

[0021] The noise excitation vector is written c in Equation 1 and ck in Equation 2. This is because Equation 1 does not make the entry number k of the noise excitation vector explicit, whereas Equation 2 does. The notation differs, but both refer to the same thing.

[0022] The noise codebook search is therefore the process of finding the entry number k of the noise excitation vector ck that minimizes the distortion ERk of Equation 2. When specifying this entry number k, the noise excitation gain gc can be assumed to take an arbitrary value. Finding the entry number that minimizes the distortion of Equation 2 can therefore be replaced by finding the entry number k of the noise excitation vector ck that maximizes the fractional expression Dk of Equation 4 below.

【００２３】[0023]

【数４】 (Equation 4)
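The image for Equation 4 is not reproduced in this text. Assuming the squared-error form $ER_k = \| v - g_c H c_k \|^2$ for Equation 2, the fractional expression follows from choosing the best gain for each candidate. The derivation below is a sketch under that assumption:

```latex
\begin{aligned}
\frac{\partial ER_k}{\partial g_c} = 0
  &\;\Rightarrow\; g_c^{*} = \frac{v^{T} H c_k}{\left\| H c_k \right\|^2} \\
ER_k \big|_{g_c = g_c^{*}}
  &= \left\| v \right\|^2 - \frac{\left( v^{T} H c_k \right)^2}{\left\| H c_k \right\|^2} \\
D_k &= \frac{\left( v^{T} H c_k \right)^2}{\left\| H c_k \right\|^2}
  && \text{(Equation 4)}
\end{aligned}
```

Because $\| v \|^2$ does not depend on k, minimizing $ER_k$ over k is equivalent to maximizing $D_k$.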

[0024] The noise codebook search is thus carried out in two steps: (1) the distortion calculation unit 16 calculates the fractional expression Dk of Equation 4 for each entry number k of the noise excitation vector ck and outputs the value to the code specifying unit 23; (2) the code specifying unit 23 compares the values of Equation 4 across the entry numbers k, determines the entry number k that gives the maximum value to be the noise excitation code, and outputs it to the code output unit 24.
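The two-step search above can be sketched in Python as follows. The sketch assumes that Dk is the squared correlation between v and H ck divided by the energy of H ck, and that H is the lower-triangular Toeplitz matrix built from the synthesis filter's impulse response h, so that H ck is a convolution truncated to the frame length. The function names are hypothetical.

```python
def filter_vector(h, c):
    """Compute H c, with H the lower-triangular Toeplitz matrix built from h."""
    n = len(c)
    return [sum(h[i - j] * c[j] for j in range(i + 1) if i - j < len(h))
            for i in range(n)]


def search_noise_codebook(v, h, codebook):
    """Return (k, Dk) for the entry maximizing Dk = (v . H ck)^2 / ||H ck||^2."""
    best_k, best_d = -1, -1.0
    for k, ck in enumerate(codebook):
        y = filter_vector(h, ck)                      # synthesized candidate H ck
        energy = sum(yi * yi for yi in y)             # denominator ||H ck||^2
        if energy == 0.0:
            continue                                  # skip all-zero candidates
        corr = sum(vi * yi for vi, yi in zip(v, y))   # correlation v . H ck
        d = corr * corr / energy                      # step (1): evaluate Dk
        if d > best_d:                                # step (2): keep the maximum
            best_k, best_d = k, d
    return best_k, best_d
```

This exhaustive loop costs one filtering operation per entry, which is the computation that structured codebooks such as the algebraic codebook aim to reduce.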

[0025] Early CELP schemes used a noise codebook in which multiple random sequences were entered as noise excitation vectors, that is, a noise codebook in which multiple random sequences were stored directly in memory. In recent low-bit-rate CELP encoding/decoding devices, by contrast, many designs provide the noise codebook section with an algebraic codebook that generates noise excitation vectors containing a small number of non-zero elements of amplitude +1 or −1 (all other elements have zero amplitude). Algebraic codebooks are disclosed in, for example, "Fast CELP Coding based on Algebraic codes", J. Adoul et al, Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1987, pp. 1957-1960 (Reference 2) and "Comparison of Some Algebraic Structure for CELP Coding of Speech", J. Adoul et al, Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1987, pp. 1953-1956 (Reference 3).
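To make the idea of an algebraic codebook vector concrete, the Python sketch below builds a frame-length vector that is zero everywhere except for a few pulses of amplitude +1 or -1. How pulse positions and signs are mapped to code bits is not described in this passage, so the sketch simply takes them as arguments; the function name is hypothetical.

```python
def pulse_vector(frame_length, positions, signs):
    """Build a sparse vector with one +1/-1 pulse per channel."""
    vector = [0] * frame_length
    for position, sign in zip(positions, signs):
        if sign not in (1, -1):
            raise ValueError("pulse amplitude must be +1 or -1")
        vector[position] += sign
    return vector
```

Because the vector is fully determined by a handful of positions and signs, it can be generated on the fly rather than stored, which is why no table of noise excitation vectors is needed.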

[0026] The algebraic codebooks disclosed in the above references have the following excellent features: (1) they can generate high-quality synthesized speech when applied to a CELP scheme with a bit rate of about 8 kb/s; (2) the noise codebook can be searched with a small amount of computation; and (3) no data ROM capacity is needed to store noise excitation vectors directly. CS-ACELP (bit rate 8 kb/s) and ACELP (bit rate 5.3 kb/s), both characterized by an algebraic codebook in the noise codebook section, were issued by the ITU-T in 1996 as Recommendations G.729 and G.723.1, respectively. The details of CS-ACELP are disclosed in, for example, "Design and Description of CS-ACELP: A Toll Quality 8 kb/s Speech Coder", Redwan Salami et al, IEEE Trans. Speech and Audio Processing, vol. 6, no. 2, March 1998 (Reference 4).

[0027] The algebraic codebook thus has excellent features. However, when an algebraic codebook is applied to the noise codebook of a CELP encoding/decoding device, the noise excitation target is always encoded (vector-quantized) by a noise excitation vector containing only a small number of non-zero elements, so the noise excitation target cannot be represented faithfully. This problem becomes particularly pronounced when the frame being processed corresponds to an unvoiced consonant segment or a background noise segment.

[0028] This is because the noise excitation target often has a complicated shape in unvoiced consonant segments and background noise segments. Furthermore, when an algebraic codebook is applied to a CELP encoding/decoding device whose bit rate is lower than about 8 kb/s, the number of non-zero elements in the noise excitation vector must be reduced, so the problem can arise even in voiced segments, where the noise excitation target tends to have a pulse-like shape.

[0029] As one way of solving this problem of the algebraic codebook, a pulse spreading codebook has been disclosed. In it, the excitation that drives the synthesis filter is a vector obtained by convolving a vector output from the algebraic codebook, which contains a small number of non-zero elements (all other elements are zero), with fixed waveforms called spreading patterns. Pulse spreading codebooks are disclosed in Japanese Unexamined Patent Publication No. Hei 10-232696 (Reference 5); "ACELP Coding Combined with a Pulse Spreading Structured Excitation", Yasunaga et al, Proceedings of the 1997 IEICE Spring National Convention, D-14-11, p. 253, 1997-03 (Reference 6); and "Low-Rate Speech Coding Using a Pulse Spreading Excitation", Yasunaga et al, Proceedings of the 1998 Autumn Meeting of the Acoustical Society of Japan, pp. 281-282, 1998-10 (Reference 7).

[0030] Next, an outline of the pulse spreading codebook disclosed in the above references is described with reference to FIGS. 8 and 9. FIG. 9 shows a more detailed example of the pulse spreading codebook of FIG. 8.

[0031] In the pulse spreading codebook 50 of FIGS. 8 and 9, reference numeral 41 denotes an algebraic codebook that generates a pulse vector 42 consisting of a small number of non-zero elements (of amplitude +1 or −1). In the CELP encoding/decoding devices described in References 2, 3, and 4, the pulse vector 42 output from the algebraic codebook 41 (consisting of a small number of non-zero elements) is used as the noise excitation vector without modification. Reference numeral 43 in FIGS. 8 and 9 denotes a spreading pattern storage section, which stores one or more kinds of fixed waveforms, called spreading patterns, for each channel. The spreading patterns stored for the channels may either differ in shape from channel to channel or have the same (common) shape for every channel. The case where the channels share a common spreading pattern is a simplification of the case where each channel has its own, so the rest of this specification describes the case where the spreading patterns stored for the channels differ in shape.

[0032] Rather than outputting the output vector 42 of the algebraic codebook 41 directly as the noise excitation vector, the pulse spreading codebook 50 convolves, channel by channel in the pulse spreading section 45, the vector 42 output from the algebraic codebook 41 with the spreading pattern 44 read from the spreading pattern storage section 43, and uses the vector 46 obtained by adding the resulting per-channel vectors as the noise excitation vector.
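The pulse spreading operation described above can be sketched in Python as follows. Each channel contributes one signed pulse; convolving a single pulse with that channel's spreading pattern amounts to placing a scaled copy of the pattern at the pulse position, and the per-channel results are then added. Truncating the pattern at the end of the frame is an assumption of this sketch, as are the function and parameter names.

```python
def spread_pulses(frame_length, positions, signs, patterns):
    """Convolve each channel's pulse with its spreading pattern and sum the results."""
    excitation = [0.0] * frame_length
    for position, sign, pattern in zip(positions, signs, patterns):
        # Convolution with a single pulse shifts the pattern to the pulse
        # position and scales it by the pulse sign.
        for offset, value in enumerate(pattern):
            index = position + offset
            if index < frame_length:   # assumption: truncate at the frame end
                excitation[index] += sign * value
    return excitation
```

With a one-sample pattern `[1.0]` for every channel, the output reduces to the plain pulse vector, which corresponds to using the algebraic codebook output without spreading.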

[0033] When the pulse spreading codebook of FIG. 8 (FIG. 9) is used in the noise codebook section of a CELP encoding/decoding device, the encoding and decoding devices are configured as shown in FIGS. 10 and 11, respectively, in which the noise codebook portions of the CELP encoding device of FIG. 6 and the CELP decoding device of FIG. 7 (18 in FIG. 6 and 34 in FIG. 7) are replaced with the pulse spreading codebook of FIG. 8 (FIG. 9).

[0034] The CELP encoding/decoding devices disclosed in References 5, 6, and 7 are characterized in that the encoding device (FIG. 10) and the decoding device (FIG. 11) use pulse spreading codebooks of identical configuration (the number of channels in the algebraic codebook section and the number and shapes of the spreading patterns registered in the spreading pattern storage section are common to the encoding and decoding sides). The quality of the synthesized speech is improved by efficiently setting the shapes and number of the spreading patterns registered in the spreading pattern storage section 43 and, when multiple kinds are registered, the method of selecting among them.

[0035] The description of the pulse spreading codebook given here assumes that the codebook generating the pulse vector 42, which consists of a small number of non-zero elements, is an algebraic codebook in which the amplitude of the non-zero elements is restricted to +1 or −1. However, a multi-pulse codebook, which places no restriction on the amplitude of the non-zero elements, or a regular pulse codebook can also be used to generate the pulse vector. In that case too, the quality of the synthesized speech can be improved by using the pulse vector convolved with the spreading patterns as the noise excitation vector.

[0036] It has been shown so far that the quality of synthesized speech is effectively improved by registering one or more of the following kinds of spreading patterns for each non-zero element (channel) of the excitation vector output from the algebraic codebook, convolving the registered spreading patterns channel by channel with the vector generated by the algebraic codebook (consisting of a small number of non-zero elements), and using the sum of the per-channel results as the noise excitation vector: [1] spreading patterns whose shapes were found, by statistically learning the shapes of many noise excitation targets, to occur in the targets with statistically high frequency (References 5, 6, 7); [2] spreading patterns with random-like shapes for efficiently representing unvoiced consonant segments and noise segments (References 5, 6, 7); [3] spreading patterns with pulse-like shapes for efficiently representing stationary voiced segments (References 5, 7); [4] spreading patterns shaped so as to disperse the energy of the pulse vector output from the algebraic codebook (which is concentrated at the positions of the non-zero elements) to the surrounding samples (Reference 6); [5] spreading patterns selected from several suitably prepared candidates by repeatedly encoding and decoding speech signals and evaluating the synthesized speech by listening, so that high-quality synthesized speech is output (Reference 5); and [6] spreading patterns created on the basis of phonetic knowledge (Reference 5).

[0037] In particular, for the case where the spreading pattern storage unit 43 registers multiple (two or more) spreading patterns per channel, the following methods of selecting among those spreading patterns have been disclosed (References 5 and 6): <1> a closed-loop selection method, in which encoding and decoding are actually performed for all combinations of the registered spreading patterns and the spreading patterns that minimize the resulting coding distortion are selected; and <2> an open-loop selection method, in which the spreading patterns are selected using speech-related information already known at the time of the noise codebook search. The speech-related information here is, for example, information on the strength of voicing determined from the dynamic or static variation of the weight code, or information on the strength of voicing determined from the dynamic variation of the linear prediction code.
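The two selection strategies can be sketched roughly as follows. This is only an illustration under stated assumptions: `distortion` is a hypothetical callable standing in for a full encode/decode pass, and the voicing threshold and the two-way choice between a pulse-like and a random-like pattern are invented for the example, not taken from the cited references.

```python
from itertools import product

def select_closed_loop(candidates_per_channel, distortion):
    """Closed-loop selection: evaluate every combination of registered
    patterns (one per channel) and keep the combination with the smallest
    distortion. `distortion` is a hypothetical stand-in for actually
    encoding and decoding with that combination."""
    return min(product(*candidates_per_channel), key=distortion)

def select_open_loop(voicing_strength, pulse_like, random_like, threshold=0.5):
    """Open-loop selection: choose a pattern from information that is already
    known before the noise codebook search (here a voicing-strength value;
    the threshold is an assumption of this sketch)."""
    return pulse_like if voicing_strength >= threshold else random_like
```

The closed-loop variant costs one distortion evaluation per combination, whereas the open-loop variant needs no trial encoding at all.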

[0038] In the remainder of this description of the conventional technique, for simplicity, the explanation is limited to the pulse spreading codebook 50 of FIG. 12, in which the spreading pattern storage unit 45 in the pulse spreading codebook of FIG. 9 registers only one spreading pattern per channel.

[0039] Next, the noise codebook search processing for the case where a pulse spreading codebook is applied to a CELP coding apparatus is described in comparison with the noise codebook search processing for the case where an algebraic codebook is applied to a CELP coding apparatus. First, the codebook search processing for the case where an algebraic codebook is used for the noise codebook section is described.

[0040] Let N be the number of non-zero elements in the vector output by the algebraic codebook (N is also the number of channels of the algebraic codebook), let d_i (i is the channel number, 0 ≤ i ≤ N-1) be the vector output for each channel, which contains exactly one non-zero element of amplitude +1 or -1 (all other elements are zero), and let L be the subframe length. The noise source vector c_k of entry number k output by the algebraic codebook is then given by (Equation 5) below.

[0041]

(Equation 5)

[0042] Substituting (Equation 5) into (Equation 4) yields (Equation 6) below.

[0043]

(Equation 6)

[0044] The noise codebook search processing is the process of identifying the entry number k that maximizes (Equation 7) below, which is obtained by rearranging (Equation 6).

[0045]

(Equation 7)
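As a hedged sketch of the three expressions referred to above, assume that (Equation 4) is the usual CELP search criterion, that is, the squared correlation between the target v and the filtered candidate Hc divided by the energy of Hc, with H taken to be the impulse-response convolution matrix of the synthesis filter. With x^t = v^t H and M = H^t H, the expressions can then be written as follows. The symbol D_k for the criterion value is introduced here for illustration only.

```latex
% (Equation 5): algebraic-codebook vector as the sum of the per-channel pulse vectors
c_k = \sum_{i=0}^{N-1} d_i

% (Equation 6): assumed form of (Equation 4) with (Equation 5) substituted
D_k = \frac{\left( v^{t} H \sum_{i=0}^{N-1} d_i \right)^{2}}
           {\left\lVert H \sum_{i=0}^{N-1} d_i \right\rVert^{2}}

% (Equation 7): the same quantity expressed with x^t = v^t H and M = H^t H
D_k = \frac{\left( \sum_{i=0}^{N-1} x^{t} d_i \right)^{2}}
           {\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} d_i^{t} M d_j}
```

Because each d_i has a single non-zero element of amplitude ±1, every term x^t d_i and d_i^t M d_j reduces to a signed lookup into x or M.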

[0046] In (Equation 7), x^t = v^t H and M = H^t H (v is the noise source target). When the value of (Equation 7) is calculated for each entry number k, x^t = v^t H and M = H^t H are computed in a preprocessing stage and the results are stored in memory. As disclosed in References 2, 3, and 4 and elsewhere, and as is generally known, introducing this preprocessing greatly reduces the amount of computation needed to evaluate (Equation 7) for each candidate entered as a noise source vector, and as a result keeps the total amount of computation required for the noise codebook search small.
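The effect of the preprocessing can be illustrated with a minimal pure-Python sketch. It assumes the usual CELP criterion (squared correlation over energy) for (Equation 4) and a lower-triangular convolution matrix H built from a filter impulse response h. The function names, the track layout, and the exhaustive search are assumptions of this example, not the search procedure of any particular codec.

```python
from itertools import product

def conv_matrix(h, L):
    """L x L lower-triangular Toeplitz matrix H with H[r][c] = h[r - c]."""
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(L)]
            for r in range(L)]

def precompute(v, H):
    """Preprocessing: x^t = v^t H and M = H^t H, computed once per subframe."""
    L = len(v)
    x = [sum(v[r] * H[r][c] for r in range(L)) for c in range(L)]
    M = [[sum(H[r][a] * H[r][b] for r in range(L)) for b in range(L)]
         for a in range(L)]
    return x, M

def criterion_fast(pulses, x, M):
    """Criterion per candidate using only table lookups.
    pulses: one (position, sign) pair per channel."""
    num = sum(s * x[p] for p, s in pulses) ** 2
    den = sum(si * sj * M[pi][pj] for pi, si in pulses for pj, sj in pulses)
    return num / den

def criterion_direct(pulses, v, H):
    """Same criterion without preprocessing: build c, filter it, correlate."""
    L = len(v)
    c = [0.0] * L
    for p, s in pulses:
        c[p] += s
    Hc = [sum(H[r][k] * c[k] for k in range(L)) for r in range(L)]
    return sum(a * b for a, b in zip(v, Hc)) ** 2 / sum(a * a for a in Hc)

def search(x, M, tracks):
    """Exhaustive search over positions (tracks[i] lists the positions allowed
    for channel i) and signs; returns (best score, best pulses)."""
    best = None
    for positions in product(*tracks):
        for signs in product((1, -1), repeat=len(tracks)):
            pulses = list(zip(positions, signs))
            score = criterion_fast(pulses, x, M)
            if best is None or score > best[0]:
                best = (score, pulses)
    return best
```

With x and M in memory, each candidate costs on the order of N lookups for the numerator and N² for the denominator, independent of the subframe length L.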

[0047] Next, the noise codebook search processing for the case where a pulse spreading codebook is used as the noise codebook is described.

[0048] Let N be the number of non-zero elements output by the algebraic codebook, which is one component of the pulse spreading codebook (N is also the number of channels of the algebraic codebook), let d_i (i is the channel number, 0 ≤ i ≤ N-1) be the vector output for each channel, which contains exactly one non-zero element of amplitude +1 or -1 (all other elements are zero), let w_i be the spreading pattern for channel number i stored in the spreading pattern storage unit, and let L be the subframe length. The noise source vector c_k of entry number k output by the pulse spreading codebook is then given by (Equation 8) below.

[0049]

(Equation 8)

[0050] In this case, therefore, substituting (Equation 8) into (Equation 4) yields (Equation 9) below.

[0051]

(Equation 9)

[0052] The noise codebook search processing for the case where a pulse spreading codebook is used is the process of identifying the entry number k of the noise source vector that maximizes (Equation 10) below, which is obtained by rearranging (Equation 9).

[0053]

(Equation 10)
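As a hedged sketch of these three expressions, again assume that (Equation 4) is the usual CELP criterion (squared correlation over energy) and that W_i denotes the convolution matrix built from the spreading pattern w_i. The sketch takes H_i = H W_i, which is the form that follows directly from substituting (Equation 8); treat this as an assumption wherever it differs from the notation used in the text. The symbols D_k and R_{ij} are introduced for illustration.

```latex
% (Equation 8): each channel's pulse vector is convolved with its spreading pattern
c_k = \sum_{i=0}^{N-1} W_i d_i

% (Equation 9): assumed form of (Equation 4) with (Equation 8) substituted
D_k = \frac{\left( v^{t} H \sum_{i=0}^{N-1} W_i d_i \right)^{2}}
           {\left\lVert H \sum_{i=0}^{N-1} W_i d_i \right\rVert^{2}}

% (Equation 10): with H_i = H W_i, x_i^t = v^t H_i, R_{ij} = H_i^t H_j
D_k = \frac{\left( \sum_{i=0}^{N-1} x_i^{t} d_i \right)^{2}}
           {\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} d_i^{t} R_{ij} d_j}
```

The per-candidate expression has the same shape as in the algebraic-codebook case; only the tables x_i and R_{ij} that must be prepared beforehand are larger.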

[0054] In (Equation 10), x_i^t = v^t H_i (where H_i = H^t W_i, W_i being the spreading pattern convolution matrix) and R = H_i^t H_j. When the value of (Equation 10) is calculated for each entry number k, H_i = H^t W_i, x_i^t = v^t H_i, and R = H_i^t H_j can be computed as preprocessing and recorded in memory. The amount of computation needed to evaluate (Equation 10) for each candidate entered as a noise source vector then becomes the same as the amount needed to evaluate (Equation 7) when an algebraic codebook is used (this is evident from the fact that (Equation 7) and (Equation 10) have the same form), so the noise codebook search can be performed with a small amount of computation even when a pulse spreading codebook is used (References 5, 6, and 7).
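A minimal pure-Python sketch of this preprocessing follows. It is an illustration under stated assumptions: (Equation 4) is taken to be the usual squared-correlation-over-energy criterion, the spreading pattern matrix W_i and the filter matrix H are both lower-triangular convolution matrices truncated at the subframe length, and H_i is formed as the product H W_i. All function names are hypothetical.

```python
def conv_matrix(w, L):
    """L x L lower-triangular Toeplitz matrix T with T[r][c] = w[r - c]."""
    return [[w[r - c] if 0 <= r - c < len(w) else 0.0 for c in range(L)]
            for r in range(L)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def precompute(v, h, patterns):
    """Preprocessing: H_i = H W_i, x_i^t = v^t H_i, R_ij = H_i^t H_j."""
    L = len(v)
    H = conv_matrix(h, L)
    Hs = [matmul(H, conv_matrix(w, L)) for w in patterns]
    xs = [[sum(v[r] * Hi[r][c] for r in range(L)) for c in range(L)]
          for Hi in Hs]
    Rs = [[matmul(transpose(Hi), Hj) for Hj in Hs] for Hi in Hs]
    return xs, Rs

def criterion_fast(pulses, xs, Rs):
    """Criterion per candidate by table lookup.
    pulses[i] = (position, sign) of channel i."""
    num = sum(s * xs[i][p] for i, (p, s) in enumerate(pulses)) ** 2
    den = sum(si * sj * Rs[i][j][pi][pj]
              for i, (pi, si) in enumerate(pulses)
              for j, (pj, sj) in enumerate(pulses))
    return num / den

def criterion_direct(pulses, v, h, patterns):
    """Same criterion without tables: spread each pulse, sum, filter, correlate."""
    L = len(v)
    c = [0.0] * L
    for (p, s), w in zip(pulses, patterns):
        for n, wn in enumerate(w):
            if p + n < L:  # the pattern is cut off at the subframe boundary
                c[p + n] += s * wn
    Hc = [sum(h[m] * c[r - m] for m in range(len(h)) if r - m >= 0)
          for r in range(L)]
    return sum(a * b for a, b in zip(v, Hc)) ** 2 / sum(a * a for a in Hc)
```

The per-candidate cost in `criterion_fast` matches the algebraic-codebook case; the extra work sits entirely in `precompute`, which is the cost the invention aims to reduce on the coding side.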

[0055]

[Problems to be Solved by the Invention] The conventional technique described above showed the effect of using a pulse spreading codebook for the noise codebook section of a CELP coding apparatus and decoding apparatus, and showed that, when a pulse spreading codebook is used for the noise codebook section, the noise codebook search can be performed by the same method as when an algebraic codebook is used. The difference between the amount of computation required for the noise codebook search with an algebraic codebook and that required with a pulse spreading codebook lies in the amount of computation required for the respective preprocessing stages of (Equation 7) and (Equation 10), that is, the difference between the preprocessing (x^t = v^t H, M = H^t H) and the preprocessing (H_i = H^t W_i, x_i^t = v^t H_i, R = H_i^t H_j).

[0056] In general, in a CELP coding apparatus and decoding apparatus, the lower the bit rate, the fewer bits can be allocated to the noise codebook section. When an algebraic codebook or a pulse spreading codebook is used for the noise codebook section, this tendency leads to a smaller number of non-zero elements making up the noise source vector. Consequently, the lower the bit rate of the CELP coding apparatus and decoding apparatus, the smaller the difference in the amount of computation between using an algebraic codebook and using a pulse spreading codebook. However, when the bit rate is relatively high, or when the amount of computation must be kept as small as possible even at a low bit rate, the increase in preprocessing computation caused by using a pulse spreading codebook may become non-negligible.

[0057] An object of the present invention is, in a CELP speech coding apparatus, speech decoding apparatus, and speech coding/decoding system that use a pulse spreading codebook for the noise codebook section, to keep small on the coding side the amount of code-search computation in the preprocessing stage, which increases compared with the case where an algebraic codebook is used for the noise codebook section, while obtaining high-quality synthesized speech on the decoding side.

[0058]

[Means for Solving the Problems] The present invention solves the above problem, which can arise when a pulse spreading codebook is used for the noise codebook section of a CELP coding apparatus and decoding apparatus, and is an invention of a speech coding apparatus and decoding apparatus characterized by using different spreading patterns on the coding apparatus side and the decoding apparatus side. In the present invention, the spreading patterns described in References 2, 3, and 4, as shown in the conventional technique, are registered in the spreading pattern storage unit on the speech decoding apparatus side, and using them generates synthesized speech of higher quality than when an algebraic codebook is used. On the coding apparatus side, by contrast, simplified versions of the spreading patterns registered in the spreading pattern storage unit on the decoding apparatus side (for example, spreading patterns thinned out at regular intervals, or spreading patterns truncated at a certain length) are registered and used for the noise codebook search.

[0059] As a result, when a pulse spreading codebook is used for the noise codebook section, the coding side can keep small the amount of code-search computation in the preprocessing stage, which increases compared with the case where an algebraic codebook is used for the noise codebook section, and the decoding side can obtain high-quality synthesized speech.
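Why a simplified spreading pattern lowers the preprocessing cost can be illustrated with a small count of multiply-accumulate operations. This is a sketch only: it assumes that the preprocessing includes a time-domain convolution of the filter impulse response h with each spreading pattern w, and that zero-valued taps of w can simply be skipped. The function name and the example values are hypothetical, and an actual implementation may count operations differently.

```python
def filtered_pattern(h, w, L):
    """Convolve impulse response h with spreading pattern w, truncated to L
    samples, skipping zero taps of w. Returns (result, multiply-accumulates)."""
    out = [0.0] * L
    macs = 0
    for n, wn in enumerate(w):
        if wn == 0.0:
            continue  # a zeroed sample costs nothing
        for m, hm in enumerate(h):
            if n + m >= L:
                break
            out[n + m] += wn * hm
            macs += 1
    return out, macs

# Hypothetical decoder-side pattern and an encoder-side version with every
# other sample replaced by zero.
decoder_w = [1.0, 0.6, -0.4, 0.3, -0.2, 0.1, 0.05, -0.02]
encoder_w = [1.0, 0.0, -0.4, 0.0, -0.2, 0.0, 0.05, 0.0]
```

With these example values the encoder-side pattern has half as many non-zero taps, so the convolution needs half as many multiply-accumulates, while the first and largest sample is preserved.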

[0060]

[Embodiments of the Invention] The invention according to claim 1 is a speech coding apparatus comprising a pulse spreading codebook that generates a vector by convolving a vector containing at least one non-zero element (all elements other than the non-zero elements have the value zero) with a fixed waveform called a spreading pattern, wherein the pulse spreading codebook has a configuration different from that of the pulse spreading codebook on the speech decoding apparatus side. This has the effect that the pulse spreading codebook of the speech coding apparatus, while differing from that of the speech decoding apparatus, can be configured so as to reduce the amount of computation in the code search.

[0061] The invention according to claim 2 is the speech coding apparatus according to claim 1, wherein the spreading pattern storage unit, which is a component of the pulse spreading codebook, stores spreading patterns different from those stored in the spreading pattern storage unit on the speech decoding apparatus side. This has the effect that the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus, while differing from those stored in the speech decoding apparatus, can be configured so that the amount of computation in the code search is reduced.

[0062] The invention according to claim 3 is the speech coding apparatus according to claim 2, wherein the spreading pattern storage unit stores spreading patterns obtained by simplifying the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side. Making the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus simplified versions of those stored in the speech decoding apparatus has the effect of reducing the amount of computation in the code search.

[0063] As in the invention according to claim 4, the speech coding apparatus according to claim 2 or claim 3 is preferably one in which the spreading pattern storage unit stores spreading patterns obtained by replacing the elements of the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side with zero at appropriate intervals.

[0064] As in the invention according to claim 5, the speech coding apparatus according to any one of claims 2 to 4 in which the spreading pattern storage unit stores spreading patterns obtained by replacing the elements of the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side with zero every N samples (N is a natural number) provides the same effect.

[0065] As in the invention according to claim 6, the speech coding apparatus according to claim 5 is more preferably one in which the spreading pattern storage unit stores spreading patterns obtained by replacing the elements of the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side with zero every other sample.

[0066] The invention according to claim 7 is the speech coding apparatus according to claim 2 or claim 3, wherein the spreading pattern storage unit stores spreading patterns obtained by truncating, at an appropriate length, the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side. Truncating the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus at an appropriate length has the effect that, although they differ from the spreading patterns stored in the speech decoding apparatus, the amount of computation in the code search can be reduced.

[0067] As in the invention according to claim 8, the speech coding apparatus according to any one of claims 2, 3, and 7 in which the spreading pattern storage unit stores spreading patterns obtained by truncating the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side at a length of N samples (N is a natural number) provides the same effect.

[0068] As in the invention according to claim 9, the speech coding apparatus according to any one of claims 2, 3, and 7 is preferably one in which the spreading pattern storage unit stores spreading patterns obtained by truncating the spreading patterns stored in the spreading pattern storage unit on the speech decoding apparatus side to half their length.

[0069] The invention according to claim 10 is a speech decoding apparatus that decodes a speech signal having the speech code generated by the speech coding apparatus according to any one of claims 1 to 9. This has the effect that a speech signal encoded with a reduced amount of computation in the speech coding apparatus can be decoded into high-quality synthesized speech on the decoding side.

[0070] The invention according to claim 11 is a signal processing processor in which a software program implementing the speech coding apparatus according to any one of claims 1 to 9 is written. Using a software program in which the amount of speech coding computation is reduced has the effect that a signal processing processor achieving higher speed or a lower bit rate can be formed.

[0071] The invention according to claim 12 is a signal processing processor in which a software program implementing the speech decoding apparatus according to claim 10 is written. Using a software program that can decode high-quality speech on the decoding side, even from a speech signal encoded with a reduced amount of computation in the speech coding apparatus, has the effect that a signal processing processor achieving high-quality synthesized speech can be formed.

[0072] The invention according to claim 13 is a speech coding/decoding system characterized in that the configuration of the pulse spreading codebook on the speech coding apparatus side differs from the configuration of the pulse spreading codebook on the speech decoding apparatus side. This has the effect that a system can be formed in which the pulse spreading codebook of the speech coding apparatus, while differing from that of the speech decoding apparatus, is configured so as to reduce the amount of computation in the code search.

[0073] The invention according to claim 14 is the speech coding/decoding system according to claim 13, wherein the difference between the configuration of the pulse spreading codebook on the speech coding apparatus side and that on the speech decoding apparatus side is the shape of the spreading patterns provided in the respective pulse spreading codebooks. This has the effect that a system can be formed in which the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus, while differing from those stored in the speech decoding apparatus, are configured so as to reduce the amount of computation in the code search.

[0074] The invention according to claim 15 is the speech coding/decoding system according to claim 14, wherein the shape of the spreading patterns on the speech coding apparatus side is a simplified version of the shape of the spreading patterns on the speech decoding apparatus side. Making the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus simplified versions of those stored in the speech decoding apparatus has the effect that a system configured to reduce the amount of computation in the code search can be formed.

[0075] As in the invention according to claim 16, the speech coding/decoding system according to any one of claims 13 to 15 is preferably one in which the shape of the spreading patterns on the speech coding apparatus side is obtained by replacing the elements of the spreading patterns on the speech decoding apparatus side with zero at appropriate intervals.

[0076] As in the invention according to claim 17, the speech coding/decoding system according to any one of claims 13 to 16 in which the shape of the spreading patterns on the speech coding apparatus side is obtained by replacing the elements of the spreading patterns on the speech decoding apparatus side with zero every N samples (N is a natural number) provides the same effect.

[0077] As in the invention according to claim 18, the speech coding/decoding system according to claim 17 is more preferably one in which the shape of the spreading patterns on the speech coding apparatus side is obtained by replacing the elements of the spreading patterns on the speech decoding apparatus side with zero every other sample.

[0078] The invention according to claim 19 is the speech coding/decoding system according to any one of claims 13 to 15, wherein the shape of the spreading patterns on the speech coding apparatus side is obtained by truncating the spreading patterns on the speech decoding apparatus side at an appropriate length. Truncating the spreading patterns stored in the pulse spreading codebook of the speech coding apparatus at an appropriate length has the effect that a system can be formed in which, although the spreading patterns differ from those stored in the speech decoding apparatus, the amount of computation in the code search is reduced.

[0079] As in the invention according to claim 20, the speech coding/decoding system according to any one of claims 13, 14, 15, and 19 in which the shape of the spreading patterns on the speech coding apparatus side is obtained by truncating the spreading patterns on the speech decoding apparatus side at a length of N samples (N is a natural number) provides the same effect.

[0080] As in the invention according to claim 21, the speech coding/decoding system according to any one of claims 13, 14, 15, and 19 is preferably one in which the shape of the spreading patterns on the speech coding apparatus side is obtained by truncating the spreading patterns on the speech decoding apparatus side to half their length.

[0081] The invention according to claim 22 is a communication base station comprising the signal processing processor according to claim 11 or 12. This has the effect of realizing a base station in which, on the coding side, the signal processing processor using a software program that reduces the amount of computation enables higher speed or a lower bit rate, and, on the decoding side, the signal processing processor using a software program that can decode high-quality speech enables high-quality synthesized speech.

[0082] The invention according to claim 23 is a communication terminal comprising the signal processing processor according to claim 11 or 12. This has the effect of realizing a terminal in which, on the coding side, the signal processing processor using a software program that reduces the amount of computation enables higher speed or a lower bit rate, and, on the decoding side, the signal processing processor using a software program that can decode high-quality speech enables high-quality synthesized speech.

[0083] The invention according to claim 24 is a wireless communication system in which the communication base station according to claim 22 and the communication terminal according to claim 23 are connected by a wireless network. This has the effect of realizing a wireless communication system in which, on the coding side, the signal processing processor using a software program that reduces the amount of computation enables higher speed or a lower bit rate, and, on the decoding side, the signal processing processor using a software program that can decode high-quality speech enables high-quality synthesized speech.

[0084] Using different spreading patterns on the coding apparatus side and the decoding apparatus side means obtaining the spreading vector for the encoder by modifying a spreading vector prepared in advance (for the decoding apparatus) while retaining its characteristics.

[0085] Possible methods of preparing the spreading vector for the decoding apparatus in advance include the methods disclosed in a patent application previously filed by the present inventors (Japanese Unexamined Patent Publication No. H10-63300), namely: a method of preparing it by learning the statistical tendency of the target vectors for excitation search; a method of preparing it by actually encoding excitation targets and repeating an operation that gradually deforms the vector in the direction that reduces the total coding distortion arising at that time; a method of designing it on the basis of phonetic knowledge in order to improve the quality of the synthesized speech; and a method of designing it for the purpose of randomizing the high-frequency phase components of a pulse excitation. All of these contents are incorporated herein.

[0086] Spreading vectors obtained in these ways all share the characteristic that the samples near the beginning of the spreading vector (the front samples) have relatively larger amplitudes than the later samples. In particular, the first sample often has the largest amplitude of all the samples in the spreading vector (this is so in most cases).

【００８７】The following are specific methods for obtaining a spreading vector for the encoder by modifying a spreading vector for the decoding device while preserving its characteristics.
1) Replace sample values of the decoder's spreading vector with zero at appropriate intervals.
2) Truncate a decoder spreading vector of a given length at an appropriate length.
3) Set an amplitude threshold in advance and replace with zero every sample of the decoder's spreading vector whose amplitude is smaller than that threshold.
4) For a decoder spreading vector of a given length, keep the sample values at appropriate intervals, always including the first sample, and replace the values of all other samples with zero.
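The four derivations listed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `step` and `offset` parameters, and the sample values are assumptions introduced here and do not come from the patent, and "amplitude" is read as absolute value.

```python
def zero_at_intervals(pattern, step=2, offset=1):
    """Method 1 (assumed reading): zero the samples at offset, offset+step, ..."""
    return [0.0 if i >= offset and (i - offset) % step == 0 else x
            for i, x in enumerate(pattern)]


def truncate(pattern, length):
    """Method 2: keep only the first `length` samples."""
    if not 1 <= length <= len(pattern):
        raise ValueError("length must be between 1 and len(pattern)")
    return list(pattern[:length])


def zero_below_threshold(pattern, threshold):
    """Method 3: zero every sample whose absolute amplitude is below threshold."""
    return [x if abs(x) >= threshold else 0.0 for x in pattern]


def keep_at_intervals(pattern, step=2):
    """Method 4: keep samples 0, step, 2*step, ... and zero all the others."""
    return [x if i % step == 0 else 0.0 for i, x in enumerate(pattern)]


# Hypothetical decoder-side spreading vector: large at the front, decaying.
decoder_pattern = [1.0, 0.6, -0.4, 0.3, -0.2, 0.1, 0.05, -0.02]

encoder_candidates = {
    "method 1": zero_at_intervals(decoder_pattern),
    "method 2": truncate(decoder_pattern, 4),
    "method 3": zero_below_threshold(decoder_pattern, 0.25),
    "method 4": keep_at_intervals(decoder_pattern, step=3),
}
```

Method 4 differs from method 1 only in that index 0 is guaranteed to survive, which matters because the first sample usually carries the largest amplitude.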

【００８８】For example, even when only the first few samples of the spreading vector are used, as in method 2) above, a new spreading vector for the encoding device can be obtained while the general shape (rough characteristics) of the original spreading vector is preserved.

【００８９】Likewise, even when sample values are replaced with zero at appropriate intervals, as in method 1) above, a new spreading vector for the encoding device can be obtained while the general shape (rough characteristics) of the original spreading vector is preserved. Method 4) in particular adds the restriction that the amplitude of the first sample, which is usually the largest, is always kept unchanged, so the general shape of the original spreading vector is preserved more reliably.

【００９０】Similarly, as in method 3), when samples whose amplitude is at or above a specified value are kept unchanged and the remaining samples are set to zero, a spreading vector for the encoding device can be obtained while the general shape (rough characteristics) of the spreading vector is preserved.

【００９１】Embodiments of the present invention are described below with reference to FIGS. 1 to 4.
(Embodiment 1) FIGS. 1 and 2 show, respectively, the CELP speech encoding device and speech decoding device of this embodiment, which are characterized by using a pulse spreading codebook in the noise codebook unit.

【００９２】FIGS. 1 and 2 differ from FIGS. 10 and 11, which were used to describe the prior art, in one respect. In FIGS. 10 and 11 the encoder and decoder use pulse spreading codebooks of identical configuration (the number of channels of the algebraic codebook unit, and the number of kinds and the shapes of the spreading patterns registered in the spreading pattern storage unit, are all the same), whereas in FIGS. 1 and 2 they use pulse spreading codebooks of different configurations.

【００９３】The configurations of pulse spreading codebook A 101 in FIG. 1 and pulse spreading codebook B 111 in FIG. 2 are shown in FIG. 3(a) and FIG. 3(b), respectively. Comparing the two, the structural difference is the shape of the spreading patterns registered in the spreading pattern storage units. On the speech decoding device side in FIG. 3(b), spreading pattern storage unit B 112 holds, one kind per channel, a spreading pattern (noise excitation vector B 113) of the same type as the spreading patterns described for the prior art, namely one of the following:
[1] a spreading pattern whose shape, found by statistical learning of the shapes of many noise excitation targets, occurs with statistically high frequency in those targets;
[2] a random-like spreading pattern for efficiently representing unvoiced consonant segments and noise segments;
[3] a pulse-like spreading pattern for efficiently representing stationary voiced segments;
[4] a spreading pattern shaped to disperse the energy of the excitation vector output from the algebraic codebook (energy that is concentrated at the positions of the non-zero elements) to the surrounding samples;
[5] a spreading pattern selected from several suitably prepared candidates by repeatedly encoding and decoding speech signals and evaluating the synthesized speech by listening, so that high-quality synthesized speech is output;
[6] a spreading pattern created on the basis of phonetic knowledge.
On the speech encoding device side in FIG. 3(a), spreading pattern storage unit A 102 holds spreading patterns (noise excitation vector A 103) obtained by replacing every other sample of the spreading patterns registered in the decoder-side spreading pattern storage unit of FIG. 3(b) with zero.
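The relationship between the two codebooks can be illustrated with a small Python sketch. It rests on assumptions that are not stated in this passage: superimposing a spreading pattern on a pulse vector is modelled as convolution cut off at the subframe boundary, each channel contributes a single pulse given as a (position, polarity) pair, and the function names and numeric values are hypothetical.

```python
def spread(pulse_vector, pattern):
    """Superimpose `pattern` on every non-zero pulse (convolution cut to the subframe)."""
    n = len(pulse_vector)
    out = [0.0] * n
    for pos, amp in enumerate(pulse_vector):
        if amp == 0.0:
            continue
        for k, w in enumerate(pattern):
            if pos + k < n:
                out[pos + k] += amp * w
    return out


def noise_excitation_vector(pulses, patterns, subframe_len):
    """Sum the spread pulse vectors of all channels.

    pulses   -- one (position, polarity) pair per channel
    patterns -- one spreading pattern per channel
    """
    total = [0.0] * subframe_len
    for (pos, polarity), pattern in zip(pulses, patterns):
        pulse_vector = [0.0] * subframe_len
        pulse_vector[pos] = polarity
        for i, v in enumerate(spread(pulse_vector, pattern)):
            total[i] += v
    return total


# Three channels; all values are made up for illustration.
pulses = [(0, 1.0), (3, -1.0), (6, 1.0)]
decoder_patterns = [[1.0, 0.5, 0.25]] * 3
# Encoder side: every other sample of each decoder pattern is set to zero.
encoder_patterns = [[x if i % 2 == 0 else 0.0 for i, x in enumerate(p)]
                    for p in decoder_patterns]

decoder_vector = noise_excitation_vector(pulses, decoder_patterns, 8)
encoder_vector = noise_excitation_vector(pulses, encoder_patterns, 8)
```

The same pulse positions and polarities are used on both sides; only the pattern applied to them differs, so the transmitted code needs no extra information.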

【００９４】The CELP speech encoding device and speech decoding device configured as described above encode and decode speech signals in the same way as described for the prior art, without needing to be aware that different spreading patterns are registered on the encoding device side and the decoding device side.

【００９５】As a result, the encoding device can reduce the preprocessing computation of the noise codebook search when a pulse spreading codebook is used in the noise codebook unit (the computation of $H_i = H^{t} W_i$ and $x_i^{t} = v^{t} H_i$ is roughly halved). The decoding device, by superimposing the conventional spreading patterns on the pulse vectors, can spread the energy concentrated at the non-zero element positions to the surrounding samples and so improve the quality of the synthesized speech.
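A rough way to see where the saving comes from is to count the multiplications that involve non-zero pattern samples. The sketch below assumes that W_i is the lower-triangular convolution (Toeplitz) matrix of the channel's spreading pattern and that H is a dense n-by-n matrix; both are assumptions of this illustration, as are the function names and the numbers, and the count is only a proxy for the real preprocessing cost.

```python
def convolution_matrix(pattern, n):
    """n-by-n lower-triangular Toeplitz matrix W with W[r][c] = pattern[r - c]."""
    taps = len(pattern)
    return [[pattern[r - c] if 0 <= r - c < taps else 0.0 for c in range(n)]
            for r in range(n)]


def nonzero_multiplications(pattern, n):
    """Multiplications needed for (dense n-by-n matrix) x W when zeros in W are skipped."""
    w = convolution_matrix(pattern, n)
    nonzeros = sum(1 for row in w for value in row if value != 0.0)
    # Each non-zero entry of W is multiplied once for every row of the left matrix.
    return nonzeros * n


# Hypothetical 8-sample decoder pattern and its every-other-sample encoder version.
decoder_pattern = [1.0, 0.8, -0.6, 0.5, -0.4, 0.3, -0.2, 0.1]
encoder_pattern = [x if i % 2 == 0 else 0.0 for i, x in enumerate(decoder_pattern)]

full_cost = nonzero_multiplications(decoder_pattern, 16)
reduced_cost = nonzero_multiplications(encoder_pattern, 16)
ratio = reduced_cost / full_cost
```

With these made-up sizes the reduced count is a little over half of the full count, which is consistent with the "roughly halved" statement; the exact ratio depends on the pattern length relative to the subframe length.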

【００９６】In this embodiment, as shown in FIGS. 3(a) and 3(b), the speech encoding device uses spreading patterns obtained by replacing every other sample of the spreading patterns used by the speech decoding device with zero. The embodiment applies unchanged, and gives the same effect, when the speech encoding device instead uses spreading patterns obtained by replacing the elements of the decoder-side spreading patterns with zero at intervals of N samples (N ≥ 1).

【００９７】As in the description of the prior art, this embodiment was described for the case where the spreading pattern storage unit registers one kind of spreading pattern per channel. The present invention can equally be applied to a CELP speech encoding device and decoding device whose noise codebook unit uses a pulse spreading codebook in which two or more kinds of spreading pattern are registered per channel and one of them is selected for use, and the same operation and effects are obtained in that case.

【００９８】As in the description of the prior art, this embodiment was described for a pulse spreading codebook whose algebraic codebook unit outputs a vector containing three non-zero elements. The embodiment can also be applied when the number of non-zero elements in the vector output by the algebraic codebook unit is M (M ≥ 1), and the same operation and effects are obtained in that case.

【００９９】As in the description of the prior art, this embodiment was described for the case where an algebraic codebook is used as the codebook that generates a pulse vector consisting of a small number of non-zero elements. The embodiment can also be applied when another codebook, such as a multi-pulse codebook or a regular pulse codebook, is used to generate the pulse vector, and the same operation and effects are obtained in that case.

【０１００】(Embodiment 2) FIGS. 1 and 2 show, respectively, the CELP speech encoding device and speech decoding device of this embodiment, which are characterized by using a pulse spreading codebook in the noise codebook unit.

【０１０１】FIGS. 1 and 2 differ from FIGS. 10 and 11 in one respect. In FIGS. 10 and 11 the encoder and decoder use pulse spreading codebooks of identical configuration (the number of channels of the algebraic codebook unit, and the number of kinds and the shapes of the spreading patterns registered in the spreading pattern storage unit, are all the same), whereas in FIGS. 1 and 2 they use pulse spreading codebooks of different configurations.

【０１０２】The configurations of pulse spreading codebook A in FIG. 1 and pulse spreading codebook B in FIG. 2 are shown in FIG. 4(a) and FIG. 4(b), respectively. Comparing the two, the structural difference is the length of the spreading patterns registered in the spreading pattern storage units. On the speech decoding device side in FIG. 4(b), spreading pattern storage unit B 112 holds, one kind per channel, a spreading pattern (noise excitation vector B 113) of the same type as the spreading patterns described for the prior art, namely one of the following:
[1] a spreading pattern whose shape, found by statistical learning of the shapes of many noise excitation targets, occurs with statistically high frequency in those targets;
[2] a random-like spreading pattern for efficiently representing unvoiced consonant segments and noise segments;
[3] a pulse-like spreading pattern for efficiently representing stationary voiced segments;
[4] a spreading pattern shaped to disperse the energy of the excitation vector output from the algebraic codebook (energy that is concentrated at the positions of the non-zero elements) to the surrounding samples;
[5] a spreading pattern selected from several suitably prepared candidates by repeatedly encoding and decoding speech signals and evaluating the synthesized speech by listening, so that high-quality synthesized speech is output;
[6] a spreading pattern created on the basis of phonetic knowledge.
On the speech encoding device side in FIG. 4(a), spreading pattern storage unit A 104 holds spreading patterns (noise excitation vector A 105) obtained by truncating the spreading patterns registered in the decoder-side spreading pattern storage unit of FIG. 4(b) to half their length.

【０１０３】The CELP speech encoding device and decoding device configured as described above encode and decode speech signals in the same way as described for the prior art, without needing to be aware that different spreading patterns are registered on the encoding device side and the decoding device side.

【０１０４】As a result, the encoding device can reduce the preprocessing computation of the noise codebook search when a pulse spreading codebook is used in the noise codebook unit (the computation of $H_i = H^{t} W_i$ and $x_i^{t} = v^{t} H_i$ is roughly halved), while the decoding device can improve the quality of the synthesized speech by using the conventional spreading patterns.

【０１０５】In this embodiment, as shown in FIGS. 4(a) and 4(b), the speech encoding device uses spreading patterns obtained by truncating the spreading patterns used by the speech decoding device to half their length. If the speech encoding device truncates the decoder-side spreading patterns to an even shorter length N (N ≥ 1), the preprocessing computation of the noise codebook search can be reduced further. Note, however, that truncating the encoder-side spreading patterns to length 1 corresponds to a speech encoding device that uses no spreading pattern (spreading patterns are still applied in the speech decoding device).
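The length-1 remark can be checked with a short Python sketch. As before, superimposing a pattern on a pulse vector is modelled as convolution cut off at the subframe boundary, which is an assumption of this illustration; the function names and values are hypothetical.

```python
def spread(pulse_vector, pattern):
    """Superimpose `pattern` on every non-zero pulse (convolution cut to the subframe)."""
    n = len(pulse_vector)
    out = [0.0] * n
    for pos, amp in enumerate(pulse_vector):
        if amp == 0.0:
            continue
        for k, w in enumerate(pattern):
            if pos + k < n:
                out[pos + k] += amp * w
    return out


def truncate(pattern, length):
    """Keep only the first `length` samples of a spreading pattern."""
    if not 1 <= length <= len(pattern):
        raise ValueError("length must be between 1 and len(pattern)")
    return list(pattern[:length])


decoder_pattern = [1.0, 0.5, 0.25, 0.125]                    # made-up values
pulse_vector = [0.0, 0.0, 1.0, 0.0, 0.0, -1.0, 0.0, 0.0]     # two pulses

decoder_side = spread(pulse_vector, decoder_pattern)                # full pattern
half_length = spread(pulse_vector, truncate(decoder_pattern, 2))    # as in FIG. 4(a)
length_one = spread(pulse_vector, truncate(decoder_pattern, 1))     # degenerate case
```

With a length-1 pattern each pulse is merely scaled by the first sample, so the encoder searches plain pulse vectors. In this example the first sample is 1.0, so `length_one` equals `pulse_vector` exactly.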

【０１０６】As in the description of the prior art, this embodiment was described for the case where the spreading pattern storage unit registers one kind of spreading pattern per channel. The embodiment can equally be applied to a speech encoding device and speech decoding device whose noise codebook unit uses a pulse spreading codebook in which two or more kinds of spreading pattern are registered per channel and one of them is selected for use, and the same operation and effects are obtained in that case.

【０１０７】As in the description of the prior art, this embodiment was described for a pulse spreading codebook whose algebraic codebook unit outputs a vector containing three non-zero elements. The embodiment can also be applied when the number of non-zero elements in the vector output by the algebraic codebook unit is M (M ≥ 1), and the same operation and effects are obtained in that case.

【０１０８】This embodiment was described for the case where the speech encoding device uses spreading patterns obtained by truncating the spreading patterns used by the speech decoding device to half their length. It can also be combined with Embodiment 1: the speech encoding device truncates the decoder-side spreading patterns at length N (N ≥ 1) and then replaces samples of the truncated patterns with zero at intervals of M samples (M ≥ 1). In that case the code search computation can be reduced still further.
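The combination of truncation and zero replacement might look as follows in Python. The reading of "at intervals of M samples" used here, in which zeroed samples are separated by M retained samples so that M = 1 zeroes every other sample, is an assumption of this sketch, as are the function name and the sample values.

```python
def simplify_for_encoder(decoder_pattern, length, gap):
    """Truncate to `length` samples, then zero one sample after every `gap` kept ones."""
    if not 1 <= length <= len(decoder_pattern):
        raise ValueError("length must be between 1 and len(decoder_pattern)")
    if gap < 1:
        raise ValueError("gap must be at least 1")
    truncated = list(decoder_pattern[:length])
    return [0.0 if i % (gap + 1) == gap else x for i, x in enumerate(truncated)]


def nonzero_count(pattern):
    """Number of samples that still take part in the search computation."""
    return sum(1 for x in pattern if x != 0.0)


# Hypothetical decoder-side spreading pattern.
decoder_pattern = [1.0, 0.6, -0.4, 0.3, -0.2, 0.1, 0.05, -0.02]

half_then_alternate = simplify_for_encoder(decoder_pattern, length=4, gap=1)
six_then_every_third = simplify_for_encoder(decoder_pattern, length=6, gap=2)
```

In this example `half_then_alternate` retains two of the original eight non-zero samples, so fewer pattern samples enter the search than with truncation or zero replacement alone.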

【０１０９】(Embodiment 3) FIG. 5 is a conceptual diagram showing the configuration of the wireless communication system of this embodiment. In the figure, 201 is a communication terminal; 202 is a signal processing processor in which a software program implementing the speech encoding device is written; 203 is a signal processing processor in which a software program implementing the speech decoding device is written; 204 is a speech encoding/decoding system comprising processors 202 and 203; 205 is a communication terminal configured in the same way as 201; 206 is a communication base station; 207 is a signal processing processor in which a software program implementing the speech encoding device is written; 208 is a signal processing processor in which a software program implementing the speech decoding device is written; and 209 is a speech encoding/decoding system comprising processors 207 and 208.

【０１１０】Communication terminals 201 and 205 each have the speech encoding/decoding system 204, which comprises signal processing processor 202, in which a software program implementing the speech encoding device described in Embodiment 1 or Embodiment 2 is written, and signal processing processor 203, in which a software program implementing the speech decoding device is written. The terminals transmit and receive speech signals via communication base station 206.

【０１１１】Communication base station 206 has a speech encoding/decoding system 209 configured in the same way as the speech encoding/decoding system 204 of communication terminal 201. It includes signal processing processor 207, in which a software program implementing the speech encoding device is written, and signal processing processor 208, in which a software program implementing the speech decoding device is written.

【０１１２】Connecting communication terminals and a communication base station such as those shown in FIG. 5 by a wireless network makes it possible to build a wireless communication system that transmits and receives at high speed and at a low bit rate while still producing high-quality synthesized speech on decoding.

【０１１３】Although the wireless communication system in FIG. 5 is built from two communication terminals and one communication base station, the numbers of terminals and base stations are not limited to these.

【０１１４】[0114]

[Effects of the invention]
As described above, according to the present invention, in a CELP speech encoding device and decoding device, and a speech encoding/decoding system, that use a pulse spreading codebook in the noise codebook unit, fixed waveforms that occur frequently in the noise excitation targets obtained by learning are registered as spreading patterns, and superimposing (reflecting) these spreading patterns on the pulse vectors makes available noise excitation vectors that are closer to the noise excitation target. The decoding side can therefore improve the quality of the synthesized speech. In addition, the encoding side gains the advantageous effect that the computation of the noise codebook search, which can be a problem when a pulse spreading codebook is used in the noise codebook unit, is kept lower than in the prior art (References 5, 6 and 7).

【０１１５】The same operation and effects are also obtained when another codebook, such as a multi-pulse codebook or a regular pulse codebook, is used as the codebook that generates a pulse vector consisting of a small number of non-zero elements.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the CELP speech encoding device according to Embodiment 1 of the present invention.

FIG. 2 is a block diagram showing the configuration of the CELP speech decoding device according to Embodiment 1 of the present invention.

FIG. 3(a) is a conceptual diagram showing an example of pulse spreading codebook A used in the speech encoding device according to Embodiment 1 of the present invention. FIG. 3(b) is a conceptual diagram showing an example of pulse spreading codebook B used in the speech decoding device according to Embodiment 1 of the present invention.

FIG. 4(a) is a conceptual diagram showing an example of pulse spreading codebook A used in the speech encoding device according to Embodiment 2 of the present invention. FIG. 4(b) is a conceptual diagram showing an example of pulse spreading codebook B used in the speech decoding device according to Embodiment 2 of the present invention.

FIG. 5 is a conceptual diagram showing the configuration of the wireless communication system according to Embodiment 3 of the present invention.

FIG. 6 is a block diagram showing the configuration of a conventional CELP speech encoding device.

FIG. 7 is a block diagram showing the configuration of a conventional CELP speech decoding device.

FIG. 8 is a block diagram showing the configuration of a conventional pulse spreading codebook.

FIG. 9 is a block diagram showing an example of the detailed configuration of a conventional pulse spreading codebook.

FIG. 10 is a block diagram showing the configuration of a conventional CELP speech encoding device that uses a pulse spreading codebook in the noise codebook unit.

FIG. 11 is a block diagram showing the configuration of a conventional CELP speech decoding device that uses a pulse spreading codebook in the noise codebook unit.

FIG. 12 is a block diagram showing an example of the detailed configuration of a conventional pulse spreading codebook.

[Explanation of symbols]

11 input speech
12 linear prediction analysis unit
13 linear prediction coefficient encoding unit
14, 32 linear prediction code decoding unit
15, 39 synthesis filter
16 distortion calculation unit
17, 33 adaptive codebook
18, 34 noise codebook
19, 36 adaptive excitation gain
20, 37 noise excitation gain
21, 35 weight codebook
22, 38 vector addition unit
23 code identification unit
24 code output unit
31 code input unit
40 synthesized speech
41 algebraic codebook
42 pulse vector
43 spreading pattern storage unit
44 spreading pattern
45 pulse spreading unit
46 noise excitation vector
50 pulse spreading codebook
101 pulse spreading codebook A
111 pulse spreading codebook B
102, 104 spreading pattern storage unit A
103, 105 noise excitation vector A
112 spreading pattern storage unit B
113 noise excitation vector B
201, 205 communication terminal
202 signal processing processor (encoding)
203 signal processing processor (decoding)
204 speech encoding/decoding system (terminal side)
206 communication base station
207 signal processing processor (encoding)
208 signal processing processor (decoding)
209 speech encoding/decoding system (base station side)

Claims

[Claims]

1. A speech encoding device comprising a pulse spreading codebook that generates a vector by superimposing a vector containing at least one non-zero element (all elements other than the non-zero elements being zero) on a fixed waveform called a spreading pattern, wherein the pulse spreading codebook has a configuration different from the configuration of the pulse spreading codebook on the speech decoding device side.

2. The speech encoding device according to claim 1, wherein a spreading pattern storage unit, which is a constituent part of the pulse spreading codebook, stores a spreading pattern different from the spreading pattern stored in the spreading pattern storage unit of the speech decoding device.

3. The speech encoding device according to claim 2, wherein the spreading pattern storage unit stores a spreading pattern obtained by simplifying the spreading pattern stored in the spreading pattern storage unit of the speech decoding device.

4. The speech encoding device according to claim 2, wherein the spreading pattern storage unit stores a spreading pattern obtained by replacing constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device with zero at appropriate intervals.

5. The speech encoding device according to any one of claims 2 to 4, wherein the spreading pattern storage unit stores a spreading pattern obtained by replacing the constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device with zero at intervals of N samples (N is a natural number).

6. The speech encoding device according to claim 5, wherein the spreading pattern storage unit stores a spreading pattern obtained by replacing every other sample of the constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device with zero.

7. The speech encoding device according to claim 2, wherein the spreading pattern storage unit stores a spreading pattern obtained by truncating the constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device at an appropriate length.

8. The speech encoding device according to claim 2, wherein the spreading pattern storage unit stores a spreading pattern obtained by truncating the constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device at a length of N samples (N is a natural number).

9. The speech encoding device according to any one of claims 2, 3, and 7, wherein the spreading pattern storage unit stores a spreading pattern obtained by truncating the constituent elements of the spreading pattern stored in the spreading pattern storage unit of the speech decoding device to half their length.

10. A speech decoding device for decoding a speech signal having a speech code generated by the speech encoding device according to any one of claims 1 to 9.

11. A signal processing processor in which a software program for realizing the speech encoding device according to claim 1 is described.

12. A signal processing processor in which a software program for realizing the speech decoding device according to claim 10 is described.

13. A speech encoding/decoding system characterized in that the configuration of the pulse spreading codebook of the speech encoding device differs from the configuration of the pulse spreading codebook of the speech decoding device.

14. The speech encoding/decoding system according to claim 13, wherein the difference between the configuration of the pulse spreading codebook of the speech encoding device and the configuration of the pulse spreading codebook of the speech decoding device is the shape of the spreading pattern provided in each pulse spreading codebook.

15. The speech encoding/decoding system according to claim 14, wherein the shape of the spreading pattern on the speech encoding device side is a simplified version of the shape of the spreading pattern on the speech decoding device side.

16. The speech encoding/decoding system according to any one of claims 13 to 15, wherein the shape of the spreading pattern on the speech encoding device side is obtained by replacing the constituent elements of the spreading pattern on the speech decoding device side with zero at appropriate intervals.

17. The speech encoding/decoding system according to claim 13, wherein the shape of the spreading pattern on the speech encoding device side is obtained by replacing the constituent elements of the spreading pattern on the speech decoding device side with zero at intervals of N samples (N is a natural number).

18. The speech encoding/decoding system according to claim 17, wherein the shape of the spreading pattern on the speech encoding device side is obtained by replacing every other sample of the constituent elements of the spreading pattern on the speech decoding device side with zero.

19. The speech encoding/decoding system according to any one of claims 1 to 15, wherein the shape of the spreading pattern on the speech encoding device side is obtained by truncating the constituent elements of the spreading pattern on the speech decoding device side at an appropriate length.

20. The speech encoding/decoding system according to any one of claims 13, 14, 15, and 19, wherein the shape of the spreading pattern on the speech encoding device side is obtained by truncating the constituent elements of the spreading pattern on the speech decoding device side at a length of N samples (N is a natural number).

21. The speech encoding/decoding system according to any one of claims 13, 14, 15, and 19, wherein the shape of the spreading pattern on the speech encoding device side is obtained by truncating the constituent elements of the spreading pattern on the speech decoding device side to half their length.

22. A communication base station comprising the signal processing processor according to claim 11.

23. A communication terminal comprising the signal processing processor according to claim 11.

24. A wireless communication system in which the communication base station according to claim 22 and the communication terminal according to claim 23 are connected by a wireless network.