JPH09127997A - Voice coding method and device

Info

Publication number: JPH09127997A
Application number: JP7279413A
Authority: JP
Inventors: Kazuyuki Iijima; 和幸飯島; Masayuki Nishiguchi; 正之西口; Atsushi Matsumoto; 淳松本; Shiro Omori; 士郎大森
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1995-10-26
Filing date: 1995-10-26
Publication date: 1997-05-16

Abstract

PROBLEM TO BE SOLVED: To switch the bit rate easily by providing multiple coding stages for vector quantization of the time-axis waveform, using the quantization error of the (N-1)th stage as the reference input for coding at the Nth stage, and selecting among the quantized outputs of the stages. SOLUTION: The second coding section 120, which has a CELP coding structure, is made up of multiple vector quantization sections, for example two coding sections 120_1 and 120_2. Once the closed-loop-search vector quantization of the first stage is finished, the quantization error of the (N-1)th stage is used as the reference input for quantization at the Nth stage (N >= 2). This reduces the amount of computation. The bit rate can be switched simply by choosing between using the index outputs of both coding sections 120_1 and 120_2 and using only the output of the first-stage coding section 120_1.

Description

Detailed Description of the Invention

[0001]

FIELD OF INDUSTRIAL APPLICATION: The present invention relates to a speech encoding method and apparatus that divide an input speech signal into blocks and encode it in units of those blocks.

[0002]

TECHNICAL FIELD OF THE INVENTION: Various encoding methods are known that compress audio signals (including speech signals and acoustic signals) by exploiting their statistical properties in the time domain and frequency domain together with the characteristics of human hearing. These methods are broadly classified into time-domain coding, frequency-domain coding, and analysis-synthesis coding.

[0003] Known examples of high-efficiency coding of speech signals and the like include sinusoidal analysis coding such as harmonic coding and MBE (Multiband Excitation) coding, as well as SBC (Sub-band Coding), LPC (Linear Predictive Coding), DCT (discrete cosine transform), MDCT (modified DCT), and FFT (fast Fourier transform). Another example is code-excited linear prediction (CELP) coding, which applies vector quantization with a closed-loop search for the optimum vector using the analysis-by-synthesis method.

[0004]

PROBLEM TO BE SOLVED BY THE INVENTION: Consider a system that uses, for example, code-excited linear predictive coding as the high-efficiency coding of the speech signal and sends the coded speech signal over a transmission line. Because code-excited linear predictive coding performs straight vector quantization and uses a conjugate codebook or the like, its structure makes it very difficult to switch the transmission bit rate, for example according to the capacity of the transmission line. In addition, when the bit rate on the encoding side (encoder) differs from the bit rate on the decoding side (decoder), for example when the encoder operates at M kbps and the decoder at N kbps, the decoder designed for N kbps cannot handle the M kbps coded data stream supplied by the encoder.

[0005] The present invention has been made in view of these circumstances. Its object is to provide a speech encoding method and apparatus that can switch the transmission bit rate easily and can generate a coded data stream that the decoding side can handle easily even when the bit rates of the encoding side and the decoding side differ.

[0006]

MEANS FOR SOLVING THE PROBLEM: The speech encoding method and apparatus according to the present invention encode an input speech signal in units of blocks obtained by dividing the signal on the time axis. They have multiple stages of coding, each of which vector-quantizes the time-axis waveform by a closed-loop search for the optimum vector using the analysis-by-synthesis method. Coding at the Nth stage takes the quantization error of the (N-1)th stage as its reference input, and the bit rate is switched by selecting among the quantized outputs of the stages. This solves the problem described above.

[0007] That is, according to the speech encoding method and apparatus of the present invention, the output bit rate can be switched by selecting all or part of the coded outputs of the multistage structure. Furthermore, because the coded outputs of all stages are output together, even if the bit rates of the encoding side and the decoding side differ, the decoding side can easily obtain a coded output at the desired bit rate, for example by selecting all or part of these coded outputs.
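The multistage idea can be illustrated with a minimal sketch. It assumes plain nearest-neighbour vector quantization directly on the waveform (the synthesis filter and perceptual weighting of the actual closed-loop search are omitted), and the codebooks `CB1` and `CB2` are hypothetical toy values, not taken from the patent.

```python
def sq_err(a, b):
    """Squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(codebook, target):
    """Return (index, codevector) of the entry closest to target."""
    idx = min(range(len(codebook)), key=lambda i: sq_err(target, codebook[i]))
    return idx, codebook[idx]

def encode_multistage(x, codebooks):
    """Stage N quantizes the quantization error left by stage N-1."""
    indices, reference = [], list(x)
    for cb in codebooks:
        idx, cv = nearest(cb, reference)
        indices.append(idx)
        # The remaining error becomes the reference input of the next stage.
        reference = [r - c for r, c in zip(reference, cv)]
    return indices

def decode_multistage(indices, codebooks):
    """Sum the codevectors of however many stage indices were kept."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out

# Hypothetical toy codebooks: 4 entries each, so 2 bits per stage.
CB1 = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
CB2 = [[0.25, 0.0], [-0.25, 0.0], [0.0, 0.25], [0.0, -0.25]]

x = [1.2, 0.9]
idx = encode_multistage(x, [CB1, CB2])
low = decode_multistage(idx[:1], [CB1, CB2])   # first stage only: 2 bits
high = decode_multistage(idx, [CB1, CB2])      # both stages: 4 bits
```

Dropping the second-stage index lowers the bit rate without re-encoding, at the cost of a larger reconstruction error.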

[0008]

EMBODIMENTS OF THE INVENTION: Preferred embodiments of the present invention are described below. First, FIG. 1 shows the basic configuration of an encoding apparatus to which an embodiment of the speech encoding method according to the present invention is applied.

[0009] The basic idea of the speech signal encoding apparatus of FIG. 1 is to provide a first encoding unit 110, which obtains the short-term prediction residual of the input speech signal, for example the LPC (linear predictive coding) residual, and performs sinusoidal analysis coding such as harmonic coding, and a second encoding unit 120, which encodes the input speech signal by waveform coding that transmits phase. The first encoding unit 110 is used to encode the voiced (V) portions of the input signal, and the second encoding unit 120 is used to encode the unvoiced (UV) portions.

[0010] The first encoding unit 110 uses, for example, a configuration that performs sinusoidal analysis coding, such as harmonic coding or multiband excitation (MBE) coding, on the LPC residual. The second encoding unit 120 uses, for example, a code-excited linear prediction (CELP) coding configuration that applies vector quantization with a closed-loop search for the optimum vector using the analysis-by-synthesis method.

[0011] In the example of FIG. 1, the speech signal supplied to the input terminal 101 is sent to the LPC inverse filter 111 and the LPC analysis/quantization unit 113 of the first encoding unit 110. The LPC coefficients, or so-called α parameters, obtained from the LPC analysis/quantization unit 113 are sent to the LPC inverse filter 111, which extracts the linear prediction residual (LPC residual) of the input speech signal. The LPC analysis/quantization unit 113 also produces a quantized LSP (line spectrum pair) output, as described later, which is sent to the output terminal 102. The LPC residual from the LPC inverse filter 111 is sent to the sinusoidal analysis encoding unit 114. The sinusoidal analysis encoding unit 114 performs pitch detection and spectral envelope amplitude calculation, and the V (voiced)/UV (unvoiced) determination unit 115 makes the V/UV determination. The spectral envelope amplitude data from the sinusoidal analysis encoding unit 114 is sent to the vector quantization unit 116. The codebook index from the vector quantization unit 116, which is the vector quantization output of the spectral envelope, is sent to the output terminal 103 via the switch 117, and the output from the sinusoidal analysis encoding unit 114 is sent to the output terminal 104 via the switch 118. The V/UV determination output from the V/UV determination unit 115 is sent to the output terminal 105 and is also sent to the switches 117 and 118 as their control signal. For voiced sound (V), the index and the pitch are selected and taken out from the output terminals 103 and 104, respectively.

[0012] The second encoding unit 120 of FIG. 1 has a CELP (code-excited linear prediction) coding configuration in this example. The output of the noise codebook 121 is synthesized by the weighted synthesis filter 122, and the resulting weighted speech is sent to the subtractor 123. The subtractor takes the error between this weighted speech and the speech obtained by passing the speech signal supplied to the input terminal 101 through the perceptual weighting filter 125. This error is sent to the distance calculation circuit 124, which calculates the distance, and the noise codebook 121 is searched for the vector that minimizes the error. The time-axis waveform is thus vector-quantized by a closed-loop search using the analysis-by-synthesis method. As described above, this CELP coding is used to encode the unvoiced portions. The codebook index from the noise codebook 121, which serves as the UV data, is taken out from the output terminal 107 via the switch 127, which is turned on when the V/UV determination result from the V/UV determination unit 115 is unvoiced (UV).

[0013] Next, FIG. 2 is a block diagram showing the basic configuration of a speech signal decoding apparatus that corresponds to the speech signal encoding apparatus of FIG. 1 and to which an embodiment of the speech decoding method according to the present invention is applied.

[0014] In FIG. 2, the codebook index that is the quantized output of the LSP (line spectrum pair) from the output terminal 102 of FIG. 1 is input to the input terminal 202. The outputs from the output terminals 103, 104, and 105 of FIG. 1, namely the index that is the envelope quantization output, the pitch, and the V/UV determination output, are input to the input terminals 203, 204, and 205, respectively. The index that is the data for UV (unvoiced sound) from the output terminal 107 of FIG. 1 is input to the input terminal 207.

[0015] The index that is the envelope quantization output from the input terminal 203 is sent to the inverse vector quantizer 212 and inverse vector-quantized to obtain the spectral envelope of the LPC residual, which is sent to the voiced sound synthesis unit 211. The voiced sound synthesis unit 211 synthesizes the LPC (linear predictive coding) residual of the voiced portion by sinusoidal synthesis; it is also supplied with the pitch and the V/UV determination output from the input terminals 204 and 205. The LPC residual of the voiced sound from the voiced sound synthesis unit 211 is sent to the LPC synthesis filter 214. The index of the UV data from the input terminal 207 is sent to the unvoiced sound synthesis unit 220, which extracts the LPC residual of the unvoiced portion by referring to the noise codebook. This LPC residual is also sent to the LPC synthesis filter 214. The LPC synthesis filter 214 applies LPC synthesis to the LPC residual of the voiced portion and the LPC residual of the unvoiced portion independently of each other. Alternatively, LPC synthesis may be applied to the sum of the two residuals. The LSP index from the input terminal 202 is sent to the LPC parameter reproduction unit 213, which extracts the LPC α parameters and sends them to the LPC synthesis filter 214. The speech signal obtained by LPC synthesis in the LPC synthesis filter 214 is taken out from the output terminal 201.

[0016] Next, a more specific configuration of the speech signal encoding apparatus shown in FIG. 1 is described with reference to FIG. 3. In FIG. 3, parts corresponding to those in FIG. 1 are given the same reference numerals.

[0017] In the speech signal encoding apparatus shown in FIG. 3, the speech signal supplied to the input terminal 101 is filtered by a high-pass filter (HPF) 109 to remove signals in unneeded bands, and is then sent to the LPC analysis circuit 132 of the LPC (linear predictive coding) analysis/quantization unit 113 and to the LPC inverse filter circuit 111.

[0018] The LPC analysis circuit 132 of the LPC analysis/quantization unit 113 applies a Hamming window to the input signal waveform, taking a length of about 256 samples as one block, and obtains the linear prediction coefficients, the so-called α parameters, by the autocorrelation method. The framing interval, which is the unit of data output, is about 160 samples. When the sampling frequency fs is, for example, 8 kHz, one frame interval of 160 samples corresponds to 20 msec.
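The analysis step can be sketched as follows. This is a minimal illustration that assumes the Levinson-Durbin recursion is used to solve the autocorrelation normal equations (the text names only the autocorrelation method), an analysis order of 10, and the sign convention A(z) = 1 + Σ a_k z^-k; the function names are hypothetical.

```python
import math

FS = 8000             # sampling frequency in Hz
BLOCK_SAMPLES = 256   # analysis block length
FRAME_SAMPLES = 160   # framing interval
frame_ms = 1000 * FRAME_SAMPLES / FS   # 160 samples at 8 kHz

def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the windowed block."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return ([1, a_1..a_order], residual energy) for A(z) = 1 + sum a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def lpc_analyze(block, order=10):
    """Window one block and compute its alpha parameters (block must not be all zero)."""
    windowed = [s * w for s, w in zip(block, hamming(len(block)))]
    return levinson_durbin(autocorr(windowed, order), order)
```

For example, `lpc_analyze(block)` on a 256-sample block returns eleven coefficients (the leading 1 plus ten α parameters) and the prediction error energy.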

[0019] The α parameters from the LPC analysis circuit 132 are sent to the α-to-LSP conversion circuit 133 and converted into line spectrum pair (LSP) parameters. That is, the α parameters obtained as direct-form filter coefficients are converted into, for example, ten LSP parameters, that is, five pairs. The conversion is performed using, for example, the Newton-Raphson method. The α parameters are converted into LSP parameters because LSP parameters have better interpolation characteristics than α parameters.

[0020] The LSP parameters from the α-to-LSP conversion circuit 133 are matrix-quantized or vector-quantized by the LSP quantizer 134. The inter-frame difference may be taken before vector quantization, or several frames may be grouped together and matrix-quantized. Here, one frame is 20 msec, and the LSP parameters calculated every 20 msec are grouped over two frames and subjected to matrix quantization and vector quantization.

[0021] The quantized output of the LSP quantizer 134, that is, the LSP quantization index, is taken out via the terminal 102, and the quantized LSP vector is sent to the LSP interpolation circuit 136.

[0022] The LSP interpolation circuit 136 interpolates the LSP vectors quantized every 20 msec or 40 msec to raise the rate by a factor of eight, so that the LSP vector is updated every 2.5 msec. The reason is that when the residual waveform is analyzed and synthesized by the harmonic encoding/decoding method, the envelope of the synthesized waveform is very smooth, so an abrupt change in the LPC coefficients every 20 msec can produce abnormal sounds. Letting the LPC coefficients change gradually every 2.5 msec prevents such abnormal sounds.
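The eightfold rate increase can be sketched as follows. The text does not state the interpolation rule, so linear interpolation between the previous and current quantized LSP vectors is an assumption here, and `interpolate_lsp` is a hypothetical name.

```python
FRAME_MS = 20.0
STEPS = 8
subframe_ms = FRAME_MS / STEPS   # update interval after interpolation

def interpolate_lsp(prev_lsp, cur_lsp, steps=STEPS):
    """Return `steps` LSP vectors moving linearly from prev_lsp to cur_lsp.

    The last returned vector equals cur_lsp, so consecutive frames join up.
    """
    out = []
    for s in range(1, steps + 1):
        w = s / steps
        out.append([(1.0 - w) * p + w * c for p, c in zip(prev_lsp, cur_lsp)])
    return out

# Toy two-element vectors; a real LSP vector would have about ten elements.
prev_lsp = [0.1, 0.2]
cur_lsp = [0.3, 0.6]
lsp_track = interpolate_lsp(prev_lsp, cur_lsp)
```

Each of the eight vectors would then be converted back to α parameters and used for one 2.5 msec stretch of the frame.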

[0023] To inverse-filter the input speech using the interpolated LSP vectors at 2.5 msec intervals, the LSP-to-α conversion circuit 137 converts the LSP parameters into α parameters, which are the coefficients of a direct-form filter of, for example, about order 10. The output of the LSP-to-α conversion circuit 137 is sent to the LPC inverse filter circuit 111, which performs inverse filtering with the α parameters updated every 2.5 msec so as to obtain a smooth output. The output of the LPC inverse filter 111 is sent to the orthogonal transform circuit 145, for example a DFT (discrete Fourier transform) circuit, of the sinusoidal analysis encoding unit 114, which is specifically a harmonic coding circuit, for example.
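Inverse filtering with coefficients that change every 2.5 msec can be sketched as below. The sketch assumes the convention A(z) = 1 + Σ a_k z^-k, so the residual is e[n] = x[n] + Σ a_k x[n-k], and assumes 20-sample subframes (2.5 msec at 8 kHz); the function name and the toy first-order coefficients are hypothetical.

```python
FS = 8000
SUBFRAME_SAMPLES = int(FS * 2.5 / 1000)   # samples per 2.5 msec update

def inverse_filter(x, coeff_sets, subframe=SUBFRAME_SAMPLES):
    """Compute the LPC residual of x.

    coeff_sets[i] = [1, a_1, ..., a_p] is applied to the samples of subframe i;
    the last set is reused if x is longer than the sets cover.
    """
    residual = []
    for n, xn in enumerate(x):
        a = coeff_sets[min(n // subframe, len(coeff_sets) - 1)]
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc += a[k] * x[n - k]
        residual.append(acc)
    return residual

# Impulse response of 1 / (1 - 0.5 z^-1): filtering with A(z) = 1 - 0.5 z^-1
# should recover the single impulse.
x = [0.5 ** n for n in range(40)]
res_fixed = inverse_filter(x, [[1.0, -0.5], [1.0, -0.5]])
# Switching to A(z) = 1 in the second subframe passes x through unchanged there.
res_switched = inverse_filter(x, [[1.0, -0.5], [1.0, 0.0]])
```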

[0024] The α parameters from the LPC analysis circuit 132 of the LPC analysis/quantization unit 113 are sent to the perceptual weighting filter calculation circuit 139, which obtains the data for perceptual weighting. This weighting data is sent to the perceptually weighted vector quantizer 116, described later, and to the perceptual weighting filter 125 and the perceptually weighted synthesis filter 122 of the second encoding unit 120.

[0025] The sinusoidal analysis encoding unit 114, such as a harmonic coding circuit, analyzes the output of the LPC inverse filter 111 by the harmonic coding method. That is, it detects the pitch, calculates the amplitude Am of each harmonic, and discriminates voiced (V) from unvoiced (UV) sound, and it converts the number of harmonic envelope values or amplitudes Am, which varies with the pitch, to a constant number by dimension conversion.

[0026] The specific example of the sinusoidal analysis encoding unit 114 shown in FIG. 3 assumes ordinary harmonic coding. In the case of MBE (Multiband Excitation) coding in particular, the signal is modeled on the assumption that a voiced portion and an unvoiced portion exist in each frequency-axis region, or band, at the same time instant (within the same block or frame). In other harmonic coding, an either-or decision is made as to whether the speech within one block or frame is voiced or unvoiced. In the following description, the per-frame V/UV decision, when applied to MBE coding, treats a frame as UV when all of its bands are UV.

[0027] The open-loop pitch search unit 141 of the sinusoidal analysis encoding unit 114 in FIG. 3 is supplied with the input speech signal from the input terminal 101, and the zero-cross counter 142 is supplied with the signal from the HPF (high-pass filter) 109. The orthogonal transform circuit 145 of the sinusoidal analysis encoding unit 114 is supplied with the LPC residual, or linear prediction residual, from the LPC inverse filter 111. The open-loop pitch search unit 141 takes the LPC residual of the input signal and performs a relatively rough open-loop pitch search. The extracted coarse pitch data is sent to the high-precision pitch search unit 146, which performs a high-precision closed-loop pitch search (fine pitch search) as described later. Along with the coarse pitch data, the open-loop pitch search unit 141 outputs the normalized autocorrelation maximum r(p), obtained by normalizing the maximum value of the autocorrelation of the LPC residual by the power, and sends it to the V/UV (voiced/unvoiced) determination unit 115.
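A coarse open-loop search of this kind can be sketched as follows. The lag range of 20 to 147 samples and the function name are assumptions for illustration; the sketch simply picks the lag that maximizes the autocorrelation of the residual and normalizes that maximum by the residual power to give r(p).

```python
def open_loop_pitch(residual, min_lag=20, max_lag=147):
    """Return (coarse pitch lag, normalized autocorrelation maximum r(p))."""
    power = sum(v * v for v in residual)
    if power == 0.0:
        return 0, 0.0            # silent block: no pitch
    best_lag, best_r = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(residual[i] * residual[i + lag]
                   for i in range(len(residual) - lag))
        r = corr / power         # normalize by the power (lag-0 autocorrelation)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Toy residual: an impulse train with a period of 40 samples in a 256-sample block.
train = [1.0 if n % 40 == 0 else 0.0 for n in range(256)]
coarse_lag, r_p = open_loop_pitch(train)
```

A value of r(p) near 1 indicates strong periodicity, which is why it is useful as one input to the V/UV decision.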

[0028] The orthogonal transform circuit 145 performs an orthogonal transform such as the DFT (discrete Fourier transform), converting the LPC residual on the time axis into spectral amplitude data on the frequency axis. The output of the orthogonal transform circuit 145 is sent to the high-precision pitch search unit 146 and to the spectrum evaluation unit 148, which evaluates the spectral amplitude or envelope.

[0029] The high-precision (fine) pitch search unit 146 is supplied with the relatively rough coarse pitch data extracted by the open-loop pitch search unit 141 and with the frequency-axis data produced, for example by DFT, by the orthogonal transform unit 145. The high-precision pitch search unit 146 varies the pitch around the coarse pitch value over a range of ± several samples in steps of 0.2 to 0.5, converging on the optimum fine pitch value with a fractional part (floating point). The fine search uses the so-called analysis-by-synthesis method, selecting the pitch so that the synthesized power spectrum is closest to the power spectrum of the original sound. The pitch data from this closed-loop high-precision pitch search unit 146 is sent to the output terminal 104 via the switch 118.

[0030] The spectrum evaluation unit 148 evaluates the magnitude of each harmonic, and the spectral envelope formed by the set of harmonics, on the basis of the pitch and the spectral amplitudes that are the orthogonal transform output of the LPC residual. The result is sent to the high-precision pitch search unit 146, the V/UV (voiced/unvoiced) determination unit 115, and the perceptually weighted vector quantizer 116.

[0031] The V/UV (voiced/unvoiced) determination unit 115 makes the V/UV determination for the frame on the basis of the output of the orthogonal transform circuit 145, the optimum pitch from the high-precision pitch search unit 146, the spectral amplitude data from the spectrum evaluation unit 148, the normalized autocorrelation maximum r(p) from the open-loop pitch search unit 141, and the zero-cross count from the zero-cross counter 142. In the case of MBE, the boundary position of the per-band V/UV determination results may also be used as one condition for the V/UV determination of the frame. The determination output of the V/UV determination unit 115 is taken out via the output terminal 105.

[0032] A data number conversion unit (a kind of sampling rate conversion) is provided at the output of the spectrum evaluation unit 148 or at the input of the vector quantizer 116. Because the number of bands into which the frequency axis is divided, and hence the number of data, varies with the pitch, this unit makes the number of envelope amplitude data |A_m| constant. For example, if the effective band extends up to 3400 Hz, it is divided into 8 to 63 bands depending on the pitch, so the number m_MX + 1 of amplitude data |A_m| obtained for these bands also varies from 8 to 63. The data number conversion unit 119 therefore converts this variable number m_MX + 1 of amplitude data into a fixed number M of data, for example 44.
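The effect of the data number conversion can be sketched with a deliberately simple stand-in. The passage does not give the conversion algorithm, so plain linear interpolation is used here purely as an assumption to show a variable-length amplitude vector being mapped to a fixed length of 44; `convert_data_count` is a hypothetical name.

```python
def convert_data_count(amps, target=44):
    """Resample a list of amplitudes to exactly `target` values.

    Linear interpolation between neighbouring amplitudes; the first and last
    input values are preserved as the first and last output values.
    """
    n = len(amps)
    if n == 1:
        return [amps[0]] * target
    out = []
    for j in range(target):
        pos = j * (n - 1) / (target - 1)   # position on the input index axis
        i = min(int(pos), n - 2)
        frac = pos - i
        out.append((1.0 - frac) * amps[i] + frac * amps[i + 1])
    return out

# A low-pitched frame gives many harmonics, a high-pitched frame gives few.
few = [float(v) for v in range(8)]      # 8 amplitude values
many = [float(v) for v in range(63)]    # 63 amplitude values
fixed_from_few = convert_data_count(few)
fixed_from_many = convert_data_count(many)
```

Either input yields a 44-element vector, so the vector quantizer that follows can work with a single fixed dimension.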

[0033] The fixed number M (for example, 44) of amplitude data or envelope data from the data number conversion unit, provided at the output of the spectrum evaluation unit 148 or at the input of the vector quantizer 116, is grouped by the vector quantizer 116 into vectors of a predetermined number of data, for example 44, and subjected to weighted vector quantization. The weight is given by the output of the perceptual weighting filter calculation circuit 139. The envelope index from the vector quantizer 116 is taken out from the output terminal 103 via the switch 117. Before the weighted vector quantization, an inter-frame difference using an appropriate leak coefficient may be taken for the vector made up of the predetermined number of data.

[0034] Next, the second encoding unit 120 is described. The second encoding unit 120 has a so-called CELP (code-excited linear prediction) coding configuration and is used in particular to encode the unvoiced portions of the input speech signal. In this CELP coding configuration for unvoiced portions, the noise output corresponding to the LPC residual of unvoiced sound, which is the representative-value output of the noise codebook, or so-called stochastic codebook, 121, is sent via the gain circuit 126 to the perceptually weighted synthesis filter 122. The weighted synthesis filter 122 applies LPC synthesis to the input noise and sends the resulting weighted unvoiced signal to the subtractor 123. The subtractor 123 also receives the speech signal supplied from the input terminal 101 via the HPF (high-pass filter) 109 and perceptually weighted by the perceptual weighting filter 125, and it extracts the difference, or error, between this signal and the signal from the synthesis filter 122. This error is sent to the distance calculation circuit 124, which calculates the distance, and the noise codebook 121 is searched for the representative-value vector that minimizes the error. The time-axis waveform is thus vector-quantized by a closed-loop search using the analysis-by-synthesis method.
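The closed-loop search can be sketched as follows. This is a simplified illustration: perceptual weighting is omitted, the synthesis filter starts from a zero state, and the shape codebook, gain table, and exhaustive joint search over shape and gain are assumptions rather than the patent's procedure.

```python
def synthesize(excitation, a):
    """All-pole LPC synthesis 1/A(z), A(z) = 1 + sum a_k z^-k, zero initial state."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def celp_search(target, shapes, gains, a):
    """Return (shape index, gain index, distance) minimizing the squared error.

    Every candidate is synthesized first and compared with the target
    afterwards, which is what makes the search closed-loop.
    """
    best = None
    for si, shape in enumerate(shapes):
        synth = synthesize(shape, a)
        for gi, g in enumerate(gains):
            dist = sum((t - g * s) ** 2 for t, s in zip(target, synth))
            if best is None or dist < best[2]:
                best = (si, gi, dist)
    return best

# Hypothetical toy codebooks and a first-order synthesis filter.
shapes = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, -1.0, 0.0, 0.0]]
gains = [0.5, 1.0, 2.0]
a = [1.0, -0.5]
target = [2.0 * s for s in synthesize(shapes[1], a)]
shape_idx, gain_idx, dist = celp_search(target, shapes, gains, a)
```

The two returned indices play the role of the shape index and gain index that are transmitted as the UV data.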

【００３５】[0035]

As data for the UV (unvoiced) portion from the second encoding unit 120 using this CELP encoding configuration, the codebook shape index from the noise codebook 121 and the codebook gain index from the gain circuit 126 are taken out. The shape index, which is the UV data from the noise codebook 121, is sent to the output terminal 107s via a switch 127s, and the gain index, which is the UV data of the gain circuit 126, is sent to the output terminal 107g via a switch 127g.

【００３６】[0036]

Here, the switches 127s and 127g and the switches 117 and 118 are turned on and off according to the V/UV determination result from the V/UV determination unit 115. The switches 117 and 118 are turned on when the V/UV determination result for the speech signal of the frame currently to be transmitted is voiced (V), and the switches 127s and 127g are turned on when the speech signal of the frame currently to be transmitted is unvoiced (UV).

【００３７】[0037]

Next, FIG. 4 shows a more specific configuration of the speech signal decoding apparatus shown in FIG. 2 as an embodiment of the present invention. In FIG. 4, parts corresponding to those in FIG. 2 are denoted by the same reference numerals.

【００３８】[0038]

In FIG. 4, the input terminal 202 is supplied with the vector quantization output of the LSPs, the so-called codebook index, corresponding to the output from the output terminal 102 of FIGS. 1 and 3.

【００３９】[0039]

This LSP index is sent to the LSP inverse vector quantizer 231 of the LPC parameter reproduction unit 213, where it is inverse vector quantized into LSP (line spectrum pair) data. The LSP data is sent to LSP interpolation circuits 232 and 233 for LSP interpolation and is then converted by LSP→α conversion circuits 234 and 235 into LPC (linear predictive coding) α parameters, which are sent to the LPC synthesis filter 214. Here, the LSP interpolation circuit 232 and the LSP→α conversion circuit 234 are for voiced sound (V), and the LSP interpolation circuit 233 and the LSP→α conversion circuit 235 are for unvoiced sound (UV). The LPC synthesis filter 214 is likewise separated into an LPC synthesis filter 236 for the voiced portion and an LPC synthesis filter 237 for the unvoiced portion. That is, LPC coefficient interpolation is performed independently for the voiced and unvoiced portions, which prevents the adverse effects of interpolating between LSPs of completely different character at transitions from voiced to unvoiced sound and from unvoiced to voiced sound.

【００４０】[0040]

The input terminal 203 of FIG. 4 is supplied with the weighted-vector-quantized code index data of the spectral envelope (Am), corresponding to the output from the encoder-side terminal 103 of FIGS. 1 and 3. The input terminal 204 is supplied with the pitch data from the terminal 104 of FIGS. 1 and 3, and the input terminal 205 is supplied with the V/UV determination data from the terminal 105 of FIGS. 1 and 3.

【００４１】[0041]

The vector-quantized index data of the spectral envelope Am from the input terminal 203 is sent to the inverse vector quantizer 212, where it undergoes inverse vector quantization and then the inverse of the data number conversion described above. The resulting spectral envelope data is sent to the sine wave synthesis circuit 215 of the voiced sound synthesis unit 211.

【００４２】[0042]

If an inter-frame difference was taken before vector quantization of the spectrum at encoding time, the inter-frame difference is decoded after the inverse vector quantization here, and the data number conversion is then performed to obtain the spectral envelope data.

【００４３】[0043]

The sine wave synthesis circuit 215 is supplied with the pitch from the input terminal 204 and the V/UV determination data from the input terminal 205. The sine wave synthesis circuit 215 outputs LPC residual data corresponding to the output of the LPC inverse filter 111 of FIGS. 1 and 3 described above, and this data is sent to the adder 218.

【００４４】[0044]

The envelope data from the inverse vector quantizer 212 and the pitch and V/UV determination data from the input terminals 204 and 205 are also sent to a noise synthesis circuit 216, which adds noise to the voiced (V) portion. The output of the noise synthesis circuit 216 is sent to the adder 218 via a weighted overlap-add circuit 217. The reason is as follows. When the excitation that is input to the LPC synthesis filter for voiced sound is produced by sine wave synthesis, low-pitched sounds such as male voices can sound stuffy (nasal), and the sound quality can change abruptly between V (voiced) and UV (unvoiced) sound and feel unnatural. To address this, noise that takes into account parameters derived from the encoded speech data, such as the pitch, the spectral envelope amplitude, the maximum amplitude within the frame, and the residual signal level, is added to the voiced portion of the LPC residual signal, that is, to the excitation input to the LPC synthesis filter for the voiced portion.

【００４５】[0045]

The sum output from the adder 218 is sent to the voiced sound synthesis filter 236 of the LPC synthesis filter 214, where LPC synthesis converts it into time waveform data. This data is then filtered by the voiced sound post filter 238v and sent to the adder 239.

【００４６】[0046]

Next, the input terminals 207s and 207g of FIG. 4 are supplied with the shape index and the gain index, respectively, as UV data from the output terminals 107s and 107g of FIG. 3, and these indices are sent to the unvoiced sound synthesis unit 220. The shape index from the terminal 207s is sent to the noise codebook 221 of the unvoiced sound synthesis unit 220, and the gain index from the terminal 207g is sent to the gain circuit 222. The representative value output read from the noise codebook 221 is a noise signal component corresponding to the LPC residual of unvoiced sound. The gain circuit 222 scales it to the amplitude given by the specified gain, and the result is sent to a windowing circuit 223, which applies windowing to smooth the junction with the voiced portion.
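The decoder-side steps in this paragraph (codebook lookup, gain scaling, windowing) can be sketched as follows. The source does not specify the window shape, so a Hann window is used purely as a stand-in; `decode_unvoiced_excitation`, `codebook`, and `gain_table` are hypothetical names, and each codebook entry is assumed to hold at least two samples.

```python
import math


def decode_unvoiced_excitation(shape_index, gain_index, codebook, gain_table):
    """Look up the shape vector, scale it by the decoded gain, then window it."""
    shape = codebook[shape_index]
    gain = gain_table[gain_index]
    n = len(shape)
    # Hann window as a placeholder: it tapers both ends toward zero, which is
    # one way to smooth the junction with a neighbouring voiced segment.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]
    return [gain * s * w for s, w in zip(shape, window)]
```

The result corresponds to the signal that is passed on to the unvoiced LPC synthesis filter.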

【００４７】[0047]

The output of the windowing circuit 223 is sent, as the output of the unvoiced sound synthesis unit 220, to the UV (unvoiced) synthesis filter 237 of the LPC synthesis filter 214. The synthesis filter 237 applies LPC synthesis to produce time waveform data for the unvoiced portion, which is filtered by the unvoiced sound post filter 238u and then sent to the adder 239.

【００４８】[0048]

The adder 239 adds the time waveform signal of the voiced portion from the voiced sound post filter 238v and the time waveform data of the unvoiced portion from the unvoiced sound post filter 238u, and the sum is taken out from the output terminal 201.

【００４９】[0049]

In the speech signal encoding apparatus described above, the bit rate of the output data is variable. Specifically, the bit rate of the output data can be switched between a low bit rate and a high bit rate. For example, when the low bit rate is 2 kbps and the high bit rate is 6 kbps, data is output at the bit rates shown in Table 1 below.

【００５０】[0050]

【表１】 [Table 1]

【００５１】[0051]

The pitch data from the output terminal 104 is always output at 8 bits/20 msec for voiced sound, and the V/UV determination output from the output terminal 105 is always 1 bit/20 msec. The LSP quantization index output from the output terminal 102 is switched between 32 bits/40 msec and 48 bits/40 msec. The index for voiced sound (V) output from the output terminal 103 is switched between 15 bits/20 msec and 87 bits/20 msec, and the index for unvoiced sound (UV) output from the output terminals 107s and 107g is switched between 11 bits/10 msec and 23 bits/5 msec. As a result, the output data for voiced sound (V) is 40 bits/20 msec at 2 kbps and 120 bits/20 msec at 6 kbps, and the output data for unvoiced sound (UV) is 39 bits/20 msec at 2 kbps and 117 bits/20 msec at 6 kbps.
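The totals in this paragraph can be checked by converting every allocation to bits per 20 msec frame and summing. The sketch below does this; it assumes, as the stated totals imply, that pitch data is counted only for voiced frames. The function names and the rate labels `"2k"` and `"6k"` are hypothetical.

```python
FRAME_MS = 20


def bits_per_frame(bits, period_ms):
    """Scale an allocation of `bits` per `period_ms` to bits per 20 msec frame."""
    return bits * FRAME_MS // period_ms


def frame_bits(rate, voiced):
    """Total bits per 20 msec frame for rate "2k" or "6k"."""
    lsp = bits_per_frame(32 if rate == "2k" else 48, 40)  # LSP index, per 40 msec
    vuv = 1                                               # V/UV flag
    if voiced:
        pitch = 8
        envelope = 15 if rate == "2k" else 87             # spectral envelope index
        return lsp + vuv + pitch + envelope
    if rate == "2k":
        excitation = bits_per_frame(11, 10)               # shape + gain, per 10 msec
    else:
        excitation = bits_per_frame(23, 5)                # shape + gain, per 5 msec
    return lsp + vuv + excitation
```

At 50 frames per second, 40 and 120 bits per frame correspond to 2000 and 6000 bits per second, while the unvoiced totals of 39 and 117 bits correspond to 1950 and 5850 bits per second.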

【００５２】[0052]

The LSP quantization index, the index for voiced sound (V), and the index for unvoiced sound (UV) are described below together with the configuration of the relevant units.

【００５３】[0053]

Next, matrix quantization and vector quantization in the LSP quantizer 134 are described in detail with reference to FIGS. 5 and 6.

【００５４】[0054]

As described above, the α parameters from the LPC analysis circuit 132 are sent to the α→LSP conversion circuit 133 and converted into LSP parameters. For example, when the LPC analysis circuit 132 performs P-th order LPC analysis, P α parameters are calculated. These P α parameters are converted into LSP parameters and held in a buffer 610.

【００５５】[0055]

The buffer 610 outputs LSP parameters for two frames. The two frames of LSP parameters are matrix-quantized by a matrix quantization unit 620, which consists of a first matrix quantization unit 620₁ and a second matrix quantization unit 620₂. The two frames of LSP parameters are matrix-quantized by the first matrix quantization unit 620₁, and the resulting quantization error is further matrix-quantized by the second matrix quantization unit 620₂. These matrix quantizations remove the correlation in the time-axis direction.

【００５６】[0056]

The quantization error for two frames from the matrix quantization unit 620₂ is input to a vector quantization unit 640, which consists of a first vector quantization unit 640₁ and a second vector quantization unit 640₂. The first vector quantization unit 640₁ consists of two vector quantization units 650 and 660, and the second vector quantization unit 640₂ consists of two vector quantization units 670 and 680. The vector quantization units 650 and 660 of the first vector quantization unit 640₁ each vector-quantize one frame of the quantization error from the matrix quantization unit 620. The resulting quantization error vectors are further vector-quantized by the vector quantization units 670 and 680 of the second vector quantization unit 640₂. These vector quantizations handle the correlation in the frequency-axis direction.
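The cascade described here, in which each stage quantizes the error left by the previous stage, can be sketched generically. This is a simplified illustration that uses an unweighted squared distance and treats a 10×2 matrix as a flat list; the function names and codebooks are hypothetical and are not the trained codebooks of the source.

```python
def nearest(codebook, vector):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda k: dist(codebook[k]))


def multistage_quantize(vector, codebooks):
    """Quantize stage by stage; each stage receives the previous stage's error."""
    indices = []
    residual = list(vector)
    for codebook in codebooks:
        k = nearest(codebook, residual)
        indices.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return indices, residual


def multistage_reconstruct(indices, codebooks):
    """Sum the selected codewords; passing fewer indices gives a coarser result."""
    total = [0.0] * len(codebooks[0][0])
    for k, codebook in zip(indices, codebooks):
        total = [t + c for t, c in zip(total, codebook[k])]
    return total
```

Because the decoder simply sums codewords, dropping the indices of the later stages still yields a valid, coarser reconstruction, which is what makes this structure convenient for switching the bit rate.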

【００５７】[0057]

Thus, the matrix quantization unit 620, which performs the matrix quantization step, has at least a first matrix quantization unit 620₁ that performs a first matrix quantization step and a second matrix quantization unit 620₂ that performs a second matrix quantization step of matrix-quantizing the quantization error of the first matrix quantization. The vector quantization unit 640, which performs the vector quantization step, has at least a first vector quantization unit 640₁ that performs a first vector quantization step and a second vector quantization unit 640₂ that performs a second vector quantization step of vector-quantizing the quantization error vector of the first vector quantization.

【００５８】[0058]

Next, the matrix quantization and vector quantization are described concretely.

【００５９】[0059]

The two frames of LSP parameters held in the buffer 610, that is, a 10×2 matrix, are sent to the first matrix quantization unit 620₁. There, the two frames of LSP parameters are sent via an adder 621 to a weighted distance calculator 623, which calculates the minimum weighted distance.

【００６０】[0060]

The distortion measure d_MQ1 used in the codebook search by the first matrix quantization unit 620₁ is given by equation (1), using the LSP parameters X₁ and the quantized value X₁'.

【００６１】[0061]

【数１】 (Equation 1)

【００６２】[0062]

Here, t is the frame number and i is the index within the P dimensions.

【００６３】[0063]

The weight W in this case, when no restriction on the weights in the frequency-axis and time-axis directions is taken into account, is given by equation (2).

【００６４】[0064]

【数２】 (Equation 2)

【００６５】[0065]

The weight W of equation (2) is also used in the matrix quantization and vector quantization of the subsequent stages.
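Equations (1) and (2) are not reproduced in this text, so the following is only a sketch of the general shape such a search criterion can take: a weighted squared error summed over frames t and dimensions i. The inverse-spacing weight in `lsp_weights` is a commonly used LSP weighting and is an assumption here, not a transcription of equation (2); all names are hypothetical.

```python
import math


def lsp_weights(lsp):
    """Inverse-spacing weights for one frame of ascending LSPs in (0, pi).

    Assumed form: closely spaced LSPs, which tend to mark spectral peaks,
    receive larger weights. The ends are padded with 0 and pi.
    """
    padded = [0.0] + list(lsp) + [math.pi]
    return [1.0 / (padded[i + 1] - padded[i]) + 1.0 / (padded[i] - padded[i - 1])
            for i in range(1, len(padded) - 1)]


def weighted_distortion(frames, quantized, weights):
    """Sum over frames t and dimensions i of w(t, i) * (x(t, i) - x'(t, i)) ** 2."""
    return sum(w * (x - q) ** 2
               for fx, fq, fw in zip(frames, quantized, weights)
               for x, q, w in zip(fx, fq, fw))
```

For matrix quantization, `frames` would hold the two frames of a 10×2 matrix; for the per-frame vector quantization stages it would hold a single frame.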

【００６６】[0066]

The calculated weighted distance is sent to a matrix quantizer (MQ₁) 622, which performs matrix quantization. The 8-bit index output by this matrix quantization is sent to a signal switch 690. The quantized value from the matrix quantization is subtracted in the adder 621 from the LSP parameters for the next two frames from the buffer 610. The weighted distance calculator 623 uses the output of the adder 621 to calculate the minimum weighted distance. In this way, for every two frames, the weighted distance calculator 623 calculates the weighted distance and the matrix quantizer 622 performs matrix quantization in turn. The output of the adder 621 is also sent to an adder 631 of the second matrix quantization unit 620₂.

【００６７】[0067]

The second matrix quantization unit 620₂ performs matrix quantization in the same way as the first matrix quantization unit 620₁. The output of the adder 621 is sent via the adder 631 to a weighted distance calculator 633, which calculates the minimum weighted distance.

【００６８】[0068]

The distortion measure d_MQ2 used in the codebook search by the second matrix quantization unit 620₂ is given by equation (3), using the quantization error X₂ from the first matrix quantization unit 620₁ and the quantized value X₂'.

【００６９】[0069]

【数３】 (Equation 3)

【００７０】[0070]

This weighted distance is sent to a matrix quantizer (MQ₂) 632, which performs matrix quantization. The 8-bit index output by this matrix quantization is sent to the signal switch 690. The quantized value from the matrix quantization is subtracted in the adder 631 from the quantization error for the next two frames. The weighted distance calculator 633 uses the output of the adder 631 to calculate the minimum weighted distance in turn. The output of the adder 631 is also sent, one frame each, to adders 651 and 661 of the first vector quantization unit 640₁.

【００７１】[0071]

The first vector quantization unit 640₁ performs vector quantization frame by frame. The output of the adder 631 is sent, one frame each, via the adders 651 and 661 to weighted distance calculators 653 and 663, respectively, which calculate the minimum weighted distance.

【００７２】[0072]

The difference between the quantization error X₂ and the quantized value X₂' is a 10×2 matrix. Writing X₂ − X₂' = [X₃₋₁, X₃₋₂], the distortion measures d_VQ1 and d_VQ2 used in the codebook search by the vector quantizers 652 and 662 of the first vector quantization unit 640₁ are given by equations (4) and (5).

【００７３】[0073]

【数４】 (Equation 4)

【００７４】[0074]

This weighted distance is sent to a vector quantizer (VQ₁) 652 and a vector quantizer (VQ₂) 662, respectively, which perform vector quantization. The 8-bit indices output by these vector quantizations are sent to the signal switch 690. The quantized values from the vector quantization are subtracted in the adders 651 and 661 from the quantization error vectors for the next two frames to be input. The weighted distance calculators 653 and 663 use the outputs of the adders 651 and 661 to calculate the minimum weighted distance in turn. The outputs of the adders 651 and 661 are also sent to adders 671 and 681, respectively, of the second vector quantization unit 640₂.

【００７５】[0075]

Here, writing X₄₋₁ = X₃₋₁ − X₃₋₁' and X₄₋₂ = X₃₋₂ − X₃₋₂', the distortion measures d_VQ3 and d_VQ4 used in the codebook search by the vector quantizers 672 and 682 of the second vector quantization unit 640₂ are given by equations (6) and (7).

【００７６】[0076]

【数５】 (Equation 5)

【００７７】[0077]

This weighted distance is sent to a vector quantizer (VQ₃) 672 and a vector quantizer (VQ₄) 682, respectively, which perform vector quantization. The 8-bit indices output by these vector quantizations are sent to the signal switch 690. The quantized values from the vector quantization are subtracted in the adders 671 and 681 from the quantization error vectors for the next two frames to be input. The weighted distance calculators 673 and 683 use the outputs of the adders 671 and 681 to calculate the minimum weighted distance in turn.

【００７８】[0078]

When the codebooks are trained, training is performed with the generalized Lloyd algorithm (GLA) on the basis of the distortion measures above.

【００７９】[0079]

The distortion measures used for the codebook search and for training may differ from each other.

【００８０】[0080]

The 8-bit indices from the matrix quantizers 622 and 632 and the vector quantizers 652, 662, 672, and 682 are switched by the signal switch 690 and output from an output terminal 691.

【００８１】[0081]

Specifically, at the low bit rate, the outputs of the first matrix quantization unit 620₁ performing the first matrix quantization step, the second matrix quantization unit 620₂ performing the second matrix quantization step, and the first vector quantization unit 640₁ performing the first vector quantization step are taken out. At the high bit rate, the output of the second vector quantization unit 640₂ performing the second vector quantization step is taken out together with the low-bit-rate outputs.

【００８２】[0082]

As a result, a 32 bits/40 msec index is output at 2 kbps, and a 48 bits/40 msec index is output at 6 kbps.
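The role of the signal switch can be sketched as a simple selection over the six 8-bit indices. The stage labels and function names below are hypothetical; the selection itself follows the description above, with four indices at the low rate and all six at the high rate.

```python
STAGES = ("MQ1", "MQ2", "VQ1", "VQ2", "VQ3", "VQ4")
BITS_PER_INDEX = 8


def select_lsp_indices(indices, high_rate):
    """Keep the first four stage indices at the low rate, all six at the high rate."""
    names = STAGES if high_rate else STAGES[:4]
    return {name: indices[name] for name in names}


def lsp_bits(selected):
    """Number of LSP bits sent per 40 msec (two frames)."""
    return BITS_PER_INDEX * len(selected)
```

Four indices of 8 bits give the 32 bits/40 msec of the low rate, and six indices give the 48 bits/40 msec of the high rate.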

【００８３】[0083]

The matrix quantization unit 620 and the vector quantization unit 640 also perform weighting that is restricted in the frequency-axis direction, the time-axis direction, or both, in accordance with the characteristics of the parameters representing the LPC coefficients.

【００８４】[0084]

First, weighting restricted in the frequency-axis direction, matched to the characteristics of the LSP parameters, is described. For example, with order P = 10, the LSP parameters X(i) are grouped into three regions, low, middle, and high:

L₁ = {X(i) | 1 ≤ i ≤ 2}
L₂ = {X(i) | 3 ≤ i ≤ 6}
L₃ = {X(i) | 7 ≤ i ≤ 10}

If the weights of the groups L₁, L₂, and L₃ are 1/4, 1/2, and 1/4, the weights restricted only in the frequency-axis direction for the groups L₁, L₂, and L₃ are given by equations (8), (9), and (10).

【００８５】[0085]

【数６】 (Equation 6)

【００８６】[0086]

Thus, the weighting of each LSP parameter is performed only within its group, and the weights are limited by the weighting assigned to each group.
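Equations (8) to (10) are not reproduced in this text. One plausible reading of the description, offered only as an assumption, is that the unrestricted weights are normalized within each group so that the group totals come to 1/4, 1/2, and 1/4. The sketch below implements that reading with hypothetical names and 0-based indices.

```python
# (start, stop, group share) for the low, middle, and high groups,
# i.e. i = 1..2, 3..6, and 7..10 in the 1-based numbering of the text.
GROUPS = ((0, 2, 0.25), (2, 6, 0.5), (6, 10, 0.25))


def band_limited_weights(w):
    """Rescale weights so that each group sums to its assigned share."""
    out = []
    for start, stop, share in GROUPS:
        total = sum(w[start:stop])
        out.extend(share * x / total for x in w[start:stop])
    return out
```

With this normalization, a very large weight on one LSP can only take weight away from members of its own group, never from another band.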

【００８７】[0087]

Viewed in the time-axis direction, the weights of each frame always sum to 1, so the restriction in the time-axis direction is applied per frame. The weight restricted only in the time-axis direction is given by equation (11).

【００８８】[0088]

【数７】 (Equation 7)

【００８９】[0089]

With equation (11), weighting is performed between the two frames with frame numbers t = 0, 1, without restriction in the frequency-axis direction. This weighting restricted only in the time-axis direction is performed between the two frames subjected to matrix quantization.

【００９０】[0090]

During training, weighting is performed by equation (12) over all speech frames used as training data, that is, over the total number of frames T of all the data.

【００９１】[0091]

【数８】 (Equation 8)

【００９２】[0092]

Next, weighting restricted in both the frequency-axis and time-axis directions is described. For example, with order P = 10, the LSP parameters X(i, t) are grouped into three regions, low, middle, and high:

L₁ = {X(i, t) | 1 ≤ i ≤ 2, 0 ≤ t ≤ 1}
L₂ = {X(i, t) | 3 ≤ i ≤ 6, 0 ≤ t ≤ 1}
L₃ = {X(i, t) | 7 ≤ i ≤ 10, 0 ≤ t ≤ 1}

If the weights of the groups L₁, L₂, and L₃ are 1/4, 1/2, and 1/4, the weighting restricted in the frequency-axis and time-axis directions for the groups L₁, L₂, and L₃ is given by equations (13), (14), and (15).

【００９３】[0093]

【数９】 (Equation 9)

【００９４】[0094]

With equations (13), (14), and (15), weighting is performed with a restriction applied to each of the three bands in the frequency-axis direction and across the two frames subjected to matrix quantization in the time-axis direction. This is effective both for the codebook search and for training.

【００９５】[0095]

During training, weighting is performed over the number of frames of all the data. The LSP parameters X(i, t) are grouped into three regions, low, middle, and high:

L₁ = {X(i, t) | 1 ≤ i ≤ 2, 0 ≤ t ≤ T}
L₂ = {X(i, t) | 3 ≤ i ≤ 6, 0 ≤ t ≤ T}
L₃ = {X(i, t) | 7 ≤ i ≤ 10, 0 ≤ t ≤ T}

If the weights of the groups L₁, L₂, and L₃ are 1/4, 1/2, and 1/4, the weighting restricted in the frequency-axis and time-axis directions for the groups L₁, L₂, and L₃ is given by equations (16), (17), and (18).

【００９６】[0096]

【数１０】 (Equation 10)

【００９７】[0097]

With equations (16), (17), and (18), weighting can be performed for each of the three bands in the frequency-axis direction and across all frames in the time-axis direction.

【００９８】[0098]

Furthermore, the matrix quantization unit 620 and the vector quantization unit 640 perform weighting according to the magnitude of change in the LSP parameters. In the V→UV and UV→V transition (transient) portions, which make up only a small number of the speech frames overall, the LSP parameters change greatly because of the difference in frequency characteristics between consonants and vowels. Multiplying the weight W'(i, t) described above by the weight given by equation (19) therefore yields weighting that emphasizes the transition portions.
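Equation (19) is not reproduced in this text. As an illustration only, the sketch below assumes a per-frame factor equal to the summed squared change of the LSP parameters between consecutive frames and multiplies the existing weights by it. The names and the exact form of the factor are assumptions.

```python
def change_weight(current, previous):
    """Assumed transition factor: summed squared LSP change between two frames."""
    return sum((c - p) ** 2 for c, p in zip(current, previous))


def emphasize_transitions(weights, current, previous):
    """Multiply one frame's weights by the transition factor of that frame."""
    factor = change_weight(current, previous)
    return [factor * w for w in weights]
```

A common factor on all weights of a frame does not change which codeword is nearest for that frame, so a factor of this kind mainly matters when distortions are accumulated across many frames, for example during codebook training, where frames with large LSP changes then count for more.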

【００９９】[0099]

【数１１】 [Equation 11]

【０１００】[0100]

Equation (20) may also be used instead of equation (19).

【０１０１】[0101]

【数１２】 (Equation 12)

【０１０２】[0102]

Thus, by performing two-stage matrix quantization and two-stage vector quantization, the LSP quantizer 134 can vary the number of bits of the output index.

【０１０３】[0103]

Next, FIG. 7 shows the basic configuration of the vector quantizer 116, and FIG. 8 shows a more specific configuration of the vector quantizer 116 of FIG. 7. A concrete example of weighted vector quantization of the spectral envelope (Am) in the vector quantizer 116 is described below.

【０１０４】[0104]

First, a concrete example is described of the data number conversion that makes the number of spectral envelope amplitude data constant. In the speech signal encoding apparatus of FIG. 3, this conversion is provided on the output side of the spectrum evaluation unit 148 or on the input side of the vector quantizer 116.

【０１０５】[0105]

Various methods are conceivable for this data number conversion. In the present embodiment, for example, dummy data that interpolates the values from the last data in a block to the first data in the block is appended to the amplitude data for one block of the effective band on the frequency axis, expanding the number of data to N_F. Band-limited O_S-times (for example, 8-times) oversampling is then applied to obtain O_S times as many amplitude data. These O_S-times data ((m_MX + 1) × O_S items) are linearly interpolated to expand them further to N_M items (for example, 2048), and the N_M data are decimated to convert them into the fixed number M (for example, 44) of data.

【０１０６】図７の重み付きベクトル量子化を行うベク
トル量子化器１１６は、第１のベクトル量子化工程を行
う第１のベクトル量子化部５００と、この第１のベクト
ル量子化部５００における第１のベクトル量子化の際の
量子化誤差ベクトルを量子化する第２のベクトル量子化
工程を行う第２のベクトル量子化部５１０とを少なくと
も有する。この第１のベクトル量子化部５００は、いわ
ゆる１段目のベクトル量子化部であり、第２のベクトル
量子化部５１０は、いわゆる２段目のベクトル量子化部
である。The vector quantizer 116 of FIG. 7, which performs weighted vector quantization, has at least a first vector quantization unit 500 that performs a first vector quantization step, and a second vector quantization unit 510 that performs a second vector quantization step of quantizing the quantization error vector produced by the first vector quantization in the first vector quantization unit 500. The first vector quantization unit 500 is the so-called first-stage vector quantization unit, and the second vector quantization unit 510 is the so-called second-stage vector quantization unit.

【０１０７】第１のベクトル量子化部５００の入力端子
５０１には、スペクトル評価部１４８の出力ベクトル
Ｘ、即ち一定個数Ｍのエンベロープデータが入力され
る。この出力ベクトルＸは、ベクトル量子化器５０２で
重み付きベクトル量子化される。これにより、ベクトル
量子化器５０２から出力されるシェイプインデクスは出
力端子５０３から出力され、また、量子化値Ｘ ₀’は出
力端子５０４から出力されると共に、加算器５０５、５
１３に送られる。加算器５０５では、出力ベクトルＸか
ら量子化値Ｘ ₀’が減算されて、複数次元の量子化誤差
ベクトルＹが得られる。The output vector X of the spectrum evaluation unit 148, that is, a fixed number M of envelope data, is input to the input terminal 501 of the first vector quantization unit 500. This output vector X undergoes weighted vector quantization in the vector quantizer 502. The shape index output from the vector quantizer 502 is output from the output terminal 503, and the quantized value X₀' is output from the output terminal 504 and is also sent to the adders 505 and 513. In the adder 505, the quantized value X₀' is subtracted from the output vector X to obtain a multidimensional quantization error vector Y.

【０１０８】この量子化誤差ベクトルＹは、第２のベク
トル量子化部５１０内のベクトル量子化部５１１に送ら
れる。このベクトル量子化部５１１は、複数個のベクト
ル量子化器で構成され、図７では、２個のベクトル量子
化器５１１₁、５１１₂から成る。量子化誤差ベクトルＹ
は次元分割されて、２個のベクトル量子化器５１１₁、
５１１₂で、それぞれ重み付きベクトル量子化される。
これらのベクトル量子化器５１１₁、５１１₂から出力さ
れるシェイプインデクスは、出力端子５１２₁、５１２₂
からそれぞれ出力され、また、量子化値Ｙ ₁’、Ｙ ₂’は
次元方向に接続されて、加算器５１３に送られる。この
加算器５１３では、量子化値Ｙ ₁’、Ｙ ₂’と量子化値Ｘ
₀’とが加算されて、量子化値Ｘ ₁’が生成される。この
量子化値Ｘ ₁’は出力端子５１４から出力される。This quantization error vector Y is sent to the vector quantization unit 511 in the second vector quantization unit 510. The vector quantization unit 511 is composed of a plurality of vector quantizers; in FIG. 7 it consists of two vector quantizers 511₁ and 511₂. The quantization error vector Y is split by dimension, and each part undergoes weighted vector quantization in the two vector quantizers 511₁ and 511₂. The shape indexes output from these vector quantizers are output from the output terminals 512₁ and 512₂, respectively, and the quantized values Y₁' and Y₂' are concatenated in the dimension direction and sent to the adder 513. The adder 513 adds the quantized values Y₁' and Y₂' to the quantized value X₀' to generate the quantized value X₁', which is output from the output terminal 514.
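The signal flow of FIG. 7 described above can be illustrated with a minimal sketch. It assumes plain (unweighted) nearest-neighbour quantizers and tiny toy codebooks; the function names `nearest` and `two_stage_vq` and all numeric values are hypothetical and are not taken from the specification.

```python
def nearest(codebook, x):
    """Return (index, code vector) of the entry closest to x in squared error."""
    best = min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])))
    return best, codebook[best]


def two_stage_vq(x, cb_first, cbs_second):
    """Quantize x, then quantize the dimension-split residual of the first stage."""
    idx0, x0 = nearest(cb_first, x)            # unit 500: quantized value X0'
    y = [a - b for a, b in zip(x, x0)]         # adder 505: Y = X - X0'
    indices, y_hat, pos = [], [], 0
    for cb in cbs_second:                      # quantizers 511_1, 511_2, ...
        dim = len(cb[0])
        idx, q = nearest(cb, y[pos:pos + dim])
        indices.append(idx)
        y_hat.extend(q)                        # concatenate along the dimension axis
        pos += dim
    x1 = [a + b for a, b in zip(x0, y_hat)]    # adder 513: X1' = X0' + Y'
    return idx0, indices, x0, x1


# Toy data (hypothetical): 4-dimensional input, residual split into 2 + 2 dimensions.
cb_first = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
cbs_second = [[[0.0, 0.0], [0.2, -0.2]], [[0.0, 0.0], [-0.1, 0.1]]]
x = [1.2, 0.8, 0.9, 1.1]
idx0, idxs, x0, x1 = two_stage_vq(x, cb_first, cbs_second)
```

A low-bit-rate configuration would transmit only `idx0`, while a high-bit-rate configuration would transmit `idx0` together with `idxs`.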

【０１０９】これにより、低ビットレート時には、上記
第１のベクトル量子化部５００による第１のベクトル量
子化工程での出力を取り出し、高ビットレート時には、
上記第１のベクトル量子化工程での出力及び上記第２の
量子化部５１０による第２のベクトル量子化工程での出
力を取り出す。Thus, at a low bit rate, the output of the first vector quantization step performed by the first vector quantization unit 500 is taken out; at a high bit rate, the output of the first vector quantization step and the output of the second vector quantization step performed by the second vector quantization unit 510 are both taken out.

【０１１０】具体的には、図８に示すように、ベクトル
量子化器１１６内の第１のベクトル量子化部５００のベ
クトル量子化器５０２は、Ｌ次元、例えば４４次元の２
ステージ構成としている。Specifically, as shown in FIG. 8, the vector quantizer 502 of the first vector quantization unit 500 in the vector quantizer 116 has an L-dimensional, for example 44-dimensional, two-stage configuration.

【０１１１】すなわち、４４次元でコードブックサイズ
が３２のベクトル量子化コードブックからの出力ベクト
ルの和に、ゲインｇ_iを乗じたものを、４４次元のスペ
クトルエンベロープベクトルＸの量子化値Ｘ ₀’として
使用する。これは、図８に示すように、２つのシェイプ
コードブックをＣＢ０、ＣＢ１とし、その出力ベクトル
をｓ _0i、ｓ _1j、ただし０≦ｉ，ｊ≦３１、とする。ま
た、ゲインコードブックＣＢｇの出力をｇ_l、ただし０
≦ｌ≦３１、とする。ｇ_lはスカラ値である。この最終
出力Ｘ ₀’は、ｇ_i（ｓ _0i＋ｓ _1j）となる。That is, the sum of the output vectors from the 44-dimensional vector quantization codebooks of codebook size 32, multiplied by a gain, is used as the quantized value X₀' of the 44-dimensional spectral envelope vector X. As shown in FIG. 8, let the two shape codebooks be CB0 and CB1, and let their output vectors be s_0i and s_1j, where 0 ≤ i, j ≤ 31. Let the output of the gain codebook CBg be g_l, where 0 ≤ l ≤ 31; g_l is a scalar value. The final output X₀' is then g_l(s_0i + s_1j).

【０１１２】ＬＰＣ残差について上記ＭＢＥ分析によっ
て得られたスペクトルエンベロープＡｍを一定次元に変
換したものをＸとする。このとき、Ｘをいかに効率的に
量子化するかが重要である。Let X be the spectral envelope Am, obtained by the above MBE analysis of the LPC residual, after conversion to a fixed dimension. The important question is then how to quantize X efficiently.

【０１１３】ここで、量子化誤差エネルギＥを、Ｅ＝‖Ｗ｛ＨＸ−Ｈｇ_l（ｓ _0i＋ｓ _1j）｝‖² ・・・（２１）＝‖ＷＨ｛Ｘ−ｇ_l（ｓ _0i＋ｓ _1j）｝‖² と定義する。この（２１）式において、ＨはＬＰＣの合
成フィルタの周波数軸上での特性であり、Ｗは聴覚重み
付けの周波数軸上での特性を表す重み付けのための行列
である。Here, the quantization error energy E is defined as $E = \| W \{ H X - H g_l (s_{0i} + s_{1j}) \} \|^2 = \| W H \{ X - g_l (s_{0i} + s_{1j}) \} \|^2$ ... (21). In equation (21), H is the frequency-axis characteristic of the LPC synthesis filter, and W is a weighting matrix representing the frequency-axis characteristic of the perceptual weighting.

【０１１４】現フレームのＬＰＣ分析結果によるαパラ
メータを、α_i（１≦ｉ≦Ｐ）として、With the α parameters from the LPC analysis of the current frame denoted by α_i (1 ≤ i ≤ P), consider the function

【０１１５】[0115]

【数１３】 (Equation 13)

【０１１６】の周波数特性からＬ次元、例えば４４次元
の各対応する点の値をサンプルしたものである。H consists of the values sampled from the frequency characteristic of this function at the L, for example 44, corresponding points.

【０１１７】算出手順としては、一例として、１、
α₁、α₂、・・・、α_pに０詰めして、すなわち、１、
α₁、α₂、・・・、α_p、０、０、・・・、０として、
例えば２５６点のデータにする。その後、２５６点ＦＦ
Ｔを行い、（ｒ_e ²＋Ｉ_m ²）^1/2を０〜πに対応する点に
対して算出して、その逆数をとる。それをＬ点、すなわ
ち例えば４４点に間引いたものを対角要素とする行列
を、As one example of the calculation procedure, the sequence 1, α₁, α₂, ..., α_p is zero-padded to 1, α₁, α₂, ..., α_p, 0, 0, ..., 0 to give, for example, 256 points of data. A 256-point FFT is then performed, $(r_e^2 + I_m^2)^{1/2}$ is calculated for the points corresponding to 0 to π, and its reciprocal is taken. The result is thinned out to L points, for example 44 points, and the matrix having these values as its diagonal elements is

【０１１８】[0118]

【数１４】 [Equation 14]

【０１１９】とする。It is assumed that

【０１２０】聴覚重み付け行列Ｗは、The perceptual weighting matrix W is

【０１２１】[0121]

【数１５】 (Equation 15)

【０１２２】とする。この（２３）式で、α_iは入力の
ＬＰＣ分析結果である。また、λa、λbは定数であり、
一例として、λa＝０．４、λb＝０．９が挙げられる。In equation (23), α_i is the LPC analysis result of the input. λa and λb are constants; one example is λa = 0.4 and λb = 0.9.

【０１２３】行列あるいはマトリクスＷは、上記（２
３）式の周波数特性から算出できる。一例として、１、
α₁λb、α₂λb²、・・・、α_pλb^p、０、０、・・・、
０として２５６点のデータとしてＦＦＴを行い、０以上
π以下の区間に対して（ｒ_e ²[ｉ]＋Ｉ_m ²[ｉ]）^1/2、０
≦ｉ≦１２８、を求める。次に、１、α₁λa、α₂λ
a²、・・・、α_pλa^p 、０、０、・・・、０として分母
の周波数特性を２５６点ＦＦＴで０〜πの区間を１２８
点で算出する。これを（ｒ_e'²[ｉ]＋Ｉ_m'²[ｉ]）^1/2、
０≦ｉ≦１２８、とする。The matrix W can be calculated from the frequency characteristic of equation (23). As one example, an FFT is performed on the 256-point data 1, α₁λb, α₂λb², ..., α_pλb^p, 0, 0, ..., 0, and $(r_e^2[i] + I_m^2[i])^{1/2}$, 0 ≤ i ≤ 128, is obtained for the interval from 0 to π. Next, the frequency characteristic of the denominator is calculated at 128 points over the interval from 0 to π by a 256-point FFT of 1, α₁λa, α₂λa², ..., α_pλa^p, 0, 0, ..., 0. Let this be $(r_e'^2[i] + I_m'^2[i])^{1/2}$, 0 ≤ i ≤ 128.

【０１２４】[0124]

【数１６】 (Equation 16)

【０１２５】として、上記（２３）式の周波数特性が求
められる。In this way, the frequency characteristic of equation (23) is obtained.

【０１２６】これをＬ次元、例えば４４次元ベクトルの
対応する点について、以下の方法で求める。より正確に
は、直線補間を用いるべきであるが、以下の例では最も
近い点の値で代用している。This is obtained by the following method for the corresponding points of the L-dimensional, for example, 44-dimensional vector. More precisely, linear interpolation should be used, but the following example substitutes the value of the closest point.

【０１２７】すなわち、 ω[ｉ]＝ω₀［nint(128ｉ/L)］１≦ｉ≦Ｌただし、nint（Ｘ）は、Ｘに最も近い整数を返す関数である。That is, $\omega[i] = \omega_0[\mathrm{nint}(128 i / L)]$, 1 ≤ i ≤ L, where nint(X) is a function that returns the integer closest to X.
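The procedure for obtaining the L weighting values can be sketched as follows. The equation images for (23) and for the ratio ω₀[i] are not reproduced in this text, so the sketch assumes, from the numerator and denominator sequences described above, that ω₀[i] is the numerator magnitude (λb sequence) divided by the denominator magnitude (λa sequence). A direct DFT over the non-zero coefficients stands in for the 256-point FFT, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_response(coeffs, n_fft=256):
    """Magnitude of the zero-padded DFT at bins 0..n_fft/2 (frequencies 0..pi)."""
    mags = []
    for k in range(n_fft // 2 + 1):
        # Zero padding adds nothing to the sum, so only the given taps are used.
        acc = sum(c * cmath.exp(-2j * cmath.pi * k * n / n_fft)
                  for n, c in enumerate(coeffs))
        mags.append(abs(acc))              # (r_e^2 + I_m^2)^(1/2)
    return mags


def nint(x):
    """Nearest integer for non-negative x."""
    return int(math.floor(x + 0.5))


def perceptual_weights(alpha, lam_a=0.4, lam_b=0.9, L=44, n_fft=256):
    """Sample the assumed weighting characteristic at L points (nearest bin)."""
    num = [1.0] + [a * lam_b ** (i + 1) for i, a in enumerate(alpha)]
    den = [1.0] + [a * lam_a ** (i + 1) for i, a in enumerate(alpha)]
    num_mag = magnitude_response(num, n_fft)
    den_mag = magnitude_response(den, n_fft)
    w0 = [n / d for n, d in zip(num_mag, den_mag)]   # assumed form of omega_0[i]
    half = n_fft // 2                                # 128 for a 256-point transform
    return [w0[nint(half * i / L)] for i in range(1, L + 1)]


# Hypothetical first-order example: a single alpha parameter.
weights = perceptual_weights([-0.9])
```

With an empty `alpha` list every weight equals 1, which is a convenient sanity check; linear interpolation between bins could replace the nearest-bin lookup where more accuracy is wanted.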

【０１２８】また、上記Ｈに関しても同様の方法で、h
(1)、h(2)、・・・、h(L)を求めている。すなわち、For H as well, h(1), h(2), ..., h(L) are obtained by the same method. That is,

【０１２９】[0129]

【数１７】 [Equation 17]

【０１３０】となる。Is obtained.

【０１３１】ここで、他の例として、ＦＦＴの回数を減
らすのに、Ｈ(ｚ)Ｗ(ｚ)を先に求めてから、周波数特性
を求めてもよい。すなわち、As another example, to reduce the number of FFTs, H(z)W(z) may be obtained first and its frequency characteristic obtained afterwards. That is,

【０１３２】[0132]

【数１８】 (Equation 18)

【０１３３】この（２５）式の分母を展開した結果を、The result of expanding the denominator of the equation (25) is

【０１３４】[0134]

【数１９】 [Equation 19]

【０１３５】とする。ここで、１、β₁、β₂、・・・、
β_2p、０、０、・・・、０として、例えば２５６点のデ
ータにする。その後、２５６点ＦＦＴを行い、振幅の周
波数特性を、Here, the sequence 1, β₁, β₂, ..., β_2p, 0, 0, ..., 0 is formed to give, for example, 256 points of data. A 256-point FFT is then performed, and the amplitude frequency characteristic is taken to be

【０１３６】[0136]

【数２０】 (Equation 20)

【０１３７】とする。これより、From this,

【０１３８】[0138]

【数２１】 (Equation 21)

【０１３９】これをＬ次元ベクトルの対応する点につい
て求める。上記ＦＦＴのポイント数が少ない場合は、直
線補間で求めるべきであるが、ここでは最寄りの値を使
用している。すなわち、This is calculated for the corresponding points of the L-dimensional vector. When the number of FFT points is small, it should be obtained by linear interpolation, but the nearest value is used here. That is,

【０１４０】[0140]

【数２２】 (Equation 22)

【０１４１】である。これを対角要素とする行列をＷ’
とすると、Letting W' be the matrix having these as its diagonal elements,

【０１４２】[0142]

【数２３】 (Equation 23)

【０１４３】となる。（２６）式は上記（２４）式と同
一のマトリクスとなる。Equation (26) gives the same matrix as equation (24).

【０１４４】このマトリクス、すなわち重み付き合成フ
ィルタの周波数特性を用いて、上記（２１）式を書き直
すと、Using the matrix, that is, the frequency characteristics of the weighted synthesis filter, the above equation (21) can be rewritten.

【０１４５】[0145]

【数２４】 (Equation 24)

【０１４６】となる。[0146]

【０１４７】ここで、シェイプコードブックとゲインコ
ードブックの学習法について説明する。Here, a method of learning the shape codebook and the gain codebook will be described.

【０１４８】先ず、ＣＢ０に関しコードベクトルｓ _0cを
選択する全てのフレームｋに関して歪の期待値を最小化
する。そのようなフレームがＭ個あるとして、First, the expected value of distortion is minimized for all frames k that select the code vector s _0c for CB0. Assuming there are M such frames,

【０１４９】[0149]

【数２５】 (Equation 25)

【０１５０】を最小化すればよい。この（２８）式中
で、Ｗ'_kはｋ番目のフレームに対する重み、Ｘ _kはｋ番
目のフレームの入力、ｇ_kはｋ番目のフレームのゲイ
ン、ｓ _1kはｋ番目のフレームについてのコードブックＣ
Ｂ１からの出力、をそれぞれ示す。It suffices to minimize this expression. In equation (28), W'_k is the weight for the k-th frame, X_k is the input of the k-th frame, g_k is the gain of the k-th frame, and s_1k is the output from codebook CB1 for the k-th frame.

【０１５１】この（２８）式を最小化するには、To minimize the equation (28),

【０１５２】[0152]

【数２６】 (Equation 26)

【０１５３】[0153]

【数２７】 [Equation 27]

【０１５４】次に、ゲインに関しての最適化を考える。Next, optimization regarding gain will be considered.

【０１５５】ゲインのコードワードｇ_cを選択するｋ番
目のフレームに関しての歪の期待値Ｊ_gは、The expected value J_g of the distortion over the k-th frames that select the gain codeword g_c is

【０１５６】[0156]

【数２８】 [Equation 28]

【０１５７】上記（３１）式及び（３２）式は、シェイ
プｓ _0i、ｓ _1i及びゲインｇ_i、０≦ｉ≦３１の最適なセ
ントロイドコンディション(Centroid Condition)、すな
わち最適なデコーダ出力を与えるものである。なお、ｓ
_1iに関してもｓ _0iと同様に求めることができる。Equations (31) and (32) above give the optimum centroid condition for the shapes s_0i, s_1i and the gain g_i, 0 ≤ i ≤ 31, that is, the optimum decoder output. s_1i can be obtained in the same way as s_0i.

【０１５８】次に、最適エンコード条件（Nearest Neig
hbour Condition ）を考える。Next, the optimum encoding condition (nearest neighbour condition) is considered.

【０１５９】歪尺度を求める上記（２７）式、すなわ
ち、Ｅ＝‖Ｗ'（Ｘ−ｇ_l（ｓ _0i＋ｓ _1j））‖²を最小化
するｓ _0i、ｓ _1jを、入力Ｘ、重みマトリクスＷ' が与え
られる毎に、すなわち毎フレームごとに決定する。The s_0i and s_1j that minimize the distortion measure of equation (27), that is, $E = \| W'(X - g_l(s_{0i} + s_{1j})) \|^2$, are determined each time the input X and the weight matrix W' are given, that is, for every frame.

【０１６０】本来は、総当り的に全てのｇ_l（０≦ｌ≦
３１）、ｓ _0i（０≦ｉ≦３１）、ｓ ₁ _j（０≦ｊ≦３１）
の組み合せの、３２×３２×３２＝３２７６８通りにつ
いてＥを求めて、最小のＥを与えるｇ_l 、ｓ _0i、ｓ _1jの
組を求めるべきであるが、膨大な演算量となるので、本
実施の形態では、シェイプとゲインのシーケンシャルサ
ーチを行っている。なお、ｓ _0iとｓ _1jとの組み合せにつ
いては、総当りサーチを行うものとする。これは、３２
×３２＝１０２４通りである。以下の説明では、簡単化
のため、ｓ _0i＋ｓ _1jをｓ _mと記す。Ideally, E should be computed exhaustively for all 32 × 32 × 32 = 32768 combinations of g_l (0 ≤ l ≤ 31), s_0i (0 ≤ i ≤ 31), and s_1j (0 ≤ j ≤ 31) to find the set of g_l, s_0i, s_1j that gives the minimum E. Because this requires an enormous amount of computation, the present embodiment performs a sequential search of shape and gain. The combinations of s_0i and s_1j are searched exhaustively; there are 32 × 32 = 1024 of them. In the following description, s_0i + s_1j is written as s_m for simplicity.

【０１６１】上記（２７）式は、Ｅ＝‖Ｗ'（Ｘ−ｇ_lｓ
_m）‖² となる。さらに簡単のため、Ｘ _w＝Ｗ'Ｘ、ｓ _w
＝Ｗ'ｓ _mとすると、Equation (27) then becomes $E = \| W'(X - g_l s_m) \|^2$. For further simplicity, letting $X_w = W'X$ and $s_w = W's_m$,

【０１６２】[0162]

【数２９】 (Equation 29)

【０１６３】となる。従って、ｇ_l の精度が充分にとれ
ると仮定すると、Therefore, assuming that g_l can be given sufficient precision,

【０１６４】[0164]

【数３０】 [Equation 30]

【０１６５】という２つのステップに分けてサーチする
ことができる。元の表記を用いて書き直すと、The search can be performed by dividing it into two steps. Rewriting using the original notation,

【０１６６】[0166]

【数３１】 (Equation 31)

【０１６７】となる。この（３５）式が最適エンコード
条件(Nearest Neighbour Condition)である。Equation (35) is the optimum encoding condition (nearest neighbour condition).
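The two-step search can be illustrated with a short sketch. Because the equation images for (33) to (35) are not reproduced in this text, the sketch assumes the usual gain-shape decomposition: step 1 picks the shape pair that maximizes (X_wᵀs_w)² / ‖s_w‖², and step 2 picks the gain codeword closest to X_wᵀs_w / ‖s_w‖². W' is assumed diagonal and is passed as a list; `sequential_search` and the toy codebooks are hypothetical.

```python
def sequential_search(x, w_diag, cb0, cb1, gains):
    """Shape-then-gain search over two shape codebooks and one gain codebook."""
    xw = [w * a for w, a in zip(w_diag, x)]                      # X_w = W'X
    best = None
    for i, s0 in enumerate(cb0):                                 # exhaustive over (i, j)
        for j, s1 in enumerate(cb1):
            sw = [w * (a + b) for w, a, b in zip(w_diag, s0, s1)]  # s_w = W'(s_0i + s_1j)
            energy = sum(v * v for v in sw)
            if energy == 0.0:
                continue                                         # an all-zero shape cannot be scaled
            corr = sum(a * b for a, b in zip(xw, sw))
            score = corr * corr / energy
            if best is None or score > best[0]:
                best = (score, i, j, corr / energy)              # keep the ideal gain too
    if best is None:
        raise ValueError("all shape combinations are zero vectors")
    _, i, j, g_ideal = best
    l = min(range(len(gains)), key=lambda k: abs(gains[k] - g_ideal))
    return i, j, l


# Toy example (hypothetical values): 2-dimensional vectors, unit weights.
cb0 = [[1.0, 0.0], [0.0, 1.0]]
cb1 = [[0.0, 0.0], [0.5, 0.5]]
gains = [0.5, 1.0, 2.0]
result = sequential_search([2.0, 0.1], [1.0, 1.0], cb0, cb1, gains)
```

With 32-entry codebooks this evaluates 1024 shape pairs plus 32 gains instead of 32768 triples; the result matches the exhaustive search only to the extent that the gain codebook is fine enough, which is the assumption stated in the text.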

【０１６８】ここで上記（３１）、（３２）式の条件
（Centroid Condition）と、（３５）式の条件を用い
て、ＬＢＧ(Linde-Buzo-Gray)アルゴリズム、いわゆる
一般化ロイドアルゴリズム（Generalized Lloyd Algori
thm:ＧＬＡ）によりコードブック（ＣＢ０、ＣＢ１、Ｃ
Ｂｇ）を同時にトレーニングできる。Using the conditions of equations (31) and (32) (centroid condition) and the condition of equation (35), the codebooks (CB0, CB1, CBg) can be trained simultaneously by the LBG (Linde-Buzo-Gray) algorithm, the so-called generalized Lloyd algorithm (GLA).

【０１６９】ところで、ベクトル量子化器１１６でのベ
クトル量子化の際の聴覚重み付けに用いられる重みＷ’
については、上記（２６）式で定義されているが、過去
のＷ’も加味して現在のＷ’を求めることにより、テン
ポラルマスキングも考慮したＷ’が求められる。The weight W' used for perceptual weighting during vector quantization in the vector quantizer 116 is defined by equation (26); however, a W' that also takes temporal masking into account can be obtained by deriving the current W' with past values of W' taken into account.

【０１７０】上記（２６）式中のwh(1),wh(2),・・・,w
h(L)に関して、時刻ｎ、すなわち第ｎフレームで算出さ
れたものをそれぞれwh_n(1),wh_n(2),・・・,wh_n(L) とす
る。Let wh(1), wh(2), ..., wh(L) in equation (26) above, as calculated at time n, that is, for the n-th frame, be denoted wh_n(1), wh_n(2), ..., wh_n(L).

【０１７１】時刻ｎで過去の値を考慮した重みをＡ
_n(i)、１≦ｉ≦Ｌと定義すると、Ａ_n(i)＝λＡ_n-1(i)＋（１−λ）wh_n(i) （wh_n(i)≦Ａ_n-1(i)）＝wh_n(i) （wh_n(i)＞Ａ_n-1(i)）とする。ここで、λは例えばλ＝０．２とすればよい。
このようにして求められたＡ_n(i)、１≦ｉ≦Ｌについ
て、これを対角要素とするマトリクスを上記重みとして
用いればよい。Defining the weight at time n that takes past values into account as A_n(i), 1 ≤ i ≤ L, let $A_n(i) = \lambda A_{n-1}(i) + (1 - \lambda)\, wh_n(i)$ when $wh_n(i) \le A_{n-1}(i)$, and $A_n(i) = wh_n(i)$ when $wh_n(i) > A_{n-1}(i)$. Here λ may be set to, for example, λ = 0.2. A matrix having the A_n(i), 1 ≤ i ≤ L, obtained in this way as its diagonal elements may be used as the above weight.
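The recursion for A_n(i) can be written directly as code. This is a minimal sketch; the function name `update_masking_weights` and the sample values are hypothetical, and λ = 0.2 is the example value given in the text.

```python
def update_masking_weights(prev_a, wh_n, lam=0.2):
    """Return A_n from A_{n-1} (prev_a) and the current-frame weights wh_n."""
    return [
        lam * a + (1.0 - lam) * w if w <= a else w   # smooth on decay, follow on rise
        for a, w in zip(prev_a, wh_n)
    ]


# Hypothetical two-point example: the first weight falls, the second rises.
a_n = update_masking_weights([1.0, 1.0], [0.5, 2.0])
```

A rising weight is adopted immediately, while a falling weight is blended with the previous frame's value, so a strongly weighted frame still influences the frame that follows it.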

【０１７２】このように重み付きベクトル量子化により
得られたシェイプインデクスｓ _0i、ｓ _1jは、出力端子５
２０、５２２からそれぞれ出力され、ゲインインデクス
ｇ_lは、出力端子５２１から出力される。また、量子化
値Ｘ ₀’は、出力端子５０４から出力されると共に、加
算器５０５に送られる。The shape indexes s_0i and s_1j obtained by this weighted vector quantization are output from the output terminals 520 and 522, respectively, and the gain index g_l is output from the output terminal 521. The quantized value X₀' is output from the output terminal 504 and is also sent to the adder 505.

【０１７３】この加算器５０５では、出力ベクトルＸか
ら量子化値Ｘ ₀’が減算されて、量子化誤差ベクトルＹ
が生成される。この量子化誤差ベクトルＹは、具体的に
は、８個のベクトル量子化器５１１₁〜５１１₈から成る
ベクトル量子化部５１１に送られて、次元分割され、各
ベクトル量子化器５１１₁〜５１１₈で重み付きのベクト
ル量子化が施される。In the adder 505, the quantized value X₀' is subtracted from the output vector X to generate the quantization error vector Y. Specifically, this quantization error vector Y is sent to the vector quantization unit 511, which consists of eight vector quantizers 511₁ to 511₈; it is split by dimension, and each part undergoes weighted vector quantization in the vector quantizers 511₁ to 511₈.

【０１７４】第２のベクトル量子化部５１０では、第１
のベクトル量子化部５００と比較して、かなり多くのビ
ット数を用いるため、コードブックのメモリ容量及びコ
ードブックサーチのための演算量（Complexity）が非常
に大きくなり、第１のベクトル量子化部５００と同じ４
４次元のままでベクトル量子化を行うことは、不可能で
ある。そこで、第２のベクトル量子化部５１０内のベク
トル量子化部５１１を複数個のベクトル量子化器で構成
し、入力される量子化値を次元分割して、複数個の低次
元ベクトルとして、重み付きのベクトル量子化を行う。The second vector quantization unit 510 uses considerably more bits than the first vector quantization unit 500, so the codebook memory capacity and the amount of computation (complexity) for the codebook search become very large, and it is impossible to perform vector quantization in the same 44 dimensions as the first vector quantization unit 500. Therefore, the vector quantization unit 511 in the second vector quantization unit 510 is composed of a plurality of vector quantizers, and the input quantized value is split by dimension into a plurality of low-dimensional vectors, each of which undergoes weighted vector quantization.

【０１７５】ベクトル量子化器５１１₁〜５１１₈で用い
る各量子化値Ｙ ₀〜Ｙ ₇と、次元数と、ビット数との関係
を、表２に示す。Table 2 shows the relationship among the quantized values Y₀ to Y₇ used in the vector quantizers 511₁ to 511₈, the number of dimensions, and the number of bits.

【０１７６】[0176]

【表２】 [Table 2]

【０１７７】ベクトル量子化器５１１₁〜５１１₈から出
力されるインデクスＩｄｖｑ₀〜Ｉｄｖｑ₇は、各出力端
子５２３₁〜５２３₈からそれぞれ出力される。これらの
インデクスの合計は７２ビットである。The indexes Idvq₀ to Idvq₇ output from the vector quantizers 511₁ to 511₈ are output from the output terminals 523₁ to 523₈, respectively. These indexes total 72 bits.

【０１７８】また、ベクトル量子化器５１１₁〜５１１₈
から出力される量子化値Ｙ ₀’〜Ｙ _７’を次元方向に接
続した値をＹ’とすると、加算器５１３では、量子化値
Ｙ’と量子化値Ｘ _０’とが加算されて、量子化値Ｘ ₁’
が得られる。よって、この量子化値Ｘ ₁’は、Ｘ ₁ ’＝Ｘ ₀’＋Ｙ’ ＝Ｘ−Ｙ＋Ｙ’ で表される。すなわち、最終的な量子化誤差ベクトル
は、Ｙ’−Ｙとなる。Further, letting Y' be the value obtained by concatenating, in the dimension direction, the quantized values Y₀' to Y₇' output from the vector quantizers 511₁ to 511₈, the adder 513 adds the quantized value Y' and the quantized value X₀' to obtain the quantized value X₁'. This quantized value is therefore expressed as $X_1' = X_0' + Y' = X - Y + Y'$. That is, the final quantization error vector is Y' − Y.

【０１７９】尚、音声信号復号化装置側では、この第２
のベクトル量子化部５１０からの量子化値Ｘ ₁’を復号
化するときには、第１のベクトル量子化部５００からの
量子化値Ｘ ₀’は不要であるが、第１のベクトル量子化
部５００及び第２のベクトル量子化部５１０からのイン
デクスは必要とする。On the speech signal decoder side, decoding the quantized value X₁' from the second vector quantization unit 510 does not require the quantized value X₀' from the first vector quantization unit 500, but it does require the indexes from both the first vector quantization unit 500 and the second vector quantization unit 510.

【０１８０】次に、上記ベクトル量子化部５１１におけ
る学習法及びコードブックサーチについて説明する。Next, the learning method and codebook search in the vector quantizer 511 will be described.

【０１８１】先ず、学習法においては、量子化誤差ベク
トルＹ及び重みＷ’を用い、表２に示すように、８つの
低次元ベクトルＹ ₀〜Ｙ ₇及びマトリクスに分割する。こ
のとき、重みＷ’は、例えば４４点に間引いたものを対
角要素とする行列、First, in the learning method, the quantization error vector Y and the weight W' are each divided, as shown in Table 2, into eight low-dimensional vectors Y₀ to Y₇ and eight matrices. If the weight W' is, for example, a matrix whose diagonal elements are the values thinned out to 44 points,

【０１８２】[0182]

【数３２】 (Equation 32)

【０１８３】とすると、以下の８つの行列に分割され
る。Then, it is divided into the following eight matrices.

【０１８４】[0184]

【数３３】 [Equation 33]

【０１８５】このように、Ｙ及びＷ’の低次元に分割さ
れたものを、それぞれＹ _i 、Ｗ_i’ （１≦ｉ≦８）とする。The low-dimensional parts of Y and W' obtained by this division are denoted Y_i and W_i' (1 ≤ i ≤ 8), respectively.

【０１８６】ここで、歪尺度Ｅを、Ｅ＝‖Ｗ_i'（Ｙ _i−ｓ）‖² ・・・（３７）と定義する。このコードベクトルｓはＹ _iの量子化結果
であり、歪尺度Ｅを最小化する、コードブックのコード
ベクトルｓがサーチされる。Here, the distortion measure E is defined as $E = \| W_i'(Y_i - s) \|^2$ ... (37). The code vector s is the quantization result of Y_i, and the codebook is searched for the code vector s that minimizes the distortion measure E.

【０１８７】尚、Ｗ_i’は、学習時には重み付けがあ
り、サーチ時には重み付け無し、すなわち単位行列と
し、学習時とコードブックサーチ時とでは異なる値を用
いるようにしてもよい。Note that W _i ′ may be weighted at the time of learning and not weighted at the time of searching, that is, may be a unit matrix, and different values may be used at the time of learning and at the time of codebook search.
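The codebook search under the distortion measure of equation (37) can be sketched as below, assuming a diagonal W_i' passed as a list of per-dimension weights. Passing all ones corresponds to the unweighted (unit-matrix) search mentioned above. The function name `weighted_search` and the sample data are hypothetical.

```python
def weighted_search(y_i, w_diag, codebook):
    """Index of the code vector s that minimizes ||W_i'(Y_i - s)||^2."""
    def distortion(s):
        return sum((w * (a - b)) ** 2 for w, a, b in zip(w_diag, y_i, s))
    return min(range(len(codebook)), key=lambda k: distortion(codebook[k]))


# Hypothetical 2-dimensional sub-vector and 2-entry codebook.
codebook = [[0.0, 0.0], [1.0, 0.8]]
y_i = [1.0, 0.0]
unweighted = weighted_search(y_i, [1.0, 1.0], codebook)   # unit-matrix weight
weighted = weighted_search(y_i, [0.1, 1.0], codebook)     # first dimension de-emphasized
```

In this toy case the two searches select different code vectors, which shows why the choice of weight at search time matters.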

【０１８８】また、コードブックの学習では、一般化ロ
イドアルゴリズム（ＧＬＡ）を用い、さらに重み付けを
行っている。先ず、学習のための最適なセントロイドコ
ンディションについて説明する。コードベクトルｓを最
適な量子化結果として選択した入力ベクトルＹがＭ個あ
る場合に、トレーニングデータをＹ _kとすると、歪の期
待値Ｊは、全てのフレームｋに関して重み付け時の歪の
中心を最小化するような（３８）式となる。Codebook learning uses the generalized Lloyd algorithm (GLA) with weighting added. First, the optimum centroid condition for learning is described. When there are M input vectors Y that select the code vector s as the optimum quantization result, and the training data are denoted Y_k, the expected distortion J is given by equation (38), which minimizes the center of the weighted distortion over all frames k.

【０１８９】[0189]

【数３４】 (Equation 34)

【０１９０】上記（３９）式で示すｓは最適な代表ベク
トルであり、最適なセントロイドコンディションであ
る。The s given by equation (39) above is the optimum representative vector and satisfies the optimum centroid condition.
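For illustration, the centroid update can be sketched for the case of diagonal weights. The equation images for (38) and (39) are not reproduced in this text, so the sketch assumes that J is the mean of ‖W_k'(Y_k − s)‖² over the M training vectors assigned to s; for diagonal W_k' the minimizing s is then a per-dimension weighted average. The function name `weighted_centroid` and the data are hypothetical.

```python
def weighted_centroid(vectors, weights):
    """Minimize sum_k ||W_k (Y_k - s)||^2 for diagonal W_k given as lists."""
    dim = len(vectors[0])
    num = [0.0] * dim
    den = [0.0] * dim
    for y, w in zip(vectors, weights):
        for d in range(dim):
            num[d] += w[d] ** 2 * y[d]     # accumulates W_k^T W_k Y_k
            den[d] += w[d] ** 2            # accumulates W_k^T W_k
    return [n / d for n, d in zip(num, den)]


# Hypothetical training cell with two vectors.
cell = [[0.0, 2.0], [4.0, 2.0]]
plain = weighted_centroid(cell, [[1.0, 1.0], [1.0, 1.0]])
skewed = weighted_centroid(cell, [[1.0, 1.0], [3.0, 1.0]])
```

With equal weights this reduces to the ordinary mean; a larger weight pulls the centroid toward the corresponding training vector in that dimension.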

【０１９１】また、最適エンコード条件は、‖Ｗ_i'（Ｙ
_i−ｓ）‖² の値を最小化するｓをサーチすればよい。
ここで、サーチ時のＷ_i'は、必ずしも学習時と同じＷ_i'
である必要はなく、重み無しでFor the optimum encoding condition, it suffices to search for the s that minimizes $\| W_i'(Y_i - s) \|^2$. The W_i' used in the search need not be the same as the W_i' used in learning; the unweighted matrix

【０１９２】[0192]

【数３５】 (Equation 35)

【０１９３】のマトリクスとしてもよい。may be used instead.

【０１９４】このように、音声信号符号化装置内のベク
トル量子化部１１６を２段のベクトル量子化部から構成
することにより、出力するインデクスのビット数を可変
にすることができる。As described above, by configuring the vector quantizer 116 in the speech signal coding apparatus with the two-stage vector quantizer, the number of bits of the output index can be made variable.

【０１９５】次に、本発明の前記ＣＥＬＰ符号化構成を
用いた第２の符号化部１２０は、より具体的には図９に
示すような、多段のベクトル量子化処理部（図９の例で
は２段の符号化部１２０₁と１２０₂）の構成を有するも
のとなされている。なお、当該図９の構成は、伝送ビッ
トレートを例えば前記２ｋｂｐｓと６ｋｂｐｓとで切り
換え可能な場合において、６ｋｂｐｓの伝送ビットレー
トに対応した構成を示しており、さらにシェイプ及びゲ
インインデクス出力を２３ビット／５ｍｓｅｃと１５ビ
ット／５ｍｓｅｃとで切り換えられるようにしているも
のである。また、この図９の構成における処理の流れは
図１０に示すようになっている。Next, the second encoding unit 120, which uses the CELP encoding configuration of the present invention, more specifically has a multi-stage vector quantization configuration as shown in FIG. 9 (in the example of FIG. 9, two stages of encoding units 120₁ and 120₂). The configuration of FIG. 9 corresponds to a transmission bit rate of 6 kbps in the case where the transmission bit rate can be switched between, for example, 2 kbps and 6 kbps, and it further allows the shape and gain index output to be switched between 23 bits/5 msec and 15 bits/5 msec. The flow of processing in the configuration of FIG. 9 is shown in FIG. 10.

【０１９６】この図９において、例えば、図９の第１の
符号化部２００は前記図３の第１の符号化部１１３と略
々対応し、図９のＬＰＣ分析回路３０２は前記図３に示
したＬＰＣ分析回路１３２と対応し、図９のＬＳＰパラ
メータ量子化回路３０３は図３の前記α→ＬＳＰ変換回
路１３３からＬＳＰ→α変換回路１３７までの構成と対
応し、図９の聴覚重み付けフィルタ３０４は図３の前記
聴覚重み付けフィルタ算出回路１３９及び聴覚重み付け
フィルタ１２５と対応している。したがって、この図９
において、端子３０５には前記図３の第１の符号化部１
１３のＬＳＰ→α変換回路１３７からの出力と同じもの
が供給され、また、端子３０７には前記図３の聴覚重み
付けフィルタ算出回路１３９からの出力と同じものが、
端子３０６には前記図３の聴覚重み付けフィルタ１２５
からの出力と同じものが供給される。ただし、この図５
の聴覚重み付けフィルタ３０４では、前記図３の聴覚重
み付けフィルタ１２５とは異なり、前記ＬＳＰ→α変換
回路１３７の出力を用いずに、入力音声データと量子化
前のαパラメータとから、前記聴覚重み付けした信号
（すなわち前記図３の聴覚重み付けフィルタ１２５から
の出力と同じ信号）を生成している。In FIG. 9, for example, the first encoding unit 200 of FIG. 9 corresponds approximately to the first encoding unit 113 of FIG. 3, the LPC analysis circuit 302 of FIG. 9 corresponds to the LPC analysis circuit 132 shown in FIG. 3, the LSP parameter quantization circuit 303 of FIG. 9 corresponds to the configuration from the α→LSP conversion circuit 133 to the LSP→α conversion circuit 137 of FIG. 3, and the perceptual weighting filter 304 of FIG. 9 corresponds to the perceptual weighting filter calculation circuit 139 and the perceptual weighting filter 125 of FIG. 3. Accordingly, in FIG. 9, the terminal 305 is supplied with the same signal as the output of the LSP→α conversion circuit 137 of the first encoding unit 113 of FIG. 3, the terminal 307 is supplied with the same signal as the output of the perceptual weighting filter calculation circuit 139 of FIG. 3, and the terminal 306 is supplied with the same signal as the output of the perceptual weighting filter 125 of FIG. 3. However, unlike the perceptual weighting filter 125 of FIG. 3, the perceptual weighting filter 304 of FIG. 9 does not use the output of the LSP→α conversion circuit 137; it generates the perceptually weighted signal (that is, the same signal as the output of the perceptual weighting filter 125 of FIG. 3) from the input speech data and the α parameters before quantization.

【０１９７】また、この図９に示す２段構成の第２の符
号化部１２０₁及び１２０₂において、減算器３１３及び
３２３は図３の減算器１２３と対応し、距離計算回路３
１４及び３２４は図３の距離計算回路１２４と、ゲイン
回路３１１及び３２１は図３のゲイン回路１２６と、ス
トキャスティックコードブック３１０，３２０及びゲイ
ンコードブック３１５，３２５は図３の雑音符号帳１２
１とそれぞれ対応している。In the two-stage second encoding units 120₁ and 120₂ shown in FIG. 9, the subtractors 313 and 323 correspond to the subtractor 123 of FIG. 3, the distance calculation circuits 314 and 324 correspond to the distance calculation circuit 124 of FIG. 3, the gain circuits 311 and 321 correspond to the gain circuit 126 of FIG. 3, and the stochastic codebooks 310 and 320 and the gain codebooks 315 and 325 correspond to the noise codebook 121 of FIG. 3.

【０１９８】このような図９の構成において、先ず、図
１０のステップＳ１に示すように、ＬＰＣ分析回路３０
２では、端子３０１から供給された入力音声データｘを
前述同様に適当なフレームに分割してＬＰＣ分析を行
い、αパラメータを求める。ＬＳＰパラメータ量子化回
路３０３では、上記ＬＰＣ分析回路３０２からのαパラ
メータをＬＳＰパラメータに変換して量子化し、さらに
この量子化したＬＳＰパラメータを補間した後、αパラ
メータに変換する。次に、当該ＬＳＰパラメータ量子化
回路３０３では、当該量子化したＬＳＰパラメータを変
換したαパラメータ、すなわち量子化されたαパラメー
タから、ＬＰＣ合成フィルタ関数１／Ｈ（ｚ）を生成
し、これを端子３０５を介して１段目の第２の符号化部
１２０₁の聴覚重み付き合成フィルタ３１２に送る。In the configuration of FIG. 9, first, as shown in step S1 of FIG. 10, the LPC analysis circuit 302 divides the input speech data x supplied from the terminal 301 into appropriate frames as described above and performs LPC analysis to obtain the α parameters. The LSP parameter quantization circuit 303 converts the α parameters from the LPC analysis circuit 302 into LSP parameters and quantizes them, then interpolates the quantized LSP parameters and converts them back into α parameters. Next, the LSP parameter quantization circuit 303 generates the LPC synthesis filter function 1/H(z) from the α parameters converted from the quantized LSP parameters, that is, from the quantized α parameters, and sends it via the terminal 305 to the perceptually weighted synthesis filter 312 of the first-stage second encoding unit 120₁.

【０１９９】一方、聴覚重み付けフィルタ３０４では、
ＬＰＣ分析回路３０２からのαパラメータ（すなわち量
子化前のαパラメータ）から、前記図３の聴覚重み付け
フィルタ算出回路１３９によるものと同じ聴覚重み付け
のためのデータを求め、この重み付けのためのデータが
端子３０７を介して、１段目の第２の符号化部１２０₁
の聴覚重み付き合成フィルタ３１２に送られる。また、
当該聴覚重み付けフィルタ３０４では、図１０のステッ
プＳ２に示すように、入力音声データと量子化前のαパ
ラメータとから、前記聴覚重み付けした信号（前記図３
の聴覚重み付けフィルタ１２５からの出力と同じ信号）
を生成する。すなわち、先ず、量子化前のαパラメータ
から聴覚重み付けフィルタ関数Ｗ（ｚ）を生成し、さら
に入力音声データｘに当該フィルタ関数Ｗ（ｚ）をかけ
てｘ _Wを生成し、これを上記聴覚重み付けした信号とし
て、端子３０６を介して１段目の第２の符号化部１２０
₁の減算器３１３に送る。Meanwhile, the perceptual weighting filter 304 obtains, from the α parameters from the LPC analysis circuit 302 (that is, the α parameters before quantization), the same perceptual weighting data as that produced by the perceptual weighting filter calculation circuit 139 of FIG. 3, and this weighting data is sent via the terminal 307 to the perceptually weighted synthesis filter 312 of the first-stage second encoding unit 120₁. In addition, as shown in step S2 of FIG. 10, the perceptual weighting filter 304 generates the perceptually weighted signal (the same signal as the output of the perceptual weighting filter 125 of FIG. 3) from the input speech data and the α parameters before quantization. That is, it first generates the perceptual weighting filter function W(z) from the α parameters before quantization, then applies W(z) to the input speech data x to generate x_W, and sends this as the perceptually weighted signal via the terminal 306 to the subtractor 313 of the first-stage second encoding unit 120₁.

【０２００】１段目の第２の符号化部１２０₁では、９
ビットシェイプインデクス出力のストキャスティックコ
ードブック（stochastic code book）３１０からの代表
値出力（無声音のＬＰＣ残差に相当するノイズ出力）が
ゲイン回路３１１に送られ、このゲイン回路３１１に
て、ストキャスティックコードブック３１０からの代表
値出力に６ビットゲインインデクス出力のゲインコード
ブック３１５からのゲイン（スカラ値）を乗じ、このゲ
イン回路３１１にてゲインが乗じられた代表値出力が、
１／Ａ（ｚ）＝（１／Ｈ（ｚ））・Ｗ（ｚ）の聴覚重み
付きの合成フィルタ３１２に送られる。この重み付きの
合成フィルタ３１２からは、図１０のステップＳ３のよ
うに、１／Ａ（ｚ）のゼロ入力応答出力が減算器３１３
に送られる。当該減算器３１３では、上記聴覚重み付き
合成フィルタ３１２からのゼロ入力応答出力と、上記聴
覚重み付けフィルタ３０４からの上記聴覚重み付けした
信号ｘ _Wとを用いた減算が行われ、この差分或いは誤差
が参照ベクトルｒとして取り出される。図１０のステッ
プＳ４に示すように、１段目の第２の符号化部１２０₁
でのサーチ時には、この参照ベクトルｒが、距離計算回
路３１４に送られ、ここで距離計算が行われ、量子化誤
差エネルギＥを最小にするシェイプベクトルｓとゲイン
ｇがサーチされる。ただし、ここでの１／Ａ（ｚ）はゼ
ロ状態である。すなわち、コードブック中のシェイプベ
クトルｓをゼロ状態の１／Ａ（ｚ）で合成したものをｓ
_synとするとき、式（４０）を最小にするシェイプベク
トルｓとゲインｇをサーチする。In the first-stage second encoding unit 120₁, the representative value output (a noise output corresponding to the LPC residual of unvoiced sound) from the stochastic codebook 310, which has a 9-bit shape index output, is sent to the gain circuit 311. The gain circuit 311 multiplies this representative value output by the gain (a scalar value) from the gain codebook 315, which has a 6-bit gain index output, and the gain-multiplied representative value output is sent to the perceptually weighted synthesis filter 312 with 1/A(z) = (1/H(z))·W(z). From this weighted synthesis filter 312, as in step S3 of FIG. 10, the zero-input response output of 1/A(z) is sent to the subtractor 313. The subtractor 313 performs a subtraction using the zero-input response output from the perceptually weighted synthesis filter 312 and the perceptually weighted signal x_W from the perceptual weighting filter 304, and the difference, or error, is taken out as the reference vector r. As shown in step S4 of FIG. 10, during the search in the first-stage second encoding unit 120₁, this reference vector r is sent to the distance calculation circuit 314, where the distance is calculated and the shape vector s and gain g that minimize the quantization error energy E are searched for. Here, 1/A(z) is in the zero state. That is, when s_syn denotes the shape vector s in the codebook synthesized by 1/A(z) in the zero state, the shape vector s and gain g that minimize equation (40) are searched for.

【０２０１】[0201]

【数３６】 [Equation 36]

【０２０２】ここで、量子化誤差エネルギＥを最小とす
るｓとｇをフルサーチしてもよいが、計算量を減らすた
めに、以下のような方法をとることができる。Here, s and g which minimize the quantization error energy E may be fully searched, but the following method can be adopted in order to reduce the amount of calculation.

【０２０３】第１の方法として、以下の式（４１）に定
義するＥ_sを最小とするシェイプベクトルｓをサーチす
る。As a first method, a shape vector s that minimizes E _s defined in the following equation (41) is searched for.

【０２０４】[0204]

【数３７】 (37)

【０２０５】第２の方法として、第１の方法により得ら
れたｓより、理想的なゲインは、式（４２）のようにな
るから、式（４３）を最小とするｇをサーチする。As the second method, since the ideal gain is as shown in equation (42) from s obtained by the first method, g which minimizes equation (43) is searched for.

【０２０６】[0206]

【数３８】 (38)

【０２０７】Ｅ_g＝（ｇ_ref−ｇ）² （４３）ここで、Ｅはｇの二次関数であるから、Ｅ_gを最小にす
るｇはＥを最小化する。$E_g = (g_{ref} - g)^2$ ... (43). Here, since E is a quadratic function of g, the g that minimizes E_g also minimizes E.

【０２０８】上記第１，第２の方法によって得られたｓ
とｇより、量子化誤差ベクトルｅ（ｎ）は次の式（４
４）のように計算できる。From the s and g obtained by the first and second methods above, the quantization error vector e(n) can be calculated as in the following equation (44).

【０２０９】ｅ（ｎ）＝ｒ（ｎ）−ｇｓ _syn（ｎ）（４４）これを、２段目の第２の符号化部１２０₂のリファレン
ス入力として１段目と同様にして量子化する。$e(n) = r(n) - g\, s_{syn}(n)$ ... (44). This is quantized as the reference input of the second-stage second encoding unit 120₂ in the same way as in the first stage.
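The first-stage search and the hand-off to the second stage can be sketched as follows. The equation images for (40) to (42) are not reproduced in this text, so the sketch assumes the common formulation in which E = Σ(r(n) − g·s_syn(n))², the shape is chosen to minimize the error that remains under its ideal gain, and g_ref = Σ r(n)s_syn(n) / Σ s_syn(n)². The shapes are assumed to have already been passed through the zero-state weighted synthesis filter; `stage_search` and the toy data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def stage_search(r, synthesized_shapes, gains):
    """Return (shape index, gain index, error vector e) for reference vector r."""
    best_k, best_err = None, None
    for k, s_syn in enumerate(synthesized_shapes):
        energy = dot(s_syn, s_syn)
        if energy == 0.0:
            continue                                   # a zero shape cannot reduce the error
        corr = dot(r, s_syn)
        err = dot(r, r) - corr * corr / energy         # error left with the ideal gain
        if best_err is None or err < best_err:
            best_k, best_err = k, err
    if best_k is None:
        raise ValueError("all synthesized shapes are zero vectors")
    s_syn = synthesized_shapes[best_k]
    g_ref = dot(r, s_syn) / dot(s_syn, s_syn)          # assumed form of the ideal gain
    g_idx = min(range(len(gains)), key=lambda l: (g_ref - gains[l]) ** 2)  # eq. (43)
    e = [rn - gains[g_idx] * sn for rn, sn in zip(r, s_syn)]               # eq. (44)
    return best_k, g_idx, e


# Toy example (hypothetical values); e would be the second stage's reference input.
r = [1.0, 2.0]
shapes = [[1.0, 0.0], [0.5, 1.0]]
gains = [1.0, 2.0]
k, g_idx, e = stage_search(r, shapes, gains)
```

Calling `stage_search(e, second_stage_shapes, second_stage_gains)` with a second pair of codebooks would reproduce the cascade in which the (N−1)-th stage's error becomes the N-th stage's reference input.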

【０２１０】[0210] That is, the signals supplied to the terminals 305 and 307 are passed unchanged from the perceptually weighted synthesis filter 312 of the first-stage second encoding unit 120₁ to the perceptually weighted synthesis filter 322 of the second-stage second encoding unit 120₂. The quantization error vector e(n) obtained by the first-stage second encoding unit 120₁ is supplied to the subtractor 323 of the second-stage second encoding unit 120₂.

【０２１１】[0211] Next, in step S5 of FIG. 10, the second-stage second encoding unit 120₂ performs the same processing as the first stage. That is, the representative value output from the stochastic codebook 320, which has a 5-bit shape index output, is sent to the gain circuit 321, where it is multiplied by the gain from the gain codebook 325, which has a 3-bit gain index output. The output of the gain circuit 321 is sent to the perceptually weighted synthesis filter 322. The output of the weighted synthesis filter 322 is sent to the subtractor 323, which finds the difference between that output and the first-stage quantization error vector e(n). This difference is sent to the distance calculation circuit 324, where the distance is computed and the shape vector s and the gain g that minimize the quantization error energy E are searched.

【０２１２】[0212] The shape index output from the stochastic codebook 310 and the gain index output from the gain codebook 315 of the first-stage second encoding unit 120₁, together with the index output from the stochastic codebook 320 and the index output from the gain codebook 325 of the second-stage second encoding unit 120₂, are sent to the index output switching circuit 330. When the second encoding unit 120 produces the aforementioned 23-bit output, the indices from the stochastic codebooks 310 and 320 and the gain codebooks 315 and 325 of the first-stage and second-stage second encoding units 120₁ and 120₂ are output together. When it produces the aforementioned 15-bit output, only the indices from the stochastic codebook 310 and the gain codebook 315 of the first-stage second encoding unit 120₁ are output.
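The behaviour of the index output switching circuit 330 can be sketched as follows. This is an illustration under stated assumptions: the function name and the tuple layout are hypothetical, and the text here gives the bit split only for the second stage (5 shape bits and 3 gain bits), so the sketch checks only that those 8 bits account for the difference between the 15-bit and 23-bit outputs.

```python
def select_indices(stage1, stage2, use_second_stage):
    """Mimic the index output switching circuit 330.

    stage1, stage2: (shape_index, gain_index) tuples from the first-stage
    and second-stage second encoding units.
    """
    return stage1 + stage2 if use_second_stage else stage1

# Second-stage bit budget stated in the text: 5 shape bits + 3 gain bits.
STAGE2_BITS = 5 + 3
# The 15-bit output plus the second stage gives the 23-bit output.
TOTAL_BITS = 15 + STAGE2_BITS

high_rate = select_indices((3, 1), (17, 5), True)
low_rate = select_indices((3, 1), (17, 5), False)
```

Because the low-rate output is a prefix of the high-rate output, dropping the second-stage indices is all that is needed to lower the bit rate.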

【０２１３】[0213] After that, the filter states are updated, as shown in step S6.

【０２１４】[0214] In the present embodiment, the number of index bits of the second-stage second encoding unit 120₂ is very small: 5 bits for the shape vector and 3 bits for the gain. In such a case, if no suitable shape and gain exist in the codebook, the second stage may increase the quantization error rather than reduce it.

【０２１５】[0215] This problem could be avoided by providing 0 as one of the gain values, but the gain has only 3 bits, and setting one of its values to 0 would greatly degrade the performance of the quantizer. Instead, a vector whose elements are all 0 is provided among the shape vectors, to which a relatively large number of bits is allocated. The search described above is then performed excluding this zero vector, and the zero vector is selected if the quantization error has ultimately increased. The gain in that case is arbitrary. This prevents the second-stage second encoding unit 120₂ from increasing the quantization error.
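The zero-vector fallback can be sketched as follows. This is a minimal illustration under assumptions not stated in the text: index 0 of the shape codebook is taken to hold the zero vector, a full search over the remaining shapes and gains is used for simplicity, and all names are hypothetical.

```python
def energy(v):
    return sum(x * x for x in v)

def second_stage_with_zero_vector(e_in, shapes_syn, gains):
    """Return (shape_index, gain_index, residual); index 0 is the zero vector."""
    best = None
    # Search every non-zero shape (indices 1 and up) with every gain.
    for i in range(1, len(shapes_syn)):
        for j, g in enumerate(gains):
            resid = [x - g * s for x, s in zip(e_in, shapes_syn[i])]
            if best is None or energy(resid) < energy(best[2]):
                best = (i, j, resid)
    # If the best candidate still increases the error, select the zero vector.
    # The gain index is then arbitrary (0 here).
    if best is None or energy(best[2]) > energy(e_in):
        return 0, 0, list(e_in)
    return best

shapes = [[0.0, 0.0], [1.0, 1.0]]
gains = [0.5, 1.0]
fallback = second_stage_with_zero_vector([0.1, -0.1], shapes, gains)
normal = second_stage_with_zero_vector([1.0, 1.0], shapes, gains)
```

In the first call no gain and shape pair reduces the error, so the zero vector is returned and the stage leaves the first-stage error unchanged.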

【０２１６】[0216] Although the example of FIG. 9 shows a two-stage configuration, the configuration is not limited to two stages and may have more. In that case, once the vector quantization by the closed-loop search of the first stage is finished, the Nth stage (2 ≦ N) performs quantization with the quantization error of the (N−1)th stage as its reference input, and its own quantization error in turn becomes the reference input of the (N+1)th stage.
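The chaining of stages can be sketched as follows. This is an illustrative outline only: each stage is reduced to a full search over already-synthesized shape vectors and gains, and the function names and toy codebooks are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def quantize_stage(ref, shapes_syn, gains):
    """Full search of one stage; returns (shape index, gain index, error vector)."""
    best = None
    for i, s in enumerate(shapes_syn):
        for j, g in enumerate(gains):
            err = [x - g * y for x, y in zip(ref, s)]
            if best is None or dot(err, err) < dot(best[2], best[2]):
                best = (i, j, err)
    return best

def multistage_quantize(r, stages):
    """stages: list of (shapes_syn, gains) pairs, one per stage."""
    indices, ref = [], r
    for shapes_syn, gains in stages:
        # The error of this stage becomes the reference input of the next one.
        i, j, ref = quantize_stage(ref, shapes_syn, gains)
        indices.append((i, j))
    return indices, ref

unit_shapes = [[1.0, 0.0], [0.0, 1.0]]
stages = [(unit_shapes, [1.0, 2.0, 3.0]), (unit_shapes, [0.5, 1.0])]
indices, final_error = multistage_quantize([3.0, 1.0], stages)
```

Each stage searches only its own small codebook, so the total search cost grows with the sum of the codebook sizes rather than their product.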

【０２１７】[0217] As described above with reference to FIGS. 9 and 10, using a multi-stage vector quantizer in the second encoding unit reduces the amount of computation compared with conventional approaches that use the same number of bits, such as straight vector quantization or a conjugate codebook. This matters particularly in CELP coding, where the time-axis waveform is vector-quantized by a closed-loop search using the analysis-by-synthesis method, so a small number of searches is important. In addition, the number of bits can be switched easily by switching between using the index outputs of both second encoding units 120₁ and 120₂ and using only the index output of the first-stage second encoding unit 120₁ (without the output index of the second-stage second encoding unit 120₂). Furthermore, if the index outputs of the first-stage and second-stage second encoding units 120₁ and 120₂ are output together as described above, the decoder can cope easily, for example by selecting one of them. That is, the decoder can readily handle the case where, for example, parameters encoded at 6 kbps are decoded by a 2 kbps decoder. Moreover, by including a zero vector in the shape codebook of, for example, the second-stage second encoding unit 120₂, an increase in quantization error can be prevented, even when few bits are allocated, with less performance degradation than adding 0 to the gain.

【０２１８】[0218] Next, the code vectors (shape vectors) of the stochastic codebook can be generated, for example, as follows.

【０２１９】[0219] For example, the code vectors of the stochastic codebook can be generated by clipping so-called Gaussian noise. Specifically, the codebook can be constructed by generating Gaussian noise, clipping it at a suitable threshold value, and normalizing the result.

【０２２０】[0220] Speech, however, takes many forms. Gaussian noise suits noise-like consonants such as "sa, shi, su, se, so", but it cannot adequately handle consonants with a sharp onset (abrupt consonants) such as "pa, pi, pu, pe, po".

【０２２１】[0221] In the present invention, therefore, a suitable number of the code vectors are Gaussian noise and the rest are obtained by learning, so that the codebook can handle both the consonants with a sharp onset and the noise-like consonants. For example, a large threshold value yields vectors with a few large peaks, whereas a small threshold value yields vectors close to Gaussian noise itself. Increasing the variety of clipping threshold values in this way makes it possible to handle both sharp-onset consonants such as "pa, pi, pu, pe, po" and noise-like consonants such as "sa, shi, su, se, so", which improves intelligibility. FIG. 11 shows the Gaussian noise as a solid line and the noise after clipping as a dotted line. FIG. 11(A) shows the case where the clipping threshold value is 1.0 (a large threshold value), and FIG. 11(B) the case where it is 0.4 (a small threshold value). FIGS. 11(A) and 11(B) show that a large threshold value yields vectors with a few large peaks, whereas a small threshold value yields vectors close to Gaussian noise itself.
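The generation of an initial codebook from clipped Gaussian noise can be sketched as follows. The clipping rule is an assumption: the text does not define it, and this sketch zeroes every sample whose magnitude is below the threshold (center clipping), which matches the statement that a large threshold leaves only a few large peaks. The function names, the vector dimension and the codebook size are hypothetical. The thresholds 0.4 and 1.0 are those quoted for FIG. 11.

```python
import math
import random

def clipped_gaussian_vector(dim, threshold, rng):
    """One unit-norm code vector from center-clipped Gaussian noise."""
    while True:
        noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Assumed clipping rule: zero samples whose magnitude is below the threshold.
        clipped = [x if abs(x) >= threshold else 0.0 for x in noise]
        norm = math.sqrt(sum(x * x for x in clipped))
        if norm > 0.0:  # retry in the rare case that every sample was clipped
            return [x / norm for x in clipped]

def initial_codebook(size, dim, thresholds, seed=0):
    """Cycle through several thresholds to vary the character of the vectors."""
    rng = random.Random(seed)
    return [
        clipped_gaussian_vector(dim, thresholds[k % len(thresholds)], rng)
        for k in range(size)
    ]

codebook = initial_codebook(size=32, dim=40, thresholds=[0.4, 1.0])
```

Alternating thresholds gives a mix of sparse, peaky vectors and denser, noise-like vectors within one codebook.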

【０２２２】[0222] To achieve this, an initial codebook is first constructed by clipping Gaussian noise, and a suitable number of code vectors that will not be trained are decided in advance. These untrained code vectors are selected in ascending order of their variance, so that they correspond to noise-like consonants such as "sa, shi, su, se, so". The code vectors obtained by training use the LBG algorithm as the training algorithm. Encoding under the optimum encoding condition (Nearest Neighbour Condition) uses both the fixed code vectors and the code vectors being trained. Under the centroid condition (Centroid Condition), only the code vectors being trained are updated. As a result, the trained code vectors come to correspond to sharp-onset consonants such as "pa, pi, pu, pe, po".

【０２２３】[0223] The gains can be trained in the usual way, which yields gains that are optimal for these code vectors.

【０２２４】[0224] FIG. 12 shows the flow of processing for constructing the codebook by clipping Gaussian noise as described above.

【０２２５】[0225] In FIG. 12, step S10 performs initialization: the number of training iterations is set to n = 0, the error is set to D₀ = ∞, the maximum number of iterations n_max is decided, and the threshold value ε that determines the end-of-training condition is decided.

【０２２６】[0226] In the next step S11, an initial codebook is generated by clipping Gaussian noise, and in step S12 some of the code vectors are fixed as code vectors that will not be trained.

【０２２７】[0227] Next, in step S13, encoding is performed using the codebook, and in step S14 the error is calculated. Step S15 judges whether (D_{n−1} − D_n)/D_n < ε or n = n_max. If the result is Yes, the processing ends; if it is No, the processing proceeds to step S16.

【０２２８】[0228] In step S16, the code vectors that were not used in the encoding are processed, and in the next step S17 the codebook is updated. Then, in step S18, the number of training iterations n is incremented by 1, and the processing returns to step S13.
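The training loop of FIG. 12 can be sketched as follows. This is a minimal outline under assumptions the text does not settle: squared Euclidean distance is used as the distortion, the initial codebook and the set of fixed indices are supplied by the caller (steps S11 and S12), and step S16 is reduced to leaving unused code vectors unchanged. All names are hypothetical.

```python
def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(codebook, fixed, data, n_max=20, eps=1e-3):
    """LBG-style training that never updates the indices listed in `fixed`."""
    codebook = [list(v) for v in codebook]
    d_prev, n = float("inf"), 0  # S10: n = 0, previous error = infinity
    while True:
        # S13: encode with all vectors, fixed and trainable (nearest neighbour).
        cells = [[] for _ in codebook]
        for x in data:
            k = min(range(len(codebook)), key=lambda i: sqdist(x, codebook[i]))
            cells[k].append(x)
        # S14: average distortion of this iteration.
        d = sum(sqdist(x, codebook[k]) for k, c in enumerate(cells) for x in c) / len(data)
        # S15: stop at n_max or when the relative improvement is below eps.
        if n == n_max or d == 0.0 or (d_prev - d) / d < eps:
            return codebook, d
        # S16/S17: centroid update for trainable, non-empty cells only.
        for k, c in enumerate(cells):
            if k not in fixed and c:
                codebook[k] = [sum(col) / len(c) for col in zip(*c)]
        d_prev, n = d, n + 1  # S18

data = [[0.1, 0.0], [-0.1, 0.0], [2.0, 2.0], [4.0, 4.0]]
trained, final_d = train_codebook([[0.0, 0.0], [1.0, 1.0]], {0}, data)
```

In this toy run the fixed vector at index 0 keeps its initial value while the trainable vector moves to the centroid of the samples assigned to it.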

【０２２９】[0229] The signal encoding apparatus and signal decoding apparatus described above can be used as a speech codec in, for example, a portable communication terminal or portable telephone such as those shown in FIGS. 13 and 14.

【０２３０】[0230] FIG. 13 shows the transmitting-side configuration of a portable terminal that uses a speech encoding unit 160 configured as shown in FIGS. 1 and 3. The speech signal picked up by the microphone 161 in FIG. 13 is amplified by the amplifier 162, converted to a digital signal by the A/D (analog/digital) converter 163, and sent to the speech encoding unit 160. The speech encoding unit 160 is configured as shown in FIGS. 1 and 3, and the digital signal from the A/D converter 163 is supplied to its input terminal 101. The speech encoding unit 160 performs the encoding described with reference to FIGS. 1 and 3, and the output signals from the output terminals of FIGS. 1 and 2 are sent, as the output signals of the speech encoding unit 160, to the transmission path encoding unit 164. The transmission path encoding unit 164 applies so-called channel coding, and its output signal is sent to the modulation circuit 165, modulated, and sent to the antenna 168 via the D/A (digital/analog) converter 166 and the RF amplifier 167.

【０２３１】[0231] FIG. 14 shows the receiving-side configuration of a portable terminal that uses a speech decoding unit 260 configured as shown in FIGS. 2 and 4. The speech signal received by the antenna 261 in FIG. 14 is amplified by the RF amplifier 262 and sent via the A/D (analog/digital) converter 263 to the demodulation circuit 264, and the demodulated signal is sent to the transmission path decoding unit 265. The output signal from 264 is sent to the speech decoding unit 260 configured as shown in FIGS. 2 and 4. The speech decoding unit 260 performs the decoding described with reference to FIGS. 2 and 4, and the output signal from the output terminal 201 of FIGS. 2 and 4 is sent, as the signal from the speech decoding unit 260, to the D/A (digital/analog) converter 266. The analog speech signal from the D/A converter 266 is sent to the speaker 268.

【０２３２】[0232] The present invention is not limited to the embodiment described above. For example, although the configurations of the speech analysis side (encoder side) and the speech synthesis side (decoder side) have been described in terms of hardware, they can also be implemented by a software program using a so-called DSP (digital signal processor) or the like. Matrix quantization applied to the data of several frames together may be used in place of the vector quantization. Furthermore, the speech encoding method and the corresponding decoding method to which the present invention is applied are not limited to the speech analysis/synthesis method using multiband excitation described above. They can be applied to various speech analysis/synthesis methods, such as those that use sine-wave synthesis for voiced portions or synthesize unvoiced portions from a noise signal. Nor are the applications limited to transmission or recording and reproduction: the invention can of course be applied to pitch conversion, speed conversion, rule-based speech synthesis, noise suppression, and the like.

【０２３３】[0233]

[Effects of the Invention] As is apparent from the above description, the speech encoding method according to the present invention has a plurality of stages of encoding that vector-quantize the time-axis waveform by a closed-loop search for the optimum vector using the analysis-by-synthesis method. The encoding of the Nth stage takes the quantization error of the (N−1)th stage as its reference input, and the bit rate is switched by selecting among the quantization outputs of the stages. This makes it easy to switch the transmission bit rate and to generate an encoded data stream that the decoding side can handle easily even when the encoding side and the decoding side use different bit rates.

[Brief description of the drawings]

【図１】FIG. 1 is a block circuit diagram showing the basic configuration of a speech signal encoding apparatus to which an embodiment of the speech encoding method according to the present invention is applied.

【図２】FIG. 2 is a block circuit diagram showing the basic configuration of a speech signal decoding apparatus to which an embodiment of the speech decoding method according to the present invention is applied.

【図３】FIG. 3 is a block circuit diagram showing a more specific configuration of the speech signal encoding apparatus embodying the present invention.

【図４】FIG. 4 is a block circuit diagram showing a more specific configuration of the speech signal decoding apparatus embodying the present invention.

【図５】FIG. 5 is a block diagram showing the basic configuration of the LSP quantization unit.

【図６】FIG. 6 is a block diagram showing a more specific configuration of the LSP quantization unit.

【図７】FIG. 7 is a block diagram showing the basic configuration of the vector quantization unit.

【図８】FIG. 8 is a block diagram showing a more specific configuration of the vector quantization unit.

【図９】FIG. 9 is a block circuit diagram showing the specific configuration of the CELP encoding part (second encoding unit) of the speech signal encoding apparatus of the present invention.

【図１０】FIG. 10 is a flowchart showing the flow of processing in the configuration of FIG. 9.

【図１１】FIG. 11 is a diagram showing Gaussian noise and the noise after clipping at different threshold values.

【図１２】FIG. 12 is a flowchart showing the flow of processing when a shape codebook is generated by learning.

【図１３】FIG. 13 is a block circuit diagram showing the transmitting-side configuration of a portable terminal to which the speech signal encoding apparatus of the present invention is applied.

【図１４】FIG. 14 is a block circuit diagram showing the receiving-side configuration of a portable terminal to which the speech signal decoding apparatus of the present invention is applied.

[Explanation of symbols]

110 first encoding unit
111 LPC inverse filter
113 LPC analysis/quantization unit
114 sine-wave analysis encoding unit
115 V/UV determination unit
120, 120₁, 120₂ second encoding unit
121 noise codebook
122, 312, 322 weighted synthesis filter
123, 313, 323 subtractor
124, 314, 324 distance calculation circuit
125 perceptual weighting filter
302 LPC analysis circuit
303 LPC parameter quantization circuit
304 perceptual weighting filter
310, 320 stochastic codebook
315, 325 gain codebook
330 index output switching circuit

Continuation of the front page: (72) Inventor: Shiro Omori, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo

Claims

[Claims]

1. A speech encoding method in which an input speech signal is divided into blocks on the time axis and encoded block by block, the method comprising a plurality of stages of encoding steps that vector-quantize the time-axis waveform by a closed-loop search for an optimum vector using an analysis-by-synthesis method, wherein the encoding step of the Nth stage takes the quantization error of the (N−1)th stage as its reference input, and the bit rate is switched by selecting all or some of the quantization outputs of the encoding steps of the respective stages.

2. The speech encoding method according to claim 1, wherein the signal of an unvoiced portion of the input speech signal is encoded by the encoding steps.

3. The speech encoding method according to claim 1, wherein, in each of the encoding steps, a quantized value is obtained by multiplying a shape vector taken from a shape codebook by a gain from a gain codebook, and the shape codebook contains a shape vector of magnitude 0.

4. A speech encoding apparatus in which an input speech signal is divided into blocks on the time axis and encoded block by block, the apparatus comprising a plurality of stages of encoding means that vector-quantize the time-axis waveform by a closed-loop search for an optimum vector using an analysis-by-synthesis method, wherein the encoding means of the Nth stage takes the quantization error of the (N−1)th stage as its reference input, and the bit rate is switched by selecting all or some of the quantization outputs of the encoding means of the respective stages.

5. The speech encoding apparatus according to claim 4, wherein the signal of an unvoiced portion of the input speech signal is encoded by the encoding means.

6. The speech encoding apparatus according to claim 4, wherein, in each of the encoding means, a quantized value is obtained by multiplying a shape vector taken from a shape codebook by a gain from a gain codebook, and the shape codebook contains a shape vector of magnitude 0.