JP3092436B2 - Audio coding device

Info

Publication number: JP3092436B2
Application number: JP06032104A
Authority: JP
Inventors: 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1994-03-02
Filing date: 1994-03-02
Publication date: 2000-09-25
Anticipated expiration: 2015-09-25
Also published as: JPH07239700A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a speech coding apparatus for encoding a speech signal with high quality at a low bit rate, in particular at about 8 to 4.8 kb/s.

[0002]

[Prior Art] Known methods for encoding a speech signal at a low bit rate of about 8 to 4.8 kb/s include the CELP (Code Excited LPC Coding) method described in, for example, the paper by M. Schroeder and B. Atal entitled "Code-excited linear prediction: High quality speech at very low bit rates" (Proc. ICASSP, pp. 937-940, 1985) (Reference 1) and the paper by Kleijn et al. entitled "Improved speech quality and efficient vector quantization in SELP" (ICASSP, pp. 155-158, 1988) (Reference 2), and the multipulse coding method described in, for example, the paper by B. Atal et al. entitled "A new model of LPC excitation for producing natural-sounding speech at low bit rates" (Proc. ICASSP, pp. 614-617, 1982) (Reference 3).

[0003] In the methods of References 1 and 2, the transmitting side extracts, for each frame (for example, 20 ms), spectral parameters representing the spectral characteristics of the speech signal, and further divides the frame into shorter subframes (for example, 5 ms). For each subframe, a pitch parameter of an adaptive codebook representing long-term correlation (pitch correlation) is extracted so as to minimize the weighted squared error between a signal reproduced from the past excitation signal and the excitation signal, and the speech signal of the subframe is long-term predicted using the pitch parameter. For the residual signal obtained by the long-term prediction, one noise signal is selected from a codebook consisting of a predetermined number of kinds of noise signals so as to minimize the weighted squared error between the signal synthesized from the selected signal and the speech signal, and the optimum gain is calculated. The index representing the kind of the selected noise signal and the gain are then transmitted together with the spectral parameters and the pitch parameter. A description of the receiving side is omitted.

[0004]

[Problems to Be Solved by the Invention] In the conventional method of Reference 1 described above, when a codebook consisting of multipulses or noise signals is searched, the number of codebook bits per subframe is fixed. Likewise, when multipulses are computed, the number of multipulses in a subframe or frame is fixed.

[0005] However, because the power of a speech signal varies greatly over time, a fixed number of bits does not always allow good coding at points where the signal power changes over time, and the degradation caused by coding becomes noticeable. This problem is particularly pronounced when the bit rate is lowered and the codebook size or the number of multipulses is reduced.

[0006]

[Means for Solving the Problems] According to the first invention, there is provided a speech coding apparatus comprising: a spectral parameter calculation unit that divides an input discrete speech signal into frames of a predetermined length and obtains and outputs spectral parameters representing the spectral envelope of the speech signal; an adaptive codebook unit that divides the frame into subsections of a predetermined length and obtains a pitch parameter such that a signal reproduced from an adaptive codebook consisting of past excitation signals approximates the speech signal; and an excitation search unit that represents the excitation signal of the speech signal by a codebook consisting of a plurality of kinds of preconstructed code vectors or by multipulses and outputs the result. The apparatus is characterized by further comprising a masking threshold calculation unit that obtains a masking threshold from the speech signal based on auditory masking characteristics, and a bit allocation unit that determines the number of codebook bits or the number of multipulses in each subframe based on the threshold.

[0007] According to the second invention, there is provided the speech coding apparatus of the first invention, characterized by comprising a band division unit that divides the input speech signal into bands, and a bit allocation unit that determines, based on the masking threshold, the number of codebook bits or the number of multipulses used to encode the band-divided signals for each subsection and each band.

[0008] Further, according to the third invention, there is provided the speech coding apparatus of the first invention, characterized by having a codebook in which the impulse response of a band division filter is convolved in advance.

[0009]

[Operation] The first invention is characterized in that, for each subframe obtained by dividing a frame, the number of bits of the excitation codebook is allocated adaptively or, in the multipulse calculation, the number of multipulses is allocated adaptively.

[0010] The following description covers the case in which an excitation codebook is used.

[0011] A masking threshold is obtained for the input speech signal based on auditory masking characteristics, and is converted on the frequency axis into a power spectrum T_ij based on the threshold. Here, i denotes the i-th critical band on the frequency axis and j denotes the j-th subframe in the frame. To obtain the masking threshold, for example, the speech signal is transformed by FFT to obtain the power spectrum |X(k)|², which is analyzed by a critical band filter or an auditory model to calculate the power or RMS of each critical band, and the masking threshold in each critical band is obtained from these values. Known ways of obtaining the masking threshold include using values obtained from psychoacoustic experiments; for details, see, for example, the paper by Johnston entitled "Transform coding of audio signals using perceptual noise criteria" (IEEE J. Sel. Areas on Commun., pp. 314-323, 1988) (Reference 4) and the paper by R. Drogo de Iacovo et al. entitled "Vector quantization and perceptual criteria in SVD based CELP coders" (ICASSP, pp. 33-36, 1990) (Reference 5). For critical band filters and critical band analysis, see, for example, Chapter 5 of the book by J. Tobias entitled "Foundation of modern auditory theory" (Reference 6). For auditory models, see, for example, the paper by Seneff entitled "A computational model for the peripheral auditory system: Application to speech recognition research" (Proc. ICASSP, pp. 1983-1986, 1986) (Reference 7).

[0012] T_ij (i = 1...L, j = 1...M) is obtained for all subframes in the frame, and the signal-to-masking-threshold ratio SMR_ij is obtained for each subframe and band according to the following equation.

[0013] SMR_ij = P_ij / T_ij    (1)

Here, i is the critical band index and j is the subframe number. P_ij and T_ij denote the input speech power and the masking threshold, respectively, in the i-th critical band of the j-th subframe. Adaptive bit allocation among subframes follows the equation below.
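Equation (1) can be illustrated with a minimal sketch. The function name and the numeric values are hypothetical; only the ratio SMR_ij = P_ij / T_ij comes from the text.

```python
def signal_to_mask_ratio(power, threshold):
    """Return SMR_ij = P_ij / T_ij for every critical band i and subframe j.

    Both arguments are lists of rows: one row per critical band,
    one column per subframe.
    """
    return [[p / t for p, t in zip(p_row, t_row)]
            for p_row, t_row in zip(power, threshold)]


# Hypothetical values: 2 critical bands (rows) x 3 subframes (columns).
P = [[4.0, 9.0, 1.0], [2.0, 8.0, 0.5]]
T = [[2.0, 3.0, 1.0], [4.0, 2.0, 0.5]]
SMR = signal_to_mask_ratio(P, T)
```

A large SMR marks a band and subframe where the signal rises well above its masking threshold, which is where the allocation equations that follow are meant to spend more bits.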

[0014]

(Equation 1)

[0015] Here, R_j, R, M, and L denote the number of bits allocated to the j-th subframe, the average number of bits of the excitation codebook, the number of critical bands, and the number of subframes in the frame, respectively.

[0016] As an alternative method of adaptive bit allocation, the following equation can also be used.

[0017]

(Equation 2)

[0018] In the second invention, the input signal is divided into a plurality of bands in advance. A QMF (Quadrature Mirror Filter) is used for the band division. For details of QMF filters, see, for example, the paper by P. Vaidyanathan et al. entitled "Multirate digital filters, filter banks, polyphase networks, and applications: A tutorial" (Proc. IEEE, pp. 56-93, 1990) (Reference 8). If the signal is divided into W bands, the bit allocation for each subframe and each band is obtained by modifying equation (2) above as follows.

[0019]

(Equation 3)

[0020] Here, R_kj denotes the bit allocation for the k-th band in the j-th subframe, where j = 1...L and k = 1...W. Also,

SMR_kj = P_kj / T_kj    (5)

where P_kj is the power of the input signal and T_kj is the masking threshold for each divided band of the j-th subframe.

[0021] Furthermore, the third invention has W excitation codebooks, and the impulse response of the band division filter is convolved in advance into every code vector of each excitation codebook. When such excitation codebooks are searched, the excitation code vectors of the individual codebooks are summed according to the following equation, and the search is then performed in the same manner as in the first invention.

[0022]

(Equation 4)

[0023] The bit allocation for each subframe and each band follows equation (2), (3), or (4) above.

[0024]

[Embodiments] FIG. 1 is a block diagram showing an embodiment of the speech coding apparatus according to the first invention. For simplicity, an example is shown in which the number of codebook bits is allocated based on the masking threshold in the excitation codebook search, but the approach can be extended to bit allocation for the adaptive codebook and other codebooks.

[0025] In the figure, on the transmitting side, a speech signal is input from input terminal 100, and one frame (for example, 20 ms) of the speech signal is stored in buffer memory 110.

[0026] LPC analysis circuit 130 performs well-known LPC analysis on the speech signal of the frame and calculates LSP parameters of a predetermined order L as parameters representing the spectral characteristics of the speech signal of the frame.

[0027] Next, LSP quantization circuit 140 quantizes the LSP parameters with a predetermined number of quantization bits and outputs the resulting code l_k to multiplexer 290. It also decodes the code, converts the result into linear prediction coefficients a_i' (i = 1 to P), and outputs them to impulse response calculation circuit 170 and synthesis filter 295. For the encoding of LSP parameters and the conversion between LSP parameters and linear prediction coefficients, see, for example, the paper by Sugamura et al. entitled "Quantizer design in LSP speech analysis-synthesis" (IEEE J. Sel. Areas Commun., pp. 432-440, 1988) (Reference 9). To quantize the LSP parameters more efficiently, vector-scalar quantization or other well-known vector quantization methods can also be used. For vector-scalar quantization of LSP, see, for example, the paper by Moriya et al. entitled "Transform Coding of Speech using a Weighted Vector Quantizer" (IEEE J. Sel. Areas Commun., pp. 425-431, 1988) (Reference 10).

[0028] Subframe division circuit 150 divides the speech signal of the frame into subframes. Here, the subframe length is, for example, 5 ms.

[0029] Masking threshold calculation circuit 205 applies an N-point FFT to the input speech signal x(n) to obtain the spectrum X(k) (k = 0 to N-1), then obtains the power spectrum |X(k)|² and analyzes it with a critical band filter or an auditory model to calculate the power or RMS of each critical band. The power is calculated according to the following equation.

[0030]

(Equation 5)

[0031] Here, bl_i and bh_i denote the lower and upper limit frequencies of the i-th critical band, respectively, and R is the number of critical bands contained in the speech signal band. For critical bands, see Reference 6 and the like.
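The per-band power calculation can be sketched as follows. Equation 5 itself is not reproduced in this text, so the sketch assumes the common form in which the power of band i is the sum of |X(k)|² over the bins from bl_i to bh_i, with both limits expressed as bin indices. The function names, the test signal, and the band edges are hypothetical, and a direct DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath


def power_spectrum(x):
    """Return |X(k)|^2 for k = 0..N-1 using a direct DFT (same values as an FFT)."""
    n = len(x)
    spec = []
    for k in range(n):
        xk = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spec.append(abs(xk) ** 2)
    return spec


def critical_band_power(spec, band_edges):
    """Sum |X(k)|^2 over bins bl_i..bh_i (inclusive) for each band i.

    Assumed form of the band power; band_edges holds (bl_i, bh_i) bin indices.
    """
    return [sum(spec[bl:bh + 1]) for bl, bh in band_edges]


# Hypothetical 8-sample tone whose energy sits in bins 2 and 6.
x = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
spec = power_spectrum(x)
bands = [(0, 1), (2, 3)]          # hypothetical bin ranges, not real Bark bands
B = critical_band_power(spec, bands)
```

In a real coder the band edges would come from a critical band table such as the one discussed in Reference 6.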

[0032] Next, the spreading function is convolved with the critical band spectrum according to the following equation.

[0033]

(Equation 6)

[0034] Here, sprd(j, i) is the spreading function; see Reference 4 for specific values. b_max is the number of critical bands contained up to the angular frequency π. Next, the masking threshold spectrum Th_i is calculated according to the following equations.

[0035] T'_i = C_i T_i    (9)

where

T_i = 10^(-(O_i/10))    (10)

O_i = α(14.5 + i) + (1 - α)5.5    (11)

α = min[(NG/R), 1.0]    (12)

[0036]

(Equation 7)

[0037] Here, k_i is the i-th K parameter, obtained by converting the linear prediction coefficients input from LPC analysis circuit 130 by a well-known method. M is the order of the linear prediction analysis.

[0038] Taking the absolute threshold into account, the masking threshold spectrum becomes as follows.

[0039] T''_i = max[T_i, absth_i]    (14)

Here, absth_i is the absolute threshold in critical band i; see Reference 5.
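Equations (9) to (12) and (14) can be chained in a short sketch. Several points are assumptions: NG and R are taken as given inputs because Equation 7 is not reproduced in this text, C_i is taken to be the spread critical band spectrum, and the maximum in (14) is applied to the scaled threshold T'_i rather than to T_i as literally written. The function name and the sample values are hypothetical.

```python
def masking_threshold(spread, absth, ng, r):
    """Return the thresholded spectrum for bands i = 1..len(spread).

    spread : C_i, assumed to be the spread critical band spectrum
    absth  : absolute threshold absth_i per band
    ng, r  : quantities of equation (12), supplied by the caller
    """
    alpha = min(ng / r, 1.0)                                     # (12)
    out = []
    for i, (c_i, a_i) in enumerate(zip(spread, absth), start=1):
        offset_db = alpha * (14.5 + i) + (1.0 - alpha) * 5.5     # (11)
        t_i = 10.0 ** (-offset_db / 10.0)                        # (10)
        t_prime = c_i * t_i                                      # (9)
        # (14), applied here to T'_i; this choice is an assumption.
        out.append(max(t_prime, a_i))
    return out


# Hypothetical inputs: alpha evaluates to 0, so every offset is 5.5 dB.
thresholds = masking_threshold([100.0, 1.0], [0.0, 0.5], ng=0.0, r=1.0)
```

With alpha between 0 and 1 the offset moves between the 5.5 dB value and the band-dependent 14.5 + i value, which matches the structure of equation (11).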

[0040] Next, for the masking threshold spectrum T''_i (i = 1...b_max), a power spectrum P_m(f) is obtained by converting the frequency axis from the Bark axis to the Hertz axis, and an inverse FFT is applied to it to obtain the autocorrelation function r(j) (j = 0...N-1). The filter coefficients b_i (i = 1...P) are then calculated by performing well-known linear prediction analysis on the autocorrelation function.
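The step from a power spectrum to filter coefficients can be sketched as below. The text only says "well-known linear prediction analysis"; the Levinson-Durbin recursion used here is one standard choice and is an assumption, as are the function names. The inverse transform is written as a direct cosine sum, which equals the real part of an inverse FFT for a symmetric power spectrum.

```python
import math


def autocorr_from_power_spectrum(pspec):
    """Return r(j), j = 0..N-1, as the real inverse DFT of the power spectrum."""
    n = len(pspec)
    return [sum(p * math.cos(2.0 * math.pi * k * j / n)
                for k, p in enumerate(pspec)) / n
            for j in range(n)]


def levinson_durbin(r, order):
    """Return (coefficients, residual_energy) for a predictor of the given order.

    The coefficients b_1..b_order satisfy x(n) ~ sum_i b_i * x(n - i).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


# Hypothetical flat spectrum: the autocorrelation is an impulse.
r_flat = autocorr_from_power_spectrum([1.0, 1.0, 1.0, 1.0])
# Autocorrelation of a first-order process with correlation 0.5.
coeffs, energy = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For the first-order example the recursion recovers one non-zero coefficient, which is a quick sanity check that the sign convention matches the predictor form stated in the docstring.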

[0041] Perceptual weighting circuit 220 applies perceptual weighting by passing the signal through a filter that uses the filter coefficients b_i and has the transfer characteristic defined by equation (14), and obtains the weighted signal x_wm(n).

[0042]

(Equation 8)

[0043] Here, γ_1 and γ_2 are constants that control the amount of weighting, and are usually chosen such that 0 < γ_2 < γ_1 < 1.

[0044] Impulse response calculation circuit 170 obtains the impulse response h_wm(n) of the filter having the transfer characteristic of equation (15) up to a predetermined length and outputs it.

[0045] A_w(z) = H_wm(z)·[1/A(z)]    (15)

where

[0046]

(Equation 9)

[0047] and a_i' is output from LSP quantization circuit 140.

[0048] Subtractor 190 subtracts the output of synthesis filter 295 from the weighted signal and outputs the result.

[0049] Adaptive codebook 210 receives the weighted impulse response h_w(n) from impulse response calculation circuit 170 and the weighted signal from subtractor 190, performs pitch prediction based on long-term correlation, and calculates the delay M and gain β as pitch parameters. In the following description the prediction order of the adaptive codebook is 1, but a higher order of 2 or more can also be used. For the calculation of the delay M in the adaptive codebook, see References 1 and 2. The gain β is then obtained, and the adaptive code vector x_z(n) is obtained by the following equation and subtracted from the output of subtractor 190.

[0050] x_z(n) = x_wm(n) - β·v(n-M) * h_wm(n)    (17)

Here, x_wm(n) is the output signal of subtractor 190, v(n) is the past synthesis filter driving signal, and h_wm(n) is output from impulse response calculation circuit 170. The symbol * denotes convolution.
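Equation (17) can be sketched for a given delay M and gain β. The search for M and β themselves is left to References 1 and 2 and is not shown. The function names and sample values are hypothetical, and the sketch assumes the delay is at least one subframe long so that v(n-M) lies entirely in the stored past excitation.

```python
def convolve_truncated(u, h, n):
    """Return the first n samples of the convolution u * h."""
    return [sum(u[t - k] * h[k]
                for k in range(min(t + 1, len(h))) if t - k < len(u))
            for t in range(n)]


def adaptive_codebook_residual(x_wm, past_exc, delay, beta, h_wm):
    """Return x_z(n) = x_wm(n) - beta * (v(n - M) * h_wm(n)) for one subframe."""
    n = len(x_wm)
    if not n <= delay <= len(past_exc):
        raise ValueError("sketch assumes subframe length <= delay <= history length")
    start = len(past_exc) - delay
    v = past_exc[start:start + n]            # v(n - M) for n = 0..N-1
    y = convolve_truncated(v, h_wm, n)       # filtered adaptive code vector
    return [x - beta * yy for x, yy in zip(x_wm, y)]


# Hypothetical 3-sample subframe with a single past pulse.
x_z = adaptive_codebook_residual(
    x_wm=[1.0, 2.0, 3.0],
    past_exc=[0.0, 1.0, 0.0, 0.0],
    delay=3,
    beta=2.0,
    h_wm=[1.0, 0.5],
)
```

Delays shorter than the subframe would need the past excitation to be extended periodically, which this sketch deliberately leaves out.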

[0051] Bit allocation circuit 215 receives the masking threshold spectrum T_i, T'_i, or T''_i for each subframe from masking threshold calculation circuit 205 and performs bit allocation according to equation (2) or (3) above. The number of bits is adjusted so that the total number of bits over the whole frame equals a predetermined value, as in the following equation, and so that the number of bits allocated to each subframe stays within the lower and upper bit limits.

[0052]

(Equation 10)

[0053] R_min < R_j < R_max

Here, R_j, R_T, R_min, and R_max denote the number of bits allocated to the j-th subframe, the total number of bits for the whole frame, the lower limit on subframe bits, and the upper limit on subframe bits, respectively. L is the number of subframes in the frame. As a result of the above processing, the bit allocation information is output to multiplexer 290.
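One way to carry out the adjustment is sketched below. The text states only the constraints (a fixed frame total R_T and per-subframe limits), so the greedy procedure, the function name, and the sample numbers are assumptions. The limits are treated as inclusive here, whereas the inequality in the text is written as strict.

```python
def adjust_bits(raw, total, r_min, r_max):
    """Round raw allocations to integers that sum to `total` within the limits.

    raw   : real-valued allocation per subframe, e.g. from equation (2) or (3)
    total : R_T, the bit budget for the whole frame
    """
    if not r_min * len(raw) <= total <= r_max * len(raw):
        raise ValueError("budget cannot be met within the per-subframe limits")
    bits = [min(max(int(round(r)), r_min), r_max) for r in raw]
    while sum(bits) != total:
        step = 1 if sum(bits) < total else -1
        # Subframes that can still move one bit in the required direction.
        cand = [j for j, b in enumerate(bits) if r_min <= b + step <= r_max]
        # Move the subframe whose raw target lies furthest in that direction.
        j = max(cand, key=lambda idx: step * (raw[idx] - bits[idx]))
        bits[j] += step
    return bits


# Hypothetical frame of 4 subframes with a 32-bit budget.
allocation = adjust_bits([9.3, 6.1, 4.2, 12.4], total=32, r_min=5, r_max=11)
```

The loop always terminates because the budget was checked to be feasible, so at least one subframe can move toward the target on every pass.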

[0054] Excitation codebook search circuit 230 has codebooks with different numbers of bits (250_1 to 250_N). It receives the number of bits allocated to each subframe and switches among codebooks 250_1 to 250_N according to that number. It then selects the excitation code vector that minimizes the following expression.

[0055]

(Equation 11)

[0056] Here, γ_k is the optimum gain for the code vector c_k(n) (k = 0...2^B-1; B is the number of bits of the excitation codebook), and h_wm(n) is the impulse response obtained by impulse response calculation circuit 170.
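The search can be sketched as follows. Equation 11 is not reproduced in this text, so the sketch assumes the usual CELP criterion: minimize the squared error between x_z(n) and γ_k·c_k(n) * h_wm(n), with γ_k set to its least-squares optimum for each candidate. Function names and the toy codebook are hypothetical.

```python
def convolve_truncated(u, h, n):
    """Return the first n samples of the convolution u * h."""
    return [sum(u[t - k] * h[k]
                for k in range(min(t + 1, len(h))) if t - k < len(u))
            for t in range(n)]


def search_codebook(x_z, codebook, h_wm):
    """Return (index, gain, distortion) of the best code vector."""
    n = len(x_z)
    best = None
    for k, c in enumerate(codebook):
        y = convolve_truncated(c, h_wm, n)          # filtered code vector
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                                 # cannot scale a zero vector
        corr = sum(a * b for a, b in zip(x_z, y))
        gain = corr / energy                         # least-squares optimum gain
        dist = sum((a - gain * b) ** 2 for a, b in zip(x_z, y))
        if best is None or dist < best[2]:
            best = (k, gain, dist)
    return best


# Hypothetical 3-vector codebook; the target equals 2 x the filtered vector 1.
codebook = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, -1.0, 0.0]]
index, gain, dist = search_codebook([2.0, 1.0, 0.0], codebook, [1.0, 0.5])
```

Switching codebooks by allocated bit count then amounts to passing a codebook with 2^B entries, where B is the allocation for the current subframe.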

[0057] The excitation codebook may, for example, be constructed from Gaussian random numbers as in Reference 1, or may be constructed in advance by training. For a method of constructing a codebook by training, see, for example, the paper by Linde et al. entitled "An Algorithm for Vector Quantization Design" (IEEE Trans. COM-28, pp. 84-95, 1980) (Reference 11).

[0058] Gain codebook search circuit 260 uses the selected excitation code vector and gain codebook 270 to search for and output the gain code vector that minimizes the following expression.

[0059]

(Equation 12)

[0060] Here, g_1k and g_2k are the components of the k-th two-dimensional gain code vector. The index of the selected adaptive code vector, the index of the excitation code vector, and the index of the gain code vector are output.

[0061] Multiplexer 290 combines and outputs the output of LSP quantization circuit 140, the output of bit allocation circuit 215, and the output of gain codebook search circuit 260.

[0062] Synthesis filter circuit 295 obtains the weighted reproduced signal using the output of gain codebook search circuit 260 and outputs it to subtractor 190.

[0063] FIG. 2 is a block diagram showing an embodiment of the speech coding apparatus according to the second invention. Components bearing the same reference numerals as in FIG. 1 operate in the same way as in FIG. 1, and their description is omitted.

[0064] Band division circuit 300 divides the speech signal into a predetermined number (for example, W) of bands. The bandwidth of each band is set in advance. A QMF filter bank is used for the band division. For the construction of QMF filter banks, see Reference 8 and the like.

[0065] Masking threshold calculation circuit 410 calculates the masking threshold of each critical band in the same way as masking threshold calculation circuit 205 of FIG. 1. Using the masking thresholds contained in each band divided by band division circuit 300, it obtains SMR_kj according to equation (5) and outputs it to bit allocation circuit 420. It also calculates the filter coefficients b_i from the masking thresholds contained in each band by the same method as masking threshold calculation circuit 205 of FIG. 1, and outputs them to speech coding circuits 400_1 to 400_W.

[0066] Bit allocation circuit 420 uses SMR_kj (j = 1...L, k = 1...W) input from masking threshold calculation circuit 410 to allocate the number of bits for each subframe and each band according to equation (4) above, and outputs the allocation to speech coding circuits 400_1 to 400_W.

[0067] FIG. 3 is a block diagram showing the configuration of speech coding circuits 400_1 to 400_W. Since 400_1 to 400_W all operate identically, FIG. 3 shows, as an example, speech coding circuit 400_l for the l-th band. Components bearing the same reference numerals as in FIG. 1 operate in the same way as in FIG. 1, and their description is omitted.

[0068] Perceptual weighting circuit 220 receives the filter coefficients b_i for perceptual weighting and operates in the same way as perceptual weighting circuit 220 of FIG. 1.

[0069] Excitation codebook search circuit 230 receives the bit allocation value R_kj for each subframe and each band and switches the number of bits of the excitation codebook.

[0070] FIG. 4 is a block diagram showing an embodiment of the third invention. Components bearing the same reference numerals as in FIG. 1 or FIG. 2 operate in the same way as in FIG. 1 or FIG. 2, and their description is omitted.

[0071] Excitation codebook search circuit 230 receives the bit allocation value for each subframe and each band from bit allocation circuit 420 and switches the excitation codebook for each band and each subframe according to the bit allocation value. Each band has N kinds of codebooks with different numbers of bits. For example, band 1 has codebooks 500_11 to 500_1N, and band W likewise has codebooks 500_W1 to 500_WN. Furthermore, for each band, the impulse response of the corresponding band division filter is convolved in advance into every code vector in the codebooks. For example, for band 1, the impulse response of the band division filter corresponding to band 1 is calculated in advance according to Reference 8 and convolved in advance into every code vector of all N codebooks of band 1.
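The offline pre-convolution step can be sketched as below. The two-tap averaging filter stands in for a real band division filter and is purely hypothetical, as are the function names. Truncating each result to the original vector length is also an assumption.

```python
def convolve_full(u, h):
    """Return the full convolution u * h (length len(u) + len(h) - 1)."""
    out = [0.0] * (len(u) + len(h) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def preconvolve_codebook(codebook, band_ir, length):
    """Convolve every code vector with the band filter impulse response.

    The result is truncated to `length` samples so the stored vectors keep
    the subframe length (an assumption of this sketch).
    """
    return [convolve_full(c, band_ir)[:length] for c in codebook]


# Hypothetical band-1 codebook and a toy 2-tap low-pass impulse response.
band1_codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
band1_filtered = preconvolve_codebook(band1_codebook, [0.5, 0.5], length=4)
```

Because this runs once at design time, the encoder pays no per-subframe cost for restricting each code vector to its band.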

[0072] For each subframe, the bit allocation value for each band is input and the codebook corresponding to the number of bits is read out. The code vectors for all bands (W in this case) are summed according to equation (6) above to create a new code vector c(n), and the code vector that minimizes expression (18) above is selected. Searching over all combinations of the per-band codebooks across all bands would require an enormous amount of computation. Instead, the adaptive codebook output signal may be divided into bands; for each band, several candidate code vectors with small distortion are selected from the corresponding codebook; for each combination of candidates across all bands, the full-band code vector is reconstructed using equation (6); and the code vector that minimizes the distortion is selected from among these combinations. This greatly reduces the amount of computation required for the codebook search.
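The reduced search can be sketched in two stages. To keep the example short, distortion is a plain squared error with unit gain and no weighting filter, which simplifies expression (18); the function names, the shortlist size, and the toy codebooks are hypothetical.

```python
from itertools import product


def sq_err(a, b):
    """Squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def preselect(target_band, codebook, n_keep):
    """Return indices of the n_keep code vectors closest to the band target."""
    ranked = sorted(range(len(codebook)),
                    key=lambda k: sq_err(target_band, codebook[k]))
    return ranked[:n_keep]


def search_combined(target, band_targets, band_codebooks, n_keep):
    """Search only combinations of per-band shortlists; return (combo, distortion)."""
    shortlists = [preselect(t, cb, n_keep)
                  for t, cb in zip(band_targets, band_codebooks)]
    best = None
    for combo in product(*shortlists):
        # Sum the chosen vector of every band into one full-band code vector.
        chosen = [band_codebooks[w][k] for w, k in enumerate(combo)]
        c = [sum(vals) for vals in zip(*chosen)]
        d = sq_err(target, c)
        if best is None or d < best[1]:
            best = (combo, d)
    return best


# Hypothetical two-band example with 3 vectors per band and a shortlist of 2.
band_codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, -0.5], [-0.5, 0.5], [0.0, 0.0]],
]
band_targets = [[1.0, 1.0], [0.5, -0.5]]
combo, distortion = search_combined([1.5, 0.5], band_targets, band_codebooks, n_keep=2)
```

With W bands, 2^B vectors per band, and a shortlist of size m, the joint stage evaluates m^W combinations instead of (2^B)^W.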

[0073] In the above embodiments, the bit allocation may also be determined by clustering the SMR in advance, designing a bit allocation codebook of a predetermined number of bits (for example, B bits) in which the SMR of each cluster and the corresponding allocated numbers of bits are tabulated, and using this codebook when the bit allocation circuit calculates the bit allocation. With this configuration, the bit allocation information to be transmitted is only B bits per frame, so the transmitted information for bit allocation can be reduced.
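A table lookup of this kind can be sketched as follows. The table contents, the squared-distance measure, and the function name are hypothetical; the text specifies only that clustered SMR values are stored alongside their bit allocations and that a B-bit index is transmitted.

```python
def nearest_allocation(smr, table):
    """Return (index, bits) of the table entry whose SMR centroid is closest.

    table : list of (smr_centroid, bits_per_subframe) pairs with 2**B entries;
            only the index needs to be transmitted.
    """
    def dist(entry):
        centroid, _bits = entry
        return sum((a - b) ** 2 for a, b in zip(smr, centroid))

    index = min(range(len(table)), key=lambda k: dist(table[k]))
    return index, table[index][1]


# Hypothetical 1-bit table (2 entries) for a frame of 4 subframes.
allocation_table = [
    ([10.0, 10.0, 10.0, 10.0], [8, 8, 8, 8]),
    ([20.0, 5.0, 5.0, 10.0], [11, 6, 6, 9]),
]
table_index, table_bits = nearest_allocation([18.0, 6.0, 4.0, 11.0], allocation_table)
```

Both stored allocations sum to the same frame total, so choosing an entry never changes the overall bit rate.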

【００７４】また、第２、第３の発明において、サブフ
レーム毎、帯域毎のビット割当は（４）式以外に下式を
用いることもできる。In the second and third inventions, the bit allocation for each subframe and each band may also use the following equation instead of equation (4).

【００７５】[0075]

【数１３】 (Equation 13)

【００７６】ここで、Ｑ_kは、ｋ番目の分割帯域に含ま
れる臨界帯域の個数である。Here, Q_k is the number of critical bands contained in the k-th divided band.

【００７７】なお、以上の実施例では、音源コードブッ
クのビット数を適応的に割り当てる例について示した
が、音源コードブックのみならず、ＬＳＰコードブッ
ク、適応コードブック、ゲインコードブックのいずれの
ビット割当にも、本発明は適用可能である。The above embodiments show examples in which the number of bits of the sound source codebook is allocated adaptively. However, the present invention is applicable to the bit allocation of not only the sound source codebook but also the LSP codebook, the adaptive codebook, or the gain codebook.

【００７８】また、ビット割当回路２１５、４２０にお
けるビット割当の方法としては、（２）式あるいは
（３）式により一旦ビット数を割り当てた後に、実際に
割り当てたビット数による音源コードブックを用いて量
子化を行い、量子化雑音を測定し、下式を最大化するよ
うに、ビット割当を調整することもできる。As another bit allocation method for the bit allocation circuits 215 and 420, the number of bits may first be allocated according to equation (2) or (3); quantization is then performed using the sound source codebook with the number of bits actually allocated, the quantization noise is measured, and the bit allocation is adjusted so as to maximize the following equation.

【００７９】[0079]

【数１４】 [Equation 14]

【００８０】ここで、σ_nj²はｊ番目のサブフレームで
測定した量子化雑音である。Here, σ_nj² is the quantization noise measured in the j-th subframe.

【００８１】また、マスキングしきい値スペクトルの計
算法としては、他の周知な方法を使用することができ
る。Other well-known methods may also be used to calculate the masking threshold spectrum.

【００８２】また、マスキングしきい値計算回路２０
５、４１０では、演算量を低減するために、フーリエ変
換のかわりに、帯域分割フィルタ群を用いることもでき
る。In the masking threshold calculation circuits 205 and 410, a bank of band division filters may be used instead of the Fourier transform in order to reduce the amount of computation.

【００８３】また、聴覚重み付け回路２２０では、文献
１、２等の従来型の聴覚重み付けを施すこともできる。
このときの重み付けフィルタの伝達特性は下式でかけ
る。The auditory weighting circuit 220 may also apply conventional auditory weighting as in References 1 and 2. In that case, the transfer characteristic of the weighting filter is given by the following equation.

【００８４】[0084]

【数１５】 (Equation 15)

【００８５】ここで、ａ_iはＬＰＣ分析回路１３０で求
めた線形予測係数である。Here, a_i is a linear prediction coefficient obtained by the LPC analysis circuit 130.
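Equation 15 is not reproduced in this text. For orientation only, the conventional weighting filter in the CELP literature is commonly written W(z) = A(z) / A(z/γ) with A(z) = 1 - Σ a_i z^(-i) and 0 < γ < 1. The Python sketch below assumes that form; the sign convention for a_i, the value of γ, and the function name `weight` are assumptions for illustration and may differ from the patent's equation.

```python
def weight(x, a, gamma):
    """Apply the assumed filter W(z) = A(z) / A(z/gamma) to the samples x.

    a[i-1] holds a_i, under the assumed convention A(z) = 1 - sum(a_i * z**-i).
    Difference equation:
        y[n] = x[n] - sum(a_i * x[n-i]) + sum(a_i * gamma**i * y[n-i])
    """
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(1, len(a) + 1):
            if n - i >= 0:
                acc -= a[i - 1] * x[n - i]                # numerator A(z)
                acc += a[i - 1] * gamma ** i * y[n - i]   # denominator A(z/gamma)
        y.append(acc)
    return y


# Impulse response of a first-order example (hypothetical coefficient values).
impulse_response = weight([1.0, 0.0, 0.0], a=[0.5], gamma=0.5)
print(impulse_response)
```

Under this assumed form, setting γ = 1 makes the numerator and denominator cancel, so the filter passes the signal unchanged; smaller γ moves more of the coding noise under the formant peaks.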

【００８６】[0086]

【発明の効果】以上述べたように、本発明によれば、音
声信号から聴覚のマスキングしきい値を求め、これをも
とに音声信号の小区間におけるコードブックのビット数
あるいはマルチパルスの個数を割り当てているので、音
声信号の特徴の時間的変化に追従して聴覚的に常に良好
な音質を保てるという効果がある。さらに、時間方向の
特徴の変化のみならず、周波数方向の特徴の変化も考慮
してコードブックのビット数あるいはマルチパルスの個
数を割り当てることも可能であるので、低ビットレート
でより良好な音質の符号化音声を得ることができるとい
う効果がある。As described above, according to the present invention, an auditory masking threshold is obtained from the speech signal and, based on it, the number of codebook bits or the number of multi-pulses is allocated for each small section of the speech signal. The allocation therefore follows temporal changes in the characteristics of the speech signal, so that perceptually good sound quality is always maintained. Furthermore, because the number of codebook bits or the number of multi-pulses can also be allocated in consideration of changes in the characteristics in the frequency direction as well as in the time direction, coded speech of better quality can be obtained at low bit rates.

[Brief description of the drawings]

【図１】第１の発明の実施例を示すブロック図である。FIG. 1 is a block diagram showing an embodiment of the first invention.

【図２】第２の発明の実施例を示すブロック図である。FIG. 2 is a block diagram showing an embodiment of the second invention.

【図３】図２の音声符号化回路の構成を示すブロック図
である。FIG. 3 is a block diagram showing the configuration of the speech encoding circuit of FIG. 2.

【図４】第３の発明の実施例を示すブロック図である。FIG. 4 is a block diagram showing an embodiment of the third invention.

[Explanation of symbols]

１１０バッファメモリ１３０ＬＰＣ分析回路１４０ＬＳＰ量子化回路１５０サブフレーム分割回路１７０インパルス応答回路１９０減算器２０５、４１０マスキングしきい値計算回路２１０適応コードブック２１５、４２０ビット割当回路２２０聴覚重み付け回路２３０、５３０音源コードブック探索回路２５０、５０コードブック２６０ゲインコードブック探索回路２７０ゲインコードブック２９０、４３０マルチプレクサ２９５合成フィルタ３００帯域分割回路４００音声符号化回路
110 buffer memory
130 LPC analysis circuit
140 LSP quantization circuit
150 subframe division circuit
170 impulse response circuit
190 subtractor
205, 410 masking threshold calculation circuit
210 adaptive codebook
215, 420 bit allocation circuit
220 auditory weighting circuit
230, 530 sound source codebook search circuit
250, 50 codebook
260 gain codebook search circuit
270 gain codebook
290, 430 multiplexer
295 synthesis filter
300 band division circuit
400 speech encoding circuit

フロントページの続き (56)参考文献特開平２−294700（ＪＰ，Ａ) 特開平４−302532（ＪＰ，Ａ) 特開平５−232998（ＪＰ，Ａ) 特開平１−207799（ＪＰ，Ａ) Ｄｒｏｇｏｅｔ．ａｌ．”ＶｅｃｔｏｒＱｕａｎｔｉｚａｔｉｏｎＡｎｄＣｒｉｔｅｒｉａＩｎＳＶＤＢａｓｅｄＣＥＬＰＣｏｄｅｒｓ”，ＩＣＡＳＳＰ−90，Ｖｏｌ．１, ｐｐ33−36（1990) (58)調査した分野(Int.Cl.⁷，ＤＢ名) G10L 11/00 - 21/06 H03M 7/30 H04B 14/04 ＪＩＣＳＴファイル（ＪＯＩＳ)
Continuation of front page
(56) References: JP-A-2-294700 (JP, A); JP-A-4-302532 (JP, A); JP-A-5-232998 (JP, A); JP-A-1-207799 (JP, A); Drogo et al., "Vector Quantization And Criteria In SVD Based CELP Coders", ICASSP-90, Vol. 1, pp. 33-36 (1990)
(58) Fields searched (Int. Cl.⁷, DB name): G10L 11/00 - 21/06; H03M 7/30; H04B 14/04; JICST file (JOIS)

Claims

(57) [Claims]

1. A speech coding apparatus comprising: a spectrum parameter calculation unit that divides an input discrete speech signal into frames of a predetermined time length, and obtains and outputs a spectrum parameter representing the spectrum envelope of the speech signal; an adaptive codebook unit that divides the speech signal into small sections of a predetermined time length and obtains a pitch parameter so that a signal reproduced from an adaptive codebook composed of past sound source signals is close to the speech signal; a sound source search unit that represents the sound source signal by a codebook composed of a plurality of types of pre-configured code vectors, or by multi-pulses, and outputs the result; a masking threshold calculation unit that obtains a masking threshold from the speech signal; and a bit allocation unit that determines, based on the masking threshold, the number of bits of the codebook or the number of multi-pulses.

2. The speech coding apparatus according to claim 1, further comprising: a band division unit that divides the input speech signal into bands; and a bit allocation unit that determines, based on the masking threshold, the number of bits of the codebook or the number of multi-pulses used to encode each small section and each band of the band-divided signal.

3. The speech coding apparatus according to claim 1, wherein the codebook includes a codebook in which an impulse response of a band division filter is convolved in advance.