JPH1020889A

JPH1020889A - Voice coding device and recording medium

Info

Publication number: JPH1020889A
Application number: JP8171480A
Authority: JP
Inventors: Hiroyuki Ebara (江原宏幸)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-07-01
Filing date: 1996-07-01
Publication date: 1998-01-23

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of computation required to search a noise codebook and the amount of memory required to store it, and to improve the sound quality of voiced parts, by structuring the noise codebook so that noise code vectors of adjacent indexes share a plurality of pulses and differ in only a few pulses or a single pulse. SOLUTION: Using a target vector 22 of the noise code vector component and an impulse response 23 of a linear prediction synthesis filter, a squared-error minimizing means 21 selects from a noise codebook 24, and outputs, the noise code vector 25 that minimizes the squared error between the target vector of the noise code vector component and the noise code vector after it has passed through the linear prediction synthesis filter. The noise codebook 24 is structured so that only one pulse differs between the noise code vectors of adjacent indexes. For instance, pulses 2a, 2b, and 2c of index 2 are the same as pulses 1b, 1c, and 1d of index 1, and only 2d is different.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a CELP-type speech coding apparatus and to a recording medium on which a software implementation of the apparatus is recorded.

[0002]

[Description of the Related Art] CELP-type speech coding apparatuses that use a sparse noise excitation (a noise excitation constructed by placing a few pulses of non-zero amplitude in one frame) are disclosed in the Proceedings of the Acoustical Society of Japan (March 1994), pp. 305-306, and in the Proceedings of the Acoustical Society of Japan (March 1995), pp. 243-244. The noise code vector generation technique used in these CELP-type speech coding apparatuses is described with reference to FIG. 8.

[0003] In FIG. 8, a squared-error minimizing means 11 selects from a noise codebook 14, and outputs, the noise code vector 15 that minimizes the error between a target vector 12 of the noise code vector component (the perceptually weighted input speech minus the zero-input response of the linear prediction synthesis filter and the adaptive code vector component) and the noise code vector component of the synthesized speech (the noise code vector passed through the linear prediction synthesis filter and multiplied by an appropriate gain). The noise code vector passed through the linear prediction synthesis filter is computed by convolving the impulse response 13 of the linear prediction synthesis filter with each code vector of the noise codebook.
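The conventional search described above can be sketched as follows. This is a minimal illustration, not the apparatus of FIG. 8: the function names, the use of plain Python lists, and the truncation of the impulse response to the frame length are assumptions. It relies on the standard identity that, once the gain is chosen optimally, minimizing the squared error is equivalent to maximizing (x·y)² / (y·y), where y is the filtered code vector.

```python
def convolve_truncated(h, c):
    """Return the first len(c) samples of h convolved with c.

    h is assumed to hold at least len(c) impulse-response samples.
    """
    L = len(c)
    y = [0.0] * L
    for i, ci in enumerate(c):
        if ci == 0:
            continue  # sparse vectors: skip zero samples
        for k in range(L - i):
            y[i + k] += ci * h[k]
    return y


def exhaustive_search(x, h, codebook):
    """Score every code vector independently and return (best_index, best_score)."""
    best_index, best_score = -1, float("-inf")
    for idx, c in enumerate(codebook):
        y = convolve_truncated(h, c)                 # filtered code vector
        corr = sum(xi * yi for xi, yi in zip(x, y))  # cross-correlation with the target
        energy = sum(yi * yi for yi in y)            # energy of the filtered vector
        if energy == 0:
            continue
        score = corr * corr / energy
        if score > best_score:
            best_index, best_score = idx, score
    return best_index, best_score
```

Every code vector is filtered and scored from scratch here, which is the per-vector cost that the structured codebook is meant to avoid.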

[0004] When the noise code vector is orthogonalized with respect to the adaptive code vector, the target vector of the noise code vector component is not used. Instead, the squared error between the target vector and the synthesized speech is minimized using the target vector of the entire excitation (the perceptually weighted input speech minus the zero-input response of the linear prediction synthesis filter) and the adaptive code vector.

[0005]

[Problems to Be Solved by the Invention] However, in the conventional noise codebook described above, the error evaluation function computed by the squared-error minimizing means is calculated independently for each noise code vector stored in the noise codebook, so the amount of computation is large and the memory capacity used is large.

[0006] The present invention solves these conventional problems. Its object is to provide a speech coding apparatus, and a recording medium embodying it, that reduces the amount of computation required to search the noise codebook and the amount of memory required for the noise codebook, and that improves the sound quality of voiced parts.

[0007]

[Means for Solving the Problems] A first invention for achieving the above object reduces the amount of computation required to search the noise codebook and the amount of memory required for the noise codebook in one of two ways: by structuring the noise codebook so that noise code vectors of adjacent indexes share a plurality of pulses (same position, same amplitude) and differ in only a few pulses or a single pulse; or by having each noise code vector consist of n or more pulses and expressing the noise code vector corresponding to the N-th index as the sum of the pulse whose position and amplitude are described at the N-th index of the noise codebook and the pulses whose positions and amplitudes are described at the (N-1)-th through (N-n+1)-th indexes. Because part of the result of the error evaluation function calculated for the noise code vector of the preceding index can be reused, less computation is needed than when the error evaluation function is calculated independently for the noise code vector of each index. In addition, because only the positions and amplitudes of the pulses that differ from the noise code vector of the preceding index need to be stored in the codebook, the amount of memory required for the noise codebook can also be reduced.

[0008] A second invention for achieving the above object improves the speech quality of voiced parts by preparing a plurality of the noise codebooks described in the first invention and selecting among them according to the position of the pitch peak present in the adaptive code vector. Switching the noise codebook removes the phase information contained in one frame (subframe), which improves the excitation quantization performance for voiced parts. The amount of computation can be reduced in the same way as in the first invention.

[0009] A third invention for achieving the above object expresses the pulse positions in the noise codebook of the first invention not as absolute positions within the frame (subframe) but as positions relative to the pitch peak position of the adaptive code vector, which is taken as 0. This obtains the effect of the second invention with the same amount of memory as the first invention.

[0010] A fourth invention for achieving the above object improves the speech quality of voiced parts by raising the probability of occurrence near the pitch pulse position of the adaptive code vector when the pulse positions of a sparse noise codebook are generated stochastically, thereby increasing the number of variations near the pitch pulse position.
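As an illustration of generating pulse positions with a higher probability near the pitch pulse position, the following sketch draws positions from a weighted distribution. The text does not specify the shape of the distribution; the rectangular boost, the `boost` and `radius` parameters, and the function names are all hypothetical choices made for this example.

```python
import random


def position_weights(frame_len, pitch_peak, boost=4.0, radius=3):
    """Weight of each sample position; positions within `radius` of the peak are boosted.

    The rectangular window and its parameters are assumptions for illustration.
    """
    return [boost if abs(p - pitch_peak) <= radius else 1.0
            for p in range(frame_len)]


def draw_pulse_positions(frame_len, pitch_peak, count, rng, boost=4.0, radius=3):
    """Draw `count` pulse positions, favouring the neighbourhood of the pitch peak."""
    weights = position_weights(frame_len, pitch_peak, boost, radius)
    return rng.choices(range(frame_len), weights=weights, k=count)
```

For example, `draw_pulse_positions(40, 20, 8, random.Random(0))` returns eight positions in a 40-sample frame, more of which tend to fall near sample 20 than a uniform draw would give.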

[0011]

[Embodiments of the Invention] The invention according to claim 1 is a CELP-type speech coding apparatus in which the noise codebook is composed of elements each describing a combination of a pulse position and an amplitude, and in which the noise code vectors stored in the noise codebook are structured so that only a few pulses differ between noise code vectors of adjacent indexes. This reduces the amount of memory required to store the noise codebook and the amount of computation required to search it.

[0012] The invention according to claim 2 is the speech coding apparatus according to claim 1, structured so that only one pulse differs between noise code vectors of adjacent indexes. This reduces the amount of memory required to store the noise codebook and the amount of computation required to search it.

[0013] The invention according to claim 3 is a CELP-type speech coding apparatus in which the noise codebook is composed of elements each describing a combination of a pulse position and an amplitude, each noise code vector consists of n or more pulses, and the noise code vector corresponding to the N-th index is expressed as the sum of the pulse whose position and amplitude are described at the N-th index of the noise codebook and the pulses whose positions and amplitudes are described at the (N-1)-th through (N-n+1)-th indexes. This reduces the amount of memory required to store the noise codebook and the amount of computation required to search it.

[0014] The invention according to claim 4 is the speech coding apparatus according to claim 2 or claim 3 in which the content of each index consists only of the position and amplitude of one pulse. Because the error evaluation function can be calculated by reusing part of the result calculated for the preceding index, the amount of computation required to search the noise codebook can be reduced.

[0015] The invention according to claim 5 is the speech coding apparatus according to claim 4 in which the noise codebook to be used is switched according to the pitch peak position of the adaptive code vector. Switching the noise codebook according to the pitch peak position removes the phase information present within one frame (subframe) and improves the quantization efficiency of the excitation in voiced parts.

[0016] The invention according to claim 6 is the speech coding apparatus according to any one of claims 1 to 5 in which the pulse position described in the content of each index is expressed as a position relative to the pitch peak position of the adaptive code vector, with the pitch peak position taken as 0. This reduces the amount of memory required to store the noise codebook and the amount of computation required to search it, and in addition improves the speech quality of voiced parts.

[0017] The invention according to claim 7 is the speech coding apparatus according to any one of claims 1 to 6 having a noise codebook whose pulse positions are determined by a random number generating function in which the probability of occurrence of positions corresponding to the pitch peak position of the adaptive code vector is higher than that of other positions. Increasing the number of variations of noise code vectors near the pitch peak position improves the speech quality of voiced parts.

[0018] The invention according to claim 8 is a recording medium on which is recorded a program that implements, in software, the speech coding apparatus according to any one of claims 1 to 7. This allows the speech coding apparatus according to any one of claims 1 to 7 to be realized in a system, such as a personal computer, that uses a recording medium such as a magnetic disk, a magneto-optical disk, or a ROM cartridge.

[0019] Embodiments of the present invention are described below with reference to FIGS. 1 to 7.

(Embodiment 1) FIG. 1 shows the noise code vector generation unit of a CELP-type speech coding apparatus according to Embodiment 1 of the present invention, in which the noise code vectors stored in the noise codebook are structured so that only a few pulses, for example one pulse, differ between noise code vectors of adjacent indexes. In FIG. 1, reference numeral 21 denotes a squared-error minimizing means, which receives as inputs the target vector 22 of the noise code vector component, the impulse response 23 of the linear prediction synthesis filter, and the noise code vectors stored in the noise codebook 24, and selects and outputs the noise code vector 25 that minimizes the squared error between the target vector of the noise code vector component and the noise code vector after it has passed through the linear prediction synthesis filter. Reference numeral 22 denotes the target vector of the noise code vector component, obtained by subtracting the zero-input response of the linear prediction synthesis filter and the adaptive code vector component from the perceptually weighted input speech. Reference numeral 23 denotes the impulse response of the linear prediction synthesis filter with perceptual weighting applied. Reference numeral 24 denotes a noise codebook structured so that only one pulse differs between noise code vectors of adjacent indexes. For example, pulses 2a, 2b, and 2c of index 2 are the same as pulses 1b, 1c, and 1d of index 1, and only 2d is different. Reference numeral 25 denotes the noise code vector that minimizes the squared error with respect to the target vector of the noise code vector component.

[0020] The operation of the noise code vector generation unit of the CELP-type speech coding apparatus configured as described above is explained with reference to FIG. 1. Using the target vector 22 of the noise code vector component and the impulse response 23 of the linear prediction synthesis filter, the squared-error minimizing means 21 maximizes (Equation 1), and thereby selects from the noise codebook 24, and outputs, the noise code vector 25 that minimizes the squared error between the target vector of the noise code vector component and the noise code vector after it has passed through the linear prediction synthesis filter.

[0021]

(Equation 1), where the superscript t denotes matrix transposition
X: target vector (L is the frame (subframe) length)
H: impulse response convolution matrix
C: noise code vector

[0022]

(Equation 2)
a_n: amplitude of the n-th pulse
p_n: position of the n-th pulse

[0023]

(Equation 3)

[0024]

(Equation 4)

[0025]

(Equation 5)
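The bodies of (Equation 1) through (Equation 5) are not reproduced in this text. As a reading aid only, the following is one reconstruction that is consistent with the symbol definitions given here and with the standard CELP search criterion. The auxiliary symbols d, Φ, and e_p (a unit pulse at position p), the subscript N on C, and the exact grouping of terms are assumptions, not the patent's own notation.

```latex
% Sparse noise code vector of index N: n pulses with amplitudes a_i at positions p_i
C_N = \sum_{i=N-n+1}^{N} a_i\, e_{p_i}, \qquad d = H^{t} X, \qquad \Phi = H^{t} H

% (Equation 1): criterion to be maximized
\frac{\left( X^{t} H C_N \right)^{2}}{C_N^{t} H^{t} H C_N}

% (Equation 2): numerator term
X^{t} H C_N = \sum_{i=N-n+1}^{N} a_i\, d(p_i)

% (Equation 3): denominator term
C_N^{t} H^{t} H C_N = \sum_{i=N-n+1}^{N} \sum_{j=N-n+1}^{N} a_i a_j\, \Phi(p_i, p_j)

% (Equation 4): recursive form of the numerator term
X^{t} H C_N = X^{t} H C_{N-1} + a_N\, d(p_N) - a_{N-n}\, d(p_{N-n})

% (Equation 5): recursive form of the denominator term
C_N^{t} H^{t} H C_N = C_{N-1}^{t} H^{t} H C_{N-1}
  + a_N^{2}\, \Phi(p_N, p_N)
  + 2 a_N \sum_{i=N-n+1}^{N-1} a_i\, \Phi(p_N, p_i)
  - a_{N-n}^{2}\, \Phi(p_{N-n}, p_{N-n})
  - 2 a_{N-n} \sum_{i=N-n+1}^{N-1} a_i\, \Phi(p_{N-n}, p_i)
```

In this form, moving from index N-1 to index N only touches terms involving the added pulse at p_N and the removed pulse at p_{N-n}; the contributions of the shared pulses carry over unchanged.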

[0026] Here, the numerator of (Equation 1) is given by (Equation 2), and the denominator of (Equation 1) is given by (Equation 3). Furthermore, because the noise codebook is structured so that only one pulse differs between noise code vectors of adjacent indexes, (Equation 2) can be written as (Equation 4) and (Equation 3) as (Equation 5); as (Equation 4) and (Equation 5) show, part of the calculation of these expressions can be reused recursively. In (Equation 4) and (Equation 5), p_N denotes the position of the pulse newly added when the index is incremented by one, and p_(N-n) denotes the position of the pulse removed when the index is incremented by one. p_(N-n+1) through p_(N-1) denote the positions of the pulses common to the noise code vector of the preceding index and the noise code vector of the current index. If the noise codebook is not structured so that only one pulse differs between noise code vectors of adjacent indexes, (Equation 2) and (Equation 3) cannot be rewritten as (Equation 4) and (Equation 5), so every term in (Equation 2) and (Equation 3) must be calculated for each code vector, and the amount of computation increases.
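The recursive reuse described in this paragraph can be sketched as a sliding-window update. This is a minimal illustration under stated assumptions, not the patent's implementation: each index is assumed to contribute one (position, amplitude) pulse, positions and indexes are 0-based, `h` holds at least one frame of impulse-response samples, and all function and variable names are hypothetical.

```python
def correlations(x, h):
    """d[p]: correlation of the target x with the filtered unit pulse at position p."""
    L = len(x)
    return [sum(x[p + k] * h[k] for k in range(L - p)) for p in range(L)]


def autocorr_matrix(h, L):
    """phi[i][j]: inner product of the filtered unit pulses at positions i and j."""
    resp = [[h[t - p] if t >= p else 0.0 for t in range(L)] for p in range(L)]
    return [[sum(a * b for a, b in zip(resp[i], resp[j])) for j in range(L)]
            for i in range(L)]


def recursive_search(x, h, entries, n):
    """Search vectors made of n consecutive entries, updating the score terms incrementally.

    entries[N] = (position, amplitude) of the pulse added at index N.
    Returns (index of the newest pulse of the best vector, its score).
    """
    d = correlations(x, h)
    phi = autocorr_matrix(h, len(x))
    corr, energy = 0.0, 0.0      # running numerator root and denominator
    window = []                  # pulses of the current vector, oldest first
    best = (None, float("-inf"))
    for N, (p_new, a_new) in enumerate(entries):
        if len(window) == n:
            # Remove the oldest pulse and its cross terms with the shared pulses.
            p_old, a_old = window.pop(0)
            corr -= a_old * d[p_old]
            energy -= a_old * a_old * phi[p_old][p_old]
            energy -= 2 * a_old * sum(a * phi[p_old][p] for p, a in window)
        # Add the new pulse and its cross terms with the shared pulses.
        corr += a_new * d[p_new]
        energy += a_new * a_new * phi[p_new][p_new]
        energy += 2 * a_new * sum(a * phi[p_new][p] for p, a in window)
        window.append((p_new, a_new))
        if len(window) == n and energy > 0:
            score = corr * corr / energy
            if score > best[1]:
                best = (N, score)
    return best
```

Per index, the update costs on the order of n multiply-adds instead of re-filtering the whole vector. The table `phi` is built in full here only for clarity; a practical coder would compute just the entries it needs.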

[0027] Although the case described here is one in which only one pulse differs between code vectors of adjacent indexes, the same effect is obtained when two or three pulses differ. However, the reduction in computation is greater the smaller the ratio (number of pulses that differ between code vectors of adjacent indexes) / (total number of pulses in a code vector); when this ratio is large, the reduction in both computation and memory disappears.

[0028] When maximizing (Equation 1), the amount of computation can be reduced further by first selecting a plurality of candidates using only the numerator (Equation 2) (preselection), and then calculating the denominator (Equation 3) and maximizing (Equation 1) only for the selected candidates. In this case too, calculating the numerator with (Equation 4) requires less computation than calculating it with (Equation 2).
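The two-stage search described here can be sketched as follows. This is a hedged illustration: the number of candidates kept (`keep`), the callback `energy_of`, and the function name are assumptions, and the numerator values are taken as already computed (for example, recursively).

```python
def preselect_then_search(corrs, energy_of, keep):
    """Two-stage search: shortlist by numerator only, then evaluate the full ratio.

    corrs[i]      -- numerator root (cross-correlation) for candidate i
    energy_of(i)  -- returns the denominator for candidate i (computed on demand)
    keep          -- number of candidates that survive preselection
    """
    ranked = sorted(range(len(corrs)), key=lambda i: corrs[i] ** 2, reverse=True)
    shortlist = ranked[:keep]
    best_index, best_score = None, float("-inf")
    for i in shortlist:
        energy = energy_of(i)          # denominator is computed only for the shortlist
        if energy <= 0:
            continue
        score = corrs[i] ** 2 / energy
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score, shortlist
```

Because the shortlist is formed from the numerator alone, a candidate with a small numerator and a very small denominator can be missed; the saving in computation comes at the price of a search that is no longer guaranteed to be exhaustive.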

[0029] (Embodiment 2) Embodiment 2 is described next. FIG. 2 shows a noise code vector generation unit using the noise codebook according to this embodiment of the present invention, in which each noise code vector consists of n or more pulses, for example n pulses, and the noise code vector corresponding to the N-th index is expressed as a combination of the pulse whose position and amplitude are described at the N-th index of the noise codebook and the pulses whose positions and amplitudes are described at the (N-1)-th through (N-n+1)-th indexes.

[0030] In FIG. 2, reference numeral 31 denotes a noise codebook that outputs the position and amplitude of one pulse, in index order, to a buffer 32 and a squared-error minimizing means 35. Reference numeral 32 denotes a buffer that buffers the position and amplitude of the pulse of the N-th index output from the noise codebook 31 and outputs the positions and amplitudes of the pulses of the (N-n)-th through (N-1)-th indexes to the squared-error minimizing means 35. Reference numeral 33 denotes the target vector of the noise code vector component, obtained by subtracting the zero-input response of the linear prediction synthesis filter and the adaptive code vector component from the perceptually weighted input speech. Reference numeral 34 denotes the impulse response of the linear prediction synthesis filter with perceptual weighting applied. Reference numeral 35 denotes a squared-error minimizing means that receives the position and amplitude of the pulse of the N-th index from the noise codebook 31, receives the positions of the (N-n)-th through (N-1)-th pulses from the buffer 32, receives the target vector 33 of the noise code vector component and the impulse response 34 of the linear prediction synthesis filter, and selects and outputs the noise code vector 36 that minimizes the squared error between the target vector 33 of the noise code vector component and the noise code vector after it has passed through the linear prediction synthesis filter. Reference numeral 36 denotes the noise code vector finally output from the squared-error minimizing means 35.

[0031] The operation of the noise code vector generation unit configured as described above is explained with reference to FIG. 3. The noise codebook 31 stores the position and amplitude of one pulse for each index, and outputs the pulse positions and amplitudes to the buffer 32 and the squared-error minimizing means 35 in order, starting from the first index.

[0032] The buffer 32 buffers the position and amplitude of the pulse stored at the N-th index output from the noise codebook 31. The buffer 32 already holds the positions and amplitudes of the pulses stored at the (N-n)-th through (N-1)-th indexes of the noise codebook 31, and the position and amplitude of the pulse stored at the N-th index of the noise codebook 31 are newly added to it. The buffer then outputs the amplitudes and positions of the pulses stored at the (N-n)-th through (N-1)-th indexes of the noise codebook 31 to the squared-error minimizing means 35.

[0033] Using the pulse of the N-th index input from the noise codebook 31 and the positions and amplitudes of the pulses of the (N-n)-th through (N-1)-th indexes input from the buffer 32, the squared-error minimizing means 35 calculates (Equation 4) and (Equation 5) to evaluate (Equation 1), and outputs the noise code vector 36 represented by the noise codebook index N that maximizes (Equation 1) among all indexes of the noise codebook 31. The noise code vector 36 consists of n pulses and is generated from the pulses whose positions and amplitudes are stored at the N-th through (N-n+1)-th indexes.

[0034] FIG. 3 shows an example of the noise codebook 31. In the example of FIG. 3, the noise code vector represented by index 1 is a vector with a pulse of amplitude 1 at the 4th sample (all other samples are zero); the noise code vector represented by index 2 is a vector with pulses of amplitudes 1 and -2 at the 4th and 50th samples, respectively; and the noise code vector represented by index 3 is a vector with pulses of amplitudes 1, -2, and 3 at the 4th, 50th, and 33rd samples, respectively. When n = 3, the noise code vector represented by index 4 is a vector with pulses of amplitudes -2, 3, and 5 at the 50th, 33rd, and 14th samples, respectively, and the noise code vector represented by index N is a vector composed of the three pulses whose positions and amplitudes are stored at indexes (N-2) through N. In this example, the noise code vectors represented by indexes 1 and 2 do not have three pulses (the noise code vectors represented by indexes 1 through (n-1) do not have n pulses). To assign a vector with three (n) pulses to every index, the range of valid indexes can be restricted to 3 and above (n and above). In that case, additional memory is needed for the indexes that store the positions and amplitudes of the first two (n-1) pulses.
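The FIG. 3 example can be reproduced with a small sketch that rebuilds the noise code vector of index N from the one-pulse-per-index entries. The entry values come from the example above; the frame length of 60 samples, the reading of "4th sample" as a 1-based position, and the function names are assumptions made for illustration.

```python
# One (position, amplitude) pulse per index, as in the FIG. 3 example.
# Positions are treated as 1-based sample numbers (an assumption).
FIG3_ENTRIES = {1: (4, 1), 2: (50, -2), 3: (33, 3), 4: (14, 5)}


def code_vector(entries, N, n, frame_len):
    """Build the noise code vector of index N from entries N-n+1 .. N.

    Indexes below n yield fewer than n pulses, as noted in the text.
    """
    vec = [0] * frame_len
    for idx in range(max(1, N - n + 1), N + 1):
        pos, amp = entries[idx]
        vec[pos - 1] += amp            # convert the 1-based position to a list index
    return vec


def nonzero_pulses(vec):
    """Return the (1-based position, amplitude) pairs of the non-zero samples."""
    return [(i + 1, a) for i, a in enumerate(vec) if a != 0]
```

With n = 3 and an assumed 60-sample frame, `nonzero_pulses(code_vector(FIG3_ENTRIES, 4, 3, 60))` lists the pulses at samples 14, 33, and 50, matching the index-4 vector described above. Only one pulse is stored per index, which is where the memory saving comes from.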

[0035] Robustness against transmission channel errors can also be improved by devising the index assignment (Gray code).
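The text mentions the Gray code only in passing and does not say how the index assignment is devised. As one possible reading, the sketch below maps a codebook index to a reflected binary Gray codeword, so that adjacent indexes (whose code vectors share most pulses) are transmitted as codewords that differ in exactly one bit. The function names and this particular mapping are assumptions.

```python
def to_gray(index):
    """Map a codebook index to its reflected binary Gray codeword."""
    return index ^ (index >> 1)


def from_gray(code):
    """Recover the codebook index from a Gray codeword."""
    index = 0
    while code:
        index ^= code
        code >>= 1
    return index
```

Note that the one-bit property holds between adjacent indexes only; a single bit error in a received codeword does not always decode to an adjacent index, so this assignment limits the damage in some cases rather than all.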

[0036] (Embodiment 3) Embodiment 3 is described next. FIG. 4 shows the noise code vector generation unit of a speech coding apparatus according to Embodiment 3 of the present invention. A plurality of the noise codebooks described in Embodiment 1 or 2 are prepared, and one of them is selected according to the position of the pitch peak present in the adaptive code vector.

[0037] In FIG. 4, reference numeral 41 denotes an adaptive code vector selected and output from the adaptive codebook. Reference numeral 42 denotes a pitch pulse position detector that receives the adaptive code vector 41, detects the pitch pulse position within the adaptive code vector, and outputs it to a noise codebook selector 43. Reference numeral 43 denotes the noise codebook selector, which receives the pitch pulse position output from the pitch pulse position detector 42 and outputs information on the noise codebook to be selected to a switch 45. Reference numeral 44 denotes noise codebooks structured as described in Embodiment 1 or 2, each optimized for a different pitch pulse position. Reference numeral 45 denotes the switch, which, based on the information on the noise codebook to be selected that is input from the noise codebook selector 43, switches the noise codebook 44 to be used and connects it to a squared-error minimizing means 48. Reference numeral 46 denotes the target vector of the noise code vector component, obtained by subtracting the zero-input response of the linear prediction synthesis filter and the adaptive code vector component from the perceptually weighted input speech. Reference numeral 47 denotes the impulse response of the linear prediction synthesis filter with perceptual weighting applied. Reference numeral 48 denotes the squared-error minimizing means, which receives the target vector 46 of the noise code vector component, the impulse response 47 of the linear prediction synthesis filter, and the noise code vectors stored in the noise codebook 44 input via the switch 45, and selects and outputs the noise code vector 49 that minimizes the squared error between the target vector of the noise code vector component and the noise code vector after it has passed through the linear prediction synthesis filter. Reference numeral 49 denotes the noise code vector finally output from the squared-error minimizing means 48.

[0038] The operation of the noise code vector generation unit of the speech coding apparatus configured as described above will now be described with reference to FIG. 4. In FIG. 4, the pitch pulse position detector 42 uses the input adaptive code vector to detect the position of the pitch pulse within that vector. The pitch pulse position can be found by maximizing the normalized cross-correlation between the adaptive code vector and an impulse train whose impulses are spaced at the pitch period. It can also be found more accurately by minimizing the error between that impulse train after it has passed through the synthesis filter and the adaptive code vector after it has passed through the synthesis filter. The pulse position is expressed as the number of samples from the start of the frame (or subframe) to the first pitch pulse.
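
For illustration only, the cross-correlation search for the pitch pulse position might be sketched as follows. The function name, the use of unit-amplitude impulses, and the choice of a signed (rather than absolute) correlation are assumptions made for this sketch and are not taken from the specification.

```python
import math

def detect_pitch_pulse_position(adaptive_vec, pitch_period):
    """Return the offset, in samples from the start of the subframe, of the first pitch pulse.

    For each candidate offset in [0, pitch_period), an impulse train with unit
    impulses at offset, offset + pitch_period, ... is correlated with the
    adaptive code vector. The offset with the largest normalized
    cross-correlation is returned.
    """
    vec_norm = math.sqrt(sum(v * v for v in adaptive_vec))
    if vec_norm == 0.0:
        return 0  # silent vector: no pitch pulse to locate (assumed fallback)
    best_offset, best_score = 0, float("-inf")
    for offset in range(min(pitch_period, len(adaptive_vec))):
        taps = range(offset, len(adaptive_vec), pitch_period)
        corr = sum(adaptive_vec[i] for i in taps)
        # The norm of a unit impulse train is sqrt(number of impulses).
        score = corr / (math.sqrt(len(taps)) * vec_norm)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset
```

The returned value follows the convention stated above: a sample count measured from the start of the frame (or subframe) to the first pitch pulse.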

[0039] Based on the pitch peak position of the adaptive code vector, the noise codebook selector 43 selects, from the plurality of noise codebooks 44, the one noise codebook considered optimal, and outputs to the switch 45 the control information for switching the switch 45.

[0040] In accordance with the control information input from the noise codebook selector 43, the switch 45 connects one of the plurality of noise codebooks 44 to the square error minimizing means 48.
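
The selection rule itself is not spelled out in the text, which says only that each noise codebook is optimized for a different pitch pulse position. A minimal sketch, assuming the subframe is divided into equal-width position ranges with one codebook per range, could look like this. The function name and the equal-width partition are hypothetical.

```python
def select_codebook(pitch_pulse_position, subframe_length, num_codebooks):
    """Map a pitch pulse position to the index of the codebook to connect.

    Assumption: codebook k covers the k-th of num_codebooks equal-width
    position ranges within the subframe.
    """
    if not 0 <= pitch_pulse_position < subframe_length:
        raise ValueError("pitch pulse position lies outside the subframe")
    return pitch_pulse_position * num_codebooks // subframe_length
```

The returned index plays the role of the control information passed to the switch 45.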

[0041] Using the target vector 46 of the noise code vector component and the impulse response 47 of the linear prediction synthesis filter, the square error minimizing means 48 maximizes (Equation 1) and thereby selects, from the noise codebook 44 connected via the switch 45, the noise code vector 49 that minimizes the square error between the target vector of the noise code vector component and the noise code vector after passing through the linear prediction synthesis filter, and outputs it.
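
As a rough sketch of such a search, the following assumes that (Equation 1) has the form commonly used in CELP coders, namely the squared correlation between the target and the filtered code vector divided by the energy of the filtered code vector. That form, the function names, and the representation of a code vector as a list of (position, amplitude) pairs are assumptions for illustration.

```python
def synthesize(pulses, h, length):
    """Filter a sparse pulse vector with impulse response h (zero-state convolution)."""
    y = [0.0] * length
    for position, amplitude in pulses:
        if not 0 <= position < length:
            continue  # pulses outside the subframe are ignored in this sketch
        for k, tap in enumerate(h[: length - position]):
            y[position + k] += amplitude * tap
    return y

def search_codebook(target, h, codebook):
    """Return the index whose filtered code vector maximizes corr^2 / energy."""
    best_index, best_score = -1, -1.0
    for index, pulses in enumerate(codebook):
        y = synthesize(pulses, h, len(target))
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

Maximizing this ratio is equivalent to minimizing the square error once the optimal gain is applied, which is why a maximization can stand in for the error minimization described above.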

[0042] Because the configuration of the third embodiment is effective for voiced segments, it can be applied to the voiced mode of a CELP-type speech coding apparatus that classifies the excitation into modes such as voiced and unvoiced and uses a different excitation for each. In addition, providing a pitch periodization unit in this noise code vector generator further improves the sound quality of voiced segments.

[0043] (Fourth Embodiment) Next, a fourth embodiment will be described. FIG. 5 shows the noise code vector generation unit of the speech coding apparatus according to the fourth embodiment. Its characteristic feature is that, in the noise codebooks of the first to third embodiments, each pulse position is expressed not as an absolute position within the frame (subframe) but as a position relative to the pitch peak position of the adaptive code vector, which is taken as 0.

[0044] In FIG. 5, reference numeral 51 denotes an adaptive code vector selected and output from the adaptive codebook; 52 denotes a pitch pulse position detector that receives the adaptive code vector 51, detects the pitch pulse position within the adaptive code vector, and outputs it to the pulse position converter 53; 53 denotes a pulse position converter that receives the pitch pulse position output from the pitch pulse position detector 52 and the pulse position and pulse amplitude output from the noise codebook 54, converts only the pulse position, and outputs the pulse amplitude and the converted pulse position to the buffer 55 and the square error minimizing means 58; 54 denotes the noise codebook described in the second embodiment, in which pulse positions are expressed relative to the pitch pulse position taken as 0; 55 denotes a buffer that buffers the position (as converted by the pulse position converter 53) and amplitude of the pulse of the N-th index output from the noise codebook 54 via the pulse position converter 53, and outputs the positions (as converted by the pulse position converter 53) and amplitudes of the pulses of the (N−n)-th to (N−1)-th indexes to the square error minimizing means 58; 56 denotes the target vector of the noise code vector component, obtained by subtracting the zero-input response of the linear prediction synthesis filter and the adaptive code vector component from the perceptually weighted input speech; 57 denotes the impulse response of the linear prediction synthesis filter, obtained by applying perceptual weighting to the impulse response of the linear prediction synthesis filter; 58 denotes square error minimizing means that receives the position and amplitude of the pulse of the N-th index from the noise codebook 54 via the pulse position converter 53, the positions of the (N−n)-th to (N−1)-th pulses from the buffer 55 via the pulse position converter 53, the target vector 56 of the noise code vector component, and the impulse response 57 of the linear prediction synthesis filter, and selects and outputs the noise code vector 59 that minimizes the square error between the target vector 56 of the noise code vector component and the noise code vector after passing through the linear prediction synthesis filter; and 59 denotes the noise code vector finally output from the square error minimizing means 58.

[0045] The operation of the noise code vector generation unit configured as described above will now be described with reference to FIG. 5. In FIG. 5, the pitch pulse position detector 52 uses the input adaptive code vector 51 to detect the position of the pitch pulse within the adaptive code vector. The pitch pulse position can be found by maximizing the normalized cross-correlation between the adaptive code vector and an impulse train whose impulses are spaced at the pitch period. It can also be found more accurately by minimizing the error between that impulse train after it has passed through the synthesis filter and the adaptive code vector after it has passed through the synthesis filter. The pulse position is expressed as the number of samples from the start of the frame (or subframe) to the first pitch pulse.

[0046] The pulse position converter 53 converts the pulse positions stored in the noise codebook 54, which are expressed relative to the pitch pulse position taken as 0, into absolute positions measured from the start of the frame (or subframe), taken as 0. Specifically, it adds the pitch pulse position output by the pitch pulse position detector 52 to the pulse position input from the noise codebook 54. It then outputs the pulse position, now in absolute form, together with the pulse amplitude (which is passed through unchanged) to the buffer 55 and the square error minimizing means 58.
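
The conversion amounts to a single addition per pulse. A minimal sketch, with a hypothetical function name and pulses represented as (position, amplitude) pairs, is given below. The text does not say how a converted position that falls outside the frame is handled, so the sketch leaves such positions untouched.

```python
def to_absolute_positions(relative_pulses, pitch_pulse_position):
    """Shift pitch-relative pulse positions so that 0 is the start of the subframe.

    Amplitudes are passed through unchanged, as in the description of the
    pulse position converter.
    """
    return [(position + pitch_pulse_position, amplitude)
            for position, amplitude in relative_pulses]
```

For example, with a pitch pulse detected at sample 12, a pulse stored at relative position −3 lands at absolute position 9.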

[0047] The buffer 55 buffers the position (as converted by the pulse position converter 53) and amplitude of the pulse stored at the N-th index, output from the noise codebook 54 via the pulse position converter 53. The buffer 55 already holds the positions and amplitudes of the pulses stored at the (N−n)-th to (N−1)-th indexes of the noise codebook 54, as received via the pulse position converter 53, and the position and amplitude of the pulse stored at the N-th index are newly added to it. The buffer then outputs the amplitudes and positions of the pulses stored at the (N−n)-th to (N−1)-th indexes to the square error minimizing means 58.

[0048] Using the pulse of the N-th index input from the noise codebook 54 via the pulse position converter 53 and the positions and amplitudes of the pulses of the (N−n)-th to (N−1)-th indexes input from the buffer 55, the square error minimizing means 58 computes (Equation 4) and (Equation 5) and evaluates (Equation 1). From among all the indexes of the noise codebook 54, it outputs the noise code vector 59 represented by the index N that maximizes (Equation 1). The noise code vector 59 consists of n pulses and is generated from the pulses whose positions and amplitudes are stored at the N-th to (N−n+1)-th indexes.
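
The sliding-window structure of the codebook, in which each index stores one pulse and the code vector for index N is built from the n most recent pulses, might be sketched as follows. The generator name is hypothetical, and skipping the first n−1 indexes (which do not yet have n pulses behind them) is an assumption of this sketch.

```python
from collections import deque

def code_vectors(codebook, n):
    """Yield (index, pulses), where pulses are those stored at indexes index-n+1 .. index.

    codebook is a list with one (position, amplitude) pulse per index.
    """
    window = deque(maxlen=n)  # plays the role of the buffer
    for index, pulse in enumerate(codebook):
        window.append(pulse)  # the oldest pulse drops out automatically
        if len(window) == n:
            yield index, list(window)
```

Code vectors of adjacent indexes share n−1 pulses, so a search can update its correlation and energy terms incrementally instead of recomputing them from scratch. This sketch only enumerates the pulse sets and does not reproduce (Equation 4) or (Equation 5).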

[0049] The noise codebook used in the fourth embodiment is the same as that of the second embodiment shown in FIG. 3, except that the values giving the pulse positions are relative values with the pitch pulse position taken as 0. Because the configuration of the fourth embodiment is effective for voiced segments, it can be applied to the voiced mode of a CELP-type speech coding apparatus that classifies the excitation into modes such as voiced and unvoiced and uses a different excitation for each. In addition, providing a pitch periodization unit in this noise code vector generation unit further improves the sound quality of voiced segments.

[0050] The speech coding apparatuses described in the first to fourth embodiments can also be implemented as software recorded on a recording medium such as a magnetic disk, a magneto-optical disk, or a ROM cartridge.

[0051] (Fifth Embodiment) Next, a fifth embodiment will be described. FIG. 6 shows a pulse position codebook generator according to the fifth embodiment of the present invention. Its characteristic feature is that, when the pulse positions of a sparse noise codebook are generated stochastically, positions near the pitch pulse position of the adaptive code vector are given a higher probability of occurrence.

[0052] In FIG. 6, reference numeral 61 denotes the minimum value of the pulse positions to be generated; 62 denotes the maximum value of the pulse positions to be generated; 63 denotes a predetermined pitch pulse position; 64 denotes a probability distribution function representing the occurrence probability distribution of pulse positions; 65 denotes a pulse position generator that receives the pulse position minimum value 61, the pulse position maximum value 62, the pitch pulse position 63, and the probability distribution function 64 and outputs pulse positions one after another; and 66 denotes a pulse position codebook made up of the pulse positions output from the pulse position generator 65.

[0053] The operation of the pulse position codebook generator configured as described above will now be described with reference to FIGS. 6 and 7. In FIG. 6, the pulse position minimum value 61 is input to the pulse position generator 65 to specify the smallest pulse position that may appear in the pulse position codebook 66.

[0054] The pulse position maximum value 62 is input to the pulse position generator 65 to specify the largest pulse position that may appear in the pulse position codebook 66.

[0055] The pitch pulse position 63 is input to the pulse position generator 65 in order to exploit the tendency, observed when a pulse excitation is used as the noise excitation, for pulses in voiced segments to be placed near the pitch pulse position.

[0056] The probability distribution function 64 expresses that same tendency as a probability distribution function, and is input to the pulse position generator 65 so that pulse positions are generated according to that distribution.

[0057] The pulse position generator 65 receives the pulse position minimum value 61, the pulse position maximum value 62, the pitch pulse position 63, and the probability distribution function 64, generates positions between the pulse position minimum value 61 and the pulse position maximum value 62 according to the distribution given by the probability distribution function 64, and outputs the generated pulse positions to the pulse position codebook 66. Here, the probability distribution function is a function whose distribution is centered on the pitch pulse position 63, as shown in FIG. 7. This probability distribution function is prepared in advance, for example by performing speech coding with a pulse excitation that has a sufficiently large number of code vectors and collecting statistics.
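
A minimal sketch of such a generator is shown below. The two-sided exponential weighting, the `spread` parameter, and the fixed seed are assumptions chosen for illustration. The text describes the actual distribution only as one centered on the pitch pulse position and obtained from coding statistics.

```python
import math
import random

def generate_pulse_positions(min_pos, max_pos, pitch_pos, count, spread=4.0, seed=0):
    """Draw `count` pulse positions in [min_pos, max_pos], favoring those near pitch_pos.

    The weight of a position decays exponentially with its distance from the
    pitch pulse position; `spread` (hypothetical) sets how fast it decays.
    """
    rng = random.Random(seed)
    positions = list(range(min_pos, max_pos + 1))
    weights = [math.exp(-abs(p - pitch_pos) / spread) for p in positions]
    return rng.choices(positions, weights=weights, k=count)
```

In practice the weights would be replaced by the empirically measured distribution. The sampling step itself stays the same.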

[0058] The codebooks of the first to fourth embodiments can be created using the pulse position codebook described in this fifth embodiment.

[0059]

[Effects of the Invention] As described above, by structuring a sparse noise codebook, the present invention makes it possible to reduce both the amount of computation required for the noise codebook search and the amount of memory required to store the noise codebook. In addition, by measures such as expressing pulse positions relative to the pitch pulse position present in the adaptive code vector, the phase information contained within one pitch waveform can be exploited to improve the sound quality of voiced segments.

[Brief description of the drawings]

FIG. 1 is a block diagram of the noise code vector generation unit of a CELP-type speech coding apparatus according to Embodiment 1 of the present invention.

FIG. 2 is a block diagram of the noise code vector generation unit of a CELP-type speech coding apparatus according to Embodiment 2 of the present invention.

FIG. 3 is a table showing an example of a noise codebook according to Embodiment 2 of the present invention.

FIG. 4 is a block diagram of the noise code vector generation unit of a CELP-type speech coding apparatus according to Embodiment 3 of the present invention.

FIG. 5 is a block diagram of the noise code vector generation unit of a CELP-type speech coding apparatus according to Embodiment 4 of the present invention.

FIG. 6 is a block diagram of a pulse position codebook generator according to Embodiment 5 of the present invention.

FIG. 7 is a schematic diagram of the function representing the occurrence distribution of pulse positions used in the pulse position codebook generator according to Embodiment 5 of the present invention.

FIG. 8 is a block diagram of the noise code vector generation unit of a general CELP-type speech coding apparatus.

[Explanation of symbols]

11, 21 Square error minimizing means
12, 22 Target vector of noise code vector component
13, 23 Linear prediction synthesis filter impulse response
14, 24 Noise codebook
15, 25 Noise code vector
31 Noise codebook
32 Buffer
33 Target vector of noise code vector component
34 Linear prediction synthesis filter impulse response
35 Square error minimizing means
36 Noise code vector
41 Adaptive code vector
42 Pitch pulse position detector
43 Noise codebook selector
44 Noise codebook
45 Switch
46 Target vector of noise code vector component
47 Linear prediction synthesis filter impulse response
48 Square error minimizing means
49 Noise code vector
51 Adaptive code vector
52 Pitch pulse position detector
53 Pulse position converter
54 Noise codebook
55 Buffer
56 Target vector of noise code vector component
57 Linear prediction synthesis filter impulse response
58 Square error minimizing means
59 Noise code vector
61 Pulse position minimum value
62 Pulse position maximum value
63 Pitch pulse position
64 Probability distribution function
65 Pulse position generator
66 Pulse position codebook

Claims

[Claims]

1. A CELP speech coding apparatus in which a noise codebook is composed of elements each describing a combination of a pulse position and an amplitude, and the noise code vectors stored in the noise codebook are structured so that only a small number of pulses differ between the noise code vectors of adjacent indexes.

2. The speech coding apparatus according to claim 1, wherein the noise codebook is structured so that only one pulse differs between the noise code vectors of adjacent indexes.

3. A CELP speech coding apparatus in which a noise codebook is composed of elements each describing a combination of a pulse position and an amplitude, a noise code vector is composed of n or more pulses, and the noise code vector of the N-th index is represented by the sum of the pulse having the position and amplitude described in the N-th index of the noise codebook and the pulses having the positions and amplitudes described in the (N−1)-th to (N−n+1)-th indexes.

4. The speech coding apparatus according to claim 2, wherein the content of each index consists only of the position and amplitude of one pulse.

5. The speech coding apparatus according to claim 4, wherein the noise codebook to be used is switched according to the pitch peak position of the adaptive code vector.

6. The speech coding apparatus according to any one of claims 1 to 5, wherein the pulse position described in the content of each index is expressed as a position relative to the pitch peak position of the adaptive code vector, with the pitch peak position of the adaptive code vector taken as 0.

7. The speech coding apparatus according to any one of claims 1 to 6, comprising a noise codebook in which pulse positions are determined using a random number generation function under which positions corresponding to the pitch peak position of the adaptive code vector have a higher probability of occurrence than other positions.

8. A recording medium storing a program that implements, in software, the speech coding apparatus according to claim 1.