JPH0997099A - Encoding/decoding device

Info

Publication number: JPH0997099A
Application number: JP7253327A
Authority: JP
Inventors: Satoshi Watanabe; 聡渡辺
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1995-09-29
Filing date: 1995-09-29
Publication date: 1997-04-08

Abstract

PROBLEM TO BE SOLVED: To transmit only one codebook code per frame. SOLUTION: A power calculation unit 1 obtains the speech power of each frame, an LPC analysis unit 2 obtains LPC coefficients, and a feature vector extraction unit 4 obtains LPC cepstrum coefficients. A vector quantization unit 20 takes the k codebook vectors nearest to the feature vector of the current frame, up to the k-th nearest, and selects from them the one that is nearest to the feature vector while differing from every codebook vector represented by the frame codes of the past (k-1) frames; the codebook code of the selected vector becomes the frame code. Membership function values are obtained from the frame codes of the current frame and the past (k-1) frames. A pitch extraction unit 3 obtains the pitch of the speech. A vector dequantization unit 22 calculates the feature vector of the current frame, and a segment creation unit 6 creates a speech segment from the feature vector. A waveform superposition unit 7 superposes the speech segments at pitch intervals.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a high-efficiency encoding/decoding device that uses vector quantization and can reproduce high-quality speech and similar signals at a high compression rate.

[0002]

[Prior Art] Conventional techniques in this field include those described in the following references.
Reference 1: Sadaoki Furui, "Digital Speech Processing", Tokai University Press, Digital Technology Series.
Reference 2: IEICE Technical Report RC90-26, 1990, I. A. Geson, "Vector sum excited linear prediction (VSELP) speech coding for Japan digital cellular".
Reference 3: ICASSP 87, 1987, H. P. Tseng et al., "Fuzzy Vector Quantization Applied to Hidden Markov Modeling".
Reference 4: Speech Study Group material, 1998, Nakamura et al., "A Study of Spectrogram Normalization Using Fuzzy Vector Quantization", SP 87-123.
Vector quantization, which represents a set of parameters collectively with a single code, is described in Reference 1 and is a very efficient encoding technique.

[0003] In general, this technique reduces quantization distortion as the number of representative vectors in the codebook increases, but the amount of computation needed to search the codebook and the memory needed to store it grow accordingly. Reference 2 describes an application of vector quantization to speech coding. With this trade-off in mind, it uses several codebooks together so that they complement one another, but this makes the processing complex and increases the processing load. The fuzzy vector quantization described in Reference 3 represents an input vector by its degrees of membership in the codebook vectors, and can reduce quantization distortion without increasing the number of representative vectors in the codebook. The k-nearest-neighbor rule proposed by Nakamura et al. in Reference 4 performs fuzzy vector quantization for each frame using only the k codebook vectors nearest to the input vector, up to the k-th nearest. Because the number of membership function values to be calculated drops from one per codebook vector to k, the method is highly practical in terms of processing load and memory.

[0004]

[Problem to Be Solved by the Invention] However, when the conventional vector quantization described in Reference 4 is applied to high-efficiency speech coding, k codebook codes must be transmitted per frame in addition to the k membership function values, so the bit rate becomes high.
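To make the cost concrete, the following sketch counts the bits needed per frame when k codebook codes are sent and when only one is sent. The codebook size of 256 entries, k = 3, and 4 bits per quantized membership function value are illustrative assumptions and do not come from the cited references; the function name is hypothetical.

```python
import math

def bits_per_frame(codebook_size, k, bits_per_membership, codes_sent):
    """Bits per frame: `codes_sent` codebook codes plus k membership values."""
    bits_per_code = math.ceil(math.log2(codebook_size))
    return codes_sent * bits_per_code + k * bits_per_membership

# Hypothetical parameters, chosen only for illustration.
CODEBOOK_SIZE, K, MEMBERSHIP_BITS = 256, 3, 4

conventional = bits_per_frame(CODEBOOK_SIZE, K, MEMBERSHIP_BITS, codes_sent=K)
single_code = bits_per_frame(CODEBOOK_SIZE, K, MEMBERSHIP_BITS, codes_sent=1)
print(conventional, single_code)
```

Under these assumed numbers the k-code scheme needs 36 bits per frame and a one-code scheme 20 bits; the gap widens as k or the codebook size grows.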

[0005]

[Means for Solving the Problem] To solve the above problem, the present invention is directed to an encoding/decoding device comprising: a feature vector extraction unit that extracts a feature vector of the input speech for each frame; a vector quantization unit that, for each frame, selects a codebook vector from among a plurality of codebook vectors, each having a codebook code, and takes the codebook code corresponding to the selected codebook vector as the frame code of that frame; a membership function calculation unit that, for each frame, calculates membership function values based on the distance between the codebook vector represented by the frame code and the feature vector of that frame; a storage unit that stores the frame code and the membership function values of each frame; and a vector dequantization unit that, for each frame, calculates the feature vector of that frame based on the frame code and the membership function values stored in the storage unit. In this device, the vector quantization unit and the vector dequantization unit are configured as follows.

[0006] That is, the vector quantization unit has a frame code determination unit and a membership function calculation unit. The frame code determination unit takes, from among the plurality of codebook vectors, the k codebook vectors nearest to the feature vector of the current frame, up to the k-th nearest (k is a natural number, k ≥ 2). From these k vectors it selects the one nearest to the feature vector of the current frame that differs from every codebook vector represented by the frame codes of the (k-1) immediately preceding frames, and takes the corresponding codebook code as the frame code of the current frame. The membership function calculation unit calculates the membership function values of the current frame based on the distances between the feature vector of the current frame and the codebook vectors represented by the frame codes of k frames, namely the current frame and the (k-1) immediately preceding frames. The vector dequantization unit calculates the feature vector of the current frame from the membership function values of the current frame and the codebook vectors represented by the frame codes, stored in the storage unit, of the same k frames.

[0007]

[Embodiment of the Invention] FIG. 1 is a configuration diagram of a speech encoding/decoding device according to an embodiment of the present invention. The device of this embodiment differs from conventional speech encoding/decoding devices in that it determines exactly one frame code for the current frame by referring to the frame codes of the (k-1) immediately preceding frames, and performs fuzzy vector quantization restricted to the frame codes of the most recent k frames, including the current one. As shown in FIG. 1, the speech encoding/decoding device of this embodiment comprises a power calculation unit 1, an LPC analysis unit 2, a pitch extraction unit 3, a feature vector extraction unit 4, a vector quantization/dequantization unit 5, a segment creation unit 6, a waveform superposition unit 7, and a storage unit 8.

The power calculation unit 1, which calculates the power of the speech, receives sampled speech as input. Its output is connected to the LPC analysis unit 2, which performs LPC (Linear Predictive Coding) analysis on one frame, and to the storage unit 8. The output of the LPC analysis unit 2 is connected to the pitch extraction unit 3, which calculates the pitch of the speech within one frame, and to the feature vector extraction unit 4, which calculates LPC cepstrum coefficients. The output of the pitch extraction unit 3 is connected to the storage unit 8. The output of the feature vector extraction unit 4 is connected to the vector quantization/dequantization unit 5.

The vector quantization/dequantization unit 5 comprises a vector quantization unit 20, a codebook 21, and a vector dequantization unit 22. The output of the codebook 21 is connected to the vector quantization unit 20 and the vector dequantization unit 22. The output of the vector quantization unit 20 is connected to the storage unit 8, and the output of the storage unit 8 is connected to the vector dequantization unit 22.

[0008] FIG. 2 is a configuration diagram of the vector quantization/dequantization unit in FIG. 1. As shown in FIG. 2, the vector quantization unit 20 comprises a code buffer 31, a k-nearest-neighbor code selection unit 32, a frame code determination unit 33, and a membership function calculation unit 34. The output of the codebook 21 is connected to the k-nearest-neighbor code selection unit 32 and the membership function calculation unit 34. The input of the k-nearest-neighbor code selection unit 32 is connected to the feature vector extraction unit 4 in FIG. 1, and its output is connected to the frame code determination unit 33. The output of the frame code determination unit 33 is connected to the storage unit 8 and the membership function calculation unit 34. The output of the membership function calculation unit 34 is connected to the storage unit 8. The outputs of the codebook 21 and the storage unit 8 are connected to the vector dequantization unit 22, whose output is connected to the segment creation unit 6 in FIG. 1.

The code buffer 31 is a local storage area that holds the frame codes of the most recent k frames (k is a natural number, k ≥ 2). Its contents persist from frame to frame and are updated once per frame, as described later. The k-nearest-neighbor code selection unit 32 refers to the codebook 21 and selects the "k-nearest-neighbor codes" of the "k nearest neighbors". The frame code determination unit 33 determines the "frame code" of the current frame.

[0009] Here, the "k nearest neighbors" are the k codebook vectors nearest to the input vector, up to the k-th nearest. The "k-nearest-neighbor codes" are the k codebook codes corresponding to those k codebook vectors; the code of the j-th nearest vector is denoted X_j. The "frame code" is a stored parameter determined once per frame; the frame code of the current frame is denoted f_0, and the frame code of the frame i frames earlier is denoted f_i. The codebook 21 is obtained by clustering LPC cepstrum coefficients, extracted beforehand from a large amount of training data, with the LBG algorithm so that distortion is minimized. The LBG algorithm is described in detail in Reference 5 below.
Reference 5: IEEE COM-28, 1980-01, Linde, Buzo, Gray, "An Algorithm for Vector Quantizer Design".
The operation of the device in FIG. 1 is described below with reference to these figures.

[0010] (1) Encoding process
A speech signal input from a microphone passes through an amplifier and a low-pass filter, is digitized by an A/D converter, and is sent through a buffer to the power calculation unit 1 in frames of fixed length (for example, a frame length of 10 ms and a frame shift of 5 ms). The power calculation unit 1 obtains the power of the current frame by calculating the sum of squares of the samples of the speech signal, and sends the speech signal to the LPC analysis unit 2 and the power to the storage unit 8. The LPC analysis unit 2 determines the LPC coefficients of the speech signal by fitting a linear prediction model with the least squares method, and sends the LPC coefficients to the feature vector extraction unit 4, and the LPC coefficients together with one frame of speech data to the pitch extraction unit 3. The feature vector extraction unit 4 evaluates a recurrence formula on the LPC coefficients to obtain the LPC cepstrum coefficients, and sends them as the feature vector to the vector quantization unit 20. The LPC cepstrum has properties close to those of human hearing and corresponds to the result of applying an FFT (Fast Fourier Transform), a logarithm, and an inverse FFT to the speech signal. The LPC analysis and the LPC cepstrum calculation are described in detail in Reference 1.
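The frame power and the cepstrum recurrence described above can be sketched as follows. This is a minimal illustration, not the computation given in Reference 1: it assumes the all-pole model H(z) = 1 / (1 - sum of a_k z^-k), for which the standard recurrence is c_n = a_n + sum over m from 1 to n-1 of (m/n) c_m a_(n-m). The function names are hypothetical, and the gain term c_0 is left out.

```python
def frame_power(samples):
    """Power of one frame as the sum of squared samples."""
    return sum(s * s for s in samples)

def lpc_to_cepstrum(lpc, n_ceps):
    """Convert LPC coefficients a_1..a_p into cepstrum coefficients c_1..c_n_ceps.

    Assumes H(z) = 1 / (1 - sum_k a_k z^-k); a different sign convention
    for the predictor changes the signs in the recurrence.
    """
    p = len(lpc)
    a = [0.0] + list(lpc)        # shift to 1-based indexing: a[1]..a[p]
    c = [0.0] * (n_ceps + 1)     # c[0] (gain term) is not computed here
    for n in range(1, n_ceps + 1):
        acc = a[n] if n <= p else 0.0
        for m in range(1, n):
            if n - m <= p:       # a_(n-m) is zero beyond the predictor order
                acc += (m / n) * c[m] * a[n - m]
        c[n] = acc
    return c[1:]
```

For a first-order predictor with coefficient a, the recurrence reproduces the known series c_n = a^n / n, which is a convenient sanity check.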

[0011] The feature vector sent from the feature vector extraction unit 4 in FIG. 1 (hereinafter, input vector X) is input to the k-nearest-neighbor code selection unit 32 in FIG. 2. At this point the code buffer 31 holds k frame codes {f_1, f_2, ..., f_k}. The k-nearest-neighbor code selection unit 32 calculates the distance between the input vector X and every codebook vector stored in the codebook 21, sorts the results to obtain the k-nearest-neighbor codes X_j (j = 1, ..., k), and outputs them to the frame code determination unit 33. The cepstrum distance, for example, is used as the distance measure.

FIG. 3 is a flowchart showing the operation of the frame code determination unit 33 in FIG. 2, which is described below with reference to FIG. 3. In step S1, the frame code determination unit 33 initializes j to 1. In step S2, it compares the first-neighbor code X_1 with the (k-1) frame codes {f_1, ..., f_{k-1}} in the code buffer 31. The frame code f_k of the frame k frames earlier is not compared.
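The selection performed by the k-nearest-neighbor code selection unit 32 (compute all distances, sort, keep the first k) can be sketched as below. Two things are assumptions made for the example: a codebook vector's list index stands in for its codebook code, and the cepstrum distance is taken to be the squared Euclidean distance between cepstrum vectors. The function names are hypothetical.

```python
def cepstral_distance(x, y):
    """Squared Euclidean distance between two cepstrum vectors (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def k_nearest_codes(x, codebook, k):
    """Return the codes [X_1, ..., X_k] of the k codebook vectors nearest to x.

    `codebook` is a list of vectors; the list index of a vector is used
    as its codebook code.
    """
    order = sorted(range(len(codebook)),
                   key=lambda code: cepstral_distance(x, codebook[code]))
    return order[:k]

# Toy codebook with four two-dimensional vectors.
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
print(k_nearest_codes([0.9, 0.1], toy_codebook, 3))
```

A full sort is the approach the text describes; for large codebooks a partial selection such as `heapq.nsmallest` would return the same k codes with less work.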

[0012] In step S3, if the first-neighbor code X_1 differs from all of the (k-1) frame codes in the code buffer 31, X_1 is chosen for the frame code f_0 and processing proceeds to step S4. If X_1 equals any of the (k-1) frame codes, j is incremented by 1 and processing returns to step S2. In step S2, the second-neighbor code X_2 is now compared with the (k-1) frame codes {f_1, ..., f_{k-1}} in the code buffer 31. In step S3, if X_2 differs from all of them, X_2 is chosen for the frame code f_0 and processing proceeds to step S4; if it equals any of them, j is again incremented by 1, processing returns to step S2, and the third-neighbor code X_3 is compared with the (k-1) frame codes {f_1, ..., f_{k-1}}. Because the code buffer 31 supplies only (k-1) codes for comparison while there are k distinct neighbor codes, repeating this procedure at most up to the k-th-neighbor code X_k always yields a code X_j, and processing can proceed to step S4. The code X_j determined in this way differs from all of the (k-1) frame codes in the code buffer 31 and, among the codebook vectors in the codebook 21 that satisfy this condition, corresponds to the one nearest to the input vector X.
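Steps S1 to S3 amount to scanning the neighbor codes in order of increasing distance and stopping at the first one that is absent from the (k-1) most recent frame codes. The sketch below illustrates this; the function name and the list-based buffer layout (most recent code first) are assumptions made for the example.

```python
def decide_frame_code(neighbor_codes, code_buffer):
    """Choose the frame code f_0 as in steps S1 to S3.

    neighbor_codes: [X_1, ..., X_k], nearest first, all distinct.
    code_buffer:    [f_1, ..., f_k], most recent first; f_k is not compared.
    """
    k = len(neighbor_codes)
    recent = set(code_buffer[:k - 1])   # f_1 .. f_(k-1)
    for candidate in neighbor_codes:    # j = 1, 2, ..., k
        if candidate not in recent:
            return candidate
    # Unreachable when the k neighbor codes are distinct, since only
    # k-1 codes are excluded.
    raise ValueError("neighbor_codes must contain k distinct codes")

print(decide_frame_code([7, 3, 9], [3, 7, 9]))
```

In the printed example the buffer excludes codes 3 and 7 (code 9 sits in the ignored f_k slot), so the third neighbor is chosen.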

[0013] In step S4, the code X_j becomes the frame code f_0. In step S5, the frame code f_0 is output to the storage unit 8, and processing proceeds to step S6. In step S6, the frame codes {f_1, f_2, ..., f_k} held in the code buffer 31 are updated to {f_0, f_1, ..., f_{k-1}}. The resulting frame codes {f_0, f_1, ..., f_{k-1}} are all distinct, and the first-neighbor code X_1 is always among them. Moreover, because speech changes continuously, when the speech changes smoothly the set of frame codes {f_0, f_1, ..., f_{k-1}} often coincides with the set of k-nearest-neighbor codes {X_1, X_2, ..., X_k}, so the k codebook codes {f_0, f_1, ..., f_{k-1}} can be regarded as a close approximation of the k nearest neighbors.

The membership function calculation unit 34 recalculates the distance between the input vector X and the codebook vector indicated by each of the k codes {f_0, ..., f_{k-1}} in the updated code buffer 31. It may derive the membership function values from these distances, quantize them with a suitable number of bits by scalar or vector quantization, and output them to the storage unit 8. Alternatively, it may calculate and quantize only the membership function value λ_0 corresponding to the code nearest to the input vector X, and output it to the storage unit 8 together with a pointer code indicating which code is nearest. In the latter case, however, the vector dequantization unit 22 needs additional processing that assigns membership function values to the other codes, for example uniformly as (1 - λ_0)/(k - 1). The method of obtaining the membership function values is described in detail in Reference 5 below.
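The buffer update of step S6 and the two membership options can be sketched as follows. The text defers the exact membership formula to the cited reference, so the inverse-distance weighting below (the usual fuzzy c-means form with a fuzziness exponent of 2) is an assumption, as are the function names; quantization of the values is omitted.

```python
def update_code_buffer(code_buffer, f0):
    """Step S6: {f_1..f_k} becomes {f_0, f_1..f_(k-1)}; the oldest code is dropped."""
    return [f0] + code_buffer[:-1]

def membership_values(distances, fuzziness=2.0):
    """Membership values for k codes from their squared distances to X.

    Assumed fuzzy c-means form: weights proportional to d^(-1/(m-1)),
    normalised so that they sum to 1.
    """
    if any(d == 0.0 for d in distances):
        # Exact match: all weight goes to the matching code(s).
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-1.0 / (fuzziness - 1.0)) for d in distances]
    total = sum(inv)
    return [v / total for v in inv]

def expand_single_membership(lambda0, nearest_index, k):
    """Second option: rebuild k values from lambda_0 alone.

    The other k-1 codes each receive (1 - lambda_0) / (k - 1).
    """
    rest = (1.0 - lambda0) / (k - 1)
    return [lambda0 if i == nearest_index else rest for i in range(k)]
```

The second option stores one value and one pointer per frame instead of k values, at the cost of treating all non-nearest codes as equally relevant.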

[0014] Reference 5: Japanese Unexamined Patent Application Publication No. H1-237600.
Meanwhile, the pitch extraction unit 3 in FIG. 1 determines the pitch period with the modified correlation method, which uses the LPC residual waveform calculated from the LPC coefficients and the speech signal (when the LPC residual waveform is shifted by one pitch period, the phases nearly coincide and the power across the waveforms before and after the shift is at its maximum), and sends it to the storage unit 8. The modified correlation method using the LPC residual waveform is described in Reference 6 below.
Reference 6: Reports of the Int. Con. Acoust. Tokyo, C-5-5, 1968, Y. Kohashi (ed.), F. Itakura, S. Saito, "Analysis Synthesis Telephony based on the Maximum Likelihood Method".
(2) Decoding process
Next, the vector dequantization unit 22 is described. Using the frame code received from the storage unit 8, the vector dequantization unit 22 updates its own code buffer (not shown) by the same procedure as for the code buffer 31 in the vector quantization unit 20. That is, the k codes in the code buffer are updated from {f_1, f_2, ..., f_k} to {f_0, f_1, ..., f_{k-1}}. Next, it obtains from the codebook 21 the codebook vectors corresponding to the k codes {f_0, f_1, ..., f_{k-1}} in the code buffer, and forms their linear combination with the k membership function values received from the storage unit 8 as coefficients. The result is the feature vector, consisting of LPC cepstrum coefficients, which is sent to the segment creation unit 6 in FIG. 1. The segment creation unit 6 applies an FFT, an exponential transform, and an inverse FFT in sequence to the LPC cepstrum coefficients to create a symmetric speech segment, and sends it to the waveform superposition unit 7. The segment is symmetric because the LPC cepstrum coefficients are created using only the real part of the FFT, with the imaginary part discarded.
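The two decoding steps above, the membership-weighted linear combination and the transform from cepstrum to a symmetric segment, can be sketched as follows. The sketch uses direct cosine sums in place of an FFT so that it stays self-contained, and it assumes a two-sided symmetric cepstrum (hence the factor 2 on c_1 and above) supplied as [c_0, c_1, ...]; both choices, and the function names, are assumptions rather than details taken from the text.

```python
import math

def dequantize(code_buffer, memberships, codebook):
    """Feature vector as the membership-weighted sum of the buffered codebook vectors."""
    dim = len(codebook[0])
    out = [0.0] * dim
    for code, weight in zip(code_buffer, memberships):
        for i in range(dim):
            out[i] += weight * codebook[code][i]
    return out

def make_segment(cepstrum, n_points):
    """Zero-phase segment from cepstrum [c_0, c_1, ...].

    Cosine transform to a log-magnitude spectrum, exponential, then an
    inverse transform of the real spectrum. The result s satisfies
    s[n] == s[n_points - n], i.e. it is symmetric.
    """
    spectrum = []
    for k in range(n_points):
        log_mag = cepstrum[0] + 2.0 * sum(
            c * math.cos(2.0 * math.pi * k * n / n_points)
            for n, c in enumerate(cepstrum[1:], start=1))
        spectrum.append(math.exp(log_mag))
    return [sum(spectrum[k] * math.cos(2.0 * math.pi * k * n / n_points)
                for k in range(n_points)) / n_points
            for n in range(n_points)]
```

An all-zero cepstrum gives a flat spectrum and therefore a unit impulse, which makes a simple check of the transform pair.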

[0015] The waveform superposition unit 7 determines the power of each speech segment from the pitch and power received from the storage unit 8, adjusts the segment power, then synthesizes speech by superposing the segments at pitch intervals, and outputs it from a speaker through an output buffer and a D/A converter. The segment power adjustment and the segment superposition method used in the waveform superposition unit 7 are described in detail in Reference 6 below.
Reference 6: Journal of the Acoustical Society of Japan, 44, 11, 1988, Nakajima et al., "Power Spectrum Envelope (PSE) Speech Analysis-Synthesis System".
As described above, this embodiment realizes fuzzy vector quantization that stores only one code per frame, which has the advantage of greatly reducing the storage capacity required of a speech storage device.
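Superposing segments at pitch intervals can be sketched as a plain overlap-add. The text leaves the power adjustment to the cited reference, so the gain rule below (scale each segment so its sum of squares equals a target power) is only an assumed stand-in; the function names are hypothetical, and pitch periods are given in samples.

```python
import math

def scale_to_power(segment, target_power):
    """Scale a segment so that its sum of squares equals target_power (assumed rule)."""
    current = sum(s * s for s in segment)
    if current == 0.0:
        return list(segment)
    gain = math.sqrt(target_power / current)
    return [gain * s for s in segment]

def overlap_add(segments, pitch_periods):
    """Start each segment one pitch period after the previous one and sum overlaps.

    segments:      non-empty list of sample lists, already power-adjusted.
    pitch_periods: spacing in samples between consecutive segment starts.
    """
    starts = [0]
    for period in pitch_periods[:len(segments) - 1]:
        starts.append(starts[-1] + period)
    length = max(start + len(seg) for start, seg in zip(starts, segments))
    out = [0.0] * length
    for start, seg in zip(starts, segments):
        for i, sample in enumerate(seg):
            out[start + i] += sample
    return out

print(overlap_add([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]], [2]))
```

In the printed example the second segment starts two samples after the first, so only the third output sample receives contributions from both.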

[0016] The present invention is not limited to the above embodiment, and various modifications are possible. Examples of such modifications include the following.
(1) In this embodiment the LPC cepstrum is used as the feature parameter, but another parameter, such as improved cepstrum coefficients, may be used instead.
(2) The present invention can be applied to any speech coding scheme that performs vector quantization frame by frame, and its use is not limited to waveform-superposition speech storage devices.
(3) The present invention can also be used for data compression and coding of media other than speech, such as images.

[0017]

[Effect of the Invention] As described in detail above, the present invention realizes vector quantization that stores only one code per frame, and can greatly reduce the storage capacity required of the storage unit.

[Brief description of drawings]

[FIG. 1] A configuration diagram of the speech encoding/decoding device according to an embodiment of the present invention.

[FIG. 2] A configuration diagram of the vector quantization/dequantization unit in FIG. 1.

[FIG. 3] A flowchart showing the operation of the frame code determination unit in FIG. 2.

[Explanation of symbols]

1 Power calculation unit
2 LPC analysis unit
3 Pitch extraction unit
4 Feature vector extraction unit
5 Vector quantization/dequantization unit
6 Segment creation unit
7 Waveform superposition unit
8 Storage unit
20 Vector quantization unit
21 Codebook
22 Vector dequantization unit
31 Code buffer
32 k-nearest-neighbor code selection unit
33 Frame code determination unit
34 Membership function calculation unit

Claims

[Claims]

1. An encoding/decoding device comprising: a feature vector extraction unit that extracts a feature vector of input speech for each frame; a vector quantization unit that, for each frame, selects a codebook vector from among a plurality of codebook vectors each having a codebook code, and takes the codebook code corresponding to the selected codebook vector as the frame code of that frame; a membership function calculation unit that, for each frame, calculates membership function values based on the distance between the codebook vector represented by the frame code and the feature vector of that frame; a storage unit that stores the frame code and the membership function values for each frame; and a vector dequantization unit that, for each frame, calculates the feature vector of that frame based on the frame code and the membership function values stored in the storage unit, wherein the vector quantization unit has: a frame code determination unit that, from among the k codebook vectors nearest to the feature vector of the current frame, up to the k-th nearest (k is a natural number, k ≥ 2), takes as the frame code of the current frame the codebook code corresponding to the codebook vector that is nearest to the feature vector of the current frame and differs from every codebook vector represented by the frame codes of the (k-1) immediately preceding frames; and a membership function calculation unit that calculates the membership function values of the current frame based on the distances between the feature vector of the current frame and the codebook vectors represented by the frame codes of k frames consisting of the current frame and the (k-1) immediately preceding frames, and wherein the vector dequantization unit calculates the feature vector of the current frame from the membership function values of the current frame and the codebook vectors represented by the frame codes, stored in the storage unit, of the k frames consisting of the current frame and the (k-1) immediately preceding frames.