JPH02146100A

JPH02146100A - Voice encoding device and voice decoding device

Info

Publication number: JPH02146100A
Application number: JP63299822A
Authority: JP
Inventors: Shigeru Hosoi (細井 茂); Yoshio Sato (佐藤 好男); Koichi Honma (本間 光一)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1988-11-28
Filing date: 1988-11-28
Publication date: 1990-06-05
Anticipated expiration: 2013-09-17
Also published as: JP2797348B2

Abstract

PURPOSE: To improve the quality of the decoded voice and the encoding efficiency by quantizing the voice according to its characteristics, such as voiced sound, voiceless sound, and no sound. CONSTITUTION: The device is equipped with a classifying device 1 which analyzes the characteristics of the input voice signal and classifies it as a voiced sound stationary part, a voiced sound transient part, a voiceless sound, or no sound. The classifying device 1 outputs its decision data to switches 10 and 11, a quantizer 4, and a multiplexer 5 according to the characteristics of the voice signal. No sound and voiceless sound are input directly to the quantizer 4; the voiced sound transient part is input to the quantizer 4 through a prediction device 3; and the voiced sound stationary part is compressed by a time-base compressor 2 and then input to the quantizer 4 through the prediction device 3, so that each is quantized. Consequently, the quality of the decoded voice and the encoding efficiency are improved.

Description

Detailed Description of the Invention

Field of Industrial Application

The present invention relates to a speech encoding device and a speech decoding device for use in digital communications, voice mail, and the like.

Prior Art

FIG. 6(a) shows a conventional speech encoding device, and FIG. 6(b) shows a conventional speech decoding device.

In FIG. 6(a), 14 is a predictor that obtains a prediction error and short-term/long-term prediction filter coefficients from a speech signal A/D-converted by linear PCM. 15 is a quantizer which, as shown in FIG. 7, comprises a codebook 15a in which a plurality of fixed-length signal sequences (representative vectors) are stored in advance, and a vector quantizer 15b that vector-quantizes the prediction error from the predictor 14 and outputs the number of a representative vector. 16 is a multiplexer that multiplexes the quantized values of the short-term/long-term prediction filter coefficients from the predictor 14 with the representative vector number from the quantizer 15.

In FIG. 6(b), 17 is a separator that separates the data from the encoder into the short-term and long-term prediction filter coefficients and the representative vector number. 18 is an inverse quantizer that has the same codebook (not shown) as the codebook 15a of the encoder and outputs, as the prediction error, the representative vector corresponding to the number from the separator 17. 19 is a synthesizer that synthesizes the speech signal from the short-term and long-term prediction filter coefficients and the representative vector from the inverse quantizer 18.

Next, the operation of the above conventional example is described.

In FIG. 6(a), when a speech signal A/D-converted by linear PCM is input, the predictor 14 obtains short-term prediction filter coefficients in order to remove the correlation between neighboring sample values of the speech signal, and uses these coefficients to obtain the short-term prediction error.

Furthermore, in order to remove the periodic correlation due to the excitation pitch of the speech signal, long-term prediction filter coefficients are obtained from the short-term prediction error, and the prediction error is obtained using these long-term prediction filter coefficients.

The quantizer 15 calculates the squared distance between this prediction error signal sequence and each representative vector in the codebook 15a, and outputs the number of the representative vector with the smallest distance as the quantized value.

The multiplexer 16 therefore sends the speech signal to the decoder as data compressed into the short-term prediction filter coefficients, the long-term prediction filter coefficients, and a representative vector number.

In FIG. 6(b), the synthesizer 19 has a filter whose characteristics are the inverse of those of the filter in the predictor 14 of the encoder. It can therefore decode the representative vector into a speech signal with a filter set according to the short-term/long-term prediction filter coefficients from the encoder.

Problems to Be Solved by the Invention

However, the conventional speech encoding device and speech decoding device described above perform the same processing regardless of the characteristics of the speech, such as voiced sound, unvoiced sound, and silence. When speech is encoded at a low bit rate, the encoding is therefore not adapted to the characteristics of the speech, so the quality of the decoded speech is poor and the encoding efficiency cannot be improved.

The present invention solves these problems, and its object is to provide a speech encoding device and a speech decoding device that can improve the encoding efficiency.

Means for Solving the Problems

To achieve the above object, the speech encoding device of the present invention classifies a speech signal into a voiced stationary part, a voiced transient part, and unvoiced sound and the like; compresses the voiced stationary part on the time axis; outputs the prediction errors of the voiced transient part and of the compressed voiced stationary part; and quantizes the unvoiced sound and the like together with the prediction errors of the voiced stationary part and the voiced transient part.

Likewise, to achieve the above object, the speech decoding device of the present invention inversely quantizes the quantized values of the unvoiced sound and the like, the voiced stationary part, and the voiced transient part; synthesizes the inversely quantized voiced stationary part and voiced transient part into a speech signal; and expands the synthesized voiced stationary part on the time axis.

Operation

With the above configuration, the present invention can quantize speech according to its characteristics, such as voiced sound, unvoiced sound, and silence, so the quality of the decoded speech is good and the encoding efficiency can be improved.

Embodiments

An embodiment of the present invention is described below with reference to the drawings. FIG. 1(a) is a block diagram showing an embodiment of the speech encoding device according to the present invention; FIG. 1(b) is a block diagram showing an embodiment of the speech decoding device according to the present invention; FIG. 2 is a detailed block diagram of the predictor in FIG. 1(a); FIG. 3 is a detailed block diagram of the quantizer in FIG. 1(a); FIG. 4 is a waveform diagram of a typical speech signal; and FIG. 5 is a flowchart for explaining the operation of the classifier in FIG. 1(a).

In FIG. 1(a), 1 is a classifier that, as described later, analyzes the characteristics of the input speech signal and classifies it as a voiced stationary part, a voiced transient part, unvoiced sound, or silence. 2 is a time-axis compressor that, based on the decision data from the classifier 1, thins out similar waveforms in the voiced stationary part of the speech signal and thereby compresses it on the time axis.
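The text does not say how similar waveforms are thinned out, so the following is only a rough sketch of one possible reading: keep one pitch period out of every few in a voiced stationary frame, and have the decoder-side expander repeat each kept period. The function names, the `keep_every` parameter, and the assumption that the pitch period is known in samples are all hypothetical.

```python
from typing import List


def compress_time_axis(frame: List[float], pitch_period: int,
                       keep_every: int = 2) -> List[float]:
    """Keep one pitch period out of every `keep_every` (assumed scheme)."""
    kept: List[float] = []
    for k, start in enumerate(range(0, len(frame), pitch_period)):
        if k % keep_every == 0:
            kept.extend(frame[start:start + pitch_period])
    return kept


def expand_time_axis(compressed: List[float], pitch_period: int,
                     keep_every: int = 2) -> List[float]:
    """Repeat each kept pitch period `keep_every` times (assumed scheme)."""
    out: List[float] = []
    for start in range(0, len(compressed), pitch_period):
        out.extend(compressed[start:start + pitch_period] * keep_every)
    return out


frame = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]  # period of 2 samples
short = compress_time_axis(frame, pitch_period=2)
restored = expand_time_axis(short, pitch_period=2)
```

Because the dropped periods resemble their neighbours in a stationary voiced segment, repetition restores the original length with little audible change; for a perfectly periodic frame such as the one above, `restored` equals `frame`.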

3 is a predictor that obtains, from the voiced portions of the speech signal, the prediction error and the short-term/long-term prediction filter coefficients. As shown in FIG. 2, the predictor 3 comprises: a short-term prediction filter 21 for obtaining the short-term prediction error from the voiced sound; a short-term prediction analyzer 22 that obtains the short-term prediction filter coefficients of the short-term prediction filter 21 in order to remove the correlation between neighboring sample values of the voiced sound; a long-term prediction filter 23 for obtaining the prediction error from the short-term prediction error output by the short-term prediction filter 21; a long-term prediction analyzer (pitch predictor) 24 that obtains the long-term prediction filter coefficients of the long-term prediction filter 23 from the short-term prediction error in order to remove the periodic correlation due to the excitation pitch of the voiced sound; and a quantizer 25 that quantizes the short-term prediction filter coefficients and the long-term prediction filter coefficients respectively.
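As a minimal sketch of the two filter stages in the predictor, and of the inverse filters that a synthesizer would apply, the code below takes the filter coefficients as given rather than deriving them. The single-tap long-term filter with an integer lag, the function names, and the zero initial filter state are assumptions made for illustration; the text does not specify the filter orders or the analysis method.

```python
from typing import List, Sequence


def short_term_residual(x: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """e_s[n] = x[n] - sum_k a_k * x[n-k], with samples before n=0 taken as 0."""
    out = []
    for n in range(len(x)):
        pred = sum(a * x[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - pred)
    return out


def long_term_residual(e_s: Sequence[float], lag: int, gain: float) -> List[float]:
    """e[n] = e_s[n] - gain * e_s[n-lag] (single-tap pitch filter, assumed)."""
    return [e_s[n] - (gain * e_s[n - lag] if n >= lag else 0.0)
            for n in range(len(e_s))]


def long_term_synthesis(e: Sequence[float], lag: int, gain: float) -> List[float]:
    """Inverse of long_term_residual."""
    e_s: List[float] = []
    for n in range(len(e)):
        e_s.append(e[n] + (gain * e_s[n - lag] if n >= lag else 0.0))
    return e_s


def short_term_synthesis(e_s: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """Inverse of short_term_residual."""
    x: List[float] = []
    for n in range(len(e_s)):
        pred = sum(a * x[n - k] for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        x.append(e_s[n] + pred)
    return x


x = [4.0, 2.0, 4.0, 2.0, 4.0, 2.0]
e_s = short_term_residual(x, coeffs=[0.5])
e = long_term_residual(e_s, lag=2, gain=1.0)
x_back = short_term_synthesis(long_term_synthesis(e, lag=2, gain=1.0), coeffs=[0.5])
```

In this toy signal the residual after both stages is much sparser than the input, which is the point of removing sample-to-sample and pitch-period correlation before quantization. Without quantization of the residual, the synthesis filters recover the input exactly.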

4 is a quantizer that quantizes the prediction error from the predictor 3, for example by vector quantization. As shown in FIG. 3, the quantizer 4 comprises codebooks 31 to 34, which store representative vectors for the voiced stationary part, the voiced transient part, unvoiced sound, and silence respectively, and a vector quantizer 35. The vector quantizer 35 selects one of the codebooks 31 to 34 according to the decision data of the speech signal classified by the classifier 1, calculates the squared distance between the prediction error signal sequence from the predictor 3 and each representative vector of the selected codebook, and outputs the number of the representative vector with the smallest distance as the quantized value.
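The codebook search described above is a nearest-neighbour search under squared distance, restricted to the codebook selected by the decision data. A minimal sketch follows; the class labels, the toy two-dimensional codebook contents, and the function names are illustrative assumptions, not values from the patent.

```python
from typing import Dict, List, Sequence


def squared_distance(x: Sequence[float], y: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(x, y))


def vector_quantize(vector: Sequence[float],
                    codebook: Sequence[Sequence[float]]) -> int:
    """Return the number (index) of the closest representative vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(vector, codebook[k]))


# Hypothetical codebooks, one per class (codebooks 31 to 34 in FIG. 3).
CODEBOOKS: Dict[str, List[List[float]]] = {
    "voiced_stationary": [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]],
    "voiced_transient": [[0.0, 0.0], [1.0, -1.0], [-1.0, 1.0]],
    "unvoiced": [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
    "silence": [[0.0, 0.0], [0.1, 0.1]],
}


def quantize_frame(label: str, vector: Sequence[float]) -> int:
    """Encoder side: select the codebook by decision data, then search it."""
    return vector_quantize(vector, CODEBOOKS[label])


def dequantize_frame(label: str, number: int) -> List[float]:
    """Decoder side: look the number up in the same codebook."""
    return list(CODEBOOKS[label][number])


number = quantize_frame("voiced_transient", [0.9, -1.2])
decoded = dequantize_frame("voiced_transient", number)
```

Because the decoder holds identical codebooks and receives the decision data, only the representative vector number needs to be transmitted for each vector.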

5 is a multiplexer that multiplexes the decision data of the speech signal classified by the classifier 1, the quantized values of the short-term/long-term prediction filter coefficients from the predictor 3, and the representative vector number from the quantizer 4. 10 and 11 are switches, each of which is switched according to the characteristics of the speech as classified by the classifier 1.

In FIG. 1(b), 6 is a separator that separates the transmission data from the encoding device into the decision data of the speech signal, the quantized values of the short-term/long-term prediction filter coefficients, and the representative vector number. 7 is an inverse quantizer that has the same codebooks (not shown) as the codebooks 31 to 34 of the encoder, selects the appropriate codebook according to the decision data from the separator 6, and outputs, as the prediction error, the representative vector corresponding to the number from the separator 6. 8 is a synthesizer that synthesizes the voiced portions of the speech signal from the short-term and long-term prediction filter coefficients and the representative vector from the inverse quantizer 7. 9 is a time-axis expander that expands the voiced stationary part of the speech signal on the time axis. 12 and 13 are switches, each of which is switched according to the decision data of the speech signal from the separator 6.

Next, the operation of the above embodiment is described.

In FIG. 1(a), the speech signal A/D-converted by linear PCM is input to the encoding device in units of a fixed time length (frame).

As shown in FIG. 4, a speech signal (analog signal) consists of silence, a relatively long unvoiced sound at the onset of speech, a relatively short voiced transient part between the unvoiced sound and the voiced stationary part, and the voiced stationary part.

In this speech signal: (1) in the voiced stationary part, similar waveforms repeat and show periodicity, and the correlation is strong both between samples and between pitch periods; (2) in the voiced transient part, similar waveforms do not repeat, but the correlation is strong both between samples and between pitch periods; (3) in unvoiced sound, the correlation between samples and between pitch periods is weak and the signal changes rapidly; and (4) in silence, the signal amplitude is small.

When such a speech signal is input, the classifier 1, as shown in FIG. 5, calculates the power Pw of the waveform within one frame (step 51) and determines whether Pw is greater than or equal to a threshold (step 52). If Pw is below the threshold, the frame is judged to be silence (step 53); if Pw is at or above the threshold, the frame is judged to contain sound and processing proceeds to step 54 and onward.

In step 55, the autocorrelation coefficients r(i) of the waveform within one frame are calculated. From these, the maximum value rmax over the specified range of i is obtained (step 56), and it is determined whether rmax is greater than or equal to a threshold (step 57). If rmax is below the threshold, the frame is judged to be unvoiced (step 58); if rmax is at or above the threshold, the frame is judged to be voiced and processing proceeds to step 59 and onward.

In step 60, it is determined whether the previous frame was voiced. If not, the current frame is judged to be a voiced transient part (step 61); if so, processing proceeds to step 62 and onward.

In step 62, the pitch period Pn is calculated from the autocorrelation coefficients r(i). The rate of change ρ relative to the pitch period Pn-1 of the previous frame is then calculated (step 63), and it is determined whether ρ is greater than or equal to a threshold (step 64). If ρ is below the threshold, the frame is judged to be a voiced stationary part; if ρ is at or above the threshold, it is judged to be a voiced transient part.
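The decision flow of FIG. 5 can be sketched as follows. All threshold values, the lag search range, the normalization of the autocorrelation, and the definition of the rate of change as |Pn - Pn-1| / Pn-1 are assumptions for illustration; the text gives none of these specifics.

```python
import math
import random
from typing import List, Optional, Tuple


def frame_power(frame: List[float]) -> float:
    return sum(s * s for s in frame) / len(frame)


def normalized_autocorr(frame: List[float], lag: int) -> float:
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return 0.0
    return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame))) / energy


def classify_frame(frame: List[float], prev_voiced: bool, prev_pitch: Optional[int],
                   power_thresh: float = 1e-4, corr_thresh: float = 0.5,
                   change_thresh: float = 0.1,
                   min_lag: int = 20, max_lag: int = 147) -> Tuple[str, Optional[int]]:
    """Return (label, pitch period in samples or None). Parameters are assumed."""
    # Steps 51-53: low power means silence.
    if frame_power(frame) < power_thresh:
        return "silence", None
    # Steps 55-58: weak autocorrelation peak means unvoiced.
    lags = range(min_lag, min(max_lag, len(frame) - 1) + 1)
    corr = {lag: normalized_autocorr(frame, lag) for lag in lags}
    pitch = max(corr, key=corr.get)
    if corr[pitch] < corr_thresh:
        return "unvoiced", None
    # Steps 60-61: voiced frame following a non-voiced frame is a transient.
    if not prev_voiced or prev_pitch is None:
        return "voiced_transient", pitch
    # Steps 62-64: a stable pitch period means a stationary voiced frame.
    rho = abs(pitch - prev_pitch) / prev_pitch
    label = "voiced_stationary" if rho < change_thresh else "voiced_transient"
    return label, pitch


tone = [math.sin(2 * math.pi * n / 40) for n in range(240)]  # period 40 samples
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(240)]

first = classify_frame(tone, prev_voiced=False, prev_pitch=None)
second = classify_frame(tone, prev_voiced=True, prev_pitch=first[1])
```

The caller carries `prev_voiced` and `prev_pitch` from frame to frame, which is how the sketch distinguishes the onset of voicing (transient) from its continuation (stationary).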

According to these four types of speech signal characteristics, the classifier 1 outputs decision data of, for example, 2 bits to the switches 10 and 11, the quantizer 4, and the multiplexer 5.

In a frame judged to be a voiced stationary part, the switch 10 is set to the time-axis compressor 2 side; in a frame judged to be voiced, the switch 11 is set to the predictor 3 side.

Accordingly, silence and unvoiced sound in the speech signal are input directly to the quantizer 4 and quantized into representative vector numbers of the silence codebook 34 and the unvoiced codebook 33 respectively. The voiced transient part is input to the quantizer 4 via the predictor 3 and quantized into a representative vector number of the voiced transient codebook 32. The voiced stationary part is compressed by the time-axis compressor 2, then input to the quantizer 4 via the predictor 3 and quantized into a representative vector number of the voiced stationary codebook 31.
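The routing above can be summarized as a small lookup driven by the decision data. In this sketch the 2-bit code assignment, the class labels, and the stage names are hypothetical; only the routing itself and the codebook numbers 31 to 34 come from the text.

```python
from typing import Dict, List

# Hypothetical 2-bit decision codes (the text only says "for example, 2 bits").
DECISION_CODE: Dict[str, int] = {
    "silence": 0b00,
    "unvoiced": 0b01,
    "voiced_transient": 0b10,
    "voiced_stationary": 0b11,
}

# Codebook reference numerals from FIG. 3.
CODEBOOK_NUMBER: Dict[str, int] = {
    "voiced_stationary": 31,
    "voiced_transient": 32,
    "unvoiced": 33,
    "silence": 34,
}


def encoder_path(label: str) -> List[str]:
    """List the processing stages a frame of the given class passes through."""
    stages: List[str] = []
    if label == "voiced_stationary":  # switch 10 selects the compressor side
        stages.append("time_axis_compressor_2")
    if label in ("voiced_stationary", "voiced_transient"):  # switch 11
        stages.append("predictor_3")
    stages.append(f"quantizer_4/codebook_{CODEBOOK_NUMBER[label]}")
    return stages


for label in DECISION_CODE:
    print(f"{DECISION_CODE[label]:02b} {label}: {' -> '.join(encoder_path(label))}")
```

The decoder can apply the mirror-image routing (inverse quantizer, then synthesizer for voiced frames, then time-axis expander for voiced stationary frames) because it receives the same decision data.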

When silence and unvoiced sound are encoded, the predictor 3 is not used, so the bits that would otherwise be allocated to the prediction filter coefficients and the like are reallocated to the output of the quantizer 4, keeping the number of transmitted bits constant.

Because the decision data of the speech signal is transmitted to the decoder together with the prediction filter coefficients and the representative vector number, the decoding device can decode the original speech signal. Moreover, because it decodes values that were quantized according to characteristics such as voiced sound, unvoiced sound, and silence, the quality of the decoded speech is good.

Effects of the Invention

As described above, the speech encoding device of the present invention classifies a speech signal into a voiced stationary part, a voiced transient part, and unvoiced sound and the like; compresses the voiced stationary part on the time axis; outputs the prediction errors of the voiced transient part and of the compressed voiced stationary part; and quantizes the unvoiced sound and the like together with the prediction errors of the voiced stationary part and the voiced transient part. The speech decoding device of the present invention inversely quantizes the quantized values of the unvoiced sound and the like, the voiced stationary part, and the voiced transient part; synthesizes the inversely quantized voiced stationary part and voiced transient part into a speech signal; and expands the synthesized voiced stationary part on the time axis. Quantization can therefore be performed according to the characteristics of the speech, such as voiced sound, unvoiced sound, and silence, so the quality of the decoded speech is good and the encoding efficiency can be improved.

[Brief explanation of the drawing]

FIG. 1(a) is a block diagram showing an embodiment of the speech encoding device according to the present invention; FIG. 1(b) is a block diagram showing an embodiment of the speech decoding device according to the present invention; FIG. 2 is a detailed block diagram of the predictor in FIG. 1(a); FIG. 3 is a detailed block diagram of the quantizer in FIG. 1(a); FIG. 4 is a waveform diagram of a typical speech signal; FIG. 5 is a flowchart for explaining the operation of the classifier in FIG. 1(a); FIG. 6(a) is a block diagram showing a conventional speech encoding device; FIG. 6(b) is a block diagram showing a conventional speech decoding device; and FIG. 7 is a detailed block diagram of the quantizer in FIG. 6(a).

1 ... classifier, 2 ... time-axis compressor, 3 ... predictor, 4 ... quantizer, 5 ... multiplexer, 6 ... separator, 7 ... inverse quantizer, 8 ... synthesizer, 9 ... time-axis expander, 31 ... codebook for voiced stationary part, 32 ... codebook for voiced transient part, 33 ... codebook for unvoiced sound, 34 ... codebook for silence.

Agent: Patent Attorney Shigetaka Awano and one other

Claims

(1) A speech encoding device comprising: means for classifying a speech signal into a voiced stationary part, a voiced transient part, and unvoiced sound and the like; means for compressing, on the time axis, the voiced stationary part of the speech signal classified by the classification means; means for outputting the prediction errors of the voiced transient part of the speech signal classified by the classification means and of the voiced stationary part compressed by the compression means; and means for quantizing the unvoiced sound and the like of the speech signal classified by the classification means and the prediction errors of the voiced stationary part and the voiced transient part.

(2) The speech encoding device according to claim 1, wherein the quantization means comprises codebooks in which representative vectors for the voiced stationary part, the voiced transient part, and the unvoiced sound and the like are respectively stored, vector-quantizes the voiced stationary part, the voiced transient part, and the unvoiced sound and the like using the codebooks, and outputs the number of a representative vector.

(3) A speech decoding device comprising: means for inversely quantizing the quantized values of unvoiced sound and the like, a voiced stationary part, and a voiced transient part; means for synthesizing the inversely quantized voiced stationary part and voiced transient part into a speech signal; and means for expanding the synthesized voiced stationary part on the time axis.