JP2001142499A

JP2001142499A - Speech encoding device and speech decoding device

Info

Publication number: JP2001142499A
Application number: JP31953499A
Authority: JP
Inventors: Kazunori Ozawa; 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-11-10
Filing date: 1999-11-10
Publication date: 2001-05-25
Also published as: CA2325322A1; EP1100076A3; EP1100076A2

Abstract

PROBLEM TO BE SOLVED: To provide a speech encoding device and a speech decoding device that achieve satisfactory coded sound quality, even at a low bit rate, for speech on which background noise is superimposed. SOLUTION: A discrimination part 800 extracts a feature quantity from a speech signal to discriminate a mode. If the discrimination output is a predetermined mode, a smoothing circuit 450 smooths, in the time direction, at least one of the gain of an excitation signal, the gain of an adaptive codebook, a spectrum parameter, and the level of the excitation signal. The smoothed signal is used to locally reproduce a synthesized signal, and a multiplexer part 400 combines and outputs the outputs of a spectrum parameter calculation part, the discrimination part, an adaptive codebook part, an excitation quantization part, and a gain quantization part.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech encoding device and a speech decoding device for satisfactorily encoding a background noise signal superimposed on a speech signal, even at a low bit rate.

[0002]

2. Description of the Related Art

A known method for encoding a speech signal with high efficiency is CELP (Code Excited Linear Predictive Coding), described in, for example, the paper by M. Schroeder and B. Atal entitled "Code-excited linear prediction: High quality speech at very low bit rates" (Proc. ICASSP, pp. 937-940, 1985) (Reference 1) and the paper by Kleijn et al. entitled "Improved speech quality and efficient vector quantization in SELP" (Proc. ICASSP, pp. 155-158, 1988) (Reference 2).

[0003] In this conventional example, the transmitting side uses linear prediction (LPC) analysis to extract from the speech signal, for each frame (for example, 20 ms), spectrum parameters representing the spectral characteristics of the speech signal. Each frame is further divided into subframes (for example, 5 ms). For each subframe, the parameters of the adaptive codebook (a delay parameter corresponding to the pitch period, and a gain parameter) are extracted on the basis of the past excitation signal, and the speech signal of the subframe is pitch-predicted using the adaptive codebook. The excitation signal obtained by pitch prediction is then quantized by selecting an optimum excitation code vector from an excitation codebook (a vector quantization codebook) consisting of predetermined types of noise signals and calculating an optimum gain. The excitation code vector is selected so as to minimize the error power between the signal synthesized from the selected noise signal and the residual signal. The index indicating the type of the selected code vector and the gain, together with the spectrum parameters and the adaptive codebook parameters, are combined by a multiplexer unit and transmitted. A description of the receiving side is omitted.

[0004]

Problems to Be Solved by the Invention

In the conventional methods described above, when the encoding bit rate is reduced to, for example, 8 kb/s or less, the background noise signal deteriorates, particularly when it is superimposed on the speech signal, and the overall sound quality deteriorates. This was especially noticeable when speech coding was used in mobile phones and similar devices. In the methods of References 1 and 2, reducing the bit rate reduces the number of bits of the excitation codebook, and the accuracy of waveform reproduction falls. For a signal with highly correlated waveforms, such as speech, the deterioration is not very noticeable, but for a signal with low correlation, such as background noise, it becomes pronounced.

[0005] In the conventional method of the paper by C. Laflamme entitled "16 kbps wideband speech coding technique based on algebraic CELP" (Proc. ICASSP, pp. 13-16, 1991) (Reference 3), the excitation signal is represented by a combination of pulses. The model therefore matches speech well and yields good sound quality. At a low bit rate, however, the number of pulses is insufficient, so the sound quality of the background noise portion of the coded speech deteriorates severely.

[0006] The reason is as follows. In vowel sections of speech, the pulses concentrate near the pitch pulse, which is the starting point of the pitch, so the signal can be represented efficiently with a small number of pulses. For a random signal such as background noise, however, the pulses must be placed at random, so it is difficult to represent the background noise well with a small number of pulses. As the bit rate is lowered and the number of pulses is reduced, the sound quality for background noise deteriorates sharply.

SUMMARY OF THE INVENTION

[0007] It is therefore an object of the present invention to solve the above problems and to provide a speech encoding device and a speech decoding device that, even at a low bit rate, require a relatively small amount of computation and suffer little deterioration in sound quality, particularly for background noise.

[0008]

Means for Solving the Problems

To solve the above problems, the speech encoding device and the speech decoding device according to the present invention adopt the following characteristic configurations.

[0009] (1) A speech encoding device comprising: a spectrum parameter calculation unit that receives a speech signal and obtains and quantizes spectrum parameters for each predetermined frame; an adaptive codebook unit that divides the frame into a plurality of subframes, obtains, in each subframe, a delay and a gain from the past quantized excitation signal using an adaptive codebook, and predicts the speech signal to obtain a residual; an excitation quantization unit that quantizes and outputs the excitation signal of the speech signal using the spectrum parameters; and a gain quantization unit that quantizes the gain of the adaptive codebook and the gain of the excitation signal, the speech encoding device further comprising: a mode discrimination unit that extracts a predetermined feature quantity from the speech signal and determines, on the basis of the extracted feature quantity, which of a plurality of predetermined modes applies; a smoothing circuit that, when the output of the discrimination unit is a predetermined mode, smooths in the time direction at least one of the gain of the excitation signal, the gain of the adaptive codebook, the spectrum parameters, and the level of the excitation signal; and a multiplexer unit that combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, the output of the excitation quantization unit, and the output of the gain quantization unit, wherein a synthesized signal is locally reproduced using the smoothed signal.

[0010] (2) The speech encoding device according to (1), wherein the mode discrimination circuit discriminates the mode for each frame.

[0011] (3) The speech encoding device according to (1), wherein the feature quantity is a pitch prediction gain.

[0012] (4) The speech encoding device according to (1), wherein the mode discrimination circuit averages the pitch prediction gain obtained for each subframe over the entire frame, compares this value with a plurality of predetermined thresholds, and classifies the frame into one of a plurality of predetermined modes.

[0013] (5) The speech encoding device according to (1), wherein the plurality of predetermined modes correspond approximately to an unvoiced section, a transient section, a weakly voiced section, and a strongly voiced section.

[0014] (6) A speech decoding device comprising: a demultiplexer circuit that receives and separates spectrum parameters, a pitch, a gain, and an excitation signal as speech information; an excitation signal restoration circuit that restores the excitation signal from the separated pitch, excitation signal, and gain; a synthesis filter circuit that synthesizes a speech signal using the restored excitation signal and the spectrum parameters; and a post-filter circuit that receives the synthesized speech and performs post-filtering using the spectrum parameters, the speech decoding device further comprising: an inverse filter circuit that estimates the excitation signal by performing inverse post-filtering and inverse synthesis filtering on the basis of the output signal of the post-filter circuit and the spectrum parameters; and a smoothing circuit that smooths in the time direction at least one of the level of the estimated excitation signal, the gain, and the spectrum parameters, wherein the smoothed signal is input to the synthesis filter circuit and the resulting synthesized signal is input to the post-filter circuit to synthesize a speech signal.

[0015] (7) A speech decoding device comprising: a demultiplexer circuit that receives and separates discrimination information indicating a mode determined on the basis of a feature quantity of the speech signal to be decoded, spectrum parameters, a pitch, a gain, and an excitation signal; an excitation signal restoration circuit that restores the excitation signal from the separated pitch, excitation signal, and gain; a synthesis filter circuit that synthesizes a speech signal using the restored excitation signal and the spectrum parameters; and a post-filter circuit that receives the synthesized speech and performs post-filtering using the spectrum parameters, the speech decoding device further comprising: an inverse filter circuit that, when the discrimination information indicates a predetermined mode, estimates the excitation signal by performing inverse post-filtering and inverse synthesis filtering on the basis of the output signal of the post-filter circuit and the spectrum parameters; and a smoothing circuit that smooths in the time direction at least one of the level of the estimated excitation signal, the gain, and the spectrum parameters, wherein the smoothed signal is input to the synthesis filter circuit and the resulting synthesized signal is input to the post-filter circuit to synthesize a speech signal.

[0016] (8) The speech encoding device according to (7), wherein the mode discrimination is performed for each frame.

[0017] (9) The speech encoding device according to (7), wherein the feature quantity is a pitch prediction gain.

[0018] (10) The speech encoding device according to (7), wherein the mode discrimination is performed by averaging the pitch prediction gain obtained for each subframe over the entire frame and comparing this value with a plurality of predetermined thresholds.

[0019] (11) The speech encoding device according to (7), wherein the modes correspond approximately to an unvoiced section, a transient section, a weakly voiced section, and a strongly voiced section.

[0020] (12) A speech encoding device that locally reproduces a synthesized speech signal on the basis of a signal obtained by smoothing, in the time direction, at least one of the spectrum parameters of the speech signal, the gain of the adaptive codebook, the gain of the excitation codebook, and the RMS of the excitation signal.

[0021] (13) A speech decoding device that, on the decoding side of the speech signal, obtains a residual signal from the post-filtered signal by inverse post-filter and inverse synthesis filter processing, re-synthesizes the speech signal on the basis of a signal obtained by smoothing, in the time direction, at least one of the RMS of the residual signal, the received spectrum parameters, the gain of the adaptive codebook, and the gain of the excitation codebook, and then performs post-filtering again to output a final synthesized signal.

[0022] (14) A speech decoding device that, on the decoding side of the speech signal, obtains a residual signal from the post-filtered signal by inverse post-filter and inverse synthesis filter processing, and, in a mode determined on the basis of a feature quantity of the speech signal to be decoded or when the feature quantity lies in a predetermined region, re-synthesizes the speech signal on the basis of a signal obtained by smoothing, in the time direction, at least one of the RMS of the residual signal, the received spectrum parameters, the gain of the adaptive codebook, and the gain of the excitation codebook, and then performs post-filtering again to output a final synthesized signal.

[0023]

Embodiments of the Invention

FIG. 1 is a block diagram showing a first embodiment of the speech encoding device according to the present invention. In FIG. 1, a speech signal is input from an input terminal 100. A frame division circuit 110 divides the speech signal into frames (for example, 20 ms), and a subframe division circuit 120 divides the speech signal of each frame into subframes shorter than the frame (for example, 5 ms).
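The framing performed by circuits 110 and 120 can be sketched as follows. This is only an illustration: the 8 kHz sampling rate, the function names, and the choice to drop a trailing partial frame are assumptions that the text does not state.

```python
# Minimal sketch of frame / subframe division (assumed 8 kHz sampling rate).
SAMPLE_RATE_HZ = 8000                        # assumption, not given in the text
FRAME_LEN = SAMPLE_RATE_HZ * 20 // 1000      # 20 ms frame
SUBFRAME_LEN = SAMPLE_RATE_HZ * 5 // 1000    # 5 ms subframe


def split_blocks(samples, block_len):
    """Split a sample sequence into consecutive blocks of block_len samples.

    A trailing partial block is dropped (a simplification for this sketch).
    """
    n_blocks = len(samples) // block_len
    return [samples[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]


def split_frames(samples):
    """Frame division, in the role of circuit 110."""
    return split_blocks(samples, FRAME_LEN)


def split_subframes(frame):
    """Subframe division, in the role of circuit 120."""
    return split_blocks(frame, SUBFRAME_LEN)
```

Under these assumptions a frame holds 160 samples and each of its four subframes holds 40 samples.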

[0024] The spectrum parameter calculation circuit 200 applies a window longer than the subframe length (for example, 24 ms) to the speech signal of at least one subframe to cut out the speech, and calculates spectrum parameters up to a predetermined order (for example, P = 10). The spectrum parameters can be calculated by well-known LPC analysis, Burg analysis, or the like. Burg analysis is used here. Details of Burg analysis are given in, for example, pp. 82-87 of the book "Signal Analysis and System Identification" by Nakamizo (Corona Publishing, 1988) (Reference 4), so a description is omitted. The spectrum parameter calculation unit 200 further converts the linear prediction coefficients α_i (i = 1, …, 10) calculated by the Burg method into LSP parameters, which are suitable for quantization and interpolation. For the conversion from linear prediction coefficients to LSPs, see the paper by Sugamura et al. entitled "Speech Information Compression by Line Spectrum Pair (LSP) Speech Analysis-Synthesis Method" (Transactions of the IECE of Japan, J64-A, pp. 599-606, 1981) (Reference 5). For example, the linear prediction coefficients obtained by the Burg method in the second and fourth subframes are converted into LSP parameters, the LSPs of the first and third subframes are obtained by linear interpolation, and the LSPs of the first and third subframes are inversely converted back into linear prediction coefficients. The linear prediction coefficients α_il (i = 1, …, 10, l = 1, …, 5) of the first to fourth subframes are output to the perceptual weighting circuit 230. The LSP of the fourth subframe is output to the spectrum parameter quantization circuit 210.

[0025] The spectrum parameter quantization circuit 210 efficiently quantizes the LSP parameters of a predetermined subframe and outputs the quantized value that minimizes the distortion given by the following equation.

(Equation 1)

[0026] Here, LSP(i), QLSP(i)_j, and W(i) are, respectively, the i-th order LSP before quantization, the j-th result after quantization, and the weighting coefficient.
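Equation 1 itself is not reproduced in this text. A minimal sketch of the codebook search it describes is given below, assuming the distortion is the weighted squared error D_j = Σ_i W(i) [LSP(i) − QLSP(i)_j]², which matches the symbol definitions above but is an assumption. The function names are hypothetical.

```python
def weighted_lsp_distortion(lsp, qlsp, weights):
    """Assumed form of the distortion: sum_i W(i) * (LSP(i) - QLSP(i)_j)^2."""
    return sum(w * (x - q) ** 2 for x, q, w in zip(lsp, qlsp, weights))


def search_lsp_codebook(lsp, codebook, weights):
    """Return (index j, distortion) of the code vector minimizing the distortion."""
    best_j = min(
        range(len(codebook)),
        key=lambda j: weighted_lsp_distortion(lsp, codebook[j], weights),
    )
    return best_j, weighted_lsp_distortion(lsp, codebook[best_j], weights)
```

An exhaustive search like this is the simplest option. The cited references describe more efficient structured quantizers, which this sketch does not attempt to reproduce.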

[0027] In the following, vector quantization is used as the quantization method, and the LSP parameters of the fourth subframe are quantized. Well-known methods can be used for vector quantization of the LSP parameters. For specific methods, see, for example, JP-A-4-171500 (Japanese Patent Application No. 2-297600) (Reference 6), JP-A-4-363000 (Japanese Patent Application No. 3-261925) (Reference 7), JP-A-5-6199 (Japanese Patent Application No. 3-155049) (Reference 8), and the paper by T. Nomura et al. entitled "LSP Coding Using VQ-SVQ With Interpolation in 4.075 kbps M-LCELP Speech Coder" (Proc. Mobile Multimedia Communications, pp. B.2.5, 1993) (Reference 9). A description is therefore omitted here.

[0028] The spectrum parameter quantization circuit 210 also restores the LSP parameters of the first to fourth subframes on the basis of the LSP parameters quantized in the fourth subframe. Here, the LSPs of the first to third subframes are restored by linear interpolation between the quantized LSP parameters of the fourth subframe of the current frame and the quantized LSPs of the fourth subframe of the immediately preceding frame. After one code vector that minimizes the error power between the LSPs before quantization and the LSPs after quantization has been selected, the LSPs of the first to fourth subframes can be restored by linear interpolation. To improve performance further, a plurality of candidate code vectors that minimize the error power may be selected, the cumulative distortion evaluated for each candidate, and the combination of candidate and interpolated LSPs that minimizes the cumulative distortion selected. For details, see, for example, JP-A-6-222797 (Japanese Patent Application No. 5-8737) (Reference 10).
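The restoration of the subframe LSPs by linear interpolation can be sketched as follows. The text only says that the interpolation is linear between the previous and current fourth-subframe LSPs. The evenly spaced weights k/4 used here, and the function name, are assumptions.

```python
def interpolate_subframe_lsps(prev_q_lsp, curr_q_lsp, n_subframes=4):
    """Linearly interpolate LSP vectors for subframes 1..n_subframes.

    prev_q_lsp: quantized LSPs of the last subframe of the previous frame.
    curr_q_lsp: quantized LSPs of the last subframe of the current frame.
    Subframe k (1-based) uses weight k / n_subframes on the current frame
    (assumed spacing), so the last subframe equals curr_q_lsp.
    """
    result = []
    for k in range(1, n_subframes + 1):
        w = k / n_subframes
        result.append([(1.0 - w) * p + w * c for p, c in zip(prev_q_lsp, curr_q_lsp)])
    return result
```

Interpolating in the LSP domain rather than on the linear prediction coefficients directly is what makes this step safe, since the text notes that LSPs are suited to interpolation.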

[0029] The LSPs of the first to third subframes restored as described above and the quantized LSP of the fourth subframe are converted, for each subframe, into linear prediction coefficients α_il (i = 1, …, 10, l = 1, …, 5) and output to the impulse response calculation circuit 310. In addition, an index representing the code vector of the quantized LSP of the fourth subframe is output to the multiplexer 400.

[0030] The perceptual weighting circuit 230 receives the unquantized linear prediction coefficients α_il (i = 1, …, 10, l = 1, …, 5) for each subframe from the spectrum parameter calculation circuit 200, applies perceptual weighting to the speech signal of the subframe in accordance with Reference 1, and outputs a perceptually weighted signal.

[0031] The response signal calculation circuit 240 receives the linear prediction coefficients α_il for each subframe from the spectrum parameter calculation circuit 200, and receives the quantized, interpolated, and restored linear prediction coefficients α_il for each subframe from the spectrum parameter quantization circuit 210. Using the stored filter memory values, it calculates, for one subframe, the response signal obtained with the input signal set to zero, d(n) = 0, and outputs it to the subtractor 235. The response signal x_z(n) is given by the following equation.

(Equation 2)

[0032] Here, N is the subframe length. γ is a weighting coefficient that controls the amount of perceptual weighting and has the same value as in equation (7) below. s_w(n) and p(n) denote, respectively, the output signal of the weighting signal calculation circuit and the output signal of the denominator term of the filter in the first term on the right-hand side of equation (7), described later.

[0033] The subtractor 235 subtracts the response signal for one subframe from the perceptually weighted signal according to the following equation and outputs x'_w(n) to the adaptive codebook circuit 300.

(Equation 3)

[0034] The impulse response calculation circuit 310 calculates a predetermined number L of samples of the impulse response h_w(n) of the perceptual weighting filter whose z-transform is given by the following equation, and outputs them to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

(Equation 4)

[0035] The mode discrimination circuit 800 extracts a feature quantity using the output signal of the frame division circuit and discriminates the mode for each frame. The pitch prediction gain can be used as the feature. The pitch prediction gain obtained for each subframe is averaged over the entire frame, and this value is compared with a plurality of predetermined thresholds to classify the frame into one of a plurality of predetermined modes. Here, as an example, there are four modes. In this case, modes 0, 1, 2, and 3 correspond approximately to an unvoiced section, a transient section, a weakly voiced section, and a strongly voiced section, respectively. The mode discrimination information is output to the excitation quantization circuit 350, the gain quantization circuit 365, and the multiplexer 400.
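The mode decision described above can be sketched as follows. The text gives neither the threshold values nor the unit of the pitch prediction gain, so the thresholds in the usage note are purely illustrative, and the function name is hypothetical. The sketch assumes that a larger average gain maps to a higher (more strongly voiced) mode.

```python
from bisect import bisect_right


def classify_mode(subframe_pitch_gains, thresholds):
    """Average the per-subframe pitch prediction gains over the frame and
    compare the average with ascending thresholds.

    With three thresholds the result is 0 (unvoiced), 1 (transient),
    2 (weakly voiced), or 3 (strongly voiced).
    """
    average = sum(subframe_pitch_gains) / len(subframe_pitch_gains)
    # Number of thresholds that the average has reached.
    return bisect_right(thresholds, average)
```

For example, with the illustrative thresholds `[1.0, 3.5, 7.0]`, the subframe gains `[4.0, 5.0, 4.0, 5.0]` average to 4.5 and the frame is assigned mode 2.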

[0036] The adaptive codebook circuit 500 receives the past excitation signal v(n) from the gain quantization circuit 370, the output signal x'_w(n) from the subtractor 235, and the perceptually weighted impulse response h_w(n) from the impulse response calculation circuit 310. The delay T corresponding to the pitch is determined so as to minimize the distortion given by the following equation, and an index representing the delay is output to the multiplexer 400.

(Equation 5)

In equation (8), the symbol * denotes convolution.

[0037] Next, the gain β is obtained according to the following equation.

(Equation 6)
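The equations for the delay search and the gain are not reproduced in this text. The sketch below assumes the usual CELP closed-loop form: with y_T(n) = v(n − T) * h_w(n), the distortion is D_T = Σ x'_w(n)² − [Σ x'_w(n) y_T(n)]² / Σ y_T(n)², and the gain is β = Σ x'_w(n) y_T(n) / Σ y_T(n)². Both formulas, the function names, and the restriction to delays no shorter than the subframe are assumptions.

```python
def convolve_truncated(x, h, n_out):
    """First n_out samples of the convolution x * h."""
    out = []
    for n in range(n_out):
        acc = 0.0
        for k in range(n + 1):
            if k < len(x) and n - k < len(h):
                acc += x[k] * h[n - k]
        out.append(acc)
    return out


def search_adaptive_codebook(target, past_excitation, hw, t_min, t_max):
    """Return (delay T, gain beta) minimizing the assumed distortion D_T.

    Minimizing D_T is equivalent to maximizing corr^2 / energy.
    Requires len(target) <= t_min and t_max <= len(past_excitation).
    """
    n = len(target)
    best = None
    for t in range(t_min, t_max + 1):
        start = len(past_excitation) - t
        segment = past_excitation[start:start + n]   # v(n - T)
        y = convolve_truncated(segment, hw, n)       # v(n - T) * hw(n)
        corr = sum(a * b for a, b in zip(target, y))
        energy = sum(b * b for b in y)
        if energy <= 0.0:
            continue
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, t, corr / energy)
    if best is None:
        raise ValueError("no candidate delay with non-zero energy")
    return best[1], best[2]
```

Practical coders often also require a positive correlation and handle delays shorter than the subframe by repeating the excitation. Neither refinement is shown here.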

[0038] Here, to improve the accuracy of delay extraction for female voices and children's voices, the delay may be determined with fractional sample values rather than integer samples. For a specific method, see, for example, the paper by P. Kroon et al. entitled "Pitch predictors with high temporal resolution" (Proc. ICASSP, pp. 661-664, 1990) (Reference 11).

[0039] Further, the adaptive codebook circuit 500 performs pitch prediction according to equation (10) and outputs the prediction residual signal e_w(n) to the excitation quantization circuit 355.

(Equation 7)

[0040] The excitation quantization circuit 355 receives the mode discrimination information and switches the quantization method of the excitation signal according to the mode.

[0041] In modes 1, 2, and 3, M pulses are placed. In these modes, the circuit has a B-bit amplitude codebook or polarity codebook for quantizing the amplitudes of the M pulses collectively. The following describes the case where the polarity codebook is used. This polarity codebook is stored in the excitation codebook 351.

[0042] For voiced speech, the excitation quantization circuit 355 reads each polarity code vector stored in the codebook 351, fits positions to each code vector, and selects a plurality of combinations of code vector and positions that minimize equation (11).

(Equation 8)

[0043] Here, h_w(n) is the perceptually weighted impulse response. To minimize equation (11), it suffices to find the combination of polarity code vector g_ik and positions m_i that maximizes equation (12).

(Equation 9)

[0044] Alternatively, the selection may be made so as to maximize equation (13). This reduces the amount of computation required for the numerator.

(Equation 10)
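Equations (12) and (13) are not reproduced in this text. A common way to cut the cost of the numerator in pulse searches of this kind is to precompute the backward-filtered target φ(m) = Σ_{i≥m} e_w(i) h_w(i − m), so that the correlation for a candidate reduces to Σ_k g_k φ(m_k). The sketch below illustrates that identity. Whether it matches the patent's equation (13) term by term is an assumption, and all function names are hypothetical.

```python
def backward_filtered_target(ew, hw):
    """phi(m) = sum over i >= m of ew(i) * hw(i - m), computed once per subframe."""
    n = len(ew)
    return [
        sum(ew[i] * hw[i - m] for i in range(m, n) if i - m < len(hw))
        for m in range(n)
    ]


def correlation_direct(ew, hw, polarities, positions):
    """Correlation of ew with the filtered pulse train, computed the slow way."""
    n = len(ew)
    synth = [0.0] * n
    for g, m in zip(polarities, positions):
        for k in range(min(len(hw), n - m)):
            synth[m + k] += g * hw[k]
    return sum(e * s for e, s in zip(ew, synth))


def correlation_fast(phi, polarities, positions):
    """Same correlation using the precomputed phi: one multiply-add per pulse."""
    return sum(g * phi[m] for g, m in zip(polarities, positions))
```

Because φ does not depend on the candidate, it is computed once and reused for every combination of polarity code vector and positions.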

[0045] Here, the positions that each pulse can take in modes 1 to 3 can be restricted, as shown in Reference 3, to reduce the amount of computation. As an example, with N = 40 and M = 5, the positions that each pulse can take are as shown in Table 1.

[Table 1]
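Table 1 is not reproduced in this text. The sketch below generates one plausible layout for N = 40 and M = 5, assuming the interleaved tracks commonly used in algebraic CELP (Reference 3), where pulse i may occupy positions i, i + M, i + 2M, and so on. The layout and the function name are assumptions, not the patent's table.

```python
def pulse_position_tracks(subframe_len, n_pulses):
    """Assumed interleaved layout: pulse i may sit at i, i + M, i + 2M, ..."""
    return [list(range(i, subframe_len, n_pulses)) for i in range(n_pulses)]


# Print the assumed candidate positions for each of the five pulses.
for pulse_index, track in enumerate(pulse_position_tracks(40, 5), start=1):
    print(f"pulse {pulse_index}: {track}")
```

Under this assumption each pulse has 8 candidate positions, so a position can be coded with 3 bits, and the five tracks together cover every sample of the subframe exactly once.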

[0046] After the search of the polarity code vectors is completed, the selected plurality of combinations of polarity code vector and positions are output to the gain quantization circuit 370. In a predetermined mode (mode 0 in this example), the pulse positions are set at fixed intervals as shown in Table 2, and a plurality of shift amounts for shifting the positions of all the pulses together are defined. In the example below, the positions are shifted one sample at a time, and four shift amounts (shift 0, shift 1, shift 2, shift 3) are used. In this case, the shift amount is quantized with 2 bits and transmitted.

【表２】表２の各シフト量及び各パルス位置に対する極性を、式
(14)から予め求めておく。[Table 2] The polarity for each shift amount and each pulse position in Table 2 is obtained in advance from equation (14).

【００４７】各シフト量毎に、表２に示す位置とそれに
対応する極性を、ゲイン量子化回路365に出力する。The positions shown in Table 2 and the corresponding polarities are output to the gain quantization circuit 365 for each shift amount.
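
As an illustration of the mode 0 layout described above, the following Python sketch builds a table of pulse positions for each shift amount. Table 2 is not reproduced in this text, so the pulse spacing (`interval=4`) and all function names are assumptions, chosen only so that four one-sample shifts cover every sample of a 40-sample subframe; they are not taken from the specification.

```python
import math


def shifted_pulse_positions(subframe_len=40, interval=4, num_shifts=4):
    """Return {shift: [positions]} for pulses placed every `interval` samples.

    The spacing is a hypothetical value; the patent's Table 2 defines the
    actual layout.
    """
    table = {}
    for shift in range(num_shifts):
        # Every pulse moves by the same shift, one sample per shift step.
        table[shift] = list(range(shift, subframe_len, interval))
    return table


def shift_bits(num_shifts):
    """Number of bits needed to transmit the shift index."""
    return math.ceil(math.log2(num_shifts))


table = shifted_pulse_positions()
bits = shift_bits(len(table))
```

With four shift amounts the shift index fits in 2 bits, which matches the 2-bit quantization stated in the text.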

【００４８】ゲイン量子化回路３７０は、モード判別回
路800からモード判別情報を入力する。音源量子化回路
３５５から、モード１〜３では、複数セットの極性コー
ドベクトルとパルス位置の組み合わせを入力し、モード
０では、シフト量毎にパルスの位置とそれに対応する極
性の組み合わせを入力する。The gain quantization circuit 370 receives the mode discrimination information from the mode discrimination circuit 800. In modes 1 to 3, combinations of a plurality of sets of polarity code vectors and pulse positions are input from the sound source quantization circuit 355, and in mode 0, combinations of pulse positions and corresponding polarities are input for each shift amount.

【００４９】ゲイン量子化回路３７０は、ゲインコー
ドブック３８０からゲインコードベクトルを読み出
し、モード１〜３では、選択された複数セットの極性コ
ードベクトルと位置の組み合わせに対して、式(15)を最
小化するようにゲインコードベクトルを探索し、歪みを
最小化するゲインコードベクトル、極性コードベクトル
と位置の組み合わせを１種類選択する。The gain quantization circuit 370 reads gain code vectors from the gain codebook 380. In modes 1 to 3, it searches the gain code vectors so as to minimize equation (15) for the selected sets of polarity code vector and position combinations, and selects the one combination of gain code vector, polarity code vector, and position that minimizes the distortion.

【数１１】 [Equation 11]

【００５０】ここでは、適応コードブックのゲインとパ
ルスで表した音源のゲインの両者を同時にベクトル量子
化する例について示した。選択された極性コードベクト
ルを表すインデクス、位置を表す符号、ゲインコードベ
クトルを表すインデクスをマルチプレクサ４００に出力
する。Here, an example has been described in which both the gain of the adaptive codebook and the gain of the sound source expressed in pulses are simultaneously vector-quantized. The index representing the selected polarity code vector, the code representing the position, and the index representing the gain code vector are output to the multiplexer 400.

【００５１】判別情報がモード0の場合は、複数のシフ
ト量と各シフト量の場合の各位置に対応した極性を入力
し、ゲインコードベクトルを探索し、式(16)を最小化す
るようにゲインコードベクトルとシフト量を１種類選択
する。When the discrimination information indicates mode 0, the circuit receives the plurality of shift amounts and the polarity for each position under each shift amount, searches the gain code vectors, and selects the one gain code vector and shift amount that minimize equation (16).

【数１２】 (Equation 12)

【００５２】ここで、β k、G’ kは、ゲインコードブ
ック３８０に格納された2 次元ゲインコードブックにお
けるk 番目のコードベクトルである。また、δ(j)はj番
目のシフト量を示し、g’ kは選択されたゲインコード
ベクトルを表す。選択されたゲインコードベクトルを表
すインデクスとシフト量を表す符号をマルチプレクサ４
００に出力する。Here, βk and G'k are the k-th code vector of the two-dimensional gain codebook stored in the gain codebook 380. δ(j) denotes the j-th shift amount, and g'k denotes the selected gain code vector. The index representing the selected gain code vector and the code representing the shift amount are output to the multiplexer 400.

【００５３】なお、モード１〜３では、複数パルスの振
幅を量子化するためのコードブックを、音声信号を用い
て予め学習して格納しておくこともできる。コードブッ
クの学習法は、例えば、Linde 氏らによる"An algorith
m for vector quantizationdesign," と題した論文(IEE
E Trans. Commun., pp.84-95, January, 1980) （文献
１２）等を参照できる。In modes 1 to 3, a codebook for quantizing the amplitudes of the plurality of pulses may also be trained in advance on speech signals and stored. For the codebook training method, see, for example, the paper by Linde et al. entitled "An algorithm for vector quantization design" (IEEE Trans. Commun., pp. 84-95, January 1980) (Reference 12).

【００５４】平滑回路４５０は、モード情報を入力し、
モード情報が予め定められたモードの場合（例えばモー
ド０）に、ゲインコードベクトルのうちの音源信号のゲ
イン、適応コードブックのゲイン、音源信号のRMSとス
ペクトルパラメータのうちの少なくとも一つを時間方向
に平滑化する。The smoothing circuit 450 receives the mode information and, when the mode information indicates a predetermined mode (for example, mode 0), smooths in the time direction at least one of the gain of the sound source signal in the gain code vector, the gain of the adaptive codebook, the RMS of the sound source signal, and the spectrum parameter.

【００５５】ここで、音源信号のゲインの平滑化は次式
に従う。ここで、ｍはサブフレーム番号を示す。Here, the gain of the sound source signal is smoothed according to the following equation. Here, m indicates a subframe number.

【数１３】 (Equation 13)

【００５６】適応コードブックのゲインの平滑化は次式
に従う。The gain of the adaptive codebook is smoothed according to the following equation.

【数１４】 [Equation 14]

【００５７】音源信号のRMSの平滑化は次式に従う。The smoothing of the RMS of the sound source signal follows the following equation.

【数１５】 (Equation 15)

【００５８】スペクトルパラメータの平滑化は次式に従
う。ここでは、LSP上で平滑化を行う場合を示す。The spectral parameters are smoothed according to the following equation. Here, a case where the smoothing is performed on the LSP is shown.

【数１６】 (Equation 16)
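
The smoothing equations themselves are not reproduced in this text. Purely as an illustration of time-direction smoothing across subframes, the following Python sketch assumes a first-order recursive smoother applied to a per-subframe parameter track, and the same smoother applied to each LSP coefficient independently. The recursive form, the coefficient `alpha`, and the function names are assumptions and may differ from the equations in the specification.

```python
def smooth_track(values, alpha=0.9):
    """Smooth a per-subframe track: y(m) = alpha*y(m-1) + (1-alpha)*x(m).

    The first output is seeded with the first input value.
    `alpha` is a hypothetical smoothing coefficient.
    """
    smoothed = []
    prev = None
    for x in values:
        prev = x if prev is None else alpha * prev + (1.0 - alpha) * x
        smoothed.append(prev)
    return smoothed


def smooth_lsp_track(lsp_frames, alpha=0.9):
    """Apply the same smoother to each LSP coefficient across subframes."""
    columns = zip(*lsp_frames)  # one sequence per LSP coefficient
    smoothed_columns = [smooth_track(col, alpha) for col in columns]
    return [list(row) for row in zip(*smoothed_columns)]


gains = smooth_track([0.0, 2.0, 2.0], alpha=0.5)
lsps = smooth_lsp_track([[0.0, 1.0], [2.0, 1.0]], alpha=0.5)
```

In the scheme described here, such smoothing would be applied only while the mode information indicates the predetermined mode (for example, mode 0); the sketch leaves that gating to the caller.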

【００５９】重み付け信号計算回路360 は、モード判別
情報と平滑回路の平滑化信号を入力し、モード１〜３の
場合は、式(17)にもとづき駆動音源信号v(n) を求め
る。The weighting signal calculation circuit 360 receives the mode discrimination information and the smoothing signal of the smoothing circuit, and in the case of modes 1 to 3, obtains the driving sound source signal v (n) based on equation (17).

【数１７】 v(n) は適応コードブック回路５００に出力される。[Equation 17] v (n) is output to the adaptive codebook circuit 500.

【００６０】モード０の場合は、式(18)にもとづき駆動
音源信号v(n) を求める。In the case of the mode 0, the driving sound source signal v (n) is obtained based on the equation (18).

【数１８】 v(n) は適応コードブック回路５００に出力される。(Equation 18) v (n) is output to the adaptive codebook circuit 500.

【００６１】次に、スペクトルパラメータ計算回路200
の出力パラメータ、スペクトルパラメータ量子化回路21
0の出力パラメータ、平滑回路４５０の出力パラメータ
を用いて応答信号xw(n)をサブフレーム毎に計算する。
ここで、モード１〜３では、式(19)により計算し、応答
信号計算回路２４０へ出力する。Next, the response signal xw(n) is calculated for each subframe using the output parameters of the spectrum parameter calculation circuit 200, the output parameters of the spectrum parameter quantization circuit 210, and the output parameters of the smoothing circuit 450. In modes 1 to 3, it is calculated by equation (19) and output to the response signal calculation circuit 240.

【数１９】 [Equation 19]

【００６２】一方、モード０では、平滑回路４５０にお
いて平滑化されたLSPパラメータを入力して、平滑化さ
れた線形予測係数に変換し、式(20)により応答信号を計
算し、応答信号計算回路２４０へ出力する。On the other hand, in mode 0, the LSP parameters smoothed by the smoothing circuit 450 are input and converted into smoothed linear prediction coefficients, and the response signal is calculated by equation (20) and output to the response signal calculation circuit 240.

【数２０】 (Equation 20)

【００６３】次に本発明の第２の実施形態例を図２を参
照しながら説明する。デマルチプレクサ500は、受信し
た信号から、ゲインコードベクトルを示すインデクス、
適応コードブックの遅延を示すインデクス、音源信号の
情報、音源コードベクトルのインデクス、スペクトルパ
ラメータのインデクスを入力し、各パラメータを分離し
て出力する。Next, a second embodiment of the present invention will be described with reference to FIG. 2. The demultiplexer 500 receives, from the received signal, an index indicating the gain code vector, an index indicating the delay of the adaptive codebook, sound source signal information, an index of the sound source code vector, and an index of the spectrum parameter, and separates and outputs each parameter.

【００６４】ゲイン復号回路５１０は、ゲインコードベ
クトルのインデクスとモード判別情報を入力し、ゲイン
コードブック３８０からインデクスに応じてゲインコー
ドベクトルを読み出し、出力する。Gain decoding circuit 510 receives the index of the gain code vector and the mode discrimination information, reads out the gain code vector from gain code book 380 according to the index, and outputs it.

【００６５】適応コードブック回路５２０は、モード判
別情報と適応コードブックの遅延を入力し、適応コード
ベクトルを発生し、ゲインコードベクトルにより適応コ
ードブックのゲインを乗じて出力する。The adaptive codebook circuit 520 receives the mode discrimination information and the delay of the adaptive codebook, generates an adaptive code vector, multiplies it by the adaptive codebook gain taken from the gain code vector, and outputs the result.

【００６６】音源信号復元回路５４０では、モード判別
情報がモード１〜３のときは、音源コードブック３５１
から読み出した極性コードベクトルと、パルスの位置情
報とゲインコードベクトルを用いて、音源信号を発生し
て加算器５５０に出力する。モード判別情報がモード０
の場合は、パルス位置、位置のシフト量とゲインコード
べクトルから音源信号を発生して加算器５５０に出力す
る。In the sound source signal restoring circuit 540, when the mode discrimination information indicates modes 1 to 3, a sound source signal is generated using the polarity code vector read from the sound source codebook 351, the pulse position information, and the gain code vector, and is output to the adder 550. When the mode discrimination information indicates mode 0, a sound source signal is generated from the pulse positions, the position shift amount, and the gain code vector, and is output to the adder 550.
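
The pulse-based restoration described above can be sketched in Python as follows. This is a minimal illustration, assuming that each pulse contributes the sound source gain times its polarity (+1 or -1) at its position, and that in mode 0 the transmitted shift is simply added to a fixed set of base positions. The function names and argument layout are hypothetical and are not taken from the specification.

```python
def build_pulse_excitation(subframe_len, positions, polarities, gain):
    """Place signed pulses scaled by `gain` into an otherwise zero subframe."""
    excitation = [0.0] * subframe_len
    for pos, sign in zip(positions, polarities):
        excitation[pos] += gain * sign
    return excitation


def mode0_positions(base_positions, shift):
    """Mode 0: all pulse positions move together by the decoded shift."""
    return [p + shift for p in base_positions]


# Modes 1 to 3: positions and polarities come from the decoded indices.
voiced = build_pulse_excitation(8, [1, 5], [1, -1], gain=2.0)

# Mode 0: regular base positions plus the decoded shift amount.
noise_like = build_pulse_excitation(8, mode0_positions([0, 4], 1), [1, 1], gain=0.5)
```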

【００６７】加算器５５０は、適応コードブック回路５
２０の出力と音源信号復元回路５４０の出力を用いて、
駆動音源信号v(n)を発生し、適応コードブック回路５２
０と合成フィルタ５６０に出力する。The adder 550 generates the driving sound source signal v(n) from the output of the adaptive codebook circuit 520 and the output of the sound source signal restoring circuit 540, and outputs it to the adaptive codebook circuit 520 and the synthesis filter 560.

【００６８】スペクトルパラメータ復号回路５７０は、
スペクトルパラメータを復号し、線形予測係数に変換
し、合成フィルタ回路５６０に出力する。The spectrum parameter decoding circuit 570 decodes the spectrum parameters, converts them into linear prediction coefficients, and outputs them to the synthesis filter circuit 560.

【００６９】合成フィルタ回路５６０は、駆動音源信号
v(n)と線形予測係数を入力し、再生信号s(n)を計算す
る。The synthesis filter circuit 560 receives the driving sound source signal v(n) and the linear prediction coefficients, and calculates the reproduced signal s(n).
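
As a sketch of this step, the following Python function runs a conventional all-pole linear prediction synthesis filter. It assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_P z^-P, so that s(n) = v(n) - sum_i a_i s(n-i); the convention used in the specification is not shown in this text, and the function name and the `history` argument are hypothetical.

```python
def synthesize(excitation, lpc, history=None):
    """All-pole synthesis: s(n) = v(n) - sum_{i=1..P} lpc[i-1] * s(n-i).

    `history` holds the previous outputs, most recent first, and defaults
    to zeros (filter at rest).
    """
    order = len(lpc)
    past = list(history) if history is not None else [0.0] * order
    output = []
    for v in excitation:
        s = v - sum(a * p for a, p in zip(lpc, past))
        output.append(s)
        # Shift the filter memory so that past[0] is always s(n-1).
        past = ([s] + past)[:order]
    return output


# A single pulse through a first-order filter decays geometrically.
reproduced = synthesize([1.0, 0.0, 0.0], lpc=[-0.5])
```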

【００７０】ポストフィルタ回路６００は、再生信号s
(n)に対して量子化雑音をマスクするためのポストフィ
ルタリングを行い、ポストフィルタリング出力信号S
_p(n)を出力する。ここで、ポストフィルタの伝達特性
は、式(25)で表される。The post-filter circuit 600 applies post-filtering to the reproduced signal s(n) to mask quantization noise and outputs the post-filtered output signal S_p(n). The transfer characteristic of the post-filter is given by equation (25).

【数２１】 (Equation 21)

【００７１】逆ポスト・合成フィルタ回路６１０は、ポ
ストフィルタと合成フィルタの逆フィルタを構成し、残
差信号e(n)を計算する。ここで、逆フィルタの伝達特性
は、式(26)で表される。The inverse post / synthesis filter circuit 610 forms an inverse filter of the post filter and the synthesis filter, and calculates a residual signal e (n). Here, the transfer characteristic of the inverse filter is expressed by equation (26).

【数２２】 (Equation 22)

【００７２】平滑回路６２０は、ゲインコードベクトル
のうちの音源信号のゲイン、適応コードブックのゲイ
ン、残差信号のRMSとスペクトルパラメータのうちの少
なくとも一つを時間方向に平滑化する。ここで、音源信
号のゲインの平滑化、適応コードブックのゲインの平滑
化、スペクトルパラメータの平滑化は、それぞれ、式(1
7), (18), (20)に従う。残差信号e(n)のRMSの平滑化は
式(27)に従う。ここで、RMSe(m)は、第mサブフレームの
残差信号のRMSを示す。The smoothing circuit 620 smooths in the time direction at least one of the gain of the sound source signal in the gain code vector, the gain of the adaptive codebook, the RMS of the residual signal, and the spectrum parameter. Here, the smoothing of the sound source signal gain, the adaptive codebook gain, and the spectrum parameter follows equations (17), (18), and (20), respectively. The smoothing of the RMS of the residual signal e(n) follows equation (27), where RMSe(m) denotes the RMS of the residual signal in the m-th subframe.

【数２３】 (Equation 23)

【００７３】平滑回路６２０は、平滑化したパラメータ
を用いて駆動音源信号を復元する。ここでは、残差信号
のRMSを平滑化して駆動音源信号を復元する場合につい
て、式(28)に示す。The smoothing circuit 620 restores the driving sound source signal using the smoothed parameters. Here, Expression (28) shows a case where the RMS of the residual signal is smoothed to restore the driving excitation signal.

【数２４】 (Equation 24)
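
Equation (28) is not reproduced in this text. As one plausible reading of restoring the driving sound source signal from a smoothed RMS, the following Python sketch rescales the residual of a subframe so that its RMS equals the smoothed target value. The rescaling rule and the function names are assumptions, not the specification's formula.

```python
import math


def rms(signal):
    """Root mean square of one subframe."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def rescale_to_rms(residual, target_rms):
    """Scale `residual` so that its RMS becomes `target_rms`.

    A silent subframe is returned unchanged to avoid dividing by zero.
    """
    current = rms(residual)
    if current == 0.0:
        return list(residual)
    scale = target_rms / current
    return [scale * x for x in residual]


residual = [3.0, -3.0, 3.0, -3.0]
restored = rescale_to_rms(residual, target_rms=1.5)
```

Here `target_rms` would be the smoothed RMS of the current subframe, so the waveform shape of the residual is kept while its level follows the smoothed contour.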

【００７４】合成フィルタ５６０は、平滑化したパラメ
ータを用いて求めた駆動音源信号

【外１】を入力して再生信号を

【外２】計算する。この場合、平滑化した線形予測係数を用いて
も良い。The synthesis filter 560 receives the driving sound source signal [External 1] obtained using the smoothed parameters and calculates the reproduced signal [External 2]. In this case, smoothed linear prediction coefficients may also be used.

【００７５】ポストフィルタ６００は、当該再生信号を
入力し、ポストフィルタリングを行い、ポストフィルタ
リング後の最終的な再生信号

【外３】を求めて出力する。The post-filter 600 receives this reproduced signal, performs post-filtering, and obtains and outputs the final post-filtered reproduced signal [External 3].

【００７６】図３は第３の実施形態例を示すブロック図
である。図３において、図２と同一の番号を付した構成
要素は、同一の動作をするので、説明は省略する。FIG. 3 is a block diagram showing a third embodiment. In FIG. 3, components denoted by the same reference numerals as those in FIG. 2 operate in the same manner, and a description thereof will be omitted.

【００７７】図３において、逆ポストフィルタ回路６３
０、平滑回路６４０は、デマルチプレクサ５００から判
別情報を入力し、判別情報が予め定められたモード（例
えばモード０）を示す場合に、各々の動作を行う。各々
の動作は、図２の逆ポスト・合成フィルタ回路６１０、
平滑回路６２０とそれぞれ同じなので、説明を省略す
る。In FIG. 3, the inverse post-filter circuit 630 and the smoothing circuit 640 receive the discrimination information from the demultiplexer 500 and perform their respective operations when the discrimination information indicates a predetermined mode (for example, mode 0). Their operations are the same as those of the inverse post/synthesis filter circuit 610 and the smoothing circuit 620 in FIG. 2, respectively, so their description is omitted.

【００７８】[0078]

【発明の効果】以上説明したように、本発明の音声符号
化装置によれば、スペクトルパラメータ、適応コードブ
ックのゲイン、音源コードブックのゲイン、音源信号の
ＲＭＳの少なくとも一つを時間方向に平滑化したものを
使用して合成信号を局部再生するので、背景雑音の重畳
した音声に対しても、低ビットレートでも、背景雑音部
におけるパラメータの局所的な時間変動を押さえること
ができ、音質的な劣化の少ない符号化音声を提供できる
という効果がある。As described above, the speech encoding apparatus of the present invention locally reproduces the synthesized signal using a version of at least one of the spectrum parameter, the adaptive codebook gain, the sound source codebook gain, and the RMS of the sound source signal that has been smoothed in the time direction. As a result, even for speech with superimposed background noise and even at a low bit rate, local temporal fluctuations of the parameters in the background noise portion can be suppressed, and coded speech with little degradation in sound quality can be provided.

【００７９】また、本発明の音声復号化装置によれば、
復号側でポストフィルタリングした信号から、逆ポスト
・合成フィルタ処理により残差信号を求め、残差信号の
RMS、受信したスペクトルパラメータ、適応コードブッ
クのゲイン、音源コードブックのゲインの少なくとも一
つを時間方向に平滑化したものを用いて信号を合成し直
し、ポストフィルタリングをし直して最終的な合成信号
を出力しているので、従来方式の復号化装置をなんら修
正することなく、完全な後処理として処理を追加するこ
とができ、本処理を追加することにより、低ビットレー
トでも、背景雑音部におけるパラメータの局所的な時間
変動を押さえることができ、音質的な劣化の少ない合成
音声を提供できるという効果がある。According to the speech decoding apparatus of the present invention, a residual signal is obtained from the signal post-filtered on the decoding side by inverse post/synthesis filter processing; the signal is re-synthesized using a version of at least one of the RMS of the residual signal, the received spectrum parameter, the adaptive codebook gain, and the sound source codebook gain that has been smoothed in the time direction; and post-filtering is applied again to output the final synthesized signal. The processing can therefore be added purely as post-processing, without any modification to a conventional decoding apparatus. Adding this processing suppresses local temporal fluctuations of the parameters in the background noise portion even at a low bit rate and provides synthesized speech with little degradation in sound quality.

【００８０】さらに、本発明による音声復号化装置によ
れば、予め定められたモード、あるいは特徴量が予め定
められた領域に存在する場合において、パラメータの平
滑化処理をしているので、、特定の区間（たとえば無音
部）においてのみ処理を行うことができ、音声区間に弊
害を与えることなしに、背景雑音が重畳した音声を低ビ
ットレートで符号化しても、背景雑音部分を良好に符号
化できるという効果がある。Furthermore, the speech decoding apparatus according to the present invention smooths the parameters only in a predetermined mode or when the feature quantity lies in a predetermined region, so the processing can be limited to specific sections (for example, silent sections). As a result, even when speech with superimposed background noise is encoded at a low bit rate, the background noise portion can be encoded well without adversely affecting the speech sections.

[Brief description of the drawings]

【図１】本発明による音声符号化装置の第１の実施形態
例を示すブロック図である。FIG. 1 is a block diagram showing a first embodiment of a speech encoding apparatus according to the present invention.

【図２】本発明による音声符号化装置の第２の実施形態
例を示すブロック図である。FIG. 2 is a block diagram showing a second embodiment of the speech encoding apparatus according to the present invention.

【図３】本発明による音声符号化装置の第３の実施形態
例を示すブロック図である。FIG. 3 is a block diagram showing a third embodiment of the speech coding apparatus according to the present invention.

[Explanation of symbols]

110 フレーム分割回路 120 サブフレーム分割回路 200 スペクトルパラメータ計算回路 210 スペクトルパラメータ量子化回路 211 ＬＳＰコードブック 230 聴感重み付け回路 235 減算回路 240 応答信号計算回路 310 インパルス応答計算回路 350、355、356、357 音源量子化回路 351 音源コードブック 360 重み付け信号計算回路 365、370 ゲイン量子化回路 380、515 ゲインコードブック 400 マルチプレクサ 450 平滑回路 500 適応コードブック回路 520 デマルチプレクサ 510 ゲイン復号回路 520 適応コードブック回路 540 音源復元回路 550 加算回路 560 合成フィルタ回路 570 スペクトルパラメータ復号回路 600 ポストフィルタ回路 610、630 逆ポスト・合成フィルタ回路 620、640 平滑回路 110 Frame division circuit 120 Subframe division circuit 200 Spectrum parameter calculation circuit 210 Spectrum parameter quantization circuit 211 LSP codebook 230 Perceptual weighting circuit 235 Subtraction circuit 240 Response signal calculation circuit 310 Impulse response calculation circuit 350, 355, 356, 357 Sound source quantum Circuit 351 sound source codebook 360 weighting signal calculation circuit 365,370 gain quantization circuit 380,515 gain codebook 400 multiplexer 450 smoothing circuit 500 adaptive codebook circuit 520 demultiplexer 510 gain decoding circuit 520 adaptive codebook circuit 540 sound source restoration circuit 550 Addition circuit 560 Synthesis filter circuit 570 Spectrum parameter decoding circuit 600 Post filter circuit 610, 630 Inverse post / synthesis filter circuit 620, 640 Smoothing circuit

【手続補正書】[Procedure amendment]

【提出日】平成１３年１月２３日（２００１．１．２
３）[Submission date] January 23, 2001 (2001.1.23)

【手続補正１】[Procedure amendment 1]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】特許請求の範囲[Correction target item name] Claims

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【特許請求の範囲】[Claims]

【手続補正２】[Procedure amendment 2]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００１６[Correction target item name] 0016

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００１６】（８）前記モード判別は、フレーム毎に行
われる上記（７）の音声復号化装置。[0016] (8) The speech decoding apparatus according to (7) above, wherein the mode discrimination is performed for each frame.

【手続補正３】[Procedure amendment 3]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００１７[Correction target item name] 0017

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００１７】（９）前記特徴量は、ピッチ予測ゲインで
ある上記（７）の音声復号化装置。[0017] (9) The speech decoding apparatus according to (7) above, wherein the feature quantity is a pitch prediction gain.

【手続補正４】[Procedure amendment 4]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００１８[Correction target item name] 0018

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００１８】（１０）前記モード判別は、サブフレーム
毎に求めたピッチ予測ゲインをフレーム全体で平均し、
この値とあらかじめ定められた複数のしきい値を比較し
て行われる上記（７）の音声復号化装置。(10) The speech decoding apparatus according to (7) above, wherein the mode discrimination is performed by averaging the pitch prediction gain obtained for each subframe over the entire frame and comparing this value with a plurality of predetermined thresholds.

【手続補正５】[Procedure amendment 5]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００１９[Correction target item name] 0019

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００１９】（１１）前記モードは、無声区間、過渡区
間、弱い有声区間、強い有声区間にほぼ対応するもので
ある上記（７）の音声復号化装置。[0019] (11) The speech decoding apparatus according to (7) above, wherein the modes substantially correspond to an unvoiced section, a transient section, a weak voiced section, and a strong voiced section.

【手続補正６】[Procedure amendment 6]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００２０[Correction target item name] 0020

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００２０】（１２）音声信号についてのスペクトルパ
ラメータ、適応コードブックのゲイン、音源コードブッ
クのゲイン、音源信号のＲＭＳの少なくとも一つを時間
方向に平滑化した信号に基づいて合成音声信号を局部再
生する音声復号化装置。(12) A speech decoding apparatus that locally reproduces a synthesized speech signal based on a signal obtained by smoothing, in the time direction, at least one of a spectrum parameter of a speech signal, an adaptive codebook gain, a sound source codebook gain, and an RMS of a sound source signal.

【手続補正７】[Procedure amendment 7]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００２５[Correction target item name] 0025

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００２５】スペクトルパラメータ量子化回路210 で
は、予め定められたサブフレームのLSPパラメータを効
率的に量子化し、下式の歪みを最小化する量子化値を出
力する。ここで、スペクトルパラメータ量子化回路210
は、ＬＳＰコードブック211を参照する。 The spectrum parameter quantization circuit 210 efficiently quantizes the LSP parameters of a predetermined subframe and outputs the quantized value that minimizes the distortion given by the following equation. In doing so, the spectrum parameter quantization circuit 210 refers to the LSP codebook 211.

【数１】 (Equation 1)

【手続補正８】[Procedure amendment 8]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００３４[Correction target item name] 0034

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００３４】インパルス応答計算回路310 は、z 変換が
下式で表される聴感重み付けフィルタのインパルス応答
hw(n) を予め定められた点数L だけ計算し、適応コード
ブック回路470、音源量子化回路350 へ出力する。The impulse response calculation circuit 310 calculates the impulse response hw(n) of the perceptual weighting filter, whose z-transform is given by the following equation, for a predetermined number of points L, and outputs it to the adaptive codebook circuit 470 and the sound source quantization circuit 350.

【数４】 (Equation 4)

【手続補正９】[Procedure amendment 9]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００３５[Correction target item name] 0035

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００３５】モード判別回路300は、フレーム分割回路
の出力信号を用いて、特徴量を抽出し、フレーム毎にモ
ードの判別を行う。ここで、特徴としては、ピッチ予測
ゲインを用いることができる。サブフレーム毎に求めた
ピッチ予測ゲインをフレーム全体で平均し、この値と予
め定められた複数のしきい値を比較し、予め定められた
複数のモードに分類する。ここでは、一例として、モー
ドの種類は４とする。この場合、モード０、１、２、３
は、それぞれ、無声区間、過渡区間、弱い有声区間、強
い有声区間にほぼ対応するものとする。モード判別情報
を音源量子化回路350とゲイン量子化回路365とマルチプ
レクサ400へ出力する。The mode discrimination circuit 300 extracts a feature quantity from the output signal of the frame division circuit and determines the mode for each frame. The pitch prediction gain can be used as the feature. The pitch prediction gain obtained for each subframe is averaged over the entire frame, and this value is compared with a plurality of predetermined thresholds to classify the frame into one of a plurality of predetermined modes. Here, as an example, there are four modes. In this case, modes 0, 1, 2, and 3 correspond approximately to an unvoiced section, a transient section, a weak voiced section, and a strong voiced section, respectively. The mode discrimination information is output to the sound source quantization circuit 350, the gain quantization circuit 365, and the multiplexer 400.
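
The frame-level decision described above can be sketched in Python as follows. The text gives neither the threshold values nor their units, so the three thresholds below are placeholders, and the function name is hypothetical; only the structure (average the per-subframe pitch prediction gains, then compare with ascending thresholds to pick one of four modes) follows the description.

```python
def classify_mode(subframe_pitch_gains, thresholds=(1.0, 3.0, 6.0)):
    """Map the frame-average pitch prediction gain to a mode index 0..3.

    `thresholds` must be in ascending order; the values are hypothetical.
    Mode 0 corresponds roughly to unvoiced, 1 to transient,
    2 to weak voiced, and 3 to strong voiced frames.
    """
    mean_gain = sum(subframe_pitch_gains) / len(subframe_pitch_gains)
    mode = 0
    for threshold in thresholds:
        if mean_gain >= threshold:
            mode += 1
    return mode


unvoiced_mode = classify_mode([0.2, 0.4, 0.1, 0.3])
weak_voiced_mode = classify_mode([4.0, 4.5, 3.5, 4.0])
strong_voiced_mode = classify_mode([9.0, 8.0, 10.0, 9.0])
```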

【手続補正１０】[Procedure amendment 10]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００３６[Correction target item name] 0036

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００３６】適応コードブック回路470 では、ゲイン量
子化回路３７０から過去の音源信号v(n)を、減算器２
３５から出力信号x’ w(n) を、インパルス応答計算回
路310から聴感重み付けインパルス応答hw(n) を入力す
る。ピッチに対応する遅延Tを下式の歪みを最小化する
ように求め、遅延を表すインデクスをマルチプレクサ４
００に出力する。The adaptive codebook circuit 470 receives the past sound source signal v(n) from the gain quantization circuit 370, the output signal x'w(n) from the subtracter 235, and the perceptually weighted impulse response hw(n) from the impulse response calculation circuit 310. It determines the delay T corresponding to the pitch so as to minimize the distortion given by the following equation, and outputs an index representing the delay to the multiplexer 400.

【手続補正１１】[Procedure amendment 11]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００３９[Correction target item name] 0039

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００３９】さらに、適応コードブック回路４７０では
式(10)に従いピッチ予測を行ない、予測残差信号ew(n)
を音源量子化回路３５5へ出力する。Further, the adaptive codebook circuit 470 performs pitch prediction according to equation (10) and outputs the prediction residual signal ew(n) to the sound source quantization circuit 355.

【数７】 (Equation 7)

【手続補正１２】[Procedure amendment 12]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００５９[Correction target item name] 0059

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００５９】重み付け信号計算回路360 は、モード判別
情報と平滑回路の平滑化信号を入力し、モード１〜３の
場合は、式(２１)にもとづき駆動音源信号v(n) を求め
る。The weighting signal calculation circuit 360 receives the mode discrimination information and the smoothed signal from the smoothing circuit and, in modes 1 to 3, obtains the driving sound source signal v(n) based on equation (21).

【数１７】 v(n) は適応コードブック回路４７０に出力される。[Equation 17] v(n) is output to the adaptive codebook circuit 470.

【手続補正１３】[Procedure amendment 13]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００６０[Correction target item name] 0060

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００６０】モード０の場合は、式(２２)にもとづき駆
動音源信号v(n) を求める。In mode 0, the driving sound source signal v(n) is obtained based on equation (22).

【数１８】 v(n) は適応コードブック回路４７０に出力される。(Equation 18) v(n) is output to the adaptive codebook circuit 470.

【手続補正１４】[Procedure amendment 14]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００６１[Correction target item name] 0061

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００６１】次に、スペクトルパラメータ計算回路200
の出力パラメータ、スペクトルパラメータ量子化回路21
0の出力パラメータ、平滑回路４５０の出力パラメータ
を用いて応答信号xw(n)をサブフレーム毎に計算する。
ここで、モード１〜３では、式(２３)により計算し、応
答信号計算回路２４０へ出力する。Next, the response signal xw(n) is calculated for each subframe using the output parameters of the spectrum parameter calculation circuit 200, the output parameters of the spectrum parameter quantization circuit 210, and the output parameters of the smoothing circuit 450. In modes 1 to 3, it is calculated by equation (23) and output to the response signal calculation circuit 240.

【数１９】 [Equation 19]

【手続補正１５】[Procedure amendment 15]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】００６２[Correction target item name] 0062

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【００６２】一方、モード０では、平滑回路４５０にお
いて平滑化されたLSPパラメータを入力して、平滑化さ
れた線形予測係数に変換し、式(２４)により応答信号を
計算し、応答信号計算回路２４０へ出力する。On the other hand, in mode 0, the LSP parameters smoothed by the smoothing circuit 450 are input and converted into smoothed linear prediction coefficients, and the response signal is calculated by equation (24) and output to the response signal calculation circuit 240.

【数２０】 (Equation 20)

【手続補正１６】[Procedure amendment 16]

【補正対象書類名】明細書[Document name to be amended] Statement

【補正対象項目名】符号の説明[Correction target item name] Explanation of sign

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【符号の説明】 110 フレーム分割回路 120 サブフレーム分割回路 200 スペクトルパラメータ計算回路 210 スペクトルパラメータ量子化回路 211 ＬＳＰコードブック 230 聴感重み付け回路 235 減算回路 240 応答信号計算回路 300 モード判別回路 310 インパルス応答計算回路 350、355、356、357 音源量子化回路 351 音源コードブック 360 重み付け信号計算回路 365、370 ゲイン量子化回路 380、515 ゲインコードブック 400 マルチプレクサ 450 平滑回路 470 適応コードブック回路 500 デマルチプレクサ 510 ゲイン復号回路 520 適応コードブック回路 540 音源復元回路 550 加算回路 560 合成フィルタ回路 570 スペクトルパラメータ復号回路 600 ポストフィルタ回路 610、630 逆ポスト・合成フィルタ回路 620、640 平滑回路 [Explanation of Signs] 110 Frame division circuit; 120 Subframe division circuit; 200 Spectrum parameter calculation circuit; 210 Spectrum parameter quantization circuit; 211 LSP codebook; 230 Perceptual weighting circuit; 235 Subtraction circuit; 240 Response signal calculation circuit; 300 Mode discrimination circuit; 310 Impulse response calculation circuit; 350, 355, 356, 357 Sound source quantization circuit; 351 Sound source codebook; 360 Weighting signal calculation circuit; 365, 370 Gain quantization circuit; 380, 515 Gain codebook; 400 Multiplexer; 450 Smoothing circuit; 470 Adaptive codebook circuit; 500 Demultiplexer; 510 Gain decoding circuit; 520 Adaptive codebook circuit; 540 Sound source restoration circuit; 550 Addition circuit; 560 Synthesis filter circuit; 570 Spectrum parameter decoding circuit; 600 Post-filter circuit; 610, 630 Inverse post/synthesis filter circuit; 620, 640 Smoothing circuit

【手続補正１７】[Procedure amendment 17]

【補正対象書類名】図面[Document name to be amended] Drawing

【補正対象項目名】図３[Correction target item name] Figure 3

【補正方法】変更[Correction method] Change

【補正内容】[Correction contents]

【図３】 FIG. 3

Claims

[Claims]

1. A speech encoding apparatus having: a spectrum parameter calculation unit that receives a speech signal and obtains and quantizes a spectrum parameter for each predetermined frame; an adaptive codebook unit that divides the frame into a plurality of subframes, obtains a delay and a gain from an encoded sound source signal using an adaptive codebook, predicts the speech signal, and obtains a residual; a sound source quantization unit that quantizes and outputs the sound source signal of the speech signal using the spectrum parameter; and a gain quantization unit that quantizes the gain of the adaptive codebook and the gain of the sound source signal, the speech encoding apparatus comprising: a mode discrimination unit that extracts a predetermined feature quantity from the speech signal and determines, based on the feature quantity, which of a plurality of predetermined modes applies; a smoothing circuit that, when the output of the discrimination unit indicates a predetermined mode, smooths in the time direction at least one of the gain of the sound source signal, the gain of the adaptive codebook, the spectrum parameter, and the level of the sound source signal; and a multiplexer unit that locally reproduces a synthesized signal using the smoothed signal and combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, the output of the sound source quantization unit, and the output of the gain quantization unit.

2. The speech encoding apparatus according to claim 1, wherein said mode discriminating circuit discriminates a mode for each frame.

3. The speech coding apparatus according to claim 1, wherein the feature quantity is a pitch prediction gain.

4. The speech encoding apparatus according to claim 1, wherein the mode discriminating circuit averages the pitch prediction gain obtained for each subframe over the entire frame and compares this average with a plurality of predetermined thresholds to classify the frame into one of a plurality of predetermined modes.

5. The speech coding apparatus according to claim 1, wherein the plurality of predetermined modes substantially correspond to an unvoiced section, a transient section, a weak voiced section, and a strong voiced section.

6. A speech decoding apparatus having: a demultiplexer circuit that receives and separates, as speech information, a spectrum parameter, a pitch, a gain, and a sound source signal; a sound source signal restoring circuit that restores a sound source signal from the separated pitch, sound source signal, and gain; a synthesis filter circuit that synthesizes a speech signal using the restored sound source signal and the spectrum parameter; and a post-filter circuit that receives the synthesized speech and performs post-filtering using the spectrum parameter, the speech decoding apparatus comprising: an inverse filter circuit that estimates a sound source signal by performing inverse post-filtering and inverse synthesis filtering based on the output signal of the post-filter circuit and the spectrum parameter; and a smoothing circuit that smooths in the time direction at least one of the level of the estimated sound source signal, the gain, and the spectrum parameter, wherein the smoothed signal is input to the synthesis filter circuit, and the synthesized signal is input to the post-filter circuit to synthesize a speech signal.

7. A speech decoding apparatus having: a demultiplexer circuit that receives and separates discrimination information indicating a mode determined based on a feature quantity of the speech signal to be decoded, a spectrum parameter, a pitch, a gain, and a sound source signal; a sound source signal restoring circuit that restores a sound source signal from the separated pitch, sound source signal, and gain; a synthesis filter circuit that synthesizes a speech signal using the restored sound source signal and the spectrum parameter; and a post-filter circuit that receives the synthesized speech and performs post-filtering using the spectrum parameter, the speech decoding apparatus comprising: an inverse filter circuit that, when the discrimination information indicates a predetermined mode, estimates a sound source signal by performing inverse post-filtering and inverse synthesis filtering based on the output signal of the post-filter circuit and the spectrum parameter; and a smoothing circuit that smooths in the time direction at least one of the level of the estimated sound source signal, the gain, and the spectrum parameter, wherein the smoothed signal is input to the synthesis filter circuit, and the synthesized signal is input to the post-filter circuit to synthesize a speech signal.

8. The speech encoding apparatus according to claim 7, wherein said mode discrimination is performed for each frame.

9. The speech encoding apparatus according to claim 7, wherein said feature quantity is a pitch prediction gain.

10. The speech encoding apparatus according to claim 7 or 8, wherein said mode discrimination is performed by averaging, over the entire frame, the pitch prediction gain obtained for each subframe, and comparing the averaged value with a plurality of predetermined thresholds.

11. The speech encoding apparatus according to claim 7, wherein said modes substantially correspond to an unvoiced section, a transient section, a weakly voiced section, and a strongly voiced section.

12. A speech encoding apparatus characterized by locally reproducing a synthesized speech signal based on a signal obtained by smoothing, in the time direction, at least one of a spectrum parameter, an adaptive codebook gain, a sound source codebook gain, and the RMS of a sound source signal of a speech signal.

13. A speech decoding apparatus characterized in that, on the decoding side of a speech signal, a residual signal is decoded from the post-filtered signal by inverse post-filtering and inverse synthesis filtering; the speech signal is re-synthesized based on a signal obtained by smoothing, in the time direction, at least one of the RMS of the residual signal, the received spectrum parameter, the adaptive codebook gain, and the sound source codebook gain; and post-filtering is then performed again to output a final synthesized signal.

14. A speech decoding apparatus characterized in that, on the decoding side of a speech signal, a residual signal is decoded from the post-filtered signal by inverse post-filtering and inverse synthesis filtering; when a mode determined based on a feature quantity of the speech signal to be decoded is a predetermined mode, or when the feature quantity is within a predetermined range, the speech signal is re-synthesized based on a signal obtained by smoothing, in the time direction, at least one of the RMS of the residual signal, the received spectrum parameter, the adaptive codebook gain, and the sound source codebook gain; and post-filtering is then performed again to output a final synthesized signal.
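
For illustration only, the mode discrimination of claim 10 and the mode-dependent time-direction smoothing of claims 7 and 14 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation described in the specification: the threshold values, the smoothing coefficient `alpha`, the first-order recursive smoothing rule, and the function names `classify_mode` and `smooth_in_time` are all hypothetical and do not appear in the claims.

```python
from typing import List, Sequence

# Hypothetical thresholds (in dB) separating four modes; these are
# illustrative values, not values taken from the source.
MODE_THRESHOLDS_DB = (2.0, 5.0, 8.0)


def classify_mode(subframe_pitch_gains_db: Sequence[float],
                  thresholds: Sequence[float] = MODE_THRESHOLDS_DB) -> int:
    """Average the per-subframe pitch prediction gains over the frame and
    compare the average with ascending thresholds.

    Returns an integer mode in the range 0..len(thresholds); mode 0 is the
    lowest-gain mode (assumed here to be the unvoiced-like mode).
    """
    if not subframe_pitch_gains_db:
        raise ValueError("at least one subframe gain is required")
    mean_gain = sum(subframe_pitch_gains_db) / len(subframe_pitch_gains_db)
    mode = 0
    for threshold in thresholds:
        if mean_gain >= threshold:
            mode += 1
    return mode


def smooth_in_time(values: Sequence[float],
                   modes: Sequence[int],
                   smoothing_mode: int = 0,
                   alpha: float = 0.5) -> List[float]:
    """Smooth a per-frame parameter (for example an excitation RMS or a gain)
    in the time direction, but only in frames whose mode equals
    `smoothing_mode`.

    Assumed rule: first-order recursion y[n] = alpha*y[n-1] + (1-alpha)*x[n].
    Frames in other modes pass through unchanged and reset the recursion.
    """
    smoothed: List[float] = []
    state = None  # previous smoothed value, or None after a reset
    for value, mode in zip(values, modes):
        if mode == smoothing_mode:
            state = value if state is None else alpha * state + (1.0 - alpha) * value
            smoothed.append(state)
        else:
            state = None
            smoothed.append(value)
    return smoothed
```

In this sketch the recursion is reset whenever a frame falls outside the predetermined mode, so that smoothing does not carry values across voiced sections; that reset is a design choice of the sketch rather than a requirement stated in the claims.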