JP2560682B2

JP2560682B2 - Speech signal coding / decoding method and apparatus

Info

Publication number: JP2560682B2
Application number: JP60077814A
Authority: JP
Inventors: 一範小澤
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1985-04-12
Filing date: 1985-04-12
Publication date: 1996-12-04
Anticipated expiration: 2011-12-04
Also published as: JPS61236599A

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to a speech signal coding and decoding method and apparatus, and in particular to a coding and decoding method and apparatus for coding and decoding speech signals with high quality at a low bit rate.

(Prior Art and Its Problems) The multi-pulse excited speech coding method has been proposed as a way of coding speech signals at transmission rates of about 16 kbit/s or less. The method is described in detail in, for example, the paper by B. S. Atal et al. entitled "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates" (Proc. ICASSP, pp. 614-617, 1982) (Reference 1), so only a brief description is given here.

FIG. 3 is a block diagram showing the encoder-side processing of the conventional method described in Reference 1. In the figure, a discrete speech signal is input from encoder input terminal 400, and one frame of it is stored in buffer memory circuit 410. Subtractor 420 and K-parameter calculation circuit 480 receive the signal from buffer memory circuit 410. Reference 1 uses the term "reflection coefficients" rather than K parameters, but the two are identical. K-parameter calculation circuit 480 uses the covariance method to obtain the 16th-order K parameters Ki (1 ≤ i ≤ 16), which represent the speech spectral envelope within the frame, and outputs them to synthesis filter 430. Excitation pulse generation circuit 440 generates a pulse train d(n) containing a predetermined number of pulses per frame. FIG. 4 shows an example of the pulse train d(n); the horizontal axis is discrete time and the vertical axis is amplitude.

Returning to FIG. 3, synthesis filter 430 is driven by the pulse train d(n) to produce the synthesized speech signal X̂(n). X̂(n) is output to subtractor 420, which calculates the difference signal e(n) between it and the original speech X(n). Weighting circuit 490 calculates the weighted error ew(n) using the weighting function w(n).

Letting W(Z) denote the Z-transform of the weighting function w(n), the characteristic of the weighting function is expressed by the following equation in terms of the prediction coefficients of the synthesis filter.

Here r is a constant satisfying 0 < r < 1 that determines the frequency characteristic of W(Z). W(Z) in equation (1) is made to depend on the frequency characteristic of the synthesis filter in order to exploit the auditory masking effect, whereby errors near the formants are hard to hear.
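Equation (1) itself is not reproduced in this text. A form consistent with the description, and with the weighting filter commonly used in multi-pulse coding, is W(Z) = (1 − Σ a_k Z^−k) / (1 − Σ a_k r^k Z^−k), where a_k are the prediction coefficients of the synthesis filter. The following Python sketch applies a filter of that assumed form to an error signal. The function name `weight_error`, the zero initial state, and the form of W(Z) are illustrative assumptions, not details taken from the patent.

```python
def weight_error(e, a, r=0.8):
    """Filter e(n) through the assumed W(Z) = A(Z) / A(Z/r).

    e: error samples, a: prediction coefficients a_1..a_p,
    r: weighting constant with 0 < r < 1 (assumed filter form).
    """
    p = len(a)
    # Bandwidth-expanded coefficients a_k * r^k for the denominator.
    a_r = [a[k] * r ** (k + 1) for k in range(p)]
    ew = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(p):
            if n - 1 - k >= 0:
                acc -= a[k] * e[n - 1 - k]     # numerator (FIR) part
                acc += a_r[k] * ew[n - 1 - k]  # denominator (recursive) part
        ew.append(acc)
    return ew
```

With r = 1 the numerator and denominator cancel and the error passes through unchanged; smaller r de-emphasizes the error near the formants, which is the masking effect described above.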

FIG. 5 shows the spectrum of a speech signal in a certain frame together with an example of the frequency characteristic of W(Z), here with r = 0.8. In the figure, the horizontal axis is frequency (up to 4 kHz) and the vertical axis is logarithmic amplitude (up to 60 dB). The upper curve is the spectrum of the speech signal and the lower curve is the spectrum of W(Z).

Returning to FIG. 3, the weighted error ew(n) is fed back to error minimization circuit 450. Error minimization circuit 450 calculates the error power ε according to the following equation.
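The equation is not reproduced in this text. Since ε is described as the power of the weighted error ew(n) over a block of N samples, it would presumably take the following form (the index range is an assumption):

```latex
\varepsilon = \sum_{n=1}^{N} e_w(n)^2
```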

Here N is the number of samples, which is 40 samples (5 msec) in Reference 1. Next, error minimization circuit 450 supplies pulse position and amplitude information to excitation pulse generation circuit 440 so as to reduce the error power ε. Synthesis filter 430 calculates the synthesized signal X̂(n) using this excitation pulse train as its driving source, and subtractor 420 subtracts the newly obtained synthesized signal X̂(n) from the previously calculated error e(n) and outputs the result to weighting circuit 490. Weighting circuit 490 calculates the weighted error and feeds it back to error minimization circuit 450, which again adjusts the excitation pulses so as to reduce the error power.

This series of steps, from generating the excitation pulse train to regenerating it through error minimization, is repeated until the number of excitation pulses reaches a predetermined value.

The information to be transmitted in this method consists of the K parameters Ki (1 ≤ i ≤ 16) of the synthesis filter and the positions and amplitudes of the excitation pulses, so the method is considered effective at transmission bit rates of 16 kbps or less.

However, this conventional method has the drawback of requiring a very large amount of computation. This stems from the way the excitation pulses are calculated: a signal is first synthesized, the error power relative to the original speech signal is calculated, and the pulse amplitudes and positions are adjusted to reduce it. In other words, the main problem is that pulse calculation requires synthesis processing. The conventional method has the further drawback that, when the transmission rate is lowered, the quality of the synthesized speech deteriorates for speech with a high pitch frequency. This is because, when the pitch frequency is high, many pitch waveforms fall within the frame used for pulse calculation, and a large number of pulses is needed to synthesize them well.

(Object of the Invention) An object of the present invention is to provide a highly efficient speech signal coding and decoding method and apparatus that can synthesize high-quality speech even at a low transmission bit rate with a relatively small amount of computation.

(Constitution of the Invention) In the speech signal coding and decoding method of the present invention, the transmitting side receives a discrete speech signal, obtains from the speech signal a pitch parameter representing the pitch, divides the speech signal into time sections corresponding to the pitch parameter, obtains a spectral parameter representing the spectral envelope of the speech signal over a preset short time, obtains and codes, on the basis of the pitch parameter and the spectral parameter, an excitation pulse train representing the speech signal for only some of the plural time sections, and outputs a code combining a code representing the pitch parameter, a code representing the spectral parameter, and a code representing the excitation pulse train. The receiving side receives the combined code, separates and decodes from it the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the excitation pulse train, restores the time sections on the basis of the decoded pitch parameter, restores a driving excitation signal on the basis of the decoded spectral parameter and the decoded excitation pulse train, and synthesizes the speech signal for the section determined according to the time sections.

The speech signal coding apparatus of the present invention comprises: a pitch calculation circuit that extracts and codes a pitch parameter representing the pitch from an input speech signal; a parameter calculation circuit that extracts and codes a spectral parameter representing the spectral envelope of the speech signal over a preset short time; a frame section setting circuit that obtains, on the basis of the pitch parameter, division positions for dividing the speech signal and sets a frame section containing a plurality of the divided time sections; an autocorrelation function calculation circuit that calculates the autocorrelation sequence of the impulse response sequence corresponding to the short-time spectrum; a cross-correlation function calculation circuit that calculates the cross-correlation sequence of the impulse response sequence corresponding to the short-time spectrum; a drive signal calculation and coding circuit that uses the autocorrelation sequence and the cross-correlation sequence to calculate and code an excitation pulse train for some of the sections within the frame section; and a multiplexer circuit that outputs a code sequence combining the output code of the pitch calculation circuit, the output code of the parameter calculation circuit, and the output code of the drive signal calculation and coding circuit.

The speech signal decoding apparatus of the present invention comprises: a demultiplexer and restoration circuit that receives a code sequence combining a code representing a pitch parameter, a code representing a spectral parameter, and a code representing an excitation pulse train, separates and decodes these codes, and restores time sections corresponding to the pitch period on the basis of the decoded pitch parameter; a driving excitation signal restoration circuit that restores a driving excitation signal on the basis of the restored time sections and the decoded excitation pulse train; and a synthesis filter circuit that synthesizes the speech signal for the section determined according to the time sections, on the basis of the driving excitation signal and the decoded spectral parameter.

(Principle of the Invention) The present invention exploits the periodicity of the speech signal. The transmitting side obtains from the speech signal a pitch parameter representing the pitch period and divides the speech signal into time sections based on the pitch period. It then obtains a spectral parameter representing the short-time spectral envelope of the speech signal and, on the basis of this spectral parameter, calculates an excitation pulse train representing the speech signal for only some of the time sections contained in the section determined according to the time sections, and transmits it to the receiving side. The excitation pulse train can be calculated with reference to, for example, equation (21) of Japanese Patent Application No. 57-231606. The receiving side generates an excitation signal from the amplitudes and positions of the received excitation pulses and, for the time sections in which no excitation pulse train was transmitted, restores the driving excitation signal by interpolating this excitation signal. Using the driving excitation signal and the spectral parameter, it synthesizes the speech signal for the section determined according to the time sections.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1(a) is a block diagram showing an embodiment of the transmitting side of the speech signal coding apparatus according to the present invention, and FIG. 1(b) is a block diagram showing an embodiment of the receiving side.

In FIG. 1(a), the speech signal X(n) is input and a predetermined number of samples is stored in buffer memory circuit 110. Pitch analysis circuit 130 calculates the pitch period Pd using the output of buffer memory circuit 110. Pd can be calculated by, for example, the method described in the paper by R. V. Cox et al. entitled "Real-Time Implementation of Time Domain Harmonic Scaling of Speech Signals" (IEEE Trans. ASSP, pp. 258-272, 1983) (Reference 2).

Pitch coding circuit 150 quantizes and codes the pitch period Pd by a known method with a predetermined number of quantization bits, and outputs the code ld to multiplexer 260. It also outputs the decoded value Pd′ to frame section setting circuit 155 and drive signal calculation circuit 220.

Frame section setting circuit 155 uses the pitch period Pd′ to obtain division positions for dividing the speech signal at every pitch period Pd′, and then sets the frame section so that it contains several of the divided time sections. Hereafter, a divided time section is called a subframe. The frame section is output to buffer memory circuit 110, synthesis filter circuit 250, and drive signal restoration circuit 240. The division positions are output to drive signal calculation circuit 220.
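The division into subframes and frames can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the choice to start the first subframe at sample 0, and the `subframes_per_frame` parameter are all hypothetical, since the text does not specify how the first division position or the number of subframes per frame is chosen.

```python
def split_into_subframes(num_samples, pitch_period, subframes_per_frame=4):
    """Return (subframes, frames) as lists of (start, end) sample ranges.

    Division positions are placed every pitch_period samples, starting at 0
    (an assumption). Trailing samples shorter than one pitch period are left
    for the next buffer.
    """
    boundaries = list(range(0, num_samples + 1, pitch_period))
    subframes = list(zip(boundaries[:-1], boundaries[1:]))
    # Group consecutive subframes into frame sections.
    frames = [subframes[i:i + subframes_per_frame]
              for i in range(0, len(subframes), subframes_per_frame)]
    return subframes, frames
```

For example, 200 buffered samples with a decoded pitch period of 50 samples give four subframes, which form one frame section when `subframes_per_frame` is 4.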

Next, K-parameter calculation circuit 140 receives the speech signal from buffer memory circuit 110 according to the frame section and calculates the K parameters Ki representing the spectral envelope of the input speech signal. The K parameters are identical to PARCOR coefficients. The autocorrelation method is a well-known way of calculating K parameters. It is described in detail in, for example, the paper by John Makhoul et al. entitled "Quantization Properties of Transmission Parameters in Linear Predictive Systems" (IEEE Trans. ASSP, pp. 309-321, 1975) (Reference 3), so its description is omitted here.

Returning to FIG. 1(a), the K parameters Ki are output to K-parameter coding circuit 160. K-parameter coding circuit 160 codes Ki by a known method based on a predetermined number of quantization bits and outputs the code li to multiplexer 260. It also decodes li to obtain the decoded K parameters Ki′, converts them into prediction coefficients ai′ by a known method, and outputs these to impulse response calculation circuit 170, weighting circuit 200, and synthesis filter circuit 250.
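The text leaves the conversion from K parameters to prediction coefficients to "a known method". One such method is the step-up recursion of linear prediction, sketched below in Python. The sign convention (predictor coefficients a_k such that the synthesis filter is 1 / (1 − Σ a_k Z^−k), with a_i = k_i at stage i) is an assumption; other conventions flip the sign of the update term.

```python
def k_to_prediction_coeffs(k):
    """Convert K parameters k_1..k_p to prediction coefficients a_1..a_p.

    Step-up recursion (assumed sign convention):
        a_i(i) = k_i
        a_j(i) = a_j(i-1) - k_i * a_(i-j)(i-1),  j = 1..i-1
    """
    a = []
    for i, ki in enumerate(k, start=1):
        # a[j] holds a_(j+1) of the previous stage (0-based list index).
        a = [a[j] - ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
    return a
```

For instance, `k_to_prediction_coeffs([0.5, 0.2])` yields coefficients close to `[0.4, 0.2]` under this convention.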

Impulse response calculation circuit 170 receives the prediction coefficients ai′ from K-parameter coding circuit 160 and calculates the impulse response hw(n) representing the transfer function of the weighted synthesis filter. hw(n) can be calculated by, for example, the same method as impulse response calculation circuit 210 shown in FIG. 4(a) of Japanese Patent Application No. 59-042305. The impulse response hw(n) is output to autocorrelation function calculation circuit 180 and cross-correlation function calculation circuit 210.
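A possible way to compute hw(n) is sketched below in Python. It assumes that the weighted synthesis filter is the cascade of the synthesis filter 1 / (1 − Σ a_k Z^−k) and a weighting filter of the form (1 − Σ a_k Z^−k) / (1 − Σ a_k r^k Z^−k), so that the cascade reduces to 1 / (1 − Σ a_k r^k Z^−k). That form, the function name, and the default values of `r` and `length` are assumptions; the referenced application may compute hw(n) differently.

```python
def weighted_impulse_response(a, r=0.8, length=40):
    """Impulse response of the assumed weighted synthesis filter 1 / A(Z/r).

    a: prediction coefficients a_1..a_p, r: weighting constant (0 < r < 1),
    length: number of response samples to keep.
    """
    a_r = [ak * r ** (k + 1) for k, ak in enumerate(a)]
    hw = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0  # unit impulse input
        for k in range(len(a_r)):
            if n - 1 - k >= 0:
                acc += a_r[k] * hw[n - 1 - k]
        hw.append(acc)
    return hw
```

Because r < 1 damps the response, truncating hw(n) to a few tens of samples loses little energy, which keeps the later correlation calculations short.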

Autocorrelation function calculation circuit 180 receives the impulse response hw(n) from impulse response calculation circuit 170 and calculates its autocorrelation function according to the following equation.
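The equation is not reproduced in this text. Assuming the usual definition of the autocorrelation of a finite sequence, Rhh(m) = Σ hw(n) hw(n + m), the calculation can be sketched in Python as follows; the function name and the `max_lag` parameter are illustrative.

```python
def autocorrelation(hw, max_lag):
    """Rhh(m) = sum over n of hw(n) * hw(n + m), for m = 0..max_lag.

    hw is treated as zero outside its stored range (assumed definition).
    """
    n_samples = len(hw)
    return [sum(hw[n] * hw[n + m] for n in range(n_samples - m))
            for m in range(max_lag + 1)]
```

Rhh(0) is the energy of the impulse response, and Rhh(m) is symmetric in m, so only non-negative lags need to be stored.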

The autocorrelation function Rhh(m) is output to drive signal calculation circuit 220.

Next, subtractor 120 receives from buffer memory circuit 110 the number of samples of the speech signal X(n) that corresponds to the frame section set by frame section setting circuit 155, subtracts the output X̂(n) of synthesis filter circuit 250 from X(n), and outputs the result e(n) to weighting circuit 200.

Weighting circuit 200 receives e(n) and the prediction coefficients ai′ from K-parameter coding circuit 160, applies weighting to e(n), and outputs the result ew(n). ew(n) can be calculated by, for example, the same method as weighting circuit 410 shown in FIG. 4(a) of Japanese Patent Application No. 59-042305.

Cross-correlation function calculation circuit 210 receives ew(n) from weighting circuit 200 and the impulse response hw(n) from impulse response calculation circuit 170, and calculates the cross-correlation function φhx(m) according to the following equation.
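The equation is not reproduced in this text. A definition commonly used in correlation-based multi-pulse search, and assumed here, correlates the weighted signal with the impulse response placed at each candidate pulse position m, that is, Σ ew(n) hw(n − m). A Python sketch under that assumption (the function name is illustrative):

```python
def cross_correlation(ew, hw):
    """phi(m) = sum over n of ew(n) * hw(n - m), for m = 0..len(ew)-1.

    hw(n - m) is the contribution a unit pulse at position m would make to
    the weighted synthesized signal (assumed definition).
    """
    n_samples = len(ew)
    return [sum(ew[n] * hw[n - m]
                for n in range(m, min(n_samples, m + len(hw))))
            for m in range(n_samples)]
```

A large magnitude at lag m indicates that a pulse at position m would explain much of the weighted signal.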

The cross-correlation function φhx(m) is output to drive signal calculation circuit 220.

Next, drive signal calculation circuit 220 uses the cross-correlation sequence, partitioned at the division positions received from frame section setting circuit 155, together with the autocorrelation sequence to calculate an excitation pulse train representing the speech signal for some of the time sections (subframes). The procedure is as follows. As an example, the case described here is one in which, for every two subframes, an excitation pulse train is obtained for one subframe and the excitation pulses are omitted for the other.

As an example, FIG. 2(a) shows a speech waveform spanning four subframes, FIG. 2(b) shows the cross-correlation sequence obtained from the speech waveform of FIG. 2(a), and FIG. 2(c) shows the four subframe sections.

Next, a predetermined number of excitation pulses is calculated for one of every two subframes (here the earlier one, that is, the subframes labeled 1 and 3 in FIG. 2(c)). The excitation pulse train can be calculated by, for example, the method given by equation (21) of Japanese Patent Application No. 57-231606. FIG. 2(d) shows an example in which a predetermined number of excitation pulses has been obtained for the first and third subframes. In FIG. 2(d), four pulses are allocated to each subframe.
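Equation (21) of the referenced application is not reproduced here. The Python sketch below shows one common correlation-domain search that may be close in spirit: pulses are picked one at a time at the position where the cross-correlation is largest in magnitude, the amplitude is set to that value divided by Rhh(0), and the contribution of the new pulse is removed from the cross-correlation using Rhh. The function name, the argument layout, and the greedy update rule are assumptions, not the patent's stated procedure.

```python
def search_pulses(phi, rhh, start, end, num_pulses):
    """Greedy multi-pulse search restricted to positions start..end-1.

    phi: cross-correlation sequence, rhh: autocorrelation Rhh(0..L),
    returns a list of (position, amplitude) pairs (assumed procedure).
    """
    phi = list(phi)  # work on a copy
    pulses = []
    for _ in range(num_pulses):
        # Position inside the subframe with the largest |phi|.
        m_best = max(range(start, end), key=lambda m: abs(phi[m]))
        g = phi[m_best] / rhh[0]
        pulses.append((m_best, g))
        # Remove this pulse's contribution from the cross-correlation.
        for m in range(len(phi)):
            lag = abs(m - m_best)
            if lag < len(rhh):
                phi[m] -= g * rhh[lag]
    return pulses
```

Calling it with the bounds of subframe 1 and then of subframe 3 would give pulses for those subframes only. A search of this kind works on correlations alone, so no signal has to be resynthesized inside the loop.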

The amplitudes and positions of the pulses obtained in this way are output to coding circuit 230.

Coding circuit 230 codes the amplitudes and positions of the input pulses and outputs the resulting codes to multiplexer 260. It also outputs the decoded values gi′ and mi′ of the pulse amplitudes and positions to drive signal restoration circuit 240. The pulses can be coded by, for example, the same method as coding circuit 250 described in Japanese Patent Application No. 57-231605. Various other coding methods, such as variable-length coding or differential coding, are also conceivable.

Drive signal restoration circuit 240 uses the decoded pulse amplitudes and positions received from coding circuit 230 to generate excitation pulse trains for subframes 1 and 3 of FIG. 2(d). For subframes 2 and 4 of FIG. 2(d), the excitation is obtained by interpolation based on the excitation pulse trains of subframes 1 and 3, respectively. As a special case of interpolation, the excitation pulse trains of subframes 1 and 3 can, for example, simply be repeated with a shift of one pitch period Pd′. The driving excitation signal is obtained for the frame section received from frame section setting circuit 155 and output to synthesis filter circuit 250.
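The special case mentioned above, repeating the transmitted pulses one pitch period later, can be sketched in Python as follows. The function name and the representation of pulses as (position, amplitude) pairs are illustrative assumptions.

```python
def restore_excitation(pulses, pitch_period, frame_length):
    """Build the driving excitation for one frame section.

    pulses: decoded (position, amplitude) pairs for the transmitted
    subframes; each pulse is repeated pitch_period samples later to fill
    the following, untransmitted subframe.
    """
    d = [0.0] * frame_length
    for pos, amp in pulses:
        for p in (pos, pos + pitch_period):
            if 0 <= p < frame_length:
                d[p] += amp
    return d
```

With pulses at positions 1 and 3 and a pitch period of 5, the restored excitation also has pulses at positions 6 and 8 with the same amplitudes.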

Synthesis filter circuit 250 receives the driving excitation signal and the K parameters and calculates the response signal X̂(n) for one frame section. This calculation can use, for example, the same method as synthesis filter circuit 320 described in Japanese Patent Application No. 57-231605.
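The method of the referenced application is not reproduced here. Since the circuit is described as taking the K parameters as input, one way to realize it is an all-pole lattice filter that uses the K parameters directly, sketched below in Python. The lattice structure, its sign convention, the zero initial state, and the function name are assumptions.

```python
def lattice_synthesize(d, k):
    """All-pole lattice synthesis driven by excitation d(n).

    k: K parameters k_1..k_p. Per sample (assumed sign convention):
        e_(i-1)(n) = e_i(n) + k_i * b_(i-1)(n-1)
        b_i(n)     = b_(i-1)(n-1) - k_i * e_(i-1)(n)
    The output is e_0(n).
    """
    p = len(k)
    b = [0.0] * (p + 1)  # b[i] holds the delayed backward error b_i(n-1)
    out = []
    for x in d:
        e = x  # e_p(n) is the excitation sample
        new_b = [0.0] * (p + 1)
        for i in range(p, 0, -1):
            e = e + k[i - 1] * b[i - 1]
            new_b[i] = b[i - 1] - k[i - 1] * e
        new_b[0] = e
        b = new_b
        out.append(e)
    return out
```

A lattice of this kind is stable whenever every K parameter has magnitude below 1, which is one reason K parameters are convenient to quantize and transmit.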

Multiplexer circuit 260 receives the code lKi from K-parameter coding circuit 160, the code ld from pitch coding circuit 150, and the code from coding circuit 230, stores them temporarily, combines them, and outputs the result from transmitting-side output terminal 270 at predetermined times. This concludes the description of the transmitting side of the speech signal coding apparatus according to the present invention.

Next, the configuration of the receiving side of the speech signal coding apparatus according to the present invention is described with reference to FIG. 1(b).

At predetermined times, demultiplexer 290 receives codes from receiving-side input terminal 280, separates them into the code representing the K parameters, the code representing the pitch period, and the code representing the pulse amplitudes and positions, and outputs these to K-parameter decoding circuit 330, pitch decoding circuit 320, and pulse decoding circuit 300, respectively.

K-parameter decoding circuit 330 decodes the K parameters and outputs the decoded values Ki′ to synthesis filter circuit 350.

Pitch decoding circuit 320 decodes the pitch period Pd′ and outputs it to drive signal restoration circuit 340 and frame section setting circuit 345.

Pulse decoding circuit 300 decodes the pulse amplitudes gi′ and positions mi′ and outputs them to drive signal restoration circuit 340.

Frame section setting circuit 345 operates in the same way as frame section setting circuit 155 on the transmitting side and outputs the resulting frame section to drive signal restoration circuit 340 and synthesis filter circuit 350.

Drive signal restoration circuit 340 operates in the same way as drive signal restoration circuit 240 on the transmitting side, obtains the driving excitation signal for one frame section, and outputs it to synthesis filter circuit 350.

Synthesis filter circuit 350 receives the driving excitation signal, operates in the same way as synthesis filter circuit 250 on the transmitting side to calculate the synthesized speech signal X̂(n) for one frame section, and outputs it to buffer memory circuit 355.

Buffer memory circuit 355 stores a predetermined number of samples of the synthesized speech signal and then outputs them from receiving-side output terminal 360.

This concludes the description of the receiving side of the speech signal coding apparatus according to the present invention.

Besides the method described in this embodiment, various methods can be used for pulse calculation in drive signal calculation circuit 220. For example, a method can be used in which the amplitudes of the previously obtained pulses are readjusted each time a new pulse is obtained. This method is described in detail in, for example, the paper by Ono et al. entitled "Study of Excitation Pulse Search Methods in Multi-Pulse Excited Speech Coding" (Proceedings of the Acoustical Society of Japan, 157, 1983) (Reference 4), so its description is omitted here.

Furthermore, although drive signal calculation circuit 220 obtains the pulses subframe by subframe in the above description, a predetermined number of pulses may instead be obtained over an entire section containing several subframes. In addition, on the transmitting side of this embodiment the K parameters are held constant over the subframes belonging to the frame section used for K-parameter analysis, but the pulses may instead be obtained while the K parameters are varied smoothly from subframe to subframe. Specifically, the K parameters are interpolated for each subframe using the K parameters of the preceding and following frames, the interpolated values are converted into prediction coefficients and output to weighting circuit 200, impulse response calculation circuit 170, and synthesis filter circuit 250, and the pulses are calculated using the cross-correlation and autocorrelation functions obtained with coefficients updated for each subframe. This yields better synthesized speech. In this case, methods other than linear interpolation may also be used to interpolate the synthesis filter parameters.
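The linear-interpolation case can be sketched in Python as follows. The function name, the argument names, and the choice to let the last subframe reach the following frame's values exactly are illustrative assumptions; the text does not fix the interpolation weights.

```python
def interpolate_k(k_prev, k_next, num_subframes):
    """Linearly interpolate K parameters for each subframe of a frame.

    k_prev, k_next: K parameters of the preceding and following frames.
    Returns one list of K parameters per subframe; subframe s (0-based)
    uses weight (s + 1) / num_subframes on k_next (assumed weighting).
    """
    return [[kp + (kn - kp) * (s + 1) / num_subframes
             for kp, kn in zip(k_prev, k_next)]
            for s in range(num_subframes)]
```

Each interpolated value lies between the two frame values, so if both frames have K parameters of magnitude below 1, the interpolated ones do too.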

また、合成フィルタのパラメータを補間する場合、Ｋ
パラメータについて補間する方法の他に、例えば、予測
係数を補間する方法（但し、この場合はフィルタの安定
性をチェックする必要がある）や、対数断面積関数を補
間する方法や、自己相関関数を補間する方法等を用いる
こともできる。これらの具体的な方法は、ビーエス
アタル（B.S.ATAL）氏らによる“スピーチアナリシス
アンドシンセシスバイリニアープリディクショ
ンオブザスピーチウェイブ”（“SPEECH ANALY
SIS AND SYNTHESIS BY LINEAR PREDICTION OF THE SPEE
CH WAVE"）と題した論文（J.ACOUST.SOC.AM.,p.p.637−
655,1971）（文献５）等に述べられているので、説明は
省略する。When interpolating the synthesis filter parameters, methods other than interpolating the K parameters can also be used: for example, interpolating the prediction coefficients (in which case the stability of the filter must be checked), interpolating the log area function, or interpolating the autocorrelation function. These methods are described in the paper by B. S. Atal et al. entitled "Speech Analysis and Synthesis by Linear Prediction of the Speech Wave" (J. Acoust. Soc. Am., pp. 637-655, 1971) (Reference 5), so their description is omitted here.
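
Two of these alternatives can be illustrated with a short sketch. It assumes log area ratios of the form g = log((1 + k) / (1 - k)) as the log-area parameter (other sign conventions exist, and whether the text intends exactly this form is an assumption), and it checks the stability of directly interpolated prediction coefficients by converting them back to K parameters with the step-down recursion. The filter convention A(z) = 1 + a_1 z^-1 + ... + a_M z^-M and all function names are hypothetical.

```python
import math


def k_to_lar(k):
    """K parameters -> log area ratios, g = log((1 + k) / (1 - k))."""
    return [2.0 * math.atanh(v) for v in k]


def lar_to_k(g):
    """Inverse mapping; the result always lies strictly inside (-1, 1)."""
    return [math.tanh(v / 2.0) for v in g]


def prediction_to_k(a):
    """Step-down recursion: prediction coefficients -> K parameters.

    Returns None when some |k| >= 1, i.e. the synthesis filter is not stable.
    """
    a = list(a)
    k = [0.0] * len(a)
    for m in range(len(a) - 1, -1, -1):
        km = a[m]
        if abs(km) >= 1.0:
            return None
        k[m] = km
        # Reduce the order by one.
        a = [(a[i] - km * a[m - 1 - i]) / (1.0 - km * km) for i in range(m)]
    return k


def is_stable(a):
    """True if the all-pole filter 1 / A(z) built from a is stable."""
    return prediction_to_k(a) is not None
```

Interpolating in the log-area domain and mapping back with `lar_to_k` cannot produce an unstable filter, whereas interpolated prediction coefficients need a check such as `is_stable` before use.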

また本実施例では、サブフレーム区間を幾つか含むフ
レーム区間の音声信号に対して、減算、相互相関関数計
算、音源パルス計算処理を行なったが、サブフレーム区
間毎に上記の処理を行なうようにしてもよい。更に、Ｋ
パラメータの分析に用いるフレーム区間は上記の減算相
互相関関数計算、音源パルス計算処理をする区間と同じ
にしてもよいし、変えてもよい。またＫパラメータの分
析に用いるフレーム区間については、音声の変化部で
は、フレーム区間を短くし、定常部ではフレーム区間を
長く設定してもよい。このようにしたほうが音声の変化
部での品質を向上できるとともに伝送ビットレイトを下
げることができる。このような構成とする場合、Ｋパラ
メータ分析フレーム区間をピッチ周期に応じて決めても
よいし、ピッチ周期と関係なく決めてもよい。In this embodiment, the subtraction, cross-correlation function calculation, and sound source pulse calculation are performed on the speech signal of a frame section containing several subframe sections, but this processing may instead be performed for each subframe section. Furthermore, the frame section used for K parameter analysis may be the same as, or different from, the section over which the subtraction, cross-correlation function calculation, and sound source pulse calculation are performed. The frame section used for K parameter analysis may also be made shorter in transitional portions of the speech and longer in stationary portions. This improves quality in the transitional portions while lowering the transmission bit rate. In such a configuration, the K parameter analysis frame section may be determined according to the pitch period or independently of it.

また２つのサブフレームに対して１つのサブフレーム
分のパルスを伝送する例について説明したが、３つ以上
のサブフレームに対して１つのサブフレーム分のパルス
を伝送してもよい。このようにした方がより伝送ビット
レイトを低減できる。またこの間引き率は一定でなくて
もよい。例えばピッチ周期に応じて変えることもでき
る。つまり、ピッチ周期が短いときには間引くサブフレ
ーム数を多くし、ピッチ周期が長いときには間引く数を
減らす構成とすることもできる。An example has been described in which the pulses of one subframe are transmitted for every two subframes, but the pulses of one subframe may be transmitted for every three or more subframes, which reduces the transmission bit rate further. The thinning rate need not be constant; for example, it can be varied according to the pitch period. That is, more subframes can be thinned out when the pitch period is short, and fewer when it is long.
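
One way to make the thinning rate depend on the pitch period is sketched below. The rule itself is purely hypothetical (the text gives no formula): it keeps the time span covered by one transmitted pulse set roughly constant, so shorter pitch periods lead to more subframes sharing one set. The parameters `span`, `min_group`, and `max_group` and both function names are assumptions for illustration.

```python
def group_size(pitch_period, span=160, min_group=2, max_group=4):
    """Number of subframes that share one transmitted pulse set.

    pitch_period and span are in samples; span is a hypothetical target
    duration covered by one transmitted set of pulses.
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    return max(min_group, min(max_group, span // pitch_period))


def transmitted_subframes(num_subframes, group):
    """Indices of the subframes whose pulses are actually transmitted."""
    return [i for i in range(num_subframes) if i % group == 0]
```

With these assumed defaults, a pitch period of 40 samples gives groups of four subframes (three thinned out per group), while a pitch period of 100 samples falls back to the two-subframe case of the embodiment.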

本発明の他の構成として、次のようにすることもでき
る。送信側で最初に一旦音源パルスの励振位置を求め
る。これには駆動信号計算回路220において第一番目の
音源パルスを求めればよい。次にこの励振位置をもとに
してピッチ周期に応じた時間区間の開始位置を求め、こ
の開始位置からピッチ周期毎にあらかじめ定められた区
間数だけ音声信号を分割する。そして音声信号を良好に
表わす音源パルス列を、幾つかの時間区間の内の一部の
区間に対して、前記実施例と同一の方法を用いて求める
ようにしてもよい。但しこのような構成とした場合には
時間区間の開始位置の情報を受信側に伝送する必要があ
る。As another configuration of the present invention, the following is also possible. The transmitting side first obtains the excitation position of a sound source pulse, which can be done by finding the first sound source pulse in the drive signal calculation circuit 220. Next, based on this excitation position, the start position of the time intervals corresponding to the pitch period is obtained, and the speech signal is divided from this start position, pitch period by pitch period, into a predetermined number of intervals. A sound source pulse train that represents the speech signal well is then obtained for some of these time intervals by the same method as in the embodiment above. In this configuration, however, information on the start position of the time intervals must be transmitted to the receiving side.
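
The division step of this alternative configuration can be sketched as follows. How the start position is derived from the excitation position is not specified in the text; this sketch simply assumes that the start position equals the position of the first pulse. The function name and the clipping of the last interval to the signal length are also assumptions.

```python
def pitch_intervals(first_pulse_pos, pitch_period, num_intervals, signal_len):
    """Split the signal into pitch-period-long intervals.

    Starts at the first pulse position (assumed to be the start position)
    and returns up to num_intervals half-open (start, end) sample ranges.
    """
    intervals = []
    start = first_pulse_pos
    for _ in range(num_intervals):
        if start >= signal_len:
            break
        end = min(start + pitch_period, signal_len)
        intervals.append((start, end))
        start = end
    return intervals
```

Since the receiving side cannot recompute the first pulse position on its own, the value passed here as `first_pulse_pos` corresponds to the start position information that must be transmitted.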

尚、ディジタル信号処理の分野でよく知られているよ
うに、自己相関関数はパワスペクトルから計算すること
もできる。また、相互相関関数はクロスパワスペクトル
から計算することもできる。これらの対応関係について
は、エーブイオッペンハイム（A.V.OPPENHEIM）氏
らによる“ディジタル信号処理”“DIGITAL SIGNALPROC
ESSING"と題した単行本（文献６）等の第８章にて詳細
に説明されているので、ここでは説明を省略する。As is well known in the field of digital signal processing, the autocorrelation function can also be calculated from the power spectrum, and the cross-correlation function from the cross-power spectrum. These relationships are described in detail in Chapter 8 of the book "Digital Signal Processing" by A. V. Oppenheim et al. (Reference 6), so their description is omitted here.
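
The correspondence can be checked numerically with a small sketch. It uses a naive discrete Fourier transform for clarity (a practical implementation would use an FFT), zero-pads the signals so that the circular correlation produced by the transform matches the linear one, and covers non-negative lags only. The function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse includes the 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def autocorr_from_power_spectrum(x, max_lag):
    """Autocorrelation r(k) = sum_n x[n] x[n+k], for 0 <= k <= max_lag < len(x)."""
    padded = list(x) + [0.0] * len(x)      # avoid circular wrap-around
    power = [abs(v) ** 2 for v in dft(padded)]
    r = dft(power, inverse=True)
    return [r[k].real for k in range(max_lag + 1)]


def crosscorr_from_cross_spectrum(x, y, max_lag):
    """Cross-correlation phi(k) = sum_n x[n] y[n+k] for real x, y of equal length."""
    size = 2 * len(x)
    xs = dft(list(x) + [0.0] * (size - len(x)))
    ys = dft(list(y) + [0.0] * (size - len(y)))
    cross = [xv.conjugate() * yv for xv, yv in zip(xs, ys)]
    phi = dft(cross, inverse=True)
    return [phi[k].real for k in range(max_lag + 1)]
```

Both functions should agree with the direct time-domain sums up to rounding error, which is the relationship the text refers to.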

（本発明の効果）以上説明したように本発明は、送信側では音声信号の
周期性を利用し、ピッチ周期に応じた時間区間毎に音声
信号を分割し音声信号の予め設定した短時間のスペクト
ル包絡を表わすスペクトルパラメータをもとに音声信号
を表わす音源パルス列を時間区間の複数個の内の一部の
区間について求めて符号化して出力し、受信側では受信
したピッチパラメータを表わす符号を分離，復号化した
ピッチパラメータをもとに復元化したピッチ周期に応じ
た時間区間と分離，復号化された音源パルス列をもとに
復元された駆動音源信号と、分離，復号化されたスペク
トルパラメータとをもとに時間区間に応じて求めた区間
の音声信号を合成するので、従来の方法及び装置より比
較的少ない演算量で低い伝送ビットレイトにおいても高
品質な音声を合成することができるという効果がある。(Effects of the present invention) As described above, in the present invention the transmitting side exploits the periodicity of the speech signal: it divides the speech signal into time intervals corresponding to the pitch period, obtains a sound source pulse train representing the speech signal for only some of the plurality of time intervals, based on spectrum parameters representing a preset short-time spectral envelope of the speech signal, and encodes and outputs it. The receiving side separates and decodes the code representing the received pitch parameter, and synthesizes the speech signal of a section determined according to the time intervals, based on the time intervals corresponding to the pitch period restored from the decoded pitch parameter, the drive sound source signal restored from the separated and decoded sound source pulse train, and the separated and decoded spectrum parameters. As a result, high-quality speech can be synthesized with a comparatively smaller amount of computation than conventional methods and apparatus, even at low transmission bit rates.

[Brief description of drawings]

第１図（ａ），（ｂ）は、本発明による高能率音声信号
化および復号化装置の一実施例を表わすブロック図、第
２図は駆動信号計算回路220における処理内容の一例を
示す図、第３図は従来方式の構成を示すブロック図、第
４図はマルチパルスで表わした音源信号の一例を示す
図、第５図は音声信号の周波数特性と第３図に記載の重
みずけ回路の特性の一例を示す図である。図において、110,355,410……バッファメモリ回路、12
0,420……減算回路、250,350,430……合成フィルタ回
路、200,490……重みずけ回路、170……インパルス応答
計算回路、180……自己相関関数計算回路、210……相互
相関関数計算回路、220……駆動信号計算回路、240,340
……駆動信号復元回路、130……ピッチ分析回路、140、
480……Ｋパラメータ計算回路、150……ピッチ符号化回
路、155,345……フレーム区間設定回路、160……Ｋパラ
メータ符号化回路、230……符号化回路、260……マルチ
プレクサ、290……デマルチプレクサ、300……パルス復
号回路、320……ピッチ復号回路、330……Ｋパラメータ
復号回路、440……音源パルス発生回路、450……誤差最
小化回路をそれぞれ示す。FIGS. 1(a) and 1(b) are block diagrams showing an embodiment of a high-efficiency speech signal coding and decoding apparatus according to the present invention; FIG. 2 is a diagram showing an example of the processing in the drive signal calculation circuit 220; FIG. 3 is a block diagram showing the configuration of the conventional method; FIG. 4 is a diagram showing an example of a sound source signal represented by multiple pulses; and FIG. 5 is a diagram showing an example of the frequency characteristics of a speech signal and the characteristics of the weighting circuit shown in FIG. 3. In the figures, 110, 355, 410 denote buffer memory circuits; 120, 420 subtraction circuits; 250, 350, 430 synthesis filter circuits; 200, 490 weighting circuits; 170 an impulse response calculation circuit; 180 an autocorrelation function calculation circuit; 210 a cross-correlation function calculation circuit; 220 a drive signal calculation circuit; 240, 340 drive signal restoration circuits; 130 a pitch analysis circuit; 140, 480 K parameter calculation circuits; 150 a pitch encoding circuit; 155, 345 frame section setting circuits; 160 a K parameter encoding circuit; 230 an encoding circuit; 260 a multiplexer; 290 a demultiplexer; 300 a pulse decoding circuit; 320 a pitch decoding circuit; 330 a K parameter decoding circuit; 440 a sound source pulse generation circuit; and 450 an error minimization circuit.

Claims

(57) [Claims]

1. A speech signal coding/decoding method, characterized in that: the transmitting side inputs a discrete speech signal, obtains from the speech signal a pitch parameter representing the pitch, divides the speech signal into time intervals according to the pitch parameter, obtains a spectrum parameter representing a preset short-time spectral envelope of the speech signal, obtains, based on the pitch parameter and the spectrum parameter, a sound source pulse train representing the speech signal for some of the plurality of time intervals, and outputs a combination of a code representing the pitch parameter, a code representing the spectrum parameter, and a code representing the sound source pulse train; and the receiving side inputs the combined code, separates and decodes from it the code representing the pitch parameter, the code representing the spectrum parameter, and the code representing the sound source pulse train, restores the time intervals based on the decoded pitch parameter, restores a drive sound source signal based on the decoded sound source pulse train, and synthesizes the speech signal for a section obtained according to the time intervals, based on the decoded spectrum parameter and the restored drive sound source signal.

2. A speech signal coding apparatus comprising: a pitch calculation circuit that extracts and encodes a pitch parameter representing the pitch from an input speech signal; a parameter calculation circuit that extracts and encodes a spectrum parameter representing a preset short-time spectral envelope of the speech signal; a frame section setting circuit that determines, based on the pitch parameter, division positions for dividing the speech signal and sets a frame section containing a plurality of the divided time intervals; an autocorrelation function calculation circuit that calculates an autocorrelation sequence of an impulse response sequence; a cross-correlation function calculation circuit that calculates a cross-correlation sequence of the impulse response sequence corresponding to the short-time spectrum; a drive signal calculation/encoding circuit that calculates and encodes a sound source pulse train for a part of the frame section using the autocorrelation sequence and the cross-correlation sequence; and a multiplexer circuit that outputs a code sequence combining the output code of the pitch calculation circuit, the output code of the parameter calculation circuit, and the output code of the drive signal calculation/encoding circuit.

3. A speech signal decoding apparatus comprising: a demultiplexer/restoring circuit that inputs a code sequence combining a code representing a pitch parameter, a code representing a spectrum parameter, and a code representing a sound source pulse train, separates and decodes the code representing the pitch parameter, the code representing the spectrum parameter, and the code representing the sound source pulse train, and restores time intervals corresponding to the pitch period based on the decoded pitch parameter; a drive sound source signal restoration circuit that restores a drive sound source signal based on the decoded sound source pulse train; and a synthesis filter circuit that synthesizes the speech signal of a section obtained according to the time intervals, based on the drive sound source signal and the decoded spectrum parameter.