JPH0378637B2

Info

Publication number: JPH0378637B2
Application number: JP58124479A
Authority: JP
Inventors: Shigeru Ono
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-07-08
Filing date: 1983-07-08
Publication date: 1991-12-16
Also published as: JPS6017500A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a low-bit-rate waveform coding system for speech signals, and in particular to a coding apparatus whose transmitted information rate is 10 kbit/s or less.

A known and effective way to code a speech signal at roughly 10 kbit/s or less is to search, at short time intervals, for the excitation signal sequence of the speech signal under the condition that the error between the signal reproduced from that sequence and the input signal is minimized. In the scheme of B. S. Atal et al. of Bell Telephone Laboratories (USA), the excitation signal sequence is represented by a plurality of pulses whose amplitudes and phases are determined on the encoder side at short time intervals by the A-b-S (Analysis-by-Synthesis) method, and this scheme is effective. It is described in the proceedings of ICASSP 1982, pp. 614-617 (Reference 1), so the explanation is omitted here. Because the conventional method of Reference 1 uses the A-b-S method to find the pulse sequence, it has the drawback of requiring a very large amount of computation. In contrast, Japanese Patent Application No. Sho 57-231603 (Reference 2) proposes a method that greatly reduces the computation needed to find the pulse sequence. These methods are reported to give good reproduced speech quality at transmission rates of 10 kbit/s or less.

The conventional method of Reference 2 is briefly explained below. An excitation sequence consisting of K pulses within one frame is expressed as follows.

$$d(n) = \sum_{k=1}^{K} g_k\,\delta(n - l_k), \qquad n = 0, 1, \ldots, N-1 \tag{1}$$

Here $\delta(\cdot)$ is the Kronecker delta, $N$ is the frame length, and $g_k$ is the amplitude of the pulse located at position $l_k$. With $\alpha_i$ ($i = 1, \ldots, M$, where $M$ is the order of the synthesis filter) denoting the prediction coefficients of the synthesis filter, the reproduced signal $\tilde{x}(n)$ obtained by feeding $d(n)$ into the synthesis filter can be written as follows.

$$\tilde{x}(n) = d(n) + \sum_{i=1}^{M} \alpha_i\,\tilde{x}(n - i) \tag{2}$$

The weighted squared error within one frame between the input speech signal $x(n)$ and the reproduced signal $\tilde{x}(n)$ is

$$J = \sum_{n=0}^{N-1} \bigl( (x(n) - \tilde{x}(n)) * w(n) \bigr)^2 \tag{3}$$

Here $*$ denotes convolution and $w(n)$ is the weighting function. Writing the Z-transforms of $x(n)$, $\tilde{x}(n)$, and $w(n)$ as $X(z)$, $\tilde{X}(z)$, and $W(z)$, respectively, equation (3) is expressed as follows.

$$J = \lvert X(z)W(z) - \tilde{X}(z)W(z) \rvert^2 \tag{4}$$

Here $\lvert\cdot\rvert$ denotes the absolute value. From the relationship in equation (2), $\tilde{X}(z)$ is given as follows.

$$\tilde{X}(z) = H(z)\,D(z) \tag{5}$$

$H(z)$ is the Z-transform of the synthesis filter and $D(z)$ is the Z-transform of the excitation. Substituting (5) into (4) gives

$$J = \lvert X(z)W(z) - H(z)W(z)D(z) \rvert^2 \tag{6}$$
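As an illustration of equations (1) to (3), the following minimal sketch builds the pulse excitation, runs it through the all-pole synthesis filter, and measures the squared error. The function names, the zero initial filter state, and the omission of the weighting $w(n)$ are assumptions made for this example and are not taken from the patent.

```python
def excitation(pulses, n_samples):
    """Eq. (1): d(n) = sum_k g_k * delta(n - l_k); pulses is a list of (g_k, l_k)."""
    d = [0.0] * n_samples
    for gain, pos in pulses:
        d[pos] += gain
    return d


def synthesize(d, alpha):
    """Eq. (2): x~(n) = d(n) + sum_{i=1..M} alpha_i * x~(n - i), zero initial state."""
    out = []
    for n, dn in enumerate(d):
        acc = dn
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc += a * out[n - i]
        out.append(acc)
    return out


def squared_error(x, x_hat):
    """Eq. (3) with the weighting left out, i.e. w(n) taken as a unit impulse."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


# Two pulses driving a first-order synthesis filter (hypothetical values).
pulses = [(1.0, 0), (-0.5, 3)]
alpha = [0.5]
x_hat = synthesize(excitation(pulses, 6), alpha)
```

The encoder's task in the rest of the description is the inverse problem: given the input frame, choose the pulse list so that this error is small.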

Therefore, writing the inverse Z-transforms of $X(z)W(z)$ and $H(z)W(z)$ as $x_w(n) = x(n) * w(n)$ and $h_w(n) = h(n) * w(n)$, respectively, (6) becomes as follows.

$$J = \sum_{n=0}^{N-1} \Bigl( x_w(n) - \sum_{k=1}^{K} g_k\,h_w(n - l_k) \Bigr)^2 \tag{7}$$

To find the amplitudes $g_k$ and positions $l_k$ of the excitation pulse sequence that minimize (7), the equation obtained by partially differentiating (7) with respect to $g_k$ and setting the result to zero is used, namely the relationship

$$g_k(l_k) = \frac{\psi_{xh}(l_k) - \sum_{i=1}^{k-1} g_i\,\psi_{hh}(l_i, l_k)}{\psi_{hh}(l_k, l_k)} \tag{8}$$

Here $\psi_{xh}(\cdot)$ is the cross-correlation function sequence calculated from $x_w(n)$ and $h_w(n)$, and $\psi_{hh}(\cdot)$ is the autocorrelation function sequence of $h_w(n)$. They are expressed as follows. Note that $\psi_{hh}(\cdot)$ is also called the covariance function.

$$\psi_{xh}(l_k) = \sum_{n=0}^{N-1} x_w(n)\,h_w(n - l_k) = \psi_{hx}(-l_k), \qquad 0 \le l_k \le N-1 \tag{9}$$

$$\psi_{hh}(l_i, l_j) = \sum_{n=0}^{N-1} h_w(n - l_i)\,h_w(n - l_j), \qquad 0 \le l_i, l_j \le N-1 \tag{10}$$

The conventional method determines the amplitude and position of the k-th pulse by regarding $g_k$ in (8) as a function of $l_k$ alone. That is, the $l_k$ that maximizes $\lvert g_k \rvert$ in (8) is taken as the position of the k-th pulse, and the corresponding $g_k$ as its amplitude. If $g_k$ were exactly a function of $l_k$ alone, this method would yield the excitation pulse sequence that minimizes (7). For actual speech signals, however, this is not the case, and $g_k$ is in general a function of $l_1, l_2, \ldots, l_k$.
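The conventional search just described can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, $h_w(n)$ is treated as causal (zero for negative indices) and supplied with at least $N$ samples, and the correlation sums follow equations (9) and (10) over one frame.

```python
def cross_corr(xw, hw, N):
    """Eq. (9): psi_xh(l) = sum_n xw(n) * hw(n - l), with hw(m) = 0 for m < 0."""
    return [sum(xw[n] * hw[n - l] for n in range(l, N)) for l in range(N)]


def covariance(hw, N):
    """Eq. (10): psi_hh(li, lj) = sum_n hw(n - li) * hw(n - lj)."""
    return [[sum(hw[n - li] * hw[n - lj] for n in range(max(li, lj), N))
             for lj in range(N)] for li in range(N)]


def conventional_search(psi_xh, psi_hh, num_pulses):
    """Greedy search with Eq. (8): earlier amplitudes are never revised."""
    N = len(psi_xh)
    pulses = []  # list of (g_k, l_k) in the order they were found
    for _ in range(num_pulses):
        best = None
        for l in range(N):
            num = psi_xh[l] - sum(g_i * psi_hh[l_i][l] for g_i, l_i in pulses)
            g = num / psi_hh[l][l]
            if best is None or abs(g) > abs(best[0]):
                best = (g, l)  # keep the candidate with the largest |g_k|
        pulses.append(best)
    return pulses


# Toy frame: the target is a single scaled, shifted copy of hw (hypothetical data).
N = 8
hw = [0.5 ** n for n in range(N)]
xw = [0.0, 0.0] + [2.0 * hw[n - 2] for n in range(2, N)]
first = conventional_search(cross_corr(xw, hw, N), covariance(hw, N), 1)
```

Because each amplitude is fixed at the moment its pulse is placed, later pulses cannot correct an earlier amplitude, which is the limitation the text attributes to this method.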

FIG. 1 is a block diagram showing an embodiment of the conventional method of Reference 2. FIG. 2 is a flowchart of the procedure by which the excitation pulse sequence calculation circuit 140 determines the amplitudes $g_k$ and positions $l_k$ of the excitation pulse sequence. In FIG. 1, each component operates once per frame. Reference numeral 100 denotes the encoder input terminal, to which the A/D-converted speech signal sequence $x(n)$ is applied. The buffer memory circuit 110 stores one frame of the speech signal sequence. The K parameter calculation circuit 180 receives the speech signal $x(n)$ stored in the buffer memory circuit 110 and calculates a predetermined number of K parameters $K_i$ ($1 \le i \le M$), which are output to the K parameter coding circuit 190. The K parameter coding circuit 190 codes $K_i$ on the basis of, for example, a predetermined number of quantization bits, and outputs the code $I_{ki}$ to the multiplexer 160. It also decodes $I_{ki}$ and outputs the decoded values $K_i'$ ($1 \le i \le M$) to the impulse response calculation circuit 120 and the weighting circuit 200.

The weighting circuit 200 receives the input speech signal $x(n)$ and the decoded K parameters $K_i'$, calculates the aforementioned $x_w(n)$ using a weighting function $w(n)$ that depends on the frequency characteristics of the synthesis filter, and outputs $x_w(n)$ to the cross-correlation function calculation circuit 135. The weighting function $w(n)$ used here is, for example, one whose Z-transform $W(z)$ is expressed, with the prediction parameters $\alpha_i$ of the synthesis filter and a real constant $r$ satisfying $0 \le r \le 1$, as

$$W(z) = \frac{1 - \sum_{i=1}^{M} \alpha_i z^{-i}}{1 - \sum_{i=1}^{M} \alpha_i r^{i} z^{-i}}$$

The impulse response calculation circuit 120 receives $K_i'$, calculates the aforementioned $h_w(n)$ (the convolution of the impulse response with the same weighting function as above) for a predetermined number of samples, and outputs it to the covariance function calculation circuit 130 and the cross-correlation function calculation circuit 135. The covariance function calculation circuit 130 receives the predetermined number of samples of $h_w(n)$, calculates $\psi_{hh}(l_i, l_j)$ ($0 \le l_i, l_j \le N-1$) according to equation (10), and outputs it to the excitation pulse sequence calculation circuit 140.

The excitation pulse sequence calculation circuit is described next. The excitation pulse sequence calculation circuit 140 receives $\psi_{xh}(l_k)$ ($0 \le l_k \le N-1$) from the cross-correlation function calculation circuit 135 and $\psi_{hh}(l_i, l_j)$ ($0 \le l_i, l_j \le N-1$) from the covariance function calculation circuit 130, and calculates the amplitudes $g_k$ and positions $l_k$ of the excitation pulse sequence using the pulse calculation algorithm of equation (8). FIG. 2 is a flowchart of the procedure performed by the excitation pulse sequence calculation circuit 140. For the first pulse, $k = 1$ is set in equation (8) and the amplitude $g_1$ is expressed as a function of the position $l_1$, $g_1 = \psi_{xh}(l_1) / \psi_{hh}(l_1, l_1)$. The $l_1$ that maximizes $\lvert g_1 \rvert$ is then selected, and this $l_1$ and the corresponding $g_1$ are taken as the position and amplitude of the first pulse. For the second pulse, $k = 2$ is set in equation (8), the $l_2$ that maximizes $\lvert g_2 \rvert$ is selected, and this $l_2$ and the corresponding $g_2$ are taken as the position and amplitude of the second pulse. The third and subsequent pulses are calculated in the same way until a predetermined number of pulses is reached.

In FIG. 2, block 1 initializes to 1 the counter that counts the number of pulses. Block 2 is a comparison that judges whether the number of pulses is larger or smaller than a predetermined number. If it is larger, the pulse sequence calculation ends. Block 3 evaluates equation (8): with $l_1, \ldots, l_{k-1}$ and $g_1, \ldots, g_{k-1}$ known, it finds the $l_k$ that maximizes $\lvert g_k \rvert$ and outputs this $l_k$ and the corresponding $g_k$ as the position and amplitude of the k-th pulse. Block 4 is an adder that increments the pulse counter by one. This concludes the description of the excitation pulse calculation circuit 140.
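The weighting used by circuit 200 can be illustrated with a direct-form difference equation for $W(z)$ as given above. This is a sketch under assumptions: the function names are hypothetical, the filter state is zero at the start of the frame, and the example uses a first-order filter for brevity.

```python
def weight(x, alpha, r):
    """Filter x(n) through W(z) = (1 - sum a_i z^-i) / (1 - sum a_i r^i z^-i)."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc -= a * x[n - i]             # numerator taps
                acc += a * (r ** i) * y[n - i]  # denominator (feedback) taps
        y.append(acc)
    return y


def synthesis_impulse_response(alpha, n_samples):
    """Impulse response h(n) of the synthesis filter of Eq. (2)."""
    h = []
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc += a * h[n - i]
        h.append(acc)
    return h


# Weighted impulse response hw(n) = h(n) * w(n) for alpha_1 = 0.5, r = 0.5.
hw = weight(synthesis_impulse_response([0.5], 4), [0.5], 0.5)
```

With $r = 1$ the numerator and denominator cancel and the signal passes unchanged, while $r = 0$ leaves only the numerator, so $r$ sets how strongly the error is shaped by the spectral envelope.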

Returning to FIG. 1, the coding circuit 150 receives the amplitudes $g_k$ and positions $l_k$ of the pulse sequence output by the excitation pulse calculation circuit 140 and codes them. Well-known methods can be used to code the amplitudes $g_k$ and positions $l_k$. For the amplitudes, one possibility is to take the maximum amplitude of the pulse sequence within a frame as a normalization coefficient, normalize each pulse amplitude by this value, and then quantize and code it. For the positions, one possibility is run-length coding, which is well known in the field of facsimile signal coding. It represents the length of each run of "0" symbols with a predetermined code sequence. The multiplexer 160 receives the output code of the K parameter coding circuit 190 and the output code of the coding circuit 150, combines them, and outputs them to the transmission channel from the transmitter output terminal 170.
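The two coding ideas mentioned for circuit 150 can be sketched as below. The uniform quantizer, the bit width, and the function names are assumptions for illustration only. The patent does not specify the quantizer or the code table that maps run lengths to code words, so the sketch stops at the run lengths themselves.

```python
def normalize_and_quantize(amplitudes, bits):
    """Normalize by the frame's peak amplitude, then quantize uniformly (assumed)."""
    peak = max(abs(g) for g in amplitudes)
    levels = 2 ** (bits - 1) - 1  # symmetric signed range
    codes = [round(g / peak * levels) for g in amplitudes]
    return peak, codes


def run_lengths(positions):
    """Length of the run of zeros preceding each pulse, in increasing position order."""
    runs, prev = [], -1
    for pos in sorted(positions):
        runs.append(pos - prev - 1)
        prev = pos
    return runs


peak, codes = normalize_and_quantize([3.0, -1.0, 2.0], bits=4)
runs = run_lengths([5, 0, 9])
```

The peak value would be transmitted once per frame as the normalization coefficient, alongside the per-pulse codes.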

This concludes the description of how the conventional method of Reference 2 searches for the excitation pulse sequence.

In its algorithm for determining the amplitudes and positions of the excitation pulse sequence, the conventional method of Reference 2 assumes that the amplitude of a pulse is a function only of the position at which that pulse is placed. For actual speech signals, however, this assumption does not hold. As equation (8), which the conventional method of Reference 2 uses to find the excitation pulse sequence, shows, $g_k$ is in general a function of $l_1, l_2, \ldots, l_k$. The excitation pulse sequence determined by the conventional method of Reference 2 therefore does not truly minimize $J$ in equation (7), and a better-suited excitation pulse sequence exists. In a scheme that represents the excitation signal sequence by a plurality of pulses, obtaining still better speech quality at transmission rates of 10 kbit/s or less requires finding better-suited amplitudes and positions for the excitation pulse sequence. The present invention relates to an improvement of this excitation pulse search algorithm.

An object of the present invention is to provide a high-quality speech coding system that is applicable to transmission rates of 10 kbit/s or less and requires a relatively small amount of computation.

According to the present invention, there is provided a speech coding apparatus comprising: means for receiving a discrete speech signal sequence and dividing it into short-time intervals to obtain a short-time speech signal sequence; means for extracting and coding parameters representing a spectral envelope from the short-time speech signal sequence; means for calculating an autocorrelation function sequence of an impulse response sequence corresponding to the spectral envelope; means for calculating a cross-correlation function sequence between the impulse response sequence corresponding to the spectral envelope and the short-time speech signal sequence; excitation pulse coding means which, when sequentially determining the positions and amplitudes of excitation pulses suited to the excitation signal sequence of the short-time speech signal sequence by using the autocorrelation function sequence and the cross-correlation function sequence, determines the position of a new excitation pulse on the basis of the positions and amplitudes of the previously determined excitation pulses, and recalculates the amplitudes of the previously determined excitation pulses and of the new excitation pulse on the basis of the previously determined positions and the newly determined position; and means for combining and outputting the code of the parameters representing the spectral envelope and the code representing the excitation signal sequence.

According to the present invention, there can also be provided a speech coding apparatus comprising: means for receiving a discrete speech signal sequence and dividing it into short-time intervals to obtain a short-time speech signal sequence; means for extracting and coding parameters representing a spectral envelope from the short-time speech signal sequence; means for calculating an autocorrelation function sequence of an impulse response sequence having a spectral envelope obtained by applying to the spectral envelope a predetermined correction based on the short-time speech signal sequence; means for calculating a cross-correlation function sequence using a target signal sequence obtained by applying a predetermined correction based on the short-time speech signal sequence; excitation pulse coding means which, when sequentially determining with these means the positions and amplitudes of excitation pulses suited to the excitation signal sequence of the short-time speech signal sequence, determines the position of a new excitation pulse on the basis of the positions and amplitudes of the previously determined excitation pulses, and recalculates the amplitudes of the previously determined excitation pulses and of the new excitation pulse on the basis of the previously determined positions and the newly determined position; and means for combining and outputting the code of the parameters representing the spectral envelope and the code representing the excitation signal sequence.

The speech coding system according to the present invention is characterized by the algorithm used to find the excitation pulse sequence. The following therefore describes the algorithm of the invention which, given equation (7), finds the amplitudes $g_k$, $k = 1, 2, \ldots, K$, and positions $l_k$, $k = 1, 2, \ldots, K$, of the excitation pulse train that minimizes $J$ in (7).

First, consider $(K-1)$ pulses with amplitudes $\{g_1, g_2, \ldots, g_{K-1}\}$ and positions $\{l_1, l_2, \ldots, l_{K-1}\}$. Following equation (7), the squared error when one more pulse is added to this sequence is expressed as below.

$$J_K = \sum_{n=0}^{N-1} \Bigl( x_w(n) - \sum_{k=1}^{K} g_k\,h_w(n - l_k) \Bigr)^2 \tag{11}$$

To see the effect of the K-th pulse, partially differentiating (11) with respect to $g_K$ and setting the result to zero gives the following relationship.

$$g_K = \frac{\psi_{xh}(l_K) - \sum_{k=1}^{K-1} g_k\,\psi_{hh}(l_k, l_K)}{\psi_{hh}(l_K, l_K)} \tag{12}$$

With this $g_K$ substituted into (11), $J_K$ can be calculated from $J_{K-1}$ and $g_K$ as follows.

$$J_K = J_{K-1} - g_K^2\,\psi_{hh}(l_K, l_K), \qquad K > 1 \tag{13}$$

where

$$J_{K-1} = \sum_{n=0}^{N-1} \Bigl( x_w(n) - \sum_{k=1}^{K-1} g_k\,h_w(n - l_k) \Bigr)^2 \tag{14}$$

From (12) and (13), $J_K$ is a function of $l_K$, and (13) shows that $J_K$ is smallest when the pulse is placed at the $l_K$ for which $g_K^2$ in (12) is largest. Next, partially differentiating (11) with respect to each $g_k$ and setting the result to zero gives the following relationship.

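$$\psi_{xh}(l_k) = \sum_{i=1}^{K} g_i\,\psi_{hh}(l_i, l_k), \qquad k = 1, \ldots, K \tag{15}$$

The $g_k$, $k = 1, \ldots, K$, that satisfy (15) are obtained as the solution of the following system of simultaneous linear equations.

$$\begin{pmatrix} \psi_{hh}(l_1, l_1) & \cdots & \psi_{hh}(l_K, l_1) \\ \vdots & \ddots & \vdots \\ \psi_{hh}(l_1, l_K) & \cdots & \psi_{hh}(l_K, l_K) \end{pmatrix} \begin{pmatrix} g_1 \\ \vdots \\ g_K \end{pmatrix} = \begin{pmatrix} \psi_{xh}(l_1) \\ \vdots \\ \psi_{xh}(l_K) \end{pmatrix} \tag{16}$$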

The $K \times K$ matrix on the left-hand side of (16) is a positive-definite symmetric matrix, so $g_k$, $k = 1, \ldots, K$, can be solved for with a fast algorithm such as Cholesky decomposition. When (15) holds, $J_K$ is at its minimum and can be calculated as follows.

$$J_K = \sum_{n=0}^{N-1} x_w(n)^2 - \sum_{k=1}^{K} g_k\,\psi_{xh}(l_k) \tag{17}$$

Accordingly, $l_1$ and $g_1$ are first found from (12) and (16) with $K = 1$ as the initial value. Thereafter, sequentially in $K$, the position $l_K$ is calculated from (12) and the amplitudes $g_1, g_2, \ldots, g_K$ from (16). This is repeated until the number of pulses reaches a predetermined value, or the squared error obtained by substituting the resulting $g_1, g_2, \ldots, g_p$ and $l_1, l_2, \ldots, l_p$ into (17) falls below a predetermined value, or the magnitude of the amplitude of the newly placed pulse falls below a predetermined value. In this way the amplitudes $g_k$ and positions $l_k$ of an excitation pulse sequence that makes (7) small can be searched for. This concludes the derivation of the algorithm of the invention.
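The amplitude step can be illustrated with a small Cholesky solver for the normal equations (16), followed by the error formula (17). The function names and the numerical example are hypothetical. The example values correspond to two pulses at positions 1 and 2 for a toy $h_w(n) = \{1, 0.5, 0, \ldots\}$ and a target built from amplitudes 2 and 1.

```python
import math


def cholesky_solve(A, b):
    """Solve A g = b for a symmetric positive-definite A via A = L L^T."""
    K = len(b)
    L = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    y = []
    for i in range(K):  # forward substitution: L y = b
        y.append((b[i] - sum(L[i][m] * y[m] for m in range(i))) / L[i][i])
    g = [0.0] * K
    for i in reversed(range(K)):  # back substitution: L^T g = y
        g[i] = (y[i] - sum(L[m][i] * g[m] for m in range(i + 1, K))) / L[i][i]
    return g


def residual_error(energy, psi_xh, positions, gains):
    """Eq. (17): J_K = sum_n xw(n)^2 - sum_k g_k * psi_xh(l_k)."""
    return energy - sum(g * psi_xh[l] for g, l in zip(gains, positions))


# Matrix and right-hand side of Eq. (16) for positions l_1 = 1, l_2 = 2.
A = [[1.25, 0.5], [0.5, 1.25]]
psi_xh = [1.0, 3.0, 2.25, 0.5, 0.0, 0.0]
gains = cholesky_solve(A, [psi_xh[1], psi_xh[2]])
error = residual_error(8.25, psi_xh, [1, 2], gains)
```

In this toy case the jointly solved amplitudes reproduce the target exactly, so the error from (17) drops to zero (up to rounding), which is the kind of check the second stopping criterion in the text relies on.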

As described above, the present invention is characterized by the algorithm used to find the excitation pulse sequence. Except for the excitation pulse sequence calculation circuit 140, the invention can be realized with exactly the same configuration as FIG. 1, which shows the embodiment of the conventional method of Reference 2. The excitation pulse sequence calculation circuit 140 according to the invention is therefore described here. FIG. 3 is a flowchart of the procedure performed by the excitation pulse sequence calculation circuit according to the invention.

The position of the first pulse is found by setting $P = 0$ in (12), that is, with no previously determined pulses, calculating $\psi_{xh}(l_1) / \psi_{hh}(l_1, l_1)$, and selecting the $l_1$ that maximizes $(\psi_{xh}(l_1) / \psi_{hh}(l_1, l_1))^2$. This is the first pulse position. The amplitude $g_1$ of the first pulse is calculated by setting $P = 0$ in (16) and substituting this $l_1$. The position of the second pulse is found by setting $P = 1$ in (12) and substituting the above $g_1$ and $l_1$. It is the $l_2$ that maximizes $\{(\psi_{xh}(l_2) - g_1 \psi_{hh}(l_1, l_2)) / \psi_{hh}(l_2, l_2)\}^2$. The values of $g_1$ and $g_2$ are determined by substituting $l_1$ and $l_2$ into (16). The third and subsequent pulses are calculated in the same way: the P-th position $l_P$ is found from (12), $l_1, \ldots, l_P$ are substituted into (16), and the values of $g_1, \ldots, g_P$ are obtained. This process is repeated.

In FIG. 3, block 5 initializes to 1 the counter that counts the number of pulses. Block 6 is a comparison that judges whether the number of pulses is larger or smaller than a predetermined number. If it is larger, the pulse sequence calculation ends. Block 7 evaluates equation (12) and calculates the pulse position. Block 8 evaluates equation (16) and calculates the pulse amplitudes. Block 9 is an adder that increments the pulse counter by one. This concludes the description of the excitation pulse calculation circuit according to the invention.
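The flow of FIG. 3 can be sketched as a loop that alternates the position step (12) and the amplitude step (16). This is an illustration under assumptions, not the circuit itself: the function names are hypothetical, plain Gaussian elimination stands in for the Cholesky solver mentioned in the text, positions already used are skipped as a safeguard against a singular matrix, and only the pulse-count stopping rule is implemented.

```python
def solve_spd(A, b):
    """Gaussian elimination without pivoting (adequate for positive-definite A)."""
    K = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(K):
        for r in range(c + 1, K):
            f = M[r][c] / M[c][c]
            for j in range(c, K + 1):
                M[r][j] -= f * M[c][j]
    g = [0.0] * K
    for r in reversed(range(K)):
        g[r] = (M[r][K] - sum(M[r][j] * g[j] for j in range(r + 1, K))) / M[r][r]
    return g


def proposed_search(psi_xh, psi_hh, max_pulses):
    N = len(psi_xh)
    positions, gains = [], []
    while len(positions) < max_pulses:          # blocks 5, 6, 9: pulse counter
        best_l, best_g = None, 0.0
        for l in range(N):                      # block 7: position from Eq. (12)
            if l in positions:
                continue
            num = psi_xh[l] - sum(g * psi_hh[p][l] for g, p in zip(gains, positions))
            g_new = num / psi_hh[l][l]
            if best_l is None or g_new ** 2 > best_g ** 2:
                best_l, best_g = l, g_new
        positions.append(best_l)
        A = [[psi_hh[p][q] for q in positions] for p in positions]
        gains = solve_spd(A, [psi_xh[p] for p in positions])  # block 8: Eq. (16)
    return positions, gains


# Toy frame: xw(n) = 2*hw(n-1) + 1*hw(n-2) with hw = {1, 0.5, 0, ...} (hypothetical).
N = 6
hw = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
xw = [0.0, 2.0, 2.0, 0.5, 0.0, 0.0]
psi_xh = [sum(xw[n] * hw[n - l] for n in range(l, N)) for l in range(N)]
psi_hh = [[sum(hw[n - a] * hw[n - b] for n in range(max(a, b), N))
           for b in range(N)] for a in range(N)]
positions, gains = proposed_search(psi_xh, psi_hh, 2)
```

In this toy case the first pulse is initially given amplitude 2.4 because it also absorbs part of the overlapping second pulse. Re-solving (16) after the second position is chosen restores amplitudes close to 2 and 1, whereas a search that froze the first amplitude would keep 2.4 and assign 0.84 to the second pulse.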

With the configuration of the present invention, the excitation pulse sequence is calculated by determining the optimum amplitudes from (16) and, sequentially in the number of pulses, the optimum position from (12). The assumption of the conventional method of Reference 2, that the amplitude of a pulse is a function only of the position at which that pulse is placed, is therefore not needed, and a better-suited excitation pulse sequence can be obtained. This has the effect of giving better reproduced speech quality. In addition, by calculating the values of $\psi_{xh}(l_k)$ ($0 \le l_k \le N-1$) and $\psi_{hh}(l_i, l_j)$ ($0 \le l_i, l_j \le N-1$) in advance for each frame, the evaluation of (12) reduces, as in Reference 2, to simple multiplications and subtractions. Furthermore, because the matrix in (16) is positive definite and symmetric, fast algorithms exist for solving it, with the effect that the amount of computation is greatly reduced compared with the conventional method of Reference 1.

The calculation of the excitation pulse sequence of the invention described above is performed frame by frame, but the frame may instead be divided into several subframes and the pulse sequence calculated for each subframe. With this configuration, if the number of frame divisions is $d$, the amount of computation can be reduced to roughly $1/d$ of that of the configuration shown in FIG. 3.

In the configuration example described above the frame length is constant, but it may be made variable, and making it variable improves the characteristics. The K parameters are used as the parameters representing the spectral envelope of the short-time speech signal sequence, but other well-known parameters (for example, LSP parameters) may be used instead. Furthermore, the weighting function $w(n)$ described above may be omitted.

In the excitation pulse calculation formulas (12) and (16) of the invention, $\psi_{hh}(\cdot)$ is calculated as the covariance function according to (10), but the configuration may instead calculate the autocorrelation function sequence given by the following equation.

$$\psi_{hh}(l_i, l_j) = \sum_{n=0}^{N-1} h_w(n)\,h_w(n - \lvert l_i - l_j \rvert), \qquad 0 \le \lvert l_i - l_j \rvert \le N-1 \tag{18}$$

This configuration greatly reduces the amount of computation needed to calculate $\psi_{hh}(\cdot)$, and therefore has the effect of reducing the overall amount of computation as well.
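The saving offered by (18) can be seen in a short sketch: the autocorrelation depends only on the lag $\lvert l_i - l_j \rvert$, so $N$ values replace the $N \times N$ covariance table of (10). The function names are hypothetical and $h_w(n)$ is assumed causal.

```python
def autocorrelation(hw, N):
    """Eq. (18): R(d) = sum_n hw(n) * hw(n - d) for lag d = |l_i - l_j|."""
    return [sum(hw[n] * hw[n - d] for n in range(d, N)) for d in range(N)]


def psi_hh_auto(R, li, lj):
    """Look up psi_hh(l_i, l_j) from the one-dimensional table of lags."""
    return R[abs(li - lj)]


def psi_hh_cov(hw, N, li, lj):
    """Eq. (10) for comparison: the sum is cut off at the frame end."""
    return sum(hw[n - li] * hw[n - lj] for n in range(max(li, lj), N))


N = 4
hw = [1.0, 0.5, 0.0, 0.0]
R = autocorrelation(hw, N)
```

The two definitions agree while the shifted response fits inside the frame and differ near the frame end, where the covariance form truncates the response. That difference is the approximation accepted in exchange for the smaller table.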

Furthermore, in the invention the autocorrelation function sequence of the synthesis filter is calculated according to (10) after first obtaining the impulse response of the synthesis filter, but it can also be obtained by taking the inverse Fourier transform of the power spectrum of the synthesis filter. Likewise, the cross-correlation function sequence between the impulse response of the synthesis filter and the input speech signal is calculated according to (9), but it can also be obtained by taking the inverse Fourier transform of the product of the power spectrum of the synthesis filter and the power spectrum of the input speech signal.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of a speech coding system with the conventional configuration. FIG. 2 is a flowchart of the procedure performed by the excitation pulse sequence calculation circuit of the conventional method. FIG. 3 is a flowchart of the procedure performed by the excitation pulse sequence calculation circuit according to the present invention.

In the figures: 110, buffer memory circuit; 120, impulse response calculation circuit; 130, covariance function calculation circuit; 135, cross-correlation function calculation circuit; 140, excitation pulse sequence calculation circuit; 150, coding circuit; 160, multiplexer; 180, K parameter calculation circuit; 190, K parameter coding circuit; 200, weighting circuit; 1, pulse counter; 2, comparator; 3, pulse calculation circuit; 4, adder; 5, pulse counter; 6, comparator; 7, pulse position calculation circuit; 8, pulse amplitude calculation circuit; 9, adder.

Claims

[Scope of Claims]

1. A speech coding apparatus comprising: means for receiving a discrete speech signal sequence and dividing it into short-time intervals to obtain a short-time speech signal sequence; means for extracting and coding parameters representing a spectral envelope from the short-time speech signal sequence; means for calculating an autocorrelation function sequence of an impulse response sequence corresponding to the spectral envelope; means for calculating a cross-correlation function sequence between the impulse response sequence corresponding to the spectral envelope and the short-time speech signal sequence; excitation pulse coding means which, when sequentially determining the positions and amplitudes of excitation pulses suited to the excitation signal sequence of the short-time speech signal sequence by using the autocorrelation function sequence and the cross-correlation function sequence, determines the position of a new excitation pulse on the basis of the positions and amplitudes of the previously determined excitation pulses, and recalculates the amplitudes of the previously determined excitation pulses and of the new excitation pulse on the basis of the previously determined positions and the newly determined position; and means for combining and outputting the code of the parameters representing the spectral envelope and the code representing the excitation signal sequence.

2. A speech coding apparatus comprising: means for receiving a discrete speech signal sequence and dividing it into short-time intervals to obtain a short-time speech signal sequence; means for extracting and coding parameters representing a spectral envelope from the short-time speech signal sequence; means for calculating an autocorrelation function sequence of an impulse response sequence having a spectral envelope obtained by applying to the spectral envelope a predetermined correction based on the short-time speech signal sequence; means for calculating a cross-correlation function sequence using a target signal sequence obtained by applying a predetermined correction based on the short-time speech signal sequence; excitation pulse coding means which, when sequentially determining with these means the positions and amplitudes of excitation pulses suited to the excitation signal sequence of the short-time speech signal sequence, determines the position of a new excitation pulse on the basis of the positions and amplitudes of the previously determined excitation pulses, and recalculates the amplitudes of the previously determined excitation pulses and of the new excitation pulse on the basis of the previously determined positions and the newly determined position; and means for combining and outputting the code of the parameters representing the spectral envelope and the code representing the excitation signal sequence.