JPH028900A

JPH028900A - Voice encoding and decoding method, voice encoding device, and voice decoding device

Info

Publication number: JPH028900A
Application number: JP63158112A
Authority: JP
Inventors: Kazunori Ozawa; 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-06-28
Filing date: 1988-06-28
Publication date: 1990-01-12

Abstract

PURPOSE: To improve the quality of reproduced speech by finding, in each subframe, a sound source signal consisting of multipulses on the basis of the sound source signal of a past subframe, utilizing pitch correlation both between subframes and within a subframe. CONSTITUTION: A subframe division part 355 divides a speech signal into subframes of a predetermined time length (containing at least a plurality of pitch periods) equal to or shorter than a frame section. Pitch prediction is carried out between frames by using a sound source signal that was found in a past subframe and stored in a sound source pulse memory 305, which removes the pitch correlation between subframes; pitch prediction is then also performed within the subframe to remove the pitch correlation within the subframe, thereby finding the sound source signal consisting of multipulses. Consequently, the sound quality can be improved with a relatively small amount of computation and without a large increase in the amount of transmitted information.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a speech encoding/decoding method, a speech encoding device, and a speech decoding device, and in particular to a speech encoding/decoding method, a speech encoding device, and a speech decoding device for efficiently encoding and decoding a speech signal at a low bit rate.

[Conventional technology]

Multipulse coding and related methods are known as schemes for transmitting a speech signal at a low bit rate, for example about 16 kb/s or less. These methods represent the sound source signal as a combination of a plurality of pulses (a multipulse), represent the characteristics of the vocal tract with a digital filter, and determine and transmit the sound source pulse information and the filter coefficients for each fixed time interval (frame). Details of this method are described in, for example, Araseki, Ozawa, Ono, and Ochiai, "Multi-pulse Excited Speech Coder Based on Maximum Crosscorrelation Search Algorithm" (GLOBECOM 83, IEEE Global Telecommunication, lecture number 23.3, 1983) (Reference 1). By representing the vocal tract information and the sound source signal separately, and by using a combination of a plurality of pulses as the means of representing the sound source signal, this method outputs a good speech signal after decoding.

Here, the basic concept of obtaining the pulse train that represents the sound source signal is explained with reference to FIG. 3.

A speech signal divided into frames is input from the input terminal 800 in the figure. The spectral parameters obtained from the speech signal of the current frame are supplied to the synthesis filter circuit 820. The sound source calculation circuit 810 generates an initial multipulse, which is input to the synthesis filter circuit 820 to obtain a synthesized speech waveform as its output. The subtracter 840 subtracts the synthesized speech waveform from the input signal. The result is input to the weighting circuit 850, whose output is the weighted error power of the current frame. The sound source calculation circuit 810 then determines the amplitudes and positions of a specified number of multipulses so as to minimize this weighted error power.
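The loop described above (synthesize, subtract, measure the error, choose pulses) can be sketched as a greedy search. This is a minimal illustration and not the procedure of Reference 1: the perceptual weighting is assumed to be the identity, the synthesis filter is represented by a finite impulse response `h`, and the names `synthesize` and `multipulse_search` are hypothetical.

```python
def synthesize(pulses, h, n):
    """Pass pulses [(position, amplitude), ...] through a filter with impulse response h."""
    y = [0.0] * n
    for pos, amp in pulses:
        for k, hk in enumerate(h):
            if pos + k < n:
                y[pos + k] += amp * hk
    return y


def multipulse_search(target, h, num_pulses):
    """Greedily place pulses one at a time so that each reduces the squared error most.

    Assumption: no perceptual weighting (the weighting circuit is treated as identity).
    """
    n = len(target)
    residual = list(target)
    pulses = []
    for _ in range(num_pulses):
        best = None  # (error reduction, position, amplitude)
        for m in range(n):
            span = min(len(h), n - m)  # the response is truncated at the frame end
            corr = sum(residual[m + k] * h[k] for k in range(span))
            energy = sum(h[k] * h[k] for k in range(span))
            if energy == 0.0:
                continue
            reduction = corr * corr / energy
            if best is None or reduction > best[0]:
                best = (reduction, m, corr / energy)
        if best is None:
            break
        _, pos, amp = best
        pulses.append((pos, amp))
        # Remove the contribution of the new pulse from the residual.
        for k in range(min(len(h), n - pos)):
            residual[pos + k] -= amp * h[k]
    return pulses, residual
```

Each pass picks the position whose optimally scaled response removes the most error energy, so the number of pulses directly trades bit rate against residual error.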

[Problem to be solved by the invention]

However, with the conventional technique described above, the sound quality is good when the bit rate is sufficiently high and the number of sound source pulses is sufficient, but the sound quality deteriorates as the bit rate is lowered.

To address this problem, a pitch prediction multipulse method has been proposed that exploits the quasi-periodicity of the multipulse sound source from one pitch period to the next (pitch correlation).

Details of this method are given in, for example, the specification of Japanese Patent Application No. 58-139022 (Reference 2), so the explanation is omitted here. The pitch prediction multipulse method of Reference 2 performs pitch prediction using the pitch correlation of the sound source only within a frame section and does not use pitch correlation across frames, so the improvement in sound quality from pitch prediction is not very large. In particular, when the frame section is set to about 10 to 20 ms so that the transmission delay does not become long, such a frame holds only one or two pitch periods of a male speaker with a long pitch period (typically about 8 to 10 ms), and pitch prediction has almost no effect.

On the other hand, to address this problem, a pitch prediction multipulse method has also been proposed that performs pitch prediction of the sound source not only within a frame section but also between frames, spanning frame boundaries. This method is described in, for example, the specification of Japanese Patent Application No. 60-273936 (Reference 3), so the explanation is omitted here. However, the method of Reference 3 requires two sets of pitch reproduction filters and two sets of spectral envelope synthesis filters, so the amount of computation needed to obtain these filters increases, and the amount of information needed to transmit the coefficients of four kinds of filters in total also increases.

An object of the present invention is to provide a speech encoding/decoding method, a speech encoding device, and a speech decoding device that can reproduce better speech than conventional methods both at high bit rates and as the bit rate is lowered, and that can be realized with a relatively small amount of computation.

[Means to solve the problem]

In the speech encoding/decoding method of the present invention, the transmitting side inputs a discrete speech signal and divides it into predetermined frame sections, obtains from the speech signal a spectral parameter representing the spectral characteristics and a pitch parameter representing the pitch, divides the speech signal into subframe sections each containing at least a plurality of pitch periods, and obtains, in each subframe section, a sound source signal consisting of multipulses by pitch prediction between subframes and within the subframe on the basis of the past sound source signal. The receiving side obtains a driving sound source signal by pitch prediction between subframes and within the subframe, using the pitch parameter and the sound source signal on the basis of the past sound source signal, and outputs, using the spectral parameter, a synthesized speech signal that represents the speech signal well.

The speech encoding device of the present invention comprises: a parameter calculation circuit that divides an input speech signal into predetermined frame sections and obtains and encodes, from the speech signal, a spectral parameter representing the spectral characteristics and a pitch parameter representing the pitch; a subframe division circuit that divides the speech signal into subframe sections each containing at least a plurality of pitch periods; a sound source pulse calculation circuit that performs pitch prediction across subframes on the basis of the past sound source signal, further performs pitch prediction within the subframe, and obtains and encodes a sound source signal consisting of a multipulse train; and a multiplexer circuit that combines and outputs the output of the parameter calculation circuit and the output of the sound source pulse calculation circuit.

Furthermore, the speech decoding device of the present invention comprises: a demultiplexer circuit that inputs, separates, and decodes a code representing the spectral parameter of a speech signal, a code representing the pitch parameter, and a code representing the sound source signal; a pitch reproduction filter that performs inter-subframe pitch prediction on the basis of the decoded pitch parameter and the past sound source signal, and further obtains a driving sound source signal by intra-subframe pitch prediction using the decoded sound source signal; and a synthesis filter circuit that synthesizes a speech signal that represents the original speech signal well, using the driving sound source signal and the decoded spectral parameter.

[Effect]

According to the present invention, the speech signal is divided into subframes of a predetermined time length (containing at least a plurality of pitch periods) equal to or shorter than the frame section; pitch prediction is performed between frames using the sound source signal obtained in a past subframe, removing the pitch correlation between subframes; pitch prediction is then also performed within the subframe, removing the pitch correlation within the subframe; and a sound source signal consisting of multipulses is thereby obtained. Furthermore, the present invention uses one set of pitch reproduction filter and one set of spectral envelope synthesis filter, so that a speech encoding/decoding method, a speech encoding device, and a speech decoding device can be realized that greatly improve sound quality with a relatively small amount of computation and without a large increase in the amount of transmitted information.

[Example]

Next, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of an embodiment of the speech encoding/decoding method, the speech encoding device, and the speech decoding device according to the present invention, and FIG. 2 is a block diagram for explaining the inter-frame and intra-frame pitch prediction method of the present invention.

In the speech encoding and decoding processing according to the present invention, the transmitting side inputs a discrete speech signal and divides it into predetermined frame sections, obtains from the speech signal a spectral parameter representing the spectral characteristics and a pitch parameter representing the pitch, divides the speech signal into subframe sections each containing at least a plurality of pitch periods, and obtains, in each subframe section, a sound source signal consisting of multipulses by pitch prediction between subframes and within the subframe on the basis of the past sound source signal. The receiving side obtains a driving sound source signal by pitch prediction between subframes and within the subframe, using the pitch parameter and the sound source signal on the basis of the past sound source signal, and outputs, using the spectral parameter, a synthesized speech signal that represents the speech signal well.

First, the method is explained with reference to FIG. 2. The speech signal of each frame section (for example, 20 ms) is divided by the subframe division unit 355 into subframe sections equal to or shorter than the frame section. From the speech signal of the frame, the coefficients $a_i$ (of order $M$ here) of the spectral envelope synthesis filter 320 are obtained by a well-known method (for example, LPC analysis). The coefficient $b$ and the pitch period $T_d$ of the pitch reproduction filter 325 (taken to be first-order here, for simplicity) are likewise obtained by a well-known method (for example, the autocorrelation method).
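The text only names "the autocorrelation method" for the pitch parameters. As one possible reading, the sketch below picks the lag with the largest autocorrelation as the pitch period and uses the normalized autocorrelation at that lag as the first-order coefficient. The function name `estimate_pitch`, the lag search range, and the choice of coefficient are assumptions for illustration, not details from the source.

```python
def estimate_pitch(signal, min_lag, max_lag):
    """Return (pitch period in samples, first-order pitch coefficient).

    Assumption: the period is the lag of maximum autocorrelation within
    [min_lag, max_lag], and the coefficient is R(period) / R(0).
    """
    def autocorr(lag):
        return sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))

    r0 = autocorr(0)
    if r0 == 0.0:
        return min_lag, 0.0  # silent input: no meaningful pitch
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return best_lag, autocorr(best_lag) / r0
```

In practice the search range would be set from the expected range of speaker pitch; the bounds are left to the caller here.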

Here, the transfer characteristics of the spectral envelope synthesis filter 320 and the pitch reproduction filter 325 are expressed by the following equations (1) and (2), respectively.
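The bodies of equations (1) and (2) are not legible in this text. As a hedged reconstruction only, assuming the conventional all-pole forms for an LPC synthesis filter of order $M$ with coefficients $a_i$ and for a first-order pitch filter with coefficient $b$ and pitch period $T_d$, they would read as follows. The subscripts $s$ and $p$ and the sign convention are assumptions, not taken from the source.

```latex
H_s(z) = \frac{1}{1 - \sum_{i=1}^{M} a_i z^{-i}} \qquad (1)

H_p(z) = \frac{1}{1 - b\, z^{-T_d}} \qquad (2)
```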

[Equations (1) and (2): only the fragment "H(z) =" is legible in the source.]

The sound source pulse memory 305 stores the sound source signal of the past subframe, one pitch period earlier. First, the switch 280 is set to side 1, and the past multipulse train is input to the pitch reproduction filter 325 to reproduce the pitch of the current subframe. Next, the switch 290 is set to side 1, and the result is passed to the spectral envelope synthesis filter 320, whose output is supplied to the subtracter 330. The subtracter 330 subtracts the output of the spectral envelope synthesis filter 320 from the speech signal of the subframe section to obtain the residual $e(n)$ and outputs it to the perceptual weighting filter 350. The filter 350 can use the same configuration as the weighting circuit described in Reference 2. The mean square value $E$ of the output of the perceptual weighting filter 350 is output to the sound source pulse calculation unit 310.
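The first step above, driving the pitch reproduction filter from the stored past sound source, can be sketched as a first-order recursive filter whose history is preloaded from memory. The recursion $v(n) = u(n) + b\,v(n - T)$ is an assumed form of the first-order pitch filter, and `pitch_reproduce` is a hypothetical name.

```python
def pitch_reproduce(excitation, memory, b, period):
    """First-order pitch filter: v[n] = excitation[n] + b * v[n - period].

    Samples before the start of the current subframe are read from `memory`,
    the past sound source signal (most recent sample last).
    """
    if period < 1 or len(memory) < period:
        raise ValueError("memory must hold at least one pitch period")
    history = list(memory)
    out = []
    for u in excitation:
        v = u + b * history[-period]  # sample one pitch period back
        history.append(v)
        out.append(v)
    return out


# With a zero input, the filter output is the past sound source repeated at
# the pitch period and scaled by b on every repetition.
predicted = pitch_reproduce([0.0] * 8, [0.0, 0.0, 1.0, 0.0], 0.5, 4)
```

Feeding a zero excitation corresponds to the state in which only the past multipulse train drives the filter; adding the current pulses to `excitation` gives the within-subframe case.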

Next, the switches 280 and 290 are set to side 2. The sound source pulse calculation unit 310 uses, for example, the method described in Reference 1 or the same method as the sound source calculation units of References 2 and 3 to obtain a predetermined number of large-amplitude multipulses by intra-subframe pitch prediction so as to minimize the value of $E$. The signal $\hat{x}(n)$ reproduced from the large-amplitude multipulses is then obtained using the pitch reproduction filter 325 and the spectral envelope synthesis filter 320 and is subtracted in the subtracter 330 to obtain the residual $e(n)$, which is output to the perceptual weighting filter 350. The mean square value of the output of the perceptual weighting filter 350 is obtained and output to the small-amplitude sound source calculation unit 315.

Next, the switch 290 is set to side 3. Small-amplitude multipulses do not have pitch periodicity as strong as that of large-amplitude multipulses and are considered to occur almost randomly. Therefore, once the large-amplitude multipulses have been obtained, the pitch reproduction filter 325 is disconnected, and a predetermined number of small-amplitude multipulses are obtained using only the spectral envelope synthesis filter 320.

For the method of obtaining the multipulses, reference can be made to Reference 1. Instead of the small-amplitude multipulse train, a more efficient representation is also possible: a codebook with a predetermined number of dimensions and number of bits is prepared in advance, and the entry of this codebook that minimizes the mean square value is selected. The codebook may be created in advance by training on learning data, or it may be populated with random number sequences having a certain statistical distribution (for example, a Gaussian distribution) arranged with various phases.
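The codebook alternative can be sketched as follows: build a codebook of Gaussian random vectors, then choose the entry and gain that minimize the squared error against the target. For brevity this sketch compares code vectors with the target directly, whereas in the coder each code vector would first pass through the synthesis and weighting filters. The names `make_gaussian_codebook` and `select_codevector` and the seed handling are assumptions for illustration.

```python
import random


def make_gaussian_codebook(num_entries, dim, seed=0):
    """Codebook of Gaussian random vectors (a fixed seed makes it reproducible)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_entries)]


def select_codevector(target, codebook):
    """Return (index, gain, error) of the entry minimizing sum((t - gain * c)^2).

    For a fixed entry c the best gain is <t, c> / <c, c>, which leaves an
    error of <t, t> - <t, c>^2 / <c, c>.
    """
    energy_t = sum(t * t for t in target)
    best_index, best_gain, best_err = None, 0.0, None
    for idx, c in enumerate(codebook):
        cc = sum(v * v for v in c)
        if cc == 0.0:
            continue
        tc = sum(t * v for t, v in zip(target, c))
        err = energy_t - tc * tc / cc
        if best_err is None or err < best_err:
            best_index, best_gain, best_err = idx, tc / cc, err
    return best_index, best_gain, best_err
```

Only the index and the quantized gain would need to be transmitted, which is why the text calls this a more efficient representation than individual pulse amplitudes and positions.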

The former codebook design method is described in, for example, Makhoul et al., "Vector Quantization in Speech Coding" (Proc. IEEE, pp. 1551-1588, 1985) (Reference 4). The latter method is described in, for example, Schroeder, Atal, et al., "Code-excited linear prediction (CELP): high-quality speech at very low bit rates" (Proc. ICASSP, pp. 26-29, 1985) (Reference 5). For the codebook selection method based on the error minimization criterion and for the method of obtaining the gain, reference can be made to References 4 and 5, among others.

On the other hand, the following configuration is also possible in order to greatly reduce the amount of computation while keeping the characteristics substantially the same.

Let $\varphi_{xh}$ denote the cross-correlation function between the perceptually weighted speech signal $x_w(n)$, obtained for each subframe, and the perceptually weighted impulse response $h_w(n)$ of the synthesis filter. The method of obtaining it is described in detail in Reference 2 and elsewhere, so the explanation is omitted here.

Pitch prediction is performed between subframes using the sound source pulse memory 305 and the autocorrelation function $R_{hh}(m)$ of the impulse response, and the cross-correlation function $\varphi_{xh}$ is corrected according to the following equation.

$$\varphi'_{xh}(m) = \varphi_{xh}(m) - \sum_i g_i^{\,l-1} \cdot R_{hh}\left(\left|m - m_i^{\,l-1}\right|\right) \qquad (3)$$

Here, $g_i^{\,l-1}$ and $m_i^{\,l-1}$ denote the amplitude and position, respectively, of the $i$-th pulse of the multipulse train of the past subframe ($l-1$) stored in the sound source memory. Next, by intra-subframe pitch prediction using $\varphi'_{xh}(m)$ and $R_{hh}(m)$, the amplitudes $g_i$ and positions $m_i$ of the large-amplitude multipulse train of the current subframe are obtained in the specified number. For this method, reference can be made to Reference 2 and others. Next, the signal $\hat{x}(n)$ is first reproduced using the obtained large-amplitude multipulse train $g_i$, $m_i$, and $\hat{x}(n)$ is subtracted from the speech signal of the subframe to obtain the residual $e(n)$ described above. The small-amplitude multipulse train, or the codebook index and gain, is then obtained by the same method as described above.
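The correction of the cross-correlation function by known pulses, and a pulse search carried out entirely in the correlation domain, can be sketched as below. Past pulse positions are assumed to be already expressed on the time axis of the current subframe (so they may be negative), and the search rule (take the position of largest magnitude, set the amplitude to the corrected correlation divided by $R_{hh}(0)$) is an assumed simplification in the spirit of Reference 1, not a transcription of it. Both function names are hypothetical.

```python
def correct_cross_correlation(phi, r_hh, pulses):
    """Return phi'(m) = phi(m) - sum_i g_i * R_hh(|m - m_i|) for pulses [(m_i, g_i), ...]."""
    def r(lag):
        lag = abs(lag)
        return r_hh[lag] if lag < len(r_hh) else 0.0

    return [phi[m] - sum(g * r(m - pos) for pos, g in pulses)
            for m in range(len(phi))]


def correlation_domain_search(phi, r_hh, num_pulses):
    """Pick pulses one by one from the (corrected) cross-correlation.

    Assumes r_hh[0] > 0. After each pulse is chosen, its contribution is
    removed from the cross-correlation with the same correction formula.
    """
    phi = list(phi)
    pulses = []
    for _ in range(num_pulses):
        pos = max(range(len(phi)), key=lambda m: abs(phi[m]))
        amp = phi[pos] / r_hh[0]
        pulses.append((pos, amp))
        phi = correct_cross_correlation(phi, r_hh, [(pos, amp)])
    return pulses
```

Because every update only touches correlation values, no synthesis filtering is needed inside the search loop, which is the source of the reduced computation mentioned in the text.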

Thus, the information transmitted by the transmitting side of the present invention consists of the amplitudes and positions of the large-amplitude multipulse train, the amplitudes and positions of the small-amplitude multipulse train (or, when a codebook is used, the codebook index and gain), the coefficient of the pitch reproduction filter, and the coefficients of the spectral envelope synthesis filter.

Next, referring to FIG. 1, the speech encoding and decoding system consists of a speech encoding device and a speech decoding device, which are connected via a suitable transmission path.

The speech encoding device comprises: a parameter calculation circuit that divides an input speech signal into predetermined frame sections and obtains and encodes, from the speech signal, a spectral parameter representing the spectral characteristics and a pitch parameter representing the pitch; a subframe division circuit that divides the speech signal into subframe sections each containing at least a plurality of pitch periods; a sound source pulse calculation circuit that performs pitch prediction across subframes on the basis of the past sound source signal, further performs pitch prediction within the subframe, and obtains and encodes a sound source signal consisting of a multipulse train; and a multiplexer circuit that combines and outputs the output of the parameter calculation circuit and the output of the sound source pulse calculation circuit.

The speech decoding device comprises: a demultiplexer circuit that inputs, separates, and decodes a code representing the spectral parameter of a speech signal, a code representing the pitch parameter, and a code representing the sound source signal; a pitch reproduction filter that performs pitch prediction between subframes on the basis of the decoded pitch parameter and the past sound source signal, and further obtains a driving sound source signal by pitch prediction within the subframe using the decoded sound source signal; and a synthesis filter circuit that synthesizes a speech signal that represents the original speech signal well, using the driving sound source signal and the decoded spectral parameter.

The speech encoding and decoding processing is performed as follows.

In FIG. 1, which shows an embodiment of the present invention, a discrete speech signal $x(n)$ is input from the input terminal 500. The spectrum and pitch parameter calculation circuit 520 obtains, by the well-known LPC analysis method, the spectral parameters $a_i$ of the spectral envelope synthesis filter 610, which represent the spectral envelope of the speech signal of the divided frame section (for example, 20 ms). It also obtains the coefficient $b$ and the pitch period $T$ of the pitch reproduction filter 605 by the well-known autocorrelation method. For a specific method, reference can be made to, for example, the K-parameter and pitch calculation circuits of Reference 3.
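LPC analysis is only named in the text. One common way to carry it out, shown here as an illustrative sketch rather than as the method of Reference 3, is to compute the frame autocorrelation and solve for the predictor coefficients with the Levinson-Durbin recursion. The sign convention $x(n) \approx \sum_i a_i\,x(n-i)$ and the function names are assumptions.

```python
def frame_autocorrelation(frame, order):
    """Autocorrelation values R(0) .. R(order) of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for a[1..order].

    Returns (coefficients, prediction error), with the predictor written as
    x[n] ~ sum_i a[i] * x[n - i].
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (e.g. silence): keep remaining terms at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection (K) parameter of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

The intermediate values `k` are the K parameters mentioned in connection with Reference 3; a coder may quantize either these or the direct coefficients.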

The obtained parameters, coefficient, and pitch period are quantized by the parameter quantizer 525. The quantization may be scalar quantization as shown in the specification of Japanese Patent Application No. 59-272435 (Reference 6), or vector quantization as shown in Reference 4 and elsewhere. The dequantizer 530 dequantizes the quantized result and outputs it.
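As a simple illustration of the quantize/dequantize pair, the sketch below uses a uniform scalar quantizer over a fixed range. The uniform step, the range limits, and the bit allocation are assumptions chosen for clarity; the scheme of Reference 6 is not reproduced here.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer code of the given bit width."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, bits):
    """Map an integer code back to its reconstruction value."""
    levels = (1 << bits) - 1
    return lo + code * (hi - lo) / levels
```

The encoder uses the dequantized values in its own filters so that it operates on exactly the parameters the decoder will reconstruct.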

The subframe division circuit 515 divides the speech signal of the frame into subframes that are shorter than or equal in length to the frame. The pitch period is used here so that each subframe contains at least a plurality of pitch periods: when the pitch period is short, the subframe length is set to about 10 ms, and when the pitch period is long, it is set equal to the frame length.
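The subframe rule can be sketched as a small decision function. The threshold used here (a short subframe is chosen only if it holds at least `min_periods` pitch periods, with a default of two) is an assumed reading of "at least a plurality", and the sample counts in the usage lines assume 8 kHz sampling, which the text does not state.

```python
def choose_subframe_length(pitch_period, frame_length, short_length, min_periods=2):
    """Use the short subframe when it still holds min_periods pitch periods,
    otherwise fall back to the whole frame. All lengths are in samples."""
    if pitch_period * min_periods <= short_length:
        return short_length
    return frame_length


def split_into_subframes(frame, subframe_length):
    """Cut a frame into consecutive subframes of the chosen length."""
    return [frame[i:i + subframe_length]
            for i in range(0, len(frame), subframe_length)]


# Assumed 8 kHz sampling: 20 ms frame = 160 samples, 10 ms subframe = 80 samples.
short_pitch = choose_subframe_length(32, 160, 80)  # 4 ms pitch period
long_pitch = choose_subframe_length(72, 160, 80)   # 9 ms pitch period
```

A long-pitch (typically male) speaker therefore keeps the full frame as a single subframe, while a short-pitch speaker gets two subframes per frame.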

The weighting circuit 540 weights the speech signal divided into subframes, using the dequantized spectral parameters. For the weighting method, reference can be made to the weighting circuit 200 of Reference 6.

The impulse response calculation circuit 550 calculates the impulse response of the perceptually weighted synthesis filter, using the dequantized spectral parameters, the pitch period, and the coefficient. For a specific method, reference can be made to the weighting circuit of Reference 2.
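The following sketch computes the impulse response of an all-pole synthesis filter with an optional bandwidth-expansion factor. It assumes the widely used weighting form $W(z) = A(z)/A(z/\gamma)$, under which the weighted synthesis filter becomes $1/A(z/\gamma)$ with coefficients $a_i \gamma^i$. That form, the omission of the pitch filter, and the name `all_pole_impulse_response` are assumptions; the weighting of Reference 2 may differ.

```python
def all_pole_impulse_response(a, length, gamma=1.0):
    """Impulse response of 1 / (1 - sum_i a[i] * gamma**i * z**-i).

    a[0] holds the coefficient for delay 1. gamma = 1.0 gives the plain
    synthesis filter; gamma < 1 gives the assumed weighted version.
    """
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0  # unit impulse input
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc += ai * (gamma ** i) * h[n - i]
        h.append(acc)
    return h
```

Truncating the response to a fixed `length` keeps the later correlation computations finite.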

The autocorrelation function calculation circuit 560 calculates the autocorrelation function $R_{hh}(m)$ of the impulse response and outputs it to the cross-correlation function correction circuit 575 and the sound source pulse calculation circuit 580. For the method of calculating the autocorrelation function, reference can be made to Reference 2 and to the autocorrelation function calculation circuit 180 of Reference 6.

The cross-correlation function calculation circuit 570 calculates the cross-correlation function $\varphi_{xh}(m)$ between the weighted signal and the impulse response, and outputs it to the cross-correlation function correction circuit 575.
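The two correlation functions used by circuits 560 and 570 can be sketched directly from their definitions. The truncation at the subframe boundary and the function names are assumptions made for this illustration.

```python
def autocorrelation(h, max_lag):
    """R_hh(m) = sum_n h[n] * h[n + m] for m = 0 .. max_lag."""
    return [sum(h[n] * h[n + m] for n in range(len(h) - m))
            for m in range(max_lag + 1)]


def cross_correlation(x_w, h_w):
    """phi_xh(m) = sum_k x_w[m + k] * h_w[k], truncated at the subframe end."""
    n = len(x_w)
    return [sum(x_w[m + k] * h_w[k] for k in range(min(len(h_w), n - m)))
            for m in range(n)]
```

For a weighted signal that is a single scaled copy of the impulse response, the cross-correlation peaks at the position of that copy, which is what the pulse search exploits.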

The cross-correlation function correction circuit 575 receives from the sound source pulse memory 581 the sound source signal one pitch period in the past, together with the autocorrelation function $R_{hh}(m)$, performs inter-subframe pitch prediction according to equation (3) to correct the cross-correlation function, and outputs $\varphi'_{xh}(m)$.

The sound source pulse calculation circuit 580 uses $\varphi'_{xh}(m)$ and $R_{hh}(m)$ to obtain, by intra-frame pitch prediction, the amplitudes and positions of a specified number of large-amplitude multipulses. For the method of calculating the pulse train, reference can be made to the sound source pulse calculation circuit of Reference 2.

The quantizer 585 quantizes the amplitudes and positions of the multipulse train and outputs codes. For a specific method, reference can be made to Reference 6 and others. This output is dequantized by the dequantizer 590 and passed through the pulse generator 600, the pitch reproduction filter 605, and the spectral envelope synthesis filter 610 to obtain the synthesized speech signal $\hat{x}(n)$ produced by the large-amplitude sound source pulses.

Here, the pitch reproduction filter 605 and the spectral envelope synthesis filter 610 operate in the same way as the pitch reproduction filter 325 and the spectral envelope synthesis filter 320 of FIG. 2.

The subtracter 615 subtracts the synthesized speech signal $\hat{x}(n)$ from the speech signal $x(n)$ to obtain the residual signal $e_x(n)$, for which the small-amplitude sound source signal is then calculated.

The small-amplitude sound source calculation circuit 620 represents the small-amplitude sound source signal either by a specified number of multipulses or by a codebook, as explained for FIG. 2. A multipulse train is used here. The amplitudes and positions of the small-amplitude multipulse train are calculated by the method described in connection with FIG. 2, and they are quantized and encoded to output codes. They are also decoded to generate a small-amplitude sound source pulse train, which is output to the adder 625. The adder 625 adds the output of the pitch reproduction filter 605 and the small-amplitude sound source pulse train output by the small-amplitude sound source calculation circuit 620, and outputs the sum to the sound source pulse memory 581.

The multiplexer 630 combines and outputs the following: the code representing the amplitudes and positions of the multi-pulse train, which is the output of the quantizer 585; the code representing the amplitudes and positions of the small-amplitude multi-pulse train, which is the output of the small-amplitude sound source calculation circuit 620; and the codes representing the spectral parameters, pitch period, and coefficient, which are the output of the parameter quantizer 525.

On the receiving side, the demultiplexer 710 separates and outputs the code representing the amplitudes and positions of the large-amplitude multi-pulse train, the codes representing the spectral parameters, pitch period, and coefficient, and the code representing the small-amplitude sound source signal.
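
The combining and separating of codes can be illustrated with a simple bit-packing round trip. This is a sketch under stated assumptions: the field order, the bit widths, and the code values below are hypothetical, since the specification does not give a bit allocation here.

```python
def pack(fields):
    """Concatenate (value, bit_width) pairs into one integer, first field in the top bits."""
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its field")
        word = (word << width) | value
    return word


def unpack(word, widths):
    """Split an integer back into fields, given the same widths in the same order."""
    values = []
    for width in reversed(widths):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return list(reversed(values))


# Hypothetical field order and widths: large-amplitude pulse code,
# small-amplitude pulse code, spectral parameter code, pitch period T,
# coefficient b.
widths = [12, 8, 10, 7, 4]
codes = [0x5A3, 0x7C, 0x155, 57, 9]
frame = pack(list(zip(codes, widths)))   # multiplexer side
decoded = unpack(frame, widths)          # demultiplexer side
```

The only requirement the sketch captures is that both sides agree on the field order and widths, so that the receiver recovers exactly the codes the transmitter combined.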

The sound source pulse decoder 720 decodes the amplitudes and positions of the large-amplitude multi-pulses.

The spectrum and pitch parameter decoder 750 works in the same way as the inverse quantizer 530 on the transmitting side, and decodes and outputs the spectral parameters, the pitch period T, and the coefficient b.

The small-amplitude sound source decoder 745 decodes and outputs the amplitudes and positions of the small-amplitude multi-pulse train.

The pulse generator 725 generates a sound source signal from the large-amplitude multi-pulse train.

The pitch recovery filter 730 receives the output of the pulse generator 725, the pitch period T, and the coefficient b. Using the output of the sound source pulse memory 735, it performs pitch prediction between subframes, obtains a sound source signal in which the pitch has been restored, and outputs it to the adder 740.
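
One way to picture the pitch recovery filter is as a one-tap recursion v(n) = p(n) + b·v(n − T), where p(n) is the pulse generator output. The sketch below assumes this one-tap form and a memory that stores past excitation samples with the most recent sample last; the function name and the sample values are hypothetical.

```python
def pitch_recovery(pulses, memory, T, b):
    """Return v(n) = pulses(n) + b * v(n - T) for one subframe.

    memory holds past excitation samples, most recent last, and must
    contain at least T samples. For n < T the delayed sample comes from
    memory (prediction between subframes); for n >= T it comes from the
    output already produced in this subframe (prediction within the
    subframe).
    """
    if T <= 0 or len(memory) < T:
        raise ValueError("need 0 < T <= len(memory)")
    out = []
    for n, p in enumerate(pulses):
        delayed = out[n - T] if n >= T else memory[len(memory) + n - T]
        out.append(p + b * delayed)
    return out


memory = [0.0, 0.0, 1.0, 0.0]                       # past excitation
pulses = [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]   # decoded pulse train
v = pitch_recovery(pulses, memory, T=3, b=0.5)
```

In this example the pulse stored in memory reappears at n = 1 scaled by b, and then again at n = 4 inside the subframe, which is why a subframe spanning several pitch periods needs only a few transmitted pulses.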

The adder 740 adds this sound source signal and the output signal of the small-amplitude sound source decoder 745 to obtain a driving sound source signal, which drives the synthesis filter circuit 760.

The sound source pulse memory 735 operates in the same way as the sound source pulse memory 581 on the transmitting side, and stores the output of the adder 740.

The synthesis filter circuit 760 uses the driving sound source signal and the decoded spectral parameters to obtain and output a synthesized speech waveform for each subframe.
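
Subframe-by-subframe synthesis can be sketched with an all-pole filter. This assumes the decoded spectral parameters have already been converted to linear prediction coefficients a_1, ..., a_P, which the text does not state; the function name and the numeric values are hypothetical.

```python
def synthesize(excitation, a, state=None):
    """All-pole synthesis: y(n) = e(n) + sum_{i=1..P} a[i-1] * y(n - i).

    state holds the last P output samples of the previous subframe
    (most recent last) so that filtering is continuous across subframes.
    Returns the synthesized samples and the state for the next subframe.
    """
    order = len(a)
    hist = list(state) if state is not None else [0.0] * order
    out = []
    for e in excitation:
        y = e + sum(a[i] * hist[-1 - i] for i in range(order))
        out.append(y)
        hist.append(y)
    return out, hist[len(hist) - order:]


a = [0.5]                                   # hypothetical first-order coefficients
y1, state = synthesize([1.0, 0.0, 0.0, 0.0], a)
y2, state = synthesize([0.0, 0.0], a, state)
```

Carrying the filter state from one call to the next is what allows the waveform to be produced one subframe at a time without discontinuities at the subframe boundaries.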

The embodiment described above is one example of the computation-reducing method described in connection with FIG. 2, but the configuration of the basic scheme described in connection with FIG. 2 may be used instead. Such a configuration, however, increases the amount of computation.

For the small-amplitude sound source signal, a codebook may be used instead of a small-amplitude multi-pulse train, as described in connection with FIG. 2. In this case, the codebook index and gain are transmitted instead of the amplitudes and positions of the small-amplitude multi-pulse train.
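
A codebook search that yields an index and a gain can be sketched as below. This is a simplified illustration, not the procedure of the specification: it matches codevectors directly against a target signal and omits any synthesis or weighting filter, and the codebook contents and names are hypothetical.

```python
def search_codebook(target, codebook):
    """Return (index, gain) minimizing sum((target - gain * codevector)^2)."""
    best = None
    for index, vec in enumerate(codebook):
        energy = sum(c * c for c in vec)
        if energy == 0.0:
            continue
        corr = sum(t * c for t, c in zip(target, vec))
        score = corr * corr / energy   # error reduction achieved by this entry
        if best is None or score > best[0]:
            best = (score, index, corr / energy)
    if best is None:
        raise ValueError("codebook has no usable entry")
    return best[1], best[2]


codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, -1.0, 1.0, -1.0],
    [0.0, 1.0, 1.0, 0.0],
]
target = [0.5, -0.5, 0.5, -0.5]
index, gain = search_codebook(target, codebook)
```

For each entry the best gain is the correlation divided by the codevector energy, so only the index and that single gain value need to be sent.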

To further reduce the amount of computation, the small-amplitude sound source signal may also be obtained while performing intra-frame pitch prediction, in the same way as the large-amplitude multi-pulses. In this case there is no need to first reproduce the signal and obtain the residual e(n), so the pulse generator 600, the pitch recovery filter 605, the spectral envelope synthesis filter 610, and the small-amplitude sound source calculation circuit 620 on the transmitting side become unnecessary. However, small-amplitude signals are generally considered to have little pitch correlation, so this configuration is expected to reduce computation at the cost of degraded performance.

As the multi-pulse calculation method, various well-known methods can be used in addition to the method shown in Reference 1 above. See, for example, Ozawa et al., "A Study on Pulse Search Algorithms for Multi-pulse Speech Coder Realization" (IEEE JSAC, pp. 133-141, 1986) (Reference 7).

As a method for calculating the pitch period, in addition to the method shown in the embodiment above, it is also possible, for example, to search for the position T that minimizes the error power E of equation (4) below, that is, the error between the input speech signal x(n) of the current subframe and the signal reproduced from the past sound source signal v(n) by the pitch recovery filter and the spectral envelope synthesis filter, and to obtain the coefficient b at that position.

$$E = \sum_{n} \Big[ \big( x(n) - b \cdot v(n-T) * h(n) \big) * w(n) \Big]^{2} \qquad (4)$$

Here, h(n) is the impulse response of the spectral synthesis filter, w(n) is the impulse response of the perceptual weighting circuit, and * denotes convolution.
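
The search over T described by equation (4) can be sketched as follows. Several points are assumptions rather than statements of the specification: convolutions are truncated to the subframe length, v(n − T) for n ≥ T is formed by repeating the last T past samples, and b is taken as the least-squares optimum for each candidate T. All function names and the search range are hypothetical.

```python
def convolve(x, h):
    """Truncated convolution: y(n) = sum_k h(k) * x(n - k), same length as x."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]


def delayed_source(memory, T, length):
    """v(n - T) for n = 0..length-1, repeating the last T past samples when n >= T."""
    ext = list(memory)
    base = len(memory)
    for n in range(length):
        ext.append(ext[base + n - T])
    return ext[base:]


def search_pitch(x, memory, h, w, t_min, t_max):
    """Return (T, b, E) minimizing E of equation (4) over t_min <= T <= t_max."""
    if t_min < 1 or t_max > len(memory):
        raise ValueError("need 1 <= t_min and t_max <= len(memory)")
    xw = convolve(x, w)                       # x(n) * w(n)
    best = None
    for T in range(t_min, t_max + 1):
        # v(n - T) * h(n) * w(n)
        yw = convolve(convolve(delayed_source(memory, T, len(x)), h), w)
        energy = sum(y * y for y in yw)
        if energy == 0.0:
            continue
        corr = sum(p * q for p, q in zip(xw, yw))
        b = corr / energy                     # optimal coefficient for this T
        err = sum((p - b * q) ** 2 for p, q in zip(xw, yw))
        if best is None or err < best[2]:
            best = (T, b, err)
    return best


memory = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]   # past sound source, period 3
x = [0.5, 0.0, 0.0, 0.5, 0.0, 0.0]        # current subframe input
T, b, E = search_pitch(x, memory, h=[1.0], w=[1.0], t_min=2, t_max=4)
```

Because convolution is linear, the weighted error can be written as the weighted input minus b times the weighted, filtered delayed source, which is why the two filtered signals are computed once per candidate T and b then follows in closed form.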

A linear shift τ may also be allowed in the pitch period T of the subframe. For the specific method, see Ono et al., "2.4 kbps pitch prediction multi-pulse speech coding" (Proc. IEEE ICASSP '88, S4.9, 1988) (Reference 8). In this case, however, τ must be transmitted as pitch information in addition to T.

〔Effect of the invention〕

According to the present invention, a sound source signal consisting of multi-pulses is obtained for each subframe on the basis of the sound source signals of past subframes, using the pitch correlation both between subframes and within a subframe, so the sound source signal is represented efficiently and well. In addition, this calculation is performed in the correlation domain, and only one set of pitch recovery filter and spectral envelope synthesis filter is used. As a result, the quality of the reproduced speech can be greatly improved with a relatively small amount of computation compared with the prior art. This effect is more pronounced at low bit rates.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the configuration of an embodiment of the speech encoding/decoding method, speech encoding device, and speech decoding device according to the present invention. FIG. 2 is a block diagram for explaining the method of inter-frame and intra-frame pitch prediction in the present invention. FIG. 3 is a block diagram showing a conventional example of a pulse train search method.

280, 290: switches
305, 581, 735: sound source pulse memory
310, 580: sound source pulse calculation (section) circuit
315, 620: small-amplitude sound source calculation (section) circuit
320, 610: spectral envelope synthesis filter
325, 605, 730: pitch recovery filter
330: subtractor
350: perceptual weighting filter
355, 515: subframe division (section) circuit
520: spectrum and pitch parameter calculation circuit
525: parameter quantizer
530, 590: inverse quantizer
540: weighting circuit
550: impulse response calculation circuit
560: autocorrelation function calculation circuit
585: quantizer
600, 725: pulse generator
625, 740: adder
630: multiplexer
710: demultiplexer
720: sound source pulse decoder
745: small-amplitude sound source decoder
750: spectrum and pitch parameter decoder
760: synthesis filter circuit
Also listed: cross-correlation function calculation circuit, cross-correlation function correction circuit.

Claims

[Claims]

(1) A speech encoding/decoding method characterized in that: on the transmitting side, a discrete speech signal is input and divided into predetermined frame sections, a spectral parameter representing spectral characteristics and a pitch parameter representing pitch are obtained from the speech signal, the speech signal is divided into predetermined subframe sections each containing at least a plurality of pitch periods, and a sound source signal consisting of multi-pulses is calculated in each subframe section by performing pitch prediction between subframes and within the subframe based on past sound source signals; and on the receiving side, a driving sound source signal is obtained by performing pitch prediction between subframes and within the subframe using the pitch parameter and the sound source signal, based on the past sound source signals, and a synthesized speech signal that represents the speech signal well is output using the spectral parameter.

(2) A speech encoding device comprising: a parameter calculation circuit that divides an input speech signal into predetermined frame sections, and obtains and encodes a spectral parameter representing spectral characteristics and a pitch parameter representing pitch from the speech signal; a subframe division circuit that divides the speech signal into subframe sections each containing at least a plurality of pitch periods; a sound source pulse calculation circuit that performs pitch prediction across subframes based on past sound source signals, further performs pitch prediction within the subframe, and calculates and encodes the sound source signal consisting of a multi-pulse train; and a multiplexer circuit that combines and outputs the output of the parameter calculation circuit and the output of the sound source pulse calculation circuit.

(3) A speech decoding device comprising: a demultiplexer circuit that inputs, separates, and decodes a code representing a spectral parameter of a speech signal, a code representing a pitch parameter, and a code representing a sound source signal; a pitch recovery filter that performs pitch prediction between subframes based on the decoded pitch parameter and past sound source signals, and further obtains a driving sound source signal by performing pitch prediction within the subframe using the decoded sound source signal; and a synthesis filter circuit that synthesizes a speech signal that represents the speech signal well, using the driving sound source signal and the decoded spectral parameter.