JPS62207036A

JPS62207036A - Voice coding system and its apparatus

Info

Publication number: JPS62207036A
Application number: JP61049633A
Authority: JP
Inventors: Kazunori Ozawa; 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1986-03-07
Filing date: 1986-03-07
Publication date: 1987-09-11

Abstract

PURPOSE: To synthesize high-quality speech with a comparatively small amount of computation even at a low transmission bit rate, by using a spectrum parameter and a pulse train obtained for a representative section to represent the speech signal of the entire frame well. CONSTITUTION: A drive signal restoration circuit 240 uses the decoded values of the input sound source pulse train, the position of the representative section, and the interpolated pitch period to recover the pulse trains of the subframe sections other than the representative section by interpolation from the pulse trains of the preceding and succeeding frames, generates a sound source signal for one frame, and outputs it to a synthesis filter circuit 250 as the drive sound source signal. The circuit 250 receives the drive sound source signal and the K parameters interpolated with the representative section as reference, converts the K parameters into prediction coefficients while switching them at each subframe, and calculates a response signal. A multiplexer circuit 260 receives a code k1 representing the K parameters of the representative section, a code ld from a pitch coding circuit 150, the code from a coder 230, a code representing the subframe phase, and a code representing the position of the representative section, combines them, and outputs the result from a terminal 270.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a speech coding method and an apparatus therefor, and in particular to a coding method and apparatus for encoding a speech signal with high quality at a low bit rate.

[Conventional technology]

The vocoder (VOCODER) is known as a method of encoding a speech signal at a low transmission bit rate (for example, about 4.8 kbps). The principle of this method is described in detail in, for example, the paper by M. R. Schroeder entitled "Vocoders: Analysis and Synthesis of Speech" (Proc. IEEE, pp. 720-734, May 1966) (Reference 1).

The LPC vocoder (LPC VOCODER), a vocoder that uses linear predictive analysis, is also known. It is described in detail in, for example, the paper by J. D. Markel et al. entitled "A Linear Prediction Vocoder Based upon the Autocorrelation Method" (IEEE Trans. A.S.S.P., pp. 124-134, April 1974) (Reference 2). The present invention improves the sound source section of the vocoder and is closely related to the LPC vocoder, so the LPC vocoder is outlined below, focusing on the configuration of its synthesis section.

FIG. 3 is a block diagram of the synthesis section (receiving section) of the LPC vocoder described in Reference 2, shown as an example of the synthesis side of the conventional method. The synthesis section consists of a sound source generation section 500 and a synthesis filter circuit 510. The sound source generation section 500 consists of an impulse generator 501, a noise generator 502, a voiced/unvoiced switching circuit 503, and a gain circuit 504. In the vocoder, the speech signal is classified as either voiced or unvoiced at short intervals (for example, every 20 msec). When voiced, the impulse generator 501 generates a pulse train whose pulses are spaced at the pitch period Pd. When unvoiced, the noise generator 502 generates white noise. Voiced/unvoiced control is performed by the switching circuit 503. The gain circuit 504 applies a gain G to the signal generated in this way, and the result is output to the synthesis filter circuit 510 as the sound source signal d(n).
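The behavior of the sound source generation section 500 can be pictured with a short sketch. The function name, the use of Gaussian samples for the white noise, and the placement of the first impulse at sample 0 are assumptions made for illustration only; they are not taken from Reference 2.

```python
import random


def vocoder_excitation(n_samples, voiced, pitch_period, gain, seed=0):
    """Return one frame of d(n): an impulse train (voiced) or white noise
    (unvoiced), scaled by the gain G. Illustrative sketch only."""
    if voiced:
        if pitch_period <= 0:
            raise ValueError("pitch_period must be positive")
        # Impulse generator 501: one unit impulse every pitch_period samples.
        d = [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    else:
        # Noise generator 502: white noise (Gaussian samples assumed here).
        rng = random.Random(seed)
        d = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Gain circuit 504.
    return [gain * v for v in d]
```

With a 160-sample frame and a pitch period of 40 samples, the voiced branch produces four impulses per frame, which is the "one impulse per pitch period" excitation that the following paragraphs criticize.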

The synthesis filter circuit 510 synthesizes and outputs the speech x(n) using the sound source signal d(n) and the filter parameters Ki. The pitch period Pd, the voiced/unvoiced switching signal (V/UV), the gain G, and the filter parameters Ki are calculated on the analysis side (transmitting side) at predetermined time intervals and transmitted to the receiving side.
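As a rough illustration of how filter parameters Ki can drive a synthesis filter directly, the sketch below implements an all-pole lattice filter. The filter structure, the function name, and the sign convention (a first-order filter gives x(n) = d(n) + K1*x(n-1)) are assumptions; the text does not specify how circuit 510 is built.

```python
def lattice_synthesis(d, k):
    """All-pole lattice synthesis of x(n) from excitation d(n) and
    reflection (K) parameters k[0..p-1]. Sign convention is an assumption."""
    p = len(k)
    b = [0.0] * p  # b[i] holds the delayed backward signal b_i(n-1)
    out = []
    for sample in d:
        f = sample  # forward signal at the top stage
        for i in range(p, 0, -1):
            # Step down one stage using the delayed backward signal.
            f = f + k[i - 1] * b[i - 1]
            if i < p:
                # Update the backward signal of stage i for the next sample.
                b[i] = b[i - 1] - k[i - 1] * f
        if p > 0:
            b[0] = f
        out.append(f)
    return out
```

Feeding a unit impulse through the filter returns its impulse response, which is a convenient way to check a set of K parameters by hand.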

In the LPC vocoder described above, the transmitted information consists of the pitch period, the voiced/unvoiced switching signal, the gain, and the filter parameters. Because the speech signal can be synthesized from this information alone, the transmission bit rate can be made low (for example, about 4.8 kbps). With this conventional method, however, it was difficult to synthesize speech of good quality.

This is because, for voiced speech, the sound source is represented by a single impulse per pitch period and carries no phase information, so naturalness is considerably impaired and the synthesized speech sounds mechanical.

In addition, because speech is divided into two extreme classes, voiced and unvoiced, and the sound source is switched between an impulse source and a noise source, a voiced/unvoiced decision error causes severe quality degradation. At transitions between unvoiced and voiced speech, the sound source cannot be represented well, and degradation occurs. Furthermore, an error in the estimated pitch period also causes severe quality degradation.

As a method of improving the sound source, a method is known that represents the sound source of an entire frame using the pulse train of a single pitch section, as described in, for example, the specification of Japanese Patent Application No. 59-272435 (Reference 3). In this method, the transmitting side divides one frame into pitch sections, one per pitch period, using the pitch period obtained for the frame; obtains a pulse train for each pitch section using spectral parameters (synthesis filter coefficients) that are obtained for the frame and represent the average characteristics of the entire frame; and selects the one pitch section (the representative section) that can reproduce a good signal for the entire frame.

[Problem that the invention seeks to solve]

In the method of Reference 3 described above, the spectral parameters are chosen to represent the average characteristics of the entire frame, regardless of the position of the representative section. A frame length of about 20 msec is often used. In frames where the spectral characteristics of the speech change greatly within a short time, such as transients and vowel transitions, the spectral parameters also change within a short time, so spectral parameters obtained from the average characteristics of the speech signal over the entire frame are not necessarily optimal for the representative section.

An object of the present invention is to provide a speech coding method and apparatus that can synthesize high-quality speech with a relatively small amount of computation even at a low transmission bit rate.

[Means for solving problems]

In the speech coding method of the first invention, the transmitting side receives a discrete speech signal and divides it into predetermined first time sections; extracts a pitch parameter representing the pitch from the speech signal and divides each first time section into second time sections shorter than the first; selects one of the second time sections; obtains a spectral parameter representing the short-time spectral envelope of the speech signal in the selected section and information representing the sound source of the speech signal in the selected section; and outputs the spectral parameter and the sound source information combined with the pitch parameter. The receiving side restores the second time sections from the pitch parameter and synthesizes the speech signal using the information representing the sound source of the selected section and the spectral parameter of the selected section.

The speech coding apparatus of the second invention comprises: a dividing circuit that divides an input speech signal into predetermined first time sections, extracts and encodes a pitch parameter representing the pitch from the speech signal, and divides each first time section into second time sections shorter than the first on the basis of the pitch parameter; a parameter calculation circuit that selects one of the second time sections and obtains and encodes a spectral parameter representing the short-time spectral envelope of the speech signal in the selected section and a pulse train representing the sound source of the speech signal in the selected section; and a multiplexer circuit that combines and outputs the code representing the sound source of the selected section, the code representing the spectral parameter of the selected section, and the code representing the pitch parameter.

The speech decoding apparatus of the third invention comprises: a demultiplexer circuit that receives, for each predetermined first time section, a code sequence combining a code representing a pitch parameter, a code representing the spectral parameter of a selected section, and a code representing the sound source of the selected section, and that separates and decodes these codes; a dividing circuit that restores second time sections shorter than the first time section on the basis of the decoded pitch parameter; a drive sound source signal restoration circuit that restores a drive sound source signal from the decoded sound source of the selected section by applying processing that gives a temporally smooth variation; and a synthesis filter circuit that synthesizes and outputs a speech signal using the decoded spectral parameter of the selected section and the restored drive sound source signal.

[Effect]

The present invention is characterized in that the speech signal of an entire frame is represented well using the spectral parameter and the pulse train obtained for a single pitch section (the representative section).

One way to realize this method is to obtain spectral parameters for each pitch section, use them to calculate a pulse train for each pitch section, search for the section that can reproduce a good speech signal over the entire frame, and transmit the pulse train and the spectral parameters of that representative section. Another way is to select the representative section by some appropriate method first and then obtain the spectral parameters and the pulse train for that section. FIG. 2 shows an example of processing by the former approach: (a) shows the speech waveform of one frame, and (b) shows the frame divided into pitch sections. Spectral parameters are obtained for each pitch section. (c) shows the pulse trains obtained for each pitch section using these spectral parameters, and (d) shows the selected representative section and its pulse train.

The representative section can be selected by the same method as that described in Reference 3.

The amplitudes and positions of the pulses in a pitch section can be obtained by the method described in Reference 3. Alternatively, a method using analysis-by-synthesis (A-b-S) is known; it is described in, for example, the paper by B. S. Atal et al. entitled "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates" (Proc. I.C.A.S.S.P., pp. 614-617, 1982) (Reference 4).

[Example]

Next, the present invention will be described in detail with reference to the drawings.

FIGS. 1(a) and 1(b) are block diagrams of the transmitting side and the receiving side, respectively, in an embodiment of the speech coding method of the present invention.

In FIG. 1(a), a speech signal x(n) is input from the transmitting-side input terminal 100, and a predetermined number of samples is stored in the buffer memory circuit 110 for each frame (first time section).

Next, the K parameter calculation circuit 140 receives a predetermined number of samples of the speech signal from the buffer memory circuit 110 and calculates K parameters representing the average spectral envelope of the speech signal within the frame. The K parameters are identical to PARCOR coefficients. The autocorrelation method is a well-known way of calculating them. It is described in detail in, for example, the paper by John Makhoul et al. entitled "Quantization Properties of Transmission Parameters in Linear Predictive Systems" (IEEE Trans. A.S.S.P., pp. 309-321, 1983) (Reference 5), so its description is omitted here. Returning to FIG. 1(a), the K parameters Ki are output to the K parameter encoding circuit 160.
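The autocorrelation method can be sketched as the Levinson-Durbin recursion below. This is a minimal illustration under an assumed sign convention (the predictor is x(n) ≈ Σ ai·x(n-i), and Ki is the last coefficient of the order-i predictor). The function names are hypothetical, and windowing and the analysis order used in the embodiment are not given in the text.

```python
def autocorrelation(x, max_lag):
    """r(m) = sum over n of x(n) * x(n - m), for m = 0..max_lag."""
    return [sum(x[n] * x[n - m] for n in range(m, len(x)))
            for m in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (K parameters, predictor coefficients a_1..a_p, residual energy)
    from autocorrelation values r[0..order]."""
    a = [0.0] * (order + 1)  # a[1..i] hold the current order-i predictor
    k = []
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Degenerate input (e.g. silence): stop refining the predictor.
            k.append(0.0)
            continue
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        ki = acc / err
        new_a = a[:]
        new_a[i] = ki
        for j in range(1, i):
            new_a[j] = a[j] - ki * a[i - j]
        a = new_a
        err *= 1.0 - ki * ki
        k.append(ki)
    return k, a[1:], err
```

For the autocorrelation sequence of a first-order process, r = [1, 0.5, 0.25], the recursion yields K1 = 0.5 and K2 = 0, as expected.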

The K parameter encoding circuit 160 encodes the K parameters Ki with a predetermined number of quantization bits, decodes them, converts the resulting decoded K parameter values Ki' into prediction coefficients ai', and outputs these to the subframe division circuit 165.
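The conversion from decoded K parameters to prediction coefficients can be sketched as the standard step-up recursion. The sign convention (A(z) = 1 - Σ ai·z^-i, with Ki equal to the last coefficient of the order-i predictor) and the function name are assumptions, since the text does not state a convention.

```python
def parcor_to_predictor(k):
    """Convert K (PARCOR) parameters k[0..p-1] into prediction
    coefficients a_1..a_p by the step-up recursion."""
    a = []
    for i, ki in enumerate(k, start=1):
        # a holds the order-(i-1) predictor; a[j] stores a_(j+1).
        a = [a[j] - ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
    return a
```

Because the encoder converts the decoded values Ki' rather than the unquantized Ki, the transmitter and the receiver derive identical coefficients.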

The pitch analysis circuit 130 calculates the pitch period Pd within the frame using the output of the buffer memory circuit 110. The pitch period Pd can be calculated by, for example, the method described in the paper by R. V. Cox et al. entitled "Real-Time Implementation of Time Domain Harmonic Scaling of Speech Signals" (IEEE Trans. A.S.S.P., pp. 258-272, 1983) (Reference 6).

The pitch encoding circuit 150 quantizes and encodes the pitch period Pd with a predetermined number of quantization bits and outputs the code ld to the multiplexer 260. It also outputs the decoded pitch period Pd' to the pitch interpolation circuit 225.

The pitch interpolation circuit 225 interpolates the pitch period using the pitch period Pd' and the pitch period of the adjacent frame, and outputs the interpolated pitch period to the subframe division circuit 165.

The subframe division circuit 165 divides the frame into subframes (second time sections), one per pitch period, using the prediction coefficients ai' and the interpolated pitch period. A known way to divide the frame is to obtain a sound source pulse train representing the sound source signal and divide according to it; this method is described as the drive signal calculation circuit 220 in the embodiment of Reference 3, so its description is omitted here. As an example, FIGS. 2(a) and 2(b) show the speech waveform of one frame and the frame divided into subframes, respectively. The subframe phase Tp is represented by a code with an appropriate number of bits and output to the multiplexer 260. The subframe division positions are output to the K parameter calculation circuit 140, the weighting circuit 200, the K parameter interpolation circuit 255, the synthesis filter circuit 250, the drive signal restoration circuit 240, and the drive signal calculation circuit 220.

Next, the K parameter calculation circuit 140 calculates K parameters as spectral parameters for each subframe section according to the input subframe division positions, and outputs them to the K parameter encoding circuit 160. The K parameter encoding circuit 160 encodes and decodes the K parameters obtained for each subframe, converts them into prediction coefficients ai', and outputs these to the impulse response calculation circuit 170, the weighting circuit 200, and the K parameter interpolation circuit 255. The K parameter interpolation circuit 255, taking each subframe section in turn as the reference, performs linear interpolation with the K parameters of the adjacent frames to obtain interpolated K parameters for every subframe.
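The linear interpolation of K parameters can be sketched as follows. The parameterization by time positions (the reference positions of two K parameter sets and the position of the subframe being filled in) is a hypothetical choice for illustration; the text only states that the interpolation is linear.

```python
def interpolate_k(k_prev, k_cur, t_prev, t_cur, t):
    """Linearly interpolate two K parameter sets located at times t_prev and
    t_cur (in samples) to obtain the set at time t. Illustrative only."""
    if t_cur == t_prev:
        return list(k_cur)
    w = (t - t_prev) / (t_cur - t_prev)
    w = min(max(w, 0.0), 1.0)  # clamp so that no extrapolation occurs
    return [(1.0 - w) * p + w * c for p, c in zip(k_prev, k_cur)]
```

One reason interpolating K parameters is convenient is that a weighted average of values whose magnitudes are below 1 also has magnitude below 1, so the interpolated lattice filter remains stable.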

The impulse response calculation circuit 170 receives one set of prediction coefficients ai' per subframe and calculates, for each subframe, the impulse response hw(n) representing the transfer function of the weighted synthesis filter.
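A possible form of this calculation is sketched below. The exact weighting used by the embodiment is defined in a separate specification that is not reproduced here, so the sketch assumes a choice that is common in multipulse coders: the weighted synthesis filter is taken to be 1/A(z/γ), whose coefficients are ai·γ^i. The value of γ and the function name are hypothetical.

```python
def weighted_impulse_response(a, gamma, length):
    """Impulse response hw(n), n = 0..length-1, of the assumed weighted
    synthesis filter 1/A(z/gamma), with A(z) = 1 - sum a_i z^-i."""
    aw = [ai * gamma ** (i + 1) for i, ai in enumerate(a)]
    h = []
    for n in range(length):
        v = 1.0 if n == 0 else 0.0  # unit impulse input
        for i, coeff in enumerate(aw):
            if n - 1 - i >= 0:
                v += coeff * h[n - 1 - i]
        h.append(v)
    return h
```

With γ below 1 the response decays faster than that of the unweighted filter, so it can be truncated to a short length for the correlation calculations.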

The impulse response hw(n) can be calculated by, for example, the same method as the impulse response calculation circuit 210 shown in FIG. 4(a) of the specification of Japanese Patent Application No. 59-042305 (Reference 7).

The impulse response hw(n) is output to the autocorrelation function calculation circuit 180 and the cross-correlation function calculation circuit 210.

The autocorrelation function calculation circuit 180 receives the impulse response hw(n) obtained for each subframe, calculates the autocorrelation function Rhh(m) for each subframe, and outputs it to the drive signal calculation circuit 220. Rhh(m) can be calculated by, for example, the same method as the autocorrelation function calculation circuit 180 described in Reference 7.

Next, the subtractor 120 subtracts one frame of the output of the synthesis filter circuit 250 from the speech signal x(n) held in the buffer memory circuit 110 and outputs the result e(n) to the weighting circuit 200. The weighting circuit 200 receives e(n), divides it into subframes using the subframe division positions, weights e(n) in each subframe using that subframe's prediction coefficients ai', and outputs the weighted result ew(n). ew(n) can be calculated by, for example, the same method as the weighting circuit 410 shown in FIG. 4(a) of Reference 7.

The cross-correlation function calculation circuit 210 receives the weighted result ew(n) and the impulse response hw(n) for each subframe, calculates the cross-correlation function φha, and outputs it to the drive signal calculation circuit 220. φha can be calculated by, for example, the same method as the cross-correlation function calculation circuit 210 described in Reference 7.
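The two correlation functions can be sketched as follows. The summation limits (correlation restricted to the samples of one subframe) and the function names are assumptions; the circuits in Reference 7 may define the limits differently.

```python
def impulse_autocorrelation(h, max_lag):
    """Rhh(m) = sum over n of hw(n) * hw(n + m), for m = 0..max_lag."""
    return [sum(h[n] * h[n + m] for n in range(len(h) - m))
            for m in range(max_lag + 1)]


def cross_correlation(e, h):
    """phi(m) = sum over n of ew(n) * hw(n - m), for m = 0..len(e)-1.
    phi(m) is large where a pulse at position m would explain ew(n) well."""
    n_samples = len(e)
    phi = []
    for m in range(n_samples):
        upper = min(n_samples, m + len(h))
        phi.append(sum(e[n] * h[n - m] for n in range(m, upper)))
    return phi
```

As a small check, a signal made of one pulse of amplitude 2 at position 1 passed through hw = [1, 0.5] gives a cross-correlation that peaks at position 1.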

Next, the drive signal calculation circuit 220 obtains the sound source pulse train and the spectral parameters (here, K parameters) of the one pitch section that represents the speech signal of the frame well. The sound source signal is obtained as follows. First, the frame is divided into subframes as shown in FIG. 2(b) using the subframe division positions. Then, for each subframe section, a predetermined number of pulses is calculated using the input cross-correlation function. To select the representative subframe section, attention is first focused on one subframe in FIG. 2(c) (for example, the second section, ②), and the sound source pulse trains of the other subframes are reproduced by linear interpolation between the pulse train of this subframe and the sound source pulse train of the adjacent frame.
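The pulse calculation for one subframe can be sketched with a correlation-based search that is widely used in multipulse coders: pick the position where the cross-correlation has the largest magnitude, set the amplitude to φ(m)/Rhh(0), remove that pulse's contribution, and repeat. Whether the embodiment uses exactly this procedure is an assumption, and the function name is hypothetical.

```python
def search_pulses(phi, rhh, n_pulses):
    """Greedy multipulse search. phi: cross-correlation per position,
    rhh: autocorrelation of hw for lags 0..len(rhh)-1.
    Returns a list of (position, amplitude) pairs."""
    if not rhh or rhh[0] <= 0.0:
        raise ValueError("rhh[0] must be positive")
    phi = list(phi)  # work on a copy
    pulses = []
    for _ in range(n_pulses):
        m = max(range(len(phi)), key=lambda i: abs(phi[i]))
        g = phi[m] / rhh[0]
        pulses.append((m, g))
        # Subtract the contribution of the chosen pulse from phi.
        for n in range(len(phi)):
            lag = abs(n - m)
            if lag < len(rhh):
                phi[n] -= g * rhh[lag]
    return pulses
```

Using the correlation functions instead of re-synthesizing the signal for every candidate position is what keeps the amount of computation small.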

The K parameters interpolated with the second subframe section ② as reference are then input from the K parameter interpolation circuit 255, the speech signal of the entire frame is reproduced using the sound source pulse trains and the K parameters, and the error power relative to the input speech signal is obtained. This processing is carried out for several subframes, and a subframe section that makes the error power small is selected as the representative section. FIG. 2(d) shows the representative section obtained by this procedure and its sound source pulse train. The amplitudes and positions of the sound source pulse train of the representative section are output to the encoder 230. The subframe number of the representative section is encoded with a predetermined number of bits and output to the multiplexer 260 and the K parameter encoding circuit 160.
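The selection loop can be sketched as below. The sketch assumes that "small" means the minimum over the candidates tried, and it takes the frame reconstruction as a caller-supplied function `synthesize`, a hypothetical stand-in for the interpolation and synthesis filtering described above.

```python
def error_power(target, candidate):
    """Sum of squared differences between two signals."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def choose_representative(target, candidates, synthesize):
    """Try each candidate subframe number, rebuild the whole frame with
    synthesize(candidate), and keep the one with the lowest error power."""
    best, best_power = None, float("inf")
    for c in candidates:
        power = error_power(target, synthesize(c))
        if power < best_power:
            best, best_power = c, power
    return best, best_power
```

Restricting `candidates` to a few subframes rather than all of them trades a little quality for less computation, in line with the statement that the processing is carried out for several subframes.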

The K parameter encoding circuit 160 receives the subframe number of the representative section and outputs the code representing the K parameters of that section to the multiplexer 260.

The encoder 230 encodes the amplitudes and positions of the pulse train and outputs the codes to the multiplexer 260. It also outputs the decoded amplitudes and positions gi', mi' of the pulse train to the drive signal restoration circuit 240. The pulses can be encoded by, for example, the same method as the encoding circuit 470 described in Reference 7.

The drive signal restoration circuit 240 uses the decoded values of the input sound source pulse train, the position of the representative section, and the interpolated pitch period to reproduce, by interpolation from the pulse trains of the preceding and following frames, the pulse trains of the subframe sections other than the representative section. It thereby generates one frame of sound source signal, which it outputs to the synthesis filter circuit 250 as the drive sound source signal.
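One way to picture the interpolation is sketched below. It assumes that both representative pulse trains contain the same number of pulses, that pulses are paired in order, and that both position and amplitude are interpolated linearly; none of these details is stated in the text, and the function names are hypothetical.

```python
def interpolate_pulses(prev_pulses, cur_pulses, w):
    """Blend two pulse trains of (position, amplitude) pairs.
    w = 0 returns the previous train, w = 1 the current one."""
    out = []
    for (m0, g0), (m1, g1) in zip(prev_pulses, cur_pulses):
        position = int(round((1.0 - w) * m0 + w * m1))
        amplitude = (1.0 - w) * g0 + w * g1
        out.append((position, amplitude))
    return out


def place_pulses(frame_len, subframe_start, pulses):
    """Write pulses (positions relative to the subframe start) into a
    zero-filled frame of drive sound source samples."""
    d = [0.0] * frame_len
    for position, amplitude in pulses:
        n = subframe_start + position
        if 0 <= n < frame_len:
            d[n] += amplitude
    return d
```

Subframes closer to the representative section would use a weight w nearer to 1, so the excitation changes smoothly from one frame's representative pulse train to the next.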

The synthesis filter circuit 250 receives the drive sound source signal and the K parameters interpolated with the representative section as reference, converts the K parameters into prediction coefficients while switching them at each subframe, and calculates one frame of the response signal X(n). The response signal can be calculated by, for example, the same method as the synthesis filter circuit 400 described in Reference 7.

The multiplexer circuit 260 receives the code lki representing the K parameters of the representative section, the code ld from the pitch encoding circuit 150, the code from the encoder 230, the code representing the subframe phase, and the code representing the position of the representative section, combines them, and outputs the result from the transmitting-side output terminal 270. This concludes the description of the transmitting side of the speech coding method of this embodiment.
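The combining step can be pictured as simple bit packing. The field widths in the usage example are purely hypothetical, since the text gives no bit allocation, and the function names are illustrative.

```python
def pack_fields(fields):
    """Pack (value, width_in_bits) pairs into one integer, first field in
    the most significant bits."""
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in the given width")
        word = (word << width) | value
    return word


def unpack_fields(word, widths):
    """Inverse of pack_fields for the same sequence of widths."""
    values = []
    for width in reversed(widths):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values[::-1]
```

For example, `pack_fields([(5, 4), (100, 7), (3, 2)])` could stand for a 4-bit K parameter code, a 7-bit pitch code, and a 2-bit representative section position; the demultiplexer on the receiving side would call `unpack_fields` with the same widths.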

Next, in FIG. 1(b), the demultiplexer 290 separates, from the codes input at the receiving-side input terminal 280, the code representing the K parameters of the representative section, the code representing the pitch period, and the code representing the sound source pulse train, and outputs them to the K parameter decoding circuit 330, the pitch decoding circuit 320, and the decoding circuit 300, respectively. It also outputs the code representing the subframe phase to the subframe division circuit 355, and the code representing the representative section to the drive signal restoration circuit 340 and the K parameter interpolation circuit 355.

The K parameter decoding circuit 330 decodes the K parameters and outputs the decoded values Ki' to the K parameter interpolation circuit 355.

The pitch decoding circuit 320 decodes the pitch period Pd' and outputs it to the pitch interpolation circuit 345. The pitch interpolation circuit 345 obtains the interpolated pitch period and outputs it to the subframe division circuit 355.

The subframe division circuit 355 receives the interpolated pitch period and the subframe phase, determines the subframe division positions, and outputs them to the drive signal restoration circuit 340, the K parameter interpolation circuit 335, and the synthesis filter circuit 350.

The decoding circuit 300 decodes the sound source pulse train and outputs it to the drive signal restoration circuit 340. The drive signal restoration circuit 340 operates in the same way as the drive signal restoration circuit 240 on the transmitting side (shown in FIG. 1(a)): it generates a sound source pulse train for the entire frame and outputs it to the synthesis filter circuit 350 as the drive sound source signal.

The K parameter interpolation circuit 335 receives the decoded K parameters, the position of the representative section, and the subframe division positions. Taking the representative section as the reference, it linearly interpolates the K parameters subframe by subframe between adjacent frames and outputs the interpolated K parameters to the synthesis filter circuit 350.
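As a rough illustration of the subframe-by-subframe linear interpolation described above, the following Python sketch interpolates K parameter vectors between the representative sections of two adjacent frames. The function name, the choice of sample positions as interpolation points, the clamping, and the sample values are all assumptions made for illustration; the text does not specify them.

```python
def interpolate_k(k_prev, k_next, pos_prev, pos_next, subframe_positions):
    """Linearly interpolate K parameter vectors at the given sample positions.

    k_prev, k_next: K parameter lists for the representative sections of two
        adjacent frames (assumed to have the same length).
    pos_prev, pos_next: sample positions of those representative sections.
    subframe_positions: positions (e.g. subframe centers) needing a value.
    """
    if pos_next <= pos_prev:
        raise ValueError("pos_next must be greater than pos_prev")
    result = []
    for pos in subframe_positions:
        # Weight is 0 at the earlier representative section, 1 at the later one.
        w = (pos - pos_prev) / (pos_next - pos_prev)
        w = min(max(w, 0.0), 1.0)  # clamp instead of extrapolating (assumption)
        result.append([(1.0 - w) * a + w * b for a, b in zip(k_prev, k_next)])
    return result


# Hypothetical values: two 2nd-order K parameter sets, 160 samples apart.
k_sub = interpolate_k([0.8, -0.4], [0.4, 0.0], 40, 200, [40, 120, 200])
```

Because each interpolated value is a weighted average of two K parameters, it stays inside the range (-1, 1) whenever both endpoints do, which is one reason interpolating K parameters is convenient.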

The synthesis filter circuit 350 receives the subframe division positions, the drive sound source signal, and the interpolated K parameters. It operates in the same way as the synthesis filter circuit 250 on the transmitting side (shown in FIG. 1(a)), computes the synthesized speech signal for one frame, and outputs it from the receiving-side output terminal 360.
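The following Python sketch shows one possible way such a synthesis filter could work: K parameters are converted to prediction coefficients by the standard step-up recursion, and an all-pole filter is run over the drive sound source signal with its coefficients switched at each subframe. The sign convention (A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, output = excitation minus the weighted past outputs), the function names, and the data layout are assumptions; the circuit itself is not specified at this level of detail here.

```python
def k_to_predictor(k):
    """Step-up recursion: K (reflection) parameters -> a[1..p] of A(z)."""
    a = []
    for km in k:
        # a_i(m) = a_i(m-1) + k_m * a_(m-i)(m-1), then a_m(m) = k_m
        a = [ai + km * aj for ai, aj in zip(a, reversed(a))] + [km]
    return a


def synthesize(excitation, boundaries, k_per_subframe):
    """All-pole synthesis 1/A(z), switching coefficients per subframe.

    boundaries: list of (start, end) sample indices, one pair per subframe.
    k_per_subframe: one K parameter list per subframe.
    """
    out = [0.0] * len(excitation)
    for (start, end), k in zip(boundaries, k_per_subframe):
        a = k_to_predictor(k)
        for n in range(start, end):
            acc = excitation[n]
            for i, ai in enumerate(a, start=1):
                if n - i >= 0:
                    acc -= ai * out[n - i]  # filter memory carries across subframes
            out[n] = acc
    return out


# Hypothetical example: a single pulse through a first-order filter.
y = synthesize([1.0, 0.0, 0.0, 0.0], [(0, 4)], [[-0.5]])
```

With this convention the filter is stable as long as every K parameter has magnitude below 1.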

This concludes the description of the receiving side of the speech encoding system of this embodiment.

In the drive signal calculation circuit 220 of this embodiment, the sound source signal is represented by the sound source pulse train of the representative section. Alternatively, each frame may be classified as voiced or unvoiced, and in unvoiced frames the sound source may be represented by a combination of a noise source and a pulse train instead of the pulse train of the representative section. Because the sound source signal becomes noise-like in unvoiced sections, this improves the sound quality there. In other words, the pulse train of the representative section may be used as the sound source signal for voiced speech, and a combination of noise and a pulse train for unvoiced speech. This method is described in, for example, Reference 3 cited above, so its explanation is omitted here.

A simple way to discriminate between voiced and unvoiced sections is, for example, the method described in Reference 3: the pitch gain is obtained from the value of the autocorrelation function of the speech signal at a lag of one pitch period, and the frame is judged voiced or unvoiced according to the magnitude of the pitch gain. Other well-known methods may also be used.
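A minimal Python sketch of this kind of voiced/unvoiced decision is given below. It takes the pitch gain to be the autocorrelation at a lag of one pitch period normalized by the zero-lag value; that normalization, the threshold of 0.5, and the function names are assumptions for illustration, not values taken from Reference 3.

```python
def pitch_gain(x, pitch):
    """Autocorrelation of x at a lag of one pitch period, normalized by R(0)."""
    r0 = sum(v * v for v in x)
    if r0 == 0.0:
        return 0.0  # silent frame: treat as having no pitch correlation
    rp = sum(x[n] * x[n - pitch] for n in range(pitch, len(x)))
    return rp / r0


def is_voiced(x, pitch, threshold=0.5):
    """Judge the frame voiced when the pitch gain exceeds the threshold."""
    return pitch_gain(x, pitch) > threshold


# Hypothetical test signal with an exact period of 4 samples.
periodic = [1.0, 0.0, -1.0, 0.0] * 8
voiced_at_true_period = is_voiced(periodic, 4)
voiced_at_wrong_lag = is_voiced(periodic, 2)
```

A strongly periodic frame gives a gain close to 1 at the true pitch lag, while a noise-like frame gives a gain near 0, so a single threshold separates the two cases in this simplified setting.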

Besides the method described in this embodiment, various other pulse calculation methods can be used. For example, a method that readjusts the amplitudes of the previously found pulses each time a new pulse is found may be used. The details of this method are given in, for example, the paper by Ono et al. entitled "Sound source pulse search method in multipulse-driven speech coding" (Proceedings of the Acoustical Society of Japan, 157, 1983) (Reference 8), so the explanation is omitted here.

In this embodiment, the frame is first divided into subframes and a pulse train is then computed for each subframe. Alternatively, a predetermined number of pulses may be computed for the entire frame without dividing it into subframes, and those pulses that fall within the subframe may be used.
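The selection step of this variant can be sketched in Python as follows. The representation of a pulse as a (position, amplitude) pair, the half-open subframe range, and the re-expression of positions relative to the subframe start are assumptions made for illustration.

```python
def pulses_in_subframe(pulses, start, end):
    """Keep pulses with start <= position < end, with positions made
    relative to the subframe start."""
    return [(pos - start, amp) for pos, amp in pulses if start <= pos < end]


# Hypothetical pulses found over a whole frame, and a subframe spanning 40..79.
frame_pulses = [(3, 0.9), (45, -0.4), (52, 1.2), (130, 0.3)]
rep_pulses = pulses_in_subframe(frame_pulses, 40, 80)
```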

To simplify processing, the pulse train may also be computed with the K parameters held constant within the frame. In that case the K parameter interpolation circuits 255 and 335 become unnecessary, but the temporal continuity of the spectral parameters can no longer be maintained.

Furthermore, the representative section may first be selected by a suitable criterion, for example a section near the center of the frame or a section with large power, after which the pulse train and the K parameters of that section are computed.
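As one possible reading of the power-based criterion, the Python sketch below picks the subframe with the largest mean power. Using mean rather than total power (so that subframes of different lengths are compared fairly), the function name, and the sample data are assumptions.

```python
def select_representative(frame, boundaries):
    """Return the index of the subframe with the largest mean power.

    boundaries: list of (start, end) sample indices, one pair per subframe.
    """
    def mean_power(start, end):
        segment = frame[start:end]
        return sum(v * v for v in segment) / len(segment) if segment else 0.0

    powers = [mean_power(s, e) for s, e in boundaries]
    return max(range(len(powers)), key=powers.__getitem__)


# Hypothetical frame of three 4-sample subframes; the middle one is loudest.
frame = [0.1] * 4 + [1.0] * 4 + [0.5] * 4
rep_index = select_representative(frame, [(0, 4), (4, 8), (8, 12)])
```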

Alternatively, the frame may be divided at predetermined time intervals. In that case the subframe phase need not be transmitted.

The position of the representative section may also be fixed in advance. In that case the position of the representative section need not be transmitted.

The sound source pulse train and the K parameters may be interpolated in synchronization with the pitch period, taking the selected representative section as the reference. Alternatively, either one or both of them may be interpolated with respect to a predetermined pitch section (for example, a pitch section near the center of the frame) rather than the representative section. Interpolation may also be performed without synchronizing to the pitch period.

The interpolation of the K parameters may also be omitted, or performed only on the receiving side. This allows the K parameter interpolation circuits 255 and 335 to be omitted.

For interpolating the sound source pulse train, the synthesis filter parameters (spectral parameters), and the pitch period, methods other than linear interpolation are also conceivable. For example, logarithmic interpolation may be used for the sound source pulse train and the pitch period. As for the synthesis filter parameters, this embodiment interpolates the K parameters, but it is also possible to interpolate, for example, the prediction coefficients (in which case the stability of the filter must be checked), the log area function, the formant parameters, or the autocorrelation function. These methods are described in, for example, the paper by B. S. Atal et al. entitled "Speech Analysis and Synthesis by Linear Prediction of the Speech Wave" (J. Acoust. Soc. Am., pp. 637-655, 1971) (Reference 9), so the explanation is omitted here.
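Two of the points above can be sketched in Python: logarithmic interpolation of a positive quantity such as the pitch period, and a stability check for interpolated prediction coefficients. The check uses the standard step-down recursion, assuming the convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p with synthesis filter 1/A(z); the function names and example values are hypothetical, and this is not claimed to be the procedure of Reference 9.

```python
import math


def log_interpolate(p0, p1, w):
    """Interpolate between positive values p0 and p1 in the log domain
    (w = 0 gives p0, w = 1 gives p1)."""
    return math.exp((1.0 - w) * math.log(p0) + w * math.log(p1))


def is_stable(a):
    """Step-down recursion on a[1..p]: 1/A(z) is stable if and only if
    every recovered reflection coefficient has magnitude below 1."""
    a = list(a)
    while a:
        k = a.pop()  # k_m = a_m(m)
        if abs(k) >= 1.0:
            return False
        # a_i(m-1) = (a_i(m) - k_m * a_(m-i)(m)) / (1 - k_m^2)
        a = [(ai - k * aj) / (1.0 - k * k) for ai, aj in zip(a, reversed(a))]
    return True


# Hypothetical values: pitch periods of 40 and 160 samples, midway between.
mid_pitch = log_interpolate(40.0, 160.0, 0.5)
stable = is_stable([0.6, 0.2])        # poles inside the unit circle
unstable = is_stable([-2.5, 1.0])     # A(z) has a root at z = 2
```

The log-domain midpoint is the geometric mean of the two endpoints, so it sits below the arithmetic mean that linear interpolation would give.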

In this embodiment, the K parameters are analyzed and the sound source pulse train is computed with a fixed frame length, but the frame length may be made variable. The frame length can then be shortened where the speech is changing and lengthened where it is stationary, which reduces the transmission bit rate. The frame length may also be determined according to the pitch period (for example, as an integer multiple of the pitch period).
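One simple way to pick a frame length that is an integer multiple of the pitch period is sketched below in Python. The nominal length of 160 samples and the rule of choosing the multiple nearest to it are assumptions for illustration only.

```python
import math


def frame_length_for_pitch(pitch, nominal=160):
    """Integer multiple of the pitch period closest to a nominal frame length
    (at least one pitch period)."""
    multiple = max(1, math.floor(nominal / pitch + 0.5))
    return multiple * pitch


# Hypothetical pitch periods in samples.
lengths = [frame_length_for_pitch(p) for p in (45, 50, 200)]
```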

The methods described above may be used individually or in suitable combinations.

As another configuration of the present invention, the drive signal restoration circuit 240, the synthesis filter circuit 250, the K parameter interpolation circuit 255, and the subtractor 120 in FIG. 1(a) may be omitted. In that case the speech signal need not be synthesized on the transmitting side, which simplifies the device configuration.

As is well known in the field of digital signal processing, the autocorrelation function can also be computed from the power spectrum, and the cross-correlation function from the cross-power spectrum.

These relationships are explained in detail in, for example, Chapter 8 of the book "Digital Signal Processing" by A. V. Oppenheim et al. (Reference 10), so the explanation is omitted here.
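The relationship between the autocorrelation function and the power spectrum can be checked numerically with the short Python sketch below. It uses a naive O(N^2) discrete Fourier transform from the standard library to stay self-contained; the function names and the test signal are hypothetical, and a practical implementation would use an FFT.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def autocorr_direct(x, max_lag):
    """Autocorrelation by direct summation for lags 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def autocorr_from_power_spectrum(x, max_lag):
    """Autocorrelation as the inverse DFT of the power spectrum."""
    # Zero-pad by max_lag samples so the circular correlation implied by
    # the DFT equals the linear one for lags 0..max_lag.
    padded = list(x) + [0.0] * max_lag
    power = [abs(c) ** 2 for c in dft(padded)]
    r = dft(power, inverse=True)
    return [r[lag].real for lag in range(max_lag + 1)]


signal = [1.0, 2.0, 3.0, 4.0]
r_direct = autocorr_direct(signal, 2)
r_spectral = autocorr_from_power_spectrum(signal, 2)
```

Both routes give the same values up to rounding error. The cross-correlation case works the same way, with the power spectrum replaced by the product of one spectrum and the complex conjugate of the other.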

[Effect of the Invention]

As described above, the present invention divides one frame into pitch sections, selects one pitch section (the representative section), computes and transmits the spectral parameters and the sound source pulse train for that section, and reproduces the speech signal from them on the receiving side. Compared with conventional methods, it can therefore reproduce higher-quality speech at the same bit rate.

[Brief explanation of drawings]

FIGS. 1(a) and 1(b) are block diagrams of the transmitting side and the receiving side, respectively, in an embodiment of the speech encoding system of the present invention. FIG. 2 shows an example of the processing performed in the drive signal calculation circuit shown in FIG. 1(a). FIG. 3 is a block diagram showing an example of the synthesis side of a conventional system.

100... transmitting-side input terminal, 110... buffer memory, 120... subtractor, 130... pitch analysis circuit, 140... K parameter calculation circuit, 150... pitch encoding circuit, 160... K parameter encoding circuit, 165, 355... subframe division circuit, 170... impulse response calculation circuit, 180... autocorrelation function calculation circuit, 200... weighting circuit, 210... cross-correlation function calculation circuit, 220... drive signal calculation circuit, 225, 345... pitch interpolation circuit, 230... encoder, 240, 340... drive signal restoration circuit, 250, 350, 510... synthesis filter circuit, 255, 335... K parameter interpolation circuit, 260... multiplexer, 270... transmitting-side output terminal, 280... receiving-side input terminal, 290... demultiplexer, 300... decoding circuit, 320... pitch decoding circuit, 330... K parameter decoding circuit.

Claims

[Claims]

(1) A speech encoding method in which, on the transmitting side, a discrete speech signal is input and divided into predetermined first time intervals; a pitch parameter representing the pitch is extracted from the speech signal; each first time interval is divided, based on the pitch parameter, into second time intervals shorter than the first time interval; one of the second time intervals is selected; a spectral parameter representing the short-time spectral envelope of the speech signal in the selected interval and information representing the sound source of the speech signal in the selected interval are obtained; and the spectral parameter and the sound source information are combined with the pitch parameter and output; and in which, on the receiving side, the second time intervals are restored based on the pitch parameter, and the speech signal is synthesized using the information representing the sound source of the selected interval and the spectral parameter of the selected interval.

(2) A speech encoding device comprising: a dividing circuit that divides an input speech signal into predetermined first time intervals, extracts and encodes a pitch parameter representing the pitch from the speech signal, and divides each first time interval, based on the pitch parameter, into second time intervals shorter than the first time interval; a parameter calculation circuit that selects one of the second time intervals and calculates and encodes a spectral parameter representing the short-time spectral envelope of the speech signal in the selected interval and a pulse train representing the sound source of the speech signal in the selected interval; and a multiplexer circuit that combines and outputs a code representing the sound source of the selected interval, a code representing the spectral parameter of the selected interval, and a code representing the pitch parameter.

(3) A speech decoding device comprising: a demultiplexer circuit that receives, for each predetermined first time interval, a code sequence in which a code representing a pitch parameter, a code representing the spectral parameter of a selected interval, and a code representing the sound source of the selected interval are combined, and that separates and decodes the code representing the pitch parameter, the code representing the spectral parameter of the selected interval, and the code representing the sound source information of the selected interval; a dividing circuit that restores second time intervals shorter than the first time interval based on the decoded pitch parameter; a driving sound source signal restoration circuit that restores the driving sound source signal by applying processing that imparts variation to the sound source signal; and a synthesis filter circuit that synthesizes and outputs a speech signal using the decoded spectral parameter of the selected interval and the restored driving sound source signal.