JPS61150000A - Voice encoding system and apparatus

Info

Publication number: JPS61150000A
Application number: JP59272435A
Authority: JP
Inventors: 一範小澤; 卓荒関
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1984-12-24
Filing date: 1984-12-24
Publication date: 1986-07-08
Anticipated expiration: 2014-01-06
Also published as: JP2844590B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to an encoding method and apparatus for encoding a speech signal with high quality at a low bit rate.

(Prior Art) The vocoder (VOCODER) is known as a method for encoding a speech signal at a low transmission bit rate (for example, about 4.8 kbps). Its principle is described in detail in, for example, the paper by M. R. Schroeder entitled "Vocoders: Analysis and Synthesis of Speech" (Proc. IEEE, pp. 720-734, May 1966; Reference 1). The LPC vocoder (LPC VOCODER) is also known as a vocoder that uses linear predictive analysis; it is described in detail in, for example, the paper by J. D. Markel et al. entitled "A Linear Prediction Vocoder Based upon the Autocorrelation Method" (IEEE Trans. A.S.S.P., pp. 124-134, April 1974; Reference 2). The present invention improves the sound source section of the vocoder and is closely related to the LPC vocoder, so the LPC vocoder is outlined below, focusing on the configuration of its synthesis section.

FIG. 4 is a block diagram showing the synthesis section (receiving section) of the LPC vocoder described in Reference 2. The synthesis section consists of a sound source generation section 500 and a synthesis filter circuit 510. The sound source generation section 500 consists of an impulse generator 501, a noise generator 502, a voiced/unvoiced switching circuit 503, and a gain circuit 504. In the vocoder, the speech signal is classified as either voiced or unvoiced at short intervals (for example, every 20 msec). For voiced speech, the impulse generator 501 generates a pulse train whose pulses are spaced at the pitch period Pd. For unvoiced speech, the noise generator 502 generates white noise. The switching circuit 503 selects between the two. The gain circuit 504 applies a gain G to the signal generated in this way, and the result is output to the synthesis filter 510 as the sound source signal d(n). The synthesis filter 510 synthesizes and outputs the speech x(n) from the sound source signal d(n) and the filter parameters Ki. The pitch period Pd, the voiced/unvoiced switching signal (V/UV), the gain G, and the filter parameters Ki are calculated on the analysis side (transmitting side) at predetermined time intervals and transmitted to the receiving side.
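The conventional synthesis section of FIG. 4 can be sketched roughly as follows. This is a minimal illustration, not the circuit of Reference 2: the function name `lpc_vocoder_frame` is hypothetical, and the synthesis filter is assumed to be a direct-form all-pole filter driven by prediction coefficients `a`, whereas the text transmits filter parameters Ki.

```python
import random

def lpc_vocoder_frame(voiced, pitch_period, gain, a, n_samples, seed=0):
    """Synthesize one frame with the two-state (voiced/unvoiced) excitation.

    Assumption: the synthesis filter is all-pole,
    x(n) = d(n) + sum_i a[i] * x(n - i - 1).
    """
    rng = random.Random(seed)
    if voiced:
        # Impulse generator 501: one unit pulse every pitch_period samples.
        d = [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    else:
        # Noise generator 502: white Gaussian noise.
        d = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Gain circuit 504 scales the selected source to give d(n).
    d = [gain * v for v in d]
    # Synthesis filter 510 (assumed direct form).
    x = []
    for n in range(n_samples):
        acc = d[n]
        for i, ai in enumerate(a):
            if n - i - 1 >= 0:
                acc += ai * x[n - i - 1]
        x.append(acc)
    return d, x
```

Because every voiced frame is driven by one identical impulse per pitch period, the excitation carries no phase information, which is the limitation the following section discusses.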

(Problems to be Solved by the Invention) In the LPC vocoder described above, the transmitted information consists of the pitch period, the voiced/unvoiced signal, the gain, and the filter parameters. Because the speech signal can be synthesized from this information alone, the transmission bit rate can be made low (for example, about 4.8 kbps). With this conventional method, however, it is difficult to synthesize speech of good quality. Because speech is divided into only two extreme classes, voiced and unvoiced, an error in the voiced/unvoiced decision causes a large degradation in quality.

In addition, for voiced speech the sound source is represented by a single impulse per pitch period and carries no phase information, so speaker individuality is considerably impaired and the synthesized speech has a so-called "mechanical" sound. The sound source also cannot be represented well at transitions between unvoiced and voiced speech, which causes degradation. Furthermore, an error in the estimated pitch period causes a large degradation in quality. An object of the present invention is to provide a high-efficiency speech encoding system and apparatus that can synthesize high-quality speech even at a low transmission bit rate with a relatively small amount of computation.

(Means for Solving the Problems) In the speech encoding system of the present invention, the transmitting side receives a discrete speech signal; extracts and encodes, from the speech signal, a spectral parameter representing the short-time spectral envelope and a pitch parameter representing the pitch; obtains and encodes, on the basis of the speech signal, the pitch parameter, and the spectral parameter, a predetermined number of representative sound source pulses for representing the speech signal, for a representative interval among the time intervals determined according to the pitch parameter; and combines and outputs the code representing the pitch parameter and the code representing the spectral parameter. The receiving side receives the combined code; separates and decodes the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the representative sound source pulse train; restores a driving sound source signal by applying to the sound source pulse train, on the basis of the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth variation; and synthesizes the speech signal from the restored driving sound source signal and the decoded spectral parameter.

The encoding apparatus of the present invention comprises: a parameter calculation circuit that extracts and encodes, from the input speech signal, a spectral parameter representing the short-time spectral envelope and a pitch parameter representing the pitch; a drive signal calculation circuit that obtains and encodes, on the basis of the speech signal, the pitch parameter, and the spectral parameter, a predetermined number of representative sound source pulses for representing the speech signal, for a representative interval among the time intervals determined according to the pitch parameter; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output code of the drive signal calculation circuit.

The decoding apparatus of the present invention comprises: a demultiplexer circuit that receives a code sequence in which a code representing a pitch parameter, a code representing a spectral parameter, and a code representing a representative sound source pulse train are combined, and that separates and decodes the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the sound source pulse train; a driving sound source signal restoration circuit that restores a driving sound source signal by applying to the sound source pulse train, on the basis of the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth variation; and a synthesis filter circuit that synthesizes and outputs a speech signal from the driving sound source signal and the decoded spectral parameter.

(Operation) The present invention always represents the sound source signal as a combination of multiple pulses (a multipulse), as shown in FIG. 5, and is characterized in that it does not switch between pulses and noise for voiced and unvoiced segments. In FIG. 5, the horizontal axis represents discrete time and the vertical axis represents amplitude. Speech is synthesized from this multipulse and a synthesis filter. A known way of determining the amplitudes and positions of the pulses representing the sound source is the analysis-by-synthesis (A-b-S) method, described in detail in, for example, the paper by B. S. Atal et al. entitled "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates" (Proc. I.C.A.S.S.P., pp. 614-617, 1982; Reference 3). A method based on correlation calculations is also known as a faster way of determining the pulses; it is described in detail in the specification of Japanese Patent Application No. 57-231606 (Reference 4) and elsewhere, so its description is omitted here.

The present invention further exploits the periodicity of the speech signal. The transmitting side calculates a predetermined number of representative sound source pulses for one representative pitch interval, and transmits the amplitudes and positions of these pulses together with the spectral parameters representing the spectral envelope of the speech signal and the pitch period information.

The receiving side uses the amplitudes and positions of the representative pulses and the pitch period information, and restores the driving sound source signal by interpolating from the preceding and following frames so that the representative pulses vary smoothly in time. It then synthesizes a high-quality speech signal from this driving sound source signal and from spectral parameters that have likewise been made to vary smoothly in time by interpolation.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1(a) is a block diagram showing an embodiment of the transmitting side of the high-efficiency speech encoding system according to the present invention, and FIG. 1(b) is a block diagram showing an embodiment of the receiving side.

In FIG. 1(a), the speech signal x(n) is input and a predetermined number of samples are stored in the buffer memory circuit 110. Next, the K-parameter calculation circuit 140 receives the predetermined number of speech samples from the buffer memory circuit 110 and calculates the K parameters representing the spectral envelope of the speech signal. Here, the K parameters are identical to the PARCOR coefficients.

The autocorrelation method is a well-known way of calculating the K parameters. It is described in detail in, for example, the paper by John Makhoul et al. entitled "Quantization Properties of Transmission Parameters in Linear Predictive Systems" (IEEE Trans. A.S.S.P., pp. 309-321, 1983; Reference 5), so its description is omitted here.
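As a rough illustration of the autocorrelation method mentioned above, the K parameters (PARCOR coefficients) can be obtained with the Levinson-Durbin recursion. The sketch below is a textbook form, not taken from Reference 5; the function names and the sign convention (x(n) predicted as the sum of a[i] * x(n - i)) are assumptions made here.

```python
def autocorrelation(x, order):
    """r[m] = sum_n x[n] * x[n + m] for m = 0..order."""
    return [sum(x[n] * x[n + m] for n in range(len(x) - m))
            for m in range(order + 1)]

def levinson_durbin(r, order):
    """Return (a, k): predictor and reflection coefficients of orders 1..p.

    Assumes r[0] > 0; a silent (all-zero) frame needs separate handling.
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= 1.0 - k[i] * k[i]
    return a[1:], k[1:]
```

For a stable analysis every returned reflection coefficient has magnitude below 1, which is one reason K parameters are convenient to quantize and interpolate.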

Returning to FIG. 1(a), the K parameters Ki are output to the K-parameter encoding circuit 160. The K-parameter encoding circuit 160 encodes Ki with a predetermined number of quantization bits and outputs the code ℓ_Ki to the multiplexer 260. The K-parameter encoding circuit 160 also decodes ℓ_Ki to obtain the decoded K-parameter values Ki', converts them into prediction coefficients ai', and outputs these to the impulse response calculation circuit 170 and the weighting circuit 200. It also outputs the decoded K-parameter values Ki' to the interpolation circuit 255.
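The two operations of the K-parameter encoding circuit 160 (quantize, then convert the decoded values to prediction coefficients) might look like the sketch below. The text does not specify the quantizer, so a uniform scalar quantizer over [-1, 1] is assumed purely for illustration; the conversion is the standard step-up recursion, with the same sign convention assumed as in the usual Levinson-Durbin formulation. Both function names are hypothetical.

```python
def quantize_uniform(value, bits, lo=-1.0, hi=1.0):
    """Uniform scalar quantizer (an assumption): returns (code, decoded value)."""
    levels = 1 << bits
    step = (hi - lo) / levels
    code = min(levels - 1, max(0, int((value - lo) / step)))
    decoded = lo + (code + 0.5) * step  # reconstruct at the cell midpoint
    return code, decoded

def parcor_to_predictor(k):
    """Step-up recursion: reflection coefficients -> prediction coefficients."""
    a = []
    for i, ki in enumerate(k):
        # Order i+1 coefficients from the order i coefficients.
        a = [a[j] - ki * a[i - 1 - j] for j in range(i)] + [ki]
    return a
```

Computing ai' from the decoded Ki' rather than from the unquantized Ki keeps the encoder's internal filters identical to the ones the decoder will use.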

The pitch analysis circuit 130 calculates the pitch period Pd from the output of the buffer memory circuit 110. Pd can be calculated using, for example, the method described in the paper by R. V. Cox et al. entitled "Real-Time Implementation of Time Domain Harmonic Scaling of Speech Signals" (IEEE Trans. A.S.S.P., pp. 258-272, 1983; Reference 6).

The pitch encoding circuit 150 quantizes and encodes the pitch period Pd with a predetermined number of quantization bits and outputs the code ℓ_d to the multiplexer 260. It also outputs the decoded value Pd' to the drive signal calculation circuit 220 and the interpolation circuit 255.

The impulse response calculation circuit 170 receives the prediction coefficients ai' from the K-parameter encoding circuit 160 and calculates the impulse response hw(n) representing the transfer function of the weighted synthesis filter. hw(n) can be calculated using, for example, the same method as the impulse response calculation circuit 210 shown in FIG. 4(a) of the specification of Japanese Patent Application No. 59-042305. The impulse response hw(n) is output to the autocorrelation function calculation circuit 180 and the cross-correlation function calculation circuit 210.
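The actual weighting is defined in the cited application and is not reproduced in this text. As a sketch under a common assumption from multipulse coding, the weighted synthesis filter is taken here to be 1/A(z/gamma), where A(z) = 1 - sum_i a_i z^-i and gamma is a weighting factor between 0 and 1. Both `gamma` and the function name are illustrative, not from the source.

```python
def weighted_impulse_response(a, gamma, length):
    """Impulse response of 1/A(z/gamma), an assumed form of the weighted filter.

    a[i] is the prediction coefficient for delay i + 1.
    """
    # Bandwidth-expanded coefficients: a_i * gamma**i for delay i.
    aw = [ai * gamma ** (i + 1) for i, ai in enumerate(a)]
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0  # unit impulse input
        for i, c in enumerate(aw):
            if n - i - 1 >= 0:
                acc += c * h[n - i - 1]
        h.append(acc)
    return h
```

With gamma below 1 the response decays faster than that of the unweighted filter, so it can be truncated to a short length before the correlations are computed.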

The autocorrelation function calculation circuit 180 receives the impulse response hw(n) from the impulse response calculation circuit 170 and calculates the autocorrelation function Rhh(·) according to equation (1).
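The equation referred to here does not survive in this text. The conventional autocorrelation of the impulse response, assumed here and possibly differing from the original in its limits or normalization, can be written as follows, where N (the number of impulse response samples used) is a symbol introduced only for this illustration:

```latex
R_{hh}(m) = \sum_{n=1}^{N-m} h_w(n)\, h_w(n+m), \qquad 0 \le m \le N-1
```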

The autocorrelation function Rhh(·) is output to the drive signal calculation circuit 220.

Next, the subtractor 120 subtracts one frame of the output of the synthesis filter circuit 250 from the speech signal x(n) held in the buffer memory circuit 110 and outputs the result e(n) to the weighting circuit 200. The weighting circuit 200 receives e(n) and the prediction coefficients ai' from the K-parameter encoding circuit 160, applies weighting to e(n), and outputs the result ew(n). ew(n) can be calculated using, for example, the same method as the weighting circuit 410 shown in FIG. 4(a) of the specification of Japanese Patent Application No. 59-042305.

The cross-correlation function calculation circuit 210 receives ew(n) from the weighting circuit 200 and the impulse response hw(n) from the impulse response calculation circuit 170, and calculates the cross-correlation function φhe(·) according to the following equation.

$$\varphi_{he}(m) = \sum_{n} e_w(n)\, h_w(n-m), \qquad 1 \le m \le M \qquad (2)$$

The cross-correlation function φhe(m) is output to the drive signal calculation circuit 220.
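A direct evaluation of the cross-correlation of equation (2) might look like this. It is a sketch only: indices are 0-based here, whereas the text counts m from 1, and the function name is hypothetical.

```python
def cross_correlation(e_w, h_w, frame_len):
    """phi[m] = sum_n e_w[n] * h_w[n - m] for m = 0..frame_len - 1 (0-based)."""
    phi = []
    for m in range(frame_len):
        acc = 0.0
        # h_w[n - m] is defined only for 0 <= n - m < len(h_w).
        for n in range(m, min(len(e_w), m + len(h_w))):
            acc += e_w[n] * h_w[n - m]
        phi.append(acc)
    return phi
```

A large value at lag m indicates that a copy of the impulse response placed at position m explains much of the weighted signal, which is why the pulse search looks for peaks of this function.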

Next, the drive signal calculation circuit 220 calculates, for a representative pitch interval, a pulse sequence serving as the sound source signal that represents the speech signal. The procedure is as follows. First, the frame is divided into subframes of one pitch period Pd' each. This division requires knowledge of the pitch excitation position, which can be found by determining the pulse train representing the sound source: the pitch excitation position is known from the position of the first pulse. The pulse train can be calculated using, for example, the method given by equation (21) in the specification of Japanese Patent Application No. 57-231606. FIG. 2(a) shows one frame of a speech waveform, and FIG. 2(b) shows the first pulse found, g1, together with the subframes obtained by dividing the frame using the position of this pulse. Next, a predetermined number of pulses is calculated for each subframe. One way to select the representative pitch interval is, for example, to take the subframe containing the pulse with the largest absolute value as the representative pitch interval and the pulses in that interval as the representative pulses. FIG. 2(c) shows the representative pitch pulses found in this way. The pulse amplitudes and positions are output to the encoding circuit 230. In addition, the subframe phase T and the subframe number of the representative pitch interval (3 in FIG. 2(c)), which serves as the representative pitch position, are encoded with a predetermined number of bits and output to the multiplexer 260.
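The procedure above can be sketched as follows. Equation (21) of the cited application is not reproduced in this text, so a commonly used greedy correlation search is assumed in its place: pick the position of the largest |phi|, set the amplitude to phi/Rhh(0), then subtract that pulse's contribution from phi. The function name, the return layout, and the choice to start the first subframe at the phase T are all illustrative assumptions.

```python
def find_representative_pulses(phi, rhh, pitch, pulses_per_subframe):
    """Return (phase T, representative subframe index, its pulses).

    phi: cross-correlation over the frame; rhh: autocorrelation, rhh[0] > 0.
    Pulses are (position relative to subframe start, amplitude) pairs.
    """
    phi = list(phi)
    n = len(phi)
    # The first pulse gives the pitch excitation position, hence the phase.
    first = max(range(n), key=lambda i: abs(phi[i]))
    phase = first % pitch
    subframes = []
    for start in range(phase, n, pitch):
        end = min(start + pitch, n)
        pulses = []
        for _ in range(pulses_per_subframe):
            m = max(range(start, end), key=lambda i: abs(phi[i]))
            g = phi[m] / rhh[0]
            pulses.append((m - start, g))
            # Remove this pulse's contribution before searching for the next.
            for i in range(n):
                lag = abs(i - m)
                if lag < len(rhh):
                    phi[i] -= g * rhh[lag]
        subframes.append(pulses)
    # Representative interval: the subframe holding the largest |amplitude|.
    rep = max(range(len(subframes)),
              key=lambda j: max(abs(g) for _, g in subframes[j]))
    return phase, rep, subframes[rep]
```

Only the pulses of the representative subframe, plus the phase and the subframe index, need to be encoded; the other subframes are reconstructed by interpolation at the receiver.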

The encoding circuit 230 encodes the amplitudes and positions of the input pulses and outputs the codes for the amplitudes and positions of the pulses in the representative pitch interval to the multiplexer.

It also outputs the decoded amplitudes gi' and positions mi' of all the input pulses to the drive signal restoration circuit 240. The pulses can be encoded using, for example, the same method as the encoding circuit 250 described in the specification of Japanese Patent Application No. 57-231605. Various other encoding methods, such as variable-length coding or differential coding, are also conceivable.

The drive signal restoration circuit 240 uses the decoded pulse amplitudes and positions input from the encoding circuit 230 to generate one frame of pulses, and outputs them as the driving sound source signal to the synthesis filter circuit 250.

The interpolation circuit 255 receives the pitch period Pd', the subframe phase T, and the representative pitch position. It divides the frame into subframes of one pitch period Pd' each and interpolates the K parameters for each subframe. The interpolation is linear and uses the K-parameter values of the preceding frame and the following frame. FIG. 3 illustrates this interpolation. In the figure, the i-th K parameter of the j-th frame, Ki,j, is interpolated subframe by subframe using the value one frame earlier, Ki,j-1, and the value one frame later, Ki,j+1. For the representative pitch interval, the decoded K-parameter values input from the K-parameter encoding circuit 160 are used. The K parameters interpolated in this way are output to the synthesis filter circuit 250.
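One way to realize the linear interpolation of FIG. 3 is sketched below. The text does not give the exact weights, so it is assumed here that each frame's decoded K parameters apply at the time position of that frame's representative pitch interval (`t_prev`, `t_cur`, `t_next`, in samples), and that a subframe at time `t` is interpolated between the two nearest such positions. All names are hypothetical.

```python
def interpolate_k(k_prev, k_cur, k_next, t_prev, t_cur, t_next, t):
    """Linearly interpolate a K-parameter vector at time t (assumed weighting)."""
    if t <= t_cur:
        # Between the previous frame's reference point and the current one.
        w = (t - t_prev) / (t_cur - t_prev)
        return [p + w * (c - p) for p, c in zip(k_prev, k_cur)]
    # Between the current reference point and the next frame's.
    w = (t - t_cur) / (t_next - t_cur)
    return [c + w * (q - c) for c, q in zip(k_cur, k_next)]
```

At t equal to `t_cur` the function returns the decoded values of the current frame unchanged, matching the statement that the representative interval uses the decoded K parameters directly. A linear blend of two vectors whose entries all have magnitude below 1 keeps that property, so the interpolated filter remains stable.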

The synthesis filter 250 receives the driving sound source signal and the interpolated K parameters and calculates one frame of the response signal x(n). This calculation can use, for example, the same method as the synthesis filter circuit 320 described in the specification of Japanese Patent Application No. 57-231605.
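The cited synthesis filter circuit is not described in this text. Since the filter is driven by K parameters, one plausible realization is an all-pole lattice filter, sketched below with a sign convention that matches A(z) = 1 - sum_i a_i z^-i. The function name and the explicit `state` argument are assumptions; the state is passed in and out so that the K parameters can change from one subframe to the next without resetting the filter memory.

```python
def lattice_synthesis(excitation, k, state=None):
    """All-pole lattice synthesis with reflection coefficients k (len(k) >= 1).

    Returns (output samples, updated state). state[i] holds the backward
    prediction error b_i(n - 1).
    """
    p = len(k)
    b = list(state) if state is not None else [0.0] * p
    out = []
    for d in excitation:
        f = d  # forward error at the top stage equals the excitation
        new_b = [0.0] * p
        for i in range(p, 0, -1):
            f = f + k[i - 1] * b[i - 1]             # f_{i-1}(n)
            if i < p:
                new_b[i] = b[i - 1] - k[i - 1] * f  # b_i(n)
        new_b[0] = f                                # b_0(n) = x(n)
        b = new_b
        out.append(f)
    return out, b
```

Processing a frame subframe by subframe then amounts to calling the function once per subframe with that subframe's interpolated K parameters and the state returned by the previous call.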

The multiplexer circuit 260 receives the code ℓ_Ki from the K-parameter encoding circuit 160, the code ℓ_d from the pitch encoding circuit 150, the code from the encoding circuit 230, and the codes for the subframe phase and the representative pitch position, combines them, and outputs the result from the transmitting-side output terminal 270. This concludes the description of the transmitting side of the high-efficiency speech encoding system according to the present invention.

Next, the configuration of the receiving side of the speech encoding system according to the present invention is described with reference to FIG. 1(b).

The demultiplexer 290 separates the code input from the receiving-side input terminal 280 into the code representing the K parameters, the code representing the pitch period, the code representing the pulse amplitudes and positions, and the code representing the subframe phase and representative pitch position, and outputs them to the K-parameter decoding circuit 330, the pitch decoding circuit 320, the pulse decoding circuit 300, and the drive signal restoration circuit 340, respectively.

The K-parameter decoding circuit 330 decodes the K parameters and outputs the decoded values Ki' to the interpolation circuit 335.

The pitch decoding circuit 320 decodes the pitch period Pd' and outputs it to the drive signal restoration circuit 340 and the interpolation circuit 335.

The pulse decoding circuit 300 decodes the pulse amplitudes gi' and positions mi' and outputs them to the drive signal restoration circuit 340.

The drive signal restoration circuit 340 first receives the code representing the subframe phase and the representative pitch position, together with the pitch period Pd', and divides the frame into subframes of one pitch period Pd' each. It then generates pulses of amplitude g' at positions m' in the subframe indicated by the representative pitch position. Next, it obtains the pulses for each remaining subframe by interpolation, using the representative pitch pulses of the current frame and the representative pulses of the preceding frame and the following frame. Pulses are thus generated for every subframe to restore the driving sound source signal, and one frame of the driving sound source signal is output to the synthesis filter circuit 350.
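The restoration step can be sketched as follows. The text does not spell out the interpolation weights or how pulse positions are handled, so this sketch makes several assumptions: each representative set has the same number of pulses, positions are taken from the current frame's representative pulses, and amplitudes are blended linearly by subframe index toward the preceding frame's set (before the representative subframe) or the following frame's set (after it). All names are hypothetical.

```python
def restore_excitation(frame_len, pitch, phase, rep_index,
                       rep_pulses, prev_pulses, next_pulses):
    """Rebuild one frame of excitation from representative pulses.

    Each pulse list holds (position relative to subframe start, amplitude).
    """
    starts = list(range(phase, frame_len, pitch))
    last = len(starts) - 1
    excitation = [0.0] * frame_len
    for j, start in enumerate(starts):
        if j < rep_index:
            w = (j + 1) / (rep_index + 1)                # weight of current set
            other = prev_pulses
        elif j > rep_index:
            w = (last - j + 1) / (last - rep_index + 1)
            other = next_pulses
        else:
            w, other = 1.0, rep_pulses                   # representative subframe
        for (pos, g_cur), (_, g_other) in zip(rep_pulses, other):
            n = start + pos
            if n < frame_len:
                excitation[n] += w * g_cur + (1.0 - w) * g_other
    return excitation
```

The result is a pulse pattern that repeats at the pitch period while its amplitudes drift smoothly between frames, instead of switching abruptly at frame boundaries.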

The interpolation circuit 335 operates in the same way as the interpolation circuit 255 on the transmitting side: it interpolates the decoded K parameters for each pitch period and outputs the interpolated parameters to the synthesis filter circuit 350.

The synthesis filter circuit 350 receives the driving sound source signal and the interpolated K parameters, operates in the same way as the synthesis filter circuit 250 on the transmitting side, calculates one frame of the synthesized speech signal x(n), and outputs it from the receiving-side output terminal 360.

This concludes the description of the receiving side of the high-efficiency speech encoding system according to the present invention.

Various pulse calculation methods other than the one described in this embodiment can be used in the drive signal calculation circuit 220. For example, a method can be used in which the amplitudes of the previously determined pulses are readjusted each time a new pulse is determined.

Details of this method are given in the paper by Ono et al. entitled "A Study of Sound Source Pulse Search Methods in Multipulse-Driven Speech Coding" (Proceedings of the Acoustical Society of Japan, 157, 1983; Reference 7) and elsewhere, so its description is omitted here.

In the embodiment, the drive signal calculation circuit 220 divides the frame into subframes and then determines the pulses for each subframe; alternatively, a predetermined number of pulses may be determined for the whole frame without dividing it into subframes. Also, on the transmitting side of this embodiment, the K-parameter values are held constant within a frame (that is, the synthesis filter characteristics do not change within the frame) while the pulses are determined for each subframe; the pulses may instead be determined while the K-parameter values are varied smoothly from subframe to subframe. Specifically, the K-parameter values are interpolated for each subframe using the K-parameter values of the preceding and following frames, the interpolated values are converted into prediction coefficients and output to the weighting circuit 200, the impulse response calculation circuit 170, and the synthesis filter circuit 250, and the pulses are calculated using the cross-correlation and autocorrelation functions obtained by updating the coefficients for each subframe. This yields better synthesized speech. In interpolating the pulses and the K parameters, the transmitting side in the embodiment determines and transmits the position of the representative pitch interval, and interpolation is performed in synchronism with the pitch period using that interval as the reference; however, for either or both of the pulses and the K parameters, interpolation may instead use a predetermined pitch interval (for example, the pitch interval near the center of the frame) as the reference. If this is done for both, the code representing the position of the representative pitch interval need not be transmitted, and the transmission bit rate can be reduced. Also, although the subframe containing the pulse with the largest absolute value is selected as the representative pitch interval, other methods, such as selecting the subframe at the center of the frame, may be used. Also, although the pitch period is held constant during subframe division, it too may be interpolated using the pitch periods of the preceding and following frames. Alternatively, the pulses and the K parameters may be interpolated without synchronizing to the pitch period.

In this case, the frame is divided at predetermined time intervals (for example, about 2.5 msec), and interpolation is performed for each interval. The subframe phase then need not be transmitted. The reference interval for the interpolation may be a representative interval determined on the transmitting side, or it may be fixed in advance (for example, near the center of the frame). In the latter case, neither the subframe phase nor the representative pitch position needs to be transmitted, and the bit rate can be reduced further.
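
As a rough illustration of the K-parameter interpolation described above, the following Python sketch linearly interpolates the K parameters of two neighboring frames for each subframe (or fixed-length interval) and converts each interpolated set into prediction coefficients with the standard step-up recursion. The function names, the choice of evaluating the interpolation weight at the center of each segment, and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions made for illustration; the specification does not fix them.

```python
def interpolate_k(k_prev, k_next, num_segments):
    """Linearly interpolate K parameters for each segment of a frame.

    k_prev, k_next: K parameters of the neighboring frames (hypothetical inputs).
    The weight is taken at the center of each segment (an assumption).
    """
    tracks = []
    for s in range(num_segments):
        w = (s + 0.5) / num_segments
        tracks.append([(1.0 - w) * kp + w * kn for kp, kn in zip(k_prev, k_next)])
    return tracks


def k_to_prediction(k):
    """Step-up recursion: K parameters -> [a1, ..., ap] of A(z) = 1 + sum a_i z^-i."""
    a = []
    for km in k:
        # a_i(m) = a_i(m-1) + k_m * a_(m-i)(m-1) for i < m, and a_m(m) = k_m
        a = [ai + km * aj for ai, aj in zip(a, reversed(a))] + [km]
    return a


# Example: four segments between two second-order parameter sets.
for k in interpolate_k([0.5, -0.3], [0.7, -0.1], 4):
    print(k_to_prediction(k))
```

Because each interpolated value is a weighted average of two K parameters, it stays inside (-1, 1) whenever both endpoints do, so under this convention the interpolated synthesis filter remains stable without a separate check.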

Next, methods other than linear interpolation may also be used to interpolate the pulses, the synthesis filter parameters, and the pitch period.

For example, logarithmic interpolation may be used for the pulses and the pitch period. In addition, although this embodiment interpolates the K parameters when interpolating the synthesis filter parameters, it is also possible to interpolate the prediction coefficients (in which case the stability of the filter must be checked), the log area function, or the autocorrelation function. These specific methods are described in, for example, the paper by B. S. Atal et al. entitled "Speech Analysis and Synthesis by Linear Prediction of the Speech Wave" (J. Acoust. Soc. Am., pp. 637-655, 1971; Reference 8), so their explanation is omitted here.
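
Two of the alternatives mentioned above can be sketched in Python as follows: logarithmic interpolation of a positive quantity such as the pitch period, and a stability check for a set of interpolated prediction coefficients. The check uses the step-down recursion to recover the reflection coefficients and tests that each has magnitude below one. This is one common way to test stability and is an assumption here, since the specification only states that stability must be checked. The function names and the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are likewise hypothetical.

```python
import math


def log_interpolate(v0, v1, w):
    """Interpolate between positive values v0 and v1 in the log domain (0 <= w <= 1)."""
    return math.exp((1.0 - w) * math.log(v0) + w * math.log(v1))


def is_stable(a):
    """Return True if 1/A(z) is stable, with A(z) = 1 + sum a_i z^-i (assumed convention).

    Runs the step-down recursion and rejects any reflection coefficient
    whose magnitude is 1 or more.
    """
    a = list(a)
    while a:
        k = a[-1]
        if abs(k) >= 1.0:
            return False
        rest = a[:-1]
        a = [(ai - k * aj) / (1.0 - k * k) for ai, aj in zip(rest, reversed(rest))]
    return True


# Example: pitch period halfway between 40 and 160 samples, in the log domain.
print(log_interpolate(40.0, 160.0, 0.5))
print(is_stable([0.6, 0.2]))
```

The logarithm restricts this form of interpolation to positive values; signed pulse amplitudes would need separate handling, which is not shown.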

In this embodiment, the K parameters were analyzed and the sound source pulse train was calculated with a constant frame length, but the frame length may be variable. In that case the frame length can be shortened where the speech is changing and lengthened where it is stationary, so the transmission bit rate can be reduced.

Furthermore, if the frame length is determined according to the pitch period (for example, as an integer multiple of the pitch period), the subframe phase described in this embodiment need not be sent either, so the transmission bit rate can be reduced further.
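
A minimal Python sketch of this idea follows, assuming the pitch period and a nominal frame length are both given in samples. The function names and the rule of picking the multiple closest to the nominal length are hypothetical; the specification only says that the frame length may be an integer multiple of the pitch period.

```python
def pitch_synchronous_frame_length(pitch_period, nominal_length):
    """Return the integer multiple of pitch_period closest to nominal_length (samples)."""
    if pitch_period <= 0 or nominal_length <= 0:
        raise ValueError("pitch_period and nominal_length must be positive")
    multiples = max(1, round(nominal_length / pitch_period))
    return multiples * pitch_period


def subframe_starts(frame_length, pitch_period):
    """Subframe start positions, derived from the pitch period alone."""
    return list(range(0, frame_length, pitch_period))


# Example: a nominal 160-sample frame with a 50-sample pitch period.
length = pitch_synchronous_frame_length(50, 160)
print(length, subframe_starts(length, 50))
```

With the frame made up of whole pitch periods, the receiving side can place the subframe boundaries from the decoded pitch period, which is why no subframe phase has to be transmitted.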

As another configuration of the present invention, the drive signal restoration circuit 240, the synthesis filter circuit 250, the interpolation circuit 255, and the subtraction circuit 120 in FIG. 1(a) may be omitted. In this case the transmitting side need not synthesize the speech signal, and the device configuration can be simplified.

Note that, as is well known in the field of digital signal processing, the autocorrelation function can also be calculated from the power spectrum, and the cross-correlation function can also be calculated from the cross-power spectrum.

These relationships are explained in detail in Chapter 8 of, for example, the book by A. V. Oppenheim et al. entitled "Digital Signal Processing" (Reference 9), so their explanation is omitted here.
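
The relationship noted above can be illustrated with a small Python sketch that computes a correlation sequence as the inverse transform of the cross-power spectrum and compares it with direct summation. A naive DFT is used to keep the example dependency-free; the function names are hypothetical, real-valued signals are assumed, and the autocorrelation is simply the case where both inputs are the same signal.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, adequate for a short illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def corr_via_spectrum(x, y, max_lag):
    """r[lag] = sum_t x[t] * y[t + lag], for 0 <= lag <= max_lag <= len(y)."""
    n = len(x) + len(y)  # zero-padding avoids circular wrap-around
    X = dft(list(x) + [0.0] * (n - len(x)))
    Y = dft(list(y) + [0.0] * (n - len(y)))
    cross_power = [xv.conjugate() * yv for xv, yv in zip(X, Y)]
    r = dft(cross_power, inverse=True)
    return [r[lag].real for lag in range(max_lag + 1)]


def corr_direct(x, y, max_lag):
    """Same quantity by direct summation in the time domain."""
    return [
        sum(x[t] * y[t + lag] for t in range(len(x)) if t + lag < len(y))
        for lag in range(max_lag + 1)
    ]


# Example: autocorrelation of a short signal, computed both ways.
signal = [1.0, 2.0, 3.0, 4.0]
print(corr_via_spectrum(signal, signal, 3))
print(corr_direct(signal, signal, 3))
```

In practice a fast Fourier transform would replace the naive transform; the two routes agree up to floating-point rounding.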

(Effects of the Invention) According to the present invention, the sound source signal is represented by a train of a plurality of pulses (multipulses). In addition, the periodicity of the speech signal is exploited: the pulses of a representative pitch section are obtained so as to represent the speech signal well and are transmitted, and the driving sound source signal is restored, and the speech synthesized, while these representative pitch pulses are changed smoothly over time. High-quality speech can therefore be synthesized even at a low transmission bit rate.

[Brief Description of the Drawings]

FIGS. 1(a) and 1(b) are block diagrams showing an embodiment of the high-efficiency speech encoding system according to the present invention; FIGS. 2(a) to 2(c) are diagrams showing an example of the processing performed in the drive signal calculation circuit 220; FIG. 3 is a diagram showing an example of the processing performed by the interpolation circuit; FIG. 4 is a block diagram showing the configuration of the synthesis side of the conventional system; and FIG. 5 is a diagram showing an example of a sound source signal represented by multipulses.

In the figures, 110 denotes a buffer memory circuit; 120, a subtraction circuit; 250 and 350, synthesis filter circuits; 200, a weighting circuit; 170, an impulse response calculation circuit; 180, an autocorrelation function calculation circuit; 210, a cross-correlation function calculation circuit; 220, a drive signal calculation circuit; 240 and 340, drive signal restoration circuits; 130, a pitch analysis circuit; 140, a K parameter calculation circuit; 150, a pitch encoding circuit; 160, a K parameter encoding circuit; 230, an encoding circuit; 255 and 335, interpolation circuits; 260, a multiplexer; 290, a demultiplexer; 300, a pulse decoding circuit; 320, a pitch decoding circuit; and 330, a K parameter decoding circuit.

[FIG. 2(b), FIG. 3, FIG. 4, and FIG. 5: drawings not reproduced]

Written Amendment

To: Commissioner of the Patent Office

1. Indication of the case: Patent Application No. 272435 of 1984 (Showa 59)
2. Title of the invention: Voice encoding system and apparatus
3. Person making the amendment
   Relationship to the case: Applicant
   5-33-1 Shiba, Minato-ku, Tokyo
   (423) NEC Corporation
   Representative: Tadahiro Sekimoto
4. Agent
5. Subject of the amendment
   (1) The claims section of the specification
   (2) The detailed description of the invention section of the specification
6. Contents of the amendment
   (1) The claims are amended as shown in the attached sheet.
   (2) On page 7, line 9 of the specification, 「スピクトル」 is corrected to 「スペクトル」 ("spectrum").
   (3) On page 7, line 17 of the specification, "the code and" is corrected to "the code and the code representing the representative sound source pulse train".
   (4) On page 9, line 13 of the specification, "the driving sound source" is corrected to "the restored driving sound source".
   (5) On page 17, line 1 of the specification, 「パスル」 is corrected to 「パルス」 ("pulse").
   (6) On page 18, line 2 of the specification, "hi, j" is corrected to "Kij".
   (7) On page 18, line 3 of the specification, "Ki, j-1" is corrected to "Kij-1".
   (8) On page 18, line 3 of the specification, "Ki, j+1" is corrected to "Kij+1".
   (9) On page 19, line 6 of the specification, 「サブフレーム位相」 is corrected to 「サブフレーム位相」 ("subframe phase").
   (10) On page 23, line 1 of the specification, "methods, etc." is corrected to "methods, methods of selecting a section that allows good speech to be reproduced, etc."

Agent: Patent Attorney Susumu Uchihara

Claims (attached sheet to the Written Amendment)

1. A speech encoding method characterized in that: on the transmitting side, a discrete speech signal is input; a spectral parameter representing the short-time spectral envelope and a pitch parameter representing the pitch are extracted from the speech signal and encoded; a representative sound source pulse train for representing the speech signal is obtained, based on the speech signal, the pitch parameter, and the spectral parameter, as a predetermined number of pulses for a representative section among the time sections determined according to the pitch parameter, and is encoded; and the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the representative sound source pulse train are combined and output; and, on the receiving side, the combined codes are input; the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the representative sound source pulse train are separated and decoded; the driving sound source signal is restored by applying to the sound source pulse train, based on the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth change; and the speech signal is synthesized based on the restored driving sound source signal and the restored spectral parameter.

2. A speech encoding apparatus characterized by comprising: a parameter calculation circuit that extracts, from an input speech signal, a spectral parameter representing the short-time spectral envelope and a pitch parameter representing the pitch, and encodes them; a drive signal calculation circuit that obtains, based on the speech signal, the pitch parameter, and the spectral parameter, a representative sound source pulse train for representing the speech signal as a predetermined number of pulses for a representative section among the time sections determined according to the pitch parameter, and encodes it; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output code of the drive signal calculation circuit.

3. A speech decoding apparatus characterized by comprising: a multiplexer circuit that receives a code sequence in which a code representing a pitch parameter, a code representing a spectral parameter, and a code representing a sound source pulse train are combined, and that separates and decodes the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the sound source pulse train; a driving sound source signal restoration circuit that restores the driving sound source signal by applying to the sound source pulse train, based on the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth change; and a synthesis filter circuit that synthesizes and outputs a speech signal based on the restored driving sound source signal and the decoded spectral parameter.

Claims

[Claims]

1. A speech encoding method characterized in that: on the transmitting side, a discrete speech signal is input; a spectral parameter representing a short-time spectral envelope and a pitch parameter representing a pitch are extracted from the speech signal and encoded; a predetermined number of representative sound source pulses for representing the speech signal are obtained, based on the pitch parameter and the spectral parameter, for a representative section among the time sections determined according to the pitch parameter; and the code representing the pitch parameter and the code representing the spectral parameter are combined and output; and, on the receiving side, the combined codes are input; the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the representative sound source pulse train are separated and decoded; a driving sound source signal is restored by applying to the sound source pulse train, based on the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth change; and the speech signal is synthesized based on the restored driving sound source signal and the restored spectral parameter.

2. A speech encoding apparatus comprising: a parameter calculation circuit that extracts and encodes, from an input speech signal, a spectral parameter representing a short-time spectral envelope and a pitch parameter representing a pitch; a drive signal calculation circuit that obtains and encodes a predetermined number of representative sound source pulses for representing the speech signal, for a representative section among the time sections determined according to the pitch parameter; and a multiplexer circuit that combines and outputs the output code of the parameter calculation circuit and the output code of the drive signal calculation circuit.

3. A speech decoding apparatus comprising: a multiplexer circuit that receives a code sequence in which a code representing a pitch parameter, a code representing a spectral parameter, and a code representing a sound source pulse train are combined, and that separates and decodes the code representing the pitch parameter, the code representing the spectral parameter, and the code representing the sound source pulse train; a driving sound source signal restoration circuit that restores a driving sound source signal by applying to the sound source pulse train, based on the decoded pitch parameter and the decoded sound source pulse train, processing that gives it a temporally smooth change; and a synthesis filter circuit that synthesizes and outputs a speech signal based on the driving sound source signal and the decoded spectral parameter.