JP2580123B2 - Speech synthesizer

Info

Publication number: JP2580123B2
Application number: JP61089359A
Authority: JP
Inventors: 昭一佐々部; 博雄北川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-04-18
Filing date: 1986-04-18
Publication date: 1997-02-12
Anticipated expiration: 2012-02-12
Also published as: JPS62245299A

Description

Technical field

The present invention relates to a speech synthesizer, and more particularly to a speech synthesizer driven by a residual signal.

Prior art

Conventionally, in speech synthesis methods that obtain synthesized speech by feeding spectral envelope parameters (LPC, PARCOR, LSP, etc.) and a sound source signal into a synthesis filter, the driving sound source has been an impulse train, an excised one-pitch-period residual waveform, or an average or representative residual waveform in voiced portions, and a residual waveform or white noise (an M-sequence, etc.) in unvoiced portions. Using the residual signal in this way does improve sound quality. However, because the residual signal is used only partially or selectively, sufficient intelligibility and sound quality cannot always be obtained.
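As an illustration only (not taken from the patent), the two classic excitation sources named above, an impulse train for voiced portions and M-sequence noise for unvoiced portions, might be generated as in the following Python sketch. The function names, the 7-bit register length, and the tap positions are assumptions chosen for the example.

```python
def impulse_train(n_samples, pitch_period):
    """Unit impulses spaced one pitch period apart (voiced excitation)."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]


def m_sequence(n_samples, taps=(7, 6), seed=1):
    """+1/-1 pseudo-noise from a Fibonacci LFSR (unvoiced excitation).

    The default taps give a maximal-length 7-bit register, so the
    output repeats every 2**7 - 1 = 127 samples.
    """
    nbits = max(taps)
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_samples):
        # Emit the least significant bit as a bipolar sample.
        out.append(1.0 if state & 1 else -1.0)
        # XOR the tapped bits to form the bit shifted in at the top.
        feedback = 0
        for t in taps:
            feedback ^= (state >> (nbits - t)) & 1
        state = (state >> 1) | (feedback << (nbits - 1))
    return out
```

Both sources are fixed patterns that carry no information from the analysed speech. The residual-based sources discussed in this document differ from them in exactly that respect.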

Object

The present invention has been made in view of the above circumstances. In particular, its object is to obtain synthesized speech of higher quality than the prior art by generating the driving sound source signal from the residual signal obtained by inverse filtering.
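The residual signal mentioned here is the output of the inverse (analysis) filter. The following is a minimal sketch, assuming the spectral envelope parameters are LPC predictor coefficients estimated by the autocorrelation method with the Levinson-Durbin recursion. The patent does not specify the analysis procedure, and all function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return predictor coefficients a[0..order-1] and the final error.

    The prediction model is s[n] ~ sum_k a[k] * s[n - 1 - k].
    Assumes r[0] > 0 and a frame that is not perfectly predictable.
    """
    a = [0.0] * (order + 1)          # a[0] unused; a[i] is the i-th tap
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection (PARCOR) coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def inverse_filter(s, a):
    """Prediction residual e[n] = s[n] - sum_k a[k] * s[n - 1 - k]."""
    p = len(a)
    return [s[n] - sum(a[k] * s[n - 1 - k] for k in range(min(p, n)))
            for n in range(len(s))]
```

For a signal that an all-pole model fits exactly, the residual collapses to the original excitation. For real speech it keeps whatever the spectral envelope fails to capture, which is why reusing it as the driving source can improve quality.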

Configuration

To achieve the above object, the present invention provides a speech synthesizer that synthesizes a speech waveform using spectral envelope parameters extracted from a speech waveform and a residual waveform extracted by inverse filtering. The synthesizer comprises: a spectral envelope parameter storage unit that stores the spectral envelope parameters extracted from the speech waveform; a residual waveform storage unit that stores the residual waveform obtained from those parameters and a residual waveform obtained by transforming that residual waveform; a driving sound source generation control unit that generates control data specifying whether the residual waveform obtained by inverse filtering is used directly as the driving sound source or is used after transformation; a driving sound source generation unit that creates driving sound source data from the residual waveform data read from the residual waveform storage unit, in accordance with the control data from the driving sound source generation control unit; and a synthesis filter that combines the driving sound source data created by the driving sound source generation unit with the parameters from the spectral envelope parameter storage unit. In plosive unvoiced portions, the residual waveform obtained by inverse filtering is used directly as the driving sound source; in other unvoiced portions, a waveform obtained by transforming the extracted residual waveform is used as the driving sound source. The invention is described below on the basis of an embodiment.
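The selection rule stated above can be pictured as a small piece of control logic. The sketch below is only one possible reading: the type names, field names, and the three-way segment classification are hypothetical, and the patent does not say how segments are classified or how the control data are encoded.

```python
from dataclasses import dataclass
from enum import Enum


class Segment(Enum):
    VOICED = "voiced"
    PLOSIVE = "plosive unvoiced"
    OTHER_UNVOICED = "other unvoiced"


@dataclass(frozen=True)
class ControlData:
    use_direct_residual: bool  # True: residual as is; False: transformed residual
    pitch_period: int          # in samples; 0 when the segment is unvoiced
    amplitude: float


def make_control(segment, pitch_period=0, amplitude=1.0):
    """Plosive unvoiced segments use the residual directly; all others
    use a transformed residual. Pitch is kept only for voiced segments."""
    direct = segment is Segment.PLOSIVE
    period = pitch_period if segment is Segment.VOICED else 0
    return ControlData(direct, period, amplitude)
```

For example, `make_control(Segment.PLOSIVE)` asks for the stored residual unchanged, while `make_control(Segment.VOICED, pitch_period=80)` asks for a transformed residual repeated at an 80-sample pitch period.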

FIG. 1 is a block diagram for explaining an embodiment of the present invention. In the figure, 1 is a spectral envelope parameter storage unit, 2 is a residual waveform storage unit, 3 is a driving sound source generation control unit, 4 is a driving sound source generation unit, 5 is a synthesis filter unit, 6 is a D/A conversion unit, and 7 is an output terminal. In the present invention, the residual waveform produced when the spectral envelope parameters are extracted is used as the driving sound source waveform, either directly or after transformation.

In voiced portions, the driving sound source waveform may be formed as follows. The original speech waveform is inverse filtered using the spectral envelope parameters (LPC, LSP, etc.) to obtain a prediction residual waveform; only the phase characteristics of that waveform are altered so that its components share the same phase; and one pitch period of the resulting residual waveform is concatenated repeatedly and shaped, for example by setting its amplitude. In unvoiced portions, the prediction residual waveform may be used directly. Alternatively, within unvoiced portions, control may be performed so that plosive parts use the residual waveform directly as the driving sound source, while fricative parts use a driving sound source formed by randomly concatenating several short-time representative residuals extracted from the residual waveform.
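The two transformations described in this paragraph can be sketched as follows. This is a hedged illustration, not the patent's procedure: "same phase" is interpreted here as zero phase (keeping the DFT magnitudes of one pitch period and discarding the phases), amplitude shaping is reduced to a single gain, and the function names are hypothetical.

```python
import cmath
import math
import random


def zero_phase_period(period):
    """Keep the DFT magnitudes of one pitch period and set every phase to 0."""
    n = len(period)
    mags = [abs(sum(period[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]
    # Inverse DFT of a real, symmetric, zero-phase spectrum is a cosine sum.
    return [sum(mags[k] * math.cos(2 * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def voiced_excitation(period, n_samples, amplitude=1.0):
    """Repeat the zero-phase pitch period and apply a gain."""
    zp = zero_phase_period(period)
    return [amplitude * zp[i % len(zp)] for i in range(n_samples)]


def fricative_excitation(representatives, n_samples, rng):
    """Randomly concatenate short representative residual segments.

    Every segment in `representatives` must be non-empty.
    """
    out = []
    while len(out) < n_samples:
        out.extend(rng.choice(representatives))
    return out[:n_samples]
```

Passing a seeded `random.Random` instance as `rng` makes the fricative excitation reproducible. A direct DFT is used only to keep the sketch dependency-free; an FFT would be the practical choice for realistic pitch-period lengths.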

In FIG. 1, the spectral envelope parameters extracted from the speech waveform are stored in the spectral envelope parameter storage unit 1. The residual waveform obtained by inverse filtering the original speech waveform with those parameters, together with residual waveforms obtained by transforming it, is stored in the residual waveform storage unit 2.

The driving sound source generation unit 4 selects, concatenates, and shapes the residual waveform data read from the residual waveform storage unit 2 in accordance with the control data from the control unit 3 (pitch, amplitude, an indication of whether to use the prediction residual waveform or a transformed residual waveform, and so on), thereby creating the driving sound source data. These data are sent, together with the spectral envelope parameter data, to the synthesis filter unit 5, which is a vocal tract model circuit. The output of the synthesis filter unit 5 is converted into an analog signal by the D/A conversion unit 6, and the synthesized speech signal is output at the output terminal 7.
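As a rough sketch of the final stage, the synthesis filter can be modelled as an all-pole filter driven by the excitation. The example assumes direct-form LPC predictor coefficients (other parameter types such as LSP would first need conversion) and a single fixed coefficient set rather than frame-by-frame updates. The function names are hypothetical.

```python
def inverse_filter(s, a):
    """Analysis: e[n] = s[n] - sum_k a[k] * s[n - 1 - k]."""
    p = len(a)
    return [s[n] - sum(a[k] * s[n - 1 - k] for k in range(min(p, n)))
            for n in range(len(s))]


def synthesis_filter(excitation, a):
    """Synthesis: y[n] = e[n] + sum_k a[k] * y[n - 1 - k] (all-pole model)."""
    y = []
    for n, e in enumerate(excitation):
        y.append(e + sum(a[k] * y[n - 1 - k] for k in range(min(len(a), n))))
    return y


# Driving the synthesis filter with the unmodified residual reproduces
# the analysed waveform, up to floating-point rounding.
speech = [0.0, 1.0, 0.5, -0.25, 0.8, -0.6, 0.1]
coeffs = [0.5, -0.3]
residual = inverse_filter(speech, coeffs)
resynth = synthesis_filter(residual, coeffs)
```

The round trip shows why the direct residual is the highest-fidelity driving source: the two filters are exact inverses of each other. Transformed residuals give up some of that fidelity in exchange for control over pitch and amplitude.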

Effects

As is clear from the above description, according to the present invention, the sound quality and intelligibility of synthesized speech can be improved by using, either directly or after transformation, the residual waveform produced when spectral envelope parameters are extracted from a natural speech waveform.

[Brief description of the drawings]

FIG. 1 is a block diagram for explaining an example of a system used to implement the speech synthesizer according to the present invention.

1: spectral envelope parameter storage unit, 2: residual waveform storage unit, 3: driving sound source generation control unit, 4: driving sound source generation unit, 5: synthesis filter, 6: D/A conversion unit, 7: output terminal.

Claims

(57) [Claims]

1. A speech synthesizer for synthesizing a speech waveform using spectral envelope parameters extracted from a speech waveform and a residual waveform extracted by inverse filtering, comprising: a spectral envelope parameter storage unit that stores the spectral envelope parameters extracted from the speech waveform; a residual waveform storage unit that stores the residual waveform obtained from the parameters and a residual waveform obtained by transforming that residual waveform; a driving sound source generation control unit that generates control data specifying whether the residual waveform obtained by inverse filtering is used directly as a driving sound source or is used after transformation; a driving sound source generation unit that creates driving sound source data from the residual waveform data read from the residual waveform storage unit, in accordance with the control data from the driving sound source generation control unit; and a synthesis filter that combines the driving sound source data created by the driving sound source generation unit with the parameters from the spectral envelope parameter storage unit; wherein, in plosive unvoiced portions, the residual waveform obtained by inverse filtering is used directly as the driving sound source, and in other unvoiced portions, a waveform obtained by transforming the extracted residual waveform is used as the driving sound source.