JPH07311595A

JPH07311595A - Voice synthesizer

Info

Publication number: JPH07311595A
Application number: JP6128310A
Authority: JP
Inventors: Yoshiharu Ito; 嘉治伊藤; Ichiro Ando; 一郎安東
Original assignee: Toyo Communication Equipment Co Ltd
Current assignee: Toyo Communication Equipment Co Ltd
Priority date: 1994-05-18
Filing date: 1994-05-18
Publication date: 1995-11-28

Abstract

PURPOSE: To generate a synthesized voice having a natural and enjoyable tone quality by inserting a specific filter between a driving sound source and a vocal tract function filter. CONSTITUTION: Voice waveforms input to an analysis section 1 are accumulated in a buffer 3 and divided into blocks of a prescribed length. The blocked voice is passed to a vocal tract analysis section 4, a voiced/unvoiced analysis section 5, and a pitch extracting section 6. The pitch period obtained by the section 6 is input to a pulse sound source generating section 7, which generates an impulse train with the corresponding interval. The pulse train is then input to a 1/fⁿ filter 8 (where n is a positive real number), filtered, and output. A switch 9 is changed over according to the voiced/unvoiced decision of the section 5: in the voiced case, the output of the 1/fⁿ filter 8 is input to a synthesis filter 11 as the driving sound source; in the unvoiced case, the output of a noise sound source 10 is input instead.

Description

Detailed Description of the Invention

[0001]

[Industrial field of application] The present invention relates to a voice synthesizer that outputs synthesized speech generated according to an input speech waveform, and more particularly to a voice synthesizer capable of generating synthesized speech with a natural, easy-to-listen sound quality.

[0002]

[Prior art] In general, a conventional voice synthesizer of the type in which a sound source drives a vocal tract function filter uses as its sound source either noise (for unvoiced sound) and a pulse train, or representative residuals. The representative residuals are obtained by applying the derivative of the vocal tract function to the speech signal and classifying the resulting residuals by their characteristics. However, speech synthesized in this way has degraded low-frequency components compared with real speech, so the synthesized sound is mechanical, lacks naturalness, and sounds unnatural to the listener.

[0003]

[Object] The present invention has been made in view of the above circumstances, and its object is to provide a voice synthesizer capable of generating synthesized speech with a natural, easy-to-listen sound quality.

[0004]

[Summary of the invention] To achieve the above object, the present invention is a voice synthesizer that synthesizes speech by driving a vocal tract function filter with a driving sound source, characterized in that a 1/fⁿ filter (n is a positive real number) is inserted between the driving sound source and the vocal tract function filter. Another feature of the present invention is that, in a voice synthesizer having an analysis unit that analyzes an input speech waveform to obtain speech parameters and a synthesis unit that drives a vocal tract function filter with a driving sound source to obtain an output synthesized speech waveform from those speech parameters, a 1/√f filter is provided between the driving sound source and the vocal tract function filter in the synthesis unit.
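As a hedged illustration of what a 1/fⁿ filter can look like in discrete time: one common realization (an assumption here, since the text does not prescribe a general construction) takes the impulse response to be the power-series coefficients of (1 − z⁻¹)⁻ⁿ, whose amplitude response falls roughly as 1/fⁿ at low frequencies. For n = 1/2 the ratio between successive coefficients reduces to (2i − 1)/(2i), which is the ratio that appears in equation (1) of paragraph [0009]. The function name `one_over_f_n_coeffs` and the `scale` parameter are hypothetical.

```python
import math


def one_over_f_n_coeffs(n, num_taps, scale=1.0):
    """Return the first `num_taps` coefficients of scale * (1 - z^-1)^(-n).

    Assumption: this fractional-integrator form is one possible way to get
    a 1/f^n amplitude roll-off; it is not taken from the patent text.
    """
    if n <= 0:
        raise ValueError("n must be a positive real number")
    coeffs = [scale]
    for i in range(1, num_taps):
        # Generalized binomial series: h(i) = h(i-1) * (i - 1 + n) / i
        coeffs.append(coeffs[-1] * (i - 1 + n) / i)
    return coeffs


# n = 1/2 with scale sqrt(pi) follows the same recurrence as equation (1).
half = one_over_f_n_coeffs(0.5, 4, scale=math.sqrt(math.pi))
```

With n = 1 every coefficient equals `scale`, which is the plain running-sum integrator; fractional n gives a slower decay of the amplitude response.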

[0005]

[Embodiment] The present invention is described below on the basis of the illustrated embodiment. FIG. 1 is a block diagram showing an embodiment of the voice synthesizer according to the present invention. As shown in FIG. 1, the voice synthesizer has: a buffer 3 to which an input speech waveform is input; a vocal tract analysis unit 4, a voiced/unvoiced analysis unit 5, and a pitch extraction unit 6, each connected to the buffer 3; a pulse sound source generation unit 7 connected to the pitch extraction unit 6; a 1/√f filter 8 connected to the pulse sound source generation unit 7; a switch 9 connected to the 1/√f filter 8 and to the voiced/unvoiced analysis unit 5; a noise sound source 10 connected to the switch 9; and a synthesis filter (vocal tract function filter) 11 connected to the switch 9 and to the vocal tract analysis unit 4.

[0006] The buffer 3, the vocal tract analysis unit 4, the voiced/unvoiced analysis unit 5, and the pitch extraction unit 6 constitute the analysis unit 1. The pulse sound source generation unit 7, the 1/√f filter 8, the switch 9, the noise sound source 10, and the synthesis filter 11 constitute the synthesis unit 2.

[0007] Next, the operation of the voice synthesizer configured as above is described. Broadly, the analysis unit 1 analyzes the input speech waveform to obtain speech parameters, and the synthesis unit 2 produces the output synthesized speech waveform from the speech parameters obtained by the analysis unit 1.

[0008] In more detail, the input speech waveform supplied to the analysis unit 1 is stored in the buffer 3 and divided into blocks of a fixed length. The speech blocked by the buffer 3 is passed to the vocal tract analysis unit 4, the voiced/unvoiced analysis unit 5, and the pitch extraction unit 6. The vocal tract analysis unit 4 obtains vocal tract information by frequency analysis. The voiced/unvoiced analysis unit 5 determines whether the input speech is voiced or unvoiced. The pitch extraction unit 6 analyzes the fundamental frequency (pitch frequency) when the input speech is voiced and outputs the pitch period. The outputs of the vocal tract analysis unit 4, the voiced/unvoiced analysis unit 5, and the pitch extraction unit 6 are supplied to the synthesis unit 2.
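The blocking, voiced/unvoiced decision, and pitch extraction steps can be sketched as follows. The text does not say which algorithms units 5 and 6 use, so this sketch swaps in a simple autocorrelation method as a stand-in; the function names, the lag range, and the `voicing_threshold` value are all hypothetical. Vocal tract analysis (unit 4) is left out.

```python
import math
import random


def split_into_blocks(samples, block_len):
    """Split samples into consecutive fixed-length blocks (tail is dropped)."""
    return [samples[i:i + block_len]
            for i in range(0, len(samples) - block_len + 1, block_len)]


def autocorr(block, lag):
    """Unnormalized autocorrelation of one block at the given lag."""
    return sum(block[k] * block[k + lag] for k in range(len(block) - lag))


def analyze_block(block, min_lag, max_lag, voicing_threshold=0.5):
    """Return (is_voiced, pitch_period_in_samples or None).

    Assumption: voicing is judged from the normalized autocorrelation peak.
    The patent only states that a voiced/unvoiced decision is made.
    Requires max_lag < len(block).
    """
    energy = autocorr(block, 0)
    if energy == 0.0:
        return False, None
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorr(block, lag))
    is_voiced = autocorr(block, best_lag) / energy > voicing_threshold
    return is_voiced, (best_lag if is_voiced else None)


# Usage: a periodic block (period 40 samples) and a noise block.
periodic = [math.sin(2.0 * math.pi * k / 40.0) for k in range(400)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
voiced_result = analyze_block(periodic, 20, 100)
unvoiced_result = analyze_block(noise, 20, 100)
```

The pitch period is only reported for voiced blocks, mirroring the statement that unit 6 analyzes the fundamental frequency when the input is voiced.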

[0009] The pitch period obtained by the pitch extraction unit 6 is input to the pulse sound source generation unit 7, which generates an impulse train whose interval corresponds to the pitch period, as shown in FIG. 2. This pulse train is then input to the 1/√f filter 8, where it is filtered according to equation (1) below and output.

$$E_\phi(0) = \sqrt{\pi}, \qquad E_\phi(i) = \frac{2i-1}{2i}\,E_\phi(i-1) \quad (1 \le i \le N) \tag{1}$$

The output waveform of the 1/√f filter 8 is shown in FIG. 3. As FIG. 3 shows, the pulse train generated here has a 1/√f attenuation characteristic over the interval of E_φ(i). In other words, the filter 8 can be said to act as a 1/fⁿ filter (n is a positive real number).
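A minimal sketch of this step, under one reading of equation (1): the values E_φ(0), …, E_φ(N) are treated as the impulse response of an FIR filter that is convolved with the impulse train. That interpretation, the function names, and the tap count are assumptions; the text gives only the recurrence.

```python
import math


def impulse_train(pitch_period, length):
    """Unit impulses spaced `pitch_period` samples apart (cf. FIG. 2)."""
    return [1.0 if k % pitch_period == 0 else 0.0 for k in range(length)]


def inv_sqrt_f_coeffs(num_taps):
    """Coefficients E(0)..E(N) from equation (1), with N = num_taps - 1."""
    e = [math.sqrt(math.pi)]
    for i in range(1, num_taps):
        e.append(e[-1] * (2 * i - 1) / (2 * i))
    return e


def fir_filter(x, h):
    """Direct-form convolution of x with h, truncated to len(x) samples."""
    return [sum(h[j] * x[k - j] for j in range(min(k, len(h) - 1) + 1))
            for k in range(len(x))]


# Usage: pitch period of 8 samples, 8 filter taps (both hypothetical values).
pulses = impulse_train(8, 24)
e = inv_sqrt_f_coeffs(8)
shaped = fir_filter(pulses, e)
```

When the tap count does not exceed the pitch period, each pitch interval of `shaped` simply repeats the decaying coefficient sequence. The recurrence is equivalent to E(i) = Γ(i + 1/2) / Γ(i + 1), so the coefficients decay roughly as 1/√i.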

[0010] The switch 9 is then changed over according to the voiced/unvoiced decision of the voiced/unvoiced analysis unit 5. In the voiced case, the output of the 1/√f filter 8 is input to the synthesis filter (vocal tract function filter) 11 as the driving sound source; in the unvoiced case, the output of the noise sound source 10 is input instead. The synthesis filter 11 takes the vocal tract information obtained by the vocal tract analysis unit 4 as its coefficients and, driven by the driving sound source, produces and outputs the synthesized speech waveform.
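The switching and synthesis step can be sketched as below. The text says only that the synthesis filter uses the vocal tract information as coefficients; the all-pole (linear-prediction style) recursion used here is an assumed form, and every name in the block is hypothetical.

```python
import random


def white_noise(length, seed=0):
    """Stand-in for noise sound source 10 (Gaussian white noise)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def select_excitation(is_voiced, filtered_pulses, noise):
    """Switch 9: choose the driving sound source for the current block."""
    return filtered_pulses if is_voiced else noise


def all_pole_synthesis(excitation, a):
    """Assumed all-pole form: y[k] = x[k] - sum_{m=1..p} a[m-1] * y[k-m]."""
    y = []
    for k, x in enumerate(excitation):
        acc = x
        for m, coeff in enumerate(a, start=1):
            if k - m >= 0:
                acc -= coeff * y[k - m]
        y.append(acc)
    return y


# Usage with a single, stable pole at z = 0.5 (illustrative coefficient only).
voiced_source = [1.0, 0.0, 0.0, 0.0]
noise_source = white_noise(4)
out = all_pole_synthesis(
    select_excitation(True, voiced_source, noise_source), [-0.5])
```

In a full system the coefficient list `a` would be refreshed for every block from the vocal tract analysis, and the filter state would be carried across block boundaries; both are omitted here for brevity.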

[0011] With this configuration, the low-frequency range of the synthesized speech spectrum is emphasized in voiced intervals, so synthesized speech with a natural, easy-to-listen sound quality can be produced. For example, in an experiment using the spoken phrase "aikawarazu" ("as usual"), the conventional device gave the synthesis result shown in FIG. 4, whereas the device of the present invention gave a clear (more darkly shaded) synthesized speech result, as shown in FIG. 5.
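The low-frequency emphasis can be checked numerically on the coefficients of equation (1). The sketch below assumes, as an interpretation, that those coefficients form an FIR impulse response; it evaluates the amplitude response at a few normalized frequencies and compares them. The tap count and the frequencies are arbitrary choices made for illustration.

```python
import cmath
import math


def inv_sqrt_f_coeffs(num_taps):
    """Coefficients from equation (1): E(0) = sqrt(pi), E(i) = (2i-1)/(2i) E(i-1)."""
    e = [math.sqrt(math.pi)]
    for i in range(1, num_taps):
        e.append(e[-1] * (2 * i - 1) / (2 * i))
    return e


def magnitude_at(h, freq):
    """|H| at normalized frequency `freq`, in cycles per sample."""
    w = 2.0 * math.pi * freq
    return abs(sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(h)))


h = inv_sqrt_f_coeffs(8192)
low = magnitude_at(h, 0.01)
mid = magnitude_at(h, 0.04)
high = magnitude_at(h, 0.16)
# For a 1/sqrt(f) amplitude response, quadrupling f should roughly halve |H|.
ratio = low / mid
```

Under these assumptions `ratio` should come out near 2, up to truncation error from the finite tap count, and the amplitude decreases monotonically across the three frequencies, which is consistent with the low range being emphasized relative to a flat pulse-train spectrum.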

[0012]

[Effects of the invention] As described above, in a voice synthesizer that drives a vocal tract function filter (synthesis filter) with a driving sound source, the present invention inserts a 1/fⁿ filter (n is a positive real number) between the driving sound source and the vocal tract function filter, giving the configuration sound source → 1/fⁿ filter → vocal tract function filter. As a result, the low-frequency range of the synthesized speech spectrum is emphasized in voiced intervals, and synthesized speech with a natural, easy-to-listen sound quality can be produced.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the voice synthesizer according to the present invention.

FIG. 2 is a waveform diagram showing the impulse train generated by the pulse sound source generation unit shown in FIG. 1.

FIG. 3 is a diagram showing the output waveform of the 1/√f filter shown in FIG. 1.

FIG. 4 is a diagram showing the synthesized speech result of the conventional device.

FIG. 5 is a diagram showing the synthesized speech result of the device of the present invention.

[Explanation of symbols]

1 ... Analysis unit, 2 ... Synthesis unit, 3 ... Buffer, 4 ... Vocal tract analysis unit, 5 ... Voiced/unvoiced analysis unit,
6 ... Pitch extraction unit, 7 ... Pulse sound source generation unit,
8 ... 1/√f filter, 9 ... Switch,
10 ... Noise sound source, 11 ... Synthesis filter (vocal tract function filter)

Claims

[Claims]

1. A voice synthesizer that synthesizes a voice by driving a vocal tract function filter with a driving sound source, characterized in that a 1/fⁿ filter (n is a positive real number) is inserted between the driving sound source and the vocal tract function filter.

2. A voice synthesizer having an analysis unit that analyzes an input speech waveform to obtain speech parameters, and a synthesis unit that drives a vocal tract function filter with a driving sound source to obtain an output synthesized speech waveform from the speech parameters, characterized in that a 1/√f filter is provided between the driving sound source and the vocal tract function filter in the synthesis unit.