JPH03174597A - Voice synthesizer

Info

Publication number: JPH03174597A
Application number: JP1314750A
Authority: JP
Inventors: Nobuhide Yamazaki; 山崎　信英
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1989-12-04
Filing date: 1989-12-04
Publication date: 1991-07-29

Abstract

PURPOSE: To generate a more natural synthesized voice by multiplying a synthesized voice signal by the output of a fluctuation time-series memory with a multiplier, thereby adding amplitude fluctuations to the synthesized voice. CONSTITUTION: Amplitude fluctuations obtained from a natural voice by an amplitude fluctuation extracting means 1 are stored in advance in the fluctuation time-series memory 2a. For example, a steadily uttered vowel is used as the natural voice; the amplitude fluctuation extracting means 1 finds the maximum amplitude value in each pitch period of the natural voice and divides these maxima by their mean value for normalization to obtain the amplitude fluctuations. The voice signal output by a synthesizing filter 4 is multiplied by the amplitude fluctuations from a fluctuation generating circuit 2 to obtain a voice signal to which the amplitude fluctuations have been applied. Consequently, the amplitude fluctuations of the natural voice can be reflected faithfully and a more natural synthesized voice can be obtained.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical field]

The present invention relates to a speech synthesis device and, more particularly, to the configuration of a device for imparting naturalness to synthesized speech in a rule-based speech synthesis device.

[Prior art]

Conventional rule-based speech synthesis devices control the amplitude of synthesized speech according to predetermined rules. For example, there is a method that obtains an amplitude pattern in the same way that a pitch pattern is obtained from prosodic information.

Because human speech is produced by a living body, it contains various fluctuations, one of which is amplitude fluctuation. Conventionally, therefore, this amplitude fluctuation has been approximately synthesized and superimposed on the amplitude pattern. Various methods have been proposed for this approximate synthesis. For example:

(a) Japanese Examined Patent Publication No. Sho 62-49639 uses the output of a normal random number generating means as the fluctuation sequence (a method in which fluctuation data is added to the speech signal).
(b) Japanese Unexamined Patent Publication No. Sho 63-229499 applies an integrating filter to the output of a random number generating means to obtain the fluctuation sequence (a method in which the fluctuation sequence plus a constant is multiplied by the speech signal).
(c) Japanese Unexamined Patent Publication No. Sho 58-186800 gives a 1/f fluctuation to the amplitude pattern (how the fluctuation is applied is not described at all).
(d) Japanese Unexamined Patent Publication No. Sho 55-133099 adds fluctuation obtained from an amplifier to the reference voltage of a D/A converter.

In prior art (a), the output of a random sequence oscillator is converted into a normal random number sequence and used as the amplitude fluctuation. In (b), the amplitude fluctuation is obtained by passing a random sequence through an integrator. In (c), the 1/f fluctuation is approximated by passing a random sequence through a quantizer that has a table of 1/f characteristics. In (d), the output fluctuation of an amplifier with a very large gain is used. However, because these conventional methods only approximate amplitude fluctuation, some unnaturalness remained.

[Object]

The present invention was made in view of the circumstances described above, and in particular aims to obtain more natural synthesized speech by reflecting the amplitude fluctuations of a living body.

[Constitution]

To achieve the above object, the present invention is a rule-based speech synthesis device that determines the amplitude of synthesized speech based on fixed rules or stored data, characterized in that it has a fluctuation time-series memory and a multiplier, and imparts amplitude fluctuation to the synthesized speech by multiplying the synthesized speech signal by the output of the fluctuation time-series memory using the multiplier. The invention is described below based on embodiments.

FIG. 3 is a circuit configuration diagram for explaining an example of the amplitude fluctuation generation circuit used in the present invention. In the figure, 1 is an amplitude fluctuation extraction circuit and 2 is an amplitude fluctuation generation circuit. The amplitude fluctuation generation circuit 2 has a fluctuation time-series memory 2a and an address counter 2b, and the fluctuation time-series memory 2a stores the fluctuations digitally. The fluctuation time-series memory 2a is addressed, for example, by a D-bit address counter 2b. The address counter 2b is advanced once per pitch period by the pitch period signal. In another embodiment, the address counter may instead be driven by a clock oscillator that is controlled by the pitch period signal.
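The readout of memory 2a by counter 2b can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the class name `FluctuationGenerator`, the parameter `address_bits`, the wrap-around at 2^D entries, and the stored values are all hypothetical and are not taken from the patent text.

```python
class FluctuationGenerator:
    """Sketch of circuit 2: a fluctuation memory (2a) read by an address counter (2b)."""

    def __init__(self, memory, address_bits):
        size = 1 << address_bits  # assumption: a D-bit counter addresses 2**D entries
        if len(memory) != size:
            raise ValueError("memory must hold exactly 2**address_bits values")
        self.memory = list(memory)
        self.mask = size - 1
        self.address = 0

    def current(self):
        """Return the fluctuation value at the current address."""
        return self.memory[self.address]

    def on_pitch_pulse(self):
        """Advance the counter by one step; it wraps around like a D-bit counter."""
        self.address = (self.address + 1) & self.mask


# Hypothetical stored values; a real memory would hold data measured from speech.
gen = FluctuationGenerator([1.0, 1.02, 0.97, 1.01], address_bits=2)
readout = []
for _ in range(6):  # six pitch periods
    readout.append(gen.current())
    gen.on_pitch_pulse()
print(readout)
```

With a 2-bit counter the four stored values repeat after the fourth pitch period, which is the behavior one would expect from a free-running counter.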

The amplitude fluctuation obtained from a natural voice by the amplitude fluctuation extraction means 1 is stored in advance in the fluctuation time-series memory 2a. For example, a steadily uttered vowel is used as the natural voice; the amplitude fluctuation extraction means 1 finds the maximum amplitude value in each pitch period of the natural voice, and these maxima, normalized by dividing them by their mean value, are taken as the amplitude fluctuation.
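The extraction step described above (per-period maximum, then division by the mean) can be sketched in Python as follows. This is a minimal illustration, not the patent's circuit: the function name, the `pitch_marks` input (sample indices of pitch period boundaries, assumed to be known and strictly increasing), the use of the absolute value as "amplitude", and the sample data are all assumptions.

```python
def extract_amplitude_fluctuation(samples, pitch_marks):
    """Return one normalized peak amplitude per pitch period.

    samples:     waveform of a steadily uttered vowel (list of floats)
    pitch_marks: strictly increasing sample indices; consecutive pairs
                 delimit one pitch period (assumed to be given)
    """
    if len(pitch_marks) < 2:
        raise ValueError("need at least two pitch marks")
    peaks = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        # Assumption: "maximum amplitude" is taken as the largest absolute sample.
        peaks.append(max(abs(x) for x in samples[start:end]))
    mean_peak = sum(peaks) / len(peaks)
    # Dividing by the mean makes the series fluctuate around 1.0.
    return [p / mean_peak for p in peaks]


# Toy waveform with three pitch periods of three samples each.
vowel = [0.0, 0.9, -0.5, 0.0, 1.1, -0.6, 0.0, 1.0, -0.4]
fluctuation = extract_amplitude_fluctuation(vowel, [0, 3, 6, 9])
print(fluctuation)
```

Because the series is normalized to a mean of 1.0, it can be used directly as a per-period gain without changing the overall level of the synthesized speech.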

FIG. 1 and FIG. 2 are block diagrams for explaining embodiments of the speech synthesis device according to the present invention. In the figures, 2 is the amplitude fluctuation generation circuit, 3 is a sound source generation unit, 4 is a synthesis filter, and 5 is a multiplier. In the embodiment shown in FIG. 1, the speech signal output from the synthesis filter 4 is multiplied by the amplitude fluctuation from the fluctuation generation circuit 2 in the multiplier 5, yielding a speech signal to which amplitude fluctuation has been applied. As another embodiment, shown in FIG. 2, the same effect can be obtained by applying the amplitude fluctuation to the sound source signal instead. In a waveform-editing type speech synthesis device as well, a similar effect can be obtained by combining the synthesized speech waveform with the above amplitude fluctuation using a multiplier.
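As a rough illustration of the FIG. 1 arrangement, the following Python sketch scales each pitch period of an already synthesized signal by the next stored fluctuation value. The function name, the `pitch_marks` input, the wrap-around readout, and the numeric values are assumptions made for the example; the patent does not specify them.

```python
def apply_amplitude_fluctuation(signal, pitch_marks, fluctuation):
    """Multiply each pitch period of `signal` by one stored fluctuation value.

    signal:      synthesized speech samples (e.g. synthesis filter output)
    pitch_marks: increasing sample indices delimiting pitch periods (assumed given)
    fluctuation: contents of the fluctuation time-series memory
    """
    out = list(signal)  # samples outside the marked periods are left unchanged
    address = 0
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        gain = fluctuation[address]
        for i in range(start, end):
            out[i] = signal[i] * gain  # the multiplication done by element 5
        # Assumption: the readout wraps around when the memory is exhausted.
        address = (address + 1) % len(fluctuation)
    return out


flat = [1.0] * 6  # constant-amplitude toy signal, three periods of two samples
shaped = apply_amplitude_fluctuation(flat, [0, 2, 4, 6], [1.1, 0.9])
print(shaped)
```

For the FIG. 2 variant, the same function would be applied to the sound source signal before it enters the synthesis filter rather than to the filter output.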

It is also possible to store a plurality of different fluctuations in the fluctuation time-series memory and to switch among them, for example for each phoneme to be synthesized, so as to give more natural fluctuation.
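One way to picture this switching is a table of fluctuation series keyed by phoneme, sketched below in Python. The phoneme labels, the stored values, the function name `gain_for`, and the unity-gain fallback for phonemes without a stored series are all hypothetical choices for illustration.

```python
# Hypothetical memory contents: one fluctuation series per phoneme.
FLUCTUATION_BANKS = {
    "a": [1.03, 0.98, 1.00, 0.99],
    "i": [0.97, 1.02, 1.01],
}


def gain_for(phoneme, period_index, banks, default=(1.0,)):
    """Pick the fluctuation value for a given phoneme and pitch period.

    Falls back to unity gain for phonemes with no stored series (an assumption).
    """
    series = banks.get(phoneme, default)
    return series[period_index % len(series)]


gain_a = gain_for("a", 5, FLUCTUATION_BANKS)  # index 5 wraps to position 1
gain_k = gain_for("k", 3, FLUCTUATION_BANKS)  # no series stored for "k"
print(gain_a, gain_k)
```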

[Effect]

As is clear from the above description, according to the present invention, the fluctuation generation circuit of the speech synthesis device can faithfully reflect the amplitude fluctuations contained in a natural voice, so that synthesized speech with a higher degree of naturalness can be obtained.

[Brief description of the drawings]

FIGS. 1 and 2 are block diagrams each for explaining an embodiment of the present invention, and FIG. 3 is a block diagram for explaining an example of the amplitude fluctuation generation circuit used to implement the present invention.

1: amplitude fluctuation extraction circuit, 2: amplitude fluctuation generation circuit, 2a: fluctuation time-series memory, 2b: address counter, 3: sound source generation unit, 4: synthesis filter, 5: multiplier.

Claims

1. A speech synthesis device, being a rule-based speech synthesizer that determines the amplitude of synthesized speech based on fixed rules or stored data, characterized by comprising a fluctuation time-series memory and a multiplier, wherein the synthesized speech signal and the output of the fluctuation time-series memory are multiplied together by the multiplier to impart amplitude fluctuation to the synthesized speech.