JPS60205597A

JPS60205597A - Voice synthesizer

Info

Publication number: JPS60205597A
Application number: JP59063936A
Authority: JP
Inventors: 敏郎柴沼; 金盛　亨; 末田　信
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1984-03-30
Filing date: 1984-03-30
Publication date: 1985-10-17

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Technical field of the invention

The present invention relates to a speech synthesis device that outputs synthesized speech for an arbitrary input word.

(b) Prior art and problems

In recent years, speech synthesis devices have come into use in a variety of fields, for example in text reading devices that output an arbitrary input sentence as speech.

Conventionally, arbitrary sentences have been synthesized by holding parameters for fine-grained units such as phonemes and syllables, synthesizing arbitrary words from those parameters, and then synthesizing arbitrary sentences from the words. Because a synthesis method based on phonemes, syllables, and the like holds its parameters in such fine-grained units, however, some loss of sound quality has been unavoidable, owing to problems such as the interpolation characteristics between one sound and the next.

(c) Object of the invention

The object of the present invention is to overcome the above drawback and to make it possible to synthesize arbitrary sentences in speech that is extremely close to natural utterance.

(d) Structure of the invention

To achieve the above object, the present invention concerns a speech synthesis device that has phoneme/syllable parameter storage means for storing speech parameters for the phonemes, syllables, and the like that constitute the reading information of an arbitrary input word, and that is configured so that, for an arbitrary input word, the corresponding phoneme/syllable parameters can be taken out of the phoneme/syllable parameter storage means and speech can be synthesized. The invention is characterized in that the device further has word/phrase parameter storage means for storing speech parameters corresponding to words, phrases, and the like that are longer than the phonemes, syllables, and the like and have various lengths of two or more morae (beats), and in that the device is configured to perform speech synthesis processing for an arbitrary input word using the contents of the word/phrase parameter storage means together with the contents of the phoneme/syllable parameter storage means.

(e) Embodiment of the invention

The present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of a speech synthesis device according to one embodiment of the present invention. In the figure, 1 is an arbitrary sentence holding section, 2 is a sentence analysis section, 3 is a word dictionary section, 4 is a parameter conversion section, 5 is a phoneme/syllable parameter storage section, 6 is a word/phrase parameter storage section, 7 is a prosody setting section, and 8 is a speech synthesis section.

The device of FIG. 1 operates as follows.

First, a run of text taken from the arbitrary sentence holding section 1 is input to the sentence analysis section 2. The sentence analysis section 2 refers to the word dictionary section 3, divides the input text into words, and assigns a reading to each word.

For example, the sentence "任意文を合成する" ("synthesize an arbitrary sentence") is divided into the words "任意", "文", "を", "合成", and "する", and the corresponding sequence of readings "にんい" (nin'i), "ぶん" (bun), "を" (wo), "ごうせい" (gousei), and "する" (suru) is created.
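The dictionary lookup described above can be sketched as follows. The text does not say how the sentence analysis section chooses word boundaries, so greedy longest-match segmentation is an assumption made here for illustration, and `WORD_DICT` and `analyze` are hypothetical names.

```python
# Hypothetical word dictionary: surface form -> reading.
# The patent does not specify the dictionary format; this is a stand-in.
WORD_DICT = {
    "任意": "にんい",
    "文": "ぶん",
    "を": "を",
    "合成": "ごうせい",
    "する": "する",
}


def analyze(sentence, dictionary=WORD_DICT):
    """Split a sentence into (word, reading) pairs.

    Uses greedy longest match, which is an assumption; the source only
    says that the word dictionary is consulted.
    """
    max_len = max(len(word) for word in dictionary)
    result = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if candidate in dictionary:
                result.append((candidate, dictionary[candidate]))
                i += length
                break
        else:
            raise ValueError(f"no dictionary entry at position {i}: {sentence[i:]!r}")
    return result


if __name__ == "__main__":
    for word, reading in analyze("任意文を合成する"):
        print(word, reading)
```

A real analyzer would also need to resolve ambiguous segmentations and carry accent and grammar information, which this sketch leaves out.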

Next, this sequence of word readings is input to the parameter conversion section 4 and to the prosody setting section 7. The prosody setting section 7 creates a pitch pattern for each word on the basis of the accent type information, mora count information, grammar information, and so on that are stored in the word dictionary 3 together with the word's reading information. The prosody setting section 7 is not directly related to the present invention.

The parameter conversion section 4, meanwhile, uses the phoneme/syllable parameter storage section 5 and the word/phrase parameter storage section 6 to perform parameter conversion according to the operation flow shown in FIG. 2.

That is, for one input word, it first refers to the word/phrase parameter storage section 6 and checks whether a word/phrase parameter corresponding to that word exists. If one exists, it creates a parameter time series from the corresponding word/phrase parameter.

If no word/phrase parameter corresponding to the word exists in the word/phrase parameter storage section 6, the word is broken down into finer phonemes, syllables, and the like, the phoneme/syllable parameter storage section 5 is referred to, and a parameter time series is created from the phoneme/syllable parameters corresponding to those phonemes, syllables, and the like. Once the parameter time series for one word has been obtained in this way and sent to the speech synthesis section 8, the next word in the input word sequence is taken out and processed in the same way.
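The lookup-then-fallback flow of the parameter conversion section can be sketched as below. This is a minimal illustration under stated assumptions: the store contents, the frame values, and the names `PHRASE_PARAMS`, `SYLLABLE_PARAMS`, `to_parameter_series`, and `convert` are all hypothetical, and a word is decomposed one kana at a time, which is a simplification of "phonemes, syllables, and the like".

```python
# Hypothetical stores: reading -> list of parameter frames.
# The numbers are placeholders, not real speech parameters.
PHRASE_PARAMS = {
    "ごうせい": [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],
}
SYLLABLE_PARAMS = {
    "に": [[0.1, 0.5]],
    "ん": [[0.2, 0.4]],
    "い": [[0.3, 0.3]],
}


def to_parameter_series(reading, phrase_params=PHRASE_PARAMS,
                        syllable_params=SYLLABLE_PARAMS):
    """Return (frames, source) for one word reading.

    The word/phrase store is consulted first; only when it has no entry
    is the reading broken into smaller units and looked up piece by piece.
    """
    if reading in phrase_params:
        return list(phrase_params[reading]), "phrase"
    frames = []
    for unit in reading:  # assumption: one kana character per unit
        frames.extend(syllable_params[unit])
    return frames, "syllable"


def convert(readings):
    """Process the input word sequence one word at a time, in order."""
    for reading in readings:
        yield to_parameter_series(reading)


if __name__ == "__main__":
    for frames, source in convert(["にんい", "ごうせい"]):
        print(source, frames)
```

Checking the longer unit first means the fine-grained store acts purely as a fallback, which matches the order of the two steps in the description.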

Next, the speech synthesis section 8 performs speech synthesis on the basis of the pitch pattern from the prosody setting section 7 and the parameter time series from the parameter conversion section 4, and produces the required speech output.

In the embodiment, the word/phrase parameter storage section 6 stores, as pairs, the readings of frequently used expressions (words, phrases, or longer sentence-level units) and the parameters corresponding to them. The phoneme/syllable parameter storage section 5 likewise stores phonemes, syllables, and the like paired with their corresponding parameters.

(f) Effects of the invention

According to the present invention, parameters for frequently used words and phrases are stored with the word or phrase itself as the unit. When such a word or phrase appears in a sentence to be synthesized, the parameters stored for it are used, and the fine-grained unit parameters for synthesizing arbitrary words are used only when a word or phrase that has not been stored appears. Compared with the conventional method, this reduces the amount of interpolation between sounds and so improves sound quality.
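As a rough illustration of why interpolation decreases, the sketch below counts the unit-to-unit joins inside each word with and without a word/phrase store. It assumes one join between each pair of adjacent kana in a word synthesized from fine-grained units and no internal joins in a word taken whole from the store. The function name and the example store are hypothetical, and the count is a proxy for illustration, not a measure from the source.

```python
def count_internal_joins(readings, stored_phrases):
    """Return (joins without a phrase store, joins with the phrase store)."""
    # Conventional method: every word is built from fine-grained units.
    baseline = sum(len(reading) - 1 for reading in readings)
    # With the store: words found in it contribute no internal joins.
    with_store = sum(
        0 if reading in stored_phrases else len(reading) - 1
        for reading in readings
    )
    return baseline, with_store


if __name__ == "__main__":
    readings = ["にんい", "ぶん", "を", "ごうせい", "する"]
    print(count_internal_joins(readings, {"にんい", "ごうせい"}))
```

Joins between consecutive words are not counted here, since they occur under both methods.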

[Brief explanation of drawings]

FIG. 1 is a block diagram of a speech synthesis device according to one embodiment of the present invention, and FIG. 2 is a diagram showing the operation flow of the parameter conversion section. In FIG. 1, 1 is an arbitrary sentence holding section, 2 is a sentence analysis section, 3 is a word dictionary section, 4 is a parameter conversion section, 5 is a phoneme/syllable parameter storage section, 6 is a word/phrase parameter storage section, and 8 is a speech synthesis section.

Claims

[Scope of Claims]

(1) A speech synthesis device comprising phoneme/syllable parameter storage means for storing speech parameters for the phonemes, syllables, and the like that constitute the reading information of an arbitrary input word, and configured so that, for an arbitrary input word, the corresponding phoneme/syllable parameters can be taken out of the phoneme/syllable parameter storage means and speech can be synthesized, characterized in that word/phrase parameter storage means is provided for storing speech parameters corresponding to words, phrases, and the like that are longer than the phonemes, syllables, and the like and have various lengths of two or more morae (beats), and in that the device is configured to perform speech synthesis processing for an arbitrary input word using the contents of the word/phrase parameter storage means together with the contents of the phoneme/syllable parameter storage means.

(2) The speech synthesis device according to claim 1, characterized in that, for a given word or the like among the input words, the device first refers to the word/phrase parameter storage means and, when a word/phrase parameter corresponding to that word or the like exists, creates a speech parameter time series from that word/phrase parameter, and, when no corresponding word/phrase parameter exists in the word/phrase parameter storage means, breaks the word or the like down into finer phonemes, syllables, and the like and takes the corresponding phoneme/syllable parameters out of the phoneme/syllable parameter storage means to create a speech parameter time series.