JPH0477794A

JPH0477794A - Voice synthesizer

Info

Publication number: JPH0477794A
Application number: JP2190693A
Authority: JP
Inventors: Yoshiyuki Hara (原 義幸); Masaki Egawa (江川 雅樹)
Original assignee: Toshiba Corp; Toshiba Computer Engineering Corp
Current assignee: Toshiba Corp; Toshiba Computer Engineering Corp
Priority date: 1990-07-20
Filing date: 1990-07-20
Publication date: 1992-03-11
Anticipated expiration: 2014-11-15
Also published as: JP2977236B2

Abstract

PURPOSE: To provide a speech synthesizer that can effectively generate highly natural synthesized speech by controlling, according to nasalization information and long-vowel information, whether a sound is nasalized and whether it is lengthened when the reading information held in an accent dictionary is converted into a phoneme sequence. CONSTITUTION: The application of the nasalization rule and the long-vowel rule in the phoneme sequence verification section 4 is controlled according to the nasalization information and long-vowel information supplied from the input section 1, so that the phoneme sequence generated there differs accordingly. That is, the device controls whether a phoneme subject to nasalization is converted into a nasalized phoneme and whether the "i" sound is replaced with the long-vowel mark [ー], and generates the phoneme sequence that corresponds to the input instruction. As a result, merely by controlling the input of the nasalization information and long-vowel information according to the purpose of the synthesis, speech suited to that purpose or regional usage can be synthesized with high naturalness.

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial Field of Application) The present invention relates to a speech synthesis device that can generate, from a character code string, natural-sounding synthesized speech suited to its purpose.

(Prior Art) In recent years, various speech synthesis devices have been developed that analyze an input character code string to obtain its phoneme sequence and prosodic information, apply predetermined rules to these to generate a phoneme parameter string and a prosodic parameter string, and generate synthesized speech from these parameter strings. Compared with conventional devices based on recording and editing, a device based on this kind of synthesis by rule has the advantage that it can generate synthesized speech for arbitrary words and sentences relatively easily. Together with speech recognition technology, it is therefore attracting attention as an important technology for realizing highly natural man-machine interfaces.

Speech synthesis devices of this type are used, for example, to read aloud text prepared on a word processor, and their range of application is tending to expand.

Recently, such devices have also incorporated functions for changing, for example, the pitch of the voice and the speaking rate, so that the synthesized speech can be shaped to some extent to the user's preference.

In Japanese speech, the ga-row sounds [ガ, ギ, グ, ゲ, ゴ, ギャ, ギュ, ギョ] have nasalized counterparts [カ゜, キ゜, ク゜, ケ゜, コ゜, キ゜ャ, キ゜ュ, キ゜ョ]. To handle these nasalized sounds, conventional speech synthesis devices prepare in advance both non-nasalized and nasalized ga-row speech segments, and register the corresponding information in the rule set and the accent dictionary. These speech segments are then used selectively to generate the synthesized speech. For example, [鏡] is synthesized as [カカ゜ミ].

Beyond the ga-row sounds, [銀行], for example, is likewise synthesized as [ギンコー].

However, such nasalization does not necessarily occur for every kind of word containing a ga-row sound, and in some linguistic regions these sounds are pronounced without nasalization.

Likewise, an i sound that follows an e-row sound, as typified by [先生], is pronounced in some cases as [センセー] and in others as [センセイ]. In this case, the long-vowel mark [ー] is generally registered in the accent dictionary reading at the position corresponding to [イ], so that the reading [せんせい] yields the information [センセー]; the long vowel is handled in this way.

However, once the long vowel is registered in the accent dictionary, the word can no longer be synthesized without lengthening. In other words, lengthened and non-lengthened synthesized speech cannot be obtained at will according to the user's preference or other needs.

(Problem to Be Solved by the Invention) Thus, conventionally, whether the speech to be synthesized is nasalized and whether it is lengthened have been left to the rules and entries registered in advance in the accent dictionary, and it has been difficult for the user to change this at will to obtain the intended speech with good naturalness.

The present invention has been made in view of these circumstances, and its object is to provide a speech synthesis device that allows nasalization and vowel lengthening to be specified easily and that can thereby effectively generate highly natural synthesized speech.

[Constitution of the Invention] (Means for Solving the Problem) The present invention relates to a speech synthesis device that analyzes an input character code string to obtain its phoneme sequence and prosodic information, generates a phoneme parameter string and a prosodic parameter string from these according to predetermined rules, and generates synthesized speech based on these parameter strings. In particular, the device comprises means for inputting nasalization information indicating whether the synthesized speech to be generated is to be nasalized. When the input nasalization information indicates nasalization, a predetermined nasalization rule is applied to obtain a phoneme sequence containing nasalized phonemes; when it indicates non-nasalization, the phoneme sequence is obtained without applying the nasalization rule.

The device also comprises means for inputting long-vowel information indicating whether an i sound that follows an e-row sound in the synthesized speech is to be lengthened. When the input long-vowel information indicates lengthening, a long vowel is generated in place of the i sound; conversely, when it indicates non-lengthening, the i sound is generated as it is.

(Operation) According to the present invention, inputting nasalization information that specifies whether the nasalization rule is enabled makes it easy to control whether the rule is applied, so that nasalized or non-nasalized speech can be synthesized selectively. Likewise, the long-vowel information makes it easy to control whether an i sound following an e-row sound is lengthened in synthesis. As a result, merely by controlling the input of the nasalization information and the long-vowel information according to the purpose of the synthesis, speech suited to that purpose or regional usage can be synthesized with good naturalness.

(Embodiment) A speech synthesis device according to an embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a schematic block diagram of the device of the embodiment. Reference numeral 1 denotes an input section for entering character code strings for words, sentences, and the like. The nasalization information and long-vowel information that characterize this device are also entered through the input section 1. The input character code string entered from the input section 1 is passed to the word matching section 2 and matched against the accent dictionary 3, while the nasalization information and long-vowel information are passed to the phoneme sequence verification section 4. The word matching section 2 matches the input character code string against the accent dictionary 3, in which the accent, part of speech, reading, and other information for a number of words are registered in advance. For each word that matches, it passes the accent and part-of-speech information to the accent type verification section 5 and the reading to the phoneme sequence verification section 4.

The phoneme sequence verification section 4 converts the reading of the input character code string supplied by the word matching section 2 into a phoneme sequence, applying different processing according to the nasalization information and long-vowel information supplied from the input section 1. When the nasalization information indicates nasalization mode, the nasalization rule is applied, and every ga-row sound except one at the head of the input character code string is converted into a nasalized phoneme. When the nasalization information indicates non-nasalization mode, all ga-row sounds in the input character code string are converted into phonemes as they are; that is, the reading is converted without applying the nasalization rule.
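The nasalization rule described above can be illustrated with a minimal Python sketch. It assumes the reading is a katakana string and that nasalized sounds are written with the notation used in this document (for example カ゜); the table `NASAL_GA` and the function name `apply_nasalization` are hypothetical and do not appear in the patent.

```python
# Hypothetical table: plain ga-row kana -> nasalized notation used in the text.
# Contracted sounds such as ギャ are covered because only the first kana changes.
NASAL_GA = {"ガ": "カ゜", "ギ": "キ゜", "グ": "ク゜", "ゲ": "ケ゜", "ゴ": "コ゜"}


def apply_nasalization(reading: str, nasalize: bool) -> str:
    """Return the reading with non-initial ga-row sounds nasalized.

    In non-nasalization mode the rule is skipped and the reading is
    returned unchanged.
    """
    if not nasalize:
        return reading
    out = []
    for i, ch in enumerate(reading):
        # A ga-row sound at the head of the string is left as it is.
        if i > 0 and ch in NASAL_GA:
            out.append(NASAL_GA[ch])
        else:
            out.append(ch)
    return "".join(out)


print(apply_nasalization("カブシキガイシャ", True))
print(apply_nasalization("カブシキガイシャ", False))
```

The sketch treats the head of the reading as the head of the input string; how the device handles word boundaries inside a longer sentence is not specified here.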

Similarly, when the long-vowel information supplied from the input section 1 indicates lengthening, the phoneme sequence verification section 4 applies the long-vowel rule and changes each i sound that follows an e-row sound in the reading to the long-vowel mark [ー]. When the long-vowel information indicates non-lengthening, the i sound following an e-row sound is converted into a phoneme as it is.
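As a rough illustration of the long-vowel rule, the following Python sketch replaces an イ that directly follows an e-row kana with the long-vowel mark ー. The set `E_ROW` and the function name `apply_long_vowel` are assumptions made for this example, and the set lists only the basic e-row katakana.

```python
# Hypothetical set of e-row katakana (basic and voiced forms only).
E_ROW = set("エケセテネヘメレゲゼデベペ")


def apply_long_vowel(reading: str, lengthen: bool) -> str:
    """Replace イ after an e-row kana with ー when lengthening is requested."""
    if not lengthen:
        return reading
    out = []
    for ch in reading:
        # Only an イ immediately preceded by an e-row kana is lengthened.
        if ch == "イ" and out and out[-1] in E_ROW:
            out.append("ー")
        else:
            out.append(ch)
    return "".join(out)


print(apply_long_vowel("センセイ", True))
print(apply_long_vowel("センセイ", False))
```

An イ that follows a sound from another row, as in カイシャ, is left untouched by this rule.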

Meanwhile, the accent type verification section 5 determines the accent type for each accent phrase from the one or more items of accent information, and the corresponding part-of-speech information, supplied by the word matching section 2.

The accent type determined in this way is passed to the synthesis parameter generation section 7 together with the phoneme sequence produced by the phoneme sequence verification section 4 as described above. The synthesis parameter generation section 7 refers to the speech segment file 6 to generate the phoneme parameter string corresponding to the phoneme sequence, and generates the prosodic parameter string according to the accent type information. The speech synthesis section 8 generates synthesized speech from the phoneme parameter string and prosodic parameter string and outputs it.

In other words, in this device the application of the nasalization rule and the long-vowel rule in the phoneme sequence verification section 4 is controlled according to the nasalization information and long-vowel information supplied from the input section 1, and the phoneme sequence generated there differs accordingly. The device thus controls whether a phoneme subject to nasalization is converted into a nasalized phoneme, and whether an i sound is replaced with the long-vowel mark [ー], so that a phoneme sequence matching the input instruction is generated.
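The flow from dictionary lookup to a mode-dependent phoneme sequence can be sketched as follows. This is only an illustrative Python model: `Entry`, `ACCENT_DICT`, `verify_phoneme_sequence`, and `analyze` are hypothetical names, the dictionary holds just the two words used as examples in this document, and the entry for 綺麗 is assumed here to store the non-lengthened reading.

```python
from dataclasses import dataclass

# Hypothetical tables; the kana notation follows the examples in the text.
NASAL_GA = {"ガ": "カ゜", "ギ": "キ゜", "グ": "ク゜", "ゲ": "ケ゜", "ゴ": "コ゜"}
E_ROW = set("エケセテネヘメレゲゼデベペ")


@dataclass
class Entry:
    reading: str      # katakana reading registered for the headword
    accent_type: int  # accent type registered for the headword


# Stand-in for accent dictionary 3 (assumed contents).
ACCENT_DICT = {
    "株式会社": Entry("カブシキガイシャ", 5),
    "綺麗": Entry("キレイ", 1),  # assumption: non-lengthened reading stored
}


def verify_phoneme_sequence(reading: str, nasalize: bool, lengthen: bool) -> str:
    """Stand-in for section 4: apply each rule only if its flag is set."""
    out = []
    for i, ch in enumerate(reading):
        if nasalize and i > 0 and ch in NASAL_GA:
            out.append(NASAL_GA[ch])
        elif lengthen and ch == "イ" and out and out[-1][0] in E_ROW:
            out.append("ー")
        else:
            out.append(ch)
    return "".join(out)


def analyze(word: str, nasalize: bool, lengthen: bool):
    """Look the word up and return (phoneme sequence, accent type)."""
    entry = ACCENT_DICT[word]
    sequence = verify_phoneme_sequence(entry.reading, nasalize, lengthen)
    return sequence, entry.accent_type


print(analyze("株式会社", True, False))
print(analyze("綺麗", False, True))
```

The point of the design is that the dictionary entry stays fixed while the two flags alone decide which phoneme sequence reaches the parameter generation stage.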

With the device configured in this way, when, for example, the word 「株式会社」 is entered at the input section 1, the input section 1 passes the character code string 「株式会社」 to the word matching section 2. The word matching section 2 matches this string against the accent dictionary 3, configured for example as shown in FIG. 2, obtains the reading 「カブシキガイシャ」 registered for the headword 「株式会社」, and passes it to the phoneme sequence verification section 4.

When the nasalization information supplied from the input section 1 indicates nasalization mode, the phoneme sequence verification section 4 nasalizes the 「ガ」 in the reading 「カブシキガイシャ」 and converts the reading into [kabusvikiŋaisja], where [svi] denotes the devoiced 「シ」 and [ŋa] denotes the nasalized 「ガ」. The phoneme sequence converted in this way is passed to the synthesis parameter generation section 7, which refers to the speech segment file 6 and generates the phoneme parameter string corresponding to the phoneme sequence.

Meanwhile, the accent type verification section 5 determines from the accent dictionary 3 that the accent type of 「株式会社」 is [type 5]. The synthesis parameter generation section 7 generates the prosodic parameter string according to this accent type information. The speech synthesis section 8 then generates and outputs the synthesized speech as 「カブシキカ゜イシャ」 according to the phoneme parameter string and prosodic parameter string generated as described above.

By contrast, even when the same input character code string 「株式会社」 is given, if the nasalization information supplied from the input section 1 indicates non-nasalization mode, the phoneme sequence verification section 4 converts the reading 「カブシキガイシャ」 as it is into [kabusvikigaisja]. Following the phoneme parameter string created from this phoneme sequence, the speech synthesis section 8 generates and outputs the synthesized speech 「カブシキガイシャ」.

As another example, when the word 「綺麗」 is entered as the input character code string and long-vowel mode is specified, the reading 「キレー」 obtained from the accent dictionary 3 shown, for example, in FIG. 2 is converted as it is into the phoneme sequence [kire:] by the phoneme sequence verification section 4.

Here [:] denotes the long vowel. Since the accent type of this input word is obtained from the accent dictionary 3 as [type 1], the speech synthesis section 8 in this case generates and outputs the synthesized speech 「キレー」.

By contrast, when non-long-vowel mode is specified, the 「ー」 in the reading 「キレー」 is taken to be an i sound following the e-row sound 「レ」 that was changed to the long-vowel mark [ー], so it is restored to the original sound 「イ」 and the phoneme sequence [kirei] is generated. As a result, the speech synthesis section 8 produces the synthesized speech 「キレイ」 according to the phoneme parameter string based on this phoneme sequence.
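When the dictionary already stores the lengthened reading, as in the 「キレー」 example, the non-long-vowel mode has to work in the opposite direction. The following Python sketch shows one way this restoration could look; `E_ROW` and `restore_i_sound` are hypothetical names introduced for illustration.

```python
# Hypothetical set of e-row katakana (basic and voiced forms only).
E_ROW = set("エケセテネヘメレゲゼデベペ")


def restore_i_sound(reading: str, lengthen: bool) -> str:
    """Turn ー after an e-row kana back into イ in non-long-vowel mode."""
    if lengthen:
        # Long-vowel mode: the registered reading is used as it is.
        return reading
    out = []
    for ch in reading:
        if ch == "ー" and out and out[-1] in E_ROW:
            out.append("イ")
        else:
            out.append(ch)
    return "".join(out)


print(restore_i_sound("キレー", True))
print(restore_i_sound("キレー", False))
```

This sketch assumes that every ー after an e-row kana came from an イ. A reading whose long vowel derives from another source would be restored incorrectly, so a real dictionary would probably need to mark which long vowels are reversible.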

Thus, with the device configured and operating in this way, entering the nasalization information makes it easy to select whether the speech to be synthesized is nasalized. Entering the long-vowel information likewise makes it easy to select whether an i sound following an e-row sound is synthesized as 「イ」 or converted to the long vowel 「ー」.

Consequently, the desired speech can be synthesized and output by simply selecting nasalization and lengthening, without changing the structure or contents of the accent dictionary 3. Moreover, merely by specifying through the nasalization information whether the nasalization rule is applied, and through the long-vowel information whether the long-vowel rule is applied, nasalization and lengthening of the synthesized speech can be controlled very easily, which is of great practical benefit.

The present invention is not limited to the embodiment described above. For example, the contents and structure of the accent dictionary 3 are not limited to the example given. Nasalized readings may instead be registered in the accent dictionary and restored to the original (non-nasalized) reading as needed for synthesis. It is of course also possible to register non-lengthened readings in the accent dictionary 3. The invention can be implemented with various other modifications without departing from its gist.

[Effects of the Invention] As described above, according to the present invention, when the reading information held in the accent dictionary is converted into a phoneme sequence, whether to nasalize and whether to lengthen are controlled according to the nasalization information and the long-vowel information. The desired speech can therefore be synthesized and output very easily and effectively, which is of great practical benefit.

[Brief explanation of the drawing]

The drawings show a speech synthesis device according to an embodiment of the present invention. FIG. 1 is a schematic block diagram of the device of the embodiment, and FIG. 2 shows an example of the structure of the accent dictionary in the device. 1: input section; 2: word matching section; 3: accent dictionary; 4: phoneme sequence verification section; 5: accent type verification section; 6: speech segment file; 7: synthesis parameter generation section; 8: speech synthesis section. Applicant's agent: Patent attorney Takehiko Suzue

Claims

(1) A speech synthesis device that analyzes an input character code string to obtain its phoneme sequence and prosodic information, generates a phoneme parameter string and a prosodic parameter string from the phoneme sequence and prosodic information according to predetermined rules, and generates synthesized speech based on these parameter strings, the device comprising means for inputting nasalization information indicating whether the synthesized speech to be generated is to be nasalized, wherein, when the nasalization information indicates nasalization of the speech, the phoneme sequence is obtained by applying a predetermined nasalization rule, and, when the nasalization information indicates non-nasalization, the phoneme sequence is obtained without applying the nasalization rule.

(2) A speech synthesis device that analyzes an input character code string to obtain its phoneme sequence and prosodic information, generates a phoneme parameter string and a prosodic parameter string from the phoneme sequence and prosodic information according to predetermined rules, and generates synthesized speech based on these parameter strings, the device comprising means for inputting long-vowel information indicating whether an i sound that follows an e-row sound in the synthesized speech to be generated is to be lengthened, wherein, when the long-vowel information indicates lengthening of the i sound, a long vowel is generated in place of the i sound, and, when the long-vowel information indicates non-lengthening, the i sound is generated as it is.