JPH0244080B2 - Character phoneme converter - Google Patents

Info

Publication number
JPH0244080B2
JPH0244080B2 JP58135184A JP13518483A JPH0244080B2 JP H0244080 B2 JPH0244080 B2 JP H0244080B2 JP 58135184 A JP58135184 A JP 58135184A JP 13518483 A JP13518483 A JP 13518483A JP H0244080 B2 JPH0244080 B2 JP H0244080B2
Authority
JP
Japan
Prior art keywords
independent word
character string
word
hiragana
independent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP58135184A
Other languages
Japanese (ja)
Other versions
JPS6026997A (en)
Inventor
Norimasa Nomura
Sadaichi Watanabe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Tokyo Shibaura Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1983-07-26
Filing date
1983-07-26
Publication date
1990-10-02
Application filed by Tokyo Shibaura Electric Co Ltd
Priority to JP58135184A
Publication of JPS6026997A
Publication of JPH0244080B2
Application granted

Landscapes

  • Document Processing Apparatus (AREA)

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical field of the invention]

The present invention relates to a character-to-phoneme conversion device that converts a character string of a Japanese sentence into a phoneme sequence.

[Technical background of the invention and its problems]

Speech synthesis devices that take Japanese text as input and output it as speech are already known. One approach devised for such devices looks up the phonemes corresponding to the input character codes in a phoneme dictionary stored in units no larger than a phrase (bunsetsu) and outputs them as speech. The speech produced this way carries no accent or similar information, so it is monotonous, lacks rhythm, and is hard to understand. It was therefore necessary to analyze the character string and convert it into a phoneme sequence that includes accent and related information, but conventionally this was done by hand: a person prepared the phoneme sequence for the character string and entered it into the speech synthesis device.

Converting a character string into a phoneme sequence therefore requires not only replacing the kanji portions with their readings (phonemes) but also determining the accent correctly, so that the result can be rendered as natural-sounding speech. The accent position of a hiragana string differs depending on whether the string is an attached word (fuzokugo) or an independent word (jiritsugo). Even for one and the same hiragana string, it is therefore important to decide correctly, from the context in which it is used, which of the two it is. Conventional devices had no function for correctly identifying whether an input hiragana string is an attached word or an independent word. A person made the distinction and entered accent information into the speech synthesis device based on the result, which imposed a heavy burden in time and labor.

[Purpose of the invention]

An object of the present invention is to provide a device that converts character strings of Japanese text into phoneme sequences and, in particular, makes it possible to determine the accent of hiragana strings with high accuracy.

[Summary of the invention]

The present invention divides a character string of mixed kanji and kana into phrase (bunsetsu) units based on changes in character type. When the first character of a phrase is a kanji, the phrase string is matched against a kanji independent-word dictionary; when the first character is a hiragana, it is matched against a hiragana independent-word dictionary.

When the conjugational ending is the continuative form (ren'yōkei) or the attributive form (rentaikei) of an inflecting word (yōgen), the hiragana string that remains after the independent word and its conjugational ending are removed from the phrase is checked to see whether it begins with a specific attached-word string that can follow the continuative or attributive form. If it does, the remaining hiragana string is matched against an attached-word dictionary; if it does not, the string is matched against the hiragana independent-word dictionary.

When the independent word matched from the head of the phrase is not an inflecting word, or is an inflecting word but in neither the continuative nor the attributive form, the hiragana string that remains after the independent word and its conjugational ending are removed is matched against the attached-word dictionary. In this way the mixed kanji-kana string is converted into a phoneme sequence.

In particular, by correctly identifying whether each hiragana portion is an independent word or an attached word, the invention determines with high accuracy the accent, which is important when converting text into speech.

[Effect of the invention]

According to the present invention, character strings of Japanese text can be converted into phoneme sequences. In particular, because independent words and attached words are correctly distinguished in hiragana strings, the accent, which is important for converting text into natural-sounding speech, can be determined with high accuracy. This improves the quality of the speech synthesis device and reduces the human burden in time and labor.

[Embodiments of the invention]

An embodiment of the present invention is described below with reference to the drawing. FIG. 1 is a block diagram of the device of this embodiment. First, the mixed kanji-kana character string to be converted into speech is stored in input text storage device 1. Phrase segmentation circuit 2 cuts the stored text into phrase (bunsetsu) units, for example by detecting changes in character type such as a change from kana to kanji, and sets each phrase in input phrase register 3. The character-to-phoneme conversion described below is performed on each one-phrase string set in input phrase register 3.
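The segmentation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the circuit described in the patent: it uses only the one rule the text names as an example (a change from hiragana to kanji starts a new phrase), and the function names `char_type` and `split_phrases` and the Unicode ranges are illustrative choices.

```python
def char_type(ch: str) -> str:
    """Classify one character as 'hiragana', 'kanji', or 'other' (assumed ranges)."""
    if "\u3041" <= ch <= "\u309f":
        return "hiragana"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    return "other"


def split_phrases(text: str) -> list[str]:
    """Start a new phrase wherever a hiragana character is followed by a kanji."""
    phrases = []
    current = ""
    for ch in text:
        boundary = (
            current != ""
            and char_type(current[-1]) == "hiragana"
            and char_type(ch) == "kanji"
        )
        if boundary:
            phrases.append(current)
            current = ""
        current += ch
    if current:
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    for phrase in split_phrases("私は本を読みます"):
        print(phrase)
```

A real segmenter would need further rules (katakana, punctuation, hiragana-initial phrases that follow another hiragana phrase); the text only says that character-type changes "such as" kana to kanji are used.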

Phrase-initial character check circuit 4 detects whether the first character of the phrase is a kanji or a hiragana and decides whether dictionary matching from the head of the phrase is performed by kanji independent-word matching circuit 5 (signal line 16) or by hiragana independent-word matching circuit 10 (signal line 17). Kanji independent-word dictionary 6 stores words that begin with a kanji together with their part-of-speech information and, for each word, its phoneme sequence and accent type. Hiragana independent-word dictionary 11 registers words that begin with a hiragana and has the same format as kanji independent-word dictionary 6.

When the first character of the phrase is a kanji, kanji independent-word matching circuit 5 matches the leading characters of the phrase against kanji independent-word dictionary 6. The strings for which a match is found are stored as candidate words, longest first (that is, starting from the longest-matching word), in a temporary storage device provided in kanji independent-word matching circuit 5. If the longest-matching independent word is not an inflecting word (yōgen), its phoneme sequence and accent information are transferred through signal line 18 and set in phoneme sequence register 14. If the independent word is an inflecting word, its phoneme sequence and accent information are passed on to the circuits described below.
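The longest-match candidate list described above might look like this in Python. It is a sketch only: the `Entry` fields mirror what the text says the dictionary stores (part of speech, phoneme sequence, accent type), but the field names, the toy dictionary contents, and the accent-type numbers are placeholders introduced here and are not taken from the patent or checked against an accent dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    surface: str      # headword as written
    reading: str      # phoneme sequence, given here as a kana reading
    accent_type: int  # placeholder accent-type number
    is_yogen: bool    # True for inflecting words (yogen)


# Hypothetical toy dictionary of kanji-initial independent words.
KANJI_DICT = [
    Entry("東京", "とうきょう", 0, False),
    Entry("東京都", "とうきょうと", 3, False),
    Entry("東", "ひがし", 0, False),
]


def candidates(phrase: str, dictionary: list[Entry]) -> list[Entry]:
    """Return every entry that matches the head of the phrase, longest first."""
    hits = [e for e in dictionary if phrase.startswith(e.surface)]
    return sorted(hits, key=lambda e: len(e.surface), reverse=True)


if __name__ == "__main__":
    for entry in candidates("東京都は", KANJI_DICT):
        print(entry.surface, entry.reading)
```

Keeping the full ordered list, rather than only the longest hit, is what lets a later stage fall back to the second candidate when the longest one is rejected.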

When the first character of the phrase is a hiragana, hiragana independent-word matching circuit 10 matches the leading characters of the phrase against hiragana independent-word dictionary 11, proceeds in the same way as kanji independent-word matching circuit 5, and sends the phoneme sequence and accent information to phoneme sequence register 14 through signal line 19.

Next, when the independent word matched by kanji independent-word matching circuit 5 or hiragana independent-word matching circuit 10 is an inflecting word (signal lines 20 and 21), conjugational-ending check circuit 7 checks the conjugational ending by matching the hiragana string that follows the independent word against conjugational-ending table 8. If the check succeeds and the phrase ends with (independent word + conjugational ending), the phoneme sequence and accent information for (independent word + conjugational ending) are sent to the phoneme sequence register through signal line 22. If the check fails, the second candidate word is taken from the temporary storage device in kanji independent-word matching circuit 5 or hiragana independent-word matching circuit 10 and processed in the same way as the longest-matching word.
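The ending check with fallback to the next candidate can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the patent does not give the layout of conjugational-ending table 8, so `ENDING_TABLE` is a toy table for two godan verb classes, and the names `Stem` and `check_ending` are hypothetical. In written form the terminal and attributive forms of these verbs coincide, and the toy table records only one label per ending.

```python
from dataclasses import dataclass

# Toy ending table: inflection class -> {ending: form label}.
ENDING_TABLE = {
    "godan-ka": {"か": "irrealis", "き": "continuative",
                 "く": "attributive", "け": "hypothetical"},
    "godan-sa": {"さ": "irrealis", "し": "continuative",
                 "す": "attributive", "せ": "hypothetical"},
}


@dataclass(frozen=True)
class Stem:
    surface: str     # stem of an inflecting word as written
    infl_class: str  # key into ENDING_TABLE


def check_ending(phrase, cands):
    """Try candidates in the given order (longest first).

    Returns (stem, ending, form, rest) for the first candidate whose
    following hiragana matches an ending in the table, or None.
    """
    for stem in cands:
        after = phrase[len(stem.surface):]
        for ending, form in ENDING_TABLE[stem.infl_class].items():
            if after.startswith(ending):
                return stem.surface, ending, form, after[len(ending):]
    return None


# Candidates for a phrase beginning with 動か, longest first.
CANDS = [Stem("動か", "godan-sa"), Stem("動", "godan-ka")]

if __name__ == "__main__":
    print(check_ending("動かします", CANDS))
    print(check_ending("動かない", CANDS))
```

For 動かない the longest candidate 動か (as in 動かす) is rejected because な is not one of its endings, so the second candidate 動 (as in 動く) is tried and accepted with the ending か.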

After the conjugational ending has been checked, conjugational-ending check circuit 7 removes the independent word and the conjugational ending from the input phrase string. When the checked ending is the continuative form (ren'yōkei) or the attributive form (rentaikei), the remaining string is sent to hiragana string check circuit 9 (signal line 23). Hiragana string check circuit 9 matches the head of the remaining string against specific-string table 15 to check whether it is one of the specific strings of two or more hiragana listed below. If it is, the string is very likely an attached word, so processing passes to attached-word matching circuit 12 through signal line 24.

After the continuative form: 「ます」 (masu), 「たり」 (tari), 「つつ」 (tsutsu), 「ても」 (temo), 「ながら」 (nagara), and so on. After the attributive form: 「から」 (kara), 「さえ」 (sae), 「しか」 (shika), 「だけ」 (dake), 「けれど」 (keredo), and so on.
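The lookup against the specific-string table can be sketched as a prefix test keyed by the preceding conjugated form. The table below contains only the ten strings listed above, and the original lists end with "and so on", so it is incomplete by construction. The names `SPECIFIC_STRINGS` and `starts_with_specific` are illustrative and do not come from the patent.

```python
# Strings listed in the text, keyed by the form they follow.
SPECIFIC_STRINGS = {
    "continuative": ("ます", "たり", "つつ", "ても", "ながら"),
    "attributive": ("から", "さえ", "しか", "だけ", "けれど"),
}


def starts_with_specific(form: str, rest: str) -> bool:
    """True if the remaining string begins with a specific string for this form."""
    # str.startswith accepts a tuple of prefixes; an empty tuple never matches.
    return rest.startswith(SPECIFIC_STRINGS.get(form, ()))


if __name__ == "__main__":
    print(starts_with_specific("continuative", "ます"))
    print(starts_with_specific("continuative", "やすい"))
```

Requiring strings of two or more hiragana, as the text does, keeps one-character particles such as や out of this table.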

If, on the other hand, the head of the remaining string is not one of these specific strings, processing passes through signal line 25 to hiragana independent-word matching circuit 10, which matches the remaining string against hiragana independent-word dictionary 11. If a match is found, hiragana independent-word matching circuit 10 determines the accent of the preceding continuative or attributive form from the accent type of the inflecting word (registered in the kanji or hiragana independent-word dictionary), its conjugated form (continuative or attributive), and similar information, and sends that accent information to phoneme sequence register 14 (signal line 19).

Attached-word matching circuit 12 identifies attached words by matching the string that begins with a specific string against attached-word dictionary 13. Attached-word dictionary 13 has attached words and compound forms of attached words (for example 「だけなので」, dakenanode) as headwords and registers, for each, the conditions under which it connects to an independent word and information on how the accent shifts when it combines with an independent word. Attached-word matching circuit 12 converts each matched attached word into a phoneme sequence and sends it to phoneme sequence register 14 (signal line 26). It also determines the accent of the phrase formed by the independent word and the attached word from the information registered in attached-word dictionary 13, and sends the phrase accent position to phoneme sequence register 14 as well.
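One way to picture attached-word dictionary 13 is as a list of records with a connection condition and an accent field, searched by longest match. The sketch below is hypothetical throughout: the field names, the `follows` condition sets, and the opaque `accent_rule` labels are stand-ins, since the patent does not specify how connection conditions or accent-shift information are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachedWord:
    surface: str        # headword (single attached word or compound form)
    reading: str        # phoneme sequence, given here as kana
    follows: frozenset  # forms or word classes it may connect to (assumed)
    accent_rule: str    # opaque placeholder for accent-shift information


ATTACHED_DICT = [
    AttachedWord("だけ", "だけ", frozenset({"attributive", "noun"}), "rule-A"),
    AttachedWord("だけなので", "だけなので",
                 frozenset({"attributive", "noun"}), "rule-B"),
    AttachedWord("ます", "ます", frozenset({"continuative"}), "rule-C"),
]


def match_attached(rest: str, preceding: str):
    """Longest headword at the head of `rest` that may follow `preceding`."""
    hits = [
        w for w in ATTACHED_DICT
        if rest.startswith(w.surface) and preceding in w.follows
    ]
    return max(hits, key=lambda w: len(w.surface), default=None)


if __name__ == "__main__":
    print(match_attached("だけなので", "attributive"))
    print(match_attached("ます", "attributive"))
```

Registering compound forms such as だけなので as headwords means a single longest-match lookup can return the whole run of attached words together with one accent rule, instead of chaining several lookups.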

When the independent word matched at the head of the phrase by kanji independent-word matching circuit 5 or hiragana independent-word matching circuit 10 is not an inflecting word, or is an inflecting word whose conjugational ending as checked by conjugational-ending check circuit 7 is neither continuative nor attributive, the string that remains after the matched independent word and its conjugational ending are removed from the phrase is sent to attached-word matching circuit 12 through signal lines 27, 28, and 29. There the remaining string is matched against attached-word dictionary 13 and, as described above, when an attached word is matched its phoneme sequence and accent information are sent to phoneme sequence register 14. If part of the phrase string still remains after the matched attached word is removed, that remainder is sent through signal line 30 to hiragana string check circuit 9 and processed in the same way as above.
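The routing rules described so far reduce to a small decision function. The sketch below restates them in Python; the function name `next_stage`, its parameters, and the returned labels are chosen here for illustration, with the circuit numbers taken from the text.

```python
def next_stage(is_yogen: bool, form, rest: str, rest_is_specific: bool) -> str:
    """Decide where the remaining hiragana string goes next.

    is_yogen:         the matched independent word is an inflecting word
    form:             its conjugated form label, or None if not inflecting
    rest:             string left after removing the word and its ending
    rest_is_specific: result of the specific-string table check on `rest`
    """
    if not rest:
        return "done"  # the phrase ends here
    if is_yogen and form in ("continuative", "attributive"):
        if rest_is_specific:
            return "attached-word matching circuit 12"
        return "hiragana independent-word matching circuit 10"
    # Not an inflecting word, or some other conjugated form.
    return "attached-word matching circuit 12"


if __name__ == "__main__":
    print(next_stage(True, "continuative", "ます", True))
    print(next_stage(True, "continuative", "やすい", False))
    print(next_stage(False, None, "は", False))
```

Only the continuative and attributive cases consult the table; every other case goes straight to attached-word matching.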

As described above, in this embodiment hiragana string check circuit 9 checks for specific strings, and the order in which hiragana independent-word matching circuit 10 and attached-word matching circuit 12 are tried is changed according to the result. A hiragana string that follows a continuative or attributive form can therefore be correctly identified as either an attached word or an independent word.

For example, suppose the input string is 「話します」 (hanashimasu, "to speak", polite form) and 「話し」 has been identified as a continuative form. If the remaining string 「ます」 were matched directly by hiragana independent-word matching circuit 10, as in the conventional approach, without passing through hiragana string check circuit 9, the independent word 「ます(増す)」 (masu, "to increase") would be found, and the accent would be determined as that of 「ます」 (flat type). This is obviously wrong. In this embodiment hiragana string check circuit 9 identifies 「ます」 as a specific string, so it is matched by attached-word matching circuit 12 and identified as an auxiliary verb (an attached word), and the accent of the whole phrase 「話します」 is obtained.

In the example above, the correct result for the hiragana string is obtained by going to attached-word matching circuit 12 rather than hiragana independent-word matching circuit 10 after the continuative form has been identified. However, always going to attached-word matching circuit 12 after a continuative form also produces errors. Suppose, for example, that the input phrase is 「話しやすい」 (hanashiyasui, "easy to talk to"). After 「話し」 is identified as a continuative form, sending the remaining string 「やすい」 unconditionally to attached-word matching circuit 12 causes 「や」 to be identified as a particle (an attached word). The string 「すい」 that remains after 「や」 is removed is then identified by hiragana independent-word matching circuit 10 as the independent word 「すい(吸い)」 (sui, "sucking"). The accent is consequently split into two units, 「話しや」 and 「すい」. This is obviously wrong: 「話しやすい」 must carry a single phrase accent.

In this embodiment, after a continuative or attributive form has been identified, the remaining string is sent to hiragana string check circuit 9, which decides whether it is an attached word or an independent word. The string is therefore matched neither unconditionally as an attached word nor unconditionally as an independent word, errors like the one above do not occur, and the accent is detected correctly.
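The two examples can be reproduced with a toy model that compares three strategies for the string left after 「話し」: always try the independent-word dictionary first, always try the attached-word dictionary first, or consult the specific-string table first. This is a sketch under stated assumptions, not the patented circuit: the dictionary contents are minimal stand-ins chosen to reproduce the two cases, 「やすい」 is assumed to be registered as a hiragana independent word, and reusing the table check on later remainders is a simplification.

```python
INDEPENDENT = {"ます": "増す", "すい": "吸い", "やすい": "やすい"}
ATTACHED = {"ます": "auxiliary verb", "や": "particle"}
SPECIFIC = ("ます", "たり", "つつ", "ても", "ながら",
            "から", "さえ", "しか", "だけ", "けれど")
DICTS = {"independent": INDEPENDENT, "attached": ATTACHED}


def longest_prefix(text, words):
    """Longest dictionary headword found at the head of text, or None."""
    hits = [w for w in words if text.startswith(w)]
    return max(hits, key=len) if hits else None


def analyze(rest, strategy):
    """Split `rest` into (word, kind) pairs.

    strategy: 'independent' or 'attached' tries that dictionary first
    unconditionally; 'table' picks the first dictionary from SPECIFIC.
    """
    result = []
    while rest:
        if strategy == "table":
            first = "attached" if rest.startswith(SPECIFIC) else "independent"
        else:
            first = strategy
        second = "independent" if first == "attached" else "attached"
        for kind in (first, second):
            word = longest_prefix(rest, DICTS[kind])
            if word is not None:
                result.append((word, kind))
                rest = rest[len(word):]
                break
        else:
            # Neither dictionary matched: stop and report the remainder.
            result.append((rest, "unknown"))
            break
    return result


if __name__ == "__main__":
    for rest in ("ます", "やすい"):
        for strategy in ("independent", "attached", "table"):
            print(rest, strategy, analyze(rest, strategy))
```

Each unconditional strategy gets one of the two inputs wrong, while the table-based order handles both, which is the point the two examples make.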

[Brief explanation of drawings]

FIG. 1 is a schematic block diagram of a device according to one embodiment of the present invention.

1: input text storage device; 2: phrase segmentation circuit; 3: input phrase register; 4: phrase-initial character check circuit; 5: kanji independent-word matching circuit; 6: kanji independent-word dictionary storage device; 7: conjugational-ending check circuit; 8: conjugational-ending table storage device; 9: hiragana string check circuit; 10: hiragana independent-word matching circuit; 11: hiragana independent-word dictionary storage device; 12: attached-word matching circuit; 13: attached-word dictionary storage device; 14: phoneme sequence register; 15: specific-string table storage device; 16 to 30: signal lines.

Claims (1)

[Claims]

1. A character-to-phoneme conversion device comprising:

phrase segmentation means for dividing a character string of a Japanese sentence into phrases (bunsetsu) based on changes in character type;

phrase-initial character check means for determining whether the first character of a phrase output from the phrase segmentation means is a kanji or a hiragana;

kanji independent-word matching means for, when the phrase-initial character check means determines the first character of the phrase to be a kanji, detecting an independent word contained in the leading portion of the phrase, determining whether the independent word is an inflecting word (yōgen), and outputting a first remaining string, obtained by removing the independent word from the phrase, together with phoneme and accent information for the independent word;

hiragana independent-word matching means for, when the phrase-initial character check means determines the first character of the phrase to be a hiragana, detecting an independent word contained in the leading portion of the phrase, determining whether the independent word is an inflecting word, and outputting a first remaining string, obtained by removing the independent word from the phrase, together with phoneme and accent information for the independent word;

conjugational-ending check means for, when the kanji independent-word matching means or the hiragana independent-word matching means determines the independent word to be an inflecting word, determining whether the conjugational ending of the independent word contained in the first remaining string is in the continuative form or the attributive form, and outputting a second remaining string, obtained by removing the conjugational ending from the first remaining string, and phoneme and accent information for the conjugational ending, together with the phoneme and accent information for the independent word output from the kanji independent-word matching means or the hiragana independent-word matching means, passed on unchanged;

hiragana string check means for, when the conjugational-ending check means determines the conjugational ending to be in the continuative form or the attributive form, determining whether the leading portion of the second remaining string is a specific string that can follow the continuative or attributive form of an inflecting word and, when it is not a specific string, passing the second remaining string and the phoneme and accent information for the independent word and its conjugational ending output from the conjugational-ending check means unchanged to the hiragana independent-word matching means;

attached-word matching means for receiving (a) the first remaining string and the phoneme and accent information for the independent word when the kanji independent-word matching means or the hiragana independent-word matching means determines that the independent word is not an inflecting word, (b) the second remaining string and the phoneme and accent information for the independent word and its conjugational ending when the conjugational-ending check means determines that the conjugational ending is neither continuative nor attributive, or (c) the second remaining string and the phoneme and accent information for the independent word and its conjugational ending when the hiragana string check means determines that the leading portion of the second remaining string is a specific string, detecting an attached word contained in the leading portion of the first or second remaining string, and determining, based on information on how the attached word connects to an independent word or a conjugational ending, phoneme and accent information for the entire string consisting of the independent word, the conjugational ending, and the attached word, or of the independent word and the attached word; and

a phoneme sequence register that stores the phoneme and accent information for the entire string output from the attached-word matching means.
JP58135184A 1983-07-26 1983-07-26 Character phoneme converter Granted JPS6026997A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58135184A JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58135184A JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Publications (2)

Publication Number Publication Date
JPS6026997A (en) 1985-02-09
JPH0244080B2 (en) 1990-10-02

Family

ID=15145796

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58135184A Granted JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Country Status (1)

Country Link
JP (1) JPS6026997A (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6258362A (en) * 1985-09-09 1987-03-14 Canon Inc Japanese word processor
JPH0731670B2 (en) * 1985-09-09 1995-04-10 キヤノン株式会社 Information processing method
JPS6426926A (en) * 1987-07-23 1989-01-30 Fujitsu Ltd Control system for punctuation utterance of paragraph
JPH0215764A (en) * 1988-07-01 1990-01-19 Nippon Telegr & Teleph Corp <Ntt> Information distributing system
KR101307856B1 (en) * 2012-02-15 2013-09-12 에스티엑스조선해양 주식회사 Container guide device of container ship
GB201821130D0 (en) * 2018-12-21 2019-02-06 Ocado Innovation Ltd Robotic device test station and methods
