JPS6026997A - Character phoneme converter - Google Patents

Character phoneme converter

Info

Publication number
JPS6026997A
Authority
JP
Japan
Prior art keywords
independent word
character string
word
character
ending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP58135184A
Other languages
Japanese (ja)
Other versions
JPH0244080B2 (en
Inventor
典正 野村
渡辺 貞一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to JP58135184A
Publication of JPS6026997A
Publication of JPH0244080B2
Granted

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field of the Invention]

The present invention relates to a character-to-phoneme conversion device that converts a character string of a Japanese sentence into a phoneme sequence.

[Technical Background of the Invention and Its Problems]

Conventionally, so-called speech synthesizers are known that convert input Japanese text into speech and output it. For devices of this kind, a method was considered in which the phonemes corresponding to the input character codes are retrieved from a phoneme dictionary stored in units at or below the phrase (bunsetsu) level and output as speech. This method had the drawback that the output speech contained no accent or similar information, so it was monotonous, lacked rhythm, and was difficult to understand. It was therefore necessary to analyze the character string and convert it into a phoneme sequence that includes accent information and the like, but conventionally this was done by a human. That is, a person created the phoneme sequence for the character string and entered it into the speech synthesizer.

Consequently, when converting a character string into a phoneme sequence, it is necessary not only to convert the kanji portions of the string into their readings (phonemes) but also to determine the accent correctly in order to produce natural-sounding speech. The accent position of a hiragana character string differs depending on whether the string is an attached word or an independent word, so it is important to distinguish correctly, from the context in which it is used, whether a given hiragana string is an attached word or an independent word. Conventional devices had no function for correctly identifying whether an input hiragana character string was an attached word or an independent word. A human therefore made this distinction and entered accent information into the speech synthesizer based on the result, which placed a heavy burden on the person in terms of time and labor.

[Object of the Invention]

An object of the present invention is to provide a device that converts character strings of Japanese text into phoneme sequences and, in particular, makes it possible to determine the accent of hiragana character strings with high accuracy.

[Summary of the Invention]

The present invention divides a character string of mixed kanji and kana into phrase (bunsetsu) units according to changes in character type. When the first character of a phrase is a kanji, the phrase character string is matched against a kanji independent word dictionary; when the first character is a hiragana, it is matched against a hiragana independent word dictionary.

When the inflectional ending is the continuative form or the attributive form of an inflecting word, the hiragana string that remains after removing the independent word and its inflectional ending from the phrase is tested to determine whether it begins with one of the specific attached-word strings that can follow the continuative or attributive form. If it does, the hiragana string is matched against an attached word dictionary; if it does not, the string is matched against the hiragana independent word dictionary.

When the independent word matched from the beginning of the phrase is not an inflecting word, or is an inflecting word in a form other than the continuative or attributive form, the hiragana string that remains after removing the independent word and its inflectional ending from the phrase is matched against the attached word dictionary. In this way the mixed kanji and kana character string is converted into a phoneme sequence.

In particular, for the hiragana portions of the string, the accent, which is important when converting text into speech, is determined with high accuracy on the basis of correctly identifying whether each portion is an independent word or an attached word.

[Effect of the Invention]

According to the present invention, character strings of Japanese text can be converted into phoneme sequences. In particular, by correctly distinguishing independent words from attached words in hiragana character strings, the accent, which is important for converting text into natural-sounding speech, can be determined with high accuracy. This improves the quality of the speech synthesizer and reduces the burden on the human operator in terms of time and labor.

[Embodiment of the Invention]

An embodiment of the present invention is described below with reference to the drawing. FIG. 1 is a block diagram of the device of this embodiment. First, the mixed kanji and kana character string to be converted into speech is stored in the input text storage device (1). The phrase segmentation circuit (2) cuts the text stored in the input text storage device (1) into phrase units, for example by detecting changes in character type such as a change from kana to kanji, and sets each phrase in the input phrase register (3). The character-to-phoneme conversion described below is performed for each one-phrase character string set in the input phrase register (3).
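The segmentation step can be illustrated with a minimal Python sketch. It assumes that the only boundary cue is a hiragana character followed by a kanji, which is the single example the description gives; the function names `char_type` and `split_bunsetsu` and the Unicode ranges are assumptions made for illustration and do not come from the patent.

```python
from __future__ import annotations


def char_type(ch: str) -> str:
    """Classify one character by Unicode block (a simplification)."""
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"


def split_bunsetsu(text: str) -> list[str]:
    """Start a new phrase wherever a hiragana is followed by a kanji."""
    phrases: list[str] = []
    current = ""
    for ch in text:
        boundary = (
            current != ""
            and char_type(current[-1]) == "hiragana"
            and char_type(ch) == "kanji"
        )
        if boundary:
            phrases.append(current)
            current = ""
        current += ch
    if current:
        phrases.append(current)
    return phrases
```

With this rule, `split_bunsetsu("私は本を読みます")` yields the three phrases 「私は」, 「本を」, and 「読みます」. A phrase that begins with hiragana after another hiragana phrase is not split by this cue alone, so a real device would need further rules.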

The phrase-initial character determination circuit (4) detects whether the first character of the phrase is a kanji or a hiragana, and decides whether dictionary matching from the beginning of the phrase is performed by the kanji independent word matching circuit (5) (signal line (16)) or by the hiragana independent word matching circuit (10) (signal line (17)). The kanji independent word dictionary (6) stores words that begin with a kanji together with their part-of-speech information, and also stores, for each word, its phoneme sequence and accent type. The hiragana independent word dictionary (11) registers words that begin with a hiragana; its format is the same as that of the kanji independent word dictionary (6).
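As a rough illustration of the dictionary format and the first-character dispatch, the following Python sketch models an entry with the three fields the description names (part of speech, phoneme sequence, accent type). The class name `Entry`, the function `select_dictionary`, the romanized phonemes, and the sample entries are all assumptions for illustration; the patent does not specify the storage layout.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    pos: str          # part of speech, e.g. "noun" or "verb"
    phonemes: str     # phoneme sequence (romanized here for illustration)
    accent_type: int  # accent type; 0 stands for the flat type


# Toy dictionaries; the entries and their values are illustrative only.
KANJI_DICT = {"本": Entry("noun", "hon", 1), "話": Entry("verb", "hana", 2)}
HIRAGANA_DICT = {"これ": Entry("pronoun", "kore", 0)}


def select_dictionary(phrase: str) -> tuple[str, dict[str, Entry]]:
    """Choose the dictionary to consult from the first character of a phrase."""
    code = ord(phrase[0])
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs
        return "kanji", KANJI_DICT
    if 0x3041 <= code <= 0x309F:   # hiragana block
        return "hiragana", HIRAGANA_DICT
    raise ValueError(f"unsupported first character: {phrase[0]!r}")
```

Both dictionaries share one entry type here, mirroring the statement that the hiragana dictionary has the same format as the kanji dictionary.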
It also stores the phonological sequence and accent type of each word. The Hiragana Independent Word Dictionary (11) is a dictionary that registers words starting with Hiragana, and has the same format as the Kanji Independent Word Dictionary (6).

When the first character of a phrase is a kanji, the kanji independent word matching circuit (5) matches the leading character string of the phrase against the kanji independent word dictionary (6). The character strings found by this matching are stored as candidate words in a temporary storage device provided in the kanji independent word matching circuit (5), in order from the longest string, that is, starting with the longest-matching word. If the longest-matching independent word is not an inflecting word, its phoneme sequence and accent information are transferred through signal line (18) to the phoneme sequence register (14) and set there. If the independent word is an inflecting word, its phoneme sequence and accent information are passed on in turn to the circuits described below.
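The candidate ordering can be sketched in a few lines of Python. This is only an illustration of longest-match-first prefix lookup; the function name `prefix_candidates` and the sample dictionary are hypothetical, and the dictionary values stand in for the part-of-speech information.

```python
from __future__ import annotations


def prefix_candidates(phrase: str, dictionary: dict[str, str]) -> list[str]:
    """Return every dictionary headword that is a prefix of phrase, longest first."""
    return [
        phrase[:n]
        for n in range(len(phrase), 0, -1)
        if phrase[:n] in dictionary
    ]


# Hypothetical dictionary: headword -> part of speech.
SAMPLE_DICT = {"東": "noun", "東京": "noun", "東京都": "noun"}
```

For the phrase 「東京都は」 this returns 「東京都」, 「東京」, 「東」 in that order, so the longest-matching word is tried first and the shorter ones remain available as fallback candidates.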

When the first character of the phrase is a hiragana, the hiragana independent word matching circuit (10) matches the leading character string of the phrase against the hiragana independent word dictionary (11), performs the same processing as the kanji independent word matching circuit (5), and sends the phoneme sequence and accent information to the phoneme sequence register (14) through signal line (19).

Next, when the independent word matched by the kanji independent word matching circuit (5) or the hiragana independent word matching circuit (10) is an inflecting word (signal lines (20), (21)), the inflectional ending determination circuit (7) determines the inflectional ending by matching the hiragana string that follows the independent word against the inflectional ending table (8). If the test passes and the phrase ends with (independent word + inflectional ending), the phoneme sequence and accent information for (independent word + inflectional ending) are sent to the phoneme sequence register through signal line (22). If the test fails, the second candidate word is taken from the temporary storage device provided in the kanji independent word matching circuit (5) or the hiragana independent word matching circuit (10), and the same processing as for the longest-matching word is applied to it.
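The ending test with fallback to the next candidate might look like the following Python sketch. The flat table `ENDING_TABLE`, its three entries, the label `"inflecting"`, and the function `determine_ending` are assumptions for illustration; the contents of the actual inflectional ending table (8) are not given in the description, and a real table would depend on the inflection class of each word.

```python
from __future__ import annotations

# Hypothetical, flattened ending table: ending string -> inflected form.
ENDING_TABLE = {"し": "continuative", "す": "attributive", "せ": "conditional"}


def determine_ending(phrase, candidates, pos_of, ending_table=ENDING_TABLE):
    """Try candidates longest first; return (word, ending, form, remainder) or None."""
    for word in candidates:
        rest = phrase[len(word):]
        if pos_of[word] != "inflecting":
            # Non-inflecting words need no ending check.
            return word, "", None, rest
        # Prefer longer endings so that a short ending cannot shadow a long one.
        for ending in sorted(ending_table, key=len, reverse=True):
            if rest.startswith(ending):
                return word, ending, ending_table[ending], rest[len(ending):]
        # Ending check failed: fall through to the next (shorter) candidate.
    return None
```

For example, `determine_ending("話します", ["話"], {"話": "inflecting"})` returns the word 「話」, the ending 「し」, the form `"continuative"`, and the remainder 「ます」, which is the string handed to the next stage.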

After the inflectional ending has been determined, the inflectional ending determination circuit (7) removes the independent word and the inflectional ending from the input phrase character string.

When the determined inflectional ending is the continuative form or the attributive form of an inflecting word, the remaining character string is sent to the hiragana character string determination circuit (9) (signal line (23)). The hiragana character string determination circuit (9) checks, by matching against the specific character table (15), whether the beginning of the remaining string is one of the specific strings of two or more hiragana characters shown below. If it is, the string is very likely an attached word, so processing proceeds through signal line (24) to the attached word matching circuit (12).

After the continuative form of an inflecting word: 「ます」, 「たり」, 「つつ」, 「ても」, 「ながら」, etc.
After the attributive form of an inflecting word: 「から」, 「さえ」, 「しか」, 「だけ」, 「けれど」, etc.

If, on the other hand, the remaining string does not begin with such a specific string, processing proceeds through signal line (25) to the hiragana independent word matching circuit (10), and the remaining string is matched against the hiragana independent word dictionary (11).
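The routing decision made by the hiragana character string determination circuit (9) can be sketched as follows. The table holds only the example strings listed above (the description ends each list with "etc.", so the real table is larger); the names `SPECIFIC_STRINGS` and `route_remainder` and the returned labels are assumptions for illustration.

```python
from __future__ import annotations

# Specific strings from the description's examples, keyed by the preceding form.
SPECIFIC_STRINGS = {
    "continuative": ("ます", "たり", "つつ", "ても", "ながら"),
    "attributive": ("から", "さえ", "しか", "だけ", "けれど"),
}


def route_remainder(form: str | None, remainder: str) -> str:
    """Decide which matcher handles the string left after word and ending."""
    if not remainder:
        return "none"          # the phrase ends here
    if form in SPECIFIC_STRINGS:
        if remainder.startswith(SPECIFIC_STRINGS[form]):
            return "attached"  # likely an attached word
        return "independent"   # try the hiragana independent word dictionary
    # Non-inflecting word, or a form other than continuative/attributive.
    return "attached"
```

`str.startswith` accepts a tuple of prefixes, so one call tests every specific string for the given form.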

When this matching succeeds, the hiragana independent word matching circuit (10) determines the accent of the continuative or attributive form on the basis of the accent type of the inflecting word (registered in the kanji independent word dictionary or the hiragana independent word dictionary), its inflected form (whether it is continuative or attributive), and so on, and sends the accent information to the phoneme sequence register (14) (signal line (19)).

The attached word matching circuit (12) determines attached words by matching the character string that begins with one of the specific strings against the attached word dictionary (13). The attached word dictionary (13) uses attached words and compound forms of attached words (for example 「だけなので」) as headwords, and registers the conditions under which they connect to independent words, information on how the accent shifts when they combine with independent words, and so on.

The attached word matching circuit (12) converts each matched attached word into a phoneme sequence and sends it to the phoneme sequence register (14) (signal line (26)). It also determines the accent of the phrase in which the independent word and the attached word are combined, based on the information registered in the attached word dictionary (13), and likewise sends the phrase accent position information to the phoneme sequence register (14).

When the independent word matched from the beginning of the phrase by the kanji independent word matching circuit (5) or the hiragana independent word matching circuit (10) is not an inflecting word, or when it is an inflecting word but the inflectional ending determined by the inflectional ending determination circuit (7) is neither the continuative nor the attributive form, the character string that remains after removing the matched independent word and its inflectional ending from the phrase is sent through signal lines (27), (28), and (29) to the attached word matching circuit (12). There the remaining string is matched against the attached word dictionary (13), and, as described above, when an attached word is matched its phoneme sequence and accent information are sent to the phoneme sequence register (14). If part of the phrase character string still remains after the matched attached word has been removed, that remaining string is sent through signal line (30) to the hiragana character string determination circuit (9), and the same processing as above is performed.

As described above, according to this embodiment, the hiragana character string determination circuit (9) tests for the specific strings, and the order of matching in the hiragana independent word matching circuit (10) and the attached word matching circuit (12) is changed according to the result. This makes it possible to distinguish correctly whether a hiragana string that follows the continuative or attributive form of an inflecting word is an attached word or an independent word.

For example, suppose the input string is 「話します」 and 「話し」 has been determined to be the continuative form of an inflecting word. If the remaining string 「ます」 were matched directly by the hiragana independent word matching circuit (10), as in the conventional approach, without passing through the hiragana character string determination circuit (9), 「ます」 (増す, "to increase") would be detected as an independent word, and the accent would accordingly be determined as that of 「ます」 (flat type). Needless to say, this is wrong. In this embodiment, the hiragana character string determination circuit (9) identifies 「ます」 as a specific string, so 「ます」 is matched by the attached word matching circuit (12). It can therefore be determined to be an auxiliary verb, which is an attached word, and the correct phrase accent for 「話します」 is obtained.

In the above example, the correct result for the hiragana string is obtained by going to the attached word matching circuit (12) rather than the hiragana independent word matching circuit (10) after the continuative form has been determined. However, always going to the attached word matching circuit (12) after the continuative form also produces errors.

For example, when the input phrase string is 「話しやすい」, going to the attached word matching circuit (12) causes the character 「や」 to be detected as a particle, which is an attached word. The string 「すい」 that remains after 「や」 is removed is then detected by the hiragana independent word matching circuit (10) as the independent word 「すい」 (吸い, "sucking"), so the accent is split in two, with 「話しや」 and 「すい」 accented separately. Needless to say, this is wrong; 「話しやすい」 must be accented as a single unit. In this embodiment, after the continuative or attributive form has been determined, the remaining string is sent to the hiragana character string determination circuit (9), which identifies whether it is an attached word or an independent word. The string is therefore matched neither unconditionally as an attached word nor unconditionally as an independent word, errors like the one above do not occur, and the accent is detected correctly.
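The two worked examples can be reproduced with a small Python comparison of three strategies: always matching the remainder as an independent word, always matching it as an attached word, and switching on the specific-string test. The toy word sets, the strategy names, and the functions `longest_prefix` and `first_unit` are assumptions for illustration only; they are not the contents of the patent's dictionaries.

```python
from __future__ import annotations

ATTACHED = {"ます", "や"}                 # ます (auxiliary verb), や (particle)
INDEPENDENT = {"ます", "すい", "やすい"}   # 増す, 吸い, and やすい ("easy to")
SPECIFIC_AFTER_CONTINUATIVE = ("ます", "たり", "つつ", "ても", "ながら")


def longest_prefix(text, words):
    """Return the longest prefix of text found in words, or None."""
    for n in range(len(text), 0, -1):
        if text[:n] in words:
            return text[:n]
    return None


def first_unit(remainder, strategy):
    """Classify the first unit of the string that follows a continuative form."""
    if strategy == "always_independent":
        use_attached = False
    elif strategy == "always_attached":
        use_attached = True
    else:  # "switched": consult the specific-string table first
        use_attached = remainder.startswith(SPECIFIC_AFTER_CONTINUATIVE)
    if use_attached:
        return "attached", longest_prefix(remainder, ATTACHED)
    return "independent", longest_prefix(remainder, INDEPENDENT)
```

Under these toy dictionaries, `first_unit("ます", "always_independent")` picks the independent word 「ます」 (the 増す error), and `first_unit("やすい", "always_attached")` picks the particle 「や」 (the split-accent error), while the `"switched"` strategy classifies 「ます」 as attached and 「やすい」 as independent.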

[Brief Description of the Drawings]

FIG. 1 is a schematic block diagram showing a device according to one embodiment of the present invention.

1: input text storage device
2: phrase segmentation circuit
3: input phrase register
4: phrase-initial character determination circuit
5: kanji independent word matching circuit
6: kanji independent word dictionary storage device
7: inflectional ending determination circuit
8: inflectional ending table storage device
9: hiragana character string determination circuit
10: hiragana independent word matching circuit
11: hiragana independent word dictionary storage device
12: attached word matching circuit
13: attached word dictionary storage device
14: phoneme sequence register
15: specific character table storage device
16 to 30: signal lines

Agent: Patent attorney Kensuke Norichika (and one other)

[FIG. 1]

Claims (1)

[Claims]

A character-to-phoneme conversion device characterized by comprising:

phrase segmentation means for dividing a character string of a Japanese sentence into phrases based on changes in character type;

phrase-initial character determination means for determining whether the first character of a phrase output from the phrase segmentation means is a kanji or a hiragana;

kanji independent word matching means which, when the phrase-initial character determination means determines that the first character of the phrase is a kanji, detects the independent word contained in the leading portion of the phrase, determines whether the independent word is an inflecting word, and outputs a first remaining character string, obtained by removing the independent word from the phrase, together with the phoneme and accent information of the independent word;

hiragana independent word matching means which, when the phrase-initial character determination means determines that the first character of the phrase is a hiragana, detects the independent word contained in the leading portion of the phrase, determines whether the independent word is an inflecting word, and outputs a first remaining character string, obtained by removing the independent word from the phrase, together with the phoneme and accent information of the independent word;

inflectional ending determination means which, when the kanji independent word matching means or the hiragana independent word matching means determines that the independent word is an inflecting word, determines whether the inflectional ending of the independent word contained in the first remaining character string is the continuative form or the attributive form, and outputs a second remaining character string, obtained by removing the inflectional ending from the first remaining character string, together with the phoneme and accent information of the inflectional ending and, unchanged, the phoneme and accent information of the independent word output from the kanji independent word matching means or the hiragana independent word matching means;

hiragana character string determination means which, when the inflectional ending determination means determines that the inflectional ending is the continuative form or the attributive form, determines whether the leading portion of the second remaining character string is a specific character string that can follow the continuative form or the attributive form of an inflecting word, and, when it is not a specific character string, outputs the second remaining character string and the phoneme and accent information of the independent word and its inflectional ending output from the inflectional ending determination means, unchanged, to the hiragana independent word matching means;

attached word matching means which receives as input the first remaining character string and the phoneme and accent information of the independent word when the kanji independent word matching means or the hiragana independent word matching means determines that the independent word is not an inflecting word, the second remaining character string and the phoneme and accent information of the independent word and its inflectional ending when the inflectional ending determination means determines that the inflectional ending is neither the continuative form nor the attributive form, and the second remaining character string and the phoneme and accent information of the independent word and its inflectional ending when the hiragana character string determination means determines that the leading portion of the second remaining character string is a specific character string, which detects the attached word contained in the leading portion of the first or second remaining character string, and which determines, based on information on how the attached word connects to an independent word or an inflectional ending, the phoneme and accent information of the entire character string consisting of the independent word, the inflectional ending, and the attached word, or of the independent word and the attached word; and

a phoneme sequence register which stores the phoneme and accent information of the entire character string output from the attached word matching means.
JP58135184A 1983-07-26 1983-07-26 Character phoneme converter Granted JPS6026997A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58135184A JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58135184A JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Publications (2)

Publication Number Publication Date
JPS6026997A (en) 1985-02-09
JPH0244080B2 (en) 1990-10-02

Family

ID=15145796

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58135184A Granted JPS6026997A (en) 1983-07-26 1983-07-26 Character phoneme converter

Country Status (1)

Country Link
JP (1) JPS6026997A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6258361A (en) * 1985-09-09 1987-03-14 Canon Inc Japanese word processor
JPS6258362A (en) * 1985-09-09 1987-03-14 Canon Inc Japanese word processor
JPS6426926A (en) * 1987-07-23 1989-01-30 Fujitsu Ltd Control system for punctuation utterance of paragraph
JPH0215764A (en) * 1988-07-01 1990-01-19 Nippon Telegr & Teleph Corp <Ntt> Information distributing system
KR101307856B1 (en) * 2012-02-15 2013-09-12 에스티엑스조선해양 주식회사 Container guide device of container ship
KR20210098535A (en) * 2018-12-21 2021-08-10 오카도 이노베이션 리미티드 Robotic Device Test Stations and Methods

Also Published As

Publication number Publication date
JPH0244080B2 (en) 1990-10-02

Similar Documents

Publication Publication Date Title
US6347298B2 (en) Computer apparatus for text-to-speech synthesizer dictionary reduction
JPS6026997A (en) Character phoneme converter
JPH06282290A (en) Natural language processing device and method thereof
JPS59127151A (en) Sentence reading device
JP3002202B2 (en) Numeral reading device in rule speech synthesizer
JPS59127139A (en) Sentence fault detecting and correcting device
JPS59127152A (en) Sentence reading device
JPH0415503B2 (en)
JPH0130173B2 (en)
JP3269083B2 (en) Natural language processor
JPS5972495A (en) Character phoneme transducer
Rao et al. Word boundary hypothesization in Hindi speech
JPS5934592A (en) Character phoneme converter
KR0180650B1 (en) Sentence analysis method for korean language in voice synthesis device
Martín-Contreras USING THE MASORA FOR INTERPRETING THE VOCALISATION AND ACCENTUATION OF THE BIBLICAL TEXT1
JPS59127150A (en) Sentence reading and checking device
JPH04305B2 (en)
JPS59127146A (en) Sentence reading-out device
JP2801601B2 (en) Text-to-speech synthesizer
KR0136423B1 (en) Phonetic change processing method by validity check of sound control symbol
JPH0375898B2 (en)
JPH11224250A (en) Dictionary device
d'Alessandro et al. Joint Evaluation of Text-To-Speech Synthesis in French Within the AUPELF ARC-B3 Project
JPH06348703A (en) Pronunciation dictionary developing device
JPS63158600A (en) Word detector