JPH02201643A

JPH02201643A - System for discriminatingly reading isomorphic word

Info

Publication number: JPH02201643A
Application number: JP1022021A
Authority: JP
Inventors: Junko Komatsu; 小松　順子
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1989-01-31
Filing date: 1989-01-31
Publication date: 1990-08-09
Anticipated expiration: 2012-09-24
Also published as: JP2655711B2

Abstract

PURPOSE: To discriminate the readings of an isomorphic word without preparing case patterns for all declinable words, by using a co-occurrence dictionary to obtain the strength of co-occurrence between the isomorphic word and a clause in a modification relation with it, and selecting the combination whose co-occurrence is stronger. CONSTITUTION: The system is equipped with a co-occurrence dictionary that stores co-occurrence relations between clauses, which serve as clues when discriminating the reading of an isomorphic word (a word written in the same way but read in different ways). For example, in the sentence 'WATAKUSHI WA' (I) 'GAKKO E' (to school) '行った' (read either 'ITTA' (went) or 'OKONATTA' (did)), '行った' is the isomorphic word. When the co-occurrence dictionary is searched, no modification relation in the example sentence matches at the index level. However, the meaning code of 'GAKKO' (school) denotes a facility and coincides with that of 'MACHI' (town) one level up in the hierarchy of meaning classification, so the strength of co-occurrence is set to 3 according to a prescribed rule, and 'ITTA' is selected as the reading of '行った'. Thus the isomorphic word can easily be read discriminatingly without preparing case patterns for all declinable words.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field]

The present invention relates to a method for distinguishing the readings of homographs (isomorphic words) in a text-to-speech synthesis device.

[Prior Art]

To distinguish the readings of a homograph, other words that have some relationship with the homograph must be used as clues. The usual approach is therefore to use, as a clue, a clause that is in a dependency relation with the clause containing the homograph. This approach is described in "Automatic reading method for homographs using hierarchical word attributes" (Transactions of the Institute of Electronics and Communication Engineers of Japan, 85/3, Vol. J68-D, No. 3, pp. 392-399), which uses the following methods.

(1) If the homograph is a predicate, find the clauses that fill the predicate's obligatory cases, and select the predicate for which the semantic attribute of the noun in such a clause matches the one specified in the predicate's case pattern.

(2) If the homograph is part of a parallel phrase, select the combination that is semantically close to the other noun in the phrase.

(3) If the homograph is part of an adnominal modification joined by the case particle "no" (の), assign in advance to each word subject to disambiguation the semantic attributes of the words that can be connected before or after it through "no", and select the combination that matches them.

(4) If the homograph is a case element of a predicate, select the reading that matches the semantic attribute specified in that predicate's case pattern.

(5) If the homograph stands in a noun clause-predicate relation (an "X wa Y da" relation, that is, "X is Y"), select the combination in which the predicate noun and the noun in the noun clause have semantically close attributes.

(6) If the homograph is modified adnominally, find the predicate that modifies it, and select the noun that has the attribute specified in that predicate's case pattern.

Of the conventional methods, (1), (4), and (6) require case patterns to be prepared for every predicate. When the dictionary has a large vocabulary this is very laborious, and preparing case patterns for all predicates in the dictionary merely to disambiguate the limited set of words that are homographs is not cost-effective.

[Purpose]

The present invention has been made in view of the circumstances described above. Its object is to provide a homograph disambiguation method in which a co-occurrence dictionary storing typical examples of the co-occurrence relations that serve as clues for disambiguation is prepared in advance, the strength of co-occurrence between the homograph and each clause in a dependency relation with it is obtained from that dictionary, and the combination with the stronger co-occurrence is selected. This makes it possible to disambiguate homographs without preparing case patterns for all predicates. In addition, when the strength of co-occurrence is examined, a co-occurrence relation is judged to hold to some degree not only when the words match the co-occurrence dictionary at the headword level but also when a word belongs to a semantic class similar to that of a word registered in the dictionary, so that the co-occurrence relations of varied expressions can be inferred from a single typical example.

[Constitution]

To achieve the above purpose, the present invention concerns a text-to-speech synthesis device comprising a morphological analysis unit that morphologically analyzes Japanese text written in mixed kanji and kana, a dependency analysis unit that examines the dependency relations between clauses, a prosodic symbol generation unit that generates a prosodic symbol string from the results of the morphological analysis and the dependency analysis, and a speech synthesis unit that converts the prosodic symbol string into synthesized speech. The device is provided with a co-occurrence dictionary that stores the co-occurrence relations between clauses that serve as clues when disambiguating homographs (words with the same written form but different readings). When a homograph is present, the strength of co-occurrence between the clause containing the homograph and each clause in a dependency relation with it is looked up in the co-occurrence dictionary, and the word with the stronger co-occurrence relation is selected. Furthermore, when the strength of co-occurrence is determined, a co-occurrence relation is judged to hold to some degree not only when the headword matches the co-occurrence dictionary but also when the word is semantically similar to a word registered in the dictionary. The invention is described below on the basis of an embodiment.
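The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `choose_reading`, the `strength` callback that stands in for the co-occurrence dictionary lookup, and the placeholder strings are all hypothetical.

```python
from typing import Callable, Sequence


def choose_reading(readings: Sequence[str],
                   dependents: Sequence[str],
                   strength: Callable[[str, str], int]) -> str:
    """Pick the reading with the strongest co-occurrence with any dependent clause.

    `strength(dependent, reading)` is a hypothetical callback standing in for
    the co-occurrence dictionary lookup; it returns 0 when nothing matches.
    """
    def best(reading: str) -> int:
        return max((strength(dep, reading) for dep in dependents), default=0)

    # max() keeps the first reading on ties, so list order acts as a fallback.
    return max(readings, key=best)


# Placeholder data, purely illustrative.
toy_table = {("clause_x", "reading_b"): 5}


def toy_strength(dep: str, reading: str) -> int:
    return toy_table.get((dep, reading), 0)


selected = choose_reading(["reading_a", "reading_b"],
                          ["clause_w", "clause_x"],
                          toy_strength)
print(selected)  # reading_b
```

Passing the strength lookup in as a callback keeps the selection logic separate from how the dictionary and the semantic classification are stored.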

FIG. 1 illustrates the homograph disambiguation method according to the present invention and shows the structure of the co-occurrence dictionary. FIG. 2 shows the semantic classification of each word, which is defined hierarchically.

In this embodiment, homograph disambiguation is placed immediately after dependency analysis in the sequence of text-to-speech processing. How the disambiguation is carried out is explained below with concrete examples. The rules that determine the strength of co-occurrence are defined in Table 1.

Example 1

彼は 今日 実験を 行った (itta / okonatta).
"He conducted an experiment today."

In this example, 行った is the homograph; it can be read "itta" (went) or "okonatta" (conducted).

Three clauses depend on 行った: 彼は (kare wa, "he"), 今日 (kyō, "today"), and 実験を (jikken o, "an experiment"). Searching the co-occurrence dictionary for 行った yields the following two co-occurrence relations.

町へ（に） 行った (itta): "went to town"
実験を 行った (okonatta): "conducted an experiment"

The sentence of Example 1 contains the dependency relation 実験を 行った, which matches a headword of the co-occurrence dictionary exactly, so the strength of co-occurrence is 5. In this case "okonatta" is selected as the reading of 行った.
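The headword-level match of Example 1 can be sketched as a small lookup. The two dictionary entries and the strength value 5 come from the text; the entry layout (noun, allowed particles, reading), the constant and function names, and the choice to compare the particle as well as the noun are assumptions made for illustration, since the actual structure shown in FIG. 1 is not reproduced here.

```python
# Hypothetical layout: homograph -> list of (noun, allowed particles, reading).
COOC_DICT = {
    "行った": [
        ("町", ("へ", "に"), "いった"),
        ("実験", ("を",), "おこなった"),
    ],
}

EXACT_MATCH_STRENGTH = 5  # value stated in Example 1


def exact_matches(homograph, dependents):
    """Return {reading: strength} for dependents that match an entry exactly.

    `dependents` is a list of (noun, particle) pairs for the clauses that
    depend on the homograph.
    """
    result = {}
    for noun, particles, reading in COOC_DICT.get(homograph, []):
        for dep_noun, dep_particle in dependents:
            if dep_noun == noun and dep_particle in particles:
                result[reading] = EXACT_MATCH_STRENGTH
    return result


example1 = [("彼", "は"), ("今日", ""), ("実験", "を")]
print(exact_matches("行った", example1))  # {'おこなった': 5}
```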

Example 2

私は 学校へ 行った (itta / okonatta).
"I went to school."

In this example too, 行った is the homograph, so searching the co-occurrence dictionary returns the same result as in Example 1. This time no dependency relation in the sentence matches at the headword level. However, the semantic code of 学校 (gakkō, "school") is 2.1.1 in FIG. 2, which denotes a facility, and it matches 町 (machi, "town") one level up in the semantic classification hierarchy. The strength of co-occurrence is therefore 3, and "itta" is selected as the reading of 行った.
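One way to read "matching one level up" is that the two dotted semantic codes share the same parent. The sketch below follows that reading. The code 2.1.1 for 学校 and the strength 3 come from the text; the code assigned to 町, the function names, and the sibling interpretation itself are assumptions, because FIG. 2 and Table 1 are not reproduced here.

```python
ONE_LEVEL_UP_STRENGTH = 3  # value stated in Example 2

SEMANTIC_CODE = {
    "学校": "2.1.1",  # given in the text (facility)
    "町": "2.1.2",    # hypothetical: the text does not state this code
}


def parent(code):
    """Drop the last component of a dotted code: '2.1.1' -> '2.1'."""
    return code.rsplit(".", 1)[0]


def shares_parent(a, b):
    """True when both codes have a parent level and those parents are equal."""
    return "." in a and "." in b and parent(a) == parent(b)


def class_strength(sentence_noun, entry_noun):
    """Strength from a one-level-up match, or 0 if the codes are unrelated.

    Identical codes also share a parent and score the same here; the text
    does not say what value the rule table assigns to that case.
    """
    a = SEMANTIC_CODE.get(sentence_noun)
    b = SEMANTIC_CODE.get(entry_noun)
    if a is None or b is None:
        return 0
    return ONE_LEVEL_UP_STRENGTH if shares_parent(a, b) else 0


print(class_strength("学校", "町"))  # 3
```

With this score, the entry 町へ（に） 行った (itta) outranks 実験を 行った (okonatta), which gets no match for the sentence of Example 2.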

Example 3

八代 (Hachidai / Yatsushiro) を 出発した。
"(I) departed from 八代."

In this example, 八代 is the homograph, so searching the co-occurrence dictionary yields 八代（やつしろ）に 到着した ("arrived at Yatsushiro"). Because 出発した ("departed") has the same semantic classification as 到着した ("arrived"), the strength of co-occurrence is 2, and "Yatsushiro" is selected as the reading of 八代.
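Example 3 differs from Example 2 in that the semantic match is on the predicate rather than on the noun. A minimal sketch of that case follows. The dictionary entry and the strength 2 come from the text; the class labels, the names, and the decision to ignore particles are hypothetical, and headword-level matches are assumed to be handled elsewhere.

```python
SAME_CLASS_STRENGTH = 2  # value stated in Example 3

# Entry from the text: 八代 read "やつしろ" co-occurs with 到着した.
ENTRIES = {"八代": [("やつしろ", "到着した")]}

# Hypothetical semantic class labels for predicates.
PREDICATE_CLASS = {
    "到着した": "movement",
    "出発した": "movement",
    "食べた": "eating",
}


def reading_by_predicate_class(homograph, predicate):
    """Return (reading, strength) when the sentence predicate falls in the
    same semantic class as a registered predicate, else (None, 0)."""
    sentence_class = PREDICATE_CLASS.get(predicate)
    if sentence_class is None:
        return (None, 0)
    for reading, entry_predicate in ENTRIES.get(homograph, []):
        if PREDICATE_CLASS.get(entry_predicate) == sentence_class:
            return (reading, SAME_CLASS_STRENGTH)
    return (None, 0)


print(reading_by_predicate_class("八代", "出発した"))  # ('やつしろ', 2)
```

No entry exists for the reading "はちだい", so its strength stays at 0 and "やつしろ" wins.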

[Effect]

As is clear from the above description, according to the present invention the use of a co-occurrence dictionary makes it easy to disambiguate homographs without preparing case patterns for all predicates. In addition, because matches at the semantic classification level are also taken into account when the strength of co-occurrence is determined, co-occurrence relations of quite varied expressions can be handled even though only typical examples are registered in the co-occurrence dictionary.

[Brief explanation of the drawing]

FIG. 1 illustrates the homograph disambiguation method according to the present invention and shows the structure of the co-occurrence dictionary. FIG. 2 shows the hierarchically defined semantic classification.

Patent applicant: Ricoh Co., Ltd.

Claims

[Scope of Claims]

1. A homograph disambiguation method for a text-to-speech synthesis device comprising a morphological analysis unit that morphologically analyzes Japanese text written in mixed kanji and kana, a dependency analysis unit that examines the dependency relations between clauses, a prosodic symbol generation unit that generates a prosodic symbol string from the results of the morphological analysis and the dependency analysis, and a speech synthesis unit that converts the prosodic symbol string into synthesized speech, characterized in that a co-occurrence dictionary is provided that stores the co-occurrence relations between clauses that serve as clues when disambiguating homographs, and in that, when a homograph is present, the strength of co-occurrence between the clause containing the homograph and a clause in a dependency relation with it is looked up in the co-occurrence dictionary and the word with the stronger co-occurrence relation is selected.

2. The homograph disambiguation method according to claim 1, characterized in that, when the strength of co-occurrence is determined, a co-occurrence relation is judged to hold to some degree not only when the headword matches the co-occurrence dictionary but also when the word is semantically similar to a word registered in the co-occurrence dictionary.