JPH02201643A - System for discriminatingly reading isomorphic word - Google Patents
- Publication number
- JPH02201643A
- Authority
- JP
- Japan
- Prior art keywords
- occurrence
- isomorphic
- word
- words
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
Description
DETAILED DESCRIPTION OF THE INVENTION
Technical Field
The present invention relates to a homograph (isomorphic word) reading disambiguation method for a text-to-speech synthesis device.
Prior Art
To determine the reading of a homograph, some other word that is related to the homograph must be used as a clue. The usual approach is therefore to take as the clue a bunsetsu (phrase) that is in a dependency relationship with the bunsetsu containing the homograph. This approach is described in "Automatic Reading Disambiguation of Homographs Using Hierarchical Word Attributes" (Transactions of the Institute of Electronics and Communication Engineers of Japan, 85/3, Vol. J68-D, No. 3, pp. 392-399). According to that paper, the following methods were used:
(1) When the homograph is a predicate, find the bunsetsu that fills an obligatory case of the predicate, and select the predicate for which the semantic attribute of the noun in that bunsetsu matches the one specified by the predicate's case pattern.
(2) When the homograph forms part of a coordinate phrase, select the combination that is semantically close to the other noun.
(3) When the homograph forms an adnominal modification relationship linked by the case particle "の" (no), assign in advance to each word to be disambiguated the semantic attributes of the words that can connect before or after it via "の", and select the candidate whose combination matches those attributes.
(4) When the homograph is a case element of a predicate, select the candidate that matches the semantic attribute specified by the predicate's case pattern.
(5) When the homograph stands in a noun bunsetsu-predicate relationship (a "〜は〜だ", that is, "X is Y", relationship), select the combination in which the predicate noun and the noun in the noun bunsetsu have semantically close attributes.
(6) When the homograph is modified adnominally, find the predicate that modifies the homograph, and select the noun that has the attribute specified by that predicate's case pattern.
Implementing methods (1), (4), and (6) above requires case patterns to be prepared for every predicate. When the dictionary has a large vocabulary this is very laborious, and preparing case patterns for all predicates in the dictionary just to disambiguate the limited set of words that are homographs is not cost-effective.
Purpose
The present invention was made in view of the above circumstances. A co-occurrence dictionary that stores typical examples of co-occurrence relationships serving as clues for disambiguating homographs is prepared in advance. It is used to obtain the strength of co-occurrence between the homograph and each bunsetsu in a dependency relationship with it, and the combination with the stronger co-occurrence is selected. This makes it possible to disambiguate homographs without preparing case patterns for all predicates. Furthermore, when the strength of co-occurrence is examined, a co-occurrence relationship is judged to hold to some degree not only when a word matches the co-occurrence dictionary at the headword level but also when it belongs to a semantic class similar to that of a word registered in the dictionary, so that co-occurrence relationships for a variety of expressions can be inferred from a single typical example. The object of the invention is to provide a homograph reading disambiguation method configured in this way.
Configuration
To achieve the above object, the present invention applies to a text-to-speech synthesis device comprising a morphological analysis unit that morphologically analyzes Japanese text written in mixed kanji and kana, a dependency analysis unit that examines dependency relationships between bunsetsu (phrases), a prosodic symbol generation unit that generates a prosodic symbol string from the results of the morphological analysis and the dependency analysis, and a speech synthesis unit that converts the prosodic symbol string into synthesized speech. The invention is characterized in that the device has a co-occurrence dictionary storing co-occurrence relationships between bunsetsu that serve as clues for disambiguating homographs (words written identically but read differently); when a homograph is present, the strength of co-occurrence between the bunsetsu containing the homograph and the bunsetsu in a dependency relationship with it is looked up in the co-occurrence dictionary, and the word with the stronger co-occurrence relationship is selected. It is further characterized in that, when the strength of co-occurrence is determined, a co-occurrence relationship is judged to exist to some degree not only when the headword matches the co-occurrence dictionary but also when the word is semantically similar to a word registered in the dictionary. The invention is described below on the basis of an embodiment.
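The selection step described above can be sketched as follows. This is a minimal sketch, not the patent's implementation: the entry layout `CooccurrenceEntry`, the function name `select_reading`, and the convention that a strength of 0 means "no evidence" are all assumptions made for illustration. The actual dictionary layout is the one shown in Fig. 1.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CooccurrenceEntry:
    """One typical co-occurrence example (hypothetical layout)."""
    dependent: str              # headword of the dependent bunsetsu, e.g. "実験"
    particles: Tuple[str, ...]  # particles the example allows, e.g. ("を",)
    head: str                   # headword of the governing bunsetsu, e.g. "行った"
    homograph: str              # which of the two words is the homograph
    reading: str                # reading this example fixes for the homograph


def select_reading(strengths: Dict[str, int]) -> Optional[str]:
    """Pick the reading whose co-occurrence strength is highest.

    Returns None when no candidate has a positive strength, so the caller
    can fall back to a default reading (the fallback is an assumption).
    """
    best = max(strengths, key=strengths.get, default=None)
    if best is None or strengths[best] <= 0:
        return None
    return best


entry = CooccurrenceEntry("実験", ("を",), "行った", "行った", "おこなった")
print(select_reading({"いった": 0, entry.reading: 5}))
```

Keeping the scoring separate from the selection means the rules that assign strengths can change without touching the code that picks the reading.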
Fig. 1 illustrates the homograph reading disambiguation method according to the present invention and shows the structure of the co-occurrence dictionary. Fig. 2 shows the semantic classification of each word, which is defined hierarchically. In this embodiment, homograph disambiguation is placed immediately after dependency analysis in the sequence of text-to-speech synthesis processing. How the disambiguation is carried out is explained next with concrete examples. The rules that determine the strength of co-occurrence are defined as shown in Table 1.
Example 1
[Dependency diagram: 彼は, 今日, and 実験を each depend on 行った]
彼は 今日 実験を 行った (いった / おこなった)。
"He conducted an experiment today."
In this example, "行った" is the homograph. Three bunsetsu depend on "行った": "彼は", "今日", and "実験を". Looking up "行った" in the co-occurrence dictionary yields the following two co-occurrence relationships:
町へ(に) 行った (いった) ("went to the town")
実験を 行った (おこなった) ("conducted an experiment")
The sentence of Example 1 contains the dependency relationship "実験を 行った", which matches a headword of the co-occurrence dictionary exactly, so the strength of co-occurrence is 5. In this case "おこなった" (okonatta) is selected as the reading of "行った".
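The headword-level lookup in Example 1 can be sketched as below. This is an illustrative sketch only: the tuple layout of `ENTRIES`, the representation of the sentence as (headword, particle) pairs, and the function name `exact_match_strengths` are assumptions. The value 5 is the strength the text reports for an exact match.

```python
# Hypothetical co-occurrence entries for the homograph "行った":
# (dependent headword, allowed particles, reading of "行った").
ENTRIES = [
    ("町", ("へ", "に"), "いった"),
    ("実験", ("を",), "おこなった"),
]

EXACT_MATCH_STRENGTH = 5  # value reported for Example 1


def exact_match_strengths(dependents):
    """Score each reading using headword-level matches only.

    `dependents` is a list of (headword, particle) pairs for the bunsetsu
    that depend on the homograph.
    """
    strengths = {reading: 0 for _, _, reading in ENTRIES}
    for noun, particles, reading in ENTRIES:
        for headword, particle in dependents:
            if headword == noun and particle in particles:
                strengths[reading] = EXACT_MATCH_STRENGTH
    return strengths


# 彼は 今日 実験を 行った
sentence = [("彼", "は"), ("今日", ""), ("実験", "を")]
strengths = exact_match_strengths(sentence)
print(max(strengths, key=strengths.get))
```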
Example 2
[Dependency diagram: 私は and 学校へ each depend on 行った]
私は 学校へ 行った (いった / おこなった)。
"I went to school."
In this example too, "行った" is the homograph, so searching the co-occurrence dictionary gives the same result as in Example 1. This time no dependency relationship in the sentence matches at the headword level. However, the semantic code of "学校" (school) is 2.1.1 in Fig. 2, which denotes a facility, and it agrees with that of "町" (town) one level up in the semantic classification hierarchy. The strength of co-occurrence is therefore 3, and "いった" (itta) is selected as the reading of "行った".
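The "one level up" comparison of hierarchical semantic codes in Example 2 can be sketched as follows, assuming the codes are dotted strings as in "2.1.1". The function names are hypothetical, and the code for "町" is invented for illustration because the text does not state it.

```python
def shared_depth(code_a: str, code_b: str) -> int:
    """Number of leading levels two dotted semantic codes have in common."""
    depth = 0
    for a, b in zip(code_a.split("."), code_b.split(".")):
        if a != b:
            break
        depth += 1
    return depth


def matches_one_level_up(code_a: str, code_b: str) -> bool:
    """True when the codes differ only in their last level (same parent)."""
    parts_a, parts_b = code_a.split("."), code_b.split(".")
    return (
        len(parts_a) == len(parts_b)
        and parts_a != parts_b
        and shared_depth(code_a, code_b) == len(parts_a) - 1
    )


SCHOOL = "2.1.1"  # code given in the text for "学校" (facility)
TOWN = "2.1.2"    # hypothetical code for "町"; the text does not state it

print(matches_one_level_up(SCHOOL, TOWN))
```

Comparing code prefixes keeps the check independent of how deep the hierarchy in Fig. 2 actually is.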
Example 3
[Dependency diagram: 八代を depends on 出発した]
八代 (はちだい / やつしろ) を 出発した。
"(I) departed from Yatsushiro."
In this example, "八代" is the homograph, so searching the co-occurrence dictionary yields "八代 (やつしろ) に 到着した" ("arrived at Yatsushiro"). "出発した" (departed) belongs to the same semantic class as "到着した" (arrived), so the strength of co-occurrence is 2, and "やつしろ" (Yatsushiro) is selected as the reading of "八代".
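Example 3 differs from the first two in that the homograph is the dependent noun and the evidence comes from the semantic class of the predicate. A minimal sketch, assuming a lookup table of predicate classes: the class codes, the `ENTRIES` layout, and the function name are all hypothetical, and only the strength value 2 is taken from the text.

```python
# Hypothetical semantic classes for predicates; the codes are invented here.
PREDICATE_CLASS = {
    "到着した": "3.1.1",
    "出発した": "3.1.1",
    "書いた": "3.4.2",
}

# Hypothetical entries for the homograph noun "八代": (predicate, reading).
ENTRIES = [("到着した", "やつしろ")]

SAME_CLASS_STRENGTH = 2  # value reported for Example 3


def reading_by_predicate_class(predicate):
    """Return (reading, strength) for the first entry whose predicate has
    the same semantic class as `predicate`, or (None, 0) if none does."""
    target = PREDICATE_CLASS.get(predicate)
    for entry_predicate, reading in ENTRIES:
        if target is not None and PREDICATE_CLASS.get(entry_predicate) == target:
            return reading, SAME_CLASS_STRENGTH
    return None, 0


# 八代を 出発した
print(reading_by_predicate_class("出発した"))
```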
Effect
As is clear from the above description, the present invention uses a co-occurrence dictionary, so homographs can be disambiguated easily without preparing case patterns for all predicates. In addition, because matches at the semantic-class level are also taken into account when the strength of co-occurrence is determined, a fairly wide variety of co-occurring expressions can be handled even though only typical examples are registered in the co-occurrence dictionary.
Brief Description of the Drawings
Fig. 1 is a diagram for explaining the homograph reading disambiguation method according to the present invention and shows the structure of the co-occurrence dictionary. Fig. 2 is a diagram showing the hierarchically defined semantic classification.
Patent applicant: Ricoh Co., Ltd.
Claims (1)
[Scope of Claims]
1. A homograph reading disambiguation method for a text-to-speech synthesis device comprising a morphological analysis unit that morphologically analyzes Japanese text written in mixed kanji and kana, a dependency analysis unit that examines dependency relationships between bunsetsu (phrases), a prosodic symbol generation unit that generates a prosodic symbol string from the results of the morphological analysis and the dependency analysis, and a speech synthesis unit that converts the prosodic symbol string into synthesized speech, the method being characterized in that the device has a co-occurrence dictionary storing co-occurrence relationships between bunsetsu that serve as clues for disambiguating homographs, and in that, when a homograph is present, the strength of co-occurrence between the bunsetsu containing the homograph and the bunsetsu in a dependency relationship with it is looked up in the co-occurrence dictionary and the word with the stronger co-occurrence relationship is selected as a result.
2. The homograph reading disambiguation method according to claim 1, characterized in that, when the strength of co-occurrence is determined, a co-occurrence relationship is judged to exist to some degree not only when the headword matches the co-occurrence dictionary but also when the word is semantically similar to a word registered in the co-occurrence dictionary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1022021A JP2655711B2 (en) | 1989-01-31 | 1989-01-31 | Homomorphic reading system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1022021A JP2655711B2 (en) | 1989-01-31 | 1989-01-31 | Homomorphic reading system |
Publications (2)
Publication Number | Publication Date |
---|---|
JPH02201643A true JPH02201643A (en) | 1990-08-09 |
JP2655711B2 JP2655711B2 (en) | 1997-09-24 |
Family
ID=12071333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP1022021A Expired - Fee Related JP2655711B2 (en) | 1989-01-31 | 1989-01-31 | Homomorphic reading system |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2655711B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH06289890A (en) * | 1993-03-31 | 1994-10-18 | Sony Corp | Natural language processor |
-
1989
- 1989-01-31 JP JP1022021A patent/JP2655711B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JP2655711B2 (en) | 1997-09-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
LAPS | Cancellation because of no payment of annual fees |