JP3015509B2

JP3015509B2 - Automatic learning device for machine translation

Info

Publication number: JP3015509B2
Application number: JP3149100A
Authority: JP
Inventors: 友樹長瀬
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1991-06-21
Filing date: 1991-06-21
Publication date: 2000-03-06
Anticipated expiration: 2015-03-06
Also published as: JPH04372061A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an automatic learning device for machine translation that automatically generates the data used for polysemous-word selection and relational-element selection as the user corrects machine-translated sentences. Machine translation systems that automatically translate, for example, Japanese sentences into English are provided with a function for learning the appropriate translation of a polysemous word.

[0002] However, the conventional learning function for polysemous-word selection requires the user either to indicate the position of the wrong word in the translated English sentence together with the correct translation, or to add directly the semantic relation data used for polysemous-word selection. This places a heavy burden on the user, and an improvement in this respect is desired.

[0003]

[Prior Art] Conventional machine translation systems are provided with a function for learning the appropriate translation of a polysemous word. When a polysemous word is learned, the user must explicitly specify the position of the wrong word and the correct translation. For example, consider a system that produces the following translation:

[0004]

(Equation 1)

[0005] To teach such a system how to choose among the translations of "切る" (polysemous-word selection), the user must point to "cut" in the translated sentence and indicate that the correct translation is "shuffle". When a polysemous word is learned through such a user instruction, structural analysis of the original sentence generates semantic relation data for polysemous-word selection, and this data is registered in the semantic relation dictionary. Consequently, when translation is performed after learning, the system refers to the semantic relation dictionary thus created and produces the following translation:

[0006]

(Equation 2)

[0007] In this way the appropriate sense of the polysemous word is selected.

[0008]

[Problem to Be Solved by the Invention] However, with the polysemous-word learning function of such a conventional machine translation system, the user must, with the learning mode set, repeatedly find a wrongly selected polysemous word in a machine-translated sentence and indicate the correct translation. The burden that learning places on the user is therefore heavy.

[0009] Moreover, when no translated sentence suitable for learning is available, the user must directly add the semantic relation data for polysemous-word selection, so the burden on the user remains heavy. The same problem arises in learning that generates semantic knowledge data for dependency relations.

[0010] The present invention has been made in view of these conventional problems. It focuses on the fact that, in a machine translation system, the user always performs the work of correcting wrong translations. Its object is to provide an automatic learning device for machine translation that automatically generates, as the user performs this correction work, the semantic knowledge data used for polysemous-word selection and dependency-relation selection, thereby greatly reducing the burden on the user.

[0011]

[Means for Solving the Problem] FIG. 1 is a diagram illustrating the principle of the present invention. The present invention is directed to a machine translation system that machine-translates an original sentence, such as a Japanese sentence, and outputs a translated sentence, such as a translated English sentence. As an automatic learning device for machine translation intended for such a system, the present invention is characterized by comprising: an original sentence structure analysis unit 1 that, when a correction operation is performed on a machine-translated sentence, analyzes the structure of the original sentence subject to correction and generates a conceptual structure; a translated sentence structure analysis unit 2 that analyzes the structure of the corrected, correct translation of the original sentence and generates a conceptual structure; and a structure comparison unit 3 that compares the results of the original sentence structure analysis unit 1 and the translated sentence structure analysis unit 2, generates semantic relation data based on the difference between the conceptual structures of the original sentence and the translation, and registers the data in a semantic relation dictionary 4 used for machine translation.
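The data flow among units 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `learn_from_correction`, its callable parameters, and the representation of a conceptual structure as a set of (from, relation, to) triples are all hypothetical choices made for the example.

```python
def learn_from_correction(original, corrected, analyze_original,
                          analyze_translation, compare, dictionary):
    """Run one learning pass for a user-corrected sentence pair.

    All parameter names are hypothetical: the two analyzers stand in for
    units 1 and 2, `compare` for unit 3, and `dictionary` for dictionary 4.
    """
    source_structure = analyze_original(original)               # unit 1
    target_structure = analyze_translation(corrected)           # unit 2
    new_entries = compare(source_structure, target_structure)   # unit 3
    dictionary.update(new_entries)                               # dictionary 4
    return new_entries


# Toy stand-ins: pre-parsed structures are looked up instead of parsed.
parsed = {
    "トランプを切る": {("CUT", "obj", "CARD")},
    "shuffle cards": {("SHUFFLE", "obj", "CARD")},
}
semantic_dictionary = set()
added = learn_from_correction(
    "トランプを切る", "shuffle cards",
    parsed.get, parsed.get,
    lambda src, tgt: tgt - src,   # naive comparison: keep target-only triples
    semantic_dictionary,
)
```

The comparison passed in here is deliberately naive; the embodiment's comparison procedures are described with FIG. 5 and FIG. 13.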

[0012] For example, in learning for polysemous-word selection, the structure comparison unit 3 generates the semantic relation data used to select among polysemous words during machine translation, based on the difference between the conceptual structure of the original sentence analyzed by the original sentence structure analysis unit 1 and that of the corrected translation analyzed by the translated sentence structure analysis unit 2. Taking polysemous-word selection in Japanese-to-English translation as an example, the structure comparison unit 3 generates this data based on the difference between the conceptual structure of the Japanese sentence analyzed by unit 1 and that of the corrected English translation analyzed by unit 2.

[0013] In learning dependency relations, the structure comparison unit 3 generates semantic relation data for a dependency relation based on the difference between the relational element of the original sentence analyzed by the original sentence structure analysis unit 1 and the relational element of the corrected translation analyzed by the translated sentence structure analysis unit 2. Taking Japanese-to-English translation as an example, the structure comparison unit 3 generates this data based on the difference between the relational element of the analyzed Japanese sentence and that of the corrected English translation.

[0014]

[Operation] With the automatic learning device for machine translation of the present invention configured in this way, when the user corrects a machine-translated sentence, structural analysis is performed on both the corrected, correct translation and the original sentence. The conceptual structures obtained as analysis results are compared, and from the differences judged to be usable, semantic relation knowledge (semantic relation data) for polysemous-word selection or dependency-relation selection is generated and registered in the semantic relation dictionary. All of this processing is performed automatically.

[0015] Because semantic relation data is generated automatically from this two-way analysis of the original sentence and the correct translation whenever a translation is corrected, the user can register, by learning, the semantic relation data that determines polysemous-word selection and dependency relations without being aware of any learning operation at all. Moreover, since learning takes place through everyday correction work, the capability of the machine translation system improves the more it is used.

[0016]

[Embodiment] FIG. 2 is a block diagram showing the configuration of an embodiment of the present invention. In FIG. 2, 10 is a source language morphological analysis unit and 12 is a source language syntax analysis unit; these are provided to analyze the conceptual structure of the source language, that is, of the original sentence. Similarly, 14 is a target language morphological analysis unit and 16 is a target language syntax analysis unit; these are provided to analyze the conceptual structure of the target language, that is, of the corrected, correct translation.

[0017] The input to the source language morphological analysis unit 10 and the target language morphological analysis unit 14 is supplied as a pair consisting of the original sentence and the corrected, correct translation whenever a sentence translated by the machine translation system is corrected. Of course, instead of being input in real time, these pairs of original sentence and corrected translation may be stored in a memory for learning data and then read out and learned when a learning mode is specified after the correction work has finished.

[0018] In the morphological analysis of each language by the source language morphological analysis unit 10 and the target language morphological analysis unit 14, each word making up the sentence is treated as one morpheme, and each morpheme is converted into an intermediate language having, for example, the data format shown in FIG. 3. The intermediate-language data format of FIG. 3 consists of a notation, such as "切る" or "cut", and a concept symbol, with attribute information added as necessary.
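The record format described for FIG. 3 can be pictured with a small sketch. The class name `Morpheme`, its field names, and the concept code `"CUT"` are assumptions for illustration only; the patent does not give the actual encoding of concept symbols or attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """One intermediate-language record (field names are hypothetical)."""
    surface: str            # notation as written, e.g. "切る" or "cut"
    concept: str            # language-independent concept symbol
    attributes: tuple = ()  # optional attribute information


# Words of different languages that share a meaning share a concept symbol.
ja = Morpheme("切る", "CUT")
en = Morpheme("cut", "CUT")
same_concept = ja.concept == en.concept
```

Keeping the surface form separate from the concept symbol is what lets later comparison steps work across languages.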

[0019] A concept symbol here is a representation that can express meaning in common across any language. For example, if the Japanese "切る" and the English "cut" denote the same concept, they are assigned the same concept symbol (code). All of the structural analysis described below is performed on the intermediate language using concept symbols. The source language syntax analysis unit 12 and the target language syntax analysis unit 16 perform syntax analysis using the morphemes obtained from the source language morphological analysis unit 10 and the target language morphological analysis unit 14, and convert them into the conceptual structures of the source language and the target language.

[0020] FIG. 4 shows the conceptual structures of the source language and the target language for the original Japanese sentence "私は教室でトランプを切る。" It shows the syntax analysis results for the case where machine translation output "I cut cards in the classroom." and this was corrected to the correct translation "I shuffle cards in the classroom." Both structures are, of course, expressed in the intermediate language (concept symbols) corresponding to each language.

[0021] Because parsing the Japanese sentence and parsing the text before correction yield the same conceptual structure, the source side (source language side) in FIG. 4 is shown for the original sentence to be translated. The target side (target language side) shows the conceptual structure of the corrected translation. In these conceptual structures the verbs "cut" and "shuffle" are the from-nodes and the other morphemes are to-nodes, and a relational element indicating the semantic relationship is shown between each pair of nodes.
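As an illustration, the two conceptual structures of FIG. 4 can be written as sets of from-node, relational element, to-node edges. This is a sketch under assumptions: the `Edge` class and `concept_structure` helper are hypothetical, and the relational elements agent, obj and place are taken from the description of FIG. 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    from_node: str   # concept symbol of the governing verb
    relation: str    # relational element, e.g. "obj"
    to_node: str     # concept symbol of the dependent morpheme


def concept_structure(verb, arguments):
    """Build a structure with `verb` as the from-node of every edge."""
    return frozenset(Edge(verb, rel, node) for rel, node in arguments.items())


arguments = {"agent": "I", "obj": "CARD", "place": "CLASSROOM"}
source_side = concept_structure("CUT", arguments)      # from the original sentence
target_side = concept_structure("SHUFFLE", arguments)  # from the corrected translation
```

The two structures share every to-node and relational element and differ only in the from-node, which is the difference the later analysis picks up.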

[0022] Referring again to FIG. 2, the analysis results of the source language syntax analysis unit 12 and the target language syntax analysis unit 16 are sent to the conceptual structure difference analysis unit 18. This unit compares, for example, the conceptual structure of the original sentence shown in FIG. 4 with that of the corrected translation and, if there is a difference, notifies the semantic relation knowledge construction unit 20 that follows. In the case of FIG. 4, the from-nodes "cut" and "shuffle" differ.

[0023] The semantic relation knowledge construction unit 20 analyzes the difference between the conceptual structures of the source language and the target language and extracts semantic relation knowledge. Any knowledge that can be extracted is put into dictionary format and registered in the semantic relation dictionary by the semantic relation dictionary registration unit 22. A registration information inquiry unit 24 is also provided. It shows the user the semantic relation data registered by the semantic relation dictionary registration unit 22, so that the user can point out and delete any incorrect data.
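The register-then-confirm behaviour of units 22 and 24 might be sketched as below. The class name `SemanticRelationDictionary` and its methods are hypothetical, and entries are assumed to be (from, relation, to) triples; the patent does not specify the storage format.

```python
class SemanticRelationDictionary:
    """Hypothetical store for (from, relation, to) entries."""

    def __init__(self):
        self.entries = []

    def register(self, new_entries):
        # Registration unit 22: add entries that are not already present.
        added = [e for e in new_entries if e not in self.entries]
        self.entries.extend(added)
        return added  # what inquiry unit 24 would show to the user

    def delete(self, entry):
        # Inquiry unit 24: the user removes an entry judged to be wrong.
        self.entries.remove(entry)


dictionary = SemanticRelationDictionary()
shown = dictionary.register([("SHUFFLE", "obj", "CARD"),
                             ("SHUFFLE", "agent", "I")])
dictionary.delete(("SHUFFLE", "agent", "I"))   # user rejects this entry
```

Returning the newly added entries from `register` is one simple way to give the user a chance to veto them.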

[0024] The analysis processing performed by the semantic relation knowledge construction unit 20 is described below with reference to the flowchart of FIG. 5. The processing of FIG. 5 takes as its example the general form of the source-side and target-side conceptual structures shown in FIG. 6. In FIG. 5, step S1 first places the focus on the target-side verb node (A). Step S2 then moves the focus to the node at the end of any obligatory case, for example node (B).

[0025] Next, step S3 searches the source side for a node (C) identical to the node (B) that is in focus on the target side. Since such a node exists in this case, processing proceeds from step S4 to step S5, which checks whether the verb V1 of the target-side from-node (A) differs from the verb V2 of the source-side from-node (D) while the relational elements α and β are the same. That is, the following condition is checked:

[0026]

(Equation 3)

[0027] In the case of FIG. 6 this condition is satisfied, so processing proceeds to step S6, where the focus is moved to the node (E) at the end of another obligatory case. Step S7 then checks whether another obligatory-case node exists. Here node (E) exists, so processing returns to step S3, searches the source side for the node (F) identical to node (E), and passes through steps S4 and S5 to step S6. This time no nodes remain, so processing proceeds from step S7 to step S10 and generates one set of semantic relation data.

[0028] Step S8 then looks for another verb node. There is none in the case of FIG. 6, so the series of processing ends via step S9. FIG. 7 is an explanatory diagram of the semantic relation data for polysemous-word selection created from the conceptual structure of FIG. 4 through the processing of FIG. 6. For the target language, "SHUFFLE" is the from-node, the to-nodes "I", "CARD" and "CLASSROOM" are provided, and the relational elements <agent>, <obj> and <place> between the from-node and the to-nodes are registered.
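The walk through steps S1 to S10 can be approximated in a few lines. This is a simplified sketch, not the flowchart itself: conceptual structures are assumed to be lists of (from, relation, to) triples of concept symbols, each to-node is assumed to occur once per sentence, and the function name `learn_polysemy` is hypothetical.

```python
from collections import defaultdict


def learn_polysemy(source_edges, target_edges):
    """Return target-side triples to register for polysemous-word selection."""
    # S3: index the source side so the node matching a target to-node is found.
    source_by_to = {to: (frm, rel) for frm, rel, to in source_edges}

    # Group the target side by verb (from-node).
    by_verb = defaultdict(list)
    for frm, rel, to in target_edges:
        by_verb[frm].append((rel, to))

    learned = []
    for verb, case_fillers in by_verb.items():      # S1 / S8: each target verb
        usable = True
        for rel, to in case_fillers:                # S2 / S6 / S7: each case filler
            match = source_by_to.get(to)            # S3 / S4: same node on source side
            # S5: verbs must differ while the relational elements agree.
            if match is None or match[0] == verb or match[1] != rel:
                usable = False
                break
        if usable:                                  # S10: one set of data
            learned.extend((verb, rel, to) for rel, to in case_fillers)
    return learned


source = [("CUT", "agent", "I"), ("CUT", "obj", "CARD"), ("CUT", "place", "CLASSROOM")]
target = [("SHUFFLE", "agent", "I"), ("SHUFFLE", "obj", "CARD"),
          ("SHUFFLE", "place", "CLASSROOM")]
learned = learn_polysemy(source, target)
```

For the FIG. 4 pair this yields the three SHUFFLE triples described for FIG. 7; when the source and target structures are identical, nothing is learned.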

[0029] Next, polysemous-word selection in a machine translation system that uses the semantic relation data created by the present invention is briefly described. FIG. 8 shows the contents of the Japanese dictionary and the semantic relation dictionary of the machine translation system as they relate to the Japanese word "切る". In the Japanese dictionary, concept symbols with the same meanings as the English "cut", "shuffle" and "turn" are registered for the Japanese notation "切る" in the data format shown in FIG. 3, and a priority is set for each.

[0030] With this Japanese dictionary alone, polysemous-word selection is made by priority, so "切る" is translated as "cut". The semantic relation dictionary, on the other hand, holds semantic relation data with "shuffle" as the from-node and "card" as the to-node, registered together with the relational element <obj>. For the from-node "turn", semantic relation data with "steering wheel" as the to-node is likewise registered together with the relational element <obj>.

[0031] FIG. 9 shows polysemous-word selection when "紙を切る" ("cut paper") is translated using the Japanese dictionary and semantic relation dictionary of FIG. 8. The original sentence is first structurally analyzed, which shows that the object of "切る" is "paper". The semantic relation dictionary is then consulted, but it contains no matching semantic relation. In this case "cut", which has the highest priority among the senses of "切る" in the Japanese dictionary, is selected.

[0032] FIG. 10 shows polysemous-word selection when "トランプを切る" ("shuffle cards") is translated using the Japanese dictionary and semantic relation dictionary of FIG. 8. The original sentence is first structurally analyzed, which shows that the object of "切る" is "card". Consulting the semantic relation dictionary then finds the following entry:

[0033]

(Equation 4)

[0034] Because this entry exists, "shuffle" is selected as the correct translation. FIGS. 9 and 10 take <obj> as the example of the relational element that serves as the criterion for polysemous-word selection, that is, for choosing between translations. Other relational elements that often serve as such a criterion include <inst>, <goal>, <manner> and <role>, whereas <subj> and <place> rarely do. The semantic relation knowledge that the present invention generates for the machine translation system's translation choices is therefore limited to <obj>, <inst>, <goal>, <manner>, <role> and the like.
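The lookup described for FIGS. 8 to 10 can be sketched as a dictionary consultation with a priority fallback. The names `JAPANESE_DICT`, `SEMANTIC_DICT`, `SELECTION_RELATIONS` and `select_sense` are hypothetical, and the relative priority of "shuffle" and "turn" is an assumption; the text only states that "cut" has the highest priority.

```python
# Candidate concept symbols per Japanese word, highest priority first.
JAPANESE_DICT = {"切る": ["CUT", "SHUFFLE", "TURN"]}

# Learned semantic relation data as (from-node, relational element, to-node).
SEMANTIC_DICT = {("SHUFFLE", "obj", "CARD"), ("TURN", "obj", "STEERING_WHEEL")}

# Relational elements treated as useful criteria for choosing a translation.
SELECTION_RELATIONS = {"obj", "inst", "goal", "manner", "role"}


def select_sense(word, relations):
    """Pick a concept symbol for `word` given its (relation, to-node) pairs."""
    candidates = JAPANESE_DICT[word]
    for candidate in candidates:
        for rel, to in relations:
            if rel in SELECTION_RELATIONS and (candidate, rel, to) in SEMANTIC_DICT:
                return candidate          # a matching semantic relation exists
    return candidates[0]                  # otherwise fall back to priority


paper = select_sense("切る", [("obj", "PAPER")])   # no entry: priority wins
cards = select_sense("切る", [("obj", "CARD")])    # entry found for SHUFFLE
```

Filtering on `SELECTION_RELATIONS` mirrors the restriction stated above, so that an entry keyed on <place>, for instance, would not by itself decide the translation.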

[0035] Next, the learning of semantic relation data for dependency relations according to the present invention is described. FIG. 11 shows the syntax analysis results used to obtain semantic relation knowledge about a dependency relation from the pair consisting of the original text "製造の費用" and its correct translation "Cost for manufacture". Because syntax analysis expresses the original in the same concept-symbol intermediate language as the translation, the original is also written as COST and MANUFACTURE. In this analysis, the relational element expressing the dependency between "COST" and "MANUFACTURE" in the original "製造の費用" is <MOD>, indicating modification, whereas the relational element in the correct translation is <PURPOSE>, indicating purpose.

[0036] In this case, the semantic relation data for the dependency relation shown in FIG. 12 is generated. In FIG. 12, "COST" is stored as the from-node, "MANUFACTURE" is stored as the to-node, and the relational element <PURPOSE>, obtained by parsing the correct translation, is stored as the relational element. In other words, when the conceptual structure obtained by parsing the original Japanese is compared with the conceptual structure obtained by parsing the correct English translation, and the relational elements between a from-node and a to-node at the same position differ, replacing the relational element with the one obtained from the English analysis makes a correct translation more likely.

[0037] FIG. 13 is a flowchart showing the processing by which the conceptual structure difference analysis unit 18 and the semantic relation knowledge construction unit 20 of FIG. 2 generate semantic relation data for dependency relations. In FIG. 13, step S1 first takes an arbitrary relational element on the target side (the correct-translation side) as α. Step S2 then takes the from-node of α as F and its to-node as T. Step S3 then searches the source side (the original-sentence side) for the following structure:

[0038]

(Equation 5)

[0039] If step S4 determines that this structure exists on the source side, processing proceeds to step S5, which checks whether the target-side relational element α and the source-side relational element β are equal. If they differ, processing proceeds to step S6 and semantic relation data is created. Once the semantic knowledge data for one dependency relation has been created, step S7 checks whether there is a next relational element. If there is, step S8 takes it as α and processing returns to step S2 to repeat the same steps. When all relational elements have been processed, this is detected in step S7 and the series of processing ends.
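The loop of FIG. 13 is compact enough to sketch directly. As before, this is an illustration under assumptions rather than the patent's implementation: structures are lists of (from, relational element, to) triples, and the function name `learn_dependency_relations` is hypothetical.

```python
def learn_dependency_relations(source_edges, target_edges):
    """Return target-side triples whose relational element differs from the source."""
    # Index the source side by (from-node, to-node) for the search in S3.
    source_rel = {(frm, to): rel for frm, rel, to in source_edges}

    learned = []
    for frm, alpha, to in target_edges:             # S1, S2, S7, S8: each α with F, T
        beta = source_rel.get((frm, to))            # S3, S4: same F and T on source side
        if beta is not None and beta != alpha:      # S5: relational elements differ
            learned.append((frm, alpha, to))        # S6: keep the target-side element
    return learned


source = [("COST", "MOD", "MANUFACTURE")]        # from "製造の費用"
target = [("COST", "PURPOSE", "MANUFACTURE")]    # from "Cost for manufacture"
learned = learn_dependency_relations(source, target)
```

For the FIG. 11 pair the result is the single entry described for FIG. 12, with <PURPOSE> taken from the correct translation.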

[0040] Although the above embodiment takes translation from Japanese to English as its example, the reverse direction is equally possible, and the invention applies as is to translation between any pair of languages.

[0041]

[Effect of the Invention] As described above, according to the present invention, when the user corrects a machine-translated sentence, two-way structural analysis is performed on the pair consisting of the original sentence and the corrected, correct translation. Based on the differences between the conceptual structures, semantic relation data for polysemous-word selection and semantic relation data for dependency relations are automatically generated and registered in the semantic relation dictionary. Learning thus takes place automatically without the user being aware of any learning operation, which greatly reduces the burden on the user.

[Brief description of the drawings]

FIG. 1 is a diagram illustrating the principle of the present invention.

FIG. 2 is a configuration diagram of an embodiment of the present invention.

FIG. 3 is an explanatory diagram of the data format of the concept symbols used as the intermediate language in the present invention.

FIG. 4 is an explanatory diagram showing the syntax analysis results of the original sentence and the corrected translation used to generate semantic relation data for polysemous-word selection in the present invention.

FIG. 5 is a flowchart showing the conceptual structure difference analysis and semantic relation knowledge generation performed for polysemous-word selection in the embodiment of FIG. 2.

FIG. 6 is an explanatory diagram showing the general form of the source-side and target-side conceptual structures processed in FIG. 5.

FIG. 7 is an explanatory diagram of the semantic relation data generated from the difference between the syntax analysis results of FIG. 4.

FIG. 8 is an explanatory diagram showing an example of the contents of the Japanese dictionary used in the machine translation system and of the semantic relation dictionary generated by the present invention.

FIG. 9 is an explanatory diagram showing polysemous-word selection in the translation of "紙を切る" ("cut paper").

FIG. 10 is an explanatory diagram showing polysemous-word selection in the translation of "トランプを切る" ("shuffle cards").

FIG. 11 is an explanatory diagram showing an example of the conceptual structures obtained by syntax analysis of the original text and the correct translation used for automatic learning of dependency relations in the present invention.

FIG. 12 is an explanatory diagram of the semantic relation data for a dependency relation created from the difference between relational elements in the conceptual structures of FIG. 11.

FIG. 13 is a flowchart showing the conceptual structure difference analysis and semantic relation knowledge generation performed for dependency relations in the embodiment of FIG. 2.

[Explanation of symbols]

1: Original sentence structure analysis unit
2: Translated sentence structure analysis unit
3: Structure comparison unit
4: Semantic relation dictionary
10: Source language morphological analysis unit
12: Source language syntax analysis unit
14: Target language morphological analysis unit
16: Target language syntax analysis unit
18: Conceptual structure difference analysis unit
20: Semantic relation knowledge construction unit
22: Semantic relation dictionary registration unit
24: Registration information inquiry unit

Continued from the front page

(58) Fields searched (Int. Cl.⁷, DB name): G06F 17/27, G06F 17/28, INSPEC (DIALOG), JICST file (JOIS)

Claims

(57) [Claims]

1. An automatic learning device for machine translation, for use in a machine translation system that machine-translates an original sentence and outputs a translated sentence, comprising: an original sentence structure analysis unit 1 that, when a correction operation is performed on a machine-translated sentence, analyzes the structure of the original sentence subject to correction and generates a conceptual structure; a translated sentence structure analysis unit 2 that analyzes the structure of the corrected, correct translation of the original sentence and generates a conceptual structure; and a structure comparison unit 3 that compares the analysis results of the original sentence structure analysis unit 1 and the translated sentence structure analysis unit 2, generates semantic relation data based on the difference between the conceptual structures of the original sentence and the translated sentence, and registers the data in a semantic relation dictionary 4 used for machine translation.

2. The automatic learning device for machine translation according to claim 1, wherein the structure comparison unit 3 generates, based on the difference between the conceptual structures of the original sentence and the corrected translation, semantic relation data used for selecting polysemous words during machine translation.

3. The automatic learning device for machine translation according to claim 1, wherein the structure comparison unit 3 generates semantic relation data for a dependency relation based on the difference between a relational element of the original sentence and a relational element of the corrected translation.

4. The automatic learning device for machine translation according to claim 3, wherein the structure comparison unit 3 generates semantic relation data for a dependency relation based on the difference between the relational element of the original sentence analyzed by the original sentence structure analysis unit 1 and the relational element of the corrected translation analyzed by the translated sentence structure analysis unit 2.