JPH034364A

JPH034364A - Machine translating system

Info

Publication number: JPH034364A
Application number: JP1138867A
Authority: JP
Inventors: Masaie Amano; 天野　真家; Etsuo Ito; 悦雄伊藤; Kimito Takeda; 武田　公人; Koichi Hasebe; 浩一長谷部
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1989-05-31
Filing date: 1989-05-31
Publication date: 1991-01-10

Abstract

PURPOSE: To make translation word selection easier by using co-occurrence data within a limited memory capacity, by generating and storing the co-occurrence data through automatic learning in the course of work such as editing. CONSTITUTION: When a user selects an appropriate translation word from among plural translation word candidates by using a translation word selection key in an input part 1, the selection is recognized by an edit control part 9. Next, a grammatical relation judging part 11 detects the grammatical relation between the selected translation word and another translation word, and judges whether it coincides with a preset grammatical relation. When coincidence is judged, the control part 9 pairs the information of the selected translation word with that of the other translation word and sends the pair to a storage part 12. In this way, the pair of the selected translation word and the other translation word is stored as co-occurrence data.

Description

[Detailed Description of the Invention] [Object of the Invention] (Industrial Application Field) The present invention relates to a machine translation system, and in particular to a machine translation system having a function that facilitates the selection of translation words.

(Prior Art) Machine translation systems, which perform translation automatically using computer technology, are being actively developed. In conventional machine translation systems, when several translation candidates exist for the same source word, the user is asked to select the appropriate one. When there are many candidates, all is well if the translation the user wants is displayed first; otherwise the user must step through the other candidates one by one with the next candidate key, and selection takes time.

One approach that has been considered focuses on how readily two words combine semantically. Two words that combine readily are paired as so-called co-occurrence data, and a co-occurrence table accumulating many such pairs is prepared. When several translation candidates arise, the candidate found in the co-occurrence table is displayed with priority or selected automatically.

If the co-occurrence table contains no matching word pair, selection proceeds as in the conventional method.

With this co-occurrence table method, suppose for example that co-occurrence data pairing 「ホテル」 (hoteru, "hotel") and 「ボーイ」 (bōi, "bellboy") is registered in the table. Then, from among the translation candidates for "boy", such as 「少年」 (shōnen, "young boy") and 「ボーイ」, the candidate 「ボーイ」, which is the most likely to be related to 「ホテル」, can be displayed at the top or selected automatically.

With the conventionally considered co-occurrence table method, the table must be stored in the machine translation system in advance. If the dictionary contains 100,000 words, a simple calculation gives 100,000 words × 100,000 words = 10 billion two-word pairs. Far fewer of these are actually in a co-occurrence relationship, but there are still thought to be several million to several tens of millions of such pairs. Registering all of them in a co-occurrence table in advance as co-occurrence data is close to impossible.
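The arithmetic in the preceding paragraph can be checked with a short sketch. The figure of 10 million co-occurring pairs is only an illustrative value picked from the stated range of several million to several tens of millions; the source gives no exact count.

```python
# Pair-count arithmetic for a dictionary of 100,000 words.
DICTIONARY_WORDS = 100_000  # dictionary size assumed in the text

# Every pairing of one dictionary word with another (simple calculation).
total_pairs = DICTIONARY_WORDS * DICTIONARY_WORDS

# ASSUMPTION: an illustrative count inside the range given in the text.
example_cooccurring_pairs = 10_000_000

# Share of all possible pairs that would actually co-occur.
share = example_cooccurring_pairs / total_pairs

print(f"all two-word pairs: {total_pairs:,}")
print(f"co-occurring share at 10 million pairs: {share:.1%}")
```

Even at a small fraction of all possible pairs, the absolute number of entries remains too large to compile by hand, which is the point the paragraph makes.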

A dictionary in a machine translation system is assumed to be used by an unspecified large number of users, so it must contain a large number of words, on the order of 50,000 to 100,000. For a single user or a single department, however, the number of words actually used is known to be far smaller, only about 10,000 to 20,000.

However, just as the dictionary is aimed at an unspecified large number of users, a co-occurrence table prepared in advance would also have to be aimed at them. This makes collecting the co-occurrence data and building the table difficult, and it also requires an enormous amount of memory, so it is not practical.

(Problem to Be Solved by the Invention) As described above, in the conventional method of using a co-occurrence table for machine translation, the number of co-occurrence data items that can be registered in memory as the table is limited, so in practical terms the method does little to make translation word selection easier.

An object of the present invention is to solve this problem and provide a machine translation system that makes translation word selection easier by using co-occurrence data within a limited memory capacity.

[Structure of the Invention] (Means for Solving the Problem) To achieve the above object, the present invention is characterized in that co-occurrence data is created and stored by automatic learning while the user performs work for machine translation, such as entering and editing the source text.

That is, the machine translation system of the present invention determines whether a selected translation word, chosen from among a plurality of translation word candidates during machine translation, is in a specific grammatical relationship with another translation word in the translated text. When they are determined to be in that relationship, the system stores, as co-occurrence data, a set of the information indicating the selected translation word and the information indicating the other translation word.

More simply, the system may pair information indicating a uniquely determined selected translation word, chosen from among a plurality of candidates during machine translation, with information indicating another uniquely determined translation word, for every grammatical relationship between them, and store the pairs as co-occurrence data. Alternatively, it may pair information representing the selected translation word with information representing at least one of the translation words immediately before and immediately after it in the translated text, regardless of grammatical relationship, and store the pair as co-occurrence data.

(Operation) In the present invention, co-occurrence data is thus created and stored in the course of machine translation processing, so the co-occurrence table accumulates on its own and does not need to be built in advance.

Unlike a conventional table prepared for an unspecified large number of users, the co-occurrence table accumulated in this way reflects what has been learned about the vocabulary usage of one particular user or a few users, so translation word selection becomes easier.

Moreover, the vocabulary used by a particular user is skewed and normally fits within a few tens of thousands of words. The amount of co-occurrence data accumulated in the table can therefore be very small while still making translation word selection much easier.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a machine translation system according to an embodiment of the present invention.

In FIG. 1, the input unit 1 is, for example, a keyboard, used to enter the source text to be translated (for example, Japanese text) and to enter various editing commands. The display unit 2 displays the entered source text, the translated text produced by machine translation (for example, English text), the translation candidate list, and other editing information.

The syntactic analysis unit 3 parses the entered source text, and the translation unit 4 converts the source structure obtained by parsing into the words and structure of the target language, for example English. The translation generation unit 5 uses the output of the translation unit 4 to generate translated text in the target language. The analysis grammar and dictionary 6 is used by the syntactic analysis unit 3, the bilingual dictionary 7 by the translation unit 4, and the generation grammar 8 by the translation generation unit 5.

The editing control unit 9 controls editing as a whole, including machine translation processing. In this embodiment it also creates the co-occurrence data, as described later.

When one word is selected from a plurality of translation word candidates, the grammatical relationship determination unit 11 detects the grammatical relationship (mainly a dependency relationship) between the selected translation word and the other translation words in the translated text, using the analysis result of the sentence analysis unit 4. It then determines whether the detected relationship matches a specific relationship, that is, one of the one or more grammatical relationships set in advance in a grammatical relationship table.

The determination result of the grammatical relationship determination unit 11 is passed to the editing control unit 9, which creates co-occurrence data according to that result.

The co-occurrence data storage unit 12 accumulates the co-occurrence table by storing the co-occurrence data created by the editing control unit 9.

Next, the procedure for creating and storing co-occurrence data in this embodiment is described using the flowchart in FIG. 2. FIG. 2 shows the processing performed after a given source text has been machine translated. As a result of machine translation, the display unit 2 shows a screen such as the one in FIG. 3. In FIG. 3, the entered source text is displayed in the source text display area 31, and the translated text is displayed in the translated text display area 32. The editing area 33 displays editing information such as the translation candidate list.

If the translated text first obtained by machine translation contains a translation word the user did not intend (「少年」 in the example of FIG. 3), operating the "next candidate key" on the input unit 1 displays another translation candidate. Alternatively, operating the "translation candidate batch display key" on the input unit 1, for example, displays a translation candidate list in the editing area 33, as shown in FIG. 3.

In either case, when the user selects an appropriate translation word from among the candidates using the "translation word selection key" on the input unit 1, the editing control unit 9 recognizes the selection (step S1). Next, the grammatical relationship determination unit 11 detects the grammatical relationship between the selected word (the selected translation word) and the other translation words in the translated text (in particular, those that are uniquely determined), and determines whether the detected relationship matches a grammatical relationship set in advance in the grammatical relationship table (steps S2 to S3).

If step S3 finds that the grammatical relationship between the selected translation word and another translation word matches a preset grammatical relationship, the editing control unit 9 pairs the information for the selected translation word with the information for the other translation word and sends the pair to the co-occurrence data storage unit 12. The co-occurrence data storage unit 12 then stores the pair as co-occurrence data (step S4).
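Steps S1 to S4 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Relation` class, the relation names, and the contents of `RELATION_TABLE` are hypothetical stand-ins for the output of the grammatical relationship determination unit 11 and the grammatical relationship table, and a Python `set` stands in for the co-occurrence data storage unit 12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A detected relation between the selected word and another word."""
    name: str        # hypothetical relation name
    other_word: str  # the other translation word in the sentence


# Hypothetical grammatical relationship table (relations worth learning).
RELATION_TABLE = {"modifier", "verb-object"}


def learn_cooccurrence(selected_word, relations, store):
    """Steps S2 to S4: store pairs whose relation is listed in the table."""
    added = []
    for rel in relations:                  # S2: relations already detected
        if rel.name in RELATION_TABLE:     # S3: compare with the table
            pair = (selected_word, rel.other_word)
            store.add(pair)                # S4: store as co-occurrence data
            added.append(pair)
    return added


# S1: the user has chosen "ボーイ" with the translation word selection key.
store = set()
detected = [Relation("modifier", "ホテル"), Relation("verb-object", "呼ぶ")]
print(learn_cooccurrence("ボーイ", detected, store))
```

A relation whose name is not in `RELATION_TABLE` is simply skipped, which corresponds to the "no match" branch of step S3.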

This is explained more concretely using the example in FIG. 3.

Suppose machine translation has produced the sentence 「私は、ホテルの少年を呼ぶ。」 ("I call the hotel's boy."), and the user selects 「ボーイ」 (bōi, "bellboy") as the appropriate translation of the source word "boy", which is displayed as 「少年」 (shōnen, "young boy").

The structure of this translated sentence is shown in FIG. 4. As the figure shows, the grammatical relationships among its translation words are as follows. 「ボーイ」 is modified by the noun phrase 「ホテルの」 ("of the hotel") and is the object of the verb 「呼ぶ」 (yobu, "call"). In other words, the grammatical (dependency) relationships between the selected translation word 「ボーイ」 and the other uniquely determined translation words in the same sentence, 「ホテル」 and 「呼ぶ」, are modified word to modifier and object to verb, respectively. Whether a translation word in the translated text is uniquely determined can be judged by checking the "translation word determination flag" that is temporarily set for each translation word during machine translation.
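The flag check described above might look like the following sketch. The `TranslatedWord` class, the `decided` field, the word segmentation, and the flag values are all invented for illustration; the source only states that such a flag exists and is set temporarily during machine translation.

```python
from dataclasses import dataclass


@dataclass
class TranslatedWord:
    surface: str
    decided: bool  # stands in for the "translation word determination flag"


def uniquely_determined(words, selected):
    """Return the surfaces of flagged words other than the selected one."""
    return [w.surface for w in words if w.decided and w.surface != selected]


# Hypothetical flag values for the example sentence.
sentence = [
    TranslatedWord("私", True),
    TranslatedWord("ホテル", True),
    TranslatedWord("ボーイ", False),  # chosen by the user among candidates
    TranslatedWord("呼ぶ", True),
]
print(uniquely_determined(sentence, "ボーイ"))
```

Only the words returned here would then be examined for a grammatical relationship with the selected translation word.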

The grammatical relationship determination unit 11 detects the grammatical relationships between the selected translation word 「ボーイ」 and the other uniquely determined translation words 「ホテル」 and 「呼ぶ」, and determines whether each is one of the preset specific relationships. Here, both relationships are assumed to be set in advance in the grammatical relationship table. On receiving the determination result from the grammatical relationship determination unit 11, the editing control unit 9 has the co-occurrence data storage unit 12 store the pair of 「ボーイ」 and 「ホテル」 (ボーイ, ホテル) and the pair of 「ボーイ」 and 「呼ぶ」 (ボーイ, 呼ぶ) as co-occurrence data.

When data is stored in the co-occurrence data storage unit 12, the character codes of the two words making up the co-occurrence data may be stored as a pair, but it is preferable to store as a pair the identification numbers, called dictionary IDs, that are assigned to the character codes.

FIG. 5 shows part of the bilingual dictionary 7 of FIG. 1, which stores as a set the English character string of the source word, the Japanese character string of the translation word, and the dictionary ID. Because several Japanese translation candidates correspond to the same English character string, each candidate is also given a translation number (translation No.).

To store the pair (ボーイ, ホテル) as co-occurrence data, it is sufficient to store, as shown in FIG. 6, the dictionary ID and translation No. indicating 「ボーイ」 together with the dictionary ID indicating 「ホテル」. The dictionary ID and translation No. take far fewer bits than the character codes, so storing co-occurrence data with them requires less capacity in the co-occurrence data storage unit 12 than storing it with character codes.
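The size difference can be illustrated with a small sketch. Everything numeric here is an assumption: the ID values are hypothetical (the actual values appear in FIGS. 5 and 6, which are not reproduced), the field widths of 32 bits for a dictionary ID and 8 bits for a translation No. are chosen for convenience, and Shift_JIS is assumed as the character code.

```python
import struct

# HYPOTHETICAL identifiers, not taken from the source.
BOY_DICT_ID, BOY_TRANSLATION_NO = 1201, 2
HOTEL_DICT_ID = 3456

# ASSUMPTION: 32-bit dictionary IDs and an 8-bit translation No.
# "<" means little-endian with no padding; I is 4 bytes, B is 1 byte.
RECORD = struct.Struct("<IBI")

# Compact form: (dictionary ID, translation No.) for the selected word,
# plus the dictionary ID of the other word.
packed = RECORD.pack(BOY_DICT_ID, BOY_TRANSLATION_NO, HOTEL_DICT_ID)

# Character-code form, ASSUMING Shift_JIS (2 bytes per katakana character).
as_text = "ボーイ".encode("shift_jis") + "ホテル".encode("shift_jis")

print(len(packed), "bytes as IDs;", len(as_text), "bytes as character codes")
```

The ID form has a fixed size regardless of word length, so the saving grows with longer words, and narrower ID fields would shrink the record further.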

A binary relation name, which is information indicating the grammatical relationship between the two words, may also be added to the pair of dictionary IDs and translation Nos. that make up the co-occurrence data, and the result stored as co-occurrence data.
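A record carrying the binary relation name could be laid out as in the sketch below. The field names, ID values, and relation labels are hypothetical; the source specifies only that a relation name may be attached to the pair.

```python
from typing import NamedTuple


class CooccurrenceRecord(NamedTuple):
    selected_dict_id: int         # dictionary ID of the selected word
    selected_translation_no: int  # translation No. of the selected word
    other_dict_id: int            # dictionary ID of the other word
    relation_name: str            # binary relation name


# HYPOTHETICAL IDs and relation labels.
records = [
    CooccurrenceRecord(1201, 2, 3456, "modifier"),
    CooccurrenceRecord(1201, 2, 7890, "verb-object"),
]


def find(records, other_dict_id, relation_name):
    """Return records matching both the other word and the relation."""
    return [
        r for r in records
        if r.other_dict_id == other_dict_id and r.relation_name == relation_name
    ]


print(find(records, 3456, "modifier"))
```

Keeping the relation name lets a lookup require that the same grammatical relationship holds again, rather than matching on the word pair alone.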

Next, suppose that during machine translation a source word yielding a plurality of translation candidates is input, and that the combination of one of those candidates with another uniquely determined translation word in the translated text is stored in the co-occurrence data storage unit 12 as co-occurrence data.

In this case, that candidate is treated as the most likely one and appears first in the displayed translated text. If a translation candidate list like the one in FIG. 3 is displayed, the candidate stored as co-occurrence data is shown at the top. The user can therefore easily select the appropriate translation word from among the candidates.

Alternatively, a translation candidate stored as co-occurrence data in this way may be selected automatically rather than merely presented as a candidate.
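Both behaviors, moving a matching candidate to the top and selecting it automatically, can be sketched as below. The function names are hypothetical, and the rule of auto-selecting only when exactly one candidate matches is an assumption added here; the source does not say how ties are handled.

```python
def has_pair(candidate, context_words, cooccurrence):
    """True if the candidate is stored together with any context word."""
    return any((candidate, w) in cooccurrence for w in context_words)


def rank_candidates(candidates, context_words, cooccurrence):
    """Stable sort that moves candidates with a stored pair to the front."""
    return sorted(
        candidates,
        key=lambda c: not has_pair(c, context_words, cooccurrence),
    )


def auto_select(candidates, context_words, cooccurrence):
    """ASSUMPTION: select automatically only if exactly one candidate matches."""
    matches = [c for c in candidates if has_pair(c, context_words, cooccurrence)]
    return matches[0] if len(matches) == 1 else None


cooccurrence = {("ボーイ", "ホテル"), ("ボーイ", "呼ぶ")}
candidates = ["少年", "ボーイ"]      # dictionary order for "boy"
context = ["ホテル", "呼ぶ"]         # uniquely determined words in the sentence

print(rank_candidates(candidates, context, cooccurrence))
print(auto_select(candidates, context, cooccurrence))
```

Because the sort is stable, candidates without a stored pair keep their original dictionary order behind the promoted one.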

The present invention is not limited to the above embodiment and can be implemented with various modifications. For example, the embodiment pairs two words as co-occurrence data, but three or more words may be grouped and stored as co-occurrence data. In the example above, the set of 「ボーイ」, 「ホテル」, and 「呼ぶ」 (ボーイ, ホテル, 呼ぶ) could be stored as co-occurrence data.

In the above embodiment, to increase the reliability of the learned co-occurrence data, the system detects the grammatical relationship between the selected translation word, chosen from a plurality of candidates, and the other uniquely determined translation words in the translated text first obtained by machine translation, and uses as co-occurrence data only pairs of the selected translation word and another translation word that are in a specific grammatical relationship. It is not necessary, however, to restrict co-occurrence data to specific grammatical relationships: pairs of the uniquely determined selected translation word and another translation word in any grammatical relationship may be used. Alternatively, without judging grammatical relationships at all, the selected translation word may be paired mechanically with the uniquely determined translation word immediately before it or immediately after it, or with those both immediately before and after it, and the pairs used as co-occurrence data.
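The purely positional variant, which pairs the selected word with its neighbors without any grammatical analysis, might be sketched as follows. The word segmentation (particles omitted), the flag values, and the function signature are assumptions made for illustration.

```python
def adjacent_pairs(words, decided, selected_index,
                   use_before=True, use_after=True):
    """Pair the selected word with its uniquely determined neighbors.

    `decided[i]` stands in for the determination flag of `words[i]`.
    """
    selected = words[selected_index]
    pairs = []
    before, after = selected_index - 1, selected_index + 1
    if use_before and before >= 0 and decided[before]:
        pairs.append((selected, words[before]))
    if use_after and after < len(words) and decided[after]:
        pairs.append((selected, words[after]))
    return pairs


# HYPOTHETICAL segmentation and flags for the example sentence.
words = ["私", "ホテル", "ボーイ", "呼ぶ"]
decided = [True, True, False, True]

print(adjacent_pairs(words, decided, 2))
```

This trades the reliability gained from the grammatical relationship table for simplicity, as the paragraph above notes.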

The present invention can also be implemented with various other modifications without departing from its gist.

[Effects of the Invention] According to the present invention, translation words in a co-occurrence relationship are learned in the course of machine translation and stored as co-occurrence data, so that a co-occurrence table accumulates. Translation word selection can thus be made easier without preparing a large amount of co-occurrence data in advance as a co-occurrence table in a large-capacity memory.

In addition, the co-occurrence table accumulated by the present invention strongly reflects what has been learned about the vocabulary usage of the users who actually operate the machine translation system, so it is highly effective even when the number of stored co-occurrence data items is small.

Furthermore, the machine translation system of the present invention accumulates co-occurrence data as translation words are selected, so its performance improves the more it is used.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a machine translation system according to an embodiment of the present invention. FIG. 2 is a flowchart explaining the procedure for creating and storing co-occurrence data in the embodiment. FIG. 3 shows an example of the screen display during machine translation in the embodiment. FIG. 4 shows the structure of a translated sentence obtained in the embodiment. FIG. 5 shows part of the dictionary on which co-occurrence data creation is based in the embodiment. FIG. 6 shows a concrete example of co-occurrence data in the embodiment.

1: input unit
2: display unit
3: syntactic analysis unit
4: translation unit
5: translation generation unit
6: analysis grammar and dictionary
7: bilingual dictionary
8: generation grammar
9: editing control unit
11: grammatical relationship determination unit
12: co-occurrence data storage unit

Claims

[Claims]

(1) A machine translation system comprising: determining means for determining whether a selected translation word, selected from a plurality of translation word candidates during machine translation, is in a specific grammatical relationship with another translation word in the translated text; and storage means for storing, as a set, information indicating the selected translation word and information indicating the other, uniquely determined translation word that have been determined to be in the specific grammatical relationship.

(2) A machine translation system comprising storage means for storing, as a set, information indicating a uniquely determined selected translation word selected from a plurality of translation word candidates during machine translation and information indicating another uniquely determined translation word, for all such words that are in any grammatical relationship.

(3) A machine translation system comprising storage means for storing, as a set, information indicating a selected translation word selected from a plurality of translation word candidates during machine translation and information indicating at least one of the translation words immediately before and immediately after the selected translation word in the translated text.