JPH05101095A

JPH05101095A - Method and device for detecting nonuniformity of translated word

Info

Publication number: JPH05101095A
Application number: JP3285742A
Authority: JP
Inventors: Hiroko Kida; 裕子木田; Hiroyuki Kaji; 博行梶; Yasutsugu Morimoto; 康嗣森本
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1991-10-04
Filing date: 1991-10-04
Publication date: 1993-04-23

Abstract

PURPOSE: To make unifying translated words and correcting spelling errors more efficient by dividing a source text into words, dividing the post-edited translation into words, looking up the source-text words in a dictionary, and extracting and displaying the words for which no correspondence can be recognized. CONSTITUTION: A source text in which each sentence carries a sentence number is divided into words, which are stored in a first word division table 41. Next, an analysis processor 1 divides the translated text, whose sentence numbers correspond one-to-one to those of the source text, into words by dictionary lookup while handling word endings and the like, and a processing program in a main memory 2 stores the divided words in a second word division table 42. The words in the first word division table 41 are then extracted, and a correspondence determination table 43 relating them to the second word division table 42 is prepared by using a bilingual dictionary 3. Source words whose translated words are not uniform are then extracted from the table 43, and a warning information table 44 is prepared and displayed on a display device 7.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a method and device for detecting translated-word inconsistency, which check for inconsistent translated words arising from machine translation from a first language into a second language and from the manual post-editing that follows.

[0002]

[Prior Art] Machine translation devices that translate from a first language into a second language have generally consisted of a dictionary, grammar rules, and a processing device that applies the grammar rules. However, because ambiguity in the first language, chiefly polysemy, is difficult to resolve, complete analysis of the first language is considered impossible. Moreover, even when analysis of the first language succeeds, the generated second-language text does not necessarily follow the natural flow of language that people use every day. As a result, text translated by a machine translation device needs to be post-edited by a human. When a machine translation device performs the translation, the translation algorithm behaves identically for identical patterns in the original text, so the same translated word is applied to every occurrence of the same word in the original text. Once a human post-edits the text, however, the translated words that the machine translation device had assigned uniformly to each original word may become inconsistent. For this reason, methods have been adopted that, for example, apply the same correction at multiple places throughout the text so that translated words do not become inconsistent through human post-editing. Post-editing associated with machine translation is disclosed in Japanese Patent Application Laid-Open No. 58-151677. That publication discloses storing a text in a first language, its translation in a second language, and the correspondence between the two texts in units of syntactic elements; storing, when the translated text is corrected, the character strings of the corrected syntactic element before and after the correction; searching the entire text for syntactic elements identical to the corrected one; and applying the same correction to them.

[0003]

[Problems to Be Solved by the Invention] Current machine translation devices do not deliver sufficient translation quality, so it is indispensable for an operator to post-edit the translated text. As a result, however, the translated words for the same word in the original text may become inconsistent, and such inconsistency needs to be prevented. The method disclosed as prior art is effective to some degree in preventing it, but an operator may well carry out post-editing with the character-correction function of an ordinary word processor, and in that case the method has no effect. In view of the above, the object of the present invention is to automatically detect, after post-editing has finished, translated words that have become inconsistent in the text, and also to automatically detect spelling errors introduced by post-editing, thereby making the work of unifying translated words and correcting spelling errors highly efficient.

[0004]

[Means for Solving the Problems] The above object is achieved by dividing a first text in a first language into words and storing them in a first word division table; dividing a second text in a second language into words and storing them in a second word division table; associating the words of the first text with the words of the second text by using a bilingual dictionary and storing the associations in a correspondence determination table; extracting from the correspondence determination table those words of the first text whose correspondence with words of the second text is inconsistent and storing them in a warning information table; and using that table to display, on a display screen, warning information about translated-word inconsistency for the inconsistent correspondences. The object is also achieved by dividing into words a first text obtained by machine-translating a text in the first language into the second language and storing the words in the first word division table; dividing into words a second text obtained by further post-editing the machine translation result and storing the words in the second word division table; associating the words of the first text with the words of the second text and storing the associations in the correspondence determination table; extracting from the correspondence determination table those words of the first text that have both occurrences for which a correspondence with a word of the second text is found and occurrences for which none is found, and storing them in the warning information table; and displaying on the display screen that the correspondence is inconsistent for the extracted words.

[0005]

[Operation] In the translated-word inconsistency detection method of the present invention, the original text is first divided into words, which are stored in the first word division table, and the translated and post-edited text (hereinafter, the translated text) is divided into words, which are stored in the second word division table. Each word in the first word division table is then looked up in the dictionary; the second word division table is searched for every translation of that word registered in the dictionary; and each match is extracted and stored in the correspondence determination table together with the word from the first word division table. A word here means the smallest constituent unit of a sentence. A word in the first word division table for which no match is found is treated as having a special correspondence and is stored in the correspondence determination table together with the character string "???". When the word in the translated text that ought to correspond to a word in the original text is misspelled, nothing matching the dictionary translations of that word can be found, so the original-text word is paired with "???". Referring to the correspondence determination table therefore shows which translated word has been assigned to each word of the original text within the translated text. Next, the correspondences between original words and translated words in the correspondence determination table are examined; correspondences in which more than one translated word is assigned to the same original word are extracted and stored in the warning information table; and warning information about the stored correspondences is displayed on the display screen. When a word appears several times in the original text and some of its translations are misspelled, the original word is paired with a mixture of "???" and translations registered in the dictionary, which also falls under the case of multiple translated words assigned to the same original word. Warning information is therefore displayed for correspondences involving such spelling errors as well. The text to be processed need not come from machine translation; the same effect is obtained for text translated by a human. In addition, in the translated-word inconsistency detection method of the present invention, a machine-translated text obtained by machine-translating the original text is divided into words and stored in the first word division table; the translated text obtained by further post-editing the machine translation result is divided into words and stored in the second word division table; the words of the machine-translated text are associated with the words of the translated text and stored in the correspondence determination table; words of the machine-translated text that have both occurrences for which a correspondence with a word of the translated text is found and occurrences for which none is found are extracted from the correspondence determination table and stored in the warning information table; and the display screen shows that the correspondence is inconsistent for the extracted words. Because the words in the first word division table can then be searched for directly in the second word division table without consulting the bilingual dictionary, the processing is simplified.

[0006]

[Embodiment] As an embodiment of the present invention, the following describes checking whether the translated word for a given word in the original text is consistent throughout the translated text, for text that has been machine-translated from Japanese into English and post-edited by hand. FIG. 1 is a block diagram showing an embodiment of the present invention. In FIG. 1, 1 is an analysis processor; 2 is a main memory holding the processing program and tables for internal processing such as analysis; 3 is a dictionary memory; 4 is a translated-word check memory for checking translated-word inconsistency; 5 is a document file containing the text to be checked; 6 is a keyboard used when the operator enters the text to be checked by hand; and 7 is a display device on which the operator confirms the check result. Within the translated-word check memory 4, 41 is a first word division table that stores the result of dividing the original text to be translated into words; 42 is a second word division table that stores the result of dividing the translated and manually post-edited text into words; 43 is a correspondence determination table that associates words of the original text one-to-one with translated words in the translated text; and 44 is a warning information table that stores the extracted correspondences for which the translated words in the translated text are not consistent.

[0007] FIG. 2 is a flowchart showing the processing operation of an embodiment of the present invention. The processing is described below with reference to FIG. 2. The original text, in which each sentence carries a sentence number, is either entered by the operator from the keyboard 6 or, if it is in the document file 5, read from the document file 5 (step 11); it is then divided into words, which are stored in the first word division table 41 (step 12). FIG. 3 is a flowchart showing the processing of step 12, which is described in detail below. The analysis processor 1 divides the input translated text into words while referring to the bilingual dictionary 3 (step 121). Here, the translated text is text that has been translated by a machine translation device and post-edited by hand, and a word is the smallest unit among the constituents of a sentence. One usable word division method is that of Japanese Patent Application Laid-Open No. 62-169262. The processing program in the main memory 2 of FIG. 1 first stores the words divided by the analysis processor 1 in the first word division table 41 shown in FIG. 4 (step 122). When all the words in the text have been stored in the first word division table 41 (step 123), the processing ends. In the first word division table 41 of FIG. 4, column 411 stores the sentence number and column 412 stores all the words of a sentence in order of appearance. The number of rows under 411 and the number of columns under 412 grow or shrink with the size of the original text being processed.
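To make the shape of the first word division table concrete, here is a minimal Python sketch. It assumes a greedy longest-match segmenter over a small hypothetical lexicon; this is a stand-in chosen for brevity, not the word division method of the cited publication. The names `LEXICON`, `segment`, and `build_first_table` and the sample sentences are illustrative assumptions.

```python
# Hypothetical lexicon; the patent does not list the entries used for word division.
LEXICON = {"用紙", "を", "プリント", "する", "キー", "押す"}


def segment(sentence, lexicon=LEXICON):
    """Greedy longest-match segmentation (an assumed stand-in technique)."""
    max_len = max((len(entry) for entry in lexicon), default=1)
    words, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + length]
            if length == 1 or piece in lexicon:
                words.append(piece)
                i += length
                break
    return words


def build_first_table(numbered_sentences, lexicon=LEXICON):
    """Map each sentence number (column 411) to its words in order (column 412)."""
    return {number: segment(text, lexicon) for number, text in numbered_sentences}


# Sample input: (sentence number, sentence) pairs, invented for illustration.
first_table = build_first_table([(1, "用紙をプリントする"), (2, "キーを押す")])
```

A dictionary keyed by sentence number is used here because the later matching step only ever compares rows that share a sentence number.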

[0008] Next, the translated text, in which each sentence carries a sentence number in one-to-one correspondence with the sentence numbers of the original text, is entered by the operator from the keyboard 6 or, if it is in the document file 5, read from the document file 5 (step 13). The analysis processor 1 of FIG. 1 divides it into words by looking up in the dictionary each character string delimited by blanks, while handling word endings such as 's' and 'ed', and the processing program in the main memory 2 of FIG. 1 stores each divided word in the second word division table 42 shown in FIG. 5, by the same processing as in the flowchart of FIG. 3 (step 14). In the second word division table 42 of FIG. 5, column 421 stores the number of each sentence and column 422 stores all the words of a sentence in order of appearance. The number of rows under 421 and the number of columns under 422 grow or shrink with the size of the translated text being processed.
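The blank-delimited division with handling of the endings 's' and 'ed' can be sketched as follows. This is a simplified illustration under stated assumptions: the headword set is hypothetical, punctuation stripping and lowercasing are conveniences not described in the text, and only the two endings named above are handled.

```python
import string

# Hypothetical headword list standing in for the dictionary memory 3.
HEADWORDS = {"insert", "the", "paper", "form", "print", "key", "press"}


def to_headword(token, headwords=HEADWORDS):
    """Reduce one blank-delimited string to a dictionary headword where possible."""
    word = token.strip(string.punctuation).lower()  # assumed normalization
    if word in headwords:
        return word
    for ending in ("ed", "s"):
        stem = word[: -len(ending)]
        if word.endswith(ending) and stem in headwords:
            return stem
    return word  # unknown strings, such as misspellings, are kept as typed


def build_second_table(numbered_sentences, headwords=HEADWORDS):
    """Map each sentence number (column 421) to its words in order (column 422)."""
    table = {}
    for number, text in numbered_sentences:
        words = [to_headword(token, headwords) for token in text.split()]
        table[number] = [word for word in words if word]
    return table


# Sample input invented for illustration; "ky" is a deliberate misspelling.
second_table = build_second_table(
    [(1, "Insert the papers."), (2, "The ky was pressed.")]
)
```

Keeping unknown strings unchanged matters for the later steps: a misspelled word simply fails to match any dictionary translation, which is how it ends up paired with "???".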

[0009] FIG. 6 is a flowchart showing the processing that creates the correspondence determination table 43 of FIG. 8 (step 15). This processing is described in detail with reference to FIG. 6. From the first word division table 41 of FIG. 4, the first word in column 412 of sentence 1 in column 411, "用紙" (paper), is extracted (step 1502), and the bilingual dictionary 3 is searched for that word (step 1503). FIG. 7 shows the structure and contents of the bilingual dictionary 3. In FIG. 7, 31 is a Japanese word, 32 is its part of speech, 33 is its conjugation type when the part of speech is verb or adjective, and 34 is the English translation corresponding to the word. In the example of FIG. 7, two English translations are registered for "用紙", namely "paper" and "form". First, "paper" is selected (step 1506), all the words of sentence 1 (column 421) in the second word division table 42 of FIG. 5 are searched (step 1507), and a word matching the selected translation "paper" is sought (step 1508). If a word matching a translation of the source word exists in the row with the same sentence number in the second word division table 42, that word and the word from the first word division table 41 can be said to correspond. If the row with the same sentence number in the second word division table 42 contains no "paper" and no unprocessed translations of the word remain, the word is stored in the correspondence determination table 43 shown in FIG. 8 together with "???", meaning that it has no corresponding translation (step 1509). The pair "用紙" and "paper", for which a correspondence was found, is likewise stored in the correspondence determination table 43 shown in FIG. 8 (step 1510). In FIG. 8, column 431 stores the number of each extracted correspondence pair, column 432 the original-text word, and column 433 the translated-text word. An original-text word with "???" in column 433 has no corresponding translation. This processing is repeated for all the words in the row of sentence number 1 in the first word division table 41: when the search for one word is finished, processing advances to the next word (step 1505); when one sentence is finished (step 1511), it moves to the next row, that is, the next sentence (step 1512); and when all the words in the first word division table 41 have been processed (step 1501), the processing that creates the correspondence determination table 43 ends.
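The construction of the correspondence determination table can be sketched in Python as below. It is a minimal illustration, not the flowchart of FIG. 6 itself: it records only the first dictionary translation found in the sentence with the same number, which is an assumption. Pairs 1, 2, and 4 follow the examples in the text; the remaining sample data and all function and variable names are hypothetical.

```python
UNMATCHED = "???"  # marker for a source word with no matching translation

# Bilingual dictionary: source word -> registered translations, in order.
dictionary = {"用紙": ["paper", "form"], "プリント": ["print"], "キー": ["key"]}

# Word division tables keyed by sentence number (sample data, partly invented).
first_table = {1: ["用紙", "プリント"], 2: ["キー", "用紙"], 3: ["キー"]}
second_table = {
    1: ["print", "the", "paper"],
    2: ["press", "the", "key", "on", "the", "form"],
    3: ["press", "the", "ky"],  # "ky" is a misspelling of "key"
}


def build_correspondence_table(first_table, second_table, dictionary):
    """Return rows of (pair number, source word, translated word or '???')."""
    rows = []
    for sentence_no, source_words in first_table.items():
        target_words = second_table.get(sentence_no, [])
        for source_word in source_words:
            candidates = dictionary.get(source_word, [])
            # Try each registered translation against the same-numbered sentence.
            match = next((c for c in candidates if c in target_words), UNMATCHED)
            rows.append((len(rows) + 1, source_word, match))
    return rows


rows = build_correspondence_table(first_table, second_table, dictionary)
```

With this sample data the misspelled "ky" in sentence 3 leaves the third occurrence of "キー" unmatched, so it is paired with "???".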

[0010] Next, correspondences in which the translated words for an original word are not consistent are extracted from the correspondence determination table 43 (step 16). FIG. 9 is a flowchart showing the processing that extracts, from the correspondence determination table 43, the correspondences that are to trigger a warning that the translated words for an original word are not consistent. This processing is described in detail with reference to FIG. 9. The processing program in the main memory 2 of FIG. 1 extracts the original-text word "用紙" of the first pair, "用紙-paper", from the correspondence determination table 43 of FIG. 8 (step 1602), and assigns the pair number 1 to the variable COUNT in order to remember which pair is to be extracted next (step 1603). It then stores the extracted original-text word "用紙" of the first pair in the first word variable (step 1604), and extracts the translated-text word "paper" of the same pair and assigns it to the second word variable (step 1605). Advancing through the second, third, and fourth pairs (step 1610), it searches the correspondence determination table 43 for a word that matches the original-text word in the first word variable (step 1607). If no matching word has been found by the time the search of the correspondence determination table 43 is finished (step 1606), the original word occurs only once in the original text and there is no need to check the consistency of its translation, so processing moves to the pair whose number is COUNT plus one, in this case the second pair of the correspondence determination table 43, "プリント-print" (step 1608). If a matching word is found (step 1609), the translated word paired with it is compared with the character string of the translated-text word in the second word variable (step 1611). Here, the fourth pair of the correspondence determination table 43 contains a character string matching the original-text word "用紙" in the first word variable, so the translated-text word "form" of the fourth pair is extracted. If that word matched the translated-text word in the second word variable, the translations would be consistent, and processing would advance to the next pair (step 1610) and repeat steps 1607 and 1609. Here, however, the translated-text word in the second word variable is "paper", so the two do not match. In other words, the original-text word "用紙" has been assigned two different translations, "paper" and "form". The character string "用紙" of the first word variable and the character strings "paper" and "form" are therefore stored in the warning information table 44 of FIG. 10 (step 1612). In the warning information table 44 of FIG. 10, 441 stores the number of the correspondence sets for which warning information is displayed, 442 the original-text word, 443 the number of translated words assigned to that word in the translated text, and 444 the translated words assigned to that word in the translated text. Among the correspondences stored in the warning information table 44, the original-text word "キー" (key) of the second set corresponds to "key" and "???"; this shows that one of the translations of "キー" was misspelled as "ky", so that no correspondence could be established and the word has the special correspondence. When this search has been carried out for every pair in the correspondence determination table 43 and the end of the table is reached (step 1601), the processing ends.
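The extraction of inconsistent correspondences can be sketched as follows. Note that this sketch groups the pairs by original word in a single pass instead of reproducing the COUNT-based pairwise scan of FIG. 9; the intent is the same result under that simplification. Pairs 1, 2, and 4 follow the examples in the text, pairs 3 and 5 are invented to reproduce the "key"/"???" case, and all names are hypothetical.

```python
UNMATCHED = "???"

# Correspondence determination table: (pair number, source word, translated word).
rows = [
    (1, "用紙", "paper"),
    (2, "プリント", "print"),
    (3, "キー", "key"),
    (4, "用紙", "form"),
    (5, "キー", UNMATCHED),
]


def build_warning_table(correspondence_rows):
    """Return (set number, source word, translation count, translations) rows."""
    translations_by_word = {}
    for _, source_word, translation in correspondence_rows:
        seen = translations_by_word.setdefault(source_word, [])
        if translation not in seen:
            seen.append(translation)  # keep first-seen order, drop repeats
    warnings = []
    for source_word, translations in translations_by_word.items():
        if len(translations) > 1:  # more than one distinct translation
            warnings.append(
                (len(warnings) + 1, source_word, len(translations), translations)
            )
    return warnings


warning_table = build_warning_table(rows)
```

Because "???" is treated as just another translation string, a word that is matched in some places and unmatched in others is flagged by the same rule as a word with two genuine translations.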

[0011] The correspondences stored in the warning information table 44 are then shown on the display device 7 as in FIG. 11, either highlighted or as a summary (step 17). The method disclosed in Japanese Patent Application Laid-Open No. 58-101365 can be used for highlighting. Regarding the display of this warning information, where a word such as the Japanese "いう" (iu) requires different renderings in English and the operator has deliberately chosen different translations, warnings about inconsistent translation for such words would be more of a nuisance than a help, so the operator is allowed to designate words that need not be detected.
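The operator-designated exclusion of words can be illustrated with a short Python sketch of a summary display. The output format is an assumption, since the layout of FIG. 11 is not reproduced here, and the function name, parameter names, and sample data (including the translations given for "いう") are hypothetical.

```python
def format_warnings(warning_rows, ignored_words=frozenset()):
    """Build summary lines, skipping words the operator chose not to check."""
    lines = []
    for _, source_word, count, translations in warning_rows:
        if source_word in ignored_words:
            continue  # operator marked this word as not needing detection
        joined = ", ".join(translations)
        lines.append(f"{source_word}: {count} translations ({joined})")
    return lines


# Sample warning rows: (set number, source word, translation count, translations).
warning_rows = [
    (1, "用紙", 2, ["paper", "form"]),
    (2, "いう", 2, ["say", "call"]),
]

summary = format_warnings(warning_rows, ignored_words={"いう"})
```

Filtering at display time, as done here, leaves the warning information table itself unchanged; filtering earlier in the pipeline would be an equally plausible reading of the text.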

[0012] As another embodiment of the present invention, the first text may be text that has only been machine-translated, used in place of the original text, and the second text may be the translated text obtained by post-editing that machine translation result; the same effect as in the embodiment described above can then be obtained even without the original text. Whereas in the embodiment described above the processing that creates the correspondence determination table obtains the translations of a word in the first word division table by consulting the bilingual dictionary and then searches the second word division table for those translations, in this embodiment the word in the first word division table can be searched for directly in the second word division table, so the processing is simplified.
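This dictionary-free variant can be sketched in Python as follows. The sketch assumes both texts are already divided into words and keyed by matching sentence numbers; the function name and the sample data are hypothetical.

```python
def find_partially_changed_words(mt_table, edited_table):
    """Return machine-translated words that survive post-editing in some
    sentences but not in others."""
    outcomes = {}
    for sentence_no, mt_words in mt_table.items():
        edited_words = edited_table.get(sentence_no, [])
        for word in mt_words:
            # Record whether this occurrence is still present after editing.
            outcomes.setdefault(word, set()).add(word in edited_words)
    return sorted(word for word, seen in outcomes.items() if seen == {True, False})


# Sample data: the editor changed "paper" to "form" in sentence 2 only.
mt_table = {1: ["print", "the", "paper"], 2: ["insert", "the", "paper"]}
edited_table = {1: ["print", "the", "paper"], 2: ["insert", "the", "form"]}

flagged = find_partially_changed_words(mt_table, edited_table)
```

A word that the editor replaced everywhere is not flagged, which matches the condition in the text that both matched and unmatched occurrences must exist.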

[0013]

[Effects of the Invention] As described in detail above, the present invention makes it possible to automatically detect inconsistent or misspelled translated words in text that has been translated by a machine translation device and post-edited by hand, so that translated words can be unified and spelling errors corrected efficiently. It therefore becomes possible to split the translated text, have several operators post-edit the parts, merge the resulting texts, and then re-unify the translated words that became inconsistent with respect to the original words because several operators did the post-editing. As a result, for post-editing of text translated by a machine translation device, the invention has the notable effect of shortening the time required for the post-editing work as a whole and reducing the labor involved.

[Brief description of drawings]

FIG. 1 is a block diagram of a device according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the processing operation of an embodiment of the present invention.

FIG. 3 is a flowchart showing the processing that creates a word division table in an embodiment of the present invention.

FIG. 4 is a first word division table showing the result of dividing the original text into words in an embodiment of the present invention.

FIG. 5 is a second word division table showing the result of dividing the translated text into words in an embodiment of the present invention.

FIG. 6 is a flowchart showing the processing that creates the correspondence determination table in an embodiment of the present invention.

FIG. 7 is a diagram showing the structure and contents of the bilingual dictionary used in an embodiment of the present invention.

FIG. 8 is a correspondence determination table that associates words of the original text with words of the translated text in an embodiment of the present invention.

FIG. 9 is a flowchart showing the processing that creates the warning information table in an embodiment of the present invention.

FIG. 10 is a warning information table holding the extracted correspondences that need to be shown as warnings on the screen in an embodiment of the present invention.

FIG. 11 is a diagram showing an example of the display screen produced by the processing in an embodiment of the present invention.

[Explanation of Reference Numerals]
1 Analysis processor
2 Main memory
3 Dictionary memory
4 Translated word check memory
41 First word division table
42 Second word division table
43 Correspondence relationship determination table
44 Warning information table
5 Document file
6 Keyboard
7 Display device

Claims

[Claims]

1. A translated word inconsistency detection method, characterized in that, after a first text written in a first natural language has been translated into a second text written in a second natural language and post-edited, when the minimum syntactic element units of the second text corresponding to a minimum syntactic element unit of the first text are inconsistent, a warning is displayed about the inconsistent correspondence.

2. The translated word inconsistency detection method according to claim 1, comprising: a step of dividing the first text into minimum syntactic element units and storing them in a first table; a step of dividing the second text into minimum syntactic element units and storing them in a second table; a step of associating, by using a bilingual dictionary, the minimum syntactic element units of the first text with the minimum syntactic element units of the second text and storing the associations in a third table; a step of extracting from the third table the minimum syntactic element units of the first text whose correspondence with the minimum syntactic element units of the second text is inconsistent; and a step of displaying on a display screen that the extracted correspondence is inconsistent.

3. A translated word inconsistency detection method, characterized in that, where a text written in a first natural language is machine-translated into a second natural language to give a first text and the first text is post-edited to give a second text, when the minimum syntactic element units of the second text corresponding to a minimum syntactic element unit of the first text do not match, a warning is displayed about the inconsistent correspondence.

4. The translated word inconsistency detection method according to claim 3, comprising: a step of dividing the first text into minimum syntactic element units and storing them in a first table; a step of dividing the second text into minimum syntactic element units and storing them in a second table; a step of associating the minimum syntactic element units of the first text with the minimum syntactic element units of the second text and storing the associations in a third table; a step of extracting from the third table the minimum syntactic element units of the first text whose correspondence with the minimum syntactic element units of the second text does not match; and a step of displaying on a display screen that the extracted correspondence does not match.

5. A translated word inconsistency detection apparatus comprising an analysis processor, a main memory, a dictionary memory, a translated word check memory, a document file, a keyboard, and a display device, characterized by having: means for translating a first text written in a first natural language into a second text written in a second natural language and post-editing it; means for dividing the first text into minimum syntactic element units; means for dividing the second text into minimum syntactic element units; means for associating, by using a bilingual dictionary, the minimum syntactic element units of the first text with the minimum syntactic element units of the second text; means for extracting the minimum syntactic element units of the first text whose correspondence with the minimum syntactic element units of the second text is inconsistent; and means for displaying that the extracted correspondence is inconsistent.

6. A translated word inconsistency detection apparatus comprising an analysis processor, a main memory, a dictionary memory, a translated word check memory, a document file, a keyboard, and a display device, characterized by having: means for machine-translating a text written in a first natural language into a second natural language to give a first text; means for post-editing the first text to give a second text; means for dividing the first text into minimum syntactic element units; means for dividing the second text into minimum syntactic element units; means for associating the minimum syntactic element units of the first text with the minimum syntactic element units of the second text; means for extracting the minimum syntactic element units of the first text whose correspondence with the minimum syntactic element units of the second text does not match; and means for displaying that the extracted correspondence does not match.