JPH0683861A

JPH0683861A - Machine translation device

Info

Publication number: JPH0683861A
Application number: JP3119837A
Authority: JP
Inventors: Jiyunjiee Kuo; クォ・ジュンジェー
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1991-05-24
Filing date: 1991-05-24
Publication date: 1994-03-25

Abstract

PURPOSE: To provide a high-quality machine translation device by means of a processing method, and a proverb/idiom dictionary it can refer to, that simultaneously resolve proverbs and idioms interspersed among other words and the polysemy associated with them. CONSTITUTION: A source language analysis and intermediate structure generation unit 200 analyzes the syntax and meaning of an input sentence while referring to an analysis dictionary 250, and obtains an intermediate structure that depends on the source language. A source language proverb/idiom processing unit 300 reads the source-language intermediate structure from a buffer 900, refers to a proverb/idiom dictionary 350, and merges the nodes of a proverb or idiom contained in the intermediate structure into a single node treated as one morpheme. It then assigns a new meaning code, deletes the nodes related to that proverb or idiom, and stores the result in the buffer 900.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a machine translation device capable of automatically processing proverbs and idioms that are not fixed phrases but are interspersed among other words, together with the polysemy associated with them.

【０００２】[0002]

[Prior Art] New information has appeared at an ever-increasing pace in the twentieth century, which can fairly be called an age of knowledge explosion. Everyone must keep absorbing knowledge to avoid being left behind by the times. That knowledge, however, comes not only from one's own country but also from abroad. Because most people read their native language faster than a foreign one, the importance of translation goes without saying. To improve the quality and efficiency of translation, the time has come to consider mechanical means in place of human labor, that is, machine translation systems. In a machine translation system, the language that is input and translated is called the source language, and the language that is output as the translation is called the target language. As shown in FIG. 8, a machine translation device that adopts the intermediate conversion method generally consists of four parts: (1) a source language analysis unit, (2) an intermediate structure conversion unit, (3) a target language generation unit, and (4) reference dictionaries. The quality of machine translation depends largely on whether the source language analysis unit analyzes the input words and phrases correctly, whether the intermediate structure conversion unit resolves the differences between the source and target languages, and whether the target language generation unit generates the target language correctly according to its generation grammar rules. In particular, when the text to be translated is a fixed phrase such as a proverb or an idiom, its words often do not satisfy grammatically correct dependency relations. As a result, the sentence either cannot be translated at all or, even when the dependencies are formally satisfied, the translation assembled from word-by-word renderings is meaningless, and translation quality drops. For example, if the idiom "How do you do?" is subjected to ordinary English parsing without any idiom processing, the analysis result is as shown in FIG. 14.

【０００３】The Chinese translation then comes out as "

【０００４】[0004]

【外１】 [External character 1]

【０００５】如何作？ (What do you do?)", and the sentence cannot be translated correctly as "

【０００６】[0006]

【外２】 [External character 2]

【０００７】好！ (Hello)". To solve this problem, the prior art has proposed systems that translate fixed phrases such as proverbs and idioms more appropriately, for example the machine translation system of Japanese Patent Laid-Open No. 62-82464. This machine translation system is shown in FIG. 9. In FIG. 9, the input unit 1 inputs the source language. The editing control unit 4 handles editing of the two languages before and after translation. The original sentence storage unit 2 and the translated sentence storage unit 3 store the original sentence and the translated sentence being processed, respectively. The translation unit 5 performs the translation. The translation dictionary 6, as shown in FIG. 12, is divided into seven parts 6a to 6g and is searched as needed during translation. The display control unit 7 controls the screen display of the display unit 8. The printing unit 9 prints the contents of files and screens. The processing flow of the translation unit 5 is shown in FIG. 10. Step s71 refers to the fixed phrase dictionary 6g of the translation dictionary 6 to determine whether the input source-language text is a fixed phrase. If it is, step s72 retrieves the Japanese translation from the fixed phrase dictionary 6g, and step s73 outputs it. If s71 determines that the input is not a fixed phrase, step s74 performs general translation processing, whose detailed flow is shown in FIG. 11. First, morphological analysis s51 analyzes the morphemes with reference to the regular/irregular inflection dictionary 6a. Next, dictionary lookup s52 refers to the word/idiom dictionary 6b to find the morphemes that need processing. Steps s53, s54, and s55 then refer to the analysis grammar 6c and the conversion grammar 6d and repeat syntactic analysis and structure conversion until a suitable target-language intermediate structure is found. Syntax generation s56 then generates the surface structure of the target language with reference to the generation grammar 6e. Finally, morpheme generation s57 refers to the morpheme generation grammar 6f to select appropriate translated words. These steps are what is called general translation processing. Taking the English sentence below as an example, the conventional system refers to the fixed phrase dictionary shown in FIG. 13: s71 of FIG. 10 determines that the sentence is a fixed phrase, s72 matches it against the fixed phrase dictionary and retrieves its translation, and s73 outputs the result. The following correct result is obtained.

【０００８】How do you do? (English original) はじめまして (Japanese translation; "Nice to meet you")
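The conventional flow of FIG. 10 described above can be illustrated with a minimal sketch. It is not taken from the cited publication: the names `FIXED_PHRASES`, `translate_prior_art`, and `placeholder_general` are hypothetical, the dictionary holds only the single example entry, and the general translation path (s51 to s57) is replaced by a placeholder.

```python
from typing import Callable

# Hypothetical one-entry stand-in for the fixed phrase dictionary 6g.
FIXED_PHRASES = {"How do you do?": "はじめまして"}


def translate_prior_art(sentence: str,
                        general_translate: Callable[[str], str]) -> str:
    """s71: test for a fixed phrase; s72/s73: return its stored translation;
    s74: otherwise fall back to general translation."""
    stored = FIXED_PHRASES.get(sentence)  # whole-sentence match only
    if stored is not None:
        return stored
    return general_translate(sentence)


def placeholder_general(sentence: str) -> str:
    # Stand-in for steps s51 to s57; no real analysis is performed here.
    return f"<general translation of: {sentence}>"
```

The lookup succeeds only when the whole input equals a dictionary key, which is the limitation the invention addresses.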

【０００９】[0009]

[Problems to be Solved by the Invention] Because of the freedom of word order in the input text, a proverb or idiom may appear not as a fixed phrase but interspersed among other words. If, for example, the fixed phrase 「身に付ける」 is registered in the dictionary with the Chinese translation 「穿上」, errors such as the following occur. First example: 彼は身にコートを付ける。 (original; "He puts on a coat.") 他在身上穿大衣。 (Chinese translation by the prior art) 他＊穿上＊大衣。 (appropriate Chinese translation). Second example: 彼は技術を身に付ける。 (original; "He acquires a skill.") 他穿上技術。 (Chinese translation by the prior art) 他＊学習＊技術。 (appropriate Chinese translation). In the first example, the freedom of word order prevents the fixed phrase from being matched, so a suitable translation cannot be obtained. The second example is an error caused by the polysemy of idioms and proverbs.
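The two failure modes can be reproduced with a small sketch. It assumes, for illustration only, that fixed-phrase matching is a contiguous substring test; the function name `contains_fixed_phrase` is hypothetical.

```python
# The registered fixed phrase and its single stored translation, as in the text.
FIXED_PHRASE = "身に付ける"
STORED_TRANSLATION = "穿上"


def contains_fixed_phrase(sentence: str) -> bool:
    """Contiguous matching: the phrase must appear unbroken in the sentence."""
    return FIXED_PHRASE in sentence


example_1 = "彼は身にコートを付ける"  # idiom words separated by 「コートを」
example_2 = "彼は技術を身に付ける"    # idiom contiguous, but the sense is "learn"

# Example 1: the match fails, so the stored translation is never used.
match_1 = contains_fixed_phrase(example_1)
# Example 2: the match succeeds, but only one sense (穿上, "put on") is stored.
match_2 = contains_fixed_phrase(example_2)
```

In the first sentence the match is missed; in the second it succeeds, but the single stored translation carries the wrong sense.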

【００１０】In view of the above drawbacks, an object of the present invention is to provide a high-quality machine translation device by means of a processing method that can simultaneously resolve proverbs and idioms interspersed among other words and the polysemy associated with them, together with a proverb/idiom dictionary that the method can refer to.

【００１１】[0011]

[Means for Solving the Problems] The present invention is a machine translation device comprising: a proverb/idiom dictionary that, for each proverb or idiom of the source language, uses the main morpheme of that proverb or idiom as a search key and stores the main morpheme, related information on the main morpheme and the sub-morphemes, all proverbs and idioms of that main morpheme, and their corresponding meaning codes; a proverb/idiom processing unit that compares the intermediate structure obtained by analyzing the input source language with the information stored in the proverb/idiom dictionary and merges a proverb or idiom that meets predetermined conditions into a single node; an intermediate structure discrimination unit that determines, from information such as the number of nodes in the processed intermediate structure, whether the input sentence is a simple proverb or idiom; and a translation selection dictionary that stores, for each proverb and idiom of the source language, the part-of-speech code, meaning code, meaning control code, and corresponding target-language translation.
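One possible in-memory layout for such a dictionary is sketched below. The class and field names are hypothetical, and the only entry shown uses values given in the embodiment for the main morpheme 「回る」; the segmentation of the restriction elements into five auxiliaries is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class IdiomEntry:
    main_morpheme: str             # search key
    sub_elements: FrozenSet[str]   # auxiliary search elements, S(i)
    restrictions: FrozenSet[str]   # main-morpheme restriction elements, L(i)
    idiom: str                     # the proverb or idiom itself
    meaning_code: str              # meaning code assigned to the merged node


# Entries are grouped under their main morpheme so that one lookup returns
# every proverb or idiom that shares it.
IDIOM_DICTIONARY: Dict[str, List[IdiomEntry]] = {
    "回る": [
        IdiomEntry(
            main_morpheme="回る",
            sub_elements=frozenset({"首が", "借金で"}),
            restrictions=frozenset({"ない", "ん", "ぬ", "まい", "ず"}),
            idiom="借金で首が回らぬ",
            meaning_code="M371",
        ),
    ],
}


def entries_for(main_morpheme: str) -> List[IdiomEntry]:
    """Return all idiom entries keyed by the given main morpheme."""
    return IDIOM_DICTIONARY.get(main_morpheme, [])
```

Keying on the main morpheme rather than on the whole phrase is what lets the lookup start from a single node of the intermediate structure.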

【００１２】[0012]

[Operation] In the present invention, a proverb/idiom processing unit is provided within the source language analysis and intermediate structure conversion stages of the machine translation device. It processes the proverbs and idioms in the source-language intermediate structure (the analysis result), which overcomes the shortcomings of fixed-phrase matching. In addition, an intermediate structure discrimination unit, which determines whether further processing is needed, reduces wasted processing. The device not only handles idioms and proverbs interspersed among other words, but also resolves their polysemy by correcting meaning codes and deleting unneeded, overlong nodes during processing. Moreover, because the source-language intermediate structure is revised into a cleaner and more appropriate form, both the quality and the efficiency of translation improve.

【００１３】[0013]

[Embodiment] FIG. 1 is a block diagram showing the configuration of a machine translation device according to one embodiment of the present invention. In FIG. 1, 100 is an input unit through which the source-language text to be processed is entered into the system. 200 is a source language analysis and intermediate structure generation unit, which analyzes the syntax and meaning of the input sentence with reference to the analysis dictionary 250 and obtains an intermediate structure that depends on the source language. 300 is a proverb/idiom processing unit, which reads the source-language intermediate structure from the buffer 900, refers to the proverb/idiom dictionary 350, and merges the nodes of a proverb or idiom contained in the intermediate structure into a single node treated as one morpheme; it then assigns a new meaning code, deletes the nodes related to that proverb or idiom, and stores the result in the buffer 900. The processing flow of the proverb/idiom processing unit 300 is shown in FIGS. 2 and 3. FIG. 5 is an explanatory diagram showing part of the structure of the proverb/idiom dictionary 350, which holds information such as main morphemes, main-morpheme restriction elements, auxiliary search elements, idioms, and related meaning codes. The intermediate structure discrimination unit 400 takes the source-language intermediate structure from the buffer 900, determines by the flow shown in FIG. 4 whether it is a simple proverb or idiom, and stores the result (the IDOM flag value) in the buffer 900. The conceptual structure conversion unit 500 acts on the result from the intermediate structure discrimination unit 400 as follows. (1) If the input is a simple proverb or idiom, no target-language generation is needed, and the translation selection unit 600 can convert the proverb or idiom into the target language directly. (2) If the input is not a simple proverb or idiom, the unit takes the source-language intermediate structure from the buffer 900, refers to the difference adjustment conversion dictionary 550, converts the intermediate structure that depends on source-language meaning and syntax into one that depends on the target language, and stores the result in the buffer 900. The translation selection unit 600 takes the target-language intermediate structure from the buffer 900, refers to the translation selection dictionary 650 to determine the translation of each node, and stores the result in the buffer 900. The target language generation unit 700 has nothing to do when the intermediate structure discrimination unit 400 has found a simple proverb or idiom. Otherwise, it takes the intermediate structure from the buffer 900, generates the target language according to the target-language generation grammar, and stores the translation in the buffer 900. Finally, the translation in the buffer 900 is output by the output unit 800, which is assigned to a printer or similar device.
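The branching on the IDOM flag across units 500, 600, and 700 can be sketched as follows. This is an illustration under stated assumptions, not the device's implementation: the function `run_back_end`, the dictionary keys `"intermediate"`, `"IDOM"`, and `"translation"`, and the stand-in helpers are all hypothetical, and the intermediate structure is flattened to a list of node strings. The two translation pairs are the ones given later in the embodiment's example.

```python
from typing import Any, Callable, Dict, List


def run_back_end(buffer: Dict[str, Any],
                 convert: Callable[[List[str]], List[str]],
                 select: Callable[[str], str],
                 generate: Callable[[List[str]], str]) -> str:
    """Route the intermediate structure according to the IDOM flag."""
    nodes: List[str] = buffer["intermediate"]
    if buffer["IDOM"] == 1:
        # Case (1): a simple proverb or idiom. Conversion (500) and
        # generation (700) are skipped; selection (600) translates directly.
        result = select(nodes[0])
    else:
        # Case (2): conversion (500), then selection (600), then generation (700).
        target_nodes = convert(nodes)
        result = generate([select(node) for node in target_nodes])
    buffer["translation"] = result  # results are passed on through the buffer
    return result


# Toy stand-ins for the dictionaries and grammars.
SELECTION = {"彼": "他", "借金で首が回らぬ": "債臺高築"}


def toy_convert(nodes: List[str]) -> List[str]:
    return list(nodes)  # no structural differences are modeled


def toy_select(node: str) -> str:
    return SELECTION[node]


def toy_generate(words: List[str]) -> str:
    return "".join(words)
```

A call such as `run_back_end({"intermediate": ["彼", "借金で首が回らぬ"], "IDOM": 0}, toy_convert, toy_select, toy_generate)` follows case (2), while a single-node structure with `IDOM` set to 1 goes straight to word selection.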

【００１４】FIGS. 2 and 3 are flowcharts of the processing of the source language proverb/idiom processing unit 300. In FIGS. 2 and 3, step 201 scans the source-language intermediate structure for a node whose PROC flag is not 1 (the PROC flag records whether a node of the intermediate structure has already been processed). If there is no such node, the source language proverb/idiom processing unit 300 finishes and control passes to the intermediate structure discrimination unit 400 of FIG. 1. If there is such a node, step 202 determines whether the node's attribute is that of a modifier. If it is, step 230 sets the node's PROC flag to 1 and processing returns to step 201. If it is not a modifier, step 203 tests whether it is a leaf node (a node with no child nodes). When the node is not a leaf node, as with the node 「回る」 in the example that follows, step 204 uses the node's morpheme to consult the proverb/idiom dictionary 350 of FIG. 1 and determines whether any proverb or idiom exists for it. If none exists, step 230 is performed. If one exists, step 205 retrieves from the proverb/idiom dictionary 350 the sets S(i) of all related proverbs and idioms, and processing moves to step 206, which collects those child nodes of the current node that are leaf nodes into a set Y. Step 207 then takes the intersection of Y with each S(i), giving the sets M(i) of proverbs and idioms that may be present. Step 208 checks the M(i): if every M(i) is the empty set, no proverb or idiom is present, so the proverb/idiom processing unit 300 finishes and control passes to the intermediate structure discrimination unit 400 of FIG. 1. Otherwise, step 209 consults the proverb/idiom dictionary 350 to obtain the main-morpheme restriction elements L(i), and step 210 extracts the auxiliary-verb information A from the attribute values of the main-morpheme node; the intersection of L(i) and A gives the decision sets J(i). Step 211 of FIG. 3 determines whether the resulting J(i) is the empty set. If it is, step 230 is performed. If it is not, the proverb/idiom dictionary 350 is consulted with M(i) as the search key to find the idiom X and the new meaning code Y. Steps 216 and 217 then revise the intermediate structure and delete the overlong nodes, after which processing returns to step 230. FIG. 4 is a flowchart showing the processing of the intermediate structure discrimination unit 400. In FIG. 4, step 301 checks whether the intermediate structure consists of a single node, in order to determine whether it is a simple proverb or idiom. If it is a single node, step 302 sets the IDOM flag to 1; if it is not, step 303 sets the IDOM flag to zero. The operation of the machine translation device of the present invention is described below, taking Japanese-to-Chinese translation as an example.
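Before the worked example, the flow of FIGS. 2 to 4 can be condensed into a rough sketch. Several points are assumptions rather than statements from the text: the `Node` and `Entry` classes and their field names are hypothetical; nodes are visited in simple pre-order; a leaf node is only marked as processed; and a candidate counts as a hit only when M(i) equals the entry's full set S(i), which is one reading of "M(i) as the search key".

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterator, List, Optional, Set


@dataclass
class Node:
    surface: str                                          # e.g. "首が"
    morpheme: str                                         # e.g. "首"
    auxiliaries: Set[str] = field(default_factory=set)    # auxiliary-verb info A
    children: List["Node"] = field(default_factory=list)
    is_modifier: bool = False
    meaning_code: Optional[str] = None
    proc: bool = False                                    # PROC flag


@dataclass(frozen=True)
class Entry:
    sub_elements: FrozenSet[str]   # S(i)
    restrictions: FrozenSet[str]   # L(i)
    idiom: str                     # X
    meaning_code: str              # new meaning code


def walk(node: Node) -> Iterator[Node]:
    yield node
    for child in node.children:
        yield from walk(child)


def process_idioms(root: Node, dictionary: Dict[str, List[Entry]]) -> None:
    while True:
        node = next((n for n in walk(root) if not n.proc), None)  # step 201
        if node is None:
            return
        if node.is_modifier or not node.children:                 # steps 202, 203
            node.proc = True                                      # step 230
            continue
        entries = dictionary.get(node.morpheme, [])               # steps 204, 205
        leaves = {c.surface for c in node.children if not c.children}  # step 206
        hits = [e for e in entries if e.sub_elements & leaves]    # step 207: M(i)
        if entries and not hits:                                  # step 208
            return  # every M(i) is empty: the unit ends, as the text states
        for e in hits:
            # Steps 209 to 211: J(i) = L(i) ∩ A must be non-empty.
            if e.sub_elements <= leaves and e.restrictions & node.auxiliaries:
                node.surface = node.morpheme = e.idiom            # step 216
                node.meaning_code = e.meaning_code
                node.auxiliaries -= e.restrictions                # step 217
                node.children = [c for c in node.children
                                 if c.surface not in e.sub_elements]
                break
        node.proc = True                                          # step 230


def idom_flag(root: Node) -> int:
    """Steps 301 to 303: 1 if the structure is a single node, else 0."""
    return 1 if sum(1 for _ in walk(root)) == 1 else 0
```

Merging the matched leaves into the parent keeps the rest of the tree intact, so later stages see the idiom as one ordinary node.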

【００１５】The source-language sentence 「彼は借金で首が回らぬ」 ("He is up to his neck in debt"; literally, "his neck will not turn because of debt") is entered through the input unit 100 and processed by the source language analysis and intermediate structure generation unit 200, yielding the source-language intermediate structure shown in FIG. 6. Next, the source language proverb/idiom processing unit 300 processes it according to the flow of FIGS. 2 and 3. Step 201 of FIG. 2 scans the intermediate structure of FIG. 6 from top to bottom and from right to left for a node whose PROC flag value is not 1, and finds the node 「回る」. In step 202 this node is found not to be a modifier (such as an embedded clause), so processing moves to step 203. The intermediate structure of FIG. 6 makes it clear that the 「回る」 node is not a leaf node, so processing moves to step 204. The proverb/idiom dictionary 350 of FIG. 1 stores proverb/idiom information under the morpheme 「回る」 as a search key, so the node 「回る」 is judged to have related proverbs or idioms, and processing moves to step 205. In the proverb/idiom dictionary shown in FIG. 5, the auxiliary search element sets S of all related proverbs and idioms keyed by the morpheme 「回る」 are as follows.

【００１６】 S(1)=（（首が）（借金で））
S(2)=（（手が））
S(3)=（（目が））
S(4)=（（頭が））
S(5)=（（舌が））
S(6)=（（気が））
Next, step 206 of FIG. 2 is entered, and the leaf node set Y is obtained from the intermediate structure of FIG. 6 as follows.

【００１７】Y=（（首が）（借金で）（彼は））
Processing then moves to step 207 of FIG. 2, where M(i)=S(i)∩Y for i=1 to 6. The result is as follows.

【００１８】 M(1)=（（首が）（借金で））
M(2)=∅
：
M(6)=∅
Processing then moves to step 208. Because not every M(i) is the empty set, processing moves to step 209. According to the proverb/idiom dictionary, the main-morpheme restriction elements are as follows.

【００１９】L(1)=（ないんぬまいず）
Next, step 210 of FIG. 3 is entered. The auxiliary-verb attribute set of the morpheme 「回る」 is A=（ん）, and computing J(i)=L(i)∩A shows that only J(1) is not the empty set. Next comes the decision of step 211 in FIG. 3. J(1) is not the empty set, so step 215 is entered. From the proverb/idiom dictionary and the auxiliary search key M(1), the result X (idiom) = 借金で首が回らぬ, Y (new meaning code) = M371 is obtained. In step 216, X and Y replace the morpheme and meaning-code attribute values of the 「回る」 node in the intermediate structure of FIG. 6. In step 217, the nodes and attribute values contained in M(1) and L(1), having already been handled in step 216, are deleted as unneeded and overlong. In the example of FIG. 6, as the illustration on its right side shows, the two nodes 「借金」 and 「首」 and the auxiliary-verb attribute 「ん」 of the morpheme 「回る」 are deleted. Next, step 230 of FIG. 3 sets the PROC attribute of the 「回る」 node to 1, and processing returns to step 201, which finds the node 「彼」, whose PROC is not 1, in the intermediate structure. Step 202 is entered; node 「彼」 is not a modifier (it is one of the leaf nodes in Y), so step 230 sets its PROC to 1 and processing returns to step 201. The intermediate structure now contains no node whose PROC is not 1, so the processing of the source language proverb/idiom processing unit 300 of FIG. 1 ends. Processing continues with the intermediate structure discrimination unit 400, following the flow of FIG. 4. In FIG. 4, the decision of step 301 comes first. As the illustration on the right side of FIG. 6 shows, two nodes remain in the intermediate structure, so the input sentence is judged not to be a fixed phrase; step 303 sets the IDOM flag to zero, and the intermediate structure discrimination unit 400 finishes. Processing then enters the conceptual structure conversion unit 500 of FIG. 1. Because the IDOM flag for the structure shown in FIG. 6 is zero, the difference adjustment conversion dictionary 550 of FIG. 1 is consulted to reconcile differences in meaning and syntax between Chinese and Japanese. The translation selection unit 600 of FIG. 1 then refers to the translation selection dictionary 650, shown in FIG. 7, to decide the translated morpheme of each node of the intermediate structure. For example, 「彼」 is translated as 「他」 and 「借金で首が回らぬ」 as 「債臺高築」. The target language generation unit 700 then generates the Chinese surface structure, giving the translation 「他債臺高築」 for this example. Finally, the output unit 800 of FIG. 1 outputs the translation.
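The set arithmetic of this worked example can be checked with a few lines of Python. The variable names mirror the symbols in the text; splitting L(1) into the five auxiliaries ない, ん, ぬ, まい, ず is an assumption about how the unsegmented string （ないんぬまいず） should be read.

```python
# Auxiliary search element sets S(i) for the main morpheme 「回る」.
S = {
    1: {"首が", "借金で"},
    2: {"手が"},
    3: {"目が"},
    4: {"頭が"},
    5: {"舌が"},
    6: {"気が"},
}

# Leaf node set Y taken from the intermediate structure.
Y = {"首が", "借金で", "彼は"}

# Step 207: M(i) = S(i) ∩ Y.
M = {i: s & Y for i, s in S.items()}
non_empty_M = [i for i, m in M.items() if m]

# Steps 209 to 211: J(1) = L(1) ∩ A.
L1 = {"ない", "ん", "ぬ", "まい", "ず"}  # assumed segmentation of L(1)
A = {"ん"}
J1 = L1 & A
```

Only M(1) is non-empty, and J(1) contains 「ん」, so the idiom entry for S(1) is the one applied.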

【００２０】[0020]

[Effects of the Invention] The present invention provides the following effects. (1) Proverbs and idioms that are not fixed phrases can also be processed accurately, so appropriate translations are obtained and translation quality improves. (2) Deleting unneeded, overlong structures and correcting unsuitable meaning codes during processing simplifies the intermediate structure and speeds up machine translation. (3) As the quality of machine translation improves, the need for post-editing drops markedly, so manual correction decreases and automatic machine translation can be achieved.

【００２１】As described above, the present invention solves the problem that conventionally only proverbs and idioms in fixed-phrase form could be processed. It also runs efficiently and is of far greater practical use.

【００２２】[0022]

[Brief description of drawings]

【００２３】[0023]

【図１】FIG. 1 is a block diagram showing the configuration of a machine translation device according to one embodiment of the present invention.

【００２４】[0024]

【図２】FIG. 2 is a flowchart showing the processing of the source language proverb/idiom processing unit in the embodiment.

【００２５】[0025]

【図３】FIG. 3 is a flowchart showing the continuation of the processing of FIG. 2 for the source language proverb/idiom processing unit in the embodiment.

【００２６】[0026]

【図４】FIG. 4 is a flowchart showing the processing of the intermediate structure discrimination unit in the embodiment.

【００２７】[0027]

【図５】FIG. 5 is a diagram showing part of the structure of the proverb/idiom dictionary in the embodiment.

【００２８】[0028]

【図６】FIG. 6 is a diagram showing the source-language intermediate structure of the example sentence used in the embodiment.

【００２９】[0029]

【図７】FIG. 7 is a diagram showing part of the structure of the translation selection dictionary in the embodiment.

【００３０】[0030]

【図８】FIG. 8 is a flowchart showing the translation process of a machine translation device using the general intermediate structure method.

【００３１】[0031]

【図９】FIG. 9 is a block diagram showing a configuration example of a conventional machine translation device.

【００３２】[0032]

【図１０】FIG. 10 is a flowchart showing the processing of the conventional machine translation device.

【００３３】[0033]

【図１１】FIG. 11 is a flowchart showing the translation processing of the conventional example.

【００３４】[0034]

【図１２】FIG. 12 is a diagram showing the configuration of the reference dictionary of the conventional example.

【００３５】[0035]

【図１３】FIG. 13 is a diagram showing part of the configuration of the fixed phrase dictionary of the conventional example.

【００３６】[0036]

【図１４】FIG. 14 is a diagram showing an example of conventional node analysis applied to an idiom.

【００３７】[0037]

[Explanation of symbols]

100 input unit
200 source language analysis and intermediate structure generation unit
250 analysis dictionary
300 source language proverb/idiom processing unit
350 proverb/idiom dictionary
400 intermediate structure discrimination unit
500 conceptual structure conversion unit
550 difference adjustment conversion unit
600 translation selection unit
650 translation selection dictionary
700 target language generation unit
800 output unit
900 buffer

Claims

[Claims]

1. A machine translation device comprising: a proverb/idiom dictionary that, for each proverb or idiom of a source language, uses the main morpheme of that proverb or idiom as a search key and stores the main morpheme, related information on the main morpheme and sub-morphemes, all proverbs and idioms of that main morpheme, and their corresponding meaning codes; a proverb/idiom processing unit that compares an intermediate structure obtained by analyzing the input source language with the information stored in the proverb/idiom dictionary and merges a proverb or idiom that meets predetermined conditions into a single node; an intermediate structure discrimination unit that determines, from information such as the number of nodes in the processed intermediate structure, whether the input sentence is a simple proverb or idiom; and a translation selection dictionary that stores, for each proverb and idiom of the source language, a part-of-speech code, a meaning code, a meaning control code, and a corresponding target-language translation.