JP3447955B2

JP3447955B2 - Machine translation system and machine translation method

Info

Publication number: JP3447955B2
Application number: JP13332298A
Authority: JP
Inventors: 孝之酒匂; 聡木下
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-05-15
Filing date: 1998-05-15
Publication date: 2003-09-16
Anticipated expiration: 2018-05-15
Also published as: JPH11328175A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a machine translation system and a machine translation method for performing translation automatically.

[0002]

[Prior Art] In recent years, as computers have become smaller and the Internet has spread, anyone can easily publish and gather information through homepages and the like. For example, about 80% of homepages are in foreign languages, chiefly English, so ordinary users encounter foreign languages more and more often. The number of cases that require translation is also rising rapidly in many other settings.

[0003] Against this background, demand for translation systems is growing, and many translation systems have been developed, sold, and distributed to meet it. However, many conventional translation systems were developed for technical documents, and it has been difficult for them to translate accurately or appropriately non-grammatical sentences in which verbs, articles, and the like are omitted, such as the title sentences that appear frequently in news.

[0004]

[Problem to Be Solved by the Invention] Conventionally, when a sentence to be translated has a grammatically special form, it has been difficult to translate that sentence accurately or appropriately. The present invention has been made in view of these circumstances, and its object is to provide a machine translation system and a machine translation method that can translate even sentences with a grammatically special form more accurately or appropriately.

[0005]

[Means for Solving the Problem] A machine translation system according to the present invention comprises: input means for inputting a source text to be translated; translation means for translating the source text to generate a translation; first extraction means for extracting an incomplete sentence from the source text; second extraction means for extracting, from the source text, a similar sentence that resembles the incomplete sentence; and complementing means for generating a translation of the incomplete sentence using the similar sentence. The complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence.

Another machine translation system according to the present invention comprises: input means for inputting a source text to be translated; translation means for translating the source text to generate a translation; first extraction means for extracting an incomplete sentence from the source text; second extraction means for extracting a similar sentence that resembles the incomplete sentence from another predetermined document written in the same language as the source text; and complementing means for generating a translation of the incomplete sentence using the similar sentence. The complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence.
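The node selection described in [0005] can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` class, the word-matching rule (a node matches when all of its words occur in the incomplete sentence), and the hand-built tree for the example sentence are all assumptions made for the sketch. The step that generates a translated sentence from the selected nodes is not shown.

```python
class Node:
    """One node of a hypothetical parse tree; `phrase` is the word or phrase it carries."""

    def __init__(self, phrase, parent=None):
        self.phrase = phrase
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def nodes_to_root(root, incomplete_words):
    """Collect every node on a path from a matching node up to the root node."""
    wanted = {w.lower() for w in incomplete_words}
    kept = []
    for node in walk(root):
        # Assumed matching rule: every word of the node's phrase occurs in the incomplete sentence.
        if all(w in wanted for w in node.phrase.lower().split()):
            cur = node
            while cur is not None and cur not in kept:
                kept.append(cur)
                cur = cur.parent
    return kept


# Hand-built, hypothetical tree for the similar sentence
# "U.S. Secretary of State Tom Anderson set out on his first Middle East trip Tuesday."
root = Node("set out")
anderson = Node("Anderson", root)
Node("U.S. Secretary of State", anderson)
Node("Tom", anderson)
trip = Node("trip", root)
Node("his", trip)
Node("first", trip)
Node("Middle East", trip)
Node("Tuesday", root)

title = "ANDERSON HEADS FOR MIDDLE EAST"
kept = nodes_to_root(root, title.split())
print([n.phrase for n in kept])
```

Modifiers that do not lie on a path to the root, such as "Tom" or "Tuesday" in this tree, are left out of the selection, which is what keeps the generated sentence close to the content of the incomplete sentence.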

[0006] Another machine translation system according to the present invention comprises: input means for inputting a source text to be translated; translation means for translating the source text to generate a translation; first extraction means for extracting an incomplete sentence from the source text; second extraction means for extracting, from the source text, a similar sentence that resembles the incomplete sentence; and complementing means for generating a translation of the incomplete sentence using the similar sentence. The complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence.

Another machine translation system according to the present invention comprises: input means for inputting a source text to be translated; translation means for translating the source text to generate a translation; first extraction means for extracting an incomplete sentence from the source text; second extraction means for extracting a similar sentence that resembles the incomplete sentence from another predetermined document written in the same language as the source text; and complementing means for generating a translation of the incomplete sentence using the similar sentence. The complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence.

[0007] Preferably, the first extraction means may extract, as an incomplete sentence, a sentence that satisfies a predetermined condition. Preferably, the incomplete sentence may be a title sentence.

[0008] Preferably, the second extraction means may extract, as the similar sentence, the sentence in the source text that contains the largest number of phrases corresponding to the phrases, other than function words, contained in the incomplete sentence.
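The selection rule in [0008] can be illustrated with a short sketch. It assumes that "corresponding" means an exact, case-insensitive word match and that function words are given by a small fixed list; both are simplifications for illustration, and the `FUNCTION_WORDS` set and the second candidate sentence are made up for the example.

```python
import re

# Illustrative function-word list; the patent does not specify one.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "for", "to", "on", "in", "at", "by",
    "and", "or", "he", "she", "it", "his", "her",
}


def content_words(sentence):
    """Lower-cased words of the sentence with function words removed."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return {t for t in tokens if t not in FUNCTION_WORDS}


def most_similar(incomplete, candidates):
    """Return the candidate sharing the most content words with the incomplete sentence.

    Returns None when no candidate shares any content word.
    """
    target = content_words(incomplete)
    best, best_score = None, 0
    for sentence in candidates:
        score = len(target & content_words(sentence))
        if score > best_score:
            best, best_score = sentence, score
    return best


title = "ANDERSON HEADS FOR MIDDLE EAST"
candidates = [
    "U.S. Secretary of State Tom Anderson set out on his first Middle East trip Tuesday.",
    "The weather in Tokyo was fine.",  # made-up distractor sentence
]
similar = most_similar(title, candidates)
print(similar)
```

Ties are resolved here in favor of the earlier sentence; the text does not say how ties should be handled, so that choice is arbitrary.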

[0009]

[0010] Preferably, only the translation of the incomplete sentence generated by the complementing means may be stored, or both the translation of the incomplete sentence generated by the translation means and the translation of the incomplete sentence generated by the complementing means may be stored.

[0011] Preferably, incomplete sentences may be extracted from the source text before translation by the translation means, and when an incomplete sentence is extracted, the translation means need not translate that incomplete sentence.

[0012] Preferably, the system may further comprise means for outputting the translation, and a translation of the incomplete sentence generated by the complementing means may be output with data added to indicate that the sentence was generated by the complementing means.

[0013] Preferably, the system may further comprise means for presenting, simultaneously or in turn, the translation of the incomplete sentence generated by the translation means and the translation of the incomplete sentence generated by the complementing means.

[0014] A machine translation method according to the present invention comprises the steps of: inputting a source text to be translated; storing the input source text in source text storage means; translating the source text with translation means to generate a translation; storing the generated translation in translation storage means; extracting an incomplete sentence from the source text with first extraction means; extracting, from the source text, a similar sentence that resembles the incomplete sentence with second extraction means; and a complementing step of generating a translation of the incomplete sentence with complementing means using the similar sentence. In the complementing step, the complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence.

Another machine translation method according to the present invention comprises the steps of: inputting a source text to be translated; storing the input source text in source text storage means; translating the source text with translation means to generate a translation; storing the generated translation in translation storage means; extracting an incomplete sentence from the source text with first extraction means; inputting another predetermined document written in the same language as the source text; extracting, from the other predetermined document, a similar sentence that resembles the incomplete sentence with second extraction means; and a complementing step of generating a translation of the incomplete sentence with complementing means using the similar sentence. In the complementing step, the complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence.

[0015] Another machine translation method according to the present invention comprises the steps of: inputting a source text to be translated; storing the input source text in source text storage means; translating the source text with translation means to generate a translation; storing the generated translation in translation storage means; extracting an incomplete sentence from the source text with first extraction means; extracting, from the source text, a similar sentence that resembles the incomplete sentence with second extraction means; and a complementing step of generating a translation of the incomplete sentence with complementing means using the similar sentence. In the complementing step, the complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence.

Another machine translation method according to the present invention comprises the steps of: inputting a source text to be translated; storing the input source text in source text storage means; translating the source text with translation means to generate a translation; storing the generated translation in translation storage means; extracting an incomplete sentence from the source text with first extraction means; inputting another predetermined document written in the same language as the source text; extracting, from the other predetermined document, a similar sentence that resembles the incomplete sentence with second extraction means; and a complementing step of generating a translation of the incomplete sentence with complementing means using the similar sentence. In the complementing step, the complementing means uses, as the translation of the incomplete sentence, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence.

[0016] The present invention is also a computer-readable recording medium storing a program for causing a computer to: input a source text to be translated; store the input source text in source text storage means; translate the source text with translation means to generate a translation; store the generated translation in translation storage means; extract an incomplete sentence from the source text with first extraction means; extract, from the source text, a similar sentence that resembles the incomplete sentence with second extraction means; and generate a translation of the incomplete sentence with complementing means using the similar sentence, wherein, in generating the translation with the complementing means, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence, is used as the translation of the incomplete sentence.

The present invention is also a computer-readable recording medium storing a program for causing a computer to: input a source text to be translated; store the input source text in source text storage means; translate the source text with translation means to generate a translation; store the generated translation in translation storage means; extract an incomplete sentence from the source text with first extraction means; input another predetermined document written in the same language as the source text; extract, from the other predetermined document, a similar sentence that resembles the incomplete sentence with second extraction means; and generate a translation of the incomplete sentence with complementing means using the similar sentence, wherein, in generating the translation with the complementing means, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of a tree structure generated by applying a predetermined structural analysis to the similar sentence, is used as the translation of the incomplete sentence.

[0017] The present invention is also a computer-readable recording medium storing a program for causing a computer to: input a source text to be translated; store the input source text in source text storage means; translate the source text with translation means to generate a translation; store the generated translation in translation storage means; extract an incomplete sentence from the source text with first extraction means; extract, from the source text, a similar sentence that resembles the incomplete sentence with second extraction means; and generate a translation of the incomplete sentence with complementing means using the similar sentence, wherein, in generating the translation with the complementing means, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence, is used as the translation of the incomplete sentence.

The present invention is also a computer-readable recording medium storing a program for causing a computer to: input a source text to be translated; store the input source text in source text storage means; translate the source text with translation means to generate a translation; store the generated translation in translation storage means; extract an incomplete sentence from the source text with first extraction means; input another predetermined document written in the same language as the source text; extract, from the other predetermined document, a similar sentence that resembles the incomplete sentence with second extraction means; and generate a translation of the incomplete sentence with complementing means using the similar sentence, wherein, in generating the translation with the complementing means, a sentence generated from the nodes that run from a node having a phrase corresponding to a phrase contained in the incomplete sentence up to the root node, among the nodes of the tree structure of concept dependency structure data generated for the similar sentence, is used as the translation of the incomplete sentence.

[0018] Note that the present invention as an apparatus also holds as an invention of a method, and the present invention as a method also holds as an invention of an apparatus. The present invention as an apparatus or a method also holds as a computer-readable recording medium storing a program for causing a computer to execute the procedure corresponding to the invention (or for causing a computer to function as the means corresponding to the invention, or for causing a computer to realize the functions corresponding to the invention).

[0019] According to the present invention, the translation of an incomplete sentence is generated using the source text or the translation result of a sentence that is similar to the incomplete sentence (and is not itself an incomplete sentence), so more accurately translated text can be obtained and the translation accuracy for incomplete sentences can be improved. As a result, the readability of the translation improves, which helps the user understand the text more quickly.

[0020]

[Embodiments of the Invention] Embodiments of the invention are described below with reference to the drawings. FIG. 1 shows a configuration example of a machine translation system according to an embodiment of the present invention. As shown in FIG. 1, the machine translation system comprises an input unit 101, a translation unit 102, a control unit 103, an incomplete sentence recognition and complementing unit 104, and an output unit 105.

[0021] The input unit 101 is used to input the document to be translated, instructions from the user, and the like. The document to be translated can be input using, for example, a keyboard (key input), a recording medium drive (reading from a recording medium such as a magnetic disk, magnetic tape, or optical disk), OCR (optical reading), or a network connection device (acquisition by communication). Any plurality of input forms may also be made available. When OCR is used, a function that performs character recognition on the read character image and generates a code string is used.

[0022] Instructions from the user (for example, a translation instruction) are input using, for example, a keyboard or a mouse. The control unit 103 controls the translation system as a whole: it causes the input unit 101 to input the document to be translated, sends the input document to the translation unit 102 for processing, sends the translation results and the like from the translation unit 102 to the incomplete sentence recognition and complementing unit 104 for processing, and sends the translation results and the like to be output, obtained by the translation unit 102 and the incomplete sentence recognition and complementing unit 104, to the output unit 105 for output.

[0023] The translation unit 102 translates the document data passed from the control unit 103 and returns the result to the control unit 103. The incomplete sentence recognition and complementing unit 104 receives the translation results and the like of the translation unit 102 from the control unit 103, recognizes incomplete sentences in the input document and, if an incomplete sentence is extracted, performs optimization processing on that incomplete sentence and returns the result to the control unit 103.
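The routing performed by the control unit 103 in [0022] and [0023] can be sketched as a single function that passes data between the units in order. The function name `run_translation_system` and the two stub units are hypothetical and exist only to make the sketch executable; the stubs return placeholder strings rather than real translations.

```python
def run_translation_system(document, translate, complement, emit):
    """Play the role of control unit 103: pass data between the units in order."""
    bilingual = translate(document)      # translation unit 102
    optimized = complement(bilingual)    # incomplete sentence recognition and complementing unit 104
    emit(optimized)                      # output unit 105
    return optimized


def stub_translate(document):
    """Stand-in for unit 102: pair each source sentence with a placeholder translation."""
    return [(src, f"<translation of: {src}>") for src in document]


def stub_complement(bilingual):
    """Stand-in for unit 104: replace the translation of all-caps sentences with a placeholder."""
    return [
        (src, f"<complemented: {src}>" if src.isupper() else tgt)
        for src, tgt in bilingual
    ]


document = [
    "ANDERSON HEADS FOR MIDDLE EAST",
    "U.S. Secretary of State Tom Anderson set out on his first Middle East trip Tuesday.",
]
emitted = []
result = run_translation_system(document, stub_translate, stub_complement, emitted.append)
print(result)
```

Passing the units in as callables keeps the control flow separate from how each unit is implemented, which mirrors the division of roles in FIG. 1.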

[0024] Here, an "incomplete sentence" is, for example, a sentence consisting only of a noun phrase, a grammatically incorrect sentence, or a highly ambiguous sentence. Which kinds of incomplete sentence the system handles can be determined as appropriate. One conceivable detection method is to define in advance conditions that characterize the incomplete sentences the system targets, check each sentence of the input text against those conditions, and treat a sentence as an incomplete sentence when it satisfies them.

[0025] In the optimization processing, a sentence similar to the incomplete sentence (hereinafter called a similar sentence) is extracted from the input document, and this similar sentence is used to generate a translation that is expected to be more appropriate for the incomplete sentence. The similar sentence is preferably chosen from sentences that are not themselves incomplete sentences.

[0026] The output unit 105 is used to output translated text and the like. Various output forms are conceivable, for example outputting only the translated text, or outputting the input document together with its translation.

[0027] As the output unit 105, an optical display (presentation to the user), a printer (presentation to the user), a network connection device (output to a printer connected via a network, transfer to another computer via a network, and so on), a recording medium drive (saving to a recording medium such as a magnetic disk, magnetic tape, or optical disk), or the like can be used. Any plurality of output forms may also be made available.

[0028] The present embodiment is described in more detail below using a concrete example. The embodiment is described for the case of translating English into Japanese, and a "title sentence" is used as the example of an incomplete sentence.

[0029] FIG. 2 shows an example of English text (source text) obtained by the input unit 101. This input text consists of multiple sentences: the first sentence is "ANDERSON HEADS FOR MIDDLE EAST", the second sentence is "U.S. Secretary of State Tom Anderson set out on his first Middle East trip Tuesday.", the third sentence is "He said he hoped to …", and so on.

[0030] The input text obtained by the input unit 101 is passed to the translation unit 102 under the control of the control unit 103. The translation unit 102 translates the input text and returns the translation result to the control unit 103 in the form of bilingual data (source sentences and translated sentences paired sentence by sentence). A general method can be used as the translation method in the translation unit 102.

[0031] Together with the bilingual data, the translation unit 102 also returns the analysis result of each sentence (for example, the result of syntactic analysis) to the control unit 103 in the form of concept dependency structure data. This concept dependency structure data may be obtained in the course of translation processing, or it may be obtained by analysis performed separately from translation processing.

[0032] FIG. 3 shows an example of the parallel translation data, and FIG. 4 shows an example of the concept-dependent structure data obtained by syntactic analysis. The pointer to concept-dependent structure data provided for each sentence in FIG. 3 indicates a link to the concept-dependent structure data of FIG. 4 that corresponds to that sentence.
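The link between the parallel translation data and the concept-dependent structure data can be pictured with a small Python sketch. This is only an illustration under assumptions: the class names `ParallelEntry` and `StructureNode`, the field `structure_id`, the placeholder translations, and the two tiny trees are all hypothetical, since FIG. 3 and FIG. 4 are not reproduced in this text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StructureNode:
    """One node of a concept-dependent structure tree (fields are assumed)."""
    word: str
    children: List["StructureNode"] = field(default_factory=list)


@dataclass
class ParallelEntry:
    """One row of parallel translation data: original, translation, pointer."""
    number: int
    original: str
    translation: str
    structure_id: Optional[int] = None   # key into the structure table


# Analysis results live in their own table; each row only stores a key.
structures: Dict[int, StructureNode] = {
    1: StructureNode("head", [StructureNode("Anderson")]),
    2: StructureNode("set out", [StructureNode("Tom Anderson")]),
}

entries = [
    ParallelEntry(1, "ANDERSON HEADS FOR MIDDLE EAST",
                  "<translation of sentence 1>", structure_id=1),
    ParallelEntry(2, "U.S. Secretary of State Tom Anderson set out on his "
                     "first Middle East trip Tuesday.",
                  "<translation of sentence 2>", structure_id=2),
]


def structure_of(entry: ParallelEntry) -> Optional[StructureNode]:
    """Follow an entry's pointer to its analysis result, if any."""
    if entry.structure_id is None:
        return None
    return structures.get(entry.structure_id)
```

Keeping the analysis results in a separate table means a later stage can rewrite a translation without touching the tree it was derived from.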

[0033] Once the translation unit 102 has generated the parallel translation data and the concept-dependent structure data, these data are passed to the incomplete sentence recognition complementing unit 104 under the control of the control unit 103.

[0034] The incomplete sentence recognition complementing unit 104 detects a title sentence (that is, the incomplete sentence in this example) in the parallel translation data, applies optimization processing to the detected title sentence to correct the corresponding translation, and then returns the data to the control unit 103. The processing in the incomplete sentence recognition complementing unit 104 is described in detail below.

[0035] FIG. 5 shows an example of the processing procedure of the incomplete sentence recognition complementing unit 104. First, in step S402, a title sentence is detected from the input parallel translation data.

[0036] To detect a title sentence, one or a combination of linguistic phenomena frequently seen in title sentences may be used, for example "the original sentence is written entirely in capital letters" or "the sentence is on the first line and consists only of a noun phrase". When the input text is written in a hypertext description language such as HTML, a title sentence may be tagged or may be given a larger character size than other sentences, so such additional information may also be used.
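Two of these detection cues can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the function names, the set `TITLE_TAGS`, and the choice of which HTML tags count as title-like are assumptions made for illustration. Only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser

TITLE_TAGS = {"title", "h1", "h2", "h3"}   # assumed "title-like" tags


def is_all_caps(sentence):
    """True if the sentence has letters and none of them is lower case."""
    letters = [ch for ch in sentence if ch.isalpha()]
    return bool(letters) and all(ch.isupper() for ch in letters)


class TitleCollector(HTMLParser):
    """Collects the text found inside title-like tags."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # number of title-like tags currently open
        self._buffer = []      # text pieces of the candidate being read
        self.candidates = []   # finished candidate title sentences

    def handle_starttag(self, tag, attrs):
        if tag in TITLE_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in TITLE_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and normalise runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.candidates.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def title_candidates(html_text):
    """Return the text of every title-like element in an HTML string."""
    parser = TitleCollector()
    parser.feed(html_text)
    parser.close()
    return parser.candidates
```

In practice the two cues could be combined, for example by accepting a sentence as a title when either test succeeds.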

[0037] Here, assuming that a sentence written entirely in capital letters is recognized as a title sentence, the sentence with sentence number 1 in the input text of FIG. 3, namely "ANDERSON HEADS FOR MIDDLE EAST", is recognized as the title sentence and extracted.

[0038] The words of the extracted title sentence are reduced to their base forms by the morphological analysis used in the translation unit 102. In this example, "Anderson", "head", "for", "middle", and "east" are extracted. A general method can be used for this morphological analysis.
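The reduction to base forms can be illustrated with a toy stand-in for morphological analysis. The sketch below assumes a tiny hand-written lookup table (`BASE_FORMS`) and lower-cases every word so that later comparisons ignore capitalization; a real system would use the translation unit's own analyzer, which is not described here.

```python
import re

# Hypothetical lookup table standing in for a real morphological analyzer.
BASE_FORMS = {"heads": "head", "headed": "head", "hoped": "hope", "said": "say"}


def base_forms(sentence):
    """Split a sentence into words and map each word to its base form."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [BASE_FORMS.get(token.lower(), token.lower()) for token in tokens]
```

For the title sentence of this example, `base_forms("ANDERSON HEADS FOR MIDDLE EAST")` yields the five base forms listed above, in lower case.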

[0039] Next, in steps S403 to S408, each sentence other than the title sentence is examined for its similarity to the title sentence, and the most similar sentence is taken as the title-similar sentence. The similarity to the title sentence can be evaluated, for example, by counting how many base forms of the words constituting the title sentence are contained in each sentence. Here, as an example, the similarity is the number of base forms, among the words in the title sentence excluding function words (prepositions and the like), that the sentence contains.

[0040] First, the first sentence is taken out in step S404, but because it is the title sentence, nothing is done in step S405 and its similarity is set to 0. Next, the following sentence is taken out in steps S407 and S404, and its similarity to the title sentence is calculated in step S405. In this embodiment the similarity is the number of base forms of the title sentence's words, excluding function words, that appear in the sentence, so morphological analysis is performed on the extracted sentence to obtain the base forms of its words.

[0041] Next, each of those words is checked for a match with "Anderson", "head", "middle", and "east", and the number of matches is written as the similarity into the similarity column of the parallel translation data of FIG. 3.
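The similarity measure described here can be sketched as a count of shared base forms. The sketch makes several assumptions: the function-word list and the base-form table are hypothetical and deliberately tiny, and the count is taken over distinct title words, which is one plausible reading of "the number of matches".

```python
import re

FUNCTION_WORDS = {"for", "of", "on", "to", "the", "a", "an", "in"}  # assumed
BASE_FORMS = {"heads": "head", "hoped": "hope", "said": "say"}       # assumed


def base_forms(sentence):
    """Toy morphological analysis: lower-case words mapped to base forms."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [BASE_FORMS.get(token.lower(), token.lower()) for token in tokens]


def similarity(title, sentence):
    """Count the title's content-word base forms that occur in the sentence."""
    title_words = {w for w in base_forms(title) if w not in FUNCTION_WORDS}
    return len(title_words & set(base_forms(sentence)))
```

With the example text, the second sentence shares "anderson", "middle", and "east" with the title, while the fragment "He said he hoped to" shares nothing.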

[0042] After the similarity is written, step S406 checks whether the sentence is the last one. If it is not, the process advances to the next sentence in steps S407 and S404, and the similarity is calculated again in step S405.

[0043] If it is the last sentence, the process proceeds to step S408, where the sentence with the highest similarity in the parallel translation data is taken as the title-similar sentence, and then to the title reflection processing of step S409. FIG. 6 shows an example of parallel translation data in which similarity information has been added to the example of FIG. 3.

[0044] Here, it is assumed that in step S408 the sentence with sentence number 2 in FIG. 6, namely "U.S. Secretary of State Tom Anderson set out on his first Middle East trip on Tuesday.", is selected as the title-similar sentence.

[0045] If a plurality of sentences share the maximum similarity, one of them is selected according to another predetermined criterion. Various methods are conceivable, for example selecting the sentence closest in position to the incomplete sentence (the closest sentence number), selecting the sentence with the largest number of displayable nodes (described later), or selecting the sentence with the largest number of words.
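Selecting the title-similar sentence, including one of the tie-breaking rules mentioned above, might look like the following sketch. It assumes that similarities have already been computed and stored per sentence number, uses the "closest sentence number" rule for ties, and returns `None` when no sentence has a positive similarity; that last behavior is an assumption, not something the text prescribes.

```python
def pick_similar_sentence(scores, title_number):
    """Return the number of the sentence most similar to the title.

    scores maps sentence number -> similarity. Ties on the maximum are
    broken by distance from the title sentence, then by sentence number.
    """
    candidates = {n: s for n, s in scores.items() if n != title_number}
    if not candidates:
        return None
    best = max(candidates.values())
    if best == 0:
        return None   # assumed: no usable similar sentence in this text
    tied = [n for n, s in candidates.items() if s == best]
    return min(tied, key=lambda n: (abs(n - title_number), n))
```

The other criteria named in the text (number of displayable nodes, number of words) could be swapped in by changing the sort key.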

[0046] Next, in the title reflection processing of step S409, the translation of the title-similar sentence is processed so that it becomes appropriate as a title, the parallel translation data of FIG. 6 is corrected accordingly, the data is returned to the control unit 103, and the processing ends.

[0047] FIG. 7 shows an example of the procedure of the title reflection processing of step S409. First, in step S602, the syntactic structure of the title-similar sentence is retrieved from the concept-dependent structure data by using the concept-dependent structure pointer in the parallel translation data of FIG. 6. Although the concept-dependent structure obtained by syntactic analysis is used here as the structural analysis result, a result obtained by another analysis technique, such as semantic analysis, may be used instead.

[0048] FIG. 8 shows an example of the concept-dependent structure of the title-similar sentence obtained from the parallel translation data of FIG. 6. Next, in steps S603 to S607, the nodes to be displayed (displayable nodes) are selected from among the nodes constituting the concept-dependent structure, and a displayable flag is set on each of them.

[0049] First, in step S603, the variable p is set to 1, the p-th word from the beginning of the title sentence is taken out, and a search is made to determine whether it is contained as a node of the title-similar sentence. If it is, in step S604 the parse tree is traced from that node and all nodes up to the root node are made displayable, and the process proceeds to step S606. If it is not, the process proceeds to step S606 without doing anything.

[0050] Next, in step S606, it is determined whether the p-th word is the last word of the sentence. If it is not, the variable p is incremented by one and the processing is repeated from step S604. If it is, the process proceeds to step S608.
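The loop over title words and the climb to the root node can be sketched as follows. The tree built here is a hypothetical stand-in for the structure of the title-similar sentence (the corresponding figure is not reproduced in this text), and the `Node` class and its fields are assumptions for illustration.

```python
class Node:
    """A node of a (hypothetical) concept-dependent structure tree."""

    def __init__(self, label, words, parent=None):
        self.label = label          # surface text of the node
        self.words = set(words)     # base forms the node covers
        self.parent = parent        # None for the root node
        self.displayable = False


def mark_displayable(title_words, nodes):
    """For each title word, flag the matching node and all its ancestors."""
    for word in title_words:                  # p-th word, p = 1, 2, ...
        for node in nodes:
            if word not in node.words:
                continue                      # word absent: nothing to do
            current = node
            while current is not None:        # climb to the root node
                current.displayable = True
                current = current.parent


# Assumed tree for "... Tom Anderson set out on his first Middle East trip ..."
root = Node("set out", ["set", "out"])
anderson = Node("Tom Anderson", ["tom", "anderson"], root)
secretary = Node("U.S. Secretary of State", ["secretary", "state"], anderson)
trip = Node("trip", ["trip"], root)
middle_east = Node("Middle East", ["middle", "east"], trip)
first = Node("first", ["first"], trip)
tuesday = Node("Tuesday", ["tuesday"], root)
nodes = [root, anderson, secretary, trip, middle_east, first, tuesday]

mark_displayable(["anderson", "head", "for", "middle", "east"], nodes)
shown = [node.label for node in nodes if node.displayable]
```

With this assumed tree, the flagged nodes are "set out", "Tom Anderson", "trip", and "Middle East"; modifiers that have no counterpart in the title, such as "first" and "Tuesday", stay hidden.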

[0051] In the example of FIGS. 6 and 8, "ANDERSON" is selected first in step S604, and "アンダーソン", the translation of "ANDERSON" in the concept-dependent structure of FIG. 8, is made displayable. This state is shown in FIG. 9.

[0052] Next, by the processing of step S605, the nodes up to the root node become displayable. This state is shown in FIG. 10. The next word is then examined in steps S606 and S607, but because "HEAD" does not appear in the title-similar sentence, nothing is done and the processing advances through steps S606, S607, and S604.

[0053] By repeating the above processing, a tree structure such as that shown in FIG. 11 is generated. Next, in step S608, the translation of the title sentence (that is, the incomplete sentence) is reconstructed using the displayable nodes of the generated tree structure.

[0054] The sentence generation processing used in the translation unit 102 may be used to reconstruct this translation. A general method can be used for this sentence generation processing. The translation of the title sentence in the parallel translation data of FIG. 6 is then rewritten with the generated translation, and the processing ends.

[0055] In the example of FIG. 11, "トム・アンダーソンは中東旅行に出発した" ("Tom Anderson set out on a Middle East trip") is generated as the translation, the translation in the parallel translation data of FIG. 6 is rewritten with it, and the processing ends.

[0056] FIG. 12 shows an example of the rewritten parallel translation data. Once the parallel translation data has been corrected by the incomplete sentence recognition complementing unit 104, the data to be output is passed to the output unit 105 under the control of the control unit 103, and the output unit 105 outputs it.

[0057] According to this embodiment, because the translation of an incomplete sentence is generated by using a sentence similar to that incomplete sentence, a more accurately translated text can be obtained and the readability of the translation can be improved.

[0058] In the above description, data is passed to each unit via the control unit. Either the actual data may be passed, or pointer information such as the address at which the actual data is stored may be passed.

[0059] In the above description, the translation of the title-similar sentence is processed so as to be appropriate as a title, but the translation of the title-similar sentence may instead be used as the title as it is. Also, in the above description, concept-dependent structure data is generated for every sentence. However, when the concept-dependent structure data is not obtained in the course of translation but by an analysis performed separately from translation, it may be generated only for the "similar sentence" on which the incomplete sentence recognition complementing unit bases its translation of the incomplete sentence. In that case, the incomplete sentence recognition complementing unit itself may generate the concept-dependent structure data.

[0060] In the above description, the translation unit's translation of the incomplete sentence is overwritten with the translation obtained by the incomplete sentence recognition complementing unit, but both translations may be stored together. Alternatively, incomplete sentences may first be extracted from the input document, and for those sentences the translation by the translation unit may be skipped so that only the incomplete sentence recognition complementing unit generates a translation.

[0061] When both the translation unit's translation and the incomplete sentence recognition complementing unit's translation of an incomplete sentence are stored, both translations can be output simultaneously for the incomplete sentence when the translated text is output externally as the translation result. For example, when the translated text is shown on an optical display, the translation by the incomplete sentence recognition complementing unit may be presented first for an incomplete sentence and the translation by the translation unit presented on instruction from the user, or both translations may be presented at the same time.

[0062] In the above description, the incomplete sentence recognition complementing unit generates the translation of the incomplete sentence by using the sentence most similar to it, but a plurality of translations may be generated by using each of several sentences similar to the incomplete sentence.

[0063] When the translation data is output, information indicating that the translation of the incomplete sentence was obtained by the incomplete sentence recognition complementing unit may be attached (for example, data such as characters that state this explicitly, or character decoration information, such as a color for the characters of that translation, whose value allows it to be distinguished from a translation by the translation unit). Information such as the content or the sentence number of the similar sentence on which the generated translation was based may also be attached.
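One way to attach such provenance information at output time is sketched below. The record layout, the field names, and the textual marker `[complemented]` are all assumptions chosen for illustration; the text leaves the concrete form of the marker open (explicit characters, color, and so on).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputSentence:
    """A translated sentence plus (assumed) provenance fields."""
    original: str
    translation: str
    by_complement: bool = False                     # produced by unit 104?
    similar_sentence_number: Optional[int] = None   # source of the rewrite


def render(sentence, marker="[complemented]"):
    """Return the translation, annotated if it came from complementing."""
    if not sentence.by_complement:
        return sentence.translation
    note = marker
    if sentence.similar_sentence_number is not None:
        note += f" (based on sentence {sentence.similar_sentence_number})"
    return f"{sentence.translation} {note}"
```

A display-oriented variant could map `by_complement` to a text color instead of appending a string.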

[0064] In the above description, a sentence similar to the incomplete sentence is extracted from the same input document and used to generate the translation of the incomplete sentence. However, the search for similar sentences may also target documents other than the one containing the incomplete sentence, whether in place of that document, together with it, or only when no similar sentence exists in it. Such other documents include, for example, texts previously translated and stored by the system, an existing parallel corpus, or, when a plurality of texts are to be translated, one or more of the other texts. A document other than the one containing the incomplete sentence can be read in through the input unit, for example.
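The fallback variant, in which other documents are consulted only when the document itself contains no similar sentence, can be sketched as an ordered search. The function names and the simple word-overlap score are assumptions for illustration; any similarity measure could be substituted.

```python
import re


def overlap(title_words, sentence):
    """Number of title words that occur in the sentence (toy similarity)."""
    return len(set(title_words) & set(re.findall(r"[a-z]+", sentence.lower())))


def find_similar(title_words, sources):
    """Search each source in order; return the first positive best match.

    sources is a list of sentence lists, ordered by preference, e.g. the
    document itself first, then previously translated texts.
    """
    for sentences in sources:
        best = max(sentences, key=lambda s: overlap(title_words, s),
                   default=None)
        if best is not None and overlap(title_words, best) > 0:
            return best
    return None
```

Searching the other documents "together with" the original would instead concatenate the lists before taking the maximum.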

[0065] Although this embodiment treats title sentences as the example of incomplete sentences, the invention is of course not limited to this; various other forms can be treated as incomplete sentences, for example sentences consisting only of a noun phrase.

[0066] A plurality of types of incomplete sentences may also be handled. For example, both "title sentences" and "other sentences consisting only of a noun phrase (for example, headings)" may be treated as incomplete sentences. In this case, the content of the optimization processing may differ according to the type of incomplete sentence.

[0067] The types of incomplete sentences that the system can handle may also be presented to the user, and the incomplete sentence recognition complementing unit may generate translations only for the one or more types of incomplete sentences selected by the user.
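Restricting the complementing step to user-selected types of incomplete sentences amounts to a simple filter, as in the sketch below. The enumeration `IncompleteType` and its two members are hypothetical labels for the types mentioned in the text.

```python
from enum import Enum


class IncompleteType(Enum):
    """Assumed labels for the kinds of incomplete sentence the system knows."""
    TITLE = "title sentence"
    NOUN_PHRASE = "noun-phrase-only sentence"


def select_for_complement(detected, enabled):
    """Keep only detected sentences whose type the user has enabled.

    detected is a list of (sentence_number, IncompleteType) pairs.
    """
    return [number for number, kind in detected if kind in enabled]
```

The values of the enumeration double as the strings that could be shown to the user when presenting the available types.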

[0068] The above description takes as an example the case where the original text is English and the translation is Japanese, but the present invention is also applicable when the original text is Japanese and the translation is English, and when translating between Japanese and another language or between English and another language.

[0069] An embodiment of the present invention has been described above. Each of the functions described can be realized either in hardware or in software. The machine translation system according to this embodiment can be realized using a general-purpose computer or as a dedicated machine.

[0070] This embodiment can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute predetermined procedures (or for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions). The present invention is not limited to the embodiment described above and can be implemented with various modifications within its technical scope.

【００７１】[0071]

[Effect of the invention] According to the present invention, because the translation of an incomplete sentence is generated by using a sentence similar to that incomplete sentence, a more accurately translated text can be obtained and the readability of the translation can be improved.

[Brief description of drawings]

FIG. 1 is a diagram showing a configuration example of a machine translation system according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of input text.

FIG. 3 is a diagram showing an example of parallel translation data.

FIG. 4 is a diagram showing an example of concept-dependent structure data.

FIG. 5 is a flowchart showing an example of the procedure of incomplete sentence recognition complementing processing.

FIG. 6 is a diagram showing an example of parallel translation data with similarity added.

FIG. 7 is a flowchart showing an example of the procedure of title reflection processing.

FIG. 8 is a diagram showing an example of the concept-dependent structure of a title-similar sentence.

FIG. 9 is a diagram showing an example of the concept-dependent structure of a title-similar sentence.

FIG. 10 is a diagram showing an example of the concept-dependent structure of a title-similar sentence.

FIG. 11 is a diagram showing an example of the concept-dependent structure of a title-similar sentence.

FIG. 12 is a diagram showing an example of the finally rewritten parallel translation data.

[Explanation of symbols]

101 ... input unit
102 ... translation unit
103 ... control unit
104 ... incomplete sentence recognition complementing unit
105 ... output unit

Continuation of front page

(56) References
Masaki Murata and Makoto Nagao, "Complementing verb ellipsis using surface expressions and examples", IPSJ SIG Technical Report 96-SLP-14-4, Japan, December 13, 1996, Vol. 96, No. 123, pp. 23-30
Masaki Murata and Makoto Nagao, "Complementing verb ellipsis using surface expressions and examples", IEICE Technical Report SP96-72, Japan, December 12, 1996, Vol. 96, No. 421, pp. 23-30

(58) Fields searched (Int. Cl.7, DB name): G06F 17/21 - 17/28

Claims

(57) [Claims]

1. A machine translation system comprising: input means for inputting an original text to be translated; translation means for translating the original text to generate a translated text; first extraction means for extracting an incomplete sentence from the original text; second extraction means for extracting, from the original text, a similar sentence that is similar to the incomplete sentence; and complementing means for generating a translated sentence for the incomplete sentence by using the similar sentence, wherein the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure generated by applying a predetermined structural analysis to the similar sentence, the nodes from a node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

2. A machine translation system comprising: input means for inputting an original text to be translated; translation means for translating the original text to generate a translated text; first extraction means for extracting an incomplete sentence from the original text; second extraction means for extracting a similar sentence that is similar to the incomplete sentence from another predetermined document written in the same language as the original text; and complementing means for generating a translated sentence for the incomplete sentence by using the similar sentence, wherein the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure generated by applying a predetermined structural analysis to the similar sentence, the nodes from a node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

3. A machine translation system comprising: input means for inputting an original text to be translated; translation means for translating the original text to generate a translated text; first extraction means for extracting an incomplete sentence from the original text; second extraction means for extracting, from the original text, a similar sentence that is similar to the incomplete sentence; and complementing means for generating a translated sentence for the incomplete sentence by using the similar sentence, wherein the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting the tree structure of concept-dependent structure data generated for the similar sentence, the nodes from a node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

4. A machine translation system comprising: input means for inputting an original text to be translated; translation means for translating the original text to generate a translated text; first extraction means for extracting an incomplete sentence from the original text; second extraction means for extracting a similar sentence that is similar to the incomplete sentence from another predetermined document written in the same language as the original text; and complementing means for generating a translated sentence for the incomplete sentence by using the similar sentence, wherein the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting the tree structure of concept-dependent structure data generated for the similar sentence, the nodes from a node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

5. The machine translation system according to any one of claims 1 to 4, wherein the first extraction means extracts a sentence satisfying a predetermined condition as the incomplete sentence.

6. The machine translation system according to any one of claims 1 to 4, wherein the incomplete sentence is a title sentence.

7. The machine translation system according to any one of claims 1 to 4, wherein the second extraction means extracts, as the similar sentence, the sentence that, among the sentences included in the original text, contains the largest number of phrases corresponding to the phrases, other than function words, included in the incomplete sentence.

8. The machine translation system according to any one of claims 1 to 4, wherein the translated sentence of the incomplete sentence generated by the complementing means is stored, or both the translated sentence of the incomplete sentence generated by the translation means and the translated sentence of the incomplete sentence generated by the complementing means are stored.

9. The machine translation system according to any one of claims 1 to 4, wherein an incomplete sentence is extracted from the original text prior to translation by the translation means, and an extracted incomplete sentence is not translated by the translation means.

10. The machine translation system according to any one of claims 1 to 4, further comprising means for outputting the translated text, wherein the translated sentence of the incomplete sentence generated by the complementing means is output with data added to indicate that it is a sentence generated by the complementing means.

11. The machine translation system according to any one of claims 1 to 4, further comprising means for presenting the translated sentence of the incomplete sentence generated by the translation means and the translated sentence of the incomplete sentence generated by the complementing means simultaneously or in sequence.

12. A machine translation method comprising: a step of inputting an original text to be translated; a step of storing the input original text in original text storage means; a step of translating the original text by translation means to generate a translated text; a step of storing the generated translated text in translated text storage means; a step of extracting an incomplete sentence from the original text by first extraction means; a step of extracting, by second extraction means, a similar sentence that is similar to the incomplete sentence from the original text; and a complementing step of generating, by complementing means, a translated sentence for the incomplete sentence by using the similar sentence, wherein in the complementing step the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure generated by applying a predetermined structural analysis to the similar sentence, the nodes from a node having a word corresponding to a word included in the incomplete sentence up to the root node.

13. A machine translation method comprising: a step of inputting an original text to be translated; a step of storing the input original text in original text storage means; a step of translating the original text by translation means to generate a translated text; a step of storing the generated translated text in translated text storage means; a step of extracting an incomplete sentence from the original text by first extraction means; a step of inputting another predetermined document written in the same language as the original text; a step of extracting, by second extraction means, a similar sentence that is similar to the incomplete sentence from the other predetermined document; and a complementing step of generating, by complementing means, a translated sentence for the incomplete sentence by using the similar sentence, wherein in the complementing step the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure generated by applying a predetermined structural analysis to the similar sentence, the nodes from a node having a word corresponding to a word included in the incomplete sentence up to the root node.

14. A machine translation method comprising: a step of inputting an original text to be translated; a step of storing the input original text in original text storage means; a step of translating the original text by translation means to generate a translated text; a step of storing the generated translated text in translated text storage means; a step of extracting an incomplete sentence from the original text by first extraction means; a step of extracting, by second extraction means, a similar sentence that is similar to the incomplete sentence from the original text; and a complementing step of generating, by complementing means, a translated sentence for the incomplete sentence by using the similar sentence, wherein in the complementing step the complementing means takes, as the translated sentence for the incomplete sentence, a sentence generated by using, among the nodes constituting the tree structure of concept-dependent structure data generated for the similar sentence, the nodes from a node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

15. A machine translation method comprising:
a step of inputting an original text to be translated;
a step of storing the input original text in original text storage means;
a step of translating the original text by translation means to generate a translated text;
a step of storing the generated translated text in translated text storage means;
a step of extracting an incomplete sentence from the original text by first extracting means;
a step of inputting another predetermined document written in the same language as the original text;
a step of extracting, by second extracting means, a similar sentence similar to the incomplete sentence from the other predetermined document; and
a complementing step of generating, by complementing means, a translation for the incomplete sentence by using the similar sentence,
characterized in that, in the complementing step, a sentence generated by using, among the nodes constituting a tree structure of concept-dependent structural data generated for the similar sentence, the nodes from the node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node is used as the translation for the incomplete sentence.

16. A computer-readable recording medium on which is recorded a program for causing a computer to:
input an original text to be translated and store the input original text in original text storage means;
translate the original text by translation means to generate a translated text and store the generated translated text in translated text storage means;
extract an incomplete sentence from the original text by first extracting means;
extract, by second extracting means, a similar sentence similar to the incomplete sentence from the original text; and
generate, by complementing means, a translation for the incomplete sentence by using the similar sentence,
wherein, when the translation is generated by the complementing means, the program causes the computer to translate a sentence generated by using, among the nodes constituting a tree structure obtained by applying a predetermined structural analysis to the similar sentence, the nodes from the node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

17. A computer-readable recording medium on which is recorded a program for causing a computer to:
input an original text to be translated and store the input original text in original text storage means;
translate the original text by translation means to generate a translated text and store the generated translated text in translated text storage means;
extract an incomplete sentence from the original text by first extracting means;
input another predetermined document written in the same language as the original text;
extract, by second extracting means, a similar sentence similar to the incomplete sentence from the other predetermined document; and
generate, by complementing means, a translation for the incomplete sentence by using the similar sentence,
wherein, when the translation is generated by the complementing means, the program causes the computer to translate a sentence generated by using, among the nodes constituting a tree structure obtained by applying a predetermined structural analysis to the similar sentence, the nodes from the node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

18. A computer-readable recording medium on which is recorded a program for causing a computer to:
input an original text to be translated and store the input original text in original text storage means;
translate the original text by translation means to generate a translated text and store the generated translated text in translated text storage means;
extract an incomplete sentence from the original text by first extracting means;
extract, by second extracting means, a similar sentence similar to the incomplete sentence from the original text; and
generate, by complementing means, a translation for the incomplete sentence by using the similar sentence,
wherein, when the translation is generated by the complementing means, the program causes the computer to use, as the translation for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure of concept-dependent structural data generated for the similar sentence, the nodes from the node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.

19. A computer-readable recording medium on which is recorded a program for causing a computer to:
input an original text to be translated and store the input original text in original text storage means;
translate the original text by translation means to generate a translated text and store the generated translated text in translated text storage means;
extract an incomplete sentence from the original text by first extracting means;
input another predetermined document written in the same language as the original text;
extract, by second extracting means, a similar sentence similar to the incomplete sentence from the other predetermined document; and
generate, by complementing means, a translation for the incomplete sentence by using the similar sentence,
wherein, when the translation is generated by the complementing means, the program causes the computer to use, as the translation for the incomplete sentence, a sentence generated by using, among the nodes constituting a tree structure of concept-dependent structural data generated for the similar sentence, the nodes from the node having a phrase corresponding to a phrase included in the incomplete sentence up to the root node.