JP2004157688A - Translation device, translation method and translation program

Info

Publication number: JP2004157688A
Application number: JP2002321718A
Authority: JP
Inventors: Akira Kataoka; 明片岡; Naruhiro Ikeda; 成宏池田; Yoshihiro Matsuo; 義博松尾
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2002-11-05
Filing date: 2002-11-05
Publication date: 2004-06-03
Anticipated expiration: 2022-11-05
Also published as: JP4399154B2

Abstract

PROBLEM TO BE SOLVED: To translate accurately without preparing a plurality of translation patterns, even when the source language has free word order and the corresponding target language has fixed word order.

SOLUTION: A text sentence written in the source language is divided into words, and a syntax tree is created from the divided text sentence according to source language patterns. When the syntax tree of the target language is created from the syntax tree of the source language according to target language patterns, the components designated in advance by markers, among the components that make up the sentence structure of the target language, are stored together with their markers, and the translation is generated by placing each stored component at the position of its marker in the target language pattern.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
この発明は、原言語で記載されたテキスト文を目的言語に翻訳する機械翻訳システムに係わり、特に翻訳パターンを用いて翻訳を行う翻訳装置、翻訳方法及び翻訳プログラムに関する。
【０００２】
【従来の技術】
機械翻訳システムにおいて、機械翻訳では扱いにくい分野特有の特徴的な文を精度良く翻訳する翻訳方式や、ユーザによって機械翻訳システムをカスタマイズすることが可能な翻訳方式として、従来よりパターン翻訳方式が採用されている。このパターン翻訳方式は、原言語の文パターンとその翻訳である目的言語の文パターンとの対応を表す翻訳パターンを用意しておき、原言語で記載されたテキスト文と上記翻訳パターンとを照合して目的言語に翻訳するものである。
【０００３】
図５は、日本語から英語に翻訳する際の従来技術による翻訳パターンの一例を示すものである。翻訳パターンは、ＩＤと、原言語パターンと、目的言語パターンとから構成される。原言語パターンでは、矢印“⇒”の左側にこの原言語パターンを置き換える変数が記述され、矢印の右側に入力文と照合する単語、あるいは変数の列が記述される。ここで、鈎括弧“「”、“」”で括られている文字列（“私”、“は”など）が単語を示し、括られていない文字列（“文”、“動詞句”など）が変数を示す。また、コロン“：”の後の数字は、原言語パターンに記述された単語や変数を目的言語パターンから参照するためのインデックスを示す。目的言語パターンは、単語、あるいはインデックスの列で表される。ここで、単語は原言語パターンと同様に鈎括弧で括られており、インデックスは原言語パターンに記述された数字と対応している。
【０００４】
例えば、ＩＤが００８である翻訳パターンは、入力文に“私”という単語が存在すれば変数“名詞句”で置き換え、目的言語では“Ｉ”を生成することを意味する。また、ＩＤが００３である翻訳パターンは、変数“を格”で置き換えられた入力文の一部分と変数“動詞句”で置き換えられた入力文の一部分が存在すれば、変数“動詞句”で置き換え、目的言語では原言語パターンの出現順とは逆に、変数“動詞句”、変数“を格”の順に生成することを意味する。
【０００５】
図５の翻訳パターンを用いて、
“私は一所懸命英語を勉強する”（例１）
の日本語文を翻訳すると、原言語パターンと照合した結果、図６（ａ）の構文木が得られ、対応する目的言語パターンに従って変換した結果、図６（ｂ）の生成結果が得られる。そして、最終的に図６（ｂ）の末端の単語列を図の左から出現する順に並べることで、
“ＩｓｔｕｄｙＥｎｇｌｉｓｈｈａｒｄ”（例２）
の翻訳文が得られる。
【０００６】
なお、従来では、原言語パターンを一文字の変数で置き換えることによって、翻訳パターンの入れ子構造（木構造）が表現できる翻訳装置が知られている（例えば、特許文献１参照。）。さらに本発明者等は、翻訳テンプレートを使用して翻訳を行う翻訳方法も提案している（例えば、特許文献２参照。）。
【０００７】
【特許文献１】
特許第３１８９１８６号公報
【０００８】
【特許文献２】
特願２００２−１３８９３０
【０００９】
【発明が解決しようとする課題】
ところが、従来のパターン翻訳方式では、目的言語パターンの構成要素は単語、あるいは変数の列で表現されている。そのため、日本語や朝鮮語など語順が自由である言語を原言語とし、かつ英語など語順が固定的な言語を目的言語とした場合には、テキスト文の語順によって翻訳文の語順が変化するという問題が生じる。
【００１０】
例えば、前述した図５の翻訳パターンを用いて、上記例１とは語順を変化させた、
“私は英語を一所懸命勉強する”（例３）
の日本語文を翻訳すると、原言語パターンと照合した結果、図７（ａ）の構文木が得られ、対応する目的言語パターンに従って変換した結果、図７（ｂ）の生成結果が得られる。そして、最終的には、
“ＩｓｔｕｄｙｈａｒｄＥｎｇｌｉｓｈ”（例４）
の翻訳文が得られる。この例４の翻訳文は、翻訳パターンの適用順に応じて英語の語順も変化しており、正確な英語文とは云えない。
【００１１】
上記翻訳方式で正確な翻訳文を得るには、例１と例３それぞれの語順に応じた翻訳パターンを用意する必要があるだけでなく、
“一所懸命私は英語を勉強する”や、
“英語を私は一所懸命勉強する”
などの、考えられるすべての語順に対応した翻訳パターンを用意しなければならない。このため、膨大な量の翻訳パターンが必要となってしまう。
【００１２】
この発明は上記事情に着目してなされたもので、その目的とするところは、原言語に語順が自由な言語が含まれ、対応する目的言語の語順が固定される場合でも、複数の翻訳パターンを用意することなく正確な翻訳を行えるようにした翻訳装置、翻訳方法及び翻訳プログラムを提供することにある。
【００１３】
【課題を解決するための手段】
上記目的を達成するためにこの発明は、原言語の構文木を作成するための原言語パターンと、この原言語パターンに対応した目的言語の構文木を作成するための目的言語パターンと、目的言語評価部とから構成され、かつ上記目的言語パターンには上記目的言語の構文木を表す構成要素のうち位置が変更可能な構成要素を示す識別子が含まれ、上記目的言語評価部には上記識別子が表す目的言語の構成要素を特定するための情報が含まれる翻訳パターンを有し、原言語で記載されたテキスト文を単語分割して、この分割された単語をもとに上記原言語パターンに従って原言語の構文木を作成する。そして、上記作成された原言語の構文木をもとに上記目的言語パターンに従って目的言語の構文木を作成し、この作成された目的言語の構文木の構成要素のうち上記目的言語評価部の識別子が表す構成要素を当該識別子と対応付けて記憶しておく。そして、上記目的言語パターンに上記識別子が存在した場合には、当該識別子に対応付けて記憶された構成要素を置き換えて翻訳文を生成するようにしたものである。
【００１４】
したがってこの発明によれば、翻訳文を生成する際に、目的言語評価部に目的言語パターンの構成要素を生成する位置として識別子が記述されていれば、当該識別子により指定された位置に上記構成要素を置換して得られる部分木が埋め込まれる。このため、原言語に語順が自由な言語が含まれ、対応する目的言語の語順が固定される場合でも、複数の翻訳パターンを用意することなく正確な翻訳文を生成することが可能となる。
【００１５】
またこの発明は、ユーザの入力操作に応じて、上記目的言語パターンに識別子を含める処理を行う手段をさらに備えることも特徴とする。このような構成を備えることで、翻訳文の語順をユーザが任意に変更することが可能となり、これにより翻訳パターンをカスタマイズすることができる。
【００１６】
さらにこの発明は、ユーザの入力操作に応じて、上記目的言語評価部に含まれる上記識別子が表す目的言語の構成要素を特定するための情報を含める処理を行う手段をさらに備えることも特徴とする。このような構成を備えることで、語順が変更する構成要素をユーザが任意に指定することが可能となり、これにより上記同様翻訳パターンをカスタマイズすることができる。
【００１７】
【発明の実施の形態】
以下、図面を参照してこの発明の実施形態を説明する。
図１は、この発明に係わる翻訳装置の一実施形態を示す機能ブロック図である。
【００１８】
この翻訳装置は、入力部１と、プロセッサ２と、記憶部６と、出力部１１とを備えている。入力部１は、例えばキーボードにより構成され、ユーザがテキスト文をはじめ、翻訳パターンを構成する要素、例えばマーカやこのマーカが指定する値等を入力するために使用する。
【００１９】
記憶部６は、言語別単語辞書記憶エリア７と、対訳辞書記憶エリア８と、翻訳パターン記憶エリア９と、マーカテーブル記憶エリア１０とを有する。言語別単語辞書記憶エリア７は、翻訳に使用する複数の言語について、単語と当該単語の品詞等の情報を記憶する。対訳辞書記憶エリア８は、例えば日英対訳の辞書データを記憶する。翻訳パターン記憶エリア９は、原言語を目的言語に翻訳する際に照合する翻訳パターン情報を記憶しており、この翻訳パターンの詳しい構成は後述する。マーカテーブル記憶エリア１０は、後述する訳文生成処理部５の指示により、上記翻訳パターンに記述されたインデックスとマーカ及びマーカが指定する値とを対応付けて記憶するものである。
【００２０】
プロセッサ２は、ＣＰＵやＲＯＭ、ＲＡＭ等のコンピュータとしての一般的な構成を備えており、上記ＲＯＭに記憶された翻訳プログラムにより指定される処理手順に従って翻訳処理を実行する。この翻訳処理のためにプロセッサ２は、単語分割処理部３と、構文解析処理部４と、訳文生成処理部５とを備えている。
【００２１】
単語分割処理部３は、原言語で記載されたテキスト文を、上記言語別単語辞書記憶エリア７に格納された情報に基づいて単語分割を行う。構文解析処理部４は、翻訳パターン記憶エリア９に記憶された翻訳パターンと、上記単語分割処理部３により分割されたテキスト文とを用いて構文の解析を行う。訳文生成処理部５は、上記構文解析処理部４により解析された構文を、上記翻訳パターン記憶エリア９に記憶された翻訳パターンを参照して目的言語に変換する。
【００２２】
出力部１１は、例えばディスプレイやプリンタにより構成され、上記プロセッサ２の制御の下に、上記訳文生成処理部５により生成された翻訳文等を表示又はプリントアウトする。
【００２３】
ところで、上記翻訳パターン記憶エリア９に記憶された翻訳パターンは次のように構成される。図２は、この翻訳パターンの構成の一例を示す図である。
すなわち、翻訳パターンは、識別番号（ＩＤ）と、原言語パターンと、目的言語パターンと、目的言語評価部とから構成される。ＩＤは一意となる任意の数字列からなり、各翻訳パターンを特定するために使用される。
【００２４】
原言語パターンは、原言語のテキスト文を構文解析するために使用されるもので、この原言語パターンには単語、品詞及び格等が記述される。例えば、図中の矢印“⇒”の左側にはこの原言語パターンを置き換える変数が記述され、矢印の右側にはテキスト文と照合する単語、あるいは変数の列が記述される。ここで、鈎括弧“「”“」”で括られている文字列（“私”、“は”など）が単語を示し、括られていない文字列（“文”、“動詞句”など）が変数を示す。また、コロン“：”の後の数字は、原言語パターンに記述された単語や変数を目的言語パターンから参照するためのインデックスを示す。
【００２５】
目的言語パターンは、目的言語の単語、インデックスあるいはマーカの列で表される。このうち単語は、原言語パターンと同様に鈎括弧で括られており、インデックスは原言語パターンに記述された数字と対応している。またマーカは“＄”で始まる文字列で表される。このマーカは、目的言語の翻訳文を生成する際の構文を構成する構成要素のうち、位置が変更可能な構成要素に付与される識別子である。
【００２６】
目的言語評価部は、マーカとして指定された構成要素の指定位置を表すもので、等号“＝”の左辺に生成位置がインデックスとマーカとで指定され、右辺にその位置に生成すべき構成要素がインデックスで示される。
【００２７】
例えば、ＩＤが「１０３」である翻訳パターンの目的言語パターンでは、インデックスが「２」で参照される変数“動詞句”を生成することを意味し、目的言語評価部ではインデックスが「２」で参照される変数“動詞句”から生成された構成要素におけるマーカ“＄ｏｂｊ”の位置に、インデックスが１で参照される変数“を格”から生成される構成要素を埋め込むことを意味する。
【００２８】
また、ＩＤが「１０５」である翻訳パターンの目的言語パターンでは、インデックスが「１」で参照される変数“動詞”を生成し、その前後のマーカ“＄ｓｂｊ”、“＄ｏｂｊ”、“＄ａｄｖ”で示される位置に他の翻訳パターンで指定される構成要素が埋め込まれることを意味する。
【００２９】
次に、以上のように構成された翻訳装置による翻訳手順とその処理内容を説明する。なお、ここでは日本語文の
“私は英語を一所懸命勉強する”
を英文に翻訳する場合を例にとって説明する。
【００３０】
ユーザが入力部１から翻訳対象の原言語のテキスト文“私は英語を一所懸命勉強する”を入力すると、プロセッサ２は当該テキスト文を単語分割処理部３に取り込む。単語分割処理部３は、この取り込んだテキスト文を上記言語別単語辞書を参照して単語単位に分割する。
【００３１】
例えば、“私／は／英語／を／一所懸命／勉強する”のように分割する。なお、／は単語の区切りを示す。この単語分割の手段としては、例えば形態素解析処理が使われるが、正確に単語認識が可能であれば特に形態素解析処理に限定されるものではなく、他の処理手段を用いてもよい。上記単語分割されたテキスト文は、構文解析処理部４に転送される。
【００３２】
構文解析処理部４は、上記単語分割されたテキスト文を、翻訳パターン記憶エリア９に記憶された翻訳パターンの原言語パターンと照合し、テキスト文の構文木を作成する。図３（ａ）に、作成された構文木を示す。この構文木の各ノードには、上記原言語パターンが一つずつ対応している。ところで、図２に示す原言語パターンは、文脈自由文法の形式で記述されているため、既存の構文解析アルゴリズムを適用することで容易に構文木が作成可能である。文脈自由文法を解析する構文解析アルゴリズムとしては、例えば一般的な文脈自由文法規則が扱え、かつ解析過程の制御の自由度が大きいチャート法が用いられる。上記構文解析処理部４により作成された構文木は、訳文生成処理部５に送られる。
【００３３】
訳文生成処理部５は、上記構文解析処理部４から送られた構文木を、上記翻訳パターンの目的言語パターンを参照して目的言語に変換する。図３（ｂ）は、図３（ａ）に示す原言語の構文木から作成された目的言語の構文木を示す。この訳文生成処理部５では、関数ｇｅｎｅｒａｔｅ（）を定義しており、この関数ｇｅｎｅｒａｔｅ（）を実行することで原言語の構文木から目的言語の構文木への変換処理が行われる。図４は、この関数ｇｅｎｅｒａｔｅ（）を用いた変換処理の手順と内容を示すフローチャートである。以下に、この図４を用いて上記訳文生成処理部５による変換処理動作を説明する。
【００３４】
まず訳文生成処理部５は、原言語の根ノードであるＳＮｏｄｅ［文，ＩＤ＝１０１］を呼び出す。ここで、ＳＮｏｄｅ［文，ＩＤ＝１０１］は、原言語の構文木のノードを示す構造体であり、変数が“文”、ＩＤが「１０１」であることを示し、さらに構文木の子ノードへのポインタも有している。この呼び出したＳＮｏｄｅ［文，ＩＤ＝１０１］から、原言語ノードの変数“文”と、翻訳パターンのＩＤ＝「１０１」を取得する（ステップＳ１０）。これら取得した情報をもとに、目的言語ノードＴＮｏｄｅ［文，ＩＤ＝１０１］を生成する。ここで、ＴＮｏｄｅ［文，ＩＤ＝１０１］は、目的言語の構文木のノードを示す構造体であり、変数が“文”、ＩＤが「１０１」であることを示し、さらに構文木の子ノードへのポインタも有している。
【００３５】
続いて訳文生成処理部５は、ステップＳ１２に移行してここでマーカテーブルが存在するか否かを判定する。ここではまだマーカテーブルが設定されていないのでステップＳ１４に移行する。ステップＳ１４では、目的言語評価部をすべて処理したか否かを判定する。そして、ＩＤが「１０１」の翻訳パターンには目的言語評価部が無いので、ステップＳ１７に移行する。ステップＳ１７では、目的言語パターンのすべての構成要素について訳文生成処理を行ったか否かを判定する。そして、ここではまだ処理を終了していないので、構成要素の種類ごとの処理を次のように実行する。
【００３６】
すなわち、訳文生成処理部５はステップＳ１８、ステップＳ２０及びステップＳ２２により、構成要素が「単語」であるのか、「インデックス」であるのか、さらに「マーカ」であるのかを判定する。いま、ＩＤ「１０１」の目的言語パターンにはインデックス“１”が記述されている。このため、ステップＳ２０からステップＳ２１に移行し、ここでインデックス参照先である原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０２］を呼び出す。そして、このＳＮｏｄｅ［動詞句，ＩＤ＝１０２］と、再帰的に呼び出した関数ｇｅｎｅｒａｔｅ（）とから目的言語ノードＴＮｏｄｅ［動詞句，ＩＤ＝１０２］を作成する。そして、この作成したＴＮｏｄｅ［動詞句，ＩＤ＝１０２］をＴＮｏｄｅ［文，ＩＤ＝１０１］の子ノードとして設定する。ここで、訳文生成処理部５は、すべてのノードの処理ごとに関数ｇｅｎｅｒａｔｅ（）を再帰的に呼び出す。そして、目的言語パターンのすべての構成要素を処理すると、当該ＩＤ「１０１」の翻訳パターンにおける関数ｇｅｎｅｒａｔｅ（）の処理を終了する（ステップＳ２４）。
【００３７】
次に訳文生成処理部５は、ＩＤが「１０２」の翻訳パターンについての関数ｇｅｎｅｒａｔｅ（）の処理を行う。なお、上記ＩＤ「１０１」の場合と重複するステップについては説明を省略する。
【００３８】
まず訳文生成処理部５は、呼び出した原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０２］の変数“動詞句”とＩＤ＝１０２をステップＳ１０で取得し、ステップＳ１１で目的言語ノードＴＮｏｄｅ［動詞句，ＩＤ＝１０２］を作成する。そして、マーカテーブルが存在しなければステップＳ１２からステップＳ１４に移行し、ここで目的言語評価部をすべて処理したか否かを判定する。いまＩＤが「１０２」の翻訳パターンには目的言語評価部に位置指定がある。このため訳文生成処理部５は、ステップＳ１５に移行して右辺値のインデックス“１”で参照される原言語ノードＳＮｏｄｅ［主格，ＩＤ＝１０６］を呼び出す。そして、このＳＮｏｄｅ［主格，ＩＤ＝１０６］と再帰的に呼び出した関数ｇｅｎｅｒａｔｅ（）とから作成される目的言語ノードＴＮｏｄｅ［主格，ＩＤ＝１０６］を、位置指定の左辺値のインデックス“２”とマーカ“＄ｓｂｊ”とに対応付けてマーカテーブル記憶エリア１０に記憶する（ステップＳ１６）。この結果、マーカテーブルは表１に示すエントリを持つことになる。
【００３９】
【表１】

【００４０】
そうして目的言語評価部をすべて処理すると、訳文生成処理部５はステップＳ１７に移行し、ここで目的言語パターンのすべての構成要素を処理したか否かを判定する。そして、未処理であればステップＳ１８、ステップＳ２０及びステップＳ２２により構成要素の種類を判定する。いま、ＩＤ「１０２」の目的言語パターンにはインデックス“２”が記述されている。このため、訳文生成処理部５はインデックス“２”で参照される原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０３］をステップＳ２１で呼び出す。そして、マーカテーブルにエントリが存在すれば、以下の条件のいずれかを満たすエントリを呼び出す。
目的言語パターンに記述されているインデックスと同じインデックスを持つエントリ（条件１）
インデックスがＮＵＬＬであり、かつ、現目的言語パターンに存在しないマーカを持つエントリ（条件２）
いま、表１を参照すると、１行目のエントリのインデックスが“２”であり、条件１を満たす。このため、訳文生成処理部５は当該エントリを呼び出す。そして、上記原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０３］からＴＮｏｄｅ［動詞句，ＩＤ＝１０３］を作成し、上記呼び出したエントリをＴＮｏｄｅ［動詞句，ＩＤ＝１０３］に属性情報として設定する。そして、このＴＮｏｄｅ［動詞句，ＩＤ＝１０３］をＴＮｏｄｅ［動詞句，ＩＤ＝１０２］の子ノードに設定する。そして、目的言語パターンのすべての構成要素を処理すると、当該翻訳パターンにおける関数ｇｅｎｅｒａｔｅ（）の処理を終了する。
【００４１】
次に訳文生成処理部５は、原言語ノードのＳＮｏｄｅ［動詞句，ＩＤ＝１０３］を呼び出す。このとき、属性情報として、前述のようにマーカテーブルのエントリが設定されている。このため訳文生成処理部５は、ステップＳ１３に移行し、ここで上記エントリを取得してインデックス“２”を“ＮＵＬＬ”に変更し、マーカテーブル記憶エリア１０に再度記憶する。この結果、マーカテーブルは表２に示すエントリを持つことになる。
【００４２】
【表２】

【００４３】
さらに訳文生成処理部５は、前述の説明と同様に目的言語評価部についてステップＳ１５及びステップＳ１６の処理を行う。この結果、マーカテーブル記憶エリア１０は表３に示すエントリを持つことになる。
【００４４】
【表３】

【００４５】
いま、ＩＤが「１０３」の翻訳パターンの目的言語パターンには、インデックス“２”が記述されている。このため訳文生成処理部５は、表３のマーカテーブルを参照し、前述の条件１、条件２を満たすエントリが存在するかを判定する。
表３の１行目のエントリは、インデックスが“ＮＵＬＬ”であり、かつ目的言語パターンにはマーカ“＄ｓｂｊ”は存在しない。このため、前述の条件２を満たす。表３の２行目のエントリは、インデックスが目的言語パターンに記述されているインデックスと同じである。このため、前述の条件１を満たす。
【００４６】
したがって訳文生成処理部５は、表３の２つのエントリと、目的言語パターンのインデックス“２”で参照される原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０４］とを呼び出す。そして、この原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０４］と再帰的に呼び出した関数ｇｅｎｅｒａｔｅ（）とからＴＮｏｄｅ［動詞句，ＩＤ＝１０４］を作成し、上記呼び出した２つのエントリをＴＮｏｄｅ［動詞句，ＩＤ＝１０４］に属性情報として設定する。そして、このＴＮｏｄｅ［動詞句，ＩＤ＝１０４］をＴＮｏｄｅ［動詞句，ＩＤ＝１０３］の子ノードに設定する（ステップＳ２１）。
【００４７】
この処理動作から明らかなように、現在のノードにおける目的言語パターンが指定されたマーカを持たない場合には、現在のノードの子ノードを関数ｇｅｎｅｒａｔｅ（）が処理する際に、マーカテーブルのエントリをもとに子ノードをたどることにより、指定されたマーカが記述されている目的言語パターンを探すことができる。辿るべき子ノードが複数ある場合には、例えば最左、深さ優先で子ノードを辿るといった優先順位を予め定めておけばよい。
【００４８】
次に訳文生成処理部５は、ＩＤが「１０５」の翻訳パターンについての関数ｇｅｎｅｒａｔｅ（）の処理を行う。原言語ノードＳＮｏｄｅ［動詞句，ＩＤ＝１０５］を呼び出したとき、属性情報としてマーカテーブルのエントリが３つ与えられており、表４に示すエントリがマーカテーブル記憶エリア１０に記憶されている（ステップＳ１２）。
【００４９】
【表４】

【００５０】
いま、ＩＤが「１０５」である翻訳パターンの目的言語パターンには、マーカ“＄ｓｂｊ”が記述されている。このため訳文生成処理部５は、ステップＳ２２からステップＳ２３に移行し、ここでマーカテーブルのエントリのうちインデックスが“ＮＵＬＬ”であり、かつ同じマーカを持つ目的言語ノードを選択する。例えば、表４に示すマーカテーブルでは１行目のエントリが選択され、目的言語ノードＴＮｏｄｅ［主格，ＩＤ＝１０６］を最初の子ノードとして設定する。
【００５１】
次に、目的言語パターンにはインデックス“１”が記述されているため、インデックス参照先である原言語ノードＳＮｏｄｅ［動詞，ＩＤ＝１１１］を呼び出し、このＳＮｏｄｅ［動詞，ＩＤ＝１１１］と再帰的に呼び出した関数ｇｅｎｅｒａｔｅ（）とから目的言語ノードＴＮｏｄｅ［動詞，ＩＤ＝１１１］を作成する。そして、この作成したＴＮｏｄｅ［動詞，ＩＤ＝１１１］を２番目の子ノードとして設定する（ステップＳ２１）。このとき、表４に示すマーカテーブルのエントリは、いずれも前述の条件１、条件２を満たしていないので、属性情報は設定されない。
【００５２】
次の目的言語パターンの構成要素は、マーカ“＄ｏｂｊ”であるため、訳文生成処理部５はステップＳ２２からステップＳ２３に移行し、ここで表４のマーカテーブルを参照して目的言語ノードＴＮｏｄｅ［を格，ＩＤ＝１０７］を取得する。そして、このＴＮｏｄｅ［を格，ＩＤ＝１０７］を３番目の子ノードとして設定する。
同様に、最後の構成要素はマーカ“＄ａｄｖ”であるため、訳文生成処理部５は表４のマーカテーブルを参照して目的言語ノードＴＮｏｄｅ［副詞句，ＩＤ＝１１０］を４番目の子ノードとして設定する。
【００５３】
そうして、目的言語パターンのすべての構成要素についての処理を終了すると、訳文生成処理部５はステップＳ１７からステップＳ２４に移行して、当該翻訳パターンにおける関数ｇｅｎｅｒａｔｅ（）の処理を終了する。
【００５４】
最後に、原言語ノードＳＮｏｄｅ［名詞句，ＩＤ＝１０８］を呼び出したときの関数ｇｅｎｅｒａｔｅ（）の動作を説明する。訳文生成処理部５は、ＩＤが「１０８」である翻訳パターンの目的言語パターンには単語“Ｉ”が記述されているので、ステップＳ１８からステップＳ１９に移行し、ここで再帰的に呼び出した関数ｇｅｎｅｒａｔｅ（）により目的言語ノードＴＮｏｄｅ［「Ｉ」，ＩＤ＝＊＊＊］を作成し、ＴＮｏｄｅ［名詞句，ＩＤ＝１０８］の子ノードとして設定する。そして、当該翻訳パターンにおける関数ｇｅｎｅｒａｔｅ（）の処理を終了する（Ｓ２４）。ここで、ＩＤ＝「＊＊＊」は単語を表すノードであることを示している。
【００５５】
以上のように訳文生成処理部５は、すべてのノードの処理ごとに関数ｇｅｎｅｒａｔｅ（）を再帰的に呼び出す。このため、図３（ａ）に示す原言語の構文木の根ノードを呼び出して翻訳処理を実行すると、図３（ｂ）に示す目的言語の構文木が生成される。そして、訳文生成処理部５は、この生成された目的言語の構文木（図３（ｂ））を最左、深さ優先で辿ることにより得られる末端の単語を出現順に並べ、これにより翻訳文の表示データを生成する。そして、この生成された翻訳文の表示データを出力部１１へ出力し表示させる。この結果出力部１１では、翻訳文“ＩｓｔｕｄｙＥｎｇｌｉｓｈｈａｒｄ”が表示される。
【００５６】
以上詳述したようにこの実施形態では、原言語で記載されたテキスト文を単語分割し、この単語分割されたテキスト文から原言語パターンに従って構文木を作成する。そして、原言語の構文木から目的言語パターンに従って目的言語の構文木を作成する際に、目的言語評価部によりマーカで指定された構成要素を記憶しておき、目的言語パターンに記述されたマーカの位置に上記記憶された構成要素を配置して翻訳文を生成するようにしたものである。
【００５７】
したがってこの実施形態によれば、マーカを設定した目的言語の構成要素を配置する位置を変更することが可能となる。よって、原言語の語順が自由であり、目的言語の語順が固定的な言語であっても、翻訳パターンの適用順序によらずに所定の語順で翻訳文を生成することができる。
【００５８】
また上記実施形態では、翻訳パターンの目的言語評価部に変更可能な構成要素を予め特定して翻訳処理を実行するようにしていたが、上記変更可能な構成要素をユーザが設定できるように構成してもよい。すなわち、変更可能な構成要素の入力をユーザに促す設定画面を作成して出力部１１に表示する。そして、ユーザが入力部１により入力した情報を上記目的言語評価部に反映させることで実現可能である。
【００５９】
同様に上記実施形態では、翻訳パターンの目的言語パターンに記述されたマーカの配置する位置を予め特定して（図２におけるＩＤ＝１０５の目的言語パターン）翻訳処理を実行するようにしていたが、上記マーカの位置をユーザが設定できるように構成してもよい。すなわち、マーカにより指定された構成要素の変更を促す設定画面を作成して出力部１１に表示する。そして、ユーザが入力部１により入力した情報を上記目的言語パターンに反映させることで実現可能である。
このような構成にすることで、上記一定の語順で翻訳文を作成できることに加えて、翻訳方式をカスタマイズすることが可能となる。
【００６０】
また上記実施形態では、ＲＯＭに記憶された翻訳プログラムをＣＰＵが実行することで翻訳処理を行うように説明したが、上記翻訳プログラムをＣＤ−ＲＯＭ等の外部記憶媒体から読み込んだり、ネットワーク上のサイトから通信回線を介してダウンロードしてインストールするようにしてもよい。
【００６１】
その他、翻訳装置の種類とその構成、辞書の種類とその構成、単語分割処理や構文解析処理の方法、訳文生成処理の制御手順とその内容等についても、この発明の要旨を逸脱しない範囲で種々変形して実施できる。
【００６２】
【発明の効果】
以上詳述したようにこの発明では、原言語の構文木を作成するための原言語パターンと、この原言語パターンに対応した目的言語の構文木を作成するための目的言語パターンと、目的言語評価部とから構成され、かつ上記目的言語パターンには上記目的言語の構文木を表す構成要素のうち位置が変更可能な構成要素を示す識別子が含まれ、上記目的言語評価部には上記識別子が表す目的言語の構成要素を特定するための情報が含まれる翻訳パターンを有し、原言語で記載されたテキスト文を単語分割して、この分割された単語をもとに上記原言語パターンに従って原言語の構文木を作成する。そして、上記作成された原言語の構文木をもとに上記目的言語パターンに従って目的言語の構文木を作成し、この作成された目的言語の構文木の構成要素のうち上記目的言語評価部の識別子が表す構成要素を当該識別子と対応付けて記憶しておく。そして、上記目的言語パターンに上記識別子が存在した場合には、当該識別子に対応付けて記憶された構成要素を置き換えて翻訳文を生成するようにしている。
【００６３】
したがってこの発明によれば、原言語に語順が自由な言語が含まれ、対応する目的言語の語順が固定される場合でも、複数の翻訳パターンを用意することなく正確な翻訳を行えるようにした翻訳装置、翻訳方法及び翻訳プログラムを提供することができる。
【図面の簡単な説明】
【図１】この発明に係わる実施形態における翻訳装置の構成を示すブロック図。
【図２】図１に示した翻訳装置における翻訳パターンの一例。
【図３】図１に示した翻訳装置により作成される構文木を示す図。
【図４】訳文生成処理部５で実行される関数ｇｅｎｅｒａｔｅ（）の動作を示すフローチャート。
【図５】日本語から英語に翻訳する際の従来技術による翻訳パターンの一例。
【図６】従来技術による翻訳例。
【図７】従来技術による翻訳例。
【符号の説明】
１…入力部
２…プロセッサ
３…単語分割処理部
４…構文解析処理部
５…訳文生成処理部
６…記憶部
７…言語別単語辞書記憶エリア
８…対訳辞書記憶エリア
９…翻訳パターン記憶エリア
１０…マーカテーブル記憶エリア
１１…出力部

[0001]
[Technical field of the invention]
The present invention relates to a machine translation system that translates a text sentence described in a source language into a target language, and more particularly to a translation device, a translation method, and a translation program that translate using a translation pattern.
[0002]
[Prior art]
In machine translation systems, the pattern translation method has long been used both as a way to accurately translate the characteristic, field-specific sentences that machine translation handles poorly and as a way to let users customize the machine translation system. In this method, translation patterns are prepared in advance, each representing the correspondence between a sentence pattern in the source language and its translation, a sentence pattern in the target language. A text sentence written in the source language is matched against these translation patterns and translated into the target language.
[0003]
FIG. 5 shows an example of prior-art translation patterns for translating from Japanese to English. A translation pattern consists of an ID, a source language pattern, and a target language pattern. In the source language pattern, the variable that replaces the pattern is written on the left side of the arrow “⇒”, and the sequence of words or variables to be matched against the input sentence is written on the right side. Character strings enclosed in corner brackets “「” and “」” (“私” (I), “は” (topic particle), etc.) denote words, and unenclosed character strings (“文” (sentence), “動詞句” (verb phrase), etc.) denote variables. The number after a colon “:” is an index used to refer to a word or variable of the source language pattern from the target language pattern. The target language pattern is expressed as a sequence of words or indexes. Words are enclosed in corner brackets as in the source language pattern, and each index corresponds to a number written in the source language pattern.
[0004]
For example, the translation pattern with ID 008 means that if the word “私” (I) appears in the input sentence, it is replaced with the variable “noun phrase” (名詞句), and “I” is generated in the target language. The translation pattern with ID 003 means that if the input sentence contains a part that has been replaced with the variable “accusative case” (を格) and a part that has been replaced with the variable “verb phrase” (動詞句), the two are replaced with the variable “verb phrase”, and in the target language they are generated in the reverse of their order in the source language pattern, that is, the variable “verb phrase” first and the variable “accusative case” second.
[0005]
When the translation patterns of FIG. 5 are used to translate the Japanese sentence
“私は一所懸命英語を勉強する” (Example 1; word for word: I / hard / English / study),
matching against the source language patterns yields the syntax tree of FIG. 6A, and conversion according to the corresponding target language patterns yields the generation result of FIG. 6B. Finally, arranging the words at the leaves of FIG. 6B in the order in which they appear from the left of the figure gives the translation
“I study English hard” (Example 2).
[0006]
A translation device is conventionally known that can express a nested (tree) structure of translation patterns by replacing a source language pattern with a single-character variable (see, for example, Patent Document 1). The present inventors have also proposed a translation method that performs translation using translation templates (see, for example, Patent Document 2).
[0007]
[Patent Document 1]
Japanese Patent No. 3189186
[0008]
[Patent Document 2]
Japanese Patent Application No. 2002-138930
[0009]
[Problems to be solved by the invention]
However, in the conventional pattern translation method, the components of a target language pattern are expressed as a sequence of words or variables. Consequently, when the source language is a language with free word order, such as Japanese or Korean, and the target language is a language with fixed word order, such as English, the word order of the translation changes with the word order of the input text sentence.
[0010]
For example, when the translation patterns of FIG. 5 described above are used to translate the Japanese sentence
“私は英語を一所懸命勉強する” (Example 3; word for word: I / English / hard / study),
whose word order differs from that of Example 1, matching against the source language patterns yields the syntax tree of FIG. 7A, and conversion according to the corresponding target language patterns yields the generation result of FIG. 7B. The final translation is
“I study hard English” (Example 4).
In Example 4 the English word order has changed with the order in which the translation patterns were applied, and the result cannot be called a correct English sentence.
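The dependence on word order can be reproduced with a minimal Python sketch. The figures are not reproduced in this text, so the tree shapes and the helper names (`word`, `subject`, `obj`, `adv`) are assumptions chosen only so that the output matches Examples 2 and 4; they are not the actual patterns of FIG. 5. Each node carries a target pattern (words or integer indexes) and its indexed children, and every non-leaf pattern fixes only the local order of its own two children.

```python
def generate(node):
    """Return the words produced by a node, in target-pattern order."""
    target, children = node
    words = []
    for item in target:
        if isinstance(item, int):      # index: expand the referenced child
            words.extend(generate(children[item]))
        else:                          # literal target-language word
            words.append(item)
    return words

# Hypothetical node constructors (illustration only).
def word(text):
    return ([text], {})

def subject(np, vp):                   # subject first, then verb phrase
    return ([1, 2], {1: np, 2: vp})

def obj(np, vp):                       # verb phrase first, then object (as in ID 003)
    return ([2, 1], {1: np, 2: vp})

def adv(ad, vp):                       # verb phrase first, then adverb
    return ([2, 1], {1: ad, 2: vp})

I, ENGLISH, HARD, STUDY = word("I"), word("English"), word("hard"), word("study")

# Example 1: 私は / 一所懸命 / 英語を / 勉強する
example1 = subject(I, adv(HARD, obj(ENGLISH, STUDY)))
# Example 3: 私は / 英語を / 一所懸命 / 勉強する
example3 = subject(I, obj(ENGLISH, adv(HARD, STUDY)))

print(" ".join(generate(example1)))    # I study English hard
print(" ".join(generate(example3)))    # I study hard English
```

Because each pattern only orders its own children, the nesting order of the object and adverb patterns in the source tree leaks into the English output.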
[0011]
To obtain a correct translation with the above translation method, it is not enough to prepare translation patterns for the word orders of Example 1 and Example 3. Translation patterns must be prepared for every conceivable word order, such as
“一所懸命私は英語を勉強する” (word for word: hard / I / English / study) and
“英語を私は一所懸命勉強する” (word for word: English / I / hard / study).
An enormous number of translation patterns is therefore required.
[0012]
The present invention has been made in view of the above circumstances. Its object is to provide a translation device, a translation method, and a translation program that can translate accurately without preparing a plurality of translation patterns, even when the source language has free word order and the corresponding target language has fixed word order.
[0013]
[Means for Solving the Problems]
To achieve the above object, the present invention has translation patterns, each composed of a source language pattern for creating a syntax tree of the source language, a target language pattern for creating the corresponding syntax tree of the target language, and a target language evaluation unit. The target language pattern includes an identifier indicating a component whose position can be changed, among the components of the target language syntax tree, and the target language evaluation unit includes information for specifying the target language component that the identifier represents. A text sentence written in the source language is divided into words, and a syntax tree of the source language is created from the divided words according to the source language patterns. A syntax tree of the target language is then created from the source language syntax tree according to the target language patterns, and, among the components of the created target language syntax tree, each component represented by an identifier in the target language evaluation unit is stored in association with that identifier. When the identifier appears in a target language pattern, it is replaced with the component stored in association with it, and the translation is generated.
[0014]
Therefore, according to the present invention, when a translation is generated, if the target language evaluation unit specifies an identifier as the position at which a component of the target language pattern is to be generated, the subtree obtained for that component is embedded at the position designated by the identifier. As a result, even when the source language has free word order and the corresponding target language has fixed word order, an accurate translation can be generated without preparing a plurality of translation patterns.
[0015]
The present invention is also characterized by further comprising means for including an identifier in the target language pattern in response to a user's input operation. With this configuration, the user can change the word order of the translation as desired and can thereby customize the translation patterns.
[0016]
The present invention is further characterized by comprising means for including, in the target language evaluation unit, the information for specifying the target language component represented by the identifier, in response to a user's input operation. With this configuration, the user can freely designate the components whose word order is to change, and can thereby customize the translation patterns in the same way as above.
[0017]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a functional block diagram showing an embodiment of the translation apparatus according to the present invention.
[0018]
This translation device includes an input unit 1, a processor 2, a storage unit 6, and an output unit 11. The input unit 1 is, for example, a keyboard, and is used by the user to input text sentences as well as the elements that make up a translation pattern, such as markers and the values that the markers designate.
[0019]
The storage unit 6 has a language-specific word dictionary storage area 7, a bilingual dictionary storage area 8, a translation pattern storage area 9, and a marker table storage area 10. The language-specific word dictionary storage area 7 stores words and information about them, such as their parts of speech, for the languages used in translation. The bilingual dictionary storage area 8 stores, for example, Japanese-English bilingual dictionary data. The translation pattern storage area 9 stores the translation pattern information that is matched when the source language is translated into the target language; the detailed structure of a translation pattern is described later. The marker table storage area 10 stores, in accordance with instructions from the translated sentence generation processing unit 5 described later, each index written in a translation pattern in association with a marker and the value that the marker designates.
[0020]
The processor 2 has the general configuration of a computer, including a CPU, ROM, and RAM, and executes translation processing according to the procedure specified by a translation program stored in the ROM. For this translation processing, the processor 2 includes a word division processing unit 3, a syntax analysis processing unit 4, and a translated sentence generation processing unit 5.
[0021]
The word division processing unit 3 divides a text sentence written in the source language into words, based on the information stored in the language-specific word dictionary storage area 7. The syntax analysis processing unit 4 analyzes the syntax using the translation patterns stored in the translation pattern storage area 9 and the text sentence divided by the word division processing unit 3. The translated sentence generation processing unit 5 converts the syntax analyzed by the syntax analysis processing unit 4 into the target language by referring to the translation patterns stored in the translation pattern storage area 9.
[0022]
The output unit 11 is, for example, a display or a printer, and under the control of the processor 2 it displays or prints out the translations and other output generated by the translated sentence generation processing unit 5.
[0023]
Incidentally, the translation patterns stored in the translation pattern storage area 9 are configured as follows. FIG. 2 is a diagram showing an example of the configuration of this translation pattern.
That is, a translation pattern consists of an identification number (ID), a source language pattern, a target language pattern, and a target language evaluation unit. The ID is an arbitrary, unique string of digits used to identify each translation pattern.
[0024]
The source language pattern is used to parse a text sentence in the source language, and words, parts of speech, cases, and the like are written in it. For example, the variable that replaces the source language pattern is written on the left side of the arrow “⇒” in the figure, and the sequence of words or variables to be matched against the text sentence is written on the right side. Character strings enclosed in corner brackets “「” and “」” (“私” (I), “は” (topic particle), etc.) denote words, and unenclosed character strings (“文” (sentence), “動詞句” (verb phrase), etc.) denote variables. The number after a colon “:” is an index used to refer to a word or variable of the source language pattern from the target language pattern.
[0025]
The target language pattern is expressed as a sequence of target language words, indexes, or markers. Words are enclosed in corner brackets as in the source language pattern, and each index corresponds to a number written in the source language pattern. A marker is represented by a character string starting with “$”. A marker is an identifier given to a component whose position can be changed, among the components that make up the syntax when the translation in the target language is generated.
[0026]
The target language evaluation unit specifies the position of a component designated by a marker. On the left side of the equal sign “=”, the generation position is specified by an index and a marker, and on the right side, the component to be generated at that position is indicated by an index.
[0027]
For example, the target language pattern of the translation pattern with ID “103” means that the variable “verb phrase” (動詞句) referred to by index “2” is generated. Its target language evaluation unit means that the component generated from the variable “accusative case” (を格) referred to by index “1” is embedded at the position of the marker “$obj” within the component generated from the variable “verb phrase” referred to by index “2”.
[0028]
Likewise, the target language pattern of the translation pattern with ID “105” means that the variable “verb” (動詞) referred to by index “1” is generated, and that components designated by other translation patterns are embedded at the positions indicated by the markers “$sbj”, “$obj”, and “$adv” before and after it.
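As an illustration of the four-part structure described above, a translation pattern could be held in memory roughly as follows. This is a minimal Python sketch, not the data format of the device: the class names, field names, and the `kind` helper are assumptions, and because FIG. 2 is not reproduced in this text, the two patterns are reconstructed from the prose description of IDs 103 and 105 (the source side of ID 105 in particular is an assumption).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evaluation:
    """One line of a target language evaluation unit: <index>.<marker> = <value>."""
    index: int    # left side: component in which the marker position is looked up
    marker: str   # left side: marker name, e.g. "$obj"
    value: int    # right side: index of the component to embed at that position

@dataclass(frozen=True)
class TranslationPattern:
    pattern_id: str
    lhs: str                  # variable written on the left of the arrow
    rhs: tuple                # (variable, index) pairs on the right of the arrow
    target: tuple             # int = index, "$..." = marker, other str = word
    evaluations: tuple = ()   # target language evaluation unit (may be empty)

def kind(item):
    """Classify one component of a target language pattern."""
    if isinstance(item, int):
        return "index"
    return "marker" if item.startswith("$") else "word"

# ID 103: generate the verb phrase (index 2) and embed the accusative
# component (index 1) at the "$obj" position inside it.
P103 = TranslationPattern(
    "103", "動詞句", (("を格", 1), ("動詞句", 2)),
    target=(2,),
    evaluations=(Evaluation(index=2, marker="$obj", value=1),),
)

# ID 105: the verb (index 1) surrounded by three marker positions.
P105 = TranslationPattern(
    "105", "動詞句", (("動詞", 1),),
    target=("$sbj", 1, "$obj", "$adv"),
)

print([kind(item) for item in P105.target])
```

Distinguishing markers from words by the leading “$” mirrors the notation in the text; a real implementation would probably use separate types so that a literal word beginning with “$” stays unambiguous.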
[0029]
Next, the translation procedure performed by the translation device configured as described above, and the details of its processing, will be described. The description takes as its example the translation into English of the Japanese sentence
“私は英語を一所懸命勉強する”
(word for word: I / English / hard / study).
[0030]
When the user inputs the source language text sentence to be translated, “私は英語を一所懸命勉強する”, from the input unit 1, the processor 2 passes the text sentence to the word division processing unit 3. The word division processing unit 3 divides the received text sentence into words by referring to the language-specific word dictionary.
[0031]
For example, the sentence is divided as “私／は／英語／を／一所懸命／勉強する”, where “／” indicates a word boundary. Morphological analysis, for example, is used for this word division, but the method is not limited to morphological analysis; any other processing means may be used as long as it recognizes words accurately. The word-divided text sentence is transferred to the syntax analysis processing unit 4.
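The division step can be pictured with a small Python sketch. It uses greedy longest-match lookup against a toy dictionary as a stand-in for morphological analysis, which the text permits as long as words are recognized correctly; the function name `segment` and the dictionary contents are assumptions made for illustration.

```python
def segment(text, dictionary):
    """Split text into dictionary words, always taking the longest match."""
    words = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):      # try the longest candidate first
            if text[i:j] in dictionary:
                words.append(text[i:j])
                i = j
                break
        else:                                  # no dictionary word starts here
            raise ValueError(f"no dictionary word at position {i}: {text[i:]!r}")
    return words

# Toy stand-in for the language-specific word dictionary (storage area 7).
DICTIONARY = {"私", "は", "英語", "を", "一所懸命", "勉強する"}

print("／".join(segment("私は英語を一所懸命勉強する", DICTIONARY)))
# 私／は／英語／を／一所懸命／勉強する
```

Greedy matching is enough for this sentence, but it can mis-segment ambiguous input, which is one reason a full morphological analyzer is normally preferred.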
[0032]
The syntax analysis processing unit 4 matches the word-divided text sentence against the source language patterns of the translation patterns stored in the translation pattern storage area 9 and creates a syntax tree of the text sentence. FIG. 3A shows the created syntax tree. Each node of this syntax tree corresponds to one source language pattern. Because the source language patterns shown in FIG. 2 are written in the form of a context-free grammar, the syntax tree can be created easily by applying an existing parsing algorithm. As the parsing algorithm for the context-free grammar, for example, the chart method is used, which can handle general context-free grammar rules and allows a high degree of freedom in controlling the analysis process. The syntax tree created by the syntax analysis processing unit 4 is sent to the translated sentence generation processing unit 5.
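To make the parsing step concrete, the following Python sketch fills a chart bottom-up for a small context-free grammar. It is a naive fixed-point chart fill written for readability, not an optimized chart parser and not the device's implementation. The rule set is an assumption: the IDs and variables 101 to 108, 110, and 111 follow the nodes named in the description, but their right-hand sides (and rule 109 entirely) are guesses, since FIG. 2 is not reproduced in this text.

```python
# (rule ID, left-hand variable, right-hand side); 「...」 marks a literal word.
RULES = [
    ("101", "文", ("動詞句",)),
    ("102", "動詞句", ("主格", "動詞句")),
    ("103", "動詞句", ("を格", "動詞句")),
    ("104", "動詞句", ("副詞句", "動詞句")),
    ("105", "動詞句", ("動詞",)),
    ("106", "主格", ("名詞句", "「は」")),
    ("107", "を格", ("名詞句", "「を」")),
    ("108", "名詞句", ("「私」",)),
    ("109", "名詞句", ("「英語」",)),
    ("110", "副詞句", ("「一所懸命」",)),
    ("111", "動詞", ("「勉強する」",)),
]

def parse(words, rules, start="文"):
    """Return a (variable, rule_id, children) tree covering all words, or None."""
    n = len(words)
    chart = {}                                  # (i, j, variable) -> (rule_id, children)

    def match(rhs, i, j):
        """Yield child lists that cover words[i:j] with the sequence rhs."""
        if not rhs:
            if i == j:
                yield []
            return
        head, rest = rhs[0], rhs[1:]
        if head.startswith("「"):               # literal word
            if i < j and words[i] == head[1:-1]:
                for tail in match(rest, i + 1, j):
                    yield [head] + tail
        else:                                   # variable already in the chart
            for k in range(i + 1, j + 1):
                if (i, k, head) in chart:
                    for tail in match(rest, k, j):
                        yield [(i, k, head)] + tail

    changed = True
    while changed:                              # repeat until no new edge is added
        changed = False
        for i in range(n):
            for j in range(i + 1, n + 1):
                for rule_id, lhs, rhs in rules:
                    if (i, j, lhs) in chart:
                        continue
                    children = next(match(rhs, i, j), None)
                    if children is not None:
                        chart[(i, j, lhs)] = (rule_id, children)
                        changed = True

    def build(key):
        rule_id, children = chart[key]
        return (key[2], rule_id,
                [build(c) if isinstance(c, tuple) else c for c in children])

    root = (0, n, start)
    return build(root) if root in chart else None

def rule_ids(tree):
    """List the rule IDs of a tree in leftmost, depth-first order."""
    _, rule_id, children = tree
    ids = [rule_id]
    for child in children:
        if isinstance(child, tuple):
            ids.extend(rule_ids(child))
    return ids

WORDS = ["私", "は", "英語", "を", "一所懸命", "勉強する"]
print(rule_ids(parse(WORDS, RULES)))
```

The sketch keeps only the first derivation found for each span and variable, which is sufficient here because the toy grammar is unambiguous for this sentence.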
[0033]
The translated sentence generation processing unit 5 converts the syntax tree sent from the syntax analysis processing unit 4 into the target language by referring to the target language patterns of the translation patterns. FIG. 3B shows the syntax tree of the target language created from the source language syntax tree shown in FIG. 3A. The translated sentence generation processing unit 5 defines a function generate(), and executing this function converts the source language syntax tree into the target language syntax tree. FIG. 4 is a flowchart showing the procedure and details of the conversion process using the function generate(). The conversion operation of the translated sentence generation processing unit 5 is described below with reference to FIG. 4.
[0034]
First, the translated sentence generation processing unit 5 calls SNode [sentence, ID = 101], the root node of the source language. SNode [sentence, ID = 101] is a structure representing a node of the source language syntax tree; it indicates that the variable is “sentence” and the ID is “101”, and it also holds pointers to its child nodes in the syntax tree. From the called SNode [sentence, ID = 101], the unit acquires the variable “sentence” of the source language node and the translation pattern ID “101” (step S10). Based on this information, it generates the target language node TNode [sentence, ID = 101]. TNode [sentence, ID = 101] is a structure representing a node of the target language syntax tree; it likewise indicates that the variable is “sentence” and the ID is “101”, and it also holds pointers to its child nodes in the syntax tree.
[0035]
Subsequently, the translated sentence generation processing unit 5 proceeds to step S12 and determines whether a marker table exists. Since no marker table has been set at this point, the process proceeds to step S14, where it is determined whether all target language evaluation units have been processed. Since the translation pattern with the ID “101” does not include a target language evaluation unit, the process proceeds to step S17, where it is determined whether the translated sentence generation process has been performed for all the components of the target language pattern. Since the processing is not yet complete, the processing for each type of component is executed as follows.
[0036]
That is, in steps S18, S20, and S22, the translated sentence generation processing unit 5 determines whether the component is a “word”, an “index”, or a “marker”. Here, the index “1” is described in the target language pattern of ID “101”. Therefore, the process proceeds from step S20 to step S21, where the source language node SNode [verb phrase, ID = 102], which is the node referred to by the index, is called. The function generate() is then called recursively on SNode [verb phrase, ID = 102] to create a target language node TNode [verb phrase, ID = 102], and the created TNode [verb phrase, ID = 102] is set as a child node of TNode [sentence, ID = 101]. In this way, the translated sentence generation processing unit 5 calls the function generate() recursively to process every node. When all the components of the target language pattern have been processed, the processing of the function generate() for the translation pattern with the ID “101” ends (step S24).
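The branching on the component type and the recursive call can be sketched as below. This is a simplified illustration, not the procedure of FIG. 4 itself: the pattern table `TARGET_PATTERNS`, its IDs and words, and the `refs` field are hypothetical, and marker handling is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SNode:
    variable: str
    pattern_id: int
    # Index written in the pattern -> source node that the index refers to.
    refs: Dict[int, "SNode"] = field(default_factory=dict)


@dataclass
class TNode:
    variable: str
    children: List["TNode"] = field(default_factory=list)


# Hypothetical target language patterns: pattern ID -> list of (kind, value).
TARGET_PATTERNS: Dict[int, List[Tuple[str, object]]] = {
    1: [("index", 1), ("word", ".")],
    2: [("word", "hello")],
}


def generate(source: SNode) -> TNode:
    target = TNode(source.variable)
    for kind, value in TARGET_PATTERNS[source.pattern_id]:
        if kind == "word":        # steps S18 -> S19: attach a word node
            target.children.append(TNode(value))
        elif kind == "index":     # steps S20 -> S21: recurse into the referent
            target.children.append(generate(source.refs[value]))
        elif kind == "marker":    # steps S22 -> S23: not covered in this sketch
            raise NotImplementedError("marker handling is omitted here")
    return target


result = generate(SNode("sentence", 1, {1: SNode("phrase", 2)}))
print(result)
```

Because each index component triggers a recursive call, processing the root node is enough to visit every node of the source language syntax tree.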
[0037]
Next, the translated sentence generation processing unit 5 executes the function generate() for the translation pattern with the ID “102”. Steps that overlap with the case of the ID “101” are not described again.
[0038]
First, in step S10, the translated sentence generation processing unit 5 acquires the variable “verb phrase” and the ID “102” of the called source language node SNode [verb phrase, ID = 102], and in step S11 it generates the target language node TNode [verb phrase, ID = 102]. Since there is no marker table, the process proceeds from step S12 to step S14, where it is determined whether all the target language evaluation units have been processed. Here, the translation pattern with the ID “102” has a position designation in its target language evaluation unit. The translated sentence generation processing unit 5 therefore proceeds to step S15 and calls the source language node SNode [Nominative, ID = 106] referred to by the index “1” on the right side of the position designation. The function generate() is then called recursively on SNode [Nominative, ID = 106] to create a target language node TNode [Nominative, ID = 106], which is stored in the marker table storage area 10 in association with the index “2” and the marker “$ sbj” on the left side of the position designation (step S16). As a result, the marker table has the entry shown in Table 1.
[0039]
[Table 1]
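The storing operation of steps S15 and S16 can be sketched as follows, assuming a marker table entry holds an index, a marker, and the generated target language node. The class `MarkerEntry`, the function `store_designated`, and the use of strings as stand-ins for nodes are assumptions; the marker is written “$sbj” here for what the text prints as “$ sbj”.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class MarkerEntry:
    index: Optional[int]  # index on the left side of the position designation
    marker: str           # marker on the left side, e.g. "$sbj"
    node: str             # stand-in for the generated target language node


def store_designated(left_index: int, marker: str, right_index: int,
                     refs: Dict[int, str],
                     generate: Callable[[str], str],
                     table: List[MarkerEntry]) -> None:
    # Step S15: call the source node referred to by the right-side index
    # and create its target language node through generate().
    target_node = generate(refs[right_index])
    # Step S16: store that node together with the left-side index and marker.
    table.append(MarkerEntry(left_index, marker, target_node))


refs = {1: "SNode[Nominative, ID=106]", 2: "SNode[verb phrase, ID=103]"}
marker_table: List[MarkerEntry] = []
store_designated(2, "$sbj", 1, refs,
                 lambda s: s.replace("SNode", "TNode", 1), marker_table)
print(marker_table)
```

After the call, the table holds a single entry with index 2, marker “$sbj”, and the node created from SNode [Nominative, ID = 106], which is the state the text describes for the pattern with the ID “102”.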

[0040]
When all of the target language evaluation units have been processed in this way, the translated sentence generation processing unit 5 proceeds to step S17, where it determines whether all components of the target language pattern have been processed. If unprocessed components remain, the type of each component is determined in steps S18, S20, and S22. Here, the index “2” is described in the target language pattern of ID “102”. Therefore, the translated sentence generation processing unit 5 calls the source language node SNode [verb phrase, ID = 103] referred to by the index “2” in step S21. If an entry exists in the marker table, any entry that satisfies one of the following conditions is called:
(Condition 1) An entry whose index is the same as the index described in the target language pattern.
(Condition 2) An entry whose index is NULL and whose marker does not appear in the current target language pattern.
Referring to Table 1, the index of the entry in the first row is “2”, which satisfies condition 1. Therefore, the translated sentence generation processing unit 5 calls this entry. Then, TNode [verb phrase, ID = 103] is created from the source language node SNode [verb phrase, ID = 103], and the called entry is set in TNode [verb phrase, ID = 103] as attribute information. This TNode [verb phrase, ID = 103] is then set as a child node of TNode [verb phrase, ID = 102]. When all the components of the target language pattern have been processed, the processing of the function generate() for this translation pattern ends.
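The two conditions for calling marker table entries can be written as a small filter. The sketch below assumes the same three-field entry as the text describes (index, marker, node); `MarkerEntry`, `select_entries`, and the use of `None` for NULL are illustrative choices, and “$sbj” stands for the marker printed as “$ sbj”.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class MarkerEntry:
    index: Optional[int]  # None plays the role of NULL
    marker: str
    node: str             # stand-in for the stored target language node


def select_entries(table: List[MarkerEntry], pattern_index: int,
                   pattern_markers: Set[str]) -> List[MarkerEntry]:
    """Return the entries to hand to the node referred to by pattern_index."""
    selected = []
    for entry in table:
        # Condition 1: the entry has the same index as the pattern component.
        cond1 = entry.index is not None and entry.index == pattern_index
        # Condition 2: the index is NULL and the marker is absent from the
        # current target language pattern.
        cond2 = entry.index is None and entry.marker not in pattern_markers
        if cond1 or cond2:
            selected.append(entry)
    return selected


table_1 = [MarkerEntry(2, "$sbj", "TNode[Nominative, ID=106]")]
print(select_entries(table_1, 2, set()))
```

With the single entry described for Table 1 and the index “2” of the pattern with the ID “102”, condition 1 holds and the entry is selected.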
[0041]
Next, the translated sentence generation processing unit 5 calls the source language node SNode [verb phrase, ID = 103]. At this point, the marker table entry has been set as attribute information as described above. The translated sentence generation processing unit 5 therefore proceeds to step S13, where it acquires the entry, changes its index from “2” to “NULL”, and stores it in the marker table storage area 10 again. As a result, the marker table has the entry shown in Table 2.
[0042]
[Table 2]
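Step S13 can be sketched as follows: entries received as attribute information are stored again with their index reset. The names `MarkerEntry` and `accept_entries` are hypothetical, `None` stands for NULL, and “$sbj” stands for the marker printed as “$ sbj”.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class MarkerEntry:
    index: Optional[int]  # None plays the role of NULL
    marker: str
    node: str             # stand-in for the stored target language node


def accept_entries(attribute_entries: List[MarkerEntry],
                   table: List[MarkerEntry]) -> None:
    # Step S13: take each entry passed in as attribute information, change
    # its index to NULL, and store it in the marker table again.
    for entry in attribute_entries:
        table.append(replace(entry, index=None))


passed = [MarkerEntry(2, "$sbj", "TNode[Nominative, ID=106]")]
marker_table: List[MarkerEntry] = []
accept_entries(passed, marker_table)
print(marker_table)
```

Once the index is NULL, the entry can only be matched through its marker, which is what condition 2 and the marker lookup in step S23 rely on.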

[0043]
Further, the translated sentence generation processing unit 5 performs the processing of steps S15 and S16 for the target language evaluation unit in the same manner as described above. As a result, the marker table storage area 10 has the entries shown in Table 3.
[0044]
[Table 3]

[0045]
Here, the index “2” is described in the target language pattern of the translation pattern with the ID “103”. Therefore, the translated sentence generation processing unit 5 refers to the marker table in Table 3 and determines whether there is an entry that satisfies condition 1 or condition 2 described above.
The entry in the first row of Table 3 has the index “NULL”, and its marker “$ sbj” is not included in the target language pattern. It therefore satisfies condition 2. The entry in the second row of Table 3 has the same index as the one described in the target language pattern. It therefore satisfies condition 1.
[0046]
Therefore, the translated sentence generation processing unit 5 calls the two entries in Table 3 and the source language node SNode [verb phrase, ID = 104] referred to by the index “2” of the target language pattern. The function generate() is then called recursively on SNode [verb phrase, ID = 104] to create TNode [verb phrase, ID = 104], and the two called entries are set in TNode [verb phrase, ID = 104] as attribute information. This TNode [verb phrase, ID = 104] is then set as a child node of TNode [verb phrase, ID = 103] (step S21).
[0047]
As is apparent from this processing operation, if the target language pattern at the current node does not contain the designated marker, the marker table entry is deleted when the function generate() processes a child node of the current node. By following the child nodes according to the target language pattern, the target language pattern in which the designated marker is described can be found. When there are several child nodes to follow, an order of priority may be determined in advance, for example following the child nodes in leftmost, depth-first order.
[0048]
Next, the translated sentence generation processing unit 5 executes the function generate() for the translation pattern with the ID “105”. When the source language node SNode [verb phrase, ID = 105] is called, three marker table entries are given as attribute information, and the entries shown in Table 4 are stored in the marker table storage area 10 (step S12).
[0049]
[Table 4]

[0050]
Here, the marker “$ sbj” is described in the target language pattern of the translation pattern with the ID “105”. The translated sentence generation processing unit 5 therefore proceeds from step S22 to step S23, where it selects, from the entries of the marker table, the target language node whose index is “NULL” and whose marker is the same. For example, in the marker table shown in Table 4, the entry in the first row is selected, and the target language node TNode [Nominative, ID = 106] is set as the first child node.
[0051]
Next, since the target language pattern describes the index “1”, the source language node SNode [verb, ID = 111] referred to by the index is called, and the function generate() is called recursively on this SNode [verb, ID = 111] to create a target language node TNode [verb, ID = 111]. The created TNode [verb, ID = 111] is then set as the second child node (step S21). At this time, none of the entries in the marker table shown in Table 4 satisfies condition 1 or condition 2 described above, so no attribute information is set.
[0052]
Since the next component of the target language pattern is the marker “@obj”, the translated sentence generation processing unit 5 proceeds from step S22 to step S23, where it refers to the marker table in Table 4 and selects the target language node TNode [case, ID = 107]. This TNode [case, ID = 107] is then set as the third child node.
Similarly, since the last component is the marker “@adv”, the translated sentence generation processing unit 5 refers to the marker table in Table 4 and sets the target language node TNode [adverb phrase, ID = 110] as the fourth child node.
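The placement of stored components at the marker positions of the pattern with the ID “105” can be sketched as below. The component order (marker “$ sbj”, index “1”, marker “@obj”, marker “@adv”) follows the text; everything else is an assumption for illustration: the names `MarkerEntry` and `arrange`, the NULL index assumed for all three entries, and the English words used as stand-ins for the target language nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MarkerEntry:
    index: Optional[int]  # None plays the role of NULL
    marker: str
    node: str             # word used as a stand-in for the target node


def arrange(pattern: List[Tuple[str, object]], table: List[MarkerEntry],
            index_nodes: Dict[int, str]) -> List[str]:
    """Build the ordered child list for one target language pattern."""
    children = []
    for kind, value in pattern:
        if kind == "marker":
            # Step S23: pick the entry with a NULL index and the same marker.
            entry = next(e for e in table
                         if e.index is None and e.marker == value)
            children.append(entry.node)
        elif kind == "index":
            # Step S21: use the node generated for the referenced index.
            children.append(index_nodes[value])
    return children


PATTERN_105 = [("marker", "$sbj"), ("index", 1),
               ("marker", "@obj"), ("marker", "@adv")]
table = [MarkerEntry(None, "$sbj", "I"),
         MarkerEntry(None, "@obj", "English"),
         MarkerEntry(None, "@adv", "hard")]

in_order = arrange(PATTERN_105, table, {1: "study"})
shuffled = arrange(PATTERN_105, list(reversed(table)), {1: "study"})
print(in_order, shuffled)
```

The child order depends only on where the markers appear in the target language pattern, not on the order in which the entries were stored, which is why a free source word order still yields a fixed target word order.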
[0053]
When the processing for all components of the target language pattern is completed, the translated sentence generation processing unit 5 proceeds from step S17 to step S24, and ends the processing of the function generate () in the translation pattern.
[0054]
Finally, the operation of the function generate() when the source language node SNode [noun phrase, ID = 108] is called will be described. Since the target language pattern of the translation pattern with the ID “108” describes the word “I”, the translated sentence generation processing unit 5 proceeds from step S18 to step S19, where the recursively called function generate() creates a target language node TNode [“I”, ID = ***] and sets it as a child node of TNode [noun phrase, ID = 108]. The processing of the function generate() for this translation pattern then ends (step S24). Here, ID = “***” indicates that the node represents a word.
[0055]
As described above, the translated sentence generation processing unit 5 calls the function generate() recursively to process every node. Therefore, when the root node of the source language syntax tree shown in FIG. 3A is called and the translation process is executed, the target language syntax tree shown in FIG. 3B is generated. The translated sentence generation processing unit 5 then traverses the generated target language syntax tree (FIG. 3B) in leftmost, depth-first order, arranges the terminal words in their order of appearance, and thereby generates display data for the translated sentence. The generated display data is output to the output unit 11 and displayed. As a result, the output unit 11 displays the translated sentence “I study English hard”.
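Collecting the terminal words by a leftmost, depth-first traversal can be sketched as follows. The tree built here is a hypothetical simplification and does not reproduce FIG. 3B; only the resulting sentence is taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TNode:
    label: str
    children: List["TNode"] = field(default_factory=list)


def terminal_words(node: TNode) -> List[str]:
    """Visit children left to right, depth first, collecting leaf labels."""
    if not node.children:
        return [node.label]
    words: List[str] = []
    for child in node.children:
        words.extend(terminal_words(child))
    return words


# Hypothetical target language tree with the words as leaves.
tree = TNode("sentence", [
    TNode("verb phrase", [
        TNode("Nominative", [TNode("noun phrase", [TNode("I")])]),
        TNode("verb", [TNode("study")]),
        TNode("case", [TNode("English")]),
        TNode("adverb phrase", [TNode("hard")]),
    ]),
])

sentence = " ".join(terminal_words(tree))
print(sentence)
```

Because the word order is fixed entirely by the child order of the target language syntax tree, no reordering is needed at this stage.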
[0056]
As described in detail above, in this embodiment, a text sentence written in the source language is divided into words, and a syntax tree is created from the word-divided text sentence according to the source language pattern. Then, when a target language syntax tree is created from the source language syntax tree in accordance with the target language pattern, the components designated with markers by the target language evaluation unit are stored, and a translated sentence is generated by arranging the stored components at the positions of the markers described in the target language pattern.
[0057]
Therefore, according to this embodiment, the position at which a target language component with a marker is arranged can be changed. Consequently, even if the source language has free word order and the target language has fixed word order, a translated sentence can be generated in the predetermined word order regardless of the order in which the translation patterns are applied.
[0058]
Further, in the above-described embodiment, the translation process is executed with the components whose position can be changed specified in advance in the target language evaluation unit of the translation pattern. However, the device may be configured so that the user can set these components. That is, a setting screen prompting the user to input a component whose position can be changed is created and displayed on the output unit 11, and the information input by the user through the input unit 1 is reflected in the target language evaluation unit.
[0059]
Similarly, in the above embodiment, the translation process is executed with the position of the marker specified in advance in the target language pattern of the translation pattern (the target language pattern of ID = 105 in FIG. 2). However, the device may be configured so that the user can set the position of the marker. That is, a setting screen prompting the user to change the position of the component designated by the marker is created and displayed on the output unit 11, and the information input by the user through the input unit 1 is reflected in the target language pattern.
With such a configuration, the translation method can be customized, in addition to translated sentences being generated in the fixed word order described above.
[0060]
In the above-described embodiment, the translation process is performed by the CPU executing the translation program stored in the ROM. However, the translation program may instead be read from an external storage medium such as a CD-ROM, or may be downloaded from a site on a network via a communication line and installed.
[0061]
In addition, the type and configuration of the translation device, the type and configuration of the dictionaries, the methods of word division and syntax analysis, and the control procedure and contents of the translated sentence generation process can also be modified in various ways without departing from the gist of the present invention.
[0062]
[Effect of the invention]
As described above in detail, the present invention has a translation pattern comprising a source language pattern for generating a source language syntax tree, a target language pattern for generating a target language syntax tree corresponding to the source language pattern, and a target language evaluation unit, in which the target language pattern includes an identifier indicating a component whose position can be changed among the components representing the syntax tree of the target language, and the target language evaluation unit includes information for specifying the component of the target language represented by the identifier. A text sentence written in the source language is divided into words, and a syntax tree of the source language is created according to the source language pattern based on the divided words. Then, a syntax tree of the target language is created according to the target language pattern based on the created syntax tree of the source language, and, among the components of the created syntax tree of the target language, the component represented by the identifier of the target language evaluation unit is stored in association with the identifier. When the identifier is present in the target language pattern, a translated sentence is generated by substituting the component stored in association with the identifier.
[0063]
Therefore, the present invention can provide a translation device, a translation method, and a translation program that can translate accurately without preparing a plurality of translation patterns, even when the source language includes a language with free word order and the word order of the corresponding target language is fixed.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a translation apparatus according to an embodiment of the present invention.
FIG. 2 is an example of a translation pattern in the translation device shown in FIG. 1.
FIG. 3 is a view showing a syntax tree created by the translation device shown in FIG. 1.
FIG. 4 is a flowchart showing the operation of the function generate() executed by the translated sentence generation processing unit 5.
FIG. 5 is an example of a translation pattern according to the related art when translating from Japanese to English.
FIG. 6 is a translation example according to the related art.
FIG. 7 is a translation example according to the related art.
[Explanation of symbols]
1 ... Input unit
2 ... Processor
3 ... Word division processing unit
4 ... Syntax analysis processing unit
5 ... Translated sentence generation processing unit
6 ... Storage unit
7 ... Dictionary storage area for each language
8 ... Bilingual dictionary storage area
9 ... Translation pattern storage area
10 ... Marker table storage area
11 ... Output unit

Claims

1. A translation apparatus comprising:
first storage means for storing a translation pattern having a source language pattern for generating a source language syntax tree, a target language pattern for generating a target language syntax tree corresponding to the source language pattern, and a target language evaluation unit, wherein the target language pattern includes an identifier indicating a component whose position can be changed among the components representing the syntax tree of the target language, and the target language evaluation unit includes information for specifying the component of the target language represented by the identifier;
dividing means for dividing a text sentence written in the source language into words;
first creating means for creating a source language syntax tree according to the source language pattern based on the words divided by the dividing means;
second creating means for creating a syntax tree of the target language according to the target language pattern based on the syntax tree of the source language created by the first creating means;
second storage means for storing, in association with the identifier, the component represented by the identifier of the target language evaluation unit among the components of the syntax tree of the target language created by the second creating means; and
generating means for generating a translated sentence, when the identifier is present in the target language pattern, by substituting the component stored in association with the identifier by the second storage means.

2. The translation apparatus according to claim 1, further comprising means for performing a process of including the identifier in the target language pattern in response to a user's input operation.

3. The translation apparatus according to claim 1, further comprising means for performing a process of including, in the target language evaluation unit, information for specifying the component of the target language represented by the identifier in response to a user's input operation.

4. A translation method for translating a text sentence written in a source language into a target language by using a translation pattern having a source language pattern for generating a source language syntax tree, a target language pattern for generating a target language syntax tree corresponding to the source language pattern, and a target language evaluation unit, wherein the target language pattern includes an identifier indicating a component whose position can be changed among the components representing the syntax tree of the target language, and the target language evaluation unit includes information for specifying the component of the target language represented by the identifier, the method comprising:
dividing the text sentence into words;
creating a source language syntax tree according to the source language pattern based on the divided words;
creating a syntax tree of the target language according to the target language pattern based on the created syntax tree of the source language;
storing, in association with the identifier, the component represented by the identifier of the target language evaluation unit among the components of the created syntax tree of the target language; and
generating a translated sentence, when the identifier is present in the target language pattern, by substituting the component stored in association with the identifier.

5. A translation program used in a translation device that comprises a computer and a storage unit storing a translation pattern having a source language pattern for generating a source language syntax tree, a target language pattern for generating a target language syntax tree corresponding to the source language pattern, and a target language evaluation unit, wherein the target language pattern includes an identifier indicating a component whose position can be changed among the components representing the syntax tree of the target language, and the target language evaluation unit includes information for specifying the component of the target language represented by the identifier, and in which the computer translates a text sentence written in the source language into a target language using the translation pattern stored in the storage unit, the translation program causing the computer to execute:
a process of dividing the text sentence into words;
a process of creating a source language syntax tree according to the source language pattern based on the divided words;
a process of creating a target language syntax tree according to the target language pattern based on the created source language syntax tree;
a process of storing, in association with the identifier, the component represented by the identifier of the target language evaluation unit among the components of the created syntax tree of the target language; and
a process of generating a translated sentence, when the identifier is present in the target language pattern, by substituting the component stored in association with the identifier.