JP4876329B2 - Parallel translation probability assigning device, parallel translation probability assigning method, and program thereof

Info

Publication number: JP4876329B2
Application number: JP2001144337A
Authority: JP
Inventors: 真一郎亀井; 潔山端; 誠也長田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2001-05-15
Filing date: 2001-05-15
Publication date: 2012-02-15
Anticipated expiration: 2021-05-15
Also published as: JP2002342325A

Description

【０００１】
【発明の属する技術分野】
機械翻訳、クロス言語テキスト検索など、異なる言語の間で言葉の対応をとることを課題とする自然言語処理技術に関する。
【０００２】
【従来技術】
機械翻訳、クロス言語テキスト検索など、異なる言語の間で言葉の対応をとることを課題とする自然言語処理技術においては、一方の言語の単語を、もう一方の言語の適切な単語に対応させることは非常に重要な課題であり、訳語選択の問題と呼ばれている。この課題が重要な問題であることは、一般にどのような言語対の場合にも当てはまるが、以下では英語と日本語の場合を取り上げ、具体例を示して説明する。
【０００３】
英語の単語は一般に複数の意味をもち、一般にはそれぞれ異なる日本語の単語に対応する。ところが、自然言語処理分野において元の英単語の使われている状況を正しく判断して適切な日本語の単語を選択することは、一般には非常に困難である。たとえば、英語の単語「ｔｅｒｍ」には「期間」という意味の他に「専門用語」という意味があるが、どのような場合に「期間」という意味となり、どのような場合に「専門用語」という意味になるか、という訳語の選択条件を、あらかじめ明示的に記述することは非常に難しい。
【０００４】
この問題を解決する方法として、言葉が実際に使用された例、すなわち実例文を大量に集めてそれを利用する方法が提案されている。
【０００５】
たとえば「野上宏康、熊野明、田中克己、天野真家『既存目的言語文書からの訳語の自動学習方式』情報処理学会第４２回全国大会（平成３年）」（先行技術文献１）では、以下のような方法が提案されている。
【０００６】
まず、異なる言語（日本語と英語など）で、同じ分野の話題を述べている文例を大量に収集しておく。次に、一方の言語（たとえば英語）の単語が、相手言語（たとえば日本語）の訳語候補のうち、どの訳語に対応するかの確からしさを判定する際に、相手言語の文例集における、各訳語候補の出現確率の高さを用いる。たとえば、今、英語の「ｔｅｒｍ」を「期間」と訳すのが確からしいか「専門用語」と訳すのが確からしいかを判断するのに、同じ分野の話題を述べている日本語の文例集の中に出現する「期間」という単語と「専門用語」という単語の頻度を計測し、その多い方を「ｔｅｒｍ」の訳語とする、という手法である。この手法には、相手言語の文例集のみを分析すればよいという利点がある。
【０００７】
また「中島弘之、梶博行『対訳テキストを利用した訳語選択のための共起関係の自動抽出』情報処理学会第３９回全国大会（平成元年）」（先行技術文献２）では、以下のような方法が提案されている。
【０００８】
まず、異なる言語（日本語と英語など）で、一方が他方の翻訳関係にあるような対訳文例集を用意する。さらに、二つの言語の間の対訳辞書を用意し、第１の言語の例文に含まれる単語に対して対訳辞書を引き、訳語候補を挙げる。その例文と対訳関係にある第２の言語の例文の中に出現する訳語候補の頻度を計測し、最も高頻度で現われる訳語候補を、元の単語に対する訳語とする、という手法である。この手法は、互いに翻訳関係にある対訳例文が利用できる場合には、高い精度で訳語を認定できるという利点がある。
【０００９】
【発明が解決しようとする課題】
しかしながら、先行技術文献１の方法は、相手言語の単語の頻度だけを手がかりにしているため、相手言語で一般的に高頻度で出現する単語が訳語として採用されてしまいやすい、という欠点がある。
【００１０】
たとえば、英語の単語「ｍａｋｅ」には「作る」という訳語の他にも多くの日本語の訳語が相当する。一例として「ｍａｋｅａｃａｌｌ」を「電話をする」と訳すためには「ｃａｌｌ」を「電話」に対応するものとし、「ｍａｋｅ」には「する」という動詞が対応するものとして辞書を構築するのが通常の手法である。このように辞書を作る時「ｍａｋｅ」には少なくとも「作る」と「する」という訳語候補が存在することになる。この場合、先行技術文献１の方法に従って、相手言語、つまり日本語の単語の出現頻度だけを計測すると、訳語「作る」よりも訳語「する」の方が一般に出現頻度が高いので、「ｍａｋｅ」の訳語候補として「する」が最も確からしいものとして選択されてしまう。先行技術文献１には、このように、本来の訳語として適切かどうかとは無関係に、相手言語で出現頻度の高い訳語が選択されやすい、という欠点がある。
【００１１】
また、先行技術文献２の方法は、互いに翻訳関係にある対訳例文が大量に存在する場合に有効な方法であるが、実際には、互いに翻訳関係にある対訳例文の量は極めて限られている。先行技術文献２の方法は対訳例文が大量に存在しない場合には適用できない、という欠点がある。
【００１２】
本願発明の目的は、従来の手法がもつ、上記のような問題点を解決し、より確からしい訳語候補を選択するための方法を提供するところにある。
【００１３】
【課題を解決するための手段】
本発明の対訳確率付与装置は、第１の言語を第２の言語に翻訳する際に用いられる対訳確率付与装置であって、第１の言語の文例集と第２の言語の文例集とを有し、第１の言語の単語に対する第２の言語の訳語候補を単語対応対として格納した第１言語第２言語対訳辞書を有し、第１の言語の文例集における単語の出現に関する統計量を計算する第１言語統計量計算モジュールを有し、第２の言語の文例集における単語の出現に関する統計量を計算する第２言語統計量計算モジュールを有し、対訳辞書の各単語対応対に付与された対訳確率をパラメータとして、第１の言語の文例集から第１言語統計量計算モジュールによって求められる統計量から、第２の言語の単語の出現に関する統計量を推定する対訳確率モデルを有し、第２の言語の文例集から第２言語統計量計算モジュールによって求められた統計量と対訳確率モデルによって第１の言語から推定された第２の言語の統計量との差を最小にするようにパラメータを求める対訳確率付与部を有することを特徴とする。
【００１４】
この場合、第１の言語の単語の出現に関する統計量から第２の言語の単語の出現に関する統計量を推定する対訳確率モデルとして、第１の言語の単語Ｅ（ｉ）の出現確率Ｅ（ｉ）とその単語Ｅ（ｉ）が第２の言語の訳語Ｊ（ｎ）に対応する対応確率Ｓ（ｉ，ｎ）との積を求め、第１の言語の各単語Ｅ（ｉ）に関して上記の積を可算した和をとることによって第２の言語における単語Ｊ（ｎ）の出現確率Ｊ（ｎ）を計算する対訳確率モデルを用いてもよい。
【００１５】
また、第１の言語の単語の出現に関する統計量から第２の言語の単語の出現に関する統計量を推定する対訳確率モデルとして、第１の言語で一つの文の中に出現する二つの単語のペアの共起確率Ｐ（Ｅ（ｉ）＾Ｅ（Ｊ））とその単語ペアを構成する各単語Ｅ（ｉ）およびＥ（Ｊ）が第２の言語の訳語に対応する対応確率Ｓ（ｉ，ｍ）およびＳ（Ｊ，ｎ）との積を求め、第１の言語の各単語ペアＥ（ｉ）およびＥ（Ｊ）に関して上記の積を可算した和をとることによって第２の言語で一つの文の中に出現する二つの単語のペアの出現確率Ｐ（Ｊ（ｍ）＾Ｊ（ｎ））を計算する対訳確率モデルを用いことにしてもよい。
【００１６】
また、第１の言語の単語の出現に関する統計量から第２の言語の単語の出現に関する統計量を推定する対訳確率モデルとして、第１の言語で構文上の係り受け関係にある二つの単語のペアの共起確率Ｐ（Ｅ（ｉ）＾Ｅ（Ｊ））とその単語ペアを構成する各単語Ｅ（ｉ）およびＥ（Ｊ）が第２の言語の訳語に対応する対応確率Ｓ（ｉ，ｍ）およびＳ（Ｊ，ｎ）との積を求め、第１の言語の各単語ペアＥ（ｉ）およびＥ（Ｊ）に関して上記の積を可算した和をとることによって第２の言語で構文上の係り受け関係にある二つの単語のペアの出現確率Ｐ（Ｊ（ｍ）＾Ｊ（ｎ））を計算する対訳確率モデルを用いることにしてもよい。
【００１７】
【発明の実施の形態】
本発明の実施の形態について図面を参照して説明する。図１は本発明の第１実施の形態の訳語選択システムの構成を示すブロック図である。
【００１８】
本実施の形態は、第１言語の文例集１、第２言語の文例集２、第１の言語の単語に対する第２の言語の訳語候補を単語対応対として格納した第１言語第２言語対訳辞書３、第１言語の文例集における単語の出現に関する統計量を計算する第１言語統計量計算モジュール４、第２言語の文例集における単語の出現に関する統計量を計算する第２言語統計量計算モジュール５、対訳辞書の各単語対応対に付与された対訳確率をパラメータとして、第１の言語の文例集から第１言語統計量計算モジュールによって求められる統計量から、第２の言語の単語の出現に関する統計量を推定する対訳確率モデルを格納した対訳確率モデル格納部６、第２の言語の文例集から第２言語統計量計算モジュールによって求められた統計量と対訳確率モデルによって第１の言語から推定された第２の言語の統計量との差を最小にするようにパラメータを求める対訳確率付与部７とから構成されている。
【００１９】
各ブロックの内容と動作について以下に説明する。第１言語の文例集１には、第１の言語、たとえば英語の実例文が格納されている。第２言語の文例集２には、第２の言語、たとえば日本語の実例文が格納されている。第１言語第２言語対訳辞書３には、第１の言語の各単語に対する第２の言語の訳語候補を単語対応対として格納してある。図２は、第１言語第２言語対訳辞書３の内容の例を示した図である。この図では、第１言語の単語Ｅ（ｉ）に対応する第２言語の訳語候補として、Ｊ（k）、Ｊ（ｍ）、Ｊ（ｎ）が存在する場合を示している。
【００２０】
この図でｅ（ｉ）は、第１言語の単語Ｅ（ｉ）の出現確率、ｊ（k）、ｊ（ｍ）、ｊ（ｎ）はそれぞれ第２言語の単語Ｊ（k）、Ｊ（ｍ）、Ｊ（ｎ）の出現確率を表す。また、Ｓ（ｉ，k）、Ｓ（ｉ，ｍ）、Ｓ（ｉ，ｎ）は、それぞれ、第１言語の単語Ｅ（ｉ）が、第２言語の単語Ｊ（k）、Ｊ（ｍ）、Ｊ（ｎ）に翻訳される確率を表す。
【００２１】
第１言語統計量計算モジュール４は、第１言語の文例集１における単語の出現に関する統計量を計算する。第２言語統計量計算モジュール５は、第２言語の文例集２における単語の出現に関する統計量を計算する。第１言語統計量計算モジュール４および第２言語統計量計算モジュール５は、必要に応じて、第１言語の文例集１および第２言語の文例集２に含まれる文を形態素解析したり構文解析したりして、そこに含まれる単語の出現に関する統計量を計算する。統計量の例としては、各単語の出現確率や二つの単語が同時に出現する共起確率などがある。
【００２２】
対訳確率モデル格納部６には、第１の言語の単語の出現に関する統計量から第２の言語の単語の出現に関する統計量を推定する対訳確率モデルが格納してある。この対訳確率モデルは、対訳辞書の各単語対応対に付与された対訳確率をパラメータとして、第１の言語の文例集から第１言語統計量計算モジュールによって求められる統計量から、第２の言語の単語の出現に関する統計量を推定する。
【００２３】
対訳確率付与部７は、第２の言語の文例集から第２言語統計量計算モジュールによって求められた統計量と対訳確率モデルによって第１の言語から推定された第２の言語の統計量との差を最小にするように、対訳辞書の各単語対応対に付与された対訳確率パラメータを調整する。
【００２４】
図６は、本願発明の第２の実施の形態を説明する図である。
【００２５】
図６において、本願発明の第２の実施の形態は、入力装置１０１と、コンピュータから構成されるデータ処理装置１０２と、出力装置１０３と、記憶装置１０４と、訳語選択プログラムを記録した記憶媒体１０５とを備える。記憶媒体１０５は、磁気ディスク、磁気テープ、光ディスク、半導体メモリその他の記憶媒体よりなる。
【００２６】
訳語選択プログラムは、記憶媒体１０５からデータ処理装置１０２の主記憶装置に読み込まれ、データ処理装置１０２の動作を制御する。データ処理装置１０２は、訳語選択プログラムの制御により以下の処理を行なう。
【００２７】
訳語の選択を行なうべき単語が入力装置１０１から入力されると、第１言語統計量計算モジュール４と第２言語統計量計算モジュール５とが起動される。第１言語統計量計算モジュール４は、第１言語の文例集１における単語の出現に関する統計量を計算する。第２言語統計量計算モジュール５は、第２言語の文例集２における単語の出現に関する統計量を計算する。
【００２８】
次に、対訳確率付与部７が起動される。対訳確率付与部７は、第２の言語の文例集から第２言語統計量計算モジュールによって求められた統計量と対訳確率モデル格納部６に格納された対訳確率モデルによって第１の言語から推定された第２の言語の統計量との差を最小にするように、対訳辞書の各単語対応対に付与された対訳確率パラメータを調整する。
【００２９】
結果として得られた対訳確率パラメータの値にしたがって、訳語が出力装置１０３から出力される。
【００３０】
次に、図１に示した実施の形態における、対訳確率モデル格納部６に格納されている対訳確率モデルの例を用いて、本願発明の動作を説明する。次の式は、対訳確率モデルの一例を表す式である。
【００３１】
【数１】

この式１において、ｅ（ｉ）は第１言語のｉ番目の単語Ｅ（ｉ）の出現確率を表す。またｊ（ｍ）は第２言語のｍ番目の単語Ｊ（ｍ）の出現確率を表す。Ｓ（ｉ，ｍ）は、第１言語のｉ番目の単語Ｅ（ｉ）が、第２言語のｍ番目の単語Ｊ（ｍ）に翻訳される確率を表す。この式は、第１言語の各単語の出現確率と翻訳確率の積の総和が第２言語の各単語の出現確率を与えるというモデルを表している。
【００３２】
この式１のＳ（ｉ，ｍ）が、この対訳確率モデルにおけるパラメータであり、第１言語の単語Ｅ（ｉ）と第２言語の訳語候補Ｊ（ｍ）との単語対応対に与えられた対訳確率である。このパラメータには、第１言語の単語は第２言語の単語に必ず対応するという仮定の下で、
【００３３】
【数２】

という制約がある。
【００３４】
この対訳確率モデルによって各単語の対訳確率を求めるには、第１言語統計量計算モジュール４によって、第１言語の文例集１における単語の出現確率ｅ（ｉ）を計算し、第２言語統計量計算モジュール５によって、第２言語の文例集２における単語の出現確率ｊ（ｍ）を計算し、このようにして求めたｅ（ｉ）およびｊ（ｍ）を上記の対訳確率モデルの式に代入して上記の制約を満たすパラメータＳ（ｉ，ｍ）を定める。
【００３５】
次に、図３、図４、図５を用いて、本願発明と従来方式の差異を説明する。ここでは、例として、英語の単語を日本語の単語に翻訳する場合を考える。
【００３６】
図３は、第１言語第２言語対訳辞書３の中の英単語「ｄｏ」と「ｍａｋｅ」の単語対応対を示している。ここでは簡単のため、英単語「ｄｏ」は日本語の単語「する」１語とだけ訳語候補としての単語対応対をなしており、英単語「ｍａｋｅ」は日本語の単語の「つくる」と「する」の２単語と、訳語候補としての単語対応対をなしている状況を想定する。
【００３７】
図４は、先行技術文献１で示されているような、第２言語の文例集における単語の出現頻度だけを用いて、第１言語の単語の訳語選択を行なう従来方式の動作を、図３で示した単語対応対の構成をもった「ｄｏ」と「ｍａｋｅ」を例に挙げて表した図である。図４は第２言語の文例集における「する」と「つくる」の出現確率が、仮にそれぞれ、０．２０および０．０１である状況を示している。この場合、先行技術文献１の従来方式では、「ｍａｋｅ」の訳語として、出現確率の高い単語「する」が単語「つくる」よりも優先されてしまう。
【００３８】
図５は、本願発明の動作を、図３で示した単語対応対の構成をもった「ｄｏ」と「ｍａｋｅ」を例に挙げて表した図である。図５では、第１言語の文例集における「ｄｏ」と「ｍａｋｅ」の出現確率が、仮にそれぞれ、０．１８および０．０２である状況を示している。第２言語の文例集における「する」と「つくる」の出現確率は、図４の場合と同様に、それぞれ、０．２０および０．０１であるとする。
【００３９】
本願発明では、上述の式で示したような対訳確率モデルを用いて、日本語の同じ単語を訳語としてもつ英単語の影響を考慮した計算を行なう。この方法で、英単語「ｍａｋｅ」が「する」に翻訳される確率および「つくる」に翻訳される確率を計算すると、この例のように「する」の頻度が高くても、その頻度の大部分は英単語「ｄｏ」からの翻訳確率に対応するので、「ｍａｋｅ」から「する」への翻訳確率は低くなる。図５では「ｍａｋｅ」から「つくる」への翻訳確率が０．９、「ｍａｋｅ」から「する」への翻訳確率が０．１という結果が得られた場合を示している。
【００４０】
次に、対訳確率モデル格納部６に格納されている対訳確率モデルの第２の例を用いて、本願発明の動作を説明する。次に挙げる式は、対訳確率モデルの一例を表す式である。
【００４１】
【数３】

この式３において、Ｐ（Ｅ（ｉ）＾Ｅ（ｊ））は、第１言語で単語Ｅ（ｉ）と単語Ｅ（ｊ）が同時に出現する共起確率を表す。また、Ｐ（Ｊ（ｍ）＾Ｊ（ｎ））は、第２言語で単語Ｊ（ｍ）と単語Ｊ（ｎ）が同時に出現する共起確率を表す。この式３は、第１言語における二つの単語の共起確率とそれぞれの単語の対訳確率の積の総和が、第２言語における二つの単語の共起確率を与えるというモデルを表している。
【００４２】
この式のＳ（ｉ，ｍ）およびＳ（ｊ，ｎ）が、この対訳確率モデルにおけるパラメータであり、それぞれ、第１言語の単語Ｅ（ｉ）と第２言語の訳語候補Ｊ（ｍ）との単語対応対に与えられた対訳確率、第１言語の単語Ｅ（Ｊ）と第２言語の訳語候補Ｊ（ｎ）との単語対応対に与えられた対訳確率である。このパラメータには、第１言語の単語は第２言語の単語に必ず対応するという仮定の下で、
【００４３】
【数４】

という制約がある。
【００４４】
この対訳確率モデルを使って各単語の対訳確率を求める場合、二つの単語の共起として、何種類かの共起が考えられる。共起の種類の一つとして、一つの文の中に二つの単語が共に出現する文内共起がある。
【００４５】
この場合、第１言語統計量計算モジュール４によって、第１言語の文例集１における二つの単語の文内共起確率Ｐ（Ｅ（ｉ）＾Ｅ（ｊ））を計算し、第２言語統計量計算モジュール５によって、第２言語の文例集２における二つの単語の文内共起確率Ｐ（Ｊ（ｍ）＾Ｊ（ｎ））を計算し、このようにして求めたＰ（Ｅ（ｉ）＾Ｅ（ｊ））およびＰ（Ｊ（ｍ）＾Ｊ（ｎ））を上記の対訳確率モデルの式に代入して、上記の制約を満たすパラメータＳ（ｉ，ｍ）を定める。
【００４６】
もう一つの共起の種類として、二つの単語が、互いに構文的な係り受け関係にある係り受け共起がある。この場合、第１言語統計量計算モジュール４によって、第１言語の文例集１における二つの単語の係り受け共起確率Ｐ（Ｅ（ｉ）＾Ｅ（ｊ））を計算し、第２言語統計量計算モジュール５によって、第２言語の文例集２における二つの単語の係り受け共起確率Ｐ（Ｊ（ｍ）＾Ｊ（ｎ））を計算し、このようにして求めたＰ（Ｅ（ｉ）＾Ｅ（ｊ））およびＰ（Ｊ（ｍ）＾Ｊ（ｎ））を上記の対訳確率モデルの式に代入して、上記の制約を満たすパラメータＳ（ｉ，ｍ）を定める。
【００４７】
【発明の効果】
本願発明によれば、第１言語の単語の訳語を定める際、先行技術文献１とは異なり、第１言語と第２言語の両方の全体の単語の対訳確率を考慮に入れるため、第２言語で出現確率の高い単語が訳語に選ばれやすいという先行技術文献１のもっていた欠点が解消されている。
【００４８】
また、本願発明で用いる第１言語および第２言語の文例集は互いに翻訳関係にあることを仮定していないため、大量に収集することができる。互いに翻訳関係にある文例集が存在しないと適用できないという先行技術文献２のもっていた欠点が解消されている。
【００４９】
さらに、本願発明では文内共起を用いて対訳確率を求めるため、単独の単語の対訳確率だけを用いる場合に比べて、複合語などの場合の翻訳の精度が向上する。
【００５０】
また、本願発明では、係り受け共起を用いて対訳確率を求めるため、動詞とその格要素の名詞が組になって訳語が定まるような場合の翻訳の精度が向上する。
【図面の簡単な説明】
【図１】本発明の第１の形態をなす訳語選択システムの構成を示すブロック図である。
【図２】図１に示した実施例における、第１言語第２言語対訳辞書３の内容例を表す図である。
【図３】従来法と本発明の動作を比較するための、第１言語第２言語対訳辞書３の内容例を示す図である。
【図４】従来法の動作を説明するための第１言語第２言語対訳辞書３の内容例を示す図である。
【図５】本発明の動作を説明するための第１言語第２言語対訳辞書３の内容例を示す図である。
【図６】本発明の第２の実施の形態をなす訳語選択システムの構成を示すブロック図である。
【符号の説明】
１第１言語の文例集
２第２言語の文例集
３第１言語第２言語対訳辞書
４第１言語統計量計算モジュール
５第２言語統計量計算モジュール
６対訳確率モデル格納部
７対訳確率付与部
１０１入力装置
１０２データ処理装置
１０３出力装置
１０４記憶装置
１０５記録媒体
[0001]
[Technical field of the invention]
The present invention relates to natural language processing technologies, such as machine translation and cross-language text retrieval, whose task is to establish correspondences between words in different languages.
[0002]
[Prior art]
In natural language processing technologies such as machine translation and cross-language text retrieval, where the task is to establish correspondences between words in different languages, mapping a word in one language to the appropriate word in the other language is a very important problem, known as the translation word selection problem. The problem matters for any language pair, but the following explanation uses English and Japanese as a concrete example.
[0003]
English words generally have multiple meanings, and each meaning generally corresponds to a different Japanese word. In natural language processing, however, it is generally very difficult to determine correctly the context in which the original English word is used and to select the appropriate Japanese word. For example, the English word “term” means “technical term” (専門用語) as well as “period” (期間), but it is very difficult to describe explicitly in advance the selection conditions that decide in which cases it means “period” and in which cases it means “technical term”.
[0004]
As a way to solve this problem, methods have been proposed that collect and exploit large numbers of examples of actual word usage, that is, real example sentences.
[0005]
For example, the following method is proposed in Hiroyasu Nogami, Akira Kumano, Katsumi Tanaka, and Masaya Amano, “Automatic Learning Method of Translations from Existing Target Language Documents,” Information Processing Society of Japan 42nd National Convention (1991) (prior art document 1).
[0006]
First, a large number of example sentences discussing topics in the same field are collected in different languages (such as Japanese and English). Next, when judging how likely it is that a word in one language (for example, English) corresponds to each of its translation candidates in the other language (for example, Japanese), the method uses how high the appearance probability of each candidate is in the other language's sentence collection. For example, to judge whether the English word “term” is more likely to be translated as “period” or as “technical term”, the method counts the frequencies of the words “period” and “technical term” in a collection of Japanese example sentences on topics in the same field, and takes the more frequent one as the translation of “term”. This method has the advantage that only the other language's sentence collection needs to be analyzed.
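The frequency-based selection of prior art document 1 can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function name `select_by_target_frequency`, the pre-tokenized toy corpus, and the romanized candidate words are all hypothetical.

```python
from collections import Counter

def select_by_target_frequency(candidates, target_sentences):
    """Pick the candidate that is most frequent in the target-language corpus.

    candidates: target-language translation candidates for one source word.
    target_sentences: list of already tokenized target-language sentences.
    """
    counts = Counter(tok for sent in target_sentences for tok in sent)
    # Ties go to the earlier candidate, because max() keeps the first maximum.
    return max(candidates, key=lambda c: counts[c])

# Hypothetical toy corpus (romanized tokens), for illustration only.
corpus = [
    ["keiyaku", "kikan", "wa", "ichinen"],
    ["kikan", "o", "enchou", "suru"],
    ["senmon-yougo", "o", "teigi", "suru"],
]
best = select_by_target_frequency(["kikan", "senmon-yougo"], corpus)
```

Note that the source word itself never enters the calculation, which is exactly the weakness the description goes on to criticize.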
[0007]
In addition, the following method is proposed in Hiroyuki Nakajima and Hiroyuki Kaji, “Automatic extraction of co-occurrence relations for translation selection using bilingual text,” Information Processing Society of Japan 39th National Convention (1989) (prior art document 2).
[0008]
First, a parallel sentence collection is prepared in which the sentences in different languages (such as Japanese and English) are translations of one another. A bilingual dictionary between the two languages is also prepared, and each word in an example sentence of the first language is looked up in the dictionary to list its translation candidates. The method then counts how often each translation candidate appears in the second-language example sentence that is the translation of that sentence, and takes the most frequent candidate as the translation of the original word. This method has the advantage that translations can be identified with high accuracy when mutually translated example sentences are available.
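A rough sketch of the parallel-corpus counting of prior art document 2 is given below. The function name, the data layout (a list of aligned token-list pairs), and the toy data are assumptions made for illustration; the cited paper may differ in its details.

```python
from collections import Counter

def select_from_parallel_corpus(source_word, dictionary, sentence_pairs):
    """Count candidates only in target sentences aligned with sentences that
    contain source_word, and return the most frequent candidate (or None)."""
    counts = Counter()
    for src_tokens, tgt_tokens in sentence_pairs:
        if source_word in src_tokens:
            for cand in dictionary[source_word]:
                counts[cand] += tgt_tokens.count(cand)
    if not counts or max(counts.values()) == 0:
        return None  # no evidence in the parallel corpus
    return counts.most_common(1)[0][0]

# Hypothetical aligned sentence pairs (romanized Japanese), for illustration only.
pairs = [
    (["the", "term", "ends"], ["kikan", "ga", "owaru"]),
    (["a", "technical", "term"], ["senmon-yougo"]),
    (["term", "of", "contract"], ["keiyaku", "kikan"]),
]
bilingual = {"term": ["kikan", "senmon-yougo"]}
choice = select_from_parallel_corpus("term", bilingual, pairs)
```

The `None` branch reflects the limitation discussed in the description: without aligned sentences that contain the word, the method has nothing to count.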
[0009]
[Problems to be solved by the invention]
However, because the method of prior art document 1 relies only on word frequencies in the other language, it has the drawback that words that are generally frequent in the other language tend to be adopted as translations.
[0010]
For example, many Japanese translations correspond to the English word “make” besides “tsukuru” (作る, to create). As one example, in order to translate “make a call” as “denwa o suru” (電話をする), the usual approach is to build the dictionary so that “call” corresponds to “denwa” (電話, telephone) and “make” corresponds to the verb “suru” (する, to do). When the dictionary is built this way, “make” has at least the translation candidates “tsukuru” and “suru”. If only the appearance frequencies of words in the other language, that is, Japanese, are measured following the method of prior art document 1, “suru” is selected as the most likely translation candidate for “make”, because “suru” generally appears more frequently than “tsukuru”. Prior art document 1 thus has the drawback that a translation that is frequent in the other language tends to be selected regardless of whether it is actually appropriate.
[0011]
The method of prior art document 2 is effective when a large number of mutually translated example sentences exist, but in practice the amount of such parallel example sentences is extremely limited. The method of prior art document 2 therefore has the drawback that it cannot be applied when a large number of parallel example sentences are not available.
[0012]
The object of the present invention is to solve the above problems of the conventional methods and to provide a method for selecting more likely translation candidates.
[0013]
[Means for Solving the Problems]
A parallel translation probability assigning device according to the present invention is used when translating a first language into a second language, and comprises: a sentence collection of the first language and a sentence collection of the second language; a first language / second language parallel translation dictionary that stores translation candidates in the second language for words in the first language as word correspondence pairs; a first language statistic calculation module that calculates statistics on the appearance of words in the first language sentence collection; a second language statistic calculation module that calculates statistics on the appearance of words in the second language sentence collection; a parallel translation probability model that, taking the parallel translation probabilities assigned to the word correspondence pairs of the dictionary as parameters, estimates statistics on the appearance of second-language words from the statistics obtained by the first language statistic calculation module from the first language sentence collection; and a parallel translation probability assigning unit that determines the parameters so as to minimize the difference between the statistics obtained by the second language statistic calculation module from the second language sentence collection and the second-language statistics estimated from the first language by the parallel translation probability model.
[0014]
In this case, as the parallel translation probability model for estimating statistics on the appearance of second-language words from statistics on the appearance of first-language words, a model may be used that takes the product of the appearance probability e(i) of a first-language word E(i) and the correspondence probability S(i, n) that the word E(i) corresponds to the second-language translation J(n), and sums this product over the first-language words E(i) to calculate the appearance probability j(n) of the word J(n) in the second language.
[0015]
Alternatively, a parallel translation probability model may be used that takes the product of the co-occurrence probability P(E(i)^E(j)) of a pair of two words appearing in one sentence in the first language and the correspondence probabilities S(i, m) and S(j, n) that the words E(i) and E(j) of the pair correspond to second-language translations, and sums this product over the first-language word pairs E(i) and E(j) to calculate the appearance probability P(J(m)^J(n)) of a pair of two words appearing in one sentence in the second language.
[0016]
Alternatively, a parallel translation probability model may be used that takes the product of the co-occurrence probability P(E(i)^E(j)) of a pair of two words in a syntactic dependency relation in the first language and the correspondence probabilities S(i, m) and S(j, n) that the words E(i) and E(j) of the pair correspond to second-language translations, and sums this product over the first-language word pairs E(i) and E(j) to calculate the appearance probability P(J(m)^J(n)) of a pair of two words in a syntactic dependency relation in the second language.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing a configuration of a translated word selection system according to a first embodiment of this invention.
[0018]
The present embodiment comprises: a first language sentence collection 1; a second language sentence collection 2; a first language / second language parallel translation dictionary 3 that stores translation candidates in the second language for words in the first language as word correspondence pairs; a first language statistic calculation module 4 that calculates statistics on the appearance of words in the first language sentence collection; a second language statistic calculation module 5 that calculates statistics on the appearance of words in the second language sentence collection; a parallel translation probability model storage unit 6 that stores a parallel translation probability model which, taking the parallel translation probabilities assigned to the word correspondence pairs of the dictionary as parameters, estimates statistics on the appearance of second-language words from the statistics obtained by the first language statistic calculation module from the first language sentence collection; and a parallel translation probability assigning unit 7 that determines the parameters so as to minimize the difference between the statistics obtained by the second language statistic calculation module from the second language sentence collection and the second-language statistics estimated from the first language by the parallel translation probability model.
[0019]
The contents and operation of each block are described below. The first language sentence collection 1 stores real example sentences in the first language, for example English. The second language sentence collection 2 stores real example sentences in the second language, for example Japanese. The first language / second language parallel translation dictionary 3 stores the second-language translation candidates for each first-language word as word correspondence pairs. FIG. 2 shows an example of the contents of the first language / second language parallel translation dictionary 3. The figure shows a case in which J(k), J(m), and J(n) exist as second-language translation candidates corresponding to the first-language word E(i).
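One possible in-memory layout for the dictionary and its per-pair probabilities is sketched below. The nested-dict layout, the function name `init_translation_probs`, and the uniform starting values are assumptions for illustration; the description does not prescribe a storage format or an initial value.

```python
def init_translation_probs(dictionary):
    """Give each word correspondence pair (E(i), J(m)) a uniform initial S(i, m).

    dictionary: maps a first-language word to its list of second-language
    translation candidates. Words with no candidates are skipped.
    """
    return {
        src: {tgt: 1.0 / len(cands) for tgt in cands}
        for src, cands in dictionary.items()
        if cands
    }

# Romanized candidates; the entries mirror the "do" / "make" example of FIG. 3.
dictionary = {"do": ["suru"], "make": ["tsukuru", "suru"]}
S0 = init_translation_probs(dictionary)
```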
[0020]
In this figure, e(i) is the appearance probability of the first-language word E(i), and j(k), j(m), and j(n) are the appearance probabilities of the second-language words J(k), J(m), and J(n), respectively. S(i, k), S(i, m), and S(i, n) are the probabilities that the first-language word E(i) is translated into the second-language words J(k), J(m), and J(n), respectively.
[0021]
The first language statistic calculation module 4 calculates statistics on the appearance of words in the first language sentence collection 1. The second language statistic calculation module 5 calculates statistics on the appearance of words in the second language sentence collection 2. As necessary, the two modules perform morphological analysis or syntactic analysis on the sentences in the first language sentence collection 1 and the second language sentence collection 2, and then calculate statistics on the appearance of the words they contain. Examples of such statistics include the appearance probability of each word and the co-occurrence probability that two words appear together.
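As a minimal sketch of the simplest such statistic, the appearance probability of each word can be estimated as its relative frequency. The function name `appearance_probs` and the normalization by total token count are assumptions; the description does not fix how the probability is estimated, and the input is assumed to be already tokenized (that is, morphological analysis has been done).

```python
from collections import Counter

def appearance_probs(sentences):
    """Relative frequency of each word over all tokens in the collection."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Hypothetical tokenized first-language sentences.
english = [["make", "a", "call"], ["do", "it"], ["make", "it"]]
e = appearance_probs(english)
```

The same function would serve for the second-language module, applied to the second-language sentence collection.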
[0022]
The parallel translation probability model storage unit 6 stores a parallel translation probability model that estimates statistics on the appearance of second-language words from statistics on the appearance of first-language words. Taking the parallel translation probabilities assigned to the word correspondence pairs of the dictionary as parameters, the model estimates statistics on the appearance of second-language words from the statistics that the first language statistic calculation module obtains from the first language sentence collection.
[0023]
The parallel translation probability assigning unit 7 adjusts the parallel translation probability parameters assigned to the word correspondence pairs of the dictionary so as to minimize the difference between the statistics obtained by the second language statistic calculation module from the second language sentence collection and the second-language statistics estimated from the first language by the parallel translation probability model.
[0024]
FIG. 6 is a diagram for explaining a second embodiment of the present invention.
[0025]
In FIG. 6, the second embodiment of the present invention comprises an input device 101, a data processing device 102 constituted by a computer, an output device 103, a storage device 104, and a storage medium 105 on which a translation word selection program is recorded. The storage medium 105 is a magnetic disk, magnetic tape, optical disk, semiconductor memory, or other storage medium.
[0026]
The translated word selection program is read from the storage medium 105 into the main storage device of the data processing device 102 and controls the operation of the data processing device 102. The data processing device 102 performs the following processing under the control of the translation word selection program.
[0027]
When a word for which a translation word is to be selected is input from the input device 101, the first language statistic calculation module 4 and the second language statistic calculation module 5 are activated. The first language statistic calculation module 4 calculates a statistic regarding the appearance of a word in the sentence collection 1 of the first language. The second language statistic calculation module 5 calculates a statistic regarding the appearance of words in the second language sentence collection 2.
[0028]
Next, the parallel translation probability assigning unit 7 is activated. It adjusts the parallel translation probability parameters assigned to the word correspondence pairs of the dictionary so as to minimize the difference between the statistics obtained by the second language statistic calculation module from the second language sentence collection and the second-language statistics estimated from the first language by the parallel translation probability model stored in the parallel translation probability model storage unit 6.
[0029]
The translated word is output from the output device 103 according to the value of the parallel translation probability parameter obtained as a result.
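The final output step might look like the sketch below. Choosing the candidate with the highest probability is an assumption; the description only says that the translation is output according to the parameter values. The function name `select_translation` and the nested-dict layout of `S` are hypothetical.

```python
def select_translation(word, S):
    """Return the candidate with the highest parallel translation probability,
    or None if the word has no candidates in S."""
    candidates = S.get(word)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Probabilities as in the FIG. 5 example (romanized Japanese candidates).
S = {"do": {"suru": 1.0}, "make": {"tsukuru": 0.9, "suru": 0.1}}
answer = select_translation("make", S)
```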
[0030]
Next, the operation of the present invention will be described using an example of the parallel translation probability model stored in the parallel translation probability model storage unit 6 in the embodiment shown in FIG. The following expression is an expression representing an example of a parallel translation probability model.
[0031]
[Expression 1]
$$ j(m) = \sum_{i} e(i)\, S(i, m) $$
In Expression 1, e(i) represents the appearance probability of the i-th word E(i) in the first language, and j(m) represents the appearance probability of the m-th word J(m) in the second language. S(i, m) represents the probability that the i-th word E(i) in the first language is translated into the m-th word J(m) in the second language. The expression represents a model in which the sum, over the first-language words, of the product of each word's appearance probability and its translation probability gives the appearance probability of each second-language word.
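The model just described, in which each second-language appearance probability is the sum over first-language words of e(i) times S(i, m), can be sketched as a small function. The name `predict_target_probs`, the dict-based layout, and the placeholder words `E1`, `E2`, `J1`, `J2` are hypothetical and only serve to illustrate the summation.

```python
def predict_target_probs(e, S):
    """Estimate j(m) as the sum over i of e(i) * S(i, m).

    e: maps a first-language word to its appearance probability e(i).
    S: maps a first-language word to {second-language word: S(i, m)}.
    """
    j = {}
    for src, p_src in e.items():
        for tgt, s in S.get(src, {}).items():
            j[tgt] = j.get(tgt, 0.0) + p_src * s
    return j

# Hypothetical values: E1 has one candidate, E2 has two.
e = {"E1": 0.6, "E2": 0.4}
S = {"E1": {"J1": 1.0}, "E2": {"J1": 0.25, "J2": 0.75}}
j_hat = predict_target_probs(e, S)
```

Here J1 receives probability mass from both E1 and E2, which is the situation the model is designed to disentangle.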
[0032]
S(i, m) in Expression 1 is the parameter of this parallel translation probability model: the parallel translation probability assigned to the word correspondence pair of the first-language word E(i) and the second-language translation candidate J(m). Under the assumption that every first-language word always corresponds to some second-language word, this parameter is subject to the constraint
[0033]
[Expression 2]
$$ \sum_{m} S(i, m) = 1 \quad \text{for every first-language word } E(i). $$
[0034]
To obtain the parallel translation probability of each word with this model, the first language statistic calculation module 4 calculates the word appearance probabilities e(i) in the first language sentence collection 1, the second language statistic calculation module 5 calculates the word appearance probabilities j(m) in the second language sentence collection 2, and the values of e(i) and j(m) obtained in this way are substituted into the model equation above to determine the parameters S(i, m) that satisfy the above constraint.
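One way to carry out this parameter determination is sketched below. The description does not name a distance measure or an optimizer, so both choices here are assumptions: the difference is taken as the sum over second-language words of the squared gap between the model estimate and the observed j(m), and it is minimized by projected gradient descent, with each word's probabilities projected back onto "non-negative and summing to 1". All function names are hypothetical, and the target values 0.182 and 0.018 are made-up numbers chosen so that an exact fit exists.

```python
def project_to_simplex(v):
    """Euclidean projection of a list of floats onto {x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def fit_translation_probs(e, j, dictionary, steps=2000):
    """Minimize sum_m (sum_i e(i) S(i, m) - j(m))^2 over S, keeping each
    word's candidate probabilities on the simplex. Every dictionary entry
    is assumed to have at least one candidate."""
    S = {src: [1.0 / len(c)] * len(c) for src, c in dictionary.items()}
    # Step size 1/L, where L = 2 * sum_i e(i)^2 bounds the curvature.
    lr = 1.0 / max(2.0 * sum(p * p for p in e.values()), 1e-12)
    for _ in range(steps):
        pred = {}
        for src, cands in dictionary.items():
            for tgt, s in zip(cands, S[src]):
                pred[tgt] = pred.get(tgt, 0.0) + e.get(src, 0.0) * s
        resid = {tgt: p - j.get(tgt, 0.0) for tgt, p in pred.items()}
        for src, cands in dictionary.items():
            stepped = [s - lr * 2.0 * e.get(src, 0.0) * resid[tgt]
                       for tgt, s in zip(cands, S[src])]
            S[src] = project_to_simplex(stepped)
    return {src: dict(zip(cands, S[src])) for src, cands in dictionary.items()}

dictionary = {"do": ["suru"], "make": ["tsukuru", "suru"]}
e = {"do": 0.18, "make": 0.02}
j = {"suru": 0.182, "tsukuru": 0.018}  # hypothetical targets, not from FIG. 4
S_fit = fit_translation_probs(e, j, dictionary)
```

With real corpora an exact fit will generally not exist, and the fitted values then depend on the chosen distance measure.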
[0035]
Next, the difference between the present invention and the conventional method will be described with reference to FIGS. 3, 4, and 5. Here, as an example, consider a case where an English word is translated into a Japanese word.
[0036]
FIG. 3 shows the word correspondence pairs of the English words “do” and “make” in the first language / second language parallel translation dictionary 3. For simplicity, assume that the English word “do” forms a word correspondence pair with only one Japanese translation candidate, “suru” (する), while the English word “make” forms word correspondence pairs with two Japanese translation candidates, “tsukuru” (つくる) and “suru”.
[0037]
FIG. 4 illustrates, using “do” and “make” with the word correspondence pairs shown in FIG. 3, the operation of the conventional method of prior art document 1, which selects the translation of a first-language word using only the appearance frequencies of words in the second language sentence collection. FIG. 4 shows a hypothetical situation in which the appearance probabilities of “suru” and “tsukuru” in the second language sentence collection are 0.20 and 0.01, respectively. In this case, the conventional method of prior art document 1 gives priority to the high-probability word “suru” over the word “tsukuru” as the translation of “make”.
[0038]
FIG. 5 illustrates the operation of the present invention, again using “do” and “make” with the word correspondence pairs shown in FIG. 3. FIG. 5 shows a hypothetical situation in which the appearance probabilities of “do” and “make” in the first language sentence collection are 0.18 and 0.02, respectively. The appearance probabilities of “suru” and “tsukuru” in the second language sentence collection are assumed to be 0.20 and 0.01, respectively, as in FIG. 4.
[0039]
The present invention uses a parallel translation probability model such as the one in the above expression to perform a calculation that takes into account the influence of other English words that share the same Japanese word as a translation. When the probabilities that the English word “make” is translated into “suru” and into “tsukuru” are calculated in this way, the translation probability from “make” to “suru” comes out low even though “suru” is frequent, as in this example, because most of that frequency is accounted for by translations from the English word “do”. FIG. 5 shows a case in which the resulting translation probability from “make” to “tsukuru” is 0.9 and that from “make” to “suru” is 0.1.
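As a check on this reasoning, the model of Expression 1 can be evaluated with the values shown in FIG. 5. The translation probability from “do” to “suru” (する) is taken as 1, because “suru” is the only candidate for “do” and the probabilities for each word must sum to 1. The hat notation for model estimates is introduced here only for illustration.

$$ \hat{j}(\text{suru}) = 0.18 \times 1 + 0.02 \times 0.1 = 0.182, \qquad \hat{j}(\text{tsukuru}) = 0.02 \times 0.9 = 0.018 $$

Of the 0.182 estimated for “suru”, 0.18 comes from “do” and only 0.002 from “make”, which is the sense in which most of the frequency of “suru” is attributed to “do”. These estimates differ somewhat from the corpus values 0.20 and 0.01, which the description presents only as hypothetical figures.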
[0040]
Next, the operation of the present invention will be described using a second example of the parallel translation probability model stored in the parallel translation probability model storage unit 6. The following expression is another example of a parallel translation probability model.
[0041]
[Expression 3]
$$ P(J(m) \wedge J(n)) = \sum_{i} \sum_{j} P(E(i) \wedge E(j))\, S(i, m)\, S(j, n) $$
In Expression 3, P(E(i)^E(j)) represents the co-occurrence probability that the words E(i) and E(j) appear together in the first language, and P(J(m)^J(n)) represents the co-occurrence probability that the words J(m) and J(n) appear together in the second language. Expression 3 represents a model in which the sum, over first-language word pairs, of the product of the pair's co-occurrence probability and the parallel translation probabilities of its two words gives the co-occurrence probability of two words in the second language.
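The pair model can be sketched in the same style as the single-word model: each first-language pair distributes its co-occurrence probability over second-language pairs in proportion to the product of the two translation probabilities. The function name `predict_pair_probs`, the treatment of pairs as ordered tuples, and all numbers below are assumptions made for illustration.

```python
def predict_pair_probs(pair_probs, S):
    """Estimate P(J(m)^J(n)) as the sum over (i, j) of
    P(E(i)^E(j)) * S(i, m) * S(j, n).

    pair_probs: maps (first word, second word) to a co-occurrence probability.
    S: maps a first-language word to {second-language word: probability}.
    """
    out = {}
    for (w_i, w_j), p in pair_probs.items():
        for t_m, s_im in S.get(w_i, {}).items():
            for t_n, s_jn in S.get(w_j, {}).items():
                key = (t_m, t_n)
                out[key] = out.get(key, 0.0) + p * s_im * s_jn
    return out

# Hypothetical co-occurrence probabilities and translation probabilities.
pair_probs = {("make", "call"): 0.01, ("make", "cake"): 0.03}
S = {
    "make": {"suru": 0.1, "tsukuru": 0.9},
    "call": {"denwa": 1.0},
    "cake": {"keeki": 1.0},
}
pairs_hat = predict_pair_probs(pair_probs, S)
```

Because each word's translation probabilities sum to 1, the total probability mass of the input pairs is preserved in the output.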
[0042]
In this equation, S(i, m) and S(j, n) are the parameters of the parallel translation probability model. S(i, m) is the translation probability assigned to the word correspondence pair of the first-language word E(i) and the second-language candidate word J(m), and S(j, n) is the translation probability assigned to the word correspondence pair of the first-language word E(j) and the second-language candidate word J(n). Assuming that a first-language word always corresponds to some second-language word, these parameters are subject to the following constraint.
[0043]
[Equation 4]
$$\sum_{m} S(i, m) = 1 \quad \text{for every } i$$
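As a minimal sketch of how this model might be evaluated, the following code estimates the second-language co-occurrence probabilities by summing P(E(i) ^ E(j))·S(i, m)·S(j, n) over the first-language word pairs, as described for Equation 3. It reads the restriction as requiring the translation probabilities of each first-language word to sum to 1, which is an assumption of this sketch. The vocabulary, the probabilities, and the function names are all hypothetical.

```python
def estimate_cooccurrence(p_e, s):
    """Estimate P(J(m) ^ J(n)) from P(E(i) ^ E(j)) and S.

    p_e: dict mapping (E(i), E(j)) -> co-occurrence probability.
    s:   dict mapping E(i) -> {J(m): S(i, m)}.
    """
    p_j = {}
    for (ei, ej), prob in p_e.items():
        for jm, s_im in s[ei].items():
            for jn, s_jn in s[ej].items():
                p_j[(jm, jn)] = p_j.get((jm, jn), 0.0) + prob * s_im * s_jn
    return p_j


def satisfies_constraint(s, tol=1e-9):
    """Check that each first-language word's probabilities sum to 1 (assumed)."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in s.values())


# Hypothetical toy data: English word pairs and their co-occurrence probabilities.
p_e = {("make", "call"): 0.6, ("make", "cake"): 0.4}
s = {
    "make": {"suru": 0.5, "tsukuru": 0.5},   # hypothetical starting values
    "call": {"denwa": 1.0},
    "cake": {"keeki": 1.0},
}

assert satisfies_constraint(s)
p_j = estimate_cooccurrence(p_e, s)
for pair, prob in sorted(p_j.items()):
    print(pair, round(prob, 3))
```

Comparing `p_j` with the co-occurrence probabilities actually measured in the second-language collection gives the difference that the parameter adjustment would try to reduce.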
[0044]
When the translation probability of each word is obtained using this parallel translation probability model, several types of co-occurrence of two words can be considered. One type is intra-sentence co-occurrence, in which two words appear together in one sentence.
[0045]
In this case, the first language statistic calculation module 4 calculates the intra-sentence co-occurrence probability P(E(i) ^ E(j)) of two words in the first language sentence example collection 1, and the second language statistic calculation module 5 calculates the intra-sentence co-occurrence probability P(J(m) ^ J(n)) of two words in the second language sentence example collection 2. P(E(i) ^ E(j)) and P(J(m) ^ J(n)) are then substituted into the above parallel translation probability model equation to determine the parameters S(i, m) that satisfy the above constraint.
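One way the intra-sentence co-occurrence statistic might be computed is sketched below. The specification does not state how the probability is normalized, so this sketch assumes that each distinct unordered word pair is counted once per sentence and divided by the total number of counted pairs. The tokenized sentences and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def sentence_cooccurrence(sentences):
    """Relative frequency of unordered word pairs sharing a sentence.

    sentences: list of token lists. Each distinct pair is counted once
    per sentence, and counts are normalized by the total pair count
    (one possible normalization; an assumption of this sketch).
    """
    counts = Counter()
    for tokens in sentences:
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()} if total else {}


# Hypothetical tokenized sentences from a first-language collection.
corpus = [["make", "a", "call"], ["make", "a", "cake"]]
p_e = sentence_cooccurrence(corpus)
for pair, prob in sorted(p_e.items()):
    print(pair, round(prob, 3))
```

The same function could be applied to the second-language collection to obtain the corresponding statistic there.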
[0046]
Another type of co-occurrence is dependency co-occurrence, in which two words are in a syntactic dependency relationship with each other. In this case, the first language statistic calculation module 4 calculates the dependency co-occurrence probability P(E(i) ^ E(j)) of two words in the first language sentence example collection 1, and the second language statistic calculation module 5 calculates the dependency co-occurrence probability P(J(m) ^ J(n)) of two words in the second language sentence example collection 2. P(E(i) ^ E(j)) and P(J(m) ^ J(n)) are then substituted into the above parallel translation probability model equation to determine the parameters S(i, m) that satisfy the above constraint.
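A dependency co-occurrence statistic could be computed along the following lines. This sketch assumes that (head, dependent) word pairs have already been produced by an external syntactic parser, which is not shown, and that the probability is the relative frequency of each pair. The input pairs and the function name are hypothetical.

```python
from collections import Counter


def dependency_cooccurrence(dependencies):
    """Relative frequency of (head, dependent) word pairs.

    dependencies: iterable of (head, dependent) tuples, assumed to come
    from an external syntactic parser (not shown here).
    """
    counts = Counter(dependencies)
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()} if total else {}


# Hypothetical parser output: verbs paired with their object nouns.
deps = [("make", "call"), ("make", "cake"), ("make", "call"), ("bake", "cake")]
p_dep = dependency_cooccurrence(deps)
for pair, prob in sorted(p_dep.items()):
    print(pair, prob)
```

Unlike the intra-sentence case, only word pairs linked by the parser contribute, so unrelated words that merely share a sentence are ignored.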
[0047]
[Effect of the invention]
According to the present invention, when the translation of a word in the first language is determined, the parallel translation probabilities of all the words in both the first language and the second language are taken into account, unlike in Prior Art Document 1. This solves the disadvantage of Prior Art Document 1 that a word with a high appearance probability in the second language is easily selected as the translation.
[0048]
Moreover, since the first-language and second-language sentence collections used in the present invention need not be in a translation relationship with each other, they can be collected in large quantities. This solves the disadvantage of Prior Art Document 2 that it cannot be applied unless a collection of sentence examples in a translation relationship with each other is available.
[0049]
Furthermore, in the present invention, since the translation probability is obtained by using intra-sentence co-occurrence, the accuracy of translation in the case of a compound word or the like is improved as compared with the case where only the translation probability of a single word is used.
[0050]
Further, in the present invention, since the parallel translation probability is obtained using dependency co-occurrence, the translation accuracy is improved when the verb and its case element noun are paired to determine the translated word.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a translated word selection system according to a first embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of the contents of a first language / second language parallel translation dictionary 3 in the embodiment illustrated in FIG. 1.
FIG. 3 is a diagram showing an example of the contents of a first language / second language parallel translation dictionary 3 for comparing the operation of the conventional method and the present invention.
FIG. 4 is a diagram showing an example of the contents of a first language / second language parallel translation dictionary 3 for explaining the operation of the conventional method.
FIG. 5 is a diagram showing an example of the contents of a first language / second language parallel translation dictionary 3 for explaining the operation of the present invention.
FIG. 6 is a block diagram showing a configuration of a translated word selection system according to a second embodiment of the present invention.
[Explanation of symbols]
1 First language sentence example collection
2 Second language sentence example collection
3 First language / second language parallel translation dictionary
4 First language statistic calculation module
5 Second language statistic calculation module
6 Parallel translation probability model storage unit
7 Parallel translation probability assigning unit
101 Input device
102 Data processing device
103 Output device
104 Storage device
105 Recording medium

Claims

1. A parallel translation probability assigning device used when translating a first language into a second language, comprising:
a first language sentence collection and a second language sentence collection;
a first language / second language parallel translation dictionary that stores translation candidates of the second language for the words of the first language as word correspondence pairs;
a first language statistic calculation module for calculating a statistic relating to the appearance of a word in the first language sentence collection;
a second language statistic calculation module for calculating a statistic relating to the appearance of a word in the second language sentence collection; and
a parallel translation probability assigning unit that, using as parameters the parallel translation probabilities assigned to each word correspondence pair of the parallel translation dictionary, estimates a statistic relating to the appearance of a word in the second language from the statistic obtained by the first language statistic calculation module, based on an expression stored in a parallel translation probability model storage unit, and adjusts the parameters so as to minimize the difference between the estimated statistic and the statistic obtained by the second language statistic calculation module.

2. The parallel translation probability assigning device according to claim 1, wherein the expression stored in the parallel translation probability model storage unit is an expression that calculates the appearance probability j(n) of a word J(n) in the second language by taking the product of the appearance probability e(i) of a word E(i) in the first language and the correspondence probability S(i, n) with which the word E(i) corresponds to the translation J(n) in the second language, and summing these products over the words E(i) of the first language.

3. The parallel translation probability assigning device according to claim 1, wherein the expression stored in the parallel translation probability model storage unit is an expression that calculates the appearance probability P(J(m) ^ J(n)) of a pair of two words appearing in one sentence in the second language by taking the product of the co-occurrence probability P(E(i) ^ E(j)) of a pair of two words appearing in one sentence in the first language and the correspondence probabilities S(i, m) and S(j, n) with which the words E(i) and E(j) constituting the word pair correspond to translations in the second language, and summing these products over the word pairs E(i) and E(j) of the first language.

4. The parallel translation probability assigning device according to claim 1, wherein the expression stored in the parallel translation probability model storage unit is an expression that calculates the appearance probability P(J(m) ^ J(n)) of a pair of two words having a syntactic dependency in the second language by taking the product of the co-occurrence probability P(E(i) ^ E(j)) of a pair of two words having a syntactic dependency in the first language and the correspondence probabilities S(i, m) and S(j, n) with which the words E(i) and E(j) constituting the word pair correspond to translations in the second language, and summing these products over the word pairs E(i) and E(j) of the first language.

5. A program for use with a first language sentence collection, a second language sentence collection, and
a first language / second language parallel translation dictionary that stores translation candidates of the second language for the words of the first language as word correspondence pairs, the program causing execution of:
(a) a process of calculating a statistic relating to the appearance of a word in the first language sentence collection,
(b) a process of calculating a statistic relating to the appearance of a word in the second language sentence collection,
(c) a process of estimating a statistic relating to the appearance of a word in the second language from the statistic calculated in (a), using as parameters the parallel translation probabilities assigned to each word correspondence pair of the first language / second language parallel translation dictionary, and
(d) a process of adjusting the parameters so as to minimize the difference between the statistic calculated in (b) and the statistic estimated in (c).

6. The program according to claim 5, wherein the process of (c) calculates the appearance probability j(n) of a word J(n) in the second language by taking the product of the appearance probability e(i) of a word E(i) in the first language and the correspondence probability S(i, n) with which the word E(i) corresponds to the translation J(n) in the second language, and summing these products over the words E(i) of the first language.

7. The program according to claim 5, wherein the process of (c) calculates the appearance probability P(J(m) ^ J(n)) of a pair of two words appearing in one sentence in the second language by taking the product of the co-occurrence probability P(E(i) ^ E(j)) of a pair of two words appearing in one sentence in the first language and the correspondence probabilities S(i, m) and S(j, n) with which the words E(i) and E(j) constituting the word pair correspond to translations in the second language, and summing these products over the word pairs E(i) and E(j) of the first language.

8. The program according to claim 5, wherein the process of (c) calculates the appearance probability P(J(m) ^ J(n)) of a pair of two words having a syntactic dependency in the second language by taking the product of the co-occurrence probability P(E(i) ^ E(j)) of a pair of two words having a syntactic dependency in the first language and the correspondence probabilities S(i, m) and S(j, n) with which the words E(i) and E(j) constituting the word pair correspond to translations in the second language, and summing these products over the word pairs E(i) and E(j) of the first language.

9. A parallel translation probability assigning method in which a parallel translation probability assigning device having a first language sentence collection, a second language sentence collection, and
a first language / second language parallel translation dictionary that stores translation candidates of the second language for the words of the first language as word correspondence pairs:
(a) calculates a statistic relating to the appearance of a word in the first language sentence collection,
(b) calculates a statistic relating to the appearance of a word in the second language sentence collection,
(c) performs a calculation that estimates a statistic relating to the appearance of a word in the second language from the statistic calculated in (a), using as parameters the parallel translation probabilities assigned to each word correspondence pair of the first language / second language parallel translation dictionary, and
(d) adjusts the parameters so as to minimize the difference between the statistic calculated in (b) and the statistic estimated in (c).