JPH09160918A - Translated sentence corresponding method and device therefor

Info

Publication number: JPH09160918A
Application number: JP7324562A
Authority: JP
Inventors: Masahiko Haruno; 雅彦春野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-12-13
Filing date: 1995-12-13
Publication date: 1997-06-20

Abstract

PROBLEM TO BE SOLVED: To align the sentences of a wide range of bilingual texts with high precision. SOLUTION: An input unit 110 reads corresponding texts in two languages, such as Japanese and English, from a storage device 10 or the like. A morphological analysis unit 120 performs morphological analysis on each input text. A similarity calculation unit 130 computes, from the morphological analysis results, the similarity between words of the two languages as their mutual information within the texts, and then selects highly reliable word pairs by a statistical test. A sentence correspondence estimation unit 140 narrows down the candidate sentence correspondences using this similarity and an existing bilingual dictionary. A post-processing unit 150 selects, from the narrowed candidates, the sentence pairs that have a prescribed support count. An output unit 160 outputs the selected sentence pairs to a storage device 20 or the like.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a bilingual sentence alignment method and apparatus, and more particularly to a bilingual sentence alignment method and apparatus that are used in natural language systems such as machine translation and knowledge base systems and that automatically learn knowledge from bilingual texts.

[0002]

[Prior Art] Conventional bilingual sentence alignment has been carried out mainly between languages whose structure and vocabulary are very close, such as English and French, and has generally relied on information such as the number of words or characters in each sentence. For language pairs such as Japanese and English, on the other hand, there are methods that use only a bilingual dictionary, and methods that use dynamic programming with a bilingual dictionary first and then apply statistics as post-processing.

[0003]

[Problems to Be Solved by the Invention] As described above, conventional bilingual sentence alignment methods have dealt with texts that are structurally similar and relatively easy to align. However, between languages such as Japanese and English, whose structures and ways of thinking differ completely, even a faithfully translated bilingual text usually differs in organization or has content deleted. In such cases it is important to combine statistical information derived from the data appropriately with a dictionary, which is an existing knowledge source. The strengths and weaknesses of statistical information and dictionary information can be summarized as follows.

[0004] Strengths of statistical information: Because data-dependent information can be acquired, translation relations appropriate to the context of the text can be found. In addition, for languages such as Japanese that require word segmentation (morphological analysis), information can be extracted even when the segmentation is wrong.
Weaknesses of statistical information: To obtain reliable statistics, the target word must appear several times in the data. Since many words appear only once or twice, the words for which statistics can be gathered are limited.
Strengths of dictionary information: Information can be obtained even for words that appear only once.
Weaknesses of dictionary information: A word can have many possible translations, and the one used in the data is not necessarily listed in the bilingual dictionary. Moreover, if the morphological analysis is wrong, correct dictionary lookup is impossible.
As these points show, the strengths and weaknesses of statistical information and dictionary information are complementary.

[0005] An object of the present invention is to solve the above problems and to provide a highly accurate bilingual sentence alignment method and apparatus that appropriately combine statistical information and dictionary information.

[0006]

[Means for Solving the Problems] In the present invention, when corresponding texts in two languages are given, the similarity calculation means computes the similarity between words of the two languages as their mutual information in the data, and selects only the highly reliable pairs by a statistical test such as a t-test. Next, the sentence correspondence estimation means uses this similarity together with the information in an existing bilingual dictionary to narrow down the range of possible sentence correspondences. Using the narrowed information, the similarity calculation means and the sentence correspondence estimation means then repeat the above operations. Through this repetition the set of candidate sentence pairs is narrowed step by step, and the desired sentence alignment is finally obtained.

[0007]

[Embodiments of the Invention] An embodiment of the present invention is described below for the case where corresponding Japanese and English texts are given.

[0008] FIG. 1 shows the system configuration of a bilingual sentence alignment apparatus according to an embodiment of the present invention. The bilingual alignment apparatus 100 comprises an input unit 110, a morphological analysis unit 120, a similarity calculation unit 130, a sentence correspondence estimation unit 140, a post-processing unit 150, an output unit 160, a storage unit 170 used as a work area by these units, and an existing bilingual dictionary 180. Reference numeral 10 denotes a storage device holding the corresponding Japanese and English text data, and 20 denotes a storage device in which the aligned bilingual sentence pairs are stored. The source of the corresponding text data need not be a storage device.

[0009] The input unit 110 reads the corresponding Japanese and English texts from the storage device 10 or the like and stores them in a predetermined work area of the storage unit 170. The morphological analysis unit 120 retrieves the Japanese and English texts from that work area, performs morphological analysis on each, and stores the results in a predetermined work area of the storage unit 170. The similarity calculation unit 130 computes the candidate word correspondences between the two languages from the morphological analysis results in the work area, obtains their mutual information, selects highly reliable word pairs by a statistical test (t-test), and stores them in a predetermined work area of the storage unit 170. For the word pairs in the work area, the sentence correspondence estimation unit 140 uses the bilingual dictionary 180 prepared in advance to count the number of times the correspondence between Japanese sentence i and English sentence j is supported, narrows down the candidate sentence correspondences with a predetermined threshold, and stores them in a predetermined work area of the storage unit 170. The post-processing unit 150 selects, from the candidate sentence correspondences in the work area, the sentence pairs having a predetermined support count and stores them in a predetermined work area of the storage unit 170. The output unit 160 outputs the sentence pairs selected by the post-processing unit 150 from the work area of the storage unit 170 to the storage device 20.

[0010] FIG. 2 shows how the similarity calculation unit 130, the sentence correspondence estimation unit 140, and the post-processing unit 150 of FIG. 1 in particular are connected. The similarity calculation unit 130 and the sentence correspondence estimation unit 140 form a loop through the work area of the storage unit 170, and repeating the processing of these two units narrows the range of sentence correspondences.

[0011] FIG. 3 shows the sequence of processing steps in this embodiment. First, the morphological analysis unit 120 performs morphological analysis on both the Japanese text and the corresponding English text and selects only the words with the required parts of speech (step 300). Only the words extracted here are used in the subsequent alignment. An initial set of candidate sentence correspondences is also created from the numbers of sentences in the input Japanese and English texts. In this initial relation the beginnings of the two texts correspond to each other, as do their ends, and all other correspondences are given a width. The width is small at both ends of the text and larger toward the middle.

[0012] FIG. 4 shows an example of these candidate correspondences. The width is small at both ends of the text and grows toward the middle. There are as many candidate relations as there are Japanese sentences.
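The initial candidate relation described above can be sketched as follows. This is a minimal illustration only: the function name `initial_candidates`, the `max_width` parameter, the proportional diagonal, and the triangular width profile are all assumptions, since the text states only that the width is small at the ends and larger toward the middle.

```python
def initial_candidates(n_ja, n_en, max_width=5):
    """For each Japanese sentence index, list the candidate English sentence indices.

    Hypothetical sketch: the exact window shape is not specified in the source.
    """
    candidates = []
    for i in range(n_ja):
        # Relative position of sentence i in the Japanese text, from 0.0 to 1.0.
        pos = i / (n_ja - 1) if n_ja > 1 else 0.0
        # Assumed diagonal: the English sentence at the same relative position.
        center = round(pos * (n_en - 1))
        # Assumed triangular profile: width 0 at both ends, max_width in the middle.
        width = round(max_width * (1.0 - abs(2.0 * pos - 1.0)))
        lo = max(0, center - width)
        hi = min(n_en - 1, center + width)
        candidates.append(list(range(lo, hi + 1)))
    return candidates


if __name__ == "__main__":
    for i, cand in enumerate(initial_candidates(5, 5, max_width=2)):
        print(i, cand)
```

With this profile the first and last sentences of the two texts are tied to each other, and one list is produced per Japanese sentence, in line with the statement that there are as many candidate relations as Japanese sentences.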

[0013] Next, the similarity calculation unit 130 estimates word correspondences from the candidate correspondences (step 310). The operation of the similarity calculation unit 130 is described below.

[0014] Let J_ij be the j-th Japanese word in correspondence pair i, and E_ik the k-th English word in correspondence pair i. Let N(J_ij) be the number of correspondence pairs in which the word J_ij appears, with bookkeeping to ensure that a single occurrence of a word is not counted twice across multiple pairs. The similarity between the Japanese word J_ij and the English word E_ik in correspondence pair i is then given by the following mutual information I(J_ij, E_ik), where Pr denotes probability and n is the total number of pairs (that is, the number of sentences in the Japanese text).

[0015]

[Equation 1]
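The body of Equation 1 is not reproduced in this text. Going by the surrounding definitions (Pr as a probability, n as the total number of pairs, and N(·) as a pair count), it is presumably the usual pointwise mutual information. The following is a reconstruction under that assumption; the joint count N(J_ij, E_ik), meaning the number of pairs containing both words, and the base of the logarithm are assumptions not stated in the surviving text.

```latex
I(J_{ij}, E_{ik}) = \log \frac{\Pr(J_{ij}, E_{ik})}{\Pr(J_{ij})\,\Pr(E_{ik})},
\qquad
\Pr(J_{ij}) = \frac{N(J_{ij})}{n},\quad
\Pr(E_{ik}) = \frac{N(E_{ik})}{n},\quad
\Pr(J_{ij}, E_{ik}) = \frac{N(J_{ij}, E_{ik})}{n}
```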

[0016] This mutual information expresses how often the words J_ij and E_ik occur together, and it can be used to measure how close a Japanese word and an English word are. Mutual information can, however, also be large for low-frequency words, and such values are statistically unreliable. The similarity calculation unit 130 therefore also applies a statistical test (t-test) and keeps only the highly reliable word pairs. The mutual information is computed for every combination of words (with the required parts of speech) contained in the candidate correspondences.
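As an illustration of this step, the sketch below computes mutual information from pair counts and filters the result with a t-score. It is not the patent's implementation: the function name `score_word_pairs`, the base-2 logarithm, the particular t-score approximation, and the 1.65 cutoff are assumptions chosen for the example, because the text names the t-test without giving its formula or threshold.

```python
import math
from collections import Counter
from itertools import product


def score_word_pairs(pairs, t_threshold=1.65):
    """Return {(ja_word, en_word): mutual_information} for reliable word pairs.

    `pairs` is a list of (japanese_words, english_words) tuples, one per
    correspondence pair; english_words holds the words of all candidate
    English sentences for that Japanese sentence.
    """
    n = len(pairs)
    ja_count, en_count, joint = Counter(), Counter(), Counter()
    for ja_words, en_words in pairs:
        # Sets ensure a word is counted at most once per correspondence pair.
        ja_set, en_set = set(ja_words), set(en_words)
        ja_count.update(ja_set)
        en_count.update(en_set)
        joint.update(product(ja_set, en_set))

    scored = {}
    for (j, e), n_je in joint.items():
        p_j, p_e, p_je = ja_count[j] / n, en_count[e] / n, n_je / n
        mi = math.log2(p_je / (p_j * p_e))
        # Assumed t-score approximation for co-occurrence; not given in the source.
        t = (p_je - p_j * p_e) / math.sqrt(p_je / n)
        if t >= t_threshold:
            scored[(j, e)] = mi
    return scored
```

In a toy corpus of 20 pairs, a word pair that co-occurs in five of them passes the filter, whereas a pair seen only once is dropped despite its higher mutual information, which is the low-frequency case the paragraph above warns about.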

[0017] Next, the sentence correspondence estimation unit 140 narrows down the candidate sentence correspondences using the word pairs obtained by the similarity calculation unit 130 and the existing bilingual dictionary (step 320).

[0018] The sentence correspondence estimation unit 140 counts the number of times the correspondence between Japanese sentence i and English sentence j is supported, using the steps below. st and dic are externally supplied parameters: the points added when a correspondence is supported by the statistics and by the bilingual dictionary, respectively. Normally st is set larger than dic.
Step 1: Apply this operation to the word pairs in descending order of mutual information. That is, if Japanese sentence i and English sentence j contain the word pair, and no other English sentence in the correspondence pair of Japanese sentence i contains that English word, add st to the combination of Japanese sentence i and English sentence j. A non-crossing condition can also be added to this step.
Step 2: If Japanese sentence i and English sentence j contain a word pair from the bilingual dictionary, and no other English sentence in the correspondence pair of Japanese sentence i contains that English word, add dic to the combination of Japanese sentence i and English sentence j.
Step 3: Correspondences whose score from Steps 1 and 2 exceeds a certain threshold are fixed as reliable correspondences (called anchors). A new set of candidate sentence correspondences is then constructed from the sequence of anchors as input to the next iteration. The candidates lying between two anchors have a width, which is made smaller near an anchor and larger toward the midpoint between the anchors.
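The scoring in Steps 1 to 3 can be sketched as below. This is an illustrative reading, not the patent's code: the function name `estimate_anchors`, the data layout (sentences as sets of words, candidates as lists of English indices), and the default values of `st`, `dic`, and `threshold` are assumptions. The optional non-crossing condition is omitted, so the descending mutual-information order of Step 1 does not change the result in this simplified version.

```python
def estimate_anchors(ja_sents, en_sents, candidates, stat_pairs, dict_pairs,
                     st=2, dic=1, threshold=2):
    """Count support for (Japanese i, English j) pairs and pick anchors.

    ja_sents / en_sents: lists of word sets, one per sentence.
    candidates[i]: English sentence indices still possible for Japanese i.
    stat_pairs: {(ja_word, en_word): mutual_information}.
    dict_pairs: iterable of (ja_word, en_word) from a bilingual dictionary.
    """
    support = {}

    def vote(word_pairs, points):
        for j_word, e_word in word_pairs:
            for i, cand in enumerate(candidates):
                if j_word not in ja_sents[i]:
                    continue
                # English candidates of sentence i that contain the English word.
                hits = [j for j in cand if e_word in en_sents[j]]
                # Support only when exactly one candidate contains the word.
                if len(hits) == 1:
                    key = (i, hits[0])
                    support[key] = support.get(key, 0) + points

    # Step 1: statistical word pairs, highest mutual information first.
    vote(sorted(stat_pairs, key=stat_pairs.get, reverse=True), st)
    # Step 2: dictionary word pairs.
    vote(dict_pairs, dic)
    # Step 3: pairs whose support exceeds the threshold become anchors.
    anchors = sorted(pair for pair, s in support.items() if s > threshold)
    return support, anchors
```

Setting `st` above `dic` follows the remark that statistical support is normally weighted more heavily than dictionary support.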

[0019] The processing of the similarity calculation unit 130 and the sentence correspondence estimation unit 140 described above is repeated until the candidate sentence correspondences converge (step 330). This yields the sentence correspondences between the input Japanese and English texts.
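The repetition until convergence might look like the sketch below. The names `windows_from_anchors` and `align`, the `max_width` and `max_rounds` parameters, and the triangular window between anchors are assumptions. For simplicity the sketch also assumes at most one anchor per Japanese sentence and anchors that do not cross, which is narrower than what the method is said to handle. The `find_anchors` callable stands in for the word-similarity and sentence-scoring steps.

```python
import math


def windows_from_anchors(anchors, n_ja, n_en, max_width=3):
    """Rebuild candidate English indices for each Japanese sentence from anchors."""
    # Sentinel anchors just outside both texts bound the first and last segments.
    points = [(-1, -1)] + sorted(anchors) + [(n_ja, n_en)]
    candidates = [[] for _ in range(n_ja)]
    for i, j in anchors:
        candidates[i] = [j]  # an anchored sentence keeps only its anchor
    for (i0, j0), (i1, j1) in zip(points, points[1:]):
        for i in range(i0 + 1, i1):
            pos = (i - i0) / (i1 - i0)          # position between the two anchors
            center = j0 + pos * (j1 - j0)
            # Narrow near an anchor, widest midway between anchors.
            width = max_width * (1.0 - abs(2.0 * pos - 1.0))
            lo = max(j0 + 1, math.floor(center - width))
            hi = min(j1 - 1, math.ceil(center + width))
            candidates[i] = list(range(lo, hi + 1))
    return candidates


def align(candidates, find_anchors, n_ja, n_en, max_rounds=20):
    """Repeat anchor estimation until the candidate relation stops changing."""
    for _ in range(max_rounds):
        anchors = find_anchors(candidates)
        updated = windows_from_anchors(anchors, n_ja, n_en)
        if updated == candidates:
            break
        candidates = updated
    return candidates
```

The `max_rounds` cap is a safeguard added for the example; the text itself only says that the loop runs until the candidate relation converges.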

[0020] Compared with the existing method, which uses an existing bilingual dictionary on the basis of dynamic programming and then applies statistics as post-processing, the present method has the following advantages.
(1) Because the existing method performs dictionary-based alignment first, its accuracy drops sharply on texts whose vocabulary is not listed in the dictionary, such as texts in specialized fields. Its statistical processing is based on the result of the first-stage dictionary alignment, so when the first-stage accuracy is low a correct result cannot be obtained. The existing method also depends heavily on the accuracy of morphological analysis. The method of the present invention solves these problems.
(2) Japanese and English texts often contain parts that have no counterpart in the other text. The correspondences between Japanese and English sentences also often cross (when Japanese sentence i corresponds to English sentence j, a Japanese sentence numbered lower than i corresponds to an English sentence numbered higher than j, or vice versa). The existing method cannot cope with these problems because it aligns locally by dynamic programming. The method of the present invention, which sets anchors while looking at the whole text, can cope with them.

[0021] Next, the post-processing unit 150 derives the final result from the converged candidate sentence correspondences (step 340). In the candidate relation, a Japanese sentence with a low support count has many candidate English sentences. When the support counts of those candidates differ significantly, the post-processing unit 150 therefore selects only the sentences with high support counts as correspondences; when the support counts of all candidates are small, it judges that the Japanese sentence has no corresponding English sentence.
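A possible form of this post-processing rule is sketched below. The function name `select_final` and the parameters `min_support` and `margin` are assumptions: the text speaks of a "significant difference" in support counts without defining it, so a simple margin below the best score stands in for that test here.

```python
def select_final(support, n_ja, min_support=2, margin=2):
    """Choose the final English sentence(s) for each Japanese sentence.

    support: {(i, j): support_count} for Japanese sentence i and English sentence j.
    Returns {i: [j, ...]}; an empty list means no corresponding English sentence.
    """
    result = {}
    for i in range(n_ja):
        scores = {j: s for (ii, j), s in support.items() if ii == i}
        # Every candidate weakly supported: treat the sentence as untranslated.
        if not scores or max(scores.values()) < min_support:
            result[i] = []
            continue
        best = max(scores.values())
        # Keep only candidates within `margin` of the best support count.
        result[i] = sorted(j for j, s in scores.items() if best - s < margin)
    return result
```

For example, with `support = {(0, 0): 5, (0, 1): 1, (1, 1): 1, (1, 2): 1}` and two Japanese sentences, sentence 0 is aligned to English sentence 0 only, and sentence 1 is judged to have no counterpart.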

[0022] The alignment of Japanese and English sentences has been described above as one embodiment of the present invention, but it goes without saying that the invention is not limited to this.

[0023]

[Effects of the Invention] As described above, the present invention makes it possible to align the sentences contained in a wide range of bilingual texts automatically and with high accuracy. The aligned corpus obtained in this way can therefore be used in systems such as machine translation and example sentence retrieval, and also as input to systems that learn knowledge automatically.

[Brief description of the drawings]

[FIG. 1] A system configuration diagram of an embodiment of the present invention.

[FIG. 2] A diagram showing the connections between the main units in FIG. 1.

[FIG. 3] A flowchart explaining the operation of the embodiment of the present invention.

[FIG. 4] A diagram illustrating sentence correspondence pairs.

[Explanation of symbols]

100 bilingual alignment apparatus
110 input unit
120 morphological analysis unit
130 similarity calculation unit
140 sentence correspondence estimation unit
150 post-processing unit
160 output unit
170 storage unit
180 bilingual dictionary

Claims

[Claims]

1. A bilingual sentence alignment method for automatically aligning sentences contained in corresponding texts in two languages, characterized in that, when corresponding texts in two languages are given, a process of computing the similarity between words of the two languages on the basis of statistics and a process of estimating sentence correspondences using that similarity and an existing bilingual dictionary are repeated, so that the set of candidate sentence pairs is narrowed step by step and the desired sentence alignment is finally obtained.

2. A bilingual sentence alignment apparatus comprising: input means for inputting corresponding texts in two languages; morphological analysis means for performing morphological analysis on each input text; similarity calculation means for computing candidate word correspondences from the morphological analysis results, obtaining their mutual information, and selecting word pairs by a statistical test; sentence correspondence estimation means for counting, for the selected word pairs and using a bilingual dictionary prepared in advance, the number of times the correspondence between a sentence in one language and a sentence in the other language is supported, and narrowing down the candidate sentence correspondences with a predetermined threshold; post-processing means for selecting, from the narrowed candidate sentence correspondences, sentence pairs having a predetermined support count; and output means for outputting the selected sentence pairs as the final result.