JPH0950435A - Translation device

Info

Publication number: JPH0950435A
Application number: JP7199537A
Authority: JP
Inventors: Ikuo Karashi; 育雄芥子; Hiroshi Ikeuchi; 洋池内; Hiroyuki Kanza; 浩幸勘座
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1995-08-04
Filing date: 1995-08-04
Publication date: 1997-02-18

Abstract

PROBLEM TO BE SOLVED: To retrieve bilingual cases well beyond the range of the wording that appears explicitly in the stored cases. SOLUTION: The device comprises a case base 1 that stores a plurality of bilingual cases; an index storage means 2 that stores case vectors representing the features of the cases in the case base 1, together with indexes relating each case vector to its case; an input means 3 for entering a translation request sentence containing words; a word dictionary 4 holding word vectors that represent the features of individual words; a vector generation means 5 that analyzes the entered translation request sentence and generates its context vector using the word dictionary 4; a case candidate search means 6 that retrieves case candidates based on the distance between the context vector of the translation request sentence and the case vectors; a display means 7 that displays the case candidates; and a bilingual sentence display means 8 that displays the bilingual sentence corresponding to the case the user selects from the case candidates.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a translation device that displays bilingual cases, such as travel conversation sentences, in response to a translation request consisting of keywords or a sentence.

[0002]

[Prior Art] Conventional case-based translation devices, which translate by retrieving bilingual cases, fall broadly into two types: those using a classified display method and those using a keyword search method.

[0003] The classified display method sorts bilingual cases in advance into genres by place, scene, and so on, and a suitable bilingual case is found by following the classification table. For travel conversation sentences, for example, the major categories are hotel, airport, restaurant, road, and so on; the major category "road" is subdivided into subway, bus, taxi, directions, and so on, and the individual cases are stored beneath these subcategories. For example, the case "What stop do I get off at to go to City Park?" is stored under the subcategory "bus" of the major category "road".

[0004] The keyword search method, on the other hand, either attaches keywords expressing the content to each bilingual case or searches over all the words contained in the example sentences to find a suitable bilingual case. In the example above, "City Park", "go", "what", "stop", "get off", and so on are keyword candidates. An extension of the keyword search method is the "example-driven machine translation method" of Japanese Patent Laid-Open No. 3-276367, which retrieves cases using a word dictionary (thesaurus) organized as a tree hierarchy based on the similarity of word meanings.

[0005]

[Problems to be Solved by the Invention] A case-based translation device using the classified display method is effective while the number of cases is small. Once the number of cases becomes very large, however, it is difficult to devise a classification that meets the needs of many users, and a user cannot easily tell which genre holds the cases that satisfy the request. In addition, because many cases fall into the same genre, checking them takes time.

[0006] With the keyword search method, cases can be retrieved only by words that have been attached to the bilingual sentences as keywords, and it is practically impossible to attach in advance every word a user might enter. Retrieval is therefore limited to input that falls within the narrow range of the attached keywords. Likewise, a full-text search over all the character strings of the cases can retrieve only by words or phrases that actually appear in the cases. In other words, neither method can retrieve beyond the range of the natural language explicitly expressed in the target cases.

[0007] The example-driven machine translation method proposed in Japanese Patent Laid-Open No. 3-276367 achieves more flexible retrieval of bilingual cases by using a thesaurus that classifies word meanings one-dimensionally. That method uses linguistic analysis such as morphological analysis to establish an exact correspondence between the words of a case and the words of the translation request, and retrieves similar cases by calculating the distance on the thesaurus between the corresponding words. As a result, when the range of translation targets is expanded, unnecessary bilingual cases that do not fit the context of the translation request are retrieved, and a large collection of bilingual cases becomes necessary.

[0008] An object of the present invention is therefore to solve the above problems by providing a translation device that can retrieve the bilingual case best suited to a translation request even from a small number of cases.

[0009]

[Means for Solving the Problems] The translation device of the present invention comprises: a case base in which pairs of a case and its bilingual sentence are stored as bilingual cases; an index storage means that stores case vectors representing the features of the cases, together with indexes indicating the correspondence between each case vector and its bilingual case; an input means for entering a translation request containing words; a word dictionary that holds characteristic word vectors indicating the degree of association between the words of a translation request and characteristic words; a vector generation means that generates a translation request vector from the sum of the characteristic word vectors corresponding to the translation request; a case candidate search means that retrieves case candidates based on the distance between the translation request vector and the case vectors; and a bilingual sentence display means that uses the index to display the bilingual sentence corresponding to a retrieved case candidate.

[0010] In the translation device of the present invention, assuming for example a case base of Japanese-English bilingual sentences and a Japanese user, when the user enters a translation request sentence in Japanese, a translation request vector is created using the word dictionary, its distance to each case vector in the case base is calculated, and the bilingual sentence (in English) is displayed.

[0011] In the present invention, the case vectors and the translation request vector are created from characteristic word vectors, and retrieval is carried out by vector operations, so linguistic analysis such as morphological analysis is not required. Moreover, because the case vectors and the translation request vector are built from characteristic word vectors that have been characterized according to human knowledge and common sense, retrieval that reflects such knowledge and common sense is possible for any word or sentence the user supplies.

[0012] Because a case candidate display means is provided, case candidates are displayed in Japanese in descending order of similarity to the translation request. This gives the user more freedom when choosing one case from several similar candidates and improves operability.

[0013] Furthermore, new bilingual cases, each consisting of a Japanese source case and its English bilingual sentence, can be registered, so the range of bilingual cases covered can be extended.

[0014]

[Embodiments of the Invention] FIG. 1 is a functional block diagram of a case-based translation device according to an embodiment of the present invention. As shown in FIG. 1, the device comprises a case base 1, an index storage means 2, an input means 3, a word dictionary 4, a vector generation means 5, a case candidate search means 6, a display means 7, a bilingual sentence display means 8, and a case registration means 9.

[0015] Each component of the translation device is described next. The case base 1 stores a plurality of bilingual cases, each as a pair consisting of a sentence in the language of the translation request and a sentence in the language of the translation. The index storage means 2 stores, for each case in the translation request language in the case base 1, a case vector indicating the strength of its association with the characteristic words (described later), together with an index that links the case vector to its bilingual case. A case vector may be given in advance, or it may be the sum of characteristic word vectors created when a new case is registered, as described later.

[0016] The user enters a translation request using the input means 3, which is a keyboard. Input may also be made by speech or handwriting that is recognized online. The word dictionary 4 consists of an automaton (character transitions) for extracting words from natural language text and a feature vector for each word; the characteristic word vectors described later are stored as these feature vectors. FIG. 2 shows part of the automaton constituting the word dictionary 4, for an example in which only three words are registered: "レストラン" (restaurant), "レストルーム" (restroom), and "ストライキ" (strike). In the figure, {レ, ス} stands for all characters other than レ and ス, solid lines represent the goto function, and broken lines represent the failure function (failure transitions from all other states to the initial state are omitted). The vector value shown for each word is the output in that state, that is, the characteristic word vector of the extracted word (three words here). The vector generation means 5 feeds the content of the translation request through the automaton of the word dictionary 4, extracts each word of the translation request together with its characteristic word vector, and generates the translation request vector from the sum of the characteristic word vectors. The case candidate search means 6 calculates the distance between the translation request vector and each case vector stored in the index storage means 2, and displays case candidates, in the language of the translation request and in descending order of similarity, on the display means 7, which is a display. The display means 7 is not strictly necessary, but displaying the candidates lets the user choose even from cases ranked somewhat lower, which improves usability. The bilingual sentence display means 8 displays the bilingual sentence corresponding to the case candidate that the user selects from the case candidates. The display means 7 and the bilingual sentence display means 8 can be implemented on the same display, and a printer or the like can also be used as the output means.
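The goto function, failure function, and per-state output described for FIG. 2 resemble an Aho-Corasick style string-matching automaton. The following is a minimal sketch under that assumption, not the patent's implementation: the function names `build_automaton` and `extract_words` and the three-dimensional vectors in `WORD_VECTORS` are hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical characteristic word vectors for the three words of FIG. 2.
WORD_VECTORS = {
    "レストラン": (0, 2, 1),
    "レストルーム": (1, 0, 0),
    "ストライキ": (0, 1, 2),
}

def build_automaton(dictionary):
    """Build goto, failure, and output tables for the dictionary's words."""
    goto = [{}]      # goto[state][char] -> next state
    output = [[]]    # output[state] -> words recognised on reaching the state
    for word in dictionary:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(word)

    failure = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the initial state
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            fallback = failure[state]
            while fallback and ch not in goto[fallback]:
                fallback = failure[fallback]
            failure[nxt] = goto[fallback].get(ch, 0)
            output[nxt] = output[nxt] + output[failure[nxt]]
    return goto, failure, output

def extract_words(text, automaton):
    """Feed text through the automaton and collect every registered word."""
    goto, failure, output = automaton
    state, found = 0, []
    for ch in text:
        while state and ch not in goto[state]:
            state = failure[state]        # follow broken-line transitions
        state = goto[state].get(ch, 0)    # follow solid-line transitions
        found.extend(output[state])
    return found

automaton = build_automaton(WORD_VECTORS)
words = extract_words("レストランでストライキ", automaton)
vectors = [WORD_VECTORS[w] for w in words]
```

Feeding "レストライキ" through the same automaton shows the failure function at work: after "レストラ" the character "イ" has no goto transition, so the automaton falls back to the state for "ストラ" and goes on to recognise "ストライキ".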

[0017] For a new bilingual case entered through the input means 3, the case registration means 9 creates a case vector for the new case using the word dictionary 4 and the vector generation means 5, adds the case vector and an index linking it to the bilingual case to the index storage means 2, and registers the new bilingual case (the case and its bilingual sentence) in the case base 1. When a new case is registered, its case vector is the sum of the characteristic word vectors of the words extracted from the case.

[0018] FIG. 3 is a block diagram of the translation device implemented as electrical hardware using a CPU. In FIG. 3, the device comprises an auxiliary storage device 21; a translation processing unit 25, which includes a CPU 22 that performs the various processes, a main storage device 23 that stores processing results, and an input/output channel (Ch) 24 that connects the input/output devices to the CPU; a display device 26 such as a CRT; and a keyboard 27.

[0019] The case base 1 and the word dictionary 4 of FIG. 1 are stored in the auxiliary storage device 21 of FIG. 3. The index storage means 2, the vector generation means 5, the case search means 6, and the case registration means 9 of FIG. 1 correspond to the translation processing unit 25 of FIG. 3; the input means 3 of FIG. 1 corresponds to the keyboard 27 of FIG. 3; and the display means 7 and the bilingual sentence display means 8 of FIG. 1 correspond to the display device 26 of FIG. 3.

[0020] The characteristic word vector is described next. The concrete configuration of the word dictionary 4, the vector generation means 5, and the case search means 6 in this embodiment uses the context vectors of "Associative retrieval from a large-scale document database" (IEICE Technical Report AI92-99, 1993-1, published by the Institute of Electronics, Information and Communication Engineers).

[0021] A context vector indicates the degree of relationship between the concept carried by a word in a text and its context. It expresses, as a vector, the degree of semantic connection with many characteristic words: if n concept classes are taken as the characteristic words, a word is represented as a point in an n-dimensional vector space in which each dimension corresponds to one characteristic word. Each element of the context vector X_i = (x_i1, x_i2, ..., x_in) of word i is defined as follows.

[0022]
x_ij = 0 if word i is unrelated to characteristic word j
x_ij = 1 if word i is related to characteristic word j
For example, when the six characteristic words (human, sad, art, science, excitement, politics) are selected, the six-dimensional context vector X of the word "pilot", expressed in binary values, is as follows.

[0023] X = (1, 0, 0, 1, 1, 0)
Alternatively, the relationship between word i and characteristic word j may be expressed with multiple values according to its strength, as follows.

[0024] X = (2, 0, 0, 3, 1, 0)
In this embodiment, such multi-valued context vectors are used as the characteristic word vectors.
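As a small illustration of the two representations, the vectors for "pilot" can be built from a mapping of characteristic words to association strengths. This is only a sketch: the helper name `characteristic_word_vector` and the dictionary form of the input are assumptions made for the example, while the six characteristic words and the resulting vectors are those given in the text.

```python
# The six characteristic words from the text; their order fixes the dimensions.
CHARACTERISTIC_WORDS = ("human", "sad", "art", "science", "excitement", "politics")

def characteristic_word_vector(strengths):
    """Return one component per characteristic word (0 where no relation is listed)."""
    return tuple(strengths.get(word, 0) for word in CHARACTERISTIC_WORDS)

# Multi-valued vector for "pilot": strength of association per characteristic word.
pilot_multi = characteristic_word_vector({"human": 2, "science": 3, "excitement": 1})

# Binary vector: 1 wherever any relation exists, 0 otherwise.
pilot_binary = tuple(1 if x > 0 else 0 for x in pilot_multi)
```

Here `pilot_multi` is (2, 0, 0, 3, 1, 0) and `pilot_binary` is (1, 0, 0, 1, 1, 0), matching the two vectors quoted for "pilot".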

[0025] The vector generation means 5 takes the sum of the characteristic word vectors extracted from the translation request, which is natural language text, normalizes it to a constant length, and outputs the result as the translation request vector. Because the translation request vector and every case vector are normalized to the same constant length, the case search means 6 can use their inner product as the distance measure.
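The reason a constant length allows the inner product to stand in for a distance can be made explicit with a short derivation. The symbols here are notation introduced for this note, not taken from the source: q is the translation request vector, c a case vector, L their common length, and θ the angle between them.

```latex
\begin{align*}
\lVert q - c \rVert^2
  &= \lVert q \rVert^2 + \lVert c \rVert^2 - 2\, q \cdot c
   = 2L^2 - 2\, q \cdot c, \\
q \cdot c &= L^2 \cos\theta \le L^2 .
\end{align*}
```

With L fixed, a larger inner product means a smaller Euclidean distance, so ranking cases by descending inner product gives the same order as ranking them by ascending distance. The bound also shows that the inner product can never exceed L^2, which would be 100 for vectors normalized to length 10.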

[0026] The processing flow of the present invention is described with reference to FIG. 4.

[0027] In step S1, the translation request sentence "Where should I get off to go to Central Park?" is entered through the input means 3. The input here is a sentence, but words alone may also be entered.

[0028] In step S2, the vector generation means 5 uses the word dictionary 4 to extract the four words "Central Park", "go", "where", and "get off" from the entered translation request sentence and obtains the corresponding characteristic word vectors. Naturally, no characteristic word vector is obtained for any part of the translation request sentence whose characteristic word vector is not stored in the word dictionary 4 in advance.

[0029] The characteristic word vectors of the words extracted above are as follows.
Central Park (0, 2, 0, 0, 1, 0, 2, 0, 1)
go (0, 1, 0, 0, 0, 1, 1, 0, 0)
where (1, 0, 0, 0, 0, 1, 0, 0, 1)
get off (0, 2, 0, 0, 0, 2, 1, 0, 0)

[0030] In step S3, the vector generation means 5 adds the characteristic word vectors of the individual words to obtain the vector CV0 = (1, 5, 0, 0, 1, 4, 4, 0, 2), then normalizes CV0 to length 10 and outputs the result as the translation request vector CV1 = (1, 6, 0, 0, 1, 5, 5, 0, 3).
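Step S3 can be reproduced numerically from the four characteristic word vectors listed for the extracted words. The sketch below assumes that "normalizing to length 10" means scaling to Euclidean length 10 and rounding each component to the nearest integer; the rounding and the function names are assumptions, since the source shows only the integer result.

```python
import math

# Characteristic word vectors of the four extracted words, as listed in the text.
WORD_VECTORS = {
    "Central Park": (0, 2, 0, 0, 1, 0, 2, 0, 1),
    "go":           (0, 1, 0, 0, 0, 1, 1, 0, 0),
    "where":        (1, 0, 0, 0, 0, 1, 0, 0, 1),
    "get off":      (0, 2, 0, 0, 0, 2, 1, 0, 0),
}

def sum_vectors(vectors):
    """Component-wise sum of equally long vectors."""
    return tuple(sum(column) for column in zip(*vectors))

def normalize(vector, length=10):
    """Scale to the given Euclidean length, rounding to integers (assumed)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return tuple(0 for _ in vector)
    return tuple(round(x * length / norm) for x in vector)

cv0 = sum_vectors(WORD_VECTORS.values())
cv1 = normalize(cv0)
```

Under these assumptions `cv0` is (1, 5, 0, 0, 1, 4, 4, 0, 2) and `cv1` is (1, 6, 0, 0, 1, 5, 5, 0, 3). The third component of the normalized vector is 0 because the third component of the sum is 0.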

[0031] FIG. 5 shows how the bilingual cases in the case base and the corresponding case vectors are stored. The bilingual cases are stored in the case base 1, while the case vectors, together with the indexes that link each case vector to its bilingual case, are stored in the index storage means 2. For example, case 1, which pairs the Japanese sentence meaning "Just press the shutter" with the English sentence "Just push the button.", is stored in the case base 1, and its case vector (3, 1, 1, ... 0) and the index I1 pointing to its storage location in the case base 1 are stored in the index storage means 2. Further bilingual cases are stored in the same way.
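The separation between the case base and the index storage can be sketched as two parallel structures, with each index entry pointing at a position in the case base. The class names, field names, and the three-dimensional vector below are hypothetical; in particular, the vector is not the elided case vector of FIG. 5.

```python
from dataclasses import dataclass

@dataclass
class BilingualCase:
    source: str        # case in the translation request language
    translation: str   # its bilingual sentence

@dataclass
class IndexEntry:
    case_vector: tuple  # vector representing the features of the case
    case_index: int     # storage location of the case in the case base

# Case base 1: holds only the bilingual cases.
case_base = [BilingualCase("シャッターを押すだけ", "Just push the button.")]

# Index storage means 2: holds the case vectors and the indexes.
# The vector here is an illustrative placeholder, not a value from FIG. 5.
index_storage = [IndexEntry(case_vector=(2, 0, 1), case_index=0)]

def bilingual_sentence(entry):
    """Follow an index entry to the bilingual sentence stored in the case base."""
    return case_base[entry.case_index].translation

sentence = bilingual_sentence(index_storage[0])
```

Keeping the vectors apart from the sentences means the search only has to scan the index storage, and the case base is consulted once a candidate has been chosen.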

[0032] In step S4, the case search means 6 calculates the distance between the translation request vector CV1 and every case vector stored in the index storage means 2. In the example of FIG. 5, the distance between case 1 ("Just press the shutter") and the translation request sentence is the inner product of the case vector of case 1, (3, 1, 1, ... 0), and the translation request vector CV1 = (1, 6, 0, ... 3), namely 3×1 + 1×6 + ... + 0×3 = 23 points. Because the vectors are normalized to length 10, the maximum score is 100 points. The distance to every other case is calculated in the same way, as shown below.

[0033]
Case 1: 23 points
Case 2: 78 points
...
Case 12000: 35 points
In step S5, several cases are taken from the case base 1 in order of closeness to the translation request sentence (largest inner product first) and displayed on the display means 7 as case candidates together with their scores (inner product values).
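Steps S4 and S5 amount to scoring every case vector by its inner product with the translation request vector and listing the highest scores first. A minimal sketch follows; the function names and the three case vectors are hypothetical (the case vectors of FIG. 5 are elided in the source), and only the request vector is taken from step S3.

```python
def inner_product(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_cases(request_vector, case_vectors, top_k=3):
    """Return (score, case position) pairs, highest inner product first."""
    scored = [
        (inner_product(request_vector, vector), position)
        for position, vector in enumerate(case_vectors)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Translation request vector from step S3.
cv1 = (1, 6, 0, 0, 1, 5, 5, 0, 3)

# Hypothetical case vectors, each of length about 10.
case_vectors = [
    (10, 0, 0, 0, 0, 0, 0, 0, 0),
    (1, 6, 0, 0, 1, 5, 5, 0, 3),
    (0, 5, 0, 0, 0, 5, 5, 0, 5),
]

candidates = rank_cases(cv1, case_vectors, top_k=2)
```

With these illustrative vectors the two best candidates are the cases at positions 1 and 2, with scores 97 and 95. The first case scores only 10 because it shares a single non-zero dimension with the request.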

[0034]
1. (95 points) What stop do I get off at to go to City Park?
2. (85 points) Is this the road to Central Park?
3. (80 points) Where is Hyde Park?
In step S6, the bilingual sentence of the case candidate that the user selects through the input means 3 is displayed on the bilingual sentence display means 8.

[0035]
1. (95 points) What stop do I get off at to go to City Park?
What bus stop do I get off at City Park?
Here the first line is the selected case, shown in Japanese on the device, and the second line is its English bilingual sentence. Through the above steps, the user obtains an appropriate bilingual sentence.

[0036] A new bilingual case is registered in the same way as a translation request vector is generated: a case vector is generated for the case and added to the index storage means as a new case vector, as shown in FIG. 5. For example, to register case 1 of FIG. 5, the characteristic word vectors of "shutter" and "press" are obtained from the word dictionary 4, their sum is normalized to give the case vector, and the case vector is stored in the index storage means 2 together with the index I1 pointing to the storage location of case 1 in the case base 1.
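Registration can be sketched as three operations: build the case vector from the words of the case, append the bilingual case to the case base, and append the vector with its index to the index storage. Everything named here is hypothetical, including the three-dimensional word vectors for "shutter" and "press", and normalization is assumed to mean scaling to Euclidean length 10 with integer rounding.

```python
import math

def make_case_vector(words, word_dictionary, length=10):
    """Sum the vectors of the words found in the dictionary and scale the sum."""
    known = [word_dictionary[w] for w in words if w in word_dictionary]
    total = [sum(column) for column in zip(*known)]
    norm = math.sqrt(sum(x * x for x in total))
    if norm == 0:
        return tuple(total)
    return tuple(round(x * length / norm) for x in total)

def register_case(source, translation, words, word_dictionary,
                  case_base, index_storage):
    """Store the bilingual case and its (case vector, index) entry."""
    case_base.append((source, translation))
    position = len(case_base) - 1          # storage location in the case base
    vector = make_case_vector(words, word_dictionary)
    index_storage.append((vector, position))
    return position

# Hypothetical characteristic word vectors for the two words of case 1.
word_dictionary = {"シャッター": (2, 0, 1), "押す": (1, 0, 3)}

case_base, index_storage = [], []
position = register_case(
    "シャッターを押すだけ", "Just push the button.",
    ["シャッター", "押す"], word_dictionary, case_base, index_storage,
)
```

With these placeholder vectors the sum is (3, 0, 4), which has length 5, so the stored case vector is (6, 0, 8) at position 0. Words missing from the dictionary are skipped, in line with the handling of unknown words in step S2.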

[0037] In the embodiment described above, the case vector is created from the sentence that constitutes the case, but it may instead be created from, for example, a few words supplied by the user that express the content of the case. The embodiment uses Japanese-English bilingual sentences, but the invention is not limited to this. For example, cases may be composed in several languages, such as Japanese, English, French, and Chinese; the user enters the translation request sentence in a language of his or her choice, the case candidates are displayed in the language of the translation request sentence, and the bilingual sentence is displayed in the language the user selects.

[0038]

[Effects of the Invention] Whereas conventional translation devices cannot retrieve beyond the range of the natural language explicitly expressed in the target cases, the translation device of the present invention enables natural-language retrieval of bilingual cases that reflects human common sense, makes it possible to retrieve bilingual cases well beyond the range of the explicitly expressed natural language, and makes translation support more efficient.

[Brief description of drawings]

FIG. 1 is a functional block diagram of a translation device according to an embodiment of the present invention.

FIG. 2 is a diagram showing the configuration of a word dictionary according to an embodiment of the present invention.

FIG. 3 is an electrical block diagram of a translation device according to an embodiment of the present invention.

FIG. 4 is a flowchart showing the processing flow of a translation device according to an embodiment of the present invention.

FIG. 5 is a diagram showing how the bilingual cases and the corresponding case vectors are stored in a translation device according to an embodiment of the present invention.

[Explanation of symbols]

1 case base
2 index storage means
3 input means
4 word dictionary
5 vector generation means
6 case search means
7 display means
8 bilingual sentence display means
9 case registration means

Claims

1. A translation device comprising: a case base in which pairs of a case and its bilingual sentence are stored as bilingual cases; an index storage means that stores case vectors representing the features of the cases, together with indexes indicating the correspondence between each case vector and its bilingual case; an input means for entering a translation request containing words; a word dictionary that holds characteristic word vectors indicating the degree of association between the words of a translation request and characteristic words; a vector generation means that generates a translation request vector based on the sum of the characteristic word vectors corresponding to the translation request; a case candidate search means that retrieves case candidates based on the distance between the translation request vector and the case vectors; and a bilingual sentence display means that uses the index to display the bilingual sentence corresponding to a retrieved case candidate.

2. The translation device according to claim 1, further comprising a case candidate display means for displaying the retrieved case candidates, wherein the bilingual sentence corresponding to the case selected by the user from the case candidates displayed on the case candidate display means is displayed on the bilingual sentence display means.

3. The translation device according to claim 1, further comprising a case registration means, wherein, when a new bilingual case is entered, the vector generation means creates a new case vector from the new case using the word dictionary, and the case registration means registers a new index and the new case vector in the index storage means and registers the new bilingual case in the case base.