JPH08180057A

JPH08180057A - Method and device for retrieving document

Info

Publication number: JPH08180057A
Application number: JP6320059A
Authority: JP
Inventors: Masato Yajima; 真人矢島; Noriko Koyama; 紀子小山
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1994-12-22
Filing date: 1994-12-22
Publication date: 1996-07-12

Abstract

PURPOSE: To provide a document retrieval method and device capable of giving more useful information to a user who intends to translate a retrieval input sentence.

CONSTITUTION: An optimum combination calculation part 7 obtains the combination of most similar clauses between a retrieval input sentence in a first language, entered from an input part 1, and a retrieval object sentence in a translation storage part 12. From this combination, a word correspondence detection part 9 detects the word of the retrieval object sentence that corresponds to a word of the retrieval input sentence, and then detects, from the translation storage part 12, the word of the second-language sentence that corresponds to that word of the retrieval object sentence. In addition, an equivalent retrieval part 8 retrieves from an equivalent storage part 13 the second-language word that is the equivalent of the word in the retrieval input sentence. A word replacing part 11 then replaces the second-language word detected by the word correspondence detection part 9 with the second-language word or word string retrieved by the equivalent retrieval part 8.

Description

Detailed Description of the Invention

【０００１】[0001]

【Field of Industrial Application】The present invention relates to a document search method and apparatus suitable for retrieving, from a large volume of documents, text similar to a search input sentence entered by a user.

【０００２】[0002]

【Prior Art】In recent years, with the spread of text input devices such as Japanese word processors and optical character readers (OCR), text that used to be kept on paper or microfilm has come to be digitized and stored on external storage devices such as magnetic disks and magneto-optical disks. Text retrieval techniques for finding the documents a user requires in a large volume of digitized document data have also been developed and continue to advance.

【０００３】Conventional search techniques generally retrieve sentences whose character strings match the input search sentence, or sentences that use the same words. They could not retrieve sentences that are similar as a whole.

【０００４】By contrast, a method that divides a sentence into fixed constituent units, for example phrase (bunsetsu) units, calculates the similarity for each phrase, and finds the most similar combination of phrases can retrieve sentences that are similar as a whole. One conceivable application of this search technique is searching the usage examples of a dictionary. A dictionary usage example consists of a pair of sentences: a Japanese sentence (the first language) and its translation in a second language (for example, English). If this technique is used to retrieve from the usage examples a Japanese sentence similar to the input Japanese sentence, the second-language sentence that is its translation provides useful information for translating the input Japanese sentence into the second language.

【０００５】With this method, however, the presented second-language sentence cannot be used as is. The user had to compose a new translation with reference to that second-language sentence.

【０００６】Moreover, this method searches only in Japanese (the first language) and uses the second language merely for reference. Although the second language is also a language, the information obtainable from it could not be exploited.

【０００７】[0007]

【Problems to Be Solved by the Invention】As described above, in the conventional document search method the translation of a retrieved search target sentence is no more than a reference for the user's own translation. The user must work out which parts to modify to obtain a translation of the input sentence and must then compose a new translation. In addition, because the translation of the retrieved search target sentence serves only for the user's reference, its linguistic information cannot be exploited.

【０００８】The present invention was made in view of the above drawbacks. Its object is to provide a document search method and apparatus that can process the translation of a retrieved search target sentence so that it fits a translation of the search input sentence entered by the user, and present the result.

【０００９】Another object of the present invention is to provide a document search method and apparatus that can present useful information, for example by using the second-language sentence that is the translation of a retrieved first-language search target sentence as a further input sentence for a search in the second language, thereby retrieving second-language example sentences that the first-language search could not find.

【００１０】[0010]

【Means for Solving the Problems and Operation】The configuration according to a first aspect of the present invention is a document search method and apparatus for retrieving a search target sentence similar to a search input sentence in a first language. It has parallel translation storage means that stores each first-language search target sentence paired with the second-language sentence that is its translation, and a word dictionary that pairs each first-language word with the second-language word or word string that is its translation. Words contained in the first-language search input sentence are used as keys to look up their second-language translations in the word dictionary. When an expression in the retrieved search target sentence that is similar to the search input sentence contains a word corresponding to a word of the search input sentence, the corresponding word in the second-language sentence paired with the retrieved search target sentence is replaced, as is, with the word or word string that is the translation of the word of the search input sentence.

【００１１】Instead of replacing the corresponding word of the second-language sentence paired with the retrieved search target sentence with the translation of the search input word as is, the translation (a word or word string) may be transformed before the replacement. The transformation can be carried out according to the word attributes (inflection, tense, and so on) of the corresponding word in the second-language sentence paired with the retrieved search target sentence. It can also be carried out according to the syntactic attributes of that corresponding word.

【００１２】In the configuration according to the first aspect, after a search target sentence similar to the search input sentence has been retrieved, words in its translation can be replaced with the translations (words or word strings) of the words of the search input sentence, either as is or after transformation. Consequently, rather than merely presenting a sentence for reference in translating the search input sentence, the apparatus can present a sentence modified into a form close to a translation of the search input sentence.

【００１３】The configuration according to a second aspect of the present invention is a document search method and apparatus for retrieving a search target sentence similar to a search input sentence, in which first-language sentences and the second-language sentences that are their translations are stored in associated pairs, a search result is obtained from those pairs by a search in the first language, and the second-language sentence that is its translation is then used as an input sentence for a further search in the second language.

【００１４】The present invention is also characterized in that, when second-language sentences are searched in the configuration according to the second aspect, results that are similar in the part judged to be similar during the first-language search are presented preferentially.

【００１５】It is further characterized in that, among the search results for second-language sentences, those identical to results of the first-language search are presented preferentially. In the configuration according to the second aspect, after a search target sentence similar to the search input sentence has been retrieved, the translation of that search target sentence is used as a further search input sentence, and information useful for the search input sentence is presented preferentially from the results. The user can thus be given useful information that exploits the linguistic characteristics of the second language.

【００１６】[0016]

【Embodiments】Embodiments of the present invention are described below with reference to the drawings. [First Embodiment] FIG. 1 is a block diagram of a document search apparatus according to a first embodiment of the present invention.

【００１７】The document search apparatus of FIG. 1 comprises an input unit 1, a control unit 2, a display unit 3, a word dividing unit 4, a phrase synthesis unit 5, a phrase similarity calculation unit 6, an optimum combination calculation unit 7, a translated word search unit 8, a word correspondence detection unit 9, a word transformation unit 10, a word replacement unit 11, a parallel translation storage unit 12, and a translated word storage unit 13.

【００１８】The input unit 1 is used to enter character strings and various control codes (commands and the like), and is an input device such as a keyboard or an OCR. The control unit 2 governs control of the entire apparatus and is, for example, a central processing unit (CPU). Here, the control unit 2 controls the units in the apparatus that carry out document search processing (the word dividing unit 4, phrase synthesis unit 5, phrase similarity calculation unit 6, optimum combination calculation unit 7, translated word search unit 8, word correspondence detection unit 9, and word transformation unit 10).

【００１９】The display unit 3 displays the character string of the input sentence, the character strings of search target sentences, and so on, and is, for example, a CRT display or a liquid crystal display. The parallel translation storage unit 12 stores each search target sentence in the first language (for example, Japanese) paired with the sentence in the second language (for example, English) that is its translation, in a form that allows the correspondences between their words, phrasings, and expressions to be established.

【００２０】The translated word storage unit 13 stores each first-language word paired with the second-language word or word string that is its translation. The parallel translation storage unit 12 and the translated word storage unit 13 are realized by a large-capacity external storage device such as a hard disk drive or an optical disk drive.

【００２１】The word dividing unit 4, phrase synthesis unit 5, phrase similarity calculation unit 6, optimum combination calculation unit 7, translated word search unit 8, word correspondence detection unit 9, and word transformation unit 10 are the functional elements required for document search processing, and each is realized by its own processing routine.

【００２２】The word dividing unit 4 divides the character string of the first-language input sentence entered from the input unit 1 (hereinafter called the search input sentence) into words. The phrase synthesis unit 5 combines the words produced by the word dividing unit 4 into phrases.

【００２３】The phrase similarity calculation unit 6 calculates the similarity between the phrases of the search input sentence obtained by the phrase synthesis unit 5 and the phrases of a first-language sentence to be searched that is stored in the parallel translation storage unit 12 (hereinafter called the search target sentence).

【００２４】The optimum combination calculation unit 7 finds, on the basis of the similarity calculation results, the combination of phrases that is most similar for the sentence as a whole. The translated word search unit 8 uses the words that the word dividing unit 4 produced from the search input sentence as keys to look up their second-language translations in the translated word storage unit 13.

【００２５】The word correspondence detection unit 9 finds, from the phrase combination obtained by the optimum combination calculation unit 7, the words of the search target sentence that correspond to words of the search input sentence. It also detects, from the parallel translation storage unit 12, the words of the second-language sentence that correspond to those words of the search target sentence.

【００２６】The word transformation unit 10 transforms the second-language word or word string that is the translation of a word of the search input sentence, according to the word attributes and syntactic attributes of the word that the word correspondence detection unit 9 detected in the second-language sentence of the search target sentence.

【００２７】The word replacement unit 11 replaces the word that the word correspondence detection unit 9 detected in the second-language sentence of the search target sentence with the second-language word or word string, transformed by the word transformation unit 10, that is the translation of the word of the search input sentence.

【００２８】Next, an outline of the operation of the configuration of FIG. 1 is given. First, a first-language character string that serves as the search input sentence is entered from the input unit 1. The search input sentence is sent via the control unit 2 to the word dividing unit 4, which divides it into words. FIG. 2(a) shows the character string of the search input sentence after it has been analyzed and divided into words. At this point the word dividing unit 4 attaches to each resulting unit its part of speech and word-attribute information (so-called morphological information) such as inflected form and tense.

【００２９】The result of word division is sent to the phrase synthesis unit 5, which combines the words obtained by the word dividing unit 4 into phrases as shown in FIG. 2(b). The phrase similarity calculation unit 6 takes an arbitrary sentence out of the parallel translation storage unit 12 as a search target sentence. Besides the character string of each search target sentence, the parallel translation storage unit 12 stores the words and phrases of that string. FIG. 3 shows an example of the contents stored in the parallel translation storage unit 12.

【００３０】The phrase similarity calculation unit 6 compares one phrase of the search target sentence taken from the parallel translation storage unit 12 with one phrase of the search input sentence obtained by the phrase synthesis unit 5 and calculates their similarity. It does this exhaustively for all (number of phrases in the search input sentence) × (number of phrases in the search target sentence) pairs. Based on these results, the optimum combination calculation unit 7 finds the combination that is most similar for the sentence as a whole.

【００３１】From the optimum combination found by the optimum combination calculation unit 7, the control unit 2 judges whether the search input sentence and the search target sentence are similar. If they are, it takes the search target sentence as a search result for the search input sentence. In that case, the translated word search unit 8 looks up translations in the translated word storage unit 13 using each word of the search input sentence as a key.

【００３２】The word correspondence detection unit 9 detects, from the phrase combination obtained by the optimum combination calculation unit 7, the words of the search target sentence that correspond to words of the search input sentence. It then detects, from the parallel translation storage unit 12, the words of the translation that correspond to those words of the search target sentence.

【００３３】The word transformation unit 10 transforms the translation of each search input word obtained by the translated word search unit 8, according to the word attributes and syntactic attributes of the word that the word correspondence detection unit 9 detected in the translation of the search target sentence.

【００３４】The word replacement unit 11 replaces the word that the word correspondence detection unit 9 detected in the translation of the search target sentence with the translation of the search input word as transformed by the word transformation unit 10. The result is displayed on the display unit 3.
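The transformation and replacement performed by units 10 and 11 can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation. The English sentence and the word correspondences come from the embodiment's worked example; the token layout (surface form, base form, plural flag), the function names, and the rule "add s for a plural" are assumptions introduced for illustration.

```python
def transform(replacement, plural):
    """Naive stand-in for the word transformation unit 10.
    Assumption: the only attribute handled is a regular English plural."""
    return replacement + "s" if plural else replacement


def replace_words(tokens, replacements, use_transform=True):
    """Stand-in for the word replacement unit 11.
    tokens: (surface, base_form, is_plural) triples of the stored translation.
    replacements: base form in the stored translation -> translation of the input word."""
    out = []
    for surface, base, plural in tokens:
        if base in replacements:
            new_word = replacements[base]
            out.append(transform(new_word, plural) if use_transform else new_word)
        else:
            out.append(surface)
    return " ".join(out)


# Translation stored for the search target sentence, with hypothetical attributes.
tokens = [
    ("It", "it", False), ("takes", "take", False), ("five", "five", False),
    ("hours", "hour", True), ("to", "to", False), ("go", "go", False),
    ("to", "to", False), ("the", "the", False), ("seaport", "seaport", False),
    ("from", "from", False), ("here", "here", False),
]
replacements = {"seaport": "station", "hour": "minute"}

as_is = replace_words(tokens, replacements, use_transform=False)
transformed = replace_words(tokens, replacements)
```

Replacing as is leaves "five minute", while copying the plural attribute of the replaced word yields "five minutes", which is the kind of mismatch the transformation step is meant to remove.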

【００３５】If unsearched search target sentences remain in the parallel translation storage unit 12 at this point, one of them is taken out and the above operation is repeated. This concludes the outline of the operation of the configuration of FIG. 1.

【００３６】Next, the operation of the configuration of FIG. 1 is described in detail with reference to the flowcharts of FIGS. 4 and 5. First, when a search input sentence is entered from the input unit 1 (step S1), the control unit 2 displays it on the screen of the display unit 3 and outputs it to the word dividing unit 4. Assume here that the Japanese character string (first-language character string) 「ここから駅までは少なくとも５分はかかる」 ("It takes at least 5 minutes from here to the station") is entered as the search input sentence.

【００３７】The word dividing unit 4 analyzes the character string of the entered search input sentence and divides it into words (step S2). That is, as shown in FIG. 2(a), the word dividing unit 4 divides the search input sentence 「ここから駅までは少なくとも５分はかかる」 into the words 「ここ」, 「から」, 「駅」, 「まで」, 「は」, 「少なくとも」, 「５」, 「分」, 「は」, and 「かかる」, and attaches to each its part of speech and word-attribute information such as inflected form and tense.

【００３８】When the word dividing unit 4 finishes word division, the control unit 2 activates the phrase synthesis unit 5. The phrase synthesis unit 5 combines the words obtained by the word dividing unit 4 into phrases as shown in FIG. 2(b) (step S3). Here, the five phrases 「ここから」, 「駅までは」, 「少なくとも」, 「５分は」, and 「かかる」 are generated.
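The phrase synthesis of step S3 can be sketched in Python as follows. The patent does not state the grouping rule or the tag set, so the part-of-speech labels and the rule used here (certain parts of speech open a new phrase, everything else attaches to the preceding phrase) are assumptions chosen so that the worked example comes out as described.

```python
# Hypothetical part-of-speech tags that open a new phrase (an assumption).
PHRASE_STARTERS = {"noun", "adverb", "numeral", "verb"}


def synthesize_phrases(words):
    """Group (surface, pos) pairs into phrases.
    A word whose tag is a phrase starter opens a new phrase;
    any other word (particle, counter) attaches to the phrase before it."""
    phrases = []
    for surface, pos in words:
        if pos in PHRASE_STARTERS or not phrases:
            phrases.append([surface])
        else:
            phrases[-1].append(surface)
    return ["".join(parts) for parts in phrases]


# Output of step S2 for the example sentence, with assumed tags.
words = [
    ("ここ", "noun"), ("から", "particle"),
    ("駅", "noun"), ("まで", "particle"), ("は", "particle"),
    ("少なくとも", "adverb"),
    ("５", "numeral"), ("分", "counter"), ("は", "particle"),
    ("かかる", "verb"),
]
phrases = synthesize_phrases(words)
```

With these tags the sketch reproduces the five phrases of the example. A real morphological analyzer would supply a richer tag set and would need correspondingly richer rules.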

【００３９】When the phrase synthesis unit 5 finishes combining the words into phrases, the control unit 2 activates the phrase similarity calculation unit 6. The phrase similarity calculation unit 6 takes an arbitrary sentence out of the parallel translation storage unit 12 as a search target sentence (step S4).

【００４０】Besides the character string of each search target sentence, the parallel translation storage unit 12 stores the words into which that string was divided in advance by the processing of the word dividing unit 4 (or an equivalent function) and the phrases generated by the processing of the phrase synthesis unit 5 (or an equivalent function). For example, when the character string 「ここから港までは５時間はかかる」 ("It takes 5 hours from here to the port") is stored in the parallel translation storage unit 12 as a search target sentence, the unit also stores the words 「ここ」, 「から」, 「港」, 「まで」, 「は」, 「５」, 「時間」, 「は」, and 「かかる」 produced by the word dividing unit 4, as shown in FIG. 3(a), and the phrases 「ここから」, 「港までは」, 「５時間は」, and 「かかる」 produced by the phrase synthesis unit 5, as shown in FIG. 3(b).
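One possible in-memory layout for a record of the parallel translation storage unit 12 is sketched below in Python. The class and field names are hypothetical. The word and phrase lists correspond to FIG. 3(a) and 3(b); the translation and the word alignment are the items the patent attributes to FIG. 3(c) and 3(d), and per-word attributes are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParallelEntry:
    """One stored example: a first-language sentence with its precomputed
    analysis and its second-language translation (hypothetical layout)."""
    source: str                      # search target sentence
    words: List[str]                 # precomputed word division, FIG. 3(a)
    phrases: List[str]               # precomputed phrases, FIG. 3(b)
    translation: str                 # second-language sentence, FIG. 3(c)
    alignment: Dict[str, str] = field(default_factory=dict)  # word map, FIG. 3(d)


entry = ParallelEntry(
    source="ここから港までは５時間はかかる",
    words=["ここ", "から", "港", "まで", "は", "５", "時間", "は", "かかる"],
    phrases=["ここから", "港までは", "５時間は", "かかる"],
    translation="It takes five hours to go to the seaport from here",
    alignment={"港": "seaport", "時間": "hour"},
)
```

Storing the analysis alongside the sentence means only the search input sentence has to be analyzed at query time.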

【００４１】The phrase similarity calculation unit 6 compares one phrase of the search input sentence obtained by the phrase synthesis unit 5 with one phrase of the search target sentence taken from the parallel translation storage unit 12 and calculates their similarity (step S5). This calculation is carried out exhaustively for all (number of phrases in the search input sentence) × (number of phrases in the search target sentence) pairs, and is repeated until every combination of phrases has been covered (step S6).
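The exhaustive comparison of steps S5 and S6 can be sketched in Python as follows. The patent does not define the similarity measure, so `phrase_similarity` below is a toy stand-in (the fraction of trailing characters two phrases share), chosen only because it is easy to check by hand. The function names are hypothetical.

```python
def phrase_similarity(a, b):
    """Toy stand-in for the similarity measure (an assumption):
    shared trailing characters divided by the longer phrase length."""
    if not a or not b:
        return 0.0
    shared = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def similarity_matrix(input_phrases, target_phrases):
    """Compare every input phrase with every target phrase (steps S5 and S6).
    Rows follow the search input sentence, columns the search target sentence."""
    return [[phrase_similarity(p, q) for q in target_phrases]
            for p in input_phrases]


input_phrases = ["ここから", "駅までは", "少なくとも", "５分は", "かかる"]
target_phrases = ["ここから", "港までは", "５時間は", "かかる"]
matrix = similarity_matrix(input_phrases, target_phrases)
```

Under this toy measure 「ここから」 against 「ここから」 scores 1.0 and 「駅までは」 against 「港までは」 scores 0.75, mirroring the "match" and "similar" cases of the example.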

【００４２】Through this similarity calculation, the phrase similarity calculation unit 6 finds the phrases whose similarity is at or above a predetermined level. Here, the pair of the search input phrase 「ここから」 and the search target phrase 「ここから」 has high similarity (a match), and the pair of the search input phrase 「駅までは」 and the search target phrase 「港までは」 has moderately high similarity (similar).

【００４３】When the phrase similarity calculation unit 6 finishes the similarity calculation, the control unit 2 activates the optimum combination calculation unit 7. Based on the similarity calculation results of the phrase similarity calculation unit 6, the optimum combination calculation unit 7 finds the combination of phrases that makes the search input sentence and the search target sentence most similar as whole sentences (step S7). Here, as shown in FIG. 6, the optimum combination is calculated to be the one that pairs the search input phrase 「ここから」 with the search target phrase 「ここから」, 「駅までは」 with 「港までは」, 「５分は」 with 「５時間は」, and 「かかる」 with 「かかる」.
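Step S7 can be read as an assignment problem, and a minimal Python sketch under that reading follows. The patent does not say how the optimum combination is computed; the exhaustive search over one-to-one pairings, the function name, and the similarity values in the matrix are all assumptions made for illustration.

```python
from itertools import permutations


def best_combination(matrix):
    """Return (total, pairs) for the one-to-one pairing of rows (input phrases)
    and columns (target phrases) with the highest summed similarity.
    Exhaustive search, so only practical for short sentences."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    if n_rows >= n_cols:
        candidates = ([(r, c) for c, r in enumerate(rows)]
                      for rows in permutations(range(n_rows), n_cols))
    else:
        candidates = ([(r, c) for r, c in enumerate(cols)]
                      for cols in permutations(range(n_cols), n_rows))
    best_total, best_pairs = -1.0, []
    for pairs in candidates:
        total = sum(matrix[r][c] for r, c in pairs)
        if total > best_total:
            best_total, best_pairs = total, sorted(pairs)
    # Pairings with zero similarity carry no information, so drop them.
    return best_total, [(r, c) for r, c in best_pairs if matrix[r][c] > 0]


# Rows: ここから, 駅までは, 少なくとも, ５分は, かかる
# Columns: ここから, 港までは, ５時間は, かかる
# The values are illustrative, not taken from the patent.
matrix = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.75, 0.25, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
total, pairs = best_combination(matrix)
```

With these values the selected pairs are the four pairings of FIG. 6, and 「少なくとも」 is left unpaired. For longer sentences a polynomial-time assignment algorithm would be a more practical choice than enumeration.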

【００４４】Once the optimum combination calculation unit 7 has found the optimum combination of search input phrases and search target phrases, the control unit 2 judges from that combination whether the search input sentence and the search target sentence are similar, that is, whether the search target sentence can be taken as a search result (step S8). To keep the explanation simple, the criterion used here is that the two sentences are regarded as similar if the combination contains at least one pair of similar phrases (for example, a pair whose similarity is not zero). This criterion can of course be changed as appropriate to the required search accuracy, for example to whether every phrase has at least a certain similarity, or whether at least a certain proportion of the phrases have at least a certain similarity.
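The three criteria mentioned for step S8 can be written as small Python predicates over the similarities of the paired phrases. The function names, thresholds, and sample scores are hypothetical; the patent only names the kinds of criterion, not concrete values.

```python
def similar_if_any(scores):
    """Simplified criterion of the description: at least one pair is non-zero."""
    return any(s > 0 for s in scores)


def similar_if_all(scores, threshold):
    """Stricter variant: every paired phrase reaches the threshold."""
    return bool(scores) and all(s >= threshold for s in scores)


def similar_if_ratio(scores, threshold, min_ratio):
    """Variant: at least min_ratio of the paired phrases reach the threshold."""
    if not scores:
        return False
    hits = sum(1 for s in scores if s >= threshold)
    return hits / len(scores) >= min_ratio


# Illustrative similarities of the four paired phrases (assumed values).
scores = [1.0, 0.75, 0.25, 1.0]
loose = similar_if_any(scores)
strict = similar_if_all(scores, threshold=0.5)
partial = similar_if_ratio(scores, threshold=0.5, min_ratio=0.75)
```

The loose test accepts this sentence pair, the strict test rejects it because of the 0.25 pair, and the ratio test accepts it because three of the four pairs reach the threshold.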

【００４５】When the control unit 2 judges that the search input sentence and the search target sentence are similar, it activates the translated word search unit 8. The translated word search unit 8 uses each word of the search input sentence, that is, each word (here, each independent word) that the word dividing unit 4 produced from the search input sentence, as a key to look up its second-language translation in the translated word storage unit 13 (step S9). Specifically, as shown in FIG. 2(c), the translation "station" is retrieved for 「駅」 and the translation "minute" for 「分」.
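The lookup of step S9 amounts to a keyed dictionary search, sketched below in Python. The two dictionary entries are the ones the description gives; the part-of-speech tags and the rule that every word other than a particle counts as an independent word are assumptions, as are all names.

```python
# Stand-in for the translated word storage unit 13, holding only the two
# entries mentioned in the description.
TRANSLATED_WORD_STORE = {"駅": "station", "分": "minute"}

# Assumption: particles are the only non-independent words in this example.
FUNCTION_POS = {"particle"}


def look_up_translations(words, store):
    """Use each independent word as a key and collect the translations found.
    Words without an entry in the store are simply skipped."""
    found = {}
    for surface, pos in words:
        if pos in FUNCTION_POS:
            continue
        if surface in store:
            found[surface] = store[surface]
    return found


words = [
    ("ここ", "noun"), ("から", "particle"),
    ("駅", "noun"), ("まで", "particle"), ("は", "particle"),
    ("少なくとも", "adverb"),
    ("５", "numeral"), ("分", "counter"), ("は", "particle"),
    ("かかる", "verb"),
]
translations = look_up_translations(words, TRANSLATED_WORD_STORE)
```

A fuller dictionary would map a word to several candidate translations or to a word string, which the patent allows for but this sketch does not model.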

【００４６】When the translated word search unit 8 finishes the translation lookup, the control unit 2 activates the word correspondence detection unit 9. The word correspondence detection unit 9 detects, from the phrase combination obtained by the optimum combination calculation unit 7, the words of the search target sentence that correspond to words of the search input sentence (step S10). Specifically, from the pair in FIG. 6 consisting of the search input phrase 「駅までは」 and the search target phrase 「港までは」, it detects that 「駅」 corresponds to 「港」, and from 「５分は」 and 「５時間は」 it detects that 「分」 corresponds to 「時間」.

【００４７】単語対応検出部９はさらに、対訳記憶部１
２から検索対象文の単語に対応した第２の言語の文の単
語を検出する（ステップＳ１１）。この対訳記憶部１２
には、図３（ｃ）に示すように検索対象文「ここから港
までは５時間はかかる」の対訳である第２の言語の文
「It takes five hours to go to the seaport from he
re」が組として記憶されており、同時に各単語の情報
（品詞および単語属性）も記憶されている。対訳記憶部
１２にはまた、図３（ｄ）に示すように検索対象文の単
語と対訳の文の単語の対応関係も記憶されている。した
がって、検索対象文が「ここから港までは５時間はかか
る」の場合、「港」と対応する第２の言語の単語は、
「seaport 」、「時間」に対応する第２の言語は「hou
r」であることが検出される。最終的には、図７に示す
ように、検索入力文の単語「駅」が、検索対象文の対訳
の単語「seaport 」に、同様に「分」が「hour」に対応
することが分かる。The word correspondence detection unit 9 further detects, from the parallel translation storage unit 12, the words of the second-language sentence that correspond to the words of the search target sentence (step S11). As shown in FIG. 3(c), the parallel translation storage unit 12 stores the search target sentence 「ここから港までは５時間はかかる」 (it takes 5 hours from here to the port) together with its second-language translation "It takes five hours to go to the seaport from here" as a pair, along with information on each word (part of speech and word attributes). As shown in FIG. 3(d), the parallel translation storage unit 12 also stores the correspondence between the words of the search target sentence and the words of its translation. Therefore, when the search target sentence is 「ここから港までは５時間はかかる」, the second-language word corresponding to 「港」 (port) is detected to be "seaport" and the one corresponding to 「時間」 (hour) to be "hour". Finally, as shown in FIG. 7, the search input word 「駅」 (station) is found to correspond to the word "seaport" in the translation of the search target sentence, and likewise 「分」 (minute) to "hour".
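
Steps S10 and S11 amount to chaining two mappings: input word to search target word (from the paired phrases), then search target word to the word of its stored translation. The sketch below assumes the phrase pairs have already been reduced to their independent words; `chain_correspondence` and both data structures are hypothetical names introduced only for illustration.

```python
# Phrase pairs from the optimum combination (FIG. 6), reduced to independent words.
# This reduction is an assumption; the section does not say how it is performed.
phrase_word_pairs = [("駅", "港"), ("分", "時間")]

# Stored word alignment for the search target sentence (FIG. 3(d)).
target_alignment = {"港": "seaport", "時間": "hour"}


def chain_correspondence(pairs, alignment):
    """Steps S10-S11 sketch: input word -> target word -> word of the translation."""
    return {src: alignment[tgt] for src, tgt in pairs if tgt in alignment}


correspondence = chain_correspondence(phrase_word_pairs, target_alignment)
print(correspondence)
```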

【００４８】単語対応検出部９による検出処理が終了す
ると、制御部２により単語変形部１０および単語置換部
１１が順に起動される。単語変形部１０および単語置換
部１１は、単語対応検出部９の検出結果に基づいて、検
索対象文の対訳である第２の言語の文（中の単語）を変
形し置換する処理を行なう（ステップＳ１２）。このス
テップＳ１２の処理の詳細を説明する。When the detection process by the word correspondence detection unit 9 is completed, the control unit 2 activates the word transforming unit 10 and the word replacing unit 11 in that order. Based on the detection result of the word correspondence detection unit 9, the word transforming unit 10 and the word replacing unit 11 transform and replace words in the second-language sentence that is the translation of the search target sentence (step S12). The details of step S12 are described below.

【００４９】まず、単語対応検出部９の検出結果に基づ
いて、検索対象文の対訳中の単語を変形し置換する処理
（ステップＳ１２）には、検索対象文の対訳の単語の単
語属性に応じた変形・置換を行なう方式、または検索対
象文の対訳の単語に係わる構文属性も考慮した変形・置
換を行なう方式が適用可能である。Two methods are applicable to the process of transforming and replacing words in the translation of the search target sentence based on the detection result of the word correspondence detection unit 9 (step S12): one that transforms and replaces according to the word attributes of the words in the translation of the search target sentence, and one that also takes into account the syntactic attributes relating to those words.

【００５０】そこで、上記ステップＳ１２の処理の詳細
を、検索対象文の対訳の単語の単語属性に応じた変形・
置換を行なう場合を例に、図５（ａ）のフローチャート
を参照して説明する。The details of step S12 are first described with reference to the flowchart of FIG. 5(a), taking as an example the case where transformation and replacement follow the word attributes of the words in the translation of the search target sentence.

【００５１】まず単語変形部１０は、訳語検索部８で検
索した検索入力文の訳語である第２の言語の単語または
単語列を、単語対応検出部９で調べた検索対象文の第２
の言語の文の単語の例えば単語属性（活用、時制等）に
応じて変形する（ステップＳ２０）。具体的には、検索
入力文の単語「駅」は、検索対象文の対訳の単語「seap
ort 」に対応するので、「seaport 」の単語属性「単数
形」から単語「駅」の訳語「station 」は、単数形の
「station 」と変形される。同様に検索入力文の単語
「分」は、検索対象文の対訳の単語「hour」に対応する
ので、「hour」の単語属性「複数形」から単語「分」の
訳語「minute」は複数形の「minutes 」に変形される。First, the word transforming unit 10 transforms the second-language word or word string that the translated word search unit 8 retrieved as the translation of a search input word, according to, for example, the word attributes (inflection, tense, and so on) of the corresponding word in the second-language sentence of the search target sentence, as identified by the word correspondence detection unit 9 (step S20). Specifically, the search input word 「駅」 (station) corresponds to the word "seaport" in the translation of the search target sentence, so, from the word attribute "singular" of "seaport", its translation "station" is rendered in the singular, "station". Similarly, the search input word 「分」 (minute) corresponds to the word "hour" in the translation, so, from the word attribute "plural" of "hour", its translation "minute" is transformed into the plural "minutes".

【００５２】単語変形部１０により検索対象文の対訳で
ある第２の言語の文の変形が行なわれると、今度は単語
置換部１１により、当該単語変形部１０の結果を用い
て、変形した検索入力文の単語の訳語で対応する検索対
象文の対訳の文の単語を置き換える処理が行なわれる
（ステップＳ２１）。具体的には、検索対象文の対訳
「Ittakes five hours to go to the seaport from her
e」中の「hours 」、「seaport 」がそれぞれ「minutes
」、「station 」に置き換えられ、検索入力文「ここ
から駅までは５分はかかる」の翻訳文に近い形の「It t
akes five minutes togo to the station from here」
という第２の言語の文が得られる。Once the word transforming unit 10 has performed this transformation, the word replacing unit 11 uses its result to replace the corresponding words in the translation of the search target sentence with the transformed translations of the search input words (step S21). Specifically, "hours" and "seaport" in the translation "It takes five hours to go to the seaport from here" are replaced with "minutes" and "station", respectively, yielding the second-language sentence "It takes five minutes to go to the station from here", which is close to a translation of the search input sentence 「ここから駅までは５分はかかる」 (it takes 5 minutes from here to the station).
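
Steps S20 and S21 can be illustrated with a short sketch: each retrieved translation is inflected to match the word attribute of the word it will replace, and the stored translation is then rewritten token by token. The helper names (`pluralize`, `transform`, `replace_words`), the whitespace tokenization, and the add-an-s plural rule are simplifying assumptions, not details taken from the patent.

```python
def pluralize(noun):
    """Naive English plural, enough for this example; real inflection needs a lexicon."""
    return noun + "s"


def transform(translation, attribute):
    """Step S20 sketch: inflect the retrieved translation like the word it will replace."""
    return pluralize(translation) if attribute == "plural" else translation


def replace_words(tokens, replacements):
    """Step S21 sketch: swap each aligned token for its transformed replacement."""
    return [replacements.get(t, t) for t in tokens]


target_translation = "It takes five hours to go to the seaport from here".split()

# Surface form in the stored translation -> (its word attribute, translation of the input word).
aligned = {"seaport": ("singular", "station"), "hours": ("plural", "minute")}

replacements = {old: transform(new, attr) for old, (attr, new) in aligned.items()}
result = " ".join(replace_words(target_translation, replacements))
print(result)  # It takes five minutes to go to the station from here
```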

【００５３】制御部２は、上記した単語変形部１０およ
び単語置換部１１によるステップＳ２０，Ｓ２１からな
る変形・置換処理（ステップＳ１２）が終了すると、そ
の変形・置換後の第２の言語の文を表示部３に表示出力
する（ステップＳ１３）。そして制御部２は文節類似度
計算部６に制御を渡し、訳語記憶部１３に未だ取り出し
ていない（未検索の）検索対象文が残っているならば
（ステップＳ１４）、当該訳語記憶部１３からその検索
対象文を１文だけ取り出させる（ステップＳ４）。この
ステップＳ４以降の処理は、前記した「ここから港まで
は５時間はかかる」という検索対象文が取り出された場
合と同様である。但し、ステップＳ８で検索入力文と検
索対象文とが似ていないと判断された場合には、当該検
索対象文は無視されてステップＳ１４に飛ぶ。また、ス
テップＳ１４で訳語記憶部１３内に未検索の検索対象文
が残っていないものと判断された場合には、図４のフロ
ーチャートで示されている一連の処理（文書検索処理）
は終了となる。When the transformation and replacement process (step S12), consisting of steps S20 and S21 performed by the word transforming unit 10 and the word replacing unit 11, is completed, the control unit 2 displays the resulting second-language sentence on the display unit 3 (step S13). The control unit 2 then passes control to the phrase similarity calculation unit 6 and, if a search target sentence that has not yet been taken out (not yet searched) remains in the translated word storage unit 13 (step S14), has one such sentence taken out of it (step S4). The processing from step S4 onward is the same as when the search target sentence 「ここから港までは５時間はかかる」 was taken out. However, if it is determined in step S8 that the search input sentence and the search target sentence are not similar, that search target sentence is ignored and the process jumps to step S14. If it is determined in step S14 that no unsearched search target sentence remains in the translated word storage unit 13, the series of processes shown in the flowchart of FIG. 4 (the document search process) ends.

【００５４】ここで、本実施例による検索および単語置
換の表示例を図８に示す。図８（ａ）は検索入力文の表
示例、８（ｂ）は検索対象文およびその対訳の文の表示
例であり、図８（ｃ）は検索入力文に従って対訳の単語
を変形および置換した結果の表示例である。図中、斜線
が施されている部分は、置換・変形の対象となる、ある
いは置換・変形された訳語の単語が、強調表示されてい
ることを示す。FIG. 8 shows display examples of search and word replacement according to this embodiment. FIG. 8(a) is a display example of a search input sentence, FIG. 8(b) is a display example of a search target sentence and its translation, and FIG. 8(c) is a display example of the result of transforming and replacing the words of the translation in accordance with the search input sentence. In the figure, hatched portions indicate that the translated words that are subject to, or have undergone, replacement or transformation are highlighted.

【００５５】次に、単語対応検出部９の結果に基づいて
単語を変形し置換するステップＳ１２の詳細を、検索対
象文の対訳の単語に係わる構文属性に応じた変形・置換
を行なう場合を例に、図５（ｂ）のフローチャートを参
照して説明する。Next, the details of step S12, in which words are transformed and replaced based on the result of the word correspondence detection unit 9, are described with reference to the flowchart of FIG. 5(b), taking as an example the case where transformation and replacement follow the syntactic attributes relating to the words in the translation of the search target sentence.

【００５６】まず、この処理を考えるために、検索入力
文として、例えば「ここから駅までは少なくとも１分は
かかる」という文字列が入力されたと仮定する。上記の
ステップＳ１からステップＳ７の処理を経て、上記と同
様に検索対象文「ここから港までは５時間かかる」との
最適組み合わせを求めると、図９に示すように「ここか
ら」−「ここから」、「駅までは」−「港までは」、
「１分は」−「５時間は」、「かかる」−「かかる」と
なる。To consider this process, assume that the character string 「ここから駅までは少なくとも１分はかかる」 (it takes at least 1 minute from here to the station) is input as the search input sentence. When the optimum combination with the search target sentence 「ここから港までは５時間かかる」 (it takes 5 hours from here to the port) is obtained through steps S1 to S7 as before, the result, as shown in FIG. 9, pairs 「ここから」 with 「ここから」 (from here), 「駅までは」 (to the station) with 「港までは」 (to the port), 「１分は」 (1 minute) with 「５時間は」 (5 hours), and 「かかる」 with 「かかる」 (takes).

【００５７】この場合、訳語検索部８により、図１０に
示すように「駅」の訳語「station」、「１」の訳語「a
(one) 」、「分」の訳語「minute」と検索される（ス
テップＳ９）。そして単語対応検出部９は、図１１に示
すように検索入力文の単語「駅」が検索対象文の対訳の
単語「seaport 」に、同様に「１」が「five」に、
「分」が「hour」に対応することを検出する（ステップ
Ｓ１０，Ｓ１１）。In this case, as shown in FIG. 10, the translated word search unit 8 retrieves "station" as the translation of 「駅」, "a (one)" as the translation of 「１」, and "minute" as the translation of 「分」 (step S9). Then, as shown in FIG. 11, the word correspondence detection unit 9 detects that the search input word 「駅」 corresponds to the word "seaport" in the translation of the search target sentence, and likewise that 「１」 corresponds to "five" and 「分」 to "hour" (steps S10 and S11).

【００５８】単語変形部１０は、訳語検索部８で検索し
た検索入力文の単語の訳語である第２の言語の単語また
は単語列を、単語対応検出部９で調べた検索対象文の第
２の言語の文の単語の単語属性に応じて変形する（ステ
ップＳ３０）。具体的には、検索入力文の単語「駅」
は、検索対象文の対訳の単語「seaport 」に対応するの
で、「seaport 」の単語属性「単数形」から、単語
「駅」の訳語「station 」は単数形の「station 」と変
形される。同様に単語「１」は単語「five」に対応し、
「five」の単語属性はないので、単語「１」の訳語「a
」もしくは「one 」に変形される。また、単語「分」
は単語「hour」に対応するので、「hour」の単語属性
「複数形」から、単語「分」の訳語「minute」は複数形
の「minutes 」に変形される。The word transforming unit 10 transforms the second-language word or word string that the translated word search unit 8 retrieved as the translation of a search input word, according to the word attributes of the corresponding word in the second-language sentence of the search target sentence, as identified by the word correspondence detection unit 9 (step S30). Specifically, the search input word 「駅」 corresponds to the word "seaport" in the translation of the search target sentence, so, from the word attribute "singular" of "seaport", its translation "station" is rendered in the singular, "station". Similarly, 「１」 corresponds to "five"; since "five" has no word attribute, the translation of 「１」 becomes "a" or "one". And 「分」 corresponds to "hour", so, from the word attribute "plural" of "hour", its translation "minute" is transformed into the plural "minutes".

【００５９】単語置換部１１は、単語変形部１０の結果
を用いて、変形した検索入力文の単語の訳語で対応する
検索対象文の対訳の文の単語を置き換える（ステップＳ
３１）。具体的には、検索対象文の対訳「It takes fiv
e hours to go to the seaport from here」中の「fiv
e」、「hours 」、「seaport 」がそれぞれ「a (on
e)」、「minutes 」、「station 」に置き換えられ、
「It takes a minutes to goto the station from here
」という文が得られる。この単語置換部１１により単
語の置き換えがなされた第２の言語の文（単語置換部１
１の結果）は、単語変形部１０に戻される。The word replacing unit 11 uses the result of the word transforming unit 10 to replace the corresponding words in the translation of the search target sentence with the transformed translations of the search input words (step S31). Specifically, "five", "hours", and "seaport" in the translation "It takes five hours to go to the seaport from here" are replaced with "a (one)", "minutes", and "station", respectively, yielding the sentence "It takes a minutes to go to the station from here". The second-language sentence whose words have been replaced by the word replacing unit 11 (the result of the word replacing unit 11) is returned to the word transforming unit 10.

【００６０】単語変形部１０は、単語置換部１１の結果
を構文属性を用いてチェックし（ステップＳ３２）、構
文属性と合っていない単語を検出して、該当単語の変形
を行なう（ステップＳ３３）。The word transforming unit 10 checks the result of the word replacing unit 11 against the syntactic attributes (step S32), detects any word that does not conform to them, and transforms that word (step S33).

【００６１】具体的には単語変形部１０は、例えば図１
２のような構文属性を保持しており、「名詞は直前にa
またはone がある場合は単数形、直前にone 以外の数詞
（例えばtwo,three ）があれば複数形である」というよ
うな情報をもっている。そこで単語変形部１０は、上記
ステップＳ３２において、自身が保持している構文属性
で単語置換部１１の結果である「It takes a minutes t
o go to the stationfrom here 」をチェックすること
により、［a minutes ］が、上記した「名詞は直前にa
またはone がある場合は単数形」という構文属性と合っ
ていないことを検出する。この場合、単語変形部１０は
上記ステップＳ３３において、この構文属性に応じて
「minutes 」を単数形の「minute」に変形し、この結果
を再び単語置換部１１に渡す。More specifically, the word transforming unit 10 holds syntactic attributes such as those shown in FIG. 12, with information such as "a noun is singular if it is immediately preceded by a or one, and plural if it is immediately preceded by a numeral other than one (for example, two or three)". In step S32, the word transforming unit 10 checks the result of the word replacing unit 11, "It takes a minutes to go to the station from here", against the syntactic attributes it holds, and detects that "a minutes" does not conform to the attribute "a noun is singular if it is immediately preceded by a or one". In step S33, the word transforming unit 10 then transforms "minutes" into the singular "minute" in accordance with this syntactic attribute and passes the result to the word replacing unit 11 again.

【００６２】単語置換部１１は、再びステップＳ３１の
処理を行ない、「It takes a minute to go to the sta
tion from here」という文を得るので、再び単語変形部
１０に結果を戻す。こんどは構文属性と合っていない単
語は検出されないので、単語変形部１０および単語置換
部１１による図５（ｂ）のステップＳ３０〜Ｓ３３から
なる変形・置換処理（図４のフローチャート中のステッ
プＳ１２）は終了となる。すると、この検索対象文の対
訳を対象とする変形・置換処理の結果が、制御部２によ
り表示部３に表示される。The word replacing unit 11 performs step S31 again, obtains the sentence "It takes a minute to go to the station from here", and returns the result to the word transforming unit 10 once more. This time no word that fails to conform to the syntactic attributes is detected, so the transformation and replacement process consisting of steps S30 to S33 of FIG. 5(b) performed by the word transforming unit 10 and the word replacing unit 11 (step S12 in the flowchart of FIG. 4) ends. The control unit 2 then displays on the display unit 3 the result of the transformation and replacement process applied to the translation of this search target sentence.
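
The check-and-repair cycle of steps S32 and S33 can be sketched as a loop that repeats until no word violates the number-agreement rule quoted above. Everything below is a hedged illustration: `fix_number_agreement`, the marker sets, the caller-supplied noun set, and the strip-or-add-an-s inflection are assumptions for the example, not the contents of FIG. 12.

```python
SINGULAR_MARKERS = {"a", "one"}
OTHER_NUMERALS = {"two", "three", "four", "five"}  # illustrative, not exhaustive


def singular(noun):
    """Naive singular: drop a trailing 's'."""
    return noun[:-1] if noun.endswith("s") else noun


def plural(noun):
    """Naive plural: add a trailing 's' if missing."""
    return noun if noun.endswith("s") else noun + "s"


def fix_number_agreement(tokens, nouns):
    """Steps S32-S33 sketch: repeat check-and-transform until no violation remains."""
    tokens = list(tokens)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(tokens)):
            if tokens[i] not in nouns:
                continue
            prev = tokens[i - 1]
            if prev in SINGULAR_MARKERS:
                wanted = singular(tokens[i])
            elif prev in OTHER_NUMERALS:
                wanted = plural(tokens[i])
            else:
                continue  # the rule says nothing about this context
            if wanted != tokens[i]:
                tokens[i] = wanted
                changed = True
    return tokens


nouns = {"minute", "minutes", "station"}
replaced = "It takes a minutes to go to the station from here".split()
fixed = " ".join(fix_number_agreement(replaced, nouns))
print(fixed)  # It takes a minute to go to the station from here
```

The outer loop mirrors the hand-off between the two units: a pass that changes nothing corresponds to the point at which no non-conforming word is detected and step S12 ends.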

【００６３】以上は、単語変形部１０および単語置換部
１１が、検索対象文の対訳である第２の言語の文中の単
語を変形・置換する場合について説明したが、この単語
変形部１０および単語置換部１１により、検索対象文で
ある第１の言語の文中の単語を変形・置換する処理も行
なう構成とすることも可能である。そのために、単語変
形部１０には、検索入力文の単語を、単語対応検出部９
で検出した対応する検索対象文の単語の単語属性や構文
属性に応じて変形する処理機能が付加される。また単語
置換部１１には、単語変形部１０で変形した検索入力文
の単語で、単語対応検出部９で検出した対応する検索対
象文の単語を置き換える処理機能が付加される。The above describes the case where the word transforming unit 10 and the word replacing unit 11 transform and replace words in the second-language sentence that is the translation of the search target sentence. The word transforming unit 10 and the word replacing unit 11 can also be configured to transform and replace words in the first-language sentence that is the search target sentence itself. For this purpose, the word transforming unit 10 is given an additional function of transforming a word of the search input sentence according to the word attributes or syntactic attributes of the corresponding word of the search target sentence detected by the word correspondence detection unit 9. The word replacing unit 11 is likewise given an additional function of replacing the corresponding word of the search target sentence detected by the word correspondence detection unit 9 with the word of the search input sentence transformed by the word transforming unit 10.

【００６４】この場合、図４のフローチャート中のステ
ップＳ１２の処理の後に、検索対象文の単語を変形・置
換する処理が追加される。処理の詳細は、ステップＳ１
２の処理の詳細フローチャートである図５（ａ），
（ｂ）と同様の流れになる。必要があれば、図５
（ａ），（ｂ）のフローチャート中の「単語の訳語」お
よび「対訳の単語」を「単語」に、「検索入力文の訳
語」を「検索入力文の単語」に読み替えられたい。In this case, a process of transforming and replacing the words of the search target sentence is added after step S12 in the flowchart of FIG. 4. Its details follow the same flow as FIGS. 5(a) and 5(b), the detailed flowcharts of step S12. Where necessary, read "translation of a word" and "word of the translation" in the flowcharts of FIGS. 5(a) and 5(b) as "word", and "translation of the search input sentence" as "word of the search input sentence".

【００６５】すなわち、単語変形部１０は、例えば図５
（ａ）中のステップＳ２０の処理と同様にして、検索入
力文の単語を、単語対応検出部９で調べた検索対象文の
単語の単語属性に応じて変形する。具体的には、検索入
力文の単語「駅」は、検索対象文の単語「港」に対応
し、この「港」の単語属性は「名詞」なので活用しない
ことから、単語「駅」も活用せず変形は行なわない。同
様に「分」の単語属性も「名詞」なので対応する単語
「時間」も変形しない。That is, in the same manner as step S20 in FIG. 5(a), for example, the word transforming unit 10 transforms a word of the search input sentence according to the word attributes of the corresponding word of the search target sentence identified by the word correspondence detection unit 9. Specifically, the search input word 「駅」 (station) corresponds to the search target word 「港」 (port); since the word attribute of 「港」 is "noun", which does not inflect, 「駅」 does not inflect either and is not transformed. Similarly, since the word attribute of 「分」 (minute) is also "noun", the corresponding word 「時間」 (hour) is not transformed either.

【００６６】単語置換部１１は、単語変形部１０の結果
を用いて、例えば図５（ａ）中のステップＳ２１の処理
と同様に、変形した検索入力文の単語で、対応する検索
対象文の単語を置き換える。具体的には、検索対象文
「ここから港までは５時間はかかる」の例では、
「港」、「時間」をそれぞれ「駅」、「分」に置き換え
て、「ここから駅までは５分はかかる」という出力を得
る。Using the result of the word transforming unit 10, the word replacing unit 11 replaces the corresponding words of the search target sentence with the transformed words of the search input sentence, in the same manner as step S21 in FIG. 5(a), for example. Specifically, for the search target sentence 「ここから港までは５時間はかかる」 (it takes 5 hours from here to the port), 「港」 and 「時間」 are replaced with 「駅」 and 「分」, respectively, giving the output 「ここから駅までは５分はかかる」 (it takes 5 minutes from here to the station).

【００６７】図１３に、検索対象文の単語を変形・置換
する処理が追加された本実施例による検索および単語置
換の表示例を示す。図１３（ａ）は検索入力文の表示
例、図１３（ｂ）は検索対象文およびその対訳の文の表
示例、図１３（ｃ）は単語属性に応じて検索対象文の対
訳の文の単語を変形・置換した状況を示した図、図１３
（ｄ）は構文属性に応じて検索対象文の対訳の文の単語
を変形・置換した表示例、図１３（ｅ）は単語属性に応
じて検索対象文の単語を変形・置換した表示例である。
図中、斜線が施されている部分は、図８と同様に、置換
・変形の対象となる、あるいは置換・変形された訳語の
単語が、強調表示されていることを示す。FIG. 13 shows display examples of search and word replacement according to this embodiment with the added process of transforming and replacing the words of the search target sentence. FIG. 13(a) is a display example of a search input sentence; FIG. 13(b) is a display example of a search target sentence and its translation; FIG. 13(c) shows the words of the translation of the search target sentence transformed and replaced according to word attributes; FIG. 13(d) is a display example in which the words of the translation have been transformed and replaced according to syntactic attributes; and FIG. 13(e) is a display example in which the words of the search target sentence have been transformed and replaced according to word attributes. As in FIG. 8, hatched portions in the figure indicate that the words that are subject to, or have undergone, replacement or transformation are highlighted.

【００６８】なお本発明は上述した実施例に限定される
ものではない。本実施例では、単語変形部１０および単
語置換部１１が行なう検索入力文の単語に応じた検索対
象文の対訳文の単語の変形・置換の処理は、システムが
自動的に行なうようにした。しかし単語対応検出部９
が、検索入力文の単語と対訳の文の単語との対応を検出
した時点で、システムが検出した対応関係を表示部３で
提示し、ユーザに変形・置換の処理を行なうか否かを選
択させるようにしてよい。The present invention is not limited to the embodiment described above. In this embodiment, the system automatically performs the transformation and replacement of words in the translation of the search target sentence in accordance with the words of the search input sentence, as carried out by the word transforming unit 10 and the word replacing unit 11. However, at the point where the word correspondence detection unit 9 has detected the correspondence between the words of the search input sentence and the words of the translation, the system may instead present the detected correspondence on the display unit 3 and let the user choose whether to perform the transformation and replacement.

【００６９】また、単語変形部１０および単語置換部１
１により検索対象文の対訳文が変形・置換される毎に、
その結果を表示するのではなく、検索入力文と似ている
と判断された各検索対象文について、例えば類似度の総
和を求めて評価値とし、その評価値の大きい順に、対応
する検索対象文の対訳文の変形・置換結果を並べて表示
して、ユーザに選ばせるようにしてもよい。このために
は、検索入力文に対する検索処理（ステップＳ４〜Ｓ８
の処理）を対訳記憶部１２内の全ての検索対象文につい
て繰り返し行なって、検索入力文に似ていると判断され
た検索対象文を全て保持しておき、しかる後に上記した
評価値の大きい検索対象文の順に、その対訳文の変形・
置換を行なうようにするとよい。また、評価値の最も大
きい検索対象文についてのみ、前記ステップＳ９〜Ｓ１
３の処理を行なうようにしても構わない。Furthermore, instead of displaying the result each time the word transforming unit 10 and the word replacing unit 11 transform and replace the translation of a search target sentence, the system may compute an evaluation value, for example the sum of the similarities, for each search target sentence judged to be similar to the search input sentence, and display the transformation and replacement results for the corresponding translations side by side in descending order of evaluation value so that the user can choose among them. To do this, the search process for the search input sentence (steps S4 to S8) is repeated for all search target sentences in the parallel translation storage unit 12, all search target sentences judged to be similar to the search input sentence are retained, and their translations are then transformed and replaced in descending order of evaluation value. Alternatively, steps S9 to S13 may be performed only for the search target sentence with the largest evaluation value.
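
The ranking variant described here can be sketched as follows, assuming each retained search target sentence carries the list of phrase similarities from its optimum combination. The function name `rank_candidates`, the record layout, and the similarity numbers are made up for illustration; the section gives no actual scores.

```python
def rank_candidates(candidates):
    """Sort retained search target sentences by evaluation value, highest first.

    The evaluation value is the sum of the phrase similarities, as the text suggests.
    """
    scored = [(sum(c["similarities"]), c["sentence"]) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical retained candidates with illustrative integer similarity scores.
candidates = [
    {"sentence": "target A", "similarities": [3, 1, 2, 3]},
    {"sentence": "target B", "similarities": [3, 3, 2, 3]},
    {"sentence": "target C", "similarities": [1, 1, 1, 1]},
]

ranking = rank_candidates(candidates)
best_only = ranking[0]  # the "largest evaluation value only" alternative
print(ranking)
```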

【００７０】要するに本発明は要旨を逸脱しない範囲で
種々変形して実施することができる。［第２の実施例］図１４は、本発明の第２の実施例を示
す文書検索装置のブロック構成図である。In short, the present invention can be implemented with various modifications without departing from the scope of the invention. [Second Embodiment] FIG. 14 is a block diagram of a document retrieval apparatus showing a second embodiment of the present invention.

【００７１】図１４の文書検索装置は、入力部２１、制
御部２２、表示部２３、文分割部２４、優先制御部２
５、類似度計算部２６、最適組み合わせ算出部２７、対
訳記憶部２８および検索部２９から構成される。The document retrieval apparatus of FIG. 14 comprises an input unit 21, a control unit 22, a display unit 23, a sentence dividing unit 24, a priority control unit 25, a similarity calculation unit 26, an optimum combination calculation unit 27, a parallel translation storage unit 28, and a search unit 29.

【００７２】入力部２１は、文字列や種々の制御コード
（コマンド等）を入力するためのもので、例えばキーボ
ードまたはＯＣＲからなる入力装置である。制御部２２
は、装置全体の制御を司るものである。The input unit 21 is used to input character strings and various control codes (commands and the like), and is an input device such as a keyboard or an OCR. The control unit 22 controls the entire apparatus.

【００７３】表示部２３は、文字列や検索対象文字列な
どを表示するための出力手段であり、例えばＣＲＴディ
スプレイ装置または液晶表示装置である。文分割部２４
は、入力された文字列（文）を解析して一定の文構成単
位に分割するものである。The display unit 23 is an output means for displaying character strings, search target character strings, and the like, and is, for example, a CRT display device or a liquid crystal display device. The sentence dividing unit 24 analyzes an input character string (sentence) and divides it into fixed sentence constituent units.

【００７４】優先制御部２５は検索部２９による検索結
果の中から優先すべきものを選び出して出力するもので
ある。類似度計算部２６は、同じ言語（第１の言語また
は第２の言語）の文字列（文）の各文構成単位を比較し
て、類似度を計算するものである。The priority control unit 25 selects, from the search results of the search unit 29, those that should be given priority and outputs them. The similarity calculation unit 26 compares the sentence constituent units of character strings (sentences) in the same language (the first language or the second language) and calculates their similarity.

【００７５】最適組み合わせ算出部２７は、類似度計算
部２６の計算結果から、文全体として最も類似している
文構成単位の組み合わせを求めるものである。対訳記憶
部２８は、第１の言語の文と、その対訳である第２の言
語の文を組にして、対応が取れる形で保存記憶しておく
ためのものである。The optimum combination calculation unit 27 determines, from the calculation results of the similarity calculation unit 26, the combination of sentence constituent units that is most similar for the sentence as a whole. The parallel translation storage unit 28 stores each first-language sentence together with its second-language translation as a pair, in a form that allows the two to be matched.

【００７６】検索部２９は、対訳記憶部８から類似して
いる文を検索するものである。次に、図１４の構成の動
作について、その概要を説明する。入力部２１から、ま
ず検索入力文となる第１の言語の文字列が入力される。
この検索入力文は、制御部２２を介して文分割部２４に
送られ、一定の文構成単位（以下、単に単位と称する）
に分割される。図１５は文字列を解析して単位分割（こ
こでは文節に分割）した状況を示した図である。このと
き文分割部２４は、分割した単位に品詞、活用形などの
情報を付加している。The search unit 29 searches the parallel translation storage unit 8 for similar sentences. Next, an outline of the operation of the configuration of FIG. 14 is given. First, a first-language character string that serves as the search input sentence is input from the input unit 21. This search input sentence is sent via the control unit 22 to the sentence dividing unit 24, where it is divided into fixed sentence constituent units (hereinafter simply called units). FIG. 15 shows a character string that has been analyzed and divided into units (here, into phrases). At this point the sentence dividing unit 24 attaches information such as part of speech and inflected form to each unit.

【００７７】次に単位分割した結果は、検索部２９に送
られる。検索部２９は対訳記憶部２８から、検索対象文
を１文取り出す。対訳記憶部２８には、第１の言語と第
２の言語の文が組になって記憶されている。この第１の
言語と第２の言語の文はそれぞれ単位分割されており、
単位ごとに対応が取れる形になっている。対訳記憶部２
８に記憶されている内容の例を図１６に示す。Next, the result of the unit division is sent to the search unit 29. The search unit 29 takes one search target sentence out of the parallel translation storage unit 28. The parallel translation storage unit 28 stores first-language and second-language sentences in pairs. Each first-language sentence and second-language sentence is divided into units, in a form that allows them to be matched unit by unit. FIG. 16 shows an example of the contents stored in the parallel translation storage unit 28.

【００７８】検索部２９は、対訳記憶部２８から取り出
した検索対象文を類似度計算部２６に送る。類似度計算
部２６は、検索入力文の１単位と検索対象文の１単位と
を比較して類似度を計算する動作を、（検索入力文の単
位数）×（検索対象文の単位数）の総当たりで行なう。
この計算結果に基づいて最適組み合わせ算出部２７は、
文全体としてもっとも類似している組み合わせを求め
る。検索部２９は、最適組み合わせ算出部２７により求
められた組み合わせから、対応する検索対象文が検索入
力文に似ているか否かを判断し、似ているならば当該検
索対象文を検索結果として受け入れる。The search unit 29 sends the search target sentence taken out of the parallel translation storage unit 28 to the similarity calculation unit 26. The similarity calculation unit 26 compares one unit of the search input sentence with one unit of the search target sentence and calculates their similarity, doing so for all (number of units in the search input sentence) × (number of units in the search target sentence) pairings. Based on these results, the optimum combination calculation unit 27 determines the combination that is most similar for the sentence as a whole. From the combination determined by the optimum combination calculation unit 27, the search unit 29 judges whether the search target sentence is similar to the search input sentence and, if so, accepts it as a search result.

【００７９】以上の動作は、対訳記憶部２８内の全ての
検索対象文について繰り返される。検索部２９により得
られた検索結果は第１の言語と第２の言語の文の組であ
り、対訳として第２の言語の文を持っている。The above operation is repeated for all the search target sentences in the parallel translation storage unit 28. The search result obtained by the search unit 29 is a set of sentences in the first language and the second language, and has a sentence in the second language as a parallel translation.

【００８０】検索部２９は、検索結果（検索された検索
対象文）の対訳としての第２の言語の文を入力文とし
て、今度は第２の言語により、先の第１の言語による検
索の場合と同様に、対訳記憶部２８に記憶されている検
索対象文について検索する。この検索の後、対訳記憶部
２８に第２の言語による未検索の検索対象文が残ってい
るならば、そこから未検索の１文が取り出され、上記の
動作が繰り返される。Taking as its input sentence the second-language sentence that is the translation of a search result (a retrieved search target sentence), the search unit 29 now searches the search target sentences stored in the parallel translation storage unit 28 in the second language, in the same way as the earlier search in the first language. After this search, if unsearched second-language search target sentences remain in the parallel translation storage unit 28, one of them is taken out and the above operation is repeated.

【００８１】さて、第２の言語に対する検索で得られた
検索結果を出力する際、制御部２２は優先制御部２５に
その結果を送り、どの検索結果を優先するかを判定させ
た上で、その結果を表示部２３に出力する。優先制御部
２５は、第１の言語に対する検索と第２の言語に対する
検索の結果および状況を考慮して、どれを優先するかを
決定する。When outputting the search results obtained by the search in the second language, the control unit 22 sends them to the priority control unit 25, has it determine which search results to prioritize, and then outputs the results to the display unit 23. The priority control unit 25 decides which to prioritize by taking into account the results and circumstances of both the first-language search and the second-language search.

【００８２】以上が、図１４の構成の動作の概要であ
る。次に、図１４の構成の動作の詳細について、図１７
のフローチャートを参照して説明する。The above is an outline of the operation of the configuration of FIG. 14. Next, the operation is described in detail with reference to the flowchart of FIG. 17.

【００８３】本実施例では、入力部２１から検索入力文
が入力されることにより、この検索入力文と文全体とし
て類似している検索対象文を対訳記憶部２８から検索す
る動作が実行される。In this embodiment, when a search input sentence is input from the input unit 21, an operation of searching the parallel translation storage unit 28 for a search target sentence that is similar to the search input sentence as a whole sentence is executed. .

【００８４】まず、入力部２１から検索入力文が入力さ
れると（ステップＳ４１）、制御部２２は当該検索入力
文を表示部２３の表示画面に表示すると共に、文分割部
２４に出力する。ここでは検索入力文として、例えば
「その書類を一日も早く提出しなさい」という日本語文
字列（第１の言語の文字列）が入力されたと仮定する。First, when a search input sentence is input from the input unit 21 (step S41), the control unit 22 displays the search input sentence on the display screen of the display unit 23 and outputs it to the sentence dividing unit 24. Here, it is assumed that, for example, a Japanese character string (character string in the first language) "Please submit the document as soon as possible" is input as the search input sentence.

【００８５】文分割部２４は、入力された検索入力文の
文字列を解析して単位分割する（ステップＳ４２）。具
体的には、文分割部２４は、図１５に示すように、検索
入力文の「その書類を一日も早く提出しなさい」を、
「その」、「書類を」、「一日も」、「早く」、「提出
しなさい」のように一定の単位（ここでは文節）に分割
し、それぞれに品詞、活用形、付属語などの情報を付加
する。The sentence dividing unit 24 analyzes the character string of the input search input sentence and divides it into units (step S42). Specifically, as shown in FIG. 15, the sentence dividing unit 24 divides the search input sentence 「その書類を一日も早く提出しなさい」 (please submit the document as soon as possible) into fixed units (here, phrases), namely 「その」, 「書類を」, 「一日も」, 「早く」, and 「提出しなさい」, and attaches information such as part of speech, inflected form, and attached words to each.

[0086] The control unit 22 outputs this information to the search unit 29. The search unit 29 holds the information (the search input sentence divided into units by the sentence dividing unit 24) internally and then retrieves an arbitrary sentence from the parallel translation storage unit 28 as a search target sentence (step S43). The search target sentences stored in the parallel translation storage unit 28 have been divided into units in advance by a function equivalent to the sentence dividing unit 24. Assume that the sentence 「君の論文を一日も早く読みたいものだ」 ("I'd like to read your article as soon as possible") is retrieved from the parallel translation storage unit 28 as the search target sentence. In this case, as shown in FIG. 16, the parallel translation storage unit 28 stores the units 「君の」, 「論文を」, 「一日も」, 「早く」, and 「読みたいものだ」, together with the parallel translation "I'd like to read your article as soon as possible", which is likewise divided into units. The parallel translation storage unit 28 also stores, for each unit, information indicating the correspondence between the Japanese (first language) and the English (second language).
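As a rough illustration of what one record in the parallel translation storage unit 28 might look like, the sketch below keeps the pre-divided first-language units, the second-language sentence, and a unit-level correspondence. The class name `ParallelEntry`, its fields, and the particular alignment shown are assumptions made for this example; the actual layout of FIG. 16 is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class ParallelEntry:
    """One pre-divided search target sentence with its parallel translation."""
    source_units: list[str]   # first-language units, divided in advance
    target_text: str          # second-language sentence (parallel translation)
    # Hypothetical correspondence: index of a source unit -> second-language fragment
    alignment: dict[int, str] = field(default_factory=dict)

    def source_text(self) -> str:
        """Rebuild the first-language sentence from its units."""
        return "".join(self.source_units)


entry = ParallelEntry(
    source_units=["君の", "論文を", "一日も", "早く", "読みたいものだ"],
    target_text="I'd like to read your article as soon as possible",
    # Assumed alignment for illustration only
    alignment={
        0: "your",
        1: "article",
        2: "as soon as possible",
        3: "as soon as possible",
        4: "I'd like to read",
    },
)
```

Keeping the correspondence at unit level is what would later allow a matched first-language unit such as 「一日も」 to be traced to its second-language counterpart.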

[0087] When the search unit 29 has retrieved the search target sentence, the control unit 22 activates the similarity calculation unit 26. The similarity calculation unit 26 compares one unit (for example, a phrase) of the search input sentence obtained by the sentence dividing unit 24 with one unit (for example, a phrase) of the search target sentence retrieved from the parallel translation storage unit 28 by the search unit 29, and calculates their similarity (step S44). This calculation is performed exhaustively over (number of units in the search input sentence) × (number of units in the search target sentence) pairs, and is repeated until every combination of units has been processed (step S45).
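The exhaustive comparison of steps S44 and S45 can be sketched as a similarity matrix with one row per search input unit and one column per search target unit. The scoring function `unit_similarity` below is a placeholder assumed for this example (exact match scores 1.0, anything else 0.0); this passage does not specify the measure used by the similarity calculation unit 26.

```python
def unit_similarity(a: str, b: str) -> float:
    """Placeholder measure (an assumption): 1.0 for identical units, else 0.0."""
    return 1.0 if a == b else 0.0


def similarity_matrix(query_units, target_units):
    """Score every (query unit, target unit) pair, as in steps S44 and S45."""
    return [[unit_similarity(q, t) for t in target_units] for q in query_units]


query = ["その", "書類を", "一日も", "早く", "提出しなさい"]
target = ["君の", "論文を", "一日も", "早く", "読みたいものだ"]
matrix = similarity_matrix(query, target)
```

With these two sentences the matrix has 5 × 5 = 25 entries, and only the pairs 「一日も」/「一日も」 and 「早く」/「早く」 receive a non-zero score under the placeholder measure.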

[0088] Based on the results from the similarity calculation unit 26, the optimum combination calculation unit 27 determines the combination of units that makes the search input sentence and the search target sentence most similar as whole sentences (step S46). Here, as shown in FIG. 18, the combination that pairs 「一日も」 in the search input sentence with 「一日も」 in the search target sentence, and 「早く」 in the search input sentence with 「早く」 in the search target sentence, is calculated to be optimal.
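One way to obtain a best combination of units from a similarity matrix is dynamic programming over order-preserving one-to-one pairings. The sketch below is an assumed technique chosen for illustration; this passage does not state which algorithm the optimum combination calculation unit 27 uses, and the function name `best_combination` is hypothetical.

```python
def best_combination(matrix):
    """Return (total, pairs) for the order-preserving one-to-one pairing of
    row and column indices with the highest total similarity."""
    n = len(matrix)
    m = len(matrix[0]) if n else 0
    # score[i][j]: best total using the first i rows and the first j columns
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j],                             # leave row i-1 unpaired
                score[i][j - 1],                             # leave column j-1 unpaired
                score[i - 1][j - 1] + matrix[i - 1][j - 1],  # pair row i-1 with column j-1
            )
    # Trace back; a pair is recorded only when pairing strictly improved the total
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j]:
            i -= 1
        elif score[i][j] == score[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# 5 x 5 example: only units 2 and 3 of each sentence match
example = [[0.0] * 5 for _ in range(5)]
example[2][2] = 1.0
example[3][3] = 1.0
total, pairs = best_combination(example)
```

For the example matrix, the pairing selects units 2 and 3 on both sides, which mirrors the 「一日も」 and 「早く」 pairing described for FIG. 18.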

[0089] Once the optimum combination calculation unit 27 has determined the optimum combination of units of the search input sentence and units of the search target sentence, the search unit 29 uses that combination to judge whether the two sentences are similar (step S47). To simplify the description, the criterion in step S47 is the same as in the first embodiment: if the combination contains at least one similar unit (for example, one whose similarity is not zero), the search input sentence and the search target sentence are regarded as similar. Here it is assumed that the search target sentence 「君の論文を一日も早く読みたいものだ」 retrieved in step S43 is judged to be similar to the search input sentence 「その書類を一日も早く提出しなさい」.
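The simplified criterion of step S47 reduces to a single check over the paired units. In this minimal sketch, `matrix` holds unit-to-unit similarities and `pairs` lists the index pairs of the chosen combination; both names are assumptions for the example.

```python
def is_similar(matrix, pairs):
    """Step S47 (simplified): similar if any paired units have non-zero similarity."""
    return any(matrix[i][j] != 0 for i, j in pairs)


scores = [[0.0, 0.0],
          [0.0, 1.0]]
hit = is_similar(scores, [(1, 1)])    # one paired unit matches
miss = is_similar(scores, [(0, 0)])   # paired units do not match
```

A stricter threshold could be substituted for the non-zero test; the text presents the non-zero criterion only as a simplification.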

[0090] If the search unit 29 judges in step S47 that the sentences are similar, it accepts the search target sentence as a search result (a search result for the original sentence) and holds it (step S48), and then proceeds to step S49. If it judges in step S47 that they are not similar, it proceeds directly to step S49.

[0091] In step S49, the search unit 29 checks whether any unsearched search target sentences remain in the parallel translation storage unit 28. If any remain, it repeats the series of search steps described above (steps S43 to S48) for the search input sentence (original sentence).

[0092] When the search processing for the search input sentence (original sentence) has been performed on all search target sentences in the parallel translation storage unit 28, the search unit 29 proceeds from step S49 to step S50.

[0093] As described above, the parallel translation storage unit 28 also stores the second-language sentence that is the parallel translation of each search target sentence (see FIG. 16). Here, the English sentence "I'd like to read your article as soon as possible" is stored as the parallel translation of 「君の論文を一日も早く読みたいものだ」, the sentence that was judged in step S47 to be similar to the search input sentence and is held as a search result.

[0094] In step S50, the search unit 29 retrieves this English parallel translation, that is, the parallel translation of the search result, and also obtains the per-unit information stored for it in the parallel translation storage unit 28.

[0095] The search unit 29 then treats the retrieved parallel translation of the search result (the parallel translation of a search target sentence similar to the search input sentence) as a search input sentence in English (the second language). Using the similarity calculation unit 26 and the optimum combination calculation unit 27, it performs a search in English in the same way as was done for Japanese in steps S43 to S48.

[0096] That is, the search unit 29 first retrieves the parallel translation of an arbitrary sentence from the parallel translation storage unit 28 as a search target sentence (step S51). The similarity calculation unit 26 compares one unit (for example, a word) of the parallel translation of the search result (the English search input sentence) with one unit (for example, a word) of the translation retrieved from the parallel translation storage unit 28 by the search unit 29 (the English search target sentence), and calculates their similarity (step S52). The similarity calculation unit 26 repeats this calculation for all (number of units in the search input sentence) × (number of units in the search target sentence) combinations (step S53).

[0097] Based on the results from the similarity calculation unit 26, the optimum combination calculation unit 27 determines the combination of units that makes the English search input sentence and the English search target sentence most similar as whole sentences (step S54).

[0098] Once the optimum combination calculation unit 27 has determined the optimum combination of units of the English search input sentence and units of the English search target sentence, the search unit 29 uses that combination to judge whether the two English sentences are similar (step S55). If it judges that they are similar, the search unit 29 internally holds the Japanese search target sentence corresponding to that English search target sentence as a search result (step S56).

[0099] If the search unit 29 judges in step S55 that the new English search input sentence and the English search target sentence are similar, it executes step S56; if it judges that they are not similar, it skips step S56. In either case it then checks whether any search target sentences not yet searched for the new English search input sentence remain in the parallel translation storage unit 28 (step S57). If any remain, it repeats the processing from step S51 onward until no unsearched search target sentences remain.
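The overall flow of steps S43 to S57 (a first-language search followed by a second-language search that reuses each hit's parallel translation as the new input) can be sketched as follows. This is a simplified stand-in: `shares_unit` replaces the similarity calculation and optimum combination steps with a plain "any common unit" test, English sentences are divided into lower-cased words, and the third corpus entry's Japanese text is invented for this example.

```python
def shares_unit(a, b):
    """Simplified similarity judgment (an assumption): any common unit."""
    return bool(set(a) & set(b))


def two_stage_search(query_units, corpus):
    """corpus: list of (first-language units, second-language sentence)."""
    # Stage 1 (steps S43 to S49): search on the first-language side
    first_hits = [
        i for i, (ja, _) in enumerate(corpus) if shares_unit(query_units, ja)
    ]
    # Stage 2 (steps S50 to S57): reuse each hit's translation as a new query
    second_hits = {}
    for i in first_hits:
        new_query = corpus[i][1].lower().split()
        second_hits[i] = [
            j for j, (_, en) in enumerate(corpus)
            # Skipping the hit itself is an assumption of this sketch
            if j != i and shares_unit(new_query, en.lower().split())
        ]
    return first_hits, second_hits


corpus = [
    (["君の", "論文を", "一日も", "早く", "読みたいものだ"],
     "I'd like to read your article as soon as possible"),
    (["この", "部門の", "コンピュータ化を", "一日も", "早く", "図るべきです"],
     "We must computerize this department as soon as possible"),
    # Japanese side invented for illustration; only the English appears in the text
    (["お金の", "出所を", "知りたいものだ"],
     "I'd like to know where the money came from"),
]
query_units = ["その", "書類を", "一日も", "早く", "提出しなさい"]
first_hits, second_hits = two_stage_search(query_units, corpus)
```

In this toy corpus the third sentence is not found by the Japanese search but is reached through the English side, because it shares "I'd like to" with the translation of the first hit.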

[0100] As a result, suppose for example that 「君の論文を一日も早く読みたいものだ」 was obtained as a search result for the original search input sentence (the original Japanese search input sentence) 「その書類を一日も早く提出しなさい」, that its parallel translation "I'd like to read your article as soon as possible" is used as the new search input sentence, and that the search unit 29 obtains the results shown in FIG. 19(a) as the English (second-language) search results for that new search input sentence.

[0101] The control unit 22 receives the search results obtained by the search unit 29 and outputs them to the priority control unit 25, thereby activating it. The priority control unit 25 decides which of the received search results to output preferentially (step S58). Either the first procedure (method) shown in the flowchart of FIG. 20(a) or the second procedure (method) shown in the flowchart of FIG. 20(b) can be applied to the decision in step S58; the specific procedures are described later.

[0102] The decision of the priority control unit 25 is returned to the control unit 22. On receiving it, the control unit 22 preferentially outputs the sentences of the search results selected for priority to the display screen of the display unit 23 (step S59). After the control unit 22 has output the search results, the search unit 29 checks whether any unprocessed search results remain for the original search input sentence (the Japanese search input sentence), that is, search results whose parallel translations are still to be used as new search input sentences (step S60). If any remain, processing returns to step S50. In this way, the processing from step S50 onward is repeated until no unprocessed search results remain.

[0103] Next, the details of the decision in step S58 are described for the case where the first procedure, shown in the flowchart of FIG. 20(a), is applied. Suppose, for example, that the original search input sentence is 「その書類を一日も早く提出しなさい」 and that 「君の論文を一日も早く読みたいものだ」 was obtained as a search result. Suppose also that this search result has the parallel translation "I'd like to read your article as soon as possible", and that searching with it as the input sentence yielded the five sentences indicated by reference numerals 191a to 195a in FIG. 19(a). Suppose further that sentences 191a, 192a, and 194a are similar in the "as (soon) as possible" part, while sentences 193a and 195a are similar in the "would like to" part.

[0104] The priority control unit 25 first takes one sentence from the search results of the search unit 29 received from the control unit 22 (step S60) and examines the information on which part of it is similar (step S61). For sentence 191a, "I'll do it as soon as possible", the "as soon as possible" part is found to be similar.

[0105] Next, using the information that the Japanese 「一日も早く」 corresponds to the English "as soon as possible", the priority control unit 25 judges whether the similar part is the same part (step S62). If it is the same, the sentence is prioritized (step S63); if it is different, the sentence is not prioritized. In the example above, sentence 191a is prioritized, but sentence 193a, "I'd like to know where the money came from", is not, because its similar part is "would like to", which differs from "as soon as possible".

[0106] By this procedure, the priority control unit 25 examines, for each of the search results received from the control unit 22, whether to prioritize it (step S64). As a result, in the example above, it is decided that sentences 191a, 192a, and 194a are prioritized and sentences 193a and 195a are not.
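The first procedure (steps S60 to S64) can be approximated by checking whether the part of each second-language result that matches the new input overlaps the translation of the part that matched in the first-language search. The sketch below is an assumption-laden simplification: the "similar part" is taken to be the set of shared lower-cased words, and the function names are hypothetical.

```python
def similar_words(query: str, result: str) -> set[str]:
    """Words shared by the two sentences (a stand-in for the matched units)."""
    return set(query.lower().split()) & set(result.lower().split())


def prioritize_by_similar_part(query, results, key_phrase):
    """Flag results whose similar part overlaps key_phrase, prioritized first."""
    key = set(key_phrase.lower().split())
    flagged = [(s, bool(similar_words(query, s) & key)) for s in results]
    # Stable sort: prioritized sentences first, original order otherwise kept
    return sorted(flagged, key=lambda item: not item[1])


query_en = "I'd like to read your article as soon as possible"
results_en = [
    "I'd like to know where the money came from",  # 193a
    "I'll do it as soon as possible",              # 191a
]
# "as soon as possible" corresponds to the matched Japanese part 「一日も早く」
ranked = prioritize_by_similar_part(query_en, results_en, "as soon as possible")
```

Here sentence 191a moves ahead of 193a, since 193a matches the input only in the "I'd like to" part.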

[0107] Next, the details of the decision in step S58 are described for the case where the second procedure, shown in the flowchart of FIG. 20(b), is applied. Suppose first that, as (Japanese) search results for the original search input sentence 「その書類を一日も早く提出しなさい」, the sentence 「この部門のコンピュータ化を一日も早く図るべきです」 was retrieved in addition to 「君の論文を一日も早く読みたいものだ」. These search results are indicated by reference numerals 191b and 192b in FIG. 19(b). Also, as shown in FIG. 19(a), the results of searching with the parallel translation "I'd like to read your article as soon as possible" of 「君の論文を一日も早く読みたいものだ」 as input include "We must computerize this department as soon as possible", the parallel translation of 「この部門のコンピュータ化を一日も早く図るべきです」.

[0108] The priority control unit 25 first takes one sentence from search results such as those shown in FIG. 19(a) (the results of searching with the parallel translation of a search result for the original search input sentence as the input sentence) (step S70), and compares it with the original Japanese search results such as those shown in FIG. 19(b) (step S71). It then checks whether the sentence taken is among the Japanese search results (step S72). If it is, the sentence is prioritized (step S73); if not, it is not prioritized.

[0109] By this procedure, the priority control unit 25 examines, for each of the search results shown in FIG. 19(a) (all of the search results received from the control unit 22), which were obtained by searching with the parallel translation of a search result for the original search input sentence as the input sentence, whether to prioritize it (step S74). As a result, in the example above, it is decided that sentence 192a is prioritized and the others are not.
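The second procedure (steps S70 to S74) amounts to a membership test: a second-language result is prioritized when its first-language counterpart also appears among the first-language results. In the sketch below, stored sentences are represented by hypothetical identifiers (`"s1"`, `"s2"`, `"s3"`) rather than by the reference numerals of FIG. 19, whose exact correspondence is not assumed here.

```python
def prioritize_by_first_results(second_ids, first_ids):
    """Flag each second-language result that also occurs in the first-language results."""
    first = set(first_ids)
    return [(sid, sid in first) for sid in second_ids]


# Hypothetical identifiers of stored sentence pairs
first_ids = ["s1", "s2"]    # hits of the original first-language search
second_ids = ["s2", "s3"]   # hits of the search using the translation of "s1"
flags = prioritize_by_first_results(second_ids, first_ids)
```

Only `"s2"`, which was found by both searches, is flagged for priority.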

[0110] The present invention is not limited to the embodiments described above. In this embodiment, Japanese is assumed as the first language and English as the second language, but the roles may be reversed. Languages other than Japanese and English may also be used.

[0111] The items that the priority control unit 25 selects for priority may also be used to restrict the output to those items, rather than to prioritize them in the output. Furthermore, the sequence of searching in the first language, retrieving the parallel translation from the search results, and searching in the second language (with the parallel translation) may be carried out in response to an instruction from the user or automatically by the system. In short, the present invention can be implemented with various modifications without departing from its gist.

【０１１２】[0112]

[Effects of the Invention] As described above, according to the present invention, in a document search method and device that search for a search target sentence similar to a search input sentence in a first language, a word in the second-language sentence that is the parallel translation of the retrieved search target sentence is replaced with a transformed form of the translation of the corresponding word in the search input sentence. Consequently, rather than merely presenting a sentence to serve as a reference for translating the search input sentence, the invention can present a sentence corrected to be close to a translation of the search input sentence, and can therefore present more useful example sentences to a user who wants to translate the search input sentence.

[0113] Further, according to the present invention, after a search target sentence similar to the first-language search input sentence has been retrieved, the second-language sentence that is its parallel translation is used as a new search input sentence and the search is performed again. This provides the user with useful information that exploits the linguistic characteristics of the second language. In particular, the repeated search can give priority to results whose similar part is the part that was judged similar in the first-language search, or to second-language search results that are the same as the first-language search results, so that even more useful information is provided to the user.

[Brief description of drawings]

FIG. 1 is a block diagram of a document search device according to a first embodiment of the present invention.

FIG. 2 shows an example of a search input sentence. FIG. 2(a) shows the words and word information of the search input sentence as divided by the word division unit 4, FIG. 2(b) shows the phrases of the search input sentence synthesized from the words by the phrase synthesis unit 5, and FIG. 2(c) shows the correspondence between the words of the search input sentence and the translated words retrieved by the word search unit 8.

FIG. 3 shows an example of the contents of the parallel translation storage unit 12. FIG. 3(a) shows the words and word information of a search target sentence, FIG. 3(b) shows the phrases of the search target sentence, FIG. 3(c) shows the words and word information of the parallel translation of the search target sentence, and FIG. 3(d) shows the words of the search target sentence and the corresponding words of the parallel translation.

FIG. 4 is a flowchart showing the processing flow of the configuration of FIG. 1.

FIG. 5 explains the details of step S13 in FIG. 4. FIG. 5(a) is a flowchart applying the method of transforming and replacing the corresponding word of the search target sentence with the word of the search input sentence, and FIG. 5(b) is a flowchart applying the method of transforming according to syntax attributes.

FIG. 6 shows a combination of phrases of the search input sentence and phrases of the search target sentence obtained by the optimum combination calculation unit 7.

FIG. 7 shows, in association with one another, the words of the search input sentence, their translations, the words of the search target sentence, and the words of the parallel translation of the search target sentence, as detected by the word correspondence detection unit 9.

FIG. 8 shows a display example of search and word replacement in the configuration of FIG. 1.

FIG. 9 shows a combination of phrases of the search input sentence and phrases of the search target sentence obtained by the optimum combination calculation unit 7.

FIG. 10 shows the words of a search target sentence and the corresponding words of the parallel translation.

FIG. 11 shows, in association with one another, the words of the search input sentence, their translations, the words of the search target sentence, and the words of the search target sentence, as detected by the word correspondence detection unit 9.

FIG. 12 shows syntax attributes.

FIG. 13 shows a display example of search and word replacement in the configuration of FIG. 1.

FIG. 14 is a block diagram of a document search device according to a second embodiment of the present invention.

FIG. 15 shows the units produced by the sentence dividing unit 24 and the information attached to those units.

FIG. 16 shows an example of a search target sentence stored in the parallel translation storage unit 28.

FIG. 17 is a flowchart showing the processing flow of the configuration of FIG. 14.

FIG. 18 shows a combination of phrases obtained by the optimum combination calculation unit 27.

FIG. 19 shows an example of the results of a search in the second language.

FIG. 20 explains the details of step S58 in FIG. 18. FIG. 20(a) is a flowchart of the procedure for prioritizing results in which the translated part corresponding to the unit judged similar in the search with the original search input sentence is also similar in the search with the translation. FIG. 20(b) is a flowchart of the procedure for prioritizing, among the results obtained by the search with the translation, those that are also included in the results of the search with the original search input sentence.

[Explanation of symbols]

1, 21: input unit; 2, 22: control unit; 3, 23: display unit; 4: word division unit; 5: phrase synthesis unit; 6: phrase similarity calculation unit; 7, 27: optimum combination calculation unit; 8: word search unit; 9: word correspondence detection unit; 10: word transformation unit; 11: word replacement unit; 12, 28: parallel translation storage unit; 13: translated word storage unit (word dictionary); 24: sentence dividing unit; 25: priority control unit (similar part determination means, search result determination means); 26: similarity calculation unit (first similarity calculation means, second similarity calculation means); 29: search unit (first search means, second search means).

Claims

[Claims]

1. A document search method in which a search input sentence in a first language is compared with a search target sentence for each fixed sentence constituent unit to calculate the similarity of the sentence constituent units, and a search target sentence judged to be similar as a whole sentence is obtained, the method characterized by: providing parallel translation storage means that stores each search target sentence in the first language in association, as a pair, with a sentence in a second language that is its parallel translation, and a word dictionary that pairs words in the first language with the words or word strings in the second language that are their translations; searching the word dictionary, using a word contained in the search input sentence in the first language as a key, for the second-language word that is its translation; and, when the obtained search target sentence contains, within an expression similar to the search input sentence, a word corresponding to a word of the search input sentence, replacing the corresponding word of the second-language sentence paired with the obtained search target sentence with the word or word string retrieved as the translation of that word of the search input sentence.

2. A document retrieval device characterized by comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation in association with each other as a pair; translated word storage means for storing a word in the first language and a word or word string in the second language that is its translation as a pair; similarity calculation means for calculating, when a search input sentence in the first language is input from the input means, the similarity between the search input sentence and an arbitrary search target sentence in the parallel translation storage means for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation results of the similarity calculation means, the combination of sentence constituent units that are most similar between the search input sentence and the search target sentence; translated word search means for searching the translated word storage means for a second-language word that is the translation of a word contained in the search input sentence; word correspondence detecting means for detecting, from the combination obtained by the optimum combination calculation means, the word of the search target sentence corresponding to the word of the search input sentence, and for detecting, from the parallel translation storage means, the word of the second-language sentence corresponding to that word of the search target sentence; and word replacement means for replacing the word of the second-language sentence detected by the word correspondence detecting means with the second-language word or word string that is the translation of the word of the search input sentence retrieved by the translated word search means, whereby a corrected parallel translation of the search target sentence similar to the search input sentence is obtained.

3. A document search method in which a search input sentence in a first language is compared with a search target sentence for each fixed sentence constituent unit to calculate the similarity of the sentence constituent units, and a search target sentence judged to be similar as a whole sentence is obtained, the method characterized by: providing parallel translation storage means that stores each search target sentence in the first language in association, as a pair, with a sentence in a second language that is its parallel translation, and a word dictionary that pairs words in the first language with the words or word strings in the second language that are their translations; searching the word dictionary, using a word contained in the search input sentence in the first language as a key, for the second-language word that is its translation; and, when the obtained search target sentence contains, within an expression similar to the search input sentence, a word corresponding to a word of the search input sentence, transforming the word or word string retrieved as the translation of that word of the search input sentence and replacing the corresponding word of the second-language sentence paired with the obtained search target sentence with the transformed word or word string.

4. The document search method according to claim 3, wherein, when the corresponding word of the second-language sentence paired with the obtained search target sentence is replaced with the word or word string that is the translation of the word of the search input sentence, the translated word or word string is transformed according to the word attribute of the corresponding word of the second-language sentence and then substituted.

5. The document search method according to claim 3, wherein, when the corresponding word of the second-language sentence paired with the obtained search target sentence is replaced with the word or word string that is the translation of the word of the search input sentence, the translated word or word string is transformed according to the syntactic attribute of the corresponding word of the second-language sentence and then substituted.
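The transformation recited in claims 4 and 5 can be pictured with a small sketch in which the translated word takes on the attributes of the word it replaces. The attribute names (`number`, `tense`), the naive English inflection rules, and the function `transform_translation` are assumptions made for illustration; the claims do not specify which attributes are used or how the transformation is carried out.

```python
def transform_translation(translation, attrs):
    """Inflect a dictionary-form translation to match the replaced word's attributes."""
    if attrs.get("number") == "plural":
        # Naive English pluralization, sufficient only for regular nouns.
        suffix = "es" if translation.endswith(("s", "x", "ch", "sh")) else "s"
        return translation + suffix
    if attrs.get("tense") == "past":
        # Naive regular past tense.
        return translation + ("d" if translation.endswith("e") else "ed")
    return translation


# Hypothetical stored translation and the attributes of the word to be replaced.
target = ["I", "bought", "two", "books"]
attrs_of = {"books": {"number": "plural"}}

replaced = "books"
target[target.index(replaced)] = transform_translation("pen", attrs_of[replaced])
print(" ".join(target))  # I bought two pens
```

Without the transformation step, the substitution would yield "I bought two pen"; carrying the plural attribute over to the new word keeps the corrected translation grammatical.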

6. A document search device comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation, associated with each other as a pair; translated word storage means for storing a word in the first language paired with a word or word string in the second language that is its translation; similarity calculation means for calculating, when a search input sentence in the first language is input from the input means, the similarity between the search input sentence and an arbitrary search target sentence in the parallel translation storage means for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation result of the similarity calculation means, the combination of sentence constituent units that are most similar between the search input sentence and the search target sentence; translated word search means for retrieving, from the translated word storage means, a word in the second language that is a translation of a word included in the search input sentence; word correspondence detection means for detecting, from the combination obtained by the optimum combination calculation means, the word of the search target sentence corresponding to a word of the search input sentence, and further detecting, from the parallel translation storage means, the word of the second-language sentence corresponding to that word of the search target sentence; word transforming means for transforming the second-language word or word string that is the translation of the word of the search input sentence retrieved by the translated word search means; and word replacement means for replacing the word of the second-language sentence corresponding to the word of the search target sentence detected by the word correspondence detection means with the word or word string transformed by the word transforming means, whereby a corrected parallel translation of the search target sentence similar to the search input sentence is obtained.

7. A document search method in which a search input sentence in a first language and a search target sentence are compared for each fixed sentence constituent unit to calculate the similarity of the sentence constituent units, and a search target sentence determined to be similar as a whole sentence is obtained, the method using parallel translation storage means that stores a search target sentence in the first language in association with a sentence in a second language that is its parallel translation, and a word dictionary that pairs words in the first language with the words or word strings in the second language that are their translations, and comprising: searching the word dictionary, with a word included in the search input sentence in the first language as a key, for the second-language word that is its translation; and, when the obtained search target sentence contains, in an expression similar to the search input sentence, a word corresponding to a word of the search input sentence, replacing that word of the search target sentence with the corresponding word of the search input sentence after transforming it, and also replacing the corresponding word of the second-language sentence paired with the search target sentence with the word or word string that is the translation of the word in the search input sentence, after transforming that word or word string.

8. A document search device comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation, associated with each other as a pair; translated word storage means for storing a word in the first language paired with a word or word string in the second language that is its translation; similarity calculation means for calculating, when a search input sentence in the first language is input from the input means, the similarity between the search input sentence and an arbitrary search target sentence in the parallel translation storage means for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation result of the similarity calculation means, the combination of sentence constituent units that are most similar between the search input sentence and the search target sentence; translated word search means for retrieving, from the translated word storage means, a word in the second language that is a translation of a word included in the search input sentence; word correspondence detection means for detecting, from the combination obtained by the optimum combination calculation means, the word of the search target sentence corresponding to a word of the search input sentence, and further detecting, from the parallel translation storage means, the word of the second-language sentence corresponding to that word of the search target sentence; word transforming means for transforming the word of the search input sentence and the second-language word or word string that is the translation of that word retrieved by the translated word search means; and word replacement means for replacing the word of the search target sentence detected by the word correspondence detection means with the word of the search input sentence transformed by the word transforming means, and for replacing the word of the second-language sentence corresponding to the detected word of the search target sentence with the second-language word or word string, transformed by the word transforming means, that is the translation of the word of the search input sentence, whereby a corrected parallel translation of the search target sentence similar to the search input sentence is obtained.

9. A document search method for retrieving a search target sentence determined to be similar as a whole sentence by applying a search method that calculates the similarity between a search input sentence and a search target sentence for each fixed sentence constituent unit and obtains the combination of the most similar sentence constituent units, the method using parallel translation storage means in which a sentence in a first language and a sentence in a second language that is its translation are stored in association with each other, and comprising: searching, for a search input sentence in the first language, the set of sentences stored in the storage means by the search method for the first language; and then, using the second-language sentence corresponding to the sentence thus retrieved as an input sentence, searching the set of sentences stored in the storage means by the search method for the second language.

10. A document search device comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation, associated with each other as a pair; first similarity calculation means for calculating the similarity between a search input sentence in the first language and a search target sentence for each fixed sentence constituent unit; second similarity calculation means for calculating the similarity between a search input sentence in the second language and a search target sentence for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation result of the first or second similarity calculation means, the combination of sentence constituent units that are most similar for the first language or the second language; first search means for performing, when a search input sentence in the first language is input from the input means, a search of the set of sentences stored in the storage means by applying the first similarity calculation means and the optimum combination calculation means for the first language to the search input sentence; and second search means for performing, with the second-language sentence that is the parallel translation of the search target sentence retrieved by the first search means as a new search input sentence, a search of the set of sentences stored in the storage means by applying the second similarity calculation means and the optimum combination calculation means for the second language to that search input sentence.
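The two-stage search of claims 9 and 10 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: `difflib.SequenceMatcher.ratio()` over word tokens stands in for both similarity calculation means and the optimum combination calculation, and the stored pairs, the 0.6 threshold, and the names `PAIRS`, `similarity`, `search`, and `two_stage_search` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical stored pairs: (first-language tokens, second-language tokens).
PAIRS = [
    (["私", "は", "本", "を", "買う"], ["I", "buy", "a", "book"]),
    (["彼", "は", "本", "を", "読む"], ["He", "reads", "a", "book"]),
    (["私", "は", "雑誌", "を", "購入", "する"], ["I", "buy", "a", "magazine"]),
]


def similarity(a, b):
    """Stand-in similarity over token sequences, in the range 0.0 to 1.0."""
    return SequenceMatcher(a=a, b=b, autojunk=False).ratio()


def search(query, side, threshold=0.6):
    """Indices of stored pairs whose sentence on `side` resembles the query."""
    scored = [(similarity(query, pair[side]), i) for i, pair in enumerate(PAIRS)]
    return [i for score, i in sorted(scored, reverse=True) if score >= threshold]


def two_stage_search(query):
    """First-language search, then a second-language search seeded by its hits."""
    first_hits = search(query, side=0)
    second_hits = []
    for i in first_hits:
        # The stored translation of each hit becomes a new search input sentence.
        for j in search(PAIRS[i][1], side=1):
            if j not in second_hits:
                second_hits.append(j)
    return first_hits, second_hits


query = ["私", "は", "ペン", "を", "買う"]
print(two_stage_search(query))  # ([0], [0, 2])
```

With this data the first-language search finds only pair 0, because pair 2 uses a different first-language expression. The second-language search, seeded with the translation of pair 0, also reaches pair 2, whose translation is close to it.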

11. A document search method for retrieving a search target sentence determined to be similar as a whole sentence by applying a search method that calculates the similarity between a search input sentence and a search target sentence for each fixed sentence constituent unit and obtains the combination of the most similar sentence constituent units, the method using parallel translation storage means in which a sentence in a first language and a sentence in a second language that is its translation are stored in association with each other, and comprising: searching, for a search input sentence in the first language, the set of sentences stored in the storage means by the search method for the first language; then, using the second-language sentence corresponding to the sentence thus retrieved as an input sentence, searching the set of sentences stored in the storage means by the search method for the second language; and giving priority, as search results, to those results in which the second-language portion corresponding to the sentence constituent unit determined to be similar in the first-language search is also determined to be similar in the second-language search.

12. A document search device comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation, associated with each other as a pair; first similarity calculation means for calculating the similarity between a search input sentence in the first language and a search target sentence for each fixed sentence constituent unit; second similarity calculation means for calculating the similarity between a search input sentence in the second language and a search target sentence for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation result of the first or second similarity calculation means, the combination of sentence constituent units that are most similar for the first language or the second language; first search means for performing, when a search input sentence in the first language is input from the input means, a search of the set of sentences stored in the storage means by applying the first similarity calculation means and the optimum combination calculation means for the first language to the search input sentence; second search means for performing, with the second-language sentence that is the parallel translation of the search target sentence retrieved by the first search means as a new search input sentence, a search of the set of sentences stored in the storage means by applying the second similarity calculation means and the optimum combination calculation means for the second language to that search input sentence; and similar portion determination means for determining whether the second-language portion corresponding to the sentence constituent unit determined to be similar by the first search means is also determined to be similar by the second search means, wherein results determined to be similar by the similar portion determination means are given priority as search results.

13. A document search method for retrieving a search target sentence determined to be similar as a whole sentence by applying a search method that calculates the similarity between a search input sentence and a search target sentence for each fixed sentence constituent unit and obtains the combination of the most similar sentence constituent units, the method using parallel translation storage means in which a sentence in a first language and a sentence in a second language that is its translation are stored in association with each other, and comprising: searching, for a search input sentence in the first language, the set of sentences stored in the storage means by the search method for the first language; then, using the second-language sentence corresponding to the sentence thus retrieved as an input sentence, searching the set of sentences stored in the storage means by the search method for the second language; and, among the results of the second-language search, giving priority as search results to those that are also included in the results of the first-language search.

14. A document search device comprising: input means for inputting a character string; parallel translation storage means for storing a search target sentence in a first language and a sentence in a second language that is its parallel translation, associated with each other as a pair; first similarity calculation means for calculating the similarity between a search input sentence in the first language and a search target sentence for each fixed sentence constituent unit; second similarity calculation means for calculating the similarity between a search input sentence in the second language and a search target sentence for each fixed sentence constituent unit; optimum combination calculation means for obtaining, based on the calculation result of the first or second similarity calculation means, the combination of sentence constituent units that are most similar for the first language or the second language; first search means for performing, when a search input sentence in the first language is input from the input means, a search of the set of sentences stored in the storage means by applying the first similarity calculation means and the optimum combination calculation means for the first language to the search input sentence; second search means for performing, with the second-language sentence that is the parallel translation of the search target sentence retrieved by the first search means as a new search input sentence, a search of the set of sentences stored in the storage means by applying the second similarity calculation means and the optimum combination calculation means for the second language to that search input sentence; and search result determination means for determining whether a search result obtained by the second search means is included in the search results obtained by the first search means, wherein search results determined by the search result determination means to be so included are given priority as search results.
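The prioritization recited in claims 13 and 14 amounts to a stable reordering of the second-language results. The sketch below assumes that results are identified by hypothetical integer sentence IDs and that "priority" means being listed first; the function name `prioritize` is likewise an assumption.

```python
def prioritize(second_results, first_results):
    """List second-search results also found by the first search ahead of the rest."""
    first = set(first_results)
    preferred = [r for r in second_results if r in first]
    others = [r for r in second_results if r not in first]
    # Relative order within each group is preserved.
    return preferred + others


# Hypothetical sentence IDs returned by the two searches.
print(prioritize([2, 0, 5], [0, 1]))  # [0, 2, 5]
```

Sentence 0 was retrieved by both searches, so it moves to the front, while the remaining second-language results keep their original order.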