JP5998779B2

JP5998779B2 - SEARCH DEVICE, SEARCH METHOD, AND PROGRAM

Info

Publication number: JP5998779B2
Application number: JP2012201209A
Authority: JP
Inventors: 昌剛角谷; 友樹長瀬; 富士　秀; 秀富士; 育昌鄭
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2012-09-13
Filing date: 2012-09-13
Publication date: 2016-09-28
Anticipated expiration: 2032-09-13
Also published as: JP2014056457A

Description

The embodiments discussed herein relate to information retrieval techniques.

Fuzzy search is one known information retrieval technique (see, for example, Non-Patent Documents 1, 2, and 3). A fuzzy search lets matching succeed even when the text being searched does not completely match the specified search string pattern, and it is widely used in text search systems.

For example, suppose a fuzzy search is performed on a text database storing millions of example sentences, using the search keyword 「らくらくフォンで写真を撮りたい」 ("I want to take a photo with a Raku-Raku Phone"). The search returns results such as 「らくらくホンで写真を撮る方法」 ("How to take a photo with a Raku-Raku Phone"), 「写真撮影時の注意点」 ("Points to note when taking photos"), and 「写真をメールで送りたいとき」 ("When you want to send a photo by e-mail"), in descending order of similarity to the search keyword. In this way, a fuzzy search also returns text whose character strings do not match exactly: the result 『らくらくホン』 for the search string 『らくらくフォン』 (a spelling variant), or the results 『撮る』 ("take") and 『撮影』 ("photography") for the search string 『撮りたい』 ("want to take").
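The ranking behaviour described above can be sketched in a few lines. This is only an illustration: it uses Python's standard `difflib.SequenceMatcher` as the similarity measure, which is an assumption and not the bit-vector algorithm of the cited non-patent documents; the function names `similarity` and `fuzzy_search` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_search(query: str, texts: list, top_n: int = 3) -> list:
    """Return up to top_n texts, most similar to the query first."""
    ranked = sorted(texts, key=lambda t: similarity(query, t), reverse=True)
    return ranked[:top_n]


# Example sentences taken from the description above.
query = "らくらくフォンで写真を撮りたい"
texts = [
    "写真撮影時の注意点",
    "写真をメールで送りたいとき",
    "らくらくホンで写真を撮る方法",
]
for text in fuzzy_search(query, texts):
    print(f"{similarity(query, text):.2f}  {text}")
```

Because the score is computed on characters, 『らくらくホン』 still matches 『らくらくフォン』 closely, but nothing in this measure knows that two differently written words mean the same thing.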

Machine translation techniques, in which a machine such as a computer translates an original sentence in a first language into a sentence in a second language, are also known. One such technique displays the portions where the sentence to be translated differs from the source sentence of a translation example in association with the corresponding words in the example's translation, distinguished from the other words, so that the parts of the example translation that correspond to the differences can be edited easily (see, for example, Patent Document 1). Another technique improves translation accuracy by giving a machine translation device a function for automatically acquiring translation knowledge, so that a first-language document search device can be used from the second language (see, for example, Patent Document 2). In yet another technique, a range related to the desired information is specified in document information written in one language, and, based on the machine translation result of the document in the specified range, an information search expression for finding documents similar to that document is generated and used to search documents in another language (see, for example, Patent Document 3).

Patent Document 1: JP 2009-116584 A
Patent Document 2: JP-A-11-203287
Patent Document 3: JP 2005-11138 A

Non-Patent Document 1: Gene Myers, "A fast bit-vector algorithm for approximate string matching based on dynamic programming", Journal of the ACM, Association for Computing Machinery, May 1999, Vol. 46, No. 3, pp. 395-415
Non-Patent Document 2: Gonzalo Navarro, "A guided tour to approximate string matching", ACM Computing Surveys, Association for Computing Machinery, March 2001, Vol. 33, No. 1, pp. 31-88
Non-Patent Document 3: Gonzalo Navarro and three others, "Indexing Methods for Approximate String Matching", Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society, December 2001, Vol. 24, No. 4, pp. 19-27

Because the fuzzy search technique described above merely matches surface character strings, it is known to be weak against word-level orthographic variation.
For example, consider a fuzzy search using the query sentence 「日本で人気のあるリンゴの品種」 ("apple varieties popular in Japan"). The word 『リンゴ』 ("apple"), written in katakana in this query, is also widely written in hiragana as 『りんご』 and in kanji as 『林檎』. Likewise, the synonym 『ポピュラー』 (a loanword meaning "popular") is widely used in place of the character string 『人気のある』 ("popular"). A fuzzy search cannot simply match 『リンゴ』, 『りんご』, and 『林檎』 with one another, or 『人気のある』 with 『ポピュラー』, against the example sentences in a text database. To make such matching possible, processing such as expanding the search key with a synonym dictionary before the search is performed.
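The synonym-dictionary workaround mentioned above can be sketched as follows. The dictionary contents and the function name `expand_query` are hypothetical, and the adjectival form 『ポピュラーな』 is an assumption made so that the expanded sentence stays grammatical; a real system would use a much larger dictionary and proper word segmentation.

```python
from itertools import product

# Hypothetical synonym dictionary: each key maps to all accepted notations.
SYNONYMS = {
    "リンゴ": ["リンゴ", "りんご", "林檎"],
    "人気のある": ["人気のある", "ポピュラーな"],
}


def expand_query(query: str, synonyms: dict) -> list:
    """Return every variant of the query obtained by swapping in synonyms."""
    hits = [key for key in synonyms if key in query]
    variants = []
    for combo in product(*(synonyms[key] for key in hits)):
        variant = query
        for key, replacement in zip(hits, combo):
            variant = variant.replace(key, replacement)
        variants.append(variant)
    return variants


for variant in expand_query("日本で人気のあるリンゴの品種", SYNONYMS):
    print(variant)
```

The number of search keys grows multiplicatively with each expandable word (3 x 2 = 6 here), which is one reason expansion alone scales poorly.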

In addition, when the fuzzy search technique is applied to a language with relatively free word order, such as Japanese or Korean, search accuracy may drop because of word-order variation.
Consider again a fuzzy search using the query sentence 「日本で人気のあるリンゴの品種」 ("apple varieties popular in Japan"). If the only search criterion is whether character strings are similar, the search cannot recognize that a sentence with its words reordered has the same meaning. Therefore, even if the text database contains an example sentence such as 「日本でリンゴの品種は○○が人気です」 ("In Japan, the popular apple variety is XX"), it may be judged to have low similarity to the query sentence and sink to a low rank in the search results. That is, an example sentence such as 「日本ではリンゴの品種はサンふじが一番人気があります。」 ("In Japan, Sun Fuji is the most popular apple variety.") may end up low in the results. Meanwhile, example sentences such as 「日本で人気のあるリンゴの品書きはアップルパイです。」 ("The popular apple menu item in Japan is apple pie."), 「日本人はリンゴの種類にこだわります。」 ("Japanese people are particular about the kind of apple."), and 「日本で人気のあるプリンを３種類紹介します。」 ("We introduce three kinds of pudding popular in Japan.") end up at the top. Furthermore, if the word-level orthographic variation described above is not taken into account either, an example sentence such as 「林檎の種類で最もポピュラーなものは、日本では、ふじ、つがる、ジョナゴールド等です。」 ("The most popular kinds of apple in Japan are Fuji, Tsugaru, Jonagold, and so on."), which by meaning should be the top search result, may sink to a low rank.

In view of the problems described above, the search device described later in this specification suppresses the drop in fuzzy search accuracy caused by variation in word notation and word order, which some languages widely tolerate.

One of the search devices described later in this specification includes a bilingual format database, a machine translation unit, a determination unit, and a fuzzy search unit. The bilingual format database stores, in association with one another, a text in a first language that is a search target, a translation of that text in a second language, and question type information indicating the type of question answered by a phrase contained in the text. The machine translation unit machine-translates an input query sentence in the first language into the second language. The determination unit determines the type of question expressed by the query sentence. The fuzzy search unit performs a fuzzy search on the translations stored in the bilingual format database, using as search keys the query sentence translated into the second language and the question type determined by the determination unit. The fuzzy search unit then extracts from the bilingual format database the first-language text associated with the translation found by the fuzzy search, and outputs it.

The search device described later in this specification has the effect of suppressing the drop in fuzzy search accuracy caused by variation in word notation and word order, which some languages widely tolerate.

A functional configuration diagram of an embodiment of the search device.
A diagram illustrating the machine translation procedure of the machine translation unit.
A diagram illustrating an example of the conceptual structure of an original sentence generated by the semantic analysis unit.
An explanatory diagram of the method of generating meta information for an original sentence.
A diagram illustrating the data structure of the bilingual format database.
A diagram illustrating an example of creating the conceptual structure of a query sentence.
An explanatory diagram of the method of generating meta information for a query sentence.
An explanatory diagram of search key creation by the search key creation unit.
A specific example of a fuzzy search by the fuzzy search unit.
A diagram showing an example hardware configuration of an embodiment of the search device.
A flowchart showing the first example of the bilingual format DB creation process.
A flowchart showing the first example of the search process.
A flowchart (part 1) showing the second example of the bilingual format DB creation process.
A flowchart (part 2) showing the second example of the bilingual format DB creation process.
A flowchart (part 1) showing the second example of the search process.
A flowchart (part 2) showing the second example of the search process.
A flowchart (part 3) showing the second example of the search process.

First, FIG. 1 will be described. FIG. 1 is a functional configuration diagram of an embodiment of the search device 1.
When a query sentence 2 expressed in the first language is input, the search device 1 of FIG. 1 performs a fuzzy search on a database using the query sentence 2 as a search key, and outputs text expressed in the first language as a search result 3.

The search device 1 of FIG. 1 first includes an original text database 10, a bilingual format database 11, a search key creation unit 13, and a fuzzy search unit 14.
The original text database 10 stores a large number of texts expressed in the first language (hereinafter, "original sentences"), which are the targets of the fuzzy search performed by the search device 1. In the following description, the original text database 10 is referred to as the "original text DB 10".

The bilingual format database 11 stores each of the many original sentences held in the original text DB 10 in association with its translation in the second language. In the following description, the bilingual format database 11 is referred to as the "bilingual format DB 11".
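The association held by the bilingual format DB 11 can be pictured as a collection of sentence pairs. The sketch below is only an in-memory illustration; the class and method names (`BilingualRecord`, `BilingualDB`, `originals_for`) are hypothetical and say nothing about how the actual database is implemented.

```python
from dataclasses import dataclass


@dataclass
class BilingualRecord:
    original: str     # original sentence in the first language
    translation: str  # its translation in the second language


class BilingualDB:
    """Toy stand-in for the bilingual format DB: a list of sentence pairs."""

    def __init__(self):
        self._records = []

    def add(self, original: str, translation: str) -> None:
        self._records.append(BilingualRecord(original, translation))

    def translations(self) -> list:
        """All stored translations; these are what the fuzzy search scans."""
        return [r.translation for r in self._records]

    def originals_for(self, translation: str) -> list:
        """Map a matched translation back to its original sentence(s)."""
        return [r.original for r in self._records if r.translation == translation]


db = BilingualDB()
db.add("太郎が学校で林檎を食べた。", "Taro ate an apple at school.")
db.add("学校で太郎がリンゴを食べた。", "Taro ate an apple at school.")
print(db.originals_for("Taro ate an apple at school."))
```

Several original sentences can share one translation, which is why the lookup returns a list.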

The machine translation unit 12 machine-translates the query sentence 2, which is input to the search device 1 and expressed in the first language, into the second language. In this embodiment, the translations stored in the bilingual format DB 11, that is, the second-language translations of the original sentences, are also created by machine translation by the machine translation unit 12.

The search key creation unit 13 creates the search key for the fuzzy search performed by the fuzzy search unit 14 from the query sentence 2 that the machine translation unit 12 has machine-translated into the second language.
The fuzzy search unit 14 performs a fuzzy search on the translations of the original sentences stored in the bilingual format DB 11, using the search key created by the search key creation unit 13. It then extracts from the bilingual format DB 11 the original sentence associated with the translation obtained as a result of the fuzzy search, and outputs it as the search result 3.

In the search device 1 configured as described above, a fuzzy search is performed on the second-language translations of the original sentences to be searched, using the second-language translation of the query sentence 2 as the search key.

The following description takes as an example the case where the first language is Japanese and the second language is English.
Suppose the query sentence 2 is 「太郎が学校でリンゴを食べた。」 ("Taro ate an apple at school.", with "apple" written in katakana) and it is used as the search key in a fuzzy search for the original sentence 「太郎が学校で林檎を食べた。」 (the same sentence with "apple" written in kanji). A conventional fuzzy search matches character strings between the Japanese sentences, so 『リンゴ』 in the search key and 『林檎』 in the original sentence are judged to be different words. As a result, the original sentence may be ranked lower even though the two sentences mean exactly the same thing. However, the English translation of the query sentence 2 and the English translation of the original sentence are exactly the same sentence: "Taro ate an apple at school." Consequently, in the ranking of the search result 3 produced by the search device 1, the original sentence 「太郎が学校で林檎を食べた。」 takes first place, which is the most appropriate result.

As another example, suppose the query sentence 2 is 「太郎が学校でリンゴを食べた。」 ("Taro ate an apple at school.") and it is used as the search key in a fuzzy search for the original sentence 「学校で太郎がリンゴを食べた。」 (the same sentence with the phrases for "Taro" and "at school" swapped). A conventional fuzzy search matches character strings between the Japanese sentences, but because the order of 『太郎が』 ("Taro", as subject) and 『学校で』 ("at school") differs between the query sentence 2 and the original sentence, the original sentence may be ranked lower even though the two sentences mean exactly the same thing. However, the English translation of the query sentence 2 and the English translation of the original sentence are exactly the same sentence: "Taro ate an apple at school." Consequently, in the ranking of the search result 3 produced by the search device 1, the original sentence 「学校で太郎がリンゴを食べた。」 takes first place, which is the most appropriate result.

As described above, a fuzzy search using the search device 1 absorbs variation in first-language word notation and word order in the original sentences and the query sentence 2, so the drop in search accuracy caused by such variation is suppressed.
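The two examples above can be reproduced with a toy pipeline. Everything here is a sketch under stated assumptions: the lookup table `STUB_TRANSLATIONS` stands in for the machine translation unit 12 (a real system would call a translation engine), the 「花子」 sentence is an invented distractor, and `difflib.SequenceMatcher` is an assumed similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the machine translation unit 12.
STUB_TRANSLATIONS = {
    "太郎が学校でリンゴを食べた。": "Taro ate an apple at school.",
    "太郎が学校で林檎を食べた。": "Taro ate an apple at school.",
    "学校で太郎がリンゴを食べた。": "Taro ate an apple at school.",
    "花子が家でミカンを食べた。": "Hanako ate a mandarin orange at home.",
}


def translate(text: str) -> str:
    return STUB_TRANSLATIONS[text]


def search(query: str, originals: list) -> list:
    """Rank original sentences by similarity between translations."""
    key = translate(query)  # search key in the second language
    pairs = [(translate(o), o) for o in originals]  # bilingual records
    ranked = sorted(
        pairs,
        key=lambda pair: SequenceMatcher(None, key, pair[0]).ratio(),
        reverse=True,
    )
    return [original for _, original in ranked]


results = search(
    "太郎が学校でリンゴを食べた。",
    ["花子が家でミカンを食べた。", "太郎が学校で林檎を食べた。", "学校で太郎がリンゴを食べた。"],
)
for original in results:
    print(original)
```

Both the kanji variant and the reordered variant translate to the same English sentence as the query, so they tie at the top regardless of how different they look in Japanese.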

The search device 1 of FIG. 1 further includes a query sentence question type determination unit 21 and a query sentence question type determination pattern 22.
The query sentence question type determination unit 21 uses the query sentence question type determination pattern 22 to determine the type of question expressed by the query sentence 2. The query sentence question type determination pattern 22 will be described later.

Here, the bilingual format DB 11 additionally stores question type information, which indicates the type of question answered by a phrase contained in an original sentence, in association with the translation of that original sentence. The search key creation unit 13 also creates a search key for the fuzzy search performed by the fuzzy search unit 14 from the question type determined by the query sentence question type determination unit 21. The fuzzy search unit 14 then performs the fuzzy search using, as search keys, not only the query sentence 2 translated into the second language but also the question type determined by the query sentence question type determination unit 21. It then extracts from the bilingual format DB 11 the original sentence associated with the translation obtained as a result of the fuzzy search, and outputs it as the search result 3.

With the configuration described above, the search device 1 can search for answers to so-called 5W1H questions. A "5W1H question" is a question asking about the subject (Who), the object (What), the time (When), the place (Where), the reason (Why), or the method (How).

Because a conventional fuzzy search works by matching character strings, it sometimes cannot handle a search for the answer to such a 5W1H question.
For example, suppose a user wants to know when rice is planted in Sendai, simply enters 「稲を仙台でいつ植えますか？」 ("When do you plant rice in Sendai?") as the query sentence 2, and searches the original sentences using this query sentence 2 as the search key. In this case, a conventional fuzzy search ranks at the top original sentences that contain the character string 『いつ』 ("when"), such as 「稲を仙台でいつ植えるかが問題だ。」 ("When to plant rice in Sendai is the problem."), 「稲を仙台ではいつ植えるのか調査する必要がある。」 ("It is necessary to investigate when rice is planted in Sendai."), and 「仙台でいつ稲を植えるだろう。」 ("I wonder when rice will be planted in Sendai."). On the other hand, the information the user actually wants, for example the original sentence 「仙台では６月に稲を植える。」 ("In Sendai, rice is planted in June."), which contains the character string 『６月』 ("June"), may sink to a low rank because its character string does not match.

In contrast, the search device 1 configured as described above performs the fuzzy search on translations that are associated with information indicating the type of question answered by a phrase of the original sentence, using as search keys the query sentence 2 translated into the second language and the question type of the query sentence 2. In the example above, the English translation "The rice plant is planted in Sendai in June." of the original sentence 「仙台では６月に稲を植える。」 is stored in the bilingual format DB 11 in association with the information 『When』, which indicates the type of question answered by the phrase 『６月』 ("June") of the original sentence. When the query sentence 2 is 「稲を仙台でいつ植えますか？」, the query sentence question type determination unit 21 determines that the type of question expressed by the query sentence 2 is 『When』. As a result, the fuzzy search is performed using as search keys the English translation of the query sentence 2, "When is the rice plant planted in Sendai?", and the question type 『When』 expressed by the query sentence 2. Therefore, an original sentence whose English translation is similar to that of the query sentence 2 and whose question type also matches can be expected to appear high in the ranking of the search result 3. In other words, original sentences containing the desired information are obtained as the search result 3 with higher accuracy.

The search device 1 of FIG. 1 also includes a semantic analysis unit 31 and a meta information generation unit 32.
The semantic analysis unit 31 performs semantic analysis of the original sentences stored in the original text DB 10.
In the configuration of FIG. 1, the semantic analysis function of the machine translation unit 12 is used as the semantic analysis unit 31. That is, in the configuration of FIG. 1, the machine translation unit 12 machine-translates a sentence expressed in the first language into the second language by generating a second-language sentence that corresponds to the result of the semantic analysis of that sentence by the semantic analysis unit 31. Alternatively, the semantic analysis unit 31 may be provided in the search device 1 separately from the machine translation unit 12.

The meta information generation unit 32 includes an original sentence question type determination unit 33 and a storage processing unit 35.
Based on the result of the semantic analysis of an original sentence by the semantic analysis unit 31, and using an original sentence question type determination pattern 34, the original sentence question type determination unit 33 generates question type information for the phrases contained in the translation of that original sentence. The original sentence question type determination pattern 34 will be described later.

The storage processing unit 35 stores the question type information generated by the original sentence question type determination unit 33 in the bilingual format DB 11 in association with the translation of the original sentence.
Together, the semantic analysis unit 31 and the meta information generation unit 32 store in the bilingual format DB 11 the question type information for the phrases contained in the original sentences. The fuzzy search unit 14 performs the fuzzy search on the translations of the original sentences and on the question type information associated with those translations, as stored in the bilingual format DB 11 in this way.

In this embodiment, the original sentence question type determination unit 33 generates meta information that contains the question type information described above together with a phrase from the translation of the original sentence that answers a question of the type indicated by that question type information. The storage processing unit 35 then adds the meta information generated by the original sentence question type determination unit 33 to the translation of the original sentence and stores the result in the bilingual format DB 11.

For example, suppose the original sentence is 「仙台では６月に稲を植えます。」 ("In Sendai, rice is planted in June."). When the semantic analysis unit 31 performs semantic analysis on this original sentence, the conceptual structure of the sentence is obtained. From this conceptual structure it can be seen, first, that the part 『仙台では〜植えます。』 ("in Sendai ... is planted") contains place information about planting. In this case, the original sentence question type determination unit 33 takes 『PLANT』, the translation of 『植える』 ("to plant"), as the central independent word and combines it with 『@Where』, which indicates that place information is contained, to generate the meta information 『@Where PLANT』. The same conceptual structure also shows that the part 『６月に〜植えます。』 ("in June ... is planted") contains time information about planting. The original sentence question type determination unit 33 therefore also generates the meta information 『@When PLANT』, combining the central independent word 『PLANT』 with 『@When』, which indicates that time information is contained. The storage processing unit 35 adds the generated meta information 『@Where PLANT』 and 『@When PLANT』 to the English translation of the original sentence and stores the result, "@Where PLANT @When PLANT The rice plant is planted in Sendai in June.", in the bilingual format DB 11. The fuzzy search unit 14 performs the fuzzy search on the translations stored in the bilingual format DB 11 with meta information added in this way.
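The meta information step for this example can be sketched as below. The dictionary used for the conceptual structure and the role-to-tag table `ROLE_TO_TAG` are hypothetical simplifications: the actual conceptual structure and the original sentence question type determination pattern 34 are described elsewhere in the specification, and only the two tags that appear in this example are modelled.

```python
# Hypothetical, simplified stand-in for the determination pattern 34.
ROLE_TO_TAG = {"place": "@Where", "time": "@When"}


def make_meta(concept: dict) -> list:
    """Build one 'tag HEAD' item per role found in the conceptual structure."""
    head = concept["head"]  # central independent word, e.g. "PLANT"
    return [
        f"{ROLE_TO_TAG[role]} {head}"
        for role in concept["roles"]
        if role in ROLE_TO_TAG
    ]


def annotate(translation: str, concept: dict) -> str:
    """Prefix the translation with its meta information before storing it."""
    return " ".join(make_meta(concept) + [translation])


# Assumed toy conceptual structure for 「仙台では６月に稲を植えます。」
concept = {"head": "PLANT", "roles": {"place": "Sendai", "time": "June"}}
stored = annotate("The rice plant is planted in Sendai in June.", concept)
print(stored)
```

Storing the tags as a plain-text prefix means the same string-based fuzzy search can score them together with the translation, with no separate index for question types.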

In this embodiment, the semantic analysis unit 31 also performs semantic analysis of the query sentence 2 input to the search device 1. Accordingly, in the configuration of FIG. 1, in which the semantic analysis function of the machine translation unit 12 is used as the semantic analysis unit 31, the machine translation unit 12 machine-translates the query sentence 2 into the second language by generating a second-language sentence that corresponds to the result of the semantic analysis of the query sentence 2 by the semantic analysis unit 31. The query sentence question type determination unit 21 determines the type of question expressed by the query sentence 2 based on the result of that semantic analysis and by using the query sentence question type determination pattern 22.

例えば、「稲を仙台でいつ植えますか？」というクエリ文２が検索装置１に入力された場合を想定する。このクエリ文２に対して意味解析部３１が意味解析を行うと、クエリ文２の概念構造が得られる。ここで、クエリ文用質問種別判別部２１は、特に疑問詞に注目して、クエリ文２が５Ｗ１Ｈ型の質問のうちのいずれの質問の種別のものであるかを判別する。本例の場合、クエリ文２における『いつ植えますか？』の部分から、植えることについての時期情報についての質問であることが分かる。この場合、クエリ文用質問種別判別部２１は、『植える』の訳語『PLANT』を中心自立語とし、質問の種別が時期情報の質問であることを表す情報『@When』を組み合わせたメタ情報『@When PLANT』を生成する。検索キー作成部１３は、生成されたメタ情報を、機械翻訳部１２により翻訳されたクエリ文２の英訳文に付加した「@When PLANT When is the rice plant planted in Sendai?」を検索キーとして作成する。曖昧検索部１４は、対訳形式ＤＢ１１に格納されている原文の翻訳文に対し、このようにして作成された検索キーを用いて曖昧検索を行う。 For example, assume that the query sentence 2 “稲を仙台でいつ植えますか？” (“When is rice planted in Sendai?”) is input to the search device 1. When the semantic analysis unit 31 performs semantic analysis on this query sentence 2, the conceptual structure of the query sentence 2 is obtained. The query sentence question type determination unit 21 then determines which type of 5W1H question the query sentence 2 belongs to, paying particular attention to interrogatives. In this example, the portion “When is ... planted?” of the query sentence 2 shows that the question asks for time information about planting. In this case, the query sentence question type determination unit 21 takes “PLANT”, the translation of 『植える』 (“to plant”), as the central independent word and combines it with the information “@When”, which indicates that the question asks for time information, to generate the meta information “@When PLANT”. The search key creation unit 13 creates, as the search key, “@When PLANT When is the rice plant planted in Sendai?”, which is the English translation of the query sentence 2 produced by the machine translation unit 12 with the generated meta information added. The fuzzy search unit 14 uses the search key created in this way to perform a fuzzy search on the translations of the original sentences stored in the bilingual format DB 11.

このように、検索装置１は、上述のようにして作成されたメタ情報とクエリ文２の翻訳文とを検索キーとして用いて対訳形式ＤＢ１１に対する曖昧検索を行う。従って、クエリ文２と原文との翻訳文同士が似ていることに加えて、両者に付加されたメタ情報がマッチしている原文が、検索結果３の順位の上位のものとして得られることが期待できる。従って、知りたい情報が含まれている原文がより精度良く検索結果３として得られることになる。 In this way, the search device 1 performs a fuzzy search on the bilingual format DB 11 using, as the search key, the meta information created as described above together with the translation of the query sentence 2. Accordingly, an original sentence whose translation is similar to the translation of the query sentence 2, and whose attached meta information also matches that of the query, can be expected to rank high in the search result 3. As a result, original sentences containing the desired information are obtained as the search result 3 with higher accuracy.

図１の検索装置１は、以上のように構成されている。
次に、メタ情報生成部３２によるメタ情報の生成及び対訳形式ＤＢ１１への格納の動作について、更に詳しく説明する。 The search device 1 in FIG. 1 is configured as described above.
Next, the operation of generating meta information and storing it in the bilingual form DB 11 by the meta information generating unit 32 will be described in more detail.

まず図２について説明する。図２は、機械翻訳部１２による機械翻訳の手順を図解したものである。
機械翻訳部１２による機械翻訳では、順に、Ｓ１：形態素解析、Ｓ２：構文解析、Ｓ３：意味解析、Ｓ４：文生成の各処理が行われる。なお、この機械翻訳の手法自体は、広く知られた手法である。 First, FIG. 2 will be described. FIG. 2 illustrates a machine translation procedure performed by the machine translation unit 12.
In the machine translation by the machine translation unit 12, each process of S1: morphological analysis, S2: syntax analysis, S3: semantic analysis, and S4: sentence generation is sequentially performed. The machine translation method itself is a widely known method.

Ｓ１の形態素解析処理は、第一言語で表現されている翻訳対象の原文を形態素に分割して各形態素の品詞を判別する処理である。この形態素解析処理では、原文に使用される語と当該語の品詞と当該語についての第二言語での訳語とを対応付けたリストである翻訳辞書４０が利用される。なお、図１の構成においては、翻訳辞書４０は機械翻訳部１２自身が備えているものとしている。 The morpheme analysis process of S1 is a process of dividing the original text to be translated expressed in the first language into morphemes and determining the part of speech of each morpheme. In this morphological analysis process, a translation dictionary 40, which is a list in which words used in the original sentence, parts of speech of the words, and translations of the words in the second language are associated with each other, is used. In the configuration of FIG. 1, the translation dictionary 40 is provided in the machine translation unit 12 itself.

Ｓ２の構文解析処理は、形態素解析処理によって得られた形態素とその品詞の判別結果とを用いて、原文の構文構造を解析することによって、原文を構成している語句間の文法的な関係（文節間の係り受け構造）を表している構文木を作成する。 The syntax analysis process of S2 analyzes the syntactic structure of the original sentence using the morphemes obtained by the morphological analysis process and the part-of-speech determination results, and thereby creates a syntax tree representing the grammatical relationships between the phrases constituting the original sentence (the dependency structure between clauses).

Ｓ３の意味解析処理は、構文解析処理によって得られた原文の構文木に対して意味解析を行うことによって、原文の概念構造を生成する。本実施例では、原文の概念構造は、原文を構成している語句を表しているノード（node）と各ノード間を繋ぐことによって語句同士の概念の関係を表しているアーク（arc ）とを用いた木構造によって表現される。 The semantic analysis process of S3 generates a conceptual structure of the original sentence by performing a semantic analysis on the original sentence syntax tree obtained by the syntactic analysis process. In the present embodiment, the conceptual structure of the original text includes a node (node) representing the word constituting the original text and an arc (arc) representing the conceptual relationship between the words by connecting each node. It is expressed by the tree structure used.
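
A conceptual structure of this kind can be held in a small node-and-arc container. The following Python sketch is illustrative only: the class names, the relation labels, and the choice of which word sits at the root side of each arc are assumptions made for the example, not details taken from the embodiment. It encodes the earlier example sentence about planting rice in Sendai in June.

```python
from dataclasses import dataclass, field


@dataclass
class Arc:
    relation: str   # conceptual relation carried by the arc (labels are assumed)
    head: str       # word at the root side of the arrow
    dependent: str  # word at the tip of the arrow


@dataclass
class ConceptStructure:
    nodes: set = field(default_factory=set)
    arcs: list = field(default_factory=list)

    def add_arc(self, relation, head, dependent):
        """Register both words as nodes and connect them with a labelled arc."""
        self.nodes.update((head, dependent))
        self.arcs.append(Arc(relation, head, dependent))

    def arcs_with(self, relation):
        """Return every arc whose relation label equals `relation`."""
        return [a for a in self.arcs if a.relation == relation]


# Hypothetical encoding of "In Sendai, rice is planted in June."
cs = ConceptStructure()
cs.add_arc("place", "PLANT", "SENDAI")
cs.add_arc("time", "PLANT", "JUNE")
cs.add_arc("target", "PLANT", "RICE")

print([a.dependent for a in cs.arcs_with("place")])  # ['SENDAI']
```

Keeping the relation label on the arc rather than on the node is what lets later steps look up arcs by relation.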

Ｓ４の文生成処理は、原文を構成している語句に対応する訳語の候補を翻訳辞書４０から抽出し、意味解析処理によって生成された概念構造に基づき、抽出された候補から訳語を選択することによって、原文についての第二言語での翻訳文を生成する処理である。 In the sentence generation process of S4, candidate translations corresponding to the words constituting the original sentence are extracted from the translation dictionary 40, and a translation word is selected from the extracted candidates based on the conceptual structure generated by the semantic analysis process. Is a process of generating a translated sentence in the second language for the original sentence.

なお、図２には、一例として、原文「太郎は秋葉原で富士通のパソコンと携帯電話を買った。」から翻訳文「Taro bought the personal computer and the cellular phone of Fujitsu in Akihabara.」への機械翻訳の手順が図解されている。 As an example, FIG. 2 illustrates the procedure of machine translation from the original sentence “太郎は秋葉原で富士通のパソコンと携帯電話を買った。” to the translated sentence “Taro bought the personal computer and the cellular phone of Fujitsu in Akihabara.”

図１の検索装置１における意味解析部３１は、上述した機械翻訳部１２による機械翻訳の手順のうちの、Ｓ１：形態素解析、Ｓ２：構文解析、及びＳ３：意味解析の各処理を行う。 The semantic analysis unit 31 in the search device 1 of FIG. 1 performs the processes of S1: morphological analysis, S2: syntax analysis, and S3: semantic analysis in the machine translation procedure by the machine translation unit 12 described above.

ここで図３について説明する。図３は、意味解析部３１によって生成される原文の概念構造の一例を図解したものである。
図３の例は、提示した文章のうちの最初の段落の文「タイ中央部を流下するチャオプラ川はタイ第一の河川であり、下流域のチャオプラヤデルタはアジアモンスーン地域における米の一大産地である（図１）。」から意味解析部３１が作成した概念構造である。 Here, FIG. 3 will be described. FIG. 3 illustrates an example of the conceptual structure of the original text generated by the semantic analysis unit 31.
The example in FIG. 3 is the conceptual structure that the semantic analysis unit 31 created from the sentence in the first paragraph of the presented text: “The Chao Phra River, which flows down through central Thailand, is Thailand's foremost river, and the Chao Phraya Delta in its lower basin is a major rice-producing area in the Asian monsoon region (FIG. 1).”

メタ情報生成部３２の原文用質問種別判別部３３は、原文の概念構造におけるアークについての意味解析の結果のうち、質問の種別に予め対応付けられている所定のパターンに一致しているものについて、メタ情報を生成する。なお、このメタ情報の生成には、アークによって結ばれているノードが表している語句と質問の種別とが用いられる。このメタ情報の生成の手法について、図４を用いて説明する。 The original-sentence question type determination unit 33 of the meta information generation unit 32 generates meta information for those semantic analysis results of arcs in the conceptual structure of the original sentence that match a predetermined pattern associated in advance with a question type. The meta information is generated from the phrases represented by the nodes that the arc connects and from the question type. This method of generating the meta information will be described with reference to FIG. 4.

図４に図解されている概念構造は、図３の例におけるものと同一のものである。また、図４に図解されている抽出パターンａ、ｂ、ｃ、…、ｎは、いずれも、図１における原文用質問種別判別パターン３４である。原文用質問種別判別パターン３４は、原文の概念構造におけるアークについての意味解析の結果と質問の種別とを対応付けたパターンである。 The conceptual structure illustrated in FIG. 4 is the same as that in the example of FIG. Further, all of the extraction patterns a, b, c,..., N illustrated in FIG. 4 are the original text question type determination pattern 34 in FIG. The original question type discriminating pattern 34 is a pattern in which the result of the semantic analysis on the arc in the conceptual structure of the original sentence is associated with the question type.

例えば、図４における「抽出パターンａ」は、原文の概念構造におけるアークについての意味解析の結果が『場所』である場合に、５Ｗ１Ｈ型の質問の種別のうちで場所を問う質問を表している『Where』が対応付けられていることを表している。つまり、この抽出パターンは、このようなアークが原文の概念構造に存在する場合にメタ情報『@Where』を当該原文の翻訳文に付加することを表している。 For example, “extraction pattern a” in FIG. 4 represents a question that asks for a place among the types of 5W1H type questions when the result of the semantic analysis of the arc in the conceptual structure of the original text is “place”. It indicates that “Where” is associated. That is, this extraction pattern indicates that meta information “@Where” is added to the translation of the original text when such an arc exists in the conceptual structure of the original text.

また、例えば、図４における「抽出パターンｃ」は、原文の概念構造におけるアークについての意味解析の結果が『対象』である場合に、５Ｗ１Ｈ型の質問の種別のうちで対象を問う質問を表している『What』が対応付けられていることを表している。つまり、この抽出パターンは、このようなアークが原文の概念構造に存在する場合にメタ情報『@What』を当該原文の翻訳文に付加することを表している。 Further, for example, “extraction pattern c” in FIG. 4 represents a question asking the target among the types of 5W1H type questions when the result of the semantic analysis on the arc in the conceptual structure of the original text is “target”. "What" is associated. That is, this extraction pattern indicates that meta information “@What” is added to the translation of the original text when such an arc exists in the conceptual structure of the original text.

なお、この抽出パターンでは、このようなアークが原文の概念構造に存在する場合に、このアークを表している矢印の根元側のノードが表している語句の訳語を中心自立語としてメタ情報に含めて当該原文の翻訳文に付加することも表している。この中心自立語をメタ情報に含めて翻訳文に付加することによって、５Ｗ１Ｈ型の質問の種別だけでなく、質問の種別とそれに関係する語までもが一致する文が、曖昧検索部１４による曖昧検索により得られる検索結果３において、より順位が上昇することが期待できる。なお、中心自立語をメタ情報に含ませないようにしてもよい。 This extraction pattern also indicates that, when such an arc exists in the conceptual structure of the original sentence, the translation of the phrase represented by the node at the root side of the arrow representing the arc is included in the meta information as the central independent word and added to the translation of the original sentence. By including the central independent word in the meta information added to the translation, a sentence that matches not only the 5W1H question type but also the word related to that question type can be expected to rank higher in the search result 3 obtained by the fuzzy search of the fuzzy search unit 14. The central independent word may also be omitted from the meta information.

図４の例では、概念構造から、意味解析の結果が『対象』であるアークの一部として（Ａ）及び（Ｂ）のアークが検出され、意味解析の結果が『場所』であるアークの一部として（Ｃ）及び（Ｄ）のアークが検出されたものとしている。そして、この場合には、「抽出パターンａ」及び「抽出パターンｃ」に基づき、メタ情報として、中心自立語を含めた『@What THAILAND』、『@What FLOW』、『@Where DELTA』、『@Where FLOW』が生成されたことを表している。 In the example of FIG. 4, it is assumed that arcs (A) and (B) are detected in the conceptual structure as some of the arcs whose semantic analysis result is “target”, and that arcs (C) and (D) are detected as some of the arcs whose semantic analysis result is “place”. In this case, on the basis of “extraction pattern a” and “extraction pattern c”, the meta information “@What THAILAND”, “@What FLOW”, “@Where DELTA”, and “@Where FLOW”, each including a central independent word, is generated.
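
The mapping from arcs to meta information described for FIG. 4 can be sketched as a table lookup. In the Python sketch below, the function name, the tuple layout, and the assignment of head words to arcs (A) through (D) are assumptions made for illustration; only the two pattern entries (“place” to Where, “target” to What) and the four resulting tags come from the description.

```python
# Relation label -> 5W1H question type. Only patterns a and c are described
# in the text; further entries would be assumptions.
SOURCE_PATTERNS = {"place": "Where", "target": "What"}


def source_meta(arcs, include_head=True):
    """Build meta tags from (relation, head_word) pairs.

    head_word is the translated word at the root side of the arc, used as the
    central independent word. Arcs matching no pattern are skipped, and
    duplicate tags are emitted once.
    """
    tags = []
    for relation, head_word in arcs:
        qtype = SOURCE_PATTERNS.get(relation)
        if qtype is None:
            continue
        tag = f"@{qtype} {head_word}" if include_head else f"@{qtype}"
        if tag not in tags:
            tags.append(tag)
    return tags


# Assumed correspondence: arcs (A), (B) are "target"; (C), (D) are "place".
# The last arc is a hypothetical non-matching arc.
arcs = [
    ("target", "THAILAND"),
    ("target", "FLOW"),
    ("place", "DELTA"),
    ("place", "FLOW"),
    ("predicate", "RIVER"),
]

print(source_meta(arcs))
# ['@What THAILAND', '@What FLOW', '@Where DELTA', '@Where FLOW']
```

Calling `source_meta(arcs, include_head=False)` corresponds to the variant in which the central independent word is left out of the meta information.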

メタ情報生成部３２の格納処理部３５は、原文について以上のようにして生成されたメタ情報を、当該原文の翻訳文に付加して対訳形式ＤＢ１１に格納する。
ここで図５について説明する。図５は対訳形式ＤＢ１１のデータ構造を図解したものである。 The storage processing unit 35 of the meta information generating unit 32 adds the meta information generated as described above for the original sentence to the translated sentence of the original sentence and stores it in the parallel translation format DB 11.
Here, FIG. 5 will be described. FIG. 5 illustrates the data structure of the bilingual format DB 11.

対訳形式ＤＢ１１では、レコード毎に、各レコードを識別するために付与されるレコードＩＤと、第一言語（ここでは日本語）の原文と、第二言語の翻訳文（ここでは英訳）とが対応付けられて格納される。格納処理部３５は、各レコードに、「０」から始まるレコードＩＤを付与すると共に、原文テキストＤＢ１０から原文を１文ずつ読み出して当該原文を格納し、更に、当該原文についての翻訳文を格納する。なお、この翻訳文は、原文テキストＤＢ１０から読み出した原文を機械翻訳部１２が機械翻訳して作成したものであり、格納処理部３５は、この翻訳文に、原文用質問種別判別部３３により生成されたメタ情報を付加して所定のレコードに格納する。 In the bilingual format DB 11, each record stores, in association with one another, a record ID assigned to identify the record, an original sentence in the first language (here, Japanese), and a translated sentence in the second language (here, an English translation). The storage processing unit 35 assigns each record a record ID starting from “0”, reads the original sentences one at a time from the original text DB 10 and stores each of them, and further stores the translation of that original sentence. This translation is created by the machine translation unit 12 machine-translating the original sentence read from the original text DB 10; the storage processing unit 35 adds the meta information generated by the original-sentence question type determination unit 33 to the translation and stores it in the designated record.
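
As a rough illustration of this record layout, the following Python sketch builds the records in memory. The `Record` class, the `build_bilingual_db` function, and the two lookup tables standing in for the machine translation unit 12 and the original-sentence question type determination unit 33 are all hypothetical; a real implementation would call those units instead.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: int    # assigned sequentially, starting from 0
    source: str       # original sentence in the first language
    translation: str  # second-language translation with meta information prepended


def build_bilingual_db(sources, translate, make_meta):
    """Read sources one at a time and store each with its tagged translation."""
    db = []
    for record_id, source in enumerate(sources):
        tags = make_meta(source)
        db.append(Record(record_id, source, " ".join(tags + [translate(source)])))
    return db


# Stand-in lookup tables (assumptions for the example only).
TRANSLATIONS = {"仙台では６月に稲を植えます。": "The rice plant is planted in Sendai in June."}
META = {"仙台では６月に稲を植えます。": ["@Where PLANT", "@When PLANT"]}

db = build_bilingual_db(list(TRANSLATIONS), TRANSLATIONS.__getitem__, META.__getitem__)
print(db[0].translation)
# @Where PLANT @When PLANT The rice plant is planted in Sendai in June.
```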

なお、曖昧検索部１４による曖昧検索において、検索処理の高速化のためにインデックスを作成する場合には、以上のようにして作成を終えたところでインデックスを作成すればよい。なお、このようなインデックスを使用する場合は、データベースのファイルサイズが大きくなるため、メモリ使用量のところで動作可能サイズの限界になることが考えられる。しかし、このような場合には、例えば検索対象の分野毎にデータベースを分ける等の対処を行って、メモリ使用量を削減するとよい。 In the fuzzy search performed by the fuzzy search unit 14, when an index is created for speeding up the search process, the index may be created when the creation is completed as described above. Note that when such an index is used, the file size of the database becomes large, and it is considered that the operable size is limited in terms of memory usage. However, in such a case, it is preferable to reduce the memory usage by taking measures such as dividing the database for each field to be searched.

次に、図１の検索装置１による検索動作について、更に詳しく説明する。
検索装置１にクエリ文２が入力されると、まず、意味解析部３１が、クエリ文２の意味解析を行って、クエリ文２の概念構造を作成する。意味解析部３１は、原文の概念構造を作成するための前述した手法と同様にして、クエリ文２の概念構造の作成を行う。 Next, the search operation by the search device 1 in FIG. 1 will be described in more detail.
When the query sentence 2 is input to the search device 1, first, the semantic analysis unit 31 performs a semantic analysis of the query sentence 2 to create a conceptual structure of the query sentence 2. The semantic analysis unit 31 creates the conceptual structure of the query sentence 2 in the same manner as described above for creating the conceptual structure of the original sentence.

図６は、意味解析部３１によるクエリ文２の概念構造の作成例を表したものであり、検索装置１の使用者が、「イモの生産量が多い県はどこですか。」というクエリ文２を入力して検索装置１に検索開始の指示を行った場合を表している。検索装置１は、検索開始指示を受け取ると、意味解析部３１にクエリ文２の意味解析を行わせ、この結果、図６に図解されている、クエリ文２の概念構造が作成される。 FIG. 6 shows an example of creation of the conceptual structure of the query sentence 2 by the semantic analysis unit 31, and the user of the search apparatus 1 asks the query sentence 2 “Where is the prefecture with the highest production of potatoes?”. Is input and the search apparatus 1 is instructed to start search. Upon receiving the search start instruction, the search device 1 causes the semantic analysis unit 31 to perform semantic analysis of the query sentence 2, and as a result, the conceptual structure of the query sentence 2 illustrated in FIG. 6 is created.

クエリ文２の概念構造が得られると、次に、クエリ文用質問種別判別部２１が、クエリ文２の概念構造におけるアークが表している概念の関係と、そのアークによって繋がれている語句の品詞とに基づいて、クエリ文２で表現されている質問の種別の判別を行う。そして、この判別の結果に応じて、当該質問の種別を表しているメタ情報の生成を行う。このメタ情報の生成の手法について、図７を用いて説明する。 When the conceptual structure of the query sentence 2 is obtained, the query sentence question type discriminating unit 21 next selects the relationship between the concepts represented by the arcs in the conceptual structure of the query sentence 2 and the phrases connected by the arcs. Based on the part of speech, the type of the question expressed in the query sentence 2 is determined. Then, according to the determination result, meta information representing the type of the question is generated. A method of generating this meta information will be described with reference to FIG.

図７に図解されている概念構造は、図６の例におけるものと同一のものである。また、図６に図解されている抽出パターンａ、ｂ、ｃ、…、ｎは、いずれも、図１におけるクエリ文用質問種別判別パターン２２である。クエリ文用質問種別判別パターン２２は、クエリ文２の概念構造におけるアークについての意味解析の結果、及び、そのアークによって繋がれているノードが表している語句の品詞と、質問の種別とを対応付けたパターンである。特に、クエリ文用質問種別判別パターン２２は、アークによって繋がれているノードが表している語句の品詞が疑問詞である場合に、その疑問詞によって形成される質問の種別が対応付けられている。 The conceptual structure illustrated in FIG. 7 is the same as that in the example of FIG. Moreover, all of the extraction patterns a, b, c,..., N illustrated in FIG. 6 are the query sentence question type determination patterns 22 in FIG. The query sentence question type discrimination pattern 22 corresponds to the result of the semantic analysis on the arc in the conceptual structure of the query sentence 2, the part of speech of the phrase represented by the node connected by the arc, and the question type. It is a pattern attached. In particular, the query sentence question type determination pattern 22 is associated with the type of question formed by the question word when the part of speech of the phrase represented by the nodes connected by the arc is the question word. .

例えば、図７における「抽出パターンａ」は、原文の概念構造におけるアークについての意味解析の結果が『述語』であって、且つ、そのアークによって繋がれているノードが表している語句が、場所を訊ねる疑問詞『どこ？』である場合のパターンを表している。そして、この「抽出パターンａ」は、このパターンに合致する場合に、５Ｗ１Ｈ型の質問の種別のうちで場所を問う質問を表しているメタ情報『@Where』を生成することを表している。 For example, “extraction pattern a” in FIG. 7 represents the pattern in which the semantic analysis result for an arc in the conceptual structure of the original sentence is “predicate” and the phrase represented by the node connected by that arc is the interrogative “どこ？” (“where?”), which asks about a place. When this pattern is matched, “extraction pattern a” indicates that the meta information “@Where”, which represents a question asking about a place among the 5W1H question types, is generated.

このように、クエリ文用質問種別判別部２１は、クエリ文２が表している質問が５Ｗ１Ｈ型の質問のうちの何に該当するのかを、クエリ文２に含まれる疑問詞に注目して判別する。この点において、原文に含まれる語句が回答となる質問が５Ｗ１Ｈ型の質問のうちの何に該当するのかを、原文に含まれる各語句に注目して判別する原文用質問種別判別部３３と、クエリ文用質問種別判別部２１とは異なっている。 In this way, the query sentence question type determination unit 21 determines what the question represented by the query sentence 2 corresponds to among the 5W1H type questions by focusing on the question words included in the query sentence 2. To do. In this respect, a question type discriminating unit for original text 33 for discriminating which of the 5W1H type questions the question to which the word included in the original text is an answer corresponds to each word / phrase included in the original text, This is different from the query sentence question type determination unit 21.

図７の例では、概念構造から、意味解析の結果が『述語』であるアークとして（Ｅ）のアークが検出され、更に、このアークが、疑問詞である語句『どこ？』を表しているノードに繋がっていることが検出される。従って、この場合には、「抽出パターンａ」に基づき、メタ情報として、『@Where』が生成されたことを表している。 In the example of FIG. 7, arc (E) is detected in the conceptual structure as an arc whose semantic analysis result is “predicate”, and this arc is further detected to be connected to the node representing the interrogative “どこ？” (“where?”). In this case, therefore, the meta information “@Where” is generated on the basis of “extraction pattern a”.
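
The query-side determination differs from the source-side one in that it keys on an interrogative word as well as on the arc relation. A minimal Python sketch follows, under the assumption that each arc is available as a (relation, word, part of speech) triple. The function name, the part-of-speech labels, the “modifier” relation, and the second pattern entry for “いつ” are illustrative assumptions; only the “predicate” plus “どこ” pattern comes from the description.

```python
# (relation, interrogative word) -> 5W1H question type.
# The first entry follows extraction pattern a; the second is an assumed example.
QUERY_PATTERNS = {
    ("predicate", "どこ"): "Where",
    ("time", "いつ"): "When",
}


def query_meta(arcs):
    """Return the meta tag for the first arc that ends in a known interrogative.

    arcs: iterable of (relation, word, part_of_speech) triples.
    Returns None when no pattern matches.
    """
    for relation, word, pos in arcs:
        if pos != "interrogative":
            continue
        qtype = QUERY_PATTERNS.get((relation, word))
        if qtype is not None:
            return f"@{qtype}"
    return None


# Hypothetical arcs for the query "イモの生産量が多い県はどこですか。"
query_arcs = [
    ("modifier", "多い", "adjective"),
    ("predicate", "どこ", "interrogative"),
]

print(query_meta(query_arcs))  # @Where
```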

なお、クエリ文用質問種別判別部２１は、原文用質問種別判別部３３と同様に、生成されるメタ情報に中心自立語を含めるようにしてもよい。
クエリ文用質問種別判別部２１によりメタ情報が作成されると、次に、検索キーの作成が検索キー作成部１３によって行われる。この検索キーの作成について、図８を用いて説明する。 Note that the query sentence question type determination unit 21 may include a central independent word in the generated meta information, similarly to the original sentence question type determination unit 33.
When the meta information is created by the query sentence question type discriminating unit 21, the search key is then created by the search key creating unit 13. The creation of this search key will be described with reference to FIG.

検索キー作成部１３は、まず、機械翻訳部１２より、第一言語で表現されているクエリ文２の第二言語への翻訳文を取得し、この翻訳文を検索キーとする。図８の例では、「イモの生産量が多い県はどこですか。」という日本語のクエリ文２についての機械翻訳部１２による英訳文「Where is prefecture where lot of production of potato exists?」を検索キー作成部１３が取得して検索キーとして用いることを表している。 First, the search key creation unit 13 obtains a translation of the query sentence 2 expressed in the first language into the second language from the machine translation unit 12, and uses this translation as a search key. In the example of FIG. 8, the machine translation unit 12 searches for the English translation “Where is prefecture where lot of production of potato exists?” For the query sentence 2 in Japanese, “Where is the prefecture with the highest production of potatoes?” This indicates that the key creation unit 13 acquires and uses it as a search key.

次に、検索キー作成部１３は、クエリ文２による質問の種別の判別結果を表しているメタ情報をクエリ文用質問種別判別部２１から取得して検索キーとした翻訳文に付加することによって、このメタ情報も検索キーとする。図８の例では、クエリ文用質問種別判別部２１が生成した、クエリ文２による質問の種別の判別結果を表しているメタ情報『@Where』を検索キー作成部１３が取得して上述の英訳文の先頭に付加することで、このメタ情報も検索キーとして用いることを表している。 Next, the search key creation unit 13 obtains the meta information representing the question type discrimination result by the query sentence 2 from the query sentence question type discrimination unit 21 and adds it to the translated sentence as the search key. This meta information is also used as a search key. In the example of FIG. 8, the search key creation unit 13 obtains the meta information “@Where” generated by the query statement question type determination unit 21 and representing the determination result of the question type by the query statement 2, and By adding it to the beginning of the English translation, this meta information is also used as a search key.
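
The two steps of search key creation amount to prepending the meta information to the translated query. A minimal Python sketch, with a hypothetical function name, using the translation and tag from the FIG. 8 example:

```python
def make_search_key(translated_query, meta_tags):
    """Prepend the meta tags, in order, to the second-language query text."""
    return " ".join(list(meta_tags) + [translated_query])


key = make_search_key(
    "Where is prefecture where lot of production of potato exists?",
    ["@Where"],
)
print(key)
# @Where Where is prefecture where lot of production of potato exists?
```

With an empty tag list the function returns the translation unchanged, which corresponds to searching with the translated query alone.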

その後、曖昧検索部１４は、検索キー作成部１３により作成された検索キーを用いて対訳形式ＤＢ１１における第二言語の翻訳文を曖昧検索し、この曖昧検索の結果として得られた翻訳文に対応付けられている第一言語の原文を対訳形式ＤＢ１１から抽出して出力する。この曖昧検索の具体例を、図９を用いて説明する。 Thereafter, the fuzzy search unit 14 uses the search key created by the search key creation unit 13 to perform a fuzzy search on the translation sentence in the second language in the parallel translation format DB 11, and corresponds to the translation sentence obtained as a result of this fuzzy search. The attached original text of the first language is extracted from the parallel translation format DB 11 and output. A specific example of this fuzzy search will be described with reference to FIG.

前述したように、対訳形式ＤＢ１１の第二言語の翻訳文のフィールドには、原文の翻訳文に、原文用質問種別判別部３３により生成されたメタ情報が付加されて格納されている。従って、クエリ文用質問種別判別部２１で生成されたメタ情報が付加されたクエリ文２の翻訳文を検索キーとして用いて対訳形式ＤＢ１１の翻訳文のフィールドを曖昧検索することによって、知りたい情報が検索結果３の上位の順位で得られるようになる。図９の例は、クエリ文２の英訳文にメタ情報『@Where』を付加したものを検索キーとして用いることで、対訳形式ＤＢ１１の英語フィールドの英訳文にメタ情報『@Where』が付加されたものが曖昧検索の結果において上位の順位でヒットすることを表している。つまり、曖昧検索の結果において、メタ情報『@Where』が文字列として一致し、且つ、文中に『場所』の情報が含まれているものが上位の順位でヒットするようになる。そして、この曖昧検索の結果として得られた英訳文についての日本語の原文が、この順位に従って、検索結果３として検索装置１から出力されることを表している。 As described above, in the translated language field of the second language of the bilingual form DB 11, the meta information generated by the source sentence question type determination unit 33 is added to the original sentence and stored. Therefore, the information desired to be obtained by performing an ambiguous search on the translated sentence field of the parallel translation DB 11 using the translated sentence of the query sentence 2 to which the meta information generated by the query sentence type discriminating unit 21 is added as a search key. Are obtained in the higher rank of the search result 3. In the example of FIG. 9, the meta information “@Where” is added to the English translation of the English field of the parallel translation DB 11 by using the English translation of the query sentence 2 with the meta information “@Where” added as a search key. Represents a hit in the higher rank in the result of the fuzzy search. That is, in the result of the fuzzy search, the meta information “@Where” matches as a character string, and the sentence containing “location” information is hit in the higher rank. Then, the Japanese original sentence about the English translation sentence obtained as a result of the fuzzy search is output from the search device 1 as the search result 3 according to this order.
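
The ranking effect can be illustrated with a toy similarity measure. The description does not specify the fuzzy matching algorithm, so the Python sketch below substitutes a plain Jaccard similarity over lowercased word sets as a stand-in. Records 1 and 2 and their Japanese originals are hypothetical; record 0 and the search key are taken from the examples in the text.

```python
def tokens(text):
    """Lowercased word set with surrounding punctuation stripped."""
    return {t.strip(".,?!").lower() for t in text.split()} - {""}


def similarity(a, b):
    """Jaccard similarity of the two word sets (an assumed stand-in measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def fuzzy_search(key, records):
    """Rank (record_id, source, tagged_translation) records by similarity to key
    and return (record_id, source) pairs, best match first."""
    ranked = sorted(records, key=lambda r: similarity(key, r[2]), reverse=True)
    return [(r[0], r[1]) for r in ranked]


records = [
    (0, "仙台では６月に稲を植えます。",
     "@Where PLANT @When PLANT The rice plant is planted in Sendai in June."),
    (1, "北海道ではイモが多く生産されます。",  # hypothetical record
     "@Where PRODUCE A lot of potato is produced in Hokkaido prefecture."),
    (2, "イモは秋に収穫されます。",  # hypothetical record
     "@When HARVEST Potato is harvested in autumn."),
]

key = "@Where Where is prefecture where lot of production of potato exists?"
results = fuzzy_search(key, records)
print([record_id for record_id, _ in results])  # [1, 2, 0]
```

Because the tag “@Where” is just another token in the stored text, a record carrying the same tag gains overlap with the key without any separate matching logic, which is the design point the paragraph above makes.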

図１の検索装置１の検索動作は以上のようにして行われるので、知りたい情報が検索結果３の上位の順位で得られ、しかも、原文及びクエリ文２の翻訳文を用いた曖昧検索により、単語レベルの表記揺れや語順の揺れによる検索精度の低下が抑制される。 Since the search operation of the search device 1 in FIG. 1 is performed as described above, the information to be obtained is obtained in the higher rank of the search result 3, and furthermore, by the ambiguous search using the translated sentence of the original sentence and the query sentence 2 In addition, a decrease in search accuracy due to fluctuations in word level notation and word order is suppressed.

次に図１０について説明する。図１０は、図１の検索装置１のハードウエア構成の一例を表している。
本構成例においては、検索装置１はコンピュータ５０により構成されている。コンピュータ５０は、ＭＰＵ５１、ＲＯＭ５２、ＲＡＭ５３、ハードディスク装置５４、入力装置５５、出力装置５６、インタフェース装置５７、及び記録媒体駆動装置５８を備えている。なお、これらの構成要素はバスライン５９を介して接続されており、ＭＰＵ５１の管理の下で各種のデータを相互に授受することができる。 Next, FIG. 10 will be described. FIG. 10 shows an example of a hardware configuration of the search device 1 of FIG.
In this configuration example, the search device 1 is configured by a computer 50. The computer 50 includes an MPU 51, ROM 52, RAM 53, hard disk device 54, input device 55, output device 56, interface device 57, and recording medium drive device 58. These components are connected via a bus line 59, and various data can be exchanged under the management of the MPU 51.

ＭＰＵ（Micro Processing Unit）５１は、コンピュータ５０全体の動作を制御する演算処理装置である。
ＲＯＭ（Read Only Memory）５２は、所定の基本制御プログラムが予め記録されている読み出し専用半導体メモリである。ＭＰＵ５１は、この基本制御プログラムをコンピュータ５０の起動時に読み出して実行することにより、コンピュータ５０の各構成要素の動作制御が可能になる。なお、ＲＯＭ５２として、フラッシュメモリ等の、記憶データが不揮発性であるメモリを使用してもよい。 An MPU (Micro Processing Unit) 51 is an arithmetic processing unit that controls the operation of the entire computer 50.
A ROM (Read Only Memory) 52 is a read-only semiconductor memory in which a predetermined basic control program is recorded in advance. The MPU 51 can control the operation of each component of the computer 50 by reading out and executing the basic control program when the computer 50 is started. As the ROM 52, a memory such as a flash memory whose storage data is nonvolatile may be used.

ＲＡＭ（Random Access Memory）５３は、ＭＰＵ５１が各種の制御プログラムを実行する際に、必要に応じて作業用記憶領域として使用する、随時書き込み読み出し可能な半導体メモリである。 A RAM (Random Access Memory) 53 is a semiconductor memory that can be written and read at any time and used as a working storage area as needed when the MPU 51 executes various control programs.

ハードディスク装置５４は、ＭＰＵ５１によって実行される各種の制御プログラムや各種のデータを記憶しておく記憶装置である。ＭＰＵ５１は、ハードディスク装置５４に記憶されている所定の制御プログラムを読み出して実行することにより、各種の制御処理を行えるようになる。このコンピュータ５０を用いて検索装置１を構成する場合には、ハードディスク装置５４には図１の原文テキストＤＢ１０、クエリ文用質問種別判別パターン２２、及び原文用質問種別判別パターン３４を予め格納しておくようにする。また、このコンピュータ５０を用いて検索装置１を構成する場合には、ハードディスク装置５４は、対訳形式ＤＢ１１としても使用される。 The hard disk device 54 is a storage device that stores various control programs executed by the MPU 51 and various data. The MPU 51 can perform various control processes by reading and executing a predetermined control program stored in the hard disk device 54. When the search device 1 is configured using the computer 50, the hard disk device 54 stores in advance the original text DB 10, the query sentence question type determination pattern 22, and the original sentence question type determination pattern 34 of FIG. To leave. When the search device 1 is configured using the computer 50, the hard disk device 54 is also used as the parallel translation format DB 11.

入力装置５５は、例えばキーボード装置やマウス装置であり、例えば検索装置１の使用者により操作されると、その操作内容に対応付けられている使用者からの各種情報の入力を取得し、取得した入力情報をＭＰＵ５１に送付する。検索装置１への入力であるクエリ文２は、例えば入力装置５５によって受け付けられる。 The input device 55 is, for example, a keyboard device or a mouse device. For example, when operated by the user of the search device 1, the input device 55 acquires and acquires various information inputs from the user associated with the operation content. The input information is sent to the MPU 51. The query sentence 2 that is an input to the search device 1 is received by the input device 55, for example.

出力装置５６は例えば液晶ディスプレイやスピーカであり、ＭＰＵ５１から送付される出力データに応じ、合成音声の発音や、各種のテキスト・画像の表示を行う。検索装置１の出力である検索結果３は、例えば出力装置５６で表示され、あるいは、合成音声による検索結果３の読み上げにより出力される。 The output device 56 is, for example, a liquid crystal display or a speaker, and generates synthesized speech and displays various texts and images according to output data sent from the MPU 51. The search result 3 that is the output of the search device 1 is displayed on the output device 56, for example, or is output by reading out the search result 3 using synthesized speech.

インタフェース装置５７は、このコンピュータ５０に接続される各種機器との間での各種情報の授受の管理を行う。検索装置１への入力であるクエリ文２は、例えば他の機器から出力されて、直接に、あるいはインターネット等の通信ネットワークを介して、インタフェース装置５７で受け付けるようにしてもよい。また、検索装置１の出力である検索結果は、インタフェース装置５７から出力して、直接に、あるいはインターネット等の通信ネットワークを介して、他の機器へ送付するようにしてもよい。更に、インタフェース装置５７に直接に、あるいはインターネット等の通信ネットワークを介して接続される不図示の外部記憶装置を、コンピュータ５０自身が備えているハードディスク装置５４の代用若しくは補助として、使用するようにしてもよい。 The interface device 57 manages the exchange of various information with the devices connected to the computer 50. The query sentence 2, which is the input to the search device 1, may for example be output from another device and received by the interface device 57 directly or via a communication network such as the Internet. Likewise, the search result that is the output of the search device 1 may be output from the interface device 57 and sent to another device directly or via a communication network such as the Internet. Furthermore, an external storage device (not shown) connected to the interface device 57 directly or via a communication network such as the Internet may be used as a substitute for, or as a supplement to, the hard disk device 54 of the computer 50 itself.

記録媒体駆動装置５８は、可搬型記録媒体６０に記録されている各種の制御プログラムやデータの読み出しを行う装置である。ＭＰＵ５１は、可搬型記録媒体６０に記録されている所定の制御プログラムを、記録媒体駆動装置５８を介して読み出して実行することによって、各種の制御処理を行うようにしてもよい。なお、可搬型記録媒体６０としては、例えばＣＤ−ＲＯＭ（Compact Disc Read Only Memory）やＤＶＤ−ＲＯＭ（Digital Versatile Disc Read Only Memory）、ＵＳＢ（Universal Serial Bus）規格のコネクタが備えられているフラッシュメモリなどがある。 The recording medium driving device 58 is a device that reads various control programs and data recorded on the portable recording medium 60. The MPU 51 may perform various control processes by reading and executing a predetermined control program recorded on the portable recording medium 60 via the recording medium driving device 58. As the portable recording medium 60, for example, a flash memory equipped with a CD-ROM (Compact Disc Read Only Memory), a DVD-ROM (Digital Versatile Disc Read Only Memory), or a USB (Universal Serial Bus) standard connector. and so on.

このようなコンピュータ５０を用いて検索装置１を構成するには、例えば、後述する各種の制御処理をＭＰＵ５１に行わせるための制御プログラムを作成する。作成された制御プログラムはハードディスク装置５４若しくは可搬型記録媒体６０に予め格納しておく。また、ハードディスク装置５４には原文テキストＤＢ１０、クエリ文用質問種別判別パターン２２、及び原文用質問種別判別パターン３４を予め格納しておくようにする。そして、ＭＰＵ５１に所定の指示を与えてこの制御プログラムを読み出させて実行させる。こうすることで、コンピュータ５０が、対訳形式ＤＢ１１、機械翻訳部１２、検索キー作成部１３、曖昧検索部１４、クエリ文用質問種別判別部２１、及びメタ情報生成部３２として機能するようになる。 In order to configure the search device 1 using such a computer 50, for example, a control program for causing the MPU 51 to perform various control processes described later is created. The created control program is stored in advance in the hard disk device 54 or the portable recording medium 60. Further, the original text DB 10, the query sentence question type determination pattern 22, and the original sentence question type determination pattern 34 are stored in the hard disk device 54 in advance. Then, a predetermined instruction is given to the MPU 51 to read and execute this control program. By doing so, the computer 50 functions as the parallel translation format DB 11, the machine translation unit 12, the search key creation unit 13, the fuzzy search unit 14, the query sentence question type determination unit 21, and the meta information generation unit 32. .

次に、コンピュータ５０のＭＰＵ５１により行われる各種の制御処理について説明する。なお、以下の説明では、第一言語を日本語とし、第二言語を英語とする。 Next, various control processes performed by the MPU 51 of the computer 50 will be described. In the following description, the first language is Japanese and the second language is English.
まず図１１について説明する。図１１は、対訳形式ＤＢ作成処理の第一の例の処理内容を表したフローチャートである。この処理は、原文テキストＤＢ１０に格納されている日本語の原文とその英訳文とを対応付けて対訳形式ＤＢ１１に格納する処理である。但し、この第一の例では、原文に含まれている語句が回答となる質問の種別を表すメタ情報の生成及び対訳形式ＤＢ１１への格納については行わない場合の処理である。 First, FIG. 11 will be described. FIG. 11 is a flowchart showing the processing contents of the first example of the bilingual format DB creation process. This process associates each Japanese original sentence stored in the original text DB 10 with its English translation and stores them in the bilingual format DB 11. Note that in this first example, the meta information representing the type of question answered by a phrase in the original sentence is neither generated nor stored in the bilingual format DB 11.

図１１において、まず、Ｓ１０１では、変数ｉに初期値「０」を代入する処理をＭＰＵ５１が行う。 In FIG. 11, first, in S101, the MPU 51 assigns the initial value “0” to the variable i.
次に、Ｓ１０２では、原文テキストＤＢ１０を参照して、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在するか否かを判定する処理をＭＰＵ５１が行う。なお、本実施例において、原文テキストＤＢ１０には、各レコードに、レコードを識別するために付与されている、「０」から始まるレコードＩＤと、日本語の原文とが格納されているものとする。 Next, in S102, the MPU 51 refers to the original text DB 10 and determines whether a record whose record ID equals the current value of the variable i exists in the original text DB 10. In this embodiment, each record of the original text DB 10 is assumed to store a record ID, which starts from “0” and is assigned to identify the record, together with a Japanese original sentence.

Ｓ１０２の判定処理において、ＭＰＵ５１は、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在すると判定したとき（判定結果がＹｅｓのとき）にはＳ１０３に処理を進める。一方、ＭＰＵ５１は、ここで、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在しないと判定したとき（判定結果がＮｏのとき）には、この対訳形式ＤＢ作成処理を終了する。 In the determination of S102, when the MPU 51 determines that a record whose record ID equals the current value of the variable i exists in the original text DB 10 (when the determination result is Yes), it advances the process to S103. When the MPU 51 determines that no such record exists in the original text DB 10 (when the determination result is No), it ends this bilingual format DB creation process.

次に、Ｓ１０３では、この処理実行時の変数ｉの値をレコードＩＤとする原文テキストＤＢ１０のレコードから原文を読み出し、読み出した原文に対して機械翻訳処理を実行して当該原文の英訳文を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は機械翻訳部１２として機能する。 Next, in S103, the MPU 51 reads the original sentence from the record of the original text DB 10 whose record ID equals the current value of the variable i, and executes machine translation on that sentence to create its English translation. The MPU 51 that performs this process functions as the machine translation unit 12.

次に、Ｓ１０４では、この処理実行時の変数ｉの値をレコードＩＤとし、Ｓ１０３の処理において原文テキストＤＢ１０から読み出した原文及び作成した英訳文を各フィールドに格納したレコードを作成して対訳形式ＤＢ１１へ格納する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は、機械翻訳部１２として機能する。この処理を行うＭＰＵ５１は格納処理部３５として機能する。 Next, in S104, the MPU 51 creates a record whose record ID is the current value of the variable i and whose fields store the original sentence read from the original text DB 10 in S103 and the English translation created from it, and stores the record in the bilingual format DB 11. The MPU 51 that performs this process functions as the machine translation unit 12. The MPU 51 that performs this process functions as the storage processing unit 35.

次に、Ｓ１０５では、この処理時点での変数ｉの値に「１」を加算した結果の値を改めて変数ｉに代入する処理をＭＰＵ５１が行い、その後はＳ１０２へ処理を戻して上述した処理が繰り返される。 Next, in S105, the MPU 51 adds “1” to the current value of the variable i and assigns the result back to the variable i. The process then returns to S102 and the steps described above are repeated.

以上までの処理が対訳形式ＤＢ作成処理の第一の例であり、この処理をＭＰＵ５１が行うことによって、原文テキストＤＢ１０に格納されている日本語の原文とその英訳文とが対応付けられて対訳形式ＤＢ１１に格納される。 The processing up to this point is the first example of the bilingual format DB creation process. When the MPU 51 performs it, each Japanese original sentence stored in the original text DB 10 is associated with its English translation and stored in the bilingual format DB 11.
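
As a reading aid, the loop of FIG. 11 (S101 to S105) can be sketched in Python as follows. This is a minimal sketch and not the implementation disclosed here: the function name `build_bilingual_db`, the dictionary and list used as stand-ins for the original text DB 10 and the bilingual format DB 11, and the toy `translate` callable standing in for the machine translation unit 12 are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def build_bilingual_db(
    source_db: Dict[int, str],
    translate: Callable[[str], str],
) -> List[dict]:
    """Sketch of FIG. 11: pair every original sentence with its translation."""
    bilingual_db: List[dict] = []
    i = 0                                    # S101: start from record ID 0
    while i in source_db:                    # S102: does a record with ID i exist?
        original = source_db[i]              # S103: read the original sentence
        translation = translate(original)    # S103: machine-translate it
        bilingual_db.append(                 # S104: store the paired record
            {"id": i, "original": original, "translation": translation}
        )
        i += 1                               # S105: move on to the next record ID
    return bilingual_db                      # S102 answered No: processing ends


# Hypothetical stand-in for the machine translation unit 12.
toy_dictionary = {
    "写真撮影時の注意点": "Precautions when taking a photo",
    "写真をメールで送りたいとき": "When you want to send a photo by email",
}
source = {0: "写真撮影時の注意点", 1: "写真をメールで送りたいとき"}
db = build_bilingual_db(source, lambda sentence: toy_dictionary[sentence])
for record in db:
    print(record["id"], record["original"], "->", record["translation"])
```

Because S102 tests only whether record ID i exists, the loop stops at the first missing ID, so any record that follows a gap in the IDs would not be processed.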

次に図１２について説明する。図１２は、検索処理の第一の例の処理内容を表したフローチャートである。この処理は、上述した対訳形式ＤＢ作成処理の第一の例の実行によって作成された対訳形式ＤＢ１１に対し、入力された日本語のクエリ文２についての英訳文を検索キーとして用いて曖昧検索を行う処理である。 Next, FIG. 12 will be described. FIG. 12 is a flowchart showing the processing contents of the first example of the search process. This process performs a fuzzy search on the bilingual format DB 11 created by the first example of the bilingual format DB creation process described above, using the English translation of the input Japanese query sentence 2 as the search key.

図１２において、まず、Ｓ１５１では、検索装置１に入力されたクエリ文２を取得し、取得したクエリ文２に対して機械翻訳処理を実行してクエリ文２の英訳文を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は機械翻訳部１２として機能する。 In FIG. 12, first, in S151, the MPU 51 acquires the query sentence 2 input to the search device 1 and executes machine translation on it to create an English translation of the query sentence 2. The MPU 51 that performs this process functions as the machine translation unit 12.

次に、Ｓ１５２では、Ｓ１５１の処理によって作成された英訳文を検索キーとして用いて、対訳形式ＤＢ１１の英訳文のフィールドを曖昧検索する処理をＭＰＵ５１が行う。そして、続くＳ１５３では、曖昧検索の結果として得られた英訳文と同一のレコードに格納されている原文を対訳形式ＤＢ１１から読み出して検索結果３として出力する処理をＭＰＵ５１が行い、その後は、この検索処理を終了する。このＳ１５２及びＳ１５３の処理を行うＭＰＵ５１は曖昧検索部１４として機能する。 Next, in S152, the MPU 51 performs an ambiguous search on the English translation text field of the parallel translation DB 11 using the English translation text created by the processing of S151 as a search key. In the subsequent S153, the MPU 51 performs a process of reading the original text stored in the same record as the English translation obtained as a result of the fuzzy search from the parallel translation DB 11 and outputting it as the search result 3, and thereafter, this search is performed. The process ends. The MPU 51 that performs the processes of S152 and S153 functions as the fuzzy search unit 14.

以上までの処理が検索処理の第一の例であり、この処理をＭＰＵ５１が行うことによって、検索装置１による曖昧検索が行われ、原文及びクエリ文２における日本語での単語表現や語順の揺れが吸収されて、これらの揺れに起因する検索精度の低下が抑制される。 The processing up to this point is the first example of the search process. When the MPU 51 performs it, the search device 1 carries out a fuzzy search in which variations in Japanese wording and word order in the original sentences and the query sentence 2 are absorbed, so that the loss of search accuracy caused by these variations is suppressed.
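
The first search example (S151 to S153) might look like the following. The text here does not fix a particular fuzzy-matching algorithm, so this sketch substitutes the character-level similarity of Python's standard `difflib.SequenceMatcher`. That choice, the `search` function, the `top_n` parameter, and the toy translation data are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def search(
    bilingual_db: List[dict],
    query: str,
    translate: Callable[[str], str],
    top_n: int = 3,
) -> List[str]:
    """Sketch of FIG. 12: fuzzy-search the translations, return the originals."""
    key = translate(query).lower()            # S151: translate the query sentence

    def similarity(record: dict) -> float:    # S152: score each stored translation
        return SequenceMatcher(None, key, record["translation"].lower()).ratio()

    ranked = sorted(bilingual_db, key=similarity, reverse=True)
    return [record["original"] for record in ranked[:top_n]]   # S153: output originals


# Hypothetical data; the English strings stand in for machine translation output.
bilingual_db = [
    {"id": 0, "original": "写真撮影時の注意点",
     "translation": "Precautions when taking a photo"},
    {"id": 1, "original": "らくらくホンで写真を撮る方法",
     "translation": "How to take a photo with Raku-Raku Phone"},
    {"id": 2, "original": "写真をメールで送りたいとき",
     "translation": "When you want to send a photo by email"},
]
query_translations = {
    "らくらくフォンで写真を撮りたい": "I want to take a photo with Raku-Raku Phone",
}
results = search(bilingual_db, "らくらくフォンで写真を撮りたい",
                 lambda q: query_translations[q])
print(results[0])
```

In this toy data the query writes フォン while the stored sentence writes ホン. The sketch assumes both are rendered as "Phone" in English, so the spelling variation disappears before matching and the らくらくホン record ranks first.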

次に図１３Ａ及び図１３Ｂについて説明する。図１３Ａ及び図１３Ｂは、対訳形式ＤＢ作成処理の第二の例の処理内容を表したフローチャートである。この処理は、原文テキストＤＢ１０に格納されている日本語の原文とその英訳文とを対応付けて対訳形式ＤＢ１１に格納する処理である。また、この第二の例では、原文に含まれている語句が回答となる質問の種別を表すメタ情報の生成及び対訳形式ＤＢ１１への格納も行われる。 Next, FIGS. 13A and 13B will be described. 13A and 13B are flowcharts showing the processing contents of the second example of the bilingual format DB creation processing. This process is a process of associating a Japanese original sentence stored in the original text DB 10 and its English translation sentence in the parallel translation format DB 11. Further, in the second example, generation of meta information indicating the type of question in which the phrase included in the original sentence is an answer and storage in the bilingual form DB 11 are also performed.

まず、図１３ＡのＳ２０１において、変数ｉに初期値「０」を代入する処理をＭＰＵ５１が行う。 First, in S201 of FIG. 13A, the MPU 51 assigns the initial value “0” to the variable i.
次に、Ｓ２０２では、原文テキストＤＢ１０を参照して、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在するか否かを判定する処理をＭＰＵ５１が行う。なお、本実施例においても、原文テキストＤＢ１０には、各レコードに、レコードを識別するために付与されている、「０」から始まるレコードＩＤと、日本語の原文とが格納されているものとする。 Next, in S202, the MPU 51 refers to the original text DB 10 and determines whether a record whose record ID equals the current value of the variable i exists in the original text DB 10. In this embodiment as well, each record of the original text DB 10 is assumed to store a record ID, which starts from “0” and is assigned to identify the record, together with a Japanese original sentence.

Ｓ２０２の判定処理において、ＭＰＵ５１は、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在すると判定したとき（判定結果がＹｅｓのとき）にはＳ２０３に処理を進める。一方、ＭＰＵ５１は、ここで、この処理実行時の変数ｉの値をレコードＩＤとする原文のレコードが原文テキストＤＢ１０に存在しないと判定したとき（判定結果がＮｏのとき）には、この対訳形式ＤＢ作成処理を終了する。 In the determination of S202, when the MPU 51 determines that a record whose record ID equals the current value of the variable i exists in the original text DB 10 (when the determination result is Yes), it advances the process to S203. When the MPU 51 determines that no such record exists in the original text DB 10 (when the determination result is No), it ends this bilingual format DB creation process.

次に、Ｓ２０３では、この処理実行時の変数ｉの値をレコードＩＤとする原文テキストＤＢ１０のレコードから原文を読み出し、読み出した原文に対して意味解析を行って当該原文についての概念構造を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は意味解析部３１として機能する。 Next, in S203, the MPU 51 reads the original sentence from the record of the original text DB 10 whose record ID equals the current value of the variable i, and performs semantic analysis on that sentence to create its conceptual structure. The MPU 51 that performs this process functions as the semantic analysis unit 31.

次に、Ｓ２０４では、Ｓ２０３の処理による原文の意味解析の結果に基づいて、且つ、原文用質問種別判別パターン３４を用いることによって、当該原文についての英訳文に含まれる語句についての前述のタグ情報を生成する処理をＭＰＵ５１が行う。このＳ２０４の処理の詳細は、図１３Ｂを用いて後で説明する。この処理を行うＭＰＵ５１は、原文用質問種別判別部３３として機能する。 Next, in S204, based on the result of the semantic analysis of the original sentence in S203 and using the original sentence question type determination pattern 34, the MPU 51 generates the above-mentioned tag information for phrases included in the English translation of the original sentence. Details of S204 will be described later with reference to FIG. 13B. The MPU 51 that performs this process functions as the original sentence question type determination unit 33.

次に、Ｓ２０５では、この処理実行時の変数ｉの値をレコードＩＤとする原文テキストＤＢ１０のレコードから原文を読み出し、読み出した原文に対して機械翻訳処理を実行して当該原文の英訳文を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は機械翻訳部１２として機能する。 Next, in S205, the MPU 51 reads the original sentence from the record of the original text DB 10 whose record ID equals the current value of the variable i, and executes machine translation on that sentence to create its English translation. The MPU 51 that performs this process functions as the machine translation unit 12.

次に、Ｓ２０６では、Ｓ２０３の処理において原文テキストＤＢ１０から読み出した原文と、作成した英訳文にＳ２０４の処理によって生成したタグ情報を付加したものとよりレコードを作成して対訳形式ＤＢ１１へ格納する処理をＭＰＵ５１が行う。なお、作成されたレコードには、この処理実行時の変数ｉの値が、レコードＩＤとして付加される。この処理を行うＭＰＵ５１は格納処理部３５として機能する。 Next, in S206, the MPU 51 creates a record from the original sentence read from the original text DB 10 in S203 and the created English translation to which the tag information generated in S204 has been added, and stores the record in the bilingual format DB 11. The current value of the variable i is added to the created record as its record ID. The MPU 51 that performs this process functions as the storage processing unit 35.

次に、Ｓ２０７では、この処理時点での変数ｉの値に「１」を加算した結果の値を改めて変数ｉに代入する処理をＭＰＵ５１が行い、その後はＳ２０２へ処理を戻して上述した処理が繰り返される。 Next, in S207, the MPU 51 adds “1” to the current value of the variable i and assigns the result back to the variable i. The process then returns to S202 and the steps described above are repeated.

次に、図１３ＡのＳ２０４の処理の詳細について、図１３Ｂのフローチャートを用いて説明する。 Next, details of the processing in S204 of FIG. 13A will be described using the flowchart of FIG. 13B.
まず、Ｓ２１１において、変数ｊに初期値「０」を代入すると共に、メタ情報を初期化して空のデータ（ヌルデータ）を代入する処理をＭＰＵ５１が行う。 First, in S211, the MPU 51 assigns the initial value “0” to the variable j and initializes the meta information to empty data (null data).

次に、Ｓ２１２では、原文用質問種別判別パターン３４を参照して、変数ｊの値に相当する順番のパターンが原文用質問種別判別パターン３４に存在するか否かを判定する処理をＭＰＵ５１が行う。なお、本実施例において、原文用質問種別判別パターン３４は、所定の記憶領域において、０番目から順番に格納されているものとする。 Next, in S212, the MPU 51 performs a process of referring to the original sentence question type determination pattern 34 to determine whether or not the pattern corresponding to the value of the variable j exists in the original sentence question type determination pattern 34. . In this embodiment, it is assumed that the original text question type determination pattern 34 is stored in order from the 0th in a predetermined storage area.

Ｓ２１２の判定処理において、ＭＰＵ５１は、変数ｊの値に相当する順番のパターンが原文用質問種別判別パターン３４に存在すると判定したとき（判定結果がＹｅｓのとき）にはＳ２１３に処理を進める。一方、ＭＰＵ５１は、ここで、変数ｊの値に相当する順番のパターンが原文用質問種別判別パターン３４には存在しないと判定したとき（判定結果がＮｏのとき）には、この図１３Ｂの処理を終了して図１３Ａに処理を戻す。 In the determination of S212, when the MPU 51 determines that a pattern at the position corresponding to the value of the variable j exists in the original sentence question type determination pattern 34 (when the determination result is Yes), it advances the process to S213. When the MPU 51 determines that no pattern exists at that position (when the determination result is No), it ends the processing of FIG. 13B and returns to the processing of FIG. 13A.

次に、Ｓ２１３では、変数ｊの値に相当する順番の原文用質問種別判別パターン３４が、図１３ＡのＳ２０３の処理で作成された原文の概念構造の全部若しくは一部と一致しているか否かを、図４を用いて説明したようにして判定する処理をＭＰＵ５１が行う。ＭＰＵ５１は、ここで、一致していると判定したとき（判定結果がＹｅｓのとき）にはＳ２１４に処理を進め、一致していないと判定したとき（判定結果がＮｏのとき）にはＳ２１５に処理を進める。 Next, in S213, the MPU 51 determines, in the manner described with reference to FIG. 4, whether the original sentence question type determination pattern 34 at the position corresponding to the value of the variable j matches all or part of the conceptual structure of the original sentence created in S203 of FIG. 13A. When the MPU 51 determines that they match (when the determination result is Yes), it advances the process to S214. When it determines that they do not match (when the determination result is No), it advances the process to S215.

次に、Ｓ２１４では、図１３Ｂの処理が開始されてから作成してきたメタ情報に、Ｓ２１３の判定処理によって一致すると判定された原文用質問種別判別パターン３４に対応付けられているメタ情報を更に連結して追加する処理をＭＰＵ５１が行う。 Next, in S214, the MPU 51 appends the meta information associated with the original sentence question type determination pattern 34 that was found to match in S213 to the meta information accumulated since the processing of FIG. 13B started.

次に、Ｓ２１５では、この処理時点での変数ｊの値に「１」を加算した結果の値を改めて変数ｊに代入する処理をＭＰＵ５１が行い、その後はＳ２１２へ処理を戻して上述した処理が繰り返される。 Next, in S215, the MPU 51 adds “1” to the current value of the variable j and assigns the result back to the variable j. The process then returns to S212 and the steps described above are repeated.

以上までの処理が対訳形式ＤＢ作成処理の第二の例であり、この処理をＭＰＵ５１が行うことによって、原文テキストＤＢ１０に格納されている日本語の原文とその英訳文とが対応付けられて対訳形式ＤＢ１１に格納される。また、この処理をＭＰＵ５１が行うことによって、原文に含まれている語句が回答となる質問の種別を表すメタ情報の生成及び対訳形式ＤＢ１１への格納も行われる。 The processing up to this point is the second example of the bilingual format DB creation process. When the MPU 51 performs it, each Japanese original sentence stored in the original text DB 10 is associated with its English translation and stored in the bilingual format DB 11. In addition, the meta information representing the type of question answered by a phrase in the original sentence is generated and stored in the bilingual format DB 11.
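
The pattern loop of FIG. 13B (S211 to S215) can be illustrated with a small sketch. The conceptual structure and the way it is matched against the original sentence question type determination pattern 34 are described elsewhere in this document (with reference to FIG. 4), so the representation below is a simplifying assumption: arcs are `(relation, head, dependent)` tuples, each pattern is a `(relation, question_type)` pair, and the relation names `OBJECT`, `PLACE`, `TIME`, and `AGENT` are hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[str, str, str]     # (relation, head phrase, dependent phrase)
Pattern = Tuple[str, str]      # (relation to look for, question type)


def generate_meta_info(arcs: List[Arc], patterns: List[Pattern]) -> List[Tuple[str, str]]:
    """Sketch of FIG. 13B: collect (question type, answer phrase) pairs."""
    j = 0                                       # S211: pattern index starts at 0
    meta_info: List[Tuple[str, str]] = []       # S211: meta information starts empty
    while j < len(patterns):                    # S212: is there a j-th pattern?
        relation, question_type = patterns[j]
        for arc_relation, _head, dependent in arcs:
            if arc_relation == relation:        # S213: pattern matches part of the structure
                meta_info.append((question_type, dependent))   # S214: append its meta information
        j += 1                                  # S215: advance to the next pattern
    return meta_info                            # S212 answered No: return to FIG. 13A


# Hypothetical conceptual structure for "The meeting is held in Tokyo on Monday".
arcs = [
    ("OBJECT", "held", "meeting"),
    ("PLACE", "held", "Tokyo"),
    ("TIME", "held", "Monday"),
]
patterns = [("PLACE", "WHERE"), ("TIME", "WHEN"), ("AGENT", "WHO")]
print(generate_meta_info(arcs, patterns))
```

In this toy input the `AGENT` pattern matches no arc, so it contributes nothing; only the place and time phrases receive question-type tags.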

次に図１４Ａ、図１４Ｂ、及び図１４Ｃについて説明する。図１４Ａ、図１４Ｂ、及び図１４Ｃは、検索処理の第二の例の処理内容を表したフローチャートである。この処理は、上述した対訳形式ＤＢ作成処理の第二の例の実行によって作成された対訳形式ＤＢ１１に対し、入力された日本語のクエリ文２についての英訳文とクエリ文２の質問の種別とを検索キーとして用いて曖昧検索を行う処理である。 Next, FIGS. 14A, 14B, and 14C will be described. FIGS. 14A, 14B, and 14C are flowcharts showing the processing contents of the second example of the search process. This process performs a fuzzy search on the bilingual format DB 11 created by the second example of the bilingual format DB creation process described above, using the English translation of the input Japanese query sentence 2 and the question type of the query sentence 2 as the search key.

まず、図１４ＡのＳ２５１において、検索装置１に入力されたクエリ文２から検索キーを作成する処理をＭＰＵ５１が行う。このＳ２５１の処理の詳細は、図１４Ｂ及び図１４Ｃを用いて後で説明する。 First, in S251 of FIG. 14A, the MPU 51 creates a search key from the query sentence 2 input to the search device 1. Details of S251 will be described later with reference to FIGS. 14B and 14C.

次に、Ｓ２５２では、Ｓ２５１の処理によって作成された検索キーを用いて、対訳形式ＤＢ１１の英訳文のフィールドを曖昧検索する処理をＭＰＵ５１が行う。そして、続くＳ２５３では、曖昧検索の結果として得られた英訳文と同一のレコードに格納されている原文を対訳形式ＤＢ１１から読み出して検索結果３として出力する処理をＭＰＵ５１が行い、その後は、この検索処理を終了する。このＳ２５２及びＳ２５３の処理を行うＭＰＵ５１は曖昧検索部１４として機能する。 Next, in S252, the MPU 51 performs an ambiguous search for the English translation field of the bilingual form DB11 using the search key created in the process of S251. In the subsequent S253, the MPU 51 performs a process of reading the original text stored in the same record as the English translation obtained as a result of the fuzzy search from the parallel translation DB 11 and outputting it as the search result 3, and thereafter this search is performed. End the process. The MPU 51 that performs the processing of S252 and S253 functions as the fuzzy search unit 14.

次に、図１４ＡのＳ２５１の処理の詳細について、図１４Ｂのフローチャートを用いて説明する。 Next, details of the processing of S251 of FIG. 14A will be described using the flowchart of FIG. 14B.
まず、Ｓ２６１において、検索装置１に入力されたクエリ文２を取得し、取得したクエリ文２に対して意味解析を行ってクエリ文２についての概念構造を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は意味解析部３１として機能する。 First, in S261, the MPU 51 acquires the query sentence 2 input to the search device 1 and performs semantic analysis on it to create the conceptual structure of the query sentence 2. The MPU 51 that performs this process functions as the semantic analysis unit 31.

次に、Ｓ２６２では、Ｓ２６１の処理によるクエリ文２の意味解析の結果に基づいて、且つ、クエリ文用質問種別判別パターン２２を用いることによって、クエリ文２についての英訳文に含まれる語句についての前述のタグ情報を生成する処理をＭＰＵ５１が行う。このＳ２６２の処理の詳細は、図１４Ｃを用いて後で説明する。この処理を行うＭＰＵ５１は、クエリ文用質問種別判別部２１として機能する。 Next, in S262, based on the result of the semantic analysis of the query sentence 2 in S261 and using the query sentence question type determination pattern 22, the MPU 51 generates the above-mentioned tag information for phrases included in the English translation of the query sentence 2. Details of S262 will be described later with reference to FIG. 14C. The MPU 51 that performs this process functions as the query sentence question type determination unit 21.

次に、Ｓ２６３では、Ｓ２６１の処理により取得されたクエリ文２に対して機械翻訳処理を実行してクエリ文２の英訳文を作成する処理をＭＰＵ５１が行う。この処理を行うＭＰＵ５１は機械翻訳部１２として機能する。なお、本実施例では、この機械翻訳処理では、Ｓ２６１の処理により作成されたクエリ文２についての概念構造に相当する英訳文を生成することによって、クエリ文２の英訳文を作成するようにする。 Next, in S263, the MPU 51 executes machine translation on the query sentence 2 acquired in S261 to create an English translation of the query sentence 2. The MPU 51 that performs this process functions as the machine translation unit 12. In this embodiment, the machine translation creates the English translation of the query sentence 2 by generating an English sentence that corresponds to the conceptual structure of the query sentence 2 created in S261.

次に、Ｓ２６４では、Ｓ２６３の処理により作成された英訳文に、Ｓ２６２の処理によって生成されたタグ情報を付加することによって検索キーを作成する処理をＭＰＵ５１が行い、その後は、この図１４Ｂの処理を終了して図１４Ａに処理を戻す。この処理を行うＭＰＵ５１は検索キー作成部１３として機能する。 Next, in S264, the MPU 51 creates a search key by adding the tag information generated in S262 to the English translation created in S263. It then ends the processing of FIG. 14B and returns to the processing of FIG. 14A. The MPU 51 that performs this process functions as the search key creation unit 13.

次に、図１４ＢのＳ２６２の処理の詳細について、図１４Ｃのフローチャートを用いて説明する。 Next, details of the processing of S262 of FIG. 14B will be described using the flowchart of FIG. 14C.
まず、図１４ＣのＳ２７１において、変数ｋに初期値「０」を代入すると共に、メタ情報を初期化して空のデータ（ヌルデータ）を代入する処理をＭＰＵ５１が行う。 First, in S271 of FIG. 14C, the MPU 51 assigns the initial value “0” to the variable k and initializes the meta information to empty data (null data).

次に、Ｓ２７２では、クエリ文用質問種別判別パターン２２を参照して、変数ｋの値に相当する順番のパターンがクエリ文用質問種別判別パターン２２に存在するか否かを判定する処理をＭＰＵ５１が行う。なお、本実施例において、クエリ文用質問種別判別パターン２２は、所定の記憶領域において、０番目から順番に格納されているものとする。 Next, in S272, the MPU 51 refers to the query sentence question type determination pattern 22 and determines whether a pattern exists at the position corresponding to the value of the variable k. In this embodiment, the query sentence question type determination patterns 22 are assumed to be stored in a predetermined storage area in order, starting from position 0.

Ｓ２７２の判定処理において、ＭＰＵ５１は、変数ｋの値に相当する順番のパターンがクエリ文用質問種別判別パターン２２に存在すると判定したとき（判定結果がＹｅｓのとき）にはＳ２７３に処理を進める。一方、ＭＰＵ５１は、ここで、変数ｋの値に相当する順番のパターンがクエリ文用質問種別判別パターン２２には存在しないと判定したとき（判定結果がＮｏのとき）には、この図１４Ｃの処理を終了して図１４Ｂに処理を戻す。 In the determination of S272, when the MPU 51 determines that a pattern at the position corresponding to the value of the variable k exists in the query sentence question type determination pattern 22 (when the determination result is Yes), it advances the process to S273. When the MPU 51 determines that no pattern exists at that position (when the determination result is No), it ends the processing of FIG. 14C and returns to the processing of FIG. 14B.

次に、Ｓ２７３では、変数ｋの値に相当する順番のクエリ文用質問種別判別パターン２２が、図１４ＢのＳ２６１で作成されたクエリ文２の概念構造の全部若しくは一部と一致しているか否かを、図７を用いて説明したようにして判定する処理をＭＰＵ５１が行う。ＭＰＵ５１は、ここで、一致していると判定したとき（判定結果がＹｅｓのとき）にはＳ２７４に処理を進め、一致していないと判定したとき（判定結果がＮｏのとき）にはＳ２７５に処理を進める。 Next, in S273, whether or not the query sentence question type determination pattern 22 in the order corresponding to the value of the variable k matches all or part of the conceptual structure of the query sentence 2 created in S261 of FIG. 14B. The MPU 51 performs the process of determining whether or not as described with reference to FIG. When the MPU 51 determines that they match (when the determination result is Yes), the MPU 51 proceeds to S274, and when it determines that they do not match (when the determination result is No), the MPU 51 proceeds to S275. Proceed with the process.

次に、Ｓ２７４では、図１４Ｃの処理が開始されてから作成してきたメタ情報に、Ｓ２７３の判定処理によって一致すると判定されたクエリ文用質問種別判別パターン２２に対応付けられているメタ情報を更に連結して追加する処理をＭＰＵ５１が行う。 Next, in S274, the MPU 51 appends the meta information associated with the query sentence question type determination pattern 22 that was found to match in S273 to the meta information accumulated since the processing of FIG. 14C started.

次に、Ｓ２７５では、この処理時点での変数ｋの値に「１」を加算した結果の値を改めて変数ｋに代入する処理をＭＰＵ５１が行い、その後はＳ２７２へ処理を戻して上述した処理が繰り返される。 Next, in S275, the MPU 51 adds “1” to the current value of the variable k and assigns the result back to the variable k. The process then returns to S272 and the steps described above are repeated.

以上までの処理が検索処理の第二の例であり、この処理をＭＰＵ５１が行うことによって、検索装置１による曖昧検索が行われ、原文及びクエリ文２における日本語での単語表現や語順の揺れが吸収されて、これらの揺れに起因する検索精度の低下が抑制される。また、クエリ文２の質問の種別を示しているメタ情報とクエリ文２の翻訳文とを検索キーとして用いて曖昧検索が行われるので、知りたい情報が含まれている原文が、より精度良く検索結果３として得られるようになる。 The processing up to this point is the second example of the search process. When the MPU 51 performs it, the search device 1 carries out a fuzzy search in which variations in Japanese wording and word order in the original sentences and the query sentence 2 are absorbed, so that the loss of search accuracy caused by these variations is suppressed. In addition, because the fuzzy search uses both the meta information indicating the question type of the query sentence 2 and the translation of the query sentence 2 as the search key, original sentences containing the desired information are obtained as the search result 3 with higher accuracy.
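
To illustrate how the question type can take part in the search key (S261 to S264, followed by S252 and S253), the sketch below appends a bracketed tag to the translated query and ranks records by word overlap. The tag format `[WHERE]`, the Jaccard word-overlap score used in place of an unspecified fuzzy-matching algorithm, the field name `tagged_translation`, and the stub `translate` and `classify` callables are all assumptions, not details taken from this document.

```python
from typing import Callable, List, Set


def make_search_key(
    query: str,
    translate: Callable[[str], str],
    classify: Callable[[str], List[str]],
) -> str:
    """Sketch of FIG. 14B: translated query plus question-type tags."""
    question_types = classify(query)          # S261, S262: analyze and tag (stubbed)
    translation = translate(query)            # S263: machine-translate the query
    return " ".join([translation] + [f"[{t}]" for t in question_types])   # S264


def tokens(text: str) -> Set[str]:
    """Lower-cased whitespace tokens; a tag such as [WHERE] is one token."""
    return set(text.lower().split())


def search_with_tags(tagged_db: List[dict], key: str, top_n: int = 1) -> List[str]:
    """Sketch of S252 and S253 using Jaccard overlap as the similarity."""
    key_tokens = tokens(key)

    def score(record: dict) -> float:
        record_tokens = tokens(record["tagged_translation"])
        union = key_tokens | record_tokens
        return len(key_tokens & record_tokens) / len(union) if union else 0.0

    ranked = sorted(tagged_db, key=score, reverse=True)
    return [record["original"] for record in ranked[:top_n]]


# Hypothetical records as the second DB creation example might store them.
tagged_db = [
    {"original": "店は９時に開きます",
     "tagged_translation": "The shop opens at nine [WHEN]"},
    {"original": "店は駅前にあります",
     "tagged_translation": "The shop is in front of the station [WHERE]"},
]
key = make_search_key(
    "店はどこですか",
    translate=lambda q: "Where is the shop",    # stub for machine translation unit 12
    classify=lambda q: ["WHERE"],               # stub for question type determination unit 21
)
print(key)
print(search_with_tags(tagged_db, key))
```

Since the tag is an ordinary token in this sketch, a record whose tag agrees with the question type of the query gains overlap that a record with a different tag does not.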

以上の実施例を含む実施形態に関し、更に以下の付記を開示する。
（付記１）
検索対象である第一言語のテキストと該テキストについての第二言語での翻訳文とが対応付けられて格納されている対訳形式データベースと、
入力された前記第一言語によるクエリ文を前記第二言語に機械翻訳する機械翻訳部と、
前記対訳形式データベースに格納されている前記翻訳文に対して前記第二言語に翻訳されたクエリ文を検索キーとして用いて曖昧検索を行い、該曖昧検索の結果である翻訳文に対応付けられている第一言語のテキストを前記対訳形式データベースから抽出して出力する曖昧検索部と、
を備えることを特徴とする検索装置。
（付記２）
前記対訳形式データベースには、前記テキストに含まれている語句が回答となる質問の種別を表す質問種別情報が、前記テキストについての前記翻訳文に対応付けられて更に格納されており、
前記クエリ文によって表現されている質問の種別を判別する判別部を更に備えており、
前記曖昧検索部は、前記第二言語に翻訳されたクエリ文に加えて前記判別部により判別された質問の種別を更に前記検索キーとして用いて前記曖昧検索を行う、
ことを特徴とする付記１に記載の検索装置。
（付記３）
前記テキストの意味解析を行う意味解析部と、
前記意味解析の結果に基づいて、前記テキストについての翻訳文に含まれる語句についての前記質問種別情報を生成して、前記テキストについての翻訳文に対応付けて前記対訳形式データベースに格納する質問種別情報生成部と、
を更に備え、
前記曖昧検索部は、前記対訳形式データベースに格納されている、前記翻訳文と前記翻訳文に対応付けられている質問種別情報とに対して前記曖昧検索を行う、
ことを特徴とする付記２に記載の検索装置。
（付記４）
前記質問種別情報生成部は、前記質問種別情報と前記質問種別情報によって表されている種別の質問の回答である前記テキストについての翻訳文に含まれる語句とを含むメタ情報を生成して、前記テキストについての翻訳文に付加して前記対訳形式データベースに格納し、
前記曖昧検索部は、前記対訳形式データベースに格納されている、前記メタ情報が付加されている翻訳文に対して前記曖昧検索を行う、
ことを特徴とする付記３に記載の検索装置。
（付記５）
前記意味解析部は、前記テキストを構文解析して作成される構文木の意味を解析することによって、前記テキストを構成している語句を表しているノードと各ノード間を繋ぐことによって語句同士の概念の関係を表しているアークとにより構成される前記テキストの概念構造を生成し、
前記質問種別情報生成部は、前記質問の種別に予め対応付けられている概念の関係を表しているアークを前記テキストの概念構造から抽出し、抽出されたアークに対応付けられている質問の種別を表している情報を、該抽出されたアークによって繋がれているノードが表している語句についての前記質問種別情報として生成する、
ことを特徴とする付記３又は４に記載の検索装置。
（付記６）
前記意味解析部は、更に、前記クエリ文の意味解析を行い、
前記機械翻訳部は、前記クエリ文の意味解析の結果に相当する前記第二言語の文を生成することによって、前記クエリ文の前記第二言語への機械翻訳を行う、
ことを特徴とする付記３から５のうちのいずれか一項に記載の検索装置。
（付記７）
前記判別部は、前記クエリ文を構文解析して作成される構文木の意味を解析することによって、前記クエリ文を構成している各語句を表している各ノードと該各ノード間を繋ぐことによって語句同士の概念の関係を表しているアークとにより構成される前記クエリ文の概念構造を生成し、該クエリ文の概念構造におけるアークが表している概念の関係と、該アークによって繋がれているノードが表している語句の品詞とに基づいて、前記クエリ文によって表現されている質問の種別を判別することを特徴とする付記２から６のうちのいずれか一項に記載の検索装置。
（付記８）
前記判別部は、前記語句の品詞が疑問詞であるものについて、該疑問詞の種別に基づいて前記クエリ文によって表現されている質問の種別を判別することを特徴とする付記７に記載の検索装置。
（付記９）
前記質問の種別は、５Ｗ１Ｈ型の質問における５Ｗ１Ｈの種別であることを特徴とする付記２から８のうちのいずれか一項に記載の検索装置。
（付記１０）
検索対象である第一言語のテキストと該テキストについての第二言語での翻訳文とが対応付けられて格納されている対訳形式データベースに対する検索処理をコンピュータに実行させるプログラムであって、
入力された前記第一言語によるクエリ文を前記第二言語に機械翻訳し、
前記対訳形式データベースに格納されている前記翻訳文に対して前記第二言語に翻訳されたクエリ文を検索キーとして用いて曖昧検索を行い、
前記曖昧検索の結果である翻訳文に対応付けられている第一言語のテキストを前記対訳形式データベースから抽出して出力する、
処理を前記コンピュータに実行させるプログラム。
（付記１１）
前記対訳形式データベースには、前記テキストに含まれている語句が回答となる質問の種別を表す質問種別情報が、前記テキストについての前記翻訳文に対応付けられて更に格納されており、
前記プログラムは、前記クエリ文によって表現されている質問の種別を判別する処理を前記コンピュータに更に実行させ、
前記曖昧検索を行う処理では、前記第二言語に翻訳されたクエリ文に加えて前記判別する処理により判別された質問の種別を更に前記検索キーとして用いて前記曖昧検索を行う、
付記１０に記載のプログラム。
（付記１２）
前記プログラムは、
前記テキストの意味解析を行い、
前記意味解析の結果に基づいて、前記テキストについての翻訳文に含まれる語句についての前記質問種別情報を生成して、前記テキストについての翻訳文に対応付けて前記対訳形式データベースに格納する、
処理を前記コンピュータに更に実行させ、
前記曖昧検索を行う処理では、前記対訳形式データベースに格納されている、前記翻訳文と前記翻訳文に対応付けられている質問種別情報とに対して前記曖昧検索を行う、
付記１１に記載のプログラム。
（付記１３）
前記質問種別情報を生成する処理では、前記質問種別情報と前記質問種別情報によって表されている種別の質問の回答である前記テキストについての翻訳文に含まれる語句とを含むメタ情報を生成して、前記テキストについての翻訳文に付加して前記対訳形式データベースに格納し、
前記曖昧検索を行う処理では、前記対訳形式データベースに格納されている、前記メタ情報が付加されている翻訳文に対して前記曖昧検索を行う、
付記１２に記載のプログラム。
（付記１４）
前記意味解析を行う処理では、前記テキストを構文解析して作成される構文木の意味を解析することによって、前記テキストを構成している語句を表しているノードと各ノード間を繋ぐことによって語句同士の概念の関係を表しているアークとにより構成される前記テキストの概念構造を生成し、
前記質問種別情報を生成する処理では、前記質問の種別に予め対応付けられている概念の関係を表しているアークを前記テキストの概念構造から抽出し、抽出されたアークに対応付けられている質問の種別を表している情報を、該抽出されたアークによって繋がれているノードが表している語句についての前記質問種別情報として生成する、
付記１２又は１３に記載のプログラム。
（付記１５）
前記プログラムは、前記クエリ文の意味解析を行う処理を前記コンピュータに更に実行させ、
前記機械翻訳する処理では、前記クエリ文の意味解析の結果に相当する前記第二言語の文を生成することによって、前記クエリ文の前記第二言語への機械翻訳を行う、
付記１２から１４のうちのいずれか一項に記載のプログラム。
（付記１６）
前記質問の種別を判別する処理では、前記クエリ文を構文解析して作成される構文木の意味を解析することによって、前記クエリ文を構成している各語句を表している各ノードと該各ノード間を繋ぐことによって語句同士の概念の関係を表しているアークとにより構成される前記クエリ文の概念構造を生成し、該クエリ文の概念構造におけるアークが表している概念の関係と、該アークによって繋がれているノードが表している語句の品詞とに基づいて、前記クエリ文によって表現されている質問の種別を判別する付記１１から１５のうちのいずれか一項に記載のプログラム。
（付記１７）
前記質問の種別を判別する処理では、前記語句の品詞が疑問詞であるものについて、該疑問詞の種別に基づいて前記クエリ文によって表現されている質問の種別を判別する付記１６に記載のプログラム。
（付記１８）
前記質問の種別は、５Ｗ１Ｈ型の質問における５Ｗ１Ｈの種別である付記１１から１７のうちのいずれか一項に記載のプログラム。
（付記１９）
検索対象である第一言語のテキストと該テキストについての第二言語での翻訳文とが対応付けられて格納されている対訳形式データベースに対する検索方法であって、
入力された前記第一言語によるクエリ文を前記第二言語に機械翻訳し、
前記対訳形式データベースに格納されている前記翻訳文に対して前記第二言語に翻訳されたクエリ文を検索キーとして用いて曖昧検索を行い、
前記曖昧検索の結果である翻訳文に対応付けられている第一言語のテキストを前記対訳形式データベースから抽出して出力する、
ことを特徴とする検索方法。
（付記２０）
前記対訳形式データベースには、前記テキストに含まれている語句が回答となる質問の種別を表す質問種別情報が、前記テキストについての前記翻訳文に対応付けられて更に格納されており、
前記クエリ文によって表現されている質問の種別を判別し、
前記曖昧検索では、前記第二言語に翻訳されたクエリ文に加えて前記判別により判別された質問の種別を更に前記検索キーとして用いて前記曖昧検索を行う、
ことを特徴とする付記１９に記載の検索方法。
The following appendices are further disclosed with respect to the embodiments including the above examples.
(Appendix 1)
A search device comprising:
a bilingual format database in which text in a first language to be searched and a translation of the text into a second language are stored in association with each other;
a machine translation unit that machine-translates an input query sentence in the first language into the second language; and
a fuzzy search unit that performs a fuzzy search on the translations stored in the bilingual format database using the query sentence translated into the second language as a search key, extracts from the bilingual format database the first-language text associated with the translation obtained as a result of the fuzzy search, and outputs the extracted text.
(Appendix 2)
The search device according to appendix 1, wherein
the bilingual format database further stores question type information, which represents the type of question answered by a phrase included in the text, in association with the translation of the text,
the search device further comprises a determination unit that determines the type of question expressed by the query sentence, and
the fuzzy search unit performs the fuzzy search using, as the search key, the type of question determined by the determination unit in addition to the query sentence translated into the second language.
(Appendix 3)
The search device according to appendix 2, further comprising:
a semantic analysis unit that performs semantic analysis of the text; and
a question type information generation unit that, based on the result of the semantic analysis, generates the question type information for a phrase included in the translation of the text and stores it in the bilingual format database in association with the translation of the text,
wherein the fuzzy search unit performs the fuzzy search on the translation and the question type information associated with the translation, both stored in the bilingual format database.
(Appendix 4)
The search device according to appendix 3, wherein
the question type information generation unit generates meta information that includes the question type information and the phrase, included in the translation of the text, that is the answer to a question of the type represented by the question type information, adds the meta information to the translation of the text, and stores the result in the bilingual format database, and
the fuzzy search unit performs the fuzzy search on the translation, stored in the bilingual format database, to which the meta information has been added.
(Appendix 5)
The search device according to Appendix 3 or 4, wherein
the semantic analysis unit analyzes the meaning of a syntax tree created by parsing the text, thereby generating a conceptual structure of the text composed of nodes, each representing a phrase constituting the text, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and
the question type information generation unit extracts, from the conceptual structure of the text, an arc representing a concept relationship associated in advance with a question type, and generates information indicating the question type associated with the extracted arc as the question type information for the phrase represented by a node connected by the extracted arc.
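The arc extraction described in Appendix 5 can be pictured with a small sketch. Here the conceptual structure is modeled as a list of `(head, relation, dependent)` triples. The relation labels (`AGENT`, `PLACE`, and so on), their mapping to question types, and the choice to attach the type to the dependent node are all assumptions made for illustration.

```python
# Hypothetical table: concept relationship (arc label) -> question type.
ARC_TO_QUESTION_TYPE = {
    "AGENT": "who",
    "OBJECT": "what",
    "TIME": "when",
    "PLACE": "where",
    "REASON": "why",
    "MANNER": "how",
}

def question_types_from_arcs(arcs):
    """Return (question_type, phrase) pairs for arcs found in the table.

    arcs: iterable of (head, relation, dependent) triples.
    Arcs whose relation has no associated question type are skipped.
    """
    result = []
    for _head, relation, dependent in arcs:
        qtype = ARC_TO_QUESTION_TYPE.get(relation)
        if qtype is not None:
            result.append((qtype, dependent))  # type applies to the dependent node
    return result
```

The table lookup corresponds to the "concept relationship associated in advance with a question type"; a real semantic analyzer would supply its own arc inventory.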
(Appendix 6)
The search device according to any one of Appendices 3 to 5, wherein
the semantic analysis unit further performs semantic analysis of the query sentence, and
the machine translation unit machine-translates the query sentence into the second language by generating a sentence in the second language that corresponds to the result of the semantic analysis of the query sentence.
(Appendix 7)
The search device according to any one of Appendices 2 to 6, wherein the determination unit analyzes the meaning of a syntax tree created by parsing the query sentence, thereby generating a conceptual structure of the query sentence composed of nodes, each representing a phrase constituting the query sentence, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and determines the type of the question expressed by the query sentence based on the concept relationship represented by an arc in the conceptual structure of the query sentence and on the part of speech of the phrase represented by a node connected by that arc.
(Appendix 8)
The search device according to Appendix 7, wherein, when the part of speech of the phrase is an interrogative, the determination unit determines the type of the question expressed by the query sentence based on the type of that interrogative.
(Appendix 9)
The search device according to any one of Appendices 2 to 8, wherein the question type is one of the 5W1H types of a 5W1H-style question.
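A minimal sketch of determining a 5W1H type from an interrogative in the query sentence is given below. It assumes the query has already been analyzed into `(phrase, part_of_speech)` pairs and that the tag `"interrogative"` marks question words; both the input shape and the tag name are hypothetical, and the arc-based part of the determination is omitted.

```python
# The six 5W1H types, keyed by the English interrogative that signals them.
INTERROGATIVE_TO_TYPE = {
    "who": "who",
    "what": "what",
    "when": "when",
    "where": "where",
    "why": "why",
    "how": "how",
}

def determine_question_type(nodes):
    """Return the 5W1H type of the first interrogative node, or None.

    nodes: list of (phrase, part_of_speech) pairs for the query sentence.
    """
    for phrase, pos in nodes:
        if pos == "interrogative":
            return INTERROGATIVE_TO_TYPE.get(phrase.lower())
    return None
```

A query with no interrogative returns `None`, in which case a caller could fall back to a search keyed on the translated query alone.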
(Appendix 10)
A program for causing a computer to execute search processing on a bilingual format database in which text in a first language to be searched and a translated sentence in a second language for the text are stored in association with each other, the program causing the computer to execute processing comprising:
machine-translating an input query sentence in the first language into the second language;
performing a fuzzy search on the translated sentences stored in the bilingual format database, using the query sentence translated into the second language as a search key; and
extracting, from the bilingual format database, the text in the first language associated with a translated sentence obtained as a result of the fuzzy search, and outputting the extracted text.
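The three steps of Appendix 10 can be sketched as follows. This is a rough illustration under stated assumptions: the `translate` argument is a caller-supplied stand-in for the machine translation step, the bilingual format database is a plain list of pairs, and similarity is scored with Python's `difflib.SequenceMatcher`, which is a substitute chosen for this sketch and not the fuzzy search method of the document.

```python
from difflib import SequenceMatcher

def search(query, translate, database, top_n=3):
    """Return first-language texts whose stored translations best match the query.

    translate: callable standing in for machine translation (assumption).
    database:  list of (original_text, translated_sentence) pairs.
    """
    key = translate(query).lower()  # step 1: translate the query
    scored = [
        (SequenceMatcher(None, key, translated.lower()).ratio(), original)
        for original, translated in database  # step 2: fuzzy-match translations
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [original for _, original in scored[:top_n]]  # step 3: output originals
```

Because matching happens on the second-language side, two first-language texts that differ on the surface but translate similarly can both be retrieved.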
(Appendix 11)
The program according to Appendix 10, wherein
the bilingual format database further stores, in association with the translated sentence for the text, question type information indicating the type of question to which a phrase included in the text is an answer,
the program further causes the computer to execute processing of determining the type of the question expressed by the query sentence, and
in the processing of performing the fuzzy search, the fuzzy search is performed using, as the search key, the question type determined by the determining processing in addition to the query sentence translated into the second language.
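One way to picture the question type acting as part of the search key is sketched below. It assumes, purely for illustration, that stored translations carry bracketed tags such as `[where=park]` and that the key is formed by appending `[where]` to the translated query; neither notation comes from the document, and `difflib.SequenceMatcher` again stands in for the fuzzy search.

```python
from difflib import SequenceMatcher

def make_search_key(translated_query, question_type=None):
    """Append the question type as a bracketed tag (hypothetical key format)."""
    if question_type is None:
        return translated_query
    return f"{translated_query} [{question_type}]"

def best_match(search_key, tagged_translations):
    """Return the tagged translation most similar to the search key."""
    return max(
        tagged_translations,
        key=lambda t: SequenceMatcher(None, search_key, t).ratio(),
    )
```

With this layout, two translations that share the same wording but carry different type tags receive different similarity scores, so the one whose tag agrees with the query's question type is preferred.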
(Appendix 12)
The program according to Appendix 11, wherein
the program further causes the computer to execute processing of
performing semantic analysis of the text, and
generating, based on the result of the semantic analysis, the question type information for a phrase included in the translated sentence for the text, and storing the question type information in the bilingual format database in association with the translated sentence, and
in the processing of performing the fuzzy search, the fuzzy search is performed on the translated sentence stored in the bilingual format database and on the question type information associated with that translated sentence.
(Appendix 13)
The program according to Appendix 12, wherein
in the processing of generating the question type information, meta information that includes the question type information and the phrase, included in the translated sentence for the text, that is an answer to a question of the type represented by the question type information is generated, appended to the translated sentence for the text, and stored in the bilingual format database, and
in the processing of performing the fuzzy search, the fuzzy search is performed on the translated sentence that is stored in the bilingual format database and to which the meta information has been appended.
(Appendix 14)
The program according to Appendix 12 or 13, wherein
in the processing of performing the semantic analysis, the meaning of a syntax tree created by parsing the text is analyzed, thereby generating a conceptual structure of the text composed of nodes, each representing a phrase constituting the text, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and
in the processing of generating the question type information, an arc representing a concept relationship associated in advance with a question type is extracted from the conceptual structure of the text, and information indicating the question type associated with the extracted arc is generated as the question type information for the phrase represented by a node connected by the extracted arc.
(Appendix 15)
The program according to any one of Appendices 12 to 14, wherein
the program further causes the computer to execute processing of performing semantic analysis of the query sentence, and
in the processing of performing the machine translation, the query sentence is machine-translated into the second language by generating a sentence in the second language that corresponds to the result of the semantic analysis of the query sentence.
(Appendix 16)
The program according to any one of Appendices 11 to 15, wherein, in the processing of determining the type of the question, the meaning of a syntax tree created by parsing the query sentence is analyzed, thereby generating a conceptual structure of the query sentence composed of nodes, each representing a phrase constituting the query sentence, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and the type of the question expressed by the query sentence is determined based on the concept relationship represented by an arc in the conceptual structure of the query sentence and on the part of speech of the phrase represented by a node connected by that arc.
(Appendix 17)
The program according to Appendix 16, wherein, in the processing of determining the type of the question, when the part of speech of the phrase is an interrogative, the type of the question expressed by the query sentence is determined based on the type of that interrogative.
(Appendix 18)
The program according to any one of Appendices 11 to 17, wherein the question type is one of the 5W1H types of a 5W1H-style question.
(Appendix 19)
A search method for a bilingual format database in which text in a first language to be searched and a translated sentence in a second language for the text are stored in association with each other, the method comprising:
machine-translating an input query sentence in the first language into the second language;
performing a fuzzy search on the translated sentences stored in the bilingual format database, using the query sentence translated into the second language as a search key; and
extracting, from the bilingual format database, the text in the first language associated with a translated sentence obtained as a result of the fuzzy search, and outputting the extracted text.
(Appendix 20)
The search method according to Appendix 19, wherein
the bilingual format database further stores, in association with the translated sentence for the text, question type information indicating the type of question to which a phrase included in the text is an answer,
the method further comprises determining the type of the question expressed by the query sentence, and
in the fuzzy search, the fuzzy search is performed using, as the search key, the question type determined by the determining in addition to the query sentence translated into the second language.

DESCRIPTION OF SYMBOLS
1 Search device
2 Query sentence
3 Search result
10 Original text DB
11 Bilingual format DB
12 Machine translation unit
13 Search key creation unit
14 Fuzzy search unit
21 Question type determination unit for query sentences
22 Question type determination patterns for query sentences
31 Semantic analysis unit
32 Meta information generation unit
33 Question type determination unit for original text
34 Question type determination patterns for original text
35 Storage processing unit
40 Translation dictionary
50 Computer
51 MPU
52 ROM
53 RAM
54 Hard disk device
55 Input device
56 Output device
57 Interface device
58 Recording medium drive device
59 Bus line
60 Portable recording medium

Claims

A search device comprising:
a bilingual format database in which text in a first language to be searched, a translated sentence in a second language for the text, and question type information indicating the type of question to which a phrase included in the text is an answer are stored in association with one another;
a machine translation unit that machine-translates an input query sentence in the first language into the second language;
a determination unit that determines the type of the question expressed by the query sentence; and
a fuzzy search unit that performs a fuzzy search on the translated sentences stored in the bilingual format database, using as a search key the query sentence translated into the second language and the question type determined by the determination unit, and that extracts from the bilingual format database, and outputs, the text in the first language associated with a translated sentence obtained as a search result.

The search device according to claim 1, further comprising:
a semantic analysis unit that performs semantic analysis of the text; and
a question type information generation unit that, based on the result of the semantic analysis, generates the question type information for a phrase included in the translated sentence for the text and stores the question type information in the bilingual format database in association with the translated sentence,
wherein the fuzzy search unit performs the fuzzy search on the translated sentence stored in the bilingual format database and on the question type information associated with that translated sentence.

The search device according to claim 2, wherein
the question type information generation unit generates meta information that includes the question type information and the phrase, included in the translated sentence for the text, that is an answer to a question of the type represented by the question type information, appends the meta information to the translated sentence for the text, and stores the result in the bilingual format database, and
the fuzzy search unit performs the fuzzy search on the translated sentence that is stored in the bilingual format database and to which the meta information has been appended.

The search device according to claim 2 or 3, wherein
the semantic analysis unit analyzes the meaning of a syntax tree created by parsing the text, thereby generating a conceptual structure of the text composed of nodes, each representing a phrase constituting the text, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and
the question type information generation unit extracts, from the conceptual structure of the text, an arc representing a concept relationship associated in advance with a question type, and generates information indicating the question type associated with the extracted arc as the question type information for the phrase represented by a node connected by the extracted arc.

The search device according to any one of claims 2 to 4, wherein
the semantic analysis unit further performs semantic analysis of the query sentence, and
the machine translation unit machine-translates the query sentence into the second language by generating a sentence in the second language that corresponds to the result of the semantic analysis of the query sentence.

The search device according to any one of claims 1 to 5, wherein the determination unit analyzes the meaning of a syntax tree created by parsing the query sentence, thereby generating a conceptual structure of the query sentence composed of nodes, each representing a phrase constituting the query sentence, and arcs, each connecting nodes and representing the relationship between the concepts of the connected phrases, and determines the type of the question expressed by the query sentence based on the concept relationship represented by an arc in the conceptual structure of the query sentence and on the part of speech of the phrase represented by a node connected by that arc.

The search device according to claim 6, wherein, when the part of speech of the phrase is an interrogative, the determination unit determines the type of the question expressed by the query sentence based on the type of that interrogative.

The search device according to any one of claims 1 to 7, wherein the information indicating the type of question is one of the 5W1H types of a 5W1H-style question.

A program for causing a computer to execute search processing on a bilingual format database in which text in a first language to be searched, a translated sentence in a second language for the text, and question type information indicating the type of question to which a phrase included in the text is an answer are stored in association with one another, the program causing the computer to execute processing comprising:
machine-translating an input query sentence in the first language into the second language;
determining the type of the question expressed by the query sentence;
performing a fuzzy search on the translated sentences stored in the bilingual format database, using as a search key the query sentence translated into the second language and the determined question type; and
extracting, from the bilingual format database, the text in the first language associated with a translated sentence obtained as a result of the fuzzy search, and outputting the extracted text.

A search method performed by a computer on a bilingual format database in which text in a first language to be searched, a translated sentence in a second language for the text, and question type information indicating the type of question to which a phrase included in the text is an answer are stored in association with one another, the method comprising, by the computer:
machine-translating an input query sentence in the first language into the second language;
determining the type of the question expressed by the query sentence;
performing a fuzzy search on the translated sentences stored in the bilingual format database, using as a search key the query sentence translated into the second language and the determined question type; and
extracting, from the bilingual format database, the text in the first language associated with a translated sentence obtained as a result of the fuzzy search, and outputting the extracted text.