JP5578137B2

JP5578137B2 - Search program, apparatus and method

Info

Publication number: JP5578137B2
Application number: JP2011117371A
Authority: JP
Inventors: 文人西野; 照宣粂
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-05-25
Filing date: 2011-05-25
Publication date: 2014-08-27
Anticipated expiration: 2031-05-25
Also published as: JP2012247869A

Description

The present technology relates to search technology.

When a plurality of databases, each storing a different type of data, are search targets and a user inputs a search query, the databases are searched based on that query and data matching the query is detected. One known conventional technique for searching a plurality of databases displays a search screen dedicated to a database when the user selects that database from among the plurality of databases. The selected database is then searched with the keywords entered on that search screen.

With this conventional technique, however, the user must enter a search query on a dedicated search screen for each of the databases. Moreover, to improve search accuracy, the user must personally understand the characteristics of the data in each database and work out which kinds of keywords suit which database. The technique is therefore inconvenient for the user.

Another known conventional technique has the user enter a single search query that is common to all the databases. However, when each database stores a different type of data, the common query contains keywords that are inappropriate for some of the data types, and search accuracy drops. As a result, the user may be unable to obtain the desired data.

In recent years in particular, data relating to a single entity is increasingly distributed across a plurality of databases. For example, when the entity is a product, a book, or a store, the basic data consists of elements such as the name and functions of the product, the bibliographic items of the book, or the type and location of the store. Supplementary data is added to this basic data, such as data containing evaluation elements ("cheap", "interesting", "delicious") and data containing ranking elements (popularity, price). These pieces of data may be distributed across a plurality of databases. On Internet blogs and the like in particular, data containing evaluation elements is often added to the basic data. Data sharing is also advancing within companies, so several people may add data to a single entity. Cases in which various types of data are added to a single entity and stored across a plurality of databases are thus increasing, which makes it harder to obtain the desired data at search time.

Japanese Unexamined Patent Application Publication No. 2007-133505 (JP 2007-133505 A)

Accordingly, an object of the present technology is, in one aspect, to provide a technique for improving user convenience when a plurality of databases are searched.

This search method includes: (A) extracting one or more query elements contained in a first search query input by a user; (B) calculating, for each query element, a degree of association with each of a plurality of databases that store different types of data; (C) determining, based on the degrees of association, whether each query element is associated with each of the databases; (D) for each of the databases, generating a second search query containing the query elements determined to be associated with that database, if any such query element exists; and (E) searching, based on each second search query, the database among the plurality of databases that corresponds to that second search query.

User convenience can be improved when a plurality of databases are searched.

FIG. 1A is a diagram for describing Example 1 of the databases to be searched in the embodiment of the present technology. FIG. 1B is a diagram for describing Example 2 of the databases to be searched in the embodiment. FIG. 1C is a diagram for describing Example 3 of the databases to be searched in the embodiment.
FIG. 2 is a functional block diagram of the search device.
FIG. 3 is a diagram illustrating the processing flow of the search device.
FIG. 4 is a diagram illustrating the processing flow of the extraction unit.
FIG. 5 is a diagram illustrating a specific example of processing by the extraction unit.
FIG. 6 is a diagram illustrating Example 1 of the processing flow of the calculation unit and the determination unit.
FIG. 7 is a diagram illustrating Example 2 of the processing flow of the calculation unit and the determination unit.
FIGS. 8A and 8B are diagrams illustrating a specific example of processing by the calculation unit and the determination unit.
FIG. 9 is a diagram illustrating Example 1 of the processing flow for calculating the degree of association.
FIG. 10 is a diagram illustrating Example 2 of the processing flow for calculating the degree of association.
FIG. 11 is a diagram illustrating an example of a category table.
FIG. 12 is a diagram illustrating the processing flow of the query generation unit.
FIGS. 13A and 13B are diagrams illustrating a specific example of processing by the query generation unit.
FIG. 14A is a diagram illustrating the processing flow of the search units.
FIG. 14B is a diagram illustrating the processing flow for selecting a search method according to a category.
FIG. 15 is a diagram illustrating Example 1 of the processing flow of the set generation units.
FIG. 16 is a diagram illustrating a specific example of Example 1 of the processing flow of the set generation units.
FIG. 17 is a diagram illustrating Example 2 of the processing flow of the set generation units.
FIG. 18 is a diagram illustrating a specific example of Example 2 of the processing flow of the set generation units.
FIG. 19 is a diagram illustrating an example of data representing the positional relationship between regions.
FIG. 20 is a diagram illustrating Example 1 of the processing flow of the subset generation unit.
FIGS. 21A to 21C are diagrams illustrating a specific example of Example 1 of the processing flow of the subset generation unit.
FIG. 22 is a diagram illustrating Example 2 of the processing flow of the subset generation unit.
FIG. 23 is a diagram illustrating Example 2 of the processing flow of the subset generation unit.
FIGS. 24A to 25C are diagrams illustrating a specific example of Example 2 of the processing flow of the subset generation unit.
FIG. 25 is a functional block diagram illustrating a search device according to another embodiment.
FIG. 26 is a functional block diagram illustrating a search device according to another embodiment.
FIG. 27 is a functional block diagram of a computer.

First, an outline of the embodiment of the present technology is given with reference to FIGS. 1 to 3. In this embodiment there are a plurality of databases to be searched, each storing a different type of data. In the stored data, a plurality of data blocks are associated with a single entity, and these data blocks are distributed across the databases. FIGS. 1A to 1C show examples of the databases to be searched in this embodiment.

FIG. 1A illustrates Example 1 of the databases to be searched. In Example 1, an entity is associated with a plurality of data blocks by an identifier. The identifier is, for example, the title of a book, the name of a store, or the document number of a document. Assigning this identifier to each data block associates the data block with the entity. In FIG. 1A, data block d1 carrying identifier e1 is associated with entity E1, data block d2 carrying identifier e2 is associated with entity E2, and data block d3 carrying identifier e3 is associated with entity E3. The data blocks d1 to d3 are distributed across a plurality of databases according to their type. For example, if the entity is a book, a data block describing bibliographic items and a data block describing evaluations are stored in different databases. Note that the identifier here is not limited to a strictly unique name, such as a serial number that identifies an individual item; it also includes names that identify a kind or class, such as a product name.
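The identifier-based association of Example 1 can be pictured with a minimal Python sketch. The database contents, the database names `bibliographic` and `evaluation`, and the helper `group_by_entity` are illustrative assumptions and do not appear in the patent.

```python
from collections import defaultdict

# Hypothetical databases: each holds (identifier, text) data blocks of one type.
bibliographic_db = [("e1", "Title A, Author X, 2010"), ("e2", "Title B, Author Y, 2011")]
evaluation_db = [("e1", "interesting and easy to read"), ("e3", "a little dry")]


def group_by_entity(databases):
    """Collect, per identifier, the data blocks found in every database."""
    blocks = defaultdict(list)
    for db_name, db in databases.items():
        for identifier, text in db:
            blocks[identifier].append((db_name, text))
    return dict(blocks)


entities = group_by_entity(
    {"bibliographic": bibliographic_db, "evaluation": evaluation_db}
)
```

Here the shared identifier `e1` is the only thing that ties the bibliographic block and the evaluation block to the same entity, which is the situation FIG. 1A describes.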

FIG. 1B illustrates Example 2 of the databases to be searched. In Example 2, an entity is associated with a plurality of data blocks by reference. For example, there is a main data block for the entity and there are supplementary data blocks, and each supplementary data block holds a link to the main data block. The main and supplementary data blocks are distributed across a plurality of databases. A typical example is Web annotation, a technique for attaching additional information to a Web page. For example, when a sticky note (memo) is attached electronically to a Web page, the Web page is the main data block and the memo, which holds a link to the Web page, is the supplementary data block. In FIG. 1B, data block d42 refers to data block d41 for entity E1, data block d52 refers to data block d51 for entity E2, and data block d62 refers to data block d61 for entity E3. The main data blocks d41, d51, and d61 and the supplementary data blocks d42, d52, and d62 are stored in different databases.
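The reference-based association of Example 2 might be modeled as follows. This is only a sketch: the `DataBlock` class, the `refers_to` field, and the `ENTITY_OF_MAIN_BLOCK` table are hypothetical names introduced for illustration, while the block labels d41 to d52 follow FIG. 1B.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataBlock:
    block_id: str
    text: str
    refers_to: Optional[str] = None  # id of the main block; None for a main block


# Main blocks and supplementary blocks live in different (hypothetical) databases.
main_db = {
    "d41": DataBlock("d41", "Web page describing entity E1"),
    "d51": DataBlock("d51", "Web page describing entity E2"),
}
annotation_db = {
    "d42": DataBlock("d42", "memo attached to the E1 page", refers_to="d41"),
    "d52": DataBlock("d52", "memo attached to the E2 page", refers_to="d51"),
}

# Assumed lookup from a main block to the entity it describes.
ENTITY_OF_MAIN_BLOCK = {"d41": "E1", "d51": "E2"}


def resolve_entity(block):
    """Follow the link of a supplementary block (if any) to find its entity."""
    main_id = block.refers_to if block.refers_to is not None else block.block_id
    return ENTITY_OF_MAIN_BLOCK[main_id]
```

A hit on the memo d42 and a hit on the page d41 both resolve to entity E1, so results from the two databases can later be merged per entity.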

FIG. 1C illustrates Example 3 of the databases to be searched. In Example 3, an entity is associated with a plurality of data blocks by dividing the data. The data for each entity is physically a single data block, but it contains several types of information. This data block is virtually separated by type of information, and each separated part is regarded as a virtual data block. The virtual data blocks are in turn regarded as being distributed across virtual databases. A typical example is a site that introduces entities. The data block of an entity contains basic information and evaluation information. The basic information is, for example, the bibliographic items of a book, or the type, name, and map of a store. The evaluation information is, for example, a written evaluation of the entity or a score. The basic information and the evaluation information are virtually separated, and each separated part is regarded as a virtual data block. A basic-information database and an evaluation-information database are regarded as existing virtually, and the virtual data blocks are regarded as being distributed across these virtual databases. In FIG. 1C, the data block for entity E1 contains virtual data blocks d71, d72, and d73, the data block for entity E2 contains virtual data blocks d81 and d82, and the data block for entity E3 contains virtual data blocks d91 and d92. The data blocks d71, d81, and d91 and the data blocks d72, d73, d82, and d92 are regarded as being distributed across a plurality of virtual databases according to their type.
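The virtual separation of Example 3 can be sketched as a projection of each physical record onto groups of fields. The field names, the record contents, and the two virtual databases `basic` and `evaluation` below are assumptions chosen for illustration; the patent does not prescribe a record layout.

```python
# Hypothetical physical records: one data block per entity, mixing information types.
records = [
    {"entity": "E1", "name": "Sushi A", "location": "Nihonbashi",
     "review": "delicious", "score": 4},
    {"entity": "E2", "name": "Books B", "location": "Ginza"},
]

# Assumed mapping from each virtual database to the fields it covers.
FIELDS_BY_VIRTUAL_DB = {
    "basic": ("name", "location"),
    "evaluation": ("review", "score"),
}


def split_into_virtual_databases(records, fields_by_db):
    """Split each record into virtual data blocks, one per virtual database."""
    virtual = {name: [] for name in fields_by_db}
    for record in records:
        for name, fields in fields_by_db.items():
            part = {f: record[f] for f in fields if f in record}
            if part:  # skip virtual databases for which the record has no data
                virtual[name].append((record["entity"], part))
    return virtual


virtual = split_into_virtual_databases(records, FIELDS_BY_VIRTUAL_DB)
```

After the split, each virtual database can be searched as if it were a separate store, while the `entity` key keeps the parts linked.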

As Examples 1 to 3 show, when a plurality of databases storing different types of data are searched, requiring the user to enter a dedicated search query for each database reduces convenience. On the other hand, searching with a query common to all the databases reduces search accuracy, because the query contains elements that do not match the data type of a given database. In addition, the search results are output per database, which is cumbersome for the user. Furthermore, even when the user is looking for entities that match the search query, the results are returned as a list of data blocks that match the query, which does not necessarily meet the user's information needs. The technique of this embodiment can solve these problems.

FIG. 2 is a functional block diagram of the search device 100 of this embodiment, and FIG. 3 illustrates the processing flow of the search device 100. The search device 100 can search a plurality of databases D1, D2, and D3, each of which stores a different type of data. In this embodiment, a plurality of data blocks are associated with a single entity, and these data blocks are distributed across the databases D1, D2, and D3. Examples of the databases are as shown in FIGS. 1A, 1B, and 1C. Although this embodiment uses three databases, any number of two or more may be used; the number is not limited to three. The search device 100 includes an extraction unit 1, a calculation unit 2, a determination unit 3, a search query generation unit 4, search units 51, 52, and 53, set generation units 61, 62, and 63, a subset generation unit 7, a query list storage unit 10, a query storage unit 11, a set storage unit 12, and a subset storage unit 13.

Each unit is described below. Because each step in FIG. 3 is executed by a corresponding unit, the steps are explained by associating them with the units. The extraction unit 1 extracts one or more query elements qi contained in the first search query Q1 input by the user (step S1). The calculation unit 2 calculates, for each query element qi extracted by the extraction unit 1, the degree of association f(qi, Dj) with each database Dj (D1, D2, D3) (step S3). The determination unit 3 determines, based on the degree of association f(qi, Dj) calculated by the calculation unit 2, whether each query element qi is associated with each database Dj (D1, D2, D3) (step S5). For each database Dj (D1, D2, D3), if there are query elements qi determined to be associated with that database, the search query generation unit 4 generates a second search query Q2dj containing those query elements (step S7). The search units 51, 52, and 53 search, based on the second search query Q2dj, the database Dj among D1, D2, and D3 that corresponds to that second search query (step S9). The search unit 51 is in charge of searching database D1, the search unit 52 of database D2, and the search unit 53 of database D3. When the processing by the search units 51, 52, and 53 detects data blocks in each of two or more of the databases D1, D2, and D3, the set generation units 61, 62, and 63 identify the entities associated with those data blocks and generate sets of entities (step S11). The set generation unit 61 processes the search results of the search unit 51, the set generation unit 62 those of the search unit 52, and the set generation unit 63 those of the search unit 53. The subset generation unit 7 generates, based on a predetermined rule, a subset of entities from the sets of entities identified by the set generation units 61, 62, and 63 (step S13).

The search device 100 is used as follows. The user inputs the first search query Q1. The user can enter Q1 as a single search query common to the databases D1, D2, and D3 without being aware of the types of data they store. The user does not need to enter a search query for each of the databases D1, D2, and D3, nor consider which kinds of keywords suit which database. Whatever query elements the first search query Q1 contains, association is judged from the degree of association described later, so the query elements that may be used are not restricted and the first search query can be formulated freely.

When the first search query Q1 is input, the extraction unit 1 extracts one or more query elements qi from it. The calculation unit 2 calculates, for each query element qi, the degree of association f(qi, Dj) with the databases D1, D2, and D3. The determination unit 3 determines, based on f(qi, Dj), whether each query element qi is associated with each of the databases D1, D2, and D3. Whatever query elements the first search query Q1 contains, the degree of association f(qi, Dj) is calculated each time and association is determined. The search query generation unit 4 then generates a second search query Q2dj for each of the databases Dj (D1, D2, D3). The second search query Q2dj contains the query elements determined to be associated with database Dj. The search units 51, 52, and 53 search, based on the second search query Q2dj, the database Dj corresponding to that query. Because each of the search units 51, 52, and 53 searches its own database Dj (D1, D2, D3) with a second search query Q2dj suited to that database, search accuracy improves. When the search units 51, 52, and 53 detect data blocks in two or more databases, the set generation units 61, 62, and 63 identify the entities associated with those data blocks and generate sets of entities. The subset generation unit 7 then generates a subset from the sets of entities based on a predetermined rule. A list of the entities in this subset is output to the user. The list may be a list of entity identifiers or a list of links to data about the entities. The subset of entities integrates the search results of the databases D1, D2, and D3 according to the predetermined rule. According to this embodiment, the user obtains an integrated search result rather than a separate result for each database. Because the search result is given as a set of entities, it also matches the user's information needs.
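The overall flow from step S3 to step S13 can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the relevance table stands in for f(qi, Dj), substring matching stands in for the search units, and set intersection stands in for the "predetermined rule" of the subset generation unit, which this part of the description does not specify. Step S1 (extraction) is taken as already done.

```python
def run_search(query_elements, databases, relevance, threshold):
    """Sketch of steps S3 to S13 for already-extracted query elements."""
    # S3-S7: build one second search query per database from related elements.
    second_queries = {}
    for name in databases:
        related = [q for q in query_elements
                   if relevance.get((q, name), 0.0) >= threshold]
        if related:
            second_queries[name] = related

    # S9-S11: search each database with its own query and collect entity sets.
    entity_sets = {}
    for name, terms in second_queries.items():
        entity_sets[name] = {
            entity for entity, text in databases[name]
            if all(term in text for term in terms)
        }

    # S13: assumed rule, keep entities found in every searched database.
    if not entity_sets:
        return set()
    return set.intersection(*entity_sets.values())


# Hypothetical data: D1 holds basic information, D2 holds evaluations.
databases = {
    "D1": [("E1", "sushi restaurant in Nihonbashi"),
           ("E2", "sushi restaurant in Ginza")],
    "D2": [("E1", "delicious and not expensive"),
           ("E2", "delicious but pricey")],
}
relevance = {
    ("Nihonbashi", "D1"): 0.9, ("sushi", "D1"): 0.8,
    ("delicious", "D2"): 0.9, ("not expensive", "D2"): 0.7,
}
result = run_search(
    ["Nihonbashi", "delicious", "not expensive", "sushi"],
    databases, relevance, threshold=0.5,
)
```

With this data, D1 is searched only with "Nihonbashi" and "sushi" and D2 only with "delicious" and "not expensive", so neither database sees elements that do not fit its data type, and the combined result is a set of entities rather than a list of data blocks.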

An example of each of the above units and steps is described in detail below.

FIG. 4 illustrates an example of the processing flow of the extraction unit 1 and corresponds to step S1 of FIG. 3. In this embodiment, the user inputs the first search query Q1 as natural-language text data. The extraction unit 1 reads the first search query Q1 (step S101), performs morphological analysis (step S103), and removes stop words (step S105). A stop word is a word excluded from the words used for searching, such as a function word. Next, the extraction unit 1 groups morphemes from a semantic viewpoint by chunking (step S107) and normalizes word forms for searching (step S109). Normalization may, for example, convert an inflected form to its dictionary form or stem, or convert a word to a standard form to absorb spelling variations. Although the first search query Q1 is natural-language text in this embodiment, it is not limited to this. For example, the first search query Q1 may be in audio form, or may be a sequence of words or phrases. The first search query Q1 may be input directly from an input unit (not shown) of the search device 100, or may be input from a separately provided user terminal and sent to the search device 100 via a network such as the Internet.

FIG. 5 illustrates a specific example of processing by the extraction unit 1. When 「日本橋のおいしくてそれほど高くないお寿司屋さん」 ("a delicious and not so expensive sushi restaurant in Nihonbashi") is input as the first search query Q1, the extraction unit 1 decomposes Q1 into a sequence of morphemes by morphological analysis. As a result, Q1 becomes "日本橋" (Nihonbashi), "の", "おいしく" (deliciously), "て", "それほど" (so much), "高く" (expensive), "ない" (not), "お", "寿司" (sushi), "屋", "さん". Next, the stop words "の", "て", "それほど", "お", "屋", and "さん" are removed. Chunking then combines "高く" and "ない" into "高くない" (not expensive), and normalization turns "おいしく" into "おいしい" (delicious). The query elements "日本橋" (Nihonbashi), "おいしい" (delicious), "高くない" (not expensive), and "寿司" (sushi) are thus extracted from the first search query Q1.
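Steps S105 to S109 of the FIG. 5 example can be traced with a small Python sketch. It assumes the morphological analysis of step S103 has already produced the morpheme list (a real system would call a morphological analyzer here), and the three lookup tables are hypothetical stand-ins, filled in only far enough to cover this one query.

```python
# Hypothetical resources covering only the FIG. 5 example.
STOP_WORDS = {"の", "て", "それほど", "お", "屋", "さん"}
CHUNK_RULES = {("高く", "ない"): "高くない"}   # adjacent morphemes to merge
BASE_FORMS = {"おいしく": "おいしい"}          # inflected form -> dictionary form


def extract_query_elements(morphemes):
    """Apply stop-word removal (S105), chunking (S107), and normalization (S109)."""
    kept = [m for m in morphemes if m not in STOP_WORDS]

    chunked = []
    i = 0
    while i < len(kept):
        pair = tuple(kept[i:i + 2])
        if pair in CHUNK_RULES:
            chunked.append(CHUNK_RULES[pair])
            i += 2
        else:
            chunked.append(kept[i])
            i += 1

    return [BASE_FORMS.get(m, m) for m in chunked]


# Output of step S103 for the query in FIG. 5, taken as given.
morphemes = ["日本橋", "の", "おいしく", "て", "それほど",
             "高く", "ない", "お", "寿司", "屋", "さん"]
elements = extract_query_elements(morphemes)
# elements == ["日本橋", "おいしい", "高くない", "寿司"]
```

Note that stop-word removal runs before chunking here, as in FIG. 4, which is why "高く" and "ない" become adjacent once "それほど" has been dropped.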

FIG. 6 shows Example 1 of the processing flow of the calculation unit 2 and the determination unit 3, and corresponds to steps S3 and S5 in FIG. 3. In Example 1, the determination unit 3 makes its determination using a threshold. The calculation unit 2 acquires one query element qi from the extracted query elements (step S2101), calculates the relevance f(qi, Dj) between the query element qi and the database Dj, and stores it in a storage device such as a main memory (step S2103). Here, the database Dj is one of the plurality of databases D1, D2, and D3. A specific method for calculating the relevance is described later. The determination unit 3 performs a threshold determination on the relevance f(qi, Dj) (step S2105). If the relevance f(qi, Dj) is within the threshold range, the determination unit 3 determines that the query element qi is related to the database Dj and adds the query element qi to the query list L1dj of the database Dj (step S2107). If the relevance f(qi, Dj) is outside the threshold range in step S2105, the determination unit 3 determines that the query element qi and the database Dj are not related, and the process proceeds to step S2109. In step S2109, the calculation unit 2 determines whether the relevance of the query element qi has been calculated for each of the databases D1, D2, and D3. If “No”, the calculation unit 2 sets the next database as the processing target (step S2111) and returns to step S2103. By repeating this process, the relevance of the query element qi to each of the databases D1, D2, and D3 is calculated, and the presence or absence of a relation is determined. If the calculation unit 2 determines “Yes” in step S2109, it determines whether the relevance has been calculated for all query elements (step S2113). If “No”, the calculation unit 2 sets the next query element as the processing target (step S2115) and returns to step S2101. By repeating this process, the relevance to the databases D1, D2, and D3 is calculated for each of the query elements, and the presence or absence of a relation is determined. The query list L1dj is stored in the query list storage unit 10 in association with the database Dj. The query list L1dj stores the sequence of query elements whose relevance to the database Dj is within the threshold range, that is, the query elements determined to be related to the database Dj. Note that it is preferable to set an upper limit as well as a lower limit for the threshold used in the determination. This is because a query element whose appearance frequency is too high is likely to be an element used regardless of the type of data, and it would be inappropriate to judge such an element to be strongly related to a specific database.
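The threshold-based flow of FIG. 6 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the function name `build_query_lists_by_threshold`, the `relevance` callback, and the treatment of the threshold range as a closed interval are all hypothetical. The four listed scores mirror the values used in the FIG. 8 walk-through; every unlisted pair is assumed to be 0.0 purely for illustration.

```python
from typing import Callable, Dict, Iterable, List


def build_query_lists_by_threshold(
    query_elements: Iterable[str],
    databases: Iterable[str],
    relevance: Callable[[str, str], float],
    lower: float,
    upper: float,
) -> Dict[str, List[str]]:
    """Map each database to the query elements whose relevance is in [lower, upper]."""
    db_names = list(databases)
    query_lists: Dict[str, List[str]] = {}
    for q in query_elements:                 # S2101 / S2113 / S2115: visit every query element
        for d in db_names:                   # S2109 / S2111: visit every database
            f = relevance(q, d)              # S2103: compute f(qi, Dj)
            if lower <= f <= upper:          # S2105: threshold check with lower and upper bound
                query_lists.setdefault(d, []).append(q)  # S2107: add qi to L1dj
    return query_lists


# Illustrative scores. Pairs not listed are treated as 0.0 (an assumption).
SCORES = {
    ("Nihonbashi", "D1"): 0.10,
    ("sushi", "D1"): 0.20,
    ("delicious", "D2"): 0.30,
    ("not expensive", "D2"): 0.20,
}

query_lists = build_query_lists_by_threshold(
    ["Nihonbashi", "delicious", "not expensive", "sushi"],
    ["D1", "D2", "D3"],
    lambda q, d: SCORES.get((q, d), 0.0),
    lower=0.10,
    upper=0.50,
)
```

Because a database only receives an entry when at least one element passes the check, a database with no related element (D3 here) simply has no query list, which matches the behavior described for FIG. 8.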

FIG. 7 shows Example 2 of the processing flow of the calculation unit 2 and the determination unit 3, and corresponds to steps S3 and S5 in FIG. 3. In Example 2, the determination unit 3 makes its determination based on the highest relevance value. The calculation unit 2 acquires one query element qi from the extracted query elements (step S2201), calculates the relevance f(qi, Dj) between the query element qi and the database Dj, and stores it in a storage device such as a main memory (step S2203). The calculation unit 2 determines whether the relevance of the query element qi has been calculated for each of the databases D1, D2, and D3 (step S2205). If “No”, the calculation unit 2 sets the next database as the processing target (step S2207) and returns to step S2203. By repeating this process, the relevance f(qi, Dj) of the query element qi to each of the databases D1, D2, and D3 is calculated. If the determination in step S2205 is “Yes”, the determination unit 3 compares the relevance values calculated for the databases D1, D2, and D3 and identifies the database Dj that shows the highest relevance. The determination unit 3 then determines that the query element qi is related to that database Dj and adds the query element qi to the query list L1dj of the database Dj (step S2209). The calculation unit 2 determines whether the relevance has been calculated for all query elements (step S2211). If “No”, the calculation unit 2 sets the next query element as the processing target (step S2213) and returns to step S2201. The query lists are stored in the query list storage unit 10. As a result of the processing of Example 2, the query list L1dj of the database Dj stores the sequence of query elements whose relevance was highest for the database Dj among the plurality of databases D1, D2, and D3, that is, the query elements determined to be related to the database Dj.
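The highest-relevance flow of FIG. 7 can be sketched in the same style. The function name, the `relevance` callback, and the sample scores are hypothetical. The text does not say how ties are resolved, so this sketch assumes the first database in the list wins a tie.

```python
from typing import Callable, Dict, Iterable, List


def build_query_lists_by_best_match(
    query_elements: Iterable[str],
    databases: Iterable[str],
    relevance: Callable[[str, str], float],
) -> Dict[str, List[str]]:
    """Assign each query element to the single database with the highest relevance."""
    db_names = list(databases)
    if not db_names:
        raise ValueError("at least one database is required")
    query_lists: Dict[str, List[str]] = {}
    for q in query_elements:                              # S2201 / S2211 / S2213
        scores = {d: relevance(q, d) for d in db_names}   # S2203 to S2207: f(qi, Dj) for all Dj
        # S2209: pick the database with the highest score.
        # max() returns the first maximal item, so ties go to the earlier database (assumption).
        best = max(db_names, key=lambda d: scores[d])
        query_lists.setdefault(best, []).append(q)
    return query_lists


# Hypothetical scores for illustration only. Unlisted pairs are treated as 0.0.
SAMPLE = {
    ("Nihonbashi", "D1"): 0.10, ("Nihonbashi", "D2"): 0.02,
    ("delicious", "D1"): 0.05, ("delicious", "D2"): 0.30,
}
best_lists = build_query_lists_by_best_match(
    ["Nihonbashi", "delicious"], ["D1", "D2", "D3"],
    lambda q, d: SAMPLE.get((q, d), 0.0),
)
```

One consequence of this rule is that every query element lands in exactly one list, even an element whose relevance is low everywhere. The threshold approach of Example 1 may instead place an element in several lists or in none.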

FIGS. 8(a) to 8(d) illustrate a specific example of the processing of the calculation unit 2 and the determination unit 3. When “Nihonbashi”, “delicious”, “not expensive”, and “sushi” are extracted as query elements (FIG. 8(a)), the calculation unit 2 calculates, for each of the databases D1, D2, and D3, the relevance to each of the query elements “Nihonbashi”, “delicious”, “not expensive”, and “sushi” (FIG. 8(b)). The determination unit 3 determines the presence or absence of a relation based on the calculated relevance (FIG. 8(c)). In the processing of Example 1 above, the determination is based on a threshold. For example, when the threshold range is 10/100 to 50/100, the result is as follows. For the database D1, “Nihonbashi” (relevance 10/100) and “sushi” (relevance 20/100) are determined to be related and are added to the first query list L1d1 of the database D1 (FIG. 8(d)). For the database D2, “delicious” (relevance 30/100) and “not expensive” (relevance 20/100) are determined to be related and are added to the query list L1d2 of the database D2 (FIG. 8(d)). For the database D3, no query element falls within the threshold range, so the query list L1d3 is not created. Note that the relevance values shown in FIG. 8(b) are set for convenience of explanation and differ from actual values.

In the processing of Example 2 above, the determination is based on the highest relevance. The determination unit 3 compares the relevance of the query element “Nihonbashi” to each of the databases D1, D2, and D3, and determines that it is related to the database D1, for which the relevance is highest (FIG. 8(c)). “Nihonbashi” is added to the first query list L1d1 of the database D1. Similarly, the comparison is made for “delicious”, “not expensive”, and “sushi”, and each query element is added to the first query list L1dj of the database Dj for which its relevance is highest (FIG. 8(d)). As a result, the query list L1d1 stores the query elements “Nihonbashi” and “sushi” determined to be related to the database D1, and the query list L1d2 stores the query elements “delicious” and “not expensive” determined to be related to the database D2.

FIG. 9 shows Example 1 of the processing flow for calculating the relevance, and FIG. 10 shows Example 2. Examples 1 and 2 correspond to step S2103 of the relevance calculation and determination flow (Example 1) shown in FIG. 6 and to step S2203 of the relevance calculation and determination flow (Example 2) shown in FIG. 7. Example 1 calculates the relevance based on appearance frequency. When a specific query element appears frequently in a specific database, that query element and that database are likely to be related. Calculating the relevance based on this appearance frequency therefore yields a more appropriate relevance value. The calculation unit 2 acquires one query element qi (step S311) and counts the appearance frequency Rdj of the query element qi in the database Dj (step S313). Next, the calculation unit 2 calculates the relevance f(qi, Dj) based on the ratio of the appearance frequency Rdj to the number of data elements in the database Dj, and stores it in a storage device such as a main memory (step S315). The relevance f(qi, Dj) increases as the appearance frequency increases. Here, a data element of the database Dj is an element corresponding to a query element. Data elements can be extracted, for example, by having the extraction unit 1 apply the same query element extraction processing to the data stored in each of the databases D1, D2, and D3, and the number of data elements is obtained by counting the data elements extracted from each database. It is preferable to perform this processing in advance and to store the number of data elements for each of the databases D1, D2, and D3 in a storage unit. Note that the relevance does not necessarily have to use the ratio to the number of data elements; for example, it may be calculated using the appearance frequency Rdj as it is.
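The frequency-based relevance of FIG. 9 can be sketched as below, assuming the data elements of a database have already been extracted into a list of strings. The class name `DatabaseStats` and its interface are hypothetical. Counting is done once up front, in line with the suggestion that the number of data elements be computed in advance.

```python
from collections import Counter
from typing import Iterable


class DatabaseStats:
    """Precomputed element counts for one database Dj (hypothetical helper)."""

    def __init__(self, data_elements: Iterable[str]) -> None:
        self.counts = Counter(data_elements)      # appearance count of each data element
        self.total = sum(self.counts.values())    # number of data elements in Dj

    def relevance(self, query_element: str, normalize: bool = True) -> float:
        """Return f(qi, Dj) as Rdj / total, or the raw Rdj when normalize is False."""
        r_dj = self.counts[query_element]         # S313: Counter yields 0 for unseen elements
        if not normalize:
            return float(r_dj)                    # variant: use Rdj as it is
        return r_dj / self.total if self.total else 0.0   # S315


# Illustrative data elements only.
d1_stats = DatabaseStats(["sushi", "Nihonbashi", "sushi", "tempura"])
```

With this sample, `d1_stats.relevance("sushi")` is the ratio 2/4, while an element that never appears in the database yields zero.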

Example 2, shown in FIG. 10, calculates the relevance using categories. For example, when store locations are described in the data and the relevance is judged on the basis of a specific place name (such as “Nihonbashi” or “Tokyo”), the relevance is wrongly judged to be low whenever that specific place name does not appear in the database. Example 2 therefore replaces query elements and data elements with categories before calculating the relevance. In the search system 100, rules that associate a category with the category elements it contains are defined in advance. These rules are stored in a storage unit (not shown) of the search system 100, for example in the form of a category table as shown in FIG. 11. When the calculation unit 2 acquires a query element qi (step S321), it refers to the category table and identifies the category ci that contains the query element qi as a category element (step S323). Next, the calculation unit 2 acquires the category elements contained in the category ci from the category table and counts the appearance frequency Rdj with which those category elements appear in the database Dj (step S325). Next, the calculation unit 2 calculates the relevance f(ci, Dj) based on the ratio of the appearance frequency Rdj to the number of data elements in the database Dj, and stores it in a storage device such as a main memory (step S327). In this way the relevance is calculated on the basis of the category, which yields a still more appropriate relevance value. Here, too, the relevance does not necessarily have to use the ratio to the number of data elements; for example, it may be calculated using the appearance frequency Rdj as it is.
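The category-based calculation of FIG. 10 can be sketched as follows. The contents of `CATEGORY_TABLE` are invented for illustration and are not the table of FIG. 11. The function names are hypothetical, and returning 0.0 for an element that belongs to no category is an assumption.

```python
from typing import Dict, Optional, Sequence, Set

# Hypothetical category table: category -> category elements.
CATEGORY_TABLE: Dict[str, Set[str]] = {
    "place name": {"Nihonbashi", "Tokyo", "Ginza"},
    "dish type": {"sushi", "tempura"},
}


def find_category(element: str, table: Dict[str, Set[str]]) -> Optional[str]:
    """S323: return the category that contains the element, or None."""
    for category, members in table.items():
        if element in members:
            return category
    return None


def relevance_by_category(
    query_element: str, data_elements: Sequence[str], table: Dict[str, Set[str]]
) -> float:
    """Return f(ci, Dj): share of data elements that fall in the query element's category."""
    category = find_category(query_element, table)
    if category is None or not data_elements:
        return 0.0                                   # assumption: uncategorized -> no relevance
    members = table[category]
    r_dj = sum(1 for e in data_elements if e in members)   # S325: count category elements
    return r_dj / len(data_elements)                        # S327


# "Nihonbashi" itself does not occur, but other place names do.
sample_elements = ["Tokyo", "Ginza", "sushi", "cheap"]
```

With this sample, `relevance_by_category("Nihonbashi", sample_elements, CATEGORY_TABLE)` is non-zero even though the literal string never appears in the data, which is the misjudgment the category approach is meant to avoid.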

The category of each data element in the database Dj is preferably identified in advance based on the category table, as shown in FIG. 11, and stored in a storage unit (not shown). The category rules may also identify a category by partial match. For example, the category table may define “橋” (bridge), “町” (town), “川” (river), and the like as place-name suffixes, and an element ending in one of these suffixes is judged to belong to the category “place name”. A category may be fine-grained, grouping only spelling variants and synonyms into one category; it may be a semantic category such as “place name”, “dish type”, “taste”, or “value for money”; or it may be a coarse-grained category at the part-of-speech level such as “noun” or “adjective”. The relevance may be any value that serves as an index of how strongly a query element and a database are related, and is not limited to Examples 1 and 2 above. For example, the relevance may be calculated based on the appearance frequency of character strings of n characters (n-grams) taken as the unit.
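The partial-match rule and the n-gram unit mentioned above can be sketched as two small helpers. Both function names are hypothetical. The suffix tuple contains only the three suffixes named in the text, and the rule operates on the original Japanese strings.

```python
from typing import List, Optional

# The three place-name suffixes named in the text: bridge, town, river.
PLACE_NAME_SUFFIXES = ("橋", "町", "川")


def category_by_suffix(element: str) -> Optional[str]:
    """Return "place name" when the element ends with a known place-name suffix."""
    # str.endswith accepts a tuple and checks each suffix in turn.
    return "place name" if element.endswith(PLACE_NAME_SUFFIXES) else None


def char_ngrams(text: str, n: int) -> List[str]:
    """Split text into overlapping character n-grams, the unit suggested for counting."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

For instance, `category_by_suffix("日本橋")` yields the place-name category, and `char_ngrams("日本橋", 2)` yields the two bigrams of that string. The resulting n-grams could then be counted in the same way as the data elements in the frequency-based calculation.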

FIG. 12 shows an example of the processing flow of the search query generation unit 4, and corresponds to step S7 in FIG. 3. The search query generation unit 4 refers to the query list L1dj of the database Dj stored in the query list storage unit 10 (step S701). The query list L1dj stores the one or more query elements determined to be related to the database Dj. The search query generation unit 4 acquires the one or more query elements stored in the query list L1dj and, based on a predetermined rule, generates a second search query Q2dj containing those query elements (step S703). Next, the search query generation unit 4 stores the second search query Q2dj in the query storage unit 11 in association with the database Dj (step S705). The search query generation unit 4 determines whether a second search query has been generated for each of the databases D1, D2, and D3 (step S707). If “No”, the search query generation unit 4 sets the query list of the next database as the processing target (step S709) and returns to step S701. If “Yes”, the process ends. The query storage unit 11 stores the second search queries Q2d1, Q2d2, and Q2d3 in association with the databases D1, D2, and D3. In this way, dedicated second search queries Q2d1, Q2d2, and Q2d3 are generated for the databases D1, D2, and D3, respectively. The second search query Q2dj of the database Dj contains the query elements determined to be related to the database Dj. However, if no query element has been determined to be related to a database Dj, no second search query Q2dj is generated for that database.

FIGS. 13(a) and 13(b) illustrate a specific example of the processing of the search query generation unit 4. The search query generation unit 4 refers to the query list L1d1 of the database D1 stored in the query list storage unit 10 and, based on a predetermined rule, generates a second search query Q2d1 containing the query elements “Nihonbashi” and “sushi” of the query list L1d1 (FIG. 13(a)). The second search query Q2d1 thus contains the query elements “Nihonbashi” and “sushi” determined to be related to the database D1. Similarly, the search query generation unit 4 refers to the query list L1d2 of the database D2 and generates a second search query Q2d2 containing the query elements “delicious” and “not expensive” (FIG. 13(b)). The second search query Q2d2 thus contains the query elements “delicious” and “not expensive” determined to be related to the database D2. Since there is no query list L1d3 for the database D3, no second search query Q2d3 is generated. The second search queries Q2d1 and Q2d2 are stored in the query storage unit 11 in association with the databases D1 and D2. In the example of FIG. 13, the predetermined rule joins the query elements with “or”, but the rule is not limited to this.
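The per-database query generation of FIGS. 12 and 13 can be sketched as below, using the “or” joining rule of the FIG. 13 example. The function name, the string form of the generated query, and the `joiner` parameter are hypothetical; a real search backend would have its own query syntax.

```python
from typing import Dict, List


def generate_second_queries(
    query_lists: Dict[str, List[str]], joiner: str = " OR "
) -> Dict[str, str]:
    """Build one dedicated query string per database from its query list."""
    second_queries: Dict[str, str] = {}
    for db_name, elements in query_lists.items():    # S701 / S707 / S709: visit each query list
        if not elements:
            continue                                 # no related element: no query is generated
        second_queries[db_name] = joiner.join(elements)   # S703 and S705
    return second_queries


second_queries = generate_second_queries({
    "D1": ["Nihonbashi", "sushi"],
    "D2": ["delicious", "not expensive"],
    "D3": [],
})
```

Passing a different `joiner` (for example one that expresses conjunction) is one way to model the remark that the rule is not limited to “or”.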

FIG. 14A shows an example of the processing flow of the search units 51, 52, and 53, and corresponds to step S9 in FIG. 3. The search units 51, 52, and 53 acquire the second search query Q2dj stored in the query storage unit 11 (step S5101). When a search unit 51, 52, or 53 acquires the second search query Q2dj dedicated to the database Dj it is in charge of, it searches that database Dj based on the second search query Q2dj (step S5103). Data blocks that match the second search query Q2dj are detected from the database Dj. One data block or a plurality of data blocks may be detected.

The first search query Q1 entered by the user is a search query common to the databases D1, D2, and D3 and, depending on the database, contains query elements that are unrelated to it. Searching the databases D1, D2, and D3 based on the first search query Q1 therefore lowers search accuracy. In contrast, the second search query Q2dj is a search query dedicated to the database Dj and contains only the query elements determined to be related to the database Dj, so search accuracy is improved.

The search method used by the search units 51, 52, and 53 is arbitrary; examples include Boolean search and extended Boolean search. The search method may also be selected according to the query element. In this case, a table defining the correspondence between query elements and search methods is stored in a storage unit. The search units 51, 52, and 53 refer to this table, identify the search method corresponding to each query element contained in the second search query Q2dj, and search the database Dj using that search method. The search is thus performed with a search method suited to the characteristics of the query element, which can further improve search accuracy.

The search method may also be selected according to the category of the query element. FIG. 14B shows the processing flow for selecting a search method according to category. The search units 51, 52, and 53 acquire a query element from the second search query Q2dj and, based on a category rule that associates a category with the plurality of category elements it contains, identify the category that contains the query element as a category element (step S5201). Next, the search units 51, 52, and 53 identify the search method corresponding to the identified category based on a search rule that associates categories with search methods (step S5203). The search units 51, 52, and 53 then search the database for that query element using that search method. When the second search query Q2dj contains a plurality of query elements, the category is identified for each query element, and each query element is searched with the search method corresponding to its category. The category table described above may be used as the category rule. The search rule may be stored in a storage unit (not shown) as a table that associates categories with search methods. The search is thus performed with a search method suited to the characteristics of the query element's category, which can further improve search accuracy.
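The category-driven selection of a search method in FIG. 14B can be sketched as a lookup into two tables. Everything concrete here is an assumption made for illustration: the two matching functions stand in for real search methods, the rule tables are invented, data blocks are modeled as plain strings, and a block is treated as a hit when any query element matches (an “or” combination).

```python
from typing import Callable, Dict, List, Optional, Sequence, Set

Matcher = Callable[[str, str], bool]


def exact_word_match(term: str, block: str) -> bool:
    """Hypothetical search method: the term must appear as a whole word."""
    return term in block.split()


def substring_match(term: str, block: str) -> bool:
    """Hypothetical search method: the term may appear anywhere in the block."""
    return term in block


# Hypothetical category rule and search rule tables.
CATEGORY_RULE: Dict[str, Set[str]] = {"place name": {"Nihonbashi"}, "dish type": {"sushi"}}
SEARCH_RULE: Dict[str, Matcher] = {"place name": substring_match, "dish type": exact_word_match}


def search_by_category(
    query_elements: Sequence[str],
    blocks: Sequence[str],
    category_rule: Dict[str, Set[str]],
    search_rule: Dict[str, Matcher],
    default: Matcher = substring_match,
) -> List[str]:
    """Return the blocks matched by at least one query element, each via its own method."""
    hits: List[str] = []
    for block in blocks:
        for q in query_elements:
            category: Optional[str] = next(
                (c for c, members in category_rule.items() if q in members), None
            )                                                 # S5201: category of the element
            method = search_rule.get(category, default)       # S5203: method for the category
            if method(q, block):
                hits.append(block)
                break                                         # one match is enough for a hit
    return hits


sample_blocks = ["sushi bar near Nihonbashi-dori", "sushiya in Ginza", "ramen shop"]
category_hits = search_by_category(
    ["Nihonbashi", "sushi"], sample_blocks, CATEGORY_RULE, SEARCH_RULE
)
```

In this sample the second block is not a hit, because the dish-type category is searched by whole-word match and “sushiya” is a different word. Under substring matching it would have matched, which illustrates why the choice of method per category can change the result.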

Furthermore, the search method may be selected according to the database. In this case, a table associating the databases D1, D2, and D3 with search methods is stored in a storage unit (not shown). The search units 51, 52, and 53 refer to this table, identify the search method corresponding to the database Dj they are in charge of, and search that database Dj using that search method. The search is thus performed with a search method suited to the characteristics of the database, which can further improve search accuracy.

FIG. 15 shows Example 1 of the processing flow of the set generation units 61, 62, and 63, and FIG. 17 shows Example 2. Examples 1 and 2 correspond to step S11 in FIG. 3. In Example 1, when the search by the search units 51, 52, and 53 detects data blocks in the assigned database Dj that match the second search query Q2dj, the set generation units 61, 62, and 63 identify the entity Ex associated with each detected data block (step S1111). There may be one entity or a plurality of entities. Once an entity Ex is identified, the set generation units 61, 62, and 63 add that entity to the entity list L2dj of the database Dj (step S1113). The entity list L2dj contains the sequence of entities Ex detected from the database Dj. The set generation units 61, 62, and 63 store the entity list L2dj in the set storage unit 12 in association with the database Dj. The entity list L2dj stored in the set storage unit 12 constitutes the set of entities detected from the database Dj. That is, the set storage unit 12 stores, for each database Dj, the set of entities detected from that database.

FIG. 16 illustrates a specific example of Example 1 of the processing flow of the set generation units 61, 62, and 63. Frame A describes the processing of the set generation unit 61, frame B that of the set generation unit 62, and frame C that of the set generation unit 63. In frame A, the search unit 51 acquires the second search query Q2d1 and searches the database D1. When data blocks d1, d2, d3, and d4 are detected as a result, the set generation unit 61 identifies the entities E1, E2, E3, and E4 associated with the data blocks d1, d2, d3, and d4 and adds them to the entity list L2d1. Similarly, in frame B, when the search unit 52 detects data blocks d5, d6, and d7 from the database D2, the set generation unit 62 adds the entities E1, E3, and E5 associated with them to the entity list L2d2. Since no dedicated second search query Q2d3 has been generated for the database D3, the search unit 53 and the set generation unit 63 perform no processing. The entity list L2d1 is the set of entities detected from the database D1, and the entity list L2d2 is the set of entities detected from the database D2.
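The entity-set construction of FIGS. 15 and 16 can be sketched as a mapping from detected data blocks to entities. The function name and the dictionary representation are hypothetical. The pairing of d5, d6, and d7 with E1, E3, and E5 in that order is an assumption, since the text lists them only as groups. Suppressing duplicate entities is also an assumption, chosen because the list is described as a set.

```python
from typing import Dict, List, Sequence

# Block-to-entity associations following the FIG. 16 example.
# The one-to-one pairing in listed order is assumed.
BLOCK_TO_ENTITIES: Dict[str, List[str]] = {
    "d1": ["E1"], "d2": ["E2"], "d3": ["E3"], "d4": ["E4"],
    "d5": ["E1"], "d6": ["E3"], "d7": ["E5"],
}


def build_entity_list(
    hit_blocks: Sequence[str], block_to_entities: Dict[str, List[str]]
) -> List[str]:
    """Collect the entities associated with the detected data blocks (list L2dj)."""
    entity_list: List[str] = []
    for block_id in hit_blocks:
        for entity in block_to_entities.get(block_id, []):   # S1111: entities of the block
            if entity not in entity_list:                     # keep each entity once (assumption)
                entity_list.append(entity)                    # S1113: add to L2dj
    return entity_list


l2d1 = build_entity_list(["d1", "d2", "d3", "d4"], BLOCK_TO_ENTITIES)
l2d2 = build_entity_list(["d5", "d6", "d7"], BLOCK_TO_ENTITIES)
```

Entities E1 and E3 appear in both lists, which is the kind of overlap between per-database sets that later processing can exploit.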

FIG. 17 shows Example 2 of the processing flow of the set generation units 61, 62, and 63. When the search units 51, 52, and 53 detect data blocks from the database Dj that match the second search query Q2dj, the set generation units 61, 62, and 63 identify the entity Ex associated with each detected data block (step S1121). Next, the set generation units 61, 62, and 63 calculate a first evaluation value Vd for each detected data block (step S1123). The first evaluation value Vd may be any value that represents the degree to which the data block d matches the second search query Q2dj. The more closely the data block d matches the second search query Q2dj, the higher the first evaluation value Vd. A specific method for calculating the first evaluation value Vd is described later. The set generation units 61, 62, and 63 associate the entity Ex with the first evaluation value Vd and add the pair to the entity list L3dj (step S1125). The entity list L3dj is stored in the set storage unit 12 in association with the database Dj.
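The scored variant of FIG. 17 differs from the plain entity list only in that each entry carries a first evaluation value. A minimal sketch follows, assuming one entity per data block and an `evaluate` callback that stands in for whichever evaluation rule is configured. The names and the sample scores are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def build_scored_entity_list(
    hit_blocks: Sequence[str],
    block_to_entity: Dict[str, str],
    evaluate: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Return (entity Ex, first evaluation value Vd) pairs for the detected blocks (list L3dj)."""
    scored: List[Tuple[str, float]] = []
    for block_id in hit_blocks:
        entity = block_to_entity.get(block_id)    # S1121: entity associated with the block
        if entity is None:
            continue                              # blocks without a known entity are skipped
        v_d = evaluate(block_id)                  # S1123: first evaluation value of the block
        scored.append((entity, v_d))              # S1125: add the pair to L3dj
    return scored


# Hypothetical scores, for illustration only.
SAMPLE_SCORES = {"d1": 0.9, "d2": 0.4}
l3d1 = build_scored_entity_list(
    ["d1", "d2"], {"d1": "E1", "d2": "E2"}, lambda b: SAMPLE_SCORES[b]
)
```

Keeping the score next to the entity lets later steps rank or filter entities without re-running the search.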

FIG. 18 illustrates a specific example of Example 2 above for the set generation units 61, 62, and 63. As in Example 1, the set generation unit 61 identifies the entities E1, E2, E3, and E4 associated with the data blocks d1, d2, d3, and d4 and adds them to the entity list L3d1 of the database D1. The set generation unit 61 also calculates a first evaluation value Vd for each of the data blocks d1, d2, d3, and d4 detected by the search unit 51 and adds it to the entity list L3d1 in association with the entities E1, E2, E3, and E4. The set generation unit 62 performs the same processing and adds entities and first evaluation values Vd to the entity list L3d2 of the database D2. The entity list L3d1 contains the sequence of pairs of an entity detected as a result of searching the database D1 and its first evaluation value Vd. The entity list L3d2 contains the sequence of pairs of an entity detected as a result of searching the database D2 and its first evaluation value Vd. Note that the first evaluation values Vd shown in FIG. 18 are set for convenience of explanation and differ from actual values.

The first evaluation value is calculated as follows. A storage unit (not shown) of the search device 100 stores an evaluation value rule that defines how the first evaluation value Vd is calculated. The set generation units 61, 62, and 63 refer to the evaluation value rule stored in the storage unit (not shown) and calculate the first evaluation value Vd based on that rule. When the second search query Q2dj contains a plurality of query elements, an evaluation value may be calculated for each of the query elements, and the sum, average, or the like of these evaluation values may be used as the first evaluation value Vd of the data block. Examples of methods for calculating the first evaluation value Vd include the following. A first example calculates the value based on the relationship between a query element of the second search query Q2dj and a data element of the data block. Values representing this relationship include, for example, similarity, relevance, and proximity. For example, when the query element has a place-name attribute, the evaluation value is 1 when the place names match exactly and 0.8 when the places are neighboring towns. The method for calculating the value representing the relationship is stored in advance as a calculation rule in a storage unit (not shown) of the search device 100. The set generation units 61, 62, and 63 acquire the calculation rule corresponding to the query element, calculate the similarity, relevance, proximity, or the like based on that calculation rule, and store the result in a storage device such as a main memory. In the case of place names, for example, data indicating area names and the positional relationships between areas, as shown in FIG. 19, is stored in a storage unit (not shown). The set generation units 61, 62, and 63 refer to this data and calculate the first evaluation value Vd based on the positional relationship.
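The place-name rule (1 for an exact match, 0.8 for a neighboring town) and the sum-or-average aggregation can be sketched as follows. The adjacency data uses placeholder town names rather than the contents of FIG. 19, the function names are hypothetical, and scoring unrelated places as 0.0 is an assumption because the text gives no value for that case.

```python
from typing import Dict, Iterable, Set

# Placeholder adjacency data standing in for the positional data of FIG. 19.
NEIGHBORS: Dict[str, Set[str]] = {"A-cho": {"B-cho"}, "B-cho": {"C-cho"}}


def place_score(query_place: str, block_place: str, neighbors: Dict[str, Set[str]]) -> float:
    """Score a place-name query element against a data block's place name."""
    if query_place == block_place:
        return 1.0                                    # exact match
    adjacent = (
        block_place in neighbors.get(query_place, set())
        or query_place in neighbors.get(block_place, set())
    )                                                 # adjacency is treated as symmetric
    return 0.8 if adjacent else 0.0                   # 0.0 for unrelated places is assumed


def first_evaluation_value(element_scores: Iterable[float], mode: str = "sum") -> float:
    """Combine per-element scores into Vd by sum (default) or by mean."""
    scores = list(element_scores)
    if not scores:
        return 0.0
    total = sum(scores)
    return total if mode == "sum" else total / len(scores)


v_d = first_evaluation_value(
    [place_score("A-cho", "A-cho", NEIGHBORS), place_score("A-cho", "B-cho", NEIGHBORS)],
    mode="mean",
)
```

Using the mean rather than the sum keeps Vd comparable between queries with different numbers of query elements, which may matter when evaluation values from different databases are later compared.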

また、第１の評価値Ｖｄの算出方法は、オントロジー間の距離を用いるものであっても良い。例えば、集合生成部６１，６２，６３は、クエリー要素が「寿司」の場合、日本料理として共通する「天ぷら」は距離が近いと判断し、フランス料理は距離が遠いと判断し、その距離に応じた値を第１の評価値Ｖｄとする。また第１の評価値の他の算出方法としては、クエリー要素の出現頻度に基づくものが挙げられる。第２の検索クエリーＱ２ｄｊに含まれるクエリー要素の出現頻度が高いデータブロックは、第２の検索クエリーＱ２ｄｊに対する合致度が高い。そこで、集合生成部６１，６２，６３は、検出されたデータブロックについて、第２の検索クエリーＱ２ｄｊに含まれるクエリー要素の出現頻度をカウントし、出現頻度に基づいて第１の評価値Ｖｄを算出する。また、第１の評価値Ｖｄの他の算出方法としては、集合生成部６１，６２，６３は、シソーラス等の辞書データを参照し、検出されたデータブロックについて、クエリー要素と一定の関係にある語（同義語、反意語、類義語など）の出現頻度をカウントし、その出現頻度に基づいて第１の評価値Ｖｄを算出しても良い。出現頻度が高いほど、第１の評価値Ｖｄは高くなる。 The first evaluation value Vd may also be calculated using the distance between concepts in an ontology. For example, when the query element is “sushi”, the set generation units 61, 62, and 63 determine that “tempura”, which is likewise Japanese cuisine, is close and that French cuisine is far, and set a value corresponding to that distance as the first evaluation value Vd. Another method of calculating the first evaluation value is based on the appearance frequency of query elements. A data block in which the query elements included in the second search query Q2dj appear frequently has a high degree of match with the second search query Q2dj. The set generation units 61, 62, and 63 therefore count, for each detected data block, the appearance frequency of the query elements included in the second search query Q2dj, and calculate the first evaluation value Vd based on that frequency. As yet another method, the set generation units 61, 62, and 63 may refer to dictionary data such as a thesaurus, count, for each detected data block, the appearance frequency of words that have a certain relationship with the query element (synonyms, antonyms, near-synonyms, and the like), and calculate the first evaluation value Vd based on that frequency. The higher the appearance frequency, the higher the first evaluation value Vd.
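The frequency-based variant, including the thesaurus lookup, might look like the sketch below. The `THESAURUS` contents, the whitespace tokenization, and the use of the raw count as the evaluation value are all assumptions made for illustration; the description only states that the value is calculated based on the appearance frequency.

```python
from collections import Counter

# Hypothetical thesaurus: query element -> words in a fixed relationship with it.
THESAURUS = {"sushi": {"nigiri", "sashimi"}}


def frequency_score(query_elements, block_text, thesaurus=THESAURUS):
    """Count occurrences of each query element and its related words in a data block.

    Assumption: the raw count is used directly as the first evaluation value Vd.
    """
    counts = Counter(block_text.lower().split())
    total = 0
    for element in query_elements:
        terms = {element.lower()} | thesaurus.get(element.lower(), set())
        total += sum(counts[term] for term in terms)
    return total


vd_high = frequency_score(["sushi"], "Sushi bar serving nigiri and sushi")
vd_low = frequency_score(["sushi"], "French bistro")
```

A data block that mentions the query element or its related words more often receives a higher value, which matches the monotonic relationship stated in the text.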

図２０は、部分集合生成部７による処理の処理フローの例１を示す図であり、図２２と図２３はその例２を示す図である。例１及び例２は、図３のステップＳ１３に相当する処理である。例１は集合生成部６１，６２，６３の処理を処理フローの例１（図１５）としたときの後続処理であり、例２は集合生成部６１，６２，６３の処理を処理フローの例２（図１７）としたときの後続処理である。 FIG. 20 is a diagram illustrating Example 1 of the processing flow of the subset generation unit 7, and FIGS. 22 and 23 are diagrams illustrating Example 2. Examples 1 and 2 both correspond to step S13 in FIG. 3. Example 1 is the subsequent processing when the set generation units 61, 62, and 63 follow processing flow Example 1 (FIG. 15), and Example 2 is the subsequent processing when they follow processing flow Example 2 (FIG. 17).

図２０に示される例１において、部分集合生成部７は、集合格納部１２からデータベースＤｊのエンティティリストＬ２ｄｊを取得する（ステップ１３１０１）。次に、部分集合生成部７は、エンティティリストＬ２ｄｊからエンティティＥｘを取得し（ステップＳ１３１０３）、エンティティＥｘの評価リストＬ４ｅｘにデータベースＤｊを追加する（ステップＳ１３１０５）。エンティティは一つの場合もあれば、複数の場合もある。部分集合生成部７は、エンティティリストＬ２ｄｊに含まれるすべてのエンティティＥを処理したか判断する（ステップＳ１３１０７）。「Ｎｏ」の場合、部分集合生成部７は、処理対象を次のエンティティに設定し（ステップＳ１３１０９）、ステップＳ１３１０３に戻る。この処理を繰り返すことで、エンティティリストＬ２ｄｊに含まれるすべてのエンティティＥｘについて、対応する評価リストＬ４ｅｘにデータベースＤｊが追加される。ステップＳ１３１０７において「Ｙｅｓ」と判断された場合、部分集合生成部７は、すべてのエンティティリストについて処理したか判断する（ステップＳ１３１１１）。「Ｎｏ」の場合、部分集合生成部７は、処理対象を次のエンティティリストに設定し（ステップＳ１３１１３）、ステップＳ１３１０１に戻る。これにより、エンティティの各々について、評価リストが生成される。生成された評価リストは、例えばメインメモリ等の記憶装置に格納される。エンティティＥｘの評価リストＬ４ｅｘには、そのエンティティＥｘが検出されたデータベースの列が設けられる。部分集合生成部７は、生成された各エンティティの評価リストを参照し、所定のルールに基づいて、エンティティを抽出し、抽出したエンティティのリストＬ５を部分集合格納部１３に格納する（ステップＳ１３１１５）。部分集合格納部１３に格納されたリストＬ５はエンティティの部分集合を構成する。 In Example 1 shown in FIG. 20, the subset generation unit 7 acquires the entity list L2dj of the database Dj from the set storage unit 12 (step 13101). Next, the subset generation unit 7 acquires the entity Ex from the entity list L2dj (step S13103), and adds the database Dj to the evaluation list L4ex of the entity Ex (step S13105). There may be one entity or multiple entities. The subset generation unit 7 determines whether all entities E included in the entity list L2dj have been processed (step S13107). In the case of “No”, the subset generation unit 7 sets the processing target to the next entity (step S13109) and returns to step S13103. By repeating this process, the database Dj is added to the corresponding evaluation list L4ex for all the entities Ex included in the entity list L2dj. If “Yes” is determined in step S13107, the subset generation unit 7 determines whether all entity lists have been processed (step S13111). In the case of “No”, the subset generation unit 7 sets the processing target to the next entity list (step S13113) and returns to step S13101. Thereby, an evaluation list is generated for each of the entities. 
The generated evaluation lists are stored in a storage device such as a main memory. The evaluation list L4ex of the entity Ex contains an entry for each database in which the entity Ex was detected. The subset generation unit 7 refers to the generated evaluation list of each entity, extracts entities based on a predetermined rule, and stores the list L5 of extracted entities in the subset storage unit 13 (step S13115). The list L5 stored in the subset storage unit 13 constitutes a subset of entities.

エンティティを抽出するときの所定のルールの例としては、次のものが挙げられる。検索部５１，５２，５３の検索において、多くのデータベースで検出されたエンティティは、第１の検索クエリーＱ１に適合するエンティティである可能性が高い。そこで、例えば、部分集合生成部７は、エンティティＥｘの評価リストＬ４ｅｘに含まれるデータベースの数が閾値以上の場合、エンティティＥｘを抽出するようにしても良い。また、別の例では、部分集合生成部７は、集合生成部６１，６２，６３で生成されたエンティティリストを一つの集合とみなし、その和集合又は積集合を生成することで、部分集合を生成しても良い。 Examples of the predetermined rule when extracting an entity include the following. In the search by the search units 51, 52, and 53, there is a high possibility that an entity detected in many databases is an entity that matches the first search query Q1. Therefore, for example, the subset generation unit 7 may extract the entity Ex when the number of databases included in the evaluation list L4ex of the entity Ex is equal to or greater than a threshold value. In another example, the subset generation unit 7 regards the entity list generated by the set generation units 61, 62, and 63 as one set, and generates the union or intersection set to generate the subset. It may be generated.

図２１（ａ）乃至（ｃ）は、部分集合生成部７による処理の処理フローの例１について具体例を説明する図である。図２１（ａ）に示すように、部分集合生成部７は、エンティティリストＬ２ｄ１からエンティティＥ１を取得し、エンティティＥ１の評価リストＬ４ｅ１にデータベースＤ１を追加する。同様に、部分集合生成部７は、エンティティリストＬ２ｄ１からエンティティＥ２，Ｅ３，Ｅ４を取得し、評価リストＬ４ｅ２，Ｌ４ｅ３，Ｌ４ｅ４にデータベースＤ１を追加する。また、図２１（ｂ）に示すように、部分集合生成部７は、エンティティリストＬ２ｄ２からエンティティＥ１，Ｅ３，Ｅ５を取得し、エンティティの評価リストＬ４ｅ１，Ｌ４ｅ３，Ｌ４ｅ５に追加する。評価リストＬ４ｅ１乃至Ｌ４ｅ５は、例えばメインメモリ等の記憶装置に格納される。部分集合生成部７は、所定のルールに基づいてエンティティＥ１，Ｅ２，Ｅ３，Ｅ４，Ｅ５からエンティティを抽出する。本具体例においては、エンティティＥｘの評価リストＬ４ｅｘに二以上のデータベースが含まれる場合、そのエンティティＥｘが抽出される。図２１（ｃ）に示すように、部分集合生成部７は、評価リストに二以上のデータベースが含まれるエンティティＥ１，Ｅ３を抽出し、リストＬ５に追加する。リストＬ５はエンティティの部分集合である。 FIGS. 21A to 21C are diagrams illustrating a specific example of the processing flow example 1 of processing by the subset generation unit 7. As illustrated in FIG. 21A, the subset generation unit 7 acquires the entity E1 from the entity list L2d1, and adds the database D1 to the evaluation list L4e1 of the entity E1. Similarly, the subset generation unit 7 acquires entities E2, E3, E4 from the entity list L2d1, and adds the database D1 to the evaluation lists L4e2, L4e3, L4e4. Further, as illustrated in FIG. 21B, the subset generation unit 7 acquires the entities E1, E3, and E5 from the entity list L2d2 and adds them to the entity evaluation lists L4e1, L4e3, and L4e5. The evaluation lists L4e1 to L4e5 are stored in a storage device such as a main memory, for example. The subset generation unit 7 extracts entities from the entities E1, E2, E3, E4, and E5 based on a predetermined rule. In this specific example, when two or more databases are included in the evaluation list L4ex of the entity Ex, the entity Ex is extracted. As shown in FIG. 21C, the subset generation unit 7 extracts entities E1 and E3 whose evaluation list includes two or more databases and adds them to the list L5. List L5 is a subset of entities.
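The worked example of FIG. 21 can be reproduced with a short sketch. The data below comes from the text (entity lists L2d1 and L2d2 and the rule of two or more databases); the function names and the dictionary representation are illustrative assumptions.

```python
from collections import defaultdict


def build_evaluation_lists(entity_lists):
    """Map each entity to the databases whose entity list contains it (L4ex)."""
    evaluation_lists = defaultdict(list)
    for database, entities in entity_lists.items():    # loop over entity lists L2dj
        for entity in entities:                         # loop over entities Ex
            evaluation_lists[entity].append(database)   # add Dj to L4ex (step S13105)
    return dict(evaluation_lists)


def extract_subset(evaluation_lists, min_databases=2):
    """Keep entities detected in at least `min_databases` databases (step S13115)."""
    return sorted(
        entity
        for entity, databases in evaluation_lists.items()
        if len(databases) >= min_databases
    )


# Entity lists L2d1 and L2d2 from the FIG. 21 example.
entity_lists = {
    "D1": ["E1", "E2", "E3", "E4"],
    "D2": ["E1", "E3", "E5"],
}
L4 = build_evaluation_lists(entity_lists)
L5 = extract_subset(L4)
print(L5)  # ['E1', 'E3']
```

Assuming each entity appears at most once per entity list, a threshold of 1 yields the union of the entity lists and a threshold equal to the number of databases yields their intersection, so the same function also covers the union and intersection rules mentioned earlier.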

図２２及び図２３に示される例２は、集合生成部６１，６２，６３の処理を処理フローの例２（図１７）とした場合の後続処理である。図２０に示される例１と異なる点は、次の点である。なお、例１と共通する点については説明を省略する。部分集合生成部７は、ステップＳ１３２０３において、エンティティリストＬ３ｄｊからエンティティＥｘと第１の評価値Ｖｄを取得し、ステップ１３２０５において、エンティティＥｘの評価リストＬ４ｅｘに第１の評価値Ｖｄを追加する。評価リストＬ４ｅｘは、エンティティと関連付けられて、例えばメインメモリ等の記憶装置に格納される。例１において、評価リストＬ４ｅｘにはデータベースＤｊが追加されたが、本例においては、第１の評価値Ｖｄが追加される。図２３に示されるように、部分集合生成部７は、エンティティＥｘの評価リストＬ４ｅｘに含まれる第１の評価値Ｖｄに基づいて、エンティティＥｘの総合評価値Ｖｘを算出する（ステップＳ１３２１５）。総合評価値Ｖｘは、エンティティＥｘと関連付けられて、例えばメインメモリ等の記憶装置に格納される。総合評価値Ｖｘは、エンティティＥｘの第１のクエリーＱ１に対する合致度を表しており、合致度が高いほど総合評価値Ｖｘは高くなる。エンティティＥｘの評価値リストＬ４ｅｘに複数の第１の評価値Ｖｄが格納されている場合は、第１の評価値Ｖｄの総和又は平均値等を総合評価値としても良い。部分集合生成部７は、すべてのエンティティの総合評価値を算出したかを判断する（ステップＳ１３２１７）。「Ｎｏ」の場合、部分集合生成部７は、処理対象を次のエンティティに設定し（ステップＳ１３２１９）、ステップＳ１３２１５に戻る。この処理を繰り返すことにより、すべてのエンティティについて総合評価値が算出される。ステップＳ１３２１７にて「Ｙｅｓ」と判断した場合、部分集合生成部７は、各エンティティの総合評価値を参照し、所定のルールに基づいて、エンティティを抽出する。抽出されたエンティティのリストＬ５は部分集合格納部１３に格納される（ステップＳ１３２２１）。ここで、所定のルールとしては、例えば、総合評価値が閾値以上であるエンティティを抽出するものが挙げられる。部分集合格納部１３に格納されたリストＬ５はエンティティの部分集合である。 Example 2 shown in FIGS. 22 and 23 is a subsequent process when the processing of the set generation units 61, 62, and 63 is set as Example 2 (FIG. 17) of the processing flow. The following points are different from Example 1 shown in FIG. Note that a description of points in common with Example 1 is omitted. In step S13203, the subset generation unit 7 acquires the entity Ex and the first evaluation value Vd from the entity list L3dj. In step 13205, the subset generation unit 7 adds the first evaluation value Vd to the evaluation list L4ex of the entity Ex. The evaluation list L4ex is associated with an entity and stored in a storage device such as a main memory, for example. In Example 1, the database Dj is added to the evaluation list L4ex. However, in this example, the first evaluation value Vd is added. As illustrated in FIG. 
23, the subset generation unit 7 calculates the comprehensive evaluation value Vx of the entity Ex based on the first evaluation values Vd included in the evaluation list L4ex of the entity Ex (step S13215). The comprehensive evaluation value Vx is associated with the entity Ex and stored in a storage device such as a main memory. The comprehensive evaluation value Vx represents the degree to which the entity Ex matches the first search query Q1; the higher the degree of match, the higher the comprehensive evaluation value Vx. When a plurality of first evaluation values Vd are stored in the evaluation list L4ex of the entity Ex, the sum or average of the first evaluation values Vd may be used as the comprehensive evaluation value. The subset generation unit 7 determines whether the comprehensive evaluation values of all entities have been calculated (step S13217). If "No", the subset generation unit 7 sets the processing target to the next entity (step S13219) and returns to step S13215. By repeating this process, comprehensive evaluation values are calculated for all entities. If "Yes" is determined in step S13217, the subset generation unit 7 refers to the comprehensive evaluation value of each entity and extracts entities based on a predetermined rule. The list L5 of extracted entities is stored in the subset storage unit 13 (step S13221). An example of the predetermined rule is to extract the entities whose comprehensive evaluation value is equal to or greater than a threshold. The list L5 stored in the subset storage unit 13 is a subset of entities.

図２４（ａ）乃至（ｃ）は、上記例２の具体例を説明する図である。図２４（ａ）に示されるように、部分集合生成部７は、エンティティリストＬ３ｄ１から各エンティティＥ１乃至Ｅ４の第１の評価値Ｖｄを取得し、各エンティティの評価リストＬ４ｅ１乃至Ｌ４ｅ４に追加する。また、図２４（ｂ）に示されるように、部分集合生成部７は、エンティティリストＬ３ｄ２から各エンティティＥ１，Ｅ３，Ｅ５の第１の評価値Ｖｄを取得し、各エンティティの評価リストＬ４ｅ１，Ｌ４ｅ３，Ｌ４ｅ５に追加する。評価リストは、エンティティと関連付けられて、例えばメインメモリ等の記憶装置に格納される。部分集合生成部７は、評価リストＬ４ｅｘの各々について第１の評価値Ｖｄの総和を算出し、各エンティティＥ１乃至Ｅ５の総合評価値Ｖ１乃至Ｖ５とする。総合評価値は、エンティティと関連付けられて、例えばメインメモリ等の記憶装置に格納される。図２４（ｃ）に示されるように、部分集合生成部７は、所定のルールに基づいて、エンティティを抽出し、リストＬ５に追加する。リストＬ５は部分集合格納部１３に格納される。ここでは、総合評価値が閾値以上のエンティティＥ１とエンティティＥ３が抽出される。リストＬ５は、エンティティの部分集合である。 FIGS. 24A to 24C are diagrams illustrating a specific example of Example 2. As illustrated in FIG. 24A, the subset generation unit 7 acquires the first evaluation value Vd of each of the entities E1 to E4 from the entity list L3d1 and adds it to the evaluation lists L4e1 to L4e4 of the respective entities. Further, as illustrated in FIG. 24B, the subset generation unit 7 acquires the first evaluation value Vd of each of the entities E1, E3, and E5 from the entity list L3d2 and adds it to the evaluation lists L4e1, L4e3, and L4e5 of the respective entities. Each evaluation list is associated with its entity and stored in a storage device such as a main memory. The subset generation unit 7 calculates the sum of the first evaluation values Vd for each evaluation list L4ex and uses the sums as the comprehensive evaluation values V1 to V5 of the entities E1 to E5. Each comprehensive evaluation value is associated with its entity and stored in a storage device such as a main memory. As illustrated in FIG. 24C, the subset generation unit 7 extracts entities based on a predetermined rule and adds them to the list L5. The list L5 is stored in the subset storage unit 13. Here, the entities E1 and E3, whose comprehensive evaluation values are equal to or greater than the threshold, are extracted. The list L5 is a subset of entities.
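Example 2 can be sketched in the same way. The entity membership of L3d1 and L3d2 follows the text, but the individual Vd values and the threshold are hypothetical, because the description does not give the numbers shown in FIG. 24; they are chosen here so that E1 and E3 are extracted, as in the text.

```python
from collections import defaultdict


def comprehensive_values(scored_entity_lists, combine=sum):
    """Collect each entity's first evaluation values Vd and combine them into Vx."""
    evaluation_lists = defaultdict(list)                 # L4ex per entity
    for scored_entities in scored_entity_lists.values():
        for entity, vd in scored_entities.items():
            evaluation_lists[entity].append(vd)          # add Vd to L4ex
    return {entity: combine(values) for entity, values in evaluation_lists.items()}


def extract_by_threshold(values, threshold):
    """Keep entities whose comprehensive evaluation value Vx meets the threshold."""
    return sorted(entity for entity, vx in values.items() if vx >= threshold)


# Hypothetical Vd values; only the entity membership comes from the text.
L3 = {
    "D1": {"E1": 0.75, "E2": 0.5, "E3": 1.0, "E4": 0.25},
    "D2": {"E1": 0.75, "E3": 0.5, "E5": 0.5},
}
Vx = comprehensive_values(L3)
L5 = extract_by_threshold(Vx, threshold=1.0)  # hypothetical threshold
```

Replacing `sum` with an averaging function gives the other combination mentioned in the description without changing the extraction step.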

生成されたエンティティの部分集合は、モニターやプリンターなどの出力装置に出力され、ユーザに提供される。これにより、ユーザは、第１の検索クエリーＱ１の検索結果として、エンティティの部分集合を得ることができる。エンティティの部分集合は、エンティティのリストとして提供されても良いし、エンティティに関連付けられるデータブロックへのリンクのリストとして提供されても良い。本実施の形態では、データベースごとに検索結果が出力されるのではなく、各データベースの検索結果が統合されて出力される。出力されるエンティティは、総合評価値に基づいて第１の検索クエリーＱ１との合致度が高いと判断されたものである。したがって、ユーザは、情報ニーズに近いエンティティを求めることが可能となる。 The generated subset of entities is output to an output device such as a monitor or a printer and provided to the user. Thereby, the user can obtain a subset of entities as a search result of the first search query Q1. The subset of entities may be provided as a list of entities or as a list of links to data blocks associated with the entities. In the present embodiment, search results are not output for each database, but search results for each database are integrated and output. The output entity is determined to have a high degree of match with the first search query Q1 based on the comprehensive evaluation value. Therefore, the user can obtain an entity close to information needs.

また、本技術の実施の形態の他の例として検索装置２００が挙げられる。図２５は、検索装置２００を説明する機能ブロック図である。図２５において、検索装置１００と同一の要素は、同一の符号を付することで説明を省略する。検索装置２００は、検索装置１００における集合生成部６１，６２，６３、集合格納部１２、部分集合生成部７、部分集合格納部１３を備えないものである。すなわち、検索装置２００は、検索装置１００の検索部５１，５２，５３による処理までを行う。そして、検索結果として、検出されたデータブロックのリストや、検出されたデータブロックへのリンクのリストが出力される。検索装置１００はデータブロックに関連付けられるエンティティを検索するものであるが、検索装置２００はデータブロック自体を検索したいときに有効である。 Moreover, the search apparatus 200 is mentioned as another example of embodiment of this technique. FIG. 25 is a functional block diagram illustrating the search device 200. In FIG. 25, the same elements as those of the search device 100 are denoted by the same reference numerals, and the description thereof is omitted. The search device 200 does not include the set generation units 61, 62, and 63, the set storage unit 12, the subset generation unit 7, and the subset storage unit 13 in the search device 100. That is, the search device 200 performs processing up to the search units 51, 52, and 53 of the search device 100. As a search result, a list of detected data blocks and a list of links to the detected data blocks are output. The search device 100 searches for an entity associated with a data block, but the search device 200 is effective when it is desired to search the data block itself.

また、本技術の実施の形態の他の例として検索装置３００が挙げられる。図２６は、検索装置３００を説明する機能ブロック図である。図２６において、検索装置１００と同一の要素は、同一の符号を付することで説明を省略する。検索装置３００は、検索装置１００における部分集合生成部７、部分集合格納部１３を備えないものである。すなわち、検索装置３００は、集合生成部６１，６２，６３による処理までを行う。そして、検索結果として、集合格納部１２に格納されているエンティティリストが出力されたり、エンティティリストに含まれるエンティティへのリンクが出力されたりする。エンティティリストは、データベースごとに生成されるため、検索結果はデータベースごとに出力される。検索装置１００は、各データベースの検索結果を統合してエンティティの部分集合を出力するものであるが、検索装置３００は、データベースごとに検索結果を分けて得たい場合に有効である。 Moreover, the search apparatus 300 is mentioned as another example of embodiment of this technique. FIG. 26 is a functional block diagram illustrating the search device 300. In FIG. 26, the same elements as those of the search device 100 are denoted by the same reference numerals, and the description thereof is omitted. The search device 300 does not include the subset generation unit 7 and the subset storage unit 13 in the search device 100. That is, the search device 300 performs processing up to the set generation units 61, 62, and 63. As a search result, an entity list stored in the set storage unit 12 is output, or a link to an entity included in the entity list is output. Since the entity list is generated for each database, the search result is output for each database. The search device 100 integrates the search results of each database and outputs a subset of entities. The search device 300 is effective when it is desired to obtain search results separately for each database.

以上本技術の実施の形態について説明したが、本技術はこれに限定されるものではない。例えば、図２，図２５，図２６の機能ブロック図は一例であって、必ずしも実際のプログラムモジュール構成と一致しない。また、処理フローについても、処理結果が変わらない限り、ステップの順番を入れ替えたり、並列に実行しても良い場合もある。 Although the embodiment of the present technology has been described above, the present technology is not limited to this. For example, the functional block diagrams of FIGS. 2, 25, and 26 are examples, and do not necessarily match the actual program module configuration. As for the processing flow, as long as the processing result does not change, the order of the steps may be changed or may be executed in parallel.

なお、上で述べた検索装置１００，２００，３００は、コンピュータ装置であって、図２７に示すように、メモリ２５０１とＣＰＵ（Central Processing Unit）２５０３とハードディスク・ドライブ（ＨＤＤ：Hard Disk Drive）２５０５と表示装置２５０９に接続される表示制御部２５０７とリムーバブル・ディスク２５１１用のドライブ装置２５１３と入力装置２５１５とネットワークに接続するための通信制御部２５１７とがバス２５１９で接続されている。オペレーティング・システム（ＯＳ：Operating System）及び本実施例における処理を実施するためのアプリケーション・プログラムは、ＨＤＤ２５０５に格納されており、ＣＰＵ２５０３により実行される際にはＨＤＤ２５０５からメモリ２５０１に読み出される。ＣＰＵ２５０３は、アプリケーション・プログラムの処理内容に応じて表示制御部２５０７、通信制御部２５１７、ドライブ装置２５１３を制御して、所定の動作を行わせる。また、処理途中のデータについては、主としてメモリ２５０１に格納されるが、ＨＤＤ２５０５に格納されるようにしてもよい。本技術の実施例では、上で述べた処理を実施するためのアプリケーション・プログラムはコンピュータ読み取り可能なリムーバブル・ディスク２５１１に格納されて頒布され、ドライブ装置２５１３からＨＤＤ２５０５にインストールされる。インターネットなどのネットワーク及び通信制御部２５１７を経由して、ＨＤＤ２５０５にインストールされる場合もある。このようなコンピュータ装置は、上で述べたＣＰＵ２５０３、メモリ２５０１などのハードウエアとＯＳ及びアプリケーション・プログラムなどのプログラムとが有機的に協働することにより、上で述べたような各種機能を実現する。 The search devices 100, 200, and 300 described above are computer devices, and as shown in FIG. 27, a memory 2501, a CPU (Central Processing Unit) 2503, and a hard disk drive (HDD: Hard Disk Drive) 2505. A display control unit 2507 connected to the display device 2509, a drive device 2513 for the removable disk 2511, an input device 2515, and a communication control unit 2517 for connecting to a network are connected by a bus 2519. An operating system (OS) and an application program for executing the processing in this embodiment are stored in the HDD 2505, and are read from the HDD 2505 to the memory 2501 when executed by the CPU 2503. The CPU 2503 controls the display control unit 2507, the communication control unit 2517, and the drive device 2513 according to the processing content of the application program, and performs a predetermined operation. Further, data in the middle of processing is mainly stored in the memory 2501, but may be stored in the HDD 2505. In an embodiment of the present technology, an application program for performing the above-described processing is stored in a computer-readable removable disk 2511 and distributed, and installed from the drive device 2513 to the HDD 2505. 
In some cases, the application program is installed on the HDD 2505 via a network such as the Internet and the communication control unit 2517. Such a computer device realizes the various functions described above through the organic cooperation of hardware such as the CPU 2503 and the memory 2501 with programs such as the OS and the application program.

以上述べた本実施の形態をまとめると、以下のようになる。 The above-described embodiment can be summarized as follows.

本実施の形態に係る検索方法は、（Ａ）ユーザから入力された第１の検索クエリーに含まれる１又は複数のクエリー要素を抽出し、（Ｂ）クエリー要素ごとに、各々異なる種類のデータを格納する複数のデータベースとの関連度を算出し、（Ｃ）関連度に基づいて、クエリー要素ごとに複数のデータベースの各々との関連の有無を判定し、（Ｄ）複数のデータベースの各々について、当該データベースと関連有りと判定されたクエリー要素がある場合は、当該クエリー要素を含む第２の検索クエリーを生成し、（Ｅ）第２の検索クエリーに基づいて、複数のデータベースのうち、当該第２の検索クエリーに対応するデータベースを検索する処理を含む。なお、ここで、複数のデータベースとは、図１（ａ）や図１（ｂ）に示されるように、物理的に分離されたデータベースでも良いし、図１（ｃ）に示されるように、物理的には同一であり仮想的に分離されたデータベースであっても良い。 In the search method according to the present embodiment, (A) one or more query elements included in the first search query input by the user are extracted, and (B) different types of data are obtained for each query element. Calculating the degree of association with a plurality of databases to be stored; (C) determining whether or not there is an association with each of the plurality of databases for each query element based on the degree of association; and (D) for each of the plurality of databases. If there is a query element determined to be related to the database, a second search query including the query element is generated. (E) Based on the second search query, among the plurality of databases, A process of searching a database corresponding to the second search query. Here, the plurality of databases may be physically separated databases as shown in FIG. 1A and FIG. 1B, or as shown in FIG. The databases may be physically the same and virtually separated.

本検索方法によれば、ユーザは複数のデータベースに共通の検索クエリーとして第１の検索クエリーを入力すると、データベースの各々について、そのデータベースに専用の第２の検索クエリーが生成される。第２の検索クエリーは、担当のデータベースに関連有りと判断されたクエリー要素が含まれている。したがって、ユーザは、データベースごとに検索クエリーを入力したり、データベースのデータの種類に適する検索クエリーを考えたりしなくてもよいため、利便性が高くなる。第１の検索クエリーは複数のデータベースに共通のものであるが、これにより検索精度が落ちることはない。ユーザは、各データベースに専用の検索クエリーを入力しなくても、精度の高い検索結果を得ることができるようになる。検索クエリーにどのようなクエリー要素が含まれていても、クエリー要素ごとに関連度が判断され、各データベースとの関連の有無が判定されるため、第１の検索クエリーの自由度も高い。 According to this search method, when a user inputs a first search query as a search query common to a plurality of databases, a second search query dedicated to the database is generated for each database. The second search query includes a query element that is determined to be related to the database in charge. Therefore, the user does not have to input a search query for each database or think about a search query suitable for the type of data in the database, which increases convenience. The first search query is common to a plurality of databases, but this does not reduce the search accuracy. The user can obtain highly accurate search results without inputting a dedicated search query to each database. Regardless of what query elements are included in the search query, the degree of relevance is determined for each query element, and whether or not there is an association with each database is determined, so the degree of freedom of the first search query is also high.
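Steps (A) through (E) of the search method can be tied together in a compact sketch. Everything concrete here is an assumption made for illustration: whitespace tokenization for element extraction, the fraction of records containing an element as the degree of association, a fixed threshold for the relatedness decision, substring matching for the search, and the toy database contents.

```python
def extract_elements(first_query):
    """(A) Extract query elements. Assumption: whitespace tokenization."""
    return first_query.split()


def relevance(element, records):
    """(B) Degree of association. Assumption: fraction of records containing the element."""
    if not records:
        return 0.0
    return sum(element in record for record in records) / len(records)


def second_queries(first_query, databases, threshold=0.1):
    """(C) and (D): keep related elements per database and build its second query."""
    elements = extract_elements(first_query)
    queries = {}
    for name, records in databases.items():
        related = [e for e in elements if relevance(e, records) >= threshold]
        if related:  # a second query is generated only when a related element exists
            queries[name] = related
    return queries


def search(databases, queries):
    """(E) Search each database with its own second query (substring match on all elements)."""
    return {
        name: [r for r in databases[name] if all(e in r for e in query)]
        for name, query in queries.items()
    }


# Hypothetical databases holding different types of data.
databases = {
    "restaurants": ["sushi Nihonbashi", "tempura Ginza", "sushi Ginza"],
    "events": ["festival Nihonbashi", "concert Shibuya"],
}
q2 = second_queries("sushi Nihonbashi", databases)
results = search(databases, q2)
```

In this toy run the element "sushi" is dropped from the query sent to the events database, which is the effect the method relies on to avoid the accuracy loss of a single common query.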

また、一のエンティティに対して関連付けられた複数のデータブロックが、前記複数のデータベースに分散して格納されている場合がある。この場合、上で述べた検索する処理において、（Ｅ１）複数のデータベースのうち二以上のデータベースの各々からデータブロックが検出されると、当該データブロックに関連付けられているエンティティを特定して、当該エンティティの集合を生成し、（Ｅ２）エンティティの集合から、所定のルールに基づいて、当該エンティティの部分集合を生成する処理を実行する場合もある。これによって、二以上のデータベースの各々からデータブロックが検出されても、所定のルールに基づいて、エンティティの部分集合とすることで、統合された検索結果を得ることができるようになる。検索結果はエンティティのリストとなるため、ユーザは情報ニーズに適合した検索結果を得ることができる。ここで、所定のルールは、二以上のデータベースから得られたエンティティの集合を統合して一つの集合にするものであれば良い。例えば、各データベースから得られるエンティティの集合の和集合又は積集合としても良い。また、エンティティの各々について、第１の検索クエリーに対する適合度を示す総合評価値を算出し、総合評価値が閾値以上のエンティティのみを抽出し、部分集合としても良い。なお、ここで、データブロックとは、図１（ａ）や図１（ｂ）に示されるように、物理的に分離されたものでも良いし、図１（ｃ）に示されるように、物理的には同一であって仮想的に分離されたものでも良い。 In addition, a plurality of data blocks associated with one entity may be distributed and stored in the plurality of databases. In this case, in the search process described above, (E1) when a data block is detected from each of two or more databases among a plurality of databases, the entity associated with the data block is specified, There is a case where a set of entities is generated, and (E2) a process of generating a subset of the entities from the set of entities based on a predetermined rule may be executed. Accordingly, even if a data block is detected from each of two or more databases, an integrated search result can be obtained by using a subset of entities based on a predetermined rule. Since the search result is a list of entities, the user can obtain a search result suitable for information needs. Here, the predetermined rule may be any one that integrates a set of entities obtained from two or more databases into one set. For example, it may be a union or intersection of sets of entities obtained from each database. Further, for each entity, a comprehensive evaluation value indicating the degree of conformity to the first search query may be calculated, and only entities having a total evaluation value equal to or greater than a threshold may be extracted and set as a subset. 
Here, the data blocks may be physically separated, as shown in FIG. 1(a) and FIG. 1(b), or may be physically the same and virtually separated, as shown in FIG. 1(c).

さらに、上で述べた関連度を算出する処理において、クエリー要素ごとに、複数のデータベースの各々における当該クエリー要素の出現頻度をカウントし、当該出現頻度に基づいて、当該クエリー要素とデータベースの各々との関連度を算出するようにしても良い。特定のクエリー要素が特定のデータベースに頻繁に出現する場合、そのクエリー要素とデータベースとは関連を有する可能性が高い。このようにすれば、関連度がより適切な値となる。 Further, in the process of calculating the relevance described above, for each query element, the frequency of appearance of the query element in each of the plurality of databases is counted, and based on the frequency of appearance, the query element and each of the databases are counted. It is also possible to calculate the degree of relevance. When a specific query element frequently appears in a specific database, the query element and the database are likely to have an association. In this way, the degree of association becomes a more appropriate value.

さらに、上で述べた関連度を算出する処理が、（Ｂ１）カテゴリと当該カテゴリに含まれる複数のカテゴリ要素とを対応付けるカテゴリルールに基づいて、クエリー要素をカテゴリ要素として含むカテゴリを特定し、（Ｂ２）カテゴリルールに基づいて、特定されたカテゴリに含まれる複数のカテゴリ要素を取得し、（Ｂ３）複数のデータベースの各々について、取得した複数のカテゴリ要素の出現頻度をカウントし、（Ｂ４）出現頻度に基づいて、クエリー要素と複数のデータベースの各々との関連度を算出する処理を含むようにしても良い。例えば、データに店舗の所在地が記述されている場合、具体的な地名（「日本橋」「東京」等）に基づいて関連度が判断されると、データベースにその具体的な地名が出現しない場合は関連度が低いと誤って判断される。このようにすれば、カテゴリに基づいて関連度が算出され、関連度は更に適切な値となる。 Furthermore, the process of calculating the degree of association described above may include (B1) identifying, based on a category rule that associates a category with a plurality of category elements included in the category, the category that includes the query element as a category element, (B2) acquiring, based on the category rule, the plurality of category elements included in the identified category, (B3) counting, for each of the plurality of databases, the appearance frequency of the acquired category elements, and (B4) calculating, based on the appearance frequency, the degree of association between the query element and each of the plurality of databases. For example, when store locations are described in the data and the degree of association is judged from a specific place name (such as “Nihonbashi” or “Tokyo”), the degree of association is erroneously judged to be low if that specific place name does not appear in the database. With the above processing, the degree of association is calculated based on the category and therefore takes a more appropriate value.
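Steps (B1) through (B4) can be illustrated with the sketch below. The category rule contents, the record strings, the normalization by record count, and the fallback for elements that belong to no category are assumptions; only the overall flow comes from the text.

```python
# Hypothetical category rule: category -> category elements.
CATEGORY_RULES = {
    "place": {"Nihonbashi", "Tokyo", "Ginza"},
    "cuisine": {"sushi", "tempura"},
}


def category_of(element, rules=CATEGORY_RULES):
    """(B1) Identify the category that includes the query element."""
    for category, members in rules.items():
        if element in members:
            return category
    return None


def category_relevance(element, records, rules=CATEGORY_RULES):
    """(B2) to (B4): count records mentioning any element of the same category."""
    category = category_of(element, rules)
    # Assumption: an element outside every category is matched on its own.
    members = rules[category] if category else {element}
    if not records:
        return 0.0
    hits = sum(any(member in record for member in members) for record in records)
    return hits / len(records)


# "Nihonbashi" never appears, yet the database clearly holds place data.
store_records = ["ramen shop Ginza", "cafe Tokyo"]
place_relevance = category_relevance("Nihonbashi", store_records)
cuisine_relevance = category_relevance("sushi", store_records)
```

The example shows the failure case named in the text: a literal count for "Nihonbashi" would be zero, whereas the category-level count recognizes that the database stores place names.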

さらに、上で述べた検索する処理が、（Ｅ３）カテゴリと当該カテゴリに含まれる複数のカテゴリ要素とを対応付けるカテゴリルールに基づいて、クエリー要素をカテゴリ要素として含むカテゴリを特定し、（Ｅ４）カテゴリと検索方法とを対応付ける検索ルールに基づいて、特定されたカテゴリに対応する検索方法を特定し、（Ｅ５）特定された検索方法に基づいてクエリー要素を含む第２の検索クエリーを生成する処理を含むようにしても良い。このようにすれば、カテゴリに応じた検索方法により検索が行われ、検索の精度が更に高くなる。 Further, the search processing described above specifies (E3) a category including a query element as a category element based on a category rule that associates a category with a plurality of category elements included in the category, and (E4) a category And (E5) a process of generating a second search query including a query element based on the specified search method, specifying a search method corresponding to the specified category based on a search rule that associates the search method with It may be included. In this way, the search is performed by the search method corresponding to the category, and the search accuracy is further increased.
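Steps (E3) through (E5) amount to two table lookups followed by query construction. In the sketch below the search method names ("neighborhood", "exact"), the rule tables, the default method, and the dictionary shape of the second query are all hypothetical; the description only says that a search rule associates a category with a search method.

```python
# Hypothetical category rule and search rule.
CATEGORY_RULES = {
    "place": {"Nihonbashi", "Tokyo", "Ginza"},
    "cuisine": {"sushi", "tempura"},
}
SEARCH_RULES = {"place": "neighborhood", "cuisine": "exact"}


def build_second_query(elements, category_rules=CATEGORY_RULES,
                       search_rules=SEARCH_RULES, default_method="exact"):
    """Attach a category-specific search method to each query element."""
    query = []
    for element in elements:
        # (E3) identify the category that includes the element
        category = next(
            (c for c, members in category_rules.items() if element in members),
            None,
        )
        # (E4) look up the search method; the default is an assumption
        method = search_rules.get(category, default_method)
        # (E5) add the element and its method to the second search query
        query.append({"term": element, "method": method})
    return query


second_query = build_second_query(["sushi", "Nihonbashi"])
```

Keeping the method alongside each term lets the search step treat a place name with proximity matching while still matching a cuisine name literally.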

さらに、上で述べたエンティティの集合を生成する処理が、検索する処理にて検出されたデータブロックごとに、所定のルールに基づいて、第２の検索クエリーとの合致度を示す第１の評価値を算出する処理を含むようにしても良い。その際、上で述べたエンティティの部分集合を生成する処理が、エンティティごとに、当該エンティティに関連付けられたデータブロックの第１の評価値に基づいて、当該エンティティの評価を示す総合評価値を算出し、総合評価値に基づいて、エンティティの集合からエンティティの部分集合を生成する処理を含むようにしても良い。このようにすれば、第１の検索クエリーとの合致度が高いデータブロックの第１の評価値が高くなる。そして、データブロックの第１の評価値に基づいて、そのデータブロックに関連付けられるエンティティの総合評価値が算出される。この総合評価値は、エンティティと第１の検索クエリーとの合致度が高いほど高くなる。部分集合には第１の検索クエリーと合致度が高いエンティティが含まれることとなり、ユーザは情報ニーズに適合した検索結果を得ることができる。 Furthermore, the process of generating the set of entities described above may include calculating, for each data block detected in the search process and based on a predetermined rule, a first evaluation value indicating the degree of match with the second search query. In that case, the process of generating the subset of entities described above may include calculating, for each entity, a comprehensive evaluation value indicating the evaluation of the entity based on the first evaluation values of the data blocks associated with the entity, and generating the subset of entities from the set of entities based on the comprehensive evaluation values. In this way, a data block with a high degree of match with the first search query receives a high first evaluation value. Then, based on the first evaluation value of a data block, the comprehensive evaluation value of the entity associated with that data block is calculated. The comprehensive evaluation value increases as the degree of match between the entity and the first search query increases. The subset therefore contains entities with a high degree of match with the first search query, and the user can obtain search results that fit the information need.

なお、当該プログラムは、例えばフレキシブル・ディスク、ＣＤ−ＲＯＭなどの光ディスク、光磁気ディスク、半導体メモリ（例えばＲＯＭ）、ハードディスク等のコンピュータ読み取り可能な記憶媒体又は記憶装置に格納される。なお、処理途中のデータについては、ＲＡＭ等の記憶装置に一時保管される。 The program is stored in a computer-readable storage medium or storage device such as a flexible disk, an optical disk such as a CD-ROM, a magneto-optical disk, a semiconductor memory (for example, ROM), or a hard disk. Note that data being processed is temporarily stored in a storage device such as a RAM.

以上の実施例を含む実施形態に関し、さらに以下の付記を開示する。 The following supplementary notes are further disclosed with respect to the embodiments including the above examples.

（付記１）
ユーザから入力された第１の検索クエリーに含まれる１又は複数のクエリー要素を抽出し、
前記クエリー要素ごとに、各々異なる種類のデータを格納する複数のデータベースとの関連度を算出し、
前記関連度に基づいて、前記クエリー要素ごとに前記複数のデータベースの各々との関連の有無を判定し、
前記複数のデータベースの各々について、当該データベースと関連有りと判定されたクエリー要素がある場合は、当該クエリー要素を含む第２の検索クエリーを生成し
前記第２の検索クエリーに基づいて、前記複数のデータベースのうち、当該第２の検索クエリーに対応するデータベースを検索する
処理をコンピュータに実行させるための検索プログラム。 (Appendix 1)
A search program for causing a computer to execute a process comprising:
extracting one or more query elements included in a first search query input by a user;
calculating, for each of the query elements, a degree of association with a plurality of databases each storing a different type of data;
determining, for each of the query elements and based on the degree of association, whether the query element is related to each of the plurality of databases;
generating, for each of the plurality of databases for which there is a query element determined to be related to the database, a second search query including that query element; and
searching, based on the second search query, the database among the plurality of databases that corresponds to the second search query.

(Appendix 2)
The search program according to appendix 1, wherein a plurality of data blocks associated with one entity are stored in a distributed manner across the plurality of databases, the program further causing the computer to execute a process comprising:
in the searching, when data blocks are detected in each of two or more of the plurality of databases, identifying the entities associated with the detected data blocks and generating a set of those entities; and
generating a subset of the entities from the set of entities based on a predetermined rule.
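The set and subset generation of appendix 2 could look like the sketch below. The shape of `hits` (database name mapped to pairs of entity identifier and data block) and the rule "keep entities found in at least two databases" are hypothetical choices for this example; the text only requires some predetermined rule.

```python
def entity_set(hits):
    # hits: database name -> list of (entity_id, data_block) pairs (assumed shape).
    # Returns each entity together with the databases in which its blocks were found.
    entities = {}
    for database, blocks in hits.items():
        for entity_id, _block in blocks:
            entities.setdefault(entity_id, set()).add(database)
    return entities


def entity_subset(entities, min_databases=2):
    # Assumed rule: keep entities whose data blocks were detected in
    # at least `min_databases` different databases.
    return sorted(e for e, dbs in entities.items() if len(dbs) >= min_databases)
```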

(Appendix 3)
The search program according to appendix 1 or 2, wherein the calculating of the degree of association includes:
counting, for each query element, the frequency of occurrence of that query element in each of the plurality of databases; and
calculating the degree of association between the query element and each of the databases based on the frequency of occurrence.
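One possible reading of the frequency-based calculation in appendix 3 is sketched below. Normalising each count by the total across all databases is an assumption of this example; the text states only that the degree of association is based on the frequency of occurrence.

```python
def occurrence_counts(element, databases):
    # Count how often the element occurs in the records of each database.
    return {
        name: sum(record.count(element) for record in records)
        for name, records in databases.items()
    }


def association_degrees(element, databases):
    # Assumption: degree of association = share of all occurrences found in that database.
    counts = occurrence_counts(element, databases)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: count / total for name, count in counts.items()}
```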

(Appendix 4)
The search program according to appendix 1 or 2, wherein the calculating of the degree of association includes:
identifying, based on a category rule that associates a category with a plurality of category elements included in that category, the category that includes the query element as a category element;
acquiring, based on the category rule, the plurality of category elements included in the identified category;
counting, for each of the plurality of databases, the frequency of occurrence of the acquired category elements; and
calculating the degree of association between the query element and each of the plurality of databases based on the frequency of occurrence.
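The category-based variant of appendix 4 can be illustrated with a small sketch. The contents of `CATEGORY_RULE`, the dictionary representation of the rule, and the normalisation by the total count are all hypothetical. The point of the example is that a query element can be judged associated with a database even when the element itself never occurs there, because other members of its category do.

```python
# Hypothetical category rule: category -> category elements.
CATEGORY_RULE = {
    "place": ["Tokyo", "Osaka", "Kyoto"],
    "topic": ["search", "database"],
}


def categories_of(element, rule):
    # Identify every category that lists the query element as a category element.
    return [category for category, members in rule.items() if element in members]


def category_association(element, databases, rule):
    # Acquire all category elements of the identified categories, then count
    # their occurrences in each database.
    members = [m for category in categories_of(element, rule) for m in rule[category]]
    counts = {
        name: sum(record.count(m) for record in records for m in members)
        for name, records in databases.items()
    }
    total = sum(counts.values())
    # Assumption: normalise by the total so the degrees sum to 1 when any match exists.
    return {name: (count / total if total else 0.0) for name, count in counts.items()}
```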

(Appendix 5)
The search program according to appendix 1 or 2, wherein the searching includes:
identifying, based on a category rule that associates a category with a plurality of category elements included in that category, the category that includes the query element as a category element;
identifying, based on a search rule that associates the category with a search method, the search method corresponding to the identified category; and
searching based on the identified search method.
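A minimal sketch of the category-dependent search of appendix 5 follows. The two search methods (whole-word match and case-insensitive partial match), the rule contents, and the fallback to partial match for elements without a category are assumptions chosen for illustration.

```python
def exact_match(element, record):
    # Whole-word match: the element must equal one whitespace-separated token.
    return element in record.split()


def partial_match(element, record):
    # Case-insensitive substring match.
    return element.lower() in record.lower()


# Hypothetical rules: category -> category elements, and category -> search method.
CATEGORY_RULE = {"person": ["Sato", "Suzuki"], "topic": ["search", "index"]}
SEARCH_RULE = {"person": exact_match, "topic": partial_match}


def search_with_rule(element, records, category_rule, search_rule, default=partial_match):
    # Identify the element's category, then the search method tied to that category.
    method = default
    for category, members in category_rule.items():
        if element in members:
            method = search_rule.get(category, default)
            break
    return [record for record in records if method(element, record)]
```

Here a person name is matched as a whole word, so "Sato" does not retrieve a record containing only "Satomi", while topic words are matched loosely.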

(Appendix 6)
The search program according to appendix 2, wherein
the generating of the set of entities includes calculating, for each data block detected in the searching, a first evaluation value that indicates a degree of match with the second search query, based on a predetermined rule, and
the generating of the subset of the entities includes:
calculating, for each entity, a comprehensive evaluation value that indicates an evaluation of the entity, based on the first evaluation values of the data blocks associated with that entity; and
generating the subset of the entities from the set of entities based on the comprehensive evaluation value.
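The two-level evaluation of appendix 6 might be sketched as follows. The first evaluation value is taken here to be the fraction of second-query elements found in the data block, the comprehensive evaluation value is the sum over an entity's blocks, and the subset is the top `k` entities; each of these is an assumed rule, since the text leaves the rules unspecified.

```python
def first_evaluation(block_text, query_elements):
    # Assumed rule: fraction of the second query's elements that occur in the block.
    if not query_elements:
        return 0.0
    return sum(e in block_text for e in query_elements) / len(query_elements)


def comprehensive_scores(hits, second_queries):
    # hits: database name -> list of (entity_id, block_text) pairs (assumed shape).
    # Assumed rule: an entity's comprehensive value is the sum of its blocks' values.
    scores = {}
    for database, blocks in hits.items():
        for entity_id, text in blocks:
            value = first_evaluation(text, second_queries[database])
            scores[entity_id] = scores.get(entity_id, 0.0) + value
    return scores


def top_entities(scores, k):
    # Subset = the k best entities; ties are broken by entity identifier.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [entity_id for entity_id, _ in ranked[:k]]
```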

(Appendix 7)
A search method executed by a computer, the method comprising:
extracting one or more query elements included in a first search query input by a user;
calculating, for each of the query elements, a degree of association with a plurality of databases, each of which stores a different type of data;
determining, based on the degree of association, whether each query element is associated with the plurality of databases;
generating, for each of the plurality of databases, a second search query that includes the query element determined to be associated with that database, when such a query element exists; and
searching, based on the second search query, the database among the plurality of databases that corresponds to that second search query.

(Appendix 8)
A search device comprising:
an extraction unit that extracts one or more query elements included in a first search query input by a user;
a calculation unit that calculates, for each of the query elements, a degree of association with each of a plurality of databases, each of which stores a different type of data;
a determination unit that determines, based on the degree of association, whether each query element is associated with each of the plurality of databases;
a search query generation unit that generates, for each of the plurality of databases, a second search query that includes the query element determined to be associated with that database, when such a query element exists; and
a search unit that searches, based on the second search query, the database among the plurality of databases that corresponds to that second search query.

100, 200, 300 Search device
1 Extraction unit
2 Calculation unit
3 Determination unit
4 Search query generation unit
51, 52, 53 Search unit
61, 62, 63 Set generation unit
7 Subset generation unit
10 Query list storage unit
11 Query storage unit
12 Set storage unit
13 Subset storage unit
D1, D2, D3 Database

Claims

A search program for causing a computer to execute a process comprising:
extracting one or more query elements included in a first search query input by a user;
calculating, for each of the query elements, a degree of association with a plurality of databases, each of which stores a different type of data;
determining, based on the degree of association, whether each query element is associated with each of the plurality of databases;
generating, for each of the plurality of databases, a second search query that includes the query element determined to be associated with that database, when such a query element exists; and
searching, based on the second search query, the database among the plurality of databases that corresponds to that second search query.

The search program according to claim 1, wherein a plurality of data blocks associated with one entity are stored in a distributed manner across the plurality of databases, the program further causing the computer to execute a process comprising:
in the searching, when data blocks are detected in each of two or more of the plurality of databases, identifying the entities associated with the detected data blocks and generating a set of those entities; and
generating a subset of the entities from the set of entities based on a predetermined rule.

The search program according to claim 1 or 2, wherein the calculating of the degree of association includes:
counting, for each query element, the frequency of occurrence of that query element in each of the plurality of databases; and
calculating the degree of association between the query element and each of the databases based on the frequency of occurrence.

The search program according to claim 1, wherein the calculating of the degree of association includes:
identifying, based on a category rule that associates a category with a plurality of category elements included in that category, the category that includes the query element as a category element;
acquiring, based on the category rule, the plurality of category elements included in the identified category;
counting, for each of the plurality of databases, the frequency of occurrence of the acquired category elements; and
calculating the degree of association between the query element and each of the plurality of databases based on the frequency of occurrence.

The search program according to claim 1, wherein the searching includes:
identifying, based on a category rule that associates a category with a plurality of category elements included in that category, the category that includes the query element as a category element;
identifying, based on a search rule that associates the category with a search method, the search method corresponding to the identified category; and
searching based on the identified search method.

The search program according to claim 2, wherein
the generating of the set of entities includes calculating, for each data block detected in the searching, a first evaluation value that indicates a degree of match with the second search query, based on a predetermined rule, and
the generating of the subset of the entities includes:
calculating, for each entity, a comprehensive evaluation value that indicates an evaluation of the entity, based on the first evaluation values of the data blocks associated with that entity; and
generating the subset of the entities from the set of entities based on the comprehensive evaluation value.

A search method executed by a computer, the method comprising:
extracting one or more query elements included in a first search query input by a user;
calculating, for each of the query elements, a degree of association with a plurality of databases, each of which stores a different type of data;
determining, based on the degree of association, whether each query element is associated with the plurality of databases;
generating, for each of the plurality of databases, a second search query that includes the query element determined to be associated with that database, when such a query element exists; and
searching, based on the second search query, the database among the plurality of databases that corresponds to that second search query.

A search device comprising:
an extraction unit that extracts one or more query elements included in a first search query input by a user;
a calculation unit that calculates, for each of the query elements, a degree of association with each of a plurality of databases, each of which stores a different type of data;
a determination unit that determines, based on the degree of association, whether each query element is associated with each of the plurality of databases;
a search query generation unit that generates, for each of the plurality of databases, a second search query that includes the query element determined to be associated with that database, when such a query element exists; and
a search unit that searches, based on the second search query, the database among the plurality of databases that corresponds to that second search query.