JP2013206066A

JP2013206066A - Data retrieval system and data retrieval method

Info

Publication number: JP2013206066A
Application number: JP2012073585A
Authority: JP
Inventors: Naoharu Yamada; 直治山田; Miki Hara; 未來原; Kozo Noaki; 浩三野秋; Takeshi Naganuma; 武史長沼
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2012-03-28
Filing date: 2012-03-28
Publication date: 2013-10-07

Abstract

PROBLEM TO BE SOLVED: To generate a retrieval result whose retrieval order is appropriately evaluated even when a retrieval result related to metadata and a retrieval result related to retrieval object data are acquired separately. SOLUTION: A data retrieval system 1 comprises: a retrieval result acquisition part 33 for acquiring a data retrieval result indicating extraction data from a plurality of retrieval object data; a metadata storage part 38a for storing a plurality of metadata; a data retrieval part 32 for generating a metadata retrieval result indicating extraction metadata from the plurality of metadata; a relation degree calculation part 34 for calculating an extraction data relation degree and an extraction metadata relation degree; a retrieval result generation part 36 for calculating, for the plurality of retrieval object data, an integrated relation degree that incorporates the extraction data relation degree and the extraction metadata relation degree; and a data communication part 37 for outputting an integrated retrieval result, in which the data retrieval result and the metadata retrieval result are integrated, in an order based on the integrated relation degree.

Description

本発明は、ネットワークを介してデータを検索するデータ検索システム及びデータ検索方法に関するものである。 The present invention relates to a data search system and a data search method for searching for data via a network.

従来から、インターネット等の通信ネットワーク内のデータを検索キーワードを用いて検索するサービスが提供されている。このようなサービスに関連する技術に関して、例えば、下記特許文献１には、複数の検索実行装置によって得られた検索結果を、情報の質の高さに応じた重み付けより統合して出力する分散型検索方法が開示されている。このような機能によれば、分散型検索装置における検索結果として検索要求に適合した結果が得られる。 Conventionally, services have been provided for searching data in a communication network such as the Internet using a search keyword. As technology related to such services, for example, Patent Document 1 below discloses a distributed search method in which search results obtained by a plurality of search execution devices are integrated and output with weighting according to the quality of the information. With such a function, a result matching the search request is obtained as the search result of the distributed search device.

特開２００３−２４８６９１号公報 JP 2003-248691 A

しかしながら、上述した従来の検索方法では、データ自体の格納先と、そのデータに付加されたメタデータ等の付加データの格納先が異なり、データ自体と付加データとに関する検索結果が別々に得られる場合に、それぞれの検索結果の関連性を適切に評価した検索結果を得ることは困難である。例えば、あるデータについてはメタデータが検索キーワードに一致し、他のデータについてはメタデータとデータ自体の両方が検索キーワードに一致するような場合に、両方のデータの検索順位を適切に評価できない傾向にある。 However, in the conventional search method described above, when the storage location of the data itself differs from the storage location of additional data such as metadata added to that data, and search results for the data itself and for the additional data are obtained separately, it is difficult to obtain a search result in which the relevance of each search result is appropriately evaluated. For example, when the metadata matches the search keyword for one piece of data, while both the metadata and the data itself match the search keyword for another piece of data, the search ranks of the two pieces of data tend not to be evaluated appropriately.

そこで、本発明は、かかる課題に鑑みて為されたものであり、検索対象データに付加される付加データに関する検索結果と、検索対象データに関する検索結果が別々に得られる場合であっても、適切に検索順位が評価された検索結果を生成することが可能なデータ検索システム及びデータ検索方法を提供することを目的とする。 The present invention has been made in view of this problem, and an object thereof is to provide a data search system and a data search method capable of generating a search result whose search rank is appropriately evaluated, even when a search result for additional data added to search target data and a search result for the search target data are obtained separately.

上記課題を解決するため、本発明のデータ検索システムは、複数の検索対象データのうちから検索キーワードによって抽出された複数の抽出データを示す第１の検索結果を取得する検索結果取得手段と、複数の検索対象データに付加された複数の付加データを格納する付加データ格納手段と、複数の付加データのうちから検索キーワードに関連する複数の抽出付加データを抽出して、複数の抽出付加データが付加された複数の検索対象データを示す第２の検索結果を生成するデータ検索手段と、第１の検索結果取得手段によって取得された第１の検索結果を基に、複数の抽出データの検索キーワードとの第１の関連度を計算し、データ検索手段によって取得された第２の検索結果を基に、複数の抽出付加データに対応する複数の検索対象データの検索キーワードとの第２の関連度を計算する関連度算出手段と、複数の抽出データと複数の抽出付加データに対応する複数の検索対象データとに対して、第１の関連度及び第２の関連度を加味した第３の関連度を計算する検索結果生成手段と、第１の検索結果及び第２の検索結果を合わせた第３の検索結果を、第３の関連度を基にした順位で出力する出力手段と、を備える。 In order to solve the above problem, a data search system according to the present invention comprises: search result acquisition means for acquiring a first search result indicating a plurality of extracted data extracted by a search keyword from among a plurality of search target data; additional data storage means for storing a plurality of additional data added to the plurality of search target data; data search means for extracting, from among the plurality of additional data, a plurality of extracted additional data related to the search keyword, and generating a second search result indicating the plurality of search target data to which the plurality of extracted additional data are added; relevance calculation means for calculating, based on the first search result acquired by the search result acquisition means, a first relevance of the plurality of extracted data to the search keyword, and calculating, based on the second search result obtained by the data search means, a second relevance to the search keyword of the plurality of search target data corresponding to the plurality of extracted additional data; search result generation means for calculating, for the plurality of extracted data and the plurality of search target data corresponding to the plurality of extracted additional data, a third relevance that takes the first relevance and the second relevance into account; and output means for outputting a third search result, which combines the first search result and the second search result, in an order based on the third relevance.

或いは、本発明のデータ検索方法は、検索結果取得手段が、複数の検索対象データのうちから検索キーワードによって抽出された複数の抽出データを示す第１の検索結果を取得する検索結果取得ステップと、付加データ格納手段が、複数の検索対象データに付加された複数の付加データを格納する付加データ格納ステップと、データ検索手段が、複数の付加データのうちから検索キーワードに関連する複数の抽出付加データを抽出して、複数の抽出付加データが付加された複数の検索対象データを示す第２の検索結果を生成するデータ検索ステップと、関連度算出手段が、第１の検索結果取得手段によって取得された第１の検索結果を基に、複数の抽出データの検索キーワードとの第１の関連度を計算し、データ検索手段によって取得された第２の検索結果を基に、複数の抽出付加データに対応する複数の検索対象データの検索キーワードとの第２の関連度を計算する関連度算出ステップと、検索結果生成手段が、複数の検索結果データと複数の抽出付加データに対応する複数の検索対象データとに対して、第１の関連度及び第２の関連度を加味した第３の関連度を計算する検索結果生成ステップと、出力手段が、第１の検索結果及び第２の検索結果を合わせた第３の検索結果を、第３の関連度を基にした順位で出力する出力ステップと、を備える。 Alternatively, a data search method according to the present invention comprises: a search result acquisition step in which search result acquisition means acquires a first search result indicating a plurality of extracted data extracted by a search keyword from among a plurality of search target data; an additional data storage step in which additional data storage means stores a plurality of additional data added to the plurality of search target data; a data search step in which data search means extracts, from among the plurality of additional data, a plurality of extracted additional data related to the search keyword, and generates a second search result indicating the plurality of search target data to which the plurality of extracted additional data are added; a relevance calculation step in which relevance calculation means calculates, based on the first search result acquired by the search result acquisition means, a first relevance of the plurality of extracted data to the search keyword, and calculates, based on the second search result obtained by the data search means, a second relevance to the search keyword of the plurality of search target data corresponding to the plurality of extracted additional data; a search result generation step in which search result generation means calculates, for the plurality of search result data and the plurality of search target data corresponding to the plurality of extracted additional data, a third relevance that takes the first relevance and the second relevance into account; and an output step in which output means outputs a third search result, which combines the first search result and the second search result, in an order based on the third relevance.

このようなデータ検索システム、或いはデータ検索方法によれば、検索キーワードを基に複数の検索対象データを対象に検索された第１の検索結果が取得されると共に、同じ検索キーワードを基に複数の付加データを対象に検索された第２の検索結果が生成される。そして、第１の検索結果を基に複数の抽出データの第１の関連度が計算され、第２の検索結果を基に複数の抽出付加データに対応する複数の検索対象データの第２の関連度が生成された後、両方の検索結果に含まれる複数の検索対象データ毎に第１の関連度及び第２の関連度が加味された第３の関連度が計算され、両方の検索結果を合わせた第３の検索結果が第３の関連度に基づく順位で出力される。これにより、複数の検索対象データを対象にした検索と、それらの検索対象データに対応する付加データを対象にした検索結果が別々に得られる場合であっても、両方の検索結果における検索キーワードとの関連度を加味することにより、検索対象データ毎の関連度が適切に評価された検索結果が得られる。その結果、検索対象データに対して適切に検索順位が評価された検索結果を生成することができる。 According to such a data search system or data search method, a first search result searched for a plurality of search target data based on a search keyword is acquired, and a plurality of search results are acquired based on the same search keyword. A second search result searched for the additional data is generated. Then, the first relevance of the plurality of extracted data is calculated based on the first search result, and the second relevance of the plurality of search target data corresponding to the plurality of extracted additional data is calculated based on the second search result. After the degree is generated, a third relevance level in which the first relevance level and the second relevance level are added is calculated for each of a plurality of search target data included in both search results. The combined third search result is output in the order based on the third degree of association. Thus, even when a search for a plurality of search target data and a search result for additional data corresponding to the search target data are obtained separately, the search keyword in both search results By adding the relevance level, a search result in which the relevance level for each search target data is appropriately evaluated can be obtained. As a result, it is possible to generate a search result in which the search rank is appropriately evaluated for the search target data.

関連度算出手段は、複数の抽出データ中の検索キーワードの出現回数を基に第１の関連度を計算する、ことが好ましい。こうすれば、第１の検索結果における抽出データの関連度を簡易に求めることができる。 The relevance calculation means preferably calculates the first relevance based on the number of appearances of the search keyword in the plurality of extracted data. In this way, the relevance of the extracted data in the first search result can be easily obtained.

また、関連度算出手段は、第１の検索結果中の複数の抽出データの順位を基に第１の関連度を計算する、ことも好ましい。このようにしても、第１の検索結果における抽出データの関連度を簡易に求めることができる。 It is also preferable that the relevance degree calculating means calculates the first relevance degree based on the ranks of the plurality of extracted data in the first search result. Even in this way, the relevance of the extracted data in the first search result can be easily obtained.

さらに、検索結果生成手段は、複数の抽出データと複数の抽出付加データに対応する複数の検索対象データとに対して、第１の関連度及び第２の関連度を重み付け加算して第３の関連度を計算する、ことも好ましい。この場合、第１及び第２の検索結果における検索キーワードとの関連度を用いて、検索対象データ毎の関連度をより適切に評価することができる。 Furthermore, it is also preferable that the search result generation means calculates the third relevance by weighted addition of the first relevance and the second relevance, for the plurality of extracted data and the plurality of search target data corresponding to the plurality of extracted additional data. In this case, the relevance of each search target data can be evaluated more appropriately using the relevance to the search keyword in the first and second search results.

またさらに、出力手段により出力された第３の検索結果に対するユーザのデータ選択履歴を基に、重み付け加算時に用いる重み付けの値を動的に変更するパラメータ変更手段をさらに備える、ことも好ましい。かかるパラメータ変更手段を備えれば、ユーザにとって利用価値の高い検索結果を生成することができる。 It is also preferable to further include parameter changing means for dynamically changing a weighting value used at the time of weighting addition based on a user data selection history for the third search result output by the output means. With such parameter changing means, it is possible to generate a search result having a high utility value for the user.

さらにまた、パラメータ変更手段は、データ選択履歴が示す第１の選択結果を選択する回数と、データ選択履歴が示す第２の選択結果を選択する回数との比較に基づいて、重み付けの値を変更する、ことも好ましい。このようにすれば、ユーザにとって利用価値の高い検索結果をより確実に生成することができる。 Furthermore, it is also preferable that the parameter changing means changes the weighting value based on a comparison between the number of times the first selection result is selected and the number of times the second selection result is selected, as indicated by the data selection history. In this way, a search result of high utility value to the user can be generated more reliably.
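As an illustration of the parameter change described above, the following is a minimal sketch in Python. The text states only that the weighting value is changed based on a comparison of the two selection counts; the step-wise update rule, the step size of 0.05, the clamping range, and the function name `update_weight` are all assumptions made for this example. The sketch further assumes that the weight is the one applied to the first relevance, so that it is raised when results from the first search result are selected more often.

```python
def update_weight(weight: float, first_count: int, second_count: int,
                  step: float = 0.05) -> float:
    """Nudge the weight toward whichever kind of result users select more often.

    Hypothetical rule: the source only says the weight is changed based on a
    comparison of the two counts; the fixed step and the clamp are assumptions.
    """
    if first_count > second_count:
        weight += step   # first-search-result items were chosen more often
    elif first_count < second_count:
        weight -= step   # second-search-result items were chosen more often
    # Keep the weight inside [0, 1] so that both weights stay non-negative.
    return min(1.0, max(0.0, weight))


new_weight = update_weight(0.3, first_count=12, second_count=5)
print(round(new_weight, 2))
```

With equal counts the weight is left unchanged, which avoids drifting when the selection history gives no preference.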

本発明によれば、検索対象データに付加される付加データに関する検索結果と、検索対象データに関する検索結果が別々に得られる場合であっても、適切に検索順位が評価された検索結果を生成することができる。 According to the present invention, even when a search result related to additional data added to search target data and a search result related to search target data are obtained separately, a search result whose search rank is appropriately evaluated is generated. be able to.

本発明の好適な一実施形態にかかるデータ検索システム１の概略構成図である。 FIG. 1 is a schematic configuration diagram of a data search system 1 according to a preferred embodiment of the present invention.

図１の端末装置２及びメタデータ検索用サーバ装置３を構成する情報処理装置のハードウェア構成を示すブロック図である。 FIG. 2 is a block diagram showing the hardware configuration of the information processing apparatus constituting the terminal device 2 and the metadata search server device 3 of FIG. 1.

図１のメタデータ格納部３８ａに格納されたメタデータのデータ構成の一例を示す図である。 FIG. 3 is a diagram showing an example of the data structure of the metadata stored in the metadata storage unit 38a of FIG. 1.

図１の選択履歴格納部３８ｃに格納された統合検索結果データのデータ構成の一例を示す図である。 FIG. 4 is a diagram showing an example of the data structure of the integrated search result data stored in the selection history storage unit 38c of FIG. 1.

図１のデータ格納部４３ａに格納された検索対象データのデータ構成の一例を示す図である。 FIG. 5 is a diagram showing an example of the data structure of the search target data stored in the data storage unit 43a of FIG. 1.

図１のデータ検索システム１による統合検索結果データ生成時の動作を示すフローチャートである。 FIG. 6 is a flowchart showing the operation of the data search system 1 of FIG. 1 when generating integrated search result data.

図１のデータ検索システム１による統合検索結果の出力例を示す図である。 FIG. 7 is a diagram showing an output example of the integrated search result produced by the data search system 1 of FIG. 1.

本発明の変形例にかかるデータ検索システム１０１の概略構成図である。 FIG. 8 is a schematic configuration diagram of a data search system 101 according to a modification of the present invention.

以下、図面とともに本発明によるデータ検索システム及びデータ検索方法の好適な実施形態について詳細に説明する。なお、図面の説明においては同一要素には同一符号を付し、重複する説明を省略する。 Hereinafter, preferred embodiments of a data search system and a data search method according to the present invention will be described in detail with reference to the drawings. In the description of the drawings, the same elements are denoted by the same reference numerals, and redundant description is omitted.

図１は、本発明の好適な一実施形態にかかるデータ検索システム１の概略構成図である。図１に示すデータ検索システム１は、ユーザにより端末装置２を利用して記憶されたデータやインターネット上に公開されたデータ等の検索対象データを管理し、検索対象データの検索処理を実行する通信システムである。このような検索対象データとしては、スケジュールデータ、写真データ、文書データ、ＳＮＳ（Social Networking Service）投稿データ、電子メールデータ、音楽データ、ニュースデータ等が挙げられる。詳細には、データ検索システム１は、データ検索を実行しようとするユーザが使用する端末装置２と、検索対象データに付加されたメタデータ（付加データ）を管理するメタデータ検索用サーバ装置３とにより構成されている。この端末装置２とメタデータ検索用サーバ装置３とは、移動体通信方式を採用した移動体通信ネットワークや有線通信ネットワーク等によって構成される通信ネットワークＮＷを介して、相互にデータ通信を行うことが可能とされている。さらに、端末装置２及びメタデータ検索用サーバ装置３は、通信ネットワークＮＷを介して、検索対象データを管理するデータ検索用サーバ装置４との間でデータ通信を行うことが可能とされている。なお、メタデータ検索用サーバ装置３及びデータ検索用サーバ装置４は、１台のサーバ装置で構成されていてもよいし、複数のサーバ装置が連携して動作するサーバシステムであってもよい。端末装置２は、携帯電話端末、スマートフォン、ＰＤＡ等に代表される端末装置である。 FIG. 1 is a schematic configuration diagram of a data search system 1 according to a preferred embodiment of the present invention. The data search system 1 shown in FIG. 1 is a communication system that manages search target data, such as data stored by a user using the terminal device 2 and data published on the Internet, and executes search processing on the search target data. Examples of such search target data include schedule data, photo data, document data, SNS (Social Networking Service) post data, e-mail data, music data, and news data. Specifically, the data search system 1 comprises a terminal device 2 used by a user who intends to execute a data search, and a metadata search server device 3 that manages metadata (additional data) added to the search target data. The terminal device 2 and the metadata search server device 3 can perform data communication with each other via a communication network NW composed of a mobile communication network employing a mobile communication scheme, a wired communication network, or the like. Further, the terminal device 2 and the metadata search server device 3 can perform data communication, via the communication network NW, with a data search server device 4 that manages the search target data. Note that the metadata search server device 3 and the data search server device 4 may each be configured as a single server device or as a server system in which a plurality of server devices operate in cooperation. The terminal device 2 is a terminal device typified by a mobile phone terminal, a smartphone, a PDA, or the like.

図２は、図１のデータ検索システム１の端末装置２或いはメタデータ検索用サーバ装置３を構成する情報処理装置のハードウェア構成を示すブロック図である。この情報処理装置１００は、物理的には、ＣＰＵ５１と、主記憶装置であるＲＡＭ５２及びＲＯＭ５３と、ハードディスク装置等の補助記憶装置５６と、入力デバイスである入力キー、タッチパネル、マウス等の入力装置５５と、ディスプレイ、スピーカ等の出力装置５７と、他の端末装置やサーバ装置との間での通信ネットワークＮＷを介したデータの送受信を司る通信モジュール５４とを含む装置として構成されている。端末装置２或いはメタデータ検索用サーバ装置３によって実現される機能は、図２に示すＣＰＵ５１、ＲＡＭ５２等のハードウェア上に所定のプログラムを読み込ませることにより、ＣＰＵ５１の制御のもとで通信モジュール５４、入力装置５５、出力装置５７を動作させるとともに、ＲＡＭ５２や補助記憶装置５６におけるデータの読み出し及び書き込みを行うことで実現される。 FIG. 2 is a block diagram showing the hardware configuration of the information processing apparatus constituting the terminal device 2 or the metadata search server device 3 of the data search system 1 of FIG. 1. The information processing apparatus 100 is physically configured as an apparatus including a CPU 51, a RAM 52 and a ROM 53 that are main storage devices, an auxiliary storage device 56 such as a hard disk device, an input device 55 such as input keys, a touch panel, or a mouse, an output device 57 such as a display or a speaker, and a communication module 54 that handles the transmission and reception of data to and from other terminal devices and server devices via the communication network NW. The functions of the terminal device 2 or the metadata search server device 3 are realized by loading a predetermined program onto hardware such as the CPU 51 and the RAM 52 shown in FIG. 2, thereby operating the communication module 54, the input device 55, and the output device 57 under the control of the CPU 51, and by reading and writing data in the RAM 52 and the auxiliary storage device 56.

図１に戻って、メタデータ検索用サーバ装置３は、機能的な構成要素として、インデックス作成部３１、データ検索部（データ検索手段）３２と、検索結果取得部（検索結果取得手段）３３と、関連度算出部（関連度算出手段）３４と、パラメータ変更部（パラメータ変更手段）３５と、検索結果生成部（検索結果生成手段）３６と、データ通信部（出力手段）３７と、メタデータ格納部３８ａと、インデックス格納部３８ｂと、選択履歴格納部３８ｃとを備えている。 Returning to FIG. 1, the metadata search server device 3 includes, as functional components, an index creation unit 31, a data search unit (data search means) 32, a search result acquisition unit (search result acquisition means) 33, a relevance calculation unit (relevance calculation means) 34, a parameter changing unit (parameter changing means) 35, a search result generation unit (search result generation means) 36, a data communication unit (output means) 37, a metadata storage unit 38a, an index storage unit 38b, and a selection history storage unit 38c.

まず、メタデータ検索用サーバ装置３の各構成要素の機能について詳細に説明する。 First, the function of each component of the metadata search server device 3 will be described in detail.

インデックス作成部３１は、端末装置２のユーザによる要求を受け付けたことを契機に、メタデータ格納部３８ａに格納されたメタデータの検索用のインデックスを作成する。このインデックスは、データ検索部３２によってメタデータ格納部３８ａに格納されたメタデータを検索する際に検索処理を高速化するために参照される。 The index creation unit 31 creates an index for searching the metadata stored in the metadata storage unit 38a when a request from the user of the terminal device 2 is received. This index is referred to in order to speed up the search process when searching for metadata stored in the metadata storage unit 38a by the data search unit 32.

データ検索部３２は、端末装置２から検索クエリ（検索要求）を受け付けた際に、メタデータ格納部３８ａに格納された複数の検索対象データに付加されたメタデータのうちから、その検索クエリに含まれる検索キーワードに関連する複数のメタデータを抽出メタデータとして抽出する。この際、データ検索部３２は、インデックス格納部３８ｂに格納されたインデックスを参照しながらデータ検索処理を実行する。 When the data search unit 32 receives a search query (search request) from the terminal device 2, the data search unit 32 selects the search query from among the metadata added to the plurality of search target data stored in the metadata storage unit 38a. A plurality of metadata related to the included search keyword is extracted as extracted metadata. At this time, the data search unit 32 executes a data search process while referring to the index stored in the index storage unit 38b.

ここで、図３には、メタデータ格納部３８ａに格納されたメタデータの構成の一例を示している。同図に示すように、メタデータ格納部３８ａには複数のメタデータが格納されており、これらのメタデータは、データ検索用サーバ装置４に記憶されている複数の検索対象データの１つ１つに対応して予め生成されて格納される。例えば、複数の項目を有するメタデータとして、“作成日時：2012/1/30 11:00”、“作成位置：岡山県岡山市北区”、“作成時スケジュール：岡山出張Ａ社会議”、“同行者：鈴木”、“ファイル名：会議資料１”、及び“キーワード：ＳＮＳサービス調査”等の複数の項目のデータが、対応する検索対象データを識別するためのＩＤ“d001”に関連付けて格納されている。このように、メタデータ格納部３８ａには、データ検索用サーバ装置４に格納されている複数の検索対象データに付加された複数のメタデータが記憶されている。このようなメタデータには、“作成日時”、“作成位置”、“作成時スケジュール”、“同行者”、及び“ファイル名”のように端末装置２によって検索対象データに自動的に付与される項目と、“キーワード”のように端末装置２においてユーザから登録された項目の２種類が含まれる。データ検索部３２は、図３に示すようなメタデータを参照しながら、メタデータのいずれかの項目のデータ中に検索キーワードを含むようなＩＤを、抽出メタデータを特定する複数のＩＤとして抽出する。例えば、検索キーワードが“ＳＮＳ”の場合は、図３に示すようなメタデータのなかから、項目“キーワード”に“ＳＮＳ”を含む２つのＩＤ“d001”、”d002”を抽出する。そして、データ検索部３２は、抽出メタデータを特定する複数のＩＤを示すメタデータ検索結果を作成し関連度算出部３４に引き渡す。データ検索部３２は、このメタデータ検索結果を、検索キーワードとの一致度或いは検索キーワードの出現件数を基に決定した検索順位に従ってＩＤを並べて作成する。 FIG. 3 shows an example of the structure of the metadata stored in the metadata storage unit 38a. As shown in the figure, a plurality of metadata are stored in the metadata storage unit 38a, each generated and stored in advance in correspondence with one of the plurality of search target data stored in the data search server device 4. For example, as metadata having a plurality of items, the data of items such as “creation date/time: 2012/1/30 11:00”, “creation position: Kita-ku, Okayama City, Okayama Prefecture”, “schedule at creation: Okayama business trip, Company A meeting”, “accompanying person: Suzuki”, “file name: Meeting material 1”, and “keyword: SNS service survey” are stored in association with the ID “d001” that identifies the corresponding search target data. In this way, the metadata storage unit 38a stores a plurality of metadata added to the plurality of search target data stored in the data search server device 4. Such metadata includes two types of items: items automatically assigned to the search target data by the terminal device 2, such as “creation date/time”, “creation position”, “schedule at creation”, “accompanying person”, and “file name”, and items registered by the user on the terminal device 2, such as “keyword”. Referring to metadata such as that shown in FIG. 3, the data search unit 32 extracts the IDs whose metadata contains the search keyword in the data of any item, as a plurality of IDs specifying the extracted metadata. For example, when the search keyword is “SNS”, the two IDs “d001” and “d002”, whose item “keyword” contains “SNS”, are extracted from the metadata shown in FIG. 3. The data search unit 32 then creates a metadata search result indicating the plurality of IDs specifying the extracted metadata and passes it to the relevance calculation unit 34. The data search unit 32 creates this metadata search result by arranging the IDs according to a search rank determined based on the degree of match with the search keyword or the number of occurrences of the search keyword.
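The keyword extraction over metadata items described above can be sketched as follows in Python. This is an illustrative sketch only: the in-memory dictionary `METADATA` is a hypothetical stand-in for the metadata storage unit 38a, the record for “d001” follows the example in the text, and the records for “d002” and “d003” as well as the function name `search_metadata` are invented for the example. Ranking by occurrence count is one of the two options the text mentions; the index lookup is omitted.

```python
from typing import Dict, List

# Hypothetical stand-in for the metadata storage unit 38a.
# "d001" follows the example in the text; "d002" and "d003" are invented.
METADATA: Dict[str, Dict[str, str]] = {
    "d001": {
        "creation date/time": "2012/1/30 11:00",
        "creation position": "Kita-ku, Okayama City, Okayama Prefecture",
        "schedule at creation": "Okayama business trip, Company A meeting",
        "accompanying person": "Suzuki",
        "file name": "Meeting material 1",
        "keyword": "SNS service survey",
    },
    "d002": {"file name": "Memo", "keyword": "SNS trends"},
    "d003": {"file name": "Photo", "keyword": "travel"},
}


def search_metadata(keyword: str, store: Dict[str, Dict[str, str]]) -> List[str]:
    """Return IDs whose metadata contains the keyword in any item.

    IDs are ordered by the number of keyword occurrences (descending),
    with ties broken by ID so that the result is deterministic.
    """
    hits = []
    for data_id, items in store.items():
        count = sum(value.count(keyword) for value in items.values())
        if count > 0:
            hits.append((count, data_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [data_id for _, data_id in hits]


print(search_metadata("SNS", METADATA))  # ['d001', 'd002']
```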

ここで、データ検索部３２は、検索キーワードに日時情報が含まれている場合には、その日時情報が示す時刻に対して時間的に近似する項目を有するメタデータを抽出する。例えば、検索キーワードに日時情報“2012/1/29”を含む場合には、その日時情報の示す日に対して前後１日の範囲“2012/1/28”〜“2012/1/30”の時刻情報を含むメタデータを抽出する。また、データ検索部３２は、検索キーワードに位置情報が含まれている場合には、その位置情報が示す位置に対して空間的に近似する項目を有するメタデータを抽出する。例えば、検索キーワードに位置情報“岡山県岡山市”を含む場合には、当該地名の一致（一部一致含む）を判断してもよいし、当該地名を緯度経度に変換することで、空間的な近接性（位置が近似するか否か）の判断を行ってもよい。 Here, when the search keyword includes date and time information, the data search unit 32 extracts metadata having an item that is temporally close to the time indicated by that date and time information. For example, when the search keyword includes the date and time information “2012/1/29”, metadata containing time information in the range of one day before and after the indicated date, “2012/1/28” to “2012/1/30”, is extracted. Likewise, when the search keyword includes position information, the data search unit 32 extracts metadata having an item that is spatially close to the position indicated by that position information. For example, when the search keyword includes the position information “Okayama City, Okayama Prefecture”, the data search unit 32 may judge whether the place name matches (including a partial match), or may convert the place name into latitude and longitude and judge spatial proximity (whether or not the positions are close).
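The temporal and place-name matching described above can be sketched as follows in Python. The function names `date_matches` and `place_matches` are hypothetical, the timestamp format is assumed to be the “YYYY/M/D H:MM” form used in the metadata example, and the one-day window follows the example in the text. Only the partial place-name match is shown; the latitude and longitude conversion is omitted because the text gives no distance threshold.

```python
from datetime import date, datetime, timedelta


def date_matches(query_day: date, created_at: str, window_days: int = 1) -> bool:
    """Return True if created_at falls within +/- window_days of query_day."""
    created_day = datetime.strptime(created_at, "%Y/%m/%d %H:%M").date()
    return abs(created_day - query_day) <= timedelta(days=window_days)


def place_matches(query_place: str, created_place: str) -> bool:
    """Partial place-name match, accepted in either direction."""
    return query_place in created_place or created_place in query_place


print(date_matches(date(2012, 1, 29), "2012/1/30 11:00"))  # True
print(date_matches(date(2012, 1, 29), "2012/2/1 9:00"))    # False
print(place_matches("Okayama City", "Kita-ku, Okayama City, Okayama Prefecture"))  # True
```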

図１に戻って、検索結果取得部３３は、端末装置２から検索クエリを受け付けた際に、その検索クエリをデータ検索用サーバ装置４に転送し、データ検索用サーバ装置４から検索対象データの中から抽出された抽出データに関するデータ検索結果を取得する。このデータ検索結果は、データ検索用サーバ装置４において所定の方法により決定された検索順位に従ってＩＤを並べて作成されている。ここで、検索順位の決定方法については、既存の種々のアルゴリズムを用いることができる。また、データ検索用サーバ装置４のデータ検索結果には、各抽出データ中の検索キーワードの出現件数が含まれていてもよい。例えば、検索順位がＩＤ“d001”“d002”、及び“d004”の順であった場合には、この順番で並べられたＩＤが、それぞれの抽出データ中の検索キーワードの出現件数とともにデータ検索結果に含まれている。そして、検索結果取得部３３は、取得したデータ検索結果を関連度算出部３４に引き渡す。 Returning to FIG. 1, when the search result acquisition unit 33 receives a search query from the terminal device 2, it transfers the search query to the data search server device 4 and acquires, from the data search server device 4, a data search result concerning the extracted data extracted from the search target data. This data search result is created by arranging IDs according to a search rank determined by a predetermined method in the data search server device 4. Various existing algorithms can be used to determine the search rank. The data search result of the data search server device 4 may also include the number of occurrences of the search keyword in each extracted data. For example, when the search rank is in the order of the IDs “d001”, “d002”, and “d004”, the data search result contains the IDs arranged in this order together with the number of occurrences of the search keyword in each extracted data. The search result acquisition unit 33 then passes the acquired data search result to the relevance calculation unit 34.

関連度算出部３４は、検索結果取得部３３によって取得されたデータ検索結果を基に、データ検索用サーバ装置４によって抽出された各抽出データに関する検索キーワードに対する関連度（抽出データ関連度）を計算する。さらに、データ検索部３２によって作成されたメタデータ検索結果を基に、データ検索部３２によって抽出された各抽出メタデータに対応する検索対象データの検索キーワードに対する関連度（抽出メタデータ関連度）を計算する。すなわち、関連度算出部３４は、各抽出データ中における検索キーワードの出現回数Ｎ_Ｋと、全ての検索対象データ中の検索キーワードの出現回数の総数Ｎ_{ＫＴＯＴＡＬ}とに基づいて、下記式；
Ｖ_Ｒ１＝Ｎ_Ｋ／Ｎ_{ＫＴＯＴＡＬ}
を用いて、各抽出データの関連度Ｖ_Ｒ１を算出する。また、関連度算出部３４は、各抽出メタデータの関連度Ｖ_Ｒ２を全て固定値１と算出する。 The relevance calculation unit 34 calculates, based on the data search result acquired by the search result acquisition unit 33, the relevance to the search keyword (extracted data relevance) of each extracted data extracted by the data search server device 4. Further, based on the metadata search result created by the data search unit 32, it calculates the relevance to the search keyword (extracted metadata relevance) of the search target data corresponding to each extracted metadata extracted by the data search unit 32. Specifically, based on the number of occurrences N_K of the search keyword in each extracted data and the total number N_{KTOTAL} of occurrences of the search keyword in all the search target data, the relevance calculation unit 34 calculates the relevance V_{R1} of each extracted data using the following formula:
V_{R1} = N_K / N_{KTOTAL}
The relevance calculation unit 34 also sets the relevance V_{R2} of every extracted metadata to the fixed value 1.
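The occurrence-based calculation of the extracted data relevance, V_{R1} = N_K / N_{KTOTAL}, together with the fixed extracted metadata relevance of 1, can be sketched as follows in Python. The function names and the occurrence counts in the usage lines are hypothetical values chosen for illustration; returning 0 when the total is not positive is an assumption, since the text does not cover that case.

```python
from typing import Dict, Iterable


def occurrence_relevance(occurrences: Dict[str, int],
                         total_occurrences: int) -> Dict[str, float]:
    """V_R1 = N_K / N_KTOTAL for each extracted data ID.

    occurrences maps each ID to N_K; total_occurrences is N_KTOTAL.
    A non-positive total yields 0.0 for every ID (an assumption).
    """
    if total_occurrences <= 0:
        return {data_id: 0.0 for data_id in occurrences}
    return {data_id: n_k / total_occurrences
            for data_id, n_k in occurrences.items()}


def fixed_metadata_relevance(ids: Iterable[str]) -> Dict[str, float]:
    """V_R2 is the fixed value 1 for every extracted metadata ID."""
    return {data_id: 1.0 for data_id in ids}


# Hypothetical counts: the keyword appears 8, 1 and 1 times, 10 in total.
v_r1 = occurrence_relevance({"d001": 8, "d002": 1, "d004": 1}, 10)
v_r2 = fixed_metadata_relevance(["d001", "d002"])
print(v_r1["d001"], v_r2["d001"])  # 0.8 1.0
```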

ここで、関連度算出部３４は、上記式の代わりに、データ検索結果に含まれる抽出データの件数Ｎ_Ｄと、該当抽出データの検索順位Ｎ_{ＯＲＤＥＲ}とを基に、下記式；
Ｖ_Ｒ１＝（Ｎ_Ｄ−Ｎ_{ＯＲＤＥＲ}）／Ｎ_Ｄ
を用いて、各抽出データの関連度Ｖ_Ｒ１を算出してもよい。また、関連度算出部３４は、抽出データ関連度Ｖ_Ｒ１の算出方法と同様にして、抽出メタデータ関連度Ｖ_Ｒ２を計算してもよい。関連度算出部３４は、算出した抽出データ関連度Ｖ_Ｒ１及び抽出メタデータ関連度Ｖ_Ｒ２を、対応する検索対象データを識別するＩＤとともに検索結果生成部３６に引き渡す。 Here, instead of the above formula, the relevance calculation unit 34 may calculate the relevance V_{R1} of each extracted data based on the number N_D of extracted data included in the data search result and the search rank N_{ORDER} of the extracted data concerned, using the following formula:
V_{R1} = (N_D - N_{ORDER}) / N_D
The relevance calculation unit 34 may also calculate the extracted metadata relevance V_{R2} in the same manner as the extracted data relevance V_{R1}. The relevance calculation unit 34 passes the calculated extracted data relevance V_{R1} and extracted metadata relevance V_{R2} to the search result generation unit 36 together with the IDs identifying the corresponding search target data.
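The rank-based alternative, V_{R1} = (N_D - N_{ORDER}) / N_D, can be sketched as follows in Python. The function name `rank_relevance` is hypothetical, and the sketch assumes that the search rank N_{ORDER} starts at 1 for the top result, which the text does not state; under that assumption the lowest-ranked item receives a relevance of 0.

```python
from typing import Dict, List


def rank_relevance(ranked_ids: List[str]) -> Dict[str, float]:
    """V_R1 = (N_D - N_ORDER) / N_D for IDs listed in search-rank order.

    N_ORDER is assumed to be 1-based (top result has rank 1).
    """
    n_d = len(ranked_ids)
    return {data_id: (n_d - order) / n_d
            for order, data_id in enumerate(ranked_ids, start=1)}


relevance = rank_relevance(["d001", "d002", "d004"])
print(relevance["d004"])  # 0.0
```

If a 0-based rank were used instead, the top result would receive 1 and the last result 1 / N_D, so the choice of origin affects how strongly low-ranked items are penalized.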

検索結果生成部３６は、データ検索結果に含まれる複数の抽出データとメタデータ検索結果に含まれる複数の抽出メタデータとに対応する複数の検索対象データごとに、抽出データ関連度Ｖ_Ｒ１と抽出メタデータ関連度Ｖ_Ｒ２とを加味した統合関連度Ｖ_Ｒ３を計算する。すなわち、検索結果生成部３６は、取得した抽出データ関連度Ｖ_Ｒ１及び抽出メタデータ関連度Ｖ_Ｒ２から同一の検索対象データのＩＤに対応する関連度を取り出して、両方の関連度Ｖ_Ｒ１，Ｖ_Ｒ２を重み付け加算により統合して統合関連度Ｖ_Ｒ３を求める。具体的には、データ検索結果及びメタデータ検索結果に含まれるＩＤごとに、重み付け加算のための可変係数α（＜１）を用いて、下記式；
Ｖ_Ｒ３＝Ｖ_Ｒ１×α＋Ｖ_Ｒ２×（１−α）
により、統合関連度Ｖ_Ｒ３を計算する。このとき、検索結果生成部３６は、パラメータ変更部３５によって設定された可変係数αを用いる。また、データ検索結果に含まれ、メタデータ検索結果に含まれない検索対象データに対しては、Ｖ_Ｒ２＝０とし、メタデータ検索結果に含まれ、データ検索結果に含まれない検索対象データに対しては、Ｖ_Ｒ１＝０として統合関連度Ｖ_Ｒ３を求める。そして、検索結果生成部３６は、データ検索結果及びメタデータ検索結果に含まれる全ての検索対象データのＩＤ毎に計算した統合関連度Ｖ_Ｒ３を、選択履歴格納部３８ｃに格納する。 The search result generation unit 36 calculates, for each of the plurality of search target data corresponding to the plurality of extracted data included in the data search result and the plurality of extracted metadata included in the metadata search result, an integrated relevance V_{R3} that takes into account the extracted data relevance V_{R1} and the extracted metadata relevance V_{R2}. That is, the search result generation unit 36 takes, from the acquired extracted data relevance V_{R1} and extracted metadata relevance V_{R2}, the relevance values corresponding to the ID of the same search target data, and integrates the two values V_{R1} and V_{R2} by weighted addition to obtain the integrated relevance V_{R3}. Specifically, for each ID included in the data search result and the metadata search result, the integrated relevance V_{R3} is calculated using a variable coefficient α (< 1) for weighted addition, by the following formula:
V_{R3} = V_{R1} \times \alpha + V_{R2} \times (1 - \alpha)
At this time, the search result generation unit 36 uses the variable coefficient α set by the parameter changing unit 35. For search target data that is included in the data search result but not in the metadata search result, V_{R2} = 0 is used, and for search target data that is included in the metadata search result but not in the data search result, V_{R1} = 0 is used to obtain the integrated relevance V_{R3}. The search result generation unit 36 then stores the integrated relevance V_{R3}, calculated for each ID of all the search target data included in the data search result and the metadata search result, in the selection history storage unit 38c.
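The weighted integration V_{R3} = V_{R1} × α + V_{R2} × (1 − α), including the rule that a missing relevance is treated as 0, can be sketched as follows in Python. The function name `integrate_relevance`, the input values, and the range check on α are assumptions for illustration; the text states only that α is less than 1. Sorting in descending order of V_{R3} reflects the ranking of the integrated search result.

```python
from typing import Dict, List, Tuple


def integrate_relevance(v_r1: Dict[str, float], v_r2: Dict[str, float],
                        alpha: float) -> List[Tuple[str, float]]:
    """Compute V_R3 per ID and return (ID, V_R3) pairs, highest first.

    An ID missing from one search result contributes 0 for that relevance.
    The lower bound of 0 on alpha is an assumption; the source gives alpha < 1.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha is expected to satisfy 0 <= alpha < 1")
    ids = set(v_r1) | set(v_r2)
    v_r3 = {data_id: v_r1.get(data_id, 0.0) * alpha
            + v_r2.get(data_id, 0.0) * (1.0 - alpha)
            for data_id in ids}
    # Ties are broken by ID so that the output order is deterministic.
    return sorted(v_r3.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical inputs: d001 appears in both results, d002 only in the
# metadata search result, d004 only in the data search result.
ranking = integrate_relevance({"d001": 0.8, "d004": 0.1},
                              {"d001": 1.0, "d002": 1.0}, alpha=0.3)
print([data_id for data_id, _ in ranking])  # ['d001', 'd002', 'd004']
```

With α = 0.3, the item found by both searches scores 0.8 × 0.3 + 1 × 0.7 = 0.94 and is ranked ahead of items found by only one of the searches.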

図４には、検索結果生成部３６により選択履歴格納部３８ｃに格納された統合関連度のデータ構成の一例を示している。同図に示すように、データ検索結果及びメタデータ検索結果に含まれるＩＤ“d001”…毎に、関連度算出部３４によって算出された抽出データ関連度“0.8”…及び抽出メタデータ関連度“1”…と、検索結果生成部３６によって算出された統合関連度“0.8*0.3+1*0.7=0.94”（α＝0.3の場合）…とが互いに関連付けて記憶される。この選択履歴格納部３８ｃに格納されたデータは、端末装置２に検索クエリに応じて提示するための統合検索結果データとして利用される。 FIG. 4 shows an example of the data structure of the integration relevance stored in the selection history storage unit 38c by the search result generation unit 36. As shown in the figure, for each ID “d001”... Included in the data search result and metadata search result, the extracted data relevance “0.8”... And the extracted metadata relevance “ 1 ”... And the integrated relevance“ 0.8 * 0.3 + 1 * 0.7 = 0.94 ”(when α = 0.3) calculated by the search result generation unit 36 are stored in association with each other. The data stored in the selection history storage unit 38c is used as integrated search result data to be presented to the terminal device 2 according to the search query.

Returning to FIG. 1, the data communication unit 37 performs data communication between the metadata search server device 3 and both the terminal device 2 and the data search server device 4. In particular, the data communication unit 37 receives a search query from the terminal device 2, reads from the selection history storage unit 38c the integrated search result data that combines the data search result and the metadata search result, and returns it to the terminal device 2 that sent the search query. The data communication unit 37 also monitors whether a reference request for search target data presented in the integrated search result data has been received from the terminal device 2, and stores the number of reference requests (the selection history) for each item of search target data in the selection history storage unit 38c. For example, as shown in FIG. 4, for each of the IDs "d001", ... that identify the search target data, the data communication unit 37 records a selection count "N1", ... indicating the number of reference requests made by the user of the terminal device 2. For the integrated search result data created for a plurality of search queries, the data communication unit 37 accumulates and stores the selection counts over a predetermined period as history data. In addition, the data communication unit 37 forwards the search query to the data search server device 4 and acquires the data search result from the data search server device 4 in response.
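The recording of reference requests over a predetermined period might be sketched as follows. The class `SelectionHistory`, its methods, and the use of plain numeric timestamps are assumptions for illustration; the specification does not state how the period is tracked.

```python
from collections import Counter


class SelectionHistory:
    """Keeps reference-request events so counts can be taken per period."""

    def __init__(self):
        self._events = []  # list of (timestamp, doc_id)

    def record(self, doc_id, timestamp):
        # Called once for each reference request received from a terminal.
        self._events.append((timestamp, doc_id))

    def counts_since(self, start):
        # Selection count per ID for events at or after `start`.
        counts = Counter(doc_id for ts, doc_id in self._events if ts >= start)
        return dict(counts)


if __name__ == "__main__":
    history = SelectionHistory()
    history.record("d001", timestamp=1)
    history.record("d001", timestamp=5)
    history.record("d002", timestamp=6)  # "d002" is a hypothetical ID
    print(history.counts_since(4))
```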

The parameter changing unit 35 changes the coefficient α, which the search result generation unit 36 refers to when calculating the integrated relevance V_R3, according to the user's data selection history for the search target data stored in the selection history storage unit 38c. That is, the parameter changing unit 35 refers to the selection counts for a predetermined period held in the selection history storage unit 38c, totals the selection count N_S1 for search target data that was included in the metadata search result, and totals the selection count N_S2 for search target data that was included in the data search result. For example, in the case of FIG. 4, the selection counts "N1", "N2", and "N4" associated with IDs whose extracted data relevance exceeds 0 (that is, IDs included in the data search result) are summed to give the total "N_S1 = N1 + N2 + N4", and the selection counts "N1", "N2", and "N3" associated with IDs whose extracted metadata relevance exceeds 0 (that is, IDs included in the metadata search result) are summed to give the total "N_S2 = N1 + N2 + N3". The parameter changing unit 35 then changes the coefficient α dynamically by comparing the two totals N_S1 and N_S2, using the following formula:
α = N_S1 / (N_S1 + N_S2)
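The update of α can be sketched as below. The sketch follows the FIG. 4 example, in which N_S1 sums the selection counts of IDs with non-zero extracted data relevance and N_S2 sums those with non-zero extracted metadata relevance. The function name, the concrete counts, and the fallback value used when no selections have been recorded are assumptions; the specification does not cover the zero-selection case.

```python
def update_alpha(data_rel, meta_rel, selections, default=0.5):
    """Return alpha = N_S1 / (N_S1 + N_S2) from per-ID selection counts."""
    # N_S1: selections of IDs that appeared in the data search result.
    n_s1 = sum(n for doc_id, n in selections.items()
               if data_rel.get(doc_id, 0.0) > 0)
    # N_S2: selections of IDs that appeared in the metadata search result.
    n_s2 = sum(n for doc_id, n in selections.items()
               if meta_rel.get(doc_id, 0.0) > 0)
    total = n_s1 + n_s2
    if total == 0:
        return default  # assumption: keep a neutral weight with no history
    return n_s1 / total


if __name__ == "__main__":
    # Hypothetical values: d001 and d002 appear in both results,
    # d003 only in the metadata result, d004 only in the data result.
    data_rel = {"d001": 0.8, "d002": 0.4, "d004": 0.5}
    meta_rel = {"d001": 1.0, "d002": 0.2, "d003": 0.6}
    selections = {"d001": 3, "d002": 1, "d003": 2, "d004": 4}
    print(update_alpha(data_rel, meta_rel, selections))
```

With this definition, α grows when users more often select items that the data search found, which shifts weight toward V_R1 in the weighted addition.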

Next, the functional configuration of the terminal device 2 will be described. The terminal device 2 includes a metadata registration unit 21, a query input unit 22, an information output unit 23, and a data communication unit 24. The metadata registration unit 21 accepts from the user the registration of some items of the metadata stored in the metadata storage unit 38a, and transmits data concerning that registration to the metadata search server device 3 via the data communication unit 24. The query input unit 22 accepts input of a search query from the user and transmits it to the metadata search server device 3 via the data communication unit 24. The information output unit 23 receives the integrated search result data from the metadata search server device 3 and outputs it to an output device such as a display. The data communication unit 24 performs data communication with the metadata search server device 3 and the data search server device 4. When the data communication unit 24 receives from the user a selection input for referring to search target data presented in the integrated search result data, it transmits the selection input to the metadata search server device 3 and the data search server device 4.

Next, the functional configuration of the data search server device 4 will be described. The data search server device 4 includes a data search unit 41, a data communication unit 42, a data storage unit 43a, and an index storage unit 43b. As shown in FIG. 5, the data storage unit 43a stores a plurality of search target data "meeting material 1.doc", ... in association with IDs "d001", ... that identify them, and the index storage unit 43b stores an index for searching the search target data stored in the data storage unit 43a. When the data search unit 41 receives a search query from the metadata search server device 3, it extracts extracted data from the search target data stored in the data storage unit 43a and generates a data search result indicating the extracted data. The data search unit 41 generates this data search result so that the IDs indicating the extracted data are arranged according to search rank, and includes the number of occurrences of the search keyword in each item of extracted data. The data communication unit 42 performs data communication with the terminal device 2 and the metadata search server device 3.
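A data search result of the kind described above, that is, IDs in rank order together with keyword occurrence counts, could look like the following sketch. Ranking by occurrence count and scanning the text directly instead of consulting an index are simplifying assumptions; the specification does not state how the data search unit 41 determines the search rank. The function name and document texts are also hypothetical.

```python
def search_documents(documents, keyword):
    """Return [(ID, occurrence_count), ...] ordered by assumed search rank."""
    hits = []
    for doc_id, text in documents.items():
        count = text.count(keyword)
        if count > 0:
            hits.append((doc_id, count))
    # Assumed ranking: more occurrences first, ties broken by ID.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


if __name__ == "__main__":
    documents = {
        "d001": "SNS market report; SNS user trends",
        "d002": "budget plan",
        "d004": "notes on SNS",
    }
    print(search_documents(documents, "SNS"))
```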

The operation of the data search system 1 will now be described with reference to FIG. 6, together with the data search method used in the data search system 1. FIG. 6 is a flowchart showing the operation of the data search system 1 when generating the integrated search result data.

First, the terminal device 2 accepts a search query requesting a search of the search target data using a search keyword (step S101). The search query is then accepted as a metadata search request by the data search unit 32 of the metadata search server device 3 (step S102). The data search unit 32 uses the search keyword to extract metadata from the plurality of metadata stored in the metadata storage unit 38a, thereby obtaining a metadata search result (step S103). Next, the search result acquisition unit 33 of the metadata search server device 3 transmits the search query to the data search server device 4 as a data search request (step S104). In response, the search result acquisition unit 33 acquires from the data search server device 4 a data search result for the search target data based on the search keyword (step S105).

After that, the relevance calculation unit 34 of the metadata search server device 3 calculates, based on the data search result and the metadata search result, the extracted data relevance for each item of extracted data and the extracted metadata relevance for each item of extracted metadata (step S106). Next, the parameter changing unit 35 of the metadata search server device 3 refers to the selection history of the search target data stored in the selection history storage unit 38c and dynamically changes the value of the coefficient α used in calculating the integrated relevance (step S107). The search result generation unit 36 of the metadata search server device 3 then calculates the integrated relevance for each item of search target data by weighted addition of the extracted data relevance and the extracted metadata relevance (step S108). The search result generation unit 36 stores the calculated integrated relevance in the selection history storage unit 38c as integrated search result data associated with the IDs that identify the search target data (step S109). Finally, the data communication unit 37 of the metadata search server device 3 transmits the integrated search result data stored in the selection history storage unit 38c to the terminal device 2 (step S110).
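The order of steps S102 to S110 can be summarized in a short end-to-end sketch. To keep it compact, the two search callables are assumed to return relevance values directly, which folds step S106 into steps S103 and S105. The function `handle_query`, its parameters, the fallback α, and the sample values are all hypothetical.

```python
def handle_query(keyword, search_metadata, search_data, selections, store,
                 default_alpha=0.5):
    """Return IDs ordered by integrated relevance for one search query."""
    meta_rel = search_metadata(keyword)  # S102-S103 (+ S106, metadata side)
    data_rel = search_data(keyword)      # S104-S105 (+ S106, data side)

    # S107: derive alpha from the selection history.
    n_s1 = sum(selections.get(doc_id, 0) for doc_id in data_rel)
    n_s2 = sum(selections.get(doc_id, 0) for doc_id in meta_rel)
    total = n_s1 + n_s2
    alpha = n_s1 / total if total else default_alpha  # fallback is assumed

    # S108: weighted addition, with 0 for a missing relevance value.
    integrated = {
        doc_id: data_rel.get(doc_id, 0.0) * alpha
        + meta_rel.get(doc_id, 0.0) * (1.0 - alpha)
        for doc_id in data_rel.keys() | meta_rel.keys()
    }

    store.update(integrated)  # S109: keep the result keyed by ID
    # S110: return IDs in descending order of integrated relevance.
    return sorted(integrated, key=lambda doc_id: (-integrated[doc_id], doc_id))


if __name__ == "__main__":
    store = {}
    order = handle_query(
        "SNS",
        search_metadata=lambda kw: {"d001": 1.0, "d003": 0.6},
        search_data=lambda kw: {"d001": 0.8, "d004": 0.5},
        selections={},
        store=store,
    )
    print(order)
```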

FIG. 7 shows an example of the output screen D_1 displayed on the terminal device 2 according to the integrated search result data transmitted from the metadata search server device 3. As shown in the figure, the screen indicates that the search target data "meeting material 1.doc", "meeting material 2.doc", and "meeting material 4.doc" were extracted as the integrated search result for the search keyword "SNS". For each item of search target data, the integrated search result also displays the relevant portion of the metadata that matched the search keyword, "SNS service survey", and the relevant portion of the search target data that matched the search keyword, "The SNS market size in fiscal 2012 is ...".

According to the data search system 1 and the data search method in the data search system 1 described above, a data search result obtained by searching a plurality of search target data with a search keyword is acquired, and a metadata search result obtained by searching a plurality of metadata with the same search keyword is generated. The extracted data relevance is then calculated from the data search result, and the extracted metadata relevance is calculated from the metadata search result. After that, the integrated relevance is calculated for each item of search target data included in the two search results, and an integrated search result combining both search results is output in an order based on the integrated relevance. As a result, even when the search over the search target data and the search over the metadata corresponding to that data are performed by separate devices, taking into account the relevance to the search keyword in both search results yields a search result in which the relevance of each item of search target data to the search keyword is appropriately evaluated. It is therefore possible to generate a search result whose search ranking appropriately reflects the relevance of the search target data to the search keyword.

In addition, because the extracted data relevance is calculated from the number of occurrences of the search keyword in the extracted data or from the search rank in the data search result, the extracted data relevance for the data search result can be obtained easily.

Furthermore, because the integrated relevance is obtained by weighted addition of the extracted data relevance and the extracted metadata relevance, the relevance of each item of search target data can be evaluated more appropriately using the relevance to the search keyword in both the data search result and the metadata search result. Because the coefficient α used in this weighted addition is changed dynamically based on the user's data selection history for the integrated search result, search results of high utility to the user can be generated.

The present invention is not limited to the embodiment described above.

For example, some or all of the components of the metadata search server device 3 shown in FIG. 1 may be provided in the terminal device used by the user. For example, as shown in FIG. 8, the search result acquisition unit 33, the relevance calculation unit 34, the parameter changing unit 35, the search result generation unit 36, and the selection history storage unit 38c may be provided in the terminal device 102 used by the user who searches for the search target data.

Description of symbols: 1, 101 ... data search system; 2, 102 ... terminal device; 3, 103 ... metadata search server device; 32 ... data search unit (data search means); 33 ... search result acquisition unit (search result acquisition means); 34 ... relevance calculation unit (relevance calculation means); 35 ... parameter changing unit (parameter changing means); 36 ... search result generation unit (search result generation means); 38a ... metadata storage unit (additional data storage means).

Claims

A data search system comprising:
search result acquisition means for acquiring a first search result indicating a plurality of extracted data extracted from a plurality of search target data by a search keyword;
additional data storage means for storing a plurality of additional data added to the plurality of search target data;
data search means for extracting, from the plurality of additional data, a plurality of extracted additional data related to the search keyword, and generating a second search result indicating the plurality of search target data to which the plurality of extracted additional data is added;
relevance calculation means for calculating, based on the first search result acquired by the search result acquisition means, a first relevance of the plurality of extracted data to the search keyword, and calculating, based on the second search result obtained by the data search means, a second relevance to the search keyword of the plurality of search target data corresponding to the plurality of extracted additional data;
search result generation means for calculating, for the plurality of extracted data and the plurality of search target data corresponding to the plurality of extracted additional data, a third relevance that takes into account the first relevance and the second relevance; and
output means for outputting a third search result, obtained by combining the first search result and the second search result, in an order based on the third relevance.

The data search system according to claim 1, wherein the relevance calculation means calculates the first relevance based on the number of occurrences of the search keyword in the plurality of extracted data.

The data search system according to claim 1, wherein the relevance calculation means calculates the first relevance based on the ranking of the plurality of extracted data in the first search result.

The data search system according to claim 1, wherein the search result generation means calculates the third relevance, for the plurality of extracted data and the plurality of search target data corresponding to the plurality of extracted additional data, by weighted addition of the first relevance and the second relevance.

The data search system according to claim 4, further comprising parameter changing means for dynamically changing a weighting value used in the weighted addition, based on a user's data selection history for the third search result output by the output means.

The data search system according to claim 5, wherein the parameter changing means changes the weighting value based on a comparison between the number of selections of the first search result indicated by the data selection history and the number of selections of the second search result indicated by the data selection history.

A data search method comprising:
a search result acquisition step in which search result acquisition means acquires a first search result indicating a plurality of extracted data extracted from a plurality of search target data by a search keyword;
an additional data storage step in which additional data storage means stores a plurality of additional data added to the plurality of search target data;
a data search step in which data search means extracts, from the plurality of additional data, a plurality of extracted additional data related to the search keyword, and generates a second search result indicating the plurality of search target data to which the plurality of extracted additional data is added;
a relevance calculation step in which relevance calculation means calculates, based on the first search result acquired by the search result acquisition means, a first relevance of the plurality of extracted data to the search keyword, and calculates, based on the second search result obtained by the data search means, a second relevance to the search keyword of the plurality of search target data corresponding to the plurality of extracted additional data;
a search result generation step in which search result generation means calculates, for the plurality of extracted data and the plurality of search target data corresponding to the plurality of extracted additional data, a third relevance that takes into account the first relevance and the second relevance; and
an output step of outputting a third search result, obtained by combining the first search result and the second search result, in an order based on the third relevance.