JP2011253299A - Retrieval device, retrieval method and retrieval program

Info

Publication number: JP2011253299A
Application number: JP2010126043A
Authority: JP
Inventors: So Hibino; 壮日比野; Satoshi Fukada; 聡深田; Kyotaro Horiguchi; 恭太郎堀口
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-06-01
Filing date: 2010-06-01
Publication date: 2011-12-15
Anticipated expiration: 2030-06-01
Also published as: JP5374444B2

Abstract

PROBLEM TO BE SOLVED: To reduce search response time without causing inconsistency with a database. SOLUTION: The individual documents constituting response data are stored in a fragment cache storage unit 111. A document acquisition unit 106 acquires a document from the fragment cache storage unit 111 if the document to be acquired exists there; otherwise, it acquires the document from an RDBMS 2. As a result, the number of times documents are acquired from the RDBMS 2 can be minimized, thereby reducing search response time.

Description

本発明は、メタデータを検索する技術に関する。 The present invention relates to a technique for searching metadata.

従来、クライアントからの検索クエリに該当するメタデータをＲｅｌａｔｉｏｎａｌＤａｔａＢａｓｅＭａｎａｇｅｍｅｎｔＳｙｓｔｅｍ（ＲＤＢＭＳ）から検索して検索結果を返却するアプリケーションサーバにおいて、初回問い合わせの検索クエリと検索結果をキャッシュとして保存しておき、２回目以降の同一の検索クエリの問い合わせがあった場合にキャッシュから検索結果を返却することで、応答時間を短縮していた（例えば、特許文献１）。 Conventionally, in an application server that retrieves metadata matching a client's search query from a Relational DataBase Management System (RDBMS) and returns the search result, the search query and search result of the first inquiry are stored as a cache, and response time is shortened by returning the search result from the cache when the same search query is received a second or subsequent time (for example, Patent Document 1).

特開２００９−１７５８９６号公報 JP 2009-175896 A

アプリケーションサーバでは、メタデータがラージオブジェクト型で格納されたｘｍｌ等の構造化文書である場合、ＲＤＢＭＳ上の検索時間やＲＤＢＭＳからのラージオブジェクトの取得時間がボトルネックとなる。ラージオブジェクトとは、ＲＤＢＭＳに格納するデータの型であり、データページを超えたサイズのデータを格納できるという特徴がある。 In the application server, when the metadata is a structured document such as xml stored in a large object type, the search time on the RDBMS and the acquisition time of the large object from the RDBMS become a bottleneck. A large object is a type of data stored in an RDBMS and has a feature that data having a size exceeding the data page can be stored.

従来技術のように、初回検索時に返却データをキャッシュしておき、次回以降の問い合わせ時はキャッシュに保存されている返却データを用いることで、応答時間を短縮し、リソースの消費量を低減することができる。しかしながら、検索クエリと返却データを１対１で管理している従来のキャッシュでは、返却データがＲＤＢＭＳ上に格納されたラージオブジェクト型データの集合である場合、ＲＤＢＭＳ上で、返却データを構成する１つのラージオブジェクト型データが更新されるとキャッシュされた返却データとＲＤＢＭＳに不整合が生じるという問題がある。 As in the prior art, caching the return data at the first search and using the cached return data for subsequent inquiries shortens response time and reduces resource consumption. However, with a conventional cache that manages search queries and return data on a one-to-one basis, when the return data is a set of large object type data stored in the RDBMS, updating even one of the large objects constituting the return data in the RDBMS causes an inconsistency between the cached return data and the RDBMS.

本発明は、上記に鑑みてなされたものであり、返却データがラージオブジェクト型データの集合である場合に、データベースとの不整合を生じることなく検索の応答時間を短縮することを目的とする。 The present invention has been made in view of the above, and an object of the present invention is to shorten the response time of a search without causing inconsistency with a database when return data is a set of large object type data.

第１の本発明に係る検索装置は、ドキュメントを格納したデータベースから検索条件に該当する前記ドキュメントを検索する検索装置であって、検索クエリを受信する受信手段と、前記検索クエリを用いて前記データベースを検索し、当該検索クエリの検索条件に該当するドキュメントに関して、前記ドキュメントの識別子及び前記ドキュメントの格納場所を示す格納場所情報を有する検索結果を取得する検索手段と、以前取得したドキュメント、当該ドキュメントの識別子を格納する第１キャッシュ保存手段と、前記検索結果のドキュメントの識別子に対応するドキュメントが前記第１キャッシュ保存手段に存在するか否か判定し、前記ドキュメントが前記第１キャッシュ保存手段に存在する場合は、前記第１キュッシュ保存手段から前記ドキュメントを取得し、存在しない場合は、前記検索結果の格納場所情報に基づいて前記データベースから前記ドキュメントを取得する取得手段と、前記データベースから取得した前記ドキュメントを当該ドキュメントの識別子とともに前記第１キャッシュ保存手段に格納する格納手段と、前記取得手段が取得した１つ以上の前記ドキュメントを結合して応答データを生成する生成手段と、前記応答データを返却する返却手段と、を有することを特徴とする。 A search device according to a first aspect of the present invention is a search device that searches a database storing documents for documents matching a search condition, and is characterized by comprising: receiving means for receiving a search query; search means for searching the database using the search query and obtaining, for each document matching the search condition of the search query, a search result having an identifier of the document and storage location information indicating the storage location of the document; first cache storage means for storing previously acquired documents and the identifiers of those documents; acquisition means for determining whether a document corresponding to the document identifier in the search result exists in the first cache storage means, acquiring the document from the first cache storage means if it exists there, and acquiring the document from the database based on the storage location information in the search result if it does not; storage means for storing the document acquired from the database in the first cache storage means together with the identifier of the document; generation means for combining the one or more documents acquired by the acquisition means to generate response data; and return means for returning the response data.

第２の本発明に係る検索方法は、検索装置がドキュメントを格納したデータベースから検索条件に該当する前記ドキュメントを検索する検索方法であって、検索クエリを受信するステップと、前記検索クエリを用いて前記データベースを検索し、当該検索クエリの検索条件に該当するドキュメントに関して、前記ドキュメントの識別子及び前記ドキュメントの格納場所を示す格納場所情報を有する検索結果を取得するステップと、前記検索結果のドキュメントの識別子に対応するドキュメントが、以前取得したドキュメント、当該ドキュメントの識別子を格納する前記第１キャッシュ保存手段に存在するか否か判定し、前記ドキュメントが前記第１キャッシュ保存手段に存在する場合は、前記第１キュッシュ保存手段から前記ドキュメントを取得し、存在しない場合は、前記検索結果の格納場所情報に基づいて前記データベースから前記ドキュメントを取得するステップと、前記データベースから取得した前記ドキュメントを当該ドキュメントの識別子とともに前記第１キャッシュ保存手段に格納するステップと、前記取得手段が取得した１つ以上の前記ドキュメントを結合して応答データを生成するステップと、前記応答データを返却するステップと、を有することを特徴とする。 A search method according to a second aspect of the present invention is a search method in which a search device searches a database storing documents for documents matching a search condition, and is characterized by comprising the steps of: receiving a search query; searching the database using the search query and obtaining, for each document matching the search condition of the search query, a search result having an identifier of the document and storage location information indicating the storage location of the document; determining whether a document corresponding to the document identifier in the search result exists in first cache storage means that stores previously acquired documents and the identifiers of those documents, acquiring the document from the first cache storage means if it exists there, and acquiring the document from the database based on the storage location information in the search result if it does not; storing the document acquired from the database in the first cache storage means together with the identifier of the document; combining the one or more acquired documents to generate response data; and returning the response data.

第３の本発明に係る検索プログラムは、上記検索方法をコンピュータに実行させるための検索プログラムである。 A search program according to a third aspect of the present invention is a search program for causing a computer to execute the search method.

本発明によれば、返却データがラージオブジェクト型データの集合である場合に、データベースとの不整合を生じることなく検索の応答時間を短縮することができる。 According to the present invention, when the return data is a set of large object type data, the search response time can be shortened without causing inconsistency with the database.

本実施の形態における検索装置の構成を示す機能ブロック図である。 FIG. 1 is a functional block diagram showing the configuration of the search device in the present embodiment.

レンポンスキャッシュとフラグメントキャッシュを説明する概略図である。 FIG. 2 is a schematic diagram explaining the response cache and the fragment cache.

本実施の形態における検索装置の処理の流れを示すフローチャートである。 FIG. 3 is a flowchart showing the flow of processing of the search device in the present embodiment.

ドキュメントを取得する処理の流れを示すフローチャートである。 FIG. 4 is a flowchart showing the flow of the process of acquiring documents.

以下、本発明の実施の形態について図面を用いて説明する。 Hereinafter, embodiments of the present invention will be described with reference to the drawings.

図１は、本実施の形態における検索装置１の構成を示す機能ブロック図である。同図に示す検索装置１は、端末３から検索クエリを受信し、ＲＤＢＭＳ２から放送番組やビデオ・オン・デマンドコンテンツの番組メタデータを検索して端末３へ返却する装置である。検索クエリとは、端末３から検索装置１への情報検索要求であり、例えば、タイトル、ジャンル、日時等の検索条件を論理演算子や修飾子で結合したものである。端末３は、検索装置１へ検索クエリを送信し、検索クエリに対する検索結果を応答データとして受信するものであり、例えば、ＩｎｔｅｒｎｅｔＰｒｏｔｏｃｏｌＴｅｌｅｖｉｓｉｏｎ（ＩＰＴＶ）サービスにおける受信端末などがある。 FIG. 1 is a functional block diagram showing the configuration of the search device 1 according to the present embodiment. The search device 1 shown in the figure is a device that receives a search query from the terminal 3, searches the RDBMS 2 for program metadata of a broadcast program and video-on-demand content, and returns it to the terminal 3. The search query is an information search request from the terminal 3 to the search device 1 and is a combination of search conditions such as title, genre, date and time, using logical operators and modifiers. The terminal 3 transmits a search query to the search device 1 and receives a search result for the search query as response data. For example, the terminal 3 includes a receiving terminal in the Internet Protocol Television (IPTV) service.

まず、ＲＤＢＭＳ２の構成について説明する。ＲＤＢＭＳ２は、複数のアトリビュートテーブル２０１、ドキュメントテーブル２０２、一次検索結果テーブル２０３、およびキャッシュ管理テーブル２０４で構成される。アトリビュートテーブル２０１は、番組メタデータが記述されたドキュメントを属性に展開したものであり、コンテンツＩＤ（ｃｒｉｄ）、タイトル、概要、出演者等の番組に関する情報（番組メタデータ）で構成されている。ドキュメントとはＸＭＬ等で記述された構造化文書のことであり、属性とは名前と値のペアで記述される文書の構成要素のことである。ドキュメントテーブル２０２は、フラグメントＩＤ（ｆｉｄ）、フラグメントバージョン（ｆｖｅｒ）、コンテンツＩＤ（ｃｒｉｄ）、および番組メタデータを返却単位に分割したドキュメント（フラグメント）をラージオブジェクト型としたものを格納する。一次検索結果テーブル２０３は、検索クエリから生成したｈａｓｈ値と、検索条件に該当する番組メタデータを持つコンテンツＩＤのリスト（ｃｒｉｄｌｉｓｔ）を格納する。キャッシュ管理テーブル２０４は、ｈａｓｈ値と、一次検索結果テーブル２０３に格納した一次検索結果および後述するレスポンスキャッシュの有効期限を格納する。 First, the configuration of the RDBMS 2 will be described. The RDBMS 2 comprises a plurality of attribute tables 201, a document table 202, a primary search result table 203, and a cache management table 204. The attribute tables 201 hold documents describing program metadata expanded into attributes, and consist of program-related information (program metadata) such as content ID (crid), title, synopsis, and performers. A document is a structured document written in XML or the like, and an attribute is a document component described as a name-value pair. The document table 202 stores a fragment ID (fid), a fragment version (fver), a content ID (crid), and a document (fragment), obtained by dividing the program metadata into return units, stored as a large object. The primary search result table 203 stores a hash value generated from the search query and a list (cridlist) of content IDs whose program metadata matches the search condition. The cache management table 204 stores the hash value together with the expiration date of the primary search result stored in the primary search result table 203 and of the response cache described later.

続いて、検索装置１の構成について説明する。図１に示すように、検索装置１は、クエリ受信部１０１、キャッシュ有効期限判定部１０２、構文解析部１０３、ＳＱＬ生成部１０４、検索実行部１０５、ドキュメント取得部１０６、応答データ生成部１０７、応答データ返却部１０８、レスポンスキャッシュ保存部１１０、およびフラグメントキャッシュ保存部１１１を備える。検索装置１が備える各部は、演算処理装置、記憶装置等を備えたコンピュータにより構成して、各部の処理がプログラムによって実行されるものとしてもよい。このプログラムは検索装置１が備える記憶装置に記憶されており、磁気ディスク、光ディスク、半導体メモリ等の記録媒体に記録することも、ネットワークを通して提供することも可能である。 Next, the configuration of the search device 1 will be described. As shown in FIG. 1, the search device 1 includes a query receiving unit 101, a cache expiration date determination unit 102, a syntax analysis unit 103, an SQL generation unit 104, a search execution unit 105, a document acquisition unit 106, a response data generation unit 107, A response data return unit 108, a response cache storage unit 110, and a fragment cache storage unit 111 are provided. Each unit included in the search device 1 may be configured by a computer including an arithmetic processing device, a storage device, and the like, and the processing of each unit may be executed by a program. This program is stored in a storage device included in the search device 1, and can be recorded on a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or provided through a network.

クエリ受信部１０１は、端末３から検索クエリを受信する。 The query receiving unit 101 receives a search query from the terminal 3.

キャッシュ有効期限判定部１０２は、受信した検索クエリからｈａｓｈ値を計算し、キャッシュ管理テーブル２０４からそのｈａｓｈ値に対応する有効期限、つまり検索クエリに対応する有効期限を取得して、一次検索結果とレスポンスキャッシュの有効期限判定を行う。受信した検索クエリと同じ検索クエリにより検索したことがある場合は、有効期限を取得でき、受信した検索クエリと同じ検索クエリにより検索したことがない場合は、有効期限を取得できない。計算したｈａｓｈ値は、有効期限を付してキュッシュ管理テーブル２０４に格納される。 The cache expiration date determination unit 102 calculates a hash value from the received search query, acquires from the cache management table 204 the expiration date corresponding to that hash value, that is, the expiration date corresponding to the search query, and determines whether the primary search result and the response cache are still valid. If a search has previously been performed with the same search query, the expiration date can be acquired; otherwise it cannot. The calculated hash value is stored in the cache management table 204 together with an expiration date.
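The expiration check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the use of SHA-256, the in-memory dictionary standing in for the cache management table 204, and the names `query_hash`, `check_expiration`, and `register` are all assumptions introduced here.

```python
import hashlib
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for cache management table 204:
# maps a query's hash value to its expiration time.
cache_management_table = {}


def query_hash(query: str) -> str:
    """Derive a hash value from the search query (SHA-256 is an assumption)."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def check_expiration(query: str, now: datetime) -> str:
    """Classify the query's cache entry as 'missing', 'expired' or 'valid'."""
    expires_at = cache_management_table.get(query_hash(query))
    if expires_at is None:
        return "missing"  # no earlier search used this query
    if now >= expires_at:
        return "expired"
    return "valid"


def register(query: str, now: datetime, ttl: timedelta) -> None:
    """Store the hash value together with an expiration date."""
    cache_management_table[query_hash(query)] = now + ttl
```

Only the "valid" outcome allows the cached primary search result and response cache to be reused; "missing" and "expired" both lead to a fresh search.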

構文解析部１０３、ＳＱＬ生成部１０４、検索実行部１０５は、ｈａｓｈ値に対応する有効期限が存在しない場合、あるいは、ｈａｓｈ値が有効期限切れの場合に、検索クエリからＳＱＬを生成し、ＲＤＢＭＳ２に対して検索を実行する。具体的には、構文解析部１０３が検索クエリから構文木を作成し、ＳＱＬ生成部１０４がその構文木からＳＱＬを生成し、検索実行部１０５が生成したＳＱＬによりＲＤＢＭＳ２に対して検索を実行する。 When no expiration date corresponding to the hash value exists, or when the hash value has expired, the syntax analysis unit 103, the SQL generation unit 104, and the search execution unit 105 generate SQL from the search query and execute a search on the RDBMS 2. Specifically, the syntax analysis unit 103 creates a syntax tree from the search query, the SQL generation unit 104 generates SQL from that syntax tree, and the search execution unit 105 executes a search on the RDBMS 2 using the generated SQL.
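As a rough illustration of the syntax-tree-to-SQL step, the sketch below walks a small tree and emits a parameterized query. The patent does not specify the tree shape, the query grammar, or the schema, so the tuple-based nodes, the `attribute` table name, the column names, and the functions `to_where` and `to_sql` are all hypothetical. Parsing the client's query text into the tree is omitted.

```python
# A syntax tree node is either ("cond", column, value) or (op, left, right)
# with op in {"AND", "OR"}. This shape is an assumption for illustration.

ALLOWED_COLUMNS = {"title", "genre", "performer"}  # hypothetical attributes


def to_where(node):
    """Return (sql_fragment, parameters) for one syntax tree node."""
    kind = node[0]
    if kind == "cond":
        _, column, value = node
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown attribute: {column}")
        # The value is passed as a bound parameter, never inlined into SQL.
        return f"{column} = ?", [value]
    if kind in ("AND", "OR"):
        left_sql, left_params = to_where(node[1])
        right_sql, right_params = to_where(node[2])
        return f"({left_sql} {kind} {right_sql})", left_params + right_params
    raise ValueError(f"unknown node type: {kind}")


def to_sql(tree):
    """Build a query that selects matching content IDs (crid)."""
    where, params = to_where(tree)
    return f"SELECT crid FROM attribute WHERE {where}", params
```

Restricting column names to a fixed set and binding values as parameters keeps the generated SQL safe even though the search conditions originate from the client.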

ドキュメント取得部１０６は、ＲＤＢＭＳ２から得られる検索結果であるＲｅｓｕｌｔＳｅｔオブジェクト２０５からｆｉｄ、ｆｖｅｒを取得し、ｆｉｄに該当するドキュメント（フラグメント）をフラグメントキャッシュ保存部１１１、あるいはＲＤＢＭＳ２から取得する。ＲｅｓｕｌｔＳｅｔオブジェクト２０５は、ｆｉｄ、ｆｖｅｒ、および実体データの格納場所を示すｏｉｄの情報を含む。ＲｅｓｕｌｔＳｅｔオブジェクト２０５から取得したｆｉｄに一致するｆｉｄのフラグメントがフラグメントキャッシュ保存部１１１に存在し、ＲｅｓｕｌｔＳｅｔオブジェクト２０５から取得したｆｖｅｒがフラグメントキャッシュ保存部１１１のｆｖｅｒ以下の値であれば、ｆｉｄに該当するフラグメントをフラグメントキャッシュ保存部１１１から取得する。一致するｆｉｄのフラグメントがフラグメントキャッシュ保存部１１１に存在しない場合、あるいは、ＲｅｓｕｌｔＳｅｔオブジェクト２０５から取得したｆｖｅｒがフラグメントキャッシュ保存部１１１のｆｖｅｒより大きい場合は、ＲｅｓｕｌｔＳｅｔオブジェクト２０５のｏｉｄに従って、ＲＤＢＭＳ２から実体データを取得する。取得した実体データは、フラグメントキャッシュ保存部１１１に保存する。 The document acquisition unit 106 acquires fid and fver from the ResultSet object 205, which is the search result obtained from the RDBMS 2, and acquires the document (fragment) corresponding to fid from either the fragment cache storage unit 111 or the RDBMS 2. The ResultSet object 205 contains fid, fver, and oid, which indicates the storage location of the entity data. If a fragment whose fid matches the fid acquired from the ResultSet object 205 exists in the fragment cache storage unit 111, and the fver acquired from the ResultSet object 205 is less than or equal to the fver held in the fragment cache storage unit 111, the fragment corresponding to fid is acquired from the fragment cache storage unit 111. If no fragment with a matching fid exists in the fragment cache storage unit 111, or if the fver acquired from the ResultSet object 205 is greater than the fver held in the fragment cache storage unit 111, the entity data is acquired from the RDBMS 2 according to the oid in the ResultSet object 205. The acquired entity data is stored in the fragment cache storage unit 111.

応答データ生成部１０７は、ドキュメント取得部１０６が取得したドキュメントをデリミタで区切り結合したものを応答データとして生成する。生成した応答データは、ｈａｓｈ値とともにレスポンスキャッシュ保存部１１０に保存する。 The response data generation unit 107 generates response data by joining the documents acquired by the document acquisition unit 106, separated by a delimiter. The generated response data is stored in the response cache storage unit 110 together with the hash value.

応答データ返却部１０８は、応答データを端末３へ返却する。検索クエリから生成したｈａｓｈ値が有効期限内であって、ｈａｓｈ値に対応する応答データがレスポンスキャッシュ保存部１１０に存在する場合は、応答データ返却部１０８がレスポンスキャッシュ保存部１１０からｈａｓｈ値に対応するレスポンスキャッシュ（応答データ）を読み出して端末３へ返却する。ｈａｓｈ値が有効期限切れ、あるいはｈａｓｈ値に対応する応答データがレスポンスキャッシュ保存部１１０に存在しない場合は、応答データ生成部１０７が生成した応答データを端末３へ返却する。 The response data return unit 108 returns the response data to the terminal 3. When the hash value generated from the search query is within its validity period and response data corresponding to the hash value exists in the response cache storage unit 110, the response data return unit 108 reads the response cache (response data) corresponding to the hash value from the response cache storage unit 110 and returns it to the terminal 3. When the hash value has expired, or when no response data corresponding to the hash value exists in the response cache storage unit 110, the response data generated by the response data generation unit 107 is returned to the terminal 3.

レスポンスキャッシュ保存部１１０は、検索クエリから計算したｈａｓｈ値とその検索クエリにより得られた応答データとを関連付けてレスポンスキャッシュとして保存する。 The response cache storage unit 110 associates the hash value calculated from the search query with the response data obtained from the search query and stores it as a response cache.

フラグメントキャッシュ保存部１１１は、ｆｉｄ、ｆｖｅｒとともにｆｉｄに該当するドキュメント（フラグメント）をフラグメントキャッシュとして保存する。 The fragment cache storage unit 111 stores a document (fragment) corresponding to fid as a fragment cache together with fid and fver.

図２に、レンポンスキャッシュとフラグメントキャッシュを説明する概略図を示す。同図に示す応答データは、番組Ａ，Ｂに関する情報を検索して得られたものである。番組単位に単数もしくは複数のフラグメントが存在することができる。図２において、番組Ａの番組メタデータはフラグメントａ１〜ａ３、番組Ｂの番組メタデータはフラグメントｂ１〜ｂ３である。フラグメントａ１〜ｂ３は、ドキュメントテーブル２０２において、ｆｉｄ、ｆｖｅｒに関連付けられて格納されている。応答データは、番組Ａ，Ｂに関する情報を検索して得られた、番組メタデータの返却単位であるフラグメントａ１〜ｂ３を結合したものである。 FIG. 2 is a schematic diagram illustrating the response cache and the fragment cache. The response data shown in the figure was obtained by searching for information on programs A and B. Each program can have one or more fragments. In FIG. 2, the program metadata of program A consists of fragments a1 to a3, and the program metadata of program B consists of fragments b1 to b3. Fragments a1 to b3 are stored in the document table 202 in association with fid and fver. The response data is the concatenation of fragments a1 to b3, the return units of program metadata, obtained by searching for information on programs A and B.

レスポンスキャッシュ保存部１１０は、検索要求単位で応答データをキャッシュする。具体的には、レスポンスキャッシュ保存部１１０は、番組Ａ，Ｂに関する情報を検索するための検索クエリから計算したｈａｓｈ値とフラグメントａ１〜ｂ３を結合した応答データとを関連付けて保持する。 The response cache storage unit 110 caches response data in search request units. Specifically, the response cache storage unit 110 stores the hash value calculated from the search query for searching for information on the programs A and B and the response data obtained by combining the fragments a1 to b3 in association with each other.

フラグメントキャッシュ保存部１１１は、フラグメント単位でフラグメントをキャッシュする。具体的には、フラグメントキャッシュ保存部１１１は、ドキュメント取得部１０６がＲＤＢＭＳ２から取得した各フラグメントａ１〜ｂ３とｆｉｄ、ｆｖｅｒとを関連付けて保持する。ｆｉｄは、フラグメントａ１〜ｂ３それぞれに一意に割り付けられた識別子である。ｆｖｅｒは、フラグメントのバージョンを表す値であり、例えば、番組Ａの番組メタデータであるフラグメントａ１が更新された場合、ｆｖｅｒが１から２へ更新される。ｆｖｅｒとして単純に更新時刻を用いてもよい。 The fragment cache storage unit 111 caches fragments in units of fragments. Specifically, the fragment cache storage unit 111 stores the fragments a1 to b3 acquired from the RDBMS2 by the document acquisition unit 106 in association with fid and fver. fid is an identifier uniquely assigned to each of the fragments a1 to b3. fver is a value representing the version of the fragment. For example, when fragment a1 which is program metadata of program A is updated, fver is updated from 1 to 2. The update time may be simply used as fver.

次に、検索装置１の処理の流れについて説明する。 Next, the processing flow of the search device 1 will be described.

図３は、検索装置１の処理の流れを示すフローチャートである。 FIG. 3 is a flowchart showing the flow of processing of the search device 1.

クエリ受信部１０１は、端末３から検索クエリを受信すると、検索クエリからｈａｓｈ値を生成する（ステップＳ１０１）。 When receiving the search query from the terminal 3, the query receiving unit 101 generates a hash value from the search query (step S101).

そして、キャッシュ有効期限判定部１０２は、そのｈａｓｈ値に対応する有効期限をＲＤＢＭＳ２のキャッシュ管理テーブル２０４より取得し（ステップＳ１０２）、ｈａｓｈ値に対応する有効期限の有無を判定する（ステップＳ１０３）。 Then, the cache expiration date determination unit 102 acquires the expiration date corresponding to the hash value from the cache management table 204 of the RDBMS 2 (step S102), and determines whether there is an expiration date corresponding to the hash value (step S103).

ｈａｓｈ値に対応する有効期限が取得できた場合、キャッシュ有効期限判定部１０２は、その有効期限が期限内か否かを判定する（ステップＳ１０４）。期限切れの場合は、一次検索結果テーブル２０３、キャッシュ管理テーブル２０４から該当レコードを削除する（ステップＳ１０５）。 When the expiration date corresponding to the hash value can be acquired, the cache expiration date determination unit 102 determines whether the expiration date is within the expiration date (step S104). If it has expired, the corresponding record is deleted from the primary search result table 203 and the cache management table 204 (step S105).

ステップＳ１０３においてｈａｓｈ値に対応する有効期限が無いと判定された場合、あるいはステップＳ１０４において取得した有効期限が期限切れと判定された場合は、受信した検索クエリを用いて、構文解析部１０３、ＳＱＬ生成部１０４、検索実行部１０５によりＲＤＢＭＳ２に対して一次検索を実施する（ステップＳ１０６）。一次検索とは、検索条件に該当するコンテンツＩＤをアトリビュートテーブル２０１から検索するものである。具体的には、検索条件に従って、アトリビュートテーブル２０１の属性に該当するコンテンツＩＤを検索する。一次検索結果である、検索条件に該当するコンテンツＩＤのリスト（ｃｒｉｄｌｉｓｔ）を、検索クエリから生成したｈａｓｈ値とともに一次検索結果テーブル２０３に格納する。 If it is determined in step S103 that there is no expiration date corresponding to the hash value, or if the expiration date acquired is determined in step S104 to have passed, the syntax analysis unit 103, the SQL generation unit 104, and the search execution unit 105 perform a primary search on the RDBMS 2 using the received search query (step S106). The primary search retrieves the content IDs matching the search condition from the attribute tables 201. Specifically, content IDs whose attributes in the attribute tables 201 match the search condition are retrieved. The primary search result, a list (cridlist) of content IDs matching the search condition, is stored in the primary search result table 203 together with the hash value generated from the search query.

一方、ｈａｓｈ値に対応する有効期限が期限内の場合は、ｈａｓｈ値に対応する応答データがレスポンスキャッシュ保存部１１０に存在するか否か判定する（ステップＳ１０７）。 On the other hand, when the expiration date corresponding to the hash value is within the time limit, it is determined whether or not response data corresponding to the hash value exists in the response cache storage unit 110 (step S107).

レスポンスキャッシュ保存部１１０にｈａｓｈ値に対応する応答データが存在する場合は、応答データ返却部１０８は、レスポンスキャッシュ保存部１１０からその応答データを取得し（ステップＳ１０８）、端末３に返却する（ステップＳ１１３）。 When the response data corresponding to the hash value exists in the response cache storage unit 110, the response data return unit 108 acquires the response data from the response cache storage unit 110 (step S108) and returns it to the terminal 3 (step S108). S113).

一次検索を実施した後、あるいはｈａｓｈ値に対応する応答データが存在しない場合は、二次検索を実施する（ステップＳ１０９）。二次検索とは、一次検索結果テーブル２０３の一次検索結果とドキュメントテーブル２０２を結合し、コンテンツＩＤのリスト（ｃｒｉｄｌｉｓｔ）に該当するドキュメントをＲｅｓｕｌｔＳｅｔオブジェクト２０５として取得するものである。具体的には、一次検索で得られたコンテンツＩＤのリスト（ｃｒｉｄｌｉｓｔ）とドキュメントテーブル２０２をコンテンツＩＤ（ｃｒｉｄ）で内部結合した表を取得する。 After performing the primary search or when there is no response data corresponding to the hash value, a secondary search is performed (step S109). The secondary search combines the primary search result of the primary search result table 203 and the document table 202 and acquires a document corresponding to a content ID list (cridlist) as a ResultSet object 205. Specifically, a table in which a list of content IDs (cridlist) obtained by the primary search and the document table 202 are internally joined by content ID (crid) is acquired.
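The join performed by the secondary search can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: the table names `document` and `primary_result`, their columns, and the choice to store the cridlist as one row per content ID are not taken from the patent, and the query returns crid where the patent's ResultSet object carries an oid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (fid INTEGER PRIMARY KEY, fver INTEGER, crid TEXT, body TEXT);
CREATE TABLE primary_result (hash TEXT, crid TEXT);
INSERT INTO document VALUES
  (1, 1, 'crid-A', '<a1/>'), (2, 1, 'crid-A', '<a2/>'),
  (3, 2, 'crid-B', '<b1/>'), (4, 1, 'crid-C', '<c1/>');
INSERT INTO primary_result VALUES ('h1', 'crid-A'), ('h1', 'crid-B');
""")


def secondary_search(conn, hash_value):
    """Inner-join the primary search result with the document table on crid.

    Only fid, fver and crid are selected; the large document body is not
    read here, so it can be fetched later only for fragments that are not
    already cached.
    """
    return conn.execute(
        "SELECT d.fid, d.fver, d.crid FROM primary_result AS p "
        "JOIN document AS d ON d.crid = p.crid "
        "WHERE p.hash = ? ORDER BY d.fid",
        (hash_value,),
    ).fetchall()


rows = secondary_search(conn, "h1")
```

Here `rows` contains the fragments of the two matching programs, while the fragment belonging to `crid-C` is excluded by the join.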

二次検索後、ドキュメント取得部１０６が二次検索で得られたＲｅｓｕｌｔＳｅｔオブジェクト２０５を参照し、フラグメントキャッシュ保存部１１１あるいはＲＤＢＭＳ２からドキュメントを取得する（ステップＳ１１０）。ドキュメントを取得するステップＳ１１０の詳細は後述する。 After the secondary search, the document acquisition unit 106 refers to the ResultSet object 205 obtained by the secondary search, and acquires a document from the fragment cache storage unit 111 or the RDBMS 2 (step S110). Details of step S110 for obtaining a document will be described later.

そして、応答データ生成部１０７が、取得したドキュメントを結合して応答データを生成し（ステップＳ１１１）、生成した応答データをｈａｓｈ値とともにレスポンスキャッシュ保存部１１０に保存する（ステップＳ１１２）。 Then, the response data generation unit 107 combines the acquired documents to generate response data (step S111), and stores the generated response data in the response cache storage unit 110 together with the hash value (step S112).

そして、応答データ返却部１０８が生成した応答データを端末３に返却する（ステップＳ１１３）。 Then, the response data generated by the response data return unit 108 is returned to the terminal 3 (step S113).
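The branching among steps S103 to S113 can be summarized in a short sketch. It assumes plain dictionaries for the expiration table and the response cache, numeric timestamps, and a caller-supplied `build_response` callback that stands in for the primary search, secondary search, document acquisition, and response generation; none of these names come from the patent.

```python
def handle_query(query_hash, now, expirations, response_cache, build_response, ttl):
    """Return (response, source) for one search query.

    expirations:    dict hash -> expiration time (stand-in for table 204)
    response_cache: dict hash -> response data (stand-in for unit 110)
    build_response: callable(hash, run_primary) -> response data (hypothetical)
    """
    expires_at = expirations.get(query_hash)                       # S102
    if expires_at is not None and now < expires_at:                # S103, S104
        cached = response_cache.get(query_hash)                    # S107
        if cached is not None:
            return cached, "response-cache"                        # S108, S113
        # The primary result is still valid: only the secondary search runs.
        response = build_response(query_hash, run_primary=False)   # S109-S111
    else:
        expirations.pop(query_hash, None)                          # S105
        response = build_response(query_hash, run_primary=True)    # S106, S109-S111
        expirations[query_hash] = now + ttl                        # new expiration
    response_cache[query_hash] = response                          # S112
    return response, "generated"                                   # S113
```

A repeated query within the validity period is answered from the response cache without touching the database, while an expired or unseen query triggers the full search path.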

続いて、ドキュメントを取得する処理の流れについて説明する。 Next, the flow of processing for acquiring a document will be described.

図４は、ドキュメント取得部１０６がドキュメントを取得する処理の流れを示すフローチャートである。 FIG. 4 is a flowchart illustrating a flow of processing in which the document acquisition unit 106 acquires a document.

まず、ドキュメント取得部１０６は、ＲｅｓｕｌｔＳｅｔオブジェクト２０５よりｆｉｄ，ｆｖｅｒを１件取得し（ステップＳ２０１）、ｆｉｄ，ｆｖｅｒが取得できたか否か判定する（ステップＳ２０２）。ｆｉｄ，ｆｖｅｒが取得できない場合は、検索条件に該当するすべてのドキュメントを取得したのでドキュメントを取得する処理を終了する。 First, the document acquisition unit 106 acquires one fid and fver from the ResultSet object 205 (step S201), and determines whether the fid and fver have been acquired (step S202). If fid and fver cannot be acquired, all the documents corresponding to the search condition have been acquired, and the process for acquiring the documents ends.

そして、取得したｆｉｄに対応するフラグメントがフラグメントキャッシュ保存部１１１に存在するか否か判定する（ステップＳ２０３）。取得したｆｉｄに対応するフラグメントがフラグメントキャッシュ保存部１１１に存在する場合、取得したｆｖｅｒとフラグメントキャッシュ保存部１１１に格納されたフラグメントキャッシュのｆｖｅｒとを比較する（ステップＳ２０４）。 Then, it is determined whether a fragment corresponding to the acquired fid exists in the fragment cache storage unit 111 (step S203). If such a fragment exists in the fragment cache storage unit 111, the acquired fver is compared with the fver of the fragment cache stored in the fragment cache storage unit 111 (step S204).

取得したｆｖｅｒがフラグメントキャッシュのｆｖｅｒ以下の場合は、フラグメントキャッシュ保存部１１１からドキュメント（フラグメント）を取得する（ステップＳ２０５）。 If the acquired fver is less than or equal to the fver of the fragment cache, the document (fragment) is acquired from the fragment cache storage unit 111 (step S205).

一方、取得したｆｖｅｒがフラグメントキャッシュのｆｖｅｒより大きい場合は、ＲｅｓｕｌｔＳｅｔオブジェクト２０５のｏｉｄに該当するドキュメントをＲＤＢＭＳ２から取得する（ステップＳ２０６）。取得したドキュメントは、ｆｉｄ，ｆｖｅｒとともにフラグメントキャッシュ保存部１１１に保存する（ステップＳ２０７）。 On the other hand, if the acquired fver is larger than the fver of the fragment cache, the document corresponding to the oid of the ResultSet object 205 is acquired from the RDBMS 2 (step S206). The acquired document is stored in the fragment cache storage unit 111 together with fid and fver (step S207).

ドキュメントを取得した後は、ステップＳ２０１に戻り、これらの処理をＲｅｓｕｌｔＳｅｔオブジェクト２０５が保持する全てのｆｉｄについて行う。 After the document is acquired, the process returns to step S201, and these processes are performed for all fids held in the ResultSet object 205.
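Steps S201 to S207 amount to a per-fragment cache lookup with a version check, sketched below. The tuple layout of the result rows, the dictionary used as the fragment cache, and the `fetch_from_db` callback (standing in for reading the large object at oid from the RDBMS 2) are assumptions made for illustration.

```python
def acquire_documents(result_rows, fragment_cache, fetch_from_db):
    """Collect the documents for every row of the secondary search result.

    result_rows:    iterable of (fid, fver, oid) tuples
    fragment_cache: dict fid -> (fver, document)
    fetch_from_db:  callable(oid) -> document (hypothetical large-object read)
    """
    documents = []
    for fid, fver, oid in result_rows:                  # S201, S202
        cached = fragment_cache.get(fid)                # S203
        if cached is not None and fver <= cached[0]:    # S204
            document = cached[1]                        # S205: cache is current
        else:
            document = fetch_from_db(oid)               # S206: missing or stale
            fragment_cache[fid] = (fver, document)      # S207
        documents.append(document)
    return documents
```

When one fragment of a program is updated in the database and its fver increases, only that fragment is read again; the other fragments continue to be served from the cache.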

以上説明したように、本実施の形態によれば、応答データを構成する個別のドキュメントをフラグメントキャッシュ保存部１１１に保存しておき、ドキュメント取得部１０６は、取得するドキュメントがフラグメントキャッシュ保存部１１１に存在すればフラグメントキャッシュ保存部１１１からドキュメントを取得し、存在しなければ、ＲＤＢＭＳ２からドキュメントを取得することにより、ドキュメントをＲＤＢＭＳ２から取得する回数を最小限にすることが可能となり、検索の応答時間を短縮することができる。 As described above, according to the present embodiment, the individual documents constituting the response data are stored in the fragment cache storage unit 111, and the document acquisition unit 106 acquires a document from the fragment cache storage unit 111 if the document to be acquired exists there, and from the RDBMS 2 if it does not. This minimizes the number of times documents are acquired from the RDBMS 2 and shortens the search response time.

本実施の形態によれば、フラグメントキャッシュ保存部１１１にドキュメントのバージョンを保存しておき、ドキュメント取得部１０６は、フラグメントキャッシュ保存部１１１のドキュメントのバージョンがＲＤＢＭＳ２のドキュメントのバージョンより古い場合は、ＲＤＢＭＳ２からドキュメントを取得することにより、キャッシュされたドキュメントとＲＤＢＭＳ２のドキュメントに不整合が生じない。 According to the present embodiment, the document version is stored in the fragment cache storage unit 111, and the document acquisition unit 106 acquires the document from the RDBMS 2 when the document version in the fragment cache storage unit 111 is older than the document version in the RDBMS 2. As a result, no inconsistency arises between the cached document and the document in the RDBMS 2.

また、レスポンスキャッシュ保存部１１０に検索要求単位で応答データをキャッシュしておくことで、同一の検索クエリを受信した場合には、より応答時間を短縮することができる。 In addition, by caching the response data for each search request in the response cache storage unit 110, the response time can be further shortened when the same search query is received.

DESCRIPTION OF SYMBOLS
１…検索装置 1 ... Search device
１０１…クエリ受信部 101 ... Query receiving unit
１０２…キャッシュ有効期限判定部 102 ... Cache expiration date determination unit
１０３…構文解析部 103 ... Syntax analysis unit
１０４…ＳＱＬ生成部 104 ... SQL generation unit
１０５…検索実行部 105 ... Search execution unit
１０６…ドキュメント取得部 106 ... Document acquisition unit
１０７…応答データ生成部 107 ... Response data generation unit
１０８…応答データ返却部 108 ... Response data return unit
１１０…レスポンスキャッシュ保存部 110 ... Response cache storage unit
１１１…フラグメントキャッシュ保存部 111 ... Fragment cache storage unit
２…ＲＤＢＭＳ 2 ... RDBMS
２０１…アトリビュートテーブル 201 ... Attribute table
２０２…ドキュメントテーブル 202 ... Document table
２０３…一次検索結果テーブル 203 ... Primary search result table
２０４…キャッシュ管理テーブル 204 ... Cache management table
２０５…ＲｅｓｕｌｔＳｅｔオブジェクト 205 ... ResultSet object
３…端末 3 ... Terminal

Claims

A search device for searching a database storing documents for documents matching a search condition, the search device comprising:
receiving means for receiving a search query;
search means for searching the database using the search query and obtaining, for each document matching the search condition of the search query, a search result having an identifier of the document and storage location information indicating a storage location of the document;
first cache storage means for storing previously acquired documents and the identifiers of those documents;
acquisition means for determining whether a document corresponding to the document identifier in the search result exists in the first cache storage means, acquiring the document from the first cache storage means if it exists there, and acquiring the document from the database based on the storage location information in the search result if it does not;
storage means for storing the document acquired from the database in the first cache storage means together with the identifier of the document;
generation means for combining the one or more documents acquired by the acquisition means to generate response data; and
return means for returning the response data.

The search device according to claim 1, wherein:
the search result has a version of the document;
the first cache storage means stores a version of the document;
the acquisition means acquires the document from the first cache storage means when the version in the search result is the same as or older than the version in the first cache storage means, and acquires the document from the database based on the storage location information in the search result when the version in the search result is newer than the version in the first cache storage means; and
the storage means further stores the version of the document acquired from the database in the first cache storage means.

The search device according to claim 1, further comprising:
an expiration date management table that holds an expiration date corresponding to the search query;
second cache storage means for storing response data previously returned in correspondence with the search query; and
determination means for acquiring the expiration date corresponding to the search query and, when the expiration date can be acquired and has not passed, determining that the response data is to be acquired from the second cache storage means,
wherein, when the determination means determines that the response data is to be acquired from the second cache storage means, the return means acquires the response data corresponding to the search query from the second cache storage means and returns it, and, when returning the response data generated by the generation means, stores that response data in the second cache storage means in association with the search query.

A search method in which a search device searches a database storing documents for documents matching a search condition, the search method comprising:
receiving a search query;
searching the database using the search query and obtaining, for each document matching the search condition of the search query, a search result having an identifier of the document and storage location information indicating a storage location of the document;
determining whether a document corresponding to the document identifier in the search result exists in first cache storage means that stores previously acquired documents and the identifiers of those documents, acquiring the document from the first cache storage means if it exists there, and acquiring the document from the database based on the storage location information in the search result if it does not;
storing the document acquired from the database in the first cache storage means together with the identifier of the document;
combining the one or more acquired documents to generate response data; and
returning the response data.

The search method according to claim 4, wherein:
the search result has a version of the document;
the first cache storage means stores a version of the document;
the acquiring step acquires the document from the first cache storage means when the version in the search result is the same as or older than the version in the first cache storage means, and acquires the document from the database based on the storage location information in the search result when the version in the search result is newer than the version in the first cache storage means; and
the storing step further stores the version of the document acquired from the database in the first cache storage means.

The search method according to claim 4, further comprising:
acquiring the expiration date corresponding to the search query from an expiration date management table that holds expiration dates corresponding to search queries and, when the expiration date can be acquired and has not passed, determining that the response data is to be acquired from second cache storage means that stores response data previously returned in correspondence with the search query,
wherein the returning step acquires the response data corresponding to the search query from the second cache storage means and returns it when the determining step determines that the response data is to be acquired from the second cache storage means, and, when returning the response data generated in the generating step, stores that response data in the second cache storage means in association with the search query.

A search program for causing a computer to execute the search method according to claim 4.