JP2004145729A

JP2004145729A - Data management device and program and recording medium

Info

Publication number: JP2004145729A
Application number: JP2002311340A
Authority: JP
Inventors: Hiroshi Takegawa; 竹川　弘志; Tetsuya Ikeda; 池田　哲也
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-10-25
Filing date: 2002-10-25
Publication date: 2004-05-20
Anticipated expiration: 2022-10-25
Also published as: JP4311926B2

Abstract

PROBLEM TO BE SOLVED: To provide a data management device that, when retrieving all the data registered in a hash table, can quickly find the buckets in which data is registered, regardless of the size of the hash table or the number of data items stored in it.

SOLUTION: The data management device is provided with a hash table storage means 14 consisting of: the hash table; chain lists of elements holding data information, each starting from the bucket on the hash table obtained from the hash value of the data to be registered; and link information holding the association between the registered buckets on the hash table. When data information is registered in a chain list and the registration uses a new bucket on the hash table, the link information is modified so that the new bucket becomes the bucket to be followed next after the buckets already registered.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、データを登録し検索可能にするデータ管理装置並びにプログラムおよび記録媒体に関し、特に、ハッシュ関数を用いてデータの格納場所を求めるデータ管理装置に関する。
【０００２】
【従来の技術】
コンピュータにおけるデータの登録、検索などのデータ管理を行う手法の一つにハッシュ法がある（非特許文献１および特許文献１を参照）。
ハッシュ法は、データをある一定の長さのハッシュ値に圧縮するため、ハッシュ値に衝突が起こる場合がある。
例えば、図１３に示したように、ハッシュテーブルの大きさをｎとしたとき、ハッシュテーブルはｎ個のバケットに分割され、それぞれのバケットは１からｎまでのインデックスによって指し示される。
この図１３の場合、インデックスｎにおいてハッシュ値が衝突しており、チェイン法を用いて解決している。
【０００３】
このハッシュ法を使って、あるデータを登録するときは、以下の手順で処理する。
（１）登録するデータに対してハッシュ関数を適用し、ハッシュ値を求める。
（２）求めたハッシュ値をインデックスとするバケットを求める。
（３）求めたバケットに登録されているものがあるか調べる。
（４）図１３に示したインデックス２のバケットのように、空ポインタのゼロが記憶されており、何も登録されていないバケットへは、登録するデータ（またはそのデータへのポインタ）を格納した要素を生成し、その要素へのポインタをバケットに登録する（インデックス１または３を参照）。
（５）または、図１３に示したインデックス１やインデックス３のバケットのように、すでにチェインの先頭要素へのポインタが記憶されているバケットへは、登録するデータ（またはそのデータへのポインタ）を格納した要素を生成し、その要素へのポインタをバケットに登録する。また、生成した要素には、元の先頭要素へのポインタを記憶する（インデックスｎを参照）。その結果、新たに生成した要素がチェインの先頭となり、元の先頭要素はチェインの先頭から２番目の要素となる。
【０００４】
以上のようにハッシュテーブルを使ってデータ値を登録すると、同じハッシュ値が得られるデータは、そのハッシュ値をインデックスとするバケットに格納されたチェインリストに登録されることになる。
【０００５】
以上のようにして登録されたデータを取り出すには、以下の手順で処理する。
（１）現在処理中のバケットのインデックスを１に初期化する。
（２）現在処理中のバケットのインデックスがｎより大きければ、すべてのバケットを処理しているので、データの取り出し処理を終了する。
（３）現在処理中のバケットに格納されているチェインリストの先頭要素へのポインタを取り出して、現在処理中の要素へのポインタとする。
（Ａ）現在処理中の要素へのポインタが空ポインタのゼロであれば、現在処理中の要素をすべて処理したので、（４）へ進む。
（Ｂ）さもなければ、そのポインタが指す要素に格納されているデータを取り出す。
（Ｃ）ポインタが指す要素に格納されている、次の要素へのポインタを取り出し、現在処理中の要素へのポインタとし、（Ａ）へ戻る。
（４）現在処理中のバケットのインデックスに１を加え、（２）へ戻る。
【０００６】
特許文献２の「テーブル管理方式」では、ハッシュテーブルに登録されるすべての要素をその登録順に依存しない独立した順序で辿ることを可能にするために、それぞれの要素に次の要素を得るための情報（例えば、ポインタ）を格納するようにしている。この情報を使うことにより、ハッシュテーブルにデータがどのような順序に登録されているかとは無関係に、すべての要素を任意の順序、例えば、それらが保持するデータの昇順に並ぶような順序に並べることができる。
【０００７】
【非特許文献１】
萩原宏、西原清一著「現代データ構造とプログラム技法」、
オーム社、１９８７年１０月
【特許文献１】
特許第３０５６７０４号公報（従来技術）
【特許文献２】
特開平０５−０８１１０２号公報
【０００８】
【発明が解決しようとする課題】
上述したように、ハッシュテーブルに登録されているすべてのデータを取り出すとき、あるバケットにチェインの先頭要素へのポインタが格納されていない場合、そのバケットの（インデックス＋１）のバケット以降にチェインの先頭要素へのポインタが登録されているかを、１つずつ調べていく必要がある。
【０００９】
この調査は、ハッシュテーブルのサイズが大きくなるにつれ、また、ハッシュテーブルに格納されているデータの数が少なくなるにつれ、バケットにチェインの先頭要素へのポインタが登録されていない場合が多くなり、次のデータを得るために必要なバケットの探索に必要な時間が長くなっていく。
【００１０】
特許文献２の技術は、この問題を解決するためのものであり、一般的なチェイン法を用いるハッシュテーブルと比較して、要素を次へ次へと辿るだけなので、登録されているすべてのデータを高速に取り出すことができる。
しかし、次の要素を得るための情報を要素ごとに格納するため、その情報のための領域が登録される要素の数に比例して必要となってくる。
【００１１】
本発明は、上述の実情を考慮してなされたものであって、ハッシュテーブルに登録されているすべてのデータを取り出すとき、ハッシュテーブルのサイズやハッシュテーブルに格納されているデータの数に影響なく、データが登録されているバケットを高速に探索することができるデータ管理装置並びにプログラムおよび記録媒体を提供することを目的とする。
【００１２】
【課題を解決するための手段】
上記の課題を解決するために、本発明の請求項１のデータ管理装置は、ハッシュ関数を用いて登録するデータの格納場所を決定するデータ管理装置において、ハッシュテーブルと、登録するデータにハッシュ関数を適用して求められた前記ハッシュテーブル上のバケットから開始されるデータに関する情報を保持する要素をチェインによってリンクしたチェインリストと、前記ハッシュテーブル上の登録済バケットの関連付けを保持するリンク情報とからなるハッシュテーブル記憶手段と、登録するデータにハッシュ関数を適用して求められたバケットから示すチェインリストに該データに関する情報を登録して前記ハッシュテーブル記憶手段へ格納するデータ登録手段とを備え、前記ハッシュテーブル上の新たなバケットに登録した場合、この新たなバケットをデータが登録済のバケットの次に辿るべきバケットとして前記リンク情報を変更するようにしたことを特徴とする。
【００１３】
また、本発明の請求項２は、請求項１に記載のデータ管理装置において、前記リンク情報が、データが登録済であるバケットの前に辿るべきバケットの位置を表す情報と、データが登録済であるバケットの後ろに辿るべきバケットの位置を表す情報とからなることを特徴とする。
また、本発明の請求項３は、請求項１または２に記載のデータ管理装置において、前記リンク情報がバケットを示すインデックスであることを特徴とする。
【００１４】
また、本発明の請求項４は、請求項１、２または３に記載のデータ管理装置において、登録したデータの検索を行うデータ検索手段を備え、前記データ検索手段は、全データを取り出すときには、前記リンク情報から前記ハッシュテーブルにデータを登録しているバケットを取り出して、そのバケットからリンクされているチェインリストから登録されているデータを取り出すようにしたことを特徴とする。
【００１５】
また、本発明の請求項５は、請求項１、２、３または４に記載のデータ管理装置において、登録したデータの削除を行うデータ削除手段を備え、前記データ削除手段は、指定された削除データを前記チェインリストから削除し、このチェインリストに登録されたデータがなくなった場合、この削除データを示すバケットを前記リンク情報から削除するようにしたことを特徴とする。
【００１６】
また、本発明の請求項６のプログラムは、コンピュータに、請求項１乃至５のいずれかに記載のデータ管理装置の機能を実行させるためのプログラムである。また、本発明の請求項７の記録媒体は、請求項６に記載のプログラムを記録したコンピュータ読み取り可能な記録媒体である。
【００１７】
したがって、ハッシュテーブルに登録されているすべてのデータを取り出すとき、ハッシュテーブルのサイズやハッシュテーブルに格納されているデータの数に影響なく、データが登録されているバケットを高速に探索することができる。
【００１８】
さらに、データの取り出し順が正順でも逆順でも同様に、ハッシュテーブルのサイズやハッシュテーブルに格納されているデータの数に影響なく、バケットを高速に探索することができる。
【００１９】
【発明の実施の形態】
以下、図面を参照して本発明に係るデータ管理装置の一実施形態を説明する。図１は、本発明に係るデータ管理装置の実施形態を構成するコンピュータシステムのブロック図である。
図１において、本実施形態は、サーバ１０および端末２０とから構成される。
【００２０】
端末２０は、システムの利用者がサーバ１０で実行されているデータ管理装置に対して登録・検索・削除するデータそのものやデータが記録されているデータファイルなどを指定したり、得られた操作の結果を出力するために利用するコンピュータである。
一般に、端末２０には、問合せ要求を入力するためのキーボードおよびマウス等のポインティングデバイスや、問合せ結果を表示するためのディスプレイが装備されている。
また、端末２０は、ＣＰＵ、メモリ、ハードディスクを備え、プログラムが実行可能である。
【００２１】
サーバ１０は、端末２０と通信網３０で接続され、端末２０からの操作要求に応じた処理を行い、その結果を端末２０に返すためのコンピュータである。
また、サーバ１０は、ＣＰＵ、メモリ、ハードディスクを備え、データ管理のためのプログラムが実行可能である。
サーバ１０によって管理されるデータは、ハードディスクなどの記録媒体に記録される。
【００２２】
ここで、通信網３０は、サーバ１０および端末２０間を結合するための伝送路であって、一般には、ケーブルで実現され、通信プロトコルにはＴＣＰ／ＩＰが使われる。但し、伝送路としてはケーブルだけではなく、それらの間の通信プロトコルが一致するものであれば有線または無線のいずれでもよく、例えば、公衆回線や専用回線等によるＬＡＮ（Ｌｏｃａｌ　Ａｒｅａ　Ｎｅｔｗｏｒｋ）、ＷＡＮ（Ｗｉｄｅ　Ａｒｅａ　Ｎｅｔｗｏｒｋ）、インターネットなどを用いることができる。
【００２３】
また、本発明のデータ管理装置の各機能が端末２０ですべて実行されるときには、上述したサーバ１０および通信網３０は必要としない。
【００２４】
図２は、本発明に係るデータ管理装置の機能構成を示すブロック図であり、同図において、データ管理装置は、データ入出力手段２１、データ登録手段１１、ハッシュテーブル記憶手段１４、データ検索手段１２、データ削除手段１３とを少なくとも含んでいる。
【００２５】
データ入出力手段２１は、端末２０において動作し、利用者が端末２０を介して入力する、データの登録、検索、削除要求を受け付け、この要求をそれぞれ処理し、その結果を出力する。
【００２６】
データ登録手段１１、データ検索手段１２、データ削除手段１３およびハッシュテーブル記憶手段１４は、サーバ１０で動作し、データ管理装置の機能の一部を構成する。
【００２７】
まず、データ登録手段１１、データ検索手段１２およびデータ削除手段１３で使うハッシュテーブル、チェインリストおよびリンク情報を記憶するハッシュテーブル記憶手段１４のデータ構造について説明する。このハッシュテーブル記憶手段１４は、サーバ１０のメモリまたはハードディスク上に記憶される。
【００２８】
図３は、本実施形態で使用するハッシュテーブル記憶手段１４のデータ構造の一例であり、ハッシュテーブルＨおよびチェインリストＣのデータ構造は、従来の技術で説明したものと同じである。
【００２９】
すなわち、ハッシュテーブルＨの大きさをｎとしたとき、ハッシュテーブルＨはｎ個のバケットに分割され、それぞれのバケットは１からｎまでのインデックスによって指し示される。
データが登録されていない各バケットはゼロが設定される。
データを登録するときは、データにハッシュ関数を適用し、ハッシュ値を求めてハッシュ値をインデックスとするバケットを求める。
（１）この求めたバケットに登録されているものがない（バケットに格納されている値がゼロ）場合には、要素Ｅを生成し、その要素Ｅへのポインタをバケットに登録する。
この要素Ｅは、登録するデータ（またはそのデータへのポインタ）Ｅｖと同じハッシュ値をもつデータの他の要素ＥへのポインタＥｐとから構成される。
要素ＥのポインタＥｐがゼロに設定されていれば、次にチェインされる要素がないことを示している。
【００３０】
（２）または、この求めたバケットに登録されているものがある場合には、登録するデータ（またはそのデータへのポインタ）を格納した要素Ｅを生成し、生成した要素ＥのチェインポインタＥｐには、バケットに登録されているチェインリストの先頭要素へのポインタを格納し、バケットには生成した要素Ｅへのポインタを登録する。
【００３１】
以下、本実施形態では、リンク情報の表現形式として、バケットを示すインデックスを使用するものとして説明する。
図３において、リンク情報は、後リンク開始インデックスＢと後リンク情報Ｎ、および前リンク開始インデックスＡと前リンク情報Ｐからなっている。
【００３２】
後リンク開始インデックスＢと後リンク情報Ｎは、それぞれインデックス０の要素、その後にインデックス１からｎの各バケットに対応した要素をならべて１つの配列として構成する。
後リンク開始インデックスＢは、ハッシュテーブルＨに登録されているすべてのデータを取得するときに、最初に探索すべきバケットのインデックスが記憶される。後リンク情報Ｎは、バケットからチェインされている要素を探索後、次に探索を開始するバケットのインデックスを格納している。後リンク情報Ｎの値がゼロのときは、次に探索するバケットがないことを示す。
【００３３】
例えば、図４において、後リンク開始インデックスＢ（インデックス０）の格納値は３であるから最初に探索されるのは、インデックス３のバケットから行われる。
インデックス３のバケットの後リンク情報Ｎにはｎが格納されているので、このバケットの次に探索されるバケットはインデックスｎのバケットである。
また、インデックス１の後リンク情報Ｎにはゼロが格納されているので、このバケットの次に探索するバケットが存在しない。
【００３４】
同様に、前リンク開始インデックスＡと前リンク情報Ｐは、それぞれインデックス０の要素、その後にインデックス１からｎの各バケットに対応した要素をならべて１つの配列として構成する。
前リンク開始インデックスＡは、ハッシュテーブルＨに登録されているすべてのデータを取得するときに、後リンク情報で辿る順序の逆順にバケットを並べたときの最初に探索すべきバケットのインデックスが記憶される。前リンク情報Ｐは、バケットからチェインされている要素を探索後、次に探索を開始するバケットのインデックスを格納している。前リンク情報Ｐの値がゼロのときは、次に探索するバケットがないことを示す。
【００３５】
例えば、図４において、前リンク開始インデックスＡ（インデックス０）の格納値は１であるから最初に探索されるのは、インデックス１のバケットから行われる。
インデックス１のバケットの前リンク情報Ｐにはｎが格納されているので、このバケットの次に探索されるバケットはインデックスｎのバケットである。
また、インデックス２の前リンク情報Ｐにはゼロが格納されているので、このバケットの次に探索するバケットが存在しない。
【００３６】
データ登録手段１１は、データ入出力手段２１からのデータ登録要求に応じ、与えられたデータが登録されるべきチェインを求め、そのチェインの先頭要素としてデータを登録する。
データ登録手段１１の処理手順を図５に示すフローチャートによって説明する。
ここで、データを登録するに先立ち、前リンク開始インデックスＡ、後リンク開始インデックスＢおよびハッシュテーブルＨの各バケットの値をゼロに初期化しておく。
【００３７】
データ入出力手段２１から渡された登録データに対してハッシュ関数を適用して、ハッシュ値を求める（ステップＳ１，Ｓ２）。
求めたハッシュ値をインデックスとするバケットを求める（ステップＳ３）。以下、求めたバケットのインデックスをｉとする。
登録するデータ（またはそのデータへのポインタ）を格納する要素を生成する（ステップＳ３）。
この要素は、チェインリストＣにおけるもので、図３における要素Ｅに対応するもので、登録するデータ（またはそのデータへのポインタ）をＥｖへ格納し、ゼロをＥｐへ格納している。
【００３８】
インデックスｉのバケットの内容がゼロであるかを調べる（ステップＳ５）。
インデックスｉのバケットの内容がゼロでなければ（ステップＳ５のＹＥＳ）、すでにチェインの先頭要素へのポインタが登録されているので、生成したチェインリストＣの要素のＥｐ領域にインデックスｉのバケットの内容を格納し（ステップＳ６）、ステップＳ１１へ進む。その結果、今生成した要素がチェインリストの先頭となる。
【００３９】
一方、インデックスｉのバケットの内容がゼロであれば（ステップＳ５のＮＯ）、インデックスｉの前リンク情報をゼロとして格納する（ステップＳ７）。
後リンク開始インデックスの内容をインデックスとする前リンク情報にｉを格納する（ステップＳ８）。
インデックスｉの後リンク情報として、後リンク開始インデックスの内容を格納する（ステップＳ９）。
後リンク開始インデックスにｉを格納する（ステップＳ１０）。
最後に、チェインリストに生成した要素へのポインタをインデックスｉのバケットに登録し（ステップＳ１１）、データの登録を終了する。
【００４０】
例えば、図４の状態のハッシュテーブル記憶手段１４に対して、ハッシュ値が２のデータを登録すると、図６のように登録される。
【００４１】
データ検索手段１２は、データ入出力手段２１からのデータ検索要求が１つのデータを検索するものであれば、ハッシュテーブルから与えられたデータが登録されているチェインリスト中の要素を求め、その要素が検索対象となるデータを保持しているか否かによって、その結果をデータ入出力手段２１に返す。
また、データ入出力手段２１からの全データ取得要求があれば、リンク情報を参照しながらすべてのチェインの全要素を取り出し、その結果をデータ入出力手段２１に返す。
【００４２】
まず、１つのデータを検索する要求を受けたときのデータ検索手段１２の処理手順を図７に示すフローチャートによって説明する。
検索対象となるデータに対してハッシュ関数を適用し、ハッシュ値を求める（ステップＳ２１，Ｓ２２）。
求めたハッシュ値をインデックスとするバケットを求める（ステップＳ２３）。
現在処理中の要素へのポインタをバケットに格納されているチェインの先頭要素へのポインタとする（ステップＳ２４）。
【００４３】
現在処理中の要素へのポインタがゼロであるかを調べる（ステップＳ２５）。
現在処理中の要素へのポインタがゼロであれば（ステップＳ２５のＮＯ）、検索しているデータは登録されていないとして処理を終了する（ステップＳ２６）。
一方、現在処理中の要素へのポインタがゼロでなければ（ステップＳ２５のＹＥＳ）、そのポインタが示す要素に格納されているデータを取り出し、検索対象であるデータと等しければ（ステップＳ２７のＹＥＳ）、このデータが検索しているデータであるとして処理を終了する（ステップＳ２８）。
ポインタで示された要素のデータが検索対象のデータでなければ、その要素に格納されている次の要素へのポインタを現在処理中の要素へのポインタとして置き換え（ステップＳ２９）、ステップＳ２５へ戻る。
【００４４】
次に、すべてのデータを取り出す要求を受けたときのデータ検索手段１２の処理手順を図８に示すフローチャートによって説明する。
以下、後リンク情報を使った全データの取得を説明するが、前リンク情報を使った全データの取得も同様にできる。すなわち、後リンク情報を使ったときの手順において、後リンク情報の代わりに前リンク情報を使えばよい。
【００４５】
現在処理中のバケットのインデックスを後リンク開始インデックスの内容で初期化する（ステップＳ３１）。
現在処理中のバケットのインデックスがゼロであれば（ステップＳ３２のＹＥＳ）、すべてのバケットを処理したとして処理を終了する。
これまでに取り出されたデータは、全データの取り出し要求したデータ入出力手段２１へ戻される。
【００４６】
現在処理中の要素へのポインタをバケットに格納されているチェインリストの先頭要素へのポインタで初期化する（ステップＳ３３）。
現在処理中の要素へのポインタがゼロであるか調べ、空ポインタであるゼロであれば（ステップＳ３４のＮＯ）、現在処理中の要素へのポインタがなくなったので現在処理中のバケットのインデックスを現在処理中のバケットの後リンク情報に置き換えて（ステップＳ３５）、ステップＳ３２へ戻る。
この処理で、同じハッシュ値（インデックス）を持つデータを示すチェインリストに関するデータの取り出しを終了し、次のハッシュ値を持つデータの処理へ移る。
【００４７】
一方、現在処理中の要素へのポインタがゼロでない場合（ステップＳ３４のＹＥＳ）、そのポインタで示される要素に格納されているデータを取り出す（ステップＳ３６）。
要素に格納されている次の要素へのポインタを得て、現在処理中の要素へのポインタとし（ステップＳ３７）、ステップＳ３４へ戻る。
【００４８】
データ削除手段１３は、データ入出力手段２１からのデータ削除要求に応じ、与えられたデータが登録されているチェインリストを求め、そのチェインリスト中に与えられた削除要求されたデータが登録されていれば、削除し、その結果をデータ入出力手段２１に返す。
データ削除手段１３の処理手順を図９および図１０に示すフローチャートによって説明する。
【００４９】
データ入出力手段２１から渡された削除対象となるデータに対してハッシュ関数を適用し、ハッシュ値を求める（ステップＳ４１，Ｓ４２）。
求めたハッシュ値をインデックスとするバケットを求める（ステップＳ４３）。以下、ここで求めたインデックスをｉとする。
現在処理中の要素へのポインタをバケットに格納されているチェインの先頭要素へのポインタとする（ステップＳ４４）。
【００５０】
現在処理中の要素へのポインタがゼロであるかを調べる（ステップＳ４５）。
現在処理中の要素へのポインタがゼロであれば（ステップＳ４５のＮＯ）、削除指定されたデータは登録されていないとして処理を終了する（ステップＳ４６）。
一方、現在処理中の要素へのポインタがゼロでない場合（ステップＳ４５のＹＥＳ）、そのポインタで示される要素に格納されているデータが削除対象であるデータと一致しなければ（ステップＳ４７のＮＯ）、その要素に格納されている次の要素へのポインタを現在処理中の要素へのポインタとして置き換え（ステップＳ４８）、ステップＳ４５へ戻る。
一方、そのポインタで示される要素に格納されているデータが削除対象であるデータと一致していれば（ステップＳ４７のＹＥＳ）、このデータが削除しているデータであるので、以下の処理を行う。
【００５１】
削除するデータを格納する要素がチェインリストの先頭であるか調べ、チェインの先頭であれば（ステップＳ４９のＹＥＳ）、削除する要素が示す次の要素のポインタをバケットへ格納し、この削除対象の要素を削除する（ステップＳ５０）。
例えば、図１１（Ａ）に示すように、チェインリスト中の先頭の要素（値１を格納する要素）を削除する場合、この値１を格納する要素の次の要素（値２を格納する要素）へのポインタをバケットに格納する。
【００５２】
また、削除する要素がチェインリストの先頭でなければ、（ステップＳ４９のＮＯ）削除するデータを格納する要素の直前の要素が指すポインタを、削除するデータを格納する要素の次の要素へのポインタとし、この削除対象の要素を削除する（ステップＳ５１）。
例えば、図１１（Ｂ）に示すように、チェインリスト中の２番目の要素（値２を格納する要素）を削除する場合、この値２を格納する要素の次の要素（値３を格納する要素）へのポインタを、１番目の要素（値１を格納する要素）の次の要素を指すポインタとする。
【００５３】
次に、削除対象の要素を削除した後、まだチェインリスト中に要素があれば（ステップＳ５２のＮＯ）、削除処理を終了する。
一方、削除対象の要素を削除した結果、チェインリストにひとつも要素がなくなれば（ステップＳ５２のＹＥＳ）、図１０のフローチャートで示す以下の処理を行う。
【００５４】
インデックスｉの後リンク情報をｊとしたとき、インデックスｊの前リンク情報をインデックスｉの前リンク情報と置き換える（ステップＳ５３）。
インデックスｉの前リンク情報をｋとしたとき、インデックスｋの後リンク情報をインデックスｉの後リンク情報と置き換える（ステップＳ５４）。
【００５５】
後リンク開始インデックスの内容がインデックスｉであれば（ステップＳ５５のＹＥＳ）、後リンク開始インデックスをインデックスｉの後リンク情報に置き換える（ステップＳ５６）。
前リンク開始インデックスの内容がインデックスｉであれば（ステップＳ５７のＹＥＳ）、前リンク開始インデックスをインデックスｉの前リンク情報と置き換える（ステップＳ５８）。
最後に、インデックスｉの後リンク情報および前リンク情報にゼロを格納し（ステップＳ５９，Ｓ６０）、削除処理を終了する。
【００５６】
例えば、図４に示した状態のハッシュテーブル記憶手段１４の内容から、ハッシュ値が３のバケットの先頭に格納されているデータを削除すると、図１２に示すようになる。
【００５７】
本発明は、上述した実施形態のみに限定されたものではない。上述した実施形態のデータ管理装置を構成する各機能をそれぞれプログラム化し、あらかじめＣＤ−ＲＯＭ等の記録媒体に書き込んでおき、コンピュータに搭載したＣＤ−ＲＯＭドライブのような媒体駆動装置にこのＣＤ−ＲＯＭ等を装着して、これらのプログラムをコンピュータのメモリあるいは記憶装置に格納し、それを実行することによって、本発明の目的が達成されることは言うまでもない。
この場合、記録媒体から読み出されたプログラム自体が上述した実施形態の機能を実現することになり、そのプログラムおよびそのプログラムを記録した記録媒体も本発明を構成することになる。
【００５８】
なお、プログラムを格納する記録媒体としては半導体媒体（例えば、ＲＯＭ、不揮発性メモリカード等）、光媒体（例えば、ＤＶＤ、ＭＯ、ＭＤ、ＣＤ等）、磁気媒体（例えば、磁気テープ、フレキシブルディスク等）等のいずれであってもよい。
【００５９】
また、ロードしたプログラムを実行することにより上述した実施形態の機能が実現されるだけでなく、そのプログラムの指示に基づき、オペレーティングシステムあるいは他のアプリケーションプログラム等と共同して処理することによって上述した実施形態の機能が実現される場合も含まれる。
【００６０】
市場に流通させる場合には、可搬型の記録媒体にプログラムを格納して流通させたり、インターネット等の通信網を介して接続されたサーバコンピュータの記憶装置に格納しておき、通信網を通じて他のコンピュータに転送することもできる。この場合、このサーバコンピュータの記憶装置も本発明の記録媒体に含まれる。なお、可搬型の記録媒体上のプログラム、または転送されてくるプログラムを、コンピュータに内蔵する記録媒体へインストールし、そのインストールされたプログラムを実行することによって上述した実施形態の機能が実現される。
【００６１】
【発明の効果】
以上説明したように本発明によれば、ハッシュテーブルに登録されているすべてのデータを取り出すとき、ハッシュテーブルのサイズやハッシュテーブルに格納されているデータの数に影響なく、データが登録されているバケットを高速に探索することができる。
【００６２】
さらに、データの取り出し順が正順でも逆順でも同様に、ハッシュテーブルのサイズやハッシュテーブルに格納されているデータの数に影響なく、バケットを高速に探索することができる。
【図面の簡単な説明】
【図１】本発明に係るデータ管理装置の実施形態を構成するコンピュータシステムのブロック図である。
【図２】本発明に係るデータ管理装置の機能構成を示すブロック図である。
【図３】ハッシュデータ記憶手段のデータ構造の一例である。
【図４】ハッシュデータ記憶手段のデータ設定を説明するための図である。
【図５】データ登録手段の処理手順を示すフローチャートである。
【図６】図４のハッシュデータ記憶手段に新たなデータを登録した一例である。
【図７】指定されたデータの検索要求を受けた場合のデータ検索手段の処理手順を示すフローチャートである。
【図８】全データの取り出し要求を受けた場合のデータ検索手段の処理手順を示すフローチャートである。
【図９】データ削除手段の処理手順を示すフローチャートである。
【図１０】データ削除手段の処理手順を示すフローチャートである（図９の続き）。
【図１１】チェインリストから要素を削除するときの説明図である。
【図１２】図４に示した状態のハッシュテーブル記憶手段からデータを削除したときの一例である。
【図１３】従来のハッシュテーブルの構成を説明するための図である。
【符号の説明】
Ａ…前リンク開始インデックス、Ｐ…前リンク情報、Ｂ…後リンク開始インデックス、Ｎ…後リンク情報、Ｈ…ハッシュテーブル、Ｅ…チェインリストの要素、Ｅｖ…要素Ｅのデータ値またはデータへのポインタ、Ｅｐ…要素Ｅのチェインリスト中の次の要素へのポインタ、１０…サーバ、１１…データ登録手段、１２…データ検索手段、１３…データ削除手段、１４…ハッシュテーブル記憶手段、２０…端末、２１…データ入出力手段、３０…通信網。
[0001]
[Technical field of the invention]
The present invention relates to a data management device, a program, and a recording medium for registering and retrieving data, and more particularly to a data management device for obtaining a data storage location using a hash function.
[0002]
[Prior art]
One technique for data management on a computer, such as registering and searching data, is the hash method (see Non-Patent Document 1 and Patent Document 1).
Because the hash method compresses data into a hash value of fixed length, hash values may collide.
For example, as shown in FIG. 13, when the size of the hash table is n, the hash table is divided into n buckets, and each bucket is indicated by an index from 1 to n.
In the case of FIG. 13, hash values collide at index n, and the collision is resolved using the chain method.
[0003]
When registering data using this hash method, the following procedure is used.
(1) Apply a hash function to the data to be registered to obtain a hash value.
(2) Obtain the bucket indexed by the obtained hash value.
(3) Check whether anything is registered in the obtained bucket.
(4) If the bucket holds the null pointer zero and nothing is registered in it, like the bucket at index 2 shown in FIG. 13, generate an element storing the data to be registered (or a pointer to the data) and register a pointer to that element in the bucket (see index 1 or 3).
(5) Otherwise, if the bucket already holds a pointer to the head element of a chain, like the buckets at index 1 and index 3 shown in FIG. 13, generate an element storing the data to be registered (or a pointer to the data) and register a pointer to that element in the bucket. In addition, store a pointer to the original head element in the generated element (see index n). As a result, the newly generated element becomes the head of the chain, and the original head element becomes the second element from the head of the chain.
[0004]
When data values are registered using a hash table in this way, data that yield the same hash value are registered in the chain list stored in the bucket indexed by that hash value.
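The conventional registration procedure in steps (1) to (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the `Element` class, the `bucket_index` hash function, and the table size of 8 are hypothetical, and `None` stands in for the null pointer zero.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Element:
    value: Any                         # registered data (or a reference to it)
    next: Optional["Element"] = None   # next element in the same chain


def bucket_index(data: Any, n: int) -> int:
    # Hypothetical hash function mapping data to an index in 1..n.
    return hash(data) % n + 1


def register(table: List[Optional[Element]], data: Any) -> int:
    n = len(table) - 1                 # slot 0 is unused so indexes run 1..n
    i = bucket_index(data, n)          # steps (1) and (2)
    # Steps (3) to (5): the new element points at the old head (None when the
    # bucket was empty) and becomes the new head of the chain.
    table[i] = Element(data, table[i])
    return i


table: List[Optional[Element]] = [None] * (8 + 1)
first = register(table, 3)
second = register(table, 11)           # 3 and 11 collide when n = 8
```

Because the new element always takes over the bucket's previous content as its `next` pointer, the empty-bucket case (4) and the occupied-bucket case (5) reduce to the same statement in this sketch.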
[0005]
To retrieve the data registered as described above, the following procedure is performed.
(1) The index of the bucket currently being processed is initialized to 1.
(2) If the index of the bucket currently being processed is greater than n, all buckets have been processed, and the data fetching process ends.
(3) The pointer to the head element of the chain list stored in the bucket currently being processed is extracted and used as the pointer to the element currently being processed.
(A) If the pointer to the element currently being processed is the null pointer zero, all elements of the current chain have been processed, and the process proceeds to (4).
(B) Otherwise, retrieve the data stored in the element pointed to by the pointer.
(C) The pointer to the next element stored in the element indicated by the pointer is taken out, set as the pointer to the element currently being processed, and the process returns to (A).
(4) Add 1 to the index of the bucket currently being processed, and return to (2).
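The retrieval steps above can be sketched as follows. This is an illustrative sketch only: elements are modeled as hypothetical `(value, next)` tuples, `None` stands in for the null pointer zero, and the counter `examined` is added purely to show how many buckets the scan touches.

```python
def retrieve_all(table):
    """Scan buckets 1..n in index order; return (data, buckets_examined)."""
    n = len(table) - 1             # slot 0 is unused so indexes run 1..n
    data, examined = [], 0
    i = 1                          # step (1)
    while i <= n:                  # step (2): stop once i exceeds n
        examined += 1
        p = table[i]               # step (3): head of this bucket's chain
        while p is not None:       # (A): None marks the end of the chain
            value, p = p           # (B) take the data, (C) advance
            data.append(value)
        i += 1                     # step (4)
    return data, examined


# Hypothetical table with n = 1000 buckets and only two chains in use.
table = [None] * 1001
table[7] = ("a", ("b", None))
table[900] = ("c", None)
data, examined = retrieve_all(table)
```

In this sketch every one of the n buckets is examined even though only two of them hold data, which is the cost that grows with the table size.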
[0006]
In the “table management method” of Patent Document 2, each element stores information (for example, a pointer) for obtaining the next element, so that all elements registered in a hash table can be traced in an independent order that does not depend on the order of registration. By using this information, all the elements can be arranged in an arbitrary order, for example in ascending order of the data they hold, regardless of the order in which the data was registered in the hash table.
[0007]
[Non-patent document 1]
Hagiwara Hiroshi and Nishihara Seiichi, "Modern Data Structure and Programming Techniques",
Ohmsha, October 1987
[Patent Document 1]
Japanese Patent No. 3056704 (prior art)
[Patent Document 2]
JP 05-081102 A
[0008]
[Problems to be solved by the invention]
As described above, when retrieving all the data registered in the hash table, if a bucket does not hold a pointer to the head element of a chain, the buckets from (index + 1) of that bucket onward must be checked one by one to see whether a pointer to the head element of a chain is registered in them.
[0009]
As the size of the hash table increases, and as the number of data items stored in the hash table decreases, more buckets have no pointer to the head element of a chain registered, and the time required to find the bucket needed to obtain the next data item becomes longer.
[0010]
The technique of Patent Document 2 is intended to solve this problem. Compared with a hash table using the general chain method, it only has to follow the elements one after another, so all the registered data can be retrieved at high speed.
However, since information for obtaining the next element is stored for each element, an area for the information is required in proportion to the number of registered elements.
[0011]
The present invention has been made in view of the above circumstances, and its object is to provide a data management device, a program, and a recording medium that, when retrieving all the data registered in a hash table, can quickly find the buckets in which data is registered, regardless of the size of the hash table or the number of data items stored in it.
[0012]
[Means for Solving the Problems]
In order to solve the above-mentioned problem, a data management device according to claim 1 of the present invention is a data management device that determines the storage location of data to be registered using a hash function, and comprises: hash table storage means consisting of a hash table, chain lists in which elements holding information on data are linked by a chain starting from the bucket on the hash table obtained by applying the hash function to the data to be registered, and link information holding the association between registered buckets on the hash table; and data registration means that registers information on the data in the chain list indicated by the bucket obtained by applying the hash function to the data to be registered, and stores it in the hash table storage means. When data is registered in a new bucket on the hash table, the link information is changed so that the new bucket becomes the bucket to be followed next after the buckets in which data is already registered.
[0013]
According to a second aspect of the present invention, in the data management device according to the first aspect, the link information consists of information indicating the position of the bucket to be followed before a bucket in which data has been registered, and information indicating the position of the bucket to be followed after that bucket.
According to a third aspect of the present invention, in the data management device according to the first or second aspect, the link information is an index indicating a bucket.
[0014]
According to a fourth aspect of the present invention, the data management device according to the first, second, or third aspect further includes data search means that searches for registered data. When retrieving all data, the data search means obtains from the link information the buckets of the hash table in which data is registered, and retrieves the registered data from the chain lists linked from those buckets.
[0015]
According to a fifth aspect of the present invention, the data management device according to the first, second, third, or fourth aspect further includes data deletion means that deletes registered data. The data deletion means deletes the designated data from the chain list and, when no data remains registered in that chain list, removes the bucket indicating the deleted data from the link information.
[0016]
A program according to a sixth aspect of the present invention is a program for causing a computer to execute the functions of the data management device according to any one of the first to fifth aspects. A recording medium according to a seventh aspect of the present invention is a computer-readable recording medium storing the program according to the sixth aspect.
[0017]
Therefore, when retrieving all the data registered in the hash table, the buckets in which data is registered can be found at high speed regardless of the size of the hash table or the number of data items stored in it.
[0018]
Furthermore, whether the data is retrieved in forward or reverse order, the buckets can likewise be found at high speed regardless of the size of the hash table or the number of data items stored in it.
[0019]
[Embodiments of the invention]
Hereinafter, an embodiment of a data management device according to the present invention will be described with reference to the drawings. FIG. 1 is a block diagram of a computer system constituting an embodiment of a data management device according to the present invention.
In FIG. 1, this embodiment includes a server 10 and a terminal 20.
[0020]
The terminal 20 is a computer that the user of the system uses to specify, to the data management device running on the server 10, the data itself to be registered, searched for, or deleted, or a data file in which the data is recorded, and to output the results of the operation.
In general, the terminal 20 is equipped with a keyboard and a pointing device such as a mouse for entering query requests, and a display for showing query results.
The terminal 20 includes a CPU, a memory, and a hard disk, and can execute a program.
[0021]
The server 10 is a computer that is connected to the terminal 20 via the communication network 30, performs a process according to an operation request from the terminal 20, and returns a result to the terminal 20.
The server 10 includes a CPU, a memory, and a hard disk, and can execute a program for data management.
Data managed by the server 10 is recorded on a recording medium such as a hard disk.
[0022]
Here, the communication network 30 is a transmission path connecting the server 10 and the terminal 20. It is generally realized by a cable, and TCP/IP is used as the communication protocol. However, the transmission path is not limited to a cable and may be wired or wireless as long as the communication protocols on both sides match; for example, a LAN (Local Area Network) or WAN (Wide Area Network) using a public line or a dedicated line, or the Internet, can be used.
[0023]
When all the functions of the data management device of the present invention are executed by the terminal 20, the server 10 and the communication network 30 described above are not required.
[0024]
FIG. 2 is a block diagram showing the functional configuration of the data management device according to the present invention. In FIG. 2, the data management device includes at least data input/output means 21, data registration means 11, hash table storage means 14, data search means 12, and data deletion means 13.
[0025]
The data input / output means 21 operates at the terminal 20, receives data registration, search, and deletion requests input by the user via the terminal 20, processes the requests, and outputs the results.
[0026]
The data registration means 11, the data search means 12, the data deletion means 13, and the hash table storage means 14 operate on the server 10 and constitute part of the functions of the data management device.
[0027]
First, the data structure of the hash table storage means 14, which stores the hash table, the chain lists, and the link information used by the data registration means 11, the data search means 12, and the data deletion means 13, will be described. The hash table storage means 14 is stored in the memory or on the hard disk of the server 10.
[0028]
FIG. 3 shows an example of the data structure of the hash table storage means 14 used in the present embodiment. The data structures of the hash table H and the chain list C are the same as those described in the related art.
[0029]
That is, when the size of the hash table H is n, the hash table H is divided into n buckets, and each bucket is indicated by an index from 1 to n.
Each bucket for which no data is registered is set to zero.
When registering data, a hash function is applied to the data, a hash value is obtained, and a bucket having the hash value as an index is obtained.
(1) If there is nothing registered in the obtained bucket (the value stored in the bucket is zero), an element E is generated, and a pointer to the element E is registered in the bucket.
The element E consists of Ev, the data to be registered (or a pointer to the data), and Ep, a pointer to another element E whose data has the same hash value.
If the pointer Ep of the element E is set to zero, it indicates that there is no element to be chained next.
[0030]
(2) Alternatively, if something is already registered in the obtained bucket, an element E storing the data to be registered (or a pointer to the data) is generated, a pointer to the head element of the chain list registered in the bucket is stored in the chain pointer Ep of the generated element E, and a pointer to the generated element E is registered in the bucket.
[0031]
Hereinafter, the present embodiment will be described on the assumption that an index indicating a bucket is used as the expression format of the link information.
In FIG. 3, the link information includes a rear link start index B and rear link information N, and a front link start index A and front link information P.
[0032]
The rear link start index B and the rear link information N form a single array in which B is the element at index 0, followed by the elements of N corresponding to the buckets at indexes 1 to n.
The rear link start index B stores the index of the bucket to be searched first when acquiring all the data registered in the hash table H. The rear link information N stores the index of the bucket from which to start searching next, after the elements chained from a bucket have been searched. A value of zero in the rear link information N indicates that there is no next bucket to search.
[0033]
For example, in FIG. 4, the value stored in the rear link start index B (index 0) is 3, so the search starts from the bucket at index 3.
Since n is stored in the rear link information N of the bucket at index 3, the bucket searched after this bucket is the bucket at index n.
Further, since zero is stored in the rear link information N of index 1, there is no bucket to be searched after this bucket.
[0034]
Similarly, the front link start index A and the front link information P form a single array in which A is the element at index 0, followed by the elements of P corresponding to the buckets at indexes 1 to n.
The front link start index A stores the index of the bucket to be searched first when, in acquiring all the data registered in the hash table H, the buckets are visited in the reverse of the order traced by the rear link information. The front link information P stores the index of the bucket from which to start searching next, after the elements chained from a bucket have been searched. A value of zero in the front link information P indicates that there is no next bucket to search.
[0035]
For example, in FIG. 4, the value stored in the front link start index A (index 0) is 1, so the search starts from the bucket at index 1.
Since n is stored in the front link information P of the bucket at index 1, the bucket searched after this bucket is the bucket at index n.
Since zero is stored in the front link information P of index 2, there is no bucket to be searched after this bucket.
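The two link arrays can be illustrated with a short Python sketch. The values below are assumptions chosen to be consistent with the FIG. 4 description above, taking n = 5 for concreteness: the text gives B = 3, N[3] = n, N[1] = 0, A = 1, and P[1] = n, while N[n] = 1, P[n] = 3, and P[3] = 0 are inferred so that the two arrays describe the same sequence in opposite directions. The helper name `walk` is hypothetical.

```python
def walk(links):
    """Follow a link array from its start index (slot 0) until zero."""
    order = []
    i = links[0]           # slot 0 holds the start index (B or A)
    while i != 0:          # zero means there is no next bucket
        order.append(i)
        i = links[i]
    return order


n = 5                      # assumed table size for this illustration
N = [3, 0, 0, 5, 0, 1]     # N[0] = B = 3, N[3] = n, N[n] = 1, N[1] = 0
P = [1, 5, 0, 0, 0, 3]     # P[0] = A = 1, P[1] = n, P[n] = 3, P[3] = 0

forward = walk(N)          # buckets in rear-link order
backward = walk(P)         # buckets in front-link order
```

Only the buckets that hold data are visited in either direction, so the number of steps depends on the number of registered buckets rather than on n.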
[0036]
In response to a data registration request from the data input/output means 21, the data registration means 11 finds the chain in which the given data is to be registered, and registers the data as the head element of that chain.
The processing procedure of the data registration means 11 will be described with reference to the flowchart shown in FIG. 5.
Before any data is registered, the front link start index A, the rear link start index B, and every bucket of the hash table H are initialized to zero.
[0037]
A hash function is applied to the data to be registered, passed from the data input/output means 21, to obtain a hash value (steps S1 and S2).
The bucket indexed by the obtained hash value is obtained (step S3). Hereinafter, the index of this bucket is denoted i.
An element for storing the data to be registered (or a pointer to the data) is generated (step S3).
This element belongs to the chain list C and corresponds to the element E in FIG. 3; the data to be registered (or a pointer to the data) is stored in Ev, and zero is stored in Ep.
[0038]
It is checked whether the content of the bucket at index i is zero (step S5).
If the content of the bucket at index i is not zero (YES in step S5), a pointer to the head element of a chain is already registered, so the content of the bucket at index i is stored in the Ep field of the generated element of the chain list C (step S6), and the process proceeds to step S11. As a result, the element just generated becomes the head of the chain list.
[0039]
On the other hand, if the content of the bucket at index i is zero (NO in step S5), zero is stored as the previous link information of index i (step S7).
The index i is stored as the previous link information of the bucket whose index is the content of the rear link start index (step S8).
The content of the rear link start index is stored as the rear link information of index i (step S9).
The index i is stored in the rear link start index (step S10).
Finally, the pointer to the generated element of the chain list is registered in the bucket at index i (step S11), and the data registration ends.
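The registration procedure of steps S1 to S11 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names LinkedHashTable, Element, index_of and register are hypothetical, None stands in for the zero (empty) pointer, and the hash function is an arbitrary stand-in. The description does not state how the previous link start index A receives its first value, so the sketch assumes it is set when the first bucket is registered.

```python
class Element:
    """Chain-list element E: v plays the role of Ev, p the role of Ep."""
    def __init__(self, v, p=None):
        self.v = v
        self.p = p


class LinkedHashTable:
    def __init__(self, n):
        self.n = n
        # Slot 0 is left unused so that index 0 can mean "no bucket".
        self.H = [None] * (n + 1)  # bucket -> head element (None = empty)
        self.P = [0] * (n + 1)     # previous link information
        self.N = [0] * (n + 1)     # rear link information
        self.A = 0                 # previous link start index
        self.B = 0                 # rear link start index

    def index_of(self, data):
        # Assumed hash function: maps data to a bucket index in 1..n.
        return hash(data) % self.n + 1

    def register(self, data):
        i = self.index_of(data)        # S1-S3
        e = Element(data)              # S4: Ep starts out empty
        if self.H[i] is not None:      # S5: a chain already exists
            e.p = self.H[i]            # S6
        else:
            self.P[i] = 0              # S7
            if self.B != 0:
                self.P[self.B] = i     # S8
            else:
                self.A = i             # assumption: first bucket also starts the previous-link chain
            self.N[i] = self.B         # S9
            self.B = i                 # S10
        self.H[i] = e                  # S11: new element becomes the chain head


table = LinkedHashTable(4)
for value in (1, 2, 5):                # 1 and 5 collide in bucket 2
    table.register(value)
```

In this sketch only the branch for a previously empty bucket touches the link information, so registering into an existing chain costs the same as in an ordinary chained hash table.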
[0040]
For example, when data with a hash value of 2 is registered in the hash table storage unit 14 in the state of FIG. 4, the result is as shown in FIG. 6.
[0041]
When the data search request from the data input/output unit 21 is for a single piece of data, the data search unit 12 obtains from the hash table the chain list in which the given data would be registered, and returns a result to the data input/output unit 21 according to whether an element of that chain list holds the data being searched for.
When the request from the data input/output unit 21 is for all data, all the elements of all the chains are extracted by following the link information, and the result is returned to the data input/output unit 21.
[0042]
First, the processing procedure of the data search unit 12 when a request to search for one piece of data is received will be described with reference to the flowchart shown in FIG. 7.
A hash function is applied to the data to be searched to obtain a hash value (steps S21 and S22).
A bucket having the obtained hash value as an index is obtained (step S23).
The pointer to the element currently being processed is set as the pointer to the first element of the chain stored in the bucket (step S24).
[0043]
It is checked whether the pointer to the element currently being processed is zero (step S25).
If the pointer to the element currently being processed is zero (NO in step S25), the process ends on the assumption that the data being searched for has not been registered (step S26).
On the other hand, if the pointer to the element currently being processed is not zero (YES in step S25), the data stored in the element indicated by the pointer is extracted, and if it is equal to the data being searched for (YES in step S27), the process ends on the assumption that the data has been found (step S28).
If the data of the element indicated by the pointer is not the data being searched for, the pointer to the element currently being processed is replaced with the pointer to the next element stored in that element (step S29), and the process returns to step S25.
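The single-data search of steps S21 to S29 can be sketched as below. The sketch is illustrative only: the names Element, bucket_index and search are hypothetical, None stands in for the zero pointer, and the hash function is an assumed stand-in for the one used by the embodiment.

```python
class Element:
    """Chain-list element E: v plays the role of Ev, p the role of Ep."""
    def __init__(self, v, p=None):
        self.v = v
        self.p = p


def bucket_index(data, n):
    # Assumed hash function: maps data to a bucket index in 1..n.
    return hash(data) % n + 1


def search(H, data):
    """Return True if data is held by an element of its bucket's chain."""
    n = len(H) - 1                   # slot 0 of H is unused
    e = H[bucket_index(data, n)]     # S21-S24: head of the chain
    while e is not None:             # S25
        if e.v == data:              # S27
            return True              # S28: found
        e = e.p                      # S29: move to the next element
    return False                     # S26: not registered


# Build a small table by hand: 1 and 5 share bucket 2, 2 goes to bucket 3.
H = [None] * 5
for d in (1, 5, 2):
    i = bucket_index(d, 4)
    H[i] = Element(d, H[i])          # new element becomes the chain head

print(search(H, 5), search(H, 9))
```

Note that this search uses only the bucket and its chain; the link information plays no part in looking up a single piece of data.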
[0044]
Next, the processing procedure of the data search unit 12 when a request to retrieve all data is received will be described with reference to the flowchart shown in FIG. 8.
The following describes acquisition of all data using the rear link information. Acquisition of all data using the previous link information can be performed in the same way: in the procedure below, the previous link start index and the previous link information are used in place of the rear link start index and the rear link information.
[0045]
The index of the bucket currently being processed is initialized with the content of the rear link start index (step S31).
If the index of the bucket currently being processed is zero (YES in step S32), all buckets have been processed and the processing ends.
The data extracted so far is returned to the data input/output unit 21 that requested the extraction of all data.
[0046]
The pointer to the element currently being processed is initialized with the pointer to the head element of the chain list stored in the bucket (step S33).
It is checked whether the pointer to the element currently being processed is zero. If it is an empty pointer (NO in step S34), no element remains to be processed in this chain, so the index of the bucket currently being processed is replaced with the rear link information of that bucket (step S35), and the process returns to step S32.
At this point, extraction of the data in the chain list holding data with the same hash value (index) is complete, and processing moves on to the data with the next hash value.
[0047]
On the other hand, if the pointer to the element currently being processed is not zero (YES in step S34), the data stored in the element indicated by the pointer is extracted (step S36).
The pointer to the next element stored in the element is obtained, set as the pointer to the element currently being processed (step S37), and the process returns to step S34.
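The all-data retrieval of steps S31 to S37 can be sketched as follows. This is a hedged illustration, not the embodiment's code: Element and retrieve_all are hypothetical names, None stands in for the zero pointer, and the table contents are built by hand for the example. Passing the previous link information and the previous link start index instead of the rear ones gives the traversal in the opposite order.

```python
class Element:
    """Chain-list element E: v plays the role of Ev, p the role of Ep."""
    def __init__(self, v, p=None):
        self.v = v
        self.p = p


def retrieve_all(H, link, start):
    """Collect every stored value by following link[] from start."""
    out = []
    i = start                    # S31: begin at the link start index
    while i != 0:                # S32: zero means no more buckets
        e = H[i]                 # S33: head of this bucket's chain
        while e is not None:     # S34
            out.append(e.v)      # S36
            e = e.p              # S37
        i = link[i]              # S35: move to the next registered bucket
    return out


# Hand-built state: bucket 2 holds 5 -> 1, bucket 3 holds 2 (slot 0 unused).
H = [None, None, Element(5, Element(1)), Element(2), None]
N = [0, 0, 0, 2, 0]   # rear link information
P = [0, 0, 3, 0, 0]   # previous link information
A, B = 2, 3           # previous / rear link start indexes

forward = retrieve_all(H, N, B)
backward = retrieve_all(H, P, A)
```

The outer loop visits only buckets that have a chain, so its cost depends on the number of registered buckets rather than on the table size n.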
[0048]
In response to a data deletion request from the data input/output unit 21, the data deletion unit 13 obtains the chain list in which the given data would be registered; if the data to be deleted is registered in that chain list, it is deleted, and the result is returned to the data input/output unit 21.
The processing procedure of the data deletion unit 13 will be described with reference to the flowcharts shown in FIGS. 9 and 10.
[0049]
A hash function is applied to the data to be deleted passed from the data input / output unit 21 to obtain a hash value (steps S41 and S42).
A bucket having the obtained hash value as an index is obtained (step S43). Hereinafter, the index obtained here is defined as i.
The pointer to the element currently being processed is set as the pointer to the first element of the chain stored in the bucket (step S44).
[0050]
It is checked whether the pointer to the element currently being processed is zero (step S45).
If the pointer to the element currently being processed is zero (NO in step S45), the process ends on the assumption that the data designated for deletion has not been registered (step S46).
On the other hand, if the pointer to the element currently being processed is not zero (YES in step S45) and the data stored in the element indicated by the pointer does not match the data to be deleted (NO in step S47), the pointer to the element currently being processed is replaced with the pointer to the next element stored in that element (step S48), and the process returns to step S45.
If the data stored in the element indicated by the pointer matches the data to be deleted (YES in step S47), this element holds the data to be deleted, and the following processing is performed.
[0051]
It is checked whether the element storing the data to be deleted is the head of the chain list. If the element is the head of the chain (YES in step S49), the pointer to the next element held by the element to be deleted is stored in the bucket, and the element to be deleted is deleted (step S50).
For example, as shown in FIG. 11A, when the first element of the chain list (the element storing the value 1) is deleted, the pointer to the element following it (the element storing the value 2) is stored in the bucket.
[0052]
If the element to be deleted is not the head of the chain list (NO in step S49), the next-element pointer held by the element immediately before the element to be deleted is set to the pointer to the element following the element to be deleted, and the element to be deleted is deleted (step S51).
For example, as shown in FIG. 11B, when the second element of the chain list (the element storing the value 2) is deleted, the pointer to the element following it (the element storing the value 3) becomes the next-element pointer of the first element (the element storing the value 1).
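The two cases of steps S45 to S51 (deleting the head element and deleting a later element) can be sketched as below, using the values 1, 2 and 3 mentioned for FIG. 11. The names Element, make_chain, chain_values and delete_from_chain are hypothetical, and None stands in for the zero pointer; the sketch returns the new head so that the caller can store it in the bucket.

```python
class Element:
    """Chain-list element E: v plays the role of Ev, p the role of Ep."""
    def __init__(self, v, p=None):
        self.v = v
        self.p = p


def make_chain(*values):
    """Helper for the example: build a chain holding values in order."""
    head = None
    for v in reversed(values):
        head = Element(v, head)
    return head


def chain_values(head):
    """Helper for the example: list the values held by a chain."""
    out = []
    while head is not None:
        out.append(head.v)
        head = head.p
    return out


def delete_from_chain(head, data):
    """Return (new_head, deleted) after removing the element holding data."""
    prev, e = None, head
    while e is not None and e.v != data:   # S45, S47, S48
        prev, e = e, e.p
    if e is None:
        return head, False                 # S46: data was not registered
    if prev is None:
        return e.p, True                   # S50: head removed, bucket gets e.p
    prev.p = e.p                           # S51: bypass the removed element
    return head, True


head_a, _ = delete_from_chain(make_chain(1, 2, 3), 1)   # head case
head_b, _ = delete_from_chain(make_chain(1, 2, 3), 2)   # non-head case
print(chain_values(head_a), chain_values(head_b))
```

When the returned head is None, the chain has become empty, which corresponds to the YES branch of step S52 described next.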
[0053]
Next, after deleting the element to be deleted, if there is still an element in the chain list (NO in step S52), the deletion processing ends.
On the other hand, if there is no element in the chain list as a result of deleting the element to be deleted (YES in step S52), the following processing shown in the flowchart of FIG. 10 is performed.
[0054]
Letting j be the rear link information of index i, the previous link information of index j is replaced with the previous link information of index i (step S53).
Letting k be the previous link information of index i, the rear link information of index k is replaced with the rear link information of index i (step S54).
[0055]
If the content of the rear link start index is index i (YES in step S55), the rear link start index is replaced with the rear link information of index i (step S56).
If the content of the previous link start index is index i (YES in step S57), the previous link start index is replaced with the previous link information of index i (step S58).
Finally, zero is stored in the rear link information and the previous link information of index i (steps S59 and S60), and the deletion processing ends.
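The removal of an emptied bucket from the link information (steps S53 to S60) can be sketched as follows. The function name unlink_bucket is hypothetical, and the state in the example is built by hand rather than taken from FIG. 4. As an assumption of this sketch, updates addressed to index 0 (meaning "no bucket") are skipped, which the description does not state explicitly.

```python
def unlink_bucket(P, N, A, B, i):
    """Remove bucket i from both link chains; return the updated (A, B)."""
    j = N[i]
    if j != 0:           # assumption: index 0 means "no bucket", so skip it
        P[j] = P[i]      # S53
    k = P[i]
    if k != 0:           # same assumption as above
        N[k] = N[i]      # S54
    if B == i:           # S55
        B = N[i]         # S56
    if A == i:           # S57
        A = P[i]         # S58
    N[i] = 0             # S59
    P[i] = 0             # S60
    return A, B


# Hand-built state for n = 4 after buckets 2, 3, 1 were registered in that
# order: following N from B visits 1, 3, 2; following P from A visits 2, 3, 1.
P = [0, 0, 3, 1, 0]   # previous link information (slot 0 unused)
N = [0, 3, 0, 2, 0]   # rear link information (slot 0 unused)
A, B = 2, 1

A, B = unlink_bucket(P, N, A, B, 3)   # remove the middle bucket
print(A, B, P, N)
```

Because each bucket records both neighbours, the removal touches a fixed number of entries and does not require scanning the table.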
[0056]
For example, when the data stored at the head of the bucket with the hash value of 3 is deleted from the contents of the hash table storage unit 14 in the state shown in FIG. 4, the result is as shown in FIG. 12.
[0057]
The present invention is not limited to the above-described embodiment. Needless to say, the object of the present invention can also be achieved by programming each function of the data management device of the above-described embodiment, writing the programs in advance to a recording medium such as a CD-ROM, loading the CD-ROM into a medium drive such as a CD-ROM drive mounted on a computer, installing the programs in a memory or storage device of the computer, and executing them.
In this case, the program itself read from the recording medium implements the functions of the above-described embodiment, and the program and the recording medium on which the program is recorded also constitute the present invention.
[0058]
In addition, a semiconductor medium (for example, ROM, nonvolatile memory card, etc.), an optical medium (for example, DVD, MO, MD, CD, etc.), or a magnetic medium (for example, magnetic tape, flexible disk, etc.) may be used as the recording medium for storing the program.
[0059]
Further, the invention covers not only the case where the functions of the above-described embodiment are realized by executing the loaded program, but also the case where those functions are realized by processing performed in cooperation with an operating system or another application program on the basis of instructions from the program.
[0060]
For distribution to the market, the program can be stored in a portable recording medium and distributed, or it can be stored in a storage device of a server computer connected via a communication network such as the Internet and transferred to another computer through that network. In this case, the storage device of the server computer is also included in the recording medium of the present invention. The functions of the above-described embodiment are realized by installing the program from the portable recording medium, or the transferred program, onto a recording medium built into a computer and executing the installed program.
[0061]
[Effect of the invention]
As described above, according to the present invention, when all data registered in the hash table is retrieved, the buckets in which data is registered can be found at high speed regardless of the size of the hash table or the number of data items stored in it.
[0062]
Furthermore, the buckets can be found at high speed regardless of the size of the hash table or the number of data items stored in it, whether the data is retrieved in forward or reverse order.
[Brief description of the drawings]
FIG. 1 is a block diagram of a computer system constituting an embodiment of a data management device according to the present invention.
FIG. 2 is a block diagram showing a functional configuration of a data management device according to the present invention.
FIG. 3 is an example of a data structure of a hash table storage unit.
FIG. 4 is a diagram for explaining data setting of hash table storage means.
FIG. 5 is a flowchart illustrating a processing procedure of a data registration unit.
FIG. 6 is an example in which new data is registered in the hash table storage means of FIG. 4.
FIG. 7 is a flowchart illustrating a processing procedure of a data search unit when a search request for designated data is received.
FIG. 8 is a flowchart showing a processing procedure of a data search unit when a request to retrieve all data is received.
FIG. 9 is a flowchart illustrating a processing procedure of a data deleting unit.
FIG. 10 is a flowchart showing a processing procedure of a data deleting unit (continuation of FIG. 9).
FIG. 11 is an explanatory diagram when an element is deleted from a chain list.
FIG. 12 is an example in which data is deleted from the hash table storage means in the state shown in FIG. 4.
FIG. 13 is a diagram for explaining a configuration of a conventional hash table.
[Explanation of symbols]
A: previous link start index, P: previous link information, B: rear link start index, N: rear link information, H: hash table, E: chain list element, Ev: data value of element E or pointer to the data, Ep: pointer to the next element of element E in the chain list, 10: server, 11: data registration means, 12: data search means, 13: data deletion means, 14: hash table storage means, 20: terminal, 21: data input/output means, 30: communication network.

Claims

1. A data management device that determines a storage location of data to be registered using a hash function, comprising: hash table storage means including a hash table, a chain list in which elements holding information about data are linked in a chain starting from the bucket on the hash table obtained by applying the hash function to the data to be registered, and link information holding the association between the buckets on the hash table in which data is registered; and data registration means for registering information about the data to be registered in the chain list indicated by the bucket obtained by applying the hash function to that data, and storing it in the hash table storage means, wherein, when the data is registered in a new bucket on the hash table, the data registration means changes the link information so that the new bucket becomes the bucket to be followed next after an already registered bucket.

2. The data management device according to claim 1, wherein the link information comprises information indicating the position of the bucket to be followed before a bucket in which data is registered, and information indicating the position of the bucket to be followed after a bucket in which data is registered.

3. The data management device according to claim 1, wherein the link information is an index indicating a bucket.

4. The data management device according to claim 1, further comprising data search means for searching for registered data, wherein, when extracting all data, the data search means obtains from the link information the buckets on the hash table in which data is registered, and extracts the registered data from the chain lists linked from those buckets.

5. The data management device according to claim 1, further comprising data deletion means for deleting registered data, wherein the data deletion means deletes the specified data from the chain list and, when no data remains registered in that chain list, removes the bucket that indicated the deleted data from the link information.

6. A program for causing a computer to execute the functions of the data management device according to claim 1.

7. A computer-readable recording medium on which the program according to claim 6 is recorded.