JP2009020757A

JP2009020757A - Data registration apparatus, data registration method and program

Info

Publication number: JP2009020757A
Application number: JP2007183580A
Authority: JP
Inventors: Yusuke Doi; 裕介土井
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2007-07-12
Filing date: 2007-07-12
Publication date: 2009-01-29

Abstract

PROBLEM TO BE SOLVED: To provide a data registration apparatus capable of efficiently writing data from a primary data group to a DHT.

SOLUTION: Each DHT node 3 stores, in a distributed manner, data associated with a key on the basis of a hash value of the key. In the data registration apparatus 1, an input part 15 inputs, from a primary data table 2, a series of registration data each including a key and primary data serving as the data source; a hash sort part 12 calculates a hash value of the key of each registration data in the series and sorts the series on the basis of the hash values so that registration data to be stored in the same DHT node 3 are contiguous; a filter part 13 deletes from the series all registration data other than those to be registered again; and a writing part 14 writes each registration data of the series to the corresponding DHT node 3 in the sorted order.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a data registration apparatus, a data registration method, and a program for registering data in a distributed hash table.

In large-scale distributed storage systems typified by distributed hash tables (DHTs) such as Chord (Non-Patent Document 1) and Tapestry (Non-Patent Document 2), the data handled by the storage system is often held as soft state (Non-Patent Document 3). That is, retention of data on the storage system is guaranteed only for a limited period, and the data sources that own the primary data (clients that exist in a distributed manner) must write it again as necessary.

When all primary data is held on the DHT as soft state, the state to be maintained and the refresh traffic grow in proportion to the amount of data as the DHT system and its data volume grow, so the approach does not scale. On the other hand, if data retention is left to the DHT without using soft state, data may be lost through DHT node failures and the like. Furthermore, when backing up or migrating data from a DHT, the amount of data the DHT is expected to hold must be stored. A straightforward linear backup takes too long, so it is natural to back up or migrate to another DHT, but there has been no method for efficiently copying the contents of one DHT to another.

Non-Patent Document 1: I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of ACM SIGCOMM, August 2001.
Non-Patent Document 2: Ben Y. Zhao, John D. Kubiatowicz, and Anthony D. Joseph. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Technical Report UCB//CSD-01-1141, U.C. Berkeley, April 2000.
Non-Patent Document 3: Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu. OpenDHT: A public DHT service and its uses. In Proceedings of ACM SIGCOMM 2005, August 2005.

Conventionally, there has been no effective method for writing data from a primary data group to a DHT, or from a first DHT to a second DHT.

The present invention has been made in view of the above circumstances, and its object is to provide a data registration apparatus, a data registration method, and a program capable of efficiently writing data from a primary data group to a DHT or from a first DHT to a second DHT.

The present invention is a data registration apparatus that registers data in a distributed hash table composed of a plurality of distributed hash node apparatuses, which store data associated with key information in a distributed manner based on a hash value of the key information. The apparatus comprises: means for inputting a series of registration information each including the key information and primary data that is the source of the data; hash means for calculating, for each registration information in the input series, a hash value of the key information of that registration information; sorting means for sorting the registration information in the series, based on the hash value of the key information of each registration information, so that registration information to be stored in the same distributed hash node apparatus is contiguous; and writing means for writing all or part of the registration information in the sorted series to the distributed hash node apparatuses that should store it, in the sorted order.

According to the present invention, data can be written efficiently from a primary data group to a DHT or from a first DHT to a second DHT.

Embodiments of the present invention are described below with reference to the drawings.

(First embodiment)
FIG. 1 shows a configuration example of a distributed hash table system according to an embodiment of the present invention.

In FIG. 1, 1 is a data registration device, 2 is a primary data table, 3 is a DHT node, and 8 is a network.

A plurality of DHT nodes 3 cooperate to form one distributed hash table 30. The number of DHT nodes constituting the distributed hash table 30 is arbitrary, and can change dynamically as DHT nodes join and leave.

Each DHT node 3 holds key information (key) in association with data (data: one or more values), as illustrated in FIG. 2, for the key information whose hash value falls among the hash values the node is responsible for. When a DHT node 3 receives an inquiry message specifying key information from a client device (not shown), it computes the hash value of that key information. If the hash value is one it is responsible for, it returns to the client device a response message containing the data stored for that key information. Otherwise it either forwards the inquiry message to another DHT node 3 or returns a reference to a more appropriate DHT node 3, such as a combination of IP address and port number or a URL.
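The inquiry handling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class `DhtNode`, its fields, the message dictionaries, and the use of SHA-1 are all assumptions, and the responsible range is modelled as a simple non-wrapping interval.

```python
import hashlib
from dataclasses import dataclass, field

M = 160  # bits in the hash space; SHA-1 is assumed for this sketch


def h(key: str) -> int:
    """Hash a key to an integer in [0, 2**M)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


@dataclass
class DhtNode:
    address: str   # hypothetical "host:port" of this node
    low: int       # inclusive lower bound of the responsible hash range
    high: int      # inclusive upper bound of the responsible hash range
    next_hop: str  # hypothetical address of a more appropriate node
    table: dict = field(default_factory=dict)  # key -> list of values

    def is_responsible(self, hv: int) -> bool:
        return self.low <= hv <= self.high

    def handle_inquiry(self, key: str) -> dict:
        """Answer from the local table, or refer the client elsewhere."""
        if self.is_responsible(h(key)):
            return {"type": "response", "key": key,
                    "data": self.table.get(key, [])}
        return {"type": "referral", "key": key, "node": self.next_hop}
```

A node that covers the whole hash space answers every inquiry itself, while a node whose range excludes the key's hash returns a referral instead of data.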

For the DHT nodes 3, conventional methods may basically be used for managing information about the node itself, controlling joins and leaves, managing information about adjacent nodes, managing the database, searching for the DHT node responsible for an arbitrary hash value in the hash space to which the node belongs, and handling inquiry messages from client devices.

Each primary data table 2 of the primary data table group 20 holds, as illustrated in FIG. 3, various primary data (value) that serve as the source (data source) of the data to be registered in the distributed hash table 30, in association with its key information (key). The number of primary data tables 2 is arbitrary. Primary data with the same key information may be held across multiple primary data tables 2.

The primary data table group 20 may have any configuration. For example, each primary data table 2 may be an ordinary independent database, or a plurality of primary data tables 2 may themselves constitute another distributed hash table 30.

The data registration device 1 registers the primary data held in the primary data table 2 in the DHT node 3 responsible for that primary data (that is, each primary data item is registered in the DHT node 3 responsible for the hash value of its key information). Here, one data registration device 1 is assumed to be provided for each primary data table 2. A data registration device 1 and its primary data table 2 may be separate devices or a single integrated device.

The DHT nodes 3 constituting one distributed hash table 30 can communicate with each other via the network 8, and each data registration device 1 can communicate with any DHT node 3 via the network 8. The network 8 may be any kind of network. The description here takes the case where the network 8 is the Internet as an example.

The distributed hash table system illustrated in FIG. 1 is suited to service configurations used in, for example, traceability applications (see, for example, Tetsu Ozaki, Yusuke Doi, Shiro Wakayama. Product traceability system that can be expanded smoothly from a small scale. Toshiba Review, Vol. 60, No. 8, pp. 27-31, 2005). The following description takes as an example the use of this distributed hash table system to build an individual-item traceability system.

When applied to an individual-item traceability system, for example, each primary data table 2 of the primary data table group 20 holds, as the primary data for each item, "information on a server device that provides information related to the item (for example, quality information)", such as a URL, in association with its key information, "identification information unique to the item" (hereinafter called the tag ID). Each DHT node 3 of the distributed hash table 30 holds, in an index table used for lookups and the like, one or more such URLs as secondary data in association with the tag ID. In this case, a URL may point to the node holding the primary data table 2 of FIG. 1 or to another server device.

FIG. 4 shows GID-96 of EPCglobal (EPCglobal. EPC tag data standards version 1.3. EPCglobal Ratified Specification, March 2006) as an example of the tag ID assigned to each item. In this case, the tag ID consists of three levels: company ID, product type ID, and individual item ID. The DHT may be provided, for example, per company or per company and product type, but is not limited to these arrangements.
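To make the three-level structure concrete, the sketch below splits a 96-bit tag ID into its fields. The field widths (8-bit header, 28-bit general manager number, 24-bit object class, 36-bit serial number) are assumed from the GID-96 layout in the EPC Tag Data Standard and should be checked against the specification. Mapping those fields to the company ID, product type ID, and individual item ID named above is likewise an assumption, as is the function name `parse_gid96`.

```python
def parse_gid96(tag: int) -> dict:
    """Split a 96-bit GID-96 value into header and three ID levels.

    Assumed layout, most significant bits first:
    header (8) | company ID (28) | product type ID (24) | item ID (36)
    """
    if not 0 <= tag < (1 << 96):
        raise ValueError("tag must be a 96-bit unsigned integer")
    return {
        "header": tag >> 88,
        "company_id": (tag >> 60) & ((1 << 28) - 1),
        "product_type_id": (tag >> 36) & ((1 << 24) - 1),
        "item_id": tag & ((1 << 36) - 1),
    }
```

With such a split, a deployment that keeps one DHT per company could select the DHT from `company_id` and use the full tag ID as the key.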

In such an individual-item traceability system, for example, a user (such as a consumer) who wants to obtain and view information about a product (item) sends, from a client device (not shown), an inquiry message specifying the item's tag ID to the distributed hash table 30 (to any of its DHT nodes 3) and obtains a list of URLs or the like from the distributed hash table 30. The client then sends an inquiry message to the server device indicated by an obtained URL to retrieve specific information about the item (for example, its temperature management history).

Here, the distributed hash table 30 may lose part of the index table (tag ID: [URL, ...]) because of a node failure or the like. This can happen in two ways: part of the data held by a particular DHT node 3 may be corrupted, or a particular DHT node 3 may fail (and leave), so that all the data it held is lost.

In the former case, recovery is possible by re-registering the corrupted data in that DHT node 3.

In the latter case, recovery is possible by re-registering the data held by the failed DHT node 3 in another DHT node 3. It is desirable to determine in advance which DHT node 3 takes over when a given DHT node 3 fails. As an example, consider a hash space that, as in Chord, forms a closed number line whose two ends are joined (for example, each DHT node 3 is assigned a hash value, such as the hash of the node's address, and the range of hash values each DHT node 3 is responsible for, that is, its upper and lower limits, is determined from the hash values assigned to the nodes). In this case it can be decided in advance that, when a DHT node 3 fails, the data it held will thereafter also be held by the node closest to it in the direction of increasing hash values (the successor) or by the node closest to it in the direction of decreasing hash values (the predecessor).
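The choice of takeover node on a closed hash ring can be illustrated as below. This is a sketch under assumptions: nodes are represented only by their assigned hash values, and the function names `successor` and `predecessor` are chosen here for illustration.

```python
from bisect import bisect_left, bisect_right


def _surviving_ring(node_ids, failed):
    ring = sorted(n for n in node_ids if n != failed)
    if not ring:
        raise ValueError("no surviving node can take over")
    return ring


def successor(node_ids, failed):
    """Surviving node closest to `failed` in the increasing direction."""
    ring = _surviving_ring(node_ids, failed)
    # Wrap to the first node when `failed` was the largest id on the ring.
    return ring[bisect_right(ring, failed) % len(ring)]


def predecessor(node_ids, failed):
    """Surviving node closest to `failed` in the decreasing direction."""
    ring = _surviving_ring(node_ids, failed)
    # Index -1 wraps to the last node when `failed` was the smallest id.
    return ring[bisect_left(ring, failed) - 1]
```

Because the ring is closed, the node with the largest hash value has the node with the smallest hash value as its successor, and vice versa for the predecessor.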

As the hash function for calculating hash values, a function with a sufficiently long output and a uniform distribution, such as SHA-1 or MD5, can be adopted. The hash space is represented by m-bit integers from 0 to 2^m − 1, with m = 160 for SHA-1 and m = 128 for MD5.
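As a small illustration of mapping a key into this m-bit integer space, the following sketch uses Python's standard `hashlib`. The function name `hash_key` and the `HASH_BITS` table are assumptions made for this example.

```python
import hashlib

# Output width m, in bits, of each supported hash function.
HASH_BITS = {"sha1": 160, "md5": 128}


def hash_key(key: bytes, algorithm: str = "sha1") -> int:
    """Return h(key) as an integer in the range 0 .. 2**m - 1."""
    if algorithm not in HASH_BITS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, key).digest()
    return int.from_bytes(digest, "big")
```

Interpreting the digest as a big-endian integer gives a total order over hash values, which is what the sorting step relies on.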

The hash value associated with each DHT node 3 may be determined, for example, by applying the hash function to an address of the DHT node 3 (such as its IP address), or by some other method.

The data registration device 1 of this embodiment detects an area in which such data has been lost (a failure area) and restores it.

The data registration device 1 of this embodiment is described in detail below.

FIG. 5 shows a configuration example of the data registration device 1 according to this embodiment. As shown in FIG. 5, the data registration device 1 includes an instruction receiving unit 11, a hash sort unit 12, a filter unit 13, a writing unit 14, and an input unit 15.

The instruction receiving unit 11 receives a registration instruction message via the network 8 and stores its contents. The message contains information indicating the failure area in the target distributed hash table 30 (failure area information), the hash algorithm to be used with the distributed hash table 30, and a sort algorithm for sorting so that hash values handled by the same DHT node 3 are contiguous.

The hash algorithm and sort algorithm may be conveyed, for example, by defining in advance a notation for combinations of parameters and writing those parameters in the registration instruction message, or by using the dynamic object loading facility of a platform such as Java (registered trademark).

Alternatively, the hash algorithm and sort algorithm may be configured in each data registration device 1 rather than being included in the registration instruction message.

The failure area information is, for example, a list of the hash values h(key) of the key information (key) of the failed data. The list may take any format: it may enumerate every hash value, enumerate pairs of start and end hash values, or combine both.
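One possible in-memory form of the failure area information, combining individually listed hash values with (start, end) pairs, is sketched below. The class name `FailureAreaInfo`, inclusive end points, and the handling of ranges that wrap past the top of the hash space are assumptions not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureAreaInfo:
    points: frozenset = frozenset()  # individually listed hash values
    ranges: tuple = ()               # (start, end) pairs, both inclusive

    def contains(self, hv: int) -> bool:
        """True if hash value hv lies in the failure area."""
        if hv in self.points:
            return True
        for start, end in self.ranges:
            if start <= end:
                if start <= hv <= end:
                    return True
            elif hv >= start or hv <= end:
                # Assumed: start > end means the range wraps around
                # the end of the closed hash space.
                return True
        return False
```

Listing ranges keeps the message small when a whole node's range is lost, while listing points suits the case where only a few entries on a node are corrupted.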

Various methods can be used to deliver the registration instruction message to the instruction receiving unit 11.

For example, the distributed hash table 30 as a whole may share a table, called the failure area list, that records failed areas (for example, one management node (not shown) separate from the DHT nodes 3 is provided and holds the failure area list). Each DHT node 3 periodically checks whether it has a failure area and, if so, adds an entry (start point of the failure area, end point of the failure area) to the failure area list.

When a DHT node 3 fails and leaves, it cannot add a failure area entry itself. Instead, the DHT node 3 that is to take its place (for example, its successor or predecessor) can, once it detects the departure, add a failure area entry on the assumption that all the data the departed node should have held is lost. Also, mainly to improve fault tolerance, the DHT node 3 that is to take over for a given DHT node 3 (for example, its successor or predecessor) may constantly hold a copy (mirror) of the data that node is responsible for. In that case, after detecting the departure, the takeover node can add a failure area entry for any part of the departed node's data that is missing. Furthermore, as with the time-based backup instruction described later, time information may be added to a failure area entry so that data can be restored on the basis of the mirror's last update time. Such an entry means that, among the data in the failure area, data newer than the recorded time may have been lost.

The data registration device 1 may then periodically send an inquiry message about failure areas to the management node, and the management node that receives it may return the failure area list to that data registration device 1 in a registration instruction message.

Alternatively, the management node may periodically broadcast a registration instruction message containing the failure area list to all data registration devices 1.

Although one management node separate from the DHT nodes 3 is assumed above, a plurality of management nodes may be provided, and one or more DHT nodes 3 may take on the role of management node instead of or in addition to dedicated management nodes. When there are multiple management nodes or the like holding the failure area list, the data registration device 1 need only send its inquiry message to one of them. When the management nodes periodically send registration instruction messages containing the failure area list to the data registration devices 1 and there are multiple such nodes, they may divide among themselves the data registration devices 1 to which they send the messages.

When multiple management nodes or the like are provided, they may also divide among themselves the DHT nodes 3 covered by each failure area list.

To prevent information about already repaired data from remaining in the failure area list indefinitely, each entry in the list preferably expires after a time t. However, each data registration device 1 must restore the data for any tag ID it holds data for when that data is lost. To avoid an entry expiring before restoration takes place, the device preferably queries the management node or the like about failure areas at intervals shorter than t, so that it reliably obtains the failure area list and re-registers the failed data. Likewise, when the management node or the like periodically sends registration instruction messages containing the failure area list to the data registration devices 1, it preferably sends them at intervals shorter than t.
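The relationship between the expiry time t and the inquiry interval can be sketched as follows. The class `FailureAreaList`, the injectable `clock`, and the `safety_factor` used to derive an interval shorter than t are illustrative assumptions; the text only requires that the interval be shorter than t.

```python
import time


class FailureAreaList:
    """Failure area entries that expire ttl_seconds after being added."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = []  # (start, end, added_at)

    def add(self, start, end):
        self._entries.append((start, end, self.clock()))

    def current(self):
        """Drop expired entries and return the remaining (start, end) pairs."""
        now = self.clock()
        self._entries = [e for e in self._entries if now - e[2] < self.ttl]
        return [(start, end) for start, end, _ in self._entries]


def polling_interval(ttl_seconds, safety_factor=0.5):
    """Inquiry interval strictly shorter than the entry lifetime t."""
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    return ttl_seconds * safety_factor
```

With a factor of 0.5, every entry is visible to at least one inquiry from each data registration device before it expires, provided inquiries are not lost.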

The hash sort unit 12 sorts the data from the primary data source along the hash value space of the target distributed hash table 30, so that writing can proceed efficiently with the continuity of that hash value space taken into account.

The hash sort unit 12 reads, via the input unit 15, all entries (key, value) held by the corresponding primary data table 2 as a series of (key, value) pairs in arbitrary order. For each entry in the series, it computes the hash value h(key) of the key information according to the specified hash algorithm to obtain (h(key), key, value). It then sorts all the (h(key), key, value) tuples according to the specified sort algorithm to produce a sorted sequence of (h(key), key, value).
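The hash-and-sort step can be sketched in a few lines. The function name `hash_sort`, the string keys encoded as UTF-8, and the `descending` flag are assumptions for illustration; the hash algorithm is selected by name through the standard `hashlib.new`.

```python
import hashlib


def hash_sort(entries, algorithm="sha1", descending=False):
    """Turn (key, value) pairs into (h(key), key, value) sorted by h(key)."""
    hashed = []
    for key, value in entries:
        digest = hashlib.new(algorithm, key.encode("utf-8")).digest()
        hashed.append((int.from_bytes(digest, "big"), key, value))
    # Sort on the hash only, so entries for one node's range are contiguous.
    hashed.sort(key=lambda item: item[0], reverse=descending)
    return hashed
```

For example, `hash_sort([("tag-b", "url2"), ("tag-a", "url1")])` returns both entries ordered by the SHA-1 value of their keys rather than by the keys themselves.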

For example, when each DHT node 3 stores the data whose key information hashes into the contiguous range of hash values assigned to that node from the hash value space of the distributed hash table 30, the sort algorithm orders the entries so that writing follows the order of that hash value space.

As a concrete example, for a distributed hash table 30 based on Chord, the hash algorithm can be MD5 or SHA-1, and the sort algorithm can be ascending order, descending order, random-start ascending order, or random-start descending order.

A random-start sort algorithm sorts so that a randomly chosen hash value comes first. This randomizes the DHT node 3 at which writing begins, so that data writes do not concentrate on a particular DHT node 3. In a distributed hash table algorithm such as Chord, the hash value space is a closed number line on which only the relative order of values is defined. Wherever on the line writing starts, one full circuit of the hash value space therefore writes every value without omission.
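Random-start ascending order can be obtained by rotating an already ascending sequence, as in the sketch below. The function name, the `space_bits` parameter, and the injectable `rng` are assumptions; the input is expected to be (h(key), key, value) tuples sorted by ascending hash value.

```python
import random
from bisect import bisect_left


def random_start_ascending(sorted_hashed, space_bits=160, rng=random):
    """Rotate an ascending sequence so a random point on the ring is first."""
    if not sorted_hashed:
        return []
    start = rng.randrange(1 << space_bits)  # random point in the hash space
    hashes = [item[0] for item in sorted_hashed]
    i = bisect_left(hashes, start)          # first entry at or after start
    # One full circuit of the ring: from start to the top, then wrap around.
    return sorted_hashed[i:] + sorted_hashed[:i]
```

The result contains exactly the same entries as the input, so nothing is skipped; only the position at which writing begins changes.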

The filter unit 13 filters the sorted sequence of (h(key), key, value) passed from the hash sort unit 12 toward the writing unit 14, based on the specified failure area information. Because the failure area information describes the hash values of the data to be re-registered, the filter unit removes from the sorted sequence everything other than the data to be re-registered (that is, it deletes each (h(key), key, value) whose h(key) is not described in the failure area information).

The filter unit 13 may be placed between the hash sort unit 12 and the writing unit 14.
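The filtering step can be sketched as an order-preserving selection. Here the failure area information is assumed to be a list of inclusive (start, end) hash ranges, and `filter_failed` is a name chosen for this example.

```python
def filter_failed(sorted_hashed, failed_ranges):
    """Keep only entries whose hash lies in a failure area.

    sorted_hashed: (h(key), key, value) tuples, already sorted.
    failed_ranges: (start, end) hash ranges, both ends inclusive (assumed).
    """
    def in_failure_area(hv):
        return any(start <= hv <= end for start, end in failed_ranges)

    # A list comprehension preserves the input order, so the output
    # stays sorted and can be handed directly to the writing step.
    return [item for item in sorted_hashed if in_failure_area(item[0])]
```

Because the selection never reorders entries, filtering after sorting leaves entries for the same DHT node contiguous.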

The writing unit 14 writes data in chunks (in bulk), taking into account the continuity of the hash value space of the target distributed hash table 30 and the range each DHT node 3 is responsible for.

Via the network 8, the writing unit 14 writes each registration information item (key, value) of the sorted and filtered (h(key), key, value) sequence to the distributed hash table 30 (to the appropriate DHT node 3), in order from the head of the sequence (or from a random position).

Starting from a random position prevents data writes from concentrating on a particular DHT node 3. In addition, to avoid congestion around slow DHT nodes, a limit may be placed on the write rate (for example, on the interval between changes of destination node), such as by inserting a random wait.
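Bulk writing per node with an optional random wait can be sketched as follows. Several things here are assumptions for illustration: responsibility is decided Chord-style (the first node ID at or after the hash value, wrapping around), `send` is a caller-supplied hypothetical function that delivers one batch to one node, and the wait is drawn uniformly up to `max_wait` seconds.

```python
import random
import time
from bisect import bisect_left
from itertools import groupby


def responsible_node(ring, hv):
    """First node ID >= hv on the sorted ring, wrapping past the top."""
    return ring[bisect_left(ring, hv) % len(ring)]


def bulk_write(sorted_hashed, node_ids, send, max_wait=0.0,
               sleep=time.sleep, rng=random):
    """Send contiguous runs of entries to their node, one batch per run."""
    ring = sorted(node_ids)
    batches = 0
    runs = groupby(sorted_hashed, key=lambda t: responsible_node(ring, t[0]))
    for node, run in runs:
        if batches and max_wait > 0:
            # Random wait before switching to the next destination node.
            sleep(rng.uniform(0, max_wait))
        send(node, [(key, value) for _, key, value in run])
        batches += 1
    return batches
```

Since the input is sorted by hash value, `groupby` yields one run per stretch of the ring handled by a single node, so each node normally receives a single batch. A node whose range spans the wrap-around point can receive two batches when the sequence does not start at a range boundary.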

図６に、データ登録装置１の処理手順の一例を示す。 FIG. 6 shows an example of the processing procedure of the data registration device 1.

The data registration device 1 extracts the failure area information (for example, a failure area list) for the target distributed hash table 30, the hash algorithm, and the sort algorithm from the registration instruction messages received during a certain period, and passes them to the instruction receiving unit 11, thereby triggering the restoration operation (step S1).

なお、ステップＳ２で新たな故障領域情報が検出されなかった場合（一定期間の間に登録指示メッセージが受信されなかった場合を含む）は、以降の処理をスキップしてステップＳ１に戻る。 If new failure area information is not detected in step S2 (including the case where the registration instruction message is not received for a certain period), the subsequent processing is skipped and the process returns to step S1.

If new failure area information is detected in step S2 (for example, failure area information has been newly added to the failure area list), the area (start point of the failure area, end point of the failure area) is set in the filter unit 13, the above hash algorithm and sort algorithm are set in the hash sort unit 12, and data restoration then starts.

First, the hash sort unit 12 receives, from the primary data table 2 that is the primary data source, a sequence (in arbitrary order) of (key, value) data to be written to the target distributed hash table 30, calculates h(key) using the set hash algorithm h (step S3), converts the sequence of (key, value) into a sequence of (h(key), key, value), and then sorts it according to the set sort algorithm (step S4). The sequence is sorted so that the hash values handled by the same DHT node 3 are contiguous (for example, in the case of Chord, so that h(key) is contiguous in the hash value space), which allows the writing unit 14 to write it efficiently.
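Steps S3 and S4 can be illustrated with the short sketch below. It assumes SHA-1 as the hash algorithm h (as in the Chord example given in this description), string keys encoded as UTF-8, and a plain ascending sort on h(key) as the sort algorithm. The function names are hypothetical.

```python
import hashlib
from typing import Iterable, List, Tuple


def h(key: str) -> int:
    """Hash a key into the 160-bit SHA-1 space, returned as an integer."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def hash_and_sort(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[int, str, str]]:
    """Step S3: attach h(key) to each pair. Step S4: sort by h(key)."""
    hashed = [(h(key), key, value) for key, value in pairs]
    hashed.sort(key=lambda entry: entry[0])
    return hashed


for hkey, key, value in hash_and_sort([("tag-1", "a"), ("tag-2", "b"), ("tag-3", "c")]):
    print(f"{hkey:040x} {key} {value}")
```

Sorting by the integer value of the digest orders entries along the one-dimensional hash space, so entries belonging to one node's range end up adjacent.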

Next, the writing unit 14 queries the filter unit 13 for the data to be written. The filter unit 13 filters based on the set failure area information (that is, it deletes from the sorted sequence of (h(key), key, value) every entry whose h(key) is not included in the failure area information) and returns the sequence of (h(key), key, value) to be written to the writing unit 14 (step S5). The writing unit 14 then writes each (h(key), key, value) in the filtered and sorted sequence (step S6). Note that the filter unit 13 may instead return (key, value) pairs sorted by h(key) to the writing unit 14.

図７に、ステップＳ６の書き込みの処理手順の一例を示す。 FIG. 7 shows an example of the write processing procedure in step S6.

まず、対象となる分散ハッシュテーブル３０のうちから、適当なＤＨＴノード３を一つ選択して設定する（ステップＳ１１）。 First, one appropriate DHT node 3 is selected and set from the target distributed hash table 30 (step S11).

Next, from the filtered and sorted sequence of (h(key), key, value) (or (key, value)) data, the data (key, value) that is earliest in the sort order among the data not yet written is selected, and a write of that data is attempted by sending a write message containing (key, value) to the currently set DHT node 3 (step S12).

Here, a DHT node 3 that receives from the data registration device 1 a write message containing the (key, value) to be written registers that (key, value) in itself if the hash value h(key) calculated from the key is one it is responsible for. If h(key) is not one it is responsible for, that DHT node 3, another DHT node 3, or the like resolves a DHT node 3 that is closer to the DHT node 3 responsible for the hash value h(key) (possibly the responsible DHT node 3 itself), and a redirect message indicating that closer DHT node 3 is returned to the data registration device 1.

For example, each DHT node 3 may store the DHT node located immediately after the range of hash values it is responsible for (the successor DHT node), and a DHT node 3 that receives a write message containing a (key, value) whose hash value h(key) is outside its own range may return a redirect message containing the information of this successor DHT node. In this case, if the successor DHT node is the correct one, the next write succeeds; if not, redirect messages are returned one after another until a write succeeds. Alternatively, the DHT node 3 that received such a write message may resolve the correct DHT node responsible for the hash value, for example by sending an inquiry message to other DHT nodes 3, after which the DHT node 3 that received the write message, the DHT node 3 that received the inquiry message, or the like returns a redirect message containing information indicating the correct DHT node. Other methods may also be used.

If a redirection occurs (step S13), the DHT node 3 returned by the above DHT node 3 as the one to which the data should be written is set as the new write destination (step S11), and the write of the data is attempted again (step S12).

リダイレクションが発生せず、書き込みに成功すれば（ステップＳ１３）、ソート順で次のデータの書き込みを試行する（ステップＳ１２）。 If redirection does not occur and writing is successful (step S13), an attempt is made to write the next data in the sort order (step S12).

Because of the above sort, when there are multiple pieces of data to be written to the same DHT node 3, they are placed consecutively in the order, so the writes in step S12 succeed one after another until the DHT node 3 to be written changes.

以上の処理は、ステップＳ１４で次のデータが有る間、繰り返され、すべてのデータが書き込まれると、処理が終了となる。 The above process is repeated while there is the next data in step S14, and the process ends when all the data is written.
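The loop of FIG. 7 (steps S11 to S14) can be sketched with an in-memory ring as follows. This is only an illustration under stated assumptions: each node is responsible for the hash values in (predecessor ID, own ID], a node that is not responsible always redirects to its successor, and redirects are modelled as return values rather than network messages. `Node`, `build_ring` and `write_all` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """Stand-in for a DHT node 3 responsible for hash values in (prev_id, node_id]."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self.prev_id = node_id            # overwritten by build_ring
        self.successor: "Node" = self
        self.store: Dict[int, str] = {}

    def owns(self, hkey: int) -> bool:
        if self.prev_id < self.node_id:
            return self.prev_id < hkey <= self.node_id
        return hkey > self.prev_id or hkey <= self.node_id   # range wraps around the ring

    def put(self, hkey: int, value: str) -> Optional["Node"]:
        """Store the value and return None, or return the successor as a redirect."""
        if self.owns(hkey):
            self.store[hkey] = value
            return None
        return self.successor


def build_ring(ids: List[int]) -> List[Node]:
    nodes = [Node(i) for i in sorted(ids)]
    for i, node in enumerate(nodes):
        node.prev_id = nodes[i - 1].node_id
        node.successor = nodes[(i + 1) % len(nodes)]
    return nodes


def write_all(entries: List[Tuple[int, str]], start: Node) -> int:
    """Write every (h(key), value) in order and return the number of redirects seen."""
    node = start                              # step S11: set an initial node
    redirects = 0
    for hkey, value in entries:               # step S14: repeat while data remains
        while True:
            target = node.put(hkey, value)    # step S12: attempt the write
            if target is None:                # step S13: no redirect, write succeeded
                break
            node = target                     # step S13 -> S11: follow the redirect
            redirects += 1
    return redirects


entries = [(250, "e"), (10, "a"), (160, "d"), (50, "b"), (150, "c")]
unsorted_redirects = write_all(entries, build_ring([100, 200, 300])[0])
sorted_redirects = write_all(sorted(entries), build_ring([100, 200, 300])[0])
print(sorted_redirects, unsorted_redirects)   # 2 7
```

In this toy ring the sorted sequence incurs one redirect per change of responsible node, while the unsorted sequence is redirected repeatedly, which is the effect the sort is intended to achieve.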

なお、図６及び図７の例では、ステップＳ５で全データについてフィルタリングを完了した後に、ステップＳ６で書き込みを行うようにしたが、図７のステップＳ１２において、データを一つ選択した際に、このデータにフィルタリングをかけて（すなわち、故障領域情報に基づいて書き込むべきデータか否か調べて）、書き込むべきデータである場合に、書き込むものとし、書き込むべきデータでない場合には、ステップＳ１４に遷移することを、繰り返し行うようにしてもよい。 In the example of FIGS. 6 and 7, after all the data is filtered in step S5, writing is performed in step S6. However, when one data is selected in step S12 of FIG. This data is filtered (that is, whether or not the data is to be written based on the failure area information). If the data is to be written, the data is written. If the data is not to be written, the process proceeds to step S14. This may be repeated.

以下、より具体的な書き込みの手順例について説明する。 Hereinafter, a more specific example of the writing procedure will be described.

例えばＣｈｏｒｄの場合は、ＳＨＡ１により定義される閉じた１６０ビット１次元のハッシュ値空間となり、１次元空間上の一定の領域を一定のＤＨＴノードが管理する。もし、ソートしないデータ列の書き込みを行った場合は、Ｃｈｏｒｄアルゴリズムに基づくＤＨＴノード探索が書き込み毎に実行され、効率が悪くなるが、本実施形態では、ハッシュ・ソート部１２により、書き込み先の分散ハッシュテーブル３０のハッシュ値空間の連続性に沿った順序のデータ列で書き込みを行うことによって、特定のＤＨＴノードが担当する領域への書き込みをまとめられるようになり、探索のコストを削減できる。同様に、当該範囲への書き込みが終了し、次のデータエントリを書き込む先を探索する場合も、ハッシュ値空間の連続性に基づき書き込み先を移動するだけでよい。 For example, in the case of Chord, it becomes a closed 160-bit one-dimensional hash value space defined by SHA1, and a certain area on the one-dimensional space is managed by a certain DHT node. If a data string that is not sorted is written, a DHT node search based on the Chord algorithm is executed for each write and the efficiency deteriorates. In this embodiment, the hash sort unit 12 distributes the write destination. By writing data in a sequence of data in accordance with the continuity of the hash value space of the hash table 30, it becomes possible to collect writes to the area handled by a specific DHT node, thereby reducing the search cost. Similarly, when writing to the range is completed and a destination for writing the next data entry is searched, it is only necessary to move the writing destination based on the continuity of the hash value space.

In the above description, each DHT node 3 constituting the distributed hash table 30 holds, for key information whose hash value falls within the range it is responsible for, the key information (key) in association with data (data: one or more values), and, in response to an inquiry message from a client device specifying key information, returns a response message containing the data stored for that key information. Instead, each DHT node 3 may hold the hash value of the key information (h(key)) in association with the data (data: one or more values) and, in response to an inquiry message from a client device specifying the hash value of key information, return a response message containing the data stored for that hash value. In this case, in each embodiment, once the data registration device 1 has calculated the hash value h(key) from the key information (key), it only needs to handle (h(key), value) (the key information (key) need not be handled). Likewise, the write message sent to the DHT node 3 carries (h(key), value) instead of (key, value). The same applies to the second and third embodiments.

FIG. 8 shows an example of a writing algorithm in pseudocode. This example algorithm integrates the filtering of step S5 and the writing of step S6 in FIG. 6. In FIG. 8, a statement beginning with // is a comment.

In this example algorithm, TupleStream is an iterator over Tuple, and a Tuple is a combination of hkey and value. hkey corresponds to h(key), the hash value of the key of a (key, value) on the primary data side. sorted_source is the sorted sequence of (h(key), value) before filtering.

ＴｕｐｌｅＦｉｌｔｅｒは、（ｈ（ｋｅｙ），ｖａｌｕｅ）を対象とする、故障領域情報に基づくフィルタである。 The TupleFilter is a filter based on failure area information for (h (key), value).

DHTNode is a class representing a DHT node 3 of the target distributed hash table 30. nptr is a pointer to a DHTNode instance that denotes one of the DHT nodes 3 and contains a URL, an IP address, or the like.

nptr.put(t.hkey, t.value) writes one (h(key), value) to the DHT node 3 indicated by nptr. The case of sending h(key) is described here; when the key is sent instead, t.hkey is replaced with t.key.

For a write request outside the range it is responsible for, a DHT node 3 returns, together with a RedirectException, a reference to the DHTNode considered more likely to be correct. The catch clause receives this. nptr = re.redirected_to sets nptr to a DHTNode instance generated from the URL, IP address, or the like of the returned DHT node 3.
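Since the pseudocode of FIG. 8 is not reproduced in this text, the following is only a hedged reconstruction in Python based on the names given above (Tuple, sorted_source, TupleFilter, DHTNode, nptr, put, RedirectException, redirected_to). The in-memory `DHTNode` with a half-open range [low, high), its `next` attribute, and the `bulk_write` function are assumptions made for illustration; a real DHTNode would wrap a URL or IP address and send messages over the network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Tuple:
    hkey: int      # h(key)
    value: str


class RedirectException(Exception):
    """Raised by a node for an hkey outside its range; carries a better candidate node."""

    def __init__(self, redirected_to: "DHTNode") -> None:
        super().__init__("hkey is outside the range of this node")
        self.redirected_to = redirected_to


class DHTNode:
    """In-memory stand-in for one DHT node 3 owning hash values in [low, high)."""

    def __init__(self, low: int, high: int) -> None:
        self.low, self.high = low, high
        self.next: Optional["DHTNode"] = None   # node suggested on redirect
        self.store: Dict[int, str] = {}

    def put(self, hkey: int, value: str) -> None:
        if not (self.low <= hkey < self.high):
            raise RedirectException(self.next)
        self.store[hkey] = value


def bulk_write(sorted_source: Iterable[Tuple],
               tuple_filter: Callable[[Tuple], bool],
               nptr: DHTNode) -> DHTNode:
    """Filter and write in one pass; returns the node that was written to last."""
    for t in sorted_source:
        if not tuple_filter(t):          # filtering folded into the write loop
            continue
        while True:
            try:
                nptr.put(t.hkey, t.value)
                break
            except RedirectException as re:
                nptr = re.redirected_to  # retry against the suggested node
    return nptr


a, b = DHTNode(0, 100), DHTNode(100, 200)
a.next, b.next = b, a
source = [Tuple(10, "x"), Tuple(120, "y"), Tuple(150, "z")]
last = bulk_write(source, lambda t: 100 <= t.hkey < 200, a)
print(a.store, b.store)
```

Here the filter passes only tuples in the assumed failure area [100, 200), so nothing is written to node `a` and the single redirect moves the writer to node `b` for the remaining tuples.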

Here, the method of discovering and advertising the failure area information (failure area list) of the target distributed hash table 30 may differ depending on the DHT algorithm. For example, in the case of Chord, the departure or failure of a node can be discovered from a change in the value of the predecessor (which indicates the preceding node) in the DHT hash value space. If the DHT node that detects this does not hold a mirror of the area that the departed node owned, it appends the area information of the departed DHT node, together with the time, to its own failure area list and advertises the updated failure area list. As one example, the failure area list may be advertised to the successor (which indicates the following node) and to each DHT node that is the destination of a finger, a shortcut path in Chord. A DHT node that receives an advertised failure area list compares it with its own failure area list and, only if it contains new data, merges the new data into its own list and advertises in the same way. To support this comparison, an ID based on a pseudo-random number, the time, or the like may be attached to each generated failure area list.
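The merge-and-advertise behaviour can be sketched as below. The sketch assumes that a failure area record is a (start, end, detection time) tuple, that the list is kept as a set, and that advertising is a direct method call on neighbouring nodes (successor and finger destinations). `GossipNode` and its methods are hypothetical names.

```python
from typing import Iterable, List, Set, Tuple

FailureRecord = Tuple[int, int, float]   # assumed form: (area start, area end, detection time)


class GossipNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.neighbours: List["GossipNode"] = []   # successor and finger destinations
        self.failure_list: Set[FailureRecord] = set()
        self.received = 0                          # number of advertisements received

    def advertise(self) -> None:
        for neighbour in self.neighbours:
            neighbour.receive(self.failure_list)

    def receive(self, advertised: Iterable[FailureRecord]) -> None:
        self.received += 1
        new_records = set(advertised) - self.failure_list
        if not new_records:
            return                      # nothing new: do not advertise again
        self.failure_list |= new_records
        self.advertise()                # pass the merged list on in the same way

    def report_failure(self, start: int, end: int, now: float) -> None:
        """Called when a change of predecessor reveals a departed node's area."""
        self.failure_list.add((start, end, now))
        self.advertise()


n1, n2, n3 = GossipNode("n1"), GossipNode("n2"), GossipNode("n3")
n1.neighbours, n2.neighbours, n3.neighbours = [n2], [n3], [n1]
n1.report_failure(100, 200, 1.0)
print(n3.failure_list)
```

Advertising only when new data was merged is what stops the propagation: in this three-node cycle the record reaches every node and the final advertisement back to `n1` is dropped.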

Combining the discovery and advertisement of the failure area list with the data registration device makes it possible, in a configuration consisting of the primary data table group 20 and a DHT used as an index, to recover quickly and at low cost from data loss caused by DHT failures and the like.

本実施形態によれば、分散ハッシュテーブル上のデータ欠損に対する一次データ源からの復元を効率的に行うことができる。 According to the present embodiment, it is possible to efficiently perform restoration from the primary data source for data loss on the distributed hash table.

(Second Embodiment)
In the following, the second embodiment will be described focusing on the differences from the first embodiment.

図９に、本実施形態に係る分散ハッシュテーブルシステムの構成例を示す。 FIG. 9 shows a configuration example of the distributed hash table system according to the present embodiment.

本実施形態は、第１の実施形態の一次データテーブル群２０を、第１の分散ハッシュテーブル４０とし（一次テーブル２をＤＨＴノード４とし）、第１の実施形態の分散ハッシュテーブル３０を、第２の分散ハッシュテーブル３０として、第２の分散ハッシュテーブル３０を、第１の分散ハッシュテーブル４０のバックアップとして使用するようにしたものである。 In the present embodiment, the primary data table group 20 of the first embodiment is a first distributed hash table 40 (the primary table 2 is a DHT node 4), and the distributed hash table 30 of the first embodiment is As the second distributed hash table 30, the second distributed hash table 30 is used as a backup of the first distributed hash table 40.

なお、バックアップを複数系統備えてもよく（第２の分散ハッシュテーブル３０を複数系統備えてもよく）、この場合には、バックアップは、各系統毎に独立して行えばよい。 Note that a plurality of backups may be provided (a plurality of second distributed hash tables 30 may be provided). In this case, the backup may be performed independently for each system.

FIG. 10 shows a configuration example of the data registration device 1 according to the present embodiment. This configuration example is basically the same as that shown in FIG. 5.

In this embodiment, a case is described in which the data registration device 1 backs up the first distributed hash table 40 to the second distributed hash table 30 by performing differential backups continually (for example, every time a certain period elapses). Such a differential backup is effective when performed, for example, during periods when the load on the distributed hash tables or the network is low.

なお、例えば第１の分散ハッシュテーブル４０及び第２の分散ハッシュテーブル３０に対して設けられた制御装置７が、各々のデータ登録装置１（の指示受信部１１）に対して同時或いは順次にバックアップ指示を出す（個々のデータ登録装置１においては適当な間隔でバックアップ指示を受信する）ようにしてもよいし、制御装置７を設けずに、個々のデータ登録装置１の内部において、自発的に且つ継続的に（例えば、一定期間経過毎に）、その指示受信部１１にバックアップ指示を出すようにしてもよい。 For example, the control device 7 provided for the first distributed hash table 40 and the second distributed hash table 30 backs up each data registration device 1 (the instruction receiving unit 11) simultaneously or sequentially. An instruction may be issued (individual data registration apparatus 1 receives a backup instruction at an appropriate interval), or the control apparatus 7 is not provided, and the individual data registration apparatus 1 voluntarily In addition, a backup instruction may be issued to the instruction receiving unit 11 continuously (for example, every elapse of a certain period of time).

また、第２の分散ハッシュテーブル３０のためのハッシュアルゴリズム及びソートアルゴリズムについても基本的には第１の実施形態と同様であり、例えば、制御装置７が上記ハッシュアルゴリズム及びソートアルゴリズムをバックアップ指示メッセージに含めて各データ登録装置１に与えてもよいし、バックアップ指示メッセージに含めず、個々のデータ登録装置１においてそれぞれ設定するようにしてもよい。 The hash algorithm and sort algorithm for the second distributed hash table 30 are basically the same as those in the first embodiment. For example, the control device 7 uses the hash algorithm and sort algorithm as a backup instruction message. It may be given to each data registration device 1 or may be set in each data registration device 1 without being included in the backup instruction message.

In the present embodiment, one conceivable method is for the control device 7 (or the data registration device 1 itself) to indicate the “previous backup instruction time” to the instruction receiving unit 11 of the data registration device 1, so that only the (key, value) data entries updated at or after that time are updated. In this case, it is desirable to provide each DHT node 4, which is a primary data source, with a function for recording the registration time of each (key, value) data entry and a function, such as NTP, for keeping the clocks of the individual DHT nodes 4 synchronized. For example, the registration time reg_t may be added to each data entry (key, value) to give (key, value, reg_t), and the filter unit 13 may filter by referring to reg_t so that only the data entries registered at or after the “previous backup instruction time” (that is, entries whose registration time reg_t is at or after that time) are registered.
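The time-based filter can be sketched in a few lines. The sketch assumes that reg_t and the previous backup instruction time are comparable numeric timestamps from synchronized clocks; the names `Entry` and `updated_since` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    key: str
    value: str
    reg_t: float   # registration time recorded by the DHT node 4


def updated_since(entries: Iterable[Entry], last_backup_t: float) -> List[Entry]:
    """Keep only entries registered at or after the previous backup instruction time."""
    return [e for e in entries if e.reg_t >= last_backup_t]


entries = [Entry("k1", "v1", 100.0), Entry("k2", "v2", 205.0), Entry("k3", "v3", 200.0)]
print(updated_since(entries, 200.0))
```

The comparison only makes sense if all nodes agree on the time, which is why the description recommends clock synchronization by NTP or a similar function.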

When only one backup system is provided, unbacked-up data alone can be updated selectively with a simpler management technique such as a dirty bit. For example, the instruction receiving unit 11 sets the filter unit 13 so that only data entries whose dirty bit is set are registered, and each DHT node 4, which is a primary data source, clears the dirty bit of each data entry once its backup has completed.
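The dirty-bit variant can be sketched as follows, assuming that each entry carries a boolean flag that is set on registration or update and cleared after a successful write to the backup. `Record`, `backup_dirty` and the `write` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    key: str
    value: str
    dirty: bool = True   # set when the entry is registered or updated


def backup_dirty(records: List[Record], write: Callable[[str, str], None]) -> int:
    """Write only dirty entries to the backup, clearing the bit after each write."""
    written = 0
    for record in records:
        if record.dirty:
            write(record.key, record.value)
            record.dirty = False
            written += 1
    return written


backup: Dict[str, str] = {}
records = [Record("k1", "v1"), Record("k2", "v2", dirty=False)]
first = backup_dirty(records, backup.__setitem__)
second = backup_dirty(records, backup.__setitem__)
print(first, second, backup)   # 1 0 {'k1': 'v1'}
```

A single bit per entry can track only one backup destination, which matches the restriction to a single backup system stated above.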

図１１に、本実施形態のデータ登録装置１の処理手順の一例を示す。 FIG. 11 shows an example of a processing procedure of the data registration device 1 of the present embodiment.

ステップＳ２１のステップＳ２（図６）との相違は、ステップＳ２１で、指示受信部１１が（例えば「前回のバックアップ指示時刻」を含む）バックアップ指示を受信したことによって、処理が開始する点である。 The difference between step S21 and step S2 (FIG. 6) is that the process starts when the instruction receiving unit 11 receives a backup instruction (including, for example, “previous backup instruction time”) in step S21. .

Steps S22 and S23 (hash calculation and sorting) are the same as steps S3 and S4 in FIG. 6.

ステップＳ２４（フィルタリング）のステップＳ５（図６）との相違は、フィルタリングの内容が、前回のバックアップ指示時刻あるいはdirty bitなどに基づく差分バックアップを行うようにフィルタリングするものである点である。 The difference between step S24 (filtering) and step S5 (FIG. 6) is that filtering is performed so that differential backup based on the previous backup instruction time or dirty bit is performed.

Step S25 (writing) is the same as step S6 in FIG. 6 and the procedure of FIG. 7.

また、第１の実施形態で示した図８の手順例も、フィルタリングの内容が異なる以外は、本実施形態でも同様に使用可能である。 Further, the procedure example of FIG. 8 shown in the first embodiment can also be used in the present embodiment except that the contents of filtering are different.

なお、本実施形態においても、分散ハッシュテーブル４０側に設けたデータ登録装置１が、第１の実施形態と同様に分散ハッシュテーブル３０における故障領域の復元をも行うようにしてもよい。 Also in this embodiment, the data registration apparatus 1 provided on the distributed hash table 40 side may also restore the failure area in the distributed hash table 30 as in the first embodiment.

また、分散ハッシュテーブル３０側にもデータ登録装置１を設け、第１の実施形態と同様に分散ハッシュテーブル４０における故障領域の復元を行うようにしてもよい。 Further, the data registration device 1 may be provided on the distributed hash table 30 side, and the failure area in the distributed hash table 40 may be restored as in the first embodiment.

本実施形態によれば、分散ハッシュテーブル間の差分バックアップを効率的に行うことができる。 According to this embodiment, differential backup between distributed hash tables can be performed efficiently.

(Third Embodiment)
In the following, the third embodiment will be described focusing on the differences from the first and second embodiments.

A configuration example of the distributed hash table system according to the present embodiment is the same as that shown in FIG. 9.

A configuration example of the data registration device 1 according to the present embodiment is the same as that shown in FIG. 10.

In the second embodiment, a differential backup was performed from the first distributed hash table 40 to the second distributed hash table 30. In the present embodiment, a copy (replication) is made from the first distributed hash table 40 to the second distributed hash table 30. The copy may be made for any purpose; for example, it may be a backup or a replacement of the distributed hash table system.

なお、例えば第１の分散ハッシュテーブル４０及び又は第２の分散ハッシュテーブル３０に対して設けられた制御装置７が、第１の分散ハッシュテーブル４０における各々のデータ登録装置１（の指示受信部１１）に対して同時或いは順次に、第２の分散ハッシュテーブル３０へのコピー指示に、第２の分散ハッシュテーブル３０へのエントリポイント（例えばＩＰアドレスとポート番号）を付加して送信する。なお、第２の分散ハッシュテーブル３０のためのハッシュアルゴリズム及びソートアルゴリズムについても基本的には第１、第２の実施形態と同様である。 Note that, for example, the control device 7 provided for the first distributed hash table 40 and / or the second distributed hash table 30 has the data receiving device 1 (the instruction receiving unit 11 thereof) in the first distributed hash table 40. At the same time or sequentially, an entry point (for example, an IP address and a port number) to the second distributed hash table 30 is added to the copy instruction to the second distributed hash table 30 and transmitted. The hash algorithm and sort algorithm for the second distributed hash table 30 are basically the same as those in the first and second embodiments.

第１の分散ハッシュテーブル４０における各データ登録装置１では、ハッシュ・ソート部１２にソートアルゴリズムをセットする点は第１、第２の実施形態と同様であるが、ここでは、対応するＤＨＴノード４が持っている全データをバックアップするため、フィルタ部１３には、空データ（ｎｕｌｌ）を設定することによって、書きこみ領域の制限は行わないようにする。 Each data registration device 1 in the first distributed hash table 40 is similar to the first and second embodiments in that the sort algorithm is set in the hash / sort unit 12, but here the corresponding DHT node 4 In order to back up all data stored in the file, the write area is not limited by setting empty data (null) in the filter unit 13.

ハッシュ・ソート部１２は、データ（ｋｅｙ，ｖａｌｕｅ）の列を入力として、指定された、第２の分散ハッシュテーブル３０のハッシュアルゴリズムとソートアルゴリズムを利用し、ソートされたデータ列（ｈ（ｋｅｙ），ｋｅｙ，ｖａｌｕｅ）を書込部１４へ渡す。 The hash sort unit 12 receives a column of data (key, value) as an input and uses the specified hash algorithm and sort algorithm of the second distributed hash table 30 to sort the data sequence (h (key) , Key, value) to the writing unit 14.

The writing unit 14 starts writing the given data from the head (or from a random position). The writing unit 14 writes the data in bulk to a certain DHT node 3 of the second distributed hash table 30. When the data to be written falls outside the area that node is responsible for, the write destination is changed, by means such as an error notification, to an appropriate adjacent node in the space defined by the second distributed hash table 30. Starting from a random position prevents the data registration devices 1 in the first distributed hash table 40 from all writing to one specific DHT node 3 of the second distributed hash table 30. To avoid causing congestion near a slow DHT node, a certain restriction may be imposed on the writing speed (such as the time interval between changes of the write destination node), for example by inserting a random waiting time.

図１２に、本実施形態のデータ登録装置１の処理手順の一例を示す。 FIG. 12 shows an example of the processing procedure of the data registration device 1 of this embodiment.

The difference between step S31 and step S2 (FIG. 6) is that, in step S31, processing starts when the instruction receiving unit 11 receives a copy instruction (including, for example, an IP address and a port number).

Steps S32 and S33 (hash calculation and sorting) are the same as steps S3 and S4 in FIG. 6.

ステップＳ３４（フィルタリング）のステップＳ５（図６）との相違は、フィルタリングの内容が、全データをコピー対象とするようにフィルタリングするものである点である。なお、本実施形態においては、フィルタ部１３及びステップＳ３４を省くことも可能である。 The difference between step S34 (filtering) and step S5 (FIG. 6) is that the contents of filtering are filtered so that all data is a copy target. In the present embodiment, the filter unit 13 and step S34 can be omitted.

Step S35 (writing) is the same as step S6 in FIG. 6 and the procedure of FIG. 7.

なお、本実施形態においても、分散ハッシュテーブル４０側に設けたデータ登録装置１が、第１の実施形態と同様に分散ハッシュテーブル３０における故障領域の復元をも行うようにしてもよいし、第２の実施形態と同様に差分バックアップをも行うようにしてもよい。 In this embodiment, the data registration device 1 provided on the distributed hash table 40 side may also restore the failure area in the distributed hash table 30 as in the first embodiment. Similar to the second embodiment, differential backup may be performed.

本実施形態によれば、第１の分散ハッシュテーブルから第２の分散ハッシュテーブルへデータ移行を効率的に行うことができる。 According to the present embodiment, data migration can be efficiently performed from the first distributed hash table to the second distributed hash table.

Each of the above functions can also be realized by describing it as software and having a computer with an appropriate mechanism process it.
The present embodiment can also be implemented as a program for causing a computer to execute a predetermined procedure, for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions. It can additionally be implemented as a computer-readable recording medium on which the program is recorded.

なお、本発明は上記実施形態そのままに限定されるものではなく、実施段階ではその要旨を逸脱しない範囲で構成要素を変形して具体化できる。また、上記実施形態に開示されている複数の構成要素の適宜な組み合わせにより、種々の発明を形成できる。例えば、実施形態に示される全構成要素から幾つかの構成要素を削除してもよい。さらに、異なる実施形態にわたる構成要素を適宜組み合わせてもよい。 Note that the present invention is not limited to the above-described embodiment as it is, and can be embodied by modifying the constituent elements without departing from the scope of the invention in the implementation stage. In addition, various inventions can be formed by appropriately combining a plurality of components disclosed in the embodiment. For example, some components may be deleted from all the components shown in the embodiment. Furthermore, constituent elements over different embodiments may be appropriately combined.

FIG. 1 is a diagram showing a configuration example of a distributed hash table system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the data held by a DHT node 3.
FIG. 3 is a diagram showing an example of the data held by a primary data table.
FIG. 4 is a diagram showing an example of a tag ID.
FIG. 5 is a diagram showing a configuration example of the data registration device according to the embodiment.
FIG. 6 is a flowchart showing an example of the processing procedure of the data registration device according to the embodiment.
FIG. 7 is a flowchart showing an example of the write processing procedure of step S6 in the procedure of FIG. 6.
FIG. 8 is a diagram showing an example of the writing algorithm.
FIG. 9 is a diagram showing another configuration example of the distributed hash table system according to the embodiment.
FIG. 10 is a diagram showing another configuration example of the data registration device according to the embodiment.
FIG. 11 is a flowchart showing another example of the processing procedure of the data registration device according to the embodiment.
FIG. 12 is a flowchart showing yet another example of the processing procedure of the data registration device according to the embodiment.

Explanation of symbols

1 ... data registration apparatus, 2 ... primary data table, 3, 4 ... DHT node, 7 ... control device, 8 ... network, 11 ... instruction receiving unit, 12 ... hash sort unit, 13 ... filter unit, 14 ... writing unit, 15 ... input unit, 20 ... primary data table group, 30, 40 ... distributed hash table

Claims

1. A data registration device for registering data in a distributed hash table composed of a plurality of distributed hash node devices that store, in a distributed manner, data associated with key information on the basis of a hash value of the key information, the data registration device comprising:
input means for inputting a series of registration information, each piece including the key information and primary data that is the source of the data;
hash means for calculating, for each piece of registration information in the input series, a hash value of the key information of that registration information;
sorting means for sorting, on the basis of the hash value of the key information of each piece of registration information, the order of the registration information in the input series so that registration information to be stored in the same distributed hash node device is consecutive; and
writing means for writing, in the sorted order, all or part of the registration information included in the sorted series to the distributed hash node devices that should store that registration information.
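
The hash, sort, and write steps recited in the claim above can be illustrated with a minimal Python sketch. It is not the patented implementation: the truncated SHA-1 hash, the 32-bit hash space, the contiguous per-node ranges given by `upper_bounds`, and the function names `key_hash`, `owner`, and `register` are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left
from itertools import groupby


def key_hash(key):
    """Hash a key into a 32-bit space (truncated SHA-1; an assumed choice)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def owner(h, upper_bounds):
    """Index of the node whose range contains h.

    upper_bounds holds each node's inclusive upper bound in ascending order
    (hypothetical layout); hashes above the last bound wrap to node 0.
    """
    return bisect_left(upper_bounds, h) % len(upper_bounds)


def register(records, upper_bounds):
    """Hash and sort (key, data) records, then group them per destination node.

    Returns a list of (node_index, [(key, data), ...]) batches in write order.
    """
    hashed = sorted(((key_hash(k), k, v) for k, v in records), key=lambda t: t[0])
    batches = []
    for node, group in groupby(hashed, key=lambda t: owner(t[0], upper_bounds)):
        batches.append((node, [(k, v) for _, k, v in group]))
    return batches


records = [("tag-%d" % i, "data-%d" % i) for i in range(50)]
bounds = [2**30, 2**31, 3 * 2**30, 2**32 - 1]
batches = register(records, bounds)
```

Because the records are sorted by hash before grouping, each node index appears in at most one batch, so the writer contacts each node once per pass instead of switching destinations record by record.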

2. The data registration device according to claim 1, wherein the writing means
holds node information indicating a distributed hash node device selected in advance as a write destination from among the plurality of distributed hash node devices, and,
when writing a piece of registration information to be written in the series, first attempts to write the registration information to the write destination indicated by the node information; if the write succeeds, proceeds to the registration information that is next in order among those to be written in the series; and if the write fails because the distributed hash node device that should store the registration information differs from the distributed hash node device indicated by the node information, changes the node information to indicate another distributed hash node device and then attempts the write again.

3. The data registration device according to claim 2, wherein, when writing the registration information fails, the writing means receives, from the distributed hash node device indicated by the node information or from another distributed hash node device, information indicating the correct distributed hash node device that should store the registration information, or a distributed hash node device closer to it, and changes the node information to indicate the received distributed hash node device.
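
The retry behavior described in the two claims above (try the remembered node first, and on failure switch to the node named in the reply) might look like the following sketch. The `Node` class, its `put` method, the small integer hash space, and the fact that the hint always names the correct node directly are hypothetical simplifications; the claims also allow a hint that is merely closer to the correct node.

```python
class Node:
    """Toy DHT node owning the inclusive hash range [low, high] (hypothetical)."""

    def __init__(self, node_id, low, high):
        self.node_id, self.low, self.high = node_id, low, high
        self.store = {}

    def put(self, h, value, ring):
        """Store value if h is in range; otherwise refuse and return a hint."""
        if self.low <= h <= self.high:
            self.store[h] = value
            return True, self.node_id
        # Assumption: the refusing node knows the correct owner and reports it.
        correct = next(n.node_id for n in ring if n.low <= h <= n.high)
        return False, correct


def write_in_order(items, ring, start=0):
    """Write (hash, value) items in the given order; return the attempt count."""
    current = start  # the remembered "node information"
    attempts = 0
    for h, value in items:
        while True:
            attempts += 1
            ok, hint = ring[current].put(h, value, ring)
            if ok:
                break  # move on to the next item, keeping the same node
            current = hint  # change the node information, then retry
    return attempts


def make_ring():
    return [Node(0, 0, 24), Node(1, 25, 49), Node(2, 50, 74), Node(3, 75, 99)]


items = [(80, "e"), (3, "a"), (30, "c"), (10, "b"), (31, "d")]
unsorted_attempts = write_in_order(items, make_ring())
sorted_attempts = write_in_order(sorted(items), make_ring())
```

In this toy run the sorted sequence needs 7 attempts (5 writes plus 2 redirects), while the unsorted one needs 10, which illustrates why keeping same-node records consecutive reduces failed writes.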

4. The data registration device according to claim 1, wherein, when a copy command is received from outside, the writing means performs the writing for all of the registration information included in the sorted series.

5. The data registration device according to claim 1, further comprising selection means for selecting, before the writing means performs the writing, whether to perform the writing for each piece of registration information included in the sorted series,
wherein the writing means performs the writing only for the registration information, among the registration information included in the sorted series, for which the selection means has selected that the writing be performed.

6. The data registration device according to claim 5, further comprising means for receiving a message indicating list information of hash values of key information for data missing from the distributed hash table,
wherein the selection means selects, for each piece of registration information included in the sorted series, that the writing be performed when the hash value of the key information of the registration information is included in the list information.

7. The data registration device according to claim 5, wherein the selection means selects, for each piece of registration information included in the sorted series, that the writing be performed when the registration information has not yet been written since the content of its primary data was last updated.

8. The data registration device according to claim 7, wherein the registration information includes information indicating the last update time of its primary data, and
the selection means selects that the writing be performed when the last update time included in the registration information is later than the time of the last writing.
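
The selection criteria in the claims above (hash reported as missing, or primary data updated after the last write) can be sketched as a simple filter. The `Registration` dataclass, its field names, and the decision to combine both criteria with a logical OR are assumptions for illustration; the claims present the two criteria separately.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    key_hash: int       # hash value of the key information
    data: str           # primary data to be registered
    last_update: float  # last update time of the primary data


def select_for_write(sorted_regs, missing_hashes, last_write_time):
    """Keep only the registrations that should be written in this pass.

    A registration is kept when its hash is in the reported missing list, or
    when its primary data changed after the last write. Order is preserved,
    so the result is still sorted by hash.
    """
    return [
        r for r in sorted_regs
        if r.key_hash in missing_hashes or r.last_update > last_write_time
    ]


regs = [
    Registration(10, "a", last_update=100.0),
    Registration(20, "b", last_update=250.0),
    Registration(30, "c", last_update=90.0),
]
selected = select_for_write(regs, missing_hashes={30}, last_write_time=200.0)
```

Here the record with hash 20 is selected because it was updated after the last write, and the record with hash 30 because it is reported missing.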

9. The data registration device according to claim 1, wherein the writing means inserts a random waiting time when the distributed hash node device serving as the write destination changes during the writing.
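
A random wait on a change of write destination, as recited in the claim above, could be sketched as follows. The function name, the uniform distribution, and the 0.5 second upper limit are assumptions; the claim does not specify how the waiting time is drawn.

```python
import random
import time


def wait_if_destination_changed(previous, current, max_wait_s=0.5,
                                sleep=time.sleep, rng=random):
    """Sleep for a random time when the write destination changes.

    Returns the delay that was used (0.0 when the destination is unchanged).
    sleep and rng are injectable so the behavior can be checked without waiting.
    """
    if previous == current:
        return 0.0
    delay = rng.uniform(0.0, max_wait_s)
    sleep(delay)
    return delay


slept = []
delay = wait_if_destination_changed("node-1", "node-2",
                                    sleep=slept.append, rng=random.Random(0))
no_delay = wait_if_destination_changed("node-2", "node-2", sleep=slept.append)
```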

10. The data registration device according to claim 1, wherein each of the plurality of distributed hash node devices stores data for key information whose hash value falls within a range of consecutive hash values assigned to that device out of the hash value space used in the distributed hash table, and
the sorting means sorts the registration information in the input series in ascending or descending order of the hash values of the key information.

11. The data registration device according to claim 1, wherein the writing means starts the writing of the registration information included in the sorted series from a piece of registration information selected at random from the sorted series.
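
Starting from a randomly selected position in the sorted series, as in the claim above, can be sketched as a rotation of the list. Wrapping around to the beginning of the series after reaching its end is an assumption of this sketch, as is the function name.

```python
import random


def rotate_from_random_start(sorted_items, rng=random):
    """Return the sorted items rotated to begin at a randomly chosen element.

    Same-node records stay consecutive except at the single wrap-around point.
    """
    if not sorted_items:
        return []
    i = rng.randrange(len(sorted_items))
    return sorted_items[i:] + sorted_items[:i]


order = rotate_from_random_start(list(range(10)), random.Random(1))
```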

12. The data registration device according to claim 1, wherein the writing means performs the writing repeatedly at regular intervals.

13. The data registration device according to claim 1, further comprising means for receiving a message indicating information on a hash algorithm and a sorting algorithm for the distributed hash table,
wherein the hash means calculates the hash value from the key information according to the hash algorithm, and
the sorting means sorts the order of the registration information in the series according to the sorting algorithm.

14. A data registration method for a data registration device that registers data in a distributed hash table composed of a plurality of distributed hash node devices that store, in a distributed manner, data associated with key information on the basis of a hash value of the key information, the method comprising:
inputting a series of registration information, each piece including the key information and primary data that is the source of the data;
calculating, for each piece of registration information in the input series, a hash value of the key information of that registration information;
sorting, on the basis of the hash value of the key information of each piece of registration information, the order of the registration information in the input series so that registration information to be stored in the same distributed hash node device is consecutive; and
writing, in the sorted order, all or part of the registration information included in the sorted series to the distributed hash node devices that should store that registration information.

15. A program for causing a computer to function as a data registration device that registers data in a distributed hash table composed of a plurality of distributed hash node devices that store, in a distributed manner, data associated with key information on the basis of a hash value of the key information, the program causing the computer to function as:
input means for inputting a series of registration information, each piece including the key information and primary data that is the source of the data;
hash means for calculating, for each piece of registration information in the input series, a hash value of the key information of that registration information;
sorting means for sorting, on the basis of the hash value of the key information of each piece of registration information, the order of the registration information in the input series so that registration information to be stored in the same distributed hash node device is consecutive; and
writing means for writing, in the sorted order, all or part of the registration information included in the sorted series to the distributed hash node devices that should store that registration information.