JP5723309B2 - Server and program

Info

Publication number: JP5723309B2
Application number: JP2012047991A
Authority: JP
Inventors: 絵里子岩佐; 近藤　悟; 悟近藤; 道生入江; 雅志金子; 健福元
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-03-05
Filing date: 2012-03-05
Publication date: 2015-05-27
Anticipated expiration: 2032-03-05
Also published as: JP2013182575A

Description

The present invention relates to a technique for replicating data among a plurality of servers that constitute a cluster and process data cooperatively.

In recent years, with the rise of cloud computing, there has been a demand for processing and retaining large amounts of data efficiently. Distributed processing technology, which achieves efficient processing by operating a plurality of servers cooperatively, has developed in response.

In distributed processing, the data to be processed (managed) must be distributed among the servers that constitute the cluster (hereinafter also referred to as "cluster members" or "members"). To raise the processing capacity of the cluster as a whole, it is desirable that the number of data items (amount of data) handled by each cluster member be roughly equal.

A typical distribution method assigns each data item to the cluster member whose number equals the remainder obtained by dividing the value of a hash function applied to the item's key (hereinafter "hash(key)") by the number N of cluster members, that is, "hash(key) mod N". This presupposes that the numbers "0" to "N-1" have been assigned to the cluster members in advance. With this method, adding a cluster member changes the value of N, so the responsible cluster member changes for much of the data, and that data must be relocated.
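The relocation cost of the "hash(key) mod N" method can be illustrated with a small sketch. The hash function (MD5), the key names, and the helper names below are assumptions made for illustration only; the source does not specify them. For a uniformly distributed hash, a key keeps its member when N goes from 4 to 5 only if hash(key) mod 20 is less than 4, so about 80% of keys are expected to move.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash of a key (MD5 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def member_for(key: str, n_members: int) -> int:
    """Member number in 0..N-1 given by hash(key) mod N."""
    return stable_hash(key) % n_members

def moved_fraction(keys, n_before: int, n_after: int) -> float:
    """Fraction of keys whose responsible member changes when N changes."""
    moved = sum(1 for k in keys if member_for(k, n_before) != member_for(k, n_after))
    return moved / len(keys)

# Hypothetical workload: 10,000 keys, cluster grows from 4 to 5 members.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.1%} of keys change member when N goes from 4 to 5")
```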

A distribution method based on consistent hashing (see Non-Patent Document 1) limits the number of data items whose responsible cluster member changes when a cluster member is added to about 1/N. Consistent hashing is used in Amazon Dynamo (see Non-Patent Document 2), among others.

In this method, an ID (IDentifier) is assigned to both the cluster members and the data, and the first cluster member encountered when the ID space is traced clockwise from a data item's ID becomes responsible for that data item.

When a large amount of data is managed by a clustered distributed processing system, data redundancy must be provided by keeping replicas of the data so that, even if one cluster member fails, the other cluster members can continue processing. The same applies to a distributed processing system that manages data by consistent hashing.

As shown in FIG. 5, in consistent hashing, IDs are assigned to both the cluster members (members 1 to 4) and the data (data A to D, shown as black circles (●)), and the first cluster member encountered when the ID space is traced clockwise from a data item's ID is made responsible for that data. The cluster member immediately to the right of the responsible member (the next one clockwise) is then made responsible for the replica.

For example, in FIG. 5, member 1, the first member encountered when the ID space is traced clockwise from data A, is responsible for data A, and its replica is assigned to member 2, the right-hand neighbor of member 1 in the ID space. Determining the members responsible for the original and the replica in this way has the advantage that, even when a cluster member leaves, the cluster member that holds the replica can take over as the member newly responsible for the data. When more than one replica is kept, the second replica can be assigned to the next cluster member further to the right.
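The clockwise lookup and the placement of replicas on the following members can be sketched as below. This is a minimal illustration, not the implementation described in the source; the class name, the integer IDs, and the member names are hypothetical.

```python
import bisect

class Ring:
    """A circular ID space holding cluster members at integer IDs."""

    def __init__(self, members):
        # members: dict mapping ring ID -> member name (illustrative values)
        self.ids = sorted(members)
        self.members = dict(members)

    def owners(self, data_id, copies=2):
        """Members holding the original (first entry) and the replicas of data_id."""
        # First member whose ID is >= data_id, wrapping around the ring.
        start = bisect.bisect_left(self.ids, data_id) % len(self.ids)
        count = min(copies, len(self.ids))
        return [self.members[self.ids[(start + i) % len(self.ids)]]
                for i in range(count)]

ring = Ring({100: "member1", 200: "member2", 300: "member3", 400: "member4"})
print(ring.owners(50))             # original on member1, replica on member2
print(ring.owners(250, copies=3))  # one original and two replicas
```

If member1 leaves this ring, the same lookup for ID 50 returns member2 first, which already holds the replica; this is the takeover property described above.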

Non-Patent Document 1: David Karger et al., “Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web”, [online], 1997, ACM, [searched February 20, 2012], Internet <URL: http://www.akamai.com/dl/technical_publications/ConsistenHashingandRandomTreesDistributedCachingprotocolsforrelievingHotSpotsontheworldwideweb.pdf>
Non-Patent Document 2: Giuseppe DeCandia et al., “Dynamo: Amazon's Highly Available Key-value Store”, [online], 2007, ACM, [searched February 20, 2012], Internet <URL: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf>

When defining a concrete policy for assigning servers to cluster members in the ID space, there are cases where operation under two different policies, rather than one, is desired.

As described above, data management by consistent hashing places the replica on the right-hand neighbor of the cluster member responsible for the data. Suppose, as shown in FIG. 6, that the members of the cluster are selected from K (here, five) geographically separated data center areas (areas under the jurisdiction of data centers that manage servers).

Consider a case in which servers become unusable in units of data center areas because of, for example, a catastrophic disaster or a large-scale failure (hereinafter collectively "catastrophic disaster"). In this case, many servers in the cluster are removed at the same time. In the ID space of consistent hashing, servers are usually placed at random without regard to their physical location, so if the removed servers are adjacent cluster members in the ID space, both the original data and the replica are lost (when there is one replica).

On the other hand, considering the communication time and other costs of creating a replica on the right-hand neighbor in the ID space, it is better for cluster members adjacent in the ID space to be physically close to each other.

In other words, if the ID space is created with the physical locations of the servers in mind, an ID space suited to high-speed data replication, an ID space suited to catastrophic disasters, and so on can be created. However, a method that uses a single ID space cannot, for example, achieve both high-speed data replication and catastrophic disaster countermeasures.

The present invention has been made in view of these circumstances, and its object is to provide a server, and a program for it, that allow a plurality of different policies for assigning servers as cluster members to be operated side by side in a distributed processing system.

To solve this problem, the present invention is a server in a distributed processing system in which a plurality of data items to be managed, and a plurality of servers that manage the data and constitute a cluster, are allocated to a circular ID space, and in which each server manages the data located between itself and the next server in a predetermined direction in the ID space and stores replicas of the data located between that next server and the server after it in the predetermined direction, the server processing requests from client machines distributed by a load balancer. Each of the plurality of servers belongs to one of three or more predetermined physical regions. The server comprises: a storage unit that stores first ID space management information, in which a first ID space is divided into as many parts as there are regions and servers of the same region are placed in each part, and second ID space management information, in which servers of the same region are placed so as not to be adjacent in a second ID space; and a processing unit that processes a request from a client machine distributed by the load balancer, refers to the first ID space management information and requests the next server in the predetermined direction from itself in the first ID space to store a replica of the data used in processing the request, and refers to the second ID space management information and requests the next server in the predetermined direction from itself in the second ID space to store a replica of the data used in processing the request.

With this configuration, using first ID space management information having a first ID space for high-speed data replication and second ID space management information having a second ID space for catastrophic disaster countermeasures makes it possible to realize a server in a distributed processing system that achieves both high-speed data replication and catastrophic disaster countermeasures. That is, by using a plurality of different ID spaces, a server can be provided that allows a plurality of different server assignment policies to be operated side by side.

Further, in the present invention, when the processing unit refers to the first ID space management information, in which the first ID space is divided into as many parts as there are regions and servers of the same region are placed in each part, and requests the next server in the predetermined direction from itself in the first ID space to store a replica of the data used in processing the request, it does so synchronously with the processing of the request; and when it refers to the second ID space management information, in which servers of the same region are placed so as not to be adjacent in the second ID space, and requests the next server in the predetermined direction from itself in the second ID space to store a replica of the data used in processing the request, it does so asynchronously with the processing of the request.

With this configuration, data replication using the first ID space management information, which takes little time, is performed synchronously with request processing, and data replication using the second ID space management information, which takes longer, is performed asynchronously with request processing, which reduces the possibility of processing delays in the server.

The present invention is also a program for causing a computer to function as the server.

Such a program allows a computer on which it is installed to function as the server.

According to the present invention, a server and a program for it can be provided that, by using a plurality of different ID spaces in a distributed processing system, allow a plurality of different server assignment policies to be operated side by side.

FIG. 1 is an explanatory diagram of the outline of the present invention.
FIG. 2 is a configuration diagram of the distributed processing system and related elements of the present embodiment.
FIG. 3(a) shows an example of the first ID space management information, and FIG. 3(b) shows an example of the second ID space management information.
FIG. 4 is a flowchart showing the flow of processing by the server of the present embodiment.
FIG. 5 is an explanatory diagram of conventional consistent hashing.
FIG. 6 is an explanatory diagram of the physical positions of cluster members in conventional consistent hashing.

Modes for carrying out the present invention (hereinafter "embodiments") are described below with reference to the drawings (including, where appropriate, drawings other than those explicitly mentioned). To make the description easier to follow, an outline of the present embodiment is first given with reference to FIG. 1, followed by the details.

(Outline of this embodiment)
In this embodiment, the policies of high-speed data replication and catastrophic disaster countermeasures are used as examples of different policies for assigning servers as cluster members.
First, as shown in FIG. 1(a), the first ID space for high-speed data replication is divided in advance according to the number K of data center areas (a natural number of 3 or more; here, 5). When a server is added to the cluster, the insertion point of the new cluster member in the ID space is determined. For example, the place in the first ID space where the distance between cluster members (the distance in the first ID space) is greatest is found, and the midpoint between those cluster members is chosen as the insertion point for the new cluster member's ID ((1) in FIG. 1(a)). This position is chosen because, in a clustered distributed processing system, it is desirable for the load to be as even as possible across cluster members, and, statistically, when the number of data items is far greater than the number of cluster members, the number of data items handled by each cluster member is approximately proportional to the distance between cluster members in the ID space.

After the insertion point of the new cluster member in the first ID space has been determined ((1) in FIG. 1(a)), the data center area corresponding to that insertion point is identified, and a server physically located in that data center area is selected ((2) in FIG. 1(a)) and incorporated as a cluster member. This algorithm selects servers with their physical location taken into account, which achieves high-speed data replication and also evens out the processing load across servers.
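The two steps for the first ID space (find the midpoint of the widest gap, then take a server from the data center area that covers that position) might look like the sketch below. It assumes an integer ID space divided into K equal arcs numbered 0 to K-1 and a hypothetical pool of spare servers per region; the source states only that the space is divided in advance according to K, so the equal-arc layout and all names here are illustrative.

```python
RING_SIZE = 2**32  # size of the ID space; an assumption for this sketch

def widest_gap_midpoint(member_ids, ring_size=RING_SIZE):
    """Return the midpoint of the widest clockwise gap between adjacent members."""
    ids = sorted(member_ids)
    best_gap, best_start = -1, None
    for i, start in enumerate(ids):
        nxt = ids[(i + 1) % len(ids)]
        # Clockwise distance to the next member; a lone member spans the whole ring.
        gap = (nxt - start) % ring_size or ring_size
        if gap > best_gap:
            best_gap, best_start = gap, start
    return (best_start + best_gap // 2) % ring_size

def region_of_position(position, k_regions, ring_size=RING_SIZE):
    """Map a ring position to one of K equal arcs, numbered 0..K-1."""
    return position * k_regions // ring_size

def add_member(member_ids, spare_servers, k_regions, ring_size=RING_SIZE):
    """Pick the insertion ID, then a spare server located in the matching region."""
    new_id = widest_gap_midpoint(member_ids, ring_size)
    region = region_of_position(new_id, k_regions, ring_size)
    server = spare_servers[region].pop()  # raises if that region has no spare server
    return new_id, region, server
```

For instance, on a ring of size 1000 with K = 5 and members at 0, 100, 200, and 700, the widest gap runs from 200 to 700, so the new ID is 450, which falls in arc 2.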

Choosing the midpoint of the widest gap between cluster members in the first ID space when adding a server to the cluster is only one example. The position selected in the first ID space may be arbitrary; what is essential is to select a server physically located in the data center area (hereinafter also "region") corresponding to the selected position in that ID space. This achieves high-speed data replication.

Next, as shown in FIG. 1(b), in the second ID space for catastrophic disaster countermeasures, the servers are first arranged in a distributed manner as illustrated (in order, region by region) so that servers of the same region are not adjacent.

When a server is added, the place in the second ID space where the distance between cluster members (the distance in the second ID space) is greatest is found, and the midpoint between those cluster members is chosen as the insertion point for the new cluster member's ID ((1) in FIG. 1(b)).

After the insertion point of the new cluster member in the second ID space has been determined ((1) in FIG. 1(b)), the regions of the K-1 servers surrounding the insertion point (at a minimum, just the two immediate neighbors) are examined ((2) in FIG. 1(b)). A server in a region to which none of those K-1 servers (at a minimum, the two neighbors) belongs is then selected ((3) in FIG. 1(b)) and added.

Specifically, "the K-1 surrounding servers" means, when K is odd, {(K-1)/2} servers on each side of the insertion point. When K is even, it means (K/2) servers on one side of the insertion point and {(K-2)/2} servers on the other.

With this algorithm, servers of the same region are never adjacent in the second ID space, so data loss is avoided even if all the servers in one of the three or more regions go down. The processing load across servers is also evened out.
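The region check for the second ID space can be sketched as follows. The ring is represented as a sorted list of (ID, region) pairs, and for even K the larger half, K/2, is taken on the clockwise side; the source does not say which side gets the larger half, so that choice, like the function names and sample values, is an assumption.

```python
import bisect

def surrounding_regions(ring, position, k_regions):
    """Regions of the K-1 servers around `position`.

    ring: list of (server_id, region) pairs sorted by server_id.
    """
    if not ring:
        return set()
    n = len(ring)
    ids = [server_id for server_id, _ in ring]
    idx = bisect.bisect_left(ids, position)  # clockwise successor (n means wrap to 0)
    after = k_regions // 2                   # K odd: (K-1)/2, K even: K/2
    before = (k_regions - 1) - after         # K odd: (K-1)/2, K even: (K-2)/2
    regions = {ring[(idx + i) % n][1] for i in range(after)}
    regions |= {ring[(idx - 1 - j) % n][1] for j in range(before)}
    return regions

def choose_region(ring, position, all_regions):
    """Pick a region that none of the K-1 surrounding servers belongs to."""
    used = surrounding_regions(ring, position, len(all_regions))
    candidates = [r for r in all_regions if r not in used]
    if not candidates:
        raise ValueError("every region already appears among the surrounding servers")
    return candidates[0]

regions = ["A", "B", "C", "D", "E"]
ring = [(0, "A"), (100, "B"), (200, "C"), (300, "D"), (400, "E")]
print(choose_region(ring, 250, regions))
```

Because only K-1 servers are examined and there are K regions, at least one region is always left over, so `choose_region` does not fail when every region is listed in `all_regions`.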

To restate, because of the properties of consistent hashing, "the probability that any data item is lost (together with its replicas)" equals "the probability that M servers of the same region are placed consecutively", where M is the redundancy (the total number of original and replica copies). With the algorithm described for FIG. 1(b), this probability is always 0% when M ≥ 2. That is, even if one region is completely destroyed (all servers in the region go down), no data is lost as long as M is 2 or more (a replica always remains).

When the regions of the "K-1 surrounding" servers, rather than only the "two neighbors", of the insertion point are examined, then for odd K any {(K+1)/2} consecutive servers in the second ID space all belong to different regions, and for even K any (K/2) consecutive servers in the second ID space all belong to different regions. As a result, when the redundancy is 3 or more, the expected number of servers still holding replicas of the original data held by the failed servers is higher even if all the servers of one region go down.

For example, with K = 5 and M = 3, even if one region is completely destroyed, the servers that hold replicas of the original data held by that region's servers all belong to separate regions other than that one and continue to hold the replicas. In that case, even if two regions are completely destroyed, one server is certain to hold a replica.

Increasing K and M raises the likelihood that replicas survive even when more regions are completely destroyed. For example, with K = 7 and M = 4, even if three regions are completely destroyed, one server is certain to hold a replica.
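The two numerical examples can be checked mechanically. The sketch below treats the second ID space as a clockwise sequence of region labels and asks whether every run of M consecutive servers keeps at least one server outside the failed regions. The round-robin arrangement used in the usage lines is an assumed layout in the spirit of FIG. 1(b), and the function names are hypothetical.

```python
from itertools import combinations

def survives(region_sequence, redundancy, failed_regions):
    """True if every group of `redundancy` consecutive servers has a survivor."""
    n = len(region_sequence)
    m = min(redundancy, n)
    failed = set(failed_regions)
    for start in range(n):
        # Regions of the servers holding the original and the replicas of one data range.
        holders = [region_sequence[(start + i) % n] for i in range(m)]
        if all(region in failed for region in holders):
            return False
    return True

def tolerates_any(region_sequence, redundancy, n_failed):
    """True if no choice of `n_failed` simultaneously failed regions loses data."""
    regions = sorted(set(region_sequence))
    return all(survives(region_sequence, redundancy, combo)
               for combo in combinations(regions, n_failed))

five_regions = list("ABCDE") * 2     # K = 5, two servers per region, round-robin
seven_regions = list("ABCDEFG") * 2  # K = 7, two servers per region, round-robin
print(tolerates_any(five_regions, 3, 2))   # K = 5, M = 3, two regions destroyed
print(tolerates_any(seven_regions, 4, 3))  # K = 7, M = 4, three regions destroyed
```

The same check reports a loss as soon as two servers of one region sit next to each other with M = 2, which is the situation the second ID space is designed to rule out.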

Choosing the midpoint of the widest gap between cluster members in the second ID space when adding a server to the cluster is, again, only one example. The position selected in the second ID space may be arbitrary; what is essential is to select a server from a region to which neither of the servers adjacent to the selected position in the second ID space belongs. This avoids data loss even if all the servers in one of the three or more regions go down; in other words, it provides the effect of a catastrophic disaster countermeasure.

(Embodiment)
Next, the present embodiment is described. As shown in FIG. 2, the distributed processing system 1000 of the present embodiment includes a load balancer 3 and a plurality of servers 4 constituting a cluster 100. The load balancer 3 is connected to a plurality of client machines 1 via a network 2 such as the Internet.

The main overall operation is as follows. The load balancer 3 receives a request from a client machine 1 via the network 2 and distributes it to one of the servers 4. The server 4 to which the request is distributed processes the request and requests two other servers 4 to store a replica of the data.

Next, the configurations of the load balancer 3 and the server 4 are described.
The load balancer 3 includes a storage unit 31, a processing unit 32, an input unit 33, a display unit 34, and a communication unit 35.
The storage unit 31 is a means for storing information and consists of memory such as RAM (Random Access Memory) and ROM (Read Only Memory), an HDD (Hard Disk Drive), and the like. The storage unit 31 stores first ID space management information 311 and second ID space management information 312. The first ID space management information 311 is identical to the first ID space management information 411, and the second ID space management information 312 is identical to the second ID space management information 412; the first ID space management information 411 and the second ID space management information 412 are described later. The storage unit 31 also stores the operating program of the processing unit 32 and the like (not shown).

The processing unit 32 is a means for performing arithmetic processing based on the information stored in the storage unit 31 and consists of, for example, a CPU (Central Processing Unit). When the processing unit 32 updates the first ID space management information 311, for example because a server 4 is added to the cluster 100, it sends the latest first ID space management information 311 to all the servers 4. Likewise, when it updates the second ID space management information 312, for example because a server 4 is added to the cluster 100, it sends the latest second ID space management information 312 to all the servers 4.

The processing unit 32 also determines to which server 4 a request from a client machine 1 received by the communication unit 35 is distributed. The decision is based on either the first ID space management information 311 or the second ID space management information 312; which of the two is used is fixed in advance for the system as a whole.

The input unit 33 is a means by which a user of the server 4 inputs information and is realized by, for example, a keyboard and a mouse.
The display unit 34 is a means for displaying information and is realized by, for example, an LCD (Liquid Crystal Display).
The communication unit 35 is a communication interface used for communication with external devices.

The server 4 is a computer that processes requests from client machines 1 distributed by the load balancer 3. As described above, consistent hashing presupposes that a plurality of data items to be managed, and the plurality of servers 4 that manage the data and constitute the cluster 100, are allocated to a circular ID space, and that each server 4 manages the data located between itself and the next server 4 clockwise (in the predetermined direction) in the ID space and stores replicas of the data located between that next server 4 and the server 4 after it clockwise (in the predetermined direction). Each of the servers 4 belongs to one of three or more predetermined physical regions.

The server 4 includes a storage unit 41, a processing unit 42, and a communication unit 43.
The storage unit 41 is a means for storing information and consists of memory such as RAM and ROM, an HDD, and the like. The storage unit 41 stores the first ID space management information 311 and the second ID space management information 312 received from the load balancer 3 as first ID space management information 411 and second ID space management information 412, respectively. The storage unit 41 also stores the operating program of the processing unit 42 and the like (not shown).

The processing unit 42 is a means for performing arithmetic processing based on the information stored in the storage unit 41 and consists of, for example, a CPU. The processing unit 42 processes requests from client machines 1 distributed by the load balancer 3. It also refers to the first ID space management information 411 and requests the next server 4 in the predetermined direction from itself in the first ID space to store a replica of the data used in processing the request. In addition, it refers to the second ID space management information 412 and requests the next server 4 in the predetermined direction from itself in the second ID space to store a replica of the data used in processing the request.
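How a server might derive its two replica destinations from the two sets of management information is sketched below. Each ID space is modeled as a dictionary from ring ID to server IP address; the IDs, the addresses (taken from the 192.0.2.0/24 documentation range), and the function names are hypothetical and do not reproduce the tables of FIG. 3.

```python
import bisect

def next_server(space, own_id):
    """IP address of the next server clockwise from `own_id` in one ID space."""
    ids = sorted(space)
    idx = bisect.bisect_right(ids, own_id) % len(ids)  # wrap past the largest ID
    return space[ids[idx]]

def replica_targets(first_space, second_space, own_ip):
    """Replica destinations in the first and the second ID space for one server."""
    def own_id(space):
        # A server may sit at a different ID in each space, so look it up per space.
        return next(ring_id for ring_id, ip in space.items() if ip == own_ip)
    return (next_server(first_space, own_id(first_space)),
            next_server(second_space, own_id(second_space)))

# Illustrative management information: ring ID -> IP address.
first_space = {100: "192.0.2.1", 200: "192.0.2.2", 300: "192.0.2.3", 400: "192.0.2.4"}
second_space = {100: "192.0.2.1", 200: "192.0.2.3", 300: "192.0.2.2", 400: "192.0.2.4"}
print(replica_targets(first_space, second_space, "192.0.2.2"))
```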

When the processing unit 42 refers to the first ID space management information 411 and requests the next server 4 in the predetermined direction from itself in the first ID space to store a copy of the data used for processing the request, it preferably does so synchronously with the processing of the request. When the processing unit 42 refers to the second ID space management information 412 and requests the next server 4 in the predetermined direction from itself in the second ID space to store a copy of the data used for processing the request, it preferably does so asynchronously with the processing of the request (at a convenient time).

In this way, if data replication using the first ID space management information 411, which takes a short time, is performed synchronously with the request processing, and data replication using the second ID space management information 412, which takes a long time, is performed asynchronously with the request processing, the possibility of processing delay in the server 4 can be reduced.
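The split between synchronous and asynchronous replication described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation in the specification: the class `ReplicatingServer`, the callback `send_replica`, and the use of a background thread with a queue are all hypothetical choices, and the request processing is replaced by a trivial stand-in.

```python
import queue
import threading


class ReplicatingServer:
    """Hypothetical sketch: sync copy via the first ID space, async via the second."""

    def __init__(self, send_replica):
        # send_replica(space, data) is an assumed transport callback.
        self._send = send_replica
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def handle_request(self, data):
        result = data.upper()          # stand-in for the real request processing
        self._send("first", result)    # synchronous: done before the reply returns
        self._pending.put(result)      # asynchronous: handed to the background worker
        return result

    def _drain(self):
        # Background loop that performs the slower second-ID-space replication.
        while True:
            item = self._pending.get()
            self._send("second", item)
            self._pending.task_done()

    def flush(self):
        # Block until every queued asynchronous replication has been sent.
        self._pending.join()
```

In this sketch the reply to the client waits only for the copy within the same region, while the copy to another region is sent whenever the worker gets to it.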

The communication unit 43 is a communication interface used for communication with external devices.
The server 4 may also include an input unit and a display unit.

Next, the first ID space management information 411 will be described. In the first ID space management information 411, the first ID space is divided into as many parts as there are regions, and the servers 4 of the same region are arranged in each of the parts (see FIG. 1(a)).
FIG. 3(a) shows, for the first ID space management information 411, the IDs in the ID space and the IP addresses that serve as identifiers of the servers 4.

The first ID space management information 411 is information for managing which server 4 is in charge of each data item to be managed, using the ID calculated from the data by a predetermined hash conversion. Each ID is an ID in the ID space; it is stored in order to specify the range of data that a server 4 is in charge of managing, and the entries are sorted by ID value.

For example, in FIG. 3(a), the ID value "446D0BE6" in the first row indicates that the server 4 with the IP address "192.168.0.10" is in charge of the data belonging to the range of identifiers "00000000" to "446D0BE6". The second row indicates that the server 4 with the IP address "192.168.0.25" is in charge of the data whose identifiers range from the ID value of the previous row plus 1 up to the ID of that row.
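The range lookup described for FIG. 3(a) can be illustrated with a short sketch. Only the first ID ("446D0BE6") and the two IP addresses "192.168.0.10" and "192.168.0.25" come from the text; the other ID values, the third row, and the function name `lookup_server` are assumptions made for the example.

```python
from bisect import bisect_left

# Rows of (upper-bound ID, server IP), sorted by ID as in the table.
FIRST_ID_SPACE = [
    (0x446D0BE6, "192.168.0.10"),
    (0x8A3F0000, "192.168.0.25"),  # ID value is a made-up placeholder
    (0xFFFFFFFF, "192.168.0.40"),  # entire row is a made-up placeholder
]


def lookup_server(table, data_id):
    """Return the IP of the server whose ID range contains data_id."""
    ids = [row_id for row_id, _ in table]
    # First row whose ID is >= data_id; that row's server owns the range
    # from (previous row's ID + 1) up to its own ID.
    index = bisect_left(ids, data_id)
    # Wrap to the first row if data_id lies beyond the largest ID.
    return table[index % len(table)][1]
```

Because the rows are sorted by ID, a binary search is enough to find the responsible server, which is one plausible reason for keeping the table sorted.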

Next, the second ID space management information 412 will be described. In the second ID space management information 412, the servers 4 are arranged so that servers 4 of the same region are not adjacent to each other in the second ID space (see FIG. 1(b)). FIG. 3(b) shows, for the second ID space management information 412, the IDs in the ID space and the IP addresses that serve as identifiers of the servers 4.
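One way to obtain an arrangement in which servers of the same region are not adjacent is a round-robin interleave across regions, followed by a check of the resulting ring. The specification does not say how the arrangement is produced, so the following is only a sketch: the function names and the server labels are hypothetical, and the interleave is guaranteed to work only when the regions hold equal numbers of servers, which is why the check is kept separate.

```python
from itertools import zip_longest


def interleave_by_region(servers_by_region):
    """Order servers round-robin across regions (region -> list of servers)."""
    columns = zip_longest(*servers_by_region.values())
    return [s for column in columns for s in column if s is not None]


def same_region_adjacent(ring, region_of):
    """Return True if any two neighbours on the circular ring share a region."""
    n = len(ring)
    return any(
        region_of[ring[i]] == region_of[ring[(i + 1) % n]]
        for i in range(n)
    )


# Hypothetical example with three regions of two servers each.
REGIONS = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1", "c2"]}
REGION_OF = {s: r for r, servers in REGIONS.items() for s in servers}
SECOND_RING = interleave_by_region(REGIONS)
```

With unequal region sizes the interleave can leave two servers of the same region next to each other, so `same_region_adjacent` would need to be run on the result before it is used.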

Next, the processing performed by the server 4 will be described.
As shown in FIG. 4, in step S1, the processing unit 42 of the server 4 determines whether a request from a client machine 1 has been received from the load balancer 3. If Yes, the process proceeds to step S2; if No, the process returns to step S1.
In step S2, the processing unit 42 performs the data processing for the request.

In step S3, the processing unit 42 refers to the first ID space management information 411 in the storage unit 41 and requests the adjacent server 4 in the clockwise direction in the first ID space to store a copy of the data.

In step S4, the processing unit 42 refers to the second ID space management information 412 in the storage unit 41 and requests the adjacent server 4 in the clockwise direction in the second ID space to store a copy of the data.
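Selecting the replication targets in steps S3 and S4 amounts to finding the clockwise neighbour of the server in each of the two ID spaces. The sketch below assumes each ID space is held as a list of (ID, IP address) pairs sorted by ID; the IDs, the IP addresses, and the function names are hypothetical and chosen so that the third octet stands for the region.

```python
def next_clockwise(ring, self_ip):
    """Return the IP that follows self_ip on the ring, wrapping at the end."""
    ips = [ip for _, ip in ring]
    position = ips.index(self_ip)
    return ips[(position + 1) % len(ips)]


def replication_targets(first_ring, second_ring, self_ip):
    """Return (target for step S3, target for step S4) for the given server."""
    return (
        next_clockwise(first_ring, self_ip),
        next_clockwise(second_ring, self_ip),
    )


# First ID space: servers of the same region placed together.
FIRST_RING = [
    (0x10, "10.0.1.1"), (0x20, "10.0.1.2"),
    (0x30, "10.0.2.1"), (0x40, "10.0.2.2"),
    (0x50, "10.0.3.1"), (0x60, "10.0.3.2"),
]
# Second ID space: servers of the same region never adjacent.
SECOND_RING = [
    (0x10, "10.0.1.1"), (0x20, "10.0.2.1"), (0x30, "10.0.3.1"),
    (0x40, "10.0.1.2"), (0x50, "10.0.2.2"), (0x60, "10.0.3.2"),
]
```

For server "10.0.1.1" in this example, the step S3 target is a server in its own region and the step S4 target is a server in a different region.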

As described above, according to the present embodiment, using the first ID space management information 411, which has the first ID space for high-speed data replication, together with the second ID space management information 412, which has the second ID space for protection against catastrophic disasters, makes it possible to realize a server 4 in the distributed processing system 1000 that achieves both high-speed data replication and protection against catastrophic disasters. In other words, by using a plurality of different ID spaces, it is possible to provide a server 4 that allows a plurality of different policies for allocating the servers 4 to be operated side by side, which enables flexible operation of the distributed processing system 1000.

Furthermore, if the server 4 performs data replication using the first ID space management information 411, which takes a short time, synchronously with the request processing, and performs data replication using the second ID space management information 412, which takes a long time, asynchronously with the request processing, the possibility of processing delay can be reduced.

This concludes the description of the present embodiment, but the aspects of the present invention are not limited to it.
For example, the present embodiment has been described using the case of two different ID spaces that reflect the two policies of high-speed data replication and protection against catastrophic disasters. However, the invention is not limited to this. Simply using a plurality of ID spaces with different policies can be expected to provide effects such as a reduced possibility of data loss, so a plurality of different ID spaces based on various policies can be used.

In addition, although the description used the data center area as the unit of a region, other units may be adopted, such as subdivisions of a data center area or prefectures.

Although the present embodiment assumes the consistent hashing method, other methods may be assumed instead.
The present invention can also be embodied as a program for causing a computer to function as the server 4.
In addition, the specific configurations and processes can be changed as appropriate without departing from the gist of the present invention.

Description of symbols
1 Client machine
2 Network
3 Load balancer
4 Server
31 Storage unit
32 Processing unit
33 Input unit
34 Display unit
35 Communication unit
41 Storage unit
42 Processing unit
43 Communication unit
100 Cluster
311 First ID space management information
312 Second ID space management information
411 First ID space management information
412 Second ID space management information
1000 Distributed processing system

Claims

A server in a distributed processing system in which a plurality of data items to be managed, and a plurality of servers that manage the data and constitute a cluster, are allocated to a circular ID (identifier) space, each of the servers managing the data located between itself and the next server in a predetermined direction in the ID space and storing a copy of the data located between that next server and the server after it in the predetermined direction, the server processing requests from client machines distributed by a load balancer, wherein
each of the plurality of servers belongs to one of three or more predetermined physical regions,
the server comprising:
a storage unit that stores first ID space management information, in which a first ID space is divided into as many parts as there are regions and the servers of the same region are arranged in each of the parts, and second ID space management information, in which the servers of the same region are arranged so as not to be adjacent to each other in a second ID space; and
a processing unit that processes the request from the client machine distributed by the load balancer, refers to the first ID space management information to request the next server in the predetermined direction from itself in the first ID space to store a copy of the data used for processing the request, and refers to the second ID space management information to request the next server in the predetermined direction from itself in the second ID space to store a copy of the data used for processing the request.

The server according to claim 1, wherein the processing unit
makes the request synchronously with the processing of the request when it refers to the first ID space management information and requests the next server in the predetermined direction from itself in the first ID space to store a copy of the data used for processing the request, and makes the request asynchronously with the processing of the request when it refers to the second ID space management information and requests the next server in the predetermined direction from itself in the second ID space to store a copy of the data used for processing the request.

A program for causing a computer to function as the server according to claim 1 or claim 2.