JP5969315B2

JP5969315B2 - Data migration processing system and data migration processing method

Info

Publication number: JP5969315B2
Application number: JP2012184486A
Authority: JP
Inventors: 絵里子岩佐; 道生入江
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-08-23
Filing date: 2012-08-23
Publication date: 2016-08-17
Anticipated expiration: 2032-08-23
Also published as: JP2014041550A

Description

The present invention relates to a data migration processing system and a data migration processing method for a distributed processing system in which servers distributed over a network are clustered to store data. The system and method perform data migration processing (re-redundancy processing or relocation processing) when a server constituting the distributed processing system leaves the cluster because of a failure or the like, or when a server is added to the distributed processing system.

In recent years, with the rise of cloud computing, there has been growing demand for efficient processing and storage of large amounts of data. In response, distributed processing technology has developed, which achieves efficient processing by having multiple servers operate cooperatively.

In distributed processing, the data to be processed must be distributed among the servers that make up the cluster (hereinafter called "cluster members" or "nodes"). To increase the processing capacity of the cluster as a whole, the number of data items (the data volume) handled by each node should be evenly balanced.

A typical data management (distribution) technique assigns each data item to the node whose number equals the remainder obtained by dividing the hash of the item's key (hereinafter "hash(key)") by the number of nodes N, that is, "hash(key) mod N". This presupposes that the nodes have been numbered in advance from 0 to N-1. With this technique, adding a node changes the value of N, so the node responsible for storing the data changes for many data items, and those data items must be relocated.
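
The effect of a changing N can be checked with a small experiment. The following Python sketch is illustrative only: the SHA-256-based `stable_hash`, the key names, and the node counts (four nodes growing to five) are assumptions chosen for the demonstration and do not come from the patent.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic stand-in for hash(key); SHA-256 is an assumed choice."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

def node_for(key: str, n: int) -> int:
    """Number (0 .. n-1) of the node responsible for key under hash(key) mod N."""
    return stable_hash(key) % n

def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose responsible node changes when N goes old_n -> new_n."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)

keys = [f"key-{i}" for i in range(10_000)]   # hypothetical keys
fraction = moved_fraction(keys, 4, 5)        # add a fifth node to a four-node cluster
print(f"{fraction:.1%} of keys change node")
```

For a uniform hash, a key keeps its node only when hash(key) mod 4 equals hash(key) mod 5, which holds for 4 of every 20 consecutive hash values, so roughly 80% of the keys are expected to move.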

Consistent hashing (see Non-Patent Document 1) is a management (distribution) technique that limits the number of data items whose responsible node changes when a cluster member is added to about 1/N of the total. Consistent hashing is used in Amazon Dynamo (see Non-Patent Document 2), among other systems.

In data management (distribution) based on consistent hashing, an ID (identifier) is assigned to both nodes and data. The node responsible for a data item is the first node encountered when traversing the closed ID space clockwise from the data item's ID. One way to assign an ID to a node is to use the hash of its IP address (hash(IP address)).

In a clustered distributed processing system whose nodes all have equal processing performance, it is desirable to equalize the amount of data handled by each node, that is, to equalize the distance between nodes in the consistent-hashing ID space (hereinafter the "assigned region" of a node). To achieve this, a technique that gives each node multiple virtual IDs is used (see Non-Patent Document 1). Although the assigned region of each virtual ID differs, the law of large numbers evens out the total assigned region of each node.
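
The lookup rule and the virtual-ID technique can be sketched as follows. This is a minimal illustration, not the patent's implementation: the SHA-256-based `ring_hash`, the `"<node>#<i>"` naming of virtual IDs, the count of 100 virtual IDs per node, the IP addresses, and the convention that "clockwise" means increasing ID are all assumptions.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Map a string onto the ID space (assumed: first 64 bits of SHA-256)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes, virtual_ids=100):
        # Each node appears virtual_ids times on the ring; "<node>#<i>" is an assumed naming scheme.
        self._points = sorted(
            (ring_hash(f"{node}#{i}"), node) for node in nodes for i in range(virtual_ids)
        )
        self._ids = [point_id for point_id, _ in self._points]

    def owner(self, key: str) -> str:
        # First virtual ID at or after the data ID, wrapping around the closed ID space.
        index = bisect.bisect_left(self._ids, ring_hash(key)) % len(self._ids)
        return self._points[index][1]

nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]   # hypothetical IP addresses
before, after = Ring(nodes), Ring(nodes + ["10.0.0.5"])
keys = [f"key-{i}" for i in range(10_000)]                  # hypothetical keys
moved = [k for k in keys if before.owner(k) != after.owner(k)]
print(f"{len(moved) / len(keys):.1%} of keys change node")
```

Under these assumptions the moved share should be close to 1/5, and every moved key lands on the newly added node, because the existing virtual IDs keep their positions on the ring.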

When a clustered distributed processing system manages a large amount of data, redundancy is achieved by keeping replicas of the data so that processing can continue on other nodes even if one node fails. Distributed processing systems that use consistent hashing for data management also require data redundancy, and some adopt the replica placement scheme shown in FIG. 10.

As shown in FIG. 10, consistent hashing assigns IDs to both nodes (nodes "1" to "4") and data (data A to D, shown as black dots (●)), and the first node encountered when traversing the ID space clockwise from a data item's ID is made responsible for that item. The node immediately to the right of (next clockwise after) the responsible node is then made responsible for the replica.

For example, in FIG. 10, node "1", the first node encountered when traversing the ID space clockwise, is responsible for data A, and node "2", the right-hand neighbor of node "1" in the ID space, is responsible for its replica. Determining the nodes responsible for the original data and the replica in this way has the advantage that, even if a node leaves, processing can continue because the node holding the replica becomes the new node responsible for the data. When more than one replica is kept, each additional replica is assigned to the node to the right of the node responsible for the previous replica.
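
The placement rule just described can be sketched in a few lines of Python. The numeric node IDs and data IDs below are hypothetical positions on the ring, not values from FIG. 10, and "clockwise" is assumed to mean increasing ID.

```python
import bisect

def placement(node_ids, data_id, replicas=1):
    """Return [owner, replica, ...] as node IDs, walking clockwise from data_id."""
    ring = sorted(node_ids)
    start = bisect.bisect_left(ring, data_id) % len(ring)   # first node at or after data_id
    count = min(replicas + 1, len(ring))                     # never place two copies on one node
    return [ring[(start + k) % len(ring)] for k in range(count)]

ring_ids = [10, 40, 70, 90]                      # hypothetical node IDs
data_id = 5                                      # hypothetical data ID
print(placement(ring_ids, data_id))              # [10, 40]
print(placement([40, 70, 90], data_id))          # [40, 70]  (node 10 has left)
print(placement(ring_ids, 95, replicas=2))       # [10, 40, 70]  (wraps around)
```

After node 10 leaves, node 40, which already holds the replica, becomes the owner without any data movement; only the new replica on node 70 has to be created.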

Non-Patent Document 1: David Karger et al., "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web", [online], 1997, ACM, [retrieved August 8, 2012], Internet <URL: http://www.akamai.com/dl/technical_publications/ConsistenHashingandRandomTreesDistributedCachingprotocolsforrelievingHotSpotsontheworldwideweb.pdf>
Non-Patent Document 2: Giuseppe DeCandia et al., "Dynamo: Amazon's Highly Available Key-value Store", SOSP'07, October 14-17, 2007, Stevenson, Washington, USA, [online], [retrieved August 8, 2012], Internet <URL: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf>

Consider a distributed processing system that uses consistent hashing for data management and the replica placement scheme shown in FIG. 10, in which a node suffers a failure or the like and leaves the cluster. The data held by the departed node is lost, so the redundancy of some data decreases. In the example shown in FIG. 11, node "2" has left the cluster, so the redundancy of data A and data B, which node "2" held, decreases.

If further nodes leave while redundancy is reduced, data A or data B may be lost from the cluster entirely, so data redundancy should be restored promptly. On the other hand, each node actually holds a large amount of data, so re-replicating all of the data immediately after a node leaves raises the load and may affect the normal processing that the cluster is executing.

Next, consider the case in which a node is added to the cluster, for example because the cluster's performance is insufficient, in a distributed processing system that uses consistent hashing for data management and the replica placement scheme shown in FIG. 10. The added node takes over part of the data that existing nodes were responsible for, according to the ID space information of consistent hashing. Until this handover is complete, the amount of data held by each node remains uneven even though a new node has been added. However, because each node holds a large amount of data, relocating all of the data immediately after the node is added raises the load and may affect the normal processing that the cluster is executing.

The present invention has been made in view of this background. Its object is to provide a data migration processing system and a data migration processing method that can migrate data while limiting the processing load on the nodes when a node constituting the cluster leaves or is added.

To solve the problem described above, the invention according to claim 1 is a data migration processing system in which any of a plurality of nodes constituting a cluster is assigned as an owner node that stores, as original data, data for providing a service to a client, or as one of one or more replica nodes that store replicated data of that data. Each of the plurality of nodes constituting the data migration processing system comprises the following units.
A storage unit stores node identifier management information in which, for each of the plurality of nodes, each given a node identifier that is a unique identifier, the data is associated with the owner node and the replica node.
A node information assigning unit assigns, as node information, the node identifier of the owner node storing the original data and the node identifier of the replica node storing the replicated data to each of the original data and the replicated data.
A node identifier management unit detects the departure or addition of a node, changes the node identifier management information to a new association between the data and the owner node and replica node that reflects the departure or addition, and stores the changed information.
A data extraction unit extracts determination target data, that is, data to be examined to determine whether data migration is needed to change its owner node or replica node. When the departure of a node is detected, the unit extracts, based on the changed node identifier management information, the original data that the node itself stores and any replicated data that the node itself stores whose original data has been lost. When the addition of a node is detected, the unit extracts, based on the changed node identifier management information, the original data that the node itself stores.
A data migration processing unit operates at a predetermined timing based on parameters set so as to limit the processing load that data migration places on the node itself. For the extracted determination target data, it identifies the owner node and replica node corresponding to the changed node identifier management information; when the node identifiers of the identified owner node and replica node do not match the node information, it detects the extracted determination target data as data migration target data, that is, data requiring data migration; and it migrates the detected data migration target data to the identified owner node and replica node.
The parameters are at least one of the following: a data migration processing thread count, indicating the maximum number of threads that can execute the data migration in parallel; a data migration processing execution interval, indicating the waiting time after the data migration is executed; and a maximum simulation count, indicating the number of times the detection processing for the data migration target data is executed consecutively, where that detection processing includes the simulation, which is the processing that identifies the owner node and replica node corresponding to the changed node identifier management information.

The invention according to claim 4 is a data migration processing method for a data migration processing system in which any of a plurality of nodes constituting a cluster is assigned as an owner node that stores, as original data, data for providing a service to a client, or as one of one or more replica nodes that store replicated data of that data. Each of the plurality of nodes constituting the data migration processing system comprises a storage unit that stores node identifier management information in which, for each of the plurality of nodes, each given a node identifier that is a unique identifier, the data is associated with the owner node and the replica node. Each node executes the following steps.
A step of assigning, as node information, the node identifier of the owner node storing the original data and the node identifier of the replica node storing the replicated data to each of the original data and the replicated data.
A step of detecting the departure or addition of a node, changing the node identifier management information to a new association between the data and the owner node and replica node that reflects the departure or addition, and storing the changed information.
A step of extracting determination target data, that is, data to be examined to determine whether data migration is needed to change its owner node or replica node. When the departure of a node is detected, the node extracts, based on the changed node identifier management information, the original data that it stores and any replicated data that it stores whose original data has been lost. When the addition of a node is detected, the node extracts, based on the changed node identifier management information, the original data that it stores.
A step, executed at a predetermined timing based on parameters set so as to limit the processing load that data migration places on the node itself, of identifying, for the extracted determination target data, the owner node and replica node corresponding to the changed node identifier management information; detecting the extracted determination target data as data migration target data, that is, data requiring data migration, when the node identifiers of the identified owner node and replica node do not match the node information; and migrating the detected data migration target data to the identified owner node and replica node.
The parameters are at least one of the following: a data migration processing thread count, indicating the maximum number of threads that can execute the data migration in parallel; a data migration processing execution interval, indicating the waiting time after the data migration is executed; and a maximum simulation count, indicating the number of times the detection processing for the data migration target data is executed consecutively, where that detection processing includes the simulation, which is the processing that identifies the owner node and replica node corresponding to the changed node identifier management information.

In this way, data migration processing (re-redundancy processing or relocation processing) is not executed immediately after a node constituting the cluster leaves or is added. Instead, processing to detect the data subject to data migration processing (data migration target data) is executed at a predetermined timing, and data migration processing is executed on the data detected there. Data can therefore be migrated (re-replicated or relocated) while limiting the load on the nodes.
Furthermore, by setting at least one of the data migration processing thread count, the data migration processing execution interval, and the maximum simulation count as a parameter, data migration can be carried out gradually.

The invention according to claim 2 is the data migration processing system according to claim 1, wherein each of the plurality of nodes further comprises a node load monitoring unit that monitors the processing load of the node itself and outputs processing interruption information to the data migration processing unit when the processing load exceeds a predetermined value, and the data migration processing unit interrupts the data migration upon receiving the processing interruption information.
The invention according to claim 5 is the data migration processing method according to claim 4, wherein each of the plurality of nodes further executes a step of monitoring the processing load of the node itself and interrupting the data migration when the processing load exceeds a predetermined value.

In this way, each node monitors its own processing load and can interrupt data migration when the load exceeds a predetermined value. Thus, in addition to pacing data migration at the predetermined timing set by the parameters, a node can interrupt data migration whenever its processing load exceeds the predetermined value, so data can be migrated while reliably limiting the load on the node.
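
A load-triggered interruption of this kind might look like the following Python sketch. The function name, the `current_load` probe, the threshold of 0.80, and the load readings are all hypothetical; the patent does not specify how the load is measured or how the interruption is signalled internally.

```python
def migrate_with_load_guard(items, migrate, current_load, threshold):
    """Migrate items in order; stop as soon as the measured load exceeds threshold.

    Returns (migrated, remaining) so that the caller can resume later.
    """
    migrated = []
    for index, item in enumerate(items):
        if current_load() > threshold:           # stands in for the processing interruption information
            return migrated, list(items[index:])
        migrate(item)
        migrated.append(item)
    return migrated, []

samples = iter([0.20, 0.35, 0.95])               # hypothetical load readings
sent = []
done, remaining = migrate_with_load_guard(
    ["A", "B", "C", "D"], sent.append, lambda: next(samples), threshold=0.80
)
print(done, remaining)                           # ['A', 'B'] ['C', 'D']
```

Returning the unprocessed items lets migration resume from where it stopped once the load has dropped again.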

The invention according to claim 3 is the data migration processing system according to claim 1 or claim 2, wherein each of the plurality of nodes further comprises a message processing unit that, upon receiving from the client a message requesting provision of the service using the data, processes the message and, when the data targeted by the service is data migration target data, migrates that data to the owner node and replica node corresponding to the changed node identifier management information.

In this way, data for which a message has been received is predicted to be likely to be used again, so each node processes the message for that data and also executes data migration processing (re-redundancy processing or relocation processing) on it. Recovery from reduced redundancy and proper placement of data can thereby be achieved quickly.
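
The idea of migrating a data item when a request touches it can be sketched as below. The dictionary layout of `record`, the callbacks, the node identifiers, and the decision to refresh the stored node information after migration are assumptions made for illustration only.

```python
def handle_message(record, expected_nodes, process, migrate):
    """Serve the request, then migrate the record if its node information is stale.

    record: dict with "data_id" and "node_info" = (owner_id, buddy_id) as tagged at storage time
    expected_nodes: (owner_id, buddy_id) under the updated node identifier management information
    """
    result = process(record["data_id"])
    if record["node_info"] != expected_nodes:
        migrate(record["data_id"], expected_nodes)
        record["node_info"] = expected_nodes     # assumed: the tag is refreshed after migration
    return result

migrations = []
record = {"data_id": "X", "node_info": ("1", "4")}
reply = handle_message(
    record,
    expected_nodes=("1", "2"),                   # hypothetical: node "4" has left, so the buddy changes
    process=lambda data_id: f"served {data_id}",
    migrate=lambda data_id, nodes: migrations.append((data_id, nodes)),
)
print(reply, migrations)
```

A second request for the same record would find matching node information and trigger no further migration.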

According to the present invention, it is possible to provide a data migration processing system and a data migration processing method that migrate data while limiting the processing load on the nodes when a node constituting the cluster leaves or is added.

FIG. 1 is a diagram showing the overall configuration of a distributed processing system that includes the data migration processing system according to the present embodiment.
FIG. 2 is a diagram for explaining the outline of the data re-redundancy processing performed by each node constituting the data migration processing system according to the present embodiment.
FIG. 3 is a functional block diagram showing a configuration example of a node according to the present embodiment.
FIG. 4 is a diagram showing an example of the data structure of the node identifier management table according to the present embodiment.
FIG. 5 is a flowchart showing the overall flow of the data migration processing executed by each server in the data migration processing system according to the present embodiment.
FIG. 6 is a flowchart showing the flow of the detection processing for data migration target data performed by the data migration processing unit of a node according to the present embodiment.
FIG. 7 is a diagram for explaining the predetermined timing of data migration processing set by the data migration parameter management unit of a node according to the present embodiment.
FIG. 8 is a functional block diagram showing a configuration example of a node according to Modification 1 of the present embodiment.
FIG. 9 is a functional block diagram showing a configuration example of a node according to Modification 2 of the present embodiment.
FIG. 10 is a diagram for explaining data management based on consistent hashing.
FIG. 11 is a diagram showing an example in which a node has left in data management based on consistent hashing.

Next, a data migration processing system and related matters in a mode for carrying out the present invention (hereinafter "the present embodiment") will be described.

<Overall configuration of the distributed processing system>
First, the overall configuration of the distributed processing system 1000, which includes the data migration processing system 100 according to the present embodiment, will be described.
FIG. 1 is a diagram showing the overall configuration of the distributed processing system 1000 including the data migration processing system 100 according to the present embodiment.

The distributed processing system 1000 comprises a load balancer 3 that accepts messages from the clients 2, distribution devices 4, and a plurality of nodes 1 constituting a cluster. The load balancer 3 distributes messages from the clients 2 to the distribution devices 4 by, for example, simple round robin. Each distribution device 4 distributes the messages it receives to the nodes 1 based on, for example, consistent hashing. Each node 1 processes the messages and provides a service to the clients 2. In the present embodiment, the plurality of nodes 1 constituting the cluster are described as the data migration processing system 100.

Although FIG. 1 shows the distribution devices 4 and the nodes 1 as separate devices, they can also run as separate functions on the same server. The distribution devices 4 can also be clustered, as shown in FIG. 1. Furthermore, the load balancer 3 may be omitted, in which case a client 2 sends its message to any one of the distribution devices 4.

In the present embodiment, consistent hashing, which limits the impact of a node 1 leaving or being added, is described as an example of the data management technique of the distributed processing system 1000. The technique is not, however, limited to consistent hashing. Replicas are placed according to the scheme shown in FIG. 10: the node 1 immediately to the right (next clockwise) in the consistent-hashing ID space is made responsible for the replicated data.
Because the data migration processing system 100 according to the present embodiment executes data re-redundancy processing and relocation processing by the same mechanism both when a node 1 leaves and when one is added, the description below focuses mainly on an example in which data re-redundancy processing is executed after a node leaves.

<Processing overview>
Next, an overview of the processing of the data migration processing system 100 according to the present embodiment will be given.
FIG. 2 is a diagram for explaining the outline of the data re-redundancy processing (number of replicas: 1) performed by each node 1 constituting the data migration processing system 100 according to the present embodiment. FIG. 2(a) shows the initial state before re-redundancy processing, FIG. 2(b) shows the state after a node 1 (here, node "4") has been removed (has left), and FIG. 2(c) shows the state after data re-redundancy processing has been executed.

First, as shown in FIG. 2(a), in the initial state the node responsible for data X (more precisely, for the ID (data identifier) of data X) is node "1", the first node encountered when traversing the ID space clockwise. That is, the original data of data X is stored on node "1". A node 1 that stores and manages original data is hereinafter sometimes called an "owner node". Node "4", the right-hand neighbor of owner node "1", stores the replicated data of data X. A node 1 that stores and manages replicated data (a replica node) is hereinafter sometimes called a "buddy".
At this point, each node 1 assigns the identifier of the owner node and the identifier of the buddy to each data item it stores (step S1). The identifier of the owner node and the identifier of the buddy (replica node) are together called node information.

Next, as shown in FIG. 2(b), suppose that one of the nodes 1 constituting the cluster, here node "4", leaves because of a failure or the like. When each node 1 detects that a node 1 (node "4") has left the cluster, it extracts the data to be examined to determine whether re-redundancy processing is needed after the removal (hereinafter "determination target data") (step S2). The determination target data consists of (1) the data that the node itself manages as originals (original data) and (2) those items of the data that the node itself manages as replicas (replicated data) whose owner node, that is, the node managing the corresponding original data, is the departed node 1; in other words, replicated data whose original data has been lost.
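
Step S2 amounts to a filter over the locally stored records, using the node information attached in step S1. The sketch below is an assumed representation: the `Record` dataclass, its field names, and the sample records are hypothetical and are not taken from FIG. 2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    data_id: str
    owner: str   # node identifier of the owner node (part of the node information)
    buddy: str   # node identifier of the replica node, the "buddy"

def extract_targets(self_id, records, departed):
    """Select determination target data on node self_id after node `departed` has left."""
    return [
        r for r in records
        if r.owner == self_id                            # (1) originals held by this node
        or (r.buddy == self_id and r.owner == departed)  # (2) replicas whose original was lost
    ]

# Hypothetical contents of node "2" when node "4" leaves.
local = [
    Record("Y", owner="2", buddy="3"),   # original held here
    Record("Z", owner="4", buddy="2"),   # replica whose owner has departed
    Record("W", owner="1", buddy="2"),   # replica whose owner is still alive
]
targets = extract_targets("2", local, departed="4")
print([r.data_id for r in targets])      # ['Y', 'Z']
```

Record W is skipped because its owner node "1" still holds the original and will examine it itself under rule (1).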

Subsequently, at a predetermined timing, each node 1 simulates, for the extracted determination target data, the nodes that should be responsible for it after the removal (the owner node and the buddy), and compares the simulation result with the node information of the determination target data. Each node 1 then detects, as data migration target data, any determination target data whose owner node identifier or buddy identifier does not match, and executes data migration processing (here, re-redundancy processing) (step S3).
The predetermined timing at which each node 1 performs the simulation is determined by setting a plurality of parameters. The parameters are, for example, the number of data migration processing threads (the maximum number of threads that can execute data migration processing, that is, re-redundancy processing or rearrangement processing, in parallel), the data migration processing execution interval (the wait time immediately after data migration processing is executed), and the maximum number of simulations (the maximum number of times each thread executes the simulation consecutively). Details of these parameters are described later.

As described above, the data migration processing system 100 and the like according to the present embodiment do not execute data migration processing (re-redundancy processing or rearrangement processing) immediately after a node 1 leaves or is added to the cluster. Instead, processing to detect the data subject to data migration processing (data migration target data) is executed at a predetermined timing, and data migration processing is executed on the data so detected. Data can therefore be migrated (made redundant again or rearranged) while the load on the nodes 1 is kept low.

<Node configuration>
A configuration example of the node 1 constituting the data migration processing system 100 according to the present embodiment is described in detail below.

FIG. 3 is a functional block diagram showing a configuration example of the node 1 according to the present embodiment.
As shown in FIG. 1, the node 1 is communicably connected to the distribution device 4 and to the other nodes 1 that constitute the cluster. It receives messages from the client 2 and provides a service. When a node 1 leaves or is added to the cluster, the node 1 also executes re-redundancy processing or rearrangement processing on the data that needs to be migrated.
As shown in FIG. 3, the node 1 includes a control unit 10, an input/output unit 11, a memory unit 12, and a storage unit 13.

The input/output unit 11 (input unit) exchanges information with the distribution device 4 and with the other nodes 1. The input/output unit 11 consists of a communication interface that transmits and receives information over a communication line, and an input/output interface that handles input and output with input means such as a keyboard and output means such as a monitor (neither shown).

The control unit 10 governs the overall control of the node 1 and includes a node identifier management unit 101, a message processing unit 102, an alive monitoring unit 103, a node information adding unit 104, a data extraction unit 105, a data migration processing unit 106, and a data migration parameter management unit 107. The control unit 10 is realized, for example, by a CPU (Central Processing Unit) loading a program stored in the storage unit 13 into a RAM (Random Access Memory) serving as the memory unit 12 and executing it.

The node identifier management unit 101 manages identification information on each node 1 constituting the cluster as a node identifier management table 400 (node identifier management information).

FIG. 4 is a diagram showing an example data structure of the node identifier management table 400 (node identifier management information) according to the present embodiment. As shown in FIG. 4, the node identifier management table 400 contains a node identifier 401 and an address 402 for each node 1 constituting the cluster.
The node identifier 401 corresponds to the node ID in the ID space of the consistent hash method. When virtual IDs are used in the consistent hash method, a node identifier 401 is assigned to each virtual ID and registered in the node identifier management table 400. In this table, for example, arranging the node identifiers 401 in ascending order makes it possible to manage the IDs (or virtual IDs) of the consistent hash ID space in ascending order. That is, the node 1 that comes next when the node identifiers are arranged in ascending order is the right-hand neighbor (next in the clockwise direction) in the ID space.
For example, FIG. 4 indicates that data whose data identifier in the consistent hash ID space is "0" to "56" is handled, as owner node, by the node in the first row (node identifier "56", address "192.168.0.24"), and, as buddy, by the node in the second row (node identifier "172", address "192.168.1.25"). Likewise, data whose data identifier is "57" (that is, "56" plus 1) to "172" is handled by the node in the second row as owner node and by the node in the third row as buddy.
In this way, data is associated with its owner node and buddy on the basis of the node identifier management table 400.
The node identifier 401 may be assigned to each node 1 by the node identifier management unit 101. Alternatively, a node identifier management table 400 generated by another node 1 or an external device (for example, the distribution device 4) may be received and stored.
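The lookup described for FIG. 4 can be sketched as a binary search over the sorted node identifiers. This is a minimal illustration, not the patented implementation: the function name `owner_and_buddy` and the third table row are hypothetical (the text only gives the first two rows), and a redundancy of two (one buddy) is assumed.

```python
from bisect import bisect_left

# Node identifier management table 400: (node identifier 401, address 402),
# kept in ascending order of node identifier.
NODE_TABLE = [
    (56, "192.168.0.24"),
    (172, "192.168.1.25"),
    (300, "192.168.2.26"),  # hypothetical third row, not taken from FIG. 4
]


def owner_and_buddy(data_id, table=NODE_TABLE):
    """Return (owner node id, buddy node id) for a data identifier."""
    ids = [node_id for node_id, _address in table]
    # Index of the first node id >= data_id; wrap to the first row when
    # data_id is larger than every node id (the ID space is a ring).
    i = bisect_left(ids, data_id) % len(ids)
    # The buddy is the next row in ascending order, again wrapping around.
    return ids[i], ids[(i + 1) % len(ids)]
```

With this table, data identifiers 0 to 56 map to owner 56 and buddy 172, and 57 to 172 map to owner 172 and the third-row node as buddy, matching the example in the text.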

When a node 1 leaves the cluster, the node identifier management unit 101 deletes the record containing the node identifier 401 and address 402 of that node 1. When a node 1 is added to the cluster, the node identifier management unit 101 registers a new record containing the node identifier 401 and address 402 of that node 1.
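A minimal sketch of these two table updates follows, assuming the table is a list of (node identifier, address) tuples kept in ascending order. The function names `remove_node` and `add_node` and the sample identifiers are hypothetical.

```python
from bisect import insort


def remove_node(table, node_id):
    """Return a copy of the table without the record for node_id."""
    return [row for row in table if row[0] != node_id]


def add_node(table, node_id, address):
    """Return a copy of the table with a new record, still sorted by node id."""
    new_table = list(table)
    # Tuples compare by node id first, so insort keeps ascending id order.
    insort(new_table, (node_id, address))
    return new_table
```

Returning a new list rather than mutating in place is a design choice of this sketch; it lets a caller keep the pre-change table for comparison.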

Returning to FIG. 3, the message processing unit 102 provides the service by receiving a message distributed from the distribution device 4, processing the message, and returning the processing result to the client 2. The processing that the message processing unit 102 executes in response to a message is, for example, registration, update, search, or deletion of data. When it receives a message such as a data registration or update, the message processing unit 102 makes the data redundant by replicating it to another node (here, the next node when node identifiers are arranged in ascending order, that is, the right-hand neighbor in the consistent hash ID space). If the node 1 itself does not hold the data needed to process a message, the message processing unit 102 can obtain the data, for example by requesting it from another node 1.
When a node 1 has left or been added to the cluster and the message processing unit 102 receives a message concerning data subject to migration before re-redundancy or rearrangement of that data has been executed, it processes the message and also executes data migration processing (re-redundancy processing or rearrangement processing) for that data.

The alive monitoring unit 103 monitors the departure and addition of nodes 1 in the cluster by exchanging alive monitoring signals with the other nodes 1 at predetermined time intervals. When the alive monitoring unit 103 detects the departure or addition of a node 1, it notifies the node identifier management unit 101 of its own node 1 or of another node 1, or an external device that sets node identifiers, so that the change is reflected in the node identifier management table 400. In this way, the nodes 1 constituting the cluster always hold node identifier management tables 400 with identical contents.
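One common way to turn periodic alive signals into a departure decision is a timeout on the last signal received from each node. The text does not specify the detection rule, so the timeout criterion, the function name `departed_nodes`, and the sample values below are all assumptions of this sketch.

```python
def departed_nodes(last_seen, now, timeout):
    """Return node ids whose last alive signal is older than `timeout` seconds.

    last_seen maps node id -> time (in seconds) the last signal was received.
    """
    return sorted(node for node, seen in last_seen.items() if now - seen > timeout)
```

For example, with `last_seen = {56: 100.0, 172: 91.0}`, `now = 101.0` and `timeout = 5.0`, only node 172 is reported, and its record would then be removed from the node identifier management table.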

The node information adding unit 104 attaches node information to each piece of data in association with its data identifier. As described above, node information consists of the node identifier of the node holding the original data (owner node) and the node identifier of the node holding the replicated data (buddy, or replica node). When the data migration processing unit 106 executes re-redundancy processing or rearrangement processing on data because a node 1 has left or been added, the node information adding unit 104 attaches new node information to that data.
Thus the node information adding unit 104 attaches node information to the target data both when data is newly registered and when re-redundancy processing or rearrangement processing is executed because of the departure or addition of a node 1.

Triggered by a change to the node identifier management table 400 (see FIG. 4) made by the node identifier management unit 101, the data extraction unit 105 extracts the following as the data to be judged for whether data migration processing (re-redundancy processing or rearrangement processing) is required (determination target data): (1) data that the node manages as an original (original data), and (2) among the data that the node manages as a replica (replicated data), data whose owner node (the node 1 managing the corresponding original data) is the node 1 that left, that is, replicated data whose original data has been lost. It stores the data identifiers of the extracted data in the extracted data management table 200.
As stated above, this processing is triggered by a change to the node identifier management table 400 (see FIG. 4). In another embodiment, however, some of the nodes 1 or an external device (for example, a system management device) may send each of the other nodes 1 a message requesting the start of re-redundancy processing or rearrangement processing, and each of those nodes 1 may treat receipt of that message as the trigger.
When a node 1 is added, no replicated data of type (2), whose original data has been lost, exists, so only the data that the node 1 manages as an original (original data) is extracted.
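The extraction rule can be sketched as a filter over the locally stored records. The `Record` type, the function name `extract_targets`, and the convention of passing `None` when a node was added rather than removed are assumptions made for illustration; a single buddy per record is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    data_id: int
    owner: int  # node identifier of the owner node
    buddy: int  # node identifier of the buddy


def extract_targets(records, self_id, departed_id: Optional[int]):
    """Return data ids to place in the extracted data management table."""
    targets = []
    for r in records:
        # (1) data this node manages as the original
        is_original = r.owner == self_id
        # (2) replicated data held here whose owner is the node that left
        lost_original = (
            departed_id is not None
            and r.buddy == self_id
            and r.owner == departed_id
        )
        if is_original or lost_original:
            targets.append(r.data_id)
    return targets
```

When a node is added, `departed_id` is `None`, so only case (1) contributes, as the text states.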

At a predetermined timing based on the parameters set by the data migration parameter management unit 107, the data migration processing unit 106 takes the determination target data extracted by the data extraction unit 105 and stored in the extracted data management table 200 and, on the basis of the changed node identifier management table 400, identifies the owner node and buddy that result from following a predetermined data management method such as the consistent hash method (this processing is hereinafter referred to as "simulation"). It compares the result with the node information attached to each piece of determination target data. As a result of this comparison, the data migration processing unit 106 detects data for which the owner node identifier and buddy identifier do not both match (are not an exact match) as data migration target data, and executes data migration processing (re-redundancy processing or rearrangement processing).
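The simulation and comparison can be sketched as follows, assuming consistent hashing over a sorted list of the node identifiers that remain after the change, with one buddy per data item. The names `simulate` and `is_migration_target` and the sample identifiers are hypothetical.

```python
from bisect import bisect_left


def simulate(data_id, node_ids):
    """Owner and buddy under consistent hashing; node_ids is sorted ascending."""
    i = bisect_left(node_ids, data_id) % len(node_ids)
    return node_ids[i], node_ids[(i + 1) % len(node_ids)]


def is_migration_target(data_id, node_info, node_ids):
    """True unless both the owner and the buddy in node_info match the simulation."""
    return simulate(data_id, node_ids) != tuple(node_info)
```

For instance, if node 300 has left and the remaining identifiers are `[56, 172]`, data 100 tagged with `(172, 300)` is detected because its buddy no longer matches, while data 10 tagged with `(56, 172)` is not.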

When the redundancy number is 3 or more (two or more replicas), the plural nodes 1 that manage the replicated data (called buddy "1", buddy "2", and so on, starting from the node closest in the ID space to the owner node managing the original data) might each execute data migration processing on the same data. Therefore, when there are multiple buddies, the buddy with the lowest number is responsible for the data migration processing. Details of the data migration target data detection processing are described later with reference to FIG. 6.
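The tie-breaking rule for multiple buddies might be sketched as below. The text only says that the lowest-numbered buddy takes charge; treating a surviving owner as responsible for its own originals, and skipping buddies that have themselves left, are assumptions of this sketch, as is the function name.

```python
def responsible_for_migration(self_id, owner, buddies, alive):
    """Decide whether this node should migrate a given data item.

    buddies is ordered as buddy 1, buddy 2, ...; alive is the set of node
    ids currently in the cluster.
    """
    if owner in alive:
        # The original still exists, so its owner handles the data.
        return self_id == owner
    for buddy in buddies:
        if buddy in alive:
            # The lowest-numbered surviving buddy takes charge.
            return self_id == buddy
    return False
```

This keeps two replica holders from migrating the same data item at the same time.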

The data migration parameter management unit 107 uses one or more parameters to set the predetermined timing at which the data migration processing unit 106 executes data migration processing. Because the data migration parameter management unit 107 adjusts the execution timing, data migration processing can be executed while the load is regulated, so that the load of data migration does not concentrate at one time, obstruct normal processing, and degrade service quality.

The parameters set in the data migration parameter management unit 107 are, for example, the number of data migration processing threads, the data migration processing execution interval, and the maximum number of simulations.
The number of data migration processing threads is the maximum number of threads that can execute re-redundancy processing or rearrangement processing in parallel. A large value shortens the time required for data migration processing but increases the processing load on the node 1. A small value lengthens the time required but reduces the processing load on the node 1.
The data migration processing execution interval is the time that each thread waits after the data migration processing unit 106 has run the data migration target data detection processing (FIG. 6), detected data migration target data, and executed data migration processing (re-redundancy processing or rearrangement processing) on it.
The maximum number of simulations is the number of times each thread consecutively executes the data migration target data detection processing, which includes the simulation described above (identification of the owner node and buddy based on the changed node identifier management table 400 (see FIG. 4)). After executing the detection processing consecutively, the data migration processing unit 106 waits for a predetermined time and then starts the detection processing, including the simulation, again.
Setting the maximum number of simulations low, or setting the data migration processing execution interval long, allows the node 1 to carry out data migration processing gradually while keeping its processing load down.
The data migration parameter management unit 107 may set all three of these parameters (number of data migration processing threads, data migration processing execution interval, and maximum number of simulations), only one of them, or any combination of them.
Details of data migration processing at the predetermined timing adjusted by the parameters set by the data migration parameter management unit 107 are described later with reference to FIG. 7.
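How the three parameters might interact can be sketched with a small thread pool. This is an illustration under stated assumptions rather than the behavior of FIG. 7: the names `MigrationParams` and `run_throttled`, the separate `batch_pause` value for the "predetermined time", and the choice to reset the consecutive counter after a migration are all hypothetical.

```python
import queue
import threading
import time
from dataclasses import dataclass


@dataclass
class MigrationParams:
    threads: int = 2            # number of data migration processing threads
    interval: float = 0.5       # wait after each migration, in seconds
    max_simulations: int = 10   # consecutive detection runs per thread
    batch_pause: float = 0.5    # hypothetical: wait after a full run of detections


def run_throttled(data_ids, is_target, migrate, params, sleep=time.sleep):
    """Check every data id, migrating detected targets at a limited pace."""
    pending = queue.Queue()
    for data_id in data_ids:
        pending.put(data_id)

    def worker():
        consecutive = 0
        while True:
            try:
                data_id = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to check
            consecutive += 1
            if is_target(data_id):
                migrate(data_id)
                sleep(params.interval)  # execution interval after a migration
                consecutive = 0  # assumption: the pause restarts the count
            elif consecutive >= params.max_simulations:
                sleep(params.batch_pause)  # pause after the maximum run
                consecutive = 0

    workers = [threading.Thread(target=worker) for _ in range(params.threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Raising `threads` finishes sooner at the cost of more load; lowering `max_simulations` or raising `interval` spreads the work out, which is the trade-off the text describes.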

The memory unit 12 consists of a primary storage device such as a RAM and temporarily stores information needed for data processing by the control unit 10. The memory unit 12 stores the extracted data management table 200 described above, which holds the data identifiers of the determination target data extracted by the data extraction unit 105.

The storage unit 13 consists of a storage device such as a hard disk or flash memory and stores data 300, which includes the original data and replicated data that the service operates on, the node identifier management table 400 described above (see FIG. 4), and the like. The storage unit 13 also stores the parameter values (not shown) set by the data migration parameter management unit 107.
Each piece of data in the data 300 includes its data identifier, the node identifier of its owner node, and the node identifier of its buddy.

<Processing flow of the data migration processing system>
Next, the flow of processing when data migration processing is executed in the data migration processing system 100 according to the present embodiment is described.

(Overall flow of data migration processing)
FIG. 5 is a flowchart showing the overall flow of the data migration processing executed by each node 1 in the data migration processing system 100 according to the present embodiment.

First, the alive monitoring unit 103 of the node 1 determines whether a node 1 has left or been added to the cluster by exchanging alive monitoring signals with the other nodes 1 at predetermined time intervals (step S10). If it determines that a departure or addition has occurred (step S10 → Yes), it outputs information on the departure or addition to the node identifier management unit 101. If it determines that no departure or addition has occurred (step S10 → No), it repeats the determination of step S10.
Instead of exchanging alive monitoring signals with the other nodes 1, the alive monitoring unit 103 may obtain information on the departure or addition of a node 1 by receiving notification of the event from an external device (for example, the distribution device 4) or another node 1 that detected it.

Next, the node identifier management unit 101 changes the node identifier management table 400 (see FIG. 4) on the basis of the information on the departure or addition of the node 1 received from the alive monitoring unit 103 (step S11).
Specifically, when a node 1 leaves the cluster, the node identifier management unit 101 deletes the record containing the node identifier 401 and address 402 of that node 1. When a new node 1 is added to the cluster, it registers a new record containing the node identifier 401 and address 402 of that node 1.

Subsequently, triggered by the node identifier management unit 101 changing the node identifier management table 400 (see FIG. 4), the data extraction unit 105 extracts the determination target data and stores its data identifiers in the extracted data management table 200 (step S12).
Specifically, the data extraction unit 105 extracts (1) data that the node manages as an original (original data) and (2), among the data that the node manages as a replica (replicated data), replicated data whose original data has been lost, as the data to be judged for whether data migration processing (re-redundancy processing or rearrangement processing) is required (determination target data), and stores the data identifiers of the extracted data in the extracted data management table 200.

The data migration processing unit 106 then determines whether the predetermined timing set by the data migration parameter management unit 107 has arrived (step S13). If it has (step S13 → Yes), the unit proceeds to step S14; if it has not (step S13 → No), the unit waits until the predetermined timing arrives.
The predetermined timing set by the data migration parameter management unit 107 is described with reference to FIG. 7.

In step S14, the data migration processing unit 106 refers to the extracted data management table 200 and takes out one piece of determination target data (specifically, the data identifier of that determination target data) (step S14).

Next, the data migration processing unit 106 executes processing to detect whether the determination target data taken out in step S14 is data subject to data migration processing (re-redundancy processing or rearrangement processing), that is, data migration target data (data migration target data detection processing) (step S15). Details of this detection processing are described with reference to FIG. 6.

Subsequently, the data migration processing unit 106 determines whether step S15 detected data migration target data (step S16). If data migration target data was detected (step S16 → Yes), it proceeds to step S17; if none was detected (step S16 → No), it proceeds to step S18.

In step S17, the data migration processing unit 106 executes data migration processing (re-redundancy processing or rearrangement processing). This data migration processing is executed after the node information adding unit 104 has changed the node information attached to the data being migrated to the owner node identifier and buddy identifier obtained by the simulation (identification of the owner node and buddy based on the changed node identifier management table 400 (see FIG. 4)) in the data migration target data detection processing of step S15.

Next, in step S18, the data migration processing unit 106 determines whether all the determination target data (data identifiers) stored in the extracted data management table 200 has been processed. If the extracted data management table 200 still holds unprocessed determination target data (step S18 → No), it returns to step S13 and continues processing. If all the determination target data stored in the extracted data management table 200 has been processed (step S18 → Yes), it ends the data migration processing.
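Steps S13 to S18 can be read as a single loop, sketched below in single-threaded form. The function and parameter names are hypothetical; `simulate`, `migrate`, and `wait_for_timing` stand in for the detection logic, the actual data transfer, and the parameter-driven timing, none of which are specified here.

```python
def process_targets(pending_ids, store, simulate, migrate, wait_for_timing=lambda: None):
    """Run steps S13 to S18 over the extracted data identifiers.

    store maps data id -> (owner, buddy) node information attached to the data.
    Returns the list of data ids that were migrated.
    """
    pending = list(pending_ids)
    migrated = []
    while pending:                      # S18: repeat until nothing is left
        wait_for_timing()               # S13: wait for the predetermined timing
        data_id = pending.pop(0)        # S14: take one determination target
        expected = simulate(data_id)    # S15: owner and buddy after the change
        if store[data_id] != expected:  # S16: mismatch means migration target
            store[data_id] = expected   # node information is updated first
            migrate(data_id, expected)  # S17: then the data is migrated
            migrated.append(data_id)
    return migrated
```

Updating the node information before calling `migrate` mirrors the ordering stated for step S17.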

Even while the data migration processing unit 106 of the node 1 is gradually executing the data migration processing of steps S13 to S18 in FIG. 5, the node 1 continues to receive messages from the client 2, and the message processing unit 102 processes them. In addition to processing a received message, the message processing unit 102 refers to the node information attached to the data targeted by the message and determines whether that data is data migration target data. Specifically, if the owner node recorded in the node information differs from the own node, or the buddy recorded there differs from the right-adjacent node in the ID space, the data is determined to be data migration target data. In that case, the message processing unit 102 executes the data migration processing (re-redundancy processing or rearrangement processing) for that data. If the own node does not hold the data targeted by the received message (which can occur when a node has been added as its right neighbor in the ID space), the message processing unit 102 processes the message together with rearrangement processing that includes acquiring the data from another node. The node from which the data is acquired is identified by referring to the node identifier management table 400.
In this way, for data that is predicted to be likely to be used again because a message for it has just been received, the node 1 executes the message processing and also performs the data migration processing (re-redundancy processing or rearrangement processing), so that reduced redundancy is restored and proper data placement is achieved quickly.
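The check performed by the message processing unit 102 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a single buddy per data item, integer node identifiers, and that the right-adjacent node is the next larger identifier on a ring. The names `right_neighbor` and `is_stale` are hypothetical.

```python
def right_neighbor(self_id: int, node_ids: list[int]) -> int:
    """Return the node immediately to the right of self_id in the ID space.

    node_ids stands in for the node identifier management table 400
    (assumption: identifiers are integers ordered on a ring).
    """
    ring = sorted(node_ids)
    return ring[(ring.index(self_id) + 1) % len(ring)]


def is_stale(node_info: tuple[int, int], self_id: int, node_ids: list[int]) -> bool:
    """True when the accessed data should be migrated on this message.

    node_info is the (owner, buddy) pair attached to the data. The data is
    a migration target if the recorded owner is not this node or the
    recorded buddy is not this node's current right neighbor.
    """
    owner, buddy = node_info
    return owner != self_id or buddy != right_neighbor(self_id, node_ids)
```

For example, data tagged `(10, 20)` on node 10 is up to date while the cluster is `[10, 20, 30]`, but becomes a migration target once a node 15 joins between 10 and 20.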

(Data migration target data detection process)
Next, the data migration target data detection process executed by the data migration processing unit 106 of the node 1 in step S15 of FIG. 5 is described in detail.
FIG. 6 is a flowchart showing the flow of the data migration target data detection process performed by the data migration processing unit 106 of the node 1 according to the present embodiment.

First, the data migration processing unit 106 of the node 1 executes a simulation (labeled “A process” in FIG. 6) for the determination target data extracted from the extracted data management table 200 (step S20). Specifically, based on the changed node identifier management table 400, the data migration processing unit 106 identifies the owner node and the buddy that the extracted determination target data would have under a predetermined data management method such as consistent hashing.
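As an illustration of the simulation (A process), the following sketch identifies an owner and buddies by consistent hashing. Several details are assumptions made for the example and are not specified in this passage: the MD5-based hash, the size of the ID space, and the rule that the owner is the first node at or after the data's position, with the buddies being the nodes that follow it on the ring. The names `ring_position` and `simulate` are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 16  # hypothetical size of the ID space


def ring_position(data_id: str) -> int:
    """Map a data identifier onto the ID space (the hash choice is an assumption)."""
    digest = hashlib.md5(data_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def simulate(data_id: str, node_ids: list[int], redundancy: int = 2):
    """A process: return (owner, buddies) under the given node table.

    node_ids stands in for the changed node identifier management table 400.
    redundancy is the total number of copies (original plus duplicates).
    """
    ring = sorted(node_ids)
    # First node at or after the data's position, wrapping around the ring.
    start = bisect.bisect_left(ring, ring_position(data_id)) % len(ring)
    count = min(redundancy, len(ring))
    chain = [ring[(start + i) % len(ring)] for i in range(count)]
    return chain[0], chain[1:]
```

Under these assumptions, removing the current owner from `node_ids` and re-running `simulate` makes the former first buddy the new owner, which is the situation the detection process then has to recognize.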

Next, the data migration processing unit 106 extracts the node information (the owner node identifier and the buddy identifier) attached to the determination target data extracted from the extracted data management table 200 (labeled “B process” in FIG. 6) (step S21).

Subsequently, the data migration processing unit 106 determines whether the own node is the owner node according to the result of the simulation (A process) (step S22). If the own node is the owner node (step S22 → Yes), the process proceeds to step S23; if it is not (step S22 → No), the process proceeds to step S25.

In step S23, the data migration processing unit 106 determines whether the own node is the owner node according to the result of the node information extraction (B process). If the own node is not the owner node (step S23 → No), the data (determination target data) is determined to be data subject to the data migration processing (data migration target data). That is, because the owner node differs between the A process and the B process for the same data, the data requires migration (labeled “data migration target” in FIG. 6). If the own node is the owner node according to the B process (step S23 → Yes), the process proceeds to step S24.

In step S24, the data migration processing unit 106 determines whether the buddy obtained by the simulation (A process) matches the buddy obtained by the node information extraction (B process). If the buddies match (step S24 → Yes), the data (determination target data) does not require migration (labeled “not subject to data migration” in FIG. 6). If the buddies do not match (step S24 → No), the data is determined to be data migration target data. That is, even though the owner node matches, the buddy does not, so the data must be migrated.

When the simulation (A process) in step S22 shows that the own node is not the owner node (step S22 → No), the data migration processing unit 106 determines whether the own node is the owner node according to the result of the node information extraction (B process) (step S25). If the own node is the owner node (step S25 → Yes), the data (determination target data) is determined to be data migration target data, because the owner node differs between the A process and the B process. If the own node is not the owner node (step S25 → No), the process proceeds to step S26.

In step S26, the data migration processing unit 106 determines whether the owner node obtained by the node information extraction (B process) exists in the changed node identifier management table 400 (see FIG. 4). If it exists (step S26 → Yes), the data (determination target data) is not subject to data migration. If it does not exist (step S26 → No), the process proceeds to step S27.

In step S27, the data migration processing unit 106 determines whether the node information extraction (B process) shows a buddy younger than itself (a buddy with a smaller number when the node identifiers are arranged in ascending order) and whether that younger buddy also exists in the changed node identifier management table 400 (see FIG. 4). In other words, when there are multiple buddies, it determines whether it is itself the buddy with the smallest number. If a younger buddy exists (step S27 → Yes), the data (determination target data) is excluded from data migration in order to avoid executing the data migration processing redundantly for the same data. If no younger buddy exists (step S27 → No), the data is determined to be data migration target data.

In this way, the data migration processing unit 106 can determine, for each item of determination target data extracted into the extracted data management table 200, whether the data migration processing should be executed, and thereby detect the data migration target data. Moreover, even when there are multiple buddies, the data migration processing is triggered by the owner node and the lowest-numbered buddy adjacent on either side of the node 1 that left or was added, so redundant execution of the data migration processing for the same data is avoided.
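The decision flow of steps S22 to S27 can be condensed into a single predicate. The sketch below is illustrative only: the function name and arguments are hypothetical, node identifiers are assumed to be integers, and “younger” is taken to mean a plain numeric comparison of identifiers. The results of the A process and B process are passed in as arguments.

```python
def is_migration_target(
    self_id: int,
    sim_owner: int,            # owner according to the simulation (A process)
    sim_buddies: list[int],    # buddies according to the simulation (A process)
    info_owner: int,           # owner recorded in the node information (B process)
    info_buddies: list[int],   # buddies recorded in the node information (B process)
    current_nodes: set[int],   # nodes in the changed node identifier management table 400
) -> bool:
    # S22: does the simulation make this node the owner?
    if sim_owner == self_id:
        # S23: the recorded owner differs, so the data must be migrated.
        if info_owner != self_id:
            return True
        # S24: owners match; migrate only if the buddies differ.
        return list(sim_buddies) != list(info_buddies)
    # S25: this node was the owner but no longer is.
    if info_owner == self_id:
        return True
    # S26: the recorded owner still exists and will trigger the migration.
    if info_owner in current_nodes:
        return False
    # S27: the recorded owner is gone; only the youngest surviving buddy acts.
    younger_alive = any(b < self_id and b in current_nodes for b in info_buddies)
    return not younger_alive
```

With a departed owner 1 and recorded buddies `[2, 3]`, node 2 returns `True` and node 3 returns `False`, so only one of the buddies starts the migration for that data.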

(Data migration processing at a predetermined timing)
Next, the predetermined timing of the data migration processing, which is based on the parameters set by the data migration parameter management unit 107, is described with reference to FIG. 7.
FIG. 7 shows an example in which the number of data migration processing threads, the data migration processing execution interval, and the maximum number of simulations are set as the parameters in the data migration parameter management unit 107.

FIG. 7 shows an example in which the number of data migration processing threads (symbol P1) is set to “3”.
In each thread, the data migration processing unit 106 acquires determination target data from the extracted data management table 200 (step S30), executes the data migration target data detection process (step S31), and, when data migration target data is detected, immediately executes the data migration processing (re-redundancy processing or rearrangement processing) (step S32).

After executing the data migration processing for data migration target data in step S32, the data migration processing unit 106 waits for a predetermined time without acquiring the next determination target data (step S30), because the data migration processing execution interval (symbol P2) is set.

As shown for thread “1” in FIG. 7, when the maximum number of simulations (symbol P3) is set to “5”, the data migration processing unit 106 acquires determination target data from the extracted data management table 200 and executes the data migration target data detection process up to five times in succession. If no data migration target data is detected after the detection process has been executed the maximum number of simulations (here, five times), the data migration processing unit 106 waits for a predetermined time.

In this way, the data migration target data detection process is executed at the predetermined timing set by the parameters, and the data migration processing is executed when data migration target data is detected. Consequently, immediately after a node 1 constituting the cluster leaves or is added, the other existing nodes 1 do not execute the data migration processing all at once; each node 1 can execute the data migration processing (re-redundancy processing or rearrangement processing) gradually while keeping its processing load down.
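The interaction of the three parameters can be sketched with a small worker pool. This is a hedged illustration rather than the described implementation: the names `MigrationParams`, `worker`, and `run_migration` are hypothetical, the queue stands in for the extracted data management table 200, `detect` and `migrate` are caller-supplied callbacks, and the wait after reaching the maximum number of simulations is assumed to equal the execution interval, which the text does not specify.

```python
import queue
import threading
import time
from dataclasses import dataclass


@dataclass
class MigrationParams:
    thread_count: int = 3        # P1: number of data migration processing threads
    interval_sec: float = 0.01   # P2: wait after each executed migration
    max_simulations: int = 5     # P3: consecutive detections without a hit before waiting


def worker(pending: queue.Queue, detect, migrate, params: MigrationParams) -> None:
    misses = 0
    while True:
        try:
            item = pending.get_nowait()      # S30: acquire determination target data
        except queue.Empty:
            return                           # every item has been processed
        if detect(item):                     # S31: detection process
            migrate(item)                    # S32: data migration processing
            misses = 0
            time.sleep(params.interval_sec)  # P2: pause before the next acquisition
        else:
            misses += 1
            if misses >= params.max_simulations:
                misses = 0
                time.sleep(params.interval_sec)  # assumed wait length after P3 misses


def run_migration(items, detect, migrate, params: MigrationParams) -> None:
    pending: queue.Queue = queue.Queue()
    for item in items:
        pending.put(item)
    threads = [
        threading.Thread(target=worker, args=(pending, detect, migrate, params))
        for _ in range(params.thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Raising `interval_sec` or lowering `thread_count` trades migration speed for a lower load on the node, which is the tuning knob the parameters are meant to provide.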

As described above, according to the data migration processing system 100 and the data migration processing method of the present embodiment, when a node 1 constituting the cluster leaves or is added, data can be migrated while the processing load on each node 1 is kept down.

<Modification 1>
Next, Modification 1 of the data migration processing system 100 according to the present embodiment is described.
FIG. 8 is a functional block diagram showing a configuration example of the node 1a according to Modification 1 of the present embodiment.
The difference from the node 1 according to the present embodiment shown in FIG. 3 is that the control unit 10 additionally includes a node load monitoring unit 108.

The node load monitoring unit 108 monitors the processing load of the node 1a itself (for example, CPU usage or memory usage) and, when the processing load exceeds a preset predetermined value, outputs processing interruption information to the data migration processing unit 106, thereby suspending the data migration processing by the data migration processing unit 106. Specifically, for example, the data migration processing unit 106 suspends the data migration processing by stopping the acquisition of determination target data from the extracted data management table 200 shown in step S30 of FIG. 7.
When the processing load of the node 1a itself falls to or below the predetermined value, the node load monitoring unit 108 outputs processing start information to the data migration processing unit 106, thereby resuming the data migration processing by the data migration processing unit 106. For example, the data migration processing unit 106 resumes the determination target data acquisition shown in step S30 of FIG. 7 and so continues the data migration processing.

In this way, even while the data migration processing unit 106 is executing the data migration processing at the predetermined timing based on the parameters set by the data migration parameter management unit 107, the data migration processing (re-redundancy processing or rearrangement processing) can be suspended if the processing load of the node 1a exceeds the predetermined value for some reason (for example, when a large number of messages are received from the client 2). The processing load on the node 1a can therefore be kept down.
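One way to picture the suspend and resume signalling of the node load monitoring unit 108 is a gate built on `threading.Event`. This is only a sketch under assumptions: the class name `LoadGate` and its methods are hypothetical, the load is a single number supplied by the caller, and the migration worker is expected to call `wait_until_allowed` before each acquisition of determination target data.

```python
import threading


class LoadGate:
    """Gate that a migration worker checks before acquiring the next item."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold          # the predetermined value for the load
        self._can_run = threading.Event()
        self._can_run.set()                 # migration is allowed initially

    def report_load(self, load: float) -> None:
        """Called by the load monitor with the latest measurement."""
        if load > self.threshold:
            self._can_run.clear()           # corresponds to processing interruption information
        else:
            self._can_run.set()             # corresponds to processing start information

    def wait_until_allowed(self, timeout=None) -> bool:
        """Block until migration may proceed; returns False if the timeout expires."""
        return self._can_run.wait(timeout)
```

A worker loop would call `gate.wait_until_allowed()` just before step S30, so an item already being migrated finishes, but no new item is picked up while the load is above the threshold.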

<Modification 2>
Next, Modification 2 of the data migration processing system 100 according to the present embodiment is described.
FIG. 9 is a functional block diagram showing a configuration example of the node 1b according to Modification 2 of the present embodiment.
The difference from the node 1a according to Modification 1 of the present embodiment shown in FIG. 8 is that the control unit 10 does not include the data migration parameter management unit 107.

In this case, because the node 1b does not include the data migration parameter management unit 107, the data migration processing unit 106 does not perform the processing shown in steps S13 and S14 of FIG. 5 in which determination target data is acquired from the extracted data management table 200 at a predetermined timing based on the set parameters. Instead, without executing step S13, the data migration processing unit 106 immediately acquires determination target data from the extracted data management table 200 in step S14, executes the data migration target data detection process, and, when data migration target data is detected, executes the data migration processing for that data. In this case, in the flow of FIG. 5, the process returns to step S14 when step S18 → No.
The node load monitoring unit 108 monitors the processing load of the node 1b itself (for example, CPU usage or memory usage) and, when the processing load exceeds a preset predetermined value, outputs processing interruption information to the data migration processing unit 106, thereby suspending the data migration processing by the data migration processing unit 106. When the processing load of the node 1b itself falls to or below the predetermined value, the node load monitoring unit 108 outputs processing start information to the data migration processing unit 106, thereby resuming the data migration processing by the data migration processing unit 106.

In this way, the node 1b can suspend the data migration processing when its own processing load exceeds the predetermined value and resume it when the processing load falls to or below the predetermined value. The node 1b can thus monitor its own processing load directly and execute the data migration processing while keeping that load from rising above the predetermined value.

DESCRIPTION OF SYMBOLS
1, 1a, 1b Node
2 Client
3 Load balancer
4 Distribution device
10 Control unit
11 Input/output unit (input unit)
12 Memory unit
13 Storage unit
100 Data migration processing system
101 Node identifier management unit
102 Message processing unit
103 Life/death monitoring unit
104 Node information giving unit
105 Data extraction unit
106 Data migration processing unit
107 Data migration parameter management unit
108 Node load monitoring unit
200 Extracted data management table
300 Data (service target data)
400 Node identifier management table (node identifier management information)
1000 Distributed processing system

Claims

A data migration processing system in which each of a plurality of nodes constituting a cluster is assigned either as an owner node, which stores data for providing services to clients as original data, or as one of one or more replication nodes, which store duplicate data of the data, and stores the data accordingly, wherein
Each of the plurality of nodes constituting the data migration processing system is
A storage unit that stores node identifier management information in which the data, the owner node, and the duplicate node are associated with each other, with respect to each of the plurality of nodes to which a node identifier that is a unique identifier is attached;
A node information giving unit that gives, as node information, a node identifier of the owner node that stores the original data and a node identifier of the duplicate node that stores the duplicate data, respectively, to the original data and the duplicate data;
Detecting the removal or addition of the node, and storing the node identifier management information by changing the data to the new association between the data and the owner node and the replication node according to the removal or addition of the node A node identifier management unit
When the separation of the node is detected, based on the changed node identifier management information, the original data stored by itself and the duplicate data stored by itself, the original data Is lost, the replication data is extracted as determination target data indicating data to be determined whether or not the data migration performed to change the owner node or the replication node is necessary,
A data extraction unit that extracts the original data stored therein as the determination target data based on the changed node identifier management information when the addition of the node is detected;
The owner node corresponding to the changed node identifier management information for the extracted determination target data at a predetermined timing based on a parameter set to suppress the processing load of the node itself due to the data migration If the node identifier of each of the identified owner node and duplicate node does not match the node information, the extracted determination target data is designated as the data that needs to be migrated. A data migration processing unit that detects the data migration target data shown and migrates the detected data migration target data to the identified owner node and replication node,
wherein the parameter is at least one of: a number of data migration processing threads, indicating the maximum number of threads that can execute the data migration in parallel; a data migration processing execution interval, indicating the waiting time after the data migration is executed; and a maximum number of simulations, indicating the number of times the data migration target data detection process, which includes a simulation that identifies the owner node and the replication node corresponding to the changed node identifier management information, is executed in succession.

Each of the plurality of nodes
A node load monitoring unit that monitors the processing load of the node itself, and outputs processing interruption information to the data migration processing unit when the processing load exceeds a predetermined value;
The data migration processing system according to claim 1, wherein the data migration processing unit suspends the data migration when receiving the processing interruption information.

The data migration processing system according to claim 1, wherein each of the plurality of nodes further comprises a message processing unit that, upon receiving from the client a message requesting provision of the service using the data, executes processing of the message and, when the data targeted by the service is the data migration target data, migrates that data to the owner node and the replication node corresponding to the changed node identifier management information.

A data migration processing method of a data migration processing system in which each of a plurality of nodes constituting a cluster is assigned either as an owner node, which stores data for providing services to clients as original data, or as one of one or more replication nodes, which store duplicate data of the data, and stores the data accordingly, wherein
Each of the plurality of nodes constituting the data migration processing system is
For each of the plurality of nodes to which a node identifier that is a unique identifier is attached, the storage unit stores node identifier management information in which the data, the owner node, and the duplicate node are associated with each other.
A node identifier of the owner node that stores the original data and a node identifier of the replica node that stores the duplicate data are provided as node information to the original data and the duplicate data, respectively.
Detecting the removal or addition of the node, and storing the node identifier management information by changing the data to the new association between the data and the owner node and the replication node according to the removal or addition of the node Step to
When the separation of the node is detected, based on the changed node identifier management information, the original data stored by itself and the duplicate data stored by itself, the original data Is lost, the replication data is extracted as determination target data indicating data to be determined whether or not the data migration performed to change the owner node or the replication node is necessary,
When the addition of the node is detected, based on the changed node identifier management information, extracting the original data stored therein as the determination target data;
The owner node corresponding to the changed node identifier management information for the extracted determination target data at a predetermined timing based on a parameter set to suppress the processing load of the node itself due to the data migration If the node identifier of each of the identified owner node and duplicate node does not match the node information, the extracted determination target data is designated as the data that needs to be migrated. Detecting as the data migration target data shown, and migrating the detected data migration target data to the identified owner node and replication node,
wherein the parameter is at least one of: a number of data migration processing threads, indicating the maximum number of threads that can execute the data migration in parallel; a data migration processing execution interval, indicating the waiting time after the data migration is executed; and a maximum number of simulations, indicating the number of times the data migration target data detection process, which includes a simulation that identifies the owner node and the replication node corresponding to the changed node identifier management information, is executed in succession.

Each of the plurality of nodes
The data migration processing method according to claim 4, further comprising the step of monitoring the processing load of the node itself and suspending the data migration when the processing load exceeds a predetermined value.