JP2013178677A

JP2013178677A - Distributed processing system, dispatcher, and distributed processing management device

Info

Publication number: JP2013178677A
Application number: JP2012042483A
Authority: JP
Inventors: Satoru Kondo; 悟近藤; Takeshi Fukumoto; 健福元
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-02-28
Filing date: 2012-02-28
Publication date: 2013-09-09
Anticipated expiration: 2032-02-28
Also published as: JP5719323B2

Abstract

PROBLEM TO BE SOLVED: To provide a distributed processing system, a dispatcher, and a distributed processing management device that speed up range search and equalize server loads in a KVS-format database to which consistent hashing is applied. SOLUTION: A dispatcher D of a distributed processing system is provided with: an access information management unit 15 that counts the number of accesses to each virtual node placed on a consistent hash ring; and a virtual node allocation unit 16 that acquires, from the other dispatchers D, access information showing the number of accesses for each virtual node, totals the number of accesses for each virtual node, and places a virtual node in the assigned area, on the consistent hash ring, of the virtual node whose total number of accesses is the largest.

Description

The present invention belongs to the technical field of distributed databases, which store data by clustering servers distributed across a network, and to the field of search technology for retrieving desired data from a large data set under given conditions.

In recent years, with the shift to the cloud, more and more services are provided on the Web side. Because the number of users can become enormous, distributed systems are increasingly adopted with a focus on query throughput per unit time. In the field of databases (DBs) in particular, which tend to become bottlenecks, there is an ongoing shift from conventional RDBs (relational databases) to NoSQL systems.

NoSQL systems make scalability their primary goal, and many take the form of a KVS (Key Value Store) that uses a hash function. Unlike an RDB, a KVS does not hold table-structured data; instead, a key is defined in advance as the search target, and applying a hash function to that key allows a value to be looked up with O(1) or O(log(N)) computation, which yields the scalability described above. The known trade-off is that such systems are weak at SQL-level queries and at guaranteeing transactional consistency.

A typical KVS-format DB is a system that uses the consistent hashing algorithm (see Non-Patent Document 1).

A DB using consistent hashing as described in Non-Patent Document 1 has the property that dynamically adding or removing servers has little impact on the existing cluster. Consistent hashing also introduces the concept of virtual nodes, whereby a plurality of nodes are virtually assigned to one physical node; assigning areas of the consistent hash ring to virtual nodes makes it possible to reduce the impact on the existing cluster even further.
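As a rough illustration of the background technique, the following Python sketch builds a consistent hash ring with virtual nodes and checks that adding a server only moves keys onto the new server. It is a minimal sketch under stated assumptions, not the method of Non-Patent Document 1: the names `ring_position`, `build_ring`, and `lookup`, the use of MD5, the 2^32 ring size, and the figure of 100 virtual nodes per server are all illustrative choices.

```python
import bisect
import hashlib


def ring_position(label: str, ring_size: int = 2**32) -> int:
    """Map a label to a ring position with an ordinary (non-order-preserving) hash."""
    digest = hashlib.md5(label.encode("utf-8")).hexdigest()
    return int(digest, 16) % ring_size


def build_ring(servers, vnodes_per_server=100):
    """Return a sorted list of (position, server) pairs, one per virtual node."""
    ring = []
    for server in servers:
        for i in range(vnodes_per_server):
            ring.append((ring_position(f"{server}#vn{i}"), server))
    ring.sort()
    return ring


def lookup(ring, key: str) -> str:
    """Walk clockwise from the key's position to the first virtual node."""
    positions = [pos for pos, _ in ring]
    idx = bisect.bisect_left(positions, ring_position(key))
    return ring[idx % len(ring)][1]  # wrap around past the last node


keys = [f"user{i}" for i in range(1000)]
before = build_ring(["s1", "s2", "s3"])
after = build_ring(["s1", "s2", "s3", "s4"])

# Keys whose owner changed when server s4 joined the cluster.
moved = [k for k in keys if lookup(before, k) != lookup(after, k)]
# Existing virtual nodes keep their positions, so a key can only move to s4.
all_moved_to_new = all(lookup(after, k) == "s4" for k in moved)
```

Only a fraction of the keys changes owner, and every key that moves lands on the added server; the assignments among the existing servers are left untouched.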

Giuseppe DeCandia, et al., "Dynamo: Amazon's Highly Available Key-value Store," SOSP'07, October 14-17, 2007, Stevenson, Washington, USA, [online], [retrieved February 13, 2012], Internet <http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf>

The problems to be solved by the present invention are as follows.
To make range search, which is used to obtain a set of data, more efficient (faster), a NoSQL DB system, particularly one based on consistent hashing, uses an order-preserving hash function (a continuous function; see FIG. 10(b)) instead of the discontinuous hash function normally used for exact-match search (see FIG. 10(a)). With an order-preserving hash function, however, a skew in the input distribution easily causes load to concentrate on one of the servers in the cluster.
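The trade-off described above can be made concrete with a small Python sketch. The two hash functions here are assumptions chosen for illustration (the source does not specify a concrete function): `ordered_hash` reads the first four bytes of the key as an integer, which preserves byte-wise key order, while `scattering_hash` uses SHA-1. The ring is simply cut into four equal areas by the hypothetical helper `owner`.

```python
import hashlib

RING_BITS = 32


def ordered_hash(key: str) -> int:
    """Order-preserving hash: the first 4 bytes of the key as a big-endian integer.

    Byte-wise lexicographic order of keys is preserved (ties are possible).
    """
    prefix = key.encode("utf-8")[:4].ljust(4, b"\x00")
    return int.from_bytes(prefix, "big")


def scattering_hash(key: str) -> int:
    """Ordinary hash for comparison: neighboring keys land far apart."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16) % 2**RING_BITS


def owner(hash_value: int, n_servers: int = 4) -> int:
    """Split the ring into n equal contiguous areas, one per server."""
    return hash_value * n_servers // 2**RING_BITS


# Skewed input: every key shares the same prefix.
keys = [f"user{i:04d}" for i in range(1000)]
ordered_owners = {owner(ordered_hash(k)) for k in keys}
scattered_owners = {owner(scattering_hash(k)) for k in keys}

# Order preservation on a few already-sorted keys.
sample = ["apple", "banana", "cherry", "user0001", "zebra"]
is_monotonic = all(
    ordered_hash(a) <= ordered_hash(b) for a, b in zip(sample, sample[1:])
)
```

With the order-preserving function all 1000 keys fall into a single area (one server takes the whole load), whereas the ordinary hash spreads them over all four areas at the cost of losing key order.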

An order-preserving hash function alone basically cannot equalize load for every input distribution, so one option is to equalize load through the allocation of consistent-hashing virtual nodes. With an order-preserving hash function, however, data can become very unevenly placed on the consistent hash ring; if virtual nodes are then simply allocated at random, hardly any may be allocated to the areas where data is densely placed, and as a result the load is not equalized.

The present invention was made in view of this background. Its object is to provide a distributed processing system, a dispatcher, and a distributed processing management device that can speed up range search and equalize server loads in a KVS-format database to which consistent hashing is applied.

To solve the above problem, the invention according to claim 1 is a distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, and a plurality of the servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, wherein each dispatcher comprises: a storage unit that stores virtual node allocation information indicating the assigned areas, on a consistent hash ring, of the virtual nodes of the plurality of servers; a distribution processing unit that receives the input information from the input device, calculates a hash value of the input information using an order-preserving hash function, determines, based on the virtual node allocation information, which virtual node of the plurality of servers placed on the consistent hash ring has an assigned area containing the calculated hash value, selects the server underlying that virtual node as the distribution destination, and transmits the input information to the selected server; an access information management unit that counts, for each virtual node placed on the consistent hash ring, the number of accesses, that is, the number of times the hash value of input information fell within that virtual node's assigned area; and a virtual node allocation unit that, when a new server is added to the distributed processing system, acquires the per-virtual-node access counts from the other dispatchers, calculates the total access count for each virtual node, places a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest total access count, generates new virtual node allocation information, and transmits it to the other dispatchers.
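The routing and counting roles of the distribution processing unit and the access information management unit might be sketched as follows. This is a toy model under loudly stated assumptions: the class name `Dispatcher`, the 256-position ring, the first-byte hash, and the convention that a virtual node covers the area ending at its own position are all hypothetical and are not taken from the claims.

```python
import bisect
from collections import Counter


class Dispatcher:
    """Toy dispatcher: routes by an order-preserving hash and counts accesses per virtual node."""

    def __init__(self, allocation):
        # allocation: list of (ring_position, virtual_node_id, server); a stand-in
        # for the "virtual node allocation information" held in the storage unit.
        self.allocation = sorted(allocation)
        self.positions = [pos for pos, _, _ in self.allocation]
        self.access_counts = Counter()

    @staticmethod
    def ordered_hash(key: str) -> int:
        # Assumed hash: the first UTF-8 byte of the key (ring positions 0..255).
        return key.encode("utf-8")[0] if key else 0

    def dispatch(self, key: str) -> str:
        h = self.ordered_hash(key)
        # Assumed convention: a virtual node covers the area that ends at its position.
        idx = bisect.bisect_left(self.positions, h) % len(self.allocation)
        _, vnode, server = self.allocation[idx]
        self.access_counts[vnode] += 1  # per-virtual-node access count
        return server


d = Dispatcher([(64, "vn1", "server1"), (128, "vn2", "server2"), (255, "vn3", "server3")])
targets = [d.dispatch(k) for k in ["alice", "bob", "0day", "étude"]]
```

After the four calls, `d.access_counts` records two accesses for `vn2` and one each for `vn1` and `vn3`; these per-node counts are the quantities that the virtual node allocation unit later totals across dispatchers.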

The invention according to claim 3 is a dispatcher of a distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, and a plurality of the servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, the dispatcher comprising: a storage unit that stores virtual node allocation information indicating the assigned areas, on a consistent hash ring, of the virtual nodes of the plurality of servers; a distribution processing unit that receives the input information from the input device, calculates a hash value of the input information using an order-preserving hash function, determines, based on the virtual node allocation information, which virtual node of the plurality of servers placed on the consistent hash ring has an assigned area containing the calculated hash value, selects the server underlying that virtual node as the distribution destination, and transmits the input information to the selected server; an access information management unit that counts, for each virtual node placed on the consistent hash ring, the number of accesses, that is, the number of times the hash value of input information fell within that virtual node's assigned area; and a virtual node allocation unit that, when a new server is added to the distributed processing system, acquires the per-virtual-node access counts from the other dispatchers, calculates the total access count for each virtual node, places a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest total access count, generates new virtual node allocation information, and transmits it to the other dispatchers.

With this configuration, the dispatcher of the distributed processing system according to the present invention can determine the destination server using an order-preserving hash function. A range search therefore no longer has to query every server; only the data in the given range needs to be searched, which shortens the response time.
Furthermore, when a new server is added to the distributed processing system, each dispatcher counts the number of accesses to the virtual nodes placed on the consistent hash ring, the per-virtual-node counts of all dispatchers are totaled, and a virtual node of the new server is placed in the assigned area, on the consistent hash ring, of the virtual node with the largest total. Because virtual nodes are placed preferentially in areas with many accesses, server loads can be equalized.
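The totaling and placement step can be sketched in Python as below. The function names `total_access_counts` and `place_new_vnode` are hypothetical, and two details are assumptions that this passage does not fix: the new virtual node is placed at the midpoint of the busiest area, and the ring is treated as non-wrapping to keep the example short.

```python
from collections import Counter


def total_access_counts(per_dispatcher_counts):
    """Sum the per-virtual-node access counts reported by every dispatcher."""
    total = Counter()
    for counts in per_dispatcher_counts:
        total.update(counts)
    return total


def place_new_vnode(ring, totals, new_vnode, new_server):
    """Insert one virtual node into the assigned area with the largest total count.

    ring: sorted list of (position, vnode_id, server); a virtual node is assumed
    to cover the area from the previous node's position up to its own position.
    The midpoint placement is an assumption, not taken from the source.
    """
    busiest = max(ring, key=lambda entry: totals[entry[1]])
    idx = ring.index(busiest)
    start = ring[idx - 1][0] if idx > 0 else 0  # non-wrapping ring for simplicity
    midpoint = (start + busiest[0]) // 2
    return sorted(ring + [(midpoint, new_vnode, new_server)])


ring = [(100, "vn1", "s1"), (200, "vn2", "s2"), (300, "vn3", "s3")]
totals = total_access_counts([
    {"vn1": 5, "vn2": 40, "vn3": 3},  # counts reported by one dispatcher
    {"vn1": 7, "vn2": 35, "vn3": 6},  # counts reported by another dispatcher
])
new_ring = place_new_vnode(ring, totals, "vn4", "s4")
```

Here `vn2` has the largest total (75 accesses), so the new server's virtual node `vn4` is inserted at position 150, inside the area that `vn2` previously covered alone.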

The invention according to claim 2 is the distributed processing system according to claim 1, wherein the virtual node allocation unit, upon placing a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest calculated total access count, distributes that total access count according to the ratio of the areas produced by the split and sets the results as the total access counts of the resulting virtual nodes, and repeats the virtual node allocation process of placing a virtual node of the new server in the assigned area of the virtual node that then has the largest total access count until the predetermined number of virtual nodes configured for the new server have been placed on the consistent hash ring, thereby generating the new virtual node allocation information.

The invention according to claim 4 is the dispatcher according to claim 3, wherein the virtual node allocation unit, upon placing a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest calculated total access count, distributes that total access count according to the ratio of the areas produced by the split and sets the results as the total access counts of the resulting virtual nodes, and repeats the virtual node allocation process of placing a virtual node of the new server in the assigned area of the virtual node that then has the largest total access count until the predetermined number of virtual nodes configured for the new server have been placed on the consistent hash ring, thereby generating the new virtual node allocation information.

With this configuration, when a new server is added to the distributed processing system, placement of the new server's virtual nodes in the assigned area of the virtual node with the largest access count can be carried out repeatedly. This further strengthens the load-equalizing effect.
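The repeated allocation process of claims 2 and 4 could look roughly like the following Python sketch. It is only an illustration under assumptions: the function name `allocate_vnodes`, the dictionary layout, and the midpoint split position are hypothetical. Only the idea of dividing the access count by the ratio of the resulting areas and then picking the busiest area again comes from the text.

```python
def allocate_vnodes(areas, new_server, n_new):
    """Repeatedly split the busiest area until n_new virtual nodes are placed.

    areas: list of dicts with keys start, end, server, count, where (start, end]
    is an assigned area and count is its total access count.
    """
    areas = [dict(a) for a in areas]  # work on a copy; the input stays unchanged
    for _ in range(n_new):
        busiest = max(areas, key=lambda a: a["count"])
        split = (busiest["start"] + busiest["end"]) // 2  # assumed placement rule
        width = busiest["end"] - busiest["start"]
        # Share the count in proportion to the widths of the two resulting areas.
        new_count = busiest["count"] * (split - busiest["start"]) / width
        new_area = {"start": busiest["start"], "end": split,
                    "server": new_server, "count": new_count}
        busiest["start"] = split
        busiest["count"] -= new_count
        areas.append(new_area)
    return sorted(areas, key=lambda a: a["start"])


areas = [
    {"start": 0, "end": 100, "server": "s1", "count": 100},
    {"start": 100, "end": 200, "server": "s2", "count": 20},
    {"start": 200, "end": 300, "server": "s3", "count": 80},
]
result = allocate_vnodes(areas, "s4", 2)
summary = [(a["start"], a["end"], a["server"], a["count"]) for a in result]
```

In this example the first virtual node of `s4` splits the area of `s1` (count 100), and the second splits the area of `s3` (count 80), so the largest per-area count falls from 100 to 50.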

The invention according to claim 5 is a distributed processing management device of a distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, a plurality of the servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, and the distributed processing management device, which is connected to the plurality of dispatchers and manages the servers to which the input information received by the dispatchers is distributed, the distributed processing management device comprising: a storage unit that stores virtual node allocation information indicating the assigned areas, on a consistent hash ring, of the virtual nodes of the plurality of servers; an access information collection unit that collects, from each of the plurality of dispatchers, access information in which the number of accesses, that is, the number of times the hash value of input information fell within a virtual node's assigned area, is counted for each virtual node placed on the consistent hash ring; and a virtual node allocation information generation unit that, when a new server is added to the distributed processing system, calculates the total access count for each virtual node, places a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest total access count, generates new virtual node allocation information, and delivers it to the plurality of dispatchers.

With this configuration, the distributed processing management device according to the present invention collects from each dispatcher access information in which the number of accesses to the virtual nodes placed on the consistent hash ring has been counted. When a new server is added, the distributed processing management device totals the per-virtual-node access counts of the dispatchers, places a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest total, generates new virtual node allocation information, and delivers it to each dispatcher. Because virtual nodes are placed preferentially in areas with many accesses, the load on each server can be equalized.

The invention according to claim 6 is the distributed processing management device according to claim 5, wherein the virtual node allocation information generation unit, upon placing a virtual node of the new server in the assigned area, on the consistent hash ring, of the virtual node with the largest calculated total access count, distributes that total access count according to the ratio of the areas produced by the split and sets the results as the total access counts of the resulting virtual nodes, and repeats the virtual node allocation process of placing a virtual node of the new server in the assigned area of the virtual node that then has the largest total access count until the predetermined number of virtual nodes configured for the new server have been placed on the consistent hash ring, thereby generating the new virtual node allocation information.

With this configuration, when a new server is added to the distributed processing system, the distributed processing management device can repeatedly place the new server's virtual nodes in the assigned area of the virtual node with the largest access count. This further strengthens the load-equalizing effect.

According to the present invention, it is possible to provide a distributed processing system, a dispatcher, and a distributed processing management device that speed up range search and equalize server loads in a KVS-format database to which consistent hashing is applied.

FIG. 1 is a diagram showing the overall configuration including the distributed processing system according to the present embodiment.
FIG. 2 is a diagram showing the internal configuration of the distributed processing system according to the present embodiment.
FIG. 3 is a diagram for explaining an overview of the processing of the distributed processing system according to the present embodiment.
FIG. 4 is a functional block diagram showing a configuration example of the dispatcher according to the present embodiment.
FIG. 5 is a diagram showing an example of the data structure of the virtual node allocation information according to the present embodiment.
FIG. 6 is a flowchart for explaining how each dispatcher according to the present embodiment counts the number of accesses to the assigned areas of virtual nodes.
FIG. 7 is a diagram for explaining how each dispatcher according to the present embodiment counts the number of accesses to the assigned areas of virtual nodes.
FIG. 8 is a sequence diagram for explaining the virtual node allocation process when a new server is added to the distributed processing system according to the present embodiment.
FIG. 9 is a sequence diagram for explaining the virtual node allocation process when an existing server is removed from the distributed processing system according to the present embodiment.
FIG. 10 is a diagram for explaining discontinuous and continuous functions in consistent hashing.

Next, the distributed processing system 1 according to a mode for carrying out the present invention (hereinafter, "the present embodiment") is described.

<System configuration>
FIG. 1 is a diagram showing the overall configuration including the distributed processing system 1 according to the present embodiment.
As shown in FIG. 1, the distributed processing system 1 is connected via a network to an operator system serving as an external system 2, to terminals 3, and the like. It receives input data (queries) from the external system 2 or a terminal 3, stores, updates, or searches data within the distributed processing system 1, and sends the result as output data to the external system 2 or the terminal 3.

<Configuration of the distributed processing system>
FIG. 2 is a diagram showing the internal configuration of the distributed processing system 1 according to the present embodiment.
As shown in FIG. 2, the distributed processing system 1 includes a load balancer B (Balancer; denoted "B" in the figures), a plurality of dispatchers D (Dispatcher; denoted "D" in the figures), a plurality of processors P (Processor; denoted "P" in the figures), and a plurality of storages S (Storage; denoted "S" in the figures).

The load balancer B acquires input data from the input device 4 and sends output data to the output device 5. It also distributes the input data to one of the dispatchers D (D₁, D₂, D₃), for example by round robin. Here, the input device 4 and the output device 5 are the external system 2 and the terminal 3 shown in FIG. 1. Input data to the distributed processing system 1 is a request for acquiring data from a database, such as an SQL query or an XCAP (XML Configuration Access Protocol) request.
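Round-robin distribution, which the text names as one possible policy for the load balancer B, can be sketched in a few lines of Python. The dispatcher labels and the function name `balance` are placeholders chosen for illustration.

```python
from itertools import cycle

# Hypothetical dispatcher labels standing in for the dispatchers D of FIG. 2.
dispatchers = cycle(["D1", "D2", "D3"])


def balance(query: str):
    """Hand each incoming query to the next dispatcher in turn."""
    return next(dispatchers), query


assignments = [balance(q)[0] for q in ["q1", "q2", "q3", "q4"]]
```

The fourth query wraps around to the first dispatcher, so each dispatcher sees roughly the same number of queries regardless of their keys.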

Each dispatcher D (D₁, D₂, D₃) is connected to the plurality of processors P (P₁, P₂, P₃) and distributes the input data (query) acquired from the load balancer B to one of the processors P (P₁, P₂, P₃). The dispatcher D parses the input data, applies consistent hashing, uses a hash function to determine the server (a pair of a processor P and a storage S) where the data is stored, and sends the input data (query) to that server. The detailed configuration and processing of the dispatcher D are described later.

The processor P is connected to the plurality of dispatchers D (D₁, D₂, D₃) and to the storage S that it controls. It receives input data from one of the dispatchers D and, according to that input data, stores new data in the storage S, updates existing data, or searches the data. The storage S is the storage means that actually holds the data; for example, each piece of data is stored as an XML (Extensible Markup Language) file. In the present embodiment, a pair of a processor P and a storage S is treated as one server. As explained later, when a server (a pair of a processor P and a storage S) is added to or removed from the distributed processing system 1 in the present embodiment, the dispatcher D₁ that manages, among other things, the registration of that server (for example, the processor P₁ and the storage S₁) in the distributed processing system 1 is added or removed along with it. Furthermore, the dispatchers D (D₁, D₂, D₃), processors P (P₁, P₂, P₃), and storages S (S₁, S₂, S₃) are not limited to the three devices of each kind shown in FIG. 2; any plural number of devices may be used.

<Overview>
First, an overview of the processing performed by the distributed processing system 1 according to the present embodiment is given.
FIG. 3 is a diagram for explaining an overview of the processing of the distributed processing system 1 according to the present embodiment.
FIG. 3(a) shows an example of a conventional system in which virtual nodes are set up based on consistent hashing and the hash function is made continuous and monotonically increasing in order to speed up range search. In this case, by keeping the data stored in each server sorted in hash-value order, any range (for example, a prefix match on a string key) can always be retrieved with O(log(N)) computation. However, as shown in FIG. 3(a), accesses are then more likely to be concentrated in part of the consistent hash ring. If one tries to solve this problem by adding virtual nodes, conventional allocation methods place them, for example, at random, so even with more virtual nodes the skewed area may not receive enough of them, and the variation in server load may remain.
To solve this problem, in the distributed processing system 1 according to the present embodiment, each dispatcher D counts, for each assigned area of the consistent hash ring, the number of accesses whose data processing the virtual node handled, and new virtual nodes are placed intensively in areas with many accesses, thereby equalizing server loads (see FIG. 3(b)).
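The O(log(N)) range retrieval mentioned in the overview relies on the stored keys being kept in sorted order. A minimal Python sketch of a prefix match by binary search is given below; the function name `prefix_range` is hypothetical, and the sketch assumes that the hash order equals the plain string order of the keys, which holds for an order-preserving hash.

```python
import bisect


def prefix_range(sorted_keys, prefix):
    """Return all keys starting with prefix from a sorted list.

    Two binary searches locate the bounds, so finding the range costs
    O(log N) comparisons regardless of how many keys are stored.
    """
    lo = bisect.bisect_left(sorted_keys, prefix)
    # U+10FFFF is the largest code point, so this bound sorts after every
    # practical key that starts with the prefix.
    hi = bisect.bisect_left(sorted_keys, prefix + "\U0010ffff")
    return sorted_keys[lo:hi]


stored = sorted(["carol", "bob", "alice", "bobby"])
matches = prefix_range(stored, "bob")
```

With a scattering hash the same query would have to visit every server, because keys sharing a prefix would not be adjacent in storage.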

In the following, the dispatcher D, which counts accesses to virtual nodes and places virtual nodes on the consistent hash ring in the distributed processing system 1 according to the present embodiment, is described in detail first. The flow of processing executed by the distributed processing system 1 is described after that.

≪Dispatcher D≫
FIG. 4 is a functional block diagram showing a configuration example of the dispatcher D according to the present embodiment.
The dispatcher D is a device that is communicably connected to the load balancer B and the plurality of processors P (P₁, P₂, P₃) and that distributes the input data (query) acquired from the load balancer B to the processors P (P₁, P₂, P₃). As shown in FIG. 4, it includes a control unit 10, an input/output unit 20, a memory unit 30, and a storage unit 40.

The input/output unit 20 inputs and outputs information to and from the load balancer B and each processor P (P_1, P_2, P_3). For example, the input/output unit 20 receives the input data (query) transmitted by the load balancer B and transmits that input data (query) to each processor P. The input/output unit 20 also receives search results, such as data stored in the storage S, from the processor P and transmits them to the load balancer B.
The input/output unit 20 consists of a communication interface that transmits and receives information via a communication line, and an input/output interface that exchanges data with input means such as a keyboard and output means such as a monitor (neither shown).

The control unit 10 controls the dispatcher D as a whole and includes an information receiving unit 11, a syntax analysis unit 12, a distribution processing unit 13, a stored information management unit 14, an access information management unit 15, a virtual node allocation unit 16, and an information transmission unit 17. The control unit 10 is realized, for example, by a CPU (Central Processing Unit) loading a program stored in the storage unit 40 of the dispatcher D into the RAM (Random Access Memory) that serves as the memory unit 30 and executing it.

The information receiving unit 11 acquires, via the input/output unit 20, input data (queries) from the load balancer B and output data from the processor P.

The syntax analysis unit 12 receives the input data (query) from the information receiving unit 11 and parses the content of the query. For example, the syntax analysis unit 12 determines whether the input data (query) is a search request (GET) for data stored in the storage S, such as an "exact-match search on a key" or a "range search on a key", or analyzes the content of a query such as a registration request for new data (PUT) or an update request for existing data (UPDATE).
The syntax analysis unit 12 then passes the analysis result to the distribution processing unit 13.

The distribution processing unit 13 includes a hash value calculation unit 131. Based on the analysis result acquired from the syntax analysis unit 12, the hash value calculation unit 131 applies consistent hashing and calculates the hash value of the input data using a preset order-preserving hash function, that is, a hash function that is continuous and monotonically increasing.
Based on the hash value calculated by the hash value calculation unit 131, the distribution processing unit 13 then refers to the virtual node allocation information 100 stored in the storage unit 40 and determines the virtual node on the consistent hash ring that is the distribution destination. The distribution processing unit 13 selects the server (a pair of processor P and storage S) underlying the determined virtual node as the distribution destination server. The virtual node allocation information 100 is described later.
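The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `order_preserving_hash` and `find_virtual_node`, the choice of hashing only the first two bytes of the key, and the third ring entry are hypothetical. Only the hash range 0 to 10000 and the first two ring entries follow the FIG. 5 example.

```python
import bisect

RING_MAX = 10000  # upper end of the hash space in the FIG. 5 example


def order_preserving_hash(key: str) -> int:
    """Map a key to 0..RING_MAX while preserving byte-wise key order.

    Assumption: only the first two bytes of the key are used, so the
    function is monotonically non-decreasing rather than strictly increasing.
    """
    data = key.encode("utf-8")[:2].ljust(2, b"\x00")
    value = int.from_bytes(data, "big")  # 0 .. 65535
    return value * RING_MAX // 0xFFFF


def find_virtual_node(ring, hash_value):
    """Return the ring entry whose assigned area contains hash_value.

    ring is a list of (upper_hash_value, virtual_node_id, physical_node_id)
    tuples sorted by upper_hash_value; each entry covers the hash values from
    the previous entry's upper value plus one up to its own upper value.
    """
    uppers = [entry[0] for entry in ring]
    index = bisect.bisect_left(uppers, hash_value)
    return ring[index % len(ring)]  # wrap around the ring


# The first two entries follow FIG. 5; the third is a made-up placeholder.
ring = [(56, "13-181", "13"), (172, "5-96", "5"), (10000, "7-3", "7")]
vnode = find_virtual_node(ring, order_preserving_hash("a0123"))
```

Because the hash preserves key order, keys that are adjacent in sort order land in the same or neighbouring assigned areas, which is what makes a range search touch only a few servers.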

The stored information management unit 14 performs overall control for managing the information stored in each server, according to the result of the syntax analysis unit 12 parsing the input data (query).
Specifically, once the distribution processing unit 13 has determined the server that is to execute a data acquisition request (search), storage, change, or the like, the stored information management unit 14 transmits the input data (query) to that distribution destination server via the information transmission unit 17.
When the input data (query) is a data acquisition request, the stored information management unit 14 also performs control to transmit the data acquired from each server to the load balancer B as output data.

The access information management unit 15 increments by one the access count 104, which is the number of times a virtual node has processed input data. It does so when the following sequence has completed: the distribution processing unit 13 has determined the virtual node that processes the input data (query), the stored information management unit 14 has transmitted the input data to the server (a pair of processor P and storage S) underlying that virtual node, and the response information has been received from that server. Specifically, the access information management unit 15 counts up the access count 104 of the input data in association with the virtual node's assigned area on the consistent hash ring, as recorded in the virtual node allocation information 100 stored in the storage unit 40.
The access count 104 thus indicates, for the assigned area of each virtual node placed on the consistent hash ring, how often the hash value of input data fell within that area.

FIG. 5 is a diagram illustrating an example of the data configuration of the virtual node allocation information 100 according to the present embodiment.
As shown in FIG. 5, the virtual node allocation information 100 includes the data items virtual node ID (Identification) 101, physical node ID 102, hash value 103, and access count 104.

Here, the virtual node ID 101 is a unique number for identifying a virtual node within the distributed processing system 1. For example, the virtual node ID 101 "13-181" shown in FIG. 5 is composed of the ID of the physical node underlying the virtual node (physical node ID 102) and one of the numbers that are unique within that physical node (here, "181").
The physical node ID 102 is a unique number for identifying, within the distributed processing system 1, the server (a pair of processor P and storage S) that is a distribution destination of input data. For example, as shown in FIG. 5, "13" is set as the physical node ID 102.
The virtual node ID 101 and the physical node ID 102 only need to be IDs that are uniquely identified within the distributed processing system 1 and are not limited to the notation shown in FIG. 5.

The hash value 103 identifies the area on the consistent hash ring for which the virtual node is responsible; for example, a value in the range from "0" to "10000" is stored.
For example, the hash value 103 of "56" in the first row of FIG. 5 indicates that the virtual node whose virtual node ID 101 is "13-181" is responsible for input data in the area "0" to "56" of the consistent hash ring. The hash value 103 of "172" in the second row indicates that the virtual node whose virtual node ID 101 is "5-96" is responsible for the area "57" to "172", where "57" is the hash value 103 of the previous row plus "1".

In the access count 104, each time input data handled by a virtual node is processed, the access information management unit 15 counts up and stores the number of input data items (the number of accesses) that fell within that virtual node's assigned area.
For example, the access count 104 of "1098" in the first row of FIG. 5 indicates that the area with hash values 103 from "0" to "56", for which the virtual node with virtual node ID 101 "13-181" is responsible, has been accessed "1098" times.
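The table layout and the counting rule described above can be illustrated with a small sketch. The class name `VirtualNodeRow` and the helpers `assigned_area` and `count_up` are hypothetical names introduced here; the two rows use the IDs and hash values quoted from FIG. 5, while the initial access count of the second row (0) is a placeholder because the text does not state it.

```python
from dataclasses import dataclass


@dataclass
class VirtualNodeRow:
    virtual_node_id: str   # data item 101
    physical_node_id: str  # data item 102
    hash_value: int        # data item 103: upper end of the assigned area
    access_count: int = 0  # data item 104


def assigned_area(rows, index):
    """Return the inclusive (low, high) area of rows[index].

    rows must be sorted by hash_value; each area starts one above the
    previous row's hash value, and the first area starts at 0.
    """
    low = 0 if index == 0 else rows[index - 1].hash_value + 1
    return low, rows[index].hash_value


def count_up(rows, hash_value):
    """Increment the access count of the row whose area contains hash_value."""
    for index, row in enumerate(rows):
        low, high = assigned_area(rows, index)
        if low <= hash_value <= high:
            row.access_count += 1
            return row
    raise ValueError("hash value outside every assigned area")


rows = [
    VirtualNodeRow("13-181", "13", 56, 1098),
    VirtualNodeRow("5-96", "5", 172),  # access count is a placeholder
]
```

Storing only the upper end of each area keeps the table compact: the lower end is always implied by the preceding row.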

Within the virtual node allocation information 100, the virtual node ID 101, the physical node ID 102, and the hash value 103 have the same values in every dispatcher D (D_1, D_2, D_3). The access count 104, by contrast, is counted independently by each access information management unit 15: for the input data that each dispatcher D (D_1, D_2, D_3) receives, the count is kept per virtual node to which that dispatcher's distribution processing unit 13 distributed the data. The counting of the access count 104 for a virtual node's assigned area by the access information management unit 15 is described in detail later with reference to FIGS. 6 and 7.

When the access information management unit 15 receives an access information aggregation notification from the dispatcher D serving as the coordinator (described in detail later), it refers to the virtual node allocation information 100 held by its own dispatcher D and returns access information. Here, access information means, among the data items of the virtual node allocation information 100, the virtual node ID 101 and the access count 104 associated with it.
Furthermore, when the access information management unit 15 receives new virtual node allocation information 100 from the dispatcher D serving as the coordinator, it updates the virtual node allocation information 100 stored in the storage unit 40 with the newly received virtual node allocation information 100. At that time, the access count 104 is initialized to "0".
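A minimal sketch of these two reactions is given below. The class `DispatcherState`, its method names, and the dictionary keys are assumptions made for illustration; the text only specifies that the reply contains virtual node IDs with their access counts and that counts are reset to 0 when a new allocation is applied.

```python
class DispatcherState:
    """Holds one dispatcher's copy of the virtual node allocation information."""

    def __init__(self, allocation):
        # allocation: list of dicts with keys "vnode", "pnode", "hash", "count"
        self.allocation = [dict(row) for row in allocation]

    def access_info(self):
        """Reply to an access information aggregation notification."""
        return {row["vnode"]: row["count"] for row in self.allocation}

    def apply_new_allocation(self, new_allocation):
        """Replace the stored allocation and reset every access count to 0."""
        self.allocation = [dict(row, count=0) for row in new_allocation]


state = DispatcherState(
    [{"vnode": "13-181", "pnode": "13", "hash": 56, "count": 1098}]
)
reply = state.access_info()
```

Resetting the counts together with the allocation means that the next aggregation reflects only the traffic observed under the new placement.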

Returning to FIG. 4, when a server is added or deleted, for example because the loads of the servers (pairs of processor P and storage S) in the distributed processing system 1 have become uneven, the virtual node allocation unit 16 determines a new placement of virtual nodes on the consistent hash ring that reflects the addition or deletion.
The processing of the virtual node allocation unit 16 described below is executed when its own dispatcher D functions as the coordinator among the dispatchers D, that is, as the one that determines the new placement of virtual nodes. One of the dispatchers D is set as the coordinator by an administrator or the like, or is chosen arbitrarily; if the dispatcher D functioning as the coordinator fails, one of the other dispatchers D takes over the coordinator role.

When a server (a pair of processor P and storage S) is added, the virtual node allocation unit 16 receives, for example from the dispatcher D that manages the added server, a participation notification indicating that a server has been newly added. It then transmits an access information aggregation notification to each existing dispatcher D (D_1, D_2, D_3). The access information aggregation notification is a message requesting a reply containing the access information (the virtual node ID 101 and the access count 104 associated with it) stored in the virtual node allocation information 100 of each dispatcher D (D_1, D_2, D_3).
The virtual node allocation unit 16 aggregates the access information transmitted from each dispatcher D (D_1, D_2, D_3) and selects the virtual node with the largest total access count 104. It then places a virtual node of the newly added server in the assigned area of that virtual node.

The placement in that area is not limited to a single virtual node; a predetermined number of virtual nodes set in advance may be placed in the area. As for the position (hash value) at which a virtual node is placed, when one virtual node is placed in the area, it may be placed at the position that bisects the area, or at a position that divides the length of the area in a ratio such as 1 to 2. When two virtual nodes are placed in the area, they may be placed at the positions that divide the area into three equal parts, or at positions that divide the length of the area in any ratio.

Once it has placed a virtual node in the area, the virtual node allocation unit 16 distributes the area's access count 104 among the resulting sub-areas in proportion to their sizes and treats each share as the total access count 104 of that sub-area. Then, taking the newly placed virtual nodes into account, the virtual node allocation unit 16 again selects the virtual node area with the largest total access count 104 and repeats the process of placing the predetermined number of virtual nodes in that area. The number of virtual nodes corresponding to one physical node is determined in advance, and the virtual node allocation unit 16 repeats this virtual node allocation process until all virtual nodes to be newly placed have been placed.
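The repeated "select the busiest area, split it, redistribute its count" procedure can be sketched as follows. This is one possible reading under several assumptions that the text leaves open: one virtual node is placed per iteration, the area is bisected, the new virtual node takes the lower half (consistent with the hash value 103 marking the upper end of an area), and areas of length 1 are never split. The function name and the dictionary keys are hypothetical.

```python
def place_virtual_nodes(areas, new_vnode_ids):
    """Greedily place new virtual nodes into the most accessed areas.

    areas: list of dicts {"vnode", "low", "high", "total"} with inclusive
    bounds and aggregated access totals. Returns a new list sorted by "low";
    the input list is left unchanged.
    """
    areas = [dict(area) for area in areas]
    for vnode_id in new_vnode_ids:
        # Only areas with at least two hash values can be split.
        candidates = [a for a in areas if a["high"] > a["low"]]
        busiest = max(candidates, key=lambda a: a["total"])
        length = busiest["high"] - busiest["low"] + 1
        mid = busiest["low"] + length // 2 - 1  # last value of the lower half
        # Distribute the access total in proportion to the sub-area sizes.
        lower_share = busiest["total"] * (mid - busiest["low"] + 1) / length
        areas.append(
            {"vnode": vnode_id, "low": busiest["low"], "high": mid,
             "total": lower_share}
        )
        busiest["total"] -= lower_share
        busiest["low"] = mid + 1
    return sorted(areas, key=lambda a: a["low"])


result = place_virtual_nodes(
    [
        {"vnode": "A", "low": 0, "high": 99, "total": 900},
        {"vnode": "B", "low": 100, "high": 199, "total": 100},
    ],
    ["N-1", "N-2"],
)
```

In this made-up example both new virtual nodes end up inside the original area of "A", because even after the first split each half of that area still carries more accesses than "B".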

When all virtual nodes have been placed in areas of the consistent hash ring, the virtual node allocation unit 16 generates new virtual node allocation information 100 (the virtual node ID 101, physical node ID 102, and hash value 103 shown in FIG. 5) and transmits it to every dispatcher D, including the dispatcher D that manages the added server.

When the virtual node allocation unit 16 receives, from an administrator of the distributed processing system 1 or the like, an instruction to delete one of the existing servers, it transmits an access information aggregation notification to each dispatcher D (D_1, D_2, D_3). The virtual node allocation unit 16 aggregates the access information transmitted from each dispatcher D (D_1, D_2, D_3) and, for each physical node (server), sums the access counts 104 of all virtual nodes set for that physical node (server). It then selects the physical node (server) with the smallest total access count 104 as the one to be deleted.

Next, the virtual node allocation unit 16 generates new virtual node allocation information 100 (the virtual node ID 101, physical node ID 102, and hash value 103 shown in FIG. 5) from which the virtual nodes set for the physical node (server) to be deleted have been removed from the consistent hash ring. It transmits the new virtual node allocation information 100 to every dispatcher D except the dispatcher D that manages the server to be deleted, and transmits a deletion notification to the dispatcher D that manages the server to be deleted.
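The deletion case can be illustrated in a similar way. The function names and the sample numbers below are hypothetical; the sketch only mirrors the two steps in the text, namely summing access counts per physical node to find the least accessed server and dropping that server's virtual nodes from the allocation.

```python
from collections import Counter


def choose_server_to_delete(access_infos, vnode_to_pnode):
    """Return the physical node ID with the smallest aggregated access count.

    access_infos: one {virtual_node_id: access_count} dict per dispatcher.
    vnode_to_pnode: {virtual_node_id: physical_node_id}.
    """
    per_pnode = Counter({pnode: 0 for pnode in vnode_to_pnode.values()})
    for info in access_infos:
        for vnode, count in info.items():
            per_pnode[vnode_to_pnode[vnode]] += count
    return min(per_pnode, key=per_pnode.get)


def remove_server(rows, pnode):
    """Drop every virtual node that belongs to the given physical node."""
    return [row for row in rows if row["pnode"] != pnode]


vnode_to_pnode = {"13-181": "13", "5-96": "5", "13-7": "13"}
access_infos = [
    {"13-181": 10, "5-96": 30, "13-7": 5},  # counts reported by one dispatcher
    {"13-181": 1, "5-96": 2, "13-7": 3},    # counts reported by another
]
target = choose_server_to_delete(access_infos, vnode_to_pnode)
```

Because each row stores only the upper end of its area, removing rows is enough in this representation: the area of a removed virtual node is implicitly absorbed by the next remaining row on the ring.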

The information transmission unit 17 transmits input data and the like to the processor P that the distribution processing unit 13 determined as the distribution destination, and transmits control information and the like to each server according to the content of the input data (query). It also performs control such as transmitting data received from the processor P to the load balancer B.

The storage unit 40 consists of a storage device such as a hard disk or flash memory and stores the virtual node allocation information 100 described above (FIG. 5). The storage unit 40 also stores the addresses (IP addresses) and the like of the load balancer B, of each dispatcher D other than itself, and of each processor P.

The memory unit 30 consists of a primary storage device such as a RAM and temporarily stores information required for data processing by the control unit 10.

<Processing flow of distributed processing system>
Next, the processing flow of the distributed processing system 1 according to the present embodiment is described.

≪Access count process≫
First, the counting of the access count 104 by the dispatcher D according to the present embodiment is described with reference to FIGS. 6 and 7.
FIG. 6 is a flowchart for explaining how each dispatcher D (access information management unit 15) according to the present embodiment counts the access count 104 for the assigned area of a virtual node. FIG. 6 shows an example in which the input device 4 and the output device 5 are the same device, but they may be separate devices.

First, the input device 4 transmits, as input data (a query), for example the search request "GET Key=a0123" to the load balancer B (step S10). The load balancer B distributes input data to the dispatchers D, for example at random. Here, the load balancer B selects the dispatcher D_1 and transmits the input data to it (step S11).

The distribution processing unit 13 of the dispatcher D_1 calculates the hash value of the input data (for example, "Key=a0123"), refers to the virtual node allocation information 100 to determine the virtual node that is the distribution destination, and transmits the input data to the physical node (server) underlying the determined virtual node. Here, among the plurality of servers, the dispatcher D_1 transmits the input data to, for example, the server consisting of the processor P_k and the storage S_k (step S12).

Next, the processor P_k receives "GET Key=a0123" as input data (a query), searches the storage S_k, and obtains as the search result, for example, the data "a0123.xml" (step S13). The processor P_k then transmits the search result "a0123.xml" to the dispatcher D_1 (step S14).

Next, the access information management unit 15 of the dispatcher D_1 identifies the virtual node that was responsible on the consistent hash ring for the input data (query) underlying the received response information (here, the search result), and increments by one the corresponding access count 104 in the virtual node allocation information 100 (FIG. 5) that it stores (step S15).

The dispatcher D_1 then transmits the search result "a0123.xml" to the output device 5 via the load balancer B (steps S16 and S17).

Next, suppose that the input device 4 transmits the search request "GET Key=b9876" to the load balancer B as another piece of input data (query) (step S20). The load balancer B distributes input data to the dispatchers D, for example at random. Here, the load balancer B selects, for example, the dispatcher D_2 as the distribution destination and transmits the input data to it (step S21).

The distribution processing unit 13 of the dispatcher D_2 calculates the hash value of the input data (for example, "Key=b9876"), refers to the virtual node allocation information 100 to determine the virtual node that is the distribution destination, and transmits the input data to the physical node (server) underlying the determined virtual node. Here, among the plurality of servers, the dispatcher D_2 transmits the input data to, for example, the server consisting of the processor P_m and the storage S_m (step S22).

Next, the processor P_m receives "GET Key=b9876" as input data (a query), searches the storage S_m, and obtains as the search result, for example, the data "b9876.xml" (step S23). The processor P_m then transmits the search result "b9876.xml" to the dispatcher D_2 (step S24).

Next, the access information management unit 15 of the dispatcher D_2 identifies the virtual node that was responsible on the consistent hash ring for the input data (query) underlying the received response information (here, the search result), and increments by one the corresponding access count 104 in the virtual node allocation information 100 (FIG. 5) that it stores (step S25).

The dispatcher D_2 then transmits the search result "b9876.xml" to the output device 5 via the load balancer B (steps S26 and S27).

FIG. 7 is a diagram for explaining how the access information management unit 15 of each dispatcher D counts the access count 104 of the virtual node allocation information 100 (FIG. 5) in steps S15 and S25 of FIG. 6.
The same consistent hash ring, with virtual nodes placed at the same positions, is set in every dispatcher D. As shown in FIG. 7, the counting is done separately in each dispatcher, for example in the dispatcher D_1 and in the dispatcher D_2: the access information management unit 15 counts how many times the distribution processing unit 13 distributed input data (queries) for processing to the area assigned to each virtual node placed on the consistent hash ring.
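The following sketch illustrates the idea that every dispatcher shares one ring but keeps its own counters, which can later be summed per virtual node. The class `CountingDispatcher`, the function `aggregate`, the third ring entry, and the sample hash values are assumptions made for this example.

```python
import bisect
from collections import Counter

# Shared ring: (upper hash value, virtual node ID). The first two entries
# follow FIG. 5; the third is a made-up placeholder that closes the ring.
RING = [(56, "13-181"), (172, "5-96"), (10000, "7-3")]
UPPERS = [upper for upper, _ in RING]


class CountingDispatcher:
    """A dispatcher that counts accesses per virtual node on its own."""

    def __init__(self):
        self.access_counts = Counter()

    def handle(self, hash_value):
        _, vnode = RING[bisect.bisect_left(UPPERS, hash_value) % len(RING)]
        # A real dispatcher would forward the query to the server here and
        # count only after the server's response has arrived.
        self.access_counts[vnode] += 1
        return vnode


def aggregate(dispatchers):
    """Sum the per-dispatcher counts for each virtual node."""
    total = Counter()
    for dispatcher in dispatchers:
        total.update(dispatcher.access_counts)
    return total


d1, d2 = CountingDispatcher(), CountingDispatcher()
d1.handle(10)
d1.handle(100)
d2.handle(20)
d2.handle(5000)
totals = aggregate([d1, d2])
```

Since the load balancer spreads queries over the dispatchers, no single dispatcher sees the full access distribution; only the summed counts show which area of the ring is busiest.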

As described with reference to FIG. 8, when the access information management unit 15 receives an access information aggregation notification from the dispatcher D serving as the coordinator, it refers to its own virtual node allocation information 100 (FIG. 5) and returns the virtual node ID 101 and the associated access count 104 to the coordinator dispatcher D as access information. The coordinator dispatcher D (virtual node allocation unit 16) totals the access count 104 for each virtual node ID 101 and places a new virtual node in the area of the virtual node with the largest total access count 104.
The virtual node allocation processing of the dispatcher D according to the present embodiment is described concretely below, for the case of adding a server and for the case of deleting one.

≪Virtual node allocation process (when adding a server)≫
FIG. 8 is a sequence diagram for explaining the virtual node allocation process when a new server is added in the distributed processing system 1 according to the present embodiment.
A server is added, for example, when the existing servers alone cannot deliver sufficient performance (for example, throughput is insufficient) or when server loads have become uneven, so that adding a server equalizes the server load.
In FIG. 8, the description assumes that the dispatcher D_4 that manages server "#4" is added together with server "#4", the pair of the processor P_4 and the storage S_4. The added processor P_4, storage S_4, and dispatcher D_4 may be referred to collectively as the physical node to be added.

As shown in FIG. 8, the dispatcher D_4 that manages the added server #4 (the pair of the processor P_4 and the storage S_4) first transmits a participation notification to the dispatcher D that functions as the coordinator among the dispatchers D (here, the dispatcher D_1) (step S30). The participation notification includes, among other things, the address information of the added dispatcher D_4 and processor P_4.

On receiving the participation notification, the virtual node allocation unit 16 of the dispatcher D_1 transmits an access information aggregation notification to each existing dispatcher D (D_2, D_3) other than the dispatcher D_4 that sent the participation notification (step S31).

The access information management unit 15 of each dispatcher D (D_2, D_3) that received the access information aggregation notification generates access information (the virtual node ID 101 and the access count 104 associated with it) based on the virtual node allocation information 100 stored in its own dispatcher D, and transmits it to the dispatcher D_1 (step S32).

The virtual node allocation unit 16 of the dispatcher D_1 totals, for each virtual node, the access counts 104 in the access information transmitted from each dispatcher D (D_2, D_3) and selects the area of the virtual node with the largest total access count 104. It then places the predetermined number of virtual nodes of the newly added server in the area on the consistent hash ring for which that virtual node is responsible (the most frequent area) (step S33). Having placed virtual nodes in the most frequent area, the virtual node allocation unit 16 of the dispatcher D_1 distributes the total access count 104 of that area among the resulting sub-areas in proportion to their sizes and treats each share as the total access count 104 of that sub-area. Then, taking the newly placed virtual nodes into account, it again selects the virtual node area with the largest total access count 104 (the most frequent area) and repeats the process of placing virtual nodes in that area.

When all virtual nodes have been allocated, the virtual node allocation unit 16 of dispatcher D₁ generates new virtual node allocation information 100 (FIG. 5) indicating each virtual node (virtual node ID 101) and the area that the virtual node is in charge of on the consistent hash ring (hash value 103), updates its own virtual node allocation information 100, and notifies each dispatcher D (D₂, D₃, D₄) (step S34). The access information management unit 15 of each dispatcher D (D₂, D₃, D₄) then updates the virtual node allocation information 100 stored in the storage unit 40 using the received new virtual node allocation information 100.

≪Virtual node allocation process (when deleting a server)≫
FIG. 9 is a sequence diagram for explaining the virtual node allocation process performed when an existing server is deleted from the distributed processing system 1 according to the present embodiment. Here, an example is described in which the server with the lightest load is identified from the access counts 104 of the virtual node allocation information 100 and deleted.
This server deletion is executed, for example, when the number of servers in the distributed processing system 1 is to be kept constant while the server loads have become uneven; an existing server is deleted once and then added again in order to equalize the server loads.
The deletion here is described as the removal of one physical node that includes both the server (a pair of a processor P and a storage S) and the dispatcher D that manages that server.

First, the virtual node allocation unit 16 of the dispatcher D functioning as the coordinator (here, dispatcher D₁) transmits an access information aggregation notification to each dispatcher D (in FIG. 9, dispatchers D₂, D₃, and D₄) (step S40). The dispatcher D functioning as the coordinator transmits the access information aggregation notification to each dispatcher D when triggered by, for example, receiving an instruction to delete a server (one physical node) from an administrator of the distributed processing system 1.

Each access information management unit 15 of the dispatchers D (D₂, D₃, D₄) that received the access information aggregation notification generates access information (the virtual node ID 101 and the access count 104 associated with it) based on the virtual node allocation information 100 stored in its own dispatcher D, and transmits it to dispatcher D₁ (step S41).

The virtual node allocation unit 16 of dispatcher D₁ totals, for each virtual node, the access counts 104 in the access information transmitted from the dispatchers D (D₂, D₃, D₄), then sums these per-virtual-node totals for each physical node (server) underlying the virtual nodes, and selects the physical node (server) with the smallest total access count 104 as the physical node (server) to be deleted (step S42). Here, it is assumed that the physical node consisting of server "#2" and the dispatcher D₂ that manages it is deleted.
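
The two-stage aggregation of step S42 can be illustrated with the following minimal Python sketch. The function name `least_loaded_server` and the dictionary `vnode_to_server`, which maps each virtual node ID to its underlying server, are hypothetical names introduced only for this example.

```python
from collections import defaultdict


def least_loaded_server(reports, vnode_to_server):
    """Return the server with the smallest total access count, and all totals."""
    # Stage 1: total the access counts for each virtual node.
    per_vnode = defaultdict(int)
    for report in reports:
        for vnode_id, count in report.items():
            per_vnode[vnode_id] += count
    # Stage 2: sum the virtual node totals for each underlying server.
    # Iterating over the mapping keeps servers that received no accesses.
    per_server = defaultdict(int)
    for vnode_id, server in vnode_to_server.items():
        per_server[server] += per_vnode[vnode_id]
    return min(per_server, key=per_server.get), dict(per_server)
```

For example, if the virtual nodes of servers "#1", "#2", and "#3" received 40, 9, and 45 accesses in total, the function selects "#2" as the deletion candidate.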

When the virtual node allocation unit 16 of dispatcher D₁ has selected the physical node (server) to be deleted, it removes the virtual nodes arranged for that physical node from the consistent hash ring, determines the new areas (hash values 103) that the remaining virtual nodes are in charge of, and generates new virtual node allocation information 100 (FIG. 5). The virtual node allocation unit 16 of dispatcher D₁ then updates its own virtual node allocation information 100 with the generated new virtual node allocation information 100 and transmits the new virtual node allocation information 100 to each dispatcher D (D₃, D₄) other than dispatcher D₂, which manages the server to be deleted (step S43). The access information management unit 15 of each dispatcher D (D₃, D₄) updates the virtual node allocation information 100 stored in the storage unit 40 using the received new virtual node allocation information 100. In addition, the virtual node allocation unit 16 transmits a deletion notification to the dispatcher D that manages the server to be deleted (here, dispatcher D₂) (step S44).
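
The effect of removing a server's virtual nodes from the ring can be sketched as follows. This is an illustration only: the tuple layout `(position, vnode_id, server)`, the function names, and the convention that a virtual node is in charge of the arc from its predecessor (exclusive) up to its own position (inclusive) are assumptions of this example. Under that convention, the area of a removed virtual node is taken over by the next remaining virtual node on the ring.

```python
import bisect


def remove_server(ring, server):
    """Drop every virtual node of `server`; `ring` is sorted by position."""
    return [entry for entry in ring if entry[2] != server]


def owner(ring, hash_value):
    """Return the ring entry in charge of `hash_value`."""
    positions = [position for position, _, _ in ring]
    # First virtual node at or after the hash value, wrapping past the end.
    index = bisect.bisect_left(positions, hash_value) % len(ring)
    return ring[index]
```

With virtual nodes at positions 100 ("#1"), 200 ("#2"), 300 ("#3"), and 400 ("#2"), a hash value of 150 is handled by "#2" before the deletion and by "#3" afterwards, while a hash value of 350 wraps around to "#1".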

As described above, according to the distributed processing system 1 and the dispatcher D of the present embodiment, using an order-preserving hash function in a KVS-format database to which consistent hashing is applied shortens the response time of range searches. Furthermore, by counting the access count 104 of each virtual node on the consistent hash ring, virtual nodes can be assigned preferentially to the parts of the ring where accesses are concentrated, so that the server loads are equalized.
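
To illustrate why an order-preserving hash function helps range searches, the following Python sketch maps integer keys onto the ring monotonically, so that a key range corresponds to one contiguous run of virtual nodes. The hash function `ordered_hash`, the constants `RING_SIZE` and `KEY_MAX`, and the `(position, server)` ring layout are hypothetical and chosen only for this example; they are not the hash function of the embodiment.

```python
import bisect

RING_SIZE = 1000   # hypothetical number of positions on the ring
KEY_MAX = 10_000   # hypothetical largest key


def ordered_hash(key):
    """Monotonic mapping: a larger key never gets a smaller hash value."""
    return key * RING_SIZE // (KEY_MAX + 1)


def servers_for_range(ring, low_key, high_key):
    """Servers whose areas overlap [low_key, high_key]; needs low_key <= high_key."""
    positions = [position for position, _ in ring]
    first = bisect.bisect_left(positions, ordered_hash(low_key))
    last = bisect.bisect_left(positions, ordered_hash(high_key))
    hits = [ring[i % len(ring)][1] for i in range(first, last + 1)]
    return list(dict.fromkeys(hits))  # drop repeats, keep ring order
```

With a non-order-preserving hash function, the keys of one range would be scattered over the ring and the query would in general have to be sent to every server; here only the servers in charge of the contiguous run are contacted.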

(Modification 1)
In the present embodiment, as shown in FIGS. 8 and 9, one of the plurality of dispatchers D functions as a coordinator and performs the virtual node allocation process that accompanies the addition or deletion of a server. However, the present invention is not limited to this. Information on the addition or deletion of a server may instead be distributed to all of the dispatchers D by a gossip protocol, which keeps the distribution load low, and the virtual node allocation unit 16 of each dispatcher D may execute the virtual node allocation process itself.

(Modification 2)
Instead of the configuration of the present embodiment, in which one of the plurality of dispatchers D functions as a coordinator, a management device (distributed processing management device) connected to each dispatcher D may be provided.
In this case, the management device (distributed processing management device) includes a storage unit that stores the virtual node allocation information 100, an access information collection unit, and a virtual node allocation information generation unit.
The access information collection unit of the management device transmits the access information aggregation notification shown in FIG. 8 to each dispatcher D and receives, from each dispatcher D, the access information counted by that dispatcher D.

When a new server is added to the distributed processing system 1, the virtual node allocation information generation unit of the management device calculates the total access count for each virtual node based on the access information collected by the access information collection unit, places a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total, and generates new virtual node allocation information 100. The management device then distributes the generated new virtual node allocation information 100 to each dispatcher D.

When generating the new virtual node allocation information 100, the virtual node allocation information generation unit may proceed as follows. It places a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total access count, distributes that total among the areas created by the placement in proportion to their sizes, and sets each share as the total access count of the corresponding virtual node. It then places a further virtual node of the new server in the area handled by the virtual node that now has the largest total access count on the consistent hash ring. This virtual node allocation process is repeated until the predetermined number of virtual nodes set for the new server have all been placed on the consistent hash ring, and the new virtual node allocation information 100 is then generated.

(Modification 3)
The access information management unit 15 of the dispatcher D according to the present embodiment has been described as counting the access count 104 of the virtual nodes for every item of input data (query) each time its own dispatcher D processes input data (see FIG. 6). However, instead of counting every item of input data that accesses a server, the access information management unit 15 may count by sampling over a predetermined period, for example. In this case, the server loads are equalized by correcting the sampled counts with a commonly used technique such as time averaging or waveform estimation.
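
One simple form of the time-average correction can be sketched in Python as follows. The function name `estimate_accesses` and its parameters are hypothetical, and the sketch assumes that the access rate observed during the sampling windows is representative of the whole period; other correction methods, such as waveform estimation, are not shown.

```python
def estimate_accesses(windows, period_seconds):
    """Scale counts sampled in short windows up to a full period.

    windows: list of (duration_seconds, {vnode_id: sampled_count}).
    """
    sampled_seconds = sum(duration for duration, _ in windows)
    sampled_totals = {}
    for _, counts in windows:
        for vnode_id, count in counts.items():
            sampled_totals[vnode_id] = sampled_totals.get(vnode_id, 0) + count
    # Average rate during sampling, extended over the whole period.
    return {
        vnode_id: count * period_seconds / sampled_seconds
        for vnode_id, count in sampled_totals.items()
    }
```

The estimated counts can then be used in place of the exact access counts 104 when selecting the most frequent area or the least loaded server.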

DESCRIPTION OF SYMBOLS
1 Distributed processing system
2 External system
3 Terminal
4 Input device
5 Output device
10 Control unit
11 Information receiving unit
12 Syntax analysis unit
13 Distribution processing unit
14 Stored information management unit
15 Access information management unit
16 Virtual node allocation unit
17 Information transmission unit
20 Input/output unit
30 Memory unit
40 Storage unit
100 Virtual node allocation information
131 Hash value calculation unit
B Load balancer
D Dispatcher
P Processor
S Storage

Claims

A distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, and a plurality of servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, wherein
each dispatcher comprises:
a storage unit that stores virtual node allocation information indicating the areas on a consistent hash ring that the virtual nodes of the plurality of servers are in charge of;
a distribution processing unit that receives the input information from the input device, calculates a hash value for the input information using an order-preserving hash function, determines, based on the virtual node allocation information, which virtual node's area on the consistent hash ring contains the calculated hash value, selects the server underlying that virtual node as the distribution destination, and transmits the input information to the selected server;
an access information management unit that counts, for each virtual node arranged on the consistent hash ring, the number of accesses, that is, the number of times the hash value of the input information fell within the area that the virtual node is in charge of; and
a virtual node allocation unit that, when a new server is added to the distributed processing system, acquires the number of accesses for each virtual node from the dispatchers other than itself, calculates the total number of accesses for each virtual node, places a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total, generates new virtual node allocation information, and transmits it to the dispatchers.

The distributed processing system according to claim 1, wherein the virtual node allocation unit,
upon placing a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total number of accesses, distributes that total among the areas created by the placement in proportion to their sizes and sets each share as the total number of accesses of the corresponding virtual node, then places a further virtual node of the new server in the area handled by the virtual node that has the largest total number of accesses on the consistent hash ring after this setting, and repeats this virtual node allocation process until the predetermined number of virtual nodes set for the new server have all been placed on the consistent hash ring, thereby generating the new virtual node allocation information.

A dispatcher in a distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, and a plurality of servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, the dispatcher comprising:
a storage unit that stores virtual node allocation information indicating the areas on a consistent hash ring that the virtual nodes of the plurality of servers are in charge of;
a distribution processing unit that receives the input information from the input device, calculates a hash value for the input information using an order-preserving hash function, determines, based on the virtual node allocation information, which virtual node's area on the consistent hash ring contains the calculated hash value, selects the server underlying that virtual node as the distribution destination, and transmits the input information to the selected server;
an access information management unit that counts, for each virtual node arranged on the consistent hash ring, the number of accesses, that is, the number of times the hash value of the input information fell within the area that the virtual node is in charge of; and
a virtual node allocation unit that, when a new server is added to the distributed processing system, acquires the number of accesses for each virtual node from the dispatchers other than itself, calculates the total number of accesses for each virtual node, places a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total, generates new virtual node allocation information, and transmits it to the dispatchers.

The dispatcher according to claim 3, wherein the virtual node allocation unit,
upon placing a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total number of accesses, distributes that total among the areas created by the placement in proportion to their sizes and sets each share as the total number of accesses of the corresponding virtual node, then places a further virtual node of the new server in the area handled by the virtual node that has the largest total number of accesses on the consistent hash ring after this setting, and repeats this virtual node allocation process until the predetermined number of virtual nodes set for the new server have all been placed on the consistent hash ring, thereby generating the new virtual node allocation information.

A distributed processing management device in a distributed processing system comprising a plurality of dispatchers that distribute input information received from an input device to servers, a plurality of servers that execute processing, including storing and retrieving data, based on the input information received from the dispatchers, and the distributed processing management device, which is connected to the plurality of dispatchers and manages the servers to which the dispatchers distribute the received input information, the distributed processing management device comprising:
a storage unit that stores virtual node allocation information indicating the areas on a consistent hash ring that the virtual nodes of the plurality of servers are in charge of;
an access information collection unit that collects, from each of the plurality of dispatchers, access information in which the number of accesses, that is, the number of times the hash value of the input information fell within the area that a virtual node arranged on the consistent hash ring is in charge of, is counted for each virtual node; and
a virtual node allocation information generation unit that, when a new server is added to the distributed processing system, calculates the total number of accesses for each virtual node, places a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total, generates new virtual node allocation information, and distributes it to the plurality of dispatchers.

The distributed processing management device according to claim 5, wherein the virtual node allocation information generation unit,
upon placing a virtual node of the new server in the area on the consistent hash ring handled by the virtual node with the largest calculated total number of accesses, distributes that total among the areas created by the placement in proportion to their sizes and sets each share as the total number of accesses of the corresponding virtual node, then places a further virtual node of the new server in the area handled by the virtual node that has the largest total number of accesses on the consistent hash ring after this setting, and repeats this virtual node allocation process until the predetermined number of virtual nodes set for the new server have all been placed on the consistent hash ring, thereby generating the new virtual node allocation information.