JP5575828B2

JP5575828B2 - Garbage collection execution device, garbage collection execution method, and garbage collection execution program

Info

Publication number: JP5575828B2
Application number: JP2012084005A
Authority: JP
Inventors: 寛之内山; 光一鷲坂
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2012-04-02
Filing date: 2012-04-02
Publication date: 2014-08-20
Anticipated expiration: 2032-04-02
Also published as: JP2013214201A

Description

The present invention relates to a garbage collection execution device, a garbage collection execution method, and a garbage collection execution program.

Conventionally, with the progress of cloud computing, various systems using distributed key-value stores have been proposed. A distributed key-value store is a database that manages data as pairs of a "Key" and a "Value". Known examples include Bigtable by Google (registered trademark) and the open-source HBase and Hypertable.

For example, in Bigtable, each of the servers on a PC cluster manages tables (partial tables) obtained by dividing a table into several ranges. Each server is provided with a write-ahead log file (hereinafter referred to as a WAL (Write Ahead Logging) log), which ensures the durability of write requests. Bigtable (a distributed key-value store) creates its files on GFS (a distributed file system).

In a distributed key-value store that uses a distributed file system, a (Key, Value) pair that has once been added is not deleted even when an operation such as deletion is executed; a (Key, Value) pair carrying a deletion flag is merely added anew. As a result, unnecessary (Key, Value) pairs wastefully occupy disk capacity in the distributed file system and lower the efficiency of search processing.

Therefore, in such a distributed key-value store, each server executes garbage collection processing to keep disk occupancy at an appropriate level. For example, garbage collection processing in a distributed key-value store reads all of the persisted addition, update, and deletion files once, deletes all records that should be deleted, and then creates a new persistent file on the distributed file system. In other words, garbage collection processing reads all information from the distributed file system and writes almost all of it (the files after deletion has been executed) back to the distributed file system.

As described above, garbage collection processing in a distributed key-value store deletes unnecessary files, thereby reducing disk occupancy in the distributed file system and improving search efficiency. However, because the processing targets all files, its processing load is extremely high, and it cannot simply be executed more frequently. That is, when garbage collection processing is executed, the trade-off between disk occupancy and resource load (for example, load on disk I/O, the network, and the CPU (Central Processing Unit)) becomes important.

To address this problem, known techniques include setting an execution interval for garbage collection processing, providing the user with a command for executing garbage collection processing, and executing garbage collection processing when a partial table is split, which occurs as the number of added records increases.

F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,” OSDI (2006)
S. Ghemawat, H. Gobioff, S.-T. Leung, “The Google File System,” SOSP (2003)
A. Khurana, “HBase,” Hadoop Day (2010)
D. Judd, “Hypertable: An Open Source, High Performance, Scalable Database,” OSCON (2008)
M. Burrows, “The Chubby lock service for loosely-coupled distributed systems,” OSDI (2006)

However, in the conventional techniques described above, garbage collection processing may occur in a concentrated manner within a short period, increasing the resource load. For example, with the technique of setting an execution interval for garbage collection processing, only one execution interval can be set for all partial tables, so the execution timings of the garbage collection processing executed for each partial table overlap and the load cannot be distributed. With the technique of providing the user with a garbage collection execution command, garbage collection processing cannot be executed at an appropriate time if the user does not have sufficient skill with the distributed key-value store. With the technique of executing garbage collection processing when a partial table is split, splits of partial tables occur in a concentrated manner within a short period as the number of added files increases; as a result, garbage collection processing is also concentrated within a short period and the load cannot be distributed.

The technology according to the present application has been made in view of the above problems of the conventional technology, and an object thereof is to provide a garbage collection execution device, a garbage collection execution method, and a garbage collection execution program that make it possible to distribute resource load by distributing the occurrence of garbage collection.

In order to solve the above problems and achieve the object, a garbage collection execution device according to the present application includes: a determination unit that determines an execution time for each partial table such that garbage collection processing, which targets records including key information and value information managed in a distributed manner by a plurality of partial tables formed with a predetermined capacity in each of a plurality of servers included in a distributed key-value store, is executed in each of the servers in a manner distributed over time across the partial tables; and an execution unit that executes garbage collection processing on the target partial table at the execution time determined by the determination unit.

The garbage collection execution device according to the present application makes it possible to distribute resource load by distributing the occurrence of garbage collection.

FIG. 1 is a diagram illustrating an example of the configuration of a distributed key-value store using the distributed file system according to the first embodiment.
FIG. 2 is a diagram for explaining record management by the slave server according to the first embodiment.
FIG. 3 is a diagram for explaining garbage collection processing by the slave server according to the first embodiment.
FIG. 4 is a diagram for explaining problems of the conventional technology.
FIG. 5 is a diagram illustrating an example of the configuration of the slave server according to the first embodiment.
FIG. 6 is a diagram for explaining an example of area splitting and garbage collection processing executed by the slave server according to the first embodiment.
FIG. 7 is a flowchart illustrating the processing procedure of the garbage collection determination unit according to the first embodiment.
FIG. 8 is a flowchart illustrating the processing procedure of the garbage collection execution unit according to the first embodiment.
FIG. 9 is a diagram for explaining prediction of the deletion capacity by the garbage collection determination unit according to the second embodiment.
FIG. 10 is a diagram for explaining an example of the garbage collection processing executed by the slave server according to the second embodiment.
FIG. 11 is a flowchart illustrating the processing procedure of the garbage collection determination unit according to the second embodiment.
FIG. 12 is a diagram illustrating a computer that executes a garbage collection execution program according to the third embodiment.

Embodiments of the garbage collection execution device, the garbage collection execution method, and the garbage collection execution program according to the present application are described below in detail with reference to the accompanying drawings. The following description takes as an example a case in which a slave server included in a distributed key-value store functions as the garbage collection execution device according to the present application. The garbage collection execution device, the garbage collection execution method, and the garbage collection execution program according to the present application are not limited to the following embodiments.

(First embodiment)
First, an example of a distributed key-value store using a distributed file system is described. FIG. 1 is a diagram illustrating an example of the configuration of a distributed key-value store using the distributed file system according to the first embodiment. As shown in FIG. 1, the distributed key-value store using the distributed file system according to the first embodiment includes slave servers 100A to 100C, a master server 200, and a network switch 300, and the slave servers 100A to 100C and the master server 200 are connected to one another via the network switch 300. Although FIG. 1 shows slave servers 100A to 100C, in practice a larger number of slave servers are connected to the network switch 300.

The distributed key-value store manages, for example, the data of users who access it via the network switch 300 as records each containing a pair of a "Key" (key information) and a "Value" (value information), in partial tables (hereinafter referred to as areas) formed on the slave servers. As one example, as shown in FIG. 1, the distributed key-value store divides a table of records sorted by Key and manages the parts on the slave servers 100A to 100C. The table may be divided based on the number of record entries or based on the data size.

The network switch 300 switches the slave server that is accessed by a client device (not shown) operated by a user. The master server 200 issues various instruction requests to the slave servers 100A to 100C. For example, the master server 200 determines which area of the table is placed on which slave server and issues an instruction request to each slave server. As one example, as shown in FIG. 1, the master server 200 places the area from Key1 to Key99 and the area from Key300 to Key399 on the slave server 100A.
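The placement of areas on slave servers can be pictured as a lookup from a key to the server that holds the covering area. The following is a minimal sketch only: keys are reduced to their numeric suffix, and the table `AREAS`, the function `locate`, and the ranges assigned to 100B and 100C are hypothetical. Only the two ranges on slave server 100A come from the description of FIG. 1.

```python
import bisect

# (first_key, last_key, server), sorted by first_key.
# Only the two 100A ranges are taken from the text; the 100B and 100C
# ranges are hypothetical placeholders for illustration.
AREAS = [
    (1, 99, "100A"),
    (100, 199, "100B"),
    (200, 299, "100C"),
    (300, 399, "100A"),
]
STARTS = [first for first, _, _ in AREAS]


def locate(key_number):
    """Return the server holding the area that covers key_number, or None."""
    i = bisect.bisect_right(STARTS, key_number) - 1
    if i < 0:
        return None
    first, last, server = AREAS[i]
    return server if first <= key_number <= last else None
```

For instance, `locate(302)` resolves to the server holding the Key300 to Key399 area, while a key outside every range yields `None`.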

The slave servers 100A to 100C manage the areas placed on them by the master server 200 and, in accordance with instructions from users, execute record processing such as search and update on the areas they manage. For example, the slave server 100A receives from a user a search instruction request for the area from Key1 to Key99 and executes the received request. In the distributed file system, replicas of each record are stored on a plurality of servers.

In the distributed key-value store, even when an operation such as deletion or update is executed on a record that has once been added, the record itself is neither deleted nor updated. For example, when a user executes a deletion operation on (Key302, Value302) managed by the slave server 100A, the slave server 100A manages D(Key302, Value302), as shown in FIG. 1. That is, when a user requests that a record be updated or deleted, the distributed key-value store does not modify the already stored (Key, Value) pair; instead, it adds a new (Key', Value') that signifies the update or deletion. Hereinafter, D(Key, Value) means that (Key, Value) is to be deleted, and U(Key, Value') means that the (Key, Value) corresponding to Key is to be updated to (Key, Value'). As with deletion, an update does not modify the stored data; internally, a new pair U(Key, Value') is added. When a search is performed, the slave servers 100A to 100C return to the user only the necessary (Key, Value) from the group of (Key, Value) entries representing additions, updates, and deletions for the same Key.
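The append-only handling of D(Key, Value) and U(Key, Value') can be illustrated with a small sketch. This is an illustration under stated assumptions, not the patented implementation: the class `AppendOnlyArea`, the `Entry` type, and the tag "A" for plain additions are hypothetical names, and the deletion marker here stores only the key for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# "D" and "U" follow the notation in the text; "A" for a plain addition
# is a hypothetical tag introduced for this sketch.
ADD, UPDATE, DELETE = "A", "U", "D"


@dataclass(frozen=True)
class Entry:
    op: str
    key: str
    value: Optional[str]


class AppendOnlyArea:
    """Stored entries are never modified; every operation appends."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def add(self, key: str, value: str) -> None:
        self.entries.append(Entry(ADD, key, value))

    def update(self, key: str, value: str) -> None:
        self.entries.append(Entry(UPDATE, key, value))

    def delete(self, key: str) -> None:
        self.entries.append(Entry(DELETE, key, None))

    def search(self, key: str) -> Optional[str]:
        # Replay every entry for the key in insertion order;
        # the most recent entry decides what the user sees.
        result = None
        for entry in self.entries:
            if entry.key == key:
                result = None if entry.op == DELETE else entry.value
        return result
```

After a deletion the original entry is still present in `entries`; only the search result changes. This is the leftover data that garbage collection processing later removes.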

Next, details of record management by the slave server are described with reference to FIG. 2. Because the slave servers included in the distributed key-value store each perform the same management, FIG. 2 is described taking the slave server 100A shown in FIG. 1 as an example. FIG. 2 is a diagram for explaining record management by the slave server 100A according to the first embodiment.

For example, as shown in FIG. 2, the slave server 100A has one WAL log (write-ahead log file) and a plurality of areas (area 1 and area 2). As shown in FIG. 2, area 1 and area 2 each have an in-memory buffer and a plurality of sorted KeyValue files. A predetermined capacity is set for area 1 and area 2, and an area is split when the capacity of the records stored in it exceeds the predetermined capacity. For example, when the capacity of the records stored in area 1 exceeds the predetermined capacity, area 1 is split, and the records that were stored in area 1 are managed in the areas resulting from the split.

As shown in (1) of FIG. 2, upon receiving from a user a request to add (newKey, newValue), the slave server 100A first writes (newKey, newValue) to the WAL log, as shown in (2). This write to the WAL log is performed by appending to the end of the file, and the durability of the record is guaranteed once the write succeeds.

Then, as shown in (3) of FIG. 2, the slave server 100A writes (newKey, newValue) to the in-memory buffer in area 1. When the amount of data in the in-memory buffer grows large, the slave server 100A sorts the records in the in-memory buffer, including (newKey, newValue), by Key, writes them out to a sorted KeyValue file on the distributed file system, and clears the in-memory buffer, as shown in (4). In FIG. 2, each batch is written to a separate sorted KeyValue file, but the embodiment is not limited to this; for example, the records may be appended to a single file.

As described above, the slave server 100A persists records by writing the temporary group of (Key, Value) pairs in memory to a sorted KeyValue file on the distributed file system. When a record search request is received from a user, the slave server 100A reads the in-memory buffer and the sorted KeyValue files and returns the search result to the user.
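Steps (1) to (4) of FIG. 2 and the search path can be sketched as follows. This is a simplified, in-memory illustration: the class `SlaveServerSketch`, the `flush_threshold` parameter (counted in records rather than bytes), and the use of Python lists in place of the WAL log and the files on the distributed file system are all assumptions made for the example. It also models a single area only.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, str]


class SlaveServerSketch:
    """One WAL log plus a single area with a buffer and sorted files."""

    def __init__(self, flush_threshold: int = 3) -> None:
        self.wal: List[Record] = []                 # stands in for the WAL log
        self.buffer: List[Record] = []              # in-memory buffer of the area
        self.sorted_files: List[List[Record]] = []  # sorted KeyValue files
        self.flush_threshold = flush_threshold      # hypothetical, in records

    def put(self, key: str, value: str) -> None:
        self.wal.append((key, value))      # (2) append to the WAL log first
        self.buffer.append((key, value))   # (3) then write to the buffer
        if len(self.buffer) >= self.flush_threshold:
            self._flush()                  # (4) persist once the buffer is large

    def _flush(self) -> None:
        # Sort by Key, write out a new sorted file, then clear the buffer.
        self.sorted_files.append(sorted(self.buffer, key=lambda kv: kv[0]))
        self.buffer = []

    def search(self, key: str) -> Optional[str]:
        # Read the sorted files (oldest first) and then the buffer;
        # the newest entry for the key wins.
        found = None
        for records in [*self.sorted_files, self.buffer]:
            for k, v in records:
                if k == key:
                    found = v
        return found
```

Because a write only appends to the WAL log and the buffer, it never has to locate an older entry for the same key, which is the source of the write-throughput benefit the description mentions.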

Records that represent a deletion or an update are added in the same way as described above. For example, as shown in FIG. 2, the slave server 100A writes D(Key10, Value10) to the in-memory buffer in area 2. The slave server 100A then sorts the contents of the in-memory buffer, including D(Key10, Value10), by Key and stores them in a sorted KeyValue file on the distributed file system.

This is because the deletion target (Key10, Value10) has already been persisted in a sorted KeyValue file, and its location cannot be known without reading all of the sorted KeyValue files. Because the target (Key10, Value10) does not have to be searched for and updated when writing to a sorted KeyValue file, write throughput can be improved.

Next, garbage collection processing by the slave server is described with reference to FIG. 3. Because the slave servers included in the distributed key-value store each perform the same garbage collection, FIG. 3 is described taking the slave server 100A shown in FIG. 1 as an example. FIG. 3 is a diagram for explaining garbage collection processing by the slave server 100A according to the first embodiment.

For example, as shown in (1) of FIG. 3, the slave server 100A reads all records in area 1 (the (Key, Value) pairs in the in-memory buffer and the sorted KeyValue files), deletes the records that should be deleted (for example, D(Key, Value) and the (Key, Value) it targets), and creates a new file. Then, as shown in (2), the slave server 100A deletes all of the existing files and the (Key, Value) pairs in memory.
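The read-everything, drop-dead-records, write-one-new-file cycle of FIG. 3 can be sketched as a pure function. This is a minimal sketch under assumptions: entries are modeled as `(op, key, value)` tuples, the tags "A", "U", and "D" and the function name `garbage_collect` are hypothetical, and the sorted files are assumed to be supplied oldest first so that later entries override earlier ones.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str, str]  # (op, key, value); op is "A", "U" or "D"


def garbage_collect(sorted_files: List[List[Entry]],
                    buffer: List[Entry]) -> List[Entry]:
    """Read every entry of one area and return a single new sorted file."""
    live: Dict[str, str] = {}
    # (1) Read all existing entries, oldest file first, buffer last.
    for entries in [*sorted_files, buffer]:
        for op, key, value in entries:
            if op == "D":
                # Drop both the deletion marker and the record it targets.
                live.pop(key, None)
            else:
                # An addition or update replaces any older value.
                live[key] = value
    # Write the surviving records as one file sorted by Key.
    # (2) The caller would then discard the old files and the buffer.
    return [("A", key, live[key]) for key in sorted(live)]
```

Even in this toy form the cost structure is visible: every entry of the area is read and nearly every surviving entry is rewritten, which is why running many of these at the same moment is expensive.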

As described above, garbage collection processing reads all existing records for each area, deletes the records that should be deleted, and then writes all of the remaining records. Garbage collection processing therefore consumes large amounts of resources such as disk I/O, CPU, and network. For this reason, various techniques concerning the execution timing of garbage collection processing have been disclosed, but none of them achieves sufficient load distribution. The problems of the conventional technology are described below. FIG. 4 is a diagram for explaining problems of the conventional technology.

In FIG. 4, (A) shows the technique of setting an execution interval for garbage collection processing, (B) shows the technique of providing the user with a command for executing garbage collection processing, and (C) shows the technique of executing garbage collection when an area is split. In the upper part of FIG. 4, drawn with rectangles and arrows, each rectangle represents an area, the length of the rectangle represents the split size, and the drawing shows how areas are split as records are added. In the middle part of FIG. 4, the horizontal axis represents time (t) and the vertical axis represents the number of garbage collection (GC) occurrences.

First, with the technique of setting an execution interval for garbage collection processing, only one execution interval can be set for all areas. Consequently, as shown in (A) of FIG. 4, the execution timings of the garbage collection processing for the individual areas overlap, and the load cannot be distributed along the time axis. In other words, this technique amounts to producing a state of high load on disk I/O, the network, and the CPU at regular intervals. In fact, at an early stage of the distributed file system, the high-load state caused by executing garbage collection processing ends in about one hour, but it has also been pointed out that the high-load state lasts longer as the amount of managed data (records) grows. This is because the number of areas increases with the amount of data, and the number of garbage collection processes increases accordingly.

With the technique of providing the user with a command for executing garbage collection processing, as shown in (B) of FIG. 4, as many garbage collection processes as there are areas are executed each time the command is run (GC execution), and the load cannot be distributed. Furthermore, using this technique appropriately is extremely difficult unless the user fully understands the internals of the system. For example, a user who does not fully understand the internals may execute garbage collection processing too often, raising the load on the CPU and disk I/O and degrading the throughput and response time of the functions that should be provided to users, namely addition, update, deletion, and search of (Key, Value) pairs. Conversely, garbage collection processing may be executed too rarely, so that disk occupancy on the distributed file system does not decrease and problems such as a full disk occur.

With the technique of executing garbage collection processing when an area is split (before the split), records are written to areas at random and the accumulation of records is therefore leveled across areas. Consequently, as shown in (C) of FIG. 4, the splits in the individual areas occur at almost the same time (concentrated within a certain period). As a result, the load on disk I/O, the network, and the CPU becomes high at the times when the areas are split almost simultaneously. That is, as the amount of managed data increases, the number of garbage collection processes that occur in a concentrated manner within a short time increases, and the resource load increases as well. In (C) of FIG. 4, the times at which areas are split progress as T → T2 → T4 because the number of areas doubles each time a split takes place, so that when (Key, Value) pairs are inserted at the same rate, twice as much time is needed until the next split.

As described above, with every one of the conventional techniques, the number of garbage collection processes that occur in a concentrated manner within a short time is large. The slave server according to the present application therefore suppresses the resource load by distributing the occurrence of garbage collection. FIG. 5 is a diagram illustrating an example of the configuration of the slave server 100A according to the first embodiment. Because the slave servers included in the distributed file system have the same configuration, the slave server 100A in FIG. 1 is described as an example.

As illustrated in FIG. 5, the slave server 100A includes a communication control I/F unit 110, an input unit 120, a display unit 130, a storage unit 140, and a control unit 150. The communication control I/F unit 110 controls communication of the various kinds of information exchanged between a user terminal device operated by a user and the control unit 150. For example, the communication control I/F unit 110 controls communication related to the search, addition, update, and deletion of records. The communication control I/F unit 110 also controls communication of the various kinds of information exchanged between the master server 200 and the control unit 150, and controls the exchange of various kinds of information between the input unit 120 and display unit 130 on one side and the control unit 150 on the other.

The input unit 120 is, for example, a keyboard or a mouse, and accepts the input of various kinds of information by an administrator of the distributed key-value store. The display unit 130 is, for example, a display, and presents processing results to the administrator of the distributed key-value store.

As shown in FIG. 5, the storage unit 140 includes a data storage area 141 and a sorted file storage area 142. The storage unit 140 is, for example, a storage device such as a hard disk or an optical disk, or a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, and stores various programs executed by the slave server 100A, among other things.

The data storage area 141 is a storage area that stores records (Key, Value) and is, for example, the in-memory buffer shown in FIG. 2. The sorted file storage area 142 is a storage area that stores records sorted by Key and is, for example, the sorted KeyValue files shown in FIG. 2. These sorted KeyValue files correspond to the distributed file system, and reliability is maintained by managing replicas of these files on a plurality of servers.

As illustrated in FIG. 5, the control unit 150 includes a garbage collection determination unit 151 and a garbage collection execution unit 152. The control unit 150 is, for example, an electronic circuit such as a CPU or an MPU (Micro Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array), and performs overall control of the slave server 100A. Although not illustrated, the control unit 150 also includes a functional unit that executes processing such as searching, adding, deleting, and updating records, a functional unit that sorts records, and a functional unit that splits areas.

The garbage collection determination unit 151 handles records, each containing a Key and a Value, that are distributed and managed across a plurality of areas formed with a predetermined capacity on each of the servers included in the distributed key-value store. It determines the execution time for each area so that garbage collection processing on these records is executed on each server area by area, spread out over time. Specifically, when garbage collection processing is executed at the time an area is split, the garbage collection determination unit 151 causes areas to be split so that the predetermined capacity varies within a predetermined range.

For example, each time a user adds a record, the garbage collection determination unit 151 acquires the area base size, which is the threshold at which an area is split, the variation size, which applies a random variation to the area base size, and the current area size. The garbage collection determination unit 151 then generates a random value (R) that is at least 0 and at most the acquired variation size.

The garbage collection determination unit 151 then determines whether the current area size exceeds the sum of the area base size and the generated R, and schedules garbage collection processing if it does. That is, when the current area size exceeds the area-splitting threshold plus a random value, the garbage collection determination unit 151 decides to split the area and notifies the garbage collection execution unit 152, described later, to execute garbage collection processing.

As a result, the size at which areas are split varies randomly, so that area splits are spread out over time. Consequently, the execution of garbage collection processing is also spread out over time, and the resource load is distributed. The variation size can be set arbitrarily by the administrator of the distributed file system or by a user.

The garbage collection execution unit 152 executes garbage collection processing on the target area at the execution time determined by the garbage collection determination unit 151. Specifically, the garbage collection execution unit 152 executes garbage collection processing when an area is split. That is, it executes garbage collection processing at the time of splitting each area whose area size has been randomly varied by the garbage collection determination unit 151.

FIG. 6 is a diagram for explaining an example of the area splitting and garbage collection processing executed by the slave server 100A according to the first embodiment. In the upper part of FIG. 6, drawn with rectangles and arrows, each rectangle represents an area and its length represents the split size; the diagram shows areas being split as records are added. In the lower part of FIG. 6, the horizontal axis represents time (t) and the vertical axis represents the number of garbage collection (GC) occurrences.

For example, in the area splitting executed by the slave server 100A, the area size varies randomly at each split, as shown in the upper part of FIG. 6. As described above, the accumulation of records is leveled across areas, so the timing at which a split occurs differs from area to area. Therefore, when garbage collection processing is executed at the time of area splitting, it is executed spread out over time, as shown in the lower part of FIG. 6. In other words, the slave server 100A can even out the number of garbage collection processes that occur per fixed time interval (that is, suppress concentrated garbage collection processing within a particular time interval).
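The spreading effect described for FIG. 6 can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the source: every area receives exactly one unit of records per tick, and the function name `split_ticks` and its parameters are hypothetical.

```python
import random
from collections import Counter


def split_ticks(num_areas, base_size, variation_size, seed=0):
    """Return a Counter mapping tick -> number of areas split (and collected) at that tick.

    Assumption for illustration: each area grows by one unit of records per tick.
    """
    rng = random.Random(seed)
    ticks = Counter()
    for _ in range(num_areas):
        size, tick = 0, 0
        while True:
            tick += 1
            size += 1  # leveled accumulation: one unit per tick in every area
            # A fresh R in [0, variation_size] is drawn on every record addition.
            if base_size + rng.uniform(0, variation_size) < size:
                ticks[tick] += 1
                break
    return ticks


fixed = split_ticks(num_areas=50, base_size=100, variation_size=0)
jittered = split_ticks(num_areas=50, base_size=100, variation_size=50)
print("peak GC count per tick, fixed threshold:", max(fixed.values()))
print("peak GC count per tick, random threshold:", max(jittered.values()))
```

With a variation size of 0 every area crosses the threshold on the same tick, so all garbage collection runs coincide; with a non-zero variation size the splits fall across a range of ticks.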

Next, the processing procedure of the slave server 100A according to the first embodiment is described with reference to FIGS. 7 and 8. FIG. 7 is a flowchart illustrating the processing procedure of the garbage collection determination unit 151 according to the first embodiment. As illustrated in FIG. 7, when a user adds a record, the garbage collection determination unit 151 according to the first embodiment acquires the area base size, the variation size, and the current area size (step S101).

The garbage collection determination unit 151 then generates a random value (R) between 0 and the variation size inclusive (step S102), and determines whether the area base size + R is smaller than the current area size (step S103). If the area base size + R is smaller than the current area size (Yes at step S103), the garbage collection determination unit 151 schedules garbage collection processing (step S104) and ends the process.

On the other hand, if the area base size + R is not smaller than the current area size (No at step S103), the garbage collection determination unit 151 ends the process without scheduling garbage collection processing.
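Steps S101 to S104 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `should_schedule_gc`, its parameter names, and the choice of a uniform distribution for R are assumptions, since the source only states that R is a random value between 0 and the variation size.

```python
import random


def should_schedule_gc(area_base_size, variation_size, current_area_size, rng=random):
    """Return True when the area should be split and garbage collection scheduled.

    The three sizes correspond to the values acquired in step S101.
    """
    # S102: draw R from [0, variation_size] (uniform distribution is an assumption).
    r = rng.uniform(0, variation_size)
    # S103: schedule (S104) only when base size + R is below the current size.
    return area_base_size + r < current_area_size


# Called on every record addition; sizes here are in arbitrary units.
if should_schedule_gc(area_base_size=256, variation_size=64, current_area_size=300):
    print("split the area and schedule garbage collection")
```

Passing `rng` as a parameter lets a caller substitute a seeded `random.Random` instance when reproducible behavior is wanted.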

FIG. 8 is a flowchart illustrating the processing procedure of the garbage collection execution unit 152 according to the first embodiment. FIG. 8 shows the processing performed after the garbage collection determination unit 151 has scheduled garbage collection processing.

As illustrated in FIG. 8, when the garbage collection determination unit 151 schedules garbage collection processing, the garbage collection execution unit 152 according to the first embodiment acquires the scheduled garbage collection task (step S201). Specifically, the garbage collection execution unit 152 acquires information on the area that is to undergo garbage collection processing.

The garbage collection execution unit 152 then generates a new sorted KeyValue file from the group of existing sorted KeyValue files in the acquired area, with updates and deletions applied (step S202), and ends the process.
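Step S202 can be pictured as a merge of the area's existing sorted files in which newer entries override older ones and entries marked as deleted are dropped. The sketch below assumes a hypothetical in-memory record layout of `(key, value, deleted)` tuples and files ordered from oldest to newest; the source does not specify the file format.

```python
def compact(sorted_files):
    """Merge sorted key-value files into one list, applying updates and deletions.

    sorted_files: files ordered oldest to newest (assumed layout); each file is a
    list of (key, value, deleted) tuples.
    """
    latest = {}
    for records in sorted_files:
        for key, value, deleted in records:
            # An entry in a newer file replaces any older entry for the same key.
            latest[key] = (value, deleted)
    # Drop keys whose newest entry carries the deletion flag; emit sorted by key.
    return [(key, value) for key, (value, deleted) in sorted(latest.items()) if not deleted]


old_file = [("a", "1", False), ("b", "2", False), ("c", "3", False)]
new_file = [("a", "9", False), ("b", None, True)]  # update "a", delete "b"
print(compact([old_file, new_file]))
```

Holding everything in a dictionary keeps the sketch short; a real store would more likely stream a k-way merge over the already sorted files.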

[Effects of the First Embodiment]
As described above, according to the first embodiment, the garbage collection determination unit 151 handles records, each containing a Key and a Value, that are distributed and managed across a plurality of areas formed with a predetermined capacity on each of the servers included in the distributed key-value store, and determines the execution time for each area so that garbage collection processing on these records is executed on each server area by area, spread out over time. The garbage collection execution unit 152 then executes garbage collection processing on the target area at the execution time determined by the garbage collection determination unit 151. The slave server 100A according to the first embodiment can therefore spread the execution of garbage collection processing over time and thereby distribute the resource load.

Further, according to the first embodiment, when garbage collection processing is executed at the time an area is split, the garbage collection determination unit 151 causes areas to be split so that the predetermined capacity varies within a predetermined range. The garbage collection execution unit 152 then executes garbage collection processing when the area is split. The slave server 100A according to the first embodiment can therefore easily spread the execution of garbage collection processing over time using existing technology, and can easily distribute the resource load.

(Second Embodiment)
The first embodiment described the case in which the timing of area splits is spread out by randomly varying the area size, and garbage collection is executed when an area is split. The second embodiment describes the case in which the capacity that garbage collection processing would free is predicted, and garbage collection processing is executed according to the prediction result. The slave server according to the second embodiment differs from the slave server 100A according to the first embodiment only in the processing performed by the garbage collection determination unit 151 and the garbage collection execution unit 152. The following description focuses on these differences.

The garbage collection determination unit 151 according to the second embodiment predicts, at predetermined time intervals, the capacity that would be freed if garbage collection processing were executed on each area, and selects the area with the largest predicted freed capacity as the target of garbage collection processing. Specifically, when a predetermined period has elapsed since a record was registered in an area, the garbage collection determination unit 151 judges that record to be a deletion target record, that is, one that garbage collection processing may delete, and predicts the total capacity of the deletion target records in the area as the deletion capacity.

The slave server 100A according to the second embodiment uses records in which a Timestamp, indicating the time at which the (Key, Value) was registered, is added to the (Key, Value). That is, the slave server 100A according to the second embodiment stores (Key, Timestamp, Value) tuples in the storage unit 140. The Timestamp may be set by the user when the (Key, Value) is added, or it may be registered by the control unit 150. In the latter case, the control unit 150 further includes a functional unit that registers the time at which the (Key, Value) was added.

For example, each time a predetermined time elapses, the garbage collection determination unit 151 predicts the deletion capacity of garbage collection processing for each area using Equation (1) below. In Equation (1), "tpds" (Total Prediction of Deleting Size) is a function giving the total capacity predicted to be deleted. "Area[i]" denotes each area in the slave server, where 1 ≤ i ≤ N and i is a positive integer. "Area[i].SortedKVF[j]" denotes a sorted KeyValue file stored in each area, where 1 ≤ j ≤ M and j is a positive integer. "pds" is a function giving the capacity predicted to be deleted.

That is, as shown in Equation (1), the garbage collection determination unit 151 predicts the deletion capacity by calculating, for each area, the total deletable capacity across its sorted KeyValue files.

The function "pds" in Equation (1) is defined by Equation (2) below. In Equation (2), "SortedKVF[j].size" denotes the record size. "SortedKVF[j].maxts" denotes the newest record in the area, and "SortedKVF[j].mints" denotes the oldest record in the area. These can be used because the registered records form a set whose elements are (Key, Timestamp, Value) tuples, so in principle there is always one with the largest Timestamp (the newest) and one with the smallest Timestamp (the oldest). "CT" (Current Time) denotes the current time, and "PV" (Period of Validity) denotes the validity period of a record.

That is, as shown in Equation (2), when the records have expired (SortedKVF[j].maxts ≤ CT − PV), the garbage collection determination unit 151 takes the record size as the deletion capacity, and when the records have not expired (CT − PV ≤ SortedKVF[j].mints), it sets the deletion capacity to 0. The garbage collection determination unit 151 predicts the expired capacity linearly; that is, it predicts the deletion capacity on the assumption that the deletable amount increases linearly as time passes.

FIG. 9 is a diagram for explaining the prediction of deletion capacity by the garbage collection determination unit 151 according to the second embodiment. In FIG. 9(A), the horizontal axis represents time (t), and line segments representing every "SortedKVF[j]" (1 ≤ j ≤ M) in one area are drawn along that axis. As shown in FIG. 9(B), the left end of each line segment in FIG. 9(A) represents "SortedKVF[j].mints", the right end represents "SortedKVF[j].maxts", and the length represents "SortedKVF[j].size".

For example, the garbage collection determination unit 151 does not treat a sorted KeyValue file that lies within the valid period in FIG. 9(A) (between "CT − PV" and "CT") as a deletion target, and sets its deletion capacity to 0. For a sorted KeyValue file that lies within the invalid period, the deletion capacity is the size of the sorted KeyValue file. For a sorted KeyValue file that spans both the invalid period and the valid period, the garbage collection determination unit 151 calculates the deletion capacity using the second expression of Equation (2). That is, as shown in Equation (2), when SortedKVF[j].mints ≤ CT − PV ≤ SortedKVF[j].maxts, the garbage collection determination unit 151 calculates the proportion that has expired and takes that proportion of the total record size as the deletion capacity.
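The three cases of "pds" and the per-area total "tpds" can be sketched as below. The equations themselves are not reproduced in this text, so the formula used for the spanning case, size × (CT − PV − mints) / (maxts − mints), is an assumption derived from the stated linear prediction. The `SortedKVF` dataclass is a hypothetical stand-in for a file's metadata.

```python
from dataclasses import dataclass


@dataclass
class SortedKVF:
    size: float   # total size of the records in the file
    mints: float  # smallest (oldest) Timestamp in the file
    maxts: float  # largest (newest) Timestamp in the file


def pds(f, ct, pv):
    """Predicted deletion size of one sorted KeyValue file at current time ct."""
    cutoff = ct - pv  # records with Timestamp at or before this have expired
    if f.maxts <= cutoff:
        return f.size  # entirely in the invalid period
    if cutoff <= f.mints:
        return 0.0     # entirely in the valid period
    # Spans both periods: assume expired volume grows linearly with time.
    return f.size * (cutoff - f.mints) / (f.maxts - f.mints)


def tpds(area_files, ct, pv):
    """Total predicted deletion size over all sorted KeyValue files of one area."""
    return sum(pds(f, ct, pv) for f in area_files)


area = [SortedKVF(10, 0, 50), SortedKVF(10, 80, 90), SortedKVF(10, 60, 80)]
print(tpds(area, ct=100, pv=30))  # expired, valid, and half-expired files
```

The first two branches are checked before the division, so a file whose mints equals its maxts never reaches the interpolation and cannot cause a division by zero.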

The garbage collection determination unit 151 calculates the deletion capacity of each area using Equations (1) and (2), and extracts the area to undergo garbage collection processing using Equation (3) below. In Equation (3), "gct" (garbage collection target) is a function that returns the area with the largest deletion amount if garbage collection processing were executed at the current time. "Areas" in Equation (3) denotes the set {Area[i] | 1 ≤ i ≤ N}.

That is, using the function "gct", which takes "Area" as its argument, the garbage collection determination unit 151 returns the area with the largest deletion amount if garbage collection were executed at the current time. The garbage collection determination unit 151 then uses Equation (4) below to ensure that garbage collection processing is not executed when the deletion capacity does not exceed a certain amount. In Equation (4), "gtc_over_mds" is a function that yields 0, meaning that garbage collection processing is not to be performed, when the deletion amount for the returned area does not reach a certain amount. "MDS" (Minimum Deleting Size) in Equation (4) denotes the threshold that decides whether garbage collection processing is executed.

That is, when the deletion capacity of the area, were garbage collection processing executed at the current time, exceeds MDS, the garbage collection determination unit 151 notifies the garbage collection execution unit 152 of information about that area. When that deletion capacity is MDS or less, the garbage collection determination unit 151 instead notifies the garbage collection execution unit 152 of 0, indicating that garbage collection processing is not to be executed.
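The selection and threshold logic described for Equations (3) and (4) can be sketched as follows. The function names mirror the identifiers used in the text, but the signatures are assumptions: here each function takes a hypothetical dictionary mapping an area identifier to its already computed predicted deletion capacity.

```python
def gct(predicted):
    """Return the identifier of the area with the largest predicted deletion capacity."""
    return max(predicted, key=predicted.get)


def gtc_over_mds(predicted, mds):
    """Return the target area, or 0 when garbage collection should not be executed.

    predicted: dict mapping area identifier -> predicted deletion capacity (assumed shape).
    mds: Minimum Deleting Size threshold.
    """
    if not predicted:
        return 0
    target = gct(predicted)
    # Only a deletion capacity strictly greater than MDS triggers garbage collection.
    return target if predicted[target] > mds else 0


predicted = {"area1": 5.0, "area2": 15.0, "area3": 9.0}
print(gtc_over_mds(predicted, mds=10))  # area2 exceeds the threshold
print(gtc_over_mds(predicted, mds=15))  # equal to MDS, so no garbage collection
```

Because 0 is used as the "do not execute" signal, following the text, area identifiers in this sketch are strings so that they cannot be confused with it.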

As described above, the garbage collection determination unit 151 calculates, for each area, the deletion capacity based on whether records have expired, and selects the area with the largest calculated deletion capacity as the target of garbage collection processing. The garbage collection determination unit 151 then causes garbage collection processing to be executed on condition that the deletion capacity of the target area exceeds a certain amount (MDS). The validity period and MDS can be set arbitrarily by the administrator of the distributed file system or by a user. For example, "30 days" may be set as the validity period, as a retention period for Web information or the like.

The garbage collection execution unit 152 according to the second embodiment executes garbage collection processing on the area determined by the garbage collection determination unit 151. Specifically, when the deletion capacity in the area selected by the garbage collection determination unit 151 as the target of garbage collection processing exceeds MDS, the garbage collection execution unit 152 executes garbage collection processing on that area. That is, when the garbage collection determination unit 151 notifies it of area information, the garbage collection execution unit 152 executes garbage collection processing on the notified area.

FIG. 10 is a diagram for explaining an example of the garbage collection processing executed by the slave server according to the second embodiment. In the upper part of FIG. 10, drawn with rectangles and arrows, each rectangle represents an area and its length represents the split size; the diagram shows areas being split as records are added. In the lower part of FIG. 10, the horizontal axis represents time (t) and the vertical axis represents the number of garbage collection (GC) occurrences. A black triangle in the lower part of FIG. 10 indicates that garbage collection processing was executed, and a dotted triangle indicates that it was not.

For example, as shown in the lower part of FIG. 10, the garbage collection processing executed by the slave server 100A according to the second embodiment is performed on one area at a time and is spread out over time. That is, the slave server 100A can suppress concentrated garbage collection processing within a particular time interval.

Next, the processing procedure of the slave server 100A according to the second embodiment is described with reference to FIG. 11. The garbage collection processing performed by the garbage collection execution unit 152 according to the second embodiment is the same as in the first embodiment (see FIG. 8). FIG. 11 is a flowchart illustrating the processing procedure of the garbage collection determination unit 151 according to the second embodiment. As illustrated in FIG. 11, when a fixed time has elapsed (Yes at step S301), the garbage collection determination unit 151 according to the second embodiment calculates the estimated amount of data to be deleted for each area (step S302).

The garbage collection determination unit 151 then extracts the area with the largest deletion capacity (step S303) and determines whether the deletion capacity of the extracted area exceeds MDS (step S304). If the deletion capacity of the extracted area exceeds MDS (Yes at step S304), the garbage collection determination unit 151 schedules garbage collection processing for that area (step S305) and ends the process.

On the other hand, if the deletion capacity of the extracted area does not exceed MDS (No at step S304), the garbage collection determination unit 151 ends the process without scheduling garbage collection processing. Until the fixed time elapses, the garbage collection determination unit 151 remains in a standby state (No at step S301).

[Effects of the Second Embodiment]
As described above, according to the second embodiment, the garbage collection determination unit 151 predicts, at predetermined time intervals, the capacity that would be freed if garbage collection processing were executed on each area, and selects the area with the largest freed capacity as the target of garbage collection processing. The garbage collection execution unit 152 then executes garbage collection processing on the area determined by the garbage collection determination unit 151. The slave server 100A according to the second embodiment can therefore control the trade-off between disk usage and resource load, and can distribute the resource load.

Further, according to the second embodiment, when a predetermined period has elapsed since a record was registered in an area, the garbage collection determination unit 151 judges that record to be a deletion target record that garbage collection processing may delete, and predicts the total capacity of the deletion target records in the area as the deletion capacity. The slave server 100A according to the second embodiment can therefore have garbage collection processing executed so that unnecessary records are reliably deleted.

Further, according to the second embodiment, the garbage collection execution unit 152 executes garbage collection processing on an area when the deletion capacity in the area selected by the garbage collection determination unit 151 as the target of garbage collection processing exceeds a predetermined threshold. The slave server 100A according to the second embodiment can therefore execute garbage collection processing effectively.

(Third Embodiment)
The first and second embodiments have been described above, but the embodiments according to the present application are not limited to them. These embodiments can be carried out in various other forms, and various omissions, replacements, and modifications can be made.

For example, the specific form in which the devices are distributed or integrated (for example, the form shown in FIG. 5) is not limited to what is illustrated; all or part of it can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. As one example, the garbage collection determination unit 151 and the garbage collection execution unit 152 may be integrated into a single processing unit; conversely, the garbage collection determination unit 151 may be divided into a prediction unit that predicts the deletion capacity and an extraction unit that extracts the deletion target area.

The control unit 150 may also be connected via a network as a device external to the slave server 100A. Alternatively, separate devices may each include the garbage collection determination unit 151 and the garbage collection execution unit 152, and may realize the functions of the slave server 100A described above by cooperating over a network connection.

The slave server 100A described in the above embodiments can also be realized by having a computer execute a program prepared in advance. An example of a computer that executes a garbage collection execution program realizing the same functions as the slave server 100A shown in FIG. 5 is described below.

FIG. 12 is a diagram showing a computer 1000 that executes the garbage collection execution program according to the third embodiment. As shown in FIG. 12, the computer 1000 includes, for example, a memory 1010, a CPU (Central Processing Unit) 1020, a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These components are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. A mouse 1110 and a keyboard 1120, for example, are connected to the serial port interface 1050. A display 1130, for example, is connected to the video adapter 1060.

As shown in FIG. 12, the hard disk drive 1090 stores, for example, an OS (Operating System) 1091, an application program 1092, a program module 1093, and program data 1094. The garbage collection execution program according to the third embodiment is stored, for example, in the hard disk drive 1090 as a program module describing the instructions to be executed by the computer 1000. Specifically, the hard disk drive 1090 stores a program module describing a garbage collection determination step, which performs the same information processing as the garbage collection determination unit 151 described in the above embodiments, and a garbage collection execution step, which performs the same information processing as the garbage collection execution unit 152.

Data used for information processing by the garbage collection execution program, such as the data stored in the database described in the above embodiments, is stored as program data in, for example, the hard disk drive 1090. The CPU 1020 reads the program modules and program data stored in the hard disk drive 1090 into the RAM 1012 as needed and executes the garbage collection determination step and the garbage collection execution step.

The program modules and program data of the garbage collection execution program need not be stored in the hard disk drive 1090. For example, they may be stored on a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, they may be stored in another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network) and read by the CPU 1020 via the network interface 1070.

These embodiments and their modifications are included in the technology disclosed in the present application and, likewise, in the invention described in the claims and its equivalents.

100A, 100B, 100C Slave server
110 Communication control I/F unit
120 Input unit
130 Display unit
140 Storage unit
141 Data storage area
142 Sorted file storage area
150 Control unit
151 Garbage collection determination unit
152 Garbage collection execution unit
200 Master server
300 Network switch

Claims

A garbage collection execution device comprising:
a determination unit that determines, for each partial table, an execution time of a garbage collection process on records, each including key information and value information and managed in a distributed manner by a plurality of servers included in a distributed Key-Value store, such that the garbage collection process is executed when each of a plurality of partial tables held by each server is divided with a randomly varied capacity; and
an execution unit that executes the garbage collection process on a target partial table at the execution time determined by the determination unit.

The garbage collection execution device according to claim 1, wherein, when the garbage collection process is to be executed at the time a partial table is divided, the determination unit divides the partial table such that the capacity after the division varies within a predetermined range, and
the execution unit executes the garbage collection process when the partial table is divided.

The garbage collection execution device according to claim 1, wherein the determination unit predicts, at predetermined time intervals, the deletion capacity that would result from executing the garbage collection process on each partial table, and determines the partial table with the largest deletion capacity to be the target of the garbage collection process, and
the execution unit executes the garbage collection process on the partial table determined by the determination unit.

The garbage collection execution device according to claim 3, wherein the determination unit judges a record to be a deletion target record, deletable by the garbage collection process, when a predetermined period has elapsed since the record was registered in the partial table, and predicts the total capacity of the deletion target records included in the partial table as the deletion capacity.

The garbage collection execution device according to claim 3 or 4, wherein the execution unit executes the garbage collection process on the partial table that the determination unit has determined to be the target of the garbage collection process when the deletion capacity of that partial table exceeds a predetermined threshold.

A garbage collection execution method executed by a device included in a distributed Key-Value store, the method comprising:
a determination step of determining, for each partial table, an execution time of a garbage collection process on records, each including key information and value information and managed in a distributed manner by a plurality of servers included in the distributed Key-Value store, such that the garbage collection process is executed when each of a plurality of partial tables held by each server is divided with a randomly varied capacity; and
an execution step of executing the garbage collection process on a target partial table at the execution time determined in the determination step.

A garbage collection execution program executed by a device included in a distributed Key-Value store, the program causing the device to execute:
a determination step of determining, for each partial table, an execution time of a garbage collection process on records, each including key information and value information and managed in a distributed manner by a plurality of servers included in the distributed Key-Value store, such that the garbage collection process is executed when each of a plurality of partial tables held by each server is divided with a randomly varied capacity; and
an execution step of executing the garbage collection process on a target partial table at the execution time determined in the determination step.