JPWO2012131927A1 - Computer system and data management method

Info

Publication number: JPWO2012131927A1
Application number: JP2013506934A
Authority: JP
Inventors: 昭博伊藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2011-03-30
Filing date: 2011-03-30
Publication date: 2014-07-24
Anticipated expiration: 2031-03-30
Also published as: US20130297788A1; WO2012131927A1; JP5342087B2

Abstract

A computer system in which a plurality of computers execute, in parallel, analysis processing on data sets each including a plurality of pieces of data composed of a key and a data value. Each computer holds, for each data set, division information that manages division position keys, which are keys indicating the division positions of the divided areas obtained by dividing the data set by predetermined key range. All the division position keys included in the division information of each data set are the same. When a new data set is stored in the file system, the computer system determines, based on the data size of each divided area after the new data set is stored, whether there is a target area, that is, a divided area whose data size is larger than a predetermined threshold. When it is determined that a target area exists, the computer system divides the target area into a plurality of new divided areas.

Description

The present invention relates to a technique for joining data in a computer system that processes a large amount of data.

As a technique for joining tables (relations) in a database, a method is known in which table joins are processed in parallel using a sort-merge join (see, for example, Patent Document 1).

A sort-merge join sorts the tables to be joined by key value, then reads the rows of each table from the top and merges the rows whose key values correspond.
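As a rough illustration of the sort-merge join just described, the following Python sketch joins two lists of (key, value) rows. The function name and the row layout are assumptions made for this example and do not come from the cited documents.

```python
def sort_merge_join(left, right):
    """Join two lists of (key, value) rows on equal keys (illustrative sketch)."""
    # Step 1: sort both inputs by key.
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])

    joined = []
    i = j = 0
    # Step 2: scan both inputs from the top, advancing the side with the smaller key.
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lkey:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Emit every pairing of the two runs.
            for _, lval in left[i:i_end]:
                for _, rval in right[j:j_end]:
                    joined.append((lkey, lval, rval))
            i, j = i_end, j_end
    return joined
```

Because each input is scanned once after sorting, the merge phase is linear in the input size apart from the pairs it emits.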

Patent Document 1 describes that, in order to parallelize processing, each table is partitioned at positions corresponding to the same key values so that corresponding divided areas are generated for each table, and the tables are joined by applying a sort-merge join to each divided area. Patent Document 1 further describes allocating the divided areas to processors so that the processor load in the system is not biased.

As a basic database technique, a table (index) that associates key values with the storage locations of the data corresponding to those key values is prepared in advance, so that data can be retrieved quickly by specifying a key value at search time (see, for example, Patent Document 2). Patent Document 2 describes a matrix index that associates data storage locations with combinations of two or more keys.

In addition, a technique that makes a plurality of storage areas usable by changing the storage area in which data is stored for each range of key values is in common use (see, for example, Patent Document 3). Patent Document 3 describes a method of leveling the usage of each storage area when a storage area is added, while limiting the amount of data moved from the existing storage areas to the newly added one.

Patent Document 1: JP-B-7-111718
Patent Document 2: JP-A-6-52231
Patent Document 3: JP-A-2001-142751

A data analysis system accumulates periodically acquired data and, as necessary, combines the accumulated data to execute analysis processing.

An example of the data processed by a data analysis system is described below with reference to the drawings.

FIG. 20 is an explanatory diagram showing an example of data processed in a conventional data analysis system. FIG. 21 is an explanatory diagram showing an example of a schema for conventional data. FIGS. 22A to 22C are explanatory diagrams showing examples of data processed in conventional analysis processing.

The example shown in FIG. 20 represents the movement history of users. Specifically, each piece of data consists of a user ID that identifies the user, a position X and a position Y that are coordinate information specifying the user's position, and a time stamp indicating the time at which the user moved to that position.

In analysis processing of data such as that shown in FIG. 20, the data is converted based on a schema, for example as shown in FIG. 21. The converted data is then grouped by user ID as shown in FIG. 22A, and analysis processing such as aggregation is executed.
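The conversion and grouping step can be pictured with the short Python sketch below. The tuple layout of a raw record and the dictionary layout of a converted record are assumptions for illustration only; the actual schema is the one shown in FIG. 21, which is not reproduced here.

```python
from collections import defaultdict


def group_by_user(raw_records):
    """Group raw (user_id, x, y, timestamp) records by user ID (illustrative sketch)."""
    grouped = defaultdict(list)
    for user_id, x, y, timestamp in raw_records:
        # Hypothetical structured form: one dictionary per movement.
        grouped[user_id].append({"x": x, "y": y, "timestamp": timestamp})
    # Return the groups sorted by user ID so that a later merge can scan them in order.
    return [(user_id, grouped[user_id]) for user_id in sorted(grouped)]
```

Keeping the output sorted by key is a design choice of this sketch; it is what allows several such data sets to be merged in a single sequential pass.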

However, converting data such as that shown in FIG. 20 into data such as that shown in FIG. 22A at analysis time takes time. To make analysis processing more efficient, the data analysis system therefore accumulates data that has already been converted into the form shown in FIG. 22A and executes analysis processing on the accumulated data.

In this specification, data composed of one or more records is referred to as a data set. A data set such as that shown in FIG. 20 is referred to as raw data, and data having a structure such as that shown in FIG. 21 is referred to as structured data.

In the accumulation processing, data in the format shown in FIG. 20 is collected periodically (for example, monthly), converted into data in the format of FIG. 22A, and then accumulated in the data analysis system. Consequently, when multiple pieces of data are aggregated to analyze, for example, one year of data or a specific month of each year, multiple pieces of data in the format shown in FIG. 22A must be joined.

For example, the data analysis system joins two pieces of data such as those shown in FIGS. 22A and 22B to produce data such as that shown in FIG. 22C.

Because row data (records) with the same user ID are merged here, processing equivalent to a join in a database is required. Moreover, the data to be joined is not limited to two pieces as in the example above; a large number of tables may need to be joined.
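A minimal Python sketch of such a multi-way merge is shown below. It assumes each data set is already a list of (user_id, history) pairs sorted by user ID; that layout and the function name are hypothetical and chosen only for this example.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_datasets(*datasets):
    """Merge any number of key-sorted data sets, concatenating rows per user ID."""
    # heapq.merge lazily interleaves the sorted inputs into one sorted stream.
    merged_stream = heapq.merge(*datasets, key=itemgetter(0))
    result = []
    # Consecutive entries with the same user ID come from different data sets.
    for user_id, rows in groupby(merged_stream, key=itemgetter(0)):
        history = []
        for _, moves in rows:
            history.extend(moves)
        result.append((user_id, history))
    return result
```

The sketch handles two data sets and many data sets with the same code path, which is the situation the paragraph above points to.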

Periodically accumulated data may also have a different size distribution from one piece of data to the next. For example, for users whose service usage count differs from month to month, the size distribution of the data differs between months.

Patent Document 1 does not describe how to determine the positions (division positions) at which a table is partitioned. In general, partitioning a table evenly requires information on the distribution of the keys contained in the table. Obtaining the key distribution by scanning the entire table takes a long time to complete.

Another way to obtain the key distribution is to use an index as described in Patent Document 2. Because the index contains all the key values in the table, the key distribution can be obtained by scanning the index. Because the index is smaller than the table, the processing time can be shortened.

However, when a large number of tables are joined, as many indexes as there are tables must be scanned, which lengthens the processing time. In addition, when the target data is large, creating an index when a table is created and updating the index when the table is updated both take time.

Alternatively, the method described in Patent Document 3 could be used without an index. That is, tables divided in advance into a plurality of divided areas are managed, the divided areas of each table are associated with one another, and merge join processing is executed in parallel for each divided area.

However, because the division positions generally differ from table to table, the divided areas cannot be associated with one another. Even if the division positions of all tables are made to match, a separate problem remains: the data size of the divided areas becomes uneven when data is updated.

That is, because the data size distribution differs for each piece of periodically accumulated data, with division positions fixed in advance the data size of each divided area becomes uneven depending on which data is combined. As a result, the processing load varies when the join processing is executed in parallel, and the parallel processing cannot be performed efficiently.
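The skew problem can be made concrete with a small Python sketch that totals the data size per divided area for a given combination of data sets. The representation is an assumption for illustration: each data set is a dictionary mapping a key to the data size stored under that key, and each division position key is treated as the inclusive end of its area.

```python
from bisect import bisect_left


def partition_sizes(split_keys, datasets):
    """Total data size per divided area for fixed, sorted division position keys."""
    # len(split_keys) boundaries produce len(split_keys) + 1 divided areas.
    sizes = [0] * (len(split_keys) + 1)
    for dataset in datasets:
        for key, size in dataset.items():
            # bisect_left finds the first boundary >= key, i.e. the area holding key.
            sizes[bisect_left(split_keys, key)] += size
    return sizes


# Hypothetical monthly data: January is even, February is concentrated in low keys.
january = {50: 10, 150: 10, 250: 10}
february = {60: 5, 70: 40, 150: 5}
balanced = partition_sizes([100, 200], [january])
skewed = partition_sizes([100, 200], [january, february])
```

With these made-up numbers, January alone is evenly spread over the three areas, while joining January with February leaves the first area several times larger than the others, so the task handling it would dominate the run time.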

A representative example of the invention disclosed in the present application is as follows. It is a computer system in which a plurality of computers execute, in parallel, analysis processing on data sets each including a plurality of pieces of data composed of a key and a data value. Each computer includes a processor, a memory connected to the processor, a storage device connected to the processor, and a network interface connected to the processor. Each computer holds, for each data set, division information that manages division position keys, which are keys indicating the division positions of the divided areas obtained by dividing the data set by predetermined key range, and all the division position keys included in the division information of each data set are the same. A file system that stores the data sets is configured on the storage areas of the plurality of computers. When executing the analysis processing, the computer system generates a plurality of tasks for the divided areas, allocates the generated tasks to the computers, and executes the analysis processing by joining the data included in the divided areas of the data sets. When a new data set is stored in the file system, the computer system determines, based on the data size of each divided area after the new data set is stored, whether there is a target area, that is, a divided area whose data size is larger than a predetermined threshold, and, when it is determined that a target area exists, divides the target area into a plurality of new divided areas.
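The threshold check and the subsequent split can be sketched in Python as follows. This is only one possible reading under stated assumptions: sizes are tracked per key, a greedy cut is made whenever adding the next key would exceed the threshold, and the function names are hypothetical. The procedure actually used by the embodiment may differ.

```python
def find_target_areas(area_sizes, threshold):
    """Return the indexes of divided areas whose data size exceeds the threshold."""
    return [i for i, size in enumerate(area_sizes) if size > threshold]


def split_area(key_sizes, threshold):
    """Choose new division position keys inside one oversized area.

    key_sizes is a list of (key, size) pairs sorted by key. The returned keys are
    the inclusive end keys of the new areas, excluding the last new area, whose
    end key is the end key the original area already had.
    """
    new_split_keys = []
    running = 0
    prev_key = None
    for key, size in key_sizes:
        # Cut before this key if adding it would push the current area over the threshold.
        if running > 0 and running + size > threshold:
            new_split_keys.append(prev_key)
            running = 0
        running += size
        prev_key = key
    return new_split_keys
```

Since the division position keys must stay identical across all data sets, any key returned by `split_area` would have to be added to the division information of every data set, not only the newly stored one.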

According to a representative aspect of the present invention, join processing between data sets can be executed in parallel without creating an index. In addition, when a new data set is added, variation in the amount of data per divided area can be suppressed, so the processing load can be leveled across the tasks that execute the join processing.

A block diagram illustrating the system configuration of the data analysis system in the first embodiment of the present invention.
A block diagram illustrating the hardware configuration of a node in the first embodiment of the present invention.
A block diagram illustrating the software configuration of the master node in the first embodiment of the present invention.
A block diagram illustrating the software configuration of the slave node in the first embodiment of the present invention.
An explanatory diagram showing an example of the data management table in the first embodiment of the present invention.
An explanatory diagram showing an example of the division table in the first embodiment of the present invention.
An explanatory diagram showing an example of the division table in the first embodiment of the present invention.
An explanatory diagram showing an example of the partition table in the first embodiment of the present invention.
An explanatory diagram showing an example of the key size table in the first embodiment of the present invention.
An explanatory diagram showing an example of the key size table in the first embodiment of the present invention.
A flowchart illustrating the data join processing and analysis processing in the first embodiment of the present invention.
A flowchart illustrating the data addition processing in the first embodiment of the present invention.
A flowchart illustrating the details of the grouping processing in the first embodiment of the present invention.
A flowchart illustrating the data output processing in the first embodiment of the present invention.
A flowchart illustrating the data size confirmation processing in the first embodiment of the present invention.
An explanatory diagram showing an example of the key size table in the first embodiment of the present invention.
An explanatory diagram showing an example of the division table after division in the first embodiment of the present invention.
An explanatory diagram showing an example of the division table after division in the first embodiment of the present invention.
An explanatory diagram showing an example of the key size table after division in the first embodiment of the present invention.
An explanatory diagram showing the schema of a record in the second embodiment of the present invention.
An explanatory diagram showing an example of a record in the second embodiment of the present invention.
An explanatory diagram showing a file in the second embodiment of the present invention.
An explanatory diagram showing a file in the second embodiment of the present invention.
An explanatory diagram showing a file in the second embodiment of the present invention.
An explanatory diagram showing an example of the division table in the second embodiment of the present invention.
An explanatory diagram showing an example of data processed in a conventional data analysis system.
An explanatory diagram showing an example of a schema for conventional data.
An explanatory diagram showing an example of data processed in conventional analysis processing.
An explanatory diagram showing an example of data processed in conventional analysis processing.
An explanatory diagram showing an example of data processed in conventional analysis processing.

[First Embodiment]

A first embodiment of the present invention is described below.

FIG. 1 is a block diagram illustrating the system configuration of the data analysis system according to the first embodiment of the present invention.

The data analysis system includes a client node 10, a master node 20, and slave nodes 30, which are connected to one another via a network 40. The network 40 may be a SAN, a LAN, a WAN, or the like, but any network over which the nodes can communicate may be used. The nodes may also be connected directly.

Here, a node means a computer. Hereinafter, a computer is referred to as a node.

The client node 10 is the node used by a user of the data analysis system. The user uses the client node 10 to send various instructions to the master node 20, the slave nodes 30, and so on.

The master node 20 is a node that manages the entire data analysis system. A slave node 30 is a node that executes processes (tasks) in accordance with instructions sent from the master node 20. The data analysis system is a kind of parallel distributed processing system, and its processing performance can be improved by increasing the number of slave nodes 30.

The client node 10, the master node 20, and the slave nodes 30 have the same hardware configuration, the details of which are described later with reference to FIG. 2.

A storage device 11, 21, or 31 such as an HDD is connected to each node. Each of the storage devices 11, 21, and 31 stores programs, such as an OS, that implement the functions of the node. Each program is read from the storage device 11, 21, or 31 and executed by the CPU (see FIG. 2).

FIG. 2 is a block diagram illustrating the hardware configuration of a node according to the first embodiment of the present invention.

FIG. 2 uses the client node 10 as an example, but the master node 20 and the slave nodes 30 have the same hardware configuration.

The client node 10 includes a CPU 101, a network I/F 102, an input/output I/F 103, a memory 104, and a disk I/F 105, which are connected to one another via an internal bus or the like.

The CPU 101 executes programs stored in the memory 104.

The memory 104 stores the programs executed by the CPU 101 and the information needed to execute them. The programs stored in the memory 104 may instead be stored in the storage device 11, in which case the CPU 101 reads them from the storage device 11 into the memory 104.

The network I/F 102 is an interface for connecting to other nodes via the network 40. The disk I/F 105 is an interface for connecting to the storage device 11.

The input/output I/F 103 is an interface for connecting input/output devices such as a keyboard 106, a mouse 107, and a display 108. The user uses the input/output devices to send instructions to the data analysis system and to check analysis results.

The master node 20 and the slave nodes 30 need not include the keyboard 106, the mouse 107, and the display 108.

Next, the software configurations of the master node 20 and the slave node 30 are described.

FIG. 3A is a block diagram illustrating the software configuration of the master node 20 according to the first embodiment of the present invention.

The master node 20 includes a data management unit 21, a process management unit 22, and a file server (master) 23.

The data management unit 21, the process management unit 22, and the file server (master) 23 are programs stored in the memory 104 and executed by the CPU 101. In the following, when processing is described with a program as the subject, it is assumed that the CPU 101 is executing that program.

The data management unit 21 manages the data processed by the data analysis system. The data management unit 21 includes a data management table T100, a division table T200, and a key size table T400.

The data management table T100 stores management information on the data sets processed by the data analysis system. Details of the data management table T100 are described later with reference to FIG. 4. Here, a data set means data composed of a plurality of records.

The division table T200 stores management information on the divided areas obtained by dividing a data set. Here, a divided area is a group of records obtained by dividing the data set by predetermined key range. Details of the division table T200 are described later with reference to FIG. 5.

The key size table T400 stores management information on the data size of each divided area in a data set. One key size table T400 corresponds to one data set. There is also a key size table T400 that manages the data size of the data sets across the entire data analysis system. Details of the key size table T400 are described later with reference to FIG. 7.

The process management unit 22 manages the parallel processing that is distributed across and executed on the slave nodes 30. The process management unit 22 includes a program repository 24 that manages the programs from which the processes (tasks) to be executed in parallel are generated. That is, the process management unit 22 generates, from the program repository 24, the tasks to be executed on each slave node 30 and instructs the slave nodes 30 to execute the generated tasks.

The file server (master) 23 manages the files that store the actual data.

The software components of the master node 20 may instead be implemented in hardware.

FIG. 3B is a block diagram illustrating the software configuration of the slave node 30 according to the first embodiment of the present invention.

The slave node 30 includes a process execution unit 31 and a file server (slave) 32.

The process execution unit 31 and the file server (slave) 32 are programs stored in the memory 104 and executed by the CPU 101. In the following, when processing is described with a program as the subject, it is assumed that the CPU 101 is executing that program.

The process execution unit 31 receives an instruction to execute a process (task) from the process management unit 22 of the master node 20 and executes the specified process (task). That is, based on the received execution instruction, the process execution unit 31 generates a process for executing the task. When the generated processes run, a plurality of tasks are executed on the slave nodes 30, which realizes parallel distributed processing.

The process execution unit 31 of this embodiment includes a data adding unit (Map) 33 and a data adding unit (Reduce) 34 that execute these tasks.

The data adding unit (Map) 33 reads the input raw data (see FIG. 20) record by record and outputs the read raw data to the data adding unit (Reduce) 34 responsible for the corresponding key range. The key range that each data adding unit (Reduce) 34 is responsible for is set in advance.

The data adding unit (Map) 33 includes a partition table T300. Based on the partition table T300, the data adding unit (Map) 33 identifies the data adding unit (Reduce) 34 to which the read data is to be output. The partition table T300 is described later with reference to FIGS. 7A and 7B.
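The routing performed by the data adding unit (Map) can be pictured with the following Python sketch. The layout of the partition table T300 is not given at this point in the text, so the sketch assumes it reduces to a sorted list of inclusive end keys plus one Reduce identifier per key range; the function name, identifiers, and record layout are all hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def route_records(records, end_keys, reducer_ids):
    """Assign each (key, payload) record to the Reduce unit owning its key range."""
    if len(reducer_ids) != len(end_keys) + 1:
        raise ValueError("need exactly one reducer per key range")
    routed = defaultdict(list)
    for record in records:
        key = record[0]
        # Index of the first end key >= key identifies the key range of the record.
        routed[reducer_ids[bisect_left(end_keys, key)]].append(record)
    return dict(routed)
```

A binary search over the boundaries keeps the per-record cost logarithmic in the number of key ranges, which matters when the Map side processes every raw record.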

The data adding unit (Reduce) 34 converts the input raw data into a predetermined format, namely structured data (see FIG. 21), and outputs the structured data to the distributed file system.

The data adding unit (Reduce) 34 includes a key size table T400. This key size table T400 is the same as the key size table T400 included in the data management unit 21, except that it stores management information only for the divided areas in the key range handled by that data adding unit (Reduce) 34.

The file server (slave) 32 manages the distributed files. The file server (master) 23 manages file metadata (directory structure, size, update date and time, and so on) and works with the file servers (slave) 32 to provide a single file system.

The data adding unit (Map) 33 and the data adding unit (Reduce) 34 access the file server (master) 23 to use the files on the file system and execute their tasks. That is, the data adding unit (Map) 33 and the data adding unit (Reduce) 34 can access the same file system.

The software components of the slave node 30 may instead be implemented in hardware.

Next, the tables included in the data management unit 21 are described in detail.

FIG. 4 is an explanatory diagram showing an example of the data management table T100 according to the first embodiment of the present invention.

The data management table T100 includes a data ID (T101) and a division table name T102. The data ID (T101) stores the identifier of a data set. The division table name T102 stores the name of the division table T200 corresponding to that data set.

Each entry of the data management table T100 corresponds to one data set managed by the data analysis system. A data set corresponds to one table (relation) in an ordinary database.

FIGS. 5A and 5B are explanatory diagrams showing examples of the division table T200 according to the first embodiment of the present invention.

FIG. 5A shows an example of the division table T200 of the data set whose division table name T102 is "log01.part". FIG. 5B shows an example of the division table T200 whose division table name T102 is "log02.part".

The division table T200 stores management information indicating how each data set processed by the data analysis system is divided. The division table T200 includes a division table name T201, a data file name T202, a key (T203), and an offset T204.

The division table name T201 stores the name of the division table T200 and is the same as the division table name T102.

The data file name T202 stores the name of the file that stores the data corresponding to the divided area.

The key (T203) stores the key value that delimits the key range of the divided area, that is, the key value representing a division position of the data set. Specifically, the key (T203) stores the key value representing the end point of the divided area.

The offset T204 stores the offset corresponding to the division position in the data set, that is, the offset of the key stored in the key (T203). When the data file name T202 differs, the data is stored in a different file, so the offset of the corresponding entry is counted again from "0".

The start position of a divided area corresponds to the key (T203) and the offset T204 of the preceding entry. The key representing the start position of the first divided area and the key representing the end position of the last divided area are not defined, so they are not recorded in the division table T200.

各分割テーブルＴ２００の各エントリは、本データ分析システムが管理する１つの分割領域に対応する。 Each entry of each division table T200 corresponds to one division area managed by the data analysis system.

例えば、図４に示すデータ管理テーブルＴ１００の１つ目のエントリは、分割テーブル名Ｔ１０２が「ｌｏｇ０１．ｐａｒｔ」であり、図５Ａに示す分割テーブルＴ２００に対応する。 For example, the first entry of the data management table T100 shown in FIG. 4 has “log01.part” as its division table name T102 and corresponds to the division table T200 shown in FIG. 5A.

図５Ａに示す分割テーブルＴ２００の１つ目のエントリが最初の分割領域に対応する。１つ目のエントリは、データファイル名Ｔ２０２が「ｌｏｇ０１／００１．ｄａｔ」であるファイルに、当該分割領域のデータが格納されていることを示す。 The first entry of the division table T200 shown in FIG. 5A corresponds to the first divided area. The first entry indicates that the data of that divided area is stored in the file whose data file name T202 is “log01/001.dat”.

また、１つ目のエントリのｋｅｙ（Ｔ２０３）が「０３４ａ」であることから、最初の分割領域のｋｅｙ範囲は「０３４ａ」未満であることを示す。また、１つ目のエントリのオフセットＴ２０４が「２８０」であることから、ファイル上のオフセットが「０〜２７９」の範囲に最初の分割領域のデータが格納されていることを示す。 Since the key (T203) of the first entry is “034a”, the key range of the first divided area is less than “034a”. Since the offset T204 of the first entry is “280”, the data of the first divided area is stored at offsets “0 to 279” in the file.

また、図５Ａに示す分割テーブルＴ２００の２つ目のエントリは、対応する分割領域のｋｅｙ範囲は「０３４ａ」以上かつ「１７２ｄ」未満であり、データファイル名Ｔ２０２が「ｌｏｇ０１／００２．ｄａｔ」であることを示す。また、データファイル名Ｔ２０２が１つ目のエントリと異なるため、オフセットは「０」からカウントされる。したがって、オフセットが「０〜２１８」の範囲に対応する分割領域のデータが格納されることを示す。 The second entry of the division table T200 shown in FIG. 5A indicates that the key range of the corresponding divided area is “034a” or more and less than “172d”, and that the data file name T202 is “log01/002.dat”. Because the data file name T202 differs from that of the first entry, the offset is counted from “0”. The entry therefore indicates that the data of the corresponding divided area is stored at offsets “0 to 218”.

また、図５Ａに示す分割テーブルＴ２００の３つ目のエントリは、対応する分割領域のｋｅｙ範囲は「１７２ｄ」以上かつ「３２８ｂ」未満であり、データファイル名Ｔ２０２が「ｌｏｇ０１／００２．ｄａｔ」であることを示す。また、データファイル名Ｔ２０２が２つ目のエントリと一致するため、ファイル上のオフセットが「２１９〜４５５」の範囲に対応する分割領域のデータが格納されることを示す。 The third entry of the division table T200 shown in FIG. 5A indicates that the key range of the corresponding divided area is “172d” or more and less than “328b”, and that the data file name T202 is “log01/002.dat”. Because the data file name T202 matches that of the second entry, the entry indicates that the data of the corresponding divided area is stored at offsets “219 to 455” in the file.
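The offset rule walked through above can be sketched in Python. This is only an illustration, not the patented implementation: the `Entry` type and `region_ranges` function are hypothetical names, the first three rows use the values quoted in the text for FIG. 5A, and the fourth row (file name, undefined key and offset) is an assumption made to complete the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """One row of a division table T200 (field names are illustrative)."""
    data_file: str             # data file name T202
    end_key: Optional[str]     # key T203: end point of the area (None if undefined)
    end_offset: Optional[int]  # offset T204: exclusive end offset (None = end of file)


def region_ranges(entries: List[Entry]) -> List[Tuple[str, int, Optional[int]]]:
    """Return (data_file, start_offset, end_offset) for each divided area."""
    ranges = []
    prev = None
    for entry in entries:
        # The start is the previous entry's offset, unless the file changed,
        # in which case offsets are counted again from 0.
        same_file = prev is not None and prev.data_file == entry.data_file
        start = prev.end_offset if same_file else 0
        ranges.append((entry.data_file, start, entry.end_offset))
        prev = entry
    return ranges


log01 = [
    Entry("log01/001.dat", "034a", 280),
    Entry("log01/002.dat", "172d", 219),
    Entry("log01/002.dat", "328b", 456),
    Entry("log01/003.dat", None, None),  # assumed last row, not taken from the text
]

for file_name, start, end in region_ranges(log01):
    print(file_name, start, end)
```

The end offsets are treated as exclusive, which matches the text: an offset T204 of “280” corresponds to stored offsets 0 to 279.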

また、図４に示すデータ管理テーブルＴ１００の２つ目のエントリは、分割テーブル名Ｔ１０２が「ｌｏｇ０２．ｐａｒｔ」であり、図５Ｂに示す分割テーブルＴ２００に対応する。 Similarly, the second entry of the data management table T100 shown in FIG. 4 has “log02.part” as its division table name T102 and corresponds to the division table T200 shown in FIG. 5B.

図５Ｂに示す分割テーブルＴ２００に格納される各エントリのデータファイル名Ｔ２０２及びオフセットＴ２０４は、図５Ａに示す分割テーブルＴ２００の各エントリと異なる。しかし、両分割テーブルＴ２００の分割位置を表すｋｅｙ（Ｔ２０３）は共に一致する。 The data file name T202 and offset T204 of each entry stored in the division table T200 shown in FIG. 5B are different from the entries in the division table T200 shown in FIG. 5A. However, the keys (T203) representing the division positions of both division tables T200 coincide with each other.

本実施形態では、結合する可能性があるデータセットにおける分割領域の分割位置、すなわち、ｋｅｙ（Ｔ２０３）は必ず一致するように管理される。これによって、２つ以上のデータセットの結合処理を並列化することができる。すなわち、結合対象となるデータセットの分割テーブルＴ２００のｋｅｙ（Ｔ２０３）が同一のエントリを対応付けることが可能となり、分割領域毎に結合処理を並列して実行することが可能となる。 In the present embodiment, division positions of divided areas in a data set that can be combined, that is, keys (T203) are managed so as to always match. As a result, two or more data sets can be combined in parallel. That is, it is possible to associate entries with the same key (T203) in the partition table T200 of the data set to be combined, and it is possible to execute the combining process in parallel for each divided region.

ファイルには、図２２Ａに示したように１つのｋｅｙと１つ以上の値とから構成されるレコードが複数含まれる。また、各ファイルは、ｋｅｙに基づいてソートされた形式で、分散ファイルシステムに格納される。これによって、分割領域毎に結合処理を行う場合に、同一のｋｅｙをつき合わせてマージ結合することが可能となる。 The file includes a plurality of records composed of one key and one or more values as shown in FIG. 22A. Each file is stored in the distributed file system in a format sorted based on the key. As a result, when combining processing is performed for each divided region, it becomes possible to merge and combine the same keys.
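As a rough illustration of why key-sorted files make the per-area join cheap, the following Python sketch merge-joins two sorted record lists in a single pass. It assumes each key appears at most once per file and performs an inner join; both are assumptions for the example, and `merge_join` is a hypothetical name.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, values) records that are sorted by key.

    Assumes each key appears at most once per list.
    """
    i = j = 0
    joined = []
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1                      # key only on the left side, skip it
        elif left_key > right_key:
            j += 1                      # key only on the right side, skip it
        else:
            joined.append((left_key, left[i][1], right[j][1]))
            i += 1
            j += 1
    return joined


area_log01 = [("0012", ["a"]), ("01ff", ["b"]), ("0300", ["c"])]
area_log02 = [("0012", ["x"]), ("0300", ["y"]), ("0340", ["z"])]
print(merge_join(area_log01, area_log02))
```

Because both inputs cover the same key range, one task can join its pair of divided areas without looking at any other area.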

また、異なる分割領域のデータを格納するファイルは同一であってもよい。例えば、図５Ａでは、２つ目のエントリと３つ目のエントリとは、同一のファイルである。ただし、それぞれのエントリのｋｅｙ範囲が異なっている。 Also, the files that store the data of different divided areas may be the same. For example, in FIG. 5A, the second entry and the third entry are the same file. However, the key range of each entry is different.

前述のように図５Ａでは、ファイルの数は３つであるが、分割領域の数は４つであり、それぞれ異なる。後述するように、ファイルの数は、本データ分析システムにおけるデータ追加処理の並列度に一致する。一方、分割領域の数は、データの分析処理の並列度に依存する。したがって、ファイルの数と分割領域の数とは、それぞれ異なった処理に依存するため、両者には依存関係はなく、どのように決めてもよい。 As described above, in FIG. 5A, the number of files is three, but the number of divided areas is four, which are different from each other. As will be described later, the number of files matches the degree of parallelism of data addition processing in the data analysis system. On the other hand, the number of divided areas depends on the parallelism of data analysis processing. Accordingly, since the number of files and the number of divided areas depend on different processes, there is no dependency between them, and the number may be determined in any way.

図６は、本発明の第１の実施形態におけるパーティションテーブルＴ３００の一例を示す説明図である。 FIG. 6 is an explanatory diagram illustrating an example of the partition table T300 according to the first embodiment of this invention.

パーティションテーブルＴ３００は、新たに追加されるデータセット（素データ）を分割して、タスクを実行するデータ追加部（Ｒｅｄｕｃｅ）３４に、当該データを振り分ける際に用いられる情報を格納する。パーティションテーブルＴ３００は、ｋｅｙ（Ｔ３０１）及び宛先Ｔ３０２を含む。 The partition table T300 divides a newly added data set (raw data), and stores information used for distributing the data in a data adding unit (Reduce) 34 that executes a task. The partition table T300 includes a key (T301) and a destination T302.

ｋｅｙ（Ｔ３０１）は、入力されたデータセットの分割位置を表すｋｅｙの値を格納する。宛先Ｔ３０２は、分割されたデータセットの処理を担当するデータ追加部（Ｒｅｄｕｃｅ）３４の位置を示す宛先情報を格納する。図６に示す例では、ＩＰアドレス及びポートを含む宛先情報によってノード及び当該データ追加部（Ｒｅｄｕｃｅ）３４が指定される。 The key (T301) stores a key value representing the division position of the input data set. The destination T302 stores destination information indicating the position of the data adding unit (Reduce) 34 in charge of processing the divided data set. In the example illustrated in FIG. 6, the node and the data addition unit (Reduce) 34 are specified by destination information including an IP address and a port.

図７Ａ及び図７Ｂは、本発明の第１の実施形態におけるキーサイズテーブルＴ４００の一例を示す説明図である。 7A and 7B are explanatory diagrams illustrating an example of the key size table T400 according to the first embodiment of this invention.

キーサイズテーブルＴ４００は、分割領域のデータサイズを格納する。キーサイズテーブルＴ４００は、ｋｅｙ（Ｔ４０１）及びサイズＴ４０２を含む。 The key size table T400 stores the data size of the divided area. The key size table T400 includes a key (T401) and a size T402.

ｋｅｙ（Ｔ４０１）は、ｋｅｙ（Ｔ２０３）と同一のものである。サイズＴ４０２は、ｋｅｙ（Ｔ４０１）を分割位置とする分割領域のデータサイズを格納する。 The key (T401) is the same as the key (T203). The size T402 stores the data size of the divided area with the key (T401) as the division position.

なお、サイズＴ４０２は、結合処理の対象となる分割領域のデータサイズの合計値が格納される。 The size T402 stores the total value of the data sizes of the divided areas to be combined.

キーサイズテーブルＴ４００は、後述する結合処理及び分析処理、並びに、データ追加処理の実行時に動的に生成される。 The key size table T400 is dynamically generated when a combination process and an analysis process, which will be described later, and a data addition process are executed.

次に、データの結合処理及び分析処理について説明する。 Next, data combination processing and analysis processing will be described.

図８は、本発明の第１の実施形態におけるデータの結合処理及び分析処理を説明するフローチャートである。 FIG. 8 is a flowchart for explaining data combination processing and analysis processing according to the first embodiment of the present invention.

結合処理は、必ず分析処理と共に実行される。すなわち、結合処理によって１レコード分のデータが結合された後、当該データに対して分析処理が実行される。 The combination process is always executed together with the analysis process. That is, after the data for one record is combined by the combining process, the analysis process is executed on the data.

結合処理及び分析処理は、利用者からの指示を受信したデータ管理部２１によって実行される。なお、利用者からの指示には、結合対象であるデータセットのデータＩＤが含まれる。 The combination process and the analysis process are executed by the data management unit 21 that has received an instruction from the user. The instruction from the user includes the data ID of the data set to be combined.

まず、マスタノード２０は、処理対象となるデータセットに対応するキーサイズテーブルＴ４００を作成する（ステップＳ１０１）。 First, the master node 20 creates a key size table T400 corresponding to a data set to be processed (step S101).

具体的には、以下のような処理が実行される。 Specifically, the following processing is executed.

データ管理部２１は、利用者から送信された指示に含まれるデータＩＤに基づいて、データ管理テーブルＴ１００を検索し、対応するエントリから分割テーブル名Ｔ１０２を取得する。 The data management unit 21 searches the data management table T100 based on the data ID included in the instruction transmitted from the user, and acquires the division table name T102 from the corresponding entry.

次に、データ管理部２１は、取得された分割テーブル名Ｔ１０２に対応する分割テーブルＴ２００を取得する。 Next, the data management unit 21 acquires a partition table T200 corresponding to the acquired partition table name T102.

データ管理部２１は、取得された分割テーブルＴ２００に基づいて、分割領域毎の分割位置を示すｋｅｙの値を特定し、また、結合対象であるデータセットのデータサイズを算出する。 Based on the obtained division table T200, the data management unit 21 specifies a key value indicating the division position for each division area, and calculates the data size of the data set to be combined.

さらに、データ管理部２１は、前述の処理結果に基づいて、キーサイズテーブルＴ４００を作成する。 Further, the data management unit 21 creates the key size table T400 based on the above processing result.

例えば、データＩＤ（Ｔ１０１）が「ｌｏｇ０１」及び「ｌｏｇ０２」であるデータセットを結合する場合、対応する分割テーブルＴ２００はそれぞれ図５Ａ及び図５Ｂとなる。このとき、データ管理部２１は、前述した処理を実行することによって、分割領域毎に２つのデータセットのデータサイズを足し合わせ、図７Ａに示すようなキーサイズテーブルＴ４００を作成する。 For example, when data sets whose data IDs (T101) are “log01” and “log02” are combined, the corresponding division tables T200 are as shown in FIGS. 5A and 5B, respectively. At this time, the data management section 21 adds the data sizes of the two data sets for each divided region by executing the above-described processing, and creates a key size table T400 as shown in FIG. 7A.
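A minimal Python sketch of this summation follows. It assumes the per-area sizes of each data set have already been derived from the offsets in its division table; `build_key_size_table` is a hypothetical name. Only the first-area sizes (280 and 200) and the second and third `log01` sizes follow from offsets quoted in the text, and the remaining numbers are made up for illustration.

```python
def build_key_size_table(tables):
    """Sum per-area sizes over data sets that share the same division keys.

    tables: one list per data set, each a list of (end_key, size) pairs.
    Returns a list of (end_key, total_size) pairs, i.e. rows of a key size table.
    """
    keys = [key for key, _ in tables[0]]
    totals = [0] * len(keys)
    for table in tables:
        if [key for key, _ in table] != keys:
            raise ValueError("division keys must match across data sets")
        for index, (_, size) in enumerate(table):
            totals[index] += size
    return list(zip(keys, totals))


# None stands for the undefined end key of the last divided area.
log01_sizes = [("034a", 280), ("172d", 219), ("328b", 237), (None, 150)]
log02_sizes = [("034a", 200), ("172d", 310), ("328b", 120), (None, 90)]
print(build_key_size_table([log01_sizes, log02_sizes]))
```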

次に、マスタノード２０は、結合処理及び分析処理の組からなるタスクを複数生成し、生成された各タスクを各スレーブノード３０に割り当てることによって当該タスクを起動する（ステップＳ１０２）。 Next, the master node 20 generates a plurality of tasks including a combination process and an analysis process, and activates each task by assigning each generated task to each slave node 30 (step S102).

具体的には、処理管理部２２が、プログラムリポジトリ２４から処理に必要なプログラムを読み出し、利用者によって指定された並列数分のタスクを生成する。さらに、処理管理部２２が、生成されたタスクを各スレーブノード３０上で実行させる。 Specifically, the process management unit 22 reads a program necessary for the process from the program repository 24 and generates tasks for the number of parallels designated by the user. Further, the process management unit 22 causes the generated task to be executed on each slave node 30.

なお、当該並列数がステップＳ１０１において作成されたキーサイズテーブルＴ４００のエントリ数よりも小さい場合、当該エントリ数を並列数とし、エントリ数分のタスクをスレーブノード３０上で実行させる。 When the parallel number is smaller than the number of entries in the key size table T400 created in step S101, the number of entries is set as the parallel number, and tasks corresponding to the number of entries are executed on the slave node 30.

次に、マスタノード２０は、各タスクに分割領域を割り当てる（ステップＳ１０３）。 Next, the master node 20 assigns a divided area to each task (step S103).

具体的には、データ管理部２１は、ステップＳ１０１において作成されたキーサイズテーブルＴ４００の各エントリに対応する分割領域を、ステップＳ１０２において生成された各タスクに割り当てる。 Specifically, the data management unit 21 assigns a divided area corresponding to each entry of the key size table T400 created in step S101 to each task generated in step S102.

なお、データ管理部２１は、キーサイズテーブルＴ４００のサイズＴ４０２に基づいて、データサイズが均等になるように、各タスクに分割領域を割り当てる。 Note that the data management unit 21 allocates a divided area to each task so that the data sizes are equal based on the size T402 of the key size table T400.

前述した分割領域の割り当て方法としては、例えば、データ管理部２１が、キーサイズテーブルＴ４００のエントリをサイズＴ４０２に基づいてソートし、データサイズが大きなエントリから順に、割り当てられたデータサイズが小さいタスクへ割り当てる方法が考えられる。 One conceivable method of allocating the divided areas is for the data management unit 21 to sort the entries of the key size table T400 by the size T402 and then assign the entries, in descending order of data size, to the task whose allocated data size is currently the smallest.
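The allocation method described here is a greedy largest-first assignment. The following Python sketch shows one way it could be written, using a min-heap keyed on each task's current load; `assign_regions` and the sizes are illustrative assumptions, not part of the original description.

```python
import heapq


def assign_regions(sizes, num_tasks):
    """Greedy largest-first assignment of divided areas to tasks.

    sizes: data size (T402) of each divided area, indexed by area.
    Returns {task_index: [area_index, ...]}.
    """
    heap = [(0, task) for task in range(num_tasks)]  # (allocated size, task)
    heapq.heapify(heap)
    assignment = {task: [] for task in range(num_tasks)}
    # Visit areas from largest to smallest.
    for area in sorted(range(len(sizes)), key=lambda a: sizes[a], reverse=True):
        load, task = heapq.heappop(heap)   # task with the smallest load so far
        assignment[task].append(area)
        heapq.heappush(heap, (load + sizes[area], task))
    return assignment


print(assign_regions([480, 300, 250, 100], 2))
```

With these sizes the two tasks end up with 580 and 550 units of data, which is close to even.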

データ管理部２１は、分割領域の割り当てが終了した後、タスクが割り当てられたスレーブノード３０に対して、結合すべきファイルのデータファイル名及びオフセット位置を送信する。 After the allocation of the divided areas is completed, the data management unit 21 transmits the data file name and the offset position of the files to be combined to the slave node 30 to which the task is allocated.

例えば、図７ＡのキーサイズテーブルＴ４００の１つ目のエントリに対応する分割領域が割り当てられたタスクの場合、対応する分割テーブルＴ２００のエントリは、図５Ａ及び図５Ｂのそれぞれ１つ目のエントリである。したがって、データ管理部２１は、（データファイル名、開始位置、終了位置）＝（ｌｏｇ０１／００１．ｄａｔ，０，２８０）、（ｌｏｇ０２／００１．ｄａｔ，０，２００）を、当該タスクが割り当てられたスレーブノード３０に送信する。 For example, for a task to which the divided area corresponding to the first entry of the key size table T400 in FIG. 7A is assigned, the corresponding entries of the division tables T200 are the first entries in FIGS. 5A and 5B. Therefore, the data management unit 21 transmits (data file name, start position, end position) = (log01/001.dat, 0, 280) and (log02/001.dat, 0, 200) to the slave node 30 to which the task is assigned.

次に、マスタノード２０は、タスクが割り当てられた各スレーブノード３０に対してタスクの実行指示を送信し、処理を終了する（ステップＳ１０４）。 Next, the master node 20 transmits a task execution instruction to each slave node 30 to which the task is assigned, and ends the process (step S104).

具体的には、データ管理部２１は、タスクを割り当てた各スレーブノード３０にタスクの実行指示を送信する。 Specifically, the data management unit 21 transmits a task execution instruction to each slave node 30 to which the task is assigned.

マスタノード２０から指示を受信したスレーブノード３０は、ファイルサーバ（マスタ）２３にアクセスし、データ管理部２１から受信したデータファイル名及びオフセット位置に基づいて、指定されたファイルを、指定されたオフセット位置から読み出す。 The slave node 30 that has received the instruction from the master node 20 accesses the file server (master) 23 and, based on the data file name and offset position received from the data management unit 21, designates the designated file with the designated offset. Read from position.

各スレーブノード３０は、読み出された各ファイルのｋｅｙをつき合わせ、結合処理を実行する。さらに、スレーブノード３０は、同一のスレーブノード３０において、実行中の分析処理のタスクに１レコードずつ結合処理の結果を出力する。 Each slave node 30 matches the keys of the read files against one another and executes the combination process. The slave node 30 then outputs the result of the combination process, one record at a time, to the analysis processing task running on the same slave node 30.

例えば、図５Ａ及び図５Ｂに対応するデータセットに対する分析処理では、４つの分割領域毎にタスクが生成され、各タスクによって前述した結合処理が実行される。 For example, in the analysis process for the data set corresponding to FIGS. 5A and 5B, a task is generated for each of the four divided regions, and the above-described combining process is executed by each task.

このとき、データセット毎に分割位置が異なると、重複するキー範囲について処理が実行されてしまうため並列処理が実現できない。しかし、本実施形態では、各データセットの分割位置が同一であるため、各データセットの分割領域における結合処理を並列実行できる。 At this time, if the division position is different for each data set, parallel processing cannot be realized because the processing is executed for the overlapping key ranges. However, in this embodiment, since the division positions of the respective data sets are the same, it is possible to execute the combination processing in the divided areas of the respective data sets in parallel.

以上が、データの結合処理及び分析処理の説明である。 The above is the description of the data combination processing and analysis processing.

次にデータ追加処理について説明する。 Next, data addition processing will be described.

データ追加処理は、データ管理テーブルＴ１００及び分割テーブルＴ２００が作成されているデータセット、すなわち、分散ファイルシステムに既存のデータセットが格納されている場合に、新規データセットを追加するための処理である。 The data addition process adds a new data set when data sets for which the data management table T100 and the division table T200 have already been created, that is, existing data sets, are stored in the distributed file system.

通常、データセット毎に各分割領域のデータサイズが異なる。そのため、分割位置を修正せずに各データセットの分割領域を結合すると、分割領域間のデータサイズのばらつきが発生する。この結果、分析処理を実行するタスクの処理量にばらつきが発生し、並列処理の効率が低下する。 Usually, the data size of each divided area differs for each data set. For this reason, if the divided areas of the respective data sets are combined without correcting the dividing position, the data size varies between the divided areas. As a result, the processing amount of the task that executes the analysis processing varies, and the efficiency of parallel processing decreases.

本発明では、前述した課題を解決するため、データ追加処理時に後述する処理を実行することによって、分割領域を再分割し、各分割領域のデータサイズを平準化する。 In the present invention, in order to solve the above-described problem, a process described later is executed during the data addition process, so that the divided areas are subdivided and the data size of each divided area is leveled.

具体的には、新規データセットが追加された後、結合対象となり得る全データセットを結合させた場合の各分割領域のデータサイズが所定の基準値以下になるように分割位置が制御される。これによって、全データセット利用時に並列実行される分析処理のタスク間における処理量の差を平準化させることができる。 Specifically, after a new data set is added, the division position is controlled so that the data size of each divided area is equal to or less than a predetermined reference value when all data sets that can be combined are combined. As a result, it is possible to equalize the difference in processing amount between tasks of analysis processing executed in parallel when all data sets are used.

なお、一部のデータセットを結合する場合には、各分割領域のデータサイズは基準値以下になり、分析処理のタスク間の処理量の差は平準化される。 When a part of the data sets is combined, the data size of each divided area is equal to or smaller than the reference value, and the difference in processing amount between tasks of the analysis processing is leveled.

分割領域を再分割することによって、結合処理及び分析処理のタスク制御のオーバーヘッドが発生した場合に、割り当てられている分割領域が小さくなった場合には、当該分割領域が割り当てられるタスクに複数の分割領域が割り当てられ、１つのタスクが実行する処理量を増やすことができる。 If subdividing the divided areas makes the assigned divided areas so small that task-control overhead for the combination process and the analysis process arises, multiple divided areas can be assigned to a single task, which increases the amount of processing executed by one task.

なお、前述した所定の基準値は、タスクの処理量の差に影響することから、許容されるタスクの処理量の差に基づき決定することが望ましい。 Note that the above-described predetermined reference value affects the difference in task throughput, so it is desirable to determine the predetermined reference value based on the allowable task throughput difference.

当該基準値を小さくしすぎると分割領域の数が増えるため、データ追加処理のオーバーヘッドが増える。一方、当該基準値を大きくしすぎるとタスク間の処理量の差が大きくなり、並列処理の効率が下がる。 If the reference value is made too small, the number of divided areas increases, which increases the overhead of data addition processing. On the other hand, if the reference value is increased too much, the difference in processing amount between tasks increases, and the efficiency of parallel processing decreases.

したがって、１つのタスクが所定のデータ量を処理するときの実行時間が、タスク間の処理時間の差として許容される時間以下になるようなデータ量を所定の基準値とすればよい。 Accordingly, the predetermined reference value may be a data amount such that an execution time when one task processes a predetermined data amount is equal to or less than a time allowed as a difference in processing time between tasks.

データ追加処理で追加されるデータは、図２０に示すような形式で入力される。データ追加処理では、図２２Ａ示すような形式のデータをユーザＩＤでグループ化された形式に変換され、分散ファイルシステムに格納される。以下、図２０の形式のデータセットを素データと記載し、図２１の形式のデータを構造化データと記載する。 Data added in the data addition process is input in a format as shown in FIG. In the data addition process, data in the format as shown in FIG. 22A is converted into a format grouped by user ID and stored in the distributed file system. Hereinafter, the data set in the format shown in FIG. 20 is referred to as raw data, and the data set in FIG. 21 is referred to as structured data.

以下、図９を用いて具体的に処理について説明する。 Hereinafter, the processing will be specifically described with reference to FIG.

図９は、本発明の第１の実施形態におけるデータ追加処理を説明するフローチャートである。 FIG. 9 is a flowchart for explaining data addition processing in the first embodiment of the present invention.

利用者が、ファイルサーバ（マスタ）２３及びファイルサーバ（スレーブ）３２によって実現される分散ファイルシステムに対して、素データを入力することによってデータ追加処理が実行される。 The data addition process is executed when the user inputs raw data into the distributed file system realized by the file server (master) 23 and the file servers (slave) 32.

まず、データ管理部２１は、入力された素データをサンプリングし、ｋｅｙの出現頻度を解析する（ステップＳ２０１）。 First, the data management unit 21 samples the input raw data and analyzes the appearance frequency of the key (step S201).

具体的には、データ管理部２１は、素データに含まれるレコードをランダムにサンプリングする。データ管理部２１は、読み出されたレコードの最初のフィールドをｋｅｙとするｋｅｙの一覧を作成する。 Specifically, the data management unit 21 samples a record included in the raw data at random. The data management unit 21 creates a key list in which the first field of the read record is the key.

なお、素データは１レコードが１行の形式のデータから構成されるため、データ管理部２１は、改行コードを検出することによって１レコード分のデータを読み出すことができる。 Since the raw data is composed of data in a format in which one record is one line, the data management unit 21 can read data for one record by detecting a line feed code.

精度を向上するためにサンプリング数を増やす場合には、データ管理部２１は、サンプリング処理を並列実行してもよい。この場合、データ管理部２１は、素データをデータサイズが等しくなるように複数個に分割し、分割された素データ毎にサンプリング処理が実行される。 When increasing the number of samplings in order to improve accuracy, the data management unit 21 may execute the sampling processes in parallel. In this case, the data management unit 21 divides the raw data into a plurality of pieces so that the data sizes are equal, and a sampling process is executed for each divided raw data.

具体的には、データ管理部２１は、サンプリング処理の実行タスクを各スレーブノード３０に割り当て、さらに、当該実行タスクに分割された素データを割り当てる。データ管理部２１は、各スレーブノード３０の処理実行部３１からサンプリング処理の結果を受信し、すべてのスレーブノード３０から受信したサンプリング処理の結果を集計してｋｅｙの一覧を作成する。 Specifically, the data management unit 21 assigns execution tasks for the sampling process to the slave nodes 30 and assigns the divided raw data to those tasks. The data management unit 21 receives the sampling results from the process execution unit 31 of each slave node 30 and aggregates the results received from all the slave nodes 30 to create the key list.

次に、データ管理部２１は、作成されたｋｅｙの一覧に基づいて、素データの分割位置となるｋｅｙの値を決定する（ステップＳ２０２）。 Next, the data management unit 21 determines the value of the key that becomes the division position of the raw data based on the created list of keys (step S202).

当該分割処理は、後述するステップＳ２０４における入力された素データを出力するための分割処理であり、分割テーブルＴ２００における分割処理とは異なる処理である。 The division process is a division process for outputting input raw data in step S204 described later, and is a process different from the division process in the division table T200.

ただし、ステップＳ２０４の処理では、既存の分割位置は変更されない。したがって、素データの分割位置は、既存のデータセットの分割テーブルＴ２００の分割位置に一致させる必要がある。 However, the existing division position is not changed in the process of step S204. Therefore, the division position of the raw data needs to match the division position of the division table T200 of the existing data set.

データ管理部２１は、分割テーブルＴ２００を参照し、既存の全データセットの分割位置を含むキーサイズテーブルＴ４００を作成する。例えば、図７Ａに示すようなキーサイズテーブルＴ４００が作成される。ただし、この時点では、サイズＴ４０２には値は格納されていない。 The data management unit 21 refers to the division table T200 and creates a key size table T400 including division positions of all existing data sets. For example, a key size table T400 as shown in FIG. 7A is created. However, at this time, no value is stored in the size T402.

データ管理部２１は、サンプリングされたｋｅｙ毎に対応する分割領域を特定し、キーサイズテーブルＴ４００の対応するエントリのサイズＴ４０２に、ｋｅｙに対応するデータのデータサイズをインクリメントする。 The data management unit 21 identifies the divided area corresponding to each sampled key and adds the data size of the data for that key to the size T402 of the corresponding entry in the key size table T400.

以上のような処理によって、データ管理部２１は、サンプリングされたｋｅｙの分布を求めることができる。 Through the processing as described above, the data management unit 21 can obtain the distribution of the sampled keys.

例えば、サンプリングされたｋｅｙが「１２５ｄ」である場合、当該ｋｅｙは、「０３４ａ」以上かつ「１７２ｄ」未満であるため、ｋｅｙ（Ｔ４０１）が「１７２ｄ」であるエントリのサイズＴ４０２にｋｅｙが「１２５ｄ」であるデータのデータサイズがインクリメントされる。 For example, when the sampled key is “125d”, the key is “034a” or more and less than “172d”, so the data size of the data whose key is “125d” is added to the size T402 of the entry whose key (T401) is “172d”.
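The lookup in this example can be sketched with a binary search over the division keys. The Python below is an illustration under the assumption that keys are fixed-width strings that compare lexicographically; `add_sample` and the record size of 40 are hypothetical.

```python
from bisect import bisect_right


def add_sample(boundaries, sizes, key, record_size):
    """Add record_size to the divided area that contains key.

    boundaries: sorted end keys of all areas except the last (undefined) one.
    sizes: running size per area, one element longer than boundaries.
    Returns the index of the area that was updated.
    """
    # Area i covers keys k with boundaries[i-1] <= k < boundaries[i].
    index = bisect_right(boundaries, key)
    sizes[index] += record_size
    return index


boundaries = ["034a", "172d", "328b"]
sizes = [0, 0, 0, 0]
print(add_sample(boundaries, sizes, "125d", 40), sizes)
```

`bisect_right` is used so that a key equal to a boundary, such as “034a”, falls into the area that starts at that boundary, matching the “or more / less than” ranges in the text.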

データ管理部２１は、ｋｅｙの分布を求めた後、利用者によって指定された並列数と分割領域の数とが一致するように、キーサイズテーブルＴ４００の隣り合う分割領域をマージする。このとき、マージ後の各分割領域のデータサイズが均等になることが望ましい。 After obtaining the key distribution, the data management unit 21 merges adjacent divided areas of the key size table T400 so that the parallel number specified by the user matches the number of divided areas. At this time, it is desirable that the data size of each divided area after merging is equal.

例えば、利用者によって指定された並列数が「２」の場合、ｋｅｙの分布が図７Ｂに示すようなキーサイズテーブルＴ４００は４つの分割領域があるため、マージして２つの分割領域にする必要がある。そこで、データ管理部２１は、ｋｅｙ（Ｔ４０１）が「０３４ａ」のエントリと「１７２ｄ」のエントリとを１つの分割領域としてマージし、ｋｅｙ（Ｔ４０１）が「３２８ｂ」のエントリと空欄のエントリとを１つの分割領域としてマージする。 For example, when the number of parallels specified by the user is “2” and the key distribution is as shown in the key size table T400 of FIG. 7B, there are four divided areas, which must be merged into two. The data management unit 21 therefore merges the entry whose key (T401) is “034a” and the entry whose key is “172d” into one divided area, and merges the entry whose key (T401) is “328b” and the blank entry into another.

マージ処理が終了した後、データ管理部２１は、マージ結果をパーティションテーブルＴ３００のｋｅｙ（Ｔ３０１）に格納する。 After the merge process ends, the data management unit 21 stores the merge result in the key (T301) of the partition table T300.

なお、前述したマージ処理において、キーサイズテーブルＴ４００のエントリ数が、利用者によって指定された並列数以下の場合、マージ処理は実行されず、当該エントリ数が並列数となる。 In the merge process described above, when the number of entries in the key size table T400 is equal to or less than the number of parallels specified by the user, the merge process is not executed and the number of entries is used as the number of parallels.
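One way to merge adjacent divided areas into roughly equal groups is sketched below in Python. The description does not prescribe a specific algorithm, so this greedy cut at cumulative-size targets is an assumption, as are the name `merge_adjacent` and the sample sizes. The sketch also assumes that merging is needed only when there are more areas than the requested parallelism.

```python
def merge_adjacent(sizes, parallelism):
    """Group adjacent divided areas into `parallelism` groups of similar size.

    sizes: sampled data size per divided area, in key order.
    Returns a list of groups, each a list of consecutive area indexes.
    """
    count = len(sizes)
    if count <= parallelism:
        return [[i] for i in range(count)]  # nothing to merge
    total = sum(sizes)
    groups, current, cumulative = [], [], 0
    for index, size in enumerate(sizes):
        current.append(index)
        cumulative += size
        groups_left = parallelism - len(groups) - 1   # groups still to open
        areas_left = count - index - 1                # areas not yet placed
        if groups_left == 0:
            continue                                  # last group takes the rest
        target = total * (len(groups) + 1) / parallelism
        # Close the group at its size target, or when every remaining
        # area is needed to give each remaining group one area.
        if cumulative >= target or areas_left == groups_left:
            groups.append(current)
            current = []
    groups.append(current)
    return groups


print(merge_adjacent([100, 120, 90, 110], 2))
```

With four areas and a parallelism of 2, the first two areas and the last two areas are grouped, which mirrors the merge described for FIG. 7B.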

以上がステップＳ２０２における処理である。 The above is the process in step S202.

次に、データ管理部２１は、分析処理において結合対象となる可能性がある全データセットのデータサイズを算出する（ステップＳ２０３）。さらに、データ管理部２１は、算出結果に基づいて、キーサイズテーブルＴ４００を作成する。 Next, the data management unit 21 calculates the data size of all data sets that may be combined in the analysis process (step S203). Further, the data management unit 21 creates a key size table T400 based on the calculation result.

データ管理部２１は、データ管理テーブルＴ１００を参照して、各データセットの分割テーブル名Ｔ１０２を取得する。さらに、データ管理部２１は、取得された分割テーブル名Ｔ１０２に基づいて、対応する分割テーブルＴ２００の一覧を取得する。 The data management unit 21 refers to the data management table T100 and acquires the division table name T102 of each data set. Furthermore, the data management unit 21 acquires a list of the corresponding partition table T200 based on the acquired partition table name T102.

なお、結合対象となり得る各データセットの分割テーブルＴ２００における分割位置は一致している。したがって、分析処理において分割領域の結合を並列実行できる。 Note that the division positions in the division table T200 of the data sets that can be combined are the same. Therefore, it is possible to execute the combination of the divided areas in parallel in the analysis process.

データ管理部２１は、取得された分割テーブルＴ２００のｋｅｙ（Ｔ２０３）を含むキーサイズテーブルＴ４００を作成する。さらに、データ管理部２１は、分割テーブルＴ２００毎に各分割領域のデータサイズを算出し、作成されたキーサイズテーブルＴ４００のサイズ（Ｔ４０２）に、算出されたデータサイズを加算する。 The data management unit 21 creates a key size table T400 including the key (T203) of the acquired division table T200. Further, the data management unit 21 calculates the data size of each divided area for each division table T200, and adds the calculated data size to the size (T402) of the created key size table T400.

取得されたすべての分割テーブルＴ２００に対して同様の処理を実行することによって、分散ファイルシステム上に存在するすべての既存のデータセットに関するキーサイズテーブルＴ４００を作成できる。 By executing the same processing for all the obtained divided tables T200, the key size table T400 regarding all existing data sets existing on the distributed file system can be created.

例えば、図５Ａ及び図５Ｂに示す分割テーブルＴ２００に対して前述した処理を実行することによって、図７Ａに示すようなキーサイズテーブルＴ４００が作成される。 For example, the key size table T400 as shown in FIG. 7A is created by executing the above-described processing on the division table T200 shown in FIGS. 5A and 5B.

以上がステップＳ２０３における処理である。 The above is the process in step S203.

次に、データ管理部２１は、ステップＳ２０２におけるマージ結果を表すパーティションテーブルＴ３００に基づいて、素データに対するグルーピング処理を実行する（ステップＳ２０４）。 Next, the data management unit 21 performs a grouping process on the raw data based on the partition table T300 representing the merge result in step S202 (step S204).

ここで、グルーピング処理とは、素データに含まれるレコードをｋｅｙ（図２０に示す例ではユーザＩＤ）毎に集約する処理である。 Here, the grouping process is a process of collecting records included in the raw data for each key (user ID in the example shown in FIG. 20).
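A minimal Python sketch of the grouping step follows. The raw-data layout of FIG. 20 is not reproduced in this section, so the sketch assumes one comma-separated record per line with the key (user ID) as the first field; `group_by_key` and the sample records are hypothetical.

```python
from collections import defaultdict


def group_by_key(lines):
    """Collect the values of all records sharing a key, sorted by key.

    lines: raw records, one per line, with the key as the first
    comma-separated field (assumed layout).
    """
    groups = defaultdict(list)
    for line in lines:
        key, _, rest = line.rstrip("\n").partition(",")
        groups[key].append(rest)
    return sorted(groups.items())


raw = ["0200,login\n", "0012,view\n", "0200,logout\n"]
print(group_by_key(raw))
```

The output is one entry per key with its values in input order, which corresponds to the key-sorted, one-key-many-values record layout described for the stored files.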

グルーピング処理では、データ管理部２１、データ追加部（Ｍａｐ）３３及びデータ追加部（Ｒｅｄｕｃｅ）３４が連携して処理を実行する。 In the grouping process, the data management unit 21, the data addition unit (Map) 33, and the data addition unit (Reduce) 34 cooperate to execute the process.

データ追加部（Ｍａｐ）３３及びデータ追加部（Ｒｅｄｕｃｅ）３４は、データ管理部２１からの指示にしたがって、それぞれ並列処理を実行する。 The data adding unit (Map) 33 and the data adding unit (Reduce) 34 each execute parallel processing in accordance with an instruction from the data management unit 21.

なお、パーティションテーブルＴ３００のエントリ数が、タスクを割り当てるデータ追加部（Ｒｅｄｕｃｅ）３４の並列度となる。一方、タスクを割り当てるデータ追加部（Ｍａｐ）３３の並列度は、パーティションテーブルＴ３００のエントリ数とは無関係であり、利用者によって指定される。 Note that the number of entries in the partition table T300 is the degree of parallelism of the data adding unit (Reduce) 34 to which tasks are assigned. On the other hand, the degree of parallelism of the data adding unit (Map) 33 to which tasks are assigned is irrelevant to the number of entries in the partition table T300 and is specified by the user.

以下、データ追加部（Ｍａｐ）３３をＭａｐタスクと記載し、データ追加部（Ｒｅｄｕｃｅ）３４に割り当てるタスクをＲｅｄｕｃｅタスクとも記載する。 Hereinafter, the data adding unit (Map) 33 is referred to as a Map task, and a task assigned to the data adding unit (Reduce) 34 is also referred to as a Reduce task.

具体的には以下のような処理が実行される。 Specifically, the following processing is executed.

データ管理部２１は、利用者によって指定された並列数にしたがって、データサイズが一定となるように素データを分割する。さらに、データ管理部２１は、素データを分割して生成された分割領域の各分割位置であるオフセット位置、及び当該分割領域のデータサイズを算出する。なお、オフセット位置はレコード境界に一致するように素データの一部をスキャンして調整される。 The data management unit 21 divides the raw data according to the parallel number specified by the user so that the data size is constant. Further, the data management unit 21 calculates an offset position, which is each division position of the division area generated by dividing the raw data, and a data size of the division area. The offset position is adjusted by scanning a part of the raw data so as to coincide with the record boundary.
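The boundary adjustment can be sketched as follows in Python, assuming newline-terminated records held in a byte string. `split_offsets` is a hypothetical name, and scanning forward from each tentative cut to the next newline is one possible way to align with record boundaries.

```python
def split_offsets(data: bytes, parallelism: int):
    """Split data into about equal chunks whose starts fall on record boundaries.

    Returns a list of (offset, size) pairs that cover data without overlap.
    """
    total = len(data)
    starts = [0]
    for n in range(1, parallelism):
        tentative = total * n // parallelism
        # Move the cut forward to just after the next newline, so that the
        # previous chunk ends with a complete record.
        newline = data.find(b"\n", max(tentative - 1, 0))
        start = total if newline == -1 else newline + 1
        if starts[-1] < start < total:
            starts.append(start)
    ends = starts[1:] + [total]
    return [(s, e - s) for s, e in zip(starts, ends)]


data = b"u1,a\nu2,bb\nu3,c\nu4,dddd\n"
print(split_offsets(data, 2))
```

When records are long relative to the chunk size, fewer chunks than the requested parallelism may be returned, because empty chunks are dropped.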

データ管理部２１は、処理管理部２２と連携して、利用者によって指定された並列数分のＭａｐタスクを生成し、生成されたＭａｐタスクを各データ追加部（Ｍａｐ）３３に割り当てる。このとき、各データ追加部（Ｍａｐ）３３には、分割領域のオフセット位置、分割領域のデータサイズ、及び素データのファイル名が送信される。 The data management unit 21 cooperates with the process management unit 22 to generate as many Map tasks as the number of parallels specified by the user, and assigns the generated Map tasks to the data adding units (Map) 33. At this time, the offset position of the divided area, the data size of the divided area, and the file name of the raw data are transmitted to each data adding unit (Map) 33.

さらに、データ管理部２１は、処理管理部２２と連携して、パーティションテーブルＴ３００のエントリ数分のＲｅｄｕｃｅタスクを生成する。 Furthermore, the data management unit 21 cooperates with the process management unit 22 to generate as many Reduce tasks as there are entries in the partition table T300.

また、データ管理部２１は、パーティションテーブルＴ３００の各エントリをデータ追加部（Ｒｅｄｕｃｅ）３４と対応づける。データ管理部２１は、対応づけられた各データ追加部（Ｒｅｄｕｃｅ）３４に、ｋｅｙ（Ｔ３０１）に対応するｋｅｙ範囲の分割領域を処理するためのＲｅｄｕｃｅタスクを割り当てる。 In addition, the data management unit 21 associates each entry of the partition table T300 with the data addition unit (Reduce) 34. The data management unit 21 assigns a Reduce task for processing a divided region in the key range corresponding to the key (T301) to each associated data addition unit (Reduce) 34.

さらに、データ管理部２１は、ステップＳ２０２において作成されたキーサイズテーブルＴ４００のうち、送信されたｋｅｙ範囲に対応するエントリをデータ追加部（Ｒｅｄｕｃｅ）３４に送信する。 Further, the data management unit 21 transmits, to the data adding unit (Reduce) 34, the entries of the key size table T400 created in step S202 that correspond to the transmitted key range.

例えば、図６に示すパーティションテーブルＴ３００の最初のエントリのｋｅｙ範囲は、「１７２ｄ」未満であるため、対応するキーサイズテーブルＴ４００のエントリは、図７Ａの一つ目のエントリ及び２つ目のエントリである。したがって、データ管理部２１は、一つ目のエントリ及び２つ目のエントリを対応するデータ追加部（Ｒｅｄｕｃｅ）３４に送信する。 For example, since the key range of the first entry of the partition table T300 shown in FIG. 6 is less than “172d”, the corresponding entries of the key size table T400 are the first and second entries of FIG. 7A. Therefore, the data management unit 21 transmits the first entry and the second entry to the corresponding data adding unit (Reduce) 34.

さらに、データ管理部２１は、データ追加部（Ｒｅｄｕｃｅ）３４の宛先情報（アドレス：ポート番号）を取得し、パーティションテーブルＴ３００の対応するエントリの宛先Ｔ３０２に取得された宛先情報を格納する。 Further, the data management unit 21 acquires the destination information (address: port number) of the data adding unit (Reduce) 34 and stores the acquired destination information in the destination T302 of the corresponding entry of the partition table T300.

パーティションテーブルＴ３００が作成された後、処理管理部２２は、すべてのデータ追加部（Ｍａｐ）３３に完成したパーティションテーブルＴ３００を送信する。 After the partition table T300 is created, the process management unit 22 transmits the completed partition table T300 to all the data addition units (Map) 33.

以上がステップＳ２０４における処理である。 The above is the processing in step S204.

なお、ステップＳ２０４におけるデータ追加部（Ｍａｐ）３３及びデータ追加部（Ｒｅｄｕｃｅ）３４は、グルーピング処理が実行された後、データの出力処理を実行する。グルーピング処理の詳細については図１０を用いて後述し、また、データの出力処理の詳細については図１１を用いて後述する。 In step S204, the data adding unit (Map) 33 and the data adding unit (Reduce) 34 execute the data output process after the grouping process is executed. Details of the grouping process will be described later with reference to FIG. 10, and details of the data output process will be described later with reference to FIG.

データ管理部２１は、分割テーブルＴ２００を更新し、処理を終了する（ステップＳ２０５）。 The data management unit 21 updates the division table T200 and ends the process (step S205).

具体的には、データ管理部２１は、各データ追加部（Ｒｅｄｕｃｅ）３４から受信した分割テーブルＴ２００に基づいて、自身が管理する分割テーブルＴ２００を更新する。なお、受信した分割テーブルＴ２００は、データ追加部（Ｒｅｄｕｃｅ）３４が後述する処理（図１０及び図１１参照）が実行された後のテーブルである。 Specifically, the data management unit 21 updates the division table T200 managed by itself based on the division table T200 received from each data addition unit (Reduce) 34. The received division table T200 is a table after the data adding unit (Reduce) 34 has executed processing (see FIGS. 10 and 11) described later.

データ追加部（Ｒｅｄｕｃｅ）３４は、一部のｋｅｙ範囲のデータセットのみを処理する。本実施形態では、一つのデータ追加部（Ｒｅｄｕｃｅ）３４によって更新された分割テーブルＴ２００に基づいて、データ分析システムにおけるすべての分割テーブルＴ２００が更新される点に特徴がある。 The data adding unit (Reduce) 34 processes only a data set in a partial key range. The present embodiment is characterized in that all the division tables T200 in the data analysis system are updated based on the division table T200 updated by one data addition unit (Reduce) 34.

また、データ管理部２１は、各データ追加部（Ｒｅｄｕｃｅ）３４から受信した、入力された素データの分割テーブルＴ２００を１つにマージし、マージされたテーブルを入力された素データの分割テーブルＴ２００として管理する。 Further, the data management unit 21 merges the division tables T200 of the input raw data received from the data adding units (Reduce) 34 into one table, and manages the merged table as the division table T200 of the input raw data.

これは、ｋｅｙ範囲毎に、各データ追加部（Ｒｅｄｕｃｅ）３４において素データに対する処理が並列実行されていたため、各処理結果を集約する処理である。 This processing aggregates the individual results, because the processing on the raw data was executed in parallel by the data adding units (Reduce) 34, one key range each.

さらに、データ管理部２１は、素データの分割テーブルＴ２００に対応するエントリをデータ管理テーブルＴ１００に追加する。 Further, the data management unit 21 adds an entry corresponding to the raw data division table T200 to the data management table T100.

次に、ステップＳ２０４におけるグルーピング処理の詳細について説明する。 Next, details of the grouping process in step S204 will be described.

図１０は、本発明の第１の実施形態におけるグルーピング処理の詳細を説明するフローチャートである。 FIG. 10 is a flowchart illustrating details of grouping processing according to the first embodiment of this invention.

スレーブノード３０は、入力された素データに対してソート処理を実行する（ステップＳ３０１）。 The slave node 30 performs a sort process on the input raw data (step S301).

具体的には以下の処理が実行される。 Specifically, the following processing is executed.

データ追加部（Ｍａｐ）３３は、素データから１つずつレコードを読み出す。データ追加部（Ｍａｐ）３３は、読み出されたレコードのｋｅｙに基づいて、パーティションテーブルＴ３００からデータ追加部（Ｒｅｄｕｃｅ）３４の宛先情報を取得する。すなわち、読み出されたレコードを処理するデータ追加部（Ｒｅｄｕｃｅ）３４が特定される。 The data adding unit (Map) 33 reads records one by one from the raw data. The data adding unit (Map) 33 acquires the destination information of the data adding unit (Reduce) 34 from the partition table T300 based on the key of the read record. That is, the data adding unit (Reduce) 34 for processing the read record is specified.

データ追加部（Ｍａｐ）３３は、宛先毎に読み出された各レコードを分類する。以下、宛先ごとに分類されたレコード群をセグメントと記載する。 The data adding unit (Map) 33 classifies each record read for each destination. Hereinafter, the record group classified for each destination is referred to as a segment.

データ追加部（Ｍａｐ）３３は、自身が担当する分割された素データに含まれるすべてのレコードを読み出した後、各セグメントに含まれるレコードをｋｅｙに基づいてソートする。 The data adding unit (Map) 33 reads all the records included in the divided raw data handled by itself, and then sorts the records included in each segment based on the key.

以上がステップＳ３０１における処理である。 The above is the process in step S301.
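
The processing of step S301 can be sketched as follows. This is an illustrative sketch only: it assumes that each entry of the partition table T300 holds the exclusive upper bound of its key range together with the destination, that entries are in ascending key order, and that the last entry catches any larger key. The function name `route_and_sort` and the sample addresses used to exercise it are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def route_and_sort(records, partition_table):
    """Group (key, value) records into per-destination segments sorted by key.

    partition_table: list of (upper_bound_key, destination), ascending by key.
    """
    bounds = [upper for upper, _ in partition_table]
    segments = defaultdict(list)
    for key, value in records:
        # Index of the first entry whose upper bound is strictly greater than key.
        idx = min(bisect_right(bounds, key), len(bounds) - 1)
        destination = partition_table[idx][1]
        segments[destination].append((key, value))
    # Sort each segment by key only after every record has been classified.
    return {dest: sorted(seg, key=lambda rec: rec[0]) for dest, seg in segments.items()}
```

A binary search over the upper bounds is used here because the partition table is ordered by key; a linear scan would behave the same for small tables.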

次に、スレーブノード３０は、ソートされたセグメントをデータ追加部（Ｒｅｄｕｃｅ）３４に送信する（ステップＳ３０２）。 Next, the slave node 30 transmits the sorted segments to the data adding unit (Reduce) 34 (step S302).

具体的には、データ追加部（Ｍａｐ）３３が、ステップＳ３０１において取得された宛先情報に対応するデータ追加部（Ｒｅｄｕｃｅ）３４に、ソートされたセグメントを送信する。各データ追加部（Ｒｅｄｕｃｅ）３４は、各スレーブノード３０のデータ追加部（Ｍａｐ）３３から送信されるセグメントを受信する。 Specifically, the data adding unit (Map) 33 transmits the sorted segments to the data adding unit (Reduce) 34 corresponding to the destination information acquired in Step S301. Each data adding unit (Reduce) 34 receives a segment transmitted from the data adding unit (Map) 33 of each slave node 30.

データ追加部（Ｍａｐ）３３からセグメントを受信したスレーブノード３０は、ｋｅｙに基づいて受信したセグメントをマージし、処理を終了する（ステップＳ３０３）。 The slave node 30 that has received the segment from the data adding unit (Map) 33 merges the received segment based on the key, and ends the process (step S303).

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４が、受信したすべてのセグメントを順に読み出し、ｋｅｙが同一のセグメント同士をマージして結合する。 Specifically, the data adding unit (Reduce) 34 sequentially reads all received segments, and merges and joins the segments having the same key.

さらに、データ追加部（Ｒｅｄｕｃｅ）３４は、マージされたセグメントに含まれるレコードを、図１０に示すような構造化データに変換する。前述した処理によって、複数のレコードが、ｋｅｙが同一の１つのレコードに集約される。 Further, the data adding unit (Reduce) 34 converts the records included in the merged segment into structured data as shown in FIG. 10. By the process described above, multiple records having the same key are aggregated into a single record.
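
The merge of step S303 can be sketched as a k-way merge over the already sorted segments, followed by grouping on the key. The sketch assumes each segment is a list of (key, value) pairs sorted by key; the function name `merge_segments` and the output shape (a key with a list of values) are illustrative stand-ins for the structured data of the embodiment.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_segments(segments):
    """Merge key-sorted segments and collapse records that share a key."""
    merged = heapq.merge(*segments, key=itemgetter(0))  # lazy k-way merge
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]
```

Because each Map task has already sorted its segments, the Reduce side only needs a merge rather than a full sort.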

次に、ステップＳ２０４において、データ追加部（Ｒｅｄｕｃｅ）３４が実行するデータ出力処理について説明する。 Next, a data output process executed by the data adding unit (Reduce) 34 in step S204 will be described.

図１１は、本発明の第１の実施形態におけるデータ出力処理を説明するフローチャートである。 FIG. 11 is a flowchart for explaining data output processing according to the first embodiment of the present invention.

まず、データ出力処理の概要について説明する。 First, the outline of the data output process will be described.

データ追加部（Ｒｅｄｕｃｅ）３４は、データ出力処理を実行することによって、図２２Ａに示すような形式の構造化データを分散ファイルシステムへ出力する。並列度の数だけ、データ追加部（Ｒｅｄｕｃｅ）３４においてタスクが実行される。このとき、データ追加部（Ｒｅｄｕｃｅ）３４が出力するファイル名はそれぞれ異なる。 The data adding unit (Reduce) 34 outputs structured data having a format shown in FIG. 22A to the distributed file system by executing a data output process. As many tasks as the degree of parallelism are executed in the data adding units (Reduce) 34. At this time, each data adding unit (Reduce) 34 outputs a file with a different file name.

さらに、本発明では、データ追加部（Ｒｅｄｕｃｅ）３４は、素データのデータサイズをキーサイズテーブルＴ４００に加算して、素データが追加された後の各分割領域のデータサイズを算出する。 Furthermore, in the present invention, the data adding unit (Reduce) 34 adds the data size of the raw data to the key size table T400, and calculates the data size of each divided area after the raw data is added.

データ追加部（Ｒｅｄｕｃｅ）３４は、データサイズが所定の閾値以上である分割領域が存在する場合、分割領域の分割処理を実行する。 The data adding unit (Reduce) 34 executes a division process of the divided area when there is a divided area having a data size equal to or larger than a predetermined threshold.

データ追加部（Ｒｅｄｕｃｅ）３４は、分割領域の分割処理が実行された場合、自身が管理する既存のデータセットの分割テーブルＴ２００も更新する。さらに、データ追加部（Ｒｅｄｕｃｅ）３４は、更新された分割テーブルＴ２００をデータ管理部２１に送信する。更新された分割テーブルＴ２００に基づいて、データ管理部２１が、分割テーブルＴ２００の更新処理（ステップＳ２０５）を実行する。 The data adding unit (Reduce) 34 also updates the division table T200 of the existing data set managed by itself when the division process of the division area is executed. Further, the data adding unit (Reduce) 34 transmits the updated division table T200 to the data management unit 21. Based on the updated division table T200, the data management unit 21 executes an update process (step S205) of the division table T200.

また、データ追加部（Ｒｅｄｕｃｅ）３４は、入力された素データの分割テーブルＴ２００を作成し、処理終了後に作成された分割テーブルＴ２００をデータ管理部２１に送信する。 In addition, the data adding unit (Reduce) 34 creates a division table T200 of the input raw data, and transmits the division table T200 created after the processing is completed to the data management unit 21.

以下、各処理の詳細について説明する。 Details of each process will be described below.

まず、データ追加部（Ｒｅｄｕｃｅ）３４は、データ出力処理を開始する前に、ステップＳ２０４においてデータ管理部２１から受信したキーサイズテーブルＴ４００に含まれるｋｅｙのみが格納されたキーサイズテーブルＴ４００を作成する。ここで、作成されたキーサイズテーブルＴ４００は、素データの所定の分割領域のデータサイズが格納されるテーブルである。 First, before starting the data output process, the data adding unit (Reduce) 34 creates a key size table T400 that stores only the keys included in the key size table T400 received from the data management unit 21 in step S204. The created key size table T400 is a table that stores the data size of each predetermined divided area of the raw data.

以下、作成されたキーサイズテーブルＴ４００を追加用キーサイズテーブルＴ４００とも記載する。なお、追加用キーサイズテーブルＴ４００が作成された時点では、サイズＴ４０２の初期値は「０」に設定される。 Hereinafter, the created key size table T400 is also referred to as an additional key size table T400. At the time when the additional key size table T400 is created, the initial value of the size T402 is set to “0”.

また、データ管理部２１から受信したキーサイズテーブルＴ４００は、データ追加部（Ｒｅｄｕｃｅ）３４が担当するｋｅｙ範囲に含まれる分散ファイルシステム上の全データセットのデータサイズを管理するテーブルである。以下、当該キーサイズテーブルＴ４００を全データ用キーサイズテーブルＴ４００と記載する。 The key size table T400 received from the data management unit 21 is a table for managing the data sizes of all data sets on the distributed file system included in the key range handled by the data addition unit (Reduce) 34. Hereinafter, the key size table T400 is referred to as an all data key size table T400.

データ出力処理が開始されると、データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ３０３において作成されたレコードを出力し、前回出力されたレコードとは異なる分割領域に含まれるレコードであるか否かを判定する（ステップＳ４０１）。 When the data output process is started, the data adding unit (Reduce) 34 outputs the record created in step S303, and determines whether or not the record is included in a divided area different from the record output last time. (Step S401).

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４は、追加用キーサイズテーブルＴ４００のｋｅｙ（Ｔ４０１）を参照し、出力されたレコードが前回出力されたレコードと異なる分割領域に含まれるか否かを判定する。 Specifically, the data adding unit (Reduce) 34 refers to the key (T401) of the additional key size table T400 and determines whether the output record is included in a divided area different from that of the previously output record.

本実施形態では、ｋｅｙに基づいてソートされたレコードが順に出力されるため、出力されたレコードが所定のｋｅｙ範囲、すなわち、所定の分割領域に含まれるか否かを判定できる。 In this embodiment, since the records sorted based on the keys are output in order, it can be determined whether or not the output records are included in a predetermined key range, that is, a predetermined divided area.

なお、最初に出力されるレコードの場合、同一の分割領域に含まれると判定される。 In the case of the first output record, it is determined that the record is included in the same divided area.

異なる分割領域に含まれるレコードであると判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、前回レコードが追加された分割領域のデータサイズの確認処理を実行し（ステップＳ４０５）、ステップＳ４０２に進む。なお、データサイズの確認処理については、図１２を用いて後述する。 When it is determined that the record is included in a different divided area, the data adding unit (Reduce) 34 executes a process for confirming the data size of the divided area to which the previous record has been added (step S405), and proceeds to step S402. . The data size confirmation process will be described later with reference to FIG.

同一の分割領域に含まれるレコードであると判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ３０３において作成されたレコードを分散ファイルシステムに書き込む（ステップＳ４０２）。 When it is determined that the records are included in the same divided area, the data adding unit (Reduce) 34 writes the record created in step S303 to the distributed file system (step S402).

このとき、データ追加部（Ｒｅｄｕｃｅ）３４は、書き込まれたレコードのｋｅｙの値、レコードが書き込まれたファイル上のオフセット位置、及びレコードのデータサイズを含むレコード統計情報を作成し、作成されたレコード統計情報を保存する。これは、素データのレコード統計情報である。 At this time, the data adding unit (Reduce) 34 creates record statistical information including the key value of the written record, the offset position on the file in which the record is written, and the data size of the record, and the created record Save statistical information. This is record statistical information of raw data.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、キーサイズテーブルＴ４００を更新する（ステップＳ４０３）。 Next, the data adding unit (Reduce) 34 updates the key size table T400 (step S403).

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ４０２において書き込まれたレコードのｋｅｙが含まれるｋｅｙ範囲の分割領域を特定する。データ追加部（Ｒｅｄｕｃｅ）３４は、特定された分割領域に対応するエントリを、追加用キーサイズテーブルＴ４００及び全データ用キーサイズテーブルＴ４００から検索する。さらに、データ追加部（Ｒｅｄｕｃｅ）３４は、各キーサイズテーブルＴ４００の対応するエントリのサイズＴ４０２に、書き込まれたレコードのデータサイズを加算する。 Specifically, the data adding unit (Reduce) 34 identifies the divided area whose key range includes the key of the record written in step S402. The data adding unit (Reduce) 34 searches the additional key size table T400 and the all-data key size table T400 for the entry corresponding to the identified divided area. Further, the data adding unit (Reduce) 34 adds the data size of the written record to the size T402 of the corresponding entry in each key size table T400.

データ追加部（Ｒｅｄｕｃｅ）３４は、すべてのレコードを出力したか否かを判定する（ステップＳ４０４）。 The data adding unit (Reduce) 34 determines whether all records have been output (step S404).

すべてのレコードが出力されていないと判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ４０１に戻り、同様の処理を実行する。 If it is determined that not all records have been output, the data adding unit (Reduce) 34 returns to step S401 and executes the same processing.

すべてのレコードが出力されたと判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、最後の分割領域に対するデータサイズの確認処理を実行し、処理を終了する（ステップＳ４０６）。なお、ステップＳ４０６におけるデータサイズの確認処理は、ステップＳ４０５と同一の処理である。 If it is determined that all the records have been output, the data adding unit (Reduce) 34 executes a data size confirmation process for the last divided area, and ends the process (step S406). Note that the data size confirmation processing in step S406 is the same processing as step S405.
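
The loop of steps S401 to S406 can be summarized in the following sketch. It is a simplification under stated assumptions: the write to the distributed file system is replaced by appending to an in-memory list, the size check of steps S405 and S406 is reduced to a comparison against a threshold (the re-division itself is omitted), and the divided areas are given as a list of exclusive upper-bound keys. All names are hypothetical.

```python
from bisect import bisect_right


def output_records(records, bounds, existing_sizes, threshold):
    """Write key-sorted records and report divided areas that exceed the threshold.

    records: list of (key, payload_bytes), sorted by key.
    bounds: exclusive upper-bound key of each divided area, ascending.
    existing_sizes: size of each area before the raw data is added.
    """
    added = [0] * len(bounds)          # additional key size table, initial size 0
    total = list(existing_sizes)       # all-data key size table
    written, oversized = [], []
    previous = None
    for key, payload in records:
        area = min(bisect_right(bounds, key), len(bounds) - 1)
        # S401/S405: on entering a new area, check the size of the area just left.
        if previous is not None and area != previous and total[previous] > threshold:
            oversized.append(previous)
        written.append((key, payload))  # S402: stand-in for the file write
        added[area] += len(payload)     # S403: update both key size tables
        total[area] += len(payload)
        previous = area
    # S406: the last area is checked once all records have been output.
    if previous is not None and total[previous] > threshold:
        oversized.append(previous)
    return added, total, oversized
```

Checking an area only when the output moves past it works because the records arrive sorted by key, so no further record can fall into an area that has already been left.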

図１２は、本発明の第１の実施形態におけるデータサイズの確認処理を説明するフローチャートである。 FIG. 12 is a flowchart for explaining data size confirmation processing according to the first embodiment of this invention.

データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ４０３において更新された全データ用キーサイズテーブルＴ４００を参照し、対象となる分割領域のデータサイズが所定の基準値より大きいか否かを判定する（ステップＳ５０１）。すなわち、素データが追加された分割領域が、所定の基準値より大きいか否かが判定される。 The data adding unit (Reduce) 34 refers to the all-data key size table T400 updated in step S403, and determines whether or not the data size of the target divided area is larger than a predetermined reference value (step S501). That is, it is determined whether or not the divided area to which the raw data has been added is larger than the predetermined reference value.

ここで、対象となる分割領域とは、前回入力されたレコードが含まれる分割領域である。以下、対象となる分割領域を、対象領域とも記載する。 Here, the target divided area is a divided area including a previously input record. Hereinafter, the target divided area is also referred to as a target area.

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４は、全データ用キーサイズテーブルＴ４００の対応するエントリのサイズＴ４０２を参照し、対象領域のデータサイズが所定の基準値より大きいか否かを判定する。 Specifically, the data adding unit (Reduce) 34 refers to the size T402 of the corresponding entry in the all-data key size table T400, and determines whether or not the data size of the target area is larger than the predetermined reference value.

対象領域のデータサイズが所定の基準値以下であると判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ５０６に進む。 If it is determined that the data size of the target area is equal to or smaller than the predetermined reference value, the data adding unit (Reduce) 34 proceeds to step S506.

対象領域のデータサイズが所定の基準値より大きいと判定された場合、データ追加部（Ｒｅｄｕｃｅ）３４は、マスタノード２０から既存のデータセットの分割テーブルＴ２００を取得する（ステップＳ５０２）。 When it is determined that the data size of the target area is larger than the predetermined reference value, the data adding unit (Reduce) 34 acquires the existing data set division table T200 from the master node 20 (step S502).

ここで、ステップＳ２０３においてマスタノード２０が取得したすべての分割テーブルＴ２００が取得される。なお、データ追加部（Ｒｅｄｕｃｅ）３４は、マスタノード２０から取得された分割テーブルＴ２００をキャッシュとして保存してもよい。 Here, all the division tables T200 acquired by the master node 20 in step S203 are acquired. The data adding unit (Reduce) 34 may store the division table T200 acquired from the master node 20 as a cache.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、取得された各分割テーブルＴ２００における対象領域の終了位置、すなわち、オフセットを特定する（ステップＳ５０３）。 Next, the data adding unit (Reduce) 34 specifies the end position of the target region in each acquired division table T200, that is, the offset (step S503).

データ追加部（Ｒｅｄｕｃｅ）３４は、対象領域のｋｅｙに基づいて、取得された各分割テーブルＴ２００を参照して、対象領域に対応するエントリを取得する。すなわち、対象領域に対応するデータのデータファイル名Ｔ２０２及びオフセットＴ２０４が取得される。なお、当該処理は、ステップＳ５０２において取得されたすべての分割テーブルＴ２００に対して実行される。 The data adding unit (Reduce) 34 refers to each obtained division table T200 based on the key of the target area, and acquires an entry corresponding to the target area. That is, the data file name T202 and the offset T204 of the data corresponding to the target area are acquired. This process is executed for all the divided tables T200 acquired in step S502.

例えば、ステップＳ５０１において図１３に示すような全データ用キーサイズテーブルＴ４００であり、最初のエントリに対応する分割領域のデータサイズが所定の基準値より大きい場合、データ追加部（Ｒｅｄｕｃｅ）３４は、図５Ａ及び図５Ｂに示す分割テーブルＴ２００の１つ目のエントリから情報を取得する。 For example, if the all-data key size table T400 in step S501 is as shown in FIG. 13 and the data size of the divided area corresponding to the first entry is larger than the predetermined reference value, the data adding unit (Reduce) 34 acquires information from the first entry of each division table T200 shown in FIGS. 5A and 5B.

この場合、図５Ａでは（データファイル名、オフセット）＝（／ｌｏｇ０１／００１．ｄａｔ，２８０）となり、図５Ｂでは（／ｌｏｇ０２／００２．ｄａｔ，２００）となる。取得されたオフセットが、各分割テーブルＴ２００における対象領域の終了位置となる。 In this case, (data file name, offset) = (/ log01 / 001.dat, 280) in FIG. 5A, and (/log02/002.dat, 200) in FIG. 5B. The acquired offset is the end position of the target area in each division table T200.

なお、対象領域の開始位置は、１つ目のエントリであるため開始位置のオフセットは「０」である。 Note that, because the target area corresponds to the first entry, the offset of its start position is “0”.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、各既存のデータセットの対象領域に含まれるレコードを解析する（ステップＳ５０４）。 Next, the data adding unit (Reduce) 34 analyzes the records included in the target area of each existing data set (step S504).

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４は、各既存のデータセットの対象領域に含まれるレコードを読み出す。例えば、データＩＤ（Ｔ１０１）が「ｌｏｇ０１」及び「ｌｏｇ０２」のデータセットがある場合に、「ｌｏｇ０１」のデータセットの対象領域からレコードが読み出され、また、「ｌｏｇ０２」のデータセットの対象領域からレコードが読み出される。 Specifically, the data adding unit (Reduce) 34 reads the records included in the target area of each existing data set. For example, when there are data sets whose data IDs (T101) are “log01” and “log02”, records are read from the target area of the “log01” data set and from the target area of the “log02” data set.

データ追加部（Ｒｅｄｕｃｅ）３４は、読み出されたレコードのｋｅｙ、レコードのデータサイズ、及びレコードのファイル上のオフセット位置を含むレコード統計情報を取得する。 The data adding unit (Reduce) 34 acquires record statistical information including the key of the read record, the data size of the record, and the offset position of the record on the file.

なお、既存のデータセットは複数存在するため、当該レコードの解析処理をデータセット毎に並列実行してもよい。 Since there are a plurality of existing data sets, the analysis processing of the record may be executed in parallel for each data set.

データ追加部（Ｒｅｄｕｃｅ）３４は、ステップＳ４０２において取得された素データのレコード統計情報と、既存データセットのレコード統計情報とを合わせて、分散ファイルシステム上における全データセットのレコード統計情報とする。 The data adding unit (Reduce) 34 combines the record statistical information of the raw data acquired in step S402 and the record statistical information of the existing data set to obtain record statistical information of all data sets on the distributed file system.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、作成された全データセットのレコード統計情報に基づいて、再分割する分割位置となるｋｅｙの値を決定する（ステップＳ５０５）。 Next, the data adding unit (Reduce) 34 determines the value of the key to be a division position for re-division based on the record statistical information of all created data sets (step S505).

データ追加部（Ｒｅｄｕｃｅ）３４は、全データセットのレコード統計情報に基づいて、対象領域におけるデータサイズを算出する。 The data adding unit (Reduce) 34 calculates the data size in the target area based on the record statistical information of all data sets.

データ追加部（Ｒｅｄｕｃｅ）３４は、算出されたデータサイズ及び所定の基準値に基づいて、対象領域における分割数を算出する。 The data adding unit (Reduce) 34 calculates the number of divisions in the target area based on the calculated data size and a predetermined reference value.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、対象領域のデータサイズを、算出された分割数で除算して、再分割後の分割領域のデータサイズを算出する。 Next, the data adding unit (Reduce) 34 divides the data size of the target area by the calculated number of divisions, and calculates the data size of the divided areas after re-division.

データ追加部（Ｒｅｄｕｃｅ）３４は、全データセットのレコード統計情報のエントリをｋｅｙでソートした後、レコードのデータサイズの累積値分布を算出する。すなわち、分散ファイルシステムにおける所定のｋｅｙ範囲に含まれる各レコードのデータサイズの分布が算出される。 The data adding unit (Reduce) 34 sorts the entries of the record statistical information of all data sets by key and then calculates the cumulative distribution of the record data sizes. That is, the distribution of the data sizes of the records included in the predetermined key range in the distributed file system is calculated.

データ追加部（Ｒｅｄｕｃｅ）３４は、算出された累積値分布に基づいて、レコードのデータサイズが分割後の分割領域のデータサイズの整数倍になっている地点を再分割の分割位置として決定する。整数倍になっていない場合、当該データサイズと最も近いレコードが分割位置として決定される。 Based on the calculated cumulative distribution, the data adding unit (Reduce) 34 determines, as the division positions for re-division, the points at which the cumulative data size of the records reaches an integer multiple of the data size of a divided area after re-division. If no record falls exactly on an integer multiple, the record whose cumulative data size is closest to that value is determined as the division position.

再分割位置のｋｅｙは、データとして存在するｋｅｙを使ってもよいし、データとして存在しないｋｅｙを使ってもよい。 As the key of the subdivision position, a key that exists as data may be used, or a key that does not exist as data may be used.
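
The selection of division positions in step S505 can be sketched as follows. The embodiment states only that the number of divisions is calculated from the data size and the reference value; rounding up the ratio is an assumption made here. The sketch also assumes that sizes are first summed per key across data sets so that a division never falls between records sharing a key, and it uses the key of the record following the chosen point as the exclusive upper bound of the new area. The function name `choose_split_keys` is hypothetical.

```python
import math
from itertools import accumulate


def choose_split_keys(stats, reference_size):
    """Pick division keys for one target area.

    stats: list of (key, size) for every record of every data set in the area.
    Returns the exclusive upper-bound keys of the new areas, except the last.
    """
    per_key = {}
    for key, size in stats:
        per_key[key] = per_key.get(key, 0) + size
    ordered = sorted(per_key.items())                 # sort by key
    cumulative = list(accumulate(size for _, size in ordered))
    total = cumulative[-1] if cumulative else 0
    divisions = math.ceil(total / reference_size)     # assumed rounding rule
    if divisions < 2:
        return []
    target = total / divisions                        # size of one new area
    split_keys = []
    for multiple in range(1, divisions):
        goal = multiple * target
        # Record whose cumulative size is closest to the integer multiple.
        i = min(range(len(cumulative)), key=lambda j: abs(cumulative[j] - goal))
        if i + 1 < len(ordered):
            candidate = ordered[i + 1][0]
            if not split_keys or candidate > split_keys[-1]:
                split_keys.append(candidate)
    return split_keys
```

Using a key that exists in the data is one of the two options the description allows; a key that does not exist in the data but lies between the two neighboring records would serve equally well.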

データ追加部（Ｒｅｄｕｃｅ）３４は、全データセットのレコード統計情報を参照して、決定された各ｋｅｙ範囲に対応するオフセットを特定する。 The data adding unit (Reduce) 34 specifies the offset corresponding to each determined key range with reference to the record statistical information of all data sets.

データ追加部（Ｒｅｄｕｃｅ）３４は、各分割テーブルＴ２００に再分割後の分割領域に対応するエントリを追加する。また、データ追加部（Ｒｅｄｕｃｅ）３４は、各分割テーブルＴ２００から再分割前の分割領域に対応するエントリを削除する。 The data adding unit (Reduce) 34 adds an entry corresponding to the divided area after re-division to each division table T200. Further, the data adding unit (Reduce) 34 deletes the entry corresponding to the divided area before the re-division from each division table T200.

例えば、ｋｅｙ範囲が「０３４ａ」未満である分割領域が、ｋｅｙ範囲が「０１５ｄ」未満である分割領域と、ｋｅｙ範囲が「０１５ｄ」以上かつ「０３４ａ」未満である分割領域との２つの分割領域に分割された場合、図５Ａ及び図５Ｂに示す分割テーブルＴ２００は、図１４Ａ及び図１４Ｂのように変更される。図中の太線で示した部分が変更箇所である。 For example, when a divided area whose key range is less than “034a” is divided into two divided areas, one whose key range is less than “015d” and one whose key range is “015d” or more and less than “034a”, the division tables T200 shown in FIGS. 5A and 5B are changed as shown in FIGS. 14A and 14B. The portions indicated by thick lines in the figures are the changed portions.
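
The table update for one data set can be sketched as below. The dictionary layout of an entry (`key`, `file`, `offset`), the function name `resplit_entry`, and every offset value used to exercise it other than 280 are illustrative assumptions; the actual values of FIGS. 14A and 14B are not reproduced here. The offset of an entry is treated as the end position of its divided area, as in the description of step S503.

```python
def resplit_entry(table, old_key, split_keys, stats):
    """Replace one division-table entry with entries for the re-divided areas.

    table: list of {"key", "file", "offset"} dicts, ascending by key.
    stats: (key, offset, size) of each record of this data set in the target area.
    """
    idx = next(i for i, entry in enumerate(table) if entry["key"] == old_key)
    old = table[idx]
    # The area starts where the previous area of the same file ends, or at 0.
    same_file = idx > 0 and table[idx - 1]["file"] == old["file"]
    start = table[idx - 1]["offset"] if same_file else 0
    new_entries = []
    for split_key in split_keys:
        ends = [offset + size for key, offset, size in stats if key < split_key]
        new_entries.append(
            {"key": split_key, "file": old["file"], "offset": max(ends, default=start)}
        )
    new_entries.append(dict(old))  # the last new area keeps the old upper bound
    return table[:idx] + new_entries + table[idx + 1:]
```

The same split keys are applied to the division table of every data set, with offsets taken from that data set's own record statistics, which is what keeps the division position keys identical across data sets.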

データ追加部（Ｒｅｄｕｃｅ）３４は、レコード統計情報に基づいて、追加用キーサイズテーブルＴ４００及び全データ用キーサイズテーブルＴ４００も変更する。 The data adding unit (Reduce) 34 also changes the additional key size table T400 and the all-data key size table T400 based on the record statistical information.

例えば、再分割前の全データ用キーサイズテーブルＴ４００が図１３に示すテーブルである場合、図１５に示すように変更される。図中の太線で示した部分が変更箇所である。 For example, when the all-data key size table T400 before re-division is the table shown in FIG. 13, it is changed as shown in FIG. 15. The portions indicated by thick lines in the figure are the changed portions.

以上がステップＳ５０５の処理である。 The above is the process of step S505.

次に、データ追加部（Ｒｅｄｕｃｅ）３４は、分割テーブルＴ２００を更新する（ステップＳ５０６）。 Next, the data adding unit (Reduce) 34 updates the division table T200 (step S506).

具体的には、データ追加部（Ｒｅｄｕｃｅ）３４は、追加用キーサイズテーブル及び素データのレコード統計情報に基づいて、素データの分割テーブルＴ２００に対応する分割領域のエントリを格納する。すなわち、素データの分割テーブルＴ２００が生成される。 Specifically, the data adding unit (Reduce) 34 stores the entry of the divided region corresponding to the raw data division table T200 based on the key size table for addition and the record statistical information of the raw data. That is, the raw data division table T200 is generated.

なお、再分割処理が実行された場合には、新たに分割された分割領域に対応するエントリが格納される。 When the re-division process is executed, an entry corresponding to the newly divided division area is stored.

データ追加部（Ｒｅｄｕｃｅ）３４は、前述した処理に用いたレコード統計情報を削除し、処理を終了する（ステップＳ５０７）。 The data adding unit (Reduce) 34 deletes the record statistical information used in the above-described processing, and ends the processing (step S507).

［第２の実施形態］ [Second Embodiment]

第１の実施形態では、ファイルの内容は１つのファイルに保存されているため分析処理時に不要なデータも読み出される可能性がある。これに対して、第２の実施形態では、データ項目（列）毎に異なるファイルとして保存する方式を用いる。当該方式を用いることによって、分析処理時に必要な項目のみ読み出すことが可能となる。 In the first embodiment, since the contents of the file are stored in one file, unnecessary data may be read during the analysis process. On the other hand, in the second embodiment, a method of saving as a different file for each data item (column) is used. By using this method, it is possible to read out only the items necessary for the analysis process.

本発明は、データ項目毎に異なるファイルに保存する格納方式（列分割格納方式）にも対応することが可能である。 The present invention can also support a storage method (column division storage method) in which data items are stored in different files.

以下、第１の実施形態との差異を中心に第２の実施形態について説明する。 Hereinafter, the second embodiment will be described focusing on differences from the first embodiment.

第２の実施形態では、データ分析システムの構成は第１の実施形態と同一であるため説明を省略する。また、マスタノード２０及びスレーブノード３０のハードウェア構成及びソフトウェア構成も第１の実施形態と同一であるため説明を省略する。 In the second embodiment, since the configuration of the data analysis system is the same as that of the first embodiment, the description thereof is omitted. Further, the hardware configuration and software configuration of the master node 20 and the slave node 30 are the same as those in the first embodiment, and thus description thereof is omitted.

図１６は、本発明の第２の実施形態におけるレコードのスキーマを示す説明図である。図１７は、本発明の第２の実施形態におけるレコードの一例を示す説明図である。 FIG. 16 is an explanatory diagram illustrating a record schema according to the second embodiment of this invention. FIG. 17 is an explanatory diagram illustrating an example of a record according to the second embodiment of this invention.

第１の実施形態のレコードに対して、第２の実施形態のレコードにはユーザの年齢が新たに含まれる。 In contrast to the record of the first embodiment, the record of the second embodiment newly includes the user's age.

レコードの項目は、ユーザＩＤ、移動履歴（位置Ｘ、位置Ｙ、タイムスタンプの履歴）、及び年齢の３種類があり、本実施形態ではユーザＩＤがｋｅｙとして使用される。 There are three types of record items: user ID, movement history (location X, location Y, timestamp history), and age. In this embodiment, the user ID is used as the key.

図１８Ａ、図１８Ｂ及び図１８Ｃは、本発明の第２の実施形態におけるファイルを示す説明図である。 18A, 18B, and 18C are explanatory diagrams showing files in the second embodiment of the present invention.

図１８Ａ、図１８Ｂ及び図１８Ｃでは、列分割方式を用いて前述したデータがファイルに格納された例を表す。 18A, 18B, and 18C show an example in which the above-described data is stored in a file using the column division method.

図１８Ａ、図１８Ｂ及び図１８Ｃに示すように、ユーザＩＤはｌｏｇ／００１．ｋｅｙ．ｄａｔ（図１８Ａ）、移動履歴はｌｏｇ／００１．ｒｅｃ．ｄａｔ（図１８Ｂ）、年齢はｌｏｇ／００１．ａｇｅ．ｄａｔ（図１８Ｃ）というファイルにそれぞれ格納される。 As shown in FIGS. 18A, 18B, and 18C, the user ID is stored in the file log/001.key.dat (FIG. 18A), the movement history in log/001.rec.dat (FIG. 18B), and the age in log/001.age.dat (FIG. 18C).

データを読み出すときは、各ファイルの上から順にレコードが１つずつ読み出され、順に結合すれば図１７に示したレコード全体を再構成することができる。 When data is read, the records are read one by one from the top of each file, and the entire record shown in FIG. 17 can be reconstructed by combining them in order.
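
Reading the column files in step can be sketched as follows. The sketch assumes that each column file has already been parsed into a list with one element per record, in the same record order; the function name `rebuild_records` is hypothetical, and the field names `uid`, `rec`, and `age` follow the item names used in this embodiment.

```python
def rebuild_records(key_column, rec_column, age_column):
    """Recombine column-wise stored items into whole records, position by position."""
    if not (len(key_column) == len(rec_column) == len(age_column)):
        raise ValueError("column files must contain the same number of records")
    return [
        {"uid": uid, "rec": rec, "age": age}
        for uid, rec, age in zip(key_column, rec_column, age_column)
    ]
```

An analysis that needs only some items would pass only the corresponding columns through a similar positional join, which is the benefit of the column division storage method.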

図１８Ａ、図１８Ｂ及び図１８Ｃに示す例では、ファイルは１セットのみであるが、データが定期的に蓄積されていくことによって、ユーザＩＤ、移動履歴、及び年齢に対応するファイルを含むデータセットが増加する。 In the examples shown in FIGS. 18A, 18B, and 18C, there is only one set of files. However, a data set that includes files corresponding to user ID, movement history, and age by periodically accumulating data. Will increase.

実際の結合処理及び分析処理では並列して実行されるため、前述のファイルが分割された後、各スレーブノード３０によって処理が実行される。 Since the actual combining process and analysis process are executed in parallel, the process is executed by each slave node 30 after the file is divided.

図１９は、本発明の第２の実施形態における分割テーブルＴ２００の一例を示す説明図である。 FIG. 19 is an explanatory diagram illustrating an example of a division table T200 according to the second embodiment of this invention.

第２の実施形態における分割テーブルＴ２００は、項目毎（ユーザＩＤ、移動履歴、及び年齢）にデータファイル名Ｔ２０２及びオフセットＴ２０４を格納する点が第１の実施形態と異なる。また、ｋｅｙとして使用される項目には、ｋｅｙ（Ｔ２０３）に分割位置を表すｋｅｙの値が格納される。 The division table T200 in the second embodiment is different from the first embodiment in that the data file name T202 and the offset T204 are stored for each item (user ID, movement history, and age). In the item used as the key, a key value representing a division position is stored in the key (T203).

次に、第２の実施形態における結合処理及び分析処理について第１の実施形態との相違点を中心に説明する。 Next, the combination process and the analysis process in the second embodiment will be described focusing on the differences from the first embodiment.

ステップＳ１０１では、キーサイズテーブルＴ４００が作成される場合に、データ管理部２１が、分割テーブルＴ２００の中で分析処理に用いる項目のオフセットを参照して、各分割領域のサイズを計算する。 In step S101, when the key size table T400 is created, the data management unit 21 calculates the size of each divided region with reference to the offset of the item used for analysis processing in the divided table T200.

例えばユーザＩＤと年齢のみを使用する分析を行う場合は、「ｕｉｄ」のオフセットと「ａｇｅ」のオフセットのみを使ってキーサイズテーブルのサイズを求める。このとき、「ｒｅｃ」についてのオフセットは使用されない。 For example, when an analysis uses only the user ID and the age, the sizes stored in the key size table are obtained using only the “uid” offsets and the “age” offsets. The offsets for “rec” are not used.

これによって、一部の項目のみ利用する場合であっても、各分割領域のデータサイズを正確に算出できる。 Thereby, even when only some items are used, the data size of each divided region can be accurately calculated.
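
The size calculation for a subset of items can be sketched as below. It assumes a division table layout in which each entry holds a key and, per item, the end offset of the divided area within that item's file, and that all areas of an item lie in a single file so that consecutive offsets can be subtracted. The layout, the function name `area_sizes`, and the sample numbers used to exercise it are illustrative.

```python
def area_sizes(division_table, items):
    """Compute per-area sizes from the end offsets of the selected items only.

    division_table: list of {"key": str, "offsets": {item: end_offset}}, ascending.
    items: item names used by the analysis, e.g. ["uid", "age"].
    """
    previous_end = {item: 0 for item in items}
    sizes = []
    for entry in division_table:
        size = 0
        for item in items:
            end = entry["offsets"][item]
            size += end - previous_end[item]   # bytes of this item in this area
            previous_end[item] = end
        sizes.append((entry["key"], size))
    return sizes
```

Leaving out the offsets of unused items (here `rec`) keeps the estimate aligned with the bytes that the analysis tasks will actually read.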

また、Ｓ１０４では、タスクが割り当てられた各スレーブノード３０が、分析処理に用いるファイル数と、分析処理に用いる項目数との積の数分だけ、ファイルが読み出される。 In S104, each slave node 30 to which a task is assigned reads files as many as the product of the number of files used for analysis processing and the number of items used for analysis processing.

データ追加処理についても以下のような相違がある。 The data addition process also has the following differences.

ステップＳ２０３では、データ管理部２１が、結合する可能性があるすべてのデータセットの分割テーブルＴ２００の項目毎のオフセットから、既存データセットのキーサイズテーブルＴ４００を作成する。 In step S203, the data management unit 21 creates the key size table T400 of the existing data set from the offset for each item of the division table T200 of all data sets that may be combined.

ステップＳ４０２では、各レコードをファイル出力するとき、項目毎に別のファイルに出力される。したがって、ステップＳ４０２では項目毎に、書き込まれたレコードのｋｅｙの値、書き込まれたファイル上のオフセット、及びデータサイズを含むレコード統計情報が保存される。 In step S402, when each record is output to a file, each record is output to a separate file. Therefore, in step S402, record statistical information including the key value of the written record, the offset on the written file, and the data size is stored for each item.

また、ステップＳ４０３では、全項目の分割領域のサイズの和をキーサイズテーブルＴ４００の対応するエントリに加算される。 In step S403, the sum of the sizes of the divided areas of all items is added to the corresponding entry in the key size table T400.

Ｓ５０６では、前述したレコード統計情報及びキーサイズテーブルＴ４００を用いて、項目毎に分割位置のオフセット値を求めて分割テーブルＴ２００を更新する。 In S506, using the record statistical information and the key size table T400 described above, the offset value of the division position is obtained for each item, and the division table T200 is updated.

Ｓ５０４では、データに含まれる全項目に対応するファイルが読み出され、項目毎に、ファイルに書き込まれたレコードのｋｅｙの値、書き込んだファイル上のオフセット位置、及びデータサイズを含むレコード統計情報が保存される。 In S504, a file corresponding to all items included in the data is read, and for each item, record statistical information including the key value of the record written in the file, the offset position on the written file, and the data size is obtained. Saved.

In S505, the data adding unit (Reduce) 34 determines the keys of the division positions, using the sum of the data sizes of the divided areas of all items as the data size of the data set.

In S506, the data adding unit (Reduce) 34 uses the determined keys and the record statistical information to calculate the offset of the division position for each item, and updates the division table T200.
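The flow of S504 to S506 can be illustrated with a short sketch. It rests on several assumptions that the text does not fix: record statistical information is modeled as a key-sorted list of `(key, offset, size)` tuples per item, the number of divisions is taken as the total size divided by the threshold and rounded up, and the function names are hypothetical.

```python
import math
from bisect import bisect_left


def choose_split_keys(stats_by_item, threshold):
    """S505 sketch: pick division keys from sizes summed over all items."""
    size_by_key = {}
    for records in stats_by_item.values():
        for key, _offset, size in records:
            size_by_key[key] = size_by_key.get(key, 0) + size
    total = sum(size_by_key.values())
    parts = math.ceil(total / threshold)  # assumed rule for the division count
    if parts <= 1:
        return []
    target = total / parts  # data size aimed at for each new area
    split_keys, running = [], 0
    for key in sorted(size_by_key):
        # Start a new area at this key once the bytes before it reach
        # the next multiple of the target size.
        if len(split_keys) < parts - 1 and running >= target * (len(split_keys) + 1):
            split_keys.append(key)
        running += size_by_key[key]
    return split_keys


def split_offsets(records, split_keys):
    """S506 sketch: per-item file offset at which each new area begins."""
    keys = [key for key, _offset, _size in records]
    end = records[-1][1] + records[-1][2] if records else 0
    offsets = []
    for split_key in split_keys:
        i = bisect_left(keys, split_key)
        offsets.append(records[i][1] if i < len(records) else end)
    return offsets


# Hypothetical statistics for three items, one file per item.
stats = {
    "item_a": [(1, 0, 40), (2, 40, 40), (3, 80, 40), (4, 120, 40)],
    "item_b": [(1, 0, 10), (2, 10, 10), (3, 20, 10), (4, 30, 10)],
    "item_c": [(1, 0, 50), (2, 50, 50), (3, 100, 50), (4, 150, 50)],
}

split_keys = choose_split_keys(stats, threshold=200)
offsets = {item: split_offsets(records, split_keys) for item, records in stats.items()}
print(split_keys, offsets)
```

The division key is chosen once from the combined size, and each item then contributes its own offset for that key, because the same key lands at a different byte position in each item's file.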

The second embodiment has been described for the case where three items are processed; however, any number of items can be handled by changing the number of items managed in the division table T200.

According to one aspect of the present invention, because the division positions of all data sets are the same, the data analysis system can execute the join processing within the analysis processing in parallel. In addition, when a new data set is added, the divided areas can be re-divided so that the processing load is uniform across tasks. This eliminates processing imbalance between tasks and allows records to be joined for each divided area during the join processing.
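The benefit of identical division positions can be illustrated with the following sketch. It is not the system described here: threads stand in for slave nodes, a simple in-memory hash join stands in for whatever join each task performs, and all function names and sample records are assumptions.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def partition(records, split_keys):
    """Group (key, value) records into divided areas bounded by split_keys."""
    areas = [[] for _ in range(len(split_keys) + 1)]
    for key, value in records:
        # A record whose key equals a split key belongs to the area that
        # starts at that key.
        areas[bisect_right(split_keys, key)].append((key, value))
    return areas


def join_area(left, right):
    """Join the records of one divided area on equal keys."""
    right_by_key = {}
    for key, value in right:
        right_by_key.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in right_by_key.get(key, [])]


def parallel_join(left_records, right_records, split_keys):
    """Run one join task per divided area; areas never need each other's data."""
    left_areas = partition(left_records, split_keys)
    right_areas = partition(right_records, split_keys)
    with ThreadPoolExecutor() as pool:
        results = pool.map(join_area, left_areas, right_areas)
    return [row for area in results for row in area]


# Two hypothetical data sets divided at the same key.
left = [(1, "a"), (3, "b"), (5, "c")]
right = [(1, "x"), (5, "y"), (6, "z")]
joined = parallel_join(left, right, split_keys=[3])
print(joined)
```

Because both data sets are cut at the same keys, every matching pair of records falls in the same divided area, so each task can work on its own area without exchanging records with other tasks.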

Although the present invention has been described in detail above with reference to the accompanying drawings, the present invention is not limited to these specific configurations and includes various modifications and equivalent configurations within the spirit of the appended claims.

Claims

A computer system in which a plurality of computers execute analysis processing on a data set including a plurality of data composed of keys and data values, wherein
each of the computers includes a processor, a memory connected to the processor, a storage device connected to the processor, and a network interface connected to the processor,
each of the computers holds, for each data set, division information for managing a division position key, which is a key indicating a division position of a divided area obtained by dividing the data set for each predetermined key range,
all the division position keys included in the division information of each data set are the same,
a file system for storing the data set is configured on storage areas of the plurality of computers, and
the computer system:
generates a plurality of tasks for the respective divided areas when executing the analysis processing;
assigns the generated tasks to the computers, combines the data included in the divided areas of each data set, and executes the analysis processing;
when a new data set is stored in the file system, determines, based on the data size of each divided area after the new data set is stored, whether a target area exists, the target area being a divided area having a data size larger than a predetermined threshold; and
when it is determined that the target area exists, divides the target area into a plurality of new divided areas.

The computer system according to claim 1, wherein, when a new data set is stored in the file system, the key distribution of the new data set is analyzed, and
the division information of the new data set is generated based on the analysis result so that its division position keys are the same as all the division position keys included in the division information of the existing data set.

The computer system according to claim 2, wherein the division position key in the division information of the existing data set is updated after the target area is divided.

The computer system according to claim 3, wherein
when determining whether the target area exists, the computer system calculates a first data size, which is the data size of a divided area in the computer system, by summing the data sizes of the divided area over all the data sets, and
determines whether there is a divided area whose calculated first data size is larger than the predetermined threshold,
when dividing the target area, the computer system calculates a second data size, which is the data size of the target area in the computer system, by summing the data sizes of the target area over all the data sets,
calculates the number of divisions of the target area based on the predetermined threshold and the calculated second data size, and
determines a new division position key in the target area based on the calculated number of divisions,
when updating the division position key in the division information of the existing data set, the computer system deletes the information corresponding to the target area from the division information of the existing data set and adds information associating the determined division position key with the new divided areas, and
when generating the division information of the new data set, the computer system generates it so that its division position keys are the same as those in the updated division information of the existing data set.

The computer system according to claim 4, wherein, when dividing the target area, the computer system calculates a third data size by dividing the data size of the target area by the calculated number of divisions, and
determines the key of the data corresponding to the calculated third data size as the division position key.

The computer system according to claim 4, wherein the predetermined threshold is a data size at which the processing time of a task to which a new divided area is allocated is equal to or less than a preset allowable time.

The computer system according to claim 4, wherein the data includes data values for a plurality of items, and
when calculating the first data size, the first data size is calculated by summing the data sizes of all items in the divided area.

The computer system according to claim 2, wherein, when analyzing the key distribution of the new data set, the computer system divides the new data set at division position keys that each match one of the division position keys included in the division information of the existing data set, to generate a plurality of processing divided areas,
generates a task for analyzing the key distribution of the new data set for each of the generated processing divided areas, and executes the tasks in parallel.

A data management method in a computer system in which a plurality of computers execute analysis processing on a data set including a plurality of data composed of keys and data values, wherein
each of the computers includes a processor, a memory connected to the processor, a storage device connected to the processor, and a network interface connected to the processor,
each of the computers holds, for each data set, division information for managing a division position key, which is a key indicating a division position of a divided area obtained by dividing the data set for each predetermined key range,
all the division position keys included in the division information of each data set are the same,
a file system for storing the data set is configured on storage areas of the plurality of computers, and
the method comprises:
a first step in which at least one of the computers generates a plurality of tasks for the respective divided areas when executing the analysis processing;
a second step in which the computer that generated the tasks assigns the generated tasks to the computers, combines the data included in the divided areas of each data set, and executes the analysis processing;
a third step in which, when a new data set is stored in the file system, at least one of the computers determines, based on the data size of each divided area after the new data set is stored, whether a target area exists, the target area being a divided area having a data size larger than a predetermined threshold; and
a fourth step in which the computer that performed the determination divides the target area into a plurality of new divided areas when it determines that the target area exists.

The data management method according to claim 9, wherein the third step includes:
a fifth step of analyzing the key distribution of the new data set; and
a sixth step of generating, based on the analysis result, the division information of the new data set so that its division position keys are the same as all the division position keys included in the division information of the existing data set.

The data management method according to claim 10, wherein the fourth step includes a seventh step of updating the division position key in the division information of the existing data set after the target area is divided.

The data management method according to claim 11, wherein
the third step includes:
an eighth step of calculating a first data size, which is the data size of a divided area in the computer system, by summing the data sizes of the divided area over all the data sets; and
a ninth step of determining whether there is a divided area whose calculated first data size is larger than the predetermined threshold,
the fourth step includes:
a tenth step of calculating a second data size, which is the data size of the target area in the computer system, by summing the data sizes of the target area over all the data sets;
an eleventh step of calculating the number of divisions of the target area based on the predetermined threshold and the calculated second data size; and
a twelfth step of determining a new division position key in the target area based on the calculated number of divisions,
the seventh step includes a thirteenth step of deleting the information corresponding to the target area from the division information of the existing data set and adding information associating the determined division position key with the new divided areas, and
the sixth step includes a fourteenth step of generating the division information of the new data set so that its division position keys are the same as those in the updated division information of the existing data set.

The data management method according to claim 12, wherein the twelfth step includes:
a step of calculating a third data size by dividing the data size of the target area by the calculated number of divisions; and
a step of determining the key of the data corresponding to the calculated third data size as the division position key.

The data management method according to claim 12, wherein the predetermined threshold is a data size at which the processing time of a task to which a new divided area is allocated is equal to or less than a preset allowable time.

The data management method according to claim 12, wherein the data includes data values for a plurality of items, and
in the eighth step, the first data size is calculated by summing the data sizes of all items in the divided area.

The data management method according to claim 10, wherein the fifth step includes:
a step of dividing the new data set at division position keys that each match one of the division position keys included in the division information of the existing data set, to generate a plurality of processing divided areas; and
a step of generating a task for analyzing the key distribution of the new data set for each of the generated processing divided areas, and executing the tasks in parallel on the computers.