JP6316503B2

JP6316503B2 - Computer system, accelerator, and database processing method

Info

Publication number: JP6316503B2
Application number: JP2017518648A
Authority: JP
Inventors: 芳孝辻本; 渡辺　聡; 聡渡辺; 能毅黒川
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2015-05-18
Filing date: 2015-05-18
Publication date: 2018-04-25
Anticipated expiration: 2035-05-18
Also published as: WO2016185542A1; JPWO2016185542A1

Description

The present invention relates to a data processing method, a computer system, and a storage system that use a storage device having a data storage unit and a database aggregation processing unit.

In data processing systems, of which database search is a typical example, configurations have been proposed in which part of the processing performed by the data processing server is offloaded to a hardware accelerator placed near the large-capacity storage medium (storage) in order to speed up data processing (for example, Patent Document 1).

A technique is known in which the filtering, projection, grouping, or aggregation operations conventionally performed by the data processing server are offloaded to such a hardware accelerator, which reduces the load on the data processing server and greatly shortens processing time such as search time (Non-Patent Document 1).

Offloading join processing and merge sort processing from the data processing server to a hardware accelerator has also been studied (Patent Document 2). Function-level pipelining and parallelization of functions are known techniques for speeding up database processing (Patent Documents 1 and 2).

Japanese Patent Laid-Open No. 5-128164
US Patent Application Publication No. 2012/0047126

Louis Woods, Zsolt István, Gustavo Alonso: "Ibex - An Intelligent Storage Engine with support for advanced SQL Off-loading", Proc. VLDB Endowment (PVLDB), 2014

However, in Non-Patent Document 1 the unit of database processing is a row, and because this unit is small it is difficult to improve database processing performance. Furthermore, in Non-Patent Document 1 data is input directly to the hardware accelerator from the SSD (Solid State Drive) that stores the database, so database processing performance is constrained by the read performance of the SSD, and the rate at which data is supplied to the hardware accelerator becomes a bottleneck.

Meanwhile, hash operations are known to be used in the grouping processing described above. When grouping is performed with a hash operation, different groups may yield the same hash value, resulting in a hash collision (synonym).

When a synonym occurs and the hardware accelerator recalculates the hash value, the calculation for the next row (data) must wait, which lowers the processing performance of the hardware accelerator.
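The collision problem can be made concrete with a small sketch. The hash function, the table size, and the choice to set colliding rows aside rather than rehash them are all assumptions made for illustration; they are not taken from the patent text.

```python
from dataclasses import dataclass, field

TABLE_SLOTS = 8  # hypothetical, deliberately tiny so collisions are easy to provoke


def slot_of(group_key: tuple) -> int:
    """Hypothetical hash: sum of the key's integer fields modulo the table size."""
    return sum(group_key) % TABLE_SLOTS


@dataclass
class GroupingTable:
    owners: dict = field(default_factory=dict)    # slot -> group key that owns it
    synonyms: list = field(default_factory=list)  # rows set aside after a collision

    def add(self, group_key: tuple, row: dict) -> bool:
        slot = slot_of(group_key)
        owner = self.owners.setdefault(slot, group_key)
        if owner != group_key:
            # A different group already owns this slot: a synonym.
            # The row is set aside instead of rehashed, so the next row is not delayed.
            self.synonyms.append((group_key, row))
            return False
        return True


table = GroupingTable()
table.add((1, 2), {"value": 10})  # slot 3, becomes the owner
table.add((1, 2), {"value": 20})  # same group, same slot
table.add((3, 8), {"value": 30})  # 11 % 8 = 3, collides with group (1, 2)
```

Here the third row belongs to a different group but maps to the same slot, which is the situation the text calls a synonym.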

The present invention has been made in view of the above problems, and an object thereof is to provide hardware that improves the performance of database aggregation operations in cooperation with a server.

The present invention is a computer system comprising: a server including a processor and a memory; an accelerator that is connected to the server and performs database processing; and a storage device that is connected to the accelerator and stores a database. The server has a server command processing unit that receives a query, generates database commands, determines the range of the database to be processed and the unit size into which that range is divided for processing by a single database command, and instructs the accelerator accordingly; and a re-aggregation unit that totals the output of the accelerator to generate the processing result for the query. The accelerator has a database processing unit that, based on the instruction from the server command processing unit, reads the processing target data of the database from the storage device in the unit size, divides the processing target data into predetermined processing units, executes database processing including grouping processing, stacking processing, and aggregation processing for each predetermined processing unit, and outputs an aggregation result. Upon receiving from the accelerator the aggregation results for the range of the database to be processed, the re-aggregation unit totals those aggregation results to generate the processing result for the query.

According to the present invention, the server and the accelerator execute database processing cooperatively, which reduces the load on the server and improves the database processing performance of the accelerator. In addition, the accelerator executes aggregation processing for each database command, and the server re-aggregates the aggregation results of the multiple database commands, which improves the processing performance of the database processing system.
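The division of labor between per-command aggregation and server-side re-aggregation can be sketched as follows. This is a minimal software model that assumes a SUM aggregate; the function names `aggregate_unit` and `reaggregate` are hypothetical.

```python
from collections import defaultdict


def aggregate_unit(rows):
    """Accelerator side (simulated): SUM per group for one database command."""
    partial = defaultdict(int)
    for group, value in rows:
        partial[group] += value
    return dict(partial)


def reaggregate(partials):
    """Server side: combine the per-command partial sums into the query result."""
    total = defaultdict(int)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)


# Two units of data, each handled by one database command.
units = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
]
partials = [aggregate_unit(unit) for unit in units]
result = reaggregate(partials)
```

This works because SUM can be merged from partial results; an aggregate such as an average would need to carry both a sum and a count per group.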

A block diagram showing an example of the configuration of the database processing system according to the first embodiment of the present invention.
A block diagram showing an example of the configuration of the FPGA according to the first embodiment.
A diagram showing an example of a query to the database and its processing content according to the first embodiment.
A diagram showing an example of a database command according to the first embodiment.
A diagram showing an example of command issuance by the DB server and command processing by the FPGA according to the first embodiment.
A diagram showing an example of the page format of the database according to the first embodiment.
A block diagram showing an example of the configuration of the database processing unit of the FPGA according to the first embodiment.
A timing chart showing an example of the pipeline processing performed in the FPGA according to the first embodiment.
A diagram showing an example of the amount of output data in each process of the FPGA according to the first embodiment.
A diagram showing an example of the processing of the FPGA and the processing of the server according to the first embodiment.
A diagram showing an example of a method of grouping by grouping columns according to the first embodiment.
A flowchart showing an example of the grouping processing according to the first embodiment.
A diagram showing an example of a method of storing fixed-point data in the stacking operation according to the first embodiment.
A block diagram showing an example of the registers for the stacking operation according to the first embodiment.
A flowchart showing an example of the stacking processing according to the first embodiment.
A diagram showing an example of the stacking operation according to the first embodiment.
A diagram showing an example of the commands used in the stacking operation according to the first embodiment.
A diagram showing an example of an aggregation result according to the first embodiment.
A timing chart showing an example of the aggregation processing according to the first embodiment.
A diagram showing an example of the synonym format according to the first embodiment.
A flowchart showing an example of the processing performed by the database server according to the first embodiment.
A diagram showing an example of the re-aggregation processing by the DB server according to the first embodiment.
A diagram showing an example of a hash table according to the first embodiment.
A diagram showing an example of a grouping column table according to the first embodiment.
A diagram showing an example of a group hash table according to the first embodiment.
A diagram showing an example of reducing the data size when synonyms occur frequently, according to the second embodiment of the present invention.

Embodiments of the present invention will be described below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing an example of a database system that includes a storage device 3, which houses an FPGA (Field Programmable Gate Array) 2 having the database processing function according to the present invention and a database (hereinafter, DB) 30, and a database server (hereinafter, DB server) 1 on which a database management system (hereinafter, DBMS) 20 runs.

The DB server 1 includes a host CPU 11 that executes operations, a host memory 12 that stores programs and data, and a host interface 13 that is connected via a PCI switch 4 to the FPGA 2, which serves as a hardware accelerator, and to the storage device 3. The host interface 13 conforms to PCI or PCI Express.

In the DB server 1, the DBMS 20 loaded into the host memory 12 is executed by the host CPU 11 and issues database commands to the FPGA 2 in response to access requests (queries) from a client computer (not shown).

The FPGA 2 mounted in the storage device 3 includes a CPU 126 in addition to an SRAM 200, and functions as a hardware accelerator having a database processing unit 250 that executes database processing (filter processing, grouping processing, projection processing, aggregation processing, and so on) based on database commands. The FPGA 2 and the storage device 3 may instead be configured independently, with each connected to the PCI switch 4.

The storage device 3 has an SSD 137 and a DRAM 136 as storage media for storing the DB 30, and a control unit 129 that controls these storage media includes two interfaces, an SSD interface 130 and a DRAM interface 131.

The storage device 3 and the FPGA 2 are connected through the SSD interface 130 and the DRAM interface 131.

In the storage device 3 of the first embodiment, the DB 30 is stored in the SSD 137, which is a nonvolatile storage medium, and the control unit 129 reads the range of the DB 30 to be processed from the SSD 137 into the DRAM 136 in advance.

The FPGA 2 then reads data a predetermined processing size (for example, several megabytes) at a time from the DRAM 136, which has a higher read speed than the SSD 137. The data of the DB 30 to be processed can thus be read at high speed from the DRAM 136 without being limited by the read speed of the SSD 137.

In the present invention, however, the portion of the DB 30 to be processed is larger than the unit in which the FPGA 2 performs arithmetic processing. That is, the portion of the DB 30 to be processed is first read into the DRAM 136, and the data is then input from the DRAM 136 to the FPGA 2 a predetermined size at a time.

<DBMS>
Next, the DBMS 20 will be described. The DBMS 20 running on the DB server 1 includes a result storage area 115, a message storage area 116, a synonym storage area 117, a re-aggregation module 118, a grouping and aggregation module 120, a command generation unit 103, a command storage unit 123, a request command queue 121, and a completion command queue 122. The grouping and aggregation module 120 has grouping and aggregation functions for regrouping the data of the DB 30 for which a synonym occurred in the aggregation processing of the FPGA 2.

The DBMS 20 divides the DB 30, which is several gigabytes to several terabytes in size, into pieces of several megabytes and instructs the FPGA 2 to aggregate them. To do so, the DB server 1 generates database commands from the range of data in the DB 30 that is subject to aggregation and from the content of the aggregation processing.
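Dividing a target range into per-command units can be sketched as below. The function name and the (start address, size) tuple layout are hypothetical, and the 8 MB unit size is used only as an example value.

```python
def split_into_commands(start_address: int, total_size: int, unit_size: int):
    """Return (start address, size) pairs that cover the target range unit by unit."""
    commands = []
    offset = 0
    while offset < total_size:
        size = min(unit_size, total_size - offset)  # the last unit may be shorter
        commands.append((start_address + offset, size))
        offset += size
    return commands


MB = 1024 * 1024
commands = split_into_commands(start_address=0, total_size=20 * MB, unit_size=8 * MB)
```

With a 20 MB range and an 8 MB unit, three commands are produced, the last covering the remaining 4 MB.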

The DBMS 20 uses the command generation unit 103 to generate database commands from the access request received from the client computer and stores them in the command storage unit 123. The command generation unit 103 also sets, in the request command queue 121, a command pointer indicating where each database command is stored in the command storage unit 123. The DBMS 20 submits the database commands corresponding to the command pointers in the request command queue 121 to the FPGA 2 in order. The command generation unit 103, the command storage unit 123, and the request command queue 121 constitute a server command processing module 1030.
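The relationship between the command storage unit and the queue of command pointers can be modeled in a few lines. This is only a software analogy: the class and method names are hypothetical, a list index stands in for a command pointer, and the DMA transfer to the FPGA is not modeled.

```python
from collections import deque


class ServerCommandProcessing:
    """Toy model of command generation, storage, and queuing on the server."""

    def __init__(self):
        self.command_store = []       # stands in for the command storage unit 123
        self.request_queue = deque()  # stands in for the request command queue 121

    def generate(self, command: dict) -> int:
        """Store a command and enqueue a pointer (here, a list index) to it."""
        self.command_store.append(command)
        pointer = len(self.command_store) - 1
        self.request_queue.append(pointer)
        return pointer

    def submit_next(self) -> dict:
        """Dequeue the oldest pointer and return the command it refers to."""
        pointer = self.request_queue.popleft()
        return self.command_store[pointer]


server = ServerCommandProcessing()
server.generate({"start": 0, "size": 8})
server.generate({"start": 8, "size": 8})
first = server.submit_next()
```

Queuing pointers rather than the commands themselves keeps the queue entries small and fixed in size, while the commands stay in one place until they are fetched.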

The DBMS 20 receives the aggregation results of the database commands from the FPGA 2 and stores them in the result storage area 115. The DBMS 20 receives, as messages, information such as the number of groups aggregated by a database command executed on the FPGA 2, the number of synonyms that occurred, and arithmetic overflow, and stores them in the message storage area 116. When the DBMS 20 receives information on synonyms that occurred while the FPGA 2 executed a database command (synonym 114), it stores that information in the synonym storage area 117. The DBMS 20 also stores in the completion command queue 122 each database command for which an execution completion notification has been received from the FPGA 2.

<FPGA>
FIG. 2 is a block diagram showing an example of the configuration of the FPGA 2. The FPGA 2 has a CPU 126 that controls database processing based on database commands, and hardware functional blocks that perform the database processing.

The functional blocks constituting the database processing unit 250 of the FPGA 2 include: a data reading unit (hereinafter, DATA I/F) 105 that reads the processing target data of the DB 30 from the DRAM 136; filter processing units (hereinafter, Filter) 106-0 and 106-1 that filter the processing target data; projection processing units (hereinafter, Projection) 107-0 and 107-1 that perform projection on the processing target data; grouping processing units (hereinafter, Grouping) 108-0 and 108-1 that group the results of the filter and projection processing; stack processing units (hereinafter, Stacking) 109-0 and 109-1 that perform arithmetic on the results of the filter and projection processing; and an aggregation processing unit (hereinafter, Aggregation) 111 that aggregates the results of the grouping and the arithmetic.

The CPU 126 of the FPGA 2 acquires a command pointer from the command queue 124, activates a DMA (not shown), and transfers the database command from the command storage unit 123 of the DB server 1 to the command register 104.

Next, the CPU 126 of the FPGA 2 transfers the data of the portion of the DB 30 specified by the database command from the SSD 137 to the DRAM 136. When page data, which is the processing unit of the DB 30, has been stored in the DRAM 136, the CPU 126 writes an aggregation start signal to the register 104.

Next, the CPU 126 starts transferring the data in the DRAM 136 from the DRAM interface 131 to the DATA interface 105. The DRAM interface 131 stores the page data of the DB 30 in the SRAMs D0 to D3 (201 to 204) of the DATA interface 105, which are divided into four blocks. The SRAMs D0 to D3 (201 to 204) are predetermined areas allocated from the SRAM 200 of the FPGA 2 shown in FIG. 1. The other SRAMs P0, P1, and G0 described below are likewise predetermined areas allocated from the SRAM 200.

The page data of the DB 30 stored in the DATA interface 105 is fed, page by page, into pipeline processing that performs filter processing and projection processing. In the example shown, this page-by-page pipeline processing consists of two pipelines: one made up of Filter #0 (106-0) and Projection #0 (107-0), and the other of Filter #1 (106-1) and Projection #1 (107-1). Filter #0 (106-0) and Filter #1 (106-1) are referred to collectively as Filter 106. The same applies to the other components.

The results of the page-by-page pipeline processing are fed into the parallel processing of grouping and stacking. In this parallel processing, the row data output by Projection #0 (107-0) is processed in parallel by Grouping #0 (108-0) and Stacking #0 (109-0), and the row data output by Projection #1 (107-1) is processed in parallel by Grouping #1 (108-1) and Stacking #1 (109-1).

The Arbiter 110 accepts, in turn, the results of Grouping #0 (108-0) and Stacking #0 (109-0) and the results of Grouping #1 (108-1) and Stacking #1 (109-1), and inputs them to the Aggregation 111.

The Aggregation 111 takes the results of the parallel grouping and stacking processing as input, executes the aggregation processing specified by the database command, and outputs an aggregation result 112, a message 113, a synonym 114, and a completion command (notification that one command has been processed) 138. The aggregation result 112, the message 113, the synonym 114, and the completion command 138 are transmitted to the DB server 1 via the PCI switch 4.

As described later, the DBMS 20 of the DB server 1 uses the completion command 138 stored in the completion command queue 122, the aggregation results in the result storage area 115, and the synonym storage area 117 to perform re-aggregation with the re-aggregation module 118 and, for data in which a synonym occurred, grouping and aggregation with the grouping and aggregation module 120. The DB server 1 repeats this processing multiple times (database size / data size processed by the FPGA 2 per command, in page units) until the entire DB 30 has been processed.

The grouping and aggregation module 120 runs when synonym information is written to the synonym storage area 117, and outputs the grouped and aggregated data to the re-aggregation module 118.

Based on the aggregation results in the result storage area 115 and the output of the grouping and aggregation module 120, the re-aggregation module 118 totals the aggregation results for each grouping column and returns the result of the query. As described later, a grouping column is the data formed by concatenating multiple columns of the processing target data in the row direction, and it is the input to the hash calculation.

FIG. 3 is a diagram showing an example of a query to the DB 30 and its processing content. In the example of FIG. 3, Query 3 of the TPC-H database benchmark is shown as an example of a query received by the DB server 1.

The select clause in the query of FIG. 3 corresponds to the Projection process 301 and lists the data columns that the query extracts. In the FPGA 2 of FIG. 2, Projection (projection processing unit) #0 (107-0) and Projection #1 (107-1) perform the Projection process 301.

The where clause in the query of FIG. 3 corresponds to the Filtering process 302, which extracts the rows that match the conditions of the where clause. In the FPGA 2 of FIG. 2, Filter (filter processing unit) #0 (106-0) and Filter #1 (106-1) perform the Filtering process 302.

The group by clause in the query of FIG. 3 corresponds to the Grouping process 303, which groups rows by the specified columns. In the FPGA 2 of FIG. 2, Grouping (grouping processing unit) #0 (108-0) and Grouping #1 (108-1) perform the Grouping process 303.

The sum function in the query of FIG. 3 corresponds to the Aggregation process 305, which performs an aggregation operation. In the FPGA 2 of FIG. 2, the aggregation operations 305 performed by the Aggregation 111 include sum, maximum, minimum, count, and so on.

The expression inside the parentheses of sum in the query of FIG. 3 corresponds to the Stacking process 306, which performs numerical operations on columns. In the FPGA 2 of FIG. 2, Stacking (stacking processing unit) #0 (109-0) and Stacking #1 (109-1) perform the Stacking process 306. The numerical operations performed by Stacking #0 (109-0) and Stacking #1 (109-1) include addition, subtraction, and multiplication. The order by clause, which sorts the rows, corresponds to the ordering process 304, and this process is executed by the DB server 1.
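The mapping from query clauses to processing stages can be illustrated with a software sketch. The rows, column names, and filter value below are invented for illustration and only loosely resemble TPC-H Query 3; prices are integers and discounts are percentages so that the arithmetic stays exact.

```python
from collections import defaultdict

# Hypothetical, already-joined rows.
rows = [
    {"orderkey": 1, "orderdate": "1995-03-01", "segment": "BUILDING", "price": 100, "discount_pct": 10},
    {"orderkey": 1, "orderdate": "1995-03-01", "segment": "BUILDING", "price": 50, "discount_pct": 0},
    {"orderkey": 2, "orderdate": "1995-03-02", "segment": "AUTOMOBILE", "price": 200, "discount_pct": 5},
    {"orderkey": 3, "orderdate": "1995-03-03", "segment": "BUILDING", "price": 300, "discount_pct": 20},
]


def filtering(row):   # where clause: keep only matching rows
    return row["segment"] == "BUILDING"


def projection(row):  # select clause: keep only the needed columns
    return {k: row[k] for k in ("orderkey", "orderdate", "price", "discount_pct")}


def grouping(row):    # group by clause: build the group key
    return (row["orderkey"], row["orderdate"])


def stacking(row):    # expression inside sum(): per-row arithmetic
    return row["price"] * (100 - row["discount_pct"])


def run(rows):
    sums = defaultdict(int)
    for row in rows:
        if not filtering(row):
            continue
        projected = projection(row)
        sums[grouping(projected)] += stacking(projected)  # aggregation: SUM
    # order by clause: sorting is left to the server side
    return sorted(sums.items(), key=lambda item: item[1], reverse=True)


result = run(rows)
```

In the hardware described above, grouping and stacking run in parallel on each row; the sequential loop here only shows which stage consumes which part of the query.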

FIG. 4 is a diagram showing an example of a database command for the FPGA 2. The information on the SSD 137 that stores the DB 30 includes a read start address 401 and a read data size 402. Although not shown, the size of the DB 30 that the FPGA 2 processes with one command (unit size: 8 MB) may also be specified.

The setting information for the Filter 106 includes a filter target column 403 in the DB 30 and a filter condition 404. The setting information for the Projection 107 includes an extraction column 405. The setting information for the Grouping 108 includes a grouping column 406.

The setting information for the Stacking 109 includes an operation target column 407, an operator 408, and an immediate value 409. The setting information for the Aggregation (aggregation processing unit) 111 includes an operator 410.
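The fields of the database command can be gathered into a single record, sketched below. The field names, the Python types, and the example values are assumptions; the text gives only the reference numerals 401 to 410 and what each field is for, not an encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseCommand:
    read_start_address: int     # 401: where the target data starts on the SSD
    read_data_size: int         # 402: how many bytes to read
    filter_columns: tuple       # 403: columns tested by the filter
    filter_conditions: tuple    # 404: conditions applied to those columns
    projection_columns: tuple   # 405: columns to extract
    grouping_columns: tuple     # 406: columns that form the group key
    stacking_columns: tuple     # 407: columns used in the arithmetic
    stacking_operators: tuple   # 408: arithmetic operators
    stacking_immediates: tuple  # 409: immediate (constant) operands
    aggregation_operator: str   # 410: for example SUM, MAX, MIN, COUNT


# Illustrative values only, using TPC-H column names.
command = DatabaseCommand(
    read_start_address=0,
    read_data_size=8 * 1024 * 1024,
    filter_columns=("c_mktsegment",),
    filter_conditions=("= 'BUILDING'",),
    projection_columns=("l_orderkey", "l_extendedprice", "l_discount"),
    grouping_columns=("l_orderkey",),
    stacking_columns=("l_extendedprice", "l_discount"),
    stacking_operators=("-", "*"),
    stacking_immediates=(1,),
    aggregation_operator="SUM",
)
```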

FIG. 5 is a time chart showing the relationship between the processing of the server command processing module 1030 and the aggregation processing of the FPGA 2 (DATA I/F 105 through Aggregation 111).

The server command processing module 1030 generates, from the query received by the DB server 1, one or more database commands (501 to 504) to be processed by the FPGA 2 and stores them in the command storage unit 123. Once a database command has been stored, the server command processing module 1030 writes a pointer to the area holding the command into the request command queue 121 and notifies the CPU 126 of the FPGA 2, via a doorbell register (not shown), that a database command has been queued (times 505 to 508 in the figure).

The CPU 126 activates a DMA (not shown) and transfers the database command in the command storage unit 123 that is indicated by the pointer in the request command queue 121 from the DB server 1 to the register 104.

In the FPGA 2, in parallel with acquiring the database command, the CPU 126 transfers the processing target data of the DB 30 from the SSD 137 to the DRAM 136. When page data, the processing unit of the DB 30, has been stored in the DRAM 136, the CPU 126 writes an aggregation start signal to the register 104, starts the data transfer from the DRAM 136 to the DATA I/F 105, and stores the data in the four SRAMs D0 to D3 in units of DB 30 pages (page data).

The multiple pieces of page data written to the DATA I/F 105 are fed in order to the Filter 106, and operations are performed in the Projection 107, Grouping 108, Stacking 109, and Aggregation 111 processing units.

ＦＰＧＡ２の各演算が完了すると、ＣＰＵ１２６は、ＤＢサーバ１に対して完了コマンド１３８を発行して、１コマンド処理の完了を通知（図中５１２、５１３、５１４）する。そして、１コマンド処理の完了通知を受信したＤＢサーバ１は、当該完了通知が指し示すデータベースコマンドを完了コマンドキュー１２２に登録する。 When each operation of the FPGA 2 is completed, the CPU 126 issues a completion command 138 to the DB server 1 to notify the completion of one command processing (512, 513, and 514 in the figure). The DB server 1 that has received the one-command processing completion notification registers the database command indicated by the completion notification in the completion command queue 122.

The re-aggregation module of the DB server 1 receives the per-command completion notifications (512, 513, 514), generates the group hash table 119 from the grouping columns in the result storage area 115, and performs the grouping and re-aggregation processing described later.

図６は、ＤＢ３０のページフォーマットの一例を示す。ＤＢ３０はページ単位で構成され、ページデータの先頭にページヘッダ６０１が格納される。ページヘッダ６０１に続いて行データ６０２が１行目〜Ｍ行目まで順に格納されており、ページデータの終端から順番に行の先頭アドレスを指し示す行ポインタ６０３が格納される。 FIG. 6 shows an example of the page format of the DB 30. The DB 30 is configured in units of pages, and a page header 601 is stored at the top of the page data. Following the page header 601, row data 602 is stored in order from the first row to the Mth row, and a row pointer 603 indicating the head address of the row is stored in order from the end of the page data.

ＦＰＧＡ２のＦｉｌｔｅｒ１０６と、Ｐｒｏｊｅｃｔｉｏｎ１０７は、行ポインタ６０３と図４に示したデータベースコマンドの設定情報（４０３、４０５）を用いて、必要な列データを取得する。 The Filter 106 and the Projection 107 of the FPGA 2 acquire necessary column data using the row pointer 603 and the database command setting information (403, 405) shown in FIG.
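
As an illustration only, the page layout of FIG. 6 (header at the top, row data after it, row pointers stored from the end of the page backwards) might be modelled as follows. The 8 KB page size comes from the embodiment; the 16-byte header, the 16-bit pointer width, and the row count kept in the header are assumptions made for this sketch and are not stated in the description.

```python
import struct

PAGE_SIZE = 8 * 1024   # 8 KB page, as in the embodiment
HEADER_SIZE = 16       # assumption: the header length is not given in the text
PTR_SIZE = 2           # assumption: 16-bit row pointers


def build_page(rows):
    """Pack rows after the header; store row pointers from the page end backwards."""
    page = bytearray(PAGE_SIZE)
    struct.pack_into("<H", page, 0, len(rows))  # assumption: header holds the row count
    offset = HEADER_SIZE
    for i, row in enumerate(rows):
        page[offset:offset + len(row)] = row
        # The pointer for row i sits i + 1 slots back from the end of the page.
        struct.pack_into("<H", page, PAGE_SIZE - PTR_SIZE * (i + 1), offset)
        offset += len(row)
    return bytes(page)


def row_start(page, i):
    """Look up the start address of row i through its row pointer."""
    return struct.unpack_from("<H", page, PAGE_SIZE - PTR_SIZE * (i + 1))[0]
```

With such a layout, a reader can jump straight to any row through its pointer without scanning the preceding rows, which is what lets the Filter 106 and the Projection 107 pick out only the columns they need.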

図７は、ＦＰＧＡ２のデータベース処理部２５０の回路構成の詳細な例を示すブロック図である。ＤＲＡＭ１３６から読み出されたＤＢ３０のページデータは、ＤＡＴＡＩ／Ｆ１０５を介して、ＳＲＡＭ２０１〜２０４（ＳＲＡＭＤ０〜Ｄ３）にページ単位で格納される。 FIG. 7 is a block diagram illustrating a detailed example of the circuit configuration of the database processing unit 250 of the FPGA 2. The page data of the DB 30 read from the DRAM 136 is stored in units of pages in the SRAMs 201 to 204 (SRAMs D0 to D3) via the DATA I / F 105.

ＳＲＡＭ２０１は一組のＳＲＡＭで構成されて、同一のデータを２つのＳＲＡＭ（例えばＤ１＿０、Ｄ１−１）で保持し、並列処理を行うＦｉｌｔｅｒ１０６と、Ｐｒｏｊｅｃｔｉｏｎ１０７にそれぞれ同一のデータを供給する。他の、ＳＲＡＭ２０２〜２０４も同様である。 The SRAM 201 is composed of a set of SRAMs, holds the same data in two SRAMs (for example, D1_0, D1-1), and supplies the same data to the filter 106 and the projection 107 that perform parallel processing. The same applies to the other SRAMs 202 to 204.

Data supplied from the SRAMs 201 to 204 to the Filter 106 and the Projection 107 is selected by selectors 707, 708, 709, and 710. The processing result of the Filter 106 is input to the Projection 107. Here, the Filter 106 executes its processing based on the filter information in the database command (FIG. 4) set in the register 104, namely the filter target column 403 and the filter condition 404. Similarly, the Projection 107 executes its processing based on the projection information set in the register 104.

Ｐｒｏｊｅｃｔｉｏｎ１０７により処理された結果は、ＤＢ３０の１ページ単位で、ＳＲＡＭＰ０（２０５）と、ＳＲＡＭＰ１（２０６）に格納される。ＳＲＡＭＰ０（２０５）、ＳＲＡＭＰ１（２０６）のＰｒｏｊｅｃｔｉｏｎ結果を用いて、Ｇｒｏｕｐｉｎｇ１０８−０、１０８−１と、Ｓｔａｃｋｉｎｇ１０９−０、１０９−１がそれぞれ並列処理を実行する。 The results processed by the Projection 107 are stored in the SRAM P0 (205) and the SRAM P1 (206) in units of one page in the DB 30. Using the projection results of SRAM P0 (205) and SRAM P1 (206), Grouping 108-0 and 108-1 and Stacking 109-0 and 109-1 execute parallel processing, respectively.

Ｇｒｏｕｐｉｎｇ１０８−０、１０８−１では、データベースコマンドのグルーピング化列４０６と、ハッシュテーブル７２８＿０、７２８＿１と、グルーピング列テーブル７２９＿０、７２９＿１を用いて演算が行われる。ハッシュテーブル７２８には、グルーピング対象のデータのハッシュ値が格納される。グルーピング列テーブル７２９には、グループ化された列データが格納される。これらのハッシュテーブル７２８及びグルーピング列テーブル７２９は、ＦＰＧＡ２のＳＲＡＭ２００内の所定の領域に設定される。 In the Grouping 108-0 and 108-1, the calculation is performed using the grouping column 406 of the database command, the hash tables 728_0 and 728_1, and the grouping column tables 729_0 and 729_1. In the hash table 728, hash values of grouping target data are stored. The grouping column table 729 stores grouped column data. These hash table 728 and grouping column table 729 are set in predetermined areas in the SRAM 200 of the FPGA 2.

Ｓｔａｃｋｉｎｇ１０９−０、１０９−１では、データベースコマンドの演算対象列４０７、演算子４０８、直値４０９に基づいて演算が行われる。 In Stacking 109-0 and 109-1, the calculation is performed based on the calculation target column 407, the operator 408, and the direct value 409 of the database command.

これらのＧｒｏｕｐｉｎｇ１０８とＳｔａｃｋｉｎｇ１０９の出力は、Ａｒｂｉｔｅｒ１１０により順次受け付けられ、ＳＲＡＭＧ０（２０７）に書き込まれる。ＳＲＡＭＧ０（２０７）の内容は、Ａｇｇｒｅｇａｔｉｏｎ１１１の集約演算の結果によって書き換えられる。 The outputs of these Grouping 108 and Stacking 109 are sequentially received by the Arbiter 110 and written to the SRAM G0 (207). The contents of the SRAM G0 (207) are rewritten by the result of the aggregation calculation of the aggregation 111.

Ａｇｇｒｅｇａｔｉｏｎ１１１は、１コマンドのデータ処理が完了すると、集約結果１１２と、メッセージ１１３と、シノニム１１４とをＤＢサーバ１に出力する。メッセージ１１３には、集約したグループの数と、シノニムが発生した数と、演算オーバーフローの情報が格納されている。 When the data processing for one command is completed, the aggregation 111 outputs the aggregation result 112, the message 113, and the synonym 114 to the DB server 1. The message 113 stores the number of aggregated groups, the number of occurrences of synonyms, and operation overflow information.

図８は、ＦＰＧＡ２で行われるパイプライン処理のタイミングチャートを示している。時刻Ｔ０（８０１）においては、ＤＡＴＡＩ／Ｆ１０５がＳＲＡＭＤ０にページ１（図中Ｐ−１）のデータを書き込む。 FIG. 8 shows a timing chart of pipeline processing performed in the FPGA 2. At time T0 (801), the DATA I / F 105 writes the data of page 1 (P-1 in the figure) to the SRAM D0.

時刻Ｔ１（８０２）において、ＤＡＴＡＩ／Ｆ１０５がＳＲＡＭＤ１にページ２（Ｐ−２）のデータを書き込み、Ｆｉｌｔｅｒ１０６−０、１０６−１がＳＲＡＭＤ０のデータを用いてＦｉｌｔｅｒｉｎｇ処理を行う。 At time T1 (802), the DATA I / F 105 writes the data of page 2 (P-2) to the SRAM D1, and the filters 106-0 and 106-1 perform the filtering process using the data of the SRAM D0.

At time T2 (803), the DATA I/F 105 writes the data of page 3 (P-3) to SRAM D2, the Filters 106-0 and 106-1 perform the filtering processing using the data of SRAM D1 (P-2), and the Projections 107-0 and 107-1 perform the projection processing using the data of SRAM D0 (P-1). The filtering processing and the projection processing are executed row by row on the page data.

また、プロジェクション処理の１行目のデータがＳＲＡＭＰ０に書き込まれる時刻Ｔ２１においては、Ｇｒｏｕｐｉｎｇ１０８−０、１０８−１と、Ｓｔａｃｋｉｎｇ１０９−０、１０９−１が開始される。これにより、Ｐｒｏｊｅｃｔｉｏｎ処理と並行して、ＳＲＡＭＰ０に書き込まれたプロジェクション処理の結果に基づいてＧｒｏｕｐｉｎｇ処理とＳｔａｃｋｉｎｇ処理が並列して実行される。なお、他のＧｒｏｕｐｉｎｇ処理とＳｔａｃｋｉｎｇ処理も、ＳＲＡＭＰ０、Ｐ１に１行のデータが書き込まれるまで待機する。 Further, at time T21 when data in the first row of the projection processing is written into the SRAM P0, Grouping 108-0 and 108-1 and Stacking 109-0 and 109-1 are started. Thereby, in parallel with the projecting process, the grouping process and the stacking process are executed in parallel based on the result of the projection process written in the SRAM P0. Other Grouping processing and Stacking processing also wait until one row of data is written in the SRAMs P0 and P1.

ＳＲＡＭＰ０、Ｐ１にデータが書き込まれると、Ａｇｇｒｅｇａｔｉｏｎ１１１がＡｇｇｒｅｇａｔｉｏｎ処理を行い、ＳＲＡＭＧ０に集約処理の結果を書き込む。 When data is written to the SRAMs P0 and P1, the aggregation 111 performs an aggregation process and writes the result of the aggregation process to the SRAM G0.

時刻Ｔ３（８０４）においては、ＤＡＴＡＩ／Ｆ１０５がＳＲＡＭＤ３にページ４（Ｐ−４）のデータを書き込み、Ｆｉｌｔｅｒ（１０６−０、１０６−１）がＳＲＡＭＤ２のデータ（Ｐ−３）を用いてＦｉｌｔｅｒｉｎｇ処理を行う。また、Ｐｒｏｊｅｃｔｉｏｎ１０７−０、１０７−１がＳＲＡＭＤ１のデータ（Ｐ−２）を用いてＰｒｏｊｅｃｉｔｏｎ処理を行う。 At time T3 (804), the DATA I / F 105 writes the data of page 4 (P-4) to the SRAM D3, and the Filter (106-0, 106-1) uses the data of the SRAM D2 (P-3). Filtering processing is performed. In addition, the Projections 107-0 and 107-1 perform the Projection process using the data (P-2) of the SRAM D1.

Ｇｒｏｕｐｉｎｇ１０８−０、１０８−１と、Ｓｔａｃｋｉｎｇ１０９−０、１０９−１は、Ｐｒｏｊｅｃｔｉｏｎ処理と並行して、グルーピング処理とスタッキング処理を実行する。そして、グルーピング処理の結果と、スタッキング処理の結果に基づいてＡｇｇｒｅｇａｔｉｏｎ１１１が集約処理を行い、ＳＲＡＭＧ０に集約結果を書き込む。以上のようなタイミングで、ＦＰＧＡ２ではパイプライン処理が実行される。 The Grouping 108-0 and 108-1, and the Stacking 109-0 and 109-1 execute a grouping process and a stacking process in parallel with the Projection process. Then, the aggregation 111 performs an aggregation process based on the result of the grouping process and the result of the stacking process, and writes the aggregation result to the SRAM G0. At the timing described above, the pipeline processing is executed in the FPGA 2.
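
The page-level part of the timing in FIG. 8 can be sketched as a small schedule table. This is a simplified model, not the circuit itself: only the DATA I/F, Filter, and Projection stages are shown, because Grouping, Stacking, and Aggregation start at row granularity inside the Projection slot. The function names and the round-robin reuse of the SRAM banks after page 4 are assumptions.

```python
STAGES = ["DATA I/F", "Filter", "Projection"]


def schedule(num_pages, num_slots):
    """Return, per time slot T0, T1, ..., the page each stage works on (None if idle)."""
    table = []
    for t in range(num_slots):
        slot = {}
        for depth, stage in enumerate(STAGES):
            page = t - depth + 1  # page P-1 enters the DATA I/F at T0
            slot[stage] = page if 1 <= page <= num_pages else None
        table.append(slot)
    return table


def sram_bank(page):
    """SRAM D bank that receives a page: P-1 -> D0, ..., P-4 -> D3.

    Assumption: the four banks are reused round-robin for later pages.
    """
    return (page - 1) % 4
```

For example, `schedule(4, 4)[2]` describes time T2, where the DATA I/F writes P-3 while the Filter works on P-2 and the Projection on P-1.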

図９は、コマンド単位で、各処理における出力データ量の変化を示す図である。一回のデータベースコマンドで処理するＤＢ３０の処理対象データを図中符号９０１で示す。 FIG. 9 is a diagram illustrating a change in output data amount in each process in units of commands. The processing target data of the DB 30 processed by a single database command is indicated by reference numeral 901 in the figure.

このＤＢ３０の処理対象データ９０１をＦｉｌｔｅｒｉｎｇ処理すると、フィルタ条件に一致した行データが抽出されて、図中２つの行データ９０２が処理結果として出力される。 When the filtering processing is performed on the processing target data 901 of the DB 30, row data that matches the filter condition is extracted, and two row data 902 in the figure are output as processing results.

このフィルタリング処理の過程で残る行データ９０２のデータ量は、処理対象データ９０１のデータ量の約１／１０である。行データ９０２を入力としてＰｒｏｊｅｃｔｉｏｎ処理を行い、Ｇｒｏｕｐｉｎｇ処理に必要なＧｒｏｕｐｉｎｇ列９０３と、Ｓｔａｃｋｉｎｇ処理に必要なＳｔａｃｋｉｎｇ列９０４のデータを抽出する。 The data amount of the row data 902 remaining in the process of this filtering process is about 1/10 of the data amount of the processing target data 901. Projection processing is performed using the row data 902 as input, and data in the Grouping column 903 necessary for the Grouping processing and the Stacking column 904 necessary for the Stacking processing are extracted.

The amount of column data (903, 904) produced by the projection processing is about 1/10 of the amount of row data 902 left after the filtering processing. Performing the grouping processing and the stacking processing on the data of the Grouping column 903 and the Stacking column 904 yields data 905, whose amount is about 1/10 of the amount of column data (903, 904) after the projection processing. The data 905 obtained by the grouping processing and the stacking processing is then aggregated to obtain the final aggregation result 906.

本実施例のＦＰＧＡ２では、入力する処理対象データ９０１に対して、出力する集約結果９０６のデータ量は約１／１０００となる。 In the FPGA 2 of this embodiment, the data amount of the aggregation result 906 to be output is about 1/1000 of the processing target data 901 to be input.
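
The overall figure of about 1/1000 follows from chaining the three reductions of about 1/10 each described for FIG. 9. A short check of that arithmetic, using the approximate ratios from the description (the function name is hypothetical):

```python
from fractions import Fraction

# Approximate per-stage reduction ratios given for FIG. 9.
STAGE_RATIOS = {
    "Filtering": Fraction(1, 10),
    "Projection": Fraction(1, 10),
    "Grouping/Stacking": Fraction(1, 10),
}


def remaining_after_each_stage(input_bytes):
    """Return the approximate data amount left after each stage."""
    sizes = {}
    size = Fraction(input_bytes)
    for stage, ratio in STAGE_RATIOS.items():
        size *= ratio
        sizes[stage] = size
    return sizes
```

Since the ratios are only approximate in the description, the result should be read as an order-of-magnitude estimate rather than an exact size.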

図１０は、ＦＰＧＡ２で行われる集約演算ローカル処理とＤＢサーバ１で行われる集約演算グローバル処理の関係を示す図である。 FIG. 10 is a diagram illustrating a relationship between the aggregate calculation local process performed by the FPGA 2 and the aggregate calculation global process performed by the DB server 1.

ＤＢサーバ１は、サーバコマンド処理モジュール１０３により、ＳＳＤ１３７に格納されている２ＴＢサイズのＤＢ３０のデータ１０００のうち、１回のデータベースコマンドで処理する処理対象データのサイズを８ＭＢに決定する。そして、ＤＢサーバ１は、１つのデータベースコマンドでＦＰＧＡ２が８ＭＢの処理対象データ９０１を処理するようにデータベースコマンド内で指定する（１００４）。 The DB server 1 uses the server command processing module 103 to determine that the size of the processing target data to be processed by one database command out of the data 1000 of the DB 30 of 2 TB size stored in the SSD 137 is 8 MB. Then, the DB server 1 designates in the database command that the FPGA 2 processes the processing target data 901 of 8 MB with one database command (1004).

ＦＰＧＡ２は、１つのデータベースコマンドを受け付けて、８ＭＢに分割したＤＢ３０の処理対象データ９０１を処理する。ＦＰＧＡ２は、８ＭＢの処理対象データ９０１について８ＫＢのページ単位で処理を行う（１００５）。 The FPGA 2 accepts one database command and processes the processing target data 901 of the DB 30 divided into 8 MB. The FPGA 2 processes the 8 MB processing target data 901 in units of 8 KB pages (1005).
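
The sizes quoted for FIG. 10 imply the following counts of pages per command and commands per database. The sketch assumes binary units (1 KB = 1024 bytes), which the description does not state explicitly.

```python
KB, MB, TB = 1024, 1024 ** 2, 1024 ** 4  # assumption: binary units

DB_SIZE = 2 * TB        # size of the data 1000 of the DB 30
COMMAND_SIZE = 8 * MB   # processing target data per database command
PAGE_SIZE = 8 * KB      # page processed by the FPGA at a time

pages_per_command = COMMAND_SIZE // PAGE_SIZE
commands_per_db = DB_SIZE // COMMAND_SIZE
```

Under these assumptions one database command covers 1024 pages, and scanning the whole 2 TB database takes 262144 commands.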

For the 8 MB of processing target data 901 covered by one database command, the FPGA 2 reads the processing target data 901 from the DRAM 136 page by page (8 KB) and executes the grouping processing and the stacking processing. The FPGA 2 then performs the aggregation processing based on the results of the grouping processing and the stacking processing (1006).

When the aggregation processing is complete, the FPGA 2 returns the grouping columns and the aggregation results to the DB server 1 with the groups of the aggregation results in no particular order, as shown in FIG. 21 described later (1008).

In the DB server 1 that has received the grouping columns and the aggregation results, the re-aggregation module 118 generates the group hash table 119 from the grouping columns, arranges the data in hash-value order, and executes the global aggregation processing, that is, aggregation over all database commands (all of the processing target data).
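
A minimal sketch of this server-side re-aggregation is given below. It assumes that the aggregate is a sum and that each command returns a list of (grouping column, partial aggregate) pairs; neither the aggregate function nor this data shape is fixed by the description, and a Python dictionary stands in for the group hash table 119.

```python
def reaggregate(command_results):
    """Merge per-command (grouping_column, partial_sum) pairs into global totals."""
    totals = {}  # stands in for the group hash table 119
    for result in command_results:
        for group_key, value in result:
            totals[group_key] = totals.get(group_key, 0) + value
    return totals
```

Because the merge is keyed on the grouping column, the order in which the FPGA returns the groups within each command does not affect the global result.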

FIG. 11 is a diagram illustrating an example of how grouping columns are grouped. The grouping processing is performed by the Grouping 108-0 and 108-1 of the FPGA 2. In this grouping processing, the columns to be grouped within one row of row data are concatenated as grouping columns and stored in the grouping column table 729.

図示の例では、１行の行データの、グルーピング化列１、２、３（１１０１）を連結してグルーピング化データ１１０２を生成し、グルーピング列テーブル７２９に格納する。なお、グルーピング化データ１１０２の各行は、ハッシュ値の演算対象であるグルーピング化列のデータである。 In the illustrated example, grouping data 1102 is generated by concatenating grouping columns 1, 2, 3 (1101) of row data of one row, and stored in the grouping column table 729. Each row of the grouping data 1102 is data of a grouping column that is a hash value calculation target.

そして、Ｇｒｏｕｐｉｎｇ１０８ではグルーピング化データ１１０２のグルーピング化列ごとにハッシュ値とグループＩＤを演算して、各グループを識別する。なお、Ｇｒｏｕｐｉｎｇ１０８で演算されたハッシュ値とグループＩＤは、ハッシュテーブル７２８に格納される。 Then, the Grouping 108 calculates a hash value and a group ID for each grouping column of the grouping data 1102 to identify each group. Note that the hash value and group ID calculated in the grouping 108 are stored in the hash table 728.

図２２は、ＦＰＧＡ２内のハッシュテーブル７２８の一例を示す図である。ハッシュテーブル７２８は、ＳＲＡＭ２００のアドレス２３０１をハッシュ値とし、１ビットのデータ２３０２をフラグビットとし、１０ビットのデータ２３０３をハッシュ値に対応するグループＩＤとする例を示す。 FIG. 22 is a diagram illustrating an example of the hash table 728 in the FPGA 2. The hash table 728 shows an example in which the address 2301 of the SRAM 200 is a hash value, 1-bit data 2302 is a flag bit, and 10-bit data 2303 is a group ID corresponding to the hash value.

ハッシュテーブル７２８には、グルーピング処理の結果が格納され、グループＩＤ（２３０３）が割り当てられたアドレス２３０１のフラグビット（２３０２）には「１」（０ｂ１）が設定される。なお、グループＩＤ（２３０３）は、新たなグループが出現する度に追加される。 The hash table 728 stores the result of the grouping process, and “1” (0b1) is set in the flag bit (2302) of the address 2301 to which the group ID (2303) is assigned. The group ID (2303) is added every time a new group appears.

図２３は、ＦＰＧＡ２内のグルーピング列テーブル７２９の一例を示す図である。グルーピング列テーブル７２９は、ＳＲＡＭ２００のアドレス２４０１（１０ビット）をグループＩＤとし、６４ビットのデータ２４０２をグルーピング化列とする例を示す。グルーピング列テーブル７２９は、グループＩＤ（２４０１）に対応したグルーピング化列（２４０２）が保持される。 FIG. 23 is a diagram illustrating an example of the grouping column table 729 in the FPGA 2. The grouping column table 729 shows an example in which an address 2401 (10 bits) of the SRAM 200 is used as a group ID and 64-bit data 2402 is used as a grouping column. The grouping column table 729 holds a grouping column (2402) corresponding to the group ID (2401).

なお、本実施例１では、２５６Ｋｂｙｔｅのグルーピング化列まで対応可能な例を示し、ＦＰＧＡ２には、例えば、８Ｂｙｔｅ幅のＳＲＡＭを３２個搭載する。ハッシュ値とグループＩＤ（２４０１）の対応関係は、上述のハッシュテーブル７２８で定義される。また、本実施例１では、グループＩＤの最大値は１０２４となり、この最大値がハッシュ値を割り当て可能な数となる。 In the first embodiment, an example capable of supporting up to 256 Kbyte grouping columns is shown, and for example, 32 8-byte wide SRAMs are mounted in the FPGA 2. The correspondence between the hash value and the group ID (2401) is defined by the hash table 728 described above. In the first embodiment, the maximum value of the group ID is 1024, and this maximum value is the number that can be assigned a hash value.

なお、ハッシュ値とグループＩＤの算出については、上記に限定されるものではなく、公知または周知の手法を適用すればよく、処理対象データ９０１のハッシュ値と、データが所属するグループＩＤが決定されれば良い。 The calculation of the hash value and the group ID is not limited to the above, and a publicly known or known method may be applied, and the hash value of the processing target data 901 and the group ID to which the data belongs are determined. Just do it.

図１２は、グルーピング処理の一例を示すフローチャートである。この処理は、ＳＲＡＭＰ０、Ｐ１に行の演算結果が書き込まれたときにＦＰＧＡ２のＧｒｏｕｐｉｎｇ１０８が起動する。 FIG. 12 is a flowchart illustrating an example of the grouping process. In this process, the Grouping 108 of the FPGA 2 is activated when the operation result of the row is written in the SRAMs P0 and P1.

Ｇｒｏｕｐｉｎｇ１０８は、グルーピング化列（２４０２）のハッシュ値を算出（１２０１）する。Ｇｒｏｕｐｉｎｇ１０８は、算出されたハッシュ値がハッシュテーブル７２８に登録されているか否かを判定する（１２０２）。 The Grouping 108 calculates (1201) the hash value of the grouping column (2402). The Grouping 108 determines whether or not the calculated hash value is registered in the hash table 728 (1202).

If the calculated hash value is not registered in the hash table 728, the Grouping 108 proceeds to step 1205, where it registers the calculated hash value in the hash table 728. Since no synonym has occurred in this case, the process proceeds to step 1206 and the result is written from the Arbiter 110 to SRAM G0 (207).

一方、算出されたハッシュ値が既にハッシュテーブル７２８に登録されている場合は、ステップ１２０３へ進む。ステップ１２０３では、Ｇｒｏｕｐｉｎｇ１０８が、ハッシュテーブル７２８において、同一のハッシュ値となったアドレス２３０１に対応するグルーピング列テーブル７２９のデータ２４０２（グルーピング化列）を取得する。Ｇｒｏｕｐｉｎｇ１０８は算出されたハッシュ値の元のグルーピング化列と、取得したグルーピング化列が異なれば、ステップ１２０４へ進んでシノニム（ハッシュ値の衝突）と判定する。 On the other hand, if the calculated hash value is already registered in the hash table 728, the process proceeds to step 1203. In step 1203, the Grouping 108 acquires data 2402 (grouping column) of the grouping column table 729 corresponding to the address 2301 having the same hash value in the hash table 728. If the original grouping sequence of the calculated hash value is different from the acquired grouping sequence, the grouping 108 proceeds to step 1204 and determines a synonym (hash value collision).

シノニムの場合は、Ａｇｇｒｅｇａｔｉｏｎ１１１での集約処理は行われず、シノニムの情報が後述するようにシノニム１１４としてＤＢサーバ１へ送信される。なお、シノニムの情報としては、ハッシュ値やグルーピング化列を用いることができる。 In the case of a synonym, the aggregation processing in the aggregation 111 is not performed, and the synonym information is transmitted to the DB server 1 as a synonym 114 as described later. As the synonym information, a hash value or a grouping sequence can be used.

一方、非シノニムの場合は、ＳＲＡＭＧ０（２０７）に書き込まれたＧｒｏｕｐｉｎｇ１０８の結果と、後述のＳｔａｃｋｉｎｇ１０９の結果に基づいてＡｇｇｒｅｇａｔｉｏｎ１１１で集約処理が行われる。 On the other hand, in the case of a non-synonym, aggregation processing is performed in the aggregation 111 based on the result of the Grouping 108 written in the SRAM G0 (207) and the result of the Stacking 109 described later.
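
The flow of FIG. 12, together with the hash table 728 and the grouping column table 729, can be sketched as follows. This is an illustrative model only: the CRC32-based hash and its 16-bit width are assumptions (the description leaves the hash method open), a dictionary entry stands in for a set flag bit, and the behaviour when more than 1024 groups appear is not modelled.

```python
import zlib

HASH_BITS = 16  # assumption: the hash width is not stated in the text


class Grouping:
    def __init__(self, hash_fn=None):
        # Assumption: any well-known hash may be used; CRC32 is only an example.
        self.hash_fn = hash_fn or (lambda key: zlib.crc32(key) % (1 << HASH_BITS))
        self.hash_table = {}    # hash value -> group ID (entry present = flag bit 1)
        self.column_table = []  # group ID -> grouping column

    def process(self, key):
        """Return ("group", group_id) or ("synonym", key) for one grouping column."""
        h = self.hash_fn(key)                            # step 1201
        if h not in self.hash_table:                     # step 1202
            self.hash_table[h] = len(self.column_table)  # step 1205: new group ID
            self.column_table.append(key)
            return ("group", self.hash_table[h])
        group_id = self.hash_table[h]
        if self.column_table[group_id] != key:           # steps 1203 and 1204
            return ("synonym", key)                      # hash collision
        return ("group", group_id)
```

Rows reported as `"synonym"` correspond to the data that is not aggregated on the FPGA and is instead passed to the DB server 1.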

図１３は、スタッキング演算における、固定小数点のデータ格納方法の一例を示す図である。Ｓｔａｃｋｉｎｇ１０９が、固定小数点を容易に演算できるように、ＤＢ３０への格納形式を整数としている。 FIG. 13 is a diagram illustrating an example of a fixed-point data storage method in the stacking calculation. The storage format in the DB 30 is an integer so that the Stacking 109 can easily calculate a fixed point.

In FIG. 13 there are two columns for the stacking operation. The first column of the Nth row stores data 1031, which is 0.08 multiplied by 100 to give 8. The second column of the Nth row stores data 1302, which is 0.5 multiplied by 10 to give 5.

Ｓｔａｃｋｉｎｇ１０９は、固定小数点を意識することなく、整数として演算を行い、ＤＢサーバ１が、結果格納領域１１５に格納された集約結果の値の桁をもとに戻して最終的な集約結果とする。なお、固定小数点の位置についてはＤＢＭＳ２０で予め設定されたものである。 The Stacking 109 performs an operation as an integer without regard to the fixed decimal point, and the DB server 1 restores the digit of the value of the aggregation result stored in the result storage area 115 to obtain the final aggregation result. Note that the position of the fixed point is preset by the DBMS 20.
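
The scaling described for FIG. 13 can be illustrated as follows. The function names and the per-column `scale` parameter (the number of decimal digits shifted) are hypothetical; in the embodiment the position of the fixed point is preset by the DBMS 20.

```python
from decimal import Decimal


def to_stored(value, scale):
    """Scale a decimal value to the integer held in the DB (e.g. 0.08 -> 8 for scale 2)."""
    return int(Decimal(value) * (10 ** scale))


def from_stored(stored, scale):
    """Restore the decimal digits of an integer result on the server side."""
    return Decimal(stored).scaleb(-scale)
```

For instance, summing the stored integers 8 and 12 of a column with scale 2 gives 20 on the FPGA side, which the server restores to 0.20.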

図１４は、スタッキング演算用レジスタ及びＳＲＡＭの構成を示す図である。この構成は、Ｓｔａｃｋｉｎｇ１０９−０、１０９−１のハードウェア構成を示す。 FIG. 14 is a diagram illustrating a configuration of the stacking calculation register and the SRAM. This configuration shows the hardware configuration of Stacking 109-0 and 109-1.

スタッキング処理部を構成するＳｔａｃｋｉｎｇ１０９は、後述するように、スタッキング演算は、値と演算子を積み上げて、演算子が出現すると直近の２つの値に対する演算を実行する。そこで、値を保持する回路の構成を２つのレジスタＲＥＧ０（１４０１）、ＲＥＧ１（１４０２）と１つのＳＲＡＭ（１４０３）とする。なお、ＳＲＡＭ（１４０３）は、図１に示したＳＲＡＭ２００内の所定の領域を示す。 As will be described later, the Stacking 109 constituting the stacking processing unit accumulates values and operators, and executes an operation on the two most recent values when the operators appear. Therefore, the configuration of the circuit that holds the values is assumed to be two registers REG0 (1401) and REG1 (1402) and one SRAM (1403). The SRAM (1403) indicates a predetermined area in the SRAM 200 shown in FIG.

図１５は、ＦＰＧＡ２のＳｔａｃｋｉｎｇ１０９で実行されるスタッキング演算の一例を示すフローチャートである。 FIG. 15 is a flowchart illustrating an example of the stacking calculation executed in the Stacking 109 of the FPGA 2.

Ｓｔａｃｋｉｎｇ１０９−０、１０９−１は、Ｐｒｏｊｅｃｔｉｏｎ１０７がＳＲＡＭＰ０、Ｐ１（２０５、２０６）に演算結果を書き込むと処理を開始する（１５０１）。 The Stacking 109-0 and 109-1 start processing when the Projection 107 writes the operation result in the SRAMs P0 and P1 (205 and 206) (1501).

When the stacking operation starts (1501), the Stacking 109 receives one stack operation command of the database command from the register 104 (1502) and determines whether the content of the command is a numerical value or an operator (1503).

ステップ１５０３の判定において、コマンドの内容がＰｒｏｊｅｃｔｉｏｎ１０７の出力、もしくは、直値であれば、数値の格納先を判定するステップ１５０４に進む。ステップ１５０４において、Ｓｔａｃｋｉｎｇ１０９は、図１４のスタックレジスタ１４０１、１４０２にデータが格納済みであれば、ＳＲＡＭ１４０３にＲＥＧ０（１４０１）のデータを書込、ＲＥＧ１（１４０２）の値をＲＥＧ０（１４０１）に書き込み、ステップ１５０３の入力データをＲＥＧ１（１４０２）に書き込む。 If it is determined in step 1503 that the content of the command is the output of the projection 107 or a direct value, the process proceeds to step 1504 for determining the storage location of the numerical value. In step 1504, the Stacking 109 writes the data of REG0 (1401) to the SRAM 1403 and writes the value of REG1 (1402) to REG0 (1401) if the data has already been stored in the stack registers 1401 and 1402 of FIG. The input data of step 1503 is written into REG1 (1402).

If it is determined in step 1504 that data has not already been stored in the stack registers 1401 and 1402, the process proceeds to step 1509, where the Stacking 109 writes the data of REG1 (1402) to REG0 (1401) and writes the input data of step 1503 to REG1 (1402).

If it is determined in step 1503 that the content of the stack operation command is an operator, the Stacking 109 applies the operator to the outputs of REG0 (1401) and REG1 (1402) and writes the operation result back to REG1 (1402).

さらに、ＳＲＡＭ（１４０３）にデータが存在すれば、Ｓｔａｃｋｉｎｇ１０９はＳＲＡＭ（１４０３）からＲＥＧ０（１４０１）にデータを書き込む。データの書き込みが完了すると、Ｓｔａｃｋｉｎｇ１０９は、レジスタ１０４を参照して、次のスタック演算コマンドが存在するか否かの判定を行う（１５０５）。 Furthermore, if data exists in the SRAM (1403), the Stacking 109 writes data from the SRAM (1403) to the REG0 (1401). When the data writing is completed, the Stacking 109 refers to the register 104 and determines whether or not there is a next stack operation command (1505).

ステップ１５０５において、スタック演算コマンドの内容が終了コマンドでなければ、Ｓｔａｃｋｉｎｇ１０９は、ステップ１５０２に戻って上記処理を繰り返す。一方、ステップ１５０５において、次のスタック演算コマンドが終了コマンドであれば、Ｓｔａｃｋｉｎｇ１０９はスタッキング演算を終了（１５０６）する。 In step 1505, if the content of the stack operation command is not an end command, the stacking 109 returns to step 1502 and repeats the above processing. On the other hand, if the next stack operation command is an end command in step 1505, the Stacking 109 ends the stacking operation (1506).

上記処理によって、プロジェクション処理の結果がＳＲＡＭ２０５、２０６へ書き込まれると、Ｓｔａｃｋｉｎｇ１０９によってスタッキング処理が実行されて、ＳＲＡＭＧ０（２０７）へ書き込まれる。 When the result of the projection processing is written to the SRAMs 205 and 206 by the above processing, the stacking processing is executed by the Stacking 109 and written to the SRAM G0 (207).

図１６Ａは、スタッキング演算の一例を示す図である。また、図１６Ｂは、スタッキング演算で使用されるコマンドの一例を示す図である。図１６Ａは、横軸を時刻としてスタッキング処理部１０９へ入力されるスタック演算コマンド１６０２、１６０５〜１６０８と、レジスタ１４０１、１４０２の状態を示す。スタック演算コマンド１６０５〜１６０８の内容は、図１６Ｂのコード１６１２に対応する。図１６Ｂのスタック演算コマンド１６０１は、ＮＯ１６１０と、Ｓｔａｃｋｉｎｇ１０９で実行するコマンド１６１１と、コード１６１２と、コマンドの意味１６１３から構成される。 FIG. 16A is a diagram illustrating an example of a stacking calculation. FIG. 16B is a diagram illustrating an example of commands used in the stacking calculation. FIG. 16A shows the states of the stack operation commands 1602 and 1605 to 1608 and the registers 1401 and 1402 input to the stacking processing unit 109 with the horizontal axis as time. The contents of the stack operation commands 1605 to 1608 correspond to the code 1612 in FIG. 16B. The stack operation command 1601 of FIG. 16B includes NO1610, a command 1611 executed in Stacking 109, a code 1612, and a command meaning 1613.

図１６Ｂで示すように、各コマンド１６１１には、コード１６１２が定められており、Ｓｔａｃｋｉｎｇ１０９は、コマンド１６１１に対応するコード１６１２を受け取り、スタッキング演算を実行する。 As shown in FIG. 16B, a code 1612 is defined for each command 1611, and the Stacking 109 receives the code 1612 corresponding to the command 1611 and executes a stacking operation.

At time T0 in FIG. 16A, the Stacking 109 receives, as a command, 0x81 (1605) with value 1 of Projection column 1, and stores the value of Projection column 1 in stack register REG1 (1402). Here, the code 1612 of the stack operation command 1605 is "0x81", and its meaning 1613 in FIG. 16B is "push Projection output column 1 onto the stack".

That is, the first stack operation command stores the value of output column 1 of the Projection 107 in register REG1 (1402). In the notation (64'd1), "64'" indicates that the data is 64 bits wide and that the data is d1. The stack in FIG. 16B refers to stack register REG0 (1401) and stack register REG1 (1402).

図１６Ａの時刻Ｔ１において、スタック演算コマンドとして、Ｓｔａｃｋｉｎｇ１０９は、０Ｘ１０（１６０６）の直値０の値を受け取り、スタックレジスタＲＥＧ１（１４０２）の値をＲＥＧ０（１４０１）に書き込み、直値０の値をＲＥＧ１（１４０２）に書き込む。 At time T1 in FIG. 16A, as a stack operation command, Stacking 109 receives the value of direct value 0 of 0X10 (1606), writes the value of stack register REG1 (1402) to REG0 (1401), and sets the value of direct value 0. Write to REG1 (1402).

なお、直値０の値は別途レジスタに設定しており、ここでは「２」とする。時刻Ｔ２において、Ｓｔａｃｋｉｎｇ１０９は、スタック演算コマンドとして、０Ｘ０１（１６０７）の和演算子を受け取り、スタックレジスタ１４０１、１４０２の値を足して、ＲＥＧ１（１４０２）に書き込む。 The value of direct value 0 is set in a separate register, and is set to “2” here. At time T2, the Stacking 109 receives the 0X01 (1607) sum operator as a stack operation command, adds the values of the stack registers 1401 and 1402, and writes the result to the REG1 (1402).

時刻Ｔ３において、Ｓｔａｃｋｉｎｇ１０９は、コマンドとして０Ｘ７Ｆ（１６０８）のスタック演算終了を受け取り、最終的なスタック演算結果を出力する。 At time T3, the Stacking 109 receives the stack operation end of 0X7F (1608) as a command, and outputs the final stack operation result.

以上のように、Ｓｔａｃｋｉｎｇ１０９は、スタックレジスタ１４０１、１４０２を用いてスタック演算コマンドに応じた演算を実行する。 As described above, the Stacking 109 executes an operation according to the stack operation command using the stack registers 1401 and 1402.
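
The register behaviour of FIGS. 14 to 16 can be sketched as a small stack machine with two registers and a spill area. Only the four codes that appear in the description (0x81, 0x10, 0x01, 0x7F) are handled; the function name and the way the Projection value and the immediate value are passed in are assumptions made for this sketch.

```python
def run_stacking(commands, projection_col1, immediate0):
    """Execute stack operation commands using REG0, REG1 and an SRAM spill area."""
    reg0 = reg1 = None
    sram = []  # holds values pushed out of the two registers
    for code in commands:
        if code == 0x7F:            # end of stack operation: output the result
            return reg1
        if code in (0x81, 0x10):    # push Projection column 1 or immediate value 0
            value = projection_col1 if code == 0x81 else immediate0
            if reg0 is not None and reg1 is not None:
                sram.append(reg0)   # both registers hold data: spill REG0
            reg0, reg1 = reg1, value
        elif code == 0x01:          # sum operator on the two most recent values
            reg1 = reg0 + reg1
            reg0 = sram.pop() if sram else None
        else:
            raise ValueError(f"code {code:#x} is not covered by this sketch")
    raise ValueError("missing end command 0x7F")
```

With the values of the example in FIG. 16A (Projection column 1 equal to 1 and immediate value 0 set to 2), the command sequence 0x81, 0x10, 0x01, 0x7F yields 3.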

FIG. 17 is a diagram illustrating an example of the aggregation result 112 output by the Aggregation 111. The aggregation result 112 stores a grouping column followed by its aggregation results, as in grouping column 0 (1701), aggregation result 0 (1702), and aggregation result 1 (1703), and repeats this format once for each grouping column, that is, for each group. Any portion 1704 of the aggregation result 112 where no data exists is filled with zeros up to the end of a predetermined area.

図１７の例では、１つのグルーピング化列に対してＦＰＧＡ２が２つの集約結果１７０２、１７０３を出力する例を示したが、これに限定されるものではなく、集約結果の数はＦＰＧＡ２の設定に応じて変更することができる。 In the example of FIG. 17, the example in which the FPGA 2 outputs two aggregation results 1702 and 1703 for one grouping column is shown, but the present invention is not limited to this, and the number of aggregation results depends on the setting of the FPGA 2. It can be changed accordingly.
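
One possible byte layout for the aggregation result 112 of FIG. 17 is sketched below. The 64-bit width of every field, the little-endian byte order, and the `area_size` parameter are assumptions; the description fixes only the order of the fields and the zero fill.

```python
import struct

# Assumption: one 64-bit grouping column followed by two 64-bit aggregation results.
RECORD = struct.Struct("<Qqq")


def pack_aggregation_result(groups, area_size):
    """Pack (grouping_column, result0, result1) records and zero-fill the rest."""
    body = b"".join(RECORD.pack(*group) for group in groups)
    if len(body) > area_size:
        raise ValueError("results do not fit in the predetermined area")
    return body.ljust(area_size, b"\x00")  # zero-fill where no data exists
```

A receiver that knows the number of groups, which the message 113 reports, can then read back exactly that many records and ignore the zero-filled tail.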

図１８は、ＦＰＧＡ２で行われる集約処理の一例を示すタイミングチャートである。ＦＰＧＡ２では、Ｆｉｌｔｅｒ１０６によるフィルタリング処理と，Ｐｒｏｊｅｃｔｉｏｎ１０７によるプロジェクション処理はページ（８ＫＢ）単位で行っている。 FIG. 18 is a timing chart illustrating an example of aggregation processing performed in the FPGA 2. In the FPGA 2, the filtering process by the filter 106 and the projection process by the projection 107 are performed in units of pages (8 KB).

From time T0, the FPGA executes filtering process FLT_0 (1801). When the filtering process is complete, projection process PRJ_0 (1802) is started at time T1 using the result of filtering process FLT_0 (1801).

プロジェクション処理ＰＲＪ＿０の結果を用いて、Ｇｒｏｕｐｉｎｇ１０８とＳｔａｃｋｉｎｇ１０９では、グルーピング処理ＧＲＰ＿０（１８０３）とスタッキング処理ＳＴＫ＿０（１８０４）が並列して実行される。なお、グルーピング処理ＧＲＰ＿０（１８０３）とスタッキング処理ＳＴＫ＿０（１８０４）は、プロジェクション処理ＰＲＪ＿０（１８０２）で行データが出力された時刻Ｔ１Ａから開始される。 Using the result of the projection process PRJ_0, in the Grouping 108 and the Stacking 109, the grouping process GRP_0 (1803) and the stacking process STK_0 (1804) are executed in parallel. The grouping process GRP_0 (1803) and the stacking process STK_0 (1804) are started from time T1A when the row data is output in the projection process PRJ_0 (1802).

グルーピング処理ＧＲＰ＿０（１８０３）でシノニムが発生した場合には、グルーピング処理の結果と共に集約処理へシノニムの情報を通知する。 When a synonym occurs in the grouping process GRP_0 (1803), the synonym information is notified to the aggregation process together with the result of the grouping process.

When grouping process GRP_0 (1803) and stacking process STK_0 (1804) are complete, the Aggregation 111 executes aggregation process A_0 (1805). If there is synonym information, the Aggregation 111 also performs synonym process S_0 (1806), which writes the grouping column and the stack columns to the synonym 114 described later.

ＦＰＧＡ２では、処理対象データ９０１の最終ページの処理である、フィルタリング処理ＦＬＴ＿Ｎ（１８０７）と、プロジェクション処理ＰＲＪ＿Ｎ（１８０８）と、グルーピング処理ＧＲＰ＿Ｎ（１８０９）と、スタッキング処理ＳＴＫ＿Ｎ（１８１０）と、集約処理Ａ＿Ｎ（１８１１）と、シノニムＳ＿Ｎ（１８１２）が終了すると、図１７に示した集約結果１１２とシノニム１１４及びメッセージ１１３をＤＢサーバ１に転送する。また、ＦＰＧＡ２は、完了したデータベースコマンドについて実行完了をＤＢサーバ１へ通知する。なお、メッセージ１１３には、ＦＰＧＡ２が実行したデータベースコマンドで集約したグループの数と、シノニムが発生した数と、演算オーバーフローの情報等が含まれる。 In the FPGA 2, the filtering process FLT_N (1807), the projection process PRJ_N (1808), the grouping process GRP_N (1809), the stacking process STK_N (1810), and the aggregation process A_N, which are the processes of the last page of the processing target data 901, are performed. When (1811) and the synonym S_N (1812) are finished, the aggregation result 112, the synonym 114, and the message 113 shown in FIG. 17 are transferred to the DB server 1. Further, the FPGA 2 notifies the DB server 1 of the completion of execution of the completed database command. Note that the message 113 includes the number of groups aggregated by the database command executed by the FPGA 2, the number of synonyms generated, information on operation overflow, and the like.

The DB server 1 stores the aggregation result 112 received from the FPGA 2 in the result storage area 115, stores the received synonym 114 in the synonym storage area 117, and stores the message 113 in the message storage area 116.

When synonyms occur frequently, for example, the synonym 114 may be transferred to the synonym storage area 117 of the DB server 1 as soon as the synonym 114 becomes full.

<Cooperation between FPGA and server>
Next, the aggregation process performed when a synonym occurs in the grouping process of the FPGA 2 will be described. When a synonym is detected in FIG. 12 (1204), that is, when the same hash value has been assigned to different grouping columns, the grouping column and the stack columns are transferred to the DB server 1 in the format of FIG. 19 described later, and the grouping and aggregation module 120 of the DB server 1 performs the grouping and the aggregation operation.
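The synonym condition (one hash value claimed by two different grouping columns) can be illustrated with a short software sketch. It is not the FPGA logic: the CRC32-based `bucket_of`, the table size `n_buckets`, and the use of a running sum as the aggregate are assumptions made for the example.

```python
import zlib

def bucket_of(key: str, n_buckets: int) -> int:
    """Hypothetical hash: CRC32 of the key folded into n_buckets slots."""
    return zlib.crc32(key.encode("utf-8")) % n_buckets

def group_page(rows, n_buckets=4):
    """rows: iterable of (grouping_key, value).

    Returns (aggregated, synonyms). Rows whose hash slot is already owned
    by a different grouping key are not aggregated locally; they are set
    aside so that the server can group and aggregate them later.
    """
    owner = {}       # hash slot -> grouping key that claimed it first
    aggregated = {}  # grouping key -> running sum
    synonyms = []    # (grouping_key, value) rows diverted to the server
    for key, value in rows:
        slot = bucket_of(key, n_buckets)
        owner.setdefault(slot, key)
        if owner[slot] == key:
            aggregated[key] = aggregated.get(key, 0) + value
        else:
            synonyms.append((key, value))
    return aggregated, synonyms
```

With `n_buckets=1` every key shares one slot, so every key after the first is reported as a synonym; this is a convenient way to exercise the diversion path.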

FIG. 19 is a diagram illustrating an example of the format of the synonym 114, which is output by Aggregation 111.

The synonym 114, which holds the synonym results of the grouping process, stores each record as a grouping column followed by its stack columns, for example grouping column 0 (2201), stack column 0 (2202), and stack column 1 (2203). Data in this same format is stored for as many groups (grouping columns) as there are. Any part of the synonym 114 that holds no data is filled with zeros up to a predetermined area (2204).

The grouping column 2201 stores the grouping data 1102 for which the synonym occurred. The stack columns store the result of the stacking process for that group.
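A layout of this kind (grouping column, then stack columns, repeated per group and zero-filled to a fixed area) might be modelled as follows. The field widths, the little-endian encoding, the two stack columns, and the 64-byte area are assumptions for illustration; the text does not specify them.

```python
import struct

# Hypothetical record: 8-byte grouping column, then two 64-bit stack columns.
RECORD = struct.Struct("<8sqq")

def pack_synonyms(records, area_size=64):
    """Pack (grouping_key, stack0, stack1) records and zero-fill the rest."""
    body = b"".join(
        RECORD.pack(key.encode("ascii"), stack0, stack1)
        for key, stack0, stack1 in records
    )
    if len(body) > area_size:
        raise ValueError("synonym area is full")
    return body + b"\x00" * (area_size - len(body))

def unpack_synonyms(buf, count):
    """Read back `count` records; the count is assumed to be known from
    elsewhere, for example a synonym counter reported with the result."""
    out = []
    for i in range(count):
        key, stack0, stack1 = RECORD.unpack_from(buf, i * RECORD.size)
        out.append((key.rstrip(b"\x00").decode("ascii"), stack0, stack1))
    return out
```

A fixed-size, zero-filled area lets the receiver read the buffer without parsing a variable-length header, at the cost of transferring unused bytes.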

FIG. 20 is a flowchart illustrating an example of processing performed in the DB server 1. The DBMS 20 of the DB server 1 generates database commands based on a query received from a computer (not shown) and requests the FPGA 2 to process them. The FPGA 2 performs the database processing in predetermined page units and returns the aggregation results to the DB server 1. The DBMS 20 receives the multiple aggregation results, and once it has received aggregation results for all of the data specified by the database commands, it totals them and returns the result to the computer that sent the query (not shown).

First, the DBMS 20 accepts a query for the DB 30 from another computer (not shown) (1901). Based on the accepted query, the DBMS 20 generates database commands for the DB 30 of the storage device 3, as shown in FIG. 4, and issues them to the FPGA 2 (1902). That is, the DBMS 20 determines the range of the DB 30 that the query targets and the unit size (for example, 8 MB) into which that range is divided for the FPGA 2 to process with one database command, determines the content of each database process, and instructs the FPGA 2 accordingly.
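Dividing the processing range into per-command units can be sketched as below. The dictionary fields `offset` and `size`, byte addressing, and treating 1 MB as 1024 × 1024 bytes are assumptions; the actual command format is the one shown in FIG. 4.

```python
MB = 1024 * 1024  # assumption: binary megabytes

def split_into_commands(start, length, unit_size=8 * MB):
    """Split the range [start, start + length) into unit-size commands.

    The last command may be shorter than unit_size.
    """
    commands = []
    offset = start
    end = start + length
    while offset < end:
        size = min(unit_size, end - offset)
        commands.append({"offset": offset, "size": size})
        offset += size
    return commands
```

For a 20 MB range with the 8 MB example unit, this yields three commands of 8 MB, 8 MB, and 4 MB.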

The DBMS 20 determines whether a completion command for a database command has been received from the FPGA 2 (1903). If not, it waits for a completion command. If a completion command has been received, the process proceeds to step 1904, where the DBMS 20 determines whether completion commands have been received for all of the data in the range targeted by the query.

Because the FPGA 2 returns the results of the database processing in predetermined page units, the DBMS 20 waits until all of the processing target data corresponding to the database commands has been processed. If database processing is complete for all of the data, the process proceeds to step 1905; otherwise, it returns to step 1903 and waits for a completion command.
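The loop formed by steps 1903 and 1904 can be sketched with a blocking queue standing in for the channel on which completion commands arrive. The queue, the integer command IDs, and the function name are hypothetical; the text does not describe how completions are delivered.

```python
import queue

def wait_for_all(completions, issued_ids):
    """Block until a completion has arrived for every issued command.

    completions: queue.Queue yielding the ID of each finished command.
    Returns the IDs in the order their completions were received.
    """
    pending = set(issued_ids)
    arrival_order = []
    while pending:                     # step 1904: anything outstanding?
        done_id = completions.get()    # step 1903: wait for a completion
        if done_id in pending:
            pending.discard(done_id)
            arrival_order.append(done_id)
    return arrival_order
```

Completions may arrive in any order, so the sketch tracks a set of outstanding IDs rather than counting.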

In step 1905, the DBMS 20 acquires the per-command aggregation results stored in the result storage area 115. The result storage area 115 stores the aggregation result 112 shown in FIG. 19 for each command.

The DBMS 20 computes a hash value for each grouping column in the aggregation results read in step 1905 and generates the group hash table 119. Hash values are generated in the same way as in Grouping 108 of the FPGA 2: the hash value of each grouping column of the aggregation result 112 shown in FIG. 17 is calculated. The DBMS 20 then stores the group ID of the grouping column and the hash value in the group hash table 119 in association with each other. The hash value calculation and the group ID determination may use the same processing as Grouping 108 of the FPGA 2.

FIG. 24 is a diagram illustrating an example of the group hash table 119. The group hash table 119 has the same configuration as the hash table 728 of the FPGA 2 shown in FIG. 22.

Each entry of the group hash table 119 consists of a hash value 1191, a flag 1192 indicating whether the hash value is in use, and a group ID 1193 corresponding to the hash value.

The hash value 1191 stores the hash value calculated from the grouping column 1701 (FIG. 17) of the aggregation results read in step 1905. The flag 1192 is set to "1" if the hash value 1191 of the entry is in use and to "0" otherwise.

Next, in step 1907 of FIG. 20, the DBMS 20 sorts the group hash table 119 by the hash value 1191. Following the group ID 1193 of the sorted group hash table 119, the DBMS 20 reorders the grouping columns 1701 of the aggregation result 112 so that the correspondence between grouping columns and aggregation results is aligned across the aggregation results of every page.
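Building the group hash table and deriving one common group order from it might look like the sketch below. The CRC32 hash, the `Entry` field names, assigning group IDs in order of first appearance, and breaking hash ties by key are assumptions; the text only states that the server may use the same hashing as Grouping 108.

```python
import zlib
from dataclasses import dataclass

@dataclass
class Entry:
    hash_value: int  # corresponds to hash value 1191
    in_use: int      # corresponds to flag 1192 (1 = in use)
    group_id: int    # corresponds to group ID 1193

def hash_of(key: str) -> int:
    """Hypothetical hash function standing in for the one in Grouping 108."""
    return zlib.crc32(key.encode("utf-8"))

def build_group_hash_table(results):
    """results: list of per-command lists of (grouping_key, aggregate)."""
    table = {}  # grouping key -> Entry
    for result in results:
        for key, _ in result:
            if key not in table:
                table[key] = Entry(hash_of(key), 1, len(table))
    return table

def sorted_group_order(table):
    """Grouping keys ordered by hash value (key used only to break ties)."""
    ordered = sorted(table.items(), key=lambda kv: (kv[1].hash_value, kv[0]))
    return [key for key, _ in ordered]
```

Sorting by hash value gives every command's result the same group order regardless of the order in which the groups happened to appear in each command.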

In step 1908, the DBMS 20 totals the aggregation results whose grouping columns were aligned by the processing up to step 1907. Through this processing, the aggregation results, which were aggregated per command and whose grouping columns appear in differing orders, are totaled over the processing target range of the DB 30 specified by the query, producing the processing result for the query. At this point, the processing result is a total that does not yet include the synonym 114.

Next, the DBMS 20 reads the data in the synonym storage area 117 (1909) and determines whether a synonym occurred in the database processing of the FPGA 2 (1910). If the synonym 114 has been written to the synonym storage area 117, the process proceeds to step 1911. If nothing has been written to the synonym storage area 117, the process proceeds to step 1913, where the DBMS 20 returns the total to the source of the query and ends the process.

In step 1911, the DBMS 20 performs the grouping process again on the data of the synonym 114. That is, the DBMS 20 acquires the grouping column (2201) of the synonym 114 stored in the synonym storage area 117 and computes its hash value. The DBMS 20 then searches the hash values 1191 of the group hash table 119 for an entry that matches the computed hash value. If there is no matching entry, the DBMS 20 adds the hash value to the group hash table 119 as a new hash value 1191.

In step 1912, the DBMS 20 adds the stack columns (2202, 2203) contained in the data of the synonym 114 whose hash value was computed to the total for the group ID 1193 of the matching or newly added entry in the group hash table 119, and recalculates that total. Then, in step 1913, the DBMS 20 returns the total recalculated with the data of the synonym 114 to the source of the query and ends the process.
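Steps 1911 and 1912 amount to a lookup-or-insert followed by an update of the matching total. A minimal sketch, assuming a Python dictionary keyed by grouping column as a stand-in for the group hash table 119 and a sum as the aggregate (both are assumptions):

```python
def merge_synonyms(totals, synonym_records):
    """Fold synonym records into the totals computed without them.

    totals: dict mapping grouping key -> total so far (updated in place).
    synonym_records: iterable of (grouping_key, stack_value).
    """
    for key, value in synonym_records:
        if key not in totals:   # step 1911: no matching entry, so add one
            totals[key] = 0
        totals[key] += value    # step 1912: recalculate that group's total
    return totals
```

A group that collided on the accelerator may never have appeared in the regular aggregation results, which is why the insert path is needed.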

Through the above processing, the processing results of the DB 30 that the FPGA 2 aggregated in predetermined processing units (page units) are totaled with the synonym 114 taken into account and sent to the source of the query. Steps 1905 to 1908 correspond to the processing of the re-aggregation module 118 of the DBMS 20, and steps 1909 to 1912 correspond to the processing of the grouping and aggregation module 120.

FIG. 21 is a diagram illustrating an example of the re-aggregation of the aggregation result 112 by the DB server 1. This processing corresponds to steps 1905 to 1908 in FIG. 20. In the illustrated example, the database commands issued to the FPGA 2 consist of command 1 to command N, and N aggregation results 2101-1 to 2101-N have been output to the DB server 1.

The aggregation result of command 1 processed by the FPGA 2 is 2101-1, and that of command N is 2101-N. The grouping columns appear in different orders in the aggregation results 2101-1 and 2101-N. For this reason, the re-aggregation module 118 of the DB server 1 cannot re-aggregate these results as they are.

Therefore, the DB server 1 generates the group hash table 119 for the aggregation results of the N commands received from the FPGA 2, performs grouping and reordering, and computes new aggregation results 2102-1 to 2102-N.

In the aggregation results 2102-1 and 2102-N, the grouping columns appear in the same order, so the DB server 1 can compute the sum or the maximum of the aggregation results of the commands row by row. The re-aggregation module 118 of the DB server 1 performs the re-aggregation operation using the aggregation results 2102-1 and 2102-N and finally computes the illustrated aggregation result 2103.
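Once every command's result is laid out in the same group order, re-aggregation reduces to a row-wise operation. The sketch below assumes that `order` is a fixed group order supplied by the caller (for example, the hash order of the group hash table) and that a group missing from a command's result is simply skipped; the group names and values are made up.

```python
def align(result, order):
    """Lay out one command's (grouping_key, aggregate) pairs in `order`.

    Groups absent from this command's result are marked with None.
    """
    lookup = dict(result)
    return [lookup.get(key) for key in order]

def reaggregate(results, order, op=sum):
    """Combine aligned per-command results row by row with `op`."""
    aligned = [align(result, order) for result in results]
    return [
        op([value for value in row if value is not None])
        for row in zip(*aligned)
    ]

results = [
    [("tokyo", 3), ("osaka", 5)],
    [("osaka", 2), ("tokyo", 4), ("nagoya", 1)],
]
order = ["nagoya", "osaka", "tokyo"]
print(reaggregate(results, order))          # [1, 7, 7]
print(reaggregate(results, order, op=max))  # [1, 5, 4]
```

Passing `op` as a parameter reflects that the same aligned layout serves both sums and maxima; an average would need a sum and a count carried separately.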

As described above, according to the first embodiment, the DB server 1 divides the DB 30 and requests processing from the FPGA 2, a hardware accelerator, in units of several MB. Because the bandwidth of the DRAM 136 is larger than that of the SSD 137, reading the DB 30 from the SSD 137 and reading data from the DRAM 136 can proceed in parallel. After the control unit 129 has copied the processing target data of the DB 30 from the SSD 137 to the DRAM 136, the FPGA 2 takes in data of a predetermined processing size (for example, a page) from the DRAM 136, which has a higher read speed than the SSD 137, and processing performance can thereby be improved.

The FPGA 2 executes the aggregation process per database command, and the DB server 1 re-aggregates the aggregation results of the multiple database commands. Because the DB server 1 and the FPGA 2 execute the database processing cooperatively, the processing performance of the database processing system can be improved.

In addition, by making the page of the DB 30 the unit of database processing in the filtering and projection processes, the FPGA 2 can realize pipeline processing and further improve the performance of the hardware accelerator.

The second embodiment describes the processing of the DB server 1 when synonyms occur frequently. In this embodiment, when the synonym occurrence frequency exceeds a predetermined threshold, the DB server 1 reduces the data size that the FPGA 2 processes with one command, which suppresses the occurrence of synonyms in the grouping process. The other configurations are the same as in the first embodiment.

The synonym occurrence frequency can be measured by, for example, the number of grouping columns contained in the synonym 114, or the ratio of the number of grouping columns contained in the synonym 114 to the processing target data 901.

FIG. 25 is a diagram illustrating processing that reduces the data size processed by the FPGA 2 when synonyms occur frequently. Items 1004 to 1008 are the same as in FIG. 10 of the first embodiment, and the DB server 1 has set the data size that the FPGA 2 processes with one command to 8 MB.

If synonyms occur frequently when Grouping 108 performs grouping, the load on the DB server 1, which executes the re-aggregation module 118, increases, and the DB processing performance of the computer system as a whole decreases.

Therefore, when the DB server 1 determines that the synonym occurrence frequency is high, it reduces the size of the processing unit of the DB 30 per command processed by the FPGA 2. Before the reduction, the data size per command is 8 MB (1004); after the reduction, it is 4 MB (2006).
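The size-reduction decision can be sketched as a small policy function. Halving matches the 8 MB to 4 MB example above, but the threshold value, the ratio-based frequency measure, the 8 KB lower bound, and the function name are assumptions chosen for illustration.

```python
MB = 1024 * 1024
KB = 1024

def next_unit_size(unit_size, synonym_groups, processed_rows,
                   threshold=0.01, min_size=8 * KB):
    """Return the unit size to use for subsequent database commands.

    synonym_groups / processed_rows is used as the synonym frequency.
    The unit size is halved when the frequency exceeds `threshold`,
    but never below `min_size`.
    """
    frequency = synonym_groups / processed_rows if processed_rows else 0.0
    if frequency > threshold and unit_size // 2 >= min_size:
        return unit_size // 2
    return unit_size
```

A smaller unit means fewer distinct groups compete for hash slots within one command, at the cost of more commands and more per-command results for the server to re-aggregate.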

The FPGA 2 reads the 4 MB (2006) of data from the DRAM 136 one page (8 KB) at a time and, as in the first embodiment, executes the grouping process, the stacking process (numerical operations), the aggregation operation, and so on.

When the aggregation process for the 8 MB of one command is complete, the FPGA 2 returns the grouping columns and the aggregation results to the DB server 1 in no particular group order, as in the first embodiment. On receiving them, the DB server 1, as in the first embodiment, has the re-aggregation module 118 generate the group hash table 119 from the grouping columns, reorder the aggregation results by hash value, and execute the global aggregation process across all commands.

Thus, in addition to the effects of the first embodiment, the second embodiment can reduce how often the synonym 114 occurs and suppress degradation of the processing performance of the DB server 1.

<Summary>
The first and second embodiments show an example with a single storage device 3 having the FPGA 2, but a plurality of storage devices 3 can be connected to the PCI switch 4. The FPGA 2 and the storage device 3 may also be separate, in which case a plurality of storage devices 3 can be connected to each FPGA 2.

The embodiments show the storage device 3 configured with the SSD 137, which stores the DB 30, and the DRAM 136, into which data is first read from the SSD 137 and from which the FPGA 2 then reads it, but the configuration is not limited to this. The storage device 3 only needs to have, for example, a first storage unit that is a nonvolatile semiconductor storage medium and stores the DB 30, and a second storage unit that is a semiconductor storage medium with a higher read speed than the first storage unit and from which the FPGA 2 reads data.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments are described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations having every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations can be added, deleted, or substituted, alone or in combination.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be realized in hardware, for example by designing them as integrated circuits. They may also be realized in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be placed in a memory, in a recording device such as a hard disk or SSD (Solid State Drive), or on a recording medium such as an IC card, SD card, or DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines in a product are necessarily shown. In practice, almost all of the components may be considered to be connected to one another.

Claims

A computer system comprising:
A server including a processor and a memory;
An accelerator connected to the server for database processing; and
A storage device connected to the accelerator for storing a database,
The server includes
A server command processing unit that receives a query, generates a database command, determines a database range to be processed and a unit size into which the database range is divided for processing by one database command, and instructs the accelerator; and
A re-aggregation unit that aggregates the output of the accelerator and generates a processing result for the query,
The accelerator includes
A database processing unit that, based on a command from the server command processing unit, reads the processing target data of the database from the storage device in the unit size, divides the processing target data into predetermined processing units, executes database processing including grouping processing, stacking processing, and aggregation processing for each predetermined processing unit, and outputs an aggregation result,
The re-aggregation unit
When the aggregation results for the range of the database to be processed are received from the accelerator, aggregates the aggregation results and generates the processing result for the query.

The computer system according to claim 1,
The database processing unit
A computer system, wherein the grouping process and the stacking process are executed in parallel in the predetermined processing unit.

The computer system according to claim 1,
The database processing unit
A grouping processing unit that performs a grouping process on the data string of the predetermined processing unit;
The grouping processing unit
A computer system, wherein a hash value and group information are calculated for the data in the data string, the hash value and the group information are stored as a pair in hash information, and a collision of the hash value is detected if the hash information contains different group information with the same hash value.

The computer system according to claim 3,
The database processing unit
When the grouping processing unit detects a collision of the hash values, the computer system outputs the group information and a numerical value that is a result of the stacking processing.

The computer system according to claim 1,
The storage device
A first storage unit for storing the database;
A second storage unit configured by a semiconductor storage medium having a higher reading speed than the first storage unit and causing the accelerator to read data;
The accelerator is
A computer system, wherein the range of the database to be processed is copied from the first storage unit to the second storage unit and read from the second storage unit for each unit size.

The computer system according to claim 1,
The database processing unit
In the grouping process, a hash value and group information are calculated for data in the data string of the predetermined processing unit,
In the stacking process, a numerical value is calculated for data in the data string of the predetermined processing unit,
In the aggregation process, numerical information obtained by aggregating the numerical values for each group information is calculated, and the group information and the numerical information are included in the aggregation result,
The re-aggregation unit
A computer system, wherein a hash value is calculated from the group information and stored in group hash information, the aggregation results for the range of the database to be processed are rearranged based on the hash value, and the numerical information is then aggregated to generate the processing result for the query.

A computer system according to claim 6, wherein
The database processing unit
In the grouping process, the hash value and the group information are stored as a pair in hash information, and when the hash information contains different group information with the same hash value, a collision of the hash value is detected and the group information and the numerical value resulting from the stacking process are output,
The re-aggregation unit
A computer system, wherein the group hash information is referred to, the group information whose hash values collided and the numerical value of the stacking process are re-aggregated, and the processing result for the query is recalculated.

The computer system according to claim 3,
The server command processing unit
A computer system, wherein, when the frequency of occurrence of hash value collisions received by the re-aggregation unit exceeds a preset threshold, the unit size into which the database range is divided for processing by one database command is reduced.

An accelerator connected to a storage device for storing a database and receiving a database command to perform database processing,
Receiving a database range to be processed and a unit size to be processed by a single database command by dividing the database range, and reading the database processing target data from the storage device in the unit size, the processing target A database processing unit that divides data into predetermined processing units, executes database processing including grouping processing, stacking processing, and aggregation processing for each predetermined processing unit and outputs an aggregation result;
The database processing unit
An accelerator, wherein the grouping process and the stacking process are executed in parallel in the predetermined processing unit.

The accelerator according to claim 9, comprising:
The database processing unit
A grouping processing unit that performs a grouping process on the data string of the predetermined processing unit;
The grouping processing unit
An accelerator, wherein a hash value and group information are calculated for the data in the data string, the hash value and the group information are stored as a pair in hash information, and a collision of the hash value is detected if the hash information contains different group information with the same hash value.

The accelerator according to claim 10, wherein
The database processing unit
When the grouping processing unit detects a collision of the hash values, the grouping processing unit outputs the group information and a numerical value that is a result of the stacking processing.

The accelerator according to claim 9, comprising:
The storage device
A first storage unit for storing the database;
A second storage unit configured by a semiconductor storage medium having a higher reading speed than the first storage unit and causing the accelerator to read data;
The database processing unit
An accelerator, wherein the range of the database to be processed is copied from the first storage unit to the second storage unit and read from the second storage unit for each unit size.

A database processing method in which a server including a processor and a memory causes an accelerator, which is connected to the server and performs database processing, to process a database of a storage device connected to the accelerator, the method comprising:
A first step in which the server receives a query, generates a database command, determines a database range to be processed and a unit size into which the database range is divided for processing by one database command, and instructs the accelerator;
A second step in which, based on the command from the server, the accelerator reads the processing target data of the database from the storage device in the unit size, divides the processing target data into predetermined processing units, executes database processing including a grouping process, a stacking process, and an aggregation process for each predetermined processing unit, and outputs an aggregation result; and
A third step in which the server aggregates the output of the accelerator and generates a processing result for the query;
The third step includes
A database processing method, wherein when an aggregation result for the range of the database to be processed is received from the accelerator, the aggregation result is aggregated and generated as a processing result for the query.

The database processing method according to claim 13, comprising:
The second step includes
In the grouping process, a hash value and group information are calculated for the data string of the predetermined processing unit; in the stacking process, a numerical value is calculated for the data string of the predetermined processing unit; and in the aggregation process, numerical information obtained by aggregating the numerical values for each group information is calculated, and the group information and the numerical information are included in the aggregation result,
The third step includes
A database processing method, wherein a hash value is calculated from the group information and stored in group hash information, the aggregation results for the range of the database to be processed are rearranged based on the hash value, and the numerical information is then aggregated to generate the processing result for the query.

The database processing method according to claim 14, comprising:
The second step includes
In the grouping process, the hash value and the group information are stored as a pair in hash information, and when the hash information contains different group information with the same hash value, a collision of the hash value is detected and the group information and the numerical value resulting from the stacking process are output,
The third step includes
A database processing method, wherein the group hash information is referred to, the group information whose hash values collided and the numerical value of the stacking process are re-aggregated, and the processing result for the query is recalculated.