JP2004213680A

JP2004213680A - Database management system and query processing method

Info

Publication number: JP2004213680A
Application number: JP2004025150A
Authority: JP
Inventors: Masashi Tsuchida; 正士土田; Yukio Nakano; 幸生中野; Nobuo Kawamura; 信男河村; Kazuyoshi Negishi; 和義根岸; Shunichi Torii; 俊一鳥居
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2004-02-02
Filing date: 2004-02-02
Publication date: 2004-07-29
Anticipated expiration: 2020-07-06
Also published as: JP3668243B2

Abstract

PROBLEM TO BE SOLVED: To provide a query processing method that speeds up query processing. SOLUTION: Distribution nodes 1 to 8 hold table T1 or table T2, across which the database to be queried is distributed and stored; they retrieve information from these tables and distribute it to other nodes. Join nodes 9 to 11 sort the distributed information, merge it when there are multiple sorted pieces, and perform matching for the query based on the merged information. A decision management node 12 receives and analyzes the query and, based on the analysis result, determines the distribution nodes and join nodes that execute the processing. The decision management node 12 also outputs the query result sent from the join nodes. COPYRIGHT: (C)2004, JPO&NCIPI

Description

The present invention relates to a database processing apparatus and, more particularly, to a query processing method well suited to parallel query processing in a relational database management system.

A database management system (hereinafter DBMS), in particular a relational DBMS, processes a query expressed in a non-procedural language, determines an internal processing procedure, and executes the query according to that procedure. SQL is used as the database language (Database Language SQL, ISO 9075:1989). Conventional query processing mainly follows one of two approaches: determining a single internal processing procedure from preset rules, or selecting the procedure judged optimal by cost evaluation from multiple candidate procedures chosen using various statistics. The former imposes little load for creating the procedure, but the validity of uniformly applied rules is questionable, and so is the optimality of the selected procedure. The latter manages various statistics, creates multiple candidate procedures, and computes the load for evaluating their costs in order to produce an optimal procedure. A technique combining the two is described, for example, in Satoh, K., et al., "Local and Global Optimization Mechanisms for Relational Database", Proc. VLDB, 1985 (Non-Patent Document 1). In that prior art, the processing procedure is determined by estimating the data volume from the query conditions.

In many DBMSs, query processing is carried out in two phases: query analysis and query execution. When a query language is embedded in a host language (COBOL, PL/I, etc.), the query is analyzed before the application program runs, and one internal processing procedure in executable form is created. In such queries, the search condition often contains host-language variables. Constants are assigned to these variables only when the internal processing procedure produced by query analysis is executed, that is, at query execution time. The problem is that the optimal processing procedure may differ depending on the values assigned to the variables. To address this, some systems prepare a plurality of processing procedures in advance and select among them at query execution time according to the values assigned to the variables. Related techniques are described in JP-A-1-194028 (Patent Document 1) and in Graefe, G., et al., "Dynamic Query Evaluation Plans", Proc. ACM-SIGMOD, 1989 (Non-Patent Document 2).

Furthermore, as transaction volumes and database sizes grow faster than CPU performance and disk capacity, users want scalable parallel database systems. User performance requirements for database systems include support for more than tens of thousands of concurrent users, the emergence of retrieval transactions over terabytes of data, and guaranteed response times that do not grow in proportion to table size. Combined with recent reductions in hardware cost, parallel database systems have attracted attention. Parallel database systems are described in DeWitt, D., et al., "Parallel Database Systems: The Future of High Performance Database Systems", CACM, Vol. 35, No. 6, 1992 (Non-Patent Document 3). In such systems, processors are connected in a tightly coupled or loosely coupled manner, and database processing must be statically or dynamically allocated to and scheduled on multiple processors. Increasing the degree of parallelism improves response performance, but excessive parallelism increases overhead and lengthens the response time of other transactions. Setting an appropriate degree of parallelism is therefore important.

In database processing, the data to be processed resides on secondary storage, and each database operation requires reading and transferring a large amount of data. In a parallel database system as well, when the amount of data to be transferred is large, data transfer time becomes the performance bottleneck of the system. One remedy is to make effective use of the time spent transferring data from secondary storage by overlapping the data transfer time with the time required for database processing of that data; this is a well-known prior-art technique. The same approach can be applied to data transfer between processors connected by an interconnection network.

Patent Document 1: JP-A-1-194028

Non-Patent Document 1: Satoh, K., et al., "Local and Global Optimization Mechanisms for Relational Database", Proc. VLDB, 1985.
Non-Patent Document 2: Graefe, G., et al., "Dynamic Query Evaluation Plans", Proc. ACM-SIGMOD, 1989.
Non-Patent Document 3: DeWitt, D., et al., "Parallel Database Systems: The Future of High Performance Database Systems", CACM, Vol. 35, No. 6, 1992.

In the prior art described above, query optimization means that the DBMS automatically determines the most efficient processing procedure for a user's query based on various statistics of the database system. Furthermore, when variables are embedded in the selection condition of a query, a plurality of processing procedures are prepared at query analysis time, and the optimal one is selected at query execution time according to the values assigned to the variables.

In parallel database processing, database operations are divided among nodes (each a processor, or a pair of a processor and a disk device), and the operations run on the nodes in parallel or in a pipelined manner. According to the prior art, the method of selecting a processing procedure at each node is applicable to this form of parallel processing as well.

However, in parallel processing, where the nodes work concurrently, there is the problem that the number of nodes cannot be determined in accordance with the database operation each node executes. That is, because the criterion for determining the number of nodes is unclear, excessive parallelization ends up increasing overhead, and optimal load distribution is difficult.

In pipelined processing, database operations are divided among and assigned to the nodes, but when the data is unevenly partitioned, there is no clear method for dividing it evenly among the nodes.

Furthermore, when multiple operations must be completed within a limited time, for example under a processing-time constraint, there is no clear method for parameterizing the database operations executed at each node and tuning them against the expected processing time.

An object of the present invention is to provide a query processing method and a database system that speed up query processing.

To solve the above problems, the present invention is a database management system comprising a plurality of nodes that execute database processing, each connected to the other nodes via a network. The system comprises: distribution nodes, each having storage means for storing, in distributed form, the database to be queried and distribution means for retrieving information from the storage means and distributing it to other nodes; join nodes, each having sorting means for sorting the information distributed from the distribution nodes, merging means for merging the sorted information when there are multiple pieces of it, and matching means for performing matching for the query based on the merged information; and a decision management node having analysis means for receiving the query, analyzing it, and creating a processing procedure for it, decision means for determining, based on the analysis result, the distribution nodes and join nodes that execute the processing, and output means for outputting the result of the query obtained from the join nodes.

The decision means can determine the distribution nodes based on the analysis result of the query from the analysis means, calculate the expected processing time at the distribution nodes, and determine the join nodes based on that processing time.
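The text does not state the formula by which the join nodes are derived from the expected processing time, so the following is only a minimal sketch under an assumed rule: pick enough join nodes that the total expected join-side work, spread across them, takes no longer than the expected distribution time. The function name `choose_join_node_count` and all of its parameters are hypothetical. The optional `extra_nodes` argument illustrates increasing the count by a predetermined number.

```python
import math


def choose_join_node_count(distribution_time_s, join_work_s,
                           max_join_nodes, extra_nodes=0):
    """Assumed rule, not taken from the patent text.

    distribution_time_s: expected processing time at the distribution nodes.
    join_work_s: expected total join-side work if one node did all of it.
    max_join_nodes: number of join nodes physically available.
    extra_nodes: predetermined number of nodes to add on top.
    """
    if distribution_time_s <= 0:
        raise ValueError("distribution_time_s must be positive")
    # Enough nodes that each node's share of the work fits in the
    # time the distribution stage is expected to take.
    needed = math.ceil(join_work_s / distribution_time_s)
    return max(1, min(max_join_nodes, needed + extra_nodes))


base = choose_join_node_count(10.0, 25.0, max_join_nodes=8)
boosted = choose_join_node_count(10.0, 25.0, max_join_nodes=8, extra_nodes=1)
capped = choose_join_node_count(1.0, 100.0, max_join_nodes=4)
```

Under this assumption the join stage roughly keeps pace with the distribution stage, which is the condition for the pipeline to run without one stage waiting on the other.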

Based on the expected amount of information retrieved at the determined distribution nodes, the decision means allocates the retrieved information evenly among the join nodes.

The decision management node can include storage means that stores optimization information about the information held in the storage means of each node, which the decision means uses to allocate the retrieved information evenly among the join nodes.

The decision management node uses a predetermined hash function so that the decision means allocates the retrieved information evenly among the join nodes.
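As an illustration of hash-based allocation, the sketch below routes each retrieved row to a join node by hashing its join-column value. The text says only that the hash function is predetermined, so the use of `zlib.crc32`, the helper name `join_node_for`, and the sample rows are all assumptions made for this example.

```python
import zlib
from collections import Counter


def join_node_for(key, join_nodes):
    """Map a join-column value to one join node (assumed CRC32 hash)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return join_nodes[digest % len(join_nodes)]


join_nodes = [9, 10, 11]  # join nodes as numbered in the abstract
rows_t1 = [(k, f"t1-{k}") for k in range(1000)]  # hypothetical rows of T1

# Count how many rows each join node would receive.
load = Counter(join_node_for(key, join_nodes) for key, _ in rows_t1)
```

Because the mapping depends only on the key, rows from table T1 and table T2 that share a join-column value arrive at the same join node, which is what makes the later matching step local to that node.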

The plurality of nodes operate independently of one another, and each join node receives the information distributed from the distribution nodes as it arrives and processes each piece of information as it is received.

Furthermore, each distribution node may have sorting means for sorting the information it distributes.

From the calculated expected processing times at the distribution nodes, the decision means can decide that a distribution node whose processing finishes earlier sorts data in its own sorting means after its distribution processing.
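One way to read this rule is sketched below. The criterion is an assumption, since the text does not define "finishes earlier": here a distribution node is asked to sort locally only if its expected distribution time plus its estimated sort time still does not exceed the time of the slowest distribution node. The function name and the sample timings are hypothetical.

```python
def nodes_that_sort_locally(expected_time_s, sort_time_s):
    """Return distribution nodes with enough slack to sort after distributing.

    expected_time_s: node id -> expected distribution time (seconds).
    sort_time_s: node id -> estimated local sort time (seconds).
    Assumed criterion: the extra sort must not delay the overall stage.
    """
    slowest = max(expected_time_s.values())
    return sorted(
        node for node, t in expected_time_s.items()
        if t + sort_time_s[node] <= slowest
    )


expected = {1: 4.0, 2: 9.0, 3: 10.0, 4: 6.0}
sort_cost = {1: 3.0, 2: 3.0, 3: 3.0, 4: 3.0}
chosen = nodes_that_sort_locally(expected, sort_cost)
```

With these sample numbers, nodes 1 and 4 have idle time before the slowest node finishes, so they take on sorting work that would otherwise fall to the join nodes.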

The decision means decides to increase, by a predetermined number, the number of join nodes determined from the processing time.

The sorting means of the join node may have a function of performing merge processing after the sort processing finishes. The matching means may likewise have a function of performing the merge processing.

The decision means calculates the expected processing times of the matching means and the output means and, when the calculated processing time of the output means is longer than that of the matching means, decides to have the matching means perform the merge processing.
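The comparison can be written as a small decision function. Only the first branch is stated in the text; the fallback (leaving the merge with the sorting means, which may merge after its sort finishes) is an assumption, as are the function name and the string labels.

```python
def merge_placement(matching_time_s, output_time_s):
    """Decide which means performs the merge processing.

    Stated rule: if output takes longer than matching, the matching
    means has spare time and performs the merge.
    Assumed fallback: otherwise the sorting means merges after sorting.
    """
    if output_time_s > matching_time_s:
        return "matching"
    return "sorting"


slow_output = merge_placement(matching_time_s=2.0, output_time_s=5.0)
fast_output = merge_placement(matching_time_s=5.0, output_time_s=2.0)
```

The intent is to balance the pipeline stages: when output is the bottleneck, moving the merge into the matching step costs nothing in end-to-end time.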

To solve the above problems, the present invention also provides a database management system comprising a plurality of nodes interconnected via a network. The system has: a plurality of first nodes, each having storage means for storing, in distributed form, the data constituting a database, and each of which, when instructed, retrieves data stored in the storage means and sends it to the network for distribution to other nodes; a plurality of second nodes, each of which, when instructed, receives data sent to the network, executes a database operation on the received data, and sends the result to the network; and a third node that receives a query from a user, analyzes it, and, based on the analysis result, instructs at least one of the first nodes to retrieve the data, instructs at least one of the second nodes to execute the database operation, receives the operation result sent to the network, and outputs the result of the query.

Based on the analysis result of the received query, the third node can determine the first nodes to be instructed to retrieve the data, calculate the expected processing time at those first nodes, and, based on that processing time, determine the second nodes to be instructed to execute the database operation.

Based on the expected amount of data retrieved at the determined first nodes, the third node allocates the retrieved data evenly among the determined second nodes.

The third node can include storage means that stores optimization information about the data held in the storage means of each node, used to allocate the retrieved data evenly among the determined second nodes.

The third node uses a predetermined hash function to allocate the retrieved data evenly among the determined second nodes.

The plurality of nodes operate independently of one another, and each determined second node receives the data distributed from the determined first nodes as it arrives and processes each piece of data as it is received.

Furthermore, each of the plurality of first nodes may additionally sort the data it distributes.

From the calculated expected processing times at the determined first nodes, the third node can decide that a first node whose processing finishes earlier sorts data at that first node after its distribution processing.

The third node decides to increase, by a predetermined number, the number of second nodes determined from the processing time.

The determined second nodes may perform merge processing after the sorting of the data distributed by the first nodes has finished.

The second nodes perform merge processing after the sorting of the data distributed by the first nodes has finished, and perform matching for the query based on the merged data.

The third node calculates the expected processing times of the matching processing and of the processing that outputs the result of the query and, when the calculated time of the output processing is longer than that of the matching processing, has the merge processing performed within the matching processing.

(Operation)
The decision management node determines the distribution nodes based on the analysis result of the query from the analysis means, calculates the expected processing time at the distribution nodes, and determines the join nodes based on that processing time. Based on the expected amount of information retrieved at the determined distribution nodes, the decision means allocates the retrieved information evenly among the join nodes.

Each of the determined distribution nodes retrieves information from its storage means based on the analysis result of the query and distributes the retrieved information to other nodes. The distribution nodes and the join nodes operate independently; each join node receives the information distributed from the distribution nodes as it arrives and processes each piece of information as it is received. Each of the determined join nodes sorts the information distributed from the distribution nodes, merges the sorted information when there are multiple pieces of it, and performs matching for the query based on the merged information; the result of the query obtained from the join nodes is then output.

From the calculated expected processing times at the distribution nodes, the decision means also decides that a distribution node whose processing finishes earlier sorts data in its sorting means after its distribution processing. The sorting means of a distribution node so selected sorts the information that the node distributes.

Furthermore, the decision means calculates the expected processing times of the matching means and the output means and, when the calculated processing time of the output means is longer than that of the matching means, decides to have the matching means perform the merge processing. The matching means so selected performs the merge processing.

When the processing time is fixed in advance, the decision means, in order to complete processing within that time, decides to increase by a predetermined number the number of join nodes determined from the expected processing time at the distribution nodes. This increases the number of join nodes, so the sorting means of each join node can finish its sort processing in a short time and therefore performs the merge processing after the sort processing finishes.

According to the query processing method of the present invention, the number of nodes can be determined in accordance with the database operation executed at each node. In addition, when the data is unevenly partitioned, the data is divided evenly among the nodes and the database operations executed at each node are parameterized so that the expected processing times are equalized; processing time is therefore not skewed among the nodes, and smooth pipelined operation is possible.
According to another configuration of the present invention, the third node receives a query from a user, analyzes it, and, based on the analysis result, instructs at least one of the plurality of first nodes to retrieve the data and instructs at least one of the second nodes to execute the database operation. Based on the expected amount of data retrieved at the determined first nodes, the third node allocates the retrieved data evenly among the determined second nodes.

Each of the determined first nodes, when instructed, retrieves the data stored in its storage means and sends it to the network for distribution to other nodes. The first nodes and the second nodes operate independently; each determined second node receives the data distributed from the determined first nodes as it arrives and processes each piece of data as it is received. Each of the determined second nodes, when instructed, receives the data sent to the network, executes a database operation on it, and sends the result to the network. The third node receives the operation results sent to the network and outputs the result of the query.

From the calculated expected processing times at the determined first nodes, the third node also decides that a first node whose processing finishes earlier sorts data at that node after its distribution processing. The first node so selected sorts the data after its distribution processing.

Furthermore, the third node calculates the expected processing times of the matching processing and of the processing that outputs the result of the query and, when the calculated time of the output processing is longer than that of the matching processing, has the merge processing performed within the matching processing.

When the processing time is fixed in advance, the third node, in order to complete processing within that time, decides to increase by a predetermined number the number of second nodes determined from the processing time. This increases the number of second nodes, so the sort processing can be completed in a shorter time. Merge processing can be performed after the sort processing finishes.

According to the present invention, the nodes that retrieve and distribute data and the nodes that execute database operations can be determined. In addition, when the data is unevenly partitioned, the data is divided evenly among the nodes and the database operations executed at each node are parameterized so that the expected processing times are equalized; processing time is therefore not skewed among the nodes, and smooth pipelined operation is possible.

Because the number of nodes is determined in accordance with the database operation executed at each node, the data is divided evenly among the nodes, and the processing time at each node is equalized, processing time is not skewed among the nodes and high-speed query processing can be achieved.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 2 is a conceptual diagram of the database system of this embodiment. In FIG. 2, the database system has: a plurality of application programs (hereinafter AP) 10 and 11 created by users; a DBMS 20 that manages the database system as a whole, including query processing and resource management; an operating system (hereinafter OS) 30 that reads and writes the data subject to input/output in database processing and manages the computer system as a whole; a database 40 that stores the data subject to database processing; and a dictionary 50 that manages the definition information of the database. The DBMS 20 is connected to other database management systems. The dictionary 50 also stores, among other things, the optimization information on join columns used in this embodiment.

The DBMS 20 includes a system control unit 21 that, in addition to managing and controlling the system as a whole, manages input and output; a logical processing unit 22 that performs logical processing of queries; a physical processing unit 23 that executes physical processing of the database; and a database buffer 24 that stores the data processed by the DBMS 20. The logical processing unit 22 includes query analysis 220, which performs syntactic and semantic analysis of a query; static optimization processing 221, which generates appropriate processing procedures; code generation 222, which generates code corresponding to a processing procedure; dynamic optimization processing 223, which selects the optimal procedure from the candidates generated by static optimization processing 221; and a code interpretation execution unit 224, which interprets and executes the code. The physical processing unit 23 includes data access processing 230, which implements condition evaluation, editing, record insertion, and the like for accessed data; database buffer control 231, which controls the reading and writing of database records; mapping processing 232, which manages the storage locations of the data subject to input/output; and exclusive control 233, which implements exclusive control of resources shared in the system.

FIG. 3 shows an example of a hardware configuration to which the present invention is applied. Specifically, FIG. 3 shows a parallel processor system made up of a plurality of nodes, each consisting of a processor and a disk device. In FIG. 3, processors 60 to 65 and disk devices 70 to 75 are connected by an interconnection network 80. The hardware configuration of FIG. 3 runs the database system of FIG. 2 in parallel on a plurality of processors, with the processing distributed across the nodes.

FIG. 1 shows the configuration when functions are distributed across these nodes, as a schematic diagram of a database system to which this embodiment is applied. A processing example for the parallel database system is described below with reference to FIG. 1. In this example, parallel processing is applied to a retrieval request against the database. In FIG. 1, each node is assigned a sort function, which retrieves and distributes data, and a merge function, which joins the data sorted on a plurality of nodes. Some nodes have only the sort function, while others have both the sort and merge functions. The database consists of tables that the user sees in two-dimensional form, and each table holds its data in rows. A row consists of one or more attributes (called "columns").

In FIG. 1, the database contains tables T1 and T2. Table T1 is stored on node 1 (90) through node 4 (91), and table T2 on node 5 (92) through node 8 (93). These nodes are distribution nodes, and each executes data retrieval and data distribution on the table it stores. Node 9 (94) through node 11 (96) are join nodes: they receive the data output from nodes 1 to 4 and nodes 5 to 8, sort it into partial runs, and merge those runs into a complete run. Node 12 (97) is a decision node that accepts a query, analyzes it, and determines the number of distribution nodes and join nodes that will process it. Node 12 (97) also receives the data output from nodes 9 to 11 and outputs it.

These nodes are connected by the interconnection network 80. Nodes 1 to 4 and 5 to 8 operate in parallel with nodes 9 to 11, and the results produced by nodes 1 to 4 and 5 to 8 are processed immediately by nodes 9 to 11 in pipeline fashion (hereinafter called parallel pipeline operation). Nodes 9 to 11 and node 12 are pipelined in the same way. In the following, the partial-run sort performed on nodes 9 to 11 is called slot sort processing, and the creation of the complete run is called N-way merge processing. Slot sort processing is a sort within a page in which data is stored, such that reading the page in slot order accesses the rows in ascending order. N-way merge processing uses N-way buffers, takes N sorted runs as input at each merge stage, and finally produces a single sorted run.
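The two operations named above, slot sort processing and N-way merge processing, can be sketched as follows. This is only an illustrative sketch: pages are modeled as plain Python lists of integer keys, and the function names `slot_sort` and `n_way_merge` are assumptions introduced here, not identifiers from the embodiment.

```python
import heapq
from typing import List


def slot_sort(page: List[int]) -> List[int]:
    """Sort the rows of one page so that reading in slot order yields ascending keys."""
    return sorted(page)


def n_way_merge(runs: List[List[int]], n: int) -> List[int]:
    """Merge up to n sorted runs per stage until a single sorted run remains."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not runs:
        return []
    while len(runs) > 1:
        # One merge stage: every group of up to n runs becomes one longer run.
        runs = [list(heapq.merge(*runs[i:i + n])) for i in range(0, len(runs), n)]
    return runs[0]


pages = [[7, 2, 9], [4, 1], [8, 3, 6], [5, 0]]
runs = [slot_sort(p) for p in pages]   # one sorted run per page
result = n_way_merge(runs, n=2)        # two stages: 4 runs -> 2 runs -> 1 run
```

With n=2 the four page-level runs need two merge stages; a larger n or longer initial runs would need fewer stages.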

A query for the database retrieval processing is, for example, as follows.

SELECT T1.C3, T2.C3
FROM T1, T2
WHERE T1.C1 = T2.C1
AND T1.C2 = ?

When node 12 accepts such a query, it selects the optimal distribution processing method and instructs each node over the network. For this query, table T1 is stored on node 1 (90) through node 4 (91) and table T2 on node 5 (92) through node 8 (93), so each of those nodes executes data retrieval and data distribution. Node 9 (94) through node 11 (96) successively receive the data output from nodes 1 to 4 and nodes 5 to 8 and execute sort processing and join processing. Node 12 (97) receives the data output from nodes 9 to 11 and outputs it. This completes the database retrieval.
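The matching step that a join node performs for this query can be sketched on a single node as follows. The sketch assumes that both inputs have already been filtered and sorted on the join column C1; the sample rows, the value bound to the `?` parameter, and the function name `merge_join` are hypothetical.

```python
from typing import List, Tuple


def merge_join(left: List[Tuple[int, str]],
               right: List[Tuple[int, str]]) -> List[Tuple[str, str]]:
    """Match two inputs sorted on the join key (first field) and return payload pairs."""
    out: List[Tuple[str, str]] = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the group of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that group.
            while i < len(left) and left[i][0] == lk:
                for k in range(j, j_end):
                    out.append((left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


t1 = [(1, 10, "a"), (2, 20, "b"), (2, 10, "c"), (3, 10, "d")]  # rows of T1 as (C1, C2, C3)
t2 = [(2, "x"), (3, "y"), (2, "z"), (4, "w")]                  # rows of T2 as (C1, C3)
param = 10                                                     # value bound to "T1.C2 = ?"

left = sorted((c1, c3) for c1, c2, c3 in t1 if c2 == param)    # retrieval plus sort for T1
right = sorted(t2)                                             # sort for T2
rows = merge_join(left, right)                                 # (T1.C3, T2.C3) pairs
```

In the parallel system described above, each join node would run this matching only on the key range distributed to it.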

Next, the relationship between the processing times of these nodes is described with reference to FIG. 4, which is a schematic diagram of the parallel pipeline operation. In FIG. 4, 100 and 101 correspond to the processing on node 1 (90) through node 4 (91) and on node 5 (92) through node 8 (93) of FIG. 1, where data retrieval and data distribution are executed. 110 and 111 correspond to the processing on node 9 (94) through node 11 (96), where slot sort processing, N-way merge processing, and matching processing are executed. 120 corresponds to the processing on node 12 (97), where requested-data output processing is executed.

Along the time axis, the data handled by data retrieval and data distribution 100 and 101 passes successively to slot sort processing 110 and 111 and is processed in pipeline fashion. The span from data retrieval to slot sort processing is called the retrieval phase. N-way merge processing 110 and 111 is simply executed in parallel on the respective nodes. This N-way merge period is called the merge phase. The results of matching processing are transferred successively to requested-data output processing 120 and processed in pipeline fashion. The span from matching to requested-data output is called the join phase.

The time chart of FIG. 4 shows the processing when the query example given for FIG. 1 is applied. In the retrieval phase, the processing on node 1 (90) through node 4 (91) takes the T1 data retrieval/data distribution processing time 130, the processing on node 5 (92) through node 8 (93) takes the T2 data retrieval/data distribution processing time 131, the transfer over the interconnection network 80 takes the data distribution transfer time 140, and the processing on node 9 (94) through node 11 (96) takes the T1/T2 slot sort processing time 150. In FIG. 4, the retrieval phase ends at the slot sort completion wait 180. In the merge phase, the processing on node 9 (94) through node 11 (96) takes the T1/T2 N-way merge processing time 151, and the phase ends at the T1/T2 N-way merge completion wait 181. In the join phase, the processing on node 9 (94) through node 11 (96) takes the matching processing time 152, the transfer over the interconnection network 80 takes the join result transfer time 160, and the processing on node 12 (97) takes the requested-data output processing time 170.

Next, the method by which node 12 of FIG. 1 allocates processing to the node groups is described with reference to FIG. 5, which is an explanatory diagram of the allocation to each node group in data distribution processing. As a premise, the node group that performs data retrieval/data distribution consists of ten nodes, nodes 1 to 10, comprising processors 200 to 230 and disk devices 201 to 231. The node group that performs join processing consists of five nodes, nodes 11 to 15, comprising processors 240 to 250 and disk devices 241 to 251. The dictionary 50 stores optimization information 51 on the join column. The optimization information 51 is information for dividing the data of the database evenly: because the number of data items per join column value is usually not uniform, the join column is divided into ranges such that the number of data items in each range is uniform. FIG. 5 shows that the data stored on nodes 1 to 10 can be divided evenly by the ranges v1 to v10. In this case, to divide the data evenly among nodes 11 to 15, it suffices to provide distribution processing means that associates the five intervals v1 to v2, v3 to v4, v5 to v6, v7 to v8, and v9 to v10 with node numbers 11, 12, 13, 14, and 15, respectively. When no such optimization information exists, a suitable hash function may be set to perform the data distribution. By providing this distribution processing means, node 12 of FIG. 1 allocates processing to each node group for N-way merge processing. In the case above, the data is divided evenly among nodes 11 to 15, so the processing times are equalized.
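The mapping from join column value to join node can be sketched as follows. The sketch assumes that the optimization information is available as a sorted list of upper bounds for the ranges v1 to v10; the bound values, the CRC-based fallback hash, and the name `target_node` are illustrative assumptions rather than details given in the source.

```python
import bisect
import zlib
from typing import Optional, Sequence

JOIN_NODES = [11, 12, 13, 14, 15]


def target_node(key: int, upper_bounds: Optional[Sequence[int]]) -> int:
    """Return the join node for a join column value.

    upper_bounds holds the (hypothetical) upper bound of each equal-count range v1..v10.
    When it is missing, fall back to hash distribution.
    """
    if upper_bounds:
        # Index of the first range whose upper bound is >= key (clamped to the last range).
        idx = min(bisect.bisect_left(upper_bounds, key), len(upper_bounds) - 1)
        ranges_per_node = max(1, len(upper_bounds) // len(JOIN_NODES))
        return JOIN_NODES[min(idx // ranges_per_node, len(JOIN_NODES) - 1)]
    # No optimization information: use a deterministic hash of the key.
    return JOIN_NODES[zlib.crc32(str(key).encode()) % len(JOIN_NODES)]


bounds = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # assumed upper bounds of v1..v10
```

For example, `target_node(21, bounds)` falls in the third range and therefore maps to node 12, while `target_node(21, None)` uses the hash fallback.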

Next, the method of determining the number of join nodes for N-way merge processing is described with reference to FIG. 6, which is a schematic diagram of the join node count determination method.

FIG. 6 graphs the processing time of each phase and each process of the parallel join processing of FIG. 1, laid out to match the parallel pipeline operation outline of FIG. 4. In FIG. 6, data retrieval/data distribution is executed on nodes 1 to 8 and takes the processing times 300 to 305, respectively. Here, the processing time 304 of node 5 is assumed to be the maximum. The slot sort processing time can be derived from the number N of join processing nodes, predetermined system characteristics (CPU performance, disk device performance, and so on), and the database operation method, and the performance characteristic of slot sort processing can generally be obtained by an expression such as the one shown below.

To maximize the benefit of pipelining, the join node allocation 350 is obtained as the number of nodes at the intersection of the slot sort performance characteristic and the maximum processing time 304. Once the join node allocation 350 is fixed, the N-way merge processing time 320 and the matching processing time 330 can be estimated in the same way from the performance characteristics of N-way merge processing and matching processing. The sum of these processing times is the overall processing time for the query. By determining the number of join nodes in this way, and by successively merging and concurrently processing the data distributed by data retrieval/data distribution, the overall processing time (the response time from issuing the query to receiving the output) can be shortened.
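The choice of the join node allocation can be sketched as a search for the smallest node count whose slot sort time no longer exceeds the longest data retrieval/data distribution time. The slot sort performance formula is not reproduced in this text, so the sketch substitutes a deliberately simple hypothetical model in which the sort work divides evenly across the join nodes; the function names and the sample times are likewise assumptions.

```python
from typing import Sequence


def slot_sort_time(total_sort_seconds: float, n_join_nodes: int) -> float:
    """Hypothetical performance model: total sort work divides evenly across join nodes."""
    return total_sort_seconds / n_join_nodes


def choose_join_nodes(retrieval_times: Sequence[float],
                      total_sort_seconds: float,
                      max_nodes: int) -> int:
    """Smallest node count at which slot sort keeps up with the slowest distribution node."""
    t_max = max(retrieval_times)  # corresponds to the maximum processing time (304 in FIG. 6)
    for n in range(1, max_nodes + 1):
        if slot_sort_time(total_sort_seconds, n) <= t_max:
            return n
    return max_nodes  # not enough nodes available to reach the intersection


allocation = choose_join_nodes([120.0, 150.0, 180.0, 160.0],
                               total_sort_seconds=640.0, max_nodes=8)
```

Under these assumed numbers, three nodes would need about 213 seconds of slot sorting, which exceeds the 180-second maximum, so four nodes are selected.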

A concrete example of the performance characteristics used in determining the number of join nodes is given below. Suppose, for example, that tables T1 and T2 each have 10,000,000 rows; that there is one condition, on T1, which narrows the rows to 1% of the total; that T1 and T2 are each divided evenly across 16 distribution nodes that perform data retrieval/data distribution; that there are 8 join nodes; that processor performance is 50 MIPS (50 million instructions executed per second); and that the network transfer rate is 20 Mbytes/second. Under these conditions, the results obtained by running an actual database management system, or calculated from a performance model, are as follows.

The processing time of the distribution nodes is 180 seconds each for table T1 and table T2, the T1/T2 slot sort processing time is 80 seconds, the N-way merge processing time is 380 seconds, the matching processing time is 110 seconds, and the requested-data output time is 10 seconds. The processing time for a query is estimated from these performance figures.
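One possible way to combine these figures into a response time estimate is sketched below. The source states only that the processing times are summed; the sketch additionally assumes that stages pipelined within a phase overlap completely, so each phase lasts as long as its slowest stage. That overlap model and the function name `estimate_response_time` are assumptions, not part of the embodiment.

```python
from typing import Sequence


def estimate_response_time(distribution: Sequence[float], slot_sort: float,
                           n_way_merge: float, matching: float, output: float) -> float:
    """Estimate the response time in seconds, assuming full overlap of pipelined stages."""
    retrieval_phase = max(max(distribution), slot_sort)  # distribution and slot sort overlap
    merge_phase = n_way_merge                            # merge runs on its own
    join_phase = max(matching, output)                   # matching and output overlap
    return retrieval_phase + merge_phase + join_phase


# Figures from the example above: 180 s per table for distribution, 80 s slot sort,
# 380 s N-way merge, 110 s matching, 10 s requested-data output.
total = estimate_response_time([180.0, 180.0], 80.0, 380.0, 110.0, 10.0)
```

Under this assumed model the N-way merge dominates the estimate, which is consistent with the tuning methods that follow concentrating on reducing merge work.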

Next, processing time adjustment (tuning) methods that further shorten the response time, building on the join node count determination method of FIG. 6, are described with reference to FIGS. 7, 8, and 9. In the methods below, the distribution processing means of node 12 performs the calculation in advance when deciding how to allocate processing to each node group, and decides the allocation from the result.

FIG. 7 is a schematic diagram of moving slot sort processing forward. Data retrieval/data distribution is executed on nodes 1 to 8 and takes the processing times 300 to 305, respectively. The processing time varies from node to node according to the amount of data in each table. Slot sort processing is initially set to run on the join processing node group. When the per-node processing times vary, consider a processing procedure that moves slot sort processing to the data retrieval/data distribution node group. As FIG. 7 shows under moving slot sort forward, slot sort processing is performed on the nodes whose data retrieval/data distribution finishes earlier. This reduces the slot sort processing time on the nodes of join node allocation 350 from 310 to 312. N-way merge processing is moved into the resulting time difference 311. This amounts to lengthening the run length of slot sort processing. The N-way merge processing time is thereby reduced, and as a result the response time is reduced.

FIG. 8 is a schematic diagram of slot sort run-length tuning. It illustrates a method for cases where several processes must complete within a given time, for example under a processing time constraint: each database operation executed on each node is parameterized, and the time is adjusted (tuned) based on the expected processing time. The number of join processing nodes is increased by the minimum amount from the join node allocation 350 obtained in FIG. 6, in order to shorten the response time. Let the join node allocation in this case be 351. With join node allocation 351, the slot sort processing time is reduced from 310 to 312. To maximize the pipeline effect, N-way merge processing is moved into slot sort processing during the processing time 311. This reduces the number of merges in N-way merge processing and reduces the N-way merge processing time 320, and as a result the response time is reduced.
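The effect of a longer run length on the merge count can be illustrated with a short calculation. The sketch assumes that each merge stage combines up to N runs into one, as described for N-way merge processing; the function name `merge_passes` and the run counts are illustrative assumptions, not values from this embodiment.

```python
def merge_passes(num_runs: int, n: int) -> int:
    """Count the merge stages needed to reduce num_runs sorted runs to one,
    merging up to n runs per stage."""
    if n < 2:
        raise ValueError("n must be at least 2")
    passes = 0
    while num_runs > 1:
        num_runs = -(-num_runs // n)  # ceiling division: runs left after this stage
        passes += 1
    return passes


short_runs = merge_passes(64, 4)  # 64 -> 16 -> 4 -> 1
long_runs = merge_passes(16, 4)   # runs four times as long: 16 -> 4 -> 1
```

Quadrupling the run length produced by slot sort processing removes one merge stage in this example, which is the kind of saving the tuning aims for.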

FIG. 9 is a schematic diagram of N-way merge count tuning. When the matching processing time 330 determined by the join node allocation 350 is smaller than the requested-data output processing time 340, the final stage of N-way merge processing can be moved into matching processing. If the sum of the final-stage merge processing time 331 and the matching processing time 330 does not exceed the requested-data output processing time 340, the final-stage merge is moved into matching processing. This reduces the response time.

Next, the operation flow of the database management system of this embodiment is described. FIGS. 10, 11, 12, 13, 14, and 15 are flowcharts of the processing of the DBMS in this embodiment. In FIG. 10, the DBMS performs query analysis processing 400, which is carried out before query execution and consists of query analysis (step 220), static optimization processing (step 221), and code generation (step 222), and query execution processing 410, which executes the query and consists of dynamic optimization processing (step 223), which substitutes constants for variables and selects a processing procedure, and code interpretation and execution of the query (step 224).

An outline of each processing unit follows.

(a) Query analysis processing 400
In FIGS. 10(a) and 10(c), query analysis (step 220) performs, on node 12, syntactic and semantic analysis of the query statement input by the application program (step 2200). In FIG. 10(a), static optimization processing (step 221) estimates, on node 12, the proportion of data satisfying the conditions from the conditional expressions appearing in the query, creates valid access path candidates (in particular, selecting indexes) based on preset rules, and creates processing procedure candidates. Code generation (step 222) expands, on node 12, the processing procedure candidates into executable form.

(b) Query execution processing 410
In FIG. 10(b), dynamic run-time optimization (step 223) determines, on node 12, the processing procedure to be executed by each node group, based on the substituted constants. Code interpretation and execution (step 224) interprets and executes the processing procedure on each node.

Next, the detailed processing flow of each processing unit is described.

In FIG. 10(d), static optimization processing (step 221) estimates the predicate selectivity of the conditional expressions appearing in the query (step 2210), prunes the access paths, which consist of indexes and the like (step 2211), and generates processing procedure candidates that combine these access paths (step 2212).

In FIG. 10(e), predicate selectivity estimation (step 2210) checks whether a variable appears in the query conditional expression (step 22101). If a variable appears, it checks whether column value distribution information exists for the conditional expression (step 22104). If it exists, the processing ends. If it does not exist, a default value is set according to the type of the conditional expression (step 22105), and the processing ends. If no variable appears, it likewise checks whether column value distribution information exists for the conditional expression (step 22104). If it does not exist, a default value is set according to the type of the conditional expression (step 22105), and the processing ends. If it exists, the selectivity is calculated using the column value distribution information (step 22103).
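The branching in this estimation step can be sketched as follows. The sketch models the column value distribution information as a simple value-to-count mapping and handles only equality predicates from the distribution; the default selectivity values, the mapping format, and the function name `estimate_selectivity` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical default selectivities per predicate type (cf. step 22105).
DEFAULT_SELECTIVITY = {"=": 0.1, "range": 0.3}


def estimate_selectivity(op: str, constant: Optional[int],
                         distribution: Optional[Dict[int, int]]) -> Optional[float]:
    """Return the estimated selectivity, or None when it must wait for run time.

    constant is None when the predicate compares against a variable ("?").
    """
    if distribution is None:
        # No column value distribution information: fall back to a default.
        return DEFAULT_SELECTIVITY.get(op, 0.5)
    if constant is None:
        # A variable with distribution information: defer to dynamic optimization.
        return None
    if op == "=":
        total = sum(distribution.values())
        return distribution.get(constant, 0) / total if total else 0.0
    # Other predicate types are not modeled in this sketch.
    return DEFAULT_SELECTIVITY.get(op, 0.5)


hist = {1: 50, 2: 30, 3: 20}  # assumed value counts for one column
```

For instance, `estimate_selectivity("=", 2, hist)` uses the distribution, whereas passing `None` as the constant leaves the estimate to be computed once the constant is substituted at execution time.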

In FIG. 11, access path pruning 2212 registers the indexes of the columns appearing in the query conditional expressions as access path candidates (step 22120). Next, it checks whether the tables accessed by the query are stored divided across a plurality of nodes (step 22121). If they are, a parallel table scan is registered as an access path candidate (step 22123). If they are not, a table scan is registered as an access path candidate (step 22123). It then checks whether the selectivity of each conditional expression has already been set (step 22124). If so, for each table the index of the conditional expression with the smallest selectivity is given the highest access path priority (step 22125). If not, the maximum and minimum values of the selectivity of each conditional expression are obtained (step 22126). Finally, a selection criterion for each access path is calculated from system characteristics such as CPU performance and I/O performance (step 22127), and only those access paths, using a single index or a combination of indexes, whose selectivity falls below the selection criterion are registered as access path candidates (step 22128).

In FIG. 12, processing procedure candidate generation 2213 checks whether the tables accessed by the query are stored divided across a plurality of nodes (step 22130). If they are, the processing moves to step 22135. If they are not, it checks whether the processing procedure candidates include sort processing (step 22131). If they do, the processing moves to step 22135. If they do not, it checks whether the table accessed by the query has only one access path (step 22132); if so, a single processing procedure is created (step 22133), and if not, a plurality of processing procedures are created (step 22134), and the processing ends. In step 22135, the query is decomposed into joinable two-way joins. For the storage node group of each divided table, a data retrieval/data distribution processing procedure is registered as a candidate, and a slot sort processing procedure is also registered as a candidate (step 22136). For the join processing node group, a slot sort processing procedure, an N-way merge processing procedure, and a matching processing procedure are registered as candidates, with the slot sort run length and the number of merge passes parameterized (step 22137). A requested-data output processing procedure is registered for the requested-data output node (step 22138). Finally, when all of the decomposition results have been evaluated (step 22139), the processing ends.

In FIG. 13, code generation 222 checks whether there is only one processing procedure candidate (step 2220). If so, the processing moves to step 2223. If not, optimization information such as column value distribution information is embedded in the processing procedures (step 2221), and a data structure is created for selecting a processing procedure based on the constants substituted at query execution time (step 2222). Finally, the processing procedures are expanded into executable form (step 2223).

In FIG. 14, dynamic optimization processing 223 checks whether only a single processing procedure has been created (step 22300). If so, the processing ends. If not, the selectivity is calculated from the substituted constants (step 22301). It is then checked whether the processing procedure candidates include a parallel processing procedure (step 22302). If they do not, a processing procedure is selected according to the access path selection criterion (step 22313), and the processing ends. If they do, optimization information (column value distribution information for the join column, the number of rows and pages of the tables to be accessed, and so on) is read from the dictionary (step 22303), and the processing time for data retrieval/data distribution is calculated as described above, taking the system characteristics into account (step 22304). From this processing time, the number p of nodes to allocate to join processing is determined, and processing procedure a1 is determined (step 22305).

It is then checked whether the data retrieval/data distribution processing times vary (step 22306). If they do, processing procedure a2, in which the data retrieval/data distribution node group executes slot sort processing, is set (step 22307). Next, processing procedure a3, in which the join node allocation p is increased by α nodes, is set (step 22308). If the requested-data output processing time is greater than the sum of the matching processing time and the time for one pass of N-way merge processing (step 22309), processing procedure a4, in which one pass of N-way merge processing is moved into matching processing, is set (step 22310). The optimal procedure among a1 to a4 is selected from the viewpoints of minimum response time, minimum load on each node, small impact on the response performance of other transactions, and so on (step 22311). Data distribution information is created from the optimization information (step 22312). If there is no optimization information, the data distribution information is created according to the value of a hash function evaluated on the join column. A processing procedure is selected according to the access path selection criterion (step 22313), and the processing ends.
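The way candidates a1 to a4 are assembled in steps 22305 to 22310 can be sketched as follows. The sketch reads the flow as setting a3 unconditionally and a2 and a4 only when their conditions hold; that reading, the dictionary representation of a procedure, and the function name `candidate_procedures` are assumptions. The final selection among the candidates (step 22311) is not modeled.

```python
from typing import Dict, List, Sequence


def candidate_procedures(retrieval_times: Sequence[float], matching_time: float,
                         merge_pass_time: float, output_time: float,
                         p: int, alpha: int = 1) -> List[Dict[str, object]]:
    """Build the candidate processing procedures a1..a4 for p join nodes."""
    # a1: baseline procedure with p join nodes (step 22305).
    candidates: List[Dict[str, object]] = [{"name": "a1", "join_nodes": p}]
    # a2: distribution nodes run slot sort when their times vary (steps 22306-22307).
    if max(retrieval_times) != min(retrieval_times):
        candidates.append({"name": "a2", "join_nodes": p,
                           "slot_sort_on": "distribution nodes"})
    # a3: join node allocation increased by alpha (step 22308).
    candidates.append({"name": "a3", "join_nodes": p + alpha})
    # a4: one merge pass moved into matching when output is the bottleneck
    # (steps 22309-22310).
    if output_time > matching_time + merge_pass_time:
        candidates.append({"name": "a4", "join_nodes": p,
                           "final_merge_in": "matching"})
    return candidates


example = candidate_procedures([100.0, 120.0], matching_time=30.0,
                               merge_pass_time=40.0, output_time=100.0, p=3)
```

A selector for step 22311 would then rank these candidates by estimated response time, node load, and impact on other transactions.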

In FIG. 15, in the code interpretation execution processing 224, each of the set nodes performs processing according to its corresponding processing procedure.

First, each node determines whether data retrieval / data distribution processing is set (step 22400). If it is set, the node accesses the database stored in its storage device and evaluates the conditional expression (step 22401). Based on the data distribution information created from the optimization information, the node retrieves the data and distributes it sequentially to the buffers for the respective join nodes (step 22402). The node determines whether the buffer for each join node is full and, if it is, transfers the data to the corresponding join node in page format. When all data corresponding to the query has been retrieved and distributed, the processing ends (step 22404).
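The buffered distribution described above can be sketched in Python as follows. This is an illustration under stated assumptions: `distribute`, `send_page`, and `PAGE_SIZE` are hypothetical names, rows are routed by a hash of the join key (the fallback the description gives when no optimization information exists), and flushing partially filled buffers at the end is an assumption, since the text only says that processing ends once all data has been distributed.

```python
from collections import defaultdict

PAGE_SIZE = 4  # rows per page; an illustrative value, not from the source


def distribute(rows, predicate, join_key, num_join_nodes, send_page):
    """Filter rows, route each by join-key hash, and ship full buffers as pages."""
    buffers = defaultdict(list)
    for row in rows:
        if not predicate(row):                 # step 22401: evaluate the condition
            continue
        node = hash(join_key(row)) % num_join_nodes  # assumed hash-based routing
        buffers[node].append(row)              # step 22402: fill that node's buffer
        if len(buffers[node]) == PAGE_SIZE:    # buffer full: transfer one page
            send_page(node, buffers.pop(node))
    for node, page in buffers.items():         # assumed: flush partial pages at the end
        if page:
            send_page(node, page)
```

Here `send_page(node, page)` stands in for the transfer over the interconnection network; any callable that accepts a node number and a list of rows will do.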

Each node also determines whether slot sort processing is set (step 22405). If it is set, the node receives page-format data from the data retrieval / data distribution processing nodes (step 22406) and performs slot sort processing on the received data in sequence (step 22407). The slot sort results are temporarily stored, and the slot sort processing ends (step 22408).

Each node also determines whether N-way merge processing is set (step 22409). If it is set, the node executes N-way merge processing on the slot sort results (step 22410), temporarily stores the N-way merge result in a buffer or the like (step 22411), and ends the N-way merge processing.
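How sorted runs from slot sort feed one round of N-way merge can be shown with a short Python sketch. It assumes that slot sort orders the rows of one received page, so that each page becomes one sorted run; `slot_sort` and `n_way_merge` are hypothetical names, and the standard-library `heapq.merge` is used as a stand-in for the merge itself.

```python
import heapq


def slot_sort(page, key):
    """Sort the rows of one received page; the result is one sorted run."""
    return sorted(page, key=key)


def n_way_merge(runs, key, n):
    """One merge round: combine sorted runs n at a time into longer sorted runs."""
    return [list(heapq.merge(*runs[i:i + n], key=key))
            for i in range(0, len(runs), n)]
```

Repeating `n_way_merge` until a single run remains yields a completely sorted list; stopping earlier leaves several runs, which is the situation in which the final merge stage has to be combined with matching.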

Each node also determines whether matching processing is set (step 22412). If it is set, the node matches the sorted lists resulting from the N-way merge processing and places the resulting data in an output buffer (step 22413). When the output buffer is full, its contents are transferred in page format to the request data output node (step 22415).
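The matching of two sorted lists can be sketched as a conventional sort-merge join. The following Python function is an illustration only: `match_sorted` and its parameters are hypothetical names, both inputs are assumed to be fully sorted on the join key, and the output buffer and page transfer of steps 22413 to 22415 are reduced to a plain result list.

```python
def match_sorted(left, right, key_left, key_right):
    """Join two lists that are already sorted on their join keys."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        kl, kr = key_left(left[i]), key_right(right[j])
        if kl < kr:
            i += 1                      # left key too small: advance left
        elif kl > kr:
            j += 1                      # right key too small: advance right
        else:
            # Find the end of the group of right rows sharing this key.
            j_end = j
            while j_end < len(right) and key_right(right[j_end]) == kl:
                j_end += 1
            # Pair every left row with this key against the whole right group.
            while i < len(left) and key_left(left[i]) == kl:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out
```

Because each input is scanned once, the cost of matching is linear in the input sizes plus the number of result pairs, which is why the preceding sort and merge stages dominate the tuning discussion.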

Each node also determines whether request data output processing is set (step 22416). If it is set, the node determines whether page-format data is being transferred from a join node (step 22417). If page-format data is transferred, the node receives it (step 22418) and outputs the query processing result to the application program; if no page-format data is transferred, the node outputs the query processing result as it is (step 22419).

In the code interpretation execution processing described above, when slot sort processing is to be executed by the data retrieval / data distribution processing node group, for example because the processing time varies, the code interpretation execution processing 224 is executed again after the data retrieval / data distribution processing ends, so that the slot sort processing is performed.

Further, if the N-way merge result is not a completely sorted sequence in step 22413, the final merge stage is performed together with the matching processing.

Processing in this way shortens the query response time of the database management system.

The join node assignment method shown in FIG. 6 and the tuning methods shown in FIGS. 7, 8, and 9 may each be applied independently or in any combination. The dynamic optimization processing 223 assumes the case in which all combinations are applicable. Further variations are conceivable: applying a parallel input/output access method using a plurality of disk devices, or a batch input/output or read-ahead input/output method, to the data retrieval processing; applying a data distribution method based on optimization information or a hash function to the data distribution processing; applying a parallel sort method to the N-way merge processing; applying an inter-node matching method to the matching processing; and assigning a plurality of nodes to the request data output processing and applying a parallel receiving method. Steps 22309 and 22310 assume one round of N-way merge processing, but in general n rounds (n ≥ 1) may be used.

For the parallel pipeline operation shown in FIG. 4, applying the join node assignment method shown in FIG. 6 and the tuning methods shown in FIGS. 7, 8, and 9 may, in some cases, allow the merge phase to be omitted from among the retrieval phase, merge phase, and join phase. This becomes possible through a longer slot sort run length and the relocation of N-way merge processing. In that case, the merge phase is also omitted from the query execution processing.

The query processing method of the present invention is not limited to the combined use of rules based on statistical information and cost evaluation; it can be applied as long as a processing procedure that gives appropriate database reference characteristic information is obtained. For example, it can also be applied to a DBMS that performs optimization using cost evaluation only, rules only, or a combination of cost evaluation and rules.

The present invention can be realized via the software system of a large computer having a tightly coupled or loosely coupled multiprocessor system, or via a tightly coupled or loosely coupled composite processor system in which a dedicated processor is provided for each processing unit. It is also applicable to a single-processor system, provided that parallel processes are allocated to the respective processing procedures.

According to this embodiment, the number of nodes is determined for each database operation executed on the nodes; when the division of data is uneven, the data is divided evenly among the nodes; and each database operation executed on a node is parameterized so that the expected processing times are equalized. As a result, processing time is not biased between nodes, the pipeline operates smoothly, and high-speed query processing can be realized.
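One way to divide data evenly from column value distribution information is to choose key-range boundaries so that each node receives roughly the same number of rows. The Python sketch below is an assumption about how such boundaries could be derived, not the method of this embodiment: `range_boundaries` is a hypothetical name, and `value_counts` stands for a list of (join-key value, row count) pairs of the kind the dictionary might hold.

```python
def range_boundaries(value_counts, num_nodes):
    """Return upper-bound keys so that each key range holds roughly equal rows.

    The result has at most num_nodes - 1 boundaries; node k receives the keys
    above boundary k - 1 and up to boundary k, and the last node the rest.
    """
    total = sum(count for _, count in value_counts)
    target = total / num_nodes          # ideal number of rows per node
    bounds, running = [], 0
    for value, count in sorted(value_counts):
        running += count
        # Close the current range once its cumulative share has been reached.
        if len(bounds) < num_nodes - 1 and running >= target * (len(bounds) + 1):
            bounds.append(value)
    return bounds
```

A single very frequent key value cannot be split by range boundaries alone, so heavily skewed columns would still leave one node with more rows than the others under this scheme.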

FIG. 1: Schematic diagram of parallel join processing
FIG. 2: Configuration diagram of the database system
FIG. 3: Hardware configuration diagram
FIG. 4: Schematic diagram of parallel pipeline operation
FIG. 5: Schematic diagram of data distribution processing
FIG. 6: Schematic diagram of join node assignment
FIG. 7: Schematic diagram of slot sort preprocessing
FIG. 8: Schematic diagram of slot sort run length tuning
FIG. 9: Schematic diagram of N-way merge count tuning
FIG. 10: Flowchart of the database management system
FIG. 11: Flowchart of the database management system
FIG. 12: Flowchart of the database management system
FIG. 13: Flowchart of the database management system
FIG. 14: Flowchart of the database management system
FIG. 15: Flowchart of the database management system

Explanation of reference numerals

10, 11: application programs; 20: database management system; 22: logical processing unit; 220: query analysis; 221: static optimization processing; 222: code generation; 223: dynamic optimization processing; 224: code interpretation execution; 30: operating system; 40: database; 50: dictionary; 80: interconnection network; 90, 91, 92, 93, 94, 95, 96, 97: nodes.

Claims

A database management system comprising:
storage means for storing the data constituting a database in a distributed manner;
a plurality of first nodes that retrieve the stored data in accordance with a retrieval request and send it out;
a plurality of second nodes that execute database processing on the data sent from the first nodes in accordance with an input database processing request and output a database processing result; and
a node that analyzes an input query request, generates a plurality of database processing requests based on a join column of the database, distributes them to the second nodes, generates a retrieval request for the first nodes, receives the database processing results of the database processing requests from the second nodes, and outputs a processing result of the query request.

A database management device that issues database operation requests to a plurality of database operation devices, the database management device comprising:
a query processing unit that analyzes an input query request, generates a plurality of database operation requests based on the join column of the database operation key of the query request, distributes the generated database operation requests to the database operation devices, receives the processing result output from each of the database operation devices, and outputs a processing result of the query request.