JP2012123680A

JP2012123680A - Distributed database management system and distributed database management method

Info

Publication number: JP2012123680A
Application number: JP2010274862A
Authority: JP
Inventors: Terumasa Kawahata; 輝聖川畠
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2010-12-09
Filing date: 2010-12-09
Publication date: 2012-06-28
Anticipated expiration: 2030-12-09
Also published as: JP5659757B2

Abstract

PROBLEM TO BE SOLVED: To speed up data search processing in a distributed database system. SOLUTION: A distributed database management system includes a front server 21 configured to transfer schema information indicating the distribution destinations of table data to back-end servers 22, 23, and 24 and, when there is item data not included in the join key column stored in each of the back-end servers 22, 23, and 24, to complementarily store that item data as duplicate data in the corresponding back-end servers. Each of the back-end servers 22, 23, and 24 is configured, when a search request is sent to it, to extract the search column data required by the search from the join key column by joining the column data distributed to that server, and to generate an intermediate search result by transmitting the search column data to the other servers to which the corresponding join data has been distributed.

Description

The present invention relates to a distributed database management system that manages data stored in different databases.

Memory database management systems, which load all of the data handled by a database into memory before performing computations such as data searches, are in use.
In batch applications that process large volumes of data at once, or in systems that build data marts from large volumes of data, such a memory database management system speeds up aggregation and join processing and thereby makes the system as a whole faster.

In a disk-based database system, by contrast, large-scale processing, for example of tens of millions of records, is made feasible by running it during system downtime, such as at night.
However, when the workload includes complex join or aggregation processing, the processing may not finish within the downtime window.
For this reason, it has become common to incorporate the memory database management system described above into a disk-based database management system, both to build a stable system and to shorten processing time.

A memory database management system reads the data to be processed into memory, which is a semiconductor storage device, and computes on it there. When the volume of data handled in the system, or of temporary data produced during computation, exceeds the memory capacity installed in the system, a virtual memory area on disk is generally used.
However, disk I/O transfer rates are far lower than memory transfer rates, so once the data volume exceeds memory capacity, database processing performance (speed) degrades severely. Consequently, when the data to be processed exceeds the installed memory, a memory database management system cannot practically be incorporated into a disk-based database system.

For example, when a system whose server has several hundred gigabytes of memory must process table data comprising several billion to tens of billions of records of several hundred bytes each, the full data set cannot be loaded into the memory of a single server.
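As a rough check of the figures above, the following sketch computes how many servers' worth of memory such a table would need. The concrete values (200 bytes per record, 10 billion records, 512 GB per server) are illustrative assumptions picked from within the stated ranges; the specification gives no exact numbers.

```python
import math

# Illustrative values chosen from within the ranges stated above (assumptions).
BYTES_PER_RECORD = 200               # "several hundred bytes" per record
RECORD_COUNT = 10_000_000_000        # within "several billion to tens of billions"
SERVER_MEMORY_BYTES = 512 * 10**9    # "several hundred gigabytes" per server


def servers_needed(record_bytes: int, records: int, memory_bytes: int) -> int:
    """Minimum number of servers whose combined memory holds the raw table."""
    total = record_bytes * records
    return math.ceil(total / memory_bytes)


total_bytes = BYTES_PER_RECORD * RECORD_COUNT
print(total_bytes / 10**12, "TB of raw table data")
print(servers_needed(BYTES_PER_RECORD, RECORD_COUNT, SERVER_MEMORY_BYTES), "servers")
```

Under these assumed values the raw table alone is about 2 TB, several times the memory of one server, before any working space for intermediate results is counted.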

Likewise, Web access log data or sales data recorded per receipt or per product can be expected to exceed this data volume (record count). Aggregating such enormous data with a disk-based database or a virtual memory area takes so long that the system requirements may no longer be met.

Furthermore, a memory database server management system that performs distributed processing across multiple memory database servers often cannot simply be substituted for a disk-based distributed database management system that uses storage such as ordinary hard disks.
For example, an ordinary distributed database management system uses horizontal partitioning, which divides table data by row. To process a query that spans multiple servers, and in particular a join such as a semi-join, the database servers must communicate with one another to transfer data while processing.
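To make the communication cost concrete, here is a minimal sketch of a textbook semi-join between two sites. It illustrates the general technique named above, not the method of this patent; the function name, the row format (lists of dicts), and the sample tables are assumptions made for illustration.

```python
def semi_join(left_rows, right_rows, key):
    """Textbook semi-join between a local site (left) and a remote site (right).

    Returns the joined rows plus the number of keys shipped to the remote
    site and the number of rows shipped back, as a crude traffic measure.
    """
    # Step 1: the local site projects the join key and ships only the keys.
    keys_shipped = {row[key] for row in left_rows}
    # Step 2: the remote site keeps only the rows that match a shipped key.
    reduced = [row for row in right_rows if row[key] in keys_shipped]
    # Step 3: the reduced rows are shipped back and joined locally.
    joined = [
        {**left, **right}
        for left in left_rows
        for right in reduced
        if left[key] == right[key]
    ]
    return joined, len(keys_shipped), len(reduced)


orders = [{"product_id": 1, "qty": 2}, {"product_id": 3, "qty": 1}]
products = [
    {"product_id": 1, "name": "pen"},
    {"product_id": 2, "name": "ink"},
    {"product_id": 3, "name": "pad"},
]
joined, keys_shipped, rows_returned = semi_join(orders, products, "product_id")
print(joined, keys_shipped, rows_returned)
```

Even in this reduced form, every join requires two network transfers, which is the overhead the passage identifies as costly relative to in-memory computation.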

With a memory database, however, this heavy communication takes far longer than the in-memory computation itself, so the speed that characterizes a memory database is largely lost, making the approach unsuitable for systems that demand high-speed processing.

As a related technique, a scheme has been disclosed in which actual data is aggregated as index-format tables on a single front memory database server while the remaining actual data is distributed across multiple servers (Patent Document 1). Because aggregation and join processing are performed on the front memory database server, the inter-server communication required under ordinary horizontal distribution can be reduced.

As another related technique, a system has been disclosed in which tables containing item columns with the same meaning, drawn from multiple databases that are updated at different times, are merged in advance and held as a single database, and the latest update data is consolidated into that database. Unnecessary data and databases are thereby excluded from the search targets, which improves the operational efficiency and speed of the database (Patent Document 2).

Patent Document 1: JP 2010-085568 A
Patent Document 2: JP 2006-268783 A

In the related technique of Patent Document 1, however, the front memory database server must hold only index-format data. Although this compresses the data relative to ordinary table data, all of the table data is still held on a single front memory database server. For data volumes large enough to require the memory of several tens of servers, the memory limit of the front memory server may therefore be exceeded.

In the related technique of Patent Document 2, newly generated row data and sets of column data are updated preferentially and multiple databases are integrated, so that a single table can be treated as the search target when a user issues a search request.
However, when joins are performed on table data stored in different databases, the volume of data transferred between databases and of replicated data becomes enormous. In practice the technique is therefore limited to relatively small tables below a certain size, and the heavy communication traffic also prevents the join processing from being sped up.

[Object of the Invention]
An object of the present invention is to remedy the disadvantages of the related techniques described above and to provide a distributed database management system and a distributed database management method that perform reference, aggregation, and search processing on data residing on different servers more quickly.

To achieve the above object, a distributed database management system according to the present invention includes a request data processing device that distributes preset table data as column data to a plurality of different database servers and, in response to a search request from an external terminal, obtains search results based on the search request from the database servers. The request data processing device includes a schema information replication and transfer unit that transfers schema information indicating the distribution destinations of the column data to each database server, and a join key data complementary storage unit that, when there is item data not stored in the join key column stored in each database server, complementarily stores that item data as duplicate data in the back-end servers. Each database server includes: a data join and extraction unit that, when a search request is sent from the request data processing device, joins the column data distributed to its own database server on the basis of the join key column and thereby extracts from the join key column the search column data required by the search request; a join server identification unit that identifies, on the basis of the schema information, the other server to which the join data corresponding to the extracted search column data has been distributed; and a search result join generation unit that transmits the search column data to the identified database server and joins it with the join data to generate the search result.

A distributed database management method according to the present invention is a method in which a request data processing device, which distributes preset table data as column data to a plurality of different database servers, obtains search results based on a search request from the database servers in response to the search request from an external terminal. The request data processing device transfers schema information indicating the distribution destinations of the column data to each database server and, when there is item data not stored in the join key column stored in each database server, complementarily stores that item data as duplicate data in the back-end servers. When a search request is sent from the request data processing device, each database server joins the data distributed to its own database server on the basis of the join key column and thereby extracts from the join key column the search column data required by the search request; identifies, on the basis of the schema information, the other server to which the join data corresponding to the extracted search column data has been distributed; and transmits the search column data to the identified database server and joins it with the join data to generate the search result.

As described above, the present invention includes a request data processing device that, when there is item data not stored in the join key column distributed to and stored on each database server, complementarily stores that item data as duplicate data, and database servers that join the column data distributed to them on the basis of the join key column and thereby extract from the join key column the search column data required by a search request. The invention can thus provide a distributed database management system and a distributed database management method that perform reference, aggregation, and search processing on data residing on different database servers more quickly.

FIG. 1 is a schematic block diagram showing an embodiment of the distributed database management system according to the present invention.
FIG. 2(a) is an explanatory diagram showing an example of the "A handled products" table in the distributed database management system. FIG. 2(b) is an explanatory diagram showing an example of the sales table. FIG. 2(c) is an explanatory diagram showing an example of the repair sales summary table.
FIG. 3 is an explanatory diagram showing an example of the placement of column data in the distributed database management system shown in FIG. 1.
FIG. 4 is an explanatory diagram showing an example of the placement of column data in the distributed database management system shown in FIG. 1.
FIG. 5 is a flowchart showing the operation steps for transferring schema information in the distributed database management system shown in FIG. 1.
FIG. 6 is a flowchart showing the operation steps for placing table data on the back-end servers in the distributed database management system shown in FIG. 1.
FIG. 7 is a flowchart showing the operation steps of join processing between back-end servers in the distributed database management system shown in FIG. 1.
FIG. 8 is a flowchart showing the operation steps of search processing on a back-end server in the distributed database management system shown in FIG. 1.

[Embodiment]
A distributed database management system 100 according to an embodiment of the present invention includes a database client (external terminal) 1 that sends requests and commands generated from user input, and a distributed memory database management system 2 that searches the database in response to search requests from the database client 1 and generates search results.

The distributed memory database management system 2 includes back-end memory database servers (corresponding to the database servers) 22, 23, and 24, which store the data making up pre-registered table data as actual data in column units, and a front memory database server (request data processing device) 21. The front memory database server 21 divides the table data into column data (column units) and stores it on the back-end servers 22, 23, and 24; when a search request arrives from the database client 1, it obtains search results from each of the back-end servers 22, 23, and 24 and generates search result table data that integrates them.

As described above, the back-end memory database servers (hereinafter "back-end servers") 22, 23, and 24 each return their own search results as intermediate search results to the front memory database server (hereinafter "front server") 21. The query execution unit 211 of the front server 21 then merges the intermediate search results sent from the back-end servers 22, 23, and 24 to generate the search result, and returns it to the database client 1.

In the distributed memory database management system of this embodiment, three back-end memory database servers 22, 23, and 24 are each connected in parallel to the front server 21 via the internal network 4; however, one or more back-end memory database servers suffice.

As shown in FIG. 1, the database client 1 and the front memory database server 21 are connected via a communication line, and the front server 21 and the back-end servers 22, 23, and 24 are likewise connected by the internal network 3. Any communication protocol may be used over the communication line and between the servers.

The front memory database server 21 includes a query execution unit 211 that analyzes the search queries (search requests) and messages sent from the database client 1 and, based on the analysis, issues queries and operation requests to the back-end servers 22, 23, and 24; a schema information management unit 212 that stores schema information; and a temporary storage memory area 213 that temporarily stores the intermediate results sent to it.

Based on the preset schema information, the query execution unit 211 also distributes and stores (partitions) the table data across the back-end servers 22, 23, and 24 via the internal network 4.
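The column-wise distribution described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the row format, and the column-to-server mapping are hypothetical, and the sketch ignores the partitioning rule and the complementary storage of join key data.

```python
def partition_by_column(rows, column_to_server):
    """Split a row-oriented table into column data and group it by target server.

    rows             -- list of dicts, one per table row
    column_to_server -- assumed mapping: column name -> back-end server id
    Returns {server_id: {column_name: [values in row order]}}.
    """
    placement = {}
    for column, server in column_to_server.items():
        values = [row[column] for row in rows]
        placement.setdefault(server, {})[column] = values
    return placement


sales = [
    {"product_id": 1, "quantity": 2, "amount": 240},
    {"product_id": 2, "quantity": 1, "amount": 300},
]
# Hypothetical placement: one column per back-end server 22, 23, 24.
mapping = {"product_id": 22, "quantity": 23, "amount": 24}
print(partition_by_column(sales, mapping))
```

Because values are kept in row order, a row can be reassembled by position from the columns held on different servers.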

Furthermore, the query execution unit 211 transfers the schema information managed by the schema information management unit 212 to each of the back-end servers 22, 23, and 24 (schema information transfer function). As a result, the schema information management units 222, 232, and 242 all hold the same schema information.

In response to an inquiry from the query execution unit 211, the schema information management unit 212 indicates which back-end server (22, 23, or 24) to query, based on the stored schema information (inquiry destination designation function).
The query execution unit 211 can thus issue queries and operation requests to each of the back-end database servers (back-end servers) 22, 23, and 24 according to the instructions from the schema information management unit 212.

The temporary storage memory area 213 has an intermediate result storage function for storing the intermediate results sent from each of the back-end servers (22, 23, 24). The schema information managed by the schema information management unit 212 is also stored in this memory area 213.

Each of the back-end servers 22, 23, and 24 is a computer that functions as a database. It is connected to the front server 21 via the internal network 4 and, in response to requests from the query execution unit 211, searches the database set up in its own memory area.
Because the back-end servers 22, 23, and 24 have the same internal configuration, only the internal configuration of the back-end server 22 is described here; the back-end servers 23 and 24 have equivalent corresponding components.

The back-end server 22 includes a local query execution unit 221 that searches data within its own server in response to search processing requests from the front server 21; a schema information management unit 222 that analyzes the schema information in response to inquiries from the local query execution unit 221; and a database storage memory area 223 that holds the table data generated on the basis of the schema information entered in advance.

The front server 21, which includes the query execution unit 211, the schema information management unit 212, and the data storage memory area 213, is the server that accepts queries from the database client and returns search results to it.

The query execution unit 211 checks the content of the queries and operations issued by the database client and issues queries and operations to the back-end database servers (22, 23, 24).
The table definitions, however, are managed by the schema information management unit 212, so the query execution unit 211 obtains this information from the schema information management unit 212 before it can query the back-end database servers.

The schema information is also transferred to each of the back-end servers 22, 23, and 24, so that the schema information management unit 212 of the front server 21 and the schema information management units on the back-end servers (22, 23, 24) manage the same information.

The query execution unit 211 also stores the intermediate results returned from the back-end servers 22, 23, and 24 in the temporary storage memory area 213 and, once all of the results have arrived, merges them and returns the merged result to the database client 1.
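The collect-then-merge behaviour described above might look like the sketch below. The class name and row format are hypothetical, and "merge" is assumed here to mean simple concatenation in server-id order; the text does not specify the actual merge operation.

```python
from typing import Dict, Iterable, List, Optional


class IntermediateResultBuffer:
    """Stand-in for the temporary storage area: holds results until all arrive."""

    def __init__(self, backend_ids: Iterable[int]) -> None:
        self.expected = set(backend_ids)
        self.results: Dict[int, List[dict]] = {}

    def store(self, backend_id: int, rows: List[dict]) -> Optional[List[dict]]:
        """Store one server's intermediate result.

        Returns the merged result once every expected server has reported,
        otherwise None.
        """
        self.results[backend_id] = rows
        if set(self.results) != self.expected:
            return None
        merged: List[dict] = []
        # Assumed merge: concatenate in ascending server-id order.
        for server_id in sorted(self.results):
            merged.extend(self.results[server_id])
        return merged


buffer = IntermediateResultBuffer([22, 23, 24])
print(buffer.store(23, [{"product_id": 2}]))   # still waiting for 22 and 24
print(buffer.store(22, [{"product_id": 1}]))   # still waiting for 24
print(buffer.store(24, []))                    # all arrived, merged rows returned
```

Holding results until every server has reported keeps the client from seeing a partial answer, at the cost of waiting for the slowest server.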

The schema information management unit 212 stores and manages schema information entered in advance. The schema information includes (1) table definition information (in particular, join key designation information indicating the join key used in join processing); (2) column data storage location information indicating which back-end memory database server (22, 23, or 24) holds each piece of data (column data); and (3) partitioning condition information (the "partitioning rule") indicating the rule used for partitioning, such as range partitioning or hash partitioning.
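One way to picture the three parts of the schema information is the data structure below. It is a sketch only: the class names, field names, and the sample table are assumptions, since the text does not define a concrete format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PartitioningRule:
    """(3) Partitioning condition information."""
    kind: str    # e.g. "range" or "hash"
    column: str  # column the rule is evaluated on (assumed field)


@dataclass(frozen=True)
class SchemaInfo:
    table: str
    columns: List[str]               # (1) table definition
    join_key: str                    # (1) join key designation
    column_location: Dict[str, int]  # (2) column name -> back-end server id
    partitioning: PartitioningRule   # (3) partitioning rule

    def server_for(self, column: str) -> int:
        """Answer the question 'which back-end server holds this column?'."""
        return self.column_location[column]


schema = SchemaInfo(
    table="sales",
    columns=["product_id", "quantity", "amount"],
    join_key="product_id",
    column_location={"product_id": 22, "quantity": 23, "amount": 24},
    partitioning=PartitioningRule(kind="hash", column="product_id"),
)
print(schema.server_for("amount"))  # 24
```

A lookup such as `server_for` corresponds to the inquiry destination designation described earlier: the query execution unit asks where a column lives before issuing a request.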

The data storage memory area 213 holds in memory the intermediate results returned from the back-end servers 22, 23, and 24 via the query execution unit 211. The schema information to be managed by the schema information management unit 212 is also stored in this data storage memory area.

In response to queries and requests sent from the front server 21 (query execution unit 211), the column data management unit 221 searches the data stored in advance in the column data storage memory area 223 and updates the stored data (search processing function).

The column data storage memory area 223 also holds the schema information handled by the schema information management unit 222. This schema information is identical to that stored on the front server 21 and on the other back-end servers 23 and 24.

The column data management unit 221 also has a local join processing function, which joins column data stored in its own database, and a communication join processing function, which sends column data for joining to the other back-end servers (23, 24) and joins it with the column data stored there.
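The two join functions can be illustrated with the sketch below, which is one possible reading of the description rather than the patented implementation. The class and method names are hypothetical, and the sketch assumes that the join key column (`product_id`) is present on both servers and that a direct method call stands in for the network transfer.

```python
from typing import Callable, Dict, List, Tuple


class BackendServer:
    """Toy back-end server holding row-aligned column data in memory."""

    def __init__(self, server_id: int, columns: Dict[str, list]) -> None:
        self.server_id = server_id
        self.columns = columns  # column name -> values in row order

    def local_join(self, key_column: str, value_column: str,
                   predicate: Callable[[object], bool]) -> List[Tuple]:
        """Local join: pair two columns held on this server, keep matching rows."""
        pairs = zip(self.columns[key_column], self.columns[value_column])
        return [(key, value) for key, value in pairs if predicate(value)]

    def join_with(self, pairs: List[Tuple], key_column: str,
                  value_column: str) -> List[Tuple]:
        """Communication join: runs on the receiving server.

        Joins the (key, value) pairs sent by another server with a column
        stored here, using the shared join key.
        """
        lookup = dict(zip(self.columns[key_column], self.columns[value_column]))
        return [(key, value, lookup[key]) for key, value in pairs if key in lookup]


s22 = BackendServer(22, {"product_id": [1, 2, 3], "amount": [50, 150, 300]})
s23 = BackendServer(23, {"product_id": [1, 2, 3], "name": ["pen", "ink", "pad"]})

# Server 22 filters locally, then only the surviving pairs are sent to server 23.
selected = s22.local_join("product_id", "amount", lambda amount: amount >= 100)
print(s23.join_with(selected, "product_id", "name"))
```

Filtering locally first means only the rows that survive the search condition cross the network, which is the traffic reduction the description aims at.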

The operation of transferring schema information among the database client 1, the front memory database server 21, and the back-end memory database servers 22, 23, and 24 is described below with reference to the flowchart of FIG. 5.

First, a database administrator or user uses the database client 1 to create the schema information (definitions) needed to build the distributed database and enters it into the distributed database management system 2.
The database client 1 may also be configured to send a command requesting the distributed memory database management system 2 to store the actual data in distributed form (step S51 in FIG. 5).

The schema information, together with the command entered by the user, includes a table definition that defines the attributes of the table information to be stored; a partitioning rule that specifies which range of the table's column data is stored on which of the back-end servers 22, 23, and 24; join key designation information indicating the columns used when the back-end servers join data with one another; and table information indicating the referencing table or the table to be processed.

The database client 1 inputs the schema information set by the user to the query execution unit 211 (step S51).

The query execution unit 211 analyzes the schema information input from the database client 1 (step S52) and passes it to the schema information management unit 212 (step S53).
The schema information management unit 212 stores the schema information in the memory area 213 (step S54) and notifies the query execution unit 211 that the setting is complete (step S55).

Next, the query execution unit 211 replicates the schema information, including the join key designation information that indicates the join key column and the partitioning rule, and transfers it to each of the back-end servers 22, 23, and 24 on which the column data is to be stored (step S56).
As a result, the front server 21 and the back-end servers 22, 23, and 24 hold the same schema information.

ここで、スキーマ情報管理部２２２，２３２，２４２は、それぞれ、各バックエンドサーバ２２，２３，２４内に設けられた半導体記憶装置のメモリ領域であって、転送されたスキーマ情報を記憶保持するものとする。 Here, the schema information management units 222, 232, and 242 are memory areas of the semiconductor storage devices provided in the respective back-end servers 22, 23, and 24, and store and hold the transferred schema information.

バックエンドサーバ２２，２３，２４のローカルクエリ実行部２２１、２３１，２４１は、それぞれ送り込まれたスキーマ情報をスキーマ情報管理部（２２２、２３２，２４２）に送信し、スキーマ情報管理部（２２２、２３２，２４２）は、これを保存する（ステップＳ５８）。
これにより、バックエンドサーバ２２，２３，２４には同一のスキーマ情報が設定された状態となる。 The local query execution units 221, 231, and 241 of the back-end servers 22, 23, and 24 each pass the received schema information to the schema information management units 222, 232, and 242, which store it (step S58).
As a result, the same schema information is set in the back-end servers 22, 23, and 24.

次いで、ローカルクエリ実行部２２１、２３１，２４１は、スキーマ情報の設定が完了したことをクエリ実行部２１１に通知し、クエリ実行部２１１は、バックエンドサーバ全てからスキーマ情報の設定が完了したことが通知された場合に、データベースクライアント１にスキーマ情報の格納完了を通知する（ステップＳ６０）。 Next, the local query execution units 221, 231, and 241 notify the query execution unit 211 that the setting of schema information has been completed, and the query execution unit 211 has completed the setting of schema information from all back-end servers. When notified, the database client 1 is notified of the completion of storing schema information (step S60).
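The schema distribution sequence described above (store locally, copy to every back-end server, report completion only after all of them acknowledge) can be sketched as follows. This is an illustrative in-process model, not the publication's implementation; the class and method names are hypothetical, and a plain dictionary stands in for the schema information.

```python
import copy


class BackendServer:
    """Stand-in for a back-end server with its own schema storage."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.schema = None

    def receive_schema(self, schema):
        # Corresponds to storing the transferred schema (step S58).
        self.schema = schema
        return True  # acknowledgement sent back to the front server


class FrontServer:
    """Stand-in for the front server holding the master copy."""

    def __init__(self, backends):
        self.backends = backends
        self.schema = None

    def set_schema(self, schema):
        # Store locally first (step S54), then send a separate copy to each back end.
        self.schema = schema
        acks = [b.receive_schema(copy.deepcopy(schema)) for b in self.backends]
        # Completion is reported only when every back end has acknowledged (step S60).
        return all(acks)


backends = [BackendServer(n) for n in (22, 23, 24)]
front = FrontServer(backends)
done = front.set_schema({"join_key": "product_id"})
print(done)
```

Each back end receives its own copy, so later queries can consult the schema locally without asking the front server.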

また、フロントサーバ２１は、データベース管理者やユーザなどにより、分散データベース管理システム２に対して入力されたテーブル情報などの実データを、予め設定されたスキーマ情報の内容に基づき各バックエンドサーバ２２，２３，２４それぞれに、処理対象の表データ（テーブル）をカラム単位に分散データ配置するパーティショニング分散配置機能を有する。 Further, the front server 21 has a partitioning distributed arrangement function that distributes actual data, such as table information input to the distributed database management system 2 by a database administrator or user, across the back-end servers 22, 23, and 24 in column units, based on the contents of the preset schema information.

尚、スキーマ情報には、データベースクライアント１からコマンドにより指定される表定義、格納対象である表におけるバックエンドサーバ２２，２３，２４への分配ルールを示すパーティショニングルール、結合演算時の結合キー列を示す情報（結合キー指定情報）、対象の表データ、および、どの表が参照側の表であるかを示す情報などが含まれる。
また、上記スキーマ情報は、データベース管理者やユーザによりクエリ実行部２１１に対して入力された物であってもよい。 The schema information includes a table definition specified by a command from the database client 1, a partitioning rule indicating how the table to be stored is distributed to the back-end servers 22, 23, and 24, information indicating the join key column used in join operations (join key designation information), the target table data, information indicating which table is the referenced table, and the like.
The schema information may be input to the query execution unit 211 by a database administrator or user.

クエリ実行部２１１は、一時保存メモリ領域２１３に記憶されたスキーマ情報の内容を解析すると共に、バックエンドサーバ２２，２３，２４それぞれのスキーマ情報管理部２２２，２３２，２４２に対してスキーマ情報を展開する。
これにより、フロントサーバ２１内に記憶されたスキーマ情報と同一内容のスキーマ情報が各バックエンドサーバ２２，２３，２４のメモリ領域にも保存される。 The query execution unit 211 analyzes the contents of the schema information stored in the temporary storage memory area 213 and deploys the schema information to the schema information management units 222, 232, and 242 of the back-end servers 22, 23, and 24, respectively.
As a result, the schema information having the same content as the schema information stored in the front server 21 is also stored in the memory area of each back-end server 22, 23, 24.

次に、スキーマ情報管理部２２２，２３２，２４２にスキーマ情報が格納された後、クエリ実行部２１１が実行するパーティショニング分散配置機能について、詳説する。
尚、パーティショニング分配機能における動作は以下に示す第一および第二段階に分かれる。 Next, the partitioning distributed arrangement function executed by the query execution unit 211 after the schema information is stored in the schema information management units 222, 232, and 242 will be described in detail.
The operation in the partitioning distribution function is divided into the following first and second stages.

［パーティショニング分散配置機能］
まず、パーティショニング分散配置機能の第一段階では、クエリ実行部２１１は、スキーマ情報における表定義とパーティショニングルールに基づき、バックエンドサーバへのデータ配置を行う。 [Partitioning distribution function]
First, in the first stage of the partitioning distributed arrangement function, the query execution unit 211 performs data arrangement on the back-end server based on the table definition and the partitioning rule in the schema information.

このとき、各バックエンドサーバ２２，２３，２４に分散配置する際に、クエリ実行部２１１は、同一実データ列（カラム）内に含まれる実データの重複を排除し（正規化し）且つソートした状態で格納するものとする。
これにより、各バックエンドサーバ２２，２３，２４では、重複排他的にソートした状態で実データが格納されるため、データに変更や修正があった場合にも、再計算を行う必要なく、格納されたデータの整合性を保つことができる。 At this time, when distributing the data to the back-end servers 22, 23, and 24, the query execution unit 211 stores the actual data in each actual data column with duplicates removed (normalized) and in sorted order.
As a result, each of the back-end servers 22, 23, and 24 stores the actual data deduplicated and sorted, so that even when data is changed or corrected, the consistency of the stored data can be maintained without recalculation.

また、クエリ実行部２１１は、「実データ列」を格納した上で、各データ列を参照するインデックスから成るカラム単位の列データを示すインデックス列を生成し格納する（インデクス列生成機能）。
ただし、予め設定された結合の定義によって、ローカルに格納された２つ以上表の中に同一の結合キーがある場合は1つの実データ列に統合して配置する。
これにより、結合をつど行うことなく、インデックス列を参照することで、各バックエンドサーバローカルで結合処理を行うことが可能となる（第一段階終わり）。 In addition, after storing the actual data columns, the query execution unit 211 generates and stores index columns, which are column-unit data made up of indexes that refer to the entries of each actual data column (index column generation function).
However, when two or more locally stored tables share the same join key under a preset join definition, the key is consolidated into a single actual data column.
As a result, each back-end server can perform join processing locally by referring to the index columns, without performing a fresh join each time (end of the first stage).
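The storage format described above (a deduplicated, sorted actual data column plus an index column that refers into it) resembles what is commonly called dictionary encoding. The sketch below illustrates the idea under stated assumptions: the function names are hypothetical, indexes are 1-based to match the "4th and 8th values" wording used later in the text, and the sample values are illustrative.

```python
def encode_column(values):
    """Split a raw column into (actual data column, index column)."""
    actual_data = sorted(set(values))                       # deduplicate and sort
    position = {v: i + 1 for i, v in enumerate(actual_data)}  # 1-based positions
    index_column = [position[v] for v in values]            # one index per row
    return actual_data, index_column


def encode_shared_key(columns_by_table):
    """Build ONE actual data column shared by the join key columns of several tables."""
    actual_data = sorted(set().union(*columns_by_table.values()))
    position = {v: i + 1 for i, v in enumerate(actual_data)}
    index_columns = {
        table: [position[v] for v in column]
        for table, column in columns_by_table.items()
    }
    return actual_data, index_columns


actual, index = encode_column([10013, 10010, 10013, 10011])
shared_actual, shared_index = encode_shared_key(
    {"sales": [10013, 10010, 10013], "summary": [10011, 10013]}
)
print(actual, index)
print(shared_actual, shared_index)
```

Because both tables index into the same actual data column, equal join key values always map to equal index numbers, which is what allows a local join to compare small integers instead of the values themselves.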

パーティショニング対象の表に対応する結合定義が予め設定されている場合、各バックエンドサーバ２２，２３，２４それぞれに結合キー列のデータのうち分配によって各サーバの格納されていない（すなわち、各サーバが保有していない）データについて、他のサーバからレプリケーションを行い、結合キー列に含まれるすべてのデータを保有させる。 When a join definition corresponding to the table to be partitioned has been set in advance, any join key column data that a back-end server 22, 23, or 24 did not receive through distribution (that is, data the server does not hold) is replicated from the other servers, so that every server holds all the data included in the join key column.

これにより、結合処理時に結合キーを各サーバから集める動作工程を省くことができ、通信量を軽減することができる。
以下、パーティショニング分散配置機能について、具体的に説明する。 As a result, the step of collecting join keys from each server during join processing can be omitted, and the amount of communication can be reduced.
Hereinafter, the partitioning distributed arrangement function will be specifically described.

ここでは、処理対象の表データとして、図２（ａ）〜（ｃ）に示すように、Ａ取扱商品テーブル（ａ）、売上テーブル（ｂ）、および修理売上サマリテーブル（ｃ）が設定されているものとする。 Here, as table data to be processed, as shown in FIGS. 2A to 2C, an A handling product table (a), a sales table (b), and a repair sales summary table (c) are set. It shall be.

ここで、Ａ取扱商品テーブルは、図２（ａ）に示すように、カラムデータ列としての「商品ＩＤ」、「商品名」、「カテゴリ」、「製造会社」、「定価」を有する表データである。
また、売上げテーブルは、図２（ｂ）に示すように、カラムデータ列としての「売上番号」、「年月日」、「商品ＩＤ」、「個数」、「売上金額」を有する表データである。
さらに、修理売上げサマリテーブルは、図２（ｃ）に示すように、カラムデータ列としての「年度」、「期」、「商品ＩＤ」「累計個数」「累計売上」を有する表データである。 Here, as shown in FIG. 2A, the A handling product table is table data having “product ID”, “product name”, “category”, “manufacturing company”, and “list price” as column data columns. It is.
As shown in FIG. 2B, the sales table is table data having “sales number”, “date”, “product ID”, “number”, and “sales amount” as column data columns. is there.
Further, as shown in FIG. 2C, the repair sales summary table is table data having “year”, “period”, “product ID”, “cumulative number”, and “cumulative sales” as column data strings.

また、スキーマ情報におけるパーティショニングルールとしては、以下に示す内容が予め定められているものとする。 In addition, as the partitioning rules in the schema information, it is assumed that the following contents are predetermined.

［パーティショニングルール］
Ａ取扱商品テーブル（ａ）、および売上テーブル（ｂ）については、
商品ＩＤ：１００１５以下はバックエンドサーバ２２へ
商品ＩＤ：１００１６以上はバックエンドサーバ２３へ
修理売上サマリテーブル（ｃ）については、
２００７年度以前はバックエンドサーバ２３へ
２００８年度以降はバックエンドサーバ２４へ
［パーティショニングルールおわり］

[Partitioning rules]
For the A handling product table (a) and the sales table (b):
Product ID 10015 or less: to back-end server 22
Product ID 10016 or more: to back-end server 23
For the repair sales summary table (c):
Fiscal year 2007 or earlier: to back-end server 23
Fiscal year 2008 or later: to back-end server 24
[End of partitioning rules]

すなわち、Ａ取扱商品テーブル、および売上げテーブルにおいては、商品ＩＤの値が１００１５以下である場合には、その行項目をカラム単位でバックエンドサーバ２２に配置する。また、商品ＩＤの値が１００１６以上である場合には、その行項目をカラム単位でバックエンドサーバ２３に配置（レンジパーティショニング）することを示す。 That is, in the A handling product table and the sales table, when the value of the product ID is 10015 or less, the line item is arranged in the back end server 22 in units of columns. Further, when the value of the product ID is 10016 or more, it indicates that the line item is arranged (range partitioning) in the back-end server 23 in units of columns.

また、修理売上サマリテーブルにおいては、年度の値が２００７以前である場合には、その行項目をカラム単位でバックエンドサーバ２３に配置する。また、年度の値が２００８以降である場合には、その行項目をカラム単位でバックエンドサーバ２４に配置（レンジパーティショニング）することを示す。 In the repair sales summary table, if the value of the year is before 2007, the line item is arranged in the back-end server 23 in units of columns. Further, if the value of the year is 2008 or later, it indicates that the line item is arranged (range partitioning) in the back-end server 24 in units of columns.
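The range partitioning described in the two paragraphs above can be sketched as a small routing function. The thresholds (product ID 10015 / 10016, fiscal year 2007 / 2008) and server numbers come from the text; the English table names, field names, and sample rows are hypothetical.

```python
from collections import defaultdict


def route_row(table, row):
    """Return the back-end server number that should hold `row`."""
    if table in ("products", "sales"):
        # Product ID 10015 or less -> server 22, 10016 or more -> server 23.
        return 22 if row["product_id"] <= 10015 else 23
    if table == "repair_summary":
        # Fiscal year 2007 or earlier -> server 23, 2008 or later -> server 24.
        return 23 if row["fiscal_year"] <= 2007 else 24
    raise ValueError(f"no partitioning rule for table {table!r}")


def partition(table, rows):
    """Group rows by destination server."""
    placed = defaultdict(list)
    for row in rows:
        placed[route_row(table, row)].append(row)
    return dict(placed)


product_rows = [{"product_id": pid} for pid in (10013, 10015, 10016, 10030)]
summary_rows = [{"fiscal_year": fy, "product_id": 10013} for fy in (2007, 2008)]
products_by_server = partition("products", product_rows)
summary_by_server = partition("repair_summary", summary_rows)
print(products_by_server)
print(summary_by_server)
```

Note that the repair sales summary table is routed by fiscal year rather than by product ID, which is why its join key values can end up on a server that does not hold the matching product rows.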

また、スキーマ情報における結合の定義として、以下に示す内容が予め定められているものとする。 Further, it is assumed that the following contents are predetermined as the definition of the combination in the schema information.

［結合の定義］
売上げテーブル（ｂ）は、
参照元テーブル：Ａ取扱商品テーブル
結合キー列：商品ＩＤ
修理売上サマリテーブル（ｃ）
参照元テーブル：Ａ取扱商品テーブル
結合キー列：商品ＩＤ

[Join definition]
Sales table (b):
Reference source table: A handling product table
Join key column: product ID
Repair sales summary table (c):
Reference source table: A handling product table
Join key column: product ID

すなわち、売上テーブルでは、参照元のテーブルがＡ取扱商品テーブルであり、結合キー列を商品ＩＤとする。また、修理売上サマリテーブルでは、参照元のテーブルがＡ取扱商品テーブルであり、結合キー列を商品ＩＤであるものとする。 That is, in the sales table, the reference source table is the A handling product table, and the combined key column is the product ID. In the repair sales summary table, it is assumed that the reference source table is the A handling product table and the combined key column is the product ID.

フロントサーバ２１のクエリ実行部２１１は、スキーマ情報における、上記パーティショニングルール、および結合の定義に基づき、図２（ａ〜ｃ）の各テーブルをパーティショニングすると共に、バックエンドサーバ２２，２３，２４に実データとして配置（ロード）する。 The query execution unit 211 of the front server 21 partitions the tables of FIGS. 2A to 2C based on the partitioning rules and join definitions in the schema information, and places (loads) them on the back-end servers 22, 23, and 24 as actual data.

ここで、クエリ実行部２１１によりパーティショニングされたＡ取扱商品テーブル、売上げテーブル、および修理売上げサマリテーブルのデータがバックエンドサーバ２２，２３，２４に実データとしてロードされた状態を図３に示す。 FIG. 3 shows the state in which the data of the A handling product table, the sales table, and the repair sales summary table partitioned by the query execution unit 211 has been loaded as actual data onto the back-end servers 22, 23, and 24.

ここで、Ａ取扱商品テーブル、売上げテーブル、および修理売上げサマリテーブルそれぞれのテーブルがバックエンドサーバ２２，２３，２４に対して配置された後、クエリ実行部２１１は、バックエンドサーバ２３と２４に対して、参照元の結合キー列であるＡ取扱商品テーブルの商品ＩＤの中で、配置されていない列データを、複製（レプリケーション）してレプリカ列としてロードする。 Here, after the A handling product table, the sales table, and the repair sales summary table are arranged for the back-end servers 22, 23, and 24, the query execution unit 211 sends the back-end servers 23 and 24 to the back-end servers 23 and 24. Thus, the column data that is not arranged in the product ID of the A handling product table that is the combined key column of the reference source is replicated and loaded as a replica column.

ここで、クエリ実行部２１１は、Ａ取扱商品テーブルと修理売上げサマリテーブルが商品ＩＤで結合されるという定義に基づき、パーティショニングルールの対象が商品ＩＤではないことから、結合キー列である商品ＩＤの列データを各バックエンドメモリサーバからレプリケーションして配置する。 Here, based on the definition that the A handling product table and the repair sales summary table are joined on the product ID, and because the partitioning rule does not target the product ID, the query execution unit 211 replicates the column data of the product ID, which is the join key column, from each back-end server and places it.

これにより、結合処理に必要な結合キー列（ここでは、商品ＩＤ列（カラム））に含まれる全データを各バックエンドサーバそれぞれに対して予め格納される。このため、バックエンドサーバにおける検索処理時に生じる、結合処理のためにバックエンドサーバ間で行われるデータ転送量を有効に軽減することができる。 Thereby, all the data included in the combination key string (here, the product ID string (column)) necessary for the combination process is stored in advance for each back-end server. For this reason, it is possible to effectively reduce the amount of data transferred between the back-end servers for the join process, which occurs during the search process in the back-end server.
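The complementary replication of the join key column can be pictured as a set difference: each target server receives whichever reference-side join key values it does not already hold. The following is a rough sketch with hypothetical function names and illustrative key values, not the publication's procedure.

```python
def replicate_join_keys(reference_keys_by_server, target_servers):
    """Return, per target server, the join key values that must be copied to it."""
    # Every join key value of the reference table, gathered across all servers.
    all_keys = set().union(*reference_keys_by_server.values())
    replicas = {}
    for server in target_servers:
        local = set(reference_keys_by_server.get(server, ()))
        replicas[server] = sorted(all_keys - local)  # only what is missing locally
    return replicas


# Hypothetical placement of the reference table's product IDs after partitioning.
reference_keys_by_server = {
    22: [10010, 10013, 10015],
    23: [10020, 10030],
}
replicas = replicate_join_keys(reference_keys_by_server, target_servers=[23, 24])
print(replicas)
```

Server 24 holds none of the reference table, so it receives the whole key column, while server 23 receives only the keys that were routed to server 22.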

一方、バックエンドメモリサーバ２２では、Ａ取扱商品テーブルと売上げテーブルの結合キー列である、商品ＩＤについて全てのデータを保有しているため、クエリ実行部２１１はレプリケーションを行わない。 On the other hand, since the back-end memory server 22 holds all data for the product ID, which is a combined key column of the A handling product table and the sales table, the query execution unit 211 does not perform replication.

これは、Ａ取扱商品テーブルと売上げテーブルのパーティショニングルールが結合キー列である商品ＩＤを対象としており、更には、その条件（すなわち、商品ＩＤの値が１００１５以下である場合はバックエンドサーバ２２へ、商品ＩＤの値が１００１６以上である場合はバックエンドサーバ２３へ）も同一であるため、他のサーバに結合キーを参照する必要がないためである。 This is because the partitioning rules of the A handling product table and the sales table both target the product ID, which is the join key column, and their conditions are also identical (that is, rows with a product ID of 10015 or less go to the back-end server 22, and rows with a product ID of 10016 or more go to the back-end server 23), so there is no need to consult other servers for join keys.

以上のように、本実施形態は、カラムストアデータベースという特性から、カラム方向でデータを格納しているため、水平分割している場合と比較した場合に、カラム内のデータ重複があった場合にデータを圧縮格納することが可能であり、これにより、データ格納時におけるメモリ使用量を有効に抑制することが可能となる。 As described above, because this embodiment is a column-store database and stores data in the column direction, duplicated data within a column can be stored in compressed form, unlike the horizontally partitioned case, which effectively reduces memory usage when storing data.

［変形例］
尚、上記実施形態における、スキーマ情報における「結合の定義」で、結合キー列に含まれる項目（行）に対応する、集計処理や検索条件の対象となる列データ（「対象列データ」という）を合わせて指定する設定であってもよい。 [Modification]
In the above embodiment, the column data (referred to as “target column data”) corresponding to the items (rows) included in the join key column and subject to aggregation processing and search conditions in the “join definition” in the schema information. May be set to be specified together.

これにより、Ａ取扱商品テーブル、売上げテーブル、および修理売上げサマリテーブルそれぞれのテーブルがバックエンドサーバ２２，２３，２４に対して配置された後、クエリ実行部２１１が、バックエンドサーバ２３、２４それぞれに配置されていない結合キー列のデータを、レプリカ列としてバックエンドサーバ２３と２４に対してロードする際に、結合キー列だけでなく、結合の定義で指定された対応列データも合わせてレプリケーションしてロードするものとする。 Thereby, after the tables of the A handling product table, the sales table, and the repair sales summary table are arranged for the back-end servers 22, 23, 24, the query execution unit 211 sets the back-end servers 23, 24 respectively. When data of join key columns that are not arranged is loaded to the back-end servers 23 and 24 as replica columns, not only the join key columns but also the corresponding column data specified in the join definition are replicated together. Shall be loaded.

これにより、図３では、レプリケーションされているのが結合キー（商品ＩＤ）だけであったのに対し、この変形例では、結合キー列に係り集計処理や検索条件の対象となるデータとして指定された対応列データである「価格」列も同時にレプリケーションされている。
これにより、結合時に必要な各バックエンドサーバから取得可能なデータを予めローカルサーバに保有しているため、バックエンドサーバ間での通信データ量をより軽減することが可能となり、検索処理をより迅速化することができる。 As a result, whereas only the join key (product ID) is replicated in FIG. 3, in this modification the "price" column, which is the corresponding column data designated as the target of aggregation processing and search conditions for the join key column, is replicated at the same time.
Because the data needed at join time that would otherwise be fetched from other back-end servers is held locally in advance, the amount of data communicated between back-end servers can be reduced further, and search processing can be made faster.

［実施形態の動作説明］
次に、上記の実施形態の全体的な動作について説明する。 [Description of Operation of Embodiment]
Next, the overall operation of the above embodiment will be described.

フロントサーバ（要求データ処理装置）２１が表データの分配先を示すスキーマ情報をバックエンドサーバ（データベースサーバ）２２，２３，２４それぞれに対して転送し（スキーマ情報複製転送工程）、各バックエンドサーバ２２，２３，２４に格納された結合キー列に格納されていない項目データがある場合に当該項目データを複製データとして前記バックエンドサーバに対して補完的に格納する（結合キーデータ補完格納工程）。
次いで、フロントサーバ（要求データ処理装置）２１から検索要求が送り込まれた場合に、各バックエンドサーバ２２，２３，２４は、自己バックエンドサーバ内に分散格納されたカラムデータ相互の結合を行うことにより、前記結合キー列から前記検索内容で必要とされる検索用列データを抽出し（検索用列データ抽出工程）、前記抽出された検索列データに対応する結合用データが分配された他のサーバを特定し、検索列データを送信し前記結合用データと結合することにより中間検索結果を生成する（検索結果結合取得工程）。 The front server (request data processing device) 21 transfers schema information indicating the distribution destination of the table data to each of the back-end servers (database servers) 22, 23, and 24 (schema information duplication transfer process), and each back-end server When there is item data that is not stored in the combination key string stored in 22, 23, 24, the item data is stored in a complementary manner in the back-end server as duplicate data (join key data supplement storage step). .
Next, when a search request is sent from the front server (request data processing device) 21, the back-end servers 22, 23, and 24 combine column data stored in a distributed manner in their own back-end servers. The search column data required for the search content is extracted from the combination key column (search column data extraction step), and the combination data corresponding to the extracted search column data is distributed. An intermediate search result is generated by specifying a server, transmitting search string data, and combining it with the data for combination (search result combination acquisition step).

ここで、上記スキーマ情報複製転送工程、結合キーデータ補完格納工程、検索用列データ抽出工程、および検索結果結合取得工程については、その実行内容をプログラム化し、分散メモリデータベース管理システム２の備えたコンピュータに実行させる構成としてもよい。 Here, the contents of the schema information duplication and transfer step, the join key data complementary storage step, the search column data extraction step, and the search result join acquisition step may be implemented as a program and executed by a computer provided in the distributed memory database management system 2.

次に、本実施形態の動作について詳説する。
ここでは、まず、クエリ実行部２１１によるバックエンドサーバ２２，２３，２４に対してカラムデータの分配（ロード）を行う動作について、図６のフローチャートに基づき説明する。 Next, the operation of this embodiment will be described in detail.
Here, first, the operation in which the query execution unit 211 distributes (loads) column data to the back-end servers 22, 23, and 24 will be described based on the flowchart of FIG. 6.

ここでは、データベースクライアント１からバックエンドサーバ２２，２３，２４に対するデータの分配（ロード）を要求するメッセージが入力されることにより、フロントサーバ２１におけるカラムデータの分配機能が実行されるものとする（ステップＳ７１：データロード開始）。 Here, it is assumed that the column data distribution function in the front server 21 is executed when a message requesting data distribution (loading) from the database client 1 to the back-end servers 22, 23, 24 is input ( Step S71: Start of data loading).

［第一段階］
ここで、フロントサーバ２１のクエリ実行部２１１が、入力された表データをスキーマ情報に基づき解析する（ステップＳ７２）。
ここでは、スキーマ情報管理部２１２が、予め管理しているスキーマ情報の表定義、およびパーティショニングルールを確認する（ステップＳ７３）。 [First stage]
Here, the query execution unit 211 of the front server 21 analyzes the input table data based on the schema information (step S72).
Here, the schema information management unit 212 confirms the table definition and partitioning rule of the schema information managed in advance (step S73).

ここで、パーティショニングルールに基づき分割されたカラム情報をパーティショニングルールで指定された各バックエンドサーバ２２，２３，２４に対して転送配置する（ステップＳ７４：レンジパーティショニング）。
尚、上記表データをどのように分割するか、また、分割されたカラム情報をどのバックエンドサーバに対して分配するかは、スキーマ情報のパーティショニングルールに予め定義されているものとする。 Here, the column information divided based on the partitioning rule is transferred and arranged to each of the back-end servers 22, 23, 24 specified by the partitioning rule (step S 74: range partitioning).
It is assumed that how to divide the table data and to which back-end server the divided column information is distributed are defined in advance in the schema information partitioning rule.

次いで、各バックエンドサーバ２２，２３，２４それぞれに設置されたローカルクエリ実行部２２１，２３１，２４１それぞれが分散配置されたカラムデータを取得すると共に、各バックエンドサーバ内におけるデータベース保存メモリ領域（「保存メモリ」という）２２３，２３３，２４３にそれぞれ格納する（ステップＳ７５）。 Next, the local query execution units 221, 231, and 241 installed in the back-end servers 22, 23, and 24 each acquire the distributed column data and store it in the database storage memory areas (referred to as "storage memory") 223, 233, and 243 in the respective back-end servers (step S75).

ここで、各ローカルクエリ実行部２２１，２３１，２４１は、送り込まれたカラムデータを実データとして、項目どうしの重複を排除し、且つ昇順ソートした形式（実データ列）で、保存メモリ内に保存するものとする。 Here, each of the local query execution units 221, 231, and 241 stores the received column data in the storage memory as actual data, with duplicate items removed and sorted in ascending order (an actual data column).

また、ローカルクエリ実行部２２１，２３１，２４１は、各実データ列の各項目（値）に対応したインデックスから成るインデクス列（インデックス形式）を生成し、各バックエンドサーバ内における上記保存メモリ内に格納する。 Further, the local query execution units 221, 231, and 241 generate an index column (index format) made up of indexes corresponding to the items (values) of each actual data column, and store it in the storage memory in each back-end server.

尚、各バックエンドサーバに格納されたスキーマ情報に含まれる結合の定義に基づき、結合対象となる異なる２つ以上の表の中に共通の結合キー列が存在する場合は、１つの実データ列に統合して、各保存メモリ内に配置されるものとする。
これにより、各バックエンドサーバ内でインデックス列を参照することにより、結合処理を行うことが可能となる。 Based on the join definitions included in the schema information stored in each back-end server, when two or more different tables to be joined share a common join key column, the column is consolidated into a single actual data column and placed in each storage memory.
Thereby, it is possible to perform the joining process by referring to the index string in each back-end server.

次いで、フロントサーバ２１のクエリ実行部２１１が、カラムデータの配置が完了したか否かの判定を行う（ステップＳ７６：送信完了を確認）。ここで、上記表データの含まれるカラムデータのうち配置されていないカラムデータがある場合には、パーティショニングルールに基づき分割されたカラム情報をパーティショニングルールで指定された各バックエンドサーバ２２，２３，２４に対して転送配置する（ステップＳ７４へ）。 Next, the query execution unit 211 of the front server 21 determines whether the placement of the column data is complete (step S76: confirm transmission completion). If any of the column data included in the table data has not yet been placed, the column information divided according to the partitioning rules is transferred to and placed on the back-end servers 22, 23, and 24 designated by the partitioning rules (return to step S74).

一方、上記表データの含まれるカラムデータのうち配置されていないカラムデータがない場合には（データ残なし）、以下に示す第二段階に移行する（第一段階終了）。 On the other hand, if there is no unplaced column data among the column data included in the table data (no data remaining), the process proceeds to the second stage described below (end of the first stage).

次に、フロントサーバ２１のクエリ実行部２１１がカラムデータを分配ロードする動作の第二段階について、図６のフローチャートに基づき詳説する。 Next, the second stage of the operation in which the query execution unit 211 of the front server 21 distributes and loads the column data will be described in detail with reference to the flowchart of FIG. 6.

［第二段階］
ここで、バックエンドサーバ２２，２３，２４に対する配置対象である表データのスキーマ情報に予め結合定義が設定されている場合に、クエリ実行部２１１は、結合キー列に含まれるデータのうち、各バックエンドサーバ２２，２３，２４に配置されていないデータがあるか否かを判定する。 [Second stage]
Here, when the join definition is set in advance in the schema information of the table data to be arranged for the back-end servers 22, 23, and 24, the query execution unit 211 selects each of the data included in the join key column. It is determined whether there is data not arranged in the back-end servers 22, 23, 24.

保有されていない結合キー列のデータがある（つまり、データ欠けがある）と判定された場合に、クエリ実行部２１１は、結合キー列の配置データ欠けがあるバックエンドサーバ２２，２３，または２４に対して、保有されていないデータの複製（レプリケーション）を生成し、転送する（ステップＳ７８：レプリケーションを展開通知）。
これにより、各バックエンドサーバ２２，２３，２４では、結合キー列の全ての行のデータが保有される。 When it is determined that some join key column data is not held (that is, data is missing), the query execution unit 211 generates a copy (replica) of the missing data and transfers it to the back-end server 22, 23, or 24 whose join key column is incomplete (step S78: replication deployment notification).
As a result, each of the back-end servers 22, 23, and 24 holds the data of every row of the join key column.

次いで、バックエンドサーバ２２のローカルクエリ実行部２２１は、他のバックエンドサーバ２３，２４に配置された表から参照される結合キー列があるか否かをスキーマ情報に基づき判断し、他のバックエンドサーバ２３，２４に配置された表から参照される結合キー列がある場合に、対象となる列データを転送する（ステップＳ７９）。 Next, the local query execution unit 221 of the back-end server 22 determines whether or not there is a join key column that is referenced from the tables arranged in the other back-end servers 23 and 24 based on the schema information. If there is a join key column referenced from the tables arranged in the end servers 23 and 24, the target column data is transferred (step S79).

次に、クエリ実行部２１１は、バックエンドサーバ２２，２３，２４における結合処理に必要な結合キー列（商品ＩＤ）に含まれる全データが各バックエンドサーバ２２，２３，２４それぞれに対して格納されたか否かを確認する（ステップＳ８０、８１）。
結合処理に必要な結合キー列（商品ＩＤ）に含まれる全データが各バックエンドサーバ２２，２３，２４それぞれに対して格納されたことが確認された場合に、クエリ実行部２１１は、バックエンドサーバ２２，２３，２４に対するパーティショニングされたデータの配置（ロード）の終了をデータベースクライアント１に通知する（ステップＳ８２） Next, the query execution unit 211 stores all data included in the combination key string (product ID) necessary for the combination processing in the back-end servers 22, 23, 24 for each of the back-end servers 22, 23, 24. It is confirmed whether or not it has been done (steps S80, 81).
When it is confirmed that all data included in the combination key string (product ID) necessary for the combination process is stored in each of the back-end servers 22, 23, and 24, the query execution unit 211 returns the back-end The database client 1 is notified of the end of placement (loading) of the partitioned data with respect to the servers 22, 23, and 24 (step S82).

データベースクライアント１は、ロード終了通知を取得し（ステップＳ８３）、分散メモリデータベース管理システム２は、この時点で検索処理要求の待機状態に設定される。 The database client 1 obtains the load end notification (step S83), and the distributed memory database management system 2 is set to a search processing request standby state at this point.

［バックエンドサーバ相互間における結合処理］
次に、異なるバックエンドサーバ間でカラムデータを通信することにより結合処理を行う動作について、図７のフローチャートに基づき説明する。
ここでは、バックエンドサーバＡをバックエンドサーバ２３、結合キーの参照元データを保有するバックエンドサーバＢがバックエンドサーバ２４であるものとして、説明する（図７）。 [Join processing between back-end servers]
Next, the operation of performing join processing by exchanging column data between different back-end servers will be described based on the flowchart of FIG. 7.
Here, it is assumed that the back-end server A is the back-end server 23 and the back-end server B that holds the reference data of the combined key is the back-end server 24 (FIG. 7).

まず、バックエンドサーバ２３のローカルクエリ実行部２３１が、スキーマ情報を解析する（ステップＳ９１）。
スキーマ情報管理部２３２が、結合キー列（ここでは、商品ＩＤ列であるものとする）に含まれる全てのデータを、バックエンドサーバ２３が保有していることを確認する（ステップＳ９２）。
次いで、バックエンドサーバ２３は、結合キー列のみを利用して結合処理を行う（ステップＳ９３：結合キーのみで結合処理を実施）。 First, the local query execution unit 231 of the back-end server 23 analyzes schema information (step S91).
The schema information management unit 232 confirms that the back-end server 23 has all the data included in the combined key string (here, it is assumed to be the product ID string) (step S92).
Next, the back-end server 23 performs a join process using only the join key string (step S93: execute the join process using only the join key).

スキーマ情報におけるパーティショニングルールを確認し、参照元のデータ（表）が配置されたサーバ（ここでは、バックエンドサーバ２４）を特定すると共に、バックエンドサーバ２４に結合列データ（列データ）と結合処理を要求するコマンド（結合コマンド）を送信する（ステップＳ９４）。 The partitioning rule in the schema information is confirmed, the server (here, the back-end server 24) where the reference source data (table) is located is specified, and the join column data (column data) is joined to the back-end server 24. A command requesting processing (join command) is transmitted (step S94).

次いで、バックエンドサーバ２４のローカルクエリ実行部２４１が送り込まれた列データと結合コマンドに基づき結合処理を行い（ステップＳ９５）、結合処理結果である結合データをバックエンドサーバ２３に対して返す（ステップＳ９６：結合データを返却） Next, the local query execution unit 241 of the back-end server 24 performs a join process based on the column data and the join command sent (Step S95), and returns join data as a join process result to the back-end server 23 (Step S95). S96: Return combined data)

次いで、バックエンドサーバ２３は、結合キー列に含まれる全てのデータについて結合データが揃ったか否かを確認し、重複なく結合データが揃ったことが確認された場合に処理を完了する（ステップＳ９７）。 Next, the back-end server 23 checks whether or not the combined data has been prepared for all the data included in the combined key string, and completes the process when it is confirmed that the combined data has been prepared without duplication (step S97). ).

一般的に、分散データベースシステムでは、結合のために必要であるデータをサーバ間で送信し合うセミジョイン法が利用されている。
しかしながら、通常のセミジョイン法では、結合に使う結合キー列を、例えば、サーバＡから取り出し、サーバＢに送信し、サーバＢで結合を行い、その結果をサーバＡに送り返し、サーバＡで結合を完成させる。 In general, distributed database systems use the semi-join method, in which the data needed for a join is exchanged between servers.
In the ordinary semi-join method, however, the join key column used for the join is, for example, extracted from server A and sent to server B, the join is performed at server B, the result is sent back to server A, and the join is completed at server A.

これに対して、本実施形態では、上記ステップＳ９４の時点で、結合キー列に含まれるデータのうち結合に必要なデータを特定し、この特定されたデータのみを上記サーバに対して送信して結合処理を行うことが可能となる。
また、対象の結合列が（スキーマ情報管理部で管理されている）パーティションルールに該当する場合は更なる結合処理に利用されるデータ（結合用データ）をさらに絞込む（限定する）ことができ、これにより、さらなる通信データ量の軽減が可能となる。 On the other hand, in the present embodiment, at the time of step S94, data necessary for combining is specified from the data included in the combined key string, and only the specified data is transmitted to the server. It is possible to perform the combining process.
In addition, when the target join column corresponds to a partition rule (managed by the schema information management unit), the data (join data) used for further join processing can be further narrowed down (limited). As a result, the amount of communication data can be further reduced.
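The difference between the ordinary semi-join and the approach described here comes down to which join keys server A sends to server B. The sketch below contrasts the two under simplifying assumptions: the function names and sample data are hypothetical, and network transfer is modeled as passing a list.

```python
def keys_sent_classic(local_join_keys):
    """Ordinary semi-join: server A ships every distinct join key to server B."""
    return sorted(set(local_join_keys))


def keys_sent_with_replica(local_join_keys, replicated_reference_keys):
    """With a local replica of the reference key column, A first joins on keys
    alone and ships only the keys that actually have a match."""
    return sorted(set(local_join_keys) & set(replicated_reference_keys))


def remote_join(sent_keys, reference_rows):
    """Server B returns the reference rows whose key was requested."""
    wanted = set(sent_keys)
    return [row for row in reference_rows if row["product_id"] in wanted]


# Hypothetical data: 10015 appears on server A but has no reference row.
summary_keys = [10013, 10015, 10030, 10013]
replica_keys = [10010, 10013, 10020, 10030]
reference_rows = [
    {"product_id": 10013, "list_price": 50000},
    {"product_id": 10030, "list_price": 20000},
]

classic = keys_sent_classic(summary_keys)
filtered = keys_sent_with_replica(summary_keys, replica_keys)
print(classic, filtered)
print(remote_join(filtered, reference_rows))
```

Both variants yield the same joined rows in this example; the filtered variant simply sends fewer keys, which is the communication saving the text describes.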

［検索処理］
次に、データベースクライアント１からデータ検索の要求（検索問い合わせ）があった場合の分散メモリデータベース管理システム２の動作（検索処理動作）について、具体的に説明する。 [Search processing]
Next, the operation (search processing operation) of the distributed memory database management system 2 when there is a data search request (search inquiry) from the database client 1 will be specifically described.

ここで、検索対象であるカラムデータは、各バックエンドサーバ２２，２３，２４内に図３に示すようにパーティショニングされているものとする。
このとき、以下に示すＳＱＬ文がデータベースクライアント１からクエリ実行部２１１に入力され、このＳＱＬ文に基づき分散メモリデータベース管理システム２における検索動作が行われる。 Here, it is assumed that the column data to be searched is partitioned in each back-end server 22, 23, 24 as shown in FIG.
At this time, the following SQL statement is input from the database client 1 to the query execution unit 211, and a search operation in the distributed memory database management system 2 is performed based on the SQL statement.

［ＳＱＬ文］
ＳＥＬＥＣＴ 商品ＩＤ，年度，期，累計個数＊定価
ＦＲＯＭ Ａ取扱商品，修理売上げサマリ
ＷＨＥＲＥ Ａ取扱商品．商品ＩＤ＝修理売上げサマリ．商品ＩＤ
ＡＮＤ 累計個数＊定価＞３０００００

[SQL statement]
SELECT product ID, fiscal year, period, cumulative quantity * list price
FROM A handling product, repair sales summary
WHERE A handling product.product ID = repair sales summary.product ID
AND cumulative quantity * list price > 300000
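To see what the SQL statement above computes, it can be run against a small single-node database. The sketch below uses Python's standard `sqlite3` module purely for illustration; it is not the distributed system described here. The English table and column names, the reduced column set, and all row values are assumptions. The product ID in the select list is qualified with a table alias because both tables carry a product ID column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER, product_name TEXT, list_price INTEGER);
CREATE TABLE repair_summary (fiscal_year INTEGER, period TEXT,
                             product_id INTEGER, cumulative_quantity INTEGER);
-- Hypothetical rows, chosen only to exercise the filter.
INSERT INTO products VALUES
  (10013, 'P-13', 50000), (10015, 'P-15', 1000), (10030, 'P-30', 20000);
INSERT INTO repair_summary VALUES
  (2008, 'H1', 10013, 10), (2008, 'H2', 10015, 5), (2009, 'H1', 10030, 20);
""")

rows = conn.execute("""
SELECT p.product_id, r.fiscal_year, r.period,
       r.cumulative_quantity * p.list_price
FROM products AS p, repair_summary AS r
WHERE p.product_id = r.product_id
  AND r.cumulative_quantity * p.list_price > 300000
ORDER BY p.product_id
""").fetchall()
conn.close()
print(rows)
```

The query is an equi-join on the product ID followed by a filter on a value computed from one column of each table, which is why the join key and the two operand columns are the data that matter for communication between servers.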

まず、クエリ実行部２１１が、スキーマ情報管理部２１２に対して、バックエンドサーバ２２，２３，２４の内のどのバックエンドサーバに、上記ＳＱＬ文で指定されたカラムデータが配置されているかを、問い合わせる。
ここで、スキーマ情報管理部２１２は、スキーマ情報に基づき、「修理売上げサマリ」と「Ａ取扱商品」がバックエンドサーバ２３と２４に格納されていることを特定し、これをクエリ実行部２１１に通知する。 First, the query execution unit 211 determines with respect to the schema information management unit 212 which of the back-end servers 22, 23, and 24 the column data specified by the SQL statement is arranged. Inquire.
Here, the schema information management unit 212 specifies that “repair sales summary” and “A handling product” are stored in the back-end servers 23 and 24 based on the schema information, and sends them to the query execution unit 211. Notice.

クエリ実行部２１１は、スキーマ情報管理部２１２からの通知に基づき、バックエンドサーバ２３，２４それぞれのローカルクエリ実行部２３１、２４１に対して、上記ＳＱＬに基づく検索用のコマンドを発行する。 The query execution unit 211 issues a search command based on the SQL to the local query execution units 231 and 241 of the back-end servers 23 and 24 based on the notification from the schema information management unit 212.

Each of the back-end servers 23 and 24 then performs the same search processing in response to the search command sent from the query execution unit 211.
The operation of the back-end server 24 is described here with reference to the flowchart of FIG. 8. The local query execution unit 241 is assumed to be able to refer to the data stored in the database storage memory area 243 via the schema information management unit 242 (FIG. 3).
As above, the description assumes that back-end server A is the back-end server 23 and that back-end server B, which holds the reference source data of the join key, is the back-end server 24 (FIG. 8).

The local query execution unit 241 performs join processing based on the join condition contained in the schema information.
Here, the local query execution unit 241 performs the join (step S103) based on the join condition that the product ID column among the index columns of the repair sales summary table is equal to the product ID column among the index columns of the A handling product table (replica column).
As a result, the local query execution unit 241 extracts the set of common values {4, 8}.

Next, based on the set {4, 8} extracted by the join on the index columns, the local query execution unit 241 obtains {10013, 10030}, the set of the corresponding fourth and eighth values of the product ID column, which is the actual data column.
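The two steps above (joining on index columns, then resolving the indexes to actual values) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the sample columns are invented except for the facts given in the description, namely that the join yields {4, 8} and that index values 4, 5, and 8 correspond to product IDs 10013, 10015, and 10030.

```python
# 1-based value dictionary for the product ID column. Positions 4, 5 and 8 follow the
# description (10013, 10015, 10030); every other entry is a made-up placeholder.
PRODUCT_ID_VALUES = [10001, 10004, 10009, 10013, 10015, 10021, 10027, 10030]

# Hypothetical index columns held locally on back-end server 24.
summary_product_idx = [4, 5, 8]              # repair sales summary, product ID index column
replica_product_idx = [1, 2, 3, 4, 6, 7, 8]  # A handling product, join-key replica


def join_on_index(left, right):
    """Equi-join two index columns and return the sorted common index values."""
    return sorted(set(left) & set(right))


def resolve(indexes, values):
    """Translate 1-based index values into the actual column values."""
    return [values[i - 1] for i in indexes]


common = join_on_index(summary_product_idx, replica_product_idx)
product_ids = resolve(common, PRODUCT_ID_VALUES)
print(common, product_ids)  # [4, 8] [10013, 10030]
```

Because the comparison runs on small integer indexes held locally, index 5 drops out before any actual value has to be looked up or sent to another server.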

Thus, in this embodiment, index number 5 (product ID = 10015) of the “repair sales summary table”, which would be transferred during join processing in the ordinary semi-join method, is not transferred; the join processing is performed by giving priority to the data held within each back-end server.

Here, the “list price” data corresponding to the product IDs {10013, 10030}, which is needed for the join, is not distributed (stored) locally in the back-end server 24 (step S105: the data needed for the join is insufficient), so the local query execution unit 241 acquires the corresponding “list price” data from other back-end servers. To do so, the local query execution unit 241 refers to the partitioning rule of the A handling product table via the schema information management unit 242.

Based on the partitioning rule for the product ID, the local query execution unit 241 determines that the data indicating the list price corresponding to “product ID: 10013” is stored in the back-end server 22 and that the data indicating the list price corresponding to “product ID: 10030” is stored in the back-end server 23.
The local query execution unit 241 then transfers, to each of the back-end servers 22 and 23, the corresponding column data together with a processing command requesting the “list price” data corresponding to that column data (step S106).

Next, in each of the back-end servers 22 and 23, the local query execution unit 221 or 231 searches, based on the received processing command, for the list price value corresponding to product ID 10013 or product ID 10030, joins the product ID (10013 or 10030) with the retrieved list price value (step S107), and returns the resulting joined data to the back-end server 24 (step S108).

Next, the local query execution unit 241 of the back-end server 24 acquires the joined data sent from each of the back-end servers 22 and 23, confirms that all the joined data needed to answer the search request from the front server 21 has arrived, and returns to the query execution unit 211 an intermediate search result, consisting of product IDs and list prices, obtained by combining the joined data sent from the back-end servers 22 and 23.
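Steps S107 and S108 and the subsequent combination on the back-end server 24 can be illustrated with the following Python sketch. The list price values and the names `REMOTE_LIST_PRICES`, `remote_join`, and `intermediate_result` are assumptions made for illustration; the specification gives no concrete prices at this point, and the network transfer is replaced by a plain function call.

```python
# Hypothetical list-price data held by each remote back-end server (values are invented).
REMOTE_LIST_PRICES = {
    22: {10013: 1200},
    23: {10030: 4500},
}


def remote_join(server, product_ids):
    """Steps S107/S108: pair each requested product ID with its list price on one server."""
    prices = REMOTE_LIST_PRICES[server]
    return [(pid, prices[pid]) for pid in product_ids if pid in prices]


def intermediate_result(requests):
    """Gather the joined rows returned by every requested server into one result."""
    rows = []
    for server in sorted(requests):
        rows.extend(remote_join(server, requests[server]))
    return sorted(rows)


print(intermediate_result({22: [10013], 23: [10030]}))
# [(10013, 1200), (10030, 4500)]
```

The result is a list of (product ID, list price) pairs, which matches the shape of the intermediate search result described above.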

The query execution unit 211 stores the intermediate processing results returned from the back-end servers 23 and 24 in the temporary storage memory area 213 and, once it has confirmed that all the processing results have arrived, merges the intermediate search results and returns the resulting table information to the database client 1 as the final search result (step S109).
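The buffer-then-merge behaviour of step S109 can be sketched in Python as below. The class `ResultCollector` and its methods are hypothetical stand-ins for the temporary storage memory area 213 and the merge step on the front server side, and the sample rows are invented.

```python
class ResultCollector:
    """Buffers intermediate results and merges them once every server has answered."""

    def __init__(self, expected_servers):
        self.expected = set(expected_servers)
        self.buffer = {}  # server number -> intermediate rows

    def store(self, server, rows):
        """Store one intermediate result; return the merged table once all have arrived."""
        self.buffer[server] = list(rows)
        if set(self.buffer) != self.expected:
            return None  # still waiting for other back-end servers
        merged = []
        for srv in sorted(self.buffer):
            merged.extend(self.buffer[srv])
        return merged


collector = ResultCollector([23, 24])
first = collector.store(24, [(10013, 1200), (10030, 4500)])  # invented rows
final = collector.store(23, [(10007, 800)])                   # invented rows
print(first, final)
```

Returning `None` until the last result arrives mirrors the condition that the merge happens only after all processing results are confirmed to be present.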

As described above, in this embodiment the replication data used for join processing is limited to the data of the join key column, which reduces the amount of memory used on each server to join data (tables) stored in different databases.

In particular, a memory database has a lower data capacity limit than a disk-based one. In the memory database of this embodiment, each server equipped with a database (back-end server) holds all records only for the join key column, which reduces the memory area needed to store data in memory.

Next, the operation (search processing operation) of the distributed memory database management system 2 is described specifically for the case (the modification described above) in which the join definition in the schema information designates column data that corresponds to the items (rows) contained in the join key column and that is included in the aggregation or search conditions specified in the search request (referred to as “corresponding column data”).

In this case, as described above, when replicating data to the back-end servers 22, 23, and 24, the query execution unit 211 replicates not only the join key column but also, together with it, the corresponding column data designated in the join definition.
As a result, the table data of FIGS. 2(a) to 2(c) is partitioned (distributed) as column data to the back-end servers 22, 23, and 24 as shown in FIG. 4.

The following describes the case in which the SQL statement shown below is input from the database client 1 to the query execution unit 211 and a search operation is performed in the distributed memory database management system 2 based on that SQL statement.

As above, the query execution unit 211 asks the schema information management unit 212 which of the back-end servers 22, 23, and 24 hold the column data specified in the SQL statement. Based on the schema information, the schema information management unit 212 identifies that the “repair sales summary” and “A handling product” are stored in the back-end servers 23 and 24 and notifies the query execution unit 211 accordingly.

Each of the back-end servers 23 and 24 then performs the same search processing in response to the search command sent from the query execution unit 211, so, as above, the operation of the back-end server 24 is described.

As above (FIG. 8), the local query execution unit 241 performs join processing based on the join condition contained in the schema information. Here, the local query execution unit 241 performs the join based on the join condition that the product ID column among the index columns of the repair sales summary table is equal to the product ID column among the index columns of the A handling product table (replica column), and extracts the set of common values {4, 8}.

After {4, 8} has been extracted from the index column of the join key product ID, the local query execution unit 241 checks the A handling product table (replica table) and can thereby also determine the “price” index stored in the same row.
Moreover, because the actual data is also stored in the back-end server 24 itself, there is no need to communicate with other back-end servers to acquire the “price” data.

As a result, the back-end server 24 can generate a search result and return it to the query execution unit 211 without communicating with another server (the back-end server 23) for joining.
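The purely local evaluation in the modification can be sketched in Python as follows. All figures (quantities, prices, years) and the names `summary_rows`, `replica`, and `local_search` are assumptions for illustration; only the index-to-product-ID pairs 4 to 10013 and 8 to 10030, the absence of a join partner for index 5, and the condition "cumulative number * price > 300000" follow the description.

```python
# Hypothetical local data on back-end server 24 in the modification. The replica of the
# A handling product table carries the price next to the join key, so the whole query
# can be answered without contacting another server.
summary_rows = [            # (product ID index, year, period, cumulative number)
    (4, 2010, 1, 300),
    (5, 2010, 1, 50),
    (8, 2010, 2, 100),
]
replica = {                 # product ID index -> (product ID, price)
    4: (10013, 1200),
    8: (10030, 2500),
}


def local_search(rows, replica_rows, threshold=300000):
    """Join on the index, then apply cumulative number * price > threshold locally."""
    result = []
    for idx, year, period, count in rows:
        if idx not in replica_rows:
            continue  # no join partner in the replica, e.g. index 5
        product_id, price = replica_rows[idx]
        amount = count * price
        if amount > threshold:
            result.append((product_id, year, period, amount))
    return result


print(local_search(summary_rows, replica))  # [(10013, 2010, 1, 360000)]
```

With the invented figures, only product ID 10013 passes the filter (300 * 1200 = 360000), while product ID 10030 is joined but filtered out (100 * 2500 = 250000).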

Likewise, the back-end server 23 can return its search result to the query execution unit 211. Once all the search results from the back-end servers have arrived, the query execution unit 211 merges them and returns them to the database client 1 as the final search result.

As described above, in the modification, replicating the columns needed for the join together with the join key further reduces the amount of communication between the databases and therefore further speeds up search processing.

The main points of the novel technical content of the embodiment described above are summarized below.
Part or all of the embodiment can be summarized as novel technology as follows, but the present invention is not necessarily limited to this.

(Appendix 1)
A distributed database management system comprising a request data processing device that distributes preset table data as column data to a plurality of different database servers and that, in response to a search request from an external terminal, acquires search results based on the search request from each of the database servers, wherein
the request data processing device includes:
a schema information replication transfer unit that transfers schema information indicating the distribution destinations of the column data to each of the database servers; and
a join key data complementary storage unit that, when there is item data not stored in the join key column stored in each of the database servers, complementarily stores that item data as replicated data in the back-end server, and
each of the database servers includes:
a data join extraction unit that, when a search request is sent from the request data processing device, joins the column data distributed to its own database server on the basis of the join key column, thereby extracting from the join key column the search column data required by the search request;
a joining server identification unit that identifies, on the basis of the schema information, another server to which the joining data corresponding to the extracted search column data has been distributed; and
a search result join generation unit that generates the search result by transmitting the search column data to the identified database server and joining it with the joining data.

(Appendix 2)
The distributed database management system according to Appendix 1, wherein
the join key data complementary storage unit has a target column storage function that identifies, as target column data, column data that is subject to the search processing specified from the content of the search request and that corresponds to the item data of the join key column, and stores the target column data as the replicated data in the corresponding back-end server.

(Appendix 3)
The distributed database management system according to Appendix 1, wherein
the request data processing device includes data arrangement management means that normalizes the item data contained in the column data distributed to each of the database servers so that duplicates are excluded.

(Appendix 4)
The distributed database management system according to Appendix 1, wherein
each of the database servers includes index column generation means that generates an index column consisting of reference index data corresponding to each item row of the column data distributed to its own database server, and
the data join extraction unit extracts the search column data required by the search request by performing a join based on the index column.

(Appendix 5)
A distributed database management method in which a request data processing device that distributes preset table data as column data to a plurality of different database servers acquires, in response to a search request from an external terminal, search results based on the search request from each of the database servers, wherein
the request data processing device transfers schema information indicating the distribution destinations of the column data to each of the database servers and, when there is item data not stored in the join key column stored in each of the database servers, complementarily stores that item data as replicated data in the back-end server, and
each of the database servers,
when a search request is sent from the request data processing device, performs a join with the data distributed to its own database server on the basis of the join key column, thereby extracting from the join key column the search column data required by the search request, identifies, on the basis of the schema information, another server to which the joining data corresponding to the extracted search column data has been distributed, and generates the search result by transmitting the search column data to the identified database server and joining it with the joining data.

The present invention can be applied effectively to systems that create data marts by extracting data from a large number of databases.

1 Database client
2 Distributed memory database management system
4 Internal network
21 Front memory database server (front server)
22, 23, 24 Back-end memory database servers (back-end servers)
211 Query execution unit
212, 222, 232, 242 Schema information management units
213 Temporary storage memory area
221, 231, 241 Local query execution units
223, 233, 243 Database storage memory areas

Claims

A distributed database management system comprising a request data processing device that distributes preset table data as column data to a plurality of different database servers and that, in response to a search request from an external terminal, acquires search results based on the search request from each of the database servers, wherein
the request data processing device includes:
a schema information replication transfer unit that transfers schema information indicating the distribution destinations of the column data to each of the database servers; and
a join key data complementary storage unit that, when there is item data not stored in the join key column stored in each of the database servers, complementarily stores that item data as replicated data in the back-end server, and
each of the database servers includes:
a data join extraction unit that, when a search request is sent from the request data processing device, joins the column data distributed to its own database server on the basis of the join key column, thereby extracting from the join key column the search column data required by the search request;
a joining server identification unit that identifies, on the basis of the schema information, another server to which the joining data corresponding to the extracted search column data has been distributed; and
a search result join generation unit that generates the search result by transmitting the search column data to the identified database server and joining it with the joining data.

The distributed database management system according to claim 1, wherein
the join key data complementary storage unit has a target column storage function that identifies, as target column data, column data that is subject to the search processing specified from the content of the search request and that corresponds to the item data of the join key column, and stores the target column data as the replicated data in the corresponding back-end server.

The distributed database management system according to claim 1, wherein
the request data processing device includes data arrangement management means that normalizes the item data contained in the column data distributed to each of the database servers so that duplicates are excluded.

The distributed database management system according to claim 1, wherein
each of the database servers includes index column generation means that generates an index column consisting of reference index data corresponding to each item row of the column data distributed to its own database server, and
the data join extraction unit extracts the search column data required by the search request by performing a join based on the index column.

A distributed database management method in which a request data processing device that distributes preset table data as column data to a plurality of different database servers acquires, in response to a search request from an external terminal, search results based on the search request from each of the database servers, wherein
the request data processing device transfers schema information indicating the distribution destinations of the column data to each of the database servers and, when there is item data not stored in the join key column stored in each of the database servers, complementarily stores that item data as replicated data in the back-end server, and
each of the database servers,
when a search request is sent from the request data processing device, performs a join with the data distributed to its own database server on the basis of the join key column, thereby extracting from the join key column the search column data required by the search request, identifies, on the basis of the schema information, another server to which the joining data corresponding to the extracted search column data has been distributed, and generates the search result by transmitting the search column data to the identified database server and joining it with the joining data.