JP7495269B2

JP7495269B2 - Data management system and method

Info

Publication number: JP7495269B2
Application number: JP2020075064A
Authority: JP
Inventors: 晃 ▲高▼木; 元伸齊藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2020-04-21
Filing date: 2020-04-21
Publication date: 2024-06-04
Anticipated expiration: 2040-04-21
Also published as: JP2021174079A; WO2021215101A1

Description

The present invention relates to a data management system and a data management method.

Recent developments in database technology have led to the use of various database models (key-value, graph, etc.), chosen according to the characteristics of the data, in addition to the traditional relational model. At the same time, because it is difficult to operate several database models as separate databases, multi-model databases that handle them in an integrated manner have been developed. For example, Patent Document 1 creates, in an in-memory key-value store, a copy of a data group stored in an RDB (Relational Database) and answers external queries from the in-memory key-value store wherever possible, thereby shortening query processing time.

US2017147664A1

However, it is difficult to migrate a system that operates on a group of existing databases to a new multi-model database. To achieve good processing performance, data must be relocated to a database of the optimal database model, taking into account query content such as data characteristics and data operations. Patent Document 1 does not mention relocating data to a database of a particular database model in view of such query content. For example, it does not describe analyzing the queries issued against the tables held by a database, or the schema of the database, and then placing tables with many traversal operations (such as list tables and hierarchical tables) in a graph model, placing data that is searched only by a primary key such as an ID in a key-value model, and placing tables with many multi-key searches, complex searches, or join operations in a relational model. Thus, Patent Document 1 does not disclose determining the database in which data should be placed according to differences in query content, such as data characteristics and data operations, so as to shorten the processing time of queries on a table compared with conventional approaches.

One aspect of the present invention aims to provide a data management system and a data management method that can determine the database in which data should be placed according to differences in query content obtained from analysis results, thereby shortening the processing time of queries on the data compared with conventional approaches.

A data management system according to one aspect of the present invention is configured as a data management system comprising: an analysis unit that analyzes a query program that queries a table stored in a first database of an existing database unit, or the schema of the first database; a placement processing unit that, based on the analysis result of the query program or of the schema of the first database, places the table stored in the first database in a second database other than the first database; and a placement reflection unit that reflects the placement in the existing database unit when the processing time of the query program for the table placed in the second database is shorter than the processing time of the query program for the table stored in the first database.

According to one aspect of the present invention, the database in which data should be placed is determined according to differences in query content obtained from analysis results, and the processing time of queries on the data can be shortened compared with conventional approaches.

FIG. 1 is a block diagram showing an example of the functional configuration of the data management device according to the present embodiment.
FIG. 2 is a schematic diagram of a computer of the data management device.
FIG. 3 is a diagram showing an example of the execution record table.
FIG. 4 is a diagram showing the trial judgment table.
FIG. 5 is a diagram showing the processing time judgment table.
FIG. 6 is a diagram showing an example of the placement correspondence table.
FIG. 7 is a diagram showing an example of the first type database.
FIG. 8 is a diagram showing an example of the second type database.
FIG. 9 is a diagram showing an example of the third type database.
FIG. 10 is a diagram showing an example in which the query format conversion unit converts a query program in SQL format into a query program in non-SQL format.
FIG. 11 is a flowchart showing the procedure of the process that accumulates execution records of query programs in the present system (execution record accumulation process).
FIG. 12 is a flowchart showing the procedure of the process that relocates tables stored in the database unit (relocation process).

Embodiments of the present invention will be described below with reference to the drawings. The following description and drawings are examples for explaining the present invention, and parts have been omitted or simplified as appropriate for clarity of explanation. The present invention can also be implemented in various other forms. Unless otherwise specified, each component may be singular or plural.

To facilitate understanding of the invention, the position, size, shape, range, etc. of each component shown in the drawings may not represent the actual position, size, shape, range, etc. Therefore, the present invention is not necessarily limited to the position, size, shape, range, etc. disclosed in the drawings.

In the following description, various types of information may be described using expressions such as "table" and "list", but the information may be expressed in other data structures. To indicate independence from the data structure, "XX table", "XX list", etc. may be referred to as "XX information". When identification information is described, expressions such as "identification information", "identifier", "name", "ID", and "number" are interchangeable.

When there are multiple components with the same or similar functions, they may be described using the same reference numeral with different subscripts. When there is no need to distinguish between these components, the subscripts may be omitted.

The following description may also describe processing performed by executing a program. Because a program is executed by a processor (e.g., a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit)) to perform predetermined processing while using storage resources (e.g., memory) and/or interface devices (e.g., communication ports) as appropriate, the subject of the processing may be regarded as the processor. Similarly, the subject of processing performed by executing a program may be a controller, device, system, computer, or node having a processor. The subject of such processing need only be a computation unit, and may include a dedicated circuit that performs specific processing (e.g., an FPGA (Field-Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit)).

A program may be installed on a device such as a computer from a program source. The program source may be, for example, a program distribution server or a computer-readable storage medium. When the program source is a program distribution server, the server may include a processor and a storage resource that stores the program to be distributed, and the processor of the server may distribute that program to other computers. In the following description, two or more programs may be implemented as one program, and one program may be implemented as two or more programs.

The following describes in detail a case in which the repair assistance system and repair assistance method according to this embodiment are applied to an ATM; however, the embodiment is not limited to this example and can be applied to a variety of devices and equipment.

FIG. 1 is a block diagram showing an example of the functional configuration of the data management device 10 in this embodiment. As shown in FIG. 1, the data management device 10 has the following units. A query input unit 101 accepts input of a query input program (first query program) that queries a table held by a database in a query format (first query format) such as SQL (Structured Query Language). A data placement management unit 102 manages the execution records of the query input program and the databases and tables used by that program. A query format conversion unit 105 converts the query input program into a program (second query program) in a different query format (second query format), such as non-SQL (Non Structured Query Language), queries the table held by the database using the converted query program, and outputs the result. A query result conversion unit 106 converts the result output by the query format conversion unit 105 from the different query format back into the original query format. A query result output unit 111 outputs the converted result as the query result for the query input program.

The data placement management unit 102 has: an execution record table 103 that stores the execution records of the query input programs accepted by the query input unit 101; a trial judgment table 1031 for determining which of the tables stored in the data management device 10 are to be subject to trial relocation; a processing time judgment table 1032 that aggregates the processing times in the execution record table 103 for each table; a trial database unit 118 that holds, for each table determined by the trial judgment table 1031, a database of the model to which the table has been converted from the database of the model that originally holds it; and a placement correspondence table 104 that stores the correspondence between each table and the database of the model that holds it. The data management device 10 also has a database unit 107 that stores a first type database 108 consisting of relational model databases, a second type database 109 consisting of key-value model databases, and a third type database 110 consisting of graph model databases.

The data management device 10 further has the following units. A query analysis unit 112 analyzes the query format and query content of the query input programs managed by the data placement management unit 102. A data schema analysis unit 113 analyzes the structure of the database of the model that holds the tables used by those query input programs. A data placement trial determination unit 114 determines whether to attempt relocation of a table that has been selected as a trial relocation target. A data placement trial unit 115 attempts relocation of each table for which the data placement trial determination unit 114 has determined that relocation should be attempted. A query trial unit 116 attempts queries against the table whose relocation was attempted by the data placement trial unit 115. A processing time determination unit 117 reflects the table as it was before the relocation attempt in the database of the model that holds the trially relocated table, when the processing time of the query made by the query trial unit 116 against the relocated table is shorter than the average processing time of queries made against the table before the relocation attempt.

The data management device 10 also has a trial database unit 118 that stores the databases relocated by the data placement trial unit 115. In FIG. 1, as an example, a second type database 119 consisting of key-value model databases and a third type database 120 consisting of graph model databases are stored as the databases of the models that hold the tables whose relocation has been attempted.

The data management device 10 can be realized, for example, by a general computer 200 as shown in FIG. 2 (schematic diagram of a computer), equipped with a CPU 201, a memory 202, an external storage device 203 such as an HDD (Hard Disk Drive), a reading/writing device 207 that reads and writes information from and to a portable storage medium 208 such as a CD (Compact Disk) or DVD (Digital Versatile Disk), an input device 206 such as a keyboard or mouse, an output device 205 such as a display, a communication device 204 such as an NIC (Network Interface Card) for connecting to a communication network, and an internal communication line 209 such as a system bus (hereinafter, the system bus) that connects these components.

For example, the databases and tables stored in the data management device 10 can be realized by the CPU 201 reading them from the memory 202 or the external storage device 203 and using them. The query input unit 101, data placement management unit 102, query format conversion unit 105, query result conversion unit 106, query result output unit 111, query analysis unit 112, data schema analysis unit 113, data placement trial determination unit 114, data placement trial unit 115, query trial unit 116, and processing time determination unit 117 of the data management device 10 can be realized by the CPU 201 loading a predetermined program stored in the external storage device 203 into the memory 202 and executing it. The CPU 201 may operate the input device 206 to realize the input function of the query input unit 101, and may operate the output device 205 to realize the output function of the query result output unit 111. The data management device 10 may also have a communication unit whose communication function is realized by the CPU 201 operating the communication device 204.

The predetermined program described above may be stored (downloaded) in the external storage device 203 from the storage medium 208 via the reading/writing device 207, or from a network via the communication device 204, and then loaded into the memory 202 and executed by the CPU 201. Alternatively, it may be loaded directly into the memory 202 from the storage medium 208 via the reading/writing device 207, or from a network via the communication device 204, and executed by the CPU 201.

In the following, each unit of the data management device 10 is assumed to be provided on a general computer as hardware. However, all or some of the units may be distributed across one or more computers, such as in a cloud, and may realize the same functions by communicating with each other. The operation of each unit of the data management device 10 and examples of the data it holds are described using flowcharts.

FIG. 3 is a diagram showing an example of the execution record table 103. As shown in FIG. 3, the execution record table 103 stores, in association with each other, a query ID that identifies a query program and the query itself, that is, the specific content of the query identified by that query ID. In FIG. 3, for example, the queries with query IDs "1" to "6" indicate that queries were made to "Table A" using "Part ID" as the primary key. As described later, for a table on which searches by a primary key such as an ID like "Part ID" occur at least a certain number of times, placement in a key-value model database is attempted when predetermined conditions are satisfied.

In FIG. 3, the query with query ID "7" begins with "WITH RECURSIVE", and the part following "UNION ALL" shows that a recursive operation was performed on "Table B". As described later, for a table holding list data, hierarchical data, or the like, on which traversal operations such as the recursive operation "WITH RECURSIVE" occur at least a certain number of times, placement in a graph model is attempted when predetermined conditions are satisfied.

Furthermore, the query with query ID "8" indicates, as shown by "JOIN", that a join operation between "Table B" and "Table C" was performed. As described later, for a table subject to searches on multiple keys, complex searches combining at least a certain number of conditions, or at least a certain number of join operations, placement in a relational model is attempted.
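The three kinds of query described for FIG. 3 (primary-key search, recursive traversal, join) could be told apart with a very small text-based classifier. The following is only an illustrative sketch: the function name `classify_access`, the label strings, and the regular-expression heuristics are assumptions made here and are not taken from the patent, which does not specify how the analysis is implemented.

```python
import re


def classify_access(sql: str, primary_key: str) -> str:
    """Label one SQL query as 'recursive', 'join', 'primary_key', or 'other'.

    Hypothetical heuristic: the patent does not define the analysis rules.
    """
    text = " ".join(sql.upper().split())  # collapse whitespace, ignore case
    if text.startswith("WITH RECURSIVE"):
        return "recursive"  # traversal-style query (graph model candidate)
    if re.search(r"\bJOIN\b", text):
        return "join"  # join operation (relational model candidate)
    # A WHERE clause that is a single equality test on the primary key.
    pattern = r"\bWHERE\s+" + re.escape(primary_key.upper()) + r"\s*=\s*\S+\s*;?$"
    if re.search(pattern, text):
        return "primary_key"  # key-value model candidate
    return "other"
```

A real implementation would more likely inspect a parsed query tree than raw text; the string matching here only keeps the sketch short.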

In FIG. 3, the processing time of the query is also stored for each query program identified by a query ID. For example, the query with query ID "1" took 0.51 seconds to process. The data placement management unit 102 receives the processing time included in the query result from the query format conversion unit 105 and writes it into the execution record table 103.

FIG. 4 is a diagram showing the trial judgment table 1031. As shown in FIG. 4, the trial judgment table 1031 stores, in association with each other, the table name of a table obtained by analyzing the database, the access methods used on that table, and a threshold for determining whether to attempt relocation of the table. FIG. 4 shows, for example, that as a result of the data schema analysis unit 113 analyzing the database, 60% of the queries to Table A were "primary key search", 25% were "recursive search", and 15% were "other". It also shows that relocation of Table A is attempted when the share of any one access method exceeds "55%".

In FIG. 4, the access method "primary key search" accounts for 60% of accesses to Table A and exceeds the threshold, so relocation of Table A to a key-value model database (second type database 119) is attempted. Likewise, the access method "recursive search" accounts for 75% of accesses to Table B and exceeds the threshold, so relocation of Table B to a graph model database (third type database 120) is attempted.
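The threshold check described for FIG. 4 can be sketched as a small function. This is a minimal illustration under stated assumptions: the name `choose_trial_target`, the dictionary layout, and the label strings are hypothetical, and the mapping from access method to target model simply follows the two cases given in the text (primary key search to key-value, recursive search to graph).

```python
from typing import Dict, Optional

# Assumed mapping, following the two cases described for FIG. 4.
MODEL_FOR_ACCESS = {"primary_key": "key_value", "recursive": "graph"}


def choose_trial_target(ratios: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the model to try, or None when no access method exceeds the threshold."""
    method, share = max(ratios.items(), key=lambda item: item[1])
    if share > threshold:
        # "other" has no mapped model, so .get() yields None for it.
        return MODEL_FOR_ACCESS.get(method)
    return None
```

With the Table A figures from FIG. 4, `choose_trial_target({"primary_key": 0.60, "recursive": 0.25, "other": 0.15}, 0.55)` selects the key-value model, because 60% exceeds the 55% threshold.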

FIG. 5 is a diagram showing the processing time judgment table 1032. As shown in FIG. 5, the processing time judgment table 1032 stores, in association with each other, the table name of a table obtained by analyzing the database and the average processing time, which is the mean of the processing times for that table. FIG. 5 shows, for example, that the average processing time for Table A was 0.55 seconds. The query analysis unit 112 aggregates the processing times in the execution record table 103 for each table, calculates their mean, and writes it as the average processing time.
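The per-table aggregation that fills the processing time judgment table can be sketched as follows. The function name `average_processing_time` and the record shape (a list of table-name and seconds pairs) are assumptions for illustration; the patent only states that processing times are aggregated per table and averaged.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def average_processing_time(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group (table_name, seconds) execution records by table and average them."""
    by_table = defaultdict(list)
    for table, seconds in records:
        by_table[table].append(seconds)
    return {table: mean(times) for table, times in by_table.items()}
```

The input values used to exercise this function would come from the execution record table; any sample timings other than those quoted from FIG. 3 and FIG. 5 are made up.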

FIG. 6 is a diagram showing an example of the placement correspondence table 104. FIG. 6(a) shows the placement correspondence table 104a before table relocation, and FIG. 6(b) shows the placement correspondence table 104b after table relocation. As shown in these figures, the placement correspondence table 104 stores, in association with each other, the table name of a table stored in the database unit 107, the database name of the database in which that table is stored, and the model of that database. FIG. 6(a) shows, for example, that the table named "Table C" is stored in the relational model database named "Database X". FIG. 6(b) shows, for example, that the table named "Table C" has been relocated to, and is now stored in, the key-value model database named "Database Y". The processing time determination unit 117 compares the average processing time calculated by the query analysis unit 112 with the processing time measured when the query trial unit 116 made its query and, if the latter is shorter than the former, reflects the table relocation and updates the database name in the placement correspondence table 104.
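The comparison and update performed by the processing time determination unit 117 can be sketched as a single function over an in-memory placement table. The function name `reflect_if_faster` and the dictionary layout are hypothetical; the table and database names follow FIG. 6.

```python
from typing import Dict


def reflect_if_faster(
    placement: Dict[str, Dict[str, str]],
    table: str,
    trial_database: str,
    trial_model: str,
    average_seconds: float,
    trial_seconds: float,
) -> bool:
    """Update the placement entry only when the trial query beat the recorded average."""
    if trial_seconds < average_seconds:
        placement[table] = {"database": trial_database, "model": trial_model}
        return True
    return False  # keep the original placement
```

For example, starting from `{"Table C": {"database": "Database X", "model": "relational"}}`, a trial on "Database Y" with the key-value model replaces the entry only if its measured time is strictly below the average; equal or slower trials leave the table where it is.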

FIG. 7 is a diagram showing an example of the first type database. As shown in FIG. 7, the first type database 701 stores a relational database with "part ID" as the primary key. FIG. 8 is a diagram showing an example of the second type database. As shown in FIG. 8, the second type database 801 stores a tree-structured key-value database whose root is department ID "1", department name "Company A". FIG. 9 is a diagram showing an example of the third type database. As shown in FIG. 9, the third type database 901 stores a graph database whose vertices are nodes identified by "employee ID".

FIG. 10 is a diagram showing an example in which the query format conversion unit 105 converts a query input program in SQL format into a query input program in non-SQL format. FIG. 10(a) shows an example of the query obtained when an SQL-format query input program is converted for the key-value model, and FIG. 10(b) shows an example of the query obtained when it is converted for the graph model. The query format conversion unit 105 queries the databases in the database unit 107 using the query input program converted from SQL format into non-SQL format. The query result conversion unit 106 converts the result of that query into SQL format, and the query result output unit 111 outputs it as the query result for the query input program.
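The round trip described here (SQL in, non-SQL query executed, SQL-style result out) can be illustrated for the key-value case. The queries actually shown in FIG. 10 are not reproduced in this text, so everything below is an assumption: the supported query shape, the nested-dictionary store, and the helper names `sql_to_key_value`, `run_key_value`, and `to_sql_rows` are hypothetical stand-ins for units 105 and 106.

```python
import re
from typing import Dict, List, Tuple

# Only this one query shape is handled: SELECT * FROM <table> WHERE <column> = <value>
_SHAPE = re.compile(
    r"\s*SELECT\s+\*\s+FROM\s+(\w+)\s+WHERE\s+(\w+)\s*=\s*'?(\w+)'?\s*;?\s*",
    flags=re.IGNORECASE,
)


def sql_to_key_value(sql: str) -> Tuple[str, str]:
    """Convert a single-key SQL lookup into a (table, key) pair for a key-value store."""
    match = _SHAPE.fullmatch(sql)
    if match is None:
        raise ValueError("unsupported query shape")
    table, _column, value = match.groups()
    return table, value


def run_key_value(store: Dict[str, Dict[str, dict]], table: str, key: str) -> List[dict]:
    """Execute the converted lookup; return zero or one record."""
    record = store.get(table, {}).get(key)
    return [] if record is None else [record]


def to_sql_rows(records: List[dict], columns: List[str]) -> List[tuple]:
    """Convert key-value records back into SQL-style row tuples."""
    return [tuple(record[column] for column in columns) for record in records]
```

Keeping conversion, execution, and result conversion as separate functions mirrors the split between the query format conversion unit and the query result conversion unit, so the caller sees row tuples regardless of where the table currently lives.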

FIG. 11 is a flowchart showing the procedure of the process that accumulates execution records of the query program 101 in the present system (execution record accumulation process). In the execution record accumulation process, the query input unit 101 first accepts input of the query program 101 (S1101). The data placement management unit 102 analyzes the query program and registers the analyzed query program in the execution record table 103 shown in FIG. 3 (S1102).

The query format conversion unit 105 converts the SQL-format query input program into a non-SQL-format query input program, executes the query, and outputs the query result including the processing time of the query (S1103). The query result conversion unit 106 converts the query result from non-SQL format into SQL format and outputs it to the query result output unit 111. The query format conversion unit 105 also outputs the query result including the processing time to the data placement management unit 102, which writes the processing time into the execution record table 103 (S1104).
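Steps S1102 to S1104 amount to registering the query, executing it while measuring elapsed time, and writing that time back to the record. A minimal sketch follows, assuming a list of dictionaries as the execution record table and a caller-supplied `execute` callable standing in for the converted query; both are illustrative choices, not details from the patent.

```python
import time
from typing import Any, Callable, Dict, List


def run_and_record(
    query_id: int,
    sql: str,
    execute: Callable[[str], Any],
    execution_records: List[Dict[str, Any]],
) -> Any:
    """Register a query, run it, and store its measured processing time."""
    record = {"query_id": query_id, "query": sql, "seconds": None}
    execution_records.append(record)  # S1102: register the query
    start = time.perf_counter()
    result = execute(sql)  # S1103: execute the (converted) query
    record["seconds"] = time.perf_counter() - start  # S1104: write the processing time
    return result
```

`time.perf_counter()` is used because it is a monotonic clock intended for measuring short durations.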

FIG. 12 is a flowchart showing the procedure of the process that relocates tables stored in the database unit (relocation process). The relocation process is executed periodically or at irregular intervals, when at least a certain number of programs have been registered in the execution record table 103.

First, the query analysis unit 112 analyzes the query format and query content of the query input programs, aggregates the processing times in the execution record table 103 for each table, calculates their mean, and writes it as the average processing time (S1201). Next, the data schema analysis unit 113 analyzes the structure of, and the access methods used on, the database that holds the tables used by the query input programs, calculates the share of each access method, and writes those shares into the access method fields of the trial judgment table 1031 (S1202). The data placement trial determination unit 114 refers to the threshold in the trial judgment table 1031 and determines whether to attempt relocation of each table. The data placement trial unit 115 reads each table for which the data placement trial determination unit 114 has determined that relocation should be attempted (for example, a table for which the share of an access method exceeds the threshold) and relocates it from the database of the model that holds it to a database of a different model (S1203).

The query trial unit 116 queries the table rearranged by the data placement trial unit 115 (S1204). For example, the query trial unit 116 reads the non-SQL query input program that the query format conversion unit 105 converted from SQL format, and issues the query.
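As a rough illustration of the trial query in S1204, the following Python sketch copies a table into a key-value form and times the same lookup in both forms. It is an assumption-laden stand-in: a list of dicts plays the role of the relational table, a plain `dict` plays the role of the key-value database, and all function names are hypothetical. The actual conversion performed by the query format conversion unit 105 is not modeled.

```python
import time


def relocate_to_key_value(rows, key):
    """Copy a list-of-rows table into a dict keyed by its primary key."""
    return {row[key]: row for row in rows}


def timed(query, *args):
    """Run a query callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = query(*args)
    return result, time.perf_counter() - start


def scan_lookup(rows, key, value):
    """Original form: linear scan, standing in for SELECT ... WHERE key = value."""
    return next((r for r in rows if r[key] == value), None)


def kv_lookup(store, value):
    """Converted form: direct get against the key-value copy."""
    return store.get(value)
```

Calling `timed(kv_lookup, store, 999)` returns both the row and the elapsed time, and the elapsed time is the quantity that the subsequent processing time judgment would compare against the earlier average.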

If the processing time of the query made by the query trial unit 116 is shorter than the average processing time of queries to the table before the trial rearrangement, the processing time determination unit 117 reflects the pre-rearrangement table in the database of the model that holds the trial-rearranged table (S1205).
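The judgment in S1205 amounts to a single comparison. The Python sketch below assumes that the current placement is tracked in a plain dict mapping table names to model names; that dict and the function name `reflect_if_faster` are illustrative assumptions, not structures defined in the specification.

```python
def reflect_if_faster(table, trial_model, trial_ms, baseline_avg_ms, placement):
    """S1205 sketch: adopt the trial placement only when it is strictly faster.

    `placement` is a hypothetical dict of table name -> model name.
    Returns True when the placement was updated.
    """
    if trial_ms < baseline_avg_ms:
        placement[table] = trial_model
        return True
    return False
```

A trial time equal to the earlier average leaves the placement unchanged, which matches the "shorter than" condition in the text.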

As described above, the present system includes: an analysis unit (e.g., the query analysis unit 112 and the data schema analysis unit 113) that analyzes a query program that queries a table stored in a first database (e.g., the first type database 108) of an existing database unit (e.g., the database unit 107), or the schema of the first database; a placement processing unit (e.g., the data placement trial unit 115) that places the table stored in the first database in a second database (e.g., the trial database unit 118) other than the first database, based on the analysis result of the query program or of the schema of the first database; and a placement reflection unit (e.g., the processing time determination unit 117) that reflects the placement in the existing database unit when the processing time of the query program for the table placed in the second database is shorter than the processing time of the query program for the table stored in the first database.
Consequently, in the rearrangement process, for example, the queries (query contents) issued by programs such as user applications and the data schema (data characteristics) stored in the existing database unit are analyzed, and the data placement trial unit can rearrange tables so that the processing time becomes shorter. The database in which data should be placed is determined according to the differences in query contents obtained from the analysis, so the processing time for queries on the data can be shortened compared with the conventional approach.

The placement processing unit places the table in the second database of a graph model when queries to a table stored in the first database include at least a certain number of traversal operations; places the table in the second database of a key-value model when the queries include at least a certain number of primary-key searches; and places the table in the second database of a relational model when the queries include at least one of a search on multiple keys, a complex search combining at least a certain number of conditions, or at least a certain number of join operations. The database in which data should be placed can therefore be determined according to these differences in query contents, and the processing time for queries on the data can be shortened compared with the conventional approach.
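The three placement rules can be sketched as a small decision function in Python. The field names, the single shared threshold `min_count`, and the order in which the rules are checked (graph, then relational, then key-value) are all assumptions made for illustration; the specification states the three conditions but not how to resolve a table that satisfies more than one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryProfile:
    """Hypothetical per-table counts derived from query analysis."""
    traversals: int = 0            # traversal operations
    primary_key_lookups: int = 0   # searches by primary key
    multi_key_searches: int = 0    # searches on multiple keys
    conditions: int = 0            # conditions combined in a complex search
    joins: int = 0                 # join operations


def choose_model(p: QueryProfile, min_count: int = 3) -> Optional[str]:
    """Return the target model name, or None to keep the current placement."""
    if p.traversals >= min_count:
        return "graph"
    # Assumed precedence: any relational-style usage outranks pure key lookups.
    if (p.multi_key_searches > 0
            or p.conditions >= min_count
            or p.joins >= min_count):
        return "relational"
    if p.primary_key_lookups >= min_count:
        return "key_value"
    return None
```

Checking the relational rule before the key-value rule reflects the reading that a key-value placement suits data searched only by primary key; a different precedence would be equally consistent with the text.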

10 Data management device
101 Query input unit
102 Data placement management unit
103 Execution record table
1031 Trial judgment table
1032 Processing time judgment table
104 Placement correspondence table
105 Query format conversion unit
106 Query result conversion unit
107 Database unit
111 Query result output unit
112 Query analysis unit
113 Data schema analysis unit
114 Data placement trial judgment unit
115 Data placement trial unit
116 Query trial unit
117 Processing time determination unit
118 Trial database unit

Claims

1. A data management system comprising:
an analysis unit that analyzes a query program that queries a table stored in a first database included in an existing database unit, or a schema of the first database;
a placement processing unit that places the table stored in the first database in a second database other than the first database, based on an analysis result of the query program or of the schema of the first database; and
a placement reflecting unit that reflects the placement in the existing database unit when a processing time of the query program for the table placed in the second database is shorter than a processing time of the query program for the table stored in the first database,
wherein the placement processing unit places the table in the second database of a graph model when a query to the table stored in the first database includes a certain number or more of traversal operations.

2. The data management system according to claim 1, wherein the placement processing unit places the table in the second database of a key-value model when a query to the table stored in the first database includes a certain number or more of searches using a primary key.

3. The data management system according to claim 1, wherein the placement processing unit places the table in the second database of a relational model when a query to the table stored in the first database includes at least one of a search using a plurality of keys, a complex search combining a certain number or more of conditions, and a certain number or more of join operations.

4. A data management method performed by a computer, comprising:
analyzing, by an analysis unit, a query program that queries a table stored in a first database included in an existing database unit, or a schema of the first database;
placing, by a placement processing unit, the table stored in the first database in a second database other than the first database, based on an analysis result of the query program or of the schema of the first database; and
reflecting, by a placement reflecting unit, the placement in the existing database unit when a processing time of the query program for the table placed in the second database is shorter than a processing time of the query program for the table stored in the first database,
wherein the placement processing unit places the table in the second database of a graph model when a query to the table stored in the first database includes a certain number or more of traversal operations.

5. The data management method according to claim 4, wherein the placement processing unit places the table in the second database of a key-value model when a query to the table stored in the first database includes a certain number or more of searches using a primary key.

6. The data management method according to claim 4, wherein the placement processing unit places the table in the second database of a relational model when a query to the table stored in the first database includes at least one of a search using a plurality of keys, a complex search combining a certain number or more of conditions, and a certain number or more of join operations.