JP5800720B2

JP5800720B2 - Information processing apparatus, information processing method, and program

Info

Publication number: JP5800720B2
Application number: JP2012011738A
Authority: JP
Inventors: 秀哉柴田; 孝之田村
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2012-01-24
Filing date: 2012-01-24
Publication date: 2015-10-28
Anticipated expiration: 2032-01-24
Also published as: JP2013152512A

Description

The present invention relates to a technique for efficiently searching a database.

In a database management system, the execution time of a query statement issued for data reference (corresponding to an SQL SELECT statement) or data update (corresponding to an SQL UPDATE statement) generally depends heavily on the evaluation order of the multiple search conditions (hereinafter also called "conditions" or "conditional expressions") described as the selection process of the query statement (corresponding to the SQL WHERE clause).
In addition, when a described conditional expression can be replaced with another equivalent conditional expression, the query execution time can be shortened by selecting, from the equivalent candidates, the one with the lowest execution cost.
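The effect of evaluation order can be illustrated with a minimal Python sketch. The two predicates, their names, and the data are hypothetical and not taken from the patent; the sketch only counts how often an expensive condition is evaluated when it is placed before or after a cheap one in a logical AND.

```python
# Illustrative only: count evaluations of a hypothetical expensive predicate
# for two evaluation orders of the same AND-ed pair of conditions.
calls = {"expensive": 0}

def cheap(row):
    # Stand-in for a low-cost condition.
    return row % 10 == 0

def expensive(row):
    # Stand-in for a high-cost condition; each call is counted.
    calls["expensive"] += 1
    return row % 3 == 0

rows = range(1000)

calls["expensive"] = 0
result_expensive_first = [r for r in rows if expensive(r) and cheap(r)]
expensive_first_calls = calls["expensive"]

calls["expensive"] = 0
result_cheap_first = [r for r in rows if cheap(r) and expensive(r)]
cheap_first_calls = calls["expensive"]
```

Both orders select the same rows, but with the cheap condition first the expensive predicate runs only on the rows that survive the cheap filter.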

Here, the execution cost is the amount of resources consumed by the database search process, for example the number of disk accesses or the load on the CPU (Central Processing Unit).
As described above, a lower execution cost generally means a shorter query execution time.

Many techniques for this kind of query execution plan optimization have been studied.
For example, Patent Document 1 discloses a technique for the case where a database management system performs references using external functions. For a condition described by one or more external function references, the technique considers all rewrites that yield the same result and optimizes the execution plan based on an evaluation of execution cost.

Patent Document 1: JP-A-7-141236

On the other hand, when the execution cost of an operation or function that forms a conditional expression is sufficiently high compared with that of other operations or functions, increasing the degree of parallelism in the execution of that condition and processing multiple records simultaneously is effective for speeding up query execution.
However, parallelization carries overhead, so for processing with a low execution cost it is often faster not to parallelize.
It is therefore required that different degrees of parallelism can be used for high-cost processing and low-cost processing.

However, within the scope of conventional execution plan optimization techniques, and depending on the database management system used, it has been difficult to incorporate a parallelization mechanism that processes multiple records simultaneously.

As an example, consider the case where the open source PostgreSQL 9.0 (hereinafter simply "PostgreSQL") is used as the database management system.
If the method of Patent Document 1 is simply applied, the WHERE clause, which corresponds to the selection process of the SQL statement used for the query, is rewritten into an optimal condition.
However, PostgreSQL evaluates the WHERE clause sequentially, record by record. Unless PostgreSQL itself is modified, a parallelization mechanism that processes multiple records simultaneously cannot simply be incorporated.
Consequently, different degrees of parallelism cannot be used for high-cost processing and low-cost processing either.

The main object of the present invention is to solve the problems described above: to make it possible to use different degrees of parallelism in the extraction process according to whether the execution cost of a search condition is high or low, and thereby to shorten the query execution time.

An information processing apparatus according to the present invention includes:
a query statement input unit that inputs a query statement requesting a database server device to extract, from a search target table, records that match a combination of a plurality of search conditions;
a search condition classification unit that determines, for each search condition of the input query statement input by the query statement input unit, whether the execution cost at search execution time is equal to or greater than a threshold, classifies search conditions whose execution cost is less than the threshold into a first search condition category, and classifies search conditions whose execution cost is equal to or greater than the threshold into a second search condition category; and
a query statement conversion unit that converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process, which extracts from the search target table the records that match the combination of search conditions classified into the first search condition category, and a second extraction process, which extracts, from the records extracted by the first extraction process, the records that match the combination of search conditions classified into the second search condition category.

According to the present invention, the search conditions included in a query statement are classified into the first search condition category and the second search condition category based on their execution cost at search execution time, and the extraction process for the first search condition category is kept separate from the extraction process for the second search condition category. The degree of parallelism can therefore be chosen according to whether the execution cost is high or low, and the query execution time can be shortened.

FIG. 1 is a diagram showing an example of the overall configuration of the database management system according to Embodiment 1.
FIG. 2 is a diagram showing a configuration example of the data storage device according to Embodiment 1.
FIG. 3 is a diagram showing a configuration example of the query conversion device according to Embodiment 1.
FIG. 4 is a flowchart showing an operation example of the query conversion device according to Embodiment 1.
FIG. 5 is a diagram showing an example of query statement regeneration according to Embodiment 1.
FIG. 6 is a diagram showing an example of query statement regeneration according to Embodiment 1.
FIG. 7 is a diagram showing an example of query statement regeneration according to Embodiment 1.
FIG. 8 is a diagram showing an operation example of the selection process execution function according to Embodiment 1.
FIG. 9 is a diagram showing an example of the overall configuration of the database management system according to Embodiment 2.
FIG. 10 is a diagram showing a configuration example of the data storage device according to Embodiment 3.
FIG. 11 is a diagram showing a configuration example of the query conversion device according to Embodiment 3.
FIG. 12 is a flowchart showing an operation example of the query conversion device according to Embodiment 3.
FIG. 13 is a diagram showing a table definition example of the encrypted database according to Embodiment 3.
FIG. 14 is a diagram showing an example of query statement regeneration according to Embodiment 3.
FIG. 15 is a diagram explaining an operation example of the database server device according to Embodiment 1.
FIG. 16 is a diagram explaining parallel processing by the slave devices according to Embodiment 1.
FIG. 17 is a diagram showing a hardware configuration example of the query conversion device according to Embodiments 1 to 3.

Embodiment 1.
Embodiments 1 to 3 describe a query conversion device that achieves both execution plan optimization and parallelization that processes multiple records simultaneously.
The query conversion device in Embodiments 1 to 3 takes a query statement as input and produces a query statement as output.
By embedding, at the level of the query statement, the processing instructions needed to combine execution plan optimization with parallelization, the query execution time can be shortened without modifying the existing database management system.
Specifically, among the conditions described as the selection process of the query statement, the low-cost conditions are evaluated first to make the narrowing-down efficient, and only the high-cost search conditions are parallelized on a per-record basis, which shortens the query execution time.

The following description uses an example in which the query conversion device is applied to a database management system.
Embodiments 1 and 2 describe the general configuration and operation of the query conversion device without fixing a specific application that uses the database.
Application to a specific application is described in Embodiment 3.

FIG. 1 shows a configuration example of a database management system to which the query conversion device of this embodiment is applied.
The query issuing device 200 (corresponding to a database client) is connected to the query conversion device 100 through the network 500.
The query conversion device 100 is connected to the database server device 300, and the database server device 300 is connected to the data storage device 400.
A plurality of slave devices 600 are also connected to the database server device 300.
A slave device 600 may be a computer external to the database server device 300 or, when the database server device 300 supports multiple processes, one of the processes executed on the database server device 300.
In the example of FIG. 1, three computers external to the database server device 300 serve as the slave devices 600.
The slave devices 600 can refer to the various data in the data storage device 400 under the management of the database server device 300.

The query issuing device 200 generates a query statement in response to a user request and issues it to the database server device 300.
Examples of the database language used to describe the query statement include standard SQL and the SQL dialects extended independently by individual database management systems.

The query conversion device 100 receives the query statement issued by the query issuing device 200. When the query statement includes a selection process (corresponding to the SQL WHERE clause), it converts the query statement into the target form and issues it to the database server device 300.
When the query statement does not include a selection process, it issues the query statement to the database server device 300 without converting it.
The query conversion device 100 corresponds to an example of an information processing apparatus.

The database server device 300 analyzes the query statement received from the query conversion device 100 and executes the query (a search of the data in the data storage device 400).
When executing the query, it refers to or modifies the various data stored in the data storage device 400 as necessary.

FIG. 2 shows the details of the data storage device 400.
The data storage device 400 stores various data including at least the data 410, the catalog information 420, and the operation cost information 430.
The data 410 is the data itself in the database management system, excluding management information.
The catalog information 420 stores data including the database catalog information.
The operation cost information 430 stores data including the execution cost information of the operations and functions defined in the database management system.
By referring to the operation cost information 430, the query conversion device 100 and the database server device 300 can estimate the execution cost of each operation described in a query.

Next, the detailed configuration of the query conversion device 100 is described with reference to FIG. 3.
The query conversion device 100 consists of a query statement input unit 110, a query analysis unit 120, a subquery statement generation unit 130, a parallel processing instruction unit 140, and a query statement regeneration unit 150.

The query statement input unit 110 receives the query statement 101 issued by the query issuing device 200 via the network 500 and inputs it to the query conversion device 100.
The query statement 101 is a message requesting extraction of the records that match a combination of a plurality of search conditions from the table to be searched (the search target table) in the data 410 of the data storage device 400.

The query analysis unit 120 analyzes the input query statement 101 and obtains the information needed for query conversion.
The query analysis unit 120 refers to management information such as the catalog information 420 and the operation cost information 430 as necessary.
When the query statement 101 does not include a selection process, the query analysis unit 120 outputs the query statement 101 unconverted as the converted query statement 102.

When the query statement 101 includes a selection process, the subquery statement generation unit 130 uses the analysis result of the query analysis unit 120 to generate a subquery statement (an example of the first query statement) formed only from the low-cost processing, that is, the selection process with the high-cost processing removed.
The subquery statement generation unit 130 refers to management information such as the catalog information 420 and the operation cost information 430 as necessary.

More specifically, for each search condition of the query statement 101, the subquery statement generation unit 130 determines whether the execution cost at search execution time is equal to or greater than a threshold. It classifies search conditions whose execution cost is less than the threshold as low-cost search conditions (an example of the first search condition category) and search conditions whose execution cost is equal to or greater than the threshold as high-cost search conditions (an example of the second search condition category).
In the following, low-cost search conditions are called low-cost processing and high-cost search conditions are called high-cost processing.
The subquery statement generation unit 130 also converts the query statement 101 to generate a subquery statement that requests execution of the first extraction process, which extracts from the search target table the records that match the combination of search conditions classified as low-cost processing.
The subquery statement generation unit 130 corresponds to an example of a search condition classification unit.
Together with the query statement regeneration unit 150 described later, the subquery statement generation unit 130 also corresponds to an example of a query statement conversion unit.
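The threshold-based classification described above can be sketched as follows. This is a minimal illustration, not the implementation of the subquery statement generation unit 130; the class name `SearchCondition`, the function `classify`, the example conditions, and the cost values are all assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchCondition:
    text: str   # condition as it appears in the WHERE clause (hypothetical examples)
    cost: int   # estimated execution cost (hypothetical values)

def classify(conditions, threshold):
    """Split conditions into (low-cost, high-cost) by comparing cost with the threshold."""
    low_cost = [c for c in conditions if c.cost < threshold]     # first category
    high_cost = [c for c in conditions if c.cost >= threshold]   # second category
    return low_cost, high_cost

conditions = [
    SearchCondition("price < 1000", 100),
    SearchCondition("category = 'book'", 100),
    SearchCondition("image_similarity(photo, ref) > 0.9", 900),
]
low_cost, high_cost = classify(conditions, threshold=500)
```

A condition whose cost equals the threshold lands in the high-cost category, matching the "equal to or greater than" wording of the text.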

The parallel processing instruction unit 140 outputs, for each of the zero or more high-cost processes removed by the subquery statement generation unit 130, an instruction to parallelize that processing.
More specifically, the parallel processing instruction unit 140 instructs the database server device 300 to divide the records extracted by the first extraction process into a plurality of blocks, each consisting of one or more records, and to execute the second extraction process (described later) on those blocks in parallel.
The parallel processing instruction unit 140 corresponds to an example of a parallel processing control unit.
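The idea of dividing the extracted records into blocks and filtering the blocks in parallel can be sketched as below. The sketch uses a thread pool from the Python standard library purely as a stand-in for the slave devices 600; the function names, the block-splitting rule, and the default of three workers are assumptions for illustration and not the mechanism of the database server device 300.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_blocks(records, num_blocks):
    """Divide records into at most num_blocks contiguous blocks of one or more records."""
    size = max(1, -(-len(records) // num_blocks))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]

def filter_block(block, predicate):
    """Apply the (high-cost) predicate to every record of one block."""
    return [r for r in block if predicate(r)]

def parallel_second_extraction(records, predicate, workers=3):
    """Run the second extraction process on the blocks in parallel, preserving order."""
    blocks = split_into_blocks(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda b: filter_block(b, predicate), blocks))
    return [r for part in parts for r in part]

selected = parallel_second_extraction(list(range(20)), lambda r: r % 4 == 0)
```

For CPU-bound predicates in Python, separate processes would be the more realistic stand-in; threads are used here only to keep the sketch short.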

The query statement regeneration unit 150 regenerates a query statement (an example of the second query statement) built around a function called the selection process execution function. This function takes as input the subquery statement generated by the subquery statement generation unit 130 and the zero or more high-cost processes removed by the subquery statement generation unit 130, and outputs a set of data sufficient to execute the subsequent processing, meaning the processing that takes the result of the selection process in the query statement 101 as its input. In the regenerated query statement, the output of this function is the input to that subsequent processing.
In other words, the query statement regeneration unit 150 converts the query statement 101 to generate a query statement that contains the subquery statement generated by the subquery statement generation unit 130 and that requests execution of the second extraction process, which extracts, from the records extracted by executing the subquery statement (the first extraction process), the records that match the combination of search conditions classified as high-cost processing.
The query statement regeneration unit 150 then outputs the regenerated query statement as the converted query statement 102.
The specific method of generating the converted query statement 102 is explained in the description of the operation of the query conversion device 100.
Together with the subquery statement generation unit 130 described above, the query statement regeneration unit 150 corresponds to an example of a query statement conversion unit.

Next, the operation of the query conversion device 100 is described with reference to FIG. 4.

When the query statement 101 arrives at the query conversion device 100, the query statement input unit 110 receives it (S110).
The query analysis unit 120 analyzes the query statement 101 (S120).
If the analysis shows that the query statement 101 includes a selection process (corresponding to the SQL WHERE clause) (YES in S130), the subquery statement generation unit 130 generates a subquery statement formed only from the low-cost processing, that is, the selection process with the high-cost processing removed (S140).
If the query statement 101 does not include a selection process (NO in S130), the query analysis unit 120 outputs the query statement 101 unconverted as the converted query statement 102 (S180).

If the system has been configured in advance to parallelize the zero or more high-cost processes removed in S140 (YES in S150), the parallel processing instruction unit 140 outputs to the database server device 300, for each of those high-cost processes, an instruction to parallelize the processing (S160).
If parallelization of high-cost processing has not been configured, step S160 is skipped and processing proceeds to step S170.

The query statement regeneration unit 150 takes as input the subquery statement generated in step S140, the zero or more high-cost processes, and the parallelization instruction output in step S160, regenerates the query statement (S170), and outputs the regenerated query statement as the converted query statement 102 (S180).

The query statement analysis applied in step S120 can be realized with known techniques or obvious extensions of them.
For example, the open source database management system PostgreSQL publishes its entire source code, including the SQL analysis processing.

In step S130, in the case of SQL, the query statements that include a selection process comprise at least SELECT statements and UPDATE statements that contain a WHERE clause.

The operation of step S140 is described with a specific example. Consider the following SQL statement (Equation 1).

Here, op_i (i = 1, 2, ..., N) is an arbitrary binary operator.
What matters is that the N conditional expressions are joined by logical AND.
Furthermore, an execution cost is assumed to be set for each op_i.
The execution cost is stored in the operation cost information 430 and is referred to by the query analysis unit 120 or the subquery statement generation unit 130.
For simplicity, the execution cost of the operation op_i is assumed here to be (100 * i). That is, the larger i is, the higher the execution cost of op_i and the longer it takes to execute.
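Equation 1 itself is not reproduced in this text. Based only on the surrounding description (N conditions of the form a_i op_i b_i joined by AND, a projection onto column c, and a cost of 100 * i for op_i), a statement of that shape might be assembled as in the following sketch. The table name `T`, the operand names, and the helper functions are assumptions, and the generated string is an illustration rather than the patent's Equation 1.

```python
def make_conditions(n):
    """Build N placeholder conditions 'a_i op_i b_i', each paired with cost 100 * i."""
    return [(f"a_{i} op_{i} b_{i}", 100 * i) for i in range(1, n + 1)]

def render_select(column, table, conditions):
    """Join the condition texts with AND into a SELECT statement."""
    where = " AND ".join(text for text, _cost in conditions)
    return f"SELECT {column} FROM {table} WHERE {where}"

conditions = make_conditions(3)
statement = render_select("c", "T", conditions)
costs = [cost for _text, cost in conditions]
```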

Assume that a threshold θ has been set in advance by a system administrator or the like.
The subquery statement generation unit 130 uses this threshold θ as the criterion for separating high-cost processing from low-cost processing.
If the operations whose execution cost is less than the threshold θ are op_i (i = 1, 2, ..., K), the subquery statement generation unit 130 generates the subquery statement SubQry, formed only from the low-cost processing, as follows (Equation 2).

Because the purpose of generating the subquery statement is to make the narrowing-down of the selection process more efficient, the conditions that form the subquery statement must be separable by logical AND in the original query statement 101.
In addition, so that the high-cost processing removed when the subquery statement was generated and the projection onto column c can be executed later, the columns used by the high-cost processing and column c must all be included in the columns selected by the subquery statement.
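The column requirement above can be made concrete with a small sketch that assembles a subquery string from the low-cost conditions while selecting both the projected column and every column the removed high-cost conditions still need. The function name `build_subquery`, the table name `T`, and the example conditions are assumptions; Equation 2 is not reproduced in this text, so the output is only an illustration of the described shape.

```python
def build_subquery(table, projected_columns, low_cost_conditions, high_cost_columns):
    """Build a SELECT that keeps only low-cost conditions in WHERE but still
    returns the columns needed later for projection and high-cost evaluation."""
    columns = []
    for col in list(projected_columns) + list(high_cost_columns):
        if col not in columns:          # avoid selecting the same column twice
            columns.append(col)
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if low_cost_conditions:
        sql += " WHERE " + " AND ".join(low_cost_conditions)
    return sql

sub_qry = build_subquery(
    table="T",
    projected_columns=["c"],
    low_cost_conditions=["a_1 = b_1", "a_2 < b_2"],
    high_cost_columns=["a_3", "b_3"],
)
```

If every condition is high-cost, the sketch omits the WHERE clause and the subquery returns all rows of the table.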

The example of Equation 1 is the most basic case, but various applications can be derived from it.
For example, a general logical expression can be written as subexpressions joined by logical AND, as typified by conjunctive normal form, so the following general form can be assumed (Equation 3).

Here, cond_i (i = 1, 2, ..., N) is an arbitrary logical expression.
In Equation 3, each cond_i is not necessarily formed by a single operator. By defining the execution cost of each logical expression as the highest execution cost among the operators that form cond_i, the same subquery statement generation method as for Equation 1 can be applied.
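The rule that a logical expression costs as much as its most expensive operator can be sketched as a lookup plus a maximum. The cost table below is a hypothetical stand-in for the operation cost information 430; the operator names and values are assumptions.

```python
# Hypothetical per-operator costs standing in for the operation cost information 430.
OPERATOR_COSTS = {"=": 100, "<": 100, "LIKE": 300, "similar_to": 900}

def condition_cost(operators, costs=OPERATOR_COSTS):
    """Cost of one logical expression: the highest cost among its operators."""
    return max(costs[op] for op in operators)

# A cond_i such as (x = 1 OR name LIKE 'a%') uses the operators '=' and 'LIKE'.
mixed_cost = condition_cost(["=", "LIKE"])
single_cost = condition_cost(["<"])
```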

As a special case of Equation 3, the logical expression that forms the high-cost processing and the logical expression that forms the low-cost processing may be joined by a single logical AND (Equation 4).

In Equation 4, higher_cost_part and lower_cost_part are arbitrary logical expressions, and lower_cost_part corresponds to the processing with the lower execution cost. The subquery statement SubQry corresponding to Equation 4 is as shown in Equation 5.

Here, higher_select_list denotes the columns needed to execute the high-cost processing higher_cost_part.

Equations 1, 3, and 5 are all examples of SELECT statements, but UPDATE statements can be handled in the same way.
An example of an UPDATE statement corresponding to Equation 1 is given here (Equation 6).

The subquery statement SubQry corresponding to Equation 6 is as shown in Equation 7.

Here, PRIMARYKEY denotes the primary key of the table.
The reason for selecting the primary key in the subquery statement in the case of an UPDATE statement is explained later.
Note that because the purpose of generating the subquery statement is to make the narrowing-down of the selection process more efficient, the subquery statement is a SELECT statement even when the original query statement is an UPDATE statement.
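The text defers its own explanation of why the primary key is selected. One plausible reading, offered here only as an assumption, is that the primary key lets the rows that survive the high-cost filtering be identified again when the update is applied. The sketch below illustrates that reading with an in-memory SQLite database; the table, its columns, the predicate `expensive_match`, and the two-step flow are all hypothetical and do not reproduce Equations 6 or 7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT, c REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 5, "xx", 10.0), (2, 50, "xy", 20.0), (3, 60, "zz", 30.0)],
)

def expensive_match(b):
    # Stand-in for a high-cost condition evaluated outside the WHERE clause.
    return b.startswith("x")

# The subquery is a SELECT even though the original statement is an UPDATE:
# it applies the low-cost condition and returns the primary key plus the
# column needed by the high-cost condition.
rows = conn.execute("SELECT id, b FROM t WHERE a > 10").fetchall()

# Apply the high-cost condition, keeping only the primary keys that pass.
keys = [pk for pk, b in rows if expensive_match(b)]

# Subsequent processing: update column c for exactly those rows.
conn.executemany("UPDATE t SET c = c * 1.1 WHERE id = ?", [(k,) for k in keys])
updated = conn.execute("SELECT id, c FROM t ORDER BY id").fetchall()
```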

Next, the operation of step S170 is described with specific examples.
In step S170, the query statement regeneration unit 150 regenerates a query statement built around the selection process execution function. This function takes as input the subquery statement generated in step S140 and the logical expressions forming the zero or more high-cost processes removed in step S140, and outputs a set of data sufficient to execute the subsequent processing, meaning the processing that takes the result of the selection process in the query statement 101 as its input. In the regenerated query statement, the output of this function is the input to that subsequent processing.

Specific examples of "subsequent processing that takes the result of the selection process as its input" are as follows. For the SELECT statement of Equation 1, the subsequent processing is the projection onto column c.
For the UPDATE statement of Equation 6, the subsequent processing is the update of column c expressed by "c = c * 1.1".

図５にＳＥＬＥＣＴ文再生成の具体例を示す。
図５の例は、数１を用いた例であり、コスト情報の設定等も数１の例と同様である。
変換後の問合せ文において、ＦＲＯＭ句に表れるＦｕｎｃが上述の選択処理実行関数に相当する。
関数Ｆｕｎｃの引数は、処理Ｓ１４０で生成した副問合せ文ＳｕｂＱｒｙ、および、処理Ｓ１４０で除かれた高コスト処理群である。
関数Ｆｕｎｃの戻り値は、後の射影処理で必要となるカラムｃを含むような、選択結果のデータ集合である。 FIG. 5 shows a specific example of SELECT statement regeneration.
The example of FIG. 5 is an example using Equation 1, and setting of cost information and the like are the same as those of Equation 1.
In the converted query statement, Func appearing in the FROM clause corresponds to the above-described selection process execution function.
The arguments of the function Func are the sub-query sentence SubQry generated in the process S140 and the high cost process group removed in the process S140.
The return value of the function Func is a data set of selection results including the column c necessary for the subsequent projection processing.
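As a rough illustration of the regeneration step for a SELECT statement, the following sketch assembles a converted query string in which `Func` appears in the FROM clause. FIG. 5 is not reproduced here, so the way the arguments are passed (as quoted SQL strings) and the helper name `regenerate_select` are assumptions, not the notation of the figure.

```python
def quote(text):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def regenerate_select(projection, subquery, high_cost_conditions, func_name="Func"):
    """Build a query whose FROM clause calls the selection process execution function.

    The function receives the subquery and the removed high-cost conditions;
    the projection is applied to the data set that the function returns.
    """
    args = [quote(subquery)] + [quote(cond) for cond in high_cost_conditions]
    return f"SELECT {projection} FROM {func_name}({', '.join(args)})"


converted = regenerate_select(
    projection="c",
    subquery="SELECT c, a_3 FROM t WHERE a_1 = 10",
    high_cost_conditions=["a_3 > 7"],   # hypothetical high-cost condition
)
print(converted)
```

The projection onto column c stays outside the function call, which matches the description that the return value of Func only has to contain the columns needed by the subsequent projection.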

図６にＳＥＬＥＣＴ文再生成の別の具体例を示す。
図６の例では、処理Ｓ１６０で出力された並列処理命令を選択処理実行関数Ｆｕｎｃに引数として与える例である。
高コスト処理「ａ＿Ｎｏｐ＿ｎｂ＿Ｎ」の次の引数「０」が、並列処理対象か否かのフラグとなっている。
また、変形例として、並列多重度を自然数として渡す方法が挙げられる。
また、図６のように、並列処理命令を明示的に引数とするのではなく、与えられた高コスト処理は、暗黙的に全て並列化対象として扱う、というようにシステム管理者等が設定するという方法もある。 FIG. 6 shows another specific example of SELECT statement regeneration.
The example of FIG. 6 is an example in which the parallel processing instruction output in step S160 is given as an argument to the selection processing execution function Func.
The next argument “0” of the high-cost process “a_N op_n b_N” is a flag indicating whether or not it is a target for parallel processing.
Further, as a modification, a method of passing the parallel multiplicity as a natural number can be mentioned.
Alternatively, instead of passing the parallel processing instruction explicitly as an argument as in FIG. 6, a system administrator or the like may configure the system so that every high-cost process given to the function is implicitly treated as a target of parallelization.

図７にＵＰＤＡＴＥ文再生成の具体例を示す。図の例は、数６を用いた例であり、コスト情報の設定等は数１の例と同様である。
ここで、ＰＲＩＭＡＲＹＫＥＹはテーブル内の主キーを表す。
基本的な考え方は、ＳＥＬＥＣＴ文のときと同様であるが、ＵＰＤＡＴＥ文においては、選択処理を実行するテーブルと更新対象のテーブルが同一のテーブルであるという規則があるため、変換に工夫が必要となる。
そこで、関数Ｆｕｎｃによる選択結果として生成される一時テーブル（図７のｓｕｂ＿ｔ）と、更新対象のテーブル（図７のｔ）とを、主キーによって結合することで、この問題を解決している。
このため、本方法を用いる限り、更新対象のテーブルには主キー、あるいは、それに準じるカラムが設定されていることが要件となる。 FIG. 7 shows a specific example of UPDATE statement regeneration. The example shown in the figure is an example using Equation 6, and the setting of cost information and the like are the same as those in Equation 1.
Here, PRIMARYKEY represents a primary key in the table.
The basic idea is the same as for the SELECT statement. However, an UPDATE statement is subject to the rule that the table on which the selection process is executed and the table to be updated must be the same table, so the conversion requires some ingenuity.
Therefore, this problem is solved by joining the temporary table (sub_t in FIG. 7) generated as a selection result by the function Func and the table to be updated (t in FIG. 7) with the primary key.
Therefore, as long as this method is used, it is a requirement that the table to be updated has a primary key or a column corresponding to it.
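The primary-key join can be tried out with a small in-memory SQLite database. This is only a sketch: the column name `pk`, the sample rows, and the use of an `IN` subquery to express the join are assumptions, and `sub_t` is filled by hand here instead of being produced by Func.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (pk INTEGER PRIMARY KEY, a INTEGER, c REAL);
    INSERT INTO t VALUES (1, 10, 100.0), (2, 20, 100.0), (3, 10, 100.0);
    -- sub_t stands in for the temporary table returned by Func: it holds
    -- only the primary keys of the rows that passed the selection process.
    CREATE TEMP TABLE sub_t (pk INTEGER PRIMARY KEY);
    INSERT INTO sub_t VALUES (1), (3);
""")

# The update still targets table t itself; the selection result is tied
# back to t through the primary key.
conn.execute("UPDATE t SET c = c * 1.1 WHERE pk IN (SELECT pk FROM sub_t)")

rows = conn.execute("SELECT pk, ROUND(c, 2) FROM t ORDER BY pk").fetchall()
print(rows)
```

Only the rows whose primary keys appear in `sub_t` are updated, which is why the table to be updated needs a primary key or an equivalent column.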

選択処理実行関数Ｆｕｎｃは、適当なプログラミング言語で作成されたユーザ定義の関数として実現することができる。
多くのデータベース管理システムでは、ユーザが独自に定義した関数、演算をデータベース言語に組み入れて利用することが可能となっている。
例えば、オープンソースのデータベース管理システムＰｏｓｔｇｒｅＳＱＬや、商用データベースであるＯｒａｃｌｅＤａｔａｂａｓｅ１１ｇでは、演算による選択結果のデータ集合を、一時テーブルとして出力することが可能とする機能を、ユーザ定義のＣ言語関数においてサポートしており、本実施の形態を適用することができる。 The selection process execution function Func can be realized as a user-defined function created in an appropriate programming language.
In many database management systems, functions and operations defined by users can be incorporated into a database language and used.
For example, in the open source database management system PostgreSQL and the commercial database Oracle Database 11g, a function that enables a data set of selection results obtained by calculation to be output as a temporary table is supported by a user-defined C language function. Therefore, the present embodiment can be applied.

選択処理実行関数の動作について、図８を用いて説明する。
選択処理実行関数はデータベースサーバ装置３００によって、問合せ実行時に呼び出される。
選択処理実行関数が起動されると、選択処理実行関数は、呼び出し元のデータベースサーバ装置３００を呼び出し、入力された副問合せ文を実行させ、結果を受け取る（Ｓ２１０）。
入力された高コスト処理のうち、未処理のものがあれば（Ｓ２２０）、未処理の高コスト処理を１つ選択し実行し、結果を受け取る（Ｓ２３０）。
この際、処理Ｓ１６０により並列処理命令が併せて入力されている場合は、並列多重度を増やして（並列処理命令に適合する並列多重度にして）、処理を実行する。
未処理の高コスト処理がなくなれば（Ｓ２２０）、結果を出力し、処理を完了する（Ｓ２４０）。 The operation of the selection process execution function will be described with reference to FIG.
The selection process execution function is called by the database server device 300 at the time of query execution.
When the selection process execution function is activated, the selection process execution function calls the caller database server apparatus 300 to execute the input subquery and receives the result (S210).
If there is an unprocessed high-cost process that has been input (S220), one unprocessed high-cost process is selected and executed, and the result is received (S230).
At this time, if a parallel processing instruction is also input in step S160, the parallel multiplicity is increased (the parallel multiplicity is adapted to the parallel processing instruction), and the process is executed.
If there is no unprocessed high-cost process (S220), the result is output and the process is completed (S240).
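The flow S210 to S240 can be sketched as follows, with SQLite standing in for the database server device 300. Representing the high-cost processes as Python callables, the table contents, and the name `selection_func` are assumptions for illustration; the handling of the parallel processing instruction is omitted.

```python
import sqlite3


def selection_func(conn, subquery, high_cost_processes):
    """Sketch of the selection process execution function."""
    # S210: call back into the database server to run the subquery.
    cursor = conn.execute(subquery)
    names = [d[0] for d in cursor.description]
    rows = [dict(zip(names, r)) for r in cursor.fetchall()]

    pending = list(high_cost_processes)
    while pending:                       # S220: any unprocessed process left?
        process = pending.pop(0)         # S230: pick one and execute it
        rows = [r for r in rows if process(r)]
    return rows                          # S240: output the result


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (c TEXT, a_1 INTEGER, a_2 TEXT);
    INSERT INTO t VALUES ('c_1', 1, 'xx'), ('c_2', 1, 'yy'), ('c_3', 0, 'xx');
""")
result = selection_func(
    conn,
    "SELECT c, a_2 FROM t WHERE a_1 = 1",   # low-cost part, run by the server
    [lambda r: "x" in r["a_2"]],            # hypothetical high-cost condition
)
print([r["c"] for r in result])
```

The high-cost condition is evaluated only on the rows that survived the subquery, which is the source of the narrowing effect.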

処理Ｓ２１０では、例えばＰｏｓｔｇｒｅＳＱＬの場合、ＳＰＩ（ＳｅｒｖｅｒＰｒｏｇｒａｍｍｉｎｇＩｎｔｅｒｆａｃｅ）というインタフェースを利用して、Ｃ言語のユーザ定義関数からＰｏｓｔｇｒｅＳＱＬサーバを呼び出し、データ集合の受け渡しが可能である。
商用データベースであるＯｒａｃｌｅ（登録商標）Ｄａｔａｂａｓｅ１１ｇにおいても、ＯＣＩ（Ｏｒａｃｌｅ（登録商標）ＣａｌｌＩｎｔｅｒｆａｃｅ）を利用して、同様の操作が可能である。 In process S210, in the case of PostgreSQL for example, a user-defined function written in C can call the PostgreSQL server through an interface called SPI (Server Programming Interface) and exchange data sets with it.
In Oracle (registered trademark) Database 11g which is a commercial database, the same operation can be performed using OCI (Oracle (registered trademark) Call Interface).

処理Ｓ２１０において、未処理の高コスト処理を１つ選択する方法として、実行コストの小さい順に選択するという方法がある。
こうすることで、絞込み効率を向上させることができる。 As a method of selecting one unprocessed high-cost process in the process S210, there is a method of selecting in ascending order of execution cost.
By doing so, the narrowing efficiency can be improved.
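The effect of choosing the cheapest unprocessed process first can be checked with simple arithmetic. In the sketch below, the cost figures, the row count, and the assumption that every process passes the same fraction of rows are hypothetical; the model only illustrates why ascending cost order reduces total work under those assumptions.

```python
def order_by_cost(processes):
    """Return (name, cost) pairs sorted so the cheapest is executed first."""
    return sorted(processes, key=lambda p: p[1])


def total_cost(processes, n_rows, selectivity):
    """Total work when each process filters the rows left by the previous one."""
    total, remaining = 0.0, float(n_rows)
    for _, cost in processes:
        total += remaining * cost     # every remaining row is evaluated once
        remaining *= selectivity      # assumed fraction of rows that pass
    return total


processes = [("a_6 op_6 b_6", 900.0), ("a_4 op_4 b_4", 300.0), ("a_5 op_5 b_5", 600.0)]
ascending = order_by_cost(processes)
descending = list(reversed(ascending))

print([name for name, _ in ascending])
print(total_cost(ascending, 1000, 0.5), total_cost(descending, 1000, 0.5))
```

With 1000 input rows and half of the rows passing each process, the ascending order costs 825000 units against 1275000 for the descending order, because the most expensive process sees the fewest rows.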

ここで、変換済み問合せ文１０２を入力したデータベースサーバ装置３００で行われる動作の具体例を図１５及び図１６を参照して説明する。
図１５は、データベースサーバ装置３００が検索対象とする検索対象テーブルの概略を示している。
図１６は、データベースサーバ装置３００が副問い合わせ文ＳｕｂＱｒｙを実行して抽出したレコードに対する、スレーブ装置６００の並列処理を説明している。 Here, a specific example of an operation performed by the database server apparatus 300 that has input the converted query statement 102 will be described with reference to FIGS. 15 and 16.
FIG. 15 shows an outline of a search target table to be searched by the database server device 300.
FIG. 16 illustrates parallel processing of the slave device 600 with respect to a record extracted by the database server device 300 executing the sub-query sentence SubQry.

図１５のａ＿１、ａ＿２、ａ＿Ｋ、ａ＿（Ｋ＋１）、ａ＿（Ｋ＋２）、ａ＿Ｎはカラム名を示しており（つまり、図１５の例では、Ｋ＝３、Ｎ＝６）、図５の条件式（ａ＿１ｏｐ＿１ｂ＿１）等に対応する。
また、図１５のｃもカラム名を示している。
ｃ＿１、ｃ＿２、ｃ＿３等はカラムｃの値である。
ｃ＿１、ｃ＿２、ｃ＿３等は各レコードを識別できる値である。
更に、図１５の「Ｘ」は、検索条件に合致していることを示している。
例えば、図１５のｃ＿１レコードでは、カラムａ＿１に「Ｘ」が示されているが、これはｃ＿１レコードは検索条件（ａ＿１ｏｐ＿１ｂ＿１）に合致していることを意味する。
図１６においても、これらは同じである。 In FIG. 15, a_1, a_2, a_K, a_(K+1), a_(K+2), and a_N indicate column names (that is, K = 3 and N = 6 in the example of FIG. 15) and correspond to the conditional expressions of FIG. 5, such as (a_1 op_1 b_1).
Also, c in FIG. 15 indicates the column name.
c_1, c_2, c_3, etc. are the values of the column c.
c_1, c_2, c_3, etc. are values that can identify each record.
Furthermore, “X” in FIG. 15 indicates that the search condition is met.
For example, in the c_1 record of FIG. 15, “X” is shown in the column a_1, which means that the c_1 record matches the search condition (a_1 op_1 b_1).
The same applies to FIG. 16.

データベースサーバ装置３００では、図５の副問い合わせ文ＳｕｂＱｒｙを実行する。
図１５の例では、データベースサーバ装置３００は、低コストの検索条件（ａ＿１ｏｐ＿１ｂ＿１）、（ａ＿２ｏｐ＿２ｂ＿２）、（ａ＿Ｋｏｐ＿Ｋｂ＿Ｋ）の組合せに適合するレコードのレコード名ｃと、そのレコードにおけるカラムａ＿（Ｋ＋１）、ａ＿（Ｋ＋２、ａ＿Ｎの値を抽出する（第１の抽出処理）。
副問い合わせ文ＳｕｂＱｒｙの実行の結果、図１５の例では、矢印が示されているレコードが抽出される。 In the database server device 300, the sub-query sentence SubQry of FIG. 5 is executed.
In the example of FIG. 15, the database server device 300 extracts the record name c of each record that matches the combination of the low-cost search conditions (a_1 op_1 b_1), (a_2 op_2 b_2), and (a_K op_K b_K), together with the values of columns a_(K+1), a_(K+2), and a_N in that record (first extraction process).
As a result of executing the subquery SubQry, the records marked with arrows in the example of FIG. 15 are extracted.

そして、並列処理指示部１４０により並列化が指示されている場合（図４のＳ１５０でＹＥＳの場合）に、データベースサーバ装置３００は、高コストの検索条件についての抽出処理を複数のスレーブ装置６００に並列に実行させる。
例えば、データベースサーバ装置３００は、副問い合わせ文ＳｕｂＱｒｙの実行により得られたレコード群を、所定の単位で分割し、分割により得られたブロックを複数のスレーブ装置６００に出力し、複数のスレーブ装置６００にブロックごとの抽出処理を並列に実行させる。
図１６の例では、データベースサーバ装置３００は、副問い合わせ文ＳｕｂＱｒｙの実行により得られたレコード群を２レコード単位で分割し、３つのスレーブ装置６００の各々に２レコードごとのブロックを出力する。
そして、各スレーブ装置６００では、高コストの検索条件（ａ＿（Ｋ＋１）ｏｐ＿（Ｋ＋１）ｂ＿（Ｋ＋１））、（ａ＿（Ｋ＋２）ｏｐ＿（Ｋ＋２）ｂ＿（Ｋ＋２））、（ａ＿Ｎｏｐ＿Ｎｂ＿Ｎ）の組合せに適合するレコードのレコード名ｃを抽出し（第２の抽出処理）、データベースサーバ装置３００に出力する。 When parallelization is instructed by the parallel processing instruction unit 140 (YES in S150 of FIG. 4), the database server device 300 causes the plurality of slave devices 600 to execute the extraction process for the high-cost search conditions in parallel.
For example, the database server device 300 divides the record group obtained by executing the subquery SubQry into blocks of a predetermined unit, outputs the blocks to the plurality of slave devices 600, and causes the slave devices 600 to execute the extraction process for each block in parallel.
In the example of FIG. 16, the database server device 300 divides the record group obtained by executing the subquery SubQry into units of two records and outputs a block of two records to each of the three slave devices 600.
Each slave device 600 then extracts the record name c of each record that matches the combination of the high-cost search conditions (a_(K+1) op_(K+1) b_(K+1)), (a_(K+2) op_(K+2) b_(K+2)), and (a_N op_N b_N) (second extraction process) and outputs it to the database server device 300.
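The block division and parallel second extraction can be sketched as below. Worker threads stand in for the slave devices 600, and the record contents and the single high-cost condition are hypothetical; a real deployment would send the blocks to separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Divide the first-extraction result into blocks of block_size records."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def slave_extract(block, high_cost_conditions):
    """Second extraction on one block: names of records matching every condition."""
    return [r["c"] for r in block if all(cond(r) for cond in high_cost_conditions)]


def parallel_extract(records, high_cost_conditions, block_size=2, n_slaves=3):
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        # map() preserves block order, so the merged result keeps record order.
        parts = list(pool.map(lambda b: slave_extract(b, high_cost_conditions), blocks))
    return [name for part in parts for name in part]


# Six records as left by the first extraction process (values are made up).
records = [{"c": f"c_{i}", "a_4": v} for i, v in enumerate([5, 20, 30, 1, 15, 2], start=1)]
matched = parallel_extract(records, [lambda r: r["a_4"] > 10])
print(matched)
```

Six records with a block size of two give three blocks, one per worker, mirroring the example of FIG. 16.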

なお、図１６では、高コストの並列処理を全てスレーブ装置６００で実行することとしているが、並列処理の一部をデータベースサーバ装置３００が実行するようにしてもよい。
また、データベースサーバ装置３００がマルチプロセスに対応している場合には、データベースサーバ装置３００で全ての並列処理を実行するようにしてもよい。 In FIG. 16, all the high-cost parallel processing is executed by the slave device 600, but the database server device 300 may execute part of the parallel processing.
Further, when the database server device 300 supports multi-process, all parallel processing may be executed by the database server device 300.

また、図１５の例では、説明の簡明のために、カラム数が少ないテーブルを例にして説明を行ったが、どのような大きさのテーブルであっても、同様の手順により問合せ実行時間の短縮を図ることができる。 In the example of FIG. 15, a table with a small number of columns was used for simplicity of explanation, but the query execution time can be shortened by the same procedure for a table of any size.

以上で述べたように、実施の形態１においては、問合せ変換装置１００に入力された問合せ文１０１に対して、適切な問合せ文変換を施すことで、問合せの実行結果を変えることなく、実行計画の最適化と、複数レコードを同時処理する並列化とを両立させることが可能となる。
問合せ文のレベルで、実行計画最適化と並列化を両立するための処理命令を組み込むことにより、既存のデータベース管理システムを改修することなく、問合せ実行時間の短縮効果を得ることが可能となる。
このように、本実施の形態では、検索実行時の実行コストに基づいて、問合せ文に含まれる検索条件を低コストと高コストに分類し、低コストの検索条件について抽出処理と、高コストの検索条件についての抽出処理とを区別するため、実行コストの高低に応じて並列度を使い分けることができ、問合せの実行時間の短縮を図ることができる。 As described above, in the first embodiment, applying an appropriate query statement conversion to the query statement 101 input to the query conversion device 100 makes it possible to achieve both optimization of the execution plan and parallelization that processes a plurality of records simultaneously, without changing the execution result of the query.
By incorporating processing instructions for achieving both execution plan optimization and parallelization at the level of the query statement, the query execution time can be shortened without modifying the existing database management system.
Thus, in the present embodiment, the search conditions included in the query statement are classified into low-cost and high-cost conditions based on their execution cost at search time, and the extraction process for the low-cost search conditions is distinguished from the extraction process for the high-cost search conditions. The degree of parallelism can therefore be chosen according to the execution cost, and the query execution time can be shortened.

以上、本実施の形態では、
データベース管理システムにおいて、選択処理を含む問合せ文を変換する問合せ変換装置であって、
１）問合せ文を入力する問合せ文入力部と、
２）入力された問合せ文の解析を行う問合せ解析部と、
３）問合せ解析結果と、演算の実行コスト情報とを元に、前記問合せ文の選択処理の中から、高コスト処理を除いた、低コスト処理のみで形成される副問合せ文を生成する、副問合せ文生成部と、
４）少なくとも、前記副問合せ文と、前記副問合せ文生成部において除かれた、０個以上の高コスト処理とを入力に持ち、前記問合せ文において、前記選択処理の結果を入力とするような後続の処理を実行するのに十分なデータの集合を出力するような関数（選択処理実行関数）の出力を、前記後続の処理への入力とするような問合せ文を再生成する、問合せ文再生成部と
を有する問合せ変換装置を説明した。 As described above, in the present embodiment,
a query conversion device that converts a query statement including a selection process in a database management system has been described, the device comprising:
1) a query statement input unit that inputs a query statement;
2) a query analysis unit that analyzes the input query statement;
3) a subquery generation unit that, based on the query analysis result and the execution cost information of operations, generates a subquery formed only of low-cost processes by removing the high-cost processes from the selection process of the query statement; and
4) a query statement regeneration unit that takes as input at least the subquery and the zero or more high-cost processes removed by the subquery generation unit, and regenerates a query statement in which the output of a function (selection process execution function), which outputs a set of data sufficient to execute the subsequent process that takes the result of the selection process in the query statement as its input, is used as the input to the subsequent process.

また、本実施の形態では、
前記問合せ変換装置は、
前記副問合せ文生成部において除かれた、０個以上の各高コスト処理に対して、処理を並列化することを指示する命令を、前記選択処理実行関数へ入力する並列処理指示部を有することを説明した。 In the present embodiment,
it has also been described that the query conversion device includes
a parallel processing instruction unit that inputs, to the selection process execution function, an instruction to parallelize each of the zero or more high-cost processes removed by the subquery generation unit.

また、本実施の形態では、
前記問合せ文はデータ参照文であって、前記問合せ変換装置は、
前記選択処理の結果を入力とするような後続の処理は、射影処理である
ことを説明した。 In the present embodiment,
it has also been described that, when the query statement is a data reference statement,
the subsequent process that takes the result of the selection process as its input is a projection process.

また、本実施の形態では、
前記問合せ文はデータ更新文であって、前記問合せ変換装置は、
前記選択処理の結果を入力とするような後続の処理は、更新処理である
ことを説明した。 In the present embodiment,
it has also been described that, when the query statement is a data update statement,
the subsequent process that takes the result of the selection process as its input is an update process.

また、本実施の形態では、
データベース管理システムにおいて、選択処理を含む問合せ文を変換する問合せ変換方法であって、
１）問合せ文を入力する問合せ文入方法と、
２）入力された問合せ文の解析を行う問合せ解析方法と、
３）問合せ解析結果と、演算の実行コスト情報とを元に、前記問合せ文の選択処理の中から、高コスト処理を除いた、低コスト処理のみで形成される副問合せ文を生成する、副問合せ文生成方法と、
４）少なくとも、前記副問合せ文と、前記副問合せ文生成方法において除かれた、０個以上の高コスト処理とを入力に持ち、前記問合せ文において、前記選択処理の結果を入力とするような後続の処理を実行するのに十分なデータの集合を出力するような関数（選択処理実行関数）の出力を、前記後続の処理への入力とするような問合せ文を再生成する、問合せ文再生成方法と
を有する問合せ変換方法を説明した。 In the present embodiment,
a query conversion method for converting a query statement including a selection process in a database management system has been described, the method comprising:
1) a query statement input step of inputting a query statement;
2) a query analysis step of analyzing the input query statement;
3) a subquery generation step of generating, based on the query analysis result and the execution cost information of operations, a subquery formed only of low-cost processes by removing the high-cost processes from the selection process of the query statement; and
4) a query statement regeneration step of taking as input at least the subquery and the zero or more high-cost processes removed in the subquery generation step, and regenerating a query statement in which the output of a function (selection process execution function), which outputs a set of data sufficient to execute the subsequent process that takes the result of the selection process in the query statement as its input, is used as the input to the subsequent process.

実施の形態２．
本実施の形態では、問合せ変換装置を、データベース管理システムに適用した別の例を挙げる。
本実施の形態は、実施の形態１と比較して、システムの構成のみが異なる。
そこで、本実施の形態では、システムの構成のみを説明する。 Embodiment 2.
In this embodiment, another example in which the query conversion apparatus is applied to a database management system will be given.
The present embodiment is different from the first embodiment only in the system configuration.
Therefore, in the present embodiment, only the system configuration will be described.

図９は、本実施の形態の問合せ変換装置を適用した、データベース管理システムを示す構成図である。
問合せ発行装置２００ａ（データベースクライアントに相当）は、ネットワーク５００を通じて、データベースサーバ装置３００と接続されており、データベースサーバ装置３００はデータ格納装置４００と接続されている。
また、データベースサーバ装置３００には、複数のスレーブ装置６００が接続されている。
問合せ変換装置１００ａは、問合せ発行装置２００ａの部分装置として適用される。 FIG. 9 is a configuration diagram showing a database management system to which the query conversion apparatus according to this embodiment is applied.
The query issuing device 200a (corresponding to a database client) is connected to the database server device 300 through the network 500, and the database server device 300 is connected to the data storage device 400.
In addition, a plurality of slave devices 600 are connected to the database server device 300.
The query conversion device 100a is applied as a component of the query issuing device 200a.

問合せ発行装置２００ａでは、問合せ文生成部２０１が、ユーザの要求に応じて問合せ文を生成し、生成された問合せ文を発行する前に、問合せ変換装置１００ａが、必要に応じて問合せ文を変換する。
その後、問合せ発行装置２００ａは、変換済みの問合せ文を問合せ変換装置１００ａから受け取り、データベースサーバ装置３００へ変換済みの問合せ文を発行する。
この点以外は、実施の形態１と同様であるため、説明を省略する。
なお、本実施の形態では、問合せ文生成部２０１及び問合せ変換装置１００ａを含む問合せ発行装置２００ａが情報処理装置の例に相当する。 In the query issuing device 200a, the query statement generating unit 201 generates a query statement in response to a user request, and before the generated query statement is issued, the query conversion device 100a converts the query statement as necessary.
Thereafter, the query issuing device 200a receives the converted query statement from the query conversion device 100a and issues the converted query statement to the database server device 300.
Other than this point, the second embodiment is the same as the first embodiment, and a description thereof will be omitted.
In the present embodiment, the query issuing device 200a including the query statement generation unit 201 and the query conversion device 100a corresponds to an example of an information processing device.

実施の形態３．
本実施の形態では、問合せ変換装置を、カラム単位で暗号化されたデータを格納するデータベースを扱うシステム（以下、「暗号化データベースシステム」という）へ適用した例を挙げる。
本実施の形態は、実施の形態１や実施の形態２の構成に、暗号化に対応させた構成を追加することで実現する。
そこで、本実施の形態では、実施の形態１との差異部分のみを説明する。 Embodiment 3.
In the present embodiment, an example in which the query conversion apparatus is applied to a system that handles a database that stores data encrypted in column units (hereinafter referred to as “encrypted database system”) will be described.
This embodiment is realized by adding a configuration corresponding to encryption to the configuration of the first embodiment or the second embodiment.
Therefore, in the present embodiment, only differences from the first embodiment will be described.

暗号化データベースシステムへ適用した構成図は、図１、図１０、および図１１で示される。
図１０で示されるデータ格納装置４００、および、図１１で示される絞込み処理追加部１６０が、実施の形態１との差異部分である。
本実施の形態では、絞込み処理追加部１６０も問合せ文変換部の例に相当する。 Configuration diagrams applied to the encrypted database system are shown in FIG. 1, FIG. 10, and FIG.
The data storage device 400 shown in FIG. 10 and the narrowing processing addition unit 160 shown in FIG. 11 are different from the first embodiment.
In the present embodiment, the narrowing process adding unit 160 also corresponds to an example of a query statement converting unit.

図１０で示されるデータ格納装置４００では、カラム毎に適用される暗号化方式を記録した暗号化方式情報４４０、および、カラム毎に設定された機密度に関する機密度情報４５０が追加の情報となっている。
これら追加の情報は、他の情報と同様に、問合せ変換装置１００、および、データベースサーバ装置３００から参照することができる。 In the data storage device 400 shown in FIG. 10, encryption method information 440, which records the encryption method applied to each column, and sensitivity information 450, which relates to the sensitivity set for each column, are additional information.
Like the other information, this additional information can be referred to from the query conversion apparatus 100 and the database server apparatus 300.

絞込み処理追加部１６０を追加した問合せ変換装置１００の動作を、図１２を用いて説明する。
ここでは、処理Ｓ１９０が追加されている。 The operation of the query conversion apparatus 100 to which the narrowing process adding unit 160 is added will be described with reference to FIG.
Here, processing S190 is added.

絞込み処理追加部１６０は、特定のカラムを参照する演算処理において、処理を高速化するための絞込み処理用カラムが別途用意されている場合に、問合せ解析部１２０による解析結果を元に、問合せ文１０１の選択処理結果の正確さを損なわないような絞込み処理用カラムによる処理を追加し、処理を追加した問合せ文を、副問合せ文生成部１３０への入力とする（Ｓ１９０）。 When a narrowing column for speeding up processing is separately provided for an operation that refers to a specific column, the narrowing process adding unit 160 adds, based on the analysis result of the query analysis unit 120, a process using the narrowing column that does not impair the accuracy of the selection result of the query statement 101, and passes the query statement with the added process to the subquery generation unit 130 as its input (S190).

絞込み処理追加の具体例を説明する。
データの暗号化方式として、表１の３種類の方式をカラムの機密度毎に使い分ける場合を考える。
なお、以降で使用する「確定的暗号」「確率的暗号」「検索可能暗号」といった用語は、一定の性質を持つ暗号化方式の種類を表すものであり、個別の暗号化方式を特定するものではない。 A specific example of adding a narrowing process will be described.
As the data encryption method, consider a case in which the three types of methods in Table 1 are used selectively according to the sensitivity of each column.
The terms “deterministic encryption”, “probabilistic encryption”, and “searchable encryption” used below denote classes of encryption schemes having certain properties and do not specify individual encryption schemes.

表１の「確定的暗号」は、平文と暗号文とが一対一に対応する方式であり、完全一致比較が可能である。
この比較は平文カラムの比較性能と同等性能で実現できる。
確定的暗号は、頻度解析に弱いという欠点があるため、暗号強度の面で確率的暗号に劣る。 The “deterministic cipher” in Table 1 is a scheme in which plaintext and ciphertext correspond one-to-one, and complete match comparison is possible.
This comparison can be realized with the same performance as that of the plaintext column.
Because deterministic encryption has the drawback of being vulnerable to frequency analysis, it is inferior to probabilistic encryption in terms of cryptographic strength.

表中の「確率的暗号」は、暗号強度の高い適当な確率的暗号方式で対象データを暗号化するため、完全一致比較を含む任意の演算が利用できない。
そこで、「確率的暗号」が適用されたカラムに関しては、検索可能暗号を検索に利用することを考える。 With “probabilistic encryption” in the table, the target data is encrypted with a suitable probabilistic encryption scheme of high cryptographic strength, so no operation, including exact-match comparison, can be applied to it.
Therefore, for a column to which “probabilistic encryption” is applied, consider using searchable encryption for the search.

検索可能暗号とは、以下の参考文献に端を発し研究が進んでいる、データを暗号化したまま検索可能な暗号化方式を指す。
検索可能暗号により、データベースサーバ装置３００に平文および検索語の内容を漏らすことなく、検索結果を得ることが可能となる。 Searchable encryption refers to encryption schemes that allow data to be searched while it remains encrypted; research on them originated with the reference below and has continued since.
With the searchable encryption, it is possible to obtain a search result without leaking the contents of the plain text and the search word to the database server device 300.

参考文献：
Ｄ．Ｘ．Ｓｏｎｇ，Ｄ．ＷａｇｎｅｒａｎｄＡ．Ｐｅｒｒｉｇ， “ＰｒａｃｔｉｃａｌＴｅｃｈｎｉｑｕｅｓｆｏｒＳｅａｒｃｈｅｓｏｎＥｎｃｒｙｐｔｅｄＤａｔａ”，ＩＥＥＥ，２０００． References:
D. X. Song, D. Wagner and A. Perrig, “Practical Techniques for Searches on Encrypted Data”, IEEE, 2000.

検索可能暗号は、検索に使用する専用タグ（以下、「暗号化タグ」という）のみを生成する。
専用のトラップドア（暗号化した検索語のようなもの）を発行し、暗号化タグとの照合を実施する。
検索可能暗号から元の平文の情報を復元することはできず、あくまで、検索のためだけの付加情報である。
「確率的暗号」を適用したカラムでは、暗号化データを格納したカラムとは別に、検索に使用するための暗号化タグを格納するカラムを用意する。 The searchable encryption generates only a dedicated tag (hereinafter referred to as “encryption tag”) used for search.
A search is performed by issuing a dedicated trapdoor (something like an encrypted search term) and collating it against the encryption tags.
The original plaintext cannot be recovered from a searchable-encryption tag; the tag is merely additional information used only for searching.
In a column to which “probabilistic encryption” is applied, a column for storing an encryption tag for use in a search is prepared separately from a column for storing encrypted data.

暗号化タグとトラップドアの照合処理は、高いセキュリティ強度を保証するが、照合速度は遅く、「実行コストが高い処理」に該当する。 Collating an encryption tag against a trapdoor guarantees high security strength, but it is slow and therefore qualifies as a “process with high execution cost”.

更に、暗号化タグとトラップドアの照合処理に関して、絞込みを高速化するための索引カラムが別途用意されているとする。
これは、例えば、暗号化前の平文データを入力とする精度の荒いハッシュ値で実現できる。
このハッシュ値による絞込み処理を前処理として行うことで、速度の遅い暗号化タグとトラップドアとの照合処理量を削減することができる。 Further, it is assumed that an index column for speeding up the narrowing is separately prepared for the verification process of the encryption tag and the trapdoor.
This can be realized, for example, with a coarse hash value computed from the plaintext data before encryption.
By performing narrowing based on this hash value as preprocessing, the number of slow collations between encryption tags and the trapdoor can be reduced.
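The pre-filtering idea can be sketched with standard-library primitives. This is not a searchable encryption scheme: an HMAC is used purely as a stand-in for the encryption tag and the trapdoor, and the key, the 4-bit hash width, and the sample values are assumptions. The sketch only shows that the coarse hash removes most rows before the expensive collation runs.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key, for illustration only


def coarse_hash(plaintext, bits=4):
    """Deliberately coarse hash (1 to 8 bits) of the plaintext for the index column."""
    return hashlib.sha256(plaintext.encode()).digest()[0] >> (8 - bits)


def make_tag(plaintext):
    """Stand-in for an encryption tag or trapdoor (a real scheme is probabilistic)."""
    return hmac.new(KEY, plaintext.encode(), hashlib.sha256).digest()


def collate(tag, trapdoor):
    """Stand-in for the slow tag/trapdoor collation."""
    return hmac.compare_digest(tag, trapdoor)


def search(rows, keyword, bits=4):
    target_idx = coarse_hash(keyword, bits)
    trapdoor = make_tag(keyword)
    # Cheap preprocessing: keep only rows whose index value matches.
    candidates = [r for r in rows if r["idx"] == target_idx]
    # Expensive collation runs on the reduced candidate set only.
    matches = [r for r in candidates if collate(r["tag"], trapdoor)]
    return matches, len(candidates)


values = ["111", "123456789", "222", "333"]
rows = [{"id": i, "idx": coarse_hash(v), "tag": make_tag(v)} for i, v in enumerate(values)]
matches, n_candidates = search(rows, "123456789")
print([r["id"] for r in matches], n_candidates)
```

Because rows with equal plaintext always share an index value, the pre-filter never discards a true match, so the accuracy of the selection result is preserved while fewer collations are performed.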

以上の状況の下で、絞込み処理追加部１６０を含めた問合せ変換の具体例を図１３、図１４に示す。
図１３は、検索対象となるテーブルの定義文をＳＱＬで記述したものである。
カラムｃ＿３が確率的暗号により暗号化されるカラムであり、検索用のカラムとして、検索可能暗号で生成される暗号化タグを格納するカラムｃ＿３＿ｔａｇと、絞り込み用の索引カラムｃ＿３＿ｉｄｘが付加されている。 Under the above situation, specific examples of query conversion including the narrowing process adding unit 160 are shown in FIGS.
FIG. 13 shows the definition sentence of the table to be searched in SQL.
The column c_3 is a column encrypted by probabilistic encryption, and a column c_3_tag for storing an encryption tag generated by the searchable encryption and an index column c_3_idx for narrowing down are added as search columns.

図１４は、ＳＱＬで記述された問合せ文の具体的な変換方法を示している。
図１４における変換前のＳＱＬ文において、関数ｅｎｃｒｙｐｔ＿ｄｔｒ（），ｇｅｎ＿ｔｒａｐｄｏｏｒ（）は、それぞれ、確定的暗号方式による暗号化、トラップドアの生成を表している。
但し、ここでは、簡単のため、鍵などの引数は一切省略し、暗号化対象となるデータに関する引数のみを記載している。
このように省略しても、本実施の形態を説明する上では支障をきたさない。
また、ｃｏｌｌａｔｅ（）は、暗号化タグとトラップドアの照合処理を実行するため関数である。 FIG. 14 shows a specific method for converting a query statement described in SQL.
In the SQL sentence before conversion in FIG. 14, functions encrypt_dtr () and gen_trapdoor () respectively represent encryption by a deterministic encryption method and generation of a trap door.
However, for simplicity, arguments such as keys are all omitted here, and only the arguments relating to the data to be encrypted are shown.
These omissions do not hinder the explanation of the present embodiment.
Further, collate() is a function that executes the collation of an encryption tag against a trapdoor.

図１４における副問合せ文では、カラムｃ＿１，ｃ＿２に関する低コスト処理に加え、カラムｃ＿３の暗号化タグ照合処理の絞込みを実施するため、索引カラムｃ＿３＿ｉｄｘによる処理が追加されている。
前述したように、索引カラムｃ＿３＿ｉｄｘと暗号化前の平文データのハッシュ値（ｈａｓｈ（１２３４５６７８９））との照合処理を追加することで、暗号化タグ（ｃ＿３＿ｔａｇ）とトラップドア（ｔｒａｐｄｏｏｒ（１２３４５６７８９））との照合処理を高速化することができる。
絞込み処理追加部１６０は、索引カラムｃ＿３＿ｉｄｘを絞り込み処理に用いられる絞り込み処理用カラムとして指定し、更に、絞り込み処理を定義する問合せ文（ｃ＿３＿ｉｄｘ＝ｇｅｎ＿ｈａｓｈ（１２３４５６７８９）を生成し、当該問合せ文を副問合せ文生成部１３０に出力する（図１２のＳ１９０）。
副問合せ文生成部１３０は、絞込み処理追加部１６０からの問合せ文を含む副問合せ文を生成する（図１２のＳ１４０）。
なお、副問合せ文を生成した後は、実施の形態１で説明したように問合せ文を再生成すればよい。
図１４における変換後のＳＱＬ文は、再生成された問合せ文を表している。 In the subquery in FIG. 14, in addition to the low-cost processing for the columns c_1 and c_2, processing by the index column c_3_idx is added in order to narrow down the encryption tag matching processing for the column c_3.
As described above, by adding a collation process between the index column c_3_idx and the hash value (hash (123456789)) of the plaintext data before encryption, the encryption tag (c_3_tag) and the trap door (trapdoor (123456789)) Can be speeded up.
The narrowing process adding unit 160 specifies the index column c_3_idx as the narrowing column used for the narrowing process, generates a query statement that defines the narrowing process (c_3_idx = gen_hash(123456789)), and outputs that query statement to the subquery generation unit 130 (S190 in FIG. 12).
The subquery generation unit 130 generates a subquery that includes the query statement from the narrowing process adding unit 160 (S140 in FIG. 12).
After the subquery is generated, the query statement may be regenerated as described in the first embodiment.
The converted SQL statement in FIG. 14 represents the regenerated query statement.
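The role of the narrowing process adding unit 160 can be sketched as a small rewriting step. FIG. 14 is not reproduced here, so the condition on c_1, the argument order of collate(), and the dictionary representation of conditions are assumptions; only the function names and the value 123456789 come from the description.

```python
def add_narrowing(conditions, index_columns):
    """Insert a cheap index-column condition before each condition that has one.

    conditions    : list of dicts with "column", "sql" and optionally "keyword"
    index_columns : maps a tag column to its narrowing (index) column
    """
    result = []
    for cond in conditions:
        idx_col = index_columns.get(cond["column"])
        if idx_col is not None and "keyword" in cond:
            result.append({
                "column": idx_col,
                "sql": f"{idx_col} = gen_hash({cond['keyword']})",
            })
        result.append(cond)
    return result


conditions = [
    {"column": "c_1", "sql": "c_1 = encrypt_dtr(10)"},   # hypothetical condition
    {"column": "c_3_tag",
     "sql": "collate(c_3_tag, gen_trapdoor(123456789))",
     "keyword": "123456789"},
]
narrowed = add_narrowing(conditions, {"c_3_tag": "c_3_idx"})
print([c["sql"] for c in narrowed])
```

The added condition on c_3_idx is a low-cost comparison, so the subquery generation unit can keep it in the subquery while the collation on c_3_tag is removed as a high-cost process.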

このように、本実施の形態では、図１４に示すように、副問合せ文に、低コストであるカラムｃ＿１，ｃ＿２についての抽出処理（第１の抽出処理）が示されるとともに、絞り込み処理として、索引カラムｃ＿３＿ｉｄｘと暗号化前の平文データのハッシュ値との照合処理が示される。
この絞り込み処理は、高コストであるｃ＿３＿ｔａｇについての抽出処理（第２の抽出処理）が対象とするレコードを、カラムｃ＿１，ｃ＿２についての抽出処理（第１の抽出処理）で得られるレコードよりも絞り込むことができる。
そして、ｃ＿３＿ｔａｇについての抽出処理（第２の抽出処理）では、絞り込み処理により絞り込まれたレコードを対象にした抽出が行われるため、処理の高速化が図られる。 As described above, in the present embodiment, as shown in FIG. 14, the subquery statement indicates the extraction process (first extraction process) for the columns c_1 and c_2, which are low in cost, and as the narrowing process, A collation process between the index column c_3_idx and the hash value of plaintext data before encryption is shown.
This narrowing process can narrow the records targeted by the high-cost extraction process for c_3_tag (second extraction process) further than the records obtained by the extraction process for columns c_1 and c_2 (first extraction process).
In the extraction process (second extraction process) for c_3_tag, the extraction is performed on the records narrowed down by the narrowing process, so that the processing speed is increased.

対象カラムが、暗号化されている暗号化カラムである場合に、絞込み処理追加部１６０は、対象カラムの暗号化方式（暗号化強度）に応じて、追加する絞込み処理の内容を変更させるようにしてもよい。
例えば、「確率的暗号」が適用されたカラムについて、暗号化方式の強度に比例して、索引カラムに使用するハッシュの精度を荒くするという方法がある。
これは、強度の高い暗号化方式を適用しているカラムは、機密度が高い可能性が高いため、索引による絞込み効率の向上よりも、セキュリティを優先するべきという考え方に基づいている。
この場合は、当然ながら、データ格納時には、対応するハッシュ関数を用いて、索引カラムのデータを作っておく必要がある。 When the target column is an encrypted encrypted column, the narrowing process adding unit 160 changes the content of the narrowing process to be added according to the encryption method (encryption strength) of the target column. May be.
For example, for a column to which “probabilistic encryption” is applied, one method is to make the hash used for the index column coarser in proportion to the strength of the encryption scheme.
This is based on the idea that security should be prioritized over improvement of narrowing efficiency by index because a column to which a high-strength encryption method is applied is likely to have high confidentiality.
In this case, of course, when storing data, it is necessary to create index column data using a corresponding hash function.

また、対象カラムが、機密度が設定されている機密度設定カラムである場合に、絞込み処理追加部１６０は、対象カラムに設定された機密度に応じて、追加する絞込み処理の内容を変更させるようにしてもよい。
例えば、設定された機密度が高いほど、索引カラムに使用するハッシュの精度を荒くするという方法がある。
これは、機密度が高いカラムであるほど、索引による絞込み効率の向上よりも、セキュリティを優先するべきという考え方に基づいている。
この場合は、当然ながら、データ格納時には、対応するハッシュ関数を用いて、索引カラムのデータを作っておく必要がある。 In addition, when the target column is a sensitivity setting column in which the sensitivity is set, the narrowing process adding unit 160 changes the content of the narrowing process to be added according to the sensitivity set in the target column. You may do it.
For example, one method is to make the hash used for the index column coarser the higher the set sensitivity is.
This is based on the idea that the more sensitive a column is, the more security should be prioritized over improving the narrowing efficiency of the index.
In this case, of course, when storing data, it is necessary to create index column data using a corresponding hash function.
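One way to picture the relation between sensitivity and hash coarseness is to derive the number of index bits from the sensitivity level. The integer sensitivity scale, the bit widths, and the helper names below are assumptions; the description only states that a higher sensitivity should lead to a coarser hash.

```python
import hashlib


def index_bits_for(sensitivity, max_bits=16, min_bits=2, step=4):
    """Higher sensitivity gives fewer index bits, i.e. a coarser hash."""
    return max(min_bits, max_bits - step * sensitivity)


def index_value(plaintext, sensitivity):
    """Index-column value stored alongside the encrypted data."""
    bits = index_bits_for(sensitivity)
    digest16 = int.from_bytes(hashlib.sha256(plaintext.encode()).digest()[:2], "big")
    return digest16 >> (16 - bits)   # keep only the top `bits` bits


for level in range(5):
    print(level, index_bits_for(level), index_value("123456789", level))
```

The same function must be used both when the data is stored and when the narrowing condition is generated, which corresponds to the requirement that the index column be populated with the matching hash function at storage time. A coarser hash places more distinct plaintexts in the same bucket, so it reveals less about the data at the price of weaker narrowing.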

以上で述べたように、実施の形態３においては、問合せ変換装置１００に入力された問合せ文１０１に対して、適切な問合せ文変換を施すことで、問合せの実行結果を変えることなく、実行計画の最適化と、複数レコードを同時処理する並列化とを両立させることができ、更に、適当な絞込み処理を自動追加することで、問合せの実行時間を短縮することが可能となる。 As described above, in the third embodiment, the execution plan is changed without changing the execution result of the query by performing appropriate query statement conversion on the query statement 101 input to the query conversion device 100. Optimization and parallel processing for simultaneously processing a plurality of records can be made compatible, and further, an appropriate narrowing process can be automatically added to shorten the query execution time.

As described above, this embodiment has explained that
the query conversion device
has a narrowing process adding unit that, when a narrowing process column for speeding up processing is separately prepared for an operation that refers to a specific column, adds processing that uses the narrowing process column, based on the analysis result of the query analysis unit, in a way that does not impair the accuracy of the selection processing result, and
uses the output of the narrowing process adding unit as the input to the subquery statement generation unit.
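As a rough illustration of what the narrowing process adding unit contributes before subquery generation, the sketch below places a cheap predicate on a narrowing column in front of each condition whose column has one. The catalog `NARROWING_COLUMNS`, the column names, the `decrypt` SQL function, and the 8-bit truncated SHA-256 hash are all assumptions for illustration, not details from the specification.

```python
import hashlib

# Hypothetical catalog: encrypted column -> its narrowing (index) column.
NARROWING_COLUMNS = {"name_enc": "name_idx"}


def coarse_hash(value: str, bits: int = 8) -> int:
    """Truncated hash, assumed identical to the one used at storage time."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def add_narrowing(conditions):
    """Add narrowing predicates to a list of (column, literal, sql) tuples.

    Conditions on columns that have a narrowing column are assumed to be
    equality tests, so an equal hash is a necessary condition for a match.
    Returns SQL predicates with each narrowing predicate placed directly
    before the condition it accelerates.
    """
    predicates = []
    for column, literal, sql in conditions:
        narrowing_column = NARROWING_COLUMNS.get(column)
        if narrowing_column is not None:
            predicates.append(f"{narrowing_column} = {coarse_hash(literal)}")
        predicates.append(sql)
    return predicates


conditions = [
    ("age", "30", "age > 30"),
    ("name_enc", "bob", "decrypt(name_enc) = 'bob'"),
]
predicates = add_narrowing(conditions)
```

Since every record satisfying the original equality condition also satisfies the hash predicate, the added predicate only discards non-matching records and leaves the selection result intact.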

This embodiment has also explained that
the database management system to which the query conversion device is applied
handles queries to a table that stores data encrypted column by column, and
manages the data encryption method of each column, and that
the narrowing process adding unit
adds a narrowing process according to the encryption method of the column referenced by a condition described in the query.

This embodiment has also explained that
the database management system to which the query conversion device is applied
handles queries to a table that stores data encrypted column by column, and
manages the data sensitivity of each column, and that
the narrowing process adding unit
adds a narrowing process according to the sensitivity of the column referenced by a condition described in the query.

Finally, a hardware configuration example of the query conversion device 100 shown in Embodiments 1 to 3 will be described.
FIG. 17 is a diagram showing an example of the hardware resources of the query conversion device 100 shown in Embodiments 1 to 3.
Note that the configuration in FIG. 17 is merely one example of the hardware configuration of the query conversion device 100; the hardware configuration is not limited to that shown in FIG. 17 and may be another configuration.

In FIG. 17, the query conversion device 100 includes a CPU 911 (also called a central processing unit, processing unit, arithmetic unit, microprocessor, microcomputer, or processor) that executes programs.
The CPU 911 is connected via a bus 912 to, for example, a ROM (Read Only Memory) 913, a RAM (Random Access Memory) 914, a communication board 915, a display device 901, a keyboard 902, a mouse 903, and a magnetic disk device 920, and controls these hardware devices.
The CPU 911 may further be connected to an FDD 904 (Flexible Disk Drive), a compact disk device 905 (CDD), a printer device 906, and a scanner device 907. In place of the magnetic disk device 920, a storage device such as an SSD (Solid State Drive), an optical disk device, or a memory card (registered trademark) read/write device may be used.
The RAM 914 is an example of a volatile memory. The storage media of the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920 are examples of nonvolatile memories. These are examples of storage devices.
The communication board 915, the keyboard 902, the mouse 903, the scanner device 907, and the like are examples of input devices.
The communication board 915, the display device 901, the printer device 906, and the like are examples of output devices.

The communication board 915 is connected to a network, as shown in FIG. 1.
For example, the communication board 915 is connected to a LAN (local area network), the Internet, a WAN (wide area network), a SAN (storage area network), or the like.

The magnetic disk device 920 stores an operating system 921 (OS), a window system 922, a program group 923, and a file group 924.
The CPU 911 executes the programs in the program group 923 while using the operating system 921 and the window system 922.

The RAM 914 temporarily stores at least part of the programs of the operating system 921 and the application programs to be executed by the CPU 911.
The RAM 914 also stores various data necessary for processing by the CPU 911.

The ROM 913 stores a BIOS (Basic Input Output System) program, and the magnetic disk device 920 stores a boot program.
When the query conversion device 100 starts up, the BIOS program in the ROM 913 and the boot program in the magnetic disk device 920 are executed, and the BIOS program and the boot program start the operating system 921.

The program group 923 stores programs that execute the functions described as “~ unit” in the description of Embodiments 1 to 3. The programs are read and executed by the CPU 911.

The file group 924 stores, as the items of “~ file”, information, data, signal values, variable values, encryption keys and decryption keys, random values, and parameters indicating the results of the processing described in Embodiments 1 to 3 as “determination of ~”, “analysis of ~”, “search of ~”, “extraction of ~”, “generation of ~”, “regeneration of ~”, “comparison of ~”, “collation of ~”, “update of ~”, “setting of ~”, “registration of ~”, “selection of ~”, “input of ~”, “output of ~”, and the like.
The “~ file” is stored in a storage medium such as a disk or memory.
The information, data, signal values, variable values, and parameters stored in a storage medium such as a disk or memory are read into the main memory or cache memory by the CPU 911 via a read/write circuit.
The information, data, signal values, variable values, and parameters that have been read are then used for CPU operations such as extraction, search, reference, comparison, operation, calculation, processing, editing, output, printing, and display.
During these CPU operations, the information, data, signal values, variable values, and parameters are temporarily stored in the main memory, registers, cache memory, buffer memory, and the like.
The arrows in the flowcharts described in Embodiments 1 to 3 mainly indicate the input and output of data and signals.
Data and signal values are recorded in storage media such as the memory of the RAM 914, the flexible disk of the FDD 904, the compact disk of the CDD 905, the magnetic disk of the magnetic disk device 920, and other optical disks, mini disks, and DVDs.
Data and signals are also transmitted online via the bus 912, signal lines, cables, and other transmission media.

What is described as “~ unit” in the description of Embodiments 1 to 3 may be “~ circuit”, “~ device”, or “~ equipment”, and may also be “~ step”, “~ procedure”, or “~ process”.
That is, the “information processing method” according to the present invention can be realized by the steps, procedures, and processes shown in the flowcharts described in Embodiments 1 to 3.
What is described as “~ unit” may also be realized by firmware stored in the ROM 913.
Alternatively, it may be implemented by software only, by hardware only (such as elements, devices, boards, and wiring), by a combination of software and hardware, or further in combination with firmware.
Firmware and software are stored as programs in storage media such as magnetic disks, flexible disks, optical disks, compact disks, mini disks, and DVDs.
The programs are read and executed by the CPU 911.
That is, the programs cause a computer to function as the “~ unit” of Embodiments 1 to 3. Alternatively, they cause a computer to execute the procedures and methods of the “~ unit” of Embodiments 1 to 3.

As described above, the query conversion device 100 shown in Embodiments 1 to 3 is a computer that includes a CPU as a processing device; a memory, a magnetic disk, and the like as storage devices; a keyboard, a mouse, a communication board, and the like as input devices; and a display device, a communication board, and the like as output devices.
The functions shown as “~ unit” are, as described above, realized using these processing devices, storage devices, input devices, and output devices.

100: query conversion device, 101: query statement, 102: converted query statement, 110: query statement input unit, 120: query analysis unit, 130: subquery statement generation unit, 140: parallel processing instruction unit, 150: query statement regeneration unit, 160: narrowing process adding unit, 200: query issuing device, 201: query statement generation unit, 300: database server device, 400: data storage device, 410: data, 420: catalog information, 430: operation cost information, 440: encryption method information, 450: sensitivity information, 500: network, 600: slave device.

Claims

An information processing apparatus comprising:
a query statement input unit that inputs a query statement requesting a database server device to extract, from a search target table to be searched, records that match a combination of a plurality of search conditions;
a search condition classification unit that determines, for each search condition of the input query statement input by the query statement input unit, whether or not the execution cost at the time of search execution is equal to or greater than a threshold, classifies search conditions whose execution cost is less than the threshold into a first search condition category, and classifies search conditions whose execution cost is equal to or greater than the threshold into a second search condition category; and
a query statement conversion unit that converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process that extracts, from the search target table, records that match the combination of search conditions classified into the first search condition category, and a second extraction process that extracts, from the records extracted by the first extraction process, records that match the combination of search conditions classified into the second search condition category.

The information processing apparatus according to claim 1, wherein the query statement conversion unit
converts the input query statement to generate a first query statement that requests the database server device to execute the first extraction process, and
further converts the input query statement to generate a second query statement that includes the first query statement and requests the database server device to execute the second extraction process on the records extracted by executing the first query statement.

The information processing apparatus according to claim 2, wherein the query statement conversion unit
outputs the generated second query statement to the database server device that manages the search target table.

The information processing apparatus according to claim 2, wherein the query statement conversion unit,
using a function that requests the database server device to execute an arbitrary query specified by a character string argument,
generates the second query statement so that it includes the first query statement as the character string argument of the function.

The information processing apparatus according to claim 3, further comprising
a parallel processing control unit that instructs the query statement conversion unit to include, in the second query statement, control information for dividing the plurality of records extracted by the first extraction process into a plurality of blocks each composed of one or more records and for executing the second extraction process on the plurality of blocks in parallel.

The information processing apparatus according to claim 5, wherein
the database server device manages two or more slave devices, and
the parallel processing control unit
instructs the query statement conversion unit to include, in the second query statement, control information for causing the database server device to execute the first extraction process and causing the two or more slave devices, under the management of the database server device, to execute the second extraction process on the plurality of blocks in parallel.

The information processing apparatus according to claim 1, wherein the query statement conversion unit
defines a narrowing process that, without affecting the processing result of the second extraction process, narrows the records targeted by the second extraction process down to fewer than the records extracted by the first extraction process, and
converts the input query statement to generate a query statement that requests execution of the first extraction process, the narrowing process, and a second extraction process that extracts, from the records narrowed down by the narrowing process, records that match the combination of search conditions classified into the second search condition category.

The information processing apparatus according to claim 7, wherein the query statement conversion unit
designates one of the columns included in the search target table as a narrowing process column used for the narrowing process, and
defines a narrowing process that narrows down the records targeted by the second extraction process based on the data values described in the narrowing process column.

The information processing apparatus according to claim 8, wherein the query statement conversion unit,
when the search target table includes an encrypted column, which is a column whose data values are encrypted, and the encrypted column is classified into the second search condition category,
determines the content of the narrowing process according to the encryption strength of the encrypted column classified into the second search condition category.

The information processing apparatus according to claim 8, wherein the query statement conversion unit,
when the search target table includes a sensitivity setting column, which is a column for which a sensitivity is set, and the sensitivity setting column is classified into the second search condition category,
determines the content of the narrowing process according to the sensitivity set for the sensitivity setting column classified into the second search condition category.

The information processing apparatus according to claim 1, further comprising
a query statement generation unit that generates a query statement,
wherein the query statement input unit
inputs the query statement generated by the query statement generation unit.

An information processing method comprising:
a query statement input step in which a computer inputs a query statement requesting a database server device to extract, from a search target table to be searched, records that match a combination of a plurality of search conditions;
a search condition classification step in which the computer determines, for each search condition of the input query statement input in the query statement input step, whether or not the execution cost at the time of search execution is equal to or greater than a threshold, classifies search conditions whose execution cost is less than the threshold into a first search condition category, and classifies search conditions whose execution cost is equal to or greater than the threshold into a second search condition category; and
a query statement conversion step in which the computer converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process that extracts, from the search target table, records that match the combination of search conditions classified into the first search condition category, and a second extraction process that extracts, from the records extracted by the first extraction process, records that match the combination of search conditions classified into the second search condition category.

A program for causing a computer to execute:
a query statement input step of inputting a query statement requesting a database server device to extract, from a search target table to be searched, records that match a combination of a plurality of search conditions;
a search condition classification step of determining, for each search condition of the input query statement input in the query statement input step, whether or not the execution cost at the time of search execution is equal to or greater than a threshold, classifying search conditions whose execution cost is less than the threshold into a first search condition category, and classifying search conditions whose execution cost is equal to or greater than the threshold into a second search condition category; and
a query statement conversion step of converting the input query statement to generate a query statement requesting the database server device to execute a first extraction process that extracts, from the search target table, records that match the combination of search conditions classified into the first search condition category, and a second extraction process that extracts, from the records extracted by the first extraction process, records that match the combination of search conditions classified into the second search condition category.