JP5800720B2 - Information processing apparatus, information processing method, and program - Google Patents

Information processing apparatus, information processing method, and program

Info

Publication number
JP5800720B2
Authority
JP
Japan
Prior art keywords
query
search
query statement
search condition
extraction process
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2012011738A
Other languages
Japanese (ja)
Other versions
JP2013152512A (en
JP2013152512A5 (en
Inventor
秀哉 柴田
田村 孝之
孝之 田村
Original Assignee
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社
Priority to JP2012011738A
Publication of JP2013152512A
Publication of JP2013152512A5
Application granted
Publication of JP5800720B2
Application status is Active
Anticipated expiration

Description

  The present invention relates to a technique for efficiently searching a database.

In a database management system, the execution time of a query statement issued for data reference (corresponding to an SQL SELECT statement) or data update (corresponding to an SQL UPDATE statement) generally depends heavily on the evaluation order of the multiple search conditions (hereinafter also referred to as “conditions” or “conditional expressions”) described in the selection process of the query statement (corresponding to the SQL WHERE clause).
When a described conditional expression can be replaced with another equivalent conditional expression, the execution time of the query can be shortened by selecting the candidate with the lowest execution cost from among the equivalent candidates.

The execution cost is the amount of resource consumption required for the database search process, and is, for example, the number of disk accesses or the load on the CPU (Central Processing Unit).
In general, the lower the execution cost, the shorter the execution time of the query.

Many techniques relating to such query execution plan optimization have been studied.
For example, Patent Document 1 discloses a technique in which, when the database management system performs a reference using an external function, all rewrites that yield the same result are considered for a condition described by one or more external function references, and execution plan optimization is realized based on execution cost evaluation.

Patent Document 1: JP-A-7-141236

On the other hand, if the execution cost of an operation or function forming a certain conditional expression is sufficiently high compared with that of other operations or functions, increasing the degree of parallelism when evaluating that condition and processing multiple records simultaneously is effective for speeding up query execution.
However, since there is overhead due to parallelization, it is often advantageous in terms of execution speed not to parallelize a process with low execution cost.
Thus, the degree of parallelism needs to be chosen differently for high-cost processing and low-cost processing.

However, within conventional execution plan optimization techniques, it has been difficult, depending on the database management system used, to incorporate a parallel mechanism that processes multiple records simultaneously.

As an example, consider a case where the open source PostgreSQL 9.0 (hereinafter simply “PostgreSQL”) is used as the database management system.
When the method of Patent Document 1 is simply applied, the WHERE clause, which corresponds to the selection process of the SQL statement used for the query, is rewritten into the optimal condition.
However, in PostgreSQL, evaluation of the WHERE clause is executed as sequential processing in units of records. Therefore, unless the PostgreSQL body is modified, a parallel mechanism for simultaneously processing multiple records cannot easily be incorporated.
As a result, the degree of parallelism cannot be chosen differently for high-cost processing and low-cost processing.

The main object of the present invention is to solve the above-described problem: to choose the degree of parallelism in the extraction process according to the execution cost of each search condition, and thereby shorten the execution time of the query.

An information processing apparatus according to the present invention includes:
a query statement input unit that inputs a query statement requesting a database server device to extract, from a search target table, records that match a combination of a plurality of search conditions;
a search condition classification unit that determines, for each search condition of the input query statement input by the query statement input unit, whether or not the execution cost at the time of search execution is greater than or equal to a threshold, classifies search conditions whose execution cost is less than the threshold into a first search condition category, and classifies search conditions whose execution cost is greater than or equal to the threshold into a second search condition category; and
a query statement conversion unit that converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process, which extracts from the search target table the records that match the combination of search conditions classified into the first search condition category, and a second extraction process, which extracts from the records extracted by the first extraction process the records that match the combination of search conditions classified into the second search condition category.

According to the present invention, the search conditions included in the query statement are classified into the first search condition category and the second search condition category based on the execution cost at the time of search execution, and the extraction process for the first search condition category is distinguished from the extraction process for the second search condition category. Therefore, the degree of parallelism can be chosen according to the execution cost, and the execution time of the query can be shortened.

FIG. 1 is a diagram illustrating an example of the overall configuration of a database management system according to Embodiment 1.
FIG. 2 is a diagram illustrating a configuration example of a data storage device according to Embodiment 1.
FIG. 3 is a diagram illustrating a configuration example of a query conversion device according to Embodiment 1.
FIG. 4 is a flowchart illustrating an operation example of the query conversion device according to Embodiment 1.
FIG. 5 is a diagram illustrating an example of query statement regeneration according to Embodiment 1.
FIG. 6 is a diagram illustrating an example of query statement regeneration according to Embodiment 1.
FIG. 7 is a diagram illustrating an example of query statement regeneration according to Embodiment 1.
FIG. 8 is a diagram illustrating an operation example of a selection process execution function according to Embodiment 1.
FIG. 9 is a diagram illustrating an example of the overall configuration of a database management system according to Embodiment 2.
FIG. 10 is a diagram illustrating a configuration example of a data storage device according to Embodiment 3.
FIG. 11 is a diagram illustrating a configuration example of a query conversion device according to Embodiment 3.
FIG. 12 is a flowchart illustrating an operation example of the query conversion device according to Embodiment 3.
FIG. 13 is a diagram illustrating a table definition example of an encrypted database according to Embodiment 3.
FIG. 14 is a diagram illustrating an example of query statement regeneration according to Embodiment 3.
FIG. 15 is a diagram illustrating an operation example of the database server device according to Embodiment 1.
FIG. 16 is a diagram illustrating parallel processing by the slave devices according to Embodiment 1.
FIG. 17 is a diagram illustrating a hardware configuration example of the query conversion device according to Embodiments 1 to 3.

Embodiment 1
In the first to third embodiments, a query conversion apparatus that achieves both optimization of an execution plan and parallelization for simultaneously processing a plurality of records will be described.
In the query conversion devices in the first to third embodiments, both input and output are query statements.
By incorporating processing instructions for achieving both execution plan optimization and parallelism at the query statement level, it is possible to obtain the effect of shortening the query execution time without modifying the existing database management system.
Specifically, among the conditions described as the query statement selection process, low-cost conditions are preferentially evaluated, the narrowing process is made more efficient, and only high-cost search conditions are parallelized in units of records. As a result, the execution time of the query can be shortened.

In the following description, an example in which the query conversion device is applied to a database management system will be given.
In the first and second embodiments, the general configuration and operation of the query conversion apparatus will be described without specifically defining an application that uses a database.
A specific application is described in Embodiment 3.

FIG. 1 shows a configuration example of a database management system to which the query conversion apparatus according to this embodiment is applied.
The query issuing device 200 (corresponding to a database client) is connected to the query conversion device 100 through the network 500.
The query conversion device 100 is connected to the database server device 300, and the database server device 300 is connected to the data storage device 400.
In addition, a plurality of slave devices 600 are connected to the database server device 300.
Each slave device 600 may be a computer external to the database server device 300, or may be an individual process executed by the database server device 300 when the database server device 300 supports multiprocessing.
In the example of FIG. 1, three computers outside the database server device 300 are set as slave devices 600.
The slave device 600 can refer to various data in the data storage device 400 under the management of the database server device 300.

The query issuing device 200 generates a query statement in response to a user request and issues it to the database server device 300.
Examples of the database language for describing the query sentence include standard SQL and the SQL language that is uniquely extended by each database management system.

The query conversion device 100 receives the query statement issued by the query issuing device 200 and, when the query statement includes a selection process (corresponding to the SQL WHERE clause), converts the query statement into the target form and issues it to the database server device 300.
If the query statement does not include a selection process, the query statement is issued to the database server device 300 without being converted.
The query conversion device 100 corresponds to an example of an information processing apparatus.

The database server device 300 analyzes the query sentence received from the query conversion device 100 and executes a query (search for data in the data storage device 400).
When executing the query, if necessary, various data stored in the data storage device 400 are referred to or changed.

FIG. 2 shows details of the data storage device 400.
The data storage device 400 stores various data including at least data 410, catalog information 420, and calculation cost information 430.
Data 410 refers to data itself that does not include management information in the database management system.
The catalog information 420 stores data including database catalog information.
The calculation cost information 430 stores data including execution cost information of calculations and functions defined in the database management system.
By referring to the calculation cost information 430, the query conversion device 100 and the database server device 300 can estimate the execution cost of each operation described in the query.

Next, a detailed configuration of the query conversion apparatus 100 will be described with reference to FIG.
The query conversion apparatus 100 includes a query statement input unit 110, a query analysis unit 120, a sub-query statement generation unit 130, a parallel processing instruction unit 140, and a query statement regeneration unit 150.

The query statement input unit 110 receives the query statement 101 issued by the query issuing device 200 via the network 500 and inputs it to the query conversion device 100.
The query statement 101 is a message requesting to extract a record that matches a combination of a plurality of search conditions from a table (search target table) to be searched in the data 410 of the data storage device 400.

The query analysis unit 120 analyzes the input query statement 101 and acquires information necessary for query conversion.
The query analysis unit 120 refers to management information such as catalog information 420 and calculation cost information 430 as necessary.
When the query statement 101 does not include the selection process, the query statement 101 is output as a converted query statement 102 without being converted.

When the query statement 101 includes a selection process, the subquery statement generation unit 130 generates, based on the analysis result of the query analysis unit 120, a subquery statement (an example of a first query statement) formed only from the low-cost processing that remains after the high-cost processing is excluded from the selection process.
The subquery statement generation unit 130 refers to management information such as the catalog information 420 and the calculation cost information 430 as necessary.

More specifically, the subquery statement generation unit 130 determines, for each search condition of the query statement 101, whether or not the execution cost at the time of search execution is equal to or greater than a threshold. It classifies search conditions whose execution cost is less than the threshold as low-cost search conditions (an example of the first search condition category) and search conditions whose execution cost is equal to or greater than the threshold as high-cost search conditions (an example of the second search condition category).
In the following, low-cost search conditions are referred to as low-cost processing, and high-cost search conditions are referred to as high-cost processing.
In addition, the subquery statement generation unit 130 converts the query statement 101 to generate a subquery statement that requests execution of the first extraction process, which extracts from the search target table the records that match the combination of search conditions classified as low-cost processing.
The subquery statement generation unit 130 corresponds to an example of a search condition classification unit.
Together with the query statement regeneration unit 150 described later, the subquery statement generation unit 130 also corresponds to an example of a query statement conversion unit.
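
The threshold-based classification can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the embodiment: the `Condition` type, the `classify` function, and the threshold value 350 are hypothetical, and the costs simply follow the 100 * i example used later in this embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    text: str  # conditional expression, e.g. "a_1 op_1 b_1"
    cost: int  # estimated execution cost (arbitrary units)


def classify(conditions, threshold):
    """Return (low_cost, high_cost); a cost equal to the threshold counts as high."""
    low = [c for c in conditions if c.cost < threshold]
    high = [c for c in conditions if c.cost >= threshold]
    return low, high


# Six AND-connected conditions whose cost is 100 * i (assumed example values).
conditions = [Condition(f"a_{i} op_{i} b_{i}", 100 * i) for i in range(1, 7)]
low, high = classify(conditions, threshold=350)
print([c.text for c in low])
print([c.text for c in high])
```

With these assumed values, the first three conditions fall into the low-cost category and the last three into the high-cost category.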

The parallel processing instruction unit 140 outputs an instruction to parallelize each of the zero or more high-cost processes removed by the subquery statement generation unit 130.
More specifically, the parallel processing instruction unit 140 divides the plurality of records extracted by the first extraction process into a plurality of blocks, each composed of one or more records. It then instructs that the second extraction process, described later, be executed on the blocks in parallel.
The parallel processing instruction unit 140 corresponds to an example of a parallel processing control unit.

The query statement regeneration unit 150 regenerates a query statement (an example of a second query statement) in which the output of a function (the selection process execution function) is used as the input to the subsequent process that receives the result of the selection process in the query statement 101. The function takes as inputs the subquery statement generated by the subquery statement generation unit 130 and the zero or more high-cost processes removed by the subquery statement generation unit 130, and outputs a set of data sufficient to execute that subsequent process.
That is, the query statement regeneration unit 150 converts the query statement 101 to generate a query statement that includes the subquery statement generated by the subquery statement generation unit 130 and that requests execution of the second extraction process, which extracts, from the records extracted by executing the subquery statement (the first extraction process), the records that match the combination of search conditions classified as high-cost processing.
Then, the query statement regenerating unit 150 outputs the regenerated query statement as the converted query statement 102.
A specific method for generating the converted query statement 102 will be described in the description of the operation of the query conversion apparatus 100.
The query statement regeneration unit 150 corresponds to an example of a query statement conversion unit together with the sub-query statement generation unit 130 described above.

  Next, the operation of the query conversion apparatus 100 will be described with reference to FIG.

When the query statement 101 arrives at the query conversion device 100, the query statement input unit 110 receives the query statement 101 (S110).
The query analysis unit 120 analyzes the query statement 101 (S120).
As a result of the analysis, when the query statement 101 includes a selection process (corresponding to the SQL WHERE clause) (YES in S130), the subquery statement generation unit 130 generates a subquery statement formed only from low-cost processing, with the high-cost processing excluded from the selection process (S140).
If the query statement 101 does not include a selection process (NO in S130), the query analysis unit 120 outputs the converted query statement 102 without converting the query statement 101 (S180).

If the system is set in advance to perform parallel processing for the zero or more high-cost processes removed in process S140 (YES in S150), the parallel processing instruction unit 140 outputs to the database server device 300 an instruction to parallelize each of those high-cost processes (S160).
If parallelization of high-cost processing is not set (NO in S150), process S160 is skipped and processing proceeds to process S170.

The query statement regeneration unit 150 receives the subquery statement generated in process S140, the zero or more high-cost processes, and the parallel processing instruction output in process S160, regenerates the query statement (S170), and outputs the regenerated query statement as the converted query statement 102 (S180).

The query sentence analysis technique applied in step S120 can be realized by a known technique or a trivial extension thereof.
For example, in the open source database management system PostgreSQL, all source codes including SQL analysis processing are disclosed.

In process S130, in the case of SQL, the query statements that include a selection process include at least SELECT statements and UPDATE statements that contain a WHERE clause.

  The operation of process S140 will be described with a specific example. Consider the following SQL statement (Equation 1).

Here, op_i (i = 1, 2,..., N) is an appropriate binary operator.
What is important is that n conditional expressions are connected by AND.
Furthermore, it is assumed that an execution cost is set for each op_i.
The execution cost is stored in the calculation cost information 430 and is referred to by the query analysis unit 120 or the subquery statement generation unit 130.
Here, for simplicity, it is assumed that the execution cost (100 * i) is set in the operation op_i. That is, the larger i is, the higher the execution cost of the operation op_i is, and the longer the execution time is.

At this time, it is assumed that the threshold value θ is set in advance by a system administrator or the like.
This threshold value θ is a criterion for separating high cost processing and low cost processing in the subquery generation unit 130.
Assuming that the operations whose execution cost is less than the threshold value θ are op_i (i = 1, 2, ..., K), the subquery statement generation unit 130 outputs the following subquery statement SubQry, formed only from low-cost processing (Equation 2).

Since the purpose of generating the subquery statement is to improve the efficiency of the selection process, the conditions forming the subquery statement must be joined by logical product (AND) in the original query statement 101.
Further, so that the high-cost processing removed when generating the subquery statement and the projection onto the column c can be executed later, the columns used in the high-cost processing and the column c must all be included in the select list of the subquery statement.
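
As a rough sketch of how such a subquery statement could be assembled, the following Python builds the text of SubQry from conditions that have already been classified. It assumes the original query has the form `SELECT c FROM t WHERE ... AND ...`; the table name `t`, the dictionary layout, and the helper `build_subquery` are illustrative assumptions, and the right-hand sides b_i are assumed to be constants rather than columns.

```python
def build_subquery(table, projected, low, high):
    """Build a SELECT that applies only the low-cost conditions.

    The select list carries the projected columns plus every column that
    the removed high-cost conditions still need.
    """
    select_cols = list(projected)
    for cond in high:
        for col in cond["columns"]:
            if col not in select_cols:
                select_cols.append(col)
    sql = f"SELECT {', '.join(select_cols)} FROM {table}"
    where = " AND ".join(cond["text"] for cond in low)
    return f"{sql} WHERE {where}" if where else sql


low = [
    {"text": "a_1 op_1 b_1", "columns": ["a_1"]},
    {"text": "a_2 op_2 b_2", "columns": ["a_2"]},
]
high = [{"text": "a_3 op_3 b_3", "columns": ["a_3"]}]
sub_qry = build_subquery("t", ["c"], low, high)
print(sub_qry)
```

The column a_3 appears in the select list only because a high-cost condition that was removed from the WHERE clause still has to be evaluated on it afterwards.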

The example according to Equation 1 is the most basic case, but various applications can be considered from here.
For example, since a general logical expression can be expressed as a logical product of clauses, as in conjunctive normal form, the following general form can be assumed (Equation 3).

Here, cond_i (i = 1, 2,..., N) is an appropriate logical expression.
In Equation 3, since the operator forming each cond_i is not necessarily a single operator, the execution cost of each logical expression is defined as the highest execution cost among the operators forming that cond_i. With this definition, the same subquery generation method as in the case of Equation 1 can be applied.
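
The cost rule for a compound logical expression reduces to taking a maximum. A small sketch, in which the operator names and the cost table are made-up values rather than anything read from the calculation cost information 430:

```python
def condition_cost(operators, operator_cost):
    """Cost of a logical expression = highest cost among its operators."""
    return max(operator_cost[op] for op in operators)


operator_cost = {"op_1": 100, "op_2": 200, "op_5": 500}  # assumed costs
cond_cost = condition_cost(["op_1", "op_5"], operator_cost)
print(cond_cost)
```

Under an assumed threshold of 350, this cond_i would be treated as high-cost processing even though one of its operators is cheap.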

  As a special case of Expression 3, there is a case where a logical expression that forms a high-cost process and a logical expression that forms a low-cost process are combined by a logical product (Expression 4).

  In Equation 4, high_cost_part and lower_cost_part are appropriate logical expressions, and lower_cost_part corresponds to processing with a low execution cost. The subquery sentence SubQry corresponding to Equation 4 is as shown in Equation 5.

  Here, high_select_list refers to a column group necessary for executing the high cost processing high_cost_part.

Note that Equations (1), (3), and (5) are all examples of the SELECT statement, but the case of the UPDATE statement can be considered similarly.
Here, an example of the UPDATE statement corresponding to Equation 1 is given (Equation 6).

  The subquery sentence SubQry corresponding to Expression 6 is as shown in Expression 7.

Here, PRIMARYKEY indicates a primary key in the table.
The reason for selecting the primary key in the subquery in the case of the UPDATE statement will be described later.
Note that the purpose of subquery generation is to improve the narrowing efficiency of the selection process, so that even if the original query is an UPDATE statement, the subquery is a SELECT statement.

Next, the operation of process S170 will be described with a specific example.
In process S170, the query statement regeneration unit 150 regenerates a query statement in which the output of a function (the selection process execution function) is used as the input to the subsequent process that receives the result of the selection process in the query statement 101. The function takes as inputs the subquery statement generated in process S140 and the logical expressions forming the zero or more high-cost processes removed in process S140, and outputs a set of data sufficient to execute that subsequent process.

Here, a specific example of “the subsequent process that receives the result of the selection process as input” will be described. In the case of the SELECT statement expressed by Equation 1, this subsequent process is the projection onto the column c.
Further, in the case of the UPDATE statement expressed by Equation 6, the subsequent processing indicates update processing of the column c expressed by “c = c * 1.1”.

FIG. 5 shows a specific example of SELECT statement regeneration.
The example of FIG. 5 is an example using Equation 1, and setting of cost information and the like are the same as those of Equation 1.
In the converted query statement, Func appearing in the FROM clause corresponds to the above-described selection process execution function.
The arguments of the function Func are the sub-query sentence SubQry generated in the process S140 and the high cost process group removed in the process S140.
The return value of the function Func is a data set of selection results including the column c necessary for the subsequent projection processing.

FIG. 6 shows another specific example of SELECT statement regeneration.
The example of FIG. 6 is an example in which the parallel processing instruction output in step S160 is given as an argument to the selection processing execution function Func.
The argument “0” that follows the high-cost process “a_N op_N b_N” is a flag indicating whether or not that process is a target for parallel processing.
As a modification, the degree of parallelism can instead be passed as a natural number.
Alternatively, instead of explicitly passing a parallel processing instruction as an argument as in FIG. 6, a system administrator or the like may configure given high-cost processes to be implicitly treated as parallelization targets.
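
Because the figures are not reproduced here, the following Python is only a guess at the shape of the regenerated statement. It places a call to the selection process execution function in the FROM clause and passes the subquery text, each high-cost condition, and an optional flag after each condition. The argument order, the quoting of arguments as SQL string literals, and the helper names are assumptions.

```python
def sql_literal(text):
    """Quote text as an SQL string literal (single quotes doubled)."""
    return "'" + text.replace("'", "''") + "'"


def regenerate_select(projected, subquery, high_conditions, parallel_flags=None):
    """Build 'SELECT ... FROM Func(subquery, cond[, flag], ...)'."""
    args = [sql_literal(subquery)]
    for i, cond in enumerate(high_conditions):
        args.append(sql_literal(cond))
        if parallel_flags is not None:
            # Integer flag placed right after its condition.
            args.append(str(int(parallel_flags[i])))
    return f"SELECT {', '.join(projected)} FROM Func({', '.join(args)})"


sub_qry = "SELECT c, a_3 FROM t WHERE a_1 op_1 b_1"
plain = regenerate_select(["c"], sub_qry, ["a_3 op_3 b_3"])
flagged = regenerate_select(["c"], sub_qry, ["a_3 op_3 b_3"], parallel_flags=[0])
print(plain)
print(flagged)
```

Passing the degree of parallelism as a natural number, as in the modification mentioned above, would only change the value written after each condition.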

FIG. 7 shows a specific example of UPDATE statement regeneration. The example shown in the figure is an example using Equation 6, and the setting of cost information and the like are the same as those in Equation 1.
Here, PRIMARYKEY represents a primary key in the table.
The basic idea is the same as in the SELECT statement. However, an UPDATE statement has the restriction that the table on which the selection process is executed and the table to be updated are the same table, which becomes a problem here.
This problem is solved by joining the temporary table (sub_t in FIG. 7) generated as a selection result by the function Func and the table to be updated (t in FIG. 7) on the primary key.
Consequently, this method requires that the table to be updated have a primary key or an equivalent column.
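
One way such a primary-key join could be written is the `UPDATE ... FROM` form that PostgreSQL accepts. The sketch below only builds the statement text. The alias `sub_t` and the table `t` follow the description of FIG. 7 above, while the key column name `id`, the text of the `Func` call, and the helper itself are hypothetical; depending on how Func is declared, a column definition list may also be needed after the alias.

```python
def regenerate_update(table, set_clause, func_call, key, alias="sub_t"):
    """Build an UPDATE that joins the target table to Func's result on the key."""
    return (
        f"UPDATE {table} SET {set_clause} "
        f"FROM {func_call} AS {alias} "
        f"WHERE {table}.{key} = {alias}.{key}"
    )


stmt = regenerate_update(
    table="t",
    set_clause="c = t.c * 1.1",
    func_call="Func('SELECT id, a_3 FROM t WHERE a_1 op_1 b_1', 'a_3 op_3 b_3')",
    key="id",
)
print(stmt)
```

Only the rows of `t` whose key also appears in the selection result are updated, which is why the subquery statement has to select the primary key.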

The selection process execution function Func can be realized as a user-defined function created in an appropriate programming language.
In many database management systems, functions and operations defined by users can be incorporated into a database language and used.
For example, in the open source database management system PostgreSQL and the commercial database Oracle Database 11g, a user-defined function written in C can output a computed data set of selection results as a temporary table, so the present embodiment can be applied.

The operation of the selection process execution function will be described with reference to FIG.
The selection process execution function is called by the database server device 300 at the time of query execution.
When activated, the selection process execution function calls back the database server device 300 that invoked it, has it execute the input subquery statement, and receives the result (S210).
If any of the input high-cost processes is still unprocessed (YES in S220), one unprocessed high-cost process is selected and executed, and the result is received (S230).
At this time, if the parallel processing instruction output in process S160 has also been input, the process is executed with the degree of parallelism raised to match the instruction.
If no unprocessed high-cost process remains (NO in S220), the result is output and the process is completed (S240).

In process S210, for example, in the case of PostgreSQL, a user-defined function written in C can call the PostgreSQL server and exchange a data set through an interface called SPI (Server Programming Interface).
In Oracle (registered trademark) Database 11g which is a commercial database, the same operation can be performed using OCI (Oracle (registered trademark) Call Interface).

As a method of selecting one unprocessed high-cost process in process S230, there is a method of selecting in ascending order of execution cost.
By doing so, the narrowing efficiency can be improved.
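
The flow S210 to S240, with the high-cost processes taken in ascending order of cost, can be imitated in Python. This is a sketch under stated assumptions: an in-memory `sqlite3` database stands in for the database server device 300 (the embodiment itself names SPI and OCI), Python callables stand in for the high-cost operators, and the parallel processing instruction is left out.

```python
import sqlite3


def selection_func(conn, subquery, high_cost):
    """high_cost: list of (cost, predicate) pairs; predicate takes a row dict."""
    # S210: ask the database to run the low-cost subquery and fetch the rows.
    cur = conn.execute(subquery)
    names = [d[0] for d in cur.description]
    rows = [dict(zip(names, r)) for r in cur.fetchall()]
    # S220/S230: apply the remaining high-cost processes, cheapest first.
    for _cost, predicate in sorted(high_cost, key=lambda pair: pair[0]):
        rows = [row for row in rows if predicate(row)]
    # S240: output the surviving rows.
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c TEXT, a_1 INTEGER, a_2 INTEGER, a_3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("c_1", 5, 10, 1), ("c_2", 1, 30, 1), ("c_3", 7, 30, 0), ("c_4", 9, 30, 1)],
)
result = selection_func(
    conn,
    "SELECT c, a_2, a_3 FROM t WHERE a_1 > 3 ORDER BY c",
    [(500, lambda row: row["a_3"] == 1), (400, lambda row: row["a_2"] >= 30)],
)
print([row["c"] for row in result])
```

Running the cheaper predicate first means the more expensive one only sees the rows that survived it, which is the narrowing effect described above.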

Here, a specific example of an operation performed by the database server apparatus 300 that has input the converted query statement 102 will be described with reference to FIGS. 15 and 16.
FIG. 15 shows an outline of a search target table to be searched by the database server device 300.
FIG. 16 illustrates parallel processing of the slave device 600 with respect to a record extracted by the database server device 300 executing the sub-query sentence SubQry.

In FIG. 15, a_1, a_2, a_K, a_(K+1), a_(K+2), and a_N indicate column names (that is, K = 3 and N = 6 in the example of FIG. 15) and correspond to conditional expressions such as (a_1 op_1 b_1).
Also, c in FIG. 15 indicates the column name.
c_1, c_2, c_3, etc. are the values of the column c.
c_1, c_2, c_3, etc. are values that can identify each record.
Furthermore, “X” in FIG. 15 indicates that the search condition is met.
For example, in the c_1 record of FIG. 15, “X” is shown in the column a_1, which means that the c_1 record matches the search condition (a_1 op_1 b_1).
These are the same in FIG.

In the database server device 300, the sub-query sentence SubQry of FIG. 5 is executed.
In the example of FIG. 15, the database server device 300 extracts, for each record that matches the combination of the low-cost search conditions (a_1 op_1 b_1), (a_2 op_2 b_2), and (a_K op_K b_K), the value of the column c and the values of the columns a_(K+1), a_(K+2), and a_N in that record (first extraction process).
As a result of executing the subquery statement SubQry, the records indicated by arrows in the figure are extracted.

When parallel processing is instructed by the parallel processing instructing unit 140 (YES in S150 of FIG. 4), the database server device 300 causes the plurality of slave devices 600 to execute the extraction process for the high-cost search conditions in parallel.
For example, the database server device 300 divides the record group obtained by executing the sub-query sentence SubQry into blocks of a predetermined size, outputs the blocks to the plurality of slave devices 600, and causes the plurality of slave devices 600 to execute the extraction process on the blocks in parallel.
In the example of FIG. 16, the database server device 300 divides the record group obtained by executing the sub-query sentence SubQry in units of two records, and outputs one block of two records to each of the three slave devices 600.
Each slave device 600 extracts the records that match the combination of the high-cost search conditions (a_(K+1) op_(K+1) b_(K+1)), (a_(K+2) op_(K+2) b_(K+2)), and (a_N op_N b_N) (second extraction process), and outputs them to the database server device 300.
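
The block-wise parallel execution described above can be sketched as follows. This is a minimal illustration, not the patented implementation: worker threads stand in for the slave devices 600, and the record values, the concrete high-cost conditions, and the function names `split_into_blocks`, `second_extraction`, and `parallel_second_extraction` are all assumptions introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Divide the records left by the first extraction into fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def second_extraction(block, high_cost_conditions):
    """Keep the records of one block that satisfy every high-cost condition."""
    return [r for r in block if all(cond(r) for cond in high_cost_conditions)]


def parallel_second_extraction(records, high_cost_conditions, block_size=2, workers=3):
    """Run the second extraction on each block in parallel and merge the results."""
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in block order, so record order is preserved.
        per_block = list(pool.map(lambda b: second_extraction(b, high_cost_conditions), blocks))
    return [r for block_result in per_block for r in block_result]


# Hypothetical output of SubQry: six records carrying columns a_4..a_6 (K = 3, N = 6).
records = [
    {"c": "c_1", "a_4": 5, "a_5": "x", "a_6": 10},
    {"c": "c_2", "a_4": 1, "a_5": "x", "a_6": 10},
    {"c": "c_3", "a_4": 7, "a_5": "y", "a_6": 10},
    {"c": "c_4", "a_4": 9, "a_5": "x", "a_6": 15},
    {"c": "c_5", "a_4": 4, "a_5": "x", "a_6": 25},
    {"c": "c_6", "a_4": 8, "a_5": "x", "a_6": 0},
]

# Hypothetical stand-ins for the high-cost conditions on a_4, a_5, and a_6.
high_cost_conditions = [
    lambda r: r["a_4"] > 3,
    lambda r: r["a_5"] == "x",
    lambda r: r["a_6"] < 20,
]

matched = parallel_second_extraction(records, high_cost_conditions)
print([r["c"] for r in matched])
```

With a block size of two and three workers, the six records form three blocks, mirroring the example of FIG. 16. Because the conditions are combined with AND and each record is evaluated independently, the merged result does not depend on how the records are divided.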

In FIG. 16, all the high-cost parallel processing is executed by the slave device 600, but the database server device 300 may execute part of the parallel processing.
Further, when the database server device 300 supports multi-process, all parallel processing may be executed by the database server device 300.

  Further, in the example of FIG. 15, a table with a small number of columns has been used for simplicity of explanation. However, the query execution time can be shortened by the same procedure for a table of any size.

As described above, in the first embodiment, by performing appropriate query statement conversion on the query statement 101 input to the query conversion device 100, both optimization of the execution plan and parallelization for simultaneously processing a plurality of records can be achieved without changing the query execution result.
By incorporating processing instructions for achieving both execution plan optimization and parallelism at the query statement level, it is possible to obtain the effect of shortening the query execution time without modifying the existing database management system.
As described above, in this embodiment, the search conditions included in the query statement are classified into low-cost and high-cost conditions based on the execution cost at search execution. Because the extraction process for the low-cost search conditions is separated from the extraction process for the high-cost search conditions, the degree of parallelism can be chosen according to the execution cost, and the execution time of the query can be shortened.
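
The classification step can be sketched as below. This is a minimal sketch under stated assumptions: the `SearchCondition` type, the cost values, and the threshold of 10.0 are hypothetical, and in the described system the costs would come from the operation cost information rather than from literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchCondition:
    text: str    # the condition as written in the query, e.g. "a_1 op_1 b_1"
    cost: float  # hypothetical per-record execution cost estimate


def classify_conditions(conditions, threshold):
    """Split conditions into (low-cost, high-cost) lists around a cost threshold."""
    low_cost = [c for c in conditions if c.cost < threshold]
    high_cost = [c for c in conditions if c.cost >= threshold]
    return low_cost, high_cost


conditions = [
    SearchCondition("a_1 op_1 b_1", 1.0),
    SearchCondition("a_2 op_2 b_2", 2.0),
    SearchCondition("a_3 op_3 b_3", 1.5),
    SearchCondition("a_4 op_4 b_4", 50.0),
    SearchCondition("a_5 op_5 b_5", 10.0),
    SearchCondition("a_6 op_6 b_6", 80.0),
]

low_cost, high_cost = classify_conditions(conditions, threshold=10.0)
print([c.text for c in low_cost])
print([c.text for c in high_cost])
```

A condition whose cost equals the threshold is placed in the high-cost group, which matches the "less than the threshold" and "equal to or greater than the threshold" wording used for the two categories.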

As described above, the present embodiment has described a query conversion device that, in a database management system, converts a query statement including a selection process, the query conversion device having:
1) a query statement input unit that inputs a query statement;
2) a query analysis unit that analyzes the input query statement;
3) a sub-query statement generation unit that, based on the query analysis result and the execution cost information of operations, generates from the selection process of the query statement a sub-query sentence formed only of the low-cost processes, excluding the high-cost processes; and
4) a query statement regeneration unit that regenerates a query statement in which the output of a function (selection process execution function) is used as the input to the subsequent process, the function taking as input at least the sub-query sentence and the zero or more high-cost processes excluded by the sub-query statement generation unit, and outputting a set of data sufficient to execute the subsequent process of the query statement that takes the result of the selection process as input.

In the present embodiment, it has also been described that the query conversion device includes a parallel processing instruction unit that inputs, to the selection process execution function, an instruction for parallel processing of each of the zero or more high-cost processes excluded by the sub-query statement generation unit.

In the present embodiment, it has also been described that, in the query conversion device, the query statement is a data reference statement and the subsequent process that uses the result of the selection process as input is a projection process.

In the present embodiment, it has also been described that, in the query conversion device, the query statement is a data update statement and the subsequent process that uses the result of the selection process as input is an update process.

In addition, the present embodiment has described a query conversion method that, in a database management system, converts a query statement including a selection process, the query conversion method having:
1) a query statement input method of inputting a query statement;
2) a query analysis method of analyzing the input query statement;
3) a sub-query statement generation method of generating, based on the query analysis result and the execution cost information of operations, a sub-query sentence formed only of the low-cost processes, excluding the high-cost processes, from the selection process of the query statement; and
4) a query statement regeneration method of regenerating a query statement in which the output of a function (selection process execution function) is used as the input to the subsequent process, the function taking as input at least the sub-query sentence and the zero or more high-cost processes excluded in the sub-query statement generation method, and outputting a set of data sufficient to execute the subsequent process of the query statement that takes the result of the selection process as input.

Embodiment 2.
In this embodiment, another example in which the query conversion apparatus is applied to a database management system will be given.
The present embodiment is different from the first embodiment only in the system configuration.
Therefore, in the present embodiment, only the system configuration will be described.

FIG. 9 is a configuration diagram showing a database management system to which the query conversion apparatus according to this embodiment is applied.
The query issuing device 200a (corresponding to a database client) is connected to the database server device 300 through the network 500, and the database server device 300 is connected to the data storage device 400.
In addition, a plurality of slave devices 600 are connected to the database server device 300.
The query conversion device 100a is applied as a part of the query issuing device 200a.

In the query issuing device 200a, the query statement generating unit 201 generates a query statement in response to a user request, and before the generated query statement is issued, the query conversion device 100a converts it as necessary.
Thereafter, the query issuing device 200a receives the converted query statement from the query conversion device 100a and issues the converted query statement to the database server device 300.
Other than this point, the second embodiment is the same as the first embodiment, and a description thereof will be omitted.
In the present embodiment, the query issuing device 200a including the query statement generation unit 201 and the query conversion device 100a corresponds to an example of an information processing device.

Embodiment 3.
In the present embodiment, an example in which the query conversion apparatus is applied to a system that handles a database that stores data encrypted in column units (hereinafter referred to as “encrypted database system”) will be described.
This embodiment is realized by adding a configuration corresponding to encryption to the configuration of the first embodiment or the second embodiment.
Therefore, in the present embodiment, only differences from the first embodiment will be described.

Configuration diagrams applied to the encrypted database system are shown in FIG. 1, FIG. 10, and FIG.
The data storage device 400 shown in FIG. 10 and the narrowing processing addition unit 160 shown in FIG. 11 are different from the first embodiment.
In the present embodiment, the narrowing process adding unit 160 also corresponds to an example of a query statement converting unit.

In the data storage device 400 shown in FIG. 10, the encryption method information 440, which records the encryption method applied to each column, and the confidential information 450, which records the sensitivity set for each column, are added as additional information.
This additional information can be referred to from the query conversion apparatus 100 and the database server apparatus 300 in the same manner as other information.

The operation of the query conversion apparatus 100 to which the narrowing process adding unit 160 is added will be described with reference to FIG.
Here, processing S190 is added.

  When a narrowing process column for speeding up an operation that refers to a specific column is prepared separately, the narrowing process adding unit 160 adds to the query statement 101, based on the analysis result of the query analyzing unit 120, a process that uses the narrowing process column and does not impair the accuracy of the selection process result, and passes the query statement with the added process to the sub-query generation unit 130 as input (S190).

A specific example of adding a narrowing process will be described.
As data encryption methods, consider a case in which the three types shown in Table 1 are used selectively according to the sensitivity of each column.
The terms "deterministic encryption", "probabilistic encryption", and "searchable encryption" used below refer to classes of encryption schemes having certain properties, and do not specify individual encryption schemes.

The “deterministic cipher” in Table 1 is a scheme in which plaintext and ciphertext correspond one-to-one, and complete match comparison is possible.
This comparison can be realized with the same performance as that of the plaintext column.
Because deterministic encryption is vulnerable to frequency analysis, it is inferior to probabilistic encryption in terms of encryption strength.

The "probabilistic encryption" in Table 1 encrypts the target data with an appropriate probabilistic encryption method of high encryption strength, and therefore no operation, including exact match comparison, can be performed on the encrypted data.
Therefore, for a column to which “probabilistic encryption” is applied, consider using searchable encryption for the search.

Searchable encryption refers to an encryption method that allows data to be searched while it remains encrypted, and has been studied in works such as the following reference.
With the searchable encryption, it is possible to obtain a search result without leaking the contents of the plain text and the search word to the database server device 300.

References:
D. X. Song, D. Wagner, and A. Perrig, "Practical Techniques for Searches on Encrypted Data", IEEE Symposium on Security and Privacy, 2000.

Searchable encryption generates only a dedicated tag used for search (hereinafter referred to as "encryption tag").
At search time, a dedicated trapdoor (comparable to an encrypted search term) is issued and checked against the encryption tag.
The original plaintext cannot be restored from the encryption tag, which is additional information used only for search.
In a column to which “probabilistic encryption” is applied, a column for storing an encryption tag for use in a search is prepared separately from a column for storing encrypted data.

  The verification process between the encryption tag and the trapdoor guarantees high security strength, but it is slow and therefore corresponds to "processing with high execution cost".

Further, it is assumed that an index column for speeding up the narrowing is separately prepared for the verification process of the encryption tag and the trapdoor.
This can be realized, for example, with a coarse hash value computed from the plaintext data before encryption.
By performing the narrowing process based on the hash value as pre-processing, the number of slow verifications between encryption tags and the trapdoor can be reduced.
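
The effect of the index column can be sketched as follows. This is only an illustration under loud assumptions: the HMAC-based `make_tag`, `gen_trapdoor`, and `collate` functions are simple stand-ins and are NOT a real searchable encryption scheme, the key, the 4-bit hash width, and the sample plaintexts are hypothetical, and only the column names c_3_idx and c_3_tag are taken from the description.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key, for illustration only


def coarse_hash(plaintext, bits=4):
    """Return the top `bits` bits (1 to 8) of SHA-256 as a deliberately coarse index value."""
    return hashlib.sha256(plaintext.encode()).digest()[0] >> (8 - bits)


def make_tag(plaintext):
    """Stand-in for encryption-tag generation (not a real searchable encryption scheme)."""
    return hmac.new(KEY, plaintext.encode(), hashlib.sha256).digest()


def gen_trapdoor(word):
    """Stand-in for trapdoor generation; it matches make_tag by construction."""
    return hmac.new(KEY, word.encode(), hashlib.sha256).digest()


def collate(tag, trapdoor):
    """Stand-in for the slow verification between an encryption tag and a trapdoor."""
    return hmac.compare_digest(tag, trapdoor)


def search(rows, word, bits=4):
    """Narrow by the index column first, then collate only the remaining candidates."""
    target = coarse_hash(word, bits)
    candidates = [r for r in rows if r["c_3_idx"] == target]
    trapdoor = gen_trapdoor(word)
    matches = [r for r in candidates if collate(r["c_3_tag"], trapdoor)]
    return matches, len(candidates)


# Hypothetical table contents: 20 rows whose plaintexts are consecutive numbers.
plaintexts = [str(n) for n in range(123456780, 123456800)]
rows = [
    {"id": i, "c_3_idx": coarse_hash(p), "c_3_tag": make_tag(p)}
    for i, p in enumerate(plaintexts)
]

matches, collations = search(rows, "123456789")
print([r["id"] for r in matches], collations, "collations for", len(rows), "rows")
```

Every row whose plaintext equals the search word necessarily has the same hash value, so the narrowing step can only discard rows that the verification would have rejected anyway. The result is therefore unchanged, while the number of slow verifications drops to the number of rows sharing the hash value.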

Under the above situation, specific examples of query conversion including the narrowing process adding unit 160 are shown in FIGS.
FIG. 13 shows the definition sentence of the table to be searched in SQL.
The column c_3 is a column encrypted by probabilistic encryption, and a column c_3_tag for storing an encryption tag generated by the searchable encryption and an index column c_3_idx for narrowing down are added as search columns.

FIG. 14 shows a specific method for converting a query statement described in SQL.
In the SQL sentence before conversion in FIG. 14, the functions encrypt_dtr () and gen_trapdoor () represent encryption by a deterministic encryption method and generation of a trapdoor, respectively.
However, for simplicity, arguments such as a key are omitted here, and only the arguments relating to the data to be encrypted are described.
This omission does not affect the explanation of the present embodiment.
Further, “collate ()” is a function for executing verification processing between the encryption tag and the trapdoor.

In the subquery in FIG. 14, in addition to the low-cost processing for the columns c_1 and c_2, processing by the index column c_3_idx is added in order to narrow down the encryption tag matching processing for the column c_3.
As described above, by adding a collation process between the index column c_3_idx and the hash value (hash (123456789)) of the plaintext data before encryption, the collation between the encryption tag (c_3_tag) and the trapdoor (trapdoor (123456789)) can be sped up.
The narrowing process adding unit 160 specifies the index column c_3_idx as the narrowing process column used for the narrowing process, generates a query statement (c_3_idx = gen_hash (123456789)) that defines the narrowing process, and outputs the query statement to the sub-query generation unit 130 (S190 of FIG. 12).
The sub-query generation unit 130 generates a sub-query sentence that includes the query statement from the narrowing process adding unit 160 (S140 in FIG. 12).
Note that after generating the sub-query text, the query text may be re-generated as described in the first embodiment.
The converted SQL statement in FIG. 14 represents the regenerated query statement.
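
How the narrowing condition might be combined with the low-cost conditions when the sub-query text is assembled can be sketched as follows. This is a rough illustration only: the helper names `narrowing_condition` and `build_subquery`, the table name `t`, the selected columns, and the two low-cost conditions are hypothetical and may differ from the statements shown in FIG. 14; only c_3_idx, gen_hash, and encrypt_dtr appear in the description.

```python
def narrowing_condition(index_column, search_value):
    """Build the narrowing predicate on the index column (cf. c_3_idx = gen_hash(...))."""
    return f"{index_column} = gen_hash({search_value})"


def build_subquery(table, columns, low_cost_conditions, narrowing_conditions):
    """Combine the low-cost conditions and the added narrowing conditions with AND."""
    where = " AND ".join(list(low_cost_conditions) + list(narrowing_conditions))
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {where}"


# Hypothetical table name and low-cost conditions; FIG. 14 may use different ones.
subquery = build_subquery(
    table="t",
    columns=["c_1", "c_2", "c_3_tag"],
    low_cost_conditions=["c_1 = encrypt_dtr(1)", "c_2 = encrypt_dtr(2)"],
    narrowing_conditions=[narrowing_condition("c_3_idx", 123456789)],
)
print(subquery)
```

The sketch concatenates strings only to make the structure of the sub-query visible. A production implementation would bind search values as parameters instead of formatting them into the SQL text.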

As described above, in the present embodiment, as shown in FIG. 14, the sub-query sentence specifies the low-cost extraction process (first extraction process) for the columns c_1 and c_2 and, as the narrowing process, a collation process between the index column c_3_idx and the hash value of the plaintext data before encryption.
With this narrowing process, the records targeted by the high-cost extraction process (second extraction process) for c_3_tag can be narrowed down further than the records obtained by the extraction process (first extraction process) for the columns c_1 and c_2.
Because the extraction process (second extraction process) for c_3_tag is performed only on the records narrowed down by the narrowing process, the processing speed is increased.

When the target column is an encrypted column, the narrowing process adding unit 160 may change the content of the narrowing process to be added according to the encryption method (encryption strength) of the target column.
For example, for a column to which "probabilistic encryption" is applied, the hash used for the index column may be made coarser as the strength of the encryption method increases.
This is based on the idea that security should be prioritized over improvement of narrowing efficiency by index because a column to which a high-strength encryption method is applied is likely to have high confidentiality.
In this case, of course, when storing data, it is necessary to create index column data using a corresponding hash function.
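
One way to vary the coarseness of the index hash with the encryption strength is sketched below. The strength levels, the bit widths in `INDEX_BITS_BY_STRENGTH`, and the use of truncated SHA-256 are assumptions chosen for illustration; the description does not specify a particular hash function or mapping.

```python
import hashlib

# Hypothetical mapping: stronger encryption method -> fewer index bits (coarser hash).
INDEX_BITS_BY_STRENGTH = {1: 16, 2: 12, 3: 8}


def index_value(plaintext, strength):
    """Return an index value whose width shrinks as the encryption strength grows."""
    bits = INDEX_BITS_BY_STRENGTH[strength]
    digest = hashlib.sha256(plaintext.encode()).digest()
    # Keep only the top `bits` bits of the first 16 bits of the digest.
    return int.from_bytes(digest[:2], "big") >> (16 - bits)


# Count how many distinct index values 5000 hypothetical plaintexts produce per level.
plaintexts = [str(n) for n in range(5000)]
distinct = {
    strength: len({index_value(p, strength) for p in plaintexts})
    for strength in INDEX_BITS_BY_STRENGTH
}
print(distinct)
```

A coarser hash maps more plaintexts to the same index value, so the index reveals less about each value but leaves more candidate records for the slow verification. The same function must be used when the index column data is created at storage time and when the narrowing condition is generated at search time.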

In addition, when the target column is a sensitivity setting column, that is, a column for which a sensitivity is set, the narrowing process adding unit 160 may change the content of the narrowing process to be added according to the sensitivity set for the target column.
For example, the hash used for the index column may be made coarser as the set sensitivity is higher.
This is based on the idea that the higher the sensitivity of the column, the higher the priority should be given to security than the improvement of the narrowing efficiency by the index.
In this case, of course, when storing data, it is necessary to create index column data using a corresponding hash function.

  As described above, in the third embodiment, by performing appropriate query statement conversion on the query statement 101 input to the query conversion device 100, both optimization of the execution plan and parallelization for simultaneously processing a plurality of records can be achieved without changing the execution result of the query, and in addition, an appropriate narrowing process can be added automatically to shorten the query execution time.

As described above, in the present embodiment, it has been described that the query conversion device has a narrowing process addition unit that, when a narrowing process column for speeding up an operation that refers to a specific column is prepared separately, adds, based on the analysis result of the query analysis unit, a process that uses the narrowing process column and does not impair the accuracy of the selection process result, and that the output of the narrowing process addition unit is the input to the sub-query statement generation unit.

In the present embodiment, it has also been described that the database management system to which the query conversion device is applied handles queries to a table that stores data encrypted in column units and manages the data encryption method for each column, and that the narrowing process addition unit adds the narrowing process according to the encryption method of the column referenced by a condition described in the query.

In the present embodiment, it has also been described that the database management system to which the query conversion device is applied handles queries to a table that stores data encrypted in column units and manages the sensitivity of the data for each column, and that the narrowing process addition unit adds the narrowing process according to the sensitivity of the column referenced by a condition described in the query.

Finally, a hardware configuration example of the query conversion apparatus 100 shown in the first to third embodiments will be described.
FIG. 17 is a diagram illustrating an example of hardware resources of the query conversion apparatus 100 described in the first to third embodiments.
Note that the configuration in FIG. 17 is merely an example of the hardware configuration of the query conversion device 100; the hardware configuration of the query conversion device 100 is not limited to this configuration and may be another configuration.

In FIG. 17, the query conversion apparatus 100 includes a CPU 911 (also referred to as a central processing unit, a processing unit, an arithmetic unit, a microprocessor, a microcomputer, or a processor) that executes a program.
The CPU 911 is connected via a bus 912 to, for example, a ROM (Read Only Memory) 913, a RAM (Random Access Memory) 914, a communication board 915, a display device 901, a keyboard 902, a mouse 903, and a magnetic disk device 920, and controls these hardware devices.
Further, the CPU 911 may be connected to an FDD 904 (Flexible Disk Drive), a compact disk device 905 (CDD), a printer device 906, and a scanner device 907. Further, instead of the magnetic disk device 920, a storage device such as an SSD (Solid State Drive), an optical disk device, or a memory card (registered trademark) read / write device may be used.
The RAM 914 is an example of a volatile memory. The storage media of the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920 are an example of a nonvolatile memory. These are examples of the storage device.
A communication board 915, a keyboard 902, a mouse 903, a scanner device 907, and the like are examples of input devices.
The communication board 915, the display device 901, the printer device 906, and the like are examples of output devices.

As shown in FIG. 1, the communication board 915 is connected to a network.
For example, the communication board 915 is connected to a LAN (local area network), the Internet, a WAN (wide area network), a SAN (storage area network), and the like.

The magnetic disk device 920 stores an operating system 921 (OS), a window system 922, a program group 923, and a file group 924.
The programs in the program group 923 are executed by the CPU 911 using the operating system 921 and the window system 922.

The RAM 914 temporarily stores at least part of the operating system 921 program and application programs to be executed by the CPU 911.
The RAM 914 stores various data necessary for processing by the CPU 911.

The ROM 913 stores a BIOS (Basic Input Output System) program, and the magnetic disk device 920 stores a boot program.
When the query conversion device 100 is activated, the BIOS program in the ROM 913 and the boot program in the magnetic disk device 920 are executed, and the operating system 921 is started by the BIOS program and the boot program.

  The program group 923 stores a program for executing the function described as “˜unit” in the description of the first to third embodiments. The program is read and executed by the CPU 911.

In the file group 924, information, data, signal values, variable values, encryption keys / decryption keys, random number values, and parameters indicating the results of the processes described in the first to third embodiments as "determination of ˜", "analysis of ˜", "search of ˜", "extraction of ˜", "generation of ˜", "reproduction of ˜", "comparison of ˜", "verification of ˜", "update of ˜", "setting of ˜", "registration of ˜", "selection of ˜", "input of ˜", and "output of ˜" are stored as the items of "˜file".
The “˜file” is stored in a storage medium such as a disk or a memory.
Information, data, signal values, variable values, and parameters stored in a storage medium such as a disk or memory are read out to the main memory or cache memory by the CPU 911 via a read / write circuit.
The read information, data, signal values, variable values, and parameters are used for CPU operations such as extraction, search, reference, comparison, calculation, processing, editing, output, printing, and display.
During the CPU operations of extraction, search, reference, comparison, calculation, processing, editing, output, printing, and display, the information, data, signal values, variable values, and parameters are temporarily stored in the main memory, registers, cache memory, buffer memory, or the like.
In addition, the arrows in the flowcharts described in Embodiments 1 to 3 mainly indicate input and output of data and signals.
Data and signal values are recorded in a storage medium such as a memory of the RAM 914, a flexible disk of the FDD 904, a compact disk of the CDD 905, a magnetic disk of the magnetic disk device 920, other optical disks, mini disks, and DVDs.
Data and signals are transmitted online via a bus 912, signal lines, cables, or other transmission media.

In addition, what is described as "˜unit" in the description of the first to third embodiments may be "˜circuit", "˜device", or "˜equipment", and may also be "˜step", "˜procedure", or "˜processing".
That is, the “information processing method” according to the present invention can be realized by the steps, procedures, and processes shown in the flowcharts described in the first to third embodiments.
Further, what is described as “˜unit” may be realized by firmware stored in the ROM 913.
Alternatively, it may be implemented only by software, only by hardware such as elements, devices, substrates, and wirings, by a combination of software and hardware, or further by a combination with firmware.
Firmware and software are stored as programs in a storage medium such as a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, and a DVD.
The program is read by the CPU 911 and executed by the CPU 911.
That is, the program causes the computer to function as the "˜unit" of the first to third embodiments. Alternatively, it causes the computer to execute the procedure and method of the "˜unit" of the first to third embodiments.

As described above, the query conversion device 100 shown in the first to third embodiments is a computer including a CPU as a processing device; a memory, a magnetic disk, and the like as storage devices; a keyboard, a mouse, a communication board, and the like as input devices; and a display device, a communication board, and the like as output devices.
Then, as described above, the functions indicated as “˜units” are realized using these processing devices, storage devices, input devices, and output devices.

  100 query conversion device, 101 query statement, 102 converted query statement, 110 query statement input unit, 120 query analysis unit, 130 sub-query statement generation unit, 140 parallel processing instruction unit, 150 query statement regeneration unit, 160 narrowing process adding unit, 200 query issuing device, 201 query statement generation unit, 300 database server device, 400 data storage device, 410 data, 420 catalog information, 430 operation cost information, 440 encryption method information, 450 confidentiality information, 500 network, 600 slave device.

Claims (13)

  1. An information processing apparatus comprising:
    a query statement input unit that inputs a query statement requesting a database server device to extract, from a search target table, records that match a combination of a plurality of search conditions;
    a search condition classification unit that determines, for each search condition of the input query statement input by the query statement input unit, whether the execution cost at search execution is equal to or greater than a threshold, classifies each search condition whose execution cost is less than the threshold into a first search condition category, and classifies each search condition whose execution cost is equal to or greater than the threshold into a second search condition category; and
    a query statement conversion unit that converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process, which extracts from the search target table the records that match the combination of the search conditions classified into the first search condition category, and a second extraction process, which extracts, from the records extracted by the first extraction process, the records that match the combination of the search conditions classified into the second search condition category.
  2. The information processing apparatus according to claim 1, wherein the query statement conversion unit
    converts the input query statement to generate a first query statement that requests the database server device to execute the first extraction process, and
    further converts the input query statement to generate a second query statement that includes the first query statement and requests the database server device to execute the second extraction process on the records extracted by executing the first query statement.
  3. The information processing apparatus according to claim 2, wherein the query statement conversion unit
    outputs the generated second query statement to the database server device that manages the search target table.
  4. The information processing apparatus according to claim 2, wherein the query statement conversion unit
    generates the second query statement so that it includes the first query statement as the character string argument of a function that requests the database server device to execute an arbitrary query statement specified by a character string argument.
  5. The information processing apparatus according to claim 3, further comprising
    a parallel processing control unit that instructs the query statement conversion unit to include, in the second query statement, control information for dividing the plurality of records extracted by the first extraction process into a plurality of blocks each composed of one or more records and executing the second extraction process on the plurality of blocks in parallel.
  6. The information processing apparatus according to claim 5, wherein
    the database server device manages two or more slave devices, and
    the parallel processing control unit instructs the query statement conversion unit to include, in the second query statement, control information for causing the database server device to execute the first extraction process and causing the two or more slave devices, under the management of the database server device, to execute the second extraction process on the plurality of blocks in parallel.
  7. The information processing apparatus according to claim 1, wherein the query statement conversion unit
    defines a narrowing process that narrows the records targeted by the second extraction process to fewer records than those extracted by the first extraction process, without affecting the processing result of the second extraction process, and
    converts the input query statement to generate a query statement that requests execution of the first extraction process, the narrowing process, and the second extraction process, the second extraction process extracting, from the records narrowed down by the narrowing process, the records that match the combination of the search conditions classified into the second search condition category.
  8. The information processing apparatus according to claim 7, wherein the query statement conversion unit
    designates one of the columns included in the search target table as a narrowing process column used for the narrowing process, and
    defines the narrowing process, which narrows down the records targeted by the second extraction process, based on the data values stored in the narrowing process column.
  9. The information processing apparatus according to claim 8, wherein,
    when an encrypted column, which is a column in which data values are encrypted, is included in the search target table and the encrypted column is classified into the second search condition category,
    the query statement conversion unit determines the content of the narrowing process according to the encryption strength of the encrypted column classified into the second search condition category.
  10. The information processing apparatus according to claim 8, wherein,
    when a sensitivity setting column, which is a column for which a sensitivity is set, is included in the search target table and the sensitivity setting column is classified into the second search condition category,
    the query statement conversion unit determines the content of the narrowing process according to the sensitivity set for the sensitivity setting column classified into the second search condition category.
  11. The information processing apparatus according to claim 1, further comprising
    a query statement generation unit that generates a query statement,
    wherein the query statement input unit inputs the query statement generated by the query statement generation unit.
  12. An information processing method comprising:
    a query statement input step in which a computer inputs a query statement requesting a database server device to extract, from a search target table, records that match a combination of a plurality of search conditions;
    a search condition classification step in which the computer determines, for each search condition of the input query statement input in the query statement input step, whether the execution cost at search execution is equal to or greater than a threshold, classifies each search condition whose execution cost is less than the threshold into a first search condition category, and classifies each search condition whose execution cost is equal to or greater than the threshold into a second search condition category; and
    a query statement conversion step in which the computer converts the input query statement to generate a query statement requesting the database server device to execute a first extraction process, which extracts from the search target table the records that match the combination of the search conditions classified into the first search condition category, and a second extraction process, which extracts, from the records extracted by the first extraction process, the records that match the combination of the search conditions classified into the second search condition category.
  13. A program for causing a computer to execute: a query statement input step of inputting a query statement that requests a database server device to extract, from a search target table to be searched, records that match a combination of a plurality of search conditions;
    A search condition classification step of determining, for each search condition of the input query statement input in the query statement input step, whether or not the execution cost at the time of search execution is equal to or greater than a threshold, classifying each search condition whose execution cost is less than the threshold into a first search condition category, and classifying each search condition whose execution cost is equal to or greater than the threshold into a second search condition category; and
    A query statement conversion step of converting the input query statement to generate a query statement that requests the database server device to execute a first extraction process for extracting, from the search target table, records that match the combination of search conditions classified into the first search condition category, and a second extraction process for extracting, from the records extracted by the first extraction process, records that match the combination of search conditions classified into the second search condition category.
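
The classification and conversion steps recited in claims 12 and 13 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `SearchCondition`, `classify`, `convert_query`, and `run_demo`, the numeric cost values, the threshold, and the `employees` table are all hypothetical. The sketch also assumes that the search conditions are combined with AND, and it uses a nested SELECT as only one possible way to express the first and second extraction processes.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchCondition:
    sql: str     # condition text as it would appear in a WHERE clause
    cost: float  # hypothetical estimated execution cost of evaluating it


def classify(conditions, threshold):
    """Split conditions into the first (cost < threshold) and
    second (cost >= threshold) search condition categories."""
    first = [c for c in conditions if c.cost < threshold]
    second = [c for c in conditions if c.cost >= threshold]
    return first, second


def convert_query(table, conditions, threshold):
    """Rewrite an AND-combined query as a two-stage extraction.

    The inner SELECT stands for the first extraction process; the outer
    WHERE stands for the second extraction process applied to its result.
    """
    first, second = classify(conditions, threshold)
    # With no low-cost condition, the first stage keeps every record.
    first_where = " AND ".join(f"({c.sql})" for c in first) or "1=1"
    inner = f"SELECT * FROM {table} WHERE {first_where}"
    if not second:
        return inner
    second_where = " AND ".join(f"({c.sql})" for c in second)
    return f"SELECT * FROM ({inner}) AS narrowed WHERE {second_where}"


def run_demo():
    """Build a tiny in-memory table and run the converted query on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?)",
        [("tanaka", "sales"), ("ito", "sales"), ("sato", "dev")],
    )
    conditions = [
        SearchCondition("name LIKE '%a%'", cost=50.0),  # assumed expensive
        SearchCondition("dept = 'sales'", cost=1.0),    # assumed cheap
    ]
    query = convert_query("employees", conditions, threshold=10.0)
    rows = conn.execute(query).fetchall()
    conn.close()
    return query, rows


if __name__ == "__main__":
    query, rows = run_demo()
    print(query)
    print(rows)
```

Whether a given database management system actually evaluates the inner SELECT before the outer conditions depends on its optimizer, so the nested form here only illustrates the structure of the converted query statement.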
JP2012011738A 2012-01-24 2012-01-24 Information processing apparatus, information processing method, and program Active JP5800720B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2012011738A JP5800720B2 (en) 2012-01-24 2012-01-24 Information processing apparatus, information processing method, and program

Publications (3)

Publication Number Publication Date
JP2013152512A JP2013152512A (en) 2013-08-08
JP2013152512A5 JP2013152512A5 (en) 2014-12-25
JP5800720B2 JP5800720B2 (en) 2015-10-28

Family

ID=49048823

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2012011738A Active JP5800720B2 (en) 2012-01-24 2012-01-24 Information processing apparatus, information processing method, and program

Country Status (1)

Country Link
JP (1) JP5800720B2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6186387B2 (en) * 2015-03-19 2017-08-23 株式会社日立製作所 Confidential data processing system
JP6441160B2 (en) * 2015-04-27 2018-12-19 株式会社東芝 Concealment device, decryption device, concealment method and decryption method
WO2017122326A1 (en) 2016-01-14 2017-07-20 三菱電機株式会社 Confidential search system, confidential search method and confidential search program
WO2017221308A1 (en) * 2016-06-20 2017-12-28 三菱電機株式会社 Data management device, data management method, data management program, search device, search method, and search program

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0581330A (en) * 1991-09-20 1993-04-02 Nec Corp Inquiry optimizing device of data base
JPH05334368A (en) * 1992-06-02 1993-12-17 Hitachi Ltd Data base inquiry processing method
JPH07105238A (en) * 1993-09-29 1995-04-21 Omron Corp Device and method for data base retrieval
US6339768B1 (en) * 1998-08-13 2002-01-15 International Business Machines Corporation Exploitation of subsumption in optimizing scalar subqueries
JP2000163446A (en) * 1998-11-30 2000-06-16 Nec Corp Extendable inquiry processor
JP3694193B2 (en) * 1999-05-28 2005-09-14 富士通株式会社 Database search apparatus and program recording medium
JP2001331463A (en) * 2000-05-23 2001-11-30 Nec Corp Data base construction method and recording medium having the program recorded thereon
JP3303881B2 (en) * 2001-03-08 2002-07-22 株式会社日立製作所 Document retrieval method and apparatus
US6598044B1 (en) * 2002-06-25 2003-07-22 Microsoft Corporation Method for choosing optimal query execution plan for multiple defined equivalent query expressions
JP4575064B2 (en) * 2004-07-29 2010-11-04 三菱電機株式会社 Information retrieval device
US20080040334A1 (en) * 2006-08-09 2008-02-14 Gad Haber Operation of Relational Database Optimizers by Inserting Redundant Sub-Queries in Complex Queries
JP4571609B2 (en) * 2006-11-08 2010-10-27 株式会社日立製作所 Resource allocation method, resource allocation program, and management computer
JP5011006B2 (en) * 2007-07-03 2012-08-29 株式会社日立製作所 Resource allocation method, resource allocation program, and resource allocation device
JP2011159015A (en) * 2010-01-29 2011-08-18 Fujitsu Frontech Ltd Program, apparatus and method for supporting search
US20110202774A1 (en) * 2010-02-15 2011-08-18 Charles Henry Kratsch System for Collection and Longitudinal Analysis of Anonymous Student Data
US8484243B2 (en) * 2010-05-05 2013-07-09 Cisco Technology, Inc. Order-independent stream query processing

Also Published As

Publication number Publication date
JP2013152512A (en) 2013-08-08

Legal Events

Date Code Title Description
A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20141110

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20141110

TRDD Decision of grant or rejection written
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20150722

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20150728

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20150825

R150 Certificate of patent or registration of utility model

Ref document number: 5800720

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250