WO2011118425A1 - Query optimization system, query optimization device, and query optimization method

Info

Publication number: WO2011118425A1
Application number: PCT/JP2011/055887
Authority: WO
Inventors: 伸治菊地
Original assignee: 日本電気株式会社
Priority date: 2010-03-24
Filing date: 2011-03-14
Publication date: 2011-09-29
Also published as: JPWO2011118425A1

Abstract

Provided is a query optimization system, wherein after a rational assessment of the interim results of a query generated as a result of optimization, query optimization is achieved at the implementation stage, taking into account the decentralized storage site in which the query should be positioned. First, numerical data concerning reliability and latency for sites that store data is extracted. Next, the reciprocals of the aforementioned data concerning latency and numerical data concerning reliability are overlaid onto cost values of queries, and the expanded cost is determined. Next, a replica of a high-order specified value appropriate for the cost value is selected. Next, 1 is added to the maximum block size, which is a parameter in the algorithm for selecting a positioning destination site for interim results generated as a result of optimization, and pre-reading is implemented by preparing copies of the processing system itself. Next, as a result of the pre-reading, the interim results are positioned in one or more sites. By doing so, the method of positioning the interim results for queries is evaluated.

Description

Query optimization system, query optimization device, and query optimization method

The present invention relates to a query optimization system, and more particularly to a query optimization system that evaluates an arrangement method of intermediate results of a query.

The following documents disclose typical existing techniques related to a query optimization method (query optimization system) that evaluates a method of placing the intermediate results of a query.

<Non-Patent Document 1> D. Kossmann, K. Stocker, “Iterative Dynamic Programming: A New Class of Query Optimization Algorithms”, ACM Transactions on Database Systems, Vol. 25, No. 1, March 2000.

<Non-Patent Document 2> E. J. Shekita, H. C. Young, K.-L. Tan, “Multi-Join Optimization for Symmetric Multiprocessors”, Proceedings of the 19th VLDB Conference, 1993.

<Patent Document 1> JP 2002-222108 A

<Non-Patent Document 1> describes a conventional query optimization method, but does not explicitly address how the intermediate results of a query are placed.

<Non-Patent Document 2> specifically mentions that the query optimization method writes intermediate results to storage (an external storage device) or the like. However, it assumes functional modules within a single system and does not address matters premised on the wide-area interchange, procurement, and virtualization of computer resources, which are current issues as of 2009.

On the other hand, <Patent Document 1> describes an existing technique based on the premise of external placement of intermediate results, not a technique related to query optimization.

<Conventional query optimization method>
Here, the query optimization method of the prior art is extracted from <Non-Patent Document 1> and shown as a specific procedure.

In (Step 1), the specified query is parsed and the set of tables {R₁, R₂, ..., Rₙ} is extracted.

In (Step 2), optimization candidates are extracted. In the subordinate (Step 2.1), all conceivable access patterns for an element table Rₘ of the table set, such as patterns that make effective use of an index, are extracted and added to the temporary join element list. This is performed for every element table Rₘ (m = 1 to n).

In (Step 3), the query elements that optimize the cost are selected. In the subordinate (Step 3.1), for each element table Rₘ processed in (Step 2.1), the single optimal access pattern is extracted from the temporary join element list; only the extracted pattern is kept, and the others are deleted from the temporary join element list. This is performed for every element table Rₘ (m = 1 to n).

In (Step 4), the temporary join element list is deleted.
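The access-pattern selection of (Step 2) and (Step 3) can be sketched as follows. This is a minimal illustration rather than the method of <Non-Patent Document 1> itself: the `AccessPath` type, the method labels, and the cost figures are hypothetical and stand in for whatever cost model the optimizer actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPath:
    table: str    # element table R_m
    method: str   # hypothetical label, e.g. "full_scan" or "index:pk"
    cost: float   # estimated cost from an assumed cost model


def best_access_paths(temporary_list):
    """Keep only the cheapest access path per table (Steps 3 and 3.1)."""
    best = {}
    for path in temporary_list:
        current = best.get(path.table)
        if current is None or path.cost < current.cost:
            best[path.table] = path
    return best


# Step 2.1: every candidate access pattern is collected first.
temporary_list = [
    AccessPath("R1", "full_scan", 120.0),
    AccessPath("R1", "index:pk", 15.0),
    AccessPath("R2", "full_scan", 40.0),
    AccessPath("R2", "index:fk", 55.0),
]
optimized = best_access_paths(temporary_list)
```

The resulting dictionary plays the role of the optimized relations that seed the join element list in (Step 5).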

In (Step 5), the initial version {R₁^o, R₂^o, R₃^o, ..., Rₙ^o} of the join element list is created and added to the optimization plan element list.

In (Step 6), the subordinate processing is repeated until the number of elements in the join element list becomes 1.

[(Step 6) Subordinate processing]
First, in (Step 6.1), the maximum block size k is set to the smaller of the specified value and the number of elements in the join element list. Subsequently, in (Step 6.2), the number of joins is set to 2. Subsequently, in (Step 6.3), the subordinate processing is repeated as long as the condition (number of joins ≦ maximum block size k) is satisfied.

[(Step 6.3) Subordinate processing]
First, in (Step 6.3.1), every combination of the designated number of joins is formed from the current version {R₁^o, R₂^o, R₃^o, ..., Rₙ^o} of the join element list and added to a temporary list. Subsequently, in (Step 6.3.2), the subordinate processing is repeated for every element of the temporary list.

[(Step 6.3.2) Subordinate processing]
First, in (Step 6.3.2.1), one element is extracted from the temporary list. Subsequently, in (Step 6.3.2.2), the subordinate processing is repeated until temporary list parts 1 and 2 can be defined.

[(Step 6.3.2.2) Subordinate processing]
First, in (Step 6.3.2.2.1), the element is divided in two to create two combinations of its constituent elements, namely {temporary list part 1} and {temporary list part 2}. Subsequently, in (Step 6.3.2.2.2), the combination {temporary list part 1} {temporary list part 2} is added to the temporary operation list. Subsequently, in (Step 6.3.2.2.3), the combination {temporary list part 2} {temporary list part 1} is added to the temporary operation list.
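The splitting step can be illustrated with a short sketch. It assumes that an element of the temporary list is simply a set of relation names; the function name `ordered_splits` is hypothetical. Each two-way split is added in both orders, mirroring the two additions to the temporary operation list.

```python
from itertools import combinations


def ordered_splits(element):
    """Return every two-way split of `element`, in both orders."""
    items = sorted(element)
    if len(items) < 2:
        return []
    first, rest = items[0], items[1:]
    operations = []  # stands in for the temporary operation list
    # Fixing `first` in part 1 enumerates each unordered split exactly once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            part1 = frozenset((first,) + extra)
            part2 = frozenset(items) - part1
            operations.append((part1, part2))  # {part 1} {part 2}
            operations.append((part2, part1))  # {part 2} {part 1}
    return operations


pairs = ordered_splits({"R1", "R2"})
triples = ordered_splits({"R1", "R2", "R3"})
```

A two-relation element yields two entries and a three-relation element yields six, so the operation list grows quickly with the number of joins, which is one reason the maximum block size is bounded.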

[(Step 6.3) Continuation of subordinate processing 1]
After (Step 6.3.2) is completed, in (Step 6.3.3) under (Step 6.3), the optimization plan list is initialized. Subsequently, in (Step 6.3.4), the subordinate processing is repeated for every element of the temporary operation list.

[(Step 6.3.4) Subordinate processing]
First, in (Step 6.3.4.1), one element is extracted from the temporary operation list. Subsequently, in (Step 6.3.4.2), the optimization plan for the right-hand {temporary list part x (x = 1, 2)} of the extracted element is retrieved from the optimization plan element list or, if it is not there, created. Subsequently, in (Step 6.3.4.3), the optimization plan for the left-hand {temporary list part x (x = 1, 2)} of the extracted element is retrieved from the optimization plan element list or, if it is not there, created. Subsequently, in (Step 6.3.4.4), the two optimization plans obtained in (Step 6.3.4.2) and (Step 6.3.4.3) are joined into a combined plan, which is added to the optimization plan list.

[(Step 6.3) Continuation of subordinate processing 2]
After (Step 6.3.4) is completed, in (Step 6.3.5) under (Step 6.3), the single optimal plan is extracted from the optimization plan list and added to the optimization plan element list. Subsequently, in (Step 6.3.6) under (Step 6.3), the non-optimal plans in the optimization plan list are organized and retained as a library. Subsequently, in (Step 6.3.7) under (Step 6.3), the temporary list and the temporary operation list are cleared. Subsequently, in (Step 6.3.8) under (Step 6.3), 1 is added to the number of joins (number of joins ← number of joins + 1).

[(Step 6) Continuation of subordinate processing]
After (Step 6.3) is completed, in (Step 6.4) under (Step 6), a greedy algorithm is used to select, from the elements in the optimization plan element list, the join tree that minimizes the cost function. Subsequently, in (Step 6.5) under (Step 6), that join tree is set as {Tₖ^o}. Subsequently, in (Step 6.6) under (Step 6), the join tree {Tₖ^o} is substituted into the current version {R₁^o, R₂^o, R₃^o, ..., Rₙ^o} of the join element list. Subsequently, in (Step 6.7) under (Step 6), the join elements that constitute the join tree {Tₖ^o} are reflected in the current version {R₁^o, R₂^o, R₃^o, ..., Rₙ^o} of the join element list.

After (Step 6) is completed, in (Step 7), the remaining optimization is performed on the current version {Tₘ^o} of the join element list, and the resulting optimization plan is added.

Subsequently, in (Step 8), one optimization plan is selected.

Subsequently, in (Step 9), the optimization plan is re-expressed using the current version {R₁^o, R₂^o, R₃^o, ..., Rₙ^o} of the join element list.

Subsequently, in (Step 10), the optimization plan is returned.

This optimization method broadly combines two algorithms. One is a dynamic programming algorithm, which corresponds to the series of steps under (Step 6.3). The other is a greedy algorithm, which corresponds to (Step 6.4). In this method, a limit on the size of a joined block, called the maximum block size, is specified, and the dynamic programming algorithm first performs join operations for as long as the maximum block size is not exceeded. The greedy algorithm then selects the lowest-cost plan from the results computed so far, and that plan is substituted in as an intermediate result. Repeating this procedure a number of times selects the plan with the lowest overall cost.
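The interplay of the two algorithms can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the procedure of <Non-Patent Document 1> verbatim: the cost model (each join costs the size of its output, with one uniform selectivity), the `Plan` type, and all function names are hypothetical, and the greedy step simply picks the cheapest block of k elements.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Plan:
    tree: object   # a relation name, or a (left_tree, right_tree) pair
    rows: float    # estimated result size
    cost: float    # accumulated cost under the toy model below


def join(left, right, selectivity):
    # Assumed cost model: each join costs the size of its output.
    rows = left.rows * right.rows * selectivity
    return Plan((left.tree, right.tree), rows, left.cost + right.cost + rows)


def covered(keys):
    # Union of the base-relation names covered by a group of list elements.
    return frozenset().union(*keys)


def idp(relations, max_block_size, selectivity=0.01):
    if max_block_size < 2:
        raise ValueError("max_block_size must be at least 2")
    # Step 5: initial join element list, one entry per optimized relation.
    elements = {frozenset([n]): Plan(n, r, 0.0) for n, r in relations.items()}
    while len(elements) > 1:                          # Step 6
        k = min(max_block_size, len(elements))        # Step 6.1
        keys = list(elements)
        best = dict(elements)                         # optimization plan element list
        for size in range(2, k + 1):                  # Steps 6.2, 6.3, 6.3.8
            for combo in combinations(keys, size):    # Step 6.3.1
                plans = []                            # optimization plan list
                for r in range(1, size):              # Step 6.3.2: every split
                    for part1 in combinations(combo, r):
                        part2 = [e for e in combo if e not in part1]
                        plans.append(join(best[covered(part1)],
                                          best[covered(part2)], selectivity))
                # Step 6.3.5: keep the cheapest plan for this combination.
                best[covered(combo)] = min(plans, key=lambda p: p.cost)
        # Step 6.4: greedy choice of the cheapest block of k elements.
        block = min((covered(c) for c in combinations(keys, k)),
                    key=lambda b: best[b].cost)
        # Steps 6.5 to 6.7: the join tree replaces the elements it is built from.
        elements = {key: p for key, p in elements.items() if not key <= block}
        elements[block] = best[block]
    return next(iter(elements.values()))


def leaves(tree):
    return {tree} if isinstance(tree, str) else leaves(tree[0]) | leaves(tree[1])


sizes = {"R1": 1000, "R2": 100, "R3": 10, "R4": 500, "R5": 50}
plan = idp(sizes, max_block_size=3)
```

With the maximum block size equal to the number of relations, the sketch degenerates to plain dynamic programming over all bushy join trees; smaller values trade plan quality for a smaller search space.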

To explain the algorithm of the optimization method simply, a model of its execution will be described with reference to FIG. 1. FIG. 1 schematically shows how query optimization processing is actually carried out by the optimization method. For the sake of illustration, it is assumed that the relation list to be used consists of five relations. Specifically, the relations R₁, R₂, R₃, R₄, and R₅ are optimized as a result of (Step 5), and a join element list 3007 consisting of the relations R₁^o, R₂^o, R₃^o, R₄^o, and R₅^o is created. As a result, an initial result 3000 is generated.

In the first stage, join operations on two relations are performed. In FIG. 1, the symbol of the join operation is denoted by 3006. This corresponds to the processing up to (Step 6.3) in the optimization method. As a result, a result 3001 of the first-stage two-table join operation, which includes the join candidates 3008, 3009, and 3010, is created.

Similarly, in the second stage, join operations on three relations are performed based on the result 3001 of the first-stage two-table join described above. This corresponds to the processing up to (Step 6.3) in the optimization method. As a result, a result 3002 of the second-stage three-table join operation, which includes the join candidates 3011, 3012, and 3013, is created. Based on this, cost evaluation is further performed by the greedy algorithm. This corresponds to (Step 6.4) in the optimization method. As a result, the join candidate 3012, which has the smallest cost evaluation, is selected. Since the greedy algorithm is well known, a detailed description is omitted.

In the third stage, since the maximum block size exceeds the specified value 2, the intermediate results are written to storage once. In this case, the join candidate 3012 is replaced with a join tree {Tₖ₁^o}: 3014. Thereafter, the process returns to (Step 6) and, for the second round of processing, returns to the same state as the initial stage. In this case, the join tree {Tₖ₁^o}: 3014, the relation R₃^o: 3015, and the relation R₅^o: 3016 remain in the join element list 3007 and are defined as the result 3003.

In the fourth stage, the second round of join operations on two relations is performed. This corresponds to the processing up to (Step 6.3) in the optimization method. As a result, a result 3004 of the second-round two-table join operation, which includes the join candidates 3017, 3018, and 3019, is created.

In the fifth stage, processing similar to that in the third stage is performed, and a join operation among three tables is performed. In this case, the join tree {Tₖ₁^o}: 3014 is joined with the result of joining the relation R₃^o: 3015 and the relation R₅^o: 3016, and the join candidate 3020 is selected. This is then reflected in the join element list 3007. At this point, the condition described in (Step 6) is no longer satisfied, so the process proceeds to (Step 7). After that, in (Step 9), the join tree 3021 is reconstructed from the elements of the original join element list 3007, namely the relations R₁^o, R₂^o, R₃^o, R₄^o, and R₅^o, and the desired optimization plan is created. Thereby, a result 3005 is generated.
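The way the join element list shrinks across these stages can be traced with a small sketch. The helper name and the label "T_k2" are hypothetical. The assumption that the first join tree covers R1, R2, and R4 is inferred only from the statement that R₃^o and R₅^o remain in the list.

```python
def replace_with_join_tree(join_elements, joined, tree_name):
    """Put a join tree into the list in place of the elements it covers."""
    remaining = [e for e in join_elements if e not in joined]
    return [tree_name] + remaining


join_elements = ["R1", "R2", "R3", "R4", "R5"]   # initial result 3000

# Third stage: the selected three-table candidate becomes join tree T_k1.
join_elements = replace_with_join_tree(
    join_elements, {"R1", "R2", "R4"}, "T_k1")
after_first_round = list(join_elements)           # result 3003

# Fifth stage: T_k1 is joined with R3 and R5, leaving a single element,
# so the loop condition of (Step 6) no longer holds.
join_elements = replace_with_join_tree(
    join_elements, {"T_k1", "R3", "R5"}, "T_k2")
```

Each round removes up to k elements and adds one join tree, so the list is guaranteed to shrink to a single element whenever k is at least 2.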

<Patent Document 1> relates to a method for placing the replicas to be created, not to query optimization. FIGS. 2A and 2B show the device configuration based on the description in <Patent Document 1>. The device configuration broadly includes three functional layers. The first functional layer 4100 includes application program-1: 4007 and application program-L: 4008. The second functional layer 4200 is the layer in which the database is accessed and in which the data collection unit 4003 is arranged. The third functional layer 4300 includes the resource management / transaction control units 4011 and 4012, the distributed storage 4004, and the like. The functional layers access one another via various networks 4001 and 4002, such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet.

Application program-1: 4007 includes a database access client unit-1: 4009. Similarly, the application program-L: 4008 includes a database access client unit-L: 4010. The database access client unit-1: 4009 and the database access client unit-L: 4010 request and transmit an execution query such as SQL (Structured Query Language) to the data collection unit 4003.

The functional layer including the above-described resource management / transaction control unit 4011 and the above-described distributed storage 4004 is described here on the assumption of two sites, but is not limited to this. The sites include the resource management / transaction control units 4011 and 4012, respectively.

In the site having the resource management / transaction control unit 4011, the distributed storage 4004 is arranged in association with it. The distributed storage 4004 stores a database table area, an index area, a log, a journal, and the like. The number of distributed storages is not limited to one and may be two or more. That is, at least one distributed storage exists.

The database table area of the distributed storage 4004 stores a table Table.11: 4014 corresponding to the relation R₁₁, a table Table.12: 4015 corresponding to the relation R₁₂, a table Table.1n: 4016 corresponding to the relation R₁ₙ, and a table 4018 of intermediate results. In the index area, an index 4017, a log, and a journal 4019 are stored.

The site having the resource management / transaction control unit 4012 corresponds to a replica of the site having the resource management / transaction control unit 4011. Therefore, the same configuration as that of the site having the resource management / transaction control unit 4011 described above is adopted.

In the data collection unit 4003, another distributed storage 4006 is arranged to hold a partial replica. A resource management / transaction control unit 4013 is also arranged there, and the configuration is almost the same as that of the site having the resource management / transaction control unit 4011 described above.

Here, based on the inquiry messages s4029 and s4035 from the database access client unit-1: 4009 and the database access client unit-L: 4010, the data collection unit 4003 transfers request messages s4030 and s4031 to the resource management / transaction control units 4011 and 4012. As a result, it receives, as response messages s4033 and s4032, the data retrieved from the distributed storage 4004 and, likewise, from the distributed storage 4005.

Subsequently, the data collection unit 4003 groups the set of queries, containing multiple conditions, that is specified in the query messages s4029 and s4035, and obtains the query content and the set of conditions for each condition. The data collection unit 4003 also determines a condition based on the range over which the conditions overlap. Furthermore, the data collection unit 4003 sets an evaluation index of effectiveness according to the number of times queries are duplicated.

The data collection unit 4003 places a partial replica in the distributed storage 4006 of the resource management / transaction control unit 4013 according to the evaluation index related to the effectiveness.
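The grouping and effectiveness evaluation can be pictured with a rough sketch. <Patent Document 1> is described here only in outline, so everything concrete below is an assumption: queries are modeled as range conditions `(table, column, low, high)`, the overlapping range is taken as the intersection of all ranges in a group, and the effectiveness is simply the number of queries in the group.

```python
from collections import defaultdict


def partial_replica_candidates(queries, min_effectiveness):
    """Group range conditions and keep overlaps that are queried often enough."""
    groups = defaultdict(list)
    for table, column, low, high in queries:
        groups[(table, column)].append((low, high))

    candidates = []
    for (table, column), ranges in groups.items():
        low = max(r[0] for r in ranges)    # overlapping range: intersection
        high = min(r[1] for r in ranges)
        effectiveness = len(ranges)        # assumed index: number of duplicates
        if low <= high and effectiveness >= min_effectiveness:
            candidates.append((table, column, low, high, effectiveness))
    return candidates


queries = [
    ("orders", "price", 0, 100),
    ("orders", "price", 50, 150),
    ("orders", "price", 60, 80),
    ("items", "qty", 1, 5),
]
replicas = partial_replica_candidates(queries, min_effectiveness=2)
```

In this sketch only the `orders.price` range from 60 to 80 qualifies, because it is covered by three queries, while the single `items.qty` query falls below the threshold.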

As described above, past individual technologies (conventional technologies) are roughly classified into those related to the query optimization method and those related to the replica technology based on the external arrangement of intermediate results. The issues are as follows.

For example, if the elemental technologies of a database management system are spread widely across the entire network, distributed in a loosely coupled environment, and deployed on a large scale, the distributed storage in which the intermediate results of queries are placed will also be widely distributed across the network environment.

However, conventional query optimization methods do not consider, on the basis of rational judgment, at which distributed storage site the intermediate results generated as a result of optimization should be placed. Consequently, the query is not optimized at the implementation stage.

As a result, there is a problem that the agreed service level decided by the user of the database access client cannot be satisfied. This is explained below for each of the documents described above.

<Patent Document 1> discloses grouping a set of queries to obtain the query content and the set of conditions for each condition, determining a condition based on the range over which the conditions overlap, setting an evaluation index called “effectiveness” according to the number of times queries are duplicated, and placing data whose effectiveness reaches a certain level as partial replicas. Partial replicas differ from intermediate results, and in any case the placement location is not evaluated by rational judgment among multiple options. As a result, it cannot address the above problem.

<Non-Patent Document 1> discloses an overview of current query optimization methods. Although it mentions the use of replicas, it makes no proposal premised on the placement of intermediate results. It therefore does not deal with query optimization premised on the wide-area interchange, procurement, and virtualization of computer resources, which is a current issue as of fiscal 2009, and cannot address the above problem.

<Non-Patent Document 2> discloses an outline of a query optimization method that writes intermediate results to storage or the like. However, it does not consider, on the basis of rational judgment, at which distributed storage site the intermediate results generated as a result of optimization should be placed, and consequently the query is not optimized at the implementation stage.

An object of the present invention is to optimize queries in a database management system such that, in an environment using large-scale virtualized storage, the intermediate results generated in the course of query optimization are placed more rationally and the query is optimized even at the implementation stage. A further object is thereby to satisfy the agreed service level decided by the user of the database access client.

In order to achieve the above object, the present invention provides a query optimization system that evaluates a method of placing the intermediate results of queries in a database management system. In this query optimization system, numerical information on the latency and the reliability of each site holding data is extracted. Here, latency is the delay from when a request, such as a data transfer request, is issued until the result of the request is returned. An extended cost value is obtained by superimposing the numerical information on latency and the numerical information on reliability onto the cost value of the query. The sites are then ordered by extended cost value, and the designated number of top-ranked sites that are appropriate (that is, that satisfy a predetermined condition) are selected. Furthermore, in order to select the sites at which to place the intermediate results generated as a result of optimization, prefetching is performed by creating a copy of the processing system itself in which some of the algorithm's parameters are changed. As a result of the prefetching, the intermediate results are placed at one or more sites.

This makes it possible to rationally select the placement.
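The site selection can be sketched as follows. The exact form of the superimposition is not specified at this point, so the formula below is purely an assumption for illustration: the query cost is scaled by the site latency and by the reciprocal of the site reliability, so that slow or unreliable sites receive a higher extended cost. The `Site` type, its field names, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    latency_ms: float    # delay between issuing a request and getting its result
    reliability: float   # assumed to lie in the range 0 < reliability <= 1


def extended_cost(query_cost, site):
    # Assumed weighting: higher latency and lower reliability raise the cost.
    return query_cost * site.latency_ms / site.reliability


def select_sites(query_cost, sites, top_n):
    """Order sites by extended cost and keep the designated number from the top."""
    ranked = sorted(sites, key=lambda s: extended_cost(query_cost, s))
    return ranked[:top_n]


sites = [
    Site("A", latency_ms=10.0, reliability=0.99),
    Site("B", latency_ms=5.0, reliability=0.4),
    Site("C", latency_ms=20.0, reliability=0.999),
]
chosen = select_sites(query_cost=100.0, sites=sites, top_n=2)
```

In this example site B has the lowest latency but is penalized for its low reliability, so site A ranks first and B second.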

An explanatory diagram illustrating the process by which a query optimization result is generated by the prior-art “query optimization system”.
A block diagram showing a device configuration that implements part of the prior-art “query optimization system”.
A block diagram showing a device configuration that implements part of the prior-art “query optimization system”.
A block diagram showing a device configuration that implements the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.
A block diagram showing a device configuration that implements the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.
A block diagram showing the overall processing when the query optimization system is implemented on the device configuration that implements the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.
A block diagram showing the overall processing when the query optimization system is implemented on the device configuration that implements the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.
An explanatory diagram extracting the characteristic part of the procedure of the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.
An explanatory diagram extracting the characteristic part of the procedure of the “query optimization system that evaluates a method of placing the intermediate results of a query” of the present invention.

Hereinafter, a first embodiment of the present invention will be described with reference to the accompanying drawings.

<Explanation on basic configuration>
The “query optimization method (query optimization system) for evaluating a method of placing the intermediate results of a query” according to the present invention presupposes the device configuration shown in FIGS. 3A and 3B, and its processing is performed in that environment.

Therefore, before describing the “query optimization method for evaluating a method of placing the intermediate results of a query” according to the present invention, the presupposed device configuration of FIGS. 3A and 3B and the overall processing described in FIGS. 4A and 4B will be described.

FIGS. 3A and 3B are representative; only the most essential components are extracted and defined. Components other than these may therefore be included. The device configuration broadly includes one or more networks and at least three functional layers. As the networks, various networks 1001 and 1002 such as a LAN, a WAN, or the Internet are defined as separate networks, but they may be regarded as one and the same. There is also no need to distinguish their types here. The various networks 1001 and 1002 include relay devices such as routers and switches.

Of the three functional layers, the first functional layer 1100 includes application program-1: 1013 and application program-L: 1014. The second functional layer 1200 includes a database access unit-1: 1101 and a database access unit-M: 1012. The third functional layer 1300 includes resource management / transaction control units 1015, 1016, 1017, and 1018, a distributed storage 1003, and the like.

The functional layers access one another via the various networks 1001 and 1002, such as the aforementioned LAN, WAN, or Internet.

[Hardware example]
As an example of each functional layer, a computer such as a PC (personal computer), an appliance, a workstation, a mainframe, and a supercomputer is assumed. Further, not limited to computers, relay devices such as routers and switches, intermediate devices such as firewalls and bandwidth control devices, other communication devices, electronic devices, dedicated devices, and the like may be used.

Although not shown, such a computer or the like is realized by a processor that is driven by a program and executes predetermined processing, a memory that stores the program and various data, and an interface used for communication with a network.

As examples of the above processor, a CPU (Central Processing Unit), a microprocessor, a microcontroller, or a semiconductor integrated circuit (Integrated Circuit (IC)) having a dedicated function can be considered.

Examples of the above memory include semiconductor storage devices such as a RAM (Random Access Memory), a ROM (Read Only Memory), an EEPROM (Electrically Erasable and Programmable Read Only Memory), and a flash memory; auxiliary storage devices such as an HDD (Hard Disk Drive) and an SSD (Solid State Drive); removable disks such as a DVD (Digital Versatile Disk); and storage media such as an SD memory card (Secure Digital memory card). A register may also be used.

Note that the processor and the memory may be integrated. For example, in recent years, microcomputers and the like have increasingly been integrated into a single chip. A case is therefore conceivable in which a one-chip microcomputer mounted on a computer includes both a processor and a memory.

Examples of the above interface include semiconductor integrated circuits such as boards (motherboards and I/O boards) and chips that support network communication, network adapters such as a NIC (Network Interface Card) and similar expansion cards, communication devices such as antennas, and communication ports such as connection ports (connectors).

Examples of networks include the Internet, LAN (Local Area Network), wireless LAN (Wireless LAN), WAN (Wide Area Network), backbone (Backbone), cable TV (CATV) line, fixed telephone network, mobile phone network, WiMAX (IEEE 802.16a), 3G (3rd Generation), dedicated line (lease line), IrDA (Infrared Data Association), Bluetooth (registered trademark), serial communication line, data bus, and the like are conceivable.

That is, each unit (internal configuration) in each functional layer is not limited to a dedicated device, but may be a module or a component.

In practice, however, the configuration is not limited to these examples.

Application program-1: 1013 includes a database access client unit-1: 1060. The database access client unit-1: 1060 requests and transmits an execution query such as SQL to the database access unit-1: 1101.

The database access unit-1: 1011 includes a query reception unit 1048, a query analysis unit 1049, a dictionary access unit 1050, an optimization unit 1051, a process execution unit-1: 1053, and a process execution unit-N: 1054. The query reception unit 1048 receives a request execution query from the database access client unit-1: 1060. The dictionary access unit 1050 accesses a dictionary that manages the definition information to be included in the request execution query. The query analysis unit 1049 analyzes the request execution query based on the information from the dictionary access unit 1050. The optimization unit 1051 optimizes the analyzed request execution query using the “query optimization method for evaluating a method of placing the intermediate results of a query” according to the present invention. The process execution unit-1: 1053 and the process execution unit-N: 1054 receive the result from the optimization unit 1051, decompose the request execution query, and execute the individual execution units.

Further, the database access unit-1: 1101 includes a statistical information management unit agent management unit 1052, a transaction management control unit 1055, and a log and journal 1056. The transaction management control unit 1055 executes the system transactions in which the process execution unit-1: 1053 and the process execution unit-N: 1054 generate and manage intermediate results. The log and journal 1056 hold the log and journal of those system transactions. The statistical information management unit agent management unit 1052 collects the results of actually executing the request execution query elements and reflects them in the processing of the optimization unit 1051 described above.

The above-described query reception unit 1048 holds a designated query 1057, which is the request execution query actually requested. The above-described process execution unit-1: 1053 and process execution unit-N: 1054 hold the decomposed execution units 1058 and 1059.

FIGS. 3A and 3B show four sites in the functional layer that includes the above-described resource management / transaction control unit 1015 and the above-described distributed storage 1003. However, the present invention is not limited to this. The sites include the resource management / transaction control units 1015, 1016, 1017, and 1018, respectively.

In the site having the resource management / transaction control unit 1015, the distributed storage 1003 and the distributed storage 1004 are arranged in association with each other. The distributed storage 1003 has a database table area and an index area. The other distributed storage 1004 stores logs, journals, and the like. The number of distributed storages is not limited to two and may be one, or three or more. That is, at least one distributed storage exists.

Further, the site having the resource management / transaction control unit 1015 also includes a performance / error information / statistical information service 1019.

The database table area of the distributed storage 1003 stores a table Table.1: 1023 corresponding to the relation R₁, a table Table.2: 1024 corresponding to the relation R₂, a table Table.n: 1025 corresponding to the relation Rₙ, and a table 1030 of intermediate results. An index 1029 is stored in the index area.

The above-mentioned distributed storage 1004 includes a log and journal 1036, performance information and error information 1037, and statistical information 1038 for each object. The performance information and error information 1037 manages general performance information and error information. Possible ways of managing this information include integrating it, listing it, grouping it, and putting it into a database, but the management method is not limited to these examples. The statistical information 1038 for each object manages statistical information on access performance for the tables Table.1: 1023, Table.2: 1024, and Table.n: 1025 and for the index 1029 described above.

The site having the resource management / transaction control unit 1016 corresponds to a replica of the site having the resource management / transaction control unit 1015. Therefore, the same configuration as that of the site having the resource management / transaction control unit 1015 described above is adopted.

In the site having the resource management / transaction control unit 1016, the distributed storage 1005 and the distributed storage 1006 are arranged in association with each other. The distributed storage 1005 has a database table area and an index area. Another distributed storage 1006 stores Log, journal, and the like. In this case, the number of distributed storages is not limited to two and may be one or three or more. That is, at least one distributed storage exists.

Further, the site having the resource management / transaction control unit 1016 also includes a performance / error information / statistical information service 1020.

The table area and index area of the database in the distributed storage 1005 store the table Table.1: 1026, which corresponds to the relation R_1 and is a replica of the table Table.1: 1023; the table Table.2: 1027, which corresponds to the relation R_2 and is a replica of the table Table.2: 1024; the table Table.n: 1028, which corresponds to the relation R_n and is a replica of the table Table.n: 1025; and the intermediate result table 1032. An index 1031 is stored in the index area.

The above-mentioned distributed storage 1006 includes a log and journal 1039, performance information and error information 1040, and statistical information 1041 for each object. The performance information and error information 1040 manages general performance information and error information. The statistical information 1041 for each object manages statistical information on access performance for the tables Table.1: 1026, Table.2: 1027, and Table.n: 1028 and for the index 1031.

The site having the resource management / transaction control unit 1017 corresponds to a site specialized in index management. At the site having the resource management / transaction control unit 1017, the distributed storage 1007 and the distributed storage 1008 are arranged in association with each other. The distributed storage 1007 has a database table area and an index area. Another distributed storage 1008 stores Log, journal, and the like. In this case, the number of distributed storages is not limited to two and may be one or three or more. That is, at least one distributed storage exists.

Furthermore, the site having the resource management / transaction control unit 1017 also includes a performance / error information / statistical information service 1021.

The intermediate result table 1034 is stored in the table area and index area of the database in the distributed storage 1007. An index 1033 is stored in the index area.

The above-described distributed storage 1008 includes a log, journal 1042, performance information, error information 1043, and statistical information 1044 for each object. The performance information and error information 1043 manages general performance information and error information. The statistical information 1044 for each object manages the statistical information on the access performance for the index 1033 on the distributed storage 1007.

The site having the resource management / transaction control unit 1018 corresponds to a site specialized for managing intermediate results. At a site having the resource management / transaction control unit 1018, a distributed storage 1009 and a distributed storage 1010 are arranged in association with each other. The distributed storage 1009 has a database table area and an index area. Another distributed storage 1010 stores Logs, journals, and the like. In this case, the number of distributed storages is not limited to two and may be one or three or more. That is, at least one distributed storage exists.

Furthermore, the site having the resource management / transaction control unit 1018 also includes a performance / error information / statistical information service 1022.

The intermediate result table 1035 is stored in the table area and index area of the database in the distributed storage 1009.

The above-described distributed storage 1010 includes the log and journal 1045, the performance information and error information 1046, and the statistical information 1047 for each object. The performance information and error information 1046 manages general performance information and error information. The statistical information 1047 for each object manages statistical information on access performance for arbitrary objects on the distributed storage 1009.

<Description of overall operation>
Here, the entire processing contents shown in FIGS. 4A and 4B will be described as the first stage of the description of the “query optimization method for evaluating the arrangement method of intermediate results of queries” according to the present invention.

When accessing the database, the aforementioned database access client unit-1: 1060 transmits a request execution query as a message s100 to the query reception unit 1048 in the database access unit-1: 1101. The message s100 is written in SQL and is assumed to contain a request execution query that performs a natural join of the relations R_1 through R_n.

The query reception unit 1048 passes the designated query expression s101, which is equivalent to the designated query 1057, to the query analysis unit 1049. The query analysis unit 1049 obtains the definition information s102 from the dictionary access unit 1050, converts the designated query expression s101 into the intermediate expression s103, and passes it to the optimization unit 1051.

The optimization unit 1051 calculates an optimized execution plan and passes the partial execution plan descriptions s124 and s105 to the process execution units, from the process execution unit-1: 1053 through the process execution unit-N: 1054. For the optimization, various latency information and reliability information s104 is obtained from the statistical information management unit agent management unit 1052.

In the example of FIGS. 4A and 4B, the partial execution plan description s105 is passed to the processing execution unit-N: 1054 described above, and the query execution unit 1058 is held in that unit. Further, the partial execution plan description s124 is passed to the above-described process execution unit-1: 1053, and the query execution unit 1059 is held in that unit. In particular, the query execution unit 1058 generates the intermediate result table, whereas the query execution unit 1059 uses it. For this reason, the system transaction processing described later is performed.

To process the query execution unit 1058, the above-described processing execution unit-N: 1054 sends a message s106 containing a query execution request to the site having the above-described resource management / transaction control unit 1015, which was selected as the optimum site among the replicas.

The above-described resource management / transaction control unit 1015 accepts the message s106 and, in accordance with its contents, obtains the target row s107 from the table Table.1: 1023 corresponding to the relation R_1. Using that row as the driving side, it obtains the corresponding row s108 from the table Table.2: 1024 corresponding to the relation R_2. Furthermore, using the join result as the driving side, it obtains the row ID information s110 of the primary key by means of the index 1029 described above, and then obtains and joins the corresponding row s109 from the table Table.n: 1025 corresponding to the relation R_n.

Thereafter, the resource management / transaction control unit 1015 returns the response result, which is intermediate result information for the query execution unit 1058, as the message s111 once to the processing execution unit-N: 1054.

The response result described in the message s111 needs to be temporarily recorded in the distributed storage as an intermediate result table. For this reason, the processing execution unit-N: 1054 first determines the storage to be written to, using a part of the present invention, and then sends a message s112 requesting creation of an intermediate result table containing the response result to the site having the resource management / transaction control unit 1018 described above.

When the resource management / transaction control unit 1018 receives the message s112, it sends a request s113 for recording the pre-update image in the log and journal 1045 so that the intermediate result table is generated as a transaction. In this case, the pre-update image has no actual content, because the intermediate result table does not yet exist.

Thereafter, the resource management / transaction control unit 1018 described above extracts the intermediate result s114 described in the message s112 and writes it in the intermediate result table 1035. Thereafter, the state transits to a transaction commit wait state, and the resource management / transaction control unit 1018 returns the commit wait message s116 to the process execution unit-N: 1054.

The aforementioned process execution unit-N: 1054 issues a transaction state management request s117 in advance to the transaction management control unit 1055 in the database access unit-1: 1101. The transaction management control unit 1055 issues a registration request s118 to the Log and journal 1056 regarding the system transaction.

As a result, when the transaction management control unit 1055 determines that there is no particular problem in executing the transaction, it notifies the processing execution unit-N: 1054, which is the request source of the system transaction. The process execution unit-N: 1054 then transmits a commit request message s119 to the resource management / transaction control unit 1018 with respect to the message s112 related to the intermediate result table creation request.

When the above-described resource management / transaction control unit 1018 receives the above-described commit request message s119, the commit result writing s120 is performed on the above-described log and journal 1045.

Thereafter, in relation to the execution result of the transaction that created the intermediate result table 1035, a request s121 for writing the actual data to the statistical information 1047 for each object and a request s122 for writing to the performance information and error information 1046 are issued within the site of the resource management / transaction control unit 1018.

When the series of operations is completed, the resource management / transaction control unit 1018 described above sends a response message s123 to the requesting process execution unit-N: 1054.
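
The exchange from message s112 to response s123 can be sketched as a small state machine. This is only an illustrative sketch: the class name `IntermediateResultSite`, its fields, and the state strings are hypothetical and stand in for the site of the resource management / transaction control unit 1018; the source does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field


@dataclass
class IntermediateResultSite:
    """Hypothetical stand-in for the site of unit 1018 (not from the source)."""
    journal: list = field(default_factory=list)  # plays the role of log and journal 1045
    table: list = field(default_factory=list)    # plays the role of intermediate result table 1035
    state: str = "idle"

    def prepare(self, rows):
        # s113: record the pre-update image; it is empty because the table is new
        self.journal.append(("pre_image", list(self.table)))
        # s114: write the intermediate result rows
        self.table.extend(rows)
        # s116: report that the transaction is waiting for commit
        self.state = "commit_wait"
        return self.state

    def commit(self):
        # s119: a commit request is only legal in the commit-wait state
        if self.state != "commit_wait":
            raise RuntimeError("commit requested outside the commit-wait state")
        # s120: record the commit result in the journal
        self.journal.append(("commit", len(self.table)))
        self.state = "committed"
        return self.state  # corresponds to the final response s123
```

In this sketch the caller plays the role of the processing execution unit-N: 1054, calling `prepare` and, once the transaction manager has raised no objection, `commit`.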

After that, the process execution unit-1: 1053, which uses the intermediate result table 1035, is activated. The partial execution plan description s124 has been passed to the processing execution unit-1: 1053, and the query execution unit 1059 is held in that unit.

In order to process the query execution unit 1059, the processing execution unit-1: 1053 described above transmits a message s125 including a query execution request to the site having the resource management / transaction control unit 1018 described above. At the same time, a message s127 including a query execution request is transmitted to the site having the resource management / transaction control unit 1016 selected as the optimum site in the replica.

When the resource management / transaction control unit 1018 receives the message s125 including the query execution request, it obtains the corresponding row s126 from the intermediate result table 1035 in accordance with the contents of the message s125 and responds to the query execution unit 1059 of the requesting process execution unit-1: 1053.

When the above-described resource management / transaction control unit 1016 receives the message s127 including the query execution request, it obtains the corresponding rows from the table Table.(k+1) corresponding to the relation R_(k+1), and so on, in accordance with the contents of the message s127.

After that, the resource management / transaction control unit 1016 returns the response result for the query execution unit 1059 as the message s128 to the process execution unit-1: 1053.

The results of each of the above-described processing execution units-1: 1053 and the above-described processing execution unit-N: 1054 are returned to the query reception unit 1048 as result responses s129 and s130, respectively, and the results are integrated. The response message s131 is returned to the calling database access client unit-1: 1060.

When calculating the optimized execution plan, the aforementioned optimization unit 1051 obtains various latency information and reliability information s104 from the statistical information management unit agent management unit 1052. This information is expressed numerically. To supply it, the statistical information management unit agent management unit 1052 acquires and manages the information via the performance / error information / statistical information services 1019, 1020, 1021, and 1022 on the respective sites.

For example, the performance / error information / statistical information service 1019 described above is arranged at the site managed by the resource management / transaction control unit 1015 described above. It obtains the performance / error information s134 from the performance information and error information 1037 described above, and the statistical information s133 from the statistical information 1038 for each object.

Thereafter, the aforementioned performance / error information / statistical information service 1019 integrates them and reports them as a performance / error information / statistical information report s132. The above-described statistical information management unit agent management unit 1052 obtains the report s132 from the performance / error information / statistical information services 1019, 1020, 1021, and 1022 on all sites and uses it as the latency information and reliability information.

<Query optimization method that evaluates the layout of query intermediate results>
FIG. 5A and FIG. 5B are explanatory diagrams schematically showing the characteristic parts of the procedure of the “query optimization method for evaluating the arrangement method of intermediate results of queries” of the present invention. The query optimization method is implemented mainly by the optimization unit 1051 in FIGS. 3A and 3B, and the query analysis unit 1049, the transaction management control unit 1055, and the statistical information management unit agent management unit 1052 are used in the course of the processing. The procedure is briefly described below.

(Step 1) is performed by the query analysis unit 1049. The query analysis unit 1049 parses the designated query 1057 and obtains the set of tables it refers to, {R_1, R_2, ..., R_n}.

(Step 2) is performed by the optimization unit 1051. In the subordinate (Step 2.1), the optimization unit 1051 extracts, for each element table R_m of the table set, all possible access patterns, such as those that make effective use of an index, and adds them to the temporary join element list. This is performed for all the element tables R_m (m = 1 to n). Optimization candidates are thereby extracted.

(Step 3) is also performed by the optimization unit 1051. In this step, two major items are performed for each element table R _m in the table set (m = 1 to n).

[(Step 3) Subordinate processing]
First, in (Step 3.1) under (Step 3), for each element table R_m, only the optimum access pattern among those extracted in (Step 2.1) is kept in the temporary join element list and the others are deleted from it, so that the query elements that optimize the cost are selected. Subsequently, in (Step 3.2), for each element of the temporary join element list, all replicas {R_11, R_12, ..., R_1r, R_21, ..., R_2r, ..., R_nr} are created and added to the temporary join element list.
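
As a rough illustration of (Step 2.1) through (Step 3.2), the following sketch keeps the cheapest access pattern per table and then expands it into one entry per replica. The function name, the dictionary layout, and the assumption that every table has the same number r of replicas are all hypothetical simplifications, not details given in the source.

```python
def build_temporary_join_element_list(access_costs, replica_count):
    """Sketch of Steps 2.1 to 3.2 (hypothetical data layout).

    access_costs: {table_name: {access_pattern_name: cost}}
    replica_count: assumed number r of replicas per table
    """
    temporary_list = []
    for table, patterns in access_costs.items():
        # Step 3.1: keep only the cheapest access pattern found in Step 2.1
        best_pattern = min(patterns, key=patterns.get)
        # Step 3.2: create one entry R_mk per replica k = 1..r
        for k in range(1, replica_count + 1):
            temporary_list.append({
                "table": table,
                "replica": k,
                "access": best_pattern,
                "cost": patterns[best_pattern],
            })
    return temporary_list
```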

(Step 4) is also performed by the optimization unit 1051. (Step 4) and (Step 5) are among the characteristic features of the present invention, and (Step 4) has two subordinate procedures. In (Step 4), the subordinate processing is repeated for each entry R_mk in the temporary join element list (k = 1 to r).

[(Step 4) Subordinate processing]
In (Step 4.1), the optimization unit 1051 takes one entry out of the temporary join element list, which now includes the replicas, and obtains from the above-described statistical information management unit agent management unit 1052 the numerical information on latency and reliability for the distributed storage site holding that entry. Thereafter, in (Step 4.2), the temporary join element list is updated by attaching this numerical information on latency and reliability to the entry that was taken out (m = 1 to n).

(Step 5) is also performed by the optimization unit 1051. In (Step 5), the subordinate processing is repeated for each entry R_mk in the temporary join element list (m = 1 to n).

[(Step 5) Subordinate processing]
Here, in (Step 5.1), the optimization unit 1051 takes one entry out of the above-described temporary join element list, which includes the replicas. In (Step 5.2), the subordinate processing is repeated for the extracted entry R_mk (k = 1 to r). In (Step 5.2.1) under (Step 5.2), the numerical information on the latency and reliability of each replica of the extracted entry is evaluated.

Regarding this evaluation, a cost value is normally assigned to each entry as its evaluation value, and the smallest value is regarded as the most desirable. In the present invention, an extended cost value is defined by superimposing on the cost value a normalized latency value and also the reciprocal of the reliability. The entry with the smallest extended cost value is then evaluated as the best.

The normalized latency value becomes larger as the performance of the site or the communication quality of the various networks 1002, such as a LAN, a WAN, or the Internet, deteriorates, or as the distance to the site becomes longer. Likewise, the lower the reliability of the site, the larger its reciprocal. Consequently, the farther away the site is and the worse its quality, the worse (larger) its extended cost value becomes.
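
The text does not state how the latency and reliability terms are "superimposed" on the cost value. The sketch below assumes plain weighted addition, which is only one possible reading; the function name `extended_cost` and the two weight parameters are hypothetical.

```python
def extended_cost(cost, normalized_latency, reliability,
                  latency_weight=1.0, reliability_weight=1.0):
    """Extended cost under an ASSUMED additive superposition.

    cost:               conventional cost value of the entry
    normalized_latency: grows as the site or network gets slower or farther
    reliability:        value in (0, 1]; its reciprocal grows as reliability falls
    The weights are hypothetical tuning knobs, not part of the source.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return (cost
            + latency_weight * normalized_latency
            + reliability_weight / reliability)
```

With these assumptions, two replicas with the same conventional cost are separated by their sites: the one on the slower or less reliable site receives the larger extended cost and is ranked lower.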

Then, in (Step 5.3), the designated top p replicas whose extended cost values are valid (that is, conform to predetermined conditions) are determined, for each table type, as the elements of the temporary join element list that will actually be used. At this time, the top p replicas are kept and the others are deleted from the temporary join element list.
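
A minimal sketch of the top-p selection in (Step 5.3) follows. It assumes each entry is a dictionary carrying a precomputed `extended_cost` field and that validity is checked by a caller-supplied predicate; both are hypothetical conventions chosen for illustration.

```python
import heapq
from collections import defaultdict


def keep_top_p(entries, p, is_valid=lambda entry: True):
    """Keep, per table, the p valid replicas with the smallest extended cost.

    entries:  list of dicts with (assumed) keys "table" and "extended_cost"
    is_valid: hypothetical predicate for the "predetermined conditions"
    """
    by_table = defaultdict(list)
    for entry in entries:
        if is_valid(entry):
            by_table[entry["table"]].append(entry)
    kept = []
    for group in by_table.values():
        # nsmallest returns the p cheapest entries in ascending cost order
        kept.extend(heapq.nsmallest(p, group, key=lambda e: e["extended_cost"]))
    return kept
```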

In (Step 6), the initial version of the join element list {R_11^o, ..., R_1p^o, R_21^o, ..., R_2p^o, ..., R_np^o} is created from the entries remaining in the above temporary join element list and is added to the optimization plan element list.

After that, in (Step 7), the temporary join element list is deleted.

(Step 8) also includes characteristic features of the present invention, in particular (Step 8.3), (Step 8.4), and (Step 8.7) to (Step 8.9).

In (Step 8), the optimization unit 1051 repeats the subordinate processing until only one type of element, not counting replicas, remains in the join element list.

[(Step 8) Subordinate processing]
In (Step 8.1), as in the conventional method, the maximum block size k is set to the smaller of the designated value and the number of elements in the join element list. Thereafter, in (Step 8.2), the initial value of the number of joins is set to 2. In (Step 8.3), the processing system of the optimization unit 1051 copies itself, and the maximum block size on the copied side is rewritten to (k + 1). Thereafter, in (Step 8.4), the copied processing system is started. An intermediate result table is generated when the maximum block size is k, and this look-ahead is performed in order to determine rationally, in advance, at which distributed storage site that table should be placed.

To place the intermediate result table, it is not enough to evaluate the above-mentioned numerical information on latency and reliability. The table must also be placed so that the total cost value is minimized when a further join operation is performed using the intermediate result table. For this purpose, it is necessary to know which replica will be accessed.

Specifically, the intermediate result table is actually written when the maximum block size is k. By setting the block size to (k + 1) instead, it can be determined which replica table is joined with the result of the k joins. The above-mentioned extended cost value then needs to be reduced by placing the intermediate result table at the distributed storage site where that replica table resides, or at a distributed storage site in its vicinity.

For this purpose, the processing system of the optimization unit 1051 itself is copied, the maximum block size of the copy is changed to (k + 1), and the two evaluations are performed in parallel, so that this information can be obtained.
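
The copy-and-look-ahead idea of (Step 8.3) and (Step 8.4) might be sketched as follows. Everything here is an assumption made for illustration: `OptimizerState`, `plan`, and `run_with_lookahead` are hypothetical names, `plan` is a placeholder that merely reports the block size it would use, and a thread pool is just one possible way to run the two evaluations in parallel.

```python
import copy
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class OptimizerState:
    """Hypothetical snapshot of the optimizer's processing system."""
    max_block_size: int
    join_elements: list


def plan(state):
    # Placeholder for the block-wise enumeration of Step 8.5; it only
    # returns the block size that would effectively be used (Step 8.1).
    return min(state.max_block_size, len(state.join_elements))


def run_with_lookahead(state):
    # Step 8.3: copy the processing system and raise its block size to k + 1
    lookahead = copy.deepcopy(state)
    lookahead.max_block_size = state.max_block_size + 1
    # Step 8.4: start the copy and let both evaluations run in parallel
    with ThreadPoolExecutor(max_workers=2) as pool:
        lookahead_future = pool.submit(plan, lookahead)
        main_future = pool.submit(plan, state)
        return main_future.result(), lookahead_future.result()
```

Deep-copying the state keeps the non-copy side at block size k, so the look-ahead cannot disturb the plan that is actually executed.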

In (Step 8.5), the subordinate processing is repeated as long as the condition (designated number of joins) ≦ (maximum block size k) is satisfied; on the copy side, the bound is (k + 1).

Note that (Step 8.5.1) through (Step 8.5.8) are also found in the conventional method.

[(Step 8.5) Subordinate processing]
First, in (Step 8.5.1), all sets that form combinations of the designated number of joins are created from the initial version of the join element list {R_11^o, ..., R_1p^o, R_21^o, ..., R_2p^o, ..., R_np^o} and added to the temporary list. Subsequently, in (Step 8.5.2), the subordinate processing is repeated for all the elements of the temporary list.

[(Step 8.5.2) Subordinate processing]
First, in (Step 8.5.2.1), one element is extracted from the temporary list. Subsequently, in (Step 8.5.2.2), the subordinate processing is repeated until the temporary list portions 1 and 2 can be defined together.

[(Step 8.5.2.2) Subordinate processing]
First, in (Step 8.5.2.2.1), the element is divided in two to create two sets of the elements {R_kp^o}, which are designated {temporary list part 1} and {temporary list part 2}. Subsequently, in (Step 8.5.2.2.2), the combination {temporary list part 1} {temporary list part 2} is added to the temporary operation list. Subsequently, in (Step 8.5.2.2.3), the combination {temporary list part 2} {temporary list part 1} is added to the temporary operation list.
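
The enumeration in (Step 8.5.1) and (Step 8.5.2) can be illustrated with the standard library. The sketch assumes that the join elements are distinct hashable values and that every split into two non-empty parts is considered; the function name is hypothetical.

```python
from itertools import combinations


def enumerate_ordered_splits(elements, join_count):
    """Build the temporary operation list for one join count (sketch).

    Each subset of size join_count is split into two non-empty parts,
    and both orders (part 1, part 2) and (part 2, part 1) are recorded.
    Assumes the elements are distinct.
    """
    operations = []
    for subset in combinations(elements, join_count):  # Step 8.5.1
        first, rest = subset[0], subset[1:]
        # Part 1 always contains `first`, so each unordered split is
        # generated exactly once; part 2 is never empty.
        for extra_size in range(len(rest)):
            for extra in combinations(rest, extra_size):
                part1 = (first,) + extra
                part2 = tuple(e for e in subset if e not in part1)
                operations.append((part1, part2))  # part 1 then part 2
                operations.append((part2, part1))  # part 2 then part 1
    return operations
```

For three elements and a join count of 2 this yields six ordered pairs, two for each of the three possible two-element subsets.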

[(Step 8.5) Continuation of subordinate processing 1]
After (Step 8.5.2) is completed, the optimization plan list is initialized in (Step 8.5.3) under (Step 8.5). Subsequently, in (Step 8.5.4), the subordinate processing is repeated for all the elements of the temporary operation list.

[(Step 8.5.4) Subordinate processing]
First, in (Step 8.5.4.1), one element is taken out of the temporary operation list. Subsequently, in (Step 8.5.4.2), the optimization plan for the right-hand {temporary list part x (x = 1, 2)} of the extracted element is retrieved from the optimization plan element list or, if it is not there, created. Subsequently, in (Step 8.5.4.3), the optimization plan for the left-hand {temporary list part x (x = 1, 2)} of the extracted element is likewise retrieved from the optimization plan element list or created. Subsequently, in (Step 8.5.4.4), a join plan is created from the two optimization plans of (Step 8.5.4.2) and (Step 8.5.4.3) and added to the optimization plan list.

[(Step 8.5) Continuation of subordinate processing 2]
After (Step 8.5.4) is finished, in (Step 8.5.5) under (Step 8.5), the single optimal plan is taken out of the optimization plan list and added to the optimization plan element list. Subsequently, in (Step 8.5.6), the remaining non-optimal plans in the optimization plan list are organized and retained as a library. Subsequently, in (Step 8.5.7), the temporary list and the temporary operation list are cleared. Subsequently, in (Step 8.5.8), 1 is added to the number of joins (number of joins ← number of joins + 1).
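
A compact sketch of (Step 8.5.4) through (Step 8.5.6) is given below. It assumes that the plans for both parts already exist in a lookup table keyed by the part, and that the cost of a join is supplied by a caller-provided function; `plan_table`, `join_cost`, and the dictionary layout are hypothetical.

```python
def best_plan_for(operations, plan_table, join_cost):
    """Combine sub-plans and separate the optimum from the rest (sketch).

    operations: list of (left_part, right_part) tuples
    plan_table: {part: {"tree": ..., "cost": float}}, assumed to be pre-filled
    join_cost:  hypothetical function (left_part, right_part) -> float
    Returns (best_plan, library_of_non_optimal_plans).
    """
    candidates = []
    for left, right in operations:
        left_plan = plan_table[left]    # Step 8.5.4.3
        right_plan = plan_table[right]  # Step 8.5.4.2
        candidates.append({             # Step 8.5.4.4
            "tree": (left_plan["tree"], right_plan["tree"]),
            "cost": left_plan["cost"] + right_plan["cost"] + join_cost(left, right),
        })
    if not candidates:
        return None, []
    best = min(candidates, key=lambda plan: plan["cost"])   # Step 8.5.5
    library = [plan for plan in candidates if plan is not best]  # Step 8.5.6
    return best, library
```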

[(Step 8) Continuation of subordinate processing 1]
In (Step 8.6), as in the conventional method, a greedy algorithm is used to select, from the elements of the optimization plan element list created in (Step 8.5), the join tree that minimizes the cost function. The greedy algorithm is the same as the conventional one.

(Step 8.7) is a characteristic feature of the present invention. In (Step 8.7), it is determined whether the processing system of the optimization unit 1051 that is executing in parallel is the side that made the copy and runs with the maximum block size k (the non-copy processing system) or the side that was copied and runs with the maximum block size (k + 1) (the copy processing system). The subsequent processing differs depending on the result.

[Copy processing system]
When the processing system of the optimization unit 1051 is the copy processing system, which was copied and runs with the maximum block size (k + 1), it waits in (Step 8.7.1) for an inquiry from the non-copy processing system. Thereafter, in (Step 8.7.2), it determines the type of the join tree {T_1^o} of the intermediate result table specified in the inquiry from the non-copy processing system that made the copy. Here it evaluates whether the join tree serves as the left side (outer side) / driving table or as the right side (inner side) / reference table, and replies to the non-copy processing system with which table at which site is to be joined with the result of the k joins. Thereafter, in (Step 8.7.3), it terminates itself. That is, the copy processing system ends.

[Non-copy processing system]
When the processing system of the optimization unit 1051 is the non-copy processing system, which made the copy and runs with the maximum block size k, the join tree is provisionally set as the new intermediate result table {T_1^o} in (Step 8.7.4). Thereafter, in (Step 8.7.5), it asks the processing system of the optimization unit 1051 copied with the maximum block size (k + 1) at which site this result should be placed.

The processing system of the optimization unit 1051 copied with the maximum block size (k + 1) normally terminates itself in (Step 8.7.3). In case it has not terminated after being used, it may also be forcibly terminated in (Step 8.7.5).

[(Step 8) Continuation of subordinate processing 2]
Thereafter, in (Step 8.8), the placement evaluation function is called, and the top two plans are extracted for the site at which the intermediate result table is to be placed and for the policy of creating an index or the like on it. Creating an index or the like incurs an additional cost, but the index greatly improves search speed. It is therefore necessary to weigh “creating an index or the like” against “searching with the index or the like” in a balanced manner.

[Placement evaluation function]
The placement evaluation function is carried out by the following procedure.
In the first step (Step f), the type of the join tree {T_1^o} of the intermediate result table, as determined by the processing system of the optimization unit 1051 copied with the maximum block size (k + 1), is confirmed. When the join tree {T_1^o} is of the left side (outer side) / driving table type, (Step f.1.1) to (Step f.1.12) are performed in sequence. When the join tree {T_1^o} is of the right side (inner side) / reference table type, (Step f.2.1) to (Step f.2.15) are performed in sequence.

[When the join tree {T_1^o} is of the left side (outer side) / driving table type]
In (Step f.1.1), the next join operation target determined in (Step 8.7.2) (in this case, the inner side / reference table) is confirmed. After that, in (Step f.1.2), all the sites (including replicas) used by the next join operation target (inner side / reference table) are extracted and added to the temporary site list.

In other words, (Step f.1.1) and (Step f.1.2) determine which tables and sites the copied processing system of the optimization unit 1051 uses as the next join operation target (in this case, the inner side / reference table) and put them in a list. In contrast, (Step f.1.3) to (Step f.1.7) search the unused sites for up to two candidates at which the intermediate result table can be placed.

Specifically, in (Step f.1.3), a list of unused sites is extracted: the sites not included in the temporary site list are taken out and added to the candidate temporary list. In (Step f.1.4), a search candidate area with two slots is prepared and initialized. Thereafter, in (Step f.1.5), the subordinate processing is repeated for each element of the candidate temporary list.

[(Step f.1.5) Subordinate processing]
In (Step f.1.5.1), the site information is read sequentially. In (Step f.1.5.2), the numerical information on latency and reliability for the site is read from the measurement statistics DB. The following subordinate processing is performed each time such numerical information is read.

[When the candidate area still contains an initialized slot]
In (Step f.1.5.2.1), the first two sites are set as candidates in the above-described search candidate area.

[When the candidate area is already filled and the site is not registered in the temporary site list]
In (Step f.1.5.2.2), as (Step f.1.5.2) is repeated, a site whose numerical information on latency and reliability is evaluated more highly replaces a site already listed in the search candidate area. No other processing is performed.
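
The two-slot search of (Step f.1.3) through (Step f.1.5) might look like the following sketch. The scoring rule (normalized latency plus the reciprocal of reliability, lower being better) and the rule that a better site replaces the currently worst candidate are assumptions; the source only says that a more highly evaluated site replaces a listed one.

```python
def pick_two_candidate_sites(all_sites, used_sites, stats):
    """Pick up to two unused sites with the best latency/reliability (sketch).

    all_sites:  iterable of site identifiers
    used_sites: set of sites already in the temporary site list
    stats:      {site: (normalized_latency, reliability)}, standing in
                for the measurement statistics DB
    """
    def score(site):
        latency, reliability = stats[site]
        return latency + 1.0 / reliability  # assumed metric; lower is better

    candidates = [None, None]  # Step f.1.4: two initialized slots
    for site in all_sites:
        if site in used_sites:  # Step f.1.3: only unused sites qualify
            continue
        if None in candidates:  # Step f.1.5.2.1: fill an empty slot first
            candidates[candidates.index(None)] = site
        else:                   # Step f.1.5.2.2: replace the worse candidate
            worst = max(candidates, key=score)
            if score(site) < score(worst):
                candidates[candidates.index(worst)] = site
    return [site for site in candidates if site is not None]
```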

If, as a result of performing (Step f.1.5) on all the sites in the candidate temporary list, two sites have been identified as new candidates, they are added in (Step f.1.6) to the temporary site list, which represents the candidate sites at which the join tree {T_1^o} of the intermediate result table may be provisionally placed. Thereafter, in (Step f.1.7), the candidate temporary list is cleared. Subsequently, in (Step f.1.8), the subordinate processing is repeated for each element of the temporary site list, so that each candidate site for provisionally placing the join tree {T_1^o} of the intermediate result table is evaluated concretely.

[Process under (Step.f.1.8)]
In (Step .f.1.8.1), the table creation cost is estimated from the amount of data created for the join tree {T ₁ ^o } (the number of rows and records created). After that, in (Step .f. 1.8.2), numerical information related to the latency and reliability of the site is read from the measurement statistics DB. Thereafter, in (Step .f. 1.8.3), the normalized latency of the site obtained in (Step .f. 1.8.2) and the reciprocal of the numerical information related to the reliability are superimposed on the above estimate of the table creation cost to obtain the overall cost. Further, in (Step .f. 1.8.4), a plan number is assigned to this overall cost, and it is added to the evaluation list as one of the candidates. In (step .f. 1.8.5), whether an index can be created is determined. If an index can be created, the subordinate process is executed.

[If the index can be created]
In (Step.f.1.8.5.1), the index creation cost is estimated from the overall cost obtained in (Step .f. 1.8.3), and the overall cost including the index creation cost is calculated again. After that, in (Step .f.1.8.5.2), a plan number is assigned to the newly calculated overall cost, and it is added to the evaluation list as another candidate.
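The cost evaluation under (Step .f.1.8) can be illustrated with the following sketch. The text does not define the operator behind "superimposing", so multiplication is assumed here; `cost_per_row`, `index_ratio`, and all function names are likewise hypothetical parameters chosen only for the example.

```python
def superimpose(base_cost, normalized_latency, reliability):
    # ASSUMPTION: "superimposing" is modeled as multiplying the base cost by
    # the normalized latency and by the reciprocal of the reliability.
    return base_cost * normalized_latency * (1.0 / reliability)


def evaluate_site(site, rows_created, normalized_latency, reliability,
                  can_index, cost_per_row=1.0, index_ratio=0.2):
    """Return the candidate plans for one site, without plan numbers."""
    # (Step f.1.8.1): table creation cost from the number of rows created.
    table_creation_cost = rows_created * cost_per_row
    # (Step f.1.8.3): overall cost for this site.
    overall = superimpose(table_creation_cost, normalized_latency, reliability)
    plans = [{"site": site, "indexed": False, "cost": overall}]
    # (Step f.1.8.5): optional second candidate that also builds an index.
    if can_index:
        index_cost = overall * index_ratio  # hypothetical estimate
        plans.append({"site": site, "indexed": True,
                      "cost": overall + index_cost})
    return plans


def build_evaluation_list(site_rows):
    """Assign plan numbers in the order candidates are added to the list."""
    evaluation_list = []
    for args in site_rows:
        for plan in evaluate_site(*args):
            plan["plan_no"] = len(evaluation_list) + 1
            evaluation_list.append(plan)
    return evaluation_list
```

Each site therefore contributes one candidate, or two when an index can be created, which matches the two entries described for (Step .f. 1.8.4) and (Step .f.1.8.5.2).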

After all the candidate sites for temporarily placing the join tree {T ₁ ^o } of the intermediate result table have been evaluated in (Step .f.1.8), the data in the above-described evaluation list is sorted and ranked according to the overall cost in (Step .f.1.9). Then, the plan with the lowest overall cost is stored as the first plan.

After that, in (step .f.1.8.10), the plan that has a configuration similar to that of the first plan, a different placement destination, and the next lowest overall cost is stored as the second plan. Here, a similar configuration means a configuration with an index when the first plan creates an index, and a configuration without an index when it does not.

After that, in (Step f.1.8.11), the site temporary list and the evaluation list are cleared as post-processing. Thereafter, in (Step .f. 1.8.12), the above-mentioned first plan and second plan are returned.
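The selection of the first and second plans can be sketched as below. The plan records are hypothetical dictionaries with `site`, `indexed`, and `cost` keys; only the selection rule (lowest cost first, then the cheapest plan with the same index configuration on a different site) is taken from the description.

```python
def choose_plans(evaluation_list):
    """Return (first_plan, second_plan); second_plan may be None."""
    # Sort and rank the evaluation list by overall cost.
    ranked = sorted(evaluation_list, key=lambda p: p["cost"])
    if not ranked:
        return None, None
    first = ranked[0]
    # Second plan: same configuration (index or no index), different
    # placement destination, next lowest overall cost.
    second = next(
        (p for p in ranked[1:]
         if p["indexed"] == first["indexed"] and p["site"] != first["site"]),
        None,
    )
    return first, second
```

Returning a second plan on a different site is what allows the intermediate result to be placed speculatively at two locations.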

The processing described up to this point applies when the type of the join tree {T ₁ ^o } is the left side (external side) / driving table.

[When the join tree {T ₁ ^o } type is right (inside) / reference table]
When the type of the join tree {T ₁ ^o } is the right side (inside) / reference table, the processing is more complicated than when it is the left side (outside) / driving table. This is because a hash join may be used. Specifically, a hash is created after the join tree {T ₁ ^o }, which is an intermediate generation table, has been written to storage. However, since the hash is a data structure created in memory, the site that writes to storage and the site that holds the hash may be different sites and devices.

In (Step .f.2.1), the next join operation destination (in this case, external side / drive table) in (Step 8.7.2) is confirmed. Thereafter, in (step f.2.2), all site groups (including replicas) used at the next join operation destination (external side / drive table) are taken out and added to the temporary site list. After that, in (Step .f2.3), a list of sites that are not used is extracted. Here, a site not included in the temporary site list is taken out and added to the candidate temporary list. In (Step .f.2.4), two candidate areas for search are prepared and initialized. Thereafter, in (step .f.2.5), the subordinate processing is repeatedly performed for the candidate temporary list element.

[Processing under (Step.f.2.5)]
In (step .f.2.5.1), the site information is read sequentially. In (step .f.2.5.2), numerical information regarding latency and reliability related to the site is read from the measurement statistics DB. In addition, the subordinate process is performed every time numerical information related to latency and reliability is read.

[When an entry in the candidate area is still in its initialized (empty) state]
In (step .f.2.5.2.1), the first two sites are set as candidates in the above-described search candidate area.

[When the candidate area is already filled and the site is not registered in the temporary site list]
In (Step .f.2.5.2.2), when the numerical information on latency and reliability read in a repetition of (Step .f.2.5.2) is evaluated more highly than that of a site listed in the search candidate area, that site is replaced. Otherwise, no processing is performed.

When two new candidate sites have been identified as a result of performing (Step .f.2.5) on all sites in the candidate temporary list, they are added in (Step .f.2.6) to the temporary site list, which holds the candidate sites on which the join tree {T ₁ ^o } of the intermediate result table is to be temporarily placed. Thereafter, in (step .f.2.7), the candidate temporary list is cleared. Subsequently, in (step .f. 2.8), the subordinate processing is repeated for each element of the temporary site list, so that each candidate site for temporarily placing the join tree {T ₁ ^o } of the intermediate result table is evaluated concretely.

[Process under (Step.f.2.8)]
In (Step .f.2.8.1), the table creation cost is estimated from the amount of data created for the join tree {T ₁ ^o } (the number of rows and records created). Thereafter, in (Step .f. 2.8.2), numerical information related to the latency and reliability of the site is read from the measurement statistics DB. Thereafter, in (Step .f. 2.8.3), the normalized latency of the site obtained in (Step .f. 2.8.2) and the reciprocal of the numerical information related to the reliability are superimposed on the above estimate of the table creation cost to obtain the table part overall cost. Further, in (Step .f. 2.8.4), a plan number is assigned to the table part overall cost, and it is added to the table part evaluation list as one of the candidates.

Except for (Step .f.2.1.1.1), (Step .f.2.1) to (Step .f. 2.8.4) are equivalent to the above-described (Step .f.1.2) to (Step .f. 1.8.4), and they calculate the table part overall cost. On the other hand, (Step .f2.9) to (Step .f.2.15) are processing specific to the case where the type of the join tree {T ₁ ^o } is the right side (inside) / reference table.

In (step .f. 2.9), when the type of the join tree {T ₁ ^o } is the right side (inside) / reference table, the candidates are limited to the three with the smallest table part overall cost in order to suppress the overall amount of processing. Here, the three entries with the smallest table part overall cost are selected from the table part evaluation list, and the others are deleted.

Subsequently, in (Step .f.2.10), the subordinate processing is repeatedly performed for the site temporary list element.

[Process under (Step.f.2.10)]
In (Step .f.2.10.1), the creation cost of the above-described hash, which is built in memory, is estimated. In (Step .f.2.10.2), the normalized latency of the site obtained in (Step .f. 2.8.2) and the reciprocal of the numerical information related to the reliability are superimposed on this estimate to obtain the reference cost. Furthermore, in (step .f.2.10.3), a plan number is assigned to this reference cost, and it is added to the reference partial evaluation list as one of the candidates.

There may also be cases where a hash join is not used. In that case, an index may be created. Therefore, in (Step .f.2.10.4), the above-mentioned index creation cost is estimated. In (Step .f.2.10.5), the normalized latency of the site obtained in (Step .f. 2.8.2) and the reciprocal of the numerical information related to the reliability are superimposed on this estimate to obtain the reference cost. Further, in (Step .f.2.10.6), a plan number is assigned to this reference cost, and it is added to the reference partial evaluation list as one of the candidates.

Subsequently, in (Step .f.2.11), the table part overall cost and the reference cost are combined in order to evaluate the case where the intermediate result table consisting of the above-described join tree {T ₁ ^o } and the above-described hash are created on different sites. Here, in (Step .f.2.11), the subordinate processing is repeated for each element of the table partial evaluation list.

[Process under (Step.f.2.11)]
In (step .f.2.11.1), one item is taken out from the table partial evaluation list in the order of the plan number. In (Step .f.2.11.2), the table portion overall cost and the reference cost are each weighted by a constant weight, and the sum is taken to calculate the overall cost. Thereafter, in (Step .f.2.11.2), a plan number is assigned to the calculated overall cost and added to the evaluation list.

(Step .f.2.12) sorts and ranks the data in the above-described evaluation list according to the overall cost. Then, the one with the lowest overall cost is stored as the first plan.
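The reference-table case of (Step .f. 2.9) to (Step .f.2.12) can be sketched as follows. The description says only that the two costs are "weighted by a constant weight" and summed, so the weights `w_table` and `w_ref`, the pairing of every retained table-part plan with every reference-part plan, and the record layout are assumptions made for illustration.

```python
import heapq


def combine_costs(table_part_list, reference_part_list,
                  w_table=0.5, w_ref=0.5, keep=3):
    """Build the evaluation list for the right side / reference table case."""
    # (Step f.2.9): keep only the `keep` cheapest table-part plans.
    top = heapq.nsmallest(keep, table_part_list, key=lambda p: p["cost"])
    evaluation_list = []
    # (Step f.2.11): weighted sum of table part cost and reference cost.
    # ASSUMPTION: every retained table-part plan is paired with every
    # reference-part plan (hash or index), possibly on a different site.
    for table_plan in top:
        for ref_plan in reference_part_list:
            evaluation_list.append({
                "plan_no": len(evaluation_list) + 1,
                "table_site": table_plan["site"],
                "ref_site": ref_plan["site"],
                "kind": ref_plan["kind"],  # "hash" or "index"
                "cost": w_table * table_plan["cost"] + w_ref * ref_plan["cost"],
            })
    # (Step f.2.12): sort and rank by overall cost.
    evaluation_list.sort(key=lambda p: p["cost"])
    return evaluation_list
```

Pruning to the top three before pairing keeps the number of combined candidates bounded by three times the number of reference-part plans.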

After that, in (Step .f.2.13), the plan that has a configuration similar to that of the first plan, a different placement destination, and the next lowest overall cost is stored as the second plan. Here, a similar configuration specifically means a configuration with an index when the first plan creates an index, and a configuration without an index when it does not.

Thereafter, in (step .f.2.14), as post-processing, the site temporary list, the table partial evaluation list, the reference partial evaluation list, and the evaluation list are cleared. Thereafter, in (Step .f.2.15), the above-mentioned first plan and second plan are returned.

With the above procedure, the placement evaluation function ends its processing. After that, in (Step .8.9), based on the result of the placement evaluation function, placement of the intermediate result is reserved at one location or, speculatively, at two locations.

The processing after (Step 8.10) is processing that can also be seen in the conventional method. In (Step .8.10), the elements of the connection tree {T _k1 ^o , T _k2 ^o } corresponding to the arranged intermediate result table are added to the current version of the connection element list. In response to this, in (step 8.11), the joining elements constituting the joining tree {T _k1 ^o , T _k2 ^o } are deleted from the current version of the joining element list.

The processing of (Step 8) is the core processing of optimization, and upon exiting this processing, the processing system of the optimization unit 1051 proceeds to (Step 9). In (Step 9), the remaining optimization is performed from the current version {T _m ^o } of the combined element list, and an optimization plan is added. Thereafter, the processing system of the optimization unit 1051 selects one optimization plan in (Step 10).

Thereafter, the processing system of the optimization unit 1051 re-expresses the optimization plan using the initial version {R ₁₁ ^o , ..., R _{1p} ^o , R ₂₁ ^o , ..., R _{2p} ^o , ..., R _{np} ^o } and the current version {T _m ^o }. Thereafter, in (Step 12), the processing system of the optimization unit 1051 passes parts of the execution plan descriptions s124 and s105 to the above-described processing execution unit-1: 1053 through processing execution unit-N: 1054, and the processing ends.

The initial stage 2000 in FIGS. 5A and 5B corresponds to (Step 1) to (Step 7). The repetition 2001 after the initial stage 2000 corresponds to (Step 8). Here, the relationship {R ₁ ^o , R ₂ ^o,. . . , R _n ^o }.

“Start of processing system and copy processing system” 2002 in FIGS. 5A and 5B corresponds to (Step .8.3) and (Step .8.4). The side that activates the copy processing system in the processing system of the optimization unit 1051 is represented by a processing sequence 2003 in FIGS. 5A and 5B. On the other hand, the side activated as the copy processing system is expressed by a processing sequence 2004. In particular, the core part of the processing content (step .8.5) is expressed by the processing sequence 2005 and the processing sequence 2006.

For this reason, in the processing sequence 2003, when the maximum block size k is set to 2, the intermediate result table is written to the distributed storage in the second stage, so the join processing is interrupted there, whereas in the processing sequence 2004, processing proceeds to the third stage, corresponding to the maximum block size (k + 1).

Initially, the processing sequence 2003 and the processing sequence 2004 proceed with exactly the same processing. In the first stage of processing, in order to minimize the cost calculation, a join operation is performed on two relations including the relation {R ₁ ^o } and the relation {R ₂ ^o }. As a result, a plurality of combined result proposals are created. The processing result of the first stage corresponds to the combined result 2007 in the processing sequence 2003 and corresponds to the combined result 2015 in the processing sequence 2004.

In the second stage of processing, a join operation is further performed on the first-stage join result. An example is a process that performs a join operation with the relation {R ₄ ^o } on the result of the join operation between the relation {R ₁ ^o } and the relation {R ₂ ^o }. As a result, a plurality of combined result proposals are created. The processing result of the second stage corresponds to the combined result 2008 in the processing sequence 2003 and to the combined result 2016 in the processing sequence 2004.

Thereafter, in the processing sequence 2003, a plurality of replacement candidates 2009 for the join tree {T ₁ ^o } corresponding to the intermediate generation table are generated. Furthermore, the selection result 2010 of the join tree {T ₁ ^o }, which is the optimum intermediate result table, is created from the replacement candidates 2009 by a greedy algorithm. This series of processing corresponds to (Step .8.5) and (Step .8.6).

On the other hand, in the processing sequence 2004, since the processing proceeds to the third stage, a join operation is further performed on the second-stage join result 2016. For example, this corresponds to a process in which the join result 2020 of the relation {R ₃ ^o } 2022 and the relation {R ₅ ^o } 2021 is additionally joined to the result of joining the relation {R ₁ ^o } and the relation {R ₂ ^o } and then the relation {R ₄ ^o }. As a result, a plurality of combined result proposals are created; these correspond to the combined result proposals 2017.

Thereafter, processing 2018 is performed to specify which replica table is used in k joins when the maximum block size is (k + 1). This corresponds to the first half of (Step 8.7.2).

Further, the processing unit of the optimizing unit 1051 waits for an inquiry from the side that activates the copy processing system. As a result of the inquiry 2023, the process 2019 is executed, the response 2024 is executed, and the process ends. This corresponds to the latter half of (Step 8.7.2) and (Step 8.7.3).

Here, in the above-described inquiry 2023, a join tree {T ₁ ^o } that is an intermediate result table is designated. On the other hand, the above-described response 2024 specifies whether the join tree {T ₁ ^o } becomes the left side (external side) / driving table or the right side (internal side) / reference table. It also specifies all the site groups and table groups that indicate which replica tables are used in the next join operation.

On the other hand, in the processing sequence 2003, which is the side that started the copy processing system in the processing system of the optimization unit 1051, the inquiry 2023 about the join destination assumed for the (k + 1)-th stage is issued in the processing 2012. Thereafter, the corresponding response 2024 is received. This is performed by (step 8.7.5).
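The exchange of the inquiry 2023 and the response 2024 between the two processing systems can be pictured with the following sketch. It models the copy processing system as a thread and the messages as queue items, which is purely an assumption for illustration; the `Inquiry` and `Response` classes, the `lookahead` table standing in for the result of the (k + 1)-stage join, and the function names are all hypothetical.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Inquiry:
    """Inquiry 2023: designates the join tree of the intermediate result."""
    join_tree: str


@dataclass
class Response:
    """Response 2024: role of the join tree and sites used by the next join."""
    role: str               # "outer/driving" or "inner/reference"
    next_join_sites: list   # site and replica names used at the next join


def copy_processing_system(inbox, outbox, lookahead):
    # The copy has already run with maximum block size (k + 1); `lookahead`
    # is a hypothetical table holding what it learned for each join tree.
    inquiry = inbox.get()                  # wait for the inquiry
    role, sites = lookahead[inquiry.join_tree]
    outbox.put(Response(role, sites))      # answer, then terminate


def ask_copy(join_tree, lookahead):
    """Side that started the copy: send the inquiry and wait for the answer."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(
        target=copy_processing_system, args=(inbox, outbox, lookahead))
    worker.start()
    inbox.put(Inquiry(join_tree))
    response = outbox.get()
    worker.join()
    return response
```

The response then drives the choice between the driving-table and reference-table branches of the placement evaluation function.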

After that, in processing 2013, a distributed storage site or the like for arranging the intermediate result table is specified. This is performed through (Step 8.9) after performing the evaluation in the arrangement evaluation function of (Step 8.8). The arrangement evaluation function follows the algorithm of the [placement evaluation function] described above.

Further, post-processing is performed in processing 2014. Specifically, this corresponds to (Step .8.10) and (Step .8.11).

As described above, the query optimization method of the present invention is a method for optimizing a query in a database management system. In particular, in an environment using large-scale virtualized storage, it is a method for more rationally placing the intermediate results that arise in the course of query optimization. The queries are assumed to be compositions of an arbitrary number of restriction operations and join operations in relational algebra.

This query optimization method first extracts numerical information related to latency and reliability for the sites holding the data. Next, the numerical information related to the latency and the numerical information related to the reliability are superimposed on the cost value of the query to obtain an extended cost value. Next, sites within an appropriate specified number from the top when ordered by the extended cost value are selected. Next, in order to select a placement destination site for the intermediate results generated as a result of optimization, prefetching is performed by creating a copy of the processing system itself in which some parameters of the algorithm are changed. Next, as a result of the prefetching, the intermediate results are placed on one or more sites. In this way, the placement method for the intermediate results of the query is evaluated.
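The first three steps of this summary (extended cost value, ordering, and selection of the top entries) can be sketched as below. As elsewhere, the way latency and reliability are "superimposed" is not specified in the text, so multiplying by the latency and dividing by the reliability is an assumption, as are the tuple layout and the function names.

```python
def extended_cost(query_cost, latency, reliability):
    # ASSUMPTION: higher latency raises the cost, higher reliability lowers it.
    return query_cost * latency / reliability


def top_replicas(replicas, query_cost, keep=2):
    """replicas: iterable of (name, latency, reliability) tuples.

    Returns the names of the `keep` replicas with the lowest extended cost.
    """
    ranked = sorted(
        replicas,
        key=lambda r: extended_cost(query_cost, r[1], r[2]),
    )
    return [name for name, _, _ in ranked[:keep]]
```

For example, `top_replicas([("r1", 0.4, 0.9), ("r2", 0.2, 0.99), ("r3", 0.3, 0.5)], 100.0)` ranks `r2` first and `r1` second, because `r3` is penalized by its low reliability despite its moderate latency.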

Conventional query optimization techniques do not make a rational judgment about which distributed storage sites the intermediate results of a query produced by optimization should be placed on. As a result, query optimization is not carried through to the implementation stage, and the agreed service level determined with the user of the database access client may not be satisfied.

In contrast, the present invention provides, as a query optimization method for a database management system, a method that evaluates the placement of intermediate results of a query and that includes: a step of extracting numerical information related to latency and reliability for the sites holding data; a step of superimposing the numerical information related to the latency and the numerical information related to the reliability on the cost value of the query to obtain an extended cost value; a step of selecting sites within an appropriate specified number from the top when ordered by the extended cost value; a step of performing prefetching by creating a copy of the processing system itself in which some parameters of the algorithm are changed, in order to select the site on which the intermediate result generated as a result of optimization is to be placed; and a step of placing the intermediate results on one or more sites as a result of the prefetching. The placement can thus be selected rationally, and the conventional problem can be solved.

The present invention assumes that the elemental technologies of a database management system, which used to be functional modules within a single system, are distributed over a wide-area network environment and deployed on a large scale. The query optimization technique is affected accordingly, and the invention can be regarded as one form of its evolution into a technique that assumes an environment distributed across the network. Because a wide-area network is involved, the invention spans several related technical fields.

The embodiments of the present invention have been described in detail above. However, actually, the present invention is not limited to the above-described embodiments, and modifications within a scope not departing from the gist of the present invention are included in the present invention.

<Remarks>
In addition, this application claims the priority based on the Japanese application number 2010-068822, and the disclosed content in the Japanese application number 2010-068822 is incorporated in this application by reference.

Claims

In a database management system, a query optimization system for evaluating an arrangement method of intermediate results of a query,
An apparatus for extracting numerical information on latency and reliability related to a site holding data;
An apparatus for obtaining an extended cost value by superimposing the extracted numerical information on latency and reliability on the cost value of the query;
A device that selects a site having a specified value that meets a predetermined condition at the top when ordered by an extended cost value;
A device that performs a read-ahead by creating a copy of the processing system itself in which some parameters in the algorithm are changed,
A query optimization system including a device that, as a result of the prefetching, selects, from among the sites having specified values that meet the predetermined condition, a placement destination site for the intermediate result generated as a result of optimization, and places the intermediate result on at least one of the selected placement destination sites.
The query optimization system of claim 1,
A device that parses the query syntax, extracts an access pattern assumed in each element table of the reference table set, and adds it to the temporary join element list;
An apparatus that takes out the optimal access condition from the temporary connection element list for each element table and leaves it, and deletes the other from the temporary connection element list;
An apparatus for creating entries for all replicas for each element of the temporary combining element list and adding the entries to the temporary combining element list;
An apparatus for retrieving an entry from the temporary combining element list, giving numerical information about latency and reliability to the entry, and updating the temporary combining element list;
An apparatus for evaluating numerical information related to latency and reliability of the entry of the temporary coupling element list;
As a result of the evaluation, a device that leaves a predetermined number of high-order replicas and deletes others from the temporary coupling element list;
An apparatus for creating an initial version of the combined element list based on the entry of the temporary combined element list and adding it to the optimized plan element list;
A query optimizing system further comprising: an apparatus for deleting the temporary joining element list.
The query optimization system according to claim 2,
A device that copies the same processing system and starts the copied processing system;
An apparatus for selecting, from the elements in the optimization plan element list, a join tree that minimizes an evaluation function for calculating a cost, using a greedy algorithm;
A device that determines whether it is a copy processing system or a non-copy processing system;
A device that, if it is a copy processing system, waits for an inquiry from the non-copy processing system, determines the type of the join tree for the intermediate result table specified in the inquiry from the non-copy processing system that performed the copy, evaluates whether it is the external side / driving table or the internal side / reference table, responds to the non-copy processing system with which table at which site the join operation is to be performed, and ends the copy processing system,
If the device itself is a non-copy processing system, a device that temporarily places the join tree in a new intermediate result table and inquires of the copy processing system,
A device that calls a placement evaluation function and retrieves an optimal plan based on a site on which the intermediate result table is to be placed and an index creation policy of the site;
A query optimization system further comprising: an apparatus that arranges the intermediate generation result at a predetermined location based on the result of the arrangement evaluation function.
The query optimization system according to claim 3,
In the processing of the placement evaluation function, a device for confirming the next join operation destination, taking out a site group used in the next join operation destination, and adding it to the temporary site list;
A device that takes out sites not included in the temporary site list and adds them to the candidate temporary list;
A device for preparing and initializing a candidate area for search;
A device that sequentially reads site information and reads numerical information related to latency and reliability related to the site;
A device that sets the initial predetermined number of sites as candidates in the candidate area for search, if the initialized ones in the candidate areas remain;
If a candidate area has already been set and is not registered in the temporary site list, if numerical information on latency and reliability is more highly evaluated, a device that replaces the site described in the candidate area for search;
When a new candidate site is identified, a device for adding the site to the temporary site list,
A device that estimates the cost of creating a table from the number of join trees created in the intermediate results table;
A device that superimposes the numerical information related to normalized latency and the reciprocal of the numerical information related to reliability on the estimate of the table creation cost to obtain the total cost, assigns a plan number to the total cost, and adds it to the evaluation list as one of the candidates;
A device that determines whether an index can be created and, if so, estimates the index creation cost from the total cost, recalculates the total cost including the index creation cost, assigns a plan number to the newly calculated total cost, and adds it to the evaluation list as another candidate,
Regarding the data in the evaluation list, the apparatus sorts and ranks according to the total cost, and stores the one with the minimum total cost as a first proposal,
An apparatus for storing a second plan that has the same configuration as the first plan but has a different placement destination and the next lowest overall cost,
A query optimization system further comprising: an apparatus that answers the first proposal and the second proposal.
In a database management system, a query optimization device for evaluating an arrangement method of intermediate results of a query,
Means for retrieving numerical information about latency and reliability of the site holding the data;
Means for superimposing the extracted latency and reliability numerical information on the query cost value to obtain an extended cost value;
A means for selecting a site having a specified value that meets a predetermined condition at the top when ordered by extended cost value,
A means of making a copy of the processing system itself in which some of the parameters in the algorithm are changed and performing prefetching;
A query optimization device comprising means for selecting, as a result of the prefetching and from among the sites having specified values that meet the predetermined condition, a placement destination site for the intermediate result generated as a result of optimization, and for placing the intermediate result on at least one of the selected placement destination sites.
The query optimization device according to claim 5,
A means of analyzing the syntax of the query, extracting an access pattern assumed in each element table of the table set to be referenced, and adding it to the temporary join element list;
Means for taking out each of the element tables from the temporary connection element list and leaving the optimal access condition, and deleting others from the temporary connection element list;
Means for creating entries for all replicas for each element of the temporary combining element list and adding to the temporary combining element list;
Means for taking out an entry from the temporary binding element list, giving numerical information about latency and reliability to the entry, and updating the temporary binding element list;
Means for evaluating numerical information relating to latency and reliability of the entries of the temporary binding element list;
As a result of the evaluation, the means for leaving the upper predetermined number of replicas and deleting others from the temporary coupling element list;
Means for creating an initial version of the combined element list based on the entry of the temporary combined element list and adding it to the optimized plan element list;
Query optimizing apparatus, further comprising means for deleting the temporary joining element list.
The query optimization device according to claim 6,
A means for copying the same processing system and starting the copied processing system;
Means for selecting, from the elements in the optimization plan element list, a join tree that minimizes an evaluation function for calculating a cost, using a greedy algorithm;
Means for determining whether it is a copy processing system or a non-copy processing system;
Means for, if it is a copy processing system, waiting for an inquiry from the non-copy processing system, determining the type of the join tree for the intermediate result table specified in the inquiry from the non-copy processing system that performed the copy, evaluating whether it is the external side / driving table or the internal side / reference table, responding to the non-copy processing system with which table at which site the join operation is to be performed, and terminating the copy processing system;
If it is a non-copy processing system, a means for temporarily placing the connection tree in a new intermediate result table and inquiring the copy processing system,
Means for invoking a placement evaluation function and taking out the optimum plan based on the site where the intermediate result table is to be placed and the index creation policy of the site;
A query optimization device further comprising: means for arranging the intermediate generation result at a predetermined location based on the result of the arrangement evaluation function.
The query optimization device according to claim 7,
In the processing of the location evaluation function, a means for confirming the next join operation destination, taking out a site group used in the next join operation destination, and adding it to the temporary site list;
Means for taking out a site not included in the temporary site list and adding it to the candidate temporary list;
Means for preparing and initializing candidate areas for search;
Means for sequentially reading site information and reading numerical information about latency and reliability related to the site;
Means for, if the candidate area is still in its initialized state, setting the first predetermined number of sites as candidates in the candidate area for search;
Means for, if the candidate area is already set and the site is not registered in the temporary site list, replacing the site recorded in the candidate area for search when the site's numerical information regarding latency and reliability is evaluated more highly;
Means for, if a new candidate site is identified, adding the site to the temporary site list;
Means for estimating the table creation cost from the number of join trees created in the intermediate result table;
Means for superimposing the normalized numerical information regarding latency and the reciprocal of the numerical information regarding reliability on the estimated table creation cost to obtain a total cost, assigning a plan number to the total cost, and adding it to an evaluation list as one of the candidates;
Means for determining whether an index can be created and, if so, estimating the index creation cost from the total cost, recalculating the total cost including the index creation cost, assigning a plan number to the newly calculated total cost, and adding it to the evaluation list as another candidate;
Means for sorting and ranking the data in the evaluation list by total cost and storing the entry with the minimum total cost as a first plan;
Means for storing, as a second plan, the entry with the next lowest total cost that has the same configuration as the first plan;
The query optimization device further comprising: means for returning the first plan and the second plan.
In a database management system, a query optimization method implemented by a computer that evaluates an arrangement method of intermediate results of a query,
Retrieving numerical information about latency and reliability of the site holding the data;
Superimposing the retrieved latency and reliability numerical information on the cost value of the query to obtain an extended cost value;
Selecting, when the sites are ordered by extended cost value, the specified number of top-ranked sites that meet a predetermined condition;
Creating a copy of the processing system itself with some of the parameters in the algorithm changed, and performing prefetching;
A query optimization method comprising: selecting, as a result of the prefetching, placement destination sites for an intermediate result generated as a result of optimization from among the selected sites that meet the predetermined condition, and placing the intermediate result on at least one of the selected placement destination sites.
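The extended cost value and the top-ranked site selection recited above can be illustrated with a minimal sketch. The claim does not state how latency and reliability are "superimposed" on the query cost, so the multiplicative weighting below is an assumption, as are the names `Site`, `extended_cost`, and `select_top_sites` and all sample figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    latency_ms: float   # hypothetical latency measurement for the site
    reliability: float  # hypothetical reliability score in (0, 1]


def extended_cost(query_cost: float, site: Site, max_latency_ms: float) -> float:
    """Weight a plain query cost by latency and reliability.

    Assumption: the weighting is multiplicative, using latency normalized
    to the largest observed value and the reciprocal of reliability.
    """
    normalized_latency = site.latency_ms / max_latency_ms
    return query_cost * normalized_latency * (1.0 / site.reliability)


def select_top_sites(query_cost: float, sites: list[Site], top_n: int) -> list[Site]:
    """Order sites by extended cost (lowest first) and keep the top_n."""
    max_latency = max(s.latency_ms for s in sites)
    ranked = sorted(sites, key=lambda s: extended_cost(query_cost, s, max_latency))
    return ranked[:top_n]


# Illustrative data only; none of these values come from the specification.
sites = [
    Site("tokyo", latency_ms=10.0, reliability=0.99),
    Site("osaka", latency_ms=20.0, reliability=0.90),
    Site("remote", latency_ms=100.0, reliability=0.50),
]
best = select_top_sites(query_cost=1000.0, sites=sites, top_n=2)
print([s.name for s in best])
```

Under this assumed weighting, a slow or unreliable site inflates the cost of any query placed on it, so it falls out of the top-ranked set even when the plain query cost is identical across sites.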
The query optimization method according to claim 9, comprising:
Analyzing the query syntax, extracting the expected access pattern for each element table of the referenced table set, and adding it to a temporary join element list;
Extracting, for each of the element tables, the entry with the optimal access condition from the temporary join element list, and deleting the others from the temporary join element list;
Creating an entry for every replica of each element of the temporary join element list and adding the entries to the temporary join element list;
Extracting an entry from the temporary join element list, attaching numerical information regarding latency and reliability to the entry, and updating the temporary join element list;
Evaluating the numerical information regarding latency and reliability of the entries in the temporary join element list;
Keeping, as a result of the evaluation, the top predetermined number of replicas and deleting the others from the temporary join element list;
Creating an initial version of the join element list based on the entries of the temporary join element list and adding it to the optimization plan element list;
The query optimization method further comprising: deleting the temporary join element list.
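The list-building steps above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Entry` record, the `build_join_elements` function, and the ranking score (latency divided by reliability, lower being better) are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    table: str
    access_cost: float       # hypothetical cost of the expected access pattern
    replica_site: str = ""
    latency: float = 0.0
    reliability: float = 1.0


def build_join_elements(access_patterns, replicas, site_stats, keep):
    """Return the pruned join element list.

    access_patterns: list of Entry, possibly several per table
    replicas:        dict mapping table -> list of replica site names
    site_stats:      dict mapping site -> (latency, reliability)
    keep:            number of replicas to retain per table
    """
    # Keep only the cheapest access pattern for each element table.
    best = {}
    for e in access_patterns:
        if e.table not in best or e.access_cost < best[e.table].access_cost:
            best[e.table] = e

    result = []
    for table, e in best.items():
        # Create one entry per replica, annotated with latency and reliability.
        expanded = [
            Entry(table, e.access_cost, site, *site_stats[site])
            for site in replicas[table]
        ]
        # Assumed evaluation: lower latency / reliability ranks higher.
        expanded.sort(key=lambda x: x.latency / x.reliability)
        result.extend(expanded[:keep])
    return result


# Illustrative data only.
patterns = [Entry("orders", 50.0), Entry("orders", 30.0), Entry("customers", 10.0)]
replicas = {"orders": ["a", "b", "c"], "customers": ["a"]}
stats = {"a": (10.0, 0.9), "b": (5.0, 0.99), "c": (50.0, 0.5)}
elements = build_join_elements(patterns, replicas, stats, keep=2)
print([(e.table, e.replica_site) for e in elements])
```

The temporary list is represented here by the local variables `best` and `expanded`, which are discarded when the function returns, mirroring the final deletion step.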
The query optimization method according to claim 10, comprising:
Copying the processing system itself and starting the copied processing system;
Selecting, from the elements in the optimization plan element list, a join tree that minimizes an evaluation function for calculating a cost, using a greedy algorithm;
Determining whether the processing system is a copy processing system or a non-copy processing system;
If the processing system is a copy processing system, waiting for an inquiry from the non-copy processing system that performed the copy, determining the type of the intermediate tree for the intermediate result table specified in the inquiry, evaluating whether that table is the outer-side / driving table or the inner-side / reference table, responding to the non-copy processing system with which table at which site the join operation should be performed on, and terminating the copy processing system;
If the processing system is a non-copy processing system, temporarily placing the join tree in a new intermediate result table and querying the copy processing system;
Calling the placement evaluation function and extracting the optimal plan in terms of the site where the intermediate result table should be placed and the index creation policy of that site;
The query optimization method further comprising: placing the generated intermediate result at a predetermined location based on the result of the placement evaluation function.
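The greedy selection of a join tree recited above can be illustrated with a small sketch. It covers only the greedy step; the look-ahead performed by the copied processing system is not modeled. The evaluation function (product of estimated row counts), the fixed `selectivity`, and the function names are assumptions made for the example.

```python
from itertools import combinations


def join_cost(left, right):
    """Hypothetical evaluation function: product of estimated row counts."""
    return left[1] * right[1]


def greedy_join_tree(elements, selectivity=0.1):
    """Build a join tree by repeatedly joining the cheapest pair.

    elements: list of (tree, estimated_rows) tuples, where tree is a
    table name or a nested tuple of already joined trees.
    """
    items = list(elements)
    while len(items) > 1:
        # Greedy choice: the pair that minimizes the evaluation function.
        a, b = min(combinations(items, 2), key=lambda p: join_cost(*p))
        items.remove(a)
        items.remove(b)
        # Assumed size estimate for the intermediate result.
        rows = max(1.0, a[1] * b[1] * selectivity)
        items.append(((a[0], b[0]), rows))
    return items[0]


# Illustrative row counts only.
tree, rows = greedy_join_tree([("A", 1000.0), ("B", 10.0), ("C", 100.0)])
print(tree, rows)
```

Each intermediate pair produced by the loop corresponds to an intermediate result table whose placement would then be decided by the placement evaluation function.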
The query optimization method according to claim 11, comprising:
In the processing of the placement evaluation function, identifying the next join operation destination, extracting the group of sites used at that destination, and adding it to the temporary site list;
Taking out sites not included in the temporary site list and adding them to the candidate temporary list;
Preparing and initializing candidate areas for search;
Sequentially reading the site information, reading the numerical information about the latency and reliability of the site,
If an initialized area remains in the candidate area, setting the first predetermined number of sites as candidates in the candidate area for search;
If the candidate area is already set and the site is not registered in the temporary site list, replacing the site recorded in the candidate area for search when the site's numerical information regarding latency and reliability is evaluated more highly;
If a new candidate site is identified, adding the site to the temporary site list;
Estimating the table creation cost from the number of join trees created in the intermediate result table;
Superimposing the normalized numerical information regarding latency and the reciprocal of the numerical information regarding reliability on the estimated table creation cost to obtain a total cost, assigning a plan number to the total cost, and adding it to an evaluation list as one of the candidates;
Determining whether an index can be created and, if so, estimating the index creation cost from the total cost, recalculating the total cost including the index creation cost, assigning a plan number to the newly calculated total cost, and adding it to the evaluation list as another candidate;
Sorting and ranking the data in the evaluation list by total cost and storing the entry with the lowest total cost as a first plan;
Storing, as a second plan, the entry with the next lowest total cost that has the same configuration as the first plan but a different placement destination;
The query optimization method further comprising: returning the first plan and the second plan.
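The ranking portion of the placement evaluation function can be sketched as below. Several points are assumptions rather than statements of the claim: the superimposition is modeled as multiplication, the index creation cost is taken as a fixed fraction (`index_factor`) of the total cost, and "same configuration" is interpreted as the same index choice. The `Plan` record and `evaluate_placement` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    number: int
    site: str
    with_index: bool
    total_cost: float


def evaluate_placement(candidates, table_cost, index_factor=0.2):
    """Rank placement candidates and return (first_plan, second_plan).

    candidates: list of (site, normalized_latency, reliability, can_index)
    table_cost: estimated cost of creating the intermediate result table
    """
    evaluation = []
    for site, norm_latency, reliability, can_index in candidates:
        # Assumed superimposition: multiply by latency and 1 / reliability.
        total = table_cost * norm_latency * (1.0 / reliability)
        evaluation.append(Plan(len(evaluation) + 1, site, False, total))
        if can_index:
            # Assumed index cost, estimated from the total cost.
            index_cost = total * index_factor
            evaluation.append(Plan(len(evaluation) + 1, site, True, total + index_cost))

    ranked = sorted(evaluation, key=lambda p: p.total_cost)
    first = ranked[0]
    # Second plan: same index choice, different site, next lowest cost.
    second = next(
        (p for p in ranked[1:]
         if p.with_index == first.with_index and p.site != first.site),
        None,
    )
    return first, second


# Illustrative candidates only.
candidates = [
    ("s1", 0.2, 0.9, True),
    ("s2", 0.1, 0.8, False),
    ("s3", 1.0, 0.99, True),
]
first, second = evaluate_placement(candidates, table_cost=100.0)
print(first, second)
```

This sketch accounts only for the cost of creating an index, not for any benefit the index may bring to later join operations, so an indexed plan never ranks ahead of its non-indexed counterpart here.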
A program for causing a computer to execute the query optimization method according to any one of claims 9 to 12.