WO2017157160A1 - Method and apparatus for processing data table connection manners - Google Patents

Method and apparatus for processing data table connection manners

Info

Publication number
WO2017157160A1
Authority
WO
WIPO (PCT)
Prior art keywords
data table
execution
cost
connection
execution cost
Prior art date
Application number
PCT/CN2017/075065
Other languages
English (en)
French (fr)
Inventor
徐冬
孙伟光
连杰红
汪龙重
Original Assignee
阿里巴巴集团控股有限公司
徐冬
孙伟光
连杰红
汪龙重
Priority date
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 徐冬, 孙伟光, 连杰红, 汪龙重
Priority to EP17765702.0A (patent EP3432157B1, en)
Priority to US16/084,529 (patent US11650990B2, en)
Publication of WO2017157160A1 (patent WO2017157160A1, zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24542 Plan optimisation
    • G06F16/24544 Join order optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations
    • G06F16/2456 Join operations

Definitions

  • The present application relates to the field of database technologies, and in particular to a method and apparatus for processing data table connection manners.
  • Data warehouses play a huge role in this context. Due to the advent of the big data era, data warehouses have become distributed architectures to meet the explosive growth of computing and storage needs. Distributed data warehouses generally use columnar storage and store data in the form of files. Therefore, using distributed data warehouses can improve the storage and computing performance of big data.
  • For a distributed data warehouse, choosing an appropriate Join algorithm not only saves the resources of the warehouse but also improves query efficiency. However, because distributed data warehouses have been developed only recently, there is currently no Join algorithm selection scheme for them.
  • Aspects of the present application provide a method and apparatus for processing a data table connection manner, which are used to select a suitable Join algorithm for connecting data tables, thereby saving the resources of a distributed data warehouse and improving query efficiency.
  • An aspect of the present application provides a method for processing a data table connection manner, including:
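  • setting a parameter list for cost estimation according to the distributed data warehouse environment in which the data table to be connected is located;
  • estimating, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected;
  • selecting, according to the estimated execution cost of each candidate data table connection manner, a target data table connection manner for performing connection calculation on the data table to be connected.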
  • Another aspect of the present application provides a data table connection manner processing apparatus, including:
  • a setting module configured to set a parameter list for cost estimation according to a distributed data warehouse environment in which the data table to be connected is located;
  • an estimation module configured to estimate, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected;
  • a selection module configured to select, according to the estimated execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected, a target data table connection manner for performing connection calculation on the data table to be connected.
  • In the present application, a parameter list for cost estimation is set according to the distributed data warehouse environment in which the data table to be connected is located; the execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected is estimated according to the parameters in the parameter list and the execution logic of each candidate data table connection manner; and a target data table connection manner for the connection calculation is selected according to the estimated execution costs.
  • Because the selected data table connection manner suits the distributed data warehouse environment, performing connection calculation between data tables with it saves the resources of the distributed data warehouse and improves query efficiency.
  • FIG. 1 is a schematic flowchart of a method for processing a data table connection manner according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of execution logic of a PSJ mode according to another embodiment of the present application.
  • FIG. 3 is a schematic diagram of execution logic of a BHJ mode according to another embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of execution logic of a BKHJ mode according to another embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a data table connection manner processing apparatus according to another embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of a data table connection manner processing apparatus according to another embodiment of the present disclosure.
  • The present application provides a solution whose core idea is to set a parameter list for cost estimation based on the distributed data warehouse environment in which the data table to be connected is located, to estimate, based on the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate manner when performing connection calculation on the data table to be connected, and then to select from the candidates, based on the estimated execution costs, the target data table connection manner that is actually used for the connection calculation.
  • The target data table connection manner may then be used to perform connection calculation on the data table to be connected.
  • In this way, the application selects from the candidate data table connection manners one that suits the distributed data warehouse environment, which saves the resources of the distributed data warehouse and improves query efficiency when connection calculation between data tables is performed with the selected manner.
  • FIG. 1 is a schematic flowchart of a method for processing a data table connection manner according to an embodiment of the present disclosure. As shown in Figure 1, the method includes:
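  • setting a parameter list for cost estimation according to the distributed data warehouse environment in which the data table to be connected is located; estimating, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected; and selecting the target data table connection manner according to the estimated execution costs.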
  • Before connection calculation is performed, the data table to be connected may be determined.
  • For example, when an order table and a user table need to be connected, a data table connection task may be submitted to the data table connection manner processing apparatus (hereinafter referred to as the processing device), with the identifiers of the order table and the user table carried in the task.
  • The processing device determines from the data table connection task that the data table to be connected includes the order table and the user table.
  • After determining the data table to be connected, the processing device needs to use a data table connection manner (the Join algorithm for short) to perform connection calculation on the data table to be connected.
  • In order to select a suitable data table connection manner, and thus save the resources of the distributed data warehouse and improve query efficiency, the processing device treats each usable data table connection manner as a candidate data table connection manner, estimates the execution cost of each candidate when performing connection calculation on the data table to be connected, and selects the target data table connection manner from the candidates based on the estimated execution costs.
  • The processing device determines the distributed data warehouse environment in which the data table to be connected is located and sets a parameter list for cost estimation according to that environment; the parameter list includes the parameters that the candidate data table connection manners require for cost estimation.
  • The parameter list set by the processing device includes the following parameters: the number of data records included in each data table, the total number of data records, the average length of each data record, the size of the data block supported by each storage node, the unit cost of each operation, and the number of data records that can be processed by each computing node.
  • The data table to be connected includes at least two data tables.
  • Each data table contains a certain number of data records, so the number of data records in each data table can be obtained by counting, and these counts can be summed to obtain the total number of data records.
  • The total amount of data can be obtained from the amount of data recorded in each data table, and the average length of each data record can be obtained by dividing the total amount of data by the total number of data records.
  • The connection operation involves reading and/or writing data tables, and therefore involves at least one of a local read operation, a local write operation, a network read operation, and a network write operation.
  • The unit cost of the local read operation and the unit cost of the local write operation can be determined according to the storage medium used by the distributed data warehouse.
  • For example, the storage medium used by the distributed data warehouse may be a disk, flash memory, a USB flash drive, and so on, and different storage media generally require different read and write times.
  • The processing device can therefore determine the unit cost of the local read operation and the unit cost of the local write operation based on the storage medium.
  • The processing device may also determine the unit cost of the network read operation and the unit cost of the network write operation according to the network topology of the distributed data warehouse.
  • The unit cost of the local read operation is defined as the average cost of reading 1 byte of data locally; the unit cost of the local write operation is defined as the average cost of writing 1 byte of data locally; similarly, the unit cost of the network read operation is defined as the average cost of reading 1 byte of data over the network, and the unit cost of the network write operation is defined as the average cost of writing 1 byte of data over the network.
  • The distributed data warehouse includes a plurality of computing nodes and a plurality of storage nodes.
  • The size of the data block supported by a storage node may be determined according to the file system of the distributed data warehouse, for example 256 MB, and the number of data records that a computing node can handle may be determined according to the hardware information of the distributed data warehouse, for example 1 GB.
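As a minimal sketch of how such a parameter list might be assembled, the Python below derives the total record count N and the average record length L from per-table statistics and looks up unit costs by storage medium. The class name `ParameterList`, the function `build_parameter_list`, the symbol `WNC` for the network write unit cost, and all numeric unit costs are hypothetical and chosen only for illustration; the source names only N, L, D, RC, RNC, and WC.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical per-byte (read, write) unit costs by storage medium.
# The numbers are illustrative only and do not come from the source.
LOCAL_UNIT_COSTS: Dict[str, Tuple[float, float]] = {
    "disk": (1.0, 1.5),
    "flash": (0.2, 0.4),
}


@dataclass(frozen=True)
class ParameterList:
    record_counts: Dict[str, int]  # number of data records in each data table
    N: int                         # total number of data records
    L: float                       # average length of a data record, in bytes
    D: int                         # data block size supported by a storage node
    RC: float                      # unit cost of a local read
    WC: float                      # unit cost of a local write
    RNC: float                     # unit cost of a network read
    WNC: float                     # unit cost of a network write (symbol assumed)
    node_capacity: int             # capacity of one computing node


def build_parameter_list(record_counts, total_bytes, medium,
                         network_costs, block_size, node_capacity):
    """Assemble the parameter list from table statistics and environment facts."""
    n = sum(record_counts.values())
    avg_len = total_bytes / n if n else 0.0   # L = total data amount / N
    rc, wc = LOCAL_UNIT_COSTS[medium]         # depends on the storage medium
    rnc, wnc = network_costs                  # depends on the network topology
    return ParameterList(dict(record_counts), n, avg_len, block_size,
                         rc, wc, rnc, wnc, node_capacity)


params = build_parameter_list(
    record_counts={"orders": 1000, "users": 50},
    total_bytes=210_000,
    medium="disk",
    network_costs=(2.0, 2.5),
    block_size=256 * 1024 * 1024,   # 256 MB, as in the example above
    node_capacity=10**9,
)
```

Keeping the list immutable reflects that it is set once per environment and then only read by the cost estimators.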
  • The processing device then estimates, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate manner when performing connection calculation on the data table to be connected.
  • The execution logic of a candidate data table connection manner may be embodied by its execution steps and the key operations in those execution steps.
  • A key operation is an operation that dominates the execution cost of its execution step. For example, if the cost of an operation in some respect is greater than a preset value, or is far greater than the cost of the other operations in that respect, the operation is determined to be a key operation.
  • Estimating, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate data table connection manner when performing connection calculation on the data table to be connected includes:
  • for each candidate data table connection manner, determining its execution steps and the key operations in each execution step; estimating the execution cost of each execution step according to the parameters in the parameter list and the key operations in that step; and obtaining the execution cost of the candidate data table connection manner from the execution costs of its execution steps.
  • Estimating the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step includes:
  • for each execution step, obtaining the target parameters required for the execution step from the parameter list; estimating the execution cost of the key operations in the execution step according to those target parameters and the key operations; and obtaining the execution cost of the execution step from the execution cost of its key operations.
  • When an execution step has several key operations, the execution costs of the key operations can be superimposed to obtain the execution cost of the execution step.
  • When an execution step has a single key operation, the execution cost of that key operation is taken directly as the execution cost of the execution step.
  • The triplet (data record count consumption, CPU consumption, IO consumption) is used to represent the execution cost. That is, the execution cost of an execution step or of a candidate data table connection manner is described along three dimensions: the number of data records (RowCount), the CPU resources consumed, and the IO resources consumed.
  • The execution costs of different candidate data table connection manners can each be expressed as such a triplet and then compared.
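The step-by-step procedure and the cost triplet can be illustrated with a short Python sketch. It assumes that "superimposing" costs means component-wise addition, which is consistent with the sums given in this description; the names `Cost`, `superimpose`, and `manner_cost`, and the sample numbers, are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Cost:
    rows: float  # data record count consumption (RowCount)
    cpu: float   # CPU consumption
    io: float    # IO consumption

    def __add__(self, other: "Cost") -> "Cost":
        # Superposition is assumed to be component-wise addition.
        return Cost(self.rows + other.rows,
                    self.cpu + other.cpu,
                    self.io + other.io)


def superimpose(costs: Iterable[Cost]) -> Cost:
    """Add up a sequence of cost triplets, starting from the zero cost."""
    return reduce(lambda a, b: a + b, costs, Cost(0, 0, 0))


def manner_cost(steps: Dict[str, List[Cost]]) -> Cost:
    """Cost of a connection manner: sum over steps of the key-operation costs."""
    return superimpose(superimpose(ops) for ops in steps.values())


# Illustrative numbers only.
N, L, RC, RNC, WC, J = 100, 10, 1, 2, 3, 40
psj_steps = {
    "redistribution": [
        Cost(0, 0, N * L * RC),    # local read
        Cost(N, 0, N * L * RNC),   # network read
        Cost(0, N, 0),             # local sort
        Cost(0, 0, N * L * WC),    # local write
    ],
    "ordered_connection": [Cost(J, 0, 0)],  # output operation
}
total = manner_cost(psj_steps)
```

A step with a single key operation falls out of the same function, since superimposing one triplet returns that triplet unchanged.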
  • FIG. 2 is a schematic diagram of the execution logic of the PSJ mode.
  • The PSJ mode includes a redistribution (re-partition) step and an ordered connection step.
  • FIG. 3 is a schematic diagram of the execution logic of the BHJ mode.
  • Taking the case in which the data table to be connected includes two data tables as an example, the BHJ mode includes a broadcast step and a hash connection step.
  • The BKHJ mode includes a broadcast distribution step and a hash connection step.
  • The PSJ mode, the BHJ mode, and the BKHJ mode described above can each be used as a candidate data table connection manner in the present application.
  • The estimation process differs between candidate data table connection manners. The following describes the estimation process for each of them.
  • For the PSJ mode, the processing device may determine that the execution steps include a redistribution step and an ordered connection step. The redistribution step mainly sorts all the data tables to be connected and distributes them to different computing nodes.
  • The ordered connection step mainly obtains, from the output of the redistribution step, all the data table combinations that meet the connection condition and outputs them.
  • Accordingly, the key operations in the redistribution step include a local read operation, a network read operation, a local sort operation, and a local write operation, and the key operations in the ordered connection step include an output operation.
  • After determining the execution steps of the PSJ mode and the key operations in each execution step, the processing device obtains from the parameter list, for each execution step, the target parameters that the step requires. Specifically:
  • The processing device may obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required for the redistribution step, where N represents the total number of data records; L represents the average length of each data record; RC represents the unit cost of the local read operation; RNC represents the unit cost of the network read operation; and WC represents the unit cost of the local write operation.
  • The processing device may obtain N_j and n from the parameter list as the target parameters required for the ordered connection step, where N_j corresponds to the j-th data table included in the data table to be connected.
  • The processing device can estimate the execution cost of the key operations in the redistribution step according to the target parameters required by the redistribution step and the key operations in that step, and can estimate the execution cost of the key operations in the ordered connection step according to the target parameters required for the ordered connection step and the key operations in that step.
  • Based on the parameters N, L, RC, RNC, and WC, the processing device can estimate the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), the execution cost of the local sort operation as (0, N, 0), and the execution cost of the local write operation as (0, 0, N*L*WC).
  • The processing device may then obtain the execution cost of the redistribution step from the execution costs of its key operations, and the execution cost of the ordered connection step from the execution costs of its key operations.
  • The processing device can superimpose the execution cost of the local read operation (0, 0, N*L*RC), the execution cost of the network read operation (N, 0, N*L*RNC), the execution cost of the local sort operation (0, N, 0), and the execution cost of the local write operation (0, 0, N*L*WC) to obtain (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step.
  • It should be noted that the data records in the data table to be connected need to be distributed to P computing nodes. If the data records are distributed evenly across the computing nodes, the execution cost on each computing node is the same, so the execution cost of an operation can be calculated directly from the data amount N*L and the unit cost of that operation.
  • If the data records are not distributed evenly, then for the network read operation and the local write operation, which have to wait for data on other computing nodes, the cost on the computing node that holds the most data records (the most expensive computing node) must be multiplied by P to obtain the final execution cost.
  • If there is no distribution skew, the execution cost of the local read operation (0, 0, N*L*RC), the execution cost of the network read operation (N, 0, N*L*RNC), the execution cost of the local sort operation (0, N, 0), and the execution cost of the local write operation (0, 0, N*L*WC) are superimposed to obtain (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step.
  • The processing device can determine whether the data records contained in the data table to be connected have a skewed distribution.
  • Top K values can be used to estimate the most expensive computing node.
  • Top K values refers to the K values that repeat most often in a column of data, together with their repetition counts.
  • The ratio of the highest repetition count to the overall data is p, referred to as the distribution skew rate.
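A minimal Python sketch of this definition follows. It assumes that p is the repetition count of the single most frequent value divided by the number of rows in the column; the function names `top_k_values` and `skew_rate` are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def top_k_values(column: Sequence[Hashable], k: int) -> List[Tuple[Hashable, int]]:
    """Return the k most repeated values in a column with their repetition counts."""
    return Counter(column).most_common(k)


def skew_rate(column: Sequence[Hashable]) -> float:
    """Ratio p of the highest repetition count to the overall number of rows."""
    if not column:
        return 0.0
    (_, highest_count), = top_k_values(column, 1)
    return highest_count / len(column)


join_keys = ["a"] * 6 + ["b"] * 3 + ["c"]
top2 = top_k_values(join_keys, 2)
p = skew_rate(join_keys)
```

In practice such statistics would usually come from precomputed column metadata rather than a full scan; the scan here only keeps the example self-contained.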
  • If data distribution skew occurs, the execution cost of the network read operation (N, 0, N*L*RNC) is corrected to (N, 0, P*N*L*p*RNC) and the execution cost of the local write operation (0, 0, N*L*WC) is corrected to (0, 0, P*N*L*p*WC), where p is the distribution skew rate and P is the number of computing nodes that perform connection processing on the data table to be connected.
  • When the data records are distributed evenly, p = 1/P, so the execution cost of the network read operation is (N, 0, P*N*L*1/P*RNC), that is, (N, 0, N*L*RNC), and the execution cost of the local write operation is (0, 0, P*N*L*1/P*WC), that is, (0, 0, N*L*WC).
  • In the skewed case, the processing device superimposes the execution cost of the local read operation (0, 0, N*L*RC), the corrected execution cost of the network read operation (N, 0, P*N*L*p*RNC), the execution cost of the local sort operation (0, N, 0), and the corrected execution cost of the local write operation (0, 0, P*N*L*p*WC) to obtain (N, N, N*L*RC+P*N*L*p*(RNC+WC)) as the execution cost of the redistribution step.
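The skew correction can be sketched in Python as follows. The function name `redistribution_cost` and the sample numbers are hypothetical; passing `p=None` is an assumed convention for the balanced case, where the formulas reduce to p = 1/P.

```python
from typing import Optional, Tuple

CostTriple = Tuple[float, float, float]  # (rows, cpu, io)


def redistribution_cost(N: float, L: float, RC: float, RNC: float, WC: float,
                        P: int, p: Optional[float] = None) -> CostTriple:
    """Cost of the PSJ redistribution step, with optional skew correction."""
    if p is None:
        p = 1 / P  # balanced distribution: each node holds 1/P of the data
    operations = [
        (0, 0, N * L * RC),           # local read (not affected by skew)
        (N, 0, P * N * L * p * RNC),  # network read, scaled by the hottest node
        (0, N, 0),                    # local sort
        (0, 0, P * N * L * p * WC),   # local write, scaled by the hottest node
    ]
    # Superimpose the key operations component by component.
    rows, cpu, io = (sum(component) for component in zip(*operations))
    return (rows, cpu, io)


balanced = redistribution_cost(N=100, L=10, RC=1, RNC=2, WC=3, P=4)
skewed = redistribution_cost(N=100, L=10, RC=1, RNC=2, WC=3, P=4, p=0.5)
```

With these numbers, half of the records landing on one of four nodes nearly doubles the estimated IO, while the row and CPU components are unchanged.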
  • For the ordered connection step, the processing device can take the execution cost (J, 0, 0) of the output operation as the execution cost of the step.
  • After obtaining the execution cost of the redistribution step and the execution cost of the ordered connection step, the processing device can superimpose the execution costs of the two execution steps to obtain the execution cost of the PSJ mode. Specifically:
  • If data distribution skew occurs, the execution cost of the PSJ mode is (N+J, N, N*L*RC+P*N*L*p*(RNC+WC));
  • If there is no data distribution skew, the execution cost of the PSJ mode is (N+J, N, N*L*(RC+RNC+WC)).
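The two closed-form PSJ costs can be written as one small Python function. This is only a sketch: `psj_cost` is a hypothetical name, J is passed in as an already estimated value because this section does not show how it is computed, and `p=None` is an assumed way of signalling that no skew was detected.

```python
from typing import Optional, Tuple


def psj_cost(N: float, J: float, L: float, RC: float, RNC: float, WC: float,
             P: int = 1, p: Optional[float] = None) -> Tuple[float, float, float]:
    """Total PSJ cost as (rows, cpu, io), using the closed forms above."""
    if p is None:
        io = N * L * (RC + RNC + WC)                    # no distribution skew
    else:
        io = N * L * RC + P * N * L * p * (RNC + WC)    # skew-corrected
    return (N + J, N, io)


no_skew = psj_cost(N=100, J=40, L=10, RC=1, RNC=2, WC=3)
with_skew = psj_cost(N=100, J=40, L=10, RC=1, RNC=2, WC=3, P=4, p=0.5)
```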
  • For the BHJ mode, the processing device may determine that the execution steps include a broadcast step and a hash connection step.
  • Among the n data tables to be connected, the largest is used as the main data table and the remaining data tables are used as auxiliary data tables.
  • The broadcast step mainly distributes the auxiliary data tables to the main data table through network transmission, which means that the broadcast step includes local read operations, network read operations, and local write operations. This embodiment selects the network read operation as the key operation in the broadcast step, but is not limited thereto.
  • The hash connection step mainly uses a hash algorithm to obtain and output all the data table combinations that meet the connection condition, so it can be determined that the hash connection step includes hash calculation and an output operation.
  • This embodiment selects the output operation as the key operation in the hash connection step, but is not limited thereto.
  • After determining the execution steps of the BHJ mode and the key operations in each execution step, the processing device obtains from the parameter list, for each execution step, the target parameters that the step requires. Specifically:
  • The processing device may estimate the execution cost of the key operation in the broadcast step according to the target parameters required for the broadcast step and the key operation in that step, and may estimate the execution cost of the key operation in the hash connection step according to the target parameters required for the hash connection step and the key operation in that step.
  • It should be noted that the number of data records included in an auxiliary data table may be greater than the size D of the data block supported by each storage node.
  • In that case, data records need to be written from memory to external storage, that is, the connection degenerates into a nested loop Join.
  • Because writing data records out of memory is expensive, this part of the cost needs to be considered.
  • Therefore, the processing device needs to determine whether the number of data records included in each auxiliary data table is greater than the size D of the data block supported by each storage node.
  • The processing device can obtain the execution cost of the broadcast step from the execution cost of the key operation in the broadcast step, and the execution cost of the hash connection step from the execution cost of the key operation in the hash connection step.
  • The processing device may take the execution cost of the network read operation (ΣN_i*M, 0, ΣN_i*M*L*RNC) as the execution cost of the broadcast step.
  • If any auxiliary data table contains more data records than the size D of the data block supported by each storage node, the processing device may take the corrected execution cost (J, N_k*ΣN_l, N_k*ΣN_l*L*WC) as the execution cost of the hash connection step; if no auxiliary data table contains more data records than D, the processing device can take the execution cost (J, 0, 0) of the output operation as the execution cost of the hash connection step.
  • The processing device may superimpose the execution costs of the two execution steps to obtain the execution cost of the BHJ mode. Specifically:
  • If an auxiliary data table contains more data records than D, the execution cost of the BHJ mode is (ΣN_i*M+J, N_k*ΣN_l, N_k*ΣN_l*L*WC+ΣN_i*M*L*RNC);
  • Otherwise, the execution cost of the BHJ mode is (ΣN_i*M+J, 0, ΣN_i*M*L*RNC).
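The BHJ formulas can be sketched in Python under clearly stated assumptions. This section does not define the multiplier M or the index sets behind N_k and N_l, so the sketch simply takes them as inputs (`M`, `Nk`, `Nl`) rather than guessing their meaning; `bhj_cost` and `exceeds_block_size` are hypothetical names, and the numbers are illustrative.

```python
from typing import Sequence, Tuple


def exceeds_block_size(aux_counts: Sequence[int], D: int) -> bool:
    """True if any auxiliary table holds more records than the block size D."""
    return any(count > D for count in aux_counts)


def bhj_cost(aux_counts: Sequence[int], M: float, J: float, L: float,
             RNC: float, WC: float,
             Nk: float = 0, Nl: Sequence[float] = ()) -> Tuple[float, float, float]:
    """Total BHJ cost as (rows, cpu, io).

    aux_counts are the N_i of the auxiliary tables. Nk and Nl feed the
    nested-loop correction term N_k * sum(N_l); leaving them at their
    defaults gives the uncorrected hash connection cost (J, 0, 0).
    """
    broadcast_rows = sum(aux_counts) * M
    broadcast = (broadcast_rows, 0, broadcast_rows * L * RNC)  # network read
    spill = Nk * sum(Nl)
    hash_connection = (J, spill, spill * L * WC)
    rows, cpu, io = (a + b for a, b in zip(broadcast, hash_connection))
    return (rows, cpu, io)


in_memory = bhj_cost([100, 50], M=4, J=30, L=10, RNC=2, WC=3)
spilled = bhj_cost([100, 50], M=4, J=30, L=10, RNC=2, WC=3, Nk=1000, Nl=[100])
```

A caller would typically use `exceeds_block_size` to decide whether to supply the correction inputs at all.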
  • For the BKHJ mode, the processing device may determine that the execution steps include a broadcast distribution step and a hash connection step.
  • The broadcast distribution step mainly distributes all the data tables to be connected to different computing nodes, which means that the broadcast distribution step includes local read operations, network read operations, and local write operations.
  • This embodiment takes the local read operation, the network read operation, and the local write operation as the key operations in the broadcast distribution step, but is not limited thereto.
  • The hash connection step mainly uses a hash algorithm to obtain and output all the data table combinations that meet the connection condition, so it can be determined that the hash connection step includes hash calculation and an output operation.
  • This embodiment selects the output operation as the key operation in the hash connection step, but is not limited thereto.
  • After determining the execution steps of the BKHJ mode and the key operations in each execution step, the processing device obtains from the parameter list, for each execution step, the target parameters that the step requires. Specifically:
  • The processing device may obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required for the broadcast distribution step, where N represents the total number of data records; L represents the average length of each data record; RC represents the unit cost of the local read operation; RNC represents the unit cost of the network read operation; and WC represents the unit cost of the local write operation.
  • The processing device can estimate the execution cost of the key operations in the broadcast distribution step according to the target parameters required for the broadcast distribution step and the key operations in that step, and can estimate the execution cost of the key operation in the hash connection step according to the target parameters required for the hash connection step and the key operation in that step.
  • Based on the parameters N, L, RC, RNC, and WC, the processing device can estimate the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), and the execution cost of the local write operation as (0, 0, N*L*WC).
  • It should be noted that when the number of data records included in a data table to be connected is greater than the size D of the data block supported by each storage node, data records need to be written from memory to external storage, that is, the connection degenerates into a nested loop connection (nested loop Join).
  • Because writing data records out of memory is expensive, this part of the cost needs to be considered.
  • Therefore, the processing device needs to determine whether the number of data records included in each data table to be connected is greater than the size D of the data block supported by each storage node.
  • The processing device may obtain the execution cost of the broadcast distribution step from the execution costs of the key operations in that step, and the execution cost of the hash connection step from the execution cost of the key operation in that step.
  • The processing device can superimpose the execution cost of the local read operation (0, 0, N*L*RC), the execution cost of the network read operation (N, 0, N*L*RNC), and the execution cost of the local write operation (0, 0, N*L*WC) to obtain (N, 0, N*L*(RC+RNC+WC)) as the execution cost of the broadcast distribution step.
  • If any data table contains more data records than the size D of the data block supported by each storage node, the processing device may take the corrected execution cost (J, N_k*ΣN_l, N_k*ΣN_l*L*WC) as the execution cost of the hash connection step; if no data table contains more data records than D, the processing device can take the execution cost (J, 0, 0) of the output operation as the execution cost of the hash connection step.
  • The processing device may superimpose the execution costs of the two execution steps to obtain the execution cost of the BKHJ mode. Specifically:
  • If a data table contains more data records than D, the execution cost of the BKHJ mode is (N+J, N_k*ΣN_l, N_k*ΣN_l*L*WC+N*L*(RC+RNC+WC));
  • Otherwise, the execution cost of the BKHJ mode is (N+J, 0, N*L*(RC+RNC+WC)).
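A corresponding Python sketch for the BKHJ mode follows. As with the formulas above, the index sets behind N_k and N_l are not defined in this section, so they are taken as plain inputs (`Nk`, `Nl`); the function name `bkhj_cost` and the numbers are hypothetical.

```python
from typing import Sequence, Tuple


def bkhj_cost(N: float, J: float, L: float, RC: float, RNC: float, WC: float,
              Nk: float = 0, Nl: Sequence[float] = ()) -> Tuple[float, float, float]:
    """Total BKHJ cost as (rows, cpu, io).

    Leaving Nk and Nl at their defaults models the case in which no table
    exceeds the block size D, so the hash connection step costs (J, 0, 0).
    """
    # Broadcast distribution: local read + network read + local write.
    distribution = (N, 0, N * L * (RC + RNC + WC))
    # Hash connection, with the nested-loop correction N_k * sum(N_l) if any.
    spill = Nk * sum(Nl)
    hash_connection = (J, spill, spill * L * WC)
    rows, cpu, io = (a + b for a, b in zip(distribution, hash_connection))
    return (rows, cpu, io)


fits = bkhj_cost(N=100, J=40, L=10, RC=1, RNC=2, WC=3)
degraded = bkhj_cost(N=100, J=40, L=10, RC=1, RNC=2, WC=3, Nk=10, Nl=[5, 5])
```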
  • After the estimation, the processing device may select the target data table connection manner for performing connection calculation on the data table to be connected according to the estimated execution cost of each candidate data table connection manner.
  • The processing device may compare the estimated execution costs of the candidate data table connection manners and select the candidate corresponding to the minimum execution cost, or the candidate corresponding to the optimal execution cost, as the target data table connection manner.
  • the priority between the triplets may be set in advance to facilitate comparison.
  • for example, the priority of the data record number consumption can be set higher than that of the CPU consumption, and the priority of the CPU consumption higher than that of the IO consumption. On this basis, when the execution costs of the candidate data table connection manners are compared, the data record number consumption is compared first, and the candidate with the smallest data record number consumption is selected as the target data table connection manner; if the data record number consumption is the same, the CPU consumption is compared, and the candidate with the smallest CPU consumption is selected; if the CPU consumption is also the same, the IO consumption is compared, and the candidate with the smallest IO consumption is selected.
  • the data table to be connected is then connected using the selected target data table connection manner. Since the selected data table connection manner is suited to the distributed data warehouse environment, performing the connection calculation between data tables with it saves resources of the distributed data warehouse and improves query efficiency.
  • the following describes the execution cost of each candidate data table connection method by combining the specific data table connection scenarios and specific parameters.
  • suppose the user table and the order table need to be connected; the user table is denoted R and the order table S, R contains 10M data records, and S contains 10M data records.
  • the following parameters are set according to the distributed data warehouse environment:
  • the above G represents the number of data records supported by a single compute node.
  • the execution cost of the redistribution step is (20M, 20M, 26GB);
  • the execution cost of the hash join step is (10M, 10M*10M, 10M*10M*100*2);
  • the execution cost of the hash join step is (20M, 10M*10M, 10M*10M*100*2);
  • by comparison, the PSJ mode has the lower execution cost, so the PSJ mode can be selected.
  • the execution cost of the redistribution step is (10M, 10M, 13GB);
  • the execution cost of the ordered join step is (100K, 0, 0B);
  • the execution cost of the PSJ mode is (10M, 10M, 13GB).
  • the execution cost of the broadcast step is (4K, 0, 400K);
  • the execution cost of the hash join step is (100K, 0, 0);
  • the execution cost of the BHJ mode is (104K, 0, 400K).
  • the execution cost of the broadcast distribution step is (10M, 0, 13GB);
  • the execution cost of the hash join step is (100K, 0, 0B);
  • the execution cost of the BKHJ mode is (10M, 0, 13GB).
  • by comparison, the BHJ mode has the lower execution cost, so the BHJ mode can be selected; the execution cost of the BKHJ mode is also lower than that of the PSJ mode.
  • FIG. 5 is a schematic structural diagram of a data table connection manner processing apparatus according to another embodiment of the present disclosure. As shown in FIG. 5, the apparatus includes: a setting module 51, an estimation module 52, and a selection module 53.
  • the setting module 51 is configured to set a parameter list for cost estimation according to a distributed data warehouse environment in which the data table to be connected is located.
  • the estimation module 52 is configured to estimate, according to the parameter in the parameter list set by the setting module 51 and the execution logic of each candidate data table connection mode, the execution cost of each candidate data table connection manner when performing connection calculation on the connection data table.
  • the selecting module 53 is configured to select a target data table connection manner for performing connection calculation on the connection data table according to an execution cost when the connection data table is connected to be calculated according to each candidate data table connection manner estimated by the estimation module 52.
  • the setting module 51 is specifically configured to:
  • the setting module 51 is configured to: when setting the unit cost of the various operations required for the connection calculation according to the hardware information of the distributed data warehouse, specifically:
  • the unit cost of the network read operation and the unit cost of the network write operation are determined according to the network topology of the distributed data warehouse.
  • an implementation structure of the estimation module 52 includes: a determination sub-module 521, an estimation sub-module 522, and an acquisition sub-module 523.
  • the determining sub-module 521 is configured to determine, for each candidate data table connection manner, the execution steps of that connection manner and the key operations in each execution step.
  • the estimation sub-module 522 is configured to estimate the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step.
  • the obtaining sub-module 523 is configured to obtain an execution cost of the candidate data table connection manner according to the execution cost of each execution step.
  • an implementation structure of the estimation sub-module 522 includes a parameter obtaining unit 5221, a cost estimating unit 5222, and a cost obtaining unit 5223.
  • the parameter obtaining unit 5221 is configured to obtain, for each execution step, a target parameter required to perform the step from the parameter list.
  • the cost estimation unit 5222 is configured to estimate the execution cost of the key operations in the execution step according to the target parameters required in the execution step and the key operations in the execution steps.
  • the cost obtaining unit 5223 is configured to obtain an execution cost of the execution step according to an execution cost of the key operation in the execution step.
  • if the candidate data table connection manner is the Partitioned Sort Join manner, its execution steps include a redistribution step and an ordered join step; the key operations in the redistribution step include local read, network read, local sort, and local write operations; the key operations in the ordered join step include the output operation;
  • if the candidate data table connection manner is the Broadcasted Hash Join manner, its execution steps include a broadcast step and a hash join step; the key operations in the broadcast step include the network read operation; the key operations in the hash join step include the output operation;
  • if the candidate data table connection manner is the Blocked Hash Join manner, its execution steps include a broadcast distribution step and a hash join step; the key operations in the broadcast distribution step include local read, network read, and local write operations; the key operations in the hash join step include the output operation.
  • execution cost is represented by a triplet (data record number consumption, CPU consumption, IO consumption).
  • the parameter obtaining unit 5221 is specifically configured to:
  • the parameters N, L, RC, RNC, and WC are obtained from the parameter list as target parameters required for the redistribution step;
  • the parameters N i , N k , D, L and RNC are obtained from the parameter list as target parameters required for the broadcasting step;
  • the parameters N, L, RC, RNC, and WC are obtained from the parameter list as target parameters required for the broadcast distribution step;
  • for the ordered join step or the hash join step, N_j and n are obtained from the parameter list as target parameters required for the ordered join step or the hash join step;
  • N represents the total number of data records
  • L represents the average length of each data record
  • RC represents the unit cost of a local read operation
  • RNC represents the unit cost of a network read operation
  • WC represents the unit cost of a local write operation
  • N_k represents the number of data records included in the main data table among the data tables to be connected, and k is any value of 1...n;
  • D represents the size of the data block supported by each storage node
  • n represents the number of data tables among the data tables to be connected.
  • the cost estimating unit 5222 is specifically configured to:
  • for the redistribution step, the execution cost of the local read operation is estimated to be (0, 0, N*L*RC), that of the network read operation (N, 0, N*L*RNC), that of the local sort operation (0, N, 0), and that of the local write operation (0, 0, N*L*WC);
  • for the broadcast distribution step, the execution cost of the local read operation is estimated to be (0, 0, N*L*RC), that of the network read operation (N, 0, N*L*RNC), and that of the local write operation (0, 0, N*L*WC);
  • the cost obtaining unit 5223 is specifically configured to:
  • for the redistribution step, the execution cost of the local read operation (0, 0, N*L*RC), the execution cost of the network read operation (N, 0, N*L*RNC), the execution cost of the local sort operation (0, N, 0), and the execution cost of the local write operation (0, 0, N*L*WC) are superimposed to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step;
  • for the broadcast step, the execution cost of the network read operation (∑N_i*M, 0, ∑N_i*M*L*RNC) is taken as the execution cost of the broadcast step;
  • the execution cost of the local read operation (0,0,N*L*RC), the execution cost of the network read operation (N,0,N*L*RNC), and the execution cost of the local write operation (0) , 0, N * L * WC) are superimposed to obtain an execution cost (N, 0, N * L * (RC + RNC + WC)) as an execution cost of the broadcast distribution step;
  • the execution cost (J, 0, 0) of the output operation is taken as the execution cost of the ordered join step or the hash join step.
  • the cost obtaining unit 5223 is further configured to: before superimposing the execution cost of the local read operation (0, 0, N*L*RC), the execution cost of the network read operation (N, 0, N*L*RNC), the execution cost of the local sort operation (0, N, 0), and the execution cost of the local write operation (0, 0, N*L*WC) to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step, determine whether the data records in the data tables to be connected are skewed in distribution; if so, correct the execution cost of the network read operation from (N, 0, N*L*RNC) to (N, 0, P*N*L*p*RNC), and correct the execution cost of the local write operation from (0, 0, N*L*WC) to (0, 0, P*N*L*p*WC);
  • p denotes the distribution skew rate;
  • P denotes the number of compute nodes for performing connection processing on the connection data table.
  • the cost obtaining unit 5223 is further configured to: for the hash join step, before taking the execution cost (J, 0, 0) of the output operation as the execution cost of the ordered join step or the hash join step, determine whether any secondary data table contains more data records than the size D of the data block supported by each storage node; if so, correct the execution cost (J, 0, 0) of the output operation to obtain the corrected execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) as the execution cost of the hash join step;
  • the device further includes: a connection calculation module 54.
  • the connection calculation module 54 is configured to perform connection calculation on the connection data table by using the target data table connection mode selected by the selection module 53.
  • the data table connection mode processing apparatus sets a parameter list for cost estimation according to the distributed data warehouse environment in which the data table to be connected is located; estimates, according to the parameters in the parameter list and the execution logic of each candidate data table connection manner, the execution cost of each candidate when it performs connection calculation on the data table to be connected; and selects, according to the estimated execution costs, the target data table connection manner for that connection calculation. A connection manner suited to the distributed data warehouse environment is thereby selected, which saves resources of the distributed data warehouse and improves query efficiency when the connection between data tables is calculated.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the units is only a logical function division.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional unit described above is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Abstract

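The present application provides a method and an apparatus for processing data table join modes. The method includes: setting a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located; estimating, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate join mode when it performs the join computation on the data tables to be joined; and selecting, according to the estimated execution costs, a target data table join mode for performing the join computation on the data tables to be joined. The present application can select a suitable join algorithm for joining data tables, thereby saving resources of the distributed data warehouse and improving query efficiency.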

Description

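Method and Apparatus for Processing Data Table Join Modes [Technical Field]
The present application relates to the field of database technologies, and in particular to a method and an apparatus for processing data table join modes.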
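[Background]
With the development of the Internet, data has grown explosively, data structures have diversified, and data carries ever more information; against this background, data warehouses play a major role. With the arrival of the big data era, data warehouses have moved to distributed architectures to meet the explosively growing demand for computation and storage. A distributed data warehouse generally uses columnar storage and saves data in the form of files, so a distributed data warehouse can improve storage and computing performance for big data.
During queries in a distributed data warehouse, join computations between data tables are frequently required. There are many algorithms for implementing join computation (join algorithms for short), and different join algorithms generally differ in the amount of data they can handle and in the resources they consume. For a distributed data warehouse, selecting a suitable join algorithm both saves the warehouse's resources and improves query efficiency. However, because distributed data warehouses have not been under development for long, there is currently no join algorithm selection scheme suited to them.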
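[Summary of the Invention]
Aspects of the present application provide a method and an apparatus for processing data table join modes, used to select a suitable join algorithm for joining data tables, thereby saving resources of the distributed data warehouse and improving query efficiency.
In one aspect of the present application, a method for processing data table join modes is provided, including:
setting a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located;
estimating, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined;
selecting, according to the estimated execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined, a target data table join mode for performing the join computation on the data tables to be joined.
In another aspect of the present application, an apparatus for processing data table join modes is provided, including:
a setting module, configured to set a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located;
an estimation module, configured to estimate, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined;
a selection module, configured to select, according to the estimated execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined, a target data table join mode for performing the join computation on the data tables to be joined.
In the present application, a parameter list for cost estimation is set according to the distributed data warehouse environment in which the data tables to be joined are located; the execution cost of each candidate data table join mode is estimated from the parameters in the parameter list and the execution logic of each candidate; and a target data table join mode is selected according to the estimated execution costs. A join mode suited to the distributed data warehouse environment is thereby selected, so that performing join computations between data tables with the selected join mode saves resources of the distributed data warehouse and improves query efficiency.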
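[Brief Description of the Drawings]
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for processing data table join modes according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the execution logic of the PSJ mode according to another embodiment of the present application;
FIG. 3 is a schematic diagram of the execution logic of the BHJ mode according to yet another embodiment of the present application;
FIG. 4 is a schematic diagram of the execution logic of the BKHJ mode according to yet another embodiment of the present application;
FIG. 5 is a schematic structural diagram of an apparatus for processing data table join modes according to yet another embodiment of the present application;
FIG. 6 is a schematic structural diagram of an apparatus for processing data table join modes according to yet another embodiment of the present application.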
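[Detailed Description]
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
During queries in a distributed data warehouse, join computations between data tables are frequently required. There are currently many algorithms for implementing join computation, and different join algorithms generally differ in the amount of data they can handle and in the resources they consume. For a distributed data warehouse, selecting a suitable join algorithm both saves the warehouse's resources and improves query efficiency. However, because distributed data warehouses have not been under development for long, there is currently no join algorithm selection scheme suited to them.
To address this problem, the present application provides a solution whose core idea is as follows: set a parameter list for cost estimation based on the distributed data warehouse environment in which the data tables to be joined are located; estimate, based on the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate when it performs the join computation on the data tables to be joined; and then, based on the estimated execution costs, select from the candidates the join mode actually used to perform the join computation.
Further, after the target data table join mode is selected, it can be used to perform the join computation on the data tables to be joined.
The present application can select, from the candidate data table join modes, a join mode suited to the distributed data warehouse environment, so that performing join computations between data tables with the selected join mode saves resources of the distributed data warehouse and improves query efficiency.
The technical solutions of the present application are described in detail below through specific embodiments.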
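FIG. 1 is a schematic flowchart of a method for processing data table join modes according to an embodiment of the present application. As shown in FIG. 1, the method includes:
101. Set a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located.
102. Estimate, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined.
103. Select, according to the estimated execution cost of each candidate data table join mode, a target data table join mode for performing the join computation on the data tables to be joined.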
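Specifically, in a query scenario of a distributed data warehouse, the data tables to be joined can be determined when a data table join task arises. For example, in e-commerce it may be necessary to query the order table and the customer table together, which requires a join computation on the two tables. A data table join task carrying the identifiers of the order table and the customer table can therefore be submitted to the apparatus for processing data table join modes (hereinafter the processing apparatus). From this task, the processing apparatus determines that the data tables to be joined include the order table and the customer table.
After the data tables to be joined are determined, the processing apparatus performs the join computation on them using a data table join mode (a join algorithm for short).
In this embodiment, to select a suitable data table join mode for the join computation, and thereby save resources of the distributed data warehouse and improve query efficiency, the processing apparatus treats the available data table join modes as candidates, estimates the execution cost of each candidate when it performs the join computation on the data tables to be joined, and, based on the estimated execution costs, selects from the candidates a join mode suitable for that join computation.
Specifically, the processing apparatus determines the distributed data warehouse environment in which the data tables to be joined are located and sets a parameter list for cost estimation according to that environment; the parameter list includes the parameters needed to estimate the cost of the candidate data table join modes.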
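Further, one implementation in which the processing apparatus sets the parameter list includes:
setting the number of data records in each of the data tables to be joined, the total number of data records, and the average length of each data record;
setting the size of the data block supported by each storage node according to the file system of the distributed data warehouse;
setting, according to the hardware information of the distributed data warehouse, the unit cost of each operation required by the join computation and the number of data records that each compute node can process.
In this implementation, the parameter list includes the following parameters: the number of data records in each data table, the total number of data records, the average length of each data record, the size of the data block supported by each storage node, the unit cost of each operation, and the number of data records that each compute node can process.
Because a join computation requires at least two data tables, the data tables to be joined include at least two data tables. Each of these tables contains a certain number of data records, so the number of records in each table can be counted, and summing these counts gives the total number of data records. In addition, the total data volume can be obtained from the data volume of the records in each table, and the average length of each data record can be obtained from the total data volume and the total number of data records.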
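The statistics step just described can be sketched in a few lines. This is only an illustration under assumptions: the `TableStats` type, its field names, and the `summarize` helper are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableStats:
    """Hypothetical per-table statistics for one table to be joined."""
    name: str
    row_count: int    # number of data records in the table
    total_bytes: int  # data volume of all records in the table


def summarize(tables: List[TableStats]) -> Tuple[int, float]:
    """Return (total record count, average record length in bytes)."""
    if len(tables) < 2:
        raise ValueError("a join needs at least two data tables")
    total_rows = sum(t.row_count for t in tables)
    total_bytes = sum(t.total_bytes for t in tables)
    # Average record length = total data volume / total number of records.
    return total_rows, total_bytes / total_rows
```

For instance, two tables holding 10 and 30 records with 1,000 and 3,000 bytes in total give 40 records with an average length of 100 bytes.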
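Further, analysis of the join computation shows that a join involves reading and/or writing data tables, and thus at least one of a local read operation, a local write operation, a network read operation, and a network write operation. On this basis, when the unit cost of each operation required by the join computation is set according to the hardware information of the distributed data warehouse, the unit costs of the local read operation and the local write operation can be determined from the storage medium used by the warehouse. For example, the storage medium may be a magnetic disk, flash memory, a USB flash drive, and so on, and different storage media generally need different read and write times. The processing apparatus can therefore determine the unit costs of the local read and local write operations from the storage medium. In addition, because a distributed data warehouse may read and write data tables across the network, the processing apparatus can also determine the unit costs of the network read operation and the network write operation from the network topology of the warehouse.
The unit cost of the local read operation is defined as the average cost of reading 1 byte of data locally, and the unit cost of the local write operation as the average cost of writing 1 byte of data locally. Similarly, the unit cost of the network read operation is defined as the average cost of reading 1 byte of data over the network, and the unit cost of the network write operation as the average cost of writing 1 byte of data over the network.
In addition, a distributed data warehouse includes multiple compute nodes and multiple storage nodes. When the execution cost of each candidate data table join mode is estimated, the size of the data block supported by a storage node can be determined from the file system of the warehouse, for example 256MB, and the number of data records that a compute node can process can be determined from the hardware information of the warehouse, for example 1GB.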
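One way to picture the parameter list is as a small record type. The sketch below is an assumption-laden illustration: the class name `ParameterList` and its layout are hypothetical, the symbol names follow the description (L, D, G, RC, WC, RNC, WNC), and the sample values mirror those used in the worked example later in the description. Treating 256MB and 1GB as binary multiples is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParameterList:
    """Hypothetical container for the cost-estimation parameters."""
    row_counts: List[int]  # N_1..N_n, records per table to be joined
    L: int                 # average record length, in bytes
    D: int                 # data block size per storage node (bytes, assumed)
    G: int                 # per-compute-node capacity (bytes, assumed)
    RC: float              # unit cost of a local read (per byte)
    WC: float              # unit cost of a local write (per byte)
    RNC: float             # unit cost of a network read (per byte)
    WNC: float             # unit cost of a network write (per byte)

    @property
    def N(self) -> int:
        """Total number of data records across all tables."""
        return sum(self.row_counts)

    @property
    def n(self) -> int:
        """Number of data tables to be joined."""
        return len(self.row_counts)


# Sample values taken from the worked example in the description.
params = ParameterList(
    row_counts=[10_000_000, 10_000_000],
    L=100, D=256 * 1024**2, G=1024**3,
    RC=1, WC=2, RNC=10, WNC=10,
)
```

Keeping N and n as derived properties avoids storing values that could drift out of sync with the per-table counts.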
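The processing apparatus then estimates, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate when it performs the join computation on the data tables to be joined.
In an optional implementation, the execution logic of a candidate data table join mode is reflected by its execution steps and the key operations in those steps. A key operation is an operation that dominates the execution cost of its execution step; for example, if the cost of an operation in some respect exceeds a preset value, or far exceeds the cost of the other operations in that respect, the operation is determined to be a key operation. On this basis, the process of estimating the execution cost of each candidate data table join mode includes:
for each candidate data table join mode, determining its execution steps and the key operations in each execution step; estimating the execution cost of each execution step according to the parameters in the parameter list and the key operations in that step; and obtaining the execution cost of the candidate join mode from the execution costs of its execution steps.
Further, an implementation of estimating the execution cost of each execution step according to the parameters in the parameter list and the key operations in each step includes:
for each execution step, obtaining from the parameter list the target parameters required by the step; estimating the execution cost of the key operations in the step according to those target parameters and the key operations; and obtaining the execution cost of the step from the execution costs of its key operations.
For example, in a relatively simple implementation, the execution costs of the key operations in an execution step can be superimposed to obtain the execution cost of the step. Alternatively, the execution cost of the key operation can be used directly as the execution cost of the step.
In an optional implementation, the execution cost is expressed as a triplet (data record number consumption, CPU consumption, IO consumption). That is, the execution cost of an execution step or of a candidate data table join mode is described along three dimensions: the number of data records operated on (RowCount), the CPU resources consumed, and the IO resources consumed.
The execution costs of different candidate data table join modes can all be expressed with this triplet and then compared.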
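The triplet and the superposition of key-operation costs can be sketched as follows. This is a minimal illustration, assuming that superposition means component-wise addition (which is how the summed costs in the description behave); the `Cost` class and `superpose` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Cost:
    """Execution cost triplet: (record count, CPU, IO)."""
    rows: float
    cpu: float
    io: float

    def __add__(self, other: "Cost") -> "Cost":
        # Superposition is taken to be component-wise addition.
        return Cost(self.rows + other.rows,
                    self.cpu + other.cpu,
                    self.io + other.io)


def superpose(costs: Iterable[Cost]) -> Cost:
    """Superimpose the costs of key operations (or of execution steps)."""
    total = Cost(0, 0, 0)
    for c in costs:
        total = total + c
    return total
```

The same helper serves both levels of aggregation: key operations into a step, and steps into a join mode.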
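In a practical application scenario, assume that the commonly used data table join modes include, but are not limited to, the following three:
the Partitioned Sort Join (PSJ) mode;
the Broadcasted Hash Join (BHJ) mode;
the Blocked Hash Join (BKHJ) mode.
FIG. 2 is a schematic diagram of the execution logic of the PSJ mode. PSJ includes a redistribution (Re-partition) step and an ordered join step.
FIG. 3 is a schematic diagram of the execution logic of the BHJ mode. FIG. 3 takes the case where the data tables to be joined include two data tables as an example; BHJ includes a broadcast step and a hash join step.
FIG. 4 is a schematic diagram of the execution logic of the BKHJ mode. BKHJ includes a broadcast distribution step and a hash join step.
The PSJ, BHJ, and BKHJ modes can all serve as candidate data table join modes in the present application.
The estimation process differs between candidate data table join modes, and is described in detail below for each candidate.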
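If the candidate data table join mode is the PSJ mode, the processing apparatus can determine that its execution steps include a redistribution step and an ordered join step. The redistribution step mainly sorts all data tables to be joined and distributes them to different compute nodes; the ordered join step mainly obtains and outputs all combinations of the redistributed data that satisfy the join condition. Further, the key operations in the redistribution step can be determined to include a local read operation, a network read operation, a local sort operation, and a local write operation; correspondingly, the key operations in the ordered join step include an output operation.
After determining the execution steps of the PSJ mode and the key operations in each step, the processing apparatus obtains, for each execution step, the target parameters required by that step from the parameter list. Specifically:
for the redistribution step of the PSJ mode, the processing apparatus can obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters, where N denotes the total number of data records, L the average length of each data record, RC the unit cost of the local read operation, RNC the unit cost of the network read operation, and WC the unit cost of the local write operation;
for the ordered join step of the PSJ mode, the processing apparatus can obtain N_j and n from the parameter list as the target parameters, where N_j denotes the number of data records in the j-th data table to be joined, j=1…n, and n denotes the number of data tables to be joined.
After obtaining the target parameters for the redistribution step and the ordered join step, the processing apparatus can estimate the execution cost of the key operations in the redistribution step from that step's target parameters and key operations, and likewise for the ordered join step. Specifically:
for the redistribution step of the PSJ mode, the processing apparatus can estimate from N, L, RC, RNC, and WC that the execution cost of the local read operation is (0, 0, N*L*RC), that of the network read operation is (N, 0, N*L*RNC), that of the local sort operation is (0, N, 0), and that of the local write operation is (0, 0, N*L*WC);
for the ordered join step of the PSJ mode, the processing apparatus can estimate from N_j and n that the execution cost of the output operation is (J, 0, 0), where J = (∏N_j)^(1/n), that is, J = (N_1*N_2*...*N_n)^(1/n).
After obtaining the execution costs of the key operations in both steps, the processing apparatus can obtain the execution cost of the redistribution step from the costs of its key operations, and the execution cost of the ordered join step from the costs of its key operations. Specifically:
for the redistribution step of the PSJ mode, the processing apparatus can superimpose the execution cost of the local read operation (0, 0, N*L*RC), the network read operation (N, 0, N*L*RNC), the local sort operation (0, N, 0), and the local write operation (0, 0, N*L*WC) to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step;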
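Further optionally, in a distributed data warehouse the data records of the data tables to be joined must be distributed across P compute nodes. If the records are distributed evenly across the compute nodes, the execution cost on each node is the same, so the execution cost of an operation can be computed directly from the data volume N*L and the unit cost of the operation. If the records are distributed unevenly, then for the network read and local write operations, which must wait for data on other compute nodes, the final execution cost is obtained by taking the cost on the compute node holding the most data records (the most heavily loaded compute node) as the baseline and multiplying it by P.
On this basis, before superimposing the execution cost of the local read operation (0, 0, N*L*RC), the network read operation (N, 0, N*L*RNC), the local sort operation (0, N, 0), and the local write operation (0, 0, N*L*WC) to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step, the processing apparatus can determine whether the data records in the data tables to be joined are skewed in distribution.
Specifically, a statistic known as Top K values can be used to estimate the most heavily loaded compute node. Top K values refers to the K most frequently repeated values in a column of data together with their repetition counts. Preferably, to simplify the problem, consider K=1 and let p be the proportion of the overall data taken by the most frequently repeated value; the present application calls p the distribution skew rate. p is compared with 1/P. If p > 1/P, data distribution skew is deemed to have occurred, and the most heavily loaded compute node must process a data volume of N*L*p. If p <= 1/P, no data distribution skew is deemed to have occurred, and each compute node must process a data volume of N*L*1/P.
If the determination result is yes, that is, data distribution skew has occurred, the execution cost of the network read operation is corrected from (N, 0, N*L*RNC) to (N, 0, P*N*L*p*RNC), and the execution cost of the local write operation is corrected from (0, 0, N*L*WC) to (0, 0, P*N*L*p*WC), where p denotes the distribution skew rate and P denotes the number of compute nodes used to perform the join processing on the data tables to be joined.
If the determination result is no, that is, no data distribution skew has occurred, the execution cost of the network read operation is (N, 0, P*N*L*(1/P)*RNC), that is, (N, 0, N*L*RNC); similarly, the execution cost of the local write operation is (0, 0, P*N*L*(1/P)*WC), that is, (0, 0, N*L*WC).
On this basis, if data distribution skew has occurred, the processing apparatus superimposes the execution cost of the local read operation (0, 0, N*L*RC), the corrected execution cost of the network read operation (N, 0, P*N*L*p*RNC), the execution cost of the local sort operation (0, N, 0), and the corrected execution cost of the local write operation (0, 0, P*N*L*p*WC) to obtain the execution cost (N, N, N*L*RC+P*N*L*p*(RNC+WC)) as the execution cost of the redistribution step.
For the ordered join step of the PSJ mode, the processing apparatus can use the execution cost (J, 0, 0) of the output operation as the execution cost of the ordered join step.
After obtaining the execution costs of the redistribution step and the ordered join step, the processing apparatus can superimpose the costs of the two steps to obtain the execution cost of the PSJ mode. Specifically:
when data distribution skew has occurred, the execution cost of the PSJ mode is (N+J, N, N*L*RC+P*N*L*p*(RNC+WC));
when no data distribution skew has occurred, the execution cost of the PSJ mode is (N+J, N, N*L*(RC+RNC+WC)).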
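The PSJ estimate, including the skew correction, can be written as one short function. This is a sketch under assumptions: the function name `psj_cost` and its argument layout are hypothetical, and p is assumed to be supplied by the caller (for example from a Top-1 value statistic) rather than computed here.

```python
import math
from typing import List, Tuple


def psj_cost(row_counts: List[int], L: float, RC: float, RNC: float,
             WC: float, P: int, p: float) -> Tuple[float, float, float]:
    """Estimate the PSJ cost triplet (rows, CPU, IO).

    row_counts -- N_1..N_n, records per table to be joined
    P          -- number of compute nodes
    p          -- distribution skew rate (share of the most frequent value)
    """
    N = sum(row_counts)
    n = len(row_counts)
    J = math.prod(row_counts) ** (1.0 / n)  # geometric mean of the N_j
    # Skew is deemed to occur when p > 1/P; the network read and local
    # write terms are then scaled by P*p, otherwise left unchanged.
    factor = P * p if p > 1.0 / P else 1.0
    io = N * L * RC + factor * N * L * (RNC + WC)
    return (N + J, N, io)
```

With two tables of 100 records each, L=10, RC=1, RNC=10, WC=2 and P=4, a skew rate of 0.1 leaves the IO term at 26,000, while a skew rate of 0.5 doubles the network and write share and raises it to 50,000.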
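If the candidate data table join mode is the BHJ mode, the processing apparatus can determine that its execution steps include a broadcast step and a hash join step. In the BHJ mode, the largest of the n data tables is taken as the main data table and the remaining tables as secondary data tables; the broadcast step mainly distributes the secondary data tables over the network to the main data table. The broadcast step therefore involves a local read operation, a network read operation, and a local write operation. This embodiment selects the network read operation as the key operation of the broadcast step, but is not limited thereto. The hash join step mainly uses a hash algorithm to obtain and output all combinations that satisfy the join condition, so the hash join step involves hash computation and an output operation. This embodiment selects the output operation as the key operation of the hash join step, but is not limited thereto.
After determining the execution steps of the BHJ mode and the key operations in each step, the processing apparatus obtains, for each execution step, the target parameters required by that step from the parameter list. Specifically:
for the broadcast step of the BHJ mode, the processing apparatus can obtain the parameters N_i, N_k, D, L, and RNC from the parameter list as the target parameters, where N_k denotes the number of data records in the main data table, k being any value in 1…n; N_i denotes the number of data records in the i-th secondary data table, i=1…n and i≠k; D denotes the size of the data block supported by each storage node; L denotes the average length of each data record; RNC denotes the unit cost of the network read operation; and n denotes the number of data tables to be joined;
for the hash join step of the BHJ mode, the processing apparatus can obtain the parameters N_j and n from the parameter list as the target parameters, where N_j denotes the number of data records in the j-th data table to be joined, j=1…n, and n denotes the number of data tables to be joined.
After obtaining the target parameters for the broadcast step and the hash join step, the processing apparatus can estimate the execution cost of the key operations in the broadcast step from that step's target parameters and key operations, and likewise for the hash join step. Specifically:
for the broadcast step of the BHJ mode, the processing apparatus can estimate from N_i, N_k, D, L, and RNC that the execution cost of the network read operation is (∑N_i*M, 0, ∑N_i*M*L*RNC), where M = N_k/D;
for the hash join step of the BHJ mode, the processing apparatus can estimate from N_j and n that the execution cost of the output operation is (J, 0, 0), where J = (∏N_j)^(1/n).
Further, the number of data records in a secondary data table may exceed the size D of the data block supported by each storage node. In that case the data records must be written from memory to external storage, and the join degenerates into a nested loop join. Because writing data records from memory to external storage is expensive, this part of the cost must be taken into account.
On this basis, before using the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step, the processing apparatus determines whether any secondary data table contains more data records than the size D of the data block supported by each storage node. If so, the execution cost (J, 0, 0) of the output operation is corrected to obtain the corrected execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) as the execution cost of the hash join step, where N_l denotes the number of data records in the l-th data table whose record count exceeds the block size D, l=1…n and l≠k.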
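After obtaining the execution costs of the key operations in the broadcast step and the hash join step, the processing apparatus can obtain the execution cost of the broadcast step from the costs of its key operations, and the execution cost of the hash join step from the costs of its key operations. Specifically:
for the broadcast step of the BHJ mode, the processing apparatus can use the execution cost (∑N_i*M, 0, ∑N_i*M*L*RNC) of the network read operation as the execution cost of the broadcast step;
for the hash join step of the BHJ mode, if some secondary data table contains more data records than the size D of the data block supported by each storage node, the processing apparatus can use the corrected execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) as the execution cost of the hash join step; if no secondary data table does, the processing apparatus can use the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step.
After obtaining the execution costs of the broadcast step and the hash join step, the processing apparatus can superimpose the costs of the two steps to obtain the execution cost of the BHJ mode. Specifically:
when some secondary data table contains more data records than the size D of the data block supported by each storage node, the execution cost of the BHJ mode is (∑N_i*M+J, N_k*∑N_l, N_k*∑N_l*L*WC+∑N_i*M*L*RNC);
when no secondary data table contains more data records than the size D of the data block supported by each storage node, the execution cost of the BHJ mode is (∑N_i*M+J, 0, ∑N_i*M*L*RNC).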
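The BHJ estimate can be sketched in the same style. Two assumptions are worth stating plainly: the function name and signature are hypothetical, and D is taken here as a threshold expressed in records so that M = N_k/D and the comparison N_l > D can be applied literally as written in the formulas. The worked example in the description instead derives M from byte sizes, so a real implementation would need to settle on one unit.

```python
import math
from typing import List, Tuple


def bhj_cost(row_counts: List[int], k: int, L: float, RNC: float,
             WC: float, D: float) -> Tuple[float, float, float]:
    """Estimate the BHJ cost triplet (rows, CPU, IO).

    row_counts -- N_1..N_n; row_counts[k] is the main table N_k
    D          -- block size per storage node, in records (assumption)
    """
    n = len(row_counts)
    N_k = row_counts[k]
    secondary = [c for i, c in enumerate(row_counts) if i != k]
    M = N_k / D
    J = math.prod(row_counts) ** (1.0 / n)

    # Broadcast step: network read of every secondary table, M times.
    bcast_rows = sum(secondary) * M
    bcast_io = bcast_rows * L * RNC

    # Hash join step: corrected when a secondary table exceeds D.
    oversized = [c for c in secondary if c > D]
    if oversized:
        cpu = N_k * sum(oversized)
        io = cpu * L * WC + bcast_io
    else:
        cpu = 0
        io = bcast_io
    return (bcast_rows + J, cpu, io)
```

The oversized branch is what makes BHJ unattractive for two large tables: the CPU and IO terms grow with the product of the table sizes.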
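If the candidate data table join mode is the BKHJ mode, the processing apparatus can determine that its execution steps include a broadcast distribution step and a hash join step. The broadcast distribution step mainly distributes all data tables to be joined to different compute nodes, which means that it involves a local read operation, a network read operation, and a local write operation. This embodiment takes the local read, network read, and local write operations as the key operations of the broadcast distribution step, but is not limited thereto. The hash join step mainly uses a hash algorithm to obtain and output all combinations that satisfy the join condition, so the hash join step involves hash computation and an output operation. This embodiment selects the output operation as the key operation of the hash join step, but is not limited thereto.
After determining the execution steps of the BKHJ mode and the key operations in each step, the processing apparatus obtains, for each execution step, the target parameters required by that step from the parameter list. Specifically:
for the broadcast distribution step of the BKHJ mode, the processing apparatus can obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters, where N denotes the total number of data records, L the average length of each data record, RC the unit cost of the local read operation, RNC the unit cost of the network read operation, and WC the unit cost of the local write operation;
for the hash join step of the BKHJ mode, the processing apparatus can obtain the parameters N_j and n from the parameter list as the target parameters, where N_j denotes the number of data records in the j-th data table to be joined, j=1…n, and n denotes the number of data tables to be joined.
After obtaining the target parameters for the broadcast distribution step and the hash join step, the processing apparatus can estimate the execution cost of the key operations in the broadcast distribution step from that step's target parameters and key operations, and likewise for the hash join step. Specifically:
for the broadcast distribution step of the BKHJ mode, the processing apparatus can estimate from N, L, RC, RNC, and WC that the execution cost of the local read operation is (0, 0, N*L*RC), that of the network read operation is (N, 0, N*L*RNC), and that of the local write operation is (0, 0, N*L*WC);
for the hash join step of the BKHJ mode, the processing apparatus can estimate from N_j and n that the execution cost of the output operation is (J, 0, 0), where J = (∏N_j)^(1/n).
Further, the number of data records in a secondary data table may exceed the size D of the data block supported by each storage node. In that case the data records must be written from memory to external storage, and the join degenerates into a nested loop join. Because writing data records from memory to external storage is expensive, this part of the cost must be taken into account.
On this basis, before using the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step, the processing apparatus determines whether any of the data tables to be joined contains more data records than the size D of the data block supported by each storage node. If so, the execution cost (J, 0, 0) of the output operation is corrected to obtain the corrected execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) as the execution cost of the hash join step, where N_l denotes the number of data records in the l-th data table whose record count exceeds the block size D, l=1…n and l≠k.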
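After obtaining the execution costs of the key operations in the broadcast distribution step and the hash join step, the processing apparatus can obtain the execution cost of the broadcast distribution step from the costs of its key operations, and the execution cost of the hash join step from the costs of its key operations. Specifically:
for the broadcast distribution step of the BKHJ mode, the processing apparatus can superimpose the execution cost of the local read operation (0, 0, N*L*RC), the network read operation (N, 0, N*L*RNC), and the local write operation (0, 0, N*L*WC) to obtain the execution cost (N, 0, N*L*(RC+RNC+WC)) as the execution cost of the broadcast distribution step;
for the hash join step of the BKHJ mode, if some secondary data table contains more data records than the size D of the data block supported by each storage node, the processing apparatus can use the corrected execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) as the execution cost of the hash join step; if no secondary data table does, the processing apparatus can use the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step.
After obtaining the execution costs of the broadcast distribution step and the hash join step, the processing apparatus can superimpose the costs of the two steps to obtain the execution cost of the BKHJ mode. Specifically:
when some secondary data table contains more data records than the size D of the data block supported by each storage node, the execution cost of the BKHJ mode is (N+J, N_k*∑N_l, N_k*∑N_l*L*WC+N*L*(RC+RNC+WC));
when no secondary data table contains more data records than the size D of the data block supported by each storage node, the execution cost of the BKHJ mode is (N+J, 0, N*L*(RC+RNC+WC)).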
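The two BKHJ cases can be folded into one function. As with any sketch of this kind, the name `bkhj_cost` and its signature are hypothetical, and D is assumed to be expressed in records so that the comparison against the secondary tables' record counts can be applied literally.

```python
import math
from typing import List, Tuple


def bkhj_cost(row_counts: List[int], k: int, L: float, RC: float,
              RNC: float, WC: float, D: float) -> Tuple[float, float, float]:
    """Estimate the BKHJ cost triplet (rows, CPU, IO).

    row_counts -- N_1..N_n; row_counts[k] is the main table N_k
    D          -- block size per storage node, in records (assumption)
    """
    N = sum(row_counts)
    n = len(row_counts)
    N_k = row_counts[k]
    J = math.prod(row_counts) ** (1.0 / n)

    # Broadcast distribution step: local read + network read + local write.
    dist_io = N * L * (RC + RNC + WC)

    # Hash join step: corrected when a secondary table exceeds D.
    oversized = [c for i, c in enumerate(row_counts) if i != k and c > D]
    if oversized:
        cpu = N_k * sum(oversized)
        return (N + J, cpu, cpu * L * WC + dist_io)
    return (N + J, 0, dist_io)
```

Unlike the BHJ broadcast term, the distribution IO here depends only on the total volume N*L, not on how many blocks the main table spans.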
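After estimating the execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined (the execution cost of each candidate for short), the processing apparatus can select, according to these estimated execution costs, the target data table join mode for performing the join computation.
Specifically, the processing apparatus can compare the estimated execution costs of the candidate data table join modes and select the candidate with the minimum execution cost, or the candidate with the optimal execution cost, as the target data table join mode.
Selecting the target data table join mode requires comparing execution costs. When the execution cost is expressed as the triplet (data record number consumption, CPU consumption, IO consumption), priorities among the three components can be set in advance to facilitate comparison. For example, the priority of the data record number consumption can be set higher than that of the CPU consumption, and the priority of the CPU consumption higher than that of the IO consumption. On this basis, when the execution costs of the candidates are compared, the data record number consumption is compared first, and the candidate with the smallest data record number consumption is selected as the target data table join mode; if the data record number consumption is the same for all, the CPU consumption is compared and the candidate with the smallest CPU consumption is selected; if the CPU consumption is also the same, the IO consumption is compared and the candidate with the smallest IO consumption is selected.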
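With the priority order rows, then CPU, then IO, the comparison amounts to a lexicographic comparison of the triplets, which Python tuples provide directly. The sketch below is illustrative: `select_by_priority` is a hypothetical helper, and lexicographic order breaks ties pairwise among the leading candidates, a slight generalization of the "all equal" wording above. The sample costs are the totals quoted for the first scenario of the worked example in the description.

```python
import math
from typing import Dict, Tuple

Cost = Tuple[float, float, float]  # (rows, CPU, IO)


def select_by_priority(costs: Dict[str, Cost]) -> str:
    """Pick the join mode whose cost triplet is lexicographically smallest.

    Tuples compare element by element, so rows dominate CPU and CPU
    dominates IO, matching the preset priority order.
    """
    return min(costs, key=lambda mode: costs[mode])


# Totals quoted in the description for R = S = 10M records.
scenario = {
    "PSJ": (40e6, 20e6, 26e9),
    "BHJ": (50e6, math.inf, math.inf),
    "BKHJ": (40e6, 100e6, 26e9),
}
```

Here PSJ and BKHJ tie on record count, and PSJ wins on the lower CPU consumption, which agrees with the conclusion stated in the description.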
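Besides comparing the execution costs of the candidate data table join modes by the priorities of the triplet components, weights can be set for the three components, denoted w1, w2, and w3. The average cost of each candidate is then computed with the formula average cost = data record number consumption*w1 + CPU consumption*w2 + IO consumption*w3, and the candidate with the smallest average cost is selected as the target data table join mode.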
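The weighted alternative reduces each triplet to a single number before comparing. In this sketch the helper names and the sample weights are hypothetical; the description does not prescribe particular values for w1, w2, and w3.

```python
from typing import Dict, Tuple

Cost = Tuple[float, float, float]     # (rows, CPU, IO)
Weights = Tuple[float, float, float]  # (w1, w2, w3)


def weighted_cost(cost: Cost, weights: Weights) -> float:
    """average cost = rows*w1 + CPU*w2 + IO*w3."""
    rows, cpu, io = cost
    w1, w2, w3 = weights
    return rows * w1 + cpu * w2 + io * w3


def select_by_weight(costs: Dict[str, Cost], weights: Weights) -> str:
    """Pick the join mode with the smallest weighted cost."""
    return min(costs, key=lambda mode: weighted_cost(costs[mode], weights))
```

Unlike the priority scheme, a weighted sum lets a large saving in one dimension offset a small loss in another, so the two selection rules can disagree on the same inputs.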
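The join computation is then performed on the data tables to be joined using the selected target data table join mode. Because the selected join mode suits the distributed data warehouse environment, performing join computations between data tables with it saves resources of the distributed data warehouse and improves query efficiency.
The execution cost of each candidate data table join mode is illustrated below with a concrete join scenario and concrete parameters.
Suppose a join computation is required on the user table and the order table. The user table is denoted R and the order table S; R contains 10M data records and S contains 10M data records. According to the distributed data warehouse environment, the following parameters are set:
N_1=10M;
N_2=10M;
N=20M;
L=100 bytes;
D=256MB;
G=1GB;
M=Input/256MB;
P=Input/1GB;
RC=1;
WC=2;
RNC=10;
WNC=10;
G above denotes the number of data records supported by a single compute node.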
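For the PSJ mode:
substituting the parameters into the execution cost (N, N, N*L*(RC+RNC+WC)) gives an execution cost of (20M, 20M, 26GB) for the redistribution step;
substituting the parameters into the execution cost (J, 0, 0) gives an execution cost of (20M, 0, 0B) for the ordered join step;
the execution cost of the PSJ mode is therefore (40M, 20M, 26GB).
For the BHJ mode, M = N_2*L/256MB = 4 is computed;
substituting the parameters into the execution cost (∑N_i*M, 0, ∑N_i*M*L*RNC) gives an execution cost of (40M, 0, 4GB) for the broadcast step;
substituting the parameters into the execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) gives an execution cost of (10M, 10M*10M, 10M*10M*100*2) for the hash join step;
the execution cost of the BHJ mode is therefore (50M, inf, inf).
For the BKHJ mode, P = N*L/1G = 8 is computed;
substituting the parameters into the execution cost (N, 0, N*L*(RC+RNC+WC)) gives an execution cost of (20M, 0, 26GB) for the broadcast distribution step;
substituting the parameters into the execution cost (J, N_k*∑N_l, N_k*∑N_l*L*WC) gives an execution cost of (20M, 10M*10M, 10M*10M*100*2) for the hash join step;
the execution cost of the BKHJ mode is therefore (40M, 100M, 26GB).
With the preset priority order in which data record number consumption ranks above CPU consumption and CPU consumption above IO consumption, the comparison shows that the PSJ mode has the lower execution cost, so the PSJ mode can be selected.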
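If S instead contains 1K data records, then:
For the PSJ mode:
the execution cost of the redistribution step is (10M, 10M, 13GB);
the execution cost of the ordered join step is (100K, 0, 0B);
the execution cost of the PSJ mode is (10M, 10M, 13GB).
For the BHJ mode, M = N_2*L/256MB = 4 is computed, which gives:
the execution cost of the broadcast step is (4K, 0, 400K);
the execution cost of the hash join step is (100K, 0, 0);
the execution cost of the BHJ mode is (104K, 0, 400K).
For the BKHJ mode, P = N*L/1G = 4 is computed, which gives:
the execution cost of the broadcast distribution step is (10M, 0, 13GB);
the execution cost of the hash join step is (100K, 0, 0B);
the execution cost of the BKHJ mode is (10M, 0, 13GB).
With the preset priority order in which data record number consumption ranks above CPU consumption and CPU consumption above IO consumption, the comparison shows that the BHJ mode has the lower execution cost, so the BHJ mode can be selected; the execution cost of the BKHJ mode is also lower than that of the PSJ mode.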
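Two of the quoted figures are easy to recompute directly from the stated formulas, which can serve as a sanity check when adapting the model. The snippet below only covers the redistribution IO of the first scenario and the output record count J of the second; the remaining figures are not rechecked here, and decimal units (1GB = 10^9 bytes) are an assumption.

```python
import math

# Scenario 1: N = 20M records, L = 100 bytes, RC = 1, RNC = 10, WC = 2.
N, L, RC, RNC, WC = 20_000_000, 100, 1, 10, 2
# IO term of the redistribution step: N * L * (RC + RNC + WC).
redistribution_io = N * L * (RC + RNC + WC)

# Scenario 2: R holds 10M records and S holds 1K records, so n = 2 and
# J is the geometric mean (square root of the product) of the two counts.
J = math.sqrt(10_000_000 * 1_000)

print(redistribution_io)  # 26000000000, about 26GB in decimal units
print(J)                  # 100000.0, i.e. 100K records
```

Both values agree with the 26GB and 100K quoted in the worked example.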
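It should be noted that, for brevity, the foregoing method embodiments are each described as a series of actions. A person skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
The descriptions of the foregoing embodiments each have their own emphasis. For a part not detailed in one embodiment, refer to the related descriptions of the other embodiments.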
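FIG. 5 is a schematic structural diagram of an apparatus for processing data table join modes according to yet another embodiment of the present application. As shown in FIG. 5, the apparatus includes a setting module 51, an estimation module 52, and a selection module 53.
The setting module 51 is configured to set a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located.
The estimation module 52 is configured to estimate, according to the parameters in the parameter list set by the setting module 51 and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when it performs the join computation on the data tables to be joined.
The selection module 53 is configured to select, according to the execution costs estimated by the estimation module 52, a target data table join mode for performing the join computation on the data tables to be joined.
In an optional implementation, the setting module 51 is specifically configured to:
set the number of data records in each of the data tables to be joined, the total number of data records, and the average length of each data record;
set the size of the data block supported by each storage node according to the file system of the distributed data warehouse;
set, according to the hardware information of the distributed data warehouse, the unit cost of each operation required by the join computation and the number of data records that each compute node can process.
Further, when setting the unit cost of each operation required by the join computation according to the hardware information of the distributed data warehouse, the setting module 51 is specifically configured to:
determine the unit cost of the local read operation and the unit cost of the local write operation according to the storage medium used by the distributed data warehouse;
determine the unit cost of the network read operation and the unit cost of the network write operation according to the network topology of the distributed data warehouse.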
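In an optional implementation, as shown in FIG. 6, one implementation structure of the estimation module 52 includes a determining sub-module 521, an estimation sub-module 522, and an obtaining sub-module 523.
The determining sub-module 521 is configured to determine, for each candidate data table join mode, the execution steps of the candidate and the key operations in each execution step.
The estimation sub-module 522 is configured to estimate the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step.
The obtaining sub-module 523 is configured to obtain the execution cost of the candidate data table join mode from the execution costs of its execution steps.
Further, as shown in FIG. 6, one implementation structure of the estimation sub-module 522 includes a parameter obtaining unit 5221, a cost estimating unit 5222, and a cost obtaining unit 5223.
The parameter obtaining unit 5221 is configured to obtain, for each execution step, the target parameters required by the step from the parameter list.
The cost estimating unit 5222 is configured to estimate the execution cost of the key operations in an execution step according to the target parameters required by the step and the key operations in the step.
The cost obtaining unit 5223 is configured to obtain the execution cost of an execution step from the execution costs of the key operations in the step.
In an optional implementation, if the candidate data table join mode is the Partitioned Sort Join mode, its execution steps include a redistribution step and an ordered join step; the key operations in the redistribution step include a local read operation, a network read operation, a local sort operation, and a local write operation; the key operations in the ordered join step include an output operation;
if the candidate data table join mode is the Broadcasted Hash Join mode, its execution steps include a broadcast step and a hash join step; the key operations in the broadcast step include a network read operation; the key operations in the hash join step include an output operation;
if the candidate data table join mode is the Blocked Hash Join mode, its execution steps include a broadcast distribution step and a hash join step; the key operations in the broadcast distribution step include a local read operation, a network read operation, and a local write operation; the key operations in the hash join step include an output operation.
Further optionally, the execution cost is expressed as a triplet (data record number consumption, CPU consumption, IO consumption).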
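The division of labour among the sub-modules and units can be pictured as a small table-driven pipeline. This is a structural sketch only: the `JOIN_MODES` table encodes the steps and key operations listed above, while the identifier names, the `op_cost` callback, and the use of plain tuples are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Cost = Tuple[float, float, float]  # (rows, CPU, IO)

# Execution steps and key operations per join mode, as listed above.
JOIN_MODES: Dict[str, Dict[str, List[str]]] = {
    "PSJ": {
        "redistribution": ["local_read", "network_read",
                           "local_sort", "local_write"],
        "ordered_join": ["output"],
    },
    "BHJ": {
        "broadcast": ["network_read"],
        "hash_join": ["output"],
    },
    "BKHJ": {
        "broadcast_distribution": ["local_read", "network_read",
                                   "local_write"],
        "hash_join": ["output"],
    },
}


def add(a: Cost, b: Cost) -> Cost:
    """Component-wise superposition of two cost triplets."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def estimate_mode(mode: str, op_cost: Callable[[str, str], Cost]) -> Cost:
    """Estimate a join mode by summing key-operation costs step by step."""
    total: Cost = (0, 0, 0)
    for step, ops in JOIN_MODES[mode].items():   # role of sub-module 521
        step_cost: Cost = (0, 0, 0)
        for op in ops:
            # op_cost plays the role of units 5221/5222: it looks up the
            # parameters it needs and prices one key operation.
            step_cost = add(step_cost, op_cost(step, op))  # unit 5223
        total = add(total, step_cost)            # role of sub-module 523
    return total
```

Passing the per-operation pricing in as a callback keeps the step table independent of any particular cost formula.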
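On this basis, the parameter obtaining unit 5221 is specifically configured to:
for the redistribution step, obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the redistribution step;
for the broadcast step, obtain the parameters N_i, N_k, D, L, and RNC from the parameter list as the target parameters required by the broadcast step;
for the broadcast distribution step, obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the broadcast distribution step;
for the ordered join step or the hash join step, obtain N_j and n from the parameter list as the target parameters required by the ordered join step or the hash join step;
N denotes the total number of data records;
L denotes the average length of each data record;
RC denotes the unit cost of the local read operation;
RNC denotes the unit cost of the network read operation;
WC denotes the unit cost of the local write operation;
N_k denotes the number of data records in the main data table among the data tables to be joined, k being any value in 1…n;
N_i denotes the number of data records in the i-th secondary data table among the data tables to be joined, i=1…n and i≠k;
D denotes the size of the data block supported by each storage node;
N_j denotes the number of data records in the j-th data table to be joined, j=1…n;
n denotes the number of data tables to be joined.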
进一步,代价预估单元5222具体用于:
对重分布步骤,根据参数N、L、RC、RNC以及WC,预估本地读操作的执行代价为(0,0,N*L*RC),网络读操作的执行代价为(N,0,N*L*RNC),本地排序操作的执行代价为(0,N,0)以及本地写操作的执行代价为(0,0,N*L*WC);
对广播步骤,根据参数Ni、Nk、D、L以及RNC,预估网络读操作的执行代价为(∑Ni*M,0,∑Ni*M*L*RNC);其中,M=Nk/D;
对广播分发步骤,根据参数N、L、RC、RNC以及WC,预估本地读操作的执行代价为(0,0,N*L*RC),网络读操作的执行代价为(N,0,N*L*RNC)以及本地写操作的执行代价为(0,0,N*L*WC);
对有序连接步骤或哈希连接步骤,根据参数Nj以及n,预估输出操作的执行代价为(J,0,0),J=(∏Nj)1/n
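Further, the cost obtaining unit 5223 is specifically configured to:
for the redistribution step, superpose the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, the execution cost (0, N, 0) of the local sort operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step;
for the broadcast step, take the execution cost (∑Ni*M, 0, ∑Ni*M*L*RNC) of the network read operation as the execution cost of the broadcast step;
for the broadcast distribution step, superpose the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, 0, N*L*(RC+RNC+WC)) as the execution cost of the broadcast distribution step;
for the sort join step or the hash join step, take the execution cost (J, 0, 0) of the output operation as the execution cost of the sort join step or the hash join step.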
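Superposition here amounts to component-wise addition of the cost triples. A minimal Python sketch follows, assuming hypothetical names (`Cost`, `superpose`) and invented sample values; it reproduces the closed forms (N, N, N*L*(RC+RNC+WC)) and (N, 0, N*L*(RC+RNC+WC)) stated above.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    """Execution cost triple: (data record count, CPU, IO)."""
    records: float
    cpu: float
    io: float


def superpose(*costs: Cost) -> Cost:
    # Component-wise sum; note that "+" on tuples would concatenate instead.
    return Cost(*(sum(axis) for axis in zip(*costs)))


# Sample parameter values, chosen only for illustration.
N, L, RC, RNC, WC = 1000, 10, 1, 3, 2

redistribution = superpose(
    Cost(0, 0, N * L * RC),   # local read
    Cost(N, 0, N * L * RNC),  # network read
    Cost(0, N, 0),            # local sort
    Cost(0, 0, N * L * WC),   # local write
)

broadcast_distribution = superpose(
    Cost(0, 0, N * L * RC),   # local read
    Cost(N, 0, N * L * RNC),  # network read
    Cost(0, 0, N * L * WC),   # local write
)
```

The two step costs differ only in the CPU component, which reflects the local sort that the redistribution step performs and the broadcast distribution step does not.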
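Still further, the cost obtaining unit 5223 is also configured to: before superposing the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, the execution cost (0, N, 0) of the local sort operation, and the execution cost (0, 0, N*L*WC) of the local write operation to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step, determine whether the data records contained in the data tables to be joined are subject to distribution skew; and if so, correct the execution cost (N, 0, N*L*RNC) of the network read operation to (N, 0, P*N*L*p*RNC), and correct the execution cost (0, 0, N*L*WC) of the local write operation to (0, 0, P*N*L*p*WC);
where p denotes the distribution skew rate;
P denotes the number of compute nodes used to perform join processing on the data tables to be joined.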
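The skew correction can be folded into the redistribution cost as an optional branch. In the Python sketch below, the function name and the boolean `skewed` flag are hypothetical; how skew is detected is not specified in this passage, so the flag is simply supplied by the caller.

```python
from typing import NamedTuple


class Cost(NamedTuple):
    """Execution cost triple: (data record count, CPU, IO)."""
    records: float
    cpu: float
    io: float


def redistribution_cost(N, L, RC, RNC, WC, skewed=False, P=1, p=0.0):
    """Cost of the redistribution step, with the optional skew correction.

    P is the number of compute nodes and p the distribution skew rate.
    """
    local_read = Cost(0, 0, N * L * RC)
    local_sort = Cost(0, N, 0)
    if skewed:
        # Corrected IO terms from the description: scaled by P * p.
        net_read = Cost(N, 0, P * N * L * p * RNC)
        local_write = Cost(0, 0, P * N * L * p * WC)
    else:
        net_read = Cost(N, 0, N * L * RNC)
        local_write = Cost(0, 0, N * L * WC)
    parts = (local_read, net_read, local_sort, local_write)
    return Cost(*(sum(axis) for axis in zip(*parts)))


balanced = redistribution_cost(1000, 10, 1, 3, 2)
skewed = redistribution_cost(1000, 10, 1, 3, 2, skewed=True, P=4, p=0.5)
```

With these sample values the skewed IO cost is 10000 + 4*1000*10*0.5*(3+2) = 110000, compared with 60000 in the balanced case; only the network read and local write terms change.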
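Still further, the cost obtaining unit 5223 is also configured to: for the hash join step, before taking the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step, determine whether any of the secondary data tables contains a number of data records greater than the size D of the data block supported by each storage node; and if so, correct the execution cost (J, 0, 0) of the output operation to obtain the corrected execution cost (J, Nk*∑Nl, Nk*∑Nl*L*WC) as the execution cost of the hash join step;
Nl denotes the number of data records contained in the l-th data table whose number of data records is greater than the size D of the data block supported by each storage node, l=1…n and l≠k.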
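The oversize check and the corrected hash join cost can be sketched as follows. The function name and argument layout are hypothetical, and Nl is read as the record count of each secondary table that exceeds D, which is an interpretation of the definition above.

```python
import math


def hash_join_cost(records, k, D, L, WC):
    """Return the (record count, CPU, IO) cost of the hash join step.

    records: record counts Nj of all tables; k: index of the primary table.
    """
    J = math.prod(records) ** (1.0 / len(records))
    Nk = records[k]
    # Secondary tables whose record count exceeds the block size D.
    oversized = [r for i, r in enumerate(records) if i != k and r > D]
    if not oversized:
        return (J, 0, 0)
    total = sum(oversized)
    return (J, Nk * total, Nk * total * L * WC)


# Sample values, invented for illustration: only the 200-record table exceeds D.
corrected = hash_join_cost([1000, 200, 50], k=0, D=100, L=10, WC=2)
uncorrected = hash_join_cost([1000, 200, 50], k=0, D=500, L=10, WC=2)
```

In the corrected case the CPU term is 1000*200 = 200000 and the IO term is 200000*10*2 = 4000000; the output record estimate J is unchanged.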
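Still further, as shown in FIG. 6, the apparatus also includes a join computation module 54.
The join computation module 54 is configured to perform the join computation on the data tables to be joined using the target data table join mode selected by the selection module 53.
The data table join mode processing apparatus provided in this embodiment sets a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located; estimates, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate join mode when performing the join computation on the data tables to be joined; and selects, according to the estimated execution costs, the target data table join mode for that join computation. The apparatus thereby selects a join mode suited to the distributed data warehouse environment, so that performing the join computation between data tables with the selected join mode saves resources of the distributed data warehouse and improves query efficiency.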
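The final selection can be pictured as taking the minimum over the candidate join modes. This passage does not state how two cost triples are compared, so the Python sketch below assumes a weighted sum of the three components; that rule, the weights, the function names, and the sample costs are all hypothetical.

```python
from typing import Dict, Tuple

Cost = Tuple[float, float, float]  # (data record count, CPU, IO)


def scalarize(cost: Cost, weights: Cost = (1.0, 1.0, 1.0)) -> float:
    # Assumed comparison rule: weighted sum of the three components.
    return sum(w * c for w, c in zip(weights, cost))


def select_join_mode(mode_costs: Dict[str, Cost],
                     weights: Cost = (1.0, 1.0, 1.0)) -> str:
    """Pick the candidate join mode with the lowest scalarized cost."""
    return min(mode_costs, key=lambda mode: scalarize(mode_costs[mode], weights))


# Sample total costs per join mode, invented for illustration.
candidates = {
    "Partitioned Sort Join": (1215.0, 1000.0, 60000.0),
    "Broadcasted Hash Join": (2715.0, 0.0, 50000.0),
    "Blocked Hash Join": (1215.0, 0.0, 60000.0),
}
target = select_join_mode(candidates)
```

Changing the weights lets the same selector favor IO, CPU, or record volume, which is one way the choice could be tuned to a particular warehouse environment.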
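Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of software functional units may be stored in a computer-readable storage medium. The software functional units are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.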

Claims (26)

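  1. A data table join mode processing method, characterized by comprising:
    setting a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located;
    estimating, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when performing the join computation on the data tables to be joined;
    selecting, according to the estimated execution cost of each candidate data table join mode when performing the join computation on the data tables to be joined, a target data table join mode for performing the join computation on the data tables to be joined.
  2. The method according to claim 1, characterized in that the setting of a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located comprises:
    setting the number of data records contained in each of the data tables to be joined, the total number of data records, and the average length of each data record;
    setting, according to the file system of the distributed data warehouse, the size of the data block supported by each storage node;
    setting, according to the hardware information of the distributed data warehouse, the unit cost of each operation required by the join computation and the number of data records that each compute node can process.
  3. The method according to claim 2, characterized in that the setting, according to the hardware information of the distributed data warehouse, of the unit cost of each operation required by the join computation comprises:
    determining, according to the storage medium used by the distributed data warehouse, the unit cost of a local read operation and the unit cost of a local write operation;
    determining, according to the network topology of the distributed data warehouse, the unit cost of a network read operation and the unit cost of a network write operation.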
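  4. The method according to claim 1, characterized in that the estimating, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, of the execution cost of each candidate data table join mode when performing the join computation on the data tables to be joined comprises:
    determining, for each candidate data table join mode, the execution steps of the candidate data table join mode and the key operations in each execution step;
    estimating the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step;
    obtaining the execution cost of the candidate data table join mode according to the execution cost of each execution step.
  5. The method according to claim 4, characterized in that the estimating of the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step comprises:
    obtaining, for each execution step, the target parameters required by the execution step from the parameter list;
    estimating the execution cost of the key operations in the execution step according to the target parameters required by the execution step and the key operations in the execution step;
    obtaining the execution cost of the execution step according to the execution cost of the key operations in the execution step.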
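  6. The method according to claim 5, characterized in that:
    if the candidate data table join mode is the partitioned sort join (Partitioned Sort Join) mode, the execution steps of the partitioned sort join mode include a redistribution step and a sort join step; the key operations in the redistribution step include a local read operation, a network read operation, a local sort operation, and a local write operation; the key operations in the sort join step include an output operation;
    if the candidate data table join mode is the broadcast hash join (Broadcasted Hash Join) mode, the execution steps of the broadcast hash join mode include a broadcast step and a hash join step; the key operations in the broadcast step include a network read operation; the key operations in the hash join step include an output operation;
    if the candidate data table join mode is the blocked hash join (Blocked Hash Join) mode, the execution steps of the blocked hash join mode include a broadcast distribution step and a hash join step; the key operations in the broadcast distribution step include a local read operation, a network read operation, and a local write operation; the key operations in the hash join step include an output operation.
  7. The method according to claim 6, characterized in that the execution cost is represented by a triple (data record count consumption, CPU consumption, IO consumption).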
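  8. The method according to claim 7, characterized in that the obtaining, for each execution step, of the target parameters required by the execution step from the parameter list comprises:
    for the redistribution step, obtaining the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the redistribution step;
    for the broadcast step, obtaining the parameters Ni, Nk, D, L, and RNC from the parameter list as the target parameters required by the broadcast step;
    for the broadcast distribution step, obtaining the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the broadcast distribution step;
    for the sort join step or the hash join step, obtaining Nj and n from the parameter list as the target parameters required by the sort join step or the hash join step;
    N denotes the total number of data records;
    L denotes the average length of each data record;
    RC denotes the unit cost of a local read operation;
    RNC denotes the unit cost of a network read operation;
    WC denotes the unit cost of a local write operation;
    Nk denotes the number of data records contained in the primary data table among the data tables to be joined, where k is any value in 1…n;
    Ni denotes the number of data records contained in the i-th secondary data table among the data tables to be joined, i=1…n and i≠k;
    D denotes the size of the data block supported by each storage node;
    Nj denotes the number of data records contained in the j-th data table among the data tables to be joined, j=1…n;
    n denotes the number of data tables among the data tables to be joined.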
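  9. The method according to claim 8, characterized in that the estimating of the execution cost of the key operations in the execution step according to the target parameters required by the execution step and the key operations in the execution step comprises:
    for the redistribution step, estimating, according to the parameters N, L, RC, RNC, and WC, the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), the execution cost of the local sort operation as (0, N, 0), and the execution cost of the local write operation as (0, 0, N*L*WC);
    for the broadcast step, estimating, according to the parameters Ni, Nk, D, L, and RNC, the execution cost of the network read operation as (∑Ni*M, 0, ∑Ni*M*L*RNC), where M=Nk/D;
    for the broadcast distribution step, estimating, according to the parameters N, L, RC, RNC, and WC, the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), and the execution cost of the local write operation as (0, 0, N*L*WC);
    for the sort join step or the hash join step, estimating, according to the parameters Nj and n, the execution cost of the output operation as (J, 0, 0), where J=(∏Nj)^(1/n).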
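  10. The method according to claim 9, characterized in that the obtaining of the execution cost of the execution step according to the execution cost of the key operations in the execution step comprises:
    for the redistribution step, superposing the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, the execution cost (0, N, 0) of the local sort operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step;
    for the broadcast step, taking the execution cost (∑Ni*M, 0, ∑Ni*M*L*RNC) of the network read operation as the execution cost of the broadcast step;
    for the broadcast distribution step, superposing the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, 0, N*L*(RC+RNC+WC)) as the execution cost of the broadcast distribution step;
    for the sort join step or the hash join step, taking the execution cost (J, 0, 0) of the output operation as the execution cost of the sort join step or the hash join step.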
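  11. The method according to claim 10, characterized in that, before the superposing of the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, the execution cost (0, N, 0) of the local sort operation, and the execution cost (0, 0, N*L*WC) of the local write operation to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step, the method comprises:
    determining whether the data records contained in the data tables to be joined are subject to distribution skew;
    if so, correcting the execution cost (N, 0, N*L*RNC) of the network read operation to (N, 0, P*N*L*p*RNC), and correcting the execution cost (0, 0, N*L*WC) of the local write operation to (0, 0, P*N*L*p*WC);
    where p denotes the distribution skew rate;
    P denotes the number of compute nodes used to perform join processing on the data tables to be joined.
  12. The method according to claim 10, characterized in that, for the hash join step, before the taking of the execution cost (J, 0, 0) of the output operation as the execution cost of the hash join step, the method comprises:
    determining whether any of the secondary data tables contains a number of data records greater than the size D of the data block supported by each storage node;
    if so, correcting the execution cost (J, 0, 0) of the output operation to obtain the corrected execution cost (J, Nk*∑Nl, Nk*∑Nl*L*WC) as the execution cost of the hash join step;
    Nl denotes the number of data records contained in the l-th data table whose number of data records is greater than the size D of the data block supported by each storage node, l=1…n and l≠k.
  13. The method according to any one of claims 1 to 12, characterized in that, after the selecting, according to the execution costs of the candidate data table join modes, of the target data table join mode for the distributed data warehouse, the method further comprises:
    performing the join computation on the data tables to be joined using the target data table join mode.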
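  14. A data table join mode processing apparatus, characterized by comprising:
    a setting module, configured to set a parameter list for cost estimation according to the distributed data warehouse environment in which the data tables to be joined are located;
    an estimation module, configured to estimate, according to the parameters in the parameter list and the execution logic of each candidate data table join mode, the execution cost of each candidate data table join mode when performing the join computation on the data tables to be joined;
    a selection module, configured to select, according to the estimated execution cost of each candidate data table join mode when performing the join computation on the data tables to be joined, a target data table join mode for performing the join computation on the data tables to be joined.
  15. The apparatus according to claim 14, characterized in that the setting module is specifically configured to:
    set the number of data records contained in each of the data tables to be joined, the total number of data records, and the average length of each data record;
    set, according to the file system of the distributed data warehouse, the size of the data block supported by each storage node;
    set, according to the hardware information of the distributed data warehouse, the unit cost of each operation required by the join computation and the number of data records that each compute node can process.
  16. The apparatus according to claim 15, characterized in that the setting module is specifically configured to:
    determine, according to the storage medium used by the distributed data warehouse, the unit cost of a local read operation and the unit cost of a local write operation;
    determine, according to the network topology of the distributed data warehouse, the unit cost of a network read operation and the unit cost of a network write operation.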
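  17. The apparatus according to claim 14, characterized in that the estimation module comprises:
    a determining submodule, configured to determine, for each candidate data table join mode, the execution steps of the candidate data table join mode and the key operations in each execution step;
    an estimating submodule, configured to estimate the execution cost of each execution step according to the parameters in the parameter list and the key operations in each execution step;
    an obtaining submodule, configured to obtain the execution cost of the candidate data table join mode according to the execution cost of each execution step.
  18. The apparatus according to claim 17, characterized in that the estimating submodule comprises:
    a parameter obtaining unit, configured to obtain, for each execution step, the target parameters required by the execution step from the parameter list;
    a cost estimating unit, configured to estimate the execution cost of the key operations in the execution step according to the target parameters required by the execution step and the key operations in the execution step;
    a cost obtaining unit, configured to obtain the execution cost of the execution step according to the execution cost of the key operations in the execution step.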
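  19. The apparatus according to claim 18, characterized in that:
    if the candidate data table join mode is the partitioned sort join (Partitioned Sort Join) mode, the execution steps of the partitioned sort join mode include a redistribution step and a sort join step; the key operations in the redistribution step include a local read operation, a network read operation, a local sort operation, and a local write operation; the key operations in the sort join step include an output operation;
    if the candidate data table join mode is the broadcast hash join (Broadcasted Hash Join) mode, the execution steps of the broadcast hash join mode include a broadcast step and a hash join step; the key operations in the broadcast step include a network read operation; the key operations in the hash join step include an output operation;
    if the candidate data table join mode is the blocked hash join (Blocked Hash Join) mode, the execution steps of the blocked hash join mode include a broadcast distribution step and a hash join step; the key operations in the broadcast distribution step include a local read operation, a network read operation, and a local write operation; the key operations in the hash join step include an output operation.
  20. The apparatus according to claim 19, characterized in that the execution cost is represented by a triple (data record count consumption, CPU consumption, IO consumption).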
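  21. The apparatus according to claim 20, characterized in that the parameter obtaining unit is specifically configured to:
    for the redistribution step, obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the redistribution step;
    for the broadcast step, obtain the parameters Ni, Nk, D, L, and RNC from the parameter list as the target parameters required by the broadcast step;
    for the broadcast distribution step, obtain the parameters N, L, RC, RNC, and WC from the parameter list as the target parameters required by the broadcast distribution step;
    for the sort join step or the hash join step, obtain Nj and n from the parameter list as the target parameters required by the sort join step or the hash join step;
    N denotes the total number of data records;
    L denotes the average length of each data record;
    RC denotes the unit cost of a local read operation;
    RNC denotes the unit cost of a network read operation;
    WC denotes the unit cost of a local write operation;
    Nk denotes the number of data records contained in the primary data table among the data tables to be joined, where k is any value in 1…n;
    Ni denotes the number of data records contained in the i-th secondary data table among the data tables to be joined, i=1…n and i≠k;
    D denotes the size of the data block supported by each storage node;
    Nj denotes the number of data records contained in the j-th data table among the data tables to be joined, j=1…n;
    n denotes the number of data tables among the data tables to be joined.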
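  22. The apparatus according to claim 21, characterized in that the cost estimating unit is specifically configured to:
    for the redistribution step, estimate, according to the parameters N, L, RC, RNC, and WC, the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), the execution cost of the local sort operation as (0, N, 0), and the execution cost of the local write operation as (0, 0, N*L*WC);
    for the broadcast step, estimate, according to the parameters Ni, Nk, D, L, and RNC, the execution cost of the network read operation as (∑Ni*M, 0, ∑Ni*M*L*RNC), where M=Nk/D;
    for the broadcast distribution step, estimate, according to the parameters N, L, RC, RNC, and WC, the execution cost of the local read operation as (0, 0, N*L*RC), the execution cost of the network read operation as (N, 0, N*L*RNC), and the execution cost of the local write operation as (0, 0, N*L*WC);
    for the sort join step or the hash join step, estimate, according to the parameters Nj and n, the execution cost of the output operation as (J, 0, 0), where J=(∏Nj)^(1/n).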
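  23. The apparatus according to claim 22, characterized in that the cost obtaining unit is specifically configured to:
    for the redistribution step, superpose the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, the execution cost (0, N, 0) of the local sort operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, N, N*L*(RC+RNC+WC)) as the execution cost of the redistribution step;
    for the broadcast step, take the execution cost (∑Ni*M, 0, ∑Ni*M*L*RNC) of the network read operation as the execution cost of the broadcast step;
    for the broadcast distribution step, superpose the execution cost (0, 0, N*L*RC) of the local read operation, the execution cost (N, 0, N*L*RNC) of the network read operation, and the execution cost (0, 0, N*L*WC) of the local write operation, to obtain the execution cost (N, 0, N*L*(RC+RNC+WC)) as the execution cost of the broadcast distribution step;
    for the sort join step or the hash join step, take the execution cost (J, 0, 0) of the output operation as the execution cost of the sort join step or the hash join step.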
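  24. The apparatus according to claim 23, characterized in that the cost obtaining unit is further configured to:
    determine whether the data records contained in the data tables to be joined are subject to distribution skew;
    if so, correct the execution cost (N, 0, N*L*RNC) of the network read operation to (N, 0, P*N*L*p*RNC), and correct the execution cost (0, 0, N*L*WC) of the local write operation to (0, 0, P*N*L*p*WC);
    where p denotes the distribution skew rate;
    P denotes the number of compute nodes used to perform join processing on the data tables to be joined.
  25. The apparatus according to claim 23, characterized in that the cost obtaining unit is further configured to:
    determine whether any of the secondary data tables contains a number of data records greater than the size D of the data block supported by each storage node;
    if so, correct the execution cost (J, 0, 0) of the output operation to obtain the corrected execution cost (J, Nk*∑Nl, Nk*∑Nl*L*WC) as the execution cost of the hash join step;
    Nl denotes the number of data records contained in the l-th data table whose number of data records is greater than the size D of the data block supported by each storage node, l=1…n and l≠k.
  26. The apparatus according to any one of claims 14 to 25, characterized by further comprising:
    a join computation module, configured to perform the join computation on the data tables to be joined using the target data table join mode.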
PCT/CN2017/075065 2016-03-14 2017-02-27 Data table joining mode processing method and apparatus WO2017157160A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17765702.0A EP3432157B1 (en) 2016-03-14 2017-02-27 Data table joining mode processing method and apparatus
US16/084,529 US11650990B2 (en) 2016-03-14 2017-02-27 Method, medium, and system for joining data tables

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610141198.4A CN107193813B (zh) 2016-03-14 2016-03-14 Data table joining mode processing method and apparatus
CN201610141198.4 2016-03-14

Publications (1)

Publication Number Publication Date
WO2017157160A1 true WO2017157160A1 (zh) 2017-09-21

Family

ID=59850087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075065 WO2017157160A1 (zh) 2016-03-14 2017-02-27 Data table joining mode processing method and apparatus

Country Status (5)

Country Link
US (1) US11650990B2 (zh)
EP (1) EP3432157B1 (zh)
CN (1) CN107193813B (zh)
TW (1) TWI753881B (zh)
WO (1) WO2017157160A1 (zh)

Also Published As

Publication number Publication date
TW201734859A (zh) 2017-10-01
US11650990B2 (en) 2023-05-16
EP3432157A4 (en) 2019-10-02
TWI753881B (zh) 2022-02-01
EP3432157A1 (en) 2019-01-23
EP3432157B1 (en) 2021-05-26
CN107193813A (zh) 2017-09-22
US20190171639A1 (en) 2019-06-06
CN107193813B (zh) 2021-05-14

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2017765702

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017765702

Country of ref document: EP

Effective date: 20181015

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17765702

Country of ref document: EP

Kind code of ref document: A1