WO2019101119A1 - 代价优化器与代价估计的方法及其设备 - Google Patents
代价优化器与代价估计的方法及其设备 Download PDFInfo
- Publication number
- WO2019101119A1 WO2019101119A1 PCT/CN2018/116874 CN2018116874W WO2019101119A1 WO 2019101119 A1 WO2019101119 A1 WO 2019101119A1 CN 2018116874 W CN2018116874 W CN 2018116874W WO 2019101119 A1 WO2019101119 A1 WO 2019101119A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- type
- connection
- predicate
- cost
- records
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
- G06F16/24542—Plan optimisation
- G06F16/24544—Join order optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/244—Grouping and aggregation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
- G06F16/24542—Plan optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
Definitions
- the present application relates to the field of computers, such as methods and apparatus for cost optimizer and cost estimation.
- Cost Based Optimizer is a core component of the database system. Due to its significant impact on the performance of the database system, it plays an important role in modern database systems.
- the core of the cost optimizer is the cost estimation model, which is used to estimate the cost of the execution plan generated by the database system to select the optimal execution plan. Estimate the quality of the model affects the pros and cons of the final implementation plan.
- the cost optimizer optimizes the execution plan based on the statistical data of the target data. The integrity and accuracy of the statistical information directly affect the estimation model's estimation of the execution plan.
- the cost optimizer is optimized to generate an optimal execution plan.
- the execution plan is generally represented by an operation tree, and the operation tree is composed of different types of operations.
- the operation type of an operation tree may be scan table, selection, filtering, aggregation, connection, projection, and the like.
- the cost optimizer can use the cost estimation model to estimate the cost of each step of the execution plan, and select the smallest overall cost from all possible execution plans as the final execution plan, thus completing the entire cost.
- the process of optimization It can be seen that the core of the cost optimizer is the cost estimation based on statistical information, which is the basis of the cost optimizer.
- the cost optimizer needs statistics and cannot complete the cost optimization in the absence of the necessary statistics. For example, a temporary table created at runtime or a scenario with subqueries cannot determine its statistics during the compilation phase, so cost optimization cannot be completed.
- a temporary table created at runtime or a scenario with subqueries cannot determine its statistics during the compilation phase, so cost optimization cannot be completed.
- the cost optimizer is applied to the big data system to process massive data, because the statistical information collection of massive data is costly, collecting the statistical information of the massive data will become the bottleneck of the application cost optimizer. For a given data set, it is not necessary to collect statistics for the full amount of data, and based on some statistical information, it is also possible to get an optimal plan. In the case where the statistical information is incomplete or unavailable, the cost optimizer in the related art fails.
- the present application provides a cost optimizer and cost estimation method and a device thereof, which can solve the problem that the cost estimation cannot be performed when the statistical information is incomplete.
- a method for estimating a cost comprising: determining whether the acquired statistical information is complete, and determining a cost estimation manner of the first operation type if the statistical information is incomplete; and estimating the cost based on the cost Means determining a first cost estimate of the first type of operation, the first type of operation being an operation type of an operation tree dependent on the statistical information; and determining a second cost estimate based on the first cost estimate and the second type of operation The cumulative cost estimate of the operation tree, the second type of operation being an operation type of an operation tree that is not dependent on the statistical information.
- an apparatus for cost estimation comprising: a judging means, a determining means and an estimating means.
- the determining device is configured to determine whether the obtained statistical information is complete. If the statistical information is incomplete, the corresponding cost estimating manner is determined according to the operation type of the operation tree that depends on the statistical information.
- the determining means is arranged to determine a cost estimate for the corresponding type of operation based on the cost estimate.
- the estimating means is arranged to determine a cumulative cost estimate of the operating tree based on a cost estimate corresponding to the cost estimate corresponding to the statistical information operation type and the operation type not dependent on the statistical information.
- a cost optimizer is further provided, the cost optimizer is configured to: generate an original execution plan; determine whether the obtained statistical information is complete, and if the statistical information is complete, according to the first based on the statistical information
- the cost estimation model estimates the cost. If the statistical information is incomplete, the cost is estimated according to the second cost estimation model; and the optimal execution plan is generated according to the cost estimated by the first cost estimation model or the cost estimated by the second cost estimation model.
- a computing-based device comprising: a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to: determine Whether the obtained statistical information is complete, and if the statistical information is incomplete, determining a corresponding cost estimation manner according to an operation type of the operation tree that depends on the statistical information; determining a cost estimate of the corresponding operation type based on the cost estimation manner And determining a cumulative cost estimate of the operation tree according to a cost estimate corresponding to the cost estimate corresponding to the statistical information operation type and an operation type not dependent on the statistical information.
- a computer readable storage medium storing computer executable instructions for performing the method of cost estimation described above.
- the corresponding cost estimation manner is determined according to the operation type of the operation tree that depends on the statistical information; and the cost estimation of the corresponding operation type is determined based on the cost estimation manner; A cumulative cost estimate for the operational tree is determined dependent on a cost estimate corresponding to the statistical information operation type and a cost estimate corresponding to the operation type not dependent on the statistical information. Therefore, the traditional cost optimizer can not estimate the cost of creating temporary tables and subqueries at runtime, and the scenario of massive data is limited by the data size.
- the execution plan corresponding to the statement of the Structured Query Language is estimated according to the cumulative cost estimation, and can be applied to the optimization of the SQL in the database system, thereby improving the accuracy of the cost estimate cost estimation of the SQL, thereby Generate a better performing execution plan.
- SQL Structured Query Language
- FIG. 1 is a flow chart showing a method for estimating a cost according to an aspect of the present application.
- FIG. 2 is a schematic diagram showing an operation tree of a tree in an embodiment of the present application.
- FIG. 3 is a block diagram showing the structure of a device for estimating a cost according to another aspect of the present application.
- FIG. 4 shows a schematic diagram of an improved cost optimizer provided in accordance with yet another aspect of the present application.
- the terminal, the device of the service network, and the trusted party each include one or more processors (eg, a central processing unit (CPU), an input/output interface, a network interface, and a memory.
- processors eg, a central processing unit (CPU), an input/output interface, a network interface, and a memory.
- the memory may include non-persistent memory, random access memory (RAM) and/or non-volatile memory in a computer readable storage medium, such as Read-Only Memory (ROM). Or flash memory (flash RAM). Memory is an example of a computer readable storage medium.
- RAM random access memory
- ROM Read-Only Memory
- flash RAM flash RAM
- Computer readable storage media including both permanent and non-persistent, removable and non-removable media may be implemented by any method or technology.
- the information can be computer readable instructions, data structures, modules of programs, or other data.
- Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory.
- RAM Phase change memory
- SRAM static random access memory
- DRAM dynamic random access memory
- RAM Random Access Memory
- ROM Read Only Memory
- EEPROM Electrically-Erasable Programmable ROM
- Flash Memory or other memory technology
- Read-Only Memory (CD-ROM) ROM Read-Only Memory
- CD-ROM Compact Disc
- DVD Digital Versatile Disc
- Computer-readable storage media does not include non-transitory computer readable media, such as modulated data signals and carrier waves.
- FIG. 1 is a schematic flowchart diagram of a method for estimating a cost according to an aspect of the present application, where the method includes: S11 to S13.
- An operation tree can include multiple types of operations, such as including types of operations that rely on statistics and types of operations that do not rely on statistics.
- the operation type of the operation tree depending on the statistical information is referred to as a first operation type
- the cost estimate of the first operation type is referred to as a first cost estimate
- the operation type of the operation tree not dependent on the statistical information is called
- the cost estimate for the second type of operation is referred to as the second cost estimate.
- a first cost estimate of the first type of operation is determined based on the cost estimation manner.
- a cost estimate corresponding to the operation type corresponding to the statistical information operation type ie, the first cost estimate of the first operation type
- an operation type not dependent on the statistical information ie, the second operation type
- Two cost estimates A cumulative cost estimate for the operational tree is determined.
- the above method can solve the scenario that the temporary table and the subquery cannot be estimated at runtime, or the scenario of the massive data is limited by the data size, and the traditional cost optimizer cannot perform the cost optimization problem.
- the method of the present application includes estimating an execution plan corresponding to the optimized structured query language statement based on the cumulative cost estimate.
- the cost estimation method described in the present application is applied to the optimization of SQL in the database system, and the accuracy of the cost estimate cost estimation of the SQL is improved, thereby generating a better performance execution plan.
- the execution plan may be cost optimized based on incomplete statistical information.
- the cost estimation model in the related technology is used for estimation, and the optimal execution plan is generated based on the estimation result; when the statistical information is incomplete, the cost estimation of the corresponding operation is performed by the method of cost estimation described in the present application. To obtain an estimate and generate an optimal execution plan based on the estimate.
- the corresponding cost estimation manner is determined according to the operation type (ie, the first operation type) of the operation tree that depends on the statistical information.
- the operation type of the operation tree that depends on the statistical information may include: a scan operation, a filtering operation, a connection operation, and an aggregation operation.
- an operation tree is composed of different operators, each of which represents an operation type, which may include scanning, filtering, connecting, projecting, aggregating, selecting, and the like.
- an operation tree diagram adjusts the operation type on the operation tree, estimates the cost of each operation type, and finally generates an operation tree with the smallest cumulative cost to generate an execution plan.
- the cost of different operation types depends on the statistical information.
- the cost of the scan operation, the filtering operation, the connection operation, and the aggregation operation depend on the statistical information, that is, only the scan operation, the filtering operation, the connection operation, and the aggregation operation are obtained.
- the complete statistical information can determine the cost estimate of the sweep operation, the filtering operation, the connection operation, and the aggregation operation; and the cost of the projection operation and the selection operation does not depend on the statistical information, that is, it is not necessary to obtain the projection operation and the selection operation.
- Complete statistical information can also be used to determine the cost estimate for projection operations and selection operations.
- the present application judges whether the obtained statistical information is complete, and if the statistical information is incomplete, the cost estimation method of the operation type depending on the statistical information is improved.
- a cost estimation manner of the operation type depending on the statistical information is determined, thereby determining a cost estimate of the operation type depending on the statistical information according to the cost estimation manner.
- a cost estimate corresponding to the type of operation of the statistical information operation type ie, the first cost estimate of the first operation type
- an operation type not dependent on the statistical information the second cost of the second operation type
- the cost estimation of the scan operation, the filtering operation, the connection operation, and the aggregation operation depending on the statistical information is accumulated with the cost estimation of the projection operation and the selection operation not dependent on the statistical information to obtain the cost estimate of the operation tree, thereby generating the most Excellent implementation plan.
- a cost estimate of the scan operation is determined according to the number of records of the data set; a selection rate of the filter condition is determined according to a type of the filter predicate, and a cost estimate of the filtering operation is determined according to the selection rate; Determining a cost estimate of the connection operation according to the determined number of records of the connection result set; determining an aggregation rate of the aggregation field according to the aggregation field and the aggregation function, and determining a cost estimate of the aggregation operation according to the aggregation rate.
- the cost of the scan operation is related to the size of the data set (also known as the number of records in the data set).
- the cost of the filtering operation is related to the selection rate of the filtering conditions, and the selection rate is calculated according to the filtering conditions.
- the selection rate of the filtering conditions is determined differently.
- the cost of the join operation is related to the size of the data set participating in the join, and also the size of the join result set after the join. Therefore, first determine the number of records in the join result set, and then determine the cost estimate of the join operation based on the number of records in the join result set. .
- the cost of the aggregation operation depends on the amount of data involved in the aggregation and the aggregation rate of the aggregation field.
- the aggregation rate of the aggregation field needs to be calculated according to the number of different values of the aggregation field. Therefore, when determining the cost of the aggregation operation, the aggregation field needs to be calculated. The number of different values.
- the number of records in the data set is recorded as RC (Row Count), and the size of the data set (denoted as A) is in a proportional relationship with the RC, that is, A ⁇ RC, so in the embodiment of the present application, You can use RC to represent the size of the data set, then the cost of the scan operation is estimated to be:
- the formula indicates that the cost estimate of the scan table operation can be determined based on the record number RC of the data set, wherein the Op represents the target operation for which the cost is to be estimated, and the getOriginalCost represents the method using the correlation technique and the RC to estimate the cost.
- the cost estimate of the filtering operation is related to the selection rate of the filtering predicate, and the estimation algorithms of the selection rates of the different types of filtering predicates are different.
- the cost of the filtering operation is related to the selection rate of the filtering condition, and the selection rate of the filtering condition is calculated according to the filtering condition.
- the identity predicate and the unequal predicate are a pair of complementary predicates, that is, the sum of the selection conditions of the filter conditions of the constant predicate and the unequal predicate is 1, therefore,
- the type of the filtering predicate is a range predicate
- the filter condition is between two values.
- v 1 , v 2 , . . . , v n are filter values
- C is a set of filter values
- n is the number of filter values in set C.
- the selection rate of the filtering condition S 1/specified value.
- the selection rate needs to be estimated according to the null value of the null predicate field.
- the filtering predicates of the cascading are, in order to prevent the estimation error from being cascading and amplifying, in the embodiment of the present application, the selection rate of the filtering conditions is estimated by the following method:
- ⁇ defines the minimum value of the filter filter rate of And cascade filter predicate, and can adjust the value of ⁇ according to the actual situation; selectivity is the selection rate, the above formula represents the cascaded selection rate value of And cascade filter predicate and And cascade filter The maximum value of the minimum value of the predicate filter rate is used as the selection rate of the filter condition.
- the selection rate of the filtering condition is determined according to the selection rate of the concatenated predicate corresponding to the or concatenated predicate.
- the filter or predicate filter predicate is:
- the selection rate of the filter condition is determined as follows:
- the selection rate of the corresponding filtering conditions is discussed for different types of filtering predicates, and the cost of the filtering operation can be calculated according to the determined selection rate:
- the number of records of the connection result set is determined according to the number of different values of the connection fields of the connected left and right tables.
- the left table of the join is denoted as T left
- the number of records is RC left
- the number of odd values is NDV left
- the right table of the join join is T right
- the number of records is RC right
- the number of distinct values is NDV right
- the number of records in the result set is recorded as RC result .
- the left and right tables participating in the connection have the following operations: selection operation, filtering operation, scan table operation, etc.
- the left and right tables respectively constitute the left operation tree and the right operation tree, and the cost estimate of each operation type on the operation tree. It can be estimated, therefore, the data set sizes of the left and right tables participating in the join (ie, RC left and RC right ) can be calculated from the estimation of the types of operations contained in the left and right tables and the existing calculation methods.
- the number of records of the connection result set is determined based on the connection type of the connection field.
- the following discussion discusses how to determine the number of records for a join result set based on the connection type of the join field.
- connection type of the connection field of the left table and the right table is connected as a primary key (PK)-foreign key (FK) form
- the record of the connection result set is determined according to the number of records of the foreign key field and the selection rate of the filtering condition of the primary key field. number.
- the connection type of the connection field of the left table and the right table is a primary key - the foreign key form connection can mean that the connection type of the connection field of the left table is a primary key form connection, and the connection type of the connection field of the right table is a foreign key form. Connection; or the connection type of the connection field of the left table is a foreign key connection, and the connection type of the connection field of the right table is a primary key connection.
- connection type of the connection field of the left table is a primary key form connection
- connection type of the connection field of the right table is a foreign key form connection
- connection type of the connection fields of the left table and the right table is a non-primary key-foreign key connection, it can be further divided into the following cases 1 - 4.
- Case 1 When the connection type of the connection fields of the left table and the right table is an inner connection, the maximum number of records of the left table and the number of records of the right table are used as the number of records of the connection result set;
- connection type of the connection field of the left table and the right table is a cross-multiply connection
- the product of the number of records of the connected left table and the number of records of the right table is used as the number of records of the connection result set
- connection type of the connection field of the left table is a left outer connection
- connection type of the connection field of the right table is a right outer connection
- the corresponding number of records of the left table or the record of the right table will be The number is the number of records that are connected to the result set.
- connection type of the connection fields of the left table and the right table is full connection
- the number of records of the connected left table and the number of records of the right table are counted as the number of records of the connection result set.
- the connection type of the connection field is a full outer join
- cost Op.getOriginalCost(RC left , RC right , RC result ).
- the cost estimate of the aggregation operation is related to the size of the data volume participating in the aggregation and the aggregation rate of the aggregation field, and the aggregation rate R agg may be defined as:
- the aggregation rate of the aggregated field needs to be calculated based on the number of unique values of the aggregated field.
- the number of unique values of the aggregated field cannot be obtained, the following discussion is made.
- the aggregation rate R agg of the aggregate field is a piecewise function as follows:
- n is the number of aggregated fields in the set of aggregated fields.
- the aggregation rate R agg of the aggregate field is:
- n is the number of aggregated fields in the set of aggregated fields
- k is a positive integer
- the aggregation rate R agg of the aggregate field is:
- n is the number of aggregated fields in the set of aggregated fields
- k is a positive integer
- the cost estimation of the operation tree may be performed by the method of cost estimation described in the foregoing embodiment of the present application, and the method for estimating the cost described in the present application may be applied.
- the cost optimizer in the related art is improved based on the method of cost estimation described in the present application, and the cost of the execution plan can be optimized based on incomplete statistical information.
- the cost estimation model in the related art can estimate the cost of the corresponding operation, and generate an optimal execution plan based on the estimation result.
- the cost optimizer in the related art cannot make a cost estimation for the corresponding operation, and the cost estimation model described in the present application (ie, the cost estimation method for the operation operation of the operation tree is improved)
- the operation performs a cost estimate and generates an optimal execution plan based on the estimation results.
- FIG. 3 is a block diagram showing the structure of a device for cost estimation according to another aspect of the present application, the device comprising: a judging device 11, a determining device 12, and an estimating device 13.
- the judging device 11 is configured to determine whether the obtained statistical information is complete. If the statistical information is incomplete, the corresponding cost estimation manner is determined according to the operation type of the operation tree depending on the statistical information.
- the determining means 12 is arranged to determine a cost estimate for the corresponding type of operation based on the cost estimate.
- the estimating means 13 is arranged to determine a cumulative cost estimate of the operating tree based on a cost estimate corresponding to the cost estimate corresponding to the statistical information operation type and the operation type not dependent on the statistical information.
- the above device can solve the problem that the cost optimizer in the related art cannot estimate the cost for creating the temporary table and the subquery at the runtime, and the scenario of the massive data is limited by the data size.
- the apparatus further comprises: an execution device configured to estimate an execution plan corresponding to the statement of the optimized structured query language based on the cumulative cost estimate.
- the cost estimation method described in the present application is applied to the optimization of SQL in the database system, and the accuracy of the cost estimate cost estimation of the SQL is improved, thereby generating a better performance execution plan.
- the cost of the execution plan can be optimized based on incomplete statistical information.
- the cost estimation model in the related technology is used for estimation, and the optimal execution plan is generated based on the estimation result; when the statistical information is incomplete, the cost estimation of the corresponding operation is performed by the method of cost estimation described in the present application. And generate an optimal execution plan based on the estimation results.
- the judging means 11 is arranged to determine the corresponding cost estimation mode according to the operation type of the operation tree depending on the statistical information when it is judged that the acquired statistical information is incomplete.
- the operation types of the operation tree that depend on the statistical information include a scan operation, a filtering operation, a connection operation, and an aggregation operation.
- an operation tree is composed of different operators, each of which represents an operation type, and an operation type of an operation tree may include operations such as scanning, filtering, connecting, projecting, aggregating, and selecting.
- an operation tree diagram adjusts the operation type on the operation tree, estimates the cost of each operation type, and finally generates an operation tree with the smallest cumulative cost to generate an execution plan, and the cost of different operation types The dependence of statistical information is different.
- the cost estimation of the sweep operation, the filtering operation, the connection operation, and the aggregation operation need to rely on statistical information, and the cost estimation of the projection operation and the selection operation does not depend on the statistical information. Therefore, it is necessary to determine the type of operation and use the corresponding cost estimation method to estimate the cost of each type of operation. When it is judged that the obtained statistical information is incomplete, the cost estimation method of the operation type depending on the statistical information is improved.
- the determining means 12 is arranged to determine a cost estimate of the type of operation dependent on the statistical information, thereby determining a cost estimate for the type of operation.
- the estimating means 13 is arranged to determine a cumulative cost estimate of the operating tree based on a cost estimate corresponding to the cost estimate corresponding to the statistical information operation type and the operation type not dependent on the statistical information. For example, the cost estimation of the scan operation, the filtering operation, the connection operation, and the aggregation operation depending on the statistical information is accumulated with the cost estimation of the projection operation and the selection operation not dependent on the statistical information to obtain the cost estimate of the operation tree, thereby generating the most Excellent implementation plan.
- the determining device 12 is configured to determine a cost estimate of the scan table operation according to the number of records of the data set; determine a selection rate of the filter condition according to the type of the filter predicate, and determine a cost estimate of the filter operation according to the selection rate. And determining a cost estimate of the connection operation according to the determined number of records of the connection result set; determining an aggregation rate of the aggregation field according to the aggregation field and the aggregation function, and determining a cost estimate of the aggregation operation according to the aggregation rate.
- the cost of the scan operation is related to the size of the data set (also referred to as the number of records in the data set).
- the cost of the filtering operation is related to the selection rate of the filtering conditions, and the selection rate is calculated according to the filtering conditions.
- the selection rate is determined differently.
- the cost of the join operation is related to the size of the data set participating in the join, and also the size of the result set after the join. Therefore, first determine the number of records in the join result set, and then determine the cost estimate of the join operation based on the number of records in the join result set.
- the cost of the aggregation operation is related to the amount of data involved in the aggregation and the aggregation rate of the aggregation field.
- the aggregation rate of the aggregation field needs to be calculated according to the number of unique values of the aggregation field. Therefore, the number of heterogeneous values of the aggregation field needs to be calculated.
- the number of records in the data set is recorded as RC (Row Count), and the size of the data set (denoted as A) is in a proportional relationship with the RC, that is, A ⁇ RC, so in the embodiment of the present application, You can use RC to represent the size of the data set, then the cost of the scan operation is estimated to be:
- the formula indicates that the cost estimate of the scan table operation can be determined based on the record number RC of the data set, wherein the Op represents the target operation for which the cost is to be estimated, and the getOriginalCost represents the method using the correlation technique and the RC to estimate the cost.
- the cost estimate of the filtering operation is related to the selection rate of the filtering predicate, and the estimation algorithms of the selection rates of the different types of filtering predicates are different.
- the cost of the filtering operation is related to the selection rate of the filtering condition, and the selection rate of the filtering condition is calculated according to the filtering condition.
- the identity predicate and the unequal predicate are a pair of complementary predicates, that is, the sum of the selection conditions of the filter conditions of the constant predicate and the unequal predicate should be 1, therefore
- the filter condition is between two values.
- v 1 , v 2 , . . . , v n are filter values
- C is a set of filter values
- n is the number of filter values in set C.
- the selection rate of the filtering condition S 1/specified value.
- the selection rate needs to be estimated according to the null value of the null predicate field.
- ⁇ defines the minimum value of the filter filter rate of And cascade filter predicate, and can adjust the value of ⁇ according to the actual situation; selectivity is the selection rate, the above formula represents the cascaded selection rate value of And cascade filter predicate and And cascade filter The maximum value of the minimum value of the predicate filter rate is used as the selection rate of the filter condition.
- the selection rate of the filtering condition is determined according to the selection rate of the concatenated predicate corresponding to the or concatenated predicate.
- the filter or predicate filter predicate is:
- the selection rate of the filter condition is determined as follows:
- the selection rate of the corresponding filter condition is discussed for different types of filter predicates, and the cost of the filtering operation can be calculated according to the determined selection rate:
- the number of records of the connection result set is determined according to the number of different values of the connection fields of the connected left and right tables.
- the left table of the join connection is T left
- the number of records is RC left
- the number of odd values is NDV left
- the right table of the join join is T right
- the number of records is RC right
- the number of distinct values is NDV right
- the number of records in the result set is recorded as RC result .
- the left and right tables participating in the connection have the following operations: selection operation, filtering operation, scan table operation, etc.
- the left and right tables respectively constitute the left operation tree and the right operation tree, and the cost estimates of each operation type on the operation tree are estimated. It can be estimated, therefore, the data set sizes of the left and right tables participating in the join (ie, RC left and RC right ) can be calculated from the estimation of the operation types of the left and right tables and the existing calculation methods.
- the number of records of the connection result set is determined based on the connection type of the connection field.
- the following discussion discusses how to determine the number of records for a join result set based on the connection type of the join field.
- connection type of the connection field is a primary key-foreign key form connection
- number of records of the connection result set is determined according to the number of records of the foreign key field and the selection rate of the filtering condition of the primary key field. If the connection type of the connection field of the left table and the connection type of the connection field of the right table are the primary key-foreign key form connection, respectively, the number of records of the connection result set RC result is:
- connection type of the connection field is non-primary key-foreign key connection, it can be further divided into the following cases 1 - case 4.
- connection result When the connection type of the connection field is an inner connection, the maximum number of records of the connected left table and the number of records of the right table is used as the number of records of the connection result set; here, in the implementation of the present application
- connection type of the connection field is a cross-multiply connection
- the product of the number of records of the connected left table and the number of records of the right table is used as the number of records of the connection result set
- connection type of the connection field When the connection type of the connection field is left outer join or right outer join, the number of records of the corresponding connected left table or the number of records of the right table is used as the number of records of the connection result set.
- connection type of the connection field When the connection type of the connection field is full connection, the number of records of the connected left table and the number of records of the right table are counted as the number of records of the connection result set.
- the connection type of the connection field is a full outer join
- cost Op.getOriginalCost(RC left , RC right , RC result ).
- the cost estimate of the aggregation operation is related to the size of the data volume participating in the aggregation and the aggregation rate of the aggregation field, and the aggregation rate R agg may be defined as:
- the aggregation rate of the aggregated field needs to be calculated based on the number of unique values of the aggregated field.
- the number of unique values of the aggregated field cannot be obtained, the following discussion is made.
- the aggregation rate R agg of the aggregate field is a piecewise function as follows:
- n is the number of aggregated fields in the set of aggregated fields
- the aggregation rate R agg of the aggregate field is:
- n is the number of aggregated fields in the set of aggregated fields, and k is a positive integer
- the aggregation rate R agg of the aggregate field is:
- n is the number of aggregated fields in the set of aggregated fields
- k is a positive integer
- a cost optimizer is further provided, wherein the cost optimizer is configured to: generate an original execution plan; determine whether the obtained statistical information is complete, and if the statistical information is complete, according to the statistical information
- the first cost estimation model ie, the cost estimation model in the related art
- the second cost estimation model based on the first cost estimation model is optimized (ie, the cost estimation model in the present application) Estimating a cost; generating an optimal execution plan based on the cost estimated by the first cost estimation model or the cost estimated by the second cost estimation model.
- the cost optimizer shown in FIG. 4 can optimize the execution plan based on incomplete statistical information.
- the cost is estimated using the first cost estimation model, wherein the first cost estimation model is a cost estimation model in the related art, and the cost estimation model in the related technology requires statistical information when performing cost optimization, and therefore, Collect complete and accurate statistics.
- an execution plan is generally represented by an operation tree, which is composed of different operation types, which may include scanning, selecting, filtering, aggregating, connecting, projecting, and the like.
- the cost optimizer can use the cost estimation model to estimate the cost of each type of operation of the execution plan, and select the smallest overall cost from all possible execution plans as the final The execution plan, thus completing the entire cost optimization process.
- the cost is estimated using the second cost estimation model optimized for the first cost estimation model, that is, the cost is estimated using the cost estimation model of the present application, and an optimal execution plan is generated based on the estimation result.
- the cost estimation model of the present application is configured to determine a corresponding cost estimation manner according to an operation type (ie, a first operation type) of an operation tree that depends on the statistical information; and determine the corresponding operation type based on the cost estimation manner.
- a cost estimate (ie, a first cost estimate of the first type of operation); a cost estimate corresponding to the cost estimate corresponding to the type of operation of the statistical information and an operation type not dependent on the statistical information (ie, the cost corresponding to the second type of operation) Estimating) determining a cumulative cost estimate for the operational tree.
- the cost estimation of the operation tree may be performed by the method performed in the device of the cost estimation described in the above embodiment of the present application, and the cost estimate described in the present application.
- Devices can be applied to temporary tables and subqueries created at runtime, as well as scenarios for massive amounts of data, which can be quickly estimated at a cost without being limited by the size of the data.
- the cost optimizer obtained by the device based on the cost estimation described in the present application can optimize the execution plan based on incomplete statistical information.
- the cost estimation model in the related art can estimate the cost of the corresponding operation, and generate an optimal execution plan based on the estimation result.
- the cost optimizer in the related art cannot make a cost estimation for the corresponding operation, and the cost estimation model of the present application (ie, the cost estimation method for the related operation of the operation tree is improved) costs the corresponding operation. Estimate and generate an optimal execution plan based on the estimates.
- a computing-based device comprising: a processor; and a memory arranged to store computer executable instructions.
- the executable instruction when executed, causes the processor to: determine whether the acquired statistical information is complete, and if the statistical information is incomplete, according to an operation type (first operation type) of the operation tree depending on the statistical information. Determining a corresponding cost estimation manner; determining a cost estimate of the corresponding operation type (ie, a first cost estimate of the first operation type) based on the cost estimation manner; and estimating a cost corresponding to the operation type depending on the statistical information ( The first cost estimate of the first type of operation and the cost estimate corresponding to the type of operation not dependent on the statistical information (ie, the second cost estimate corresponding to the second type of operation) determine a cumulative cost estimate for the operational tree.
- the present application can be implemented in software and/or a combination of software and hardware, for example, using an application specific integrated circuit (ASIC), a general purpose computer, or any other similar hardware device.
- the software program of the present application can be executed by a processor to implement the steps or functions described above.
- the software programs (including related data structures) of the present application can be stored in a computer readable recording medium such as a RAM memory, a magnetic or optical drive or a floppy disk and the like.
- some of the steps or functions of the present application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform various steps or functions.
- a portion of the present application can be applied as a computer program product, such as computer program instructions, which, when executed by a computer, can invoke or provide a method and/or technical solution in accordance with the present application.
- the program instructions for invoking the method of the present application may be stored in a fixed or removable recording medium, and/or transmitted by a data stream in a broadcast or other signal bearing medium, and/or stored in a The working memory of the computer device in which the program instructions are run.
- an embodiment in accordance with the present application includes a device including a memory for storing computer program instructions and a processor for executing program instructions, wherein when the computer program instructions are executed by the processor, triggering
- the apparatus operates based on the aforementioned methods and/or technical solutions in accordance with various embodiments of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Operations Research (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
Claims (24)
- 一种代价估计的方法,包括:判断获取到的统计信息是否完备,若所述统计信息不完备,确定第一操作类型的代价估计方式,所述第一操作类型为依赖于所述统计信息的操作树的操作类型;基于所述代价估计方式确定所述第一操作类型的第一代价估计;以及根据所述第一代价估计及第二操作类型的第二代价估计确定所述操作树的累积代价估计,所述第二操作类型为不依赖于所述统计信息的操作树的操作类型。
- 根据权利要求1所述的方法,其中,所述第一操作类型包括:扫表操作、过滤操作、连接操作和聚合操作。
- 根据权利要求1所述的方法,还包括:根据所述累积代价估计优化结构化查询语言(SQL)的语句对应的执行计划。
- 根据权利要求2所述的方法,其中,基于所述代价估计方式确定所述第一操作类型的第一代价估计,包括:根据数据集的记录数确定扫表操作的代价估计;根据过滤谓词的类型确定过滤条件的选择率,根据所述选择率确定过滤操作的代价估计;根据确定的连接结果集的记录数确定连接操作的代价估计;以及根据聚合字段及聚合函数确定聚合字段的聚合率,根据所述聚合率确定聚合操作的代价估计。
- 根据权利要求4所述的方法,其中,根据过滤谓词的类型确定过滤条件的选择率,包括:判断过滤谓词的类型对应的计算选择率的对象是否可获取,若所述计算选择率的对象不可获取,则所述过滤条件的选择率S=1/指定值。
- 根据权利要求5所述的方法,其中,判断过滤谓词的类型对应的计算选择率的对象是否可获取,若所述计算选择率的对象不可获取,则所述过滤条件的选择率,则所述过滤条件的选择率S=1/指定值,包括:当过滤谓词的类型为恒等谓词时,判断所述过滤操作的异值数是否可获取,若所述过滤操作的异值数不可获取,则所述过滤条件的选择率S=1/指定值;当过滤谓词的类型为范围谓词时,若谓词的字段的极值或异值数未能获取,则所述过滤条件的选择率S=1/指定值;以及当过滤谓词的类型为判空谓词时,判断谓词字段的空值数是否可获取,若所述谓词字段的空值数不可获取,则所述过滤条件的选择率S=1/指定值。
- 根据权利要求4所述的方法,其中,根据过滤谓词的类型确定过滤条件的选择率,包括:当过滤谓词的类型为like时,所述过滤条件的选择率S=1/指定值;当过滤谓词的类型为和级联谓词时,根据级联谓词的选择率的乘积与和级联谓词中过滤率最小值确定所述过滤条件的选择率;以及当过滤谓词的类型为或级联谓词时,根据所述或级联谓词对应的级联谓词的选择率确定所述过滤条件的选择率。
- 根据权利要求4所述的方法,根据确定的连接结果集的记录数确定连接操作的代价估计之前,包括:当相连接的左表和右表的连接字段的异值数不可获取时,基于连接字段的连接类型确定连接结果集的记录数。
- 根据权利要求8所述的方法,其中,基于连接字段的连接类型确定连接结果集的记录数,包括:当连接字段的连接类型为主键-外键形式连接时,则根据外键字段的记录数与主键字段的过滤条件的选择率确定连接结果集的记录数;当连接字段的连接类型为内连接时,则将相连接的左表的记录数及右表的记录数中的最大值作为连接结果集的记录数;当连接字段的连接类型为叉乘连接时,则将相连接的左表的记录数及右表的记录数的乘积作为连接结果集的记录数;当连接字段的连接类型为左外连接时,将相连接的左表的记录数作为连接结果集的记录数,当连接字段的连接类型为右外连接时,将相连接的右表的记 录数作为连接结果集的记录数;以及当连接字段的连接类型为全连接时,将相连接的左表的记录数及右表的记录数的累加作为连接结果集的记录数。
- 一种代价估计的设备,包括:判断装置,设置为判断获取到的统计信息是否完备,若所述统计信息不完备,确定第一操作类型的代价估计方式,所述第一操作类型为依赖于所述统计信息的操作树的操作类型;确定装置,设置为基于所述代价估计方式确定所述第一操作类型的第一代价估计;估算装置,设置为根据所述第一代价估计及第二操作类型对应的第二代价估计确定所述操作树的累积代价估计,所述第二操作类型为不依赖于所述统计信息的操作树的操作类型。
- 根据权利要求11所述的设备,其中,所述第一操作类型包括:扫表操 作、过滤操作、连接操作和聚合操作。
- 根据权利要求11所述的设备,还包括:执行装置,设置为根据所述累积代价估计优化结构化查询语言(SQL)的语句对应的执行计划。
- 根据权利要求12所述的设备,其中,所述确定装置设置为:根据数据集的记录数确定扫表操作的代价估计;根据过滤谓词的类型确定过滤条件的选择率,根据所述选择率确定过滤操作的代价估计;根据确定的连接结果集的记录数确定连接操作的代价估计;以及根据聚合字段及聚合函数确定聚合字段的聚合率,根据所述聚合率确定聚合操作的代价估计。
- 根据权利要求14所述的设备,其中,所述确定装置设置为:判断过滤谓词的类型对应的计算选择率的对象是否可获取,若所述计算选择率的对象不可获取,则所述过滤条件的选择率S=1/指定值。
- 根据权利要求15所述的设备,其中,所述确定装置设置为:当过滤谓词的类型为恒等谓词时,判断所述过滤操作的异值数是否可获取,若所述过滤操作的异值数不可获取,则所述过滤条件的选择率S=1/指定值;当过滤谓词的类型为范围谓词时,若谓词的字段的极值或异值数未能获取,则所述过滤条件的选择率S=1/指定值;以及当过滤谓词的类型为判空谓词时,判断谓词字段的空值数是否可获取,若所述谓词字段的空值数不可获取,则所述过滤条件的选择率S=1/指定值。
- 根据权利要求14所述的设备,其中,所述确定装置设置为:当过滤谓词的类型为like时,所述过滤条件的选择率S=1/指定值;当过滤谓词的类型为和级联谓词时,根据级联谓词的选择率的乘积与和级联谓词中过滤率最小值确定所述过滤条件的选择率;当过滤谓词的类型为或级联谓词时,根据所述或级联谓词对应的级联谓词的选择率确定所述过滤条件的选择率。
- 根据权利要求14所述的设备,其中,所述确定装置设置为:当相连接的左表和右表的连接字段的异值数不可获取时,基于连接字段的连接类型确定连接结果集的记录数。
- 根据权利要求18所述的设备,其中,所述确定装置设置为:当连接字段的连接类型为主键-外键形式连接时,则根据外键字段的记录数与主键字段的过滤条件的选择率确定连接结果集的记录数;当连接字段的连接类型为内连接时,则将相连接的左表的记录数及右表的记录数中的最大值作为连接结果集的记录数;当连接字段的连接类型为叉乘连接时,则将相连接的左表的记录数及右表的记录数的乘积作为连接结果集的记录数;当连接字段的连接类型为左外连接时,将相连接的左表的记录数作为所述连接结果集的记录数,当连接字段的连接类型为右外连接,则将相连接的右表的记录数作为连接结果集的记录数;以及当连接字段的连接类型为全连接时,则将相连接的左表的记录数及右表的记录数的累加作为连接结果集的记录数。
- 一种代价优化器,包括:生成原始执行计划;判断获取到的统计信息是否完备,若所述统计信息完备,则根据所述统计信息的第一代价估算模型估算代价,若所述统计信息不完备,则根据第二代价估算模型估算代价;以及根据所述第一代价估算模型估算的代价或所述第二代价估算模型估算的代价对所述原始执行计划进行优化,生成最优执行计划。
- 根据权利要求21所述的代价优化器,其中,所述第二代价估算模型,包括:确定第一操作类型的代价估计方式,所述第一操作类型为依赖于所述统计信息的操作树的操作类型;基于所述代价估计方式确定所述第一操作类型的第一代价估计;以及根据所述第一代价估计及第二代价估计确定所述操作树的累积代价估计,所述第二操作类型为不依赖于所述统计信息的操作树的操作类型。
- 一种基于计算的设备,包括:处理器;以及被安排成存储计算机可执行指令的存储器,所述可执行指令在被执行时使所述处理器:判断获取到的统计信息是否完备,若所述统计信息不完备,确定第一操作类型的代价估计方式,所述第一操作类型为依赖于所述统计信息的操作树的操作类型;基于所述代价估计方式确定所述第一操作类型的第一代价估计;根据所述第一代价估计及第二操作类型的第二代价估计确定所述操作树的累积代价估计,所述第二操作类型为依赖于所述统计信息的操作树的操作类型。
- 一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执行指令用于执行权利要求1-10任一项的方法。
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202004818SA SG11202004818SA (en) | 2017-11-22 | 2018-11-22 | Cost-based optimizer, and cost estimation method and device thereof |
EP18880194.8A EP3716093B1 (en) | 2017-11-22 | 2018-11-22 | Cost-based optimizer, and cost estimation method and device thereof |
US16/766,480 US11023466B2 (en) | 2017-11-22 | 2018-11-22 | Cost-based optimizer, and cost estimation method and device thereof |
CA3083148A CA3083148C (en) | 2017-11-22 | 2018-11-22 | Cost-based optimizer, and cost estimation method and device thereof |
JP2020538120A JP2021504852A (ja) | 2017-11-22 | 2018-11-22 | コスト最適化装置、コスト推定の方法およびそのデバイス |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711175349.9A CN107885865B (zh) | 2017-11-22 | 2017-11-22 | 一种代价优化器与代价估计的方法及其设备 |
CN201711175349.9 | 2017-11-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019101119A1 true WO2019101119A1 (zh) | 2019-05-31 |
Family
ID=61778174
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/116874 WO2019101119A1 (zh) | 2017-11-22 | 2018-11-22 | 代价优化器与代价估计的方法及其设备 |
Country Status (7)
Country | Link |
---|---|
US (1) | US11023466B2 (zh) |
EP (1) | EP3716093B1 (zh) |
JP (1) | JP2021504852A (zh) |
CN (1) | CN107885865B (zh) |
CA (1) | CA3083148C (zh) |
SG (1) | SG11202004818SA (zh) |
WO (1) | WO2019101119A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11023466B2 (en) | 2017-11-22 | 2021-06-01 | Transwarp Technology (Shanghai) Co., Ltd. | Cost-based optimizer, and cost estimation method and device thereof |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110297858B (zh) * | 2019-05-27 | 2021-11-09 | 苏宁云计算有限公司 | 执行计划的优化方法、装置、计算机设备和存储介质 |
US11200230B2 (en) | 2019-08-09 | 2021-12-14 | Couchbase, Inc. | Cost-based optimization for document-oriented database queries |
US11681687B2 (en) | 2020-08-31 | 2023-06-20 | Couchbase, Inc. | Executing transactions on distributed databases |
CN114328606B (zh) * | 2021-12-30 | 2022-11-29 | 星环信息科技(上海)股份有限公司 | Sql执行效率的提高方法、设备及存储介质 |
CN116932580A (zh) * | 2022-03-31 | 2023-10-24 | 华为技术有限公司 | 一种基数估计方法及装置 |
CN116821193B (zh) * | 2023-08-30 | 2024-01-09 | 之江实验室 | 一种基于代理模型近似处理的推理查询优化方法及装置 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103136260A (zh) * | 2011-11-30 | 2013-06-05 | 国际商业机器公司 | 评估过滤因子用于数据库中访问路径优化的方法和装置 |
CN107168977A (zh) * | 2016-03-08 | 2017-09-15 | 阿里巴巴集团控股有限公司 | 一种数据查询的优化方法及装置 |
CN107239541A (zh) * | 2017-06-02 | 2017-10-10 | 星环信息科技(上海)有限公司 | 一种代价估计的方法及设备 |
CN107885865A (zh) * | 2017-11-22 | 2018-04-06 | 星环信息科技(上海)有限公司 | 一种代价优化器与代价估计的方法及其设备 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5903888A (en) * | 1997-02-28 | 1999-05-11 | Oracle Corporation | Method and apparatus for using incompatible types of indexes to process a single query |
US7596560B2 (en) * | 2004-12-23 | 2009-09-29 | Raytheon Company | System and method for adaptive query identification and acceleration |
US8447754B2 (en) * | 2010-04-19 | 2013-05-21 | Salesforce.Com, Inc. | Methods and systems for optimizing queries in a multi-tenant store |
US9372890B2 (en) * | 2011-11-23 | 2016-06-21 | Infosys Technologies, Ltd. | Methods, systems, and computer-readable media for providing a query layer for cloud databases |
US9471631B2 (en) * | 2012-09-28 | 2016-10-18 | Oracle International Corporation | Creating and using data that indicates misestimates of actual costs |
CN107193813B (zh) | 2016-03-14 | 2021-05-14 | 阿里巴巴集团控股有限公司 | 数据表连接方式处理方法及装置 |
CN106446134B (zh) | 2016-09-20 | 2019-07-09 | 浙江大学 | 基于谓词规约和代价估算的局部多查询优化方法 |
CN107025273A (zh) * | 2017-03-17 | 2017-08-08 | 南方电网科学研究院有限责任公司 | 一种数据查询的优化方法和装置 |
US20190236188A1 (en) * | 2018-01-31 | 2019-08-01 | Salesforce.Com, Inc. | Query optimizer constraints |
-
2017
- 2017-11-22 CN CN201711175349.9A patent/CN107885865B/zh active Active
-
2018
- 2018-11-22 JP JP2020538120A patent/JP2021504852A/ja active Pending
- 2018-11-22 US US16/766,480 patent/US11023466B2/en active Active
- 2018-11-22 SG SG11202004818SA patent/SG11202004818SA/en unknown
- 2018-11-22 WO PCT/CN2018/116874 patent/WO2019101119A1/zh unknown
- 2018-11-22 CA CA3083148A patent/CA3083148C/en active Active
- 2018-11-22 EP EP18880194.8A patent/EP3716093B1/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103136260A (zh) * | 2011-11-30 | 2013-06-05 | 国际商业机器公司 | 评估过滤因子用于数据库中访问路径优化的方法和装置 |
CN107168977A (zh) * | 2016-03-08 | 2017-09-15 | 阿里巴巴集团控股有限公司 | 一种数据查询的优化方法及装置 |
CN107239541A (zh) * | 2017-06-02 | 2017-10-10 | 星环信息科技(上海)有限公司 | 一种代价估计的方法及设备 |
CN107885865A (zh) * | 2017-11-22 | 2018-04-06 | 星环信息科技(上海)有限公司 | 一种代价优化器与代价估计的方法及其设备 |
Non-Patent Citations (1)
Title |
---|
See also references of EP3716093A4 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11023466B2 (en) | 2017-11-22 | 2021-06-01 | Transwarp Technology (Shanghai) Co., Ltd. | Cost-based optimizer, and cost estimation method and device thereof |
Also Published As
Publication number | Publication date |
---|---|
CA3083148A1 (en) | 2019-05-31 |
CN107885865A (zh) | 2018-04-06 |
JP2021504852A (ja) | 2021-02-15 |
US20200379999A1 (en) | 2020-12-03 |
EP3716093A1 (en) | 2020-09-30 |
US11023466B2 (en) | 2021-06-01 |
CA3083148C (en) | 2021-10-05 |
EP3716093B1 (en) | 2021-09-15 |
CN107885865B (zh) | 2019-12-10 |
EP3716093A4 (en) | 2020-10-28 |
SG11202004818SA (en) | 2020-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019101119A1 (zh) | 代价优化器与代价估计的方法及其设备 | |
US20160275150A1 (en) | Lightweight table comparison | |
US11734234B1 (en) | Data architecture for supporting multiple search models | |
WO2020042804A1 (zh) | 一种数据库查询优化方法、装置、及计算机设备 | |
CN107239541B (zh) | 一种代价估计的方法及设备 | |
US11893011B1 (en) | Data query method and system, heterogeneous acceleration platform, and storage medium | |
US10726006B2 (en) | Query optimization using propagated data distinctness | |
US10409813B2 (en) | Imputing data for temporal data store joins | |
WO2022121098A1 (zh) | 适用于olap查询引擎的查询方法及装置 | |
WO2020199832A1 (zh) | 用于预计算系统中查询引擎的动态路由方法及装置 | |
CN111159219B (zh) | 一种数据管理方法、装置、服务器及存储介质 | |
WO2019120093A1 (en) | Cardinality estimation in databases | |
CN111723161A (zh) | 一种数据处理方法、装置及设备 | |
CN109344169B (zh) | 数据处理方法及装置 | |
CN107735781A (zh) | 存储查询结果的方法和装置、计算设备 | |
CN111857981A (zh) | 一种数据处理方法以及装置 | |
JPH11353331A (ja) | デ―タベ―スの問合せに応答する方法 | |
US7613682B2 (en) | Statistics collection using path-identifiers for relational databases | |
WO2020224242A1 (zh) | 区块链数据处理方法、装置、服务器及存储介质 | |
WO2022267183A1 (zh) | 预计算模型的评分方法、装置、设备和存储介质 | |
CN114328606B (zh) | Sql执行效率的提高方法、设备及存储介质 | |
US20190362254A1 (en) | Systems and methods for estimating validation time for fraud detection rules | |
CN114238500A (zh) | 区块链交易的存储方法、装置、电子设备及可读存储介质 | |
CN113918771A (zh) | 批流融合的信息处理方法和装置、存储介质 | |
WO2024041221A1 (zh) | 一种选择率估算方法及估算装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18880194 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 3083148 Country of ref document: CA |
|
ENP | Entry into the national phase |
Ref document number: 2020538120 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2018880194 Country of ref document: EP Effective date: 20200622 |