CN114780554B - Method and device for processing database query statement - Google Patents


Info

Publication number
CN114780554B
Authority
CN
China
Prior art keywords
query
clause
data
database
limit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210707889.1A
Other languages
Chinese (zh)
Other versions
CN114780554A (en
Inventor
郑来磊
朱涛
熊仲健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oceanbase Technology Co Ltd
Original Assignee
Beijing Oceanbase Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Oceanbase Technology Co Ltd filed Critical Beijing Oceanbase Technology Co Ltd
Priority to CN202210707889.1A priority Critical patent/CN114780554B/en
Publication of CN114780554A publication Critical patent/CN114780554A/en
Application granted granted Critical
Publication of CN114780554B publication Critical patent/CN114780554B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a method and a device for processing a database query statement. The method comprises: receiving a database query statement that includes a LIMIT clause and a join operation of a first data set and a second data set; and splitting the database query statement into a target query and an external query of the target query, where the target query includes the first data set and the LIMIT clause, and the external query includes the join operation of the query result of the target query and the second data set.

Description

Method and device for processing database query statement
Technical Field
The present disclosure relates to the field of databases, and in particular, to a method and an apparatus for processing a database query statement.
Background
Join operations are frequently used when accessing databases. In a database query statement, if a join operation is used together with a LIMIT clause, the join operation must first be performed on the data sets, and only after the join result is obtained is the LIMIT clause applied to limit the number of rows returned from the join result. This execution order may produce a large amount of redundant data, resulting in poor execution performance of the database query statement.
Disclosure of Invention
In view of this, the present disclosure provides a method and an apparatus for processing a database query statement to improve the execution performance of the database query statement.
In a first aspect, a method for processing a database query statement is provided, including: receiving a database query statement, where the database query statement includes a LIMIT clause and a join operation of a first data set and a second data set; and splitting the database query statement into a target query and an external query of the target query, where the target query includes the first data set and the LIMIT clause, and the external query includes the join operation of the query result of the target query and the second data set.
In one possible implementation, the database query statement further includes a first related condition clause for the first data set, and the target query further includes the first related condition clause.
In one possible implementation, the join operation includes outer-joining the second data set to the first data set, with the first data set as the base.
In one possible implementation, the join operation includes a cross join operation, and the external query further includes the LIMIT clause.
In one possible implementation, the target query is encapsulated in a view.
In one possible implementation, the first related condition clause includes a WHERE clause and/or an ORDER BY clause.
In a second aspect, an apparatus for processing a database query statement is provided, including: a receiving unit, configured to receive a database query statement, where the database query statement includes a LIMIT clause and a join operation of a first data set and a second data set; and a processing unit, configured to split the database query statement into a target query and an external query of the target query, where the target query includes the first data set and the LIMIT clause, and the external query includes the join operation of the query result of the target query and the second data set.
In one possible implementation, the database query statement further includes a first related condition clause for the first data set, and the target query further includes the first related condition clause.
In one possible implementation, the join operation includes outer-joining the second data set to the first data set, with the first data set as the base.
In one possible implementation, the join operation includes a cross join operation, and the external query further includes the LIMIT clause.
In one possible implementation, the target query is encapsulated in a view.
In one possible implementation, the first related condition clause includes a WHERE clause and/or an ORDER BY clause.
In a third aspect, there is provided an apparatus for processing a database query statement, where the apparatus has the function of implementing the apparatus for processing a database query statement in the method design of the first aspect. These functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above functions.
In a fourth aspect, an apparatus for processing a database query statement is provided that includes a processor and a memory. The memory is adapted to store a computer program, and the processor is adapted to retrieve and run the computer program from the memory, such that the apparatus performs the method of the first aspect.
In a fifth aspect, a database is provided that includes a processor and a memory. The memory is adapted to store a computer program, and the processor is adapted to call and run the computer program from the memory, so that the database performs the method of the first aspect.
In some implementations, the database may be a native distributed database.
In a sixth aspect, there is provided a computer program product comprising: computer program code which, when run on a computer, causes the computer to perform the method of the above-mentioned aspects.
In a seventh aspect, a computer-readable medium is provided, which stores program code, which, when run on a computer, causes the computer to perform the method of the above-mentioned aspects.
Drawings
FIG. 1 is a flow chart of a method of processing a database query statement of an embodiment of the present disclosure.
FIG. 2 is a schematic diagram of an apparatus for processing a database query statement according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of an apparatus for processing a database query statement according to another embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described below with reference to the drawings in the embodiments of the present disclosure. In the following description, reference is made to the accompanying drawings which form a part hereof and in which is shown by way of illustration specific aspects of embodiments of the disclosure or in which aspects of embodiments of the disclosure may be practiced. It should be understood that embodiments of the present disclosure may be used in other respects, and may include structural or logical changes not depicted in the drawings. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims. For example, it should be understood that the disclosure in connection with the described methods may equally apply to the corresponding apparatus or system for performing the methods, and vice versa. For example, if one or more particular method steps are described, the corresponding apparatus may comprise one or more units, such as functional units, to perform the described one or more method steps (e.g., a unit performs one or more steps, or multiple units, each of which performs one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a particular apparatus is described based on one or more units, such as functional units, the corresponding method may comprise one step to perform the functionality of the one or more units (e.g., one step performs the functionality of the one or more units, or multiple steps, each of which performs the functionality of one or more of the plurality of units), even if such one or more steps are not explicitly described or illustrated in the figures. 
Further, it is to be understood that features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless explicitly stated otherwise.
For ease of understanding, some technical background related to the embodiments of the present disclosure is introduced first. The related art below may, as alternatives, be combined arbitrarily with the technical solutions of the embodiments of the present disclosure, and all such combinations fall within the scope of protection of the embodiments of the present disclosure. Embodiments of the present disclosure include at least some of the following.
Join operation
In a relational database, database query statements (e.g., Structured Query Language (SQL) statements) often contain join operations (JOIN). Based on a join operation, two data sets (e.g., two database tables) may be merged according to a certain condition to form a new data set.
There are a variety of join types. For example, the join types may include an inner join (INNER JOIN), a cross join (CROSS JOIN), an outer join (OUTER JOIN), and the like. The meaning of these joins is briefly described below.
An inner join only joins the matching rows of the two data tables. Assuming tables A and B, the result of (A INNER JOIN B ON A.a = B.b) is obtained as follows: first take the Cartesian product of A and B, then perform a selection operation that keeps only the rows of the Cartesian product satisfying A.a = B.b.
A cross join can be viewed as an inner join without a join condition. A cross join of tables A and B yields the Cartesian product of tables A and B.
An outer join returns the rows in which the two tables match and, in addition, the unmatched rows of one of the tables (e.g., the "base table"), padded with NULL values. Outer joins include the left outer join (LEFT OUTER JOIN), the right outer join (RIGHT OUTER JOIN), and the like. A left outer join takes each tuple of the left table as a tuple of the result and adds to it the attributes of the right-table tuples that satisfy the join condition. For a left-table tuple that has no right-table tuple satisfying the predicate, the attributes coming from the right table are set to NULL in the final result. In terms of the result produced, a left outer join is equivalent to an inner join plus the left-table tuples whose attribute values match no tuple of the right table. The right outer join is symmetrical to the left outer join.
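The three join types can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a convenient SQL engine; the table contents below are made-up sample data, not taken from the disclosure.

```python
import sqlite3

# Hypothetical sample data: A.a in {1, 2, 3}, B.b in {2, 3, 4}.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER);
    CREATE TABLE B (b INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (3), (4);
""")

# Inner join: only the row pairs that satisfy A.a = B.b.
inner_rows = conn.execute(
    "SELECT A.a, B.b FROM A INNER JOIN B ON A.a = B.b ORDER BY A.a"
).fetchall()

# Cross join: the full Cartesian product of A and B.
cross_rows = conn.execute("SELECT A.a, B.b FROM A CROSS JOIN B").fetchall()

# Left outer join: every row of A is kept; unmatched rows get NULL (None) for B.b.
left_rows = conn.execute(
    "SELECT A.a, B.b FROM A LEFT OUTER JOIN B ON A.a = B.b ORDER BY A.a"
).fetchall()

print(inner_rows)       # [(2, 2), (3, 3)]
print(len(cross_rows))  # 9
print(left_rows)        # [(1, None), (2, 2), (3, 3)]
```

The row of A with a = 1 has no partner in B, so it disappears from the inner join but survives the left outer join with a NULL in the B column.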
View
A view (VIEW) may be understood as a virtual table based on the result set of a database query statement. A view contains rows and columns, and the fields in a view are fields from real tables in one or more databases. There are various types of views, such as standard views, inline views, materialized views, and so forth.
Database query statements often require simultaneous queries of data in multiple data sets to implement business analysis (e.g., analytical queries, etc.). As one example, the data sets may be stored in a database in the form of data tables. When multiple different operations are included in a database query statement at the same time (e.g., a join operation and a LIMIT operation are included at the same time), different orders of execution may produce different sets of intermediate results. In the process of query optimization, reducing the data volume of the intermediate result set as much as possible can generally greatly reduce the response time of the query. It should be appreciated that the shorter the response time of the query, the more efficient the query, and the better the performance of the database.
Take as an example a database query statement that includes a LIMIT clause and a join operation over multiple data tables. If the complete result of the multi-table join is computed first, and the specified number of data records is then taken from the complete result and returned, redundant data may be generated at the join stage. Redundant data may also be referred to as unnecessary data, and may refer to, for example, data that is unrelated to the data that is ultimately returned.
The redundant data generated at the join stage may make the join computationally expensive. Especially when a data set contains many rows, performing the join on the data set first and only then limiting the number of returned rows with the LIMIT clause may generate an extremely large amount of redundant data. This redundant data occupies a large amount of memory, and computing it has a large impact on query efficiency (e.g., it significantly reduces query speed).
For ease of understanding, database query statement 1 is taken as an example. Assume that database query statement 1 queries three database tables (referred to as table A, table B, and table C), and that it includes a join operation joining table A, table B, and table C, and a LIMIT clause restricting the result to the first 10 rows of data records.
If the join operation includes an outer join operation, after the database receives database query statement 1, table A, table B, and table C need to be joined using the outer join operation. That is, all rows of table A are scanned first, and each row of table A is taken out in turn and matched against table B. After matching result 1 of tables A and B is obtained, an outer join operation is performed on matching result 1 and table C to obtain matching result 2. Finally, the first 10 rows of data records are selected from matching result 2 and returned as the final query result. In this case, if the data amount of table A is very large (for example, 500 rows of data), a great deal of redundant data and a large computational overhead may be generated during the join.
If the join operation includes a cross join operation, after the database receives database query statement 1, table A, table B, and table C need to be scanned completely to obtain the Cartesian product of the three database tables. The Cartesian product greatly increases the number of rows after the join. For example, table A includes 200 rows of data, table B includes 300 rows of data, and table C includes 50 rows of data. Cross-joining the three database tables then generates 200 × 300 × 50 (i.e., 3,000,000) rows of records. Because of the restriction imposed by the LIMIT clause, only the first 10 rows need to be returned from these 200 × 300 × 50 rows, and the rest are redundant data. Therefore, when database tables are cross-joined, a great deal of redundant data can be generated, which seriously affects query efficiency.
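The arithmetic of the cross join example can be checked with a small sketch. It assumes an in-memory SQLite database accessed through Python's sqlite3 module; the single `id` column and the row values are hypothetical, and only the row counts (200, 300, 50) and the LIMIT of 10 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Build tables A, B and C with 200, 300 and 50 rows respectively.
for name, n in (("A", 200), ("B", 300), ("C", 50)):
    conn.execute(f"CREATE TABLE {name} (id INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(n)])

# Size of the full Cartesian product that the join stage has to produce.
total = conn.execute(
    "SELECT COUNT(*) FROM A CROSS JOIN B CROSS JOIN C"
).fetchone()[0]

returned = 10                 # rows kept by LIMIT 10
redundant = total - returned  # rows computed but never returned

print(total, redundant)  # 3000000 2999990
```

Under these assumptions all but 10 of the 3,000,000 joined rows are redundant.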
In view of the foregoing problems, embodiments of the present disclosure provide a method and an apparatus for processing a database query statement, and a native distributed database, so as to improve the execution efficiency of the database query statement. The method of the disclosed embodiment is described below in conjunction with FIG. 1.
FIG. 1 is a schematic flowchart of a method for processing a database query statement according to an embodiment of the present disclosure. The method shown in FIG. 1 may be performed by a server side of a database. In some embodiments, the method illustrated in FIG. 1 may be performed by a query optimizer in a database. The database may be a generic database or a distributed database, such as a native distributed database.
A user of the database may access the database through a database query statement. For example, a user may read data from the database using a database query statement. Alternatively, the user may write data to the database using a database query statement. The database query statement referred to herein may be, for example, an SQL statement. One or more data sets may be stored in the database. The data sets are typically stored in the form of database tables.
The method shown in FIG. 1 may include steps S110 to S120, each of which is described below.
Referring to FIG. 1, in step S110, a database query statement is received. The database query statement may include a LIMIT clause and a join operation of the first data set and the second data set.
The LIMIT clause may be used to limit the amount of data in the data set returned by the database query statement. The LIMIT clause may be used to force the SELECT statement to return a specified number of records (e.g., the first specified number of rows). In some implementations, the LIMIT clause may contain one or two numerical parameters, each of which must be an integer constant. If the LIMIT clause contains two parameters, the first parameter indicates the offset of the first returned row and the second parameter specifies the maximum number of returned rows. If the LIMIT clause contains one parameter, that parameter indicates the maximum number of returned rows, and the offset of the first returned row defaults to 0.
For example, the LIMIT clause is written as "LIMIT K1, K2", where K1 and K2 are integers greater than or equal to 0 and, in this example, K2 is greater than K1. Here the parameter K1 indicates the offset of the first returned row, and the parameter K2 indicates the maximum number of returned rows; that is, the LIMIT clause returns at most K2 data records starting from row K1+1 of the database table.
For another example, the LIMIT clause is written as "LIMIT K3", where K3 is an integer greater than or equal to 0. Here the parameter K3 indicates the maximum number of returned rows; that is, the LIMIT clause returns the first K3 rows of data in the database table.
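The one-parameter and two-parameter forms can be tried out with a short sketch. It assumes SQLite (via Python's sqlite3 module), which accepts the same "LIMIT offset, count" spelling; the table `t` and its ten rows are hypothetical. In this engine the two-parameter form returns at most `count` rows starting right after the first `offset` rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 11)])

# One parameter: at most 3 rows, offset defaults to 0.
first_three = [row[0] for row in conn.execute(
    "SELECT id FROM t ORDER BY id LIMIT 3")]

# Two parameters: skip 2 rows (offset), then return at most 4 rows (count).
window = [row[0] for row in conn.execute(
    "SELECT id FROM t ORDER BY id LIMIT 2, 4")]

print(first_three)  # [1, 2, 3]
print(window)       # [3, 4, 5, 6]
```

An ORDER BY is included so that "first rows" is well defined; without it the engine is free to return any rows that satisfy the limit.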
The embodiments of the present disclosure do not specifically limit the type of the join operation. In some embodiments, the join operation may be an outer join operation and/or a cross join operation. Taking an outer join operation as an example, the join operation may be a left outer join operation, a right outer join operation, or the like.
Taking a left outer join operation as an example, the database query statement may be represented as the following database query statement 1:
SELECT *
FROM T_1
LEFT JOIN T_2
ON COND_1 ...
LIMIT K;
That is, the data table T_1 and the data table T_2 are left-outer-joined according to the join condition "COND_1" to obtain a join result, and the first K rows of the join result need to be returned.
In step S120, the database query statement is split into the target query and an external query of the target query.
The target query includes the first data set and the LIMIT clause. The external query includes a join operation of the query result of the target query with the second data set. That is, the LIMIT clause is pushed down into the target query to restrict the query result of the target query, and the join operation with the second data set is performed on the restricted query result.
The join operation between data sets may include join conditions, and the number of join conditions may be chosen as needed. It should be understood that the join operation between database tables may also include no join condition.
It should be noted that the method provided by the embodiment of the present disclosure is a form of query rewriting. Rewriting is based on equivalent transformations of relational algebra, i.e., one form of a database query statement is rewritten into another form while its semantics remain unchanged. In other words, after step S120 is executed, the query result obtained from the split (i.e., rewritten) target query and its external query is the same as the query result obtained from the database query statement before splitting. Generally, the rewritten database query statement incurs lower computational overhead in the join operation, so the execution efficiency of the database query statement can be improved.
Therefore, the join operation between the query result of the target query and the second data set is of the same type as the join operation in the database query statement. For example, when the join operation in the database query statement is an outer join, the join operation between the query result of the target query and the second data set is also an outer join.
In the embodiment of the present disclosure, the database query statement is split into the target query and the external query of the target query, and the LIMIT clause is pushed down into the target query. Before the external query is executed, the LIMIT clause thus limits the number of rows in the query result of the target query, and during execution of the external query the join operation with the second data set is performed on this limited query result. This helps reduce the amount of redundant data produced during execution and improves the execution performance of the database query statement. It avoids the problem of the conventional execution process, in which the first data set and the second data set are joined first and the number of rows returned from the join result is limited by the LIMIT clause only afterwards, so that a large amount of redundant data is generated while the join result is computed and the execution performance of the database query statement is reduced.
Typically, when a view (e.g., an inline view) is included in a database query statement along with an external query statement, the statement in the view will be executed in preference to the external query statement. Thus, in embodiments of the present disclosure, to ensure that the above-described target query is executed in preference to the external query statement, the target query may be encapsulated in a view. Of course, in the embodiment of the present disclosure, other ways may also be adopted to ensure that the target query is executed in preference to the external query statement, which is not limited by the embodiment of the present disclosure.
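The split can be sketched for the left outer join of database query statement 1. This is an illustration under stated assumptions rather than the disclosed implementation: it uses SQLite through Python's sqlite3 module, made-up columns (`id`, `v`, `w`), a join condition `T_1.id = T_2.id` standing in for COND_1, and an ORDER BY on T_1 so that "the first K rows" is deterministic. `T_2.id` is declared unique so that each row of T_1 matches at most one row of T_2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# T_2.id is a primary key, so a T_1 row has at most one match in T_2 (assumption).
conn.executescript("""
    CREATE TABLE T_1 (id INTEGER PRIMARY KEY, v TEXT);
    CREATE TABLE T_2 (id INTEGER PRIMARY KEY, w TEXT);
""")
conn.executemany("INSERT INTO T_1 VALUES (?, ?)",
                 [(i, f"v{i}") for i in range(1, 501)])
conn.executemany("INSERT INTO T_2 VALUES (?, ?)",
                 [(i, f"w{i}") for i in range(1, 501, 2)])

K = 10

# Statement before splitting: join all of T_1 with T_2, then apply LIMIT.
original = f"""
    SELECT T_1.id, T_1.v, T_2.w
    FROM T_1 LEFT JOIN T_2 ON T_1.id = T_2.id
    ORDER BY T_1.id
    LIMIT {K}
"""

# After splitting: the target query (inline view V) holds T_1 and the LIMIT;
# the external query left-joins the K-row result of V with T_2.
rewritten = f"""
    SELECT V.id, V.v, T_2.w
    FROM (SELECT * FROM T_1 ORDER BY id LIMIT {K}) AS V
    LEFT JOIN T_2 ON V.id = T_2.id
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()

# Compare as sets of rows; the ids are unique, so sorting by id is sufficient.
same = (sorted(original_rows, key=lambda r: r[0])
        == sorted(rewritten_rows, key=lambda r: r[0]))
print(len(original_rows), len(rewritten_rows), same)
```

In the rewritten form the join only ever sees K rows of T_1 instead of all 500.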
It should be noted that, to ensure the accuracy of the query result, the split target query together with its external query must be equivalent to the database query statement before splitting. That is, the query result obtained from the target query and its external query is equivalent to the query result obtained from the database query statement before splitting.
As described above, the split target query and its external query are equivalent to the database query statement before splitting, but different types of join operations may affect the number of rows in the query result of the external query.
In some scenarios, the LIMIT clause may also be included in the external query to limit the number of rows returned by the external query. For example, when the join operation is a cross join, the number of rows in the join result is usually the number of rows in the query result of the target query multiplied by the number of rows in the second data set. In this case, so that the query result of the external query contains only K rows and is therefore the same as the query result of the database query statement before splitting, the external query may also include the LIMIT clause to restrict its query result.
In other scenarios, the join operation of the external query does not increase the number of rows in the query result of the external query, and the LIMIT clause may then be omitted from the external query. For example, the join operation is an outer join and the first data set is the base table of the outer join; that is, the join operation outer-joins the second data set to the first data set, with the first data set as the base. In this case, regardless of how many rows the second data set contains, the characteristics of the outer join operation mean that the number of rows in the query result of the external query is the same as the number of rows in the query result of the target query. The query result obtained from the target query and its external query is therefore equivalent to the query result obtained from the database query statement before splitting, and the external query need not include the LIMIT clause. Of course, the LIMIT clause may also be included in the external query in this scenario. The embodiments of the present disclosure are not limited in this respect.
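The cross join case can be sketched as follows. This is a minimal illustration assuming SQLite via Python's sqlite3 module and two hypothetical single-column tables of 200 and 300 rows. Because no ORDER BY is used, which K rows are returned is engine-dependent, so only row counts are compared here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (id INTEGER)")
conn.execute("CREATE TABLE T2 (id INTEGER)")
conn.executemany("INSERT INTO T1 VALUES (?)", [(i,) for i in range(1, 201)])
conn.executemany("INSERT INTO T2 VALUES (?)", [(i,) for i in range(1, 301)])

K = 10

# Rows the join produces when LIMIT is applied only at the very end.
full_product = conn.execute(
    "SELECT COUNT(*) FROM T1 CROSS JOIN T2").fetchone()[0]

# Rows the join produces once LIMIT K has been pushed into the target query V.
limited_product = conn.execute(
    f"SELECT COUNT(*) FROM (SELECT * FROM T1 LIMIT {K}) AS V CROSS JOIN T2"
).fetchone()[0]

# The external query keeps its own LIMIT K so that exactly K rows are returned.
original_count = len(conn.execute(
    f"SELECT * FROM T1 CROSS JOIN T2 LIMIT {K}").fetchall())
rewritten_count = len(conn.execute(
    f"SELECT * FROM (SELECT * FROM T1 LIMIT {K}) AS V CROSS JOIN T2 LIMIT {K}"
).fetchall())

print(full_product, limited_product)    # 60000 3000
print(original_count, rewritten_count)  # 10 10
```

Without the second LIMIT in the external query, the rewritten statement would return 3000 rows instead of 10.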
In some cases, a related condition clause (hereinafter also referred to as a "first related condition clause") may also be included in the database query statement. A related condition clause may be used to filter the join result and may include any number of conditions. Related condition clauses may include, for example, a WHERE clause, an ORDER BY clause, and the like. The WHERE clause extracts the records that meet specified conditions. The ORDER BY clause sorts the result set by one or more columns.
Taking an outer join operation as an example, a database query statement containing related condition clauses can be expressed as:
SELECT * FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 LEFT JOIN T3 ON T1.C1>T3.C1 WHERE T1.C1>T2.C1 ORDER BY T1.C1, T2.C2 LIMIT 15;
In the above example, LEFT JOIN indicates a left outer join operation. The join conditions of the left outer join operations (i.e., the conditions following the keyword ON) are "T1.C1 = T2.C1" and "T1.C1 > T3.C1". The related condition clauses of the database query statement include a WHERE clause and an ORDER BY clause. In addition, because of the restriction imposed by the LIMIT clause, the database query statement returns only 15 rows of data records.
As another example, the join operation in the database query statement is a cross-join operation, and the specific form of the database query statement is as follows:
SELECT * FROM T1, T2, T3 WHERE T1.C1>T3.C1 AND T2.C1>0 ORDER BY T3.C2 LIMIT 10.
In the above example, "T1, T2, T3" represents cross-join operations, which may also be written as T1 CROSS JOIN T2 CROSS JOIN T3. The database query statement may include a first condition "WHERE T1.C1>T3.C1 ORDER BY T3.C2" and a second condition "WHERE T2.C1>0".
As the above description shows, the execution order of the related condition clauses affects the content of the final query result. Take the case where the database query statement includes a WHERE clause, which is used to select the data records that satisfy a condition. According to the conventional execution order of a database query statement, the data records satisfying the condition are first selected based on the WHERE clause, and the number of returned rows is then limited based on the LIMIT clause. If the LIMIT clause is executed first and the WHERE clause afterwards, the query result obtained may differ from the one obtained with the conventional execution order. The ORDER BY clause has a similar problem.
Thus, to avoid the above problem, if the database query statement includes a related condition clause for the first data set, the related condition clause may be placed in the target query together with the LIMIT clause during splitting, i.e., the target query further includes the related condition clause. In this way, when the target query is executed, the related condition clause is executed before the LIMIT clause, which ensures that the database query statement is equivalent before and after splitting.
It should be noted that, because a related condition clause usually operates on one or more specific data sets, the data sets on which the related condition clause operates may be placed in the target query together with it during splitting.
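The effect of the execution order can be checked with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module; the single-column table T1 and its ten rows are assumptions made only for this illustration and are not taken from the disclosure.

```python
import sqlite3

# Hypothetical data: T1 holds the integers 1..10 in column C1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (C1 INTEGER)")
conn.executemany("INSERT INTO T1 VALUES (?)", [(i,) for i in range(1, 11)])

# Conventional order: filter with WHERE, sort, then apply LIMIT.
where_then_limit = conn.execute(
    "SELECT C1 FROM T1 WHERE C1 > 5 ORDER BY C1 LIMIT 3"
).fetchall()

# LIMIT applied first (inside the subquery), WHERE applied afterwards.
limit_then_where = conn.execute(
    "SELECT C1 FROM (SELECT C1 FROM T1 ORDER BY C1 LIMIT 3) WHERE C1 > 5"
).fetchall()

print(where_then_limit)  # [(6,), (7,), (8,)]
print(limit_then_where)  # []
```

The two results differ because the inner LIMIT keeps rows 1 to 3, none of which satisfies the WHERE condition. This is why the related condition clause is moved into the target query together with the LIMIT clause.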
Continuing with the above database query statement containing the outer join operation as an example:
SELECT * FROM T1 LEFT JOIN T2 ON T1.C1=T2.C1 LEFT JOIN T3 ON T1.C1>T3.C1 WHERE T1.C1>T2.C1 ORDER BY T1.C1, T2.C2 LIMIT 15.
Accordingly, the database query statement may be split into the following database query statement:
SELECT * FROM (T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1>T2.C1 ORDER BY T1.C1, T2.C2) LEFT JOIN T3 ON T1.C1>T3.C1 LIMIT 15.
where "(T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1>T2.C1 ORDER BY T1.C1, T2.C2)" is the target query. The target query may be encapsulated as a view, denoted V1, with T1.C1 exposed as S1. The external query of the database query statement may then be expressed as:
SELECT * FROM V1 LEFT JOIN T3 ON V1.S1>T3.C1 LIMIT 15.
In some embodiments, before splitting, the database query statement may be examined to determine whether the basic condition for splitting (in other words, for pushing down the LIMIT clause) is satisfied. The basic condition may include, for example, that the execution result of the database query statement after the LIMIT clause is pushed down is the same as the execution result of the original database query statement. For example, it may be checked whether the conditions in the related condition clause allow the LIMIT clause to be pushed down. If the basic condition is satisfied, the database query statement is split to obtain an external query and a target query. If it is not satisfied, the database query statement is executed directly.
Continuing with the database query statement containing the outer join operation as an example, after the database query statement is split into the external query and the target query, the LIMIT clause may be pushed down to the target query. After the push-down, the database query statement can be rewritten as follows:
SELECT * FROM (T1 LEFT JOIN T2 ON T1.C1=T2.C1 WHERE T1.C1>T2.C1 ORDER BY T1.C1, T2.C2 LIMIT 15) LEFT JOIN T3 ON T1.C1>T3.C1 LIMIT 15.
With this rewriting, the target query is executed first, yielding at most 15 data records. These records are then left outer joined with the data set T3 to obtain the final query result. This reduces the amount of data involved in the join. In particular, when the data set T1 is very large, redundant data can be effectively reduced, which improves query efficiency and performance.
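The push-down can be tried on a small database. The following is a minimal sketch using Python's built-in sqlite3 module. The table contents, the simplified conditions (T1.C2 > 50 and equality joins on C1), and the column aliases S1 and S2 are assumptions chosen so that the example returns rows and has a deterministic answer; they are not the exact statement from the text above. The join keys of T2 and T3 are unique here, so the left outer joins do not multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (C1 INTEGER, C2 INTEGER);
CREATE TABLE T2 (C1 INTEGER PRIMARY KEY, C2 INTEGER);
CREATE TABLE T3 (C1 INTEGER PRIMARY KEY, C2 INTEGER);
""")
# Hypothetical data: T1 has 100 rows; T2 and T3 match only some of them.
conn.executemany("INSERT INTO T1 VALUES (?, ?)", [(i, i * 10) for i in range(1, 101)])
conn.executemany("INSERT INTO T2 VALUES (?, ?)", [(i, i + 1) for i in range(1, 101, 2)])
conn.executemany("INSERT INTO T3 VALUES (?, ?)", [(i, i + 2) for i in range(1, 101, 3)])

# Statement before splitting: both joins run before WHERE, ORDER BY and LIMIT.
original = conn.execute("""
    SELECT T1.C1, T2.C2, T3.C2
    FROM T1 LEFT JOIN T2 ON T1.C1 = T2.C1
            LEFT JOIN T3 ON T1.C1 = T3.C1
    WHERE T1.C2 > 50 ORDER BY T1.C1 LIMIT 15
""").fetchall()

# After splitting: the target query V1 carries WHERE, ORDER BY and LIMIT,
# and the external query joins its (at most 15) rows with T3.
rewritten = conn.execute("""
    SELECT V1.S1, V1.S2, T3.C2
    FROM (SELECT T1.C1 AS S1, T2.C2 AS S2
          FROM T1 LEFT JOIN T2 ON T1.C1 = T2.C1
          WHERE T1.C2 > 50 ORDER BY T1.C1 LIMIT 15) AS V1
         LEFT JOIN T3 ON V1.S1 = T3.C1
    LIMIT 15
""").fetchall()

# The external query has no ORDER BY, so the rows are compared as sets.
print(len(original), set(original) == set(rewritten))
```

On this data both statements return the same 15 rows, while the rewritten form joins only 15 rows with T3 instead of every row of T1.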
The embodiments of the present disclosure push down the LIMIT clause in the database query statement, which reduces the redundant data generated during execution of the query statement, improves query efficiency and performance, and can significantly reduce the execution time of database query statements containing a LIMIT clause.
In some embodiments, when the number of data sets in the database query statement is large, after the database query statement is split into the external query and the target query, a second target query may be further generated for the target query (also referred to as the first target query) and the third data set. The third data set may refer to a next data set to be subjected to a join operation, among the plurality of data sets of the database query statement, except for the target data set. After generating the second target query, the LIMIT clause may be pushed down into the second target query. It will be appreciated that the step of generating the second target query described above may be performed in a loop until only data sets remain outside the target query that cannot be further segmented, or other stopping conditions are met.
By pushing down the LIMIT clause after each join, the redundant data generated during execution of the database query statement can be further reduced, which further improves the execution efficiency of the database query statement.
In some embodiments, the join operation of the target data set may be a cross-join operation. The related condition clause may include a first condition and a second condition. It should be understood that the embodiments of the present disclosure refer to the first condition and the second condition only to make the scheme clearer. The relevant condition clause may include any number of conditions. For example, the related condition clause may include only the first condition. As another example, the related condition clause may include a first condition, a second condition, a third condition, and so on.
The first condition may be associated with a first data set of the target data sets. The second condition may be associated with a second data set of the target data sets. The first data set and the second data set may each include any number of data sets. The first set of data and the second set of data are orthogonal. The first data set and the second data set being orthogonal may for example mean that the first data set and the second data set comprise different data sets.
Taking the above example of a database query statement containing a cross-connect, the first condition is associated with data sets T1 and T3, and the second condition is associated with data set T2. The first data set may include data sets T1 and T3 and the second data set may include data set T2. The first data set and the second data set may be referred to as orthogonal data sets.
The target query may include a first sub-query and a second sub-query. The first sub-query may include a first set of data and a first condition. The second sub-query may include a second set of data and a second condition. The first sub-query and the second sub-query may include LIMIT clauses. That is, both the first sub-query and the second sub-query return only data sets of the amount of data defined by the LIMIT clause.
By respectively packaging the orthogonal first condition and the orthogonal second condition into the first sub-query and the second sub-query, the data volume generated in the cross connection process can be greatly reduced, so that the memory occupation of the database query statement in the execution process is reduced, and the execution efficiency and the execution performance of the database query statement are improved.
The following describes aspects of embodiments of the present disclosure in more detail, taking the outer join operation and the cross-join operation as examples, respectively. It should be noted that the following examples are merely intended to assist those skilled in the art in understanding the disclosed embodiments and are not intended to limit the disclosed embodiments to the specific values or specific contexts illustrated. It will be apparent to those skilled in the art from the examples set forth below that various equivalent modifications or variations can be made, and such modifications or variations fall within the scope of the embodiments of the present disclosure.
Example 1: outer join operation
Without loss of generality, define a database query statement G1 that includes n database tables joined by left outer joins. The set of database tables in G1 is denoted T = {t1, t2, ..., tn}, and each element of T is a base table. The left outer join conditions are COND = {COND_1, ..., COND_(n-1)}. Each join in G1 records the unique ids of the tables contained on its left side, denoted left_table_ids.
The database query statement G1 contains p WHERE conditions, denoted C = {c1, c2, ..., cp}, where C is the empty set when p is 0, and q ORDER BY conditions, denoted O = {o1, o2, ..., oq}, where O is the empty set when q is 0. The expressions in C and O may reference any number of database tables; the set of all tables referenced by C and O is denoted Tk, k ∈ {0, ..., n}.
The database query statement G1 may be expressed as follows, where "…" denotes the omitted intermediate table join operations:
SELECT * FROM t1 LEFT JOIN t2 ON COND_1 … LEFT JOIN tn ON COND_(n-1) [WHERE C_p] [ORDER BY O_q] LIMIT K.
The left outer joins are then traversed to find the first target table set Ti that satisfies the rewriting condition, where Ti includes i database tables. The left_table_ids of Ti must completely contain Tk, and Ti must contain the fewest tables among all left-side table sets of the left outer joins that completely contain Tk. Clearly, Ti is a non-empty subset of T and is unique. After Ti is found, all tables in Ti are treated as a whole and encapsulated into a view V, and C, O, and the LIMIT clause are then pushed down into view V.
After the above operations are performed, the database query statement G1 may be rewritten into the following form G1_1:
SELECT * FROM V LEFT JOIN t(i+1) ON COND_i ... LEFT JOIN tn ON COND_(n-1) LIMIT K.
where view V is:
SELECT * FROM t1 ... LEFT JOIN ti ON COND_(i-1) [WHERE C_p] [ORDER BY O_q] LIMIT K.
After the rewriting into G1_1, view V is left outer joined with the remaining tables, and the external query now has no WHERE condition or ORDER BY condition. Since view V already contains the LIMIT clause, view V and table t(i+1) can be treated as a whole and encapsulated as a view V', on which a further LIMIT push-down is performed, and so on, until the last table (i.e., table tn) is reached.
It should be understood that the WHERE condition C and the ORDER BY condition O of the database query statement G1 may both be empty. When both are empty, the target table set Ti is the leftmost table, i.e., table t1. When conditions C and O together reference all tables in T, rewriting is not required.
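The selection of the target table set and the repeated push-down described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the disclosure: the function names target_prefix_length and push_down_plan, the view names V1, V2, ..., and the representation of a query as an ordered list of table names are all assumptions made for the example.

```python
def target_prefix_length(tables, referenced):
    """Return the smallest i such that tables[:i] contains every table
    referenced by the WHERE and ORDER BY conditions (the set Tk)."""
    unknown = set(referenced) - set(tables)
    if unknown:
        raise ValueError(f"conditions reference unknown tables: {sorted(unknown)}")
    if not referenced:
        return 1  # C and O are empty: the target table set is the leftmost table
    return max(tables.index(t) for t in referenced) + 1


def push_down_plan(tables, referenced):
    """Return the views to build, innermost first, as (view_name, members).

    The first view wraps the target table set together with WHERE, ORDER BY
    and LIMIT. Each later view joins the previous view with the next table
    and carries LIMIT again. The last table is joined by the external query.
    An empty list means there is nothing to push the LIMIT clause below.
    """
    i = target_prefix_length(tables, referenced)
    if i == len(tables):
        return []  # the conditions span all tables: no rewriting
    views = [("V1", list(tables[:i]))]
    for table in tables[i:-1]:
        previous_view = views[-1][0]
        views.append((f"V{len(views) + 1}", [previous_view, table]))
    return views


# Four tables joined left to right; the conditions reference T1 and T2 only.
plan = push_down_plan(["T1", "T2", "T3", "T4"], {"T1", "T2"})
print(plan)  # [('V1', ['T1', 'T2']), ('V2', ['V1', 'T3'])]
```

In this plan V1 wraps T1 and T2 with the conditions and the LIMIT clause, V2 wraps V1 and T3 with the LIMIT clause, and the external query joins V2 with T4.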
The following describes the implementation when the join operation is a left outer join operation, using a specific example.
The database query statement Q1 includes 4 base tables (T1, T2, T3, and T4). The database query statement Q1 is as follows:
SELECT * FROM T1 LEFT JOIN T2 ON T1.C1 = T2.C1 LEFT JOIN T3 ON T1.C1 = T3.C1 LEFT JOIN T4 ON T1.C1 = T4.C1 WHERE T1.C1 > T2.C1 ORDER BY T1.C1, T2.C2 LIMIT 10.
Consider query Q1. Q1 contains the LIMIT operator and 4 tables T1, T2, T3 and T4, which are left outer joined in sequence from left to right. The WHERE condition C references two tables, T1 and T2. The ORDER BY condition O also references T1 and T2. By the above definition, the target table set of query Q1 is {T1, T2}, so a view V1 can be constructed for T1 and T2, and the condition sets C and O and the LIMIT clause are then encapsulated together into view V1, turning Q1 into Q1_1. Q1_1 may also be referred to as an external query.
Q1_1:
SELECT * FROM V1 LEFT JOIN T3 ON V1.C1 = T3.C1 LEFT JOIN T4 ON V1.C1 = T4.C1 LIMIT 10;
V1:
SELECT T1.C1 AS C1, T1.C2 AS C2, T2.C1 AS C3, T2.C2 AS C4 FROM T1 LEFT JOIN T2 ON T1.C1 = T2.C1 WHERE T1.C1 > T2.C1 ORDER BY T1.C1, T2.C2 LIMIT 10.
As can be seen, Q1_1 contains the LIMIT operator followed by 3 tables V1, T3 and T4, which are left outer joined. Since Q1_1 no longer contains a WHERE condition or an ORDER BY condition, T3 and all tables to its left (V1 in Q1_1) can be encapsulated into a view V2, further rewriting Q1 as Q1_2:
Q1_2:
SELECT * FROM V2 LEFT JOIN T4 ON V2.C1 = T4.C1 LIMIT 10.
V2:
SELECT * FROM V1 LEFT JOIN T3 ON V1.C1 = T3.C1 LIMIT 10.
When Q1 is executed, all rows of table T1 need to be scanned; each row is taken out in turn and matched against T2, and the join result is then joined with T3. However, the database query statement Q1 ultimately takes only 10 rows from the query result. If table T1 is very large, computing the joins incurs a large overhead. After rewriting to Q1_2, only the 10 rows that satisfy the conditions are kept after T1 is joined with T2, so only 10 rows take part in the join with T3, which greatly reduces the amount of data; the same holds for the join with T4. If table T1 contains fewer than 10 rows, Q1_2 still needs to scan T1 completely once, so its performance is consistent with that of Q1 and no execution speed is lost; if table T1 contains far more than 10 rows, the query speed can be significantly increased.
For a right outer join, only the right table needs to be considered; the situation is similar to the left outer join and is not repeated here.
Example 2: cross-join operation
In a cross-join scenario, multiple tables may appear simultaneously with different join relationships among them, so the join objects on both sides of the Cartesian product need to be determined.
Without loss of generality, consider a database query statement G2. G2 includes n tables, denoted T = {t1, t2, ..., tn}, and p WHERE conditions, denoted C = {c1, c2, ..., cp}, where C is the empty set when p is 0. Each WHERE condition ci ∈ C, i ∈ {1, ..., p}, connects r tables in T, r ∈ {0, ..., n}. G2 also includes q ORDER BY conditions, denoted O = {o1, o2, ..., oq}, where O is the empty set when q is 0. Each ORDER BY condition oj ∈ O, j ∈ {1, ..., q}, connects s tables in T, s ∈ {0, ..., n}.
According to whether the tables in T are connected by a WHERE condition ci or an ORDER BY condition oj, the tables in T can be divided into k clusters, denoted ST = {S1, ..., Sk}. Each element of ST is a non-empty subset of T. These subsets are pairwise orthogonal and, when k > 1, form a Cartesian product with one another. When k = 1, all tables in T are connected to each other, so there is no Cartesian product and the LIMIT clause cannot be pushed down; when k = n, each table in T forms a Cartesian product with the other tables, and the LIMIT clause may be pushed down to each table.
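The division of T into clusters can be sketched with a union-find structure, in which tables referenced by the same condition end up in the same cluster. This is a minimal illustration in Python under stated assumptions: the function name cluster_tables is hypothetical, and each WHERE or ORDER BY condition is represented simply by the list of table names it references.

```python
def cluster_tables(tables, conditions):
    """Group tables into clusters: tables referenced by the same WHERE or
    ORDER BY condition are connected and therefore share a cluster."""
    parent = {t: t for t in tables}

    def find(t):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for referenced in conditions:
        referenced = list(referenced)
        for other in referenced[1:]:
            # Merge the cluster of every further table into that of the first one.
            parent[find(other)] = find(referenced[0])

    clusters = {}
    for t in tables:
        clusters.setdefault(find(t), []).append(t)
    return list(clusters.values())


# Three tables; one condition connects T1 and T3, the others touch one table each.
clusters = cluster_tables(["T1", "T2", "T3"], [["T1", "T3"], ["T2"], ["T3"]])
can_push_down = len(clusters) > 1  # k > 1: a Cartesian product exists
print(clusters, can_push_down)  # [['T1', 'T3'], ['T2']] True
```

With k = 2 clusters, the LIMIT clause can be pushed down into each cluster separately; with a single cluster there is no Cartesian product to exploit.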
The database query statement G2 may be expressed as follows, where "..." denotes the omitted intermediate tables:
G2:
SELECT * FROM t1, t2, ... ,tn [WHERE C_p] [ORDER BY O_q] LIMIT K.
The following describes the implementation when the join operation is a cross-join operation, using a specific example. The database query statement is as follows:
Q2:
SELECT * FROM T1, T2, T3 WHERE T1.C1 > T3.C1 AND T2.C1 > 0 ORDER BY T3.C2 LIMIT 10.
The conditions of Q2 can be denoted C = {c1, c2}, where c1: T1.C1 > T3.C1 and c2: T2.C1 > 0. It can be seen that c1 relates tables T1 and T3, while c2 relates only to table T2. As described above, a cross join with a condition can be regarded as an inner join. Thus, tables T1 and T3 together with condition c1 constitute an inner join.
At this time, the set T = {T1, T2, T3} of database tables may be divided into 2 clusters, written as ST = {S1, S2}, where S1 = {T1, T3} and S2 = {T2}. The tables in S1 and the table in S2 form a Cartesian product, and S1 and S2 may also be referred to as orthogonal sets. The LIMIT clause may therefore be pushed down into S1 and S2, respectively. Since S1 contains more than one table, it needs to be treated as a whole. In some embodiments, S1 may be encapsulated into a view S1, and Q2 may be rewritten as Q2_1.
Q2_1:
SELECT * FROM S1, T2 WHERE S1.VC1 > S1.VC3 AND T2.C1 > 0 ORDER BY S1.VC4 LIMIT 10.
S1:
SELECT T1.C1 AS vc1, T1.C2 AS vc2, T3.C1 AS vc3, T3.C2 AS vc4 FROM T1, T3.
In Q2_1, T1 and T3 are encapsulated into view S1. At this point, S1 can be regarded as an ordinary table that is cross joined with T2 to obtain the Cartesian product. The LIMIT clause may then be pushed down to S1 and T2, resulting in Q2_2.
Q2_2:
SELECT * FROM V2, V3 LIMIT 10.
V2:
SELECT * FROM S1 WHERE S1.VC1 > S1.VC3 ORDER BY S1.VC4 LIMIT 10.
V3:
SELECT T2.C1, T2.C2 FROM T2 WHERE T2.C1 > 0 LIMIT 10.
Q2_2 encapsulates S1 and T2 into V2 and V3, respectively, with V2 and V3 containing the corresponding WHERE condition and LIMIT clause.
Comparing Q2 before rewriting with Q2_2 after rewriting: before rewriting, Q2 needs to scan T1, T2 and T3 completely. When the tables contain a large amount of data, the Cartesian product greatly increases the number of joined rows, which slows the query and affects actual service requirements. After rewriting, only 10 rows of table T2 need to be fetched, and only 10 rows need to be kept after T1 and T3 are inner joined. Therefore, the Cartesian product has at most 10 × 10 rows, which greatly reduces the amount of computation and can effectively improve execution speed.
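The reduction in intermediate rows can be observed on a small database. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents are assumptions made only for this illustration, and the views V2 and V3 are written as inline subqueries. The sketch compares the size of the Cartesian product before and after the push-down and checks that the rewritten rows satisfy the original conditions. It does not compare the two results row for row, because rows that tie on the sort key may be returned in any order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (C1 INTEGER, C2 INTEGER);
CREATE TABLE T2 (C1 INTEGER, C2 INTEGER);
CREATE TABLE T3 (C1 INTEGER, C2 INTEGER);
""")
# Hypothetical data: 50, 40 and 30 rows respectively.
conn.executemany("INSERT INTO T1 VALUES (?, ?)", [(i, i) for i in range(1, 51)])
conn.executemany("INSERT INTO T2 VALUES (?, ?)", [(i, i) for i in range(-5, 35)])
conn.executemany("INSERT INTO T3 VALUES (?, ?)", [(i, 100 - i) for i in range(1, 31)])

# V2: the cluster {T1, T3} with its condition, ORDER BY and LIMIT.
V2 = """(SELECT T1.C1 AS VC1, T1.C2 AS VC2, T3.C1 AS VC3, T3.C2 AS VC4
         FROM T1, T3 WHERE T1.C1 > T3.C1 ORDER BY T3.C2 LIMIT 10) AS V2"""
# V3: the cluster {T2} with its condition and LIMIT.
V3 = "(SELECT T2.C1 AS C1, T2.C2 AS C2 FROM T2 WHERE T2.C1 > 0 LIMIT 10) AS V3"

# Rows that qualify in the Cartesian product before the push-down.
full_rows = conn.execute(
    "SELECT COUNT(*) FROM T1, T2, T3 WHERE T1.C1 > T3.C1 AND T2.C1 > 0"
).fetchone()[0]

# Rows in the Cartesian product after the push-down (at most 10 x 10).
pushed_rows = conn.execute(f"SELECT COUNT(*) FROM {V2}, {V3}").fetchone()[0]

# The rewritten statement itself, in the shape of Q2_2.
result = conn.execute(f"SELECT * FROM {V2}, {V3} LIMIT 10").fetchall()

print(full_rows, pushed_rows, len(result))  # 35190 100 10
```

On this data the product shrinks from 35190 qualifying rows to 100 before the final LIMIT is applied.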
It will be understood that pushing down the LIMIT clause depends primarily on the data sets referenced in the related condition clause. When different join operations are used among the plurality of data sets, the LIMIT clause can be pushed down independently following the join order. Therefore, the scheme provided by the present disclosure is also applicable to scenarios in which multiple kinds of join operations are mixed, for example, a scenario in which outer join operations and cross-join operations are mixed. The scheme can be extended to more complex service scenarios and is therefore widely applicable in real production environments.
Method embodiments of the present disclosure are described in detail above in conjunction with fig. 1, and apparatus embodiments of the present disclosure are described in detail below in conjunction with fig. 2-3. It is to be understood that the description of the method embodiments corresponds to the description of the apparatus embodiments, and therefore reference may be made to the preceding method embodiments for parts not described in detail.
FIG. 2 is a schematic diagram of an apparatus for processing a database query statement according to an embodiment of the present disclosure. The apparatus 200 of fig. 2 may comprise a receiving unit 210 and a processing unit 220.
A receiving unit 210, configured to receive a database query statement, where the database query statement includes a LIMIT clause, and a connection operation of a first data set and a second data set;
a processing unit 220, configured to split the database query statement into a target query and an external query of the target query, where the target query includes the first data set and the LIMIT clause, and the external query includes the join operation of the query result of the target query and the second data set.
In one possible implementation, the database query statement further includes a first related conditional clause for the first data set, and the target query further includes the first related conditional clause.
In one possible implementation, the connecting operation includes externally connecting the second set of data to the first set of data based on the first set of data.
In one possible implementation, the connection operation comprises a cross-connect operation, and the external query further comprises the LIMIT clause.
In one possible implementation, the target query is encapsulated in a view.
In one possible implementation, the first related condition clause includes a WHERE clause and/or an ORDER BY clause.
In an alternative embodiment, the receiving unit 210 may be an input/output interface 330, the processing unit 220 may be a processor 320, and the apparatus may further include a memory 310, as shown in fig. 3.
Fig. 3 is a schematic block diagram of an apparatus according to another embodiment of the present disclosure. The apparatus 300 shown in fig. 3 may include a memory 310, a processor 320, and an input/output interface 330. The memory 310, the processor 320, and the input/output interface 330 are connected via an internal connection path. The memory 310 is used for storing instructions, and the processor 320 is used for executing the instructions stored in the memory 310, so as to control the input/output interface 330 to receive input data and information, output data such as operation results, and control the transceiver 340 to transmit signals.
It should be understood that, in the embodiment of the present disclosure, the processor 320 may adopt a general-purpose Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute a relevant program to implement the technical solutions provided in the embodiments of the present disclosure.
The memory 310 may include a read-only memory and a random access memory, and provides instructions and data to the processor 320. A portion of the memory 310 may also include non-volatile random access memory. For example, the memory 310 may also store information on the device type.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 320 or by instructions in the form of software. The method for processing a database query statement disclosed in connection with the embodiments of the present disclosure may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 310, and the processor 320 reads the information in the memory 310 and completes the steps of the method in combination with its hardware. To avoid repetition, details are not described here.
In an alternative embodiment, the apparatus shown in fig. 3 may be a database, such as a native distributed database. The native distributed database may be an autonomously developed distributed database, which is not obtained by performing secondary development or encapsulation on an existing distributed database. It should be noted that the solution of the embodiment of the present disclosure may also be applied to other databases, and the embodiment of the present disclosure does not limit this.
It should be understood that, in the embodiments of the present disclosure, the processor may be a Central Processing Unit (CPU), and the processor may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor or any conventional processor or the like.
It should be understood that in the embodiments of the present disclosure, "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
It should be understood that, in various embodiments of the present disclosure, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the embodiments of the disclosure are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer readable storage medium may be any usable medium that can be read by a computer, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Versatile Disk (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present disclosure, and all the changes or substitutions should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

1. A method of processing a database query statement, comprising:
receiving a database query statement, wherein the database query statement comprises a LIMIT clause, a connection operation of a first data set and a second data set, and the LIMIT clause is used for limiting the data volume of a query result of the database query statement;
splitting the database query statement into a target query and an external query of the target query, the target query comprising the first data set and the LIMIT clause, the external query comprising the join operation of the query result of the target query and the second data set, the query result of the target query being limited by the LIMIT clause in data volume.
2. The method of claim 1, the database query statement further comprising a first related conditional clause for the first set of data, the target query further comprising the first related conditional clause.
3. The method of claim 2, the connecting operation comprising externally connecting the second set of data to the first set of data based on the first set of data.
4. The method of claim 2, the connection operation comprising a cross-connect operation, the external query further comprising the LIMIT clause.
5. The method of any of claims 1-4, the target query being encapsulated in a view.
6. The method of claim 2, the first related condition clause comprising a WHERE clause and/or an ORDER BY clause.
7. An apparatus for processing a database query statement, comprising:
a receiving unit, configured to receive a database query statement, where the database query statement includes a LIMIT clause, and a join operation of a first data set and a second data set, and the LIMIT clause is used to limit a data amount of a query result of the database query statement;
a processing unit, configured to split the database query statement into a target query and an external query of the target query, where the target query includes the first data set and the LIMIT clause, the external query includes the join operation of a query result of the target query and the second data set, and the query result of the target query is limited by the LIMIT clause in data size.
8. The apparatus of claim 7, the database query statement further comprising a first related conditional clause for the first data set, the target query further comprising the first related conditional clause.
9. The apparatus of claim 8, the join operation comprising outer-joining the second data set to the first data set, with the first data set as the base.
10. The apparatus of claim 8, the join operation comprising a cross join operation, the external query further comprising the LIMIT clause.
11. The apparatus of any of claims 7-10, the target query being encapsulated in a view.
12. The apparatus of claim 8, the first related conditional clause comprising a WHERE clause and/or an ORDER BY clause.
13. An apparatus for processing a database query statement, comprising a memory having stored therein executable code and a processor configured to execute the executable code to cause the apparatus to implement the method of any one of claims 1-6.
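
The split recited in the claims can be illustrated with a minimal sketch. It is not the patented implementation: it only hand-writes one original statement and one split form and checks that they return the same rows. The tables `t1` and `t2`, their columns, the alias `v`, and the use of SQLite are all assumptions made for illustration. The sketch also assumes that the join key of `t2` is unique, so that each row of the first data set yields exactly one row of the outer join.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, b TEXT);  -- unique join key (assumption)
    INSERT INTO t1 VALUES (1, 30), (2, 10), (3, 20), (4, -5);
    INSERT INTO t2 VALUES (1, 'z'), (3, 'y');
""")

# Original statement: outer join of t1 (first data set) with t2 (second
# data set), plus WHERE / ORDER BY on t1 and a LIMIT on the final result.
original = """
    SELECT t1.id, t1.a, t2.b
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    WHERE t1.a > 0
    ORDER BY t1.a
    LIMIT 2
"""

# Split form: the inner (target) query carries t1, its WHERE / ORDER BY
# clauses and the LIMIT; the outer (external) query joins the limited
# result with t2.
rewritten = """
    SELECT v.id, v.a, t2.b
    FROM (SELECT id, a FROM t1 WHERE a > 0 ORDER BY a LIMIT 2) AS v
    LEFT JOIN t2 ON v.id = t2.id
"""

# Compare as sorted lists, since the outer query here fixes no row order.
original_rows = sorted(conn.execute(original).fetchall())
rewritten_rows = sorted(conn.execute(rewritten).fetchall())
print(original_rows == rewritten_rows)
```

In the split form the join only has to process the two rows that survive the LIMIT, rather than every row of `t1` that passes the WHERE clause.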

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210707889.1A CN114780554B (en) 2022-06-22 2022-06-22 Method and device for processing database query statement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210707889.1A CN114780554B (en) 2022-06-22 2022-06-22 Method and device for processing database query statement

Publications (2)

Publication Number Publication Date
CN114780554A (en) 2022-07-22
CN114780554B (en) 2023-04-18

Family

ID=82422014

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202210707889.1A Active CN114780554B (en) 2022-06-22 2022-06-22 Method and device for processing database query statement

Country Status (1)

Country Link
CN (1) CN114780554B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8868545B2 (en) * 2011-12-29 2014-10-21 Teradata Us, Inc. Techniques for optimizing outer joins
US10289723B1 (en) * 2014-08-21 2019-05-14 Amazon Technologies, Inc. Distributed union all queries
CN106528256B (en) * 2016-10-20 2019-06-21 国云科技股份有限公司 A kind of entity BEAN general polling method based on Java EJB platform
CN109947804B (en) * 2019-03-20 2021-04-06 上海达梦数据库有限公司 Data set query optimization method and device, server and storage medium
CN112199390B (en) * 2020-09-30 2023-05-30 上海达梦数据库有限公司 Data query method, device, equipment and storage medium in database
CN113377808A (en) * 2021-06-03 2021-09-10 北京沃东天骏信息技术有限公司 SQL optimization method and device
CN114265874B (en) * 2022-03-02 2022-05-03 北京奥星贝斯科技有限公司 Method and device for querying data

Also Published As

Publication number Publication date
CN114780554A (en) 2022-07-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant