WO2024082881A2

WO2024082881A2 - Database query method and apparatus

Info

Publication number: WO2024082881A2
Application number: PCT/CN2023/118705
Authority: WO
Inventors: 王国平; 朱涛; 赵占越
Original assignee: 北京奥星贝斯科技有限公司
Priority date: 2022-10-20
Filing date: 2023-09-14
Publication date: 2024-04-25
Also published as: CN115563148A

Abstract

Provided are a database query method and apparatus. The database query method comprises: receiving a first database query statement, wherein the first database query statement comprises a first query branch and a second query branch, and the first query branch and the second query branch are joined by means of a predicate OR; expanding the predicate OR in the first database query statement, so as to obtain a second database query statement, wherein the second database query statement comprises a third database query statement and a fourth database query statement, the third database query statement is used for executing the first query branch, the fourth database query statement is used for executing the second query branch, and the third database query statement and the fourth database query statement are connected on the basis of UNION DISTINCT; and querying data in a database according to the second database query statement.

Description

Database query method and device

Technical Field

The present disclosure relates to the field of databases, and more specifically, to a database query method and device.

Background Art

A database query statement may contain an OR predicate. In order to improve the execution performance of the database query statement, the database query statement containing the OR predicate is usually rewritten to expand the OR predicate.

The related technology expands the OR predicate based on Union All. However, in certain scenarios, after the OR predicate is expanded based on Union All, the execution performance of the database query statement is still poor.

Summary of the invention

The present disclosure provides a database query method and device to improve the execution performance of database query statements.

In a first aspect, a database query method is provided, comprising: receiving a first database query statement, the first database query statement comprising a first query branch and a second query branch, the first query branch and the second query branch being connected by an OR predicate; expanding the OR predicate in the first database query statement to obtain a second database query statement, the second database query statement comprising a third database query statement and a fourth database query statement, the third database query statement being used to execute the first query branch, the fourth database query statement being used to execute the second query branch, and the third database query statement and the fourth database query statement being connected based on UNION DISTINCT; and querying data in a database according to the second database query statement.

Optionally, as a possible implementation manner, before expanding the OR predicate in the first database query statement, the method further includes: checking whether the query branches in the first database query statement can output a unique column set; and expanding the OR predicate in the first database query statement includes: if all query branches in the first database query statement can output a unique column set, expanding the OR predicate in the first database query statement.

Optionally, as a possible implementation manner, before expanding the OR predicate in the first database query statement, the method further includes: performing SPJ separation on the query branches in the first database query statement.

Optionally, as a possible implementation manner, the predicate in the first query branch and/or the second query branch is a subquery predicate.

Optionally, as a possible implementation manner, the database query statement is a SQL statement.

In a second aspect, a database query device is provided, including: a receiving module, used to receive a first database query statement, the first database query statement includes a first query branch and a second query branch, the first query branch and the second query branch are connected by an OR predicate; an expansion module, used to expand the OR predicate in the first database query statement to obtain a second database query statement, the second database query statement includes a third database query statement and a fourth database query statement, the third database query statement is used to execute the first query branch, the fourth database query statement is used to execute the second query branch, and the third database query statement and the fourth database query statement are connected based on UNION DISTINCT; a query module, used to query data in a database according to the second database query statement.

Optionally, as a possible implementation method, the device also includes: a checking module, used to check whether the query branches in the first database query statement can output a unique column set before expanding the OR predicate in the first database query statement; the expansion module is used to: if the query branches in the first database query statement can all output a unique column set, then expand the OR predicate in the first database query statement.

Optionally, as a possible implementation manner, the device further includes: a separation module, configured to perform SPJ separation on query branches in the first database query statement before expanding the OR predicate in the first database query statement.

Optionally, as a possible implementation manner, the predicate in the first query branch and/or the second query branch is a sub-query predicate.

According to a third aspect, a database query device is provided, comprising: a memory for storing instructions; and a processor for executing the instructions stored in the memory to execute the method as described in the first aspect or any possible implementation method of the first aspect.

According to a fourth aspect, a computer-readable storage medium is provided, on which instructions for executing the method described in the first aspect or any possible implementation of the first aspect are stored.

According to a fifth aspect, a computer program product is provided, comprising instructions for executing the method described in the first aspect or any possible implementation manner of the first aspect.

In the related technology, when the OR predicate is expanded based on Union All, the LLNVL predicate must be used to perform deduplication (i.e., to remove duplicate data from the query results of the multiple query branches of the OR predicate). If a query branch connected by the OR predicate contains a complex predicate (such as a subquery predicate), the LLNVL predicate will contain a subquery (i.e., the parameter of the LLNVL predicate is a subquery). Subqueries are usually inefficient to execute, so it is generally desirable to convert a subquery and its outer query into a join of two database tables so that the subquery need not be executed on its own; this conversion is called subquery promotion (also called subquery lifting). Not all predicates support subquery promotion, and the LLNVL predicate does not. As a result, the execution efficiency of the database query statement after OR predicate expansion is reduced. The embodiments of the present disclosure implement OR predicate expansion based on UNION DISTINCT. Since UNION DISTINCT itself has a deduplication function, there is no need to introduce the LLNVL predicate, which avoids the above problem. In other words, even if the query branches connected by the OR predicate contain complex predicates (such as subquery predicates), the embodiments of the present disclosure can still improve the execution efficiency of the database query statement.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure or the background technology, the drawings required for use in the embodiments of the present disclosure or the background technology will be described below.

FIG. 1 is a schematic flowchart of a database query method provided by an embodiment of the present disclosure.

FIG. 2 is a schematic flowchart of a database query method provided by another embodiment of the present disclosure.

FIG. 3 is a schematic diagram of the structure of a database query device provided by an embodiment of the present disclosure.

FIG. 4 is a schematic diagram of the structure of a database query device provided by another embodiment of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present disclosure are clearly and completely described below in conjunction with the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only part of the embodiments of the present disclosure, rather than all the embodiments.

With the development of technology, database systems (such as the OceanBase database) are being applied more and more widely. Each type of database system supports its own database query statements, so that users can easily access the database system (for example, to query, add, or delete data in it). The database query language mentioned here can be, for example, structured query language (SQL).

The execution performance of a database query statement may differ depending on how the statement is expressed. Therefore, in order to improve the execution performance of database query statements, the database system can perform query rewriting on the received database query statements. Query rewriting of a database query statement is usually an equivalence rewriting, that is, the rewriting does not change the query results of the database query statement.

OR predicate expansion is a common query rewriting method. Taking a SELECT statement as an example of the database query statement, OR predicate expansion refers to splitting a SELECT statement with an OR predicate into multiple SELECT statements, each of which executes one query branch of the OR predicate. Because the OR predicate is distributed across the individual SELECT statements, the split SELECT statements may open up more optimization opportunities (that is, they may be executed with an execution algorithm that has better execution performance), thereby improving the query performance of the database query statement.

As an example, first create two tables for the database system:

create table t1(c1 int primary key,c2 int,c3 int,c4 int);

create table t2(c1 int primary key,c2 int,c3 int,c4 int);

Then, receive the original database query statement Q1:

select * from t1,t2 where t1.c2 = t2.c2 or t1.c3 = t2.c3;

In the database query statement Q1, "t1.c2 = t2.c2" and "t1.c3 = t2.c3" are two conditions, which are connected by the OR predicate. For the connection operation based on the OR predicate, if the database query statement Q1 is not rewritten, the only way to execute the database statement Q1 is to use the nested loop join (NESTED LOOP JOIN) algorithm.

Nested loop join is a general but inefficient join algorithm. It consists of two nested FOR loops, hence the name. Assuming that the two tables to be joined are T1 and T2 and the join condition is P, the nested loop join takes one table as the outer loop and the other as the inner loop, and compares each pair of tuples from T1 and T2 one by one to find all pairs that satisfy P. Nested loop join is quite general: like a linear scan, it does not require indexes and is applicable to any join condition, so any type of join operation can be performed with only slight adjustments to the algorithm. However, its execution performance is relatively poor. Because every tuple in T1 must be compared with every tuple in T2, when the data is too large to fit entirely in memory, data is exchanged between disk and memory frequently. Even if the data fits entirely in memory, the CPU cache hit rate is low during execution of the nested loop join, which seriously affects system efficiency.
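The two-loop structure described above can be sketched in a few lines of Python. This is only an illustration: the function name `nested_loop_join`, the tuple layout (c1, c2, c3) and the sample rows are assumptions made for the example, not part of the disclosure.

```python
def nested_loop_join(outer, inner, predicate):
    """Compare every outer tuple with every inner tuple; keep matching pairs."""
    result = []
    for o in outer:          # outer loop over one table
        for i in inner:      # inner loop over the other table
            if predicate(o, i):
                result.append(o + i)
    return result

# Rows are (c1, c2, c3) tuples; hypothetical sample data.
t1 = [(1, 10, 100), (2, 20, 200)]
t2 = [(7, 10, 999), (8, 55, 200), (9, 10, 100)]

# The OR join condition of Q1: t1.c2 = t2.c2 OR t1.c3 = t2.c3
q1_condition = lambda a, b: a[1] == b[1] or a[2] == b[2]
rows = nested_loop_join(t1, t2, q1_condition)

# Every pair is compared, so the work grows with len(t1) * len(t2).
comparisons = len(t1) * len(t2)
```

Because the predicate is an arbitrary function, the same loop handles the OR condition of Q1 unchanged, which is the generality the text refers to; the price is that all pairs are examined.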

In order to avoid using nested loop joins, the database query statement Q1 can be rewritten, that is, the OR predicate in the database query statement Q1 can be expanded. For example, the database query statement Q1 can be rewritten as the following database query statement Q2:

select * from t1,t2 where t1.c2 = t2.c2

union all

select * from t1,t2 where t1.c3 = t2.c3 and llnvl(t1.c2 = t2.c2);

For the database query statement Q2, the OR predicate is split into multiple simple SELECT statements. Besides the nested loop join, each SELECT statement can also be executed with the more efficient merge join (MERGE JOIN) or hash join (HASH JOIN), thereby improving the execution performance of the database query statement.

The merge join algorithm, also called sort-merge join, can be used to compute natural joins and equi-joins. Assuming that the two tables to be joined are T1 and T2, both tables are sorted on the join attributes before joining, and the join is then completed by scanning the two sorted tables. Merge joins are very efficient to execute: the time complexity of the scan is linear, O(n), where n is the number of tuples in the larger of T1 and T2.

The hash join is similar to the merge join, except that a hash function is used to partition the two tables. The basic idea is to divide the tuples of the two tables into sets with the same hash value on the join attributes. Hash joins do not require indexes and, compared with nested loop joins, handle large result sets more easily.
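As a rough illustration of why a single equality condition is easier to execute than an OR of two, here is a minimal in-memory hash join in Python. The function name `hash_join`, the column positions and the sample rows are assumptions for the sketch, and NULL keys are deliberately not modelled.

```python
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    """Equi-join: build a hash table on `left`, then probe it with `right`."""
    buckets = defaultdict(list)
    for l in left:                        # build phase
        buckets[l[left_key]].append(l)
    result = []
    for r in right:                       # probe phase
        for l in buckets.get(r[right_key], []):
            result.append(l + r)
    return result

# Rows are (c1, c2, c3) tuples; hypothetical sample data without NULLs.
t1 = [(1, 10, 100), (2, 20, 200)]
t2 = [(7, 10, 999), (8, 55, 200), (9, 10, 100)]

branch_c2 = hash_join(t1, t2, 1, 1)   # branch t1.c2 = t2.c2
branch_c3 = hash_join(t1, t2, 2, 2)   # branch t1.c3 = t2.c3

# Pairs that satisfy both branches appear in both results.
in_both = set(branch_c2) & set(branch_c3)
```

Each branch touches every input row only once, but a pair that satisfies both conditions shows up in both branch results, so the combined result still needs deduplication.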

At present, database systems that support OR predicate expansion all perform it in the "Union All"-based form. This form of OR predicate expansion is very inefficient for some OR predicates containing complex predicates (such as subquery predicates), which may cancel out the efficiency advantage brought by OR predicate expansion. The cause of this problem is analyzed below.

The "Union All"-based OR predicate expansion method uses the "Union All" operator to combine multiple database query statements (each of which executes one query branch of the OR predicate) and computes the union of the query results of those statements. If the query results of the multiple database query statements contain duplicate data, the "Union All" operator returns all of the duplicates.

In order to avoid duplicate data generated by the multiple database query statements, the "Union All"-based OR predicate expansion method introduces the LLNVL predicate (such as "llnvl(t1.c2 = t2.c2)" in the database query statement Q2 mentioned above) to remove duplicate data. The introduction of the LLNVL predicate leads to suboptimal performance of database query statements in certain scenarios, for example, a scenario where a query branch of the OR predicate contains a complex predicate such as a subquery predicate (that is, a predicate whose parameter contains a subquery).
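The deduplicating role of the guard can be checked on a small data set. The sketch below runs Q1 and a Q2-style rewrite in SQLite through Python's `sqlite3` module. SQLite has no LLNVL function, so the guard is emulated as `(cond) is not 1`, which holds when the condition is false or NULL. That emulation is an assumption about LLNVL's semantics (the predicate appears to correspond to the LNNVL function of Oracle-compatible systems), and the sample rows are invented for the example.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1(c1 int primary key, c2 int, c3 int, c4 int);
create table t2(c1 int primary key, c2 int, c3 int, c4 int);
insert into t1 values (1, 10, 100, 0), (2, 20, 200, 0), (3, null, 300, 0);
insert into t2 values (7, 10, 999, 0), (8, 55, 200, 0), (9, 10, 100, 0), (6, null, 300, 0);
""")

q1 = "select * from t1, t2 where t1.c2 = t2.c2 or t1.c3 = t2.c3"

# Naive UNION ALL: a row matching both branches is returned twice.
naive = """select * from t1, t2 where t1.c2 = t2.c2
           union all
           select * from t1, t2 where t1.c3 = t2.c3"""

# Q2-style rewrite; the guard is emulated (assumption, see lead-in).
q2 = """select * from t1, t2 where t1.c2 = t2.c2
        union all
        select * from t1, t2 where t1.c3 = t2.c3 and (t1.c2 = t2.c2) is not 1"""

# A plain NOT is not enough: NOT NULL is NULL, so a valid row is lost.
q2_not = """select * from t1, t2 where t1.c2 = t2.c2
            union all
            select * from t1, t2 where t1.c3 = t2.c3 and not (t1.c2 = t2.c2)"""

# Compare results as multisets so that duplicates are visible.
run = lambda sql: Counter(conn.execute(sql).fetchall())
same_as_q1 = run(q2) == run(q1)
```

On this data the naive form returns one row twice, the guarded form matches Q1 exactly, and the plain NOT variant drops the row whose c2 values are NULL, which is why a NULL-aware guard is needed in the second branch.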

As an example, the database query statement Q3 below contains an OR predicate, which contains two query branches, and each query branch contains a complex predicate (that is, the EXISTS predicate with a select subquery as a parameter below):

After performing OR predicate expansion on database query statement Q3, the following database query statement Q4 is obtained:

Although the predicate EXISTS of the first SELECT statement in the database query statement Q4 contains a subquery, it can be converted into a join (JOIN) through subquery lifting (subquery lifting is also a query rewriting method, which aims to express a subquery in the form of multi-table joins and merge it into the main query to improve query performance), so that more efficient join algorithms can be used to optimize query performance. Similarly, the EXISTS of the second SELECT statement in the database query statement Q4 can also be converted into a join (JOIN) through subquery lifting, so that more efficient join algorithms can be used to optimize query performance. However, currently, none of the database systems support the lifting operation of subqueries in LLNVL. Therefore, for the predicate LLNVL, it can only be executed in a manner similar to Nested Loop Join, and the execution performance is low. It can be seen that when the query branch of the OR predicate contains complex predicates (such as subquery predicates), even if the OR predicate is expanded, the query performance of the database query statement cannot be greatly improved.

In view of the above problems, the embodiments of the present disclosure propose an OR predicate expansion method based on "UNION DISTINCT". Since "UNION DISTINCT" itself deduplicates data, there is no need to use the LLNVL predicate, which avoids the calculation of such complex predicates and thereby improves the execution performance of database query statements.

Still taking the database query statement Q3 mentioned above as an example, the OR predicate in Q3 can be expanded based on "UNION DISTINCT" to convert the database query statement Q3 into the following database query statement Q5:

By comparing database query statements Q4 and Q5, it can be seen that database query statement Q5 only contains the EXISTS predicate that can perform subquery lifting, and does not contain the LLNVL predicate that cannot perform subquery lifting. Therefore, database query statement Q5 does not need to execute Nested Loop Join, and thus has higher execution performance.
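The statements Q3 and Q5 themselves are not reproduced in this text. As a hedged illustration only, the sketch below builds stand-ins from the two EXISTS branches quoted later in the description and checks in SQLite that the OR form and the UNION form return the same rows. The select list, the table t3 definition and the sample data are assumptions, and SQLite spells UNION DISTINCT simply as `union`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1(c1 int primary key, c2 int, c3 int, c4 int);
create table t2(c1 int primary key, c2 int, c3 int, c4 int);
create table t3(c1 int primary key, c2 int, c3 int, c4 int);
insert into t1 values (1, 10, 100, 0), (2, 20, 200, 0), (3, 30, 300, 0), (4, 40, 400, 0);
insert into t2 values (1, 10, 0, 0), (2, 20, 0, 0);
insert into t3 values (1, 0, 100, 0), (2, 0, 300, 0);
""")

# Assumed shape of Q3: two EXISTS branches joined by OR.
q3_like = """
select * from t1
where exists(select 1 from t2 where t2.c2 = t1.c2)
   or exists(select 1 from t3 where t3.c3 = t1.c3)"""

# Assumed shape of Q5: one SELECT per branch; UNION removes duplicates.
q5_like = """
select * from t1 where exists(select 1 from t2 where t2.c2 = t1.c2)
union
select * from t1 where exists(select 1 from t3 where t3.c3 = t1.c3)"""

rows_q3 = sorted(conn.execute(q3_like).fetchall())
rows_q5 = sorted(conn.execute(q5_like).fetchall())
```

The row with c1 = 1 satisfies both branches yet appears once in the UNION result, without any guard predicate wrapped around a subquery. Since `select *` includes the primary key c1, deduplication here cannot merge distinct rows.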

The following is a more detailed example of the database query method provided by the embodiment of the present disclosure in conjunction with Figure 1. The method 100 of Figure 1 can be executed by a database system, specifically, by an optimizer in the database system. The embodiment of the present disclosure does not specifically limit the type of the database system, for example, it can be a distributed database.

Referring to FIG. 1, in step S110, a first database query statement is received. The first database query statement may be an original database query statement, such as a database query statement input by a user of a database system, or a database query statement obtained by rewriting the original database query statement. The first database query statement may be, for example, an SQL statement.

The first database query statement may include a first query branch and a second query branch, and the first query branch and the second query branch are connected by an OR predicate. Therefore, in order to improve the execution performance of the first database query statement, it may be considered to perform an OR predicate expansion on the first database query statement.

The predicate in the first query branch can be a complex predicate. For example, the predicate in the first query branch can be a subquery predicate, that is, the parameter of the predicate includes a subquery. Taking the first database query statement as Q3 mentioned above as an example, the first query branch of the first database query statement includes a subquery predicate, that is, exists(select 1 from t2 where t2.c2 = t1.c2).

The predicate in the second query branch can be a complex predicate. For example, the predicate in the second query branch can be a subquery predicate, that is, the parameter of the predicate includes a subquery. Taking the first database query statement as Q3 mentioned above as an example, the second query branch of the first database query statement includes a subquery predicate, that is, exists(select 1 from t3 where t3.c3 = t1.c3).

In step S120, the OR predicate in the first database query statement is expanded to obtain a second database query statement. The second database query statement may be equivalent to the first database query statement, that is, the first database query statement and the second database query statement may correspond to the same query result.

The second database query statement may include a third database query statement and a fourth database query statement. The third database query statement may be used to execute the first query branch of the OR predicate. The fourth database query statement may be used to execute the second query branch of the OR predicate. The third database query statement and the fourth database query statement are connected based on UNION DISTINCT. That is, the UNION DISTINCT operator can be used to calculate the union of the query results of the third database query statement and the fourth database query statement. Since the UNION DISTINCT operator itself has a deduplication function, there is no need to add a predicate specifically for deduplication, such as the LLNVL predicate mentioned above.

The query results of the third database query statement and the fourth database query statement may include a unique column set (or, the third database query statement and the fourth database query statement both project a unique column set). The so-called unique column set refers to a column set that can uniquely identify the rows queried by the third database query statement and the fourth database query statement. Taking the example that the third database query statement and the fourth database query statement both query a database table, the unique column set may be the primary key of the database table. UNION DISTINCT can perform a deduplication operation based on the unique column sets projected by the third database query statement and the fourth database query statement. Taking the primary key as the unique column set as an example, if the query results of the third database query statement and the fourth database query statement include data corresponding to the same primary key, only one copy of the data is retained. After the deduplication is completed, the unique column set can be deleted from the query results.
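A small SQLite experiment shows why the unique column set matters. In this sketch the query projects only c4, which is not unique. The query text and sample rows are assumptions chosen for illustration, and SQLite's `union` stands in for UNION DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1(c1 int primary key, c2 int, c3 int, c4 int);
insert into t1 values (1, 10, 100, 7), (2, 20, 200, 7), (3, 30, 300, 8);
""")

original = "select c4 from t1 where c2 = 10 or c3 = 200"

# Without a unique column, UNION merges two distinct rows that share c4 = 7.
without_key = """select c4 from t1 where c2 = 10
                 union
                 select c4 from t1 where c3 = 200"""

# Project the primary key c1 in each branch, deduplicate, then drop it.
with_key = """select c4 from (
                  select c1, c4 from t1 where c2 = 10
                  union
                  select c1, c4 from t1 where c3 = 200)"""

fetch = lambda sql: sorted(conn.execute(sql).fetchall())
```

The original query returns the value 7 twice (once per matching row). The UNION without the key collapses those two rows into one, while the variant that carries c1 through the UNION and removes it afterwards reproduces the original result.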

In step S130, the data in the database is queried according to the second database query statement. For example, an execution plan corresponding to the second database query statement may be generated, and then the database query task is executed based on the execution plan, and the query result corresponding to the second database query statement is returned.

The embodiment corresponding to Figure 1 performs OR predicate expansion based on UNION DISTINCT. Since UNION DISTINCT itself has the function of data deduplication, there is no need to use the predicate LLNVL, thus avoiding the calculation of such complex predicates, thereby improving the execution performance of the database query statement after the OR predicate expansion.

In some embodiments, before executing step S120, the method of FIG. 1 may further include: checking the first database query statement to determine whether the first database query statement satisfies the expansion condition. For example, it may be checked whether the query branch in the first database query statement can output a unique column set. If the query branch in the first database query statement cannot output a unique column set, deduplication cannot be performed based on UNION DISTINCT (because UNION DISTINCT needs to implement deduplication based on a unique column set, which is the main difference between UNION DISTINCT and UNION ALL). In this case, OR predicate expansion may not be performed, or OR predicate expansion may be performed in a traditional UNION ALL-based manner.

As an example, suppose the object queried by the first database query statement is not a real database table, but a view, and the view contains a hierarchical query, then the view cannot output a unique column set. Therefore, in this case, OR predicate expansion can be omitted, or the traditional UNION ALL-based method can be used to expand the OR predicate.

Correspondingly, if the query branches in the first database query statement can all output a unique column set, the OR predicate in the first database query statement is expanded based on UNION DISTINCT.
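The check described in the preceding paragraphs amounts to a simple decision rule, sketched below in Python. The function name `choose_expansion` and the way unique column sets are represented (a tuple of column names per branch, or None when no such set exists) are hypothetical and only serve to make the rule concrete.

```python
def choose_expansion(branch_unique_columns):
    """Pick an expansion strategy from per-branch unique column sets.

    `branch_unique_columns` holds, for each OR branch, the tuple of columns
    that uniquely identifies its rows, or None if no such set exists.
    """
    if all(cols for cols in branch_unique_columns):
        return "UNION DISTINCT"
    # Fallback; the text also allows skipping the expansion entirely.
    return "UNION ALL"

# Both branches read base tables with primary keys.
tables_with_keys = [("t1.c1", "t2.c1"), ("t1.c1", "t2.c1")]
# One branch reads a view (e.g. with a hierarchical query) that has no key.
view_without_key = [("t1.c1",), None]

decision_tables = choose_expansion(tables_with_keys)
decision_view = choose_expansion(view_without_key)
```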

In some embodiments, before executing step S120, the method of FIG. 1 may further include: performing SPJ (Selection-Projection-Join) separation on the query branches in the first database query statement. SPJ query refers to a query that only includes Selection, Projection, and Join. SPJ separation refers to converting the query branches into queries that only include Selection, Projection, and Join.

As a possible implementation, it can be determined whether each query branch in the first database query statement is an SPJ query. If a query branch itself is already an SPJ query, then there is no need to perform SPJ separation on the query branch; if a query branch is not an SPJ query (for example, if a query branch contains a Group-by clause, then the query branch is not an SPJ query), then it is necessary to perform SPJ separation on the query branch.

The following is a more detailed description of the embodiment of the present disclosure in conjunction with FIG. 2. It should be noted that the method 200 shown in FIG. 2 is only to help those skilled in the art understand the embodiment of the present disclosure, and is not intended to limit the embodiment of the present disclosure to the specific numerical values or specific scenarios illustrated. Those skilled in the art can obviously make various equivalent modifications or changes based on the examples given, and such modifications or changes also fall within the scope of the embodiment of the present disclosure.

Referring to FIG. 2, in step S210, it is checked whether OR predicate expansion can be performed on the database query statement. For example, it can be checked whether the OR predicate satisfies the expansion condition, whether the query branch can output a unique column set, and so on.

For example, the database query statement is Q6 shown below:

The query branch where the OR predicate in the database query statement Q6 is located involves two tables: t1 and t4. As can be seen from the creation process, both t1 and t4 have primary keys, so (t1.c1, t4.c1) can be used as the output unique column set, and the values of this set uniquely identify a row.

In step S220, SPJ separation is performed on the query branch in the database query statement. If the query branch itself is an SPJ query, then no separation is required, otherwise SPJ separation is required.

Taking the database query Q6 mentioned above as an example, Q6 includes a Group-by clause, which is not an SPJ query. Therefore, the query branch can be SPJ separated to convert Q6 into Q7 shown below, where temp is the SPJ query branch after separation.

In step S230, the separated SPJ branches are expanded with an OR predicate. The OR condition is split into multiple simple conditions, and corresponding SELECT statements are created. At the same time, a corresponding output unique column set is added to each SELECT statement, and finally these queries are combined in the form of UNION DISTINCT. Q8 below is the result of OR predicate expansion on the temp table in Q7, where t1.c1 and t4.c1 are the unique column sets added in the query branch.

In step S240, the expanded UNION DISTINCT query is encapsulated, and the columns that need to be projected are projected. Q9 below shows a query encapsulation of Q8.
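The statements Q6 to Q9 are not reproduced in this text, so the following walk-through of steps S220 to S240 uses hypothetical stand-ins run in SQLite. The shape of the query, the definition of t4, the aliases `k1`, `k4` and `temp_spj` (playing the role of temp), and the sample rows are all assumptions. SQLite's `union` stands in for UNION DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1(c1 int primary key, c2 int, c3 int, c4 int);
create table t4(c1 int primary key, c2 int, c3 int, c4 int);
insert into t1 values (1, 10, 100, 0), (2, 10, 200, 0), (3, 30, 300, 0);
insert into t4 values (1, 10, 100, 0), (2, 99, 200, 0), (3, 30, 999, 0);
""")

# Hypothetical stand-in for Q6: an OR join condition under a GROUP BY.
q6_like = """
select t1.c2, count(*) from t1, t4
where t1.c2 = t4.c2 or t1.c3 = t4.c3
group by t1.c2"""

# S220 (Q7-like): move the SPJ part into a derived table.
q7_like = """
select c2, count(*) from (
    select t1.c2 as c2 from t1, t4
    where t1.c2 = t4.c2 or t1.c3 = t4.c3) temp_spj
group by c2"""

# S230 (Q8-like): one SELECT per branch, each projecting (t1.c1, t4.c1).
q8_like = """
select t1.c1 as k1, t4.c1 as k4, t1.c2 as c2 from t1, t4 where t1.c2 = t4.c2
union
select t1.c1 as k1, t4.c1 as k4, t1.c2 as c2 from t1, t4 where t1.c3 = t4.c3"""

# S240 (Q9-like): wrap Q8, project only the needed column, then aggregate.
q9_like = f"""
select c2, count(*) from (select c2 from ({q8_like})) temp_spj
group by c2"""

fetch = lambda sql: sorted(conn.execute(sql).fetchall())
```

On this data the three forms agree. The pair (t1.c1 = 1, t4.c1 = 1) satisfies both branches and is counted once, while the pairs that share c2 = 10 but differ in their keys remain separate rows, so the GROUP BY counts are preserved.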

This example proposes a method for expanding OR predicates based on UNION DISTINCT. Unlike the method for expanding OR predicates based on Union All, the method based on UNION DISTINCT does not need to add LLNVL predicates to avoid duplicate data. For OR predicates that include complex predicates (such as subquery predicates), since the method provided in this example avoids adding LLNVL predicates, it can reduce the calculation of complex predicates and increase the optimization space for expanding OR predicates, thereby improving the execution performance of database query statements.

It should be noted that the terms "first", "second", "third" and "fourth" etc. in the specification and claims of the present disclosure are used to distinguish different objects rather than to describe a specific order.

The above describes in detail the method embodiment of the present disclosure in conjunction with Figures 1 and 2, and the following describes in detail the device embodiment of the present disclosure in conjunction with Figures 3 and 4. It should be understood that the description of the method embodiment corresponds to the description of the device embodiment, and therefore, the part not described in detail can refer to the previous method embodiment.

FIG. 3 is a schematic diagram of a database query device according to an embodiment of the present disclosure. The database query device 300 shown in FIG. 3 includes a receiving module 310, an expansion module 320 and a query module 330.

The receiving module 310 may be configured to receive a first database query statement, wherein the first database query statement includes a first query branch and a second query branch, wherein the first query branch and the second query branch are connected by an OR predicate.

The expansion module 320 can be used to expand the OR predicate in the first database query statement to obtain a second database query statement, wherein the second database query statement includes a third database query statement and a fourth database query statement, the third database query statement is used to execute the first query branch, the fourth database query statement is used to execute the second query branch, and the third database query statement and the fourth database query statement are connected based on UNION DISTINCT.

The query module 330 may be used to query data in the database according to the second database query statement.

Optionally, in some embodiments, the device 300 may also include: a checking module, used to check whether the query branches in the first database query statement can output a unique column set before expanding the OR predicate in the first database query statement; the expansion module 320 can be used to expand the OR predicate in the first database query statement if all query branches in the first database query statement can output a unique column set.

Optionally, in some embodiments, the apparatus 300 may further include: a separation module, configured to perform SPJ separation on the query branches in the first database query statement before expanding the OR predicate in the first database query statement.

Optionally, in some embodiments, the predicate in the first query branch and/or the second query branch is a sub-query predicate.

Optionally, in some embodiments, the database query statement is a SQL statement.

FIG. 4 is a schematic diagram of the structure of a database query device provided by another embodiment of the present disclosure. The database query device 400 shown in FIG. 4 may include a memory 410 and a processor 420, and the memory 410 may be used to store instructions. The processor 420 may be used to execute the instructions stored in the memory 410 to implement the steps in the various methods described above. In some embodiments, the device 400 may also include a network interface 430, and data exchange between the processor 420 and an external device may be implemented through the network interface 430.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present disclosure are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a medium containing one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)).

Those skilled in the art will appreciate that the units and algorithm steps of each example described in conjunction with the embodiments of the present disclosure can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to be beyond the scope of the present disclosure.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical, or of another form.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present disclosure shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be determined by the protection scope of the claims.

Claims

1. A database query method, comprising:

receiving a first database query statement, wherein the first database query statement includes a first query branch and a second query branch, and the first query branch and the second query branch are connected by an OR predicate;

expanding the OR predicate in the first database query statement to obtain a second database query statement, wherein the second database query statement includes a third database query statement and a fourth database query statement, the third database query statement is used to execute the first query branch, the fourth database query statement is used to execute the second query branch, and the third database query statement and the fourth database query statement are connected based on UNION DISTINCT; and

querying data in a database according to the second database query statement.
2. The method according to claim 1, wherein before expanding the OR predicate in the first database query statement, the method further comprises:

checking whether the query branches in the first database query statement can output a unique column set;

and wherein expanding the OR predicate in the first database query statement comprises:

if all query branches in the first database query statement can output a unique column set, expanding the OR predicate in the first database query statement.
3. The method according to claim 1, wherein before expanding the OR predicate in the first database query statement, the method further comprises:

performing SPJ separation on the query branches in the first database query statement.
4. The method according to claim 1, wherein the predicate in the first query branch and/or the second query branch is a sub-query predicate.
5. The method according to claim 1, wherein the database query statement is an SQL statement.
6. A database query device, comprising:

a receiving module, configured to receive a first database query statement, wherein the first database query statement includes a first query branch and a second query branch, and the first query branch and the second query branch are connected by an OR predicate;

an expansion module, configured to expand the OR predicate in the first database query statement to obtain a second database query statement, wherein the second database query statement includes a third database query statement and a fourth database query statement, the third database query statement is used to execute the first query branch, the fourth database query statement is used to execute the second query branch, and the third database query statement and the fourth database query statement are connected based on UNION DISTINCT; and

a query module, configured to query data in a database according to the second database query statement.
7. The device according to claim 6, further comprising:

a checking module, configured to check whether the query branches in the first database query statement can output a unique column set before the OR predicate in the first database query statement is expanded;

wherein the expansion module is configured to:

expand the OR predicate in the first database query statement if all query branches in the first database query statement can output a unique column set.
8. The device according to claim 6, further comprising:

a separation module, configured to perform SPJ separation on the query branches in the first database query statement before the OR predicate in the first database query statement is expanded.
9. The device according to claim 6, wherein the predicate in the first query branch and/or the second query branch is a sub-query predicate.
10. The device according to claim 6, wherein the database query statement is an SQL statement.
11. A database query device, comprising:

a memory, configured to store instructions; and

a processor, configured to execute the instructions stored in the memory to perform the method according to any one of claims 1 to 5.