CN106777054B - Semi-connection merging method and semi-connection merging device - Google Patents

Semi-connection merging method and semi-connection merging device Download PDF

Info

Publication number
CN106777054B
CN106777054B CN201611131301.3A CN201611131301A CN106777054B CN 106777054 B CN106777054 B CN 106777054B CN 201611131301 A CN201611131301 A CN 201611131301A CN 106777054 B CN106777054 B CN 106777054B
Authority
CN
China
Prior art keywords
query
sub
statement
join
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611131301.3A
Other languages
Chinese (zh)
Other versions
CN106777054A (en
Inventor
余鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING VSETTAN DATA TECHNOLOGY CO.,LTD.
Original Assignee
Huasheng Xintai Information Industry Development Co Ltd
Beijing Huasheng Xintai Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huasheng Xintai Information Industry Development Co Ltd, Beijing Huasheng Xintai Data Technology Co Ltd filed Critical Huasheng Xintai Information Industry Development Co Ltd
Priority to CN201611131301.3A priority Critical patent/CN106777054B/en
Publication of CN106777054A publication Critical patent/CN106777054A/en
Application granted granted Critical
Publication of CN106777054B publication Critical patent/CN106777054B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24542Plan optimisation
    • G06F16/24544Join order optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a semi-connection merging method and a semi-connection merging device, wherein the semi-connection merging method comprises the following steps: querying the SQL parse tree for half-joins matching the sub-queries; if the half-connection statement and the sub-query statement which match the sub-query are both one of the first type statement and the second type statement, determining a first predicate expression to be merged according to the statement; judging whether the type of the half-join converted by the statement is the same as that of the half-join matched with the sub-query; if so, merging the first predicate expression and the filtering condition in the half-join of the matched sub-query; and if not, negating the first predicate expression, and merging the negated first predicate expression and the filtering condition in the half-join of the matched sub-query. According to the technical scheme of the invention, the search space can be reduced, the number of times of accessing the physical table can be reduced, and the repeated calculation of the predicate expression can be reduced.

Description

Semi-connection merging method and semi-connection merging device
Technical Field
The invention relates to the technical field of databases, in particular to a semi-connection merging method and a semi-connection merging device.
Background
The process of processing SQL (Structured Query Language) statements in a database is mainly divided into three phases: a syntax analysis stage, in which a database (data warehouse) converts SQL sentences from a client into a data structure which can be recognized by the database (data warehouse) by means of a syntax analysis tool, namely a syntax analysis tree; secondly, a query optimization stage, in which a database (data warehouse) applies various optimization methods (namely common query rewriting in the database) on a lexical syntax analysis tree, then an optimal access path is calculated for a table participating in connection, and finally a structure is generated to represent an optimal query plan; and thirdly, executing the query plan, executing the optimal query plan, and finally obtaining a result.
In the query optimization stage, various optimizations are performed on the parse tree. One of these optimization strategies is to rewrite the IN statement or the EXISTs statement into a semi-join statement and pull it up. The benefits of this are: after the IN statement or the EXIST statement is rewritten into the semi-connection, more opportunities can be provided to discover the access path with lower cost, and finally a better query plan is obtained.
The main problem with the optimization of semi-joins for IN statements or EXISTs statements as currently provided IN databases (data warehouses) is that the merging of semi-joins is not considered. In some cases, in complex SQL statements, there is a possibility of multiple half-joins merging. However, in the existing database (data warehouse), half-links are not merged, which brings the following two disadvantages:
(1) in the query optimization stage, when an access path is searched, a dynamic programming algorithm is generally adopted, the complexity of the dynamic programming algorithm and the number of tables participating in connection are in an exponential relation, so that one more table increases the search space by several times, and accordingly, the search cost is increased correspondingly.
(2) In the query planning stage, the number of times of accessing the physical table and the calculation of repeated predicate expressions are increased, and finally, the IO pressure of the disk and the consumption of CPU resources are increased.
Therefore, how to merge semi-joins is an urgent technical problem to be solved, so as to reduce the search space in the query optimization stage and reduce the number of times of accessing the physical table and the computation of repeated predicate expressions in the query plan execution stage.
Disclosure of Invention
To this end, it is an object of the invention to propose a semi-join merging method.
Another object of the invention is to propose a semi-joint merging device.
To achieve at least one of the above objects, according to an embodiment of the first aspect of the present invention, there is provided a semi-join merging method, including: querying the SQL parse tree for half-joins matching the sub-queries; if the half-connected statement and the sub-query statement which are matched with the sub-query are both one of the first type statement and the second type statement, determining a first predicate expression to be merged according to the statement; judging whether the type of the half-join converted by the statement is the same as that of the half-join matched with the sub-query; if so, merging the first predicate expression and the filtering condition in the half-join matching the sub-query; and if not, negating the first predicate expression, and merging the negated first predicate expression and the filtering condition in the half-join matched with the sub-query.
In the technical scheme, the first predicate expression and the filtering condition in the half-join matching the sub-query are merged, and because the predicate expression has the half-join, the two half-joins can be merged into one half-join through the merging to obtain an optimal query plan. Through the scheme, on one hand, the number of half connections is reduced, so that the number of tables participating in the connections is reduced, and therefore when the optimal table access path is searched in the query optimization stage, the search space is reduced, the time for generating the optimal access path is further reduced, and finally the time for generating the query plan can be reduced. On the other hand, a more optimized query plan can be obtained, the selectivity of the predicate expressions in the half-join after combination is smaller, and the number of the half-joins is reduced, so that in the stage of executing the query plan, not only can the access to the physical table be reduced and the corresponding disk IO be reduced, but also the repeated calculation of the predicate expressions can be reduced due to the combination of the predicate expressions, and the cost of a CPU is greatly saved.
In the foregoing technical solution, preferably, the step of querying a semi-join of a matched sub-query in an SQL parse tree includes: traversing the SQL parse tree to determine whether the semi-join in the SQL parse tree and the sub-query satisfy a matching condition; if the half-join in the SQL syntax analysis tree and the sub-query meet the matching condition, taking the half-join meeting the matching condition in the SQL syntax analysis tree as the half-join matching the sub-query; wherein the matching condition includes: the statement of the half-join in the SQL parse tree and the statement of the sub-query are both the first type statement or the second type statement, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the column of the main query quoted in the second predicate expression of the main query associated with the sub-query is the same as the column of the main query quoted on the join condition of the half-join in the SQL parse tree.
In the technical scheme, the half-join which meets the matching condition with the sub-query can be more comprehensively and accurately queried in the SQL syntax analysis tree by traversing the SQL syntax analysis tree.
In any of the above technical solutions, preferably, before the step of querying the semi-join matching sub-query in the SQL parse tree, the semi-join merging method further includes: obtaining a plurality of key fields in a syntax analysis tree of the sub-query; and generating the identification code of the sub-query according to the plurality of key fields.
In the technical scheme, the identification code of the sub-query is generated through a plurality of key fields in the syntax analysis tree of the sub-query, so that the half-link meeting the matching condition with the sub-query can be quickly inquired in the SQL syntax analysis tree according to the identification code of the sub-query.
In any of the foregoing technical solutions, preferably, if the half-join statement that matches the sub-query and the statement of the sub-query are both one of the first type statement and the second type statement, the step of determining the first predicate expression to be merged according to the one statement includes: if the half-join statement matching the sub-query and the statement of the sub-query are both the first type statement, taking the filtering condition in the first type statement as the first predicate expression; and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
In any of the above technical solutions, preferably, the half-join combining method further includes: if the half-join matching the sub-query is not queried in the SQL parse tree, adding the sub-query into the SQL parse tree as a half-join; and if the statement matched with the semi-connection of the sub-query and the statement of the sub-query are not the first type statement or the second type statement, adding the sub-query into the SQL syntax analysis tree as the semi-connection.
In the technical scheme, when the half-join matching the sub-query is not queried, or the half-join statement matching the sub-query and the sub-query statement are not the first type statement or the second type statement, the sub-query is added into the SQL syntax analysis tree as the half-join, so that the reliability of half-join merging is ensured.
In any one of the above technical solutions, preferably, the first type statement includes: an IN statement and/or an ANY statement; the second type statement includes: an EXIST statement.
In this embodiment, the first type statement includes: IN statements and/or ANY statements, the second type statements comprising: and the EXIST statement, thereby realizing the combination of the half connection of the IN statement, the ANY statement and the EXIST statement.
A second aspect of the present invention provides a semi-connection merging apparatus, including: a query unit for querying the half-join matching the sub-query in the SQL parse tree; the determining unit is used for determining a first predicate expression to be merged according to the statement if the statement matching the semi-join of the sub-query and the statement of the sub-query are both one of the first type statement and the second type statement; the judging unit is used for judging whether the type of the half-link converted by the statement is the same as that of the half-link matched with the sub-query; a merging unit, configured to merge the first predicate expression and the filtering condition in the half-join matching the sub-query if the determining unit determines that the type of the half-join converted from the statement is the same as the type of the half-join matching the sub-query; and the merging unit is further configured to, if the judging unit judges that the type of the half-join into which the statement is converted is different from the type of the half-join matching the sub-query, negate the first predicate expression, and merge the negated first predicate expression with the filter condition in the half-join matching the sub-query.
In the technical scheme, the first predicate expression and the filtering condition in the half-join matching the sub-query are merged, and because the predicate expression has the half-join, the two half-joins can be merged into one half-join through the merging to obtain an optimal query plan. Through the scheme, on one hand, the number of half connections is reduced, so that the number of tables participating in the connections is reduced, and therefore when the optimal table access path is searched in the query optimization stage, the search space is reduced, the time for generating the optimal access path is further reduced, and finally the time for generating the query plan can be reduced. On the other hand, a more optimized query plan can be obtained, the selectivity of the predicate expressions in the half-join after combination is smaller, and the number of the half-joins is reduced, so that in the stage of executing the query plan, not only can the access to the physical table be reduced and the corresponding disk IO be reduced, but also the repeated calculation of the predicate expressions can be reduced due to the combination of the predicate expressions, and the cost of a CPU is greatly saved.
In the foregoing technical solution, preferably, the query unit includes: a determining subunit, configured to traverse the SQL parsing tree to determine whether a half-join in the SQL parsing tree and the sub-query satisfy a matching condition; the determining subunit is further configured to, if the half-join in the SQL syntax analysis tree and the sub-query satisfy the matching condition, take the half-join in the SQL syntax analysis tree that satisfies the matching condition as the half-join that matches the sub-query; wherein the matching condition includes: the statement of the half-join in the SQL parse tree and the statement of the sub-query are both the first type statement or the second type statement, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the column of the main query quoted in the second predicate expression of the main query associated with the sub-query is the same as the column of the main query quoted on the join condition of the half-join in the SQL parse tree.
In the technical scheme, the half-join which meets the matching condition with the sub-query can be more comprehensively and accurately queried in the SQL syntax analysis tree by traversing the SQL syntax analysis tree.
In any one of the above technical solutions, preferably, the half-connection merging apparatus further includes: the half-join merging apparatus further includes: the obtaining unit is used for obtaining a plurality of key fields in the syntax analysis tree of the sub-query; and the generating unit is used for generating the identification code of the sub-query according to the plurality of key fields.
In the technical scheme, the identification code of the sub-query is generated through a plurality of key fields in the syntax analysis tree of the sub-query, so that the half-link meeting the matching condition with the sub-query can be quickly inquired in the SQL syntax analysis tree according to the identification code of the sub-query.
In any one of the foregoing technical solutions, preferably, the determining unit is specifically configured to, if the half-join statement that matches the sub-query and the sub-query statement are both the first type statement, use a filtering condition in the first type statement as the first predicate expression; and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
In any one of the above technical solutions, preferably, the half-connection merging apparatus further includes: the adding unit is used for adding the sub-query into the SQL syntax analysis tree as a semi-connection if the query unit does not query the semi-connection matched with the sub-query in the SQL syntax analysis tree; and the adding unit is further configured to add the sub-query as a semi-join to the SQL syntax analysis tree if neither the statement matching the semi-join of the sub-query nor the statement of the sub-query is the first type statement or the second type statement.
In the technical scheme, when the half-join matching the sub-query is not queried, or the half-join statement matching the sub-query and the sub-query statement are not the first type statement or the second type statement, the sub-query is added into the SQL syntax analysis tree as the half-join, so that the reliability of half-join merging is ensured.
In any one of the above technical solutions, preferably, the first type statement includes: an IN statement and/or an ANY statement; the second type statement includes: an EXIST statement.
In this embodiment, the first type statement includes: IN statements and/or ANY statements, the second type statements comprising: and the EXIST statement, thereby realizing the combination of the half connection of the IN statement, the ANY statement and the EXIST statement.
According to the technical scheme, the half-join is combined, so that the search space is reduced in the query optimization stage, the number of times of accessing the physical table is reduced in the query plan execution stage, and repeated calculation of predicate expressions is reduced.
Drawings
FIG. 1 shows a flow diagram of a semi-join merging method according to one embodiment of the invention;
FIG. 2 shows a flow diagram of a semi-join merging method according to another embodiment of the invention;
fig. 3 shows a schematic structural diagram of a half-link merging apparatus according to an embodiment of the present invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows a flow diagram of a semi-join merging method according to an embodiment of the invention.
As shown in fig. 1, a semi-join merging method according to an embodiment of the present invention includes:
step 102, query the half-join of the matched sub-query in the SQL parse tree.
Preferably, step 102 comprises: traversing the SQL parse tree to determine whether the semi-join in the SQL parse tree and the sub-query satisfy a matching condition; if the half-join in the SQL syntax analysis tree and the sub-query meet the matching condition, taking the half-join meeting the matching condition in the SQL syntax analysis tree as the half-join matching the sub-query; wherein the matching condition includes: the statements of the half-join in the SQL parse tree and the statements of the sub-query are both first type statements or second type statements, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the columns of the main query quoted in the second predicate expression of the main query associated with the sub-query are the same as the columns of the main query quoted on the connection condition of the half-join in the SQL parse tree.
And traversing the SQL syntax analysis tree to more comprehensively and accurately query the half-links meeting the matching conditions with the sub-queries in the SQL syntax analysis tree.
Preferably, the first type statement comprises: an IN statement and/or an ANY statement; the second type statement includes: an EXIST statement. The first type statement and the second type statement may contain a parse tree that can be converted into a semi-join.
Preferably, before step 102, the half-join merging method further comprises: obtaining a plurality of key fields in a syntax analysis tree of the sub-query; and generating the identification code of the sub-query according to the plurality of key fields.
For example, a plurality of key fields in the parse tree of the sub-query are merged into one overall field; the overall field is encoded with an unsigned integer number (e.g., a 64-bit integer number) to take the encoding of the overall field as the identification code of the sub-query.
For another example, each key field in the plurality of key fields in the parse tree of the sub-query is encoded using an unsigned integer number (e.g., an integer number of 64 bits), the codes of the plurality of key fields are combined, and the combined code is used as the identification code of the sub-query. And generating an identification code of the sub-query through a plurality of key fields in the syntax analysis tree of the sub-query and storing the identification code in the syntax analysis tree of the sub-query, so that the half-link meeting the matching condition with the sub-query can be quickly inquired in the SQL syntax analysis tree according to the identification code of the sub-query.
And 104, if the half-connection statement matched with the sub-query and the statement matched with the sub-query are both one of the first type statement and the second type statement, determining a first predicate expression to be merged according to the statement.
Preferably, step 104 comprises: if the half-join statement matching the sub-query and the statement of the sub-query are both the first type statement, taking the filtering condition in the first type statement as the first predicate expression; and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
Preferably, the half-join merging method further includes: if the half-join matching the sub-query is not queried in the SQL parse tree, adding the sub-query into the SQL parse tree as a half-join; and if the statement matched with the semi-connection of the sub-query and the statement of the sub-query are not the first type statement or the second type statement, adding the sub-query into the SQL syntax analysis tree as the semi-connection.
When the half connection matched with the sub-query is not queried, or the half connection statement matched with the sub-query and the sub-query statement are not the first type statement or the second type statement, the sub-query is added into the SQL syntax analysis tree as the half connection, so that the reliability of half connection combination is ensured.
Step 106, judging whether the type of the half-link converted by the statement is the same as the type of the half-link matched with the sub-query, if so, executing step 108, and if not, executing step 110.
Step 108, merging the first predicate expression with the filter condition in the half-join that matches the sub-query.
And 110, negating the first predicate expression, and merging the negated first predicate expression and the filtering condition in the half-join which matches the sub-query.
Wherein step 110 comprises: and generating a new first predicate expression for the first predicate expression, wherein an operator in the new first predicate expression is NOT, and the new first predicate expression is the negated first predicate expression.
In the technical scheme, the first predicate expression and the filtering condition in the half-join matching the sub-query are merged, and because the predicate expression has the half-join, the two half-joins can be merged into one half-join through the merging to obtain an optimal query plan. Through the scheme, on one hand, the number of half connections is reduced, so that the number of tables participating in the connections is reduced, and therefore when the optimal table access path is searched in the query optimization stage, the search space is reduced, the time for generating the optimal access path is further reduced, and finally the time for generating the query plan can be reduced. On the other hand, a more optimized query plan can be obtained, the selectivity of the predicate expressions in the half-join after combination is smaller, and the number of the half-joins is reduced, so that in the stage of executing the query plan, not only can the access to the physical table be reduced and the corresponding disk IO be reduced, but also the repeated calculation of the predicate expressions can be reduced due to the combination of the predicate expressions, and the cost of a CPU is greatly saved.
Fig. 2 shows a flow diagram of a semi-join merging method according to another embodiment of the invention.
As shown in fig. 2, a semi-join merging method according to another embodiment of the present invention includes:
step 202, traversing the filtering conditions of the SQL parse tree, and finding out the first type statement and the second type statement from the filtering conditions. Wherein the first type statement comprises: an IN statement and/or an ANY statement; the second type statement includes: an EXIST statement.
And step 204, generating an identification code for the sub-query, and storing the identification code of the sub-query into a syntax analysis tree of the sub-query.
Step 206, traversing the SQL parsing tree, determining whether a half-join matching the sub-query is queried, if so, executing step 208, and if not, executing step 210.
Wherein the half-join and sub-queries that match the sub-query satisfy the following matching conditions: the statement of the half-join and the statement of the sub-query in the SQL parsing tree are both the first type statement or the second type statement, the identification code of the half-join and the identification code of the sub-query in the SQL parsing tree are the same, and the column of the main query quoted in the second predicate expression of the main query associated with the sub-query is the same as the column of the main query quoted on the connection condition of the half-join in the SQL parsing tree.
Step 208, judging whether the half-join statement and the sub-query statement matching the sub-query are both the first type statement or the second type statement, if so, executing step 212, and if not, executing step 210.
Step 210, add the sub-query as a semi-join to the SQL parse tree.
Step 212, determining whether the half-join statement and the sub-query statement matching the sub-query are both of the second type statement, if so, executing step 214, and if not, indicating that the half-join statement and the sub-query statement matching the sub-query are both of the first type statement, executing step 218.
And step 214, taking the filter condition in the statement of the second type as the first predicate expression.
Step 216, determining whether the type of the half-join converted by the second type statement is the same as the type of the half-join matched with the sub-query, if it is determined that the type of the half-join converted by the second type statement is the same as the type of the half-join matched with the sub-query, executing step 224, and if it is determined that the type of the half-join converted by the second type statement is not the same as the type of the half-join matched with the sub-query, executing step 222.
Step 218, the filter condition in the first type statement is used as a first predicate expression.
Step 220, determining whether the type of the half-join converted by the first type statement is the same as the type of the half-join matched with the sub-query, if it is determined that the type of the half-join converted by the first type statement is the same as the type of the half-join matched with the sub-query, executing step 224, and if it is determined that the type of the half-join converted by the first type statement is different from the type of the half-join matched with the sub-query, indicating that the half-join converted by the first type statement is a reverse half-join, executing step 222.
And step 222, negating the first predicate expression, wherein the operator of the negated first predicate expression is NOT.
Step 224, the first predicate expression is merged with the filter terms in the half-join that matches the sub-query. If the first predicate expression is not negated, the first predicate expression is directly used for being merged with the filtering condition in the half-join of the matched sub-query; and if the first predicate expression is negated, merging the negated first predicate expression and the filtering condition in the half-join of the matched sub-query.
In addition, the first predicate expression is merged with the filter condition in the semi-join that matches the sub-query to obtain a new predicate expression, AND an operator of the predicate expression is an AND.
Fig. 3 shows a schematic structural diagram of a half-link merging apparatus according to an embodiment of the present invention.
As shown in fig. 3, a half-link merging apparatus 300 according to an embodiment of the present invention includes: a query unit 302, a determination unit 304, a judgment unit 306 and a merging unit 308.
A query unit 302 for querying the semi-joins of the matching sub-queries in the SQL parse tree.
Preferably, the query unit 302 includes: a determining subunit 3022, configured to traverse the SQL parse tree to determine whether the half-join in the SQL parse tree and the sub-query satisfy a matching condition; the determining subunit 3022 is further configured to, if the half-join in the SQL parse tree and the sub-query satisfy the matching condition, take the half-join in the SQL parse tree that satisfies the matching condition as the half-join that matches the sub-query; wherein the matching condition includes: the statements of the half-join in the SQL parse tree and the statements of the sub-query are both first type statements or second type statements, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the columns of the main query quoted in the second predicate expression of the main query associated with the sub-query are the same as the columns of the main query quoted on the connection condition of the half-join in the SQL parse tree.
And traversing the SQL syntax analysis tree to more comprehensively and accurately query the half-links meeting the matching conditions with the sub-queries in the SQL syntax analysis tree.
Preferably, the first type statement comprises: an IN statement and/or an ANY statement; the second type statement includes: an EXIST statement.
A determining unit 304, configured to determine, if the statement that matches the semi-join of the sub-query and the statement of the sub-query are both one of the first type statement and the second type statement, the first predicate expression to be merged according to the statement.
Preferably, the determining unit 304 is specifically configured to, if the statement that matches the semi-join of the sub-query and the statement of the sub-query are both the first type statement, take a filter condition in the first type statement as the first predicate expression; and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
And the judging unit 306 is configured to judge whether the type of the half-join converted by the statement is the same as the type of the half-join matching the sub-query.
A merging unit 308, configured to merge the first predicate expression and the filtering condition in the half-join matching the sub-query if the determining unit 306 determines that the type of the half-join converted by the statement is the same as the type of the half-join matching the sub-query; and the merging unit 308 is further configured to, if the judging unit 306 judges that the type of the half-join converted by the statement is different from the type of the half-join matched with the sub-query, perform negation on the first predicate expression, and merge the negated first predicate expression with the filtering condition in the half-join matched with the sub-query.
In the technical scheme, the first predicate expression and the filtering condition in the half-join matching the sub-query are merged, and because the predicate expression has the half-join, the two half-joins can be merged into one half-join through the merging to obtain an optimal query plan. Through the scheme, on one hand, the number of half connections is reduced, so that the number of tables participating in the connections is reduced, and therefore when the optimal table access path is searched in the query optimization stage, the search space is reduced, the time for generating the optimal access path is further reduced, and finally the time for generating the query plan can be reduced. On the other hand, a more optimized query plan can be obtained, the selectivity of the predicate expressions in the half-join after combination is smaller, and the number of the half-joins is reduced, so that in the stage of executing the query plan, not only can the access to the physical table be reduced and the corresponding disk IO be reduced, but also the repeated calculation of the predicate expressions can be reduced due to the combination of the predicate expressions, and the cost of a CPU is greatly saved.
In the above technical solution, preferably, the half-connection combining apparatus 300 further includes: an obtaining unit 310, configured to obtain a plurality of key fields in a syntax analysis tree of the sub-query; a generating unit 312, configured to generate the identification code of the sub-query according to the plurality of key fields.
For example, the generating unit 312 may combine a plurality of key fields in the parse tree of the sub-query into one total field; the overall field is encoded with an unsigned integer number (e.g., a 64-bit integer number) to take the encoding of the overall field as the identification code of the sub-query.
For another example, the generating unit 312 may also encode each of the plurality of key fields in the parse tree of the sub-query using the unsigned integer number (e.g., 64-bit integer number), and then combine the codes of the plurality of key fields to obtain the combined code as the identification code of the sub-query.
And generating an identification code of the sub-query through a plurality of key fields in the syntax analysis tree of the sub-query and storing the identification code in the syntax analysis tree of the sub-query, so that the half-link meeting the matching condition with the sub-query can be quickly inquired in the SQL syntax analysis tree according to the identification code of the sub-query.
In any of the above technical solutions, preferably, the half-joint combining apparatus 300 further includes: a joining unit 314, configured to join the sub-query as a semi-join into the SQL parsing tree if the querying unit 302 does not query a semi-join matching the sub-query in the SQL parsing tree; and the adding unit 314 is further configured to add the sub-query as a semi-join to the SQL syntax analysis tree if neither the statement matching the semi-join of the sub-query nor the statement of the sub-query is the first type statement or the second type statement.
When the half connection matched with the sub-query is not queried, or the half connection statement matched with the sub-query and the sub-query statement are not the first type statement or the second type statement, the sub-query is added into the SQL syntax analysis tree as the half connection, so that the reliability of half connection combination is ensured.
The technical solutions of the present invention are described in detail above with reference to the drawings, and according to the technical solutions of the present invention, by merging half-joins, the search space is reduced in the query optimization stage, the number of times of accessing the physical table is reduced in the query planning stage, and the computation of repeated predicate expressions is reduced.
In the present invention, the terms "first", "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance; the term "plurality" means two or more. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A semi-join merging method, comprising:
querying the SQL parse tree for half-joins matching the sub-queries;
if the half-connection statement matched with the sub-query and the statement matched with the sub-query are both one of the first type statement and the second type statement, determining a first predicate expression to be merged according to the one statement;
judging whether the type of the half-join converted by the statement is the same as that of the half-join matched with the sub-query;
if so, merging the first predicate expression and the filtering condition in the half-join matching the sub-query; and
if not, negating the first predicate expression, and merging the negated first predicate expression and the filtering condition in the half-join which matches the sub-query;
traversing the SQL parse tree to determine whether the semi-join in the SQL parse tree and the sub-query satisfy a matching condition;
if the half-join in the SQL syntax analysis tree and the sub-query meet the matching condition, taking the half-join meeting the matching condition in the SQL syntax analysis tree as the half-join matching the sub-query;
wherein the matching condition includes: the statement of the half-join in the SQL parse tree and the statement of the sub-query are both the first type statement or the second type statement, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the column of the main query quoted in the second predicate expression of the main query associated with the sub-query is the same as the column of the main query quoted on the connection condition of the half-join in the SQL parse tree;
the first type statement includes: an IN statement and/or an ANY statement;
the second type statement includes: an EXIST statement.
2. The semi-join merging method of claim 1, wherein prior to the step of querying the semi-joins of the matched sub-queries in the SQL parse tree, the semi-join merging method further comprises:
obtaining a plurality of key fields in a syntax analysis tree of the sub-query;
and generating the identification code of the sub-query according to the plurality of key fields.
3. The semi-join merging method according to claim 1 or 2, wherein if the statement that matches the semi-join of the sub-query and the statement of the sub-query are both one of a first type statement and a second type statement, the step of determining the first predicate expression to be merged according to the one statement comprises:
if the half-join statement matching the sub-query and the statement of the sub-query are both the first type statement, taking the filtering condition in the first type statement as the first predicate expression;
and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
4. The semi-join merging method of claim 1 or 2, further comprising:
if the half-join matching the sub-query is not queried in the SQL parse tree, adding the sub-query into the SQL parse tree as a half-join; and
and if the statement matched with the semi-connection of the sub-query and the statement of the sub-query are not the first type statement or the second type statement, adding the sub-query into the SQL syntax analysis tree as the semi-connection.
5. A semi-joined merge device, comprising:
a query unit for querying the half-join matching the sub-query in the SQL parse tree;
the determining unit is used for determining a first predicate expression to be merged according to a statement of a first type and a statement of a second type if the statement of the semi-join matching the sub-query and the statement of the sub-query are both one of the statements of the first type and the statements of the second type;
the judging unit is used for judging whether the type of the half-link converted by the statement is the same as that of the half-link matched with the sub-query;
a merging unit, configured to merge the first predicate expression and the filtering condition in the half-join matching the sub-query if the determining unit determines that the type of the half-join converted from the statement is the same as the type of the half-join matching the sub-query; and
the merging unit is further configured to, if the judging unit judges that the type of the half-join into which the statement is converted is different from the type of the half-join matching the sub-query, negate the first predicate expression, and merge the negated first predicate expression with the filter condition in the half-join matching the sub-query;
a determining subunit, configured to traverse the SQL parsing tree to determine whether a half-join in the SQL parsing tree and the sub-query satisfy a matching condition;
the determining subunit is further configured to, if the half-join in the SQL syntax analysis tree and the sub-query satisfy the matching condition, take the half-join in the SQL syntax analysis tree that satisfies the matching condition as the half-join that matches the sub-query;
wherein the matching condition includes: the statement of the half-join in the SQL parse tree and the statement of the sub-query are both the first type statement or the second type statement, the identification code of the half-join in the SQL parse tree is the same as the identification code of the sub-query, and the column of the main query quoted in the second predicate expression of the main query associated with the sub-query is the same as the column of the main query quoted on the connection condition of the half-join in the SQL parse tree;
the first type statement includes: an IN statement and/or an ANY statement;
the second type statement includes: an EXIST statement.
6. The semi-connected merging device of claim 5, further comprising:
the obtaining unit is used for obtaining a plurality of key fields in the syntax analysis tree of the sub-query;
and the generating unit is used for generating the identification code of the sub-query according to the plurality of key fields.
7. The semi-joint merging apparatus of claim 5 or 6, wherein the determining unit is specifically configured to,
if the half-join statement matching the sub-query and the statement of the sub-query are both the first type statement, taking the filtering condition in the first type statement as the first predicate expression;
and if the half-connected statement matched with the sub-query and the statement matched with the sub-query are both the statements of the second type, taking the filtering condition in the statements of the second type as the first predicate expression.
8. The semi-connected merging device of claim 5 or 6, further comprising:
the adding unit is used for adding the sub-query into the SQL syntax analysis tree as a semi-connection if the query unit does not query the semi-connection matched with the sub-query in the SQL syntax analysis tree; and
the adding unit is further configured to add the sub-query as a semi-join to the SQL syntax analysis tree if neither the statement matching the semi-join of the sub-query nor the statement of the sub-query is the first type statement or the second type statement.
CN201611131301.3A 2016-12-09 2016-12-09 Semi-connection merging method and semi-connection merging device Active CN106777054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611131301.3A CN106777054B (en) 2016-12-09 2016-12-09 Semi-connection merging method and semi-connection merging device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611131301.3A CN106777054B (en) 2016-12-09 2016-12-09 Semi-connection merging method and semi-connection merging device

Publications (2)

Publication Number Publication Date
CN106777054A CN106777054A (en) 2017-05-31
CN106777054B true CN106777054B (en) 2020-03-27

Family

ID=58879712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611131301.3A Active CN106777054B (en) 2016-12-09 2016-12-09 Semi-connection merging method and semi-connection merging device

Country Status (1)

Country Link
CN (1) CN106777054B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107632999A (en) * 2017-07-24 2018-01-26 杭州沃趣科技股份有限公司 A kind of method that multiple associated predicates are merged
CN112732752A (en) * 2021-01-29 2021-04-30 上海达梦数据库有限公司 Query statement optimization method, device, equipment and storage medium
CN112765286A (en) * 2021-02-01 2021-05-07 广州海量数据库技术有限公司 Query method and device based on relational database
CN114036188B (en) * 2021-11-29 2024-08-23 广州海量数据库技术有限公司 Method for optimizing and processing union in relational database management system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739398A (en) * 2008-11-11 2010-06-16 山东省标准化研究院 Distributed database multi-join query optimization algorithm
CN102110158A (en) * 2011-02-24 2011-06-29 上海大学 Multi-join query optimization method for database based on improved SDD-1 (System for Distributed Database) algorithm
CN103164495A (en) * 2011-12-19 2013-06-19 中国人民解放军63928部队 Half-connection inquiry optimizing method based on periphery searching and system thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7565342B2 (en) * 2005-09-09 2009-07-21 International Business Machines Corporation Dynamic semi-join processing with runtime optimization

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739398A (en) * 2008-11-11 2010-06-16 山东省标准化研究院 Distributed database multi-join query optimization algorithm
CN102110158A (en) * 2011-02-24 2011-06-29 上海大学 Multi-join query optimization method for database based on improved SDD-1 (System for Distributed Database) algorithm
CN103164495A (en) * 2011-12-19 2013-06-19 中国人民解放军63928部队 Half-connection inquiry optimizing method based on periphery searching and system thereof

Also Published As

Publication number Publication date
CN106777054A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
US10133778B2 (en) Query optimization using join cardinality
CN107402987B (en) Full-text retrieval method and distributed NewSQL database system
US10860632B2 (en) Information query method and device
CN106777054B (en) Semi-connection merging method and semi-connection merging device
US9665619B1 (en) Optimizing database queries using subquery composition
US9280583B2 (en) Scalable multi-query optimization for SPARQL
US20180067987A1 (en) Database capable of integrated query processing and data processing method thereof
KR101432700B1 (en) Method for optimizing query
US8423569B2 (en) Decomposed query conditions
CN106897343B (en) Searching method, storing method and device for execution plan
EP2682878A1 (en) Method of processing relational queries in a database system and corresponding database system
CN107729371B (en) Data indexing and querying method, device, equipment and storage medium of block chain
US9317552B2 (en) Reusing existing query plans in a database system
CN105677681A (en) Data search method and device based on multiple databases
CN105718593A (en) Database query optimization method and system
US9870394B2 (en) Query routing method, query routing server performing the same and storage medium storing the same
US10268724B2 (en) Techniques for improving the performance of complex queries
Urbani et al. KOGNAC: efficient encoding of large knowledge graphs
US10936595B2 (en) Deferring and/or eliminating decompressing database data
CN107729428A (en) A kind of SQL query method based on Presto and Elasticsearch
JP2024504322A (en) Combining JavaScript Object Notation (JASON) queries across cloud resources
EP3480693A1 (en) Distributed computing framework and distributed computing method
CN111078728B (en) Cross-database query method and device in database archiving mode
WO2015181511A1 (en) Data cleaning methods and systems
CN110147396B (en) Mapping relation generation method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220418

Address after: Room 403, 4th floor, building 23, East District, yard 10, Xibeiwang East Road, Haidian District, Beijing 100089

Patentee after: BEIJING VSETTAN DATA TECHNOLOGY CO.,LTD.

Address before: 100192 South Zone 1, floor 11, block a, No. 8 Xueqing Road (Science and technology wealth center), Haidian District, Beijing

Patentee before: BEIJING VSETTAN DATA TECHNOLOGY CO.,LTD.

Patentee before: Huasheng Xintai Information Industry Development Co., Ltd