CN110515973B - Data query optimization method, device, equipment and storage medium - Google Patents

Data query optimization method, device, equipment and storage medium Download PDF

Info

Publication number
CN110515973B
CN110515973B CN201910816840.8A CN201910816840A CN110515973B CN 110515973 B CN110515973 B CN 110515973B CN 201910816840 A CN201910816840 A CN 201910816840A CN 110515973 B CN110515973 B CN 110515973B
Authority
CN
China
Prior art keywords
statement
branch
node
sql
optimization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910816840.8A
Other languages
Chinese (zh)
Other versions
CN110515973A (en
Inventor
朱仲颖
刘志勇
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd filed Critical Shanghai Dameng Database Co Ltd
Priority to CN201910816840.8A priority Critical patent/CN110515973B/en
Publication of CN110515973A publication Critical patent/CN110515973A/en
Application granted granted Critical
Publication of CN110515973B publication Critical patent/CN110515973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data query optimization method, a data query optimization device, data query optimization equipment and a storage medium. Determining an optimization result of each branch statement in the UNION set type statement after determining that the analyzed Structured Query Language (SQL) statement is the UNION set type statement; determining branch statements to be executed according to the optimization results of the branch statements; according to the technical means for carrying out the structured query on the branch sentences to be executed, the problem that in the prior art, each branch sentence in the set type sentences needs to be queried is solved, each branch sentence in the UNION set type sentences is optimized, the branch sentences meeting the optimization conditions are deleted, the branch sentences not meeting the optimization conditions are reserved, the query is carried out only according to the reserved branch sentences, the number of the scanned data tables is reduced, and the query efficiency is improved.

Description

Data query optimization method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of data query, in particular to a data query optimization method, a data query optimization device, data query optimization equipment and a storage medium.
Background
With the continuous and deep development of informatization, the data generation speed is increasing, the data volume needing to be processed is rapidly expanding, and the big data era is coming. By big data is meant data that is of such a magnitude that the data involved is too big to be processed in a reasonable time by the mainstream software. The large-scale data storage, query and analysis become an urgent problem to be solved.
Structured Query Language (SQL), a special purpose programming Language, is a database Query and programming Language for accessing data and querying, updating, and managing relational database systems.
SQL realizes data query and program design through different statements, and a UNION set type statement is a crucial statement in the SQL statement. The UNION set type statement generally includes a plurality of branch statements, and the UNION set type statement mainly performs a set operation on query results of the plurality of branch statements. Each branch statement in the UNION set type statement under normal conditions is to scan the table data first and then filter the data according to the filter condition to obtain a result set.
Based on the technical scheme, when the UNION set type statement is used for query, the data table corresponding to each branch statement needs to be scanned, and query efficiency is reduced.
Disclosure of Invention
The invention provides an optimization method, device, equipment and storage medium for data query, which are used for reducing the number of scanning data tables and improving the query efficiency when a UNION set type statement is used for query.
In a first aspect, an embodiment of the present invention provides a method for optimizing a data query, including:
determining an optimization result of each branch statement in the UNION set type statement after determining that the analyzed Structured Query Language (SQL) statement is the UNION set type statement;
determining branch statements to be executed according to the optimization results of the branch statements;
and carrying out structured query according to the branch statement to be executed.
In a second aspect, an embodiment of the present invention further provides an apparatus for optimizing data query, including:
the optimization result determining module is used for determining the optimized result of each branch statement in the UNION set type statement after the analyzed structured query language SQL statement is determined to be the UNION set type statement;
a branch statement to be executed determining module, configured to determine a branch statement to be executed according to the optimization result of each branch statement;
and the structured query module is used for carrying out structured query according to the branch statement to be executed.
In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a memory for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of optimizing a data query as described in the first aspect above.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the optimization method for data query as described in the first aspect above.
Determining an optimization result of each branch statement in a UNION set type statement after determining that the analyzed Structured Query Language (SQL) statement is the UNION set type statement; determining branch statements to be executed according to the optimization results of the branch statements; the technical means for carrying out structured query according to the branch statement to be executed solves the problem that query needs to be carried out on each branch statement in the UNION set type statement in the prior art. In the embodiment, each branch statement in the UNION set type statement is optimized, the branch statement meeting the optimization condition is deleted, the branch statements which do not meet the optimization condition are reserved, and the query is performed only according to the reserved branch statements, so that the number of the scan data tables is reduced, and the query efficiency is improved.
Drawings
Fig. 1 is a flowchart of a method for optimizing a data query according to an embodiment of the present invention;
FIG. 2 is a flowchart of a data query optimization method according to a second embodiment of the present invention;
fig. 3 is a flowchart of an optimization method for data query according to a third embodiment of the present invention;
fig. 4a is a schematic structural diagram of a binary SQL tree according to a third embodiment of the present invention;
fig. 4b is a schematic structural diagram of an optimized SQL binary tree according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of an optimization apparatus for data query according to a third embodiment of the present invention;
fig. 6 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a data query optimization method according to an embodiment of the present invention, which is applicable to a case where a query is performed when a filtering condition of a branch statement in a UNION set type statement is false.
SQL is a special-purpose programming language, a database query and programming language, used to access data and query, update, and manage relational database systems; and is also an extension of the database script file. The programming statement in SQL is simply referred to as SQL statement.
As shown in fig. 1, the method for optimizing data query provided in this embodiment specifically includes the following steps:
s110, after the analyzed SQL statement is determined to be a UNION set type statement, the optimization result of each branch statement in the UNION set type statement is determined.
In this embodiment, since SQL is a programming language, different types of statements exist in SQL for query, update, management, and the like. The type of the SQL statement may be determined according to the type defined in the SQL statement.
A set is a whole of one or more elements, a table in SQL represents a data set, and a query is a result set generated on the basis of the set. SQL set operators include INTERSECT INTERSECT, UNION (containing no repeated values), UNION ALL (containing repeated values), difference MINUS, and non-INTERSECT EXCEPT. The UNION operator includes: UNION (containing no repeated values) and UNION ALL (containing repeated values).
Further, if it is detected that the SQL statement uses a UNION operator, it is determined that the parsed SQL statement is a UNION set type statement.
After receiving an SQL statement input by a user, carrying out syntax and semantic analysis on the received SQL statement to obtain an analyzed SQL statement. It should be noted that, in the present embodiment, only the analysis of the SQL statement is described, but not limited. And a proper SQL statement analysis method can be selected according to actual conditions to analyze the SQL statement.
It should be noted that one SQL statement may be composed of one branch statement or may be composed of a plurality of branch statements. And in an SQL statement, when any branch statement is a UNION operator, determining that the SQL statement is a UNION set type statement. And then analyzing and optimizing each branch statement in the SQL statement respectively to obtain an optimization result.
In this embodiment, the optimization results in deleting branch statements and retaining branch statements. And judging whether the filtering condition of the branch statement of the UNION set type statement is always false. If the filtering condition of the branch statement of the UNION set type statement is true, the branch statement with the filtering condition of true is reserved, the other branch statement is used as a new branch statement to judge, and the operation of judging whether the filtering condition of the branch statement is always false is returned to be executed.
If the filtering condition of the branch statement is always false, judging whether the number of the branch statements of the UNION set type statement is larger than 1, if the number of the branch statements of the UNION set type statement is larger than 1, deleting the branch statement corresponding to the filtering condition which is always false, subtracting 1 from the number of the branch statements, judging the next branch statement as a new branch statement, and returning to execute the operation of judging whether the filtering condition of the branch statement is always false.
And if the number of the branch statements of the UNION set type statement is equal to 1, not deleting the branch statement corresponding to the filtering condition which is always false so as to determine that at least one branch statement is reserved for query.
For example; the SQL statement is as follows:
SELECT*FROM T1 WHERE C1<10AND C1>10UNION ALL SELECT* FROM T2;
in the above SQL statement, there are two branch statements in the UNION set type statement, and the first branch statement is: SELECT FROM T1 WHERE C1<10AND C1> 10; the second branch statement is: SELECT FROM T2.
Firstly, a first branch statement is analyzed, the filtering condition of the first branch statement is C1<10AND C1>10, it can be known that the filtering condition of the first branch statement is always false AND can be optimized, AND the current SQL statement comprises 2 branch statements, the condition that the number of branch statements of a UNION set type statement is more than 1 is met, the first branch statement can be deleted, AND the number of branch statements is reduced by 1. The second branch statement continues to be analyzed, which has no filter condition and only 1 branch statement remains. No other branches can continue to optimize.
And S120, determining branch statements to be executed according to the optimization results of the branch statements.
Further, in this embodiment, the optimization result includes deleting branch statements and keeping branch statements. The deleted branch statement is no longer processed. And combining and sorting the reserved branch statements to obtain branch statements to be executed.
In the above example, the first branch statement is deleted, the second branch statement is retained, and the second branch statement is determined as the branch statement to be executed. Namely, the branch statement to be executed is: SELECT FROM T2.
And S130, carrying out structured query according to the branch statement to be executed.
In this embodiment, performing structured query according to the branch statement to be executed may refer to performing conventional query according to the statement to be executed. In this embodiment, the method of structured query is not limited.
Further, the branch statement to be executed is: SELECT FROM T2. Compared with the SQL statement before optimization, the method reduces scanning on the table 1, namely greatly reduces the number of scanning tables and improves the query efficiency.
According to the technical scheme, after the analyzed Structured Query Language (SQL) statement is determined to be a UNION set type statement, the optimization result of each branch statement in the UNION set type statement is determined; determining branch statements to be executed according to the optimization results of the branch statements; the technical means for carrying out structured query according to the branch statement to be executed solves the problem that query needs to be carried out on each branch statement in the UNION set type statement in the prior art. In the embodiment, each branch statement in the UNION set type statement is optimized, the branch statement meeting the optimization condition is deleted, the branch statements which do not meet the optimization condition are reserved, and the query is performed only according to the reserved branch statements, so that the number of the scan data tables is reduced, and the query efficiency is improved.
Example two
Fig. 2 is a flowchart of a data query optimization method according to a second embodiment of the present invention, where the data query optimization method is further optimized in this embodiment based on the foregoing embodiments. As shown in fig. 2, the optimized data query optimization method mainly includes the following steps:
s210, carrying out syntax and semantic analysis on the SQL sentence input by the user to obtain the analyzed SQL sentence.
It should be noted that the syntax and semantic parsing of the SQL statement is usually performed by the server. The syntax and semantic analysis of the SQL statement input by the user comprises the following steps: the syntax of the SQL statement is checked to see if it complies with the syntax rules. If the server process considers that the SQL statement does not accord with the grammar rule, the error information is fed back to the client. In this syntax checking process, table names, column names, and the like included in the SQL statement are not checked, and SQL is only a syntax check.
If the SQL statement conforms to the grammatical definition, the contents of the fields, tables, etc. in the statement are checked next. See if these fields, tables are in the database. If the table name and the column name are inaccurate, the database can feed back error information to the client.
And when the grammar and the semantics pass the check, obtaining the analyzed SQL statement. It should be noted that, in the present embodiment, only syntax and semantic parsing are described, but not limited.
And S220, judging whether the analyzed SQL statement is a UNION set type statement, if so, executing S230, and if not, executing S290.
SQL set operators include INTERSECT (INTERSECT), UNION (UNION (containing no repeated values) or UNION ALL (containing repeated values)), difference set MINUS, non-INTERSECT (EXCEPT). The collection type statement is an SQL statement that uses SQL collection operators. A UNION set type statement is a set type statement that uses UNION operator statements.
And judging whether the SQL statement contains a UNION operator or not, and if not, determining that the SQL statement is a non-UNION set type statement. If a UNION set operator is included, then the SQL statement is determined to be a UNION set type statement.
For example, SQL statements s1 INTERSECT (s2 UNION s3) EXCEPT s4 UNION s 5; is a UNION collection type statement.
S230, judging whether the branch sentences meet optimization conditions or not aiming at each branch sentence in the UNION set type sentences; if yes, go to S240, otherwise go to S260.
Wherein, the branch statement is the branch of the UNION set operator, and the branch statement satisfies the optimization condition and includes: the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is more than 1. The branch statement not satisfying the optimization condition comprises: the filter condition of the branch statement is true; and/or the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is equal to 1.
In this embodiment, for each branch statement in the UNION set type statement, whether the optimization condition is satisfied is sequentially determined from the first branch statement until all branch statements are completely determined.
Further, judging whether the first branch statement filtering condition is false, and if the first branch statement filtering condition is false, determining that the first branch statement meets the optimization condition; then, whether the second branch is true or false, determining that the second branch statement does not satisfy the optimization condition; if the first branch statement filtering condition is true, determining that the first branch statement does not meet the optimization condition; and then continuously judging whether the second branch is false, if so, determining that the second branch meets the optimization condition, and if so, determining that the second branch does not meet the optimization condition.
Further, whether the first branch statement filtering condition is false is judged, if the first branch statement filtering condition is false, whether the number of branch statements in the set type statement is larger than 1 is judged, and if the number of branch statements in the set type statement is larger than 1, the first branch statement is determined to meet the optimization condition; then, judging whether the second branch statement filtering condition is false, and if the second branch statement filtering condition is true, determining that the second branch statement does not meet the optimization condition; then, judging whether the third branch statement filtering condition is false, if the first branch statement filtering condition is false, judging whether the number of branch statements in the set type statement is larger than 1, and if the number of branch statements in the set type statement is smaller than or equal to 1, determining that the third branch statement does not meet the optimization condition.
S240, determining that the optimization result of the branch statement is the branch statement which meets the optimization condition.
In this embodiment, if the branch statement satisfies the optimization condition, the branch statement satisfying the optimization condition is deleted, and as described above, the first branch statement is deleted if the first branch statement satisfies the optimization condition.
And S250, reducing the number of the branch sentences by one to obtain the number of the branch sentences corresponding to the next branch sentence.
In this embodiment, the number of branch statements may be understood as the number of branch statements included in a set type statement.
In this embodiment, when the first branch term satisfies the optimization condition, the first branch term is deleted, and the number of branch terms is subtracted by 1 to obtain the number of branch terms corresponding to the second branch term. The number of branch statements corresponding to the second branch statement may be understood as the number of branch statements included in the set type statement when analyzing whether the second branch statement satisfies the optimization condition.
S260, determining the optimization result of the branch statement to be the branch statement which does not meet the optimization condition.
In the present embodiment, if the branch statement does not satisfy the optimization condition, the branch statement that does not satisfy the optimization condition is retained.
And S270, determining each reserved branch statement which does not meet the optimization condition as a branch statement to be executed.
And S280, carrying out structured query according to the branch statement to be executed.
In the embodiment, the reserved branch statements which do not satisfy the optimization condition are subjected to structured query.
And S290, carrying out structured query according to the analyzed SQL statement.
If the SQL statement does not contain a UNION set operator, determining that the SQL statement is a non-UNION set type statement, indicating that the SQL statement cannot be optimized, directly querying the SQL statement and not optimizing the SQL statement.
According to the technical scheme, after the analyzed Structured Query Language (SQL) statement is determined to be a UNION set type statement, the optimization result of each branch statement in the UNION set type statement is determined; determining branch statements to be executed according to the optimization results of the branch statements; the technical means for carrying out structured query according to the branch statement to be executed solves the problem that query needs to be carried out on each branch statement in the UNION set type statement in the prior art. In the embodiment, each branch statement in the UNION set type statement is optimized, the branch statement meeting the optimization condition is deleted, the branch statements which do not meet the optimization condition are reserved, and the query is performed only according to the reserved branch statements, so that the number of the scan data tables is reduced, and the query efficiency is improved.
On the basis of the above embodiments, the embodiments of the present invention provide a preferred example.
First, data tables T1 and T2 are created, and corresponding data are stored in the data tables T1 and T2.
CREATE TABLE T1(c1 INT,c2 INT);
CREATE TABLE T2(d1 INT,d2 INT);
INSERT INTO T1 VALUES(1,NULL);
INSERT INTO T2 VALUES(2,1);
The SQL statement is:
SQL>SELECT*FROM T1 WHERE C1<10 AND C1>10 UNION ALL SELECT*FROM T2;
the method comprises the following steps of:
Figure BDA0002186562440000111
wherein, 8 steps are scanning a T2 table, 7 steps are data projection, for example, the query term is C1+1, the data projection can calculate the expression operation, and the calculated result is stored; 6. 5 is scanning T1 table, 4 steps are data projection; 3 is to combine the results of T2 and T1. Step 2 is data projection and step 1 is returning the final result set.
In this embodiment, the optimized execution steps are:
Figure BDA0002186562440000121
after optimization, the middle 6, 5, 4 and 3 steps are simplified, and the scanning of the T1 table is reduced.
The optimization method provides that whether the filtering condition of each query clause is constantly false is identified and judged in the SQL analysis stage, if the result set of the constantly false query clause is a null set, the query branch of the SQL can be directly removed in the analysis stage, the data scanning times in the execution stage are reduced, and therefore the query efficiency is improved. For example, in the above example, scan filtering of T1 is required before optimization, and query of T2 is required after optimization.
EXAMPLE III
Fig. 3 is a flowchart of a data query optimization method according to a second embodiment of the present invention, where the data query optimization method is further optimized in this embodiment based on the foregoing embodiments. As shown in fig. 2, the optimized data query optimization method mainly includes the following steps:
s310, determining the analyzed structured query language SQL statement as a UNION set type statement.
And S320, taking the set operator of the analyzed SQL statement as a root node of the SQL binary tree, and establishing the SQL binary tree.
After the SQL query statement input by the user is parsed in syntax and semantic, an SQL syntax tree of the SQL memory structure is generated.
And if the syntax tree root node type is a non-aggregate statement type, namely no branch exists, directly carrying out structured query according to the analyzed SQL statement. Otherwise, building SQL binary tree according to branch statement in SQL.
The SQL syntax tree is a binary tree. A binary tree is a tree structure with at most two subtrees per node. The subtrees are generally referred to as the "left subtree" and the "right subtree".
First, data tables T1, T2, T3, and T4 are created, and corresponding data are stored in the data tables T1, T2, T3, and T4.
CREATE TABLE T1(c1 INT,c2 INT);
CREATE TABLE T2(d1 INT,d2 INT);
CREATE TABLE T3(e1 INT,e2 INT);
CREATE TABLE T4(f1 INT,f2 INT);
INSERT INTO T1 VALUES(1,NULL);
INSERT INTO T2 VALUES(2,1);
INSERT INTO T3 VALUES(2,1);
INSERT INTO T4 VALUES(1,1);
Set operation queries are performed on the above data tables T1, T2, T3, and T4. The SQL statement is as follows:
SELECT*FROM T1 WHERE C1<10AND C1>10INTERSECT SELECT* FROM T2 EXCEPT(SELECT*FROM T3 WHERE E1<10AND E1>10UNION ALL SELECT*FROM T4 WHERE f1<10AND f1>10);
firstly, SQL syntax semantic analysis is carried out to obtain an SQL syntax tree. According to the operation sequence of the set operation from left to right with brackets in priority, the generation of the binary tree is from right to left, and fig. 4a is a schematic structural diagram of an SQL binary tree provided by the third embodiment of the present invention; as shown in FIG. 4a, the entire statement of the SQL statement is considered as an EXCEPT set operation, where the node type of the root node is EXCEPT. SQL on the left and right sides of EXCEPT is the left and right subtrees of the root node. Left sub-tree SELECT FROM T1 WHERE C1<10AND C1>10INTERSECT SELECT FROM T2, right sub-tree (SELECT FROM T3 WHERE E1<10AND E1>10UNION ALL SELECT WHERE FROM T4 WHERE f1<10AND f1> 10); because the left and right subtrees are all set type SQL, the binary tree can be generated continuously and recursively, the root node of the left subtree is an INTERSECT type, the left and right children are respectively queried by T1 and T2, the root node of the right subtree is a UNION, and the left and right children are respectively queried by T3 and T4.
And S330, traversing each node of the SQL binary tree by adopting a precedence.
The Pre-order traversal (Pre-order) refers to that all nodes on a path are traversed along a certain path according to the left and right sequence of a root. In the binary tree, the root node is analyzed first, then the left leaf child node is analyzed, and finally the right leaf child node is analyzed. The present invention takes the first-order traversal as an example, but is not limited to this traversal manner, and the middle-order traversal or the subsequent traversal may be possible.
S340, optimizing each branch statement in the UNION set type statement according to the node type corresponding to each node of the SQL binary tree to obtain an optimization result of each branch statement.
Specifically, the node types corresponding to the nodes are determined in sequence; if the node type corresponding to the node is a father node, acquiring a left child node of the node as a new node, and returning to execute the operation of determining the node type corresponding to each node; if the node type corresponding to the node is a leaf node, judging whether the branch statement corresponding to the leaf node meets the filtering condition, and whether the branch statement corresponding to the leaf node meets the filtering condition comprises the following steps: the filtering condition of the branch statement corresponding to the leaf node is false, and the statement type corresponding to the parent node to which the leaf node belongs is a UNION set type; if the branch statements corresponding to the leaf nodes meet the filtering condition, deleting the leaf nodes, replacing brother nodes located at the same layer with the leaf nodes with the positions of father nodes in the SQL binary tree, and continuously analyzing the next node; and determining the optimization result of each branch statement by the new SQL binary tree until all the nodes are analyzed.
Each node is recursively analyzed using a binary tree traversal of the order of the nodes, and if the node type is not a leaf node, the next node is continued. If the node type is a single statement, namely a leaf node, and has no branch, whether the node is a root node is judged. If the node is the root node, optimization is not carried out, the root node is directly subjected to inverse splicing, and the branch statement to be executed is obtained.
If the leaf node is not the root node, judging the where expression stored on the node, if the where expression is always FALSE and the parent node type of the node is UNION, adjusting the SQL binary tree, and generating a new SQL binary tree.
When the parent node of the leaf node is the root node: if the node is a left-leaf child node, replacing the root node with a right-leaf child node; if the node is a right leaf child node, the root node is replaced with a left leaf child node.
When the parent node of the leaf node is not the root node, there are 4 cases as follows:
1. if the node is a left-leaf child node, if the father node of the node is also a left node, replacing the father node with a right-leaf child node;
2. if the node is a left-leaf child node, if the father node of the node is a right node, replacing the father node with a right-leaf child node;
3. if the node is a right-leaf child node, if the father node of the node is a left node, replacing the father node with a left-leaf child node;
4. if the node is a right leaf child node, replacing the father node with a left leaf child node if the father node of the node is a right node;
and after the grammar tree is adjusted, continuously analyzing the next node. Until all the nodes are analyzed.
Specifically, as shown in fig. 4, after the pre-sequencing traversal, the obtained node analysis sequence is root, s1, s3, s4, s2, s5, and s 6. Firstly, analyzing root nodes root: the type of the root node is EXCEPT and is not a leaf node, the next node s1 is continuously analyzed, the next node s1 is not a leaf node, the next node s3 is continuously analyzed, the node s3 is a leaf node, the query about the data table T1 is performed, the node s3 is not a root node, the filtering condition of the node memory is judged, the filtering condition is that C1<10AND C1>10 is always false, but the parent node s1 of s3 is an INTERSECT type AND does not meet the optimization condition, s4 is continuously analyzed, s4 parent node is also s1 AND does not meet the optimization condition, s2 is continuously analyzed, s2 is not a leaf node, s5 is continuously analyzed, s5 is a leaf node, AND the filter condition e1<10AND e1>10 is always false, the parent s2 is of type unity, and s5 can be optimized without being a root node, the SQL syntax tree is adjusted, the position of the s2 node is replaced by the sibling nodes s6, s2 and s5 of s5, and a new SQL binary tree is formed. Fig. 4b is a schematic structural diagram of an optimized SQL binary tree according to a second embodiment of the present invention; the new SQL binary tree is shown in fig. 4 b.
Then, s6 is analyzed continuously, s6 is a leaf node, but the parent node of s6 is root in fig. 4b, the type is except, optimization cannot be performed, all the nodes are analyzed completely, and finally the syntax tree obtained through optimization is fig. 4 b.
And S350, determining the branch statements to be executed according to the optimization results of the branch statements.
And S360, carrying out structured query according to the branch statement to be executed.
The branch statement to be executed of the SQL statement is reversely spelled according to the new SQL syntax tree diagram 4b as follows: SELECT FROM T1 WHERE C1<10AND C1>10INTERSECT SELECT FROM T2 expand (SELECT FROM T4 WHERE f1<10AND f1> 10); and analyzing syntax and semantics of the rewritten sentences and then executing SQL. Compared with SQL before optimization, the method greatly reduces the number of the scanning tables and improves the query efficiency.
Example four
Fig. 5 is a schematic structural diagram of an optimization apparatus for data query according to a fourth embodiment of the present invention, which is applicable to a case where a query is performed when a filtering condition of a branch statement in a UNION ALL statement is false, and the optimization apparatus for data query can be implemented in a software and/or hardware manner.
As shown in fig. 5, the optimization apparatus for data query provided in this embodiment specifically includes the following modules: an optimization result determination module 510, a to-be-executed branch statement determination module 520, and a structured query module 530.
The optimization result determining module 510 is configured to determine an optimization result of each branch statement in a UNION set type statement after determining that the parsed structured query language SQL statement is the UNION set type statement;
a branch statement to be executed determining module 520, configured to determine a branch statement to be executed according to the optimization result of each branch statement;
and the structured query module 530 is configured to perform structured query according to the branch statement to be executed.
In the apparatus for optimizing data query provided in this embodiment, after determining that the parsed structured query language SQL statement is a UNION set type statement, an optimization result of each branch statement in the UNION set type statement is determined; determining branch statements to be executed according to the optimization results of the branch statements; the technical means for carrying out structured query according to the branch statement to be executed solves the problem that query needs to be carried out on each branch statement in the UNION set type statement in the prior art. In the embodiment, each branch statement in the UNION set type statement is optimized, the branch statement meeting the optimization condition is deleted, the branch statements which do not meet the optimization condition are reserved, and the query is performed only according to the reserved branch statements, so that the number of the scan data tables is reduced, and the query efficiency is improved.
Further, the optimization result determining module 310 includes:
the judging unit is used for judging whether the branch statements meet the optimization conditions or not aiming at each branch statement;
a deletion determining unit, configured to determine, if the branch statement satisfies the optimization condition, that the optimization result of the branch statement is to delete the branch statement satisfying the optimization condition, and reduce the number of the branch statements by one to obtain the number of branch statements corresponding to a subsequent branch statement; wherein the branch statement satisfying the optimization condition comprises: the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is more than 1;
a reservation determining unit, configured to determine, if the branch statement does not satisfy the optimization condition, that an optimization result of the branch statement is a branch statement that does not satisfy the optimization condition; wherein the branch statement does not satisfy the optimization condition, including: the filter condition of the branch statement is true; and/or the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is equal to 1.
Further, the to-be-executed branch statement determining module 320 is specifically configured to determine each remaining branch statement that does not satisfy the optimization condition as the to-be-executed branch statement.
Further, the branch statement satisfying the optimization condition includes: the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is more than 1.
Further, the apparatus further comprises:
and the SQL binary tree building module is used for building the SQL binary tree by taking the set operator of the analyzed SQL statement as a root node of the SQL binary tree.
Further, the optimization result determining module 510 includes:
a node traversal unit for traversing each node of the SQL binary tree in a pre-order manner,
and the optimization result determining unit is used for optimizing each branch statement in the UNION set type statement according to the node type corresponding to each node of the SQL binary tree to obtain the optimization result of each branch statement.
Specifically, the optimization result determining unit is specifically configured to sequentially determine node types corresponding to the nodes; if the node type corresponding to the node is a father node, acquiring a left child node of the node as a new node, and returning to execute the operation of determining the node type corresponding to each node; if the node type corresponding to the node is a leaf node, judging whether the branch statement corresponding to the leaf node meets the filtering condition, and whether the branch statement corresponding to the leaf node meets the filtering condition comprises the following steps: the filtering condition of the branch statement corresponding to the leaf node is false, and the statement type corresponding to the parent node to which the leaf node belongs is a UNION set type; if the branch statements corresponding to the leaf nodes meet the filtering condition, deleting the leaf nodes, replacing brother nodes located at the same layer with the leaf nodes with the positions of father nodes in the SQL binary tree, and continuously analyzing the next node; and determining the optimization result of each branch statement by the new SQL binary tree until all the nodes are analyzed.
The structured query module 530 is further configured to perform structured query according to the parsed SQL statement when it is detected that the parsed structured query language SQL statement is a non-UNION set type statement.
The data query optimization device provided by the embodiment of the invention can execute the data query optimization method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 6 is a schematic structural diagram of an apparatus according to a fourth embodiment of the present invention, as shown in fig. 6, the apparatus includes a processor 610, a memory 620, an input device 630, and an output device 640; the number of processors 610 in the device may be one or more, and one processor 610 is taken as an example in fig. 6; the processor 610, the memory 620, the input device 630 and the output device 640 in the apparatus may be connected by a bus or other means, and fig. 6 illustrates an example of a connection by a bus.
The memory 620 is used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the optimization method of data query in the embodiment of the present invention (for example, the optimization result determination module 510, the to-be-executed branch statement determination module 520, and the structured query module 530 in the optimization device of data query). The processor 610 executes various functional applications of the device and data processing by executing software programs, instructions and modules stored in the memory 620, that is, implements the above-described optimization method of data query.
The memory 620 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 620 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 620 can further include memory located remotely from the processor 610, which can be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 630 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the device. The output device 640 may include a display device such as a display screen.
EXAMPLE five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for optimizing a data query, the method including:
determining an optimization result of each branch statement in the UNION set type statement after determining that the analyzed Structured Query Language (SQL) statement is the UNION set type statement;
determining branch statements to be executed according to the optimization results of the branch statements;
and carrying out structured query according to the branch statement to be executed.
Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the optimization method of data query provided by any embodiments of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the optimization apparatus for data query, the units and modules included in the optimization apparatus are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (8)

1. A method for optimizing a data query, comprising:
determining an optimization result of each branch statement in the UNION set type statement after determining that the analyzed Structured Query Language (SQL) statement is the UNION set type statement;
determining branch statements to be executed according to the optimization results of the branch statements;
carrying out structured query according to the branch statement to be executed;
before determining the optimized result of each branch statement in the UNION set type statement, the method further includes: taking the set operator of the analyzed SQL statement as a root node of the SQL binary tree, and establishing the SQL binary tree;
optimizing each branch statement in the UNION set type statement according to the node type corresponding to each node of the SQL binary tree to obtain an optimization result of each branch statement, wherein the optimization result comprises the following steps:
sequentially determining the node type corresponding to each node;
if the node type corresponding to the node is a father node, acquiring a left child node of the node as a new node, and returning to execute the operation of determining the node type corresponding to each node;
if the node type corresponding to the node is a leaf node, judging whether the branch statement corresponding to the leaf node meets the filtering condition, and whether the branch statement corresponding to the leaf node meets the filtering condition comprises the following steps: the filtering condition of the branch statement corresponding to the leaf node is false, and the statement type corresponding to the parent node to which the leaf node belongs is a UNION set type;
if the branch statements corresponding to the leaf nodes meet the filtering condition, deleting the leaf nodes, replacing brother nodes located at the same layer with the leaf nodes with the positions of father nodes in the SQL binary tree, and continuously analyzing the next node;
and determining the optimization result of each branch statement by the new SQL binary tree until all the nodes are analyzed.
2. The method of claim 1, wherein the determining the optimized result for each branch statement in the UNION set type statement comprises:
judging whether the branch statements meet optimization conditions or not aiming at each branch statement;
if the branch statement meets the optimization condition, determining that the optimization result of the branch statement is to delete the branch statement meeting the optimization condition, and reducing the number of the branch statements by one to obtain the number of the branch statements corresponding to the next branch statement; the branch statement is a branch of UNION set operation, and the condition that the branch statement meets the optimization condition includes: the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is more than 1;
if the branch statement does not meet the optimization condition, determining the optimization result of the branch statement as a branch statement which is reserved and does not meet the optimization condition; wherein the branch statement does not satisfy the optimization condition, including: the filter condition of the branch statement is true; and/or the filtering condition of the branch statement is false, and the number of the branch statements corresponding to the branch statement is equal to 1.
3. The method of claim 2, wherein determining branch statements to be executed according to the optimization results of the branch statements comprises:
and determining each reserved branch statement which does not meet the optimization condition as a branch statement to be executed.
4. The method of claim 1, wherein the determining the optimized result for each branch statement in the UNION set type statement comprises:
traversing each node of the SQL binary tree in a pre-order manner,
and optimizing each branch statement in the UNION set type statement according to the node type corresponding to each node of the SQL binary tree to obtain an optimization result of each branch statement.
5. The method of claim 1, further comprising:
and when detecting that the analyzed Structured Query Language (SQL) statement is a non-UNION set type statement, performing structured query according to the analyzed SQL statement.
6. An apparatus for optimizing a data query, comprising:
the optimization result determining module is used for determining the optimized result of each branch statement in the UNION set type statement after the analyzed structured query language SQL statement is determined to be the UNION set type statement;
a branch statement to be executed determining module, configured to determine a branch statement to be executed according to the optimization result of each branch statement;
the structured query module is used for carrying out structured query according to the branch statement to be executed;
the data query optimization device comprises: the SQL binary tree building module is used for building the SQL binary tree by taking the set operator of the analyzed SQL statement as a root node of the SQL binary tree;
the optimization result determination module comprises:
the node traversal unit is used for traversing each node of the SQL binary tree in a precedence way;
an optimization result determining unit, configured to optimize each branch statement in the UNION set type statement according to a node type corresponding to each node of the SQL binary tree, to obtain an optimization result of each branch statement;
the optimization result determining unit is specifically used for sequentially determining the node types corresponding to the nodes; if the node type corresponding to the node is a father node, acquiring a left child node of the node as a new node, and returning to execute the operation of determining the node type corresponding to each node; if the node type corresponding to the node is a leaf node, judging whether a branch statement corresponding to the leaf node meets a filtering condition; whether the branch statement corresponding to the leaf node meets the filtering condition or not comprises the following steps: the filtering condition of the branch statement corresponding to the leaf node is false, and the statement type corresponding to the parent node to which the leaf node belongs is a UNION set type; if the branch statements corresponding to the leaf nodes meet the filtering condition, deleting the leaf nodes, replacing brother nodes located at the same layer with the leaf nodes with the positions of father nodes in the SQL binary tree, and continuously analyzing the next node; and determining the optimization result of each branch statement by the new SQL binary tree until all the nodes are analyzed.
7. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a memory for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of optimizing data queries as recited in any of claims 1-5.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of optimizing a data query according to any one of claims 1 to 5.
CN201910816840.8A 2019-08-30 2019-08-30 Data query optimization method, device, equipment and storage medium Active CN110515973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910816840.8A CN110515973B (en) 2019-08-30 2019-08-30 Data query optimization method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910816840.8A CN110515973B (en) 2019-08-30 2019-08-30 Data query optimization method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110515973A CN110515973A (en) 2019-11-29
CN110515973B true CN110515973B (en) 2022-02-18

Family

ID=68629612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910816840.8A Active CN110515973B (en) 2019-08-30 2019-08-30 Data query optimization method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110515973B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990423B (en) * 2019-12-12 2023-04-18 上海达梦数据库有限公司 SQL statement execution method, device, equipment and storage medium
CN111159228A (en) * 2019-12-26 2020-05-15 北京神州绿盟信息安全科技股份有限公司 Data query method and device
CN111506602B (en) * 2020-04-20 2023-05-09 上海达梦数据库有限公司 Data query method, device, equipment and storage medium
CN112069198B (en) * 2020-07-16 2021-09-10 中科驭数(北京)科技有限公司 SQL analysis optimization method and device
CN112286961B (en) * 2020-09-29 2022-11-18 苏宁云计算有限公司 SQL optimization query method and device
CN112199390B (en) * 2020-09-30 2023-05-30 上海达梦数据库有限公司 Data query method, device, equipment and storage medium in database
CN113641701B (en) 2021-10-13 2022-02-18 苏州浪潮智能科技有限公司 Data query method, system, heterogeneous acceleration platform and storage medium
CN114490709B (en) * 2021-12-28 2023-03-24 北京百度网讯科技有限公司 Text generation method and device, electronic equipment and storage medium
CN117271576A (en) * 2023-10-19 2023-12-22 北京人大金仓信息技术股份有限公司 Query optimization method, storage medium and computer equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563694A (en) * 2018-03-19 2018-09-21 广州视源电子科技股份有限公司 Method, apparatus, computer equipment and storage medium are executed to the SQL that logic is deleted

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100535905C (en) * 2007-06-08 2009-09-02 北京神舟航天软件技术有限公司 Method for removing SQL query statement constant condition
US10521425B2 (en) * 2016-08-31 2019-12-31 Sap Se Generating faster and efficient database query execution plans
US11120021B2 (en) * 2017-01-11 2021-09-14 Facebook, Inc. Systems and methods for optimizing queries
CN108304505B (en) * 2018-01-18 2020-09-11 上海达梦数据库有限公司 SQL statement processing method and device, server and storage medium
CN110162574B (en) * 2019-05-27 2020-11-03 上海达梦数据库有限公司 Method and device for determining data redistribution mode, server and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563694A (en) * 2018-03-19 2018-09-21 广州视源电子科技股份有限公司 Method, apparatus, computer equipment and storage medium are executed to the SQL that logic is deleted

Also Published As

Publication number Publication date
CN110515973A (en) 2019-11-29

Similar Documents

Publication Publication Date Title
CN110515973B (en) Data query optimization method, device, equipment and storage medium
CN108804554B (en) Database query method, database query device, server and storage medium
JPH04299459A (en) Data base access system
CN111324577B (en) Yml file reading and writing method and device
CN110019384B (en) Method for acquiring blood edge data, method and device for providing blood edge data
CN110502532B (en) Method, device, equipment and storage medium for optimizing remote database object
CN111309753B (en) Optimization method, device, equipment and storage medium of structured query statement
CN114510611A (en) Method and device for constructing metadata blood relationship atlas and related equipment
Tekli et al. Approximate XML structure validation based on document–grammar tree similarity
CN108549688B (en) Data operation optimization method, device, equipment and storage medium
CN108629124B (en) Method for automatically generating simulation parameter data based on active graph path
CN112199390B (en) Data query method, device, equipment and storage medium in database
CN116483850A (en) Data processing method, device, equipment and medium
CN110990423B (en) SQL statement execution method, device, equipment and storage medium
CN107679107B (en) Graph database-based power grid equipment reachability query method and system
CN110188432B (en) System architecture verification method, electronic device and computer-readable storage medium
CN116910085A (en) Data query method, device, equipment and storage medium
CN111563094A (en) Data query method and device, electronic equipment and computer-readable storage medium
CN112799673B (en) Network protocol data checking method and device
CN114328525A (en) Data processing method and device
CN113590650A (en) Feature expression based structured query statement discrimination method and device
CN110162574B (en) Method and device for determining data redistribution mode, server and storage medium
CN114489723A (en) Universal configuration difference comparison method and device
CN110895529B (en) Processing method of structured query language and related device
KR101225333B1 (en) System and method using tree pattern expression for extraction information from syntactically parsed text corpora

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant