CN110968593A - Database SQL statement optimization method, device, equipment and storage medium - Google Patents

Database SQL statement optimization method, device, equipment and storage medium Download PDF

Info

Publication number
CN110968593A
CN110968593A CN201911257845.8A CN201911257845A CN110968593A CN 110968593 A CN110968593 A CN 110968593A CN 201911257845 A CN201911257845 A CN 201911257845A CN 110968593 A CN110968593 A CN 110968593A
Authority
CN
China
Prior art keywords
column
clause
optimized
data
invalid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911257845.8A
Other languages
Chinese (zh)
Other versions
CN110968593B (en
Inventor
万伟
朱仲颖
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd filed Critical Shanghai Dameng Database Co Ltd
Priority to CN201911257845.8A priority Critical patent/CN110968593B/en
Publication of CN110968593A publication Critical patent/CN110968593A/en
Application granted granted Critical
Publication of CN110968593B publication Critical patent/CN110968593B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a storage medium for optimizing SQL (structured query language) statements of a database, wherein the method comprises the following steps: determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL statement; for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized; and for each data column, when the data column is determined to be an invalid column, deleting the invalid column and updating the semantic analysis result so as to optimize the target SQL statement. The embodiment of the invention can effectively remove the invalid columns of the clauses to be optimized in the database SQL sentences, reduce the invalid calculation corresponding to the clauses to be optimized, save the memory space and improve the execution efficiency of the database SQL sentences.

Description

Database SQL statement optimization method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a method, a device, equipment and a storage medium for optimizing SQL (structured query language) statements of a database.
Background
In a database Structured Query Language (SQL) statement, an ORDER BY clause may be used to sort the queried data BY specified columns, or a GROUP BY clause or a PARTITION BY clause in an analytic function may be used to GROUP the queried data BY specified columns.
In general, the method for processing the ORDER BY clause is as follows: after the data are inquired, carrying out sequencing calculation according to the sequence specified in the ORDERBY clause; the processing method of the GROUP BY clause and the PARTITION BY clause comprises the following steps: and after the data is inquired, performing grouping calculation according to the grouping columns specified in the GROUP BY clause or the PARTITION BY clause. When invalid sort/GROUP columns exist in the ORDER BY clause or the GROUP BY clause or the grouping BY clause, meaningless redundant computation is performed during the sort/GROUP computation, and the requirement for memory for temporarily storing data is increased meaninglessly, thereby resulting in inefficient execution of the database SQL statement.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for optimizing a database SQL statement, which are used for effectively removing invalid columns of clauses to be optimized in the database SQL statement, reducing invalid calculation corresponding to the clauses to be optimized, saving memory space and improving the execution efficiency of the database SQL statement.
In a first aspect, an embodiment of the present invention provides a method for optimizing a database SQL statement, where the method includes:
determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL statement;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and for each data column, when the data column is determined to be an invalid column, deleting the invalid column and updating the semantic analysis result so as to optimize the target SQL statement.
In a second aspect, an embodiment of the present invention further provides a device for optimizing a database SQL statement, where the device includes:
the clause determining module is used for determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic parsing result of the target SQL statement;
an invalid column determining module, configured to determine, for each clause to be optimized, at least one invalid column included in the clause to be optimized from each data column based on a column type of each data column included in the clause to be optimized and a clause type of the clause to be optimized;
and the statement optimization module is used for deleting the invalid columns and updating the semantic parsing result to optimize the target SQL statement for each data column when the data columns are determined to be invalid columns.
In a third aspect, an embodiment of the present invention further provides a database SQL statement optimization device, where the device includes:
one or more processors;
storage means for storing one or more programs;
the one or more programs are executed by the one or more processors, so that the one or more processors implement the database SQL statement optimization method according to the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for optimizing the SQL statement in the database according to the first aspect of the embodiment of the present invention.
The embodiment of the invention determines the clauses to be optimized in the target SQL sentence based on the semantic analysis result of the target SQL sentence, further determines the invalid columns in the clauses to be optimized, and effectively reduces the invalid calculation corresponding to the clauses to be optimized, saves the memory space and improves the execution efficiency of the target SQL sentence by deleting the invalid columns in each clause to be optimized in the target SQL sentence.
Drawings
Fig. 1 is a schematic flowchart of a method for optimizing a database SQL statement according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a method for optimizing a database SQL statement according to a second embodiment of the present invention;
fig. 3 is a schematic flowchart of determining invalid columns in a clause to be optimized according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of an apparatus for optimizing a database SQL statement according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a database SQL statement optimization device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a schematic flow diagram of a database SQL statement optimization method according to an embodiment of the present invention, which is applicable to removing invalid columns of clauses to be optimized in a database SQL statement, reducing invalid calculations corresponding to the clauses to be optimized, saving memory space, and improving execution efficiency of the database SQL statement.
As shown in fig. 1, the method for optimizing the SQL statement of the database provided in this embodiment specifically includes the following steps:
it can be understood that a database SQL statement is composed of a plurality of syntax clauses, such as a SELECT clause, a FROM clause, a WHERE clause, a GROUP BY clause, an ORDER BY clause, a PARTITION BY clause, and the like, and for an ordering (ORDER BY) clause and a grouping (GROUP BY and PARTITION BY) clause in the database SQL statement, invalid ordering sequences and grouping columns may exist correspondingly, so that when the ordering and grouping operations in the database SQL statement are performed correspondingly, meaningless redundant operations are performed, and thus, a demand for a memory for temporarily storing data is increased meaninglessly, and an execution efficiency of the database SQL statement is reduced.
Illustratively, given a table TAB1, containing two columns of C1 and C2, there is the following SQL statement:
SELECT C1 FROM TAB1 GROUP BY C1,C2,C2 ORDER BY C1,C2,C1;
obviously, a duplicate column C2 exists in the GROUP BY clause and a duplicate column C1 exists in the ORDER BY clause, which may cause invalid computation and memory waste and reduce the execution efficiency of the whole SQL statement.
The embodiment of the invention aims to determine and remove the invalid columns in the target SQL statement. For an SQL sentence, after semantic parsing, each grammar clause and data columns contained in each grammar clause can be obtained, the embodiments of the invention mainly determine clauses of types such as sorting clauses, grouping clauses and the like as clauses to be optimized for optimization, each clause obtained by parsing and the corresponding contained data columns are stored in a corresponding storage structure, all data columns in the storage structure corresponding to the clauses to be optimized are traversed, when invalid columns exist in the clauses to be optimized, all invalid columns are found out and deleted, all clauses to be optimized are traversed and optimized in such a way, and therefore, the optimization of the whole target SQL sentence is completed. In addition, the semantic analysis result of the target SQL is optimized, and the deletion of the invalid columns in the clause to be optimized is equivalent to the change of the initial semantic analysis result, so that the updating of the semantic analysis result is completed, namely the optimization process of the embodiment of the invention is equivalent to the completion of the semantic analysis result.
S101, determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL statement.
Optionally, the target SQL statement is an SQL statement that includes the clause to be optimized according to the embodiment of the present invention. The clause to be optimized refers to a grammar clause meeting the optimization conditions of the embodiment of the invention, and optionally, the grammar clause comprises a sequencing clause and a grouping clause; the sorting clause is an ORDER BY clause in an SQL statement, and the grouping clause is a GROUP BY clause in the SQL statement and a PARTITION BY clause in an analysis function.
Optionally, based on a semantic analysis result of a target Structured Query Language (SQL) statement, obtaining at least one syntax clause included in the target SQL statement and at least one data column correspondingly included in each syntax clause; and storing each grammar clause and each data column correspondingly contained in the grammar clauses into a corresponding storage structure according to a semantic parsing sequence, and determining the grammar clauses with clause types of sequencing clauses or grouping clauses as clauses to be optimized.
The storage structure may be any physical storage unit, and optionally, the storage structure includes, but is not limited to, a linked list and an array.
It is understood that the semantic parsing result includes, but is not limited to, each syntax clause included in the target SQL statement and each data column correspondingly included in each syntax clause. The storage structures corresponding to different syntax clauses may be the same or different. The syntax clauses and the data columns correspondingly contained in the syntax clauses are stored in the corresponding storage structures according to the semantic parsing sequence, so that the syntax clauses and the data columns correspondingly contained in the syntax clauses have a certain arrangement sequence, and subsequent traversal check operation is facilitated.
S102, aiming at each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized.
Wherein the column type may be understood as a data column category divided according to an attribute of a column name itself of the data column, and optionally, the column type includes a direct constant column, an indirect constant column, and a repeated column. The direct constant column may be understood as a data column whose column name itself defines a constant; the indirect constant column can be understood as a constant column whose column name comes from a view or a derivative table (subquery) and is defined as a constant in the view or the derivative table, i.e. is indirectly referenced; the repeated columns are understood to be two or more data columns with the same column name that are present in a clause to be optimized.
Optionally, sequentially selecting each data column contained in the clause to be optimized from the storage structure corresponding to the clause to be optimized as a current column according to a semantic parsing sequence; and determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
Wherein the current column refers to a data column that is currently about to be processed or being processed.
It can be understood that a plurality of data columns may exist in a statement to be optimized, and at this time, each data column may be sequentially selected as a current column according to a semantic parsing order to perform subsequent operation steps.
Optionally, when it is determined that the current column belongs to a direct constant column, determining the current column as an invalid column contained in the clause to be optimized; when the current column is determined to belong to a repeated column, determining a data column which is repeated with the current column in the clause to be optimized as an invalid column; and when the current column is determined to belong to an indirect constant column, further judging the clause type of the clause to be optimized, when the clause to be optimized is a sequencing clause, determining the current column as an invalid column contained in the clause to be optimized, and when the clause to be optimized is a grouping clause, if the storage structures corresponding to other grammar clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
S103, for each data column, deleting the invalid column and updating the semantic analysis result to optimize the target SQL statement when the data column is determined to be the invalid column.
It can be understood that, by for each data column, when the data column is judged to be an invalid column, the data column is deleted from the corresponding storage structure; therefore, all data columns are traversed according to the semantic parsing sequence, the updating of the initial semantic parsing result can be achieved, and therefore the optimization of the target SQL statement is completed.
Optionally, determining the remaining data columns corresponding to the clauses to be optimized after the invalid columns are deleted as the data columns included in the clauses to be optimized, so as to update a data column set composed of the data columns included in the clauses to be optimized; and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
The embodiment of the invention determines the clauses to be optimized in the target SQL sentence based on the semantic analysis result of the target SQL sentence, further determines the invalid columns in the clauses to be optimized, and effectively reduces the invalid calculation corresponding to the clauses to be optimized, saves the memory space and improves the execution efficiency of the target SQL sentence by deleting the invalid columns in each clause to be optimized in the target SQL sentence.
Example two
Fig. 2 is a schematic flow chart of a method for optimizing a database SQL statement according to a second embodiment of the present invention, which is further optimized based on the first embodiment. In this embodiment, the determining, based on the semantic parsing result of the target structured query language SQL statement, at least one clause to be optimized included in the target SQL statement and at least one data row correspondingly included in each clause to be optimized is embodied as follows: obtaining at least one syntax clause contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each syntax clause based on a semantic analysis result of the target SQL statement; and storing each grammar clause and each data column correspondingly contained in the grammar clauses into a corresponding storage structure according to a semantic parsing sequence, and determining the grammar clauses with clause types of sequencing clauses or grouping clauses as clauses to be optimized.
In this embodiment, the determining, from each data column, at least one invalid column included in the clause to be optimized based on the column type of each data column included in the clause to be optimized and the clause type of the clause to be optimized is further embodied as: sequentially selecting each data column contained in the clauses to be optimized from the storage structures corresponding to the clauses to be optimized as a current column according to a semantic parsing sequence; and determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
The embodiment further embodies the updating of the semantic parsing result as follows: determining the remaining data columns corresponding to the clauses to be optimized after the invalid columns are deleted as the data columns contained in the clauses to be optimized so as to update a data column set consisting of all the data columns contained in the clauses to be optimized; and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
As shown in fig. 2, the method for optimizing the SQL statement of the database provided in this embodiment specifically includes the following steps:
s201, obtaining at least one grammar clause contained in the target SQL sentence and at least one data column correspondingly contained in each grammar clause based on a semantic analysis result of the target Structured Query Language (SQL) sentence.
S202, storing each grammar clause and each data column correspondingly contained in the grammar clauses into a corresponding storage structure according to a semantic parsing sequence, and determining the grammar clauses with clause types of sequencing clauses or grouping clauses as clauses to be optimized.
S203, aiming at each clause to be optimized, sequentially selecting each data column contained in the clause to be optimized from the storage structure corresponding to the clause to be optimized according to the semantic parsing sequence as a current column.
S204, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
Specifically, fig. 3 provides a schematic flow chart of determining invalid columns in a clause to be optimized according to the present embodiment, and as shown in fig. 3, steps S203 to S204 may be optimized as follows:
and S10, selecting the first data column of the clause to be optimized from the storage structure corresponding to the clause to be optimized as the current column according to the semantic parsing sequence.
S11, determining the column type of the current column based on the column name of the current column, wherein the column type comprises: a direct constant column, an indirect constant column, and a repeat column.
S12, judging whether the current column belongs to a direct constant column; if yes, go to S13; otherwise, S14 is executed.
S13, determining that the current column is an invalid column contained in the clause to be optimized, and turning to S22.
Illustratively, the following SQL statement is given:
“SELECT'abc'AS C1,C2 AS C2 FROM TAB1 GROUP BY C1,C2 ORDER BY C1,C2;”
it can be seen that C1 in the GROUP BY clause and the ORDER BY clause is a constant 'abc', so C1 belongs to a direct constant column, and the optimization method according to the embodiment of the present invention is applied to perform SQL statement optimization to "SELECT 'abc' AS C1, C2 AS C2 FROM TAB1 GROUP BY C2 ORDER BY C2".
S14, judging whether the current column belongs to an indirect constant column; if yes, go to S15; otherwise, S20 is executed.
Illustratively, the following SQL statement is given:
“SELECT C2 FROM(SELECT'abc'AS C1,C2 AS C2 FROM TAB1)GROUP BY C1,C2ORDER BY C1,C2;”
it can be seen that C1 comes FROM the sub-queries "SELECT 'abc' AS C1, C2 AS C2 FROM TAB 1" and is defined AS constant 'abc' in the sub-query, so C1 belongs to the indirect constant column, and C1 is only used AS the indirect constant column in the GROUP BY clause and the ORDER BY clause, at this time, after applying the optimization method described in the embodiments of the present invention, it is equivalent to being performed SQL statement optimization for "SELECT C2 FROM (SELECT 'abc' AS C1, C2 AS C2 FROM TAB1) GROUP BYC2 or ORDER BY C2".
Illustratively, the following SQL statements are given:
“SELECT C1,C2 FROM(SELECT'abc'AS C1,C2 AS C2 FROM TAB1)GROUP BY C1,C2;”
it can be seen that C1 in the GROUP BY clause comes from the sub-queries "SELECT 'abc' AS C1, C2 AS C2 FROMTAB 1", and is defined AS the constant 'abc' in the sub-query, so C1 belongs to the indirect constant column; however, C1 is used as an indirect constant column, in addition to being used in the GROUP BY clause, in the outermost query; at this time, if C1 in the GROUP BY clause is deleted AS an invalid column, the executed statement becomes "SELECT C1, C2 FROM (SELECT 'abc' AS C1, C2 AS C2 FROM TAB1) GROUP BY C2; "which is not consistent with the syntax specification of the SQL statement, which will report errors during the syntax and semantic parsing stages, cannot be executed.
S15, judging whether the clause to be optimized is a sequencing clause; if yes, go to S16; otherwise, S17 is executed.
S16, determining the current column as an invalid column contained in the clause to be optimized, and turning to S22.
S17, traversing and checking the storage structures corresponding to other syntax clauses except the clause to be optimized in the target SQL statement.
S18, judging whether the storage structures corresponding to the other grammar clauses do not contain the current column; if yes, go to S19; otherwise, S22 is executed.
S19, determining the current column as an invalid column contained in the clause to be optimized, and turning to S22.
S20, judging whether the current column belongs to a repeated column; if yes, go to S21; otherwise, S22 is executed.
Illustratively, the following SQL statement is given:
“SELECT C1 FROM TAB1 GROUP BY C1,C2,C2 ORDER BY C1,C2,C1;”
it can be seen that if there is a duplicate column C2 in the GROUP BY clause and a duplicate column C1 in the ORDER BY clause, the optimization method according to the embodiment of the present invention is equivalent to the SQL statement optimization performed for "SELECT C1 FROMTAB1 GROUP BY C1, C2 ORDER C1, C2".
And S21, determining the data column repeated with the current column in the clause to be optimized as an invalid column.
S22, judging whether all data columns contained in the clause to be optimized are checked in a traversing mode; if yes, ending; otherwise, S23 is executed.
And S23, selecting the next data column of the current column from the storage structure corresponding to the clause to be optimized as a new current column according to the semantic parsing sequence, and returning to execute S11.
S205, for each data column, deleting the invalid column when the data column is determined to be the invalid column.
S206, determining the remaining data columns corresponding to the clauses to be optimized after the invalid columns are deleted as the data columns contained in the clauses to be optimized, so as to update a data column set formed by the data columns contained in the clauses to be optimized.
S207, determining a semantic analysis result containing the updated data column set as a new semantic analysis result so as to optimize the target SQL statement.
According to the embodiment of the invention, based on the semantic analysis result of the target SQL statement, the GROUP BY clause, the ORDER BY clause and the PARTITION BY clause in the target SQL statement are determined as the clauses to be optimized, the invalid columns in the clauses to be optimized are further determined, and the invalid columns in each clause to be optimized in the target SQL statement are deleted, so that invalid calculation in the grouping and sequencing process is effectively reduced, the memory space is saved, and the execution efficiency of the target SQL statement is improved.
EXAMPLE III
Fig. 4 is a schematic flowchart of a database SQL statement optimization apparatus according to a third embodiment of the present invention, where this embodiment is applicable to a situation where invalid columns of clauses to be optimized in a database SQL statement are removed, invalid calculations corresponding to the clauses to be optimized are reduced, a memory space is saved, and execution efficiency of the database SQL statement is improved, and the apparatus may be implemented in a software and/or hardware manner, and specifically includes: a clause determining module 301, an invalid column determining module 302, and a statement optimizing module 303, wherein,
a clause determining module 301, configured to determine, based on a semantic parsing result of a target Structured Query Language (SQL) statement, at least one clause to be optimized included in the target SQL statement and at least one data column correspondingly included in each clause to be optimized;
an invalid column determining module 302, configured to determine, for each clause to be optimized, at least one invalid column included in the clause to be optimized from each data column based on a column type of each data column included in the clause to be optimized and a clause type of the clause to be optimized;
and a statement optimizing module 303, configured to, for each data column, delete an invalid column when it is determined that the data column is an invalid column, and update the semantic parsing result to optimize the target SQL statement.
On the basis of the foregoing embodiments, the clause determining module 301 includes:
the clause acquisition unit is used for acquiring at least one syntax clause contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each syntax clause based on a semantic analysis result of the target SQL statement;
and the to-be-optimized determining unit is used for storing each grammar clause and each data column correspondingly contained in the grammar clause into a corresponding storage structure according to the semantic parsing sequence, and determining the grammar clauses with clause types of sequencing clauses or grouping clauses as the to-be-optimized clauses.
On the basis of the foregoing embodiments, the invalid column determining module 302 includes:
a current column determining unit, configured to select, in sequence according to a semantic parsing order, each data column included in the clause to be optimized from the storage structure corresponding to the clause to be optimized as a current column;
and the invalid column determining unit is used for determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
On the basis of the above embodiments, the invalid column determining unit includes:
an attribute determining subunit, configured to determine a column type of the current column based on a column name of the current column, where the column type includes: a direct constant column, an indirect constant column, and a repeating column;
the first determining subunit is configured to determine that the current column is an invalid column included in the clause to be optimized if it is determined that the current column belongs to a direct constant column;
a second determining subunit, configured to determine, based on a clause type of the clause to be optimized, at least one invalid column included in the clause to be optimized from each data column if it is determined that the current column belongs to an indirect constant column;
and a third determining subunit, configured to determine, if it is determined that the current column belongs to a repeated column, a data column that is repeated with the current column in the clause to be optimized as an invalid column.
On the basis of the foregoing embodiments, the second determining subunit is specifically configured to:
when the clause to be optimized is a sequencing clause, determining the current column as an invalid column contained in the clause to be optimized;
when the clause to be optimized is a grouping clause, traversing and checking storage structures corresponding to other syntax clauses except the clause to be optimized in the target SQL statement;
and if the storage structures corresponding to the other syntax clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
On the basis of the foregoing embodiments, the third determining subunit is specifically configured to:
comparing the current column with data columns in the storage structure behind the current column in sequence;
determining that the current column belongs to a duplicate column when the same data column as the current column exists in the storage structure.
On the basis of the above embodiments, the statement optimization module 303 includes:
a column set updating unit, configured to determine, after deleting the invalid column, a remaining data column corresponding to the clause to be optimized as a data column included in the clause to be optimized, so as to update a data column set composed of the data columns included in the clause to be optimized;
and the analysis updating unit is used for determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
The database SQL statement optimization device provided by the embodiment of the invention can execute the database SQL statement optimization method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 5 is a schematic structural diagram of an optimization apparatus for SQL statements according to a fourth embodiment of the present invention, as shown in fig. 5, the apparatus includes a processor 40, a memory 41, an input device 42, and an output device 43; the number of the processors 40 in the device may be one or more, and one processor 40 is taken as an example in fig. 5; the processor 40, the memory 41, the input device 42 and the output device 43 in the apparatus may be connected by a bus or other means, for example in fig. 5.
The memory 41 serves as a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the database SQL statement optimization method in the embodiment of the present invention (for example, the clause determining module 301, the invalid column determining module 302, and the statement optimizing module 303 in the database SQL statement optimizing device). The processor 40 executes various functional applications and data processing of the database SQL statement optimization device by executing software programs, instructions and modules stored in the memory 41, that is, implements the above-described database SQL statement optimization method.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 41 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to a database SQL statement optimization device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 42 is operable to receive input numeric or character information and generate key signal inputs relating to user settings and function controls of the database SQL statement optimization apparatus. The output device 43 may include a display device such as a display screen.
EXAMPLE five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, where the computer-executable instructions are executed by a computer processor to perform a method for optimizing a database SQL statement, and the method includes:
determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL statement;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and for each data column, when the data column is determined to be an invalid column, deleting the invalid column and updating the semantic analysis result so as to optimize the target SQL statement.
Of course, the storage medium provided by the embodiment of the present invention includes computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the database SQL statement optimization method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the database SQL statement optimization apparatus, each unit and each module included in the embodiment are only divided according to functional logic, but are not limited to the above division, as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A database SQL statement optimization method is characterized by comprising the following steps:
determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL statement;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and for each data column, when the data column is determined to be an invalid column, deleting the invalid column and updating the semantic analysis result so as to optimize the target SQL statement.
2. The method according to claim 1, wherein the determining, based on the semantic parsing result of the target Structured Query Language (SQL) statement, at least one clause to be optimized contained in the target SQL statement and at least one data column correspondingly contained in each clause to be optimized comprises:
obtaining at least one syntax clause contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each syntax clause based on a semantic analysis result of the target SQL statement;
and storing each grammar clause and each data column correspondingly contained in the grammar clauses into a corresponding storage structure according to a semantic parsing sequence, and determining the grammar clauses with clause types of sequencing clauses or grouping clauses as clauses to be optimized.
3. The method of claim 2, wherein the determining, from each of the data columns, at least one invalid column included in the clause to be optimized based on the column type of each of the data columns included in the clause to be optimized and the clause type of the clause to be optimized comprises:
sequentially selecting each data column contained in the clauses to be optimized from the storage structures corresponding to the clauses to be optimized as a current column according to a semantic parsing sequence;
and determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
4. The method of claim 3, wherein the determining, from each of the data columns, at least one invalid column contained in the clause to be optimized based on the column type of the current column and the clause type of the clause to be optimized comprises:
determining a column type of the current column based on a column name of the current column, the column type comprising: a direct constant column, an indirect constant column, and a repeating column;
if the current column is determined to belong to the direct constant column, determining that the current column is an invalid column contained in the clause to be optimized;
if the current column is determined to belong to an indirect constant column, determining at least one invalid column contained in the clause to be optimized from each data column based on the clause type of the clause to be optimized;
and if the current column is determined to belong to a repeated column, determining a data column which is repeated with the current column in the clause to be optimized as an invalid column.
5. The method of claim 4, wherein the determining, from each of the data columns, at least one invalid column contained in the clause to be optimized based on the clause type of the clause to be optimized comprises:
when the clause to be optimized is a sequencing clause, determining the current column as an invalid column contained in the clause to be optimized;
when the clause to be optimized is a grouping clause, traversing and checking storage structures corresponding to other syntax clauses except the clause to be optimized in the target SQL statement;
and if the storage structures corresponding to the other syntax clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
6. The method of claim 4, wherein the determining that the current column belongs to a duplicate column comprises:
comparing the current column with data columns in the storage structure behind the current column in sequence;
determining that the current column belongs to a duplicate column when the same data column as the current column exists in the storage structure.
7. The method according to any one of claims 1-6, wherein the updating the semantic parsing result comprises:
determining the remaining data columns corresponding to the clauses to be optimized after the invalid columns are deleted as the data columns contained in the clauses to be optimized so as to update a data column set consisting of all the data columns contained in the clauses to be optimized;
and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
8. An apparatus for optimizing database SQL statements, comprising:
the clause determining module is used for determining at least one clause to be optimized contained in a target Structured Query Language (SQL) statement and at least one data column correspondingly contained in each clause to be optimized based on a semantic parsing result of the target SQL statement;
an invalid column determining module, configured to determine, for each clause to be optimized, at least one invalid column included in the clause to be optimized from each data column based on a column type of each data column included in the clause to be optimized and a clause type of the clause to be optimized;
and the statement optimization module is used for deleting the invalid columns and updating the semantic parsing result to optimize the target SQL statement for each data column when the data columns are determined to be invalid columns.
9. A database SQL statement optimization device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs being executed by the one or more processors such that the one or more processors implement the database SQL statement optimization method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the database SQL statement optimization method according to any of claims 1 to 7.
CN201911257845.8A 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium Active CN110968593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911257845.8A CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911257845.8A CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110968593A true CN110968593A (en) 2020-04-07
CN110968593B CN110968593B (en) 2023-10-03

Family

ID=70033484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911257845.8A Active CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110968593B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737341A (en) * 2020-04-23 2020-10-02 武汉达梦数据库有限公司 Insertion operation synchronization optimization method based on log analysis and data synchronization system
CN112100198A (en) * 2020-09-09 2020-12-18 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN112199387A (en) * 2020-11-16 2021-01-08 广州速威智能系统科技有限公司 Method and system for processing database query statement
CN113064925A (en) * 2021-03-15 2021-07-02 深圳依时货拉拉科技有限公司 Big data query method, system and computer readable storage medium
CN113268494A (en) * 2021-05-24 2021-08-17 中国联合网络通信集团有限公司 Statement processing method and device for database to be optimized
CN114118055A (en) * 2021-12-06 2022-03-01 中电万维信息技术有限责任公司 Complex SQL statement optimization analysis method
CN114153875A (en) * 2021-12-16 2022-03-08 上海达梦数据库有限公司 Structured query statement optimization method, device, electronic device and medium

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339770B1 (en) * 1999-08-12 2002-01-15 International Business Machines Corporation Query simplification and optimization involving eliminating grouping column from group by operation corresponds to group by item that is constant
US20050154725A1 (en) * 2004-01-08 2005-07-14 International Business Machines Corporation Method applying transitive closure to group by and order by clauses
CN101021874A (en) * 2007-03-21 2007-08-22 金蝶软件(中国)有限公司 Method and apparatus for optimizing request to poll SQL
US20140280308A1 (en) * 2013-03-15 2014-09-18 International Business Machines Corporation Flexible Column Selection in Relational Databases
US20160042068A1 (en) * 2014-08-08 2016-02-11 Software Ag Service oriented query and service query language framework
US20170031989A1 (en) * 2015-07-31 2017-02-02 International Business Machines Corporation Outer join optimizations in database management systems
CN106611044A (en) * 2016-12-02 2017-05-03 星环信息科技(上海)有限公司 SQL optimization method and device
KR101737578B1 (en) * 2015-11-27 2017-05-18 한국비앤에스시스템 주식회사 Method and device for automatically tuning for sql sentences generated automatically
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
US20180095962A1 (en) * 2016-10-05 2018-04-05 International Business Machines Corporation Translation of natural language questions and requests to a structured query format
CN108304505A (en) * 2018-01-18 2018-07-20 上海达梦数据库有限公司 A kind of processing method of SQL statement, device, server and storage medium
CN108628975A (en) * 2018-04-25 2018-10-09 上海达梦数据库有限公司 A kind of database SQL optimization and device
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109947791A (en) * 2019-03-27 2019-06-28 上海达梦数据库有限公司 A kind of database statement optimization method, device, equipment and storage medium
CN110399377A (en) * 2019-08-30 2019-11-01 北京东软望海科技有限公司 Optimization method, device, electronic equipment and the computer readable storage medium of SQL
US20190347260A1 (en) * 2017-06-14 2019-11-14 Tencent Technology (Shenzhen) Company Limited Uncorrelated subquery optimization method and apparatus, and storage medium

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339770B1 (en) * 1999-08-12 2002-01-15 International Business Machines Corporation Query simplification and optimization involving eliminating grouping column from group by operation corresponds to group by item that is constant
US20050154725A1 (en) * 2004-01-08 2005-07-14 International Business Machines Corporation Method applying transitive closure to group by and order by clauses
CN101021874A (en) * 2007-03-21 2007-08-22 金蝶软件(中国)有限公司 Method and apparatus for optimizing request to poll SQL
US20140280308A1 (en) * 2013-03-15 2014-09-18 International Business Machines Corporation Flexible Column Selection in Relational Databases
US20160042068A1 (en) * 2014-08-08 2016-02-11 Software Ag Service oriented query and service query language framework
US20170031989A1 (en) * 2015-07-31 2017-02-02 International Business Machines Corporation Outer join optimizations in database management systems
KR101737578B1 (en) * 2015-11-27 2017-05-18 한국비앤에스시스템 주식회사 Method and device for automatically tuning for sql sentences generated automatically
US20180095962A1 (en) * 2016-10-05 2018-04-05 International Business Machines Corporation Translation of natural language questions and requests to a structured query format
CN106611044A (en) * 2016-12-02 2017-05-03 星环信息科技(上海)有限公司 SQL optimization method and device
US20190347260A1 (en) * 2017-06-14 2019-11-14 Tencent Technology (Shenzhen) Company Limited Uncorrelated subquery optimization method and apparatus, and storage medium
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
CN108304505A (en) * 2018-01-18 2018-07-20 上海达梦数据库有限公司 A kind of processing method of SQL statement, device, server and storage medium
CN108628975A (en) * 2018-04-25 2018-10-09 上海达梦数据库有限公司 A kind of database SQL optimization and device
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109947791A (en) * 2019-03-27 2019-06-28 上海达梦数据库有限公司 A kind of database statement optimization method, device, equipment and storage medium
CN110399377A (en) * 2019-08-30 2019-11-01 北京东软望海科技有限公司 Optimization method, device, electronic equipment and the computer readable storage medium of SQL

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737341A (en) * 2020-04-23 2020-10-02 武汉达梦数据库有限公司 Insertion operation synchronization optimization method based on log analysis and data synchronization system
CN111737341B (en) * 2020-04-23 2024-02-09 武汉达梦数据库股份有限公司 Insertion operation synchronization optimization method and data synchronization system based on log analysis
CN112100198A (en) * 2020-09-09 2020-12-18 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN112100198B (en) * 2020-09-09 2023-12-08 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN112199387A (en) * 2020-11-16 2021-01-08 广州速威智能系统科技有限公司 Method and system for processing database query statement
CN113064925A (en) * 2021-03-15 2021-07-02 深圳依时货拉拉科技有限公司 Big data query method, system and computer readable storage medium
CN113268494A (en) * 2021-05-24 2021-08-17 中国联合网络通信集团有限公司 Statement processing method and device for database to be optimized
CN113268494B (en) * 2021-05-24 2023-06-02 中国联合网络通信集团有限公司 Method and device for processing database statement to be optimized
CN114118055A (en) * 2021-12-06 2022-03-01 中电万维信息技术有限责任公司 Complex SQL statement optimization analysis method
CN114118055B (en) * 2021-12-06 2024-05-10 中电万维信息技术有限责任公司 Optimized analysis method for complex SQL (structured query language) statement
CN114153875A (en) * 2021-12-16 2022-03-08 上海达梦数据库有限公司 Structured query statement optimization method, device, electronic device and medium
CN114153875B (en) * 2021-12-16 2024-12-27 上海达梦数据库有限公司 Structured query statement optimization method, device, electronic device and medium

Also Published As

Publication number Publication date
CN110968593B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN110968593B (en) Database SQL statement optimization method, device, equipment and storage medium
KR102407510B1 (en) Method, apparatus, device and medium for storing and querying data
US10007699B2 (en) Optimized exclusion filters for multistage filter processing in queries
US8396852B2 (en) Evaluating execution plan changes after a wakeup threshold time
CN107943952B (en) Method for realizing full-text retrieval based on Spark framework
WO2018036549A1 (en) Distributed database query method and device, and management system
US20070226176A1 (en) Apparatus and method for optimizing a query to a partitioned database table using a virtual maintained temporary index that spans multiple database partitions
CN108875077B (en) Column storage method and device of database, server and storage medium
CN109299101B (en) Data retrieval method, device, server and storage medium
CN112162983A (en) Database index suggestion processing method, device, medium and electronic equipment
US20220005546A1 (en) Non-redundant gene set clustering method and system, and electronic device
CN112100198A (en) Database SQL statement optimization method, device, equipment and storage medium
WO2021139268A1 (en) Sensitive word detection method and apparatus, computer device, and storage medium
US20070239656A1 (en) Removal of Database Query Function Calls
CN112347108A (en) Data query method and system suitable for hybrid backend
CN116266182A (en) SQL sentence optimization method and device
CN110990423B (en) SQL statement execution method, device, equipment and storage medium
CN110895529B (en) Processing method of structured query language and related device
CN117290377A (en) Method and device for converting SQL sentences among relational databases
CN108536819B (en) Method, device, server and storage medium for comparing integer column and character string
CN112835905B (en) Array type column indexing method, device, equipment and storage medium
US20080162414A1 (en) Accelerating queries using delayed value projection of enumerated storage
CN113536058A (en) Spatial index modification method, device, equipment and storage medium
US11048692B2 (en) Partition move in case of table update
CN108984720B (en) Data query method and device based on column storage, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant