CN110968593B - Database SQL statement optimization method, device, equipment and storage medium - Google Patents

Database SQL statement optimization method, device, equipment and storage medium Download PDF

Info

Publication number
CN110968593B
CN110968593B CN201911257845.8A CN201911257845A CN110968593B CN 110968593 B CN110968593 B CN 110968593B CN 201911257845 A CN201911257845 A CN 201911257845A CN 110968593 B CN110968593 B CN 110968593B
Authority
CN
China
Prior art keywords
column
clause
optimized
data
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911257845.8A
Other languages
Chinese (zh)
Other versions
CN110968593A (en
Inventor
万伟
朱仲颖
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd filed Critical Shanghai Dameng Database Co Ltd
Priority to CN201911257845.8A priority Critical patent/CN110968593B/en
Publication of CN110968593A publication Critical patent/CN110968593A/en
Application granted granted Critical
Publication of CN110968593B publication Critical patent/CN110968593B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a database SQL statement optimization method, a device, equipment and a storage medium, wherein the method comprises the following steps: determining at least one clause to be optimized and at least one data column correspondingly contained in each clause to be optimized, which are contained in a target Structured Query Language (SQL) statement, based on a semantic analysis result of the target SQL statement; for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized; and deleting the invalid column and updating the semantic parsing result to optimize the target SQL statement when the data column is determined to be the invalid column for each data column. The method and the device can effectively remove the invalid columns of the clauses to be optimized in the database SQL statement, reduce the corresponding invalid computation of the clauses to be optimized, save the memory space and improve the execution efficiency of the database SQL statement.

Description

Database SQL statement optimization method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a database SQL statement optimization method, a device, equipment and a storage medium.
Background
In a database structured query language (Structured Query Language, SQL) statement, the queried data may be ordered BY specified columns using ORDER BY clauses, or may be grouped BY specified columns using GROUP BY clauses or PARTITION BY clauses in an analytical function.
In general, the processing method for the ORDER BY clause is: after the data is queried, performing sorting calculation according to a sorting column appointed in the ORDER BY clause; the processing method for the GROUP BY clause and the PARTITION BY clause is as follows: and after the data is queried, carrying out grouping calculation according to a grouping column appointed in the GROUP BY clause or the component BY clause. When there is an invalid sort/grouping column in the ORDER BY clause or the GROUP BY clause or the component BY clause, it may result in meaningless redundant computation performed during the sort/grouping computation, and meaningless increase in the demand for the memory for temporarily storing data, thereby resulting in inefficiency in the execution of the database SQL statement.
Disclosure of Invention
The embodiment of the invention provides a database SQL sentence optimization method, device, equipment and storage medium, which are used for effectively removing invalid columns of clauses to be optimized in a database SQL sentence, reducing corresponding invalid computation of the clauses to be optimized, saving memory space and improving the execution efficiency of the database SQL sentence.
In a first aspect, an embodiment of the present invention provides a method for optimizing a database SQL statement, where the method includes:
determining at least one clause to be optimized and at least one data column correspondingly contained in each clause to be optimized, which are contained in a target Structured Query Language (SQL) statement, based on a semantic analysis result of the target SQL statement;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and deleting the invalid column and updating the semantic parsing result to optimize the target SQL statement when the data column is determined to be the invalid column for each data column.
In a second aspect, an embodiment of the present invention further provides a database SQL statement optimization apparatus, where the apparatus includes:
the clause determining module is used for determining at least one clause to be optimized and at least one data column correspondingly contained in each clause to be optimized, which are contained in the target SQL statement, based on the semantic analysis result of the target structured query language SQL statement;
the system comprises an invalid column determining module, a data column determining module and a data column determining module, wherein the invalid column determining module is used for determining at least one invalid column contained in each clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and the statement optimization module is used for deleting the invalid columns and updating the semantic analysis results to optimize the target SQL statement when determining that the data columns are invalid columns for each data column.
In a third aspect, an embodiment of the present invention further provides a database SQL statement optimization apparatus, where the apparatus includes:
one or more processors;
a storage means for storing one or more programs;
the one or more programs are executed by the one or more processors, so that the one or more processors implement the database SQL statement optimization method according to the first aspect of the embodiment of the invention.
In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, where a computer program is stored, where the computer program implements the method for optimizing a database SQL statement according to the first aspect of the embodiment of the present invention when the computer program is executed by a processor.
According to the embodiment of the invention, the clause to be optimized in the target SQL sentence is determined based on the semantic analysis result of the target SQL sentence, the invalid columns in the clause to be optimized are further determined, and the corresponding invalid calculation of the clause to be optimized is effectively reduced by deleting the invalid columns in each clause to be optimized in the target SQL sentence, so that the memory space is saved, and the execution efficiency of the target SQL sentence is improved.
Drawings
FIG. 1 is a flowchart of a database SQL statement optimization method according to an embodiment of the invention;
FIG. 2 is a flowchart of a database SQL statement optimization method according to a second embodiment of the invention;
FIG. 3 is a flowchart of determining invalid columns in clauses to be optimized according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of a database SQL statement optimization device according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a database SQL statement optimization device according to a fourth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. Furthermore, embodiments of the invention and features of the embodiments may be combined with each other without conflict. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 1 is a schematic flow chart of a method for optimizing a database SQL statement, which is provided by the first embodiment of the present invention, and the method may be implemented by a device for optimizing a database SQL statement, where the device may be implemented in a software and/or hardware manner and may be integrated in a device for optimizing a database SQL statement.
As shown in fig. 1, the method for optimizing the SQL statement of the database provided in this embodiment specifically includes the following steps:
it will be appreciated that a database SQL statement is composed of multiple grammar clauses, such as SELECT clause, FROM clause, WHERE clause, GROUP BY clause, ORDER BY clause, part BY clause, etc., and for ORDER (ORDER BY) clause and packet (GROUP BY) clause in the database SQL statement, there may be cases WHERE invalid ORDER and packet columns exist correspondingly, resulting in meaningless redundant operations being performed when ordering and packet operations in the database SQL statement are correspondingly performed, resulting in meaningless increases in the need for memory for temporarily storing data, while reducing the execution efficiency of the database SQL statement.
Illustratively, given table TAB1, containing two columns of C1 and C2, there are the following SQL statements:
SELECT C1 FROM TAB1 GROUP BY C1,C2,C2 ORDER BY C1,C2,C1;
obviously, there is a repeated column C2 in the GROUP BY clause, and there is a repeated column C1 in the ORDER BY clause, which causes invalid computation and memory waste, and reduces the execution efficiency of the whole SQL statement.
The embodiment of the invention is to determine and remove the invalid column in the target SQL sentence. For an SQL sentence, after semantic analysis, each grammar clause and data columns contained in each grammar clause are obtained, and the embodiments of the invention mainly determine clauses of types such as sorting clauses, grouping clauses and the like as clauses to be optimized for optimization, and traverse all data columns in the corresponding storage structure of the clause to be optimized by storing each clause obtained by analysis and the corresponding data columns contained in the corresponding storage structure, and find out all invalid columns when invalid columns exist in the clause to be optimized, and delete all the clauses to be optimized so as to traverse and optimize, thereby completing optimization of the whole target SQL sentence. In addition, the embodiment of the invention optimizes the semantic analysis result based on the target SQL, and the deletion of the invalid columns in the clause to be optimized is equivalent to the change of the initial semantic analysis result, so that the update of the semantic analysis result is completed, namely the optimization process of the embodiment of the invention is equivalent to completion.
S101, determining at least one clause to be optimized and at least one data column correspondingly contained in each clause to be optimized, which are contained in a target Structured Query Language (SQL) statement, based on a semantic analysis result of the target SQL statement.
The target SQL statement may be understood as an original SQL statement to be executed, and optionally, the target SQL statement is an SQL statement containing a clause to be optimized according to the embodiment of the present invention. The clause to be optimized refers to a grammar clause conforming to the optimization conditions of the embodiment of the invention, and optionally, the grammar clause comprises a sorting clause and a grouping clause; the sorting clauses are ORDER BY clauses in the SQL statement, and the grouping clauses are GROUP BY clauses in the SQL statement and PARTITION BY clauses in the analysis function.
Optionally, based on a semantic analysis result of the target Structured Query Language (SQL) statement, obtaining at least one grammar clause contained in the target SQL statement and at least one data column correspondingly contained in each grammar clause; storing each grammar clause and each data column correspondingly contained into a corresponding storage structure according to a semantic analysis sequence, and determining the grammar clause with the clause type of a sorting clause or a grouping clause as a clause to be optimized.
The storage structure may be any kind of physical storage unit, and optionally, the storage structure includes, but is not limited to, a linked list and an array.
It can be understood that the semantic analysis result includes, but is not limited to, each grammar clause included in the target SQL statement and each data column included in each grammar clause. The storage structures corresponding to different grammar clauses can be the same or different. The aim of storing each grammar clause and each data column correspondingly contained in the grammar clause to the corresponding storage structure according to the semantic analysis sequence is to enable each grammar clause and each data column correspondingly contained in the grammar clause to have a certain arrangement sequence, so that the subsequent traversal checking operation is convenient.
S102, determining at least one invalid column contained in each clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized aiming at each clause to be optimized.
Wherein the column type may be understood as a class of data columns divided according to an attribute of a column name itself of the data column, optionally the column type includes a direct constant column, an indirect constant column, and a repeated column. The direct constant column may be understood as a data column whose column name itself is defined as a constant; the indirect constant column may be understood as a column name from the view or derivative table (sub-query) and is defined as a constant in the view or derivative table, i.e., a indirectly referenced constant column; the repeated column may be understood as two or more columns of data having the same column name appearing in one clause to be optimized.
Optionally, sequentially selecting each data column contained in the clause to be optimized from the storage structure corresponding to the clause to be optimized according to the semantic analysis sequence as a current column; and determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
Wherein the current column refers to a data column that is currently about to be processed or is being processed.
It can be understood that there may be multiple data columns in one statement to be optimized, and at this time, each data column may be sequentially selected as a current column according to the semantic parsing order, so as to perform subsequent operation steps.
Optionally, when it is determined that the current column belongs to a direct constant column, determining the current column as an invalid column contained in the clause to be optimized; when the current column is determined to belong to a repeated column, determining a data column which is repeated with the current column in the clause to be optimized as an invalid column; and when the current column is determined to belong to an indirect constant column, further judging the clause type of the clause to be optimized, when the clause to be optimized is a sorting clause, determining the current column as an invalid column contained in the clause to be optimized, and when the clause to be optimized is a grouping clause, if the storage structures corresponding to other grammar clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
And S103, deleting the invalid column and updating the semantic analysis result to optimize the target SQL statement when the data column is determined to be the invalid column according to each data column.
It will be appreciated that by for each data column, when the data column is determined to be an invalid column, the data column is deleted from the corresponding storage structure; therefore, the updating of the initial semantic analysis result can be realized by traversing all data columns according to the semantic analysis sequence, and the optimization of the target SQL statement is completed.
Optionally, determining the remaining data columns corresponding to the clause to be optimized after deleting the invalid columns as the data columns contained in the clause to be optimized, so as to update a data column set formed by the data columns contained in the clause to be optimized; and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
According to the embodiment of the invention, the clause to be optimized in the target SQL sentence is determined based on the semantic analysis result of the target SQL sentence, the invalid columns in the clause to be optimized are further determined, and the corresponding invalid calculation of the clause to be optimized is effectively reduced by deleting the invalid columns in each clause to be optimized in the target SQL sentence, so that the memory space is saved, and the execution efficiency of the target SQL sentence is improved.
Example two
Fig. 2 is a flow chart of a database SQL statement optimization method according to a second embodiment of the present invention, where the second embodiment further optimizes the database SQL statement optimization method based on the first embodiment. In this embodiment, the determining, based on the semantic analysis result of the target structured query language SQL statement, at least one clause to be optimized and at least one data column correspondingly included in each clause to be optimized included in the target SQL statement is implemented as: based on a semantic analysis result of a target Structured Query Language (SQL) sentence, obtaining at least one grammar clause contained in the target SQL sentence and at least one data column correspondingly contained in each grammar clause; storing each grammar clause and each data column correspondingly contained into a corresponding storage structure according to a semantic analysis sequence, and determining the grammar clause with the clause type of a sorting clause or a grouping clause as a clause to be optimized.
The embodiment further embodies the determining, based on the column type of each data column included in the clause to be optimized and the clause type of the clause to be optimized, at least one invalid column included in the clause to be optimized from each data column, as follows: sequentially selecting each data column contained in the clause to be optimized from a storage structure corresponding to the clause to be optimized according to the semantic analysis sequence as a current column; and determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
The embodiment further embodies the updating of the semantic parsing result as: determining the remaining data columns corresponding to the clause to be optimized after deleting the invalid columns as the data columns contained in the clause to be optimized so as to update a data column set formed by the data columns contained in the clause to be optimized; and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
As shown in fig. 2, the method for optimizing the database SQL statement provided in this embodiment specifically includes the following steps:
s201, based on a semantic analysis result of a target Structured Query Language (SQL) sentence, at least one grammar clause contained in the target SQL sentence and at least one data column correspondingly contained in each grammar clause are obtained.
S202, storing each grammar clause and each data column correspondingly contained into a corresponding storage structure according to a semantic analysis sequence, and determining the grammar clause with the clause type of a sorting clause or a grouping clause as a clause to be optimized.
S203, sequentially selecting each data column contained in each clause to be optimized from a storage structure corresponding to the clause to be optimized according to a semantic analysis sequence as a current column.
S204, determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
Specifically, fig. 3 shows a schematic flow chart for determining invalid columns in clauses to be optimized according to the present embodiment, and as shown in fig. 3, steps S203 to S204 may be optimized as follows:
s10, selecting a first data column of the clause to be optimized from a storage structure corresponding to the clause to be optimized as a current column according to a semantic analysis sequence.
S11, determining a column type of the current column based on the column name of the current column, wherein the column type comprises: direct constant columns, indirect constant columns, and repeated columns.
S12, judging whether the current column belongs to a direct constant column or not; if yes, executing S13; otherwise, S14 is performed.
S13, determining the current column as an invalid column contained in the clause to be optimized, and turning to S22.
Illustratively, the following SQL statement is given:
“SELECT'abc'AS C1,C2 AS C2 FROM TAB1 GROUP BY C1,C2 ORDER BY C1,C2;”
it can be seen that, C1 in the GROUP BY clause and the ORDER BY clause is constant 'abc', so that C1 belongs to a direct constant column, and after the optimization method described in the embodiment of the present invention is applied, the optimization is equivalent to the optimization of the executed SQL statement AS "SELECT 'abc' AS C1, C2 AS C2 FROM TAB1 GROUP BY C2 ORDER BY C2".
S14, judging whether the current column belongs to an indirect constant column or not; if yes, executing S15; otherwise, S20 is performed.
Illustratively, the following SQL statement is given:
“SELECT C2 FROM(SELECT'abc'AS C1,C2 AS C2 FROM TAB1)GROUP BY C1,C2 ORDER BY C1,C2;”
it can be seen that C1 is derived FROM the sub-query "SELECT 'abc' AS C1, C2 AS C2 FROM TAB1", and is defined AS constant 'abc' in the sub-query, so C1 belongs to the indirect constant column, and C1 is used AS the indirect constant column only in the GROUP BY clause and the ORDER BY clause, at this time, after the optimization method described in the embodiment of the present invention is applied, it is equivalent to being executed to perform the optimization AS "SELECT C2 FROM (SELECT 'abc' AS C1, C2 AS C2 FROM TAB 1) GROUP BY C2 ORDER BY C2.
Illustratively, the following SQL statement is given:
“SELECT C1,C2 FROM(SELECT'abc'AS C1,C2 AS C2 FROM TAB1)GROUP BY C1,C2;”
AS can be seen, C1 in the GROUP BY clause comes FROM the sub-queries "SELECT 'abc' AS C1, C2 AS C2 FROM TAB1", and is defined AS constant 'abc' in the sub-query, so C1 belongs to the indirect constant column; however, C1 is used as an indirect constant column in addition to being used in the GROUP BY clause, also in the outermost query; at this time, if C1 in the GROUP BY clause is deleted AS an invalid column, the corresponding sentence to be executed becomes "SELECT C1, C2 FROM (SELECT 'abc' AS C1, C2 AS C2 FROM TAB 1) GROUP BY C2; "that is not in compliance with the grammar specification of the SQL statement that would be misplaced during the grammar and semantic parsing stages, cannot be executed.
S15, judging whether the clause to be optimized is a sorting clause or not; if yes, executing S16; otherwise, S17 is performed.
S16, determining the current column as an invalid column contained in the clause to be optimized, and turning to S22.
S17, traversing and checking storage structures corresponding to grammar clauses except the clauses to be optimized in the target SQL sentence.
S18, judging whether the storage structures corresponding to the other grammar clauses do not contain the current column; if yes, executing S19; otherwise, S22 is performed.
S19, determining the current column as an invalid column contained in the clause to be optimized, and turning to S22.
S20, judging whether the current column belongs to a repeated column or not; if yes, executing S21; otherwise, S22 is performed.
Illustratively, the following SQL statement is given:
“SELECT C1 FROM TAB1 GROUP BY C1,C2,C2 ORDER BY C1,C2,C1;”
it can be seen that, when there is a repetition column C2 in the GROUP BY clause and a repetition column C1 in the ORDER BY clause, after the optimization method described in the embodiment of the present invention is applied, the optimization is equivalent to the executed SQL statement optimization as "SELECT C1 FROM TAB1 GROUP BY C1, C2 ORDER BY C1, C2".
S21, determining the data column which is repeated with the current column in the clause to be optimized as an invalid column.
S22, judging whether to traverse and check all data columns contained in the clause to be optimized; if yes, ending; otherwise, S23 is performed.
S23, selecting the next data column of the current column from the storage structure corresponding to the clause to be optimized as a new current column according to the semantic analysis sequence, and returning to execute S11.
S205, for each data column, deleting the invalid column when determining that the data column is the invalid column.
S206, determining the remaining data columns corresponding to the clause to be optimized after deleting the invalid columns as the data columns contained in the clause to be optimized, so as to update a data column set formed by the data columns contained in the clause to be optimized.
S207, determining the semantic analysis result containing the updated data column set as a new semantic analysis result so as to optimize the target SQL statement.
According to the embodiment of the invention, based on the semantic analysis result of the target SQL sentence, the GROUP BY clause, the ORDER BY clause and the PARTION BY clause in the target SQL sentence are determined as clauses to be optimized, invalid columns in the clauses to be optimized are further determined, and invalid columns in each clause to be optimized of the target SQL sentence are deleted, so that invalid calculation in grouping and sorting processes is effectively reduced, memory space is saved, and execution efficiency of the target SQL sentence is improved.
Example III
Fig. 4 is a schematic flow chart of a database SQL statement optimization device provided in the third embodiment of the present invention, where the embodiment may be adapted to remove an invalid column of a clause to be optimized in a database SQL statement, reduce corresponding invalid computation of the clause to be optimized, save memory space, and improve execution efficiency of the database SQL statement, where the device may be implemented by software and/or hardware, and the device specifically includes: clause determination module 301, invalid column determination module 302, and statement optimization module 303, wherein,
the clause determining module 301 is configured to determine, based on a semantic analysis result of a target structured query language SQL statement, at least one clause to be optimized included in the target SQL statement and at least one data column included in each clause to be optimized correspondingly;
an invalid column determining module 302, configured to determine, for each clause to be optimized, at least one invalid column included in the clause to be optimized from the data columns based on a column type of each data column included in the clause to be optimized and a clause type of the clause to be optimized;
the statement optimization module 303 is configured to, for each data column, delete an invalid column when determining that the data column is the invalid column, and update the semantic parsing result to optimize the target SQL statement.
On the basis of the above embodiments, the clause determination module 301 includes:
the clause acquisition unit is used for acquiring at least one grammar clause contained in the target SQL statement and at least one data column correspondingly contained in each grammar clause based on the semantic analysis result of the target structured query language SQL statement;
the to-be-optimized determining unit is used for storing each grammar clause and each data column correspondingly contained into a corresponding storage structure according to the semantic analysis sequence, and determining the grammar clause with the clause type being a sorting clause or a grouping clause as the to-be-optimized clause.
Based on the above embodiments, the invalid column determining module 302 includes:
the current column determining unit is used for sequentially selecting each data column contained in the clause to be optimized from the storage structure corresponding to the clause to be optimized according to the semantic analysis sequence as a current column;
and the invalid column determining unit is used for determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized.
On the basis of the above embodiments, the invalid column determining unit includes:
an attribute determination subunit, configured to determine a column type of the current column based on a column name of the current column, where the column type includes: direct constant columns, indirect constant columns, and repeated columns;
a first determining subunit, configured to determine, if it is determined that the current column belongs to a direct constant column, that the current column is an invalid column included in the clause to be optimized;
a second determining subunit, configured to determine, if it is determined that the current column belongs to an indirect constant column, at least one invalid column included in the clause to be optimized from each data column based on a clause type of the clause to be optimized;
and the third determining subunit is used for determining the data column repeated with the current column in the clause to be optimized as an invalid column if the current column is determined to belong to the repeated column.
On the basis of the above embodiments, the second determining subunit is specifically configured to:
when the clause to be optimized is a sorting clause, determining the current column as an invalid column contained in the clause to be optimized;
when the clause to be optimized is a grouping clause, traversing and checking storage structures corresponding to other grammar clauses except the clause to be optimized in the target SQL statement;
and if the storage structures corresponding to the other grammar clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
On the basis of the above embodiments, the third determining subunit is specifically configured to:
comparing the current column with the data columns positioned behind the current column in the storage structure in sequence;
and when the same data column as the current column exists in the storage structure, determining that the current column belongs to a repeated column.
On the basis of the above embodiments, the sentence optimization module 303 includes:
a column set updating unit, configured to determine, as a data column included in the clause to be optimized, a remaining data column corresponding to the clause to be optimized after the invalid column is deleted, so as to update a data column set formed by each data column included in the clause to be optimized;
and the analysis updating unit is used for determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
The database SQL sentence optimizing device provided by the embodiment of the invention can execute the database SQL sentence optimizing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example IV
Fig. 5 is a schematic structural diagram of a database SQL statement optimization device according to a fourth embodiment of the present invention, and as shown in fig. 5, the device includes a processor 40, a memory 41, an input device 42, and an output device 43; the number of processors 40 in the device may be one or more, one processor 40 being taken as an example in fig. 5; the processor 40, the memory 41, the input means 42 and the output means 43 in the device may be connected by a bus or by other means, in fig. 5 by way of example.
The memory 41 is used as a computer readable storage medium for storing software programs, computer executable programs, and modules, such as program instructions/modules corresponding to the database SQL statement optimization method in the embodiment of the present invention (for example, the clause determination module 301, the invalid column determination module 302, and the statement optimization module 303 in the database SQL statement optimization device). The processor 40 executes various functional applications and data processing of the database SQL statement optimization apparatus by running software programs, instructions, and modules stored in the memory 41, that is, implements the database SQL statement optimization method described above.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. In addition, memory 41 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 41 may further comprise memory remotely located with respect to processor 40, which may be connected to the database SQL statement optimization device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 42 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the database SQL statement optimizing device. The output means 43 may comprise a display device such as a display screen.
Example five
A fifth embodiment of the present invention also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a database SQL statement optimization method, the method comprising:
determining at least one clause to be optimized and at least one data column correspondingly contained in each clause to be optimized, which are contained in a target Structured Query Language (SQL) statement, based on a semantic analysis result of the target SQL statement;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
and deleting the invalid column and updating the semantic parsing result to optimize the target SQL statement when the data column is determined to be the invalid column for each data column.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the above-mentioned method operations, and may also perform the related operations in the database SQL statement optimization method provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, etc., and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the database SQL statement optimization apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (8)

1. A method for optimizing a database SQL statement, comprising:
determining at least one clause to be optimized contained in a target Structured Query Language (SQL) sentence and at least one data column correspondingly contained in each clause to be optimized based on a semantic analysis result of the target SQL sentence;
for each clause to be optimized, determining at least one invalid column contained in the clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
for each data column, deleting the invalid column when the data column is determined to be the invalid column, and updating the semantic parsing result to optimize the target SQL statement;
the determining, based on the column type of each data column included in the clause to be optimized and the clause type of the clause to be optimized, at least one invalid column included in the clause to be optimized from each data column includes:
sequentially selecting each data column contained in the clause to be optimized from a storage structure corresponding to the clause to be optimized according to the semantic analysis sequence as a current column;
determining at least one invalid column contained in the clause to be optimized from each data column based on the column type of the current column and the clause type of the clause to be optimized, wherein the determining comprises the following steps:
determining a column type of the current column based on a column name of the current column, the column type comprising: a direct constant column, an indirect constant column, and a repeated column, wherein the direct constant column is a data column whose column name itself is defined as a constant; the indirect constant is listed as column name from view or derivative table and is defined as constant in the view or derivative table; the repeated columns are data columns with the same column names and two or more column names appearing in one clause to be optimized;
if the current column is determined to belong to a direct constant column, determining that the current column is an invalid column contained in the clause to be optimized;
if the current column is determined to belong to an indirect constant column, determining at least one invalid column contained in the clause to be optimized from each data column based on the clause type of the clause to be optimized;
and if the current column is determined to belong to the repeated column, determining the data column repeated with the current column in the clause to be optimized as an invalid column.
2. The method according to claim 1, wherein determining at least one clause to be optimized and at least one data column corresponding to each clause to be optimized included in the target SQL statement based on the semantic analysis result of the target structured query language SQL statement comprises:
based on a semantic analysis result of a target Structured Query Language (SQL) sentence, obtaining at least one grammar clause contained in the target SQL sentence and at least one data column correspondingly contained in each grammar clause;
storing each grammar clause and each data column correspondingly contained into a corresponding storage structure according to a semantic analysis sequence, and determining the grammar clause with the clause type of a sorting clause or a grouping clause as a clause to be optimized.
3. The method of claim 1, wherein the determining, based on the clause type of the clause to be optimized, at least one invalid column included in the clause to be optimized from each of the data columns includes:
when the clause to be optimized is a sorting clause, determining the current column as an invalid column contained in the clause to be optimized;
when the clause to be optimized is a grouping clause, traversing and checking storage structures corresponding to other grammar clauses except the clause to be optimized in the target SQL statement;
and if the storage structures corresponding to the other grammar clauses do not contain the current column, determining the current column as an invalid column contained in the clause to be optimized.
4. The method of claim 1, wherein the determining that the current column belongs to a duplicate column comprises:
comparing the current column with the data columns positioned behind the current column in the storage structure in sequence;
and when the same data column as the current column exists in the storage structure, determining that the current column belongs to a repeated column.
5. The method of any of claims 1-4, wherein the updating the semantic parsing result comprises:
determining the remaining data columns corresponding to the clause to be optimized after deleting the invalid columns as the data columns contained in the clause to be optimized so as to update a data column set formed by the data columns contained in the clause to be optimized;
and determining the semantic analysis result containing the updated data column set as a new semantic analysis result.
6. A database SQL statement optimization apparatus, comprising:
the clause determining module is used for determining at least one clause to be optimized contained in the target SQL statement and at least one data column correspondingly contained in each clause to be optimized based on the semantic analysis result of the target structured query language SQL statement;
the system comprises an invalid column determining module, a data column determining module and a data column determining module, wherein the invalid column determining module is used for determining at least one invalid column contained in each clause to be optimized from the data columns based on the column type of each data column contained in the clause to be optimized and the clause type of the clause to be optimized;
the statement optimization module is used for deleting the invalid columns and updating the semantic analysis results to optimize the target SQL statement when determining that the data columns are invalid columns for each data column;
the invalid column determining module includes:
the current column determining unit is used for sequentially selecting each data column contained in the clause to be optimized from the storage structure corresponding to the clause to be optimized according to the semantic analysis sequence as a current column;
an invalid column determining unit, configured to determine at least one invalid column included in the clause to be optimized from each data column based on a column type of the current column and a clause type of the clause to be optimized;
the invalid column determining unit includes:
an attribute determination subunit, configured to determine a column type of the current column based on a column name of the current column, where the column type includes: a direct constant column, an indirect constant column, and a repeated column, wherein the direct constant column is a data column whose column name itself is defined as a constant; the indirect constant is listed as column name from view or derivative table and is defined as constant in the view or derivative table; the repeated columns are data columns with the same column names and two or more column names appearing in one clause to be optimized;
a first determining subunit, configured to determine, if it is determined that the current column belongs to a direct constant column, that the current column is an invalid column included in the clause to be optimized;
a second determining subunit, configured to determine, if it is determined that the current column belongs to an indirect constant column, at least one invalid column included in the clause to be optimized from each data column based on a clause type of the clause to be optimized;
and the third determining subunit is used for determining the data column repeated with the current column in the clause to be optimized as an invalid column if the current column is determined to belong to the repeated column.
7. A database SQL statement optimization apparatus, comprising:
one or more processors;
a storage means for storing one or more programs;
the one or more programs are executed by the one or more processors to cause the one or more processors to implement the database SQL statement optimization method of any one of claims 1-5.
8. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the database SQL statement optimization method of any one of claims 1-5.
CN201911257845.8A 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium Active CN110968593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911257845.8A CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911257845.8A CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110968593A CN110968593A (en) 2020-04-07
CN110968593B true CN110968593B (en) 2023-10-03

Family

ID=70033484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911257845.8A Active CN110968593B (en) 2019-12-10 2019-12-10 Database SQL statement optimization method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110968593B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737341B (en) * 2020-04-23 2024-02-09 武汉达梦数据库股份有限公司 Insertion operation synchronization optimization method and data synchronization system based on log analysis
CN112100198B (en) * 2020-09-09 2023-12-08 上海达梦数据库有限公司 Database SQL statement optimization method, device, equipment and storage medium
CN112199387A (en) * 2020-11-16 2021-01-08 广州速威智能系统科技有限公司 Method and system for processing database query statement
CN113064925A (en) * 2021-03-15 2021-07-02 深圳依时货拉拉科技有限公司 Big data query method, system and computer readable storage medium
CN113268494B (en) * 2021-05-24 2023-06-02 中国联合网络通信集团有限公司 Method and device for processing database statement to be optimized
CN114118055B (en) * 2021-12-06 2024-05-10 中电万维信息技术有限责任公司 Optimized analysis method for complex SQL (structured query language) statement

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339770B1 (en) * 1999-08-12 2002-01-15 International Business Machines Corporation Query simplification and optimization involving eliminating grouping column from group by operation corresponds to group by item that is constant
CN101021874A (en) * 2007-03-21 2007-08-22 金蝶软件(中国)有限公司 Method and apparatus for optimizing request to poll SQL
CN106611044A (en) * 2016-12-02 2017-05-03 星环信息科技(上海)有限公司 SQL optimization method and device
KR101737578B1 (en) * 2015-11-27 2017-05-18 한국비앤에스시스템 주식회사 Method and device for automatically tuning for sql sentences generated automatically
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
CN108304505A (en) * 2018-01-18 2018-07-20 上海达梦数据库有限公司 A kind of processing method of SQL statement, device, server and storage medium
CN108628975A (en) * 2018-04-25 2018-10-09 上海达梦数据库有限公司 A kind of database SQL optimization and device
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109947791A (en) * 2019-03-27 2019-06-28 上海达梦数据库有限公司 A kind of database statement optimization method, device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7720840B2 (en) * 2004-01-08 2010-05-18 International Business Machines Corporation Method applying transitive closure to group by and order by clauses
US9208214B2 (en) * 2013-03-15 2015-12-08 International Business Machines Corporation Flexible column selection in relational databases
US9800644B2 (en) * 2014-08-08 2017-10-24 Software Ag Service oriented query and service query language framework
US10127277B2 (en) * 2015-07-31 2018-11-13 International Business Machines Corporation Outer join optimizations in database management systems
US10303683B2 (en) * 2016-10-05 2019-05-28 International Business Machines Corporation Translation of natural language questions and requests to a structured query format
CN107315790B (en) * 2017-06-14 2021-07-06 腾讯科技(深圳)有限公司 Optimization method and device for non-relevant sub-queries
CN110399377A (en) * 2019-08-30 2019-11-01 北京东软望海科技有限公司 Optimization method, device, electronic equipment and the computer readable storage medium of SQL

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339770B1 (en) * 1999-08-12 2002-01-15 International Business Machines Corporation Query simplification and optimization involving eliminating grouping column from group by operation corresponds to group by item that is constant
CN101021874A (en) * 2007-03-21 2007-08-22 金蝶软件(中国)有限公司 Method and apparatus for optimizing request to poll SQL
KR101737578B1 (en) * 2015-11-27 2017-05-18 한국비앤에스시스템 주식회사 Method and device for automatically tuning for sql sentences generated automatically
CN106611044A (en) * 2016-12-02 2017-05-03 星环信息科技(上海)有限公司 SQL optimization method and device
CN107704511A (en) * 2017-08-31 2018-02-16 星环信息科技(上海)有限公司 A kind of SQL optimization methods and equipment
CN108304505A (en) * 2018-01-18 2018-07-20 上海达梦数据库有限公司 A kind of processing method of SQL statement, device, server and storage medium
CN108628975A (en) * 2018-04-25 2018-10-09 上海达梦数据库有限公司 A kind of database SQL optimization and device
CN109766355A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 A kind of data query method and system for supporting natural language
CN109947791A (en) * 2019-03-27 2019-06-28 上海达梦数据库有限公司 A kind of database statement optimization method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110968593A (en) 2020-04-07

Similar Documents

Publication Publication Date Title
CN110968593B (en) Database SQL statement optimization method, device, equipment and storage medium
KR102407510B1 (en) Method, apparatus, device and medium for storing and querying data
US8396852B2 (en) Evaluating execution plan changes after a wakeup threshold time
CN107943952B (en) Method for realizing full-text retrieval based on Spark framework
CN107783985B (en) Distributed database query method, device and management system
CN110909016B (en) Repeated association detection method, device, equipment and storage medium based on database
JP6198845B2 (en) Active database query maintenance
CN112100198B (en) Database SQL statement optimization method, device, equipment and storage medium
CN109299101B (en) Data retrieval method, device, server and storage medium
CN112162983A (en) Database index suggestion processing method, device, medium and electronic equipment
CN110362593B (en) Data query method, device, equipment and storage medium
CN111309753A (en) Method, device and equipment for optimizing structured query statement and storage medium
CN110990423B (en) SQL statement execution method, device, equipment and storage medium
CN116049232A (en) Sub-query extraction method, sub-query extraction device, electronic equipment and storage medium
CN112835905B (en) Array type column indexing method, device, equipment and storage medium
CN113536058A (en) Spatial index modification method, device, equipment and storage medium
US11048692B2 (en) Partition move in case of table update
CN108984720B (en) Data query method and device based on column storage, server and storage medium
CN110895529B (en) Processing method of structured query language and related device
US8738600B2 (en) String searches in a computer database
JP4091586B2 (en) Structured document management system, index construction method and program
CN110263055B (en) Parameter prompting method, device, equipment and storage medium
CN113204568A (en) Method, device, equipment and storage medium for optimizing structured query statement
CN116974845A (en) Table blood edge relation analysis method, apparatus, device and computer readable storage medium
CN115878661A (en) Query method, query device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant