WO2024093693A1 - 一种sql语句的优化执行方法、装置、电子设备及介质 - Google Patents

一种sql语句的优化执行方法、装置、电子设备及介质 Download PDF

Info

Publication number
WO2024093693A1
WO2024093693A1 PCT/CN2023/125687 CN2023125687W WO2024093693A1 WO 2024093693 A1 WO2024093693 A1 WO 2024093693A1 CN 2023125687 W CN2023125687 W CN 2023125687W WO 2024093693 A1 WO2024093693 A1 WO 2024093693A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
statement
parameter
sql
parameters
Prior art date
Application number
PCT/CN2023/125687
Other languages
English (en)
French (fr)
Inventor
魏宇臣
王国平
杜开展
Original Assignee
北京奥星贝斯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京奥星贝斯科技有限公司 filed Critical 北京奥星贝斯科技有限公司
Publication of WO2024093693A1 publication Critical patent/WO2024093693A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models

Definitions

  • This document belongs to the field of data processing technology, and in particular relates to a method, device, electronic device and medium for optimizing the execution of SQL statements.
  • SQL Structured Query Language
  • MSE MySQL's Multiple Statement Execution
  • the embodiments of this specification provide a method, device, electronic device and medium for optimizing the execution of SQL statements, which can avoid excessive overhead when executing SQL statements in batches, thereby improving the performance of executing SQL statements in batches in a relational database.
  • a method for optimizing the execution of SQL statements including: generating a first parameter view based on parameters of multiple original SQL statements, the multiple original SQL statements have the same grammatical structure and are all used to execute a target change operation on a target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement; rewriting the multiple original SQL statements into target SQL statements for performing a target change operation on the target table, and setting the parameters of the target SQL statement to reference parameters corresponding to the records in the first parameter view; generating and executing an execution plan for the target SQL statement, wherein the execution plan is used to call the parameters corresponding to the records in the first parameter view to perform the target change operation on the target table.
  • an optimized execution device for SQL statements including: a view generation module, based on multiple The parameters of the original SQL statement generate a first parameter view, the multiple original SQL statements have the same syntax structure and are all used to perform a target change operation on a target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement; a statement rewriting module rewrites the multiple original SQL statements into a target SQL statement for performing a target change operation on the target table, and the parameters of the target SQL statement are set to reference parameters corresponding to the records in the first parameter view; a statement execution module generates and executes an execution plan of the target SQL statement, wherein the execution plan is used to call the parameters corresponding to the records in the first parameter view to perform the target change operation on the target table.
  • an electronic device comprising: a processor; and a memory arranged to store computer-executable instructions, wherein the executable instructions, when executed, cause the processor to perform the following operations: generate a first parameter view based on parameters of multiple original SQL statements, wherein the multiple original SQL statements have the same grammatical structure and are all used to perform a target change operation on a target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement; rewrite the multiple original SQL statements into target SQL statements for performing a target change operation on the target table, and the parameters of the target SQL statements are set to parameters corresponding to the records in the first parameter view; generate and execute an execution plan for the target SQL statement, wherein the execution plan is used to call the parameters corresponding to the records in the first parameter view to perform the target change operation on the target table.
  • a computer-readable storage medium which stores one or more programs.
  • the electronic device When the one or more programs are executed by an electronic device including multiple application programs, the electronic device performs the following operations: generates a first parameter view based on parameters of multiple original SQL statements, the multiple original SQL statements have the same grammatical structure and are all used to perform a target change operation on a target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement; rewrites the multiple original SQL statements into target SQL statements for performing a target change operation on the target table, and the parameters of the target SQL statements are set to parameters corresponding to the records in the first parameter view; generates and executes an execution plan for the target SQL statement, wherein the execution plan is used to call the parameters corresponding to the records in the first parameter view to perform the target change operation on the target table.
  • the solution of the embodiment of this specification extracts the parameters of each SQL statement to generate a parameter view, where each record in the parameter view corresponds to the parameters of an original SQL statement.
  • the multiple SQL statements are rewritten into a target SQL statement for the same type of change operation, and the parameters of the target SQL statement are set to reference the parameters in the parameter view, so as to generate an execution plan for the change operation on the target table by calling the parameters of each record in the parameter view according to the target SQL statement.
  • the change operation to be performed by the original multiple SQL statements is equivalently executed with one execution plan, so that there is no need to repeat the execution plan.
  • the execution order of multiple SQL statements is taken into consideration, thereby eliminating the context overhead and significantly improving the performance of batch execution of SQL statements.
  • FIG. 1 is a schematic diagram of a first flow chart of a method for optimizing execution of SQL statements provided in an embodiment of this specification.
  • FIG. 2 is a schematic diagram of an execution plan of a target SQL statement optimized by the optimization execution method according to an embodiment of this specification.
  • FIG. 3 is another schematic diagram of an execution plan of a target SQL statement optimized by the optimization execution method according to an embodiment of this specification.
  • FIG. 4 is a schematic diagram of the structure of an optimized execution device for SQL statements provided in an embodiment of this specification.
  • FIG. 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this specification.
  • Multiple Statement Execution defines writing multiple SQL statements in the same string to request the relational database to execute them in batches.
  • the SQL statements in the string are separated by semicolons ";”.
  • the update statement is a SQL statement that updates a table.
  • Multiple Statement Execution request contains multiple update statements.
  • the normal execution process of Multiple Statement Execution is to execute each update statement serially in the order in which they are arranged in the request. Therefore, for the update statement executed later, in theory, the incremental modification of the first executed statement can be seen, and the incremental modification can continue to be superimposed based on the executed results. Its execution process is an alternating mode of "read-write-read-write". Since the execution order of the update statement needs to be strictly reflected, a large amount of context overhead will be generated.
  • Multiple Statement Execution can request multiple SQL statements to a relational database at one time, thereby reducing the interaction cost between the client and the database, there is no performance optimization in the execution stage.
  • this document aims to provide an optimized execution solution for SQL statements, which can rewrite multiple SQL statements that perform the same change operation on the target table into one SQL statement for execution.
  • the original multiple SQL statements are integrated into one execution plan, making the execution process a "read-read-write-write" mode that does not strictly require the execution order.
  • This mode not only avoids the large number of transactions caused by single row-by-row execution, but also does not need to consider the execution order between different SQL statements, so it does not generate a lot of context overhead, thereby greatly improving the performance of batch execution of SQL statements in relational databases.
  • FIG. 1 is a flow chart of a method for optimizing the execution of SQL statements according to an embodiment of the present specification.
  • the method shown in FIG. 1 can be executed by corresponding devices as described below, and the specific steps include S102 to S106.
  • the target change operation can be any of an update operation based on an update statement, a delete operation based on a delete statement, and an insert operation based on an insert statement.
  • the syntax of the delete statement is: delete from table name [where matching field value], which means that the where parameter of the matching field value in the target table is the parameter of the delete statement.
  • the syntax of the insert statement is: insert [into] ⁇ table name> [column name] values ⁇ column value>, which means that the values parameter of the inserted field value in the target table is the parameter of the insert statement.
  • This step is to extract parameters as variables from multiple original SQL statements to generate a first parameter view.
  • each record of the first parameter view in this specification corresponds to an exclusive ID stmt_id to represent the corresponding original SQL statement.
  • the structure of the corresponding first parameter view v1 is:
  • the first parameter view records the corresponding parameters for each update statement in the Multiple Statement Execution request.
  • the execution plan is the specific execution process of the SQL statement, which mainly reflects: the order of table reading, the type of data reading operation, which indexes are used, references between tables, operators in each link, etc.
  • the target SQL statement in this specification needs to perform target change operations on the target table according to the corresponding execution plan.
  • the embodiment of this specification executes the execution plan of the target SQL statement to perform each original The target change operation corresponding to the SQL statement.
  • the method of the embodiment of the present specification extracts the parameters of each SQL statement to generate a parameter view, wherein each record in the parameter view corresponds to the parameters of an original SQL statement.
  • the multiple SQL statements are rewritten into a target SQL statement of the same type of change operation, and the parameters of the target SQL statement are set to reference the parameters in the parameter view, thereby generating an execution plan for calling the parameters of each record in the parameter view according to the target SQL statement, and performing the change operation on the target table.
  • the change operation to be performed by the original multiple SQL statements is equivalently executed with one execution plan, so there is no need to consider the execution order between the original multiple SQL statements, thereby eliminating the context overhead, and greatly improving the performance of batch execution of SQL statements.
  • the execution plan of Batch is equivalently rewritten into a general form of join+update/delete between a specially defined internal parameter view and the updated table.
  • the left branch of the join is the parameter view
  • the right branch of the join is the query of the DML statement itself.
  • the execution process is a "read-read-write-write" execution mode that does not strictly require an execution order. If multiple original SQL statements have an incremental modification relationship for the target table, the "read-read-write-write" execution mode cannot achieve correct incremental modification of the target table.
  • the embodiments of this specification may perform syntax analysis on the multiple original SQL statements to determine whether the multiple original SQL statements meet the SQL merge rewrite conditions.
  • the SQL merge rewrite conditions include at least one of the following: there is no update operation on the predicate column in the multiple original SQL statements; there is no multiple change operations on the same record in the multiple original SQL statements; there is no inconsistency in the corresponding parameter types in the multiple original SQL statements.
  • the predicate column in the embodiment of this specification refers to the query field in the where statement.
  • the embodiment of the present specification optimizes multiple original SQL statements to be executed by an execution plan of a target SQL statement.
  • the execution plan of the target SQL statement if the target change operation fails for any parameter recorded in the first parameter view (for example, a foreign key constraint violation occurs), the target SQL statement needs to be rolled back.
  • the target SQL statement fails to execute, the target The execution operation of the SQL statement is rolled back, and then the other original SQL statements except the original SQL statements corresponding to the records that failed to be executed among the multiple original SQL statements are executed one by one.
  • the embodiment of the specification may also record the row identifier of the record that failed to execute in the first parameter view, and roll back the execution operation of the target SQL statement. Afterwards, a second parameter view is generated based on the records before the row identifier of the record that failed to execute in the first parameter view, and a third parameter view is generated based on the records after the row identifier of the record that failed to execute in the first parameter view.
  • the parameters of the target SQL statement are set to reference the parameters corresponding to the records in the second parameter view, and an execution plan for the target SQL statement is generated and executed; and the parameters of the target SQL statement are set to reference the parameters corresponding to the records in the third parameter view, and an execution plan for the target SQL statement is generated and executed.
  • the target records that failed to execute in the first parameter view are extracted, and the records before the target record in the first parameter view are merged into an execution plan of a target SQL statement for unified execution; the records after the target record in the first parameter view are merged into an execution plan of a target SQL statement for unified execution.
  • the row identifier of the record in the embodiment of this specification can be used to represent the order of the record in the parameter view.
  • the order of the parameters of multiple original SQL statements can be used as the row identifier, etc.
  • additional field values can also be added as row identifiers, and this embodiment of this specification does not limit this.
  • stmt_id in the parameter view is used to represent the row identifier.
  • the parameters of the target SQL statement are set to the parameters corresponding to the records in the third parameter view v2, and an execution plan for the target SQL statement is generated and executed.
  • the embodiments of this specification can record the execution information of the corresponding target change operation during the execution of the target SQL statement, and after the execution plan of the target SQL statement is executed, the execution information corresponding to each record in the first parameter view is summarized and uniformly returned.
  • step S106 can be implemented as follows: if the target change operation is an update operation based on an update statement, a join operator of a first parameter view and a target table is generated, the left branch of the join operator is the first parameter view, and the right branch of the join operator is a query of the target table; the parameters of the left branch parameter view of the join operator are traversed to perform an update operation on the target table to drive the query of the right branch of the join.
  • step S106 can be implemented as follows: if the target change operation is a delete operation based on a delete statement, a join operator of the first parameter view and the target table is generated, the left branch of the join operator is the first parameter view, and the right branch of the join operator is a query of the target table; the parameters of the left branch parameter view of the join operator are traversed to perform a delete operation on the target table to drive the query of the right branch of the join.
  • the following introduces the execution plan by taking the target update statement as the target SQL statement as an example.
  • the execution plan of the target update statement in this specification adopts MySQL's Nested Loop Join algorithm to unify the execution plan of each original update statement.
  • FIG2 is a diagram of the execution plan structure corresponding to the target update statement.
  • the execution plan is equivalently rewritten into a general form of join+update/delete between a specially defined internal parameter view and the updated table.
  • the left branch of the join is the parameter view
  • the right branch of the join is the query of the DML statement itself.
  • the query of the right branch of the join is driven, so as to achieve the effect of executing multiple groups of parameters.
  • the Table Update operator, the Nested Loop Join operator and the ObExprValues operator can be deployed.
  • the ObExprValues operator is responsible for outputting the parameters in the parameter view v1 row by row (one row corresponds to one group) to the Nested Loop Join operator (each row corresponds to one group). It should be noted that each group of parameters in the parameter view v1 has a corresponding stmt_id number.
  • the ObExprValues operator outputs the variable parameters in the parameter view v1, it is also necessary to associate the output group identifier v1.stmt_id to maintain the mapping relationship of the groups.
  • the Nested Loop Join operator drives the Table Scan operator to scan data according to the parameters output by the ObExprValues operator, and outputs the scanned matching data to the Table Update operator to perform the update of t1. Similarly, when the Nested Loop Join operator outputs the scanned data, it also needs to associate the output group identifier v1.stmt_id to maintain the mapping relationship.
  • the target original SQL statement among the multiple original statements includes a query based on a join statement
  • the right branch of the join operator is the query based on the join statement.
  • a target original SQL statement among the multiple original update statements to be executed contains the above-mentioned query based on a join statement, as shown in Figure 3, it is necessary to set another target query based on a join statement under the Nested Loop Join operator of the execution plan of the target update statement.
  • the target query is associated with the parameter view v1 and the output parameters TSC(t1) and TSC(t2) of the target original SQL statement.
  • the Table Update operator When the Table Update operator updates table t1, it records the stmt_id that failed to update. When the target update is rolled back, the Table Update operator aggregates the stmt_id that was successfully updated before and calls the variable parameter again in batches to update table t1, and then calls each stmt_id that failed to update separately. The variable parameters of stmt_id are used to supplement and update table t1. In addition, after the call is completed, each variable parameter of stmt_id will be associated with stmt_id to record the execution information required for the return packet. When all variable parameters of stmt_id are called and executed, a unified return packet is made based on the execution information of each stmt_id.
  • FIG4 is a schematic diagram of an optimized execution device 400 for SQL statements provided in one embodiment of this specification, which includes the following modules.
  • the view generation module 410 generates a first parameter view based on the parameters of multiple original SQL statements, wherein the multiple original SQL statements have the same grammatical structure and are all used to execute the target change operation of the target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement.
  • the statement rewriting module 420 rewrites the multiple original SQL statements into target SQL statements for performing a target change operation on the target table, wherein the parameters of the target SQL statements are set to parameters corresponding to the records in the first parameter view.
  • the statement execution module 430 generates and executes an execution plan for the target SQL statement, wherein the execution plan is used to call the parameters corresponding to the records of the first parameter view to perform the target change operation on the target table.
  • the device of the embodiment of the present specification needs to batch execute multiple SQL statements of the same change operation type on the target table
  • the parameters of each SQL statement are extracted to generate a parameter view, wherein each record in the parameter view corresponds to the parameters of an original SQL statement.
  • the multiple SQL statements are rewritten into a target SQL statement of the same type of change operation, and the parameters of the target SQL statement are set to reference the parameters in the parameter view, thereby generating an execution plan for calling the parameters of each record in the parameter view according to the target SQL statement to perform the change operation on the target table.
  • the change operations to be performed by the original multiple SQL statements are equivalently executed with one execution plan, so there is no need to consider the execution order between the original multiple SQL statements, thereby eliminating the context overhead, and greatly improving the performance of batch execution of SQL statements.
  • the target change operation is an update operation based on an update statement
  • the parameters corresponding to each group include: a set parameter representing the updated field value in the target table and a where parameter representing the matching field value in the target table; or, the target change operation is a delete operation based on a delete statement, and the parameters corresponding to each group include: a where parameter representing the matching field value in the target table.
  • the view generation module 410 generates a first parameter view based on the parameters of multiple original SQL statements, including: performing grammatical analysis on the multiple original SQL statements to determine whether the multiple original SQL statements meet SQL merge rewrite conditions; if the SQL merge rewrite conditions are met, generating a first parameter view based on the parameters of the multiple original SQL statements.
  • the SQL merge rewrite condition includes at least one of the following: there is no update operation on the predicate column in the multiple original SQL statements; there is no multiple change operations on the same record in the multiple original SQL statements; there is no inconsistency in the corresponding parameter types in the multiple original SQL statements.
  • the statement execution module 430 rolls back the execution operation of the target SQL statement, and executes the multiple original SQL statements one by one.
  • the statement execution module 430 executes: recording the row identifier of the record that failed to execute in the first parameter view, and rolling back the execution operation of the target SQL statement; generating a second parameter view based on the records before the row identifier of the record that failed to execute in the first parameter view, and generating a third parameter view based on the records after the row identifier of the record that failed to execute in the first parameter view; setting the parameters of the target SQL statement to reference the parameters corresponding to the records in the second parameter view, generating and executing an execution plan for the target SQL statement; setting the parameters of the target SQL statement to reference the parameters corresponding to the records in the third parameter view, generating and executing an execution plan for the target SQL statement.
  • the statement execution module 430 is specifically used to: if the target change operation is an update operation based on an update statement, generate a join operator for the first parameter view and the target table, the left branch of the join operator is the first parameter view, and the right branch of the join operator is a query of the target table; traverse the parameters of the left branch parameter view of the join operator to perform an update operation on the target table to drive the query of the right branch of the join.
  • the statement execution module 430 is specifically used to: if the target change operation is a delete operation based on a delete statement, generate a join operator for the first parameter view and the target table, the left branch of the join operator is the first parameter view, and the right branch of the join operator is a query of the target table; traverse the parameters of the left branch parameter view of the join operator to perform a delete operation on the target table to drive the query of the right branch of the join.
  • the right branch of the join operator is the query based on the join statement.
  • the device of the embodiment of this specification also includes: a reporting execution module, which records the execution information corresponding to the target change operation during the execution of the target SQL statement; and after the execution plan of the target SQL statement is executed, the execution information corresponding to each record in the first parameter view is summarized and returned.
  • a reporting execution module which records the execution information corresponding to the target change operation during the execution of the target SQL statement; and after the execution plan of the target SQL statement is executed, the execution information corresponding to each record in the first parameter view is summarized and returned.
  • optimization execution device of the embodiment of this specification can implement the steps and functions of the embodiment shown in Figure 1, and will not be described in detail here.
  • FIG5 is a schematic diagram of the structure of an electronic device provided by an embodiment of the present specification.
  • the electronic device includes a processor, and optionally also includes an internal bus, a network interface, and a memory.
  • the memory may include a memory, such as a high-speed random access memory (RAM), and may also include a non-volatile memory (non-volatile memory), such as at least one disk storage, etc.
  • the electronic device may also include hardware required for other services.
  • the processor, the network interface and the memory may be interconnected via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, etc.
  • the bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one bidirectional arrow is used in FIG5 , but it does not mean that there is only one bus or one type of bus.
  • the memory is used to store the program.
  • the program may include a program code, and the program code includes a computer operation instruction.
  • the memory may include a memory and a non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, forming the optimization execution device shown in FIG4 at the logical level.
  • the processor executes the program stored in the memory and is specifically used to perform the following operations: generating a first parameter view based on parameters of multiple original SQL statements, wherein the multiple original SQL statements have the same grammatical structure and are all used to perform a target change operation of a target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement.
  • the multiple original SQL statements are rewritten into target SQL statements for performing a target change operation on the target table, and parameters of the target SQL statements are set to parameters corresponding to records referenced in the first parameter view.
  • An execution plan for the target SQL statement is generated and executed, wherein the execution plan is used to call parameters corresponding to the records of the first parameter view to perform the target change operation on the target table.
  • the electronic device of the embodiment of the present specification needs to batch execute multiple SQL statements of the same change operation type on the target table
  • the parameters of each SQL statement are extracted to generate a parameter view, wherein each record in the parameter view corresponds to the parameters of an original SQL statement.
  • the multiple SQL statements are rewritten into a target SQL statement of the same type of change operation, and the parameters of the target SQL statement are set to reference the parameters in the parameter view, thereby generating an execution plan for calling the parameters of each record in the parameter view according to the target SQL statement to perform the change operation on the target table.
  • the change operations to be performed by the original multiple SQL statements are equivalently executed with one execution plan, thereby There is no need to consider the execution order of multiple SQL statements, which saves context overhead and greatly improves the performance of batch execution of SQL statements.
  • the method disclosed in the embodiment shown in FIG. 1 of the present specification can be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software.
  • the above processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components.
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • the methods, steps and logic block diagrams disclosed in one or more embodiments of the present specification can be implemented or executed.
  • the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
  • the steps of the method disclosed in one or more embodiments of the present specification can be directly embodied as a hardware decoding processor for execution, or can be executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register, etc.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the electronic device of this specification does not exclude other implementation methods, such as logic devices or a combination of software and hardware, etc., that is to say, the execution subject of the following processing flow is not limited to each logic unit, but can also be hardware or logic devices.
  • the embodiments of this specification also provide a computer-readable storage medium, which stores one or more programs.
  • the one or more programs include instructions, which, when executed by a portable electronic device including multiple application programs, can enable the portable electronic device to execute the method of the embodiment shown in Figure 1, and are specifically used to perform the following operations: generate a first parameter view based on parameters of multiple original SQL statements, the multiple original SQL statements have the same grammatical structure and are all used to execute the target change operation of the target table, and each record of the first parameter view corresponds to a parameter of an original SQL statement.
  • the multiple original SQL statements are rewritten into target SQL statements for performing a target change operation on the target table, and parameters of the target SQL statements are set to parameters corresponding to records referenced in the first parameter view.
  • each SQL statement is used as a group, a parameter view recording the variable parameters corresponding to all groups is generated, and a target SQL statement for replacing multiple SQL statements is created based on the parameter view.
  • the target SQL statement uses the grammatical structure of the original SQL statement, but the variable parameters are set to reference the variable parameters in the parameter view, so that the execution plan of multiple SQL statements can be equivalently converted to be executed by one target SQL statement, so there is no need to consider the execution order corresponding to the original multiple SQL statements, thereby saving the context overhead, and greatly improving the performance of batch execution of SQL statements.
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • Computer readable media include permanent and non-permanent, removable and non-removable media that can be implemented by any method or technology to store information.
  • Information can be computer readable instructions, data structures, program modules or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices or any other non-transmission media that can be used to store information that can be accessed by a computing device.
  • computer readable media does not include temporary computer readable media (transitory media), such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本说明书公开了一种SQL语句的优化执行方法、装置、电子设备及介质。方法包括:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数。将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数。生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。

Description

一种SQL语句的优化执行方法、装置、电子设备及介质 技术领域
本文件属于数据处理技术领域,尤其涉及一种SQL语句的优化执行方法、装置、电子设备及介质。
背景技术
结构化查询语言(Structured Query Language,SQL)是常用的一种关系型数据库操作语言。为了减少客户端和数据库之间的RPC交互和执行上下文切换的代价,传统数据库都提供了各自的batch执行协议来优化其中的开销,例如,例如MySQL的多语句执行(Multiple Statement Execution,MSE)。
相关技术中,关系型数据库在执行多条SQL语句时,比较常见的是在内核进行逐条处理,每条SQL在内核单独处理的过程会存在多次的链路开销,导致执行性能并不理想。为此,如何能够避免SQL语句批量执行时产生过多的开销,从而提升关系型数据库批量执行SQL语句的性能,是本文所要解决的技术问题。
发明内容
本说明书实施例提供了一种SQL语句的优化执行方法、装置、电子设备及介质,能够避免SQL语句批量执行时产生过多的开销,从而提升关系型数据库批量执行SQL语句的性能。
为解决上述技术问题,本说明书实施例是这样实现的:第一方面,提出了一种SQL语句的优化执行方法,包括:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
第二方面,提出了一种SQL语句的优化执行装置,包括:视图生成模块,基于多条 原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;语句改写模块,将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;语句执行模块,生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
第三方面,提出了一种电子设备,包括:处理器;以及被安排成存储计算机可执行指令的存储器,所述可执行指令在被执行时使所述处理器执行以下操作:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
第四方面,提出了一种计算机可读存储介质,所述计算机可读存储介质存储一个或多条程序,所述一个或多条程序当被包括多条应用程序的电子设备执行时,使得所述电子设备执行以下操作:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
本说明书实施例的方案在需要对目标表批量执行多条相同变更操作类型的SQL语句时,提取每条SQL语句的参数以生成参数视图,其中,参数视图的每一条记录对应一条原始SQL语句的参数。之后,将多条SQL语句改写为一个同类型变更操作的目标SQL语句,该目标SQL语句的参数设定为引用参数视图中的参数,从而根据目标SQL语句生成调用参数视图中每个记录的参数,对目标表进行变更操作的执行计划。也就是说,以一个执行计划等价执行了原先多条SQL语句所要执行的变更操作,从而无需再 考虑原先多条SQL语句之间的执行先后顺序,进而省去了上下文的开销,可大幅提高SQL语句批量执行的性能。
附图说明
此处所说明的附图用来提供对本说明书的进一步理解,构成本说明书的一部分,本说明书的示意性实施例及其说明用于解释本说明书,并不构成对本说明书的不当限定。在附图中:
图1为本说明书实施例提供的SQL语句的优化执行方法的第一种流程示意图。
图2为基于本说明书实施例的优化执行方法所优化得到的目标SQL语句的执行计划示意图。
图3为基于本说明书实施例的优化执行方法所优化得到的目标SQL语句的另一执行计划示意图。
图4为本说明书实施例提供的SQL语句的优化执行装置的结构示意图。
图5为本说明书实施例提供的电子设备的结构示意图。
具体实施方式
为使本文件的目的、技术方案和优点更加清楚,下面将结合本说明书具体实施例及相应的附图对本说明书技术方案进行清楚、完整地描述。显然,所描述的实施例仅是本文件一部分实施例,而不是全部的实施例。基于本说明书中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本文件保护的范围。
如前所述,目前关系型数据库在执行多条SQL语句时,通常是在内核进行逐条处理,每条SQL在内核单独处理的过程会存在多次的链路开销,导致执行性能并不理想。
这里以MySQL的Multiple Statement Execution为示例。
目前Multiple Statement Execution定义将多条SQL语句写在同一个字符串里以请求关系型数据库进行批量执行。其中,字符串中的各SQL语句使用分号“;”进行分隔。
比如,某个Multiple Statement Execution请求如下:
create table t1(a int primary key,b int);
update t1 set b=2 where a=1;update t1 set b=3 where a=2;update t1 set b=4 where 
a=3;update t1 set b=5 where a=4;
其中,update语句是一种对表进行更新操作的SQL语句。update语句的语法结构是:“update表名称set列名称=更新字段值where列名称=匹配字段值。”比如,“update t1 set b=2 where a=1”是指在表t1中对应a=1的列,更新b=2值。
可以看出,上述Multiple Statement Execution请求中包含多条update语句,Multiple Statement Execution正常的执行流程是对每条update语句按照请求中排列先后顺序串行执行。因此对于后执行的update语句理论上可以看到先执行语句的增量修改,并且可以基于执行过的结果继续进行增量修改的叠加,其执行流程为“读-写-读-写”的交替模式,由于需要严格体现出update语句的执行顺序,因此会产生大量的上下文开销。虽然说Multiple Statement Execution能够一次性将多条SQL语句请求至关系型数据库,从而降低客户端和数据库的交互代价,但在执行阶段并没有性能上的优化。
针对上述问题,本文件旨在提供一种SQL语句的优化执行方案,能够将对目标表进行相同变更操作的多条SQL语句改写为为一条SQL语句执行,这样一来原先多条SQL语句被整合成一个执行计划,使得执行流程为“读-读-写-写”这种不严格要求执行顺序的模式。该模式不仅可以避免了单条逐行执行带来的大量事务,且不需要考虑不同SQL语句之间的执行顺序,因此不会产生大量的上下文开销,从而能够大幅度提升关系型数据库批量执行SQL语句的性能。
图1是本说明书的一个实施例SQL语句的优化执行方法的流程图,图1所示的方法可以由下文相对应的装置执行,具体步骤包括S102至S106。
S102,基于多条原始SQL语句的参数生成第一参数视图,多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,第一参数视图的每一条记录对应一条原始SQL语句的参数。
本说明书中,多条原始SQL语句均为待批量执行的SQL语句。目标变更操作可以是基于update语句的更新操作、基于delete语句的删除操作和基于insert语句的插入操作中的任意者。
需要说明的是,虽然每条原始SQL语句的语法结构相同,但语法结构中作为变量的参数并不一定相同。
比如,有下述两个待批量执行的update语句:
update语句1,update t1 set b=2 where a=1;
update语句2,update t1 set b=3 where a=2。
可以看出,这两个update语句的set值和where不值同,因此set和where即为变量的参数。
同理,delete语句的语法是:delete from表名[where匹配字段值],表示目标表中匹配字段值的where参数即为delete语句的参数。insert语句的语法是:insert[into]<表名>[列名]values<列值>,表示目标表中插入字段值的values参数即为insert语句的参数。
可以看出,同一类型的不同SQL语句相当于是通过设置不同的变量参数,来体现出不同的数据操作。本步骤是从多条原始SQL语句中提取作为变量的参数,以生成第一参数视图。
具体地,本说明书中第一参数视图的每一条记录对应有一个专属的ID stmt_id,以表征相对应的原始SQL语句。
这里沿用上文的Multiple Statement Execution请求为示例。假设Multiple Statement Execution请求中的字符串如下:
create table t1(a int primary key,b int);
update t1 set b=2 where a=1;update t1 set b=3 where a=2;update t1 set b=4 where 
a=3;update t1 set b=5 where a=4;
对应生成的第一参数视图v1的结构为:
其中:
stmt_id=0是update语句“update t1 set b=2 where a=1”的记录ID;
stmt_id=1是update语句“update t1 set b=3 where a=2”的记录ID;
stmt_id=2是update语句“update t1 set b=4 where a=3”的记录ID;
stmt_id=3是update语句“update t1 set b=5 where a=4”的记录ID。
可以看出,第一参数视图针对Multiple Statement Execution请求中的每个update语句记录了相对应的参数。
需要说明的是,上述引用的Multiple Statement Execution请求仅用于示例性介绍。实际应用中,需要批量执行的多条原始SQL语句可以通过其他方式提出请求,这里本文不作具体限定。
S104,将多条原始SQL语句改写为目标表进行目标变更操作的目标SQL语句,目标SQL语句的参数设定为引用第一参数视图中的记录对应的参数。
还是上文所述的Multiple Statement Execution请求为例,假设Multiple Statement Execution请求的字符串为:
create table t1(a int primary key,b int);
update t1 set b=2 where a=1;update t1 set b=3 where a=2;update t1 set b=4 where 
a=3;update t1 set b=5 where a=4;
按照上述第一参数视图v1的结构,字符串中的多条原始update语句可以改写为一个目标update语句,即:
update t1,v1 set t1.b=v1.b where t1.a=v1.a;
在目标update语句中,还是之前“update……,set……where……”语法结构的更新操作,但set设定了b字段的取值为调用View v1中b字段的取值,where设定了a字段的取值为调用View v1中a字段的取值。
需要说明的,将多条原始delete语句改写为一条目标delete语句,或者,将多条原始insert语句改写为一条目标insert语句的方式与update语句的改写方式相同,这里不再举例赘述。
S106,生成目标SQL语句的执行计划,执行计划调用参数视图中每个分组的参数,对目标表进行所述目标变更操作。
应理解,执行计划(Execution plan)是SQL语句的具体执行流程,主要体现出:表的读取顺序、数据读取操作的操作类型、哪些索引被使用、表之间的引用、每个环节的算子等信息。本说明书中的目标SQL语句需要按照对应的执行计划来对目标表进行目标变更操作。
本说明书实施例通过执行目标SQL语句的执行计划,以对目标表进行原先每条原始 SQL语句所对应的目标变更操作。
还是以目标update语句为“update t1,v1 set t1.b=v1.b where t1.a=v1.a”进行示例。
改写的后目标update语句具有以下执行分支:1)调用stmt_id=0的分组的变量参数,以基于“update t1 set b=2 where a=1”更新表格t1的数据;2)调用stmt_id=1的分组的变量参数,以基于“update t1 set b=3 where a=2”的分组标识;3)调用stmt_id=2的分组的变量参数,以基于“update t1 set b=4 where a=3”的分组标识;4)调用stmt_id=3的分组的变量参数,以基于“update t1 set b=5 where a=4”的分组标识。
综上所述,本说明书实施例的方法在需要对目标表批量执行多条相同变更操作类型的SQL语句时,提取每条SQL语句的参数以生成参数视图,其中,参数视图的每一条记录对应一条原始SQL语句的参数。之后,将多条SQL语句改写为一个同类型变更操作的目标SQL语句,该目标SQL语句的参数设定为引用参数视图中的参数,从而根据目标SQL语句生成调用参数视图中每个记录的参数,对目标表进行变更操作的执行计划。也就是说,以一个执行计划等价执行了原先多条SQL语句所要执行的变更操作,从而无需再考虑原先多条SQL语句之间的执行先后顺序,进而省去了上下文的开销,可大幅提高SQL语句批量执行的性能。
具体地,本说明书实施例中将Batch的执行计划等价改写成一种特殊定义的内部参数视图和被更新表之间的join+update/delete的通用形式,join的左支为参数视图,join的右支为dml语句本身的查询,通过遍历左支参数视图的参数,来驱动join右支的查询,以期达到多组参数全部被执行的效果。
此外,由于本说明书实施例是将多条原始SQL语句整合成一个执行计划,使得执行流程为“读-读-写-写”这种不严格要求执行顺序的执行模式。如果多条原始SQL语句存在针对目标表的增量修改关系,则“读-读-写-写”这种执行模式无法对目标表实现正确的增量修改。
例如下述Multiple Statement Execution请求中的字符串:
create table t_overlap(a int primary key,b int);
insert into t_overlap values(1,1);
update t_overlap set a=2 where a=1;update t_overlap set a=3 where a=2;update t_overlap 
set a=4 where a=3;//
其中,“update t_overlap set a=3 where a=2”是更新a=3的新值,而“update t_overlap set 
a=4 where a=3”更新位置是a=3。也就是说“update t_overlap set a=4 where a=3”要在“update t_overlap set a=3 where a=2”执行完成后再执行。
为此,当有多条原始SQL语句需要批量执行时,本说明书实施例可以对多条原始SQL语句进行语法分析,以确定多条原始SQL语句是否满足SQL合并改写条件。
其中,SQL合并改写条件包括如下至少一个:多条原始SQL语句中不存在对谓词列的更新操作;多条原始SQL语句中不存在对同一条记录进行多次变更操作的情况;所述多条原始SQL语句中不存在对应参数类型不一致的情况。
应理解,本说明书实施例的谓词列,是指where语句中的查询字段。多条原始SQL语句中不存在对谓词列的更新操作,也就是说,多条原始SQL语句的参数中不存在对SQL谓词列条件的修改。例如,update t1 set c1=10where c1=1;这样的SQL谓词列是c1,但是c1同时会被自身修改,这就不符合SQL合并改写条件。
应理解,多条原始SQL语句中不存在对同一条记录进行多次变更操作的情况,也就是说不会针对一个相同的记录进行多次变更操作(更新或删除)。例如,SQL1:update t1 set a1=10where c1=1;SQL2:update t1 set a1=5 where c1=1。这两个语句中,针对满足条件c1=1的记录针对字段a1进行了两次修改,这也不符合SQL合并改写条件。
应理解,所述多条原始SQL语句中不存在对应参数类型不一致的情况,是说任意两条原始SQL语句中相对应的参数的语法类型必须一致。例如,update t1 set a=1where b=1;update t1 set a=2 where b='2';这两条SQL语句违反常量类型一致,不能被优化。如果满足SQL合并改写条件,则决定对多条原始SQL语句进行改写,并开始基于多条原始SQL语句的参数生成第一参数视图。反之,如果不满足SQL合并改写条件,则可以逐条执行多条原始SQL语句。
此外,本说明书实施例是将多条原始SQL语句优化为由一个目标SQL语句的执行计划负责执行,在执行目标SQL语句的执行计划中,如果第一参数视图中有任意记录的参数进行目标变更操作失败(比如发生违反外键约束的问题),则需要对目标SQL语句回滚。
而一旦目标SQL语句回滚,则优化前的所有原始SQL语句都相当于是没有执行。显然,批量执行的原始SQL语句的数量越多,则目标SQL语句被回滚的概率越大,容易发生原始SQL语句的执行滞留的问题。
针对这一问题,本说明书实施例可以在目标SQL语句执行失败,先对对所述目标 SQL语句的执行操作进行回滚,之后,再逐条执行多条原始SQL语句中执行失败的记录对应的原始SQL语句以外的其他原始SQL语句。
或者,如果所述目标SQL语句执行失败,本说明书实施例也可以记录第一参数视图中执行失败的记录的行标识,并对目标SQL语句的执行操作进行回滚。之后,基于第一参数视图中执行失败的记录的行标识之前的记录生成第二参数视图,以及基于第一参数视图中执行失败的记录的行标识之后的记录生成第三参数视图。并将目标SQL语句的参数设定为引用第二参数视图中的记录对应的参数,生成并执目标SQL语句的执行计划;以及,并将目标SQL语句的参数设定为引用第三参数视图中的记录对应的参数,生成并执行目标SQL语句的执行计划。
也就是将第一参数视图中执行失败的目标记录抽取出来,第一参数视图中相对目标记录之前的记录合并成一个目标SQL语句的执行计划统一执行;对第一参数视图中相对目标记录之后的记录合并成一个目标SQL语句的执行计划统一执行。
应理解,本说明书实施例中的记录的行标识,可用于表示记录在参数视图中的顺序。例如,可以将多条原始SQL语句的参数的传入顺序作为行标识等等。当然,也可以添加额外的字段值作为行标识,本说明书实施例在此不作限制。在本说明书实施例中,用参数视图中的stmt_id来表示行标识。
作为示例性介绍,假设第一参数视图v1的结构为:
如果目标SQL语句执行过程中,调用stmt_id=3的参数对目标表进行目标变更操作并失败,则对stmt_id=3进行标记。
在目标update语句被回滚后,一方面,基于stmt_id=3之前的stmt_id=0、stmt_id=1和stmt_id=2生成如下而结构的第二参数视图V2:
并将目标SQL语句的参数设定为引用第三参数视图v2中的记录对应的参数,生成并执行目标SQL语句的执行计划。
另一方面,基于stmt_id=3之后的stmt_id=4和stmt_id=5生成如下而结构的第三参数视图V3:
对于stmt_id=3,则不再执行。
此外,本说明书实施例可以在在目标SQL语句的执行过程中,记录对应目标变更操作的执行信息,在目标SQL语句的执行计划执行结束后,将第一参数视图中每条记录对应的执行信息汇总后进行统一回包。
可选地,作为一个实施例,步骤S106可实现为:若所述目标变更操作为基于update语句的更新操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;遍历所述join算子左支参数视图的参数对目标表进行更新操作,以驱动所述join右支的查询。
或者,可选地,作为另一个实施例,步骤S106可实现为:若所述目标变更操作为基于delete语句的删除操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;遍历所述join算子左支参数视图的参数对目标表进行删除操作,以驱动所述join右支的查询。
下面对目标update语句作为目标SQL语句为示例,对执行计划进行介绍。
具体地,假设目标update语句为:update t1,v1 set t1.b=v1.b where t1.a=v1.a。本说明书目标update语句的执行计划采用MySQL的Nested Loop Join算法,来统一执行每个原始update语句各自的执行计划。
图2是该目标update语句对应的执行计划结构图。本说明书实施例中,可将Batch 的执行计划等价改写成一种特殊定义的内部参数视图和被更新表之间的join+update/delete的通用形式,join的左支为参数视图,join的右支为dml语句本身的查询,通过遍历左支参数视图的参数,来驱动join右支的查询,以期达到多组参数全部被执行的效果。具体地,如图2所示,可部署有Table Update算子、Nested Loop Join算子和ObExprValues算子。其中:ObExprValues算子负责把参数视图v1中的参数逐行(一行对应有一个分组)输出output给Nested Loop Join算子(每行对应有一个分组)。需要说明的是,参数视图v1中每组参数都有对应的stmt_id编号,ObExprValues算子在体输出参数视图v1中的变量参数时,还需要关联输出分组标识v1.stmt_id以保持分组的映射关系。
Nested Loop Join算子根据ObExprValues算子输出的参数,驱动Table Scan算子进行数据扫描,并将扫描到的匹配数据输出给Table Update算子执行t1的更新。同理,Nested Loop Join算子在输出扫描到的数据时,还要需要关联输出分组标识v1.stmt_id以保持映射关系。
进一步地,若所述多条原始语句中的目标原始SQL语句包含有基于join语句的查询,则所述join算子的右支为所述基于join语句的查询。
应理解,有些原始update语句自身也会存在基于join语句的查询,比如:
update t1
[inner join|left join]t2ON t1.id=t2.relation_id
set t1.column=t2.column
where condition;
这条update语句的大致意思是表格t1关联表格t2,然后根据表格t2中目标字段的数据更新表格t1中目标字段的数据。
如果待执行的多条原始update语句中有目标原始SQL语句包含有上述这类自身基于join语句的查询,则如图3所示,还需要在目标update语句的执行计划的Nested Loop Join算子下再设置一个基于join语句的目标查询,该目标查询关联有参数视图v1以及目标原始SQL语句的output参数TSC(t1)和TSC(t2)即可。
在Table Update算子对表格t1进行更新的过程中,Table Update算子会记录更新失败的stmt_id。当目标update被回滚后,Table Update算子将之前能更新成功的stmt_id汇总后重新批量调用变量参数以对表格t1进行更新,并再单独调用各个更新失败的 stmt_id的变量参数,对表格t1进行补充更新。此外,每一个stmt_id的变量参数在调用执行完成之后都会与stmt_id关联记录回包所需要的执行信息,当全部stmt_id的变量参数调用执行完成后,再根据每个stmt_id的执行信息进行统一回包。
同样地,本说明书实施例可以参考图2,将需要批量执行的多条原始delete语句改写为一个目标delete语句的执行计划。由于原理相同,仅需将图2、图3所示的Update操作替换为delete操作即可,这里不再举例赘述。
图4是本说明书一个实施例提供的SQL语句的优化执行装置400的示意图,包括以下模块。
视图生成模块410,基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数。
语句改写模块420,将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数。
语句执行模块430,生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
本说明书实施例的装置在需要对目标表批量执行多条相同变更操作类型的SQL语句时,提取每条SQL语句的参数以生成参数视图,其中,参数视图的每一条记录对应一条原始SQL语句的参数。之后,将多条SQL语句改写为一个同类型变更操作的目标SQL语句,该目标SQL语句的参数设定为引用参数视图中的参数,从而根据目标SQL语句生成调用参数视图中每个记录的参数,对目标表进行变更操作的执行计划。也就是说,以一个执行计划等价执行了原先多条SQL语句所要执行的变更操作,从而无需再考虑原先多条SQL语句之间的执行先后顺序,进而省去了上下文的开销,可大幅提高SQL语句批量执行的性能。
可选地,所述目标变更操作为基于update语句的更新操作,每个分组对应的参数包括:表示目标表中更新字段值的set参数和表示目标表中匹配字段值的where参数;或者,所述目标变更操作为基于delete语句的删除操作,每个分组对应的参数包括:表示目标表中匹配字段值的where参数。
可选地,视图生成模块410基于多条原始SQL语句的参数生成第一参数视图,包括:对所述多条原始SQL语句进行语法分析,以确定所述多条原始SQL语句是否满足SQL合并改写条件;若满足所述SQL合并改写条件,则基于所述多条原始SQL语句的参数生成第一参数视图。
其中,所述SQL合并改写条件包括如下至少一个:所述多条原始SQL语句中不存在对谓词列的更新操作;所述多条原始SQL语句中不存在对同一条记录进行多次变更操作的情况;所述多条原始SQL语句中不存在对应参数类型不一致的情况。
可选地,如果所述目标SQL语句执行失败,则语句执行模块430对所述目标SQL语句的执行操作进行回滚,并逐条执行所述多条原始SQL语句。
可选地,如果所述目标SQL语句执行失败,则语句执行模块430执行:记录所述第一参数视图中执行失败的记录的行标识,并对所述目标SQL语句的执行操作进行回滚;并基于所述第一参数视图中所述执行失败的记录的行标识之前的记录生成第二参数视图,以及基于所述第一参数视图中所述执行失败的记录的行标识之后的记录生成第三参数视图;将所述目标SQL语句的参数设定为引用所述第二参数视图中的记录对应的参数,生成并执行所述目标SQL语句的执行计划;将所述目标SQL语句的参数设定为引用所述第三参数视图中的记录对应的参数,生成并执行所述目标SQL语句的执行计划。
可选地,作为一个实施例,语句执行模块430,具体用于:若所述目标变更操作为基于update语句的更新操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;遍历所述join算子左支参数视图的参数对目标表进行更新操作,以驱动所述join右支的查询。
或者,可选地,作为另一个实施例,语句执行模块430,具体用于:若所述目标变更操作为基于delete语句的删除操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;遍历所述join算子左支参数视图的参数对目标表进行删除操作,以驱动所述join右支的查询。
可选地,若所述多条原始语句中的目标原始SQL语句包含有基于join语句的查询,则所述join算子的右支为所述基于join语句的查询。
可选地,本说明书实施例的装置还包括:汇报执行模块,在所述目标SQL语句的执行过程中,记录对应所述目标变更操作的执行信息;以及,在所述目标SQL语句的执行计划执行结束后,将所述第一参数视图中每条记录对应的执行信息汇总后进行回包。
显然,本说明书实施例的优化执行装置能够实现图1所示实施例中的步骤及功能,这里不再具体赘述。
图5是本说明书的一个实施例提供的电子设备的结构示意图。请参考图5,在硬件层面,该电子设备包括处理器,可选地还包括内部总线、网络接口、存储器。其中,存储器可能包含内存,例如高速随机存取存储器(Random-Access Memory,RAM),也可能还包括非易失性存储器(non-volatile memory),例如至少1个磁盘存储器等。当然,该电子设备还可能包括其他业务所需要的硬件。
处理器、网络接口和存储器可以通过内部总线相互连接,该内部总线可以是ISA(Industry Standard Architecture,工业标准体系结构)总线、PCI(Peripheral Component Interconnect,外设部件互连标准)总线或EISA(Extended Industry Standard Architecture,扩展工业标准结构)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图5中仅用一个双向箭头表示,但并不表示仅有一根总线或一种类型的总线。
存储器,用于存放程序。具体地,程序可以包括程序代码,所述程序代码包括计算机操作指令。存储器可以包括内存和非易失性存储器,并向处理器提供指令和数据。
其中,处理器从非易失性存储器中读取对应的计算机程序到内存中然后运行,在逻辑层面上形成上述图4所示的优化执行装置。处理器,执行存储器所存放的程序,并具体用于执行以下操作:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数。
将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数。
生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
本说明书实施例的电子设备在需要对目标表批量执行多条相同变更操作类型的SQL语句时,提取每条SQL语句的参数以生成参数视图,其中,参数视图的每一条记录对应一条原始SQL语句的参数。之后,将多条SQL语句改写为一个同类型变更操作的目标SQL语句,该目标SQL语句的参数设定为引用参数视图中的参数,从而根据目标SQL语句生成调用参数视图中每个记录的参数,对目标表进行变更操作的执行计划。也就是说,以一个执行计划等价执行了原先多条SQL语句所要执行的变更操作,从而 无需再考虑原先多条SQL语句之间的执行先后顺序,进而省去了上下文的开销,可大幅提高SQL语句批量执行的性能。
上述如本说明书图1所示实施例揭示的方法可以应用于处理器中,或者由处理器实现。处理器可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器,包括中央处理器(Central Processing Unit,CPU)、网络处理器(Network Processor,NP)等;还可以是数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本说明书一个或多条实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本说明书一个或多条实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
当然,除了软件实现方式之外,本说明书的电子设备并不排除其他实现方式,比如逻辑器件抑或软硬件结合的方式等等,也就是说以下处理流程的执行主体并不限定于各个逻辑单元,也可以是硬件或逻辑器件。
本说明书实施例还提出了一种计算机可读存储介质,该计算机可读存储介质存储一个或多条程序。
其中,该一个或多条程序包括指令,该指令当被包括多条应用程序的便携式电子设备执行时,能够使该便携式电子设备执行图1所示实施例的方法,并具体用于执行以下操作:基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数。
将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数。
生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述 第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
本说明书实施例的电子设备在需要批量执行多条SQL语句时,以每个SQL语句为分组,生成记录有所有分组对应的变量参数的参数视图,并基于参数视图创建一个用于替代多条SQL语句的目标SQL语句。其中,目标SQL语句沿用原先SQL语句的语法结构,但变量参数设定为引用参数视图中的变量参数,这样一来,多条SQL语句的执行计划可以等价转换成由一个目标SQL语句执行,从而无需再考虑原先多条SQL语句所对应的执行先后顺序,进而省去了上下文的开销,可大幅提高SQL语句批量执行的性能。
总之,以上所述仅为本说明书的较佳实施例而已,并非用于限定本说明书的保护范围。凡在本说明书一个或多条实施例的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本说明书一个或多条实施例的保护范围之内。
上述实施例阐明的系统、装置、模块或单元,具体可以由计算机芯片或实体实现,或者由具有某种功能的产品来实现。一种典型的实现设备为计算机。具体的,计算机例如可以为个人计算机、膝上型计算机、蜂窝电话、相机电话、智能电话、个人数字助理、媒体播放器、导航设备、电子邮件设备、游戏控制台、平板计算机、可穿戴设备或者这些设备中的任何设备的组合。
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
还需要说明的是,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。
本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于系统实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。

Claims (13)

  1. 一种SQL语句的优化执行方法,包括:
    基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;
    将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;
    生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
  2. 根据权利要求1所述的方法,
    所述目标变更操作为基于update语句的更新操作,每个分组对应的参数包括:表示目标表中更新字段值的set参数和表示目标表中匹配字段值的where参数;或者,
    所述目标变更操作为基于delete语句的删除操作,每个分组对应的参数包括:表示目标表中匹配字段值的where参数。
  3. 根据权利要求2所述的方法,生成并执行所述目标SQL语句的执行计划,包括:
    若所述目标变更操作为基于update语句的更新操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;
    遍历所述join算子左支参数视图的参数对目标表进行更新操作,以驱动所述join右支的查询。
  4. 根据权利要求2所述的方法,生成并执行所述目标SQL语句的执行计划,包括:
    若所述目标变更操作为基于delete语句的删除操作,生成第一参数视图和目标表的join算子,所述join算子的左支为所述第一参数视图,所述join算子的右支为所述目标表的查询;
    遍历所述join算子左支参数视图的参数对目标表进行删除操作,以驱动所述join右支的查询。
  5. 根据权利要求3或4所述的方法,还包括:
    若多条原始语句中的目标原始SQL语句包含有基于join语句的查询,则所述join算子的右支为所述基于join语句的查询。
  6. 根据权利要求1或2所述的方法,
    基于多条原始SQL语句的参数生成第一参数视图,包括:
    对所述多条原始SQL语句进行语法分析,以确定所述多条原始SQL语句是否满足SQL合并改写条件;
    若满足所述SQL合并改写条件,则基于所述多条原始SQL语句的参数生成第一参数视图。
  7. 根据权利要求6所述的方法,
    所述SQL合并改写条件包括如下至少一个:
    所述多条原始SQL语句中不存在对谓词列的更新操作;
    所述多条原始SQL语句中不存在对同一条记录进行多次变更操作的情况;
    所述多条原始SQL语句中不存在对应参数类型不一致的情况。
  8. 根据权利要求1或2所述的方法,还包括:
    如果所述目标SQL语句执行失败,则对所述目标SQL语句的执行操作进行回滚;
    逐条执行所述多条原始SQL语句。
  9. 根据权利要求1或2所述的方法,还包括:
    如果所述目标SQL语句执行失败,则记录所述第一参数视图中执行失败的记录的行标识,并对所述目标SQL语句的执行操作进行回滚;
    基于所述第一参数视图中所述执行失败的记录的行标识之前的记录生成第二参数视图,以及基于所述第一参数视图中所述执行失败的记录的行标识之后的记录生成第三参数视图;
    将所述目标SQL语句的参数设定为引用所述第二参数视图中的记录对应的参数,生成并执行所述目标SQL语句的执行计划;
    生成并执行所述执行失败的记录对应的原始SQL语句的执行计划;
    将所述目标SQL语句的参数设定为引用所述第三参数视图中的记录对应的参数,生成并执行所述目标SQL语句的执行计划。
  10. 根据权利要求1或2所述的方法,还包括:
    在所述目标SQL语句的执行过程中,记录对应所述目标变更操作的执行信息;以及,
    在所述目标SQL语句的执行计划执行结束后,将所述第一参数视图中每条记录对应的执行信息汇总后进行回包。
  11. 一种SQL语句的优化执行装置,包括:
    视图生成模块,基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视 图的每一条记录对应一条原始SQL语句的参数;
    语句改写模块,将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;
    语句执行模块,生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
  12. 一种电子设备,包括:处理器;以及被安排成存储计算机可执行指令的存储器,所述可执行指令在被执行时使所述处理器执行以下操作:
    基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;
    将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;
    生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
  13. 一种计算机可读存储介质,所述计算机可读存储介质存储一个或多条程序,所述一个或多条程序当被包括多条应用程序的电子设备执行时,使得所述电子设备执行以下操作:
    基于多条原始SQL语句的参数生成第一参数视图,所述多条原始SQL语句具有相同的语法结构,且均用于执行目标表的目标变更操作,所述第一参数视图的每一条记录对应一条原始SQL语句的参数;
    将所述多条原始SQL语句改写为所述目标表进行目标变更操作的目标SQL语句,所述目标SQL语句的参数设定为引用所述第一参数视图中的记录对应的参数;
    生成并执行所述目标SQL语句的执行计划,其中,所述执行计划用于调用所述第一参数视图的记录对应的参数,以对所述目标表进行所述目标变更操作。
PCT/CN2023/125687 2022-11-01 2023-10-20 一种sql语句的优化执行方法、装置、电子设备及介质 WO2024093693A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211358917.XA CN115687392A (zh) 2022-11-01 2022-11-01 一种sql语句的优化执行方法、装置、电子设备及介质
CN202211358917.X 2022-11-01

Publications (1)

Publication Number Publication Date
WO2024093693A1 true WO2024093693A1 (zh) 2024-05-10

Family

ID=85047255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/125687 WO2024093693A1 (zh) 2022-11-01 2023-10-20 一种sql语句的优化执行方法、装置、电子设备及介质

Country Status (2)

Country Link
CN (1) CN115687392A (zh)
WO (1) WO2024093693A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115687392A (zh) * 2022-11-01 2023-02-03 北京奥星贝斯科技有限公司 一种sql语句的优化执行方法、装置、电子设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030200204A1 (en) * 2002-04-19 2003-10-23 Limoges Joseph Serge Substituting parameter markers for literals in database query language statement to promote reuse of previously generated access plans
CN101021875A (zh) * 2007-03-22 2007-08-22 金蝶软件(中国)有限公司 面向对象的数据库访问方法及系统
CN113886419A (zh) * 2021-09-29 2022-01-04 平安医疗健康管理股份有限公司 Sql语句处理方法、装置、计算机设备及存储介质
CN114969101A (zh) * 2022-07-13 2022-08-30 北京奥星贝斯科技有限公司 Sql语句的处理方法及装置
CN115687392A (zh) * 2022-11-01 2023-02-03 北京奥星贝斯科技有限公司 一种sql语句的优化执行方法、装置、电子设备及介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030200204A1 (en) * 2002-04-19 2003-10-23 Limoges Joseph Serge Substituting parameter markers for literals in database query language statement to promote reuse of previously generated access plans
CN101021875A (zh) * 2007-03-22 2007-08-22 金蝶软件(中国)有限公司 面向对象的数据库访问方法及系统
CN113886419A (zh) * 2021-09-29 2022-01-04 平安医疗健康管理股份有限公司 Sql语句处理方法、装置、计算机设备及存储介质
CN114969101A (zh) * 2022-07-13 2022-08-30 北京奥星贝斯科技有限公司 Sql语句的处理方法及装置
CN115687392A (zh) * 2022-11-01 2023-02-03 北京奥星贝斯科技有限公司 一种sql语句的优化执行方法、装置、电子设备及介质

Also Published As

Publication number Publication date
CN115687392A (zh) 2023-02-03

Similar Documents

Publication Publication Date Title
WO2020228801A1 (zh) 一种多语言融合查询方法及多模数据库系统
US10776349B2 (en) Systems, methods, and apparatuses for implementing dynamic macros within a multi-tenant aware structured query language
WO2024093693A1 (zh) 一种sql语句的优化执行方法、装置、电子设备及介质
US10649986B2 (en) Systems, methods, and apparatuses for implementing a BY ORGID command term within a multi-tenant aware structured query language
US11354284B2 (en) System and method for migration of a legacy datastore
US10803062B2 (en) Systems, methods, and apparatuses for implementing a by partition command term within a multi-tenant aware structured query language
US10649987B2 (en) Systems, methods, and apparatuses for implementing conditional statement execution within a multi-tenant aware structured query language
US8396852B2 (en) Evaluating execution plan changes after a wakeup threshold time
US8918388B1 (en) Custom data warehouse on top of mapreduce
US20170083573A1 (en) Multi-query optimization
US11693912B2 (en) Adapting database queries for data virtualization over combined database stores
US20140214897A1 (en) SYSTEMS AND METHODS FOR ACCESSING A NoSQL DATABASE USING BUSINESS INTELLIGENCE TOOLS
US20040019587A1 (en) Method and device for processing a query in a database management system
US20240045860A1 (en) Data query method and system, heterogeneous acceleration platform, and storage medium
US8667010B2 (en) Database table partitioning allowing overlaps used in full text query
US5481703A (en) Database restructuring system for detecting functionally dependent relations and converting them into third normal form
US20120096054A1 (en) Reading rows from memory prior to reading rows from secondary storage
US7275051B2 (en) Method and system for reducing host variable impact on access path selection
US11308060B2 (en) Method, apparatus, device and storage medium for managing index
US9047354B2 (en) Statement categorization and normalization
CN112434003A (zh) 一种sql优化方法、装置、计算机设备及存储介质
WO2020224498A1 (zh) 基于联盟链的关系型数据库及其操作方法和装置
US20240045868A1 (en) Method and device for accelerating database operation
CN107622070B (zh) 一种数据库管理方法及装置
WO2022140936A1 (zh) 区块链中数据查询工作量的计量方法、装置及终端