WO2021169322A1

WO2021169322A1 - Execution plan processing method, device, and system

Info

Publication number: WO2021169322A1
Application number: PCT/CN2020/121193
Authority: WO
Inventors: 赵俊
Original assignee: 华为技术有限公司
Priority date: 2020-02-27
Filing date: 2020-10-15
Publication date: 2021-09-02
Also published as: CN113312371A

Abstract

An execution plan processing method, device, and system, relating to the field of databases. The method comprises: executing a first execution plan of a structured query language (SQL) statement; when the performance of the first execution plan of the SQL statement is degraded, obtaining a second execution plan of the SQL statement; and executing the second execution plan in place of the first execution plan. The method is used for executing an execution plan in a database, thereby reducing the influence of a degraded execution plan on the database.

Description

Method, equipment and system for processing execution plan

This application claims priority to Chinese patent application No. 202010124587.2, filed on February 27, 2020 and entitled "Performance Fission Self-Recovery Method of Execution Plan", and to Chinese patent application No. 202010238039.2, filed on March 30, 2020 and entitled "Processing method, equipment and system of execution plan", the entire contents of which are incorporated into this application by reference.

Technical field

This application relates to the field of databases, and in particular to a method, equipment and system for processing an execution plan.

Background

In a database, users query data with Structured Query Language (SQL) statements. The query engine can generate an execution plan for an SQL statement and process data based on that execution plan for a period of time. The execution plan (also called an SQL execution plan) indicates the actions to be executed and the order in which they are executed.

However, because each execution plan is used continuously for a period of time (such as several weeks), an execution plan that performs poorly will degrade the performance of the database throughout that period.

Summary of the invention

The embodiments of the present application provide a method, equipment, and system for processing an execution plan. The technical solution is as follows:

In a first aspect, a method for processing an execution plan is provided. The method can be executed by a database system. The method includes:

Executing a first execution plan of a structured query language (SQL) statement; when the performance of the first execution plan of the SQL statement deteriorates, obtaining a second execution plan of the SQL statement; and executing the second execution plan in place of the first execution plan.

In the embodiments of this application, after the performance of the first execution plan is determined to have degraded, the second execution plan is used in place of the first execution plan. This prevents the database from continuing to use the degraded first execution plan and reduces the impact of the degraded execution plan on the database, thereby ensuring the performance of the database.

Optionally, the second execution plan is different from the first execution plan. Further optionally, the performance of the second execution plan is better than the performance of the first execution plan, which can effectively guarantee the performance of the database.
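As an illustrative sketch only, the first-aspect flow (execute the first plan, detect degradation, obtain and switch to a second plan) might be outlined as follows. The `Plan` type, the single `latency_ms` measure, the 500 ms threshold, and both callbacks are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Plan:
    plan_id: str
    latency_ms: float  # hypothetical single performance measure (lower is better)


def process_execution_plan(
    first: Plan,
    is_degraded: Callable[[Plan], bool],
    obtain_second: Callable[[Plan], Optional[Plan]],
) -> Plan:
    """Return the plan that should be executed from now on."""
    if not is_degraded(first):
        return first
    second = obtain_second(first)
    # Keep the first plan if no replacement could be obtained.
    return second if second is not None else first


first = Plan("p1", latency_ms=900.0)
chosen = process_execution_plan(
    first,
    is_degraded=lambda p: p.latency_ms > 500.0,             # assumed threshold
    obtain_second=lambda p: Plan("p2", latency_ms=120.0),   # assumed replacement source
)
```

Passing the degradation check and the acquisition step as callbacks keeps the sketch neutral about how each is implemented; the application describes several options for both.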

Optionally, the second execution plan of the SQL statement can be obtained in multiple ways. The embodiments of the present application take the following two acquisition methods as examples:

The first acquisition method: obtaining a historical execution plan of the SQL statement; when the performance of the historical execution plan is better than the performance of the first execution plan, the historical execution plan is used as the second execution plan.

Optionally, the historical execution plan may be selected from other execution plans of the SQL statement (that is, execution plans generated before the first execution plan), and it is then determined whether that historical execution plan can be used as the second execution plan. For example, the historical execution plan may be the execution plan with the best performance among the other execution plans of the SQL statement, the execution plan with the longest historical use time, or an execution plan that meets other set conditions.

By adopting the first acquisition method, since the historical execution plan has been generated in advance, the historical execution plan with better performance than the first execution plan can be quickly acquired as the second execution plan, which can realize the rapid optimization of the execution plan and improve the optimization efficiency.
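A minimal sketch of the first acquisition method is given below, assuming that performance is summarized by an average latency (lower is better) and that the candidate is the best-performing historical plan. The `HistoricalPlan` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoricalPlan:
    plan_id: str
    avg_latency_ms: float  # assumed performance measure: lower is better
    hours_in_use: float    # how long the plan was used historically


def pick_historical_plan(
    history: List[HistoricalPlan], current_latency_ms: float
) -> Optional[HistoricalPlan]:
    """Return a historical plan to use as the second plan, or None."""
    if not history:
        return None
    # Candidate: the best-performing plan generated before the first plan.
    best = min(history, key=lambda p: p.avg_latency_ms)
    # Use it only if it actually outperforms the currently degraded plan.
    return best if best.avg_latency_ms < current_latency_ms else None


history = [
    HistoricalPlan("h1", avg_latency_ms=200.0, hours_in_use=300.0),
    HistoricalPlan("h2", avg_latency_ms=150.0, hours_in_use=40.0),
]
second = pick_historical_plan(history, current_latency_ms=800.0)
```

Selecting by longest historical use instead would only change the `key` function, for example `max(history, key=lambda p: p.hours_in_use)`.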

The second acquisition method: generating the second execution plan of the SQL statement.

Optionally, the computing resources occupied by generating the second execution plan of the SQL statement are greater than the computing resources occupied by generating the first execution plan of the SQL statement, and/or the time taken to generate the second execution plan of the SQL statement is longer than the time taken to generate the first execution plan of the SQL statement.

Because allowing a longer generation time and/or more computing resources relaxes the restrictions on the process of generating the execution plan, it can be ensured that the performance of the finally determined second execution plan is better than the performance of the first execution plan.

Optionally, the database system provided in the embodiments of the present application also supports an alarm function. Correspondingly, the processing method of the execution plan further includes: sending alarm indication information when the performance of the first execution plan of the SQL statement deteriorates. Optionally, the alarm indication information may include the SQL statement whose performance has degraded; the alarm indication information may also include information prompting intervention in the execution plan. Based on the alarm indication information, the user can learn which SQL statement currently has degraded performance and decide whether to intervene in the execution plan.

In the embodiments of the present application, the alarm indication information is presented to the user through a user interface, and the user controls whether to optimize the execution plan. This prevents automatic execution plan optimization in the background from affecting the user's operations or the user services being executed, improves flexibility in the timing of execution plan intervention, and improves the user experience.

The process of obtaining the second execution plan of the SQL statement can be triggered in multiple ways. For example, the second execution plan of the SQL statement is obtained after an execution plan optimization instruction is received. The execution plan optimization instruction may be triggered by the user through the application program, or may be triggered by a designated device. The execution plan optimization instruction instructs intervention in the degraded execution plan.

Optionally, the method further includes: when the performance of a new version of the execution plan is better than the performance of the second execution plan, executing the new version of the execution plan in place of the second execution plan.

In the embodiments of this application, the database system executes the new version of the execution plan in place of the second execution plan only when the performance of the new version is better than the performance of the second execution plan. This prevents the database from adopting a new version of the execution plan whose performance is degraded and reduces the impact of a performance-degraded execution plan on the database, thereby ensuring the performance of the database. Moreover, since the performance of the new version of the execution plan is better than that of the second execution plan, the performance of the database can be effectively guaranteed, rollback of the database version is avoided, and the duration of business interruption is reduced.

An operation index reflects the running effect of the corresponding execution plan, and can also reflect the performance of the corresponding SQL statement. The performance of the first execution plan can be determined from the operation indexes of the first execution plan: when the operation index of the first execution plan is abnormal, it is determined that the performance of the first execution plan of the SQL statement is degraded. For example, the operation indexes of the first execution plan include one or more of the following: IO indexes, latency, error information, number of executions, and processing duration of the SQL statement.

Optionally, the process by which the database system analyzes whether the operation index of the first execution plan is abnormal may include: for each operation index of the first execution plan, comparing the operation index data group corresponding to the operation index with the corresponding historical operation index data group to determine whether that operation index is abnormal; and, based on the abnormality determination result of each operation index, determining whether the operation index of the first execution plan (that is, the overall operation index of the first execution plan) is abnormal.

Optionally, for each operation index of the first execution plan, the achievable manners for determining whether the operation index is abnormal may include:

In the first achievable manner, for each operation index of the first execution plan, when the performance indicated by the operation index data group corresponding to the operation index is lower than the performance indicated by the historical operation index data group of the SQL statement corresponding to the operation index, it is determined that the operation index is abnormal.

For each operation index of the first execution plan, there are two ways to determine whether the performance indicated by the operation index data group corresponding to the operation index is lower than the performance indicated by the historical operation index data group of the SQL statement corresponding to the operation index:

In the first way, the database system compares the operation index data group with the corresponding historical operation index data group based on a specified comparison rule, to detect whether the performance indicated by the operation index data group is lower than the performance indicated by the corresponding historical operation index data group.

For example, the database system maintains an expert experience database, which records at least one specified comparison rule determined based on expert experience. Based on the specified comparison rule, the database system compares the operation index data group corresponding to the operation index with the corresponding historical operation index data group.

In the second way, the database system compares the performance curve of the operating indicator data set with the performance baseline of the operating indicator to detect whether the performance indicated by the operating indicator data set is lower than the performance indicated by the historical operating indicator data set.

When the performance curve of the operating indicator data set corresponding to the operating indicator does not match the performance baseline of the operating indicator, it is determined that the performance indicated by the operating indicator data set corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data set. The performance baseline of the operating indicator is determined based on the historical operating indicator data set.

Optionally, before using the performance baseline of the operating indicator, the database system may also generate the performance baseline of the operating indicator based on the second artificial intelligence model and the historical operating indicator data set.

The performance baseline of the operating index is generated by the artificial intelligence model. On the basis of ensuring the accuracy of the performance baseline, the efficiency of obtaining the performance baseline can also be improved.
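The baseline comparison can be illustrated with a simple statistical stand-in. The sketch below does not use an artificial intelligence model; it assumes the baseline is a band of mean ± k standard deviations derived from the historical data set, and that a curve "does not match" when too large a share of its points fall outside the band. The function names, `k`, and `max_outlier_ratio` are all assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def build_baseline(history: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Derive a (low, high) band from the historical indicator data set."""
    mu = mean(history)
    sigma = pstdev(history)
    return (mu - k * sigma, mu + k * sigma)


def matches_baseline(
    samples: Sequence[float],
    baseline: Tuple[float, float],
    max_outlier_ratio: float = 0.2,  # assumed tolerance for stray points
) -> bool:
    """Return True when the current curve stays within the baseline band."""
    if not samples:
        return True  # nothing observed, nothing to flag
    low, high = baseline
    outliers = sum(1 for s in samples if s < low or s > high)
    return outliers / len(samples) <= max_outlier_ratio


baseline = build_baseline([10.0, 11.0, 9.0, 10.0, 10.0])
steady = matches_baseline([10.0, 11.0, 10.5], baseline)
drifted = matches_baseline([30.0, 28.0, 10.0], baseline)
```

A mismatch here would correspond to the operating indicator being treated as indicating lower performance than the historical data set.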

In the second achievable manner, the database system uses an AI model to identify whether each operating indicator is abnormal. The process is as follows:

For each operating indicator of the first execution plan, the operating indicator data set corresponding to the operating indicator is input into the first artificial intelligence model; when the first artificial intelligence model outputs indication information indicating that the operating indicator is abnormal, it is determined that the operating indicator is abnormal.

The foregoing two achievable manners can be combined according to actual conditions. For example, the method of the first achievable manner is executed first to perform coarse screening of abnormal operation index data groups. The coarse screening may have a certain error, for example by determining a non-abnormal operation index data group to be abnormal. The method of the second achievable manner is then performed on the operation index data groups whose coarse screening result is abnormal, to perform fine screening of abnormal operation index data groups.

For each operation index of the first execution plan, when the performance indicated by the operation index data group is lower than the performance indicated by the historical operation index data group of the SQL statement (that is, coarse screening determines that the operation index data group is abnormal), the operation index data group is input into the first artificial intelligence model. When the first artificial intelligence model outputs indication information indicating that the operation index data group is abnormal, it is determined that the operation index data group of the first execution plan is abnormal (that is, fine screening determines that the operation index data group is abnormal). Compared with the second achievable manner described above, this can improve the accuracy of determining that an operation index data group is abnormal. When the performance indicated by the operation index data group is not lower than the performance indicated by the historical operation index data group of the SQL statement (that is, coarse screening determines that the operation index data group is normal), there is no need to input the operation index data group into the first artificial intelligence model. Compared with the second achievable manner described above, this can reduce the computational cost of the first artificial intelligence model.
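The combined coarse and fine screening can be sketched as follows. The coarse rule (a higher mean is worse, as for latency) and `stub_model`, which stands in for the first artificial intelligence model, are hypothetical; the point illustrated is only that the model is invoked solely for data groups that the coarse screening flags.

```python
from statistics import mean
from typing import Callable, List, Sequence


def coarse_screen(current: Sequence[float], history: Sequence[float]) -> bool:
    """Return True when the current data group looks worse than history.

    Assumption: the index is "lower is better" (e.g. latency).
    """
    return mean(current) > mean(history)


def is_abnormal(
    current: Sequence[float],
    history: Sequence[float],
    model: Callable[[Sequence[float]], bool],
) -> bool:
    if not coarse_screen(current, history):
        return False       # judged normal; the model is not invoked
    return model(current)  # fine screening confirms or clears the flag


model_calls: List[Sequence[float]] = []


def stub_model(data: Sequence[float]) -> bool:
    """Hypothetical stand-in for the first AI model: flags peaks above 100."""
    model_calls.append(data)
    return max(data) > 100.0


history = [20.0, 22.0, 21.0]
normal = is_abnormal([19.0, 20.0, 21.0], history, stub_model)
false_alarm = is_abnormal([30.0, 35.0, 40.0], history, stub_model)
degraded = is_abnormal([90.0, 150.0, 120.0], history, stub_model)
```

In this run the stub model is called twice rather than three times, which mirrors the reduction in model cost described above.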

Optionally, the database system may determine whether the operation index of the first execution plan is abnormal based on the abnormality determination result of each operation index. There are many ways for the database system to determine whether the operation index of the first execution plan is abnormal.

In the first optional method, when at least one operation index of the first execution plan is abnormal, it is determined that the operation index of the first execution plan is abnormal; when none of the operation indexes of the first execution plan is abnormal, it is determined that the operation index of the first execution plan is not abnormal.

In the second optional method, an operation index score of the first execution plan is determined based on the abnormality determination result of each operation index. When the operation index score is greater than a specified score threshold, it is determined that the operation index of the first execution plan is abnormal; when the operation index score is not greater than the specified score threshold, it is determined that the operation index of the first execution plan is not abnormal. That is, the higher the operation index score, the higher the probability of abnormality.
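The two aggregation methods can be contrasted in a short sketch. The application does not state how the score is computed; the weighted sum over abnormal indexes used here, as well as the index names, weights, and threshold, are assumptions for illustration.

```python
from typing import Dict


def any_abnormal(results: Dict[str, bool]) -> bool:
    """First method: the plan is abnormal if at least one index is abnormal."""
    return any(results.values())


def score_abnormal(
    results: Dict[str, bool], weights: Dict[str, float], threshold: float
) -> bool:
    """Second method: sum assumed weights of abnormal indexes, then threshold."""
    score = sum(
        weights.get(name, 1.0) for name, abnormal in results.items() if abnormal
    )
    return score > threshold


results = {"io": False, "latency": True, "errors": False}
weights = {"io": 1.0, "latency": 2.0, "errors": 3.0}  # hypothetical weights

by_any = any_abnormal(results)
by_score = score_abnormal(results, weights, threshold=2.5)
```

With these inputs the first method reports an abnormality while the second does not, because the score of 2.0 does not exceed the threshold of 2.5; the scored variant tolerates a single low-weight anomaly.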

In a possible implementation manner, after the second execution plan of the SQL statement is obtained, the processing method of the execution plan further includes: converting the second execution plan into a second execution plan matching the management node. This ensures that the management node can quickly parse the converted second execution plan, thereby reducing the parsing delay at the management node and improving the efficiency with which the management node loads and uses the second execution plan.

Optionally, the process of converting the second execution plan into a second execution plan matching the management node includes:

Querying a specified correspondence between management nodes and execution plan formats to obtain the execution plan format corresponding to the management node; and converting the second execution plan into a second execution plan that conforms to that execution plan format.
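As a hedged illustration, the lookup and conversion might be modeled with a mapping from management node to format. The node identifiers, the two formats, and the representation of a plan as a list of step strings are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical correspondence between management nodes and plan formats.
NODE_FORMATS: Dict[str, str] = {"mgmt-1": "json", "mgmt-2": "text"}


def convert_plan(plan_steps: List[str], node_id: str) -> str:
    """Render the plan in the format registered for the given management node."""
    fmt = NODE_FORMATS[node_id]  # query the correspondence
    if fmt == "json":
        return json.dumps({"steps": plan_steps})
    if fmt == "text":
        return " -> ".join(plan_steps)
    raise ValueError(f"unsupported execution plan format: {fmt}")


steps = ["scan orders", "filter by date", "aggregate"]
as_json = convert_plan(steps, "mgmt-1")
as_text = convert_plan(steps, "mgmt-2")
```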

Optionally, the second execution plan carries a description prompt label, which identifies the corresponding execution plan as an intervention plan. This makes it easier for the database system to distinguish which execution plans have been intervened in.

Optionally, the process of generating the second execution plan of the SQL statement may include: generating multiple candidate execution plans based on the optimization rule information and/or optimization cost information of the database where the management node is located; and traversing the multiple candidate execution plans to obtain the second execution plan.
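Candidate generation and traversal can be sketched with a toy example. Here the candidates are join orders, the cost model sums intermediate result sizes under a fixed 1% join selectivity, and the traversal keeps the lowest-cost candidate. The cost model, the selection criterion, and the table sizes are assumptions; the application does not specify them.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def generate_candidates(tables: Sequence[str]) -> List[Tuple[str, ...]]:
    """Enumerate every join order as a candidate plan."""
    return list(permutations(tables))


def estimate_cost(order: Sequence[str], row_counts: Dict[str, int]) -> int:
    """Toy cost: total intermediate rows, assuming each join keeps 1% of pairs."""
    intermediate = row_counts[order[0]]
    cost = 0
    for table in order[1:]:
        intermediate = intermediate * row_counts[table] // 100
        cost += intermediate
    return cost


def choose_plan(tables: Sequence[str], row_counts: Dict[str, int]) -> Tuple[str, ...]:
    """Traverse all candidates and keep the one with the lowest estimated cost."""
    return min(generate_candidates(tables), key=lambda o: estimate_cost(o, row_counts))


row_counts = {"a": 1000, "b": 10, "c": 100}  # hypothetical table sizes
best = choose_plan(["a", "b", "c"], row_counts)
best_cost = estimate_cost(best, row_counts)
```

Exhaustive enumeration grows factorially with the number of tables, which is one reason a regenerated plan may be allowed more time and computing resources than the original.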

Optionally, the processing method of the execution plan may further include: receiving a rule setting instruction, where the rule setting instruction includes a set rule. Optionally, the rules in the database system may include at least one of the following: SQL performance comparison rules, description prompt label setting rules, alarm rules, and routing rules.

In a second aspect, a method for processing an execution plan is provided. The method can be executed by a database system. The method includes:

Executing a first execution plan of a structured query language (SQL) statement; obtaining a second execution plan of a new version of the SQL statement; and, when the performance of the second execution plan is better than that of the first execution plan, executing the second execution plan in place of the first execution plan.

In the embodiments of this application, the database system executes the new-version second execution plan in place of the first execution plan only when the performance of the second execution plan is better than the performance of the first execution plan. This prevents the database from adopting a new version of the execution plan whose performance is degraded and reduces the impact of a degraded execution plan on the database, thereby ensuring the performance of the database. In addition, since the performance of the second execution plan is better than the performance of the first execution plan, the performance of the database can be effectively guaranteed, rollback of the database version can be avoided, and the duration of business interruption can be reduced.

In a possible implementation manner, after the second execution plan of the SQL statement is obtained, for example after it is determined that the performance of the second execution plan is better than the performance of the first execution plan, the method further includes: converting the second execution plan into a second execution plan matching the management node. This ensures that the management node can quickly parse the converted second execution plan, thereby reducing the parsing delay at the management node and improving the efficiency with which the management node loads and uses the second execution plan.

Optionally, the process of converting the second execution plan into a second execution plan matching the management node includes: querying a specified correspondence between management nodes and execution plan formats to obtain the execution plan format corresponding to the management node; and converting the second execution plan into a second execution plan conforming to that execution plan format.

Optionally, obtaining the second execution plan of the new version of the SQL statement includes generating the second execution plan of the SQL statement. The process of generating the second execution plan of the SQL statement may include: generating multiple candidate execution plans based on the optimization rule information and/or optimization cost information of the database where the management node is located; and traversing the multiple candidate execution plans to obtain the second execution plan.

In a third aspect, the present application provides a database system. The database system may include at least one module, and the at least one module may be used to implement the execution plan processing method provided in the first aspect or various possible implementations of the first aspect.

In a fourth aspect, the present application provides a database system. The database system may include at least one module, and the at least one module may be used to implement the execution plan processing method provided in the second aspect or various possible implementations of the second aspect.

In a fifth aspect, this application provides a computer device including a processor and a memory. The memory stores computer instructions; the processor executes the computer instructions stored in the memory, so that the computer device executes the methods provided by the foregoing first aspect or various possible implementations of the first aspect, so that the computer device deploys the database system provided by the foregoing third aspect or various possible implementations of the third aspect.

In a sixth aspect, this application provides a computer device including a processor and a memory. The memory stores computer instructions; the processor executes the computer instructions stored in the memory, so that the computer device executes the methods provided by the foregoing second aspect or various possible implementations of the second aspect, so that the computer device deploys the database system provided by the foregoing fourth aspect or various possible implementations of the fourth aspect.

In a seventh aspect, the present application provides a computer-readable storage medium having computer instructions stored therein. The computer instructions instruct a computer device to execute the method provided by the foregoing first aspect or various possible implementations of the first aspect, or the computer instructions instruct the computer device to deploy the database system provided by the foregoing third aspect or various possible implementations of the third aspect.

In an eighth aspect, the present application provides a computer-readable storage medium having computer instructions stored therein. The computer instructions instruct a computer device to execute the method provided by the foregoing second aspect or various possible implementations of the second aspect, or the computer instructions instruct the computer device to deploy the database system provided by the foregoing fourth aspect or various possible implementations of the fourth aspect.

In a ninth aspect, this application provides a computer program product. The computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium. The processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, so that the computer device executes the methods provided by the foregoing first aspect or various possible implementations of the first aspect, so that the computer device deploys the database system provided by the foregoing third aspect or various possible implementations of the third aspect.

In a tenth aspect, the present application provides a computer program product. The computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium. The processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, so that the computer device executes the methods provided by the foregoing second aspect or various possible implementations of the second aspect, so that the computer device deploys the database system provided by the foregoing fourth aspect or various possible implementations of the fourth aspect.

In an eleventh aspect, a chip is provided. The chip may include a programmable logic circuit and/or program instructions. When running, the chip is used to implement any of the execution plan processing methods of the first aspect, or any of the execution plan processing methods of the second aspect.

The beneficial effects brought about by the technical solutions provided by the embodiments of the present application are:

In the embodiments of this application, after the performance of the first execution plan is determined to have degraded, the second execution plan is used in place of the first execution plan. This prevents the database from continuing to use the degraded first execution plan and reduces the impact of the degraded execution plan on the database, thereby ensuring the performance of the database. Optionally, the performance of the second execution plan is better than the performance of the first execution plan, which effectively ensures the performance of the database.

Description of the drawings

FIG. 1 is a schematic diagram of an application environment of a database system involved in a method for processing an execution plan provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a method for processing an execution plan provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of data flow in an exemplary database system provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of a performance baseline of a first operating index provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a comparison scenario between a performance curve of a first operating indicator data set and a performance baseline of the first operating indicator according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a comparison scenario between the numerical points of the first operating index data set and the performance baseline of the first operating index provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of a user interface provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a database system provided by an embodiment of the present application;

FIG. 9 is a schematic flowchart of a method for processing an execution plan provided by an embodiment of the present application;

FIG. 10 is a block diagram of a database system provided by an embodiment of the present application;

FIG. 11 is a block diagram of another database system provided by an embodiment of the present application;

FIG. 12 is a block diagram of another database system provided by an embodiment of the present application;

FIG. 13 is a block diagram of still another database system provided by an embodiment of the present application;

FIG. 14 is a block diagram of a database system provided by another embodiment of the present application;

FIG. 15 is a block diagram of a computer device provided by an embodiment of the present application.

Detailed description

In order to make the purpose, technical solutions, and principles of the present application clearer, the implementation manners of the present application will be further described in detail below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of an application environment of a database system (DBS) 10 involved in a method for processing an execution plan provided by an embodiment of the present application. The database system 10 may be a server or a server cluster composed of multiple servers. The database system includes a database management system (Database Management System, DBMS) and at least one database (Database, DB) (not shown in FIG. 1). In this database system, the application program can transparently operate the database through the database management system, and the data in the database is managed by the database management system.

Optionally, the aforementioned database may be a relational database, which refers to a database that uses a relational model to organize data. It stores data in the form of rows and columns. Each relational model can be called a relational table. According to different storage principles, relational databases can be divided into distributed relational databases and non-distributed relational databases.

The database system 10 includes: a management node (also called a query engine, a SQL engine, a database engine or a coordinator) 101, a plurality of data nodes 102, and an optimization node 103. The database system 10 may include one or more management nodes 101, each management node 101 belongs to a database, and the management node 101 is used to manage the data nodes 102 in the corresponding database.

In the embodiments of the present application, the management node 101 may be a standalone node, a designated data node among the multiple data nodes 102, or a data node obtained by election, and it may be a server or a server cluster composed of multiple servers. The management node 101 is configured to generate an execution plan of an SQL statement after receiving the SQL statement sent by the application program, so as to control the data nodes managed by the management node 101 to execute the execution plan. The execution plan indicates the actions to be executed and the order in which they are executed; in other words, what is to be done and when.

The optimization node 103 is used to intervene in (also referred to as optimize) the execution plan of the management node 101.

In one implementation, each data node may be a server or a server cluster composed of multiple servers; in another implementation, each data node represents a set minimum processing unit of the database system. For example, each data node may be a virtual machine or a container, an application instance or a database execution process that manages and/or stores data.

It should be noted that the optimization node 103 can be integrated on the management node 101. For example, each management node 101 in the database system can have an optimization node 103 integrated on it, or a designated management node 101 in the database system can have an optimization node 103 integrated on it. The optimization node 103 can also be set independently outside the management node 101. In that case, the management node 101 provides an interface through which the optimization node 103 intervenes in the execution plan of the management node 101, and the impact on the performance of the management node 101 can be reduced.

Fig. 2 is a schematic flowchart of a method for processing an execution plan provided by an embodiment of the present application. This method can be executed by the aforementioned database system 10. Subsequent embodiments take execution plan intervention on one management node as an example. It is assumed that this management node is the first management node and that the database corresponding to the first management node is the first database. For execution plan intervention on other management nodes, refer to the process described for the first management node. As shown in Figure 2, the processing method (i.e., the intervention process) of the execution plan includes:

Step 201: The database system executes the first execution plan of the SQL statement.

When users need to perform data operations (such as data query operations) on the first database, they can input SQL statements to the first management node through the application. After receiving an SQL statement, the first management node generates an execution plan according to the optimization rule information (rule-based optimizer, RBO, information) and/or the optimization cost information (cost-based optimizer, CBO, information) of the first database, and continues to use the execution plan for a period of time after it is generated.

The optimization rule information may include table partition information and/or reliability indicators (available indicators).

When the database has the partition function, the relational table is divided into multiple subsets called partitions, and each subset is a partition table. For example, when the number of records in the relationship table exceeds the specified record number threshold, the relationship table is divided into multiple partition tables. The partitioning rules of the partition table may include: partitioning according to storage date and/or partitioning according to location, etc. For example, the partition table obtained by dividing the data in the relational table according to the location includes: a partition table corresponding to Shanghai and a partition table corresponding to Beijing.

The reliability index is an index used to reflect the reliability of the database. When the data in the database exceeds the range specified by the reliability index, the database becomes unstable, that is, no longer reliable. For example, the reliability index may include an allowable interruption duration and/or an input/output (IO) upper limit. When the execution plan of an SQL statement generated by the management node in the database exceeds the allowable interruption duration, or the number of IOs of that execution plan exceeds the IO upper limit, the database may suffer a service interruption that users can notice, causing losses to the database system.

The optimization cost information may include statistical information of the database. The statistical information reflects the data distribution of the relational tables of the corresponding database, for example, the distribution ratio of different types of data in a table, or the data nodes on which different types of data are stored (or mainly stored). For example, if the relational table of the first database records data indexed as "male" and "female", the statistical information of the first database may include the proportions of the data indexed as "male" and as "female" among all data, and the data nodes on which each is mainly stored. Here, the data nodes on which a type of data is mainly stored are the data nodes that store more than a specified threshold of that data.
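As a hedged illustration of the statistical information described above, the following sketch computes the proportion of each value and the data nodes that mainly store it. The row layout, the node names `dn1` and `dn2`, and the `main_threshold` parameter are assumptions made for illustration and are not specified by this application.

```python
from collections import Counter, defaultdict

def collect_statistics(rows, main_threshold):
    """rows: iterable of (value, data_node) pairs (hypothetical layout).

    Returns (proportions, main_nodes): the share of each value among all
    rows, and the data nodes holding more than main_threshold rows of it.
    """
    rows = list(rows)
    value_counts = Counter(value for value, _ in rows)
    per_node = defaultdict(Counter)
    for value, node in rows:
        per_node[value][node] += 1

    total = len(rows)
    proportions = {v: c / total for v, c in value_counts.items()}
    main_nodes = {
        v: sorted(n for n, c in nodes.items() if c > main_threshold)
        for v, nodes in per_node.items()
    }
    return proportions, main_nodes

# Made-up sample: 7 "male" rows (6 on dn1, 1 on dn2) and 3 "female" rows on dn2.
rows = [("male", "dn1")] * 6 + [("male", "dn2")] + [("female", "dn2")] * 3
proportions, main_nodes = collect_statistics(rows, main_threshold=2)
```

In this sample, "male" data is mainly stored on `dn1` only, because `dn2` holds just one such row, which does not exceed the assumed threshold of 2.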

In the embodiment of the present application, it is assumed that the SQL statement in step 201 is an SQL statement input to the first management node, and that the first execution plan is the execution plan generated by the first management node based on that SQL statement. The database system executing the first execution plan of the SQL statement then means that the first management node controls the data nodes it manages to execute the first execution plan.

Optionally, after the database system generates and executes the first execution plan, it records the first execution plan so that the first database can continue to use the first execution plan in a subsequent period of time.

FIG. 3 is a schematic diagram of data flow in a database system provided by an embodiment of the present application. It is assumed that the optimization node 103 is set independently of the first management node 101a. After generating the first execution plan, the first management node 101a controls the data nodes it manages to execute (that is, use) the first execution plan, and the executed first execution plan is stored in the system table 1042. The execution plans stored in the system table 1042 can be synchronized to the indicator database 1041, and the optimization node 103 can obtain the first execution plan from the indicator database 1041 and detect whether its performance has degraded.

Optionally, the system table records the execution plans currently used (that is, used in the current time period) in the database system. A currently used execution plan is one that has continued to be used since it was first executed (that is, it has not been replaced). There may be multiple databases in the database system. In one optional implementation, each database has its own system table; in another optional implementation, multiple databases share a system table. Taking the first database as an example, the corresponding system table records the execution plans currently used by the first database. The first management node may have recently acquired multiple SQL statements, and each SQL statement has exactly one currently used execution plan. The system table therefore records multiple execution plans in one-to-one correspondence with the multiple SQL statements of the first management node, and these execution plans include the first execution plan. When the first management node next needs to reuse a previously executed execution plan (that is, to inherit the execution plan), it queries the system table to obtain the required execution plan.

Optionally, the execution plans stored in the system table 1042 can be synchronized to the indicator database 1041 periodically; or, when an execution plan stored in the system table 1042 is updated, all execution plans stored in the system table 1042 are synchronized to the indicator database 1041 (full synchronization); or, when an execution plan stored in the system table 1042 is updated, only the updated execution plan is synchronized to the indicator database 1041 (incremental synchronization).
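The difference between full and incremental synchronization can be sketched as follows. This is a minimal illustration under stated assumptions: the system table is modeled as a dictionary from a hypothetical SQL identifier to its current plan, the indicator database as a dictionary from the same identifier to a list of `(sync_time, plan)` entries, and skipping a plan that is identical to the last recorded one is a choice made here, not something the application prescribes.

```python
def record(indicator_db, sql_id, plan, sync_time):
    """Append the plan to the SQL statement's plan group.

    Assumption: a plan identical to the most recent entry is not recorded again.
    """
    group = indicator_db.setdefault(sql_id, [])
    if not group or group[-1][1] != plan:
        group.append((sync_time, plan))

def full_sync(system_table, indicator_db, sync_time):
    """Synchronize every plan currently held in the system table."""
    for sql_id, plan in system_table.items():
        record(indicator_db, sql_id, plan, sync_time)

def incremental_sync(system_table, indicator_db, updated_sql_ids, sync_time):
    """Synchronize only the plans reported as updated."""
    for sql_id in updated_sql_ids:
        record(indicator_db, sql_id, system_table[sql_id], sync_time)

system_table = {"q1": "plan_a", "q2": "plan_x"}
indicator_db = {}
full_sync(system_table, indicator_db, sync_time=1)

system_table["q1"] = "plan_b"  # the plan of q1 is replaced
incremental_sync(system_table, indicator_db, ["q1"], sync_time=2)
```

After both calls, the plan group of `q1` holds its historical plan and its currently used plan, while the group of `q2` holds a single plan.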

The indicator database can be set up in the same way as the aforementioned system table 1042, because there may be multiple databases in the database system. In one optional implementation, each database has its own indicator database; in another optional implementation, multiple databases share an indicator database. Taking the first database as an example, the corresponding indicator database records the execution plans received at each synchronization from the system table of the first management node. The indicator database includes at least the historical execution plans of the first management node and may also include the execution plans currently used by the first database. A historical execution plan is an execution plan whose period of use precedes that of the currently used execution plan. For example, for the first execution plan of the SQL statement, the corresponding historical execution plans are the other execution plans of the SQL statement that existed before the first execution plan was generated. Whether the indicator database includes the execution plans currently used by the first database depends on how frequently the system table of the first database synchronizes execution plans to the indicator database. Generally, the indicator database records all executed execution plans of the first management node, both historical and currently used, and this is assumed below. The first management node may have recently acquired multiple SQL statements, and each SQL statement has at least one executed execution plan. The indicator database therefore records multiple execution plan groups corresponding to the multiple SQL statements of the first management node, and each execution plan group includes at least one execution plan of the corresponding SQL statement.

In an optional manner, the optimization node may obtain the first execution plan of the SQL statement from the indicator database 1041. For example, in the execution plan group corresponding to the SQL statement of the first management node in the indicator database 1041 (the execution plan group includes one or more execution plans), the latest execution plan is acquired as the first execution plan. The latest execution plan is the execution plan whose synchronization time is closest to the current time.
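Selecting the latest execution plan from a plan group can be sketched as below. The `(sync_time, plan)` tuple layout is a hypothetical representation, and the sketch assumes that no synchronization time lies in the future, so the largest synchronization time is the one closest to the current time.

```python
def latest_plan(plan_group):
    """plan_group: list of (sync_time, plan) tuples for one SQL statement.

    Returns the plan with the most recent synchronization time, or None
    when the group is empty.
    """
    if not plan_group:
        return None
    return max(plan_group, key=lambda entry: entry[0])[1]

group = [(100, "plan_a"), (250, "plan_c"), (180, "plan_b")]
first_execution_plan = latest_plan(group)
```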

In another optional manner, the optimization node may obtain the first execution plan of the SQL statement from the system table 1042. For example, the execution plan corresponding to the SQL statement of the first management node recorded in the system table 1042 is taken as the first execution plan.

It should be noted that the currently used execution plans and the historical execution plans of the database system can also be stored in other ways. Figure 3 takes storing the currently used execution plans in the system table and the historical execution plans in the indicator database only as an example and is not limiting, as long as the database system can effectively distinguish and obtain the currently used execution plans and the historical execution plans. In addition, FIG. 3 schematically shows the system table as stored in a database; the system table may also be stored in other ways, which are not limited in the embodiment of the present application.

It is worth noting that plans generated in the database system but not executed can also be recorded in the indicator database 1041, so that the execution plans in the database system can be monitored and the performance of the database system can be analyzed more easily.

Step 202: The database system analyzes whether the performance of the first execution plan of the SQL statement has deteriorated.

Due to changes in the software and hardware environment (such as capacity expansion or a kernel upgrade to a new version) or database abnormalities, the performance of the first execution plan currently used by the first management node may be worse than that of the previous execution plan, which degrades the performance of the SQL statement and may even degrade the performance of the entire first database. In this case, the database system (such as an optimization node) can determine whether to intervene in the first execution plan by analyzing whether the performance of the first execution plan of the SQL statement has deteriorated, so as to restore the performance of the SQL statement. The process of analyzing whether the performance of the first execution plan of the SQL statement has deteriorated includes:

Step A1. The database system analyzes whether the operation index of the first execution plan is abnormal. Perform step A2 or step A3.

The operating indicators reflect the running effect of the corresponding execution plan and can also reflect the performance of the corresponding SQL statement. The performance of the first execution plan can be determined through its operating indicators. For example, the operating indicators of the first execution plan include one or more of the following: an IO indicator, delay, error information, number of executions, and processing duration of the SQL statement.

Here, the IO indicator refers to the number of IOs generated when the execution plan is executed; the delay refers to the delay caused by executing the execution plan; the error information refers to the content of the errors caused by executing the execution plan and/or the percentage of each type of error among all errors generated, that is, the ratio of the number of errors of each type to the total number of errors; the number of executions refers to the number of times the execution plan is executed; and the processing duration may be the time for which the execution plan occupies the processor when executed, such as the processing time of the central processing unit (CPU), which is called CPU time.
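The error percentage mentioned above is a simple ratio, sketched here for concreteness. The error type names are invented for the example.

```python
from collections import Counter

def error_type_ratios(error_types):
    """Return the ratio of each error type to the total number of errors."""
    counts = Counter(error_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}

# Hypothetical errors raised while one execution plan was running.
ratios = error_type_ratios(["timeout", "timeout", "deadlock", "timeout"])
```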

Optionally, the database system can periodically analyze whether the operating indicators are abnormal. For example, the analysis period ranges from 1 minute to 10 minutes.

A database system (for example, the optimization node) can deploy an agent module on the host (also called the database host) where each database in the database system is located, and each agent module monitors the data of the operating indicators on the database host where it is deployed. For example, each agent module may collect the data corresponding to the operating indicators on that database host and periodically send the collected data to the node that manages the agent modules, such as the optimization node. For example, the sending period of the data corresponding to the operating indicators ranges from 5 seconds to 10 minutes.

The first execution plan may have one or more operating indicators, and each operating indicator corresponds to an operating indicator data group. An operating indicator data group usually includes multiple data of the corresponding operating indicator. The multiple data are usually collected within a specified duration, for example, one day; they may also be collected at a specified sampling interval, for example, 1 second. Each operating indicator of the first execution plan also corresponds to a historical operating indicator data group.

For the historical operating indicator data group and the operating indicator data group corresponding to the same operating indicator, the collection duration of the historical operating indicator data group (that is, the length of time over which the data in the group were collected) is the same as the collection duration of the operating indicator data group, and the collection period of the historical operating indicator data group precedes that of the operating indicator data group. For example, the collection duration of both is 1 day. Optionally, the two groups contain the same number of data, for example, 10,000 each. Optionally, the collection moments of the data in the two groups are in one-to-one correspondence. For example, the data in both groups are collected at the same sampling interval within one day: the data sampled at 9:00 in the historical operating indicator data group corresponds to the data sampled at 9:00 in the operating indicator data group, and the data sampled at 9:01 in the historical operating indicator data group corresponds to the data sampled at 9:01 in the operating indicator data group.
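The one-to-one correspondence of collection moments can be illustrated with a short sketch. It assumes each data group is held as a mapping from a zero-padded time-of-day string to a sample; that representation and the sample values are illustrative only.

```python
def pair_by_sampling_time(historical, current):
    """Pair samples of the two groups that share the same time of day.

    Both arguments map a zero-padded "HH:MM" string to a sample value.
    Moments present in only one group are left out.
    """
    shared = sorted(set(historical) & set(current))
    return [(t, historical[t], current[t]) for t in shared]

historical = {"09:00": 12.0, "09:01": 11.5, "09:02": 12.2}
current = {"09:00": 18.0, "09:01": 17.5}
pairs = pair_by_sampling_time(historical, current)
```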

In the embodiment of the present application, the process of analyzing whether the operation index of the first execution plan is abnormal by the database system may include:

Step A11: For each operating indicator in the operating indicators of the first execution plan, compare the operating indicator data group corresponding to the operating indicator with the corresponding historical operating indicator data group to determine whether each operating indicator is abnormal.

Step A12: Based on the abnormality determination result of each operation index, determine whether the operation index of the first execution plan (that is, the overall operation index of the first execution plan) is abnormal.

In the foregoing step A11, for ease of description, it is assumed that the first operating indicator is one of the operating indicators of the first execution plan, that the operating indicator data group corresponding to the first operating indicator is the first operating indicator data group, and that the historical operating indicator data group corresponding to the first operating indicator is the first historical operating indicator data group. The process of determining whether the first operating indicator is abnormal may include the following two implementations:

In the first possible implementation, when the performance indicated by the first operating indicator data group is lower than the performance indicated by the first historical operating indicator data group, the database system determines that the first operating indicator is abnormal; when the performance indicated by the first operating indicator data group is not lower than the performance indicated by the first historical operating indicator data group, the database system determines that the first operating indicator is not abnormal (that is, normal).

Optionally, there are multiple ways for the database system to analyze whether the performance indicated by the first operating indicator data group is lower than the performance indicated by the first historical operating indicator data group. The embodiments of this application take the following two ways as examples:

In the first way, the database system compares the first operating indicator data group with the first historical operating indicator data group based on a specified comparison rule to detect whether the performance indicated by the first operating indicator data group is lower than the performance indicated by the first historical operating indicator data group.

For example, the database system maintains an expert experience database that records at least one specified comparison rule determined based on expert experience. Based on the specified comparison rule, the database system compares the first operating indicator data group with the first historical operating indicator data group.

For example, the at least one specified comparison rule includes a year-on-year comparison rule and/or a chain (month-on-month) comparison rule. The chain comparison rule obtains the period-over-period change between the historical operating indicator data group and the operating indicator data group. With this comparison rule, the month-on-month decline rate can be obtained based on the month-on-month decline rate formula. For example, the formula includes:

Month-on-month decline rate = (operating indicator data group - historical operating indicator data group) / historical operating indicator data group × 100%.

The year-on-year comparison rule obtains the year-on-year change between the historical operating indicator data group and the operating indicator data group. With this comparison rule, the year-on-year decline rate can be obtained based on the year-on-year decline rate formula. For example, the formula includes:

Year-on-year decline rate = (operating indicator data group - historical operating indicator data group) / |historical operating indicator data group| × 100%.

For example, when the month-on-month decline rate is greater than the first proportion threshold and/or the year-on-year decline rate is greater than the second proportion threshold, it is determined that the performance indicated by the first operating indicator data set is lower than the performance indicated by the first historical operating indicator data set.
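The two rate formulas and the threshold check can be sketched as follows. The application does not state how a whole data group is reduced to a single number, so this sketch assumes the mean of the group is used; the threshold values are also invented, and the "and/or" condition is implemented here as a logical "or".

```python
from statistics import fmean

def month_on_month_rate(current, historical):
    """(current - historical) / historical * 100, using group means (assumption).

    Raises ZeroDivisionError when the historical mean is zero.
    """
    cur, hist = fmean(current), fmean(historical)
    return (cur - hist) / hist * 100.0

def year_on_year_rate(current, historical):
    """(current - historical) / |historical| * 100, using group means (assumption)."""
    cur, hist = fmean(current), fmean(historical)
    return (cur - hist) / abs(hist) * 100.0

def performance_is_lower(mom_rate, yoy_rate, first_threshold, second_threshold):
    """True when either rate exceeds its proportion threshold."""
    return mom_rate > first_threshold or yoy_rate > second_threshold

# Made-up delay samples: the current group averages 120, the historical one 100.
current = [120.0, 130.0, 110.0]
previous = [100.0, 100.0, 100.0]
mom = month_on_month_rate(current, previous)
yoy = year_on_year_rate(current, previous)
lower = performance_is_lower(mom, yoy, first_threshold=10.0, second_threshold=50.0)
```

With these samples both rates are about 20%, so the first threshold of 10% is exceeded and the data group is judged to indicate lower performance.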

In the second way, the database system compares the performance curve of the first operating indicator data group with the performance baseline of the first operating indicator to detect whether the performance indicated by the first operating indicator data group is lower than the performance indicated by the first historical operating indicator data group. The process is as follows:

Step A111: The database system determines a performance baseline of the first operating index.

The performance baseline of the first operating indicator is a baseline established based on the data in the historical operating indicator data group and used to reflect the performance of the SQL statement with respect to the first operating indicator. Optionally, the database system may generate a performance baseline of the first operating indicator of the SQL statement based on the second artificial intelligence model and the first historical operating indicator data set.

In the first optional manner, the database system may obtain a candidate operating indicator data group of the first operating indicator of the SQL statement, where the number of data in the candidate operating indicator data group is greater than or equal to the number of data in the first historical operating indicator data group; the database system then determines the first historical operating indicator data group based on the obtained candidate operating indicator data group; finally, the database system inputs the first historical operating indicator data group into the second artificial intelligence model, and the second artificial intelligence model generates the performance baseline of the first operating indicator of the SQL statement.

Optionally, the database system may determine the first historical operation index data group in the following ways based on the obtained candidate operation index data group of the first operation index:

The first type: select data spanning the target collection duration from the candidate operating indicator data group to obtain the first historical operating indicator data group. The data spanning the target collection duration can be selected at random from the candidate operating indicator data group, or can be the data with the best corresponding performance, selected from the candidate operating indicator data group through a sliding window. The width of the sliding window can be the width corresponding to the target collection duration.
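The sliding-window selection can be sketched as below. The application does not define what "best corresponding performance" means, so this sketch assumes an indicator for which lower values are better (such as delay) and picks the window with the lowest sum; the function name and the sample values are hypothetical.

```python
def best_window(candidate, width):
    """Return the contiguous run of `width` samples with the lowest sum.

    Assumption: lower values mean better performance for this indicator.
    The running sum is updated in O(1) as the window slides by one sample.
    """
    if width <= 0 or width > len(candidate):
        raise ValueError("width must be between 1 and len(candidate)")
    window_sum = sum(candidate[:width])
    best_sum, best_start = window_sum, 0
    for start in range(1, len(candidate) - width + 1):
        window_sum += candidate[start + width - 1] - candidate[start - 1]
        if window_sum < best_sum:
            best_sum, best_start = window_sum, start
    return candidate[best_start:best_start + width]

candidate = [9, 8, 3, 2, 4, 7]
first_historical = best_window(candidate, width=3)
```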

The second type: the candidate operating indicator data group includes M sub-data groups, each spanning the target collection duration, and the mean of the M sub-data groups is determined as the first historical operating indicator data group; that is, each datum in the first historical operating indicator data group is the mean of the corresponding data in the M sub-data groups. For example, if the candidate operating indicator data group includes 4 days of data and the target collection duration is 1 day, each sub-data group is one day of data, and the mean of the 4 days of data is used as the first historical operating indicator data group.
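The element-wise averaging of the M sub-data groups can be sketched as follows. The sketch assumes the candidate group is a flat list whose length is an exact multiple of the target length; the three-samples-per-day data is made up to keep the example short.

```python
def average_sub_groups(candidate, target_len):
    """Split `candidate` into M consecutive groups of `target_len` samples
    and return their element-wise mean."""
    if target_len <= 0 or len(candidate) % target_len != 0:
        raise ValueError("candidate length must be a positive multiple of target_len")
    m = len(candidate) // target_len
    sub_groups = [candidate[i * target_len:(i + 1) * target_len] for i in range(m)]
    return [sum(column) / m for column in zip(*sub_groups)]

# Four "days" of three samples each.
candidate = [10, 20, 30,
             12, 22, 32,
             14, 24, 34,
             16, 26, 36]
first_historical = average_sub_groups(candidate, target_len=3)
```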

It is worth noting that the first historical operation index data group may also have other determination methods, for example, determination based on expert experience. The embodiment of the present application merely illustrates this schematically and does not limit it.

In the second optional manner, the database system can obtain a candidate operating indicator data group of the first operating indicator of the SQL statement, where the number of data in the candidate operating indicator data group is greater than or equal to the number of data in the first historical operating indicator data group; the database system inputs the obtained candidate operating indicator data group into the second artificial intelligence model, and the second artificial intelligence model generates the performance baseline of the first operating indicator of the SQL statement. In this manner, the second artificial intelligence model first derives the first historical operating indicator data group of the SQL statement and then generates the performance baseline of the first operating indicator.

Generally, each operating indicator of the first execution plan corresponds to a second artificial intelligence model. For the first operating indicator, the corresponding second artificial intelligence model can be obtained by training with multiple historical operating indicator data groups of the first operating indicator as training samples. For the aforementioned first optional manner, the number of data in each historical operating indicator data group used for training is the same as the number of data in the first historical operating indicator data group; for the aforementioned second optional manner, the number of data in each historical operating indicator data group used for training is the same as the number of data in the candidate operating indicator data group.

It is worth noting that the first historical operating indicator data group of the aforementioned SQL statement may be updated periodically or after an update instruction is received, and the performance baseline of the first operating indicator is updated accordingly. This ensures that the first historical operating indicator data group better reflects the historical performance of the SQL statement with respect to the first operating indicator.

Optionally, the database system may also obtain the performance baseline of the first operating index in other ways, for example, generating the performance baseline through a statistical model, or obtaining the performance baseline through manual drawing.

As an example, FIG. 4 is a schematic diagram of a performance baseline of the first operating indicator provided by an embodiment of the present application. The performance baseline can be located in a two-dimensional coordinate system in which the horizontal axis represents the collection time and the vertical axis represents the value of the operating indicator. For example, the performance baseline includes a high line, a low line, and a median line. The high line is a straight line determined based on the maximum value in the first historical operating indicator data group, usually a straight line parallel to the horizontal axis passing through that maximum; the low line is a straight line determined based on the minimum value in the first historical operating indicator data group, usually a straight line parallel to the horizontal axis passing through that minimum; the median line is a straight line between the high line and the low line determined based on the data in the first historical operating indicator data group, usually a straight line parallel to the horizontal axis passing through the median value, where the median value is the average of the data in the first historical operating indicator data group.

It is worth noting that the performance baseline of an operating index provided in the embodiment of the present application may also have other drawing methods, which is not limited in the embodiment of the present application.
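As one simple statistical way of drawing the high, low, and median lines described for FIG. 4, the sketch below derives them directly from a historical data group. It does not use an artificial intelligence model; the `Baseline` type and the sample values are assumptions made for illustration. Following the text, the "median" line is placed at the average of the data.

```python
from statistics import fmean
from typing import NamedTuple

class Baseline(NamedTuple):
    high: float    # horizontal line through the maximum
    median: float  # horizontal line through the average, as the text defines it
    low: float     # horizontal line through the minimum

def build_baseline(historical):
    """Derive the three horizontal baseline levels from historical samples."""
    if not historical:
        raise ValueError("historical data group must not be empty")
    return Baseline(high=max(historical), median=fmean(historical), low=min(historical))

baseline = build_baseline([10, 14, 12, 20])
```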

Step A112: The database system determines the performance curve of the first operating index data group. Perform step A113 or step A114.

The first operating indicator data group may form a performance curve in the coordinate system where the performance baseline of the first operating indicator is located. The database system can plot the performance curve of the first operating indicator data group directly in that coordinate system. For example, the numerical points corresponding to every two adjacent data in the first operating indicator data group are connected by a line segment to obtain the performance curve.

Step A113: When the performance curve of the first operating indicator data group does not match the performance baseline of the first operating indicator of the SQL statement, the database system determines that the performance indicated by the first operating indicator data group is lower than the performance indicated by the first historical operating indicator data group of the SQL statement.

To make this determination, the database system compares the performance curve of the first operating indicator data group with the performance baseline of the first operating indicator.

The condition for matching the performance curve of the first operating indicator data group with the corresponding performance baseline is preset, and there may be multiple setting methods. For example, the matching condition is that the deviation of the numerical points on the performance curve from the performance baseline is within a specified range. That is, when the deviation of the numerical points from the performance baseline is within the specified range, the performance curve of the first operating indicator data group matches the corresponding performance baseline; when the deviation of the numerical points from the performance baseline is not within the specified range, the performance curve of the first operating indicator data group does not match the corresponding performance baseline.

Optionally, the deviation of a numerical point on the curve from the performance baseline may be reflected by the distance between the numerical point and the performance baseline. For example, when the performance baseline includes a high line, a low line, and a median line, the deviation of a numerical point on the curve from the performance baseline may be reflected by the distance between the numerical point and at least one of the high line, the low line, and the median line. For example, a numerical point is determined to deviate from the performance baseline when the distance from the numerical point to the median line is greater than a first specified distance threshold; or when the distance from the numerical point to the median line is greater than the first specified distance threshold and the distance from the numerical point to the high line is greater than a second specified distance threshold; or when the distance from the numerical point to the median line is greater than the first specified distance threshold and the distance from the numerical point to the low line is greater than a third specified distance threshold.
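The three alternative deviation conditions can be sketched with one function whose optional thresholds select the rule. The function name and parameter names are hypothetical, distances are taken as vertical distances to the horizontal lines, and passing both optional thresholds at once combines the rules in a way the application does not describe.

```python
def deviates(value, high, median, low, first, second=None, third=None):
    """Decide whether a numerical point deviates from the baseline.

    - only `first`: distance to the median line must exceed `first`;
    - `first` and `second`: the distance to the high line must also exceed `second`;
    - `first` and `third`: the distance to the low line must also exceed `third`.
    """
    if abs(value - median) <= first:
        return False
    if second is not None and abs(value - high) <= second:
        return False
    if third is not None and abs(value - low) <= third:
        return False
    return True

# Baseline levels made up for the example: high=20, median=14, low=10.
far_above = deviates(30, 20, 14, 10, first=5)
near_median = deviates(16, 20, 14, 10, first=5)
just_above_high = deviates(21, 20, 14, 10, first=5, second=2)
```

Here the point at 21 is more than 5 away from the median line but only 1 away from the high line, so under the second rule it is not treated as deviating.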

As shown in FIG. 5, FIG. 5 is a schematic diagram of a comparison scenario between the performance curve of the first operating indicator data set and the performance baseline of the first operating indicator provided by an embodiment of the present application. By comparing whether the performance curve matches the performance baseline, it can be determined whether the performance indicated by the first operation index data group is lower than the performance indicated by the first historical operation index data group of the SQL statement.

Step A114: When the performance curve of the first operating index data group matches the performance baseline of the first operating indicator of the SQL statement, the database system determines that the performance indicated by the first operating index data group is not lower than the performance indicated by the first historical operating index data group of the SQL statement.

It is worth noting that the database system can also directly plot, based on the first operating index data group, the corresponding numerical points (usually a set of discrete numerical points) in the coordinate system where the performance baseline is located, without determining the performance curve corresponding to the first operating index data group. In that case, the foregoing step A112 can be omitted. Step A113 is replaced with: when the numerical points of the first operating index data group do not match the performance baseline of the first operating indicator of the SQL statement, the database system determines that the performance indicated by the first operating index data group is lower than the performance indicated by the first historical operating index data group of the SQL statement. Step A114 is replaced with: when the numerical points of the first operating index data group match the performance baseline of the first operating indicator of the SQL statement, the database system determines that the performance indicated by the first operating index data group is not lower than the performance indicated by the first historical operating index data group of the SQL statement.

The condition under which the numerical points of the first operating index data group match the corresponding performance baseline is preset, and it can be set in multiple ways. For example, the matching condition is one of the following. First matching condition: the number of numerical points in the first operating index data group that deviate from the performance baseline is less than a specified number threshold. Second matching condition: the proportion of numerical points that deviate from the performance baseline, relative to the total number of numerical points in the first operating index data group, is less than a specified ratio threshold.

Optionally, the degree of deviation of the numerical point from the performance baseline may be reflected by the distance between the numerical point and the performance baseline. For example, when the performance baseline includes a high line, a low line, and a median line, the degree of deviation of the value point from the performance baseline may be reflected by the distance between the value point and at least one of the high line, the low line, and the median line. The relevant explanation can refer to the explanation of the degree of deviation from the performance baseline of the numerical point on the aforementioned curve.

As shown in FIG. 6, FIG. 6 is a schematic diagram of a scenario, provided by an embodiment of the present application, in which the numerical points of the first operating index data group are compared with the performance baseline of the first operating indicator. By comparing whether the multiple discrete numerical points corresponding to the first operating index data group match the performance baseline, it can be determined whether the performance indicated by the first operating index data group is lower than the performance indicated by the first historical operating index data group. For example, suppose that the matching condition is the aforementioned first matching condition, that 4 of the 6 numerical points deviate from the performance baseline, and that the specified number threshold is 3. The number of numerical points that deviate from the performance baseline (4) is then greater than the specified number threshold (3), so the 6 discrete numerical points corresponding to the first operating index data group do not match the performance baseline. Accordingly, it is determined that the performance indicated by the first operating index data group is lower than the performance indicated by the first historical operating index data group.
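The two point-based matching conditions, and the 6-point example above, can be illustrated with a short sketch. The function name `matches_baseline` and its parameters are hypothetical; each flag is assumed to already say whether one numerical point deviates from the performance baseline.

```python
from typing import Optional, Sequence


def matches_baseline(deviation_flags: Sequence[bool],
                     max_count: Optional[int] = None,
                     max_ratio: Optional[float] = None) -> bool:
    """Return True if the numerical points match the performance baseline.

    First matching condition: number of deviating points < max_count.
    Second matching condition: share of deviating points < max_ratio.
    """
    deviating = sum(deviation_flags)
    if max_count is not None:
        return deviating < max_count
    if max_ratio is not None:
        return deviating / len(deviation_flags) < max_ratio
    raise ValueError("supply max_count or max_ratio")


# Example from the text: 4 of 6 points deviate, number threshold is 3.
flags = [True, True, True, True, False, False]
degraded = not matches_baseline(flags, max_count=3)
```

With these inputs the points do not match the baseline, so the performance is judged to be lower than the historical performance.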

The database system inputs the first operation index data group into the first artificial intelligence model, and when the first artificial intelligence model outputs indication information indicating that the first operation index data group is abnormal, it is determined that the first operation index is abnormal.

The content of the first operating index data group can refer to the description in the foregoing first implementable manner. The indication information can take multiple forms. In one optional manner, one operating index data group is input to the first artificial intelligence model each time, and the indication information may include an indication label (or classification label) that identifies whether the operating index data group input to the first artificial intelligence model is abnormal. In another optional manner, multiple operating index data groups are input to the first artificial intelligence model each time, and the indication information may include operating index data groups carrying indication labels; that is, the output operating index data groups are the same as the input operating index data groups, but each of them carries an indication label. In yet another optional manner, at least one operating index data group is input to the first artificial intelligence model each time, and the indication information may include the abnormal operating index data groups; that is, only abnormal operating index data groups are output, and normal operating index data groups are not output.

It is worth noting that, usually, each operating indicator of an execution plan corresponds to one first artificial intelligence model. For a given operating indicator, the corresponding first artificial intelligence model can be trained using multiple historical operating index data groups of that operating indicator as training samples.

The foregoing two implementable manners can be combined according to actual conditions. For example, when multiple operating index data groups are obtained (for example, operating index data groups corresponding to multiple SQL statements, or multiple operating index data groups for one operating indicator of one SQL statement), the method provided in the first implementable manner can be executed first to perform a rough screening for abnormal operating index data groups. The rough screening has a certain error, and a non-abnormal operating index data group may be determined to be abnormal. The method provided in the second implementable manner is then executed on the operating index data groups obtained by the rough screening (for example, those whose rough screening result is abnormal) to perform a fine screening for abnormal operating index data groups.

Taking the first execution plan as an example, for each operating indicator of the first execution plan, when the performance indicated by the operating index data group is lower than the performance indicated by the historical operating index data group of the SQL statement, the operating index data group is input into the first artificial intelligence model; when the first artificial intelligence model outputs indication information indicating that the operating index data group is abnormal, it is determined that the operating index data group of the first execution plan is abnormal, which can improve the accuracy of determining that an operating index data group is abnormal. When the performance indicated by the operating index data group is not lower than the performance indicated by the historical operating index data group of the SQL statement, there is no need to input the operating index data group into the first artificial intelligence model, which reduces the computational cost of the first artificial intelligence model compared with the foregoing second implementable manner.

Therefore, compared with using only the method provided in the first implementable manner, combining the two implementable manners can improve the screening accuracy; compared with using only the method provided in the second implementable manner, the combination can improve the screening efficiency.
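The combined screening can be sketched as a two-stage filter. This is only an illustration under stated assumptions: `coarse_is_degraded` stands in for the baseline comparison and `model_is_abnormal` stands in for the first artificial intelligence model; both are hypothetical callables supplied by the caller.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

G = TypeVar("G")


def screen(groups: Sequence[G],
           coarse_is_degraded: Callable[[G], bool],
           model_is_abnormal: Callable[[G], bool]) -> Tuple[List[G], int]:
    """Rough screening first, fine screening only on the survivors.

    Returns the groups judged abnormal and the number of model calls made.
    """
    abnormal: List[G] = []
    model_calls = 0
    for group in groups:
        # Stage 1: inexpensive baseline comparison (rough screening).
        if not coarse_is_degraded(group):
            continue
        # Stage 2: the model is consulted only for rough-screening hits.
        model_calls += 1
        if model_is_abnormal(group):
            abnormal.append(group)
    return abnormal, model_calls
```

Because the model runs only on groups that fail the baseline comparison, the number of model calls is at most the number of input groups and is usually smaller.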

In the foregoing step A11, the process of determining whether each of the other operating indicators of the first execution plan is abnormal can refer to the process of determining whether the first operating indicator is abnormal. Details are not repeated in the embodiments of this application.

In the foregoing step A12, the database system may determine whether the operating index of the first execution plan is abnormal based on the abnormality determination result of each operating indicator. There are many ways to determine whether the operating index of the first execution plan is abnormal. The following is an illustration:

The operating index score S satisfies the score calculation formula:

$$S = \sum_{i=1}^{N} P_i X_i$$

Here, $X_i$ represents the abnormality level of the operating index data group corresponding to the i-th operating indicator, 1≤i≤N, N is the total number of operating indicators, and $P_i$ is the weight of the i-th operating indicator. $P_i$ is determined by the importance (or priority) of the i-th operating indicator. The abnormality level of each operating indicator is determined based on the abnormality determination result of that operating indicator. The abnormality level reflects the degree of abnormality of the data in the operating index data group corresponding to the operating indicator; generally, the more abnormal data there is, the higher the abnormality level. S represents the weighted sum of the abnormality levels of the operating index data groups corresponding to the N operating indicators.
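The weighted sum described above can be computed as in the following sketch. The function name `operating_index_score` and the sample levels and weights are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def operating_index_score(levels: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted sum of abnormality levels: S = sum of P_i * X_i.

    levels[i] is the abnormality level X_i of the i-th operating indicator,
    weights[i] is its weight P_i.
    """
    if len(levels) != len(weights):
        raise ValueError("one weight is required per operating indicator")
    return sum(p * x for p, x in zip(weights, levels))


# Three indicators with abnormality levels 2, 0, 1 and weights 3, 1, 2.
score = operating_index_score([2, 0, 1], [3, 1, 2])
```

A higher score here corresponds to a higher probability that the execution plan is abnormal, in line with the positive correlation the text assumes.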

It is worth noting that the aforementioned second optional method defines the operating index score so that it is positively correlated with the probability that an abnormality occurs. In actual implementation, an operating index score that is negatively correlated with the probability of an abnormality can also be defined; that is, the higher the operating index score, the lower the probability of an abnormality, and the score calculation formula is adjusted accordingly. For example, the updated operating index score S' is 1/S, where S is the aforementioned operating index score. Details are not repeated in the embodiments of this application.

Step A2: When the operation index of the first execution plan is abnormal, it is determined that the performance of the first execution plan of the SQL statement is degraded.

Step A3: When the operation index of the first execution plan is not abnormal, it is determined that the performance of the first execution plan of the SQL statement has not deteriorated.

Step 203: After determining through analysis that the performance of the first execution plan of the SQL statement has degraded, the database system obtains the second execution plan of the SQL statement.

Optionally, the performance of the second execution plan is different from the performance of the first execution plan. Further optionally, the performance of the second execution plan is better than the performance of the first execution plan. There may be multiple ways for the database system to obtain the second execution plan of the SQL statement. The embodiment of the present application takes the following two obtaining ways as examples for description:

The first acquisition method: the database system acquires the historical execution plan of the SQL statement; when the performance of the historical execution plan is better than that of the first execution plan, the historical execution plan is determined as the second execution plan.

The database system (such as the optimization node or the first management node) may have generated other execution plans of the SQL statement before the first execution plan. Since the first execution plan is the execution plan currently in use, these other execution plans have become historical execution plans. The database system can select a historical execution plan from the other execution plans of the SQL statement and determine whether it can be used as the second execution plan. For example, the historical execution plan may be the execution plan with the best performance among the other execution plans of the SQL statement, the execution plan with the longest historical use time, or an execution plan that meets other set conditions.

The database system can compare the performance of the historical execution plan with the performance of the first execution plan to determine which is better. There may be many ways to make this comparison. Assuming that the operating index score is positively correlated with the probability of an abnormality, the database system can determine the operating index score of the first execution plan, obtain the operating index score of the historical execution plan, and compare the two. When the operating index score of the first execution plan is less than the operating index score of the historical execution plan, it is determined that the performance of the first execution plan generated by the first management node is better than that of the historical execution plan; when the operating index score of the first execution plan is not less than the operating index score of the historical execution plan, it is determined that the performance of the first execution plan generated by the first management node is not better than that of the historical execution plan. The method for determining the operating index score of the first execution plan can refer to the second optional method in step A12 above.

There may be multiple ways to determine the operating index scores of the historical execution plan. The embodiments of this application take the following as examples for illustration:

In an optional example, refer to step 202. Each time a new execution plan of the SQL statement is adopted, the database system analyzes whether the performance of that execution plan is degraded. In the course of this analysis, the database system obtains indication information indicating the performance of the execution plan; for example, the indication information includes the data corresponding to the operating indicators of the execution plan (for example, the operating index data group corresponding to each operating indicator of the execution plan) or the operating index score of the execution plan. The database system can record the obtained indication information. In this way, the database system records the indication information of each historical execution plan. The database system can then determine the operating index score of the historical execution plan based on the pre-recorded indication information of the historical execution plan, and the operating index score reflects the performance of the historical execution plan.

In another optional example, in the database system, the generated execution plan carries estimated performance overhead, and the larger the performance overhead, the worse the performance of the execution plan. The database system may determine the running index score of the historical execution plan based on the performance overhead, and the running index score reflects the performance of the historical execution plan. For example, the performance overhead may include data of estimated operating indicators of the execution plan. For example, the performance overhead of the historical execution plan includes: estimated IO indicators, time delay, error information, execution times, and/or processing time data of the historical execution plan (or the abnormality level converted from the data). The operating index score is determined based on the estimated data of each of the foregoing operating indicators, and the determination method may refer to the second optional method in the foregoing step A12. For another example, the performance overhead may directly include the estimated running index score.

The embodiment of the present application may also compare the performance of the historical execution plan with the performance of the first execution plan in other ways, for example, compare the operation indexes of the historical execution plan with the operation indexes of the first execution plan one by one, and based on the comparison result, It is determined whether the performance of the historical execution plan is better than the performance of the first execution plan, which is not limited in the embodiment of the present application.

Since the historical execution plan has been generated in advance, the first acquisition method can quickly acquire a historical execution plan that has better performance than the first execution plan as the second execution plan, which can achieve rapid optimization of the execution plan and improve optimization efficiency.

The second acquisition method: the database system generates the second execution plan of the SQL statement.

Optionally, the computing resources occupied by generating the second execution plan of the SQL statement are greater than the computing resources occupied by generating the first execution plan of the SQL statement, and/or the time taken to generate the second execution plan of the SQL statement is longer than the time taken to generate the first execution plan of the SQL statement.

The first execution plan is usually generated online by the first management node. To avoid an excessively long delay that would affect user experience, the time allowed for generating the first execution plan is limited. For example, the first execution plan needs to be generated within the first duration threshold. For example, in step 201, the first management node may perform the following steps: generate multiple candidate execution plans based on the optimization rule information and/or optimization cost information of the first database, and determine the first execution plan among the multiple candidate execution plans. The time taken to generate the candidate execution plans and determine the first execution plan needs to be within the first duration threshold.

In addition, in order to avoid occupying computing resources and affecting other user services, the computing resources occupied when the first execution plan is generated are also limited to a certain extent. For example, the computing resources occupied by generating the first execution plan are less than the first computing resource threshold. That is, the computing resources occupied by the foregoing process of generating the candidate execution plan and determining the first execution plan is less than the first computing resource threshold. Wherein, the computing resources may include CPU resources, memory resources, and/or hard disk resources required by the first management node during operation.

The first duration threshold may be determined based on the allowable interruption duration in the foregoing reliability index; for example, the first duration threshold is less than or equal to the allowable interruption duration. The first computing resource threshold may be determined based on the IO upper limit in the foregoing reliability index or on other parameters related to computing resources.

In the embodiment of the present application, the database system (for example, the optimization node) may generate multiple candidate execution plans based on the optimization rule information and/or optimization cost information of the first database, and determine the second execution plan among the multiple candidate execution plans. The process of determining the second execution plan can, however, be an offline calculation process. Because generating the second execution plan offline does not affect the normal business of the first database and is not restricted by that business, more time and/or computing resources can be used to determine the second execution plan of the SQL statement. In this case, the process in which the database system generates the candidate execution plans and determines the second execution plan can refer to the corresponding process of generating the first execution plan described above. That is, the second execution plan and the first execution plan may be generated on the same principle. However, the computing resources occupied by generating the second execution plan of the SQL statement and/or the time taken to generate it are not constrained, or are less constrained than in the process of generating the first execution plan. Accordingly, generating multiple candidate execution plans based on the optimization rule information and/or optimization cost information of the first database here means that the database system generates the candidate execution plans based on other information in the optimization rule information and/or optimization cost information, where the other information is the information other than that related to computing resources and computing time.

It can be seen that the computing resources occupied when the database system generates the second execution plan of the SQL statement are greater than the computing resources occupied by generating the first execution plan of the SQL statement, and/or the time taken to generate the second execution plan of the SQL statement is longer than the time taken to generate the first execution plan of the SQL statement.

For example, a second duration threshold and a second computing resource threshold may be set for the database system. The second duration threshold is greater than the first duration threshold, and the second computing resource threshold is greater than the first computing resource threshold. The database system generates candidate execution plans under the constraints of the second duration threshold and the second computing resource threshold, and then determines the second execution plan among the candidate execution plans. For this process, refer to the corresponding process of generating the first execution plan.

Since there are fewer restrictions on generating the second execution plan, its performance is usually better than that of the first execution plan. Assume that the first execution plan is generated by the first management node and the second execution plan is generated by the optimization node. In the first example, the number of candidate execution plans generated by the first management node (such as 100) is less than the number of candidate execution plans generated by the optimization node (such as 10000). The first management node traverses the candidate execution plans it generated to determine the first execution plan, and the optimization node likewise traverses the candidate execution plans it generated to determine the second execution plan. The two traversal processes are the same, but because the optimization node has more candidate execution plans to choose from, the finally determined second execution plan is more likely to be better than the first execution plan.

In the second example, the number of candidate execution plans generated by the first management node is the same as the number of candidate execution plans generated by the optimization node (for example, both are 10,000). However, limited by the first duration threshold, the first management node cannot traverse all of the generated candidate execution plans to determine the first execution plan (it can only scan some of them), whereas the optimization node can traverse all of the generated candidate execution plans to determine the second execution plan. Although the same number of candidate execution plans are generated, the first management node and the optimization node scan different numbers of candidate execution plans, so the finally determined second execution plan is more likely to be better than the first execution plan.

In order to facilitate readers' understanding, suppose that the optimization rule information of the first database indicates that the query data of the SQL statement is returned in 5 milliseconds, and the first duration threshold is 0.5 milliseconds.

Continuing with the foregoing second example, suppose the first management node generates 1000 candidate execution plans. Within 0.5 milliseconds, the first management node can scan only 100 of the 1000 candidate execution plans, so it determines the first execution plan based on those 100 candidate execution plans. Assume that the optimization node also generates 1000 candidate execution plans, which are the same as the 1000 candidate execution plans generated by the first management node. The optimization node traverses all 1000 candidate execution plans to determine the second execution plan among them.
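The effect of the scan limit in this example can be sketched as follows. This is a simplified illustration: `pick_plan` and the cost function are hypothetical, and a fixed `scan_limit` count stands in for the first duration threshold rather than measuring real time.

```python
from typing import Callable, Optional, Sequence, TypeVar

P = TypeVar("P")


def pick_plan(candidates: Sequence[P],
              cost: Callable[[P], float],
              scan_limit: Optional[int] = None) -> P:
    """Pick the lowest-cost plan among the candidates that are scanned.

    scan_limit=None means full traversal (offline, optimization node).
    A finite scan_limit models the online time budget (assumption).
    """
    scanned = candidates if scan_limit is None else candidates[:scan_limit]
    return min(scanned, key=cost)


# 1000 identical candidates on both sides; hypothetical cost function
# whose minimum lies outside the first 100 candidates.
candidates = list(range(1000))
cost = lambda plan: abs(plan - 640)

first_plan = pick_plan(candidates, cost, scan_limit=100)  # online, partial
second_plan = pick_plan(candidates, cost)                 # offline, full
```

Full traversal can never return a plan with a higher cost than the partial scan, since the partially scanned candidates are a subset of the full set.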

In summary, since the restrictions on generation time and/or occupied computing resources are relaxed in the process of generating the second execution plan, it can be ensured that the performance of the finally determined second execution plan is better than the performance of the first execution plan.

It is worth noting that the aforementioned first acquisition method and second acquisition method can also be used in combination. For example, the second acquisition method is executed if the database system cannot obtain a historical execution plan of the SQL statement (for example, there are no other execution plans for the SQL statement, or the other existing execution plans do not meet the conditions), or if a historical execution plan can be obtained but its performance is not better than that of the first execution plan.
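The combined use of the two acquisition methods can be sketched as a fallback. The names below are hypothetical, and the sketch assumes that each historical execution plan is stored with an operating index score for which a lower value means better performance, and that the best-scoring historical plan is the one selected.

```python
from typing import Callable, Sequence, Tuple, TypeVar

P = TypeVar("P")


def obtain_second_plan(first_score: float,
                       historical: Sequence[Tuple[P, float]],
                       generate_offline: Callable[[], P]) -> Tuple[P, str]:
    """Try the first acquisition method, then fall back to the second.

    historical holds (plan, score) pairs; lower score = better (assumption).
    Returns the chosen plan and a label naming the method that produced it.
    """
    if historical:
        best_plan, best_score = min(historical, key=lambda pair: pair[1])
        # Use the historical plan only if it beats the first plan.
        if best_score < first_score:
            return best_plan, "historical"
    # No usable historical plan: generate a new plan offline.
    return generate_offline(), "generated"
```

For instance, with a first-plan score of 5 and historical scores of 7 and 3, the plan scored 3 would be reused; with only the plan scored 7, a new plan would be generated.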

Optionally, the database system may also use other methods to obtain the second execution plan of the SQL statement. For example, the user may input an execution plan update instruction, and correspondingly, the database system may receive an execution plan update instruction, and the execution plan update instruction includes the second execution plan.

In the embodiment of the present application, the user can also control whether to intervene in the execution plan. In that case, before the aforementioned step 203, that is, after the database system determines through analysis that the performance of the first execution plan of the SQL statement has degraded, the database system may also send alarm indication information. The alarm indication information indicates that the performance of the first execution plan of the SQL statement has degraded. Optionally, the alarm indication information may include the SQL statement, that is, the SQL statement whose performance has degraded; the alarm indication information may also include a task indicating intervention in the execution plan. Based on the alarm indication information, the user can learn which SQL statement currently has degraded performance and decide whether to intervene in the execution plan. Correspondingly, in step 203, the database system obtains the second execution plan of the SQL statement after receiving an execution plan optimization instruction. The execution plan optimization instruction is used to instruct intervention in the degraded execution plan. Optionally, the execution plan optimization instruction and the execution plan update instruction may be the same instruction.

Optionally, the alarm indication information may be presented through a user interface. FIG. 7 is a schematic diagram of an exemplary user interface 30 provided by an embodiment of the present application. In addition to the alarm indication information 301, the user interface 30 may also present a determination option 302, a prohibition option 303, and/or a delay optimization option 304. When the user decides to optimize the execution plan, the user triggers the determination option 302; correspondingly, the database system receives an execution instruction and optimizes the execution plan based on the execution instruction. When the user decides to prohibit optimization of the execution plan, the user triggers the prohibition option 303; correspondingly, the database system receives an execution prohibition instruction and, based on it, does not optimize the execution plan. When the user decides to delay optimization of the execution plan, the user triggers the delay optimization option 304; correspondingly, the database system receives a deferred execution instruction and optimizes the execution plan after the time point indicated by the deferred execution instruction is reached. The time point indicated by the delay optimization option may be a preset time point, such as a shutdown time point or a power-on time point; or it may be a time point set by the user, such as one hour later or one day later.
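How the three options might map to a scheduling decision is sketched below. The `Choice` enum, the `schedule_optimization` function, and the use of plain numeric timestamps in seconds are all hypothetical; the text does not specify how the instructions are encoded.

```python
from enum import Enum
from typing import Optional


class Choice(Enum):
    CONFIRM = "confirm"    # determination option 302
    PROHIBIT = "prohibit"  # prohibition option 303
    DELAY = "delay"        # delay optimization option 304


def schedule_optimization(choice: Choice, now: float,
                          delay_until: Optional[float] = None
                          ) -> Optional[float]:
    """Return when to optimize the execution plan, or None if prohibited."""
    if choice is Choice.CONFIRM:
        return now
    if choice is Choice.PROHIBIT:
        return None
    if choice is Choice.DELAY:
        if delay_until is None:
            raise ValueError("a delayed optimization needs a time point")
        # Never schedule in the past.
        return max(now, delay_until)
    raise ValueError(f"unknown choice: {choice}")
```

For example, choosing the delay option with a time point one hour ahead yields that time point, while the prohibition option yields no scheduled optimization.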

Step 204: The database system converts the second execution plan into a second execution plan matching the first management node.

The second execution plan obtained by the database system needs to be parsed by the first management node when it is used. Although the second execution plan can be parsed by the first management node, the second execution plan may not be fully adapted to the first management node. This slows down the parsing of the second execution plan by the first management node, which in turn causes a longer parsing delay at the first management node and reduces the efficiency of loading and using the second execution plan.

In the embodiments of the present application, the second execution plan can be converted into a second execution plan that matches the first management node, so that the first management node can quickly parse the converted second execution plan. This reduces the parsing delay of the first management node and improves the efficiency with which the first management node loads and uses the second execution plan.

For example, the database system may convert the second execution plan into a second execution plan matching the first management node by querying a correspondence. The process includes: the database system queries the correspondence between management nodes and execution plan formats to obtain the execution plan format corresponding to the first management node; the database system then converts the second execution plan into a second execution plan conforming to that execution plan format.
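The lookup-then-convert process can be sketched as follows. Everything concrete here is an assumption made for illustration: the node identifiers, the two format names, and the representation of a plan as a list of step strings are hypothetical and are not taken from the embodiment.

```python
import json
from typing import Dict, List

# Hypothetical correspondence between management nodes and plan formats.
NODE_FORMATS: Dict[str, str] = {"node-a": "json", "node-b": "text"}


def convert_plan(plan_steps: List[str], node_id: str,
                 formats: Dict[str, str] = NODE_FORMATS) -> str:
    """Convert a plan into the format registered for the given node."""
    # Step 1: query the correspondence to find the node's plan format.
    plan_format = formats[node_id]
    # Step 2: serialize the plan so that it conforms to that format.
    if plan_format == "json":
        return json.dumps({"steps": plan_steps})
    if plan_format == "text":
        return " -> ".join(plan_steps)
    raise ValueError(f"unsupported plan format: {plan_format}")
```

A node missing from the correspondence raises `KeyError` in this sketch; a real system would need to decide whether to fall back to the unconverted plan, as the text allows.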

Step 205: The optimization node executes the second execution plan to replace the execution of the first execution plan.

In a first optional manner, the optimization node may skip the foregoing step 204 and directly execute the second execution plan to replace the execution of the first execution plan; in another optional manner, the optimization node may execute the second execution plan converted in step 204 to replace the execution of the first execution plan.

In an exemplary implementation manner, the second execution plan carries a description prompt label (hint tag), which identifies the corresponding execution plan as an updated execution plan (also called an intervened execution plan). For example, the database system (such as the optimization node) can add a description prompt label to the second execution plan, and based on the description prompt label the first management node can determine that the corresponding execution plan is an execution plan produced by the intervention of the database system (such as the optimization node), rather than an execution plan normally generated by the first management node itself (for example, generated online). The description prompt label may be composed of one or more characters, and the characters may be numeric characters, alphabetic characters, and so on.

Optionally, referring again to FIG. 3, the second execution plan to which the description prompt label is added is still stored in the aforementioned system table 1042, where it replaces the first execution plan recorded in the system table 1042. In an optional manner, the first execution plan may be deleted first, and then the second execution plan may be added; in another optional manner, the second execution plan may be used to overwrite the first execution plan. Execution plans in a traditional database system do not carry a description prompt label. The embodiment of the present application adds a description prompt label to the second execution plan so that it can be distinguished from an execution plan normally generated by the first management node 101a. The first management node 101a loads the second execution plan to which the description prompt label is added. Generally, the performance of the second execution plan is better than that of the first execution plan, so that the optimization of the execution plan is realized.

In another exemplary implementation manner, partitioned storage may be used to distinguish the updated execution plan from the normally generated execution plan. For example, the database system (such as the optimization node) stores the second execution plan in another system table, which is used to record updated execution plans. When the updated execution plan is no longer allowed to be used (for example, after the intervention process on the execution plan of the first management node is stopped), the updated execution plan corresponding to the first management node in the other system table can be cleared. The other system table therefore usually has one of two states: a state in which an updated execution plan of the first management node is stored, and a state in which no updated execution plan of the first management node is stored.

When the first management node needs to load the execution plan of the SQL statement, it can first query the other system table. If the execution plan of the SQL statement is stored in the other system table, that execution plan is loaded; if the execution plan of the SQL statement is not stored in the other system table, the execution plan of the SQL statement in the aforementioned system table is loaded.
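The lookup order described above can be sketched as follows. The two tables are modeled as plain dictionaries keyed by a hypothetical SQL statement identifier; the function names are assumptions made for illustration.

```python
from typing import Optional


def load_plan(sql_id: str, updated_table: dict, system_table: dict) -> Optional[str]:
    """Load the plan for a SQL statement, preferring the table of updated plans."""
    plan = updated_table.get(sql_id)
    if plan is not None:
        return plan                  # an updated (intervened) plan exists
    return system_table.get(sql_id)  # fall back to the normally generated plan


def stop_intervention(updated_table: dict) -> None:
    """Clear the updated plans when they are no longer allowed to be used."""
    updated_table.clear()
```

After `stop_intervention` is called, every lookup falls through to the original system table, which matches the second of the two states described above.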

It is worth noting that the foregoing step 204 and step 205 are described by taking the case in which the performance of the first execution plan is degraded as an example. When the analysis in step 202 determines that the performance of the first execution plan is not degraded, the use of the first execution plan can be maintained. In this case, as shown in FIG. 3, what the first management node 101a obtains from the system table 1042 is still the first execution plan (not shown in FIG. 3).

Step 206: When the performance of the new version of the execution plan is better than the performance of the second execution plan, the database system executes the new version of the execution plan to replace the execution of the second execution plan.

If the hardware or software environment changes, or the first management node reaches the cycle for generating a new version of the execution plan, the database system (such as the first management node) generates a new version of the execution plan; assume that the new version of the execution plan is the third execution plan. The database system can analyze the performance of the third execution plan, and if the performance of the third execution plan is better than that of the second execution plan, the database system executes the third execution plan to replace the execution of the second execution plan. When the performance of the third execution plan is not better than that of the second execution plan, the second execution plan still performs better, and the database system keeps executing the second execution plan. In this case, each time the database system generates a third execution plan, it compares the performance of the second execution plan with that of the third execution plan, until the performance of a third execution plan is better than that of the second execution plan, at which point the database system executes the third execution plan to replace the execution of the second execution plan.

There are many ways to compare the performance of the second execution plan with the performance of the third execution plan. For example, in a database system, a generated execution plan carries an estimated performance overhead, and the larger the performance overhead, the worse the performance of the execution plan. Assuming that the running index score is positively correlated with the probability of abnormality, the database system can determine the running index score of the third execution plan based on the performance overhead, obtain the running index score of the second execution plan, and compare the two scores. When the running index score of the third execution plan is less than that of the second execution plan, it is determined that the performance of the third execution plan is better than that of the second execution plan; when the running index score of the third execution plan is not less than that of the second execution plan, it is determined that the performance of the third execution plan is not better than that of the second execution plan.
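The score comparison can be sketched as follows. The weighted sum used to derive a score from the estimated overhead is a stand-in chosen for illustration; the embodiment refers to the method of step A12 for the actual determination, and the indicator names and weights here are hypothetical.

```python
def running_index_score(estimated_cost: dict, weights: dict) -> float:
    """Combine estimated indicator values into a single score.

    Assumption: a higher score means a higher probability of abnormality,
    so a lower score indicates better performance.
    """
    return sum(weights[name] * value for name, value in estimated_cost.items())


def is_better(candidate_score: float, current_score: float) -> bool:
    """The candidate plan is better only if its score is strictly lower."""
    return candidate_score < current_score
```

Because the comparison is strict, a third execution plan with the same score as the second execution plan does not trigger a replacement, which matches the "not less than" branch above.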

For example, the performance overhead may include data corresponding to the estimated operating indicators of the third execution plan (that is, the operating indicator data group corresponding to each operating indicator in the operating indicators of the execution plan). For example, the performance overhead of the third execution plan includes: the estimated IO indicators, time delay, error information, execution times, and/or processing time data of the third execution plan (or the abnormality level converted from the data). The operating index score is determined based on the estimated data of each of the foregoing operating indicators, and the determination method may refer to the second optional method in the foregoing step A12. For another example, the performance overhead may directly include the estimated running index score.

The embodiment of the present application can also compare the performance of the second execution plan with the performance of the third execution plan in other ways to determine whether the performance of the third execution plan is better than the performance of the second execution plan, which is not limited in the embodiment of the present application.

Optionally, the database system can also obtain the third execution plan of the SQL statement (that is, the aforementioned new version of the execution plan) in other ways. For example, the user can input an execution plan version update instruction; accordingly, the database system receives the execution plan version update instruction, which includes the third execution plan.

In the embodiment of the present application, the user can also set the rules in the database system. Then, the method for processing the execution plan provided in the embodiment of the present application may further include: receiving a rule setting instruction, where the rule setting instruction includes the set rule. Optionally, the rules in the database system may include at least one of the following: SQL performance comparison rules, description prompt label setting rules, alarm rules, and routing rules.

The SQL performance comparison rule is a rule that instructs the database system how to determine whether the performance of the first execution plan of the SQL statement is degraded. For example, based on the SQL performance comparison rule, the database system can execute the aforementioned steps A1 to A3. The description prompt label setting rule is a rule that instructs the database system how to set the description prompt label. For example, based on the setting rules of the description prompt label, the database system may use the method provided in the foregoing step 205 to add the description prompt label. The alarm rule is a rule that instructs the database system how to alarm. For example, based on the alarm rule, the database system can issue the aforementioned alarm indication information. The routing rule is a rule that instructs the database system how to store the acquired data (such as the data corresponding to the running index). For example, based on routing rules, the database system may store the acquired data in the operating index data group in a designated storage space.

In summary, after determining that the performance of the first execution plan is degraded, the embodiment of the present application replaces the first execution plan with the second execution plan, so that the database does not keep using a first execution plan with degraded performance. This reduces the impact of the degraded execution plan on the database and thus ensures the performance of the database. Optionally, the performance of the second execution plan is better than the performance of the first execution plan, thereby effectively ensuring the performance of the database.

It should be noted that the order of the steps in the processing method of the execution plan provided in the embodiment of the application can be adjusted appropriately, and the steps can also be increased or decreased according to the situation. For example, the aforementioned steps 202, 204, and 206 may not be executed. Those skilled in the art can easily think of various methods within the technical scope disclosed in this application, which should be covered by the protection scope of this application, and therefore will not be repeated.

The database system of the embodiment of the present application supports the function of intervening in the execution plan of the management node, and this function can be triggered in a variety of ways. For example, in an optional manner, after receiving an execution plan intervention instruction, the database system executes the intervention process on the execution plan of the management node, such as starting to execute step 202 to step 206; that is, after receiving the instruction, it starts to analyze whether the performance of the first execution plan of the SQL statement is degraded. The execution plan intervention instruction can be triggered by the user through an application or by a designated device. In another optional manner, the database system periodically performs the intervention process on the execution plan of the first management node, such as step 202 to step 206.

FIG. 8 is a schematic structural diagram of the database system provided by an embodiment of the present application. As shown in FIG. 8, the database system includes an optimization node. FIG. 8 assumes that the optimization node is located outside the management node (not shown in FIG. 8). The optimization node includes a processing module, an artificial intelligence calculation engine, an analysis module, an operation module, and an alarm module. The analysis module includes a diagnosis sub-module and an optimization sub-module, and the operation module includes a cluster management sub-module and an instance management sub-module.

The artificial intelligence calculation engine may store the aforementioned first artificial intelligence model and the aforementioned second artificial intelligence model, and perform corresponding calculations based on the stored artificial intelligence models, for example, the calculations corresponding to the aforementioned step A111 and to the second method provided by the aforementioned step A114. The analysis module is used to analyze whether the performance of the first execution plan of the SQL statement is degraded and, after determining that the performance of the first execution plan is degraded, to replace the first execution plan. The diagnosis sub-module is used to diagnose whether the performance of the first execution plan of the SQL statement is degraded, and can execute the aforementioned step 202; the optimization sub-module is used to replace the first execution plan after the diagnosis sub-module determines that the performance of the first execution plan is degraded, and can execute the aforementioned step 203 to step 205. The operation module is used to manage operations in the database system, where the cluster management sub-module is used to manage the database cluster, and the instance management sub-module is used to manage the database instances. The alarm module is used to send alarm indication information.

The database system also includes multiple databases, the multiple databases including: one or more relational databases, one or more index databases, and a configuration database.

Among them, the relational database is the database mainly maintained by the database system. Fig. 8 takes a total of 3 relational databases, databases 1 to 3 respectively, as an example for illustration. Each database includes a management node and one or more data nodes managed by the management node, and each database can provide system views, SQL indicators, and/or an application programming interface (API) of the management node. The SQL indicator refers to the operation indicator of the execution plan of the aforementioned SQL statement. The management node API refers to the API of the management node in the database.

The index database is used to store the data involved in the processing method of the execution plan provided in the embodiment of the present application. In the embodiment of the present application, the number of the index database may increase according to the increase in the amount of stored data. Figure 8 takes a total of 3 index databases, namely index databases 1 to 3 as examples. The configuration database is used to store rules in the database system, such as SQL performance comparison rules, description prompt label setting rules, alarm rules and/or routing rules. The database system is maintained with an operating system (OS), and the operating system may be a Linux or Windows operating system. The operating system can control CPU, disk, memory, network, and/or mainboard, etc.

Based on the original database service, the database system adds a collection layer, a storage and processing layer, a service layer, and a page (view) layer according to functions. Among them, there is an API between the collection layer and the storage and processing layer, and there is an API between the service layer and the page layer. The page layer can provide visual pages. Through this visualization page, the user can control whether to intervene in the execution plan, or enter the execution plan version update instruction, or enter the execution plan intervention instruction. For example, the visualization page may present a user interface as shown in FIG. 7.

To facilitate understanding, the processing method of the execution plan is schematically described below based on the structure of the database system shown in FIG. 8. The optimization node can deploy a proxy module on the host of each database in the database system; each proxy module collects the data corresponding to the running indicators on the database host where it is deployed and sends the collected data to the processing module. For example, the proxy module may use a message queue (MQ) to send the collected data to the processing module, and the processing module performs streaming processing on the received data. The processing module can also store the execution plans of all SQL statements stored in the system table of the database system (such as the system table in the foregoing embodiment) into the index database corresponding to the management node. The streaming processing performed by the processing module can be done in two ways: online and offline. Optionally, the processing module may obtain the performance baseline of each operating indicator calculated by the artificial intelligence calculation engine, and store the performance baseline of each operating indicator in the index database. Optionally, the user can set routing rules through the aforementioned rule setting instruction. Correspondingly, the processing module stores the data corresponding to the operating indicators of the execution plan of the SQL statement in the index database based on the set routing rules and/or the performance baseline of each operating indicator.

Assuming that database X is any one of the aforementioned databases 1 to 3, the analysis module analyzes whether the performance of the first execution plan of the SQL statement in database X is degraded. After it is determined that the performance of the first execution plan is degraded, the optimization sub-module obtains the second execution plan of the SQL statement and controls database X to execute the second execution plan to replace the execution of the first execution plan. The alarm module sends alarm indication information when the alarm condition is reached.
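The collection path from the proxy modules to the processing module can be sketched as follows. Python's in-process `queue.Queue` stands in for the message queue purely for illustration; a real deployment would use a dedicated message queue service, and the message fields (`host`, `metric`, `value`) are hypothetical.

```python
import queue
from collections import defaultdict


def agent_collect(host: str, samples: list, mq: queue.Queue) -> None:
    """Proxy module: publish each collected (metric, value) sample to the queue."""
    for metric, value in samples:
        mq.put({"host": host, "metric": metric, "value": value})


def process_stream(mq: queue.Queue) -> dict:
    """Processing module: drain the queue and group values by (host, metric).

    Single-threaded sketch; the grouped result stands in for the index database.
    """
    store = defaultdict(list)
    while not mq.empty():
        msg = mq.get()
        store[(msg["host"], msg["metric"])].append(msg["value"])
    return dict(store)
```

Decoupling the proxy modules from the processing module through a queue lets each database host publish data at its own pace while the processing module consumes it as a stream.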

In traditional database systems, each execution plan continues to be used for a period of time (such as several weeks). If the execution plan itself performs poorly, it affects the performance of the database during this period. In severe cases, it causes a large-scale rollback of the database version, resulting in long-term business interruption.

After determining that the performance of the first execution plan is degraded, the embodiment of the present application replaces the first execution plan with the second execution plan, which prevents the database from using a first execution plan with degraded performance and reduces the impact of the degraded execution plan on the database, thereby ensuring the performance of the database. Optionally, when the performance of the second execution plan is better than the performance of the first execution plan, the performance of the database can be effectively guaranteed, database version rollbacks can be avoided, and the duration of business interruption can be reduced.

Based on the same concept as the processing method of the execution plan provided in the above embodiment, an embodiment of the present application provides a processing method of an execution plan. FIG. 9 is a schematic flowchart of the processing method of the execution plan provided in an embodiment of the present application. This method can be executed by the aforementioned database system 10. The method includes:

Step 401: The database system executes the first execution plan of the SQL statement.

For the process of step 401, reference may be made to the process of step 201, which is not described in detail in the embodiment of the present application.

Step 402: The database system obtains the second execution plan of the new version of the SQL statement.

For the process in which the database system obtains the second execution plan of the new version of the SQL statement in step 402, reference may be made to the process in which the database system obtains the new version of the execution plan of the SQL statement (that is, the third execution plan) in step 206, which is not repeated here.

For example, the second execution plan is generated by the management node or input by the user through the execution plan version update instruction.

Step 403: When the performance of the second execution plan is better than the performance of the first execution plan, the database system executes the second execution plan to replace the execution of the first execution plan.

The process of step 403 may refer to the process of executing the third execution plan by the database system in step 206 to replace the execution of the second execution plan, which is not described in detail in the embodiment of the present application. The second execution plan in step 403 is equivalent to the third execution plan in step 206, and the first execution plan in step 403 is equivalent to the second execution plan in step 206.

It is worth noting that the execution plan processing method also supports other functions provided in the foregoing embodiments. For example, before step 403, such as after determining that the performance of the second execution plan is better than that of the first execution plan, the database system converts the second execution plan into a second execution plan that matches the management node. For this process, refer to the aforementioned step 204, which is not repeated in the embodiment of the application.

In a traditional database system, when the management node generates a new version of the execution plan, it directly executes the new version of the execution plan to replace the execution of the original execution plan, and the new version of the execution plan continues to be used for a period of time (such as a few weeks). If the new version of the execution plan itself performs poorly, it affects the performance of the database during this period. In severe cases, it causes a large-scale rollback of the database version, resulting in long-term business interruption.

In the embodiment of this application, the database system executes the second execution plan to replace the execution of the first execution plan only when the performance of the new-version second execution plan is better than the performance of the first execution plan. This prevents the database from adopting a new version of the execution plan with degraded performance and reduces the impact of degraded execution plans on the database, thereby ensuring the performance of the database. In addition, since the performance of the second execution plan is better than the performance of the first execution plan, the performance of the database can be effectively guaranteed, database version rollbacks can be avoided, and the duration of business interruption can be reduced.

FIG. 10 is a block diagram of a database system 50 provided by an embodiment of the present application. The database system 50 includes:

The execution module 501 is used to execute the first execution plan of the structured query language SQL statement. The obtaining module 502 is configured to obtain the second execution plan of the SQL statement when the performance of the first execution plan of the SQL statement is degraded. The execution module 501 is also used to execute the second execution plan to replace the execution of the first execution plan.

In the embodiment of the present application, after the performance of the first execution plan is degraded, the execution module replaces the first execution plan with the second execution plan, which prevents the database from using a first execution plan with degraded performance and reduces the impact of the degraded execution plan on the database, thereby ensuring the performance of the database.

Optionally, the second execution plan is different from the first execution plan.

Optionally, the performance of the second execution plan is better than the performance of the first execution plan. When the performance of the second execution plan is better than the performance of the first execution plan, the performance of the database can be effectively guaranteed, database version rollbacks can be avoided, and the duration of business interruption can be reduced.

Optionally, the obtaining module 502 is configured to: obtain a historical execution plan of the SQL statement; when the performance of the historical execution plan is better than that of the first execution plan, use the historical execution plan as the second execution plan.

Optionally, the obtaining module 502 is configured to generate a second execution plan of the SQL statement.

FIG. 11 is a block diagram of a database system 50 provided by an embodiment of the present application. The database system 50 includes an alarm module 503 for sending alarm indication information when the performance of the first execution plan of the SQL statement deteriorates. Optionally, the obtaining module 502 is configured to obtain the second execution plan of the SQL statement after receiving the execution plan optimization instruction.

Optionally, the execution module 501 is further configured to execute the new version of the execution plan to replace the execution of the second execution plan when the performance of the new version of the execution plan is better than the performance of the second execution plan.

FIG. 12 is a block diagram of another database system 50 provided by an embodiment of the present application. The database system 50 further includes: a determining module 504, configured to determine that the performance of the first execution plan of the SQL statement is degraded when an abnormality occurs in an operating indicator of the first execution plan.

Optionally, the operation indicators of the first execution plan include one or more of the following: input and output IO indicators, time delay, error information, execution times, and processing duration of the first execution plan.

Optionally, the determining module 504 is configured to: for each operating indicator in the operating indicators of the first execution plan, when the performance indicated by the operating indicator data group corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data group of the SQL statement corresponding to the operating indicator, determine that the operating indicator is abnormal; and/or, for each operating indicator in the operating indicators of the first execution plan, input the operating indicator data group corresponding to the operating indicator into the first artificial intelligence model, and when the first artificial intelligence model outputs indication information indicating that the operating indicator is abnormal, determine that the operating indicator is abnormal.

Optionally, the determining module 504 is configured to:

For each operating indicator in the operating indicators of the first execution plan, when the performance curve of the operating indicator data group corresponding to the operating indicator does not match the performance baseline of the operating indicator, determine that the performance indicated by the operating indicator data group corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data group, where the performance baseline of the operating indicator is determined based on the historical operating indicator data group.
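The baseline check can be sketched as follows. In the embodiment the performance baseline is generated with the second artificial intelligence model; this sketch swaps in a simple statistical band (mean plus or minus k standard deviations of the historical data) purely for illustration, and the parameters `k` and `max_outlier_ratio` are hypothetical.

```python
from statistics import mean, pstdev


def build_baseline(history: list, k: float = 3.0) -> tuple:
    """Derive a (low, high) band from the historical operating indicator data."""
    mu, sigma = mean(history), pstdev(history)
    return (mu - k * sigma, mu + k * sigma)


def matches_baseline(current: list, baseline: tuple,
                     max_outlier_ratio: float = 0.2) -> bool:
    """Return True if the current data (assumed non-empty) stays within the band.

    The curve is treated as mismatching when too many points fall outside it.
    """
    low, high = baseline
    outliers = sum(1 for v in current if not (low <= v <= high))
    return outliers / len(current) <= max_outlier_ratio
```

A mismatch for a given operating indicator would then be treated as that indicator being abnormal, which in turn marks the first execution plan as degraded.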

FIG. 13 is a block diagram of another database system 50 provided by an embodiment of the present application. The database system 50 further includes: a baseline generation module 505, configured to generate the performance baseline of the operating indicator based on the second artificial intelligence model and the historical operating indicator data group.

It is worth noting that the structure of a database system provided by an embodiment of the present application may also be the database system shown in FIG. 1 or FIG. 8 described above.

For example, when the structure of the database system is the structure of the database system shown in FIG. 1, the foregoing execution module 501 can be integrated in the management node 101 and the data node 102, so that the management node 101 and the data node 102 cooperate to complete the function of the execution module 501; or the foregoing execution module 501 can be integrated in the management node 101, so that the management node 101 completes the function of the execution module 501. One or more of the obtaining module 502, the alarm module 503, the determining module 504, and the baseline generation module 505 may be integrated in the optimization node 103, so that the optimization node 103 completes the functions of the one or more modules.

For example, when the structure of the database system is the structure of the database system shown in FIG. 8, the execution module 501 may be integrated in a database, such as at least one of the databases 1 to 3, so that the at least one database completes the function of the execution module 501; the obtaining module 502 can be integrated in the processing module, so that the processing module completes the function of the obtaining module 502; the function of the alarm module 503 is the same as that of the alarm module in FIG. 8; the determining module 504 can be integrated in the analysis module, so that the analysis module completes the function of the determining module 504; the baseline generation module 505 can be integrated in the artificial intelligence calculation engine, so that the artificial intelligence calculation engine completes the function of the baseline generation module 505.

FIG. 14 is a block diagram of another database system 60 provided by an embodiment of the present application. The database system 60 includes:

The execution module 601 is used to execute the first execution plan of the structured query language SQL statement; the obtaining module 602 is used to obtain the second execution plan of the new version of the SQL statement; the execution module 601 is also used to execute the second execution plan to replace the execution of the first execution plan when the performance of the second execution plan is better than the performance of the first execution plan.

It is worth noting that the structure of a database system provided by an embodiment of the present application may also be the database system shown in FIG. 1 described above.

For example, when the structure of the database system is the structure of the database system shown in FIG. 1, the foregoing execution module 601 can be integrated in the management node 101 and the data node 102, so that the management node 101 and the data node 102 cooperate to complete the function of the execution module 601; or the foregoing execution module 601 can be integrated in the management node 101, so that the management node 101 completes the function of the execution module 601. The obtaining module 602 can be integrated in the optimization node 103, so that the optimization node 103 completes the function of the obtaining module 602.

In the embodiment of the present application, the execution module executes the new-version second execution plan to replace the execution of the first execution plan only when the performance of the second execution plan is better than the performance of the first execution plan. This prevents the database from adopting a new-version execution plan whose performance is degraded and reduces the impact of performance-degraded execution plans on the database, thereby guaranteeing the performance of the database. In addition, because the performance of the second execution plan is better than the performance of the first execution plan, a database version rollback can be avoided and the duration of business interruption can be reduced.
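The comparison described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `select_plan`, the choice of median processing duration as the performance measure, and the strict "lower is better" comparison are hypothetical and are not specified by this application.

```python
from statistics import median


def select_plan(first_runs_ms, second_runs_ms):
    """Return "second" only when the new-version plan's median processing
    duration is strictly lower than that of the first plan.

    first_runs_ms is assumed to be non-empty. With no measurements for the
    new-version plan, the first plan is kept.
    """
    if not second_runs_ms:
        return "first"
    if median(second_runs_ms) < median(first_runs_ms):
        return "second"
    return "first"
```

Keeping the first plan on a tie or on missing measurements reflects the conservative intent of the embodiment: the new-version plan is adopted only when it is demonstrably better.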

Optionally, FIG. 15 schematically provides a possible basic hardware architecture of the computer device of the present application.

Referring to FIG. 15, the computer device 700 includes a processor 701, a memory 702, a communication interface 703, and a bus 704.

In the computer device 700, the number of processors 701 may be one or more, and FIG. 15 only illustrates one of the processors 701. Optionally, the processor 701 may be a central processing unit (CPU). If the computer device 700 has multiple processors 701, the types of the multiple processors 701 may be different or may be the same. Optionally, multiple processors 701 of the computer device 700 may also be integrated into a multi-core processor.

The memory 702 stores computer instructions and data; the memory 702 may store computer instructions and data required to implement the processing method of the execution plan provided in the present application. For example, the memory 702 stores instructions for implementing the steps of the processing method of the execution plan. The memory 702 may be any one or any combination of the following storage media: non-volatile memory (for example, read only memory (ROM), solid state drive (SSD), hard disk (HDD), optical disc), volatile memory.

The communication interface 703 may be any one or any combination of the following devices: a network interface (for example, an Ethernet interface), a wireless network card, and other devices with a network access function.

The communication interface 703 is used for data communication between the computer device 700 and other computer devices or terminals.

The bus 704 can connect the processor 701 with the memory 702 and the communication interface 703. In this way, through the bus 704, the processor 701 can access the memory 702, and can also use the communication interface 703 to interact with other computer devices or terminals.

In this application, the computer device 700 executes the computer instructions in the memory 702, so that the computer device 700 implements the processing method of the execution plan provided in this application, or causes the computer device 700 to deploy a database system.

In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as a memory including instructions, which can be executed by a processor of a server to complete the execution plan processing method shown in each embodiment of the present application. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product, and the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium, or a semiconductor medium (for example, a solid state drive).

In this application, the terms "first", "second" and "third" are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance. The term "at least one" means one or more, and the term "plurality" means two or more, unless specifically defined otherwise. A refers to B, which means that A is the same as B or A is a simple modification of B.

It should be noted that when the database system provided in the above embodiments executes the execution plan processing method, the division into the above functional modules is used only as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the database system provided in the foregoing embodiments and the embodiments of the execution plan processing method belong to the same concept. For the specific implementation process, refer to the method embodiments; details are not repeated here.

A person of ordinary skill in the art can understand that all or part of the steps in the above embodiments can be implemented by hardware, or by a program to instruct relevant hardware. The program can be stored in a computer-readable storage medium. The storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.

The above are only optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims

An execution plan processing method, characterized in that the method includes:

Executing a first execution plan of a structured query language (SQL) statement;

When the performance of the first execution plan of the SQL statement deteriorates, acquiring a second execution plan of the SQL statement;

Executing the second execution plan to replace the execution of the first execution plan.
The method according to claim 1, wherein the second execution plan is different from the first execution plan.
The method according to claim 2, wherein the performance of the second execution plan is better than the performance of the first execution plan.
The method according to any one of claims 1 to 3, wherein the obtaining the second execution plan of the SQL statement comprises:

Acquiring a historical execution plan of the SQL statement;

When the performance of the historical execution plan is better than the performance of the first execution plan, using the historical execution plan as the second execution plan.
The method according to any one of claims 1 to 3, wherein the obtaining the second execution plan of the SQL statement comprises:

Generating the second execution plan of the SQL statement.
The method according to claim 5, wherein the computing resources occupied by generating the second execution plan of the SQL statement are greater than the computing resources occupied by generating the first execution plan of the SQL statement, and/or the duration of generating the second execution plan of the SQL statement is greater than the duration of generating the first execution plan of the SQL statement.
The method according to any one of claims 1 to 6, wherein the method further comprises:

When the performance of the first execution plan of the SQL statement deteriorates, an alarm indication message is sent.
The method according to any one of claims 1 to 7, wherein the obtaining the second execution plan of the SQL statement comprises:

After receiving the execution plan optimization instruction, the second execution plan of the SQL statement is acquired.
The method according to any one of claims 1 to 8, wherein the method further comprises:

When the performance of the new version of the execution plan is better than the performance of the second execution plan, the new version of the execution plan is executed to replace the execution of the second execution plan.
The method according to any one of claims 1 to 9, wherein the method further comprises:

When the operation index of the first execution plan is abnormal, it is determined that the performance of the first execution plan of the SQL statement is degraded.
The method according to claim 10, wherein the operation index of the first execution plan includes one or more of the following:

The input/output (IO) indicators, latency, error information, number of executions, and processing duration of the first execution plan.
The method according to claim 10 or 11, wherein the method further comprises:

For each operating indicator in the operating indicators of the first execution plan, when the performance indicated by the operating indicator data set corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data set of the SQL statement corresponding to the operating indicator, determining that the operating indicator is abnormal;

And/or, for each operating indicator in the operating indicators of the first execution plan, inputting the operating indicator data set corresponding to the operating indicator into a first artificial intelligence model, and when the first artificial intelligence model outputs indication information indicating that the operating indicator is abnormal, determining that the operating indicator is abnormal.
The method according to claim 12, wherein the method further comprises:

For each operating indicator in the operating indicators of the first execution plan, when the performance curve of the operating indicator data set corresponding to the operating indicator does not match the performance baseline of the operating indicator, determining that the performance indicated by the operating indicator data set corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data set, wherein the performance baseline of the operating indicator is determined based on the historical operating indicator data set.
The method according to claim 13, wherein the method further comprises:

Based on the second artificial intelligence model and the historical operating indicator data set, a performance baseline of the operating indicator is generated.
An execution plan processing method, characterized in that the method includes:

Executing a first execution plan of a structured query language (SQL) statement;

Acquiring a new-version second execution plan of the SQL statement;

When the performance of the second execution plan is better than the performance of the first execution plan, executing the second execution plan to replace the execution of the first execution plan.
A database system, characterized in that, the database system includes:

An execution module, configured to execute a first execution plan of a structured query language (SQL) statement;

An obtaining module, configured to obtain the second execution plan of the SQL statement when the performance of the first execution plan of the SQL statement deteriorates;

The execution module is further configured to execute the second execution plan to replace the execution of the first execution plan.
The database system according to claim 16, wherein the second execution plan is different from the first execution plan.
The database system according to claim 17, wherein the performance of the second execution plan is better than the performance of the first execution plan.
The database system according to any one of claims 16 to 18, wherein the acquisition module is configured to:

Acquiring the historical execution plan of the SQL statement;

When the performance of the historical execution plan is better than the performance of the first execution plan, the historical execution plan is used as the second execution plan.
The database system according to any one of claims 16 to 18, wherein the acquisition module is configured to:

Generate a second execution plan of the SQL statement.
The database system according to claim 20, wherein the computing resources occupied by generating the second execution plan of the SQL statement are greater than the computing resources occupied by generating the first execution plan of the SQL statement, and/or the duration of generating the second execution plan of the SQL statement is greater than the duration of generating the first execution plan of the SQL statement.
The database system according to any one of claims 16 to 21, wherein the database system further comprises:

The alarm module is used to send alarm indication information when the performance of the first execution plan of the SQL statement deteriorates.
The database system according to any one of claims 16 to 22, wherein the acquisition module is configured to:

After receiving the execution plan optimization instruction, the second execution plan of the SQL statement is acquired.
The database system according to any one of claims 16 to 23, wherein the execution module is further configured to:

When the performance of the new version of the execution plan is better than the performance of the second execution plan, the new version of the execution plan is executed to replace the execution of the second execution plan.
The database system according to any one of claims 16 to 24, wherein the database system further comprises:

The determining module is configured to determine that the performance of the first execution plan of the SQL statement is degraded when the operation index of the first execution plan is abnormal.
The database system according to claim 25, wherein the operation index of the first execution plan includes one or more of the following:

The input/output (IO) indicators, latency, error information, number of executions, and processing duration of the first execution plan.
The database system according to claim 25 or 26, wherein the determining module is configured to:

For each operating indicator in the operating indicators of the first execution plan, when the performance indicated by the operating indicator data set corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data set of the SQL statement corresponding to the operating indicator, determine that the operating indicator is abnormal;

And/or, for each operating indicator in the operating indicators of the first execution plan, input the operating indicator data set corresponding to the operating indicator into a first artificial intelligence model, and when the first artificial intelligence model outputs indication information indicating that the operating indicator is abnormal, determine that the operating indicator is abnormal.
The database system according to claim 27, wherein the determining module is configured to:

For each operating indicator in the operating indicators of the first execution plan, when the performance curve of the operating indicator data set corresponding to the operating indicator does not match the performance baseline of the operating indicator, determine that the performance indicated by the operating indicator data set corresponding to the operating indicator is lower than the performance indicated by the historical operating indicator data set, wherein the performance baseline of the operating indicator is determined based on the historical operating indicator data set.
The database system according to claim 28, wherein the database system further comprises:

The baseline generation module is configured to generate a performance baseline of the operating indicator based on the second artificial intelligence model and the historical operating indicator data set.
A database system, characterized in that, the database system includes:

An execution module, configured to execute a first execution plan of a structured query language (SQL) statement;

An obtaining module, configured to obtain a new-version second execution plan of the SQL statement;

The execution module is further configured to execute the second execution plan to replace the execution of the first execution plan when the performance of the second execution plan is better than the performance of the first execution plan.
A computer device, characterized in that it comprises:

Processor and memory;

The memory is used to store computer instructions;

The processor is configured to execute the computer instructions stored in the memory, so that the computer device executes the execution plan processing method according to any one of claims 1 to 14, or executes the execution plan processing method according to claim 15.
A computer-readable storage medium, wherein the computer-readable storage medium includes computer instructions that instruct a computer device to execute the execution plan processing method according to any one of claims 1 to 14, or to execute the execution plan processing method according to claim 15.