WO2022001965A1

WO2022001965A1 - Database configuration parameter adjustment method, and device and storage medium

Info

Publication number: WO2022001965A1
Application number: PCT/CN2021/102782
Authority: WO
Inventors: 弄庆鹏; 李忠良; 屠要峰; 郭斌; 黄震江; 陈小强
Original assignee: 中兴通讯股份有限公司
Priority date: 2020-06-30
Filing date: 2021-06-28
Publication date: 2022-01-06
Also published as: CN113872788A

Abstract

A database configuration parameter adjustment method, and a device and a storage medium. The embodiments of the present application comprise: acquiring a database state hybrid characterization parameter from a database server (S1100); inputting the database state hybrid characterization parameter into a deep reinforcement learning model to generate a target database configuration parameter (S1200); and sending the target database configuration parameter to the database server (S1300).

Description

Database configuration parameter adjustment method, device and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on the Chinese patent application with the application number of 202010618107.8 and the filing date of June 30, 2020, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is incorporated herein by reference.

TECHNICAL FIELD

The embodiments of the present application relate to, but are not limited to, the technical field of databases, and in particular, relate to a method, device, and storage medium for adjusting database configuration parameters.

BACKGROUND

A database is an organized, shareable collection of data stored in a computer over the long term. With the development of science and technology, the volume of data used to connect user information has grown rapidly. As the basic support for information storage services, databases will be used ever more widely.

At present, the optimization of large-scale databases generally relies on database administrators, but configuring each database differently is a complicated, repetitive, time-consuming and labor-intensive task for them. The quality of the resulting configurations also varies widely, so the adjustment of database configuration parameters suffers from low automation, slow speed and low efficiency.

SUMMARY OF THE INVENTION

Embodiments of the present application provide a database configuration parameter adjustment method, device, and storage medium.

In a first aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, which is applied to a database tuning module, including: obtaining a database state mixed representation parameter from a database server; inputting the database state mixed representation parameter into a deep reinforcement learning model to generate a target database configuration parameter; and sending the target database configuration parameter to the database server.

In a second aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, which is applied to a database server, including: sending a database state mixed representation parameter to a database tuning module, so that the database tuning module executes the method described in the first aspect; receiving the target database configuration parameters sent from the database tuning module; and performing parameter configuration on the database server according to the target database configuration parameters.

In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor, and a memory communicatively connected to the at least one processor; wherein the memory stores instructions, and the instructions are executed by the at least one processor, so that when the at least one processor executes the instructions, the method described in the first aspect or the method described in the second aspect is implemented.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer to execute the method described in the first aspect or the method described in the second aspect.

Other features and advantages of the present application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the description, claims and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are used to provide a further understanding of the technical solutions of the present application, and constitute a part of the specification. They are used to explain the technical solutions of the present application together with the embodiments of the present application, and do not constitute a limitation on the technical solutions of the present application.

FIG. 1 is a flowchart of a method for adjusting database configuration parameters provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of database configuration parameter adjustment provided by an embodiment of the present application;

FIG. 3 is a flowchart of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 4 is a schematic diagram of a hybrid characterization module in database configuration parameter adjustment provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a database configuration parameter adjustment module provided by another embodiment of the present application;

FIG. 6 is a schematic diagram of a reward function module in a database configuration parameter adjustment module provided by another embodiment of the present application;

FIG. 7 is a schematic diagram of a database configuration parameter self-tuning module provided by an embodiment of the present application;

FIG. 8 is a flowchart of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 9 is a flow chart of the principle of self-tuning of database configuration parameters provided by another embodiment of the present application;

FIG. 10 is a flowchart of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 11 is a flowchart of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 12 is an application scenario diagram of the method for adjusting database configuration parameters provided by an embodiment of the present application;

FIGS. 13(a) and 13(b) are an overall flowchart of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 14 is an application scenario diagram of a method for adjusting database configuration parameters provided by another embodiment of the present application;

FIG. 15 is a schematic diagram of an electronic device for adjusting database configuration parameters provided by an embodiment of the present application.

DETAILED DESCRIPTION

In order to make the purpose, technical solutions and advantages of the present application more clearly understood, the present application will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application. If there is no conflict, the embodiments in this application and the features in the embodiments may be combined with each other arbitrarily.

It should be noted that although functional modules are divided in the schematic diagrams of the device and a logical sequence is shown in the flowcharts, in some cases the modules may be divided differently from the device diagrams, or the steps shown or described may be executed in an order different from that in the flowcharts. The terms "first", "second" and the like in the description and claims and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

At present, the optimization of large-scale databases generally relies on database administrators, but configuring each database differently is a complicated, repetitive, time-consuming and labor-intensive task for them. The quality of the resulting configurations also varies widely, resulting in low automation, slow speed, and low efficiency in the adjustment of database configuration parameters.

Based on this, the embodiments of the present application propose a database configuration parameter adjustment method, device, and storage medium. The embodiments can use a deep reinforcement learning model to process the obtained database state mixed representation parameters and generate target database configuration parameters, and send the generated database configuration parameters to the database server for configuration. This overcomes the problems of low automation, slow speed and low efficiency of database configuration, and effectively improves its automation, speed and efficiency.

It should be noted that, in the following embodiments, the electronic device may be a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), an augmented reality (AR)/virtual reality (VR) device, or another such device, including devices with a folding screen. The embodiments of this application place no special restriction on the specific form of the electronic device.

In a first aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, which is applied to a database tuning module.

In some embodiments, referring to FIG. 1, the database configuration parameter adjustment method includes:

Step S1100, obtaining the database state mixed representation parameter from the database server;

Step S1200, inputting the database state mixed representation parameters into the deep reinforcement learning model to generate target database configuration parameters;

Step S1300, sending the target database configuration parameters to the database server.
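Steps S1100 to S1300 can be pictured as a single round trip between the tuning module and the database server. The following is a minimal sketch under stated assumptions: the function name tune_once, the three callables, and the rule inside dummy_model are hypothetical and are introduced only for illustration. The dummy model merely stands in for a trained deep reinforcement learning model.

```python
from typing import Callable, Dict, List

State = List[float]
Config = Dict[str, float]


def tune_once(
    fetch_state: Callable[[], State],
    model: Callable[[State], Config],
    apply_config: Callable[[Config], None],
) -> Config:
    """Run one S1100 -> S1200 -> S1300 round trip."""
    state = fetch_state()      # S1100: obtain the state parameters from the server
    target = model(state)      # S1200: the model maps the state to a target configuration
    apply_config(target)       # S1300: send the target configuration to the server
    return target


def dummy_model(state: State) -> Config:
    """Hypothetical stand-in for a trained model; the rule is illustrative only."""
    cpu_utilization = state[0]
    return {"work_mem": 64.0 if cpu_utilization > 0.5 else 32.0}


# Record what would be sent to the server instead of contacting a real one.
applied: List[Config] = []
result = tune_once(lambda: [0.6, 0.58, 0.3], dummy_model, applied.append)
```

Passing the three stages as callables keeps the loop independent of how the state is collected and how the configuration is delivered, which the text leaves to the embodiment.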

In some embodiments, the database state mixed characterization parameters in step S1100 may be database state parameters, configuration parameters, and the like. These parameters, for example the database performance parameters among the database state parameters, reveal how the database is performing, and the quality of the database configuration parameters directly affects the query and response speed of the database.

In some embodiments, in step S1200, after the above database state mixed representation parameters are input into the deep reinforcement learning model for training, the model generates the target database configuration parameters, which play an important role in database optimization.

Understandably, machine learning is an interdisciplinary subject that draws on probability theory, statistics, approximation theory, algorithmic complexity theory and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is divided into supervised learning, unsupervised learning and reinforcement learning. In supervised learning, the computer is given example inputs together with the desired outputs and learns, through a "training model", a general rule that maps inputs to outputs. In unsupervised learning, no labels are given to the learning algorithm, which is left to discover the structure of the input by itself; unsupervised learning can be regarded as a goal in itself or as a means to an end, and can also be called feature learning. In reinforcement learning, a computer program interacts dynamically with an environment while pursuing a definite goal, such as driving a vehicle or playing a game against an opponent, and a reward and punishment mechanism serves as feedback that enables the program to navigate the problem domain.

The deep reinforcement learning model is formed by combining a deep learning model with a reinforcement learning model. In this embodiment, the deep learning model is a recurrent neural network model, hereinafter referred to as RNN (Recurrent Neural Network). As one of the deep learning algorithms, an RNN takes sequence data as input and performs recursion along the direction in which the sequence evolves, with all of its nodes (recurrent units) connected in a chain. An RNN can therefore capture the correlations among the parameters and process the database state hybrid representation parameters as a whole. The reinforcement learning model belongs to the reinforcement learning described above. Combining deep learning with reinforcement learning yields deep reinforcement learning, which overcomes the inability of traditional reinforcement learning algorithms to handle high-dimensional state spaces and realizes end-to-end learning from perception to action, simplifying features and enabling unsupervised learning. At the same time, the agent can discover the internal connections between features in the course of self-learning. Deep reinforcement learning therefore has the potential to let the agent learn one or even multiple skills completely autonomously. Applied in this embodiment, it provides the ability to process high-dimensional parameters and continuously optimizes model parameters through autonomous learning.

In this embodiment, the database state mixed representation parameters obtained in step S1100 are input into the deep reinforcement learning model to generate database configuration parameters, which can achieve faster convergence speed, improve the speed of configuration parameter generation, and improve configuration efficiency.

In some embodiments, the database server in step S1300 consists of one or more computers running in a local area network together with database management system software. The database server provides data services for client applications. Sending the target database configuration parameters to the database server allows the server to be optimized, thereby improving the performance of the database it hosts.

In this embodiment, a deep reinforcement learning model can be used to process the acquired database state mixed representation parameters to generate target database configuration parameters, and the generated database configuration parameters can be sent to the database server for configuration. This overcomes the problems of slow and inefficient database configuration and effectively improves its speed and efficiency. In some embodiments, the database state mixed characterization parameters include one or more of the following: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters.

In some embodiments, the database performance parameters include, but are not limited to, throughput, query delay, and the number of automatic VACUUM runs. Throughput refers to the amount of data successfully transmitted per unit time to a network, device, port, virtual circuit, or other facility, measured in bits, bytes, packets, and so on; it directly affects the response speed of database writes, reads and access. Query delay is the time from the start of a database query to the database's response; it directly affects the access response speed of the database. The main function of automatic VACUUM is to move free pages to the end of the database, thereby reducing the size of the database so that it occupies less space.

Hardware resource parameters include, but are not limited to, the number of CPUs, memory capacity, storage space and other parameters. The CPU (Central Processing Unit) is an ultra-large-scale integrated circuit that serves as the computing core and control core of a computer; its main function is to interpret computer instructions and process the data in computer software. The main function of memory is to hold and load data. Memory capacity can be understood as the storage capacity of the memory modules and is generally expressed in MB. For example, the memory capacity of a computer usually refers to the capacity of its random access memory (RAM) and is generally a power of 2, such as 64 MB, 128 MB, or 256 MB. The larger the memory capacity, the more it benefits the operation of the hardware system or computer system.

The hardware resource state parameters include, but are not limited to, CPU utilization, memory utilization, and storage utilization. CPU utilization is the percentage of the CPU occupied by running processes. It directly affects the service life of the CPU: if CPU utilization is too high, the CPU ages internally and its service life is shortened. Memory utilization refers to the share of memory occupied by running processes. Excessive memory utilization slows down the CPU's access speed and degrades the performance of the hardware resources as a whole.

In some embodiments, referring to FIG. 2, the deep reinforcement learning model includes a hybrid representation module and a self-tuning module. The hybrid representation module is connected to the self-tuning module; the hybrid representation module uses an RNN model, and the self-tuning module uses a reinforcement learning model.

The self-tuning module transmits the database configuration parameters through its control interface to the control interface of the server, and the server, by controlling the pressure measurement device, outputs a workload pressure signal to put load on the target database (such as the database server). The parameter acquisition module monitors and collects the hardware resource state parameters and database performance parameters of the target database. Together with the server's own hardware resource parameters, these are transmitted through the server's control interface to the hybrid characterization module in the database tuning module for hybrid characterization, and the resulting database state hybrid characterization vector is sent to the self-tuning module. The self-tuning module completes the interaction with the database through the current action network in reinforcement learning, so as to transmit the configuration parameters it generates to the target database.

This embodiment can realize the deep reinforcement learning model formed by the RNN model and the reinforcement learning model, and then combine the database state mixed representation parameters to obtain the configuration parameters of the database, thereby improving the generation speed of the database configuration parameters and improving the efficiency of the database configuration.

Referring to FIG. 3, step S1200 specifically includes:

Step S1210, input the database state mixed characterization parameters into the mixed characterization module;

Step S1220, the hybrid characterization module uses the RNN model to process the database state hybrid characterization parameters to obtain a database state hybrid characterization vector;

Step S1230, the database state mixed representation vector is input to the self-tuning module, and the self-tuning module uses the reinforcement learning model to process the database state mixed representation vector to obtain target database configuration parameters.

In some embodiments, the purpose of inputting the above-mentioned database state mixed representation parameters into the mixed representation module in step S1210 is to obtain a vector of the mixed representation of database states.

The mixed representation of the above parameters is used in this embodiment, which can ensure the completeness and accuracy of the database state representation.

In some embodiments, the hybrid representation module in step S1220 uses a neural network model to process the features of the above database state hybrid representation parameters and obtain a database state hybrid representation vector. More specifically, the neural network model used is an RNN, that is, a recurrent neural network model, which is a structure that recurs over time. A typical RNN contains an input, an output and a neural network unit. The neural network unit is related not only to the input and the output but also to itself through a loop: the network state at the previous moment acts on the network state at the next moment. The RNN can improve the convergence speed of the model, thereby speeding up the generation of database configuration parameters and making database configuration more efficient.

In some embodiments, the database state mixed representation parameters include database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters. The above step S1220 includes:

Step S1221, acquiring the current database performance parameters and the previous database performance parameters; acquiring the current database configuration parameters and the previous database configuration parameters; acquiring the current hardware resource state parameters and the previous hardware resource state parameters; and acquiring the hardware resource parameters;

Step S1222, obtaining the current database performance parameter feature vector according to the current database performance parameters; obtaining the previous database performance parameter feature vector according to the previous database performance parameters; obtaining the current database configuration parameter feature vector according to the current database configuration parameters; obtaining the previous database configuration parameter feature vector according to the previous database configuration parameters; obtaining the current hardware resource state parameter feature vector according to the current hardware resource state parameters; obtaining the previous hardware resource state parameter feature vector according to the previous hardware resource state parameters; and obtaining the hardware resource parameter feature vector according to the hardware resource parameters;

Step S1223, calculating the database performance parameter difference feature vector according to the current database performance parameter feature vector and the previous database performance parameter feature vector; calculating the database configuration parameter difference feature vector according to the current database configuration parameter feature vector and the previous database configuration parameter feature vector; and calculating the hardware resource state parameter difference feature vector according to the current hardware resource state parameter feature vector and the previous hardware resource state parameter feature vector;

Step S1224, inputting the database performance parameter feature vector, the database performance parameter difference feature vector, the database configuration parameter feature vector, the database configuration parameter difference feature vector, the hardware resource state parameter feature vector, the hardware resource state parameter difference feature vector, and the hardware resource parameter feature vector into the neural network model, and using the neural network model to output the database state mixed representation vector.
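Steps S1221 to S1224 can be sketched as follows, up to the point where the seven vectors are handed to the neural network model. The dictionary keys, the function names diff and build_inputs, and the sample values are assumptions made for illustration; the text does not prescribe a data layout.

```python
from typing import Dict, List

Vector = List[float]


def diff(current: Vector, previous: Vector) -> Vector:
    """Element-wise current minus previous, as in step S1223."""
    if len(current) != len(previous):
        raise ValueError("vectors must have the same length")
    return [c - p for c, p in zip(current, previous)]


def build_inputs(
    current: Dict[str, Vector],
    previous: Dict[str, Vector],
    hardware: Vector,
) -> List[Vector]:
    """Return the seven vectors in the order listed in step S1224."""
    inputs: List[Vector] = []
    for key in ("performance", "configuration", "hardware_state"):
        inputs.append(current[key])                       # feature vector
        inputs.append(diff(current[key], previous[key]))  # difference feature vector
    inputs.append(hardware)  # hardware resource parameters have no previous value
    return inputs


# Illustrative values only; they are not taken from the embodiment.
current = {
    "performance": [10.0, 5.0],
    "configuration": [8.0, 2.0],
    "hardware_state": [0.5, 0.25],
}
previous = {
    "performance": [7.0, 5.0],
    "configuration": [6.0, 3.0],
    "hardware_state": [0.25, 0.25],
}
inputs = build_inputs(current, previous, hardware=[4.0, 32.0])
```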

In some embodiments, in step S1222, the parameters obtained in step S1221 are input into the hybrid characterization module and preprocessed by encoding. Preprocessing vectorizes the current and previous database performance parameters, the current and previous database configuration parameters, the current and previous hardware resource state parameters, and the hardware resource parameters. The parameter difference vectors are then obtained in step S1223, and the parameter vectors and the corresponding parameter difference vectors are processed by the trained neural network model, which outputs the database state mixed representation vector.

In some embodiments, referring to FIG. 4, the historical records in the figure are the previous database performance parameters, the previous database configuration parameters, and the previous database hardware resource state parameters. These are combined with the current database performance parameters, the current database configuration parameters, and the current database hardware resource state parameters and preprocessed to form the database state mixed representation vector, which is transmitted to the self-tuning module. At the same time, the current database performance parameters, the current database hardware resource state parameters, the previous database performance parameters, the previous database hardware resource state parameters and the previous database configuration parameters are input to the sample pool and stored as a sample.

This embodiment generates the database state mixed representation vector for the self-tuning module and the samples in the sample pool, which provides a parameter basis for optimizing the database tuning module.

In some embodiments, in step S1224, seven vectors are input into the neural network model: the database performance parameter feature vector, the database performance parameter difference vector, the database configuration parameter feature vector, the database configuration parameter difference vector, the hardware resource state parameter feature vector, the hardware resource state parameter difference vector, and the hardware resource parameter feature vector. The neural network model outputs the database state mixed representation vector. These seven feature vectors are correlated, and the recurrent neural network (RNN) is used to extract their relationships and to encode them in reduced dimensionality, thereby realizing the mixed representation of the database state. This makes the database characterization vector a more comprehensive and accurate representation of the state of the database and its hardware resource environment, and improves the effectiveness and efficiency of database configuration parameter adjustment.
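How a recurrent network can reduce seven correlated vectors to one lower-dimensional vector can be sketched with an Elman-style recurrence, h_t = tanh(W_in x_t + W_rec h_{t-1}), applied to the seven vectors in turn. This is only an assumed form: the text does not specify the RNN architecture, the hidden size, or the weights. The weights below are random and untrained, and all names are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def make_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; a real model would learn these during training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Matrix-vector product for plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_encode(sequence: List[Vector], w_in: Matrix, w_rec: Matrix) -> Vector:
    """Feed the vectors one by one and return the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        from_input = matvec(w_in, x)
        from_state = matvec(w_rec, hidden)  # the previous state acts on the next one
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return hidden


rng = random.Random(0)
input_dim, hidden_dim = 10, 4  # 10-dimensional inputs compressed to 4 dimensions
w_in = make_matrix(hidden_dim, input_dim, rng)
w_rec = make_matrix(hidden_dim, hidden_dim, rng)

# Seven placeholder vectors standing in for the seven feature vectors.
seven_vectors = [[(i + j) / 10 for j in range(input_dim)] for i in range(7)]
state_vector = rnn_encode(seven_vectors, w_in, w_rec)
```

Because each hidden state depends on the previous one, the final vector reflects all seven inputs rather than any single one, which is the property the text relies on for capturing their relationships.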

In this embodiment, the core idea is described by taking as an example a database server whose hardware resource parameters are a 4-core CPU, 32G of memory, and 256G of storage. The memory is the running memory, also called RAM, whose contents are lost after shutdown; the size of the memory determines the running speed of the machine. The 256G of storage is the storage capacity of the server, which can hold 256G of data. Assuming the collected database-related data uses 10-dimensional feature vectors, the database hardware resource parameter feature vector is [4, 32, 256, 0, 0, 0, 0, 0, 0, 0], where 4 means that the CPU has 4 cores, 32 means that the memory size is 32G, and 256 means that the storage size is 256G. The missing trailing dimensions are filled with 0s, that is, seven 0s are appended as supplementary dimensions.
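The zero-padding described above can be written in a few lines. The function name pad_features is hypothetical; the vector length of 10 and the values 4, 32 and 256 come from the example in the text.

```python
from typing import List, Sequence


def pad_features(values: Sequence[float], size: int = 10) -> List[float]:
    """Append zeros so that the feature vector has exactly `size` dimensions."""
    if len(values) > size:
        raise ValueError("more values than dimensions")
    return list(values) + [0] * (size - len(values))


# 4 CPU cores, 32G of memory, 256G of storage, then seven supplementary zeros.
hardware_vector = pad_features([4, 32, 256])
```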

In this embodiment, the current hardware resource state parameter feature vector is [0.6, 0.58, 0.3, 0, 0, 0, 0, 0, 0, 0], where 0.6 represents the CPU utilization, 0.58 represents the memory utilization, and 0.3 represents the storage utilization. As before, the seven trailing 0s fill the missing dimensions. The previous hardware resource state parameter feature vector is [0.4, 0.48, 0.5, 0, 0, 0, 0, 0, 0, 0]. The hardware resource state parameter difference vector is the current hardware resource state parameter feature vector minus the previous hardware resource state parameter feature vector, that is, [0.2, 0.10, -0.2, 0, 0, 0, 0, 0, 0, 0].

The current configuration parameter feature vector in this embodiment is [542, 730, 9, 7, 55, 23, 99, 10, 67, 86], where 542 represents temp_buffers among the database configuration parameters, that is, the size of the temporary buffers used to access temporary table data in a database session; 730 represents work_mem, the amount of memory used by internal sort operations and hash tables before writing to temporary files; 9 represents max_wal_size, the maximum size to which the WAL may grow during automatic WAL checkpoints; and the remaining entries represent other related configuration parameters. The previous configuration parameter feature vector is [372, 650, 3, 4, 32, 21, 76, 13, 67, 56]. The database configuration parameter difference vector is the difference between the current configuration parameter feature vector and the previous configuration parameter feature vector, that is, [170, 80, 6, 3, 23, 2, 23, -3, 0, 30].
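The two difference vectors above can be checked by element-wise subtraction, as in the following sketch. The function name difference is hypothetical, and rounding to two decimal places is an assumption added here only to suppress binary floating-point noise in the utilization values.

```python
from typing import List


def difference(current: List[float], previous: List[float], digits: int = 2) -> List[float]:
    """Current minus previous, element by element, rounded to `digits` places."""
    return [round(c - p, digits) for c, p in zip(current, previous)]


# Hardware resource state vectors from the example (CPU, memory, storage utilization).
current_state = [0.6, 0.58, 0.3, 0, 0, 0, 0, 0, 0, 0]
previous_state = [0.4, 0.48, 0.5, 0, 0, 0, 0, 0, 0, 0]
state_diff = difference(current_state, previous_state)

# Configuration vectors from the example (temp_buffers, work_mem, max_wal_size, ...).
current_config = [542, 730, 9, 7, 55, 23, 99, 10, 67, 86]
previous_config = [372, 650, 3, 4, 32, 21, 76, 13, 67, 56]
config_diff = difference(current_config, previous_config)
```

A negative entry, such as the third component of state_diff or the eighth component of config_diff, means that the value decreased since the previous observation.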

In this embodiment, the current database performance parameter feature vector is [1154.5, 8.5, 515, 9325.0, 854.8, 0.0, 91, 1.0, 6.7, 8.6], where 1154.5 represents the n_tup_ins state parameter, 8.5 represents the buffers_alloc state parameter, 515 represents the xact_commit state parameter, 9325.0 represents the n_dead_tup state parameter, and so on. The database performance parameter difference vector is then [104.0, 3.0, 15, 2.0, -122.1, 0, 23, 0.2, 1.4, 1.0].

In this embodiment, the current values of the database performance parameters, database configuration parameters and hardware resource parameters, together with the differences between the current and previous values, are used for mixed characterization, where a previous value can be understood as a historical value. Understandably, during database configuration parameter tuning one observes not only the current database performance parameters but also how they change after each adjustment of the database configuration parameters. The change in the database performance parameters is determined by the change in the database configuration parameters, while the current database performance parameters are generally determined by the hardware resource parameters. Abstracting the manual adjustment process of database administrators therefore amounts to mixing the current values of the hardware resource parameters, database performance parameters, and database configuration parameters with their previous (historical) values to characterize the state of the database. The RNN can capture the relationships between these parameters, so that the database configuration parameters can be adjusted accurately and in real time.

In some embodiments, referring to FIG. 5 , the deep reinforcement learning model further includes a database configuration parameter reward function module;

Specifically, step S1200 further includes:

S1240, input the database state mixed characterization parameters into the mixed characterization module and the database configuration parameter reward function module, respectively, to generate a reward strategy;

S1250, store the reward strategy in the sample pool;

S1260, sampling the sample pool to obtain sampling data;

S1270, using the sampled data to optimize the self-tuning module.

It can be understood that the database state mixed characterization parameters in S1240 include database performance parameters, current database configuration parameters, hardware resource parameters and hardware resource state parameters. The database configuration parameters are the parameters configured while the database is running; they affect the database performance parameters and the database hardware resource state parameters, and are usually called the action a in reinforcement learning. The database performance parameters include, but are not limited to, performance indicators such as the throughput and latency of the database, and the database hardware resource state parameters include, but are not limited to, indicators such as CPU utilization, memory utilization and storage utilization. The database configuration parameter reward function in the database configuration parameter reward function module is mainly used to evaluate the quality of the current database configuration parameters, for example whether the throughput and latency of the database meet the business requirements; its output is usually called the reward r in reinforcement learning.

The database performance parameters and hardware resource state parameters can be collectively referred to as database state parameters, which are called the state s in reinforcement learning. It is understandable that database state parameters and database configuration parameters appear in pairs, that is, each set of database configuration parameters corresponds to a set of database state parameters. The database state parameters include, but are not limited to, database performance parameters, current configuration parameters and hardware resource state parameters, where the hardware resource state parameters include CPU utilization and/or memory utilization.

In some embodiments, the database configuration parameter reward function module in step S1240 is configured to generate the reward value r of the current database configuration parameters, and the hybrid characterization module is configured to output the database configuration parameters, database performance parameters and hardware resource state parameters, where the database performance parameters and hardware resource state parameters can be combined into the database state parameters. The current database state parameters are denoted s_, the previous database state parameters are denoted s, and the current database configuration parameters are denoted a. Together, the hybrid characterization module and the database configuration parameter reward function module output a sample (s, a, r, s_) as the reward strategy, and step S1250 is then executed to store the reward strategy in the sample pool.
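Steps S1250 and S1260 can be sketched as a small fixed-capacity sample pool holding (s, a, r, s_) tuples. The class names, the capacity, and the choice of uniform random sampling are assumptions for illustration; the embodiment does not state how the pool is bounded or sampled.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Sample(NamedTuple):
    """One (s, a, r, s_) sample as described in the embodiment."""
    s: Sequence[float]   # previous database state parameters
    a: Sequence[float]   # current database configuration parameters
    r: float             # reward value of the current configuration
    s_: Sequence[float]  # current database state parameters


class SamplePool:
    """Hypothetical sample pool; the oldest samples are dropped when full."""

    def __init__(self, capacity: int = 10000) -> None:
        self._buffer: Deque[Sample] = deque(maxlen=capacity)

    def store(self, sample: Sample) -> None:
        # S1250: store the reward strategy in the sample pool.
        self._buffer.append(sample)

    def sample(self, batch_size: int) -> List[Sample]:
        # S1260: draw a uniform random batch (uniform sampling is an assumption).
        k = min(batch_size, len(self._buffer))
        return random.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)
```

The sampled batch would then be passed to the self-tuning module in step S1270.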

In some embodiments, the above step S1240 includes:

S1241, normalize the difference between the current and previous database performance parameters and the difference between the current and previous hardware resource state parameters, and output the reward value of the current database configuration parameters;

The normalization process calculates the ratio of the difference between the current database performance parameter and the previous database performance parameter to the previous database performance parameter, and the ratio of the difference between the current hardware resource state parameter and the previous hardware resource state parameter to the previous hardware resource state parameter; each ratio is multiplied by its corresponding weighted value to output the reward value of the current database configuration parameters;

S1242: Generate a reward policy according to the reward value of the current database configuration parameter, the current database configuration parameter, the current database performance parameter, the current hardware resource state parameter, the previous database performance parameter, and the previous hardware resource state parameter.

In some embodiments, normalization in step S1241 generally means scaling a number into a decimal between 0 and 1. In this embodiment, normalization calculates the difference between the current and previous database performance parameters and the difference between the current and previous hardware resource state parameters, and then takes the ratio of each difference to the corresponding previous value. The ratios are multiplied by their corresponding weighted values and summed to output the reward value of the current database configuration parameters; the specific formula is as follows:

r = ((current database performance parameter - previous database performance parameter) / previous database performance parameter) * α + ((current hardware resource state parameter - previous hardware resource state parameter) / previous hardware resource state parameter) * β;

Wherein, α and β are weighted values, and r is the value obtained by normalizing the above database performance parameters and hardware resource state parameters; in this embodiment, r represents the reward value of the current database configuration parameters.

In some embodiments, referring to FIG. 6, the hardware resource state parameters in step S1241 include CPU utilization and memory utilization. When the normalization in step S1241 is applied, the database performance parameters, CPU utilization and memory utilization can be given different degrees of emphasis during database configuration parameter adjustment by setting the sizes of the weighted values. The specific normalization formulas are:

Δperfor=cur_perfor-hist_perfor;

Δcpu_rate=cur_cpu_rate-hist_cpu_rate;

Δmen_rate=cur_men_rate-hist_men_rate;

Among them, α, β and γ are weighted values; r is the reward value of the current database configuration parameters; cur_perfor is the current database performance parameter; hist_perfor is the previous or historical database performance parameter; cur_cpu_rate is the current CPU utilization; hist_cpu_rate is the previous or historical CPU utilization; cur_men_rate is the current memory utilization; hist_men_rate is the previous or historical memory utilization; Δperfor is the difference between the current and the previous or historical database performance parameter; Δcpu_rate is the difference between the current and the previous or historical CPU utilization; and Δmen_rate is the difference between the current and the previous or historical memory utilization.

It is understandable that the weighted values can be set to adjust the degree of emphasis placed on the database performance parameters, CPU utilization and memory utilization during database configuration parameter adjustment. For example, if the optimization focuses on database performance parameters, the performance component weight α can be set to a larger value; if it focuses on CPU utilization, β can be set to a larger value; similarly, if it focuses on memory utilization, γ can be set to a larger value. The components are normalized because the three parameters have different dimensions and vary to different degrees.

For example, the optimization of the database in this embodiment focuses on the database performance parameters. After comprehensive experimental consideration, the weighted values are set to α=0.8, β=0.1 and γ=0.1. With cur_perfor=8550tps, hist_perfor=6800tps, cur_cpu_rate=85%, hist_cpu_rate=56%, cur_men_rate=72% and hist_men_rate=63%, the above normalization processing gives r=0.2647, which is the reward output, that is, the reward value of the current database configuration parameters.

In this embodiment, the reward value of the current database configuration parameters is calculated by combining the current values of the database performance parameter, CPU utilization and memory utilization with their previous or historical values. It is understandable that, when optimizing database configuration parameters, the focus is not on the absolute value of the current database performance but on how the database performance parameters, CPU utilization and memory utilization change after the database configuration parameters are adjusted. The historical value of a parameter can be its historical maximum, the value collected under the default database configuration, or the mean recorded over a sliding window. To calculate the reward, the differences of the three parameters (database performance parameter, CPU utilization and memory utilization) are first calculated, the three differences are then normalized, and finally their weighted sum is output as the reward.
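The three-stage calculation described above (difference, normalization, weighted sum) can be sketched in Python as follows. The three-term form used here is an assumption obtained by extending the two-term formula of step S1241 with a γ-weighted memory term; because the exact three-term formula is not reproduced in the text, this assumed form is not guaranteed to reproduce the worked figure quoted in this embodiment. The function names are hypothetical.

```python
def relative_change(current: float, previous: float) -> float:
    """Difference between current and previous value, as a ratio of the previous value."""
    if previous == 0:
        raise ValueError("previous value must be non-zero to normalize")
    return (current - previous) / previous


def reward(
    cur_perfor: float,
    hist_perfor: float,
    cur_cpu_rate: float,
    hist_cpu_rate: float,
    cur_men_rate: float,
    hist_men_rate: float,
    alpha: float = 0.8,
    beta: float = 0.1,
    gamma: float = 0.1,
) -> float:
    """Weighted sum of normalized differences (assumed three-term form)."""
    return (
        alpha * relative_change(cur_perfor, hist_perfor)
        + beta * relative_change(cur_cpu_rate, hist_cpu_rate)
        + gamma * relative_change(cur_men_rate, hist_men_rate)
    )
```

Raising alpha relative to beta and gamma makes the reward respond mainly to changes in the performance parameter, which matches the emphasis mechanism described in this embodiment.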

In some embodiments, referring to FIG. 7, which is a schematic diagram of the self-tuning module, the module includes a current action network, a current evaluation network, a target action network and a target evaluation network. The current action network mainly uses the current state to generate the next action; the current evaluation network is set to evaluate the current state and the next action; the target action network is set to generate the next action from the next state; and the target evaluation network evaluates the next action and the next state. The current state is denoted states, the current action is denoted actions, the next state is denoted next_states, and the next action is denoted next_actions. Through the interaction among the four networks, an organic connection is established between the current evaluation network and the target evaluation network through the loss function.

In some embodiments, referring to FIGS. 7 and 8, the self-tuning module includes a current action network, a current evaluation network, a target action network and a target evaluation network. The current action network is configured to generate the target database configuration parameters according to the database state hybrid characterization parameters and send them to the database server. The sampling data in step S1270 includes the current database configuration parameters, the current database state parameters, the next database state parameters and the reward value of the current database configuration parameters, where the database state parameters include database performance parameters and database hardware resource state parameters. The above step S1270 includes:

Step S1271: Input the current database state parameters into the current action network, and generate the next database configuration parameters;

The current database state parameters can be denoted states, and the next database configuration parameters can be denoted next_actions. It is understandable that the current action network is also set to output the target database configuration parameters by processing the current database state parameters and the mixed characterization vector.

Step S1272: Input the next database configuration parameters and the current database state parameters into the current evaluation network to determine the first loss function value;

Step S1273: Optimizing the current action network according to the first loss function value and softly updating the target action network according to the parameters output by the current action network;

Among them, the current database state parameter is states, and the first loss function is policy_loss=-Q'(states, next_actions)

Here, policy_loss serves as the loss value of the current action network and is used to optimize it, and Q'(states, next_actions) represents the evaluation value that the current evaluation network assigns to the current database state parameters and the next database configuration parameters.

Step S1274: Input the next database state parameters into the target action network, and generate the next database configuration parameters;

Among them, the next database state parameter is represented as next_states;

Step S1275: Input the current database configuration parameters and the current database state parameters into the current evaluation network to generate the current action evaluation value;

Step S1276: Input the next database configuration parameters and the next database state parameters into the target evaluation network to generate the next action evaluation value;

Step S1277: Calculate the second loss function value according to the current action evaluation value, the next action evaluation value and the reward value of the current database configuration parameter;

Step S1278: Use the second loss function value to optimize the current evaluation network, and at the same time perform a soft update on the target evaluation network through the parameters output by the current evaluation network.

Specifically, the second loss function formula is as follows:

Loss=Q'(next_state,next_action)-Q(state,action)

Among them, Q'(next_state, next_action) is the current action evaluation value, Q(state, action) is the previous action evaluation value, and Loss represents the loss value between the target evaluation network and the current evaluation network; Q(state, action) is denoted Q(s, a) in FIG. 7.

Among them, the relationship between Q(state, action) and Q'(next_state, next_action) is:

Q(state,action)=r+γ*(max(Q'(next_state,next_action)));

Among them, r represents the reward value of the current database configuration parameters, and max(Q'(next_state, next_action)) is the highest evaluation score that can be produced by the actions taken in the next state; in this embodiment, it represents the highest evaluation score produced by the database configuration parameters corresponding to the next database state parameters. γ is a numeric value from 0 to 1.

It is understandable that the formula of Q(state, action) is substituted into the Loss function formula to get:

Loss=-r+(1-γ)Q'(next_state,next_action)

When the value of γ is 1, Loss=-r; that is, adding the Loss value (here -r) to the current evaluation value gives the target evaluation value, where r is the reward value of the current database configuration parameters in step S1241.
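The substitution above can be checked numerically with a short Python sketch. It treats Q'(next_state, next_action) as a single number that is already the maximum over next actions, which is a simplifying assumption; the function names are hypothetical. Note that γ here is the coefficient in the Q relationship, not the memory weight used in the reward formula.

```python
def q_from_relation(r: float, gamma: float, q_next: float) -> float:
    """Q(state, action) = r + gamma * max(Q'(next_state, next_action))."""
    return r + gamma * q_next


def second_loss(q_next: float, q_current: float) -> float:
    """Loss = Q'(next_state, next_action) - Q(state, action)."""
    return q_next - q_current


def second_loss_substituted(r: float, gamma: float, q_next: float) -> float:
    """Loss = -r + (1 - gamma) * Q'(next_state, next_action)."""
    return -r + (1.0 - gamma) * q_next


# Illustrative values only; they are not taken from the embodiment.
r, gamma, q_next = 0.25, 0.9, 2.0
direct = second_loss(q_next, q_from_relation(r, gamma, q_next))
substituted = second_loss_substituted(r, gamma, q_next)
```

Both expressions give the same value, and setting gamma to 1 reduces the substituted form to -r.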

It can be understood that the self-tuning module in the embodiment of the present application consists of four parts, namely the current action network, the current evaluation network, the target action network and the target evaluation network, all of which are neural networks. The self-tuning module determines, according to the current state, the best action to apply to the environment at the next moment; here, the action can be regarded as the configuration parameters to be configured in the database, and the state can be regarded as the state parameters of the database.

Specifically, the value obtained from the Loss formula is the second loss function value, which is used to optimize the current evaluation network. The current evaluation network in turn determines the output value of the current action network, so that after optimization the next database configuration parameters (the target database configuration parameters) corresponding to the input state parameters can be predicted.

The target action network provides the target evaluation network with the action corresponding to the next state, that is, the database configuration parameters corresponding to the next database state, where the database state consists of the database performance parameters and database hardware resource state parameters. The target evaluation network is mainly responsible for calculating the target Q value. The current evaluation network itself also needs to be optimized, so a target value must be compared with the value generated by the current evaluation network to obtain the loss value used to optimize the current evaluation network.

It can be understood that this embodiment can periodically update the parameters of the target evaluation network with the parameters of the current evaluation network using the soft update method, and the loss function is applied once for each update, which ensures that each update is adjusted according to the loss function and that the module is better optimized.
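The embodiment does not give the soft update rule itself. The sketch below assumes the interpolation commonly used in actor-critic methods, in which each target network parameter moves a small step toward the corresponding current network parameter; the coefficient name tau and its default value are assumptions, and network parameters are represented as flat lists of numbers for simplicity.

```python
from typing import List, Sequence


def soft_update(
    target_params: Sequence[float],
    current_params: Sequence[float],
    tau: float = 0.01,
) -> List[float]:
    """Return new target parameters: tau * current + (1 - tau) * target (assumed rule)."""
    if len(target_params) != len(current_params):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * c + (1.0 - tau) * t for t, c in zip(target_params, current_params)]
```

A small tau keeps the target evaluation network and target action network changing slowly, while tau equal to 1 would copy the current network outright.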

It is understandable that the update() function in the self-tuning module triggers the optimization process of the entire self-tuning module. The self-tuning module obtains from the sample pool the previous state parameters, denoted state; the previous configuration parameters, denoted action; the current state parameters, denoted next_state; and the reward value of the current database configuration parameters, denoted reward (r in the above embodiment). These are processed through steps S1271 to S1277, and the optimized database configuration parameters are output. Because the reward value of the current database configuration parameters is taken into account in this embodiment, the database configuration parameters generated by the self-tuning module can be closer to optimal. Configuring the target database server with these parameters can improve the access speed, read speed and other aspects of database processing in the target database server.

In some embodiments, referring to FIG. 9, which is a schematic flowchart of the database tuning module, the hybrid characterization module is responsible for providing the database state hybrid characterization vector to the self-tuning module and also provides parameter samples to the sample pool. The hybrid characterization module receives the database performance parameters, database configuration parameters and database hardware resource state parameters sent by the target database and performs mixed characterization on them. Working together as an organic system, these modules can quickly and efficiently complete the generation, optimization and configuration of the database configuration parameters.

In some embodiments, the current action network is responsible for the interaction between the self-tuning module and the database environment, that is, generating database configuration parameters according to the database state parameters and configuring them into the database. The current evaluation network is responsible for evaluating the value (good or bad) of each database state-action pair; this evaluation is the Q value, which makes it possible to determine what action to take in what database state, that is, which database configuration parameters bring higher database performance in which database state. The database state includes database performance parameters and database hardware resource state parameters; database performance includes throughput, latency and so on, and high database performance means high throughput, low latency and so on.

In some embodiments, the database server includes a backup database server and a service database server, and step S1100 includes:

Step S1110: Obtain the database state hybrid representation parameter of the backup server;

Step S1300 includes:

Step S1310: Send the target database configuration parameters to the service database server.

Understandably, the service database server, also referred to below as the business server, is a server that is running and in use.

In some embodiments, the database server is optimized in offline mode. The server whose database needs to be configured is called the business server, and the adjustment of the database configuration parameters is carried out on a backup server whose database state mixed characterization parameters are the same as those of the business server. In this case, the database state mixed characterization parameters of the backup server are obtained through step S1110, so that the target database configuration parameters generated by the methods of steps S1100 to S1300 are also applicable to the business server.

In some embodiments, after the backup server has been configured with the newly generated target configuration parameters and its database is running normally, step S1310 may be executed to send the target database configuration parameters to the business database server in order to tune the business database server.

In this embodiment, the database configuration parameters are first adjusted on the backup server, and the relevant configuration parameters are configured on the business server only when the configuration succeeds and the database runs normally afterwards. Adjusting the database configuration parameters of the business server in this offline manner prevents unexpected situations during the adjustment from damaging the running business server and causing unnecessary losses.
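The offline flow described above can be sketched as a guard around the final configuration step. All callables passed in are hypothetical placeholders for the operations of steps S1110, S1200 and S1310 and for a health check of the backup database; how that health check is performed is not specified in the embodiment.

```python
from typing import Callable, Dict, Sequence

Config = Dict[str, float]


def offline_tune(
    get_backup_state: Callable[[], Sequence[float]],
    generate_target_config: Callable[[Sequence[float]], Config],
    apply_to_backup: Callable[[Config], None],
    backup_is_healthy: Callable[[], bool],
    apply_to_service: Callable[[Config], None],
) -> bool:
    """Tune on the backup server first; touch the business server only on success."""
    state = get_backup_state()              # S1110: state from the backup server
    target = generate_target_config(state)  # S1200: model generates the target config
    apply_to_backup(target)                 # trial configuration on the backup server
    if not backup_is_healthy():
        return False                        # business server is left untouched
    apply_to_service(target)                # S1310: send to the service database server
    return True
```

Returning early when the backup check fails is what keeps unexpected situations away from the running business server.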

In some embodiments, the database server is optimized in online mode: the database tuning module obtains the database state mixed representation parameters directly from the business database, executes the methods of steps S1100 to S1300 above to obtain the target database configuration parameters, and transmits the newly generated target database configuration parameters to the business database in real time, so that the business database can apply them in real time.

This embodiment can realize the real-time online optimization of the configuration parameters of the business database, and improve the optimization efficiency.

In some embodiments, steps S1100 and S1200 in the database configuration parameter adjustment method include:

S1120, obtain the database state mixed representation parameter from the database server;

S1280, input the database state mixed representation parameters into the deep reinforcement learning model to generate intermediate database configuration parameters, and send the intermediate database configuration parameters to the database server;

S1290: Repeat the above steps N times until the target database configuration parameters are obtained. That is, the intermediate database configuration parameters are applied to the database server as its database configuration parameters, and steps S1120 and S1280 are repeated until N iterations have been performed, yielding the target database configuration parameters, where N is a positive integer whose value can be preset. Better database configuration parameters are obtained through multiple iterations.

In some embodiments, a preset number of iterations N=max_iter is set, where max_iter represents the maximum number of iterations, the current iteration count is denoted iter, and incrementing the count is denoted iter++. In each iteration the value of iter is increased by 1 until it reaches the maximum number of iterations max_iter, that is, N in this embodiment, at which point the iteration ends. In other words, iter>N means that the iteration is over and better database configuration parameters have been obtained; if iter≤max_iter, steps S1120 and S1280 continue to be repeated. N is a positive integer whose value can be preset, for example from experimental or empirical values.

In this embodiment, model training converges progressively over the preset number of iterations until better database configuration parameters are obtained, which improves the timeliness and efficiency of database parameter configuration.
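The iteration described above can be sketched as a bounded loop. The callables get_state, model_step and apply_config are hypothetical stand-ins for step S1120, for the model inference in step S1280, and for sending the intermediate configuration to the database server; the counter starts at 1 so that the loop body runs exactly max_iter times.

```python
from typing import Callable, Dict, Optional, Sequence

Config = Dict[str, float]


def iterate_tuning(
    get_state: Callable[[], Sequence[float]],
    model_step: Callable[[Sequence[float]], Config],
    apply_config: Callable[[Config], None],
    max_iter: int,
) -> Config:
    """Repeat S1120 and S1280 max_iter times and return the last configuration."""
    if max_iter < 1:
        raise ValueError("max_iter must be a positive integer")
    config: Optional[Config] = None
    iter_count = 1
    while iter_count <= max_iter:     # continue while iter <= max_iter
        state = get_state()           # S1120: obtain the mixed representation
        config = model_step(state)    # S1280: generate an intermediate configuration
        apply_config(config)          # S1280: send it to the database server
        iter_count += 1               # iter++
    assert config is not None
    return config                     # target database configuration parameters
```

The configuration produced in the last iteration is treated as the target database configuration parameters.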

In a second aspect, the embodiments of the present application provide a method for adjusting database configuration parameters, which is applied to a database server.

In some embodiments, referring to FIG. 10 , the database configuration parameter adjustment method includes:

Step S2100, sending the database state mixed representation parameter to the database tuning module, so that the database tuning module executes the method of the first aspect;

Step S2200, receiving the target database configuration parameters sent from the database tuning module;

Step S2300, perform parameter configuration on the database server according to the target database configuration parameters.

In some embodiments, the database state hybrid characterization parameters include one or more of the following: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters.

In some embodiments, referring to FIG. 11 , before the above step S2100, it further includes:

Step S2400, obtaining the first mixed representation parameter of database state;

Step S2500, obtaining the workload pressure signal from the pressure simulation server;

Step S2600, performing a pressurization operation on the database server according to the workload pressure signal;

Step S2700, after the pressurization operation is completed, obtain the second database state mixed representation parameter.
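Steps S2400 to S2700 bracket the pressurization operation with two collections of the database state. A minimal sketch follows, in which collect_state and run_pressure are hypothetical callables standing in for the collection and pressurization operations; computing the difference between the two collections follows step S31000 later in this description, and representing each collection as a name-to-value mapping is an assumption.

```python
from typing import Callable, Dict, Tuple

Snapshot = Dict[str, float]


def measure_under_load(
    collect_state: Callable[[], Snapshot],
    run_pressure: Callable[[], None],
) -> Tuple[Snapshot, Snapshot, Snapshot]:
    """Collect state before and after pressurization and return both plus the delta."""
    first = collect_state()    # S2400: first database state collection
    run_pressure()             # S2500-S2600: apply the workload pressure
    second = collect_state()   # S2700: second database state collection
    # Counters that appear in both collections are differenced.
    delta = {name: second[name] - first[name] for name in first if name in second}
    return first, second, delta
```

For cumulative counters such as xact_commit, the delta reflects the work done during the pressurization interval rather than since the database started.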

In some embodiments, the database server includes a backup database server and a service database server, and step S2100 includes:

Step S2110: use the backup database server to send the database state mixed representation parameters to the database tuning module, which performs the methods of steps S1100 to S1300 to generate the target database configuration parameters;

Step S2120: use the service database server to receive the target database configuration parameters;

Step S2130: Perform parameter configuration on the service database server according to the target database configuration parameters.

In order to better understand the core idea of the embodiments of the present application, an overall solution for adjusting the database configuration parameters of the embodiments of the present application is described below according to specific application scenarios.

In some embodiments, referring to FIG. 12 , the specific physical environment of the application scenario is as follows:

Table 1 Example physical environment

Database version	PG10.11 database
Number of CPU cores	4
RAM	4G
Disk storage	20G
Database configuration parameter tuning mode	Offline

Among them, the physical environment in Table 1 is based on the PASS environment, and the PG10.11 database denotes a PostgreSQL database of version 10.11. In the offline debugging mode, before the business starts, the database tuning server 101 (which has a built-in database tuning module) tunes the configuration parameters; once a better configuration is obtained, it is configured directly into the business database, and the database service is then started.

In some embodiments, the database configuration parameter adjustment system includes a database configuration parameter self-tuning server 101 (the database tuning server), a database query pressure simulation server 103, and a PG database server (service server) based on the PASS platform. The database tuning server 101 is the core of this embodiment and is responsible for running the database tuning module, controlling the database user query pressure simulation server, and interacting with the PG10.11 database server. The service server is responsible for running the PG10.11 database, limiting the hardware resources available to the database, receiving queries from the database user query pressure simulation server, and interacting with the database tuning server 101. The database user query pressure simulation server is responsible for running the stress testing device and, under the control of the self-tuning module, performs stress tests on the PG10.11 database to simulate the database workload in actual operation. In this embodiment, the server running the PG10.11 database can serve as the service server.

Wherein, the database tuning server 101 is connected in two-way communication to the database query pressure simulation server 103 and to the service server 102, and the output of the database query pressure simulation server is connected to the service server through a one-way connection; the communication connection can be an RJ45 port or another communication connection.

In some embodiments, referring to Figures 13(a) and 13(b), the database configuration parameter adjustment method specifically includes:

Step S3100: Obtain database server hardware resource parameters;

Step S3200: Build a database tuning module, if there are already trained database configuration parameters, load them, otherwise go to the next step;

Step S3300: Acquire the default configuration parameters of the PG10.11 database;

Step S3400: reset the environment state of the PG10.11 database;

It is understandable that the purpose of resetting the environment state of the PG10.11 database is to rebuild the data tables in the database, so that changes to the tables accumulated over multiple stress tests are discarded, the database environment is consistent for each stress test, and the database performance parameters obtained from each stress test are comparable.

Step S3500: Configure the current database configuration parameters into the PG10.11 database on the database server through the network, and restart the database.

It is understandable that the current database configuration parameters are equal to the default configuration parameters only during the first round of model training. As model training iterates, the current configuration parameters are continuously updated, and some configuration parameters require a database restart to take effect.

Step S3600: collect database performance parameters for the first time;

Among them, the collected database state parameters include but are not limited to the types shown in Table 2:

Table 2 Database configuration parameter types

Step S3700: Control the database query pressure simulation server to start pressurizing the PG10.11 database in the PG database server, and start recording the database hardware resource state parameters in real time.

The database hardware resource status parameters include but are not limited to: CPU utilization, memory utilization, storage utilization and other parameters;

Step S3800: When the stress test of the PG10.11 database on the PG database server is completed, stop recording the database hardware resource status parameters;

Step S3900: collect database performance parameters for the second time;

Step S31000: Calculate the current performance parameters of the database from the difference between the database performance parameters collected for the first time in step S3600 and those collected for the second time in step S3900, and, from the database hardware resource status parameters recorded during the stress test, calculate the average values of the hardware resource status parameters, such as CPU utilization, memory utilization and storage utilization, over the test.
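The calculation in step S31000 can be illustrated with a minimal sketch. It assumes that the two collections are dictionaries of cumulative counters and that the hardware recording is a list of per-sample dictionaries; the function names and the counter names used in the usage note are hypothetical and are not taken from Table 2.

```python
from statistics import mean


def counter_delta(first, second):
    """Per-counter difference between the second and the first collection."""
    return {name: second[name] - first[name] for name in first if name in second}


def average_hardware_state(samples):
    """Average each hardware metric over the samples recorded during the stress test."""
    if not samples:
        raise ValueError("no hardware samples were recorded")
    return {key: mean(sample[key] for sample in samples) for key in samples[0]}
```

For example, `counter_delta({"xact_commit": 1000}, {"xact_commit": 4000})` gives the number of transactions committed during the stress test, and `average_hardware_state` reduces the real-time recording to one value per metric.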

Step S31100: Calculate the differences between the current values (the database performance parameters and hardware resource status parameters obtained in step S31000 and the current database configuration parameters from step S3500) and the historical record values of the same parameters. Then combine the current parameters with the obtained differences into a mixed representation to obtain the mixed representation vector of the database state.
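As a rough illustration of step S31100, the sketch below forms a state vector from the current values and their differences from the historical record. Plain concatenation is an assumption made for brevity; it stands in for the neural-network-based hybrid representation module described elsewhere in this application. The function name and the treatment of missing history (difference of zero) are likewise hypothetical.

```python
def hybrid_state(current, previous):
    """Concatenate current values with their differences from the previous round.

    Keys are sorted so that the vector layout is stable between rounds.
    A key with no historical value contributes a difference of 0.0 (assumption).
    """
    keys = sorted(current)
    values = [float(current[k]) for k in keys]
    diffs = [float(current[k]) - float(previous.get(k, current[k])) for k in keys]
    return values + diffs
```

Here `current` and `previous` would hold the performance, hardware resource status and configuration parameters of the current and previous rounds under the same keys.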

Step S31200: Input the generated database state hybrid representation vector into the self-tuning module to generate new database configuration parameters.

The generated database configuration parameters include, but are not limited to, as shown in Table 3:

Table 3 Database configuration parameter types

Step S31300: Calculate the reward value r of the current database configuration parameter according to each database state parameter and its historical record value collected in step S31000. The calculation formula is as follows:

Δperfor=cur_perfor-hist_perfor;

Δcpu_rate=cur_cpu_rate-hist_cpu_rate;

Δmen_rate=cur_men_rate-hist_men_rate;

Among them, α, β and γ are weighted values, r is the reward value of the current database configuration parameter, cur_perfor is the current database performance parameter, hist_perfor is the previous or historical database performance parameter, cur_cpu_rate is the current CPU utilization, hist_cpu_rate is the previous or historical CPU utilization, cur_men_rate is the current memory utilization, and hist_men_rate is the previous or historical memory utilization. Δperfor is the difference between the current database performance parameter and the previous or historical database performance parameter, Δcpu_rate is the difference between the current CPU utilization and the previous or historical CPU utilization, and Δmen_rate is the difference between the current memory utilization and the previous or historical memory utilization.

Among them, the cur_perfor and hist_perfor parameters can be parameters such as database throughput TPS and query delay;
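A minimal sketch of the reward calculation follows. It assumes the normalization described in the claims of this application, in which each difference is divided by its previous or historical value, multiplied by its weighted value, and the products are summed. The exact form and the signs of the weights are assumptions: the caller chooses them, for example a negative β or γ to penalize rising resource utilization, or a negative α when the performance parameter is a latency. The dictionary keys reuse the symbol names defined above; the function name is hypothetical.

```python
def reward(cur, hist, alpha, beta, gamma):
    """Weighted sum of the relative changes in performance, CPU and memory utilization."""

    def ratio(key):
        # Relative change: (current - historical) / historical.
        if hist[key] == 0:
            raise ValueError(f"historical value for {key!r} is zero")
        return (cur[key] - hist[key]) / hist[key]

    return (alpha * ratio("perfor")
            + beta * ratio("cpu_rate")
            + gamma * ratio("men_rate"))
```

With throughput as the performance parameter, a configuration that raises throughput by 50% while raising CPU utilization by 50% scores `0.25` under the weights α = 1, β = -0.5, γ = -0.5.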

Step S31400: Record the optimal database configuration parameter according to the reward value of the current database configuration parameter calculated in step S31300;

Step S31500: Store the previous database state parameters, current database configuration parameters, the reward value of the current database configuration parameters, and the current database state parameters as an interactive sample in the experience pool, where the database state parameters include database hardware resource state parameters, database performance parameters.

Step S31600: Perform batch sampling on the sample pool, and optimize the self-tuning module in step S31200.
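Steps S31500 and S31600 can be sketched as a small experience pool with uniform batch sampling. The fixed capacity, the uniform sampling and all class and field names are assumptions made for illustration; the application itself does not specify them here.

```python
import random
from collections import deque, namedtuple

# One interactive sample: previous state, configuration, its reward, resulting state.
Sample = namedtuple("Sample", ["prev_state", "config", "reward", "state"])


class ExperiencePool:
    """Fixed-capacity pool; the oldest samples are discarded when it is full."""

    def __init__(self, capacity=10000, seed=None):
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, prev_state, config, reward, state):
        self._buffer.append(Sample(prev_state, config, reward, state))

    def sample(self, batch_size):
        """Return up to batch_size samples drawn uniformly without replacement."""
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)
```

Sampling at random from the pool, rather than always training on the latest round, is a common way to reduce correlation between consecutive training samples.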

The specific self-tuning module optimization process is as described in step S1270, which is not repeated here.
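For orientation only, the following sketch shows three building blocks that are commonly used when a current action network, a current evaluation network and their target counterparts are trained together: a soft update of target parameters, a bootstrapped target value, and a mean squared loss for the evaluation network. This follows the usual actor-critic formulation and is an assumption, not a restatement of step S1270; the soft-update rate `tau` and the discount factor `discount` are hypothetical hyperparameters.

```python
def soft_update(target_params, current_params, tau=0.01):
    """Move each target parameter a fraction tau towards the current parameter."""
    return [(1.0 - tau) * t + tau * c for t, c in zip(target_params, current_params)]


def td_target(reward, next_action_value, discount=0.99):
    """Reward plus the discounted evaluation value of the next action."""
    return reward + discount * next_action_value


def evaluation_loss(current_action_values, targets):
    """Mean squared error between current action evaluation values and their targets."""
    pairs = list(zip(current_action_values, targets))
    return sum((q - y) ** 2 for q, y in pairs) / len(pairs)
```

In such a scheme the target networks change slowly, which keeps the target values stable while the current networks are optimized on sampled batches.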

Step S31700: Update the current database configuration parameters with the database configuration parameters generated in step S31200, and update the current database state record values with the database state parameters obtained in step S31000, where the database state parameters include database performance parameters and database hardware resource status parameters.

Step S31800: If the current iteration counter iter value is not greater than the maximum iteration value max_iter, return to step S3500, configure the database configuration parameters in the PG10.11 database, and continue the iterative process; otherwise, go to step S31900.

Step S31900: Return the optimal database configuration parameters and corresponding database performance parameters to the database administrator, and the database administrator decides whether to configure the database configuration parameters recommended by the model into the target database PG10.11 of the server.

Step S32000: Save the parameters of the current database self-tuning module.
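The iterative part of the procedure, steps S3500 to S31900, can be summarized by the following skeleton. Every callable passed in is a hypothetical placeholder for the corresponding component (configuration, stress test and measurement, self-tuning module, reward function, model optimization). Using the first measured state as its own historical value in the first round is an assumption.

```python
def tune(default_config, max_iter, apply_config, measure_state,
         propose_config, compute_reward, learn):
    """Run the tuning loop and return the best configuration and its reward."""
    config = dict(default_config)
    hist_state = None
    best_config, best_reward = dict(config), float("-inf")
    for _ in range(max_iter):                       # S31800: iterate up to max_iter
        apply_config(config)                        # S3500: configure (and restart)
        state = measure_state()                     # S3600-S31000: stress test, measure
        if hist_state is None:
            hist_state = state                      # assumption for the first round
        next_config = propose_config(state, hist_state, config)  # S31100-S31200
        r = compute_reward(state, hist_state)       # S31300: reward of current config
        if r > best_reward:                         # S31400: record the optimum
            best_config, best_reward = dict(config), r
        learn(hist_state, config, r, state)         # S31500-S31600: store and optimize
        config, hist_state = next_config, state     # S31700: update records
    return best_config, best_reward                 # S31900: report to administrator
```

The returned configuration is only a recommendation; as stated in step S31900, the database administrator decides whether to apply it to the target database.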

In some embodiments, the database parameter adjustment is performed in another application scenario, whose physical environment is shown in Table 4:

Table 4 Example physical environment

Database version	PG10.11 database
Number of CPU cores	48
Memory (RAM)	128G
Disk storage	1T
Database configuration parameter debug mode	Online

Referring to FIG. 14, this embodiment uses two database servers, both of which are PG10.11 database servers: a backup database server 102 and a business database server 104. The optimal configuration parameters of the database are searched for on the backup database server 102 and are then recommended to the business database server 104 for configuration, so that the tuning process does not interfere with the normal operation of business data.

The database configuration and debugging system is composed of four parts: a database tuning server 101 (which has a built-in database tuning module), a database query pressure simulation server 103, a backup database server 102 and a business database server 104. The database tuning server 101 is responsible for running the database tuning module, controlling the database query pressure simulation server 103, interacting with the backup database server 102, and configuring the parameters of the business database server 104. The backup database server 102 is responsible for running the database used for interaction, limiting the hardware resources available to that database, receiving queries from the database query pressure simulation server 103, and interacting with the database tuning server 101. The database query pressure simulation server 103 is responsible for running the stress testing tool and interacting with the database tuning server 101. The above servers are connected through network cables, but the connection is not limited to this method.

Execute the database configuration parameter adjustment method from steps S3100 to S32000 above.

In a third aspect, an embodiment of the present application provides an electronic device.

In some embodiments, referring to FIG. 15, the above-mentioned electronic device includes one or more processors 201 and a storage device 202 used to store one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method for adjusting database configuration parameters in the first aspect or the method for adjusting database configuration parameters in the second aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium.

In some embodiments, the above-mentioned computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to perform the database configuration parameter adjustment method in the first aspect or the database configuration parameter adjustment method in the second aspect.

The embodiments of the present application include: acquiring database state mixed representation parameters from a database server; inputting the database state mixed representation parameters into a deep reinforcement learning model to generate target database configuration parameters; and sending the target database configuration parameters to the database server. In the embodiments of the present application, a deep reinforcement learning model processes the acquired database state mixed representation parameters to generate target database configuration parameters, and the generated parameters are sent to the database server for configuration. This addresses the low degree of automation, slow speed and low efficiency of database configuration, and improves the automation, speed and efficiency of database configuration.

The apparatus embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

Those of ordinary skill in the art can understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and can include any information delivery media.

The above is a specific description of some implementations of the application, but the application is not limited to the above-mentioned embodiments. Those skilled in the art can also make various equivalent modifications or replacements without departing from the scope of the application. These equivalent modifications or replacements are included within the scope defined by the claims of the present application.

Claims

A method for adjusting database configuration parameters, applied to a database tuning module, comprising:

Get the database state hybrid representation parameters from the database server;

Inputting the database state mixed representation parameters into the deep reinforcement learning model to generate target database configuration parameters;

The target database configuration parameters are sent to the database server.
The method according to claim 1, wherein the database state mixed characterization parameters include one or more of the following: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters.
The method of claim 1, wherein the deep reinforcement learning model comprises a hybrid representation module and a self-tuning module, the hybrid representation module is connected to the self-tuning module;

The described database state mixed representation parameters are input into the deep reinforcement learning model to generate target database configuration parameters, including:

inputting the database state mixed characterization parameters into the mixed characterization module;

The hybrid characterization module uses a neural network model to process the database state hybrid characterization parameters to obtain a hybrid characterization vector;

The mixed representation vector is input to the self-tuning module, and the self-tuning module uses a reinforcement learning model to process the mixed representation vector to obtain target database configuration parameters.
The method according to claim 3, wherein, when the database state mixed characterization parameters include database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters;

The described hybrid characterization module uses a neural network model to process the database state hybrid characterization parameters to obtain a hybrid characterization vector, including:

Acquire the current database performance parameters and the previous database performance parameters; and, acquire the current database configuration parameters and the previous database configuration parameters; and, acquire the current hardware resource status parameters and the previous hardware resource status parameters; and, acquiring the hardware resource parameters;

Obtain the current database performance parameter feature vector according to the current database performance parameter; obtain the previous database performance parameter feature vector according to the previous database performance parameter; obtain the current database configuration parameter feature vector according to the current database configuration parameter; obtain the previous database configuration parameter feature vector according to the previous database configuration parameter; obtain the current hardware resource state parameter feature vector according to the current hardware resource state parameter; obtain the previous hardware resource state parameter feature vector according to the previous hardware resource state parameter; obtain the hardware resource parameter feature vector according to the hardware resource parameter;

According to the current database performance parameter feature vector and the previous database performance parameter feature vector, calculate the database performance parameter difference value feature vector; according to the current database configuration parameter feature vector and the previous database configuration parameter feature vector, Calculate and obtain the database configuration parameter difference feature vector; according to the current hardware resource state parameter feature vector and the previous hardware resource state parameter feature vector, calculate and obtain the hardware resource state parameter difference feature vector;

The database performance parameter feature vector, the database performance parameter difference feature vector, the database configuration parameter feature vector, the database configuration parameter difference feature vector, the hardware resource status parameter feature vector, the hardware resource The state parameter difference feature vector and the hardware resource parameter feature vector are input into the neural network model, and the neural network model is used to output a mixed representation vector.
The method of claim 3, wherein the deep reinforcement learning model further comprises a database configuration parameter reward function module;

Said inputting the database state mixed representation parameters into the deep reinforcement learning model to generate database configuration parameters, further comprising:

Inputting the database state mixed characterization parameters into the mixed characterization module and the database configuration parameter reward function module, respectively, to generate a reward strategy;

storing the reward strategy in the sample pool;

sampling the sample pool to obtain sampling data;

Using the sampled data, the database tuning module is optimized.
The method according to claim 5, wherein when the database state mixed characterization parameters include database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters, said inputting the database state mixed characterization parameters into the mixed characterization module and the database configuration parameter reward function module, respectively, to generate a reward strategy includes:

After normalizing the difference between the current database performance parameter and the previous database performance parameter, and the difference between the current hardware resource state parameter and the previous hardware resource state parameter, output the reward value of the current database configuration parameter;

The normalization process is to calculate the ratio of the difference between the current database performance parameter and the previous database performance parameter to the previous database performance parameter, and the ratio of the difference between the current hardware resource state parameter and the previous hardware resource state parameter to the previous hardware resource state parameter; the reward value of the current database configuration parameter is output after each ratio is multiplied by the corresponding weighted value and the products are summed;

According to the reward value of the current database configuration parameter, the current database configuration parameter, the current database performance parameter, the current hardware resource status parameter, the previous database performance parameter, and the previous hardware resource status parameter, The reward policy is generated.
The method according to claim 5, wherein the self-tuning module comprises: a current action network, a current evaluation network, a target action network, and a target evaluation network; the current action network is configured to generate the target database configuration parameters according to the database state hybrid representation parameters and send them to the database server; the sampled data includes the current database configuration parameters, the current database state parameters, the next database state parameters, and the reward value of the current database configuration parameters;

The using the sampled data to optimize the self-tuning module includes:

Inputting the current database state parameters into the current action network, and generating the next database configuration parameters;

inputting the next database configuration parameter and the current database state parameter into the current evaluation network to determine the first loss function value;

The current action network is optimized according to the first loss function value; the target action network is soft-updated according to the parameters output by the current action network;

Inputting the next database state parameters into the target action network to generate the next database configuration parameters;

Inputting the current database configuration parameters and the current database state parameters into the current evaluation network to generate a current action evaluation value;

inputting the next database configuration parameter and the next database state parameter into the target evaluation network to generate the next action evaluation value;

Calculate the second loss function value according to the current action evaluation value, the next action evaluation value and the reward value of the current database configuration parameter;

The current evaluation network is optimized by using the second loss function value, and at the same time, the target evaluation network is soft-updated through the parameters output by the current evaluation network.
The method according to claim 1, wherein the database server comprises a backup database server and a service database server, and the obtaining the database state hybrid representation parameters from the database server comprises:

Obtain the database state hybrid representation parameters of the backup database server;

and the sending the target database configuration parameters to the database server comprises:

Send the target database configuration parameters to the service database server.
The method according to any one of claims 1-8, wherein the obtaining the database state mixed representation parameters from the database server and the inputting the database state mixed representation parameters into the deep reinforcement learning model to generate target database configuration parameters include: obtaining the database state mixed representation parameters from the database server;

Inputting the database state mixed representation parameters into the deep reinforcement learning model to generate intermediate database configuration parameters;

Send the intermediate database configuration parameters to the database server, and repeat the above steps N times until the target database configuration parameters are obtained.
A database configuration parameter adjustment method, applied to a database server, includes:

Sending the database state mixed representation parameter to the database tuning module, so that the database tuning module executes the method according to any one of claims 1-9;

receiving the target database configuration parameters sent from the database tuning module;

performing parameter configuration on the database server according to the target database configuration parameters.
The method according to claim 10, wherein the database state mixed characterization parameters include one or more of the following: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters.
The method according to claim 10, wherein before the sending the database state mixed representation parameter to the database tuning module, so that the database tuning module executes the method according to any one of claims 1-9, the method further comprises:

Obtain the first mixed representation parameters of the database state;

Obtain workload stress signals from stress simulation servers;

performing a pressurization operation on the database server according to the workload pressure signal;

After the pressurization operation is completed, the second database state mixing characterization parameter is obtained.
The method according to claim 10, wherein the database server includes a backup database server and a business database server, and the sending the database state mixed representation parameter to the database tuning module, so that the database tuning module executes the method according to any one of claims 1-9, comprises:

Utilize the backup database server to send the database state mixed representation parameters to the database tuning module to execute the method according to any one of claims 1-9 to generate target database configuration parameters;

Utilize the business database server to receive the target database configuration parameters;

Parameter configuration is performed on the business database server according to the target database configuration parameters.
Electronic equipment, including:

at least one processor, and,

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor, such that when the at least one processor executes the instructions, the method according to any one of claims 1 to 9 or the method according to any one of claims 10 to 13 is implemented.
A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to cause a computer to perform the method according to any one of claims 1 to 9 or the method according to any one of claims 10 to 13.