CN113872788A - Database configuration parameter adjusting method, device and storage medium - Google Patents

Database configuration parameter adjusting method, device and storage medium

Info

Publication number
CN113872788A
CN113872788A (application CN202010618107.8A)
Authority
CN
China
Prior art keywords
database
parameters
parameter
current
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010618107.8A
Other languages
Chinese (zh)
Inventor
弄庆鹏
李忠良
屠要峰
郭斌
黄震江
陈小强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202010618107.8A
Priority to PCT/CN2021/102782 (published as WO2022001965A1)
Publication of CN113872788A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 Configuration management of networks or network elements
    • H04L 41/0803 Configuration setting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/08 Configuration management of networks or network elements
    • H04L 41/0876 Aspects of the degree of configuration automation


Abstract

Embodiments of the present application relate to a database configuration parameter adjusting method, a device, and a storage medium. The method includes: obtaining database state hybrid characterization parameters from a database server; inputting the database state hybrid characterization parameters into a deep reinforcement learning model to generate target database configuration parameters; and sending the target database configuration parameters to the database server. According to the embodiments of the present application, the database state hybrid characterization parameters from the database server can be processed by the deep reinforcement learning model to generate database configuration parameters, and the generated configuration parameters are sent to the database server to be applied, which overcomes the low degree of automation, low speed, and low efficiency of database configuration and effectively improves all three.

Description

Database configuration parameter adjusting method, device and storage medium
Technical Field
Embodiments of the present application relate to, but are not limited to, the technical field of databases, and in particular to a method, a device, and a storage medium for adjusting database configuration parameters.
Background
A database is an organized, shareable collection of data stored long-term in a computer. With the development of science and technology, the volume of data associated with user information has increased sharply. As the basic storage support of information technology services, databases are used ever more widely, and their wide application brings with it the problem of database optimization.
At present, the optimization of large-scale databases generally depends on database administrators, but differentiated configuration of databases is a complex, repetitive, time-consuming, and labor-intensive challenge for them, and administrators with different levels of experience produce configuration parameters of widely varying quality. As a result, the adjustment of database configuration parameters suffers from a low degree of automation, low speed, and low efficiency.
Disclosure of Invention
Embodiments of the present application provide a database configuration parameter adjusting method, device, and storage medium that can adjust database configuration parameters rapidly and efficiently so as to optimize them.
In a first aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, applied to a database tuning module, including: acquiring database state hybrid characterization parameters from a database server; inputting the database state hybrid characterization parameters into a deep reinforcement learning model to generate target database configuration parameters; and sending the target database configuration parameters to the database server.
In a second aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, applied to a database server, including: sending database state hybrid characterization parameters to a database tuning module, so that the database tuning module executes the method according to the first aspect; receiving the target database configuration parameters sent by the database tuning module; and performing parameter configuration on the database server according to the target database configuration parameters.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor, when executing the instructions, to implement the method of the first aspect or the method of the second aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of the first aspect or the method of the second aspect.
Embodiments of the present application include the following steps: acquiring database state hybrid characterization parameters from a database server; inputting the database state hybrid characterization parameters into a deep reinforcement learning model to generate target database configuration parameters; and sending the target database configuration parameters to the database server. According to the embodiments, the acquired database state hybrid characterization parameters are processed by the deep reinforcement learning model to generate the target database configuration parameters, and the generated configuration parameters are sent to the database server for configuration, which overcomes the low degree of automation, low speed, and low efficiency of database configuration and effectively improves all three.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the claimed subject matter and are incorporated in and constitute a part of this specification, illustrate embodiments of the subject matter and together with the description serve to explain the principles of the subject matter and not to limit the subject matter.
Fig. 1 is a flowchart of a database configuration parameter adjustment method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating adjustment of database configuration parameters according to an embodiment of the present application;
FIG. 3 is a flowchart of a database configuration parameter adjustment method according to another embodiment of the present invention;
FIG. 4 is a schematic diagram of a hybrid characterization module in database configuration parameter adjustment according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a database configuration parameter adjustment module according to another embodiment of the present application;
FIG. 6 is a schematic diagram of a reward function module in a database configuration parameter adjustment module according to another embodiment of the present application;
FIG. 7 is a diagram of a database configuration parameter self-tuning module according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating a database configuration parameter adjustment method according to another embodiment of the present invention;
FIG. 9 is a flow chart illustrating the principle of database configuration parameter self-optimization according to another embodiment of the present invention;
FIG. 10 is a flowchart illustrating a database configuration parameter adjustment method according to another embodiment of the present invention;
FIG. 11 is a flowchart illustrating a database configuration parameter adjustment method according to another embodiment of the present invention;
fig. 12 is an application scenario diagram of a database configuration parameter adjustment method according to an embodiment of the present invention;
fig. 13(a) and 13(b) are overall flowcharts of a database configuration parameter adjustment method according to another embodiment of the present invention;
fig. 14 is an application scenario diagram of a database configuration parameter adjustment method according to another embodiment of the present invention;
fig. 15 is a schematic diagram of an electronic device for adjusting database configuration parameters according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. In the present application, the embodiments and features of the embodiments may be arbitrarily combined with each other without conflict.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
A database is an organized, shareable collection of data stored long-term in a computer. With the development of science and technology, the volume of data associated with user information has increased sharply. As the basic storage support of information technology services, databases are used ever more widely, and their wide application brings with it the problem of database optimization.
At present, the optimization of large-scale databases generally depends on database administrators, but differentiated configuration of databases is a complex, repetitive, time-consuming, and labor-intensive challenge for them, and administrators with different levels of experience produce configuration parameters of widely varying quality. As a result, the adjustment of database configuration parameters suffers from a low degree of automation, low speed, and low efficiency.
Based on this, embodiments of the present application provide a database configuration parameter adjusting method, device, and storage medium. The acquired database state hybrid characterization parameters are processed by a deep reinforcement learning model to generate target database configuration parameters, and the generated configuration parameters are sent to the database server for configuration, which overcomes the low degree of automation, low speed, and low efficiency of database configuration and effectively improves all three.
It should be noted that, in the following embodiments, the electronic device may be a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), an augmented reality (AR)/virtual reality (VR) device, or the like; the specific form of the electronic device is not particularly limited in the embodiments of the present application.
In a first aspect, an embodiment of the present application provides a method for adjusting a database configuration parameter, which is applied to a database tuning module.
In some embodiments, referring to fig. 1, the database configuration parameter adjustment method includes:
step S1100, obtaining database state hybrid characterization parameters from a database server;
step S1200, inputting the database state hybrid characterization parameters into a deep reinforcement learning model to generate target database configuration parameters;
step S1300, sending the target database configuration parameters to the database server.
In some embodiments, the database state hybrid characterization parameters in step S1100 may be state parameters or configuration parameters of the database. The performance of the database, such as the database performance parameters among the database state parameters, can be learned from these hybrid characterization parameters, and the query and response speed of the database is directly affected by the quality of the database configuration parameters.
In some embodiments, in step S1200, after the database state hybrid characterization parameters are input into the deep reinforcement learning model for training, the target database configuration parameters are generated; these configuration parameters play an important role in optimizing the database.
It can be understood that machine learning is a multidisciplinary field drawing on probability theory, statistics, approximation theory, algorithmic complexity theory, and other disciplines. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve performance. Machine learning is divided into supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the computer is given inputs together with the expected outputs, and a model is trained to learn a general rule mapping inputs to outputs. In unsupervised learning, no labels are given and the algorithm discovers the structure of the input by itself; it can be a goal in its own right or a means to an end, and is also called feature learning. Reinforcement learning is the interaction of a program with a dynamic environment toward an explicit goal, such as driving a vehicle or playing a game against an opponent, in which a reward-and-punishment mechanism serves as feedback that guides the program through the problem domain.
In this embodiment, the deep learning model is a recurrent neural network model, hereinafter abbreviated RNN (Recurrent Neural Network), one of the deep learning algorithms. An RNN is a recursive neural network that takes sequence data as input and recurses along the evolution direction of the sequence, with all nodes (recurrent units) connected in a chain, so it can capture the correlation between the parameters and can synthesize and process the database state hybrid characterization parameters. The reinforcement learning model belongs to reinforcement learning; combining deep learning with reinforcement learning yields deep reinforcement learning, which overcomes the inability of traditional reinforcement learning algorithms to handle high-dimensional state spaces and realizes an end-to-end algorithm from perception to action, simplifying feature engineering and enabling unsupervised learning. At the same time, the agent can discover the internal connections between features in the course of self-learning. Deep reinforcement learning therefore has the potential to let an agent learn one or more skills fully autonomously; applying it in this embodiment provides the ability to process high-dimensional parameters and to continuously optimize the model parameters through autonomous learning.
In this embodiment, the database state hybrid characterization parameters acquired in step S1100 are input into the deep reinforcement learning model to generate the database configuration parameters, which achieves a faster convergence rate, increases the speed at which configuration parameters are generated, and improves configuration efficiency.
In some embodiments, the database server in step S1300 consists of one or more computers running in a local area network together with database management system software, and it can provide data services for client applications. Sending the target database configuration parameters generated in step S1200 to the database server allows the database server to be optimized, thereby improving the performance of the database it hosts.
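The three steps above can be pictured as a single tuning pass; the following sketch assumes a hypothetical client object exposing the collection and configuration interfaces described here (all names are illustrative, not part of the disclosed system):

```python
def tuning_step(db_client, drl_model):
    """One pass of steps S1100-S1300; db_client and drl_model are hypothetical."""
    # S1100: obtain the database state hybrid characterization parameters
    state_params = db_client.collect_state_parameters()

    # S1200: feed them into the deep reinforcement learning model to generate
    # the target database configuration parameters
    target_config = drl_model.generate_config(state_params)

    # S1300: send the target database configuration parameters to the server
    db_client.apply_configuration(target_config)
    return target_config
```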
According to the embodiments of the present application, the acquired database state hybrid characterization parameters can be processed by the deep reinforcement learning model to generate the target database configuration parameters, and the generated configuration parameters are sent to the database server for configuration, which overcomes the low speed and low efficiency of database configuration and effectively improves both.
In some embodiments, the database state hybrid characterization parameters include one or more of: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters.
In some embodiments, the database performance parameters include, but are not limited to, throughput, query latency, and the number of automatic vacuums. Throughput is the amount of data successfully transferred per unit time by a network, device, port, virtual circuit, or other facility, measured in bits, bytes, packets, and so on; its magnitude directly affects the response speed of data writes, data reads, and database access. Query latency is the delay from the start of a query to the database response, and it directly determines the access response speed of the database. The main function of automatic vacuum is to move free pages to the end of the database, thereby reducing the database size.
The hardware resource parameters include, but are not limited to, the number of CPUs, the memory capacity, and the storage space. A CPU (Central Processing Unit) is a very-large-scale integrated circuit that serves as the computation and control core of a computer; its main function is to interpret computer instructions and process the data in computer software. The main functions of memory are storing and loading; memory capacity can be understood as the capacity of the memory modules, commonly expressed in MB, and the larger the memory capacity, the better the hardware system runs. For example, the memory capacity of a computer usually refers to the capacity of its random access memory (RAM). Memory capacity is generally a power of two, such as 64 MB, 128 MB, or 256 MB, and the larger it is, the more favorably the hardware system or computer system operates.
The hardware resource state parameters include, but are not limited to, CPU utilization, memory utilization, and storage utilization. CPU utilization is the percentage of the CPU occupied by running processes; it directly affects the service life of the CPU, and excessively high CPU utilization can cause internal aging of the CPU and shorten its life. Memory utilization refers to the memory occupied by running processes; excessively high memory utilization slows down CPU access and affects the performance of the hardware resources as a whole.
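For illustration only, the three groups of parameters above can be gathered into one record; the field names in the following sketch are assumptions, not identifiers from the embodiment:

```python
from dataclasses import dataclass

@dataclass
class DatabaseStateSnapshot:
    # Database performance parameters
    throughput_tps: float       # data successfully processed per unit time
    query_latency_ms: float     # delay from query start to database response
    autovacuum_count: int       # number of automatic vacuum runs

    # Hardware resource parameters (static capacity)
    cpu_cores: int
    memory_gb: int
    storage_gb: int

    # Hardware resource state parameters (current usage)
    cpu_utilization: float      # fraction between 0.0 and 1.0
    memory_utilization: float
    storage_utilization: float
```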
In some embodiments, referring to fig. 2, the deep reinforcement learning model includes a hybrid characterization module and a self-optimization module connected to each other, wherein the hybrid characterization module uses an RNN model and the self-optimization module uses a reinforcement learning model.
The self-optimization module delivers database configuration parameters to the control interface of the server; through the control interface, the server drives a pressure-measuring (load-generating) device that applies a workload to the target database (for example, a database server). A parameter acquisition module collects hardware resource state parameters and database performance parameters from the target database; these, together with the hardware resource parameters of the server, are passed through the control interface to the hybrid characterization module in the database tuning module, which performs hybrid characterization and outputs a database state hybrid characterization vector to the self-optimization module. The self-optimization module then interacts with the database through the current action network of reinforcement learning, so that the configuration parameters it generates are delivered to the target database.
In this embodiment, the configuration parameters of the database are obtained by combining the deep reinforcement learning model, formed by the RNN model and the reinforcement learning model, with the database state hybrid characterization parameters, which increases the speed at which configuration parameters are generated and improves the efficiency of database configuration.
Referring to fig. 3, step S1200 specifically includes:
step S1210, inputting the database state hybrid characterization parameters into the hybrid characterization module;
step S1220, processing, by the hybrid characterization module, the database state hybrid characterization parameters with the RNN model to obtain a database state hybrid characterization vector;
step S1230, inputting the database state hybrid characterization vector into the self-optimization module, which processes it with the reinforcement learning model to obtain the target database configuration parameters.
In some embodiments, the purpose of inputting the database state hybrid characterization parameters into the hybrid characterization module in step S1210 is to obtain a database state hybrid characterization vector, that is, the various features are fused with one another inside the hybrid characterization module.
In this embodiment, hybrid characterization of the parameters ensures the completeness and accuracy of the database state characterization.
In some embodiments, in step S1220, a neural network model processes the features of the database state hybrid characterization parameters to obtain the database state hybrid characterization vector. More specifically, the neural network model used is a recurrent neural network (RNN), a structure that recurs over time: a typical RNN comprises an input, an output, and a neural network unit, where the unit is not only connected to the input and output but also loops back to itself, so that the network state at one time step acts on the network state at the next. The RNN improves the convergence speed of the model, which increases the speed at which database configuration parameters are generated and makes database configuration more efficient.
In some embodiments, the database state hybrid characterization parameters include a database performance parameter, a current database configuration parameter, a hardware resource parameter, and a hardware resource state parameter, and the step S1220 includes:
step S1221, acquiring current database performance parameters and previous database performance parameters; acquiring current database configuration parameters and previous database configuration parameters; acquiring current hardware resource state parameters and previous hardware resource state parameters; and acquiring hardware resource parameters;
step S1222, obtaining a current database performance parameter feature vector from the current database performance parameters; obtaining a previous database performance parameter feature vector from the previous database performance parameters; obtaining a current database configuration parameter feature vector from the current database configuration parameters; obtaining a previous database configuration parameter feature vector from the previous database configuration parameters; obtaining a current hardware resource state parameter feature vector from the current hardware resource state parameters; obtaining a previous hardware resource state parameter feature vector from the previous hardware resource state parameters; and obtaining a hardware resource parameter feature vector from the hardware resource parameters;
step S1223, calculating a database performance parameter difference feature vector from the current and previous database performance parameter feature vectors; calculating a database configuration parameter difference feature vector from the current and previous database configuration parameter feature vectors; and calculating a hardware resource state parameter difference feature vector from the current and previous hardware resource state parameter feature vectors;
step S1224, inputting the database performance parameter feature vector, the database performance parameter difference feature vector, the database configuration parameter feature vector, the database configuration parameter difference feature vector, the hardware resource state parameter feature vector, the hardware resource state parameter difference feature vector, and the hardware resource parameter feature vector into the neural network model, and outputting the database state hybrid characterization vector from the neural network model.
In some embodiments, after the parameters acquired in step S1221 are input into the hybrid characterization module, they are pre-processed and encoded in step S1222: the current and previous database performance parameters, the current and previous database configuration parameters, the current and previous hardware resource state parameters, and the hardware resource parameters are vectorized. The parameter difference vectors are then obtained in step S1223, and the parameter vectors together with the corresponding difference vectors are passed through the neural network model to output the database state hybrid characterization vector.
In some embodiments, referring to fig. 4, the history records in the figure are the previous database performance parameters, previous database configuration parameters, and previous database hardware resource state parameters. Combined with the current database performance parameters, current database configuration parameters, and current database hardware resource state parameters, they form, after pre-processing, the database state hybrid characterization vector that is passed to the self-optimization module; at the same time, the current database performance parameters, current database hardware resource state parameters, previous database performance parameters, previous database hardware resource state parameters, and previous database configuration parameters are placed into the sample pool and stored as a sample.
This embodiment thus realizes both the generation of the database state hybrid characterization vector for the self-optimization module and the generation of samples in the sample pool, providing a parameter basis for optimizing the database tuning module.
In some embodiments, seven vectors, namely the database performance parameter feature vector, the database performance parameter difference vector, the database configuration parameter feature vector, the database configuration parameter difference vector, the hardware resource state parameter feature vector, the hardware resource state parameter difference vector, and the hardware resource parameter vector, are input into the neural network model in step S1224, and the database state hybrid characterization vector is output by the neural network model. The seven feature vectors are correlated, and a recurrent neural network (RNN) extracts their relationships and encodes them into a lower dimension, realizing the database state hybrid characterization. This improves how comprehensively and accurately the characterization vector represents the database and the state of its hardware resource environment, and thus improves the effectiveness and efficiency of database configuration parameter adjustment.
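A minimal sketch of steps S1221 to S1224, assuming PyTorch and a GRU as one possible choice of RNN; the sequence ordering, layer sizes, and class name are illustrative assumptions:

```python
import torch
import torch.nn as nn

class HybridCharacterization(nn.Module):
    """Sketch of steps S1221-S1224: the seven feature vectors are treated as a
    short sequence and encoded by an RNN (a GRU here, one possible choice) into
    a single database state hybrid characterization vector."""

    def __init__(self, feature_dim=10, token_dim=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=feature_dim, hidden_size=token_dim,
                          batch_first=True)

    def forward(self, cur_perf, prev_perf, cur_conf, prev_conf,
                cur_hw_state, prev_hw_state, hw_params):
        seq = torch.stack([
            cur_perf,                       # database performance parameter feature vector
            cur_perf - prev_perf,           # performance parameter difference vector
            cur_conf,                       # database configuration parameter feature vector
            cur_conf - prev_conf,           # configuration parameter difference vector
            cur_hw_state,                   # hardware resource state parameter feature vector
            cur_hw_state - prev_hw_state,   # hardware state parameter difference vector
            hw_params,                      # hardware resource parameter feature vector
        ]).unsqueeze(0)                     # shape: (1, 7, feature_dim)
        _, hidden = self.rnn(seq)
        return hidden.squeeze(0).squeeze(0)  # the database state hybrid characterization vector
```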
In this embodiment, a database server whose hardware resource parameters are a 4-core CPU, 32 GB of memory, and 256 GB of storage is taken as an example to illustrate the core idea. The memory is the running memory, also called RAM, whose contents do not persist after shutdown; its size determines the running speed of the machine. The 256 GB figure is the storage capacity, i.e., 256 GB of data can be stored. Assuming the collected database-related data uses 10-dimensional feature vectors, the database hardware resource parameter feature vector is [4, 32, 256, 0, 0, 0, 0, 0, 0, 0], where 4 means the CPU has 4 cores, 32 means the memory size is 32 GB, 256 means the storage size is 256 GB, and the missing dimensions at the end of the vector are filled with zeros, that is, 7 zeros are appended as padding.
The current hardware resource state parameter feature vector in this embodiment is [0.6, 0.58, 0.3, 0, 0, 0, 0, 0, 0, 0], where 0.6 is the CPU utilization, 0.58 is the memory utilization, 0.3 is the storage utilization, and, as above, the trailing 7 zeros are missing dimensions. The previous hardware resource state parameter feature vector is [0.4, 0.48, 0.5, 0, 0, 0, 0, 0, 0, 0], and the hardware resource state parameter difference vector is obtained by subtracting the previous from the current feature vector, giving [0.2, 0.10, -0.2, 0, 0, 0, 0, 0, 0, 0];
the current configuration parameter feature vector in this embodiment is [542, 730, 9, 7, 55, 23, 99, 10, 67, 86], where 542 represents temp_buffers in the database configuration parameters, i.e., the temporary buffer size used by database sessions to access temporary table data; 730 represents work_mem, the memory used by internal sort operations and hash tables before writing to temporary files; 9 represents max_wal_size, the maximum size to which the WAL may grow between automatic WAL checkpoints; and the remaining entries are other related configuration parameters. If the previous configuration parameter feature vector is [372, 650, 3, 4, 32, 21, 76, 13, 67, 56], the database configuration parameter difference vector is the difference between the current and previous configuration parameter feature vectors, i.e., [170, 80, 6, 3, 23, 2, 23, -3, 0, 30].
The current database performance parameter feature vector in this embodiment is [1154.5, 8.5, 515, 9325.0, 854.8, 0.0, 91, 1.0, 6.7, 8.6], where 1154.5 represents the n_tup_ins status parameter, 8.5 represents the buffers_alloc parameter, 515 represents the xact_commit parameter, 9325.0 represents the n_dead_tup status parameter, and so on. The database performance parameter difference vector is [104.0, 3.0, 15, 2.0, -122.1, 0, 23, 0.2, 1.4, 1.0].
In this embodiment, the database performance parameters, database configuration parameters, and hardware resource parameters are characterized jointly through their current values and the differences between the current and previous values, where the previous values can be understood as historical values. It can be understood that, while tuning the database configuration parameters, one does not merely observe the current database performance parameters but observes how they change after each adjustment: the change in the database performance parameters is determined by the amount by which the configuration parameters change, and the current database performance parameters are also determined by the hardware resource parameters. Abstracting the manual tuning process of a database administrator therefore means characterizing the database state with the hardware resource parameters and the current and previous (historical) values of the database performance parameters and configuration parameters, and the RNN can capture the relationships among all of these parameters so that adjustments of the database configuration parameters can be made accurately and in real time.
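The padding and difference vectors of this worked example can be reproduced directly; the helper below is an illustrative assumption, and the commented values match the vectors above up to floating-point rounding:

```python
def pad_to(vec, dim=10):
    """Append zeros until the feature vector reaches the fixed width (here 10)."""
    return list(vec) + [0.0] * (dim - len(vec))

# Hardware resource parameters: 4 CPU cores, 32 GB memory, 256 GB storage
hw_params = pad_to([4, 32, 256])            # [4, 32, 256, 0, 0, 0, 0, 0, 0, 0]

# Hardware resource state parameters (utilization ratios)
cur_hw_state = pad_to([0.6, 0.58, 0.3])
prev_hw_state = pad_to([0.4, 0.48, 0.5])
hw_state_diff = [c - p for c, p in zip(cur_hw_state, prev_hw_state)]
# roughly [0.2, 0.10, -0.2, 0, ...] up to floating-point rounding

# Database configuration parameters (temp_buffers, work_mem, max_wal_size, ...)
cur_conf = [542, 730, 9, 7, 55, 23, 99, 10, 67, 86]
prev_conf = [372, 650, 3, 4, 32, 21, 76, 13, 67, 56]
conf_diff = [c - p for c, p in zip(cur_conf, prev_conf)]
# [170, 80, 6, 3, 23, 2, 23, -3, 0, 30]
```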
In some embodiments, referring to fig. 5, the deep reinforcement learning model further comprises a database configuration parameter reward function module;
specifically, step S1200 further includes:
s1240, respectively inputting the database state mixed characterization parameters into the mixed characterization module and the database configuration parameter reward function module to generate a reward strategy;
s1250, storing the reward strategy into a sample pool;
s1260, sampling the sample pool to obtain sampling data;
and S1270, optimizing the self-optimization module by using the sampling data.
It can be understood that the database state hybrid characterization parameters in S1240 include the database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters. The database configuration parameters are the parameters configured during database operation; they affect the database performance parameters and the database hardware state parameters and are generally referred to as the action a in reinforcement learning. The database performance parameters include, but are not limited to, throughput and latency indicators, and the database hardware state parameters include, but are not limited to, CPU utilization, memory utilization, and storage utilization. The database configuration parameter reward function in the reward function module is mainly used to evaluate the quality of the current database configuration parameters, for example whether the throughput and latency of the database meet the service requirements, and is generally referred to as the reward r in reinforcement learning.
The database performance parameters and the hardware resource state parameters may be collectively referred to as the database state parameters, called the state s in reinforcement learning. It can be understood that database state parameters and database configuration parameters come in pairs, that is, a set of database configuration parameters corresponds to a set of database state parameters; the database state parameters include, but are not limited to, database performance parameters, current configuration parameters, and hardware resource state parameters, where the hardware resource state parameters include the CPU utilization and/or the memory utilization.
In some embodiments, the database configuration parameter reward function module in step S1240 generates the reward value r of the current database configuration parameters, and the hybrid characterization module outputs the database configuration parameters, database performance parameters, and hardware resource state parameters, where the database performance parameters and hardware resource state parameters can jointly be regarded as the database state parameters. Denoting the current database state parameters as s_, the previous database state parameters as s, and the current database configuration parameters as a, the hybrid characterization module and the reward function module together output a sample (s, a, r, s_), and step S1250 then places this reward strategy in the sample pool.
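A minimal sketch of the sample pool holding (s, a, r, s_) tuples as described above; the class name, capacity, and random sampling strategy are assumptions:

```python
import random
from collections import deque

class SamplePool:
    """Stores (s, a, r, s_) tuples: previous database state, current database
    configuration parameters (action), reward of the current configuration
    parameters, and current database state."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def put(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Random sampling of the pool yields the data used to optimize the
        # self-optimization module (steps S1260-S1270).
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```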
In some embodiments, the step S1240 includes:
s1241, normalizing the difference value between the current database performance parameter and the previous database performance parameter and the difference value between the current hardware resource state parameter and the previous hardware resource state parameter, and outputting the reward value of the current database configuration parameter;
the normalization processing is to calculate the ratio of the difference value between the current database performance parameter and the previous database performance parameter in the previous database performance parameter; the occupation ratio of the difference value between the current hardware resource state parameter and the previous hardware resource state parameter in the previous hardware resource state parameter is summed up by multiplying the occupation ratio by a corresponding weighted value, and then the reward value of the current database configuration parameter is output;
s1242, generating a reward strategy according to the reward value of the current database configuration parameter, the current database performance parameter, the current hardware resource state parameter, the previous database performance parameter and the previous hardware resource state parameter.
In some embodiments, normalization in step S1241 means mapping a quantity to a decimal between 0 and 1. In this embodiment it is carried out by computing the difference between the current and previous database performance parameters and the difference between the current and previous hardware resource state parameters, taking the ratio of each difference to the corresponding previous value, multiplying each ratio by its weight, and summing, which yields the reward value of the current database configuration parameters. The specific formula is as follows:
r = α × (difference between current and previous database performance parameters / previous database performance parameters) + β × (difference between current and previous hardware resource state parameters / previous hardware resource state parameters);
where α and β are weights, and r is the value obtained by normalizing the database performance parameters and hardware resource state parameters, representing in this embodiment the reward value of the current database configuration parameters.
In some embodiments, referring to fig. 6, the hardware resource state parameters in step S1241 include the CPU utilization and the memory utilization. During the normalization of step S1241, the weights are set according to how much emphasis is placed on the database performance parameters, the CPU utilization, and the memory utilization during configuration parameter adjustment. The specific normalization formula is:
r = α*(Δperfor/hist_perfor) + β*(Δcpu_rate/hist_cpu_rate) + γ*(Δmen_rate/hist_men_rate);
Δperfor = cur_perfor - hist_perfor;
Δcpu_rate = cur_cpu_rate - hist_cpu_rate;
Δmen_rate = cur_men_rate - hist_men_rate;
wherein α, β, and γ are weights, r is the reward value of the current database configuration parameters, cur_perfor is the current database performance parameter, hist_perfor is the previous or historical database performance parameter, cur_cpu_rate is the current CPU utilization, hist_cpu_rate is the previous or historical CPU utilization, cur_men_rate is the current memory utilization, hist_men_rate is the previous or historical memory utilization, Δperfor is the difference between the current and previous (historical) database performance parameters, Δcpu_rate is the difference between the current and previous (historical) CPU utilization, and Δmen_rate is the difference between the current and previous (historical) memory utilization.
It can be understood that the weights can be set to adjust how much emphasis is placed on the database performance parameters, the CPU utilization, and the memory utilization during configuration parameter adjustment. For example, if the optimization focuses on the database performance parameters, the performance weight α can be set to a larger value; if it focuses on CPU utilization, β can be set larger; and likewise, if it focuses on memory utilization, γ can be set larger. Normalizing the components resolves the different dimensions and different ranges of change of the three parameters.
For example, in this embodiment the optimization focuses on the database performance parameters, so after comprehensive experimentation the weights are set to α = 0.8, β = 0.1, and γ = 0.1, with cur_perfor = 8550 tps, hist_perfor = 6800 tps, cur_cpu_rate = 85%, hist_cpu_rate = 56%, cur_men_rate = 72%, and hist_men_rate = 63%; according to the above normalization process, r = 0.2647, which is the reward output, that is, the reward value of the current database configuration parameters.
In this embodiment, the reward value of the current database configuration parameters is generated by combining the current values and the previous or historical values of the database performance parameters, the CPU utilization, and the memory utilization. It can be understood that, when optimizing the configuration parameters, the reward is not an absolute measure of the current database performance but rather reflects the change in the database performance parameters, CPU utilization, and memory utilization after the configuration parameters are adjusted, where the historical value of a parameter may be its historical maximum, the value collected under the database's default configuration, or a mean recorded over a sliding window. When calculating the reward, the differences of the database performance parameters, CPU utilization, and memory utilization are computed, the three differences are normalized, and their weighted sum is finally output as the reward.
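Read literally, the reward computation above amounts to a weighted sum of the three normalized differences; the sketch below applies the example weights α = 0.8, β = 0.1, γ = 0.1. The exact way the original formula image combines the components is partly an assumption reconstructed from the surrounding text, so the value produced by this sketch need not match the 0.2647 of the worked example exactly.

```python
def reward(cur_perfor, hist_perfor,
           cur_cpu_rate, hist_cpu_rate,
           cur_men_rate, hist_men_rate,
           alpha=0.8, beta=0.1, gamma=0.1):
    """Weighted sum of the normalized changes in database performance, CPU
    utilization and memory utilization; the default weights are the example
    values above, and the additive combination is an assumption."""
    d_perfor = cur_perfor - hist_perfor
    d_cpu_rate = cur_cpu_rate - hist_cpu_rate
    d_men_rate = cur_men_rate - hist_men_rate
    return (alpha * d_perfor / hist_perfor
            + beta * d_cpu_rate / hist_cpu_rate
            + gamma * d_men_rate / hist_men_rate)
```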
In some embodiments, referring to fig. 7, which is a schematic diagram of the self-optimization module, the module includes a current action network, a current evaluation network, a target action network, and a target evaluation network. The current action network mainly uses the current state to generate the next action; the current evaluation network evaluates the current state and the next action; the target action network uses the next state to generate the next action; and the target evaluation network evaluates the next action and the next state. Denoting the current state parameters as states, the current action as actions, the next state as next_states, and the next action as next_actions, the interaction among the four networks organically links the current evaluation network and the target evaluation network through a loss function.
In some embodiments, referring to figs. 7 and 8, the self-optimization module comprises a current action network, a current evaluation network, a target action network, and a target evaluation network. The current action network generates the target database configuration parameters from the database state hybrid characterization parameters and sends them to the database server. The sampling data in step S1270 include the current database configuration parameters, the current database state parameters, the next database state parameters, and the reward value of the current database configuration parameters, where the database state parameters include the database performance parameters and the database hardware resource state parameters. Step S1270 includes:
step S1271: inputting the current database state parameters into the current action network, and generating next database configuration parameters;
as can be appreciated, the current action network is further configured to output the destination database configuration parameters by processing the current database state parameters and the hybrid token vectors.
Step S1272: inputting the next database configuration parameters and the current database state parameters into the current evaluation network to determine a first loss function value;
step S1273: optimizing the current action network according to the first loss function value and performing soft update on the target action network according to the parameters output by the current action network;
wherein, the current database state parameter is states, and the first loss function is policy _ loss ═ Q' (states, next _ actions)
Wherein, policy _ loss can be used as a loss value of the current network, and can optimize the current action network, and Q' (states, next _ actions) represents the evaluation values of the current database state parameters and the next database configuration parameters in the current evaluation network.
Step S1274: inputting the next database state parameters into the target action network to generate next database configuration parameters;
wherein, the next database state parameter is represented as next _ states;
step S1275: inputting the current database configuration parameters and the current database state parameters into a current evaluation network to generate a current action evaluation value;
step S1276: the next database configuration parameters and the next database state parameters are sent to a target evaluation network to generate a next action evaluation value;
step S1277: calculating to obtain a second loss function value according to the current action evaluation value, the next action evaluation value and the reward value of the current database configuration parameter;
step S1278: and optimizing the current evaluation network by using the second loss function value, and simultaneously performing soft update on the target evaluation network through the parameters output by the current evaluation network.
Specifically, the second loss function formula is as follows:
Loss=Q'(next_state,next_action)-Q(state,action)
where Q'(next_state, next_action) is the next action evaluation value produced by the target evaluation network, Q(state, action) is the current action evaluation value produced by the current evaluation network, and Loss represents the loss value between the target evaluation network and the current evaluation network; Q(state, action) is shown as Q(s, a) in fig. 7.
The relationship between Q(state, action) and Q'(next_state, next_action) is:
Q(state,action)=r+γ*(max(Q'(next_state,next_action)));
where r is the reward value of the current database configuration parameters, max(Q'(next_state, next_action)) is the highest evaluation score that the action taken in the next state can achieve, i.e., in this embodiment the highest evaluation score produced by the database configuration corresponding to the next database state, and γ is a value between 0 and 1.
It can be understood that substituting the formula for Q(state, action) into the Loss formula results in:
Loss=-r+(1-γ)Q'(next_state,next_action)
When γ = 1, the loss reduces to Loss = -r, i.e., the current evaluation value plus the loss value (that is, plus -r) gives the target evaluation value, where r is the reward value of the current database configuration parameters from step S1241.
It can be understood that the self-optimization module in this embodiment comprises four parts, namely the current action network, current evaluation network, target action network, and target evaluation network, each of which is a neural network. The action network determines the best action to apply to the environment at the next moment according to the current state; here an action can be regarded as the configuration parameters to be configured in the database, and a state can be regarded as the state parameters of the database.
Specifically, the current evaluation network is optimized with the value obtained from the Loss formula, i.e., the second loss function value, which in turn shapes the output of the current action network; that is, the optimized current evaluation network makes it possible to predict the next database configuration parameters (the target database configuration parameters) corresponding to the input state parameters.
The target action network provides the target evaluation network with the action corresponding to the next state, that is, the database configuration parameters corresponding to the next database state; here the database state consists of the database performance parameters and the database hardware resource state parameters. The target evaluation network is mainly responsible for computing the target Q value, while the current evaluation network still needs to be optimized, so the target value is compared with the value generated by the current evaluation network to obtain the loss value used to optimize the current evaluation network.
It can be understood that, in this embodiment, the parameters of the target evaluation network can be periodically updated using the parameters output by the current evaluation network in a soft-update manner, and the loss function is applied once per update, which ensures that each update is adjusted according to the loss function and makes the module better optimized.
It can be understood that the optimization process of the whole self-optimization module is triggered by the update() function in the self-optimization module. The self-optimization module obtains from the sample pool the previous state parameters (state), the previous configuration parameters (action), the current state parameters (next_state) and the reward value reward (denoted r in the above embodiment) of the current database configuration parameters, and outputs the optimized database configuration parameters after the processes of steps S1271 to S1277. Because the reward value reward of the current database configuration parameters is taken into account, the database configuration parameters generated by the self-optimization module are closer to optimal; inputting them into the target database server for configuration improves the access speed, read speed and other processing of the database in the target database server.
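By way of illustration only, a minimal sketch of such an update step is given below in Python using PyTorch. The network classes, optimizer objects, batch layout and the hyper-parameters GAMMA and TAU are assumptions made for the sketch and are not specified by this embodiment; the mean-squared error is used in place of the raw difference in the Loss formula above.

    import torch
    import torch.nn.functional as F

    GAMMA, TAU = 0.99, 0.005  # assumed discount factor and soft-update rate

    def update(actor, critic, target_actor, target_critic, actor_opt, critic_opt, batch):
        # batch sampled from the sample pool: previous state, applied configuration (action),
        # reward of the current configuration, and the resulting (current) state
        state, action, reward, next_state = batch

        # current evaluation network: second loss function between target value and current value
        with torch.no_grad():
            next_action = target_actor(next_state)          # next configuration from the target action network
            target_q = reward + GAMMA * target_critic(next_state, next_action)
        critic_loss = F.mse_loss(critic(state, action), target_q)
        critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

        # current action network: first loss function, policy_loss = -Q(state, actor(state))
        policy_loss = -critic(state, actor(state)).mean()
        actor_opt.zero_grad(); policy_loss.backward(); actor_opt.step()

        # soft update of the target networks from the current networks
        for tgt, src in ((target_actor, actor), (target_critic, critic)):
            for tp, sp in zip(tgt.parameters(), src.parameters()):
                tp.data.mul_(1.0 - TAU).add_(TAU * sp.data)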
In some embodiments, referring to fig. 9, a schematic flow chart of the database tuning module is shown. The hybrid characterization module is responsible for providing the database state hybrid characterization vector for the tuning module and also provides parameter samples for the sample pool. The hybrid characterization module receives the database performance parameters, database configuration parameters and database hardware resource state parameters sent by the target database and performs hybrid characterization processing on them; meanwhile, the target database receives the database configuration parameters generated by the tuning module. This forms an organic system that updates the database configuration parameters continuously, so that the generation, optimization and configuration of the database parameters can be completed quickly and efficiently.
In some embodiments, the current action network is responsible for the interaction of the self-optimization module with the database environment, i.e. it generates database configuration parameters according to the database state parameters and configures them into the database. The current evaluation network is responsible for evaluating the value (quality) of each database state-action pair, i.e. the Q value, so as to determine which action should be taken in which database state. In this embodiment this means determining which database configuration parameters to adopt in which database state to obtain higher database performance, where the database state includes the database performance parameters and the database hardware resource state parameters, database performance includes throughput, latency and the like, and high database performance means high throughput, low latency and so on.
In some embodiments, the database server includes a backup database server and a business database server, and step S1100 includes:
step S1110: acquiring the database state hybrid characterization parameters of the backup server;
step S1300 includes:
step S1310: and sending the target database configuration parameters to a service database server.
It will be appreciated that the service database server is the server that is running and in use.
In some embodiments, when offline-mode optimization is performed on the database server, the server whose database configuration needs to be adjusted may be called the service server, while the adjustment of the database configuration parameters is carried out on a backup server. Because the database state hybrid characterization parameters of the service server and the backup server are the same, the database state hybrid characterization parameters of the backup server are obtained through step S1110, so that the target database configuration parameters generated after performing the method of steps S1100 to S1300 are also applicable to the service server.
In some embodiments, after the backup server performs configuration optimization through the newly generated target configuration parameters, and the database operates normally, step S1310 may be executed to send the target database configuration parameters to the service database server for tuning the service database server.
According to this embodiment, the adjustment of the database configuration parameters can be performed on the backup server, and the relevant configuration parameters are configured into the service server only after the configuration succeeds and the database runs normally. Adjusting the database configuration parameters for the service server in this offline mode prevents the running service server from being damaged by an accident during the adjustment process and thus avoids unnecessary loss.
In some embodiments, when online-mode optimization is performed on the database server, the database tuning module obtains the database state hybrid characterization parameters directly from the service database. After obtaining them, it executes the method of steps S1100 to S1300 to obtain the target database configuration parameters and transmits the newly generated target database configuration parameters to the service database in real time, so that the service database configures the database configuration parameters in real time.
The embodiment can realize real-time online optimization of the configuration parameters of the service database and improve the optimization efficiency.
In some embodiments, steps S1100 and S1200 of the database configuration parameter adjustment method include:
step S1120: acquiring the database state hybrid characterization parameters from the database server;
step S1280: inputting the database state hybrid characterization parameters into the deep reinforcement learning model to generate intermediate database configuration parameters, and sending the intermediate database configuration parameters to the database server;
step S1290: repeating the above steps N times until the target database configuration parameters are obtained. That is, the intermediate database configuration parameters are input into the database server as its configuration parameters, and steps S1120 and S1280 are executed repeatedly until N iterations are reached, thereby obtaining the target database configuration parameters, where N is a positive integer whose value may be preset. More optimal database configuration parameters are obtained through the multiple iterations.
In some embodiments, the preset iteration number N is denoted max_iter, i.e. the maximum number of iterations; the iteration counter is denoted iter, and incrementing it is written iter++. Each time an iteration is performed, the value of iter is increased by 1, and the iteration ends when iter reaches the maximum iteration number max_iter, i.e. N in this embodiment. In other words, when iter > N the iteration is finished and more optimal database configuration parameters are obtained; if iter ≤ max_iter, steps S1120 and S1280 continue to be executed repeatedly until iter reaches max_iter. The value of N may be preset and can be determined from experimental or empirical values.
According to the embodiment, model training convergence can be continuously performed in a mode of presetting iteration times until more optimal database configuration parameters are obtained, so that the real-time performance and the high efficiency of database configuration parameter configuration are enhanced.
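A minimal sketch of this iterative loop is given below in Python; get_state_params, generate_config and apply_config are hypothetical helper names used only for illustration, and the value of MAX_ITER is assumed.

    MAX_ITER = 50  # preset iteration number N (assumed value)

    def iterate_tuning(db_server, model, max_iter=MAX_ITER):
        config = None
        it = 0                                       # the iteration counter "iter"
        while it < max_iter:                         # repeat steps S1120 and S1280 N times
            state = db_server.get_state_params()     # step S1120: state hybrid characterization parameters
            config = model.generate_config(state)    # step S1280: intermediate database configuration parameters
            db_server.apply_config(config)           # feed the intermediate configuration back to the server
            it += 1                                  # iter++
        return config                                # target database configuration parameters after N iterations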
In a second aspect, an embodiment of the present application provides a method for adjusting database configuration parameters, which is applied to a database server.
In some embodiments, referring to fig. 10, the database configuration parameter adjustment method includes:
step S2100, sending the database state hybrid representation parameter to the database tuning module, so that the database tuning module executes the method according to the first aspect;
step S2200, receiving the target database configuration parameters sent by the database tuning module;
and step S2300, configuring parameters of the database server according to the target database configuration parameters.
In some embodiments, the database state hybrid characterization parameters include one or more of the following: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource status parameters.
In some embodiments, referring to fig. 11, before step S2100 the method further includes:
step S2400, acquiring a first database state mixed characterization parameter;
step S2500, acquiring a working load pressure signal from a pressure simulation server;
step S2600, pressurizing the database server according to the workload pressure signal;
and step S2700, acquiring the secondary database state mixed characterization parameters after the pressurizing operation is finished.
In some embodiments, the database server includes a backup database server and a business database server, and step S2100 includes:
step S2110: sending the database state hybrid characterization parameters to the database tuning module by using the backup database server, so that the methods of steps S1100 to S1300 are executed to generate the target database configuration parameters;
step S2120: receiving the target database configuration parameters by using the service server;
step S2130: and performing parameter configuration on the service database server according to the target database configuration parameters.
In order to better understand the core idea of the embodiments of the present invention, the overall scheme for adjusting the database configuration parameters is described below with reference to a specific application scenario.
In some embodiments, referring to fig. 12, the specific physical environment of the application scenario is as follows:
table 1 example physical environment
Database version: PG10.11 (PostgreSQL 10.11)
Number of CPU cores: 4
Memory: 4 GB
Disk storage: 20 GB
Database configuration parameter debugging mode: offline
The physical environment in Table 1 is based on a PaaS environment, and the PG10.11 database denotes a PostgreSQL 10.11 database. In the offline debugging mode, the database configuration parameters are optimized by the database tuning server 101 (which has the database tuning module built in) before the service starts, the optimized configuration is configured directly into the service database, and the database service is then started.
In some embodiments, the database configuration parameter adjustment system comprises three parts: the database configuration parameter self-optimization server (database tuning server) 101, the database query pressure simulation server 103, and a PaaS-platform-based PG database server (the service server). The database tuning server 101 is the core of this embodiment; it is responsible for running the database tuning module, controlling the database user query pressure simulation server, and interacting with the PG10.11 database server. The service server is responsible for running the PG10.11 database, limits the hardware resources available for database operation, receives queries from the database user query pressure simulation server, and interacts with the database tuning server 101. The database user query pressure simulation server is responsible for running the pressure-measurement tool and, under the control of the self-optimization module, performs pressure tests on the PG10.11 database to simulate the workload of the database in actual operation. In this embodiment, the server running the PG10.11 database may serve as the service server.
The database tuning server 101 is in bidirectional communication connection with the database query pressure simulation server 103 and the service server 102, respectively, the output of the database query pressure simulation server is in unidirectional connection with the service server, and the communication connection mode may be an RJ45 port or other communication connection modes.
In some embodiments, referring to fig. 13(a) and 13(b), the method for adjusting the database configuration parameters specifically includes:
step S3100: acquiring hardware resource parameters of a database server;
step S3200: building the database tuning module, and loading the trained parameters if they exist; otherwise, proceeding to the next step;
step S3300: acquiring default configuration parameters of a PG10.11 database;
step S3400: resetting the environmental state of the PG10.11 database;
it can be understood that resetting the PG10.11 database environment state means rebuilding the database tables in the database; this prevents the tables from changing after multiple pressure tests and ensures that the database environment is consistent for each pressure test, so that the database performance parameters obtained in each pressure test are comparable.
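As an illustration only, one way such a reset could be scripted is sketched below in Python, assuming the benchmark tables are managed with the standard pgbench tool; the host, port, user and database name are placeholders.

    import subprocess

    def reset_pg_environment(host="127.0.0.1", port=5432, user="postgres", dbname="bench"):
        # Re-initialize the benchmark tables so that every pressure test starts
        # from the same database environment (pgbench -i drops and recreates its tables).
        subprocess.run(
            ["pgbench", "-i", "-h", host, "-p", str(port), "-U", user, dbname],
            check=True,
        )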
Step S3500: and configuring the current database configuration parameters into the database of the database server PG10.11 through a network, and resetting the database.
It can be understood that the current database configuration parameters are equal to the default configuration parameters only in the first round of model training; the current configuration parameters are continuously updated as model training iterates, and some configuration parameters take effect only after the database is reset (restarted).
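By way of illustration, parameters of this kind could be written to a PostgreSQL 10.11 server and activated by a restart roughly as sketched below; the connection string, data directory and the assumption that parameter names come from a trusted, predefined list are illustrative only.

    import subprocess
    import psycopg2

    def apply_config_and_restart(params, dsn="dbname=postgres user=postgres",
                                 pgdata="/var/lib/pgsql/10/data"):
        # Persist the recommended configuration parameters with ALTER SYSTEM
        # (written to postgresql.auto.conf), then restart the server so that
        # parameters which only take effect after a restart become active.
        conn = psycopg2.connect(dsn)
        conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
        with conn.cursor() as cur:
            for name, value in params.items():
                # parameter names are assumed to come from a trusted, predefined list
                cur.execute("ALTER SYSTEM SET {} = %s".format(name), (str(value),))
        conn.close()
        subprocess.run(["pg_ctl", "-D", pgdata, "restart"], check=True)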
Step S3600: collecting the performance parameters of the database for the first time;
the collected database state parameters include, but are not limited to, the types shown in table 2:
TABLE 2 database configuration parameter types
(The contents of Table 2 are provided as an image in the original publication and are not reproduced here.)
Step S3700: and controlling a database query pressure simulation server to start pressurizing the PG10.11 database in the PG database server and start recording the state parameters of the hardware resources of the database in real time.
The database hardware resource status parameters include but are not limited to: parameters such as CPU utilization rate, memory utilization rate and storage utilization rate;
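As a sketch only, such parameters could be sampled with the psutil library as shown below; the mount point and sampling interval are assumptions.

    import psutil

    def sample_hardware_state(mount_point="/"):
        # One sample of the database hardware resource state parameters, in percent:
        # CPU utilization, memory utilization and storage utilization.
        return {
            "cpu_rate": psutil.cpu_percent(interval=1.0),
            "mem_rate": psutil.virtual_memory().percent,
            "storage_rate": psutil.disk_usage(mount_point).percent,
        }

During pressurization these samples would be recorded periodically and averaged afterwards in step S31000.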
step S3800: after the pressurization of the PG10.11 database in the PG database server is finished, stopping recording the database hardware resource state parameters;
step S3900: performing secondary collection on the performance parameters of the database;
step S31000: calculating the current database performance parameters from the difference between the database performance parameters acquired for the first time in step S3600 and those acquired for the second time in step S3900, and calculating, from the records of the database hardware resource state parameters during the pressure test, the mean values of these parameters, such as the CPU utilization rate, the memory utilization rate and the storage utilization rate.
Step S31100: computing the difference values between the database performance parameters, the hardware resource state parameters and the current database configuration parameters obtained in step S31000 and their historical record values, and then performing hybrid characterization on the current parameters together with the computed difference values to obtain the database state hybrid characterization vector.
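A minimal sketch of assembling such a vector is given below in Python; it shows only the concatenation of the current parameter groups with their differences against the historical record values, and omits the neural-network encoding that produces the final hybrid characterization vector. All argument names are illustrative.

    import numpy as np

    def assemble_state_vector(cur_perf, hist_perf, cur_conf, hist_conf,
                              cur_res, hist_res, hw_params):
        # Difference of each parameter group against its historical record value,
        # then concatenation of current values and differences into one vector.
        cur_perf, hist_perf = np.asarray(cur_perf, float), np.asarray(hist_perf, float)
        cur_conf, hist_conf = np.asarray(cur_conf, float), np.asarray(hist_conf, float)
        cur_res, hist_res = np.asarray(cur_res, float), np.asarray(hist_res, float)
        return np.concatenate([
            cur_perf, cur_perf - hist_perf,   # database performance parameters and differences
            cur_conf, cur_conf - hist_conf,   # database configuration parameters and differences
            cur_res, cur_res - hist_res,      # hardware resource state parameters and differences
            np.asarray(hw_params, float),     # hardware resource parameters
        ])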
Step S31200: and inputting the generated database state mixed characterization vector into a self-optimization module to generate new database configuration parameters.
The generated database configuration parameters include, but are not limited to, those shown in Table 3:
TABLE 3 database configuration parameter types
(The contents of Table 3 are provided as an image in the original publication and are not reproduced here.)
Step S31300: calculating the reward value r of the current database configuration parameter according to each database state parameter and the historical record value thereof collected in the step S31000, wherein the calculation formula is as follows:
r = α*(Δperfor/hist_perfor) + β*(Δcpu_rate/hist_cpu_rate) + γ*(Δmen_rate/hist_men_rate);
Δperfor=cur_perfor-hist_perfor;
Δcpu_rate=cur_cpu_rate-hist_cpu_rate;
Δmen_rate=cur_men_rate-hist_men_rate;
where α, β and γ are weight values, r is the reward value of the current database configuration parameters, cur_perfor is the current database performance parameter, hist_perfor is the previous or historical database performance parameter, cur_cpu_rate is the current CPU utilization rate, hist_cpu_rate is the previous or historical CPU utilization rate, cur_men_rate is the current memory utilization rate, hist_men_rate is the previous or historical memory utilization rate, Δperfor is the difference between the current and the previous or historical database performance parameter, Δcpu_rate is the difference between the current and the previous or historical CPU utilization rate, and Δmen_rate is the difference between the current and the previous or historical memory utilization rate.
The cur_perfor and hist_perfor parameters may be parameters such as the database throughput (TPS) and the query latency;
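A minimal sketch of this reward calculation is given below; the weight values are assumptions (the embodiment does not fix them), and in practice the resource terms might be given negative weights so that lower resource usage is rewarded.

    def reward_value(cur_perfor, hist_perfor, cur_cpu_rate, hist_cpu_rate,
                     cur_men_rate, hist_men_rate, alpha=1.0, beta=0.5, gamma=0.5):
        # Normalized differences (proportion of the change relative to the historical value),
        # multiplied by the corresponding weights and summed.
        d_perfor = (cur_perfor - hist_perfor) / hist_perfor
        d_cpu = (cur_cpu_rate - hist_cpu_rate) / hist_cpu_rate
        d_men = (cur_men_rate - hist_men_rate) / hist_men_rate
        return alpha * d_perfor + beta * d_cpu + gamma * d_men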
step S31400: recording the optimal database configuration parameters according to the reward value of the current database configuration parameters calculated in the step S31300;
step S31500: and storing the previous database state parameter, the current database configuration parameter, the reward value of the current database configuration parameter and the current database state parameter into an experience pool as an interaction sample, wherein the database state parameter comprises a database hardware resource state parameter and a database performance parameter.
Step S31600: sampling the sample pool in batches, and optimizing the self-optimization module in the step S31200.
The specific optimization process of the self-optimization module is described in step S1270, and is not described herein again.
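A minimal sketch of such an experience pool and its batch sampling is given below; the capacity and batch size are assumed values.

    import random
    from collections import deque

    class ExperiencePool:
        def __init__(self, capacity=10000):
            self.pool = deque(maxlen=capacity)

        def store(self, state, action, reward, next_state):
            # One interaction sample: previous state parameters, current configuration
            # parameters (action), reward of the current configuration, current state parameters.
            self.pool.append((state, action, reward, next_state))

        def sample(self, batch_size=32):
            # Random batch used to optimize the self-optimization module (step S31600).
            return random.sample(list(self.pool), min(batch_size, len(self.pool)))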
Step S31700: updating the current database configuration parameters according to the database configuration parameters generated in the step S31200, and updating the current database state record values according to the database state parameters obtained in the step S31000, where the database state parameters include database performance parameters and database hardware resource state parameters.
Step S31800: if the iter value of the current iteration counter is not greater than the max _ iter iteration value, returning to the step S3500, configuring the database configuration parameters into the PG10.11 database, and continuing the iteration process; otherwise, step S31900 is executed.
Step S31900: the optimal database configuration parameters and corresponding database performance parameters are returned to the database manager, and the database manager determines whether to configure the database configuration parameters recommended by the model into the target database PG10.11 of the server.
Step S32000: and saving the self-optimization module parameters of the current database.
In some embodiments, another application scenario for database parameter adjustment is provided, with the physical environment shown in Table 4:
table 4 example physical environment
Database version: PG10.11 (PostgreSQL 10.11)
Number of CPU cores: 48
Memory: 128 GB
Disk storage: 1 TB
Database configuration parameter debugging mode: online
Referring to fig. 14, this embodiment adopts a two-database-server mode: one server is the backup database server 102, the other is the service database server 104, and both are PG10.11 database servers. Better database configuration parameters are first searched for on the backup database server 102 and then recommended to the service database server 104 for configuration, so that the tuning process does not interfere with the normal operation of the business data.
The database configuration debugging system comprises the database tuning server 101 (with the database tuning module built in), the database query pressure simulation server 103, the backup database server 102 and the service database server 104. The database tuning server 101 is responsible for running the database tuning module, controlling the database query pressure simulation server 103, interacting with the backup database server 102 and configuring the parameters of the service database server 104. The backup database server 102 is responsible for running the interactive database, limits the hardware resources available for database operation, receives queries from the database query pressure simulation server 103, and interacts with the database tuning server 101. The database query pressure simulation server 103 is responsible for running the pressure-measurement tool and interacting with the database tuning server 101. The servers are connected by network cables, but the connection method is not limited thereto.
The database configuration parameter adjustment method of steps S3100 to S32000 described above is executed.
In a third aspect, an embodiment of the present application provides an electronic device.
In some embodiments, referring to fig. 15, the electronic device described above includes one or more processors 201; a storage device 202 for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to perform: the database configuration parameter adjustment method according to the first aspect; alternatively, the database configuration parameter adjustment method according to the second aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium.
In some embodiments, the computer-readable storage medium stores computer-executable instructions for performing: a database configuration parameter adjustment method as in the first aspect; alternatively, the database configuration parameter adjustment method according to the second aspect.
The above-described apparatus embodiments are merely illustrative; the units described as separate components may or may not be physically separate, that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
While the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (15)

1. A database configuration parameter adjusting method is applied to a database tuning module and comprises the following steps:
acquiring database state hybrid characterization parameters from a database server;
inputting the database state mixed characterization parameters into a deep reinforcement learning model to generate target database configuration parameters;
and sending the target database configuration parameters to a database server.
2. The method of claim 1, wherein the database-state hybrid characterization parameters include one or more of: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource status parameters.
3. The method of claim 1, wherein the deep reinforcement learning model comprises a hybrid characterization module and a self-optimization module, and the hybrid characterization module is connected with the self-optimization module;
the step of inputting the database state hybrid representation parameters into a deep reinforcement learning model to generate target database configuration parameters comprises the following steps:
inputting the database state hybrid characterization parameters into a hybrid characterization module;
the hybrid characterization module processes the database state hybrid characterization parameters by using a neural network model to obtain hybrid characterization vectors;
and inputting the mixed characterization vector into the self-optimization module, and processing the mixed characterization vector by the self-optimization module by using a reinforced learning model to obtain a target database configuration parameter.
4. The method of claim 3, wherein when the database state hybrid characterization parameters include database performance parameters, current database configuration parameters, hardware resource state parameters;
the processing the hybrid characterization module on the database state hybrid characterization parameters by using a neural network model to obtain a hybrid characterization vector includes:
acquiring the current database performance parameter and the previous database performance parameter; acquiring the current database configuration parameters and the previous database configuration parameters; acquiring the current hardware resource state parameter and the previous hardware resource state parameter; and acquiring the hardware resource parameters;
obtaining a characteristic vector of the current database performance parameter according to the current database performance parameter; obtaining a characteristic vector of the performance parameter of the database at the previous time according to the performance parameter of the database at the previous time; obtaining a characteristic vector of the current database configuration parameter according to the current database configuration parameter; obtaining a characteristic vector of the database configuration parameter of the previous time according to the database configuration parameter of the previous time; obtaining a current hardware resource state parameter feature vector according to the current hardware resource state parameter; obtaining a hardware resource state parameter feature vector of the previous time according to the hardware resource state parameter of the previous time; obtaining the hardware resource parameter feature vector according to the hardware resource parameter;
calculating to obtain a database performance parameter difference value feature vector according to the current database performance parameter feature vector and the previous database performance parameter feature vector; calculating to obtain a database configuration parameter difference value eigenvector according to the current database configuration parameter eigenvector and the last database configuration parameter eigenvector; calculating to obtain a hardware resource state parameter difference value eigenvector according to the current hardware resource state parameter eigenvector and the previous hardware resource state parameter eigenvector;
inputting the database performance parameter feature vector, the database performance parameter difference feature vector, the database configuration parameter difference feature vector, the hardware resource state parameter difference feature vector and the hardware resource parameter feature vector into the neural network model, and outputting a mixed characterization vector by using the neural network model.
5. The method of claim 3, wherein the deep reinforcement learning model further comprises a database configuration parameter reward function module;
the step of inputting the database state hybrid representation parameters into a deep reinforcement learning model to generate database configuration parameters further comprises:
respectively inputting the database state mixed characterization parameters into a mixed characterization module and a database configuration parameter reward function module to generate a reward strategy;
storing the reward policy in a sample pool;
sampling the sample pool to obtain sampling data;
and optimizing the database tuning module by using the sampling data.
6. The method of claim 5, wherein when the database state hybrid representation parameters include database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource state parameters, the inputting the database state hybrid representation parameters into the hybrid representation module and the database configuration parameter reward function module, respectively, generates a reward policy, comprising:
normalizing the difference value between the current database performance parameter and the previous database performance parameter and the difference value between the current hardware resource state parameter and the previous hardware resource state parameter, and outputting the reward value of the current database configuration parameter;
the normalization processing is to calculate the proportion of the difference value between the current database performance parameter and the previous database performance parameter in the previous database performance parameter; the current hardware resource state parameter and the previous hardware resource state parameter difference value are in the proportion of the previous hardware resource state parameter, and the reward value of the current database configuration parameter is output after the proportion is multiplied by the corresponding weighted value and summed;
and generating the reward strategy according to the reward value of the current database configuration parameter, the current database performance parameter, the current hardware resource state parameter, the previous database performance parameter and the previous hardware resource state parameter.
7. The method of claim 5, wherein the self-tuning module comprises: a current action network, a current evaluation network, a target action network, and a target evaluation network; the current action network is used for generating target database configuration parameters according to the database state mixed characterization parameters and sending the target database configuration parameters to the database server; the sampling data comprises current database configuration parameters, current database state parameters, next database state parameters and reward values of the current database configuration parameters;
the optimizing the self-tuning module by using the sampling data comprises the following steps:
inputting the current database state parameters into a current action network, and generating next database configuration parameters;
inputting the next database configuration parameter and the current database state parameter into a current evaluation network to determine a first loss function value;
optimizing the current action network according to the first loss function value; performing soft update on a target action network according to the parameters output by the current action network;
inputting the next database state parameters into the target action network to generate next database configuration parameters;
inputting the current database configuration parameters and the current database state parameters into a current evaluation network to generate a current action evaluation value;
the next database configuration parameters and the next database state parameters are sent to a target evaluation network to generate a next action evaluation value;
calculating to obtain a second loss function value according to the current action evaluation value, the next action evaluation value and the reward value of the current database configuration parameter;
and optimizing the current evaluation network by using the second loss function value, and simultaneously performing soft update on the target evaluation network through the parameters output by the current evaluation network.
8. The method of claim 1, wherein the database server comprises a backup database server and a business database server, and the obtaining the database state hybrid characterization parameters from the database server comprises:
acquiring database state hybrid characterization parameters of a backup server;
sending the target database configuration parameters to a database server, comprising:
and sending the target database configuration parameters to a service database server.
9. The method according to any one of claims 1-8, wherein the obtaining of the database state hybrid characterization parameters from the database server and the inputting of the database state hybrid characterization parameters into the deep reinforcement learning model to generate the target database configuration parameters comprise: acquiring database state hybrid characterization parameters from a database server;
inputting the database state mixed characterization parameters into a deep reinforcement learning model to generate intermediate database configuration parameters;
and sending the configuration parameters of the intermediate database to a database server, and repeatedly executing the steps for N times until the configuration parameters of the target database are obtained.
10. A database configuration parameter adjusting method is applied to a database server and comprises the following steps:
sending the database state hybrid representation parameters to a tuning module to cause the tuning module to perform the method of any one of claims 1-9;
receiving target database configuration parameters sent by the database tuning module;
and performing parameter configuration on the database server according to the database configuration parameters.
11. The method of claim 10, wherein the database-state hybrid characterization parameters include one or more of: database performance parameters, current database configuration parameters, hardware resource parameters, and hardware resource status parameters.
12. The method of claim 10, wherein before sending the database state hybrid representation parameters to the database tuning module to cause the database tuning module to perform the method of any one of claims 1-9, the method further comprises:
acquiring a first database state mixed characterization parameter;
acquiring a working load pressure signal from a pressure simulation server;
pressurizing the database server according to the working load pressure signal;
and after the pressurizing operation is finished, acquiring the state mixed characterization parameters of the secondary database.
13. The method of claim 10, wherein the database server comprises a backup database server and a business database server, and the sending of the database state hybrid characterization parameters to the database tuning module causes the database tuning module to perform the method of any one of claims 1-9; the method comprises the following steps:
sending the database state hybrid representation parameters to a database tuning module by using a backup database server to execute the method of any one of claims 1 to 9 to generate target database configuration parameters;
receiving the target database configuration parameters by using a service server;
and performing parameter configuration on the service database server according to the target database configuration parameters.
14. An electronic device, comprising:
at least one processor, and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions for execution by the at least one processor to cause the at least one processor, when executing the instructions, to implement the method of any one of claims 1 to 9 or the method of any one of claims 10 to 13.
15. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any of claims 1 to 9 or the method of any of claims 10 to 13.
CN202010618107.8A 2020-06-30 2020-06-30 Database configuration parameter adjusting method, device and storage medium Pending CN113872788A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010618107.8A CN113872788A (en) 2020-06-30 2020-06-30 Database configuration parameter adjusting method, device and storage medium
PCT/CN2021/102782 WO2022001965A1 (en) 2020-06-30 2021-06-28 Database configuration parameter adjustment method, and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010618107.8A CN113872788A (en) 2020-06-30 2020-06-30 Database configuration parameter adjusting method, device and storage medium

Publications (1)

Publication Number Publication Date
CN113872788A true CN113872788A (en) 2021-12-31

Family

ID=78981855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010618107.8A Pending CN113872788A (en) 2020-06-30 2020-06-30 Database configuration parameter adjusting method, device and storage medium

Country Status (2)

Country Link
CN (1) CN113872788A (en)
WO (1) WO2022001965A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116594981A (en) * 2023-05-18 2023-08-15 合芯科技(苏州)有限公司 Database system parameter optimization method and device and electronic equipment

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115640278B (en) * 2022-09-30 2023-08-08 北京柏睿数据技术股份有限公司 Method and system for intelligently optimizing database performance
CN116401232B (en) * 2023-03-24 2024-01-30 天云融创数据科技(北京)有限公司 Database parameter configuration optimization method and device, electronic equipment and storage medium
CN117234711B (en) * 2023-09-05 2024-05-07 合芯科技(苏州)有限公司 Dynamic allocation method, system, equipment and medium for Flink system resources

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190108444A1 (en) * 2017-10-11 2019-04-11 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for customizing kernel machines with deep neural networks
CN110019151A (en) * 2019-04-11 2019-07-16 深圳市腾讯计算机系统有限公司 Database performance method of adjustment, device, equipment, system and storage medium
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101083563B1 (en) * 2009-04-24 2011-11-14 엔에이치엔비즈니스플랫폼 주식회사 Method and System for Managing Database
US11106996B2 (en) * 2017-08-23 2021-08-31 Sap Se Machine learning based database management
US11061902B2 (en) * 2018-10-18 2021-07-13 Oracle International Corporation Automated configuration parameter tuning for database performance
CN110134665B (en) * 2019-04-17 2021-05-25 北京百度网讯科技有限公司 Database self-learning optimization method and device based on flow mirror image

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190108444A1 (en) * 2017-10-11 2019-04-11 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for customizing kernel machines with deep neural networks
CN110019151A (en) * 2019-04-11 2019-07-16 深圳市腾讯计算机系统有限公司 Database performance method of adjustment, device, equipment, system and storage medium
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SU HAIBO; LIU YIJING: "Improved training algorithm based on pre-trained model and joint parameter tuning", China Informatization (中国信息化), 20 February 2020 (2020-02-20), pages 44 - 49 *


Also Published As

Publication number Publication date
WO2022001965A1 (en) 2022-01-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination