CN111913939B - Database cluster optimization system and method based on reinforcement learning - Google Patents

Publication number: CN111913939B
Authority: CN (China)
Prior art keywords: optimization, configuration information, subsystem, database cluster, database
Legal status: Active
Application number: CN202010807625.4A
Other languages: Chinese (zh)
Other versions: CN111913939A (en)
Inventor: 莫毓昌
Current Assignee: Individual
Original Assignee: Individual
Application filed by: Individual
Priority to: CN202010807625.4A
Publications: CN111913939A (application), CN111913939B (grant)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/217: Database tuning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24552: Database cache management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The invention discloses a database cluster optimization system and method based on reinforcement learning. The optimization system comprises a current configuration information acquisition subsystem, a current performance information acquisition subsystem, an optimization strategy execution subsystem and an optimization engine subsystem. The subsystems cooperate to guide the selection of a database cluster optimization strategy according to the current configuration information and current performance information of the database cluster, and the optimization strategy execution subsystem is controlled to adjust the configuration information of the database cluster. The advantages are that reinforcement learning is used to optimize the configuration parameters of the database cluster automatically, which significantly improves the processing performance of the database cluster; and when the load changes, dynamic adaptive adjustment can be carried out quickly, which greatly reduces labor and time costs.

Description

Database cluster optimization system and method based on reinforcement learning
Technical Field
The invention relates to the field of database cluster optimization, in particular to a database cluster optimization system and method based on reinforcement learning.
Background
Today's world is an information-based world, and people's life, work and study all depend on information systems. Behind every information system, the database is where the final results are stored and processed. Database systems are therefore particularly important: if the database has a problem, the entire application system faces serious losses and consequences.
The term "big data" is very popular today, although it is not yet clear how the concept will be put into practice. What is certain is that, with the rise of the Internet of Things and mobile applications, data volumes have grown geometrically compared with the past. An obvious response to this challenge is to group multiple servers into a cluster, so that the resources of each server can be fully utilized and client load can be distributed across different servers; as application load increases, new servers only need to be added to the cluster.
Typically, a database cluster administrator optimizes the configuration parameters of the database cluster according to its historical operating conditions and its real-time state, so as to improve the processing performance of the database cluster.
There is a delay between adjusting the configuration parameters of a database cluster and observing the effect on its processing performance. If several optimization actions are taken in succession, it is difficult to determine which action took effect or how each action influenced the result. Manual optimization is therefore prone to deviation, and factors such as the huge parameter search space, the continuity of the load, and the diversity of loads and equipment make the traditional manual optimization method very inefficient.
Disclosure of Invention
The invention aims to provide a database cluster optimization system and method based on reinforcement learning, so as to solve the problems in the prior art.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a database cluster optimization system based on reinforcement learning, the optimization system comprising,
a current configuration information acquisition subsystem, configured to receive a current configuration information acquisition command issued by the optimization engine subsystem, acquire the current configuration information of the database cluster according to that command, and send the collected current configuration information of the database cluster to the optimization engine subsystem;
a current performance information acquisition subsystem, configured to receive a current performance information acquisition command issued by the optimization engine subsystem, acquire the current performance information of the database cluster according to that command, and send the collected current performance information of the database cluster to the optimization engine subsystem;
an optimization strategy execution subsystem, configured to receive an optimization strategy issued by the optimization engine subsystem and adjust the configuration information of the database cluster according to the optimization strategy, the optimization strategy comprising an optimization parameter and an optimization direction;
an optimization engine subsystem, configured to issue the current configuration information acquisition command, the current performance information acquisition command and the optimization strategy to the current configuration information acquisition subsystem, the current performance information acquisition subsystem and the optimization strategy execution subsystem, respectively; to generate a database cluster optimization strategy according to the acquired current configuration information and current performance information of the database cluster; and to control the optimization strategy execution subsystem to adjust the configuration information of the database cluster.
Preferably, the current configuration information acquisition subsystem comprises,
a first acquisition command receiving module, which listens on the network for the current configuration information acquisition command issued by the optimization engine subsystem and, on receiving it, calls the configuration information acquisition module to acquire the current configuration information of the database cluster;
a configuration information acquisition module, which calls each configuration information sub-module to collect the corresponding configuration information of the database cluster and sends it to the optimization engine subsystem;
a cache configuration information sub-module, which collects configuration information related to the database cache, including the query cache size, the cache size available to a single query and the sort cache size;
an operation configuration information sub-module, which collects configuration information related to database operations, including the read operation buffer size, the temporary table size, the maximum heap table size, the index buffer size, the bulk insert data buffer size and the join operation queue size;
a network configuration information sub-module, which collects configuration information related to the database network, including the maximum size of a single message transmission, the maximum number of database connections and the maximum number of abnormal interruptions of a database connection request during network transmission;
a system configuration information sub-module, which collects configuration information related to the database system, including the number of files allowed to be open, the number of database requests that can be held in the stack for a short time, the number of threads kept in the cache, the number of concurrent threads and the stack size of each thread.
Preferably, the current performance information acquisition subsystem comprises,
a second acquisition command receiving module, which listens on the network for the current performance information acquisition command issued by the optimization engine subsystem and, on receiving it, calls the performance information acquisition module to acquire the current performance information of the database cluster;
a performance information acquisition module, which calls each performance information sub-module to collect the corresponding performance information of the database cluster and sends it to the optimization engine subsystem;
a transaction and query information sub-module, which collects performance information related to database transactions and queries, including: obtaining the average number of select statements executed per second using a database management command; obtaining the average number of insert statements executed per second using a database management command; obtaining the average number of update statements executed using a database management command; obtaining the average number of delete statements executed using a database management command; calculating the number of transactions per second; calculating the number of queries per second; and obtaining query operation response time statistics using a database management command;
a thread performance information sub-module, which collects performance information related to database threads, including obtaining the number of currently active threads using an operating system management command and obtaining the number of currently connected threads using an operating system management command;
a network traffic performance information sub-module, which collects performance information related to the network traffic of the database, including obtaining the average number of bytes received from all clients per second using a network management command and obtaining the average number of bytes sent to all clients per second using a network management command.
Preferably, the optimization strategy execution subsystem comprises,
an optimization strategy receiving module, which listens on the network for the optimization strategy issued by the optimization engine subsystem;
an optimization strategy execution module, which receives the optimization strategy issued by the optimization engine subsystem, searches the configuration file for the configuration parameter corresponding to the optimization parameter, and adjusts the value of that configuration parameter according to the optimization direction; the adjustment specifically includes,
A1, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to on;
A2, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to off;
A3, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is increased by 1;
A4, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is reduced by 1;
A5, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is increased by 8;
A6, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is reduced by 8;
A7, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is multiplied by 2;
A8, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is divided by 2.
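The adjustment rules A1 to A8 can be illustrated with a minimal Python sketch. The function name `adjust` is hypothetical. The sketch assumes that "within 10" and "within 256" mean values up to and including 10 and 256, that switch items are represented as booleans, and that division by 2 is integer division; the text does not state any of these, and it gives no lower or upper bounds, so none are enforced here.

```python
from typing import Union

ConfigValue = Union[bool, int]


def adjust(value: ConfigValue, direction: str) -> ConfigValue:
    """Apply rules A1 to A8 to a single configuration value."""
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    up = direction == "+"
    # A1/A2: switch items are turned on or off.
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return up
    # A3/A4: integers within 10 move by 1 (assumed to mean value <= 10).
    if value <= 10:
        return value + 1 if up else value - 1
    # A5/A6: integers within 256 move by 8 (assumed to mean value <= 256).
    if value <= 256:
        return value + 8 if up else value - 8
    # A7/A8: integers greater than 256 are doubled or halved
    # (integer division is an assumption).
    return value * 2 if up else value // 2
```

For example, `adjust(64, "+")` gives 72 and `adjust(1024, "-")` gives 512 under these assumptions.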
Preferably, the optimization engine subsystem comprises an optimization strategy evaluation network comprising an input layer, two hidden layers and an output layer,
an input layer comprising 17 inputs, which are the current values of the 17 items of configuration information of the database cluster;
a first hidden layer comprising 128 neurons, whose calculation formula is,
$O_1 = \mathrm{relu}(w_1 \cdot x + b_1)$
where $x$ is the input of the optimization strategy evaluation network; $w_1$ is a weight matrix; $b_1$ is a bias; and $O_1$ is the 128-dimensional output vector of the first hidden layer;
a second hidden layer comprising 64 neurons, whose calculation formula is,
$O_2 = \mathrm{relu}(w_2 \cdot O_1 + b_2)$
where $w_2$ is a weight matrix; $b_2$ is a bias; and $O_2$ is the 64-dimensional output vector of the second hidden layer;
an output layer comprising 34 neurons, whose calculation formula is,
$y = \mathrm{relu}(w_3 \cdot O_2 + b_3)$
where $w_3$ is a weight matrix; $b_3$ is a bias; and $y$ is the output vector of the output layer, comprising 34 outputs, each of which is the evaluation value of one optimization strategy; since there are 17 items of configuration information and 2 optimization directions for each, there are 34 optimization strategies.
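As an illustration only, the forward pass of the 17-128-64-34 network described above might be sketched in plain Python as follows. The helper names (`dense`, `init_layer`, `init_network`, `evaluate`) and the initialization range of -0.1 to 0.1 are assumptions; the text only specifies the layer sizes and the relu formulas.

```python
import random


def relu(v):
    """Element-wise relu of a vector."""
    return [max(0.0, x) for x in v]


def dense(w, b, x):
    """Compute relu(w . x + b), where each row of w belongs to one neuron."""
    return relu([sum(wi * xi for wi, xi in zip(row, x)) + bi
                 for row, bi in zip(w, b)])


def init_layer(n_out, n_in, rng):
    """Random weights and biases; the range is an assumption."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-0.1, 0.1) for _ in range(n_out)]
    return w, b


def init_network(rng):
    """Layers (w_1, b_1), (w_2, b_2), (w_3, b_3) with sizes 17-128-64-34."""
    return [init_layer(128, 17, rng),
            init_layer(64, 128, rng),
            init_layer(34, 64, rng)]


def evaluate(network, x):
    """Return y, the 34 evaluation values, for a 17-dimensional input x."""
    out = x
    for w, b in network:
        out = dense(w, b, out)
    return out


network = init_network(random.Random(0))
y = evaluate(network, [1.0] * 17)  # one evaluation value per optimization strategy
```

Because the output layer also applies relu, every evaluation value produced by this sketch is non-negative.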
Another object of the invention is to provide a database cluster optimization method based on reinforcement learning, which is implemented using any one of the optimization systems described above; the optimization method comprises the following steps,
S1, initialization: initializing the optimization strategy evaluation network, that is, initializing all weight matrix parameters and bias parameters to random values;
S2, learning stage: performing the reinforcement learning process once every first preset time length until the learning process ends, obtaining a trained optimization strategy evaluation network;
S3, application stage: using the trained optimization strategy evaluation network to optimize and adjust the database cluster parameters once every first preset time length until the database cluster stops running.
Preferably, step S2 specifically includes,
S21, every first preset time length, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and thereby obtains the current configuration information state S of the database cluster;
S22, the acquired current configuration information state S is input into the optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies; the optimization strategy h_max with the largest evaluation value is selected from all the optimization strategies;
S23, the optimization engine subsystem selects one optimization strategy h from all the optimization strategies according to the database cluster optimization strategy selection mechanism;
S24, the optimization engine subsystem sends the selected optimization strategy h to the optimization strategy execution subsystem, which executes the optimization strategy h so that the configuration information state of the database cluster is updated from S to S';
S25, after a delay of a second preset time length, the optimization engine subsystem commands the current performance information acquisition subsystem to acquire the current performance information of the database cluster, and calculates the return value r of the optimization strategy h that updated the configuration information state of the database cluster from S to S';
S26, the updated configuration information state S' of the database cluster is input into the optimization strategy evaluation network, the evaluation values of the 34 optimization strategies are obtained, and the maximum evaluation value V_max is selected from them;
S27, the updated evaluation value h_val of the database cluster under configuration information state S and optimization strategy h is calculated using the classical Bellman formula of reinforcement learning, where h_val = r + V_max;
S28, the evaluation value vector is updated using h_val and V_s, that is, the value corresponding to the optimization strategy h in the evaluation value vector V_s is replaced with h_val to obtain the updated evaluation value vector V_s';
S29, the configuration information state S of the database cluster and the updated evaluation value vector V_s' are stored in the replay pool as a training sample;
S210, steps S21 to S29 are repeated 32 times, so that the replay pool holds 32 training samples;
S211, the optimization strategy evaluation network is trained on the 32 training samples using a gradient descent neural network training algorithm, so as to update its parameters;
S212, steps S21 to S211 are repeated until the error of the optimization strategy evaluation network is smaller than a preset threshold, at which point the reinforcement learning process ends.
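Steps S26 to S29 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function name `build_training_sample` and the variable names are hypothetical, the network outputs are passed in as plain lists, and the numeric values are made up. Note that the formula given in S27, h_val = r + V_max, contains no discount factor, so none is applied.

```python
BATCH_SIZE = 32  # S210: number of samples collected before each training pass


def build_training_sample(state, v_s, h, r, v_next):
    """Steps S26 to S29: return the pair (S, V_s') stored in the replay pool."""
    v_max = max(v_next)      # S26: largest evaluation value in state S'
    h_val = r + v_max        # S27: h_val = r + V_max (no discount factor)
    v_updated = list(v_s)    # copy so the original V_s is left untouched
    v_updated[h] = h_val     # S28: replace the entry of strategy h
    return list(state), v_updated


replay_pool = []
state = [0.0] * 17           # configuration information state S (made-up values)
v_s = [0.0] * 34             # evaluation value vector V_s for state S
v_next = [0.0] * 34          # evaluation values for the updated state S'
v_next[5] = 2.5
replay_pool.append(build_training_sample(state, v_s, h=3, r=1.0, v_next=v_next))

# S211 would run once the pool holds BATCH_SIZE samples.
ready_to_train = len(replay_pool) == BATCH_SIZE
```

In this example the stored target for strategy 3 is 1.0 + 2.5 = 3.5, while every other entry of V_s' equals the corresponding entry of V_s.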
Preferably, the optimization strategy selection mechanism is specifically,
with probability ε, randomly selecting an optimization strategy h_rand from the 34 optimization strategies and taking it as the optimization strategy h; or, with probability 1-ε, selecting the optimization strategy h_max with the largest evaluation value in configuration information state S and taking it as the optimization strategy h.
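This selection mechanism is commonly known as ε-greedy selection. A minimal sketch, assuming the evaluation value vector is a plain list and strategies are identified by their index in it (the function name `select_strategy` is hypothetical):

```python
import random


def select_strategy(v_s, epsilon, rng):
    """Return the index of the chosen optimization strategy h."""
    if rng.random() < epsilon:
        # Explore: h_rand, drawn uniformly from all strategies.
        return rng.randrange(len(v_s))
    # Exploit: h_max, the strategy with the largest evaluation value.
    return max(range(len(v_s)), key=v_s.__getitem__)


v_s = [0.0] * 34
v_s[7] = 1.0
h = select_strategy(v_s, epsilon=0.1, rng=random.Random(0))
```

With `epsilon=0.0` the function always returns the index of the largest evaluation value; with `epsilon=1.0` it always picks at random.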
Preferably, the return value r is calculated by the following steps,
B1, calculating the difference Dtps between the number of transactions per second tps collected by the optimization engine subsystem after executing the optimization strategy and the number of transactions per second tps' collected by the optimization engine subsystem at the previous moment;
where Dtps = tps - tps';
B2, calculating the difference Dqps between the number of queries per second qps collected by the optimization engine subsystem after executing the optimization strategy and the number of queries per second qps' collected by the optimization engine subsystem at the previous moment;
where Dqps = qps - qps';
B3, calculating the difference Dquery_response_time between the query operation response time query_response_time collected by the optimization engine subsystem after executing the optimization strategy and the query operation response time query_response_time' collected by the optimization engine subsystem at the previous moment;
where Dquery_response_time = query_response_time - query_response_time';
B4, calculating the difference Dthreads_running between the number of active threads threads_running collected by the optimization engine subsystem after executing the optimization strategy and the number of active threads threads_running' collected by the optimization engine subsystem at the previous moment;
where Dthreads_running = threads_running - threads_running';
B5, calculating the difference Dthreads_connected between the number of currently connected threads threads_connected collected by the optimization engine subsystem after executing the optimization strategy and the number of currently connected threads threads_connected' collected by the optimization engine subsystem at the previous moment;
where Dthreads_connected = threads_connected - threads_connected';
B6, calculating the difference Dbytes_received_ps between the average number of bytes received from all clients per second bytes_received_ps collected by the optimization engine subsystem after executing the optimization strategy and the average number of bytes received from all clients per second bytes_received_ps' collected by the optimization engine subsystem at the previous moment;
where Dbytes_received_ps = bytes_received_ps - bytes_received_ps';
B7, calculating the difference Dbytes_send_ps between the average number of bytes sent to all clients per second bytes_send_ps collected by the optimization engine subsystem after executing the optimization strategy and the average number of bytes sent to all clients per second bytes_send_ps' collected by the optimization engine subsystem at the previous moment;
where Dbytes_send_ps = bytes_send_ps - bytes_send_ps';
B8, calculating the return value r from the differences obtained in steps B1 to B7, where the calculation formula is,
where $\gamma_1$ and $\gamma_2$ are weights satisfying $\gamma_1 < \gamma_2$, and $\gamma_3$ is a weight.
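The differences of steps B1 to B7 can be computed as in the sketch below. The formula that combines them into the return value r in step B8 is not reproduced in the text, so the sketch stops at the differences and does not guess at it. The function name `metric_deltas` and the sample readings are hypothetical; the metric names follow the identifiers used in steps B1 to B7.

```python
METRICS = (
    "tps",
    "qps",
    "query_response_time",
    "threads_running",
    "threads_connected",
    "bytes_received_ps",
    "bytes_send_ps",
)


def metric_deltas(current, previous):
    """Steps B1 to B7: D<metric> = value after the strategy - previous value."""
    return {"D" + name: current[name] - previous[name] for name in METRICS}


# Made-up readings before and after executing an optimization strategy.
previous = {"tps": 100, "qps": 900, "query_response_time": 2.0,
            "threads_running": 8, "threads_connected": 40,
            "bytes_received_ps": 5000, "bytes_send_ps": 20000}
current = {"tps": 120, "qps": 950, "query_response_time": 1.5,
           "threads_running": 6, "threads_connected": 42,
           "bytes_received_ps": 5600, "bytes_send_ps": 21000}

deltas = metric_deltas(current, previous)
```

Here `deltas["Dtps"]` is 20 and `deltas["Dquery_response_time"]` is -0.5, that is, throughput rose and response time fell after the adjustment.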
Preferably, step S3 specifically includes,
S31, every first preset time length, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and thereby obtains the current configuration information state S of the database cluster;
S32, the acquired current configuration information state S is input into the trained optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies; the optimization strategy h_max with the largest evaluation value is selected from all the optimization strategies;
S33, the optimization engine subsystem sends the optimization strategy h_max to the optimization strategy execution subsystem, which executes h_max so that the configuration information state of the database cluster is updated from S to S';
S34, steps S31 to S33 are repeated until the database cluster stops working, at which point the parameter optimization ends.
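In the application stage the strategy is always the one with the largest evaluation value, and it must then be translated into an optimization parameter and an optimization direction for the execution subsystem. The sketch below shows one way to do this. The layout of the 34 outputs (two consecutive outputs per parameter, "+" first) and the placeholder parameter names are assumptions; the text does not specify how output indices map to parameters and directions.

```python
def greedy_strategy(v_s):
    """S32: index of the largest evaluation value (h_max)."""
    return max(range(len(v_s)), key=v_s.__getitem__)


def decode_strategy(index, parameter_names):
    """Map an output index to (optimization parameter, optimization direction).

    Assumed layout: outputs 2k and 2k+1 belong to parameter k,
    with the even index meaning '+' and the odd index meaning '-'.
    """
    name = parameter_names[index // 2]
    direction = "+" if index % 2 == 0 else "-"
    return name, direction


# Placeholder names standing in for the 17 configuration parameters.
parameter_names = ["param_%d" % i for i in range(17)]

v_s = [0.0] * 34   # made-up evaluation value vector
v_s[33] = 4.2
h_max = greedy_strategy(v_s)
strategy = decode_strategy(h_max, parameter_names)
```

With these made-up values, `strategy` is `("param_16", "-")`, which the execution subsystem would apply as a decrease of the last parameter.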
The beneficial effects of the invention are as follows: reinforcement learning is used to optimize the configuration parameters of the database cluster automatically, which significantly improves the processing performance of the database cluster; and when the load changes, dynamic adaptive adjustment can be carried out quickly, which greatly reduces labor and time costs.
Drawings
FIG. 1 is a flow chart of an optimization method in an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the invention.
Example 1
In this embodiment, there is provided a reinforcement learning-based database cluster optimization system, the optimization system including,
a current configuration information acquisition subsystem, configured to receive a current configuration information acquisition command issued by the optimization engine subsystem, acquire the current configuration information of the database cluster according to that command, and send the collected current configuration information of the database cluster to the optimization engine subsystem;
a current performance information acquisition subsystem, configured to receive a current performance information acquisition command issued by the optimization engine subsystem, acquire the current performance information of the database cluster according to that command, and send the collected current performance information of the database cluster to the optimization engine subsystem;
an optimization strategy execution subsystem, configured to receive an optimization strategy issued by the optimization engine subsystem and adjust the configuration information of the database cluster according to the optimization strategy, the optimization strategy comprising an optimization parameter and an optimization direction;
an optimization engine subsystem, configured to issue the current configuration information acquisition command, the current performance information acquisition command and the optimization strategy to the current configuration information acquisition subsystem, the current performance information acquisition subsystem and the optimization strategy execution subsystem, respectively; to generate a database cluster optimization strategy according to the acquired current configuration information and current performance information of the database cluster; and to control the optimization strategy execution subsystem to adjust the configuration information of the database cluster.
In this embodiment, the current configuration information acquisition subsystem, the current performance information acquisition subsystem and the optimization strategy execution subsystem all run on every server that makes up the database cluster.
In this embodiment, the optimization engine subsystem performs one reinforcement learning iteration at regular intervals (for example, every half hour), specifically: first, it commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster; it then calculates the evaluation value vector V_s with the optimization strategy evaluation network, selects an optimization strategy h according to the database cluster optimization strategy selection mechanism, and sends h to the optimization strategy execution subsystem; after a delay (for example, 5 minutes), it commands the current performance information acquisition subsystem to acquire the current performance information; finally, it uses the Bellman formula to obtain the updated evaluation value vector V_s' from the current performance information, which completes one iteration. Through continued iteration, the optimization engine subsystem keeps training the optimization strategy evaluation network, which provides optimal guidance for its selection of optimization strategies. See in particular the following steps S21 to S24.
S21, every first preset time length, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and thereby obtains the current configuration information state S of the database cluster;
S22, the acquired current configuration information state S is input into the optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies; the optimization strategy h_max with the largest evaluation value is selected from all the optimization strategies;
S23, the optimization engine subsystem selects one optimization strategy h from all the optimization strategies according to the database cluster optimization strategy selection mechanism;
S24, the optimization engine subsystem sends the selected optimization strategy h to the optimization strategy execution subsystem, which executes the optimization strategy h.
In this embodiment, the current configuration information acquisition subsystem includes,
a first acquisition command receiving module; used for listening on the network for the current configuration information acquisition command issued by the optimization engine subsystem, and for calling the configuration information acquisition module according to the received command so as to acquire the current configuration information of the database cluster;
a configuration information acquisition module; used for calling each configuration information sub-module to collect the corresponding configuration information of the database cluster, and for sending the collected configuration information to the optimization engine subsystem;
a cache configuration information sub-module; used for collecting configuration information related to the database cache, including the query cache size query_cache_size, the cache size available to a single query query_cache_limit and the sort cache size sort_cache_size;
an operation configuration information sub-module; used for collecting configuration information related to database operations, including the read operation buffer size read_buffer_size, the temporary table size tmp_table_size, the maximum heap table size max_heap_table_size, the index buffer size key_buffer_size, the bulk insert data buffer size bulk_insert_buffer_size and the join operation queue size join_queue_size;
a network configuration information sub-module; used for collecting configuration information related to the database network, including the maximum size of a single message in network transmission max_allowed_packet, the maximum number of database connections max_connections and the maximum number of abnormally interrupted database connection requests max_connection_error;
a system configuration information sub-module; the configuration information is used for collecting configuration information related to the database system; including the number of files open_files_limit allowed to open, the number of database requests back_log that can be stored in the stack in a short time, the number of threads stored in the cache thread_cache_size, the number of concurrent threads thread_concurrency, the stack size per thread.
In this embodiment, the current performance information acquisition subsystem includes,
a second acquisition command receiving module; used for listening on the network for the current performance information acquisition command issued by the optimization engine subsystem, and for calling the performance information acquisition module according to the received command so as to acquire the current performance information of the database cluster;
a performance information acquisition module; used for calling each performance information sub-module to collect the corresponding performance information of the database cluster, and for sending the collected performance information to the optimization engine subsystem;
a transaction and query performance information sub-module; used for collecting performance information related to database transactions and queries, including: obtaining, by means of database management commands, the average number of select statements executed per second com_select_ps, the average number of insert statements executed per second com_insert_ps, the average number of update statements executed per second com_update_ps and the average number of delete statements executed per second com_delete_ps; calculating the number of transactions per second tps and the number of queries per second qps; and obtaining the query response time by means of a database management command; wherein tps = com_insert_ps + com_update_ps + com_delete_ps; qps = com_select_ps + com_insert_ps + com_update_ps + com_delete_ps;
a thread performance information sub-module; used for collecting performance information related to database threads, including obtaining the number of threads currently in the active state threads_running and the number of currently connected threads threads_connected by means of operating system management commands;
a network traffic performance information sub-module; used for collecting performance information related to the network traffic of the database, including obtaining, by means of network management commands, the average number of bytes received per second from all clients Bytes_received_ps and the average number of bytes sent per second to all clients Bytes_send_ps.
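The tps and qps formulas given for the transaction and query sub-module can be illustrated with a minimal sketch. The field names follow the text; the `StatementRates` container and the sample values are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class StatementRates:
    """Average per-second statement counts; field names follow the text."""
    com_select_ps: float
    com_insert_ps: float
    com_update_ps: float
    com_delete_ps: float


def tps(r: StatementRates) -> float:
    # tps = com_insert_ps + com_update_ps + com_delete_ps
    return r.com_insert_ps + r.com_update_ps + r.com_delete_ps


def qps(r: StatementRates) -> float:
    # qps = com_select_ps + com_insert_ps + com_update_ps + com_delete_ps
    return r.com_select_ps + tps(r)


# Hypothetical sample values, not measurements from the source.
rates = StatementRates(com_select_ps=120.0, com_insert_ps=30.0,
                       com_update_ps=20.0, com_delete_ps=10.0)
print(tps(rates), qps(rates))  # 60.0 180.0
```

Under these definitions qps always exceeds tps by exactly the select rate, since every write statement is counted in both.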
In this embodiment, the optimization strategy execution subsystem includes,
an optimization strategy receiving module; used for listening on the network for the optimization strategy issued by the optimization engine subsystem;
an optimization strategy execution module; used for receiving the optimization strategy issued by the optimization engine subsystem, searching the configuration file according to the optimization parameter to find the corresponding configuration parameter, and adjusting the value of that configuration parameter according to the optimization direction; the adjustment specifically includes,
A1, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to on;
A2, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to off;
A3, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is increased by 1;
A4, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is reduced by 1;
A5, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is increased by 8;
A6, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is reduced by 8;
A7, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is multiplied by 2;
A8, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is divided by 2.
In this embodiment, each configuration parameter has constraint conditions of a maximum value and a minimum value, and when the optimized configuration parameter value is greater than the maximum value or less than the minimum value, the corresponding configuration parameter is set to be the maximum value or the minimum value, so as to ensure the normal operation of the system.
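The adjustment rules and the minimum/maximum constraint can be sketched as a single function. This is only an illustration under stated assumptions: the category ("within 10", "within 256", "greater than 256") is decided from the current value, "within 256" is read as 11 to 256, halving uses integer division, and the default bounds are placeholders. None of these details are fixed by the text.

```python
from typing import Union

Value = Union[bool, int]


def adjust(value: Value, direction: str,
           minimum: int = 0, maximum: int = 2**31 - 1) -> Value:
    """Apply the +/- adjustment rules, then clamp integers to [minimum, maximum].

    Assumptions (not from the source): categories are chosen from the
    current value, and the default bounds are placeholders.
    """
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    up = direction == "+"
    if isinstance(value, bool):      # switch item: on for +, off for -
        return up
    if value <= 10:                  # integer within 10: step of 1
        new = value + 1 if up else value - 1
    elif value <= 256:               # integer within 256: step of 8
        new = value + 8 if up else value - 8
    else:                            # integer greater than 256: double or halve
        new = value * 2 if up else value // 2
    # Constraint: never leave the allowed range of the parameter.
    return max(minimum, min(maximum, new))


print(adjust(64, "+"), adjust(1024, "-"), adjust(1024, "+", maximum=1500))
```

The boolean check comes first because `bool` is a subclass of `int` in Python; without it a switch item would be treated as the integer 0 or 1.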
In this embodiment, the optimization engine subsystem includes an optimization policy evaluation network that includes an input layer, two hidden layers, and an output layer,
an input layer comprising 17 inputs, one for each of the 17 current configuration information values of the database cluster;
the first hidden layer, comprising 128 neurons, has the calculation formula,
$O_1 = \mathrm{relu}(w_1 \cdot x + b_1)$
wherein x is the input of the optimization strategy evaluation network; $w_1$ is a weight matrix; $b_1$ is a bias; $O_1$ is the 128-dimensional output vector of the first hidden layer;
the second hidden layer, comprising 64 neurons, has the calculation formula,
$O_2 = \mathrm{relu}(w_2 \cdot O_1 + b_2)$
wherein $w_2$ is a weight matrix; $b_2$ is a bias; $O_2$ is the 64-dimensional output vector of the second hidden layer;
the output layer, comprising 34 neurons, has the calculation formula,
$y = \mathrm{relu}(w_3 \cdot O_2 + b_3)$
wherein $w_3$ is a weight matrix; $b_3$ is a bias; y is the output vector of the output layer, which includes 34 outputs, each corresponding to the evaluation value of one optimization strategy; since there are 17 items of configuration information and 2 optimization directions per item, there are 34 optimization strategies.
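The three layer formulas amount to a small feed-forward pass from 17 inputs to 34 evaluation values. The following pure-Python sketch illustrates it; the uniform random initialization range and the helper names (`init_network`, `evaluate`) are assumptions made for the example, and the input state is random rather than real configuration data.

```python
import random


def relu(v):
    return [max(0.0, t) for t in v]


def dense(w, b, x):
    # One layer before activation: w . x + b
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]


def init_layer(rng, n_in, n_out, scale=0.1):
    # Random weights and biases; the range is an assumption.
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)]
         for _ in range(n_out)]
    b = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return w, b


def init_network(rng, sizes=(17, 128, 64, 34)):
    return [init_layer(rng, n_in, n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def evaluate(network, x):
    # relu is applied after every layer, including the output layer.
    out = list(x)
    for w, b in network:
        out = relu(dense(w, b, out))
    return out


rng = random.Random(0)
network = init_network(rng)
state = [rng.random() for _ in range(17)]   # placeholder configuration state
v_s = evaluate(network, state)
h_max = max(range(len(v_s)), key=v_s.__getitem__)
print(len(v_s), h_max)
```

Because the output layer also passes through relu, every evaluation value is non-negative.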
Example two
In this embodiment, a database cluster optimization method based on reinforcement learning is provided, where the optimization method is implemented using the optimization system described above; the optimization method comprises the following steps,
S1, initialization stage: initializing the optimization strategy evaluation network, namely initializing all weight matrix parameters and bias parameters to random values;
S2, learning stage: performing the reinforcement learning process once every first preset duration until the learning process is finished, so as to obtain a trained optimization strategy evaluation network;
S3, application stage: performing optimization adjustment on the database cluster parameters once every first preset duration using the trained optimization strategy evaluation network, until the database cluster stops running.
In this embodiment, the step S2 specifically includes the following,
S21, at intervals of a first preset duration, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and the optimization engine subsystem obtains the current configuration information state S of the database cluster; the first preset duration can be set according to the specific situation, and may be chosen as half an hour;
S22, inputting the acquired current configuration information state S of the database cluster into the optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies through neural network computation; selecting, from all the optimization strategies, the optimization strategy h_max with the maximum evaluation value;
S23, the optimization engine subsystem selects one optimization strategy h from all the optimization strategies according to the database cluster optimization strategy selection mechanism;
S24, the optimization engine subsystem sends the selected optimization strategy h to the optimization strategy execution subsystem, and the optimization strategy execution subsystem executes the optimization strategy h, updating the configuration information state of the database cluster from S to S';
S25, after a delay of a second preset duration, the optimization engine subsystem commands the current performance information acquisition subsystem to acquire the current performance information of the database cluster, and calculates the return value r corresponding to the optimization strategy h that updated the configuration information state of the database cluster from S to S'; the second preset duration can be set according to the specific situation, and may be chosen as 5 minutes;
S26, inputting the updated configuration information state S' of the database cluster into the optimization strategy evaluation network, obtaining the evaluation values of the 34 optimization strategies through neural network computation, and selecting the maximum evaluation value V_max from them;
S27, calculating the updated evaluation value h_val of the database cluster under the configuration information state S and the optimization strategy h using the classical Bellman equation of reinforcement learning, wherein h_val = r + V_max;
S28, updating the evaluation value vector using the updated evaluation value h_val and the evaluation value vector V_s, namely replacing the value corresponding to the optimization strategy h in the evaluation value vector V_s with h_val to obtain the updated evaluation value vector V_s';
S29, storing the configuration information state S of the database cluster and the updated evaluation value vector V_s' into a replay pool as a training sample;
S210, repeating steps S21 to S29 32 times, so that the number of training samples in the replay pool is 32;
S211, training the optimization strategy evaluation network with the 32 training samples using a gradient descent neural network training algorithm, so as to update the parameters of the optimization strategy evaluation network;
S212, repeating steps S21 to S211 until the error of the optimization strategy evaluation network is smaller than a preset threshold, at which point the reinforcement learning process ends.
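The target construction in steps S26 to S29 can be sketched as follows. The function names (`bellman_target`, `store_sample`) and the three-element vectors are hypothetical, chosen for brevity; in the described system the vectors would have 34 entries, and the gradient descent training of S211 is not shown.

```python
def bellman_target(v_s, h, r, v_s_next):
    """Build the updated evaluation vector V_s' for one transition.

    v_s      : evaluation values in state S
    h        : index of the executed optimization strategy
    r        : return value observed after executing h
    v_s_next : evaluation values in the new state S'
    """
    v_max = max(v_s_next)      # S26: maximum evaluation value in S'
    h_val = r + v_max          # S27: h_val = r + V_max
    updated = list(v_s)        # S28: replace only the entry for h
    updated[h] = h_val
    return updated


def store_sample(pool, state, target, batch_size=32):
    """S29/S210: store (S, V_s') and report whether a batch is ready."""
    pool.append((list(state), list(target)))
    return len(pool) >= batch_size


# Tiny illustration with 3 strategies instead of 34.
replay_pool = []
target = bellman_target(v_s=[0.25, 0.5, 0.125], h=2, r=1.0,
                        v_s_next=[0.25, 0.5, 0.125])
ready = store_sample(replay_pool, state=[1, 2, 3], target=target)
print(target, ready)  # [0.25, 0.5, 1.5] False
```

Only the entry of the executed strategy changes, so the training error for the other 33 outputs is zero on that sample and the gradient is driven by the one observed transition.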
In this embodiment, the optimization policy selection mechanism is specifically that,
with probability ε, randomly selecting an optimization strategy h_rand from the 34 optimization strategies and taking it as the optimization strategy h; or, with probability 1-ε, selecting the optimization strategy h_max with the maximum evaluation value in the configuration information state S and taking it as the optimization strategy h. The value of ε is small and is generally set to 0.01.
In this embodiment, the setting of the return value r is the most critical point of reinforcement learning: because the model is trained on the return value r, the quality of its definition often determines whether reinforcement learning can ultimately be applied successfully. In addition, the load of the database cluster changes continuously. If the return value r were simply defined as the difference between the current database cluster performance and the performance at the previous moment, then a drastic change in load would cause a correspondingly large change in r, and the optimization engine subsystem could not distinguish whether the change in r was caused by the load change or by the database optimization, so that reinforcement learning could not converge.
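The selection mechanism described above is the standard epsilon-greedy rule. A minimal sketch, assuming strategies are identified by their index in the evaluation value vector (the function name and the `rng` parameter are illustrative):

```python
import random


def select_strategy(v_s, epsilon=0.01, rng=random):
    """Return a strategy index chosen by the epsilon-greedy rule."""
    if rng.random() < epsilon:
        # Exploration: a uniformly random strategy h_rand.
        return rng.randrange(len(v_s))
    # Exploitation: the strategy h_max with the maximum evaluation value.
    return max(range(len(v_s)), key=v_s.__getitem__)


values = [0.1, 0.7, 0.3]
print(select_strategy(values, epsilon=0.0))  # 1
```

With epsilon set to 0.01, roughly one selection in a hundred explores a random strategy, which keeps the evaluation network from locking onto early estimates.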
In the present invention, therefore, the return value r is calculated by the process of,
B1, calculating the difference Dtps between the number of transactions per second tps collected by the optimization engine subsystem after executing the optimization strategy and the number of transactions per second tps' collected by the optimization engine subsystem at the previous moment;
wherein Dtps = tps - tps';
B2, calculating the difference Dqps between the number of queries per second qps collected by the optimization engine subsystem after executing the optimization strategy and the number of queries per second qps' collected by the optimization engine subsystem at the previous moment;
wherein Dqps = qps - qps';
B3, calculating the difference Dquery_response_time between the query operation response time query_response_time collected by the optimization engine subsystem after executing the optimization strategy and the query operation response time query_response_time' collected by the optimization engine subsystem at the previous moment;
wherein Dquery_response_time = query_response_time - query_response_time';
B4, calculating the difference Dthreads_running between the number of threads in the active state threads_running collected by the optimization engine subsystem after executing the optimization strategy and the number of threads in the active state threads_running' collected by the optimization engine subsystem at the previous moment;
wherein Dthreads_running = threads_running - threads_running';
B5, calculating the difference Dthreads_connected between the number of currently connected threads threads_connected collected by the optimization engine subsystem after executing the optimization strategy and the number of currently connected threads threads_connected' collected by the optimization engine subsystem at the previous moment;
wherein Dthreads_connected = threads_connected - threads_connected';
B6, calculating the difference DBytes_received_ps between the average number of bytes received per second from all clients Bytes_received_ps collected by the optimization engine subsystem after executing the optimization strategy and the average number of bytes received per second from all clients Bytes_received_ps' collected by the optimization engine subsystem at the previous moment;
wherein DBytes_received_ps = Bytes_received_ps - Bytes_received_ps';
B7, calculating the difference DBytes_send_ps between the average number of bytes sent per second to all clients Bytes_send_ps collected by the optimization engine subsystem after executing the optimization strategy and the average number of bytes sent per second to all clients Bytes_send_ps' collected by the optimization engine subsystem at the previous moment;
wherein DBytes_send_ps = Bytes_send_ps - Bytes_send_ps';
B8, calculating the return value r according to the differences obtained in steps B1 to B7, wherein the calculation formula is,
wherein γ_1 and γ_2 are weights satisfying γ_1 < γ_2, and γ_3 is a weight.
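Steps B1 to B7 all take the same form: the value collected after executing the optimization strategy minus the value collected at the previous moment. The sketch below computes the seven differences from two snapshots. The dictionary representation and the sample numbers are assumptions; the combining formula of step B8 is not reproduced in this text, so the sketch stops at the differences.

```python
# Metric names as used in steps B1 to B7.
METRICS = (
    "tps", "qps", "query_response_time",
    "threads_running", "threads_connected",
    "Bytes_received_ps", "Bytes_send_ps",
)


def differences(current, previous):
    """Return {"D<name>": current - previous} for every metric."""
    return {"D" + name: current[name] - previous[name] for name in METRICS}


# Hypothetical snapshots, not measurements from the source.
previous = {"tps": 100, "qps": 400, "query_response_time": 20,
            "threads_running": 8, "threads_connected": 40,
            "Bytes_received_ps": 5000, "Bytes_send_ps": 9000}
current = {"tps": 110, "qps": 430, "query_response_time": 18,
           "threads_running": 10, "threads_connected": 42,
           "Bytes_received_ps": 5200, "Bytes_send_ps": 9500}

d = differences(current, previous)
print(d["Dtps"], d["Dqps"], d["Dquery_response_time"])  # 10 30 -2
```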
Since the transaction and query performance information is a fine-grained performance index that reflects the database cluster from the perspective of database operations, while the thread performance information is a performance index that reflects the database cluster as a whole from the perspective of threads, different weights γ_1 and γ_2 need to be assigned to distinguish them, with γ_1 less than γ_2, thereby increasing the proportion of the thread performance information. The specific values of γ_1 and γ_2 can be selected according to the specific situation so as to better meet actual needs, but they must satisfy γ_1 less than γ_2.
The network traffic performance information accurately reflects changes in user load, and an improvement in database cluster performance that accompanies such a change is not necessarily caused by the optimization of the database. Therefore, the proportion of the improvement in database cluster performance that is counted into the return value is reduced as the user load increases. Division is used here, meaning that the greater the network traffic performance, the greater the user load, and the smaller the proportion of the improvement in database cluster performance that is counted into the return value r. To avoid too small a value of r, a weight γ_3 may be set to keep the denominator from becoming excessive; the specific value of γ_3 can be selected according to the specific situation so as to better meet actual needs; here, 0.001 may be selected.
In this embodiment, the step S3 specifically includes the following,
S31, at intervals of the first preset duration, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and the optimization engine subsystem obtains the current configuration information state S of the database cluster;
S32, inputting the acquired current configuration information state S of the database cluster into the trained optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies through neural network computation; selecting, from all the optimization strategies, the optimization strategy h_max with the maximum evaluation value;
S33, the optimization engine subsystem sends the optimization strategy h_max to the optimization strategy execution subsystem, and the optimization strategy execution subsystem executes the optimization strategy h_max, updating the configuration information state of the database cluster from S to S';
S34, repeating steps S31 to S33 until the database cluster stops working, at which point the parameter optimization ends.
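One pass of the application stage reduces to: collect the state, evaluate it, and execute the strategy with the maximum evaluation value, with no exploration and no training. A minimal sketch with the three subsystems injected as callables; the names `collect_state`, `evaluate` and `execute` are hypothetical stand-ins, and the stubs in the example return fixed values.

```python
def application_step(collect_state, evaluate, execute):
    """Run one pass of the application stage and return the chosen index."""
    s = collect_state()                     # acquire configuration state S
    v_s = evaluate(s)                       # evaluation value vector V_s
    h_max = max(range(len(v_s)), key=v_s.__getitem__)
    execute(h_max)                          # apply h_max: state S -> S'
    return h_max


# Stub subsystems for illustration only.
executed = []
chosen = application_step(
    collect_state=lambda: [0.0] * 17,
    evaluate=lambda s: [0.1, 0.9, 0.4],
    execute=executed.append,
)
print(chosen, executed)  # 1 [1]
```

In a running system this step would be scheduled once per first preset duration until the database cluster stops.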
Example three
In this embodiment, a concrete example is used to describe how the optimization system and the optimization method of the present invention are implemented.
Specifically, the setup comprises: a mysql database cluster consisting of 5 mysql database servers; 10 database clients, which send database operation requests to the database cluster, these requests forming the database load; and one database optimization server.
Each of the 5 mysql database servers runs a current configuration information acquisition subsystem, a current performance information acquisition subsystem and an optimization strategy execution subsystem. The database optimization server runs the optimization engine subsystem.
The specific implementation process comprises the following steps:
1) The 10 database clients randomly generate database operation requests, and each request is sent to a randomly selected mysql database server.
2) The test is run 3 times with the default mysql database parameter configuration, and the average of the performance information is taken as the baseline for comparison.
3) The reinforcement learning mechanism is started, the optimization strategy evaluation network in the optimization engine subsystem is trained for a certain time (24 hours), and the trained optimization strategy evaluation network is stored for later use.
4) The trained optimization strategy evaluation network is used to generate the mysql database parameter configuration, the test is run 3 times (each test lasting about 10 h), and the average of the performance information is taken for comparison; the parameter configuration is shown in the following table.
As can be seen from the table, by starting the reinforcement learning mechanism, the optimization strategy evaluation network obtained after a period of learning can find efficient database cluster optimization strategies more accurately, thereby ensuring better performance of the database cluster.
From the transaction and query performance information, it can be seen that the database cluster is able to handle more database operations under reinforcement learning parameter configuration.
As can be seen from the thread performance information, the database cluster can fully utilize more threads to perform database operations under the reinforcement learning parameter configuration.
As can be seen from the network traffic performance information, the database cluster can respond to requests more quickly without delaying the response or failing the response, under the reinforcement learning parameter configuration, at the same level of database request quantity.
By adopting the technical scheme disclosed by the invention, the following beneficial effects are obtained:
The invention provides a database cluster optimization system and method based on reinforcement learning, in which a reinforcement learning method is used to automatically optimize the configuration parameters of the database cluster, so that the processing performance of the database cluster is remarkably improved; when the load changes, dynamic adaptive optimization adjustment can be achieved rapidly, and labor cost and time cost are greatly reduced.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to be covered by the present invention.

Claims (9)

1. A database cluster optimization method based on reinforcement learning is characterized by comprising the following steps: the optimization method comprises the following steps,
S1, initialization stage: initializing an optimization strategy evaluation network, namely initializing all weight matrix parameters and bias parameters to random values;
S2, learning stage: performing the reinforcement learning process once every first preset duration until the learning process is finished, so as to obtain a trained optimization strategy evaluation network;
S3, application stage: performing optimization adjustment on the database cluster parameters once every first preset duration using the trained optimization strategy evaluation network, until the database cluster stops running;
step S2 specifically includes the following,
S21, at intervals of a first preset duration, the optimization engine subsystem commands the current configuration information acquisition subsystem to acquire the current configuration information of the database cluster once, and the optimization engine subsystem obtains the current configuration information state S of the database cluster;
S22, inputting the acquired current configuration information state S of the database cluster into the optimization strategy evaluation network, which outputs the evaluation value vector V_s of the 34 optimization strategies through neural network computation; selecting, from all the optimization strategies, the optimization strategy h_max with the maximum evaluation value;
S23, the optimization engine subsystem selects one optimization strategy h from all the optimization strategies according to a database cluster optimization strategy selection mechanism;
S24, the optimization engine subsystem sends the selected optimization strategy h to the optimization strategy execution subsystem, and the optimization strategy execution subsystem executes the optimization strategy h, updating the configuration information state of the database cluster from S to S';
S25, after a delay of a second preset duration, the optimization engine subsystem commands the current performance information acquisition subsystem to acquire the current performance information of the database cluster, and calculates the return value r corresponding to the optimization strategy h that updated the configuration information state of the database cluster from S to S';
S26, inputting the updated configuration information state S' of the database cluster into the optimization strategy evaluation network, obtaining the evaluation values of the 34 optimization strategies through neural network computation, and selecting the maximum evaluation value V_max from them;
S27, calculating the updated evaluation value h_val of the database cluster under the configuration information state S and the optimization strategy h using the classical Bellman equation of reinforcement learning, wherein h_val = r + V_max;
S28, updating the evaluation value vector using the updated evaluation value h_val and the evaluation value vector V_s, namely replacing the value corresponding to the optimization strategy h in the evaluation value vector V_s with h_val to obtain the updated evaluation value vector V_s';
S29, storing the configuration information state S of the database cluster and the updated evaluation value vector V_s' into a replay pool as a training sample;
S210, repeating steps S21 to S29 32 times, so that the number of training samples in the replay pool is 32;
S211, training the optimization strategy evaluation network with the 32 training samples using a gradient descent neural network training algorithm, so as to update the parameters of the optimization strategy evaluation network;
S212, repeating steps S21 to S211 until the error of the optimization strategy evaluation network is smaller than a preset threshold, at which point the reinforcement learning process ends.
2. The reinforcement learning-based database cluster optimization method of claim 1, wherein: the optimization policy selection mechanism is specifically that,
with probability ε, randomly selecting an optimization strategy h_rand from the 34 optimization strategies and taking it as the optimization strategy h; or, with probability 1-ε, selecting the optimization strategy h_max with the maximum evaluation value in the configuration information state S and taking it as the optimization strategy h.
3. The reinforcement learning-based database cluster optimization method of claim 1, wherein: the calculation process of the return value r is that,
b1, calculating the difference Dtps between the transaction number tps per second collected by the optimization engine subsystem and the transaction number tps per second collected by the optimization engine subsystem at the previous moment after executing the optimization strategy;
wherein dps=tps-tps';
b2, calculating the difference Dqps between the query number qps per second collected by the optimizing engine subsystem and the query number qps per second collected by the optimizing engine subsystem at the previous moment after executing the optimizing strategy;
wherein Dqps = qps-qps';
b3, calculating the difference Dquery_response_time between the query operation response time query_response_time collected by the optimization engine subsystem after executing the optimization strategy and the query operation response time query_response_time' collected by the optimization engine subsystem at the previous moment;
wherein Dquery_response_time = query_response_time - query_response_time';
b4, calculating the difference Dthreads_running between the number of threads in the active state threads_running collected by the optimization engine subsystem after executing the optimization strategy and the number of threads in the active state threads_running' collected by the optimization engine subsystem at the previous moment;
wherein Dthreads_running = threads_running - threads_running';
b5, calculating the difference Dthreads_connected between the number of current connections threads_connected collected by the optimization engine subsystem after executing the optimization strategy and the number of current connections threads_connected' collected by the optimization engine subsystem at the previous moment;
wherein Dthreads_connected = threads_connected - threads_connected';
b6, calculating the difference Dbytes_received_ps between the average number of bytes received from all clients per second bytes_received_ps collected by the optimization engine subsystem after executing the optimization strategy and the average number of bytes received from all clients per second bytes_received_ps' collected by the optimization engine subsystem at the previous moment;
wherein Dbytes_received_ps = bytes_received_ps - bytes_received_ps';
b7, calculating the difference Dbytes_send_ps between the number of bytes sent to all clients per second bytes_send_ps collected by the optimization engine subsystem after executing the optimization strategy and the number of bytes sent to all clients per second bytes_send_ps' collected by the optimization engine subsystem at the previous moment;
wherein Dbytes_send_ps = bytes_send_ps - bytes_send_ps';
b8, calculating the return rate r from the differences obtained in steps b1 to b7, wherein the calculation formula is,
wherein $\gamma_1$ and $\gamma_2$ are weights satisfying $\gamma_1 < \gamma_2$, and $\gamma_3$ is a weight.
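The difference steps b3 to b7 each subtract the sample taken at the previous moment from the sample taken after the optimization strategy was executed. The following is a minimal Python sketch of that subtraction only, assuming each sample is a dictionary keyed by the metric names used in those steps. The function name `metric_differences`, the dictionary layout and the sample values are illustrative assumptions, and the weighting of the differences into the return rate r is deliberately left out.

```python
# Metric names as they appear in steps b3 to b7.
METRICS = (
    "query_response_time",
    "threads_running",
    "threads_connected",
    "bytes_received_ps",
    "bytes_send_ps",
)


def metric_differences(current, previous):
    """Return D<metric> = current - previous for every metric in METRICS.

    `current` is the sample collected after the strategy was executed,
    `previous` is the sample collected at the previous moment.
    """
    return {"D" + name: current[name] - previous[name] for name in METRICS}


# Made-up sample values, for illustration only.
previous = {
    "query_response_time": 120,
    "threads_running": 8,
    "threads_connected": 40,
    "bytes_received_ps": 5000,
    "bytes_send_ps": 9000,
}
current = {
    "query_response_time": 90,
    "threads_running": 10,
    "threads_connected": 42,
    "bytes_received_ps": 5600,
    "bytes_send_ps": 9500,
}

differences = metric_differences(current, previous)
```

With these sample values, a negative `Dquery_response_time` means the response time dropped after the strategy was applied.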
4. The reinforcement learning-based database cluster optimization method of claim 1, wherein step S3 specifically comprises,
S31, every first preset time length, the optimization engine subsystem commands the current configuration information acquisition subsystem to collect the current configuration information of the database cluster once, and the optimization engine subsystem thereby obtains the current configuration information state S of the database cluster;
S32, the obtained current configuration information state S of the database cluster is input into the trained optimization strategy evaluation network, which computes and outputs the evaluation value vector V_s of the 34 optimization strategies; the optimization strategy h_max with the largest evaluation value is selected from all the optimization strategies;
S33, the optimization engine subsystem sends the selected optimization strategy h_max to the optimization strategy execution subsystem, and the optimization strategy execution subsystem executes h_max so that the configuration information state of the database cluster is updated from S to S';
S34, steps S31 to S34 are repeated until the database cluster stops working, at which point the parameter optimization ends.
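Steps S31 to S34 describe a simple control loop: collect the state, score the 34 strategies, execute the best one, and wait for the next period. The sketch below illustrates that loop in Python under stated assumptions. The subsystems are passed in as plain callables (`collect_config`, `evaluate`, `execute`, `is_running`, `wait`), all of which are hypothetical names. The mapping from an output index to a parameter and a direction (even index for "+", odd index for "-") is also an assumption, since the claims do not fix an ordering.

```python
def select_strategy(values):
    """Pick h_max: the index with the largest evaluation value.

    Assumed layout: index k belongs to configuration parameter k // 2,
    with direction "+" for even k and "-" for odd k.
    """
    k = max(range(len(values)), key=values.__getitem__)
    return k, k // 2, "+" if k % 2 == 0 else "-"


def tuning_loop(collect_config, evaluate, execute, is_running, wait):
    """Run S31 to S34 until the cluster stops working."""
    while is_running():
        state = collect_config()                  # S31: current state S
        v_s = evaluate(state)                     # S32: 34 evaluation values
        _, parameter, direction = select_strategy(v_s)
        execute(parameter, direction)             # S33: S becomes S'
        wait()                                    # first preset time length


# Tiny dry run with stand-in subsystems.
executed = []
ticks = iter([True, True, False])
tuning_loop(
    collect_config=lambda: [0.0] * 17,
    evaluate=lambda state: [0.0] * 33 + [1.0],
    execute=lambda parameter, direction: executed.append((parameter, direction)),
    is_running=lambda: next(ticks),
    wait=lambda: None,
)
```

In a real deployment `wait` would sleep for the first preset time length and `evaluate` would call the trained optimization strategy evaluation network.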
5. A reinforcement learning-based database cluster optimization system for implementing the optimization method of any one of claims 1 to 4, characterized in that the optimization system comprises,
a current configuration information acquisition subsystem, which receives the current configuration information acquisition command issued by the optimization engine subsystem, collects the current configuration information of the database cluster according to that command, and sends the collected current configuration information of the database cluster to the optimization engine subsystem;
a current performance information acquisition subsystem, which receives the current performance information acquisition command issued by the optimization engine subsystem, collects the current performance information of the database cluster according to that command, and sends the collected current performance information of the database cluster to the optimization engine subsystem;
an optimization strategy execution subsystem, which receives the optimization strategy issued by the optimization engine subsystem and adjusts the configuration information of the database cluster according to the optimization strategy, the optimization strategy comprising an optimization parameter and an optimization direction;
an optimization engine subsystem, which issues the current configuration information acquisition command, the current performance information acquisition command and the optimization strategy to the current configuration information acquisition subsystem, the current performance information acquisition subsystem and the optimization strategy execution subsystem respectively, generates a database cluster optimization strategy from the collected current configuration information and current performance information of the database cluster, and controls the optimization strategy execution subsystem to adjust the configuration information of the database cluster.
6. The reinforcement learning-based database cluster optimization system of claim 5, wherein the current configuration information acquisition subsystem comprises,
a first acquisition command receiving module, which listens over the network for the current configuration information acquisition command issued by the optimization engine subsystem and, on receiving the command, calls the configuration information acquisition module to collect the current configuration information of the database cluster;
a configuration information acquisition module, which calls each configuration information sub-module according to the current configuration information acquisition command to collect the corresponding configuration information of the database cluster, and sends that configuration information to the optimization engine subsystem;
a cache configuration information sub-module, used for collecting configuration information related to the database cache, including the query cache size, the cache size available to a single query, and the sort cache size;
an operation configuration information sub-module, used for collecting configuration information related to database operations, including the read operation buffer size, the temporary table size, the maximum heap table size, the index buffer size, the bulk insert data buffer size, and the join operation queue size;
a network configuration information sub-module, used for collecting configuration information related to the database network, including the maximum size of a single message transmission, the maximum number of database connections, and the maximum number of abnormal interruptions of database connection requests during network transmission;
a system configuration information sub-module, used for collecting configuration information related to the database system, including the number of files allowed to be open, the number of database requests that can be held in the stack for a short time, the number of threads kept in the cache, the number of concurrent threads, and the stack size of each thread.
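The four sub-modules above list 3 + 6 + 3 + 5 = 17 configuration items in total. One way to picture how they could be assembled into a single state is the Python sketch below. The item names are hypothetical labels derived from the descriptions in this claim, not the setting names of any particular database product, and the fixed ordering is an assumption made so that the vector layout stays stable between samples.

```python
# Hypothetical labels for the configuration items named in the claim,
# grouped by the sub-module that collects them.
CONFIG_GROUPS = {
    "cache": [
        "query_cache_size",
        "per_query_cache_size",
        "sort_cache_size",
    ],
    "operation": [
        "read_buffer_size",
        "temp_table_size",
        "max_heap_table_size",
        "index_buffer_size",
        "bulk_insert_buffer_size",
        "join_queue_size",
    ],
    "network": [
        "max_message_size",
        "max_connections",
        "max_connect_errors",
    ],
    "system": [
        "open_files_limit",
        "request_backlog",
        "thread_cache_size",
        "concurrent_threads",
        "thread_stack_size",
    ],
}

# Fixed feature order: groups in declaration order, items in list order.
FEATURE_ORDER = [name for group in CONFIG_GROUPS.values() for name in group]


def to_state_vector(config):
    """Flatten a {label: value} mapping into a list in FEATURE_ORDER."""
    return [float(config[name]) for name in FEATURE_ORDER]


# Dummy values (0, 1, 2, ...) just to show the shape of the result.
example_config = {name: index for index, name in enumerate(FEATURE_ORDER)}
state = to_state_vector(example_config)
```

Because the ordering is fixed, the same position in `state` always refers to the same configuration item.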
7. The reinforcement learning-based database cluster optimization system of claim 5, wherein the current performance information acquisition subsystem comprises,
a second acquisition command receiving module, which listens over the network for the current performance information acquisition command issued by the optimization engine subsystem and, on receiving the command, calls the performance information acquisition module to collect the current performance information of the database cluster;
a performance information acquisition module, which calls each performance information sub-module according to the current performance information acquisition command to collect the corresponding performance information of the database cluster, and sends that performance information to the optimization engine subsystem;
a transaction and query information sub-module, used for collecting performance information related to database transactions and queries, including obtaining the average number of select statement executions per second with a database management command, obtaining the average number of insert statement executions per second with a database management command, obtaining the average number of update statement executions per second with a database management command, obtaining the average number of delete statement executions per second with a database management command, calculating the number of transactions per second, calculating the number of queries per second, and obtaining query operation response time statistics with a database management command;
a thread performance information sub-module, used for collecting performance information related to database threads, including obtaining the number of threads currently in the active state with an operating system management command, and obtaining the number of currently connected threads with an operating system management command;
a network traffic performance information sub-module, used for collecting performance information related to the network traffic of the database, including obtaining the average number of bytes received from all clients per second with a network management command, and obtaining the average number of bytes sent to all clients per second with a network management command.
8. The reinforcement learning-based database cluster optimization system of claim 5, wherein the optimization strategy execution subsystem comprises,
an optimization strategy receiving module, which listens over the network for the optimization strategy issued by the optimization engine subsystem and receives it;
an optimization strategy execution module, which takes the optimization strategy issued by the optimization engine, searches the configuration file for the configuration parameter corresponding to the optimization parameter, and adjusts the value of that configuration parameter according to the optimization direction; the adjustment specifically comprises,
a1, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to on;
a2, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is a switch item, the parameter value is set to off;
a3, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is increased by 1;
a4, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 10, the parameter value is decreased by 1;
a5, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is increased by 8;
a6, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer within 256, the parameter value is decreased by 8;
a7, when the optimization direction is + and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is multiplied by 2;
a8, when the optimization direction is - and the configuration parameter corresponding to the optimization parameter is an integer greater than 256, the parameter value is divided by 2.
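Rules a1 to a8 can be read as a single function of the current value and the direction. The Python sketch below is one possible reading, not the patented implementation. It assumes that "within 10" and "within 256" refer to the current value of the parameter (at most 10, and above 10 up to 256), that switch items are represented as booleans, and that division by 2 rounds down. The function name `adjust` is hypothetical, and no clamping to valid ranges is attempted.

```python
def adjust(value, direction):
    """Apply rules a1 to a8 to one configuration value.

    value: bool for a switch item, int otherwise.
    direction: "+" or "-".
    """
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    up = direction == "+"

    # bool is a subclass of int in Python, so test for it first.
    if isinstance(value, bool):          # a1, a2: switch item
        return up
    if value <= 10:                      # a3, a4: step of 1
        return value + 1 if up else value - 1
    if value <= 256:                     # a5, a6: step of 8
        return value + 8 if up else value - 8
    # a7, a8: double or halve (floor division is an assumption)
    return value * 2 if up else value // 2


# Example calls covering each size class.
switch_on = adjust(False, "+")
small_up = adjust(5, "+")
medium_down = adjust(64, "-")
large_up = adjust(1024, "+")
large_down = adjust(1024, "-")
```

Scaling the step with the magnitude of the value lets a single "+" or "-" action make a comparable relative change to a small counter and to a large buffer size.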
9. The reinforcement learning-based database cluster optimization system of claim 5, wherein the optimization engine subsystem comprises an optimization strategy evaluation network comprising an input layer, two hidden layers and an output layer,
the input layer comprises 17 inputs, one for each of the 17 current configuration information values of the database cluster;
the first hidden layer comprises 128 neurons, and its calculation formula is,
$O_1 = \mathrm{relu}(w_1 \cdot x + b_1)$
wherein $x$ is the input of the optimization strategy evaluation network; $w_1$ is a weight matrix; $b_1$ is a bias; $O_1$ is the 128-dimensional output vector of the first hidden layer;
the second hidden layer comprises 64 neurons, and its calculation formula is,
$O_2 = \mathrm{relu}(w_2 \cdot O_1 + b_2)$
wherein $w_2$ is a weight matrix; $b_2$ is a bias; $O_2$ is the 64-dimensional output vector of the second hidden layer;
the output layer comprises 34 neurons, and its calculation formula is,
$y = \mathrm{relu}(w_3 \cdot O_2 + b_3)$
wherein $w_3$ is a weight matrix; $b_3$ is a bias; $y$ is the output vector of the output layer, which comprises 34 outputs, each corresponding to the evaluation value of one optimization strategy; since there are 17 items of configuration information and 2 optimization directions per item, there are 34 optimization strategies.
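The three formulas above describe a plain feed-forward pass through layers of width 17, 128, 64 and 34, each followed by relu. The Python sketch below shows that forward pass using only the standard library. The weights are random placeholders rather than trained values, and the helper names (`relu`, `dense`, `init_layer`, `evaluate`) as well as the initialisation range are illustrative assumptions.

```python
import random


def relu(vector):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in vector]


def dense(weights, bias, x):
    """Compute weights . x + bias, with weights stored as a list of rows."""
    return [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights of shape (n_out, n_in) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def evaluate(layers, x):
    """Forward pass: O1 = relu(w1.x + b1), O2 = relu(w2.O1 + b2), y = relu(w3.O2 + b3)."""
    out = x
    for weights, bias in layers:
        out = relu(dense(weights, bias, out))
    return out


rng = random.Random(0)
layers = [
    init_layer(17, 128, rng),   # first hidden layer
    init_layer(128, 64, rng),   # second hidden layer
    init_layer(64, 34, rng),    # output layer
]

x = [1.0] * 17                  # stand-in for the 17 configuration values
y = evaluate(layers, x)         # 34 evaluation values, one per strategy
```

Because the output layer also applies relu, every evaluation value in `y` is non-negative; the strategy with the largest entry would be the one selected.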
CN202010807625.4A 2020-08-12 2020-08-12 Database cluster optimization system and method based on reinforcement learning Active CN111913939B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010807625.4A CN111913939B (en) 2020-08-12 2020-08-12 Database cluster optimization system and method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010807625.4A CN111913939B (en) 2020-08-12 2020-08-12 Database cluster optimization system and method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN111913939A CN111913939A (en) 2020-11-10
CN111913939B true CN111913939B (en) 2023-10-03

Family

ID=73284378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010807625.4A Active CN111913939B (en) 2020-08-12 2020-08-12 Database cluster optimization system and method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN111913939B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381158A (en) * 2020-11-18 2021-02-19 山东高速信息集团有限公司 Artificial intelligence-based data efficient training method and system
CN113127446B (en) * 2021-04-01 2023-04-07 山东英信计算机技术有限公司 Cluster tuning method and device based on Ottertune service
CN114398400B (en) * 2022-03-24 2022-06-03 环球数科集团有限公司 Serverless resource pool system based on active learning
CN114760117A (en) * 2022-03-30 2022-07-15 深信服科技股份有限公司 Data acquisition method and device and electronic equipment
CN115528667B (en) * 2022-11-28 2023-04-07 西华大学 Direct-current micro-grid cluster control system and multi-stage cooperative control method thereof

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109120457A (en) * 2018-09-13 2019-01-01 余利 The method for processing business of the intelligent cloud of framework is defined based on distributed software
CN110019151A (en) * 2019-04-11 2019-07-16 深圳市腾讯计算机系统有限公司 Database performance method of adjustment, device, equipment, system and storage medium
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines
CN111353582A (en) * 2020-02-19 2020-06-30 四川大学 Particle swarm algorithm-based distributed deep learning parameter updating method

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US20180351816A1 (en) * 2017-06-02 2018-12-06 Yan Li Methods and apparatus for parameter tuning using a cloud service
US11157488B2 (en) * 2017-12-13 2021-10-26 Google Llc Reinforcement learning techniques to improve searching and/or to conserve computational and network resources
US11593381B2 (en) * 2018-01-25 2023-02-28 Amadeus S.A.S. Re-computing pre-computed query results

Non-Patent Citations (5)

Title
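Guoliang Li et al. QTune: a query-aware database tuning system with deep reinforcement learning. Proceedings of the VLDB Endowment. 2019, Vol. 12 (No. 12), pp. 2118–2130. *
Ji Zhang et al. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning. Proceedings of the 2019 International Conference on Management of Data. 2019, pp. 415–432. *
徐江峰; 谭玉龙. Research on HBase configuration parameter optimization based on machine learning. Computer Science (计算机科学). 2020, (S1), full text. *
李国良; 周煊赫. XuanYuan: an AI-native database system. Journal of Software (软件学报). 2020, (03), full text. *
茗珂 et al. Learned database systems: challenges and opportunities. Journal of Software (软件学报). 2020, Vol. 31 (No. 03), pp. 806-830. *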

Also Published As

Publication number Publication date
CN111913939A (en) 2020-11-10

Similar Documents

Publication Publication Date Title
CN111913939B (en) Database cluster optimization system and method based on reinforcement learning
US11301781B2 (en) Systems and methods implementing an intelligent optimization platform
US20230385129A1 (en) Systems and methods for implementing an intelligent application program interface for an intelligent optimization platform
US8745036B2 (en) System, method, and computer-readable medium for enhancing query execution by an optimizer in a database system
US20070297327A1 (en) Method for applying stochastic control optimization for messaging systems
CN111638948B (en) Multi-channel high-availability big data real-time decision making system and decision making method
CN114415965B (en) Data migration method, device, equipment and storage medium
US20120143834A1 (en) Data summary system, method for summarizing data, and recording medium
US20230368028A1 (en) Automated machine learning pre-trained model selector
CN112084206A (en) Database transaction request processing method, related device and storage medium
CN111526208A (en) High-concurrency cloud platform file transmission optimization method based on micro-service
CN113778683A (en) Handle identification system analysis load balancing method based on neural network
JP2005516302A (en) An object-oriented framework for general purpose adaptive control
Mostafa et al. An intelligent dynamic replica selection model within grid systems
CN113688115A (en) File big data distributed storage system based on Hadoop
CN117472959A (en) Gskip list-based block chain efficient query system and dynamic construction method
US20230297436A1 (en) Key-based aggregation service
CN113791935B (en) Data backup method, network node and system
US11445012B2 (en) Proactive load balancer for data storage system
CN109508433B (en) Load fluctuation coping method and system based on performance adjustment of matching algorithm
CN111382196B (en) Distributed accounting processing method and system
US20160004747A1 (en) Join query execution method and device, and storage medium
CN116881230B (en) Automatic relational database optimization method based on cloud platform
CN117713213B (en) Photovoltaic cluster control method and device based on improved artificial fish school and storage medium
CN112100557B (en) Combined matching system and method based on content publishing and subscribing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant