CN117609288B

CN117609288B - Data query policy optimization method, device, terminal equipment and storage medium

Info

Publication number: CN117609288B
Application number: CN202410065129.4A
Authority: CN
Inventors: 王龙; 王祯; 杨钊
Original assignee: Matrix Origin Shenzhen Information Technology Co ltd
Current assignee: Matrix Origin Shenzhen Information Technology Co ltd
Priority date: 2024-01-17
Filing date: 2024-01-17
Publication date: 2024-04-30
Anticipated expiration: 2044-01-17
Also published as: CN117609288A

Abstract

The invention discloses a data query strategy optimization method, a device, terminal equipment and a storage medium. The method comprises the following steps: receiving a query SQL statement; and sending the SQL statement to a pre-constructed data management system, which optimizes an initial query strategy of the SQL statement to obtain a final query strategy, wherein the data management system is built from an analyzer, an agent, an optimizer and an executor. By optimizing the query SQL statement through the data management system to obtain the final query strategy, the invention addresses the problem that data query cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions, and improves the efficiency of data query strategy optimization.

Description

Data query policy optimization method, device, terminal equipment and storage medium

Technical Field

The present invention relates to the field of data query technologies, and in particular, to a data query policy optimization method, a data query policy optimization device, a terminal device, and a storage medium.

Background

With the advent of the big data era, database management systems have become increasingly important, and over the past decades query optimization has remained one of their key research topics.

One of the greatest challenges in query optimization is optimizing the join (connection) order. Many research institutions and database vendors have studied it extensively, yet product implementations still rely on conventional rule-based algorithms, such as the greedy algorithm in the optimizer of the MySQL database.

This approach tends to produce an excessively large search space and excessively high time complexity, which degrades database query performance. Some other databases use algorithms based on dynamic programming; although the optimizers of these database management systems adopt various pruning strategies to reduce the search space of join orders, they still cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions.

The present invention is therefore directed to using the fast decision-making ability of deep reinforcement learning to quickly generate optimal join orders over multiple relations for modern database management systems.

The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.

Disclosure of Invention

The main purpose of the invention is to provide a data query strategy optimization method, a data query strategy optimization device, terminal equipment and a storage medium, so as to solve the technical problem that data query cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions.

In order to achieve the above object, the present invention provides a data query policy optimization method, which includes the following steps:

Receiving a query SQL statement;

Sending the SQL statement to a pre-constructed data management system, and optimizing an initial query strategy of the SQL statement through the data management system to obtain a final query strategy, wherein the data management system is built from an analyzer, an agent, an optimizer and an executor;

The step of sending the SQL statement to a pre-constructed data management system, and optimizing an initial query strategy of the SQL statement through the data management system to obtain a final query strategy comprises the following steps:

parsing the SQL statement through the analyzer to obtain an abstract syntax tree (AST);

vectorizing the AST to obtain an initial state of the AST;

initializing the initial query strategy by the agent according to a preset recurrent neural network (LSTM) to obtain an initial strategy function and an initial cost function;

generating an execution plan based on the initial strategy function and the initial state, and obtaining an execution result according to the execution plan;

optimizing the initial strategy function and the initial cost function through the agent according to the execution result to obtain a final strategy function and a final cost function;

and generating a final query strategy based on the final strategy function and the final cost function.

Optionally, the step of generating an execution plan based on the initial policy function and the initial state, and obtaining an execution result according to the execution plan includes:

inputting the initial state into the initial strategy function for repeated calculation to obtain a query action and a final state;

performing plan generation through the optimizer according to the query action and the final state to obtain an execution plan;

and performing plan execution through the executor according to the execution plan to obtain an execution result.
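The repeated calculation in the first sub-step above can be pictured as a rollout loop: the strategy function is queried once per step, each query action picks the next relation to join, and the loop ends in a final state in which every relation has been placed. The following is a minimal Python sketch under stated assumptions: the state is simply the list of relations chosen so far, and the strategy function is any callable that returns one weight per remaining candidate. The names `rollout` and `uniform_policy` are hypothetical, and the uniform stand-in replaces the LSTM-based strategy function described in the text.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: (joined so far, remaining candidates) -> one weight per candidate.
Policy = Callable[[Sequence[str], Sequence[str]], List[float]]


def uniform_policy(joined: Sequence[str], remaining: Sequence[str]) -> List[float]:
    """Stand-in for the LSTM strategy function: every remaining relation is equally likely."""
    return [1.0] * len(remaining)


def rollout(relations: Sequence[str], policy: Policy, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Apply the policy repeatedly until every relation has been chosen.

    Returns (actions, final_state): the sequence of chosen relations and the
    state after the last step. In this simplified sketch the two coincide.
    """
    rng = random.Random(seed)
    joined: List[str] = []
    remaining = list(relations)
    actions: List[str] = []
    while remaining:
        weights = policy(joined, remaining)
        choice = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(choice)
        joined.append(choice)
        actions.append(choice)
    return actions, joined


actions, final_state = rollout(["orders", "customers", "items"], uniform_policy)
```

Whatever order is sampled, `actions` is a permutation of the input relations, which is what the optimizer would then turn into an execution plan.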

Optionally, the step of generating, by the optimizer, a plan according to the query action and the final state, and obtaining an execution plan includes:

Analyzing the query action through the optimizer to obtain an optimization decision;

and outputting a query tree through the optimizer based on the optimization decision, and creating a plan node for the query tree based on the final state to acquire an execution plan.
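One way to read the two sub-steps above is that the optimizer folds the chosen join order into a query tree and wraps each tree node in a plan node. The following is a minimal Python sketch, assuming a left-deep tree and a hypothetical `PlanNode` type; the text does not specify the tree shape or the node layout, so both are illustrative choices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical plan node: a table scan (leaf) or a join of two sub-plans."""
    op: str                              # "scan" or "join"
    table: Optional[str] = None          # set for scans only
    left: Optional["PlanNode"] = None    # set for joins only
    right: Optional["PlanNode"] = None   # set for joins only


def build_left_deep_plan(join_order: Sequence[str]) -> PlanNode:
    """Fold a join order into a left-deep tree: ((t0 JOIN t1) JOIN t2) ..."""
    if not join_order:
        raise ValueError("join order must contain at least one relation")
    plan = PlanNode(op="scan", table=join_order[0])
    for table in join_order[1:]:
        plan = PlanNode(op="join", left=plan, right=PlanNode(op="scan", table=table))
    return plan


def render(node: PlanNode) -> str:
    """Render the plan as a parenthesised string for inspection."""
    if node.op == "scan":
        return str(node.table)
    return f"({render(node.left)} JOIN {render(node.right)})"


plan = build_left_deep_plan(["a", "b", "c"])
text = render(plan)  # "((a JOIN b) JOIN c)"
```

A left-deep shape keeps the sketch short; a bushy tree would need the query action to name two sub-plans per step rather than one relation.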

Optionally, the step of optimizing the initial policy function and the initial cost function by the agent according to the execution result, and obtaining a final policy function and a final cost function includes:

calculating an operation reward value from the execution result through a cost model of the optimizer;

constructing a loss function through the policy gradient algorithm PPO according to the recurrent neural network (LSTM) and the operation reward value;

and optimizing the initial strategy function and the initial cost function through the agent according to the loss function to obtain a final strategy function and a final cost function.

Optionally, the step of constructing the loss function through the policy gradient algorithm PPO according to the recurrent neural network (LSTM) and the operation reward value includes:

performing entropy calculation on the query action to obtain a query entropy value of the initial strategy function;

performing function construction based on the initial cost function to obtain a cost loss function;

limiting the update amplitude of the LSTM through the operation reward value and a preset advantage function to obtain a proximal ratio clipping loss value;

and performing function construction through the policy gradient algorithm PPO based on the query entropy value, the cost loss function and the proximal ratio clipping loss value to obtain the loss function.
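The three ingredients listed above (an entropy term, a loss on the cost function, and a clipped ratio term) match the shape of the combined loss commonly used with PPO. The following is a minimal Python sketch of that common form, not a statement of the patent's exact formula: the coefficient names `value_coef` and `entropy_coef`, their default values, the clipping threshold `eps`, and the use of a squared error for the cost loss are all assumptions.

```python
import math
from typing import Sequence


def clipped_surrogate(ratios: Sequence[float], advantages: Sequence[float], eps: float = 0.2) -> float:
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the batch."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        total += min(r * a, clipped * a)
    return total / len(ratios)


def value_loss(values: Sequence[float], returns: Sequence[float]) -> float:
    """Mean squared error between predicted values and observed returns."""
    return sum((v - g) ** 2 for v, g in zip(values, returns)) / len(values)


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of one action distribution (natural logarithm)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_loss(ratios, advantages, values, returns, probs,
             eps: float = 0.2, value_coef: float = 0.5, entropy_coef: float = 0.01) -> float:
    """Loss to minimise: -clipped surrogate + value_coef * value loss - entropy_coef * entropy."""
    return (-clipped_surrogate(ratios, advantages, eps)
            + value_coef * value_loss(values, returns)
            - entropy_coef * entropy(probs))


# A ratio of 1.5 with a positive advantage is clipped to 1.2 when eps = 0.2.
surrogate = clipped_surrogate([1.5], [2.0], eps=0.2)
loss = ppo_loss([1.5], [2.0], values=[1.0], returns=[2.0], probs=[0.5, 0.5])
```

The entropy term enters with a negative sign so that minimising the loss rewards more exploratory action distributions, while the clipped term keeps each update close to the previous strategy function.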

Optionally, the step of calculating the operation reward value from the execution result through the cost model of the optimizer includes:

performing data calculation on the execution result based on a deep learning algorithm of the cost model to obtain central processing unit (CPU) occupation data and memory occupation data;

obtaining a data transmission cost and a running cost of the CPU according to the CPU occupation data and the memory occupation data;

and performing reward calculation on the data transmission cost and the running cost according to a preset cost weight value to obtain the operation reward value.
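The last sub-step above combines two cost terms using preset weights. The following is a minimal Python sketch, assuming a linear weighted sum and a negative sign so that a cheaper execution earns a larger reward; the `CostWeights` type, the default weights of 0.5, and the sign convention are assumptions, since the text states only that a preset cost weight value is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostWeights:
    """Hypothetical preset weights for the two cost terms."""
    transfer: float = 0.5
    running: float = 0.5


def operation_reward(transfer_cost: float, running_cost: float,
                     weights: CostWeights = CostWeights()) -> float:
    """Negative weighted cost, so that a cheaper execution earns a larger reward."""
    return -(weights.transfer * transfer_cost + weights.running * running_cost)


cheap = operation_reward(transfer_cost=2.0, running_cost=4.0)    # -3.0
costly = operation_reward(transfer_cost=8.0, running_cost=16.0)  # -12.0
```

With this convention the agent maximises reward by preferring plans whose weighted cost is lowest.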

The embodiment of the invention also provides a data query strategy optimization device, which comprises:

The receiving module is used for receiving the query SQL statement;

the optimizing module is used for sending the SQL sentence to a pre-constructed data management system, and optimizing an initial query strategy of the SQL sentence through the data management system to obtain a final query strategy, wherein the data management system is obtained based on an analyzer, an agent, an optimizer and an executor;

The optimization module is further used for parsing the SQL statement through the analyzer to obtain an abstract syntax tree (AST);

vectorizing the AST tree to obtain an initial state of the AST tree;

The embodiment of the invention also provides a terminal device which comprises a memory, a processor and a data query strategy optimization program stored in the memory and capable of running on the processor, wherein the data query strategy optimization program realizes the steps of the data query strategy optimization method when being executed by the processor.

The embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium is stored with a data query strategy optimization program, and the data query strategy optimization program realizes the steps of the data query strategy optimization method when being executed by a processor.

The embodiment of the invention provides a data query strategy optimization method, a device, terminal equipment and a storage medium, which operate by receiving a query SQL statement, and sending the SQL statement to a pre-constructed data management system, which optimizes an initial query strategy of the SQL statement to obtain a final query strategy, wherein the data management system is built from an analyzer, an agent, an optimizer and an executor. Optimizing the query SQL statement through the data management system to obtain the final query strategy addresses the problem that data query cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions, and improves the efficiency of data query strategy optimization.

Drawings

FIG. 1 is a schematic diagram of functional modules of a terminal device to which a data query policy optimization device of the present invention belongs;

FIG. 2 is a flow chart of an exemplary embodiment of a data query policy optimization method of the present invention;

FIG. 3 is a schematic diagram of an AST tree state involved in the data query strategy optimization method of the present invention;

FIG. 4 is an overall schematic diagram of a data query strategy optimization method of the present invention;

FIG. 5 is a flow chart of the data query strategy optimization method of the present invention involving obtaining a final strategy function and a final cost function;

FIG. 6 is a schematic flow chart of the data query strategy optimization method of the present invention involving a recurrent neural network LSTM;

FIG. 7 is a schematic diagram of the data query policy optimization method of the present invention involving CPU occupancy data;

FIG. 8 is a schematic diagram of the data query policy optimization method of the present invention involving memory footprint data;

FIG. 9 is a schematic diagram of a data query strategy optimization method of the present invention involving pre-training of a cost model.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

The main solutions of the embodiments of the present invention are as follows.

The SQL statement is parsed by the analyzer to obtain an abstract syntax tree (AST); the AST is vectorized to obtain an initial state of the AST; the agent initializes the initial query strategy according to a preset recurrent neural network (LSTM) to obtain an initial strategy function and an initial cost function; an execution plan is generated based on the initial strategy function and the initial state, and an execution result is obtained according to the execution plan; the agent optimizes the initial strategy function and the initial cost function according to the execution result to obtain a final strategy function and a final cost function; and a final query strategy is generated based on the final strategy function and the final cost function.

To generate and run the execution plan, the initial state is input into the initial strategy function for repeated calculation to obtain a query action and a final state; the optimizer performs plan generation according to the query action and the final state to obtain an execution plan; and the executor performs plan execution according to the execution plan to obtain an execution result. During plan generation, the optimizer analyzes the query action to obtain an optimization decision, outputs a query tree based on the optimization decision, and creates plan nodes for the query tree based on the final state to obtain the execution plan.

To optimize the two functions, an operation reward value is calculated from the execution result through a cost model of the optimizer; a loss function is constructed through the policy gradient algorithm PPO according to the recurrent neural network (LSTM) and the operation reward value; and the agent optimizes the initial strategy function and the initial cost function according to the loss function to obtain the final strategy function and the final cost function. To construct the loss function, entropy calculation is performed on the query action to obtain a query entropy value of the initial strategy function; a cost loss function is constructed based on the initial cost function; the update amplitude of the LSTM is limited through the operation reward value and a preset advantage function to obtain a proximal ratio clipping loss value; and the loss function is constructed through PPO based on the query entropy value, the cost loss function and the proximal ratio clipping loss value. To calculate the operation reward value, data calculation is performed on the execution result based on a deep learning algorithm of the cost model to obtain central processing unit (CPU) occupation data and memory occupation data; a data transmission cost and a running cost of the CPU are obtained from the CPU occupation data and the memory occupation data; and reward calculation is performed on the data transmission cost and the running cost according to a preset cost weight value to obtain the operation reward value.

In this way, the problem that data query cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions is addressed, the data query strategy is optimized, and the efficiency of data query strategy optimization is improved.

The scheme of the invention targets the following problem: querying data with conventional rule-based algorithms makes the search space too large and the time complexity too high, which degrades database query performance; and although other databases use algorithms based on dynamic programming and their optimizers adopt various pruning strategies to reduce the search space of join orders, they still cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions, so query efficiency remains low. With the method of the invention, the efficiency of data query strategy optimization is markedly improved.

Technical terms related to the embodiment of the invention:

SQL: SQL (Structured Query Language) is a language specifically for managing and manipulating relational databases, which is a standardized query language for retrieving, inserting, updating, and deleting data from databases, and defining and managing database structures.

AST tree: AST stands for Abstract Syntax Tree. It is an intermediate representation used by programs such as compilers and interpreters: a tree structure, generated after the source code has undergone lexical analysis and syntax analysis, that represents the syntactic structure and semantic information of a program. Each node in the AST represents a syntax element, such as an identifier, an operator or a function call, and the links between nodes represent relationships between them, such as scope and dependency. By traversing the AST, a program can be statically analyzed, optimized, transformed and compiled into executable code. ASTs are also used in applications such as syntax highlighting, code coloring and refactoring tools. In a programming language implementation, a compiler converts source code into an AST, optimizes and transforms it, and finally generates object code, whereas an interpreter takes the AST as input and executes the program. Because the AST carries the syntactic structure and semantic information of the program, its data structures and algorithms need to be designed and implemented carefully when writing a compiler or interpreter.

An intelligent agent: an Agent (Agent) refers to a computer program or system having the ability to perceive, think and act, which can make inferences and decisions by perceiving information in the environment and taking appropriate action to achieve a predetermined goal.

PPO: PPO stands for Proximal Policy Optimization, a policy-gradient reinforcement learning algorithm proposed by OpenAI and aimed at reinforcement learning problems such as those in continuous action spaces. Its core idea is to improve the performance of an agent by iteratively optimizing the policy. Compared with conventional policy gradient methods, PPO introduces a constraint mechanism, the "proximal" policy update, so that policy parameters are updated more stably and efficiently.

LSTM: LSTM (Long Short-Term Memory) is a recurrent neural network (RNN) architecture commonly used to process sequence data; it handles long-term dependencies better than conventional RNNs. The core idea of LSTM is to use special memory cells to store and update information so that long-term dependencies in a sequence can be retained and captured. It controls the flow of information through gating mechanisms and can selectively forget or update the contents of memory.
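For reference, the gating mechanism is usually written as the following update equations. This is the textbook formulation rather than anything specific to the invention; the symbols are the conventional ones, with x_t the input, h_t the hidden state, c_t the memory cell, σ the logistic sigmoid and ⊙ element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory update)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The forget gate f_t scales what is kept from the previous memory cell, and the input gate i_t scales what is written, which is the selective forgetting and updating described above.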

Proximal ratio clipping: proximal ratio clipping is a technique used in the Proximal Policy Optimization (PPO) algorithm to limit the magnitude of policy updates. To optimize the policy, PPO computes the ratio between the current policy and the old policy and uses it as a key factor in the objective function. If a policy update is too large, however, the policy may drift far from the original distribution, which harms the stability and convergence of the algorithm. To solve this problem, PPO controls the magnitude of the update by limiting how much the new policy may change relative to the old one: a clipping threshold is set, and the ratio between the new and old policies is clipped to a fixed range.
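In the notation commonly used for PPO (the symbols below are the conventional ones and do not appear in this document), the ratio and the clipped objective are written as follows, where ε is the clipping threshold and Â_t is the advantage estimate at step t:

```latex
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)},
\qquad
L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[
  \min\left( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right)
\right]
```

For example, with ε = 0.2, a ratio of 1.5 paired with a positive advantage contributes as if the ratio were 1.2, so the update gains nothing from moving the policy further in that direction.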

The embodiment of the invention considers that, in the related art, data queries rely on conventional rule-based algorithms, which makes the search space too large and the time complexity too high, degrades database query performance and strongly affects query results; such an approach is therefore inefficient.

Therefore, the embodiment of the invention designs a data query strategy optimization method and verifies its effectiveness when optimizing data query strategies. In practice, conventional rule-based algorithms make the search space too large and the time complexity too high, which degrades database query performance, and this remains the case even though the optimizers of these database management systems adopt various pruning strategies to reduce the search space of join orders. The method addresses the resulting low query efficiency and markedly improves the efficiency of data query strategy optimization.

Specifically, referring to fig. 1, fig. 1 is a schematic diagram of functional blocks of a terminal device to which the data query policy optimization device of the present invention belongs. The data query policy optimization device may be independent of the terminal device and may be a device capable of performing data query policy optimization, and may be carried on the terminal device in a form of hardware or software. The terminal equipment can be intelligent mobile equipment with a data processing function such as a mobile phone and a tablet personal computer, and can also be fixed terminal equipment or a server with a data processing function.

In this embodiment, the terminal device to which the data query policy optimization apparatus belongs at least includes an output module 110, a processor 120, a memory 130, and a communication module 140.

The memory 130 stores an operating system and a data query policy optimization program, and the data query policy optimization device may receive a query SQL statement; the SQL sentence is sent to a pre-constructed data management system, and the initial query strategy of the SQL sentence is optimized through the data management system to obtain a final query strategy, wherein the data management system is obtained based on an analyzer, an agent, an optimizer and an executor; the step of sending the SQL sentence to a pre-constructed data management system, and optimizing an initial query strategy of the SQL sentence by the data management system to obtain a final query strategy comprises the following steps: analyzing the SQL sentence through the analyzer to obtain an abstract syntax AST tree; vectorizing the AST tree to obtain an initial state of the AST tree; initializing the initial query strategy by the intelligent agent according to a preset cyclic neural network LSTM to obtain an initial strategy function and an initial cost function; generating an execution plan based on the initial strategy function and the initial state, and obtaining an execution result according to the execution plan; optimizing the initial strategy function and the initial cost function through the intelligent agent according to the execution result to obtain a final strategy function and a final cost function; and generating a final query strategy based on the final strategy function and the final cost function. Performing data query policy optimization through the data query policy optimization program, and storing information such as an optimization result and the like in the memory 130; the output module 110 may be a display screen or the like. The communication module 140 may include a WIFI module, a mobile communication module, a bluetooth module, and the like, and communicates with an external device or a server through the communication module 140.

Wherein the data query policy optimization program in the memory 130 when executed by the processor performs the steps of:

Receiving a query SQL statement;

And sending the SQL statement to a pre-constructed data management system, and optimizing an initial query strategy of the SQL statement through the data management system to obtain a final query strategy, wherein the data management system is obtained based on an analyzer, an agent, an optimizer and an executor.

vectorizing the AST tree to obtain an initial state of the AST tree;

Further, the data query policy optimization program in the memory 130, when executed by the processor, further implements the steps of:

The embodiment adopts the scheme, in particular by receiving the query SQL statement; the SQL sentence is sent to a pre-constructed data management system, and the initial query strategy of the SQL sentence is optimized through the data management system to obtain a final query strategy, wherein the data management system is obtained based on an analyzer, an agent, an optimizer and an executor; the step of sending the SQL sentence to a pre-constructed data management system, and optimizing an initial query strategy of the SQL sentence by the data management system to obtain a final query strategy comprises the following steps: analyzing the SQL sentence through the analyzer to obtain an abstract syntax AST tree; vectorizing the AST tree to obtain an initial state of the AST tree; initializing the initial query strategy by the intelligent agent according to a preset cyclic neural network LSTM to obtain an initial strategy function and an initial cost function; generating an execution plan based on the initial strategy function and the initial state, and obtaining an execution result according to the execution plan; optimizing the initial strategy function and the initial cost function through the intelligent agent according to the execution result to obtain a final strategy function and a final cost function; and generating a final query strategy based on the final strategy function and the final cost function. The query SQL statement is optimized through the data management system to obtain a final query policy, so that the problems that the data query cannot meet the wide growth of modern data and the connection condition is complicated can be solved. 
The scheme of the invention targets the following problem: querying data with conventional rule-based algorithms makes the search space too large and the time complexity too high, which degrades database query performance; and although other databases use algorithms based on dynamic programming and their optimizers adopt various pruning strategies to reduce the search space of join orders, they still cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions, so query efficiency remains low. With the method of the invention, the efficiency of data query strategy optimization is markedly improved.

The method embodiments of the present invention are presented based on the above-described terminal device architecture but not limited to the above-described framework.

Referring to fig. 2, fig. 2 is a flowchart of an exemplary embodiment of a data query policy optimization method according to the present invention. The data query strategy optimization method comprises the following steps:

step S01, receiving a query SQL statement;

Step S02, sending the SQL statement to a pre-constructed data management system, and optimizing the initial query strategy of the SQL statement through the data management system to obtain a final query strategy, wherein the data management system is built from an analyzer, an agent, an optimizer and an executor.

The execution body of the method of the embodiment may be a data query policy optimization device, or may be a data query policy optimization terminal device or a server, and the embodiment uses the data query policy optimization device as an example, where the data query policy optimization device may be integrated on a terminal device with a data processing function.

It should be understood that, with the advent of the big data era, database management systems are becoming more and more important, and research on query optimization has always been one of their key topics. One of the biggest challenges in query optimization is optimizing the join (connection) order. Many research institutions and database vendors have studied it extensively, yet product implementations still rely on conventional rule-based algorithms, such as the greedy algorithm in the optimizer of the MySQL database. This approach leads to an excessively large search space and excessively high time complexity, which degrades database query performance. Other databases use algorithms based on dynamic programming; although the optimizers of these database management systems adopt various pruning strategies to reduce the search space of join orders, they still cannot keep up with the rapid growth of modern data and the increasing complexity of join conditions. This embodiment therefore proposes a data query strategy optimization method that uses the fast decision-making ability of deep reinforcement learning to quickly generate optimal join orders over multiple relations for modern database management systems.

Before optimizing the data query strategy, a query SQL statement needs to be received, wherein SQL (Structured Query Language) is a language specially used for managing and operating the relational database, is a standardized query language used for retrieving, inserting, updating and deleting data from the database, and defining and managing a database structure;

The received query SQL statement can be understood as the means of querying data in a database, so the data management system of this embodiment can be deployed in the database in a targeted manner to perform the optimization; the specific configuration is determined by actual service requirements;

After the data management system receives the SQL statement, it optimizes the statement. In a database, a query statement is evaluated on the basis of the join relationships among the data, so these relationships are modeled accordingly. Using a cost model as the concrete model, the query cost of the current query strategy can be obtained, the query strategy can be optimized on that basis, and the final query strategy is obtained.

Specifically, in step S02 of this embodiment, the step of sending the SQL statement to a pre-built data management system, and optimizing, by the data management system, an initial query policy of the SQL statement to obtain a final query policy includes:

Step S021, parsing the SQL statement through the analyzer to obtain an abstract syntax tree (AST);

Step S022, vectorizing the AST to obtain an initial state of the AST;

Step S023, initializing the initial query strategy by the agent according to a preset recurrent neural network (LSTM) to obtain an initial strategy function and an initial cost function;

Step S024, generating an execution plan based on the initial strategy function and the initial state, and obtaining an execution result according to the execution plan;

step S025, optimizing the initial strategy function and the initial cost function through the intelligent agent according to the execution result to obtain a final strategy function and a final cost function;

And step S026, generating a final query strategy based on the final strategy function and the final cost function.

Based on the foregoing, the data management system in this embodiment includes at least a parser, an agent, an optimizer and an executor. The parser parses the SQL statement so that it can be vectorized. The agent is a computer program or system with sensing, reasoning and acting capabilities: it perceives information in the environment, reasons and makes decisions on that basis, and takes appropriate actions to achieve a predetermined objective. The optimizer is one of the key components responsible for executing SQL queries; its main task is to improve query execution efficiency by selecting the most effective execution plan. The executor executes the query plan to obtain the query effect, according to which the query strategy is optimized.

After the data management system receives the SQL statement, the statement is first parsed to obtain an abstract syntax tree (AST). An AST is an intermediate representation used by programs such as compilers and interpreters; it is a tree structure generated from source code by lexical and syntactic analysis, and it represents the syntactic structure and semantic information of the program.

For the AST to be processed by a neural network, it must be vectorized to obtain the initial state, which is the data connection relationship under the initial query strategy. The input state represents the relations to be connected in a query plan. To make the state easier for the neural network to process, its tree structure is vectorized: in this embodiment, each subtree is encoded as an n-dimensional row vector, where n is the number of nodes in the tree. For a given row vector, if a node does not exist in the subtree, the corresponding value in the vector is 0; otherwise, the corresponding value is determined by the height of the i-th node within the x-th vector subtree.

As shown in fig. 3, four relations A, B, C and D are to be connected. The initial state has four vector subtrees, each encoded according to the positions of the nodes in the subtree. In the first step, the decision action selects relations A and C for connection, which yields a subtree containing the connection of A and C together with two independent subtrees B and D. The remaining actions are then obtained in sequence according to the decisions taken, until the connection order of all four relations (i.e., the final state) is obtained.
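As a rough illustration of the row-vector encoding described above, the following sketch represents each subtree as a nested tuple and encodes it over the four relations of the fig. 3 example. The per-node value is an assumption: the sketch uses 1/h, where h is the level of the relation within its subtree (counting the subtree root as level 1), which is one common choice in learned join-order work and is not confirmed by this text. The names `leaf_heights`, `encode_subtree` and `encode_state` are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A subtree is either a relation name (leaf) or a pair of subtrees (a connection).
Tree = Union[str, Tuple["Tree", "Tree"]]


def leaf_heights(tree: Tree, level: int = 1) -> Dict[str, int]:
    """Map each relation in `tree` to its level, counting the subtree root as 1."""
    if isinstance(tree, str):
        return {tree: level}
    heights: Dict[str, int] = {}
    for child in tree:
        heights.update(leaf_heights(child, level + 1))
    return heights


def encode_subtree(tree: Tree, relations: List[str]) -> List[float]:
    """Encode one subtree as an n-dimensional row vector (n = number of relations).

    Relations absent from the subtree get 0; present ones get 1/h.
    The 1/h formula is an ASSUMPTION made for this sketch.
    """
    heights = leaf_heights(tree)
    return [1.0 / heights[r] if r in heights else 0.0 for r in relations]


def encode_state(forest: List[Tree], relations: List[str]) -> List[List[float]]:
    """Encode a state (a list of subtrees) as one row vector per subtree."""
    return [encode_subtree(t, relations) for t in forest]


relations = ["A", "B", "C", "D"]
initial = encode_state(["A", "B", "C", "D"], relations)
after_first_join = encode_state([("A", "C"), "B", "D"], relations)
```

Under this assumed encoding, the initial state is the 4 x 4 identity matrix, and after A and C are connected the first row becomes `[0.5, 0.0, 0.5, 0.0]` while B and D keep their own rows.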

Since the AST has been converted into an initial state that a neural network can process, the agent initializes the initial query strategy according to the preset LSTM neural network to obtain an initial policy function and an initial cost function. LSTM (long short-term memory) is a recurrent neural network (RNN) architecture commonly used for processing sequence data, and it handles long-term dependencies better than a traditional RNN. Its core idea is to use a special memory cell to store and update information, so that long-term dependencies in a sequence can be remembered and captured; a gating mechanism controls the flow of information, so that the contents of the memory can be selectively forgotten or updated.
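For reference, the gating mechanism mentioned above is usually written as follows. This is the textbook LSTM cell, given here only as background and not as the specific network of this embodiment; the symbols (input $x_t$, hidden state $h_t$, memory cell $c_t$, weights $W$, biases $b$, sigmoid $\sigma$, element-wise product $\odot$) are introduced for this illustration.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory update)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The forget gate $f_t$ decides how much of the old memory is kept, and the input gate $i_t$ decides how much new information is written, which corresponds to the selective forgetting and updating described in the text.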

After the initial policy function and the initial cost function are obtained, an execution plan is generated from the initial policy function and the initial state, and the plan is executed to obtain an execution result;

Finally, the agent optimizes the initial policy function and the initial cost function based on the execution result to obtain a final policy function and a final cost function, from which the final query strategy is obtained.

Since updating the data query strategy requires continuous optimization of the query actions, step S024 of generating an execution plan based on the initial policy function and the initial state and obtaining the execution result according to the execution plan includes:

Step S0241, inputting the initial state into the initial policy function for repeated calculation to obtain query actions and a final state;

Step S0242, generating a plan through the optimizer according to the query actions and the final state to obtain an execution plan;

Step S0243, executing the execution plan through the executor to obtain an execution result.

The initial state describes the relations that the query statement needs to connect, so it is input into the initial policy function for calculation to obtain a query action and a new state;

Then, the new state is calculated again through the initial policy function to obtain a further query action and new state; after repeated calculation, all query actions and the final state are obtained;

Based on the query actions and the final state, the optimizer can generate an execution plan, that is, the query actions are optimized and tried out;

Finally, the obtained execution plan is executed to obtain an execution result. It should be understood that the execution result is used to optimize the query strategy, so it includes at least the query effect and similar information.
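The repeated calculation of step S0241 can be pictured as a simple rollout loop. The sketch below is a minimal illustration under stated assumptions: a state is a list of subtrees, an action is a pair of indices naming the two subtrees to connect, and the policy is any callable from state to action. The placeholder policy `first_two` stands in for the learned policy function; all names here are hypothetical.

```python
from typing import Callable, List, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]
State = List[Tree]
Action = Tuple[int, int]  # indices of the two subtrees to connect
Policy = Callable[[State], Action]


def apply_action(state: State, action: Action) -> State:
    """Connect the two chosen subtrees and return the new state."""
    i, j = action
    joined = (state[i], state[j])
    rest = [t for k, t in enumerate(state) if k not in (i, j)]
    return [joined] + rest


def rollout(initial_state: State, policy: Policy) -> Tuple[List[Action], State]:
    """Apply the policy repeatedly until a single subtree remains (the final state)."""
    state: State = list(initial_state)
    actions: List[Action] = []
    while len(state) > 1:
        action = policy(state)
        actions.append(action)
        state = apply_action(state, action)
    return actions, state


def first_two(state: State) -> Action:
    """Placeholder policy (not the learned one): always connect the first two subtrees."""
    return (0, 1)


actions, final_state = rollout(["A", "B", "C", "D"], first_two)
```

With four relations the loop takes three actions, and the final state holds one tree that fixes the complete connection order. In the embodiment the actions and the final state would then be handed to the optimizer.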

Specifically, step S0242 of generating a plan through the optimizer according to the query actions and the final state to obtain an execution plan includes:

Step S02421, analyzing the query actions through the optimizer to obtain an optimization decision;

Step S02422, outputting a query tree through the optimizer based on the optimization decision, and creating plan nodes for the query tree based on the final state to obtain the execution plan.

Since the connection order can change as described above, the currently adopted query action can change as well, so the previously obtained query action is analyzed to obtain an optimization decision. A query action represents not only the decision to connect two relations in the query plan but also the physical operation used to connect them. In this embodiment, an action can be expressed as a tuple consisting of the two relations to be connected and the physical operation to be performed;

The query tree is then output and additional plan nodes are added to obtain the final physical execution plan, which can be passed to the executor for execution.
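One way to picture an action that carries both the two relations and the physical operation, and the subsequent creation of plan nodes, is sketched below. This is an illustration only: the classes `JoinAction` and `PlanNode`, the function `build_plan`, the operator labels such as `"hash_join"`, and the added `"scan"` and `"result"` nodes are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JoinAction:
    left: str       # first relation (or the name of an intermediate result)
    right: str      # second relation (or the name of an intermediate result)
    operator: str   # physical operation to perform (hypothetical label)


@dataclass
class PlanNode:
    operator: str
    children: List["PlanNode"]
    relation: str = ""  # set only on scan nodes


def build_plan(relations: List[str], actions: List[JoinAction]) -> PlanNode:
    """Turn a sequence of actions into a plan tree, adding scan and result nodes."""
    nodes: Dict[str, PlanNode] = {
        r: PlanNode("scan", [], relation=r) for r in relations
    }
    for a in actions:
        joined = PlanNode(a.operator, [nodes.pop(a.left), nodes.pop(a.right)])
        nodes[f"({a.left},{a.right})"] = joined  # name of the intermediate result
    (root,) = nodes.values()  # raises ValueError unless exactly one tree remains
    return PlanNode("result", [root])


plan = build_plan(
    ["A", "B", "C", "D"],
    [
        JoinAction("A", "C", "hash_join"),
        JoinAction("B", "D", "merge_join"),
        JoinAction("(A,C)", "(B,D)", "nested_loop_join"),
    ],
)
```

Keeping the physical operator inside the action means the agent chooses the connection order and the connection method in a single decision, which matches the description of the action above.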

More specifically, fig. 4 is an overall schematic diagram of the data query strategy optimization method of the present invention.

First, the SQL statement is input into the parser for parsing to obtain an AST;

Then, the AST is vectorized to obtain the initial state so that it can be processed by the LSTM neural network;

Then, the initial state is input into the agent (shown as the environment in the figure, because the agent reasons and makes decisions by perceiving information in the environment and takes appropriate actions to achieve the predetermined objective) to obtain the query actions and update the state;

Then, the final state obtained by the agent is input into the optimizer to generate the corresponding query plan (execution plan), which is handed to the executor for execution to obtain an execution result;

Finally, the agent calculates the reward and the total loss function from the execution result, and updates the policy function and the cost function through back propagation of the loss function.
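The overall flow of fig. 4 can be sketched as a loop over pluggable components. The function `optimize_query` and its parameter names are hypothetical, and every component is passed in as a plain callable so that the sketch stays independent of any concrete parser, agent, optimizer or executor; the stub lambdas in the usage example are placeholders, not real implementations.

```python
from typing import Any, Callable, List, Tuple


def optimize_query(
    sql: str,
    parse: Callable[[str], Any],                  # parser: SQL -> AST
    vectorize: Callable[[Any], Any],              # AST -> initial state
    act: Callable[[Any], Tuple[Any, Any]],        # agent: state -> (actions, final state)
    plan: Callable[[Any, Any], Any],              # optimizer: (actions, final state) -> plan
    execute: Callable[[Any], Any],                # executor: plan -> execution result
    update: Callable[[Any, Any], float],          # agent: (actions, result) -> loss
    episodes: int = 3,
) -> List[float]:
    """Run parse -> act -> plan -> execute -> update and return the loss per episode."""
    initial_state = vectorize(parse(sql))
    losses: List[float] = []
    for _ in range(episodes):
        actions, final_state = act(initial_state)
        result = execute(plan(actions, final_state))
        losses.append(update(actions, result))  # reward, loss and back propagation live here
    return losses


# Usage with stub components (placeholders only).
losses = optimize_query(
    "SELECT * FROM a JOIN b ON a.id = b.id",
    parse=lambda sql: sql.split(),
    vectorize=lambda ast: len(ast),
    act=lambda state: (["connect(a, b)"], state),
    plan=lambda actions, state: {"actions": actions, "state": state},
    execute=lambda p: {"cost": float(len(p["actions"]))},
    update=lambda actions, result: result["cost"],
    episodes=2,
)
```

The statement is parsed and vectorized once, while the act, plan, execute and update steps repeat, which reflects the continuous optimization of the policy function and the cost function.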

With the above scheme, this embodiment receives a query SQL statement, sends the SQL statement to a pre-built data management system, and optimizes an initial query strategy of the SQL statement through the data management system to obtain a final query strategy, wherein the data management system is built from a parser, an agent, an optimizer and an executor. The query SQL statement is thus optimized through the data management system to obtain the final query strategy, which addresses the problem that data queries cannot keep up with the rapid growth of modern data and the complexity of connection conditions, and improves the efficiency of data query strategy optimization.

Referring to fig. 5, fig. 5 is a flow chart of the data query strategy optimization method of the present invention, showing how the final policy function and the final cost function are obtained.

Based on the embodiment shown in fig. 2, step S025 of optimizing the initial policy function and the initial cost function through the agent according to the execution result to obtain the final policy function and the final cost function includes:

Step S0251, performing a calculation through the cost model of the optimizer according to the execution result to obtain an operation reward value;

Step S0252, performing function construction through the policy gradient algorithm PPO according to the recurrent neural network LSTM and the operation reward value to obtain a loss function;

Step S0253, optimizing the initial policy function and the initial cost function through the agent according to the loss function to obtain a final policy function and a final cost function.

To solve the connection-order optimization problem, which is modeled as a Markov decision process, this embodiment adopts the policy gradient algorithm PPO (Proximal Policy Optimization), which is suited to deep reinforcement learning problems with a discrete action space. The neural network structure used in this embodiment is shown in fig. 6: the network consists of a convolutional neural network and a stacked LSTM (long short-term memory) network, and can effectively handle deep reinforcement learning problems in which the order of actions matters.

It should be understood that, since the current execution plan has already been executed, an execution result is available. The cost model of the optimizer performs a calculation on this execution result to obtain the reward value of the operation (query);

Then, once the operation reward value is obtained, function construction is performed through the policy gradient algorithm according to the recurrent neural network LSTM and the operation reward value to obtain a loss function, which can be used to calculate the query error rate of the execution plan;

Finally, the initial policy function and the initial cost function are optimized based on the loss function to obtain a final policy function and a final cost function.

Specifically, step S0251 of performing a calculation through the cost model of the optimizer according to the execution result to obtain the operation reward value includes:

Step S02511, performing data calculation on the execution result based on the deep learning algorithm of the cost model to obtain CPU occupation data and memory occupation data;

Step S02512, obtaining the data transmission cost and the CPU running cost according to the CPU occupation data and the memory occupation data;

Step S02513, performing reward calculation on the data transmission cost and the running cost according to preset cost weight values to obtain the operation reward value.
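Step S02513 can be illustrated with a small weighted-sum sketch. The exact reward formula is not given in this text, so the form below is an assumption: the two costs are combined with the preset weights and the result is negated, so that a cheaper query receives a higher reward. The names `CostWeights` and `operation_reward` and the default weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostWeights:
    transfer: float = 0.5  # weight of the data transmission cost (hypothetical default)
    cpu: float = 0.5       # weight of the CPU running cost (hypothetical default)


def operation_reward(transfer_cost: float, cpu_cost: float, w: CostWeights) -> float:
    """Combine both costs with preset weights; lower total cost gives a higher reward.

    The negated weighted sum is an ASSUMED form, not the embodiment's formula.
    """
    total_cost = w.transfer * transfer_cost + w.cpu * cpu_cost
    return -total_cost


reward = operation_reward(transfer_cost=2.0, cpu_cost=4.0, w=CostWeights(0.25, 0.75))
```

With these example numbers the weighted cost is 0.25 * 2.0 + 0.75 * 4.0 = 3.5, so the reward is -3.5. Adjusting the weights shifts the emphasis between data transmission and CPU usage.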

The cost model mentioned in this embodiment uses a Transformer-based deep learning algorithm to obtain the cost of the query;

The algorithm acquires the CPU and memory occupation data recorded during execution of the query, as shown in fig. 7 and 2; the CPU and memory usage data are then encoded and used as input to a neural network, which evaluates the cost of the query. This cost is used to calculate the loss (loss function) of the deep reinforcement learning algorithm and thereby update the network parameters of the connection optimization method based on the fusion of the LSTM and PPO algorithms.

To make the results of the cost model more accurate, the model must be trained before use. Because the acquired CPU and memory occupation data are time-series data, the deep learning algorithm in the cost model of this embodiment uses a Transformer-based network structure, as shown in fig. 9, to process such data better. Before the connection-order strategy is obtained, the algorithm network of the cost model needs to be trained in advance: in this embodiment, a large amount of actual data is sampled in advance, cleaned, and divided into a training set and a test set to pre-train the algorithm network of the cost model, which is then updated periodically according to the results of the connection optimization algorithm.

More specifically, step S0252 of performing function construction through the policy gradient algorithm PPO according to the recurrent neural network LSTM and the operation reward value to obtain the loss function includes:

Step S02521, performing entropy calculation on the query actions to obtain the query entropy value of the initial policy function;

Step S02522, performing function construction based on the initial cost function to obtain a cost loss function;

Step S02523, limiting the update amplitude of the LSTM through the operation reward value and a preset advantage function to obtain a near-end ratio clipping loss value;

Step S02524, performing function construction through the policy gradient algorithm PPO based on the query entropy value, the cost loss function and the near-end ratio clipping loss value to obtain the loss function.

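For the PPO algorithm, the loss function takes the standard PPO form:

$$L(\theta) = L^{CLIP}(\theta) - c_1 L^{VF}(\theta) + c_2 S[\pi_\theta]$$

where $c_1$ and $c_2$ are weighting coefficients. The specific parameter definitions and calculations are as follows:

(1) $L^{CLIP}(\theta)$ represents the near-end ratio clipping loss, which is used to limit the magnitude of the network update and can be calculated by:

$$L^{CLIP}(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta) A_t,\ \operatorname{clip}\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right) A_t\right)\right]$$

Wherein,

$$r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)}$$

is the update amplitude of the policy, that is, the ratio of the probability that the current policy takes action $a_t$ in state $s_t$ to the probability that the old policy takes action $a_t$ in state $s_t$; $A_t$ is the advantage function, which measures how good the current state and action are relative to the average level, and can be calculated by the following formula:

$$A_t = Q(s_t, a_t) - V(s_t)$$

Wherein, $Q(s_t, a_t)$ represents the value of taking action $a_t$ in state $s_t$, $V(s_t)$ represents the average value in state $s_t$, and $\epsilon$ is a hyperparameter that controls the clipping amplitude;

(2) $L^{VF}(\theta)$ represents the cost loss function, which can be calculated by:

$$L^{VF}(\theta) = \mathbb{E}_t\left[\left(V_\theta(s_t) - V_t^{targ}\right)^2\right]$$

where $V_t^{targ}$ is the target value for state $s_t$;

(3) $S[\pi_\theta]$ is the entropy of the policy, which encourages the exploration of more actions and can be calculated by:

$$S[\pi_\theta] = \mathbb{E}_t\left[-\sum_{a} \pi_\theta(a \mid s_t) \log \pi_\theta(a \mid s_t)\right]$$

The network parameters $\theta$ may be updated by gradient through the loss function.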
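To make the three terms concrete, the sketch below computes a PPO-style loss over a batch of time steps in plain Python. It assumes the usual PPO combination (clipping term minus a weighted cost loss plus a weighted entropy, negated so that it can be minimized); the coefficient defaults `eps=0.2`, `c1=0.5`, `c2=0.01` and all function and parameter names are illustrative assumptions, not values from the embodiment. A real implementation would use an automatic-differentiation framework so that gradients reach the network parameters.

```python
import math
from typing import Sequence


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def ppo_loss(
    new_logp: Sequence[float],                # log-probability of a_t under the current policy
    old_logp: Sequence[float],                # log-probability of a_t under the old policy
    advantages: Sequence[float],              # advantage of (s_t, a_t)
    values: Sequence[float],                  # value predicted for s_t
    returns: Sequence[float],                 # value targets for s_t
    action_probs: Sequence[Sequence[float]],  # full action distribution at s_t
    eps: float = 0.2,                         # clipping amplitude (assumed default)
    c1: float = 0.5,                          # weight of the cost loss (assumed default)
    c2: float = 0.01,                         # weight of the entropy term (assumed default)
) -> float:
    """Return the batch-averaged loss; smaller is better for the optimizer."""
    n = len(advantages)
    l_clip = l_vf = entropy = 0.0
    for t in range(n):
        ratio = math.exp(new_logp[t] - old_logp[t])  # probability ratio new / old
        l_clip += min(
            ratio * advantages[t],
            clip(ratio, 1.0 - eps, 1.0 + eps) * advantages[t],
        )
        l_vf += (values[t] - returns[t]) ** 2
        entropy += -sum(p * math.log(p) for p in action_probs[t] if p > 0.0)
    # Negated: the clipping term and entropy are maximized, the cost loss is minimized.
    return -(l_clip - c1 * l_vf + c2 * entropy) / n


# One step where the new policy doubles the action probability: the ratio 2.0
# is clipped to 1 + eps = 1.2, which caps the size of the update.
loss = ppo_loss(
    new_logp=[math.log(2.0)], old_logp=[0.0], advantages=[1.0],
    values=[1.0], returns=[1.0], action_probs=[[1.0]], c2=0.0,
)
```

The clipping is what keeps a single update from moving the policy too far from the old one, which is the role the text assigns to the near-end ratio clipping loss.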

With the above scheme, this embodiment performs a calculation through the cost model of the optimizer according to the execution result to obtain the operation reward value; performs function construction through the policy gradient algorithm PPO according to the recurrent neural network LSTM and the operation reward value to obtain a loss function; and optimizes the initial policy function and the initial cost function through the agent according to the loss function to obtain a final policy function and a final cost function. The initial policy function and the initial cost function are thus optimized, which addresses the problem that data queries cannot keep up with the rapid growth of modern data and the complexity of connection conditions, and improves the efficiency of data query strategy optimization.

In addition, the embodiment of the invention also provides a data query strategy optimization device, which comprises:

The receiving module is used for receiving the query SQL statement;

The optimizing module is used for sending the SQL sentence to a pre-constructed data management system, and optimizing the initial query strategy of the SQL sentence through the data management system to obtain a final query strategy, wherein the data management system is obtained based on an analyzer, an agent, an optimizer and an executor.

In addition, the embodiment of the invention also provides a terminal device, which comprises a memory, a processor and a data query strategy optimization program stored on the memory and capable of running on the processor, wherein the data query strategy optimization program realizes the steps of the data query strategy optimization method when being executed by the processor.

Because all the technical schemes of all the embodiments are adopted when the data query policy optimization program is executed by the processor, the data query policy optimization program at least has all the beneficial effects brought by all the technical schemes of all the embodiments and is not described in detail herein.

In addition, the embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium is stored with a data query strategy optimization program, and the data query strategy optimization program realizes the steps of the data query strategy optimization method when being executed by a processor.

Compared with the prior art, the data query strategy optimization method, device, terminal equipment and storage medium provided by the embodiments of the invention receive a query SQL statement, send the SQL statement to a pre-built data management system, and optimize an initial query strategy of the SQL statement through the data management system to obtain a final query strategy, wherein the data management system is built from a parser, an agent, an optimizer and an executor. This addresses the problem that data queries cannot keep up with the rapid growth of modern data and the complexity of connection conditions, realizes the optimization of the data query strategy, and improves the efficiency of that optimization. In the prior art, data queries rely on conventional rule-based algorithms, which lead to an excessively large search space and excessively high time complexity and thus degrade database query performance; other databases use algorithms based on dynamic programming, and although their optimizers adopt various pruning strategies to reduce the search space of the relation connection order, they still cannot meet the demands of modern data growth and complex connection conditions, so query efficiency is low. The scheme of the invention addresses these problems and significantly improves the efficiency of data query strategy optimization.

Compared with the prior art, the embodiment of the invention has the following advantages:

1. The invention can adapt to connection relationships of various forms, meet the various query needs of users, and quickly obtain the optimal connection order to reduce query time, thereby improving the user experience;

2. The invention models the relations to be connected as a state and the decision to connect two relations as an action, and then uses the PPO algorithm, a deep reinforcement learning algorithm, to solve the optimization problem formulated as a Markov decision process, thereby generating the connection order of multiple relations; the neural network of the algorithm consists of a convolutional neural network and a stacked LSTM (long short-term memory) network.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a controlled terminal, or a network device, etc.) to perform the method of each embodiment of the present invention.

The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit its scope. Any equivalent structure or equivalent process derived from the disclosure herein, whether used directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims

1. The data query strategy optimization method is characterized by comprising the following steps of:

Receiving a query SQL statement;

vectorizing the AST tree to obtain an initial state of the AST tree;

The step of generating an execution plan based on the initial policy function and the initial state and obtaining an execution result according to the execution plan includes:

according to the execution plan, performing plan execution through the executor to obtain an execution result;

the step of optimizing the initial policy function and the initial cost function by the agent according to the execution result to obtain a final policy function and a final cost function includes:

the step of performing function construction through a strategy gradient algorithm PPO according to the cyclic neural network LSTM and the operation rewarding value, and obtaining a loss function includes:

performing function construction through the strategy gradient algorithm PPO based on the query entropy value, the cost loss function and the near-end ratio clipping loss value to obtain the loss function;

Optimizing the initial strategy function and the initial cost function through the intelligent agent according to the loss function, and obtaining a final strategy function and a final cost function;

2. The method according to claim 1, wherein the step of generating a plan by the optimizer according to the query action and the final state, and obtaining an execution plan comprises:

3. The method according to claim 1, wherein the step of obtaining the operation reward value by calculating through a cost model of the optimizer according to the execution result comprises:

4. A data query policy optimization device, characterized in that the data query policy optimization device comprises:

The receiving module is used for receiving the query SQL statement;

vectorizing the AST tree to obtain an initial state of the AST tree;

Wherein, the optimization module is further used for: inputting the initial state into the initial strategy function for repeated calculation to obtain a query action and a final state;

wherein, the optimization module is further used for: according to the execution result, calculating through a cost model of the optimizer to obtain an operation rewarding value;

Based on the query entropy value, the cost loss function and the near-end ratio clipping loss value, performing function construction through a strategy gradient algorithm PPO to obtain the loss function

5. A terminal device comprising a memory, a processor and a data query policy optimization program stored on the memory and executable on the processor, the data query policy optimization program when executed by the processor implementing the steps of the data query policy optimization method according to any of claims 1-3.

6. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a data query policy optimization program, which when executed by a processor, implements the steps of the data query policy optimization method according to any of claims 1-3.