CN113094368A - System and method for improving cache access hit rate - Google Patents
System and method for improving cache access hit rate
- Publication number
- Publication number: CN113094368A; Application number: CN202110392024.6A
- Authority
- CN
- China
- Prior art keywords
- query
- bitmap
- neural network
- index
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a system and a method for improving the cache access hit rate. By introducing a DDQN model, the access hit rate of the cache region is improved, the cache region is better utilized, and query efficiency increases. The DDQN model learns from experience: multiple queries can be put into the query set storage table and scheduled, more experience is gained from historically executed queries, and the scheduling strategy improves over time. The invention effectively captures the state of the cache region and the data access pattern, makes better use of the cache region, and improves query decision arrangement. The DDQN model also adapts to queries that have not yet been executed, so the query scheduling strategy quickly adapts to new query templates, producing a notable effect and improving resource-sharing efficiency.
Description
Technical Field
The invention belongs to the field of artificial intelligence and databases, and particularly relates to a system and a method for improving cache access hit rate.
Background
Query scheduling is an important and challenging task in modern database systems. It can have a significant impact on query performance and resource utilization, but it may require consideration of many factors, such as cached data sets, available resources (e.g., memory), performance goals for each query, query priority, or inter-query dependencies (e.g., related data access patterns).
The database stores table data and indexes in the cache in the form of pages, and in some cases (when preprocessing is used) caches the query plan, but it does not cache specific query results. A cached data page contains contiguous data, i.e., not only the data being queried. Traditional caching methods are implemented with rule-based algorithms, but in today's big-data scenarios query traffic is large in scale and growing rapidly, and diverse, complex queries pose severe challenges to traditional caching. Using AI techniques, the database system can learn features by itself, such as the state information of the whole buffer region, feature information of query statements, and workload information of the service; these are far more accurate than traditional caching methods, so the cache hit rate is higher.
Existing query scheduling strategies cannot effectively improve the cache hit rate: an improper query execution order may invalidate the cache, forcing I/O operations at a great performance cost. Assuming the database cache has a fixed size, the model can find an optimal order so that the current query statement reuses, as far as possible, the data pages that the previous query statement already loaded into the cache through I/O, thereby reducing the database's I/O operations. The main objective of the present invention is to reduce I/O, i.e., to increase the hit rate of data pages in the cache region, because I/O consumption greatly affects database performance.
In summary, in order to increase the cache hit rate, generate effective execution plans, and schedule various complex queries well, it is necessary to design a method for increasing the cache access hit rate.
Disclosure of Invention
Aiming at the defects in the prior art, the present invention provides a system and a method for improving the cache access hit rate, addressing the problem that the database cache hit rate is low in the prior art.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that: a system for improving cache access hit rate comprises a query storage table module, a query feature extractor, a DDQN model and a buffer pool feature extractor;
The query storage table module is connected with the query feature extractor and is used for acquiring query requests submitted by users and storing them into a chained queue; the query feature extractor is connected with the DDQN model and is used for converting the query information acquired by the query set storage table module into a feature vector and compressing it into a first bitmap; the DDQN model is connected with the buffer pool feature extractor and is used for receiving the bitmaps input by the query feature extractor and the buffer pool feature extractor and executing queries; the buffer pool feature extractor is used for converting the state of the database buffer pool into a second bitmap.
Further, the database is used for storing a data table; the buffer pool of the database comprises m columns of data blocks multiplied by n rows, and each data table comprises one row of data blocks; the data block is used for caching data; the query request comprises queries of data blocks corresponding to a plurality of basic relations; each basic relation corresponds to a data table and comprises the state of the data block inquired by the corresponding data table; the data blocks are provided with corresponding index blocks, and all the index blocks form an index table.
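The buffer-pool layout described above (m columns of data blocks by n rows, one row per data table) can be sketched as a small bitmap structure. The class and method names below are illustrative assumptions, not taken from the patent:

```python
# Sketch of the buffer pool as an n x m bitmap of data blocks, one row per
# data table (basic relation). A cell is 1 when its block is cached, 0 when
# it is not. All identifiers here are assumptions for illustration.

class BufferPoolBitmap:
    def __init__(self, n_tables, m_blocks):
        # one row of m data blocks per data table
        self.rows = [[0] * m_blocks for _ in range(n_tables)]

    def mark_cached(self, table, block):
        self.rows[table][block] = 1

    def is_cached(self, table, block):
        return self.rows[table][block] == 1

pool = BufferPoolBitmap(n_tables=3, m_blocks=4)
pool.mark_cached(0, 2)
```

The same row-per-relation shape is reused later for both the first bitmap (step S2) and the second bitmap (step S5).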
The invention has the beneficial effects that:
(1) the invention provides a system for improving cache access hit rate, which improves the access hit rate of a cache region by setting a DDQN model, can better utilize the cache region and improve the query efficiency.
(2) The DDQN model provided by the invention can learn from experience: multiple queries can be put into the query set storage table and scheduled, and more experience is gained from previously executed queries to improve the scheduling strategy.
A method for improving the hit rate of cache access comprises the following steps:
s1, establishing a chained query queue through the query storage table module, performing query request enqueuing operation, and sequentially storing query requests in the chained query queue to the query storage table module;
s2, scanning the basic relationship contained in each query request through a query feature extractor, representing the basic relationship as a feature vector, marking the data block to be accessed by the feature vector as 1, marking the data block not to be accessed by the feature vector as 0, and constructing a first bitmap;
s3, scanning an index table, acquiring the access probability of the data block and the access probability of the index block, sequentially selecting dequeued query requests according to the access probabilities of the data block and the index block, and taking the dequeued query requests as candidate query requests;
s4, transmitting the first bitmap corresponding to the candidate query request to the DDQN model;
s5, converting the state of the database buffer pool into a second bitmap, constructing a bitmap state according to the characteristics of the second bitmap, and transmitting the bitmap state to the DDQN model;
s6, selecting candidate query requests to query through the DDQN model according to the first bitmap and the bitmap states, and completing the improvement process of the cache access hit rate.
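Step S2 above marks the data blocks a query will access with 1 and all others with 0. A minimal sketch of that first-bitmap construction follows; the function name and input shape are assumptions:

```python
# Hedged sketch of step S2: each query is reduced to the (table_row, block_col)
# pairs it will touch; accessed blocks are marked 1, all others 0, yielding
# the "first bitmap". Shapes and names are illustrative assumptions.

def build_first_bitmap(accessed_blocks, n_tables, m_blocks):
    """accessed_blocks: iterable of (table_row, block_col) pairs."""
    bitmap = [[0] * m_blocks for _ in range(n_tables)]
    for x, y in accessed_blocks:
        bitmap[x][y] = 1
    return bitmap

bm = build_first_bitmap([(0, 1), (2, 3)], n_tables=3, m_blocks=4)
```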
Further, the specific method for establishing the chained query queue through the query storage table module in step S1 and performing the query enqueuing operation includes:
s1.1, establishing a chained query queue through a query storage table module, and initializing the query storage table module to be empty;
s1.2, collecting a query request of a user to a chained query queue;
s1.3, setting a subsequent tail pointer r pointing to a queue tail node, and executing enqueuing operation on a query request pointed by the tail pointer r, wherein the tail pointer r points to the next query request;
s1.4, according to the method in the step S1.3, the inquiry requests in the chain inquiry queue are sequentially enqueued;
after the query enqueue is completed, the tail pointer r points to the head of the queue of the enqueued query request for executing the dequeue operation.
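The chained query queue with tail pointer r described in steps S1.1 to S1.4 can be sketched as a singly linked list; the class and field names are assumptions for illustration:

```python
# Minimal linked ("chained") query queue with a tail pointer r, as in steps
# S1.1-S1.4. Node and attribute names are assumptions, not the patent's.

class _Node:
    def __init__(self, query):
        self.query = query
        self.next = None

class ChainedQueryQueue:
    def __init__(self):
        self.head = None      # dequeue side
        self.tail = None      # corresponds to the tail pointer r

    def enqueue(self, query):
        node = _Node(query)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node   # the node at r links to the next query
            self.tail = node        # r now points to the new tail

    def dequeue(self):
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None
        return node.query

q = ChainedQueryQueue()
for sql in ["q1", "q2", "q3"]:
    q.enqueue(sql)
```

Enqueuing appends at the tail; dequeuing (used in step S3) removes from the head, preserving arrival order until the probability-based selection reorders candidates.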
Further, the step S3 is specifically:
s3.1, scanning an index table to perform index operation according to the basic relationship corresponding to the query request, and selecting a row related to the basic relationship;
s3.2, taking the ratio of the selected row in all the rows as the access probability of the data block corresponding to the basic relationship;
s3.3, taking the ratio of the index blocks related to the basic relationship in all the index blocks as the access probability of the index blocks corresponding to the basic relationship;
s3.4, traversing all query requests in the query storage table module, and acquiring the data block access probability and the index block access probability corresponding to each query request;
s3.5, dequeuing the query request with the maximum access probability of the data block and the index block;
and S3.6, repeating the step S3.5, sequentially selecting the dequeued query requests, and taking the dequeued query requests as candidate query requests.
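Steps S3.1 to S3.6 rank queries by their data-block and index-block access probabilities. The sketch below scores each query by the sum of the two probabilities; that combination rule is an assumption, since the patent only says queries dequeue "according to the access probabilities":

```python
# Hedged sketch of step S3: the data-block probability is the fraction of rows
# a query selects, the index-block probability the fraction of index blocks it
# touches. Scoring by their sum is an illustrative assumption.

def access_probabilities(selected_rows, total_rows, idx_touched, idx_total):
    p_data = selected_rows / total_rows      # S3.2
    p_index = idx_touched / idx_total        # S3.3
    return p_data, p_index

def order_candidates(queries):
    """queries: dict name -> (selected_rows, total_rows, idx_touched, idx_total)."""
    def score(item):
        p_data, p_index = access_probabilities(*item[1])
        return p_data + p_index
    # S3.5/S3.6: repeatedly dequeue the query with the largest probability
    return [name for name, _ in sorted(queries.items(), key=score, reverse=True)]

order = order_candidates({
    "q1": (10, 100, 2, 10),   # 0.1 + 0.2 = 0.3
    "q2": (50, 100, 5, 10),   # 0.5 + 0.5 = 1.0
})
```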
Further, the specific method for converting the buffer pool state into the second bitmap in step S5 and constructing the bitmap state according to the features of the second bitmap includes:
s5.1, converting the buffer pool state into a second bitmap, wherein each row of data blocks of the second bitmap is used as a basic relation;
S5.2, create a two-tuple and denote the rows and columns of the second bitmap by x and y, where x = 1, 2, …, X and y = 1, 2, …, Y, with X denoting the total number of rows and Y the total number of columns;
S5.3, represent a data block marked 1 in the second bitmap as <x, y> = 1, and a data block marked 0 as <x, y> = 0;
S5.4, on the basis of step S5.3, construct the bitmap state State1_{x,y} from the features of the second bitmap, where N_x denotes the row vector corresponding to the basic relation x, M_x denotes the quotient of the row vector corresponding to the basic relation x divided by its number of nonzero cells, and |·| denotes the vector modulus.
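Step S5.4 names three per-row quantities: the row vector N_x, the quotient M_x of the row vector and its count of nonzero cells, and the vector modulus |·|. The sketch below computes |N_x| and |M_x| for one row; how the patent combines them into State1_{x,y} is not reproduced here, so treating the two moduli as the state features is purely an illustrative assumption:

```python
import math

# Sketch of the per-row quantities of step S5.4: N_x is the row vector of the
# second bitmap, M_x is N_x divided by its number of nonzero cells, and |.|
# is the vector modulus. Returning (|N_x|, |M_x|) as the features is an
# assumption, not the patent's exact formula.

def row_features(row):
    nonzero = sum(1 for v in row if v != 0)
    n_norm = math.sqrt(sum(v * v for v in row))           # |N_x|
    m_x = [v / nonzero for v in row] if nonzero else row  # M_x
    m_norm = math.sqrt(sum(v * v for v in m_x))           # |M_x|
    return n_norm, m_norm

n_norm, m_norm = row_features([1, 0, 1, 1])
```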
Further, the step S6 includes specifically:
s6.1, receiving a first bitmap and a bitmap state through a DDQN model;
S6.2, set the objective of the DDQN model as the maximum total reward R to be found, and construct the cache scheduling strategy of the DDQN model as a function Q_π(S_t, A_t), where Q_π(S_t, A_t) denotes a deep neural network, S_t denotes the state, and A_t denotes the action;
S6.3, construct a Q neural network to be updated and a target Q neural network, and fix the target Q neural network as Q_π'(S_{t+1}, π(S_{t+1})) + r_t, where S_{t+1} denotes the state of the target Q neural network, π(S_{t+1}) denotes the action of the target Q neural network, r_t denotes the reward value obtained by executing the query, and π denotes the execution function of the Q neural network;
S6.4, fit Q_π'(S_{t+1}, π(S_{t+1})) + r_t with the Q neural network to be updated, and repeat the training N times;
S6.5, after N rounds of training, overwrite the parameters of the target Q neural network with the parameters of the Q neural network to be updated, obtaining the updated target Q neural network as:
Q_π'(S_t, A_t) = r_t + Q_π'(S'_{t+1}, argmax_A Q_π(S'_{t+1}, A))
where Q_π'(S_t, A_t) denotes the updated target Q neural network, Q_π' denotes the target Q neural network, and argmax_A Q_π(S'_{t+1}, A) denotes that, given the buffer state S'_{t+1} and action A, the action A that maximizes the value of the Q neural network to be updated is selected through the argmax function; the action A represents a first bitmap, and the buffer state S'_{t+1} represents the buffer pool bitmap state;
and S6.6, executing the obtained query request corresponding to the action A, and completing the cache access hit rate improving process.
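The double estimation in steps S6.2 to S6.5 (the network to be updated selects the action, the fixed target network evaluates it) can be sketched with tabular dictionaries standing in for the two Q neural networks; the tabular simplification and all names are assumptions for illustration:

```python
# Hedged sketch of the Double-DQN target of S6.3-S6.5: q_online (the network
# to be updated) picks the action via argmax, q_target (the fixed target
# network) evaluates it, and the reward r_t is added. Dictionaries stand in
# for the neural networks; this is an illustrative simplification.

def ddqn_target(q_online, q_target, next_state, reward, actions):
    # action selection by the online network: argmax_A Q_pi(S'_{t+1}, A)
    best_a = max(actions, key=lambda a: q_online[(next_state, a)])
    # action evaluation by the target network, plus the reward r_t
    return reward + q_target[(next_state, best_a)]

q_online = {("s1", "a"): 0.2, ("s1", "b"): 0.9}
q_target = {("s1", "a"): 0.5, ("s1", "b"): 0.1}
y = ddqn_target(q_online, q_target, "s1", reward=1.0, actions=["a", "b"])
```

Because the online network would over-rate action "b" here, letting the target network evaluate it yields the smaller, less biased value 0.1; this separation of selection and evaluation is exactly the over-estimation remedy claimed in beneficial effect (3).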
The invention has the beneficial effects that:
(1) When more queries are scheduled, the DDQN model can improve the hit rate of the cache region; it effectively captures the state of the cache region and the data access pattern, makes better use of the cache region, and improves query decision arrangement.
(2) The DDQN model in the invention can adapt to queries that have not yet been executed, and the query scheduling strategy can quickly adapt to new query templates, producing a notable effect and improving resource-sharing efficiency.
(3) The method does not cause over-estimation. Over-estimation means that the estimated Q-value function is larger than the true Q-value function; its root is the Q-value maximization operation in DQN (Deep Q-Network). The method effectively avoids this problem because the selection of the action (query) and the evaluation of the action (query) are realized by different functions, improving the cache access hit rate.
Drawings
Fig. 1 is a schematic diagram of a system for increasing a cache access hit rate according to the present invention.
Fig. 2 is a flowchart of a method for increasing a cache access hit rate according to the present invention.
FIG. 3 is a diagram illustrating a DDQN model according to the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art, but it should be understood that the invention is not limited to the scope of the embodiments. To those skilled in the art, various changes are apparent within the spirit and scope of the invention as defined in the appended claims, and all matter produced using the inventive concept is protected.
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a system for improving cache access hit rate includes a query storage table module, a query feature extractor, a DDQN model and a buffer pool feature extractor;
The query storage table module is connected with the query feature extractor and is used for acquiring query requests submitted by users and storing them into a chained queue; the query feature extractor is connected with the DDQN model and is used for converting the query information acquired by the query set storage table module into a feature vector and compressing it into a first bitmap; the DDQN model is connected with the buffer pool feature extractor and is used for receiving the bitmaps input by the query feature extractor and the buffer pool feature extractor and executing queries; the buffer pool feature extractor is used for converting the state of the database buffer pool into a second bitmap.
The database is used for storing a data table; the buffer pool of the database comprises m columns of data blocks multiplied by n rows, and each data table comprises one row of data blocks; the data block is used for caching data; the query request comprises queries of data blocks corresponding to a plurality of basic relations; each basic relation corresponds to a data table and comprises the state of the data block inquired by the corresponding data table; the data blocks are provided with corresponding index blocks, and all the index blocks form an index table.
The invention has the beneficial effects that:
(1) the invention provides a system for improving cache access hit rate, which improves the access hit rate of a cache region by setting a DDQN model, can better utilize the cache region and improve the query efficiency.
(2) The DDQN model provided by the invention can learn from experience: multiple queries can be put into the query set storage table and scheduled, and more experience is gained from previously executed queries to improve the scheduling strategy.
As shown in fig. 2, a method for increasing a cache access hit rate includes the following steps:
s1, establishing a chained query queue through the query storage table module, performing query request enqueuing operation, and sequentially storing query requests in the chained query queue to the query storage table module;
s2, scanning the basic relationship contained in each query request through a query feature extractor, representing the basic relationship as a feature vector, marking the data block to be accessed by the feature vector as 1, marking the data block not to be accessed by the feature vector as 0, and constructing a first bitmap;
s3, scanning an index table, acquiring the access probability of the data block and the access probability of the index block, sequentially selecting dequeued query requests according to the access probabilities of the data block and the index block, and taking the dequeued query requests as candidate query requests;
s4, transmitting the first bitmap corresponding to the candidate query request to the DDQN model;
s5, converting the state of the database buffer pool into a second bitmap, constructing a bitmap state according to the characteristics of the second bitmap, and transmitting the bitmap state to the DDQN model;
s6, selecting candidate query requests to query through the DDQN model according to the first bitmap and the bitmap states, and completing the improvement process of the cache access hit rate.
In step S1, the specific method for establishing the chained query queue through the query storage table module and performing the query enqueuing operation includes:
s1.1, establishing a chained query queue through a query storage table module, and initializing the query storage table module to be empty;
s1.2, collecting a query request of a user to a chained query queue;
s1.3, setting a subsequent tail pointer r pointing to a queue tail node, and executing enqueuing operation on a query request pointed by the tail pointer r, wherein the tail pointer r points to the next query request;
s1.4, according to the method in the step S1.3, the inquiry requests in the chain inquiry queue are sequentially enqueued;
after the query enqueue is completed, the tail pointer r points to the head of the queue of the enqueued query request for executing the dequeue operation.
The step S3 specifically includes:
s3.1, scanning an index table to perform index operation according to the basic relationship corresponding to the query request, and selecting a row related to the basic relationship;
s3.2, taking the ratio of the selected row in all the rows as the access probability of the data block corresponding to the basic relationship;
s3.3, taking the ratio of the index blocks related to the basic relationship in all the index blocks as the access probability of the index blocks corresponding to the basic relationship;
s3.4, traversing all query requests in the query storage table module, and acquiring the data block access probability and the index block access probability corresponding to each query request;
s3.5, dequeuing the query request with the maximum access probability of the data block and the index block;
and S3.6, repeating the step S3.5, sequentially selecting the dequeued query requests, and taking the dequeued query requests as candidate query requests.
In step S5, the buffer pool state is converted into a second bitmap, and a specific method for constructing a bitmap state by using the features of the second bitmap includes:
s5.1, converting the buffer pool state into a second bitmap, wherein each row of data blocks of the second bitmap is used as a basic relation;
S5.2, create a two-tuple and denote the rows and columns of the second bitmap by x and y, where x = 1, 2, …, X and y = 1, 2, …, Y, with X denoting the total number of rows and Y the total number of columns;
S5.3, represent a data block marked 1 in the second bitmap as <x, y> = 1, and a data block marked 0 as <x, y> = 0;
S5.4, on the basis of step S5.3, construct the bitmap state State1_{x,y} from the features of the second bitmap, where N_x denotes the row vector corresponding to the basic relation x, M_x denotes the quotient of the row vector corresponding to the basic relation x divided by its number of nonzero cells, and |·| denotes the vector modulus.
The step S6 includes the specific steps of:
s6.1, receiving a first bitmap and a bitmap state through a DDQN model;
S6.2, set the objective of the DDQN model as the maximum total reward R to be found, and construct the cache scheduling strategy of the DDQN model as a function Q_π(S_t, A_t), where Q_π(S_t, A_t) denotes a deep neural network, S_t denotes the state, and A_t denotes the action;
S6.3, construct a Q neural network to be updated and a target Q neural network, and fix the target Q neural network as Q_π'(S_{t+1}, π(S_{t+1})) + r_t, where S_{t+1} denotes the state of the target Q neural network, π(S_{t+1}) denotes the action of the target Q neural network, r_t denotes the reward value obtained by executing the query, and π denotes the execution function of the Q neural network;
S6.4, fit Q_π'(S_{t+1}, π(S_{t+1})) + r_t with the Q neural network to be updated, and repeat the training N times;
As shown in fig. 3, S6.5, after N rounds of training, overwrite the parameters of the target Q neural network with the parameters of the Q neural network to be updated, obtaining the updated target Q neural network as:
Q_π'(S_t, A_t) = r_t + Q_π'(S'_{t+1}, argmax_A Q_π(S'_{t+1}, A))
where Q_π'(S_t, A_t) denotes the updated target Q neural network, Q_π' denotes the target Q neural network, and argmax_A Q_π(S'_{t+1}, A) denotes that, given the buffer state S'_{t+1} and action A, the action A that maximizes the value of the Q neural network to be updated is selected through the argmax function; the action A represents a first bitmap, and the buffer state S'_{t+1} represents the buffer pool bitmap state;
and S6.6, executing the obtained query request corresponding to the action A, and completing the cache access hit rate improving process.
The invention has the beneficial effects that:
(1) when more queries are scheduled, the DDQN model can improve the hit rate of the cache region; it effectively captures the state of the cache region and the data access pattern, makes better use of the cache region, and improves query decision arrangement;
(2) the DDQN model in the invention can adapt to queries that have not yet been executed, and the query scheduling strategy can quickly adapt to new query templates, producing a notable effect and improving resource-sharing efficiency;
(3) the method does not cause over-estimation. Over-estimation means that the estimated Q-value function is larger than the true Q-value function; its root is the Q-value maximization operation in DQN (Deep Q-Network). The method effectively avoids this problem because the selection of the action (query) and the evaluation of the action (query) are realized by different functions, improving the cache access hit rate.
Claims (7)
1. A system for improving the cache access hit rate is characterized by comprising a query storage table module, a query feature extractor, a DDQN model and a buffer pool feature extractor;
the query storage table module is connected with the query feature extractor and is used for acquiring query requests submitted by users and storing them into a chained queue; the query feature extractor is connected with the DDQN model and is used for converting the query information acquired by the query set storage table module into a feature vector and compressing it into a first bitmap; the DDQN model is connected with the buffer pool feature extractor and is used for receiving the bitmaps input by the query feature extractor and the buffer pool feature extractor and executing queries; the buffer pool feature extractor is used for converting the state of the database buffer pool into a second bitmap.
2. The system for improving cache access hit rate according to claim 1, wherein the database is configured to store a data table; the buffer pool of the database comprises m columns of data blocks multiplied by n rows, and each data table comprises one row of data blocks; the data block is used for caching data; the query request comprises queries of data blocks corresponding to a plurality of basic relations; each basic relation corresponds to a data table and comprises the state of the data block inquired by the corresponding data table; the data blocks are provided with corresponding index blocks, and all the index blocks form an index table.
3. A method for improving cache access hit rate is characterized by comprising the following steps:
s1, establishing a chained query queue through the query storage table module, performing query request enqueuing operation, and sequentially storing query requests in the chained query queue to the query storage table module;
s2, scanning the basic relationship contained in each query request through a query feature extractor, representing the basic relationship as a feature vector, marking the data block to be accessed by the feature vector as 1, marking the data block not to be accessed by the feature vector as 0, and constructing a first bitmap;
s3, scanning an index table, acquiring the access probability of the data block and the access probability of the index block, sequentially selecting dequeued query requests according to the access probabilities of the data block and the index block, and taking the dequeued query requests as candidate query requests;
s4, transmitting the first bitmap corresponding to the candidate query request to the DDQN model;
s5, converting the state of the database buffer pool into a second bitmap, constructing a bitmap state according to the characteristics of the second bitmap, and transmitting the bitmap state to the DDQN model;
s6, selecting candidate query requests to query through the DDQN model according to the first bitmap and the bitmap states, and completing the improvement process of the cache access hit rate.
4. The method according to claim 3, wherein the step S1 of establishing the chained query queue through the query storage table module and performing the query enqueuing operation includes:
s1.1, establishing a chained query queue through a query storage table module, and initializing the query storage table module to be empty;
s1.2, collecting a query request of a user to a chained query queue;
s1.3, setting a subsequent tail pointer r pointing to a queue tail node, and executing enqueuing operation on a query request pointed by the tail pointer r, wherein the tail pointer r points to the next query request;
s1.4, according to the method in the step S1.3, the inquiry requests in the chain inquiry queue are sequentially enqueued;
after the query enqueue is completed, the tail pointer r points to the head of the queue of the enqueued query request for executing the dequeue operation.
5. The method of claim 4, wherein the step S3 specifically comprises:
s3.1, scanning an index table to perform index operation according to the basic relationship corresponding to the query request, and selecting a row related to the basic relationship;
s3.2, taking the ratio of the selected row in all the rows as the access probability of the data block corresponding to the basic relationship;
s3.3, taking the ratio of the index blocks related to the basic relationship in all the index blocks as the access probability of the index blocks corresponding to the basic relationship;
s3.4, traversing all query requests in the query storage table module, and acquiring the data block access probability and the index block access probability corresponding to each query request;
s3.5, dequeuing the query request with the maximum access probability of the data block and the index block;
and S3.6, repeating the step S3.5, sequentially selecting the dequeued query requests, and taking the dequeued query requests as candidate query requests.
6. The method according to claim 5, wherein the step S5 of converting the buffer pool status into the second bitmap, and the specific method of constructing the bitmap status by using the characteristics of the second bitmap includes:
s5.1, converting the buffer pool state into a second bitmap, wherein each row of data blocks of the second bitmap corresponds to one basic relation;
s5.2, creating a binary tuple, and denoting the rows and columns of the second bitmap as x and y, where x = 1, 2, ..., X and y = 1, 2, ..., Y, X denoting the total number of rows and Y the total number of columns;
s5.3, representing a data block marked as 1 in the second bitmap as <x, y> = 1, and a data block marked as 0 in the second bitmap as <x, y> = 0;
s5.4, on the basis of step S5.3, constructing the bitmap state State1_{x,y} by using the characteristics of the second bitmap as:
wherein N_x represents the row vector corresponding to the basic relation x, M_x represents the quotient of the row vector corresponding to the basic relation x divided by its number of nonzero cells, and |·| represents the vector modulus operation.
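The quantities named in step S5.4 can be computed as below. Since the formula for State1_{x,y} itself is not reproduced in the text, this sketch only derives N_x, M_x, and the vector modulus |N_x| that the formula is stated to use; the function name is illustrative:

```python
import numpy as np


def bitmap_state_features(second_bitmap):
    """For each basic relation x (one row of the second bitmap), compute:
    N_x  - the row vector,
    M_x  - the row vector divided by its number of nonzero cells (S5.4),
    |N_x| - the vector modulus (Euclidean norm).
    The combination of these into State1_{x,y} is given by a formula
    omitted from the source text.
    """
    features = []
    for row in second_bitmap:
        N_x = np.asarray(row, dtype=float)
        nonzero = np.count_nonzero(N_x)
        M_x = N_x / nonzero if nonzero else N_x  # avoid dividing by zero
        features.append((N_x, M_x, np.linalg.norm(N_x)))
    return features
```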
7. The method according to claim 6, wherein the step S6 specifically comprises:
s6.1, receiving the first bitmap and the bitmap state through the DDQN model;
s6.2, setting the objective of the DDQN model as the maximum cumulative reward R found by the search, and constructing the cache scheduling strategy of the DDQN model as a function Q_π(S_t, A_t), where Q_π(S_t, A_t) represents a deep neural network, S_t represents the state, and A_t represents the action;
s6.3, constructing a Q neural network to be updated and a target Q neural network, and fixing the target Q neural network output as Q_π'(S_{t+1}, π(S_{t+1})) + r_t, where S_{t+1} represents the state of the target Q neural network, π(S_{t+1}) represents the action of the target Q neural network, r_t represents the reward value obtained by executing the query, and π represents the execution function of the Q neural network;
s6.4, fitting Q_π'(S_{t+1}, π(S_{t+1})) + r_t with the Q neural network to be updated, and repeating the training N times;
s6.5, overwriting the parameters of the target Q neural network with the parameters of the Q neural network to be updated after the N training iterations, obtaining the updated target Q neural network as:
wherein Q_π'(S_t, A_t) represents the updated target Q neural network, and argmax_A Q_π(S'_{t+1}, A) represents selecting, through the argmax function of the Q neural network to be updated with the buffer state S'_{t+1} and the actions A passed in, the action A that maximizes the value of Q_π'; the action A represents the first bitmap, and the buffer state S'_{t+1} represents the buffer pool bitmap state;
and S6.6, executing the query request corresponding to the obtained action A, completing the process of improving the cache access hit rate.
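The Double-DQN target update in steps S6.3 through S6.5 can be sketched with the two Q functions represented as tables over a small discrete state and action space. The discount factor gamma and the tabular representation are illustrative assumptions not stated in the claim; the key point shown is the decoupling: the network to be updated selects the action via argmax, while the target network evaluates it:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

# Q neural network to be updated and target Q network, sketched as tables;
# the target starts as a copy of the online network (S6.3)
Q = rng.random((n_states, n_actions))
Q_target = Q.copy()


def double_dqn_target(s_next, r_t, gamma=0.99):
    # S6.5: the network to be updated picks A = argmax_A Q(S'_{t+1}, A),
    # and the target network evaluates that action (Double DQN decoupling)
    a_star = int(np.argmax(Q[s_next]))
    return r_t + gamma * Q_target[s_next, a_star]


def sync_target():
    # S6.5: after N training iterations, overwrite the target network's
    # parameters with those of the network being updated
    global Q_target
    Q_target = Q.copy()
```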
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110392024.6A CN113094368B (en) | 2021-04-13 | 2021-04-13 | System and method for improving cache access hit rate |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113094368A true CN113094368A (en) | 2021-07-09 |
CN113094368B CN113094368B (en) | 2022-08-05 |
Family
ID=76677839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110392024.6A Active CN113094368B (en) | 2021-04-13 | 2021-04-13 | System and method for improving cache access hit rate |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113094368B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103294912A (en) * | 2013-05-23 | 2013-09-11 | 南京邮电大学 | Cache optimization method aiming at mobile equipment and based on predication |
CN104834607A (en) * | 2015-05-19 | 2015-08-12 | 华中科技大学 | Method for improving distributed cache hit rate and reducing solid state disk wear |
CN105740352A (en) * | 2016-01-26 | 2016-07-06 | 华中电网有限公司 | Historical data service system used for smart power grid dispatching control system |
CN107247675A (en) * | 2017-05-31 | 2017-10-13 | 华中科技大学 | A kind of caching system of selection and system based on classification prediction |
CN107832401A (en) * | 2017-11-01 | 2018-03-23 | 郑州云海信息技术有限公司 | Database data access method, system, device and computer-readable recording medium |
US20180307984A1 (en) * | 2017-04-24 | 2018-10-25 | Intel Corporation | Dynamic distributed training of machine learning models |
CN108932288A (en) * | 2018-05-22 | 2018-12-04 | 广东技术师范学院 | A kind of mass small documents caching method based on Hadoop |
CN109831806A (en) * | 2019-03-06 | 2019-05-31 | 西安电子科技大学 | The base station of intensive scene User oriented priority cooperates with caching method |
US20190213393A1 (en) * | 2018-01-10 | 2019-07-11 | International Business Machines Corporation | Automated facial recognition detection |
CN110062357A (en) * | 2019-03-20 | 2019-07-26 | 重庆邮电大学 | A kind of D2D ancillary equipment caching system and caching method based on intensified learning |
CN110245095A (en) * | 2019-06-20 | 2019-09-17 | 华中科技大学 | A kind of solid-state disk cache optimization method and system based on data block map |
CN110290510A (en) * | 2019-05-07 | 2019-09-27 | 天津大学 | Support the edge cooperation caching method under the hierarchical wireless networks of D2D communication |
CN110389909A (en) * | 2018-04-16 | 2019-10-29 | 三星电子株式会社 | Use the system and method for the performance of deep neural network optimization solid state drive |
US20200134420A1 (en) * | 2018-10-25 | 2020-04-30 | Shawn Spooner | Machine-based prediction of visitation caused by viewing |
CN111352419A (en) * | 2020-02-25 | 2020-06-30 | 山东大学 | Path planning method and system for updating experience playback cache based on time sequence difference |
Non-Patent Citations (2)
Title |
---|
JIE YAN ET AL.: "Distributed Edge Caching with Content Recommendation in Fog-RANs Via Deep Reinforcement Learning", 2020 IEEE International Conference on Communications Workshops *
CHEN JINGYI: "Research on Edge Caching Strategies with Social Attributes", China Master's Theses Full-text Database (Information Science and Technology) *
Also Published As
Publication number | Publication date |
---|---|
CN113094368B (en) | 2022-08-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109753751B (en) | MEC random task migration method based on machine learning | |
CN111694656B (en) | Cluster resource scheduling method and system based on multi-agent deep reinforcement learning | |
CN109948029B (en) | Neural network self-adaptive depth Hash image searching method | |
CN109961098B (en) | Training data selection method for machine learning | |
CN104168318A (en) | Resource service system and resource distribution method thereof | |
CN108875955A (en) | Gradient based on parameter server promotes the implementation method and relevant device of decision tree | |
CN109215344B (en) | Method and system for urban road short-time traffic flow prediction | |
WO2021244583A1 (en) | Data cleaning method, apparatus and device, program, and storage medium | |
CN115237825A (en) | Intelligent cache replacement method based on machine learning | |
CN109934336A (en) | Neural network dynamic based on optimum structure search accelerates platform designing method and neural network dynamic to accelerate platform | |
CN114546608B (en) | Task scheduling method based on edge calculation | |
CN113094181A (en) | Multi-task federal learning method and device facing edge equipment | |
CN113987196A (en) | Knowledge graph embedding compression method based on knowledge graph distillation | |
CN114841611A (en) | Method for solving job shop scheduling based on improved ocean predator algorithm | |
CN113094368B (en) | System and method for improving cache access hit rate | |
Liu et al. | MemNet: memory-efficiency guided neural architecture search with augment-trim learning | |
CN116991323A (en) | Solid-state disk cache management method, system, equipment and storage medium | |
CN116663680A (en) | Method for improving fairness of machine learning, electronic equipment and storage medium | |
CN116797850A (en) | Class increment image classification method based on knowledge distillation and consistency regularization | |
CN116776950A (en) | Lifelong learning method based on sample replay and knowledge distillation | |
CN113657446A (en) | Processing method, system and storage medium of multi-label emotion classification model | |
CN110532071A (en) | A kind of more application schedules system and method based on GPU | |
CN113642701A (en) | Model and sample dual active selection method based on truncation importance sampling | |
CN110533176A (en) | Buffer storage and its associated computing platform for neural computing | |
AU2021105155A4 (en) | Task-based sampling network for Point Cloud Classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||