CN112347104A - Column storage layout optimization method based on deep reinforcement learning

Info

Publication number: CN112347104A
Application number: CN202011228158.6A
Authority: CN
Inventors: 覃雄派; 陈跃国; 杜小勇; 赵丽萍
Original assignee: Renmin University of China
Current assignee: Renmin University of China
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2021-02-09
Anticipated expiration: 2040-11-06
Also published as: CN112347104B

Abstract

The invention discloses a column storage layout optimization method based on deep reinforcement learning, which comprises the following steps: receiving a query load; analyzing the query load to generate query features; acquiring characteristic data of the data columns according to the query features; determining the output order of the columns based on a column output order strategy and the characteristic data of the data columns; quantitatively evaluating the output order, wherein the quantitative evaluation strategy is adjusted based on the reward given by the system; and adjusting the output order of the columns according to the quantitative evaluation result. With this method, the model parameters can be continuously adjusted in the direction of reducing disk skip time, the neural network can automatically learn the optimal column order from the characteristic data of the columns, and incremental training is possible, so the column order does not need to be recalculated for each optimization and the calculation cost is greatly reduced.

Description

Column storage layout optimization method based on deep reinforcement learning

Technical Field

The invention relates to the field of computers, in particular to a column storage layout optimization method based on deep reinforcement learning, which is mainly used for carrying out layout optimization on column storage of big data so as to improve data reading performance.

Background

OLAP (Online Analytical Processing) oriented to relational data plays a crucial role in many analysis and decision support applications. In the big data era, many big data analysis systems, such as Hive and Spark SQL, use HDFS (Hadoop Distributed File System) as the underlying storage. A large amount of data is continuously accumulated and stored on HDFS, while the real-time requirements of data analysis keep rising. As a de facto standard for low-cost distributed storage and processing of big data, HDFS provides fault-tolerant, portable, extensible, high-throughput unified data storage for big data analysis systems. Big data analysis systems on HDFS are typically used to support batch and interactive query analysis on massive amounts of data.

In these systems, data tables typically employ column storage formats such as RCFile, ORC, Parquet, and CarbonData. Column storage provides flexible and efficient data encoding and compression and is able to read only the necessary data columns, thereby avoiding unnecessary I/O. However, we have found that the query analysis performance of data on HDFS can be further improved by optimizing the storage layout. When the data columns in one horizontal fragment of an HDFS data block are accessed by a query, multiple disk skip reads are needed, and an optimal column order yields the minimum disk skip cost. The column ordering problem has been shown in academic papers to be NP-hard. Designing an efficient column ordering algorithm that finds a near-optimal column order for a given query load is therefore a challenge. Existing heuristic search is highly random in its optimization and easily falls into suboptimal solutions; moreover, the column order must be recalculated for each optimization, so the calculation cost is high.

Disclosure of Invention

In view of the above, the present invention has been developed to provide a solution that overcomes, or at least partially solves, the above-mentioned problems. Therefore, in one aspect of the present invention, a column storage layout optimization method based on deep reinforcement learning is provided, the method includes:

receiving a query load;

analyzing the query load to generate query features;

acquiring characteristic data of the data column according to the query characteristic;

determining the output sequence of the columns based on the strategy of the output sequence of the columns and the characteristic data of the data columns;

performing quantitative evaluation on the output sequence, wherein the quantitative evaluation strategy is adjusted based on the reward of a system;

and adjusting the output sequence of the columns according to the quantitative evaluation result.

Optionally, the output sequence is quantitatively evaluated according to the disk skipping time.

Optionally, an Actor-Critic algorithm is adopted to implement the deep reinforcement learning strategy for the column output order, and adjusting the quantitative evaluation strategy based on the reward of the system includes adjusting parameters in a Critic neural network according to the reward given by the system.

Optionally, the decision on the output order is made by using a Pointer Net neural network, including mapping from one sequence to another.

Optionally, determining the output order of the columns based on the column output order strategy and the characteristic data of the data columns includes:

using an attention mechanism to obtain the weights between the element at a given position of the output sequence and each position of the input sequence;

and combining the input sequence with the weights to find the input element most strongly related to the current output, and taking that element of the input sequence as the output element.

Optionally, the method further includes uniformly encoding the input query load, which specifically comprises the following steps:

initializing each query in the input query load to a set;

determining a corresponding column access characteristic for each query;

and carrying out binary coding on the elements in the set corresponding to the query according to the column access characteristics.

The technical scheme provided by the application has at least the following technical effects or advantages: the invention realizes a column storage layout optimization method based on deep reinforcement learning which, in experimental comparison with existing heuristic column ordering algorithms, further reduces the disk skip cost. The model parameters can be continuously adjusted in the direction of reducing disk skip time, so that the neural network automatically learns the optimal column order from the characteristic data of the columns. Incremental training is possible: the latest query load is input directly into the model, and the column order does not need to be recalculated for each optimization, which greatly reduces the calculation cost.

The above description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the technical solutions of the present invention and the objects, features, and advantages thereof more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flow chart of a deep reinforcement learning-based column storage layout optimization method proposed by the present invention;

FIG. 2 is a diagram illustrating a skip cost model of a disk in a wide-table-based column storage layout optimization scheme;

FIG. 3 shows an overall framework diagram of the column storage layout optimization based on deep reinforcement learning proposed by the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

In a big data analysis system, I/O is often the main performance bottleneck, and the design and optimization of the storage system are crucial to improving big data analysis performance. In terms of data organization, column storage (e.g., ORC, Parquet) provides flexible and efficient data encoding and compression and can read only the necessary data columns, thereby avoiding unnecessary I/O. Under a column storage layout, how to adjust the physical data storage layout to adapt to a changing query load and system environment is an urgent problem to be solved. The invention aims to design and implement, in a Hadoop environment, a column storage layout optimization method based on deep reinforcement learning, DRL-COA (Deep Reinforcement Learning based Column Ordering Algorithm), and to apply it to adaptive column storage layout optimization. Compared with existing heuristic column ordering algorithms, the method further reduces the disk skip cost.

In the DRL-COA provided by the invention, an Actor-Critic algorithm model is used for reinforcement learning, and the network structure of a Pointer Net is applied to the Actor neural network. The Actor neural network continuously outputs new actions (column orders) according to the initial input, and the Critic neural network evaluates each action according to the "income" (reward) obtained after the action, so that new actions are continuously selected. Because an action is a column order, the choice of the column at each position also matters; this choice is made by using an attention mechanism to obtain the weights between the element at a given output position and each position of the input sequence.

In one aspect of the present invention, a column storage layout optimization method based on deep reinforcement learning is provided, which uses a deep reinforcement learning technique to solve a sequential decision problem of columns, and optimizes a storage layout by training a model, specifically, as shown in fig. 1, the method includes:

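receiving a query load;

analyzing the query load to generate query features;

acquiring characteristic data of the data columns according to the query features;

determining the output order of the columns based on the column output order strategy and the characteristic data of the data columns;

quantitatively evaluating the output order, wherein the quantitative evaluation strategy is adjusted based on the reward of the system;

and adjusting the output order of the columns according to the quantitative evaluation result.

As a preferred embodiment, an Actor-Critic algorithm is adopted to implement the deep reinforcement learning strategy for the column output order, and adjusting the quantitative evaluation strategy based on the reward of the system includes adjusting parameters in the Critic neural network according to the reward given by the system.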

The specific process of implementing the Actor-Critic algorithm is described in detail below. The input of the DRL-COA network model in this scheme can be represented as a 1 × n matrix:

c = [c_1, c_2, ..., c_n]

where c_i represents whether each data column exists in the query (1 represents existence and 0 represents nonexistence) and n represents the number of queries; the input is a mathematical representation of the query load Q. The output of the Actor-Critic model is the order of the data columns, which can be expressed as o (order). The output of each iteration of the model is evaluated using the disk skip time. The process mainly comprises the following steps:

(1) According to the current state, the Actor takes an action, namely outputting a column order;

(2) The Critic scores the Actor's latest action according to the state and the action;

(3) According to the Critic's score, the Actor adjusts its current strategy (i.e., the parameters in the Actor neural network) and executes the next action;

(4) The Critic also adjusts its current scoring strategy (i.e., the parameters in the Critic neural network) according to the reward given by the system;

(5) Initially, the Actor acts randomly and the Critic scores randomly. Because of the reward, however, the Critic's scores become more and more accurate, and the Actor performs better and better.
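Steps (1) to (5) can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the DRL-COA model itself: the Actor is reduced to one learnable score per column (instead of a Pointer Net), the Critic is reduced to a single running estimate of the expected cost (instead of a neural network), and `toy_cost` is a hypothetical stand-in for the disk skip time. All function names and hyperparameters are illustrative.

```python
import math
import random
from typing import List, Sequence, Set, Tuple


def toy_cost(order: Sequence[int], queries: Sequence[Set[int]]) -> float:
    """Hypothetical stand-in for disk skip time: for each query, count the
    unread columns lying between the first and last column it reads."""
    total = 0
    for accessed in queries:
        positions = [i for i, col in enumerate(order) if col in accessed]
        if positions:
            total += (positions[-1] - positions[0] + 1) - len(positions)
    return float(total)


def sample_order(logits: List[float], rng: random.Random) -> Tuple[List[int], List[float]]:
    """Actor: sample a column order without replacement and return it
    together with the gradient of log p(order) w.r.t. the logits."""
    remaining = list(range(len(logits)))
    order: List[int] = []
    grad = [0.0] * len(logits)
    while remaining:
        top = max(logits[j] for j in remaining)
        weights = [math.exp(logits[j] - top) for j in remaining]
        norm = sum(weights)
        for j, w in zip(remaining, weights):
            grad[j] -= w / norm          # minus the selection probability
        pick = rng.choices(remaining, weights=weights)[0]
        grad[pick] += 1.0                # plus one for the chosen column
        order.append(pick)
        remaining.remove(pick)
    return order, grad


def train(n_columns: int, queries: Sequence[Set[int]], steps: int = 500,
          actor_lr: float = 0.05, critic_lr: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    logits = [0.0] * n_columns           # Actor parameters (assumed form)
    baseline = 0.0                       # Critic: expected-cost estimate
    for _ in range(steps):
        order, grad = sample_order(logits, rng)        # step (1)
        cost = toy_cost(order, queries)                # system reward = -cost
        advantage = baseline - cost                    # step (2): Critic's score
        logits = [t + actor_lr * advantage * g         # step (3): Actor update
                  for t, g in zip(logits, grad)]
        baseline += critic_lr * (cost - baseline)      # step (4): Critic update
    best = sorted(range(n_columns), key=lambda j: -logits[j])
    return best, baseline


order, baseline = train(4, [{0, 1}, {2, 3}])
```

The returned `order` is always a permutation of the column indices; whether it is a good one depends on the toy cost, the learning rates, and the number of steps, none of which come from the patent.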

In this method, the output order is preferably evaluated quantitatively according to the disk skip time. The stochastic policy of the network model in this scheme can be expressed as p_θ(o | c): when the input is c (column) and the output is o (order), the model evaluation SC(o | c) is the corresponding disk skip time. The goal of model training is that the smaller the value of SC(o | c), the greater the probability that the output o is selected. Because the output of each iteration of the model is evaluated by disk skip time during training, the model parameters are continuously adjusted in the direction of reducing disk skip time throughout the training process.
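One common way to formalize this training goal is the policy gradient with a learned baseline. The patent does not state these formulas; they are an assumed, standard Actor-Critic formulation written with the symbols above, where b_φ(c) is a hypothetical name for the Critic's estimate of the expected skip time for input c.

```latex
% Objective: expected disk skip time of orders sampled from the policy
J(\theta \mid c) = \mathbb{E}_{o \sim p_\theta(\cdot \mid c)} \left[ SC(o \mid c) \right]

% Actor update direction: orders with below-baseline cost become more likely
\nabla_\theta J(\theta \mid c) =
  \mathbb{E}_{o \sim p_\theta(\cdot \mid c)}
  \left[ \left( SC(o \mid c) - b_\phi(c) \right)
         \nabla_\theta \log p_\theta(o \mid c) \right]

% Critic loss: the baseline is regressed onto the observed skip time
L(\phi) = \mathbb{E}_{o \sim p_\theta(\cdot \mid c)}
  \left[ \left( b_\phi(c) - SC(o \mid c) \right)^2 \right]
```

Descending along the gradient of J lowers the probability of orders whose skip time exceeds the baseline and raises it for orders below the baseline, which matches the stated goal that a smaller SC(o | c) should mean a greater selection probability.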

FIG. 2 is a diagram showing the disk skip cost model in the wide-table-based column storage layout optimization scheme. In that scheme, a disk skip read time cost model is designed around the characteristics of data access on a traditional magnetic disk: multiple skip reads of equal distance are executed over a series of HDFS files, and the average skip read time at skip distance d is taken, in a statistical sense, as the skip cost corresponding to d. After the skip costs at different skip distances are obtained, a piecewise skip cost function is constructed by linear fitting. FIG. 2 shows the skip cost functions obtained by this method on three different types of disks.

This patent adopts the Actor-Critic algorithm to implement deep reinforcement learning. The algorithm is commonly applied in game data processing, where it can handle discrete values and perform single-step updates; adopting the Actor-Critic deep learning algorithm in this application likewise enables discrete-value processing and single-step updating.
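A piecewise skip cost function of this kind can be sketched as follows. The sample points are invented for illustration and are not measurements from the patent or from FIG. 2. The sketch interpolates linearly between measured points rather than fitting each segment by least squares. The rule used in `query_skip_cost` (one skip per run of unread columns between two columns a query reads) is an assumption about how the cost would be applied to a column order.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical measurements: (skip distance in bytes, average skip time in ms).
SAMPLES: List[Tuple[float, float]] = [
    (0, 0.0),
    (1_000_000, 2.0),
    (10_000_000, 5.0),
    (100_000_000, 8.0),
]


def skip_cost(distance: float, samples: Sequence[Tuple[float, float]] = SAMPLES) -> float:
    """Piecewise-linear skip cost: interpolate between neighbouring samples
    and clamp outside the measured range."""
    xs = [d for d, _ in samples]
    if distance <= xs[0]:
        return samples[0][1]
    if distance >= xs[-1]:
        return samples[-1][1]
    i = bisect_right(xs, distance)       # first sample strictly beyond distance
    (x0, y0), (x1, y1) = samples[i - 1], samples[i]
    return y0 + (y1 - y0) * (distance - x0) / (x1 - x0)


def query_skip_cost(order: Sequence[str], sizes: Dict[str, int], accessed: Set[str]) -> float:
    """Sum the skip costs a single query pays under a given column order.
    Assumption: unread columns before the first read column cost nothing."""
    total, gap, started = 0.0, 0, False
    for col in order:
        if col in accessed:
            if started and gap > 0:
                total += skip_cost(gap)  # skip over the unread run
            started, gap = True, 0
        else:
            gap += sizes[col]            # unread bytes accumulate into the gap
    return total


sizes = {"a": 500_000, "b": 500_000, "c": 500_000}
cost_interleaved = query_skip_cost(["a", "b", "c"], sizes, {"a", "c"})
cost_adjacent = query_skip_cost(["a", "c", "b"], sizes, {"a", "c"})
```

With these invented numbers, placing the two accessed columns next to each other removes the only skip, which is the effect a good column order is meant to achieve.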

FIG. 3 is an overall framework diagram of the deep reinforcement learning-based column storage layout optimization proposed by the present invention. The figure shows the components of the DRL-COA model, from the collection and analysis of the query load to the feature input and training of the model. An intelligent agent (a deep learning agent) learns the characteristic data of the columns through continuous interaction with the environment, and the Critic neural network evaluates the output of each iteration of the Actor neural network according to a skip estimator component, so that the model parameters are continuously adjusted in the direction of reducing disk skip time. Compared with the random optimization of heuristic search, the DRL-COA model is more directed and less likely to fall into suboptimal solutions. Moreover, as a deep reinforcement learning model, it does not need to recalculate the column order for each optimization, and incremental training greatly reduces the calculation cost.

In this patent, determining the output order of the columns based on the column output order strategy and the characteristic data of the data columns may be implemented as follows.

Preferably, the decision on the output order is made by using a Pointer Net neural network, including mapping from one sequence to another.

The column ordering problem resembles a combinatorial optimization problem: a decision over sequences is needed to adjust the column order. The DRL-COA model adopts a Pointer Net neural network to solve the sequence decision problem in column ordering, that is, the problem of mapping one sequence to another. When the output sequence is calculated, the attention mechanism is used to obtain the weights between the element at a given position of the output sequence and each position of the input sequence, and the input sequence and the weights are then combined to influence the output. In this way, the input element most strongly related to the current output can be found and used as the output element, so that each output element points to an input element like a pointer. The design ensures that each input element is pointed to by only one output element, which prevents input elements from appearing repeatedly.

In the present invention, the input query load samples are encoded prior to model training. This is because each query may access only a subset of the columns, while the output is the set of all columns. The input encoding satisfies the requirement of the Pointer Net network structure that the content of the output sequence be exactly the same as that of the input sequence, with only the order changed. In FIG. 3, C₁, C₂, C₃, C₄, C₅ are the data columns input to the encoder, and <g>, C₄, C₅, C₁, C₂ are the data columns output by the decoder.
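The pointing step with masking can be sketched as below. This is a simplified illustration, not the patent's network: it scores input elements with a plain dot product instead of the learned additive attention of a Pointer Net, uses hand-written embeddings instead of trained encoder states, and picks greedily instead of sampling. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def attention_weights(state: Sequence[float], keys: Sequence[Sequence[float]],
                      used: Sequence[bool]) -> List[float]:
    """Softmax over dot-product scores; already selected inputs are masked
    to -inf so that they receive zero weight."""
    scores = [
        -math.inf if used[j] else sum(s * k for s, k in zip(state, key))
        for j, key in enumerate(keys)
    ]
    top = max(scores)                    # at least one input is still unused
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pointer_decode(embeddings: Sequence[Sequence[float]]) -> List[int]:
    """Emit every input index exactly once, each time pointing at the unused
    input with the largest attention weight."""
    n, dim = len(embeddings), len(embeddings[0])
    used = [False] * n
    # Assumed initial decoder state: the mean of the input embeddings.
    state = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    order: List[int] = []
    for _ in range(n):
        weights = attention_weights(state, embeddings, used)
        pick = max(range(n), key=lambda j: weights[j])
        order.append(pick)
        used[pick] = True                # mask: no input is pointed to twice
        state = list(embeddings[pick])   # next step conditions on the pick
    return order


column_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
order = greedy_pointer_decode(column_embeddings)
```

Because of the mask, `order` is guaranteed to be a permutation of the input positions, which is the property the text describes as each input element being pointed to by only one output element.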

Assume that the load Q contains n queries and that the set of accessed data columns has length N. Each query q is first initialized as a set N' of length N in which every element c_i is 0.

Therefore, uniformly encoding the input query load may specifically include:

initializing each query in the input query load to a set;

determining a corresponding column access characteristic for each query; in particular, for each query q in the load Q (involving only m data columns), its column access characteristic is C_q = {c_q,1, c_q,2, ..., c_q,m}.

binary-coding the elements in the set corresponding to the query according to the column access characteristic; specifically, the positions in N' whose subscripts correspond to the data columns {1, 2, ..., m} accessed by query q are set to 1, and the other positions remain 0 (indicating that query q does not access those columns). Thus, the load is uniformly encoded into patterns of the form {1, 0, ..., 1}.
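The encoding steps above can be sketched as follows. The column names, the example queries, and the function name `encode_workload` are illustrative assumptions; the patent does not prescribe a concrete data structure.

```python
from typing import Dict, List, Sequence


def encode_workload(columns: Sequence[str],
                    queries: Sequence[Sequence[str]]) -> List[List[int]]:
    """Return one 0/1 vector per query, aligned with `columns`."""
    index: Dict[str, int] = {name: i for i, name in enumerate(columns)}
    encoded: List[List[int]] = []
    for accessed in queries:
        vector = [0] * len(columns)      # initialise the set N' to all zeros
        for name in accessed:
            vector[index[name]] = 1      # mark each column the query reads
        encoded.append(vector)
    return encoded


columns = ["c1", "c2", "c3", "c4", "c5"]
queries = [["c1", "c4"], ["c2", "c4", "c5"], ["c4"]]
encoded = encode_workload(columns, queries)
# encoded == [[1, 0, 0, 1, 0], [0, 1, 0, 1, 1], [0, 0, 0, 1, 0]]
```

Every query is mapped to a vector of the same length regardless of how many columns it touches, which is what allows a fixed-shape input to the model.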

The method combines an Actor-Critic deep reinforcement learning algorithm, a Pointer Net neural network, an attention mechanism, and disk skip cost simulation. The input load samples are effectively encoded, the column order is used as the output, and the disk skip cost is used to evaluate the output of each iteration of the model, so that the model parameters are continuously adjusted in the direction of reducing disk skip time. Under this implementation, the neural network automatically learns the optimal column order from the characteristic data of the columns, and the DRL-COA model can be trained incrementally: the latest query load is input directly into the model, and the column order does not need to be recalculated for each optimization, which greatly reduces the calculation cost.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.

Claims

1. A column storage layout optimization method based on deep reinforcement learning is characterized by comprising the following steps:

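receiving a query load;

analyzing the query load to generate query features;

acquiring characteristic data of the data columns according to the query features;

determining the output order of the columns based on the column output order strategy and the characteristic data of the data columns;

quantitatively evaluating the output order, wherein the quantitative evaluation strategy is adjusted based on the reward of the system;

and adjusting the output order of the columns according to the quantitative evaluation result.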

2. The column storage layout optimization method of claim 1, further characterized by quantitatively evaluating the output order according to disk skip time.

3. The column storage layout optimization method according to claim 1, further characterized in that an Actor-Critic algorithm is adopted to implement the deep reinforcement learning strategy for the column output order, and adjusting the quantitative evaluation strategy based on the reward of the system comprises adjusting parameters in a Critic neural network according to the reward given by the system.

4. The column storage layout optimization method of claim 1, further characterized in that the decision on the output order is made by using a Pointer Net neural network, including mapping from one sequence to another.

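5. The method of optimizing column storage layout of claim 1, further characterized in that determining the output order of the columns based on the column output order strategy and the characteristic data of the data columns comprises: using an attention mechanism to obtain the weights between the element at a given position of the output sequence and each position of the input sequence; and combining the input sequence with the weights to find the input element most strongly related to the current output, and taking that element of the input sequence as the output element.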

6. The method of optimizing column storage layout of claim 4, further characterized in that determining the output order of the columns based on the policy of the output order of the columns and the characteristic data of the data columns comprises: the input query load is uniformly coded, specifically: initializing each query in the input query load to a set; determining a corresponding column access characteristic for each query; and carrying out binary coding on the elements in the set corresponding to the query according to the column access characteristics.