CN116382588A - LSM-Tree storage engine read amplification problem optimization method based on learning index - Google Patents

LSM-Tree storage engine read amplification problem optimization method based on learning index

Info

Publication number
CN116382588A
CN116382588A CN202310394141.5A CN202310394141A CN116382588A CN 116382588 A CN116382588 A CN 116382588A CN 202310394141 A CN202310394141 A CN 202310394141A CN 116382588 A CN116382588 A CN 116382588A
Authority
CN
China
Prior art keywords
sstable
plr
model
index
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310394141.5A
Other languages
Chinese (zh)
Inventor
毛誉
陆鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202310394141.5A priority Critical patent/CN116382588A/en
Publication of CN116382588A publication Critical patent/CN116382588A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 - Improving I/O performance
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 - Indexing; Data structures therefor; Storage structures
    • G06F16/2228 - Indexing structures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 - Interfaces specially adapted for storage systems
    • G06F3/0668 - Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 - In-line storage system
    • G06F3/0673 - Single storage device
    • G06F3/0674 - Disk device
    • G06F3/0676 - Magnetic disk device
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of storage indexes and provides a learning-index-based method for optimizing the read amplification problem of LSM-Tree storage engines. The invention uses key-value separation to improve the arrangement and organization of SSTable file records so that learned indexes can be applied to them: the sparse index of each SSTable file is replaced by an SSTable learning index, and a hierarchical learning index accelerates the retrieval of SSTable metadata. In both learning indexes, an error-bounded PLR model serves as the bottom-level model, and a recursive structure is used to represent complex data distributions. A buffer area is set up in the hierarchical learning index so that it supports dynamic insertion and achieves better performance. In summary, the method fundamentally reduces the number of SSTable file reads and the metadata query range and effectively reduces query cost, thereby alleviating the read amplification problem of the LSM-Tree.

Description

LSM-Tree storage engine read amplification problem optimization method based on learning index
Technical Field
The invention belongs to the technical field of storage indexes, relates to storage index technology combined with machine learning, and in particular provides a learning-index-based method for optimizing the read amplification problem of LSM-Tree storage engines.
Background
Index technology is a data-engine technique used to accelerate data access in database or storage systems and plays a vital role in improving the access performance of stored files. In the big-data era of rapidly growing storage volumes, traditional indexes suffer from problems such as large storage-space cost and low multi-layer retrieval performance.
In recent years, LSM-Tree (Log-Structured Merge Tree) storage engines, represented by LevelDB and RocksDB, have been widely applied in storage systems thanks to their good write performance. LSM-Tree storage engines suit write-intensive workloads, but suffer from read amplification. Read amplification means that retrieving data requires searching layer by layer, causing extra disk I/O operations; the phenomenon is especially pronounced for range queries. When a sparse index is used inside the SSTables of an LSM-Tree storage engine, the upper-layer data queries involve excessive, useless I/O processing. The main sources of the read amplification problem are therefore: 1) during the multi-layer search over SSTable metadata, binary search accesses irrelevant SSTable files and causes redundant disk reads; 2) while searching for a key inside an SSTable file, several disk blocks that do not contain the queried key are read. How to solve the read amplification problem of the LSM-Tree reasonably has become a popular research direction. A learned index is an indexing method that uses machine-learning techniques: it learns the data distribution and query-load characteristics of a storage system and generates a key-to-value index mapping function for data query and retrieval, thereby reducing the space cost of the index and improving query performance. Most of the persisted content of an LSM-Tree does not change in the short term, which matches the build-once characteristic of learned indexes well; how to apply learned indexes inside an LSM-Tree, optimize the storage engine, and solve the read amplification problem is the research focus of this invention.
Disclosure of Invention
Aiming at the read amplification problem of LSM-Tree storage engines, the invention provides a learning-index-based optimization method. The learned index is applied to the metadata retrieval of the LSM-Tree and to lookups inside SSTable files, so that the number of SSTable file reads and the metadata query range are fundamentally reduced, the query cost is lowered, and the read amplification problem of the LSM-Tree is alleviated.
In order to achieve the above purpose, the invention adopts the following technical scheme:
the LSM-Tree storage engine read amplification problem optimization method based on the learning index is characterized by comprising the following steps of:
constructing an SSTable file, wherein the SSTable file comprises: a file header, an index block, a bloom filter block and file data blocks; one entry in a file data block of the SSTable file is <key><value>, where key is an integer-type key and value is the user storage value; the key-value pair data (key, EntryIndex) are put into the Entry directory of the file data block, where EntryIndex is the serial number of the entry;
constructing an index block of the SSTable file, wherein the index block adopts an SSTable learning index, which is a recursive PLR model; the recursive PLR model is a multi-layer model, and each model node consists of a PLR model whose number of root segments is 1;
constructing a hierarchical learning index, wherein hierarchy changes are organized in units of Version and multiple Versions are organized as a doubly linked list; each Version records the metadata of the SSTable files of each layer, the metadata of an SSTable file comprising its serial number, size, maximum key and minimum key; a Version is stored as a Buffer Level and multiple LearnedLevels, each LearnedLevel is equipped with a recursive PLR model, and all recursive PLR models together make up the hierarchical learning index.
Further, the Buffer Level in a Version stores newly inserted keys; the sizes of the Buffer Level and the LearnedLevels increase in order, and they are recursively merged in order. A newly inserted key is first inserted into the Buffer Level; when the Buffer Level is full, it is recursively merged with the lower LearnedLevel, and a recursive PLR model is constructed over the merged data.
Further, in both the SSTable learning index and the hierarchical learning index, the recursive PLR model is constructed as follows: taking the key-value data (key, EntryIndex) or the metadata of the SSTable files as the original data set, construct the bottom-layer PLR model according to the PLR algorithm with the error bound delta; check whether the number of nodes of the current layer's PLR model is 1; if not, take the first key value of each model of the current layer as a new data set and construct a PLR model again with the error bound delta'; output the recursive PLR model once the number of PLR model nodes at the top layer is 1.
Based on the technical scheme, the invention has the beneficial effects that:
the invention provides an optimization method for the read amplification problem of an LSM-Tree storage engine based on a learning index, which is used for optimally solving the read amplification problem of the LSM-Tree storage engine and has the following advantages:
1) The invention improves the arrangement and organization of SSTable file records through key-value separation; on this basis, the key-value entries in a disk block are fixed-size and tightly packed, so learned indexes can be applied. The learning index is serialized together with the SSTable, which reduces the number of actual disk reads and yields faster SSTable lookup performance;
2) According to the invention, the sparse index of the SSTable file is replaced by the SSTable learning index, and the hierarchical learning index is utilized to accelerate the retrieval of SSTable metadata;
3) In both learning indexes, an error-bounded PLR model is used as the bottom-level model, and a recursive structure lets a simple model represent a complex data distribution, yielding the recursive PLR model, i.e., a recursive learning index;
4) In the hierarchical learning index, the buffer area above the recursive learning index enables the learning index to support dynamic insertion; a model is rebuilt only when the buffer space is full, which achieves better performance. On insertion, the index checks whether the buffer is full; if so, the buffer is merged with the lower-layer PLR model and retrained into a new PLR model, otherwise no PLR model is rebuilt. On prediction, the buffer is queried first and then the data of the constructed PLR model, realizing dynamic insertion for the hierarchical index;
5) The PLR model of the invention is based on the optimal PLR algorithm, which fits the data stream into line segments within the maximum error bound while guaranteeing the minimum number of segments;
6) Because SSTable files are persistent files on disk, the learning index needs to support serialization onto disk; when the SSTable learning index is serialized, all PLR models of each layer are converted into a byte stream (see the sketch below).
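As an illustration of item 6), the following is a minimal sketch of how the segments of each PLR layer could be flattened into one byte stream and stored alongside the SSTable; the layout and the names PlrSegment and SerializeRecursivePlr are assumptions made for exposition, not the invention's exact on-disk format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative segment representation: slope, intercept, and the first key
// covered by the segment.
struct PlrSegment {
    double slope;
    double intercept;
    uint64_t start_key;
};

// Flatten every layer of a recursive PLR model into one byte stream.
// Assumed layout: [num_layers][per layer: num_segments][segments...].
std::string SerializeRecursivePlr(const std::vector<std::vector<PlrSegment>>& layers) {
    std::string out;
    auto put = [&out](const void* p, size_t n) {
        out.append(reinterpret_cast<const char*>(p), n);
    };
    uint32_t num_layers = static_cast<uint32_t>(layers.size());
    put(&num_layers, sizeof(num_layers));
    for (const auto& layer : layers) {
        uint32_t num_segments = static_cast<uint32_t>(layer.size());
        put(&num_segments, sizeof(num_segments));
        for (const auto& seg : layer) {
            put(&seg.slope, sizeof(seg.slope));
            put(&seg.intercept, sizeof(seg.intercept));
            put(&seg.start_key, sizeof(seg.start_key));
        }
    }
    return out;  // written into the SSTable's learned-index block
}
```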
Drawings
FIG. 1 is a schematic diagram of a learning index based LSM-Tree storage engine in accordance with the present invention.
Fig. 2 is a schematic diagram of SSTable files in the present invention.
Fig. 3 is a schematic diagram of a lookup flow of SSTable learning indexes in the present invention.
Fig. 4 is a schematic diagram of a search flow of a hierarchical learning index according to the present invention.
Fig. 5 is a schematic diagram of a recursive PLR index in the present invention.
Fig. 6 is a schematic diagram of a construction flow of a conventional PLR model.
Fig. 7 is a schematic diagram of a construction flow of a recursive PLR index in the present invention.
FIG. 8 is a diagram of a hierarchical learning index according to the present invention.
Fig. 9 is a schematic diagram of FileMeta (metadata) of Version in the present invention.
Fig. 10 is a schematic diagram of a flow chart of updating a hierarchical learning index according to the present invention.
Fig. 11 is a schematic diagram of a deletion flow of a hierarchical learning index according to the present invention.
Fig. 12 is a graph of the results of performance testing of SSTable learning index in the present invention.
FIG. 13 is a graph of performance test results for a hierarchical learning index according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples.
The invention provides a learning-index-based read-amplification optimization method for LSM-Tree storage engines, aimed at their read amplification problem. As shown in FIG. 1, the learning-index-based LSM-Tree storage engine supports SSTable (Sorted String Table) access and hierarchical index access through the PLR model (maximum-error-bounded piecewise linear representation, i.e., an error-bounded piecewise linear model), thereby achieving better read performance.
In the above LSM-Tree storage engine, an SSTable learning-index block is used in place of the conventional sparse index block in the SSTable file, as shown in FIG. 2. When an SSTable file is constructed, the key-value data (Key, EntryIndex) are put into the Entry directory of the file data block, where EntryIndex is the serial number of the entry; the first constructed entry has serial number 0, and the numbers increase sequentially. The role of the Footer in the SSTable file is to identify the location and size of each block: when an SSTable file is read, the Footer data is read first to obtain the positions and sizes of the IndexBlock (index block) and MetaIndexBlock (bloom filter block), after which the file data blocks can be located and accessed. An illustrative footer sketch follows.
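A minimal sketch of the footer's role described above, under the assumption of a simple fixed-size footer; the names BlockHandle, Footer and ReadFooter and the struct layout are illustrative, not LevelDB's actual format.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Assumed footer layout: the fixed-size footer at the end of the SSTable
// records the offset and size of the learned-index block and the bloom-filter
// block so a reader can locate them before touching any data block.
struct BlockHandle {
    uint64_t offset;  // byte offset of the block within the file
    uint64_t size;    // length of the block in bytes
};

struct Footer {
    BlockHandle index_block;       // learned-index block (replaces the sparse index)
    BlockHandle meta_index_block;  // bloom-filter block
};

// Read the footer first; afterwards the index block and filter block can be
// located and loaded, and through them the file data blocks.
bool ReadFooter(const std::string& path, Footer* footer) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.seekg(-static_cast<std::streamoff>(sizeof(Footer)), std::ios::end);
    in.read(reinterpret_cast<char*>(footer), sizeof(Footer));
    return static_cast<bool>(in);
}
```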
More precisely, as shown in FIG. 3, the SSTable learning index works as follows: starting from the root PLR model of the first layer, the key is used to predict the PLR index of the lower layer, the range of lower-layer PLR models to search, {PLRIndex-delta', PLRIndex+delta'}, is computed from the error bound, and that range is searched by traversal; when the search reaches the PLR model of the last layer, the EntryIndex is obtained together with the entry's error range ApproxPos = {EntryIndex-delta, EntryIndex+delta}. A binary search is then performed over the entries in the ApproxPos range: if the queried key is found, the corresponding value is returned; otherwise NotFound is returned, indicating that the key does not exist in the SSTable file. For an entry_j in the ApproxPos range, its disk block offset EntryOffset_j is computed from EntryIndex_j as EntryOffset_j = EntryIndex_j × EntrySize, where EntrySize is the size of an entry, which realizes direct access to the disk block holding entry_j. A lookup sketch is given below.
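The lookup just described can be sketched as follows; the types SstableLearnedIndex and PlrSegment, the layer layout, and the assumed 32-byte entry size are illustrative, not the exact implementation of the invention.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// One segment of a PLR model: predicts a position from a key.
struct PlrSegment {
    uint64_t start_key;
    double slope;
    double intercept;
    double Predict(uint64_t key) const {
        return slope * static_cast<double>(key) + intercept;
    }
};

// layers[0] is the root layer (a single segment); layers.back() is the bottom
// layer whose prediction is an EntryIndex. delta_prime bounds the error between
// layers, delta bounds the final entry-position error.
struct SstableLearnedIndex {
    std::vector<std::vector<PlrSegment>> layers;
    int64_t delta;
    int64_t delta_prime;

    // Return the approximate entry range {EntryIndex - delta, EntryIndex + delta}.
    std::pair<int64_t, int64_t> ApproxPos(uint64_t key) const {
        int64_t idx = 0;  // segment index in the current layer
        for (size_t level = 0; level + 1 < layers.size(); ++level) {
            int64_t pred = static_cast<int64_t>(layers[level][idx].Predict(key));
            // Traverse the bounded range {PLRIndex - delta', PLRIndex + delta'}
            // in the next layer for the last segment whose start_key <= key.
            const auto& next = layers[level + 1];
            int64_t lo = std::max<int64_t>(0, pred - delta_prime);
            int64_t hi = std::min<int64_t>(static_cast<int64_t>(next.size()) - 1,
                                           pred + delta_prime);
            idx = lo;
            for (int64_t i = lo; i <= hi; ++i) {
                if (next[static_cast<size_t>(i)].start_key <= key) idx = i;
            }
        }
        int64_t entry = static_cast<int64_t>(layers.back()[idx].Predict(key));
        return {entry - delta, entry + delta};
    }
};

// With fixed-size entries the disk offset follows directly from the entry index:
//   EntryOffset_j = EntryIndex_j * EntrySize
constexpr uint64_t kEntrySize = 32;  // assumed: 16-byte key + 16-byte value
inline uint64_t EntryOffset(uint64_t entry_index) { return entry_index * kEntrySize; }
```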
Machine-learning training is performed before the SSTable file is constructed to obtain the index mapping function, i.e., the SSTable learning index, and the corresponding serialization is performed. The SSTable learning index aims to reduce the number of blocks read inside the SSTable file, thereby reducing disk I/O and optimizing the read amplification problem.
In the LSM-Tree storage engine, a hierarchical learning index is applied to the indexing of SSTable metadata to optimize the read amplification problem. Like the SSTable learning index, the hierarchical index still adopts a recursive PLR index, and a Buffer Level is set up to support dynamic data changes. The hierarchical learning index is built over the metadata array; by narrowing the query range over SSTable metadata it reduces the number of SSTable files read, thereby lowering the cost of disk reads and optimizing the read amplification problem.
More specifically, as shown in FIG. 4, the hierarchical learning index is processed with the Merge-PLR algorithm (a recursive PLR algorithm with merging). Merge-PLR is provided with a buffer layer (Buffer Level) to support dynamic data updates: a newly inserted key is inserted into the Buffer Level first, and when the Buffer Level is full it is merged with the lower layer, after which a recursive PLR index is constructed over the merged data. During a lookup, the hierarchical learning index searches the buffer first; if the key is not there, the search continues in the lower-layer recursive PLR index, following the same process as the prediction of the SSTable learning index. The hierarchical learning index significantly reduces the number of SSTable files loaded during a lookup and thus optimizes the read amplification problem. A minimal insert/lookup sketch follows.
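A minimal sketch of this Merge-PLR insert/lookup path is given below; the class MergePlrIndex and its members are assumptions for exposition, and an ordered map stands in for the recursive PLR model that the real index rebuilds after a merge.

```cpp
#include <cstdint>
#include <map>

class MergePlrIndex {
public:
    explicit MergePlrIndex(size_t buffer_capacity) : buffer_capacity_(buffer_capacity) {}

    // A new (key -> SSTable file number) pair goes into the Buffer Level first;
    // only when the buffer is full is the lower level merged and retrained.
    void Insert(uint64_t key, uint64_t file_number) {
        buffer_[key] = file_number;
        if (buffer_.size() >= buffer_capacity_) MergeIntoLearnedLevel();
    }

    // Lookup checks the Buffer Level first, then the learned level.
    bool Lookup(uint64_t key, uint64_t* file_number) const {
        auto it = buffer_.find(key);
        if (it != buffer_.end()) { *file_number = it->second; return true; }
        it = learned_.find(key);   // stands in for: predict with the recursive
        if (it == learned_.end())  // PLR model, then search within the error bound
            return false;
        *file_number = it->second;
        return true;
    }

private:
    void MergeIntoLearnedLevel() {
        for (const auto& kv : buffer_) learned_[kv.first] = kv.second;  // newer wins
        buffer_.clear();
        // Here the recursive PLR model would be rebuilt over the merged keys.
    }

    size_t buffer_capacity_;
    std::map<uint64_t, uint64_t> buffer_;   // Buffer Level
    std::map<uint64_t, uint64_t> learned_;  // data behind the recursive PLR index
};
```

The point of this design is that retraining cost is paid only when the Buffer Level overflows, while lookups stay correct because the buffer is always consulted first.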
In terms of working principle:
1. construction of SSTable learning index
Because SSTable files are immutable, constructing a learning index inside an SSTable file is comparatively easy: the in-SSTable learning index remains valid for the entire life cycle of the SSTable file, and because the data volume is small, a simple machine-learning model can be used for training, which reduces the total cost of generating the SSTable file.
The SSTable learning index model mainly replaces the sparse index in SSTable files to achieve better time and space performance. To choose a suitable underlying machine-learning model to replace the index, several conditions must first be met: 1) for performance reasons the index must be constructed and searched in real time, so both construction and lookup must be fast enough; 2) the query error of the index must be bounded: like a B-Tree, the index must be able to navigate to keys within the error range and locate the final data with a "last mile" search; if bounded error cannot be guaranteed, neither the accuracy nor the performance of the whole index can be guaranteed, which would be catastrophic in a real-time system; 3) serialization must be supported.
Regarding these requirements: although the mapping from a data key to its offset is one-to-one, one entry is a tight arrangement of <key_shared_size><key_non_shared_size><value_size><key><value>. When shared key-prefix compression is not disabled, neither the key sizes (<key_shared_size>, <key_non_shared_size>) nor the value size (<value_size>) is fixed, so the learning index cannot directly predict the offset of an entry, because the size of each entry is not fixed; if a wrong position is predicted, the decoded entry is meaningless, and problems such as out-of-bounds memory access may occur.
Since integer keys can be encoded to a fixed length, and the value size is also fixed in the kv-separation implementation, with kv separation one entry is <key><value>, where the key is a fixed 16 bytes, comprising an 8-byte integer-key encoding and an 8-byte internal-key code, and the value is a fixed 16 bytes, containing an 8-byte vlog file number and an 8-byte offset. Every entry now has a fixed size, which greatly benefits the learning index: the learning index can index an entry directly, and the fixed entry size also provides a certain guarantee of regularity in the data distribution. Therefore the learning index is constructed per SSTable file, and the key and serial number of each entry are the best choice for the training data set. A minimal layout sketch is given below.
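Under the kv-separation layout described above, a fixed-size entry could be sketched as follows; the field names are illustrative assumptions rather than the invention's exact record format.

```cpp
#include <cstdint>

// Fixed-size entry under key-value separation: with a constant 32-byte entry
// the learning index can predict an entry number and map it directly to a
// file offset.
struct Entry {
    // 16-byte key: 8-byte integer key encoding + 8-byte internal-key code
    uint64_t user_key;
    uint64_t internal_code;
    // 16-byte value: location of the real value in a vlog file
    uint64_t vlog_file_number;
    uint64_t vlog_offset;
};
static_assert(sizeof(Entry) == 32, "entries must be fixed-size for the learned index");
```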
Representing a time-series data stream with line segments is called a piecewise linear representation (PLR) model. When facing a data set with a very complex distribution, a single-layer PLR model cannot effectively divide the space into smaller sub-ranges and cannot learn the overall shape of the data distribution; the present invention therefore proposes a recursive PLR model with a hierarchical structure.
To build a learning index from the original data and accelerate the lookup process, a recursive PLR model is built on the original data, as shown in FIG. 5. First, the bottom-most segments are obtained by fitting the original data with the PLR algorithm under the error bound delta. After the bottom segments are built, the first key of each bottom segment, together with that key's index number, forms a new data set on which a PLR is built again with the error bound delta'; the newly built segment set becomes the current bottom segment set, and the recursion continues until the number of top-layer segments is 1.
The recursive PLR model is a multi-layer model in which each model node consists of a PLR model. A traditional PLR model consists of several segments, each represented by a slope, a start key, and an intercept; the optimal PLR model requires constructing several segments over consecutive key-value data, and its construction flow is shown in FIG. 6. In this invention, optimal PLR construction is performed on the SSTable key-value pair data: when the fitting slope error between a new point and the current segment exceeds the given bound, the current segment is saved and a new segment is initialized to fit the subsequent data; finally the constructed root segments are output, each root segment becoming a PLR model node. After the bottom-layer PLR model is output, the multi-layer recursive PLR model is built, with the construction flow shown in FIG. 7: the data for the first construction come from the SSTable, and after the bottom-layer PLR model is built it is checked whether the number of PLR model nodes (root segments) is 1; if not, a recursive PLR model continues to be constructed for the current layer, and the final recursive PLR model is output once the number of top-layer PLR model nodes (root segments) is 1 while the error bound is still met. A simplified construction sketch follows.
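A simplified construction sketch is given below. FitPlr uses a greedy "shrinking cone" fit rather than the optimal PLR algorithm referenced above (which guarantees the minimum number of segments); BuildRecursivePlr then refits the first keys of the segments layer by layer until a single root segment remains. All names are illustrative.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

struct Segment {
    uint64_t start_key;
    double slope;
    double intercept;
};

// Greedy error-bounded fit over (key, position) pairs sorted by key: grow a
// segment while a line within +/- delta of every point still exists, otherwise
// close it and start a new one.
std::vector<Segment> FitPlr(const std::vector<std::pair<uint64_t, double>>& data,
                            double delta) {
    std::vector<Segment> segments;
    size_t i = 0;
    while (i < data.size()) {
        uint64_t x0 = data[i].first;
        double y0 = data[i].second;
        double lo = -std::numeric_limits<double>::infinity();
        double hi = std::numeric_limits<double>::infinity();
        size_t j = i + 1;
        for (; j < data.size(); ++j) {
            double dx = static_cast<double>(data[j].first - x0);
            double new_lo = (data[j].second - y0 - delta) / dx;
            double new_hi = (data[j].second - y0 + delta) / dx;
            if (new_lo > hi || new_hi < lo) break;   // no single line fits any more
            lo = std::max(lo, new_lo);
            hi = std::min(hi, new_hi);
        }
        double slope = (lo == -std::numeric_limits<double>::infinity())
                           ? 0.0 : (lo + hi) / 2.0;
        segments.push_back({x0, slope, y0 - slope * static_cast<double>(x0)});
        i = j;
    }
    return segments;
}

// Recursive construction: refit over the first key of each segment until the
// top layer has a single root segment. Returns layers from bottom to top.
std::vector<std::vector<Segment>> BuildRecursivePlr(
    const std::vector<std::pair<uint64_t, double>>& data,
    double delta, double delta_prime) {
    std::vector<std::vector<Segment>> layers;
    layers.push_back(FitPlr(data, delta));
    while (layers.back().size() > 1) {
        std::vector<std::pair<uint64_t, double>> next;
        for (size_t k = 0; k < layers.back().size(); ++k) {
            next.emplace_back(layers.back()[k].start_key, static_cast<double>(k));
        }
        layers.push_back(FitPlr(next, delta_prime));
    }
    return layers;
}
```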
2. Constructing hierarchical learning index
The task of the hierarchical learning index is, given a key, to index the SSTable file that contains it. Since the SSTable files of each layer are constantly changing, the hierarchical learning index must support dynamic insertion and deletion, but it does not need to be persisted and can exist as a pure in-memory index. The level data is highly dynamic but small in volume; in the level index, each SSTable file contributes its minimum key and maximum key as the metadata for learning.
Each SSTable file has metadata recording its number, size, maximum key and minimum key. When searching for the SSTable file that contains a key, a binary search is performed in the metadata of each layer and then a search inside the file; the hierarchical learning index aims to do the same thing. Similar to SSTables, the hierarchy changes in units of Version: multiple Versions are organized as a doubly linked list and are uniformly managed by a VersionSet, which lets LevelDB implement operations such as version rollback, and each Version records the metadata of the SSTable files of each layer. We can therefore construct the learning index per Version, which has the advantage that dynamic insertion does not need to be implemented for the learning index alone: seen from the VersionSet, the function it provides is already dynamic, and the data volume of the SSTable file metadata is small. In particular, after the maximum SSTable file size is adjusted to 64 MB, 1 TB of data generates only about 32,768 data items for learning, far fewer than the data volume used by the learning index inside a single 64 MB SSTable file.
As shown in FIG. 8, a learning index is established for each layer's FileMeta (metadata) in a Version. In the original lookup process, a binary search starts from the FileMeta of layer 0; if nothing is found in layer 0, the search continues at layer 1, exits if the key is found, and otherwise proceeds down to the last layer. After the hierarchical learning index is added, the binary search range within a layer is reduced from the metadata of the whole layer to the error bound provided by the learning index; a sketch of this bounded search is given below.
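The bounded per-level search can be sketched as follows; FileMeta mirrors the metadata fields listed above, while FindFileInLevel and the way pred and delta are obtained from the per-level recursive PLR model are assumptions for exposition.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Per-SSTable metadata kept in a Version.
struct FileMeta {
    uint64_t file_number;
    uint64_t file_size;
    uint64_t smallest_key;
    uint64_t largest_key;
};

// Original path: binary search over the whole level. With the hierarchical
// learning index, the recursive PLR predicts a position `pred` and the search
// is confined to [pred - delta, pred + delta].
int FindFileInLevel(const std::vector<FileMeta>& level, uint64_t key,
                    int64_t pred, int64_t delta) {
    if (level.empty()) return -1;
    int64_t lo = std::max<int64_t>(0, pred - delta);
    int64_t hi = std::min<int64_t>(static_cast<int64_t>(level.size()) - 1, pred + delta);
    while (lo <= hi) {  // bounded binary search over largest_key
        int64_t mid = lo + (hi - lo) / 2;
        if (level[static_cast<size_t>(mid)].largest_key < key) lo = mid + 1;
        else hi = mid - 1;
    }
    if (lo < static_cast<int64_t>(level.size()) &&
        level[static_cast<size_t>(lo)].smallest_key <= key &&
        key <= level[static_cast<size_t>(lo)].largest_key) {
        return static_cast<int>(lo);  // SSTable whose key range covers `key`
    }
    return -1;  // key is not covered by any SSTable in this level
}
```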
The key-value pairs that the hierarchical learning index feeds to the learning index consist of the original keys and the corresponding SSTable file numbers; the keys covered by a fitted segment all lie in the SSTable with the corresponding file number.
The B+ tree achieves dynamic insertion through reserved slots and node splitting: when a node is filled beyond its maximum number of keys, the middle key is promoted to the parent node and the original node is split into two children; the split is applied recursively to the parent until every node satisfies the maximum key count. The hierarchical learning index can take inspiration from the B+ tree's insert-and-split algorithm, but the biggest difference from the B+ tree is that when a new key is inserted, the maximum error between this key and the corresponding SSTable file number on the fitted segment must still be guaranteed.
As shown in FIG. 9, the recursive PLR supports dynamic insertion through a reserved buffer and recursive merging. A Version is stored as a Buffer Level and two LearnedLevels; the Buffer Level stores newly inserted keys, the sizes of the Buffer Level, LearnedLevel 0 and LearnedLevel 1 increase in order, and they are recursively merged in order. A newly inserted key is inserted into the Buffer Level; when the Buffer Level is full, it is recursively merged (Merge) with the lower layer, and a recursive PLR index is constructed over the merged data.
More specifically:
When a compaction ends, some new SSTable files are created and some SSTables are marked as discarded;
for the newly inserted SSTable file, as shown in FIG. 10, the SSTable metadata (FileMeta) in Version will change, if the SSTable metadata (FileMeta) does not exceed Buffer Level, the new Version's hierarchical learning index will not relearn, which will greatly improve the performance of the whole system; only when the Buffer Level is exceeded, triggering recursive merging;
for the deleted SSTable file, the deleting flow of the hierarchical learning index is shown in FIG. 11, the hierarchical learning index searches whether the corresponding key is in the Buffer Level, and if the corresponding key is in the Buffer Level, the corresponding key is directly removed from the Buffer Level; otherwise, add the removal mark in the correspondent LearnedLevel, delete in the next recursion mergence.
The technical effects of the present invention are illustrated with simulation results. The server operating system used in this embodiment is Linux, with the following hardware configuration: processor: Intel Core™ i7-12700H (14 cores, 20 threads); memory: 32 GB; graphics card: GeForce RTX™ 3070 Ti 8 GB; storage: 2 TB NVMe. Four data sets were used in the simulation: Book sales, Wiki encyclopedia, Facebook user IDs, and OSM open map locations; each data set consists of 200 million 64-bit non-repeating unsigned integer key records. On these real data sets, the SSTable learning index was compared with the sparse index: as shown in FIG. 12, the SSTable learning index is 35%-52% faster in lookup latency than the original indexing scheme, and in the performance test on sequential data it is 12% faster. Testing the hierarchical learning index on the real data sets shows, as in FIG. 13, that after the hierarchical learning index is enabled in the LSM-Tree, looking up a key in a real data set takes only 0.582 ms-1.166 ms, and the lookup performance of the hierarchical index improves by 53.77%-196.45%.
While the invention has been described in terms of specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the equivalent or similar purpose, unless expressly stated otherwise; all of the features disclosed, or all of the steps in a method or process, except for mutually exclusive features and/or steps, may be combined in any manner.

Claims (3)

1. The LSM-Tree storage engine read amplification problem optimization method based on the learning index is characterized by comprising the following steps of:
constructing an SSTable file, wherein the SSTable file comprises: a file header, an index block, a bloom filter block and file data blocks; one entry in a file data block of the SSTable file is <key><value>, where key is an integer-type key and value is the user storage value; the key-value pair data (key, EntryIndex) are put into the Entry directory of the file data block, where EntryIndex is the serial number of the entry;
constructing an index block of the SSTable file, wherein the index block adopts an SSTable learning index, which is a recursive PLR model; the recursive PLR model is a multi-layer model, and each model node consists of a PLR model whose number of root segments is 1;
constructing a hierarchical learning index, wherein hierarchy changes are organized in units of Version and multiple Versions are organized as a doubly linked list; each Version records the metadata of the SSTable files of each layer, the metadata of an SSTable file comprising its serial number, size, maximum key and minimum key; a Version is stored as a Buffer Level and multiple LearnedLevels, each LearnedLevel is equipped with a recursive PLR model, and all recursive PLR models together make up the hierarchical learning index.
2. The LSM-Tree storage engine read amplification problem optimization method based on the learning index as set forth in claim 1, wherein the Buffer Level in a Version stores newly inserted keys, the sizes of the Buffer Level and the LearnedLevels increase in order, and they are recursively merged in order; a newly inserted key is first inserted into the Buffer Level, and when the Buffer Level is full it is recursively merged with the lower LearnedLevel, with a recursive PLR model constructed over the merged data.
3. The LSM-Tree storage engine read amplification problem optimization method based on the learning index as set forth in claim 1, wherein the recursive PLR model in the SSTable learning index and the hierarchical learning index is constructed as follows: taking the key-value data (key, EntryIndex) or the metadata of the SSTable files as the original data set, construct the bottom-layer PLR model according to the PLR algorithm with the error bound delta; check whether the number of nodes of the current layer's PLR model is 1; if not, take the first key value of each model of the current layer as a new data set and construct a PLR model again with the error bound delta'; output the recursive PLR model once the number of PLR model nodes at the top layer is 1.
CN202310394141.5A 2023-04-13 2023-04-13 LSM-Tree storage engine read amplification problem optimization method based on learning index Pending CN116382588A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310394141.5A CN116382588A (en) 2023-04-13 2023-04-13 LSM-Tree storage engine read amplification problem optimization method based on learning index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310394141.5A CN116382588A (en) 2023-04-13 2023-04-13 LSM-Tree storage engine read amplification problem optimization method based on learning index

Publications (1)

Publication Number Publication Date
CN116382588A true CN116382588A (en) 2023-07-04

Family

ID=86969184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310394141.5A Pending CN116382588A (en) 2023-04-13 2023-04-13 LSM-Tree storage engine read amplification problem optimization method based on learning index

Country Status (1)

Country Link
CN (1) CN116382588A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117311645A (en) * 2023-11-24 2023-12-29 武汉纺织大学 LSM storage metadata read amplification optimization method
CN117311645B (en) * 2023-11-24 2024-02-06 武汉纺织大学 LSM storage metadata read amplification optimization method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination