CN116881252A - Key value storage method and system based on LSM tree - Google Patents

Key value storage method and system based on LSM tree

Info

Publication number
CN116881252A
CN116881252A
Authority
CN
China
Prior art keywords
data
files
memory
file
hard disk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310833212.7A
Other languages
Chinese (zh)
Inventor
董小社
钟燕
王龙翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202310833212.7A priority Critical patent/CN116881252A/en
Publication of CN116881252A publication Critical patent/CN116881252A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/22 Indexing; Data structures therefor; Storage structures
    • G06F 16/2228 Indexing structures
    • G06F 16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G06F 3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G06F 3/0644 Management of space entities, e.g. partitions, extents, pools
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a key value storage method and system based on an LSM (Log-Structured Merge) tree, comprising three parts: a storage structure in memory, log files on the hard disk, and data files on the hard disk. The in-memory storage structure serves as a cache to support efficient data access; the log files on the hard disk are responsible for failure recovery and data persistence; and the layout of hot and cold data among the data files on the hard disk is dynamically adjusted by the merge process. Through the cooperation of the three parts, hot and cold data are processed differently, which improves the access performance of the LSM tree while alleviating its read amplification and unbalanced read performance.

Description

Key value storage method and system based on LSM tree
Technical Field
The application belongs to the technical field of data storage, and particularly relates to a key value storage method and system based on an LSM tree.
Background
Relational databases face great challenges in data storage applications in the big data era, so new-generation database technologies have been explored to make up for their shortcomings and play a better role in the big data era; this has given rise to NoSQL databases. The LSM tree is an efficient storage structure, and many NoSQL databases, such as LevelDB and RocksDB, are built around it.
The main design idea of the LSM tree is to exploit the fact that the sequential IO performance of a hard disk far exceeds its random IO performance, converting random IO into sequential IO as much as possible to obtain higher read and write performance. When writing, data is organized into sorted form and written to the hard disk; when certain conditions are met, data files at different levels are read sequentially, their data is re-sorted, and the result is written sequentially into new data files at a specific level. When reading, files at different levels are read in turn, and good query performance is obtained with the help of structures such as indexes and Bloom filters.
In scenarios where hot and cold data are accessed very differently, good overall access performance can generally be obtained as long as the hot data is accessed efficiently, even if access to the cold data is comparatively poor. The conventional LSM tree does not treat hot and cold data differently, and how to let the LSM tree obtain better access performance in scenarios with large differences between hot and cold data access is one of the important technical subjects in this field.
Disclosure of Invention
The technical problem to be solved by the application is that the LSM tree stores hot and cold data indiscriminately, so its storage performance cannot be fully realized in scenarios where hot and cold data access differs greatly; to address this, the application provides a key value storage method and system based on an LSM (Log-Structured Merge) tree.
The application adopts the following technical scheme:
a key value storage method based on LSM tree includes the following steps:
s1, storing hot data and expelling cold data by using a memory space;
s2, managing log files in the hard disk, writing the log files in blocks by taking integer multiples of hard disk blocks as units during writing, and accelerating reading speed through indexes in a memory during reading;
s3, based on a multi-level multi-file structure, the data files in the hard disk are sorted through merging operation, and key value storage based on an LSM tree is realized.
Specifically, step S1 specifically includes:
s101, opening up space in a memory to accommodate user data;
s102, counting the information of the data in the storage structure, wherein the information of the data in the storage structure comprises one or more of access times, access time and semantic relations among the data;
s103, sorting the data according to the information counted in the step S102, determining the heat of the data according to the sorting result, and dividing the data into hot data or cold data;
s104, deleting one or more pieces of data in the memory storage structure after the capacity of the memory storage structure is full;
and S105, after the cold data is eliminated in the step S104, the statistical information of the rest data is completely reserved or completely cleared or periodically cleared.
Further, in step S101, the storage structure used for storing data includes one or more of a linked list, a hash table, and a tree.
Further, in step S102, the reference count of the encapsulated data object is counted for the data stored in the memory after encapsulation.
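To make the role of the reference count concrete, the following sketch (in Python) wraps a multi-part value in an object whose reference count is maintained explicitly by the store; the class name, fields and acquire/release protocol are illustrative assumptions rather than the patent's actual data model.

class DataObject:
    """Assumed wrapper for the encapsulated data of step S102: a complex value is
    split into parts and stored behind one object, and the number of live
    references to that object serves as one of the heat signals."""

    def __init__(self, parts):
        self.parts = parts      # e.g. several fields that are accessed independently
        self.refcount = 0       # explicit reference count maintained by the store

    def acquire(self):
        self.refcount += 1      # a reader or writer takes a reference
        return self

    def release(self):
        self.refcount -= 1      # the reference is dropped when the caller is done

A store could then treat objects whose reference count stays high as hot data and prefer to keep them in memory when ranking entries for eviction in steps S103 and S104.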
Specifically, step S2 specifically includes:
S201, temporarily storing data in a memory, wherein the temporarily stored data comprises the key and value fields of the user data, and the operation type;
s202, when the size of data in a memory reaches a preset threshold value, after serializing the data, writing the data into a log file in blocks according to the unit of a hard disk block, and simultaneously updating an index of corresponding data in the memory, wherein the index comprises a key field of the data and an offset of a value field in the log file;
s203, when the size of the log file reaches a preset threshold value, writing the rest data in the memory into the log file, and writing the index of the data into the tail of the log file to form a complete log file;
S204, handing the complete log file over to be managed as part of the data files in the hard disk, and creating a new file to serve as the log file that receives subsequent writes of the data stream.
Further, in step S204, when the condition is satisfied, the data file in the hard disk triggers the merging operation, which is divided into:
the first type of merging, in which the files participating in the merge operation are data files of the level 0 and level 1 layers;
the second type of merging, in which the files participating in the merge operation are data files of the level i (i > 0) and level i+1 layers.
Further, the triggering conditions of the merging operation are as follows:
the number of the data files of a certain layer reaches a preset threshold value; and/or the total size of the data files of a certain layer reaches a preset threshold value; and/or, the total invalid reading times of a certain layer of files reach a preset threshold value; and/or the spatial magnification of a certain layer of files reaches a preset threshold value.
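One possible way to combine these trigger conditions is sketched below in Python; the LevelStats fields and the threshold values are assumptions made only for illustration, and a real system may enable any subset of the four conditions.

from dataclasses import dataclass

@dataclass
class LevelStats:
    file_count: int             # number of data files in the level
    total_bytes: int            # total size of the level's data files
    invalid_reads: int          # reads that searched this level without finding the key
    space_amplification: float  # on-disk size of the level relative to its live data

def should_merge(stats: LevelStats,
                 max_files=8, max_bytes=64 << 20,
                 max_invalid_reads=1000, max_space_amp=1.5) -> bool:
    # Any one of the four conditions from the description is enough to trigger a merge;
    # the thresholds here are placeholders to be tuned for the workload.
    return (stats.file_count >= max_files
            or stats.total_bytes >= max_bytes
            or stats.invalid_reads >= max_invalid_reads
            or stats.space_amplification >= max_space_amp)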
Specifically, the step S3 specifically includes:
S301, selecting the files to be merged, wherein the selection strategies include: round-robin selection, selecting files with a small overlap range, selecting files with low heat, and selecting files with many delete/update marks;
s302, reading data files of two adjacent layers, rearranging the data in the data files to generate new files, and placing the new files in corresponding layers;
s303, reserving the new file generated after merging, and cleaning the invalid old file.
Further, in step S302, for the first type of merging, the files of the level 0 and level 1 layers are read during execution, the data therein is rearranged to generate new files, and the generated new files are placed in the level 1 layer;
and for the second type of merging, the files of the level i and level i+1 layers (i > 0) are read during execution, the data therein is rearranged to generate new files, the files containing hot data are placed in the level i layer, and the files containing cold data are placed in the level i+1 layer.
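The two merge types can be pictured with the following Python sketch, in which a data file is modeled as a sorted list of (key, value, heat) records; that record shape, the heat threshold and the fixed output file size are assumptions for illustration, and resolution of duplicate keys is not modeled.

import heapq

def sorted_merge(files):
    # k-way merge of files that are already sorted by key; a real system would also
    # resolve duplicate keys so that the newest version wins, which is not modeled here.
    return list(heapq.merge(*files, key=lambda record: record[0]))

def split_into_files(records, max_records_per_file=1024):
    # Cut the merged, sorted stream into fixed-size output files.
    return [records[i:i + max_records_per_file]
            for i in range(0, len(records), max_records_per_file)]

def merge_first_type(level0_files, level1_files):
    # S302, first type: level 0 files are time-ordered and may overlap in range,
    # so each is sorted by key first; every output file is placed in level 1.
    sorted_level0 = [sorted(f, key=lambda record: record[0]) for f in level0_files]
    merged = sorted_merge(sorted_level0 + level1_files)
    return {"level 1": split_into_files(merged)}

def merge_second_type(level_i_files, level_i_plus_1_files, hot_threshold):
    # S302, second type: level i and level i+1 (i > 0) files share one format and
    # are sorted; after re-sorting, files built from hot records go back to level i
    # and files built from cold records go to level i+1.
    merged = sorted_merge(level_i_files + level_i_plus_1_files)
    hot = [record for record in merged if record[2] >= hot_threshold]
    cold = [record for record in merged if record[2] < hot_threshold]
    return {"level i": split_into_files(hot), "level i+1": split_into_files(cold)}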
In a second aspect, an embodiment of the present application provides a key value storage system based on an LSM tree, including:
a memory module for storing hot data and expelling cold data by using a memory space;
the log module is used for managing log files in the hard disk, writing the log files in blocks by taking integer multiples of hard disk blocks as units, and accelerating the reading speed through indexes in a memory during reading;
and the data module is used for sorting the data files in the hard disk through merging operation based on a multi-level multi-file structure, so that key value storage based on the LSM tree is realized.
Compared with the prior art, the application has at least the following beneficial effects:
A key value storage method based on an LSM tree simplifies the write flow of the LSM tree by redesigning the management of the memory space, the log files and the data files, while processing hot and cold data differently, so that the read flow of the LSM tree is better suited to application scenarios where hot and cold data access differs greatly, thereby improving the read and write performance of the LSM tree, alleviating the read amplification problem, and improving the balance of the system.
Furthermore, step S1 uses the memory space to distinguish and convert the hot and cold data, and when the space is full, the cold data is evicted, so that the most frequently accessed data resides in the memory for a long time through effective management of the memory space, and relatively good access performance is obtained, so that the overall data access efficiency can be improved.
Further, for data storage, selecting different data structures has different advantages and disadvantages, the space efficiency corresponding to the different data structures is different, the performance of corresponding insert, delete, update and search operations is also different, and in the memory, the storage structure used for storing data is often one or a combination of a linked list, a hash table and a tree, and the balance of time efficiency and space efficiency is achieved by flexibly selecting the data structures.
Furthermore, some data contents are more complex and can be divided into a plurality of parts, the access condition of each part is different, the data can be packaged and then stored in the memory, and for the data stored in the memory after packaging, the reference count of the packaged data object is counted as one of the hot and cold data discrimination criteria, so that the accuracy of the hot and cold data differentiation processing can be improved.
Furthermore, step S2 connects three operations of temporary storage of data in the memory, refreshing of the data from the memory to the hard disk, and maintenance of the index in series, thereby forming a complete and standard log file, the efficiency of writing data is ensured by writing in blocks, and the efficiency of reading data is ensured by maintaining the high-efficiency index.
Furthermore, when the data files in the hard disk meet the conditions, the merging operation is triggered, the merging operation is divided into a first type merging operation and a second type merging operation, the merging operation can rearrange the data written into the hard disk, and the layout of the data in the hard disk is continuously adjusted, so that the layout is beneficial to efficient access of hot data.
Further, the triggering conditions of the merging operation are as follows: the number of the data files of a certain layer reaches a preset threshold value; and/or the total size of the data files of a certain layer reaches a preset threshold value; and/or, the total invalid reading times of a certain layer of files reach a preset threshold value; and/or, the spatial amplification of a certain layer of file reaches a preset threshold, the selection of the triggering condition can be very flexible, and for a user, the selection of the triggering condition is the trade-off between indexes such as read-write performance, read-write amplification and the like.
Further, in step S3, the data files are managed, and the data writing pressure is shared to multiple layers, so that the whole data files are orderly, and the performance of the data file reading operation of each layer is ensured.
Further, for the first type of merging, the files of the level 0 and level 1 layers are read during execution, new files are generated after the data in them is rearranged, and the generated new files are placed in the level 1 layer; for the second type of merging, the files of the level i and level i+1 layers (i > 0) are read during execution, new files are generated after the data is rearranged, the files containing hot data are placed in the level i layer, and the files containing cold data are placed in the level i+1 layer. The difference between the two types of merging lies in the layers of the data files involved. The file formats of the level 0 and level 1 layers differ, and the key ranges of different files in the level 0 layer may overlap, so the newly generated files can only be written into the level 1 layer, which guarantees both the correctness of data reads and the efficiency of data access. The file formats of the level i and level i+1 layers are identical, and the key ranges of files within each of these layers do not overlap, so the files containing hot data can be placed back into the level i layer and the files containing cold data into the level i+1 layer; as merging is repeated, hot data is continuously kept in the upper layers and cold data is pushed to the lower layers, so that frequently accessed data remains cheap to reach.
It will be appreciated that the advantages of the second aspect may be found in the relevant description of the first aspect, and will not be described in detail herein.
In summary, the application improves the overall access performance of the data, effectively relieves the problem of read amplification, and has quite balanced read performance.
The technical scheme of the application is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a schematic diagram of the method of the present application;
FIG. 2 is a schematic diagram of a layout of data files in a hard disk under ideal conditions;
FIG. 3 is a schematic diagram of a first type of merging;
FIG. 4 is a schematic diagram of a second type of merging.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the description of the present application, it will be understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In the present application, the character "/" generally indicates that the front and rear related objects are an or relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe the preset ranges, etc. in the embodiments of the present application, these preset ranges should not be limited to these terms. These terms are only used to distinguish one preset range from another. For example, a first preset range may also be referred to as a second preset range, and similarly, a second preset range may also be referred to as a first preset range without departing from the scope of embodiments of the present application.
Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination" or "in response to detection". Similarly, the phrase "if determined" or "if (stated condition or event) is detected" may be interpreted as "when determined" or "in response to determination" or "when (stated condition or event) is detected" or "in response to detection of (stated condition or event)", depending on the context.
Various structural schematic diagrams according to the disclosed embodiments of the present application are shown in the accompanying drawings. The figures are not drawn to scale, wherein certain details are exaggerated for clarity of presentation and may have been omitted. The shapes of the various regions, layers and their relative sizes, positional relationships shown in the drawings are merely exemplary, may in practice deviate due to manufacturing tolerances or technical limitations, and one skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions as actually required.
The application provides a key value storage method based on an LSM tree, which comprises three parts, namely: the storage structure in the memory is used as a cache to support the efficient access of data on the basis of distinguishing hot data from cold data; the log file in the hard disk is responsible for two functions of fault recovery and data persistence, and meanwhile, an index is established to ensure the reading performance of the log file; the data file in the hard disk is composed of a plurality of files positioned at a plurality of layers, and the layout of hot and cold data is dynamically adjusted by means of a merging process. Through the cooperation of the three parts, the access performance of the LSM tree can be improved.
The application discloses a key value storage method based on an LSM tree, which comprises the following steps:
s1, a memory storage structure is used as a cache to support efficient access of data on the basis of distinguishing hot data from cold data;
s101, opening up a certain capacity space in a memory to accommodate user data, wherein a storage structure which can be used for storing the data in the embodiment comprises one or more of a linked list, a hash table and a tree;
s102, counting information of certain aspects of data in a storage structure, wherein the information comprises one or more of access times, access time and semantic relations among the data, and particularly, for some data with complex contents, the data can be stored into a memory after being packaged, and at the moment, the reference count of an object for packaging the data can be listed as counted information;
s103, sorting the data according to the information counted in the step S102, determining the heat of the data according to the sorting result, and dividing the data into hot data or cold data;
the partitioning of hot and cold data does not have absolute criteria, and the ordering of data according to statistical information has the effect of preferentially evicting the last ranked data from memory when memory space is full.
S104, if the capacity of the storage structure in the memory is full, eliminating the cold data, and deleting one or more pieces of data in the storage structure of the memory;
s105, after the cold data is eliminated in the step S104, the statistical information of the rest data is completely reserved or completely cleared or periodically cleared.
The method realizes the functions of distinguishing hot and cold data and eliminating cold data, selects corresponding hot and cold data distinguishing standards and cold data eliminating strategies according to different scenes, and flexibly adapts to different workloads. The technical effects that hot data reside in the memory as much as possible and cold data are eliminated when the storage structure capacity is full can be achieved, and the hot data have better access performance, so that the overall access performance of the LSM tree is improved.
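As a minimal sketch of steps S101-S105, the following Python class keeps user data in a hash table, counts accesses as its statistic, and evicts the coldest entry when capacity is full; the class and method names, and the use of the access count as the only heat signal, are assumptions for illustration rather than the claimed implementation.

class HotColdCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}            # S101: key -> value (hash table chosen from the options)
        self.access_count = {}    # S102: key -> number of accesses

    def get(self, key):
        if key in self.data:
            self.access_count[key] += 1   # update statistics on every access
            return self.data[key]
        return None                        # miss: the caller falls back to log/data files

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self._evict()                  # S104: capacity full, evict cold data
        self.data[key] = value
        self.access_count[key] = self.access_count.get(key, 0) + 1

    def _evict(self):
        # S103/S104: rank entries by heat and delete the coldest one; a real system
        # may delete several entries or use a richer ranking.
        coldest = min(self.data, key=lambda k: self.access_count[k])
        del self.data[coldest]
        # S105: here the statistics of the evicted key are cleared; keeping the
        # remaining statistics or clearing them periodically are equally valid choices.
        del self.access_count[coldest]

When a lookup misses this cache, the request falls through to the log and data files described in steps S2 and S3.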
S2, a log file in a hard disk is responsible for two functions of fault recovery and data persistence, and an index is built to ensure the reading performance of the log file;
s201, using a small amount of space in a memory for temporarily storing data, wherein the data residing in the memory comprises key and value fields of user data, and the operation type;
s202, when the size of data in the memory reaches a preset threshold value, the data in the memory are serialized and then are blocked and written into a log file according to the unit of a hard disk block, and meanwhile, the index of the corresponding data in the memory is updated, wherein the index comprises the key field of the data and the offset of the value field in the log file;
s203, when the size of the log file reaches a preset threshold value, writing the rest data in the memory into the log file, and writing the index of the data into the tail of the log file to form a complete log file;
S204, handing the complete log file over to be managed as part of the data files in the hard disk, and creating a new file to serve as the log file that receives subsequent writes of the data stream.
It can be understood that by implementing the steps, the writing performance and the reading performance of the log file are effectively ensured, and meanwhile, the functions of fault recovery and data persistence are provided. After the complete log file is created, the log file is managed by the part of the data file in the hard disk, and in the process, the transition speed of the log file to the data file can be adjusted to dynamically balance the reading and writing performance of the LSM tree.
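Steps S201-S204 can be pictured with the following Python sketch, which stages records in memory, flushes them to the log in block-aligned writes, maintains a key-to-value-offset index in memory, and appends the index as a footer when the log file is sealed; the record layout, the 4096-byte block size and the class and method names are assumptions for illustration, not the on-disk format of the patent.

import struct

BLOCK_SIZE = 4096   # assumed hard-disk block size; each flush is padded to a multiple of it

class LogWriter:
    HEADER = struct.Struct("<BII")   # operation type, key length, value length

    def __init__(self, path):
        self.buffer = []             # S201: records staged in memory as (key, record bytes)
        self.buffered_bytes = 0
        self.index = {}              # S202: in-memory index, key -> offset of the value field
        self.file = open(path, "wb")

    def add(self, key: bytes, value: bytes, op: int):
        record = self.HEADER.pack(op, len(key), len(value)) + key + value
        self.buffer.append((key, record))
        self.buffered_bytes += len(record)
        if self.buffered_bytes >= BLOCK_SIZE:      # S202: threshold reached, flush in whole blocks
            self._flush()

    def _flush(self):
        position = self.file.tell()
        payload = b"".join(record for _, record in self.buffer)
        padding = (-len(payload)) % BLOCK_SIZE     # keep writes at integer multiples of the block size
        self.file.write(payload + b"\x00" * padding)
        for key, record in self.buffer:            # S202: remember where each value lives in the log
            self.index[key] = position + self.HEADER.size + len(key)
            position += len(record)
        self.buffer, self.buffered_bytes = [], 0

    def seal(self):
        # S203: write the remaining records, then append the serialized index as the footer,
        # producing a complete log file that can be handed to the data-file layer (S204).
        if self.buffer:
            self._flush()
        footer = b"".join(struct.pack("<I", len(key)) + key + struct.pack("<Q", offset)
                          for key, offset in self.index.items())
        self.file.write(footer + struct.pack("<Q", len(footer)))
        self.file.close()

A reader can then answer a lookup by consulting the in-memory index and seeking directly to the recorded offset, which is what keeps reads of the log file fast.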
There are multiple data files in the hard disk and they are located in multiple different layers; the data files can be considered to occupy n+1 layers, from level 0 to level n;
the files of the level 0 layer are converted directly from complete log files, the data in a level 0 file is arranged in time order, the index of the data is located at the tail of the file, and the data stored in different files of the level 0 layer may overlap in range;
the files of levels 1 through n are ordered in some way (usually lexicographic order), and the data stored in different files at level i (1 ≤ i ≤ n) does not overlap in range.
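Under this layout, a read would consult the levels in order, as in the Python sketch below; the per-file helpers search and key_range are assumed interfaces standing in for the index, Bloom filter and other structures mentioned above.

def lookup(key, memory_cache, level0_files, leveled_files):
    # 1. In-memory storage structure first: hot data is usually answered here.
    value = memory_cache.get(key)
    if value is not None:
        return value
    # 2. Level 0: files may overlap in range, so scan them from newest to oldest
    #    (the list is assumed to be in creation order).
    for f in reversed(level0_files):
        value = f.search(key)            # assumed per-file lookup using its tail index
        if value is not None:
            return value
    # 3. Levels 1..n: ranges within one level do not overlap, so at most one file
    #    per level can contain the key.
    for level in leveled_files:
        for f in level:
            low, high = f.key_range()    # assumed helper returning the file's key range
            if low <= key <= high:
                value = f.search(key)
                if value is not None:
                    return value
                break                    # the key falls in this file's range but is absent here
    return None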
When the data files in the hard disk meet certain conditions, the merging operation is triggered, namely, one or more data files are read, the data in the data files are rearranged, and the data files are written into new data files, wherein the merging operation is always carried out between adjacent layers, and the merging operation can be divided into:
the first type of merging, in which the files participating in the merge operation are data files of the level 0 and level 1 layers;
the second type of merging, in which the files participating in the merge operation are data files of the level i (i > 0) and level i+1 layers.
The triggering conditions of the merging operation include:
the number of the data files of a certain layer reaches a preset threshold value;
and/or the number of the groups of groups,
the total size of the data files of a certain layer reaches a preset threshold value;
and/or the number of the groups of groups,
the total invalid reading times of a certain layer of files reach a preset threshold value;
and/or the number of the groups of groups,
the spatial amplification of a certain layer of files reaches a preset threshold value.
It can be understood that the closer the hierarchy corresponding to the data file in which the data is located is to level 0, the better the access performance.
Referring to fig. 2, the layout of the data files in the hard disk should be: the data are sequentially distributed in the layers from level 0 to level n after being sequenced from hot to cold according to the heat. In order to make the layout of the data files in the hard disk approach to the layout in the ideal situation as much as possible, the application provides an embodiment of the merging operation, which aims to dynamically adjust the layout of the data files in the hard disk so as to obtain better access performance of the hot data.
S3, the data files in the hard disk are composed of a plurality of files positioned in a plurality of layers, and the layout of hot and cold data is dynamically adjusted by means of a merging process.
The embodiment of the merging operation between the data files in the hard disk comprises the following specific steps:
S301, selecting the files to be merged, wherein the selection strategies include: round-robin selection, selecting files with a small overlap range, selecting files with low heat, and selecting files with many delete/update marks;
s302, reading data files of two adjacent layers, rearranging the data in the data files to generate new files, and placing the new files in corresponding layers;
for the first type of merging, please refer to fig. 3: the files of the level 0 and level 1 layers are read during execution, the data therein is rearranged to generate new files, and the generated new files are placed in the level 1 layer;
for the second type of merging, please refer to fig. 4: the files of the level i and level i+1 layers (i > 0) are read during execution, the data therein is rearranged to generate new files, the files containing hot data are placed in the level i layer, and the files containing cold data are placed in the level i+1 layer;
s303, reserving the new file generated after merging, and cleaning the invalid old file.
It will be appreciated that the process of the above described merge operation may be recursive after the trigger condition for the merge operation is selected, i.e. one merge operation may bring about additional merge operations. Meanwhile, although merging operations occur between data files of adjacent layers, a series of merging operations may cause data to move between multiple layers, so that the adjustment effect of the hot and cold data layout does not depend on a single merging operation, but on the overall situation after multiple merging operations occur.
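The cascading effect described above can be pictured with the loop below, which reuses the should_merge and merge sketches given earlier and keeps merging until no level still satisfies a trigger condition; it merges whole levels rather than the individually selected files of step S301, and the bound and threshold values are assumptions for illustration.

def compact_until_stable(levels, stats_of, max_rounds=16):
    # levels[i] is the list of data files at level i; stats_of(files) returns a
    # LevelStats for that level.  Each round performs one merge; the files it writes
    # into the next level may push that level over a threshold in turn, so the loop
    # repeats until no level (other than the last) needs merging, or a bound is hit.
    for _ in range(max_rounds):
        target = next((i for i in range(len(levels) - 1)
                       if should_merge(stats_of(levels[i]))), None)
        if target is None:
            return levels                       # the layout is the net result of all rounds
        if target == 0:
            out = merge_first_type(levels[0], levels[1])
            levels[0], levels[1] = [], out["level 1"]
        else:
            out = merge_second_type(levels[target], levels[target + 1], hot_threshold=2)
            levels[target], levels[target + 1] = out["level i"], out["level i+1"]
    return levels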
In still another embodiment of the present application, a LSM tree-based key value storage system is provided, which can be used to implement the LSM tree-based key value storage method described above, and in particular, the LSM tree-based key value storage system includes a memory module, a log module, and a data module.
The memory module uses the memory space to store hot data and expel cold data;
the log module is used for managing log files in the hard disk, writing is performed according to the integral multiple of the hard disk blocks during writing, and the reading speed is increased through indexes in the memory during reading;
and the data module is used for sorting the data files in the hard disk through merging operation based on a multi-level multi-file structure, so that key value storage based on the LSM tree is realized.
In yet another embodiment of the present application, a terminal device is provided, the terminal device including a processor and a memory, the memory being used to store a computer program, the computer program including program instructions, and the processor being used to execute the program instructions stored in the computer storage medium. The processor may be a central processing unit (Central Processing Unit, CPU), or another general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc.; it is the computational and control core of the terminal, adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement the corresponding method flow or corresponding functions. The processor according to the embodiment of the application can be used for the operation of the key value storage method based on the LSM tree, comprising the following steps:
storing hot data and expelling cold data by using a memory space; managing log files in a hard disk, writing according to blocks which are integral multiples of hard disk blocks during writing, and accelerating reading speed through indexes in a memory during reading; based on a multi-level multi-file structure, the data files in the hard disk are sorted through merging operation, so that key value storage based on an LSM tree is realized.
In a further embodiment of the present application, the present application also provides a storage medium, in particular, a computer readable storage medium (Memory), which is a Memory device in a terminal device, for storing programs and data. It will be appreciated that the computer readable storage medium herein may include both a built-in storage medium in the terminal device and an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also stored in the memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. The computer readable storage medium may be a high-speed RAM Memory or a Non-Volatile Memory (Non-Volatile Memory), such as at least one magnetic disk Memory.
One or more instructions stored in a computer-readable storage medium may be loaded and executed by a processor to implement the respective steps of the above-described embodiments with respect to an LSM tree-based key-value storing method; one or more instructions in a computer-readable storage medium are loaded by a processor and perform the steps of:
storing hot data and expelling cold data by using a memory space; managing log files in a hard disk, writing according to blocks which are integral multiples of hard disk blocks during writing, and accelerating reading speed through indexes in a memory during reading; based on a multi-level multi-file structure, the data files in the hard disk are sorted through merging operation, so that key value storage based on an LSM tree is realized.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Compared with the traditional LSM tree, the throughput of the corresponding system is improved by 6.3%-27.3%, its read amplification is reduced by 13.9%-29.8% under different operation counts, and the throughput curve over different time periods is smoother.
In summary, the key value storage method and system based on the LSM tree have the following effects:
by differentiating the hot and cold data, the hot data is in a memory structure of a memory and has better access performance in the upper data file, the cold data is in the lower data file to complete persistence, and the integral access performance of the data is improved while the advantages of LSM tree sequential IO are reserved.
When the data is read, the hot data is in the storage structure of the memory and the data files at the upper layer, so that the number of the files to be read and the number of the hard disk blocks to be read are reduced, and the problem of reading and amplifying is effectively solved.
When data is read, most of the read requests fall on a small part of hot data, the hot data has better access performance, the response time of most of the read requests is stable in a specific range, and the read performance in different time periods in the execution process of the read requests is quite balanced.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and in part, not described or illustrated in any particular embodiment, reference is made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal and method may be implemented in other manners. For example, the apparatus/terminal embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the computer program may implement the steps of each of the method embodiments described above. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), an electrical carrier wave signal, a telecommunications signal, a software distribution medium, etc., it should be noted that the computer readable medium may contain content that is appropriately increased or decreased according to the requirements of jurisdictions and patent practices, such as in certain jurisdictions, according to the jurisdictions and patent practices, the computer readable medium does not contain electrical carrier wave signals and telecommunications signals.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above is only for illustrating the technical idea of the present application, and the protection scope of the present application is not limited by this, and any modification made on the basis of the technical scheme according to the technical idea of the present application falls within the protection scope of the claims of the present application.

Claims (10)

1. The key value storage method based on the LSM tree is characterized by comprising the following steps of:
s1, storing hot data and expelling cold data by using a memory space;
s2, managing log files in the hard disk, writing the log files in blocks by taking integer multiples of hard disk blocks as units during writing, and accelerating reading speed through indexes in a memory during reading;
s3, based on a multi-level multi-file structure, the data files in the hard disk are sorted through merging operation, and key value storage based on an LSM tree is realized.
2. The LSM tree-based key-value storing method according to claim 1, wherein step S1 is specifically:
s101, opening up space in a memory to accommodate user data;
s102, counting the information of the data in the storage structure, wherein the information of the data in the storage structure comprises one or more of access times, access time and semantic relations among the data;
s103, sorting the data according to the information counted in the step S102, determining the heat of the data according to the sorting result, and dividing the data into hot data or cold data;
s104, deleting one or more pieces of data in the memory storage structure after the capacity of the memory storage structure is full;
and S105, after the cold data is eliminated in the step S104, the statistical information of the rest data is completely reserved or completely cleared or periodically cleared.
3. The LSM tree-based key-value storing method according to claim 2, wherein the storage structure used for storing data in step S101 includes one or more of a linked list, a hash table, and a tree.
4. The LSM tree-based key-value storing method according to claim 2, wherein in step S102, the reference count of the encapsulated data object is counted for the data stored in the memory after encapsulation.
5. The LSM tree-based key-value storing method according to claim 1, wherein step S2 is specifically:
s201, temporarily storing data in a memory, wherein the temporarily stored data comprises key and value fields of user data, and the operation type;
s202, when the size of data in a memory reaches a preset threshold value, after serializing the data, writing the data into a log file in blocks by taking a hard disk block as a unit, and simultaneously updating an index of corresponding data in the memory, wherein the index comprises a key field of the data and an offset of a value field in the log file;
s203, when the size of the log file reaches a preset threshold value, writing the rest data in the memory into the log file, and writing the index of the data into the tail of the log file to form a complete log file;
S204, handing the complete log file over to be managed as part of the data files in the hard disk, and creating a new file to serve as the log file that receives subsequent writes of the data stream.
6. The LSM tree-based key-value storing method according to claim 5, wherein in step S204, the merging operation is triggered when the condition is satisfied by the data file in the hard disk, and is divided into:
the first type of merging, namely the files participating in the merging operation are the data files of the level 0 and level1 layers;
the second type of merge, i.e., files that participate in the merge operation, are level i (i > 0) and level i+1 layers of data files.
7. The LSM tree based key-value storing method of claim 6, wherein the triggering condition of the merging operation is:
the number of the data files of a certain layer reaches a preset threshold value; and/or the total size of the data files of a certain layer reaches a preset threshold value; and/or, the total invalid reading times of a certain layer of files reach a preset threshold value; and/or the spatial magnification of a certain layer of files reaches a preset threshold value.
8. The LSM tree-based key-value storing method according to claim 1, wherein step S3 is specifically:
s301, selecting files to be combined, wherein a selection strategy comprises the following steps: polling to select, select files with small overlapping range, select files with low heat, select files with more deletion/update marks;
s302, reading data files of two adjacent layers, rearranging the data in the data files to generate new files, and placing the new files in corresponding layers;
s303, reserving the new file generated after merging, and cleaning the invalid old file.
9. The LSM tree-based key-value storing method according to claim 8, wherein in step S302, for the first type merging, files of level 0 layer and level1 layer are read during execution, new files are generated after rearranging data therein, and the generated new files are placed in level1 layer;
and for the second type of merging, reading files of a level i layer and a level i+1 layer when executing, i >0, rearranging data in the files to generate a new file, placing the file containing hot data into the level i layer, and placing the file containing cold data into the level i+1 layer.
10. A LSM tree based key-value store system comprising:
a memory module for storing hot data and expelling cold data by using a memory space;
the log module is used for managing log files in the hard disk, writing the log files in blocks by taking integer multiples of hard disk blocks as units, and accelerating the reading speed through indexes in a memory during reading;
and the data module is used for sorting the data files in the hard disk through merging operation based on a multi-level multi-file structure, so that key value storage based on the LSM tree is realized.
CN202310833212.7A 2023-07-07 2023-07-07 Key value storage method and system based on LSM tree Pending CN116881252A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310833212.7A CN116881252A (en) 2023-07-07 2023-07-07 Key value storage method and system based on LSM tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310833212.7A CN116881252A (en) 2023-07-07 2023-07-07 Key value storage method and system based on LSM tree

Publications (1)

Publication Number Publication Date
CN116881252A true CN116881252A (en) 2023-10-13

Family

ID=88267309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310833212.7A Pending CN116881252A (en) 2023-07-07 2023-07-07 Key value storage method and system based on LSM tree

Country Status (1)

Country Link
CN (1) CN116881252A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117493400A (en) * 2024-01-02 2024-02-02 中移(苏州)软件技术有限公司 Data processing method and device and electronic equipment
CN117493400B (en) * 2024-01-02 2024-04-09 中移(苏州)软件技术有限公司 Data processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination