CN114780500B - Data storage method, device and equipment based on log merging tree and storage medium

Publication number
CN114780500B
Authority
CN
China
Prior art keywords
data
files
data layer
key
sstable
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210702791.7A
Other languages
Chinese (zh)
Other versions
CN114780500A (en)
Inventor
瞿晓阳
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210702791.7A
Publication of CN114780500A
Application granted
Publication of CN114780500B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of blockchain and discloses a data storage method, device, equipment and storage medium based on a log merge tree. The method comprises the following steps: recording written key-value pair data using a fixed-size sorted string table structure to obtain an SSTable file, wherein the SSTable file comprises the key-value pair data and a key range corresponding to the key-value pair data; arranging a plurality of SSTable files in order of their key ranges to generate fragment files, and storing the fragment files in a disk of the log merge tree; if a data layer of the disk meets a preset merging condition, merging all the fragment files in the data layer into a new fragment file, storing the new fragment file to the next data layer, and deleting all the fragment files in the data layer. In the invention, fixed-size SSTable files are organized into fragments and arranged in order of key range, which keeps the data ordered, reduces data rewriting, and reduces the space occupied by merging, thereby improving the performance of the data storage system.

Description

Data storage method, device and equipment based on log merging tree and storage medium
Technical Field
The invention relates to the technical field of blockchain, and in particular to a data storage method, device, equipment and storage medium based on a log merge tree.
Background
In recent years, key-value (KV) databases have been widely used to store blockchain data, but as data volume and user access grow, the performance of KV databases is put to the test. Mainstream KV databases are based on the log merge tree (Log-Structured Merge-Tree, LSM-Tree) structure. The log merge tree architecture provides an algorithm for deferred updates and batch writes that converts random writes into batch writes, which improves the write performance of the database and better meets the service requirements of high concurrency and high performance.
However, to keep data ordered and maintain good read performance, the log merge tree must continuously perform merge operations in the background. Existing data merging methods have significant defects: they may cause the size of file data to grow exponentially, occupy a large amount of storage space, and frequently rewrite data files, which may seriously affect the performance of the data storage system.
Disclosure of Invention
The invention provides a data storage method, device, equipment and storage medium based on a log merge tree, and aims to solve the problem that existing data merging methods degrade the performance of a data storage system through defects such as occupying a large amount of storage space and frequently rewriting data.
The data storage method based on the log merge tree comprises the following steps:
recording the written key value pair data by adopting a fixed-size sorting character string table structure, and recording the key value pair data as an SSTable file, wherein the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
orderly arranging the sizes of the key ranges of the SSTable files to generate fragment files, and storing the fragment files into a disk of a log merging tree;
determining whether a data layer of the disk meets a preset merging condition;
if the data layer meets the preset merging condition, merging all the fragmented files in the data layer into a new fragmented file;
and storing the new fragment file to the next data layer of the data layers, and deleting all the fragment files in the data layers.
Further, merging all the fragmented files in the data layer into a new fragmented file includes:
determining SSTable files in all the fragment files in the data layer, and recording the SSTable files as files to be merged;
determining whether the key ranges of the files to be merged in the data layer are overlapped;
and if the key ranges of the files to be merged in the data layer are not overlapped, arranging and merging all the files to be merged in the data layer into a new fragment file according to the key range sequence.
Further, after determining whether there is an overlap between the key ranges of the files to be merged in the data layer, the method further includes:
if the key ranges of the files to be merged in the data layer are overlapped, merging and sequencing the files to be merged by adopting a merging and sequencing algorithm to obtain a plurality of new SSTable files, wherein the key ranges of the new SSTable files are not overlapped;
and orderly arranging the plurality of new SSTable files according to the size of the key range to generate new fragment files.
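As a non-limiting sketch of the overlapping case described above, the following Python example merge-sorts several SSTables into new fixed-size SSTables whose key ranges do not overlap. The function name `merge_sort_sstables`, the `max_entries` size limit (counted in entries rather than bytes), and the rule that the newest value wins on duplicate keys are assumptions made for illustration; the text does not specify them.

```python
import heapq
from itertools import islice


def merge_sort_sstables(sstables, max_entries):
    """Merge sorted (key, value) lists into fixed-size, non-overlapping tables.

    `sstables` is ordered oldest to newest. On duplicate keys the newest
    value wins (an assumption; the text does not define duplicate handling).
    """
    # Tag entries with the table's age so equal keys sort oldest first.
    tagged = [
        [(key, age, value) for key, value in table]
        for age, table in enumerate(sstables)
    ]
    merged = {}
    for key, _age, value in heapq.merge(*tagged):
        merged[key] = value  # a newer entry overwrites an older one
    # Keys were inserted in ascending order, so the dict is already sorted.
    items = iter(merged.items())
    new_tables = []
    while chunk := list(islice(items, max_entries)):
        new_tables.append(chunk)
    return new_tables


older = [(1, "a"), (9, "b"), (12, "c")]
newer = [(7, "x"), (9, "y"), (13, "z")]
new_tables = merge_sort_sstables([older, newer], max_entries=2)
key_ranges = [(table[0][0], table[-1][0]) for table in new_tables]
```

Because the output is cut from a single sorted stream, consecutive tables cannot overlap, so they can be placed in one fragment file in key-range order.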
Further, determining whether the data layer of the disk meets a preset merge condition includes:
determining whether the number of the fragmented files in the data layer reaches a preset number;
and if the number of the fragmented files in the data layer reaches the preset number, determining that the data layer meets the preset merging condition.
Further, an index structure is provided in the memory of the log merge tree, and after the plurality of SSTable files are arranged in order of their key ranges to generate the fragment files, the method further comprises:
generating an index key range according to all the fragment files in the data layer;
and updating the index key range serving as an index parameter corresponding to the data layer into an index structure to generate index data, wherein the index data comprises a plurality of index parameters.
Further, after storing the new fragmented file to the next data layer of the data layer and deleting all fragmented files in the data layer, the method further includes:
generating an index key range according to all the fragment files in the next data layer of the data layers, and updating the index key range to an index structure to serve as an index parameter corresponding to the next data layer;
and deleting the index parameter corresponding to the data layer from the index structure.
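As a non-limiting sketch of the index maintenance described above, the following Python example keeps one index key range per data layer and refreshes it after a merge. Representing the index structure as a dictionary from layer number to a `(min_key, max_key)` tuple, and the helper names, are assumptions made for illustration.

```python
def index_key_range(fragment_files):
    """Return the (min_key, max_key) spanning every SSTable in a data layer.

    Each fragment file is modelled as a list of SSTable key ranges.
    """
    key_ranges = [kr for fragment in fragment_files for kr in fragment]
    return (min(lo for lo, _ in key_ranges), max(hi for _, hi in key_ranges))


def update_index_after_merge(index, layers, layer):
    """Index the next layer and drop the entry of the layer that was emptied."""
    index[layer + 1] = index_key_range(layers[layer + 1])
    index.pop(layer, None)


def candidate_layers(index, key):
    """Layers whose index key range could contain `key`."""
    return [layer for layer, (lo, hi) in sorted(index.items()) if lo <= key <= hi]


layers = {0: [[(1, 12), (14, 26)], [(7, 13), (21, 45)]], 1: []}
index = {0: index_key_range(layers[0])}
index_before = dict(index)

# Hypothetical outcome of merging layer 0 into layer 1.
layers[1] = [[(1, 13), (14, 45)]]
layers[0] = []
update_index_after_merge(index, layers, 0)
```

A query can then skip any data layer whose index key range excludes the requested key.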
There is provided a log merge tree based data storage apparatus, comprising:
the recording module is used for recording the written key value pair data by adopting a fixed-size sorting character string table structure and recording the key value pair data as an SSTable file, and the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
the generating module is used for orderly arranging the sizes of the key ranges of the SSTable files to generate fragment files and storing the fragment files into a disk of a log merging tree;
the determining module is used for determining whether the data layer of the disk meets a preset merging condition;
the merging module is used for merging all the fragmented files in the data layer into a new fragmented file if the data layer meets the preset merging condition;
and the deleting module is used for storing the new fragment file to the next data layer of the data layers and deleting all the fragment files in the data layers.
Further, the merging module is specifically configured to:
determining SSTable files in all the fragment files in the data layer, and recording the SSTable files as files to be merged;
determining whether the key ranges of the files to be merged in the data layer are overlapped;
and if the key ranges of the files to be merged in the data layer are not overlapped, arranging and merging all the files to be merged in the data layer into a new fragment file according to the key range sequence.
There is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the log merge tree based data storage method when executing the computer program.
There is provided a computer readable storage medium storing a computer program, wherein the computer program is configured to implement the steps of the log merge-tree based data storage method when executed by a processor.
In one scheme provided by the data storage method, the data storage device, the data storage equipment and the storage medium based on the log merging tree, written key value pair data are recorded by adopting a fixed-size sorting character string table structure and are recorded as SSTable files, the SSTable files comprise the key value pair data and key ranges corresponding to the key value pair data, then the key ranges of a plurality of SSTable files are orderly arranged to generate fragment files, the fragment files are stored in a disk of the log merging tree, whether a data layer of the disk meets a preset merging condition or not is determined, if the data layer meets the preset merging condition, all the fragment files in the data layer are merged into a new fragment file, and finally the new fragment file is stored to the next data layer of the data layer, and all the fragment files in the data layer are deleted; in the invention, SSTable files with fixed size are organized by fragments, a plurality of SSTable files in the fragment files are orderly arranged according to a key range, so that the data ordering is ensured, when the data merging is carried out, the data merging is different from the traditional merging of the data of the current data layer and the next layer, only all the files of the current data layer are merged into a new file to be stored in the next layer, and the files of the current data layer are deleted, so that the data rewriting operation is reduced, the space occupation caused by the merging can be reduced, and the performance of a data storage system is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a schematic diagram of an application environment of a data storage method based on a log merge tree according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data storage method based on a log merge tree according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a structure of a log merge tree according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating an implementation of step S30 in FIG. 2;
FIG. 5 is a flowchart illustrating an implementation of step S40 in FIG. 2;
FIG. 6 is a data merge diagram of data layers in accordance with an embodiment of the present invention;
FIG. 7 is a schematic flow chart of another implementation of step S40 in FIG. 2;
FIG. 8 is a diagram illustrating another data merge of data layers in an embodiment of the invention;
FIG. 9 is a schematic flow chart of a data storage method based on a log merge tree according to an embodiment of the present invention;
FIG. 10 is a schematic flow chart of a data storage method based on a log merge tree according to an embodiment of the present invention;
FIG. 11 is a block diagram of a data storage device based on a log merge tree according to an embodiment of the invention;
FIG. 12 is a block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The data storage method based on the log merge tree provided by the embodiment of the invention can be applied to the application environment shown in fig. 1, wherein the terminal device communicates with the server through the network.
The terminal device sends a storage instruction to the server. After receiving the storage instruction, the server converts metadata into key-value pair data, where the key-value pair data comprises keys and the metadata corresponding to the keys, and each key corresponds to one piece of metadata so that the metadata can later be looked up by key. After obtaining the key-value pair data, the server records the written key-value pair data using a fixed-size Sorted String Table (SSTable) structure to obtain an SSTable file, where the SSTable file comprises the key-value pair data and a key range corresponding to the key-value pair data. The server then arranges a plurality of SSTable files in order of their key ranges to generate fragment files, stores the fragment files in a disk of the log merge tree, and determines whether a data layer of the disk meets a preset merging condition. If the data layer meets the preset merging condition, the server merges all the fragment files in the data layer into a new fragment file, stores the new fragment file to the next data layer, and deletes all the fragment files in the data layer. In the invention, fixed-size SSTable files are organized into fragments, and the SSTable files within a fragment file are arranged in order of key range, which keeps the data ordered. Unlike the traditional approach of merging the data of the current data layer together with the data of the next layer, merging here only combines all the files of the current data layer into a new file stored in the next layer and deletes the files of the current data layer. This reduces data rewriting and the space occupied by merging, thereby improving the performance of the data storage system.
The log merge tree is the Log-Structured Merge-Tree (LSM-Tree) storage structure. The LSM-Tree comprises a memory and a disk, and data on the disk is stored in the SSTable structure. An SSTable is a file sorted by key that stores key-value pairs as strings; it provides a persistent, ordered, immutable mapping from keys to values, where keys and values (the values representing metadata) are byte strings of arbitrary length. An SSTable supports the following operations: looking up the value associated with a key, and traversing all key-value pair data within a specified key range. Internally, each SSTable consists of a series of data blocks, which are located using a block index stored at the end of the SSTable; when the SSTable file is opened, the block index is loaded into memory. A lookup first finds the corresponding block in the in-memory index and then reads the content of that block from the disk.
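As a non-limiting sketch of the block-index lookup described above, the following Python example models an SSTable as a list of sorted blocks and keeps the first key of each block as an in-memory block index. The class name `SSTableReader` and the use of in-memory lists in place of on-disk blocks are assumptions made for illustration.

```python
import bisect


class SSTableReader:
    def __init__(self, blocks):
        # `blocks` stands in for on-disk data blocks; each is a sorted
        # list of (key, value) pairs.
        self.blocks = blocks
        # Block index: first key of every block, held in memory.
        self.block_index = [block[0][0] for block in blocks]

    def lookup(self, key):
        """Find the one block that may hold `key`, then search only that block."""
        i = bisect.bisect_right(self.block_index, key) - 1
        if i < 0:
            return None  # key sorts before the first block
        for k, v in self.blocks[i]:  # stands in for one disk read
            if k == key:
                return v
        return None

    def scan(self, lo, hi):
        """Traverse all pairs whose key lies in the inclusive range [lo, hi]."""
        return [(k, v) for block in self.blocks for k, v in block if lo <= k <= hi]


reader = SSTableReader([[("a", "1"), ("c", "2")], [("m", "3"), ("z", "4")]])
```

Only the block index is consulted in memory; a single block is then read to answer the point lookup.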
The terminal device may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers.
In an embodiment, as shown in fig. 2, a data storage method based on a log merge tree is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
S10: Recording the written key-value pair data using a fixed-size sorted string table structure to obtain an SSTable file.
When key-value pair data is stored to the log merge tree (LSM-Tree), the written key-value pair data is recorded using a fixed-size sorted string table structure to obtain an SSTable file. The key-value pair data comprises keys and the metadata corresponding to the keys. The SSTable file comprises the key-value pair data and a key range corresponding to the key-value pair data, where the key range consists of the maximum key and the minimum key in the key-value pair data and may serve as an index of the SSTable file.
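As a non-limiting sketch of step S10, the following Python example cuts written key-value pairs into fixed-size SSTables and derives each table's key range from its minimum and maximum key. The `SSTable` class, the `write_sstables` helper, and measuring the fixed size in entries (`MAX_ENTRIES`) rather than bytes are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_ENTRIES = 4  # hypothetical fixed size, counted in entries for simplicity


@dataclass(frozen=True)
class SSTable:
    entries: tuple  # (key, value) pairs sorted by key

    @property
    def key_range(self):
        # The key range is the (minimum key, maximum key) of the table.
        return (self.entries[0][0], self.entries[-1][0])


def write_sstables(pairs, max_entries=MAX_ENTRIES):
    """Sort the written pairs by key and cut them into fixed-size SSTables."""
    ordered = sorted(pairs.items())
    return [
        SSTable(tuple(ordered[i:i + max_entries]))
        for i in range(0, len(ordered), max_entries)
    ]


tables = write_sstables({7: "a", 13: "b", 9: "c"}, max_entries=2)
ranges = [table.key_range for table in tables]
```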
S20: and orderly arranging the sizes of the key ranges of the SSTable files to generate fragment files, and storing the fragment files into a disk of a log merging tree.
After the written key-value pair data is recorded as SSTable files using the fixed-size sorted string table structure, a plurality of SSTable files are arranged in order of their key ranges to generate fragment files, and the fragment files are stored in a disk of the log merge tree. That is, a plurality of fixed-size SSTable files are organized into a fragment file (Shard) in which the SSTable files are arranged in order of key range, so the key ranges of different SSTable files in the same fragment file do not overlap. This keeps the data ordered, facilitates subsequent indexing, and reduces read amplification. The disk comprises a plurality of data layers, and the fragment files are stored in the data layers. One data layer may comprise a plurality of fragment files; the key ranges of fragment files in the same data layer may overlap, and there is no ordering relation among different fragment files. This reduces the frequent data rewriting otherwise needed to maintain key order and facilitates data writing.
The information of each SSTable file is recorded in the fragment file, and the information of the SSTable file comprises the information of the size of the SSTable file, key value pair data in the SSTable file, and the key range (including the maximum key and the minimum key) corresponding to the key value pair data.
The log merge tree (LSM-Tree) comprises two structural layers, a memory and a disk. The memory comprises a memory table and/or an index structure (index), the memory table is divided into a MemTable and an Immutable MemTable, and SSTable files are stored on the disk. The complete storage process of the LSM-Tree is as follows: the server writes the key-value pair data into the MemTable in memory; once the MemTable is full, it becomes non-writable, that is, a frozen memory table (Immutable MemTable); the key-value pair data of the frozen memory table is then written to the disk, producing fixed-size SSTable files. In this embodiment, a plurality of fixed-size SSTable files are then organized into fragment files (Shards), and each data layer comprises a plurality of fragment files. For example, as shown in fig. 3, data layer i of the disk comprises two fragment files, each of which comprises two SSTable files; the left side of each SSTable file shows the minimum key of its key range and the right side shows the maximum key. In fig. 3, the key ranges of the four SSTable files are (1, 12), (14, 26), (7, 13) and (21, 45). Within each fragment file the SSTable files are arranged in order of key range and their key ranges do not overlap.
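As a non-limiting sketch of step S20, the following Python example orders SSTable key ranges within a fragment file and checks that they do not overlap, using the key ranges given for fig. 3. Modelling an SSTable by its `(min_key, max_key)` tuple, and the helper names, are assumptions made for illustration.

```python
def build_fragment(key_ranges):
    """Order SSTable key ranges by minimum key and verify they do not overlap."""
    ordered = sorted(key_ranges)
    for (_, prev_max), (next_min, _) in zip(ordered, ordered[1:]):
        if next_min <= prev_max:
            raise ValueError("SSTables in one fragment file must not overlap")
    return ordered


def ranges_overlap(a, b):
    """True when two inclusive key ranges share at least one key."""
    return a[0] <= b[1] and b[0] <= a[1]


# Key ranges from the fig. 3 example: two fragment files in one data layer.
shard_a = build_fragment([(14, 26), (1, 12)])
shard_b = build_fragment([(7, 13), (21, 45)])
data_layer = [shard_a, shard_b]
```

Within each fragment file the ranges are disjoint, while ranges in different fragment files of the same data layer, such as (1, 12) and (7, 13), are allowed to overlap.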
S30: and determining whether the data layer of the disk meets a preset merging condition.
After the fragment files are generated and stored in the disk of the log merge tree, it is determined whether each data layer in the disk meets a preset merging condition. If a data layer does not meet the preset merging condition, the data layer is not yet full and does not need to be merged, and fragment files can continue to be written to it.
S40: and if the data layer meets the preset merging condition, merging all the fragment files in the data layer into a new fragment file.
If a data layer meets the preset merging condition, the data layer is full and needs to be merged so that it can be emptied to receive subsequent writes; all the fragment files in the data layer are therefore merged into a new fragment file.
S50: and storing the new fragment file to the next data layer of the data layers, and deleting all the fragment files in the data layers.
After all the fragmented files in the data layer are merged into a new fragmented file, the new fragmented file is stored to the next data layer of the data layer, and all the fragmented files in the data layer are deleted, so that data can be written into the data layer subsequently.
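As a non-limiting sketch of steps S30 to S50, the following Python example merges a full data layer into one new fragment file, appends it to the next data layer, and empties the current layer. The function name `compact_layer`, the list-of-lists layout, and the pluggable `merge` callback are assumptions made for illustration; the callback used here simply concatenates non-overlapping key ranges.

```python
def compact_layer(layers, level, preset_number, merge):
    """Merge data layer `level` into the next layer if it is full.

    `layers[level]` is a list of fragment files; `merge` turns a list of
    fragment files into one new fragment file.
    """
    if len(layers[level]) < preset_number:   # S30: condition not met
        return False
    new_fragment = merge(layers[level])      # S40: merge this layer only
    if level + 1 == len(layers):
        layers.append([])
    layers[level + 1].append(new_fragment)   # S50: store to the next layer
    layers[level] = []                       # S50: delete the merged files
    return True


def concat_by_key_range(fragments):
    """Assumes the SSTable key ranges do not overlap."""
    return sorted(kr for fragment in fragments for kr in fragment)


layers = [[[(7, 13)], [(21, 45)]], [[(1, 5)]]]
merged = compact_layer(layers, 0, preset_number=2, merge=concat_by_key_range)
```

The fragment file already present in the next layer is left untouched, which is the point of merging only the current layer.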
It should be understood that traditional LSM-Tree merging comes in two types, size-tiered merging (Tiering) and leveled merging (Leveling). Tiering allows the key ranges of the SSTable files in each layer to overlap, so data need not be rewritten frequently to maintain key order, but the exponentially growing sorted SSTable files occupy a large amount of storage space during merging, and redundant data reduces indexing performance. Leveling solves the problem of occupying a large amount of storage space, but to keep the keys in the SSTable files ordered, the SSTable files must be rewritten frequently, which causes a serious write amplification problem. In this embodiment, fixed-size SSTable files are organized into fragments, and the SSTable files within a fragment file are arranged in order of key, which preserves key order and keeps the data ordered for subsequent index queries. When data is merged, only the files of the current data layer are merged into a new file stored in the next layer, and the files of the current data layer are deleted.
In this embodiment, a fixed-size sorting string table structure is used to record written key value pair data, which is recorded as an SSTable file, where the SSTable file includes key value pair data and key ranges corresponding to the key value pair data, then the key ranges of a plurality of SSTable files are arranged in order to generate fragment files, the fragment files are stored in a disk of a log merge tree, it is determined whether a data layer of the disk meets a preset merge condition, if the data layer meets the preset merge condition, all fragment files in the data layer are merged into a new fragment file, and finally the new fragment file is stored in a next data layer of the data layer, and all fragment files in the data layer are deleted. The SSTable files with fixed sizes are organized by the fragments, a plurality of SSTable files in the fragment files are orderly arranged according to the key range, data ordering is ensured, when data merging is carried out, the data merging is different from the traditional method of merging the data of the current data layer and the data of the next layer together, all the files of the current data layer are merged into a new file to be stored in the next layer, the files of the current data layer are deleted, data rewriting operation is reduced, space occupation caused by merging can be reduced, and therefore the performance of a data storage system is improved.
In an embodiment, as shown in fig. 4, step S30, that is, determining whether the data layer of the disk meets the preset merging condition, specifically includes the following steps:
S31: determining whether the number of fragment files in the data layer reaches a preset number.
S32: and if the number of the fragmented files in the data layer reaches the preset number, determining that the data layer meets the preset merging condition.
S33: and if the number of the fragmented files in the data layer does not reach the preset number, determining that the data layer does not meet the preset merging condition.
Determining whether the number of the fragment files in each data layer reaches a preset number, if the number of the fragment files in the data layer does not reach the preset number, indicating that the data layer is not full of files, determining that the data layer does not meet a preset merging condition, and if the number of the fragment files in the data layer reaches the preset number, indicating that the data layer is full of files, determining that the data layer meets the preset merging condition.
In other embodiments, whether the data layer meets the preset merging condition may also be determined from the total data volume of all the fragment files in the data layer: the file sizes of the SSTable files in each fragment file are summed to obtain the total data volume of all fragment files in the data layer, and it is determined whether that total reaches a preset data volume; if so, the data layer meets the preset merging condition, and if not, it does not.
In this embodiment, whether the data layer meets the preset merging condition is determined from the number of fragment files in the data layer. Compared with the traditional approach of judging against a data volume threshold, this is simpler and more intuitive, requires no data volume statistics, and reduces the computational burden on the storage system. In addition, with fragment-file storage, different fragment files may hold different amounts of data, so different data layers may hold different amounts of data, and a single data volume threshold could trigger merge operations frequently and reduce data storage performance.
In this embodiment, it is determined whether the number of fragment files in the data layer reaches a preset number; if it does, the data layer meets the preset merging condition, and if it does not, the data layer does not meet the preset merging condition. This specifies how to determine whether the data layer of the disk meets the preset merging condition: the number of fragment files is used as the merge trigger, which, compared with the traditional data volume threshold, is simpler and more intuitive, requires no data volume statistics, and reduces the computational burden on the storage system.
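As a non-limiting sketch, the count-based trigger of steps S31 to S33 and the data-volume alternative can be compared as follows in Python. Modelling each fragment file as a list of SSTable sizes in bytes, and the function names and threshold values, are assumptions made for illustration.

```python
def meets_count_condition(fragments, preset_number):
    """S31 to S33: the layer is full once it holds `preset_number` fragment files."""
    return len(fragments) >= preset_number


def meets_volume_condition(fragments, preset_bytes):
    """Alternative: sum every SSTable size in the layer and compare to a threshold."""
    total = sum(size for fragment in fragments for size in fragment)
    return total >= preset_bytes


# Each fragment file is modelled as a list of SSTable sizes in bytes.
layer = [[4096, 4096], [4096]]
count_result = meets_count_condition(layer, preset_number=2)
volume_result = meets_volume_condition(layer, preset_bytes=16384)
```

The count check needs only the length of the layer's fragment list, whereas the volume check has to visit every SSTable in the layer.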
In an embodiment, as shown in fig. 5, in step S40, merging all the fragment files in the data layer into a new fragment file specifically includes the following steps:
S41: Determine the SSTable files in all the fragment files in the data layer and record them as files to be merged.
After the data layer is determined to meet the preset merging condition, the SSTable files in all the fragment files in the data layer are determined, and each of these SSTable files is recorded as a file to be merged.
S42: Determine whether the key ranges of the files to be merged in the data layer overlap.
After the SSTable files in all the fragment files in the data layer are recorded as files to be merged, it is determined whether the key ranges of the files to be merged in the data layer overlap.
S43: If the key ranges of the files to be merged in the data layer do not overlap, arrange all the files to be merged in the data layer in key range order and merge them into a new fragment file.
After it is determined whether the key ranges of the files to be merged overlap, if they do not overlap, all the files to be merged in the data layer are arranged in key range order and merged into a new fragment file.
For example, as shown in fig. 6, the data layers in the disk are L0 and L1 in sequence, and the preset number is 2. Before merging, the data layer L0 holds 2 fragment files (Shard), each containing 1 SSTable file. The minimum key (min) in the first SSTable file is 7 and its maximum key (max) is 13; the minimum key in the second SSTable file is 21 and its maximum key is 45. The key ranges of the two SSTable files in the data layer L0 do not overlap, so when merging, the two SSTable files are copied directly to the L1 layer to form a new fragment file (Shard) containing 2 SSTable files.
After all files to be merged in the data layer have been arranged in key range order and merged into a new fragment file, the new fragment file is stored in the next data layer and the original fragment files in the data layer are deleted. This is equivalent to copying all the data in the data layer directly to the next data layer: the data already in the next data layer does not take part in the merge, and the fragment files are simply moved to the next data layer. No complex merge operation is needed; only the metadata information contained in the fragment file is updated, and no new redundant duplicate data is produced. This effectively addresses the exponential growth of SSTable file data in the traditional merging mode, reduces write amplification in the storage process, and reduces the space occupied by data rewriting and merging.
In this embodiment, the SSTable files in all fragment files in a data layer are determined and recorded as files to be merged, and it is then determined whether the key ranges of the files to be merged in the data layer overlap. If they do not overlap, all the files to be merged in the data layer are arranged in key range order and merged into a new fragment file. This specifies the process of merging all the fragment files in the data layer into a new fragment file. When the key ranges of the SSTable files to be merged do not overlap, only the metadata information in the fragment file is updated and no real merge operation takes place, which reduces write amplification. In addition, the new fragment file is stored directly in the next data layer rather than being merged with the SSTable files in the fragment files of the next data layer, which reduces the space occupied by data rewriting and merging.
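Steps S41 to S43 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: each SSTable file is reduced to an inclusive `(min_key, max_key)` tuple, and the names `ranges_overlap` and `merge_without_overlap` are hypothetical.

```python
from typing import List, Tuple

KeyRange = Tuple[int, int]  # inclusive (min_key, max_key) of one SSTable file
Shard = List[KeyRange]      # a fragment file, reduced to its SSTable key ranges


def ranges_overlap(ranges: List[KeyRange]) -> bool:
    """S42: after sorting by minimum key, any overlap shows up between neighbours."""
    ordered = sorted(ranges)
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))


def merge_without_overlap(layer: List[Shard]) -> Shard:
    """S41 + S43: collect every SSTable in the layer and order them by key range."""
    files = [r for shard in layer for r in shard]  # S41: files to be merged
    if ranges_overlap(files):
        raise ValueError("key ranges overlap; a real merge is required")
    return sorted(files)  # S43: no key-value data is rewritten


# L0 layer from fig. 6: two fragment files with one SSTable file each.
layer_l0 = [[(7, 13)], [(21, 45)]]
new_shard = merge_without_overlap(layer_l0)
print(new_shard)
```

Because the files are only reordered, the cost of this path depends on the number of SSTable files, not on the amount of key value pair data they hold.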
In an embodiment, as shown in fig. 7, after step S42, that is, after determining whether there is an overlap between key ranges of files to be merged in the data layer, the method further includes the following steps:
S44: If the key ranges of the files to be merged in the data layer overlap, merge and sort the files to be merged using a merge sort algorithm to obtain a plurality of new SSTable files.
S45: Arrange the plurality of new SSTable files in order of key range to generate a new fragment file.
After it is determined whether the key ranges of the files to be merged in the data layer overlap, if they do overlap, the files to be merged are merged and sorted using a merge sort algorithm to obtain a plurality of new SSTable files whose key ranges do not overlap. The plurality of new SSTable files are then arranged in order of key range to generate a new fragment file.
Merging and sorting the files to be merged with a merge sort algorithm to obtain a plurality of new SSTable files includes: determining the key range of each file to be merged; determining two overlapping key ranges; determining the key with the minimum value and the key with the maximum value in the two overlapping key ranges; generating a new key range from the key with the minimum value and the key with the maximum value; and generating a new SSTable file from the new key range and the key value pair data corresponding to it. This is repeated until all pairs of overlapping key ranges have been traversed, yielding a plurality of new SSTable files, which are then arranged in order of key range to generate the new fragment file.
For example, as shown in fig. 8, the data layers in the disk are L0, L1, and L2 in sequence, and the preset number is 2. The data layer L1 originally stores one fragment file containing two SSTable files with non-overlapping key ranges, (1, 12) and (14, 26). After the data layer L0 is merged into the next data layer L1, L1 gains a new fragment file (Shard) whose SSTable key ranges are (7, 13) and (21, 45). The key ranges (1, 12) and (7, 13) therefore overlap, as do the key ranges (14, 26) and (21, 45). For the overlapping key ranges (1, 12) and (7, 13), the key with the smallest value is 1 and the key with the largest value is 13, so a new key range (1, 13) is generated, and the new key range (1, 13) together with the corresponding key value pair data forms a new SSTable file. Likewise, a new key range (14, 45) is generated from the overlapping key ranges (14, 26) and (21, 45), and the new key range (14, 45) together with the corresponding key value pair data forms another new SSTable file. Finally, the two newly generated SSTable files are arranged in order of key range to generate a new fragment file (Shard), which contains 2 SSTable files with non-overlapping key ranges (1, 13) and (14, 45). The data layer L1 is thus merged into the next data layer L2, and the fragment files of the L1 layer are emptied.
In this embodiment, after it is determined whether the key ranges of the files to be merged in the data layer overlap, if they do overlap, the files to be merged are merged and sorted using a merge sort algorithm to obtain a plurality of new SSTable files whose key ranges do not overlap, and the plurality of new SSTable files are arranged in order of key range to generate a new fragment file.
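The overlapping case (S44 and S45) can be sketched in Python using the key ranges of fig. 8. Several points are assumptions made for illustration only: each SSTable file is modelled as a non-empty in-memory dictionary, files are passed oldest first so that the newest value wins when a key appears twice, and the standard-library `heapq.merge` stands in for the merge sort algorithm named in the embodiment. The function names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

SSTable = Dict[int, str]  # hypothetical in-memory stand-in for an SSTable file


def key_range(table: SSTable) -> Tuple[int, int]:
    """Inclusive (min, max) key range of a non-empty SSTable."""
    return (min(table), max(table))


def merge_overlapping(files: List[SSTable]) -> List[SSTable]:
    """Group SSTables whose key ranges overlap and merge each group into one file.

    `files` is ordered oldest first; for a duplicated key the newest value is kept.
    """
    # Remember each file's position (its age) before sorting by minimum key.
    by_min = sorted(enumerate(files), key=lambda p: key_range(p[1])[0])
    groups: List[List[Tuple[int, SSTable]]] = []
    group_max = 0
    for age, table in by_min:
        lo, hi = key_range(table)
        if groups and lo <= group_max:      # overlaps the current group
            groups[-1].append((age, table))
            group_max = max(group_max, hi)
        else:                               # starts a new, disjoint group
            groups.append([(age, table)])
            group_max = hi
    merged: List[SSTable] = []
    for group in groups:
        # Each stream is sorted by key; -age makes the newest entry sort first.
        streams = [[(k, -age, v) for k, v in sorted(t.items())] for age, t in group]
        out: SSTable = {}
        for k, _, v in heapq.merge(*streams):
            out.setdefault(k, v)            # keep only the first (newest) value
        merged.append(out)
    return merged                           # S45: already ordered by key range


# L1 layer from fig. 8: the original fragment file, then the one arriving from L0.
old_shard = [{1: "a", 12: "b"}, {14: "c", 26: "d"}]
new_shard = [{7: "e", 13: "f"}, {21: "g", 45: "h"}]
result = merge_overlapping(old_shard + new_shard)
print([key_range(t) for t in result])
```

The grouping step generalises the pairwise description in the text: a chain of three or more mutually overlapping files ends up in one group and produces a single new SSTable file.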
In an embodiment, as shown in fig. 3, an index structure (index) is set in the memory of the log merge tree. The index structure records the index parameters of the data in each data layer of the disk and may also be used to manage the merging of the data layers. As shown in fig. 9, after step S20, that is, after the plurality of SSTable files are arranged in order of key range to generate the fragment file, the method further includes the following steps:
S01: Generate an index key range according to all the fragment files in the data layer.
After the SSTable files are arranged in order of key range to generate the fragment files, the data in each data layer is stored in the form of fragment files. To facilitate subsequent index queries on the data, an index key range needs to be generated according to all the fragment files in the data layer.
Generating the index key range according to all the fragment files in the data layer includes: determining the key range corresponding to each fragment file in the data layer, where the key range corresponding to a fragment file is the total key range obtained by aggregating the key ranges of the SSTable files in that fragment file; aggregating the key ranges corresponding to the fragment files to generate the key range corresponding to the data layer; and taking the key range corresponding to the data layer as the index key range of the data layer.
S02: Take the index key range as the index parameter corresponding to the data layer and update it to the index structure to generate index data.
After the index key range is generated according to all the fragment files in the data layer, the index key range of the data layer is taken as the index parameter corresponding to the data layer and updated to the index structure to generate index data. The index data includes a plurality of index parameters, each corresponding to one data layer; the corresponding key value pair data can be looked up in the corresponding data layer according to the index parameter, thereby obtaining the corresponding metadata.
In this embodiment, an index structure is set in the memory. After the plurality of SSTable files are arranged in order of key range to generate the fragment files, an index key range is generated according to all the fragment files in each data layer, taken as the index parameter corresponding to that data layer, and updated to the index structure to generate index data that includes a plurality of index parameters. This specifies how the index data is generated. The corresponding index parameter can later be looked up in the index structure as needed, and the data query is then performed in the corresponding data layer according to that index parameter, which reduces the amount of data queried and improves the data index performance of the storage system.
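Steps S01 and S02 amount to two levels of range aggregation, which the following Python sketch illustrates. SSTable files are again reduced to inclusive `(min_key, max_key)` tuples, the index structure is modelled as a plain dictionary, and all function names are hypothetical; skipping empty layers is likewise an assumption of the sketch.

```python
from typing import Dict, List, Tuple

KeyRange = Tuple[int, int]
Shard = List[KeyRange]  # a fragment file, reduced to its SSTable key ranges


def shard_key_range(shard: Shard) -> KeyRange:
    """Total key range of one fragment file, aggregated over its SSTable files."""
    return (min(lo for lo, _ in shard), max(hi for _, hi in shard))


def layer_index_key_range(layer: List[Shard]) -> KeyRange:
    """S01: aggregate the fragment file ranges into the layer's index key range."""
    ranges = [shard_key_range(s) for s in layer]
    return (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))


def build_index(layers: Dict[str, List[Shard]]) -> Dict[str, KeyRange]:
    """S02: one index parameter per non-empty data layer."""
    return {name: layer_index_key_range(shards)
            for name, shards in layers.items() if shards}


layers = {
    "L0": [[(7, 13)], [(21, 45)]],
    "L1": [[(1, 12), (14, 26)]],
    "L2": [],
}
index = build_index(layers)
print(index)
```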
In an embodiment, after step S02, that is, after the index parameter is updated to the index structure to generate the index data, it is determined whether an index request input by the user through the index structure has been received, where the index request includes the key to be indexed (the index key). If such an index request is received, the index parameter corresponding to the index key is determined in the index data according to the index request and recorded as the target index parameter. The data layer corresponding to the target index parameter is then determined, and the metadata satisfying the index request, that is, the metadata corresponding to the index key, is queried in that data layer. Because the index key leads directly to the corresponding data layer, the data layers do not need to be traversed one by one, which reduces the amount of data queried and improves the data index performance of the storage system.
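A lookup driven by the index data might look like the Python sketch below. The text does not say what happens when the index key falls inside the index key range of more than one data layer, so the sketch assumes the candidate layers are searched in index order; that ordering, the dictionary-based SSTable model, and the function names are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

KeyRange = Tuple[int, int]
SSTable = Dict[int, str]  # hypothetical in-memory stand-in for an SSTable file
Shard = List[SSTable]


def candidate_layers(index: Dict[str, KeyRange], key: int) -> List[str]:
    """Layers whose index parameter (an inclusive key range) contains the key."""
    return [name for name, (lo, hi) in index.items() if lo <= key <= hi]


def lookup(index: Dict[str, KeyRange],
           layers: Dict[str, List[Shard]],
           key: int) -> Optional[str]:
    """Search only the candidate layers instead of traversing every layer."""
    for name in candidate_layers(index, key):
        for shard in layers[name]:
            for table in shard:
                if key in table:
                    return table[key]
    return None


index = {"L0": (7, 45), "L1": (1, 26)}
layers = {
    "L0": [[{7: "a", 13: "b"}], [{21: "c", 45: "d"}]],
    "L1": [[{1: "e", 12: "f"}, {14: "g", 26: "h"}]],
}
print(candidate_layers(index, 3))
print(lookup(index, layers, 12))
```

For key 3 only L1 qualifies, so L0 is never opened; for key 12 both layers qualify and the value is found in L1 after L0 misses.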
In an embodiment, as shown in fig. 10, after step S50, that is, after storing a new fragmented file into a next data layer of the data layer and deleting all fragmented files in the data layer, the method further includes the following steps:
S61: Generate an index key range according to all the fragment files in the next data layer of the data layer, and update the index key range to the index structure as the index parameter corresponding to the next data layer.
S62: Delete the index parameter corresponding to the data layer from the index structure.
After the new fragmented file is stored to the next data layer of the data layers and all fragmented files in the data layers are deleted, the data in the current data layer and the data in the next data layer are changed, so that the index data in the index structure needs to be updated for subsequent data indexing.
Specifically, because all fragment files in the data layer have been deleted, the data layer is empty and holds no data, so the index parameter corresponding to the data layer needs to be deleted from the index structure; otherwise, a subsequent user index request that follows the original index parameter would fail to find the corresponding metadata. At the same time, because the data of the data layer has been updated and stored in the next data layer, an index key range needs to be generated according to all the fragment files in the next data layer and updated to the index structure as the index parameter corresponding to the next data layer. That is, the newly generated index key range of the next data layer replaces its original index parameter, so that subsequent data indexing by index key is correct and the possibility of data indexing failure is reduced.
In this embodiment, after storing a new fragment file to a next data layer of the data layer and deleting all fragment files in the data layer, an index key range is generated according to all fragment files in the next data layer of the data layer, the index key range is updated to the index structure as an index parameter corresponding to the next data layer, the index parameter corresponding to the data layer is deleted in the index structure, and the index data in the index structure is updated in time according to data changes in each data layer in the disk, so that correct data indexing is performed according to the index key in the following process, the possibility of data indexing failure is reduced, and the indexing performance of the storage system is improved.
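Steps S61 and S62 can be illustrated with a short Python sketch in which the index structure is a dictionary from layer name to index key range. The function name and the layer names are hypothetical, and the sketch assumes the next data layer already holds the new fragment file when it is called.

```python
from typing import Dict, List, Tuple

KeyRange = Tuple[int, int]
Shard = List[KeyRange]  # a fragment file, reduced to its SSTable key ranges


def refresh_index_after_merge(index: Dict[str, KeyRange],
                              layers: Dict[str, List[Shard]],
                              merged_layer: str,
                              next_layer: str) -> None:
    """Update the index after `merged_layer` has been moved into `next_layer`."""
    # S61: regenerate the index key range of the next data layer.
    ranges = [r for shard in layers[next_layer] for r in shard]
    index[next_layer] = (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))
    # S62: the merged layer is now empty, so its index parameter is removed.
    index.pop(merged_layer, None)


# State after L0 from fig. 6 has been stored in L1 and emptied.
layers = {"L0": [], "L1": [[(1, 12), (14, 26)], [(7, 13), (21, 45)]]}
index = {"L0": (7, 45), "L1": (1, 26)}
refresh_index_after_merge(index, layers, merged_layer="L0", next_layer="L1")
print(index)
```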
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, a data storage device based on a log merge tree is provided, and the data storage device based on the log merge tree corresponds to the data storage method based on the log merge tree in the above embodiment one to one. As shown in fig. 11, the log merge-tree based data storage device includes a recording module 111, a generating module 112, a determining module 113, a merging module 114, and a deleting module 115. The detailed description of each functional module is as follows:
the recording module 111 is configured to record the written key value pair data using a fixed-size Sorted String Table (SSTable) structure as an SSTable file, where the SSTable file includes the key value pair data and a key range corresponding to the key value pair data;
the generating module 112 is configured to arrange the plurality of SSTable files in order of key range to generate fragment files, and store the fragment files in a disk of a log merge tree;
a determining module 113, configured to determine whether a data layer of the disk meets a preset merge condition;
a merging module 114, configured to merge all the fragmented files in the data layer into a new fragmented file if the data layer meets a preset merging condition;
and the deleting module 115 is configured to store the new fragment file to the next data layer of the data layer, and delete all fragment files in the data layer.
Further, the merging module 114 is specifically configured to:
determining SSTable files in all the fragment files in the data layer, and recording the SSTable files as files to be merged;
determining whether the key ranges of the files to be merged in the data layer are overlapped;
and if the key ranges of the files to be merged in the data layer are not overlapped, arranging and merging all the files to be merged in the data layer into a new fragment file according to the key range sequence.
Further, after determining whether there is an overlap between the key ranges of the files to be merged in the data layer, the merging module 114 is further specifically configured to:
if the key ranges of the files to be merged in the data layer are overlapped, merging and sequencing the files to be merged by adopting a merging and sequencing algorithm to obtain a plurality of new SSTable files, wherein the key ranges of the new SSTable files are not overlapped;
and orderly arranging the plurality of new SSTable files according to the size of the key range to generate new fragment files.
Further, the determining module 113 is specifically configured to:
determining whether the number of the fragmented files in the data layer reaches a preset number;
and if the number of the fragmented files in the data layer reaches the preset number, determining that the data layer meets the preset merging condition.
Further, an index structure is set in the memory of the log merge tree, and after the plurality of SSTable files are arranged in order of key range to generate the fragment file, the generating module 112 is further configured to:
generating an index key range according to all the fragment files in the data layer;
and updating the index key range serving as an index parameter corresponding to the data layer into an index structure to generate index data, wherein the index data comprises a plurality of index parameters.
Further, after storing the new fragment file to the next data layer of the data layers and deleting all the fragment files in the data layers, the generating module 112 is further specifically configured to:
generating an index key range according to all the fragment files in the next data layer of the data layers, and updating the index key range to an index structure to serve as an index parameter corresponding to the next data layer;
the corresponding index parameters of the data layer are deleted in the index structure.
For specific limitations of the data storage device based on the log merge tree, reference may be made to the above limitations of the data storage method based on the log merge tree, and details are not repeated here. The modules in the data storage device based on the log merge tree described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in fig. 12. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store the data generated and used by the data storage method based on the log merge tree. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements the data storage method based on the log merge tree.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
recording the written key value pair data using a fixed-size Sorted String Table (SSTable) structure as an SSTable file, wherein the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
arranging the plurality of SSTable files in order of key range to generate fragment files, and storing the fragment files into a disk of a log merge tree;
determining whether a data layer of the disk meets a preset merging condition;
if the data layer meets the preset merging condition, merging all the fragment files in the data layer into a new fragment file;
and storing the new fragment file to the next data layer of the data layer, and deleting all the fragment files in the data layer.
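The five steps above can be tied together in a compact Python sketch. It is a simplification under stated assumptions, not the claimed implementation: SSTable files are in-memory lists of sorted key value pairs, "fixed size" is taken to mean a fixed number of pairs per SSTable file, every merge rewrites the data without distinguishing the overlapping and non-overlapping cases, and the class name `ShardedLsm` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

SSTable = List[Tuple[int, str]]  # sorted key value pairs, at most `sstable_size` long
Shard = List[SSTable]            # SSTables ordered by key range


class ShardedLsm:
    def __init__(self, num_layers: int = 3, sstable_size: int = 2,
                 preset_number: int = 2) -> None:
        self.layers: List[List[Shard]] = [[] for _ in range(num_layers)]
        self.sstable_size = sstable_size
        self.preset_number = preset_number

    def _cut(self, items: SSTable) -> Shard:
        """Split sorted pairs into fixed-size SSTables, already in key range order."""
        n = self.sstable_size
        return [items[i:i + n] for i in range(0, len(items), n)]

    def flush(self, pairs: Dict[int, str]) -> None:
        """Steps 1 and 2: record SSTables, form a fragment file, store it in L0."""
        self.layers[0].append(self._cut(sorted(pairs.items())))
        self.compact()

    def compact(self) -> None:
        for level in range(len(self.layers) - 1):
            layer = self.layers[level]
            if len(layer) < self.preset_number:   # step 3: count-based condition
                continue
            merged: Dict[int, str] = {}
            for shard in layer:                   # oldest first, newer values win
                for table in shard:
                    merged.update(table)
            new_shard = self._cut(sorted(merged.items()))  # step 4
            self.layers[level + 1].append(new_shard)       # step 5: store below
            self.layers[level] = []                        # step 5: delete originals


lsm = ShardedLsm()
lsm.flush({7: "a", 13: "b"})
lsm.flush({21: "c", 45: "d"})
print(lsm.layers[0], lsm.layers[1])
```

After the second flush, L0 reaches the preset number of 2, so both fragment files are merged into one fragment file in L1 and L0 is emptied. Data already stored in L1 is not touched by that merge, which matches the behaviour described for the method.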
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
recording the written key value pair data using a fixed-size Sorted String Table (SSTable) structure as an SSTable file, wherein the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
arranging the plurality of SSTable files in order of key range to generate fragment files, and storing the fragment files into a disk of a log merge tree;
determining whether a data layer of the disk meets a preset merging condition;
if the data layer meets the preset merging condition, merging all the fragment files in the data layer into a new fragment file;
and storing the new fragment file to the next data layer of the data layer, and deleting all the fragment files in the data layer.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional units and modules is only used for illustration, and in practical applications, the above function distribution may be performed by different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the above described functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A data storage method based on log merge tree is characterized by comprising the following steps:
recording written key value pair data using a fixed-size Sorted String Table (SSTable) structure as an SSTable file, wherein the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
the SSTable files are orderly arranged according to the size of the key range to generate fragment files, the fragment files are stored in a disk of a log merging tree, the disk comprises a plurality of data layers, each data layer comprises a plurality of fragment files, and the fragment files in the same data layer have no sequential relation;
determining whether the data layer of the disk meets a preset merging condition;
if the data layer meets the preset merging condition, merging all the fragment files in the data layer into a new fragment file;
and storing the new fragment file to the next data layer of the data layers, and deleting all the fragment files in the data layers.
2. The log merge tree based data storage method of claim 1, wherein the merging all of the sharded files in the data tier into a new sharded file comprises:
determining the SSTable files in all the fragment files in the data layer, and recording as files to be merged;
determining whether the key ranges of the files to be merged in the data layer are overlapped;
if the key ranges of the files to be merged in the data layer are not overlapped, arranging and merging all the files to be merged in the data layer into a new fragment file according to the key range sequence.
3. The log merge-tree based data storage method of claim 2, wherein after determining whether there is an overlap of the key ranges of the files to be merged in the data tier, the method further comprises:
if the key ranges of the files to be merged in the data layer are overlapped, merging and sequencing the files to be merged by adopting a merging and sequencing algorithm to obtain a plurality of new SSTable files, wherein the key ranges of the new SSTable files are not overlapped;
and arranging the plurality of new SSTable files in order according to the size of the key range to generate new fragment files.
4. The log merge-tree based data storage method of claim 1, wherein the determining whether the data layer of the disk meets a preset merge condition comprises:
determining whether the number of the fragmented files in the data layer reaches a preset number;
and if the number of the fragment files in the data layer reaches the preset number, determining that the data layer meets the preset merging condition.
5. The log merge tree-based data storage method of any one of claims 1-4, wherein after providing an index structure on the memory of the log merge tree and generating sharded files by ordering the plurality of SSTable files by the size of the key range, the method further comprises:
generating an index key range according to all the fragment files in the data layer;
and updating the index key range as an index parameter corresponding to the data layer to the index structure to generate index data, wherein the index data comprises a plurality of index parameters.
6. The log merge tree based data storage method of claim 5, wherein after storing the new sharded file to a next data tier of the data tiers and deleting all of the sharded files in the data tiers, the method further comprises:
generating the index key range according to all the fragment files in a next data layer of the data layers, and updating the index key range to the index structure as the index parameter corresponding to the next data layer;
deleting the corresponding index parameter of the data layer in the index structure.
7. A log merge tree based data storage device, comprising:
the recording module is used for recording the written key value pair data using a fixed-size Sorted String Table (SSTable) structure as an SSTable file, wherein the SSTable file comprises the key value pair data and a key range corresponding to the key value pair data;
the generating module is used for orderly arranging the SSTable files according to the size of the key range to generate fragment files and storing the fragment files into a disk of a log merging tree, wherein the disk comprises a plurality of data layers, each data layer comprises a plurality of fragment files, and the fragment files in the same data layer have no sequential relation;
the determining module is used for determining whether the data layer of the disk meets a preset merging condition;
a merging module, configured to merge all the fragmented files in the data layer into a new fragmented file if the data layer meets the preset merging condition;
and the deleting module is used for storing the new fragment file to the next data layer of the data layers and deleting all the fragment files in the data layers.
8. The log merge tree based data storage device of claim 7, wherein said merging all of the sharded files in the data tier into a new sharded file comprises:
determining the SSTable files in all the fragment files in the data layer, and recording as files to be merged;
determining whether the key ranges of the files to be merged in the data layer are overlapped;
if the key ranges of the files to be merged in the data layer are not overlapped, arranging and merging all the files to be merged in the data layer into a new fragment file according to the key range sequence.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the log merge-tree based data storage method according to any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the log merge-tree based data storage method according to any one of claims 1 to 6.
CN202210702791.7A 2022-06-21 2022-06-21 Data storage method, device and equipment based on log merging tree and storage medium Active CN114780500B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210702791.7A CN114780500B (en) 2022-06-21 2022-06-21 Data storage method, device and equipment based on log merging tree and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210702791.7A CN114780500B (en) 2022-06-21 2022-06-21 Data storage method, device and equipment based on log merging tree and storage medium

Publications (2)

Publication Number Publication Date
CN114780500A (en) 2022-07-22
CN114780500B (en) 2022-09-20

Family

ID=82421294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210702791.7A Active CN114780500B (en) 2022-06-21 2022-06-21 Data storage method, device and equipment based on log merging tree and storage medium

Country Status (1)

Country Link
CN (1) CN114780500B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561073B (en) * 2023-04-14 2023-12-19 云和恩墨(北京)信息技术有限公司 File merging method and system based on database, equipment and storage medium
CN116991794B (en) * 2023-05-24 2024-07-23 阿里云计算有限公司 Data management method, system, device, equipment and medium in data warehouse
CN116450591B (en) * 2023-06-15 2023-09-12 北京数巅科技有限公司 Data processing method, device, computer equipment and storage medium
CN118051643A (en) * 2024-02-23 2024-05-17 中国科学院信息工程研究所 Metadata sparse distribution-oriented LSM data organization method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9355109B2 (en) * 2010-06-11 2016-05-31 The Research Foundation For The State University Of New York Multi-tier caching
CN106407224B (en) * 2015-07-31 2019-09-13 华为技术有限公司 The method and apparatus of file compacting in a kind of key assignments storage system
CN110825706B (en) * 2018-08-07 2022-09-16 华为云计算技术有限公司 Data compression method and related equipment
US11308030B2 (en) * 2020-03-05 2022-04-19 International Business Machines Corporation Log-structured merge-tree with blockchain properties
CN114253908A (en) * 2020-09-23 2022-03-29 华为云计算技术有限公司 Data management method and device of key value storage system
CN113704261B (en) * 2021-08-26 2024-05-24 平凯星辰(北京)科技有限公司 Key value storage system based on cloud storage

Also Published As

Publication number Publication date
CN114780500A (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN114780500B (en) Data storage method, device and equipment based on log merging tree and storage medium
KR102564170B1 (en) Method and device for storing data object, and computer readable storage medium having a computer program using the same
US10353607B2 (en) Bloom filters in a flash memory
CN107766374B (en) Optimization method and system for storage and reading of massive small files
US11556513B2 (en) Generating snapshots of a key-value index
CN113535670B (en) Virtual resource mirror image storage system and implementation method thereof
CN113867627B (en) Storage system performance optimization method and system
US11514010B2 (en) Deduplication-adapted CaseDB for edge computing
CN116204130A (en) Key value storage system and management method thereof
CN113253932B (en) Read-write control method and system for distributed storage system
KR101806394B1 (en) A data processing method having a structure of the cache index specified to the transaction in a mobile environment dbms
CN113392089B (en) Database index optimization method and readable storage medium
US11461299B2 (en) Key-value index with node buffers
US20210406237A1 (en) Searching key-value index with node buffers
CN117573676A (en) Address processing method and device based on storage system, storage system and medium
Jensen et al. Optimality in external memory hashing
US20200019539A1 (en) Efficient and light-weight indexing for massive blob/objects
CN114896250B (en) Key value separated key value storage engine index optimization method and device
CN116382588A (en) LSM-Tree storage engine read amplification problem optimization method based on learning index
US20240256511A1 (en) Lsm hybrid compaction
CN113688096B (en) Storage method, storage device and storage system
CN113326262B (en) Data processing method, device, equipment and medium based on key value database
CN113625938B (en) Metadata storage method and device
WO2024213024A1 (en) Data access method and data processing system
US10169250B2 (en) Method and apparatus method and apparatus for controlling access to a hash-based disk

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant