CN105138622B - Insertion operation for LSM tree storage systems, and reading and merging methods for the resulting load - Google Patents

Publication number
CN105138622B
Authority
CN
China
Prior art keywords
key
data
value
read
fields
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510501523.9A
Other languages
Chinese (zh)
Other versions
CN105138622A (en)
Inventor
贾士博
岳银亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongkehai micro (Beijing) Technology Co.,Ltd.
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN201510501523.9A
Publication of CN105138622A
Application granted
Publication of CN105138622B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices

Abstract

The present invention provides an insertion operation method for LSM tree storage systems, comprising: 1) constructing a key-value structure from the value to be inserted and the key being inserted into, and storing the newly constructed key-value structure in the database as the latest data segment of that key, wherein the newly constructed key-value structure saves the total amount of data in the segments accumulated under the key before this insertion and the storage location of the key's previous data segment; and 2) updating the entry for the key in an insertion table, wherein the insertion table records, for each key, the total amount of data in all of its data segments and the storage location of its latest data segment. The present invention also provides a corresponding read method and merging method for the load produced by insertion operations. The present invention avoids the increase in overhead that insertion operations otherwise incur from read/write amplification; it aggregates the insertion operation load as data moves from lower levels to higher levels; and it improves the read performance of the insertion operation load.

Description

Insertion operation for LSM tree storage systems, and reading and merging methods for the resulting load
Technical field
The present invention relates to the field of information storage, and in particular to an insertion (i.e., Append) operation for LSM tree (i.e., LSM Tree) storage systems and to methods for reading and merging the corresponding load.
Background art
An LSM Tree storage system is a storage system optimized for the impact of random I/O on the performance of persistent storage. LSM Tree stands for Log-Structured Merge-Tree. Fig. 1 shows the structure and working principle of an existing LSM Tree storage system, whose main design points are:
1. Data is placed in levels: the data of the lowest level resides in memory, and the data of the other levels resides on persistent devices (HDD, SSD, etc.).
2. Data is updated out of place and is written in a log-structured manner.
3. A background process of the system continually merges data from lower levels to higher levels; that is, among K-V structures with the same key, older versions are deleted and the newest version is retained. As merging proceeds, data continually moves from lower levels to higher levels.
This design allows an LSM Tree to reduce the number of disk seeks through batched writes, thereby improving the performance of persistent storage; it is particularly suitable for workloads with a large amount of random disk I/O. However, when an insertion operation is performed on data in an LSM Tree storage system, the value in the K-V structure must be read out, spliced, and written back. Because the background process also continually moves data from lower levels to higher levels, the number of reads and writes caused by an insertion operation is in effect amplified, which can significantly increase overhead and degrade storage performance.
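The read/write amplification described above can be illustrated with a small arithmetic sketch. The function names are hypothetical and the model is deliberately simplified: it counts only the bytes of the value itself and ignores metadata and background merging. The second function is included only as a baseline in which each append stores nothing but its own segment.

```python
def traditional_append_cost(segment_sizes):
    """Bytes read and written when every append rewrites the whole value."""
    stored = 0          # size of the value currently stored under the key
    bytes_read = 0
    bytes_written = 0
    for size in segment_sizes:
        bytes_read += stored      # read the existing value back
        stored += size            # splice the new segment onto it
        bytes_written += stored   # write the whole value back
    return bytes_read, bytes_written


def segmented_append_cost(segment_sizes):
    """Baseline: each append writes only its own segment and reads nothing."""
    return 0, sum(segment_sizes)


sizes = [10, 10, 10, 10]
print(traditional_append_cost(sizes))  # (60, 100)
print(segmented_append_cost(sizes))    # (0, 40)
```

Under this simplified model, the traditional cost grows roughly quadratically with the number of appends, while the baseline grows linearly.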
Summary of the invention
Accordingly, the object of the present invention is to provide a solution for insertion operations in LSM Tree storage systems, and for reading and merging the corresponding load, that overcomes the above drawbacks of the prior art.
According to one aspect of the present invention, an insertion operation method for LSM Tree storage systems is provided, comprising the following steps:
1) Construct a K-V structure (also referred to herein as a key-value structure) from the value to be inserted and the key being inserted into, and store the newly constructed K-V structure in the database as the latest data segment of that key; in the newly constructed K-V structure, save the total size of the data segments accumulated under the key before this insertion and the storage location of the key's previous data segment;
2) Update the entry for the key in the Append table (also referred to herein as the insertion table); the insertion table records, for each key, the total size of all of its data segments and the storage location of its latest data segment.
In step 1), a prevSize field and a prevPos field are provided in the newly constructed K-V structure; the prevSize field saves the total size of the data segments accumulated under the same key before this K-V structure, and the prevPos field saves the storage location of the data segment of the same key that immediately precedes this K-V structure.
Step 1) comprises the following sub-steps:
11) Look up the key being inserted into in the insertion table to obtain the corresponding entry;
12) From that entry, obtain the total size of all data segments of the key before this insertion operation and the storage location of its latest data segment;
13) Construct a K-V structure from the key and the value to be inserted; fill the prevSize field of the K-V structure with the total size obtained in step 12), and fill its prevPos field with the storage location of the latest data segment obtained in step 12).
Each entry of the insertion table comprises a size field and a pos field; the size field records the total size of all data segments of the entry's key, and the pos field records the storage location of the latest data segment of the entry's key.
Step 2) comprises the following sub-steps:
21) Update the size field of the entry for the key to the total size of all data segments after the value to be inserted has been added;
22) Update the pos field of the entry for the key to the storage location of the K-V structure obtained in step 13).
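Steps 11) to 13) and 21) to 22) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the use of a Python list as a stand-in for the database, and the use of a list index as the "storage location" are all hypothetical, and a key with no table entry is simply treated as starting a new chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    key: str
    value: bytes
    prev_size: int            # prevSize: bytes accumulated under the key before this segment
    prev_pos: Optional[int]   # prevPos: location of the previous segment (None for the first)


@dataclass
class TableEntry:
    size: int   # total bytes of all segments of the key
    pos: int    # location of the latest segment


class Store:
    def __init__(self) -> None:
        self.segments: List[Segment] = []            # stand-in for the database
        self.insert_table: Dict[str, TableEntry] = {}

    def append(self, key: str, value: bytes) -> int:
        entry = self.insert_table.get(key)                    # step 11
        prev_size = entry.size if entry else 0                # step 12
        prev_pos = entry.pos if entry else None
        self.segments.append(Segment(key, value, prev_size, prev_pos))  # step 13
        pos = len(self.segments) - 1
        # Steps 21 and 22: new total size and new latest-segment location.
        self.insert_table[key] = TableEntry(prev_size + len(value), pos)
        return pos


store = Store()
store.append("k", b"abc")
store.append("k", b"de")
print(store.insert_table["k"])   # TableEntry(size=5, pos=1)
print(store.segments[1])         # prev_size=3, prev_pos=0
```

Note that no existing value is read or rewritten: each append writes one new segment and updates one table entry.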
According to another aspect of the present invention, a method for reading the insertion operation load of an LSM Tree storage system is also provided. The insertion operation load comprises at least two K-V structures that belong to the same key and serve as data segments, and each K-V structure comprises a prevSize field and a prevPos field, wherein the prevSize field saves the total size of the data segments accumulated under the same key before that K-V structure, and the prevPos field saves the storage location of the data segment of the same key that immediately precedes that K-V structure.
The read method comprises the following steps:
101) Look up the entry for the key to be read in the insertion table; the insertion table records, for each key, the total size of all of its data segments and the storage location of its latest data segment;
102) Based on the total size of all data segments recorded in the matching insertion table entry, and on the total size of previously accumulated data segments saved in each K-V structure of the key to be read, find the K-V structures that match the value range to be read and read the requested value from them.
Step 102) comprises the following sub-steps:
1021) Using the latest data segment location recorded in the matching insertion table entry, find the K-V structure of the latest data segment of the key to be read and take it as the current K-V structure;
1022) Based on the value size of the current K-V structure itself and its prevSize field, determine whether the current K-V structure matches the value range to be read; if so, output the read result; if not, proceed to step 1023);
1023) Find the K-V structure of the previous data segment according to the prevPos field of the current K-V structure, take it as the current K-V structure, and re-execute step 1022), until all data segments of the key to be read have been found.
In step 1022), after the read result is output, further determine whether the entire value range to be read has been read; if so, end this read directly; if not, jump to step 1023).
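The backward walk of steps 1021) to 1023) can be sketched as below. The names are hypothetical, a dictionary stands in for segment storage, and the range is assumed to be a half-open byte interval [start, end) over the logical (concatenated) value; the source does not specify the range convention.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Segment:
    value: bytes
    prev_size: int            # bytes accumulated before this segment
    prev_pos: Optional[str]   # location of the previous segment, None for the first


def read_range(segments: Dict[str, Segment], latest_pos: str,
               start: int, end: int) -> bytes:
    """Return bytes [start, end) of the logical value, newest segment first."""
    pieces = []
    pos: Optional[str] = latest_pos                     # step 1021
    while pos is not None and start < end:
        seg = segments[pos]
        seg_start = seg.prev_size
        seg_end = seg_start + len(seg.value)
        lo, hi = max(start, seg_start), min(end, seg_end)   # step 1022
        if lo < hi:
            pieces.append(seg.value[lo - seg_start:hi - seg_start])
            end = lo              # what remains lies in older segments
        if seg_start <= start:
            break                 # no older segment can overlap the range
        pos = seg.prev_pos        # step 1023
    return b"".join(reversed(pieces))


segs = {
    "a": Segment(b"hello", 0, None),
    "b": Segment(b"world", 5, "a"),
    "c": Segment(b"!!", 10, "b"),
}
print(read_range(segs, "c", 3, 7))   # b'lowo'
```

Because each segment carries its own prevSize, the walk can skip segments that lie entirely after the requested range and stop as soon as the range start is covered, without reading the whole value.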
According to yet another aspect of the present invention, a method for merging the insertion operation load of an LSM Tree storage system is also provided. The insertion operation load comprises at least two K-V structures that belong to the same key and serve as data segments, and each K-V structure comprises a prevSize field and a prevPos field, wherein the prevSize field saves the total size of the data segments accumulated under the same key before that K-V structure, and the prevPos field saves the storage location of the data segment of the same key that immediately precedes that K-V structure.
The merging method comprises: during a merge operation of the LSM Tree storage system, merging the K-V structures that belong to the insertion operation load under the same key.
The SSTs (i.e., sorted string table files) produced from Append-class structures serve as the starting points of an SST evolution table; the process of merging the K-V structures that belong to the insertion operation load under the same key comprises the following steps:
201) When a merge operation is performed in the persistent storage layer, look up in the SST evolution table each SST participating in this merge operation;
202) When reading the structures participating in this merge operation, merge the K-V structures that are different data segments of the same key.
Compared with the prior art, the present invention has the following technical effects:
1. The present invention implements insertion operations in an LSM Tree storage system while avoiding the increase in overhead that insertion operations otherwise incur from read/write amplification.
2. The present invention can aggregate the K-V_part structures produced under the same key by insertion operations as data moves from lower levels to higher levels.
3. The present invention can improve the read performance of the K-V structures produced by insertion operations.
Brief description of the drawings
Embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which:
Fig. 1 shows the structure and working principle of an existing LSM Tree storage system;
Fig. 2 is a schematic diagram of the principle of the insertion operation in one embodiment of the present invention;
Fig. 3 is a flow chart of the insertion operation method of one embodiment of the present invention;
Fig. 4 is a flow chart of the method for reading the load produced by insertion operations in one embodiment of the present invention;
Fig. 5 is a flow chart of data merging for Append-class structures in one embodiment of the present invention.
Detailed description
According to one embodiment of the present invention, an insertion operation method for LSM Tree storage systems is provided. This embodiment performs insertion (i.e., Append) operations using segmented storage; by modifying the key and introducing an insertion table, it links together the value segments of the same key. It thereby implements the normal function of the insertion operation while avoiding the read/write amplification of insertion operations in LSM Tree storage systems, which in turn saves overhead.
Fig. 2 is a schematic diagram of the principle of the insertion operation in this embodiment. As shown in Fig. 2, to improve the performance of insertion operations, this embodiment establishes an insertion table, adds new fields to the key of Append-class K-V structures, and also establishes an SST evolution table. The insertion table, the Append-class K-V structure, and the SST evolution table are each described first below, followed by the improved insertion operation method.
1. Establishing the insertion table
The insertion table (Append table) established in this embodiment stores every key written to the database (DB) by insertion operations, the total size of all value segments of that key (i.e., the total value size accumulated in the database), and the storage location of the key's current newest value segment. As shown in Fig. 2, each element of the insertion table comprises three parts: key, size (the total value size that the key has accumulated in the DB), and pos (the storage location of the newest value segment). If the newest value segment resides on a persistent storage device, its storage location is represented by an SST ID; if it resides in memory, it is marked mem. For convenience in the description below, mem may also be regarded as a special SST ID.
In a preferred embodiment, the insertion table is implemented as a red-black tree.
2. Adding fields to the key of Append-class K-V structures
As is well known, LSM Tree storage systems perform data reads, writes, and various other operations on the basis of K-V structures. The K-V structures produced when insertion operations are performed in the segmented-storage manner of this embodiment are called Append-class K-V structures; herein, Append-class K-V structures are also called the insertion operation load. This embodiment provides an insertion operation interface with an added insertion operation parameter, which specifies whether each insertion operation is carried out in the traditional "read, splice value, write back" manner or in the segmented-storage manner of this embodiment. The database (DB) can adjust this parameter dynamically according to the size of the value being inserted and the performance requirements, in order to determine how each insertion operation is carried out.
In this embodiment, the key of an Append-class K-V structure contains, in addition to the userKey, SeqNum, and valueType fields that a K-V structure already has, the following added fields: prevSize and prevPos, as shown in Fig. 2. Here, prevSize is the size of the value that the key had already accumulated in the database before this value segment was written; prevPos is the SST ID of the SST holding the value of the K-V_part structure that immediately precedes this value segment, and if this segment is the first segment, the field is marked head. A K-V_part structure denotes the K-V structure corresponding to one value segment under a given key.
3. Establishing the SST evolution table
As is well known, SST refers to the sorted string table file in LSM Tree databases, i.e., Sorted String Table, abbreviated SSTable or SST.
As shown in Fig. 2, in this embodiment the SST evolution table globally records, as a directed graph, how the SSTs in the storage system change: for each SST, after a compaction (compact) operation, which new SST or SSTs the keys in that SST have entered. Each SST has its own count, called the reference count, which represents the number of K-V structures on that SST that have not yet been merged.
When the keys in one SST are compacted into multiple SSTs, the key range of each branch must also be recorded (i.e., the key range compacted into each SST). For a new SST formed by a compaction operation, each time a key is compacted into that new SST, the reference count of the new SST in the SST evolution table is incremented by 1. When a compaction operation merges value segments of the same key (the prevPos field can be deleted upon merging), the reference count in the SST evolution table of the SST ID given by prevPos is decremented by 1. When an SST ID is a source of the directed graph and its reference count is 0, the record of that SST ID is deleted.
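One possible in-memory shape for such an evolution table is sketched below. All names are hypothetical, SST IDs are assumed to be integers, key ranges are assumed to be closed intervals compared lexicographically, and the reference-count handling follows the description above only loosely; the source does not specify these details.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    ref: int = 0   # reference count of this SST
    # Outgoing edges: (lowest key, highest key, ID of the SST those keys moved to)
    edges: List[Tuple[str, str, int]] = field(default_factory=list)


class EvolutionTable:
    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}

    def add_sst(self, sst_id: int, ref: int = 0) -> None:
        self.nodes[sst_id] = Node(ref=ref)

    def record_compaction(self, old_id: int, lo_key: str, hi_key: str, new_id: int) -> None:
        """Record that keys in [lo_key, hi_key] moved from old_id to new_id."""
        if new_id not in self.nodes:
            self.add_sst(new_id)
        self.nodes[old_id].edges.append((lo_key, hi_key, new_id))

    def resolve(self, sst_id: int, key: str) -> int:
        """Follow the branch matching key until an SST with no further move is reached."""
        while True:
            targets = [n for lo, hi, n in self.nodes[sst_id].edges if lo <= key <= hi]
            if not targets:
                return sst_id
            sst_id = targets[0]

    def release(self, sst_id: int) -> None:
        """Drop one reference; delete the record if it is an unreferenced source node."""
        node = self.nodes[sst_id]
        node.ref -= 1
        is_source = all(target != sst_id
                        for other in self.nodes.values()
                        for _, _, target in other.edges)
        if node.ref <= 0 and is_source:
            del self.nodes[sst_id]


table = EvolutionTable()
table.add_sst(1, ref=1)
table.record_compaction(1, "a", "m", 2)
table.record_compaction(1, "n", "z", 3)
table.record_compaction(2, "a", "m", 4)
print(table.resolve(1, "c"))   # 4
print(table.resolve(1, "q"))   # 3
```

The `resolve` method shows why the key range of each branch is recorded: a stale SST ID stored in a pos or prevPos field can be mapped to the SST that currently holds the key.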
The SST evolution table can be used to accelerate read operations on multi-segment values; its use is detailed in the read operation flow below.
The improved insertion operation method of the present invention is described below with reference to one embodiment.
Fig. 3 shows the flow chart of the insertion operation method of one embodiment of the present invention, which implements the insertion operation in a distinctive manner that differs from "read, splice value, write back". With reference to Fig. 3, the insertion operation method comprises the following steps:
Step 301: Receive the key on which the insertion operation is to be performed and the value to be inserted, then look up that key in the insertion table (for convenience, hereinafter called the key being inserted into).
Step 302: Determine whether the key being inserted into was found in the insertion table; if so, proceed to step 303; otherwise, proceed to step 308.
Step 303: Read the size field and the pos field of the key being inserted into from the insertion table.
Step 304: Determine whether the pos field is mem. If so, proceed to step 306; otherwise, proceed to step 305. The purpose of this step is to determine whether the previous value segment of the same key is still in memory.
Step 305: Construct a K-V_part structure from the currently received key and the value to be inserted (i.e., the key and value of step 301), and set the prevPos field in the key of the structure to the pos found in the insertion table (i.e., the content of the pos field read in step 303). Because the insertion table stores the storage location, before this insertion operation, of the newest value segment of the same key, step 305 writes the storage location of the value segment preceding this insertion operation into the prevPos field of the current K-V_part structure. After this step, perform step 307.
Step 306: Construct a K-V_part structure from the currently received key and the value to be inserted (i.e., the key and value of step 301), and set the prevPos field in the key of the structure to mem (in this case, the previous segment is still in memory). After this step, perform step 307.
Step 307: Set the prevSize field in the key of the constructed K-V_part structure to the size found in the insertion table (i.e., the content of the size field read in step 303). Because the insertion table stores the total size, before this insertion operation, of all values of the same key, this step records the total value size previously accumulated in the database for the same key in the prevSize field of the current K-V_part structure. After this step, perform step 314.
Step 308: Perform a Get operation on the database for the key being inserted into (i.e., the key of step 301).
Step 309: Determine whether the Get operation succeeded. If it failed, the system is considered to contain no corresponding key, and the flow proceeds to step 310; if it succeeded, the flow proceeds to step 311.
Step 310: Take the key and value received in step 301 directly as the first segment of the Append-class structure, denoted K-V_part. Then proceed to step 312.
Step 311: Retrieve the corresponding K-V structure from the database, splice the value to be inserted onto it, and denote the resulting structure K-V_part; this K-V_part serves as the first segment of the Append-class structure. Then proceed to step 312.
Step 312: Set the prevSize field of the K-V_part structure to 0 and its prevPos field to head, indicating that this K-V_part structure is the first segment of the Append-class structure for this key.
Step 313: Insert the key of the constructed K-V_part structure into the insertion table, set the corresponding size field in the insertion table to the size of the value of the current K-V_part structure, and set the pos field to mem, indicating that the current K-V_part structure resides in memory. Proceed to step 314.
Step 314: Write the current K-V_part structure into memory by a Put operation.
Step 315: When data needs to be moved from memory into the database by a dump operation (i.e., a Dump operation), for each element of the insertion table whose pos field is mem, set its pos field to the SST ID of the on-disk SST to which the data will be dumped.
Step 316: For each K-V_part structure participating in the dump operation whose prevPos field is mem, set its prevPos field to the SST ID of the on-disk SST to which the data will be dumped. After step 316 is complete, perform the dump operation to move the data from memory into the database.
Through the modification of the key used by insertion operations and the introduction of the insertion table, the above flow links together the value segments of the same key and implements the basic function of the insertion operation in the form of segmented storage, while also avoiding the read/write amplification of traditional "read, splice value, write back" insertion operations and saving overhead.
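Two parts of the flow above are easy to overlook: the first-segment path for a key with no insertion table entry (steps 308 to 314) and the rewriting of mem markers at dump time (steps 315 and 316). The sketch below illustrates both under simplifying assumptions: dictionaries and lists stand in for the database, memtable, and insertion table, all function names are hypothetical, and the whole memtable is assumed to be dumped into a single SST.

```python
MEM, HEAD = "mem", "head"   # markers used in the description


def first_append(db, memtable, insert_table, key, value):
    """Steps 308-314 for a key that has no insertion table entry yet."""
    existing = db.get(key)                       # steps 308-309: Get
    if existing is not None:
        value = existing + value                 # step 311: splice once
    seg = {"key": key, "value": value,
           "prev_size": 0, "prev_pos": HEAD}     # step 312: first segment
    insert_table[key] = {"size": len(value), "pos": MEM}   # step 313
    memtable.append(seg)                         # step 314: Put
    return seg


def dump(memtable, insert_table, sst_id):
    """Steps 315-316: rewrite mem markers to the target SST ID, then move the data."""
    for entry in insert_table.values():          # step 315
        if entry["pos"] == MEM:
            entry["pos"] = sst_id
    for seg in memtable:                         # step 316
        if seg["prev_pos"] == MEM:
            seg["prev_pos"] = sst_id
    sst = list(memtable)                         # the dump itself
    memtable.clear()
    return sst


db, memtable, table = {"k": b"old"}, [], {}
first_append(db, memtable, table, "k", b"new")
# A later segment appended while the first is still in memory (step 306).
memtable.append({"key": "k", "value": b"x", "prev_size": 6, "prev_pos": MEM})
sst = dump(memtable, table, 7)
print(table["k"]["pos"], sst[0]["prev_pos"], sst[1]["prev_pos"])   # 7 head 7
```

The only read-and-splice in this path happens once, when a key that already exists as an ordinary K-V structure receives its first segmented append.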
Fig. 4 shows the flow chart of the method for reading the load produced by insertion operations in one embodiment of the present invention, where the load produced by insertion operations refers to Append-class K-V structures. The method comprises the following steps:
Step 401: Receive the key to be read and the value range to be read, and look up the key in the insertion table. For convenience, these are hereinafter called the key to be read and the range to be read. When reading large files, users often do not need to read the entire file but only the value within a specified range; in this step, the range to be read is the value range specified by this read operation.
Step 402: If the key to be read is found in the insertion table, proceed to step 403; otherwise, proceed to step 404.
Step 403: Read from the memtable the K-V_part structure that matches the key to be read. As is well known, the memtable is the usual in-memory data structure of an LSM Tree. After this step, proceed to step 405.
Step 404: Read the K-V structure that matches the key to be read from the database in the traditional manner. In this case, because the key to be read is not present in the insertion table, what is to be read can be considered a non-Append-class load and can therefore be read directly in the traditional manner. This read operation ends when this step completes.
Step 405: Determine whether the K-V_part structure read from memory overlaps the range to be read; if so, proceed to step 406; otherwise, proceed to step 409.
Step 406: Determine whether the K-V_part structure read from memory covers the range to be read; if not, proceed to step 407; otherwise, proceed to step 408.
Step 407: Read the portion of the value in the current K-V_part structure that overlaps the range to be read into a buffer. Then proceed to step 409.
Step 408: Read directly the portion of the value in the current K-V_part structure that overlaps the range to be read; this completes the read operation.
Step 409: Read the insertion table, find the element that matches the key to be read, and obtain its size and pos fields.
Step 410: Starting from the SST ID recorded in the pos field, read the SST evolution table, using the key to be read to determine which branch of the evolution table to follow, and finally obtain the SST ID of the location (i.e., the SST) of the first value segment of the key to be read in the persistent storage layer. Here, the first value segment means the newest value segment, in the persistent storage layer, of the Append-class structure that matches the key to be read.
Step 411: Using the SST ID currently obtained, find in the corresponding SST the K-V_part structure that matches the key to be read, and use the prevSize field of that K-V_part structure to determine whether it overlaps the currently remaining range to be read; if not, proceed to step 412; if so, proceed to step 413.
Step 412: Take the SST ID recorded in the prevPos field of the K-V_part structure currently found and, starting from that SST ID, read the SST evolution table, using the key to be read to determine which branch to follow, and finally obtain the SST ID of the location in the persistent storage layer of the preceding value segment that matches the key to be read; then return to step 411.
Step 413: Determine whether the current K-V_part structure covers the currently remaining range to be read; if not, proceed to step 414; otherwise, proceed to step 415.
Step 414: Read the portion of the value in the current K-V_part structure that overlaps the currently remaining range to be read into the buffer, then proceed to step 412.
Step 415: Read the corresponding value from the current K-V_part structure according to the remaining range to be read, splice it with the value in the buffer to obtain the value of the complete range to be read, and return it; this completes the read operation on the Append-class structure.
When a user needs only part of a value, the above read operation flow can correctly locate the value range to be read without reading the entire value, thereby improving the read performance of the Append-class structures produced by insertion operations.
Fig. 5 shows the data merging flow chart for Append-class structures in one embodiment of the present invention. The SST evolution table is typically generated by memTable dump operations and evolves through compaction operations in the persistent storage layer. The compaction operation is the typical operation by which LSM Tree storage systems merge data from lower levels to higher levels.
As shown in Fig. 5, the data merging flow for the Append-class structures of this embodiment comprises the following steps:
Step 501: When the memTable is dumped, the dump operation generates a series of SSTs. At this point, for the Append-class structures participating in this dump operation, generate the SST evolution table and use the generated SSTs as its starting points. In the SST evolution table, each SST is represented by its SST ID.
Step 502: When a compaction operation is performed in the persistent storage layer, look up in the SST evolution table each SST participating in this compaction operation.
Step 503: When reading the K-V_part structures participating in this compaction operation, merge the value segments of the same key; the merged K-V structure is stored in the new SST produced by the compaction.
Step 504: When reading the K-V_part structures participating in this compaction operation, decrement by 1 the ref (i.e., reference count) in the SST evolution table of the SST corresponding to each prevPos; if the ref of an SST drops to 0, delete that node from the SST evolution table.
Step 505: After the new SST is generated, add the node corresponding to the new SST to the SST evolution table, and make the nodes corresponding to the old SSTs (the SSTs deleted in this compaction operation) point to the new node according to key range.
The above data merging flow for Append-class structures aggregates the K-V_part structures produced under the same key by insertion operations as data moves from lower levels to higher levels. The flow also generates and evolves the SST evolution table, which helps improve the read performance of Append-class structures.
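Steps 502 to 505 can be sketched as a single compaction function. This is an illustrative simplification with hypothetical names: segments are plain dictionaries, the reference counts and graph edges of the evolution table are kept in two separate dictionaries, the participating segments of a key are assumed to be contiguous, and only prevPos pointers that the merge removes are counted in step 504.

```python
def compact(inputs, refs, edges, new_id):
    """Merge same-key segments from the input SSTs into one new SST.

    inputs: {sst_id: [segment dict]}   segments have key, value, prev_size, prev_pos
    refs:   {sst_id: reference count}  nodes of the evolution table
    edges:  {old_id: [(lo_key, hi_key, new_id)]}  edges of the evolution table
    """
    grouped = {}
    for segs in inputs.values():                     # step 502: participating SSTs
        for seg in segs:
            grouped.setdefault(seg["key"], []).append(seg)

    new_sst = []
    for key in sorted(grouped):                      # step 503: merge per key
        parts = sorted(grouped[key], key=lambda s: s["prev_size"])
        for part in parts[1:]:                       # step 504: these prevPos links vanish
            target = part["prev_pos"]
            if target in refs:
                refs[target] -= 1
                if refs[target] == 0:
                    del refs[target]
        new_sst.append({
            "key": key,
            "value": b"".join(p["value"] for p in parts),
            "prev_size": parts[0]["prev_size"],      # inherit the oldest part's links
            "prev_pos": parts[0]["prev_pos"],
        })

    refs[new_id] = len(new_sst)                      # step 505: add the new node
    for old_id, segs in inputs.items():
        if old_id in refs and segs:
            keys = [s["key"] for s in segs]
            edges.setdefault(old_id, []).append((min(keys), max(keys), new_id))
    return new_sst


inputs = {
    1: [{"key": "k", "value": b"ab", "prev_size": 0, "prev_pos": "head"}],
    2: [{"key": "k", "value": b"cd", "prev_size": 2, "prev_pos": 1}],
}
refs, edges = {1: 1, 2: 1}, {}
print(compact(inputs, refs, edges, 3)[0]["value"])   # b'abcd'
print(refs, edges)   # {2: 1, 3: 1} {2: [('k', 'k', 3)]}
```

In the example, SST 1 loses its only remaining reference and its record is dropped, while SST 2 keeps a record with an edge to SST 3 so that a later segment whose prevPos still names SST 2 can be redirected.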
Finally, it should be noted that the above embodiments are intended only to describe the technical solution of the present invention and not to limit it; the present invention can be extended in application to other modifications, variations, applications, and embodiments, and all such modifications, variations, applications, and embodiments are therefore considered to fall within the spirit and teaching of the present invention.

Claims (10)

1. a kind of insertion operation method for LSM tree storage systems comprises the following steps:
1) value will be inserted into and is inserted into key and is configured to key-value structure and using the key-value structure newly built as being inserted into The latest data fragmented storage of key is into database;For the key-value structure of the new structure, preservation is inserted into before key The total amount of data of data sectional through accumulation preserves the storage location for the last data segmentation for being inserted into key;
2) list item for being inserted into key of insertion table is updated;The insertion token records the total amount of data of all data sectionals of each key With the storage location of latest data segmentation.
2. insertion operation method according to claim 1, which is characterized in that in the step 1):In the new structure PrevSize fields and prevPos fields, prevSize fields is set to be accumulated before preserving same key in key-value structure Data sectional total amount of data, prevPos fields preserve the storage of the last data segmentation of this key-value structure of same key Position.
3. The insertion operation method according to claim 2, characterized in that step 1) comprises the following sub-steps:
11) looking up the key to be inserted in the insertion table to obtain the corresponding entry;
12) obtaining, from the entry found for the key to be inserted, the total data volume of all data segments of the key to be inserted before this insertion operation and the storage location of its latest data segment;
13) constructing a key-value structure from the key to be inserted and the value to be inserted, writing into the prevSize field of the key-value structure the total data volume, obtained in step 12), of all data segments of the key to be inserted before this insertion operation, and writing into the prevPos field of the key-value structure the storage location, obtained in step 12), of the latest data segment of the key to be inserted before this insertion operation.
4. The insertion operation method according to claim 3, characterized in that each entry of the insertion table comprises a size field and a pos field, wherein the size field records the total data volume of all data segments of the entry's key, and the pos field records the storage location of the latest data segment of the entry's key.
5. The insertion operation method according to claim 4, characterized in that step 2) comprises the following sub-steps:
21) updating the size field of the entry for the key to be inserted to the total data volume of all data segments after the value to be inserted has been added;
22) updating the pos field of the entry for the key to be inserted to the storage location of the key-value structure obtained in step 13).
6. A read method for an insertion operation load of an LSM tree storage system, wherein the insertion operation load comprises at least two key-value structures that belong to the same key and serve as data segments, and each key-value structure comprises a prevSize field and a prevPos field, wherein the prevSize field stores the total data volume of the data segments accumulated under the same key before this key-value structure, and the prevPos field stores the storage location of the data segment that precedes this key-value structure under the same key; the read method comprises the following steps:
101) looking up the entry for the key to be read in an insertion table; the insertion table records, for each key, the total data volume of all data segments of that key and the storage location of its latest data segment;
102) according to the total data volume of all data segments recorded in the hit entry of the insertion table, and the total data volume of previously accumulated data segments stored in each key-value structure of the key to be read, finding the key-value structures that match the value range to be read and reading the requested value from them.
7. The read method according to claim 6, characterized in that step 102) comprises the following sub-steps:
1021) according to the storage location of the latest data segment recorded in the hit entry of the insertion table, finding the key-value structure of the latest data segment of the key to be read and taking it as the current key-value structure;
1022) judging, from the data volume of the current key-value structure's own value and from its prevSize field, whether the current key-value structure matches the value range to be read; if so, outputting the read result; if not, proceeding to step 1023);
1023) finding the key-value structure of the previous data segment according to the prevPos field of the current key-value structure, taking the key-value structure of the previous data segment as the current key-value structure, and re-executing step 1022), until all data segments of the key being read have been found.
8. The read method according to claim 7, characterized in that, in step 1022), after the read result is output, it is further judged whether the value range to be read has been read completely; if so, this read operation ends directly; if not, the method jumps to step 1023).
9. A merging method for an insertion operation load of an LSM tree storage system, wherein the insertion operation load comprises at least two key-value structures that belong to the same key and serve as data segments, and each key-value structure comprises a prevSize field and a prevPos field, wherein the prevSize field stores the total data volume of the data segments accumulated under the same key before this key-value structure, and the prevPos field stores the storage location of the data segment that precedes this key-value structure under the same key; the merging method comprises: during a merge operation of the LSM tree storage system, merging the key-value structures under the same key that belong to the insertion operation load.
10. The merging method according to claim 9, characterized in that the sorted string table (SST) files produced by insertion-type structures serve as the starting point of an SST file evolution table, and the process of merging the key-value structures under the same key that belong to the insertion operation load comprises the following steps:
201) when a merge operation is performed in the persistent storage layer, looking up in the SST file evolution table each SST file that participates in this merge operation;
202) when reading the structures that participate in this merge operation, merging the key-value structures that are different data segments of the same key.
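The insertion method of claims 1 to 5 and the read method of claims 6 to 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `KVPart`, `Entry`, and `AppendStore` are hypothetical, an in-memory list stands in for the database (a list index plays the role of a storage location), and the half-open byte range `[start, end)` is an assumed way of expressing the value range to be read.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class KVPart:
    key: str
    value: bytes
    prev_size: int            # prevSize: bytes accumulated before this segment
    prev_pos: Optional[int]   # prevPos: location of the previous segment


@dataclass
class Entry:
    size: int   # total bytes of all segments of the key
    pos: int    # storage location of the latest segment


class AppendStore:
    def __init__(self) -> None:
        self.log: List[KVPart] = []         # stand-in for the database
        self.table: Dict[str, Entry] = {}   # insertion table

    def insert(self, key: str, value: bytes) -> None:
        entry = self.table.get(key)                              # step 11
        prev_size = entry.size if entry is not None else 0       # step 12
        prev_pos = entry.pos if entry is not None else None
        self.log.append(KVPart(key, value, prev_size, prev_pos))  # step 13
        # Steps 21 and 22: new total size, location of the new segment.
        self.table[key] = Entry(prev_size + len(value), len(self.log) - 1)

    def read(self, key: str, start: int, end: int) -> bytes:
        """Return bytes [start, end) of the value accumulated under key."""
        entry = self.table.get(key)                              # step 101
        if entry is None:
            return b""
        end = min(end, entry.size)
        pieces: List[bytes] = []
        pos: Optional[int] = entry.pos                           # step 1021
        while pos is not None:
            part = self.log[pos]
            lo, hi = part.prev_size, part.prev_size + len(part.value)
            if lo < end and start < hi:                          # step 1022
                pieces.append(part.value[max(start, lo) - lo:min(end, hi) - lo])
            if lo <= start:        # claim 8: the whole range has been read
                break
            pos = part.prev_pos                                  # step 1023
        # Segments were visited newest first, so restore the original order.
        return b"".join(reversed(pieces))
```

For example, after inserting `b"hello"`, `b" world"`, and `b"!"` under one key, `read(key, 3, 8)` walks back from the latest segment through `prev_pos` and stops as soon as it reaches the segment that contains byte 3.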
CN201510501523.9A 2015-08-14 2015-08-14 For the insertion operation of LSM tree storage systems and reading and the merging method of load Active CN105138622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510501523.9A CN105138622B (en) 2015-08-14 2015-08-14 For the insertion operation of LSM tree storage systems and reading and the merging method of load

Publications (2)

Publication Number Publication Date
CN105138622A CN105138622A (en) 2015-12-09
CN105138622B true CN105138622B (en) 2018-05-22

Family

ID=54723970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510501523.9A Active CN105138622B (en) 2015-08-14 2015-08-14 For the insertion operation of LSM tree storage systems and reading and the merging method of load

Country Status (1)

Country Link
CN (1) CN105138622B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844650A (en) * 2017-01-20 2017-06-13 中国科学院计算技术研究所 A kind of daily record merges the merging method and system of tree
CN108804625B (en) * 2018-05-31 2020-05-12 阿里巴巴集团控股有限公司 LSM tree optimization method and device and computer equipment
CN110032565A (en) * 2019-03-26 2019-07-19 阿里巴巴集团控股有限公司 A kind of method, system and electronic equipment generating statistical information
CN112988208B (en) * 2019-12-18 2023-06-30 腾讯科技(深圳)有限公司 Data updating method, device, equipment and storage medium
WO2021197493A1 (en) * 2020-04-04 2021-10-07 厦门网宿有限公司 File management method and apparatus based on lsm-tree storage engine
CN113495871B (en) * 2020-04-04 2023-06-23 厦门网宿有限公司 File management method and device based on LSM-Tree storage engine
CN112395212B (en) * 2020-11-05 2022-05-31 华中科技大学 Method and system for reducing garbage recovery and write amplification of key value separation storage system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722449A (en) * 2012-05-24 2012-10-10 中国科学院计算技术研究所 Key-Value local storage method and system based on solid state disk (SSD)
CN104142958A (en) * 2013-05-10 2014-11-12 华为技术有限公司 Storage method for data in Key-Value system and related device
CN104424326A (en) * 2013-09-09 2015-03-18 华为技术有限公司 Data processing method and device
CN104424219A (en) * 2013-08-23 2015-03-18 华为技术有限公司 Method and equipment of managing data documents
CN104572920A (en) * 2014-12-27 2015-04-29 北京奇虎科技有限公司 Data arrangement method and data arrangement device
CN104809237A (en) * 2015-05-12 2015-07-29 百度在线网络技术(北京)有限公司 LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9846711B2 (en) * 2012-12-28 2017-12-19 Facebook, Inc. LSM cache

Also Published As

Publication number Publication date
CN105138622A (en) 2015-12-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210301

Address after: Room 1146, 11 / F, research complex building, Institute of computing technology, Chinese Academy of Sciences, No. 6, South Road, Haidian District, Beijing, 100190

Patentee after: Zhongkehai micro (Beijing) Technology Co.,Ltd.

Address before: 100190 No. 6 South Road, Zhongguancun Academy of Sciences, Beijing, Haidian District

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences