CN106844650A - Merging method and system for a log-structured merge tree - Google Patents
Merging method and system for a log-structured merge tree
- Publication number
- CN106844650A (CN201710047936.3A)
- Authority
- CN
- China
- Prior art keywords
- sstable
- real
- key
- virtual
- merging
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention proposes a merging method and system for a log-structured merge tree. The method includes: a real merge step, which merges both data and metadata and generates Real SSTables, the data merge being a merge of SSTables; a virtual merge step, which generates Virtual SSTables by merging only metadata and recording the data sources of each Virtual SSTable; a Real SSTable read step, in which, when a key falls within the key range of a Real SSTable, the value corresponding to the key is looked up directly in that Real SSTable; a Virtual SSTable read step; and a Virtual SSTable merge step, in which a Virtual SSTable is merged during reading so that it becomes a Real SSTable.
Description
Technical field
The present invention relates to the technical field of log-structured merge trees, and more particularly to a merging method and system for a log-structured merge tree.
Background art
A log-structured merge tree (Log-Structured Merge Tree, LSM-Tree for short) consists of multiple components: one memory component and several disk components, whose sizes grow exponentially. Its architecture is shown in Fig. 2. The memory component consists of a memory table (Memtable), and each disk component consists of one or more sorted string tables (SSTables). The main idea of the LSM-Tree is to keep writes and updates in memory and, once a specified threshold is reached, write them sequentially to the storage device. Update operations such as insert, update, and delete are served by the memory component, while lookups and scans are served by all components. The LSM-Tree uses out-of-place updates, so the same key may have several versions of its value. A delete only adds a deletion marker in the memory component and does not actually remove the data. A later compaction deletes and merges data according to the newer and older versions and the deletion markers of each key, sorts the different keys, generates new SSTable files, and deletes the old SSTable files.
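The write path described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name TinyLSM, the TOMBSTONE marker, and the entry-count flush threshold are all hypothetical, and the disk components are modeled as in-memory sorted lists.

```python
TOMBSTONE = object()  # hypothetical deletion marker


class TinyLSM:
    """Toy LSM-Tree: one memtable plus a list of flushed, sorted tables."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []  # newest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit  # assumed threshold, in entries

    def put(self, key, value):
        # Inserts and updates are absorbed by the memory component.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Threshold reached: write the entries out in key order.
            flushed = sorted(self.memtable.items(), key=lambda kv: kv[0])
            self.sstables.insert(0, flushed)
            self.memtable = {}

    def delete(self, key):
        # A delete only records a marker; the old data stays in place.
        self.put(key, TOMBSTONE)

    def get(self, key):
        # Lookups consult every component, newest first.
        for component in [self.memtable] + [dict(t) for t in self.sstables]:
            if key in component:
                value = component[key]
                return None if value is TOMBSTONE else value
        return None
```

Because the deletion marker shadows the older value in a lower table, a deleted key reads as absent even though its old data is still stored; only a later compaction would physically remove it.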
To keep future reads and writes on the LSM-Tree efficient, data must be continually moved down from one component to the next. When the size of a component exceeds its threshold, a merge (compaction) operation is triggered. Each data movement takes place between two adjacent components: the data of the two components is sorted, and invalid and stale data is deleted. This process is called compaction. Taking Fig. 3 as an example, when component C2 exceeds its threshold, an SSTable T22 in C2 is selected for merging, and the SSTables T32 and T33 in component C3 whose key ranges overlap with that of T22 are selected. T22, T32, and T33 are merged into three new SSTables, T35, T36, and T37, and the newly constructed SSTables are placed in C3. In this way, the data of T22 is moved down from component C2 to component C3.
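The compaction of Fig. 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: tables are non-empty sorted lists of (key, value) pairs, the function names overlaps and compact are hypothetical, output size is limited by an entry count (capacity) rather than bytes, and deletion markers are ignored.

```python
from heapq import merge


def overlaps(a, b):
    """True if the key ranges of two non-empty sorted tables intersect."""
    return a[0][0] <= b[-1][0] and b[0][0] <= a[-1][0]


def compact(upper, lower_level, capacity):
    """Merge `upper` into the overlapping tables of `lower_level`."""
    picked = [t for t in lower_level if overlaps(upper, t)]
    kept = [t for t in lower_level if not overlaps(upper, t)]
    # Tag each entry with an age (0 = upper, newer) so that, for equal keys,
    # the newer version is seen first by the merge.
    streams = [[(k, 0, v) for k, v in upper]]
    streams += [[(k, 1, v) for k, v in t] for t in picked]
    merged, seen = [], set()
    for k, _, v in merge(*streams):
        if k not in seen:  # drop stale versions of the same key
            seen.add(k)
            merged.append((k, v))
    # Cut the merged run into new tables of at most `capacity` entries.
    new_tables = [merged[i:i + capacity] for i in range(0, len(merged), capacity)]
    return sorted(kept + new_tables, key=lambda t: t[0][0])


# T22 overlaps two of the four tables in C3; they are rewritten as three tables.
t22 = [("c", "new"), ("f", "new")]
c3 = [[("a", "old"), ("b", "old")], [("c", "old"), ("d", "old")],
      [("e", "old"), ("g", "old")], [("x", "old")]]
new_c3 = compact(t22, c3, capacity=2)
```

Every entry of the two overlapping tables is read and rewritten even though only two keys arrived from the upper component, which is the source of the write amplification discussed next.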
Compaction governs the I/O operations of LSM-Tree-based key/value stores. During compaction, key-value pairs flow from smaller levels (lower layers) to larger levels (higher layers). Because each level has a preset capacity limit, the slow flow of key-value pairs toward larger levels can cause write stalls in the system. Fig. 4 shows the read and write operations a key-value pair undergoes as it flows from a smaller level to a larger level. During compaction, a key-value pair may be read in and written out many times, even within the same level. The reason is that compaction is scheduled by polling, and polling proceeds much faster in smaller levels than in larger levels; as a result, a key-value pair has already taken part in several compactions before it moves to the adjacent level. This severe write amplification causes frequent write stalls and thus reduces the write performance of the system.
Existing solutions to the write amplification problem of the LSM-Tree mainly include the following:
Technical scheme 1: Relax the condition under which a traditional LSM-Tree triggers compaction, that is, increase the threshold size of each component. Its main drawback is that it only alleviates write amplification from component 0 to component 1; subsequent levels still suffer from write amplification.
Technical scheme 2: bLSM divides the keys into multiple key ranges so that compaction is confined to a small number of key ranges, which avoids compacting data in unrelated key ranges. As shown in Fig. 5, if data is continually inserted in the range N~Q, compaction is triggered only within N~Q and not in other ranges. Its main drawback is that it cannot avoid the write amplification caused by compaction within a given key range.
bLSM was published in Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data.
Technical scheme 3: VT-Tree can determine when, across several consecutive levels, there is no other version of a key-value pair with the same key. In that case the pair can skip those levels and go directly to the level that contains the same key, or to the last level, which saves the I/O overhead of moving the key-value pair through the intermediate levels, as shown in Fig. 6. Its main drawback is that it cannot alleviate the write amplification caused by compaction of hot-spot data.
VT-Tree was published at the 11th USENIX Conference on File and Storage Technologies (FAST '13).
Technical scheme 4: WiscKey stores keys and values separately: the key and a pointer to the value are stored in the LSM-Tree, while the actual value data is stored elsewhere. During compaction, therefore, only the keys and value pointers are repeatedly read and written, as shown in Fig. 7. Its main drawback is that keys are still read and written repeatedly by compaction, so in scenarios with large keys the write amplification of the LSM-Tree persists.
WiscKey was published at the 14th USENIX Conference on File and Storage Technologies (FAST '16).
Summary of the invention
The object of the present invention is to solve a problem that the above prior art cannot: the LSM-Tree write amplification caused by data being repeatedly read and written during merging (both within a level and across levels). To this end, it proposes a merging method and system for a log-structured merge tree.
The present invention proposes a merging method for a log-structured merge tree, including:
A real merge step, which includes a data merge and a metadata merge and generates Real SSTables, wherein the data merge merges SSTables and the metadata merge includes merging key range, file number, and file size information;
A virtual merge step, which generates Virtual SSTables and merges only metadata, wherein the metadata merge includes merging key range, file number, and file size information and recording the data sources of each Virtual SSTable;
A Real SSTable read step, which reads a Real SSTable, wherein when a key falls within the key range of the Real SSTable, the value corresponding to the key is looked up directly in that Real SSTable;
A Virtual SSTable read step, wherein when a key falls within the key range of a Virtual SSTable, the data sources of the Virtual SSTable corresponding to the key are found through the metadata information, and the value corresponding to the key is looked up in those Real SSTables;
A Virtual SSTable merge step, which merges a Virtual SSTable during reading so that it becomes a Real SSTable, thereby improving read performance.
The real merge step includes:
11. Read the key/value pairs of each Real SSTable in turn;
12. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size;
13. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
The virtual merge step includes:
21. Collect the metadata of the Real SSTables;
22. From the key ranges of the Real SSTables, derive a new key range that covers the ranges of all the Real SSTables;
23. Divide the new key range into N parts according to the number N of Real SSTables, the N parts representing N Virtual SSTables, wherein the file size of each Virtual SSTable is set equal to the file size of a Real SSTable, the file numbers of the Virtual SSTables are allocated uniformly by the virtual merge process, and the data sources are all the Real SSTables involved in the virtual merge.
The Virtual SSTable read step includes:
41. Determine whether the key to be looked up falls within the key range of the Virtual SSTable;
42. If not, return;
43. If so, find the data sources, i.e., the Real SSTables, through the metadata of the Virtual SSTable;
44. Look up the key using the index of a Real SSTable;
45. If it is found, return the value of the key;
46. If it is not found, search the next Real SSTable, until all the Real SSTables in the data sources of the Virtual SSTable have been searched;
47. If the key is still not found, return.
The Virtual SSTable merge step includes:
51. Obtain the data sources of the Virtual SSTable, i.e., the Real SSTables;
52. Read the key/value pairs of each Real SSTable in turn;
53. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size;
54. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
The present invention also proposes a merging system for a log-structured merge tree, including:
A real merge module, which performs a data merge and a metadata merge and generates Real SSTables, wherein the data merge merges SSTables and the metadata merge includes merging key range, file number, and file size information;
A virtual merge module, for generating Virtual SSTables while merging only metadata, wherein the metadata merge includes merging key range, file number, and file size information and recording the data sources of each Virtual SSTable;
A Real SSTable read module, for reading a Real SSTable, wherein when a key falls within the key range of the Real SSTable, the value corresponding to the key is looked up directly in that Real SSTable;
A Virtual SSTable read module, for finding, when a key falls within the key range of a Virtual SSTable, the data sources of the Virtual SSTable corresponding to the key through the metadata information, and looking up the value corresponding to the key in those Real SSTables;
A Virtual SSTable merge module, for merging a Virtual SSTable during reading so that it becomes a Real SSTable, thereby improving read performance.
The real merge module includes:
11. Read the key/value pairs of each Real SSTable in turn;
12. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size;
13. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
The virtual merge module includes:
21. Collect the metadata of the Real SSTables;
22. From the key ranges of the Real SSTables, derive a new key range that covers the ranges of all the Real SSTables;
23. Divide the new key range into N parts according to the number N of Real SSTables, the N parts representing N Virtual SSTables, wherein the file size of each Virtual SSTable is set equal to the file size of a Real SSTable, the file numbers of the Virtual SSTables are allocated uniformly by the virtual merge process, and the data sources are all the Real SSTables involved in the virtual merge.
The Virtual SSTable read module includes:
41. Determine whether the key to be looked up falls within the key range of the Virtual SSTable;
42. If not, return;
43. If so, find the data sources, i.e., the Real SSTables, through the metadata of the Virtual SSTable;
44. Look up the key using the index of a Real SSTable;
45. If it is found, return the value of the key;
46. If it is not found, search the next Real SSTable, until all the Real SSTables in the data sources of the Virtual SSTable have been searched;
47. If the key is still not found, return.
The Virtual SSTable merge module includes:
51. Obtain the data sources of the Virtual SSTable, i.e., the Real SSTables;
52. Read the key/value pairs of each Real SSTable in turn;
53. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size;
54. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
As can be seen from the above scheme, the advantages of the present invention are:
1. The present invention reduces the write amplification caused by merging, thereby improving the write performance of the system; it is orthogonal to other methods for reducing LSM-Tree write amplification and can be used in combination with them;
2. The present invention reduces the amount of I/O during merging, which extends the service life of storage devices such as SSDs;
Fig. 1 shows a performance evaluation of the present invention against RocksDB. As can be seen from Fig. 1:
Under write-intensive loads, overall performance improves by 30% to 100%;
Under read-intensive loads, overall performance is roughly on par with RocksDB;
The LSM-Tree itself is mainly used for write-intensive loads.
Brief description of the drawings
Fig. 1 is a performance evaluation chart of the present invention;
Fig. 2 is an architecture diagram of the LSM-Tree;
Fig. 3 is an example of LSM-Tree compaction;
Fig. 4 shows the flow of a key-value pair during LSM-Tree compaction;
Fig. 5 is a diagram of bLSM;
Fig. 6 is a diagram of VT-Tree;
Fig. 7 is a diagram of WiscKey;
Fig. 8 is a detailed flowchart of the real merge;
Fig. 9 is a detailed flowchart of the virtual merge;
Fig. 10 is a flowchart of reading a Virtual SSTable;
Fig. 11 is a diagram of merging a Virtual SSTable;
Fig. 12 shows the effect of different VCT values on write performance;
Fig. 13 shows the effect of different VCT values on I/O volume and merge time;
Fig. 14 shows the effect of different MCT values on read performance.
Specific embodiment
The overall flow of the present invention is as follows:
1. Real merge
A real merge includes a data merge and a metadata merge, and the SSTables it produces are Real SSTables. The data merge merges SSTables, and the metadata merge includes merging information such as key range, file number, and file size. The detailed flow is shown in Fig. 8.
11. Read the key/value pairs of each Real SSTable in turn;
12. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size (2 MB by default);
13. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
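Steps 11 to 13 can be sketched in Python as below. The sketch makes several assumptions that the text does not state: tables are sorted lists of (key, value) pairs given newest first, a "qualifying" pair is taken to mean the newest version of each key, and the fixed size is expressed as an entry count (max_entries) instead of the 2 MB default. The function name real_merge is hypothetical.

```python
from heapq import merge


def real_merge(tables, max_entries):
    """Merge sorted tables (newest first) into size-limited sorted tables."""
    # Step 11: read every table; the age tag makes the newest version of a
    # key come first when keys are equal.
    streams = [[(k, age, v) for k, v in t] for age, t in enumerate(tables)]
    outputs, current, seen = [], [], set()
    # Step 12: merge-sort and keep only the qualifying (newest) entries.
    for k, _, v in merge(*streams):
        if k in seen:
            continue
        seen.add(k)
        current.append((k, v))
        # Step 13: the output table is full, so start a new one.
        if len(current) == max_entries:
            outputs.append(current)
            current = []
    if current:
        outputs.append(current)
    return outputs


merged = real_merge([[("a", 2), ("c", 2)], [("a", 1), ("b", 1), ("d", 1)]], 2)
```

All input data is read and all surviving data is rewritten, so the I/O cost of a real merge grows with the total size of the participating SSTables.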
2. Virtual merge
The SSTables produced by a virtual merge are Virtual SSTables. A virtual merge merges only metadata. The metadata merge includes merging key range, file number, file size, and so on, and recording which Real SSTables the Virtual SSTable is composed of (referred to as parentSST), i.e., the data sources of the Virtual SSTable. The detailed flow is shown in Fig. 9.
21. Collect the metadata of the Real SSTables involved in the virtual merge;
22. From the key ranges of these Real SSTables, derive a new key range that covers the ranges of all of them;
23. According to the number N of Real SSTables in the virtual merge, divide the new key range into N parts, which represent N Virtual SSTables. The file size is uniformly set to the Real SSTable size (2 MB by default), file numbers are allocated uniformly by the virtual merge process, and the data sources are all the Real SSTables involved in this virtual merge.
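Steps 21 to 23 touch metadata only, which the following Python sketch tries to make concrete. It rests on assumptions not found in the text: keys are integers so that the covering range can be split evenly, the range contains at least N distinct keys, and file numbers come from a shared counter. The class and field names (RealSSTable, VirtualSSTable, parent_ssts) are hypothetical, with parent_ssts standing in for parentSST.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class RealSSTable:
    file_number: int
    min_key: int
    max_key: int
    file_size: int = 2 * 1024 * 1024  # 2 MB default from the text


@dataclass
class VirtualSSTable:
    file_number: int
    min_key: int
    max_key: int
    file_size: int
    parent_ssts: list  # data sources (parentSST)


def virtual_merge(reals, file_numbers):
    # Steps 21-22: collect metadata and derive the covering key range.
    lo = min(t.min_key for t in reals)
    hi = max(t.max_key for t in reals)
    n = len(reals)
    span = hi - lo + 1
    virtuals = []
    # Step 23: split the range into N parts, one per Virtual SSTable.
    for i in range(n):
        start = lo + span * i // n
        end = lo + span * (i + 1) // n - 1
        virtuals.append(VirtualSSTable(
            file_number=next(file_numbers),  # allocated by the merge process
            min_key=start,
            max_key=end,
            file_size=reals[0].file_size,    # same size as a Real SSTable
            parent_ssts=list(reals),         # every Real SSTable involved
        ))
    return virtuals


reals = [RealSSTable(1, 0, 49), RealSSTable(2, 30, 99)]
virtuals = virtual_merge(reals, count(10))
```

No key/value data is read or written here; the overlapping Real SSTables stay on disk untouched and are only referenced.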
3. Reading a Real SSTable
The read flow of a Real SSTable is the same as in traditional LSM-Tree-based key-value systems such as LevelDB and RocksDB: when a key falls within the key range of a Real SSTable, the value corresponding to the key is looked up directly in it.
4. Reading a Virtual SSTable
The read flow of a Virtual SSTable: when a key falls within the key range of a Virtual SSTable, the data sources of the Virtual SSTable, i.e., the Real SSTables, are found through the metadata information, and the value corresponding to the key is looked up in those Real SSTables. The detailed flow is shown in Fig. 10.
41. Determine whether the key to be looked up falls within the key range of the Virtual SSTable;
42. If not, return;
43. If so, find its data sources, i.e., the Real SSTables, through the metadata of the Virtual SSTable;
44. Look up the key using the index of a Real SSTable;
45. If it is found, return the value of the key;
46. If it is not found, search the next Real SSTable, until all the Real SSTables in the data sources of the Virtual SSTable have been searched;
47. If the key is still not found, return;
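A minimal Python sketch of steps 41 to 47 follows. The dictionary layout, the field names key_range, parent_ssts, and index, and the function name get_from_virtual are hypothetical; the index of a Real SSTable is modeled as a plain dict, and the data sources are assumed to be ordered newest first so that the most recent version wins.

```python
def get_from_virtual(virtual, key):
    lo, hi = virtual["key_range"]
    # Steps 41-42: the key must fall within the Virtual SSTable's key range.
    if not lo <= key <= hi:
        return None
    # Step 43: the metadata lists the data sources (parentSST).
    for real in virtual["parent_ssts"]:
        # Steps 44-45: consult the Real SSTable's index (modeled as a dict).
        if key in real["index"]:
            return real["index"][key]
        # Step 46: otherwise move on to the next Real SSTable.
    # Step 47: no data source contains the key.
    return None


newer = {"index": {"b": "b2", "d": "d2"}}
older = {"index": {"a": "a1", "b": "b1"}}
virtual = {"key_range": ("a", "c"), "parent_ssts": [newer, older]}
```

In the worst case every data source is probed, which is why the number of Real SSTables behind a Virtual SSTable matters for read performance.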
5. Merging a Virtual SSTable
To reduce the impact of Virtual SSTables on read performance, a Virtual SSTable can be merged during reading so that it becomes a Real SSTable, which improves read performance. The detailed flow is shown in Fig. 11.
51. Obtain the data sources of the Virtual SSTable, i.e., the Real SSTables;
52. Read the key/value pairs of each Real SSTable in turn;
53. Merge-sort them and write the qualifying key/value pairs in order into a Real SSTable of fixed size (2 MB by default);
54. When it is full, create a new Real SSTable and continue writing the sorted key/value pairs into it, until all have been written.
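Steps 51 to 54 can be sketched as below. The sketch assumes that the data sources are listed newest first, that each source exposes its contents as a dict under a hypothetical data field, that "qualifying" means the newest version of each key, and that size is limited by entry count. Whether the output should be restricted to the Virtual SSTable's own key range is not stated in the text, so the sketch merges the data sources in full.

```python
from heapq import merge


def to_real(entries):
    """Build a Real SSTable record with fresh key-range metadata."""
    return {"kind": "real",
            "key_range": (entries[0][0], entries[-1][0]),
            "data": dict(entries)}


def merge_virtual(virtual, max_entries):
    """Rewrite the data sources of a Virtual SSTable as Real SSTables."""
    # Step 51: obtain the data sources (assumed newest first).
    parents = virtual["parent_ssts"]
    # Step 52: read the key/value pairs of each Real SSTable in key order.
    streams = [
        [(k, age, v) for k, v in sorted(p["data"].items(), key=lambda kv: kv[0])]
        for age, p in enumerate(parents)
    ]
    reals, chunk, seen = [], [], set()
    # Step 53: merge-sort, keeping the newest version of each key.
    for k, _, v in merge(*streams):
        if k in seen:
            continue
        seen.add(k)
        chunk.append((k, v))
        # Step 54: start a new Real SSTable once the current one is full.
        if len(chunk) == max_entries:
            reals.append(to_real(chunk))
            chunk = []
    if chunk:
        reals.append(to_real(chunk))
    return reals


virtual = {"kind": "virtual",
           "parent_ssts": [{"data": {"a": 2, "c": 2}},
                           {"data": {"a": 1, "b": 1, "d": 1}}]}
new_reals = merge_virtual(virtual, max_entries=2)
```

After this step a lookup needs to probe a single Real SSTable per key range instead of every data source.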
Example 1: Merge operation (combination of virtual merge and real merge)
In the present invention, when a level of the LSM-Tree reaches its threshold and a merge is needed, the detailed flow is as follows:
a. Select the SSTables to be merged;
b. Count the number of Real SSTables contained in these SSTables, denoted N;
c. If N exceeds the threshold VCT, perform a real merge;
d. If N is less than VCT, perform a virtual merge;
Here VCT is the parameter that distinguishes real merges from virtual merges, and different values of VCT affect merging differently:
The smaller VCT is, the more real merges are performed, which leads to higher I/O overhead;
The larger VCT is, the more virtual merges are performed; although this saves I/O overhead, the subsequent real merges are more expensive, which leads to longer merge times;
Fig. 12 shows the effect of different VCT values on write performance under a Write-100% load. Write performance is best when VCT=12, mainly for two reasons: the I/O saved and the total merge time. As shown in Fig. 13, VCT=12 saves a certain amount of I/O and yields the shortest merge time, so VCT=12 is optimal in this example.
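The decision in steps a to d can be sketched in Python as follows. Two points are assumptions rather than statements from the text: a Virtual SSTable is counted through the Real SSTables it references, with shared data sources counted once, and the case N = VCT, which the text leaves open, is treated as a virtual merge. The field names kind, file_number, and parent_files are hypothetical.

```python
def count_real_sstables(selected):
    """Step b: count the distinct Real SSTables behind the selection (N)."""
    seen = set()
    for table in selected:
        if table["kind"] == "real":
            seen.add(table["file_number"])
        else:
            # A Virtual SSTable contributes its data sources.
            seen.update(table["parent_files"])
    return len(seen)


def choose_merge(selected, vct=12):
    """Steps c-d: real merge if N exceeds VCT, otherwise virtual merge."""
    return "real" if count_real_sstables(selected) > vct else "virtual"


mixed = [
    {"kind": "virtual", "parent_files": [1, 2, 3]},
    {"kind": "virtual", "parent_files": [1, 2, 3]},
    {"kind": "real", "file_number": 4},
]
```

The default of 12 mirrors the value reported as best in this example; it is a tuning parameter, not a constant of the method.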
Example 2: Get operation (reading of Real SSTables and Virtual SSTables, merging of Virtual SSTables)
The basic flow is as follows:
a. Search the LSM-Tree from top to bottom;
b. For the SSTables of each level, locate the candidate SSTable through its metadata, i.e., its key range;
c. If it is a Real SSTable, search it directly;
d. If it is a Virtual SSTable, find its data sources through the parentSST of the Virtual SSTable, then search them;
e. If the key is not found, continue searching the corresponding SSTable located in this level or the next level, until the key is found or all levels have been searched;
The Get operation of the present invention may also involve merging a Virtual SSTable. The detailed flow is as follows:
a. When a Get operation involves a Virtual SSTable, two values of the current Virtual SSTable are checked: the cumulative number of reads of the Virtual SSTable (R) and the number of Real SSTables the Virtual SSTable contains (M);
b. When R>RCT && M>MCT, the Virtual SSTable is merged; otherwise it is not;
The cumulative read count of a Virtual SSTable and the number of Real SSTables it contains are used as the trigger condition for merging a Virtual SSTable for the following reasons:
The cumulative read count reflects hotness: the hotter a Virtual SSTable is, the more frequently it is read; the colder it is, the more rarely it is read;
The more Real SSTables a Virtual SSTable contains, the worse its read performance; conversely, the fewer it contains, the better its read performance;
Therefore, when R and M both satisfy the condition, the current Virtual SSTable is hot and the number of Real SSTables it contains exceeds the threshold, which would severely affect read performance, so a merge is needed. RCT defaults to 5. As for the choice of MCT, Fig. 14 shows the effect of different MCT values on read performance.
When MCT=3~5, read performance is essentially the same as RocksDB;
When MCT=7~11, read performance is worse, mainly because Virtual SSTables that have not been merged degrade read performance;
The choice of MCT therefore needs to weigh two factors: read performance and the overhead of merging. Taking both into account, MCT=5 is more suitable in this example, because 1) read performance is essentially the same as RocksDB; and 2) compared with MCT=3, fewer Virtual SSTable merges are triggered, which saves resources.
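The trigger condition R>RCT && M>MCT can be sketched in Python as follows. The function name on_virtual_read and the fields reads and parent_ssts are hypothetical; the defaults of 5 for RCT and MCT are the values discussed in this example, and counting the current read before the check is an assumption.

```python
def on_virtual_read(virtual, rct=5, mct=5):
    """Record one read and report whether the Virtual SSTable should be merged."""
    virtual["reads"] += 1            # R: cumulative number of reads
    m = len(virtual["parent_ssts"])  # M: number of Real SSTables contained
    # Merge only when the table is both hot (R > RCT) and wide (M > MCT).
    return virtual["reads"] > rct and m > mct


hot = {"reads": 0, "parent_ssts": [object()] * 6}
decisions = [on_virtual_read(hot) for _ in range(6)]
```

A Virtual SSTable with few data sources never triggers a merge under this rule, however often it is read, because probing it is already cheap.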
Claims (10)
1. A merging method for a log-structured merge tree, characterized by including:
a real merge step, which includes a data merge and a metadata merge and generates Real SSTables, wherein the data merge merges SSTables and the metadata merge includes merging key range, file number, and file size information;
a virtual merge step, which generates Virtual SSTables and merges only metadata, wherein the metadata merge includes merging key range, file number, and file size information and recording the data sources of each Virtual SSTable;
a Real SSTable read step, which reads a Real SSTable, wherein when a key falls within the key range of the Real SSTable, the value corresponding to the key is looked up directly in that Real SSTable;
a Virtual SSTable read step, wherein when a key falls within the key range of a Virtual SSTable, the data sources of the Virtual SSTable corresponding to the key are found through the metadata information, and the value corresponding to the key is looked up in those Real SSTables;
a Virtual SSTable merge step, which merges a Virtual SSTable during reading so that it becomes a Real SSTable, thereby improving read performance.
2. The merging method for a log-structured merge tree as claimed in claim 1, characterized in that the real merge step includes:
11. reading the key/value pairs of each Real SSTable in turn;
12. merge-sorting them and writing the qualifying key/value pairs in order into a Real SSTable of fixed size;
13. when it is full, creating a new Real SSTable and continuing to write the sorted key/value pairs into it, until all have been written.
3. The merging method for a log-structured merge tree as claimed in claim 1, characterized in that the virtual merge step includes:
21. collecting the metadata of the Real SSTables;
22. deriving, from the key ranges of the Real SSTables, a new key range that covers the ranges of all the Real SSTables;
23. dividing the new key range into N parts according to the number N of Real SSTables, the N parts representing N Virtual SSTables, wherein the file size of each Virtual SSTable is set equal to the file size of a Real SSTable, the file numbers of the Virtual SSTables are allocated uniformly by the virtual merge process, and the data sources are all the Real SSTables involved in the virtual merge.
4. The merging method for a log-structured merge tree as claimed in claim 1, characterized in that the Virtual SSTable read step includes:
41. determining whether the key to be looked up falls within the key range of the Virtual SSTable;
42. if not, returning;
43. if so, finding the data sources, i.e., the Real SSTables, through the metadata of the Virtual SSTable;
44. looking up the key using the index of a Real SSTable;
45. if it is found, returning the value of the key;
46. if it is not found, searching the next Real SSTable, until all the Real SSTables in the data sources of the Virtual SSTable have been searched;
47. if the key is still not found, returning.
5. The merging method for a log-structured merge tree as claimed in claim 1, characterized in that the Virtual SSTable merge step includes:
51. obtaining the data sources of the Virtual SSTable, i.e., the Real SSTables;
52. reading the key/value pairs of each Real SSTable in turn;
53. merge-sorting them and writing the qualifying key/value pairs in order into a Real SSTable of fixed size;
54. when it is full, creating a new Real SSTable and continuing to write the sorted key/value pairs into it, until all have been written.
6. A merging system for a log-structured merge tree, characterized by comprising:
a real merge module, which performs data merging and metadata merging and generates Real SSTables, wherein data merging merges the SSTables themselves, and metadata merging includes merging the key range, file number, and file size information;
a virtual merge module, for generating Virtual SSTables by merging only the metadata, wherein metadata merging includes merging the key range, file number, and file size information, and recording the data sources of the Virtual SSTable;
a Real SSTable read module, for reading the Real SSTable, wherein when a key falls within the key range of the Real SSTable, the value corresponding to the key is looked up directly on that Real SSTable;
a Virtual SSTable read module, for, when a key falls within the key range of a Virtual SSTable, finding through the metadata information the data sources of the Virtual SSTable corresponding to the key, and looking up the value corresponding to the key in those Real SSTables;
a Virtual SSTable merge module, for merging the Virtual SSTable during the read process so that the Virtual SSTable becomes Real SSTables, thereby improving read performance.
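How the read and merge modules might fit together can be sketched in a few lines of Python. This is only one possible reading: the single `Table` class, the integer keys, and the policy of materializing a Virtual SSTable on its first in-range read are assumptions of this sketch; the claim only states that the merge happens during the read process.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Table:
    smallest: int
    largest: int
    entries: Dict[int, str] = field(default_factory=dict)  # used when real
    sources: List["Table"] = field(default_factory=list)   # used when virtual

    @property
    def is_virtual(self) -> bool:
        return bool(self.sources)


def read(table: Table, key: int) -> Optional[str]:
    if not (table.smallest <= key <= table.largest):
        return None
    if not table.is_virtual:
        # Real SSTable read: look the key up directly.
        return table.entries.get(key)
    # Virtual SSTable read: search the data sources, assumed newest first.
    value = None
    for src in table.sources:
        if key in src.entries:
            value = src.entries[key]
            break
    # Virtual SSTable merge (assumed trigger: first read): copy the in-range
    # pairs, oldest source first so that newer versions overwrite older ones.
    merged: Dict[int, str] = {}
    for src in reversed(table.sources):
        for k, v in src.entries.items():
            if table.smallest <= k <= table.largest:
                merged[k] = v
    table.entries = merged
    table.sources = []  # the table is now real
    return value
```

After the first read the table no longer refers to its data sources, so later lookups take the direct path of a Real SSTable.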
7. The merging system for a log-structured merge tree as claimed in claim 6, characterized in that the real merge module performs:
11. reading the key/value pairs of each Real SSTable in turn;
12. merge-sorting them and writing the qualifying key/value pairs sequentially into a Real SSTable of fixed size;
13. once that Real SSTable is full, creating a new Real SSTable and continuing to write the sorted key/value pairs into it, until all pairs have been written.
8. The merging system for a log-structured merge tree as claimed in claim 6, characterized in that the virtual merge module performs:
21. collecting the metadata of the Real SSTables;
22. deriving, from the key ranges of the Real SSTables, a new key range that covers the ranges of all the Real SSTables;
23. dividing the new key range into N parts according to the number N of Real SSTables, the N parts representing N Virtual SSTables, wherein the file size of each Virtual SSTable is set equal to the file size of a Real SSTable, the file numbers of the Virtual SSTables are allocated uniformly by the virtual merge process, and the data source is all the Real SSTables involved in the virtual merge.
9. The merging system for a log-structured merge tree as claimed in claim 6, characterized in that the read module of the Virtual SSTable performs:
41. judging whether the key to be looked up falls within the key range of the Virtual SSTable;
42. if it does not, returning;
43. if it does, finding the data sources, i.e. the Real SSTables, through the metadata of the Virtual SSTable;
44. looking up the key according to the index of a Real SSTable;
45. if the key is found, returning the value of the key;
46. if the key is not found, searching the next Real SSTable, until all the Real SSTables included in the data source of the Virtual SSTable have been searched once;
47. if the key is still not found, returning.
10. The merging system for a log-structured merge tree as claimed in claim 6, characterized in that the merge module of the Virtual SSTable performs:
51. obtaining the data-source Real SSTables of the Virtual SSTable;
52. reading the key/value pairs of each Real SSTable in turn;
53. merge-sorting them and writing the qualifying key/value pairs sequentially into a Real SSTable of fixed size;
54. once that Real SSTable is full, creating a new Real SSTable and continuing to write the sorted key/value pairs into it, until all pairs have been written.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710047936.3A CN106844650A (en) | 2017-01-20 | 2017-01-20 | A kind of daily record merges the merging method and system of tree |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710047936.3A CN106844650A (en) | 2017-01-20 | 2017-01-20 | A kind of daily record merges the merging method and system of tree |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844650A true CN106844650A (en) | 2017-06-13 |
Family
ID=59119444
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710047936.3A Pending CN106844650A (en) | 2017-01-20 | 2017-01-20 | A kind of daily record merges the merging method and system of tree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844650A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110147359A (en) * | 2017-12-13 | 2019-08-20 | 北京奇虎科技有限公司 | A kind of increment generation method, device and a kind of data-updating method, device |
CN110532228A (en) * | 2019-09-02 | 2019-12-03 | 深圳市网心科技有限公司 | A kind of method, system, equipment and the readable storage medium storing program for executing of block chain reading data |
CN110716690A (en) * | 2018-07-12 | 2020-01-21 | 阿里巴巴集团控股有限公司 | Data recovery method and system |
CN111694992A (en) * | 2019-03-15 | 2020-09-22 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN112307016A (en) * | 2019-07-29 | 2021-02-02 | 华为技术有限公司 | Data unit merging method and device |
CN112486994A (en) * | 2020-11-30 | 2021-03-12 | 武汉大学 | Method for quickly reading data of key value storage based on log structure merging tree |
CN112527804A (en) * | 2021-01-27 | 2021-03-19 | 中智关爱通(南京)信息科技有限公司 | File storage method, file reading method and data storage system |
EP3825866A4 (en) * | 2018-08-14 | 2021-08-25 | Huawei Technologies Co., Ltd. | Partition merging method and database server |
CN116595015A (en) * | 2023-07-18 | 2023-08-15 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198150A (en) * | 2013-04-24 | 2013-07-10 | 清华大学 | Big data indexing method and system |
CN104142958A (en) * | 2013-05-10 | 2014-11-12 | 华为技术有限公司 | Storage method for data in Key-Value system and related device |
CN104809237A (en) * | 2015-05-12 | 2015-07-29 | 百度在线网络技术(北京)有限公司 | LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system |
CN104915145A (en) * | 2014-03-11 | 2015-09-16 | 华为技术有限公司 | Method and device for reducing LSM Tree writing amplification |
CN105138622A (en) * | 2015-08-14 | 2015-12-09 | 中国科学院计算技术研究所 | Append operation method for LSM tree memory system and reading and merging method for loads of append operation |
CN105159915A (en) * | 2015-07-16 | 2015-12-16 | 中国科学院计算技术研究所 | Dynamically adaptive LSM (Log-structured merge) tree combination method and system |
CN105302487A (en) * | 2015-10-20 | 2016-02-03 | 中国科学院信息工程研究所 | Flow control based treelike storage structure write amplification optimization method |
CN105468298A (en) * | 2015-11-19 | 2016-04-06 | 中国科学院信息工程研究所 | Key value storage method based on log-structured merged tree |
- 2017-01-20 CN CN201710047936.3A patent/CN106844650A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103198150A (en) * | 2013-04-24 | 2013-07-10 | 清华大学 | Big data indexing method and system |
CN104142958A (en) * | 2013-05-10 | 2014-11-12 | 华为技术有限公司 | Storage method for data in Key-Value system and related device |
CN104915145A (en) * | 2014-03-11 | 2015-09-16 | 华为技术有限公司 | Method and device for reducing LSM Tree writing amplification |
CN104809237A (en) * | 2015-05-12 | 2015-07-29 | 百度在线网络技术(北京)有限公司 | LSM-tree (The Log-Structured Merge-Tree) index optimization method and LSM-tree index optimization system |
CN105159915A (en) * | 2015-07-16 | 2015-12-16 | 中国科学院计算技术研究所 | Dynamically adaptive LSM (Log-structured merge) tree combination method and system |
CN105138622A (en) * | 2015-08-14 | 2015-12-09 | 中国科学院计算技术研究所 | Append operation method for LSM tree memory system and reading and merging method for loads of append operation |
CN105302487A (en) * | 2015-10-20 | 2016-02-03 | 中国科学院信息工程研究所 | Flow control based treelike storage structure write amplification optimization method |
CN105468298A (en) * | 2015-11-19 | 2016-04-06 | 中国科学院信息工程研究所 | Key value storage method based on log-structured merged tree |
Non-Patent Citations (2)
Title |
---|
FENGFENG PAN 等: ""dCompaction: Delayed Compaction for the LSM-Tree"", 《INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING》 * |
FENG-FENG PAN 等: ""dCompaction:Speeding up Compaction of the LSM-Tree via Delayed Compaction"", 《JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110147359A (en) * | 2017-12-13 | 2019-08-20 | 北京奇虎科技有限公司 | A kind of increment generation method, device and a kind of data-updating method, device |
CN110716690A (en) * | 2018-07-12 | 2020-01-21 | 阿里巴巴集团控股有限公司 | Data recovery method and system |
CN110716690B (en) * | 2018-07-12 | 2023-02-28 | 阿里巴巴集团控股有限公司 | Data recovery method and system |
EP3825866A4 (en) * | 2018-08-14 | 2021-08-25 | Huawei Technologies Co., Ltd. | Partition merging method and database server |
US11762881B2 (en) | 2018-08-14 | 2023-09-19 | Huawei Cloud Computing Technologies Co., Ltd. | Partition merging method and database server |
CN111694992A (en) * | 2019-03-15 | 2020-09-22 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN111694992B (en) * | 2019-03-15 | 2023-05-26 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN112307016B (en) * | 2019-07-29 | 2022-08-26 | 华为技术有限公司 | Data unit merging method and device |
CN112307016A (en) * | 2019-07-29 | 2021-02-02 | 华为技术有限公司 | Data unit merging method and device |
CN110532228A (en) * | 2019-09-02 | 2019-12-03 | 深圳市网心科技有限公司 | A kind of method, system, equipment and the readable storage medium storing program for executing of block chain reading data |
CN112486994A (en) * | 2020-11-30 | 2021-03-12 | 武汉大学 | Method for quickly reading data of key value storage based on log structure merging tree |
CN112486994B (en) * | 2020-11-30 | 2024-04-19 | 武汉大学 | Data quick reading method based on key value storage of log structure merging tree |
CN112527804A (en) * | 2021-01-27 | 2021-03-19 | 中智关爱通(南京)信息科技有限公司 | File storage method, file reading method and data storage system |
CN116595015A (en) * | 2023-07-18 | 2023-08-15 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
CN116595015B (en) * | 2023-07-18 | 2023-12-15 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106844650A (en) | A kind of daily record merges the merging method and system of tree | |
CN101937377B (en) | Data recovery method and device | |
CN110268394A (en) | KVS tree | |
US7689574B2 (en) | Index and method for extending and querying index | |
CN110383261A (en) | Stream for multithread storage device selects | |
CN110268399A (en) | Merging tree for attended operation is modified | |
CN110291518A (en) | Merge tree garbage index | |
KR100856245B1 (en) | File system device and method for saving and seeking file thereof | |
US10445022B1 (en) | Optimization of log-structured merge (LSM) tree-based databases using object solid state drive (SSD) devices | |
CN111399777A (en) | Differentiated key value data storage method based on data value classification | |
CN111026329B (en) | Key value storage system based on host management tile record disk and data processing method | |
CN103198150B (en) | A kind of large data index method and system | |
US10496612B2 (en) | Method for reliable and efficient filesystem metadata conversion | |
CN107391774A (en) | The rubbish recovering method of JFS based on data de-duplication | |
CN108959119A (en) | The method and system of garbage collection in storage system | |
CN104461388B (en) | A kind of storage array configuration is preserved and referee method | |
CN114780530A (en) | Time sequence data storage method and system based on LSM tree key value separation | |
JP4825719B2 (en) | Fast file attribute search | |
CN105068761B (en) | A kind of video interception storage method and system convenient for retrieval | |
CN106648991A (en) | Duplicated data deletion method in data recovery system | |
CN106528436B (en) | Data storage device and data maintenance method thereof | |
KR100809452B1 (en) | Methods for automatically classifying patents using computing machines and systems thereof | |
CN111324284B (en) | Memory device | |
CN113391916A (en) | Organization architecture data processing method, device, computer equipment and storage medium | |
CN106126555A (en) | A kind of file management method and file system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170613 |