CN110083601A - Index tree construction method and system for key-value storage systems - Google Patents

Info

Publication number: CN110083601A
Authority: CN (China)
Prior art keywords: key-value, tree, slot, index tree, data
Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Application number: CN201910271085.XA
Other languages: Chinese (zh)
Other versions: CN110083601B (en)
Inventors: 韩书楷, 蒋德钧, 熊劲
Current assignee: Institute of Computing Technology of CAS (the listed assignees may be inaccurate)
Original assignee: Institute of Computing Technology of CAS
Application filed by: Institute of Computing Technology of CAS
Priority to: CN201910271085.XA
Publications: CN110083601A; CN110083601B (granted)
Current status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an index tree construction method for key-value storage systems, comprising: sorting and partitioning the prefixes of the keys of key-value data to generate a dictionary tree (trie), which serves as the upper-layer structure of the index tree; constructing hash tables from the hash values of the keys, which form the lower-layer structure of the index tree; and establishing the correspondence among key-value data, hash table, and dictionary tree to generate the index tree. By indexing key-value data with an index tree built from an upper and a lower layer, the method achieves a better single-operation complexity of O(L+K), lower space overhead, and higher efficiency, and it supports range search and dynamic handling of data growth.

Description

Index tree construction method and system for key-value storage systems
Technical field
The invention belongs to the field of key-value storage and indexing technology for computer storage, and in particular relates to a method and system for constructing an index tree for key-value storage systems.
Background art
For a storage system, how to organize and index its data efficiently is a key factor in the efficiency of the system. For in-memory index design, the index types in wide use today mainly include the following:
1. B+ tree (B+tree): a multi-way search tree whose nodes may have more than two children. It stores data in sorted order and supports search, sequential read, insertion, and deletion in O(log n) time. B+ trees are commonly used in databases and file systems. The invention "A B+ tree read caching method and related device" (publication number: CN109492005A) discloses a B+ tree read caching method: it first determines the capacity of the currently available cache space and the first N layers of non-leaf B+ tree nodes that this capacity can hold, then caches all node information of those N layers, so that along every vertical path the same number of nodes is served from the cache when a leaf node is looked up. "Data query method and apparatus" (publication number: CN109299106A) provides a data query method and apparatus, comprising: receiving a query instruction sent by a user, the instruction containing at least one query condition; determining, in a set of third-level index files, the first dimension value corresponding to each query condition and the third-level index information corresponding to that first dimension value; determining, in a set of second-level index files, the second-level index information corresponding to the third-level index information; and determining the data in the database that corresponds to the second-level index information according to the second-level index information corresponding to the second dimension value. The keys of the data then no longer need to be traversed; the data can be found in the database from the third-level and second-level index information alone.
2. Hash table: as an index structure, the hash table has long been widely used in in-memory database systems. It addresses data to a pseudo-random slot of the table through a hash function, and in the ideal case its operations reach O(1) time complexity. The invention "Key-value storage system including a resource-efficient index" (publication number: CN109416694A) describes a key-value storage system that uses a resource-efficient index to interact with key-value entries in a content store. The index provides a data structure comprising multiple hash buckets, each of which contains a linked list of hash bucket units. Based on creation time, the key-value storage system stores the hash entries of each linked list of hash bucket units in a distributed manner between an in-memory index store and a secondary index store. The system is also configured to store the hash entries within a given set of linked hash bucket units in chronological order, reflecting their creation times. The index further includes various tunable parameters that affect the performance of the key-value storage system.
3. Dictionary tree (trie): an ordered tree used to store an associative array whose keys are usually character strings. Unlike a binary search tree, a key is not stored directly in a node but is determined by the node's position in the tree. All descendants of a node share the same prefix, namely the string corresponding to that node, and the root node corresponds to the empty string. In general, not every node has an associated value; only the keys corresponding to leaf nodes and some internal nodes do. The invention "An index data storage and retrieval method, device and storage medium" (publication number: CN109325032A) provides an index data storage and retrieval method, device, and storage medium. When data (i.e. key-value pairs) are stored, the method not only sorts them by the size of the value element but also divides the sorted data sequence into multiple segments, sorts the keys within each segment, and stores the data sequence together with the corresponding key sequence, so that both value elements and keys (also called record numbers) are stored in order. This constitutes a new index structure, for which a multi-condition retrieval method is also proposed: for any interval query, the result set can be expressed as the union of one or more sets, most of which are ordered, with at most the two boundary sets unordered, which improves the efficiency of AND, OR, and NOT operations across multiple query conditions.
However, each of the above prior-art structures has problems to some degree. For the B+ tree, the operation complexity is O(log n), so efficiency is not very high; moreover, the B+ tree must perform split operations to maintain its tree shape, so its concurrency is not outstanding. For the hash table, because of its random addressing, a range search can only be carried out by scanning the entire table, so range queries are extremely inefficient. In addition, as a hash table fills up, more addressing conflicts occur and more time may be spent resolving them during addressing; therefore, when the load factor reaches saturation, a larger hash table has to be created and the data migrated to it. For the dictionary tree, the scattered placement of its nodes leads to very low cache utilization during search, and the dictionary tree cannot control the growth of internal nodes well, which causes a large memory overhead.
Summary of the invention
To solve the problem of low retrieval efficiency in the storage systems described above, the invention provides an index tree with an upper-layer and a lower-layer index structure, which meets the need of key-value storage systems to retrieve key-value data efficiently.
Specifically, the index tree construction method of the invention comprises: sorting and partitioning the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of the index tree; constructing hash tables from the hash values of the keys, and generating the lower-layer structure of the index tree from the hash tables; and establishing the correspondence among key-value data, hash table, and dictionary tree to generate the index tree.
In the index tree construction method of the invention, the hash table comprises multiple hash buckets, each hash bucket comprising a signature slot, a cache slot, and an address slot arranged in sequence, wherein a signature item of the signature slot stores the hash value of the key, a cache item of the cache slot stores part of the key, and an address item of the address slot stores the storage address of the key-value data.
In the index tree construction method of the invention, each signature slot comprises N signature items, each cache slot comprises N cache items, and each address slot comprises N address items; the n-th signature item, the n-th cache item, and the n-th address item correspond to the same key-value data, where N and n are positive integers and n ≤ N.
In the index tree construction method of the invention, a signature item occupies 4 bytes of storage, a cache item occupies 4 bytes of storage, and an address item occupies 8 bytes of storage.
The invention further provides an index tree construction system for key-value storage systems, comprising: an index tree upper-layer structure generation module, configured to sort and partition the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of the index tree; an index tree lower-layer structure generation module, configured to construct hash tables from the hash values of the keys and generate the lower-layer structure of the index tree from the hash tables; and an index tree correspondence generation module, configured to establish the correspondence among key-value data, hash table, and dictionary tree to generate the index tree.
In the index tree construction system of the invention, the hash table comprises multiple hash buckets, each comprising a signature slot, a cache slot, and an address slot arranged in sequence, and the index tree lower-layer structure generation module specifically comprises: a signature slot storage module, configured to store the hash value of the key as a signature item in the signature slot; a cache slot storage module, configured to store part of the key as a cache item of the cache slot; and an address slot storage module, configured to store the storage address of the key-value data as an address item of the address slot.
In the index tree construction system of the invention, each signature slot comprises N signature items, each cache slot comprises N cache items, and each address slot comprises N address items; the n-th signature item, the n-th cache item, and the n-th address item correspond to the same key-value data, where N and n are positive integers and n ≤ N.
In the index tree construction system of the invention, a signature item occupies 4 bytes of storage, a cache item occupies 4 bytes of storage, and an address item occupies 8 bytes of storage.
The invention further provides a readable storage medium storing executable instructions for performing the index tree construction method for key-value storage systems described above.
The invention further provides a data processing device comprising the readable storage medium described above; the data processing device retrieves and executes the executable instructions in the readable storage medium to construct an index tree for a key-value storage system.
The index tree construction method for key-value storage systems proposed by the invention builds an index tree with an upper and a lower layer (Radix Hashing Tree, RH-Tree). Compared with tree index structures such as the B+ tree, whose operation complexity is O(log n), the RH-Tree has a better single-operation complexity of O(L+K); compared with a hash index, the RH-Tree supports range search and handles data growth dynamically; compared with the dictionary tree, the RH-Tree has lower space overhead and higher efficiency.
Brief description of the drawings
Fig. 1 is an overall structure diagram of the RH-Tree of the invention.
Fig. 2 is a schematic diagram of the separate storage of index entries and key-value data in the invention.
Fig. 3 is a schematic diagram of the search operation of the RH-Tree of the invention.
Fig. 4 is a flow chart of the search operation of the RH-Tree of the invention.
Fig. 5 is a flow chart of the insertion operation of the RH-Tree of the invention.
Fig. 6 is a schematic diagram of the scan operation of the RH-Tree of the invention.
Fig. 7 is a schematic diagram of the normal split of the RH-Tree of the invention.
Fig. 8 is a schematic diagram of the level split of the RH-Tree of the invention.
Fig. 9 is a schematic diagram of the uneven split of the RH-Tree of the invention.
Fig. 10 is a schematic diagram of the optimized split of the RH-Tree of the invention.
Fig. 11 is a schematic diagram of the consistency guarantee for the normal split of the invention.
Fig. 12 is a schematic diagram of the consistency guarantee for the level split of the invention.
Fig. 13 compares the PUT and GET throughput of the indexing methods with a single thread on conventional volatile memory.
Fig. 14 compares the SCAN throughput of the indexing methods with a single thread on conventional volatile memory.
Fig. 15 compares the concurrency of the RH-Tree and the B+ tree with multiple threads on conventional volatile memory.
Fig. 16 compares the PUT and GET throughput of the indexing methods with a single thread on non-volatile memory.
Fig. 17 compares the PUT throughput of the indexes with multiple threads on non-volatile memory.
Fig. 18 is a schematic diagram of the data processing device of the index tree construction system for key-value storage systems of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the index tree construction method for key-value storage systems proposed by the invention is described in further detail below with reference to the drawings. It should be understood that the specific embodiments described here merely explain the invention and do not limit it.
The invention proposes an index tree construction method for key-value storage systems. The index tree built by this method is called the RH-Tree (Radix Hashing Tree, hash index tree). Compared with tree index structures such as the B+ tree, whose operation complexity is O(log n), the RH-Tree has a better single-operation complexity of O(L+K); compared with a hash index, the RH-Tree supports range search and handles data growth dynamically; compared with the dictionary tree, the RH-Tree has lower space overhead and higher efficiency.
The index tree construction method of the invention comprises: sorting and partitioning the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of the index tree; constructing hash tables from the hash values of the keys, and generating the lower-layer structure of the index tree from the hash tables; and establishing the correspondence among key-value data, hash table, and dictionary tree to generate the index tree.
The hash table comprises multiple hash buckets. Each hash bucket contains multiple index entries, organized as a signature slot, a cache slot, and an address slot arranged in sequence, wherein a signature item of the signature slot stores the hash value of the key, a cache item of the cache slot stores part of the key, and an address item of the address slot stores the storage address of the key-value data. Each signature slot comprises N signature items, each cache slot comprises N cache items, and each address slot comprises N address items; the n-th signature item, the n-th cache item, and the n-th address item correspond to the same key-value data, where N and n are positive integers and n ≤ N. Specifically, a signature item occupies 4 bytes of storage, a cache item occupies 4 bytes, and an address item occupies 8 bytes.
The invention further provides an index tree construction system for key-value storage systems, comprising: an index tree upper-layer structure generation module, configured to sort and partition the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of the index tree; an index tree lower-layer structure generation module, configured to construct hash tables from the hash values of the keys and generate the lower-layer structure of the index tree from the hash tables; and an index tree correspondence generation module, configured to establish the correspondence among key-value data, hash table, and dictionary tree to generate the index tree.
Specifically, the index tree for key-value storage systems of the invention comprises:
I. RH-Tree structure
The index structure of the RH-Tree is divided into an upper and a lower part. A radix tree sorts and partitions the prefixes of the keys (Key) of the key-value data (Key-Value), forming a dictionary tree that constitutes the upper layer of the RH-Tree. Keys within the same range are indexed by a hash table, and all lower-layer hash tables together constitute the lower layer of the RH-Tree. In the design of the lower-layer hash table, signature slots are used to improve the cache hit rate, reduce unnecessary data accesses, and improve search efficiency.
The invention also provides a normal split method and a level split method for handling data growth. Separating the key-value data from the index structure reduces data movement; a cache slot is provided to optimize the split process of the RH-Tree; and a split optimization algorithm is proposed for unbalanced load. The split algorithms of the invention are efficient and effectively solve the problem that traditional hash tables cannot handle data growth well.
For non-volatile memory, the invention proposes a consistency guarantee algorithm based on 8-byte atomic writes, which ensures that the RH-Tree is suitable for non-volatile memory.
Specifically, the RH-Tree of the invention comprises three parts: the index tree (radix tree), the hash tables (HTable), and the storage area for the key-value data (Key-Value, KV), which holds the KV items (Key-Value item, KV item).
Fig. 1 is an overall structure diagram of the RH-Tree of the invention. As shown in Fig. 1, the RH-Tree design is divided into an upper-layer prefix-ordered structure and a lower-layer hash node structure. The ordered upper layer is a radix tree. In the lower hash layer, each leaf node of the radix tree is a hash table, and each piece of key-value data corresponds to a specific hash bucket in one of the hash tables. Each hash bucket in turn refers to KV items in the key-value data storage area (KV item area).
Each hash table contains multiple hash buckets (bucket). A hash bucket can be regarded as holding the index entries of key-value data; each bucket comprises a signature slot, a cache slot, and an address slot. Each signature slot holds 4 signature items (signature); each signature occupies 4 bytes (B) of storage (a 32-bit hash value), so a signature slot occupies 16 B. When a signature holds data, it is the hash value of the key (Key) indexed by that entry; a signature of 0 indicates that the position is empty. The cache slot stores cache items (cache), each being part of the data of one key; each cache slot stores 4 caches, each occupying 4 B (it can cache 4 B of one key), so a cache slot occupies 16 B. The role of the cache slot is explained later. An address is the storage address of the key-value data; the address slot holds 4 addresses, each occupying 8 B, so an address slot occupies 32 B in total. A bucket may also contain a different number of signature slots, cache slots, and address slots, for example 2 groups (2 signature slots, 2 cache slots, and 2 address slots) or 8 groups (8 signature slots, 8 cache slots, and 8 address slots); the invention is not limited in this respect.
Fig. 2 is a schematic diagram of the separate storage of index entries and key-value data in the invention. As shown in Fig. 2, in the RH-Tree the keys (Key) in the index are stored separately from the key-value data (KV item). A search in a lower-layer hash table proceeds as follows. The hash value first locates a bucket of the hash table (HTable); the invention takes the hash value modulo the number of buckets. For example, if the hash value of the key is 4097 and the hash table has 4096 buckets, the key is positioned in bucket 1 (4097 mod 4096 = 1). For each signature that is not 0, its value is read; if the signature matches, the cache is used to check whether the partial key is correct; if that check passes, the address pointer is followed to the key-value data, which contains the complete key; the complete key is compared last, and the result is returned if it matches.
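The bucket positioning and the signature, cache, and full-key checks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Entry`, `bucket_index`, and `probe_bucket` are hypothetical, the cache is assumed to hold the first 4 bytes of the key, and a Python list stands in for the KV item area.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    signature: int   # 32-bit hash of the key; 0 means the entry is empty
    cache: bytes     # 4 cached key bytes (assumption: the first 4 bytes)
    address: int     # position of the record in the KV item area

def bucket_index(hash_value: int, num_buckets: int) -> int:
    # Position the bucket by taking the hash modulo the number of buckets.
    return hash_value % num_buckets

def probe_bucket(bucket, kv_area, key: bytes, hash_value: int) -> Optional[bytes]:
    for entry in bucket:
        if entry.signature == 0 or entry.signature != hash_value:
            continue                       # empty entry or signature mismatch
        if entry.cache != key[:4]:
            continue                       # partial key mismatch, KV area not touched
        stored_key, value = kv_area[entry.address]
        if stored_key == key:              # final check against the complete key
            return value
    return None

kv_area = [(b"acebbaq", b"v1")]
bucket = [Entry(4097, b"aceb", 0), Entry(0, b"", 0), Entry(0, b"", 0), Entry(0, b"", 0)]
print(bucket_index(4097, 4096))                        # 1
print(probe_bucket(bucket, kv_area, b"acebbaq", 4097))  # b'v1'
```

The KV item area is only read after both the signature and the cached partial key match, which is the point of keeping those two fields next to each other in the bucket.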
With this design the size of one hash bucket is 64 B, so the RH-Tree can carry out the search of a hash bucket within a single cache line (a cache line is typically 64 B), which effectively reduces the number of memory accesses and makes full use of the cache.
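The 64 B figure follows from 4 × 4 B signatures, 4 × 4 B caches, and 4 × 8 B addresses. The sketch below checks that arithmetic with Python's `struct` module; the packed little-endian format string and the helper names `pack_bucket` and `unpack_bucket` are assumptions made for illustration, since the text does not specify a byte-level encoding.

```python
import struct

# Hypothetical packed layout of one bucket with 4 index entries:
# 4 x 4-byte signatures, 4 x 4-byte caches, 4 x 8-byte addresses.
BUCKET_FORMAT = "<4I4I4Q"
BUCKET_SIZE = struct.calcsize(BUCKET_FORMAT)

def pack_bucket(signatures, caches, addresses) -> bytes:
    """Serialize the three slots of a bucket into one fixed-size record."""
    return struct.pack(BUCKET_FORMAT, *signatures, *caches, *addresses)

def unpack_bucket(raw: bytes):
    """Split a record back into (signatures, caches, addresses)."""
    fields = struct.unpack(BUCKET_FORMAT, raw)
    return fields[0:4], fields[4:8], fields[8:12]

raw = pack_bucket([0x1001, 0, 0, 0], [0x61636562, 0, 0, 0], [0x7F00DEAD0000, 0, 0, 0])
print(BUCKET_SIZE, len(raw))  # 64 64
```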
II. Data operations
The index tree construction method for key-value storage systems of the invention includes operations on key-value data, such as search, insertion, deletion, and range search.
1. Operation 1: search
For a search operation, the prefix of the key to be operated on is first used to search the radix tree; once the corresponding lower-layer hash table has been found, the hash table's search algorithm is used to continue the search.
The search complexity of the whole RH-Tree is therefore O(L+K), where O(L) is the search complexity in the upper-layer dictionary tree structure and O(K) is the search complexity in the lower-layer hash table structure.
Fig. 3 is a schematic diagram of the search operation of the RH-Tree of the invention. As shown in Fig. 3, to search for the key "acebbaq", the radix tree is searched with "ace" as the prefix to reach the lower-layer hash table, and the storage location of the data for that key is finally found in the hash table.
Fig. 4 is a flow chart of the search operation of the RH-Tree of the invention. As shown in Fig. 4, the search operation of the RH-Tree comprises:
Step S11: search the radix tree (prefix tree) of the RH-Tree upper layer using the prefix of the key (Key) to locate a hash table (hash leaf node) of the RH-Tree lower layer;
Step S12: compute the hash of the key to obtain a 32-bit hash value hash32;
Step S13: take this hash value modulo the number of buckets of the hash table to locate the bucket to be searched;
Step S14: check the signatures in the bucket's signature slot one by one to determine whether the bucket holds the search target; if it does, go to step S15 to verify the search result; if not, the search target is not stored and the search ends;
Step S15: fetch the target key-value data via the address stored in the corresponding address slot, and compare it with the partial key stored in the corresponding cache slot to confirm the correctness of the search result.
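Steps S11 to S15 can be sketched end to end as follows. This is a simplified model under stated assumptions: `InnerNode`, `HTable`, and `search` are hypothetical names, each inner node routes on exactly one key byte, CRC32 stands in for the unspecified 32-bit hash, and the cache is assumed to hold the first 4 key bytes.

```python
import zlib

def hash32(key: bytes) -> int:
    # CRC32 is a stand-in for the unspecified 32-bit hash; 0 is reserved for "empty".
    return zlib.crc32(key) or 1

class HTable:
    """Lower-layer leaf: buckets holding (signature, cache, address) entries."""
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

class InnerNode:
    """Upper-layer radix-tree node: maps one key byte to a child node or an HTable."""
    def __init__(self):
        self.children = {}

def search(root, kv_area, key: bytes):
    # S11: walk the radix tree byte by byte until a hash table leaf is reached.
    node, depth = root, 0
    while isinstance(node, InnerNode):
        if depth >= len(key):
            return None
        node = node.children.get(key[depth])
        depth += 1
    if node is None:
        return None
    h = hash32(key)                                # S12: 32-bit hash of the key
    bucket = node.buckets[h % len(node.buckets)]   # S13: hash modulo bucket count
    for sig, cache, addr in bucket:                # S14: compare signatures one by one
        if sig == h and cache == key[:4]:
            stored_key, value = kv_area[addr]      # S15: follow the address and verify
            if stored_key == key:
                return value
    return None

# Build the path a -> c -> e -> leaf and store one record under "acebbaq".
kv_area = [(b"acebbaq", b"value-1")]
leaf = HTable()
h = hash32(b"acebbaq")
leaf.buckets[h % len(leaf.buckets)].append((h, b"aceb", 0))
root, n1, n2 = InnerNode(), InnerNode(), InnerNode()
root.children[ord("a")] = n1
n1.children[ord("c")] = n2
n2.children[ord("e")] = leaf
print(search(root, kv_area, b"acebbaq"))  # b'value-1'
```

The loop over key bytes corresponds to the O(L) part of the stated O(L+K) cost, and the probe of a single bucket corresponds to the O(K) part.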
2. Operation 2: insertion
An insertion, like a search, first uses the search algorithm to address a bucket in the hash table. It then looks among the signatures of that bucket for an empty position whose value is 0, writes the cache slot there, and finally writes the signature, which marks the completion of the insertion.
Fig. 5 is a flow chart of the insertion operation of the RH-Tree of the invention. As shown in Fig. 5, the insertion operation of the RH-Tree comprises:
Step S21: search the radix tree (prefix tree) of the RH-Tree upper layer using the prefix of the key (Key) to locate a hash table (hash leaf node) of the RH-Tree lower layer;
Step S22: compute the hash of the key to obtain a 32-bit hash value hash32;
Step S23: take this hash value modulo the number of buckets of the hash table to locate the bucket for the insertion;
Step S24: check the signatures in the bucket one by one to determine whether the bucket has a signature whose value is 0; if so, go to step S25 to perform the insertion; if not, go to step S26;
Step S25: insert the key and the key-value data, and update the signature slot, cache slot, and address slot in the bucket with the hash value of the inserted key, the partial key data, and the storage location of the key-value data;
Step S26: split the RH-Tree and return to step S21 to redo the insertion of the key-value data.
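The lower-layer part of the insertion (steps S22 to S26) can be sketched as below, assuming the leaf hash table has already been located by step S21. The names `new_table` and `insert` are hypothetical, CRC32 stands in for the unspecified hash, the cache is assumed to hold the first 4 key bytes, and the split of step S26 is only signalled by the return value rather than implemented.

```python
import zlib

SLOTS = 4  # index entries per bucket, matching the 64-byte bucket in the text

def hash32(key: bytes) -> int:
    return zlib.crc32(key) or 1   # stand-in hash; 0 is reserved for "empty"

def new_table(num_buckets: int):
    # Each entry is [signature, cache, address]; signature 0 marks a free entry.
    return [[[0, b"", 0] for _ in range(SLOTS)] for _ in range(num_buckets)]

def insert(table, kv_area, key: bytes, value: bytes) -> bool:
    """Return True on success, False if the bucket is full and a split is needed (S26).

    Updates of an existing key are not handled in this sketch.
    """
    h = hash32(key)                        # S22
    bucket = table[h % len(table)]         # S23
    for entry in bucket:                   # S24: look for a signature equal to 0
        if entry[0] == 0:
            kv_area.append((key, value))   # S25: store the record itself
            entry[2] = len(kv_area) - 1    # address of the record
            entry[1] = key[:4]             # cached partial key
            entry[0] = h                   # signature is written last
            return True
    return False

table, kv_area = new_table(1), []
results = [insert(table, kv_area, b"key%d" % i, b"v") for i in range(5)]
print(results)  # [True, True, True, True, False]
```

Writing the signature last mirrors the text: an entry becomes visible to searches only once its cache and address fields are already in place.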
3. Operation 3: deletion
As with insertion, a deletion first uses the search algorithm to determine whether the key is located in some lower-layer hash table. If the search finds a match, the corresponding key-value storage space is released and the signature field in the signature slot is set to 0.
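A deletion in one lower-layer hash table can be sketched in the same style. The function name `delete` is hypothetical, CRC32 stands in for the unspecified hash, the cache is assumed to hold the first 4 key bytes, and setting the list slot to `None` models releasing the key-value storage space.

```python
import zlib

def hash32(key: bytes) -> int:
    return zlib.crc32(key) or 1   # stand-in hash; 0 is reserved for "empty"

def delete(table, kv_area, key: bytes) -> bool:
    """Release the record of `key` and clear its signature; False if the key is absent."""
    h = hash32(key)
    bucket = table[h % len(table)]
    for entry in bucket:
        sig, cache, addr = entry
        if sig != h or cache != key[:4]:
            continue
        record = kv_area[addr]
        if record is not None and record[0] == key:
            kv_area[addr] = None   # release the key-value storage space
            entry[0] = 0           # a signature of 0 marks the entry as empty again
            return True
    return False

key = b"acebbaq"
h = hash32(key)
kv_area = [(key, b"v1")]
table = [[[0, b"", 0] for _ in range(4)] for _ in range(4)]
table[h % 4][0] = [h, key[:4], 0]
print(delete(table, kv_area, key), delete(table, kv_area, key))  # True False
```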
4. Operation 4: range search
Fig. 6 is a schematic diagram of the scan operation of the RH-Tree of the invention. As shown in Fig. 6, for a range search of the data between key 1 (Key1) "ababc" and key 2 (Key2) "cacbd", two ordinary searches first locate the boundary hash tables (HTable 1, HTable N). Because a hash table is internally unordered, the data in the two boundary hash tables cannot be guaranteed to lie entirely within the scan range, so all data in HTable 1 and HTable N must be scanned to find the items that fall within the range. For the hash tables HTable 2 to HTable N-1, however, the dictionary-tree property of the RH-Tree upper layer guarantees that their data lie between "ababc" and "cacbd", so the data in these tables need not be compared against the range one by one.
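The difference between boundary and interior hash tables can be sketched as follows. This is a simplified model: each leaf is represented as a `(prefix, keys)` pair in prefix order, the name `range_scan` is hypothetical, and a leaf whose prefix matches the start of each bound is assumed to exist (these correspond to the two ordinary searches in the text).

```python
def range_scan(leaves, lo: bytes, hi: bytes):
    """leaves: list of (prefix, keys) sorted by prefix; keys inside a leaf are unordered."""
    # Boundary leaves: the ones whose prefix matches the start of lo and of hi.
    first = next(i for i, (p, _) in enumerate(leaves) if lo.startswith(p))
    last = next(i for i, (p, _) in enumerate(leaves) if hi.startswith(p))
    result = []
    for i in range(first, last + 1):
        _, keys = leaves[i]
        if i in (first, last):
            # Boundary table: unordered inside, so every key must be checked.
            result.extend(k for k in keys if lo <= k <= hi)
        else:
            # Interior table: the prefix order guarantees all keys are in range.
            result.extend(keys)
    return sorted(result)

leaves = [
    (b"a", [b"abz", b"ababc", b"aaa", b"abb"]),
    (b"b", [b"bzz", b"bcd"]),
    (b"c", [b"cab", b"cacbd", b"cb"]),
]
print(range_scan(leaves, b"ababc", b"cacbd"))
```

With the sample data, `aaa` and `cb` are filtered out of the two boundary tables, while both keys of the interior table `b` are taken without any comparison.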
III. Handling data growth
Traditional hash tables usually cannot handle data growth well. To handle growth, a traditional hash table creates a larger new table and migrates the data of the old table into it. If the new table is too large, space is wasted; if it is too small, it fills up again quickly and the data has to be migrated once more.
To address this, when a node of the RH-Tree is full, a split operation generates new nodes. There are two kinds of split: the normal split, which does not extend the prefix path, and the level split, which does. In addition, the RH-Tree separates key-value data from index entries: an index entry stores only the address of the key-value data (the address in the bucket), so a split does not move key-value data but only the index entries, which effectively reduces the movement of key-value data.
1. Normal split
Fig. 7 is a schematic diagram of the normal split of the RH-Tree of the invention. As shown in Fig. 7, a normal split proceeds as follows:
When a hash table is full and two or more upper-layer paths point to it (for example from the internal node innernode; see the left side of Fig. 7, where the two prefix pointers a and b point to the hash table HTable1), a normal split can be performed. A new hash table HTable2 is created. Half of the pointers that originally pointed to the old hash table (HTable1) are redirected to the new hash table (HTable2); see the right side of Fig. 7, where prefix pointer a still points to the old HTable1 and prefix pointer b points to the newly created HTable2. The index entries that match the prefix of the new hash table are then moved from the old hash table to the new one.
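A normal split can be sketched as below. The model is deliberately simplified: the inner node is a dict from key byte to table, a table is a flat list of `(signature, cache, address)` entries without buckets, the cache is assumed to hold the first 4 key bytes, and `normal_split` is a hypothetical name. Which half of the pointers moves is also an assumption; the text only states that half are redirected.

```python
def normal_split(inner, old_table, depth):
    """Redirect the upper half of the pointers to a new table and move matching entries."""
    pointers = sorted(b for b, t in inner.items() if t is old_table)
    if len(pointers) < 2:
        raise ValueError("only one pointer: a level split is required instead")
    moved = set(pointers[len(pointers) // 2:])
    new_table = []
    for b in moved:
        inner[b] = new_table
    # Only index entries move; the KV records they point to stay where they are.
    keep = [e for e in old_table if e[1][depth] not in moved]
    new_table.extend(e for e in old_table if e[1][depth] in moved)
    old_table[:] = keep
    return new_table

table1 = [(11, b"appl", 0), (12, b"bana", 1), (13, b"avoc", 2)]
inner = {ord("a"): table1, ord("b"): table1}
table2 = normal_split(inner, table1, depth=0)
print([e[1] for e in table1], [e[1] for e in table2])
# [b'appl', b'avoc'] [b'bana']
```

Because the routing byte is read from the cached partial key, the split in this sketch never touches the KV item area.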
2. Level split
Fig. 8 is a schematic diagram of a level split in the RH-Tree of the present invention. As shown in Fig. 8, a level split proceeds as follows:
After a hash table becomes full, a level split is performed if only one upper-layer path (for example, from the internal node innernode) points to it. In a level split, a new internal node is first created to extend the path (see the left side of Fig. 6, where the new internal node new innernode is created); next, multiple child nodes of this new internal node (HTable1', HTable2') are created; finally, the index entries of the node innernode are moved to the new node new innernode.
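A level split can be sketched in the same style. The following is an illustrative sketch only: the four-letter alphabet, the choice of exactly two child tables, and the names `InnerNode` and `level_split` are assumptions made for the example.

```python
ALPHABET = "abcd"  # assumed tiny key alphabet, for illustration only


class HTable:
    def __init__(self, name):
        self.name = name
        self.entries = {}   # key -> address of the key-value data


class InnerNode:
    def __init__(self, depth):
        self.depth = depth  # index of the key character this node routes on
        self.children = {}  # character -> HTable or InnerNode


def level_split(parent, ch):
    """Replace the full table under parent.children[ch] with a new inner node."""
    old = parent.children[ch]
    new_inner = InnerNode(parent.depth + 1)       # extends the path by one level
    low, high = HTable("HTable1'"), HTable("HTable2'")
    half = len(ALPHABET) // 2
    for i, c in enumerate(ALPHABET):              # several pointers share one table
        new_inner.children[c] = low if i < half else high
    for key, addr in old.entries.items():         # move index entries only
        new_inner.children[key[new_inner.depth]].entries[key] = addr
    parent.children[ch] = new_inner
    return new_inner


root = InnerNode(depth=0)
full = HTable("HTable")
full.entries = {"aa": 1, "ab": 2, "ac": 3, "ad": 4}
root.children["a"] = full
node = level_split(root, "a")
```

Because several pointers of the new internal node share each child table, the children can later be split again with normal splits.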
3. Improving split efficiency with the cache slot
As can be seen, both normal splits and level splits need the key to decide which node an index entry should be assigned to, so every split involves access to the key-value data. Because the present invention separates the index from the key-value data, accessing the data on every split would add substantial memory-access overhead. As shown in Fig. 1, the hash bucket of the RH-Tree of the present invention is therefore designed to hold a 4 B data cache for each key (storing 4 B of the key). As level splits proceed and the upper-layer path keeps growing, the cache slot must be updated. With this design, an internal node needs to access the key-value data to update the cache slot only once every 4 level splits, which greatly reduces memory accesses during splitting, so that most lookups in a split can be completed in the cache.
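The effect of the 4 B cache can be illustrated with a small Python sketch. Several details here are assumptions rather than statements from the text: the entry is packed as 4 B signature, 4 B cached key bytes, and 8 B address; each level consumes one key byte; CRC32 stands in for the hash function; and the names `make_entry` and `routing_byte` are hypothetical.

```python
import struct
import zlib

# 4 B signature, 4 B cached key bytes, 8 B address (little-endian, no padding)
ENTRY = struct.Struct("<I4sQ")


def make_entry(key: bytes, address: int, cache_base: int) -> bytes:
    """Build an index entry caching key[cache_base:cache_base + 4]."""
    signature = zlib.crc32(key) or 1   # 0 is reserved for "empty slot"
    cached = key[cache_base:cache_base + 4].ljust(4, b"\0")
    return ENTRY.pack(signature, cached, address)


def routing_byte(entry: bytes, depth: int, cache_base: int):
    """Return the key byte used at `depth`, or None if the cache must be refreshed."""
    _, cached, _ = ENTRY.unpack(entry)
    offset = depth - cache_base
    if 0 <= offset < 4:
        return cached[offset]          # served from the index entry alone
    return None                        # the key-value data must be read once


entry = make_entry(b"user:1042", 0x1000, cache_base=0)
first = routing_byte(entry, depth=0, cache_base=0)
fourth = routing_byte(entry, depth=3, cache_base=0)
miss = routing_byte(entry, depth=4, cache_base=0)

# After four levels the cache window is refreshed from the key-value data.
refreshed = make_entry(b"user:1042", 0x1000, cache_base=4)
fifth = routing_byte(refreshed, depth=4, cache_base=4)
```

In this sketch, four consecutive routing decisions are answered from the cached bytes, and only the fifth requires reading the key to refill the cache window.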
4. Handling dense key-value pairs effectively
Fig. 9 is a schematic diagram of an unbalanced split in the RH-Tree of the present invention. As shown in Fig. 9, the upper layer of RH-Tree is addressed by key prefix in lexicographic order, and splits also follow lexicographic order. After a split, the data may therefore be distributed unevenly between the two new nodes. For example, when HTable0 undergoes a normal split, its data is written into HTable1 and HTable2, but according to the key distribution more than 80% of the data ends up in HTable1. As a result, another split is triggered almost immediately and the data has to be moved again, so the data is moved multiple times.
Fig. 10 is a schematic diagram of an optimized split in the RH-Tree of the present invention. As shown in Fig. 10, each split first checks whether the current split would cause more than 80% of the data to be written to one node. If so, the next split (or splits) is evaluated to determine whether it would also leave the new nodes unevenly loaded. For example, if a normal split would make the writes to HTable1 unbalanced, RH-Tree checks whether the following split, here a level split, would also cause a write imbalance. Since that split does not leave the new nodes unevenly loaded, RH-Tree builds the nodes according to the new split layout and then moves the data.
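The look-ahead check can be sketched as follows. This is a simplified model under assumptions: keys are routed by one character per level, every split partitions keys into the same character groups, and the helper names `partition` and `splits_needed` are hypothetical. Only the 80% threshold comes from the text.

```python
def partition(keys, depth, groups):
    """Distribute keys over the new nodes by the character at `depth`."""
    return [[k for k in keys if k[depth] in g] for g in groups]


def splits_needed(keys, depth, groups, threshold=0.8, limit=4):
    """Count successive splits until no new node receives more than
    `threshold` of the keys being split (bounded by `limit`)."""
    for n in range(1, limit + 1):
        heavy = max(partition(keys, depth, groups), key=len)
        if len(heavy) <= threshold * len(keys):
            return n
        keys, depth = heavy, depth + 1   # simulate the follow-up split
    return limit


groups = [set("ab"), set("cd")]
skewed_keys = ["aa1", "ab2", "ac3", "ad4", "bc5", "ca6"]
even_keys = ["aa", "ba", "ca", "da"]
skewed_plan = splits_needed(skewed_keys, 0, groups)
even_plan = splits_needed(even_keys, 0, groups)
```

A result greater than 1 means the first split alone would overload one node, so the nodes can be built directly for the later layout and the index entries moved once.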
5. Consistency guarantees
(1) Consistency guarantee for insert operations
A. Allocate storage space for the key-value data, write the key-value data, and persist it.
B. Use an atomic write and synchronization primitives (mfence/clflush) to modify the address entry in the index entry so that it points to the storage address of the key-value data. If a crash occurs at this point, the signature of the slot is still 0, indicating that the slot is still empty.
C. Use an atomic write and synchronization primitives (mfence/clflush) to modify the signature in the hash entry, indicating that the write is complete.
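The insert ordering can be simulated in Python to show why a crash before step C leaves the slot empty. This is only a simulation under assumptions: plain assignments stand in for atomic 8-byte writes, `persist()` is a placeholder for the flush and fence, CRC32 stands in for the signature hash, and all names are hypothetical.

```python
import zlib


class Crash(Exception):
    """Simulated power failure."""


class Slot:
    def __init__(self):
        self.signature = 0    # 0 marks an empty slot
        self.address = None


def persist():
    """Stand-in for a cache-line flush followed by a fence (clflush + mfence)."""


def signature_of(key):
    return zlib.crc32(key) or 1   # never 0, which is reserved for "empty"


def insert(slot, heap, key, value, crash_after=None):
    heap.append((key, value))     # step A: write the key-value data ...
    persist()                     # ... and persist it
    if crash_after == "A":
        raise Crash
    slot.address = len(heap) - 1  # step B: write the address
    persist()
    if crash_after == "B":
        raise Crash
    slot.signature = signature_of(key)   # step C: publish the entry
    persist()


def lookup(slot, heap, key):
    if slot.signature != signature_of(key):
        return None               # empty slot or a different key
    stored_key, value = heap[slot.address]
    return value if stored_key == key else None


heap, slot = [], Slot()
try:
    insert(slot, heap, b"k1", b"v1", crash_after="B")
except Crash:
    pass
after_crash = lookup(slot, heap, b"k1")   # slot still reads as empty
insert(slot, heap, b"k1", b"v1")
after_retry = lookup(slot, heap, b"k1")
```

The interrupted attempt leaves an unreferenced block in `heap`, which corresponds to the leaked space mentioned for crashes after allocation.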
(2) Consistency guarantee for update operations
A. Allocate storage space for the key-value data, write the key-value data, and persist it. If a crash occurs at this point, a memory leak can occur, but this problem is outside the scope of our study.
B. Use an atomic write and synchronization primitives (mfence/clflush) to modify the address in the index entry so that it points to the storage address of the key-value data. If a crash occurs at this point, the signature of the slot is 0, indicating that the slot is still empty.
C. Use an atomic write and synchronization primitives (mfence/clflush) to modify the signature in the index entry, indicating that the write is complete.
D. Release the storage space of the old data.
(3) Consistency guarantee for delete operations
A. Use an atomic write and synchronization primitives (mfence/clflush) to set the signature slot in the hash entry to 0. If a crash occurs at this point, the delete operation has succeeded, but a memory leak can occur.
B. Release the space that stores the key-value data.
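The update and delete orderings can be sketched in the same simulated style. The toy allocator `Heap`, the `Slot` class, and the `crash_before_free` flag are hypothetical and exist only to make the leak on crash visible; plain assignments stand in for atomic writes followed by flush and fence.

```python
class Heap:
    """Toy allocator for key-value blocks (hypothetical)."""

    def __init__(self):
        self.blocks = {}
        self.next_addr = 1

    def alloc(self, data):
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr

    def free(self, addr):
        del self.blocks[addr]


class Slot:
    def __init__(self, signature, address):
        self.signature = signature
        self.address = address


def update(slot, heap, key, value, signature):
    new_addr = heap.alloc((key, value))  # A: write and persist the new data
    old_addr = slot.address
    slot.address = new_addr              # B: address write, then flush/fence
    slot.signature = signature           # C: signature write, then flush/fence
    heap.free(old_addr)                  # D: release the old data


def delete(slot, heap, crash_before_free=False):
    addr = slot.address
    slot.signature = 0                   # A: entry is logically deleted from here on
    if crash_before_free:
        return                           # simulated crash: block is never freed
    heap.free(addr)                      # B: release the key-value storage


heap = Heap()
slot = Slot(signature=7, address=heap.alloc((b"k", b"old")))
update(slot, heap, b"k", b"new", signature=7)
live_after_update = dict(heap.blocks)
delete(slot, heap, crash_before_free=True)
leaked = dict(heap.blocks)
```

In this run the delete is already visible (signature 0) even though the block was never released, which is the leak case described in step A of the delete operation.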
(4) Consistency guarantee for normal split operations
Figs. 11A, 11B, and 11C are schematic diagrams of the consistency guarantee for a normal split of the present invention. For split operations, RH-Tree uses an atomic failure pointer to guarantee consistency during the split. The atomic failure pointer points from the node being split to the node that receives the moved data. When a split is performed, this pointer links the old and new nodes before any data is moved, ensuring that the new node can be found through the pointer if a crash occurs during the split. A split proceeds as follows:
In the initial state, the upper-layer Radix Tree pointer points to the hash node (hash table HTable in Fig. 11A);
A new hash node is allocated (hash table new HTable in Fig. 11B), and the atomic failure pointer of the node to be split (hash table HTable) is set to point to the new hash table new HTable (arrow 1 in Fig. 11B); the index entries are moved from the old node to the new node and persisted (arrow 2 in Fig. 11B);
The upper-layer Radix Tree pointer is modified (arrow 3 in Fig. 11C) to point to the new hash node (hash table new HTable in Fig. 11C); the atomic failure pointer is cleared, indicating that the split is complete.
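The role of the atomic failure pointer can be illustrated with a simulated crash at each arrow. The recovery policy shown here (rolling the split forward by re-running idempotent steps) is an assumption made for the sketch; the text only states that the new node can be found through the pointer. The names `split`, `recover`, and `run` are hypothetical.

```python
class HTable:
    def __init__(self, name):
        self.name = name
        self.entries = {}         # key -> address of the key-value data
        self.failure_ptr = None   # set while a split of this table is in flight


def split(root, old, new, prefix, stop_after=None):
    """Move entries starting with `prefix` from `old` to `new`.

    `stop_after` simulates a crash after step 1, 2 or 3. Every step is
    idempotent, so recovery can simply run the split again.
    """
    old.failure_ptr = new                      # arrow 1: link old -> new
    if stop_after == 1:
        return
    for key, addr in old.entries.items():      # arrow 2: copy index entries
        if key.startswith(prefix):
            new.entries[key] = addr
    if stop_after == 2:
        return
    root[prefix] = new                         # arrow 3: swing the upper pointer
    if stop_after == 3:
        return
    for key in list(new.entries):              # drop the moved entries from old
        old.entries.pop(key, None)
    old.failure_ptr = None                     # split complete


def recover(root, old, prefix):
    """Assumed roll-forward recovery: finish a split found via the failure pointer."""
    if old.failure_ptr is not None:
        split(root, old, old.failure_ptr, prefix)


def run(stop_after):
    old, new = HTable("HTable"), HTable("new HTable")
    old.entries = {"ax": 1, "ay": 2, "bx": 3}
    root = {"a": old, "b": old}
    split(root, old, new, "b", stop_after)
    recover(root, old, "b")
    return (root["a"].name, root["b"].name,
            sorted(old.entries), sorted(new.entries), old.failure_ptr)


results = [run(stop) for stop in (None, 1, 2, 3)]
```

Whichever step the simulated crash interrupts, the state after `recover` matches that of an uninterrupted split.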
(5) Consistency guarantee for level split operations
Fig. 12 is a schematic diagram of the consistency guarantee for a level split of the present invention. As shown in Fig. 12, similar to a normal split, a level split in RH-Tree also uses atomic failure pointers to link the nodes being split, so that recovery is possible if a crash occurs during the split. The steps are as follows:
Allocate the new hash nodes and link them to each other with atomic failure pointers;
Move the data;
Allocate a new internal node and point its pointers to the new hash nodes;
Change the pointer of the old internal node to point to the new internal node;
Delete the atomic failure pointers of all nodes.
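The five steps can be simulated to check one property of this ordering: the old table stays reachable until the parent pointer is swung. In this sketch (an assumption, not a statement from the text) the old table keeps its entries during the split, so lookups return the right address at every simulated crash point. The split boundary at the letter "n" and all names are hypothetical.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"


class Node:
    def __init__(self, name):
        self.name = name
        self.entries = {}        # used by hash nodes: key -> address
        self.children = {}       # used by internal nodes: character -> Node
        self.failure_ptrs = []   # links to the nodes of an in-flight split


def find(node, key, depth=0):
    """Follow one key character per internal level, then probe the hash node."""
    while node.children:
        node = node.children[key[depth]]
        depth += 1
    return node.entries.get(key)


def level_split(parent, ch, stop_after=None):
    old = parent.children[ch]
    low, high = Node("HTable1'"), Node("HTable2'")
    old.failure_ptrs = [low, high]             # step 1: allocate and link
    if stop_after == 1:
        return
    for key, addr in old.entries.items():      # step 2: move (copy) the data
        (low if key[1] < "n" else high).entries[key] = addr
    if stop_after == 2:
        return
    inner = Node("new innernode")              # step 3: new internal node
    inner.children = {c: (low if c < "n" else high) for c in LETTERS}
    if stop_after == 3:
        return
    parent.children[ch] = inner                # step 4: swing the old pointer
    if stop_after == 4:
        return
    old.failure_ptrs = []                      # step 5: clear failure pointers


def check(stop_after):
    root, table = Node("innernode"), Node("HTable")
    table.entries = {"ab": 1, "az": 2}
    root.children["a"] = table
    level_split(root, "a", stop_after)
    return find(root, "ab"), find(root, "az")


outcomes = [check(stop) for stop in (1, 2, 3, 4, None)]
```

Before step 4 every lookup is served by the old table; from step 4 onward it is served through the new internal node.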
IV. Evaluation
The evaluation environment is as follows:
Processor: Intel Xeon E5-2620 v3
Memory: 96 GB
Operating system: CentOS 7.0, Linux kernel 4.3.0
Memory management library: jemalloc
Test set: the evaluation uses 100M data items (100,000,000 key-value items); the key size is 32 B and the key-value size is 128 B.
1. Evaluation on traditional volatile memory
RH-Tree is compared with traditional indexing methods such as the B+ tree (B+-Tree), the red-black tree (Red-Black Tree), Adaptive Radix Tree, and Level Hashing.
(1) Single-thread operation efficiency
Fig. 13 compares the PUT and GET throughput of each indexing method on traditional volatile memory with a single thread. As shown in Fig. 13, the PUT throughput of RH-Tree is 7.27 times, 7.13 times, 3.71 times, and 1.74 times that of the compared indexes (B+-Tree, Level Hashing, Adaptive Radix Tree). The GET throughput of RH-Tree is 5.3 times, 9.6 times, 6.05 times, and 1.71 times that of Red-Black Tree, Level Hashing, Adaptive Radix Tree, and B+-Tree, respectively.
(2) Scan operation efficiency
Fig. 14 compares the SCAN throughput of the indexing methods on traditional volatile memory with a single thread. As shown in Fig. 14, RH-Tree is compared here with the B+ tree index, whose search performance is relatively good. The SCAN performance of RH-Tree is about 80% of that of the traditional B+ tree, because of the overhead incurred when a scan reaches the boundary of a hash table.
(3) Multi-thread concurrency efficiency
Fig. 15 compares the concurrency of RH-Tree and the B+ tree on traditional volatile memory with multiple threads. As shown in Fig. 15, with 16 threads the throughput of RH-Tree is 6.2 times that of B+-Tree. Between 12 and 16 threads, throughput rises slowly or even declines, mainly because one physical processor of the test machine supports at most 12 threads; owing to the NUMA architecture, runs with 13 or more threads incur remote memory accesses, which increases memory-access latency.
2. Evaluation on non-volatile memory
The indexes compared here are FP-Tree, FAST-FAIR Tree, Hybrid Index, and Level Hashing.
(1) Single-thread operation efficiency
Fig. 16 compares the PUT and GET throughput of each indexing method on non-volatile memory with a single thread. As shown in Fig. 16, the PUT throughput of RH-Tree is 1.86 times, 1.53 times, and 1.44 times that of FPTree, FAST-FAIR, and Level Hashing, respectively. The GET throughput of RH-Tree is 5.32 times, 4.79 times, and 2.4 times that of FPTree, FAST-FAIR, and Level Hashing, respectively.
(2) Multi-thread concurrency efficiency
Fig. 17 compares the PUT throughput of each index on non-volatile memory with multiple threads. As shown in Fig. 17, RH-Tree has excellent concurrent performance, reaching 4 times the performance of the FP-Tree index at 12 threads. Because the experiment used only up to 12 threads, the disadvantages of FAST-FAIR and HiKV are not yet apparent; even so, the performance of RH-Tree reaches 1.5 times theirs.
Fig. 18 is a schematic diagram of the data processing apparatus of the index tree construction system for a key-value storage system of the present invention. As shown in Fig. 18, embodiments of the present invention also provide a readable storage medium and a data processing apparatus. The readable storage medium of the present invention stores computer-executable instructions which, when executed by the processor of the data processing apparatus, implement the index tree construction for a key-value storage system described above. Those of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware (for example, a processor), and that the program may be stored in a readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. All or some of the steps of the above embodiments may also be implemented with one or more integrated circuits. Accordingly, each module in the above embodiments may be implemented in hardware, for example by an integrated circuit that realizes its function, or as a software function module, for example by a processor executing programs/instructions stored in a memory. Embodiments of the present invention are not limited to any particular combination of hardware and software.
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Any person of ordinary skill in the art may make modifications and improvements without departing from the spirit and scope of the present invention; the scope of protection of the present invention is therefore defined by the appended claims.

Claims (10)

1. An index tree construction method for a key-value storage system, characterized by comprising:
sorting and dividing the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of an index tree;
constructing a hash table from the hash values of the keys, and generating the lower-layer structure of the index tree from the hash table;
establishing the correspondence among the key-value data, the hash table, and the dictionary tree to generate the index tree.
2. The index tree construction method according to claim 1, characterized in that the hash table comprises:
a plurality of hash buckets, each hash bucket comprising an identification slot, a cache slot, and an address slot arranged in sequence, wherein an identification entry of the identification slot stores the hash value of the key, a cache entry of the cache slot stores part of the key, and an address entry of the address slot stores the storage address of the key-value data.
3. The index tree construction method according to claim 2, characterized in that each identification slot comprises N identification entries, each cache slot comprises N cache entries, and each address slot comprises N address entries; the n-th identification entry, the n-th cache entry, and the n-th address entry correspond to the same key; where N and n are positive integers and n ≤ N.
4. The index tree construction method according to claim 3, characterized in that the identification entry occupies 4 bytes of storage space, the cache entry occupies 4 bytes of storage space, and the address entry occupies 8 bytes of storage space.
5. An index tree construction system for a key-value storage system, characterized by comprising:
an index tree upper-layer structure generation module, configured to sort and divide the prefixes of the keys of key-value data to generate a dictionary tree as the upper-layer structure of an index tree;
an index tree lower-layer structure generation module, configured to construct a hash table from the hash values of the keys and to generate the lower-layer structure of the index tree from the hash table;
an index tree correspondence generation module, configured to establish the correspondence among the key-value data, the hash table, and the dictionary tree to generate the index tree.
6. The index tree construction system according to claim 5, characterized in that the hash table comprises a plurality of hash buckets, each hash bucket comprising an identification slot, a cache slot, and an address slot arranged in sequence, and specifically comprising:
an identification slot storage module, configured to store the hash value of the key as an identification entry in the identification slot;
a cache slot storage module, configured to store part of the key as a cache entry of the cache slot;
an address slot storage module, configured to store the storage address of the key-value data as an address entry of the address slot.
7. The index tree construction system according to claim 6, characterized in that each identification slot comprises N identification entries, each cache slot comprises N cache entries, and each address slot comprises N address entries; the n-th identification entry, the n-th cache entry, and the n-th address entry correspond to the same key-value data; where N and n are positive integers and n ≤ N.
8. The index tree construction system according to claim 6, characterized in that the identification entry occupies 4 bytes of storage space, the cache entry occupies 4 bytes of storage space, and the address entry occupies 8 bytes of storage space.
9. A readable storage medium storing executable instructions, the executable instructions being used to execute the index tree construction method for a key-value storage system according to any one of claims 1 to 4.
10. A data processing apparatus comprising the readable storage medium according to claim 9, wherein the data processing apparatus retrieves and executes the executable instructions in the readable storage medium to construct an index tree for a key-value storage system.
CN201910271085.XA 2019-04-04 2019-04-04 Key value storage system-oriented index tree construction method and system Active CN110083601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910271085.XA CN110083601B (en) 2019-04-04 2019-04-04 Key value storage system-oriented index tree construction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910271085.XA CN110083601B (en) 2019-04-04 2019-04-04 Key value storage system-oriented index tree construction method and system

Publications (2)

Publication Number Publication Date
CN110083601A true CN110083601A (en) 2019-08-02
CN110083601B CN110083601B (en) 2021-11-30

Family

ID=67414358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910271085.XA Active CN110083601B (en) 2019-04-04 2019-04-04 Key value storage system-oriented index tree construction method and system

Country Status (1)

Country Link
CN (1) CN110083601B (en)

Also Published As

Publication number Publication date
CN110083601B (en) 2021-11-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant