WO2020082597A1 - Method and device for batch insertion and deletion of B+ tree nodes - Google Patents

Method and device for batch insertion and deletion of B+ tree nodes

Info

Publication number
WO2020082597A1
Authority
WO
WIPO (PCT)
Prior art keywords
tree
key
value
node
disk
Prior art date
Application number
PCT/CN2018/124696
Other languages
English (en)
French (fr)
Inventor
刘丹
邹虎
何孝金
Original Assignee
郑州云海信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 郑州云海信息技术有限公司
Publication of WO2020082597A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices

Definitions

  • The invention relates to the field of computer technology, and in particular to a method and device for batch insertion and deletion of B+ tree nodes.
  • The B+ tree is a balanced search tree designed for disks or other direct-access auxiliary storage devices. It is commonly used in databases and in operating-system file systems. For example, file systems such as NTFS, ReiserFS, NSS, XFS, JFS, ReFS, and BFS use B+ trees as metadata indexes.
  • A B+ tree keeps its data stably ordered, and its insertion and modification operations have a relatively stable logarithmic time complexity.
  • B+ tree elements are inserted from the bottom up.
  • This application provides a method and device for batch insertion and deletion of B+ tree nodes.
  • The specific technical solutions are as follows:
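  • In a first aspect, the present application provides a method for batch insertion and deletion of B+ tree nodes. The method includes: extracting a first key-value from an in-memory B+ tree node and a second key-value from an on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree; starting from the positions of the first and second key-values, comparing, in order and one by one, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain a comparison result; and merging the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree.
  • Extracting the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node includes: setting the leftmost leaf node of the in-memory B+ tree as its cursor position, with a cursor value of 0; extracting the smallest key-value from the node at the in-memory cursor as the first key-value; looking up, according to the key-value the in-memory cursor points to, the corresponding key-value and its node in the on-disk B+ tree, and setting the position of that key-value as the cursor position of the on-disk B+ tree; and extracting the smallest key-value from the node at the on-disk cursor as the second key-value.
  • The operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort (revocation).
  • After the ordered set of all key-values in the in-memory B+ tree has been compared one by one with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree and the comparison result has been obtained, the method further includes obtaining, according to the comparison result, the changed nodes and the unchanged nodes in the on-disk B+ tree.
  • Correspondingly, merging all key-values in the in-memory B+ tree into the on-disk B+ tree to generate a new on-disk B+ tree includes:
  • merging the key-values in the in-memory B+ tree into the on-disk B+ tree in turn, according to the result of comparing them with the key-values in the changed nodes of the on-disk B+ tree and according to their operation types, to generate a new on-disk B+ tree;
  • using the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
  • Merging the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes includes: if the operation type of the current in-memory key-value is delete or abort, skipping it and moving the in-memory cursor and the on-disk cursor from left to right to compare the next key-values; or, if the operation type is insert, selecting the current in-memory key-value, inserting it into a new node, and then moving both cursors from the bottom up to compare the next key-values; and, once both cursors have finished traversing, generating the new on-disk B+ tree and updating its root node.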
  • In a second aspect, the present application provides a device for batch insertion and deletion of B+ tree nodes.
  • The device includes:
  • an extraction unit, used to extract the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree;
  • a comparison unit, used to compare, in order and one by one starting from the positions of the first and second key-values, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain the comparison result;
  • a merging unit, used to merge the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree and thereby realize batch insertion and deletion of B+ tree nodes.
  • The extraction unit includes:
  • a first extraction subunit, used to extract the smallest key-value from the node at the cursor of the in-memory B+ tree as the first key-value;
  • a query subunit, used to look up, according to the key-value the in-memory cursor points to, the corresponding key-value and its node in the on-disk B+ tree, and to set the position of that key-value as the cursor position of the on-disk B+ tree;
  • a second extraction subunit, used to extract the smallest key-value from the node at the cursor of the on-disk B+ tree as the second key-value.
  • The operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort (revocation).
  • The device further includes:
  • an obtaining unit, used to obtain the changed nodes and the unchanged nodes in the on-disk B+ tree according to the comparison result.
  • Correspondingly, the merging unit includes:
  • a first merging subunit, used to merge the key-values in the in-memory B+ tree into the on-disk B+ tree in turn, according to the result of comparing them with the key-values in the changed nodes of the on-disk B+ tree and according to their operation types, to generate a new on-disk B+ tree;
  • a second merging subunit, used to take the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
  • The merging unit includes:
  • a skip subunit, used, if the operation type of the current in-memory key-value is delete or abort, to skip that key-value and move the in-memory cursor and the on-disk cursor from left to right, so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node;
  • a selection subunit, used, if the operation type of the current in-memory key-value is insert, to select that key-value, insert it into a new node, and then move the in-memory cursor and the on-disk cursor from the bottom up, so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node;
  • a third merging subunit, used, once the cursors of both the in-memory B+ tree and the on-disk B+ tree have finished traversing, to generate the new on-disk B+ tree and update its root node.
  • In the method provided by this application, the first key-value is first extracted from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree. Then, starting from the positions of the first and second key-values, the ordered set of all key-values in the in-memory B+ tree is compared one by one with the ordered set of all key-values after the second key-value in the on-disk B+ tree to obtain a comparison result. Finally, according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, the two trees are merged to generate a new on-disk B+ tree, realizing batch insertion and deletion of B+ tree nodes.
  • This application therefore generates a new on-disk B+ tree by repeatedly extracting the smallest node from the in-memory B+ tree and the on-disk B+ tree, instead of inserting into and deleting from the original B+ tree in the traditional way.
  • As a result, the in-memory B+ tree and the on-disk B+ tree can be merged directly.
  • This method can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, and reduce memory overhead.
  • FIG. 1 is a schematic flowchart of a method for batch insertion and deletion of B+ tree nodes provided by an embodiment of the present application;
  • FIG. 2 is a flowchart of extracting the first key-value and the second key-value provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of finding the changed nodes in the on-disk B+ tree provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of generating a new on-disk B+ tree from the bottom up provided by an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a device for batch insertion and deletion of B+ tree nodes provided by an embodiment of the present application.
  • This application proposes a method and device for batch insertion and deletion of B+ tree nodes that realizes batch insertion and deletion while avoiding the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, thereby reducing memory overhead.
  • FIG. 1 shows a flowchart of a method for batch insertion and deletion of B+ tree nodes provided by an embodiment of the present application.
  • This embodiment may include the following steps:
  • S101: The in-memory B+ tree and the on-disk B+ tree are first treated as the two sources to be merged. Then, each time, the smallest node is taken from these two sources; that is, the first key-value is extracted from the in-memory B+ tree node and the second key-value is extracted from the on-disk B+ tree node.
  • The first key-value is the smallest key-value in the in-memory B+ tree.
  • The second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree.
  • Step S101 may specifically include steps S201-S204:
  • S201: Set the leftmost leaf node of the in-memory B+ tree as its cursor position, with a cursor value of 0. Each node of the in-memory B+ tree stores multiple entries; for example, each node may contain 512 entries, and each entry corresponds to one key-value pair, so each node may contain 512 key-values.
  • The leftmost leaf node of the in-memory B+ tree can therefore be set as its cursor position, with the cursor value set to 0.
  • S202: Because the data in a B+ tree is arranged from left to right in ascending order, the smallest key-value in the node at the in-memory cursor position (the leftmost leaf node), that is, its leftmost key-value, can be extracted and used as the first key-value.
  • S203: According to the key-value the cursor of the in-memory B+ tree points to, look up the corresponding key-value and its node in the on-disk B+ tree, and set the position of that key-value as the cursor position of the on-disk B+ tree.
  • That is, the key-value at the in-memory cursor position (the first key-value) is used to look up the corresponding key-value and its node in the on-disk B+ tree.
  • Each node of the on-disk B+ tree can also store multiple entries.
  • For example, each node may also contain 512 entries, and each entry corresponds to one key-value pair, so each node of the on-disk B+ tree may contain 512 key-values.
  • S204: After step S203 has located the cursor position in the on-disk B+ tree, and because the node data of the on-disk B+ tree is also arranged from left to right in ascending order, the nodes to the left of the cursor position (excluding the node at the cursor position) can be retained without any insertion or deletion and used directly as the left-hand nodes of the new on-disk B+ tree produced by merging the in-memory B+ tree and the on-disk B+ tree.
  • Correspondingly, the smallest key-value to be inserted or deleted can be extracted from the node at the cursor position of the on-disk B+ tree and used as the second key-value.
  • S102: Starting from the positions of the first key-value and the second key-value, compare, in order and one by one, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain the comparison result.
  • That is, after step S101 has extracted the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, the two ordered sets are compared one by one in order (from left to right and from the bottom up), starting from the positions of the first and second key-values, to obtain the comparison result.
  • S103: After step S102 has compared, from left to right and from the bottom up, all key-values in the in-memory B+ tree with the key-values to be inserted or deleted in the on-disk B+ tree, the two trees are merged.
  • The merge is driven by the comparison result obtained in step S102.
  • It is also driven by the operation type of the corresponding key-value in the in-memory B+ tree node.
  • All key-values in the in-memory B+ tree are merged with the on-disk B+ tree to generate a new on-disk B+ tree, realizing batch insertion and deletion of B+ tree nodes. The new on-disk B+ tree produced by the merge stores only unique key-value information, which can improve query efficiency.
  • In step S103, the operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort (revocation).
  • The abort type means that when a key-value pair of type insert already exists and a delete operation is then performed on it, the key-value pair is marked as abort.
  • The in-memory B+ tree acts as a delta (an increment) relative to the on-disk B+ tree.
  • Each node records its own operation type, insert or delete. For an insert, the value of the corresponding key-value in the on-disk B+ tree must be updated; for a delete, the corresponding information must be removed from the on-disk B+ tree.
  • Through step S102, the changed nodes and the unchanged nodes in the on-disk B+ tree can be obtained.
  • A changed node in the on-disk B+ tree is a node for which the key-value of its corresponding parent node is greater than the key-value pointed to by the current cursor of the in-memory B+ tree; otherwise, the node is unchanged.
  • FIG. 3 shows a schematic diagram of finding the changed nodes in the on-disk B+ tree provided by an embodiment of the present application. As FIG. 3 shows, if none of the leaf nodes under a parent node has changed, and that parent node itself has a parent, the check continues upward until a changed node is found.
  • Correspondingly, step S103 may include the following steps A-B:
  • Step A: According to the result of comparing the key-values in the in-memory B+ tree with the key-values in the changed nodes of the on-disk B+ tree, and according to the operation types of the key-values in the in-memory B+ tree, merge the key-values in the in-memory B+ tree into the on-disk B+ tree in turn to generate a new on-disk B+ tree.
  • Step B: Use the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
  • In step S103, if the operation type of the current in-memory key-value is delete or abort, that key-value is skipped, and the in-memory cursor and the on-disk cursor are moved from left to right so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node.
  • Alternatively, if the operation type of the current in-memory key-value is insert, that key-value is selected and inserted into a new node, and the in-memory cursor and the on-disk cursor are then moved from the bottom up
  • so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node.
  • Each generated parent node records the physical address of its child nodes on the disk (that is, their logical sequential address within the disk). For example, if the disk capacity is 1G, the physical address of each child node is a sequential address in the range 0 to 1G.
  • Here m denotes the number of entries contained in each node; for example, m may be 512, the node value may be 511, and so on.
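The bottom-up generation of the new tree (FIG. 4) can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Node` class, the `build_bottom_up` function, and the use of a Python list index as the "sequential disk address" are all assumptions made for illustration, and the sketch does not enforce a minimum fill for the last node of each level.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    keys: List[int]
    values: List[str] = field(default_factory=list)    # payloads, leaves only
    children: List[int] = field(default_factory=list)  # child addresses, inner nodes only


def build_bottom_up(items: List[Tuple[int, str]], m: int = 512):
    """Pack sorted (key, value) pairs into leaves of at most m entries,
    then stack parent levels until a single root remains.

    Returns (disk, root_address). `disk` is a list standing in for
    sequentially addressed storage: a node's address is its list index.
    """
    if not items:
        raise ValueError("nothing to build")
    if m < 2:
        raise ValueError("m must be at least 2")
    disk: List[Node] = []
    level: List[Tuple[int, int]] = []  # (smallest key in subtree, address)
    for start in range(0, len(items), m):
        chunk = items[start:start + m]
        disk.append(Node(keys=[k for k, _ in chunk], values=[v for _, v in chunk]))
        level.append((chunk[0][0], len(disk) - 1))
    while len(level) > 1:
        parents: List[Tuple[int, int]] = []
        for start in range(0, len(level), m):
            group = level[start:start + m]
            # The parent stores the addresses of its children, as in the text.
            disk.append(Node(keys=[k for k, _ in group], children=[a for _, a in group]))
            parents.append((group[0][0], len(disk) - 1))
        level = parents
    return disk, level[0][1]
```

Because every node is appended once and children are always written before their parent, the addresses grow sequentially and the root is the last node written.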
  • Each parent node contains the physical addresses of its child nodes on the disk.
  • The present application also provides a device for batch insertion and deletion of B+ tree nodes.
  • The device includes:
  • an extraction unit 501, used to extract the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree;
  • a comparison unit 502, used to compare, in order and one by one starting from the positions of the first and second key-values, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain the comparison result;
  • a merging unit 503, used to merge the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree and thereby realize batch insertion and deletion of B+ tree nodes.
  • The extraction unit 501 includes:
  • a first extraction subunit, used to extract the smallest key-value from the node at the cursor of the in-memory B+ tree as the first key-value;
  • a query subunit, used to look up, according to the key-value the in-memory cursor points to, the corresponding key-value and its node in the on-disk B+ tree, and to set the position of that key-value as the cursor position of the on-disk B+ tree;
  • a second extraction subunit, used to extract the smallest key-value from the node at the cursor of the on-disk B+ tree as the second key-value.
  • The operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort.
  • The device further includes:
  • an obtaining unit, used to obtain the changed nodes and the unchanged nodes in the on-disk B+ tree according to the comparison result.
  • Correspondingly, the merging unit 503 includes:
  • a first merging subunit, used to merge the key-values in the in-memory B+ tree into the on-disk B+ tree in turn, according to the result of comparing them with the key-values in the changed nodes of the on-disk B+ tree and according to their operation types, to generate a new on-disk B+ tree;
  • a second merging subunit, used to take the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
  • The merging unit 503 includes:
  • a skip subunit, used, if the operation type of the current in-memory key-value is delete or abort, to skip that key-value and move the in-memory cursor and the on-disk cursor from left to right, so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node;
  • a selection subunit, used, if the operation type of the current in-memory key-value is insert, to select that key-value, insert it into a new node, and then move the in-memory cursor and the on-disk cursor from the bottom up, so that the next key-value in the in-memory B+ tree node is compared with the next key-value in the on-disk B+ tree node;
  • a third merging subunit, used, once the cursors of both the in-memory B+ tree and the on-disk B+ tree have finished traversing, to generate the new on-disk B+ tree and update its root node.
  • In the device for batch insertion and deletion of B+ tree nodes provided by this application, the first key-value is first extracted from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree.
  • The in-memory B+ tree and the on-disk B+ tree are then merged to generate a new on-disk B+ tree, realizing batch insertion and deletion of B+ tree nodes. This application therefore generates a new on-disk B+ tree by repeatedly extracting the smallest node from the in-memory B+ tree and the on-disk B+ tree, instead of inserting into and deleting from the original B+ tree in the traditional way.
  • As a result, the in-memory B+ tree and the on-disk B+ tree can be merged directly. Compared with the existing approach of first flushing in-memory metadata to disk and then merging, this method can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, and reduce memory overhead.
  • random access memory (RAM)
  • read-only memory (ROM)
  • electrically programmable ROM, electrically erasable programmable ROM
  • registers, hard disks, removable disks, CD-ROMs, or any other storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

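This application discloses a method and device for batch insertion and deletion of B+ tree nodes. The method includes: extracting a first key-value from an in-memory B+ tree node and a second key-value from an on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree; comparing, in order and one by one, the ordered set of key-values in the in-memory B+ tree with the ordered set of all key-values after the second key-value in the on-disk B+ tree; and merging the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree. The application thus generates the new on-disk B+ tree by repeatedly extracting the smallest node from the in-memory B+ tree and the on-disk B+ tree. Compared with first flushing in-memory metadata to disk and then merging, this method can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, and reduce memory overhead.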

Description

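Method and device for batch insertion and deletion of B+ tree nodes
This application claims priority to Chinese patent application No. 201811231305.8, entitled "Method and device for batch insertion and deletion of B+ tree nodes", filed with the Chinese Patent Office on October 22, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and device for batch insertion and deletion of B+ tree nodes.
Background
With the rapid development of the mobile internet, social networks, and e-commerce, the data generated by human production and daily life is growing exponentially. The volume of data to be processed grows day by day, and the required storage capacity keeps increasing.
Consequently, in the cloud computing era, storing massive amounts of data requires file system support, and file system metadata performance has become the key factor affecting file access performance. The B+ tree is a balanced search tree designed for disks or other direct-access auxiliary storage devices, and it is commonly used in databases and in operating-system file systems. For example, file systems such as NTFS, ReiserFS, NSS, XFS, JFS, ReFS, and BFS use B+ trees as metadata indexes. A B+ tree keeps its data stably ordered, and its insertion and modification operations have a relatively stable logarithmic time complexity. B+ tree elements are inserted from the bottom up. When data must be swapped between memory and disk, batch operations are usually needed to improve metadata performance, but the traditional B+ tree insertion and deletion procedures not only involve a large number of disk accesses but also occupy a large amount of memory.
Therefore, how to replace the traditional batch insertion and deletion approach with a more advanced one, so as to avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, and thereby reduce memory overhead, has become an urgent problem.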
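Summary of the Invention
To solve the above problem, this application provides a method and device for batch insertion and deletion of B+ tree nodes. The specific technical solutions are as follows:
In a first aspect, this application provides a method for batch insertion and deletion of B+ tree nodes, the method including:
extracting a first key-value from an in-memory B+ tree node and a second key-value from an on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree;
starting from the positions of the first key-value and the second key-value, comparing, in order and one by one, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain a comparison result;
merging the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree and thereby realize batch insertion and deletion of B+ tree nodes.
In an optional implementation, extracting the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node includes:
setting the leftmost leaf node of the in-memory B+ tree as its cursor position, the cursor at that position being 0;
extracting the smallest key-value from the node at the cursor of the in-memory B+ tree as the first key-value;
looking up, according to the key-value the cursor of the in-memory B+ tree points to, the corresponding key-value and its node in the on-disk B+ tree, and setting the position of that key-value as the cursor position of the on-disk B+ tree;
extracting the smallest key-value from the node at the cursor of the on-disk B+ tree as the second key-value.
In an optional implementation, the operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort (revocation).
In an optional implementation, after the ordered set of all key-values in the in-memory B+ tree has been compared, in order and one by one starting from the positions of the first and second key-values, with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, and the comparison result has been obtained, the method further includes:
obtaining, according to the comparison result, the changed nodes and the unchanged nodes in the on-disk B+ tree;
correspondingly, merging all key-values in the in-memory B+ tree into the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree, includes:
merging the key-values in the in-memory B+ tree into the on-disk B+ tree in turn, according to the result of comparing them with the key-values in the changed nodes of the on-disk B+ tree and according to their operation types, to generate a new on-disk B+ tree;
using the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
In an optional implementation, merging the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes includes:
if the operation type of the key-value in the current in-memory B+ tree node is delete or abort, skipping that key-value and moving the cursor in the in-memory B+ tree node and the cursor in the on-disk B+ tree node from left to right, so as to compare the next key-value in the in-memory B+ tree node with the next key-value in the on-disk B+ tree node;
or,
if the operation type of the key-value in the current in-memory B+ tree node is insert, selecting that key-value, inserting it into a new node, and then moving the cursor in the in-memory B+ tree node and the cursor in the on-disk B+ tree node from the bottom up, so as to compare the next key-value in the in-memory B+ tree node with the next key-value in the on-disk B+ tree node;
until the cursors of both the in-memory B+ tree and the on-disk B+ tree have finished traversing, at which point the merge yields a new on-disk B+ tree and the root node of the new on-disk B+ tree is updated.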
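In a second aspect, this application provides a device for batch insertion and deletion of B+ tree nodes, the device including:
an extraction unit, used to extract a first key-value from an in-memory B+ tree node and a second key-value from an on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree;
a comparison unit, used to compare, in order and one by one starting from the positions of the first key-value and the second key-value, the ordered set of all key-values in the in-memory B+ tree with the ordered set of all key-values to be inserted or deleted after the second key-value in the on-disk B+ tree, to obtain a comparison result;
a merging unit, used to merge the in-memory B+ tree with the on-disk B+ tree according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, to generate a new on-disk B+ tree and thereby realize batch insertion and deletion of B+ tree nodes.
In an optional implementation, the extraction unit includes:
a setting subunit, used to set the leftmost leaf node of the in-memory B+ tree as its cursor position, the cursor at that position being 0;
a first extraction subunit, used to extract the smallest key-value from the node at the cursor of the in-memory B+ tree as the first key-value;
a query subunit, used to look up, according to the key-value the cursor of the in-memory B+ tree points to, the corresponding key-value and its node in the on-disk B+ tree, and to set the position of that key-value as the cursor position of the on-disk B+ tree;
a second extraction subunit, used to extract the smallest key-value from the node at the cursor of the on-disk B+ tree as the second key-value.
In an optional implementation, the operation type of a key-value in an in-memory B+ tree node is one of insert, delete, and abort (revocation).
In an optional implementation, the device further includes:
an obtaining unit, used to obtain, according to the comparison result, the changed nodes and the unchanged nodes in the on-disk B+ tree;
correspondingly, the merging unit includes:
a first merging subunit, used to merge the key-values in the in-memory B+ tree into the on-disk B+ tree in turn, according to the result of comparing them with the key-values in the changed nodes of the on-disk B+ tree and according to their operation types, to generate a new on-disk B+ tree;
a second merging subunit, used to take the unchanged nodes of the on-disk B+ tree and their key-values directly as nodes and key-values of the new on-disk B+ tree.
In an optional implementation, the merging unit includes:
a skip subunit, used, if the operation type of the key-value in the current in-memory B+ tree node is delete or abort, to skip that key-value and move the cursor in the in-memory B+ tree node and the cursor in the on-disk B+ tree node from left to right, so as to compare the next key-value in the in-memory B+ tree node with the next key-value in the on-disk B+ tree node;
or,
a selection subunit, used, if the operation type of the key-value in the current in-memory B+ tree node is insert, to select that key-value, insert it into a new node, and then move the cursor in the in-memory B+ tree node and the cursor in the on-disk B+ tree node from the bottom up, so as to compare the next key-value in the in-memory B+ tree node with the next key-value in the on-disk B+ tree node;
a third merging subunit, used, once the cursors of both the in-memory B+ tree and the on-disk B+ tree have finished traversing, to generate a new on-disk B+ tree by merging and to update the root node of the new on-disk B+ tree.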
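In the method for batch insertion and deletion of B+ tree nodes provided by this application, the first key-value is first extracted from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree. Then, starting from the positions of the first and second key-values, the ordered set of all key-values in the in-memory B+ tree is compared, in order and one by one, with the ordered set of all key-values after the second key-value in the on-disk B+ tree to obtain a comparison result. The in-memory B+ tree and the on-disk B+ tree can then be merged according to the comparison result and the operation types of all key-values in the in-memory B+ tree nodes, generating a new on-disk B+ tree and realizing batch insertion and deletion of B+ tree nodes. This application therefore generates the new on-disk B+ tree by repeatedly extracting the smallest node from the in-memory B+ tree and the on-disk B+ tree, instead of inserting into and deleting from the original B+ tree in the traditional way. The in-memory B+ tree and the on-disk B+ tree can thus be merged directly. Compared with the existing approach of first flushing in-memory metadata to disk and then merging, this method can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring when nodes are deleted, and reduce memory overhead.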
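Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for batch insertion and deletion of B+ tree nodes provided by an embodiment of this application;
FIG. 2 is a flowchart of extracting the first key-value and the second key-value provided by an embodiment of this application;
FIG. 3 is a schematic diagram of finding the changed nodes in the on-disk B+ tree provided by an embodiment of this application;
FIG. 4 is a schematic diagram of generating a new on-disk B+ tree from the bottom up provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of a device for batch insertion and deletion of B+ tree nodes provided by an embodiment of this application.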
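Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
To make the technical solutions provided by this application easier to understand, the research background of these solutions is briefly described first.
As described in the Background section, with the development of network technology the volume of data to be processed grows day by day, and storing massive amounts of data requires file system support; file system metadata performance is the key factor affecting file access performance. The B+ tree is a balanced search tree designed for disks or other direct-access auxiliary storage devices, and it is commonly used in databases and in operating-system file systems. For example, file systems such as NTFS, ReiserFS, NSS, XFS, JFS, ReFS, and BFS all use B+ trees as metadata indexes. A B+ tree keeps its data stably ordered, and its insertion and modification operations have a relatively stable logarithmic time complexity. B+ tree elements are inserted from the bottom up. When data must be swapped between memory and disk, batch operations are usually needed to improve metadata performance, but the traditional B+ tree insertion and deletion procedures not only involve a large number of disk accesses but also occupy a large amount of memory. How to replace the traditional batch insertion and deletion approach with a more advanced one, so as to avoid the large number of disk reads caused by batch insertion and deletion of nodes and the frequent restructuring when nodes are deleted, and thereby reduce memory overhead, has therefore become an urgent problem.
On this basis, this application proposes a method and device for batch insertion and deletion of B+ tree nodes that realizes batch insertion and deletion while avoiding the large number of disk reads caused by batch insertion and deletion of nodes and the frequent restructuring when nodes are deleted, thereby reducing memory overhead.
The method for batch insertion and deletion of B+ tree nodes provided by the embodiments of this application is described in detail below with reference to the drawings. FIG. 1 shows a flowchart of the method provided by an embodiment of this application. This embodiment may include the following steps: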
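S101: Extract the first key-value from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree.
In this embodiment, to realize batch insertion and deletion of B+ tree nodes while avoiding the large number of disk reads caused by batch insertion and deletion and the frequent restructuring when nodes are deleted, and thereby reduce memory overhead, key-values are no longer read into memory one by one and then written to disk. Because memory has less storage space and the disk has more, the in-memory B+ tree and the on-disk B+ tree can first be treated as the two sources to be merged. Then, each time, the smallest node is taken from these two sources; that is, the first key-value is extracted from the in-memory B+ tree node and the second key-value from the on-disk B+ tree node, where the first key-value is the smallest key-value in the in-memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the on-disk B+ tree.
In some possible implementations of this application, step S101 may specifically include steps S201-S204:
S201: Set the leftmost leaf node of the in-memory B+ tree as its cursor position, the cursor at that position being 0.
In this implementation, each node of the in-memory B+ tree stores multiple entries. For example, each node may contain 512 entries, and each entry corresponds to one key-value pair. The in-memory B+ tree therefore contains multiple key-values; for example, each node may contain 512 key-values.
The leftmost leaf node of the in-memory B+ tree can then be set as its cursor position, with the cursor value set to 0.
S202: Extract the smallest key-value from the node at the cursor of the in-memory B+ tree as the first key-value.
In this implementation, after step S201 has set the leftmost leaf node of the in-memory B+ tree as the position whose cursor is 0, and because the data in a B+ tree is arranged from left to right in ascending order, the smallest key-value in the node at the cursor position (the leftmost leaf node), that is, its leftmost key-value, can be extracted and used as the first key-value.
S203: According to the key-value the cursor of the in-memory B+ tree points to, look up the corresponding key-value and its node in the on-disk B+ tree, and set the position of that key-value as the cursor position of the on-disk B+ tree.
In this implementation, after step S202 has extracted the smallest key-value in the in-memory B+ tree (the first key-value), the key-value the in-memory cursor points to (the first key-value) can be used to look up the corresponding key-value and its node in the on-disk B+ tree.
Specifically, a key-value whose value equals that of the first key-value can be looked up in the on-disk B+ tree. If the on-disk B+ tree contains no such key-value, a key-value whose value is close to that of the first key-value can be looked up instead. For example, if the value of the first key-value is 46, the key-value with value 46 can be located in the on-disk B+ tree and used as the on-disk cursor position; if the on-disk B+ tree contains no key-value with value 46, a key-value close to 46, such as 45 or 47, can be used as the on-disk cursor position.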
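The lookup in S203 can be sketched in Python with a binary search over sorted keys. This is only an illustration under stated assumptions: the on-disk keys are modeled as a flat sorted list, the function name `locate_disk_cursor` is hypothetical, and breaking a tie between the two nearest neighbors in favor of the smaller key is a choice made here, not something the text specifies.

```python
from bisect import bisect_left
from typing import List


def locate_disk_cursor(disk_keys: List[int], first_key: int) -> int:
    """Return the index in sorted `disk_keys` to use as the on-disk cursor.

    An exact match is preferred; otherwise the nearest neighboring key is
    used (ties go to the smaller key, an assumption of this sketch).
    """
    if not disk_keys:
        raise ValueError("on-disk tree is empty")
    i = bisect_left(disk_keys, first_key)
    if i < len(disk_keys) and disk_keys[i] == first_key:
        return i
    # Candidates are the predecessor (i - 1) and the successor (i), if present.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(disk_keys)]
    return min(candidates, key=lambda j: abs(disk_keys[j] - first_key))
```

With the keys `[10, 45, 47, 90]` and a first key-value of 46, there is no exact match, so the cursor lands on one of the neighbors 45 or 47, as in the example in the text.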
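Note that each node of the on-disk B+ tree can also store multiple entries. For example, each node may also contain 512 entries, and each entry corresponds to one key-value pair. The on-disk B+ tree therefore also contains multiple key-values; for example, each node may contain 512 key-values.
S204: Extract the smallest key-value from the node at the cursor of the on-disk B+ tree as the second key-value.
In this implementation, after step S203 has located the cursor position in the on-disk B+ tree according to the key-value the in-memory cursor points to, and because the node data of the on-disk B+ tree is also arranged from left to right in ascending order, the nodes to the left of the cursor position (excluding the node at the cursor position) can be retained without any insertion or deletion and used directly as the left-hand nodes of the new on-disk B+ tree produced by merging the in-memory B+ tree and the on-disk B+ tree. Correspondingly, the smallest key-value to be inserted or deleted can be extracted from the node at the cursor position of the on-disk B+ tree and used as the second key-value.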
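S102: Starting from the positions of the first key-value and the second key-value, compare, one by one and in order, the ordered set formed by all key-values in the memory B+ tree with the ordered set formed by all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree, to obtain a comparison result.
In this embodiment, after step S101 has extracted the first key-value from a memory B+ tree node and the second key-value from a disk B+ tree node, the ordered set formed by all key-values in the memory B+ tree can be compared one by one, in order (from left to right, bottom-up) and starting from the positions of the first key-value and the second key-value, with the ordered set formed by all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree, to obtain a comparison result.
S103: According to the comparison result and the operation types of all key-values in the memory B+ tree nodes, merge the memory B+ tree with the disk B+ tree to generate a new disk B+ tree, thereby implementing batch insertion and deletion of B+ tree nodes.
In this embodiment, after step S102 has compared all key-values in the memory B+ tree with the key-values to be inserted or deleted in the disk B+ tree, from left to right and bottom-up starting from the positions of the first key-value and the second key-value, all key-values in the memory B+ tree can be merged with the disk B+ tree according to the comparison result and the operation type of the corresponding key-value in the memory B+ tree node, generating a new disk B+ tree and thereby implementing batch insertion and deletion of B+ tree nodes. The new disk B+ tree generated by merging the memory B+ tree and the disk B+ tree thus stores unique key-value information, which in turn can improve query efficiency.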
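In some possible implementations of the present application, in step S103, the operation type of a key-value in a memory B+ tree node is one of insert, delete and abort.
The abort type means that when a key-value pair of the insert type already exists and a delete operation is then performed on it, the key-value pair is marked as abort.
Moreover, the B+ tree in memory is equivalent to an increment: each node in it records its own operation type, insertion or deletion. If it is an insertion, the value of the corresponding key-value in the disk B+ tree needs to be updated; if it is a deletion, the corresponding information in the B+ tree on disk needs to be deleted.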
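How the three operation types could be recorded in the in-memory increment is sketched below. `Op` and `MemDelta` are hypothetical names, and a plain dictionary stands in for the memory B+ tree so that the tagging rule stays in focus.

```python
from enum import Enum


class Op(Enum):
    INSERT = "insert"
    DELETE = "delete"
    ABORT = "abort"


class MemDelta:
    """In-memory increment: every key carries the operation still to be applied on disk."""

    def __init__(self):
        self.ops = {}                     # key -> (Op, value)

    def insert(self, key, value):
        self.ops[key] = (Op.INSERT, value)

    def delete(self, key):
        previous = self.ops.get(key)
        if previous is not None and previous[0] is Op.INSERT:
            # A delete arriving after a pending insert marks the pair as abort.
            self.ops[key] = (Op.ABORT, None)
        else:
            self.ops[key] = (Op.DELETE, None)

    def sorted_entries(self):
        """Entries in ascending key order, as a left-to-right leaf scan would yield them."""
        return [(key,) + self.ops[key] for key in sorted(self.ops)]
```

A key that was inserted and then deleted before the merge ends up as `ABORT`, while a key deleted without a pending insert ends up as `DELETE`.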
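In some possible implementations of the present application, the changed nodes and the unchanged nodes of the disk B+ tree can be obtained through step S102 above.
A changed node of the disk B+ tree is a node whose corresponding parent-node key-value is greater than the key-value currently pointed to by the cursor of the memory B+ tree; otherwise the node is unchanged. FIG. 3 shows a schematic diagram, provided by an embodiment of the present application, of querying the changed nodes of the disk B+ tree. As can be seen from FIG. 3, if none of the leaf nodes under a parent node has changed, and that parent node has a parent of its own, the check continues upward until a changed node is found.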
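One way to read the changed/unchanged test is as a range check against the separator keys held in a parent node. The sketch below is an interpretation rather than the exact rule of the application: it assumes that child `i` of a parent covers the keys from `separators[i-1]` (inclusive) up to `separators[i]` (exclusive), and marks a child as changed when at least one memory key falls in that range. `split_children` is a hypothetical name.

```python
import bisect


def split_children(separators, mem_keys):
    """Classify the children of one parent node as changed or unchanged.

    separators: sorted separator keys of the parent (n keys for n + 1 children).
    mem_keys:   keys pending in the memory B+ tree.
    Returns (changed, unchanged) lists of child indexes.
    """
    # bisect_right counts the separators <= key, which is the covering child index.
    touched = {bisect.bisect_right(separators, key) for key in mem_keys}
    changed = sorted(touched)
    unchanged = [i for i in range(len(separators) + 1) if i not in touched]
    return changed, unchanged


changed, unchanged = split_children([100, 200, 300], [120, 130, 310])
print(changed, unchanged)
```

Only the children listed in `changed` need to be read from disk; the others can be linked into the new tree by address.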
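Correspondingly, the specific implementation of step S103 may include the following steps A and B:
Step A: According to the result of comparing the key-values in the memory B+ tree with the key-values in the changed nodes of the disk B+ tree, and according to the operation types of the key-values in the memory B+ tree, merge the key-values of the memory B+ tree into the disk B+ tree one after another to generate a new disk B+ tree.
Step B: Use the unchanged nodes of the disk B+ tree and their key-values directly as nodes and key-values of the new disk B+ tree.
Specifically, the unchanged nodes of the disk B+ tree and their key-values can be retained without any insertion or deletion and used directly as nodes of the new disk B+ tree generated by merging the memory B+ tree and the disk B+ tree; that is, these nodes can be inserted directly into the newly generated disk B+ tree.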
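In some possible implementations of the present application, in step S103, if the operation type of the key-value in the current memory B+ tree node is delete or abort, the key-value in the current memory B+ tree node is skipped, the cursor in the memory B+ tree node and the cursor in the disk B+ tree node are each moved from left to right, and the next key-value in the memory B+ tree node is compared with the next key-value in the disk B+ tree node.
Alternatively, if the operation type of the key-value in the current memory B+ tree node is insert, the key-value in the current memory B+ tree node is selected and inserted into a new node, the cursor in the memory B+ tree node and the cursor in the disk B+ tree node are then each moved bottom-up, and the next key-value in the memory B+ tree node is compared with the next key-value in the disk B+ tree node.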
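The select-or-skip rule can be sketched as a two-cursor merge over flat sorted lists. Flat lists stand in for the leaf chains of the two trees, and `merge_streams` is a hypothetical name. The handling of keys that appear on only one side is an assumption of this sketch, since the text describes only the case in which both cursors advance.

```python
def merge_streams(mem, disk):
    """Merge pending memory operations into the sorted disk key-values.

    mem:  sorted list of (key, op, value), op in {"insert", "delete", "abort"}.
    disk: sorted list of (key, value).
    Returns the key-values of the new disk tree in ascending key order.
    """
    out = []
    i = j = 0
    while i < len(mem) and j < len(disk):
        mem_key, op, mem_value = mem[i]
        disk_key, disk_value = disk[j]
        if disk_key < mem_key:
            out.append((disk_key, disk_value))    # untouched disk entry
            j += 1
        elif mem_key < disk_key:
            if op == "insert":
                out.append((mem_key, mem_value))  # new key
            i += 1                                # delete/abort of an absent key: skip
        else:
            if op == "insert":
                out.append((mem_key, mem_value))  # memory value replaces disk value
            i += 1                                # delete/abort: drop the pair,
            j += 1                                # both cursors move on
    out.extend(disk[j:])
    out.extend((key, value) for key, op, value in mem[i:] if op == "insert")
    return out


disk = [(1, "a"), (2, "b"), (4, "d")]
mem = [(2, "delete", None), (3, "insert", "c"), (4, "insert", "D"), (5, "abort", None)]
print(merge_streams(mem, disk))
```

Each key appears at most once in the output, which matches the statement that the new disk B+ tree stores unique key-value information.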
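Further, if during the comparison a corresponding parent node needs to be generated upward in the disk B+ tree, the child node is first flushed to disk (that is, the corresponding key-values of the memory B+ tree are written to the corresponding node of the disk B+ tree), and the generated parent node contains the physical address of the child node on disk (that is, its logical sequential address on the disk). For example, if the disk capacity is 1G, the physical address of each child node on the disk may be some sequential address between 0 and 1G.
Still further, after the key-values in a child node of the memory B+ tree have been processed, the memory of the child node can be released, and it is then determined whether the parent node in turn needs a parent generated above it, and so on, until the cursors of both the memory B+ tree and the disk B+ tree have finished traversing. That is, once the nodes of the disk B+ tree have been merged, a new disk B+ tree is generated and its root node is updated. FIG. 4 shows a schematic diagram, provided by an embodiment of the present application, of generating the new disk B+ tree bottom-up. A non-root node n satisfies the following formula:
[Formula image PCTCN2018124696-appb-000001; the formula is not reproduced in this text]
where m denotes the number of entries contained in each node; for example, m may be 512, the node value may be 511, and so on.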
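The bottom-up generation of parent nodes can be sketched as follows. A Python list simulates the disk, with the list index playing the role of the physical address; `build_bottom_up` and the dictionary node layout are assumptions for illustration. The sketch packs nodes greedily and does not enforce the occupancy condition of the formula above, whose content is not available in this text.

```python
def build_bottom_up(entries, m):
    """Build a tree from sorted (key, value) entries, flushing children before parents.

    m: maximum number of entries per node (must be at least 2).
    Returns (disk, root_address); root_address is None for an empty input.
    """
    disk = []                                     # simulated disk: address == list index

    def flush(node):
        disk.append(node)                         # write the node, then forget it in memory
        return len(disk) - 1                      # its address on the simulated disk

    # Leaf level: pack entries into leaves and flush each one.
    level = []
    for start in range(0, len(entries), m):
        chunk = entries[start:start + m]
        level.append((chunk[0][0], flush({"leaf": True, "entries": chunk})))

    # Upper levels: each parent stores (smallest key, child address) pairs.
    while len(level) > 1:
        parents = []
        for start in range(0, len(level), m):
            group = level[start:start + m]
            parents.append((group[0][0], flush({"leaf": False, "children": group})))
        level = parents

    root_address = level[0][1] if level else None
    return disk, root_address


disk, root = build_bottom_up([(k, str(k)) for k in range(10)], m=4)
print(root, disk[root]["children"])
```

Because a child is always flushed before its parent is created, the parent can record the child's address and the child's memory can be released immediately; the root is the last node written.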
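Thus, in the method for batch insertion and deletion of B+ tree nodes provided by the present application, a first key-value is first extracted from a memory B+ tree node and a second key-value from a disk B+ tree node, where the first key-value is the smallest key-value in the memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the disk B+ tree. Then, starting from the positions of the first key-value and the second key-value, the ordered set formed by all key-values in the memory B+ tree is compared one by one, in order, with the ordered set formed by all key-values that follow the second key-value in the disk B+ tree, to obtain a comparison result. The memory B+ tree can then be merged with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, generating a new disk B+ tree and thereby implementing batch insertion and deletion of B+ tree nodes. It can be seen that the present application generates a new disk B+ tree by extracting the smallest nodes from the B+ tree in memory and the B+ tree on disk, instead of inserting into and deleting from the original B+ tree in the traditional way. The memory B+ tree and the disk B+ tree can therefore be merged directly. Compared with the existing method of first flushing memory metadata to disk and then merging, this method can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring of the tree when nodes are deleted, and save memory overhead.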
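For ease of understanding, the overall implementation process of the embodiments of the present application is now described step by step as follows:
(1) Set the node cursor of the memory B+ tree to the leftmost leaf node, and set the key-value pair cursor to 0;
(2) Using the key pointed to by the cursor of the memory B+ tree, search the disk B+ tree for the corresponding leaf node and set the cursor of the disk B+ tree;
(3) While setting the cursor of the disk B+ tree, insert the unchanged nodes of the disk B+ tree into the newly generated disk B+ tree;
(4) Take the smallest key-value from the cursor position of the memory B+ tree;
(5) Take the smallest key-value from the cursor position of the disk B+ tree;
(6) Based on the keys of the memory B+ tree and the disk B+ tree and on the operation type (insert, delete, abort) of the key in the memory B+ tree, compare the keys and select or skip the key-value;
(7) If the key-value is skipped, move the corresponding cursors of the memory B+ tree and the disk B+ tree;
(8) If a key-value is selected, first merge the key-value and then move the corresponding cursor;
(9) If the cursor of the memory B+ tree moves beyond the current node, update the cursor node to the node following this one, or set it to null if the tail has been reached;
(10) If the cursor of the disk B+ tree moves beyond the current node, read the next node from disk;
(11) When reading disk B+ tree nodes from disk, read only the changed nodes; unchanged nodes are inserted directly into the newly generated disk B+ tree;
(12) Whether the leaf nodes under a parent node of the disk B+ tree have changed is determined by whether the key of that parent node is greater than the key currently pointed to by the cursor of the memory B+ tree;
(13) If none of the leaf nodes under the parent node has changed, and the parent node has a parent of its own, continue checking upward until a changed node is found;
(14) After a node is inserted into the newly generated disk B+ tree, determine according to the rules whether a corresponding parent node needs to be generated upward;
(15) If a corresponding parent node needs to be generated upward, first flush the child node to disk;
(16) Generate the parent node, which contains the physical address of the child node on disk;
(17) Release the memory of the child node, then determine whether the parent node in turn needs a parent generated above it;
(18) Repeat the above steps until the cursors of both the memory B+ tree and the disk B+ tree have finished traversing;
(19) Update the root node of the B+ tree on disk.
It should be noted that, for the specific implementation of steps (1)-(19) above, refer to steps S101 to S103.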
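The leaf-level part of steps (1)-(19) can be put together in a simplified sketch: leaves that receive no memory operation are reused without being read or rewritten, and only the affected leaves are merged. The names `batch_merge` and `merge_leaf` are hypothetical, the disk tree is assumed to contain at least one non-empty leaf, and for brevity the per-leaf merge uses a dictionary instead of two cursors; node splitting and parent generation are left out.

```python
import bisect
from collections import defaultdict


def merge_leaf(disk_entries, ops):
    """Apply insert/delete/abort operations to the entries of one leaf."""
    merged = dict(disk_entries)
    for key, op, value in ops:
        if op == "insert":
            merged[key] = value                   # add or update
        else:
            merged.pop(key, None)                 # delete or abort: drop if present
    return [(key, merged[key]) for key in sorted(merged)]


def batch_merge(disk_leaves, mem_ops):
    """Return (new_leaves, reused_indexes) for a batch of memory operations.

    disk_leaves: non-empty list of non-empty sorted [(key, value), ...] lists.
    mem_ops:     sorted list of (key, op, value).
    """
    first_keys = [leaf[0][0] for leaf in disk_leaves]
    per_leaf = defaultdict(list)
    for key, op, value in mem_ops:
        # Route each operation to the leaf whose key range covers it.
        index = max(bisect.bisect_right(first_keys, key) - 1, 0)
        per_leaf[index].append((key, op, value))

    new_leaves, reused = [], []
    for index, leaf in enumerate(disk_leaves):
        if index not in per_leaf:
            new_leaves.append(leaf)               # unchanged: carried over as is
            reused.append(index)
        else:
            merged = merge_leaf(leaf, per_leaf[index])
            if merged:                            # a leaf emptied by deletes disappears
                new_leaves.append(merged)
    return new_leaves, reused


disk_leaves = [[(1, "a"), (2, "b")], [(10, "j"), (11, "k")], [(20, "t")]]
mem_ops = [(11, "delete", None), (12, "insert", "l")]
print(batch_merge(disk_leaves, mem_ops))
```

In this run only the middle leaf is rewritten; the first and last leaves are the very same objects as in the input, which mirrors the idea of inserting unchanged nodes directly into the new tree.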
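Based on the above method for batch insertion and deletion of B+ tree nodes, the present application further provides a device for batch insertion and deletion of B+ tree nodes, the device including:
an extraction unit 501, configured to extract a first key-value from a memory B+ tree node and a second key-value from a disk B+ tree node, the first key-value being the smallest key-value in the memory B+ tree and the second key-value being the smallest key-value to be inserted or deleted in the disk B+ tree;
a comparison unit 502, configured to compare, in order and starting from the positions of the first key-value and the second key-value, the ordered set formed by all key-values in the memory B+ tree with all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree, to obtain a comparison result;
a merging unit 503, configured to merge the memory B+ tree with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, to generate a new disk B+ tree and thereby implement batch insertion and deletion of B+ tree nodes.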
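Optionally, the extraction unit 501 includes:
a setting subunit, configured to set the leftmost leaf node of the memory B+ tree as its cursor position, the cursor at the cursor position being 0;
a first extraction subunit, configured to extract, from the node corresponding to the cursor in the memory B+ tree, the smallest key-value in that node as the first key-value;
a query subunit, configured to look up, according to the key-value pointed to by the cursor in the memory B+ tree, the corresponding key-value in the disk B+ tree and the node to which it belongs, and to set the position of that key-value as the cursor position of the disk B+ tree;
a second extraction subunit, configured to extract, from the node corresponding to the cursor in the disk B+ tree, the smallest key-value in that node as the second key-value.
Optionally, the operation type of a key-value in the memory B+ tree node is one of insert, delete and abort.
Optionally, the device further includes:
an obtaining unit, configured to obtain, according to the comparison result, the changed nodes and the unchanged nodes of the disk B+ tree;
correspondingly, the merging unit 503 includes:
a first merging subunit, configured to merge the key-values of the memory B+ tree into the disk B+ tree one after another, according to the result of comparing the key-values in the memory B+ tree with the key-values in the changed nodes of the disk B+ tree and according to the operation types of the key-values in the memory B+ tree, to generate a new disk B+ tree;
a second merging subunit, configured to use the unchanged nodes of the disk B+ tree and their key-values directly as nodes and key-values of the generated new disk B+ tree.
Optionally, the merging unit 503 includes:
a skipping subunit, configured to, if the operation type of the key-value in the current memory B+ tree node is delete or abort, skip the key-value in the current memory B+ tree node, move the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each from left to right, and compare the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
or,
a selection subunit, configured to, if the operation type of the key-value in the current memory B+ tree node is insert, select the key-value in the current memory B+ tree node and insert it into a new node, then move the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each bottom-up, and compare the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
a third merging subunit, configured to, once the cursors of both the memory B+ tree and the disk B+ tree have finished traversing, merge to generate a new disk B+ tree and update the root node of the generated new disk B+ tree.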
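Thus, in the device for batch insertion and deletion of B+ tree nodes provided by the present application, a first key-value is first extracted from a memory B+ tree node and a second key-value from a disk B+ tree node, where the first key-value is the smallest key-value in the memory B+ tree and the second key-value is the smallest key-value to be inserted or deleted in the disk B+ tree. Then, starting from the positions of the first key-value and the second key-value, the ordered set formed by all key-values in the memory B+ tree is compared one by one, in order, with the ordered set formed by all key-values that follow the second key-value in the disk B+ tree, to obtain a comparison result. The memory B+ tree can then be merged with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, generating a new disk B+ tree and thereby implementing batch insertion and deletion of B+ tree nodes. It can be seen that the present application generates a new disk B+ tree by extracting the smallest nodes from the B+ tree in memory and the B+ tree on disk, instead of inserting into and deleting from the original B+ tree in the traditional way. The memory B+ tree and the disk B+ tree can therefore be merged directly. Compared with the existing method of first flushing memory metadata to disk and then merging, this approach can improve the read performance of metadata access, avoid the large number of disk reads caused by inserting and deleting many nodes and the frequent restructuring of the tree when nodes are deleted, and save memory overhead.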
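It should be noted that the embodiments in this specification are described in a progressive manner, each embodiment focusing on its differences from the others; for the same or similar parts, the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The method and device for batch insertion and deletion of B+ tree nodes provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principles, and these improvements and modifications also fall within the scope of protection of the claims of the present invention.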

Claims (10)

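  1. A method for batch insertion and deletion of B+ tree nodes, characterized in that the method comprises:
    extracting a first key-value from a memory B+ tree node and a second key-value from a disk B+ tree node, the first key-value being the smallest key-value in the memory B+ tree and the second key-value being the smallest key-value to be inserted or deleted in the disk B+ tree;
    starting from the positions of the first key-value and the second key-value, comparing, one by one and in order, the ordered set formed by all key-values in the memory B+ tree with the ordered set formed by all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree, to obtain a comparison result;
    merging the memory B+ tree with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, to generate a new disk B+ tree and thereby implement batch insertion and deletion of B+ tree nodes.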
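  2. The method for batch insertion and deletion of B+ tree nodes according to claim 1, characterized in that the extracting a first key-value from a memory B+ tree node and a second key-value from a disk B+ tree node comprises:
    setting the leftmost leaf node of the memory B+ tree as its cursor position, the cursor at the cursor position being 0;
    extracting, from the node corresponding to the cursor in the memory B+ tree, the smallest key-value in that node as the first key-value;
    looking up, according to the key-value pointed to by the cursor in the memory B+ tree, the corresponding key-value in the disk B+ tree and the node to which it belongs, and setting the position of that key-value as the cursor position of the disk B+ tree;
    extracting, from the node corresponding to the cursor in the disk B+ tree, the smallest key-value in that node as the second key-value.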
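  3. The method for batch insertion and deletion of B+ tree nodes according to claim 1, characterized in that the operation type of a key-value in the memory B+ tree node is one of insert, delete and abort.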
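  4. The method for batch insertion and deletion of B+ tree nodes according to claim 1, characterized in that, after the comparing, one by one and in order and starting from the positions of the first key-value and the second key-value, the ordered set formed by all key-values in the memory B+ tree with the ordered set formed by all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree to obtain a comparison result, the method further comprises:
    obtaining, according to the comparison result, the changed nodes and the unchanged nodes of the disk B+ tree;
    correspondingly, the merging all key-values of the memory B+ tree into the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, to generate a new disk B+ tree, comprises:
    merging the key-values of the memory B+ tree into the disk B+ tree one after another, according to the result of comparing the key-values in the memory B+ tree with the key-values in the changed nodes of the disk B+ tree and according to the operation types of the key-values in the memory B+ tree, to generate a new disk B+ tree;
    using the unchanged nodes of the disk B+ tree and their key-values directly as nodes and key-values of the generated new disk B+ tree.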
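  5. The method for batch insertion and deletion of B+ tree nodes according to claim 3, characterized in that the merging the memory B+ tree with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes comprises:
    if the operation type of the key-value in the current memory B+ tree node is delete or abort, skipping the key-value in the current memory B+ tree node, moving the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each from left to right, and comparing the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
    or,
    if the operation type of the key-value in the current memory B+ tree node is insert, selecting the key-value in the current memory B+ tree node and inserting it into a new node, then moving the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each bottom-up, and comparing the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
    once the cursors of both the memory B+ tree and the disk B+ tree have finished traversing, merging to generate a new disk B+ tree and updating the root node of the generated new disk B+ tree.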
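  6. A device for batch insertion and deletion of B+ tree nodes, characterized in that the device comprises:
    an extraction unit, configured to extract a first key-value from a memory B+ tree node and a second key-value from a disk B+ tree node, the first key-value being the smallest key-value in the memory B+ tree and the second key-value being the smallest key-value to be inserted or deleted in the disk B+ tree;
    a comparison unit, configured to compare, one by one and in order and starting from the positions of the first key-value and the second key-value, the ordered set formed by all key-values in the memory B+ tree with the ordered set formed by all key-values to be inserted or deleted that follow the second key-value in the disk B+ tree, to obtain a comparison result;
    a merging unit, configured to merge the memory B+ tree with the disk B+ tree according to the comparison result and the operation types of all key-values in the memory B+ tree nodes, to generate a new disk B+ tree and thereby implement batch insertion and deletion of B+ tree nodes.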
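  7. The device according to claim 6, characterized in that the extraction unit comprises:
    a setting subunit, configured to set the leftmost leaf node of the memory B+ tree as its cursor position, the cursor at the cursor position being 0;
    a first extraction subunit, configured to extract, from the node corresponding to the cursor in the memory B+ tree, the smallest key-value in that node as the first key-value;
    a query subunit, configured to look up, according to the key-value pointed to by the cursor in the memory B+ tree, the corresponding key-value in the disk B+ tree and the node to which it belongs, and to set the position of that key-value as the cursor position of the disk B+ tree;
    a second extraction subunit, configured to extract, from the node corresponding to the cursor in the disk B+ tree, the smallest key-value in that node as the second key-value.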
  8. The device according to claim 6, characterized in that the operation type of a key-value in the memory B+ tree node is one of insert, delete and abort.
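  9. The device according to claim 6, characterized in that the device further comprises:
    an obtaining unit, configured to obtain, according to the comparison result, the changed nodes and the unchanged nodes of the disk B+ tree;
    correspondingly, the merging unit comprises:
    a first merging subunit, configured to merge the key-values of the memory B+ tree into the disk B+ tree one after another, according to the result of comparing the key-values in the memory B+ tree with the key-values in the changed nodes of the disk B+ tree and according to the operation types of the key-values in the memory B+ tree, to generate a new disk B+ tree;
    a second merging subunit, configured to use the unchanged nodes of the disk B+ tree and their key-values directly as nodes and key-values of the generated new disk B+ tree.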
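  10. The device according to claim 8, characterized in that the merging unit comprises:
    a skipping subunit, configured to, if the operation type of the key-value in the current memory B+ tree node is delete or abort, skip the key-value in the current memory B+ tree node, move the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each from left to right, and compare the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
    or,
    a selection subunit, configured to, if the operation type of the key-value in the current memory B+ tree node is insert, select the key-value in the current memory B+ tree node and insert it into a new node, then move the cursor in the memory B+ tree node and the cursor in the disk B+ tree node each bottom-up, and compare the next key-value in the memory B+ tree node with the next key-value in the disk B+ tree node;
    a third merging subunit, configured to, once the cursors of both the memory B+ tree and the disk B+ tree have finished traversing, merge to generate a new disk B+ tree and update the root node of the generated new disk B+ tree.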
PCT/CN2018/124696 2018-10-22 2018-12-28 Method and device for batch insertion and deletion of B+ tree nodes WO2020082597A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811231305.8 2018-10-22
CN201811231305.8A CN109522271B (zh) 2018-10-22 2018-10-22 Method and device for batch insertion and deletion of B+ tree nodes

Publications (1)

Publication Number Publication Date
WO2020082597A1 (zh) 2020-04-30

Family

ID=65772444

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124696 WO2020082597A1 (zh) 2018-10-22 2018-12-28 Method and device for batch insertion and deletion of B+ tree nodes

Country Status (2)

Country Link
CN (1) CN109522271B (zh)
WO (1) WO2020082597A1 (zh)


Also Published As

Publication number Publication date
CN109522271B (zh) 2021-05-18
CN109522271A (zh) 2019-03-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18938157

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18938157

Country of ref document: EP

Kind code of ref document: A1