CN108920708B - Data processing method and device - Google Patents


Info

Publication number: CN108920708B (granted); CN108920708A (application)
Application number: CN201810805648.4A
Authority: CN (China)
Language: Chinese (zh)
Inventor: 王洋
Original and current assignee: Hangzhou H3C Technologies Co Ltd
Legal status: Active (assumed; not a legal conclusion)
Prior art keywords: node page, key, page, leaf node, memory element
Events: application filed by Hangzhou H3C Technologies Co Ltd; priority to CN201810805648.4A; publication of application CN108920708A; application granted; publication of CN108920708B

Abstract

The invention provides a data processing method and device, the method comprising: creating a target multi-version B+ tree (MVBT) data storage structure; and responding to a received data processing request based on the target MVBT data storage structure. Embodiments of the invention can improve data processing performance.

Description

Data processing method and device
Technical Field
The present invention relates to the field of network communication technologies, and in particular, to a data processing method and apparatus.
Background
The MVBT (Multi-Version B+ Tree) modifies the traditional B+ tree by adding a life cycle (lifespan) to each Key, recorded as [a, b) with b ≥ a, where a is the version number generated when the Key is inserted and b is the version number generated when the Key is deleted; when b is +∞, the Key has not been deleted. A Key whose lifespan is [a, +∞) is called a live Key; the remaining Keys (those whose ending version number is not +∞) are called dead Keys.
However, practice shows that in the conventional MVBT implementation, the memory elements within a page are unordered (that is, they are not sorted by Key size), so searching for a Key requires traversing the memory elements in the page, which is inefficient.
Disclosure of Invention
The invention provides a data processing method and a data processing apparatus, aiming to solve the problems that the conventional MVBT implementation searches for Keys inefficiently and supports only fixed-length Keys and Values.
According to a first aspect of the present invention, there is provided a data processing method comprising:
creating a target multi-version B+ tree (MVBT) data storage structure, wherein the target MVBT data storage structure comprises memory pages organized in a tree structure and a sorting array associated with each memory page; each memory page comprises at least one memory element, each memory element records a Key, and each array element in the sorting array represents the position offset of a memory element within the memory page;
and responding to the received data processing request based on the target MVBT data storage structure.
According to a second aspect of the present invention, there is provided a data processing apparatus comprising:
a creation unit configured to create a target multi-version B+ tree (MVBT) data storage structure, wherein the target MVBT data storage structure comprises memory pages organized in a tree structure and a sorting array associated with each memory page; each memory page comprises at least one memory element, each memory element records a Key, and each array element in the sorting array represents the position offset of a memory element within the memory page;
a receiving unit, configured to receive a data processing request;
and the processing unit is used for responding to the received data processing request based on the target MVBT data storage structure.
In a third aspect, the present application provides a network device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine-executable instructions to implement the data processing method steps described above.
In a fourth aspect, the present application provides a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, may cause the processor to perform the data processing method steps described above.
By applying the technical solution disclosed by the invention, the data storage structure of the conventional MVBT implementation is modified: an associated sorting array is added for each tree-structure memory page, with each array element in the sorting array representing the position offset of a memory element within the memory page. The Keys recorded by the memory elements in the associated tree-structure memory pages can then be searched in order based on the sorting array, which improves data processing performance.
Drawings
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
Figs. 2A-2B are schematic diagrams of a tree-structure memory page and its associated sorting array according to an embodiment of the present invention;
Fig. 3A is a schematic diagram of a terminate-and-spread operation on a memory page according to an embodiment of the present invention;
Fig. 3B is a schematic diagram of a terminate-and-insert operation on a memory page according to an embodiment of the present invention;
Figs. 4A-4C are schematic diagrams illustrating how Key range overlap is prevented when a memory element is inserted into a memory page according to an embodiment of the present invention;
Fig. 5A is a schematic diagram of a terminate-and-split operation on a memory page according to an embodiment of the present invention;
Fig. 5B is a schematic diagram of a terminate-and-merge operation on a memory page according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of memory pages in a tree structure according to an embodiment of the present invention;
Fig. 7 is a block diagram of a data processing apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of a hardware structure of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, some terms and the Key sorting rules involved in the embodiments are briefly described below.
I. Terminology:
Terminate: insert the memory elements of the live Keys contained in a Page (memory page) into a new Page, and mark the live Keys in the original Page as dead Keys;
Terminate and insert: insert the memory elements of the live Keys contained in a Page, together with the memory element recording the Key to be inserted, into one new Page, and mark the live Keys in the original Page as dead Keys;
Terminate and spread: insert the memory elements of the live Keys contained in a Page, together with the memory element recording the Key to be inserted, evenly into two new Pages, and mark the live Keys in the original Page as dead Keys;
Terminate and merge: insert the memory elements of the live Keys contained in a Page and its sibling Page into one new Page, and mark the live Keys in the original Page and the sibling Page as dead Keys;
Terminate and split: insert the memory elements of the live Keys contained in a Page and its sibling Page evenly into two new Pages, and mark the live Keys in the original Page and the sibling Page as dead Keys.
In any of these operations, the ending version number of a newly marked dead Key is the version number corresponding to the operation, and the starting version number of a live Key inserted into a new Page is likewise the version number corresponding to the operation.
II. Key sorting rules:
Keys are sorted in ascending or descending order; the composition of a Key and the comparison rules are as follows:
A Key consists of two parts: the value of the Key and the life cycle of the Key. The value of a Key can be customized by the user, while the life cycle depends on the write transaction that generated or modified the Key. A Key is denoted herein in the form {a, [b, c)}, where a is the value of the Key and [b, c) is the Key's life cycle.
Key comparison takes two forms: comparison at insertion and comparison at search.
For comparison at insertion:
First compare the values of the Keys: the larger the value, the larger the Key. If the values are the same, compare the life cycles of the Keys: the Key with the larger starting version number is the larger Key. Examples:
1. Key1 = {3, [1, +∞)} and Key2 = {4, [5, 6)}: since 4 > 3, Key2 > Key1.
2. Key1 = {3, [1, 4)} and Key2 = {3, [6, +∞)}: since 6 > 1, Key2 > Key1.
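As an illustrative sketch, the insert-time rule above can be modeled in Python; the tuple encoding of a Key as (value, (start, end)) and the function name are assumptions for illustration, not the patent's implementation:

```python
# Illustrative sketch of the insert-time comparison rule: compare Key
# values first; on a tie, the Key with the larger starting version number
# is the larger Key. The tuple encoding is an assumption for illustration.
INF = float("inf")

def insert_compare(key1, key2):
    """Return -1, 0, or 1 as key1 is less than, equal to, or greater than key2."""
    v1, (s1, _e1) = key1
    v2, (s2, _e2) = key2
    if v1 != v2:                     # larger value => larger Key
        return (v1 > v2) - (v1 < v2)
    return (s1 > s2) - (s1 < s2)     # same value: larger start version wins

# The two numbered examples above:
assert insert_compare((3, (1, INF)), (4, (5, 6))) == -1   # Key2 > Key1
assert insert_compare((3, (1, 4)), (3, (6, INF))) == -1   # Key2 > Key1
```

Note that the ending version number plays no role at insert time; only the value and the starting version decide the order.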
For comparison at search time:
The Key to be searched is referred to herein as the Search_Key, and a Key already present in the MVBT as the Compare_Key. At search time, the version number of the Search_Key is a single value, not an interval.
When comparing the Search_Key with the Compare_Key, their values are compared first: the larger the value, the larger the Key.
If the values are the same and the version number of the Search_Key is not less than the ending version number of the Compare_Key's life cycle, the Search_Key is greater than the Compare_Key;
if the version number of the Search_Key is less than the starting version number of the Compare_Key's life cycle, the Search_Key is less than the Compare_Key;
if the starting version number of the Compare_Key's life cycle ≤ the version number of the Search_Key < its ending version number, the Search_Key equals the Compare_Key. Examples:
Search_Key = {3, 8}, Compare_Key = {3, [10, +∞)}: Search_Key < Compare_Key;
Search_Key = {3, 8}, Compare_Key = {3, [2, 6)}: Search_Key > Compare_Key;
Search_Key = {3, 8}, Compare_Key = {3, [8, 9)}: Search_Key = Compare_Key;
Search_Key = {3, 8}, Compare_Key = {3, [6, 8)}: Search_Key > Compare_Key.
It should be noted that the life cycles of a given Key normally do not overlap, but repeatedly adding and deleting a Key within the same transaction may produce multiple completely identical Keys. For such Keys, added and then deleted within the same transaction, Compare_Key < Search_Key when compared. For example: Compare_Key = {3, [8, 8)} < Search_Key = {3, 8}.
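The search-time rules can likewise be sketched in Python; the (value, version) / (value, (start, end)) encodings and the function name are assumptions for illustration. Note that the "not less than the ending version" rule already covers the empty [v, v) life cycle of a Key added and deleted in one transaction:

```python
# Illustrative sketch of the search-time comparison rule described above.
# Search_Key is (value, version); Compare_Key is (value, (start, end)).
# The encoding is an assumption for illustration, not the patent's format.
INF = float("inf")

def search_compare(search_key, compare_key):
    """Return -1, 0, or 1 as Search_Key is less than, equal to, or greater
    than Compare_Key under the rules above."""
    sv, ver = search_key
    cv, (start, end) = compare_key
    if sv != cv:                      # values differ: larger value wins
        return (sv > cv) - (sv < cv)
    if ver >= end:                    # at or past the ending version
        return 1                      # (also covers the empty [v, v) case)
    if ver < start:                   # before the starting version
        return -1
    return 0                          # start <= ver < end: a hit

# The examples above:
assert search_compare((3, 8), (3, (10, INF))) == -1
assert search_compare((3, 8), (3, (2, 6))) == 1
assert search_compare((3, 8), (3, (8, 9))) == 0
assert search_compare((3, 8), (3, (6, 8))) == 1
assert search_compare((3, 8), (3, (8, 8))) == 1   # added-then-deleted in one transaction
```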
In order to make the aforementioned objects, features and advantages of the embodiments of the present invention more comprehensible, embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, which is a schematic flow chart of a data processing method according to an embodiment of the present invention, the method may be applied to a network device supporting a Key-Value data storage mode (hereinafter simply referred to as a network device). As shown in Fig. 1, the data processing method may include the following steps:
Step 101: create a target MVBT data storage structure.
In the embodiment of the invention, it is considered that the memory elements in each memory page of the conventional MVBT implementation are unordered, so searching for a Key requires traversing the memory elements in the pages, making Key lookup inefficient and data processing performance poor. To improve data processing performance, the existing MVBT data storage structure can therefore be modified so that the MVBT supports ordered Key lookup.
The modified MVBT data storage structure (referred to herein as the target MVBT data storage structure) includes memory pages organized in a tree structure and a sorting array associated with each memory page. Each memory page includes at least one memory element, each memory element records a Key, and each array element in the sorting array represents the position offset of a memory element within the memory page.
The array elements are stored in the sorting array in order, and their storage order is consistent with the sorted order of the Keys recorded by the corresponding memory elements (hereinafter, ascending Key order is taken as an example).
Taking the memory page shown in fig. 2A as an example, the sorting array associated with the memory page may be as shown in fig. 2B, where each array element in the sorting array is used to identify a position offset of a memory element in the memory page, and the storage order of each array element in the sorting array is consistent with the sorting result of the Key recorded by each memory element.
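The layout of Figs. 2A-2B can be illustrated with a minimal Python model; the concrete offsets and Keys are invented for the example:

```python
# Illustrative model of a memory page and its sorting array: memory
# elements sit at arbitrary offsets inside the page, and the associated
# sorting array lists those offsets in ascending order of the Keys they
# record, so the page can be read in Key order without moving any element.
page = {0: 1, 2: 4, 1: 7}      # position offset -> Key recorded there
sort_array = sorted(page, key=lambda off: page[off])

assert sort_array == [0, 2, 1]                          # offsets in Key order
assert [page[off] for off in sort_array] == [1, 4, 7]   # sequential Key scan
```

Because only the small array of offsets is kept sorted, inserting a memory element never has to shuffle the elements already stored in the page.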
It should be noted that, in practical applications, the number of memory elements in the memory page may also be recorded in the sorting array; the specific implementation is not described here.
Further, the embodiment of the present invention considers that the conventional MVBT implementation supports only fixed-length Keys and Values, which limits the flexibility of data storage, wastes storage resources, and reduces data processing performance.
Accordingly, in one embodiment of the present invention, a memory page in the tree structure (referred to herein as a first-type memory page, which may also be called a TreeNodePage) stores the location information of a Key (referred to herein as the Key location), the location information of the Value corresponding to the Key (referred to herein as the Value location), and the life cycle of the Key, rather than the Key and Value themselves. In addition, memory pages for storing Keys (Data Pages for Keys, referred to herein as second-type memory pages) and memory pages for storing Values (Data Pages for Values, referred to herein as third-type memory pages) are added to the target MVBT data storage structure.
When a first-type memory page serves as a leaf node, the Value location it stores comprises the index of the third-type memory page storing the Value (e.g., the location of that third-type memory page) and the offset of the Value within that third-type memory page; when a first-type memory page serves as a non-leaf node, the Value location it stores comprises the index of the next-layer first-type memory page (in the top-down order of root node page, branch node page, leaf node page) in which the Key corresponding to the Value is located, and the offset of that Value location within the first-type memory page.
For any Key, the Key location in a first-type memory page comprises the index of the second-type memory page in which the Key is stored (e.g., the location of that second-type memory page) and the offset of the Key within that second-type memory page.
In this embodiment, the first-type memory page stores the Key location, Value location, and life cycle through a Key location field, a Value location field, and a life cycle field, respectively; a Key location field together with its corresponding Value location field and life cycle field forms one memory element in the first-type memory page (referred to herein as a first-type memory element).
The lengths of the first-type memory elements in the first-type memory pages are fixed and the same, that is, the number of the first-type memory elements that can be contained in the first-type memory pages is fixed, so that the specific implementation manners of the operations of merging, splitting, and the like of the first-type memory pages can refer to the implementation manners of the operations of merging, splitting, and the like of the memory pages in the conventional MVBT.
The lengths of the Keys and Values stored in the second-type and third-type memory pages are no longer fixed; they may be of any length required in practical applications, which improves the flexibility of data storage, reduces the waste of storage resources, and improves data processing performance.
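A minimal sketch of this indirection, with all page indices, offsets, field names, and byte strings invented for illustration:

```python
# Illustrative sketch of the three page types described above: a tree-node
# (first-type) page holds fixed-size locators and a lifespan, while the
# variable-length Key and Value bytes live in dedicated second- and
# third-type data pages. Every name and number here is an assumption.
INF = float("inf")
key_pages = {7: {0: b"k", 5: b"a-much-longer-key"}}          # page -> offset -> Key bytes
value_pages = {9: {0: b"v", 2: b"value of arbitrary length"}}  # page -> offset -> Value bytes

tree_element = {
    "key_loc": (7, 5),       # (second-type page index, offset of the Key)
    "value_loc": (9, 2),     # (third-type page index, offset of the Value)
    "lifespan": (3, INF),    # live Key: ending version is +infinity
}

def resolve(pages, loc):
    """Follow a (page index, offset) locator to the stored bytes."""
    page_idx, offset = loc
    return pages[page_idx][offset]

assert resolve(key_pages, tree_element["key_loc"]) == b"a-much-longer-key"
assert resolve(value_pages, tree_element["value_loc"]) == b"value of arbitrary length"
```

Because every tree-node element is just two fixed-size locators plus a lifespan, page capacity stays constant even though the referenced Keys and Values vary in length.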
Optionally, the sorting array associated with each first-type memory page may be recorded in a fourth-type memory page (which may be referred to as a SortListPage herein).
Step 102: respond to the received data processing request based on the target MVBT data storage structure.
In the embodiment of the present invention, after the network device establishes the target MVBT data storage structure, the network device may respond to the received data processing request based on the target MVBT data storage structure.
The data processing request may include, but is not limited to: key insertion requests, Key deletion requests, Key reading requests, and the like.
It can be seen that in the method flow shown in Fig. 1, the data storage structure of the conventional MVBT implementation is modified: an associated sorting array is added for each tree-structure memory page, each array element in the sorting array represents the position offset of a memory element within the memory page, and the array elements are stored in order consistent with the sorted order of the Keys recorded by those memory elements. The Keys in the associated tree-structure memory pages can then be searched in order based on the sorting array, which improves data processing performance.
The following describes the processing flow of various types of data processing requests based on the above target MVBT data storage structure.
It should be noted that, unless otherwise specified, all memory pages mentioned below refer to memory pages in a tree structure in the target MVBT data storage structure.
In one embodiment of the application, the responding to the received data processing request based on the target MVBT data storage structure may include:
when the data processing request is an insertion request for a first Key, judging whether the root node page among the tree-structure memory pages has a next-layer node page;
if so, determining, in order from the root node page to the leaf node pages, the first target leaf node page into which the first Key is to be inserted, using binary search based on the sorting arrays; otherwise, determining the root node page as the first target leaf node page;
inserting a first memory element, which records the first Key, into the first target leaf node page.
It should be noted that, in the embodiment of the present invention, the insertion of the Key may include the insertion of the Key, Value and life cycle corresponding to the Key; similarly, the reading of the Key may also include the reading of the Key and the Value corresponding to the Key, which will not be repeated in the following.
In this embodiment, when the first Key needs to be inserted, it may first be determined whether the root node page has a next-layer node page; if not, the root node page is directly determined as the leaf node page into which the first Key is to be inserted (referred to as the first target leaf node page).
When the memory page of the tree structure only includes one memory page, the memory page is used as both a root node page and a leaf node page.
If the root node page has a next-layer node page, the leaf node page into which the first Key is to be inserted (the first target leaf node page) may be determined, in order from the root node page to the leaf node pages, using binary search based on the sorting array associated with each memory page of the target MVBT.
Specifically, binary search may first select one array element in the sorting array associated with the root node page (hereinafter the first target array element; the position offset it represents is hereinafter the first target position offset), and compare the Key recorded by the memory element at the first target position offset in the root node page with the first Key.
If that Key is larger than the first Key, an array element is selected again, by binary search, from the array elements ordered before the first target array element in the sorting array associated with the root node page, until the hit Key is determined.
If that Key is smaller than the first Key, an array element is selected again, by binary search, from the array elements ordered after the first target array element in the sorting array associated with the root node page, until the hit Key is determined.
For ease of description, selecting an array element from the sorting array is described below directly as selecting a position offset from the sorting array.
For example, suppose the root node page contains the Keys 1, 4, and 7, whose position offsets within the root node page are 0, 2, and 1 respectively, so that the sorting array associated with the root node page is, from front to back, 0-2-1. When the Key to be inserted is 5, the Key 5 is first compared with the Key at position offset 2 (i.e., 4). Since 5 > 4, a position offset is selected again from the position offsets ordered after position offset 2, namely position offset 1. Since 5 is smaller than the Key at position offset 1 (i.e., 7), the hit Key in the root node page is determined to be 4.
In this embodiment, after the hit Key in the root node page is determined, binary search is again used to select position offsets and compare Keys in the sorting array associated with the next-layer node page pointed to by the hit Key, and so on, until the first target leaf node page into which the first Key is to be inserted is found.
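The per-page descent step can be sketched as a binary search over the sorting array; the function name and the reading of the hit Key as the largest Key not greater than the Key to insert are assumptions consistent with the worked example above:

```python
# Illustrative binary search over a page's sorting array, matching the
# worked example (Keys 1, 4, 7 at position offsets 0, 2, 1): probe the
# middle array element, follow the offset into the page, and narrow the
# search range by comparing the recorded Key with the target Key.
def hit_key(page, sort_array, target):
    """Return the largest Key in the page not greater than target (the hit Key)."""
    lo, hi, hit = 0, len(sort_array) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        key = page[sort_array[mid]]     # follow the offset into the page
        if key <= target:
            hit = key                   # candidate hit; look further right
            lo = mid + 1
        else:
            hi = mid - 1                # too big; look further left
    return hit

page = {0: 1, 2: 4, 1: 7}               # position offset -> Key
sort_array = [0, 2, 1]                  # offsets in ascending Key order
assert hit_key(page, sort_array, 5) == 4   # the example: Key 4 is hit
```

Each probe touches only the sorting array and one memory element, so the lookup cost per page is logarithmic in the element count rather than linear as in the traversal of the conventional scheme.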
Further, in this embodiment, considering that the capacity of a single memory page is limited, when the memory page determined for Key insertion has insufficient capacity, a new memory page needs to be created for the insertion.
The capacity of a memory page may be expressed as a number of memory elements, where a memory element may be the first-type memory element described in the above embodiment, or may consist of a Key together with its corresponding Value and life cycle. The length of a single memory element is fixed, so the number of memory elements that a single memory page can accommodate can be predetermined.
In addition, because the MVBT imposes a lower limit on the number of live Keys in a memory page (generally greater than 1), i.e., a lower limit on the number of memory elements containing live Keys, simply creating a new memory page and inserting into it only the memory element containing the Key to be inserted would violate this lower limit when the memory page determined for Key insertion has insufficient capacity.
In order to solve the above problem, in an implementation manner of this embodiment, the inserting a first memory element in the first target leaf node page may include:
judging whether the number of memory elements in the first target leaf node page has reached a preset first number threshold;
if the number of memory elements in the first target leaf node page has reached the preset first number threshold, creating a first leaf node page;
inserting the first memory element and the memory element of the live Key contained in the first target leaf node page into the first leaf node page;
marking the live Keys in the first target leaf node page as dead Keys;
an ordering array associated with the first leaf node page is created.
In this embodiment, when the first target leaf node page into which the first Key is to be inserted has been determined, it may be judged whether the number of memory elements in that page has reached a preset upper limit on the number of memory elements (the preset first number threshold, which may be set according to the actual scenario, e.g., the maximum capacity of a memory page). If not, the first memory element is inserted directly into the first target leaf node page. If the limit has been reached, the network device may terminate the first target leaf node page: it creates a first leaf node page into which the memory elements of the live Keys already contained in the first target leaf node page and the first memory element are inserted, and marks the live Keys in the first target leaf node page as dead Keys.
When the network device creates the first leaf node page, it may also create the sorting array associated with the first leaf node page, recording in it the position offset of each memory element within the first leaf node page, with the position offsets ordered consistently with the sorted order of the Keys recorded by the corresponding memory elements.
It should be noted that, in this embodiment, if the number of memory elements in the first target leaf node page would remain less than or equal to the preset first number threshold after the first memory element is inserted, the first memory element may be inserted directly into the first target leaf node page and the associated sorting array updated, i.e., the position offset of the first memory element within the first target leaf node page is added and the position offsets are re-sorted. This is not described in further detail here.
Further, if the memory elements of the live Keys already contained in the first target leaf node page and the first memory element were all inserted into a single first leaf node page, the number of memory elements in that first leaf node page might exceed the preset first number threshold.
For example, assume the preset first number threshold is 6. When a first memory element needs to be inserted and the first target leaf node page already contains 6 memory elements of live Keys, the first target leaf node page needs to be terminated; but if only one first leaf node page were created, 7 memory elements would have to be inserted into it, exceeding the preset first number threshold. In this case, two first leaf node pages need to be created.
Accordingly, in an example, the inserting the first memory element and the memory element of the live Key already contained in the first target leaf node page into the first leaf node page may include:
judging whether the sum of the number of memory elements of the live Keys contained in the first target leaf node page and the number of first memory elements is greater than the preset first number threshold;
if so, creating two first leaf node pages, and inserting the first memory element and the memory elements of the live Keys contained in the first target leaf node page evenly into the two first leaf node pages;
otherwise, creating one first leaf node page, and inserting the first memory element and the memory elements of the live Keys contained in the first target leaf node page into that first leaf node page.
Specifically, in this example, when it is determined that a first leaf node page needs to be created, it may be judged whether the sum of the number of memory elements of the live Keys contained in the first target leaf node page and the number of first memory elements is greater than the preset first number threshold.
If the sum is greater than the preset first number threshold, a terminate-and-spread operation may be performed: two first leaf node pages are created, the memory elements of the live Keys contained in the first target leaf node page and the first memory element are inserted evenly into the two first leaf node pages, and all live Keys in the first target leaf node page are marked as dead Keys.
For example, in the scenario shown in Fig. 3A, assume the preset first number threshold is 6, the first Key to be inserted is {11, [10, +∞)}, and the first target leaf node page contains 6 memory elements, all of which contain live Keys. After the first memory element containing the first Key is inserted, the first target leaf node page would contain 7 memory elements, exceeding the preset first number threshold (6); and the sum (7) of the number of memory elements of live Keys already contained in the first target leaf node page (6) and the number of first memory elements (1) is also greater than the preset first number threshold (6). Therefore, two first leaf node pages need to be created, the memory elements of the live Keys contained in the first target leaf node page and the first memory element are inserted evenly into the two first leaf node pages (4 memory elements into one, 3 into the other), and all live Keys in the first target leaf node page are marked as dead Keys.
It should be noted that, in the embodiment of the present invention, the memory elements in a tree-structure memory page may be unordered, i.e., not sorted by the Keys they contain; ordered search is implemented based on the sorting array associated with the memory page.
In this example, if the sum of the number of memory elements of the live Keys contained in the first target leaf node page and the number of first memory elements is less than or equal to the preset first number threshold, a terminate-and-insert operation may be performed: one first leaf node page is created, the memory elements of the live Keys contained in the first target leaf node page and the first memory element are inserted into it, and all live Keys in the first target leaf node page are marked as dead Keys.
For example, in the scenario shown in Fig. 3B, assume the preset first number threshold is 6, the first Key to be inserted is {11, [10, +∞)}, and the first target leaf node page contains 6 memory elements, of which 2 contain live Keys (namely {9, [3, +∞)} and {10, [3, +∞)}). After the first memory element containing the first Key is inserted, the first target leaf node page would contain 7 memory elements, exceeding the preset first number threshold (6), but the sum (3) of the number of memory elements of live Keys contained in the first target leaf node page (2) and the number of first memory elements (1) is smaller than the preset first number threshold (6). Therefore, only one first leaf node page needs to be created; the memory elements of the live Keys contained in the first target leaf node page and the first memory element are inserted into that first leaf node page, and all live Keys in the first target leaf node page are marked as dead Keys.
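The terminate-and-insert versus terminate-and-spread choice can be sketched as follows; the list model of a page, the THRESHOLD constant, the function name, and the exact even split (ceil/floor halves) are assumptions chosen to match the Fig. 3A/3B numbers:

```python
# Illustrative sketch of terminating a full leaf page: mark its live Keys
# dead at the operation's version, then move the live elements plus the
# element to insert into one new page, or split them evenly across two
# new pages when they would exceed the threshold.
INF = float("inf")
THRESHOLD = 6   # stands in for the preset first number threshold

def terminate_insert(page, new_elem, version):
    """Mark the page's live Keys dead at `version` and return the new
    page(s) holding those live elements plus the element to insert."""
    live = [e for e in page if e[1][1] == INF]
    for i, (k, (s, e)) in enumerate(page):     # mark live Keys as dead
        if e == INF:
            page[i] = (k, (s, version))
    moved = sorted(live + [new_elem])
    if len(moved) > THRESHOLD:                 # terminate-and-spread: two pages
        half = (len(moved) + 1) // 2
        return [moved[:half], moved[half:]]
    return [moved]                             # terminate-and-insert: one page

# Fig. 3B scenario: 6 elements, only 2 live -> one new page of 3 elements.
old = [(5, (1, 3)), (6, (1, 3)), (7, (1, 3)), (8, (1, 3)),
       (9, (3, INF)), (10, (3, INF))]
assert [len(p) for p in terminate_insert(old, (11, (10, INF)), 10)] == [3]
assert all(end != INF for (_k, (_s, end)) in old)     # old page now all dead

# Fig. 3A scenario: 6 live elements -> two new pages of 4 and 3 elements.
old2 = [(k, (1, INF)) for k in range(1, 7)]
assert [len(p) for p in terminate_insert(old2, (11, (10, INF)), 10)] == [4, 3]
```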
Further, in this embodiment, considering that operations such as Key insertion, deletion, and reading in the MVBT all need to perform positioning of leaf node pages in an order from a root node page to a leaf node page (an order from top to bottom), in order to ensure that a newly created leaf node page can be positioned, a memory element pointing to the newly created leaf node needs to be added to an upper node of the newly created leaf node page.
Accordingly, after the creating the first leaf node page, the method may further include:
judging whether the first target leaf node page is a root node page or not;
if the first target leaf node page is not the root node page, inserting a second memory element into the parent node page of the first target leaf node page, wherein a data pointer pointing to the first leaf node page is recorded in the second memory element, and updating the sorting array associated with the parent node page;
if the first target leaf node page is a root node page and the number of the first leaf node pages is greater than 1, creating another node page, inserting a second memory element into the other node page, wherein a data pointer pointing to the first leaf node page is recorded in the second memory element, and creating a sorting array associated with the other node page.
Specifically, if the first target leaf node page is a root node page (that is, the tree in which the first target leaf node page is located has only one layer of nodes, and the first target leaf node page is both a root node page and a leaf node page), when the first leaf node page is created, the original one tree is changed into two trees (one tree is a tree whose root is the original first target leaf node page, and the other tree is a tree corresponding to the first leaf node page), and at this time, a subsequent processing policy needs to be further determined according to the number of the first leaf node pages.
If the number of the first leaf node pages is 1, the first leaf node pages are directly used as root node pages of a new tree, and at this time, the first leaf node pages do not have father node pages and do not need to insert second memory elements into the father node pages.
If the number of the first leaf node pages is greater than 1 (2 in this embodiment), another node page needs to be created as the root node page of the new tree, a second memory element is inserted into the another node page, and an associated sorting array of the another node page is created.
The sorting array records the position offsets of the second memory elements in the other node page, with the position offsets sorted according to the Keys they point to, in the manner described above.
If the first target leaf node page is not the root node page, a second memory element needs to be inserted into the parent node page of the first target leaf node page, and the sorting array associated with the parent node page needs to be updated.
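A sketch of inserting the pointer element into the parent page and refreshing its sorting array (the function name and element fields are assumptions; the patent does not prescribe this API):

```python
INF = float("inf")

def insert_pointer_element(parent_page, sort_array, child_page,
                           child_min_key, version):
    """Insert a second memory element recording a data pointer to the newly
    created leaf page, then rebuild the parent's sorting array so binary
    search over Keys keeps working while elements stay unsorted in place."""
    parent_page.append({"key": child_min_key, "life": (version, INF),
                        "ptr": child_page})
    sort_array[:] = sorted(range(len(parent_page)),
                           key=lambda i: parent_page[i]["key"])

parent = [{"key": 7, "life": (1, INF), "ptr": None},
          {"key": 1, "life": (1, INF), "ptr": None}]
order = [1, 0]  # offsets ordered by Key: 1 before 7
insert_pointer_element(parent, order, child_page=[], child_min_key=4, version=9)
```

The new element lands at offset 2, and the rebuilt sorting array `[1, 2, 0]` restores Key order 1, 4, 7 without moving any element inside the page.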
It should be noted that, in the embodiment of the present invention, when the first target leaf node page is not the root node page and the second memory element is inserted into the parent node page of the first target leaf node page, the number of memory elements in the parent node page may also come to exceed the preset first number threshold. Therefore, whether a terminate-and-insert or terminate-and-split operation needs to be performed on the parent node page may also be determined in the manner described in the above embodiment; for the specific implementation, reference may be made to the related processing of the first target leaf node page in the above embodiment, and details are not described herein again.
Further, in this embodiment, it is considered that when the first target leaf node page is not the root node page and the second memory element is inserted into the parent node page of the first target leaf node page, a Key range may overlap in the parent node page (that is, a memory element recording the same Key exists in the parent node page), and further, when data search is performed by using the binary method, an exception may occur.
For example, taking the scenario shown in fig. 4A as an example, assume that the preset first number threshold is 5 and a Key {9, [9, +∞)} needs to be inserted, so that the hit first target leaf node page is the lower-right leaf node page (Page3). After a memory element containing the Key {9, [9, +∞)} is inserted, the number of memory elements in the first target leaf node page is 6, which is greater than the preset first number threshold (5), and the sum (6) of the number (5) of memory elements containing a live Key already in the first target leaf node page and the number (1) of memory elements to be inserted is also greater than the preset first number threshold (5). Therefore, a terminate-and-split operation needs to be performed on the leaf node page; a schematic diagram of the memory pages of the tree structure after the terminate-and-split operation may be as shown in fig. 4B.
In the memory pages of the tree structure shown in fig. 4B, the Keys in the terminated leaf node page (Page3) pointed to by Key {4, [6, 9)} in the root node page have values in the range of 4 to 8. Since the values of the Keys in the root node page (Page1) are 1, 4 and 7, respectively, according to the dichotomy principle a Key with a value greater than 7 should be searched for in the leaf node page (Page5) pointed to by Key {7, [9, +∞)}. Therefore, when a memory element corresponding to a Key with a value greater than 7 is searched for in the tree, the search can only reach the data in the leaf node page (Page5) pointed to by Key {7, [9, +∞)}, but cannot reach the data in the leaf node page (Page3) pointed to by Key {4, [6, 9)}.
In an example, after inserting the second memory element into the parent node page of the first target leaf node page, the method may further include:
judging whether, among the memory elements contained in the parent node page, there is a memory element recording the same Key as the second memory element;
if so, creating a node page in the same layer as the parent node page, and inserting the memory elements of the live Keys contained in the parent node page and the second memory element into the same-layer node page;
marking the live Keys in the parent node page as dead Keys; and
creating a sorting array associated with the same-layer node page.
Specifically, in this example, after the second memory element pointing to the first leaf node page is inserted into the parent node page of the first target leaf node page, it may be determined whether a memory element recording the same Key as the second memory element exists among the memory elements contained in the parent node page.
In this example, if such a memory element exists, a node page in the same layer as the parent node page may be created, the memory elements of the live Keys already contained in the parent node page and the second memory element may be inserted into the same-layer node page, and the live Keys in the parent node page may be marked as dead Keys.
For example, taking the scenario shown in fig. 4A as an example, when the insertion of Key {9, [9, +∞)} causes a terminate-and-split operation to be performed on the lower-right leaf node page (Page3), in order to avoid Key range overlap a terminate-and-insert operation may be performed on its parent node page (Page1): the memory elements of the live Keys already contained in the parent node page, together with the memory elements pointing to the two first leaf node pages (Page4 and Page5) created by the terminate-and-split operation on the lower-right leaf node page (Page3), are inserted into a node page at the same layer (Page6), and the live Keys in the parent node page (Page1) are marked as dead Keys; a schematic diagram is shown in fig. 4C.
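A hedged sketch of the overlap check described above (the field names and the return convention are illustrative assumptions):

```python
INF = float("inf")

def insert_with_overlap_check(parent_page, second_element, version):
    """If a live element in the parent already records the same Key as the
    element being inserted, perform a terminate-and-insert on the parent:
    move copies of its live elements plus the new element to a same-layer
    node page and mark the originals dead. Otherwise insert in place."""
    live = [e for e in parent_page if e["life"][1] == INF]
    if any(e["key"] == second_element["key"] for e in live):
        same_layer_page = [dict(e) for e in live] + [second_element]
        for e in live:
            e["life"] = (e["life"][0], version)
        return same_layer_page  # caller must link it from the level above
    parent_page.append(second_element)
    return None                 # no Key range overlap; nothing new to link

parent = [{"key": 1, "life": (1, INF)}, {"key": 4, "life": (6, INF)}]
page6 = insert_with_overlap_check(parent, {"key": 4, "life": (9, INF)}, 9)
```

Because Key 4 is already live in the parent, a same-layer page is returned and the parent's live Keys are closed, mirroring the Page1 → Page6 step in fig. 4C.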
It should be noted that, in this example, after the same-layer node page has been created and the memory elements of the live Keys in the parent node page together with the second memory element have been inserted into it, if the parent node page is not the root node page, a memory element recording a data pointer pointing to the same-layer node page needs to be inserted into the parent node page of the parent node page.
For example, taking the scenario shown in fig. 4C as an example, assuming that Page1 is not the root node Page, after the above operations are completed, a memory element recording a data pointer pointing to Page6 needs to be inserted into the parent node Page of Page 1.
In this case, the number of memory elements in the parent node page of the parent node page may also come to exceed the preset first number threshold; therefore, whether a terminate-and-insert or terminate-and-split operation needs to be performed on the parent node page of the parent node page may also be determined in the manner described in the above embodiment.
In another embodiment of the present application, the responding to the received data processing request based on the target MVBT data storage structure may include:
when the data processing request is a deletion request aiming at the second Key, selecting array elements in the sequencing array by utilizing a dichotomy according to the sequence from the root node page to the leaf node page in the tree structure;
comparing the Key recorded in the memory element corresponding to the array element with the second Key to determine a second target leaf node page; the second target leaf node page comprises a first target memory element, and the Key recorded by the first target memory element is matched with the second Key;
and marking the Key of the target memory element record in the second target leaf node page as a dead Key.
In this embodiment, the second Key does not refer specifically to a fixed Key, but may refer to any Key that needs to be deleted.
In this embodiment, when the second Key needs to be deleted, an array element may be selected in the sorting array by using a dichotomy according to the sequence from the root node page to the leaf node page in the tree structure, and the Key recorded in the memory element corresponding to the array element may be compared with the second Key to determine the second target leaf node page.
Specifically, a position offset (hereinafter referred to as the second target position offset) may first be selected, by dichotomy, from the sorting array associated with the root node page, and the Key contained in the memory element in the root node page pointed to by the second target position offset may be compared with the second Key.
If the Key is equal to the second Key: when the root node page has no next-layer node page, the root node page is directly determined as the second target leaf node page, and the memory element is determined as the target memory element matching the second Key; when the root node page has a next-layer node page, position offsets continue to be selected by dichotomy, and Keys compared, in the sorting array associated with the next-layer node page pointed to by the memory element in the root node page.
If the Key is greater than the second Key, position offsets continue to be selected by dichotomy, and Keys compared, from among the position offsets sorted before the second target position offset in the sorting array associated with the root node page, until either two adjacent position offsets are found such that the Key corresponding to the one sorted in front is smaller than the second Key and the Key corresponding to the one sorted behind is greater than the second Key, in which case the memory element corresponding to the position offset sorted in front is determined as the memory element hit in the root node page; or a position offset whose corresponding Key is equal to the second Key is found, in which case the memory element corresponding to that position offset is determined as the memory element hit in the root node page.
If the Key is smaller than the second Key, the same procedure is applied among the position offsets sorted after the second target position offset in the sorting array associated with the root node page: position offsets continue to be selected by dichotomy and Keys compared, until either two adjacent position offsets are found whose corresponding Keys bracket the second Key, in which case the memory element corresponding to the position offset sorted in front is determined as the memory element hit in the root node page; or a position offset whose corresponding Key is equal to the second Key is found, in which case the memory element corresponding to that position offset is determined as the memory element hit in the root node page.
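The per-page locate step above can be sketched as a single binary search over the sorting array (a simplification that ignores life cycles and returns either an exact hit or the element whose Key is the largest one below the search Key; names are hypothetical):

```python
def locate_in_page(page, sort_array, search_key):
    """Binary search over the sorting array: compare the Key in the element
    each probed offset points to, and return either an exact match or the
    'sorted in front' element whose Key is the largest one < search_key."""
    lo, hi = 0, len(sort_array) - 1
    hit = None
    while lo <= hi:
        mid = (lo + hi) // 2
        elem = page[sort_array[mid]]
        if elem["key"] == search_key:
            return elem              # exact hit
        if elem["key"] < search_key:
            hit = elem               # best predecessor so far
            lo = mid + 1
        else:
            hi = mid - 1
    return hit                       # predecessor, or None if none exists

page = [{"key": 7}, {"key": 1}, {"key": 4}]   # elements unsorted in the page
order = [1, 2, 0]                             # offsets sorted by Key
```

`locate_in_page(page, order, 5)` returns the element with Key 4 (the predecessor, used to descend into the child covering the range), while searching for 7 returns the exact element.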
In this embodiment, when the second target leaf node page where the target memory element matched with the second Key is located is determined, the Key included in the target memory element in the second target leaf node page may be marked as a dead Key, for example, the ending version number of the Key included in the target memory element in the second target leaf node page is updated to the version number corresponding to this operation.
Further, in this embodiment, considering that non-root memory pages in the MVBT are subject to a minimum live-Key limit (that is, a lower limit on the number of memory elements containing a live Key that a memory page must contain), deleting a Key contained in a memory page reduces the number of memory elements containing a live Key in that page, which may cause the page to fall below the minimum live-Key limit.
Accordingly, in an implementation manner of this embodiment, after the marking of the Key recorded in the target memory element in the second target leaf node page as a dead Key, the method may further include:
when the second target leaf node page is not the root node page, judging whether the number of the memory elements of the live Key contained in the second target leaf node page is smaller than a preset second number threshold value or not;
if so, creating a second leaf node page, inserting the memory elements of the live Keys contained in the second target leaf node page and a sibling page of the second target leaf node page into the second leaf node page, and marking the live Keys in the second target leaf node page and the sibling page as dead Keys;
and creating a sorting array of the second leaf node page association.
In this embodiment, when the second target leaf node page is not the root node page, after the Key included in the target memory element in the second target leaf node page is marked as a dead Key, it may be determined whether the number of memory elements of a live Key included in the second target leaf node page is less than a preset second number threshold.
If it is smaller, a terminate operation may be performed on the second target leaf node page and a sibling page of the second target leaf node page: a second leaf node page is created, the memory elements of the live Keys contained in the second target leaf node page and its sibling page are inserted into the second leaf node page, and the live Keys in the second target leaf node page and the sibling page are marked as dead Keys.
The sibling page of the second target leaf node page may be the node page immediately adjacent to it on the left or right that shares the same parent node page.
When the network device creates the second leaf node page, the sorting array associated with the second leaf node page can be created.
Further, when a terminate operation is performed on the second target leaf node page and its sibling page, if the number of memory elements containing a live Key in the two pages is large, then after these memory elements are inserted into the second leaf node page, the number of memory elements in the second leaf node page may be greater than the preset first number threshold.
Accordingly, in an example, the creating a second leaf node page, and inserting the memory element of the live Key included in the second target leaf node page and the sibling page of the second target leaf node page into the second leaf node page includes:
judging whether the sum of the numbers of memory elements of live Keys contained in the second target leaf node page and the sibling page is greater than the preset first number threshold;
if so, creating two second leaf node pages, and splitting and inserting the memory elements of the live Keys contained in the second target leaf node page and the sibling page into the two second leaf node pages;
otherwise, creating one second leaf node page, and inserting the memory elements of the live Keys contained in the second target leaf node page and the sibling page into the second leaf node page.
Specifically, in this example, when creating the second leaf node page, it may be determined whether the sum of the number of memory elements of the live Key already contained in the second target leaf node page and the number of memory elements of the live Key already contained in the sibling page of the second target leaf node page is greater than a preset first number threshold.
If so, a terminate-and-split operation may be performed: two new second leaf node pages are created, the memory elements of the live Keys contained in the second target leaf node page and its sibling page are split and inserted into the two second leaf node pages, and the live Keys in the second target leaf node page and the sibling page are marked as dead Keys.
For example, taking the scenario shown in fig. 5A as an example, assume that the preset second number threshold is 2, the preset first number threshold is 6, and the second Key to be deleted is {9, [10, +∞)}, with the second target leaf node page containing 2 memory elements with a live Key. When the second Key is marked as a dead Key, the number of memory elements containing a live Key in the second target leaf node page falls below 2 and no longer satisfies the preset second number threshold, so a terminate operation must be performed on the second target leaf node page and its sibling page. Since the sum of the numbers of memory elements containing a live Key in the second target leaf node page and its sibling page (1 + 6 = 7) is greater than the preset first number threshold (6), two second leaf node pages need to be created, the memory elements of the live Keys contained in the second target leaf node page and its sibling page are split and inserted into the two second leaf node pages (four into one and three into the other), and the live Keys in the second target leaf node page and its sibling page are marked as dead Keys.
In this example, if the sum of the numbers of memory elements of the live Keys contained in the second target leaf node page and its sibling page is less than or equal to the preset first number threshold, a terminate-and-merge operation may be performed: a second leaf node page is created, the memory elements of the live Keys contained in the second target leaf node page and its sibling page are inserted into the second leaf node page, and the live Keys in both pages are marked as dead Keys.
For example, taking the scenario shown in fig. 5B as an example, assume that the preset second number threshold is 2, the preset first number threshold is 6, and the second Key to be deleted is {9, [10, +∞)}, with the second target leaf node page containing 2 memory elements with a live Key. When the second Key is marked as a dead Key, the number of memory elements containing a live Key in the second target leaf node page falls below 2 and no longer satisfies the preset second number threshold, so a terminate operation must be performed on the second target leaf node page and its sibling page. Since the sum of the numbers of memory elements containing a live Key in the second target leaf node page and its sibling page (1 + 3 = 4) is less than the preset first number threshold (6), only one second leaf node page needs to be created; the memory elements of the live Keys contained in the second target leaf node page and its sibling page are inserted into the second leaf node page, and the live Keys in the second target leaf node page and its sibling page are marked as dead Keys.
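The choice between terminate-and-merge and terminate-and-split might be sketched as follows (the even split and the function shape are illustrative assumptions):

```python
INF = float("inf")

def terminate_merge_or_split(page_a, page_b, first_threshold, version):
    """Gather copies of the live elements of a page and its sibling, mark
    the originals dead, and return one new page (terminate-and-merge) or
    two (terminate-and-split), per the preset first number threshold."""
    live = [e for e in page_a + page_b if e["life"][1] == INF]
    moved = sorted((dict(e) for e in live), key=lambda e: e["key"])
    for e in live:
        e["life"] = (e["life"][0], version)   # close old life cycles
    if len(moved) > first_threshold:          # terminate-and-split
        half = (len(moved) + 1) // 2          # e.g. 7 elements -> 4 and 3
        return [moved[:half], moved[half:]]
    return [moved]                            # terminate-and-merge

# Fig. 5A scenario: 1 + 6 = 7 live elements, first threshold 6 -> split.
a = [{"key": 10, "life": (3, INF)}]
b = [{"key": k, "life": (1, INF)} for k in range(1, 7)]
pages = terminate_merge_or_split(a, b, first_threshold=6, version=11)
```

With 7 live elements against a threshold of 6, two pages of 4 and 3 elements come back; with 1 + 3 = 4 elements (fig. 5B), a single merged page would be returned instead.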
Further, in this embodiment, in consideration that operations such as Key insertion, deletion, and reading in the MVBT all need to perform positioning of leaf node pages in an order from a root node page to a leaf node page (an order from top to bottom), in order to ensure that a newly created leaf node page can be positioned, a memory element pointing to a newly created leaf node needs to be added to an upper node page of the newly created leaf node page.
Accordingly, in an example, after the creating of the second leaf node page, the method further includes:
and inserting a third memory element in the father node page of the second target leaf node page, recording a data pointer in the third memory element, pointing to the second leaf node page by the data pointer, and updating the sequencing array associated with the father node page.
For a specific implementation of this example, reference may be made to a related implementation of a case where a first target leaf node page is not a root node page in a processing flow of an insertion request for a first Key, and details of an embodiment of the present invention are not described herein again.
Further, in this embodiment, it is considered that when the second target leaf node page is not the root node page and the third memory element pointing to the second leaf node page is inserted into the parent node page of the second target leaf node page, the Key ranges of the parent node page may overlap, and further, when data search is performed by using the binary method, an abnormality may occur.
In one example, after the third memory element is inserted into the parent node page of the second target leaf node page, the method further includes:
judging whether, among the memory elements contained in the parent node page, there is a memory element recording the same Key as the third memory element;
if so, creating a node page in the same layer as the parent node page, and inserting the memory elements of the live Keys contained in the parent node page and the third memory element into the same-layer node page;
marking the live Keys in the parent node page as dead Keys; and
creating a sorting array associated with the same-layer node page.
The specific implementation of this example may refer to a related implementation in a processing flow of an insertion request for a first Key, and details of the embodiment of the present invention are not described herein again.
In another embodiment of the present application, the responding to the received data processing request based on the target MVBT data storage structure may include:
when the data processing request is a read request aiming at a third Key and a third Value corresponding to the third Key, selecting array elements in a sequencing array by utilizing a dichotomy according to the sequence from a root node page to a leaf node page in the tree structure;
comparing the Key recorded in the memory element corresponding to the array element with the third Key to determine a third target leaf node page; the third target leaf node page comprises a second target memory element, and the Key recorded by the second target memory element is matched with the third Key;
and reading a third Key and a third Value from a second target memory element included in a third target leaf node page.
In this embodiment, the third Key does not refer specifically to a fixed Key, but may refer to any Key that needs to be read.
In this embodiment, when the third Key and the third Value corresponding to the third Key need to be read, an array element may be selected in the sorting array by using a binary method according to an order from a root node page to a leaf node page in the tree structure, and the Key recorded in the memory element corresponding to the array element and the third Key are compared to determine a third target leaf node page.
The specific implementation of determining the third target leaf node page where the target memory element matched with the third Key is located by using the bisection method may refer to related implementation in the processing flow of the deletion request for the second Key, and this embodiment of the present invention is not described herein again.
It should be noted that, when Key deletion and the Key search it entails are performed, the matched Key is a live Key (i.e., a Key whose life cycle ends at +∞); when a Key is read, the read request carries the version number of the Key, and the version number also needs to be matched during the search.
In this embodiment, when the third target leaf node page where the second target memory element matching the third Key is located is determined, the third Key and the third Value may be read according to the second target memory element in the third target leaf node page.
When the memory element in the leaf node page includes a Key, a corresponding Value, and a life cycle, the third Key and the corresponding third Value may be directly read from the second target memory element in the third target leaf node page.
When the memory elements in the leaf node page include a Key location and a corresponding Value location and a life cycle, the third Key and a corresponding third Value may be read according to the Key location and the Value location included in the second target memory element in the third target leaf node page.
It should be noted that, in this embodiment, when the third Key needs to be read and the third target leaf node page where the second target memory element matching the third Key is located is determined, if the life cycle of the Key contained in the second target memory element matches the version number of the third Key (that is, the final search result is Search_Key == Match_Key), the search result is determined to be an exact match; if the life cycle of the Key contained in the second target memory element does not match the version number of the third Key (that is, the final search result is Search_Key > Match_Key or Search_Key < Match_Key), the search result is determined to be an inexact match.
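This exact/inexact classification might be sketched as (a simplified reading of the rule above; the function and field names are hypothetical):

```python
def classify_search(search_key, hit_element, version):
    """Exact match only when the hit element records the same Key and the
    requested version number falls inside that Key's life cycle
    [start_version, end_version); anything else is an inexact match."""
    if hit_element is None:
        return "miss"
    start, end = hit_element["life"]
    if hit_element["key"] == search_key and start <= version < end:
        return "exact"
    return "inexact"
```

For example, reading Key 5 at version 1 against a hit whose life cycle is [1, 2) is exact, while reading the same Key at version 3 is inexact because the Key was already dead at that version.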
Further, in this embodiment, Key search proceeds in order from the root node page to the leaf node pages of the tree structure, whereas Key insertion and deletion proceed in order from the leaf node pages to the root node page. Therefore, in order to improve the efficiency of Key insertion, deletion and similar operations, and thereby further improve data processing efficiency, the search path may be recorded while a Key search is performed.
Correspondingly, in an implementation manner of this embodiment, when the third Key and the third Value corresponding to the third Key need to be read, a search path for acquiring the third Key and the Value corresponding to the third Key may be recorded according to an order from a root node page to a leaf node page in the tree structure; the search path may include the number of the memory page hit in each layer in the search process, Key information hit in the memory page, and a life cycle corresponding to the hit Key information.
In one example, an array (which may be referred to as a Stack structure) may be added to the target MVBT data storage structure to record the search path when Key reading is performed. Each element of the array contains two fields: the number of the memory page and Key information. The 0 th element of the array represents the access information in the root node page in the searching process, the 1 st element represents the access information in the second layer node page (if the access information exists), the 2 nd element represents the access information in the third layer node page (if the access information exists), and so on until the leaf node page. Nodes on the search path can be traced back through the Stack structure, so that Key insertion, deletion and other operations in subsequent processes are facilitated.
It should be noted that the search path information may further include the number of layers of node pages in the search path (which may be denoted depth). For example, for the Stack structure, a depth variable (which may be denoted Stack.depth) may be maintained. Since each element in the Stack structure represents the access information of one layer of memory page nodes, Stack.depth also represents the number of valid elements in the Stack.
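The Stack structure described above might be represented as follows (the field names mirror the Stack.path/Stack.depth notation used in the text; the class shape itself is an assumption):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PathEntry:
    page_no: str             # number (name) of the memory page hit at this layer
    key: int                 # value of the Key hit in that page
    life: Tuple[int, float]  # life cycle of the hit Key

@dataclass
class Stack:
    depth: int = 0                                  # valid elements in path
    path: List[PathEntry] = field(default_factory=list)

    def push(self, page_no, key, life):
        """Record the access information of one layer during a Key search."""
        self.path.append(PathEntry(page_no, key, life))
        self.depth += 1

# Search path of the fig. 6 example: root node page, then its child.
stack = Stack()
stack.push("Page1", 4, (1, 2))
stack.push("Page6", 5, (1, 2))
```

After the two pushes, `stack.depth` is 2 and `stack.path` holds one entry per layer, so insertion or deletion can later backtrack the hit pages without searching again from the root.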
For example, taking the memory pages of the tree structure shown in fig. 6 as an example, assume that the value of the Key to be searched for is 5 and its version number is 1. A search is first performed in the root node page (i.e., Page1): a position offset may be selected by dichotomy in the sorting array associated with Page1, and Keys are compared according to the Key contained in the memory element pointed to by the position offset. The value of the Key contained in the finally hit memory element is 4; the value and life cycle of the Key in that memory element are recorded in Stack.path[0], and Stack.depth is incremented by one. The Stack information at this time is:
Stack.depth=1;
Stack.path[0]={Page1,Key{4,[1,2)}}
Further, a search may be performed in the next-layer node page (i.e., Page6) pointed to by the memory element in Page1 whose Key value is 4: a position offset may be selected by dichotomy in the sorting array associated with Page6, and Keys are compared according to the Key contained in the memory element pointed to by the position offset. The value of the Key in the finally hit memory element is 5; the value and life cycle of the Key in that memory element are recorded in Stack.path[1], and Stack.depth is incremented by one. Because the search result matches the Key to be searched for, the search ends, and the returned result is an exact match.
The Stack information at this time is:
Stack.depth=2;
Stack.path[0]={Page1,Key{4,[1,2)}}
Stack.path[1]={Page6,Key{5,[1,2)}}。
It should be noted that, in order to avoid excessively long search times and to reduce the complexity of implementing the search, a maximum value of the number of layers of searched node pages (that is, a maximum value of depth) may be preset when performing a Key search; when the number of layers of searched node pages exceeds this maximum value, the search is directly determined to have failed.
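As an illustration only (the class and variable names below are a hypothetical sketch, not part of the claimed implementation), the Stack-based path recording with a preset maximum depth might look as follows:

```python
# Sketch of recording a search path in a Stack structure: each path entry
# stores the hit page, the hit Key, and its life cycle [start, end);
# Stack.depth counts the valid entries, and a preset maximum number of
# searched layers aborts overly long searches (MAX_DEPTH is assumed).
MAX_DEPTH = 16  # assumed preset maximum number of searched layers

class SearchStack:
    def __init__(self):
        self.path = []   # one entry per layer: (page_id, key, lifecycle)
        self.depth = 0   # number of valid elements in the Stack

    def push(self, page_id, key, lifecycle):
        # exceeding the preset layer maximum means the search fails directly
        if self.depth >= MAX_DEPTH:
            raise LookupError("search failed: exceeded maximum layer count")
        self.path.append((page_id, key, lifecycle))
        self.depth += 1

# Replaying the example from the text: the search for Key 5 hits Key 4
# in Page1, then Key 5 in Page6, each with life cycle [1, 2).
stack = SearchStack()
stack.push("Page1", 4, (1, 2))
stack.push("Page6", 5, (1, 2))
print(stack.depth)    # 2
print(stack.path[0])  # ('Page1', 4, (1, 2))
```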
As can be seen from the above description, in the technical solution provided by the embodiment of the present invention, the data storage structure in the conventional MVBT implementation scheme is modified: an associated sorting array is added to each memory page of the tree structure, and each array element in the sorting array represents the position offset of a memory element in the memory page. The Keys recorded in the memory elements of the associated memory page can then be searched in order based on the sorting array, thereby improving data processing performance.
Referring to fig. 7, a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention is provided, where the apparatus may be applied to a network device in the foregoing method embodiment, and as shown in fig. 7, the data processing apparatus may include:
a creating unit 710 for creating a target multi-version B + tree MVBT data storage structure; the target MVBT data storage structure comprises memory pages with a tree structure and sorting arrays respectively associated with the memory pages, wherein each memory page comprises at least one memory element, a Key is recorded in each memory element, and each array element in the sorting arrays is used for representing the position offset of the memory element in the memory page;
a receiving unit 720, configured to receive a data processing request;
a processing unit 730, configured to respond to the received data processing request based on the target MVBT data storage structure.
In an optional embodiment, the processing unit 730 is specifically configured to, when the data processing request is an insertion request for the first Key, determine whether a next-layer node page exists in a root node page in the memory page of the tree structure;
if yes, determining a first target leaf node page inserted by the first Key by using a dichotomy according to the sequence from a root node page to a leaf node page and based on the sorting array; otherwise, determining the root node page as a first target leaf node page;
inserting a first memory element into the first target leaf node page, wherein the first memory element records the first Key.
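The leaf-location step described above can be sketched as follows; the dictionary-based page representation is a simplifying assumption, not the patent's data layout:

```python
# Sketch of locating the first target leaf node page for an insertion:
# if the root has a next-layer node page, descend level by level, at each
# level choosing by bisection the child covering the Key; otherwise the
# root node page itself is the first target leaf node page.
import bisect

def find_target_leaf(root, key):
    node = root
    while node["children"] is not None:  # a next-layer node page exists
        # rightmost separator key <= key, found by bisection
        i = max(bisect.bisect_right(node["keys"], key) - 1, 0)
        node = node["children"][i]
    return node

def insert_key(root, key):
    leaf = find_target_leaf(root, key)
    pos = bisect.bisect_left(leaf["keys"], key)
    leaf["keys"].insert(pos, key)        # first memory element records the Key

leaf1 = {"keys": [1, 2], "children": None}
leaf2 = {"keys": [4, 6], "children": None}
root = {"keys": [1, 4], "children": [leaf1, leaf2]}
insert_key(root, 5)
print(leaf2["keys"])                     # [4, 5, 6]
```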
In an optional embodiment, the processing unit 730 is specifically configured to determine whether the number of memory elements in the first target leaf node page is greater than a preset first number threshold;
the creating unit 710 is further configured to create a first leaf node page if the number of memory elements in the first target leaf node page is greater than the preset first number threshold;
the processing unit 730 is further specifically configured to insert the first memory element and the memory element of the live Key included in the first target leaf node page into the first leaf node page, and mark the live Key in the first target leaf node page as a dead Key;
the creating unit 710 is further configured to create a sorting array associated with the first leaf node page.
In an optional embodiment, the processing unit 730 is further configured to determine whether a sum of the number of memory elements of the live Key included in the first target leaf node page and the number of the first memory elements is greater than the preset first number threshold;
the creating unit 710 is further configured to create two first leaf node pages if a sum of the number of memory elements of the live Key included in the first target leaf node page and the number of the first memory elements is greater than the preset first number threshold;
the processing unit 730 is further configured to insert the first memory element and the memory element of the live Key included in the first target leaf node page into each first leaf node page in a split manner;
the creating unit 710 is further configured to create a first leaf node page if a sum of the number of the memory elements of the live Key included in the first target leaf node page and the number of the first memory elements is less than or equal to the preset first number threshold;
the processing unit 730 is further configured to insert the first memory element and the memory element of the live Key included in the first target leaf node page into the first leaf node page.
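The split rule above might be sketched as follows; the threshold value and the list-based page layout are illustrative assumptions:

```python
# Sketch of the version-split step: when the target leaf is full, its
# live Keys plus the new Key move to one new leaf page, or to two new
# pages if together they would again exceed the first number threshold;
# the Keys left behind are marked dead rather than removed.
FIRST_THRESHOLD = 4  # assumed preset first number threshold

def version_split(page, new_key):
    # page: list of [key, alive] pairs
    live = sorted([k for k, alive in page if alive] + [new_key])
    for entry in page:
        entry[1] = False                 # mark old live Keys as dead Keys
    if len(live) > FIRST_THRESHOLD:
        mid = len(live) // 2             # split into two first leaf node pages
        return [live[:mid], live[mid:]]
    return [live]                        # a single first leaf node page

old = [[1, True], [3, False], [5, True], [7, True], [9, True]]
print(version_split(old, 6))             # [[1, 5], [6, 7, 9]]
```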
In an optional embodiment, the processing unit 730 is further configured to determine whether the first target leaf node page is a root node page;
the processing unit 730, further configured to insert a second memory element in the parent node page of the first target leaf node page if the first target leaf node page is not the root node page, where the second memory element records a data pointer, the data pointer points to the first leaf node page, and updates the sorting array associated with the parent node page;
the creating unit 710 is further configured to create another node page if the first target leaf node page is a root node page and the number of the first leaf node pages is greater than 1;
the processing unit 730 is further configured to insert a second memory element into the created another node page, where the second memory element records a data pointer, where the data pointer points to the first leaf node page, and create a sorting array associated with the another node page.
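The root-growth case above can be illustrated with a minimal sketch; the names and dictionary structure are assumptions for illustration:

```python
# Sketch of growing the tree upward: when the split page was the root and
# produced more than one new leaf page, another node page is created whose
# memory elements record data pointers to the new leaf pages.
def grow_root(new_pages):
    if len(new_pages) > 1:
        # each memory element records a data pointer to one new leaf page
        return {"keys": [p["keys"][0] for p in new_pages],
                "children": new_pages}
    return new_pages[0]                  # a single page simply remains the root

pages = [{"keys": [1, 5], "children": None},
         {"keys": [6, 9], "children": None}]
root = grow_root(pages)
print(root["keys"])                      # [1, 6]
```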
In an optional embodiment, the processing unit 730 is further configured to determine whether a memory element with the same Key as the second memory element record exists in the memory elements included in the parent node page;
the creating unit 710 is further configured to create a same-layer node page that is the same layer as the parent node page if the node page exists;
the processing unit 730 is further configured to insert the memory element of the live Key included in the parent node page and the second memory element into the node page on the same layer; marking the live Key in the father node page as a dead Key;
the creating unit 710 is further configured to create a sorting array associated with the peer node page.
In an optional embodiment, the processing unit 730 is specifically configured to, when the data processing request is a deletion request for a second Key, select an array element in the sorting array by using a dichotomy according to the order from the root node page to the leaf node page in the tree structure, and compare the Key recorded in the memory element corresponding to the array element with the second Key, so as to determine a second target leaf node page; wherein the second target leaf node page includes a first target memory element, and the Key recorded by the first target memory element matches the second Key;
the processing unit 730 is further configured to mark the Key recorded by the first target memory element in the second target leaf node page as a dead Key.
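The mark-as-dead deletion described above might be sketched as follows, assuming a life cycle of the form (start, end) where end = None means the Key is still live:

```python
# Sketch of the deletion path: locate the matching Key in the target leaf
# by bisection, then mark it dead by closing its life cycle at the current
# version instead of physically removing the memory element.
import bisect

def delete_key(leaf, key, version):
    # leaf: list of [key, (start, end)] with end=None meaning live
    keys = [k for k, _ in leaf]
    i = bisect.bisect_left(keys, key)
    if i < len(leaf) and leaf[i][0] == key and leaf[i][1][1] is None:
        leaf[i][1] = (leaf[i][1][0], version)  # live Key becomes a dead Key
        return True
    return False                               # no matching live Key found

leaf = [[3, (1, None)], [5, (1, None)]]
print(delete_key(leaf, 5, version=2))          # True
print(leaf[1])                                 # [5, (1, 2)]
```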
In an optional embodiment, the processing unit 730 is further configured to, when the second target leaf node page is not a root node page, determine whether the number of memory elements of the live Key included in the second target leaf node page is smaller than a preset second number threshold;
the creating unit 710 is further configured to create a second leaf node page if the number of memory elements of the live Key included in the second target leaf node page is smaller than a preset second number threshold;
the processing unit 730 is further configured to insert the memory elements of the live Keys included in the second target leaf node page and in a sibling page of the second target leaf node page into the second leaf node page, and mark the live Keys in the second target leaf node page and the sibling page as dead Keys;
the creating unit 710 is further configured to create a sorting array associated with the second leaf node page.
In an optional embodiment, the processing unit 730 is further configured to determine whether a sum of numbers of memory elements of live keys included in the second target leaf node page and the sibling page is greater than a preset first number threshold;
the creating unit 710 is further configured to create two second leaf node pages if a sum of the numbers of memory elements of the live keys included in the second target leaf node page and the sibling page is greater than a preset first number threshold;
the processing unit 730 is further configured to insert a memory element of a live Key included in the second target leaf node page and the sibling page into each second leaf node page in a split manner;
the creating unit 710 is further configured to create a second leaf node page if a sum of the numbers of the memory elements of the live Key included in the second target leaf node page and the sibling page is less than or equal to a preset first number threshold;
the processing unit 730 is further configured to insert a memory element of a live Key included in the second target leaf node page and the sibling page into the second leaf node page.
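The underflow handling above can be sketched as follows; both threshold values and the page representation are assumptions for illustration:

```python
# Sketch of the underflow handling: when a leaf's live-Key count drops
# below the second number threshold, its live Keys and those of a sibling
# page merge into one new second leaf page, or split across two new pages
# if together they exceed the first number threshold.
FIRST_THRESHOLD, SECOND_THRESHOLD = 4, 2  # assumed preset thresholds

def merge_with_sibling(leaf_live, sibling_live):
    if len(leaf_live) >= SECOND_THRESHOLD:
        return None                     # enough live Keys; no merge needed
    live = sorted(leaf_live + sibling_live)
    if len(live) > FIRST_THRESHOLD:
        mid = len(live) // 2            # split into two second leaf pages
        return [live[:mid], live[mid:]]
    return [live]                       # one second leaf page

print(merge_with_sibling([3], [1, 6]))  # [[1, 3, 6]]
```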
In an optional embodiment, the processing unit 730 is further configured to insert a third memory element in the parent node page of the second target leaf node page, where the third memory element records a data pointer, where the data pointer points to the second leaf node page, and updates the sorting array associated with the parent node page.
In an optional embodiment, the processing unit 730 is further configured to determine whether a memory element with the same Key as the third memory element record exists in the memory elements included in the parent node page;
the creating unit 710 is further configured to create a same-layer node page that is the same layer as the parent node page if the node page exists;
the processing unit 730 is further configured to insert the memory element of the live Key included in the parent node page and the third memory element into the peer node page; marking the live Key in the father node page as a dead Key;
the creating unit 710 is further configured to create a sorting array associated with the peer node page.
In an optional embodiment, the processing unit 730 is further configured to, when the data processing request is a read request for a third Key and a third Value corresponding to the third Key, select an array element in the sorting array by using a dichotomy according to the order from the root node page to the leaf node page in the tree structure;
the processing unit 730 is further configured to compare a Key recorded in the memory element corresponding to the array element with the third Key, so as to determine a third target leaf node page; wherein, the third target leaf node page includes a second target memory element, and the Key recorded by the second target memory element is matched with the third Key;
the processing unit 730 is further configured to read the third Key and the third Value from a second target memory element included in the third target leaf node page.
In an optional embodiment, the processing unit 730 is further configured to record, according to an order from a root node page to a leaf node page in the tree structure, a search path for acquiring the third Key and a Value corresponding to the third Key;
the search path includes the number of the memory page hit in each layer, the Key information hit in the memory page, and the life cycle corresponding to the Key information hit in the memory page in the search process.
Fig. 8 is a schematic diagram of a hardware structure of a data processing apparatus according to an embodiment of the present invention. The data processing apparatus may include a processor 801, a machine-readable storage medium 802 storing machine-executable instructions. The processor 801 and the machine-readable storage medium 802 may communicate via a system bus 803. Also, the processor 801 may perform the data processing methods described above by reading and executing machine-executable instructions in the machine-readable storage medium 802 corresponding to the data processing logic.
The machine-readable storage medium 802 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
Embodiments of the present invention also provide a machine-readable storage medium, such as machine-readable storage medium 802 in fig. 8, comprising machine-executable instructions that are executable by processor 801 in a data processing apparatus to implement the data processing method described above.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
As can be seen from the above embodiments, by modifying the data storage structure in the conventional MVBT implementation scheme, an associated sorting array is added to the memory page of each tree structure, and each array element in the sorting array is used to represent a position offset of a memory element in the memory page, so that keys recorded in the memory elements in the memory page of the associated tree structure can be sequentially searched based on the sorting array, thereby improving data processing performance.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (16)

1. A data processing method, comprising:
creating a target multi-version B + tree MVBT data storage structure; the target MVBT data storage structure comprises memory pages with a tree structure and sorting arrays respectively associated with the memory pages, wherein each memory page comprises at least one memory element, a Key is recorded in each memory element, and each array element in the sorting arrays is used for representing the position offset of the memory element in the memory page; the sorting array respectively associated with each memory page is used for representing memory elements included in each memory page, and sorting results of sorting in each memory page are carried out based on a Key of each memory element;
and responding to the received data processing request based on the target MVBT data storage structure.
2. The method of claim 1, wherein responding to the received data processing request based on the MVBT data storage structure comprises:
when the data processing request is an insertion request aiming at a first Key, judging whether a root node page in the memory page of the tree structure has a next layer of node page;
if yes, determining a first target leaf node page inserted by the first Key by using a dichotomy according to the sequence from a root node page to a leaf node page and based on the sorting array; otherwise, determining the root node page as a first target leaf node page;
inserting a first memory element into the first target leaf node page, wherein the first memory element records the first Key.
3. The method of claim 2, wherein inserting a first memory element in the first target leaf node page comprises:
judging whether the number of the memory elements in the first target leaf node page is greater than a preset first number threshold value or not;
if the number of the memory elements in the first target leaf node page is larger than the preset first number threshold, creating a first leaf node page;
inserting the first memory element and a memory element of a live Key contained in the first target leaf node page into the first leaf node page;
marking the live Key in the first target leaf node page as a dead Key;
creating a sorting array associated with the first leaf node page.
4. The method of claim 3, wherein inserting the first memory element and the memory element of the live Key already contained in the first target leaf node page into the first leaf node page comprises:
judging whether the sum of the number of the memory elements of the live Key contained in the first target leaf node page and the number of the first memory elements is greater than the preset first number threshold;
if so, creating two first leaf node pages, and inserting the first memory element and the memory element of the live Key contained in the first target leaf node page into each first leaf node page in a split manner;
otherwise, a first leaf node page is created, and the first memory element and the memory element of the live Key included in the first target leaf node page are inserted into the first leaf node page.
5. The method of claim 3 or 4, wherein after creating the first leaf node page, further comprising:
judging whether the first target leaf node page is a root node page or not;
if the first target leaf node page is not the root node page, inserting a second memory element in the father node page of the first target leaf node page, recording a data pointer in the second memory element, pointing to the first leaf node page by the data pointer, and updating the sequencing array associated with the father node page;
if the first target leaf node page is a root node page and the number of the first leaf node pages is more than 1, another node page is created; and inserting a second memory element into the created other node page, recording a data pointer in the second memory element, pointing to the first leaf node page by the data pointer, and creating a sequencing array associated with the other node page.
6. The method of claim 5, wherein after inserting the second memory element in the parent node page of the first target leaf node page, further comprising:
judging whether a memory element with the same Key as the second memory element record exists in the memory elements included in the father node page;
if so, creating a same-layer node page which is at the same layer as the father node page, and inserting the memory element of the live Key contained in the father node page and the second memory element into the same-layer node page;
marking the live Key in the father node page as a dead Key;
and creating a sorting array associated with the same-layer node page.
7. The method of claim 1, wherein responding to the received data processing request based on the MVBT data storage structure comprises:
when the data processing request is a deletion request aiming at the second Key, selecting array elements in the sorting array by utilizing a dichotomy according to the sequence from the root node page to the leaf node page in the tree structure;
comparing the Key recorded in the memory element corresponding to the array element with the second Key to determine a second target leaf node page; wherein, the second target leaf node page includes a first target memory element, and the Key recorded by the first target memory element is matched with the second Key;
and marking the Key recorded by the first target memory element in the second target leaf node page as a dead Key.
8. The method of claim 7, wherein after labeling the Key of the target memory element record in the second target leaf node page as a dead Key, further comprising:
when the second target leaf node page is not the root node page, judging whether the number of the memory elements of the live Key contained in the second target leaf node page is smaller than a preset second number threshold;
if so, creating a second leaf node page, inserting the memory elements of the live Keys contained in the second target leaf node page and in a sibling page of the second target leaf node page into the second leaf node page, and marking the live Keys in the second target leaf node page and the sibling page as dead Keys;
and creating a sequencing array associated with the second leaf node page.
9. The method of claim 8, wherein the creating a second leaf node page, and inserting the memory element of the live Key contained in the second target leaf node page and a sibling of the second target leaf node page into the second leaf node page comprises:
judging whether the sum of the number of the memory elements of the live Key contained in the second target leaf node page and the sibling page is greater than a preset first number threshold;
if so, creating two second leaf node pages, and inserting the memory element of the live Key contained in the second target leaf node page and the sibling page into each second leaf node page in a split manner;
otherwise, a second leaf node page is created, and the memory element of the live Key contained in the second target leaf node page and the sibling page is inserted into the second leaf node page.
10. The method according to claim 8 or 9, wherein after creating the second leaf node page, further comprising:
and inserting a third memory element in the father node page of the second target leaf node page, wherein a data pointer is recorded in the third memory element, the data pointer points to the second leaf node page, and the sequencing array associated with the father node page is updated.
11. The method of claim 10, wherein after inserting a third memory element in the parent node page of the second target leaf node page, further comprising:
judging whether a memory element with the same Key as the third memory element record exists in the memory elements included in the father node page;
if the node page exists, a same-layer node page which is the same as the father node page in layer is created, and the memory element of the live Key contained in the father node page and the third memory element are inserted into the same-layer node page;
marking the live Key in the father node page as a dead Key;
and creating a sorting array associated with the same-layer node page.
12. The method of claim 1, wherein responding to the received data processing request based on the MVBT data storage structure comprises:
when the data processing request is a read request aiming at a third Key and a third Value corresponding to the third Key, selecting array elements in the sorting array by utilizing a dichotomy according to the sequence from a root node page to a leaf node page in the tree structure;
comparing the Key recorded in the memory element corresponding to the array element with the third Key to determine a third target leaf node page; wherein, the third target leaf node page includes a second target memory element, and the Key recorded by the second target memory element is matched with the third Key;
and reading the third Key and the third Value from a second target memory element included in the third target leaf node page.
13. The method of claim 12, further comprising:
recording and acquiring the third Key and a search path of Value corresponding to the third Key according to the sequence from a root node page to a leaf node page in the tree structure;
the search path includes the number of the memory page hit in each layer, the Key information hit in the memory page, and the life cycle corresponding to the Key information hit in the memory page in the search process.
14. A data processing apparatus, comprising:
a creation unit configured to create a target multi-version B + tree MVBT data storage structure; the target MVBT data storage structure comprises memory pages with a tree structure and sorting arrays respectively associated with the memory pages, wherein each memory page comprises at least one memory element, a Key is recorded in each memory element, and each array element in the sorting arrays is used for representing the position offset of the memory element in the memory page; the sorting array respectively associated with each memory page is used for representing memory elements included in each memory page, and sorting results of sorting in each memory page are carried out based on a Key of each memory element;
a receiving unit, configured to receive a data processing request;
and the processing unit is used for responding to the received data processing request based on the target MVBT data storage structure.
15. The apparatus of claim 14,
the processing unit is specifically configured to, when the data processing request is an insertion request for a first Key, determine whether a next-layer node page exists in a root node page in the memory page of the tree structure;
if yes, determining a first target leaf node page inserted by the first Key by using a dichotomy according to the sequence from a root node page to a leaf node page and based on the sorting array; otherwise, determining the root node page as a first target leaf node page;
inserting a first memory element into the first target leaf node page, wherein the first memory element records the first Key.
16. The apparatus of claim 15,
the processing unit is specifically configured to determine whether the number of memory elements in the first target leaf node page is greater than a preset first number threshold;
the creating unit is further configured to create a first leaf node page if the number of memory elements in the first target leaf node page is greater than the preset first number threshold;
the processing unit is further specifically configured to insert the first memory element and a memory element of a live Key included in the first target leaf node page into the first leaf node page, and mark the live Key in the first target leaf node page as a dead Key;
the creating unit is further configured to create a sorting array associated with the first leaf node page.
CN201810805648.4A 2018-07-20 2018-07-20 Data processing method and device Active CN108920708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810805648.4A CN108920708B (en) 2018-07-20 2018-07-20 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108920708A CN108920708A (en) 2018-11-30
CN108920708B true CN108920708B (en) 2021-04-27

Family

ID=64415574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810805648.4A Active CN108920708B (en) 2018-07-20 2018-07-20 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108920708B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990377B (en) * 2019-11-21 2023-08-22 上海达梦数据库有限公司 Data loading method, device, server and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609490A (en) * 2012-01-20 2012-07-25 东华大学 Column-storage-oriented B+ tree index method for DWMS (data warehouse management system)
CN103116641A (en) * 2013-02-21 2013-05-22 新浪网技术(中国)有限公司 Acquisition method of ordering statistical data and ordering device
CN104252528A (en) * 2014-09-04 2014-12-31 国家电网公司 Big data secondary index establishing method based on identifier space mapping
CN105095197A (en) * 2014-04-16 2015-11-25 华为技术有限公司 Method and device for processing data
CN105930280A (en) * 2016-05-27 2016-09-07 诸葛晴凤 Efficient page organization and management method facing NVM (Non-Volatile Memory)
CN105988876A (en) * 2015-03-27 2016-10-05 杭州迪普科技有限公司 Memory allocation method and apparatus
CN106462592A (en) * 2014-03-28 2017-02-22 华为技术有限公司 Systems and methods to optimize multi-version support in indexes
CN106775435A (en) * 2015-11-24 2017-05-31 腾讯科技(深圳)有限公司 Data processing method, device and system in a kind of storage system
CN107038206A (en) * 2017-01-17 2017-08-11 阿里巴巴集团控股有限公司 The method for building up of LSM trees, the method for reading data and server of LSM trees

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107612765B (en) * 2016-07-12 2020-12-25 华为技术有限公司 Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant