CN117573676A - Address processing method and device based on storage system, storage system and medium - Google Patents

Publication number
CN117573676A
Authority
CN
China
Prior art keywords
key value
tree
address
value pair
logical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311611112.6A
Other languages
Chinese (zh)
Inventor
詹宇斌
施培任
刚亚州
仇锋利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/2246: Trees, e.g. B+trees
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10: Address translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an address processing method and device based on a storage system, a storage system, and a medium, and relates to the field of storage. The method comprises the following steps: acquiring all logical addresses contained in a logical address space in a storage system; generating a B+ tree by using all the logical addresses, wherein all leaf nodes of the B+ tree store, in logical address order, the key value pairs corresponding to the logical addresses, each leaf node stores the same number of key value pairs, and the key value pairs take the logical addresses as keys; and, when a key value pair update request is received, replacing the key value pair with the same key in the B+ tree with a new key value pair containing a mapping relation between a logical address and a physical address. Because the B+ tree is generated from all the logical addresses, the key value pairs that store the mapping relation between each logical address and its physical address are kept in the B+ tree in logical address order, and updating a mapping relation only requires replacing a key value pair, which reduces the complexity of managing the address mapping relation with a tree structure.

Description

Address processing method and device based on storage system, storage system and medium
Technical Field
The present invention relates to the field of storage, and in particular to an address processing method and device based on a storage system, a storage system, and a medium.
Background
A storage system generally has both logical addresses and physical addresses. A logical address is the address of data in the application layer or in a logical volume, and a physical address is the address at which the data is actually written to a storage device after passing through the logical layer. A mapping relationship is generally maintained between logical addresses and physical addresses.
In the related art, a tree structure is commonly introduced to manage the mapping relationship between logical addresses and physical addresses. However, because this mapping relationship changes dynamically all the time, the existing tree structure tends to change frequently, which increases the complexity of updating and querying the tree structure and places an additional burden on the processor.
Disclosure of Invention
The invention aims to provide an address processing method, an address processing device, a storage system and a medium based on a storage system, which can reduce the complexity of managing an address mapping relation by utilizing a tree structure.
In order to solve the above technical problems, the present invention provides an address processing method based on a storage system, including:
Acquiring all logical addresses contained in a logical address space in a storage system;
generating a B+ tree by using all the logical addresses; all leaf nodes of the B+ tree sequentially store key value pairs corresponding to the logical addresses according to the logical addresses, and each leaf node stores the same number of key value pairs, wherein the key value pairs are keyed by the logical addresses;
when a key value pair update request is received, the key value pair with the same key in the B+ tree is replaced with a new key value pair containing the mapping relation between the logical address and the physical address.
Optionally, said generating a B+ tree using all of said logical addresses includes:
creating an initial key value pair for each of the logical addresses; the initial key value pair takes the logical address as a key and takes a null value as a value;
the B+ tree is generated using the initial key value pairs of all the logical addresses.
Optionally, the method further comprises:
and when a key value pair deleting request is received, clearing the value of the key value pair to be deleted, which corresponds to the key value pair deleting request, in the B+ tree.
Optionally, before acquiring all logical addresses contained in the logical address space in the storage system, the method further includes:
dividing the logical address space in the storage system and, for each logical address space obtained by the division, proceeding to the step of acquiring all logical addresses contained in the logical address space in the storage system.
Optionally, after generating the B+ tree using all the logical addresses, the method further includes:
loading the B+ tree from a storage device of the storage system into a memory device of the storage system;
ordering the tree nodes according to the positions of the tree nodes in the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes according to the arrangement sequence of the tree nodes;
the replacing the key value pair with the same key in the B+ tree with the new key value pair containing the mapping relation between the logical address and the physical address comprises the following steps:
searching for the memory address of the leaf node to be updated, to which the new key value pair belongs, according to the key of the new key value pair, the logical address range recorded by each leaf node and the arrangement sequence of the leaf nodes;
and replacing, according to the memory address of the leaf node to be updated, the key value pair with the same key in the leaf node to be updated with the new key value pair.
Optionally, the sequentially recording, according to the arrangement sequence of each tree node, the memory address corresponding to each tree node includes:
generating corresponding arrays for each layer of the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes of each layer into the corresponding arrays of each layer according to the arrangement sequence of the tree nodes;
the searching for the memory address of the leaf node to be updated, to which the new key value pair belongs, according to the key of the new key value pair, the logical address range recorded by each leaf node and the arrangement sequence of the leaf nodes comprises the following steps:
determining the sequence number, in the array, of the leaf node to be updated to which the new key value pair belongs, according to the key of the new key value pair and the logical address range recorded by each leaf node;
and acquiring the memory address of the leaf node to be updated from the array according to the sequence number.
Optionally, the method further comprises:
when a key value pair query request is received, determining the sequence number, in the array, of the leaf node to be queried to which the key value pair to be queried belongs, according to the logical address of the key value pair to be queried and the logical address range recorded by each leaf node;
and acquiring the memory address of the leaf node to be queried from the array according to the sequence number, and acquiring the key value pair to be queried from the leaf node to be queried according to the memory address.
The invention also provides an address processing device based on the storage system, which comprises:
the acquisition module is used for acquiring all logical addresses contained in the logical address space in the storage system;
the generation module is used for generating a B+ tree by utilizing all the logical addresses; all leaf nodes of the B+ tree sequentially store key value pairs corresponding to the logical addresses according to the logical addresses, and each leaf node stores the same number of key value pairs, wherein the key value pairs are keyed by the logical addresses;
and the updating module is used for replacing the key value pair with the same key in the B+ tree with a new key value pair containing the mapping relation between the logical address and the physical address when a key value pair update request is received.
The invention also provides a storage system comprising:
a first memory for storing user data;
a second memory for storing a computer program;
and a processor for implementing the address processing method based on the storage system when executing the computer program.
The present invention also provides a computer readable storage medium having stored therein computer executable instructions which, when loaded and executed by a processor, implement the storage system based address processing method as described above.
The invention provides an address processing method, which comprises the following steps: acquiring all logical addresses contained in a logical address space in a storage system; generating a B+ tree by using all the logical addresses, wherein all leaf nodes of the B+ tree store, in logical address order, the key value pairs corresponding to the logical addresses, each leaf node stores the same number of key value pairs, and the key value pairs are keyed by the logical addresses; and, when a key value pair update request is received, replacing the key value pair with the same key in the B+ tree with a new key value pair containing the mapping relation between the logical address and the physical address.
Therefore, the invention first acquires all logical addresses in the logical address space in the storage system and generates a B+ tree by using all of them, wherein all leaf nodes of the B+ tree store, in logical address order, the key value pairs corresponding to the logical addresses, every leaf node stores the same number of key value pairs, and the key value pairs use the logical addresses as keys. The B+ tree in the present invention thus covers all logical addresses in the logical address space, and the key value pair of each logical address is stored in the B+ tree in a regular layout. Consequently, when a key value pair update request is received, the invention only needs to replace the key value pair with the same key in the B+ tree with the new key value pair containing the mapping relation between the logical address and the physical address, and the structure of the B+ tree does not need to change. This avoids the frequent tree structure changes caused by the dynamically changing mapping relation between logical and physical addresses, and effectively reduces the complexity of managing that mapping relation with a tree structure. The invention also provides an address processing device based on the storage system, a storage system and a computer readable storage medium, which have the same beneficial effects.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an address processing method based on a storage system according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a B+ tree according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a B+ tree recorded using arrays according to an embodiment of the present invention;
FIG. 4 is a block diagram of an address processing device based on a storage system according to an embodiment of the present invention;
FIG. 5 is a block diagram of a storage system according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A storage system generally has both logical addresses and physical addresses. A logical address is the address of data in the application layer or in a logical volume, and a physical address is the address at which the data is actually written to a storage device after passing through the logical layer. A mapping relationship is generally maintained between logical addresses and physical addresses. In the related art, a tree structure is commonly introduced to manage this mapping relationship. However, because the mapping relationship changes dynamically all the time, the existing tree structure tends to change frequently, which increases the complexity of updating and querying the tree structure and places an additional burden on the processor. In view of this, the present invention provides an address processing method based on a storage system, which generates a B+ tree based on all logical addresses in a logical address space, so that the key value pairs storing the mapping relationships between logical addresses and physical addresses are kept in the B+ tree in logical address order. When a mapping relationship is updated, only the corresponding key value pair needs to be replaced and the tree structure does not need to be adjusted, which reduces the complexity of managing the address mapping relationship with a tree structure.
It should be noted that the embodiment of the present invention is not limited to a specific storage system, which may be selected according to practical application requirements, for example a SAN (Storage Area Network) storage system. It will be appreciated that a storage system typically includes a plurality of storage devices, which usually provide storage services externally in the form of an array. The embodiment of the invention does not limit the type of the storage devices in the storage system (such as solid state disks or mechanical hard disks), their number, or the form of the array they constitute (such as RAID 1, RAID 5 or RAID 6, where RAID stands for Redundant Array of Independent Disks); these can be set according to actual application requirements.
Referring to fig. 1, fig. 1 is a flowchart of a method for processing addresses based on a storage system according to an embodiment of the present invention, where the method may include:
S101, acquiring all logical addresses contained in a logical address space in a storage system.
It should be noted that, in the storage system, logical addresses (LBA, Logical Block Address) are located in a logical address space and physical addresses (PBA, Physical Block Address) are located in a physical address space, and the two address spaces correspond to each other through the mapping relationship between logical addresses and physical addresses. In the actual storage process, from the user's point of view, data is stored in the logical address space and accessed through a logical address; from the storage device's point of view, data is stored in the physical address space after passing through the logical layer and is accessed through a physical address. Because a mapping relationship exists between the logical address and the physical address, when a user accesses the logical address space by using a logical address, the corresponding physical address can be determined according to the mapping relationship, and the corresponding user data can then be acquired from the storage device according to that physical address. The mapping relationship between logical and physical addresses is therefore part of the metadata required to support the logical address space. In general, the mapping relationship can be stored in the form of key value pairs (KV pairs), with the logical address as the key (Key) and the physical address as the value (Value), and the key value pairs are stored in a tree structure; in addition, to facilitate the lookup of key value pairs, when key value pairs are inserted into the tree structure, the storage location of each key value pair in the tree structure may be adjusted according to its key.
However, in the related art, only when it is determined that a mapping relationship is set between a logical address and a physical address, a corresponding key value pair is inserted into the tree structure; moreover, since the mapping relationship changes frequently, the tree structure may need to adjust the storage position of each key value pair in the tree structure at any time, so that larger structure adjustment may be brought to the tree structure, more inquiry and writing operations are introduced for metadata management, and efficient metadata management is not facilitated. Therefore, the invention can firstly acquire all the logic addresses contained in the logic address space in the storage system, and generates a tree structure based on all the logic addresses before using the logic address space, so that the tree structure sequentially stores key value pairs corresponding to the logic addresses according to the logic addresses. Therefore, the tree structure can be ensured to completely cover all logical addresses in the logical address space, key value pairs corresponding to the logical addresses can be conveniently searched in the tree structure, and when the mapping relation between the logical addresses and the physical addresses is created later, only the key value pairs in the tree structure provided by the embodiment are required to be updated and replaced, and the branch structure of the tree structure is not required to be adjusted.
It should be noted that the embodiment of the present invention is not limited to a specific tree structure, which may be chosen according to practical application requirements, for example a binary tree or a multi-way tree (such as a B+ tree). The embodiment of the invention also does not limit the specific size of the logical address space, which can be set according to actual application requirements. It should be noted here that the size (e.g., depth) of the tree structure grows as the logical address space grows, and a larger tree structure makes it harder to look up key value pairs quickly. In order to keep the size of the tree structure under control, the embodiment of the invention can divide a large logical address space and create a corresponding smaller tree structure for each logical address space obtained by the division.
Based on this, before acquiring all logical addresses contained in the logical address space in the storage system, it may further include:
step 11: dividing the logical address space in the storage system and, for each logical address space obtained by the division, proceeding to the step of acquiring all logical addresses contained in the logical address space in the storage system.
S102, generating a B+ tree by utilizing all the logical addresses; and all leaf nodes of the B+ tree sequentially store key value pairs corresponding to the logical addresses according to the logical addresses, wherein the leaf nodes store the same number of key value pairs, and the key value pairs are keyed by the logical addresses.
In particular, embodiments of the present invention store key value pairs in a B+ tree, a self-balancing search tree that is widely used in databases and file systems. A B+ tree stores a certain number of keys on each internal node, and these keys divide the node's key range among a plurality of subtrees. Each key value pair is stored only once, in a leaf node, and all leaf nodes are at the same level. These characteristics allow the B+ tree to maintain efficient insert, delete and search operations. In the related art, if the mapping relationship between logical addresses and physical addresses is managed with a B+ tree in the existing manner (i.e., a key value pair is inserted into the tree structure only once the mapping relationship between its logical address and a physical address has been determined), the structure of the same B+ tree keeps changing with the order and number of the key value pairs inserted. This easily leads to two problems: (1) low node utilization, with free space wasted inside the nodes; for example, when a key value pair is inserted into a node that would exceed its maximum capacity, the node is split into two nodes, and at that point only about half of the space of each node is in use while the other half is wasted; (2) more frequent cache updates, because the B+ tree structure changes continuously and the cache structure must be adjusted along with the structure on disk in order to keep the cache consistent with the data on disk.
For this reason, embodiments of the present invention generate a B+ tree based on all logical addresses in a logical address space before that space is used, so that all leaf nodes of the B+ tree store, in logical address order, the key value pairs corresponding to the logical addresses and each leaf node stores the same number of key value pairs. The embodiment of the invention thus ensures that the B+ tree completely covers all logical addresses in the logical address space from the beginning, so that when a mapping relationship between a logical address and a physical address is created later, only the corresponding key value pair in the B+ tree needs to be replaced and the branch structure of the B+ tree does not need to be adjusted. Moreover, because each leaf node stores the same number of key value pairs, the size of each leaf node can be allocated in advance according to that number so that every leaf node is completely filled, which improves node utilization and reduces wasted storage resources.
It should be noted that when the B+ tree is created, no physical address has yet been allocated to any logical address, so the value (Value) of every key value pair in the B+ tree is an initial value. The embodiment of the invention is not limited to a specific initial value, which can be set according to actual application requirements, as long as the storage system can determine from the initial value that no physical address has been allocated to the corresponding logical address. In one possible case, the initial value may be a null value (null).
Based on this, the generating a B+ tree using all the logical addresses may include:
step 21: creating an initial key value pair for each of the logical addresses; the initial key value pair takes the logical address as a key and takes a null value as a value;
step 22: the B+ tree is generated using the initial key value pairs of all the logical addresses.
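Steps 21 and 22 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `build_leaves`, the parameters `first_lba`, `lba_count` and `leaf_capacity`, and the use of `None` as the null value are all assumptions made for illustration, and only the leaf layer is built here.

```python
from typing import List, Optional, Tuple

# (logical address, physical address or None when nothing is mapped yet)
KVPair = Tuple[int, Optional[int]]


def build_leaves(first_lba: int, lba_count: int, leaf_capacity: int) -> List[List[KVPair]]:
    """Create one initial pair per logical address and pack them into equal-sized leaves."""
    if leaf_capacity <= 0 or lba_count % leaf_capacity != 0:
        # Assumption: the address range is sized so that every leaf is completely full.
        raise ValueError("lba_count must be a positive multiple of leaf_capacity")
    # Step 21: one initial key value pair per logical address, with a null value.
    pairs = [(lba, None) for lba in range(first_lba, first_lba + lba_count)]
    # Step 22 (leaf layer only): fill leaves from left to right in ascending key order.
    return [pairs[i:i + leaf_capacity] for i in range(0, lba_count, leaf_capacity)]


leaves = build_leaves(first_lba=0, lba_count=12, leaf_capacity=4)
```

In this sketch every leaf holds exactly `leaf_capacity` pairs, which mirrors the requirement that each leaf node stores the same number of key value pairs.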
Further, since the size of the logical address space directly affects the size of the B+ tree, the embodiment of the present invention may divide the logical address space according to a specified B+ tree size in order to avoid an oversized B+ tree. Specifically, the logical address range that one B+ tree can cover may be determined according to the preset maximum depth of the B+ tree and the preset maximum number of key value pairs that a leaf node of the B+ tree can store, and the logical address space in the storage system is then divided based on this logical address range, so that the size of each divided logical address space suits the specified B+ tree size.
Based on this, the partitioning of the logical address space in the storage system may include:
step 31: determining a logic address range which can be covered by the B+ tree according to the preset maximum depth of the B+ tree and the preset maximum key value pair number which can be saved by leaf nodes in the B+ tree;
Step 32: and dividing a logic address space in the storage system according to the logic address range.
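Steps 31 and 32 amount to a small calculation, sketched below in Python. The text names only the maximum depth and the maximum number of key value pairs per leaf; the fan-out of the internal nodes (`fanout`) is an additional assumption introduced here, as are the function names and the convention that a depth of 1 means a tree consisting of a single leaf.

```python
from typing import List, Tuple


def tree_coverage(max_depth: int, leaf_capacity: int, fanout: int) -> int:
    """Number of logical addresses one completely filled tree can cover.

    Assumption: max_depth counts every layer including the leaf layer,
    and every internal node has exactly `fanout` children.
    """
    if max_depth < 1 or leaf_capacity < 1 or fanout < 1:
        raise ValueError("all parameters must be positive")
    return leaf_capacity * fanout ** (max_depth - 1)


def partition_address_space(total_lbas: int, coverage: int) -> List[Tuple[int, int]]:
    """Split [0, total_lbas) into half-open ranges of at most `coverage` addresses."""
    return [(start, min(start + coverage, total_lbas))
            for start in range(0, total_lbas, coverage)]


coverage = tree_coverage(max_depth=3, leaf_capacity=4, fanout=4)
parts = partition_address_space(total_lbas=200, coverage=coverage)
```

With these example numbers one tree covers 64 addresses, so a space of 200 addresses is split into four ranges, the last of which is shorter than the others. How a real system would handle such a remainder is not stated in the text.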
For ease of understanding, please refer to FIG. 2, which is a schematic diagram of a B+ tree structure according to an embodiment of the present invention. When the tree structure is constructed, the embodiment of the invention can adopt a balancing strategy to ensure that every tree contains the same number of keys and that the keys of each tree are distributed within a fixed range. Further, as shown in FIG. 2, each leaf node contains all the keys within its own fixed range, and the keys recorded by each leaf node are consecutive integers within that range. Since both the number and the distribution of the key value pairs are known, the structure of the B+ tree can be laid out in advance according to the B+ tree parameter configuration; for example, the leaf nodes are filled from left to right, each with the maximum number of key value pairs, in ascending key order. After the leaf nodes are determined, the key value pair distribution of each layer of tree nodes can be determined layer by layer from bottom to top until the root node is reached. It should be noted that the key value pairs in tree nodes other than leaf nodes are used to record pointers to the nodes in the layer below. After the construction of the B+ tree is completed, it can be used as a base tree, and when a key value pair is updated, the replacement is carried out directly within this base tree structure. Regardless of how many key value pairs are updated and in what order, the tree structure on disk remains unchanged, and so do the relative positions of the key value pairs. This invariance of structure and of relative position makes it possible to design an efficient cache structure.
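The bottom-up layout described above can be sketched in Python as follows. This is an illustrative assumption rather than the layout of FIG. 2: each internal entry is modelled as a pair of the smallest key under a child and that child's position in the layer below, and the names `build_internal_layers` and `fanout` are hypothetical.

```python
from typing import List, Tuple

# (smallest key under the child, index of the child in the layer below)
Entry = Tuple[int, int]


def build_internal_layers(first_lba: int, leaf_count: int,
                          leaf_capacity: int, fanout: int) -> List[List[List[Entry]]]:
    """Return the internal layers, from the one just above the leaves up to the root."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    # Describe every leaf by its smallest key and its left-to-right position.
    below: List[Entry] = [(first_lba + i * leaf_capacity, i) for i in range(leaf_count)]
    layers: List[List[List[Entry]]] = []
    while len(below) > 1:
        # Group consecutive children into the internal nodes of the next layer up.
        nodes = [below[i:i + fanout] for i in range(0, len(below), fanout)]
        layers.append(nodes)
        # Each new node is in turn described by the smallest key it covers.
        below = [(node[0][0], idx) for idx, node in enumerate(nodes)]
    return layers


layers = build_internal_layers(first_lba=0, leaf_count=16, leaf_capacity=4, fanout=4)
root = layers[-1][0]
```

Because the leaf contents are known in advance, the whole layout is a pure function of the parameters, which is the property that keeps the tree shape independent of later updates.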
S103, when a key value pair update request is received, replacing the key value pair with the same key in the B+ tree with a new key value pair containing the mapping relation between the logical address and the physical address.
As described above, when a key value pair update request is received, it is only necessary to replace the key value pair with the same key in the B+ tree with the new key value pair containing the mapping relationship between the logical address and the physical address. Similarly, when a key value pair deletion request is received, which indicates that the mapping relationship between a logical address and a physical address is to be cleared, the embodiment of the present invention may clear the value of the corresponding key value pair in the B+ tree instead of removing the pair. This restores the key value pair to its initial state and keeps the relative positions of the key value pairs unchanged.
Based on this, the method may further include:
step 41: and when a key value pair deleting request is received, clearing the value of the key value pair to be deleted, which corresponds to the key value pair deleting request, in the B+ tree.
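The update of S103 and the deletion of step 41 can be sketched together in Python. The class `BaseTree` and its methods are hypothetical names, values are stored positionally so that each slot stands for one key value pair, and the leaf is located with a binary search over the first key of each leaf, which is only one possible way to use the logical address range recorded by each leaf node.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


class BaseTree:
    """Leaf layer of a pre-built tree; slot i of a leaf holds the value for key leaf_start + i."""

    def __init__(self, first_lba: int, lba_count: int, leaf_capacity: int) -> None:
        if leaf_capacity <= 0 or lba_count % leaf_capacity != 0:
            raise ValueError("lba_count must be a multiple of leaf_capacity")
        self.leaf_capacity = leaf_capacity
        # First key of each leaf, i.e. the start of the range that leaf records.
        self.leaf_starts = list(range(first_lba, first_lba + lba_count, leaf_capacity))
        self.leaves: List[List[Optional[int]]] = [[None] * leaf_capacity
                                                  for _ in self.leaf_starts]

    def _locate(self, lba: int) -> Tuple[int, int]:
        leaf = bisect_right(self.leaf_starts, lba) - 1
        if leaf < 0 or lba - self.leaf_starts[leaf] >= self.leaf_capacity:
            raise KeyError(lba)
        return leaf, lba - self.leaf_starts[leaf]

    def update(self, lba: int, pba: int) -> None:
        leaf, slot = self._locate(lba)
        self.leaves[leaf][slot] = pba   # replace the value in place; no split, no merge

    def delete(self, lba: int) -> None:
        leaf, slot = self._locate(lba)
        self.leaves[leaf][slot] = None  # clear the value; the pair itself stays in the leaf

    def lookup(self, lba: int) -> Optional[int]:
        leaf, slot = self._locate(lba)
        return self.leaves[leaf][slot]


tree = BaseTree(first_lba=0, lba_count=8, leaf_capacity=4)
tree.update(5, 9000)
mapped = tree.lookup(5)
tree.delete(5)
```

Neither operation changes the number of leaves or the number of slots per leaf, which is the structural invariance the text relies on.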
Based on the above embodiment, the present invention first acquires all logical addresses in a logical address space in a storage system and generates a B+ tree by using all of them, where all leaf nodes of the B+ tree store, in logical address order, the key value pairs corresponding to the logical addresses, each leaf node stores the same number of key value pairs, and the key value pairs use the logical addresses as keys. The B+ tree in the present invention thus covers all logical addresses in the logical address space, and the key value pair of each logical address is stored in the B+ tree in a regular layout. Consequently, when a key value pair update request is received, the invention only needs to replace the key value pair with the same key in the B+ tree with the new key value pair containing the mapping relation between the logical address and the physical address, and the structure of the B+ tree does not need to change. This avoids the frequent tree structure changes caused by the dynamically changing mapping relation between logical and physical addresses, and effectively reduces the complexity of managing that mapping relation with a tree structure.
Based on the above embodiment, because the B+ tree is specially configured so that the distribution of key-value pairs among its leaf nodes is known and regular, the embodiment of the present invention can use this property to further improve the efficiency of looking up key-value pairs. Accordingly, after generating the B+ tree using all the logical addresses, the method may further include:
S201, loading the B+ tree from the storage device of the storage system to the memory device of the storage system.
S202, sorting the tree nodes according to the positions of the tree nodes in the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes according to the arrangement sequence of the tree nodes.
It should be noted that, to make it easier to retrieve user data, a tree structure covering a logical address space is generally loaded from the storage device of the storage system into the memory device, so that the location of user data in the storage device can be determined directly in memory. In the related art, however, the tree structure is usually held in a hash table, and a query must descend from the top of the tree layer by layer. Taking a B+ tree as an example, because the distribution of key-value pairs in a conventional B+ tree is not fixed, the leaf node holding a key-value pair has to be located from the root downward. Each query therefore performs a number of lookup operations equal to the depth of the B+ tree, which lowers query efficiency.
In the embodiment of the present invention, the key-value pairs are arranged in order in the B+ tree and their positions never change. It is therefore enough to record the memory addresses of the tree nodes in the order in which the nodes are arranged. When a key-value pair is queried, the leaf node holding it can be determined with a single lookup based on the key of the pair, the order of the leaf nodes, and the logical address range recorded by each leaf node, which significantly improves query efficiency.
Correspondingly, the replacing the new key value pair containing the mapping relation between the logical address and the physical address with the key value pair with the same key in the B+ tree comprises the following steps:
S203, searching for the memory address of the leaf node to be updated, to which the new key-value pair belongs, according to the key of the new key-value pair, the logical address range recorded by each leaf node, and the arrangement order of the leaf nodes.
It should be noted that the leaf node to be updated is the leaf node to which the new key-value pair belongs. It should be further noted that the embodiment of the present invention does not limit the form in which the memory address of each tree node is recorded; for example, the memory addresses may be recorded in a list or in an array. To facilitate index-based lookup, the embodiment of the present invention may record the memory address of each tree node in an array. Specifically, a corresponding array may be generated for each layer of the B+ tree, and the memory addresses of the tree nodes of each layer may be recorded in that layer's array in the order in which the tree nodes are arranged. In a subsequent query, the sequence number, within the array, of the leaf node to be updated is determined from the key of the new key-value pair and the logical address range recorded by each leaf node, and the memory address of the leaf node to be updated is then obtained from the array by that sequence number.
Based on this, the sequentially recording, according to the arrangement order of the tree nodes, the memory address corresponding to each tree node may include:
Step 51: generating a corresponding array for each layer of the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes of each layer into the corresponding array of that layer according to the arrangement order of the tree nodes;
the searching for the memory address of the leaf node to be updated, to which the new key-value pair belongs, according to the key of the new key-value pair, the logical address range recorded by each leaf node, and the arrangement order of the leaf nodes may include:
Step 52: determining the sequence number, in the array, of the leaf node to be updated to which the new key-value pair belongs, according to the key of the new key-value pair and the logical address range recorded by each leaf node;
Step 53: acquiring the memory address of the leaf node to be updated from the array according to the sequence number.
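Steps 51 to 53 can be sketched as follows. This is an illustrative Python model under stated assumptions, not the implementation of the embodiment: Python object references stand in for memory addresses, the logical addresses are contiguous integers starting at `base`, every internal node has the same assumed `fanout`, and all names (`Node`, `build_level_arrays`, `leaf_sequence_number`, `find_leaf`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    first_key: int  # lowest logical address covered by this node
    last_key: int   # highest logical address covered by this node
    values: List[Optional[int]] = field(default_factory=list)  # leaves only


def build_level_arrays(base: int, num_keys: int,
                       leaf_capacity: int, fanout: int) -> List[List[Node]]:
    """Step 51: one array per layer, nodes recorded in left-to-right order."""
    if num_keys <= 0 or num_keys % leaf_capacity != 0:
        raise ValueError("sketch assumes a positive multiple of leaf_capacity")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    leaves = [
        Node(start, start + leaf_capacity - 1, [None] * leaf_capacity)
        for start in range(base, base + num_keys, leaf_capacity)
    ]
    levels = [leaves]
    while len(levels[-1]) > 1:  # group nodes until a single root remains
        below = levels[-1]
        levels.append([
            Node(below[i].first_key,
                 below[min(i + fanout, len(below)) - 1].last_key)
            for i in range(0, len(below), fanout)
        ])
    levels.reverse()  # levels[0] is the root layer, levels[-1] the leaf layer
    return levels


def leaf_sequence_number(key: int, base: int, leaf_capacity: int) -> int:
    """Step 52: with equal-sized leaves the sequence number is one division."""
    return (key - base) // leaf_capacity


def find_leaf(levels: List[List[Node]], key: int,
              base: int, leaf_capacity: int) -> Node:
    """Step 53: fetch the leaf reference from the leaf-layer array."""
    return levels[-1][leaf_sequence_number(key, base, leaf_capacity)]
```

For example, with 16 addresses, 4 pairs per leaf, and a fanout of 2, the layers hold 1, 2, and 4 nodes, and key 9 resolves to the leaf with sequence number 2, which covers addresses 8 to 11.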
For ease of understanding, please refer to FIG. 3, which is a schematic diagram of a B+ tree recorded by arrays according to an embodiment of the present invention. Because key-value pairs are stored according to a fixed rule, given any key, the embodiment of the present invention can calculate which node of the leaf layer holds the value corresponding to the key and the position of the value within that node. It can likewise calculate the position of the parent node of that leaf node, the position of each ancestor node within its layer, and the corresponding position within each of those nodes. For caching, the base tree structure can be mapped by constructing a cache array: the B+ tree nodes of the on-disk base tree are mapped into the cache array according to the rule, which enables fast lookup and positioning by array index. As shown in FIG. 3, first, each layer of the on-disk base tree has a corresponding array in the cache array, which stores the in-memory pointers of the tree nodes of that layer. Second, the size of each layer's array equals the number of nodes that the layer contains in the base tree. Thus, from the key-value pair to be queried and the storage rule, both the leaf-layer node holding the key-value pair and its storage position within that node are known, so the required value can be quickly queried, inserted, or deleted. The prior art must search downward from the root node level by level and, within each level, locate the entry by binary search or hash-table lookup; by comparison, the number of operations is greatly reduced and query efficiency is significantly improved.
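The position arithmetic mentioned above can be sketched in Python. The sketch assumes contiguous integer logical addresses starting at `base` and a uniform internal-node `fanout`, neither of which the text specifies, and `node_path` is a hypothetical name.

```python
from typing import List, Tuple


def node_path(key: int, base: int, leaf_capacity: int,
              fanout: int, depth: int) -> List[Tuple[int, int]]:
    """Return (node index within its layer, slot within the node) per layer.

    The list runs from the root layer to the leaf layer. For internal
    layers the slot is the child position to follow; for the leaf layer
    it is the position of the value inside the leaf.
    """
    offset = key - base
    capacity = leaf_capacity * fanout ** (depth - 1)
    if not 0 <= offset < capacity:
        raise ValueError("key is outside the range covered by the tree")
    # Leaf layer: which leaf, and which slot inside it.
    index, slot = divmod(offset, leaf_capacity)
    path = [(index, slot)]
    # Each step up divides the node index by the fanout.
    for _ in range(depth - 1):
        index, slot = divmod(index, fanout)
        path.append((index, slot))
    path.reverse()
    return path
```

With `leaf_capacity=4`, `fanout=2`, and `depth=3`, `node_path(9, 0, 4, 2, 3)` gives `[(0, 1), (1, 0), (2, 1)]`: follow child 1 of the root, then child 0 of node 1 in the middle layer, and read slot 1 of leaf 2. No node contents are inspected to obtain these positions.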
S204, replacing the key-value pair with the same key in the leaf node to be updated with the new key-value pair, according to the memory address of the leaf node to be updated.
It will be appreciated that, once the memory address of the leaf node to be updated has been found, the key-value pair can be replaced in memory. After the replacement, the updated key-value pair of the leaf node can be written to the storage device to persist the result.
Further, for a simple key-value pair query request, the embodiment of the invention can quickly query the required key-value pair in the same way. Specifically, when a key value pair query request is received, a sequence number of a leaf node to be queried, to which the key value pair to be queried belongs, in an array can be determined according to a logical address of the key value pair to be queried and a logical address range recorded by each leaf node, and a memory address of the leaf node to be queried is obtained from the array according to the sequence number, so that the key value pair to be queried is obtained in the leaf node to be queried according to the memory address.
Based on this, the method may further include:
Step 61: when a key-value pair query request is received, determining the sequence number, in the array, of the leaf node to be queried to which the key-value pair to be queried belongs, according to the logical address of the key-value pair to be queried and the logical address range recorded by each leaf node;
Step 62: acquiring the memory address of the leaf node to be queried from the array according to the sequence number, and acquiring the key-value pair to be queried from the leaf node to be queried according to the memory address.
In summary, the above embodiments can be summarized in two steps:
1. In the metadata update and persistence stage, the conventional approach of dynamically building the B+ tree as it is flushed to disk is abandoned. Key-value pairs are inserted and deleted directly within the base tree structure and then written to disk, which ensures that the persisted key-value pairs are stored strictly according to the preset base tree structure.
2. In the metadata read cache, a corresponding cache array is constructed for each base tree, and the nodes of the base tree are stored in the cache array. When the cache is looked up, the position of the node holding the queried key is calculated from the key according to the rule. If that node is in the cache, the value is read directly. If it is not, the parent node and ancestor nodes are searched in turn for the physical address, and a disk read is then triggered.
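The cache lookup in item 2 can be sketched as below. This is one possible reading of the miss path, not the embodiment's implementation. It assumes that each internal node stores the on-disk addresses of its children, that the root's disk address is known in advance, and that keys are contiguous integers starting at 0. The disk is modeled as a plain dictionary, and every name (`CachedTree`, `FANOUT`, `LEAF_CAPACITY`, `disk_reads`) is hypothetical.

```python
from typing import Dict, List, Optional

FANOUT = 2         # assumed number of children per internal node
LEAF_CAPACITY = 4  # assumed number of key-value pairs per leaf node


class CachedTree:
    def __init__(self, disk: Dict[int, dict], root_disk_addr: int,
                 level_sizes: List[int]) -> None:
        self.disk = disk                      # stand-in for the storage device
        self.root_disk_addr = root_disk_addr
        # One cache array per layer; None marks a node that is not cached.
        self.cache: List[List[Optional[dict]]] = [[None] * n for n in level_sizes]
        self.disk_reads = 0

    def _load(self, level: int, index: int) -> dict:
        node = self.cache[level][index]
        if node is not None:                  # cache hit
            return node
        if level == 0:
            addr = self.root_disk_addr
        else:
            # Miss: ask the parent (loading it first if needed) for the
            # on-disk address of this node.
            parent = self._load(level - 1, index // FANOUT)
            addr = parent["children"][index % FANOUT]
        self.disk_reads += 1                  # simulated disk read
        node = self.disk[addr]
        self.cache[level][index] = node
        return node

    def get(self, key: int) -> Optional[int]:
        index, slot = divmod(key, LEAF_CAPACITY)
        return self._load(len(self.cache) - 1, index)["values"][slot]
```

On a cold cache the first lookup reads the root and one leaf from the simulated disk. A second lookup in the same leaf reads nothing, and a lookup in a sibling leaf reads only that leaf because its parent is already cached.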
The address processing apparatus based on a storage system, the storage system, and the computer-readable storage medium provided by the embodiments of the present invention are described below. The apparatus, storage system, and storage medium described below and the address processing method described above may be cross-referenced with each other.
Referring to fig. 4, fig. 4 is a block diagram of an address processing apparatus based on a storage system according to an embodiment of the present invention, where the apparatus may include:
an obtaining module 401, configured to obtain all logical addresses contained in a logical address space in a storage system;
a generating module 402, configured to generate a B+ tree using all the logical addresses; all leaf nodes of the B+ tree sequentially store key-value pairs corresponding to the logical addresses in logical-address order, and each leaf node stores the same number of key-value pairs, wherein each key-value pair is keyed by a logical address;
an updating module 403, configured to, when a key-value pair update request is received, replace the key-value pair having the same key in the B+ tree with a new key-value pair containing the mapping between the logical address and the physical address.
Optionally, the generating module 402 may include:
a key value pair initializing sub-module for creating an initial key value pair for each of the logical addresses; the initial key value pair takes the logical address as a key and takes a null value as a value;
and the B+ tree creation submodule is used for generating the B+ tree by utilizing initial key value pairs of all the logical addresses.
Optionally, the apparatus may further include:
and the deleting module is used for clearing the value of the key value pair to be deleted corresponding to the key value pair deleting request in the B+ tree when the key value pair deleting request is received.
Optionally, the apparatus may further include:
a dividing module, configured to divide the logical address space in the storage system and, for each logical address space obtained by the division, proceed to the step of acquiring all logical addresses contained in the logical address space in the storage system.
Optionally, the dividing module may include:
a coverable logical address range determining submodule, configured to determine the logical address range that the B+ tree can cover according to a preset maximum depth of the B+ tree and a preset maximum number of key-value pairs that a leaf node of the B+ tree can store;
a dividing submodule, configured to divide the logical address space in the storage system according to the logical address range.
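The two submodules above can be illustrated with a short Python sketch. The text names only the maximum depth and the maximum number of key-value pairs per leaf node; the uniform internal-node `fanout` used here is an added assumption, as are the function names `coverable_range` and `divide_address_space`.

```python
from typing import List, Tuple


def coverable_range(max_depth: int, leaf_capacity: int, fanout: int) -> int:
    """Number of logical addresses one B+ tree of the given shape can hold.

    A tree of depth d has at most fanout ** (d - 1) leaf nodes.
    """
    if max_depth < 1 or leaf_capacity < 1 or fanout < 2:
        raise ValueError("invalid tree shape")
    return leaf_capacity * fanout ** (max_depth - 1)


def divide_address_space(total: int, span: int) -> List[Tuple[int, int]]:
    """Split [0, total) into half-open sub-spaces of at most span addresses."""
    if total < 0 or span < 1:
        raise ValueError("invalid sizes")
    return [(start, min(start + span, total)) for start in range(0, total, span)]
```

With a maximum depth of 3, 4 pairs per leaf, and a fanout of 2, one tree covers 16 addresses, so a space of 40 addresses is divided into `[(0, 16), (16, 32), (32, 40)]`, and one B+ tree would be generated per sub-space.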
Optionally, the apparatus may further include:
the loading module is used for loading the B+ tree from the storage device of the storage system to the memory device of the storage system;
the recording module is used for sequencing the tree nodes according to the positions of the tree nodes in the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes according to the arrangement sequence of the tree nodes;
The updating module 403 may include:
a query submodule, configured to search for the memory address of the leaf node to be updated, to which the new key-value pair belongs, according to the key of the new key-value pair, the logical address range recorded by each leaf node, and the arrangement order of the leaf nodes;
a replacing submodule, configured to replace the key-value pair with the same key in the leaf node to be updated with the new key-value pair, according to the memory address of the leaf node to be updated.
Optionally, the recording module is specifically configured to:
generating corresponding arrays for each layer of the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes of each layer into the corresponding arrays of each layer according to the arrangement sequence of the tree nodes;
the query sub-module may include:
a query unit, configured to determine, according to a key of the new key value pair and a logical address range recorded by each leaf node, a sequence number of a leaf node to be updated to which the new key value pair belongs in the array;
and the acquisition unit is used for acquiring the memory address of the leaf node to be updated from the array according to the sequence number.
Optionally, the apparatus may further include:
a query request response module, configured to, when a key-value pair query request is received, determine the sequence number, in the array, of the leaf node to be queried to which the key-value pair to be queried belongs, according to the logical address of the key-value pair to be queried and the logical address range recorded by each leaf node;
a query module, configured to acquire the memory address of the leaf node to be queried from the array according to the sequence number, and to acquire the key-value pair to be queried from the leaf node to be queried according to the memory address.
Referring to fig. 5, fig. 5 is a block diagram illustrating a storage system according to an embodiment of the present invention, and the embodiment of the present invention provides a storage system 50, including a processor 51, a first memory 52, and a second memory 53; wherein the first memory 52 is configured to store user data; the second memory 53 is used for storing a computer program; the processor 51 is configured to execute the address processing method based on the storage system provided in the foregoing embodiment when executing the computer program.
For the specific process of the address processing method based on the storage system, reference may be made to the corresponding content provided in the foregoing embodiment, and no detailed description is given here.
The first memory 52 and the second memory 53 may be read-only memories, random access memories, magnetic disks, optical disks, or the like as carriers for storing resources, and the storage may be temporary storage or permanent storage.
In addition, the storage system 50 also includes a power supply 54, a communication interface 55, an input/output interface 56, and a communication bus 57. The power supply 54 provides the working voltage for each hardware device in the storage system 50 and may be supplied externally; the communication interface 55 creates a data transmission channel between the storage system 50 and external devices and may follow any communication protocol applicable to the technical solution of the present invention, which is not specifically limited here; the input/output interface 56 acquires external input data or outputs data externally, and its specific interface type may be selected according to application requirements, which is not limited here.
Further, the first memory 52 in the storage system may comprise a plurality of memories, and the storage space formed by the plurality of first memories 52 may constitute a logical address space; in addition, the storage system may also include a memory device.
The embodiment of the invention also provides a computer readable storage medium, and a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the address processing method based on the storage system of any embodiment are realized.
Since the embodiments of the computer-readable storage medium correspond to the embodiments of the address processing method based on the storage system, reference may be made to the description of the method embodiments for the storage medium embodiments, which are not repeated here.
In this description, the embodiments are described progressively, each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative elements and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The address processing method and apparatus based on a storage system, the storage system, and the medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the embodiments is intended only to aid understanding of the method of the present invention and its core ideas. It should be noted that those skilled in the art can make various modifications and adaptations to the invention without departing from its principles, and such modifications and adaptations also fall within the scope of the invention as defined in the following claims.

Claims (10)

1. A storage system-based address processing method, comprising:
acquiring all logical addresses contained in a logical address space in a storage system;
generating a B+ tree by using all the logical addresses; all leaf nodes of the B+ tree sequentially store key-value pairs corresponding to the logical addresses in logical-address order, and each leaf node stores the same number of key-value pairs, wherein each key-value pair is keyed by a logical address;
when a key-value pair update request is received, replacing the key-value pair with the same key in the B+ tree with a new key-value pair containing the mapping between the logical address and the physical address.
2. The address processing method of claim 1, wherein said generating a b+ tree using all of said logical addresses comprises:
creating an initial key value pair for each of the logical addresses; the initial key value pair takes the logical address as a key and takes a null value as a value;
the B+ tree is generated using the initial key value pairs of all the logical addresses.
3. The address processing method according to claim 2, further comprising:
and when a key value pair deleting request is received, clearing the value of the key value pair to be deleted, which corresponds to the key value pair deleting request, in the B+ tree.
4. The address processing method of claim 1, further comprising, prior to acquiring all logical addresses contained in the logical address space in the storage system:
dividing the logical address space in the storage system and, for each logical address space obtained by the division, proceeding to the step of acquiring all logical addresses contained in the logical address space in the storage system.
5. The address processing method according to any one of claims 1 to 4, further comprising, after generating a b+ tree using all the logical addresses:
loading the B+ tree from a storage device of the storage system into a memory device of the storage system;
ordering the tree nodes according to the positions of the tree nodes in the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes according to the arrangement sequence of the tree nodes;
the replacing of the key-value pair with the same key in the B+ tree with the new key-value pair containing the mapping between the logical address and the physical address comprises:
searching for the memory address of the leaf node to be updated, to which the new key-value pair belongs, according to the key of the new key-value pair, the logical address range recorded by each leaf node, and the arrangement order of the leaf nodes;
and replacing the key-value pair with the same key in the leaf node to be updated with the new key-value pair according to the memory address of the leaf node to be updated.
6. The address processing method of claim 5, wherein sequentially recording the memory addresses corresponding to the tree nodes according to the arrangement order of the tree nodes comprises:
generating a corresponding array for each layer of the B+ tree, and sequentially recording the memory addresses corresponding to the tree nodes of each layer into the corresponding array of that layer according to the arrangement order of the tree nodes;
the searching for the memory address of the leaf node to be updated, to which the new key-value pair belongs, according to the key of the new key-value pair, the logical address range recorded by each leaf node, and the arrangement order of the leaf nodes comprises:
determining the sequence number, in the array, of the leaf node to be updated to which the new key-value pair belongs, according to the key of the new key-value pair and the logical address range recorded by each leaf node;
and acquiring the memory address of the leaf node to be updated from the array according to the sequence number.
7. The address processing method of claim 6, further comprising:
when a key-value pair query request is received, determining the sequence number, in the array, of the leaf node to be queried to which the key-value pair to be queried belongs, according to the logical address of the key-value pair to be queried and the logical address range recorded by each leaf node;
and acquiring the memory address of the leaf node to be queried from the array according to the sequence number, and acquiring the key-value pair to be queried from the leaf node to be queried according to the memory address.
8. An address processing apparatus based on a storage system, comprising:
the acquisition module is used for acquiring all logical addresses contained in the logical address space in the storage system;
the generation module is used for generating a B+ tree by utilizing all the logical addresses; all leaf nodes of the B+ tree sequentially store key value pairs corresponding to the logical addresses according to the logical addresses, and each leaf node stores the same number of key value pairs, wherein the key value pairs are keyed by the logical addresses;
and the updating module is used for replacing the new key value pair containing the mapping relation between the logical address and the physical address with the key value pair with the same key in the B+ tree when receiving the key value pair updating request.
9. A storage system, comprising:
a first memory for storing user data;
a second memory for storing a computer program;
a processor for implementing the storage system-based address processing method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium having stored therein computer executable instructions which when loaded and executed by a processor implement the storage system based address processing method of any of claims 1 to 7.
CN202311611112.6A 2023-11-28 2023-11-28 Address processing method and device based on storage system, storage system and medium Pending CN117573676A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311611112.6A CN117573676A (en) 2023-11-28 2023-11-28 Address processing method and device based on storage system, storage system and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311611112.6A CN117573676A (en) 2023-11-28 2023-11-28 Address processing method and device based on storage system, storage system and medium

Publications (1)

Publication Number Publication Date
CN117573676A true CN117573676A (en) 2024-02-20

Family

ID=89886051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311611112.6A Pending CN117573676A (en) 2023-11-28 2023-11-28 Address processing method and device based on storage system, storage system and medium

Country Status (1)

Country Link
CN (1) CN117573676A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117952082A (en) * 2024-03-25 2024-04-30 冠骋信息技术(苏州)有限公司 PLC address resolution method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination