WO2018205151A1 - Data updating method and storage device - Google Patents

Data updating method and storage device

Info

Publication number
WO2018205151A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage area
subtree
key value
storage
Prior art date
Application number
PCT/CN2017/083657
Other languages
French (fr)
Chinese (zh)
Inventor
徐君
于群
王元钢
薛常亮
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2017/083657 (WO2018205151A1)
Priority to CN201780070813.XA (CN110168532B)
Publication of WO2018205151A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present application relates to the field of computer storage, and in particular, to a data update method and a storage device.
  • In a key-value (KV) store, the value (Value) can be quickly determined by looking up a keyword, that is, a key (Key), so that services can be processed in real time at large scale.
  • The Log-Structured Merge Tree (LSM Tree) is the main algorithmic structure of KV databases.
  • Random writes can be turned into sequential writes by layer-by-layer merging. However, a storage device such as a disk stores data in units of blocks, and each read and write operation is performed in units of blocks. During layer-by-layer merging, the entire data block related to the data therefore has to be read into memory, merged with the data, and written back to the disk. This causes serious write amplification in the system, which in turn limits data storage performance.
  • the present application provides a data updating method and storage device, which can improve data reading and writing efficiency.
  • A data update method is provided. The method is performed by a storage device that includes a first storage area and a second storage area, where the data read/write speed of the first storage area is higher than that of the second storage area. The method comprises: searching an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data stored in the second storage area, M and n are positive integers, and n is less than or equal to M; writing the first data from the first storage area to the second storage area; and updating the first subtree according to the first key value, where a first leaf node of the updated first subtree includes information about the first data.
  • The storage area of the storage device is partitioned, and the read/write performance of the first storage area is better than that of the second storage area. The nodes of the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree is also located in the first n layers of the index tree. During a data update, the first subtree can thus be found quickly from the first key value of the data to be written, which improves search speed compared with the prior art, in which the index files of every level have to be merged layer by layer. Moreover, in the present application the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
  • The root node of the first subtree is located in the first n layers of the index tree, and n is less than or equal to the total number of layers M of the index tree; the root node of the first subtree is therefore also located in the first storage area.
  • The root node of the first subtree may be located in the nth layer of the index tree, that is, in the last of the first n layers of the index tree that are stored in the first storage area.
  • For example, the total number of layers of the index tree is 5, the first 3 layers are stored in the first storage area, and the root node of the first subtree is located in the third layer of the index tree.
  • The first subtree corresponding to the first key value means that the first key value falls within the range of key values covered by the first subtree.
  • The information about the first data may include at least one of the following: the value of the first data, the first key value of the first data, the address (or a link) of the first data, the address (or a link) of the first key value, and so on.
  • The information about the second data may include at least one of the following: the value of the second data, the second key value of the second data, the address (or a link) of the second data, the address (or a link) of the second key value, and so on.
  • The method further includes: receiving a write request, where the write request includes the first data to be written and the first key value; and writing the first data and the first key value into the first storage area.
  • Newly written data is first written into the first storage area, whose read/write performance is better, which increases the data writing speed.
  • Transferring data from the first storage area to the second storage area also reduces the write amplification caused by updating data directly in the second storage area, which improves data storage performance.
  • The method further includes: receiving a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, searching the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and reading the second data from the second storage area according to the information about the second data included in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
  • Writing the first data from the first storage area to the second storage area includes: when the remaining storage space of the first storage area is smaller than a first threshold and the number of reads and writes of the first subtree satisfies a second threshold, writing the first data from the first storage area to the second storage area.
  • If the index range of each read operation covers multiple subtrees, and those subtrees include a certain number of cold subtrees (subtrees that are accessed too infrequently) or subtrees with few leaf nodes, the multiple subtrees can be merged. Before the subtrees are merged, the data indexed by the key values corresponding to those subtrees needs to be written from the first storage area to the second storage area.
  • A storage device is provided that can be used to perform the processes of the data update method described in the first aspect and its various implementations.
  • The storage device includes a storage module and a processing module. The storage module includes a first storage area and a second storage area, and the data read/write speed of the first storage area is higher than that of the second storage area. The processing module is configured to: search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data stored in the second storage area, M and n are positive integers, and n is less than or equal to M; write the first data from the first storage area into the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree includes information about the first data.
  • The processing module is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and write the first data and the first key value into the first storage area.
  • The processing module is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to the information about the second data included in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
  • The processing module is specifically configured to: when the remaining storage space of the first storage area is smaller than a first threshold and the number of reads and writes of the first subtree satisfies a second threshold, write the first data from the first storage area to the second storage area.
  • the first storage area includes a non-volatile storage medium.
  • a storage device including a transceiver, a processor, and a memory.
  • The memory stores a program, and the processor executes the program to perform the processes of the data update method described in the first aspect and its various implementations.
  • A computer is provided, including a processor and a memory. The memory is configured to store computer-executable instructions, and the processor and the memory communicate with each other through an internal connection path. When the computer is running, the processor executes the computer-executable instructions stored in the memory, causing the computer to perform the processes of the data update method described in the first aspect and its various implementations.
  • A computer-readable storage medium is provided that stores a program, the program causing a device to perform the method of the first aspect or any of its various implementations.
  • A system chip is provided, comprising an input interface, an output interface, a processor, and a memory. The processor is configured to execute instructions stored in the memory, and when the instructions are executed, the processor can implement the method of the first aspect or any of its various implementations.
  • FIG. 1 is a schematic diagram of data storage in the prior art.
  • FIG. 2 is a schematic flowchart of a data update method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a data update method according to another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a data update method according to another embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a storage device according to an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a storage device according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a storage device according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a system chip according to an embodiment of the present application.
  • The data update method described in the embodiments of the present application may be applied to a storage system that supports key-value storage.
  • Data is stored as key-value pairs, and multiple key-value pairs are stored in a corresponding file. By looking up the keyword Key of a key-value pair, the data value Value corresponding to that Key is quickly determined, so that services can be processed in real time at large scale.
  • The updated data is first written sequentially to the disk log, and the data update is then performed in the memory cache.
  • The data is transferred directly to a Level 0 file on the disk. When the data volume of the Level 0 files has accumulated to a certain extent, they are merged with the Level 1 files, the merged new file is stored in Level 1, and the redundant data is deleted. When the data volume of the Level 1 files has accumulated to a certain extent, they are merged with the Level 2 files, the merged new file is stored in Level 2, and the redundant data is deleted; and so on, forming a smaller number of files with larger storage capacity.
  • Such layer-by-layer merging causes significant write amplification. For example, if only a small amount of data in an upper layer needs to be merged into the next layer, ideally only that small amount of data would be written to the next layer. However, because a storage device such as a disk stores data in units of blocks, every read and write operation must be performed in units of data blocks. The entire data block associated with the data therefore has to be read into memory, merged with the data, and written back to the disk, so that a whole block of data is written for a small update. The layer-by-layer merge method thus leads to serious write amplification, which further degrades data storage performance.
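  • The effect described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that every block touched by an update is rewritten in full and that the update touches the smallest possible number of blocks; the function name and the 100-byte and 4096-byte figures are illustrative assumptions, not values from the application.

```python
def write_amplification(payload_bytes: int, block_bytes: int) -> float:
    """Ratio of bytes physically written to bytes logically updated.

    Assumption: each touched block is rewritten in full, and the update
    touches the minimum number of blocks.
    """
    if payload_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    blocks_touched = -(-payload_bytes // block_bytes)  # ceiling division
    return blocks_touched * block_bytes / payload_bytes


# Hypothetical figures: a 100-byte update inside a 4096-byte block.
factor = write_amplification(100, 4096)
```

  • Under these assumptions the device writes one full block for a 100-byte change, and the ratio shrinks toward 1 as the update size approaches a whole number of blocks.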
  • The key corresponding to the data is first searched for in the memory table (memtable). If the key is not found there, the index file of each layer of data, that is, the Sorted String Table (SSTable), is checked in reverse order until the key is found. Each SSTable is sorted, but the search slows down as the number of SSTables increases.
  • The time complexity is O(K log N), where K is the number of SSTable files and N is the average size of an SSTable. The complexity of this lookup therefore also limits data storage performance.
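  • The O(K log N) bound follows from one binary search per table. A minimal sketch of such a prior-art style lookup, assuming the memtable is a dictionary and each SSTable is an in-memory list of (key, value) pairs sorted by key, ordered newest first; all names here are hypothetical.

```python
from bisect import bisect_left


def search_sorted_table(table, key):
    """Binary search in one sorted list of (key, value) pairs: O(log N)."""
    i = bisect_left(table, (key,))  # first entry whose key is >= key
    if i < len(table) and table[i][0] == key:
        return table[i][1]
    return None


def lsm_get(memtable, sstables, key):
    """Check the memtable, then each table in turn: O(K log N) overall."""
    if key in memtable:
        return memtable[key]
    for table in sstables:  # assumed order: newest table first
        value = search_sorted_table(table, key)
        if value is not None:
            return value
    return None
```

  • In the worst case the key is absent and all K tables are searched, which is the cost that grows with the number of SSTables.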
  • the embodiment of the present application provides a data update method for improving storage performance.
  • The write operation in the embodiments of the present application may include writing new data (put) or updating data (update), and the read operation may include reading data (get), range queries, and other operations.
  • FIG. 2 is a schematic flowchart of a data update method according to an embodiment of the present application.
  • The method is performed by a storage device that includes a first storage area and a second storage area, where the data read/write speed of the first storage area is higher than that of the second storage area. The storage device accesses the data stored in the second storage area through an index tree. The index tree includes M layers, the first n layers of the index tree are stored in the first storage area, M and n are both positive integers, and n is less than or equal to the total number of layers M of the index tree.
  • The index tree may be any type of index tree, such as a binary tree or a balanced multi-way search tree (B-Tree, B+Tree), which is not limited in this application.
  • The first storage area may be, for example, a storage-class memory (SCM) or another byte-addressable non-volatile storage medium; the second storage area may be, for example, NAND flash memory (NAND Flash) or a hard disk drive (HDD). Because the first storage area uses a fast storage medium and the second storage area uses a slow storage medium, the data read/write speed of the first storage area is higher than that of the second storage area, that is, the access performance of the first storage area is better than that of the second storage area.
  • The index tree is used to access data in the second storage area, but because the upper layers of the index tree are accessed more frequently, the first n layers of the index tree can be stored in the first storage area.
  • When n is smaller than the total number of layers of the index tree, only some layers of the index tree are stored in the first storage area; when n is equal to the total number of layers, the entire index tree is stored in the first storage area.
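  • The placement rule for the layers of the index tree can be sketched as follows. This is only an illustration under the stated constraint that n is between 1 and M; the function name and the "first"/"second" labels are hypothetical.

```python
def layer_placement(total_layers: int, n: int) -> dict:
    """Map each layer number (1 = root layer) to the storage area holding it.

    The first n layers go to the first (fast) storage area and the
    remaining layers to the second (slow) storage area.
    """
    if not 1 <= n <= total_layers:
        raise ValueError("n must satisfy 1 <= n <= total_layers")
    return {
        layer: "first" if layer <= n else "second"
        for layer in range(1, total_layers + 1)
    }


# Hypothetical configuration: M = 5 layers, first n = 3 layers in the fast area.
placement = layer_placement(5, 3)
```

  • With n equal to the total number of layers, every layer maps to the first storage area.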
  • Depending on the cost and capacity of the first storage area, such as storage-class memory (SCM), it may be preferred to store only part of the index tree, that is, the first few layers, in the first storage area, with the remaining layers stored in the second storage area. If the first storage area, such as SCM, becomes sufficiently cheap and large in the future, the entire index tree may also be stored in the first storage area, which is not limited herein.
  • the data update method may include the following steps:
  • An index tree is searched according to a first key value of first data to obtain a first subtree corresponding to the first key value. The first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data stored in the second storage area, M and n are positive integers, and n is less than or equal to M.
  • The nodes of the index tree fall into four categories: root node, leaf node, parent node, and child node.
  • A child node is the next-level node below its parent node. If a node has a node one level above it, that upper node is called its parent node; if there is no upper level, the node has no parent node.
  • A node with no children is called a leaf node. A node with no other node above it is called the root node.
  • the nodes in the embodiments of the present application can also be written as nodes.
  • The root node of the first subtree is located in the first n layers of the index tree, and n is less than or equal to the total number of layers M of the index tree; the root node of the first subtree is therefore also located in the first storage area.
  • The root node of the first subtree may be located in the nth layer of the index tree, that is, in the last of the first n layers of the index tree that are stored in the first storage area.
  • For example, the total number of layers of the index tree is 5, the first 3 layers are stored in the first storage area, and the root node of the first subtree is located in the third layer of the index tree.
  • The first subtree corresponding to the first key value means that the first key value falls within the range of key values covered by the first subtree.
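  • Finding the subtree whose key range contains a given key can be sketched with a binary search over the lower bounds of the subtree ranges. This is a simplified stand-in for walking the first n layers of the index tree, assuming the subtrees rooted in the nth layer cover contiguous, sorted, half-open key ranges; the names and bounds are hypothetical.

```python
from bisect import bisect_right


def find_subtree(lower_bounds, key):
    """Return the index of the subtree whose key range contains key.

    lower_bounds is a sorted list; subtree i is assumed to cover keys in
    [lower_bounds[i], lower_bounds[i + 1]), and the last subtree is
    unbounded above.
    """
    i = bisect_right(lower_bounds, key) - 1
    if i < 0:
        raise KeyError(f"key {key!r} is below every subtree range")
    return i


# Hypothetical layout: three subtrees starting at keys 0, 100 and 200.
bounds = [0, 100, 200]
subtree_index = find_subtree(bounds, 150)
```

  • Because the searched nodes sit in the fast storage area, this step does not need to touch the slow storage area at all.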
  • Data may be stored in the form of key-value pairs.
  • The data stored in the first storage area and the second storage area may include the value of the data and an index corresponding to the data. For example, the first data mentioned herein includes the value of the first data and an index of the first data; the index of the first data is the key (Key) of the key-value pair, and the value of the first data is the value (Value) of the key-value pair.
  • the Value corresponding to the Key can be quickly found.
  • Different data can be managed by the index tree. When the data is read and written, the index tree can be used to determine the data block where the corresponding data is located by using the index of the data, thereby realizing access to the data.
  • For example, a student management database stores each student's student number and name, with the student number as the key (Key) and the name as the value (Value). To write new data into the student management database, put(0600100, Chen Meiling) is executed, which adds the key-value pair (0600100, Chen Meiling) to the database.
  • the first data is written from the first storage area to the second storage area.
  • The storage device may write the first data stored in the first storage area into the second storage area. That is, the data in the first storage area that is indexed by the key values corresponding to the first subtree may be merged with the data in the second storage area that is indexed by the key values corresponding to the first subtree: the data is rearranged and re-stored according to the index order of the data in the two storage areas. The merged data is stored in the second storage area. Because the data in the first storage area has been transferred to the second storage area, the storage space in the first storage area can be released.
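  • A subtree-scoped merge of this kind can be sketched as follows. The sketch assumes that both inputs are dictionaries already restricted to the key range of one subtree, and that a value in the first storage area supersedes an older value with the same key in the second storage area; that precedence rule and the function name are assumptions made for illustration.

```python
def merge_subtree_data(first_area_items: dict, second_area_items: dict) -> list:
    """Merge the data of one subtree from both areas, ordered by key.

    Assumption: entries from the first (fast) area are newer and replace
    entries with the same key from the second (slow) area.
    """
    merged = dict(second_area_items)
    merged.update(first_area_items)
    return sorted(merged.items())


# Hypothetical data for one subtree.
merged = merge_subtree_data({2: "b2", 4: "d"}, {1: "a", 2: "b", 3: "c"})
```

  • Only the blocks belonging to this one subtree are rewritten, rather than whole levels as in layer-by-layer merging.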
  • Writing the first data from the first storage area to the second storage area includes: when the remaining storage space of the first storage area is smaller than the first threshold and the number of reads and writes of the first subtree satisfies the second threshold, writing the first data from the first storage area to the second storage area.
  • The timing of writing the first data from the first storage area to the second storage area may be determined by the remaining storage space of the first storage area at that time. When the remaining storage space is insufficient, for example less than the first threshold, the transfer of data is initiated, that is, data is written from the first storage area to the second storage area.
  • The storage device first selects, among all subtrees whose root node is located in the first n layers, for example in the nth layer, at least one subtree that satisfies a condition. The at least one subtree includes, for example, the first subtree described above. The storage device then writes data in the first storage area to the second storage area according to the at least one subtree.
  • A check of whether each subtree in the index tree satisfies the condition is triggered, and the data indexed by the key values corresponding to each subtree whose root node is located in the nth layer and that satisfies the condition is written from the first storage area to the second storage area.
  • The condition that the at least one subtree satisfies may include, for example, any one of the following: the storage space that can be released after the data indexed by the key values corresponding to the subtree is written from the first storage area to the second storage area is greater than a preset threshold; the ratio of the number of write operations to the number of read operations of the subtree is greater than a preset threshold; or the read operation frequency and/or the write operation frequency of the subtree is less than a preset threshold.
  • the following uses the first subtree as an example to describe these three conditions.
  • the storage space that can be released after the sub-tree merge operation is performed on the first sub-tree is greater than a preset threshold.
  • The primary purpose of writing the first data from the first storage area to the second storage area is to release storage space in the fast storage area. When the storage space of the fast storage area is insufficient, the storage device may therefore use the amount of storage space that would be released by writing the data indexed by the key values of each subtree from the first storage area to the second storage area to decide which subtrees' data to write to the second storage area. For example, if writing the data (including the first data) indexed by the key values corresponding to the first subtree (including the first key value) from the first storage area to the second storage area would release more storage space than a preset threshold, the storage device may write the data indexed by the key values corresponding to the first subtree from the first storage area to the second storage area.
  • the ratio of the number of write operations of the first subtree to the number of read operations of the first subtree is greater than a preset threshold.
  • a subtree with many write operations and few read operations can be selected, so that the data indexed by the key value corresponding to the subtree is written from the first storage area to the second storage area.
  • With the ratio of the number of write operations of the first subtree to the number of read operations of the first subtree used as the measure, if this ratio is greater than a preset threshold, the data (including the first data) indexed by the key values corresponding to the first subtree is written from the first storage area to the second storage area.
  • Read operations are faster in the first storage area and slower in the second storage area. If the data corresponding to the first subtree in the first storage area is read often, the data is left in the first storage area so that the read speed of the system can be increased.
  • A write operation of the first subtree is one in which the key value of the written data lies in the range of key values corresponding to the first subtree; a read operation of the first subtree is one in which the key value of the read data lies in that range.
  • the read operation frequency and/or the write operation frequency corresponding to the first subtree are less than a preset threshold.
  • For a subtree that is rarely accessed, the storage device may also write the data indexed by the key values corresponding to the subtree from the first storage area to the second storage area. For example, if the read operation frequency and/or the write operation frequency of the first subtree is less than a preset threshold, the data (including the first data) indexed by the key values corresponding to the first subtree (including the first key value) may be written from the first storage area to the second storage area.
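  • The three conditions discussed above can be sketched as a single predicate. This is a hedged illustration only: the record type, field names, and thresholds are hypothetical, the frequency condition is reduced to one combined operations-per-second figure, and a subtree with zero reads is treated as having one read to avoid division by zero.

```python
from dataclasses import dataclass


@dataclass
class SubtreeStats:
    releasable_bytes: int   # space freed in the first area if flushed
    writes: int             # number of put/update operations
    reads: int              # number of read operations
    ops_per_second: float   # combined read/write frequency


def should_flush(stats: SubtreeStats, min_release: int,
                 min_write_read_ratio: float, max_ops_per_second: float) -> bool:
    """Return True if the subtree meets any one of the three conditions."""
    frees_enough = stats.releasable_bytes > min_release
    write_heavy = stats.writes / max(stats.reads, 1) > min_write_read_ratio
    cold = stats.ops_per_second < max_ops_per_second
    return frees_enough or write_heavy or cold


# Hypothetical subtree: write-heavy, so a candidate for flushing.
candidate = SubtreeStats(releasable_bytes=512, writes=90, reads=3, ops_per_second=50.0)
decision = should_flush(candidate, min_release=4096,
                        min_write_read_ratio=10.0, max_ops_per_second=1.0)
```

  • A read-heavy, frequently accessed subtree fails all three tests and keeps its data in the first storage area, which matches the goal of serving hot reads from the fast medium.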
  • I/O statistics may be updated each time a write operation or a read operation is performed in the storage device. The I/O statistics may include, for example, at least one of the following: the total number of write operations (including put operations and update operations) for each subtree, the number of read operations, the index range of read operations, the number of queries for an index range, and a timestamp for each operation, for example the time at which a write operation, an update operation, or a read operation was performed.
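  • Per-subtree bookkeeping of this kind might look like the sketch below. It tracks only operation counts and the most recent timestamp; the class, its fields, and the operation names "put", "update", and "get" as strings are assumptions for illustration, and range-query statistics are omitted.

```python
import time
from collections import defaultdict


class IOStats:
    """Minimal per-subtree I/O counters (hypothetical structure)."""

    def __init__(self):
        self.writes = defaultdict(int)  # subtree id -> put + update count
        self.reads = defaultdict(int)   # subtree id -> get count
        self.last_op = {}               # subtree id -> (operation, timestamp)

    def record(self, subtree_id, op, now=None):
        """Update the counters after one operation on the given subtree."""
        if op in ("put", "update"):
            self.writes[subtree_id] += 1
        elif op == "get":
            self.reads[subtree_id] += 1
        else:
            raise ValueError(f"unknown operation: {op}")
        self.last_op[subtree_id] = (op, time.time() if now is None else now)


stats = IOStats()
stats.record(1, "put", now=1.0)
stats.record(1, "get", now=2.0)
stats.record(1, "update", now=3.0)
```

  • Counters like these supply the write/read ratio and access frequency that the flush conditions compare against their thresholds.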
  • the first subtree is updated according to the first key value.
  • the first leaf node of the updated first subtree includes the information of the first data.
  • The storage device further updates the first subtree according to the first key value, and a first leaf node of the updated first subtree includes the information about the first data.
  • The first leaf node may be any leaf node in the first subtree.
  • The information about the first data may include at least one of the following: the value of the first data, the first key value of the first data, the address (or a link) of the first data, the address (or a link) of the first key value, and so on.
  • The storage area of the storage device is partitioned, and the read/write performance of the first storage area is better than that of the second storage area. The nodes of the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree is also located in the first n layers of the index tree. During a data update, the first subtree can thus be found quickly from the first key value of the data to be written, which improves search speed compared with the prior art, in which the index files of every level have to be merged layer by layer. Moreover, in the present application the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
  • The method further includes steps 240 and 250.
  • a write request is received, the write request including the first data to be written and the first key value.
  • the first data and the first key value are written to the first storage area.
  • the storage device may first store the received data into the first storage area for temporary management.
  • The first storage area does not merge layer by layer when data is written; instead, data can be written at byte granularity, for example. The speed of writing data into the first storage area is therefore significantly higher than the speed of writing data into the second storage area, and the write amplification problem is avoided.
  • Newly written data is first written into the first storage area, whose read/write performance is better, which increases the data writing speed.
  • Transferring data from the first storage area to the second storage area also reduces the write amplification caused by updating data directly in the second storage area, which improves data storage performance.
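  • The write path can be sketched as a small in-memory model of the first storage area. The class and method names are hypothetical, values are modelled as byte strings, key sizes are ignored when accounting for space, and the sketch does not reject writes that exceed the capacity.

```python
class FirstStorageArea:
    """Toy stand-in for the fast, byte-addressable area (hypothetical)."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.data = {}  # key -> value bytes

    def remaining_bytes(self) -> int:
        """Capacity minus the bytes held by stored values."""
        return self.capacity_bytes - sum(len(v) for v in self.data.values())

    def put(self, key, value: bytes) -> None:
        """Write new data or update existing data; the latest value wins."""
        self.data[key] = value

    def below_threshold(self, first_threshold: int) -> bool:
        """True when remaining space has dropped below the first threshold."""
        return self.remaining_bytes() < first_threshold


area = FirstStorageArea(capacity_bytes=10)
area.put(1, b"abc")
```

  • Every put lands in the fast area without any block-level merge; the threshold check is what later triggers the transfer to the second storage area.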
  • the method further includes:
  • a read request is received, the read request including a second key value of the second data.
  • the root node of the second subtree is located in the first n layers of the index tree.
  • the second data is read from the second storage area according to the information of the second data included in the second leaf node of the second subtree.
  • the second leaf node is a leaf node that is found according to the second key value.
  • the information of the second data includes at least one of the following information: a value of the second data, a second key value of the second data, an address (or a link) of the second data, and a second key The address (or link) of the value, etc.
  • the storage device first searches for the second data in the first storage area according to the second key value, and if the second data is found in the first storage area, directly reads the second data. data. Since part of the data is also stored in the first storage area, if the part of the data has not been written to the second storage area, the storage device can find the data in the first storage area. Since the data access speed of the first storage area is significantly higher than the data access speed of the second storage area, fast reading of data can be realized.
  • the storage device may not find the index of the second data in the first storage area. In this case, the storage device searches the index tree according to the second key value to obtain the second subtree corresponding to the second key value, and reads the second data from the second storage area according to the information of the second data included in the second leaf node of the second subtree.
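The read path described above can be sketched as follows. This is only an illustration under stated assumptions: a dict stands in for the first storage area, nested dicts with `separators` and `children` stand in for the index tree, and leaf `entries` map keys to block addresses. None of these names or layouts come from the application.

```python
from bisect import bisect_right


def find_leaf(node, key):
    """Descend from the index-tree root to the leaf whose key range covers key."""
    while "children" in node:
        # separators[i] is the smallest key routed to children[i + 1]
        i = bisect_right(node["separators"], key)
        node = node["children"][i]
    return node


def read(key, first_area, index_root, second_area):
    """Look in the fast first area first; fall back to the index tree."""
    if key in first_area:
        return first_area[key]
    leaf = find_leaf(index_root, key)
    address = leaf["entries"].get(key)  # leaf holds the data's address (assumed form)
    return None if address is None else second_area[address]


# Hypothetical sample data mirroring the (Key, Value) naming used in the text.
leaf_f = {"entries": {3: "block0", 5: "block1"}}
leaf_g = {"entries": {6: "block2"}}
index_root = {"separators": [6], "children": [leaf_f, leaf_g]}
first_area = {1: "Value 1"}
second_area = {"block0": "Value 3", "block1": "Value 5", "block2": "Value 6"}

print(read(1, first_area, index_root, second_area))  # served by the first storage area
print(read(3, first_area, index_root, second_area))  # served through the index tree
```

Checking the first storage area before descending the tree keeps reads of recently written data on the faster medium, which is the benefit the text attributes to this ordering.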
  • FIG. 5 is a schematic diagram of a data update method according to an embodiment of the present application.
  • the first storage area is a first storage medium SCM
  • the second storage area is a NAND Flash or HDD
  • the first storage area includes a data area and an index area.
  • the data area is used to store data
  • the index area is used to store the index tree. It is assumed here that the index tree is a B+ tree, a balanced multiway search tree.
  • the second storage area is used to store the index tree and data.
  • the first storage area may be a storage medium, and the storage medium is divided into a data area and an index area.
  • the data area and the index area may use different storage media, which are not limited herein.
  • the total number of layers of the index tree for finding data in the second storage area in the storage device is 5, and the first 3 layers are stored in the index area of the first storage area, and the last 2 layers are stored in the second storage area.
  • the index tree has node A as its root node. The child nodes of node A include node B and node P; the child nodes of node B include node C and node I; the child nodes of node C include node D and node E; and the child nodes of node D include node F, node G, and node H.
  • Node F, node G, and node H are leaf nodes of the entire index tree.
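The layer-based placement of the index tree can be illustrated with a short sketch. It assumes the tree is given as a parent-to-children mapping and that the root counts as layer 1; the function name `place_nodes` and the labels "first" and "second" are hypothetical.

```python
def place_nodes(children, root, n):
    """Map each node to the storage area that holds it, based on its layer."""
    placement = {}
    frontier = [(root, 1)]  # the root is layer 1
    while frontier:
        node, layer = frontier.pop()
        # Nodes in the first n layers live in the first storage area.
        placement[node] = "first" if layer <= n else "second"
        for child in children.get(node, []):
            frontier.append((child, layer + 1))
    return placement


# Parent-to-children edges named in the text (other branches are not described).
children = {
    "A": ["B", "P"],
    "B": ["C", "I"],
    "C": ["D", "E"],
    "D": ["F", "G", "H"],
}
placement = place_nodes(children, "A", n=3)
print(sorted(k for k, v in placement.items() if v == "first"))
# ['A', 'B', 'C', 'I', 'P']
```

With n = 3, nodes C and I sit in the last layer kept in the first storage area, so they are the roots of the subtrees used when data is later moved to the second storage area.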
  • when data is written, for example the first data, the first data is not written directly into the second storage area as in the prior art. Instead, the first data is written into the data area of the first storage area. Because the data write speed of the first storage area is significantly higher than that of the second storage area, the first data can be written there first.
  • as data is written, the free storage space of the first storage area continuously decreases. When the free storage space of the first storage area falls to a certain extent, for example below a space threshold, the storage device writes data from the first storage area to the second storage area to release storage space in the first storage area.
  • specifically, when transferring data to the second storage area, the storage device may select, from among the subtrees whose root nodes are located in the third layer, the subtrees that satisfy a certain condition, and write the data indexed by the key values corresponding to those subtrees from the first storage area to the second storage area.
  • the storage device may determine whether each subtree satisfies the merge condition according to the size of the storage space that can be released after the data indexed by the key values corresponding to the subtree is written from the first storage area to the second storage area, or according to the number of read/write operations corresponding to each subtree.
  • for example, the subtrees in the index tree whose root nodes are located in the third layer include the first subtree and the second subtree.
  • the storage device determines whether the first subtree and the second subtree satisfy a preset condition, for example, whether the number of read/write operations corresponding to each subtree reaches a certain threshold.
  • assuming that the first subtree satisfies the preset condition, the storage device writes the data indexed by the key values corresponding to the first subtree from the first storage area to the second storage area.
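The trigger and subtree selection described above might look like the following sketch. The text does not fix the exact condition, so the condition is passed in as a caller-supplied predicate `meets_condition`; that predicate, the dict-based storage areas, and the field names `name`, `key_range`, and `accesses` are all assumptions made for illustration.

```python
def flush_subtrees(first_area, second_area, subtrees, free_space,
                   space_threshold, meets_condition):
    """Move the data of qualifying subtrees from the first area to the second."""
    if free_space >= space_threshold:
        return []  # enough free space remains, so nothing is moved
    flushed = []
    for tree in subtrees:
        if not meets_condition(tree):
            continue
        low, high = tree["key_range"]  # inclusive bounds (assumed)
        # sorted() builds the full key list before any entry is popped
        for key in sorted(k for k in first_area if low <= k <= high):
            second_area[key] = first_area.pop(key)
        flushed.append(tree["name"])
    return flushed


first_area = {12: "a", 15: "b", 30: "c"}
second_area = {}
subtrees = [
    {"name": "first", "key_range": (10, 25), "accesses": 7},
    {"name": "second", "key_range": (26, 40), "accesses": 1},
]
moved = flush_subtrees(first_area, second_area, subtrees, free_space=10,
                       space_threshold=100,
                       meets_condition=lambda t: t["accesses"] >= 5)
print(moved, first_area, second_area)
```

Passing the condition as a predicate keeps the sketch neutral between the two criteria the text mentions (releasable space and read/write counts).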
  • the data of the first subtree in the first storage area and the data of the first subtree in the second storage area are read into memory and sorted according to the keys of the data, and the sorted data is transferred from memory to the second storage area so as to be stored at the corresponding location in the second storage area. In this way, storage space in the first storage area can be released.
  • as shown in the figure, the data indexed by the key values corresponding to the first subtree includes (Key 1, Value 1), (Key 2, Value 2), and (Key 4, Value 4) in the first storage area, and (Key 3, Value 3), (Key 5, Value 5), and (Key 6, Value 6) in the second storage area. The data is rearranged in key order.
  • the merged data (Key 1, Value 1), (Key 2, Value 2), (Key 3, Value 3), (Key 4, Value 4), (Key 5, Value 5), and (Key 6, Value 6) is formed and stored in the second storage area.
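The merge in this example can be reproduced with a few lines. The sketch below assumes integer keys and plain dicts for the two sets of data; it also assumes that, if the same key appeared in both areas, the copy from the first storage area would win, which the text does not state.

```python
def merge_subtree_data(first_area_data, second_area_data):
    """Combine one subtree's data from both areas and sort it by key."""
    merged = dict(second_area_data)
    merged.update(first_area_data)  # assumption: the first area holds the newer copy
    return sorted(merged.items())


in_first_area = {1: "Value 1", 2: "Value 2", 4: "Value 4"}
in_second_area = {3: "Value 3", 5: "Value 5", 6: "Value 6"}
merged = merge_subtree_data(in_first_area, in_second_area)
print([key for key, _ in merged])
# [1, 2, 3, 4, 5, 6]
```

Only the data of one subtree is read into memory and rewritten, which is why the text describes this as reducing write amplification compared with layer-by-layer merging.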
  • the data indexed by the key value corresponding to the first subtree refers to the data corresponding to the key value in the range of the key value of the first subtree.
  • for example, if the key values corresponding to the first subtree range from 10 to 25 and the first key value of the first data is 15, the first key value 15 lies within the key value range 10-25 of the first subtree, and the data indexed by the key values corresponding to the first subtree includes the first data.
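The range check in this example can be written as a small lookup. The sketch assumes inclusive bounds and a hypothetical second range of 26 to 40, which does not appear in the text.

```python
def subtree_for_key(key, key_ranges):
    """Return the name of the subtree whose key range contains key, if any."""
    for name, (low, high) in key_ranges.items():
        if low <= key <= high:
            return name
    return None


key_ranges = {"first subtree": (10, 25), "second subtree": (26, 40)}
print(subtree_for_key(15, key_ranges))
# first subtree
```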
  • the key-value pair (Key, Value) of the data may be stored in the data block as shown in FIG. 5. For example, the data block in the dotted box in the lower left corner of the second storage area stores the full key-value pair (Key 3, Value 3).
  • alternatively, since leaf node F already stores the key Key 3 of the data, only Value 3 may be stored in the data block; that is, the dotted box in the lower left corner of the second storage area stores only Value 3 rather than the full (Key 3, Value 3).
  • the embodiment of the present application does not limit the data storage form in the data block.
  • when data is temporarily managed in the first storage area, it may also be managed by means of an index tree.
  • for example, the small squares in the index area of the first storage area in FIG. 5 may represent three child nodes whose parent node is node C. These three child nodes each manage data of a different index range, and they may have child nodes of their own, which are not shown here.
  • the index tree with node A as the root node mentioned above is the index tree for accessing data in the second storage area, and the index tree used for temporarily managing data in the first storage area is a different index tree.
  • the index tree in the embodiment of the present application refers to an index tree for accessing data of the second storage area, that is, a 5-layer index tree with node A as the root node in FIG. 5, unless otherwise specified.
  • when data is read, the second key value Key 3 of the second data to be read is first searched for in the index area of the first storage area. If Key 3 is found there, the second data (Key 3, Value 3) is read from the data area of the first storage area according to Key 3.
  • otherwise, the leaf node where Key 3 is located is searched for layer by layer in the index tree, starting from its root node, node A. Assuming that Key 3 is found in leaf node F in the second storage area, the data (Key 3, Value 3) is read from the data block in the second storage area that corresponds to Key 3.
  • if, across multiple read operations on the index tree, the index range corresponding to each read operation covers multiple subtrees, and these subtrees include a certain number of cold subtrees (subtrees that are accessed too infrequently) or subtrees with few leaf nodes, these subtrees may be merged.
  • the data indexed by the key values corresponding to the subtrees needs to be written into the second storage area from the first storage area before the multiple subtrees are merged.
  • the process of merging multiple subtrees is the same as the process of merging multiple subtrees in the prior art. For brevity, no further details are provided here.
  • for example, if the index range corresponding to each read operation covers the first subtree and the second subtree, and at least one of the two is a cold subtree or has fewer than a certain number of leaf nodes, the first subtree and the second subtree may be merged. By that time, the data indexed by the key values corresponding to the first subtree and the second subtree has already been written from the first storage area to the second storage area. After the merge, the first subtree and the second subtree form a new subtree, and the parent node of the new subtree is still node B.
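A possible form of the merge decision is sketched below. The thresholds `min_accesses` and `min_leaves`, the field names, and the "at least one qualifying subtree" rule are assumptions chosen to match this example; the text leaves the exact numbers and criteria open.

```python
def should_merge(subtrees, min_accesses, min_leaves):
    """True if at least one subtree is cold or has too few leaf nodes."""
    return any(
        t["accesses"] < min_accesses or t["leaves"] < min_leaves for t in subtrees
    )


def merged_key_range(subtrees):
    """Key range covered by the new subtree formed from the given subtrees."""
    lows = [t["key_range"][0] for t in subtrees]
    highs = [t["key_range"][1] for t in subtrees]
    return (min(lows), max(highs))


first = {"key_range": (10, 25), "accesses": 2, "leaves": 3}
second = {"key_range": (26, 40), "accesses": 50, "leaves": 8}
if should_merge([first, second], min_accesses=10, min_leaves=2):
    print(merged_key_range([first, second]))
```

The sketch only decides whether to merge and computes the resulting key range; restructuring the tree nodes themselves follows the prior-art procedure that the text does not detail.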
  • the sequence numbers of the foregoing processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • a storage device according to an embodiment of the present application will be described below with reference to FIG. 6 to FIG. 8.
  • the technical features described in the method embodiments may be applied to the following device embodiments.
  • FIG. 6 is a schematic block diagram of a storage device 600 in accordance with an embodiment of the present application.
  • the storage device includes a storage module 610 and a processing module 620.
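  • the storage module 610 includes a first storage area and a second storage area, and the data read/write speed of the first storage area is higher than the data read/write speed of the second storage area. The processing module 620 is configured to:
  • search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information of data stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; write the first data from the first storage area to the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree includes information of the first data.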
  • the storage area in the storage device is partitioned, and the read/write performance of the first storage area is better than that of the second storage area. The nodes of the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers of the index tree. Compared with the prior-art approach of merging each level of the index tree layer by layer, the first subtree can therefore be found quickly during a data update by using the first key value of the data to be written, which improves the search speed. Moreover, in the present application, the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
  • the processing module 620 is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and the first data and the first A key value is written to the first storage area.
  • the newly written data is first written into the first storage area, and the read/write performance of the first storage area is superior, thereby increasing the data writing speed.
  • the manner of writing data into the first storage area and then transferring it from the first storage area to the second storage area also reduces the write amplification caused by updating data directly in the second storage area, thereby improving data storage performance.
  • the processing module 620 is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to the information of the second data included in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
  • the processing module 620 is specifically configured to: when the remaining storage space of the first storage area is less than a first threshold and the number of read/write operations of the first subtree meets a second threshold, write the first data from the first storage area to the second storage area.
  • the first storage area comprises a non-volatile storage medium.
  • FIG. 7 is a schematic block diagram of a storage device 700 in accordance with an embodiment of the present application.
  • the storage device may include the storage device 600 shown in FIG. 6, which may be, for example, a device for storing data, such as a computer, a server, or the like.
  • the storage device 700 includes a processor 710, a transceiver 720, and a memory 730, wherein the processor 710, the transceiver 720, and the memory 730 communicate with each other through an internal connection path.
  • the memory 730 is used to store data and instructions in the file, and the processor 710 is configured to execute instructions stored in the memory 730 to control the transceiver 720 to receive signals or transmit signals.
  • the memory 730 includes a first storage area and a second storage area. The data read/write speed of the first storage area is higher than the data read/write speed of the second storage area.
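  • the processor 710 is configured to:
  • search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information of data stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; write the first data from the first storage area to the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree includes information of the first data.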
  • the storage area in the storage device is partitioned, and the read/write performance of the first storage area is better than that of the second storage area. The nodes of the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers of the index tree. Compared with the prior-art approach of merging each level of the index tree layer by layer, the first subtree can therefore be found quickly during a data update by using the first key value of the data to be written, which improves the search speed. Moreover, in the present application, the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
  • the processor 710 is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and the first data and the first A key value is written to the first storage area.
  • the processor 710 is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to the information of the second data included in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
  • the processor 710 is specifically configured to: when the remaining storage space of the first storage area is less than a first threshold and the number of read/write operations of the first subtree meets a second threshold, write the first data from the first storage area to the second storage area.
  • the first storage area comprises a non-volatile storage medium.
  • the processor 710 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 730 can include read only memory and random access memory and provides instructions and data to the processor 710. A portion of the memory 730 may also include a non-volatile random access memory.
  • each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 710 or an instruction in a form of software.
  • the steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor 710.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in memory 730, and processor 710 reads the information in memory 730 and, in conjunction with its hardware, performs the steps of the above method. To avoid repetition, it will not be described in detail here.
  • the storage device 700 according to the embodiment of the present application may correspond to the storage device that performs the method 200 described above and to the storage device 600 according to the embodiment of the present application, and each unit or module in the storage device 700 is respectively used to perform the operations or processes performed by the storage device in the method 200. For brevity, a detailed description thereof is omitted here.
  • FIG. 8 is a schematic structural diagram of a system chip according to an embodiment of the present application.
  • the system chip 800 of FIG. 8 includes an input interface 801, an output interface 802, at least one processor 803, and a memory 804.
  • the input interface 801, the output interface 802, the processor 803, and the memory 804 are interconnected by an internal connection path.
  • the processor 803 is configured to execute code in the memory 804. When the code is executed, the processor 803 can implement the method 200 performed by the storage device in a method embodiment. For the sake of brevity, it will not be repeated here.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or in another form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a data updating method and a storage device. The method is performed by a storage device comprising a first storage area and a second storage area, where the data read/write speed of the first storage area is higher than that of the second storage area. The method comprises: searching an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, wherein the first data is stored in the first storage area, the index tree comprises M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, and the leaf nodes of the first subtree comprise information of data stored in the second storage area; writing the first data from the first storage area to the second storage area; and updating the first subtree according to the first key value, wherein a first leaf node of the updated first subtree comprises information of the first data. The data updating method provided by the present application can improve the storage performance of the storage device.

Description

Data update method and storage device
Technical field
The present application relates to the field of computer storage, and in particular, to a data update method and a storage device.
Background
In a storage system that supports key-value (Key Value, KV) pairs, the value corresponding to a key can be quickly determined by looking up the key, which enables large-scale real-time service processing. The Log Structure Merge Tree (LSM Tree) is the main algorithmic structure of KV databases. When data needs to be updated, random writes can be turned into sequential writes through layer-by-layer merging. However, because storage devices such as disks store data in units of blocks, every read or write operation is performed in units of blocks. During layer-by-layer merging, the entire data block related to the data must therefore be read into memory, merged with the data, and then written back to the disk. This causes severe write amplification in the system and further limits improvements in data storage performance.
Summary of the invention
The present application provides a data update method and a storage device that can improve the efficiency of reading and writing data.
In a first aspect, a data update method is provided. The method is performed by a storage device that includes a first storage area and a second storage area, where the data read/write speed of the first storage area is higher than that of the second storage area. The method includes: searching an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes of the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information of data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; writing the first data from the first storage area to the second storage area; and updating the first subtree according to the first key value, where a first leaf node of the updated first subtree contains information of the first data.
In the data update method provided by the present application, the storage area in the storage device is partitioned, and the read/write performance of the first storage area is better than that of the second storage area. The nodes of the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers of the index tree. Compared with the prior-art approach of merging each level of the index tree layer by layer, the first subtree can therefore be found quickly during a data update by using the first key value of the data to be written, which improves the search speed. Moreover, in the present application, the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
It should be understood that the first n layers of the index tree are stored in the first storage area and the root node of the first subtree is located in the first n layers of the index tree. Because n is less than or equal to the total number of layers M of the index tree, the root node of the first subtree is also located in the first storage area. In particular, the root node of the first subtree is located in the n-th layer of the index tree; that is, among the first n layers of the index tree stored in the first storage area, the root node of the first subtree is in the last of these n layers. For example, the total number of layers of the index tree is 5, the first 3 layers are stored in the first storage area, and the root node of the first subtree is located in the third layer of the index tree.
It should also be understood that the first subtree corresponding to the first key value means that the first key value lies within the key value range corresponding to the first subtree.
Optionally, the information of the first data may include at least one of the following: the value of the first data, the first key value of the first data, an address (or link) of the first data, an address (or link) of the first key value, and the like.
Optionally, the information of the second data may include at least one of the following: the value of the second data, the second key value of the second data, an address (or link) of the second data, an address (or link) of the second key value, and the like.
Optionally, in an implementation of the first aspect, the method further includes: receiving a write request, where the write request contains the first data to be written and the first key value; and writing the first data and the first key value into the first storage area.
In this way, newly written data is first written into the first storage area. Because the first storage area has better read/write performance, the data write speed is increased. In addition, writing data into the first storage area first and then transferring it from the first storage area to the second storage area reduces the write amplification caused by updating data directly in the second storage area, which improves data storage performance.
Optionally, in an implementation of the first aspect, the method further includes: receiving a read request, where the read request contains a second key value of second data; when the second data is not found in the first storage area according to the second key value, searching the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and reading the second data from the second storage area according to the information of the second data contained in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
Optionally, in an implementation of the first aspect, writing the first data from the first storage area to the second storage area includes: when the remaining storage space of the first storage area is less than a first threshold and the number of read/write operations of the first subtree meets a second threshold, writing the first data from the first storage area to the second storage area.
Optionally, if, across multiple read operations on the index tree, the index range of each read operation covers multiple subtrees, and those subtrees include a certain number of cold subtrees (subtrees whose access frequency is too low) or subtrees with few leaf nodes, the subtrees may be merged. Before they are merged, the data indexed by the key values corresponding to these subtrees needs to be written from the first storage area into the second storage area.
In a second aspect, a storage apparatus is provided. The storage apparatus may be used to perform the processes of the method described in the first aspect and its implementations. The storage apparatus includes a storage module and a processing module. The storage module includes a first storage area and a second storage area, and the data read/write speed of the first storage area is higher than that of the second storage area. The processing module is configured to: search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes in the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; write the first data from the first storage area into the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree contains information about the first data.
Optionally, in an implementation of the second aspect, the processing module is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and write the first data and the first key value into the first storage area.
Optionally, in an implementation of the second aspect, the processing module is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to information about the second data contained in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
Optionally, in an implementation of the second aspect, the processing module is specifically configured to: when the remaining storage space of the first storage area is less than a first threshold and the number of read and write operations on the first subtree satisfies a second threshold, write the first data from the first storage area into the second storage area.
Optionally, in an implementation of the second aspect, the first storage area includes a non-volatile storage medium.
In a third aspect, a storage device is provided. The storage device includes a transceiver, a processor, and a memory. The memory stores a program, and the processor executes the program to perform the processes of the data update method described in the first aspect and its implementations.
In a fourth aspect, a computer is provided, including a processor and a memory. The memory is configured to store computer-executable instructions, and the processor and the memory communicate with each other through an internal connection path. When the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the processes of the data update method described in the first aspect and its implementations.
In a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a program, and the program causes the foregoing apparatus to perform any data update method of the first aspect and its implementations.
In a sixth aspect, a system chip is provided. The system chip includes an input interface, an output interface, a processor, and a memory. The processor is configured to execute instructions stored in the memory, and when the instructions are executed, the processor can implement any method of the first aspect and its implementations.
Description of drawings
FIG. 1 is a schematic diagram of data storage in the prior art.
FIG. 2 is a schematic flowchart of a data update method according to an embodiment of this application.
FIG. 3 is a schematic flowchart of a data update method according to another embodiment of this application.
FIG. 4 is a schematic flowchart of a data update method according to another embodiment of this application.
FIG. 5 is a schematic architecture diagram of a storage apparatus according to an embodiment of this application.
FIG. 6 is a schematic block diagram of a storage apparatus according to an embodiment of this application.
FIG. 7 is a schematic structural diagram of a storage device according to an embodiment of this application.
FIG. 8 is a schematic structural diagram of a system chip according to an embodiment of this application.
Detailed description
The technical solutions in this application are described below with reference to the accompanying drawings.
It should be understood that the data update method described in the embodiments of this application may be applied to a storage system that supports key-value pairs. In such a system, data is stored in units of key-value pairs, and multiple key-value pairs are saved in a corresponding file. By looking up the Key of a key-value pair, the Value corresponding to that Key can be determined quickly, which makes large-scale real-time service processing possible.
In the prior art, when data needs to be updated, the updated data is first written sequentially to a disk log, and the update is then applied in the memory cache. As shown in FIG. 1, when the amount of data in memory reaches a certain threshold, it is dumped directly to a Level 0 file on disk. When the data in the Level 0 files accumulates to a certain extent, it is merged with the Level 1 files, the merged new file is stored at Level 1, and redundant data is deleted. When the data in the Level 1 files accumulates to a certain extent, it is merged with the Level 2 files, the merged new file is stored at Level 2, and redundant data is deleted. This continues level by level, producing fewer files that each hold more data. Such layer-by-layer merging causes severe write amplification. For example, if only a small amount of data in an upper layer needs to be merged into the next layer, ideally only that small amount would be written to the next layer. However, because storage devices such as disks store data in units of blocks, every read and write operation is performed on whole data blocks. The entire data block related to the data must therefore be read into memory, merged with the data, and written back to the disk, so the amount of data written is a full data block. Layer-by-layer merging thus causes severe write amplification, which limits further improvement of data storage performance.
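To make the block-rewrite cost concrete, the ratio between bytes physically written and bytes logically changed can be computed directly. The sketch below is illustrative only; the 4 KiB block size and the 128-byte update are assumed figures, not values from this application.

```python
def write_amplification(update_bytes: int, block_bytes: int) -> float:
    """Bytes physically written per byte logically updated, for one block rewrite."""
    if update_bytes <= 0 or update_bytes > block_bytes:
        raise ValueError("update must be between 1 byte and one block")
    return block_bytes / update_bytes

# Hypothetical figures: a 4 KiB block rewritten to apply a 128-byte update.
print(write_amplification(128, 4096))  # 32.0
```

The figure applies to a single merge step. When the same data is merged again at each lower level, every level adds its own block rewrites.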
For a read operation, when a piece of data needs to be read, the Key corresponding to the data is first looked up in the memory table (memtable). If the Key is not found there, the index files of each layer's data, namely the sorted string tables (Sorted String Table, SSTable), are checked layer by layer in reverse order until the Key is found. Each SSTable is sorted, and lookups become slower as the number of SSTables grows. The time complexity is O(K log N), where K is the number of SSTable files and N is the average size of an SSTable. The complexity of the read operation therefore also limits the improvement of data storage performance.
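The prior-art lookup just described can be sketched as a memtable check followed by one binary search per SSTable, which is where the O(K log N) cost comes from. This is a simplified model under stated assumptions: each SSTable is represented as a pair of parallel lists (sorted keys, values), the tables are ordered newest first, and the function names are hypothetical.

```python
import bisect

def sstable_get(keys, values, key):
    """Binary search in one sorted table: O(log N)."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None

def lsm_get(memtable, sstables, key):
    """Check the memtable, then every SSTable from newest to oldest: O(K log N)."""
    if key in memtable:
        return memtable[key]
    for keys, values in sstables:  # assumed to be ordered newest first
        found = sstable_get(keys, values, key)
        if found is not None:
            return found
    return None
```

A key that is absent, or present only in the oldest table, forces a search of all K tables. This is the worst case that the index tree in this application is meant to avoid.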
The embodiments of this application provide a data update method for improving storage performance. It should be understood that a write operation in the embodiments of this application may include writing new data (put) or updating data (update), and a read operation may include reading data (get) or performing a range query.
FIG. 2 is a schematic flowchart of a data update method according to an embodiment of this application. The method is performed by a storage device that includes a first storage area and a second storage area. The data read/write speed of the first storage area is higher than that of the second storage area. The storage device accesses the data stored in the second storage area through an index tree. The index tree includes M layers, and the first n layers of the index tree are stored in the first storage area, where M and n are both positive integers and n is less than or equal to the total number of layers M. The index tree may be any type of index tree, for example, a binary tree or a balanced multiway search tree (B-Tree, B+Tree). This is not limited in this application.
It should be understood that the first storage area may be, for example, storage-class memory (Storage-Class Memory, SCM) or another non-volatile storage medium that is addressable at byte granularity. The second storage area may be, for example, NAND flash or a hard disk drive (Hard Disk Drive, HDD). Because the first storage area uses a fast storage medium and the second storage area uses a slow storage medium, the data read/write speed of the first storage area is higher than that of the second storage area. That is, the access performance of the first storage area is better than that of the second storage area.
It should also be understood that the index tree is used to access data in the second storage area. Because the upper layers of the index tree are accessed more frequently, the first n layers of the index tree may be stored in the first storage area. When n is less than the total number of layers, part of the index tree is stored in the first storage area. When n equals the total number of layers, the entire index tree is stored in the first storage area. However, the first storage area (for example, SCM) is currently limited by manufacturing cost and storage capacity, so it may be preferable to store only part of the index tree, namely its first few layers, in the first storage area and the remaining layers in the second storage area. If in the future the first storage area (for example, SCM) becomes cheap enough and large enough, the entire index tree may also be stored in the first storage area. This is not limited here.
As shown in FIG. 2, the data update method may include the following steps:
In 210, an index tree is searched according to a first key value of first data to obtain a first subtree corresponding to the first key value.
The first data is stored in the first storage area. The index tree includes M layers, and the nodes in the first n layers of the index tree are stored in the first storage area. The root node of the first subtree is located in the first n layers of the index tree. The leaf nodes of the first subtree include information about data already stored in the second storage area. M and n are both positive integers, and n is less than or equal to M.
The nodes of an index tree are first described briefly. In an index tree, nodes fall into four categories: root node, leaf node, parent node, and child node. A child node is a node in the layer directly below its parent node. If a node has a node directly above it, that upper node is its parent node; if it has none, the node has no parent node. A node that has no child nodes is called a leaf node. A node that has no other node above it is called the root node. In the embodiments of this application, the two Chinese spellings of "node" (节点 and 结点) are used interchangeably.
It should be understood that the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, and n is less than or equal to the total number of layers M. The root node of the first subtree is therefore also located in the first storage area. In particular, the root node of the first subtree is located in the n-th layer of the index tree. That is, among the first n layers of the index tree located in the first storage area, the root node of the first subtree is in the last of these n layers. For example, if the index tree has 5 layers in total and the first 3 layers are stored in the first storage area, the root node of the first subtree is located in the third layer of the index tree.
It should also be understood that the first subtree corresponding to the first key value is the subtree whose key value range contains the first key value.
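Finding the subtree whose key range contains a given key amounts to a search over the range boundaries of the layer-n nodes. The sketch below flattens those nodes into a sorted list of lower bounds. The class name `SubtreeDirectory` and this flat representation are assumptions made for illustration; an actual index tree would reach the same node by descending from the root.

```python
import bisect

class SubtreeDirectory:
    """Hypothetical flat view of the layer-n nodes, each of which roots one subtree."""

    def __init__(self, lower_bounds):
        # lower_bounds[i] is the smallest key covered by subtree i; must be sorted.
        self.lower_bounds = list(lower_bounds)

    def find_subtree(self, key):
        """Return the index of the subtree whose key range contains the key."""
        i = bisect.bisect_right(self.lower_bounds, key) - 1
        if i < 0:
            raise KeyError(f"{key!r} is below the smallest indexed key")
        return i

# Three subtrees covering [0, 100), [100, 200) and [200, ...).
directory = SubtreeDirectory([0, 100, 200])
print(directory.find_subtree(150))  # 1
```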
It should also be understood that, in a storage system that supports key-value pairs, data may be stored in the form of key-value pairs. In the embodiments of this application, the data stored in the first storage area and the second storage area may include the value of the data and the index corresponding to the data. For example, the first data mentioned here includes the value of the first data and the index of the first data. The index of the first data is the Key of the key-value pair, and the value of the first data is the Value of the key-value pair. The Value corresponding to a Key can be found quickly through the Key of the first data. Different data may be managed through an index tree. When data is read or written, the index tree can be used to first determine, from the index of the data, the data block in which the corresponding data is located, and thereby access the data.
For example, a student management database stores student numbers and names, where the student number is the Key and the name is the Value. To write new data into the student management database, put(0600100, Chen Meiling) is executed, which adds the key-value pair (0600100, Chen Meiling) to the database.
To update data in the student management database, assume that (0600100, Chen Meiling) has already been written, that is, the key-value pair (0600100, Chen Meiling) has already been added to the database. If Chen Meiling now changes her name to Chen Meili, (0600100, Chen Meiling) needs to be updated to (0600100, Chen Meili) by executing update(0600100, Chen Meili). The entry (0600100, Chen Meiling) in the database then becomes (0600100, Chen Meili). An update operation changes the Value corresponding to an existing Key, whereas a write operation adds a new key-value pair (Key, Value).
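The distinction between put and update in the student example can be illustrated with a small in-memory store. The class `StudentStore` and its error handling are assumptions for illustration only; the application does not specify what happens when put is called on an existing Key or update on a missing one.

```python
class StudentStore:
    """Minimal key-value store keyed by student number (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Add a new key-value pair; the key must not already exist."""
        if key in self._data:
            raise KeyError(f"{key} already exists; use update")
        self._data[key] = value

    def update(self, key, value):
        """Replace the value of an existing key."""
        if key not in self._data:
            raise KeyError(f"{key} does not exist; use put")
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

store = StudentStore()
store.put("0600100", "Chen Meiling")
store.update("0600100", "Chen Meili")
print(store.get("0600100"))  # Chen Meili
```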
In 220, the first data is written from the first storage area into the second storage area.
Specifically, the storage device may write the first data stored in the first storage area into the second storage area. In other words, the data in the first storage area that is indexed by the key values corresponding to the first subtree and the data in the second storage area that is indexed by the key values corresponding to the first subtree may be merged. That is, the data is reordered according to the index order of the data in the first storage area and the second storage area, and is then stored again. The merged data is stored in the second storage area. Because the data in the first storage area is transferred to the second storage area, the storage space in the first storage area can be released.
Optionally, in 220, writing the first data from the first storage area into the second storage area includes: when the remaining storage space of the first storage area is less than a first threshold and the number of read and write operations on the first subtree satisfies a second threshold, writing the first data from the first storage area into the second storage area.
Specifically, the time at which the first data is written from the first storage area into the second storage area may be determined by the remaining storage space of the first storage area at that moment. When the remaining storage space of the first storage area is insufficient, for example less than the first threshold, data transfer is started, that is, data is written from the first storage area into the second storage area. At this point, the storage device first selects, from all subtrees whose root nodes are located in the first n layers (for example, in the n-th layer), at least one subtree that satisfies a condition. The at least one subtree includes, for example, the first subtree described above. The storage device then writes data in the first storage area into the second storage area according to the at least one subtree. In other words, only when the remaining storage space of the first storage area is insufficient is each subtree in the index tree checked against the condition, so that the data indexed by the key values corresponding to the subtrees whose root nodes are in the n-th layer and that satisfy the condition is written from the first storage area into the second storage area.
The condition satisfied by the at least one subtree may include, for example, any one of the following: the storage space that can be released by writing the data indexed by the key values corresponding to the subtree from the first storage area into the second storage area is greater than a preset threshold; the ratio of the number of write operations on the subtree to the number of read operations on the subtree is greater than a preset threshold; or the read operation frequency and/or write operation frequency of the subtree is less than a preset threshold. These three conditions are described below using the first subtree as an example.
Mode 1
The storage space that can be released by performing the intra-subtree merge operation on the first subtree is greater than a preset threshold.
Specifically, the primary purpose of writing the first data from the first storage area into the second storage area is to release storage space in the fast storage area (the first storage area). When the fast storage area runs short of space, the decision as to which subtrees' data to write from the first storage area into the second storage area may therefore be based on how much storage space would be released by moving the data indexed by the key values corresponding to each subtree. For example, if writing the data (including the first data) indexed by the key values corresponding to the first subtree (including the first key value) from the first storage area into the second storage area would release more storage space than a preset threshold, the storage device may write that data from the first storage area into the second storage area.
Mode 2
The ratio of the number of write operations on the first subtree to the number of read operations on the first subtree is greater than a preset threshold.
Specifically, subtrees with many write operations and few read operations may be selected where possible, and the data indexed by the key values corresponding to such subtrees is written from the first storage area into the second storage area. For example, the ratio of the number of write operations on the first subtree to the number of read operations on the first subtree is used as the measure. If this ratio is greater than a preset threshold, the data (including the first data) indexed by the key values corresponding to the first subtree (including the first key value) is written from the first storage area into the second storage area. Read operations are faster in the first storage area and slower in the second storage area. If the data of the first subtree held in the first storage area is read frequently, that data should be kept in the first storage area where possible, which speeds up the read operations of the system.
It should be understood that a write operation on the first subtree is one in which the key value of the written data falls within the key value range corresponding to the first subtree, and a read operation on the first subtree is one in which the key value of the read data falls within the key value range corresponding to the first subtree.
Mode 3
The read operation frequency and/or write operation frequency of the first subtree is less than a preset threshold.
Specifically, there may also be subtrees on which both read and write operations are rare. The read operation frequency and/or write operation frequency of such a subtree is very low, that is, the subtree is in a cold state. The data indexed by the key values corresponding to such subtrees may also be written from the first storage area into the second storage area. For example, if the read operation frequency and/or write operation frequency of the first subtree is less than a preset threshold, the data (including the first data) indexed by the key values corresponding to the first subtree (including the first key value) may be written from the first storage area into the second storage area.
Optionally, in a specific implementation, the storage device may update input/output (Input/Output, I/O) statistics after each write or read operation. The I/O statistics may include, for example, at least one of the following: for each subtree, the total number of write operations (including put and update operations); the number of read (get) operations; the index range of read operations; the number of queries on that index range; and a timestamp record for each operation, for example, the time at which a write operation, an update operation, or a read operation was performed.
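The per-subtree statistics and the three selection conditions (Modes 1 to 3) can be combined into a small selection routine. The sketch below is one possible reading, not the implementation of this application. The field names, the threshold parameter names, and the choice to measure "frequency" as combined reads and writes per second are all assumptions, and the text leaves open whether reads and writes are counted together or separately.

```python
from dataclasses import dataclass

@dataclass
class SubtreeStats:
    writes: int = 0          # put + update operations on keys in this subtree
    reads: int = 0           # get operations on keys in this subtree
    buffered_bytes: int = 0  # bytes held for this subtree in the first storage area

def should_flush(stats, elapsed_seconds, *, min_release_bytes,
                 min_write_read_ratio, max_ops_per_second):
    """Return True if any one of the three conditions holds."""
    # Mode 1: flushing would release more space than the threshold.
    frees_enough = stats.buffered_bytes > min_release_bytes
    # Mode 2: write-heavy subtree (max() guards against division by zero).
    write_heavy = stats.writes / max(stats.reads, 1) > min_write_read_ratio
    # Mode 3: cold subtree, measured here as combined operations per second.
    cold = (stats.reads + stats.writes) / elapsed_seconds < max_ops_per_second
    return frees_enough or write_heavy or cold

def pick_subtrees(all_stats, free_bytes, first_threshold, elapsed_seconds, **limits):
    """Evaluate subtrees only when the fast area is short of space."""
    if free_bytes >= first_threshold:
        return []
    return [subtree_id for subtree_id, stats in all_stats.items()
            if should_flush(stats, elapsed_seconds, **limits)]
```

Gating the per-subtree checks on `free_bytes < first_threshold` mirrors the description: the conditions are evaluated only once the remaining space in the first storage area falls below the first threshold.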
In 230, the first subtree is updated according to the first key value.
A first leaf node of the updated first subtree contains information about the first data.
Specifically, after writing the first data from the first storage area into the second storage area, the storage device also updates the first subtree according to the first key value, so that a first leaf node of the updated first subtree includes information about the first data. In addition, the first leaf node may be any leaf node in the first subtree.
Optionally, the information about the first data may include at least one of the following: the value of the first data, the first key value of the first data, the address (or link) of the first data, and the address (or link) of the first key value.
Therefore, the storage in the storage device is partitioned such that the read/write performance of the first storage area is better than that of the second storage area. The nodes in the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers. Compared with the prior-art approach of merging the index at every level layer by layer, the first subtree corresponding to the first key value of the data to be written can be found quickly during a data update, which improves the search speed. In addition, in this application the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
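Steps 210 to 230 can be put together in one short routine. The following is a simplified sketch under assumed data structures, not the implementation of this application: `lower_bounds` is a sorted list of the smallest key of each subtree, `first_area` is a dictionary buffering key-value pairs in fast storage, `second_area` maps a subtree to its key-sorted data in slow storage, and `leaves` maps a subtree to the keys recorded in its leaf nodes. Keys are assumed to be no smaller than `lower_bounds[0]`.

```python
import bisect

def flush_for_key(first_key, lower_bounds, first_area, second_area, leaves):
    """Move the buffered data of the subtree that owns first_key to slow storage."""
    # Step 210: find the first subtree, whose key range contains first_key.
    subtree_id = bisect.bisect_right(lower_bounds, first_key) - 1
    # Collect every buffered pair whose key falls in the same subtree.
    moving = {k: v for k, v in first_area.items()
              if bisect.bisect_right(lower_bounds, k) - 1 == subtree_id}
    # Step 220: merge with the subtree's existing data and store it re-sorted by key.
    merged = dict(second_area.get(subtree_id, {}))
    merged.update(moving)  # buffered values replace older ones
    second_area[subtree_id] = dict(sorted(merged.items()))
    for k in moving:
        del first_area[k]  # release space in the first storage area
    # Step 230: refresh the leaf information so it covers the moved keys.
    leaves[subtree_id] = list(second_area[subtree_id])
    return subtree_id, len(moving)
```

Only the data under one subtree is rewritten, so the cost of the merge is bounded by the size of that subtree rather than by a whole level of files.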
可选地,如图3所示,在210之前,该方法还包括240和250。Optionally, as shown in FIG. 3, prior to 210, the method further includes 240 and 250.
在240中,接收写请求,该写请求中包含有待写入的该第一数据以及该第一key值。At 240, a write request is received, the write request including the first data to be written and the first key value.
In 250, the first data and the first key value are written into the first storage area.
Specifically, when performing a write operation, the storage device may first store the received data in the first storage area for temporary management. The first storage area does not write data by layer-by-layer merging. Instead, it may, for example, write data at byte granularity. Writing data to the first storage area is therefore significantly faster than writing data to the second storage area, and the write amplification problem is avoided.
In this way, newly written data is first written into the first storage area, and because the first storage area has better read/write performance, the data write speed improves. Moreover, moving the data from the first storage area to the second storage area only after it has been written into the first storage area reduces the write amplification caused by updating data directly in the second storage area, which improves data storage performance.
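The write path of 240 and 250 can be illustrated with a minimal sketch. It is not the implementation of this application: the class name TwoTierStore and the use of in-memory dictionaries to stand in for the two storage areas are assumptions made purely for illustration.

```python
class TwoTierStore:
    """Toy model of a fast first storage area in front of a slow second storage area.

    Both areas are plain dictionaries here (an assumption for illustration);
    in the description they would be, for example, SCM and NAND flash or HDD.
    """

    def __init__(self):
        self.first_area = {}   # key -> value, buffered at fine granularity
        self.second_area = {}  # key -> value, only updated by a later merge

    def write(self, key, value):
        # 240: the write request carries the data and its key value.
        # 250: both are stored in the first storage area only; the second
        # storage area is not touched on the write path.
        self.first_area[key] = value


store = TwoTierStore()
store.write(15, "first data")
```

After the call, the data is held only in first_area. It reaches second_area later, when the storage device merges the corresponding subtree.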
Optionally, as shown in FIG. 4, the method further includes:
In 260, a read request is received, where the read request includes a second key value of second data.
In 270, when the second data is not found in the first storage area according to the second key value, the index tree is searched according to the second key value to obtain a second subtree corresponding to the second key value.
The root node of the second subtree is located in the first n layers of the index tree.
In 280, the second data is read from the second storage area according to the information about the second data contained in a second leaf node of the second subtree.
The second leaf node is the leaf node found according to the second key value. Optionally, the information about the second data includes at least one of the following: the value of the second data, the second key value of the second data, the address of (or a link to) the second data, or the address of (or a link to) the second key value.
Specifically, when data needs to be read, the storage device first searches the first storage area for the second data according to the second key value, and if the second data is found there, reads it directly. The first storage area also holds some data, and as long as that data has not yet been written to the second storage area, the storage device can find it in the first storage area. Because data access to the first storage area is significantly faster than to the second storage area, the data can be read quickly.
If data in the first storage area, for example the second data, has already been written to the second storage area, the storage device may not find an index for the second data in the first storage area. In that case, the storage device searches the index tree according to the second key value to obtain the second subtree corresponding to the second key value, and reads the second data from the second storage area according to the information about the second data contained in the second leaf node of the second subtree.
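The read path of 260 to 280 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of this application: the Node class, the rule that a key equal to a separator descends into the right-hand child, and the block addresses "blk0" to "blk2" are all hypothetical.

```python
from bisect import bisect_right


class Node:
    """Index-tree node (hypothetical layout for illustration)."""

    def __init__(self, keys, children=None, entries=None):
        self.keys = keys              # separator keys of an internal node
        self.children = children      # child nodes, or None for a leaf
        self.entries = entries or {}  # leaf only: key -> address in the second area


def find_leaf(node, key):
    # Descend from the root; a key equal to a separator goes right (assumption).
    while node.children is not None:
        node = node.children[bisect_right(node.keys, key)]
    return node


def read(key, first_area, root, second_area):
    # 260/270: look in the fast first storage area first.
    if key in first_area:
        return first_area[key]
    # 270: otherwise search the index tree for the leaf covering the key.
    leaf = find_leaf(root, key)
    # 280: use the leaf's information (here an address) to read the second area.
    address = leaf.entries.get(key)
    return None if address is None else second_area[address]


leaf_f = Node([], entries={3: "blk0"})
leaf_g = Node([], entries={5: "blk1", 6: "blk2"})
root = Node([5], children=[leaf_f, leaf_g])
second_area = {"blk0": "Value 3", "blk1": "Value 5", "blk2": "Value 6"}
first_area = {1: "Value 1"}
```

With this data, key 1 is served from the first storage area, while key 3 is resolved through leaf_f and read from the second storage area.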
The data updating method of the embodiments of this application is described below with a detailed example, with reference to FIG. 5. FIG. 5 is a schematic diagram of the data updating method according to an embodiment of this application. FIG. 5 shows a first storage area and a second storage area. The first storage area is a first storage medium, SCM, and the second storage area is NAND flash or an HDD. The first storage area includes two parts, a data area and an index area: the data area stores data, and the index area stores the index tree. The index tree is assumed here to be a B+ tree, a balanced multiway search tree. The second storage area stores the index tree and data. It should be understood that the first storage area may use a single storage medium divided into a data area and an index area, or the data area and the index area may each use a different storage medium. This is not limited here.
In the storage device, the index tree used to look up data in the second storage area has 5 layers in total. The first 3 layers are stored in the index area of the first storage area, and the last 2 layers are stored in the second storage area. As shown in FIG. 5, node A is the root of the index tree. The child nodes of node A include node B and node P, the child nodes of node B include node C and node I, the child nodes of node C include node D and node E, and the child nodes of node D include node F, node G, and node H. Node F, node G, and node H are leaf nodes of the whole index tree.
When data is written, for example the first data, the first data is not written to the second storage area as in the prior art. Instead, it is written to the data area of the first storage area. Because writing to the first storage area is significantly faster than writing to the second storage area, the first data can be written quickly.
However, because every piece of data the user writes goes to the first storage area, the free space of the first storage area keeps shrinking. When the free space falls to a certain level, for example below a space threshold, the storage device writes data from the first storage area to the second storage area to release space in the first storage area. To decide which data to move, the storage device may select, from the multiple subtrees whose root nodes are located in layer 3, the subtrees that satisfy a certain condition, and write the data indexed by the key values corresponding to those subtrees from the first storage area to the second storage area. Specifically, the storage device may determine whether each subtree satisfies the merge condition according to the amount of storage space that would be released by writing the data indexed by the subtree's key values from the first storage area to the second storage area, or according to the number of read and write operations on each subtree.
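One possible form of this selection can be sketched as follows. The description leaves the exact condition open, so the function name subtrees_to_flush, its parameters, and the rule "flush every subtree that would release at least min_release bytes, largest first" are assumptions for illustration only. A criterion based on read/write counts could be substituted in the same place.

```python
def subtrees_to_flush(free_bytes, space_threshold, buffered_bytes, min_release):
    """Return the ids of subtrees whose buffered data should be moved.

    free_bytes      -- remaining space in the first storage area
    space_threshold -- flushing starts only below this amount of free space
    buffered_bytes  -- dict: subtree id -> bytes it occupies in the first area
    min_release     -- hypothetical merge condition: minimum bytes released
    """
    if free_bytes >= space_threshold:
        return []  # enough room left in the first storage area
    candidates = [sid for sid, size in buffered_bytes.items() if size >= min_release]
    # Flush the subtrees that release the most space first.
    return sorted(candidates, key=lambda sid: buffered_bytes[sid], reverse=True)


chosen = subtrees_to_flush(
    free_bytes=10,
    space_threshold=50,
    buffered_bytes={"first": 300, "second": 20, "third": 120},
    min_release=100,
)
```

Here chosen lists the subtrees "first" and "third", in that order, because only they would release at least 100 bytes.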
For example, as shown in FIG. 5, the subtrees of the index tree whose root nodes are located in layer 3 include a first subtree and a second subtree. When the storage space in the first storage area is insufficient, the storage device determines whether the first subtree and the second subtree satisfy a preset condition, for example whether the number of read and write operations on each subtree reaches a certain threshold. Assuming that the first subtree satisfies the preset condition, the storage device writes the data indexed by the key values corresponding to the first subtree from the first storage area to the second storage area. That is, the data belonging to the first subtree in the first storage area and the data belonging to the first subtree in the second storage area are read into memory and sorted by key, and the sorted data is then moved from memory to the second storage area and stored at the corresponding location there. This releases the storage space in the first storage area. As shown in FIG. 5, the data indexed by the key values corresponding to the first subtree comprises (Key 1, Value 1), (Key 2, Value 2), and (Key 4, Value 4) in the first storage area, and (Key 3, Value 3), (Key 5, Value 5), and (Key 6, Value 6) in the second storage area. After this data is reordered by key, the merged data is (Key 1, Value 1), (Key 2, Value 2), (Key 3, Value 3), (Key 4, Value 4), (Key 5, Value 5), and (Key 6, Value 6), and the merged data is stored in the second storage area.
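The merge in the FIG. 5 example can be reproduced with a short sketch. The function name merge_subtree_data is hypothetical, and the rule that the buffered value from the first storage area wins when both areas hold the same key is an assumption. The description does not state how duplicates are resolved.

```python
def merge_subtree_data(first_area_items, second_area_items):
    """Merge one subtree's key-value pairs from both areas, sorted by key."""
    merged = dict(second_area_items)
    # Assumption: the first storage area holds the newer write for a given key.
    merged.update(first_area_items)
    return sorted(merged.items())


first = [(1, "Value 1"), (2, "Value 2"), (4, "Value 4")]
second = [(3, "Value 3"), (5, "Value 5"), (6, "Value 6")]
merged = merge_subtree_data(first, second)
```

For the data of FIG. 5, merged holds the six pairs in key order from Key 1 to Key 6, which is the sequence that is then written to the second storage area.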
Here, the data indexed by the key values corresponding to the first subtree means the data whose key values fall within the key range of the first subtree. For example, if the key range of the first subtree is 10 to 25 and the first key value of the first data is 15, the first key value lies within that range, so the data indexed by the key values corresponding to the first subtree includes the first data.
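The mapping from a key value to its subtree can be sketched as a range lookup. The 10 to 25 range comes from the example above. The second range, the inclusive bounds, and the names SUBTREE_RANGES and subtree_for_key are assumptions for illustration.

```python
from bisect import bisect_right

# (low, high, subtree) with inclusive bounds, sorted by low.
# Only the 10-25 range is taken from the text; the second entry is hypothetical.
SUBTREE_RANGES = [(10, 25, "first subtree"), (26, 40, "second subtree")]


def subtree_for_key(key, ranges=SUBTREE_RANGES):
    """Return the subtree whose key range contains key, or None."""
    starts = [low for low, _, _ in ranges]
    i = bisect_right(starts, key) - 1  # last range starting at or before key
    if i >= 0 and key <= ranges[i][1]:
        return ranges[i][2]
    return None


owner = subtree_for_key(15)
```

With these ranges, owner is "first subtree", so data with key value 15 is among the data moved when the first subtree is merged.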
It should be understood that, in FIG. 5, the data corresponding to the first subtree in the first and second storage areas may be stored in data blocks as key-value pairs (Key, Value), as shown in the figure. For example, the data in the dashed box at the lower left of the second storage area is stored as the key-value pair (Key 3, Value 3). In some cases, however, because leaf node F already stores the key Key 3 of this data, the data block may store only Value 3. That is, the dashed box at the lower left of the second storage area stores only Value 3 rather than the complete pair (Key 3, Value 3). The embodiments of this application place no limitation on the form in which data is stored in data blocks.
It should also be understood that the data temporarily managed in the first storage area may itself be managed through an index tree. For example, the small shaded squares in the index area of the first storage area in FIG. 5 may represent three child nodes whose parent is node C. These three child nodes each manage data in a different index range, and may have further child nodes that are not drawn here. The index tree rooted at node A described above is used to access data in the second storage area, and is a different index tree from the one used here to temporarily manage data in the first storage area. Because the amount of data stored in the first storage area is generally small and is usually read and written at byte granularity, the data in the first storage area may instead be managed without an index tree, in a simpler form such as a table or a list. This is not limited in the embodiments of this application. Unless otherwise specified, the index tree in the embodiments of this application refers to the index tree used to access data in the second storage area, that is, the 5-layer index tree rooted at node A in FIG. 5.
The write operation under the data updating method of this application has been illustrated above with reference to FIG. 5. The read operation is illustrated below, also with reference to FIG. 5.
For example, to look up the second data (Key 3, Value 3), the storage device first searches the index area of the first storage area for the second key value, Key 3, of the second data to be read. If Key 3 is found in the first storage area, the second data (Key 3, Value 3) is read from the data area of the first storage area according to Key 3. If, as shown in FIG. 5, Key 3 cannot be found in the first storage area, the storage device searches the index tree layer by layer, starting from the root node A, for the leaf node that contains Key 3. Assuming that Key 3 is found in leaf node F in the second storage area, the data (Key 3, Value 3) is read from the data block in the second storage area that corresponds to Key 3.
Optionally, if across multiple read operations on the index tree the index range of each read operation covers multiple subtrees, and those subtrees include a certain number of cold subtrees (subtrees that are accessed too infrequently) or subtrees with few leaf nodes, the subtrees may be merged. Before the subtrees are merged, the data indexed by the key values corresponding to them needs to be written from the first storage area to the second storage area. The process of merging multiple subtrees is the same as in the prior art and, for brevity, is not described again here.
Still taking FIG. 5 as an example, if across multiple read operations on the index tree the index range of each read operation covers both the first subtree and the second subtree, and at least one of the two is a cold subtree or has fewer than a certain number of leaf nodes, the first subtree and the second subtree may be merged. By that point, the data indexed by the key values corresponding to the first subtree and the second subtree has already been written from the first storage area to the second storage area. After the merge, the first subtree and the second subtree form a new subtree, whose parent node is still node B.
It should be understood that, in the various embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiments of this application in any way.
A storage apparatus according to embodiments of this application is described below with reference to FIG. 6 to FIG. 8. The technical features described in the method embodiments also apply to the following apparatus embodiments.
FIG. 6 is a schematic block diagram of a storage apparatus 600 according to an embodiment of this application. As shown in FIG. 6, the storage apparatus includes a storage module 610 and a processing module 620. The storage module 610 includes a first storage area and a second storage area, and the data read/write speed of the first storage area is higher than that of the second storage area. The processing module 620 is configured to:
search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes in the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; write the first data from the first storage area to the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree contains information about the first data.
Therefore, the storage of the storage device is partitioned into areas, where the read/write performance of the first storage area is better than that of the second storage area. The nodes in the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers. As a result, compared with the prior-art approach of merging the index trees of every level layer by layer, the first subtree corresponding to the first key value of the data to be written can be found quickly during a data update, which improves lookup speed. In addition, in this application the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
Optionally, the processing module 620 is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and write the first data and the first key value into the first storage area.
In this way, newly written data is first written into the first storage area, and because the first storage area has better read/write performance, the data write speed improves. Moreover, moving the data from the first storage area to the second storage area only after it has been written into the first storage area reduces the write amplification caused by updating data directly in the second storage area, which improves data storage performance.
Optionally, the processing module 620 is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to the information about the second data contained in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
Optionally, the processing module 620 is specifically configured to: when the remaining storage space of the first storage area is less than a first threshold and the number of reads and writes of the first subtree satisfies a second threshold, write the first data from the first storage area to the second storage area.
Optionally, the first storage area includes a non-volatile storage medium.
FIG. 7 is a schematic block diagram of a storage device 700 according to an embodiment of this application. The storage device may include the storage apparatus 600 shown in FIG. 6, and may be, for example, a computer, a server, or another device used to store data. As shown in FIG. 7, the storage device 700 includes a processor 710, a transceiver 720, and a memory 730, which communicate with one another through an internal connection path. The memory 730 is configured to store the data of files as well as instructions, and the processor 710 is configured to execute the instructions stored in the memory 730 to control the transceiver 720 to receive or send signals. The memory 730 includes a first storage area and a second storage area, and the data read/write speed of the first storage area is higher than that of the second storage area. The processor 710 is configured to:
search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, where the first data is stored in the first storage area, the index tree includes M layers, the nodes in the first n layers of the index tree are stored in the first storage area, the root node of the first subtree is located in the first n layers of the index tree, the leaf nodes of the first subtree include information about data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M; write the first data from the first storage area to the second storage area; and update the first subtree according to the first key value, where a first leaf node of the updated first subtree contains information about the first data.
Therefore, the storage of the storage device is partitioned into areas, where the read/write performance of the first storage area is better than that of the second storage area. The nodes in the first n layers of the index tree are stored in the first storage area, and the root node of the first subtree of the index tree is also located in the first n layers. As a result, compared with the prior-art approach of merging the index trees of every level layer by layer, the first subtree corresponding to the first key value of the data to be written can be found quickly during a data update, which improves lookup speed. In addition, in this application the data in the first storage area and the second storage area can be merged on the basis of the first subtree, which reduces the write amplification caused by layer-by-layer merging in the prior art.
Optionally, the processor 710 is further configured to: receive a write request, where the write request includes the first data to be written and the first key value; and write the first data and the first key value into the first storage area.
Optionally, the processor 710 is further configured to: receive a read request, where the read request includes a second key value of second data; when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, where the root node of the second subtree is located in the first n layers of the index tree; and read the second data from the second storage area according to the information about the second data contained in a second leaf node of the second subtree, where the second leaf node is the leaf node found according to the second key value.
Optionally, the processor 710 is specifically configured to: when the remaining storage space of the first storage area is less than a first threshold and the number of reads and writes of the first subtree satisfies a second threshold, write the first data from the first storage area to the second storage area.
Optionally, the first storage area includes a non-volatile storage medium.
It should be understood that, in the embodiments of this application, the processor 710 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 730 may include read-only memory and random access memory, and provides instructions and data to the processor 710. Part of the memory 730 may also include non-volatile random access memory.
In implementation, the steps of the foregoing method may be completed by integrated logic circuits of hardware in the processor 710 or by instructions in the form of software. The steps of the method disclosed in the embodiments of this application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor 710. A software module may reside in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory 730. The processor 710 reads the information in the memory 730 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, details are not described here.
The storage device 700 according to the embodiments of this application may correspond to the storage device that performs the method 200 described above and to the storage apparatus 600 according to the embodiments of this application, and the units or modules of the storage device 700 are respectively configured to perform the actions or processing performed by the storage device in the method 200. To avoid repetition, a detailed description is omitted here.
FIG. 8 is a schematic structural diagram of a system chip according to an embodiment of this application. The system chip 800 in FIG. 8 includes an input interface 801, an output interface 802, at least one processor 803, and a memory 804, which are interconnected through an internal connection path. The processor 803 is configured to execute code in the memory 804. When the code is executed, the processor 803 can implement the method 200 performed by the storage device in the method embodiments. For brevity, details are not described here.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or by software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such an implementation should not be considered to go beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus, and units described above may be found in the corresponding processes of the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (12)

  1. A data updating method, performed by a storage device that includes a first storage area and a second storage area, wherein a data read/write speed of the first storage area is higher than a data read/write speed of the second storage area, the method comprising:
    searching an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, wherein the first data is stored in the first storage area, the index tree comprises M layers, nodes in the first n layers of the index tree are stored in the first storage area, a root node of the first subtree is located in the first n layers of the index tree, leaf nodes of the first subtree include information about data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M;
    writing the first data from the first storage area to the second storage area; and
    updating the first subtree according to the first key value, wherein a first leaf node of the updated first subtree includes information about the first data.
  2. The method according to claim 1, wherein the method further comprises:
    receiving a write request, wherein the write request includes the first data to be written and the first key value; and
    writing the first data and the first key value to the first storage area.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    receiving a read request, wherein the read request includes a second key value of second data;
    when the second data is not found in the first storage area according to the second key value, searching the index tree according to the second key value to obtain a second subtree corresponding to the second key value, wherein a root node of the second subtree is located in the first n layers of the index tree; and
    reading the second data from the second storage area according to information about the second data included in a second leaf node of the second subtree, wherein the second leaf node is the leaf node found according to the second key value.
  4. The method according to any one of claims 1 to 3, wherein the writing the first data from the first storage area to the second storage area comprises:
    when the remaining storage space of the first storage area is less than a first threshold and a read/write count of the first subtree satisfies a second threshold, writing the first data from the first storage area to the second storage area.
  5. The method according to any one of claims 1 to 4, wherein the first storage area comprises a non-volatile storage medium.
  6. A storage apparatus, comprising a storage module and a processing module, wherein the storage module comprises a first storage area and a second storage area, a data read/write speed of the first storage area is higher than a data read/write speed of the second storage area, and the processing module is configured to:
    search an index tree according to a first key value of first data to obtain a first subtree corresponding to the first key value, wherein the first data is stored in the first storage area, the index tree comprises M layers, nodes in the first n layers of the index tree are stored in the first storage area, a root node of the first subtree is located in the first n layers of the index tree, leaf nodes of the first subtree include information about data already stored in the second storage area, M and n are both positive integers, and n is less than or equal to M;
    write the first data from the first storage area to the second storage area; and
    update the first subtree according to the first key value, wherein a first leaf node of the updated first subtree includes information about the first data.
  7. The storage apparatus according to claim 6, wherein the processing module is further configured to:
    receive a write request, wherein the write request includes the first data to be written and the first key value; and
    write the first data and the first key value to the first storage area.
  8. The storage apparatus according to claim 6 or 7, wherein the processing module is further configured to:
    receive a read request, wherein the read request includes a second key value of second data;
    when the second data is not found in the first storage area according to the second key value, search the index tree according to the second key value to obtain a second subtree corresponding to the second key value, wherein a root node of the second subtree is located in the first n layers of the index tree; and
    read the second data from the second storage area according to information about the second data included in a second leaf node of the second subtree, wherein the second leaf node is the leaf node found according to the second key value.
  9. The storage apparatus according to any one of claims 6 to 8, wherein the processing module is specifically configured to:
    when the remaining storage space of the first storage area is less than a first threshold and a read/write count of the first subtree satisfies a second threshold, write the first data from the first storage area to the second storage area.
  10. The storage apparatus according to any one of claims 6 to 9, wherein the first storage area comprises a non-volatile storage medium.
  11. A computer, comprising a processor and a memory, wherein:
    the memory is configured to store computer-executable instructions, the processor and the memory communicate with each other through an internal connection path, and when the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the method according to any one of claims 1 to 5.
  12. A computer-readable storage medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the method according to any one of claims 1 to 5.
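
For illustration only, the flow recited in method claims 1 to 4 might be sketched in Python as follows. This is a minimal toy model, not the applicant's implementation: the class `TwoTierStore`, its fields, the single leaf per subtree, capacity counted in entries, reading "satisfies the second threshold" as "count is at most the threshold", and dropping the fast-area copy after the move are all assumptions introduced here.

```python
import bisect


class TwoTierStore:
    """Toy model of claims 1 to 4; every name here is hypothetical."""

    def __init__(self, subtree_low_keys, capacity, first_threshold, second_threshold):
        # Upper index layers, kept in the fast area: one sorted low key per subtree.
        self.low_keys = sorted(subtree_low_keys)
        # One leaf per subtree for brevity: key -> offset into the slow area.
        self.leaves = [{} for _ in self.low_keys]
        self.access_counts = [0] * len(self.low_keys)
        self.fast = {}    # first storage area: key -> value
        self.slow = []    # second storage area: append-only list of values
        self.capacity = capacity                  # assumed: counted in entries
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold

    def find_subtree(self, key):
        """Search the index and return the position of the covering subtree."""
        return max(bisect.bisect_right(self.low_keys, key) - 1, 0)

    def put(self, key, value):
        """Claim 2: a write request lands in the fast area first."""
        self.fast[key] = value
        self.access_counts[self.find_subtree(key)] += 1

    def should_flush(self, key):
        """Claim 4; reading "satisfies" as "count <= threshold" is an assumption."""
        free = self.capacity - len(self.fast)
        count = self.access_counts[self.find_subtree(key)]
        return free < self.first_threshold and count <= self.second_threshold

    def flush(self, key):
        """Claim 1: search the index, move the data, update the leaf."""
        subtree = self.find_subtree(key)
        value = self.fast.pop(key)   # removing the fast copy is an assumption
        self.slow.append(value)
        self.leaves[subtree][key] = len(self.slow) - 1
        return subtree

    def get(self, key):
        """Claim 3: try the fast area, then follow the leaf into the slow area."""
        subtree = self.find_subtree(key)
        self.access_counts[subtree] += 1
        if key in self.fast:
            return self.fast[key]
        offset = self.leaves[subtree].get(key)
        return None if offset is None else self.slow[offset]


store = TwoTierStore([0, 100, 200], capacity=2, first_threshold=2, second_threshold=5)
store.put(150, b"value")
if store.should_flush(150):
    store.flush(150)
```

In this sketch only the leaf of the one subtree that covers the key is touched during the move, which mirrors the claimed idea of updating a subtree rather than merging whole layers of data.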
PCT/CN2017/083657 2017-05-09 2017-05-09 Data updating method and storage device WO2018205151A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/083657 WO2018205151A1 (en) 2017-05-09 2017-05-09 Data updating method and storage device
CN201780070813.XA CN110168532B (en) 2017-05-09 2017-05-09 Data updating method and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/083657 WO2018205151A1 (en) 2017-05-09 2017-05-09 Data updating method and storage device

Publications (1)

Publication Number Publication Date
WO2018205151A1 true WO2018205151A1 (en) 2018-11-15

Family

ID=64104067

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/083657 WO2018205151A1 (en) 2017-05-09 2017-05-09 Data updating method and storage device

Country Status (2)

Country Link
CN (1) CN110168532B (en)
WO (1) WO2018205151A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086168A (en) * 2022-08-19 2022-09-20 北京全路通信信号研究设计院集团有限公司 Vehicle-mounted equipment communication parameter updating and storing method and system

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104403B (en) * 2019-11-30 2022-06-07 北京浪潮数据技术有限公司 LSM tree data processing method, system, equipment and computer medium
CN111131015B (en) * 2019-12-27 2021-09-03 芯启源(南京)半导体科技有限公司 Method for dynamically updating route based on PC-Trie
CN111475507B (en) * 2020-03-31 2022-06-21 浙江大学 Key value data indexing method for workload adaptive single-layer LSMT
CN111857582B (en) * 2020-07-08 2024-04-05 平凯星辰(北京)科技有限公司 Key value storage system
CN114626532B (en) * 2020-12-10 2023-11-03 本源量子计算科技(合肥)股份有限公司 Method and device for reading data based on address, storage medium and electronic device
CN113609076A (en) * 2021-08-04 2021-11-05 杭州海康威视数字技术股份有限公司 File storage method and file reading method
CN115374127B (en) * 2022-10-21 2023-04-28 北京奥星贝斯科技有限公司 Data storage method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030093613A1 (en) * 2000-01-14 2003-05-15 David Sherman Compressed ternary mask system and method
CN104090942A (en) * 2014-06-30 2014-10-08 中国电子科技集团公司第三十二研究所 Trie search method and device applied to network processor
CN104899297A (en) * 2015-06-08 2015-09-09 南京航空航天大学 Hybrid index structure with storage perception
CN105447059A (en) * 2014-09-29 2016-03-30 华为技术有限公司 Data processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751406B (en) * 2008-12-18 2012-01-04 赵伟 Method and device for realizing column storage based relational database
CN102591864B (en) * 2011-01-06 2015-03-25 上海银晨智能识别科技有限公司 Data updating method and device in comparison system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086168A (en) * 2022-08-19 2022-09-20 北京全路通信信号研究设计院集团有限公司 Vehicle-mounted equipment communication parameter updating and storing method and system
CN115086168B (en) * 2022-08-19 2022-11-22 北京全路通信信号研究设计院集团有限公司 Vehicle-mounted equipment communication parameter updating and storing method and system

Also Published As

Publication number Publication date
CN110168532B (en) 2021-08-20
CN110168532A (en) 2019-08-23


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17909104

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17909104

Country of ref document: EP

Kind code of ref document: A1