WO2013128788A1 - Data management device, data management method, and program - Google Patents

Data management device, data management method, and program

Info

Publication number
WO2013128788A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
hierarchy
dividing
usage rate
division
Prior art date
Application number
PCT/JP2013/000187
Other languages
English (en)
Japanese (ja)
Inventor
盛朗 佐々木
Original Assignee
日本電気株式会社
Priority date
Filing date
Publication date
Application filed by 日本電気株式会社
Publication of WO2013128788A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0649 Lifecycle management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device

Definitions

  • the present invention relates to a data management device, a data management method, and a program.
  • with an index tree, range searches can be executed at high speed. It is more convenient to handle data as a set (tuple), for example <name, address, product, date/time>, than to handle each item of data alone. Therefore, an index tree is assigned to the data (key) in the tuple on which search conditions are placed. For example, if a tree is assigned using the above "date/time" as the key, "goods purchased from 11:00 to 12:00 on September 30, 2011" can be searched for at high speed.
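The range search described above can be illustrated with a minimal sketch. A sorted key list stands in for the index tree here (an assumption made for brevity, not the structure used in the patent), and the sample tuples and the name `range_search` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical tuples of the form <name, address, product, date/time>.
purchases = [
    ("Sato", "Tokyo", "book", datetime(2011, 9, 30, 10, 45)),
    ("Suzuki", "Osaka", "pen", datetime(2011, 9, 30, 11, 5)),
    ("Tanaka", "Nagoya", "bag", datetime(2011, 9, 30, 11, 50)),
    ("Ito", "Kyoto", "lamp", datetime(2011, 9, 30, 12, 30)),
]

# Sorted (key, position) pairs play the role of the index on "date/time".
index = sorted((row[3], pos) for pos, row in enumerate(purchases))
keys = [key for key, _ in index]


def range_search(low, high):
    """Return the tuples whose date/time key lies in the closed range [low, high]."""
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [purchases[pos] for _, pos in index[start:end]]


hits = range_search(datetime(2011, 9, 30, 11, 0), datetime(2011, 9, 30, 12, 0))
```

Because the keys are kept in sorted order, both ends of the range are located by binary search, which is the property an index tree provides at larger scale.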
  • the B+ tree shown in Non-Patent Document 1 is one of the most widely used index trees.
  • data is managed in units of blocks.
  • a block in the lowest layer is called a leaf block
  • a block in a layer higher than the leaf block is called a branch block
  • the highest and only block is called a root block.
  • in a leaf block, keys in a specific range and pointers to the tuples corresponding to those keys are recorded.
  • leaf blocks that record keys in adjacent ranges are connected by a pointer.
  • each key/pointer pair is called an entry.
  • a branch block records entries, each consisting of a key and a pointer to a leaf block or to a branch block in the adjacent lower layer.
  • when a block has no room for a new entry, block division is executed.
  • half of all the entries held by the divided block are moved to the newly added block according to the magnitude of their keys.
  • the value obtained by dividing "the number of entries held by the block immediately after block division" by "the maximum number of entries that the block can hold" is defined as the "guaranteed usage rate", and the value obtained by dividing "the number of entries currently held by the block" by "the maximum number of entries that the block can hold" is defined as the "usage rate". Since the B+ tree algorithm divides a block whose usage rate is 100%, its guaranteed usage rate is about 50%.
  • the B* tree shown in Non-Patent Document 2 adds a new block when the usage rate of the target block to be processed and the usage rate of the block adjacent to the target block are both 100%.
  • the entries held by the target block and the adjacent block are divided among the target block, the adjacent block, and the newly added block, and the newly added entry is inserted. Therefore, the guaranteed usage rate of the B* tree is about 67%.
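The two rates defined above can be checked with a small calculation. This is only a sketch: the capacity of 36 entries is an example value, and the function names are hypothetical.

```python
from fractions import Fraction


def usage_rate(held_entries, max_entries):
    """Entries currently held by a block divided by the block's capacity."""
    return Fraction(held_entries, max_entries)


def entries_after_division(full_blocks, max_entries):
    """Smallest share a block receives when the entries of `full_blocks`
    completely full blocks are spread over one additional block."""
    total = full_blocks * max_entries
    return total // (full_blocks + 1)


MAX_ENTRIES = 36  # example capacity, chosen only for illustration

# B+ tree: one full block becomes two blocks.
b_plus = usage_rate(entries_after_division(1, MAX_ENTRIES), MAX_ENTRIES)
# B* tree: two full blocks become three blocks.
b_star = usage_rate(entries_after_division(2, MAX_ENTRIES), MAX_ENTRIES)
```

With these inputs `b_plus` is 1/2 and `b_star` is 2/3, matching the "about 50%" and "about 67%" figures in the text.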
  • Non-Patent Document 3 proposes the Fractal Prefetching B+-tree (fpB+-tree), which takes both the CPU (Central Processing Unit) cache and the disk into account.
  • in the fpB+-tree, a B+ tree having a block size optimized for the disk is created, and sub-blocks of a size optimized for the cache are created within each block.
  • Non-Patent Document 4 proposes "FAST", a binary tree optimized for individual hardware.
  • in FAST, the SIMD (Single Instruction Multiple Data) register size, line size, page size, and so on are taken into account.
  • in the fpB+-tree, a block sized according to the cache is provided within a block sized according to the disk; in this document, a block sized according to the SIMD register is additionally provided within the block sized according to the cache.
  • in each block, entries are packed in breadth-first order. For example, an entry and the two entries corresponding to its children are packed together.
  • Non-Patent Document 5 proposes an FD tree that addresses the problem of slow random writing to flash memory.
  • the FD tree is composed of a head tree which is a small B + tree whose block size is equal to the page size of the flash memory, and an array of sorted blocks.
  • the array generally corresponds to one level of the tree, with lower level arrays in the tree being larger in size. By searching the block of the head tree or the array, the block to be searched next in the next lower level can be specified. Writes are buffered in the head tree and are flushed to lower level buffers when the buffer is full.
  • Patent Document 1 describes an index mounting method and an apparatus on which an index is mounted, focusing on the fact that the index development cost is relatively high and that the index has a plurality of data structures.
  • a data structure such as a B-tree, a hash, and a heap is prepared as a component, and by combining these components, an index is mounted at low cost and without trouble.
  • FAST Fast Architecture Search on Modern CPUs and GPUs. In SIGMOD, pages 339-350, 2010. Y. Li, B. He, R. J. Yang, Q. Luo, and K. Yi. Tree Indexing on Solid State Drives. In PVLDB 2010.
  • in Non-Patent Documents 1 to 4, only one division method is used for dividing blocks throughout the entire hierarchical structure.
  • Non-Patent Document 5 and Patent Document 1 have a plurality of different data structures, but there is only one division method used in the entire hierarchical structure. Therefore, in each document, both the search speed and the update speed cannot be optimized according to the update frequency for each hierarchy.
  • an object of the present invention is to provide a data management device, a data management method, and a program for optimizing both a search speed and an update speed for each hierarchy in data having a hierarchical structure.
  • a data management device is provided that includes: management means for managing entries in block units in a tree-like hierarchical structure; and dividing means, provided for each hierarchy, for adding a new block when the usage rate of a block is equal to or higher than a certain division threshold at the time a new entry is inserted, and dividing the existing entries held by the block between the block and the new block, wherein, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is taken as the guaranteed usage rate, the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
  • a data management method is provided in which a computer: manages entries in block units in a tree-like hierarchical structure; when the usage rate of a block is equal to or higher than a certain division threshold at the time a new entry is inserted, adds a new block using dividing means provided for each hierarchy and divides the existing entries held by the block between the block and the new block; and, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is taken as the guaranteed usage rate, sets the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
  • a program is provided for causing a computer to function as: means for managing entries in block units in a tree-like hierarchical structure; means for, when the usage rate of a block is equal to or higher than a certain division threshold at the time a new entry is inserted, adding a new block using dividing means provided for each hierarchy and dividing the existing entries held by the block between the block and the new block; and means for, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is taken as the guaranteed usage rate, setting the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
  • FIG. 1 is a block diagram showing the configuration of the data management apparatus according to the first embodiment of the present invention.
  • the data management apparatus 10 includes a management unit 102 and a dividing unit 104.
  • the management means 102 manages entries in block units in a tree-like hierarchical structure and, using search means and insertion means (not shown), searches for entries held in the blocks and inserts new entries (hereinafter referred to as new entries) into the blocks.
  • the search means of the management means 102 identifies the target block that is the insertion target of the new entry based on the key value of the new entry.
  • the insertion means of the management means 102 inserts a new entry at a predetermined location in the target block based on the key value of the new entry.
  • the insertion means of the management means 102 determines whether the usage rate of the target block to be processed when inserting a new entry is equal to or higher than a certain threshold that decides whether or not to divide (hereinafter referred to as a division threshold), and transmits a command to divide the block to the dividing means 104 based on the determination result.
  • the dividing means 104 divides the target block based on the command to divide the block. The dividing means 104 is provided for each hierarchy, and the guaranteed usage rate of the dividing method used by the dividing means 104 located in an upper hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means 104 located in a lower hierarchy.
  • the dividing unit 104 divides an entry already inserted in the target block (hereinafter referred to as an existing entry) between the target block and the new block.
  • each component of the data management apparatus shown in the figures represents a block of functional units rather than a configuration of hardware units.
  • each component is realized by any combination of hardware and software, centered on the CPU and memory of an arbitrary computer, a program loaded into the memory that realizes the components shown in the figures, a storage medium such as a hard disk that stores the program, and a network connection interface.
  • the dividing method with a high guaranteed usage rate increases the search speed because the frequency of adding new blocks is lower than the dividing method with a low guaranteed usage rate.
  • the division method with a high guaranteed usage rate has a slower update speed than the division method with a low guaranteed usage rate because the block division processing is complicated.
  • the closer a block is to the highest level (root block), the lower its update frequency, and search processing is dominant.
  • the closer a block is to the lowest level (leaf block), the higher its update frequency, and update processing is dominant.
  • the guaranteed usage rate of the dividing method used by the level 1 and level 2 dividing means 104 in the data management apparatus 10 is higher than the guaranteed usage rate of the dividing method used by the level 3 dividing means 104.
  • for example, the dividing method used by the level 1 and level 2 dividing means 104 can be the B* tree dividing algorithm, and the dividing method used by the level 3 dividing means 104 can be the B+ tree dividing algorithm.
  • a new block is added when the usage rate of the target block and the usage rate of adjacent blocks adjacent to the target block are both 100%. Then, the existing entry is divided among the target block, the adjacent block, and the new block. Then, a new entry is inserted at a predetermined position in any block based on the value of the key. In the B * tree partitioning algorithm, two blocks with a usage rate of 100% are split into three blocks, so the guaranteed usage rate of the B * tree is about 67%.
  • a new block is added when the usage rate of the target block is 100%. Then, the existing entry is divided between the target block and the new block. Then, a new entry is inserted at a predetermined position in any block based on the value of the key. In the B + tree partitioning algorithm, one block having a usage rate of 100% is divided into two blocks, so the guaranteed usage rate of the B + tree is about 50%.
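The two dividing algorithms described above can be sketched as follows. This is a simplified illustration, not the patented implementation: blocks are modeled as sorted Python lists of keys, the capacity `MAX_ENTRIES` is an arbitrary example value, and all function names are hypothetical. Only the case in which the blocks are completely full is handled.

```python
from bisect import insort

MAX_ENTRIES = 6  # hypothetical block capacity


def _insert_into(blocks, key):
    """Insert key into the first block whose successor starts above the key."""
    for idx, block in enumerate(blocks):
        is_last = idx == len(blocks) - 1
        if is_last or key < blocks[idx + 1][0]:
            insort(block, key)
            return


def split_b_plus(target, key):
    """B+ style: one full block is divided into two, then the key is inserted."""
    assert len(target) == MAX_ENTRIES
    mid = len(target) // 2
    blocks = [target[:mid], target[mid:]]
    _insert_into(blocks, key)
    return blocks


def split_b_star(target, adjacent, key):
    """B* style: two full blocks are divided into three, then the key is inserted."""
    assert len(target) == len(adjacent) == MAX_ENTRIES
    merged = target + adjacent  # adjacent holds the larger keys, so this stays sorted
    third = len(merged) // 3
    blocks = [merged[:third], merged[third:2 * third], merged[2 * third:]]
    _insert_into(blocks, key)
    return blocks
```

Immediately after the division (before the new key is added), each block holds 3 of 6 entries in the B+ case and 4 of 6 in the B* case, which corresponds to the guaranteed usage rates of about 50% and about 67%.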
  • which division method is used by the division means 104 for each hierarchy can be determined based on the data update frequency for each hierarchy.
  • the data update frequency may be calculated for each hierarchy on the basis of, for example, the number of accesses to the blocks and the number of divisions that have occurred. Based on the data update frequency, a division method with a high guaranteed usage rate can be selected for the hierarchies located above the hierarchy whose data update frequency is equal to or higher than a certain threshold, and a division method with a low guaranteed usage rate can be selected for that hierarchy and the hierarchies located below it.
  • in the present embodiment, a configuration is adopted in which the dividing means 104 of the upper hierarchies use a dividing method with a higher guaranteed usage rate and the dividing means 104 of the lower hierarchies use a dividing method with a lower guaranteed usage rate.
  • as a result, the search speed can be improved in the upper hierarchies, where search processing is dominant, and the update speed can be improved in the lower hierarchies, where update processing is dominant. Therefore, with this configuration, the search speed and the update speed can both be optimized compared with the case where a single division method is used for the entire hierarchical structure, and the processing speed of the entire apparatus can be improved.
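One way to picture this selection is the sketch below. It assumes the per-hierarchy update frequencies are already available as numbers ordered from the root downward; how they are computed is not specified here, and the function name and the labels "B*" and "B+" are illustrative only.

```python
def choose_division_methods(update_frequency, threshold):
    """Pick a division method per hierarchy.

    update_frequency lists one value per hierarchy, ordered from the root
    downward. Hierarchies above the first one that reaches the threshold get
    the high guaranteed usage rate method ("B*"); that hierarchy and all
    hierarchies below it get the low guaranteed usage rate method ("B+").
    """
    methods = []
    reached_busy_level = False
    for frequency in update_frequency:
        if frequency >= threshold:
            reached_busy_level = True
        methods.append("B+" if reached_busy_level else "B*")
    return methods
```

For example, `choose_division_methods([5, 40, 300], 100)` yields B* for the two upper hierarchies and B+ for the lowest one.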
  • the level k hierarchy dividing means 104 performs division processing with k-1 adjacent blocks as shown in FIG.
  • This adjacent block is located in the same hierarchy as the target block.
  • the adjacent blocks are determined as follows: among the blocks that hold entries whose key values are larger than the maximum key value in the target block, the block holding the entry with the minimum key value is taken as the starting point, and the k-1 blocks counted from that starting block are the adjacent blocks.
  • when fewer than k-1 such blocks exist, only the blocks that can be secured are the adjacent blocks. Therefore, for example, when the target block is the block that holds the entry having the maximum key value in its hierarchy, the number of adjacent blocks is zero.
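The choice of adjacent blocks can be sketched as a slice over one hierarchy. The sketch assumes the blocks of a hierarchy are held in a list ordered by key, so that the blocks with larger keys are simply those to the right of the target block; the function name and parameters are hypothetical.

```python
def adjacent_blocks(layer, target_index, level):
    """Return up to level-1 blocks that follow the target block in key order.

    layer        -- blocks of one hierarchy, ordered by key
    target_index -- position of the target block within layer
    level        -- hierarchy level k (k >= 1)
    """
    if level < 1:
        raise ValueError("level must be at least 1")
    # Slicing past the end of the list yields only the blocks that exist.
    return layer[target_index + 1:target_index + level]
```

At level 1 the slice is always empty, so only the target block is divided, and for the rightmost block of any hierarchy the result is likewise empty.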
  • the update process includes a search process for specifying a block into which data is inserted, and an insert process for inserting data into the block specified by the search process.
  • the insertion process further includes a division process that divides the block if the free area of the block is not sufficient at the time of data insertion.
  • FIG. 3 is a flowchart showing the flow of search processing.
  • the search means of the management means 102 determines whether the hierarchy in which the currently searched block is located is one hierarchy above the lowest layer (leaf block).
  • the search means of the management means 102 decrements the counter k and lowers the search target hierarchy by 1 (S108).
  • the search means of the management means 102 specifies the block to be searched in the next lower hierarchy based on the pointer acquired in S104 (S110), and executes the processing from S104 again for the specified block.
  • the search means of the management means 102 specifies the block into which the new entry is to be inserted (target block) based on the pointer acquired in S104 (S112).
  • the management unit 102 stores the route from the root block to the target block in a storage unit (not shown).
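The descent from the root block to the target block, together with the stored route, can be sketched as follows. The `Block` class, the convention that each branch key is the minimum key of the corresponding child, and the function name are assumptions made for this illustration; the step numbers of the flowchart are not modeled.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Block:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf block


def find_target_block(root, key):
    """Follow pointers from the root to the leaf block that should hold key.

    Returns the leaf block and the route (root first), which a later
    insertion can use to walk back up and update upper hierarchies.
    """
    route = [root]
    block = root
    while block.children:
        # keys[i] is assumed to be the minimum key stored under children[i].
        idx = max(bisect_right(block.keys, key) - 1, 0)
        block = block.children[idx]
        route.append(block)
    return block, route
```

Keeping the route avoids a second search when an update in a lower hierarchy has to be propagated to the blocks above it.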
  • FIG. 4 is a flowchart showing the flow of insertion processing.
  • the insertion unit of the management unit 102 determines whether or not a new entry can be inserted into the target block specified by the search process by comparing the usage rate of the target block with the division threshold (S202). When the usage rate of the target block is less than the division threshold (NO in S202), it is not necessary to divide the target block. Therefore, the insertion unit of the management unit 102 does not execute the division process and updates the entry of the target block (S212). Specifically, the insertion unit of the management unit 102 specifies the insertion position of the new entry based on the key value of the existing entry of the target block and the key value of the new entry.
  • the insertion means of the management means 102 inserts the new entry at the specified position in the target block.
  • when the usage rate of the target block is equal to or greater than the division threshold (YES in S202), the target block needs to be divided. Therefore, the insertion unit of the management unit 102 transmits a command for performing the dividing process to the dividing unit 104 (S204). The flow of division processing will be described below.
  • FIG. 5 is a flowchart showing the flow of division processing.
  • the dividing unit 104 uses a counter i as a reference for determining adjacent blocks.
  • the insertion means of the management means 102 initializes the counter i with 1 (S302).
  • the insertion unit of the management unit 102 determines whether there is an adjacent block whose usage rate has not been determined.
  • the insertion unit of the management unit 102 determines whether the usage rate of the adjacent block indicated by the counter i (the i-th adjacent block) is equal to or greater than the division threshold.
  • the insertion unit of the management unit 102 transmits a command to the dividing unit 104 to divide the existing entries.
  • the dividing unit 104 calculates an i + 1 quantile value (S312).
  • the i + 1 quantile value is a reference value for dividing an existing entry among i + 1 blocks from the target block to the i-th adjacent block.
  • the i + 1 quantile can be obtained, for example, by dividing the total number of existing entries in the blocks from the target block to the i-th adjacent block by i + 1.
  • the dividing unit 104 divides the existing entry among the blocks based on the obtained i + 1 quantile value (S314). Then, the insertion unit of the management unit 102 specifies the position where the new entry is inserted based on the key value held by the divided existing entry, and inserts the new entry (S316).
  • the minimum key value is then acquired for each block subjected to the division process (S318). Each entry acquired in S318 is used when updating the entries of the higher-level block in subsequent processing.
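The quantile-based division and the collection of minimum keys can be sketched together. Blocks are again modeled as sorted lists of keys ordered from the target block to the last adjacent block; when a new block has been added, the caller is assumed to append an empty list for it. The function name is hypothetical, and the sketch assumes there are at least as many entries as blocks.

```python
def divide_entries(blocks):
    """Spread all existing entries evenly over the given blocks.

    blocks -- sorted key lists ordered from the target block to the last
              adjacent block (plus an empty list for a newly added block).
    Returns the divided blocks and the minimum key of each block, which can
    be used to update the entries of the upper hierarchy.
    """
    merged = [entry for block in blocks for entry in block]
    parts = len(blocks)
    total = len(merged)
    # Quantile boundaries: block j receives merged[bounds[j]:bounds[j + 1]].
    bounds = [total * j // parts for j in range(parts + 1)]
    divided = [merged[bounds[j]:bounds[j + 1]] for j in range(parts)]
    min_keys = [block[0] for block in divided]  # assumes no block ends up empty
    return divided, min_keys
```

The same routine covers both cases described in the text: redistribution among existing blocks when one of them still has room, and redistribution after a new, initially empty block has been added.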
  • the insertion unit of the management unit 102 transmits a command for performing block division processing to the division unit 104 in order to secure an area for inserting a new entry.
  • the dividing unit 104 adds a new block after the (k-1)-th adjacent block (S310). The subsequent processing is the same as when no new block is added.
  • the existing entry is divided from the target block to the new block, and the new entry is inserted.
  • when the division process has been executed in a lower layer, the entries held by the block in the upper layer must be updated with the new key values, so it is determined whether an upper layer exists (S206). If there is an upper layer (YES in S206), the counter k is incremented to raise the processing target layer by one (S208). Then, the block to be processed is specified in the next higher hierarchy according to the route stored in the search process.
  • the insertion unit of the management unit 102 determines whether or not a new block has been added in the division process of S204 (S214).
  • this determination can be made using, for example, a flag indicating that a new block has been added. When a new block has not been added in the lower layer (NO in S214), there is no possibility that the upper layer block is divided. Therefore, the insertion unit of the management unit 102 updates, among the entries of the identified block, the entries indicating the blocks subjected to the division process in the lower hierarchy, using the entries acquired in S318 (S212).
  • the insertion unit of the management unit 102 determines whether the usage rate of the identified block is greater than or equal to the division threshold (S202).
  • the insertion unit of the management unit 102 updates the entry indicating the block subjected to the division process in the lower hierarchy among the specified block entries using the entry acquired in S318 (S212). In the case of further dividing processing in the upper layer, the processing executed in the lower layer is repeated, and the description thereof is omitted.
  • the division algorithm is equivalent to the B + tree because only the target block is divided at the lowest layer (level 1). Therefore, the guaranteed usage rate of the level 1 hierarchy division method is about 50%. Similarly, in the level 2 hierarchy, since the target block and the first adjacent block are divided, the division algorithm is equivalent to the B * tree, and the guaranteed usage rate is about 67%. Similarly, the guaranteed usage rate is higher in the upper hierarchy, approximately 75% at level 3 and approximately 80% at level 4.
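The percentages above follow from a short calculation, sketched here under the assumption that all k blocks taking part in the division at level k (the target block and its k-1 adjacent blocks) are completely full and that their entries are spread evenly over k+1 blocks. The symbols M (maximum number of entries per block) and g_k (guaranteed usage rate at level k) are notation introduced only for this note.

```latex
\[
g_k \;=\; \frac{1}{M}\cdot\frac{kM}{k+1} \;=\; \frac{k}{k+1},
\qquad
g_1 = \tfrac{1}{2},\quad
g_2 = \tfrac{2}{3},\quad
g_3 = \tfrac{3}{4},\quad
g_4 = \tfrac{4}{5}.
\]
```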
  • in the present embodiment, the guaranteed usage rate increases with each higher hierarchy. With this configuration, the guaranteed usage rate can be varied more flexibly than in the configuration in which the division method used by the dividing unit 104 is switched at a hierarchy having a certain update frequency, and the processing speed can be improved.
  • FIG. 6 is a block diagram showing a configuration of a data management apparatus according to the third embodiment of the present invention.
  • the data management apparatus 10 further includes first correction means 106 that determines a division threshold for each adjacent block.
  • FIG. 7 is a flowchart showing the flow of division processing in the third embodiment of the present invention.
  • the first correction unit 106 determines a division threshold according to the distance between the target block and the adjacent block (S402).
  • the first correction means 106 sets the division threshold smaller the farther the adjacent block is from the target block.
  • for example, the division threshold can be set to "100 - c × i % (where c is an arbitrary constant)" using the counter i that identifies the adjacent block. As a result, the conditions under which block division occurs can be adjusted for each adjacent block.
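A distance-dependent threshold of this kind can be sketched as follows. The sketch assumes a linear form, 100 minus c times i percent, with i = 0 for the target block itself, which is consistent with the threshold shrinking as the distance grows; the function names and the example constant are hypothetical.

```python
def division_threshold(i, c):
    """Division threshold in percent for the block at distance i.

    i = 0 is the target block; i >= 1 is the i-th adjacent block.
    Assumed linear form: 100 - c * i, never below zero.
    """
    return max(0.0, 100.0 - c * i)


def needs_new_block(usage_percent, c):
    """True when every block has reached its own division threshold.

    usage_percent[0] is the usage rate of the target block and
    usage_percent[i] that of the i-th adjacent block, in percent.
    """
    return all(
        usage >= division_threshold(i, c)
        for i, usage in enumerate(usage_percent)
    )
```

With c = 5, a second adjacent block counts as "full enough" at 90% usage, so a new block is added slightly earlier than it would be with a uniform 100% threshold.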
  • the dividing unit 104 calculates the dividing positions of the existing entries based on the division threshold of each adjacent block before dividing the existing entries among the blocks (S404). For example, the dividing unit 104 divides the existing entries among the blocks based on the i + 1 quantile positions. If dividing at the i + 1 quantile positions would cause a block to exceed its division threshold, the dividing unit 104 assigns more existing entries to the blocks with larger division thresholds so that no block exceeds its division threshold.
  • FIG. 8 is a diagram illustrating an example of entry transition when there is no division threshold correction according to the distance between adjacent blocks.
  • FIG. 9 is a diagram illustrating an example of entry transition when there is correction of a division threshold according to the distance between adjacent blocks.
  • the description will be made assuming that the maximum number of entries in one block is 36, the guaranteed usage rate is 2/3, and a new entry is inserted only in the leftmost block.
  • the division method used by the upper layer division unit 104 has a higher guaranteed usage rate
  • the division method used by the lower layer division unit 104 has a lower guaranteed usage rate.
  • the division threshold is determined according to the distance between the target block and the adjacent block.
  • FIG. 10 is a block diagram showing a configuration of a data management apparatus according to the fourth embodiment of the present invention.
  • the data management apparatus 10 further includes a second correction unit 108 that determines a division threshold value for each layer.
  • FIG. 11 is a flowchart showing the flow of insertion processing in the fourth embodiment of the present invention.
  • the second correction unit 108 determines a division threshold according to the hierarchy to be processed (S502).
  • the second correction unit 108 sets the division threshold smaller for lower hierarchies, which have a higher update frequency.
  • the same effects as those in the first to third embodiments can be obtained.
  • in the present embodiment, a configuration is adopted in which the division threshold is determined for each hierarchy. With this configuration, the guaranteed usage rate can be flexibly optimized according to the update frequency of each hierarchy. Therefore, the processing speed can be improved compared with the case where the same division threshold is used in all hierarchies.
  • even when the dividing method used by the dividing means 104 of each hierarchy is other than the dividing method described in the second embodiment, it is sufficient that the guaranteed usage rate of the dividing method used by the dividing means 104 in an upper hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means 104 in a lower hierarchy.
  • the effects of the present embodiment can be obtained without the first correction means 106 described in the third embodiment.
  • (Appendix 1) A data management device comprising: management means for managing entries in block units in a tree-like hierarchical structure; and dividing means, provided for each hierarchy, for adding a new block when the usage rate of a block is equal to or higher than a certain division threshold at the time a new entry is inserted, and dividing the existing entries held by the block between the block and the new block, wherein, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is taken as the guaranteed usage rate, the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
  • (Appendix 2) The data management device according to appendix 1, wherein the update frequency of the entries is held for each hierarchy, and the dividing method used by the dividing means is the B* tree division algorithm in hierarchies located above the hierarchy whose update frequency is equal to or higher than a certain threshold, and the B+ tree division algorithm in hierarchies located at or below the hierarchy whose update frequency is equal to or higher than the certain threshold.
  • (Appendix 3) The data management device according to appendix 1, wherein, in the hierarchy of level k (k ≥ 1), the dividing means takes the block that is the insertion target of the new entry as the target block and, among the blocks located in the same hierarchy as the target block that hold entries having key values larger than the maximum key value in the target block, takes the block holding the entry with the smallest key value as a starting point and treats the k-1 blocks counted from the starting block as adjacent blocks; and, when the usage rate is equal to or greater than the division threshold in all of the target block and the adjacent blocks, adds the new block and divides all the existing entries from the target block to the adjacent blocks among the target block, the adjacent blocks, and the new block.
  • a data management device that divides the existing entry among the adjacent blocks whose usage rate has been determined from the target block.
  • a data management apparatus further comprising first threshold value correction means for determining the division threshold value according to a distance from a target block which is the block to be processed.
  • a data management apparatus further comprising second threshold correction means for determining the division threshold for each hierarchy.
  • (Appendix 7) A program for causing a computer to function as: means for managing entries in block units in a tree-like hierarchical structure; means for, when the usage rate of a block is equal to or higher than a certain division threshold at the time a new entry is inserted, adding a new block using dividing means provided for each hierarchy and dividing the existing entries held by the block between the block and the new block; and means for, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is taken as the guaranteed usage rate, setting the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
  • a program for functioning as a means for setting. (Appendix 8) In the data management method described in appendix 6, The computer is For each hierarchy, the update frequency of the entry is retained, Using the dividing method used by the dividing means, In a hierarchy higher than the hierarchy having the update frequency equal to or higher than a certain threshold, a B * tree division algorithm is used. In a hierarchy located below the hierarchy having the update frequency equal to or higher than a certain threshold, the B + tree is used. Data management method, which is a partitioning algorithm.
  • the computer, using the dividing means in the hierarchy of level k (k ≥ 1), takes the block into which the new entry is to be inserted as the target block. Among the blocks located in the same hierarchy as the target block that hold entries with key values larger than the maximum key value in the target block, it takes the block holding the entry with the smallest key value as the starting point and treats the k-1 blocks counted from that starting block as adjacent blocks. When the usage rate of the target block and of every adjacent block is equal to or greater than the division threshold, the computer adds the new block and divides all the existing entries held by the target block and the adjacent blocks among the target block, the adjacent blocks, and the new block.
  • divides the existing entries among the target block and the adjacent blocks whose usage rates have been determined.
  • a data management method in which the computer determines the division threshold according to the distance from the target block, which is the block to be processed.
  • a data management method in which the computer determines the division threshold for each hierarchy.
  • (Appendix 12) In the program described in Appendix 7, the program further causes the computer to function as: means for holding the update frequency of the entries for each hierarchy; and means for setting the dividing method used by the dividing means so that a B*-tree division algorithm is used in hierarchies higher than the hierarchy whose update frequency is equal to or higher than a certain threshold, and a B+-tree division algorithm is used in hierarchies located below the hierarchy whose update frequency is equal to or higher than that threshold.
  • a program for causing the computer to function as means for dividing the existing entries among the target block and the adjacent blocks for which the usage rate has been determined. (Appendix 14) In the program described in Appendix 13, a program for further causing the computer to function as means for determining the division threshold according to the distance from the target block, which is the block to be processed. (Appendix 15) In the program described in any one of Appendices 7 and 12 to 14, a program for further causing the computer to function as means for determining the division threshold for each hierarchy.
  • the data structure is a three-layer structure, but a structure having a different number of layers may be used.
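The level-k division described in the appendices above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applicant's implementation: blocks are modelled as sorted lists of integer keys, and the names `split_level_k` and `guaranteed_usage_rate` are hypothetical. The case where some adjacent block is below the threshold (redistribution without adding a block) is not modelled here; the function simply returns the blocks unchanged.

```python
from typing import List


def split_level_k(blocks: List[List[int]], capacity: int,
                  threshold: float) -> List[List[int]]:
    """Level-k division sketch (hypothetical helper, not from the source).

    `blocks` holds the target block followed by its k-1 adjacent blocks,
    with keys ascending within and across blocks. If every block's usage
    rate is at or above `threshold`, one new block is added and all
    existing entries are spread evenly over the k+1 blocks.
    """
    if any(len(b) / capacity < threshold for b in blocks):
        return blocks  # at least one block still has room: no new block

    entries = [key for b in blocks for key in b]  # already in key order
    n_out = len(blocks) + 1                       # k blocks plus the new one
    base, extra = divmod(len(entries), n_out)

    result, pos = [], 0
    for i in range(n_out):
        size = base + (1 if i < extra else 0)     # leftmost blocks get the remainder
        result.append(entries[pos:pos + size])
        pos += size
    return result


def guaranteed_usage_rate(k: int) -> float:
    """Usage rate right after k completely full blocks become k+1 blocks.

    Assumes a division threshold of 100%; k=1 gives 1/2 (the classic
    B+-tree split) and k=2 gives 2/3 (the classic B*-tree split).
    """
    return k / (k + 1)
```

With k = 1 the sketch behaves like an ordinary half-and-half split, and with k = 2 two full blocks become three blocks that are each about two-thirds full, which is why a larger k at a given level corresponds to a higher guaranteed usage rate.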

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A management means (102) manages block-unit entries in a tree hierarchy, and a search means of the management means (102) identifies the target block for a new entry. An insertion means of the management means (102) inserts the new entry into the target block, determines whether the usage rate of the target block once the new entry is inserted is greater than or equal to a division threshold, and sends a block division instruction to a dividing means (104). The dividing means (104) divides the target block based on the instruction. A dividing means (104) is provided at each level; the guaranteed usage rate of the dividing method of a dividing means (104) located at a higher level is greater than or equal to the guaranteed usage rate of the dividing method of a dividing means (104) located at a lower level. The dividing means (104) divides the existing entries between the target block and a new block.
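As a rough illustration of the flow in the abstract, the following Python sketch inserts an entry into a target block, checks the usage rate against the division threshold, and verifies that a per-level configuration of guaranteed usage rates never increases from upper to lower levels. The function names `insert_and_check` and `levels_are_consistent`, the sorted-list block model, and the top-down list ordering are all assumptions made for the example.

```python
import bisect
from typing import List


def insert_and_check(block: List[int], key: int, capacity: int,
                     threshold: float) -> bool:
    """Insert `key` into the sorted target block (in place).

    Returns True when the usage rate after insertion is at or above the
    division threshold, i.e. when a division instruction would be sent
    to the dividing means. Hypothetical helper, not from the source.
    """
    bisect.insort(block, key)
    return len(block) / capacity >= threshold


def levels_are_consistent(rates_top_down: List[float]) -> bool:
    """Check the constraint on the per-level dividing methods.

    `rates_top_down` lists the guaranteed usage rate of each level's
    dividing method, from the highest level to the lowest. Every higher
    level must have a rate greater than or equal to the level below it.
    """
    return all(upper >= lower
               for upper, lower in zip(rates_top_down, rates_top_down[1:]))
```

For instance, a configuration of `[2/3, 2/3, 1/2]` (B*-tree style division in the two upper levels, B+-tree style division in the lowest) satisfies the constraint, whereas `[1/2, 2/3]` does not.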
PCT/JP2013/000187 2012-03-02 2013-01-17 Data management device, data management method, and program WO2013128788A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012046089 2012-03-02
JP2012-046089 2012-03-02

Publications (1)

Publication Number Publication Date
WO2013128788A1 (fr) 2013-09-06

Family

ID=49082007

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/000187 WO2013128788A1 (en) 2012-03-02 2013-01-17 Data management device, data management method, and program

Country Status (2)

Country Link
JP (1) JPWO2013128788A1 (fr)
WO (1) WO2013128788A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997029426A1 (en) * 1996-02-09 1997-08-14 Sony Corporation Information processor, file name changing method, and recording medium on which a file name changing program is recorded
JP2010086391A (en) * 2008-10-01 2010-04-15 Internatl Business Mach Corp <Ibm> Method for searching a tree structure
JP2010160591A (en) * 2009-01-07 2010-07-22 Hitachi Ltd Spatial data management device, spatial data management method, and spatial data management program
JP2011170460A (en) * 2010-02-16 2011-09-01 Nippon Telegr & Teleph Corp <Ntt> Information storage and retrieval method and information storage and retrieval program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KNUTH DONALD: "The Art of Computer Programming Volume 3 Sorting and Searching Second Edition", vol. 3, article TOSHIHIRO FUKUOKA, pages: 462 - 465 *

Also Published As

Publication number Publication date
JPWO2013128788A1 (ja) 2015-07-30

Similar Documents

Publication Publication Date Title
US8868926B2 (en) Cryptographic hash database
EP2633413B1 (en) Low RAM space, high-throughput persistent key-value store using secondary memory
US10831736B2 (en) Fast multi-tier indexing supporting dynamic update
US9851917B2 (en) Method for de-duplicating data and apparatus therefor
KR102034833B1 (en) Key-value based data access device and method using the internal parallelism of a flash storage device
US10698831B2 (en) Method and apparatus for data access
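CN106990915B (en) Storage resource management method based on storage medium type and weighted quota
KR102564170B1 (en) Data object storage method, apparatus, and computer-readable storage medium storing a computer program using the same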
US20140136510A1 (en) Hybrid table implementation by using buffer pool as permanent in-memory storage for memory-resident data
CN108287840B (en) Data storage and query method based on matrix hashing
US20120215752A1 (en) Index for hybrid database
US20070050326A1 (en) Data Storage method and data storage structure
WO2015152830A1 (en) Method of maintaining data consistency
CN102346735A (en) Hash lookup method for reducing hash collisions
CN111506604A (en) Method, apparatus, and computer program product for accessing data
US10515055B2 (en) Mapping logical identifiers using multiple identifier spaces
US7484068B2 (en) Storage space management methods and systems
CN106599247A (en) Method and device for merging data files in an LSM-tree structure
US8935508B1 (en) Implementing pseudo content access memory
JP6006740B2 (en) Index management device
WO2013128788A1 (en) Data management device, data management method, and program
US11435926B2 (en) Method, device, and computer program product for managing storage system
KR100878142B1 (en) Modified B-tree index construction method for efficient operation on flash memory
US9824105B2 (en) Adaptive probabilistic indexing with skip lists
Zhu SHaMBa: Reducing Bloom Filter Overhead in LSM Trees

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13754195

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2014501984

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13754195

Country of ref document: EP

Kind code of ref document: A1