WO2013128788A1 - Data management device, data management method, and program - Google Patents
Data management device, data management method, and program
- Publication number
- WO2013128788A1 (PCT/JP2013/000187)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- hierarchy
- dividing
- usage rate
- division
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
- G06F3/0649—Lifecycle management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
Definitions
- the present invention relates to a data management device, a data management method, and a program.
- With an index tree, range searches can be executed at high speed. It is more convenient to handle data as a set (tuple), for example <name, address, product, date/time>, than to handle each item alone. Therefore, an index tree is assigned to the data (key) in the tuple to which a search condition is attached. For example, if a tree is assigned using the above "date/time" as a key, "goods purchased from 11:00 to 12:00 on September 30, 2011" can be searched at high speed.
- The B+ tree shown in Non-Patent Document 1 is one of the widely used index trees.
- data is managed in units of blocks.
- a block in the lowest layer is called a leaf block
- a block in a layer higher than the leaf block is called a branch block
- the highest and only block is called a root block.
- In a leaf block, keys in a specific range and pointers to the tuples corresponding to those keys are recorded.
- Leaf blocks in which keys in adjacent ranges are recorded are connected by pointers.
- Each key/pointer pair is called an entry.
- In a branch block, entries each consisting of a key and a pointer to a leaf block or to an adjacent lower branch block are recorded.
- block division is executed.
- half of all entries included in the divided block are moved to the newly added block based on the size of the key.
- The value obtained by dividing "the number of entries held by the block immediately after block division" by "the maximum number of entries that the block can hold" is defined as the "guaranteed usage rate", and the value obtained by dividing "the number of entries currently held by the block" by "the maximum number of entries that the block can hold" is defined as the "usage rate". Since the B+ tree is an algorithm that divides a block whose usage rate is 100%, its guaranteed usage rate is about 50%.
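The two ratios defined above can be illustrated with a minimal Python sketch. The function names `usage_rate` and `split_bplus` and the capacity of 36 entries are assumptions chosen for illustration; the split simply moves the upper half of the key-sorted entries to a new block, as described for the B+ tree.

```python
def usage_rate(num_entries: int, capacity: int) -> float:
    """Entries currently held divided by the maximum the block can hold."""
    return num_entries / capacity


def split_bplus(entries: list, capacity: int) -> tuple:
    """Split one full, key-sorted block into two halves (B+ tree style)."""
    assert len(entries) == capacity, "the B+ tree splits a block at 100% usage"
    mid = len(entries) // 2
    # The lower half stays in the block, the upper half moves to the new block.
    return entries[:mid], entries[mid:]


capacity = 36                                   # hypothetical block capacity
full_block = list(range(capacity))              # keys 0..35, already sorted
left, right = split_bplus(full_block, capacity)

# Guaranteed usage rate: usage of the emptiest block right after the split.
guaranteed = min(usage_rate(len(left), capacity),
                 usage_rate(len(right), capacity))
print(guaranteed)  # 0.5
```

With an even capacity both halves hold exactly half of the entries, which is where the figure of about 50% comes from.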
- The B* tree shown in Non-Patent Document 2 adds a new block when the usage rate of the target block to be processed and the usage rate of the adjacent block next to the target block are both 100%.
- The entries held by the target block and the adjacent block are divided among the target block, the adjacent block, and the newly added block, and the new entry is inserted. Therefore, the guaranteed usage rate of the B* tree is about 67%.
- Non-Patent Document 3 proposes the Fractal Prefetching B+-tree (fpB+-tree), which is conscious of both the CPU (Central Processing Unit) cache and the disk.
- In the fpB+-tree, a B+ tree having a block size optimized for the disk is created, and sub-blocks with a size optimized for the cache are created within each block.
- Non-Patent Document 4 proposes “FAST” which is a binary tree optimized for individual hardware.
- In FAST, the SIMD (Single Instruction Multiple Data) register size, line size, page size, etc. are considered.
- In Non-Patent Document 3, a block sized according to the cache is provided inside a block sized according to the disk; in this document, a block sized according to the SIMD register is further provided inside the block sized according to the cache.
- Within a block, entries are packed in breadth-first order. For example, a certain entry and the two entries corresponding to its children are packed together.
- Non-Patent Document 5 proposes an FD tree that addresses the problem of slow random writing to flash memory.
- the FD tree is composed of a head tree, which is a small B+ tree whose block size is equal to the page size of the flash memory, and an array of sorted blocks.
- the array generally corresponds to one level of the tree, with lower level arrays in the tree being larger in size. By searching the block of the head tree or the array, the block to be searched next in the next lower level can be specified. Writes are buffered in the head tree and are flushed to lower level buffers when the buffer is full.
- Patent Document 1 describes an index implementation method and an apparatus on which an index is implemented, focusing on the fact that the index development cost is relatively high and that an index has a plurality of data structures.
- Data structures such as a B-tree, a hash, and a heap are prepared as components, and by combining these components, an index is implemented at low cost and with little effort.
- Non-Patent Document 4: FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs. In SIGMOD, pages 339-350, 2010.
- Non-Patent Document 5: Y. Li, B. He, R. J. Yang, Q. Luo, and K. Yi. Tree Indexing on Solid State Drives. In PVLDB, 2010.
- In Non-Patent Documents 1 to 4, only one division method is used when dividing blocks in the entire hierarchical structure.
- Non-Patent Document 5 and Patent Document 1 have a plurality of different data structures, but still only one division method is used in the entire hierarchical structure. Therefore, none of these documents can optimize both the search speed and the update speed according to the update frequency of each hierarchy.
- an object of the present invention is to provide a data management device, a data management method, and a program for optimizing both a search speed and an update speed for each hierarchy in data having a hierarchical structure.
- A data management device is provided, comprising: management means for managing entries in block units in a tree-like hierarchical structure; and dividing means, provided for each hierarchy, for adding a new block when the usage rate of a block is equal to or higher than a certain division threshold at the time of inserting a new entry, and dividing the existing entries held by the block between the block and the new block, wherein, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is defined as the guaranteed usage rate, the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
- A data management method is provided in which a computer: manages entries in block units in a tree-like hierarchical structure; when the usage rate of a block is equal to or greater than a certain division threshold at the time of inserting a new entry, adds a new block using dividing means provided for each hierarchy and divides the existing entries held by the block between the block and the new block; and, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is defined as the guaranteed usage rate, sets the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
- A program is provided for causing a computer to function as: means for managing entries in block units in a tree-like hierarchical structure; means for, when the usage rate of a block is equal to or greater than a certain division threshold at the time of inserting a new entry, adding a new block using dividing means provided for each hierarchy and dividing the existing entries held by the block between the block and the new block; and means for, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is defined as the guaranteed usage rate, setting the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
- FIG. 1 is a block diagram showing the configuration of the data management apparatus according to the first embodiment of the present invention.
- the data management apparatus 10 includes a management unit 102 and a dividing unit 104.
- The management means 102 manages entries in block units in a tree-like hierarchical structure and, by search means and insertion means (not shown), searches for entries held in the blocks and inserts new entries (hereinafter referred to as new entries) into the blocks.
- the search means of the management means 102 identifies the target block that is the insertion target of the new entry based on the key value of the new entry.
- the insertion means of the management means 102 inserts a new entry at a predetermined location in the target block based on the key value of the new entry.
- When inserting a new entry, the insertion means of the management means 102 determines whether the usage rate of the target block to be processed is equal to or higher than a certain threshold (hereinafter referred to as a division threshold) that determines whether or not to divide, and transmits a command to divide the block to the dividing means 104 based on the determination result.
- The dividing means 104 divides the target block based on the command to divide the block. Further, the dividing means 104 is provided for each hierarchy, and the guaranteed usage rate of the dividing method used by the dividing means 104 located in an upper hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means 104 located in a lower hierarchy.
- the dividing unit 104 divides an entry already inserted in the target block (hereinafter referred to as an existing entry) between the target block and the new block.
- Each component of the data management apparatus shown in each figure represents a functional block, not a hardware unit.
- Each component is realized by any combination of hardware and software, centered on a CPU and memory of an arbitrary computer, a program loaded in the memory that realizes the components shown in the figure, a storage medium such as a hard disk for storing the program, and a network connection interface.
- the dividing method with a high guaranteed usage rate increases the search speed because the frequency of adding new blocks is lower than the dividing method with a low guaranteed usage rate.
- the division method with a high guaranteed usage rate has a slower update speed than the division method with a low guaranteed usage rate because the block division processing is complicated.
- The highest-level block (root block) has a lower update frequency and is mainly subject to search processing.
- The lowermost blocks (leaf blocks) have a higher update frequency and are mainly subject to update processing.
- the guaranteed usage rate of the dividing method used by the level 1 and level 2 dividing means 104 in the data management apparatus 10 is higher than the guaranteed usage rate of the dividing method used by the level 3 dividing means 104.
- For example, the dividing method used by the level 1 and level 2 dividing means 104 can be the B* tree dividing algorithm, and the dividing method used by the level 3 dividing means 104 can be the B+ tree dividing algorithm.
- In the B* tree dividing algorithm, a new block is added when the usage rate of the target block and the usage rate of the adjacent block next to the target block are both 100%. Then, the existing entries are divided among the target block, the adjacent block, and the new block. Then, the new entry is inserted at a predetermined position in one of the blocks based on the value of its key. Since the B* tree dividing algorithm splits two blocks with a usage rate of 100% into three blocks, the guaranteed usage rate of the B* tree is about 67%.
- In the B+ tree dividing algorithm, a new block is added when the usage rate of the target block is 100%. Then, the existing entries are divided between the target block and the new block. Then, the new entry is inserted at a predetermined position in one of the blocks based on the value of its key. Since the B+ tree dividing algorithm divides one block with a usage rate of 100% into two blocks, the guaranteed usage rate of the B+ tree is about 50%.
- which division method is used by the division means 104 for each hierarchy can be determined based on the data update frequency for each hierarchy.
- The data update frequency may be calculated, for example, on the basis of the number of accesses to the blocks and the number of occurrences of division in each hierarchy. Based on the data update frequency, a division method with a high guaranteed usage rate can be selected for hierarchies located above the hierarchy whose data update frequency is equal to or higher than a certain threshold, and a division method with a low guaranteed usage rate can be selected for hierarchies located at or below that hierarchy.
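The selection rule can be sketched as follows. This is only an illustration under stated assumptions: the per-hierarchy update frequencies are given as input (how they are measured is left open), the hierarchy that reaches the threshold is itself assigned the low guaranteed-usage method, and the function name `choose_methods` and the string labels are hypothetical.

```python
def choose_methods(update_freq_by_level: list, threshold: float) -> list:
    """Pick a division method per hierarchy.

    update_freq_by_level[0] is the root hierarchy; the last item is the
    leaf hierarchy. Frequencies are supplied by the caller (assumption).
    """
    n = len(update_freq_by_level)
    # Uppermost hierarchy whose update frequency reaches the threshold.
    boundary = next(
        (lvl for lvl, freq in enumerate(update_freq_by_level) if freq >= threshold),
        n,  # no hierarchy reaches the threshold
    )
    # Above the boundary: high guaranteed usage rate (B*).
    # At or below the boundary: low guaranteed usage rate (B+).
    return ["B*" if lvl < boundary else "B+" for lvl in range(n)]


print(choose_methods([0.01, 0.05, 0.60], threshold=0.5))  # ['B*', 'B*', 'B+']
```

With these sample frequencies the two upper hierarchies use the B* tree algorithm and the lowest uses the B+ tree algorithm, matching the three-level example in the text.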
- As described above, this embodiment adopts a configuration in which the dividing unit 104 uses a dividing method with a higher guaranteed usage rate for higher layers and a dividing method with a lower guaranteed usage rate for lower layers.
- As a result, the search speed can be improved in the upper hierarchy, where search processing is dominant, and the update speed can be improved in the lower hierarchy, where update processing is dominant. Therefore, according to this configuration, the search speed and the update speed can be optimized compared with the case where one division method is used for the entire hierarchical structure, and the processing speed of the entire apparatus can be improved.
- the level k hierarchy dividing means 104 performs division processing with k-1 adjacent blocks as shown in FIG.
- This adjacent block is located in the same hierarchy as the target block.
- The adjacent blocks are counted starting from the block that holds the entry with the minimum key value among the blocks holding entries whose key values are larger than the maximum key value in the target block.
- When fewer than k-1 such blocks exist, only the blocks that can be secured are treated as adjacent blocks. Therefore, for example, when the target block is the block that holds the entry having the maximum key value in its hierarchy, the number of adjacent blocks is zero.
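A minimal sketch of how the adjacent blocks could be picked, assuming the blocks of one hierarchy are kept in a Python list ordered by key (the list representation and the name `adjacent_blocks` are assumptions for illustration):

```python
def adjacent_blocks(level_blocks: list, target_index: int, k: int) -> list:
    """Return up to k-1 blocks following the target block in its hierarchy.

    level_blocks is ordered by key, so the blocks after target_index are
    exactly those whose keys exceed the target block's maximum key.
    Slicing past the end of the list simply yields fewer blocks.
    """
    return level_blocks[target_index + 1 : target_index + k]


level = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(adjacent_blocks(level, 0, 3))  # [[3, 4], [5, 6]]
print(adjacent_blocks(level, 3, 3))  # []  (target holds the maximum key)
```

At level 1 the slice is always empty, so only the target block takes part in the division.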
- the update process includes a search process for specifying a block into which data is inserted, and an insert process for inserting data into the block specified by the search process.
- the insertion process further includes a division process that divides the block if the free area of the block is not sufficient at the time of data insertion.
- FIG. 3 is a flowchart showing the flow of search processing.
- the search means of the management means 102 determines whether the hierarchy in which the currently searched block is located is one hierarchy above the lowest layer (leaf block).
- the search means of the management means 102 decrements the counter k and lowers the search target hierarchy by 1 (S108).
- The search means of the management means 102 specifies the block to be searched in the next lower hierarchy based on the pointer acquired in S104 (S110), and performs the processing from S104 again on the specified block.
- The search unit of the management unit 102 specifies the block (target block) into which the new entry is to be inserted, based on the pointer acquired in S104 (S112).
- the management unit 102 stores the route from the root block to the target block in a storage unit (not shown).
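The search flow can be sketched as below. This is a simplified illustration, not the flowchart itself: the `Block` class is hypothetical, a leaf is detected by the absence of children rather than by the counter k, and each branch key is assumed to be the minimum key of the corresponding child block.

```python
from bisect import bisect_right


class Block:
    """Hypothetical block: sorted keys, plus children for branch/root blocks."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None for a leaf block


def find_target(root: Block, key):
    """Descend from the root to the leaf that should receive `key`.

    Returns the target block and the route of upper blocks visited,
    which the insertion process later uses to update upper hierarchies.
    """
    path = []
    block = root
    while block.children is not None:
        path.append(block)
        # Last child whose minimum key is <= key (first child if none is).
        idx = max(bisect_right(block.keys, key) - 1, 0)
        block = block.children[idx]
    return block, path


leaf_a, leaf_b, leaf_c = Block([1, 3]), Block([5, 7]), Block([9, 11])
root = Block([1, 5, 9], [leaf_a, leaf_b, leaf_c])
target, route = find_target(root, 6)
print(target.keys)  # [5, 7]
```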
- FIG. 4 is a flowchart showing the flow of insertion processing.
- the insertion unit of the management unit 102 determines whether or not a new entry can be inserted into the target block specified by the search process by comparing the usage rate of the target block with the division threshold (S202). When the usage rate of the target block is less than the division threshold (NO in S202), it is not necessary to divide the target block. Therefore, the insertion unit of the management unit 102 does not execute the division process and updates the entry of the target block (S212). Specifically, the insertion unit of the management unit 102 specifies the insertion position of the new entry based on the key value of the existing entry of the target block and the key value of the new entry.
- The insertion means of the management means 102 inserts the new entry at the specified position in the target block.
- When the usage rate of the target block is equal to or greater than the division threshold (YES in S202), the target block needs to be divided. Therefore, the insertion unit of the management unit 102 transmits a command for performing the dividing process to the dividing unit 104 (S204). The flow of division processing will be described below.
- FIG. 5 is a flowchart showing the flow of division processing.
- the dividing unit 104 uses a counter i as a reference for determining adjacent blocks.
- the insertion means of the management means 102 initializes the counter i with 1 (S302).
- the insertion unit of the management unit 102 determines whether there is an adjacent block whose usage rate has not been determined.
- The insertion unit of the management unit 102 determines whether the usage rate of the adjacent block indicated by the counter i (the i-th adjacent block) is equal to or greater than the division threshold.
- The insertion unit of the management unit 102 transmits an instruction to divide the existing entries to the dividing unit 104.
- The dividing unit 104 calculates the (i+1)-quantile value (S312).
- The (i+1)-quantile value is a reference value for dividing the existing entries among the i+1 blocks from the target block to the i-th adjacent block.
- The (i+1)-quantile value can be obtained, for example, by dividing the total number of existing entries held by the blocks from the target block through the i-th adjacent block by i+1.
- the dividing unit 104 divides the existing entry among the blocks based on the obtained i + 1 quantile value (S314). Then, the insertion unit of the management unit 102 specifies the position where the new entry is inserted based on the key value held by the divided existing entry, and inserts the new entry (S316).
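The quantile-based division and the subsequent insertion can be sketched as follows. It is an illustrative sketch only: blocks are modeled as sorted Python lists of integer keys, and the name `redistribute` and the rule for spreading a remainder over the first blocks are assumptions.

```python
from bisect import insort


def redistribute(blocks: list, new_key: int) -> list:
    """Divide the existing entries of i+1 blocks evenly, then insert new_key.

    `blocks` lists the target block followed by the adjacent blocks that
    take part in the division (and the new block, if one was added).
    """
    merged = [key for block in blocks for key in block]  # key order preserved
    n = len(blocks)
    base, extra = divmod(len(merged), n)  # base is the per-block share
    result, start = [], 0
    for j in range(n):
        size = base + (1 if j < extra else 0)  # spread any remainder
        result.append(merged[start:start + size])
        start += size
    # Insert into the first block whose largest key is >= new_key,
    # or into the last block when new_key exceeds every existing key.
    for block in result:
        if block and new_key <= block[-1]:
            insort(block, new_key)
            break
    else:
        insort(result[-1], new_key)
    return result


print(redistribute([[10, 20, 30, 40], [50, 60]], 25))
# [[10, 20, 25, 30], [40, 50, 60]]
```

Here a full target block and a half-empty adjacent block are rebalanced to three entries each before the new key 25 is placed, so no new block is needed.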
- The entry having the minimum key value is then acquired for each block subjected to the division process (S318). Each entry acquired in S318 is used when updating the entries of a higher-level block in subsequent processing.
- the insertion unit of the management unit 102 transmits a command for performing block division processing to the division unit 104 in order to secure an area for inserting a new entry.
- the dividing unit 104 adds a new block after the (k−1)-th adjacent block (S310). The subsequent processing is the same as when no new block is added.
- the existing entry is divided from the target block to the new block, and the new entry is inserted.
- When the division process is executed in a lower layer, the entries held by the block in the upper layer must be updated with the new key values, so it is determined whether or not an upper layer exists. If there is an upper layer (YES in S206), the counter k is incremented to raise the processing target layer by one (S208). Then, the block to be processed is specified in the next higher hierarchy according to the route stored in the search process.
- The insertion unit of the management unit 102 determines whether or not a new block has been added in the division process of S204 (S214), for example by referring to a flag indicating that a new block has been added.
- When a new block has not been added in the lower layer (NO in S214), there is no possibility that the upper layer block is divided. Therefore, the insertion unit of the management unit 102 updates, among the entries of the identified block, the entries indicating the blocks subjected to the division process in the lower hierarchy, using the entries acquired in S318 (S212).
- the insertion unit of the management unit 102 determines whether the usage rate of the identified block is greater than or equal to the division threshold (S202).
- the insertion unit of the management unit 102 updates the entry indicating the block subjected to the division process in the lower hierarchy among the specified block entries using the entry acquired in S318 (S212). In the case of further dividing processing in the upper layer, the processing executed in the lower layer is repeated, and the description thereof is omitted.
- the division algorithm is equivalent to the B+ tree at the lowest layer (level 1) because only the target block is divided. Therefore, the guaranteed usage rate of the level 1 hierarchy division method is about 50%. Similarly, in the level 2 hierarchy, since the target block and the first adjacent block are divided, the division algorithm is equivalent to the B* tree, and the guaranteed usage rate is about 67%. Similarly, the guaranteed usage rate is higher in the upper hierarchy, approximately 75% at level 3 and approximately 80% at level 4.
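These figures follow from simple arithmetic: at level k, the target block and its k-1 adjacent blocks (k full blocks in total) are redistributed over k+1 blocks, giving a guaranteed usage rate of k/(k+1). A short check, assuming a division threshold of 100% and using a hypothetical helper name:

```python
from fractions import Fraction


def guaranteed_usage_rate(level: int) -> Fraction:
    """k full blocks spread over k+1 blocks leave each k/(k+1) full."""
    return Fraction(level, level + 1)


for level in range(1, 5):
    print(level, f"{float(guaranteed_usage_rate(level)):.0%}")
# 1 50%
# 2 67%
# 3 75%
# 4 80%
```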
- The guaranteed usage rate is increased as the hierarchy becomes higher. With this configuration, the guaranteed usage rate can be changed more flexibly than in the configuration in which the division method used by the division unit 104 is switched at a hierarchy having a certain update frequency, and the processing speed can be improved.
- FIG. 6 is a block diagram showing a configuration of a data management apparatus according to the third embodiment of the present invention.
- the data management apparatus 10 further includes first correction means 106 that determines a division threshold for each adjacent block.
- FIG. 7 is a flowchart showing the flow of division processing in the third embodiment of the present invention.
- the first correction unit 106 determines a division threshold according to the distance between the target block and the adjacent block (S402).
- The first correction means 106 sets a smaller division threshold the farther the adjacent block is from the target block.
- For example, the division threshold can be set to "100 − c × i % (where c is an arbitrary constant)" using the counter i that identifies the adjacent block. As a result, the conditions under which block division occurs can be adjusted for each adjacent block.
- The dividing unit 104 calculates the dividing positions of the existing entries based on the division threshold of each adjacent block before dividing the existing entries among the blocks (S404). For example, the dividing unit 104 divides the existing entries among the blocks based on the (i+1)-quantile positions. When dividing at the (i+1)-quantile positions would exceed the division threshold of a certain block, the dividing unit 104 assigns more existing entries to the blocks having larger division thresholds, adjusting the allocation so that no block exceeds its division threshold.
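The distance-dependent threshold can be sketched as follows, assuming it has the linear form 100 − c × i percent, which is consistent with the statement that the threshold becomes smaller for blocks farther from the target block. Treating i = 0 as the target block itself and clamping the result at 0% are assumptions added for illustration, as is the function name.

```python
def division_threshold(i: int, c: float) -> float:
    """Division threshold in percent for the i-th adjacent block.

    i = 0 is taken to mean the target block itself (assumption);
    the clamp at 0% is also an assumption, not stated in the text.
    """
    return max(100.0 - c * i, 0.0)


# With c = 10, thresholds shrink by 10 points per step away from the target.
print([division_threshold(i, 10) for i in range(4)])
# [100.0, 90.0, 80.0, 70.0]
```

A larger c makes distant blocks count as "full" earlier, so a new block is added sooner instead of shifting entries across many neighbors.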
- FIG. 8 is a diagram illustrating an example of entry transition when there is no division threshold correction according to the distance between adjacent blocks.
- FIG. 9 is a diagram illustrating an example of entry transition when there is correction of a division threshold according to the distance between adjacent blocks.
- the description will be made assuming that the maximum number of entries in one block is 36, the guaranteed usage rate is 2/3, and a new entry is inserted only in the leftmost block.
- the division method used by the upper layer division unit 104 has a higher guaranteed usage rate
- the division method used by the lower layer division unit 104 has a lower guaranteed usage rate.
- the division threshold is determined according to the distance between the target block and the adjacent block.
- FIG. 10 is a block diagram showing a configuration of a data management apparatus according to the fourth embodiment of the present invention.
- the data management apparatus 10 further includes a second correction unit 108 that determines a division threshold value for each layer.
- FIG. 11 is a flowchart showing the flow of insertion processing in the fourth embodiment of the present invention.
- the second correction unit 108 determines a division threshold according to the hierarchy to be processed (S502).
- The second correction unit 108 sets a smaller division threshold for lower hierarchies, which have a higher update frequency.
- the same effects as those in the first to third embodiments can be obtained.
- In this embodiment, a configuration is adopted in which the division threshold is determined for each hierarchy. With this configuration, it is possible to flexibly optimize the guaranteed usage rate according to the update frequency of each layer. Therefore, the processing speed can be improved as compared with the case where the same division threshold is used in all layers.
- Even when the dividing method used by the dividing means 104 for each layer is other than the dividing method described in the second embodiment, it is sufficient that the guaranteed usage rate of the dividing method used by the dividing means 104 in an upper hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing unit 104 in a lower hierarchy.
- the effects of the present embodiment can be obtained without the first correction means 106 described in the third embodiment.
- (Appendix 1) A data management device comprising: management means for managing entries in block units in a tree-like hierarchical structure; and dividing means, provided for each hierarchy, for adding a new block when the usage rate of a block is equal to or higher than a certain division threshold at the time of inserting a new entry, and dividing the existing entries held by the block between the block and the new block, wherein, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is defined as the guaranteed usage rate, the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy is equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
- (Appendix 2) In the data management device according to Appendix 1, the update frequency of the entries is held for each hierarchy, and the dividing method used by the dividing means is the B* tree division algorithm in a hierarchy higher than the hierarchy having an update frequency equal to or higher than a certain threshold, and the B+ tree division algorithm in a hierarchy located below the hierarchy having an update frequency equal to or higher than the certain threshold.
- (Appendix 3) In the data management device according to Appendix 1, in the hierarchy of level k (k ≥ 1), the dividing means takes the block that is the insertion target of the new entry as a target block; takes as adjacent blocks the k-1 blocks counted from a starting block, the starting block being, among the blocks that are located in the same hierarchy as the target block and hold entries having key values larger than the maximum key value in the target block, the block holding the entry with the smallest key value; and, when the usage rate is equal to or greater than the division threshold in all of the target block and the adjacent blocks, adds the new block and divides all the existing entries held from the target block through the adjacent blocks among the target block, the adjacent blocks, and the new block.
- a data management device that divides the existing entries among the blocks from the target block through the adjacent blocks whose usage rates have been determined.
- a data management apparatus further comprising first threshold value correction means for determining the division threshold value according to a distance from a target block which is the block to be processed.
- a data management apparatus further comprising second threshold correction means for determining the division threshold for each hierarchy.
- (Appendix 7) A program for causing a computer to function as: means for managing entries in block units in a tree-like hierarchical structure; means for, when the usage rate of a block is equal to or greater than a certain division threshold at the time of inserting a new entry, adding a new block using dividing means provided for each hierarchy and dividing the existing entries held by the block between the block and the new block; and means for, when the value obtained by dividing the number of existing entries held by the block immediately after the division by the maximum number of entries that the block can hold is defined as the guaranteed usage rate, setting the guaranteed usage rate of the dividing method used by the dividing means of a first hierarchy to be equal to or higher than the guaranteed usage rate of the dividing method used by the dividing means of a second hierarchy located lower than the first hierarchy.
- (Appendix 8) In the data management method described in Appendix 6, the computer holds the update frequency of the entries for each hierarchy, and sets the dividing method used by the dividing means to the B* tree division algorithm in a hierarchy higher than the hierarchy having an update frequency equal to or higher than a certain threshold, and to the B+ tree division algorithm in a hierarchy located below the hierarchy having an update frequency equal to or higher than the certain threshold.
- the computer, using the dividing means in the level k (k ≥ 1) hierarchy, takes the block into which the new entry is to be inserted as the target block; among the blocks that are located in the same hierarchy as the target block and hold entries with key values larger than the maximum key value in the target block, takes the block holding the entry with the smallest key value as the starting block, and takes the k-1 blocks counted from the starting block as adjacent blocks; and, when the usage rate in all of the target block and the adjacent blocks is equal to or greater than the division threshold, adds the new block and redistributes all the existing entries, from the target block through the adjacent blocks, among the target block, the adjacent blocks, and the new block.
- divides the existing entries among the adjacent blocks whose usage rate has been determined from the target block.
- a data management method in which the computer determines the division threshold according to the distance from the target block, which is the block to be processed.
- a data management method in which the computer determines the division threshold for each hierarchy.
- (Appendix 12) The program described in Appendix 7, further causing the computer to function as: means for holding the update frequency of the entries for each hierarchy; and means for setting the dividing method used by the dividing means to a B*-tree division algorithm in hierarchies higher than the hierarchy whose update frequency is equal to or higher than a certain threshold, and to a B+-tree division algorithm in hierarchies lower than the hierarchy whose update frequency is equal to or higher than a certain threshold.
- a program for causing the computer to function as means for dividing the existing entries among the adjacent blocks for which the usage rate has been determined.
- (Appendix 14) The program described in Appendix 13, further causing the computer to function as means for determining the division threshold according to the distance from the target block, which is the block to be processed.
- (Appendix 15) The program described in any one of appendices 7 and 12 to 14, further causing the computer to function as means for determining the division threshold for each hierarchy.
- the data structure is a three-layer structure, but a structure having a different number of layers may be used.
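The group split and the guaranteed usage rate described in the appendices above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: blocks are modelled as sorted lists of integer keys, the function names (`split_group`, `redistribute`, `guaranteed_usage_rate`) are hypothetical, and rebalancing without adding a block when some block is still below the threshold is one reading of the fragmentary text, not a confirmed detail of the patent.

```python
from typing import List

Block = List[int]  # assumption: a block is a sorted list of keys


def usage_rate(block: Block, capacity: int) -> float:
    """Entries currently held divided by the maximum the block can hold."""
    return len(block) / capacity


def redistribute(blocks: List[Block], count: int) -> List[Block]:
    """Spread all entries of `blocks` (already in key order) evenly over `count` blocks."""
    entries = [e for block in blocks for e in block]
    base, extra = divmod(len(entries), count)
    out, pos = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)
        out.append(entries[pos:pos + size])
        pos += size
    return out


def split_group(blocks: List[Block], capacity: int, threshold: float = 1.0) -> List[Block]:
    """`blocks` is the target block followed by its k-1 adjacent blocks.

    If every block is at or above the division threshold, one block is added
    and the entries are spread over k+1 blocks. Otherwise (assumed reading)
    the entries are only rebalanced over the existing k blocks.
    """
    if all(usage_rate(b, capacity) >= threshold for b in blocks):
        return redistribute(blocks, len(blocks) + 1)
    return redistribute(blocks, len(blocks))


def guaranteed_usage_rate(k: int, capacity: int) -> float:
    """Smallest block fill right after k full blocks are split into k+1 blocks."""
    return ((k * capacity) // (k + 1)) / capacity
```

With a capacity of 12 entries, `guaranteed_usage_rate(1, 12)` is 1/2 (a B+-tree style one-to-two split) and `guaranteed_usage_rate(2, 12)` is 2/3 (a B*-tree style two-to-three split), which is consistent with using the larger k in the upper hierarchies.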
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Management means (102) manages entries in block units in a tree-like hierarchy, and search means of the management means (102) identifies a target block for a new entry. Insertion means of the management means (102) inserts the new entry into the target block, determines whether the usage rate of the target block once the new entry has been inserted is greater than or equal to a division threshold, and sends a block division instruction to division means (104). The division means (104) divides the target block based on the instruction. Division means (104) is provided at each level; the guaranteed usage rate of the dividing method of division means (104) located at a higher level is greater than or equal to the guaranteed usage rate of the dividing method of division means (104) located at a lower level. The division means (104) divides the existing entries between the target block and a new block.
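The insertion flow summarized in the abstract (find the target block, insert, compare the usage rate with the division threshold, then split) might look roughly like the following Python sketch. It models a single level as a non-empty list of sorted key lists and uses a plain one-to-two split; the function name `insert_entry` and its parameters are hypothetical and are not taken from the patent.

```python
import bisect
from typing import List


def insert_entry(blocks: List[List[int]], key: int,
                 capacity: int, threshold: float) -> bool:
    """Insert `key` into one level of blocks; return True if a split occurred.

    Assumptions: `blocks` is non-empty, blocks are in key order, and each
    block is a sorted list of integer keys.
    """
    # Search step: first block whose largest key is >= key, else the last block.
    idx = next((i for i, b in enumerate(blocks) if b and key <= b[-1]),
               len(blocks) - 1)
    target = blocks[idx]

    # Insertion step: keep the target block sorted.
    bisect.insort(target, key)

    # Threshold check: usage rate = entries held / maximum entries.
    if len(target) / capacity >= threshold:
        # Division step: split existing entries between the target and a new block.
        mid = len(target) // 2
        blocks[idx:idx + 1] = [target[:mid], target[mid:]]
        return True
    return False
```

For example, with `blocks = [[1, 3, 5], [7, 9]]`, a capacity of 4, and a threshold of 1.0, inserting key 4 fills the first block and triggers a split, while a subsequent insert of key 8 does not.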
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012046089 | 2012-03-02 | ||
JP2012-046089 | 2012-03-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013128788A1 (fr) | 2013-09-06 |
Family
ID=49082007
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/000187 WO2013128788A1 (en) | 2012-03-02 | 2013-01-17 | Data management device, data management method, and program |
Country Status (2)
Country | Link |
---|---|
JP (1) | JPWO2013128788A1 (fr) |
WO (1) | WO2013128788A1 (fr) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1997029426A1 (en) * | 1996-02-09 | 1997-08-14 | Sony Corporation | Information processor, file name changing method, and recording medium on which a file name changing program is recorded |
JP2010086391A (en) * | 2008-10-01 | 2010-04-15 | Internatl Business Mach Corp <Ibm> | Method for searching a tree structure |
JP2010160591A (en) * | 2009-01-07 | 2010-07-22 | Hitachi Ltd | Spatial data management device, spatial data management method, and spatial data management program |
JP2011170460A (en) * | 2010-02-16 | 2011-09-01 | Nippon Telegr & Teleph Corp <Ntt> | Information storage and retrieval method and information storage and retrieval program |
-
2013
- 2013-01-17 WO PCT/JP2013/000187 patent/WO2013128788A1/fr active Application Filing
- 2013-01-17 JP JP2014501984A patent/JPWO2013128788A1/ja active Pending
Non-Patent Citations (1)
Title |
---|
KNUTH DONALD: "The Art of Computer Programming Volume 3 Sorting and Searching Second Edition", vol. 3, article TOSHIHIRO FUKUOKA, pages: 462 - 465 * |
Also Published As
Publication number | Publication date |
---|---|
JPWO2013128788A1 (ja) | 2015-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8868926B2 (en) | Cryptographic hash database | |
EP2633413B1 (en) | Low RAM space, high-throughput persistent key-value store using secondary memory | |
US10831736B2 (en) | Fast multi-tier indexing supporting dynamic update | |
US9851917B2 (en) | Method for de-duplicating data and apparatus therefor | |
KR102034833B1 (en) | Key-value-based data access device and method using the internal parallelism of a flash storage device | |
US10698831B2 (en) | Method and apparatus for data access | |
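CN106990915B (en) | Storage resource management method based on storage medium type and weighted quota | |
KR102564170B1 (en) | Method and device for storing data objects, and computer-readable storage medium storing a computer program using the same | |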
US20140136510A1 (en) | Hybrid table implementation by using buffer pool as permanent in-memory storage for memory-resident data | |
CN108287840B (en) | Data storage and query method based on matrix hashing | |
US20120215752A1 (en) | Index for hybrid database | |
US20070050326A1 (en) | Data Storage method and data storage structure | |
WO2015152830A1 (en) | Method of maintaining data consistency | |
CN102346735A (en) | Hash lookup method for reducing hash collisions | |
CN111506604A (en) | Method, apparatus, and computer program product for accessing data | |
US10515055B2 (en) | Mapping logical identifiers using multiple identifier spaces | |
US7484068B2 (en) | Storage space management methods and systems | |
CN106599247A (en) | Method and device for merging data files in an LSM-tree structure | |
US8935508B1 (en) | Implementing pseudo content access memory | |
JP6006740B2 (en) | Index management device | |
WO2013128788A1 (en) | Data management device, data management method, and program | |
US11435926B2 (en) | Method, device, and computer program product for managing storage system | |
KR100878142B1 (en) | Method of constructing a modified B-tree index for efficient operation on flash memory | |
US9824105B2 (en) | Adaptive probabilistic indexing with skip lists | |
Zhu | SHaMBa: Reducing Bloom Filter Overhead in LSM Trees |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13754195; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2014501984; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 13754195; Country of ref document: EP; Kind code of ref document: A1 |