CN106227677A - Method for managing variable-length cache metadata - Google Patents
- Publication number
- CN106227677A
- Authority
- CN
- China
- Prior art keywords
- key
- tree
- node
- metadata
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
- G06F12/0886—Variable-length word access
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for managing variable-length cache metadata. The cache metadata is organized into a B+ tree in which each node stores several keys in addition to its own node information. In leaf nodes, each key stores the mapping from cached data on the SSD to the corresponding data on the HDD; keys in non-leaf nodes are used to locate child nodes and maintain the structure of the whole B+ tree. The invention effectively reduces the amount of data occupied by keys and achieves high insertion, deletion, and search efficiency.
Description
Technical field
The present invention relates to the field of SSD caching technology for storage systems, and specifically to a method for managing variable-length cache metadata.
Background art
In a storage system, mechanical hard disks are far slower than memory and the CPU, and their speed has improved only slowly, so they have become the bottleneck of the whole storage system. SSDs emerged to solve this problem. At present, however, SSDs remain expensive: mechanical hard disks still have a large advantage in cost per unit of capacity, and shipments of hybrid solid-state arrays still far exceed those of the emerging all-flash array market. Using an SSD as a cache is therefore a compromise that serves both high-IOPS and mass-storage requirements. For variable-length cached data, how to manage metadata efficiently, improve cache lookup and replacement efficiency, and raise the space utilization of the SSD cache device has become an important issue in SSD cache design.
In a storage system, how the variable-length cache metadata of the SSD is managed determines whether cache lookup and cache replacement can be performed efficiently and whether the space utilization of the cache device can be improved. Cache metadata management mainly involves how to manage the correspondence between cached data in the SSD device and data in the back-end HDD device effectively, how to add, delete, and look up these correspondences efficiently, and how to reduce the amount of metadata.
Summary of the invention
The technical problem to be solved by the present invention is as follows. Addressing metadata management in the SSD cache of a storage system, and taking the characteristics of SSD devices into account, the invention provides a method for managing variable-length cache metadata that reduces the amount of metadata as far as possible while remaining efficient. The method improves the efficiency of adding, deleting, and looking up metadata and reduces the amount of metadata held in memory and on the SSD cache device.
The technical solution adopted by the present invention is as follows:
A method for managing variable-length cache metadata, in which the cache metadata is managed through a B+ tree. The metadata is organized into a B+ tree in which each node stores several keys in addition to its own node information. In leaf nodes of the B+ tree, each key stores the mapping from cached data on the SSD to data on the HDD; keys in non-leaf nodes are used to find child nodes and maintain the structure of the whole B+ tree.
The maximum number of keys in each node of the B+ tree is determined by the SSD storage unit size divided by the key size; each attribute of a key is recorded in several bits.
The method divides the keys in a node into several sets. The keys within each set are kept in order, and the last set is used for adding new keys, which reduces data movement during insertion.
To delete a key, the method marks the key as invalid; once the amount of invalid data exceeds a certain value, the invalid keys in the node are deleted all at once.
The method takes one key from every cache line of the original data to form a balanced binary search tree. When looking up a key in the B+ tree, the corresponding position is first found in this auxiliary search tree, and the actual key is then searched for in the corresponding cache line.
The beneficial effects of the present invention are:
The invention uses a B+ tree to manage metadata, and the number of keys per node can be determined from the basic unit of the SSD, which improves the efficiency of reading and writing metadata. Compressed keys reduce the key size and hence the amount of metadata, which improves the space utilization of the cache device. For key insertion and deletion, keys are divided into sets and new data is inserted only into the last set, which avoids moving large amounts of data during insertion. Deleting a key only sets a mark; the key is not actually removed until the number of valid keys in a node falls below a certain value, at which point the node is processed as a whole, so insertion and deletion are both highly efficient. An efficient algorithm for finding elements in large ordered data sets addresses the low CPU cache hit rate of binary search on large data sets, so lookup is highly efficient.
Detailed description of the invention
The present invention is further described below with reference to specific embodiments:
Embodiment 1:
A method for managing variable-length cache metadata, in which the cache metadata is managed through a B+ tree. The metadata is organized into a B+ tree in which each node stores several keys in addition to its own node information. In leaf nodes of the B+ tree, each key stores the mapping from cached data on the SSD to data on the HDD (information such as the location and offset of the data on the SSD and the HDD and the size of the cached data). Keys in non-leaf nodes are used to find child nodes, that is, to maintain the structure of the whole B+ tree.
A leaf node is a concept from discrete mathematics: a node in a tree that has no child nodes (that is, whose degree is 0) is called a leaf node, or "leaf" for short. It is also called a terminal node.
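The node layout described in this embodiment can be illustrated with a minimal Python sketch. All names here (`LeafKey`, `Node`, `lookup`) and the choice of `(device, offset)` tuples as separator keys in non-leaf nodes are assumptions made for illustration; the patent does not specify field names or the separator format.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class LeafKey:
    """Hypothetical leaf key: maps a cached extent on the SSD to its HDD location."""
    hdd_device: int  # which back-end HDD the data belongs to
    hdd_offset: int  # offset of the data on that HDD
    size: int        # length of the cached extent (variable)
    ssd_offset: int  # where the cached copy lives on the SSD


@dataclass
class Node:
    is_leaf: bool
    # Leaves hold LeafKey objects; non-leaf nodes hold (device, offset) separators.
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # used only by non-leaf nodes


def lookup(node, device, offset):
    """Return the LeafKey covering (device, offset), or None on a cache miss."""
    while not node.is_leaf:
        # Assumed convention: keys[i] is the smallest position under children[i + 1].
        idx = bisect.bisect_right(node.keys, (device, offset))
        node = node.children[idx]
    for key in node.keys:
        if key.hdd_device == device and key.hdd_offset <= offset < key.hdd_offset + key.size:
            return key
    return None


# A two-leaf tree: extents starting at HDD offsets 0 and 64 on the left, 128 on the right.
leaf_a = Node(True, [LeafKey(0, 0, 8, 100), LeafKey(0, 64, 16, 200)])
leaf_b = Node(True, [LeafKey(0, 128, 32, 300)])
root = Node(False, [(0, 128)], [leaf_a, leaf_b])
```

In this sketch the non-leaf node carries no mapping data at all; it only routes a lookup to the leaf whose keys can cover the requested HDD position.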
Embodiment 2
On the basis of Embodiment 1, in this embodiment the maximum number of keys in each node of the B+ tree is determined by the SSD storage unit size (for example, 256K) divided by the key size, so reducing the key size reduces the amount of metadata. To compress the metadata as far as possible and reduce the key size, each attribute is recorded in several bits, for example the size of the cached data, the HDD device number, the offset of the cached data on the HDD, and its location on the SSD. This ensures that the required information is recorded effectively and that the required functions remain efficient, while reducing the data size, increasing the number of keys that can be stored per unit of cache device, and improving the space utilization of the cache device.
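As a rough illustration of recording each attribute in a few bits, the following Python sketch packs the four attributes named above into a single 12-byte key. The bit widths (16, 8, 40, and 32 bits), the field order, and the function names are assumptions chosen for the example; the patent does not state them. The node-capacity calculation also ignores any per-node header.

```python
# Assumed bit widths for each attribute; the patent does not specify them.
SIZE_BITS, DEV_BITS, HDD_OFF_BITS, SSD_OFF_BITS = 16, 8, 40, 32
KEY_BYTES = (SIZE_BITS + DEV_BITS + HDD_OFF_BITS + SSD_OFF_BITS) // 8  # 96 bits = 12 bytes


def pack_key(size, dev, hdd_off, ssd_off):
    """Pack the four attributes into one fixed-width key."""
    for value, bits in ((size, SIZE_BITS), (dev, DEV_BITS),
                        (hdd_off, HDD_OFF_BITS), (ssd_off, SSD_OFF_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("attribute does not fit in its bit field")
    word = size
    word = (word << DEV_BITS) | dev
    word = (word << HDD_OFF_BITS) | hdd_off
    word = (word << SSD_OFF_BITS) | ssd_off
    return word.to_bytes(KEY_BYTES, "big")


def unpack_key(raw):
    """Inverse of pack_key: returns (size, dev, hdd_off, ssd_off)."""
    word = int.from_bytes(raw, "big")
    ssd_off = word & ((1 << SSD_OFF_BITS) - 1)
    word >>= SSD_OFF_BITS
    hdd_off = word & ((1 << HDD_OFF_BITS) - 1)
    word >>= HDD_OFF_BITS
    dev = word & ((1 << DEV_BITS) - 1)
    word >>= DEV_BITS
    return word, dev, hdd_off, ssd_off


def max_keys_per_node(unit_bytes=256 * 1024, key_bytes=KEY_BYTES):
    """Maximum key count = SSD storage unit size / key size (node header ignored)."""
    return unit_bytes // key_bytes
```

Under these assumed widths, a 256 KiB storage unit holds 262144 // 12 = 21845 keys; halving the key size would double that count, which is the effect the embodiment relies on.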
Embodiment 3
On the basis of Embodiment 1 or 2, in this embodiment the keys in a node should be kept sorted in order to reduce the complexity of looking up a key, but inserting a key may then require moving a large amount of data. The method therefore divides the keys in a node into several sets. The keys within each set are kept in order, and the last set is used for adding new keys, which reduces data movement during insertion.
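The set-based insertion can be sketched in Python as follows. The class name `SetNode`, the fixed `set_capacity`, and the rule that a full last set is sealed and a new one opened are assumptions for illustration; the patent states only that each set is ordered and that new keys go into the last set.

```python
import bisect


class SetNode:
    """Keys of one node split into sorted sets; only the last set accepts inserts."""

    def __init__(self, set_capacity=4):
        self.set_capacity = set_capacity  # assumed limit before a new set is opened
        self.sets = [[]]

    def insert(self, key):
        last = self.sets[-1]
        if len(last) >= self.set_capacity:
            # Seal the full set and open a fresh one (assumed policy).
            last = []
            self.sets.append(last)
        # Sorted insert shifts keys only inside the last set.
        bisect.insort(last, key)

    def contains(self, key):
        # Each set is sorted, so every set can be binary-searched independently.
        for keys in self.sets:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return True
        return False


node = SetNode(set_capacity=4)
for k in (9, 3, 7, 1, 8, 2):
    node.insert(k)
```

The trade-off in this sketch is that an insert moves at most `set_capacity - 1` keys regardless of how many keys the node holds, while a lookup has to probe each set in turn.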
Embodiment 4
On the basis of Embodiment 3, the key deletion process in this embodiment does not actually delete a key; it only marks the key as invalid. Once the amount of invalid data exceeds a certain value, the invalid keys in the node are deleted all at once, which achieves efficient deletion.
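A minimal Python sketch of mark-then-compact deletion is shown below. The class name `LazyNode`, the `max_invalid` threshold, and the use of a Python set as a stand-in for a per-key invalid bit are assumptions; the patent says only that deletion is deferred until the invalid amount exceeds "a certain value".

```python
class LazyNode:
    """Deletes only mark a key invalid; the node is compacted once enough marks pile up."""

    def __init__(self, keys, max_invalid=2):
        self.keys = sorted(keys)
        self.invalid = set()            # stand-in for a per-key "invalid" bit
        self.max_invalid = max_invalid  # assumed threshold

    def delete(self, key):
        if key in self.keys and key not in self.invalid:
            self.invalid.add(key)       # mark only; no keys are moved
        if len(self.invalid) > self.max_invalid:
            self.compact()

    def compact(self):
        # Drop every invalid key in a single pass over the node.
        self.keys = [k for k in self.keys if k not in self.invalid]
        self.invalid.clear()

    def live_keys(self):
        return [k for k in self.keys if k not in self.invalid]


node = LazyNode([1, 2, 3, 4, 5], max_invalid=2)
node.delete(2)
node.delete(4)
marked_only = (list(node.keys), node.live_keys())  # snapshot before compaction
node.delete(1)                                     # third mark exceeds the threshold
```

After the first two deletes the stored keys are unchanged and only the marks differ; the third delete triggers one compaction pass that removes all three invalid keys together.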
Embodiment 5
This embodiment concerns an algorithm for finding an element in a very large ordered data set. The usual way to find an element in an ordered set is binary search, but when the ordered set is very large, each bisection step jumps to the midpoint of the remaining range, so the CPU cache hit rate is low and lookup is inefficient. When looking up a key in the B+ tree, the final performance bottleneck is precisely the efficiency of searching a very large ordered set. Therefore, on the basis of Embodiment 4, the method of this embodiment takes one key from every cache line of the original data to form a new set. The elements of this set are stored in an array in the order of an in-order binary tree traversal, and the array constitutes a balanced binary search tree. During a lookup, the corresponding position is first found in this auxiliary search tree, and the actual key is then searched for in the corresponding cache line, which achieves efficient lookup.
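The two-step lookup can be sketched in Python as follows. This is a simplified illustration under stated assumptions: keys are plain integers, a cache line is assumed to hold 8 keys (`KEYS_PER_LINE`), and the auxiliary tree is represented by the sorted array of sampled keys, which is the in-order traversal of a balanced binary search tree, searched with `bisect`. The function names are hypothetical, and Python lists do not model real cache-line placement; the sketch shows only the lookup logic.

```python
import bisect

KEYS_PER_LINE = 8  # assumed: e.g. a 64-byte cache line holding 8-byte keys


def build_aux(sorted_keys, keys_per_line=KEYS_PER_LINE):
    """Take the first key of every cache-line-sized group of the sorted data."""
    return sorted_keys[::keys_per_line]


def find(sorted_keys, aux, target, keys_per_line=KEYS_PER_LINE):
    """Return the index of target in sorted_keys, or -1 if it is absent."""
    # Step 1: binary search over the small sample array only.
    line = bisect.bisect_right(aux, target) - 1
    if line < 0:
        return -1  # target is smaller than every stored key
    # Step 2: scan the single cache line that can contain the target.
    start = line * keys_per_line
    end = min(start + keys_per_line, len(sorted_keys))
    for i in range(start, end):
        if sorted_keys[i] == target:
            return i
    return -1


keys = list(range(0, 200, 2))  # 100 sorted keys: 0, 2, ..., 198
aux = build_aux(keys)          # 13 sampled keys: 0, 16, 32, ...
```

Because the sample array is `KEYS_PER_LINE` times smaller than the full key array, the bisection steps touch far less memory, and the full data is read only within one cache line at the end.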
The embodiments are intended only to illustrate the present invention and not to limit it. A person of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.
Claims (5)
1. A method for managing variable-length cache metadata, characterized in that: the method manages the cache metadata through a B+ tree; the metadata is organized into a B+ tree in which each node stores several keys in addition to its own node information; in leaf nodes of the B+ tree, each key stores the mapping from cached data on the SSD to data on the HDD; and keys in non-leaf nodes are used to find child nodes and maintain the structure of the whole B+ tree.
2. The method for managing variable-length cache metadata according to claim 1, characterized in that: the maximum number of keys in each node of the B+ tree is determined by the SSD storage unit size divided by the key size, and each attribute is recorded in several bits.
3. The method for managing variable-length cache metadata according to claim 1 or 2, characterized in that: the method divides the keys in a node into several sets, the keys within each set are kept in order, and the last set is used for adding new keys, so as to reduce data movement during insertion.
4. The method for managing variable-length cache metadata according to claim 3, characterized in that: the key deletion process marks the key as invalid, and once the amount of invalid data exceeds a certain value, the invalid keys in the node are deleted all at once.
5. The method for managing variable-length cache metadata according to claim 4, characterized in that: the method takes one key from every cache line of the original data to form a balanced binary search tree; when looking up a key in the B+ tree, the corresponding position is first found in this auxiliary search tree, and the actual key is then searched for in the corresponding cache line.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610571927.XA CN106227677B (en) | 2016-07-20 | 2016-07-20 | Method for managing variable-length cache metadata |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610571927.XA CN106227677B (en) | 2016-07-20 | 2016-07-20 | Method for managing variable-length cache metadata |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106227677A true CN106227677A (en) | 2016-12-14 |
CN106227677B CN106227677B (en) | 2018-11-20 |
Family
ID=57531098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610571927.XA Active CN106227677B (en) | 2016-07-20 | 2016-07-20 | Method for managing variable-length cache metadata |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106227677B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120317338A1 (en) * | 2011-06-09 | 2012-12-13 | Beijing Fastweb Technology Inc. | Solid-State Disk Caching the Top-K Hard-Disk Blocks Selected as a Function of Access Frequency and a Logarithmic System Time |
CN102364474A (en) * | 2011-11-17 | 2012-02-29 | 中国科学院计算技术研究所 | Metadata storage system for cluster file system and metadata management method |
CN102521386A (en) * | 2011-12-22 | 2012-06-27 | 清华大学 | Method for grouping space metadata based on cluster storage |
CN103020299A (en) * | 2012-12-29 | 2013-04-03 | 天津南大通用数据技术有限公司 | Storage method and device for inverted indexes and appended data in full-text search |
CN103294786A (en) * | 2013-05-17 | 2013-09-11 | 华中科技大学 | Metadata organization and management method and system of distributed file system |
CN104408128A (en) * | 2014-11-26 | 2015-03-11 | 上海爱数软件有限公司 | Read optimization method for asynchronously updating indexes based on B+ tree |
CN105117415A (en) * | 2015-07-30 | 2015-12-02 | 西安交通大学 | Optimized SSD data updating method |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11586629B2 (en) | 2017-08-17 | 2023-02-21 | Samsung Electronics Co., Ltd. | Method and device of storing data object |
CN107861841A (en) * | 2017-11-07 | 2018-03-30 | 郑州云海信息技术有限公司 | The management method and system that data map in a kind of SSD Cache |
CN107861841B (en) * | 2017-11-07 | 2022-04-22 | 郑州云海信息技术有限公司 | Management method and system for data mapping in SSD (solid State disk) Cache |
CN109522243A (en) * | 2018-10-22 | 2019-03-26 | 郑州云海信息技术有限公司 | Metadata cache management method, device and storage medium in a kind of full flash memory storage |
CN109522243B (en) * | 2018-10-22 | 2021-11-19 | 郑州云海信息技术有限公司 | Metadata cache management method and device in full flash storage and storage medium |
CN109271570A (en) * | 2018-10-30 | 2019-01-25 | 郑州云海信息技术有限公司 | A kind of method of metadata management inquiry |
CN109299111A (en) * | 2018-11-14 | 2019-02-01 | 郑州云海信息技术有限公司 | A kind of metadata query method, apparatus, equipment and computer readable storage medium |
CN110134340A (en) * | 2019-05-23 | 2019-08-16 | 苏州浪潮智能科技有限公司 | A kind of method, apparatus of metadata updates, equipment and storage medium |
CN110134340B (en) * | 2019-05-23 | 2020-03-06 | 苏州浪潮智能科技有限公司 | Method, device, equipment and storage medium for updating metadata |
CN110532201A (en) * | 2019-08-23 | 2019-12-03 | 北京浪潮数据技术有限公司 | A kind of metadata processing method and device |
CN110532201B (en) * | 2019-08-23 | 2021-08-31 | 北京浪潮数据技术有限公司 | Metadata processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN106227677B (en) | 2018-11-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||