CN106227677A - Method for managing variable-length cache metadata - Google Patents


Info

Publication number
CN106227677A
Authority
CN
China
Prior art keywords
key
tree
node
metadata
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610571927.XA
Other languages
Chinese (zh)
Other versions
CN106227677B (en)
Inventor
刘如意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201610571927.XA
Publication of CN106227677A
Application granted
Publication of CN106227677B
Current legal status: Active

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877 Cache access modes
    • G06F12/0886 Variable-length word access

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for managing variable-length cache metadata. The cache metadata is organized into, and managed through, a B+ tree. In addition to its own node information, each node of the B+ tree stores a number of keys. In leaf nodes, each key stores the mapping from cached data on the SSD to the corresponding data on the HDD; in non-leaf nodes, the keys are used to locate child nodes and to maintain the structure of the whole B+ tree. The invention effectively reduces the amount of data occupied by keys and achieves high insertion, deletion, and lookup efficiency.

Description

A method for managing variable-length cache metadata
Technical field
The present invention relates to the field of SSD caching in storage systems, and in particular to a method for managing variable-length cache metadata.
Background
In a storage system, mechanical hard disks are far slower than memory and the CPU, and their speed has improved only slowly, so they have become the bottleneck of the whole storage system. SSDs emerged to address the slowness of mechanical hard disks. However, SSDs are currently expensive, mechanical hard disks still hold a large advantage in cost per unit of capacity, and shipments of hybrid solid-state arrays still far exceed those of the emerging all-flash array market. Using an SSD as a cache has therefore become a compromise that balances high IOPS against mass-storage requirements. For variable-length cached data, how to manage metadata efficiently, and how to improve cache lookup efficiency, cache replacement efficiency, and the space utilization of the SSD cache device, have become major issues in SSD cache design.
In a storage system, the way the SSD's variable-length cache metadata is managed determines whether efficient cache lookup, efficient cache replacement, and high cache-device space utilization can be achieved. Cache metadata management mainly concerns how to manage effectively the correspondence between cached data on the SSD device and data on the back-end HDD device, how to add, delete, and look up these correspondences efficiently, and how to reduce the volume of metadata.
Summary of the invention
The technical problem to be solved by the present invention is as follows. To address metadata management in the SSD cache of a storage system, and to reduce the volume of metadata as far as possible while preserving efficiency and taking the characteristics of SSD devices into account, the invention provides a method for managing variable-length cache metadata. The method improves the efficiency of adding, deleting, and looking up metadata, and reduces the volume of metadata held in memory and on the SSD cache device.
The technical solution adopted by the present invention is as follows.
A method for managing variable-length cache metadata: the method manages the cache metadata through a B+ tree, organizing the metadata into a B+ tree. In addition to its own node information, each node of the B+ tree stores a number of keys. In leaf nodes of the B+ tree, each key stores the mapping from cached data on the SSD to data on the HDD; keys in non-leaf nodes are used to locate child nodes and to maintain the structure of the whole B+ tree.
In each node of the B+ tree, the maximum number of keys is determined by the SSD storage unit size divided by the key size, and each attribute of a key is recorded using a small number of bits.
The method divides the keys in a node into several sets. The keys within each set are kept sorted, and the last set is used to add new keys, which reduces data movement during insertion.
To delete a key, the key is marked invalid; once the amount of invalid data exceeds a certain value, the invalid keys in the node are removed together.
The method takes one key per cache line from the original data to form a balanced binary search tree. To look up a key in the B+ tree, the corresponding position is first found in this auxiliary search tree, and the actual key is then searched for in the corresponding cache line.
The beneficial effects of the present invention are as follows.
The invention uses a B+ tree to manage metadata, and the number of keys per node can be determined from the basic unit of the SSD, which improves the efficiency of reading and writing metadata. Compressed keys reduce the key size and therefore the volume of metadata, improving the space utilization of the cache device. For insertion and deletion, keys are divided into sets and new data is inserted only into the last set, which avoids moving large amounts of data during insertion. Deleting a key only marks it; the key is not actually removed until the number of valid keys in the node falls below a certain value, at which point the node is processed as a whole, so both insertion and deletion are efficient. An efficient algorithm for finding an element in a large sorted data set addresses the low CPU cache hit rate of binary search on large data sets, so lookup is also efficient.
Detailed description of the embodiments
The present invention is further described below with reference to specific embodiments:
Embodiment 1:
A method for managing variable-length cache metadata: the method manages the cache metadata through a B+ tree, organizing the metadata into a B+ tree. In addition to its own node information, each node of the B+ tree stores a number of keys. In leaf nodes of the B+ tree, each key stores the mapping from cached data on the SSD to data on the HDD (information such as the position and offset of the data on the SSD and the HDD, and the size of the cached data). Keys in non-leaf nodes are used to locate child nodes, that is, to maintain the structure of the whole B+ tree.
A leaf node is a concept from discrete mathematics. A node of a tree that has no child nodes (that is, whose degree is 0) is called a leaf node, or "leaf" for short; it is also called a terminal node.
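The structure described in this embodiment can be illustrated with a minimal sketch. The names `LeafKey`, `Node`, and `lookup`, the choice of fields, and the use of the HDD offset as the search key are assumptions made for illustration; the patent does not specify a concrete layout.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LeafKey:
    """Leaf-level key: maps a cached range on the SSD to its origin on the HDD."""
    hdd_offset: int  # start of the range on the HDD (hypothetical unit)
    size: int        # length of the cached range
    ssd_offset: int  # where the cached copy lives on the SSD


@dataclass
class Node:
    is_leaf: bool
    # Non-leaf: separator keys. Leaf: start offsets of the entries, in sorted order.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)   # non-leaf nodes only
    entries: List[LeafKey] = field(default_factory=list)   # leaf nodes only


def lookup(root: Node, hdd_offset: int) -> Optional[LeafKey]:
    """Descend via non-leaf keys, then find the leaf key covering hdd_offset."""
    node = root
    while not node.is_leaf:
        # Non-leaf keys serve only to choose the child to descend into.
        node = node.children[bisect_right(node.keys, hdd_offset)]
    i = bisect_right(node.keys, hdd_offset) - 1
    if i >= 0:
        entry = node.entries[i]
        if entry.hdd_offset <= hdd_offset < entry.hdd_offset + entry.size:
            return entry
    return None
```

In this sketch a lookup that lands in a gap between two cached ranges returns `None`, which would correspond to a cache miss.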
Embodiment 2
On the basis of embodiment 1, in this embodiment the maximum number of keys in each node of the B+ tree is determined by the SSD storage unit size (for example 256K) divided by the key size, so reducing the key size reduces the volume of metadata. To compress the metadata and reduce the key size as far as possible, each attribute is recorded using a small number of bits, for example the cached data size, the HDD device number, the offset of the cached data on the HDD, and its position on the SSD. This ensures that the required information is recorded effectively and that the required functions remain efficient, while also reducing the data size, increasing the number of keys that can be stored per unit of cache device space, and improving the space utilization of the cache device.
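As a rough illustration of this embodiment, the sketch below packs the four attributes into a fixed-size key and computes how many such keys fit in one storage unit. The 16-byte key size and the individual bit widths are assumptions chosen so that the fields sum to 128 bits; the patent gives only the 256K example and does not state field widths. Space taken by the node's own information is ignored here.

```python
SSD_UNIT = 256 * 1024  # example storage unit from the text (256K), in bytes
KEY_BYTES = 16         # assumed packed key size

# (field name, width in bits); the widths are assumptions and sum to 128 bits
FIELDS = [("size", 16), ("hdd_dev", 12), ("hdd_offset", 52), ("ssd_offset", 48)]


def max_keys_per_node(unit: int = SSD_UNIT, key_bytes: int = KEY_BYTES) -> int:
    """Maximum key count = storage unit size / key size."""
    return unit // key_bytes


def pack_key(**values: int) -> bytes:
    """Pack each attribute into its bit field and return the fixed-size key."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word.to_bytes(KEY_BYTES, "little")


def unpack_key(raw: bytes) -> dict:
    """Inverse of pack_key: split the key back into its attributes."""
    word = int.from_bytes(raw, "little")
    out = {}
    for name, width in FIELDS:
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out
```

Under these assumed sizes, `max_keys_per_node()` gives 16384; halving the key size would double that figure, which is the relationship the embodiment relies on.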
Embodiment 3
On the basis of embodiment 1 or 2, in this embodiment the keys in a node are kept sorted in order to reduce the complexity of looking up a key. Inserting a key into a fully sorted node may therefore require moving a large amount of data. To avoid this, the method divides the keys in a node into several sets. The keys within each set are kept sorted, and the last set is used to add new keys, which reduces data movement during insertion.
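A minimal sketch of the set-based layout follows. The class name `SetNode`, the `max_set_len` limit, and the rule that a full last set is sealed and a new one opened are assumptions; the patent states only that each set is sorted and that new keys go into the last set.

```python
from bisect import bisect_left, insort


class SetNode:
    """Keys of one node, split into several individually sorted sets."""

    def __init__(self, max_set_len: int = 4):
        self.sets = [[]]                 # the last list is the open set
        self.max_set_len = max_set_len   # assumed limit before a new set is opened

    def insert(self, key: int) -> None:
        # Only the (small) last set is touched, so at most max_set_len - 1
        # keys have to move, regardless of how many keys the node holds.
        if len(self.sets[-1]) >= self.max_set_len:
            self.sets.append([])
        insort(self.sets[-1], key)

    def contains(self, key: int) -> bool:
        # Each set is sorted, so each one can be binary searched.
        for keys in self.sets:
            i = bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return True
        return False
```

The trade-off in this sketch is that a lookup has to search every set, so an implementation would presumably keep the number of sets small or merge them from time to time.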
Embodiment 4
On the basis of embodiment 3, in this embodiment deleting a key does not actually remove it. Instead, the key is marked invalid; once the amount of invalid data exceeds a certain value, the invalid keys in the node are removed together, which makes deletion efficient.
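The mark-then-compact deletion can be sketched as follows. The class name `LazyNode`, the per-key validity flag, and the `max_invalid` threshold are illustrative assumptions; the patent says only that invalid keys are removed together once their amount exceeds "a certain value".

```python
class LazyNode:
    """Keys of one node with deferred (mark-only) deletion."""

    def __init__(self, keys, max_invalid: int = 2):
        self.entries = [[k, True] for k in sorted(keys)]  # [key, is_valid]
        self.invalid = 0
        self.max_invalid = max_invalid  # assumed compaction threshold

    def delete(self, key) -> bool:
        for entry in self.entries:
            if entry[0] == key and entry[1]:
                entry[1] = False        # mark only; nothing is moved
                self.invalid += 1
                break
        else:
            return False                # key absent or already deleted
        if self.invalid > self.max_invalid:
            self.compact()
        return True

    def compact(self) -> None:
        """Drop all invalid keys of the node in a single pass."""
        self.entries = [e for e in self.entries if e[1]]
        self.invalid = 0

    def live_keys(self):
        return [k for k, valid in self.entries if valid]
```

With `max_invalid=2`, the first two deletions leave the stored entries untouched, and the third triggers one compaction that removes all three marked keys at once.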
Embodiment 5
This embodiment concerns an algorithm for finding an element in a very large sorted data set. The usual way to find an element in a sorted set is binary search, but when the set is very large, binary search repeatedly jumps to the midpoint of the remaining range, so the CPU cache hit rate is low and lookup is inefficient. When looking up a key in the B+ tree, the final performance bottleneck is the efficiency of searching within one very large sorted set. To address this, on the basis of embodiment 4, the method of this embodiment takes one key per cache line from the original data to form a new set. The elements of this set are stored in an array following the in-order traversal of a binary tree, so that the array constitutes a balanced binary search tree. During lookup, the corresponding position is first found in this auxiliary search tree, and the actual key is then searched for in the corresponding cache line, which makes lookup efficient.
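One possible reading of this auxiliary tree is sketched below. It assumes 8 keys per cache line, unique keys, and an implicit array layout in which the children of slot `i` sit at `2*i` and `2*i + 1`; the slots are filled by an in-order walk over the sorted samples so that the array forms a complete, and hence balanced, binary search tree. These layout details are assumptions; the patent does not spell them out.

```python
KEYS_PER_LINE = 8  # assumed: e.g. a 64-byte cache line holding 8-byte keys


def build_aux(sorted_keys, per_line: int = KEYS_PER_LINE):
    """Sample the first key of every cache line into an implicit balanced BST."""
    samples = sorted_keys[::per_line]
    n = len(samples)
    tree = [None] * (n + 1)   # 1-based; slot i has children 2*i and 2*i + 1
    lines = iter(range(n))

    def fill(i: int) -> None:
        # In-order walk of the implicit tree assigns samples in sorted order.
        if i <= n:
            fill(2 * i)
            line = next(lines)
            tree[i] = (samples[line], line)
            fill(2 * i + 1)

    fill(1)
    return tree


def search(sorted_keys, tree, target, per_line: int = KEYS_PER_LINE) -> int:
    """Return the index of target in sorted_keys, or -1 if it is absent."""
    i, line = 1, None
    while i < len(tree):
        key, idx = tree[i]
        if key <= target:
            line = idx        # last cache line that starts at or before target
            i = 2 * i + 1
        else:
            i = 2 * i
    if line is None:
        return -1             # target is smaller than every key
    start = line * per_line
    for pos in range(start, min(start + per_line, len(sorted_keys))):
        if sorted_keys[pos] == target:   # scan stays within one cache line
            return pos
    return -1
```

The auxiliary tree is smaller than the full key array by a factor of `per_line`, so most of the descent touches a compact structure and only the final scan reads the original data.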
The embodiments merely illustrate the present invention and do not limit it. A person of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the invention. All equivalent technical solutions therefore also fall within the scope of the invention, and the scope of patent protection of the invention shall be defined by the claims.

Claims (5)

1. the method for an elongated cache metadata management, it is characterised in that: described method manages caching by a B+ tree Metadata, metadata organization is become a B+ tree, in B+ tree, each node also deposits several in addition to this nodal information Key, for the leaf node of B+ tree, houses in SSD data cached to the mapping relations of data in HDD in key, non-leaf saves Key in point is used for finding child node, maintains the structure of whole B+ tree.
The method of a kind of elongated cache metadata the most according to claim 1 management, it is characterised in that: each joint of B+ tree In point, the maximum quantity of Key is determined by SSD memory element/key size, each by using several bit positions to carry out labelling Individual attribute.
The method of a kind of elongated cache metadata the most according to claim 1 and 2 management, it is characterised in that: described method By being divided into several to gather key in a node, the key in each set ensures that last set is used for increasing in order Add new key, move reducing the data of insertion process.
The method of a kind of elongated cache metadata the most according to claim 3 management, it is characterised in that: described key deletes Except process, be this key of labelling be invalid, until invalid data amount exceedes certain value, more overall delete in a node node Invalid key.
The method of a kind of elongated cache metadata the most according to claim 4 management, it is characterised in that: described method is passed through In initial data, take a key every a cache lines form a balanced binary search tree, B+ tree is searched a key Time, in this assisted lookup tree, first find corresponding position, then search real key in corresponding cache line.
CN201610571927.XA 2016-07-20 2016-07-20 Method for managing variable-length cache metadata Active CN106227677B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610571927.XA CN106227677B (en) 2016-07-20 2016-07-20 Method for managing variable-length cache metadata

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610571927.XA CN106227677B (en) 2016-07-20 2016-07-20 Method for managing variable-length cache metadata

Publications (2)

Publication Number Publication Date
CN106227677A true CN106227677A (en) 2016-12-14
CN106227677B CN106227677B (en) 2018-11-20

Family

ID=57531098

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610571927.XA Active CN106227677B (en) 2016-07-20 2016-07-20 Method for managing variable-length cache metadata

Country Status (1)

Country Link
CN (1) CN106227677B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102364474A (en) * 2011-11-17 2012-02-29 中国科学院计算技术研究所 Metadata storage system for cluster file system and metadata management method
CN102521386A (en) * 2011-12-22 2012-06-27 清华大学 Method for grouping space metadata based on cluster storage
US20120317338A1 (en) * 2011-06-09 2012-12-13 Beijing Fastweb Technology Inc. Solid-State Disk Caching the Top-K Hard-Disk Blocks Selected as a Function of Access Frequency and a Logarithmic System Time
CN103020299A (en) * 2012-12-29 2013-04-03 天津南大通用数据技术有限公司 Storage method and device for inverted indexes and appended data in full-text search
CN103294786A (en) * 2013-05-17 2013-09-11 华中科技大学 Metadata organization and management method and system of distributed file system
CN104408128A (en) * 2014-11-26 2015-03-11 上海爱数软件有限公司 Read optimization method for asynchronously updating indexes based on B+ tree
CN105117415A (en) * 2015-07-30 2015-12-02 西安交通大学 Optimized SSD data updating method

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11586629B2 (en) 2017-08-17 2023-02-21 Samsung Electronics Co., Ltd. Method and device of storing data object
CN107861841A (en) * 2017-11-07 2018-03-30 郑州云海信息技术有限公司 The management method and system that data map in a kind of SSD Cache
CN107861841B (en) * 2017-11-07 2022-04-22 郑州云海信息技术有限公司 Management method and system for data mapping in SSD (solid State disk) Cache
CN109522243A (en) * 2018-10-22 2019-03-26 郑州云海信息技术有限公司 Metadata cache management method, device and storage medium in a kind of full flash memory storage
CN109522243B (en) * 2018-10-22 2021-11-19 郑州云海信息技术有限公司 Metadata cache management method and device in full flash storage and storage medium
CN109271570A (en) * 2018-10-30 2019-01-25 郑州云海信息技术有限公司 A kind of method of metadata management inquiry
CN109299111A (en) * 2018-11-14 2019-02-01 郑州云海信息技术有限公司 A kind of metadata query method, apparatus, equipment and computer readable storage medium
CN110134340A (en) * 2019-05-23 2019-08-16 苏州浪潮智能科技有限公司 A kind of method, apparatus of metadata updates, equipment and storage medium
CN110134340B (en) * 2019-05-23 2020-03-06 苏州浪潮智能科技有限公司 Method, device, equipment and storage medium for updating metadata
CN110532201A (en) * 2019-08-23 2019-12-03 北京浪潮数据技术有限公司 A kind of metadata processing method and device
CN110532201B (en) * 2019-08-23 2021-08-31 北京浪潮数据技术有限公司 Metadata processing method and device

Also Published As

Publication number Publication date
CN106227677B (en) 2018-11-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant