CN104951403A - Low-overhead and error-free cold and hot data recognition method - Google Patents
- Publication number
- CN104951403A (application CN201510395697.1A)
- Authority
- CN
- China
- Prior art keywords
- lru
- data
- request
- item
- lpn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a low-overhead and error-free cold and hot data recognition method. The method is characterized by comprising storage structure design, an address grouping process, an aging mechanism and a cold and hot recognition process of a data page. According to the method, cold and hot data can be accurately and effectively recognized with smaller space-time overhead, and the method can be easily extended to a finer-granularity multi-level cold and hot data recognition method. Compared with a conventional cold and hot data recognition method, the method can guarantee lower overhead in the operating time and smaller space overhead, can prevent misrecognition of hot data and is applicable to deployment in an existing storage system, and the whole performance of the system is greatly improved.
Description
Technical field
The invention belongs to the technical field of computer data storage, and specifically relates to a low-overhead, error-free method that identifies hot and cold data by means of grouping.
Background art
Real-world workloads usually exhibit strong data access locality: some data is accessed frequently and is called hot data, while other data is accessed rarely or almost never and is called cold data. Taking data temperature into account in the design of a modern storage system, that is, identifying hot and cold data and placing them separately, can effectively improve the overall performance of the storage system. The key to such a design is an effective and lightweight (low time and space overhead) method for identifying hot and cold data. The method based on multiple hash mappings described in ACM Transactions on Storage (TOS), vol. 2, no. 1, 2006, pp. 22-40, is the existing identification method with comparatively low time and space overhead. However, that method produces a certain amount of misidentification (data that is accessed rarely or almost never is identified as hot data), and its misidentification rate depends on the workload. This makes it unsuitable for some workloads and greatly weakens the performance gain that a hot/cold identification and separation mechanism is meant to bring to the storage system.
Summary of the invention
The object of the invention is to propose a low-overhead, error-free method for identifying hot and cold data that overcomes the above defect of existing methods and identifies hot and cold data effectively while keeping time and space overhead low.
The low-overhead, error-free hot and cold data identification method of the invention is characterized by comprising the following steps:
Step 1: storage structure design
A group of lists, each holding a fixed number of metadata items, is used to record the access history of data pages. The number of lists is denoted K, and the number of metadata items in each list is denoted N. Each metadata item records the logical page number (lpn) of a data page and the access count (counter) of the data page corresponding to that logical page number, which occupy 32 bits and 4 bits of storage respectively. When a list is full and a new metadata item needs to be added, the least recently used (LRU) algorithm is used to replace a metadata item in the list; each list is therefore called an LRU table.
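The storage structure described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `MetaItem`, `make_tables` and `EMPTY_LPN` are hypothetical, and the use of -1 as the marker for an unused slot follows the initialization described in the embodiment.

```python
from dataclasses import dataclass

EMPTY_LPN = -1  # marker for an unused slot (the embodiment initializes lpn to -1)


@dataclass
class MetaItem:
    """One metadata item: a 32-bit logical page number and a 4-bit access count."""
    lpn: int = EMPTY_LPN
    counter: int = 0


def make_tables(k: int, n: int) -> list[list[MetaItem]]:
    """Create K LRU tables of N metadata items each; index 0 is the table head."""
    if k <= 0 or n <= 0:
        raise ValueError("k and n must be positive")
    return [[MetaItem() for _ in range(n)] for _ in range(k)]


tables = make_tables(k=256, n=4)
print(len(tables), len(tables[0]), tables[0][0])
```

A plain Python list is used for each table because N is small and fixed; a packed bit layout would be needed to actually reach the 36 bits per item stated in the text.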
Step 2: address grouping
The whole logical address space of the storage system is mapped by a hash function f(x) = x % K, so that the metadata items for different logical addresses are grouped and stored in different LRU tables, which groups the logical page numbers. Here K is the number of LRU tables, x is the logical page number of a data page, and % is the modulo operation.
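As a small illustration of the grouping function f(x) = x % K, the sketch below buckets a handful of logical page numbers. The function names and the sample values are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def group_of(lpn: int, k: int) -> int:
    """Index of the LRU table for a logical page number: f(x) = x % K."""
    if k <= 0:
        raise ValueError("k must be positive")
    return lpn % k


def group_lpns(lpns, k):
    """Bucket logical page numbers by the LRU table they map to."""
    groups = defaultdict(list)
    for lpn in lpns:
        groups[group_of(lpn, k)].append(lpn)
    return dict(groups)


print(group_lpns([0, 1, 4, 5, 9], k=4))  # {0: [0, 4], 1: [1, 5, 9]}
```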
Step 3: hot/cold identification of a data page
When an access request arrives, a hash value is first computed from the logical page number of the request to determine the LRU table it belongs to, and that LRU table is then searched for the metadata item of this logical page number. If the item exists, its access count is incremented by 1 and then compared with a predefined threshold: if the count is greater than the threshold, the requested data is treated as hot data; otherwise it is treated as cold data. By setting multiple thresholds and comparing the access count of the request with each of them, a finer-grained, multi-level hot/cold identification can be realized.
Before the identification result is returned, the corresponding LRU table is updated. If the logical page number of the request exists in the corresponding LRU table, its metadata item is moved to the head of the table to preserve the LRU property of the table. If it does not exist, the table is searched from head to tail for a metadata item whose access count is 0. If such an item is found, the metadata items in front of it are each moved one position toward the tail, and the metadata item of the request is inserted at the head of the LRU table with its access count set to 1. If no such item is found, the item at the tail of the LRU table is discarded with a probability of 50%, the other metadata items in the table are each moved one position toward the tail, and the metadata item of the request is inserted at the head of the LRU table with its access count set to 1.
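The identification and update procedure above can be sketched as a single function. This is an illustration under stated assumptions, not the patent's implementation: the name `identify`, the `[lpn, counter]` list layout and the injectable `rng` are hypothetical; the comparison uses "greater than or equal to", which is how the embodiment treats a count equal to the threshold even though the summary says "greater than"; and when the 50% eviction is not taken, the request is assumed to go unrecorded, which the text does not state explicitly.

```python
import random

HOT_THRESHOLD = 4   # threshold value used in the embodiment
MAX_COUNTER = 15    # largest value a 4-bit counter can hold


def identify(table, lpn, rng=random.random):
    """Process one request against its LRU table; return True if the page is hot.

    `table` is a list of [lpn, counter] pairs with index 0 as the head.
    """
    # Hit: bump the counter (saturating) and move the item to the head.
    for i, (item_lpn, counter) in enumerate(table):
        if item_lpn == lpn:
            counter = min(counter + 1, MAX_COUNTER)
            table.insert(0, table.pop(i))
            table[0][1] = counter
            return counter >= HOT_THRESHOLD  # assumption: ">=" as in the embodiment
    # Miss: reuse the first slot (head to tail) whose counter is 0.
    for i, (_, counter) in enumerate(table):
        if counter == 0:
            del table[i]
            table.insert(0, [lpn, 1])
            return 1 >= HOT_THRESHOLD
    # No zero-counter slot: evict the tail with probability 0.5.
    if rng() < 0.5:
        table.pop()
        table.insert(0, [lpn, 1])
    # Assumption: if the eviction is not taken, the request is not recorded.
    return 1 >= HOT_THRESHOLD


table = [[256, 1], [0, 3], [1024, 1], [-1, 0]]
print(identify(table, 0), table)
```

Passing a deterministic `rng` (for example `lambda: 0.1`) makes the probabilistic branch reproducible in tests.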
Step 4: aging mechanism
Each metadata item records the access count of the data page corresponding to a logical page number. Each time an access request for the data page arrives, the access count in the corresponding metadata item is incremented by 1. Once the access count reaches 15, the maximum value the 4-bit field can hold, it is no longer incremented. After a fixed number of requests has been processed, the access counts of the metadata items in all LRU tables are halved.
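The saturating counter and the periodic halving can be sketched as below. The names `bump` and `Ager` are hypothetical, and halving is assumed to round down, which matches the counter values shown in the embodiment (1 becomes 0, 3 becomes 1).

```python
MAX_COUNTER = 15  # ceiling of the 4-bit access count stated in the text


def bump(counter: int) -> int:
    """Increment an access count, saturating at MAX_COUNTER."""
    return min(counter + 1, MAX_COUNTER)


class Ager:
    """Halve every access count once per `period` processed requests."""

    def __init__(self, tables, period):
        self.tables = tables  # list of tables; each table is a list of [lpn, counter]
        self.period = period
        self.seen = 0

    def on_request(self) -> bool:
        """Call once per processed request; return True if aging ran."""
        self.seen += 1
        if self.seen < self.period:
            return False
        self.seen = 0
        for table in self.tables:
            for item in table:
                item[1] >>= 1  # integer halving, rounding down (assumption)
        return True


tables = [[[7, 5], [9, 1]]]
ager = Ager(tables, period=2)
print(ager.on_request(), ager.on_request(), tables)
```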
Compared with conventional hot and cold data identification methods, the low-overhead, error-free identification algorithm of the invention identifies hot and cold data accurately and effectively at low time and space overhead, and is easily extended to a finer-grained, multi-level hot/cold classification. It keeps both run-time overhead and space overhead low, avoids misidentifying hot data, is suitable for deployment in existing storage systems, and greatly improves overall system performance. Because the method maps logical addresses into groups that are stored in different LRU tables, identification is faster than with conventional methods. At the same order of identification time, the method retains the logical page number of each access request, which avoids misidentification of hot data and yields better identification results than conventional methods.
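The multi-level extension mentioned above can be illustrated by comparing an access count against several thresholds. The thresholds (2, 4, 8) and the name `heat_level` are hypothetical; the text only says that multiple thresholds may be set. The sketch assumes that a count equal to a threshold counts as having reached it.

```python
from bisect import bisect_right


def heat_level(counter: int, thresholds=(2, 4, 8)) -> int:
    """Return how many thresholds the access count has reached.

    `thresholds` must be sorted in ascending order; level 0 is the coldest.
    """
    return bisect_right(thresholds, counter)


print([heat_level(c) for c in (0, 1, 2, 4, 7, 8, 15)])  # [0, 0, 1, 2, 2, 3, 3]
```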
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall structure used to identify hot and cold data according to the method of the invention.
Fig. 2 is a schematic diagram of how the state of an LRU table at a certain moment is updated after different requests arrive.
Fig. 3 is a schematic diagram of the state update when an LRU table removes its tail metadata item with a certain probability.
Fig. 4 is a schematic diagram of the LRU table update after the aging mechanism is applied.
Embodiments
The low-overhead, error-free hot and cold data identification method of the invention is described in further detail below through a specific embodiment, with reference to the accompanying drawings.
Embodiment 1:
A specific embodiment of the low-overhead, error-free hot and cold data identification method of the invention operates as follows:
Step 1: storage structure design
Fig. 1 is a schematic diagram of the overall structure with which this embodiment of the method identifies hot and cold data. In Fig. 1, each row of the vertical list on the right consists of a small rectangle and a cell and forms one metadata item. The 4 rows enclosed by each brace form one LRU table, and consecutive groups of 4 rows filled with different patterns correspond to different LRU tables: the 4 rows enclosed by the dashed box at the top of the list are LRU table 1, followed in order by LRU tables 2 to 256. The storage structure uses 256 LRU tables in total, each holding 4 metadata items, and each metadata item holds a 32-bit logical page number and a 4-bit access count. At the start of identification, the logical page numbers in all LRU tables are initialized to -1 and the access counts to 0.
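The metadata footprint implied by these figures follows from simple arithmetic. The sketch below counts only the 36 bits per item stated in the text and ignores any padding or bookkeeping an actual implementation would add; the constant names are hypothetical.

```python
K_TABLES = 256          # number of LRU tables in the embodiment
ITEMS_PER_TABLE = 4     # metadata items per table
BITS_PER_ITEM = 32 + 4  # logical page number plus access count

total_bits = K_TABLES * ITEMS_PER_TABLE * BITS_PER_ITEM
total_bytes = total_bits // 8

print(total_bits, total_bytes)  # 36864 4608
```

That is 4608 bytes, or 4.5 KiB, of metadata for the whole structure.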
Step 2: address grouping
To reduce identification time, the method maps requests for different logical addresses to different LRU tables, which reduces the number of metadata items that must be searched and moved for each identification. As shown in Fig. 1, the first cell on the left is the logical page number x of the current access request, and the rectangle that the arrow from this cell points to is the hash function f(x) = x % 256. The hash function divides the whole logical address space of the storage system into 256 groups. Whenever an access request arrives, its group is computed from the logical address (page number) of the request, so that the hot/cold identification of the requested data is confined to one group instead of all groups.
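With K = 256, the logical page numbers used in the worked example of this embodiment (0, 256, 1024, 8192 and 2048) all fall into the same group, which is why they share one LRU table. The sketch below checks this; the one-based numbering (hash value 0 corresponds to "LRU table 1") is an assumption inferred from the figure description, and `table_number` is a hypothetical helper.

```python
K = 256  # number of LRU tables in the embodiment


def table_number(lpn: int) -> int:
    """One-based LRU table number: f(x) = x % 256, plus 1 (numbering is assumed)."""
    return lpn % K + 1


example_lpns = [0, 256, 1024, 8192, 2048]
print([table_number(x) for x in example_lpns])  # [1, 1, 1, 1, 1]
print(table_number(257))  # 2
```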
Step 3: hot/cold identification of a data page
To decide whether a data page is hot, the method first determines the LRU table the page belongs to, using the address grouping of the previous step, and then searches that LRU table for the metadata item of the data page. If the item exists and its access count is greater than or equal to the preset threshold, the page is identified as hot data; otherwise it is identified as cold data. Fig. 2 shows how particular requests are identified when they arrive and how the corresponding LRU table is updated.
The dashed box on the left of Fig. 2 shows the state of LRU table 1 at a certain moment, referred to here as the initial state. The four rows of the table hold the metadata (256, 1), (0, 3), (1024, 1) and (-1, 0). A metadata item is written in the form (256, 1), where the first number is the logical page number of the access request and the second number is the access count of the corresponding data page. The logical page number -1 in the fourth row shows that only three logical page numbers, 256, 0 and 1024, have so far been inserted into this LRU table. The first row of LRU table 1 is the head of the table and the fourth row is the tail. When the table is updated, new metadata items are always inserted at the head and discarded items are removed from the tail.
One table in Fig. 2 shows the state of LRU table 1 after a request for logical page number 0 arrives in the initial state: the four rows hold (0, 4), (256, 1), (1024, 1) and (-1, 0), and the dashed arrow from the initial table to this table represents the update. As can be seen, when LRU table 1 contains a metadata item with logical page number 0, the items in front of it are first moved one position toward the tail, then the item with logical page number 0 is placed at the head and its access count is incremented by 1. Because the access count now equals the preset threshold of 4, the data of this request is identified as hot data.
Two further tables show the state after a request for logical page number 256 arrives in the initial state, where the four rows hold (256, 2), (0, 3), (1024, 1) and (-1, 0), and the state after a request for logical page number 1024 arrives in the initial state, where the four rows hold (1024, 2), (256, 1), (0, 3) and (-1, 0). In each case a dashed arrow from the initial table represents the update. Because the updated access counts of both requests are less than the preset threshold of 4, the corresponding data pages are identified as cold data.
Another table shows the state after a request for logical page number 8192 arrives in the initial state: the four rows hold (8192, 1), (256, 1), (0, 3) and (1024, 1), and again a dashed arrow represents the update. When LRU table 1 contains no metadata item with logical page number 8192, the table is first searched from the tail toward the head for an item whose access count is 0. Because such an item exists, the items in front of it are moved one position toward the tail, and a new metadata item with logical page number 8192 is inserted at the head with its access count set to 1. The access count is less than the preset threshold of 4, so the data page with logical page number 8192 is identified as cold data.
As shown in Fig. 3, suppose a request for logical page number 2048 arrives when LRU table 1 is in the state produced by the request for 8192. LRU table 1 contains neither a metadata item with logical page number 2048 nor any item whose access count is 0, so the tail item is removed with a probability of 50% and a new metadata item with logical page number 2048 is inserted at the head. The table on the removal branch shows the updated state, in which the four rows hold (2048, 1), (8192, 1), (256, 1) and (0, 3). The forked arrow drawn from the starting table represents the probabilistic decision and the LRU table update after the request for logical page number 2048 arrives.
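The table states quoted for Figs. 2 and 3 can be replayed with a small pure function. This is a sketch for checking the worked example, not the patent's implementation: `step` and its `coin` parameter are hypothetical, the zero-count search runs from tail to head as this embodiment describes, and a request that loses the 50% draw is assumed to leave the table unchanged.

```python
def step(state, lpn, coin=True, threshold=4, max_counter=15):
    """Return (new_state, is_hot) after one request.

    `state` is a tuple of (lpn, counter) pairs, head first. `coin` stands in
    for the 50% draw: True means the tail item is evicted.
    """
    items = list(state)
    for i, (page, count) in enumerate(items):
        if page == lpn:  # hit: bump the count and move the item to the head
            count = min(count + 1, max_counter)
            del items[i]
            return tuple([(lpn, count)] + items), count >= threshold
    for i in range(len(items) - 1, -1, -1):  # miss: scan tail to head
        if items[i][1] == 0:  # reuse a slot whose count is 0
            del items[i]
            return tuple([(lpn, 1)] + items), 1 >= threshold
    if coin:  # no free slot: evict the tail
        return tuple([(lpn, 1)] + items[:-1]), 1 >= threshold
    return state, 1 >= threshold  # assumption: request not recorded


initial = ((256, 1), (0, 3), (1024, 1), (-1, 0))
for lpn in (0, 256, 1024, 8192):
    print(lpn, step(initial, lpn))

after_8192, _ = step(initial, 8192)
print(2048, step(after_8192, 2048, coin=True))
```

Each printed state can be compared row by row with the four-row listings given in the text.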
Step 4: aging mechanism
To keep the identification of hot data up to date, the access counts of the metadata items in all LRU tables are halved after every 4096 access requests. Fig. 4 shows the state of LRU table 1 after the aging mechanism is applied to the state produced by the request for 8192: the four rows hold (8192, 0), (256, 0), (0, 1) and (1024, 0), and the dashed arrow between the two tables represents the aging process. As can be seen, the aging mechanism only updates the access counts of the metadata items in an LRU table and does not move any metadata item.
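The aging result quoted for Fig. 4 can be reproduced by halving each access count with integer division. The starting state used here is inferred from the quoted result and the earlier example (the state after the request for 8192), and `age` is a hypothetical helper name.

```python
def age(table):
    """Halve every access count (rounding down) without reordering items."""
    return [(lpn, counter // 2) for lpn, counter in table]


before = [(8192, 1), (256, 1), (0, 3), (1024, 1)]
after = age(before)
print(after)  # [(8192, 0), (256, 0), (0, 1), (1024, 0)]
```

The order of the items is unchanged, which reflects the statement that aging does not move metadata items.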
In this embodiment, a single hot/cold identification of a data page needs to search or move at most 4 metadata items in one LRU table, which greatly speeds up each identification compared with conventional hot/cold identification algorithms. Storing the logical page number of the access request in the storage structure avoids misidentification of hot data, so hot data is identified effectively and the overall performance of the storage system improves.
Claims (1)
1. A low-overhead, error-free method for identifying hot and cold data, characterized by comprising the following steps:
Step 1: storage structure design
A group of lists, each holding a fixed number of metadata items, is used to record the access history of data pages, wherein the number of lists is denoted K and the number of metadata items in each list is denoted N; each metadata item records the logical page number of a data page and the access count of the data page corresponding to that logical page number, which occupy 32 bits and 4 bits of storage respectively; when a list is full and a new metadata item needs to be added, the least recently used algorithm is used to replace a metadata item in the list, and each list is called an LRU table;
Step 2: address grouping
The whole logical address space of the storage system is mapped by a hash function f(x) = x % K, so that the metadata items for different logical addresses are grouped and stored in different LRU tables, which groups the logical page numbers, wherein K is the number of LRU tables, x is the logical page number of a data page, and % is the modulo operation;
Step 3: hot/cold identification of a data page
When an access request arrives, a hash value is first computed from the logical page number of the request to determine the LRU table it belongs to, and that LRU table is then searched for the metadata item of this logical page number; if the item exists, its access count is incremented by 1 and it is then determined whether the access count is greater than a predefined threshold; if so, the requested data is treated as hot data, and otherwise as cold data; by setting multiple thresholds and comparing the access count of the request with each of them, a finer-grained, multi-level hot/cold identification is realized; before the identification result is returned, the corresponding LRU table is updated: if the logical page number of the request exists in the corresponding LRU table, its metadata item is moved to the head of the table to preserve the least recently used property of the table; if it does not exist, the table is searched from head to tail for a metadata item whose access count is 0; if such an item is found, the metadata items in front of it are each moved one position toward the tail, and the metadata item of the request is inserted at the head of the LRU table with its access count set to 1; if no such item is found, the item at the tail of the LRU table is discarded with a probability of 50%, the other metadata items in the table are each moved one position toward the tail, and the metadata item of the request is inserted at the head of the LRU table with its access count set to 1;
Step 4: aging mechanism
Each metadata item records the access count of the data page corresponding to a logical page number; each time an access request for the data page arrives, the access count in the corresponding metadata item is incremented by 1; once the access count reaches 15, the maximum value its storage field can hold, it is no longer incremented; after a fixed number of requests has been processed, the access counts of the metadata items in all LRU tables are halved.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510395697.1A CN104951403B (en) | 2015-07-06 | 2015-07-06 | A kind of cold and hot data identification method of low overhead and zero defect |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104951403A true CN104951403A (en) | 2015-09-30 |
CN104951403B CN104951403B (en) | 2018-01-30 |
Family
ID=54166069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510395697.1A Active CN104951403B (en) | 2015-07-06 | 2015-07-06 | A kind of cold and hot data identification method of low overhead and zero defect |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104951403B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106569962A (en) * | 2016-10-19 | 2017-04-19 | 暨南大学 | Identification method of hot data based on temporal locality enhancement |
CN106874213A (en) * | 2017-01-12 | 2017-06-20 | 杭州电子科技大学 | A kind of solid state hard disc dsc data recognition methods for merging various machine learning algorithms |
CN109783443A (en) * | 2018-12-25 | 2019-05-21 | 西安交通大学 | The cold and hot judgment method of mass data in a kind of distributed memory system |
CN109885574A (en) * | 2019-02-22 | 2019-06-14 | 广州荔支网络技术有限公司 | A kind of data query method and device |
CN116303119A (en) * | 2023-05-19 | 2023-06-23 | 珠海妙存科技有限公司 | Method, system and storage medium for identifying cold and hot data |
WO2024066575A1 (en) * | 2022-09-26 | 2024-04-04 | 华为技术有限公司 | Method and device for distinguishing cold and hot physical pages, and chip and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100153630A1 (en) * | 2005-07-13 | 2010-06-17 | Samsung Electronics Co., Ltd. | Data storage system with complex memory and method of operating the same |
CN101753625A (en) * | 2009-12-28 | 2010-06-23 | 北京理工大学 | Method for deployment of copy service and copy establishment in peer-to-peer network environment |
CN102117309A (en) * | 2010-01-06 | 2011-07-06 | 卓望数码技术(深圳)有限公司 | Data caching system and data query method |
CN102170468A (en) * | 2011-04-07 | 2011-08-31 | 江苏省电力公司 | Content similarity-based and distributed storage replica replacement algorithm |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106569962A (en) * | 2016-10-19 | 2017-04-19 | 暨南大学 | Identification method of hot data based on temporal locality enhancement |
CN106874213A (en) * | 2017-01-12 | 2017-06-20 | 杭州电子科技大学 | A kind of solid state hard disc dsc data recognition methods for merging various machine learning algorithms |
CN106874213B (en) * | 2017-01-12 | 2020-03-20 | 杭州电子科技大学 | Solid state disk hot data identification method fusing multiple machine learning algorithms |
CN109783443A (en) * | 2018-12-25 | 2019-05-21 | 西安交通大学 | The cold and hot judgment method of mass data in a kind of distributed memory system |
CN109885574A (en) * | 2019-02-22 | 2019-06-14 | 广州荔支网络技术有限公司 | A kind of data query method and device |
WO2024066575A1 (en) * | 2022-09-26 | 2024-04-04 | 华为技术有限公司 | Method and device for distinguishing cold and hot physical pages, and chip and storage medium |
CN116303119A (en) * | 2023-05-19 | 2023-06-23 | 珠海妙存科技有限公司 | Method, system and storage medium for identifying cold and hot data |
CN116303119B (en) * | 2023-05-19 | 2023-08-11 | 珠海妙存科技有限公司 | Method, system and storage medium for identifying cold and hot data |
Also Published As
Publication number | Publication date |
---|---|
CN104951403B (en) | 2018-01-30 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20220829. Address after: 100192 207, floor 2, building C-1, Zhongguancun Dongsheng science and Technology Park, No. 66, xixiaokou Road, Haidian District, Beijing. Patentee after: Pingkai star (Beijing) Technology Co.,Ltd. Address before: 230026 Jinzhai Road, Baohe District, Hefei, Anhui Province, No. 96. Patentee before: University of Science and Technology of China |