CN104881369B - Low-memory-overhead hot data identification method for hybrid storage systems - Google Patents
- Publication number
- CN104881369B (application number CN201510236366.3A)
- Authority
- CN
- China
- Prior art keywords
- page
- temperature
- bit array
- value
- heat degree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention discloses a low-memory-overhead hot data identification method for hybrid storage systems, comprising the following steps: 1) define an Ultra-counting Bloom Filter (UBF), a Bloom filter with a very large counting capacity, to record the access counts of pages; 2) when page x is accessed, query the temperature of page x; 3) if x is not yet recorded in the UBF, record x by flipping several of the bits corresponding to x in the UBF from 0 to 1; if x is already recorded in the UBF, flip one 0 bit corresponding to x to 1 with a certain probability, thereby increasing the temperature of x; 4) compare the temperature of page x with a specified temperature threshold, and if the temperature of page x exceeds the threshold, identify the data corresponding to page x as hot data; then return to step 2). The invention has the advantages of low memory overhead, the ability to identify hot data in large data sets, and high identification accuracy.
Description
Technical field
The present invention relates to the technical field of large-scale hybrid storage, and in particular to a low-memory-overhead hot data identification method for hybrid storage systems.
Background
Current large-scale storage systems face increasingly demanding requirements on both performance and cost, and a hybrid storage architecture is one solution that can meet both at once. A hybrid storage system uses small-capacity, high-performance high-end devices to ensure performance, and low-performance, inexpensive low-end devices to reduce cost. In such a heterogeneous storage system, the high-end devices are generally used to hold frequently accessed hot data, while the low-end devices hold rarely accessed cold data. Realizing this optimization, however, depends to a large extent on accurately identifying the hot data.
Hot data is generally identified by statistical analysis of the data access history, and practitioners have designed a large number of hot data identification methods based on this principle. In principle, most cache replacement policies are hot data identification methods, the most representative being the LFU (Least Frequently Used) policy. The LFU policy maintains one counter for each data item; when a data item is accessed, its counter is incremented by 1, and data items with large counter values are hot data. However, this method must maintain an integer counter for every data item, so its memory overhead is large.
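The LFU-style counting described above can be sketched as follows. This is a minimal illustration rather than code from the patent; the class name `LFUCounter`, its methods, and the threshold value are assumptions introduced here.

```python
from collections import Counter


class LFUCounter:
    """One integer counter per data item (illustrative sketch)."""

    def __init__(self, hot_threshold):
        self.hot_threshold = hot_threshold  # assumed cut-off for "hot"
        self.counts = Counter()

    def access(self, item):
        # Every access increments the item's own counter by 1.
        self.counts[item] += 1

    def is_hot(self, item):
        # Items with large counter values are treated as hot.
        return self.counts[item] > self.hot_threshold


lfu = LFUCounter(hot_threshold=3)
for page in [7, 7, 7, 7, 9]:
    lfu.access(page)
```

The table grows by one integer counter for every distinct item that is accessed, which is the memory cost the paragraph above refers to.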
A Bloom filter is a data structure with low memory overhead that can be used to record access history, and can therefore be used for hot data identification. A Bloom filter maintains a bit array and k hash functions, with all bits of the bit array initialized to 0. When a page is accessed, the k hash functions compute k values h_1(x), h_2(x), …, h_k(x) from the page number x of the page; these k values correspond to k bits in the bit array, and setting all k bits to 1 records page x in the Bloom filter. A Bloom filter can record many accessed pages with very low space overhead, but it cannot be used directly for hot data identification, for the following two reasons:
1) A Bloom filter cannot record repeated accesses to data. No matter how many times page x is accessed, the Bloom filter only sets the k bits h_1(x), h_2(x), …, h_k(x) corresponding to x to 1 and takes no measure to record how many times x has been accessed. Lacking access count information, a Bloom filter cannot identify hot data from the access history.
2) A Bloom filter can only keep recording new pages and cannot delete any page it has already recorded, so the recorded access history grows longer and longer. In fact, the early access history has no value for hot data identification, and recording an overly long access history only increases the memory overhead.
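The first limitation can be seen in a small sketch of a plain Bloom filter. This is an illustration under stated assumptions, not code from the patent: the class name, the sizes, and the way the k hash values are derived from salted SHA-256 digests are all choices made here for the example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over page numbers (illustrative sketch)."""

    def __init__(self, num_bits, k):
        self.num_bits = num_bits
        self.k = k
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, page):
        # Derive k bit positions from SHA-256 digests salted with the
        # hash index (an assumption; any k hash functions would do).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{page}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, page):
        for pos in self._positions(page):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, page):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(page))


bf = BloomFilter(num_bits=1024, k=4)
bf.add(42)
snapshot = bytes(bf.bits)
for _ in range(1000):          # 1000 repeated accesses to the same page
    bf.add(42)
unchanged = bytes(bf.bits) == snapshot
```

After the first `add`, further accesses to the same page leave the bit array unchanged, so the filter holds no information about how often the page was accessed.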
To overcome these two defects of the Bloom filter, researchers have proposed recording the access history with multiple Bloom filters, a method known as the Multiple Bloom Filters (MBFs) method. When a page x is accessed, the method selects, from the multiple Bloom filters it maintains, one that has not yet recorded x and records x in the selected Bloom filter. In this way, a page that is accessed many times is recorded in several Bloom filters and can thus be identified as hot data. In addition, the MBFs method periodically selects one Bloom filter and clears all bits in its bit array, deleting all pages recorded in that Bloom filter and thereby removing early access history.
The MBFs method can record page access counts using multiple Bloom filters and delete early access history by periodically clearing one Bloom filter, and so can effectively identify hot data from the access history. However, it has two defects. On the one hand, because it uses multiple Bloom filter data structures, its memory overhead is still high. On the other hand, the range of page access counts it can record is limited: assuming the MBFs method maintains n Bloom filters, the maximum access count it can record for a particular page is n; if the page is accessed more than n times, the accesses beyond n cannot be recorded in any Bloom filter, so part of the temperature information is lost. The MBFs method cannot remedy both defects at once, because enlarging the access count range (n) inevitably increases the memory overhead. The MBFs method is therefore also difficult to apply in large-scale hybrid storage systems.
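The count ceiling of the MBFs method can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the published MBFs implementation: each filter is held as a Python integer used as a bit set, the hash positions come from salted SHA-256 digests, and the filter to clear is chosen round-robin, which the text does not specify.

```python
import hashlib


def bit_positions(page, k, num_bits):
    """Return k illustrative bit positions for a page number."""
    positions = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{page}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


class MultiBloomFilters:
    def __init__(self, n, num_bits, k):
        self.num_bits = num_bits
        self.k = k
        self.filters = [0] * n   # each filter is an int used as a bit set
        self.next_to_clear = 0

    def _mask(self, page):
        mask = 0
        for pos in bit_positions(page, self.k, self.num_bits):
            mask |= 1 << pos
        return mask

    def access(self, page):
        mask = self._mask(page)
        for i, bits in enumerate(self.filters):
            if (bits & mask) != mask:       # filter i has not recorded the page
                self.filters[i] = bits | mask
                return
        # All n filters already hold the page, so this access is lost.

    def count(self, page):
        mask = self._mask(page)
        return sum(1 for bits in self.filters if (bits & mask) == mask)

    def clear_one(self):
        # Round-robin choice of the filter to clear is an assumption.
        self.filters[self.next_to_clear] = 0
        self.next_to_clear = (self.next_to_clear + 1) % len(self.filters)


mbf = MultiBloomFilters(n=4, num_bits=1024, k=4)
for _ in range(100):
    mbf.access(7)
saturated = mbf.count(7)    # capped at n even after 100 accesses
mbf.clear_one()
after_clear = mbf.count(7)
```

Raising the ceiling means adding filters, and each added filter costs another full bit array, which is the trade-off described above.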
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art and provide a low-memory-overhead hot data identification method for hybrid storage systems that is simple to implement, has low memory overhead, can identify hot data in large data sets, and has high identification accuracy.
To solve the above technical problem, the technical solution proposed by the present invention is:
A low-memory-overhead hot data identification method for hybrid storage systems, comprising the following steps:
1) Define an Ultra-counting Bloom Filter (UBF), a Bloom filter with a very large counting capacity. The UBF records each access to a page by means of a bit array and k hash functions that map each page to the bit array. Initialize the bit array to 0 and set each hash function;
2) When page x is accessed, query the temperature of page x through the bit array; the temperature of page x is the number of bits with value 1 among the k bits corresponding to page x in the bit array;
3) Judge whether the temperature of page x is less than a preset first-record threshold t, where t is less than the number k of hash functions. If so, page x is judged not yet recorded in the UBF, and page x is recorded by flipping t of its k corresponding bits in the bit array from 0 to 1. Otherwise, page x is judged already recorded in the UBF, and the repeated access to page x is counted by flipping one 0-valued bit among its k corresponding bits to 1 with a first preset probability;
4) Compare the temperature of page x with a specified temperature threshold; if the temperature of page x is greater than the temperature threshold, the data corresponding to page x is identified as hot data. Return to step 2).
Preferably, the first preset probability in step 3) is 1/2^(h-t), where h is the temperature of page x and t is the first-record threshold.
Preferably, before a 0-valued bit is flipped to 1 in step 3), the method further includes a step of randomly clearing some 1 bits, specifically: count the number m of bits with value 1 in the byte containing the bit to be flipped to 1, and, based on m, clear every bit of that byte with a second preset probability.
Preferably, the second preset probability is 1/(8-m), where m is the number of bits with value 1 in the byte containing the bit to be flipped to 1.
Preferably, step 2) specifically comprises:
2.1) When a request to access page x is received, initialize the counter i and the temperature h to 0, and go to step 2.2);
2.2) Compute the value h_i(x) of the i-th hash function for page x, and check the value of bit h_i(x) of the bit array; if the bit is 1, go to step 2.3); if the bit is 0, go to step 2.4);
2.3) Increment the temperature h by 1, and go to step 2.4);
2.4) Increment the counter i by 1 and go to step 2.2), until the value of counter i equals the number k of hash functions; then output h as the temperature of page x and go to step 3).
Preferably, initializing the bit array in step 1) specifically comprises:
1.1) Set the number k of hash functions and the length n of the access history to be recorded;
1.2) Compute the size of the storage space required by the bit array from k and n, and allocate a block of memory of that size for the bit array;
1.3) Initialize the storage space of the bit array to 0.
Preferably, the storage space required by the bit array is proportional to k × n, where k is the number of hash functions and n is the length of the access history to be recorded.
Preferably, step 4) further includes a step 5) of adjusting the specified temperature threshold by formula (1) according to the temperature of page x, for use in the next hot data identification;
where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, Threshold denotes the temperature threshold before adjustment, and DesiredHotPages denotes the preset number of hot pages to be identified.
Preferably, step 5) further includes, when page x is identified as hot data, raising the adjusted temperature threshold by formula (2) to obtain the final adjusted temperature threshold for use in the next hot data identification;
where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, NewThreshold' denotes the final adjusted temperature threshold, and DesiredHotPages denotes the preset number of hot pages to be identified.
Compared with the prior art, the advantages of the present invention are:
1) The present invention records each access to a page with the Ultra-counting Bloom Filter (UBF) data structure. A page can be recorded with only t bits of information, fewer than the number of hash functions, which reduces memory overhead. Repeated accesses to a page are counted by flipping a 0 among the page's corresponding bits to 1 with a certain probability, so a single UBF suffices to record multiple accesses to a page. The invention thus identifies hot data in large data sets at low memory overhead, and effectively improves the accuracy and efficiency of hot data identification.
2) Before flipping a 0 bit to 1, the present invention further clears, with a certain probability, all bits of the byte containing the bit to be flipped. The temperature of infrequently accessed data thus declines gradually and the data eventually disappears from the access history, effectively discarding history data that has not been accessed for a long time.
3) The present invention further adjusts the temperature threshold dynamically according to page temperature. When a page is accessed, the temperature threshold Threshold is adjusted according to the temperature h of the accessed page, so that Threshold tracks the average temperature of the pages appearing in the workload: when h is greater than Threshold, Threshold is raised; when h is less than Threshold, Threshold is lowered. Hot data identification thereby adapts to changes in the workload, which improves its flexibility.
4) The present invention further adjusts the temperature threshold dynamically according to page temperature: whenever a page is identified as hot data, the value of Threshold is raised. With a continually rising Threshold, the hot data identified is the hottest data in the whole workload.
Brief description of the drawings
Fig. 1 is a flow diagram of the low-memory-overhead hot data identification method for hybrid storage systems of the present embodiment.
Detailed description
The invention is further described below with reference to the accompanying drawing and specific preferred embodiments, which do not limit the scope of protection of the invention.
As shown in Fig. 1, the low-memory-overhead hot data identification method for hybrid storage systems of the present embodiment comprises the following steps:
1) Define and initialize the UBF: define an Ultra-counting Bloom Filter (UBF), which records page access counts by means of a bit array and k hash functions that map each page to the bit array; initialize the bit array to 0 and set each hash function;
2) Query the page temperature: when a request to access page x is received, query the temperature of page x through the bit array; the temperature of page x is the number of bits with value 1 among the k bits corresponding to page x in the bit array;
3) Add the page to the UBF: judge whether the temperature of page x is less than the preset first-record threshold t, where t is less than the number k of hash functions. If so, page x is judged not yet recorded in the UBF, and page x is recorded by flipping t of its k corresponding bits from 0 to 1. Otherwise, page x is judged already recorded in the UBF, and the repeated access is counted by flipping one 0-valued bit among its k corresponding bits to 1 with the first preset probability;
4) Identify hot data: compare the temperature of page x with the specified temperature threshold; if the temperature of page x is greater than the threshold, the data corresponding to page x is identified as hot data, and otherwise as cold data. Return to step 2).
The present embodiment designs and implements a data structure that can record access history with very low memory overhead. The data structure records the access count of a page using the page's k bits in the bit array and has a very large counting capacity, hence the name Ultra-counting Bloom Filter (UBF). Maintaining a single UBF suffices to record a very large page access history, so that frequently accessed hot data can be identified dynamically to guide data layout in the hybrid storage system.
In the present embodiment, initializing the bit array in step 1) specifically comprises:
1.1) Set the number k of hash functions and the length n of the access history to be recorded;
1.2) Compute the size of the storage space required by the bit array from k and n, and allocate a block of memory of that size for the bit array;
1.3) Initialize the storage space of the bit array to 0.
The number k of hash functions and the access history length n can be configured according to practical requirements such as available memory and the needs of upper-layer applications. When data in the application workload is accessed many times, a larger k can be set so that more accesses can be recorded. Provided memory is sufficient, the larger the access history length n the better; n is the number of pages contained in the recorded access history.
In the present embodiment, the storage space required by the bit array is proportional to k × n, that is, the number of bits in the bit array is proportional to k × n, so the size of the bit array required by the UBF can be computed from the number k of hash functions and the access history length n. A block of memory is allocated as the bit array and all of its bits are initialized to 0. For example, if the bit array has 2k × n bits, a block of 2k × n/8 bytes is allocated and the region is zeroed. The k hash functions h_1(x), h_2(x), …, h_k(x) that map each page x are set at the same time, which completes the initialization of the UBF.
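The sizing rule can be made concrete with a short sketch. The function name and the example values k = 8 and n = 1,000,000 are assumptions for illustration; the factor of 2 follows the 2k × n example in the text, and other constants would equally satisfy "proportional to k × n".

```python
def allocate_bit_array(k, n, factor=2):
    """Allocate a zeroed bit array of factor * k * n bits."""
    num_bits = factor * k * n
    num_bytes = (num_bits + 7) // 8          # round up to whole bytes
    return num_bits, bytearray(num_bytes)    # bytearray(size) is zero-filled


num_bits, bits = allocate_bit_array(k=8, n=1_000_000)
```

With these example values the bit array has 16,000,000 bits and occupies 2,000,000 bytes.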
In the present embodiment, step 2) specifically comprises:
2.1) When page x is accessed, initialize the counter i and the temperature h to 0, and go to step 2.2);
2.2) Compute the value h_i(x) of the i-th hash function for page x, and check the value of bit h_i(x) of the bit array; if the bit is 1, go to step 2.3); if the bit is 0, go to step 2.4);
2.3) Increment the temperature h by 1, and go to step 2.4);
2.4) Increment the counter i by 1 and go to step 2.2), until the value of counter i equals the number k of hash functions; then output h as the temperature of page x and go to step 3).
Page x corresponds to k bits in the UBF bit array, and the number of those k bits with value 1 is defined as the temperature of page x. After a request from an upper-layer application for page x is received, the k bits corresponding to page x are checked in turn and the number of 1 bits is counted to obtain the temperature h of page x; page x is then added to the bit array to record the current access.
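The temperature query in steps 2.1) to 2.4) can be sketched as a simple loop. This is an illustration, not the patent's code: `hash_position` is a hypothetical stand-in for the i-th hash function, built here from a salted SHA-256 digest, and the bit array is a Python `bytearray`.

```python
import hashlib


def hash_position(i, page, num_bits):
    """Illustrative stand-in for the i-th hash function h_i(x)."""
    digest = hashlib.sha256(f"{i}:{page}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_bits


def query_temperature(bits, page, k):
    """Count how many of the page's k bits are 1 (steps 2.1 to 2.4)."""
    num_bits = len(bits) * 8
    i, h = 0, 0                                   # step 2.1
    while i < k:
        pos = hash_position(i, page, num_bits)    # step 2.2
        if (bits[pos // 8] >> (pos % 8)) & 1:
            h += 1                                # step 2.3
        i += 1                                    # step 2.4
    return h


bits = bytearray(128)          # 1024 bits, all 0
cold = query_temperature(bits, page=42, k=8)
for i in range(8):             # set all of page 42's bits by hand
    pos = hash_position(i, 42, 1024)
    bits[pos // 8] |= 1 << (pos % 8)
full = query_temperature(bits, page=42, k=8)
```

The query only reads the bit array; the update that records the access is a separate step.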
There are two cases when adding page x to the UBF bit array. In the first case, page x is not yet recorded in the bit array, and it is added to the UBF for the first time (first record). In the second case, page x is already recorded in the bit array, and its repeated-access count must be increased. In the present embodiment, step 3) decides which case applies by comparing the temperature h of the current page with the preset first-record threshold t; that is, t is used to determine whether a page is already recorded in the UBF bit array. The first-record threshold t can be set much smaller than k.
When the temperature h of the page is less than the first-record threshold t, the first case applies; otherwise the second case applies. In the first case, page x is not yet recorded in the UBF, so it is recorded by flipping t of its k corresponding bits from 0 to 1; after page x has been added, at least t of its k bits are 1. When page x is accessed again, its temperature h is no longer less than t, so the second case applies. In the second case, page x is already recorded in the UBF and only the repeated access needs to be recorded: one 0-valued bit is selected among the k corresponding bits and flipped to 1 with the first preset probability, which counts the access to page x.
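The two cases can be sketched in one update function. This is a minimal sketch under stated assumptions, not the patent's implementation: the hash functions are stand-ins built from salted SHA-256 digests, the flip probability is taken to be 1/2^(h-t), the text does not say which zero bits to flip, so the sketch takes them in hash order for the first record and picks one at random for a repeated access, and the random clearing step is left out.

```python
import hashlib
import random


def positions_for(page, k, num_bits):
    """Illustrative stand-ins for h_1(x)..h_k(x)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{page}".encode()).digest()[:8], "big")
        % num_bits
        for i in range(k)
    ]


def get_bit(bits, pos):
    return (bits[pos // 8] >> (pos % 8)) & 1


def set_bit(bits, pos):
    bits[pos // 8] |= 1 << (pos % 8)


def record_access(bits, page, k, t, rng):
    """Record one access to `page`; return its temperature before the update."""
    positions = positions_for(page, k, len(bits) * 8)
    h = sum(get_bit(bits, p) for p in positions)
    zeros = [p for p in dict.fromkeys(positions) if not get_bit(bits, p)]
    if h < t:
        # First record: flip t zero bits (fewer if fewer remain).
        for p in zeros[:t]:
            set_bit(bits, p)
    elif zeros and rng.random() < 1.0 / 2 ** (h - t):
        # Repeated access: flip one zero bit with probability 1/2^(h-t).
        set_bit(bits, rng.choice(zeros))
    return h


rng = random.Random(0)
bits = bytearray(1024)
k, t = 8, 2
first = record_access(bits, 42, k, t, rng)    # page not yet recorded
second = record_access(bits, 42, k, t, rng)   # page now recorded
third = record_access(bits, 42, k, t, rng)
```

When the temperature equals t, the flip probability is 1, so the first repeated access is always counted; later flips become progressively less likely.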
With this method, the present embodiment needs only t bits of information to indicate whether a page x is recorded in the UBF. By comparison, the traditional LFU policy must maintain an integer counter for every page to record access counts, and each integer counter occupies tens of bits of memory, so the present method effectively reduces memory overhead. The present embodiment also needs to maintain only one UBF to record every access to a page, so its memory overhead is also significantly lower than that of the MBFs scheme, which must maintain multiple Bloom filters.
The present embodiment counts repeated accesses by flipping one 0 among the k bits to 1 with a certain probability, so every access to page x is recorded in the k corresponding bits of the UBF. Even if the number of accesses to page x grows exponentially, the number of 1 bits corresponding to page x in the UBF grows only linearly; a linearly growing number of bits can thus record an exponentially growing access count. This gives a very large counting range and makes it possible to identify hot data in large data sets at low memory overhead. The much larger counting range also markedly improves the accuracy and efficiency of hot data identification. The traditional MBFs scheme, by contrast, must maintain n Bloom filters and can record at most n accesses to a page; once page x has been accessed more than n times, subsequent accesses cannot be recorded in the MBFs, and temperature information is lost.
In the present embodiment, the first preset probability is 1/2^(h-t), where h is the temperature of the current page and t is the first-record threshold. That is, when a repeated access to page x is to be counted, one 0-valued bit among the k bits corresponding to page x is flipped to 1 with probability 1/2^(h-t). For a particular page x, a few bits of information are thus enough to record hundreds of accesses by the application to page x.
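A rough estimate of the counting range follows from the flip probability. This is a back-of-the-envelope sketch that ignores hash collisions with other pages and the random clearing step; the symbol H for a target temperature and the example values t = 2, H = 10 are assumptions chosen for illustration. While the temperature is j, each access raises it with probability 2^{-(j-t)}, so the expected number of accesses spent at temperature j is 2^{j-t}.

```latex
\mathbb{E}\bigl[\text{accesses needed to go from temperature } t \text{ to } H\bigr]
  = \sum_{j=t}^{H-1} 2^{\,j-t}
  = 2^{\,H-t} - 1,
\qquad \text{e.g. } t = 2,\ H = 10:\quad 2^{8} - 1 = 255 .
```

Under these assumptions, eight bits beyond the first-record bits cover a few hundred repeated accesses in expectation, which is in line with the "hundreds of accesses" figure above.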
In the present embodiment, before a 0 bit is flipped to 1 in step 3), some 1 bits are also randomly cleared, specifically: count the number m of bits with value 1 in the byte containing the bit to be flipped to 1, and, based on m, clear every bit of that byte with the second preset probability. For pages that once appeared in the access history but have not been accessed for a long time, slowly flipping their 1 bits in the bit array back to 0 makes their temperature decline gradually until they disappear from the access history, effectively discarding early access history that has not been accessed for a long time and has no value.
In the present embodiment, the second preset probability is 1/(8-m), where m is the number of bits with value 1 in the byte containing the bit to be flipped to 1. That is, for a particular bit, before the bit is flipped from 0 to 1, the number m of 1 bits in the byte containing it is counted, all bits of that byte are cleared with probability 1/(8-m), and then the bit is flipped from 0 to 1.
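The clear-then-set operation described above can be sketched as follows. The function name and the use of a `bytearray` with a seeded `random.Random` are assumptions for illustration; the probability 1/(8-m) is the one stated in the text.

```python
import random


def clear_then_set(bits, pos, rng):
    """Randomly clear the byte holding bit `pos`, then set that bit to 1."""
    byte_index, offset = divmod(pos, 8)
    if (bits[byte_index] >> offset) & 1:
        raise ValueError("bit to flip must currently be 0")
    m = bin(bits[byte_index]).count("1")    # 0 <= m <= 7 because the bit is 0
    if rng.random() < 1.0 / (8 - m):        # second preset probability
        bits[byte_index] = 0                # clear every bit of the byte
    bits[byte_index] |= 1 << offset         # then flip the target bit to 1


rng = random.Random(0)
full = bytearray([0b01111111])   # m = 7, so the byte is always cleared
clear_then_set(full, 7, rng)
empty = bytearray([0])           # m = 0, so clearing changes nothing
clear_then_set(empty, 3, rng)
```

Because the target bit is 0, m is at most 7 and the denominator never reaches zero. A byte that already holds many 1 bits is more likely to be cleared, so densely populated regions of the bit array decay faster.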
The present embodiment uses the specified temperature threshold Threshold to identify whether a page is hot. If the temperature h of page x is greater than Threshold, page x is a hot page and its data is hot data, which should be stored on the high-end device. If the temperature h of page x is less than Threshold, page x is an infrequently accessed page and its data is cold data, which should be stored on the low-end device. Once the hot data identification result is obtained, it is returned to the hybrid storage system, where it guides the data layout.
In the present embodiment, step 4) further includes a step of adjusting the specified temperature threshold by formula (1) according to the temperature of page x, for use in the next hot data identification;
where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, Threshold denotes the temperature threshold before adjustment, and DesiredHotPages denotes the preset number of hot pages to be identified. For example, if in a two-tier storage system the application needs to keep the 1000 hottest pages in the first storage tier, DesiredHotPages is set to 1000.
In the present embodiment, the specified temperature threshold Threshold is adjusted dynamically according to the temperature of page x; formula (1) is the first dynamic adjustment rule of the present embodiment. Under the first rule, whenever a page is accessed, Threshold is adjusted by formula (1) according to the temperature of page x: when the temperature h of page x is greater than Threshold, Threshold is raised, that is, the new threshold increases; conversely, when h is less than Threshold, Threshold is lowered, that is, the new threshold decreases. The first rule moves Threshold toward the average temperature of the pages appearing in the workload, so that Threshold adapts dynamically to different workloads, which improves the flexibility of hot data identification.
In the present embodiment, step 5) further includes, when page x is identified as hot data, raising the adjusted temperature threshold by formula (2) to obtain the final adjusted temperature threshold for use in the next hot data identification;
where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, NewThreshold' denotes the final adjusted temperature threshold, and DesiredHotPages denotes the preset number of hot pages to be identified.
Formula (2) is the second dynamic adjustment rule of the present embodiment: when a page is identified as hot data, that is, when the page's temperature h is greater than Threshold, Threshold is raised. If a large number of pages are identified as hot data, Threshold is too low and should be increased appropriately so that the hotter pages can be picked out from among these hot pages. The second rule therefore increases Threshold each time a hot page is found, raising it step by step; some hot data whose temperature is not very high may then be re-evaluated and identified as cold data, so that the hottest data is eventually filtered out of a large number of hot pages. This ensures that the adjusted Threshold can pick the hottest data out of the hotter data. In the traditional MBFs method, by contrast, only a static threshold can be set, which offers no flexibility: any page whose access count exceeds the static threshold is identified as hot data, and the method cannot be updated to obtain the hottest data.
The above is only a preferred embodiment of the present invention and does not limit the invention in any form. Although the invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make many possible changes and modifications to the technical solution, or modify it into equivalent embodiments. Therefore, any simple modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the invention, without departing from the content of the technical solution, shall fall within the scope of protection of the technical solution of the present invention.
Claims (9)
- 1. A low-memory-overhead hot data identification method for hybrid storage systems, characterized in that it comprises the following steps: 1) define an Ultra-counting Bloom Filter (UBF), a Bloom filter with a very large counting capacity, which records the access counts of pages by means of a bit array and k hash functions that map each page to the bit array; initialize the bit array to 0 and set each hash function; 2) when page x is accessed, query the temperature of page x through the bit array, the temperature of page x being the number of bits with value 1 among the k bits corresponding to page x in the bit array; 3) judge whether the temperature of page x is less than a preset first-record threshold t, where t is less than the number k of hash functions; if so, page x is judged not yet recorded in the UBF, and page x is recorded by flipping t of its k corresponding bits in the bit array from 0 to 1; otherwise page x is judged already recorded in the UBF, and the repeated access to page x is counted by flipping one 0-valued bit among its k corresponding bits to 1 with a first preset probability; 4) compare the temperature of page x with a specified temperature threshold; if the temperature of page x is greater than the temperature threshold, the data corresponding to page x is identified as hot data; return to step 2).
- 2. The low-memory-cost hot data identification method for hybrid storage systems according to claim 1, characterized in that: the first preset probability in step 3) is $1/2^{h-t}$, where h is the temperature of page x and t is the first-record threshold.
- 3. The low-memory-cost hot data identification method for hybrid storage systems according to claim 2, characterized in that, before a bit whose value is 0 is flipped to 1 in step 3), a random reset step is also performed, specifically: count the number m of bits whose value is 1 in the byte containing the bit to be flipped to 1, and, according to the counted number m, clear each bit of that byte with a second preset probability.
- 4. The low-memory-cost hot data identification method for hybrid storage systems according to claim 3, characterized in that: the second preset probability is $1/(8-m)$, where m is the number of bits whose value is 1 in the byte containing the bit to be flipped to 1.
- 5. The low-memory-cost hot data identification method for hybrid storage systems according to claim 4, characterized in that the specific steps of step 2) are: 2.1) when page x is accessed, initialize counter i and temperature h to 0, and jump to step 2.2); 2.2) compute the value $h_i(x)$ of the i-th hash function for page x, and check the value of the $h_i(x)$-th bit of the bit array corresponding to page x; if the value of that bit is 1, jump to step 2.3); if the value of that bit is 0, jump to step 2.4); 2.3) add 1 to temperature h, and jump to step 2.4); 2.4) add 1 to counter i and jump to step 2.2), until the value of counter i equals the number k of hash functions; then output h as the temperature of page x and jump to step 3).
- 6. The low-memory-cost hot data identification method for hybrid storage systems according to claim 5, characterized in that the specific steps of initializing the bit array in step 1) are: 1.1) set the number k of hash functions and the access history length n to be recorded; 1.2) compute the storage size required by the bit array from the set number k of hash functions and the access history length n, and allocate a block of memory of the corresponding size for the bit array; 1.3) initialize the memory corresponding to the bit array to 0.
- 7. The low-memory-cost hot data identification method for hybrid storage systems according to claim 6, characterized in that: the storage space required by the bit array is proportional to k × n, where k is the number of hash functions and n is the access history length to be recorded.
- 8. The low-memory-cost hot data identification method for hybrid storage systems according to any one of claims 1 to 7, characterized in that step 4) is followed by a step 5) of adjusting the specified temperature threshold by formula (1) according to the temperature of page x, for use in the next hot data identification; where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, Threshold denotes the temperature threshold before adjustment, and DesiredHotPages denotes the preset number of hot pages to be identified.
- 9. The low-memory-cost hot data identification method for hybrid storage systems according to claim 8, characterized in that step 5) further includes, when page x is identified as hot data, raising the adjusted temperature threshold by formula (2) to obtain the final adjusted temperature threshold for use in the next hot data identification; where h denotes the temperature of page x, NewThreshold denotes the temperature threshold after adjustment, NewThreshold' denotes the final adjusted temperature threshold, and DesiredHotPages denotes the preset number of hot pages to be identified.
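The query and update steps recited in claims 1, 2, and 5 to 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `UBF`, the parameter `bits_per_slot`, the use of salted BLAKE2b digests as the k hash functions, and the random choice of which zero bits to flip are all hypothetical. The claims only say the bit array size is proportional to k × n, so the proportionality factor is a guess. The random reset of claims 3 and 4 and the threshold adjustment of claims 8 and 9 are deliberately left out.

```python
import hashlib
import random


class UBF:
    """Illustrative sketch of the claimed bit-array page counter."""

    def __init__(self, k, n, t, bits_per_slot=2, rng=None):
        if not 0 < t < k:
            raise ValueError("first-record threshold t must satisfy 0 < t < k")
        self.k = k  # number of hash functions
        self.t = t  # first-record threshold
        # Claim 7: size proportional to k * n; bits_per_slot is an assumption.
        self.num_bits = bits_per_slot * k * n
        self.bits = bytearray((self.num_bits + 7) // 8)  # initialized to 0
        self.rng = rng if rng is not None else random.Random()

    def _positions(self, page):
        # Assumption: k salted BLAKE2b digests stand in for the k hash functions.
        positions = []
        for i in range(self.k):
            digest = hashlib.blake2b(
                str(page).encode(), digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            positions.append(int.from_bytes(digest, "little") % self.num_bits)
        return positions

    def _get(self, pos):
        return (self.bits[pos >> 3] >> (pos & 7)) & 1

    def _set(self, pos):
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def temperature(self, page):
        # Step 2: number of 1 bits among the page's k positions.
        return sum(self._get(p) for p in self._positions(page))

    def access(self, page):
        # Step 3: record a first access or probabilistically count a repeat.
        positions = self._positions(page)
        h = sum(self._get(p) for p in positions)
        zeros = list(dict.fromkeys(p for p in positions if not self._get(p)))
        if h < self.t:
            # Not yet recorded: flip t zero bits (fewer if fewer are available).
            for p in self.rng.sample(zeros, min(self.t, len(zeros))):
                self._set(p)
        elif zeros and self.rng.random() < 0.5 ** (h - self.t):
            # Already recorded: flip one zero bit with probability 1/2^(h-t).
            self._set(self.rng.choice(zeros))
        return self.temperature(page)

    def is_hot(self, page, threshold):
        # Step 4: hot if the temperature exceeds the threshold.
        return self.temperature(page) > threshold
```

Because each additional bit is set with a probability that halves as the temperature rises, a page's temperature grows roughly logarithmically with its access count, which is how k bits can stand in for a much larger counter.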
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510236366.3A CN104881369B (en) | 2015-05-11 | 2015-05-11 | Towards the low memory cost hotspot data identification method of mixing storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104881369A (en) | 2015-09-02 |
CN104881369B (en) | 2017-12-12 |
Family
ID=53948869
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510236366.3A Active CN104881369B (en) | 2015-05-11 | 2015-05-11 | Towards the low memory cost hotspot data identification method of mixing storage system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104881369B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106874213B (en) * | 2017-01-12 | 2020-03-20 | 杭州电子科技大学 | Solid state disk hot data identification method fusing multiple machine learning algorithms |
CN108241725B (en) * | 2017-05-24 | 2019-07-05 | 新华三大数据技术有限公司 | A kind of data hot statistics system and method |
CN109542339B (en) * | 2018-10-23 | 2021-09-03 | 拉扎斯网络科技(上海)有限公司 | Data layered access method and device, multilayer storage equipment and storage medium |
CN112052190B (en) * | 2020-09-03 | 2022-08-30 | 杭州电子科技大学 | Solid state disk hot data identification method based on bloom filter and secondary LRU table |
CN113766650B (en) * | 2021-08-26 | 2022-06-28 | 武汉天地同宽科技有限公司 | Internet resource obtaining method and system based on dynamic balance |
CN113849752A (en) * | 2021-09-24 | 2021-12-28 | 苏州浪潮智能科技有限公司 | Page caching method and device and storage medium |
CN117234432B (en) * | 2023-11-14 | 2024-02-23 | 苏州元脑智能科技有限公司 | Management method, management device, equipment and medium of hybrid memory system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103902260A (en) * | 2012-12-25 | 2014-07-02 | 华中科技大学 | Pre-fetch method of object file system |
CN104156432A (en) * | 2014-08-08 | 2014-11-19 | 四川九成信息技术有限公司 | File access method |
Also Published As
Publication number | Publication date |
---|---|
CN104881369A (en) | 2015-09-02 |
Similar Documents
Publication | Title |
---|---|
CN104881369B | Towards the low memory cost hotspot data identification method of mixing storage system |
CN105242871B | A kind of method for writing data and device |
CN104246721B | Storage system, storage controller, and storage method |
US8880544B2 | Method of adapting a uniform access indexing process to a non-uniform access memory, and computer system |
US20180004690A1 | Efficient context based input/output (i/o) classification |
CN106528454B | A kind of memory system caching method based on flash memory |
CN105653591A | Hierarchical storage and migration method of industrial real-time data |
CN101645043B | Methods for reading and writing data and memory device |
WO2013152678A1 | Method and device for metadata query |
US10997080B1 | Method and system for address table cache management based on correlation metric of first logical address and second logical address, wherein the correlation metric is incremented and decremented based on receive order of the first logical address and the second logical address |
CN106502587A | Data in magnetic disk management method and magnetic disk control unit |
CN104699424A | Page hot degree based heterogeneous memory management method |
US20120117297A1 | Storage tiering with minimal use of dram memory for header overhead |
CN110795363B | Hot page prediction method and page scheduling method of storage medium |
CN108762671A | Mixing memory system and its management method based on PCM and DRAM |
CN103942161B | Redundancy elimination system and method for read-only cache and redundancy elimination method for cache |
CN108845957B | Replacement and write-back self-adaptive buffer area management method |
CN104077242A | Cache management method and device |
CN103150395A | Directory path analysis method of solid state drive (SSD)-based file system |
CN107102954A | A kind of solid-state storage grading management method and system based on failure probability |
CN111580754B | Write-friendly flash memory solid-state disk cache management method |
CN114416646A | Data processing method and device of hierarchical storage system |
CN110532200B | Memory system based on hybrid memory architecture |
CN108710581A | PCM storage medium abrasion equilibrium methods based on Bloom filter |
CN103176753B | Storing device and data managing method thereof |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
EXSB | Decision made by sipo to initiate substantive examination |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |