WO2016178594A1 - Method and device for garbage collection in log structured file systems - Google Patents

Info

Publication number
WO2016178594A1
Authority
WO
WIPO (PCT)
Prior art keywords
segments
state parameter
volume
garbage collection
parameter value
Prior art date
Application number
PCT/RU2015/000289
Other languages
French (fr)
Other versions
WO2016178594A8 (en
Inventor
Hongbo Zhang
Vyacheslav Anatolievich DUBEYKO
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/RU2015/000289
Priority to CN201580079149.6A (CN107533506B)
Publication of WO2016178594A1
Publication of WO2016178594A8

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0804 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0871 Allocation or management of cache space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72 Details relating to flash memory management
    • G06F2212/7205 Cleaning, compaction, garbage collection, erase control

Abstract

A garbage collection device (30) for performing a garbage collection of a volume in a log structured file system is provided. The volume comprises a plurality of segments, each comprising a plurality of blocks. The device (30) comprises a state parameter determination unit (31) adapted to determine a block state parameter of each of the blocks, a segment state parameter of each of the segments, and a volume state parameter. Moreover, the device (30) comprises a garbage collection determination unit (32) adapted to determine a garbage collection queue based upon the segment state parameters, and to determine garbage collection segments based upon the volume state and the garbage collection queue. The device (30) further comprises a garbage collection unit (33) adapted to perform garbage collection of the garbage collection segments.

Description

METHOD AND DEVICE FOR GARBAGE COLLECTION IN LOG STRUCTURED FILE SYSTEMS
TECHNICAL FIELD
The invention relates to garbage collection in log structured file systems, especially to a method and device for performing a garbage collection efficiently.
BACKGROUND
The fundamental idea of a Log structured file system (LFS) is to improve write performance by buffering a sequence of file system changes in the file cache and then writing all the changes to disk sequentially in a single disk write operation. The information written to disk in the write operation includes file data blocks, attributes, index blocks, directories and almost all the other information used to manage the file system. A Log structured file system writes all new information to disk in a sequential structure called the log.
LFS doesn't place inodes at fixed positions; they are written to the log. LFS uses a data structure called an inode map to maintain the current location of each inode. Given the identifying number of a file, the inode map must be indexed to determine the disk address of the inode. The inode map is divided into blocks that are written to the log; a fixed checkpoint region on each disk identifies the locations of all the inode map blocks.
This arrangement is illustrated in Fig. 1, which shows a log structured file system disk layout 10.
The most difficult design issue for log structured file systems is the management of free space. The goal is to maintain large free extents for writing new data. Initially all the free space is in a single extent on the disk, but by the time the log reaches the end of the disk the free space will have been fragmented into many small extents corresponding to the files that were deleted or overwritten meanwhile.
From this point on, the file system has two choices: threading and copying. The first alternative is to leave the live data in place and thread the log through the free extents. Unfortunately, threading will cause the free space to become severely fragmented. The second alternative is to copy live data out of the log in order to leave large free extents for writing. The live data is written back in a compacted form at the head of the log. The disadvantage of copying is its cost, particularly for long-lived files.
These two alternatives are shown in Fig. 2. On the left side, a log 20 before and after threading is shown, while on the right side, a log 21 before and after copying and compacting is shown.
For a log structured file system to operate efficiently, it must ensure that there are always large extents of free space available for writing new data. One solution is based on large extents called segments, where a segment cleaner process continually regenerates empty segments by compressing the live data from heavily fragmented segments.
The process of copying live data out of a segment is called segment cleaning or garbage collection (GC). In LFS it is a simple three-step process: read a number of segments into memory, identify the live data, and write the live data back to a smaller number of clean segments. After this operation is completed, the segments that were read are marked as clean, and they can be used for new data or for additional cleaning.
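The three cleaning steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular file system: segments are modelled as Python lists of `(payload, is_live)` tuples, and the names `Block`, `SEGMENT_BLOCKS` and `clean_segments` are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[str, bool]   # (payload, is_live); hypothetical in-memory model
SEGMENT_BLOCKS = 4         # assumed number of blocks per segment


def clean_segments(victims: List[List[Block]]) -> List[List[Block]]:
    """Copy live blocks out of `victims` into as few new segments as possible."""
    # Steps 1 and 2: read the victim segments and keep only the live blocks.
    live = [blk for seg in victims for blk in seg if blk[1]]
    # Step 3: write the live blocks back compactly into clean segments.
    compacted = [live[i:i + SEGMENT_BLOCKS]
                 for i in range(0, len(live), SEGMENT_BLOCKS)]
    # The victims no longer hold needed data and can be marked clean.
    for seg in victims:
        seg.clear()
    return compacted
```

In this model, two segments that together hold three live blocks are compacted into a single segment, and both source segments become empty and reusable.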
With increasing storage sizes, however, conventional methods for performing this garbage collection are overburdened: so far, an entire defragmentation of a volume takes a very long time. Accordingly, an object of the present invention is to provide a device and a method that allow efficient garbage collection in log structured file systems.
SUMMARY
The above-mentioned object is solved by the features of claim 1 for the method and claim 10 for the device. It is further solved by the features of claim 9 for the associated computer program. The dependent claims contain further developments.
A goal of the invention is to provide a method for determining the fragmentation and aging state of volumes of log structured file systems (LFS), with the purpose of elaborating a flexible garbage collection (GC) policy when using the copying approach.
Log structured file systems use a Copy-on-Write (COW) policy as their baseline internal data modification technique. Their design is based on the hypothesis that conventional in-place update layouts will no longer be effective, because ever-increasing memory sizes on modern computers lead to I/O becoming write-heavy, as reads are almost always satisfied from the memory cache. An LFS treats its storage as a circular log and writes sequentially to the head of the log. The COW policy means that a modified file's blocks are not updated in place; instead, the modified blocks are written into the latest log.
Log structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, then the data is moved and appended to the head.
Inefficient GC logic can therefore decrease the performance of log structured file systems significantly. Existing implementations of garbage collection in log structured file systems do not offer efficient GC policies that leave file system performance unaffected. An idea of the proposed method is to use the invalid blocks count as the criterion for determining the structure of a GC queue of pre-dirty/dirty segments, for selecting the pre-dirty/dirty segments with minimal clearing cost, and for detecting the volume state. The GC queue can be structured as an array of chains of dirty segments, with the invalid blocks count serving as the array index. Every index of the array thus groups the dirty segments that contain exactly that number of invalid blocks.
For every index of the array, the GC queue also keeps a counter of the dirty segments in the corresponding chain. These counters are used for volume state detection. The indexes of the GC queue's array are grouped on the basis of valid blocks count ranges:
(1) Cold volume area [80% - 99% of valid blocks];
(2) Cooling off volume area [65% - 80% of valid blocks];
(3) Warming-up volume area [50% - 65% of valid blocks];
(4) Warm volume area [35% - 50% of valid blocks];
(5) Pre-hot volume area [20% - 35% of valid blocks];
(6) Hot volume area [1% - 20% of valid blocks].
Calculating the segments count for each of these volume state areas makes it possible to compare the dirty segments counts of the different areas. The area with the greatest dirty segments count determines the volume state. For example, if the pre-hot area has the greatest count of dirty segments, the volume has the volume state pre-hot.
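The comparison of area counts can be sketched as follows. This is an illustrative sketch only: the names `AREAS`, `area_of` and `detect_volume_state` are hypothetical, and because the listed ranges share their boundary values, the code assumes that each lower bound is inclusive. Fully valid and fully invalid segments fall outside the six areas and are skipped here.

```python
from typing import Dict, List

# Areas as (name, lower bound of valid-block percentage), highest first.
# Treating each lower bound as inclusive is an assumption; the text gives
# ranges that share their boundary values.
AREAS = [("cold", 80), ("cooling-off", 65), ("warming-up", 50),
         ("warm", 35), ("pre-hot", 20), ("hot", 0)]


def area_of(valid_pct: float) -> str:
    """Map a valid-block percentage strictly between 0 and 100 to an area."""
    for name, lower in AREAS:
        if valid_pct >= lower:
            return name
    raise ValueError("percentage must not be negative")


def detect_volume_state(counters: List[int]) -> str:
    """counters[i] is the number of queued segments with i invalid blocks."""
    blocks = len(counters) - 1            # blocks per segment
    totals: Dict[str, int] = {name: 0 for name, _ in AREAS}
    for invalid in range(1, blocks):      # skip fully valid / fully invalid
        valid_pct = 100.0 * (blocks - invalid) / blocks
        totals[area_of(valid_pct)] += counters[invalid]
    # Ties resolve to the colder area because of dict order (an assumption).
    return max(totals, key=totals.get)
```

With ten blocks per segment, one segment at 90% valid, five at 30% valid and two at 20% valid, the pre-hot area holds seven segments and therefore determines the volume state.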
A simple and efficient method of LFS volume state detection without file system performance overhead is therefore proposed. Treating the LFS volume state qualitatively in this way allows a flexible and highly efficient GC policy to be elaborated. A key goal of a flexible GC policy is to provide enough free segment capacity without file system performance degradation, even in the case of an aged volume state.
According to a first aspect of the invention, a method for garbage collection of a volume in log structured file systems is provided. A volume comprises a plurality of segments, which each comprise a plurality of blocks. The method comprises determining a block state parameter of each of the blocks, determining a segment state parameter of each of the segments based upon the block state parameters of the blocks of the segment, and determining a volume state parameter of the volume based upon the segment state parameters and the block state parameters. Moreover, the method comprises determining a garbage collection queue based upon the segment state parameters, determining garbage collection segments based upon the volume state and the garbage collection queue, and performing garbage collection of the garbage collection segments. It is thereby possible to limit the number of segments on which a garbage collection is performed and thus to reduce the garbage collection effort significantly.
According to a first implementation form of the first aspect, the determining of the block state parameters comprises allocating a block state parameter value to each of the blocks. Possible block state parameter values are:
- free, indicating the block having been prepared for a write operation but not yet allocated,
- invalid, indicating the block having been freed after file update, file truncation or deletion operation but it was not collected as garbage yet,
- pre-allocated, indicating the block having been pre-allocated for a file but no data has been written in the block yet, and
- valid, indicating the block having been allocated and data has been written to the block.
It is thereby possible to very efficiently determine the block state parameters.
According to a first implementation form of the first implementation form of the first aspect, the determining of the segment state parameters of the segments comprises allocating a segment state parameter value to each of the segments. Possible segment state parameter values are:
- clean, indicating the segment having only blocks with block state parameter value free,
- using, indicating the segment having blocks with block state parameter values valid, invalid, pre-allocated and free,
- used, indicating the segment having only blocks with block state parameter value valid,
- pre-dirty, indicating the segment having only blocks with block state parameter values valid and invalid, and
- dirty, indicating the segment having only blocks with block state parameter value invalid.
It is thereby possible to very efficiently determine the segment state parameters.
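The block and segment state values listed above can be modelled as enumerations, with the segment state derived from the set of block states present in the segment. This is a sketch under stated assumptions: the names `BlockState`, `SegmentState` and `segment_state` are hypothetical, and any mixture not covered by the other four definitions is assumed to count as using.

```python
from enum import Enum
from typing import Iterable


class BlockState(Enum):
    FREE = "free"
    INVALID = "invalid"
    PRE_ALLOCATED = "pre-allocated"
    VALID = "valid"


class SegmentState(Enum):
    CLEAN = "clean"
    USING = "using"
    USED = "used"
    PRE_DIRTY = "pre-dirty"
    DIRTY = "dirty"


def segment_state(blocks: Iterable[BlockState]) -> SegmentState:
    """Derive a segment state from the block states found in the segment."""
    present = set(blocks)
    if not present:
        raise ValueError("a segment must contain at least one block")
    if present == {BlockState.FREE}:
        return SegmentState.CLEAN
    if present == {BlockState.VALID}:
        return SegmentState.USED
    if present == {BlockState.INVALID}:
        return SegmentState.DIRTY
    if present == {BlockState.VALID, BlockState.INVALID}:
        return SegmentState.PRE_DIRTY
    # Assumption: any other mix (free or pre-allocated blocks remain)
    # counts as "using", i.e. the segment is still being filled.
    return SegmentState.USING
```

Only the set of states matters in this model, so the function does not depend on how many blocks a segment holds.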
According to a first implementation form of the first implementation form of the first implementation form of the first aspect, the determining of the volume state parameter of the volume comprises allocating a volume state parameter value to the volume. Possible volume state parameter values are:
- clean, indicating the volume containing only segments of segment state parameter value clean,
- icy, indicating most of allocated segments on the volume having 100% blocks of block state parameter value valid,
- cold, indicating most of allocated segments on the volume having 80-99% blocks of block state parameter value valid,
- cooling-off, indicating most of allocated segments on the volume having 65-80% blocks of block state parameter value valid,
- warming-up, indicating most of allocated segments on the volume having 50-65% blocks of block state parameter value valid,
- warm, indicating most of allocated segments on the volume having 35-50% blocks of block state parameter value valid,
- pre-hot, indicating most of allocated segments on the volume having 20-35% blocks of block state parameter value valid,
- hot, indicating most of allocated segments on the volume having 1-20% blocks of block state parameter value valid, and
- boiling, indicating most of allocated segments on the volume having 0% blocks of block state parameter value valid.
It is thereby possible to very efficiently determine the volume state parameter.
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first aspect, determining the garbage collection queue comprises determining an invalid block count of each of the segments and ordering the segments according to their invalid blocks count, wherein the segments of higher invalid blocks count are arranged in the garbage collection queue to be garbage collected earlier than segments of lower invalid blocks count. A further increase in garbage collection efficiency can thereby be achieved.
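A queue ordered by invalid blocks count can be sketched as an array of chains, one chain per possible count, which matches the structure outlined earlier in the summary. The class name `GCQueue` and its methods are hypothetical, and segments are represented only by integer identifiers.

```python
from collections import deque
from typing import Deque, Iterator, List


class GCQueue:
    """Array of chains: index = invalid blocks count, chain = segment ids."""

    def __init__(self, blocks_per_segment: int) -> None:
        self.blocks = blocks_per_segment
        # Index 0 stays empty: segments without invalid blocks are not queued.
        self.chains: List[Deque[int]] = [deque() for _ in range(blocks_per_segment + 1)]

    def add(self, segment_id: int, invalid_blocks: int) -> None:
        if not 1 <= invalid_blocks <= self.blocks:
            raise ValueError("only pre-dirty or dirty segments are queued")
        self.chains[invalid_blocks].append(segment_id)

    def count(self, invalid_blocks: int) -> int:
        """Counter of segments in the chain for this invalid blocks count."""
        return len(self.chains[invalid_blocks])

    def in_collection_order(self) -> Iterator[int]:
        """Yield segment ids, highest invalid blocks count first."""
        for chain in reversed(self.chains):
            yield from chain
```

Because the index is the invalid blocks count, insertion is a constant-time append and no sorting is needed; iterating the array from its last index downwards yields the cheapest segments to clean first.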
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first aspect, the determining of the garbage collection segments comprises:
- if the volume state parameter value is boiling, selecting all segments as garbage collection segments,
- if the volume state parameter value is hot, selecting the garbage collection segments as all segments in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is pre-hot, selecting the garbage collection segments as at least some of segments in range of 20-35% of block state parameter value valid and all segments in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is warm, selecting the garbage collection segments as at least some of segments in range of 35-50% of block state parameter value valid,
- if the volume state parameter value is warming-up, selecting the garbage collection segments as at least some of segments in range of 50-65% of block state parameter value valid,
- if the volume state parameter value is cooling-off, selecting no garbage collection segments,
- if the volume state parameter value is cold, selecting no garbage collection segments,
- if the volume state parameter value is icy, selecting no garbage collection segments, and
- if the volume state parameter value is clean, selecting no garbage collection segments.
A significant reduction in the number of segments to be garbage collected can thereby be achieved. A further significant efficiency increase is the result.
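The selection rules above can be sketched as a single dispatch on the volume state. Several points are assumptions made for illustration: the names `in_range` and `select_gc_segments` are hypothetical, range lower bounds are treated as inclusive, and "at least some" is modelled by a hypothetical `budget` parameter that takes the segments with the fewest valid blocks first.

```python
from typing import List, Tuple

Segment = Tuple[int, float]   # (segment id, percentage of valid blocks)


def in_range(segs: List[Segment], low: float, high: float) -> List[Segment]:
    """Segments with low <= valid% < high, excluding fully invalid ones."""
    return [s for s in segs if low <= s[1] < high and s[1] > 0]


def select_gc_segments(state: str, segs: List[Segment], budget: int = 2) -> List[int]:
    """Return the ids of the segments to collect for the given volume state."""

    def some(low: float, high: float) -> List[Segment]:
        # "At least some" is modelled as the `budget` cheapest segments,
        # i.e. those with the fewest valid blocks to copy (an assumption).
        return sorted(in_range(segs, low, high), key=lambda s: s[1])[:budget]

    if state == "boiling":
        chosen = list(segs)
    elif state == "hot":
        chosen = in_range(segs, 0, 20)
    elif state == "pre-hot":
        chosen = in_range(segs, 0, 20) + some(20, 35)
    elif state == "warm":
        chosen = some(35, 50)
    elif state == "warming-up":
        chosen = some(50, 65)
    else:  # cooling-off, cold, icy, clean: no garbage collection
        chosen = []
    return [seg_id for seg_id, _ in chosen]
```

For a pre-hot volume, for instance, every segment below 20% valid is selected, plus at most `budget` segments from the 20-35% range.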
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first aspect, in case of the volume state parameter value being cooling-off or cold or icy, a threading of the log of the log structured file system through free extents of the volume is performed. In this case, no garbage collection is performed. This further reduces the garbage collection workload and increases the efficiency.
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first aspect, the performing of the garbage collection of the garbage collection segments comprises:
- reading live data out of the garbage collection segments,
- writing the live data back to a smaller number of segments of segment state parameter value clean, and
- marking the garbage collection segments with segment state parameter value clean.
Also by these measures, a very efficient garbage collection is achieved.
According to a second aspect of the invention, a computer program with a program code for performing the method according to the first aspect or the corresponding implementation forms, when the computer program runs on a computer, is provided.
According to a third aspect of the invention, a garbage collection device for performing a garbage collection of a volume in a log structured file system is provided. The volume comprises a plurality of segments, each of which comprises a plurality of blocks. The device comprises a state parameter determination unit adapted to determine a block state parameter of each of the blocks, determine a segment state parameter of each of the segments based upon the block state parameters of the blocks of the segment, and determine a volume state parameter of the volume based upon the segment state parameters and block state parameters. Moreover, the device comprises a garbage collection determination unit adapted to determine a garbage collection queue based upon the segment state parameters, and determine garbage collection segments based upon the volume state and the garbage collection queue. Also, the device comprises a garbage collection unit adapted to perform garbage collection of the garbage collection segments. It is thereby possible to perform a very efficient garbage collection of the volume in the log structured file system.
According to a first implementation form of the third aspect, the state parameter determination unit is adapted to determine the block state parameters by allocating a block state parameter value to each of the blocks. Possible block state parameters are:
- free, indicating the block having been prepared for a write operation but not yet allocated,
- invalid, indicating the block having been freed after file update, file truncation or deletion operation but it was not collected as garbage yet,
- pre-allocated, indicating the block having been pre-allocated for a file but no data has been written in the block yet, and
- valid, indicating the block having been allocated and data has been written to the block.
It is thereby possible to very efficiently determine the block state parameters.
According to a first implementation form of the first implementation form of the third aspect, the state parameter determination unit is adapted to determine the segment state parameters of the segments by allocating a segment state parameter value to each of the segments. Possible segment state parameter values are:
- clean, indicating the segment having only blocks with block state parameter value free,
- using, indicating the segment having blocks with block state parameter values valid, invalid, pre-allocated and free,
- used, indicating the segment having only blocks with block state parameter value valid,
- pre-dirty, indicating the segment having only blocks with block state parameter values valid and invalid, and
- dirty, indicating the segment having only blocks with block state parameter value invalid.
It is thereby possible to very efficiently determine the segment state parameters.
According to a first implementation form of the first implementation form of the first implementation form of the third aspect, the state parameter determination unit is adapted to determine the volume state parameter of the volume by allocating a volume state parameter value to the volume. Possible volume state parameters are:
- clean, indicating the volume containing only segments of segment state parameter value clean,
- icy, indicating most of allocated segments on the volume having 100% blocks of block state parameter value valid,
- cold, indicating most of allocated segments on the volume having 80-99% blocks of block state parameter value valid,
- cooling-off, indicating most of allocated segments on the volume having 65-80% blocks of block state parameter value valid,
- warming-up, indicating most of allocated segments on the volume having 50-65% blocks of block state parameter value valid,
- warm, indicating most of allocated segments on the volume having 35-50% blocks of block state parameter value valid,
- pre-hot, indicating most of allocated segments on the volume having 20-35% blocks of block state parameter value valid,
- hot, indicating most of allocated segments on the volume having 1-20% blocks of block state parameter value valid, and
- boiling, indicating most of allocated segments on the volume having 0% blocks of block state parameter value valid.
It is thereby possible to very efficiently determine the volume state parameter.
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the third aspect, the garbage collection determination unit is adapted to determine the garbage collection queue by determining an invalid blocks count of each of the segments and ordering the segments according to their invalid blocks count. Segments of higher invalid blocks count are arranged in the garbage collection queue to be garbage collected earlier than segments of lower invalid blocks count. A further increase in garbage collection efficiency can thereby be achieved.
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the third aspect, the garbage collection determination unit is adapted to determine the garbage collection segments by:
- if the volume state parameter value is boiling, selecting all segments as garbage collection segments,
- if the volume state parameter value is hot, selecting the garbage collection segments as all segments in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is pre-hot, selecting the garbage collection segments as at least some of segments in range of 20-35% of block state parameter value valid and all segments in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is warm, selecting the garbage collection segments as at least some of segments in range of 35-50% of block state parameter value valid,
- if the volume state parameter value is warming-up, selecting the garbage collection segments as at least some of segments in range of 50-65% of block state parameter value valid,
- if the volume state parameter value is cooling-off, selecting no garbage collection segments,
- if the volume state parameter value is cold, selecting no garbage collection segments,
- if the volume state parameter value is icy, selecting no garbage collection segments, and
- if the volume state parameter value is clean, selecting no garbage collection segments.
A significant reduction in the number of segments to be garbage collected can thereby be achieved. A further significant efficiency increase is the result.
According to a first form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the third aspect, the garbage collection device is adapted to perform, in case of the volume state parameter value being cooling-off or cold or icy, a threading of the log of the log structured file system through free extents of the volume. In this case, no garbage collection is performed. This further reduces the garbage collection workload and increases the efficiency.
According to a first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the first implementation form of the third aspect, the garbage collection unit is adapted to perform the garbage collection of the garbage collection segments by:
- reading live data out of the garbage collection segments,
- writing the live data back to a smaller number of segments of segment state parameter value clean, and
- marking the garbage collection segments with segment state parameter value clean.
Also by these measures, a very efficient garbage collection is achieved.
Generally, it has to be noted that all arrangements, devices, elements, units and means and so forth described in the present application could be implemented by software or hardware elements or any kind of combination thereof. Furthermore, the devices may be processors or may comprise processors, wherein the functions of the elements, units and means described in the present application may be implemented in one or more processors. All steps which are performed by the various entities described in the present application, as well as the functionality described to be performed by the various entities, are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by a general entity is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear to a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The present invention is explained in detail in the following with regard to embodiments of the invention and with reference to the enclosed drawings, in which
Fig. 1 shows a structure of an exemplary log structured file system;
Fig. 2 shows two different exemplary options of garbage collection in a log structured file system;
Fig. 3 shows an embodiment of the inventive garbage collection device in a block diagram;
Fig. 4 shows exemplary blocks and segments in a volume;
Fig. 5 shows an overview of a garbage collection of several segments;
Fig. 6 shows several blocks and segments before and after garbage collection activity;
Fig. 7 shows a state diagram of different block states;
Fig. 8 shows an exemplary garbage collection queue;
Fig. 9 shows several segments before and after garbage collection and the corresponding garbage collection queues;
Fig. 10 shows an exemplary diagram showing segment state parameters and a volume state;
Fig. 11 shows a further exemplary diagram showing segment state parameters and a volume state;
Fig. 12 shows a further exemplary diagram showing segment state parameters and a volume state;
Fig. 13 shows a first embodiment of the method according to the first aspect of the invention, and
Fig. 14 shows a second embodiment of the method according to the first aspect of the invention.
DESCRIPTION OF EMBODIMENTS
First, the general construction and function of an embodiment of the garbage collection device according to the third aspect of the invention are demonstrated with reference to Fig. 3. After this, further details of the construction and function are elaborated with reference to Fig. 4 - Fig. 12. Finally, the function of embodiments of the inventive method according to the first aspect is described with reference to Fig. 13 and Fig. 14. Similar entities and reference numbers in different figures have been partially omitted.
In Fig. 3, an exemplary embodiment of the inventive garbage collection device 30 is shown. The garbage collection device 30 comprises a state parameter determination unit 31, a garbage collection determination unit 32 and a garbage collection unit 33. Moreover, it comprises a control unit 34. The control unit 34 is connected to the state parameter determination unit 31, to the garbage collection determination unit 32 and to the garbage collection unit 33. Moreover, the state parameter determination unit 31 is connected to the garbage collection determination unit 32, which in turn is connected to the garbage collection unit 33.
The garbage collection device 30 is adapted for performing a garbage collection of a volume in a log structured file system. The volume comprises a plurality of segments, which each comprise a plurality of blocks. The state parameter determination unit 31 determines a block state parameter for each of the blocks. Moreover, it determines a segment state parameter of each of the segments based upon the block state parameters of the blocks of the respective segment. Finally, it determines a volume state parameter based on the segment state parameters and block state parameters. The state parameters are handed on to the garbage collection determination unit 32, which then determines a garbage collection queue based upon the segment state parameters. Moreover, it determines garbage collection segments based upon the volume state and the garbage collection queue, that is, the segments on which a garbage collection is to be performed. A list of these garbage collection segments is then handed on to the garbage collection unit 33, which finally performs a garbage collection of the respective garbage collection segments. The function of the state parameter determination unit 31, the garbage collection determination unit 32 and the garbage collection unit 33 is controlled by the control unit 34.
In the following, some more information regarding typical log structured file systems is given.
A volume 40 of a log structured file system (LFS), as shown in Fig. 4, is a chain of segments 41, 42, 43, 44, 45. Each segment 41, 42, 43, 44, 45 is an aggregate of physical disk blocks. The capacity of a segment is usually 2 MB or 8 MB. An LFS allocates free space by taking segments from a free segment pool. Free space in a segment is used for allocating free blocks when files are created or updated.
Thereby, a segment of an LFS can be:
(1) completely or partially free;
(2) completely filled by blocks with valid data;
(3) completely filled by invalid blocks;
(4) filled by valid as well as invalid blocks.
An invalid block is a block that was previously written with a file's data but, after a file update, holds obsolete data in its current state.
The design of log structured file systems is based on the hypothesis that optimizing the on-disk layout for reads will no longer be effective, because ever-increasing memory sizes on modern computers lead to I/O becoming write-heavy, as reads are almost always satisfied from the memory cache. An LFS treats its storage as a circular log and writes sequentially to the head of the log. Invalid blocks are thereby a consequence of the Copy-on-Write (COW) policy on which an LFS operates. The COW policy means that the modified blocks of a file are not updated in place; instead, these modified blocks are written to the last log.
An exemplary write operation in a log structured file system is shown in Fig. 5. Segments 50, 51 and 52 are at least partially filled with data. In the here-depicted operation, an additional block is written to segment 52.
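The copy-on-write behaviour described above can be illustrated with a minimal sketch. The class `AppendOnlyLog` and its fields are hypothetical names chosen for illustration and do not appear in the application; segment boundaries and the circular wrap-around of the log are deliberately left out.

```python
class AppendOnlyLog:
    """Toy copy-on-write log: updates never overwrite a block in place."""

    def __init__(self):
        self.blocks = []   # every block ever written, in log order
        self.latest = {}   # (file_id, block_no) -> index of the live copy

    def write(self, file_id, block_no, data):
        key = (file_id, block_no)
        previous = self.latest.get(key)
        if previous is not None:
            # The old copy stays where it is but now holds obsolete data.
            self.blocks[previous]["valid"] = False
        # The new version is appended at the head of the log.
        self.blocks.append({"key": key, "data": data, "valid": True})
        self.latest[key] = len(self.blocks) - 1

    def read(self, file_id, block_no):
        return self.blocks[self.latest[(file_id, block_no)]]["data"]


log = AppendOnlyLog()
log.write("a.txt", 0, "v1")
log.write("a.txt", 0, "v2")  # update: appends a new block, invalidates the old one
states = [block["valid"] for block in log.blocks]
```

After the second write the log holds two blocks, the first one invalid and the second one valid. Invalid blocks of this kind are what garbage collection later has to reclaim.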
Log structured file systems, however, must reclaim free space from the tail of the log to prevent the file system from becoming full when the head of the log wraps around to meet it. The tail can release space and move forward by skipping over data for which newer versions exist farther ahead in the log. If there are no newer versions, then the data is moved and appended to the head.
Such a garbage collection operation is shown in Fig. 6. In the upper part of Fig. 6, the situation before the garbage collection is shown. Segments 601, 602, 603, 604 and 605 are present within a volume 60. The segments 602 and 603 are dirty segments.
In the lower section of Fig. 6, the situation after the garbage collection is shown. The volume 60 still comprises the segment 601. The segment 602 is still present as segment 612, the segment 603 is present as segment 613, the segment 604 is present as segment 614 and the segment 605 is present as segment 615. The content of these respective segments, though, has changed. The dirty segments 602 and 603 are garbage collected. This means that the content of the dirty segment 602 is copied and written to the empty segment 605, resulting in segment 615. Since the segment 603 contained no valid data, the garbage collection of this segment merely includes labeling the segment as free segment 613. Moreover, in Fig. 6 the update of a single block of the segment 604 is shown. This block is afterwards free. The content has been copied to the segment 615.
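The cleaning of the segments 602 and 603 can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation of the application: the function `collect` and the dictionary layout are hypothetical, and the block contents are made up since Fig. 6 does not specify them.

```python
def collect(volume, victims, target):
    """Copy the valid blocks of each victim segment to `target`, then free the victim.

    `volume` maps a segment number to a list of (data, is_valid) pairs.
    """
    for seg in victims:
        live = [(data, True) for data, is_valid in volume[seg] if is_valid]
        volume[target].extend(live)   # valid data is appended to the target segment
        volume[seg] = []              # the victim is now a free segment
    return volume


# Segment numbers follow Fig. 6; the block contents are invented for illustration.
volume = {
    602: [("A", True), ("B", False), ("C", True)],   # still holds some valid data
    603: [("D", False), ("E", False)],               # holds no valid data
    605: [],                                         # empty segment
}
collect(volume, victims=[602, 603], target=605)
```

Cleaning segment 603 copies nothing and only frees the segment, whereas cleaning segment 602 costs one copy per valid block.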
Arbitrarily performing such garbage collection operations though can decrease performance of log structured file systems significantly.
In log structured file systems, every segment is a sequence of blocks. A block of a segment can be:
(1) "free";
(2) "pre-allocated";
(3) "valid", and
(4) "invalid".
A definition of these block states is described in the following table.
Complex state | Pair of simple states | Description
Free | [Free:Clean] | Block has been prepared for the write operation but it was not allocated yet.
Invalid | [Free:Dirty] | Block has been freed after file update, file truncation or deletion operation but it was not collected as garbage yet.
Pre-allocated | [Allocated:Clean] | Block has been pre-allocated for a file but it was not written by any data yet.
Valid | [Allocated:Dirty] | Block has been allocated and it was written by data.
Different combinations of block states in a single segment define the state of the segment itself.
As a result, a segment can be:
(1) "clean";
(2) "using";
(3) "used";
(4) "pre-dirty";
(5) "dirty".
A definition of the segment states is described in the following table.
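State | Description
Clean | Segment has only blocks with block state "free".
Using | Segment has blocks with block states "valid", "invalid", "pre-allocated" and "free".
Used | Segment has only blocks with block state "valid".
Pre-dirty | Segment has only blocks with block states "valid" and "invalid".
Dirty | Segment has only blocks with block state "invalid".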
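The relation between the block states and the segment states listed above can be expressed compactly in code. The following sketch is an illustration only: the enum and function names are hypothetical, and treating every partly filled segment that still contains free or pre-allocated blocks as "using" is an assumption, since the description does not spell out every mixed case.

```python
from enum import Enum


class BlockState(Enum):
    # Each complex state is a pair of simple states: (allocation, cleanliness).
    FREE = ("Free", "Clean")
    INVALID = ("Free", "Dirty")
    PRE_ALLOCATED = ("Allocated", "Clean")
    VALID = ("Allocated", "Dirty")


def segment_state(blocks):
    """Derive a segment state name from the states of its blocks."""
    states = set(blocks)
    if not states:
        raise ValueError("a segment comprises at least one block")
    if states == {BlockState.FREE}:
        return "clean"
    if BlockState.FREE in states or BlockState.PRE_ALLOCATED in states:
        return "using"        # assumption: the segment is still being filled
    if states == {BlockState.VALID}:
        return "used"
    if states == {BlockState.INVALID}:
        return "dirty"
    return "pre-dirty"        # only valid and invalid blocks remain


example = segment_state([BlockState.VALID, BlockState.INVALID, BlockState.VALID])
```

Here `example` evaluates to "pre-dirty", because the segment is full and mixes valid with invalid blocks.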
Fig. 7 illustrates possible segment state transformations in a state diagram. Firstly, in state 71, an exemplary segment is completely clean. A segment becomes "using" after the segment has been allocated for saving the last log on a volume. This corresponds to the segment state 72. The "using" state 72 can be transformed into the "used" state 75 or the "pre-dirty" state 73 after the segment's free space has been exhausted by saved logs. The "used" state 75 of the segment can be shifted to the "pre-dirty" state 73 as a result of a deletion or an update operation. Finally, the segment enters the "dirty" state 74 after invalidation of all blocks in the segment. The segment is transformed to the "clean" state 71 by performing a garbage collection, GC.
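The transformations described for Fig. 7 can be captured in a small transition table. This is a sketch under stated assumptions: the names `TRANSITIONS` and `can_transition` are hypothetical, the "dirty" state is assumed to be entered from "pre-dirty", and the transition from "pre-dirty" to "clean" by garbage collection is an assumption that the text does not state explicitly.

```python
# Allowed segment state transitions; reference numbers of Fig. 7 in the comments.
TRANSITIONS = {
    "clean": {"using"},               # 71 -> 72: allocated for saving the last log
    "using": {"used", "pre-dirty"},   # 72 -> 75 or 73: free space exhausted
    "used": {"pre-dirty"},            # 75 -> 73: deletion or update operation
    "pre-dirty": {"dirty", "clean"},  # 73 -> 74: all blocks invalidated;
                                      # 73 -> 71 by GC is an assumption
    "dirty": {"clean"},               # 74 -> 71: garbage collection
}


def can_transition(current, target):
    """Return True if the sketch allows moving from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


# One full life cycle of a segment, checked step by step.
life_cycle = ["clean", "using", "used", "pre-dirty", "dirty", "clean"]
life_cycle_ok = all(
    can_transition(a, b) for a, b in zip(life_cycle, life_cycle[1:])
)
```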
The count of valid blocks in the segment is an important factor. This factor defines the intensity of GC work when a segment is cleared, because all valid blocks must be copied from the cleared segments into the last log. The cost of clearing a segment therefore grows with the number of valid blocks in the segment. As a result, GC should select the segments with the least clearing cost. However, using only a "least clearing cost" criterion is not enough for the most efficient GC, since in different volume states such a criterion can have a different efficiency and can even result in a significant performance overhead. It is additionally possible to group segments on the basis of the count of valid blocks in a segment. The combination of segment counts in the different groups can serve as a basis for distinguishing possible volume states. The following table describes such possible volume states.
State | Description
Clean | Volume contains only clean segments.
Icy | Most of the allocated segments on the volume have 100% of valid blocks.
Cold | Most of the allocated segments on the volume have 80% - 99% of valid blocks.
Cooling-off | Most of the allocated segments on the volume have 65% - 80% of valid blocks.
Warming-up | Most of the allocated segments on the volume have 50% - 65% of valid blocks.
Warm | Most of the allocated segments on the volume have 35% - 50% of valid blocks.
Pre-hot | Most of the allocated segments on the volume have 20% - 35% of valid blocks.
Hot | Most of the allocated segments on the volume have 1% - 20% of valid blocks.
Boiling | Most of the allocated segments on the volume have 0% of valid blocks.
The invalid blocks count can serve as a criterion for structuring the GC queue of pre-dirty/dirty segments. Such a criterion is a way of selecting pre-dirty/dirty segments with minimal clearing cost. The GC queue can thereby be structured as an array of dirty segment chains, with the invalid blocks count used as the index of the array. This is depicted in Fig. 8. As a result, GC can choose as garbage collection segments the segments with the greatest count of invalid blocks. Moreover, counting the segments in different ranges of invalid blocks count makes it possible to detect the volume state as shown in the table above.
In Fig. 8, a garbage collection queue 80 is shown. The number of invalid blocks of the segments 81 is used as an index. Each index number corresponds to a chain of segments to be cleared 82.
In the upper left part of Fig. 9, a volume 90 is depicted that initially has three segments in the "used" state. All segment numbers are therefore located in the chain for segments with an invalid blocks count of 0. The corresponding garbage collection queue is shown in the lower left part of Fig. 9. Now, a deletion of a file located in segment #2 and an update of files located in segments #1 and #3 are performed. The resulting volume 91 is displayed in the upper right part of Fig. 9. The updated files are now located in segments #4 and #5. The corresponding resulting garbage collection queue is shown in the lower right part of Fig. 9.
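A garbage collection queue organised as an array of segment chains, indexed by invalid blocks count, might look like the following sketch. The class `GCQueue` and its methods are hypothetical names. The scenario loosely follows Fig. 9, but the segment size of 4 blocks and the individual invalid block counts are invented for illustration, since the figure values are not given in the text.

```python
class GCQueue:
    """Array of segment chains; the array index is the invalid blocks count."""

    def __init__(self, blocks_per_segment):
        self.chains = [[] for _ in range(blocks_per_segment + 1)]
        self.invalid = {}   # segment number -> current invalid blocks count

    def set_invalid(self, segment, count):
        """Place `segment` into the chain that matches its invalid blocks count."""
        old = self.invalid.get(segment)
        if old is not None:
            self.chains[old].remove(segment)
        self.chains[count].append(segment)
        self.invalid[segment] = count

    def in_clearing_order(self):
        """Segments ordered from the most to the fewest invalid blocks."""
        return [seg for chain in reversed(self.chains) for seg in chain]


queue = GCQueue(blocks_per_segment=4)
for seg in (1, 2, 3):          # three "used" segments: all in chain 0
    queue.set_invalid(seg, 0)
initial_chain_0 = list(queue.chains[0])

queue.set_invalid(2, 4)        # file in segment #2 deleted (assumed: all 4 blocks)
queue.set_invalid(1, 2)        # file in segment #1 updated (assumed: 2 blocks)
queue.set_invalid(3, 1)        # file in segment #3 updated (assumed: 1 block)
queue.set_invalid(4, 0)        # segments #4 and #5 receive the updated files
queue.set_invalid(5, 0)
order = queue.in_clearing_order()
```

Under these assumed counts, segment #2 is first in the clearing order because it has the most invalid blocks and therefore the lowest clearing cost.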
Knowledge of how the segments are distributed over the various invalid block counts makes it possible to build a graphical representation of the segment count as a function of the invalid blocks count. Such representations are shown in Fig. 10, Fig. 11 and Fig. 12.
From this information, it is moreover possible to determine the peak of the data in such a diagram. For example, in Fig. 10, the peak is clearly at an invalid blocks count of 1, resulting in a volume state "cold". In Fig. 11, the peak is clearly at an invalid blocks count of between 332 and 408, resulting in a volume state "pre-hot". In Fig. 12, the peak is clearly at an invalid blocks count of 511, resulting in a volume state "hot".
Based on the position of this peak, which reflects the volume state, it is possible to manage GC behavior. An exemplary GC behavior based upon volume state parameters is shown in Fig. 14; reference is made to the description of the method given there.
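The detection of the volume state from the distribution of segments can be sketched as follows. Several points are assumptions rather than statements of the application: the function names are hypothetical, a segment is taken to have 512 blocks (which matches a 2 MB segment only if a block size of 4 KB is assumed), all blocks that are not invalid are counted as valid, the overlapping range boundaries of the table are resolved by treating each upper bound as inclusive, and "most of the allocated segments" is read as the range containing the largest number of segments.

```python
from collections import Counter

# Upper bounds (inclusive, in percent of valid blocks) of the intermediate states.
BANDS = [(20, "hot"), (35, "pre-hot"), (50, "warm"),
         (65, "warming-up"), (80, "cooling-off")]


def band(valid_percent):
    """Map the valid blocks percentage of one segment to a state name."""
    if valid_percent <= 0:
        return "boiling"
    for upper, name in BANDS:
        if valid_percent <= upper:
            return name
    return "cold" if valid_percent < 100 else "icy"


def volume_state(invalid_counts, blocks_per_segment=512):
    """`invalid_counts` holds the invalid blocks count of each allocated segment."""
    if not invalid_counts:
        return "clean"
    bands = Counter(
        band(100.0 * (blocks_per_segment - n) / blocks_per_segment)
        for n in invalid_counts
    )
    return bands.most_common(1)[0][0]


cold_example = volume_state([1, 1, 1, 300])       # peak at 1 invalid block
pre_hot_example = volume_state([370, 370, 10])    # peak between 332 and 408
hot_example = volume_state([511, 511, 0])         # peak at 511 invalid blocks
```

With these assumptions the three examples reproduce the states named for Fig. 10, Fig. 11 and Fig. 12.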
In Fig. 13, an exemplary embodiment of the method according to the first aspect of the invention is depicted. In a first step 130, a volume is provided in a log structured file system. The volume comprises a plurality of segments, each of which comprises a plurality of blocks. In a second step 131, a block state parameter of each of the blocks is determined. In a third step 132, a segment state parameter of each of the segments is determined based upon the block state parameters of the blocks of the respective segment. In a fourth step 133, a volume state parameter of the volume is determined based upon the segment state parameters and the block state parameters. In a fifth step 134, a garbage collection queue is determined based upon the segment state parameters determined in the third step 132. In a sixth step 135, garbage collection segments are determined based upon the volume state and the garbage collection queue. Finally, in a seventh step 136, a garbage collection of the earlier determined garbage collection segments is performed. For the details of each of these steps, reference is made to the earlier description.
Finally, in Fig. 14, a detailed embodiment of the inventive method according to the first aspect of the invention is shown. The steps shown here correspond to the sixth step 135 of Fig. 13.
In a first step 140, it is determined whether the earlier-determined volume state is "hot". If this is the case, in a second step 141, the garbage collection segments are selected as all segments in the range of 1 - 20 % of block state parameter value valid. If this is not the case, in a third step 142, it is checked whether the volume state parameter value is "pre-hot". If this is the case, in a fourth step 143, the garbage collection segments are selected so that at least some of the segments in the range of 20 - 35 % of block state parameter value valid and all segments in the range of 1 - 20 % of block state parameter value valid are selected. If this is not the case, in a fifth step 144, it is checked whether the volume state parameter value is "warm". If this is the case, in a sixth step 145, the garbage collection segments are selected as at least some of the segments in the range of 35 - 50 % of block state parameter value valid. If this is not the case, in a seventh step 146, it is determined whether the volume state parameter value is "warming-up". If this is the case, in an eighth step 147, the garbage collection segments are selected as at least some of the segments in the range of 50 - 65 % of block state parameter value valid. If this is not the case, it is checked in a ninth step 148 whether the volume state parameter value is "cooling-off". If this is the case, no garbage collection segments are selected and, in a tenth step 149, a threading of the log through free extents is performed. If this is not the case, in an eleventh step 150, it is checked whether the volume state parameter value is "cold". If this is the case, the method also continues with the tenth step 149. Apart from the volume state parameter values depicted here, as a reaction to a volume state parameter value "boiling", all segments are selected as garbage collection segments. Moreover, as a reaction to a volume state parameter value "icy" or "clean", no garbage collection segments are selected.
In case of the volume state parameter value being "icy", a threading as in the tenth step 149 is also performed.
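The decision sequence of Fig. 14 can be summarised in a single selection function. The sketch below is illustrative and rests on assumptions: the function name and the data layout are hypothetical, "at least some of the segments" is modelled by a made-up cap `some_limit`, each range is treated as exclusive at its lower and inclusive at its upper bound, and the lowest range is taken to start just above 0 % so that it borders on the "boiling" case.

```python
def select_gc_segments(state, segments, some_limit=2):
    """Pick garbage collection segments for a given volume state.

    `segments` maps a segment number to (valid_percent, invalid_blocks).
    Returns (selected segment numbers, thread_log), where `thread_log` tells
    whether the log is to be threaded through free extents instead.
    """
    def in_range(low, high):
        hits = [s for s, (valid, _) in segments.items() if low < valid <= high]
        # Queue order: segments with more invalid blocks are cleared first.
        return sorted(hits, key=lambda s: segments[s][1], reverse=True)

    if state == "boiling":
        return list(segments), False
    if state == "hot":
        return in_range(0, 20), False
    if state == "pre-hot":
        return in_range(0, 20) + in_range(20, 35)[:some_limit], False
    if state == "warm":
        return in_range(35, 50)[:some_limit], False
    if state == "warming-up":
        return in_range(50, 65)[:some_limit], False
    if state in ("cooling-off", "cold", "icy"):
        return [], True        # no GC; thread the log through free extents
    if state == "clean":
        return [], False
    raise ValueError(f"unknown volume state: {state}")


# Invented example data: segment number -> (valid blocks in %, invalid blocks).
segments = {1: (10.0, 460), 2: (15.0, 435), 3: (25.0, 384), 4: (30.0, 358),
            5: (33.0, 343), 6: (45.0, 281), 7: (90.0, 51)}
hot_selection = select_gc_segments("hot", segments)
pre_hot_selection = select_gc_segments("pre-hot", segments)
cold_selection = select_gc_segments("cold", segments)
```

The cap on "some" segments is only one possible policy; the description leaves open how many segments of the higher range are taken.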
The invention is not limited to the examples shown in the description. In particular, the invention can be used with any sort of storage medium, including hard drives, solid state discs and even random access memory or flash memory. The characteristics of the exemplary embodiments can be used in any advantageous combination.
The invention has been described in conjunction with various embodiments herein. However, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps and the indefinite article "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. A computer program may be stored/distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the internet or other wired or wireless communication systems.

Claims

1. A method for garbage collection of a volume (10, 20, 21 , 40, 60, 90, 91) in log structured file systems,
wherein the volume (10, 20, 21 , 40, 60, 90, 91 ) comprises a plurality of segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615),
wherein the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) each comprise a plurality of blocks,
wherein the method comprises:
- determining (131) a block state parameter of each of the blocks,
- determining (132) a segment state parameter of each of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) based upon the block state parameters of the blocks of the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615),
- determining (133) a volume state parameter of the volume (10, 20, 21 , 40, 60, 90, 91) based upon the segment state parameters and the block state parameters,
- determining (134) a garbage collection queue based upon the segment state parameters,
- determining (135) garbage collection segments based upon the volume state and the garbage collection queue, and
- performing (136) garbage collection of the garbage collection segments.
2. The method of claim 1,
wherein the determining (131 ) of the block state parameters comprises allocating a block state parameter value to each of the blocks, and
wherein possible block state parameter values are:
- free, indicating the block having been prepared for a write operation but not yet allocated,
- invalid, indicating the block having been freed after file update, file truncation or deletion operation but it was not collected as garbage yet,
- pre-allocated, indicating the block having been pre-allocated for a file but no data has been written in the block yet, and
- valid, indicating the block having been allocated and data has been written to the block.
3. The method of claim 2,
wherein the determining (132) of the segment state parameters of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) comprises allocating a segment state parameter value to each of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615), and
wherein possible segment state parameter values are:
- clean, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value free,
- using, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having blocks with block state parameter values valid, invalid, pre-allocated and free,
- used, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value valid,
- pre-dirty, indicating the segment (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter values valid and invalid, and
- dirty, indicating the segment (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value invalid.
4. The method of claim 3,
wherein the determining (133) of the volume state parameter of the volume (10, 20, 21, 40, 60, 90, 91) comprises allocating a volume state parameter value to the volume (10, 20, 21, 40, 60, 90, 91), and
wherein possible volume state parameter values are:
- clean, indicating the volume (10, 20, 21 , 40, 60, 90, 91) containing only segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) of segment state parameter value clean,
- icy, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21 , 40, 60, 90, 91 ) having 100% blocks of block state parameter value valid,
- cold, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21 , 40, 60, 90, 91 ) having 80-99% blocks of block state parameter value valid,
- cooling-off, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21 , 40, 60, 90, 91) having 65-80% blocks of block state parameter value valid,
- warming-up, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 50-65% blocks of block state parameter value valid,
- warm, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21 , 40, 60, 90, 91 ) having 35-50% blocks of block state parameter value valid,
- pre-hot, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 20-35% blocks of block state parameter value valid,
- hot, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 1-20% blocks of block state parameter value valid, and
- boiling, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21 , 40, 60, 90, 91) having 0% blocks of block state parameter value valid.
5. The method of claim 4,
wherein the determining (134) of the garbage collection queue comprises:
- determining an invalid blocks count of each of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615), and
- ordering the segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) according to their invalid blocks count, wherein segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) of higher invalid blocks count are arranged in the garbage collection queue to be garbage collected earlier than segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) of lower invalid blocks count.
6. The method of claim 5,
wherein the determining (135) of the garbage collection segments comprises:
- if the volume state parameter value is boiling, selecting all segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) as garbage collection segments,
- if the volume state parameter value is hot, selecting the garbage collection segments as all segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is pre-hot, selecting the garbage collection segments as at least some of segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 20-35% of block state parameter value valid and all segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is warm, selecting the garbage collection segments as at least some of segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) in range of 35-50% of block state parameter value valid,
- if the volume state parameter value is warming-up, selecting the garbage collection segments as at least some of segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) in range of 50-65% of block state parameter value valid,
- if the volume state parameter value is cooling-off, selecting no garbage collection segments,
- if the volume state parameter value is cold, selecting no garbage collection segments,
- if the volume state parameter value is icy, selecting no garbage collection segments, and
- if the volume state parameter value is clean, selecting no garbage collection segments.
7. The method of claim 6,
wherein in case of the volume state parameter value being cooling-off or cold or icy, a threading of the log of the log structured file system through free extents of the volume (10, 20, 21 , 40, 60, 90, 91 ) is performed.
8. The method of claim 6 or 7,
wherein the performing (136) of the garbage collection of the garbage collection segments comprises:
- reading live data out of the garbage collection segments,
- writing the live data back to a smaller number of segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) of segment state parameter value clean, and
- marking the garbage collection segments of segment state parameter value clean.
9. A computer program with a program code for performing the method according to claims 1 to 8 when the computer program runs on a computer.
10. A garbage collection device (30) for performing a garbage collection of a volume (10, 20, 21 , 40, 60, 90, 91 ) in log structured file systems,
wherein the volume (10, 20, 21 , 40, 60, 90, 91) comprises a plurality of segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615),
wherein the segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) each comprise a plurality of blocks,
wherein the device (30) comprises a state parameter determination unit (31) adapted to:
- determine a block state parameter of each of the blocks,
- determine a segment state parameter of each of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) based upon the block state parameters of the blocks of the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615), and
- determine a volume state parameter of the volume (10, 20, 21, 40, 60, 90, 91) based upon the segment state parameters and the block state parameters,
wherein the device (30) comprises a garbage collection determination unit (32) adapted to:
- determine a garbage collection queue based upon the segment state parameters, and
- determine garbage collection segments based upon the volume state and the garbage collection queue, and
wherein the device (30) comprises a garbage collection unit (33) adapted to perform garbage collection of the garbage collection segments.
11. The device of claim 10,
wherein the state parameter determination unit (31) is adapted to determine the block state parameters by allocating a block state parameter value to each of the blocks, and
wherein possible block state parameter values are:
- free, indicating the block having been prepared for a write operation but not yet allocated,
- invalid, indicating the block having been freed after file update, file truncation or deletion operation but it was not collected as garbage yet,
- pre-allocated, indicating the block having been pre-allocated for a file but no data has been written in the block yet, and
- valid, indicating the block having been allocated and data has been written to the block.
12. The device of claim 11 ,
wherein the state parameter determination unit (31 ) is adapted to determine the segment state parameters of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) by allocating a segment state parameter value to each of the segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615), and
wherein possible segment state parameter values are:
- clean, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value free,
- using, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having blocks with block state parameter values valid, invalid, pre-allocated and free,
- used, indicating the segment (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value valid,
- pre-dirty, indicating the segment (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter values valid and invalid, and
- dirty, indicating the segment (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) having only blocks with block state parameter value invalid.
13. The device of claim 12,
wherein the state parameter determination unit (31) is adapted to determine the volume state parameter of the volume (10, 20, 21, 40, 60, 90, 91) by allocating a volume state parameter value to the volume (10, 20, 21, 40, 60, 90, 91), and
wherein possible volume state parameter values are:
- clean, indicating the volume (10, 20, 21, 40, 60, 90, 91) containing only segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) of segment state parameter value clean,
- icy, indicating most of allocated segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 100% blocks of block state parameter value valid,
- cold, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 80-99% blocks of block state parameter value valid,
- cooling-off, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 65-80% blocks of block state parameter value valid,
- warming-up, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 50-65% blocks of block state parameter value valid,
- warm, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 35-50% blocks of block state parameter value valid,
- pre-hot, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 20-35% blocks of block state parameter value valid,
- hot, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 1-20% blocks of block state parameter value valid, and
- boiling, indicating most of allocated segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) on the volume (10, 20, 21, 40, 60, 90, 91) having 0% blocks of block state parameter value valid.
14. The device of claim 13,
wherein the garbage collection determination unit (32) is adapted to determine the garbage collection queue by:
- determining an invalid blocks count of each of the segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615), and
- ordering the segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) according their invalid blocks count, wherein segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) of higher invalid blocks count are arranged in the garbage collection queue to be garbage collected earlier than segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) of lower invalid blocks count.
15. The device of claim 14,
wherein the garbage collection determination unit (32) is adapted to determine the garbage collection segments by:
- if the volume state parameter value is boiling, selecting all segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) as garbage collection segments,
- if the volume state parameter value is hot, selecting the garbage collection segments as all segments (41 , 42, 43, 44, 45, 50, 51 , 52, 601 , 602, 603, 604, 605, 612, 613, 614, 615) in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is pre-hot, selecting the garbage collection segments as at least some of segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 20-35% of block state parameter value valid and all segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 1-20% of block state parameter value valid,
- if the volume state parameter value is warm, selecting the garbage collection segments as at least some of segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 35-50% of block state parameter value valid,
- if the volume state parameter value is warming-up, selecting the garbage collection segments as at least some of segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) in range of 50-65% of block state parameter value valid,
- if the volume state parameter value is cooling-off, selecting no garbage collection segments,
- if the volume state parameter value is cold, selecting no garbage collection segments,
- if the volume state parameter value is icy, selecting no garbage collection segments, and
- if the volume state parameter value is clean, selecting no garbage collection segments.
16. The device of claim 15,
wherein the garbage collection device (30) is adapted to perform, in case of the volume state parameter value being cooling-off or cold or icy, a threading of the log of the log structured file system through free extents of the volume (10, 20, 21, 40, 60, 90, 91).
17. The device of claim 15 or 16,
wherein the garbage collection unit (33) is adapted to perform the garbage collection of the garbage collection segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) by:
- reading live data out of the garbage collection segments,
- writing the live data back to a smaller number of segments (41, 42, 43, 44, 45, 50, 51, 52, 601, 602, 603, 604, 605, 612, 613, 614, 615) of segment state parameter value clean, and
- marking the garbage collection segments as having the segment state parameter value clean.
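The three steps of claim 17 (read live data, write it back compactly into clean segments, mark the collected segments clean) can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the claimed implementation: `LogSegment`, `SEGMENT_CAPACITY`, `collect` and the state name `"used"` are hypothetical, and a segment is modelled as a list of `(data, is_live)` pairs.

```python
from dataclasses import dataclass, field

SEGMENT_CAPACITY = 4  # blocks per segment; hypothetical value for illustration


@dataclass
class LogSegment:
    state: str = "clean"
    blocks: list = field(default_factory=list)  # list of (data, is_live) pairs


def collect(gc_segments, clean_segments):
    """Compact the live data of gc_segments into as few clean segments as possible."""
    # Step 1: read live data out of the garbage collection segments.
    live = [data for seg in gc_segments for data, is_live in seg.blocks if is_live]

    # Ceiling division: number of target segments needed to hold the live data.
    needed = -(-len(live) // SEGMENT_CAPACITY)
    if needed > len(clean_segments):
        raise RuntimeError("not enough clean segments to hold the live data")

    # Step 2: write the live data back, filling each target segment completely
    # before taking the next one, so fewer segments end up in use.
    targets = []
    for start in range(0, len(live), SEGMENT_CAPACITY):
        target = clean_segments.pop()
        target.blocks = [(d, True) for d in live[start:start + SEGMENT_CAPACITY]]
        target.state = "used"  # hypothetical name for a non-clean segment state
        targets.append(target)

    # Step 3: mark the garbage collection segments as clean.
    for seg in gc_segments:
        seg.blocks = []
        seg.state = "clean"
    return targets
```

For example, three collected segments holding four live blocks in total are compacted into a single target segment, which frees three segments at the cost of one.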
PCT/RU2015/000289 2015-05-06 2015-05-06 Method and device for garbage collection in log structured file systems WO2016178594A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/RU2015/000289 WO2016178594A1 (en) 2015-05-06 2015-05-06 Method and device for garbage collection in log structured file systems
CN201580079149.6A CN107533506B (en) 2015-05-06 2015-05-06 Garbage collection method and equipment in log structured file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/RU2015/000289 WO2016178594A1 (en) 2015-05-06 2015-05-06 Method and device for garbage collection in log structured file systems

Publications (2)

Publication Number Publication Date
WO2016178594A1 true WO2016178594A1 (en) 2016-11-10
WO2016178594A8 WO2016178594A8 (en) 2017-02-16

Family

ID=54771172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/RU2015/000289 WO2016178594A1 (en) 2015-05-06 2015-05-06 Method and device for garbage collection in log structured file systems

Country Status (2)

Country Link
CN (1) CN107533506B (en)
WO (1) WO2016178594A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021050110A1 (en) * 2019-09-12 2021-03-18 Western Digital Technologies, Inc. Storage system and method for validation of hints prior to garbage collection

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020000492A1 (en) * 2018-06-30 2020-01-02 华为技术有限公司 Storage fragment managing method and terminal
US10635599B2 (en) 2018-07-26 2020-04-28 Sandisk Technologies Llc Memory controller assisted address mapping

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110264843A1 (en) * 2010-04-22 2011-10-27 Seagate Technology Llc Data segregation in a storage device
US20110283049A1 (en) * 2010-05-12 2011-11-17 Western Digital Technologies, Inc. System and method for managing garbage collection in solid-state memory
US20120096217A1 (en) * 2010-10-15 2012-04-19 Kyquang Son File system-aware solid-state storage management system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101488153A (en) * 2009-02-12 2009-07-22 浙江大学 Method for implementing high-capacity flash memory file system in embedded type Linux

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHANGMAN LEE ET AL: "F2FS: A New File System for Flash Storage", Proceedings of the 13th USENIX Conference on File and Storage Technologies, 19 February 2015 (2015-02-19), XP055245340, Retrieved from the Internet <URL:https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf> [retrieved on 20160127] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021050110A1 (en) * 2019-09-12 2021-03-18 Western Digital Technologies, Inc. Storage system and method for validation of hints prior to garbage collection
US11573893B2 (en) 2019-09-12 2023-02-07 Western Digital Technologies, Inc. Storage system and method for validation of hints prior to garbage collection

Also Published As

Publication number Publication date
CN107533506B (en) 2021-03-23
CN107533506A (en) 2018-01-02
WO2016178594A8 (en) 2017-02-16

Similar Documents

Publication Publication Date Title
US10802718B2 (en) Method and device for determination of garbage collector thread number and activity management in log-structured file systems
US11010102B2 (en) Caching of metadata for deduplicated luns
US10042853B2 (en) Flash optimized, log-structured layer of a file system
US10185656B2 (en) Memory system and method for controlling nonvolatile memory
KR101717644B1 (en) Apparatus, system, and method for caching data on a solid-state storage device
US10503423B1 (en) System and method for cache replacement using access-ordering lookahead approach
US9529546B2 (en) Global in-line extent-based deduplication
US11307765B2 (en) System and methods for storage data deduplication
US10108547B2 (en) High performance and memory efficient metadata caching
US20170139825A1 (en) Method of improving garbage collection efficiency of flash-oriented file systems using a journaling approach
Pitchumani et al. SMRDB: Key-value data store for shingled magnetic recording disks
WO2012106362A2 (en) Apparatus, system, and method for managing eviction of data
US10296229B2 (en) Storage apparatus
US8862838B1 (en) Low overhead memory space management
KR101017067B1 (en) Locality-Aware Garbage Collection Technique for NAND Flash Memory-Based Storage Systems
US20170139616A1 (en) Method of decreasing write amplification factor and over-provisioning of nand flash by means of diff-on-write approach
WO2016178594A1 (en) Method and device for garbage collection in log structured file systems
KR101153688B1 (en) Nand flash memory system and method for providing invalidation chance to data pages
KR101663425B1 (en) Apparatus and method for memory storage to manage multiplexer open block for improving memory's performance and durability
Guo et al. Improving Write Performance and Extending Endurance of Object-Based NAND Flash Devices

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15804240; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15804240; Country of ref document: EP; Kind code of ref document: A1)