WO2020000492A1 - Storage fragment management method and terminal - Google Patents

Storage fragment management method and terminal

Info

Publication number
WO2020000492A1
Authority
WO
WIPO (PCT)
Prior art keywords
segment
terminal
candidate set
file system
aging degree
Prior art date
Application number
PCT/CN2018/093930
Other languages
English (en)
French (fr)
Inventor
俞超
陈浩
童碧峰
郑成亮
周喜渝
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to US17/257,015 (US11842046B2)
Priority to EP18924374.4A (EP3789883A4)
Priority to CN201880048691.9A (CN110945486B)
Priority to PCT/CN2018/093930 (WO2020000492A1)
Publication of WO2020000492A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0608 Saving storage space on storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0643 Management of files
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72 Details relating to flash memory management
    • G06F2212/7205 Cleaning, compaction, garbage collection, erase control
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the technical field of storage management, and in particular, to a storage fragment management method and terminal.
  • a log structured file system (LFS) treats the storage space of the entire storage device as a log.
  • LFS divides the log into multiple segments.
  • the so-called segments are fixed-size storage areas.
  • LFS uses segments as storage units, and segments are divided into blocks.
  • the storage fragment management method provided by the prior art is: identify the segments where storage fragments occur, copy the data of the valid blocks in these segments, and write the data of these valid blocks contiguously into free segments; after this operation is completed, the storage space occupied by the segments where the storage fragments occurred is released and the segments are re-marked as free segments.
  • the disadvantage of this is that if the data of the valid blocks in the segment where fragmentation occurs is hot data, that is, data with a high probability of being updated, then soon after the data of these valid blocks is written to the free segment it is likely to be updated or deleted again, causing that segment to become fragmented again. The data of the valid blocks in that segment then needs to be moved again, which generates additional power consumption.
  • the present application provides a storage fragment management method and terminal, which are used to solve the problem that an existing log structured file system consumes considerable power when moving data to clean up storage fragments.
  • an embodiment of the present application provides a storage fragment management method.
  • the method may be applied to a file system of a terminal.
  • the file system includes at least one segment.
  • the method includes: the terminal first determines a source segment from the file system according to the aging degree of the segments and the proportion of valid blocks in the segments; then the terminal determines, from the file system and according to the aging degree of the source segment, a target segment whose aging degree is consistent with that of the source segment; finally, the terminal moves the data of the valid blocks in the source segment into free blocks in the target segment.
  • because the aging degree of the target segment is consistent with that of the source segment, the data of the blocks in the target segment tends to be updated or deleted at roughly the same time, so the target segment is unlikely to become fragmented again. The number of moves is therefore reduced, and power consumption is reduced to a certain extent.
  • the source segment is the segment in the file system whose aging degree is greater than or equal to the first threshold and whose valid block ratio is the smallest.
  • the terminal may first traverse the segments in the file system to determine a first candidate set, where the aging degree of each segment in the first candidate set is greater than or equal to the first threshold; the terminal then determines the segment with the smallest valid block ratio in the first candidate set as the source segment. In this way, the source segment selected each time holds the least valid block data, so the amount of data written during the move is reduced, and power consumption is reduced to a certain extent.
  • when the number of segments in the first candidate set is less than or equal to a second threshold, the terminal determines the segment with the smallest valid block ratio in the first candidate set as the source segment. Alternatively, when the number of segments in the first candidate set is greater than the second threshold, the terminal removes the segments with the lowest aging degree from the first candidate set until the number of segments in the first candidate set is less than or equal to the second threshold, and then determines the segment with the smallest valid block ratio in the first candidate set as the source segment.
  • because the segments remaining in the first candidate set all have a relatively large aging degree, the aging degree of the selected source segment is sufficiently large. In addition, the selected source segment holds the least valid block data, so the amount of data written during the move is reduced, and power consumption is reduced to a certain extent.
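  • As a non-limiting illustration, the source segment selection described above can be sketched as follows. The `Segment` record, the function name `select_source`, and the representation of the aging degree and valid block ratio as plain floats are assumptions made for this sketch and are not part of the application.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    seg_id: int
    aging: float        # aging degree of the segment (hypothetical float scale)
    valid_ratio: float  # valid blocks / total blocks in the segment


def select_source(segments: List[Segment],
                  first_threshold: float,
                  second_threshold: int) -> Optional[Segment]:
    """Return the source segment, or None if no segment is aged enough."""
    # First candidate set: aging degree >= first threshold.
    candidates = [s for s in segments if s.aging >= first_threshold]
    # If the set is larger than the second threshold, drop the
    # least-aged segments until it fits.
    if len(candidates) > second_threshold:
        candidates.sort(key=lambda s: s.aging, reverse=True)
        candidates = candidates[:second_threshold]
    # Among the remaining candidates, take the smallest valid block ratio.
    return min(candidates, key=lambda s: s.valid_ratio, default=None)


segs = [Segment(0, 0.9, 0.5), Segment(1, 0.6, 0.1),
        Segment(2, 0.8, 0.3), Segment(3, 0.2, 0.0)]
chosen = select_source(segs, first_threshold=0.5, second_threshold=2)
```

  • In this sketch, trimming to the second threshold removes segment 1 even though it has the smallest valid block ratio, because it is the least aged candidate; the choice then falls on segment 2.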
  • that the aging degree of the target segment is consistent with that of the source segment can be understood as follows: the aging degree of the target segment is greater than or equal to the first threshold and lies within a set value interval. The set value interval is generated from the aging degree of the source segment; for example, the center value of the set value interval is the aging degree of the source segment. In this case, the aging degrees of the target segment and the source segment are the same or close, so after the move the data in the target segment is uniformly hot or uniformly cold.
  • it can also be understood as follows: the aging degree of the target segment is greater than or equal to the first threshold and lies within the set value interval, and, among the segments that satisfy both conditions, the target segment has the largest valid block ratio. In this case, not only do the target segment and the source segment have the same or similar aging degrees, but the target segment is also likely to be filled by the valid block data of the source segment. After the move, the free blocks in the target segment are therefore fully utilized, and the data in the target segment is uniformly hot or uniformly cold.
  • the target segment can be determined through the following steps: the terminal traverses the segments in the file system to determine a second candidate set, where the aging degree of each segment in the second candidate set is greater than or equal to the first threshold; the terminal then determines a third candidate set from the second candidate set, where the aging degree of each segment in the third candidate set is within the set value interval; finally, the terminal selects the segment with the largest valid block ratio from the third candidate set as the target segment. In this way, the determined target segment is consistent in aging degree with the source segment, and the free blocks of the target segment are likely to be filled and thus fully utilized.
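  • As a non-limiting illustration, the three-step target selection can be sketched as follows. The `Segment` record, the `half_width` parameter (half the length of the set value interval), and the exclusion of the source segment and of completely full segments are assumptions made for this sketch.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    seg_id: int
    aging: float        # aging degree of the segment (hypothetical float scale)
    valid_ratio: float  # valid blocks / total blocks in the segment


def select_target(segments: List[Segment], source: Segment,
                  first_threshold: float,
                  half_width: float) -> Optional[Segment]:
    # Second candidate set: aging degree >= first threshold.
    # Assumption: the source itself and full segments cannot be targets.
    second = [s for s in segments
              if s.seg_id != source.seg_id
              and s.aging >= first_threshold
              and s.valid_ratio < 1.0]
    # Third candidate set: aging degree inside the interval centred on
    # the aging degree of the source segment.
    lo, hi = source.aging - half_width, source.aging + half_width
    third = [s for s in second if lo <= s.aging <= hi]
    # Target: largest valid block ratio in the third candidate set.
    return max(third, key=lambda s: s.valid_ratio, default=None)


src = Segment(0, 0.8, 0.2)
segs = [src, Segment(1, 0.75, 0.5), Segment(2, 0.9, 0.7),
        Segment(3, 0.3, 0.9), Segment(4, 0.85, 1.0)]
target = select_target(segs, src, first_threshold=0.5, half_width=0.15)
```

  • Choosing the fullest eligible segment makes it more likely that the moved data fills the remaining free blocks of the target.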
  • the condition that triggers the terminal to determine the source segment from the file system may be: when the number of free segments in the file system is lower than a third threshold, the terminal determines the source segment from the file system according to the aging degree of the segments and the valid block ratio of the segments; alternatively, the terminal may determine the source segment from the file system periodically according to the aging degree of the segments and the valid block ratio of the segments.
  • the terminal may trigger storage fragment management because the LFS has too few free segments, or because the terminal has a cleaning thread that periodically determines the source segment and moves the data in the source segment. Either trigger condition helps the terminal reclaim its storage space in a timely manner.
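  • As a non-limiting illustration, the two trigger conditions can be combined in a single check. The function name `should_clean` and the time-based parameters are assumptions made for this sketch; the application does not specify how the period is measured.

```python
def should_clean(free_segment_count: int, third_threshold: int,
                 seconds_since_last_run: float,
                 period_seconds: float) -> bool:
    """Decide whether the cleaning thread should pick a source segment now."""
    # Trigger 1: the number of free segments is below the third threshold.
    low_space = free_segment_count < third_threshold
    # Trigger 2: the periodic cleaning interval has elapsed (assumed timer).
    periodic = seconds_since_last_run >= period_seconds
    return low_space or periodic
```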
  • the file system mentioned in the embodiment of the present application may be an LFS.
  • an embodiment of the present application provides a terminal, including a processor and a memory.
  • the memory is used to store one or more computer programs; when the one or more computer programs stored in the memory are executed by a processor, the terminal can implement any one of the possible design methods of the first aspect.
  • an embodiment of the present application further provides a terminal, where the terminal includes a module / unit that executes the first aspect or a method of any possible design of the first aspect.
  • modules / units can be implemented by hardware, and can also be implemented by hardware executing corresponding software.
  • an embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium includes a computer program, and when the computer program runs on a terminal, the terminal is caused to execute the method of the first aspect or of any one of the possible designs of the first aspect.
  • an embodiment of the present application further provides a computer program product that, when run on a terminal, causes the terminal to execute the method of the first aspect or of any one of the possible designs of the first aspect.
  • FIG. 1 is a schematic diagram of a storage device layout of a log structured file system according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of a log structure of a log structured file system according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a log structured file system according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a storage fragment management method provided in the prior art
  • FIG. 5 is a schematic flowchart of a storage fragment management method according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a source segment selection method according to an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a method for selecting a target segment according to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another method for selecting a target segment according to an embodiment of the present application
  • FIG. 9 is a schematic flowchart of another storage fragment management method according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of garbage collection of an LFS system according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a unit module of a terminal according to an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a mobile phone according to an embodiment of the present application.
  • the embodiments of the present application provide a storage fragment management method and a terminal, which are used to solve the problem that an existing log structured file system consumes considerable power when moving data to clean up storage fragments.
  • the method and the terminal described in this application are based on the same inventive concept. Since the method and the principle of the terminal to solve the problem are similar, the implementation of the device and the method can be referred to each other, and duplicated details are not described again.
  • a file system is a method for storing and organizing computer files and data.
  • file systems include object-based file systems and log-based file systems, such as LFS.
  • Garbage collection is the management of free space in the storage device, and its goal is to provide large free blocks required for new data writing.
  • the storage device may be a non-volatile memory, a dynamic random access memory, a static random access memory, a flash memory, an embedded multimedia memory card (EMMC), and the like.
  • Multiple means two or more.
  • the checkpoint area is a fixed location on the storage device that is used to locate the disk block or flash block where the index node map is located, and determine the last checkpoint in the log.
  • the index node map is used to maintain the current position of each index node, and its active part is cached in memory, so it is almost unnecessary to access the storage device when searching.
  • in LFS, the log is a disk structure.
  • LFS divides the log into segments.
  • LFS metadata is mainly distributed in checkpoints and segments, and its disk layout is shown in FIG. 1. The index node pointers and the current positions of the index nodes given by the index node map change often. The timestamp in a checkpoint can be used to determine the last successful checkpoint.
  • LFS logs use a sequential, append-only data structure. LFS still uses a traditional index organization: it stores index nodes in the log, and the index nodes enable LFS to retrieve information about files from the log in a random access manner. The steps to find an index node in LFS are: find the most recent index node map from the checkpoint located in a fixed area on the disk; find the latest version of the index node from the index node map; and find the corresponding data blocks from the index node. As shown in FIG. 2, an index node map is found in the checkpoint area, then three index nodes are found from the index node map, and the corresponding data blocks are found from each index node.
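  • As a non-limiting illustration, the lookup chain from checkpoint to index node map to index node to data blocks can be sketched as follows. Modelling the log as a dictionary keyed by address, and the field names `inode_map_addr` and `block_addrs`, are assumptions made for this sketch and do not reflect an actual on-disk layout.

```python
from typing import Any, Dict, List


def read_file_blocks(log: Dict[int, Any], checkpoint_addr: int,
                     inode_number: int) -> List[bytes]:
    """Follow checkpoint -> index node map -> index node -> data blocks."""
    # 1. The checkpoint at a fixed address locates the latest index node map.
    checkpoint = log[checkpoint_addr]
    inode_map = log[checkpoint["inode_map_addr"]]
    # 2. The index node map gives the current position of the index node.
    inode = log[inode_map[inode_number]]
    # 3. The index node lists the addresses of the file's data blocks.
    return [log[addr] for addr in inode["block_addrs"]]


log = {
    0: {"inode_map_addr": 7},   # checkpoint area (fixed location)
    2: {"block_addrs": [3]},    # stale version of index node 1
    3: b"he",
    4: b"llo",
    5: {"block_addrs": [3, 4]}, # latest version of index node 1
    7: {1: 5},                  # index node map: inode number -> address
}
blocks = read_file_blocks(log, checkpoint_addr=0, inode_number=1)
```

  • Because the index node map points at address 5, the stale index node at address 2 is never consulted.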
  • each segment is a sequence of multiple blocks.
  • the state of a block can be: 1) free; 2) valid.
  • the definitions of the states of these blocks are shown in Table 2.
  • a valid block state means that there is valid data in the block.
  • the state of a block can be determined based on the segment summary or the segment usage table.
  • the embodiment of the present application describes the following two methods.
  • Method 1: in the LFS, summary information is recorded for each block.
  • the summary information includes an inode number (the index node number, which indicates which file the disk block belongs to) and an offset (which indicates the position of the disk block within the file).
  • this information is stored in the segment summary block at the head of the segment. Based on the segment summary block, it can be directly determined whether a block holds valid data: if so, it is a valid block; otherwise it is a free block.
  • Method 2: the validity of blocks can be determined by checking whether the block pointer of the file's index node (inode) or indirect block still points to these blocks. If the pointer still points to them, the blocks are valid blocks; otherwise they are free blocks.
  • because segments are composed of blocks, the combination of block states in a segment determines the state of the segment, which can be: 1) free; 2) dirty; 3) valid.
  • the status of these segments is defined in Table 3.
  • Free: all blocks in the segment are free blocks.
  • Dirty: the segment contains both valid blocks and free blocks.
  • Valid: all blocks in the segment are valid blocks.
  • the log structured file system 400 includes segments 41, 42, and 43.
  • each segment is a collection of physical disk blocks or flash blocks.
  • the capacity of a segment is 8 MB. All blocks in segment 41 are free blocks, so the state of segment 41 is free; segment 42 contains two valid blocks and four free blocks, so the state of segment 42 is dirty; all blocks in segment 43 are valid blocks, so the state of segment 43 is valid.
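  • As a non-limiting illustration, the mapping from block states to segment state can be sketched as follows. Representing a segment as a list of booleans (True for a valid block, False for a free block) and the names `SegmentState` and `segment_state` are assumptions made for this sketch.

```python
from enum import Enum
from typing import List


class SegmentState(Enum):
    FREE = "free"
    DIRTY = "dirty"
    VALID = "valid"


def segment_state(block_is_valid: List[bool]) -> SegmentState:
    """Derive the segment state from the states of its blocks."""
    n_valid = sum(block_is_valid)
    if n_valid == 0:
        return SegmentState.FREE    # every block is free
    if n_valid == len(block_is_valid):
        return SegmentState.VALID   # every block is valid
    return SegmentState.DIRTY       # mix of valid and free blocks


seg_41 = [False] * 6
seg_42 = [True, False, True, False, False, False]
seg_43 = [True] * 6
states = [segment_state(s) for s in (seg_41, seg_42, seg_43)]
```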
  • hot data means that the data in the valid block is likely to be updated or deleted soon.
  • cold data means that the data in the valid block is likely to be updated or deleted only after a long time.
  • because a segment is composed of blocks, if the data in the valid blocks of the segment is mostly cold data, the segment is a cold segment; if the data in the valid blocks of the segment is mostly hot data, the segment is a hot segment. In other words, if the data in a valid block is cold data, the last update time recorded in the segment usage table is generally far from the current time, that is, the valid block is older.
  • the degree of coldness and heat of each segment can also be measured by the degree of aging. One way to define the degree of aging is as follows:
  • the last update time of the system refers to the last update time of the log structured file system.
  • the earliest update time of the system refers to the time when the log structured file system was first updated.
  • the segment update time is the average update time of all valid blocks in the segment.
  • n is the number of valid blocks in the segment, T1 is the update time of the first valid block in the segment, T2 is the update time of the second valid block in the segment, and Tn is the update time of the nth valid block in the segment.
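  • As a non-limiting illustration, the quantities defined above can be written in formula form. The first expression follows directly from the definition of the segment update time as the average update time of the n valid blocks. The second expression is an assumption: it is one normalized form that uses the last update time of the system, the earliest update time of the system, and the segment update time, and that yields values between 0 and 1 with older (colder) segments scoring higher. The symbols T_seg, T_last, and T_first are introduced here for illustration only.

```latex
% Segment update time: mean update time of the n valid blocks in the segment.
\[
  T_{\mathrm{seg}} = \frac{T_1 + T_2 + \cdots + T_n}{n}
\]

% Assumed normalized aging degree. T_last is the last update time of the
% system and T_first is the earliest update time of the system.
\[
  \text{aging degree}
    = \frac{T_{\mathrm{last}} - T_{\mathrm{seg}}}
           {T_{\mathrm{last}} - T_{\mathrm{first}}},
  \qquad 0 \le \text{aging degree} \le 1
\]
```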
  • An embodiment of the present application provides a storage fragment management method, which can be used to perform garbage collection on storage fragments in a storage device.
  • the reason for garbage collection can be understood through the design principle of LFS.
  • in LFS, as applications continue to create, modify, and delete files, the free space of the LFS becomes fragmented, so that large contiguous write operations can no longer be performed; the available space on the storage device therefore needs to be reorganized.
  • the log structured file system usually adopts a garbage collection method. During garbage collection, the segment with the fewest valid blocks among the dirty segments is selected each time as the source segment, and the data of all valid blocks in the source segment is moved to a target segment.
  • before the move, all blocks in the target segment are free blocks.
  • after the data of the valid blocks in the source segment has been moved, all blocks in the source segment are free blocks, while the first three blocks in the target segment are valid blocks and the last three are free blocks.
  • the disadvantage of this is that if the data stored in the source segment is hot data, then the data in the valid block in the source segment may be updated or deleted soon, resulting in repeated movements and additional power consumption.
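  • As a non-limiting illustration, the prior-art cleaning step described above can be sketched as follows. Modelling each segment as a list in which None marks a free block, and the function name `greedy_clean`, are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

Block = Optional[str]  # None marks a free block; a string is valid data


def greedy_clean(segments: List[List[Block]]) -> Optional[Tuple[int, int]]:
    """Move the dirty segment with the fewest valid blocks into a free segment."""
    dirty = [i for i, seg in enumerate(segments)
             if any(b is not None for b in seg) and any(b is None for b in seg)]
    free = [i for i, seg in enumerate(segments)
            if all(b is None for b in seg)]
    if not dirty or not free:
        return None
    # Source: dirty segment with the fewest valid blocks (aging is ignored).
    src = min(dirty, key=lambda i: sum(b is not None for b in segments[i]))
    dst = free[0]
    live = [b for b in segments[src] if b is not None]
    # Write the valid data contiguously at the head of the free segment.
    segments[dst][:len(live)] = live
    # The source segment becomes a free segment.
    segments[src] = [None] * len(segments[src])
    return src, dst


segs = [["a", None, "b", None, "c", None],
        [None] * 6,
        ["x", "y", "z", "w", None, "v"]]
result = greedy_clean(segs)
```

  • The sketch picks the source purely by valid block count, which is why hot data can be moved repeatedly under this scheme.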
  • the storage fragment management method provided in the embodiment of the present application considers the aging degree of a segment in the process of selecting a target segment and a source segment.
  • the method of the embodiment of the present application selects, as far as possible, a source segment with a large aging degree, and moves the data in the valid blocks of the source segment to free blocks in a target segment whose aging degree is consistent with that of the source segment. In this way, the data in the target segment after the move is uniformly hot or uniformly cold.
  • the data of the blocks in the target segment therefore tends to be updated or deleted at roughly the same time, which makes the target segment unlikely to become fragmented again, so the number of moves is reduced, and power consumption is reduced to a certain extent.
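  • As a non-limiting illustration, moving valid block data from a source segment into the free blocks of an already partly filled target segment can be sketched as follows. The list-based segment model (None marks a free block) and the behaviour when the target runs out of free blocks (the remaining data stays in the source) are assumptions made for this sketch.

```python
from typing import List, Optional

Block = Optional[str]  # None marks a free block; a string is valid data


def move_valid_blocks(source: List[Block], target: List[Block]) -> int:
    """Move valid data from source into free blocks of target; return count moved."""
    free_slots = [i for i, b in enumerate(target) if b is None]
    moved = 0
    for i, block in enumerate(source):
        if block is None:
            continue                      # nothing to move from a free block
        if moved == len(free_slots):
            break                         # assumption: stop when target is full
        target[free_slots[moved]] = block
        source[i] = None                  # the source block becomes free
        moved += 1
    return moved


source = ["a", None, "b", None]
target = ["x", None, "y", None, None, "z"]
count = move_valid_blocks(source, target)
```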
  • an embodiment of the present application provides a storage fragment management method, which can be executed by a terminal.
  • the specific process includes:
  • Step 501: the terminal determines the source segment from the file system according to the aging degree of the segments and the valid block ratio of the segments.
  • the processor of the terminal initiates a cleaning thread.
  • the cleaning thread can first traverse all segments in the log structured file system, determine the source segment where the aging degree and the proportion of valid blocks meet the set conditions, and then write the data in the valid blocks in the source segment to the cache.
  • the set condition may be: the source segment is the segment in the LFS whose aging degree is greater than the first threshold and whose valid block ratio is the smallest.
  • the set condition may also be: the source segment is a segment in the LFS whose aging degree is greater than the first threshold and whose valid block ratio is the second smallest.
  • the set condition may also be: the source segment is a segment in the LFS whose aging degree is greater than the first threshold and whose valid block ratio is smaller than a certain threshold. That is, the source segment is a segment with a relatively large aging degree and a relatively small valid block ratio.
  • the cleaning thread traverses the LFS in multiple loops, and in each traversal selects the segment in the LFS whose aging degree is greater than the first threshold and whose valid block ratio is the smallest as the source segment.
  • because the source segment determined by the cleaning thread each time has the smallest valid block ratio, the amount of data in its valid blocks is also the smallest, so the amount of data to be moved is the smallest. This condition therefore reduces the amount of data written during the move and reduces power consumption to a certain extent. Under the other set conditions, the amount of data to be moved is also small, which likewise reduces the amount of data written during the move and reduces power consumption.
  • the terminal may first traverse the segments of the log structured file system and add the segments that are in the dirty state and whose aging degree is greater than the first threshold to the first candidate set; it then traverses the first candidate set and determines the segment with the smallest valid block ratio as the source segment. The terminal then loads the data of the valid blocks in the source segment into the cache and adds an identifier to the segment.
  • when the number of segments in the first candidate set is less than or equal to the second threshold, the terminal determines the segment with the smallest valid block ratio in the first candidate set as the source segment.
  • when the number of segments in the first candidate set is greater than the second threshold, the terminal may remove the segments with a lower aging degree from the first candidate set until the number of segments in the first candidate set is less than or equal to the second threshold, and then determine the segment with the smallest valid block ratio in the first candidate set as the source segment.
  • because the segments remaining in the first candidate set all have a relatively high aging degree, the aging degree of the selected source segment is sufficiently large.
  • in addition, the source segment selected by the terminal holds the least valid block data, so the amount of data written during the move is reduced, and power consumption is reduced to a certain extent.
  • Step 502 The terminal determines a target segment that is consistent with the aging degree of the source segment from the file system according to the aging degree of the source segment.
  • the aging degree of the target segment in the foregoing storage fragment management method is consistent with the aging degree of the source segment. It can be understood that the aging degree of the target segment is the same as or similar to that of the source segment. Specifically, when selecting a target segment, a segment with an aging degree greater than or equal to a first threshold and an aging degree within a set value interval may be selected from the file system as the target segment. The first threshold may be the same as the first threshold used when determining the source segment.
  • The set value interval is generated according to the aging degree of the source segment; for example, the center value of the interval is the aging degree of the source segment. The aging degrees of the target segment and the source segment are then the same or close to each other, so after the move the data in the target segment is of comparable hotness.
  • The cleaning thread may first traverse the segments in the log-structured file system, select from the dirty segments those with an aging degree greater than the first threshold, and add all selected segments to the second candidate set. It then traverses the second candidate set, determines the segments whose aging degree is within the set value interval, and adds these segments to the third candidate set. From the third candidate set, the terminal selects as the target segment the segment whose aging degree is closest to that of the source segment, or an arbitrary segment, or the segment with the greatest aging degree.
  • For example, assuming the aging degree of the source segment is a (for example, 0.8), the preset value interval may be [a-0.3, a+0.3], so the aging degrees of the segments in the third candidate set are all within [a-0.3, a+0.3]; the terminal then selects a segment from the third candidate set as the target segment.
  • Alternatively, the terminal may sort the segments in the second candidate set by aging degree from large to small. Then, taking the aging degree of the source segment as the center and k as the radius, it selects the k segments before and the k segments after the aging degree of the source segment and adds the selected segments to the third candidate set.
  • For example, assuming the aging degree of the source segment is a (for example, 0.8), k segments (k being, for example, 3) with an aging degree smaller than a and k segments with an aging degree greater than a can be selected from the sorted second candidate set. The selected 2k or 2k+1 segments form the third candidate set, and a segment is then selected from the third candidate set as the target segment.
  • If the second candidate set includes a segment with an aging degree of a, 2k+1 segments are selected as the third candidate set; if it does not, 2k segments are selected.
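The radius-based construction of the third candidate set can be illustrated with a short sketch. The function name and the representation of a segment as a `(seg_id, aging)` pair are assumptions made only for this example.

```python
def build_third_candidate_set(second_set, source_aging, k):
    """Pick the k nearest segments on each side of the source aging degree.

    second_set is a list of hypothetical (seg_id, aging) pairs.
    """
    # Sort by aging degree from large to small, as in the description.
    ordered = sorted(second_set, key=lambda item: item[1], reverse=True)
    larger = [item for item in ordered if item[1] > source_aging]
    equal = [item for item in ordered if item[1] == source_aging]
    smaller = [item for item in ordered if item[1] < source_aging]
    # The k larger values closest to the source sit at the end of `larger`;
    # the k smaller values closest to the source sit at the start of `smaller`.
    nearest_larger = larger[max(len(larger) - k, 0):]
    nearest_smaller = smaller[:k]
    return nearest_larger + equal + nearest_smaller
```

With k = 3 the result holds 2k + 1 = 7 entries when exactly one segment has the aging degree a, and 2k = 6 entries when none does, provided enough segments exist on both sides.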
  • That the aging degree of the target segment in the foregoing storage fragment management method is consistent with the aging degree of the source segment can also be understood as follows: among the segments whose aging degree is greater than or equal to the first threshold and within the set value interval, the target segment is the one with the largest proportion of valid blocks.
  • In this way, not only do the target segment and the source segment have the same or a close aging degree, but the probability that the target segment is filled up by the valid-block data of the source segment is also relatively high. After the move, the free blocks in the target segment are fully used, and the data in the target segment is of comparable hotness.
  • The cleaning thread may first traverse the segments in the log-structured file system, select from the dirty segments those with an aging degree greater than the first threshold, and add all selected segments to the second candidate set. It then traverses the second candidate set, determines the segments whose aging degree is within the set value interval, and adds these segments to the third candidate set.
  • Finally, the segment with the largest proportion of valid blocks is selected from the third candidate set as the target segment.
  • For example, assuming the aging degree of the source segment is a (for example, 0.8), the preset value interval may be [a-0.3, a+0.3], so the aging degrees of the segments in the third candidate set are all within [a-0.3, a+0.3]; the terminal then selects the segment with the largest proportion of valid blocks from the third candidate set as the target segment.
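A possible sketch of this interval-based target selection follows. The `Segment` tuple, the helper names, and the decision to exclude the source segment itself from the candidates are assumptions for illustration; the text does not spell out the last point.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    # Hypothetical segment record used only for this sketch.
    seg_id: int
    aging: float
    valid_blocks: int
    total_blocks: int


def valid_ratio(seg):
    return seg.valid_blocks / seg.total_blocks


def is_dirty(seg):
    return 0 < seg.valid_blocks < seg.total_blocks


def select_target_segment(segments, source, first_threshold, half_width=0.3):
    """Return the target segment for `source`, or None if none qualifies."""
    # Second candidate set: dirty segments above the first threshold
    # (assumption: the source segment itself is not a valid target).
    second = [s for s in segments
              if is_dirty(s) and s.aging > first_threshold
              and s.seg_id != source.seg_id]
    # Third candidate set: aging degree within [a - half_width, a + half_width].
    low, high = source.aging - half_width, source.aging + half_width
    third = [s for s in second if low <= s.aging <= high]
    # Target segment: largest proportion of valid blocks.
    return max(third, key=valid_ratio, default=None)
```

Choosing the fullest dirty segment makes it more likely that the moved data fills the remaining free blocks of the target.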
  • Step 503 The terminal moves data of a valid block in the source segment to a free block in the target segment.
  • the valid block data in the source segment may be loaded into the cache first. Then, for each valid block in the cache, the terminal finds the source segment identifier according to the data index of the valid block in the cache, and determines the target segment according to the aging degree of the source segment. The data of the valid block is then written into a free block of the target segment. At the same time, the terminal releases the storage space occupied by the source segment identifier corresponding to the source segment.
  • A trigger condition may be that, when the number of free segments in the file system is lower than a third threshold (for example, 20), the processor generates a cleaning thread in the kernel for garbage collection.
  • The cleaning thread executes steps 501 to 503 in a loop and stops only when the number of free segments in the file system has risen to a certain threshold (for example, 100).
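The trigger and stop thresholds above form a simple low/high watermark loop, sketched below. The function name, the `reclaim_one` callback, and the assumption that each pass of steps 501 to 503 frees exactly one segment are illustrative only.

```python
def foreground_gc(free_count, reclaim_one, low_water=20, high_water=100):
    """Run cleaning passes while free segments are scarce.

    reclaim_one is a hypothetical callback that performs one pass of
    steps 501 to 503 and returns True if a source segment was freed.
    Returns the number of free segments after cleaning.
    """
    if free_count >= low_water:
        return free_count              # trigger condition not met
    while free_count < high_water:
        if not reclaim_one():
            break                      # nothing left to reclaim
        free_count += 1                # the emptied source segment becomes free
    return free_count
```

Using two different thresholds keeps the cleaning thread from starting and stopping repeatedly around a single value.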
  • This storage fragment management method can also be called foreground garbage collection.
  • a trigger condition may be that the processor configures a cleanup thread in the kernel for garbage collection.
  • the cleaning thread executes step 501 in real time or periodically.
  • the source segment is also marked as a segment to be garbage collected.
  • steps 502 to 503 are triggered.
  • the loading time is recorded and marked as dirty.
  • the processor of the terminal will generate a cleaning thread for garbage collection in the kernel.
  • the cleaning thread is used to perform the following three stages of processing.
  • The three stages are: stage one, selecting the source segment in real time or periodically; stage two, selecting the target segment; stage three, garbage collection.
  • In stage one, the source segment is selected in real time or periodically.
  • A detailed description is given below with reference to FIG. 6.
  • Step 601 The cleaning thread scans all segments in the LFS to obtain the aging degree of the segments.
  • Step 602 The cleaning thread determines whether there is a segment with a degree of aging greater than or equal to the first threshold in the scanned segment. If it exists, it proceeds to step 603; if it does not exist, it proceeds to step 601.
  • Step 603 The cleaning thread adds a segment with an aging degree greater than a first threshold to the first candidate set.
  • Step 604 The cleaning thread determines whether the number of segments in the first candidate set does not exceed the second threshold, and if yes, skips to step 606; if not, skips to step 605.
  • Step 605 The cleaning thread removes the least aged segment from the first candidate set, and then executes step 604.
  • Step 606 The cleaning thread selects the segment with the smallest proportion of valid blocks as the source segment from the current first candidate set.
  • the cleaning thread loads the data in a valid block in each selected source segment into the cache, and then adds an identifier to be garbage collected for the source segment.
  • When the valid-block data in the cache occupies a certain proportion of the cache, for example 80%, the cleaning thread is triggered to select the target segment; alternatively, the cleaning thread is triggered to select the target segment when the valid-block data in the cache has been dirty for longer than a set period.
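The two trigger conditions for target selection can be expressed as a single predicate. This is a sketch only: the function and parameter names are hypothetical, and the 300-second dirty limit is an assumed value, since the text only speaks of "a set period".

```python
def should_select_target(cached_bytes, cache_capacity, oldest_load_time, now,
                         ratio_limit=0.8, max_dirty_seconds=300.0):
    """Decide whether the cleaning thread should move on to target selection."""
    # Condition 1: valid-block data fills at least `ratio_limit` of the cache.
    cache_full_enough = cached_bytes / cache_capacity >= ratio_limit
    # Condition 2: the oldest cached valid-block data has been dirty too long.
    dirty_too_long = (now - oldest_load_time) > max_dirty_seconds
    return cache_full_enough or dirty_too_long
```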
  • the target segment is selected from the segments in the LFS whose status is dirty.
  • the selection strategy of the target segment can be based on the factor of aging degree, or two factors of aging degree and the proportion of effective blocks.
  • The aging degree of the target segment is consistent with that of the source segment. Specifically, for each valid block in the cache, the cleaning thread uses the index node corresponding to the data of that valid block to look up the identifier of the source segment in which the valid block is located, and then determines the target segment according to the aging degree of the source segment corresponding to that identifier. A detailed description is given below with reference to FIG. 7.
  • Step 701 The cleaning thread scans all segments in the LFS to obtain the aging degree of the segments.
  • Step 702 The cleaning thread determines whether there is a segment with a degree of aging exceeding a first threshold in the scanned segment. If it exists, it proceeds to step 703, and if it does not exist, it proceeds to step 701.
  • Step 703 The cleaning thread adds a segment with an aging degree greater than the first threshold to the second candidate set.
  • the selection of the target segment may be triggered when the proportion of valid block data in the cache reaches a certain percentage.
  • Because the selection of the target segment occurs after the source segment has been selected, the state of the segments that the cleaning thread scans in the LFS in step 701 is likely to differ from the state of the segments it scanned in step 601, so the second candidate set obtained may also differ from the first candidate set.
  • Step 704 The cleaning thread traverses the second candidate set according to the aging degree of the source segment and determines whether there is a segment whose aging degree is not within the set value interval. If yes, go to step 705; otherwise, go to step 706.
  • For example, the aging degree of the source segment is a (for example, 0.8) and the set value interval may be [a-0.3, a+0.3]; the cleaning thread then determines whether the second candidate set contains a segment whose aging degree is not within [a-0.3, a+0.3].
  • Step 705 The cleaning thread removes the segments whose aging degree is not within the set value interval from the second candidate set, and then executes step 704 again.
  • Step 706 The cleaning thread selects a segment with the largest proportion of valid blocks from the current third candidate set as the target segment.
  • Alternatively, step 706 may be: the cleaning thread selects, from the current third candidate set, the segment whose aging degree is closest to that of the source segment as the target segment, or randomly selects a segment as the target segment, or selects the segment with the greatest aging degree as the target segment.
  • The target segment may also be determined in another manner, which is described below with reference to FIG. 8.
  • Steps 801 to 803 are the same as those described in steps 701 to 703, and are not repeated here.
  • Step 804 The cleaning thread sorts the segments in the second candidate set according to the aging degree, where the segments can be sorted from large to small, or can be sorted from small to large.
  • Step 805 The cleaning thread traverses the second candidate set according to the aging degree of the source segment and determines whether there are segments that fall outside the count radius. If yes, the process proceeds to step 806; otherwise, the process proceeds to step 807.
  • For example, the aging degree of the source segment is a (for example, 0.8). From the sorted second candidate set, the cleaning thread determines whether there are segments other than the three consecutive segments immediately below a and the three consecutive segments immediately above a.
  • Step 806 The cleaning thread removes the segments that fall outside the count radius from the second candidate set, and then executes step 805 again.
  • Step 807 The cleaning thread selects a segment with the largest effective block ratio from the third candidate set as the target segment.
  • The cleaning thread may also finally determine more than one target segment.
  • For example, the segment with the largest proportion of valid blocks and the segment with the next-largest proportion are both selected from the third candidate set as target segments. This avoids the problem that a single target segment has too few free blocks to hold all the valid data of the source segment.
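Selecting more than one target segment can be sketched as a greedy pass over the candidates in descending order of valid-block proportion. The function name and the `(seg_id, valid_blocks, total_blocks)` tuple layout are assumptions for this example; the text only gives the case of two target segments.

```python
def pick_targets(candidates, blocks_needed):
    """Choose target segments until their free blocks cover `blocks_needed`.

    candidates is a list of hypothetical (seg_id, valid_blocks, total_blocks)
    tuples taken from the third candidate set.
    """
    chosen = []
    remaining = blocks_needed
    # Visit the fullest segments first, as the description suggests.
    ordered = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for seg_id, valid, total in ordered:
        if remaining <= 0:
            break
        free = total - valid
        if free == 0:
            continue                   # a full segment cannot accept data
        chosen.append(seg_id)
        remaining -= free
    return chosen
```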
  • For each valid block in the cache, the cleaning thread writes the data of the valid block to the corresponding target segment. After all valid-block data in the cache has been written, the storage space occupied by the source segment that carries the to-be-garbage-collected identifier is released.
  • The above-mentioned stages two and three may be executed in a loop, and execution stops once the proportion of the cache occupied by valid-block data falls below a certain ratio (for example, 20%).
  • the above-mentioned phases 2 and 3 may also be executed periodically.
  • For example, the cleaning thread determines a source segment every five minutes and loads the data of the valid blocks of that source segment into the cache. After loading the data, and after an interval of 5 seconds, the cleaning thread indexes each valid block in the cache to its corresponding source segment, determines the target segment according to the aging degree of the source segment, and then writes the data of the valid block into a free block of the target segment.
  • The cleaning thread may write the data of the valid blocks in the source segment directly into free blocks of the target segment, or it may first load the data of the valid blocks in the source segment into the cache and then write the cached data into free blocks of the target segment; this is not specifically limited in this application.
  • The processor generates a cleaning thread, which is used to execute the following two phases of processing: phase one, selecting the source segment and the target segment; and phase two, garbage collection. A detailed description is given below with reference to FIG. 9.
  • Step 901 The cleaning thread scans all segments in the LFS to obtain the aging degree of the segments.
  • Step 902 The cleaning thread determines whether there are any segments in the scanned segment whose aging degree exceeds the first threshold. If so, the process proceeds to step 903; if not, the process proceeds to step 901.
  • Step 903 The cleaning thread adds a segment with an aging degree greater than the first threshold to the first candidate set.
  • Step 904 The cleaning thread determines whether the number of segments in the first candidate set does not exceed the second threshold. If yes, the process proceeds to step 906; if not, the process proceeds to step 905.
  • Step 905 The cleaning thread removes the least aged segment from the first candidate set, and then jumps to step 904.
  • Step 906 The cleaning thread selects the segment with the smallest proportion of valid blocks as the source segment from the current first candidate set.
  • Step 907 The cleaning thread traverses the first candidate set according to the aging degree of the source segment, and determines whether there is a segment whose aging value is not within a set value interval. If yes, go to step 908; otherwise, go to step 909.
  • the aging degree of the source segment is a (for example, 0.8), and the set value range can be [a-0.3, a + 0.3].
  • The cleaning thread determines whether the first candidate set contains a segment whose aging degree is not within [a-0.3, a+0.3].
  • Step 908 The cleaning thread removes the segments whose aging degree is not within the set value interval from the first candidate set, and then jumps to step 907, until the aging degrees of all segments in the first candidate set are within the set value interval.
  • Step 909 The cleaning thread selects a segment with the largest proportion of valid blocks from the current first candidate set as the target segment.
  • Alternatively, step 909 may be: the cleaning thread selects, from the current first candidate set, the segment whose aging degree is closest to that of the source segment as the target segment, or randomly selects a segment as the target segment, or selects the segment with the greatest aging degree as the target segment.
  • The cleaning thread writes the data in the valid blocks of the source segment to the target segment, and then repeats steps 901 to 909 until the number of free segments in the LFS rises to a set threshold (for example, 80).
  • The cleaning thread may write the data of the valid blocks in the source segment directly into free blocks of the target segment, or it may first load the data of the valid blocks in the source segment into the cache and then write the cached data into free blocks of the target segment; this is not specifically limited in this application.
  • The cleaning thread may also finally determine more than one target segment. For example, the segment with the largest proportion of valid blocks and the segment with the next-largest proportion are both selected from the first candidate set as target segments. This avoids the problem that a single target segment has too few free blocks to hold all the valid data of the source segment.
  • The manner of selecting the target segment in FIG. 9 may also be the manner shown in FIG. 8, and details are not repeated here.
  • the source segment and the target segment shown in FIG. 9 may both be determined from the first candidate set, or may be determined from different candidate sets. For example, the source segment is selected from the LFS segment corresponding to the first moment, and the target segment is selected from the LFS segment corresponding to the second moment; where the second moment is after the first moment, the two The segments of the LFS corresponding to the moments may be different. Therefore, the first candidate sets corresponding to the two moments may also be different.
  • In the garbage collection phase, for example as shown in FIG. 10, before garbage collection there are three valid blocks and three free blocks in the source segment, and three free blocks and three valid blocks in the target segment.
  • After foreground or background garbage collection, all six blocks in the source segment are free blocks, and the target segment is filled with valid blocks.
  • The storage space corresponding to the source segment is then reclaimed, the source segment is set as a free segment again, and new data can be written to it.
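The move illustrated by FIG. 10 can be sketched with two Python lists standing in for the source and target segments. Representing a free block as `None` and the function name `move_valid_blocks` are assumptions made for this illustration.

```python
def move_valid_blocks(source, target):
    """Move every valid block of `source` into free slots of `target`, in place.

    Each segment is modelled as a list; None marks a free block.
    """
    free_slots = [i for i, block in enumerate(target) if block is None]
    valid = [(i, block) for i, block in enumerate(source) if block is not None]
    if len(valid) > len(free_slots):
        raise ValueError("target segment has too few free blocks")
    for slot, (index, data) in zip(free_slots, valid):
        target[slot] = data        # write into a free block of the target
        source[index] = None       # the source block becomes free
```

With three valid and three free blocks on each side, the call leaves the source segment entirely free and the target segment entirely valid, which matches the before-and-after picture described for FIG. 10.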
  • In a possible design, when garbage collection is performed, if the data in the valid blocks of the source segment belongs to multiple files in the same directory, the files in the same directory may preferentially be moved into the same target segment. In another possible design, when garbage collection is performed, the valid blocks of the source segment are grouped according to their last modification time, so that valid blocks with the same or similar times are placed in the same group, and the data of the valid blocks in each group is then moved to the same target segment.
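Grouping valid blocks by last modification time could look like the sketch below. Treating "similar time" as "falls into the same fixed-width time window" is an assumption of this example, as are the function name and the `(block_id, mtime)` pair layout.

```python
from collections import defaultdict


def group_by_mtime(blocks, window):
    """Group hypothetical (block_id, mtime) pairs into time windows.

    Blocks whose modification times fall into the same window of width
    `window` end up in the same group; groups are returned oldest first.
    """
    groups = defaultdict(list)
    for block_id, mtime in blocks:
        groups[mtime // window].append(block_id)
    return [groups[key] for key in sorted(groups)]
```

Each returned group would then be written to one target segment, so that data with similar update behavior stays together.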
  • An embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium includes a computer program, and when the computer program runs on a terminal, the terminal is caused to execute any one of the possible implementations of the foregoing storage fragment management method.
  • An embodiment of the present application further provides a computer program product that, when the computer program product runs on a terminal, causes the terminal to execute any one of the possible implementations of the foregoing storage fragment management method.
  • an embodiment of the present application discloses a terminal.
  • The terminal is configured to implement the methods described in the foregoing method embodiments, and includes: a status cache module 1001, a source segment selection module 1002, a target segment selection module 1003, and a garbage collection module 1004.
  • the modules included in this terminal can be implemented in the kernel layer of the Android operating system.
  • the status cache module 1001 obtains the status of the segments in the LFS, calculates the aging degree of each segment and the proportion of valid blocks.
  • The source segment selection module 1002 is used to support the terminal in performing step 501 in FIG. 5, the target segment selection module 1003 is used to support the terminal in performing step 502 in FIG. 5, and the garbage collection module 1004 is used to support the terminal in performing step 503 in FIG. 5. For all relevant content of the steps involved in the above method embodiments, refer to the functional description of the corresponding functional module; details are not repeated here.
  • an embodiment of the present application discloses a terminal.
  • The terminal may include: one or more processors 1101; a memory 1102; a display 1103; one or more application programs (not shown); and one or more computer programs 1104, each of which may be connected via one or more communication buses 1105.
  • the one or more computer programs 1104 are stored in the memory 1102 and configured to be executed by the one or more processors 1101.
  • The one or more computer programs 1104 include instructions, and the instructions can be used to execute the steps in FIG. 5 and the corresponding embodiments.
  • the above terminals may be terminal devices such as mobile phones, tablet computers, laptops, ultra-mobile personal computers (UMPCs), netbooks, personal digital assistants (PDAs), and the like.
  • FIG. 13 is a block diagram showing a partial structure of a mobile phone 200 related to each embodiment of the present invention.
  • The mobile phone 200 includes a display device 210, a processor 220, and a memory 230.
  • the memory 230 may be used to store software programs and data.
  • the memory 230 may mainly include a stored program area and a stored data area, where the stored program area may store an operating system and at least one application required by a function (such as an image acquisition function, etc.);
  • the storage data area may store data (such as audio data, phonebook, images, etc.) created according to the use of the mobile phone 200.
  • the memory 230 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the storage fragment management method provided by the example of the present invention is applicable to the management of storage fragments in the memory 230.
  • the processor 220 executes various functional applications and data processing of the mobile phone 200 by running software programs and data stored in the memory 230.
  • The processor 220 is the control center of the mobile phone 200. It connects the various parts of the entire mobile phone through various interfaces and lines, and performs the functions of the mobile phone 200 and processes data by running or executing the software programs and/or data stored in the memory 230, thereby monitoring the mobile phone as a whole.
  • The processor 220 may include one or more general-purpose processors, one or more digital signal processors (DSPs), and one or more image signal processors (ISPs), which perform the related operations to implement the technical solutions provided in the embodiments of the present application.
  • the mobile phone 200 further includes a camera 260 for capturing images or videos.
  • the camera 260 may be an ordinary camera or a focusing camera.
  • the mobile phone 200 may further include an input device 240 for receiving inputted digital information, character information, or contact touch operations / contactless gestures, and generating signal inputs related to user settings and function control of the mobile phone 200.
  • the display device 210 includes a display panel 211 for displaying information input by the user or information provided to the user and various menu interfaces of the mobile phone 200.
  • The display device 210 is mainly used to display the to-be-detected image acquired by the camera or a sensor in the mobile phone 200.
  • the display panel 211 may be configured in the form of a liquid crystal display (LCD) or an organic light-emitting diode (OLED).
  • the mobile phone 200 may further include a power source 250 for supplying power to other modules.
  • the mobile phone 200 may further include one or more sensors 270, such as an image sensor, an infrared sensor, a laser sensor, and the like.
  • The mobile phone 200 may further include a radio frequency (RF) circuit 280 for network communication with wireless network devices, and a WiFi module 290 for WiFi communication with other devices, for example to obtain images or data transmitted by other devices.
  • Each functional unit in each of the embodiments of the present application may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application essentially, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes: various types of media that can store program codes, such as a flash memory, a mobile hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

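This application provides a storage fragment management method and a terminal. The method can be applied to a file system of a terminal, where the file system includes at least one segment. The method includes: the terminal first determines a source segment from the file system according to the aging degree of the segments and the proportion of valid blocks in the segments; the terminal then determines, from the file system and according to the aging degree of the source segment, a target segment whose aging degree is consistent with that of the source segment; finally, the terminal moves the data of the valid blocks in the source segment into free blocks of the target segment. The method addresses the high power consumption incurred when an existing log-structured file system moves data to handle storage fragments.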

Description

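Storage Fragment Management Method and Terminal
Technical Field
This application relates to the field of storage management technologies, and in particular, to a storage fragment management method and a terminal.
Background
A log-structured file system (LFS) treats the entire storage space of a storage device as one log. When there is a request to write data, the data is appended continuously after the current write position. Using the logging principle, writes that might otherwise be scattered are aggregated into sequential writes before being submitted to the storage device, which yields high random-write performance. However, as applications keep creating, modifying, and deleting files in the LFS, the free space of the LFS becomes fragmented. Because writing data requires contiguous free space, the LFS reclaims the free space of storage fragments, that is, it consolidates fragmented free space into contiguous free space to satisfy the sequential-write pattern of the log structure.
Currently, an LFS divides the log into multiple segments, where a segment is a storage area of fixed size. The LFS uses the segment as its storage unit, and each segment is further divided into blocks. The storage fragment management method provided in the prior art is as follows: identify the segments in which storage fragments occur, copy the data of the valid blocks in these segments, and write that data contiguously into free segments. After this operation is complete, the storage space occupied by the fragmented segments is released and the segments are marked as free again. The drawback is that if the data of the valid blocks in a fragmented segment is hot data, that is, data that is likely to be updated, the data may be updated or deleted soon after it has been written into the free segment, so that segment becomes fragmented again. The data of the valid blocks in that segment then has to be moved once more, which incurs additional power consumption.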
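Summary
This application provides a storage fragment management method and a terminal, to address the high power consumption incurred when an existing log-structured file system moves data to handle storage fragments.
According to a first aspect, an embodiment of this application provides a storage fragment management method. The method can be applied to a file system of a terminal, where the file system includes at least one segment. The method includes: the terminal first determines a source segment from the file system according to the aging degree of the segments and the proportion of valid blocks in the segments; the terminal then determines, from the file system and according to the aging degree of the source segment, a target segment whose aging degree is consistent with that of the source segment; finally, the terminal moves the data of the valid blocks in the source segment into free blocks of the target segment.
In the embodiments of this application, because the aging degrees of the source segment and the target segment are consistent, the data in the target segment is of comparable hotness after the move. The data in the blocks of the target segment is therefore updated or deleted at roughly the same time, so the target segment is unlikely to become fragmented again. This reduces the number of moves and lowers power consumption to a certain extent.
In a possible design, the source segment is the segment in the file system whose aging degree is greater than or equal to a first threshold and whose proportion of valid blocks is the smallest. Specifically, the terminal may first traverse the segments in the file system to determine a first candidate set, in which the aging degree of every segment is greater than or equal to the first threshold, and then determine the segment with the smallest proportion of valid blocks in the first candidate set as the source segment. In this way, the source segment selected by the terminal each time holds the least valid-block data, so the amount of data written during the move is reduced, which lowers power consumption to a certain extent.
In a possible design, when the number of segments in the first candidate set is less than or equal to a second threshold, the terminal determines the segment with the smallest proportion of valid blocks in the first candidate set as the source segment; or, when the number of segments in the first candidate set is greater than the second threshold, the terminal removes at least one segment with the smallest aging degree from the first candidate set until the number of segments in the first candidate set is less than or equal to the second threshold, and then determines the segment with the smallest proportion of valid blocks in the first candidate set as the source segment. In this way, because the segments in the first candidate set all have a relatively high aging degree, the aging degree of the selected source segment is sufficiently large. Moreover, the source segment selected by the terminal holds the least valid-block data, so the amount of data written during write-back is reduced, which lowers power consumption to a certain extent.
In a possible design, that the aging degree of the target segment is consistent with the aging degree of the source segment in the foregoing storage fragment management method can be understood as follows: the aging degree of the target segment is greater than or equal to the first threshold and is within a set value interval. It should be noted that the set value interval is generated according to the aging degree of the source segment; for example, the center value of the set value interval is the aging degree of the source segment. In this way, the aging degrees of the target segment and the source segment are the same or close, so after the move the data in the target segment is of comparable hotness.
In another possible design, that the aging degree of the target segment is consistent with the aging degree of the source segment can also be understood as follows: the aging degree of the target segment is greater than or equal to the first threshold and is within the set value interval, and, in addition, the target segment is the segment with the largest proportion of valid blocks among the segments whose aging degree is greater than or equal to the first threshold and within the set value interval. In this way, not only are the aging degrees of the target segment and the source segment the same or close, but the probability that the target segment is filled up by the valid-block data of the source segment is also relatively high. After the move, the free blocks in the target segment are fully used, and the data in the target segment is of comparable hotness.
In a possible design, the target segment may be determined through the following steps: the terminal traverses the segments in the file system to determine a second candidate set, in which the aging degree of every segment is greater than or equal to the first threshold; the terminal then determines a third candidate set from the second candidate set, where the aging degree of every segment in the third candidate set is within the set value interval; finally, the terminal selects the segment with the largest proportion of valid blocks from the third candidate set as the target segment. In this way, the target segment determined by the terminal has an aging degree consistent with that of the source segment, and the free blocks of the target segment can be filled up and thus fully used.
In a possible design, the condition that triggers the terminal to determine the source segment from the file system may be: when the number of free segments in the file system falls below a third threshold, the terminal determines the source segment from the file system according to the aging degree of the segments and the proportion of valid blocks in the segments; or the terminal periodically determines the source segment from the file system according to the aging degree of the segments and the proportion of valid blocks in the segments. In other words, storage fragment management may be triggered because the LFS is short of free segments, or because a cleaning thread of the terminal periodically determines source segments and moves their data. Either trigger condition helps the terminal reclaim its storage space in time.
It should be noted that, in a possible design, the file system mentioned in the embodiments of this application may be an LFS.
According to a second aspect, an embodiment of this application provides a terminal, including a processor and a memory. The memory is configured to store one or more computer programs; when the one or more computer programs stored in the memory are executed by the processor, the terminal is enabled to implement the method in any possible design of the first aspect.
According to a third aspect, an embodiment of this application further provides a terminal, where the terminal includes modules/units that perform the method in the first aspect or in any possible design of the first aspect. These modules/units may be implemented by hardware, or by hardware executing corresponding software.
According to a fourth aspect, an embodiment of this application further provides a computer-readable storage medium, where the computer-readable storage medium includes a computer program, and when the computer program runs on a terminal, the terminal is caused to perform the method in the first aspect or in any possible design of the first aspect.
According to a fifth aspect, an embodiment of this application further provides a computer program product; when the computer program product runs on a terminal, the terminal is caused to perform the method in the first aspect or in any possible design of the first aspect.
These and other aspects of this application will be more readily understood from the description of the following embodiments.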
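Brief Description of Drawings
FIG. 1 is a schematic diagram of the storage device layout of a log-structured file system according to an embodiment of this application;
FIG. 2 is a schematic diagram of the log structure of a log-structured file system according to an embodiment of this application;
FIG. 3 is a schematic diagram of a log-structured file system according to an embodiment of this application;
FIG. 4 is a schematic diagram of a storage fragment management method provided in the prior art;
FIG. 5 is a schematic flowchart of a storage fragment management method according to an embodiment of this application;
FIG. 6 is a schematic flowchart of a method for selecting a source segment according to an embodiment of this application;
FIG. 7 is a schematic flowchart of a method for selecting a target segment according to an embodiment of this application;
FIG. 8 is a schematic flowchart of another method for selecting a target segment according to an embodiment of this application;
FIG. 9 is a schematic flowchart of another storage fragment management method according to an embodiment of this application;
FIG. 10 is a schematic diagram of garbage collection in an LFS according to an embodiment of this application;
FIG. 11 is a schematic diagram of the unit modules of a terminal according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of a terminal according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of a mobile phone according to an embodiment of this application.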
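Detailed Description
The embodiments of this application are described in further detail below with reference to the accompanying drawings.
The embodiments of this application provide a storage fragment management method and a terminal, to address the high power consumption incurred when an existing log-structured file system moves data to handle storage fragments. The method and the terminal described in this application are based on the same inventive concept. Because the method and the terminal solve the problem on similar principles, the implementations of the device and the method may refer to each other, and repeated content is not described again.
Some terms used in this application are explained below.
1) A file system (FS) is a method of storing and organizing computer files and data. There are many kinds of file systems, including object-based file systems and log-based file systems such as the LFS.
2) Garbage collection is the management of free space in a storage device; its goal is to provide the large free blocks needed for writing new data.
3) A storage device may be a non-volatile memory, a dynamic random access memory, a static random access memory, a flash memory, an embedded multimedia card (eMMC), or the like.
4) "Multiple" means two or more.
5) In the description of this application, words such as "first" and "second" are used only to distinguish between descriptions, and shall not be understood as indicating or implying relative importance or order.
In the following, taking a magnetic disk as an example of the storage device, information about the log-structured file system is first described in detail.
1. Data structures of the log-structured file system
Table 1 lists the main data structures of the LFS on the storage device together with their functions and locations. The checkpoint region is a fixed location on the storage device; it is used to locate the disk blocks or flash blocks that hold the inode map and to identify the last checkpoint in the log. The inode map maintains the current location of each inode; its active part is cached in memory, so lookups rarely need to access the storage device.
Table 1 Main data structures of the LFS on the storage device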
Figure PCTCN2018093930-appb-000001
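2. Disk layout of the log-structured file system
In the LFS, the log is the on-disk structure. To facilitate free-space management, the LFS divides the log into segments. The metadata of the LFS is mainly distributed across the checkpoints and the segments; FIG. 1 shows the disk layout. Both the inode pointers and the current inode locations given by the inode map change frequently. The timestamp in a checkpoint can be used to determine the last successful checkpoint.
3. Log structure of the log-structured file system
The log of the LFS uses a sequential, append-only data structure. The LFS still uses the traditional index-based organization to describe files. The LFS writes inodes into the log, and the inodes allow the LFS to retrieve file information from the log by random access. An inode is looked up in the LFS as follows: find the most recent inode map from the checkpoint located in a fixed region of the disk; find the most recent version of the inode from the inode map; and find the corresponding data blocks from the inode. As shown in FIG. 2, the inode map is found from the checkpoint region, three inodes are then found from the inode map, and the corresponding data blocks are found from each inode.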
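The lookup chain (checkpoint, inode map, inode, data blocks) can be illustrated with a toy model in which the storage device is a dictionary keyed by address. The dictionary layout, the key names `inode_map_addr` and `block_addrs`, and the function name are all assumptions for this sketch and do not reflect an actual on-disk format.

```python
def read_file_blocks(disk, checkpoint_addr, inode_number):
    """Follow checkpoint -> inode map -> inode -> data blocks in a toy disk."""
    checkpoint = disk[checkpoint_addr]            # fixed, well-known location
    inode_map = disk[checkpoint["inode_map_addr"]]  # most recent inode map
    inode = disk[inode_map[inode_number]]         # most recent inode version
    return [disk[addr] for addr in inode["block_addrs"]]
```

For example, with a checkpoint at address 0 pointing to an inode map at address 10, the data blocks of inode 7 are found in three dictionary lookups, mirroring the three steps in the text.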
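4. States of segments and of blocks within segments in the log-structured file system
In a log-structured file system, each segment is a sequence of multiple blocks. A block may be in one of two states: 1) free; 2) valid. Table 2 defines these block states.
Table 2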
Figure PCTCN2018093930-appb-000002
Figure PCTCN2018093930-appb-000003
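A block in the valid state contains valid data. The state of a block can be determined from the information in the segment summary or the segment usage table. The embodiments of this application describe the following two ways of determining it.
Manner 1: In the LFS, summary information is recorded for every block. The summary information contains the inode number (which indicates the file to which the disk block belongs) and the offset (which indicates which disk block of the file this is). This information is stored in the segment summary block at the head of the segment. From the information in the segment summary block, it can be determined directly whether a block contains valid data: if it does, the block is a valid block; otherwise, it is a free block.
Manner 2: The validity of blocks can be determined by checking whether the block pointers in the file's inode or indirect blocks still point to these blocks. If the pointers still point to the blocks, the blocks are valid blocks; otherwise, they are free blocks.
Because a segment consists of blocks, the combination of block states in a segment determines the state of the segment. A segment may be in one of three states: 1) free; 2) dirty; 3) valid. Table 3 defines these segment states.
Table 3
Segment state    Description
Free    All blocks in the segment are free blocks.
Dirty    The segment contains both valid blocks and free blocks.
Valid    All blocks in the segment are valid blocks.
As shown in FIG. 3, the log-structured file system 400 includes segments 41, 42, and 43. Each segment is a set of physical disk blocks or flash blocks; for example, the capacity of a segment is 8 MB. All blocks in segment 41 are free blocks, so segment 41 is in the free state; segment 42 contains two valid blocks and four free blocks, so segment 42 is in the dirty state; all blocks in segment 43 are valid blocks, so segment 43 is in the valid state.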
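The mapping from block states to segment states in Table 3 is small enough to write down directly. In this sketch a segment is modelled as a list of booleans (True for a valid block, False for a free block); that representation and the function name are assumptions for illustration.

```python
def segment_state(blocks):
    """Classify a segment as 'free', 'dirty', or 'valid' from its block states.

    blocks is a list of booleans: True marks a valid block, False a free block.
    """
    valid_count = sum(blocks)
    if valid_count == 0:
        return "free"                  # every block is free
    if valid_count == len(blocks):
        return "valid"                 # every block is valid
    return "dirty"                     # a mix of valid and free blocks
```

Applied to the three segments of FIG. 3, the function classifies segment 41 as free, segment 42 (two valid, four free blocks) as dirty, and segment 43 as valid.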
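Generally, the data in the valid blocks of a segment may be hot data or cold data. Hot data is data in a valid block that is likely to be updated or deleted soon; cold data is data in a valid block that is likely to be updated or deleted only after a long time. Because a segment consists of blocks, if the data in the valid blocks of a segment is mostly cold data, the segment is a cold segment; if it is mostly hot data, the segment is a hot segment. In other words, if the data in a valid block is cold data, the time of the last update recorded in the segment usage table is generally long before the current time, that is, the valid block is relatively old. The aging degree can generally be used to measure how hot or cold each segment is. One way of defining the aging degree is as follows: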
Figure PCTCN2018093930-appb-000004
Figure PCTCN2018093930-appb-000005
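Here, the system last update time is the time at which the log-structured file system was last updated, and the system earliest update time is the time at which it was first updated. The segment update time is the average update time of all valid blocks in the segment. n is the number of valid blocks in the segment, T1 is the update time of the first valid block in the segment, T2 is the update time of the second valid block, and Tn is the update time of the n-th valid block.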
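The quantities defined above can be combined in a short sketch. The segment update time as the mean of T1 to Tn follows the text directly. The normalized form of the aging degree used here, (system last update time minus segment update time) divided by (system last update time minus system earliest update time), is an assumption: the formula itself is only available as an image in the source, and this form is merely one that is consistent with the definitions and with example values such as 0.8.

```python
def segment_update_time(block_update_times):
    """Average update time (T1 + T2 + ... + Tn) / n of the valid blocks."""
    return sum(block_update_times) / len(block_update_times)


def aging_degree(block_update_times, system_first_update, system_last_update):
    """ASSUMED normalization: 0 means just updated, 1 means as old as the system."""
    span = system_last_update - system_first_update
    if span == 0:
        return 0.0                     # the file system has a single update time
    age = system_last_update - segment_update_time(block_update_times)
    return age / span
```

Under this assumed form, a segment whose valid blocks were updated at times 100, 200, and 300 in a file system whose updates span 0 to 1000 has a segment update time of 200 and an aging degree of 0.8.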
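An embodiment of this application provides a storage fragment management method that can be used to perform garbage collection on storage fragments in a storage device. The need for garbage collection follows from the design of the LFS: as applications keep creating, modifying, and deleting files in the LFS, its free space becomes fragmented, so that large sequential writes are no longer possible and the available space on the storage device has to be consolidated. In the prior art, a log-structured file system typically performs garbage collection as follows: during cleaning, the dirty segment with the smallest proportion of valid blocks is selected as the source segment each time, the data in all valid blocks of the source segment is moved into contiguous free space, and the storage space occupied by the source segment is then reclaimed. As shown in FIG. 4, before garbage collection the source segment contains three valid blocks and three free blocks, and the target segment consists entirely of free blocks; after garbage collection, the data of the valid blocks in the source segment has been moved away, the source segment consists of free blocks, and in the target segment the first three blocks are valid blocks and the last three are free blocks. The drawback is that if the data stored in the source segment is hot data, the data in its valid blocks may soon be updated or deleted, which leads to repeated moves and additional power consumption.
The storage fragment management method provided in the embodiments of this application takes the aging degree of segments into account when selecting the target segment and the source segment. The method selects, as far as possible, a source segment with a large aging degree and moves the data in the valid blocks of that source segment into free blocks of a target segment with a consistent aging degree. After the move, the data in the target segment is therefore of comparable hotness. The data in the blocks of the segment is updated or deleted at roughly the same time, so the target segment is unlikely to become fragmented again. This reduces the number of moves and lowers power consumption to a certain extent.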
为了更加清晰地描述本申请实施例的技术方案,下面结合附图,对本申请实施例提供的存储碎片管理方法及终端进行详细说明。参阅图5所示,本申请实施例提供了一种存储碎片管理方法,该方法可以由终端执行,具体流程包括:
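Step 501: The terminal determines a source segment in the file system based on the aging degree of segments and the valid block proportion of segments.
Specifically, taking an LFS as the file system, the processor of the terminal starts a cleaning thread. The cleaning thread may first traverse all segments in the log-structured file system, determine a source segment whose aging degree and valid block proportion both satisfy a set condition, and then write the data in the valid blocks of the source segment into a cache. The set condition may be that the segment has an aging degree greater than a first threshold and has the smallest valid block proportion among such segments. Alternatively, the set condition may be that the segment has an aging degree greater than the first threshold and has the second-smallest valid block proportion, or that the segment has an aging degree greater than the first threshold and a valid block proportion below some threshold. In short, the source segment is a segment with a relatively high aging degree and a relatively small valid block proportion.
In general, the cleaning thread traverses the LFS in repeated passes, and in each pass selects as the source segment the segment whose aging degree is greater than the first threshold and whose valid block proportion is the smallest. Because the source segment selected in each pass has the smallest valid block proportion, it also holds the least valid data, so the least data has to be moved. This condition therefore reduces the amount of data written during the move and lowers power consumption to some extent. Likewise, when each pass selects a segment whose aging degree is greater than the first threshold and whose valid block proportion is small (below some threshold), the amount of data to be moved is also small, which again reduces the write volume and power consumption.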
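Specifically, in one possible design, the terminal first traverses the segments in the log-structured file system and adds every dirty segment whose aging degree is greater than the first threshold to a first candidate set. The terminal then traverses the first candidate set and selects the segment with the smallest valid block proportion as the source segment. Finally, the terminal loads the data of the valid blocks in the source segment into the cache and adds an identifier to the segment.
In addition, when the number of segments in the first candidate set is less than or equal to a second threshold, the terminal selects the segment with the smallest valid block proportion from the first candidate set as the source segment.
When the number of segments in the first candidate set is greater than the second threshold, the terminal may remove the segments with the lowest aging degrees from the first candidate set until the number of segments in the set is less than or equal to the second threshold, and then select the segment with the smallest valid block proportion from the first candidate set as the source segment. Because every segment remaining in the first candidate set has a relatively high aging degree, the selected source segment is sufficiently old. And because the selected source segment holds the least valid block data, the amount of data written during flushing is reduced, which lowers power consumption to some extent.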
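The source-segment selection described for step 501 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the `Segment` record, its field names, and the function `select_source` are hypothetical, and trimming the candidate set is done by keeping the oldest `second_threshold` segments, which has the same effect as repeatedly removing the segment with the lowest aging degree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seg_id: int
    aging: float        # aging degree; higher means older (colder) data
    valid_blocks: int
    total_blocks: int

    @property
    def valid_ratio(self) -> float:
        return self.valid_blocks / self.total_blocks

    @property
    def is_dirty(self) -> bool:
        # Dirty: the segment holds both valid and free blocks.
        return 0 < self.valid_blocks < self.total_blocks


def select_source(segments, first_threshold, second_threshold):
    """Return the source segment, or None if no segment qualifies."""
    if second_threshold < 1:
        raise ValueError("second_threshold must be at least 1")
    # First candidate set: dirty segments older than the first threshold.
    candidates = [s for s in segments
                  if s.is_dirty and s.aging > first_threshold]
    if not candidates:
        return None
    # Drop the youngest candidates until at most second_threshold remain.
    candidates.sort(key=lambda s: s.aging, reverse=True)
    del candidates[second_threshold:]
    # Among the remaining (oldest) candidates, move the least valid data.
    return min(candidates, key=lambda s: s.valid_ratio)
```

The trimming step matters: with a small second threshold, a young segment with very little valid data is excluded even though it would be the cheapest to move, because its data is more likely to change again soon.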
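Step 502: Based on the aging degree of the source segment, the terminal determines, in the file system, a target segment whose aging degree is consistent with that of the source segment.
Specifically, in one possible design, that the aging degree of the target segment is consistent with that of the source segment means that the two aging degrees are the same or close. When selecting the target segment, the terminal may select from the file system a segment whose aging degree is greater than or equal to the first threshold and lies within a set value interval. This first threshold may be the same as the first threshold used to determine the source segment. The set value interval is generated from the aging degree of the source segment; for example, the center of the interval is the aging degree of the source segment. Because the target segment and the source segment then have the same or similar aging degrees, the data in the target segment is similarly hot or cold after the move.
Specifically, in one possible design, the cleaning thread first traverses the segments in the log-structured file system, selects from the dirty segments those whose aging degree is greater than the first threshold, and adds them to a second candidate set. It then traverses the second candidate set, determines the segments whose aging degree lies within the set value interval, and adds them to a third candidate set. The terminal then selects from the third candidate set the segment whose aging degree is closest to that of the source segment as the target segment, or selects any segment as the target segment, or selects the segment with the highest aging degree as the target segment.
For example, assume that the aging degree of the source segment is a (for example, 0.8) and that the preset value interval is [a-0.3, a+0.3]. The aging degrees of all segments in the third candidate set then lie within [a-0.3, a+0.3], and the terminal selects one segment from the third candidate set as the target segment.
As another example, in one possible design, the terminal sorts the segments in the second candidate set by aging degree in descending order. Then, taking the aging degree of the source segment as the center and k as the radius, the terminal selects the k segments immediately before and the k segments immediately after that aging degree, and adds the selected segments to the third candidate set. For example, assume that the aging degree of the source segment is a (for example, 0.8). From the sorted second candidate set, the terminal selects k (for example, k = 3) segments whose aging degree is less than a and k segments whose aging degree is greater than a, and uses the selected 2k or 2k+1 segments as the third candidate set. It then selects one segment from the third candidate set as the target segment. If the second candidate set contains a segment whose aging degree is a, the third candidate set contains 2k+1 segments; otherwise it contains 2k segments.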
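The rank-based variant, which takes k neighbors on each side of the source segment's aging degree, can be sketched as follows. The function name `rank_neighbours` and the representation of segments as `(seg_id, aging)` pairs are assumptions made for illustration. The sketch keeps at most one segment whose aging degree equals that of the source, which gives the 2k or 2k+1 count described above.

```python
def rank_neighbours(second_set, source_aging, k):
    """Build the third candidate set from (seg_id, aging) pairs.

    Keeps the k segments just older than the source, at most one segment
    with exactly the source's aging degree, and the k segments just younger.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort by aging degree, oldest first.
    ordered = sorted(second_set, key=lambda item: item[1], reverse=True)
    older = [item for item in ordered if item[1] > source_aging]
    equal = [item for item in ordered if item[1] == source_aging]
    younger = [item for item in ordered if item[1] < source_aging]
    # In descending order, the older segments closest to the source are the
    # last k of `older`; the closest younger ones are the first k of `younger`.
    return older[-k:] + equal[:1] + younger[:k]
```

Unlike a fixed interval such as [a-0.3, a+0.3], this variant always yields a bounded number of candidates, regardless of how densely the aging degrees cluster around a.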
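In another possible design, that the aging degree of the target segment is consistent with that of the source segment may further mean that the target segment is the segment with the largest valid block proportion among the segments whose aging degree is greater than or equal to the first threshold and lies within the set value interval. In this case, not only do the target segment and the source segment have the same or similar aging degrees, but the target segment is also more likely to be filled completely by the valid block data of the source segment. After the move, the free blocks of the target segment are therefore fully used, and the data in the target segment is similarly hot or cold.
Specifically, in one possible design, the cleaning thread first traverses the segments in the log-structured file system, selects from the dirty segments those whose aging degree is greater than the first threshold, and adds them to a second candidate set. It then traverses the second candidate set, determines the segments whose aging degree lies within the set value interval, and adds them to a third candidate set. The terminal then selects the segment with the largest valid block proportion from the third candidate set as the target segment.
For example, assume that the aging degree of the source segment is a (for example, 0.8) and that the preset value interval is [a-0.3, a+0.3]. The aging degrees of all segments in the third candidate set then lie within [a-0.3, a+0.3], and the terminal selects the segment with the largest valid block proportion from the third candidate set as the target segment.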
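The interval-based target selection can be sketched as follows. The `Segment` record, the function `select_target`, and the parameter `half_width` are hypothetical names. Excluding the source segment itself from the candidates is an assumption of this sketch that the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seg_id: int
    aging: float        # aging degree; higher means older (colder) data
    valid_blocks: int
    total_blocks: int

    @property
    def valid_ratio(self) -> float:
        return self.valid_blocks / self.total_blocks

    @property
    def is_dirty(self) -> bool:
        return 0 < self.valid_blocks < self.total_blocks


def select_target(segments, source, first_threshold, half_width=0.3):
    """Return the fullest dirty segment whose aging degree is near the source's."""
    low, high = source.aging - half_width, source.aging + half_width
    # Second candidate set: dirty segments at or above the first threshold.
    # Assumption: the source segment itself is never its own target.
    second_set = [s for s in segments
                  if s.is_dirty and s.aging >= first_threshold
                  and s.seg_id != source.seg_id]
    # Third candidate set: aging degree within [a - half_width, a + half_width].
    third_set = [s for s in second_set if low <= s.aging <= high]
    if not third_set:
        return None
    # The fullest candidate is the most likely to be filled completely.
    return max(third_set, key=lambda s: s.valid_ratio)
```

Replacing the final `max` with a choice by smallest `abs(s.aging - source.aging)` would give the alternative policy of picking the segment whose aging degree is closest to that of the source.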
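Step 503: The terminal moves the data of the valid blocks in the source segment to free blocks of the target segment.
In step 503, after the terminal has scanned a source segment, it may first load the data of the valid blocks in the source segment into the cache. Then, for each valid block in the cache, the terminal uses the data index of the valid block to find the identifier of the source segment it came from, determines the target segment based on the aging degree of that source segment, and writes the data of the valid block to a free block of the target segment. The terminal also releases the storage space occupied by the source segment corresponding to the source segment identifier.
In general, storage fragment management can be triggered by various conditions, several of which are listed below.
One trigger condition is as follows: only when the number of free segments in the file system falls below a third threshold (for example, 20) does the processor create, in the kernel, a cleaning thread for garbage collection. The cleaning thread performs steps 501 to 503 in a loop and stops when the number of free segments in the file system rises to a given threshold (for example, 100). This storage fragment management mode may also be called foreground garbage collection.
Another trigger condition is as follows: the processor configures, in the kernel, a cleaning thread for garbage collection. The cleaning thread performs step 501 in real time or periodically. When the data of the valid blocks in a source segment determined by the terminal is loaded into the cache, the source segment is also marked as a segment awaiting garbage collection. In one case, steps 502 and 503 are triggered when the valid block data occupies at least a given proportion of the cache, for example 80%. In another case, when valid block data is loaded into the cache, its load time is recorded and it is marked dirty; once the cache manager detects that the data has been dirty for longer than a given duration, steps 502 and 503 are triggered and that valid block data is cleared from the cache. This storage fragment management mode may also be called background garbage collection.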
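The two trigger conditions can be sketched as follows. The function names, the `reclaim_one` callback (assumed to run steps 501 to 503 once and return the number of segments it freed), and the default of 5 seconds for the dirty-time limit are hypothetical; only the example values 20, 100, and 80% come from the text.

```python
def foreground_gc(free_segments, reclaim_one, start_below=20, stop_at=100):
    """Foreground trigger: start below one threshold, stop at a higher one.

    `reclaim_one` is a hypothetical callback that performs steps 501-503
    once and returns how many segments were freed.
    """
    if free_segments >= start_below:
        return free_segments              # not triggered
    while free_segments < stop_at:
        gained = reclaim_one()
        if gained <= 0:
            break                         # nothing left to reclaim
        free_segments += gained
    return free_segments


def background_flush_due(cached_bytes, cache_capacity, oldest_dirty_age_s,
                         ratio=0.8, max_dirty_s=5.0):
    """Background trigger for steps 502-503.

    Fires when cached valid-block data fills at least `ratio` of the cache,
    or when the oldest dirty entry has waited longer than `max_dirty_s`
    seconds (the 5.0 default is an assumed value).
    """
    return (cached_bytes / cache_capacity >= ratio
            or oldest_dirty_age_s > max_dirty_s)
```

Using a start threshold that is lower than the stop threshold gives the foreground loop hysteresis, so the cleaning thread is not created and torn down repeatedly when the number of free segments hovers around a single value.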
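The following uses an LFS as an example to describe in detail how the storage fragment management method is performed in the two scenarios of background garbage collection and foreground garbage collection.
Background garbage collection scenario
The processor of the terminal creates, in the kernel, a cleaning thread for garbage collection. The cleaning thread performs processing in three phases: phase 1, selecting a source segment in real time or periodically; phase 2, selecting a target segment; and phase 3, garbage collection.
Phase 1: Selecting a source segment in real time or periodically. This phase is described below with reference to FIG. 6.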
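Step 601: The cleaning thread scans all segments in the LFS and obtains the aging degree of each segment.
Step 602: The cleaning thread determines whether any scanned segment has an aging degree greater than or equal to the first threshold. If so, the procedure goes to step 603; otherwise, it returns to step 601.
Step 603: The cleaning thread adds the segments whose aging degree is greater than or equal to the first threshold to the first candidate set.
Step 604: The cleaning thread determines whether the number of segments in the first candidate set does not exceed the second threshold. If so, the procedure goes to step 606; otherwise, it goes to step 605.
Step 605: The cleaning thread removes the segment with the lowest aging degree from the first candidate set, and then performs step 604 again.
Step 606: The cleaning thread selects, from the current first candidate set, the segment with the smallest valid block proportion as the source segment.
Finally, in one possible design, the cleaning thread loads the data in the valid blocks of each selected source segment into the cache and then adds to the source segment an identifier indicating that it awaits garbage collection.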
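Phase 2: Selecting a target segment
The cleaning thread is triggered to select a target segment when the valid block data occupies a given proportion of the cache, for example 80%, or when valid block data in the cache has been dirty for longer than a set duration.
In phase 2, the target segment is selected mainly from the dirty segments in the LFS. The selection policy may be based on the aging degree alone, or on both the aging degree and the valid block proportion. The aging degree of the selected target segment is consistent with that of the source segment. Specifically, for each valid block in the cache, the cleaning thread uses the inode corresponding to the data of the valid block to look up the identifier of the source segment that held the block, and determines the target segment based on the aging degree of the source segment corresponding to that identifier. This phase is described below with reference to FIG. 7.
Step 701: The cleaning thread scans all segments in the LFS and obtains the aging degree of each segment.
Step 702: The cleaning thread determines whether any scanned segment has an aging degree greater than the first threshold. If so, the procedure goes to step 703; otherwise, it returns to step 701.
Step 703: The cleaning thread adds the segments whose aging degree is greater than the first threshold to the second candidate set.
It should be noted that, because target segment selection may be triggered only when the valid block data in the cache reaches a given proportion, it takes place after the source segment has been selected. The segment states scanned by the cleaning thread in step 701 are therefore likely to differ from those scanned in step 601, so the resulting second candidate set may differ from the first candidate set.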
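Step 704: Based on the aging degree of the source segment, the cleaning thread traverses the second candidate set and determines whether it contains any segment whose aging degree is outside the set value interval. If so, the procedure goes to step 705; otherwise, it goes to step 706.
Specifically, assume that the aging degree of the source segment is a (for example, 0.8) and that the set value interval is [a-0.3, a+0.3]. The cleaning thread determines whether the second candidate set contains any segment whose aging degree is outside [a-0.3, a+0.3].
Step 705: The cleaning thread removes from the second candidate set the segments whose aging degree is outside the set value interval, and then performs step 704 again.
Step 706: The cleaning thread selects, from the current third candidate set (that is, the second candidate set after the removal in step 705), the segment with the largest valid block proportion as the target segment.
It should be noted that in step 706 the cleaning thread may alternatively select from the current third candidate set the segment whose aging degree is closest to that of the source segment, select any segment, or select the segment with the highest aging degree as the target segment.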
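It should be noted that after the second candidate set is determined, the target segment may also be determined in another way, which is described below with reference to FIG. 8.
Steps 801 to 803 are the same as steps 701 to 703 above and are not described again.
Step 804: The cleaning thread sorts the segments in the second candidate set by aging degree, in either descending or ascending order.
Step 805: Based on the aging degree of the source segment, the cleaning thread traverses the second candidate set and determines whether any segment lies beyond the count radius. If so, the procedure goes to step 806; otherwise, it goes to step 807.
For example, if the aging degree of the source segment is a (for example, 0.8), the cleaning thread determines whether the sorted second candidate set contains any segments other than the three consecutive segments just below a and the three consecutive segments just above a.
Step 806: The cleaning thread removes the segments beyond the count radius from the second candidate set, and then performs step 805 again.
Step 807: The cleaning thread selects, from the third candidate set, the segment with the largest valid block proportion as the target segment.
It should be noted that the cleaning thread may ultimately determine more than one target segment. For example, it may select from the third candidate set both the segment with the largest valid block proportion and the segment with the second-largest proportion as target segments. This avoids the problem that a single target segment has too few free blocks to hold all the valid data of the source segment.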
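Phase 3: Garbage collection
For each valid block in the cache, the cleaning thread writes the data of the valid block to the corresponding target segment. After all valid block data in the cache has been written, the cleaning thread releases the storage space occupied by the source segments that carry the awaiting-garbage-collection identifier.
It should be noted that phases 2 and 3 may be performed in a loop until the valid block data occupies less than a given proportion of the cache (for example, 20%). Phases 2 and 3 may also be performed periodically. For example, the cleaning thread determines a source segment every five minutes and loads the data of its valid blocks into the cache; five seconds after the data has been loaded, the cleaning thread looks up the source segment for each valid block in the cache, determines the target segment based on the aging degree of the source segment, and then writes the data of the valid block to a free block of the target segment.
In addition, the cleaning thread may write the data of the valid blocks in the source segment directly to free blocks of the target segment, or it may first load that data into the cache and then write the cached data to free blocks of the target segment. This application imposes no specific limitation in this respect.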
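Foreground garbage collection scenario
If a resource reclamation instruction is received for the LFS, or when the number of free segments in the LFS falls to a set threshold (for example, 20), the processor creates a cleaning thread. The cleaning thread performs processing in two phases: phase 1, selecting a source segment and a target segment; and phase 2, garbage collection. The procedure is described below with reference to FIG. 9.
Phase 1: Selecting a source segment and a target segment
Step 901: The cleaning thread scans all segments in the LFS and obtains the aging degree of each segment.
Step 902: The cleaning thread determines whether any scanned segment has an aging degree greater than the first threshold. If so, the procedure goes to step 903; otherwise, it returns to step 901.
Step 903: The cleaning thread adds the segments whose aging degree is greater than the first threshold to the first candidate set.
Step 904: The cleaning thread determines whether the number of segments in the first candidate set does not exceed the second threshold. If so, the procedure goes to step 906; otherwise, it goes to step 905.
Step 905: The cleaning thread removes the segment with the lowest aging degree from the first candidate set, and then returns to step 904.
Step 906: The cleaning thread selects, from the current first candidate set, the segment with the smallest valid block proportion as the source segment.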
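Step 907: Based on the aging degree of the source segment, the cleaning thread traverses the first candidate set and determines whether it contains any segment whose aging degree is outside the set value interval. If so, the procedure goes to step 908; otherwise, it goes to step 909.
Specifically, assume that the aging degree of the source segment is a (for example, 0.8) and that the set value interval is [a-0.3, a+0.3]. The cleaning thread determines whether the first candidate set contains any segment whose aging degree is outside [a-0.3, a+0.3].
Step 908: The cleaning thread removes from the first candidate set the segments whose aging degree is outside the set value interval, and then returns to step 907, until the aging degrees of all segments in the first candidate set lie within the set value interval.
Step 909: The cleaning thread selects, from the current first candidate set, the segment with the largest valid block proportion as the target segment.
It should be noted that in step 909 the cleaning thread may alternatively select from the current first candidate set the segment whose aging degree is closest to that of the source segment, select any segment, or select the segment with the highest aging degree as the target segment.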
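Phase 2: Garbage collection
The cleaning thread writes the data in the valid blocks of the source segment to the target segment, and then repeats steps 901 to 909 until the number of free segments in the LFS rises to a set threshold (for example, 80). Specifically, the cleaning thread may write the data of the valid blocks in the source segment directly to free blocks of the target segment, or it may first load that data into the cache and then write the cached data to free blocks of the target segment. This application imposes no specific limitation in this respect.
The cleaning thread may ultimately determine more than one target segment. For example, it may select from the first candidate set both the segment with the largest valid block proportion and the segment with the second-largest proportion as target segments. This avoids the problem that a single target segment has too few free blocks to hold all the valid data of the source segment.
In addition, it should be noted that the target segment in FIG. 9 may alternatively be selected in the manner shown in FIG. 8; details are not repeated here. It should also be noted that the source segment and the target segment in FIG. 9 may both be determined from the first candidate set, or may be determined from different candidate sets. For example, the source segment is selected from the LFS segments as they exist at a first moment, and the target segment is selected from the LFS segments as they exist at a second moment, where the second moment is later than the first. The LFS segments at the two moments may differ, so the first candidate sets corresponding to the two moments may also differ.
For the garbage collection phase, for example, as shown in FIG. 10, before garbage collection the source segment contains three valid blocks and three free blocks, and the target segment contains three free blocks and three valid blocks. After foreground or background garbage collection is performed, all six blocks in the source segment are free blocks and the target segment is filled with valid blocks. When garbage collection is complete, the storage space corresponding to the source segment is reclaimed, the source segment is reset to a free segment, and new data can be written to it again.
It should be noted that, in one possible design, if during garbage collection the data in the valid blocks of the source segment belongs to multiple files in the same directory, the files in the same directory may preferentially be moved to one target segment. In another possible design, during garbage collection the valid blocks of the source segment are grouped by last modification time, valid blocks with the same or similar times are placed in the same group, and the data of the valid blocks in each group is moved to the same target segment.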
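Grouping valid blocks by last modification time can be sketched as follows. The text does not say how close two times must be to count as similar, so the `window` parameter is an assumption, as are the function name `group_by_mtime` and the representation of blocks as `(block_id, mtime)` pairs.

```python
def group_by_mtime(blocks, window):
    """Group (block_id, mtime) pairs whose times are close together.

    Blocks are sorted by modification time. A block joins the current group
    if its time is within `window` of the first block in that group;
    otherwise it starts a new group. `window` is an assumed tuning value.
    """
    groups = []
    for block_id, mtime in sorted(blocks, key=lambda b: b[1]):
        if groups and mtime - groups[-1][0][1] <= window:
            groups[-1].append((block_id, mtime))
        else:
            groups.append([(block_id, mtime)])
    return groups
```

Each resulting group would then be moved to a single target segment, so that blocks written at about the same time end up stored together.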
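An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium includes a computer program, and when the computer program runs on a terminal, the terminal is enabled to perform any possible implementation of the foregoing storage fragment management method.
An embodiment of this application further provides a computer program product. When the computer program product runs on a terminal, the terminal is enabled to perform any possible implementation of the foregoing storage fragment management method.
In some embodiments of this application, a terminal is disclosed. As shown in FIG. 11, the terminal is configured to implement the methods described in the foregoing method embodiments and includes a state cache module 1001, a source segment selection module 1002, a target segment selection module 1003, and a garbage collection module 1004. These modules may be implemented in the kernel layer of the Android operating system. The state cache module 1001 obtains the states of the segments in the LFS and calculates the aging degree and valid block proportion of each segment. The source segment selection module 1002 supports the terminal in performing step 501 in FIG. 5, the target segment selection module 1003 supports the terminal in performing step 502 in FIG. 5, and the garbage collection module 1004 supports the terminal in performing step 503 in FIG. 5. All related content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding modules; details are not repeated here.
In other embodiments of this application, a terminal is disclosed. As shown in FIG. 12, the terminal may include one or more processors 1101, a memory 1102, a display 1103, one or more application programs (not shown), and one or more computer programs 1104. These components may be connected through one or more communication buses 1105. The one or more computer programs 1104 are stored in the memory 1102 and configured to be executed by the one or more processors 1101. The one or more computer programs 1104 include instructions that may be used to perform the steps in FIG. 5 and the corresponding embodiments.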
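The terminal may be a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or a similar terminal device. The following uses a mobile phone as an example. FIG. 13 is a block diagram of part of the structure of a mobile phone 200 related to the embodiments of the present invention.
As shown in FIG. 13, the mobile phone 200 includes a display device 210, a processor 220, and a memory 230. The memory 230 may be used to store software programs and data, and may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (for example, an image capture function). The data storage area may store data created through use of the mobile phone 200 (for example, audio data, a phone book, and images). In addition, the memory 230 may include a high-speed random access memory and may further include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. The storage fragment management method provided in the embodiments of the present invention is applicable to managing storage fragments in the memory 230.
The processor 220 performs the various functional applications and data processing of the mobile phone 200 by running the software programs and data stored in the memory 230. The processor 220 is the control center of the mobile phone 200. It connects the parts of the entire phone through various interfaces and lines, and performs the various functions of the mobile phone 200 and processes data by running or executing the software programs and/or data stored in the memory 230, thereby monitoring the phone as a whole. The processor 220 may include one or more general-purpose processors, and may further include one or more digital signal processors (DSPs) or one or more image signal processors (ISPs), which perform the related operations to implement the technical solutions provided in the embodiments of this application.
The mobile phone 200 further includes a camera 260 for capturing images or video. The camera 260 may be an ordinary camera or a focusing camera.
The mobile phone 200 may further include an input device 240, configured to receive entered numeric information, character information, or contact touch operations and contactless gestures, and to generate signal inputs related to user settings and function control of the mobile phone 200.
The display device 210 includes a display panel 211, which is configured to display information entered by the user, information provided to the user, and the various menu interfaces of the mobile phone 200. In the embodiments of this application it is mainly used to display images to be detected that are obtained by the camera or a sensor of the mobile phone 200. Optionally, the display panel 211 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
In addition, the mobile phone 200 may further include a power supply 250 for supplying power to the other modules. The mobile phone 200 may further include one or more sensors 270, for example an image sensor, an infrared sensor, or a laser sensor. The mobile phone 200 may further include a radio frequency (RF) circuit 280 for network communication with wireless network devices, and a WiFi module 290 for WiFi communication with other devices, for example to obtain images or data transmitted by those devices.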
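From the foregoing description of the implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, the division into the foregoing functional modules is used only as an example. In practice, the functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to implement all or some of the functions described above. For the specific working processes of the systems, apparatuses, and units described above, refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
The functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in the embodiments of this application shall fall within the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.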

Claims (21)

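  1. A storage fragment management method, applied to a file system of a terminal, wherein the file system comprises at least one segment, and the method comprises:
    determining, by the terminal, a source segment in the file system based on an aging degree of segments and a valid block proportion of segments;
    determining, by the terminal, a target segment in the file system based on the aging degree of the source segment, wherein an aging degree of the target segment is consistent with the aging degree of the source segment; and
    moving, by the terminal, data of valid blocks in the source segment to free blocks of the target segment.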
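  2. The method according to claim 1, wherein the source segment is a segment in the file system whose aging degree is greater than or equal to a first threshold and whose valid block proportion is the smallest.
  3. The method according to claim 2, wherein the determining, by the terminal, a source segment in the file system based on an aging degree of segments and a valid block proportion of segments comprises:
    traversing, by the terminal, the segments in the file system to determine a first candidate set, wherein the aging degree of each segment in the first candidate set is greater than or equal to the first threshold; and
    determining, by the terminal, the segment with the smallest valid block proportion in the first candidate set as the source segment.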
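  4. The method according to claim 3, wherein the determining, by the terminal, the segment with the smallest valid block proportion in the first candidate set as the source segment comprises:
    when the number of segments in the first candidate set is less than or equal to a second threshold, determining, by the terminal, the segment with the smallest valid block proportion in the first candidate set as the source segment; or
    when the number of segments in the first candidate set is greater than the second threshold, removing, by the terminal, at least one segment with the lowest aging degree from the first candidate set until the number of segments in the first candidate set is less than or equal to the second threshold, and then determining the segment with the smallest valid block proportion in the first candidate set as the source segment.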
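  5. The method according to any one of claims 1 to 4, wherein that the aging degree of the target segment is consistent with the aging degree of the source segment comprises: the aging degree of the target segment is greater than or equal to the first threshold, and the aging degree of the target segment is within a set value interval, wherein the set value interval is generated based on the aging degree of the source segment.
  6. The method according to claim 5, wherein the target segment is the segment with the largest valid block proportion among the segments whose aging degree is greater than or equal to the first threshold and is within the set value interval.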
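  7. The method according to claim 6, wherein the determining, by the terminal, a target segment in the file system based on the aging degree of the source segment comprises:
    traversing, by the terminal, the segments in the file system to determine a second candidate set, wherein the aging degree of each segment in the second candidate set is greater than or equal to the first threshold;
    determining, by the terminal, a third candidate set from the second candidate set, wherein the aging degree of each segment in the third candidate set is within the set value interval; and
    selecting, by the terminal, the segment with the largest valid block proportion from the third candidate set as the target segment.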
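  8. The method according to any one of claims 5 to 7, wherein the center value of the set value interval is the aging degree of the source segment.
  9. The method according to any one of claims 1 to 8, wherein the determining, by the terminal, a source segment in the file system based on an aging degree of segments and a valid block proportion of segments comprises:
    when the number of free segments in the file system falls below a third threshold, determining, by the terminal, the source segment in the file system based on the aging degree of segments and the valid block proportion of segments; or
    periodically determining, by the terminal, the source segment in the file system based on the aging degree of segments and the valid block proportion of segments.
  10. The method according to any one of claims 1 to 9, wherein the file system is a log-structured file system (LFS).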
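  11. A terminal, comprising a processor and a memory, wherein
    the memory is configured to store one or more computer programs; and
    when the one or more computer programs stored in the memory are executed by the processor, the terminal is enabled to perform the following:
    determining a source segment in the file system based on an aging degree of segments and a valid block proportion of segments;
    determining a target segment in the file system based on the aging degree of the source segment, wherein an aging degree of the target segment is consistent with the aging degree of the source segment; and
    moving data of valid blocks in the source segment to free blocks of the target segment.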
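  12. The terminal according to claim 11, wherein the source segment is a segment in the file system whose aging degree is greater than or equal to a first threshold and whose valid block proportion is the smallest.
  13. The terminal according to claim 12, wherein when the one or more computer programs stored in the memory are executed by the processor, the terminal is further enabled to perform the following:
    traversing the segments in the file system to determine a first candidate set, wherein the aging degree of each segment in the first candidate set is greater than or equal to the first threshold; and
    determining the segment with the smallest valid block proportion in the first candidate set as the source segment.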
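  14. The terminal according to claim 13, wherein when the one or more computer programs stored in the memory are executed by the processor, the terminal is further enabled to perform the following:
    when the number of segments in the first candidate set is less than or equal to a second threshold, determining the segment with the smallest valid block proportion in the first candidate set as the source segment; or
    when the number of segments in the first candidate set is greater than the second threshold, removing at least one segment with the lowest aging degree from the first candidate set until the number of segments in the first candidate set is less than or equal to the second threshold, and then determining the segment with the smallest valid block proportion in the first candidate set as the source segment.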
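  15. The terminal according to any one of claims 11 to 14, wherein when the one or more computer programs stored in the memory are executed by the processor, the terminal is further enabled to perform the following:
    that the aging degree of the target segment is consistent with the aging degree of the source segment comprises: the aging degree of the target segment is greater than or equal to the first threshold, and the aging degree of the target segment is within a set value interval, wherein the set value interval is generated based on the aging degree of the source segment.
  16. The terminal according to claim 15, wherein the target segment is the segment with the largest valid block proportion among the segments whose aging degree is greater than or equal to the first threshold and is within the set value interval.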
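  17. The terminal according to claim 16, wherein when the one or more computer programs stored in the memory are executed by the processor, the terminal is further enabled to perform the following:
    traversing the segments in the file system to determine a second candidate set, wherein the aging degree of each segment in the second candidate set is greater than or equal to the first threshold;
    determining a third candidate set from the second candidate set, wherein the aging degree of each segment in the third candidate set is within the set value interval; and
    selecting the segment with the largest valid block proportion from the third candidate set as the target segment.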
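  18. The terminal according to any one of claims 15 to 17, wherein the center value of the set value interval is the aging degree of the source segment.
  19. The terminal according to any one of claims 11 to 18, wherein when the one or more computer programs stored in the memory are executed by the processor, the terminal is further enabled to perform the following:
    when the number of free segments in the file system falls below a third threshold, determining the source segment in the file system based on the aging degree of segments and the valid block proportion of segments; or
    periodically determining the source segment in the file system based on the aging degree of segments and the valid block proportion of segments.
  20. The terminal according to any one of claims 11 to 19, wherein the file system is a log-structured file system (LFS).
  21. A computer storage medium, wherein the computer storage medium comprises a computer program, and when the computer program runs on a terminal, the terminal is enabled to perform the method according to any one of claims 1 to 10.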
PCT/CN2018/093930 2018-06-30 2018-06-30 Storage fragment management method and terminal WO2020000492A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US17/257,015 US11842046B2 (en) 2018-06-30 2018-06-30 Storage fragment management method and terminal
EP18924374.4A EP3789883A4 (en) 2018-06-30 2018-06-30 MEMORY FRAGMENT MANAGEMENT PROCEDURE AND TERMINAL
CN201880048691.9A CN110945486B (zh) 2018-06-30 2018-06-30 Storage fragment management method and terminal
PCT/CN2018/093930 WO2020000492A1 (zh) 2018-06-30 2018-06-30 Storage fragment management method and terminal

Publications (1)

Publication Number Publication Date
WO2020000492A1 true WO2020000492A1 (zh) 2020-01-02

Family

ID=68984611

Country Status (4)

Country Link
US (1) US11842046B2 (zh)
EP (1) EP3789883A4 (zh)
CN (1) CN110945486B (zh)
WO (1) WO2020000492A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022017148A1 (zh) * 2020-07-21 2022-01-27 中兴通讯股份有限公司 文件系统管理方法、电子设备及存储介质
CN116701298A (zh) * 2022-11-22 2023-09-05 荣耀终端有限公司 一种文件系统管理方法及电子设备
US20230393976A1 (en) * 2022-06-01 2023-12-07 Micron Technology, Inc. Controlling variation of valid data counts in garbage collection source blocks

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114063886A (zh) * 2020-07-31 2022-02-18 伊姆西Ip控股有限责任公司 用于存储管理的方法、电子设备和计算机程序产品
CN116049113B (zh) * 2022-08-29 2023-10-20 荣耀终端有限公司 文件系统的整理方法、电子设备及计算机可读存储介质
CN116049021B (zh) * 2022-08-29 2023-10-20 荣耀终端有限公司 存储空间管理方法、电子设备及计算机可读存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389264A (zh) * 2014-08-29 2016-03-09 Emc公司 存储系统中垃圾收集的方法和系统
CN106293497A (zh) * 2015-05-27 2017-01-04 华为技术有限公司 瓦记录感知文件系统中垃圾数据的回收方法和装置
CN106502587A (zh) * 2016-10-19 2017-03-15 华为技术有限公司 磁盘数据管理方法和磁盘控制装置
US9767032B2 (en) * 2012-01-12 2017-09-19 Sandisk Technologies Llc Systems and methods for cache endurance
CN107533506A (zh) * 2015-05-06 2018-01-02 华为技术有限公司 日志结构化文件系统中的垃圾回收方法和设备
CN108139968A (zh) * 2015-10-19 2018-06-08 华为技术有限公司 确定垃圾收集器线程数量及活动管理的方法及设备

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100481025C (zh) 2007-02-08 2009-04-22 深圳万利达电子工业有限公司 一种nandflash文件系统实现方法
CN101339808B (zh) 2008-07-28 2011-02-09 华中科技大学 存储块的擦除方法及装置
CN101364166B (zh) * 2008-09-23 2011-02-02 杭州华三通信技术有限公司 将2048字节页的Nand Flash模拟成硬盘的方法和装置
US8392687B2 (en) * 2009-01-21 2013-03-05 Micron Technology, Inc. Solid state memory formatting
KR101867282B1 (ko) * 2011-11-07 2018-06-18 삼성전자주식회사 비휘발성 메모리 장치의 가비지 컬렉션 방법
US8898376B2 (en) * 2012-06-04 2014-11-25 Fusion-Io, Inc. Apparatus, system, and method for grouping data stored on an array of solid-state storage elements
KR102033323B1 (ko) * 2014-03-05 2019-10-17 한국전자통신연구원 플래시 메모리에서 사용하는 로그 구조 파일시스템의 메타데이터 저장 방법
CN104933169B (zh) 2015-06-29 2018-05-01 南开大学 基于热点文件优先的文件系统碎片整理方法
KR102501751B1 (ko) 2015-09-22 2023-02-20 삼성전자주식회사 메모리 콘트롤러, 불휘발성 메모리 시스템 및 그 동작방법
CN105095529A (zh) 2015-09-30 2015-11-25 深圳天珑无线科技有限公司 软件的垃圾内容清理方法、装置及终端
US20170139825A1 (en) * 2015-11-17 2017-05-18 HGST Netherlands B.V. Method of improving garbage collection efficiency of flash-oriented file systems using a journaling approach
KR102545067B1 (ko) 2016-03-02 2023-06-20 한국전자통신연구원 로그 구조 파일 시스템의 메타 데이터 저장 방법, 시스템 및 컴퓨터 판독 가능한 기록 매체
CN107145452A (zh) 2017-05-25 2017-09-08 努比亚技术有限公司 碎片整理的方法、终端设备及计算机可读存储介质
CN107885458B (zh) 2017-09-28 2021-03-30 努比亚技术有限公司 一种磁盘碎片的整理方法、终端和计算机可读存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3789883A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022017148A1 (zh) * 2020-07-21 2022-01-27 中兴通讯股份有限公司 文件系统管理方法、电子设备及存储介质
US20230393976A1 (en) * 2022-06-01 2023-12-07 Micron Technology, Inc. Controlling variation of valid data counts in garbage collection source blocks
US11947452B2 (en) * 2022-06-01 2024-04-02 Micron Technology, Inc. Controlling variation of valid data counts in garbage collection source blocks
CN116701298A (zh) * 2022-11-22 2023-09-05 荣耀终端有限公司 一种文件系统管理方法及电子设备
CN116701298B (zh) * 2022-11-22 2024-06-07 荣耀终端有限公司 一种文件系统管理方法及电子设备

Also Published As

Publication number Publication date
CN110945486B (zh) 2022-06-10
EP3789883A1 (en) 2021-03-10
CN110945486A (zh) 2020-03-31
EP3789883A4 (en) 2021-05-12
US11842046B2 (en) 2023-12-12
US20210223958A1 (en) 2021-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18924374

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018924374

Country of ref document: EP

Effective date: 20201201

NENP Non-entry into the national phase

Ref country code: DE