CN114503083A - Backup system and method for improving durability of storage medium thereof

Publication number
CN114503083A
Authority
CN
China
Prior art keywords
data items
backup
storage medium
data
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080068994.4A
Other languages
Chinese (zh)
Inventor
阿萨夫·纳塔逊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/203 Failover techniques using migration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72 Details relating to flash memory management
    • G06F2212/7211 Wear leveling

Abstract

The present invention provides a method for improving the endurance of a storage medium of a backup system, and a backup system including a storage medium with improved endurance. Endurance is improved by reducing the write amplification of the storage medium. The method for reducing the write amplification includes: grouping the data according to the change frequency and the minimum storage time of the data of the host device; and storing the grouped data in the blocks of the storage medium of the backup system according to the remaining life of each block.

Description

Backup system and method for improving durability of storage medium thereof
Technical Field
The present invention relates generally to the field of secondary storage systems, and more particularly to a backup system and method for improving the endurance of the storage media of the backup system.
Background
Nonvolatile memory has been used as memory in computers and portable information devices. Recently, a Solid State Drive (SSD) using a NAND flash memory has been widely used in computers as a substitute for a Hard Disk Drive (HDD). The SSD has features of low power consumption and high performance, and thus is used as a main memory of various computers. SSDs are considered to have great potential for entering areas of use traditionally thought to be limited to HDDs.
A typical SSD includes a plurality of data blocks that store data. In an SSD, when new data is to be written into a data block, the existing data of the data block cannot simply be overwritten. Instead, the other data of the data block (which is not being overwritten) is first copied into a different data block on the SSD, then all data of the data block is erased, and finally the new data is written into the data block. Therefore, when new data is to be rewritten into a data block, the amount of data rewritten is very large compared to the size of the new data. In other words, the erase granularity of an SSD is much larger than the granularity of the blocks being written.
For example, suppose the size of the new data is 4 kilobytes (KB) and the size of an erase block in the SSD storing the data is 100 KB. The write granularity is small (e.g., 4 KB), but the erase granularity is large. If the data is written to a blank location, only 4 KB is written. If old data is to be overwritten, however, the full 100 KB must first be read, the old 100 KB erased, and the 100 KB written back with the 4 KB of new data in place. Therefore, to store 4 KB of new data, 25 times as much data is written. The SSD thus has high write amplification. This makes writing data to an SSD time consuming and resource intensive, and greatly reduces the endurance of the SSD.
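The arithmetic in this example can be checked with a short sketch. The function name `write_amplification` and its parameters are illustrative assumptions, not terms from the patent; the model simply divides the erase block size by the size of the new data, as the paragraph above does.

```python
def write_amplification(new_data_kb: float, erase_block_kb: float) -> float:
    """Ratio of data physically written to data logically written
    when new data overwrites part of an already-written erase block.

    Simplified model (an assumption for illustration): the whole erase
    block is rewritten in order to replace `new_data_kb` of its content.
    """
    if new_data_kb <= 0:
        raise ValueError("new_data_kb must be positive")
    if erase_block_kb < new_data_kb:
        raise ValueError("erase block must be at least as large as the new data")
    return erase_block_kb / new_data_kb


# The example from the text: 4 KB of new data in a 100 KB erase block.
print(write_amplification(4, 100))  # 25.0
```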
Recently, techniques that employ Flash Translation Layer (FTL) tables and garbage collection mechanisms have been used to reduce the write amplification associated with SSDs. With an FTL table, new data can be written anywhere in the SSD, because the logical address differs from the physical address; therefore, when a logical address is rewritten, the same physical address does not have to be rewritten, as it would with direct address mapping. In addition, the logical-to-physical addressing mechanism identifies and groups new data into hot data (i.e., frequently changing data) and cold data (i.e., static data), so that frequently changing data is grouped together in a single erase block, meaning that most of the data in that erase block will typically be overwritten (i.e., written to another location, because of the FTL). Thus, the garbage collection mechanism erases a data block only after a large portion of its data has been overwritten with new data, thereby reducing write amplification.
When a new block is to be written and the SSD has free space, the block is written immediately. If the entire disk has been written, an erase block (much larger than the block being written) must be erased, and it may contain both data that is still required and data that has been overwritten. The system reads the required data, erases the block, and writes back the required data together with the new block, leaving the space freed by the erase available. In the ideal case, most of the data in the erase block has already been overwritten, so little or no data needs to be written again after the erase. Write amplification is the amount of data that the system must write again because the block still contains required data. Grouping frequently changing data together in a single erase block ensures that most of the data in the erase block will have been overwritten before the block needs to be erased. This means that when a block is rewritten, only a small amount of data needs to be copied, thereby reducing the write amplification.
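The effect of hot/cold grouping on garbage collection can be illustrated with a commonly used simplified model, which is not taken from the patent: if a fraction u of an erase block is still required when the block is reclaimed, that fraction must be copied, so freeing (1 - u) of the block costs one full block of writes. The function name and the model itself are assumptions for illustration.

```python
def gc_write_amplification(valid_fraction: float) -> float:
    """Write amplification when reclaiming an erase block in which
    `valid_fraction` of the data is still required and must be copied.

    Simplified model (assumption): 1 / (1 - u). With u = 0 nothing is
    copied and amplification is 1; as u approaches 1 it grows without bound.
    """
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    return 1.0 / (1.0 - valid_fraction)


# Blocks holding only hot data are mostly overwritten before erase (low u);
# blocks mixing hot and cold data keep much still-required data (high u).
for u in (0.0, 0.5, 0.75):
    print(f"still-required fraction {u}: amplification {gc_write_amplification(u)}")
```

Under this model, an erase block whose contents have all been overwritten costs no extra writes, which is the ideal case described above.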
A great deal of research has been conducted on separating data into hot and cold data. One way to achieve this is to separate the data based on the application ID, knowing which processes produce hot data and which produce cold data. However, an application rarely knows its own future behavior explicitly, so it may be difficult to tell in advance whether the data it generates will be hot or cold.
Typically, a computing system (e.g., a host device) includes primary storage (e.g., host storage) to store data related to the computing system. It is common practice to create a backup of the primary storage to protect data and allow such data to be recovered in the event of a data loss. Examples of data loss events include, but are not limited to, data corruption, hardware or software failure in the primary storage device, accidental deletion of data, hacking, or malicious attack. A backup system is employed for this purpose. More and more backup systems use SSDs to meet their storage requirements. However, SSDs with high write amplification may not be suitable as storage media for backup systems.
In view of the above discussion, in order for SSDs to be adopted more widely as storage media, it becomes crucial to reduce their high write amplification.
Disclosure of Invention
The present invention is directed to a method for improving the endurance of a storage medium of a backup system and to a backup system including a storage medium with improved endurance. The present invention is directed to provide a technical solution to solve the existing problem of low data backup efficiency caused by high write amplification rate in the conventional backup system including a storage medium. It is an object of the present invention to provide a solution that at least partly solves the problems encountered in the prior art and to provide an improved method and system for reducing the write amplification of a storage medium and for increasing the data backup efficiency of a backup system.
The object of the invention is achieved by the solution presented in the attached independent claims. Advantageous implementations of the invention are further defined in the dependent claims.
In one aspect, the present invention provides a method for improving the endurance of a storage medium of a backup system, wherein a backup of one or more data items of a host device is stored in one or more blocks of the storage medium of the backup system. The method comprises the following steps: the backup agent detects a frequency of change of one of the one or more data items in the host device; a backup layer of the backup system receiving the detected frequency of change; the backup layer of the backup system obtaining a minimum storage time for each backup of the one or more data items from a backup policy stored in a backup policy repository; providing the detected change frequency and the minimum storage time for each backup of the one or more data items retrieved from the backup layer to the storage medium to group data items stored in the one or more blocks of the storage medium; grouping the data items in the storage medium according to the provided detected change frequency and the obtained minimum storage time of each backup of the one or more data items; estimating a remaining life of each of the one or more blocks of the storage medium; moving the grouped data items into the blocks according to the estimated remaining life of each block.
The method of the present invention provides an efficient way of classifying the data items by grouping the data items into hot data items (i.e. frequently changing data items) and cold data items (i.e. infrequently changing data items) according to the determined frequency of change. Advantageously, the grouping of the data items enables the data items to be stored in the one or more blocks of the storage medium such that a majority of the data items are rewritten to a given block at a time, such that the amount of writing of data items in a block is nearly equal to the size of the data items to be rewritten. This is in contrast to conventional techniques in which there are fewer data items to be overwritten at a time, and thus the amount of writes performed on a block is many times the size of the data items to be overwritten. Thus, the erase granularity of the storage medium is almost equal to the block granularity, which results in a significant reduction in write amplification of the storage medium compared to conventional techniques. Furthermore, this enables the data items to be stored in the respective blocks according to the remaining life of each block, and therefore, these blocks can be used for a longer period of time than in the conventional art.
In one implementation, determining the change frequency includes: determining a time period of a frequency of changes in the one or more data items in the host device.
Embodiments of the present invention, when implemented on the backup system, are able to determine a time period of a frequency of changes of the one or more data items in the host device. In addition, the present embodiment can efficiently group the data items into hot data items and cold data items, and thus the data items can be stored in respective blocks, which significantly reduces the write amplification of the backup system compared to the conventional art.
In one implementation, obtaining the minimum storage time of each backup includes: obtaining a storage time of each backup of the one or more data items in the storage medium, wherein the storage time is greater than the change frequency of the one or more data items in the host device.
When the embodiment of the invention is implemented on the backup system, the storage time of each backup of the one or more data items in the storage medium can be determined. This information about the minimum storage time enables the grouped data items to be stored in individual blocks, such that most data items are rewritten to a given block at a given time. Therefore, the write amplification of the backup system is reduced compared to the conventional art.
In one implementation, detecting the frequency of change of the data items is performed by detecting a type of the data items of the one or more data items in the host device.
By detecting the type of the data item, the time at which the data item will next be changed, and will therefore need to be rewritten to the storage medium, can be determined. With this information, the data items can be grouped such that the write amplification of the storage medium is reduced.
In one implementation, the grouping of the data items is performed by grouping data items for which the minimum storage time of the backups is the same.
By grouping data items having the same minimum storage time, most of the data items can be rewritten into a given block at the same time, and therefore, the write amplification of the backup system is reduced as compared with the conventional art.
In one implementation, the grouping of the data items is performed by grouping data items that have not changed during the backup of the storage medium.
By grouping data items that have not changed during backup, cold data items are identified and grouped for storage in individual blocks to reduce write amplification.
In one implementation, estimating the remaining lifetime of each of the blocks is performed by identifying a maximum number of possible rewrites for each of the one or more blocks and a number of executed rewrites for each of the one or more blocks.
By estimating the maximum possible number of rewrites per block of the one or more blocks and the number of times that rewrites have been performed per block of the one or more blocks, the remaining lifetime of each block is estimated, and thus, a suitable set of data items can be stored in each block, i.e., cold data items are stored in each block for which the estimated remaining lifetime is shortest, and hot data items are stored in each block for which the estimated remaining lifetime is longest, so that the overall durability of the storage medium is improved.
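A minimal sketch of this estimate and placement rule follows. It assumes that the remaining life of a block is the difference between its maximum possible rewrites and its executed rewrites, and that each group is summarized by a single change period in hours; the names `Block` and `place_groups` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    max_rewrites: int   # maximum number of possible rewrites (assumed known)
    rewrites_done: int  # number of rewrites already executed

    @property
    def remaining_life(self) -> int:
        # Remaining life estimated as rewrites still available, never negative.
        return max(self.max_rewrites - self.rewrites_done, 0)


def place_groups(blocks, groups):
    """Map each group to a block: coldest group to the most worn block.

    `groups` is a list of (name, change_period_hours); a longer change
    period means colder data. Returns {group name: block_id}.
    """
    by_wear = sorted(blocks, key=lambda b: b.remaining_life)        # shortest life first
    by_coldness = sorted(groups, key=lambda g: g[1], reverse=True)  # coldest first
    return {name: blk.block_id for (name, _), blk in zip(by_coldness, by_wear)}


blocks = [Block(0, 3000, 2900), Block(1, 3000, 100), Block(2, 3000, 1500)]
groups = [("hourly", 1), ("monthly", 720), ("daily", 24)]
print(place_groups(blocks, groups))
```

Here the monthly (coldest) group lands on block 0, which has the fewest rewrites left, and the hourly (hottest) group lands on block 1, which has the most.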
In one implementation, the method further comprises: detecting the block with the shortest estimated remaining life.
By detecting the blocks for which the estimated remaining lifetime is the shortest, suitable sets of data items that do not change often (i.e., cold data items) are stored only in these blocks. Thus, the lifetime of such blocks and the overall durability of the storage medium is significantly improved.
In another aspect, the present invention provides a backup system including a storage medium having improved durability. A backup of one or more data items of a host device is stored in one or more blocks of the storage medium, the backup system comprising: a backup agent to detect a frequency of change of one of the one or more data items in the host device; a backup layer to communicate with the host device to: receiving the detected change frequency of the data item of the one or more data items, receiving a minimum storage time of each backup of the one or more data items from a backup policy store, and providing the detected change frequency and the obtained minimum storage time of each backup of the one or more data items to the storage medium to group the data items; the storage medium is to: grouping the data items according to the provided change frequency and a minimum storage time for each backup of the one or more data items, estimating a remaining lifetime of each of the one or more blocks, and moving the grouped data items into the blocks according to the estimated remaining lifetimes.
By grouping the data items into hot data items (i.e., frequently changing data items) and cold data items (i.e., infrequently changing data items) based on the determined frequency of change, the backup system of the present invention provides an efficient way to sort the data items. Advantageously, the grouping of the data items enables the data items to be stored in the one or more blocks of the storage medium such that a majority of the data items are rewritten to a given block at a time, such that the amount of writing of data items in a block is nearly equal to the size of the data items to be rewritten. This is in contrast to conventional techniques in which there are fewer data items to be overwritten at a time, and thus the amount of writes performed on a block is many times the size of the data items to be overwritten. Thus, when the backup system determines where to write the next block, it first identifies which blocks can be erased by identifying blocks that are completely or almost completely overwritten. This results in a significant reduction of the write amplification of the storage medium compared to the prior art. Furthermore, this enables the data items to be stored in the respective blocks according to the remaining life of each block, and therefore, these blocks can be used for a longer period of time than in the conventional art.
In yet another aspect, the invention provides a computer program. The computer program is adapted to perform the method of the first aspect when executed on a backup system.
The computer program achieves all the advantages and effects of the method of the invention.
It is understood that all implementations discussed above may be combined together. It should be noted that all devices, elements, units and modules described in the present invention may be implemented in software elements or hardware elements or any type of combination thereof. All steps performed by the various entities described in this disclosure and the functions described to be performed by the various entities are intended to mean that the various entities are adapted or configured to perform the respective steps and functions. Although in the following description of specific embodiments specific functions or steps performed by an external entity are not reflected in the description of specific elements of the entity performing the specific steps or functions, it should be clear to a skilled person that these methods and functions may be implemented in respective hardware elements or software elements or any type of combination thereof. It will be appreciated that features of the invention are susceptible to being combined in various combinations without departing from the scope of the invention as defined by the accompanying claims.
Other aspects, advantages, features and objects of the present invention will become apparent from the accompanying drawings and from the detailed description of illustrative embodiments, which is to be construed in conjunction with the following appended claims.
Drawings
The foregoing summary, as well as the following detailed description of embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, there is shown in the drawings exemplary constructions of the invention. However, the present invention is not limited to the specific methods and instrumentalities disclosed herein. Further, those skilled in the art will appreciate that the drawings are not drawn to scale. Similar elements are denoted by the same numerals whenever possible.
Embodiments of the invention will now be described, by way of example only, with reference to the following drawings.
Fig. 1 is a flowchart of a method for improving the endurance of a storage medium of a backup system according to an embodiment of the present invention.
Fig. 2A is a network environment diagram of a system for improving the durability of a storage medium of a backup system according to an embodiment of the present invention.
Fig. 2B is a block diagram of various exemplary components of a host device, according to an embodiment of the invention.
Fig. 2C is a block diagram of various exemplary components of a backup system according to an embodiment of the present invention.
Fig. 3 is a diagram of a system for improving the endurance of storage media of a backup system according to an embodiment of the present invention.
In the drawings, an underlined number is used to indicate the item over which the underlined number is positioned or an item adjacent to the underlined number. A non-underlined number refers to an item identified by a line that connects the non-underlined number to the item. When a number is not underlined and is accompanied by an associated arrow, the non-underlined number identifies the general item at which the arrow points.
Detailed Description
The following detailed description illustrates embodiments of the invention and the manner in which they may be practiced. Although some modes for carrying out the invention have been disclosed, those skilled in the art will recognize that other embodiments for carrying out or implementing the invention are possible.
Fig. 1 is a flowchart of a method 100 for improving the endurance of a storage medium of a backup system according to an embodiment of the present invention. In one aspect, the present invention provides the method 100 for improving the endurance of a storage medium of a backup system, wherein a backup of one or more data items of a host device is stored in one or more blocks of the storage medium of the backup system. The method 100 comprises: the backup agent detects a frequency of change of one of the one or more data items in the host device; a backup layer of the backup system receiving the detected frequency of change; the backup layer of the backup system obtaining a minimum storage time for each backup of the one or more data items from a backup policy stored in a backup policy repository; providing the detected change frequency and the minimum storage time for each backup of the one or more data items retrieved from the backup layer to the storage medium to group data items stored in one or more blocks of the storage medium; grouping the data items in the storage medium according to the provided detected change frequency and the obtained minimum storage time of each backup of the one or more data items; estimating a remaining life of each of the one or more blocks of the storage medium; moving the grouped data items into the blocks according to the estimated remaining life of each block.
The method 100 is performed in a backup system of a host device; the host device and the backup system are described with reference to Figs. 2A to 2C. The storage medium of the backup system is used to receive and create backups. A backup of one or more data items of the host device is stored in one or more blocks of the storage medium of the backup system. The backup of the one or more data items enables data to be restored or recovered if data is lost in the host device due to data corruption, hardware or software failure in the host device, accidental deletion of data, hacking, or malicious attack. The method 100 includes steps S1, S2, S3, S4, S5, S6, and S7. Steps S1 to S7 may be performed in any reasonable order to achieve the goals of the disclosed embodiments. The flowchart of Fig. 1 and its corresponding description do not necessarily imply a particular order for the disclosed steps of the method 100, unless a particular method step is a necessary prerequisite to performing another method step. The individual method steps may be performed simultaneously or nearly simultaneously, in sequence or in parallel.
In step S1, the method 100 includes: the backup agent 228 detects a frequency of change in the host device 202 of one of the one or more data items 222. As used herein, the term "data item" refers to data, such as text files, images, video, audio, etc., associated with one or more software products that may be executed on the host device, and may also be a block or collection of blocks in a block store when the backup is a block-level backup of the host system. Such one or more data items are stored in one or more blocks on the host storage device. In one example, the data stored in the host storage device is created or changed in the host device by one or more software products, such as picture editing software, data compression software, data encryption software, and the like. Further, the change frequency of one data item indicates the time when the data item is changed next in the host device. In other words, the frequency of change is the expected lifetime or expected endurance of the data item in the host device. For example, the frequency of change of one data item may be one hour, one day, one week, one month, six months, or one year. Further, herein, the backup agent refers to software implemented on the host device for accessing one or more data items and performing functions related to the backup of data items. In addition, the backup agent also acts as a communication agent or interface between the host device and the backup system. The backup agent is installed and activated on the host device. Upon installation of the backup agent, the backup agent is integrated with several software products in the host device to be able to perform its functions, e.g. to detect the changing frequency of one or more data items created by these software products.
In step S2, the method 100 further includes: backup layer 212 of backup system 204 receives the detected frequency of change. Here, the backup layer refers to a software component or a hardware component or a combination of the software component and the hardware component of the backup system, and is used for performing communication between the backup system and the host device through a communication network and the like, and further capable of creating a backup of data items in the storage medium of the backup system, as described in detail with reference to fig. 2A to 2C. The backup layer is to serve as an interface between the storage media of the backup system and the host storage device of the host device. The detected frequency of change is received by the backup layer over the communication network.
In step S3, the method 100 further includes: the backup layer 212 of the backup system obtains a minimum storage time for each backup of the one or more data items from the backup policies stored in the backup policy repository 210. The term "minimum storage time" refers to the minimum time for which a backup held in a block of the one or more blocks of the storage medium is not overwritten or replaced with a new data item. In one example, the minimum storage time may range from six months to one year for a monthly backup performed by the backup system. That is, data items backed up monthly are not deleted for six months to a year. In another example, the minimum storage time may range from 12 hours to 24 hours for hourly or short-term backups performed by the backup system. The backup policy repository stores the retention policies for backed-up data. For example, a data retention policy in the backup policy may include: an hourly backup retained for one day, a daily backup retained for one week, a weekly backup retained for one month, and a monthly backup retained for one year. It will be appreciated that, in general, the lower the frequency with which backups are performed, the longer the minimum storage time. Frequently changing data (i.e., hot data) is backed up more frequently and, therefore, such data is also deleted more frequently. Conversely, cold data that does not change frequently is not deleted frequently because it has a long storage time. Furthermore, it will be appreciated that, typically for short-term backups, the minimum storage time may be determined by evaluating the average lifetime of each block using known standard techniques.
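The example retention policy can be sketched as a simple lookup from backup interval to minimum storage time. The table name `RETENTION_POLICY`, the function `minimum_storage_time`, and the approximations of one month as 30 days and one year as 365 days are assumptions made for illustration; the patent does not specify how the backup policy repository is represented.

```python
from datetime import timedelta

# Backup interval -> retention (minimum storage time), following the
# example policy in the text. Month and year lengths are approximations.
RETENTION_POLICY = {
    timedelta(hours=1): timedelta(days=1),    # hourly backup, kept one day
    timedelta(days=1): timedelta(weeks=1),    # daily backup, kept one week
    timedelta(weeks=1): timedelta(days=30),   # weekly backup, kept one month
    timedelta(days=30): timedelta(days=365),  # monthly backup, kept one year
}


def minimum_storage_time(backup_interval: timedelta) -> timedelta:
    """Return the retention configured for a given backup interval."""
    try:
        return RETENTION_POLICY[backup_interval]
    except KeyError:
        raise KeyError(f"no retention rule for interval {backup_interval}") from None


print(minimum_storage_time(timedelta(hours=1)))
```

In this table, the less frequently a backup is taken, the longer it is retained, which matches the general rule stated above.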
In step S4, the method 100 further includes: the detected frequency of change and the obtained minimum storage time for each backup of the one or more data items are provided by backup layer 212 to storage medium 214 for grouping data items stored in one or more blocks of the storage medium. Evaluating the change frequency and the minimum storage time allows data items to be classified more efficiently as hot data items and cold data items, and such information can be used to efficiently store data items in one or more blocks of the storage medium (described in detail later). Here, the backup layer provides the change frequency and the minimum storage time to the storage medium in the form of a new protocol for writing the one or more data items to one or more blocks of the storage medium. In one example, the format of the protocol is represented in function (1).
write(offset, size, data, minimum durability, expected durability) (1)
where:
'offset' indicates the location of the block in the storage medium where the data is to be stored,
'size' refers to the storage size of a block in the storage medium,
'data' refers to the data item to be stored in the storage medium,
'minimum durability' refers to the minimum time during which a data item in a block of the storage medium is not overwritten, and
'expected durability' refers to the time at which the data item is expected to be next rewritten in a block of the storage medium.
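By way of a non-limiting illustration, the parameters of function (1) may be carried in a single record. The class name `WriteRequest`, the field types, and the use of time spans for both durability values are assumptions made for this sketch only; the text does not specify an encoding.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class WriteRequest:
    """Hypothetical container for the five parameters of function (1)."""
    offset: int                      # location of the target block
    size: int                        # storage size of the block, in bytes
    data: bytes                      # data item to be stored
    minimum_durability: timedelta    # minimum time before an overwrite
    expected_durability: timedelta   # expected time until the next rewrite


# Example: an item retained for at least one hour, expected to change yearly.
request = WriteRequest(
    offset=4096,
    size=4096,
    data=b"example payload",
    minimum_durability=timedelta(hours=1),
    expected_durability=timedelta(days=365),
)
```

Keeping both values on the write request lets the storage medium decide block placement without a separate query to the backup layer.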
In step S5, the method 100 further includes: grouping the data items in the storage medium according to the provided detected change frequency and the obtained minimum storage time of each backup of the one or more data items. The change frequency and the minimum storage time provide an expected time for a next overwrite of a data item into the one or more blocks of the storage medium. In the present embodiment, the term "grouping" herein refers to assigning locations to data items in the one or more blocks of the storage medium, such that data items having similar characteristics, such as frequency of change and minimum storage time, may be assigned the same blocks of the storage medium, if possible.
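By way of a non-limiting illustration, the grouping of step S5 may be sketched as bucketing data items by their two characteristics. The function name `group_items` and the tuple layout are hypothetical. Exact equality of the two values stands in for "similar characteristics", which is a simplification made for this sketch.

```python
from collections import defaultdict


def group_items(items):
    """Group data items sharing a change period and minimum storage time.

    items: iterable of (name, change_period_hours, min_storage_hours).
    Returns a dict mapping (change_period_hours, min_storage_hours)
    to the list of item names in that group.
    """
    groups = defaultdict(list)
    for name, change_period_hours, min_storage_hours in items:
        groups[(change_period_hours, min_storage_hours)].append(name)
    return dict(groups)


example = [
    ("app.log", 1, 24),
    ("session.tmp", 1, 24),
    ("archive.tar", 8760, 8760),
]
grouped = group_items(example)
```

Items in the same group would then be candidates for the same block, so that they tend to be overwritten together.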
In step S6, the method 100 further includes: estimating a remaining life of each of the one or more blocks of the storage medium. As mentioned above, each block of the storage medium can only be rewritten a limited number of times. Thus, each rewrite shortens the lifetime of each block over time. The remaining lifetime of a particular block in the storage medium represents the number of further possible overwrites of the block and depends on the type of the storage medium and the number of overwrites that have occurred for the particular block of the storage medium.
In step S7, the method 100 further includes: moving the grouped data items into the blocks according to the estimated remaining life of each block. In other words, the grouped data items are written to a given block based on the estimated remaining life of the given block. In the present example, hot data items (i.e., frequently changing data items) are written to respective blocks having relatively long remaining lifetimes, while cold data items (i.e., infrequently changing data items) are written to respective blocks having relatively short remaining lifetimes. Therefore, the respective blocks having a relatively short remaining life are rewritten less frequently than other blocks having a relatively long remaining life. This results in an extended overall life of the storage medium, which improves the durability of the storage medium. Advantageously, grouping the data items and then storing the data items in respective blocks ensures that when a new data item is to be written in the same block, most of the data items in the block are overwritten at the same time, in contrast to conventional techniques in which writing even a small amount of data may result in overwriting an entire block. Therefore, the write amplification of the storage medium of the present invention is significantly reduced compared to storage media implementing conventional techniques.
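By way of a non-limiting illustration, the placement of step S7 may be sketched as pairing the most frequently changing groups with the blocks that have the longest remaining life. The function name `place_groups` and the one-group-per-block pairing are assumptions made for this sketch; an actual storage controller may apply a different policy.

```python
def place_groups(group_change_period, block_remaining_life):
    """Assign groups to blocks: hottest group -> longest-lived block.

    group_change_period: dict of group name -> expected change period (hours).
    block_remaining_life: dict of block id -> estimated remaining rewrites.
    Returns a dict of group name -> block id.
    """
    # Shortest change period first, i.e. hottest group first.
    hot_first = sorted(group_change_period, key=group_change_period.get)
    # Most remaining rewrites first.
    longest_life_first = sorted(
        block_remaining_life, key=block_remaining_life.get, reverse=True
    )
    return dict(zip(hot_first, longest_life_first))


placement = place_groups(
    {"logs": 1, "archive": 8760, "database": 24},
    {0: 100, 1: 2900, 2: 1500},
)
```

Here the hourly-changing group lands on the block with 2900 remaining rewrites, while the yearly-changing group lands on the block with only 100.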
According to one embodiment, determining the change frequency comprises: determining a period of time of a frequency of the one or more data items changing in the host device. Here, the time period represents a duration before the data item is changed in the host device. The time period of the frequency of change of data items is determined by the backup agent. It will be appreciated that data items relating to different software products may have different periods of variation. In one example, data items associated with one software product may change once per hour, while data items associated with another software product may change once per month. Because the backup agent is integrated with the software products on the host device, the backup agent may determine a time period corresponding to the frequency of data item changes by evaluating the behavior of data items created by each software product over a period of time.
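By way of a non-limiting illustration, a backup agent might estimate the time period from the times at which a data item was observed to change. The function name `estimate_change_period` and the use of a simple mean of the intervals are assumptions made for this sketch; the text does not prescribe an estimator.

```python
from statistics import mean


def estimate_change_period(change_times):
    """Estimate the average time between changes of a data item.

    change_times: ascending timestamps (in seconds) of observed changes.
    Returns the mean interval in seconds, or None if fewer than two
    changes have been observed.
    """
    if len(change_times) < 2:
        return None
    intervals = [later - earlier
                 for earlier, later in zip(change_times, change_times[1:])]
    return mean(intervals)
```

For example, changes observed at 0 s, 3600 s and 7200 s yield an estimated period of 3600 s, i.e. one change per hour.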
According to one embodiment, obtaining the minimum storage time for each backup comprises: obtaining a storage time of each backup of the one or more data items in the storage medium, wherein the storage time is greater than the change frequency of the one or more data items in the host device. As discussed above, the backup policy repository stores the retention policy for backup data, including information regarding the storage time of backups of the one or more data items in the storage medium. By obtaining such information, a minimum storage time for each backup may be determined. The determined minimum storage time and the information about the corresponding change frequency enable data items to be grouped for writing into one or more blocks of the storage medium such that a large part of the data items in a given block are rewritten at the same time, resulting in fewer rewrites and thus reduced write amplification of the storage medium.
According to one embodiment, the change frequency of the data items is detected by detecting the type of a data item of the one or more data items in the host device. The type of a data item refers herein to the characteristics of the data item that affect when the data item may next change, which in turn may depend on the software product that created the data item. For example, a temporary data file created by a software product may last for a shorter period than a system file of the host device. Furthermore, in one example, one type of data item may change in the host device every day, while another type may change only after weeks or months, so a pattern in the frequency of change may be determined from the different data types. Thus, the frequency of change of a data item may be determined from its type. The backup system is configured to group the data items according to the minimum storage time obtained from the backup policy and an expected time that depends on the minimum storage time and the change frequency. For example, if a data item has a minimum storage time of one hour but is expected to change only about once a year, the data item is likely to be cold data. The backup system may group such cold data items together, but may attempt to separate them from data items with a minimum storage time of one year, since the cold data items may change sooner.
According to one embodiment, the grouping of the data items is performed by grouping data items for which the minimum storage time of the backup is the same. The minimum storage time may be considered as a good parameter for grouping the data items, since the minimum storage time represents the minimum time a data item may not be overwritten into a block of the storage medium. In one example, data items with a minimum storage time of one hour are grouped together, data items with a minimum storage time of one day are grouped together, data items with a minimum storage time of one week are grouped together, and data items with a minimum storage time of one month are grouped together. This makes the grouping of data items efficient because the data items can be classified into hot data items, which refer to data items that change frequently, and cold data items, which are static data items that change after a considerably longer time than the hot data items.
According to one embodiment, the grouping of the data items is performed by grouping data items that have not changed during the backup in the storage medium. It will be appreciated that data items that may not have changed during a backup in the storage medium may be determined to have a lower frequency of change. In addition, data items that may not change between two backups in the storage medium may be determined to have a lower frequency of change. Therefore, it may be relatively simple to mark these data items as cold and group them together accordingly.
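By way of a non-limiting illustration, data items that did not change between two backups may be found by comparing content digests. The function name `unchanged_items` and the use of SHA-256 digests are assumptions made for this sketch; the text does not state how unchanged items are detected.

```python
import hashlib


def unchanged_items(previous_backup, current_backup):
    """Return the names of items whose content is identical in both backups.

    previous_backup, current_backup: dicts of item name -> content bytes.
    Items present in only one of the backups are not reported.
    """
    def digest(content):
        return hashlib.sha256(content).hexdigest()

    common_names = previous_backup.keys() & current_backup.keys()
    return sorted(
        name for name in common_names
        if digest(previous_backup[name]) == digest(current_backup[name])
    )


cold_candidates = unchanged_items(
    {"config.ini": b"a=1", "app.log": b"line1"},
    {"config.ini": b"a=1", "app.log": b"line1\nline2"},
)
```

The returned items are candidates for being marked as cold and grouped together.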
According to one embodiment, estimating the remaining lifetime of each of the blocks is performed by identifying: a maximum possible number of rewrites for each of the one or more blocks and a number of executed rewrites for each of the one or more blocks.
That is, estimating the remaining life of each block is performed by identifying a maximum possible number of rewrites for each of the one or more blocks. In general, each block in the storage medium may have a different maximum possible number of rewrites. Further, estimating the remaining life of each block is performed by identifying the number of rewrites that have already been performed for each of the one or more blocks. It will be appreciated that, in general, each block of the storage medium may initially have the same lifetime and may support the same number of rewrites over its lifetime. Over time, however, one block may be rewritten more times than other blocks. Each block may have previously stored and overwritten one or more data items, which reduces the number of rewrites still available. Thus, each block may have a different number of rewrites remaining, and a block that has exhausted its rewrites cannot be rewritten again. In one example, the number of rewrites remaining for a given block may depend on the number of rewrites that have occurred on the given block. In general, the remaining life of a block decreases as the number of rewrites already performed increases.
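By way of a non-limiting illustration, the remaining life of a block may be estimated as the difference between the two identified quantities. The function name `remaining_life` and the clamping at zero are assumptions made for this sketch.

```python
def remaining_life(max_rewrites: int, performed_rewrites: int) -> int:
    """Estimate how many more times a block can be rewritten.

    max_rewrites: maximum possible number of rewrites for the block.
    performed_rewrites: number of rewrites already performed on the block.
    The result is clamped at zero for blocks past their rated limit.
    """
    return max(max_rewrites - performed_rewrites, 0)
```

For example, a block rated for 3000 rewrites that has already been rewritten 2500 times has an estimated 500 rewrites left.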
According to one embodiment, the method further comprises: detecting the blocks with the shortest estimated remaining life. The blocks with the shortest estimated remaining life are those blocks in which most of the maximum possible number of rewrites have already occurred. These blocks are detected so that only data items suited to their remaining life are stored in them. Once the blocks with the shortest estimated remaining life are detected, only groups of data items that do not change often are stored in these blocks. That is, the cold data items are stored in the blocks whose estimated remaining life is shortest, and the hot data items are stored in other blocks whose estimated remaining life is relatively long. In this way, the lifetime of the blocks is significantly extended, thereby improving the durability of the storage medium.
The method of the present invention provides an efficient way of grouping data items into hot data items (i.e., frequently changing data items) and cold data items (i.e., infrequently changing data items) according to the determined frequency of change. Advantageously, the grouping of the data items enables the data items to be stored in the one or more blocks of the storage medium such that a majority of the data items in a given block are rewritten at the same time, so that the amount of data written to a block is nearly equal to the size of the data items to be rewritten. This is in contrast to conventional techniques in which fewer data items are overwritten at a time, and thus the amount of data written to a block is many times the size of the data items to be overwritten. Thus, when the backup system starts to determine where to write the next block, it first identifies which blocks can be erased by identifying blocks that are completely or almost completely overwritten. This results in a significant reduction of the write amplification of the storage medium compared to the prior art. Furthermore, this enables the data items to be stored in the respective blocks according to the remaining life of each block, and therefore, these blocks can be used for a longer period of time than in the conventional art.
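By way of a non-limiting illustration, the effect described above can be expressed with the usual definition of write amplification, namely the amount of data physically written divided by the amount of data that was actually changed. The block size and the amounts of changed data below are hypothetical figures chosen for the arithmetic only.

```python
def write_amplification(physical_bytes_written: int,
                        logical_bytes_written: int) -> float:
    """Ratio of bytes physically written to bytes logically changed."""
    if logical_bytes_written <= 0:
        raise ValueError("logical_bytes_written must be positive")
    return physical_bytes_written / logical_bytes_written


BLOCK_SIZE = 256 * 1024  # hypothetical block size: 256 KiB

# Ungrouped case: a 4 KiB change forces the whole block to be rewritten.
scattered = write_amplification(BLOCK_SIZE, 4 * 1024)

# Grouped case: 240 KiB of the block's data is overwritten together.
grouped = write_amplification(BLOCK_SIZE, 240 * 1024)
```

With these figures the ungrouped case has a write amplification of 64, while the grouped case stays close to 1.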
Steps S1-S7 are merely illustrative, and other alternatives may also be provided in which one or more steps are added, one or more steps are deleted, or one or more steps are provided in a different order without departing from the scope of the claims herein.
In one aspect, a computer program is provided. The computer program, when executed on a backup system, is operable to perform the method 100. In another aspect, a computer program product is provided that includes a non-transitory computer readable storage medium. The non-transitory computer readable storage medium has stored thereon computer program code executable by a processor to perform the method 100. Examples of implementations of the non-transitory computer-readable storage medium include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read-Only Memory (ROM), Hard Disk Drive (HDD), flash memory, Secure Digital (SD) card, Solid-State Drive (SSD), or CPU cache. A computer readable storage medium for providing non-transitory memory may include, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing.
Fig. 2A is a network environment diagram of a system for improving the durability of a storage medium of a backup system according to an embodiment of the present invention. Referring to fig. 2A, a system 200 is shown. System 200 includes a host device 202 and a backup system 204. A communication network 206 is also shown. The host device 202 includes a backup agent 208. The backup system 204 includes a backup policy repository 210, a backup layer 212, and a storage medium 214 that stores backups.
In another aspect, the present invention provides a backup system 204 comprising an enhanced durability storage medium 214, wherein a backup of one or more data items of a host device 202 is stored in one or more blocks of the storage medium 214. Backup system 204 also includes: a backup agent 208 for detecting a frequency of change of one of the one or more data items in the host device 202; a backup layer 212 for communicating with the host device 202 to: receiving the detected change frequency of the data item of the one or more data items, receiving a minimum storage time for each backup of the one or more data items from the backup policy store 210, providing the detected change frequency and the obtained minimum storage time for each backup of the one or more data items to the storage medium 214 for grouping the data items; the storage medium 214 is used to: grouping the data items according to the provided change frequency and a minimum storage time for each backup of the one or more data items, estimating a remaining lifetime of each of the one or more blocks, and moving the grouped data items into the blocks according to the estimated remaining lifetimes.
The host device 202 may comprise suitable logic, circuitry, interfaces and/or code that may enable information to be stored, processed and shared with the backup system 204 via the communication network 206. The host device 202 includes a host file system that includes one or more data items or stores data in a block device that does not include a file system. The host file system may have one or more operating algorithms implemented on the host device 202. Host device 202 provides information such as the frequency of changes to one or more data items to backup system 204 through backup agent 208 in host device 202. Examples of host devices 202 include, but are not limited to, host servers, host production environment systems, thin clients connected to host servers, master storage devices, and user devices (e.g., cellular phones, Personal Digital Assistants (PDAs), handheld devices, laptops, personal computers, Internet of Things (IoT) devices, smart phones, Machine Type Communication (MTC) devices, computing devices, drones, or any other portable or non-portable electronic device).
The backup system 204 may comprise suitable logic, circuitry, interfaces and/or code that may be operable to store, process and/or receive information from the host device 202 via the communication network 206. Further, it should be understood that backup system 204 may be a single hardware server and/or multiple hardware servers operating in a parallel or distributed architecture to form backup system 204. Examples of backup system 204 include, but are not limited to, a secondary storage system, a backup server, a storage server, a cloud server, a web server, an application server, or a combination thereof. Backup system 204 is configured to receive one or more data items from host device 202 via backup agent 208. The received one or more data items are stored in one or more blocks of backup system 204. In one embodiment, the backup system 204 is used to take snapshots (i.e., images) of the host device 202. In addition, backup system 204 detects changes in host device 202 from the image. Backup system 204 stores the changes in host device 202 as a backup to host device 202.
The communication network 206 includes a medium (e.g., a communication channel) through which the host device 202 communicates with the backup system 204. The communication network 206 may be a wired or wireless network. Examples of communication network 206 may include, but are not limited to, a Wireless Fidelity (Wi-Fi) network, a Local Area Network (LAN), a Wireless Personal Area Network (WPAN), a Wireless Local Area Network (WLAN), a Wireless Wide Area Network (WWAN), a cloud network, a Long Term Evolution (LTE) network, a Plain Old Telephone Service (POTS), a Metropolitan Area Network (MAN), and/or the internet. The host device 202 and the backup system 204 may be configured to connect to the communication network 206 according to various wired and wireless communication protocols. Examples of such wired and wireless communication protocols may include, but are not limited to, Transmission Control Protocol/Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), ZigBee, EDGE, Infrared (IR), IEEE 802.11, IEEE 802.16, Long Term Evolution (LTE), Light Fidelity (Li-Fi), and/or other cellular or Bluetooth (BT) communication protocols, including variants thereof.
Backup agent 208 is software implemented on host device 202 for accessing one or more data items associated with one or more software products on host device 202.
Backup policy repository 210 refers to a database for storing backup policies for backup system 204.
Backup layer 212 refers to a software component or a hardware component or a combination of software and hardware components of backup system 204 to communicate with host device 202 and provide information to storage medium 214 to group one or more data items.
Storage medium 214 refers to an organized body of information regardless of the manner in which the information or organized body thereof is represented. Storage medium 214 is used to store one or more data items received by backup system 204. Examples of storage medium 214 include, but are not limited to, a backup directory, a table, an Extensible Markup Language (XML) file for storing data within XML tags, or any form of database for storing information for future reference at backup system 204.
In operation, the backup agent 208 is operable to detect a frequency of change of one of the one or more data items in the host device 202. The backup agent 208 is used to determine when the data item next changes in the host device 202. According to this embodiment, backup agent 208 determining the change frequency includes: a time period of a frequency of the one or more data items changing in the host device 202 is determined. The time period determined by the backup agent 208 is the time period before the data item changes in the host device 202. According to the present embodiment, detecting the frequency of change of the data items is performed by detecting the type of the data item of the one or more data items in the host device 202. The type of data item detected by backup agent 208 is a particular characteristic of the data associated with the software product from which the frequency of change of the data can be determined.
The backup policy store 210 provides a minimum storage time for each backup of one or more data items through the backup layer 212 of the backup system 204. As described above, the backup policy store 210 is used to store policies that backup data retains. For example, a data retention policy in the backup policy may include: backup once an hour and corresponding data retention for one day, once a day and corresponding data retention for one week, once a week and corresponding data retention for one month, once a month and corresponding data retention for one year. It will be appreciated that, in general, the lower the frequency with which backups are performed, the longer the minimum storage time.
Backup layer 212 is further configured to communicate with host device 202 to receive the detected frequency of change of the data item of the one or more data items. Backup layer 212 receives the change frequency from backup agent 208. The backup layer 212 is also operable to receive a minimum storage time for each backup of the one or more data items from the backup policy store 210. According to one embodiment, the backup layer 212 obtaining the minimum storage time of each backup includes: obtaining a storage time of each backup of the one or more data items in the storage medium 214, wherein the storage time is greater than the change frequency of the one or more data items in the host device 202. The frequency of change and the minimum storage time provide backup layer 212 with a range of time before the data item is overwritten.
The backup layer 212 is further configured to provide the detected change frequency and the obtained minimum storage time of each backup of the one or more data items to the storage medium 214 for grouping the data items. In general, the storage medium 214 may have its own controller, which may use the received information relating to the detected frequency of change and the obtained minimum storage time for each backup of the one or more data items to determine a storage (overwrite) pattern for each block therein. The data items in the storage medium 214 can be grouped according to the change frequency and the minimum storage time so that the data items are stored in the appropriate blocks of the storage medium 214, reducing the write amplification of the storage medium 214.
The storage medium 214 is configured to group the data items according to the provided change frequency and a minimum storage time for each backup of the one or more data items. Frequently changing data items are grouped and written into blocks that have a long remaining life, while less frequently changing data items are grouped and written into blocks that have a short remaining life. The grouping of the data items is performed according to the change frequency and the minimum storage time to reduce the write amplification of the storage medium 214. According to one embodiment, the grouping of the data items is performed by the storage medium 214 by grouping the data items for which the minimum storage time of the backup is the same. Data items that are backed up with the same minimum storage time are grouped so that most of the data items in a block can be overwritten at the same time. Thus, the write amplification of the storage medium 214 is reduced. According to one embodiment, the grouping of the data items is performed by the storage medium 214 by grouping data items that did not change during the backup in the storage medium 214. Data items that did not change during backup in the storage medium 214 may be determined to have a lower frequency of change, marked as cold data items, and grouped together accordingly.
The storage medium 214 is also used to estimate the remaining life of each of the one or more blocks. Estimating the remaining life of each block enables data items to be stored in the respective blocks such that data items that change frequently are written to blocks having long remaining lives, and data items that change infrequently are written to blocks having short remaining lives. In this way, the lifetime of individual blocks in the storage medium 214 is generally improved. According to one embodiment, estimating the remaining lifetime of each of the blocks is performed by identifying: a maximum possible number of rewrites for each of the one or more blocks and a number of executed rewrites for each of the one or more blocks. Since each block can only be rewritten a limited number of times, the maximum possible number of rewrites and the number of rewrites already performed help determine the rewrites remaining for the block, so that appropriate data items can be stored in the various blocks of the storage medium 214.
The storage medium 214 is further configured to move the grouped data items into the block according to the estimated remaining lifetime. Here, hot data items are moved to blocks where the estimated remaining life is long, while cold data items are moved to blocks where the estimated remaining life is short. According to one embodiment, the block with the shortest estimated remaining life is identified. Data items that change infrequently (i.e., cold data) are moved to those blocks where the estimated remaining lifetime is the shortest. Thus, these near-end-of-life blocks may be used for a longer period of time, thereby helping to improve the overall durability of the storage medium 214.
FIG. 2B is a block diagram of various exemplary components of host device 202, according to one embodiment of the invention. The host device 202 includes a first processor 216, a first transceiver 218, and a first memory 220. The first processor 216 may be communicatively coupled to a first transceiver 218 and a first memory 220. The first memory 220 also includes one or more data items 222 and the backup agent 208.
The first processor 216 is configured to provide one or more data items 222 to the backup system 204. In one implementation, the first processor 216 is configured to execute instructions stored in the first memory 220 that, when executed by the first processor, cause the host device to communicate with the backup system provided by the present invention. In one example, the first processor 216 may be a general purpose processor. Other examples of the first processor 216 may include, but are not limited to, microprocessors, microcontrollers, Complex Instruction Set Computing (CISC) processors, application-specific integrated circuit (ASIC) processors, Reduced Instruction Set (RISC) processors, Very Long Instruction Word (VLIW) processors, Central Processing Units (CPUs), state machines, data processing units, and other processors or control circuits. Further, the first processor 216 may refer to one or more separate processors, processing devices, processing units that are part of a machine, such as the host device 202.
The first transceiver 218 may comprise suitable logic, circuitry, and/or interfaces that may be operable to communicate with one or more external devices, such as the backup system 204. Examples of the first transceiver 218 may include, but are not limited to, an antenna, a telematics unit, a Radio Frequency (RF) transceiver, one or more amplifiers, one or more oscillators, a digital signal processor, a CODEC chipset, and/or a Subscriber Identity Module (SIM) card.
The first memory 220 refers to a primary storage device of the host device 202. The first memory 220 may comprise suitable logic, circuitry, and/or interfaces that may be operable to store machine code or instructions having at least one code section executable by the first processor 216. Examples of implementations of the first memory 220 may include, but are not limited to, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Random Access Memory (RAM), a Read-Only Memory (ROM), a Hard Disk Drive (HDD), a flash memory, a Secure Digital (SD) card, a Solid-State Drive (SSD), or a CPU cache. The first memory 220 may store an operating system or other program product (including one or more operating algorithms), or both, to operate the host device 202. A computer readable storage medium for providing non-transitory memory may include, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing.
First memory 220 includes one or more data items 222 associated with one or more software products executing on host device 202. Backup agent 208 is used to act as an interface between host device 202 and backup system 204 to provide the frequency of changes to data items of one or more data items 222 to backup system 204.
In operation, in one aspect, first processor 216 is configured to provide a frequency of change of one or more data items 222 from backup agent 208 to backup system 204. In addition, first processor 216 is also configured to provide one or more data items 222 to backup system 204 to enable the backup of one or more data items 222.
FIG. 2C is a block diagram of various exemplary components of backup system 204, as provided by one embodiment of the present invention. The backup system 204 includes a second processor 224, a second transceiver 226, and a second memory 228. The second processor 224 may be communicatively coupled to a second transceiver 226 and a second memory 228. The second memory 228 also includes a backup policy repository 210, a backup layer 212, and a storage medium 214.
The second processor 224 is configured to acquire the one or more data items 222 and the detected frequency of change via the second transceiver 226. In one implementation, the second processor 224 is configured to execute instructions stored in the second memory 228 that, when executed by the second processor, cause the backup system to perform the steps of the method provided by the present invention. The backup policy store 210 is operable to provide a minimum storage time for each backup of one or more data items 222 in the backup system. The backup layer 212 is used to receive the change frequency and the minimum storage time and provide them to the storage medium 214. The storage medium 214 stores data items in one or more blocks. The storage medium 214 is configured to receive the change frequency and the minimum storage time to group the data items. Further, the storage medium 214 estimates the remaining life of each block, and stores the grouped data items in each block according to the remaining life.
In operation, in one aspect, the second processor 224 is configured to receive the change frequency from the host device and provide the change frequency to the backup layer 212. In addition, the second processor 224 provides the one or more data items received from the host device to the storage medium 214 to store a backup of the one or more data items in the backup system 204.
FIG. 3 is an illustration of a system 300 for improving endurance of storage media of a backup system, in accordance with an embodiment of the present invention. System 300 includes host device 202, backup agent 208, policy engine 306, secondary storage tier 308, backup layer 212, and block storage layer 312.
The host device 202 includes one or more data items that are backed up in a storage medium of the backup system. The backup agent 208 of the host device 202 is configured to detect a frequency of change of one of the one or more data items of the host device 202. The change frequency is the maximum lifetime of the data item after which the data item changes in the host device 202 and therefore needs to be backed up again in the storage medium. Backup agent 208 provides the change frequency to backup layer 212 as change cue 314.
Policy engine 306 is used to store backup policies related to data item storage and retention in backup system 204. Policy engine 306 may also be referred to as a backup policy repository. Policy engine 306 provides the minimum storage time to backup layer 212 via the backup policy. The minimum storage time is a minimum lifetime of one of the one or more blocks of the storage medium after which data items in the block are overwritten. Policy engine 306 provides the minimum storage time as an accurate backup policy 316 to backup layer 212.
The secondary storage tier 308 includes the backup layer 212, which communicates with the host device 202 to receive, via the backup agent 208, the detected frequency of change of the one or more data items. The backup layer 212 is also operable to receive a minimum storage time for each backup of the one or more data items from the policy engine 306. The backup layer 212 is further configured to provide the detected change frequency and the obtained minimum storage time of each backup of the one or more data items to the storage medium for grouping the data items. The backup layer 212 provides the change frequency and the minimum storage time as a new write protocol 318. In one example, new write protocol 318 includes an offset value, a size of the data item, a location of the data item, a frequency of change (i.e., expected endurance), and a minimum storage time (i.e., maximum endurance).
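The fields listed for new write protocol 318 can be pictured as a small record that accompanies each write. The following is a minimal sketch only: the names `WriteHint` and `build_write_hint`, the field types, and the use of bytes and seconds as units are assumptions made for illustration and are not specified in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteHint:
    """Hypothetical record mirroring the fields listed for new write protocol 318."""
    offset: int               # offset value (assumed to be in bytes)
    size: int                 # size of the data item (assumed to be in bytes)
    location: str             # location of the data item (format assumed)
    change_frequency_s: int   # change frequency from the backup agent (seconds, assumed)
    min_storage_time_s: int   # minimum storage time from the policy engine (seconds, assumed)


def build_write_hint(offset, size, location, change_frequency_s, min_storage_time_s):
    """Validate the values and bundle them into a single hint for the storage medium."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be non-negative and size must be positive")
    if change_frequency_s <= 0 or min_storage_time_s <= 0:
        raise ValueError("change frequency and minimum storage time must be positive")
    return WriteHint(offset, size, location, change_frequency_s, min_storage_time_s)


# Example: a 4 KiB item expected to change hourly, with a one-day minimum storage time.
hint = build_write_hint(0, 4096, "vol0/item-1", 3600, 86400)
```

Bundling both values in one record reflects the description's point that the storage medium receives the change frequency and the minimum storage time together with the data being written.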
Block storage layer 312 is used to group data items in the one or more blocks according to the received new write protocol 318, wherein new write protocol 318 includes the change frequency and the minimum storage time. The block storage layer 312 is also used to estimate the remaining life of each of the one or more blocks. Therefore, data items that change frequently (i.e., hot data) are stored in respective blocks whose minimum storage time is long, and data items that change infrequently (i.e., cold data) are stored in respective blocks whose minimum storage time is short.
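The grouping and placement steps can be sketched as follows. This is an illustrative sketch, not the patented implementation. The remaining life is taken as the maximum possible number of rewrites minus the rewrites already performed, following the two quantities named in claim 7. The function names, the tuple layouts, and in particular the placement rule (the group with the shortest change period goes to the block with the most remaining life) are assumptions made for the example.

```python
from collections import defaultdict


def remaining_life(max_rewrites, rewrites_done):
    """Estimate remaining life as rewrites still available (never negative)."""
    return max(max_rewrites - rewrites_done, 0)


def group_items(items):
    """Group items sharing the same (change period, minimum storage time).

    items: iterable of (name, change_period_s, min_storage_time_s) tuples.
    """
    groups = defaultdict(list)
    for name, change_period_s, min_storage_time_s in items:
        groups[(change_period_s, min_storage_time_s)].append(name)
    return dict(groups)


def place_groups(groups, blocks):
    """Assign each group to one block (assumed policy, see lead-in).

    blocks: dict mapping block id -> (max_rewrites, rewrites_done).
    """
    if len(groups) > len(blocks):
        raise ValueError("not enough blocks for the number of groups")
    # A shorter change period means the data is hotter.
    hottest_first = sorted(groups, key=lambda key: key[0])
    longest_life_first = sorted(
        blocks, key=lambda b: remaining_life(*blocks[b]), reverse=True
    )
    return {block: groups[key] for key, block in zip(hottest_first, longest_life_first)}


items = [("db.log", 60, 3600), ("photo.jpg", 86400, 604800), ("db.idx", 60, 3600)]
blocks = {"A": (1000, 900), "B": (1000, 100)}
placement = place_groups(group_items(items), blocks)
# placement == {"B": ["db.log", "db.idx"], "A": ["photo.jpg"]}
```

In this example block B has 900 rewrites left and block A has 100, so the two items that change every minute land together in B and the rarely changing item lands in A.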
The system and method of the present invention provide for efficient grouping of data items into hot data items (i.e., data items that change frequently) and cold data items (i.e., data items that do not change frequently) based on the determined frequency of change and minimum storage time. Advantageously, the grouping enables the data items to be stored in the one or more blocks of the storage medium such that most of the data items in a given block are rewritten at the same time, so that the amount of data written to a block is nearly equal to the size of the data items to be rewritten. This is in contrast to conventional techniques, in which fewer data items are overwritten at a time and the amount of writes performed on a block is therefore many times the size of the data items to be overwritten. Thus, when the backup system begins to determine where to write the next block, it first identifies which blocks can be erased by identifying blocks that are completely or almost completely overwritten. This results in a significant reduction of the write amplification of the storage medium compared to the prior art. Furthermore, this enables the data items to be stored in the respective blocks according to the remaining life of each block, so that these blocks can be used for a longer period of time than in the conventional art.
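The effect on write amplification can be illustrated with simple arithmetic. The sketch below assumes a simplified model in which rewriting any part of a block requires writing the whole block, so write amplification is the block size divided by the bytes actually being rewritten. The function names and the 0.9 threshold for "almost completely overwritten" are assumptions for illustration; the description does not give a threshold.

```python
def write_amplification(block_size, rewritten_bytes):
    """Ratio of bytes written to the block to bytes of data items being rewritten."""
    if not 0 < rewritten_bytes <= block_size:
        raise ValueError("rewritten_bytes must be in the range (0, block_size]")
    return block_size / rewritten_bytes


def erasable_blocks(blocks, threshold=0.9):
    """Return ids of blocks that are completely or almost completely overwritten.

    blocks: dict mapping block id -> (block_size, overwritten_bytes).
    threshold: assumed fraction above which a block counts as almost fully overwritten.
    """
    return sorted(
        block_id
        for block_id, (size, overwritten) in blocks.items()
        if overwritten / size >= threshold
    )


# Ungrouped data: only 100 of 1000 bytes in the block are rewritten at a time.
mixed = write_amplification(1000, 100)     # 10.0
# Grouped data: the whole block is rewritten together.
grouped = write_amplification(1000, 1000)  # 1.0

candidates = erasable_blocks({"A": (1000, 950), "B": (1000, 100), "C": (1000, 1000)})
# candidates == ["A", "C"]
```

Under this model a block whose items all expire together has a ratio close to 1, while a block in which only a tenth of the data is rewritten at a time has a ratio of 10.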
Modifications may be made to the embodiments of the invention described above without departing from the scope of the invention as defined in the accompanying claims. Expressions used for describing and claiming the present invention, such as "including/comprising", "incorporating", "having", and "being", are intended to be interpreted in a non-exclusive manner, i.e., to allow items, components or elements not explicitly described to be present. Reference to the singular is also to be construed to relate to the plural. The word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the presence of features from other embodiments. The word "optionally" as used herein means "provided in some embodiments and not provided in other embodiments". It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for clarity, described in the context of a single embodiment, may also be provided separately or in any suitable combination or as any other described embodiment of the invention.

Claims (10)

1. A method (100) for improving the endurance of a storage medium (214) of a backup system (204), wherein a backup of one or more data items (222) of a host device (202) is stored in one or more blocks of the storage medium of the backup system, the method comprising:
a backup agent (208) detecting a frequency of change of one of the one or more data items in the host device;
a backup layer (212) of the backup system receiving the detected frequency of change;
the backup layer of the backup system obtaining a minimum storage time for each backup of the one or more data items from a backup policy stored in a backup policy repository (210);
providing the detected frequency of change and the minimum storage time for each backup of the one or more data items obtained from the backup layer to the storage medium to group data items stored in the one or more blocks of the storage medium;
grouping the data items in the storage medium according to the provided detected change frequency and the obtained minimum storage time of each backup of the one or more data items;
estimating a remaining life of each of the one or more blocks of the storage medium;
moving the grouped data items into the blocks according to the estimated remaining life of each block.
2. The method (100) of any of the above claims, wherein determining the frequency of change comprises determining a period of time of the frequency of change of the one or more data items (222) in the host device (202).
3. The method (100) of claim 1 or 2, wherein obtaining the minimum storage time for each backup comprises obtaining a storage time for each backup of the one or more data items (222) in the storage medium (214), wherein the storage time is greater than the frequency of changes of the one or more data items in the host device (202).
4. The method (100) according to claim 1 or 2, wherein detecting the frequency of change of the data items is performed by detecting a type of the data item of the one or more data items (222) in the host device (202).
5. The method (100) of any of the preceding claims, wherein the grouping of the data items (222) is performed by grouping data items for which the minimum storage time of the backups is the same.
6. The method (100) of any of the preceding claims, wherein the grouping of the data items (222) is performed by grouping data items that have not changed during backup in the storage medium (214).
7. The method (100) according to any of the preceding claims, wherein estimating the remaining lifetime of each block is performed by identifying:
a maximum possible number of rewrites for each of the one or more blocks,
a number of times that rewriting has been performed for each of the one or more blocks.
8. The method (100) of any of the preceding claims, further comprising detecting a block with the shortest estimated remaining lifetime.
9. A backup system (204) comprising a storage medium (214) having increased durability, wherein a backup of one or more data items (222) of a host device (202) is stored in one or more blocks of the storage medium, the backup system further comprising:
a backup agent (208) for
detecting a frequency of change of one of the one or more data items in the host device;
a backup layer (212) for
communicating with the host device to receive the detected change frequency of the data item of the one or more data items,
receiving a minimum storage time for each backup of the one or more data items from a backup policy store (210),
providing the detected change frequency and the obtained minimum storage time of each backup of the one or more data items to the storage medium to group the data items;
the storage medium is to:
grouping the data items according to the provided change frequency and a minimum storage time for each backup of the one or more data items,
estimating a remaining life of each of the one or more blocks,
moving the grouped data items into the block according to the estimated remaining lifetime.
10. A computer program for performing the method (100) according to any one of claims 1 to 8 when executed in a backup system (204).
Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/067768 WO2021259475A1 (en) 2020-06-25 2020-06-25 Backup system and method for increasing durability of storage media thereof

Publications (1)

Publication Number Publication Date
CN114503083A true CN114503083A (en) 2022-05-13

Family

ID=71170589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080068994.4A Pending CN114503083A (en) 2020-06-25 2020-06-25 Backup system and method for improving durability of storage medium thereof

Country Status (2)

Country Link
CN (1) CN114503083A (en)
WO (1) WO2021259475A1 (en)

Also Published As

Publication number Publication date
WO2021259475A1 (en) 2021-12-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination