WO2021175446A1 - Devices and methods for eliminating defragmentation in deduplication

Info

Publication number: WO2021175446A1
Application number: PCT/EP2020/056082
Authority: WO
Inventors: Yehonatan DAVID; Elizabeth FIRMAN; Michael Hirsch
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2020-03-06
Filing date: 2020-03-06
Publication date: 2021-09-10
Also published as: CN113632059A

Abstract

The present disclosure relates to a device and method for storing duplicated data blocks. Specifically, the disclosure proposes a device for eliminating a need for defragmentation in a deduplication backup system. The device is configured to store one or more compressed containers, wherein each container comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers. The device is further configured to delete a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment. In this way, a solution for removing obsolete data that does not require metadata updates is provided.

Description

DEVICES AND METHODS FOR ELIMINATING DEFRAGMENTATION IN DEDUPLICATION

TECHNICAL FIELD

The present disclosure relates to a device and method for storing data, in particular, for storing duplicated data. In order to solve a problem of fragmentation caused in the deduplication process, embodiments of this disclosure provide a solution to simplify the defragmentation process. To this end, the present disclosure proposes a device and method that allow eliminating defragmentation in a backup storage.

BACKGROUND

Deduplication is a process of elimination of duplicated data in storage. Due to the nature of how the data is saved in a deduplicating storage, there is often a problem of fragmentation in deduplication. This is a software level fragmentation, which is done on top of a lower-level hardware that also does some defragmentation. This fragmentation can also be a cause for further read amplification.

In deduplication, user data is stored as pointers to the actual data. The actual data is stored in containers of some sort. When a user deletes data, chunks of a container become obsolete. Therefore, when performing deduplication, there are often obsolete data chunks in the repository that need to be collected. This usually requires an offline/background process to do a lot of work, including many calculations and sometimes complicated processes that update reference counts in a way that prevents errors.

Traditionally, to redeem space, the data can either be copied to another place, trying to keep the locality of references, or the space may be reused for new data to avoid the complexities of copying. A conventional way of handling obsolete data is to wait until a certain amount of data in a container or area is obsolete, then copy the remaining data into a new container and update all the references from the old location to the new location. This process requires locking the metadata of the container, the container itself, and possibly other structures as well. The defragmentation process is difficult to parallelize, which makes it error prone, and it may affect backup and restore speeds as well.

SUMMARY

In view of the above-mentioned challenges, an objective for embodiments of the present disclosure is to provide a device and method, respectively, that eliminates a need for defragmentation in a deduplication backup system. An aim is, in particular, to avoid locking and/or updating metadata references when removing obsolete data. Another aim is to redeem space at an application level. Further, it is also desirable to be able to perform normal backup and restore operations in the meantime.

This is achieved by the embodiments of the present disclosure as described in the enclosed independent claims. Advantageous implementations of the embodiments of the present disclosure are further defined in the dependent claims.

A first aspect of the present disclosure provides a device for storing duplicated data blocks, the device being configured to: store one or more compressed containers, wherein each container comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers; and delete a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

The device of the first aspect has the advantage that a need for defragmentation in a deduplication backup system is reduced or eliminated. This is due to the replacement by zeros, and the compression of the segments.

In an implementation form of the first aspect, the device is further configured to maintain metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

Metadata can be used to map the stored data blocks to their physical addresses.

In an implementation form of the first aspect, each container further comprises segment metadata for each segment, which includes a location of the data segments in the container.

Knowing the segment metadata, the stored data segment can be located.

In an implementation form of the first aspect, the device is further configured to: obtain the one or more duplicated data blocks; fill the one or more duplicated data blocks into the plurality of segments of the one or more containers; and compress the plurality of segments of the one or more containers.

The deduplication based backup system may divide a backup stream into variable sized blocks of data. The data blocks that need to be written may be aggregated into the containers, particularly into segments of the containers. According to this embodiment of the invention, each segment is compressed in the container.

In an implementation form of the first aspect, the device is further configured to retrieve the stored first duplicated data block according to metadata of the first duplicated data block.

For instance, during data recovery, data blocks will be located and read based on the metadata.

In an implementation form of the first aspect, each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

Optionally, the reference value may be used to distinguish hot data and cold data in a storage.

In an implementation form of the first aspect, the device is further configured to store a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers.

Preferably, it may be desired to keep cold data and hot data separated in different containers. The preset value may be configured based on specific implementations or requirements.

In an implementation form of the first aspect, the device comprises a disk, and the one or more containers are stored on the disk.

A second aspect of the present disclosure provides a method for storing duplicated data blocks, comprising: storing one or more compressed containers, wherein each container comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers; and deleting a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

In an implementation form of the second aspect, the method further comprises maintaining metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

In an implementation form of the second aspect, each container further comprises segment metadata for each segment, which includes a location of the data segments in the container.

In an implementation form of the second aspect, the method further comprises: obtaining the one or more duplicated data blocks; filling the one or more duplicated data blocks into the plurality of segments of the one or more containers; and compressing the plurality of segments of the one or more containers.

In an implementation form of the second aspect, the method further comprises retrieving the stored first duplicated data block according to metadata of the first duplicated data block.

In an implementation form of the second aspect, each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

In an implementation form of the second aspect, the method comprises storing a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers.

The method of the second aspect and its implementation forms provide the same advantages and effects as described above for the device of the first aspect and its respective implementation forms.

A third aspect of the present disclosure provides a computer program comprising a program code for carrying out, when implemented on a processor, the method according to the second aspect or any of its implementation forms. A fourth aspect of the present disclosure provides a non-transitory storage medium storing executable program code which, when executed by a processor, causes the method according to the second aspect or any of its implementation forms to be performed.

It has to be noted that all devices, elements, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.

BRIEF DESCRIPTION OF DRAWINGS

The above described aspects and implementation forms will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which

FIG. 1 shows a device according to an embodiment of the invention.

FIG. 2 shows a typical deduplication structure.

FIG. 3 shows an example of data structure when copying data to a new container.

FIG. 4 shows an example of data structure when reusing redeemed space.

FIG. 5 shows a file structure according to an embodiment of the invention.

FIG. 6 shows a method according to an embodiment of the invention.

DETAILED DESCRIPTION OF EMBODIMENTS

Illustrative embodiments of a method, device, and program product for storing duplicated data blocks are described with reference to the figures. Although this description provides a detailed example of possible implementations, it should be noted that the details are intended to be exemplary and in no way limit the scope of the application.

Moreover, an embodiment/example may refer to other embodiments/examples. For example, any description including but not limited to terminology, element, process, explanation and/or technical advantage mentioned in one embodiment/example is applicative to the other embodiments/examples.

FIG. 1 shows a device 100 according to an embodiment of the invention. The device 100 is adapted for storing duplicated data blocks. The device 100 is configured to store one or more compressed containers 101. In particular, each container 101 comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers 101. Further, the device 100 is configured to delete a first duplicated data block that is stored in a segment, particularly by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

The device 100 may comprise processing circuitry (not shown) configured to perform, conduct or initiate the various operations of the device 100 described herein. The processing circuitry may comprise hardware and software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. The device 100 may further comprise memory circuitry, which stores one or more instruction(s) that can be executed by the processor or by the processing circuitry, in particular under control of the software. For instance, the memory circuitry may comprise a non-transitory storage medium storing executable software code which, when executed by the processor or the processing circuitry, causes the various operations of the device 100 to be performed. In one embodiment, the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the device 100 to perform, conduct or initiate the operations or methods described herein.

Notably, instead of storing duplicated data, the deduplication process stores some form of reference, e.g. a pointer, to where the duplicated data is already stored, as in the example of a typical deduplication structure shown in FIG. 2. A reference to a location where the duplicated data block is stored is commonly known as metadata.

In particular, according to an embodiment of the invention, the device 100 is further configured to maintain metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

A deduplication-based backup system divides a backup into variable-sized chunks or segments of data. When a user deletes data, chunks of a container become obsolete. When obsolete data is handled in a conventional solution, the remaining data is copied or moved into a new container, as in the example depicted in FIG. 3. As a result, all the references to the old location need to be updated to point to the new location. When the location of data is changed, all the metadata that references that location needs to be locked until the transaction is complete. Otherwise, a stale pointer to data which was moved may be kept. This also affects backup and restore speeds.
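The conventional copy-forward handling described above can be sketched as follows. This is a minimal illustration, not taken from the disclosure: the function `compact`, the in-memory block list and the `references` dictionary are hypothetical stand-ins for a container and its backup metadata.

```python
def compact(blocks, live, new_container_id):
    """Copy the live blocks of an old container into a new container.

    blocks: list of (block_id, data) in the on-disk order of the old container.
    Returns (new_blocks, new_locations), where new_locations[block_id] is
    (new_container_id, offset) for every block that survived.
    """
    new_blocks, new_locations, offset = [], {}, 0
    for block_id, data in blocks:
        if block_id in live:
            new_blocks.append((block_id, data))
            new_locations[block_id] = (new_container_id, offset)
            offset += len(data)
    return new_blocks, new_locations


old_blocks = [("X1", b"1111"), ("X2", b"2222"), ("X3", b"3333"), ("X4", b"4444")]
# Hypothetical backup metadata: every reference points at (container, offset).
references = {"X1": ("old", 0), "X4": ("old", 12)}

_, moved = compact(old_blocks, live={"X1", "X4"}, new_container_id="new")
# Every surviving reference has to be rewritten, which is the step that
# requires the metadata to be locked in the conventional scheme.
references.update(moved)
print(references)  # {'X1': ('new', 0), 'X4': ('new', 4)}
```

The point of the sketch is that the location of X4 changes from offset 12 to offset 4, so any metadata still holding the old location would be wrong once the old container is released.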

For instance, suppose a container originally holds chunks X1, X2, X3, X4 and X5 from a backup X, and X2, X3 and X5 are then deleted. Later, if some data from another backup Y is written into this container, the container may hold X1 Y1 Y2 X4 Y3, and part of that data is deleted in turn. If the container keeps being reused, it may eventually reach the scenario shown in FIG. 4.

The example shown in FIG. 4 illustrates the effect of reusing a container for a long time in a system. Data in the reused container comes from a variety of different and possibly unrelated backups, and the user data of each backup is also scattered among a large number of containers.

Notably, when a data backup is performed in such a scenario, both the throughput performance and the deduplication performance are reduced. When a data recovery is performed, random chunks need to be read from different locations, which is a known problem and bottleneck that slows down the restore of deduplicated data.

Embodiments of this disclosure design a backup and delete process of a deduplication system to redeem space at an application level, which eliminates the need for defragmentation. Notably, each container is divided into segments. According to an embodiment of the invention, each container may further comprise segment metadata for each segment, which includes a location of the data segments in the container.

Each segment may be identified by an identifier. Segment metadata may be used to map identifiers of segments to their physical addresses.
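As a minimal sketch of such segment metadata, the following maps segment identifiers to physical locations inside one container. The field names `offset` and `length`, the identifier format and the table contents are assumptions made for illustration; the disclosure does not specify a layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentEntry:
    offset: int  # byte position of the compressed segment in the container file
    length: int  # compressed length of the segment in bytes


# Hypothetical segment metadata of one container: identifier -> physical location.
segment_table = {
    "seg-0": SegmentEntry(offset=0, length=1_200),
    "seg-1": SegmentEntry(offset=1_200, length=900),
    "seg-2": SegmentEntry(offset=2_100, length=1_500),
}


def locate(segment_id: str) -> SegmentEntry:
    """Map a segment identifier to its physical address in the container."""
    return segment_table[segment_id]


entry = locate("seg-1")
print(entry.offset, entry.length)  # 1200 900
```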

During a backup, data blocks that need to be written are aggregated into containers to preserve the spatial locality of a backup stream. Optionally, according to an embodiment of the invention, the device 100 may be further configured to obtain the one or more duplicated data blocks. Accordingly, the device 100 may be configured to fill the one or more duplicated data blocks into the plurality of segments of the one or more containers 101. Further, the device 100 may be configured to compress the plurality of segments of the one or more containers 101.
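The fill-and-compress steps can be sketched as below. This is an illustrative assumption-laden example rather than the disclosed implementation: the 64 KiB segment capacity, the function name `pack_blocks`, the location tuple format and the use of zlib as the compressor are all choices made for the sketch.

```python
import zlib

SEGMENT_SIZE = 64 * 1024  # assumed uncompressed segment capacity


def pack_blocks(blocks, segment_size=SEGMENT_SIZE):
    """Fill blocks into segments in stream order, then compress each segment.

    Returns (compressed_segments, locations), where locations[i] is the
    (segment_index, offset, length) of blocks[i] inside the uncompressed segment.
    """
    segments, locations = [], []
    current = bytearray()
    for block in blocks:
        # Close the current segment when it cannot hold the next block.
        if current and len(current) + len(block) > segment_size:
            segments.append(zlib.compress(bytes(current)))
            current = bytearray()
        locations.append((len(segments), len(current), len(block)))
        current += block
    if current:
        segments.append(zlib.compress(bytes(current)))
    return segments, locations


# Variable-sized blocks from a hypothetical backup stream.
stream = [b"a" * 30_000, b"b" * 30_000, b"c" * 10_000, b"d" * 50_000]
segments, locations = pack_blocks(stream)
print(locations)
# [(0, 0, 30000), (0, 30000, 30000), (1, 0, 10000), (1, 10000, 50000)]
```

Blocks are appended in the order in which they arrive, so neighbouring blocks of the stream end up in the same or adjacent segments, which is one simple way to preserve spatial locality.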

During a restore, the backup stream is reconstructed according to the data. Optionally, according to an embodiment of the invention, the device 100 may be further configured to retrieve the stored first duplicated data block according to metadata of the first duplicated data block.
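A restore driven by block metadata might look like the following sketch. The `BlockRef` fields, the `containers` dictionary and the notion of a "recipe" (an ordered list of references that makes up a backup) are hypothetical names introduced here; zlib again stands in for whatever compressor a real system uses.

```python
import zlib
from typing import NamedTuple


class BlockRef(NamedTuple):
    container_id: str
    segment_index: int
    offset: int  # offset inside the uncompressed segment
    length: int


def read_block(containers, ref: BlockRef) -> bytes:
    """Locate a block through its metadata and return its bytes."""
    raw = zlib.decompress(containers[ref.container_id][ref.segment_index])
    return raw[ref.offset:ref.offset + ref.length]


def restore(containers, recipe) -> bytes:
    """Rebuild a backup stream from an ordered list of block references."""
    return b"".join(read_block(containers, ref) for ref in recipe)


containers = {"c1": [zlib.compress(b"HELLOWORLD"), zlib.compress(b"BACKUP")]}
recipe = [
    BlockRef("c1", 0, 0, 5),  # "HELLO"
    BlockRef("c1", 1, 0, 6),  # "BACKUP"
    BlockRef("c1", 0, 0, 5),  # duplicated block: the same reference is reused
]
print(restore(containers, recipe))  # b'HELLOBACKUPHELLO'
```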

Optionally, according to embodiments of the invention, containers stored in the device 100 may be large containers. For instance, a size of the container may be bigger than 50MB. Each segment is compressed in the device 100. Possibly, a header of a segment may contain the location of the segment in a file (or an object). With such a scheme, removing data from a physical storage can be implemented by reading the segments that contain redeemed chunks, uncompressing the segments, replacing the redeemed parts with zeros, recompressing the segments, and replacing the old segments with the recompressed segments. In particular, replacing the redeemed part with zeros means replacing each bit of the data block that needs to be deleted with a zero. As shown by the file structure according to an embodiment of the present disclosure in FIG. 5, after the deleted data has been replaced with zeros and recompressed, space in the compressed container 101 is saved.

In this way, there is no need to update the metadata references, since the location of the segment remains the same. In addition, the locality of references is also kept. The read amplification is improved, since areas that belong to the same backup are kept together. The read amplification refers to the number of disk reads per query, which is a ratio between the amount of data that is requested and the amount of data that actually needs to be read. Notably, in a conventional way of reusing redeemed space, the locality of references is lost when the container is being reused. According to embodiments provided in this invention, the container is not reused, thus locality of references can be kept.
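The decompress, zero-fill and recompress sequence can be sketched in a few lines. The function name `delete_block`, the 4096-byte block size and the use of zlib are assumptions for illustration only; the disclosure does not name a compression algorithm.

```python
import os
import zlib


def delete_block(compressed_segment: bytes, offset: int, length: int) -> bytes:
    """Return a recompressed segment in which one block is replaced by zeros."""
    raw = bytearray(zlib.decompress(compressed_segment))
    if offset < 0 or length < 0 or offset + length > len(raw):
        raise ValueError("block lies outside the segment")
    # Overwrite the obsolete block in place; every other block keeps its offset.
    raw[offset:offset + length] = bytes(length)
    return zlib.compress(bytes(raw))


# Three hypothetical blocks of incompressible data packed into one segment.
blocks = [os.urandom(4096) for _ in range(3)]
segment = zlib.compress(b"".join(blocks))

# Delete the middle block (offset 4096, length 4096 in the uncompressed segment).
smaller = delete_block(segment, 4096, 4096)
restored = zlib.decompress(smaller)

print(len(segment), "->", len(smaller))  # the compressed segment shrinks
print(restored[:4096] == blocks[0], restored[8192:] == blocks[2])  # True True
```

Because the run of zeros compresses to almost nothing, the recompressed segment is smaller, while the offsets of the surviving blocks inside the uncompressed segment are unchanged. How the freed bytes are returned to the underlying storage is outside the scope of this sketch.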
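The effect of locality on read amplification can be illustrated with a small calculation. The sketch assumes, as a common convention that is not stated in this form in the disclosure, that read amplification is the number of bytes read divided by the number of bytes requested, and that a whole segment must be read to obtain any block in it. The segment size and block layout are invented for the example.

```python
def read_amplification(block_refs, segment_size):
    """Bytes read divided by bytes requested, assuming whole-segment reads.

    block_refs: list of (segment_index, block_length) for the requested blocks.
    """
    requested = sum(length for _, length in block_refs)
    segments_touched = {segment for segment, _ in block_refs}
    return len(segments_touched) * segment_size / requested


SEGMENT_SIZE = 1000  # assumed segment size in bytes

# Four blocks of one backup kept together in a single segment.
local = [(0, 250), (0, 250), (0, 250), (0, 250)]
# The same four blocks scattered over four reused segments.
scattered = [(0, 250), (7, 250), (19, 250), (42, 250)]

print(read_amplification(local, SEGMENT_SIZE))      # 1.0
print(read_amplification(scattered, SEGMENT_SIZE))  # 4.0
```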

This also helps deduplication since the locality of data (especially cold data) is kept. Notably, cold data refers to data which is kept over a number of backups (e.g., hundreds of), and hot data refers to data which is often replaced between backups. This process also implicitly keeps cold data and hot data separated in different containers.

In particular, according to an embodiment of the invention, each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

Preferably, according to an embodiment of the invention, the device 100 may be further configured to store a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers.
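A minimal sketch of this placement rule is shown below. The threshold of 3, the labels "hot" and "cold", and the treatment of a reference value exactly equal to the preset value (placed with the cold data here) are assumptions; the disclosure only states that blocks above and below the preset value go to different containers.

```python
PRESET_VALUE = 3  # assumed threshold, configurable per deployment


def choose_container(reference_value: int, preset_value: int = PRESET_VALUE) -> str:
    """Pick a container class from how often a block is updated."""
    # Equality is not specified in the source; it is treated as cold here.
    return "hot" if reference_value > preset_value else "cold"


def place_blocks(blocks):
    """Group (block_id, reference_value) pairs by container class."""
    placement = {"hot": [], "cold": []}
    for block_id, reference_value in blocks:
        placement[choose_container(reference_value)].append(block_id)
    return placement


print(place_blocks([("b1", 10), ("b2", 1), ("b3", 3)]))
# {'hot': ['b1'], 'cold': ['b2', 'b3']}
```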

It can be seen that this process requires no locking and no bad path handling. The process of reclaiming data can be done independently on any container. Because it requires no locking, backup and data recovery can operate as usual. It also requires no locking of metadata, and there is no common area that needs updating, so it can be done in parallel and/or distributed among any number of threads, processes, or computers.

Moreover, even in an unusual edge case, such as two entities working on the same container, there is still no risk of data corruption and no need to handle special cases.

In the previous embodiments, cross references are kept in the system, which can be verified or rebuilt independently and in parallel. However, this scheme can also work in any other deduplication system.
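Since each container is reclaimed without touching shared state, the work can be handed to independent workers. The following sketch uses a thread pool over in-memory containers; the function `reclaim_container`, the `obsolete` lists and the data are hypothetical, and a real system would operate on container files rather than Python lists.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def reclaim_container(segments, obsolete):
    """Zero out the obsolete blocks of one container.

    segments: list of compressed segments of the container.
    obsolete: list of (segment_index, offset, length) in uncompressed coordinates.
    Returns a new list of compressed segments; no other container is touched.
    """
    result = list(segments)
    for segment_index, offset, length in obsolete:
        raw = bytearray(zlib.decompress(result[segment_index]))
        raw[offset:offset + length] = bytes(length)
        result[segment_index] = zlib.compress(bytes(raw))
    return result


containers = {
    "c1": [zlib.compress(b"AAAABBBB"), zlib.compress(b"CCCCDDDD")],
    "c2": [zlib.compress(b"EEEEFFFF")],
}
obsolete = {"c1": [(0, 4, 4)], "c2": [(0, 0, 4)]}

# One task per container; the tasks share no locks and no metadata.
with ThreadPoolExecutor() as pool:
    futures = {
        cid: pool.submit(reclaim_container, segs, obsolete.get(cid, []))
        for cid, segs in containers.items()
    }
    reclaimed = {cid: future.result() for cid, future in futures.items()}

print(zlib.decompress(reclaimed["c1"][0]))  # b'AAAA\x00\x00\x00\x00'
```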
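One property consistent with this observation is that the zero-fill operation is idempotent: applying the same deletion twice leaves the segment content unchanged. The sketch below shows only that property, with a hypothetical `delete_block` helper and zlib as an assumed compressor; it does not by itself demonstrate safety under truly concurrent writers.

```python
import zlib


def delete_block(compressed_segment: bytes, offset: int, length: int) -> bytes:
    """Replace one block of a compressed segment with zeros and recompress."""
    raw = bytearray(zlib.decompress(compressed_segment))
    raw[offset:offset + length] = bytes(length)
    return zlib.compress(bytes(raw))


segment = zlib.compress(b"KEEP" + b"DROP" + b"KEEP")

once = delete_block(segment, 4, 4)
twice = delete_block(once, 4, 4)  # a second entity repeats the same deletion

# Repeating the deletion does not change the stored content.
print(zlib.decompress(once) == zlib.decompress(twice))  # True
```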

Possibly, according to an embodiment of the invention, the device 100 comprises a disk, and the one or more containers are stored on the disk.

FIG. 6 shows a method 600 for storing duplicated data blocks according to an embodiment of the present disclosure. In a particular embodiment of the disclosure, the method 600 is performed by a device 100 shown in FIG. 1. The method 600 comprises: a step 601 of storing one or more compressed containers, wherein each container comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers; and a step 602 of deleting a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

Optionally, according to an embodiment of the present disclosure, the method 600 may further comprise maintaining metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

Optionally, according to an embodiment of the present disclosure, each container further comprises segment metadata for each segment, which includes a location of the data segments in the container.

Optionally, according to an embodiment of the present disclosure, the method 600 may further comprise obtaining the one or more duplicated data blocks. Accordingly, the method may comprise filling the one or more duplicated data blocks into the plurality of segments of the one or more containers. Then, the method may further comprise compressing the plurality of segments of the one or more containers.

Optionally, according to an embodiment of the disclosure, the method 600 may further comprise retrieving the stored first duplicated data block according to metadata of the first duplicated data block.

Optionally, according to an embodiment of the disclosure, each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

Optionally, according to an embodiment of the disclosure, the method 600 may comprise storing a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers.

Deduplication systems are dedicated, and a lot of trade-offs need to be considered when designing such systems. Based on the design proposed in embodiments of this disclosure, the following effects can be gained:

1. Eliminating the need for defragmentation.

2. Removing obsolete data while requiring no metadata updates.

3. Reducing overall read amplification, and increasing data locality.

4. Increasing a lifecycle of containers with cold data, and reducing a lifecycle of containers with hot data.

5. No need for offline defragmentation.

6. The process per container is completely independent, thus allowing full distribution and parallelism with no extra effort.

The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed invention, from the studies of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Furthermore, any method according to embodiments of the present disclosure may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method. The computer program is included in a computer readable medium of a computer program product. The computer readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.

Moreover, it is realized by the skilled person that embodiments of the device 100 comprise the necessary communication capabilities in the form of e.g., functions, means, units, elements, etc., for performing the solution. Examples of other such means, units, elements and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoder, TCM decoder, power supply units, power feeders, communication interfaces, communication protocols, etc. which are suitably arranged together for performing the solution.

Especially, the processor(s) of the device 100 may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions. The expression “processor” may thus represent a processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above. The processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.

Claims

1. A device (100) for storing duplicated data blocks, the device (100) being configured to: store one or more compressed containers (101), wherein each container (101) comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers (101); and delete a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

2. The device (100) according to claim 1, being configured to: maintain metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

3. The device (100) according to claim 1 or 2, wherein each container (101) further comprises segment metadata for each segment, which includes a location of the data segments in the container (101).

4. The device (100) according to one of the claims 1 to 3, being configured to: obtain the one or more duplicated data blocks; fill the one or more duplicated data blocks into the plurality of segments of the one or more containers (101); and compress the plurality of segments of the one or more containers (101).

5. The device (100) according to one of the claims 2 to 4, being configured to: retrieve the stored first duplicated data block according to metadata of the first duplicated data block.

6. The device (100) according to one of the claims 2 to 5, wherein each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

7. The device (100) according to claim 6, being configured to: store a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers (101).

8. The device (100) according to one of the claims 1 to 7, wherein the device (100) comprises a disk, and the one or more containers (101) are stored on the disk.

9. A method (600) for storing duplicated data blocks, comprising: storing (601) one or more compressed containers, wherein each container comprises a plurality of segments, wherein one or more duplicated data blocks are stored in the plurality of segments of the one or more containers; and deleting (602) a first duplicated data block that is stored in a segment by decompressing the segment, replacing the first duplicated data block with zeros in the segment, and recompressing the segment.

10. The method (600) according to claim 9, further comprising: maintaining metadata of the one or more stored duplicated data blocks, wherein metadata of each stored duplicated data block comprises a reference to a location where the duplicated data block is stored.

11. The method (600) according to claim 9 or 10, wherein each container further comprises segment metadata for each segment, which includes a location of the data segments in the container.

12. The method (600) according to one of the claims 9 to 11, further comprising: obtaining the one or more duplicated data blocks; filling the one or more duplicated data blocks into the plurality of segments of the one or more containers; and compressing the plurality of segments of the one or more containers.

13. The method (600) according to one of the claims 10 to 12, further comprising: retrieving the stored first duplicated data block according to metadata of the first duplicated data block.

14. The method (600) according to one of the claims 10 to 13, wherein each duplicated data block is associated with a reference value, which indicates how often the duplicated data block is being updated.

15. The method (600) according to claim 14, further comprising: storing a duplicated data block that has a reference value higher than a preset value, and a duplicated data block that has a reference value lower than the preset value in different containers.

16. A computer program product comprising a program code for carrying out, when implemented on a processor, the methods according to one of the claims 9 to 15.