WO2020037985A1

WO2020037985A1 - Method and apparatus for calculating backup file size

Info

Publication number: WO2020037985A1
Application number: PCT/CN2019/079390
Authority: WO
Inventors: 古文武; 刘继朋
Original assignee: 华为技术有限公司
Priority date: 2018-08-23
Filing date: 2019-03-23
Publication date: 2020-02-27
Also published as: CN110858123A; CN110858123B

Abstract

The present application relates to the field of data protection and discloses a method and an apparatus for calculating the size of a backup file, which address the slow calculation of backup file sizes in the prior art. The method comprises: determining a target backup file and a first backup file, the first backup file being the undeleted backup file whose creation time precedes, and is closest to, the creation time of the target backup file; acquiring the data blocks belonging to the target backup file on the basis of the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file; and calculating the size of the target backup file on the basis of the data blocks belonging to it. The size of a backup file can thereby be calculated rapidly.

Description

Method and device for calculating backup file size

Technical field

The embodiments of the present application relate to the technical field of data protection, and in particular, to a method and a device for calculating the size of a backup file.

Background technique

With public, private, and hybrid clouds developing rapidly, using cloud resources to back up user data has become mainstream. Incremental backup technology is usually used to back up user data on cloud resources. The principle of incremental backup is as follows: when the user's data is backed up for the first time, a full backup is performed and all data blocks are backed up. For every later backup, an incremental backup is performed: only the data blocks whose data has changed are backed up.

The following example illustrates incremental backup. Figure 1 shows the chain structure of an incremental backup in which a group of data blocks, data block 1 to data block 6, is backed up four times. In the first backup, all six data blocks are backed up. Subsequent backups only back up data blocks whose data has changed; data blocks whose data has not changed are not backed up again. Whether the data has changed is judged relative to the previous backup. Take the second backup as an example. Compared with the first backup, the data in data block 1, data block 2, and data block 4 has changed, while the data in data block 3, data block 5, and data block 6 has not. Therefore, in the second backup, only data block 1, data block 2, and data block 4 are backed up, and data block 3, data block 5, and data block 6 are not backed up again.
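The selection rule in this example can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation described in the application: the function name `blocks_to_back_up` and the block contents are hypothetical, and a plain equality comparison stands in for whatever change detection a real backup system uses.

```python
def blocks_to_back_up(previous, current):
    """Return the numbers of the data blocks that a backup must copy.

    previous: block number -> content at the last backup, or None for the
              first (full) backup.
    current:  block number -> content now.
    """
    if previous is None:
        # First backup: full backup of every data block.
        return sorted(current)
    # Later backups: only blocks whose content differs from the last backup.
    return sorted(n for n, data in current.items() if previous.get(n) != data)


# Hypothetical contents mirroring the first two backups of Figure 1.
first_state = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e", 6: "f"}
second_state = {1: "a*", 2: "b*", 3: "c", 4: "d*", 5: "e", 6: "f"}

print(blocks_to_back_up(None, first_state))          # full backup
print(blocks_to_back_up(first_state, second_state))  # incremental backup
```

With these inputs the first call selects all six blocks and the second selects only blocks 1, 2, and 4, matching the description of the second backup.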

In the prior art, each backup performed for a data block group generates a metadata record file. The metadata record file stores the identifier and the storage path of each data block in the data block group. The data blocks of the data block group that are actually backed up in a given backup are considered the data blocks contained in the corresponding backup file.

For example, as shown in FIG. 2, a metadata record file (identified as backup 1) is generated for the first backup. It stores the identifiers of data block 1 to data block 6 (Block1 ... Block6) and their storage paths (File1 ... File6). For the second backup, another metadata record file (identified as backup 2) is generated, which also stores the identifiers and storage paths of data block 1 to data block 6. The difference is that during the second backup only data block 1, data block 2, and data block 4, whose content has changed, are backed up; the metadata record file identified as backup 2 therefore records the new storage paths File1', File2', and File4' for them. The storage paths of data block 3, data block 5, and data block 6, whose content has not changed, are the same as in the metadata record file of backup 1 (still File3, File5, and File6). In other words, in the backup file generated by the second backup, data block 3, data block 5, and data block 6 reference the data blocks of the previous backup file.

In the prior art, calculating the storage space occupied by the data blocks of a backup file only requires calculating the storage space occupied by the changed data blocks included in that backup file. Based on the storage paths of the changed data blocks recorded in the metadata record file, the corresponding data blocks are located in the underlying storage, and the size of each located data block is read to obtain the size of the backup file. For example, to calculate the storage space occupied by the backup file obtained after the second backup, only the three data blocks corresponding to File1', File2', and File4' need to be located, and the storage space they occupy is taken as the size of that backup file. Because every data block must be looked up in the underlying storage, this way of calculating the size of a backup file is slow.
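The prior-art calculation can be sketched as follows, as a rough illustration only. The dictionary `STORAGE` stands in for the underlying storage and its sizes are invented; the storage paths are written as File1 ... File6 and File1', File2', File4'; and treating a block as changed when its path differs from the previous record is an assumption made for this sketch.

```python
# Hypothetical underlying storage: storage path -> stored block size in bytes.
STORAGE = {
    "File1": 4096, "File2": 4096, "File3": 2048,
    "File4": 4096, "File5": 1024, "File6": 4096,
    "File1'": 4096, "File2'": 3072, "File4'": 4096,
}

# Metadata record files as in FIG. 2: block identifier -> storage path.
BACKUP_1 = {f"Block{i}": f"File{i}" for i in range(1, 7)}
BACKUP_2 = dict(BACKUP_1, Block1="File1'", Block2="File2'", Block4="File4'")


def prior_art_size(record, previous_record, storage):
    """Sum the sizes of the changed blocks, looking each one up in storage."""
    total = 0
    for block_id, path in record.items():
        unchanged = (
            previous_record is not None
            and previous_record.get(block_id) == path
        )
        if not unchanged:
            total += storage[path]  # one storage lookup per changed block
    return total


print(prior_art_size(BACKUP_1, None, STORAGE))      # first (full) backup
print(prior_art_size(BACKUP_2, BACKUP_1, STORAGE))  # second backup
```

Each changed block costs one lookup in the underlying storage, which is the step the application identifies as slow.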

Summary of the Invention

The embodiments of the present application provide a method and a device for calculating the size of a backup file, which are used to solve the problem of slow calculation speed when calculating the size of a backup file in the prior art. The specific technical solutions provided in the embodiments of the present application are as follows:

In a first aspect, a method for calculating the size of a backup file is provided. The method can be applied to a backup server, specifically:

First, the backup server determines a target backup file whose size is to be calculated, and determines a first backup file. The first backup file is the backup file whose creation time is before, and closest to, the creation time of the target backup file, and the first backup file has not been deleted. The metadata of a backup file records the reference information and the size of each data block; the reference information of a data block indicates the backup file to which the data block belongs. Then, the backup server queries the metadata of the target backup file for the reference information of each of the multiple data blocks included in the target backup file, to obtain the first data block belonging to the target backup file. Next, the backup server obtains a second data block belonging to the target backup file according to the reference information of each data block of the target backup file recorded in the metadata of the target backup file and the reference information of each data block of the first backup file recorded in the metadata of the first backup file. Finally, the backup server calculates the size of the target backup file according to the sizes of the first data block and the second data block recorded in the metadata of the target backup file.

In a possible implementation, when calculating the size of the target backup file, the sum of the sizes of the first data block and the second data block may be determined as the size of the target backup file.

In the embodiment of the present application, because the metadata of the target backup file and the metadata of the first backup file each record the reference information of every data block included in the corresponding backup file, the data blocks belonging to the target backup file can be identified quickly, and the recorded sizes of those data blocks can be added up to calculate the size of the target backup file quickly.

In a possible implementation, the second data block belonging to the target backup file may be determined as follows: according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file, the data blocks belonging to a second backup file are obtained, where the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted; the data blocks belonging to the second backup file are then determined as the second data block belonging to the target backup file.

In the embodiment of the present invention, after the second backup file is deleted, the data blocks belonging to the second backup file are re-attributed to the target backup file. The data blocks that originally belonged to the second backup file can therefore be determined as the second data block belonging to the target backup file, so that the size of the backup file is calculated quickly and accurately.

In a possible implementation, the backup server may determine a target backup file after receiving a query request, and the query request includes an identifier of the target backup file, and the backup server determines the target backup file according to the identifier.

The query request may be sent by the tenant or sent by the billing system.

In a possible implementation, the method described in the first aspect may further include: recording, in the metadata of the target backup file, the reference information and the size of each data block included in the target backup file, and recording, in the metadata of the first backup file, the reference information of each data block included in the first backup file, so that the backup file size can subsequently be calculated based on the information recorded in the metadata.

In a second aspect, a method for calculating the size of a backup file is provided. The method can be applied to a backup server, specifically:

First, the backup server obtains a saved first total size of the plurality of data blocks included in a target backup file, the target backup file being a backup file of a source file; then, the backup server obtains a saved second total size of those data blocks, among the plurality of data blocks included in the target backup file, that do not belong to the target backup file; finally, the size of the target backup file is calculated according to the first total size and the second total size.

In a possible implementation, the difference between the first total size and the second total size is determined as the size of the target backup file.

In the embodiment of the present invention, because the first total size of the data blocks included in the target backup file and the second total size of the third data blocks that do not belong to the target backup file are saved in advance, the size of the target backup file can be calculated quickly by subtraction.

In a possible implementation, the process of pre-saving the first total size of the plurality of data blocks included in the target backup file includes: the backup server determines the target backup file, queries the metadata of the target backup file for the size of each data block included in the target backup file, determines the sum of those sizes as the first total size, and saves it.

In a possible implementation, the process of pre-saving the second total size of the data blocks that do not belong to the target backup file includes: the backup server determines the target backup file and, according to the reference information of each data block included in the target backup file recorded in the metadata of the target backup file, obtains the third data blocks, that is, those of the plurality of data blocks included in the target backup file that do not belong to the target backup file, where the reference information of each data block indicates the backup file to which the data block belongs; then, according to the sizes of the third data blocks recorded in the metadata of the target backup file, the backup server determines the sum of the sizes of the third data blocks as the second total size.
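The subtraction approach of the second aspect can be sketched as below. This is only an illustration under stated assumptions: the metadata is modelled as a plain dictionary, the reference information is reduced to a hypothetical `owner` field holding the backup serial number of the owning backup file, and all sizes are invented.

```python
def presave_totals(metadata):
    """Compute the two totals that are saved in advance.

    metadata: dict with "serial" (serial number of this backup file) and
              "blocks", a list of dicts with "size" and "owner".
    """
    # First total size: every data block included in the backup file.
    first_total = sum(b["size"] for b in metadata["blocks"])
    # Second total size: the third data blocks, i.e. blocks that are
    # included in the backup file but attributed to another backup file.
    second_total = sum(
        b["size"] for b in metadata["blocks"] if b["owner"] != metadata["serial"]
    )
    return first_total, second_total


def backup_file_size(first_total, second_total):
    """Size of the target backup file as the difference of the two totals."""
    return first_total - second_total


# Hypothetical metadata for the second backup of Figure 1.
target = {
    "serial": 2,
    "blocks": [
        {"id": "Block1", "size": 4096, "owner": 2},
        {"id": "Block2", "size": 3072, "owner": 2},
        {"id": "Block3", "size": 2048, "owner": 1},
        {"id": "Block4", "size": 4096, "owner": 2},
        {"id": "Block5", "size": 1024, "owner": 1},
        {"id": "Block6", "size": 4096, "owner": 1},
    ],
}

first_total, second_total = presave_totals(target)
print(first_total, second_total, backup_file_size(first_total, second_total))
```

Once both totals are stored, answering a size query needs only the subtraction, with no per-block work at query time.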

The query request may be sent by the tenant or sent by the billing system.

In a possible implementation, the method described in the second aspect may further include: recording, in the metadata of the target backup file, the reference information and the size of each data block included in the target backup file.

In a third aspect, a method for calculating a size of data deleted in a backup file is provided. The method can be applied to a backup server, and specifically:

First, the backup server determines a target backup file. The target backup file includes multiple data blocks and is a backup file of a source file. The source file also has a first backup file and a third backup file, each of which includes a plurality of data blocks. The first backup file is the backup file whose creation time is before, and closest to, the creation time of the target backup file, and the first backup file has not been deleted. The third backup file is the backup file whose creation time is after, and closest to, the creation time of the target backup file, and the third backup file has not been deleted;

Then, according to the reference information of each data block contained in the target backup file recorded in the metadata of the target backup file, obtaining a first data block belonging to the target backup file;

Then, a first backup file is determined, and a second data block belonging to the target backup file is obtained according to the reference information of each data block included in the target backup file recorded in the metadata of the target backup file and the reference information of each data block included in the first backup file recorded in the metadata of the first backup file, where the reference information of each data block indicates the backup file to which the data block belongs;

Then, a third backup file is determined, and according to the reference information of each data block included in the third backup file recorded in the metadata of the third backup file, a third data block is obtained, the third data block being a data block, among the first data block and the second data block belonging to the target backup file, that is referenced by the third backup file;

Finally, a fourth data block that can be deleted is determined according to the first data block, the second data block, and the third data block, and the size of the data that can be deleted in the target backup file is determined according to the size of each fourth data block recorded in the metadata of the target backup file.

In a possible implementation, the sum of the sizes of the fourth data blocks is determined as the size of data that can be deleted in the target backup file.

In the embodiment of the present invention, because the metadata of the target backup file, the metadata of the first backup file, and the metadata of the third backup file each record the reference information of every data block included in the corresponding backup file, the data blocks that can be deleted in the target backup file can be identified quickly, and the size of the data that can be deleted in the target backup file can be calculated quickly from the recorded sizes of those data blocks.

In a possible implementation, when determining the fourth data block that can be deleted in the target backup file, the data blocks among the first data block and the second data block other than the third data block may be determined as the fourth data block that can be deleted.
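The set relationship among the four kinds of data block can be sketched as follows. This is an illustrative sketch, not the claimed method: the reference information is reduced to a hypothetical `owner` serial number, the sizes are invented, and a block is assumed to be "referenced by the third backup file" when the third backup file's metadata records the same owner for the block with the same identifier.

```python
def deletable_blocks(attributed, third_backup_refs):
    """Return the fourth data blocks (those that can be deleted).

    attributed:        first and second data blocks of the target backup
                       file, as dicts with "id", "size" and "owner".
    third_backup_refs: block identifier -> owner recorded in the metadata
                       of the third (next later, undeleted) backup file.
    """
    # Third data blocks: attributed blocks still referenced by the third
    # backup file (assumption: same identifier and same recorded owner).
    third_ids = {
        b["id"] for b in attributed if third_backup_refs.get(b["id"]) == b["owner"]
    }
    # Fourth data blocks: attributed blocks other than the third data blocks.
    return [b for b in attributed if b["id"] not in third_ids]


def deletable_size(attributed, third_backup_refs):
    """Sum of the sizes of the fourth data blocks."""
    return sum(b["size"] for b in deletable_blocks(attributed, third_backup_refs))


# Hypothetical data: the target backup file has serial number 2.
attributed = [
    {"id": "Block1", "size": 4096, "owner": 2},
    {"id": "Block2", "size": 3072, "owner": 2},
    {"id": "Block4", "size": 4096, "owner": 2},
]
# The later backup re-backed-up Block1 but still references Block2 and Block4.
third_backup_refs = {"Block1": 3, "Block2": 2, "Block3": 1,
                     "Block4": 2, "Block5": 3, "Block6": 1}

print([b["id"] for b in deletable_blocks(attributed, third_backup_refs)])
print(deletable_size(attributed, third_backup_refs))
```

In this example only Block1 can be deleted, because the later backup file holds its own copy of that block while still depending on Block2 and Block4.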

In a possible implementation, the second data block belonging to the target backup file may be determined as follows: first, according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file, the data blocks belonging to a second backup file are obtained, where the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted; then, the data blocks belonging to the second backup file are determined as the second data block belonging to the target backup file.

In the embodiment of the present invention, after the second backup file is deleted, the data blocks belonging to the second backup file are re-attributed to the target backup file. The data blocks that originally belonged to the second backup file can therefore be determined as the second data block belonging to the target backup file.

In a possible implementation, the backup server may determine a target backup file after receiving the delete request, the delete request includes an identifier of the target backup file, and the backup server determines the target backup file according to the identifier.

The deletion request may be sent by the tenant or by another system.

In a possible implementation, the method described in the third aspect may further include: recording, in the metadata of the target backup file, the reference information and the size of each data block included in the target backup file; recording, in the metadata of the first backup file, the reference information of each data block included in the first backup file; and recording, in the metadata of the third backup file, the reference information of each data block included in the third backup file, so that the size of the data that can be deleted in the backup file can subsequently be calculated based on the information recorded in the metadata.

In a fourth aspect, the present application provides a device for calculating the size of a backup file, for use in a backup server or a chip of a backup server, including units or means for performing each step in any of the above aspects.

In a fifth aspect, the present application provides a device for calculating the size of a backup file, for use in a backup server or a chip of a backup server. The device includes at least one processing element and at least one storage element, where the at least one storage element is configured to store a program and data, and the at least one processing element is configured to perform the method provided by any aspect of the present application.

According to a sixth aspect, the present application provides a device for calculating a size of a backup file, which is used for a backup server including at least one processing element (or chip) for performing the method of any of the above aspects.

In a seventh aspect, the present application provides a computer program product including computer instructions that, when executed by a computer, cause the computer to execute the method of any of the above aspects.

In an eighth aspect, the present application provides a computer-readable storage medium that stores computer instructions, and when the computer instructions are executed by a computer, the computer is caused to execute the method of any of the above aspects.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a chain structure diagram of an incremental backup in the prior art;

FIG. 2 is a schematic diagram of a metadata record file of a backup file in the prior art;

FIG. 3 is a structural diagram of a backup system according to an embodiment of the present application;

FIG. 4 is a flowchart of a method for calculating the size of a backup file according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a metadata record file of a backup file in an embodiment of the present application;

FIG. 6 is a flowchart of a method for calculating the size of a backup file according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a backup process in an embodiment of the present application;

FIG. 8 is a schematic diagram of a process of deleting a backup file in an embodiment of the present application;

FIG. 9 shows a device for calculating the size of a backup file in an embodiment of the present application;

FIG. 10 shows an apparatus for calculating the size of deleted data in a backup file according to an embodiment of the present application;

FIG. 11 shows an apparatus for calculating the size of a backup file in an embodiment of the present application;

FIG. 12 shows an apparatus for calculating the size of deleted data in a backup file according to an embodiment of the present application.

Detailed description

The embodiments of the present application will be described in detail below with reference to the drawings.

FIG. 3 is a backup system 300 according to an embodiment of the present application. The system includes a user 301, a backup server 302, a production storage server 303, and a backup storage server 304.

Users can back up, restore, and delete backup files using the backup server 302 in the backup system. This application applies to backup and deletion scenarios. To prevent data loss caused by service failures, the user 301 can periodically send a backup request to the backup server 302 to back up the data that needs to be backed up. The user 301 initiates a backup task to the backup server 302; the backup server 302 reads the data to be backed up from the production storage server 303 and stores it in the backup storage server 304. The backup server 302 also has a local database for managing backup task data. A user can be a tenant or another system.

The storage space rented by a tenant on cloud resources is limited. The tenant can keep a record of the storage space occupied by each backup file so that, when the rented storage space is insufficient, some backup files can be deleted. The tenant can therefore query the backup server 302 for the size of a backup file and record it.

In addition to the tenant, the billing system can also query the backup server 302 for the size of a backup file. The billing system charges for each backup file, so it needs to know the size of each backup file in order to perform billing.

In the prior art, when the backup server calculates the size of a backup file, it needs to query the underlying storage for the size of each data block according to the storage path of the data block in the backup file, so the calculation is slow. This application provides a method for calculating the size of a backup file that can increase the calculation speed. Calculating the size of a backup file may involve a backup file adjacent to it. The multiple backup files referred to in this application are located in the same backup chain, that is, they are backup files of the same source file. In addition, it should be understood that in the description of this application, the words "first" and "second" are used only to distinguish between the items described, and are not to be understood as indicating or implying relative importance or order.

As shown in FIG. 4, the present application discloses a flowchart of a method for calculating the size of a backup file. The backup server in this process is specifically the backup server 302 shown in FIG. 3. It can be understood that, in this application, the function of the backup server may also be implemented by a chip applied to the backup server. The process is specifically:

Step S401: The backup server receives a query request, and the query request includes an identifier of a target backup file.

Step S402: The backup server determines a target backup file.

In the embodiment of the present application, when the tenant or the billing system wants to know the size of a backup file, it may send a query request to the backup server to query the size of the backup file. The query request includes an identifier of the backup file to be queried, and the backup file to be queried is referred to as the target backup file. After receiving the query request, the backup server can determine the target backup file according to the identifier contained in the query request.

In the incremental backup technology, when the source file is backed up for the first time, a full backup of the source file is performed. In the subsequent second, third, ..., n-th backups, an incremental backup is performed relative to the backup file generated by the previous backup: only data blocks whose data has changed are backed up, while data blocks whose data has not changed are not backed up and instead reference the data blocks in the backup file generated by the previous backup. Whether it is the first backup or a later one, the backup file generated is a backup file of the source file.

In each backup, a data block that is actually backed up is regarded as a data block belonging to the backup file generated by that backup. Every data block, whether it is backed up in this backup or is not backed up and instead references a data block in the backup file generated by a previous backup, is regarded as a data block included in the backup file generated by this backup.

As shown in Figure 1, in the third backup, data block 1, data block 2, data block 4, and data block 6 reference data blocks in the backup files generated by previous backups, while data block 3 and data block 5 are the data blocks backed up in the third backup. Data blocks 1 to 6 are therefore all considered data blocks included in the backup file generated by the third backup; data block 3 and data block 5 are considered to belong to the backup file generated by the third backup, whereas data block 1, data block 2, data block 4, and data block 6 do not belong to it.

In the embodiment of the application, the target backup file is a backup file generated when performing an incremental backup, and the target backup file is a backup file of the source file. Each backup file has its own metadata. The metadata includes the identifier of the backup file, the backup serial number of the backup file, and the information of each data block included in the backup file. The backup serial number increases in the order in which the backup files are generated, and thus indicates the generation order of the backup files. The information of a data block includes the size of the data block, the reference information of the data block, and so on. The reference information of a data block indicates the backup file to which the data block belongs, that is, the backup file generated when the data block was backed up. The reference information recorded in the metadata can be the identifier and the backup serial number of the backup file to which the data block belongs.
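One way to picture this metadata is as a pair of small record types. The sketch below is an assumption-laden illustration: the class names, field names, identifier format ("backup-3"), and sizes are all hypothetical, and only the kinds of field (identifier, backup serial number, per-block size and reference information) come from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlockInfo:
    block_id: str      # identifier of the data block
    size: int          # size of the data block in bytes
    owner_id: str      # reference information: identifier of the owning backup file
    owner_serial: int  # reference information: its backup serial number


@dataclass
class BackupMetadata:
    backup_id: str     # identifier of the backup file
    serial: int        # backup serial number, increasing in generation order
    blocks: List[BlockInfo] = field(default_factory=list)


# Hypothetical metadata for the third backup of Figure 1:
# block number -> serial number of the backup in which it was last backed up.
owners = {1: 2, 2: 2, 3: 3, 4: 2, 5: 3, 6: 1}
third_backup = BackupMetadata(
    backup_id="backup-3",
    serial=3,
    blocks=[BlockInfo(f"Block{n}", 4096, f"backup-{s}", s) for n, s in owners.items()],
)

for block in third_backup.blocks:
    print(block.block_id, block.size, block.owner_id, block.owner_serial)
```

Because the size and the owning backup file are stored with each block entry, later calculations can work from the metadata alone without touching the stored data blocks.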

After determining the target backup file, the backup server may further perform step 403: query the metadata of the target backup file for the reference information of each of the multiple data blocks included in the target backup file, to obtain the first data block belonging to the target backup file.

After the reference information of each of the multiple data blocks included in the target backup file has been queried from the metadata of the target backup file, the data blocks belonging to the target backup file can be determined from that reference information. A data block determined to belong to the target backup file solely on the basis of the reference information recorded in the metadata of the target backup file is a first data block; that is, a first data block has belonged to the target backup file from the outset.

When the data blocks belonging to the target backup file are determined from the queried reference information, the determination may be based on the backup file identifier in the reference information: the backup server identifies the identifier of the owning backup file recorded for each data block in the metadata of the target backup file, and determines the data blocks whose recorded identifier is that of the target backup file as first data blocks belonging to the target backup file. Alternatively, the determination may be based on the backup serial number in the reference information: the backup server identifies the backup serial number of the owning backup file recorded for each data block in the metadata of the target backup file, and determines the data blocks whose recorded backup serial number is that of the target backup file as first data blocks belonging to the target backup file.
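Both ways of picking out the first data blocks can be sketched with one small function. This is a minimal sketch under assumptions: the metadata is a plain dictionary, and the field names `owner_id` and `owner_serial` as well as the identifier format are hypothetical.

```python
def first_data_blocks(target, by="id"):
    """Return the blocks whose reference information points at the target.

    by="id":     compare the recorded backup file identifier.
    by="serial": compare the recorded backup serial number.
    """
    if by == "id":
        return [b for b in target["blocks"] if b["owner_id"] == target["backup_id"]]
    if by == "serial":
        return [b for b in target["blocks"] if b["owner_serial"] == target["serial"]]
    raise ValueError("by must be 'id' or 'serial'")


# Hypothetical metadata of the backup file generated by the third backup.
target = {
    "backup_id": "backup-3",
    "serial": 3,
    "blocks": [
        {"id": "Block1", "owner_id": "backup-2", "owner_serial": 2},
        {"id": "Block2", "owner_id": "backup-2", "owner_serial": 2},
        {"id": "Block3", "owner_id": "backup-3", "owner_serial": 3},
        {"id": "Block4", "owner_id": "backup-2", "owner_serial": 2},
        {"id": "Block5", "owner_id": "backup-3", "owner_serial": 3},
        {"id": "Block6", "owner_id": "backup-1", "owner_serial": 1},
    ],
}

print([b["id"] for b in first_data_blocks(target, by="id")])
print([b["id"] for b in first_data_blocks(target, by="serial")])
```

Either comparison yields Block3 and Block5 here, since the identifier and the backup serial number both designate the same owning backup file.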

In the embodiment of the present invention, the source file also has a first backup file. The first backup file is the backup file whose creation time is before, and closest to, the creation time of the target backup file, and the first backup file has not been deleted.

For example, suppose the target backup file is the backup file generated during the fourth backup. If no backup file has been deleted, the first backup file is the backup file generated during the third backup. If the backup file generated during the third backup has been deleted, the first backup file is the backup file generated during the second backup, and the data blocks that originally belonged to the backup file generated during the third backup now belong to the target backup file. If the backup files generated during both the third backup and the second backup have been deleted, the first backup file is the backup file generated during the first backup, and the data blocks that originally belonged to the backup files generated during the third backup and the second backup now belong to the target backup file.

The process of the backup server determining the first backup file may be as follows:

The backup server determines the backup serial number of the target backup file, referred to as the target backup serial number, from the recorded backup serial numbers of the backup files. The backup server then determines the first backup serial number, which is adjacent to and smaller than the target backup serial number;

The first backup file corresponding to the first backup serial number is then determined according to the backup serial number of each backup file.
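The lookup of the first backup serial number can be sketched as below. The function name is hypothetical, and the sketch assumes that the recorded serial numbers cover only backup files that have not been deleted, so that "adjacent and smaller" means the largest remaining serial number below the target.

```python
def first_backup_serial(existing_serials, target_serial):
    """Return the largest recorded serial number below the target, or None.

    existing_serials: serial numbers of the backup files that still exist
                      (assumption: deleted backup files are not listed).
    """
    earlier = [s for s in existing_serials if s < target_serial]
    return max(earlier) if earlier else None


# The target backup file is the one generated during the fourth backup.
print(first_backup_serial([1, 2, 3, 4], 4))  # nothing deleted
print(first_backup_serial([1, 2, 4], 4))     # third backup deleted
print(first_backup_serial([1, 4], 4))        # second and third deleted
```

The three calls return 3, 2, and 1 respectively, matching the three cases of the fourth-backup example.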

In order to make the determined size of the backup file more accurate, step 404 may also be performed: determine the first backup file; obtain the data blocks belonging to a second backup file according to the reference information of each data block contained in the target backup file, recorded in the metadata of the target backup file, and the reference information of each data block contained in the first backup file, recorded in the metadata of the first backup file; and determine the data blocks belonging to the second backup file as second data blocks belonging to the target backup file. The creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted.

When the backup server obtains the data blocks belonging to the second backup file according to the reference information of each data block contained in the target backup file, recorded in the metadata of the target backup file, and the reference information of each data block contained in the first backup file, recorded in the metadata of the first backup file, it proceeds as follows. For each data block in the target backup file other than the first data blocks, the backup server identifies the corresponding data block in the first backup file. The data block and its corresponding data block in the first backup file are backups of the same data block in the source file. The backup server judges whether the reference information of the data block is the same as the reference information of its corresponding data block in the first backup file. If they are different, the data block is determined to be a data block that originally belonged to the deleted second backup file, that is, a second data block that currently belongs to the target backup file.

For each data block in the target backup file other than the first data blocks, the corresponding data block in the first backup file may be identified as follows. The backup server records, in the metadata of each backup file, the identifier of each data block contained in that backup file, and the backup data blocks of the same data block in the source file may share the same identifier. The backup server may therefore identify, for each data block in the target backup file other than the first data blocks, the data block in the first backup file that has the same identifier, and determine that data block as the corresponding data block in the first backup file.

After the backup server determines the first data blocks and the second data blocks belonging to the target backup file, it may calculate the size of the target backup file according to the sizes of the first data blocks and the second data blocks recorded in the metadata of the target backup file. Specifically, step 405 may be performed: determine the sum of the sizes of the first data blocks and the second data blocks as the size of the target backup file.

Since the metadata of the target backup file and the metadata of the first backup file each record the reference information of every data block contained in the corresponding backup file, the data blocks belonging to the target backup file can be identified quickly, and the size of the target backup file can be calculated quickly by adding up the recorded sizes of those data blocks.
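Steps 404 and 405 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the embodiment: metadata is modelled as a dictionary from a block identifier to a hypothetical `BlockRecord`, the reference information is reduced to the serial number of the owning backup file, and a block with no counterpart in the first backup file is assumed to have come from a deleted backup.

```python
from typing import Dict, NamedTuple


class BlockRecord(NamedTuple):
    owner_serial: int  # reference information: serial of the backup the block belongs to
    size: int          # recorded size of the data block


Metadata = Dict[str, BlockRecord]


def target_backup_size(target_meta: Metadata, first_meta: Metadata, target_serial: int) -> int:
    """Sum the sizes of the first and second data blocks of the target backup."""
    total = 0
    for block_id, record in target_meta.items():
        if record.owner_serial == target_serial:
            # First data block: its reference information names the target backup.
            total += record.size
            continue
        counterpart = first_meta.get(block_id)
        # Assumption: a block missing from the first backup is treated like a
        # block whose reference information differs.
        if counterpart is None or counterpart.owner_serial != record.owner_serial:
            # Second data block: it originally belonged to a deleted backup
            # created between the first backup and the target backup.
            total += record.size
    return total
```

Only two metadata records are read, so the cost grows with the number of data blocks in the target backup file and not with the length of the backup chain.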

The reference information and size of each data block contained in a backup file, which are recorded in the metadata of the backup file, may specifically be recorded in the metadata record file of the backup file. In addition to the identifier (Block) and storage path (File) of each data block shown in FIG. 2, the metadata record file also records the size of the data block, the identifier of the backup file to which the data block belongs, and the backup serial number of that backup file; see the metadata record file shown in FIG. 5 for details. FIG. 5 is only intended to indicate what information is recorded in the metadata record file of a backup file; the specific record format may be the format shown in FIG. 5 or another format set by the backup server. In FIG. 5, the third column is the backup serial number of the backup file to which the data block belongs, which can be understood as an ordered identifier; of course, the identifier of the backup file and the backup serial number may also be recorded separately. The fourth column is the size of the data block. Taking the third backup as an example, the metadata record file of the backup file generated by the third backup is backup 3 in FIG. 5. The number “2” corresponding to data block 1 (Block1) in backup 3 indicates that the backup file to which data block 1 belongs is the backup file generated during the second backup. The number “3” corresponding to data block 5 (Block5) indicates that the backup file to which data block 5 belongs is the backup file generated during the third backup.
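One possible in-memory shape for such a metadata record file is sketched below. The class names, field names, storage paths, and sizes are all hypothetical; only the ownership of Block1 (second backup) and Block5 (third backup) is taken from the description of FIG. 5, and the other rows of FIG. 5 are not reproduced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlockEntry:
    block: str         # identifier of the data block, e.g. "Block1"
    file: str          # storage path of the data block (made-up value)
    owner_serial: int  # backup serial number of the backup file the block belongs to
    size: int          # size of the data block (made-up value)


@dataclass
class MetadataRecord:
    backup_id: str
    backup_serial: int
    entries: List[BlockEntry] = field(default_factory=list)

    def owned_blocks(self) -> List[str]:
        """Identifiers of the blocks whose reference names this backup itself."""
        return [e.block for e in self.entries if e.owner_serial == self.backup_serial]


# Partial stand-in for "backup 3": Block1 belongs to backup 2, Block5 to backup 3.
backup3 = MetadataRecord(
    backup_id="backup 3",
    backup_serial=3,
    entries=[
        BlockEntry("Block1", "/path/to/block1", owner_serial=2, size=4096),
        BlockEntry("Block5", "/path/to/block5", owner_serial=3, size=4096),
    ],
)
```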

In this embodiment, the target backup file is not the backup file generated by the first backup, so the multiple data blocks contained in the target backup file may include data blocks that do not belong to the target backup file. The size of the target backup file can be calculated either by adding up the sizes of the data blocks that belong to the target backup file, or by subtracting the sizes of the data blocks that do not belong to the target backup file from the total size of the multiple data blocks contained in the target backup file.

In another embodiment of the present application, in order to calculate the size of a backup file even more quickly, the total size of the multiple data blocks contained in the backup file, and the total size of those data blocks that do not belong to the backup file, may be calculated and saved in advance according to the backup serial number of the backup file, the reference information of each data block, and the size of each data block recorded in the metadata of the backup file. For ease of distinction, the total size of the multiple data blocks contained in the target backup file is referred to as the first total size, and the total size of the data blocks, among the multiple data blocks contained in the target backup file, that do not belong to the target backup file is referred to as the second total size.

The backup server receives a query request that includes an identifier of the target backup file. The backup server determines the target backup file according to the identifier and obtains the saved first total size of the multiple data blocks contained in the target backup file, where the target backup file is a backup file of the source file. The backup server then obtains the saved second total size of the data blocks, among the multiple data blocks contained in the target backup file, that do not belong to the target backup file. Finally, the backup server calculates the size of the target backup file according to the first total size and the second total size.

Specifically, the difference between the first total size and the second total size is determined as the size of the target backup file.

For the target backup file, the total size of the data blocks of the first backup file that are referenced by the target backup file is the total size of the data blocks contained in the target backup file that do not belong to it. Therefore, when recording the total size of the data blocks that do not belong to the target backup file, the total size of the data blocks of the first backup file that are referenced by the target backup file may be recorded. Of course, for each backup file created before the target backup file, the total size of the data blocks of that earlier backup file that are referenced by the target backup file may also be recorded.

The total size of the multiple data blocks contained in a backup file and, for each backup file created before it, the total size of the data blocks of that earlier backup file that are referenced by the backup file, may be stored in a backup chain reference relationship record file. As shown in Table 1, the information recorded in the backup chain reference relationship record file includes: the identifier (ID) of the backup file, for example backup 1, backup 2, ..., backup n; the backup serial number (Snaplndex) of the backup file, such as 1, 2, ..., n; the total size (Totasize) of the multiple data blocks contained in the backup file; and, for each backup file generated before the backup file, the total size (Reference) of the data blocks of that earlier backup file that are referenced by the backup file. The size of a backup file can subsequently be calculated based on the information recorded in the backup chain reference relationship record file.

ID	Snaplndex	Totasize	Reference
Backup 1	1	T1	None
Backup 2	2	T2	R(2,1)
Backup 3	3	T3	R(3,1); R(3,2)
Backup 4	4	T4	R(4,1); R(4,2); R(4,3)
...	...	...	...
Backup n	n	Tn	R(n,1); R(n,2); R(n,3); ...; R(n,n-1)

Table 1

Take the row identified as backup 3 in Table 1 as an example: T3 represents the total size of the multiple data blocks contained in the backup file with backup serial number 3; R(3,1) represents the total size of the data blocks of the backup file with backup serial number 1 that are referenced by the backup file with backup serial number 3; and R(3,2) represents the total size of the data blocks of the backup file with backup serial number 2 that are referenced by the backup file with backup serial number 3.

The backup server may record, in advance, the information used for calculating the size of a backup file in the backup chain reference relationship record file, according to the backup serial number of the backup file, the reference information of each data block, and the size of each data block recorded in the metadata of the backup file. The backup server generally records this information in real time as backup files are generated; that is, when a backup file is generated, the information to be recorded in the backup chain reference relationship record file is determined from the information recorded in the metadata of that backup file. When recording information in the backup chain reference relationship record file, there is no need to consider the case in which a backup file is deleted.

The process by which the backup server pre-saves the first total size of the multiple data blocks contained in the target backup file includes: the backup server determines the target backup file, queries the size of each data block contained in the target backup file from the metadata of the target backup file, determines the sum of those sizes as the first total size of the multiple data blocks contained in the target backup file, and saves the first total size.

The process by which the backup server pre-saves the second total size of the data blocks that do not belong to the target backup file includes: the backup server determines the target backup file and, according to the reference information of each data block contained in the target backup file recorded in the metadata of the target backup file, obtains the third data blocks, which are the data blocks among the multiple data blocks contained in the target backup file that do not belong to the target backup file, where the reference information of each data block indicates the backup file to which the data block belongs; then, according to the sizes of the third data blocks recorded in the metadata of the target backup file, the backup server determines the second total size of the data blocks that do not belong to the target backup file, the second total size being the sum of the sizes of the third data blocks.

For the backup server to calculate the size of the target backup file, the backup chain reference relationship record file records at least the first total size of the multiple data blocks contained in the target backup file and the total size of the data blocks of the first backup file that are referenced by the target backup file, that is, the second total size of the data blocks, among the multiple data blocks contained in the target backup file, that do not belong to the target backup file.

The process of recording, in the backup chain reference relationship record file, the total size of the multiple data blocks contained in the target backup file, according to the reference information and size of each data block contained in the target backup file recorded in the metadata of the target backup file, includes:

The backup server determines the target backup file and, according to the size of each data block recorded in the metadata of the target backup file, determines the sum of the sizes of those data blocks as the total size of the multiple data blocks contained in the target backup file, and records that total size in the backup chain reference relationship record file.

The process of recording, in the backup chain reference relationship record file, the second total size of the data blocks of the first backup file that are referenced by the target backup file, according to the reference information and size of each data block contained in the target backup file recorded in the metadata of the target backup file, includes:

The backup server determines the target backup file and, according to the reference information of each data block contained in the target backup file recorded in the metadata of the target backup file, obtains the third data blocks, which are the data blocks among the multiple data blocks contained in the target backup file that do not belong to the target backup file, where the reference information of each data block indicates the backup file to which the data block belongs; then, according to the sizes of the third data blocks recorded in the metadata of the target backup file, the backup server determines the second total size of the data blocks that do not belong to the target backup file, the second total size being the sum of the sizes of the third data blocks.

The third data blocks, that is, the data blocks among the multiple data blocks contained in the target backup file that do not belong to the target backup file, may be obtained according to the identifier of the backup file in the reference information: the backup server identifies the identifier of the backup file to which each data block recorded in the metadata of the target backup file belongs, and determines the data blocks corresponding to an identifier other than that of the target backup file as the third data blocks. Alternatively, the third data blocks may be obtained according to the backup serial number in the reference information: the backup server identifies the backup serial number of the backup file to which each data block recorded in the metadata of the target backup file belongs, and determines the data blocks corresponding to a backup serial number other than that of the target backup file as the third data blocks, that is, the data blocks of the first backup file that are referenced by the target backup file.
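The pre-computation of the first total size and the second total size can be sketched as follows. The dictionary layout (block identifier mapped to a pair of owning serial number and size) and the function name `total_sizes` are assumptions made for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical layout: block identifier -> (serial of the owning backup, block size)
Metadata = Dict[str, Tuple[int, int]]


def total_sizes(meta: Metadata, target_serial: int) -> Tuple[int, int]:
    """Return (first total size, second total size) for one backup file.

    The first total size covers every block listed in the metadata. The
    second total size covers the third data blocks, i.e. the blocks whose
    reference information names a backup other than the target.
    """
    first_total = sum(size for _, size in meta.values())
    second_total = sum(size for owner, size in meta.values() if owner != target_serial)
    return first_total, second_total
```

The size of the target backup file is then the difference between the two returned values.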

Taking the metadata record file of the backup file identified as backup 3 in FIG. 5 as an example, the total size, recorded in the backup chain reference relationship record file, of the data blocks of the backup file with backup serial number 1 that are referenced by the backup file with backup serial number 3, that is, R(3,1), is the size of data block 6 (Block6), namely size6.

Data block 6 (Block6) contained in the backup file with backup serial number 2 references a data block contained in the backup file with backup serial number 1. The total size, recorded in the backup chain reference relationship record file, of the data blocks of the backup file with backup serial number 2 that are referenced by the backup file with backup serial number 3, that is, R(3,2), is the sum of the sizes of data block 1, data block 2, data block 4, and data block 6, namely the sum of size1', size2', size4', and size6.

For the case in which no backup file has been deleted, suppose backup file b is generated before backup file a. To determine the total size of the data blocks of backup file b that are referenced by backup file a, the backup server may: identify the backup serial number of the backup file to which each data block contained in backup file a belongs, as recorded in the metadata of backup file a; identify the backup serial number of backup file b, as recorded in the metadata of backup file b; among the backup serial numbers of the backup files to which the data blocks contained in backup file a belong, identify the specific backup serial numbers that are less than or equal to the backup serial number of backup file b; determine the data blocks corresponding to the specific backup serial numbers as the specific data blocks, that is, the data blocks of backup file b that are referenced by backup file a; determine the size of each specific data block according to the size of each data block contained in backup file a; and determine the sum of the sizes of the specific data blocks as the total size of the data blocks of backup file b that are referenced by backup file a.
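The rule for R(a,b) described above can be sketched as follows. The function name `referenced_total` and the dictionary layout are hypothetical, and the sketch covers only the case in which no backup file has been deleted.

```python
from typing import Dict, Tuple

# Hypothetical layout: block identifier -> (serial of the owning backup, block size)
Metadata = Dict[str, Tuple[int, int]]


def referenced_total(meta_a: Metadata, serial_b: int) -> int:
    """Compute R(a, b) from the metadata of backup file a.

    A block of backup file a counts toward R(a, b) when the serial number
    of the backup it belongs to is less than or equal to the serial of b.
    """
    return sum(size for owner, size in meta_a.values() if owner <= serial_b)
```

For instance, with made-up sizes and only the blocks named in the text for backup 3 (Block1, Block2, and Block4 belonging to backup 2, Block6 to backup 1, Block5 to backup 3), `referenced_total(meta3, 1)` returns the size of Block6 and `referenced_total(meta3, 2)` returns the sum of the sizes of Block1, Block2, Block4, and Block6.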

When the size of the target backup file is queried, it can be calculated based on the information contained in the backup chain reference relationship record file, as the total size (Totasize) of the multiple data blocks contained in the target backup file minus the size of the data blocks that do not belong to the target backup file. The size of the data blocks that do not belong to the target backup file is the size of the data blocks of the first backup file that are referenced by the target backup file. Assume that the total size of the multiple data blocks contained in the target backup file is Ta and that the size of the data blocks of the first backup file that are referenced by the target backup file is R(a,p). The size of the target backup file is then Ta - R(a,p), where a is the target backup serial number of the target backup file and p is the first backup serial number of the first backup file.

Assume that the target backup file whose size is to be calculated is the backup file identified as backup 4. If no backup file has been deleted, the backup chain reference relationship record file can be queried according to the backup serial number of each backup file to find that the first backup file is the backup file identified as backup 3. The total size of the multiple data blocks contained in the target backup file is T4, and the total size of the data blocks, among the multiple data blocks contained in the target backup file, that do not belong to the target backup file is R(4,3), so the size of the target backup file is T4 - R(4,3).

If the backup file identified as backup 3 has been deleted, the backup chain reference relationship record file no longer contains the information for backup 3. The backup chain reference relationship record file can be queried to find that the first backup file is the backup file identified as backup 2. The total size of the multiple data blocks contained in the target backup file is T4, and the total size of the data blocks, among the multiple data blocks contained in the target backup file, that do not belong to the target backup file is R(4,2), so the size of the target backup file is T4 - R(4,2).
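The query against the backup chain reference relationship record file can be sketched as follows. The `ChainRecord` type and the function name are hypothetical, the numeric values used in the usage note are made up, and the sketch assumes that the row of a deleted backup file is removed from the record file, as described for backup 3 above.

```python
from typing import Dict, NamedTuple


class ChainRecord(NamedTuple):
    total_size: int            # Totasize: total size of all blocks in the backup
    reference: Dict[int, int]  # Reference: earlier serial p -> R(a, p)


def backup_size_from_chain(chain: Dict[int, ChainRecord], target_serial: int) -> int:
    """Return Ta - R(a, p), where p is the nearest earlier serial still in the chain.

    Assumption: `chain` is keyed by backup serial number and holds only
    backups that have not been deleted.
    """
    record = chain[target_serial]
    earlier = [s for s in chain if s < target_serial]
    if not earlier:
        # No earlier backup remains, so every block counts toward the target.
        return record.total_size
    return record.total_size - record.reference[max(earlier)]
```

With T4 = 170, R(4,3) = 140, and R(4,2) = 90, the function returns 30 while backup 3 is present and 80 once the row for backup 3 has been removed, mirroring T4 - R(4,3) and T4 - R(4,2).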

FIG. 6 is a flowchart of calculating the size of a backup file. The user sends a query request to the backup server to query the size of the target backup file. The backup server calculates the size of the target backup file according to the total size of the data blocks contained in the target backup file and the total size of the data blocks of the first backup file that are referenced by the target backup file, both recorded in advance in the backup chain reference relationship record file, and feeds the size of the target backup file back to the user.

Whether the size of the target backup file is determined by adding up the sizes of the data blocks that belong to the target backup file, or by subtracting the total size of the data blocks that do not belong to the target backup file from the total size of the multiple data blocks contained in the target backup file, the reference information and size of each data block used by the backup server in the calculation need to be recorded in the metadata in advance. In another embodiment of the present application, the method further includes:

Recording the reference information and size of each data block contained in the target backup file in the metadata of the target backup file, and recording the reference information of each data block contained in the first backup file in the metadata of the first backup file.

Generally, when a backup is performed, a backup file is generated, and for the backup file, reference information of each data block included in the backup file is recorded in a metadata block of the backup file.

In this embodiment of the present application, in order to prevent data loss caused by a service failure, the tenant may send a backup request to the backup server. Of course, another system may also send a backup request to the backup server when the backup period of a data block group arrives. The backup server backs up the data blocks in the target data block group according to the identifier of the target data block group, generates a target backup file, and records in the metadata of the target backup file the identifier of the target backup file, the backup serial number of the target backup file, the reference information of each data block contained in the target backup file, and the size of each data block.

FIG. 7 shows the process by which the backup server performs a backup and records the identifier of the backup file, the backup serial number, the reference information of the data blocks, and the sizes of the data blocks:

When the backup server receives the backup request sent by the user, it creates a backup task and generates an identifier for the backup file, referred to as the target identifier. The backup server creates a new metadata record file based on the target identifier and determines the target backup serial number from the largest backup serial number currently stored in the database of the backup server; generally, the largest backup serial number plus 1 is used as the target backup serial number. The target identifier and the target backup serial number are stored in the newly created metadata record file.

The backup server cyclically reads data from the production storage server, obtains each data block in the target data block group that is to be backed up, and writes the backup data block to the backup storage server to back up the data block. It identifies the size of each backed-up data block, records that size in the newly created metadata record file, and records, for each backed-up data block, the target identifier as the identifier of the backup file to which the data block belongs;

The backup server identifies each data block in the target data block group that is not backed up in this backup and, for these data blocks, identifies the identifier of the backup file corresponding to the largest backup serial number, referred to as the other identifier, and finds the corresponding metadata record file according to the other identifier;

In the metadata record file corresponding to the other identifier, the backup server identifies the size of each data block that is not backed up in this backup and the identifier, that is, the backup serial number, of the backup file to which each such data block belongs, and records them in the newly created metadata record file.

Both the backed-up data blocks and the data blocks that are not backed up are data blocks contained in the backup file with the target identifier, and the backed-up data blocks belong to the backup file with the target identifier. Once the above process is finished, the backup task can be considered complete, and the user can be notified that the backup is completed.
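The recording steps above can be sketched as follows. This is an illustration under assumptions, not the procedure of FIG. 7 itself: the function name `record_backup` and its parameters are hypothetical, the actual reading and writing of block data is omitted, and reference information is again reduced to the serial number of the owning backup.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: block identifier -> (serial of the owning backup, block size)
Metadata = Dict[str, Tuple[int, int]]


def record_backup(
    block_ids: List[str],
    backed_up: Dict[str, int],
    previous_meta: Metadata,
    largest_serial: int,
) -> Tuple[int, Metadata]:
    """Build the metadata record for a new backup.

    block_ids      -- every block in the target data block group
    backed_up      -- blocks written by this backup, mapped to their sizes
    previous_meta  -- metadata of the backup with the largest serial so far
    largest_serial -- largest backup serial number currently stored
    """
    target_serial = largest_serial + 1
    new_meta: Metadata = {}
    for block_id in block_ids:
        if block_id in backed_up:
            # Backed-up block: it belongs to the backup being created.
            new_meta[block_id] = (target_serial, backed_up[block_id])
        else:
            # Block not backed up: copy owner and size from the previous record.
            new_meta[block_id] = previous_meta[block_id]
    return target_serial, new_meta
```

For a first (full) backup, every block appears in `backed_up`, `previous_meta` can be empty, and `largest_serial` can be 0 so that the new backup receives serial number 1.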

However, in order to calculate the size of the backup file quickly, the backup server may also calculate, based on the backup serial number, the size of each data block, and the identifier (that is, the backup serial number) of the backup file to which each data block belongs, all recorded in the newly created metadata record file, the total size of the data blocks contained in the backup file generated by this backup and, for each backup file generated before it, the total size of the data blocks of that earlier backup file that are referenced by the backup file generated by this backup. These totals are then recorded in the backup chain reference relationship record file. Once this is done, the backup task is complete.

In another embodiment of the present application, if the target backup file whose size is to be calculated is the backup file generated by the first backup, the sizes of all data blocks contained in the target backup file may be added up directly to obtain the size of the target backup file.

When receiving the query request, the backup server may determine whether the target backup file whose size is to be calculated is the backup file generated by the first backup. Specifically, the backup server may determine, according to the backup serial number of each backup file saved in advance, whether the backup serial number of the target backup file is the smallest backup serial number in the backup chain in which the target backup file is located. If so, the target backup file is determined to be the backup file generated during the first backup; otherwise, it is determined not to be the backup file generated during the first backup.
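The check and the direct summation can be sketched as follows. Both function names and the dictionary layout are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layout: block identifier -> (serial of the owning backup, block size)
Metadata = Dict[str, Tuple[int, int]]


def is_first_backup(chain_serials: Iterable[int], target_serial: int) -> bool:
    """True when the target holds the smallest serial number in its backup chain."""
    return target_serial == min(chain_serials)


def first_backup_size(meta: Metadata) -> int:
    """For a first backup, every listed block counts, so sum all sizes."""
    return sum(size for _, size in meta.values())
```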

The identifier and backup serial number of each backup file, and the identifier and backup serial number of the backup file to which each data block contained in the backup file belongs, may be stored in the local database of the backup server shown in FIG. 3.

The storage space rented by a tenant on the cloud resource is limited. When the rented storage space is insufficient, the tenant can delete some backup files by sending a delete request to the backup server. In the prior art, the size of the data deleted from a backup file can be calculated only after the backup server has deleted that data. The entire deletion process takes a long time, so the size of the deleted data cannot be fed back to the tenant in time. Once a delete request has been sent, the tenant should no longer be billed for the data that can be deleted from the backup file; it is unreasonable to keep charging, throughout a long deletion process, for deletable data in a backup file that the tenant has asked to delete.

For this reason, one embodiment of the present application provides a method for calculating the size of the data that can be deleted from a backup file, which obtains that size quickly. The method includes:

First, the backup server determines a target backup file. The target backup file contains multiple data blocks and is a backup file of a source file. The source file also has a first backup file and a third backup file. The first backup file is the backup file whose creation time is before that of the target backup file and closest to it, and the third backup file is the backup file whose creation time is after that of the target backup file and closest to it. Neither the first backup file nor the third backup file has been deleted.

Secondly, according to the reference information of each data block contained in the target backup file recorded in the metadata of the target backup file, a first data block belonging to the target backup file is obtained.

The process by which the backup server determines the first data block belonging to the target backup file has been described in the above embodiment for calculating the size of the backup file, and will not be described here.

Next, the backup server determines the first backup file and, according to the reference information of each data block contained in the target backup file recorded in the metadata of the target backup file, and the reference information of each data block contained in the first backup file recorded in the metadata of the first backup file, obtains the second data blocks belonging to the target backup file, where the reference information of each data block indicates the backup file to which the data block belongs.

The process of determining the first backup file by the backup server has been described in the above embodiment for calculating the size of the backup file, and will not be described here.

When determining the second data blocks belonging to the target backup file, the backup server may obtain the data blocks belonging to a second backup file according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file, where the creation time of the second backup file is before that of the target backup file and after that of the first backup file, and the second backup file has been deleted. The data blocks belonging to the second backup file are then determined as the second data blocks belonging to the target backup file. The specific process has been described in the above embodiment for calculating the size of the backup file, and will not be described here.

Then, the backup server determines the third backup file and, according to the reference information of each data block contained in the third backup file recorded in the metadata of the third backup file, obtains the third data blocks, which are the data blocks among the first data blocks and the second data blocks belonging to the target backup file that are referenced by the third backup file;

Then, a fourth data block that can be deleted is determined according to the first data block, the second data block, and the third data block, specifically, the fourth data that can be deleted in determining the target backup file In the block, the data block other than the third data block among the first data block and the second data block may be determined as a fourth data block that can be deleted.

Finally, the size of the data that can be deleted in the target backup file is determined according to the size of each fourth data block recorded in the metadata of the target backup file. Specifically, the sum of the sizes of the fourth data blocks is determined as the size of data that can be deleted in the target backup file.
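The set arithmetic in the steps above can be illustrated with a minimal sketch. The `Block` type, its field names, and the function name are hypothetical and serve only as an illustration; the sketch assumes that the first, second, and third data blocks have already been determined as described.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Block:
    block_id: str  # hypothetical identifier of a data block
    size: int      # size recorded in the target backup file's metadata


def deletable_size(first: Iterable[Block],
                   second: Iterable[Block],
                   third: Iterable[Block]) -> int:
    """Sum the sizes of the fourth data blocks.

    Fourth data blocks are the first and second data blocks that are
    not third data blocks (blocks still referenced by the third backup).
    """
    referenced = {b.block_id for b in third}
    candidates = {b.block_id: b for b in [*first, *second]}
    return sum(b.size for b in candidates.values()
               if b.block_id not in referenced)
```

For example, with first data blocks of sizes 10 and 20, a second data block of size 5, and the 20-sized block still referenced by the third backup file, the function returns 15.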

In the embodiment of the present application, when deleting a backup file, the tenant may send a delete request to the backup server to delete data that can be deleted from the backup file. The delete request includes an identifier of the backup file to be deleted, and the backup file to be deleted is referred to as a target backup file. After receiving the delete request, the backup server can determine the target backup file according to the identifier contained in the delete request.

The process of the backup server determining the third backup file may be as follows:

The backup server determines the backup serial number of the target backup file according to the recorded backup serial number of each backup file; this serial number is referred to as the target backup serial number. The backup server then determines a third backup serial number, which is the recorded serial number that is adjacent to and greater than the target backup serial number.

The third backup file corresponding to the third backup serial number is then determined according to the backup serial number of each backup file.
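The serial number lookup can be sketched as follows. The function name is hypothetical, and the sketch assumes that `serials` holds only the serial numbers of backup files that have not been deleted, so that "adjacent" means the closest remaining serial number. The same lookup in the other direction yields the serial number of the first backup file.

```python
from typing import Iterable, Optional, Tuple


def neighbours(serials: Iterable[int],
               target: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (first backup serial, third backup serial) around target.

    Either element is None when no such backup remains in the chain.
    """
    serials = list(serials)
    earlier = [s for s in serials if s < target]
    later = [s for s in serials if s > target]
    previous_serial = max(earlier) if earlier else None
    next_serial = min(later) if later else None
    return previous_serial, next_serial
```

With remaining serial numbers 1, 2, 4, and 6 and a target serial number of 4, the sketch returns 2 for the first backup file and 6 for the third backup file.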

When the backup server obtains, according to the reference information of each data block contained in the third backup file, as recorded in the metadata of the third backup file, the third data block that is referenced by the third backup file among the first data block and the second data block belonging to the target backup file, it may proceed as follows. For each first data block and each second data block in the target backup file, the backup server identifies the corresponding data block in the third backup file and judges whether the reference information of the data block is the same as the reference information of that corresponding data block. If the reference information is the same, the backup server determines that the data block is a third data block referenced by the third backup file, and the third data block cannot be deleted.

In another embodiment of the present application, the backup server may save in advance four values: a first total size, which is the total size of the multiple data blocks contained in the target backup file; a second total size, which is the total size of the data blocks of the first backup file that are referenced in the target backup file; a third total size, which is the total size of the data blocks of the target backup file that are referenced in the third backup file; and a fourth total size, which is the total size of the data blocks of the first backup file that are referenced in the third backup file.

The data blocks of the first backup file referenced in the target backup file can be understood as the data blocks contained in the target backup file that do not belong to the target backup file.

The data blocks of the target backup file referenced in the third backup file can be understood as the data blocks contained in the third backup file that do not belong to the third backup file.

The data blocks of the first backup file referenced in the third backup file can be understood as the data blocks contained in the third backup file that belong neither to the third backup file nor to the target backup file.

The third total size minus the fourth total size is therefore the total size of the data blocks contained in the third backup file that belong to the target backup file. These data blocks are referenced by the third backup file and will not be deleted.

When calculating the size of the data that can be deleted in the target backup file, the total size of the data blocks belonging to the target backup file may be calculated first, that is, the first difference obtained by subtracting the second total size from the first total size. Next, the total size of the data blocks belonging to the target backup file that are referenced by the third backup file is calculated, that is, the second difference obtained by subtracting the fourth total size from the third total size; these data blocks belong to the target backup file but, because they are referenced by the third backup file, are not deleted. Finally, the total size of the data blocks belonging to the target backup file other than those referenced by the third backup file is calculated, that is, the first difference minus the second difference. In other words, the first total size minus the second total size, minus the third total size, plus the fourth total size is the total size of the data that can be deleted in the target backup file.

As shown in Table 1, the backup chain reference relationship record file can record, for each backup file, information such as the total size of the data blocks contained in the backup file and the total size of the data blocks it references in each backup file generated before it.

Assume that the total size of the multiple data blocks contained in the target backup file is Ta, the total size of the data blocks of the first backup file referenced in the target backup file is R(a, p), the total size of the data blocks of the target backup file referenced in the third backup file is R(b, a), and the total size of the data blocks of the first backup file referenced in the third backup file is R(b, p). The size of the data that can be deleted in the target backup file is then Ta - R(a, p) - [R(b, a) - R(b, p)], where a is the target backup serial number of the target backup file, p is the first backup serial number of the first backup file, and b is the third backup serial number of the third backup file.

Assume that the target backup file to be deleted is the backup file identified as backup 4. If no backup file has been deleted, the backup chain reference relationship record file can be queried according to the backup serial number of each backup file to find that the first backup file is the backup file identified as backup 3 and the third backup file is the backup file identified as backup 5. The total size of the multiple data blocks contained in the target backup file is T4, and the total size of the data blocks contained in the target backup file that do not belong to the target backup file is R(4, 3). The total size of the data blocks contained in the third backup file that belong to the target backup file is R(5, 4) - R(5, 3), so the size of the data that can be deleted in the target backup file is T4 - R(4, 3) - [R(5, 4) - R(5, 3)].

If the backup file identified as backup 3 has been deleted, the backup chain reference relationship record file does not contain related information identified as backup 3. Querying the backup chain reference relationship record file then shows that the first backup file is the backup file identified as backup 2 and the third backup file is the backup file identified as backup 5. The total size of the multiple data blocks contained in the target backup file is T4, and the total size of the data blocks contained in the target backup file that do not belong to the target backup file is R(4, 2). The total size of the data blocks contained in the third backup file that belong to the target backup file is R(5, 4) - R(5, 2), so the size of the data that can be deleted in the target backup file is T4 - R(4, 2) - [R(5, 4) - R(5, 2)].

If the backup files identified as backup 3 and backup 5 have been deleted, the backup chain reference relationship record file does not contain related information identified as backup 3 or backup 5. Querying the backup chain reference relationship record file according to the backup serial number of each backup file then shows that the first backup file is the backup file identified as backup 2 and the third backup file is the backup file identified as backup 6. The total size of the multiple data blocks contained in the target backup file is T4, and the total size of the data blocks contained in the target backup file that do not belong to the target backup file is R(4, 2). The total size of the data blocks contained in the third backup file that belong to the target backup file is R(6, 4) - R(6, 2), so the size of the data that can be deleted in the target backup file is T4 - R(4, 2) - [R(6, 4) - R(6, 2)].
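The three scenarios can be reproduced with a short sketch. The dictionary layout and all numeric values below are hypothetical and chosen only for illustration; they stand in for the backup chain reference relationship record file, with `total[n]` playing the role of Tn and `ref[(n, m)]` the role of R(n, m).

```python
# Hypothetical record contents; the sizes are made-up illustration values.
total = {4: 120}
ref = {
    (4, 3): 100, (4, 2): 85,
    (5, 4): 110, (5, 3): 95, (5, 2): 82,
    (6, 4): 100, (6, 3): 90, (6, 2): 80,
}


def deletable_size(a: int, p: int, b: int) -> int:
    """Evaluate Ta - R(a, p) - [R(b, a) - R(b, p)].

    a: serial number of the target backup file
    p: serial number of the first backup file (closest earlier one)
    b: serial number of the third backup file (closest later one)
    """
    owned_by_target = total[a] - ref[(a, p)]
    still_referenced = ref[(b, a)] - ref[(b, p)]
    return owned_by_target - still_referenced


no_deletion = deletable_size(4, 3, 5)          # backups 3 and 5 present
backup_3_deleted = deletable_size(4, 2, 5)     # backup 3 deleted
backups_3_5_deleted = deletable_size(4, 2, 6)  # backups 3 and 5 deleted
```

With these illustrative numbers the three results are 5, 7, and 15. The deletable size grows as neighbouring backup files are deleted, because the target backup file takes over the data blocks of the deleted backup file 3 and fewer of its data blocks remain referenced by the later backup file.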

In another embodiment of the present application, if the target backup file to be deleted is the backup file generated by the first backup, the size of the data that can be deleted in the target backup file may be calculated directly as the total size of the multiple data blocks contained in the target backup file minus the total size of the data blocks of the target backup file that are referenced in the third backup file.

In another embodiment of the present application, if the target backup file to be deleted is the backup file generated by the last backup, the size of the data that can be deleted in the target backup file may be calculated directly as the total size of the multiple data blocks contained in the target backup file minus the total size of the data blocks of the first backup file that are referenced in the target backup file.

On receiving the deletion request, the backup server may determine whether the target backup file to be deleted is the backup file generated by the first backup or the backup file generated by the last backup. The process of determining whether a backup file was generated by the first backup is described in the above embodiment for calculating the size of the backup file, and will not be described here.

To judge whether the target backup file is the backup file generated by the last backup, the backup server determines, according to the backup serial number of each backup file saved in advance, whether the backup serial number of the target backup file to be deleted is the largest backup serial number in the backup chain to which the target backup file belongs. If so, the target backup file to be deleted is determined to be the backup file generated by the last backup; otherwise, it is determined not to be the backup file generated by the last backup.
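The first-backup and last-backup cases can be combined with the general formula in one sketch. Several points are assumptions made for illustration and are not stated in this section: the dictionary layout of the record file, the treatment of a backup file with no earlier remaining backup as the first backup, and the handling of a chain that contains only the target backup file.

```python
from typing import Dict, Iterable, Tuple


def deletable_size(total: Dict[int, int],
                   ref: Dict[Tuple[int, int], int],
                   serials: Iterable[int],
                   a: int) -> int:
    """Size of deletable data in backup a, including the edge cases.

    total[n] is Tn and ref[(n, m)] is R(n, m); serials lists the
    serial numbers of the backups that remain in the chain.
    """
    serials = list(serials)
    earlier = [s for s in serials if s < a]
    is_first = not earlier        # assumption: no earlier backup remains
    is_last = a == max(serials)   # largest serial number in the chain
    if is_first and is_last:
        return total[a]           # assumption: nothing else references a
    if is_first:
        b = min(s for s in serials if s > a)
        return total[a] - ref[(b, a)]
    p = max(earlier)
    if is_last:
        return total[a] - ref[(a, p)]
    b = min(s for s in serials if s > a)
    return total[a] - ref[(a, p)] - (ref[(b, a)] - ref[(b, p)])
```

Checking `is_last` against the largest serial number mirrors the judgment described above; the first-backup check is a stand-in for the process described in the earlier embodiment.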

FIG. 8 is a flowchart of deleting a target backup file. A user sends a delete request to the backup server to delete the target backup file. The backup server calculates the size of the data that can be deleted in the target backup file according to the following values recorded in the pre-saved backup chain reference relationship record file: the total size of the multiple data blocks contained in the target backup file, the total size of the data blocks of the first backup file referenced in the target backup file, the total size of the data blocks of the target backup file referenced in the third backup file, and the total size of the data blocks of the first backup file referenced in the third backup file. The backup server then feeds back to the user the size of the data that can be deleted in the target backup file. The backup server can also delete the data that can be deleted from the target backup file saved on the production storage server. After deleting that data, the backup server can also update the backup chain reference relationship record file. Specifically, it deletes the related information of the target backup file recorded in the backup chain reference relationship record file. The related information includes the ID of the target backup file, its backup serial number, the total size of the multiple data blocks contained in the target backup file, and the total sizes recorded for data blocks referenced between the target backup file and any other backup file.
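The update of the backup chain reference relationship record file after deletion can be sketched as follows. The dictionary layout and the function name are hypothetical; the sketch assumes that every entry mentioning the deleted backup file, in either position, is removed.

```python
from typing import Dict, Tuple


def remove_backup(total: Dict[int, int],
                  ref: Dict[Tuple[int, int], int],
                  serial: int) -> None:
    """Drop every record entry that mentions the deleted backup."""
    total.pop(serial, None)
    # Collect the keys first so the dictionary is not mutated while iterating.
    for key in [k for k in ref if serial in k]:
        del ref[key]
```

After such an update, a later deletion request finds its first and third backup files among the remaining serial numbers only, which matches the scenarios in which the record file no longer contains information identified as a deleted backup.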

Based on the above concept, as shown in FIG. 9, this application provides a device 900 for calculating the size of a backup file. The device may include a processing unit 901, a transceiver unit 902, and a storage unit 903. The device 900 for calculating the size of a backup file may be applied to a backup server or a chip in the backup server.

The storage unit 903 may be configured to store a backup file.

The processing unit 901 may be configured to determine a target backup file whose size is to be calculated, and to determine a first backup file, where the first backup file is the backup file whose creation time is before the creation time of the target backup file and closest to the creation time of the target backup file. The metadata of a backup file records the backup serial number of the backup file, the reference information of its data blocks, and the size of its data blocks; the reference information of a data block indicates the backup file to which the data block belongs. The processing unit 901 may further be configured to query the reference information of each data block contained in the target backup file from the metadata of the target backup file to obtain a first data block belonging to the target backup file;

to obtain a second data block belonging to the target backup file according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file;

and to determine the size of the target backup file according to the sizes of the first data block and the second data block recorded in the metadata of the target backup file.

Further, the processing unit 901 may be specifically configured to obtain, according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file, the data blocks belonging to a second backup file, where the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted;

and to determine the data blocks belonging to the second backup file as the second data block belonging to the target backup file.

Further, the processing unit 901 may be specifically configured to determine the sum of the sizes of the first data block and the second data block as the size of the target backup file.
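The size calculation performed by the processing unit 901 can be sketched as follows. The metadata layout is an assumption made for illustration: each backup file's metadata is modelled as a mapping from a block position to a pair of owner serial number (the reference information) and size. The rule used to recognise second data blocks, namely a block whose owner is not the target backup file and whose reference information differs from that of the block at the same position in the first backup file, is one possible reading of the comparison of the two sets of metadata, not a statement of the method of the earlier embodiment.

```python
from typing import Dict, Tuple

# position -> (serial number of the owning backup, block size)
Metadata = Dict[int, Tuple[int, int]]


def backup_size(target_serial: int,
                target_meta: Metadata,
                first_meta: Metadata) -> int:
    """Sum the sizes of the first and second data blocks of the target."""
    size = 0
    for position, (owner, block_size) in target_meta.items():
        if owner == target_serial:
            size += block_size  # first data block: owned by the target
        elif first_meta.get(position, (None, 0))[0] != owner:
            # Second data block: the owner is a deleted backup created
            # between the first backup file and the target backup file.
            size += block_size
    return size
```

For example, if backup 4 contains one block owned by backup 1, one owned by the deleted backup 3, and one of its own, each of size 10, and the first backup file is backup 2, the sketch returns 20.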

The receiving unit 902 may be configured to receive a query request, where the query request includes an identifier of the target backup file.

The storage unit 903 may be configured to record, in the metadata of the target backup file, the reference information and the size of each data block contained in the target backup file, and to record, in the metadata of the first backup file, the reference information of each data block contained in the first backup file.

It should be noted that the functions of the above units may be executed by the processor of the backup server or executed by the processor calling a program in the memory.

Based on the above concept, as shown in FIG. 10, the present application provides a device 1000 for calculating the size of deleted data in a backup file. The device may include a processing unit 1001, a transceiving unit 1002, and a storage unit 1003. The device 1000 for calculating the size of deleted data in a backup file may be applied to a backup server or a chip in a backup server.

The storage unit 1003 may be configured to store a backup file.

The processing unit 1001 may be configured to determine a target backup file, where the target backup file includes multiple data blocks, the target backup file is a backup file of a source file, and the source file also has a first backup file and a third backup file. The first backup file includes multiple data blocks, and the third backup file includes multiple data blocks. The first backup file is the backup file whose creation time is before the creation time of the target backup file and closest to the creation time of the target backup file, and the first backup file has not been deleted. The third backup file is the backup file whose creation time is after the creation time of the target backup file and closest to the creation time of the target backup file, and the third backup file has not been deleted;

to obtain the first data block belonging to the target backup file according to the reference information of each data block contained in the target backup file, as recorded in the metadata of the target backup file;

to determine the first backup file and, according to the reference information of each data block contained in the target backup file, as recorded in the metadata of the target backup file, and the reference information of each data block contained in the first backup file, as recorded in the metadata of the first backup file, to obtain a second data block belonging to the target backup file, where the reference information of each data block indicates the backup file to which the data block belongs;

to determine a third backup file and, according to the reference information of each data block contained in the third backup file, as recorded in the metadata of the third backup file, to obtain the third data block that is referenced by the third backup file among the first data block and the second data block belonging to the target backup file;

and to determine a fourth data block that can be deleted according to the first data block, the second data block, and the third data block, and to determine the size of the data that can be deleted in the target backup file according to the size of each fourth data block recorded in the metadata of the target backup file.

The processing unit 1001 may be specifically configured to determine the sum of the sizes of the fourth data blocks as the size of the data that can be deleted in the target backup file.

The processing unit 1001 may be specifically configured to determine a data block other than the third data block among the first data block and the second data block as a fourth data block that can be deleted.

The processing unit 1001 may be specifically configured to obtain, according to the reference information of the data blocks recorded in the metadata of the target backup file and the reference information of the data blocks recorded in the metadata of the first backup file, the data blocks belonging to a second backup file, where the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted; and to determine the data blocks belonging to the second backup file as the second data block belonging to the target backup file.

The receiving unit 1002 may be configured to receive a deletion request, where the deletion request includes an identifier of the target backup file.

The storage unit 1003 may be configured to record, in the metadata of each of the target backup file, the second backup file, and the first backup file, the backup serial number of that backup file, the reference information of each data block it contains, and the size of each data block.

Based on the above concept, as shown in FIG. 11, the present application further provides a device 1100 for calculating the size of a backup file. The device 1100 for calculating the size of a backup file can be applied to a backup server or a chip in the backup server.

The device 1100 for calculating the size of a backup file may include a processor 1101 and a memory 1102.

The memory 1102 is configured to store computer instructions;

The processor 1101 is configured to execute the computer instructions stored in the memory 1102, so that the device for calculating the size of a backup file implements any one of the foregoing methods for calculating the size of a backup file.

For the introduction of the processor 1101 and the memory 1102, refer to the description of the process shown in the above calculation of the size of the backup file, which will not be repeated here.

Based on the above concept, as shown in FIG. 12, the present application further provides a device 1200 for calculating the size of deleted data in a backup file. The device 1200 for calculating the size of deleted data in a backup file may be applied to a backup server or a chip in the backup server.

The device 1200 for calculating the size of the deleted data in the backup file may include a processor 1201 and a memory 1202.

The memory 1202 is configured to store computer instructions;

The processor 1201 is configured to execute the computer instructions stored in the memory 1202, so that the device for calculating the size of deleted data in a backup file implements any one of the foregoing methods for calculating the size of deleted data in a backup file.

For the introduction of the processor 1201 and the memory 1202, refer to the description of the process shown in the above calculation of the size of the backup file, which is not repeated here.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

Although the preferred embodiments of the present application have been described, those skilled in the art can make other changes and modifications to these embodiments once they know the basic inventive concepts. Therefore, the following claims are intended to be construed to include the preferred embodiments and all changes and modifications that fall within the scope of this application.

Obviously, those skilled in the art can make various modifications and variations to the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. In this way, if these modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalent technologies, the present application also intends to include these changes and variations.

Claims

A method for calculating the size of a backup file, comprising:

The backup server determines a target backup file, wherein the target backup file includes multiple data blocks, the target backup file is a backup file of a source file, the source file also has a first backup file, the first backup file includes multiple data blocks, the first backup file is the backup file whose creation time is before the creation time of the target backup file and closest to the creation time of the target backup file, and the first backup file has not been deleted;

Querying the reference information of each data block contained in the target backup file from the metadata of the target backup file to obtain a first data block belonging to the target backup file, wherein the reference information of each data block indicates the backup file to which the data block belongs;

Obtaining the second data block belonging to the target backup file according to the reference information of the data block recorded in the metadata of the target backup file and the reference information of the data block recorded in the metadata of the first backup file;

Determining the size of the target backup file according to the size of the first data block and the second data block recorded in the metadata of the target backup file.
The method according to claim 1, wherein the obtaining of the second data block belonging to the target backup file according to the reference information of the data block recorded in the metadata of the target backup file and the reference information of the data block recorded in the metadata of the first backup file includes:

Obtaining the data block belonging to a second backup file according to the reference information of the data block recorded in the metadata of the target backup file and the reference information of the data block recorded in the metadata of the first backup file, wherein the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted;

And determining the data block attributed to the second backup file as the second data block attributed to the target backup file.
The method according to claim 1, wherein the determining of the size of the target backup file according to the size of the first data block and the second data block recorded in the metadata of the target backup file includes:

The sum of the sizes of the first data block and the second data block is determined as the size of the target backup file.
The method according to claim 1, further comprising:

Record the reference information of each data block contained in the target backup file and the size of each data block in the metadata of the target backup file;

Record the reference information of each data block contained in the first backup file in the metadata of the first backup file.
The method according to claim 1, wherein before the backup server determines a target backup file, the method further comprises:

A query request is received, and the query request includes an identifier of the target backup file.
A device for calculating the size of a backup file, comprising a processing unit and a storage unit;

The storage unit is configured to store a backup file;

The processing unit is configured to determine a target backup file, wherein the target backup file includes multiple data blocks, the target backup file is a backup file of a source file, the source file also has a first backup file, the first backup file includes multiple data blocks, the first backup file is the backup file whose creation time is before the creation time of the target backup file and closest to the creation time of the target backup file, and the first backup file has not been deleted;

Querying the reference information of each data block contained in the target backup file from the metadata of the target backup file to obtain a first data block belonging to the target backup file, wherein the reference information of each data block indicates the backup file to which the data block belongs;

Obtaining the second data block belonging to the target backup file according to the reference information of the data block recorded in the metadata of the target backup file and the reference information of the data block recorded in the metadata of the first backup file;

Determining the size of the target backup file according to the size of the first data block and the second data block recorded in the metadata of the target backup file.
The device according to claim 6, wherein the processing unit is specifically configured to obtain the data block belonging to a second backup file according to the reference information of the data block recorded in the metadata of the target backup file and the reference information of the data block recorded in the metadata of the first backup file, wherein the creation time of the second backup file is before the creation time of the target backup file and after the creation time of the first backup file, and the second backup file has been deleted;

And determining the data block attributed to the second backup file as the second data block attributed to the target backup file.
The device according to claim 6, wherein the processing unit is specifically configured to determine a sum of sizes of the first data block and the second data block as the size of the target backup file.
The device according to claim 6, wherein the storage unit is further configured to record, in the metadata of the target backup file, the reference information of each data block contained in the target backup file and the size of each data block;

And to record the reference information of each data block contained in the first backup file in the metadata of the first backup file.
The device according to claim 6, further comprising:

The receiving unit is configured to receive a query request, where the query request includes an identifier of the target backup file.
A device for calculating the size of a backup file, comprising a processor and a memory;

The memory is used to store computer instructions;

The processor is configured to execute computer instructions stored in the memory, so that the apparatus for calculating the size of a backup file implements the method according to any one of claims 1 to 5.
A computer-readable storage medium, wherein the storage medium stores computer instructions, and when the computer instructions are executed by a computer, the computer is caused to execute the method according to any one of claims 1 to 5.
A computer program product, wherein the computer program product includes computer instructions, and when the computer instructions are executed by a computer, the computer is caused to execute the method according to any one of claims 1 to 5.