WO2019127034A1 - Expired backup processing method and backup server - Google Patents
- Publication number
- WO2019127034A1 (PCT/CN2017/118689)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- backup
- deletion
- storage system
- disk
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1453—Management of the data involved in backup or backup restore using de-duplication of the data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1469—Backup restoration techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/80—Database-specific techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/84—Using snapshots, i.e. a logical point-in-time copy of the data
Definitions
- The embodiments of the present invention relate to the field of storage, and in particular to an expired backup processing method and a backup server.
- Disk data can be backed up to an object storage system through a backup server using the object storage mode.
- Specifically, the object storage system saves the consecutive data fragments that constitute the disk data as at least one object, each object including at least one contiguous data fragment. If the backed-up disk data is spread across multiple objects, the backup server must access the object storage system multiple times when it later needs the backup data. To avoid this, the backup server sends at least two objects and one large object identifier to the object storage system, and the object storage system saves the at least two objects into one large object, which reduces the number of accesses to the object storage system because the large object can be accessed instead.
- The embodiments of the present invention provide an expired backup processing method and a backup server, which address how to process expired backups of disk data that is incrementally backed up using the large-object storage mode.
- An embodiment of the present invention provides an expired backup processing method, where the method is performed by a backup server.
- The method includes: after the backup server determines an expired backup of the first disk data, it creates a deletion log for the first disk data and saves the pointers of the invalid data in the expired backup to the deletion log.
- The expired backup is the earliest of all unexpired backups of the first disk data held in the object storage system at the current first time. When the backup server subsequently detects that the deletion condition is met, it obtains the multiple deletion logs corresponding to the first disk data.
- The deletion logs are obtained in order to determine a first target large object, saved in the object storage system, that includes both valid data and invalid data. After determining the first target large object, the backup server sends a data migration indication and an object deletion indication to the object storage system. The data migration indication instructs the object storage system to migrate the valid data in the first target large object to another large object; the object deletion indication instructs the object storage system to delete the first target large object.
- Because a deletion log is created and the pointers of the invalid data in the expired backup are saved to it, the backup server can later use the deletion logs to determine a large object that includes both invalid data and valid data. In this way, expired backups are processed for disk data that is incrementally backed up using the large-object storage mode.
- the backup server determines the first target large object including the invalid data and the valid data
- The data migration indication and the object deletion indication may be carried in one instruction or sent separately in two instructions. If both indications are carried in one instruction, the object storage system, after receiving it, first migrates the valid data in the first target large object to another large object and then deletes the first target large object. If the two indications are sent separately, the backup server first sends the instruction carrying the data migration indication and then sends the instruction carrying the object deletion indication.
- Before sending the instruction carrying the object deletion indication, the backup server may first determine whether the object storage system has completed the migration of the valid data. If the migration is confirmed complete, the backup server sends the instruction carrying the object deletion indication to instruct the object storage system to delete the first target large object. The backup server may determine whether the migration is complete according to whether it has received a migration completion message returned by the object storage system.
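The ordering described above (migrate the valid data, confirm completion, then delete) can be sketched as follows. This is a minimal sketch, not the patent's implementation: the in-memory `ObjectStore` class, its `migrate` and `delete` methods, the slot-numbered layout of a large object, and the boolean return value standing in for the migration completion message are all hypothetical.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for the object storage system."""

    def __init__(self):
        # large object identifier -> {slot number: data}
        self.large_objects = {}

    def migrate(self, source_id, valid_slots, dest_id):
        """Copy the listed slots into dest_id and report completion."""
        source = self.large_objects[source_id]
        dest = self.large_objects.setdefault(dest_id, {})
        for slot in valid_slots:
            # Append at the next free slot (assumes slots are numbered 0..n-1).
            dest[len(dest)] = source[slot]
        return True  # plays the role of the migration completion message

    def delete(self, object_id):
        del self.large_objects[object_id]


def process_target_large_object(store, target_id, valid_slots, dest_id):
    """Migrate valid data first; delete the target only after confirmation."""
    if not store.migrate(target_id, valid_slots, dest_id):
        return False  # migration not confirmed, so the target is kept
    store.delete(target_id)
    return True


store = ObjectStore()
store.large_objects["big-1"] = {0: b"valid", 1: b"stale"}
process_target_large_object(store, "big-1", [0], "big-2")
```

Keeping the delete behind the completion check means a failed or unconfirmed migration never loses valid data.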
- Detecting whether the deletion condition is met includes:
- starting a timer after the deletion condition was last met and detecting whether the timer has expired; if it has expired, the deletion condition is met.
- The method further includes: creating a post-move pointer for the valid data, creating a movement log for the first disk data, and saving the correspondence between the pre-move pointer and the post-move pointer of the valid data to the movement log.
- The post-move pointer indicates the position of the valid data in the other large object after the valid data has been moved there.
- The method further includes: receiving a data recovery request sent by the client, where the data recovery request includes the first disk identifier, the second disk identifier, and the backup identifier of the backup to be restored. The data recovery request indicates that the backup data corresponding to that backup identifier is to be obtained and the first disk data is to be restored to the second disk, where the backup to be restored is any of the unexpired backups of the first disk data. The backup server then obtains the backup metadata of the backup to be restored and acquires all the movement logs of the first disk data.
- In a fourth implementation manner, after the correspondence between the pre-move pointer and the post-move pointer of the valid data is saved to the movement log, the method further includes: creating an object identifier for the movement log; saving the correspondence between the first disk identifier and the object identifier of the movement log; and sending a movement log storage request to the object storage system, where the request includes the object identifier of the movement log and the movement log, and instructs the object storage system to save the movement log to the object corresponding to that object identifier. Acquiring all the movement logs of the first disk includes: obtaining the object identifier of the movement log according to the first disk identifier, and sending a movement log obtaining request to the object storage system, where the request includes the object identifier of the movement log and instructs the object storage system to obtain the movement log from the object corresponding to the object identifier of
- Determining, according to the multiple deletion logs, the first target large object saved in the object storage system that includes the valid data and the invalid data includes: determining, according to the plurality of deletion logs, a first target large object including the invalid data saved in the object storage system; determining, according to the predefined size of an invalid data item and the number of invalid data items included in the first target large object, the data amount of all invalid data in the first target large object; sending a data amount determination request to the object storage system, where the request includes an identifier of the first target large object and instructs the object storage system to send the data amount of the first target large object; receiving data amount attribute information, which includes the data amount of the first target large object; if the amount of data of all invalid data in the first target large object is greater than the data Attribute information, the data of the first small
- an expired backup processing method is provided, where the method is performed by a backup server, including:
- a deletion log of the first disk data is created, and a pointer of the invalid data in the expired backup is saved to the deletion log, and the expired backup is the current one.
- Detecting whether the deletion condition is met includes:
- starting a timer after the deletion condition was last met and detecting whether the timer has expired; if it has expired, the deletion condition is met.
- Determining, according to the multiple deletion logs, the large object saved in the object storage system that includes only invalid data includes: determining, according to the plurality of deletion logs, a large object including invalid data; determining, according to the predefined size of an invalid data item and the number of invalid data items included in that large object, the data amount of all invalid data in the large object; sending a data amount determination request to the object storage system, where the request includes an identifier of the large object including invalid data and instructs the object storage system to send the data amount of that large object; receiving data amount attribute information, which includes the data amount of the large object including invalid data; if the data amount of all invalid data in the large object is the same as the data amount of the large object in the data amount attribute information, it is determined that the large object saved in the object storage system is a
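The data-amount comparison described above can be sketched as follows. This is an illustrative sketch under stated assumptions: a pointer is modeled as a hypothetical `(large_object_id, slot)` tuple, `INVALID_ITEM_BYTES` is a made-up predefined size, and `object_sizes` stands in for the data amount attribute information returned by the object storage system. Any large object whose invalid data amount does not equal its total data amount is treated here as holding a mix of valid and invalid data.

```python
from collections import Counter

INVALID_ITEM_BYTES = 4096  # hypothetical predefined size of one invalid data item


def classify_large_objects(deletion_logs, object_sizes, item_bytes=INVALID_ITEM_BYTES):
    """Label each large object named in the deletion logs.

    deletion_logs: list of logs, each a list of (large_object_id, slot) pointers.
    object_sizes:  large_object_id -> total data amount reported by the store.
    """
    # De-duplicate pointers in case the same one appears in several logs.
    pointers = {pointer for log in deletion_logs for pointer in log}
    invalid_counts = Counter(object_id for object_id, _slot in pointers)

    labels = {}
    for object_id, count in invalid_counts.items():
        invalid_bytes = count * item_bytes
        if invalid_bytes == object_sizes[object_id]:
            labels[object_id] = "only_invalid"  # can be deleted directly
        else:
            labels[object_id] = "mixed"  # valid data must be migrated first
    return labels


logs = [[("A", 0), ("A", 1)], [("B", 0)]]
sizes = {"A": 8192, "B": 8192}
labels = classify_large_objects(logs, sizes)
```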
- an embodiment of the present invention provides a data backup method, the method comprising: the backup server backing up the first disk data before processing the expired backup.
- The process of backing up data fragments in the first disk data is as follows: the backup server receives multiple data fragments of the first disk data to be backed up to the object storage system and, according to the size of a data fragment and the predefined data block size, determines a data block to be backed up to the object storage system, the data block including at least one data fragment of the first disk data.
- The backup server calculates the weak hash value of the data block. It then determines the identifier of another data block that has already been saved in the object storage system and whose weak hash value is similar to the weak hash value of the data block.
- the size does not reach a predefined size, and a data backup request is sent to the object storage system.
- The data backup request includes an identifier of the data block and of the large object in which the other data block is located, and instructs the object storage system to save the data block to the large object in which the other data block is located.
- After receiving the data backup request, the object storage system saves the data block to the large object in which the other data block is located.
- In this way, the embodiment of the present invention stores data blocks with similar weak hash values in the same large object when backing up data.
- The data fragment in the other data block may be a data fragment of the first disk data or a data fragment of other disk data.
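The placement rule described above can be illustrated with a short sketch. The source does not say which weak hash is used or how similarity is measured, so everything specific here is an assumption: Adler-32 (`zlib.adler32`) serves only as an example of a weak hash, similarity is defined as a small Hamming distance between hash values, the size limit and threshold are made up, and the size check is assumed to apply to the large object that holds the similar block.

```python
import zlib

MAX_LARGE_OBJECT_BYTES = 64  # hypothetical predefined size limit
MAX_HAMMING_DISTANCE = 4     # hypothetical similarity threshold


def weak_hash(block: bytes) -> int:
    # Adler-32 is used only as an example of a cheap, weak hash.
    return zlib.adler32(block)


def is_similar(hash_a: int, hash_b: int) -> bool:
    # Assumed similarity rule: few differing bits between the two hashes.
    return bin(hash_a ^ hash_b).count("1") <= MAX_HAMMING_DISTANCE


def place_block(block, index, sizes, new_object_id):
    """Choose the large object for a block.

    index: weak hash -> large object id of an already saved block.
    sizes: large object id -> bytes currently stored in it.
    """
    block_hash = weak_hash(block)
    for other_hash, object_id in index.items():
        if is_similar(block_hash, other_hash) and sizes[object_id] < MAX_LARGE_OBJECT_BYTES:
            target = object_id  # join the large object of the similar block
            break
    else:
        target = new_object_id  # no similar block with room: start a new one
        sizes.setdefault(target, 0)
    sizes[target] += len(block)
    index[block_hash] = target
    return target


index, sizes = {}, {}
first = place_block(b"aaaa", index, sizes, "big-1")
second = place_block(b"aaaa", index, sizes, "big-2")  # identical hash, so it joins "big-1"
```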
- Before processing the expired backup, the backup server backs up the data of the first disk by using the method provided by the third aspect.
- A fourth aspect provides a backup server, including modules for performing the expired backup processing method in the first aspect or any possible implementation manner of the first aspect, where the modules may be implemented by hardware or by hardware executing corresponding software.
- A fifth aspect provides a backup server, including modules for performing the expired backup processing method in the second aspect or any possible implementation manner of the second aspect, where the modules may be implemented by hardware or by hardware executing corresponding software.
- A backup server is provided, which includes modules for performing the expired backup processing method in the implementation manner provided by the third aspect, where the modules may be implemented by hardware or by hardware executing corresponding software.
- A backup server is provided, including an interface, a memory, and a processor. The interface is used to communicate with an object storage system, the memory is used to store a software program, and the processor, by running the software program stored in the memory, performs the expired backup processing method in the first aspect or any possible implementation manner of the first aspect.
- A backup server is provided, including an interface, a memory, and a processor. The interface is used to communicate with an object storage system, the memory is used to store a software program, and the processor, by running the software program stored in the memory, performs the expired backup processing method in the second aspect or any possible implementation manner of the second aspect.
- A computer readable storage medium is provided that stores instructions which, when run on a computer, cause the computer to perform the expired backup processing method in the first aspect or any possible implementation manner of the first aspect.
- A computer readable storage medium is provided that stores instructions which, when run on a computer, cause the computer to perform the expired backup processing method in the second aspect or any possible implementation manner of the second aspect.
- FIG. 1 is a schematic structural diagram of a backup system according to an embodiment of the present invention.
- FIG. 2 is a flowchart of a method for full backup of virtual machine disk data according to an embodiment of the present invention
- FIG. 3 is a flowchart of a method for incrementally backing up virtual machine disk data according to an embodiment of the present invention
- FIG. 4 is a flowchart of a method for processing an expired backup according to an embodiment of the present invention.
- FIG. 5 is a flowchart of a method for deleting backup data according to an accumulated deletion log by a backup server according to an embodiment of the present invention
- FIG. 6 is a flowchart of a method for a backup server to indicate that an object storage system saves valid data by using a multi-segment replication technology according to an embodiment of the present invention
- FIG. 7 is a flowchart of a method for restoring virtual machine disk data according to an embodiment of the present invention.
- FIG. 8 is a structural diagram of a backup server according to an embodiment of the present invention.
- FIG. 9 is a structural diagram of another backup server according to an embodiment of the present invention.
- FIG. 10 is a structural diagram of another backup server according to an embodiment of the present invention.
- FIG. 11 is a structural diagram of another backup server according to an embodiment of the present invention.
- FIG. 12 is a structural diagram of another backup server according to an embodiment of the present invention.
- FIG. 1 is a schematic structural diagram of a storage system according to an embodiment of the present invention.
- the storage system includes a client 110, a storage node 120, and a backup system 100.
- the backup system 100 includes a backup server 130 and an object storage system 140.
- the backup server 130 is connected to the object storage system 140.
- the backup server 130 is connected to the client 110 and the storage node 120.
- the storage node 120 includes one or more disks, and the disk may be a virtual machine disk or a physical disk.
- The storage node 120 is configured to divide the data to be stored on a disk of the storage node 120 into a plurality of consecutive data fragments, and to save those consecutive data fragments in a plurality of physical blocks of the disks of the storage node 120.
- The backup server 130 is configured to, upon receiving a data backup request sent by the client 110 or determining that a preset time has been reached, perform a full backup or an incremental backup of the data on the disk of the storage node 120 to the object storage system 140, and to create and save backup metadata and backup attribute information.
- the full backup refers to backing up all the data on the disk of the storage node 120 to the storage system.
- the incremental backup refers to backing up the modified data on the disk of the storage node 120 to the object storage system.
- Backup metadata records the location, within the disk data, of each object that makes up the disk data.
- The backup metadata can record the identifier and pointer of each object that constitutes the disk data, with the pointers recorded in the order in which the objects are arranged on the disk.
- The backup attribute information includes a backup identifier, a backup time, and an identifier of the backup metadata.
- The data backup request sent by the client 110 includes the disk identifier of the disk to be backed up, and the backup server 130 performs a full or incremental backup of the data on the disk indicated by the disk identifier to the object storage system 140.
- the backup server in the backup system provided by the embodiment of the present invention deletes the expired backup as needed.
- The backup server 130 is configured to, upon detecting that the total number of backup attribute information records for the disk exceeds a predetermined value, determine the earliest backup as the expired backup according to the backup times in those records. After determining the expired backup, the backup server 130 uses the backup metadata of the expired backup and the backup metadata of the next backup adjacent to the expired backup to identify the invalid data in the expired backup, creates a deletion log, and saves the pointers of the invalid data in the expired backup to the deletion log. After the deletion log is created, the backup server 130 deletes the backup attribute information of the expired backup, and creates and saves the correspondence between the disk identifier and the identifier of the deletion log.
- Invalid data refers to an object in the expired backup that includes a data fragment modified relative to the next backup adjacent to the expired backup.
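The selection of the expired backup and the construction of its deletion log can be sketched as follows. This is a minimal sketch under assumptions that are not in the source: each backup is a hypothetical dict with `backup_id`, `backup_time`, and `pointers` keys, where `pointers` plays the role of the backup metadata as an ordered list of `(large_object_id, slot)` tuples. An object of the expired backup counts as invalid when the next backup no longer points to it.

```python
def find_expired_backup(backups, max_backups):
    """Return the earliest backup once the count exceeds max_backups, else None."""
    if len(backups) <= max_backups:
        return None
    return min(backups, key=lambda backup: backup["backup_time"])


def build_deletion_log(expired, next_backup):
    """Collect pointers used by the expired backup but not by the next backup."""
    still_used = set(next_backup["pointers"])
    return [pointer for pointer in expired["pointers"] if pointer not in still_used]


backups = [
    {"backup_id": "b1", "backup_time": 1, "pointers": [("A", 0), ("A", 1)]},
    {"backup_id": "b2", "backup_time": 2, "pointers": [("A", 0), ("B", 0)]},
]
expired = find_expired_backup(backups, max_backups=1)
deletion_log = build_deletion_log(expired, backups[1])
```

Here `("A", 0)` is shared with the next backup and stays valid, while `("A", 1)` was replaced by `("B", 0)` and is recorded in the deletion log.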
- After the backup server 130 detects that the deletion condition is met, it acquires the multiple deletion logs corresponding to the disk data according to the correspondence between the disk identifier and the identifiers of the deletion logs.
- The deletion condition may be that the number of deletion logs corresponding to the disk data reaches a preset deletion threshold, that a preset deletion time is reached, or that a timer started when the deletion condition was last met has expired.
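The three alternative deletion conditions can be expressed as one check, as in the sketch below. The function name and parameters are hypothetical, and combining the alternatives with a logical OR is an assumption; a deployment might configure only one of them, which is why each is optional here.

```python
from typing import Optional


def deletion_condition_met(
    num_deletion_logs: int,
    now: float,
    last_met: float,
    log_threshold: Optional[int] = None,
    deletion_time: Optional[float] = None,
    interval: Optional[float] = None,
) -> bool:
    """Return True if any configured deletion condition holds."""
    # Condition 1: enough deletion logs have accumulated.
    if log_threshold is not None and num_deletion_logs >= log_threshold:
        return True
    # Condition 2: a preset deletion time has been reached.
    if deletion_time is not None and now >= deletion_time:
        return True
    # Condition 3: the timer started when the condition was last met has expired.
    if interval is not None and now - last_met >= interval:
        return True
    return False


ready = deletion_condition_met(3, now=10.0, last_met=0.0, log_threshold=3)
```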
- the backup server 130 is further configured to determine, according to the plurality of deletion logs, a target large object that includes the valid data and the invalid data saved in the object storage system 140, and determine a large object that includes only invalid data.
- the valid data refers to an object including an unmodified data fragment in the expired backup with respect to the next backup adjacent to the expired backup.
- The backup server 130 is further configured to send a data migration indication and an object deletion indication to the object storage system 140.
- The data migration indication instructs the object storage system 140 to migrate the valid data in the target large object to another large object.
- The object deletion indication instructs the object storage system 140 to delete the target large object.
- For a large object that includes only invalid data, the backup server 130 is further configured to send an object deletion indication to the object storage system 140, instructing it to delete that large object.
- The accumulated expired backups are processed after a period of time to determine the large objects that include invalid data. If a large object includes both valid data and invalid data, the valid data in it is moved to another large object and the large object is then deleted to save storage space. If a large object includes only invalid data, it is deleted directly to save storage space. In this way, the embodiment of the present invention deletes expired backups of disk data that is incrementally backed up using the large-object storage mode. Instead of processing each expired backup as soon as it is determined, it processes the accumulated expired backups together after a period of time, which spares the backup system from frequently processing expired backups.
- When valid data in a large object is moved, its pointer changes. To avoid updating the pointer of that valid data in the backup metadata of every other backup, the correspondence between the pre-move pointer and the post-move pointer of the valid data is recorded, so that when a given backup later needs to be accessed, it can be checked whether the backup metadata of that backup contains a moved pointer.
- After moving the valid data in the large object to another large object, the backup server 130 is further configured to create a movement log and save the correspondence between the pre-move pointer and the post-move pointer of the valid data to the movement log.
- When the backup server 130 subsequently accesses valid data in the disk data through the backup metadata of an unexpired backup, it updates that backup metadata according to the correspondence between the pre-move and post-move pointers of the valid data stored in the movement log.
- To update the backup metadata of the unexpired backup, each pointer in that metadata that matches a pre-move pointer recorded in the movement log is replaced with the corresponding post-move pointer recorded in the movement log.
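The pointer rewrite described above can be sketched as follows. This is a minimal sketch with hypothetical data shapes: the backup metadata is modeled as an ordered list of `(large_object_id, slot)` pointers, and each movement log as a dict from pre-move pointer to post-move pointer. Applying the logs oldest first is an assumption made so that data moved more than once resolves to its latest location.

```python
def apply_movement_logs(backup_pointers, movement_logs):
    """Return backup pointers with moved entries replaced by their new location.

    backup_pointers: ordered list of pointers from the backup metadata.
    movement_logs:   list of {pre_move_pointer: post_move_pointer}, oldest first.
    """
    updated = list(backup_pointers)
    for log in movement_logs:
        # Pointers not mentioned in the log are left unchanged.
        updated = [log.get(pointer, pointer) for pointer in updated]
    return updated


metadata = [("A", 0), ("B", 0)]
logs = [{("A", 0): ("C", 0)}, {("C", 0): ("D", 5)}]
restored_view = apply_movement_logs(metadata, logs)
```

Because the original list is copied, the stored backup metadata is only changed when the caller chooses to save the returned list.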
- When the backup server 130 receives a data recovery request sent by the client 110 and determines the backup to be restored from among the unexpired backups, it modifies the backup metadata of the backup to be restored according to the movement log, obtains all the objects constituting the disk data according to the modified backup metadata, and then restores the disk data to the target disk.
- the target disk may be located in another storage node 120, which may be a physical disk or a virtual machine disk.
- the target disk may also be located on the storage node 120 where the disk data is backed up or on the same disk as the disk where the disk data was backed up.
- the object storage system 140 in the backup system 100 is configured to store backup data.
- the object storage system 140 is used to save all the data in the disk data during the full backup.
- the object storage system 140 is used to save modified data in the disk data during the incremental backup.
- the object storage system 140 can also store backup metadata created when the disk data is backed up.
- the object storage system 140 can also store a delete log or a move log created when an expired backup process of disk data is performed.
- the backup server 130 and the object storage system 140 are deployed independently, and the client 110 may be a virtual machine deployed on the backup server 130 or deployed independently from the backup server 130.
- the client 110 may be a device independent of the backup server 130 or a virtual machine deployed on a device independent of the backup server 130.
- the storage node 120 can be a storage device in the backup server 130 or deployed independently of the backup server 130. Storage node 120 is used to manage physical disks or to manage virtual machine disks.
- the client 110 can be a physical server or various types of terminal devices.
- The terminal device of the embodiment of the invention includes a tablet device, a notebook computer, a mobile internet device, a palmtop computer, a desktop computer, a mobile phone, or a terminal device in another product form.
- the client 110 can also be a software module, such as a software module running on a physical device or a virtual machine running on a physical server.
- FIG. 2 is a flowchart of a method for full backup of virtual machine disk data according to an embodiment of the present invention. The method includes the following steps:
- the backup server 130 sends a snapshot notification to the storage node 120.
- the snapshot notification includes the virtual machine disk ID.
- the snapshot notification is used to instruct the storage node 120 to take a snapshot of the virtual machine disk corresponding to the virtual machine disk identifier.
- the disk corresponding to the virtual machine disk identifier is the disk to be backed up.
- the storage node 120 takes a snapshot of the virtual machine disk, creates a snapshot identifier of the virtual machine disk, and saves the corresponding relationship between the virtual machine disk identifier and the snapshot identifier.
- The backup system can initiate a full backup in a variety of ways.
- The user may select, through the client 110, a time point for performing a full backup of the virtual machine disk data.
- The client 110 sends a backup request to the backup server 130, where the backup request includes the identifier of the storage node 120 and the virtual machine disk identifier.
- After receiving the backup request sent by the client 110, the backup server 130 sends a snapshot notification to the storage node 120.
- Alternatively, after the storage node 120 saves the consecutive data fragments to be stored on the virtual machine disk into the physical blocks of the virtual machine disk, it sends a backup request to the backup server 130 to start a full backup.
- the user presets the time point of the full backup of the storage node 120 in the backup server 130. After the time point of the full backup arrives, the backup server 130 sends a snapshot notification to the storage node 120.
- the specific implementation of how the backup system initiates a full backup is not limited by the examples of the embodiments of the present invention.
- the snapshot information includes the snapshot ID of the virtual machine disk and the virtual machine disk ID.
- When the storage node 120 takes a snapshot of the virtual machine disk data, it creates a snapshot identifier of the virtual machine disk and then generates snapshot information from the snapshot identifier and the virtual machine disk identifier. The storage node 120 sends the virtual machine disk identifier and the snapshot identifier to the backup server 130 by sending the snapshot information.
- After receiving the snapshot information, the backup server 130 saves the correspondence between the snapshot identifier and the virtual machine disk identifier, confirms the existing backup status of the virtual machine disk according to the virtual machine disk identifier, and determines whether to perform a full backup according to that status.
- step 220 the backup server 130 confirms the existing backup status of the virtual machine disk according to the virtual machine disk identifier into two cases:
- the backup server 130 confirms that the number of backups to the virtual machine disk is 0 according to the virtual machine disk identification. If the number of backups is 0, a full backup of the virtual machine disk data is performed.
- the backup server 130 confirms that the number of backups of the virtual machine disk is not 0 according to the virtual machine disk identifier, and the information saved when the virtual machine disk is backed up last time is inconsistent with the incremental information that needs to be saved when backing up. .
- the snapshot information sent to the backup server also includes the amount of data of the snapshot at the time of the last backup recorded in the storage node.
- When the backup server confirms, according to the virtual machine disk identifier, that the number of backups of the virtual machine disk is not 0, it compares the data volume of the snapshot at the time of the last backup recorded in the backup server with the data volume of the snapshot at the time of the last backup recorded in the storage node, to check whether the information saved during the last backup of the virtual machine disk is consistent with the incremental information that should be saved. If the two data volumes differ, the information saved during the last backup of the virtual machine disk is inconsistent with the incremental information that should be saved.
- the snapshot information sent to the backup server further includes the identifier of the last snapshot.
- When the backup server confirms, according to the virtual machine disk identifier, that the number of backups of the virtual machine disk is not 0, it compares the identifier of the snapshot at the time of the last backup recorded in the backup server with the identifier of the snapshot at the time of the last backup recorded in the storage node, to check whether the information saved during the last backup of the virtual machine disk is consistent with the incremental information that should be saved.
- If the two snapshot identifiers differ, the information saved during the last backup of the virtual machine disk is inconsistent with the incremental information that should be saved.
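The decision described above can be sketched as follows. This is only an illustration: the function name `needs_full_backup` and its parameters are hypothetical, and the sketch assumes that either case (no previous backup, or a mismatch between the last snapshot identifier recorded on the backup server and the one recorded on the storage node) leads to a full backup.

```python
def needs_full_backup(backup_count, server_last_snapshot_id, node_last_snapshot_id):
    """Return True when a full backup should be performed (illustrative only).

    backup_count: number of existing backups of the virtual machine disk.
    server_last_snapshot_id: last-backup snapshot identifier recorded on the backup server.
    node_last_snapshot_id: last-backup snapshot identifier recorded on the storage node.
    """
    # Case 1: the disk has never been backed up.
    if backup_count == 0:
        return True
    # Case 2: the two sides disagree about the last snapshot, so the saved
    # information is treated as inconsistent with the expected incremental information.
    return server_last_snapshot_id != node_last_snapshot_id
```

The same comparison could be made on the recorded snapshot data volume instead of the snapshot identifier.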
- After determining that a full backup is to be performed, the backup server 130 sends a full data acquisition request to the storage node 120 where the virtual machine disk is located.
- the full data acquisition request includes the virtual machine disk identifier and the location of the data block to be read in the virtual machine disk data.
- the data block to be read includes at least one data slice.
- the number of data slices in the data block to be read is determined by the size of the data block to be read and the size of the data slice.
- the size of the data block to be read is predefined in the backup software of the backup server 130, and the size of the data block to be read, which is predefined by different backup software, may be different.
- the backup server 130 can send a plurality of full data acquisition requests to obtain all data fragments of the virtual machine disk data.
- the location of the data block to be read in the virtual machine disk data may be the following two types:
- The first type: the starting position of the data block to be read in the virtual machine disk data and the size of the data block to be read. If the first data fragment in the data block to be read is the i-th data fragment in the virtual machine disk data, where i is an integer greater than 0, the starting position of the data block to be read in the virtual machine disk data is the product of (i-1) and the size of a data fragment. For example, if the first data fragment of the data block to be read is the third data fragment in the virtual machine disk data and the size of a data fragment is 4M, the starting position is the product of (3-1) and 4M, that is, 8M.
- As another example, if the first data fragment of the data block to be read is the first data fragment in the virtual machine disk data and the size of a data fragment is 4M, the starting position is the product of (1-1) and 4M, that is, 0M.
- The second type: the starting position and the end position of the data block to be read in the virtual machine disk data. The starting position is determined as described above and is not described again here.
- The end position of the data block to be read in the virtual machine disk data is the end position of the last data fragment in the data block to be read. If the last data fragment in the data block to be read is the w-th data fragment in the virtual machine disk data, the end position is the product of w and the size of a data fragment.
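The two ways of expressing the location of a data block can be worked through in a short sketch. The function name `block_location` is hypothetical, and treating "4M" as 4 MiB is an assumption made only for the example.

```python
MIB = 1024 * 1024  # assumption: "4M" in the text is read as 4 MiB


def block_location(first_index, last_index, fragment_size):
    """Return (start, size, end) for a block covering fragments first_index..last_index.

    Indices are 1-based, matching the i-th and w-th fragments in the text:
    start = (i - 1) * fragment_size and end = w * fragment_size.
    """
    if first_index < 1 or last_index < first_index:
        raise ValueError("fragment indices must satisfy 1 <= first_index <= last_index")
    start = (first_index - 1) * fragment_size
    end = last_index * fragment_size
    return start, end - start, end


# A block made of only the third fragment starts at 8M, as in the example above.
start, size, end = block_location(3, 3, 4 * MIB)
```

The first type of location corresponds to the pair `(start, size)` and the second type to the pair `(start, end)`.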
- the full data acquisition request includes a virtual machine disk identifier and the plurality of data blocks to be read.
- The location of the plurality of data blocks to be read in the virtual machine disk data may be the starting position, in the virtual machine disk data, of the first data block among the plurality of data blocks to be read, together with the size of the plurality of data blocks to be read.
- Alternatively, the location of the plurality of data blocks to be read in the virtual machine disk data may be the starting position of the first data block to be read and the end position of the last data block to be read among the plurality of data blocks to be read.
- For these starting and end positions, refer to the foregoing description of the location of a data block to be read in the virtual machine disk data; specific implementation details are not described herein.
- The storage node 120 where the virtual machine disk is located receives the full data acquisition request and, according to the location of the data block to be read in the virtual machine disk data, finds the plurality of data fragments that constitute the data block. It then sends those data fragments to the backup server 130.
- After receiving a plurality of data blocks of the virtual machine disk data, the backup server 130 determines the plurality of data blocks that constitute a large object. Each data block is one object in the large object.
- A predetermined number of the data blocks received by the backup server 130 constitute one large object.
- the weak hash value of each data block can be calculated, and multiple data blocks with weak hash values can be determined as multiple data blocks constituting the large object.
- After the backup server 130 determines the plurality of data blocks constituting the large object, it creates an identifier of the large object and sends the plurality of data blocks and the identifier of the large object to the object storage system 140.
- After receiving the plurality of data blocks and the identifier of the large object, the object storage system 140 saves the plurality of data blocks to the large object corresponding to the identifier of the large object.
- All of the data blocks of the virtual machine disk make up a plurality of large objects that are sent to the object storage system 140.
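As a rough illustration of grouping data blocks into large objects by a predetermined number, consider the sketch below. The function name and the identifier scheme `large-object-N` are hypothetical; the text does not specify how large object identifiers are formed.

```python
def group_into_large_objects(blocks, blocks_per_large_object):
    """Group data blocks, in order, into large objects of a fixed block count.

    Returns a dict mapping a (hypothetical) large object identifier to the
    list of data blocks stored in it. The last large object may hold fewer blocks.
    """
    if blocks_per_large_object < 1:
        raise ValueError("blocks_per_large_object must be at least 1")
    large_objects = {}
    for number, start in enumerate(range(0, len(blocks), blocks_per_large_object)):
        identifier = f"large-object-{number}"  # hypothetical naming scheme
        large_objects[identifier] = blocks[start:start + blocks_per_large_object]
    return large_objects
```

Each entry of the returned dict corresponds to one message that the backup server would send to the object storage system: the data blocks plus the identifier of the large object that holds them.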
- The backup server 130 creates the backup metadata and an identifier of the metadata object, and sends the backup metadata and the identifier of the metadata object to the object storage system 140.
- The backup metadata is used to represent the location, in the virtual machine disk data, of each object that makes up the virtual machine disk data.
- the backup metadata records the identifier and pointer of each object that makes up the virtual machine's disk data, and records the pointer of each object according to the order in which the objects are arranged in the virtual machine's disk.
- the identity of the object can also be used as a pointer at the same time. That is to say, in the backup metadata, only the identifier of each object constituting the virtual machine disk data can be recorded, and the identifier of each object is recorded in the order in which the objects are arranged in the virtual machine disk.
- the object identifier recorded in the backup metadata may include an identifier of the large object and a location of the object in the large object to which the object belongs.
- The object's pointer represents the location of the object in the large object to which it belongs. For example, Table 1 below shows the pointers, as recorded in the backup metadata, of some of the objects in the first four large objects of the disk data.
- the backup server 130 creates backup attribute information of the full backup, and saves the correspondence between the backup attribute information of the full backup and the virtual machine disk identifier.
- the backup attribute information of the full backup includes a backup identifier of the full backup, a metadata object identifier of the backup metadata, and a backup time.
- the above steps 200 to 260 describe how the backup system performs a full backup of the virtual machine disk data.
- the user may modify the data, and then perform incremental backup on the modified data.
- the process of a specific incremental backup includes the process in FIG.
- the backup server 130 determines the expired backup and processes the expired backup. The following describes in detail how the backup system incrementally backs up virtual machine disk data.
- FIG. 3 is a flowchart of a method for incrementally backing up virtual machine disk data according to an embodiment of the present invention. As shown in FIG. 3, the method for incrementally backing up virtual machine disk data provided by the embodiment of the present invention includes the following steps:
- the backup server 130 sends a snapshot notification to the storage node 120.
- the snapshot notification includes the virtual machine disk ID.
- the snapshot notification is used to instruct the storage node 120 to take a snapshot of the virtual machine disk corresponding to the virtual machine disk identifier.
- the disk corresponding to the virtual machine disk identifier is the disk to be backed up.
- the storage node 120 takes a snapshot of the virtual machine disk, creates a snapshot identifier of the virtual machine disk, and saves the corresponding relationship between the virtual machine disk identifier and the snapshot identifier.
- the user may select a time point for incrementally backing up the virtual machine disk data through the client 110.
- The client 110 sends a backup request to the backup server 130, where the backup request includes the identifier of the storage node 120 and the virtual machine disk identifier.
- the backup server 130 sends a snapshot notification to the storage node 120.
- Alternatively, after the storage node 120 saves a plurality of consecutive data fragments in one physical block of the virtual machine disk, if the storage node 120 where the virtual machine disk is located detects that at least one data fragment of at least one object in the virtual machine disk data has been modified, the storage node 120 sends a backup request to the backup server 130 to initiate an incremental backup.
- the user presets the time point of the incremental backup of the storage node 120 in the backup server 130. After the time point of the incremental backup arrives, the backup server 130 sends a snapshot notification to the storage node 120.
- the specific implementation of how the backup system initiates an incremental backup is not limited by the examples of the embodiments of the present invention.
- After the storage node 120 takes a snapshot of the virtual machine disk data, the storage node 120 sends the snapshot information to the backup server 130.
- the snapshot information includes the snapshot identifier of the current snapshot, Change Block Tracking (CBT) information, and the virtual machine disk identifier.
- When the storage node 120 takes a snapshot of the virtual machine disk data, it creates the snapshot identifier of the virtual machine disk and the CBT information of the current snapshot. It can then generate the snapshot information from the snapshot identifier of the virtual machine disk, the CBT information of the current snapshot, and the virtual machine disk identifier. By sending the snapshot information, the storage node 120 delivers the virtual machine disk identifier, the CBT information of the current snapshot, and the snapshot identifier to the backup server 130.
- The mapping between the CBT information of the current snapshot, the snapshot identifier, and the virtual machine disk identifier is saved.
- the CBT information of the current snapshot is used to indicate the modified state of each data slice in the virtual machine disk data at the time of the current snapshot relative to the previous snapshot.
- the CBT information of the current snapshot includes the CBT information identifier of the current snapshot and the modified status identifier of each data fragment in the virtual machine disk data.
- the CBT information of the current snapshot records the modification status identifier of each data fragment in the virtual machine disk data according to the order of each data fragment constituting the virtual machine disk data.
- the modification status identifier is used to indicate the modification status of the data fragment in the virtual machine disk data at the time of the current snapshot relative to the previous snapshot.
- The modification status can be either a modified state or an unmodified state.
- For example, the number 0 indicates that a data fragment in the virtual machine disk data is unmodified at the time of the current snapshot relative to the previous snapshot, and the number 1 indicates that the data fragment is modified at the time of the current snapshot relative to the previous snapshot.
- The modification status identifier may also be, for example, Chinese characters such as "modified" or "unmodified", or a letter, a number, or another symbol, or a combination of letters, numbers, or other symbols. The specific form of the modification status identifier is not limited by this embodiment.
- the backup server 130 confirms that the storage node 120 where the virtual machine disk is located creates a snapshot of the virtual machine disk according to the virtual machine disk identifier, and sends an incremental comparison information acquisition request to the storage node 120 where the virtual machine disk is located.
- the incremental comparison information acquisition request includes the snapshot identifier of the virtual machine disk and the virtual machine disk identifier.
- the storage node 120 where the virtual machine disk is located receives the incremental comparison information acquisition request, and sends the incremental comparison information to the backup server 130.
- The incremental comparison information includes the CBT information identifier of the previous snapshot, that is, of the snapshot taken before the current snapshot corresponding to the snapshot identifier of the virtual machine disk.
- According to the snapshot identifier of the virtual machine disk, the storage node 120 where the virtual machine disk is located searches for the CBT information generated by the previous snapshot of the current snapshot. If that CBT information is found, the storage node 120 sends the CBT information identifier contained in it to the backup server 130. If it is not found, the storage node 120 sends search-failure information to the backup server 130.
- It should be noted that the first snapshot is the snapshot taken for the full backup, and no CBT information is created for it. Therefore, when the previous snapshot is the first snapshot, the storage node 120 does not find the CBT information of the previous snapshot and sends search-failure information to the backup server 130.
- After receiving the incremental comparison information sent by the storage node 120 where the virtual machine disk is located, the backup server 130 confirms, according to the CBT information identifier of the previous snapshot in the incremental comparison information and the CBT information of the current snapshot in the snapshot information, the offset position of the incremental data corresponding to the current snapshot in the virtual machine disk data.
- Specifically, the backup server 130 first confirms whether the CBT information identifier recorded in the backup server 130 when the virtual machine disk data was last backed up is the same as the CBT information identifier of the previous snapshot in the incremental comparison information. If they are the same, the backup server 130 confirms the offset position of the incremental data corresponding to the current snapshot in the virtual machine disk data according to the modification status identifiers recorded in the CBT information of the current snapshot.
- To do so, the backup server 130 reads a predetermined number of modification status identifiers at a time, in the order in which they are recorded in the CBT information of the current snapshot. If the predetermined number of identifiers read includes an identifier indicating the modified state, the backup server 130 determines that the data fragments corresponding to those identifiers include a modified data fragment. The predetermined number is equal to the number of data fragments in one object.
- the offset position of the predetermined number of modified state identifiers in the modified state identifier of the CBT information record of the current snapshot is the offset position of the incremental data corresponding to the current snapshot in the virtual machine disk data.
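The scan over the modification status identifiers can be sketched as below. The sketch assumes the identifiers are the numbers 0 (unmodified) and 1 (modified) mentioned earlier, and that the offset is expressed in bytes as the group's starting fragment index multiplied by the fragment size; both the function name and the byte-offset convention are assumptions for illustration.

```python
def incremental_object_offsets(status_flags, fragments_per_object, fragment_size):
    """Return byte offsets of objects that contain at least one modified fragment.

    status_flags: per-fragment flags in disk order, 0 = unmodified, 1 = modified.
    fragments_per_object: the predetermined number of flags read at a time.
    fragment_size: size of one data fragment in bytes (assumed unit).
    """
    if fragments_per_object < 1:
        raise ValueError("fragments_per_object must be at least 1")
    offsets = []
    for start in range(0, len(status_flags), fragments_per_object):
        group = status_flags[start:start + fragments_per_object]
        # One modified fragment is enough to mark the whole object as incremental data.
        if any(flag == 1 for flag in group):
            offsets.append(start * fragment_size)
    return offsets
```

Because the group size equals the number of fragments in one object, each returned offset identifies a whole object to be fetched with one incremental data acquisition request.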
- After determining the offset positions of the plurality of incremental data in the virtual machine disk data, the backup server 130 sends incremental data acquisition requests to the storage node 120 where the virtual machine disk is located one after another, that is, it performs the following step 340 for each offset position in turn.
- the backup server 130 sends an incremental data acquisition request to the storage node 120 where the virtual machine disk is located.
- The incremental data acquisition request includes the virtual machine disk identifier and the offset position of the incremental data in the virtual machine disk data.
- the storage node 120 where the virtual machine disk is located receives the incremental data acquisition request, and searches for the incremental data in the virtual machine disk data according to the offset position of the incremental data in the virtual machine disk data.
- the storage node 120 where the virtual machine disk is located then sends incremental data to the backup server 130.
- After receiving the incremental data, the backup server 130 creates a new large object identifier and sends the new large object identifier and the incremental data to the object storage system 140. Each piece of incremental data is saved in the object storage system 140 as a new object in a new large object.
- a plurality of new objects corresponding to the plurality of incremental data may be attributed to a new large object or a plurality of new large objects.
- backup software predefines the number of multiple objects that a large object can actually hold.
- After receiving multiple pieces of incremental data, the backup server 130 creates a new large object identifier and sends the incremental data and the new large object identifier to the object storage system 140, so that the object storage system 140 saves the incremental data to the new large object corresponding to the new large object identifier.
- The object storage system 140 receives the incremental data and, by means of object storage, saves the incremental data as an incremental backup to a new object corresponding to the new object identifier. The object storage system 140 then sends an incremental backup completion request to the backup server 130.
- When the backup server 130 stores the backup metadata corresponding to the incremental backup by means of object storage, the backup server 130 creates the backup identifier and the metadata object identifier of the incremental backup, and saves the correspondence between the backup identifier of the incremental backup and the metadata object identifier.
- the backup server 130 sends a metadata acquisition request to the object storage system 140.
- the metadata acquisition request includes the metadata object identifier corresponding to the previous backup.
- Before creating the metadata acquisition request, the backup server 130 first finds the backup identifier of the previous backup and then, according to that backup identifier, finds the corresponding metadata object identifier.
- the object storage system 140 receives the metadata acquisition request and sends the backup metadata corresponding to the previous backup to the backup server 130.
- the backup server 130 modifies the backup metadata corresponding to the previous backup, and obtains the modified backup metadata as the backup metadata of the incremental backup.
- The backup server 130 modifies the pointer of the object corresponding to the incremental data so that it indicates the location, in the new large object to which it belongs, of the object corresponding to the incremental data held in the object storage system 140.
- the modified metadata is expressed in the modified form.
- For example, at least one data fragment of the first object B1 in the large object B and at least one data fragment of the first object D1 in the large object D are modified, and the backup system creates a new large object E during the incremental backup.
- The backup server 130 saves the objects to which the modified data fragments belong to the new large object E. Therefore, the new large object E includes two objects with modified data fragments.
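The metadata update in this example can be sketched as follows. The representation of a pointer as a `(large_object_id, position)` tuple, the sample metadata, and the function name are all hypothetical; the actual pointer layout of Table 1 is not reproduced here.

```python
def apply_incremental(previous_metadata, changed_positions, new_large_object_id):
    """Build the backup metadata of an incremental backup from the previous one.

    previous_metadata: list of (large_object_id, position) pointers in disk order.
    changed_positions: indices in that list whose objects contain modified fragments.
    new_large_object_id: identifier of the new large object holding the rewritten objects.
    """
    updated = list(previous_metadata)  # the previous backup's metadata is left untouched
    for slot, index in enumerate(sorted(changed_positions), start=1):
        # Point the changed entry at its slot inside the new large object.
        updated[index] = (new_large_object_id, slot)
    return updated


# Hypothetical layout: objects B1 and D1 change and are saved into new large object E.
full = [("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1), ("D", 1)]
incremental = apply_incremental(full, {2, 5}, "E")
```

Unchanged entries keep pointing at the objects of earlier backups, which is what later allows an expired backup's data to be classified as valid or invalid by comparing pointers.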
- the backup server 130 acquires the backup metadata involved in Table 1, and modifies the backup metadata involved in Table 1.
- A portion of the modified backup metadata is shown in Table 2 below.
- After obtaining the modified backup metadata, the backup server 130 creates the metadata object identifier of the backup metadata of the incremental backup and then creates the backup attribute information of the incremental backup. The backup attribute information of the incremental backup includes the backup identifier of the incremental backup, the metadata object identifier, and the backup time.
- the backup server 130 saves the correspondence between the backup attribute information of the incremental backup and the virtual machine disk identifier.
- The backup server 130 sends the modified backup metadata and the metadata object identifier of the incremental backup to the object storage system 140, to instruct the object storage system 140 to save the backup metadata of the incremental backup to the object corresponding to the metadata object identifier of the incremental backup.
- The object storage system 140 saves the backup metadata created during the incremental backup to the object corresponding to the metadata object identifier of the incremental backup, and then sends a metadata storage completion message for the incremental backup to the backup server 130.
- the above steps 300 to 380 describe how the backup system performs incremental backup of virtual machine disk data by means of object storage.
- After the full backup or an incremental backup of the virtual machine disk data is completed, the backup server 130 detects the total number of unexpired backups among all the backups of the virtual machine disk data in the object storage system 140. If that total exceeds a predetermined value, the earliest of all the unexpired backups in the object storage system 140 is determined to be an expired backup, and the backup system then processes this earliest backup to save storage space in the backup system.
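A minimal sketch of this expiry check is given below. It assumes each unexpired backup is described by a dict with hypothetical keys `backup_id` and `backup_time`, where `backup_time` is any comparable value such as a timestamp; the function name is also an assumption.

```python
def find_expired_backup(unexpired_backups, max_unexpired):
    """Return the earliest backup if the retention limit is exceeded, else None.

    unexpired_backups: list of dicts with 'backup_id' and 'backup_time' keys.
    max_unexpired: the predetermined value for the number of unexpired backups.
    """
    if len(unexpired_backups) <= max_unexpired:
        return None
    # The earliest of the unexpired backups is the one treated as expired.
    return min(unexpired_backups, key=lambda backup: backup["backup_time"])
```

Running the check after every completed backup means that at most one backup becomes expired per backup cycle, matching the flow in which a first and later a second expired backup are handled in turn.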
- the processing method of the expired backup provided by the embodiment of the present invention is described below.
- FIG. 4 is a flowchart of a method for processing an expired backup according to an embodiment of the present invention. As shown in FIG. 4, the processing method of the expired backup provided by the embodiment of the present invention includes the following steps:
- The backup server 130 sends a first expired backup information acquisition request to the object storage system 140.
- the first expired backup information acquisition request includes a metadata object identifier of the first expired backup and a metadata object identifier of a next backup adjacent to the first expired backup.
- backup server 130 determines a first expired backup of virtual machine disk data.
- The process of determining the first expired backup includes: the backup server 130 detects the number of pieces of backup attribute information saved in the backup server 130. If that number exceeds a predetermined value, the backup server 130 queries, according to the backup attribute information, the backup identifier of the first expired backup among all the backups performed on the virtual machine disk data. The first expired backup is the earliest of all the unexpired backups of the virtual machine disk data in the object storage system 140 at the current first time.
- After determining the first expired backup of the virtual machine disk data, and before sending the first expired backup information acquisition request, the backup server 130 first finds the backup identifier of the first expired backup and then, according to that backup identifier, finds the corresponding metadata object identifier. Likewise, before sending the request, the backup server 130 finds the backup identifier of the next backup adjacent to the first expired backup and then, according to that backup identifier, finds the corresponding metadata object identifier.
- After confirming the metadata object identifier of the first expired backup and the metadata object identifier of the next backup adjacent to the first expired backup, the backup server 130 sends the first expired backup information acquisition request to the object storage system 140, to instruct the object storage system 140 to send the backup metadata corresponding to these two metadata object identifiers to the backup server 130.
- After receiving the first expired backup information acquisition request, the object storage system 140 finds the backup metadata of the first expired backup according to the metadata object identifier of the first expired backup, and finds the backup metadata of the next backup adjacent to the first expired backup according to the metadata object identifier of that next backup.
- the object storage system 140 sends the backup metadata of the first expired backup and the backup metadata of the next backup adjacent to the first expired backup to the backup server 130.
- After receiving the backup metadata corresponding to the first expired backup and the backup metadata of the next backup adjacent to the first expired backup, the backup server 130 identifies the valid data and the invalid data in the first expired backup according to these two sets of backup metadata.
- Specifically, the backup server 130 compares the backup metadata of the first expired backup with the backup metadata of the next backup adjacent to the first expired backup, and determines whether the same arrangement position in the two sets of backup metadata points to the same object. If it points to the same object, the object pointed to by that arrangement position in the first expired backup is valid data; if it points to different objects, the object pointed to by that arrangement position in the first expired backup is invalid data.
- After confirming the invalid data in the first expired backup, the backup server 130 creates a first deletion log of the virtual machine disk data and saves the pointers of the invalid data in the first expired backup to the first deletion log.
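The position-by-position comparison and the resulting deletion log can be sketched as follows. Pointers are again modelled as hypothetical `(large_object_id, position)` tuples, the sample metadata is invented for illustration, and the sketch assumes both sets of backup metadata describe the same number of arrangement positions.

```python
def build_deletion_log(expired_metadata, next_metadata):
    """Return the pointers of invalid data in the expired backup.

    An entry is invalid when the same arrangement position in the next
    backup's metadata points to a different object; otherwise it is still
    referenced by the next backup and is valid data.
    """
    if len(expired_metadata) != len(next_metadata):
        raise ValueError("both metadata lists must cover the same positions")
    deletion_log = []
    for expired_pointer, next_pointer in zip(expired_metadata, next_metadata):
        if expired_pointer != next_pointer:
            deletion_log.append(expired_pointer)
    return deletion_log


# Hypothetical metadata: the next backup replaced objects B1 and D1 with E1 and E2.
expired = [("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1), ("D", 1)]
following = [("A", 1), ("A", 2), ("E", 1), ("B", 2), ("C", 1), ("E", 2)]
log = build_deletion_log(expired, following)
```

Only the pointers are recorded at this stage; the data they refer to stays in the object storage system until the deletion log is processed.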
- For example, based on the pointers of some objects in the backup metadata of the full backup shown in Table 1 and the pointers of some objects in the backup metadata of the incremental backup shown in Table 2, suppose the full backup corresponding to Table 1 is the first expired backup and the incremental backup corresponding to Table 2 is the next backup adjacent to the first expired backup. Then the object corresponding to the pointer B1 of the first object in the large object B and the object corresponding to the pointer D1 of the first object in the large object D are invalid data in the first expired backup, and the pointers of some of the invalid data recorded in the first deletion log corresponding to the first expired backup may be as shown in Table 3 below.
- the backup server 130 may send the first expired backup metadata deletion instruction to the object storage system 140.
- the first expired backup metadata deletion instruction includes a metadata object identifier corresponding to the backup metadata of the first expired backup.
- the first expired backup metadata deletion instruction is used to instruct to delete the backup metadata of the first expired backup.
- the object storage system 140 may delete the backup metadata of the first expired backup according to the first expired backup metadata deletion instruction.
- After creating the deletion log, the backup server 130 also deletes the backup attribute information of the first expired backup. The backup attribute information that remains saved in the backup server 130 is then the backup attribute information of the unexpired backups. Therefore, after a subsequent incremental backup, the backup server 130 can count all the saved backup attribute information to detect whether the total number of unexpired backups exceeds a predetermined value.
- the backup server 130 sends the first deletion log to the object storage system 140.
- the object storage system 140 After receiving the first deletion log, the object storage system 140 saves the first deletion log to the object storage system 140.
- the object storage system 140 may send a delete log save complete message to the backup server 130 to notify the backup server 130 that the first delete log has been saved to the object storage system 140.
- the backup server 130 can also perform incremental backup of the virtual machine disk data by the steps shown in FIG. 3 above.
- the backup metadata of the incremental backup is shown in Table 4 below.
- After the incremental backup corresponding to Table 4, if the backup server 130 detects that the total number of unexpired backups among all backups of the virtual machine disk data in the object storage system 140 exceeds a predetermined value, the backup server 130 determines that the earliest of all the unexpired backups in the object storage system 140 is a second expired backup.
- After determining the second expired backup, the backup server 130 creates a second deletion log of the virtual machine disk data and saves the pointers of the invalid data in the second expired backup to the second deletion log.
- the second expired backup is the earliest backup of all un-expired backups performed on the virtual machine disk data in the object storage system 140 at the current second time.
- In the expired backup processing method of this embodiment of the present invention, the processing performed each time an expired backup is determined may refer to the details of steps 400 to 430 shown in FIG. 4. The specific implementation details are not described herein again.
- For the pointers of some of the invalid data recorded in the created second deletion log, refer to the content of Table 5 below.
- the backup metadata of the incremental backup is shown in Table 6 below.
- After the incremental backup corresponding to Table 6, if the backup server 130 detects that the total number of unexpired backups among all backups of the virtual machine disk data in the object storage system 140 exceeds a predetermined value, the backup server 130 determines that the earliest of all the unexpired backups in the object storage system 140 is a third expired backup.
- After determining the third expired backup, the backup server 130 creates a third deletion log of the virtual machine disk data and saves the pointers of the invalid data in the third expired backup to the third deletion log.
- the third expired backup is the earliest backup of all unexpired backups performed on the virtual machine disk data in the object storage system 140 at the current third time.
- In the expired backup processing method of this embodiment of the present invention, the processing performed each time an expired backup is determined may refer to the details of steps 400 to 430 shown in FIG. 4. The specific implementation details are not described herein again.
- For the pointers of some of the invalid data recorded in the created third deletion log, refer to the content of Table 7 below.
- The purpose of saving the deletion logs of the plurality of expired backups of the virtual machine disk to the object storage system 140 is to allow the backup system to subsequently determine, according to all the deletion logs, all the objects in the expired backups that contain invalid data, so that all the objects containing invalid data can be processed.
- Among all the objects containing invalid data, there may be large objects that include both invalid data and valid data, large objects that include only invalid data, or objects that are themselves invalid data.
- FIG. 5 is a flowchart of a method for deleting backup data according to an accumulated deletion log by a backup server according to an embodiment of the present invention. As shown in FIG. 5, the method for deleting the expired data by the backup server 130 according to the accumulated deletion log according to the embodiment of the present invention includes the following steps.
- the backup server 130 receives the plurality of deletion logs corresponding to the first disk data stored in the object storage system 140 and sent by the object storage system 140.
- Before receiving the plurality of deletion logs sent by the object storage system 140, the backup server 130 detects whether a deletion condition is satisfied. If the deletion condition is satisfied, the backup server 130 acquires all the deletion logs of the virtual machine disk data from the object storage system 140. The deletion condition may be detected in any of the following ways: the backup server 130 detects whether the number of deletion logs exceeds a preset deletion threshold, and if it does, the deletion condition is satisfied; or the backup server 130 detects whether a preset deletion time has been reached, and if it has, the deletion condition is satisfied; or the backup server 130 starts timing from the last time the deletion condition was satisfied and detects whether the timing has ended, and if it has, the deletion condition is satisfied.
- The backup server 130 may detect whether the number of log object identifiers of the deletion logs corresponding to the first disk data reaches the preset deletion threshold, and thereby detect whether the number of deletion logs corresponding to the first disk data reaches the preset deletion threshold.
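The three alternative triggers can be sketched as a single check. This is a hedged illustration: the names `DeletionPolicy` and `deletion_condition_met` and the use of numeric timestamps are assumptions, and the text uses both "exceeds" and "reaches" for the threshold, so the sketch arbitrarily uses "reaches" (`>=`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeletionPolicy:
    """Hypothetical configuration; any field left as None is not checked."""
    log_threshold: Optional[int] = None     # number of accumulated deletion logs
    scheduled_time: Optional[float] = None  # preset deletion time (timestamp)
    interval: Optional[float] = None        # timer length since the last trigger


def deletion_condition_met(policy: DeletionPolicy, log_count: int,
                           now: float, last_trigger: float) -> bool:
    # Trigger 1: enough deletion logs have accumulated.
    if policy.log_threshold is not None and log_count >= policy.log_threshold:
        return True
    # Trigger 2: the preset deletion time has been reached.
    if policy.scheduled_time is not None and now >= policy.scheduled_time:
        return True
    # Trigger 3: the timer started at the last trigger has run out.
    if policy.interval is not None and now - last_trigger >= policy.interval:
        return True
    return False


print(deletion_condition_met(DeletionPolicy(log_threshold=3), 3, 0.0, 0.0))  # True
```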
- the plurality of deletion logs corresponding to the first disk data stored in the object storage system 140 may be, for example, the first deletion log, the second deletion log, and the third deletion log described in the above examples.
- After receiving the multiple deletion logs sent by the object storage system 140, the backup server 130 determines, according to all the deletion logs, all the large objects stored in the object storage system 140 that include both invalid data and valid data, and all the large objects that include only invalid data.
- In step 501, the backup server 130 determines, according to the number of invalid data in a large object including invalid data and the data amount of that large object, whether the large object is a large object including invalid data and valid data or a large object including only invalid data.
- This determination is implemented as follows: the backup server 130 determines a target data amount of the large object including invalid data according to the number of invalid data in the large object and the size of the invalid data, and acquires the actual data amount of the large object. If the target data amount is smaller than the actual data amount, the large object is a large object including invalid data and valid data. If the target data amount is equal to the actual data amount, the large object is a large object including only invalid data.
- The backup server 130 acquires the actual data amount of the large object including invalid data as follows: the backup server 130 sends a data amount determination request to the object storage system 140. The data amount determination request carries the identifier of the large object including invalid data and is used to instruct the object storage system 140 to send the actual data amount of that large object.
- After receiving the data amount determination request, the object storage system 140 sends data amount attribute information to the backup server 130.
- The backup server 130 receives the data amount attribute information, which carries the actual data amount of the large object including invalid data.
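The data amount comparison can be illustrated with a short sketch. It assumes that every invalid data object has the same fixed size, which the text does not state explicitly; the function name and the string labels are hypothetical.

```python
def classify_by_data_amount(invalid_count: int, invalid_size: int, actual_amount: int) -> str:
    """Classify a large object by comparing target and actual data amounts.

    Assumes each invalid data object has the same size (invalid_size bytes).
    """
    target_amount = invalid_count * invalid_size
    if target_amount < actual_amount:
        return "invalid_and_valid"  # some of the stored bytes are still valid data
    if target_amount == actual_amount:
        return "invalid_only"       # every stored byte belongs to invalid data
    raise ValueError("invalid data cannot exceed the actual data amount")


print(classify_by_data_amount(1, 4096, 8192))  # invalid_and_valid
print(classify_by_data_amount(2, 4096, 8192))  # invalid_only
```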
- Based on the above implementation of determining whether a large object is a large object including invalid data and valid data or a large object including only invalid data, the large objects including invalid data that the backup server 130 determines according to the first deletion log, the second deletion log, and the third deletion log described in the above examples are large object B, large object D, large object A, large object C, and large object E.
- Large object B is a large object including invalid data and valid data.
- Large object D is a large object including only invalid data.
- Large object A is a large object including only invalid data.
- Large object C is a large object including only invalid data.
- Large object E is a large object including invalid data and valid data.
- The backup server may determine that a large object is a large object including valid data and invalid data in either of the following two ways.
- In the first way, after determining a large object that includes invalid data, the backup server requests from the object storage system the pointers of all objects in that large object. If only some of those pointers are pointers of invalid data of the large object recorded in the deletion logs, the large object is determined to be a large object including valid data and invalid data.
- In the second way, the backup server requests from the object storage system the number of pointers of all objects in the large object. If the number of pointers of all invalid data of the large object recorded in the deletion logs is less than the number of pointers of all objects in the large object obtained from the object storage system, the large object is determined to be a large object including valid data and invalid data.
- The backup server may determine that a large object is a large object including only invalid data in either of the following two ways.
- In the first way, after determining a large object that includes invalid data, the backup server requests from the object storage system the pointers of all objects in that large object. If all of those pointers are pointers of invalid data of the large object recorded in the deletion logs, the large object is determined to be a large object including only invalid data.
- In the second way, the backup server requests from the object storage system the number of pointers of all objects in the large object. If the number of pointers of all invalid data of the large object recorded in the deletion logs is equal to the number of pointers of all objects in the large object obtained from the object storage system, the large object is determined to be a large object including only invalid data.
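The pointer-count comparison (the second way in each case) can be sketched as below. The representation of a deletion log as a list of `(large object identifier, pointer)` pairs, the function name, and the sample log contents are assumptions for illustration only; they do not reproduce the tables of the patent.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

DeletionLog = List[Tuple[str, str]]  # hypothetical: (large object id, invalid pointer)


def classify_large_objects(deletion_logs: List[DeletionLog],
                           pointer_counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Return (objects with valid and invalid data, objects with only invalid data).

    pointer_counts stands in for the per-object pointer totals that the
    backup server would request from the object storage system.
    """
    invalid = defaultdict(set)
    for log in deletion_logs:
        for large_object_id, pointer in log:
            invalid[large_object_id].add(pointer)

    mixed, invalid_only = [], []
    for large_object_id in sorted(invalid):
        if len(invalid[large_object_id]) < pointer_counts[large_object_id]:
            mixed.append(large_object_id)         # fewer invalid pointers than objects
        else:
            invalid_only.append(large_object_id)  # every object is invalid
    return mixed, invalid_only


# Hypothetical accumulated deletion logs and object sizes.
logs = [[("B", "B1"), ("D", "D1")], [("D", "D2"), ("A", "A1")]]
counts = {"A": 1, "B": 2, "D": 2}
print(classify_large_objects(logs, counts))  # (['B'], ['A', 'D'])
```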
- After step 501, the large objects including invalid data and valid data and the large objects including only invalid data are handled in different ways, as described below.
- After the backup server 130 determines all the large objects including invalid data and valid data according to all the deletion logs, the following steps 510 to 560 are executed in sequence: the backup server 130 acquires all the large objects including invalid data and valid data from the object storage system 140, saves the valid data of those large objects to at least one newly created large object in the object storage system 140, and then instructs the object storage system 140 to delete all the large objects including invalid data and valid data. For the specific implementation, see steps 510 to 560 below.
- After the backup server 130 determines all the large objects including only invalid data according to all the deletion logs, steps 570 to 580 are executed in sequence: the backup server 130 instructs the object storage system 140 to delete all the large objects including only invalid data. See steps 570 and 580 below.
- Step 570 and step 510 may be performed in any order.
- the following describes the processing manner after the backup server 130 determines all the large objects including the invalid data and the valid data according to the multiple deletion logs, and specifically includes the following steps 510 to 560.
- the backup server 130 sends a large object acquisition request to the object storage system 140 after determining that at least one large object in the object storage system 140 includes invalid data and valid data according to all the deletion logs.
- the large object acquisition request includes the identification of the large object including the invalid data and the valid data.
- all large objects that include invalid data and valid data in all expired backups include the first target large object.
- the large object acquisition request includes an identification of the first target large object.
- the first target large object may be any of the large object B and the large object E.
- The large object acquisition request includes at least one identifier of a large object including valid data and invalid data. If the backup server 130 determines, based on the plurality of deletion logs, more than one large object including invalid data and valid data, the backup server 130 may implement the large object acquisition request in various ways. For example, the backup server 130 may send a plurality of large object acquisition requests to the object storage system 140, each including the identifier of one large object including valid data and invalid data. Alternatively, the backup server 130 may send a single large object acquisition request that includes the identifiers of a plurality of large objects including valid data and invalid data.
- After determining the large objects in the object storage system 140 that include invalid data and valid data, the backup server 130 generates an identifier of at least one new large object.
- the identifier of at least one new large object corresponds to at least one new large object.
- Steps 510 and 511 may be performed in any order.
- As an example, the at least one new large object includes a first new large object.
- After determining the large objects in the object storage system 140 that include invalid data and valid data, the backup server 130 determines the number of identifiers of new large objects to create according to the number of valid data in the at least one large object including invalid data and valid data and the number of objects that a large object can actually hold. The number of identifiers is the quotient of the number of all valid data in the at least one large object including invalid data and valid data divided by the number of objects that a large object can actually hold, rounded up. For example, there are two large objects including invalid data and valid data, namely large object B and large object E. B2 in large object B and E1 in large object E are valid data, so the number of valid data is two. If a new large object can hold three objects, the number of identifiers of new large objects to create is 1.
- The backup server 130 may first determine, in the order of the pointers of the confirmed valid data, the pointers of the valid data to be saved to a new large object. When the number of pointers determined reaches the number of objects that a large object can actually hold, the backup server 130 creates the identifier of the new large object. For the remaining valid data, the backup server 130 again determines, in the order of the pointers, the pointers of the valid data to be saved to another new large object, and creates the identifier of that new large object when the number of pointers determined reaches the number of objects that a large object can actually hold. The identifiers of a plurality of new large objects are created in this manner.
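The identifier count and the sequential grouping of valid data pointers can be sketched as follows. The sketch reads the rule as a ceiling division, which matches the worked example (two valid data, capacity three, one identifier) but is an interpretation of the translated text. The identifier format `new-large-object-N` and the function name are hypothetical.

```python
import math
from typing import Dict, List


def plan_new_large_objects(valid_pointers: List[str], capacity: int) -> Dict[str, List[str]]:
    """Assign valid data pointers, in order, to newly identified large objects.

    capacity is the number of objects one large object can actually hold.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    count = math.ceil(len(valid_pointers) / capacity)  # number of identifiers to create
    plan = {}
    for index in range(count):
        new_id = f"new-large-object-{index + 1}"       # hypothetical identifier format
        plan[new_id] = valid_pointers[index * capacity:(index + 1) * capacity]
    return plan


# Two valid data (B2, E1) and a capacity of three need one new large object.
print(plan_new_large_objects(["B2", "E1"], 3))  # {'new-large-object-1': ['B2', 'E1']}
```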
- After receiving the large object acquisition request, the object storage system 140 queries the large object corresponding to the identifier of the large object according to the large object acquisition request. For example, the object storage system 140 queries the corresponding first target large object according to the identifier of the first target large object in the large object acquisition request.
- step 520 is performed.
- The first target large object that the object storage system 140 finds according to the identifier of the first target large object in the large object acquisition request may be large object B or large object E.
- Step 521: The object storage system 140 sends the queried large object to the backup server 130. Step 521 is performed after step 520.
- the object storage system 140 sends the queried first target large object to the backup server 130.
- the object storage system 140 sends the queried large object B or large object E to the backup server 130.
- After receiving the at least one large object including invalid data and valid data, the backup server 130 creates a valid data movement instruction.
- The valid data movement instruction includes the identifier of a new large object and at least one valid data of at least one large object including invalid data and valid data.
- The valid data movement instruction is used to instruct the object storage system 140 to save the at least one valid data of the large object including invalid data and valid data to the new large object corresponding to the identifier of the new large object.
- The valid data movement instruction further includes a location identifier of the valid data in the new large object after the move. The location identifier indicates the position of the valid data in the new large object after it is moved.
- After receiving a large object including invalid data and valid data, the backup server 130 parses the large object and confirms the valid data in it. After the backup server 130 determines at least one valid data in at least one large object, it creates a valid data movement instruction.
- The at least one valid data in the valid data movement instruction may be composed of at least one valid data from each of two or more large objects including invalid data and valid data.
- at least one valid data in the valid data move instruction may be composed of all valid data in at least one large object including invalid data and valid data, and some or all of valid data in other large objects including invalid data and valid data.
- backup server 130 can create at least one valid data move instruction. That is, you can create a valid data move instruction or create multiple valid data move instructions. At least one valid data in each valid data move instruction may be composed of some or all of the valid data of at least one large object including the invalid data and the valid data.
- By transmitting a plurality of valid data movement instructions, the backup server 130 may transmit all the valid data in the large objects including invalid data and valid data to the object storage system 140.
- The implementation of a valid data movement instruction is illustrated using the first target large object including valid data and invalid data. For example, the valid data movement instruction includes the identifier of the first new large object and at least one valid data in the first target large object, and is used to instruct the object storage system 140 to save the at least one valid data in the first target large object carried in the instruction to the first new large object corresponding to the identifier of the first new large object.
- If the first target large object is large object B, the at least one valid data in the first target large object included in the valid data movement instruction may be B2.
- If the first target large object is large object E, the at least one valid data in the first target large object included in the valid data movement instruction may be E1.
- The backup server 130 may create two valid data movement instructions, one including valid data B2 and the other including valid data E1, or it may create a single valid data movement instruction including both valid data B2 and E1.
- After receiving the valid data movement instruction, the object storage system 140 saves the at least one valid data included in the valid data movement instruction to the new large object in the object storage system 140 corresponding to the identifier of the new large object.
- Step 540: After saving the at least one valid data included in the valid data movement instruction to the new large object in the object storage system 140 corresponding to the identifier of the new large object, the object storage system 140 sends a valid data save completion message to the backup server 130 to notify the backup server 130 that all the valid data included in the valid data movement instruction has been saved.
- Through the above steps 510 to 540, all the valid data of all the large objects including invalid data and valid data is saved to a new large object in the object storage system 140.
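The effect of a valid data movement instruction can be sketched with an in-memory stand-in for the object storage system. Everything here is hypothetical: the class names, the dictionary-based store, and the choice to assign positions in append order (in the text the instruction itself carries a location identifier created by the backup server).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MoveInstruction:
    """Hypothetical valid data movement instruction."""
    new_object_id: str
    valid_data: Dict[str, bytes]  # old pointer -> payload, in move order


@dataclass
class InMemoryObjectStore:
    """Toy stand-in for the object storage system; not a real storage API."""
    large_objects: Dict[str, List[bytes]] = field(default_factory=dict)

    def apply_move(self, instruction: MoveInstruction) -> Dict[str, Tuple[str, int]]:
        target = self.large_objects.setdefault(instruction.new_object_id, [])
        new_positions = {}
        for old_pointer, payload in instruction.valid_data.items():
            # Record where each valid data object lands in the new large object.
            new_positions[old_pointer] = (instruction.new_object_id, len(target))
            target.append(payload)
        return new_positions


store = InMemoryObjectStore()
moved = store.apply_move(MoveInstruction("N1", {"B2": b"b2-data", "E1": b"e1-data"}))
print(moved)  # {'B2': ('N1', 0), 'E1': ('N1', 1)}
```

The returned mapping from old pointers to new positions is the kind of information that a movement log would later record.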
- Steps 541 to 560 are then performed: the backup server 130 creates a pointer for each valid data after the move, which indicates the position of the valid data in the new large object after it is saved to the new large object.
- The backup server 130 records the correspondence between the pointers of the valid data before and after the move, and deletes the large objects including invalid data and valid data that the backup server 130 determined according to all the deletion logs, thereby deleting the expired data.
- The specific process is detailed in steps 541 to 560 below.
- Step 541: The backup server 130 determines the pointer of the valid data after the move.
- Step 541 and step 531 may be performed in any order.
- the backup server 130 can determine the pointer after the valid data is moved.
- the pointer after the valid data is moved is used to indicate the position of the valid data in the new large object after it is saved to the new large object.
- the backup server 130 may determine the pointer after the valid data is moved according to the location identifier in the new large object after the valid data in the valid data movement instruction created in the above step 530 is moved.
- The backup server 130 creates a movement log and saves the correspondence between the pointers of the valid data before and after the move to the movement log.
- In step 542, before creating the movement log, the backup server 130 creates the object identifier of the movement log in advance and saves the correspondence between the virtual machine disk identifier and the object identifier of the movement log.
- After the backup server 130 backs up the virtual machine disk data, the virtual machine disk data is restored infrequently. Therefore, in the expired backup processing method provided by this embodiment of the present invention, each time an expired backup is processed, the backup system does not update the backup metadata of all the unexpired backups after moving the valid data in the expired backup to a new large object. Instead, it creates a movement log and saves the pointers of the moved valid data to the movement log. When virtual machine disk data is later restored from an unexpired backup, the backup metadata of the backup to be restored can be determined according to the moved-data pointers recorded in the movement log, and the virtual machine disk data corresponding to the backup to be restored is then obtained from the object storage system 140 according to that backup metadata.
- Because the movement log is created after the backup system processes the expired backup and moves its valid data, the backup metadata of a backup is updated only when that unexpired backup is restored. This reduces the complexity of handling backup metadata during expired backup processing and improves the processing efficiency of expired backups.
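Deferring the metadata update to restore time can be sketched as a lookup through the movement log. The dictionary representation of the movement log, the pointer names, and the function name are assumptions. Following a pointer through several successive moves is also an assumption, added because valid data could in principle be moved more than once.

```python
from typing import Dict, List


def resolve_metadata(backup_metadata: List[str], movement_log: Dict[str, str]) -> List[str]:
    """Rewrite the pointers of a backup to be restored using the movement log.

    movement_log maps a pointer before the move to the pointer after the move.
    """
    resolved = []
    for pointer in backup_metadata:
        seen = set()
        # Follow the chain of moves; the seen set guards against a cyclic log.
        while pointer in movement_log and pointer not in seen:
            seen.add(pointer)
            pointer = movement_log[pointer]
        resolved.append(pointer)
    return resolved


# Hypothetical log: B2 and E1 were moved into a new large object N1.
log = {"B2": "N1-1", "E1": "N1-2"}
print(resolve_metadata(["A2", "B2", "E1"], log))  # ['A2', 'N1-1', 'N1-2']
```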
- The backup server 130 sends the movement log to the object storage system 140.
- To do so, the backup server 130 may transmit a movement log storage request to the object storage system 140.
- The movement log storage request includes the object identifier of the movement log and the movement log itself.
- The movement log storage request is used to instruct the object storage system 140 to save the movement log carried in the request to the object corresponding to the object identifier of the movement log.
- After receiving the movement log storage request, the object storage system 140 saves the movement log to the object corresponding to the object identifier of the movement log.
- The object storage system 140 sends a large object write completion message to the backup server 130.
- The object storage system 140 sends the large object write completion message to notify the backup server 130 that the movement log has been saved.
- the first object deletion instruction includes an identifier of each large object including the invalid data and the valid data determined by the backup server 130 according to the plurality of deletion logs.
- the backup server 130 may send a first object deletion instruction to the object storage system 140 after receiving the large object write completion message.
- After receiving the first object deletion instruction, the object storage system 140 deletes, according to the first object deletion instruction, the large object corresponding to the identifier of each large object including invalid data and valid data.
- the following describes the processing manner in which the backup server 130 determines all the large objects including only the invalid data according to all the deletion logs, and specifically includes the following steps 570 and 580.
- After the backup server 130 determines all the large objects including only invalid data according to the plurality of deletion logs, the following steps 570 to 580 are executed in sequence: the backup server 130 instructs the object storage system 140 to delete all the large objects including only invalid data. See steps 570 and 580 below for the specific implementation.
- After determining, according to the plurality of deletion logs, all the large objects including only invalid data, the backup server 130 sends a second object deletion instruction to the object storage system 140.
- the second object deletion instruction includes an identifier of each large object including only invalid data determined by the backup server 130 according to the plurality of deletion logs.
- After receiving the second object deletion instruction, the object storage system 140 deletes the large object corresponding to the identifier of each large object according to the second object deletion instruction.
- The above steps 510 to 540 describe how the backup system saves the valid data of all the large objects including invalid data and valid data, as determined by the backup server 130 according to all the deletion logs, to the object storage system 140.
- In that implementation, the backup server 130 requests and acquires all the large objects including invalid data and valid data from the object storage system 140, and then sends the valid data in those large objects to the object storage system 140 for saving.
- Different from the method of saving valid data described in steps 510 to 540, this embodiment of the present invention further provides another method for saving valid data: after confirming the invalid data as described above, the backup server 130 uses the multi-segment replication technology to instruct the object storage system 140 to save the valid data of all the large objects including invalid data and valid data to a newly created large object in the object storage system 140.
- After determining all the large objects including invalid data and valid data according to the plurality of deletion logs, the backup server 130 does not request to read those large objects from the object storage system 140. Instead, it uses the multi-segment replication technology to instruct the object storage system 140 to save the valid data of all the large objects including invalid data and valid data to at least one newly created large object in the object storage system 140. This reduces the interaction between the backup server 130 and the object storage system 140 and improves the processing performance of the backup system.
- For the specific implementation in which the backup server 130 instructs the object storage system 140 to save the valid data by using the multi-segment replication technology, refer to the process described in FIG. 6 below.
- FIG. 6 is a flowchart of a method in which a backup server instructs an object storage system, by the multi-segment copy technique, to save valid data according to an embodiment of the present invention.
- The method in which the backup server 130 instructs the object storage system 140, by the multi-segment copy technique, to save valid data includes the following steps.
- After the backup server 130 determines, according to all the deletion logs, all the large objects that contain both invalid data and valid data, it creates an identifier of at least one new large object.
- For how the identifier of the at least one new large object is created in step 610, refer to the implementation of step 511 above; details are not described herein again.
- the backup server 130 creates a valid data move instruction.
- The valid data move instruction includes the identifier of the new large object and the valid data information corresponding to each of at least one large object that contains both valid data and invalid data. The valid data information includes the offset position, within that large object, of each of at least one consecutive valid data segment, and/or the offset position, within that large object, of each of at least one valid data.
- The backup server 130 may confirm, according to step 501, the plurality of valid data in all the large objects that contain both invalid data and valid data and are stored in the object storage system 140, and confirm the offset position of each valid data within its large object.
- The backup server 130 may create the valid data move instruction based on the offset positions, previously confirmed in the above step 611, of the valid data within the large objects that contain both invalid data and valid data.
- When the valid data information includes the offset position of each of at least one valid data within the large object that contains both invalid data and valid data, the at least one valid data may be a plurality of non-contiguous valid data in that large object.
- The valid data move instruction is used to instruct the object storage system 140 to save, according to the valid data information, the at least one consecutive valid data segment stored in the object storage system 140 into the new large object corresponding to the identifier of the new large object.
- The offset position of each consecutive valid data segment within the large object that contains both invalid data and valid data can be represented in two ways, which are described separately below.
- In the first representation, the offset position of each consecutive valid data segment is given by the starting position and the size of the segment within the large object that contains both invalid data and valid data.
- The size of each consecutive valid data segment is determined by the number of valid data it contains. Each valid data is an object, and because the number of data fragments included in each object is fixed, the size of each object is fixed.
- The backup server 130 can therefore determine the size of each consecutive valid data segment as the product of the object size and the number of valid data included in that segment.
- The starting position of each consecutive valid data segment within the large object is determined by the starting position, within the large object, of the first valid data of the segment. If the first valid data of a consecutive valid data segment is the jth object in the large object, where j is an integer greater than 0, the starting position of that first valid data within the large object is the product of (j-1) and the object size.
- For example, if the first valid data of a consecutive valid data segment is the second object in the large object and the object size is 16M, the starting position of that first valid data is the product of (2-1) and 16M, that is, 16M. The starting position of the consecutive valid data segment is therefore 16M.
- In the second representation, the offset position of each consecutive valid data segment is given by the starting position and the ending position of the segment within the large object that contains both invalid data and valid data. The starting position is determined as described for the first representation, and details are not described herein again.
- The ending position of each consecutive valid data segment within the large object is determined by the ending position, within the large object, of the last valid data of the segment.
- If the last valid data of a consecutive valid data segment is the nth object in the large object, where n is an integer greater than 0, the ending position of that last valid data within the large object is the product of n and the object size.
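The two representations above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 1-based index list, and the byte unit chosen for the "16M" object size are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed fixed object size; the text uses "16M" in its examples.
OBJECT_SIZE = 16 * 1024 * 1024


def contiguous_segments(valid_indices: List[int]) -> List[Tuple[int, int]]:
    """Group 1-based indices of valid objects into (first, last) runs."""
    runs: List[Tuple[int, int]] = []
    for idx in sorted(set(valid_indices)):
        if runs and idx == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], idx)  # extend the current run
        else:
            runs.append((idx, idx))        # start a new run
    return runs


def start_and_size(first: int, last: int, object_size: int = OBJECT_SIZE) -> Tuple[int, int]:
    """First representation: start = (j-1) * size, length = count * size."""
    return (first - 1) * object_size, (last - first + 1) * object_size


def start_and_end(first: int, last: int, object_size: int = OBJECT_SIZE) -> Tuple[int, int]:
    """Second representation: start = (j-1) * size, end = n * size."""
    return (first - 1) * object_size, last * object_size
```

A single valid object is simply a run of length one, so the same helpers cover non-contiguous valid data as well.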
- When the valid data information includes the offset position of at least one valid data within the large object that contains both invalid data and valid data, that offset position can likewise be represented in two ways, which are described separately below.
- In the first representation, the offset position of each valid data is given by the starting position and the size of the valid data within the large object that contains both invalid data and valid data.
- Each valid data is an object, so the size of the valid data is fixed.
- The starting position of each valid data within the large object is determined as follows: if the valid data is the kth object in the large object, where k is an integer greater than 0, the starting position of the valid data is the product of (k-1) and the object size.
- For example, if the valid data is the fourth object in the large object and the object size is 16M, the starting position of the valid data is the product of (4-1) and 16M, that is, 48M.
- In the second representation, the offset position of each valid data is given by the starting position and the ending position of the valid data within the large object that contains both invalid data and valid data. The starting position is determined as described for the first representation, and details are not described herein again.
- The ending position of each valid data within the large object is determined as follows: if the valid data is the tth object in the large object, where t is an integer greater than 0, the ending position of the valid data is the product of t and the object size.
- For example, if the valid data is the fourth object in the large object and the object size is 16M, the ending position of the valid data is the product of 4 and 16M, that is, 64M.
- The valid data move instruction may include valid data information corresponding to each of a plurality of large objects that contain both valid data and invalid data.
- The valid data information corresponding to each large object includes the offset position, within that large object, of each of at least one consecutive valid data segment, and/or the offset position, within that large object, of each of at least one valid data.
- The backup server 130 may also create more than one valid data move instruction, where the valid data information in each instruction includes the offset positions of only part of the valid data within the large object that contains both invalid data and valid data. By sending a plurality of valid data move instructions, the backup server 130 can thus send the offset positions of all the valid data in the large object to the object storage system 140.
- The part of the valid data in the large object may be some of the consecutive valid data segments and/or some of the valid data in the large object.
- Accordingly, the offset positions of the part of the valid data include the offset position, within the large object, of each of those consecutive valid data segments, and/or the offset position, within the large object, of each of those valid data.
- the backup server 130 sends a valid data move instruction to the object storage system 140.
- After receiving the valid data move instruction, the object storage system 140 saves the at least one consecutive valid data segment and/or the at least one valid data into the new large object, according to their offset positions, recorded in the valid data information of the instruction, within the large object that contains both invalid data and valid data.
- Specifically, after receiving the valid data move instruction, the object storage system 140 first checks, according to the identifier of the new large object in the instruction, whether the new large object corresponding to that identifier has been created. If it has not been created, the object storage system 140 creates the new large object according to the identifier and then saves the valid data according to the valid data information in the instruction.
- If the new large object corresponding to the identifier has already been created, the object storage system 140 directly saves the at least one consecutive valid data segment and/or the at least one valid data, located at the offset positions given in the valid data information within the large object that contains both invalid data and valid data, into the new large object corresponding to the identifier of the new large object.
- Through steps 610 to 620 above, all the valid data of all the large objects that contain both invalid data and valid data are saved into a new large object in the object storage system 140.
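The handling of a valid data move instruction on the object storage side can be sketched with an in-memory stand-in. Everything here is hypothetical and for illustration only: the `MoveInstruction` and `InMemoryObjectStore` names, the use of (start, end) byte ranges, and the returned list of new offsets are assumptions, and a real object storage system would perform the copy server-side through its own multi-segment copy interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MoveInstruction:
    """Hypothetical shape of a valid data move instruction."""
    new_object_id: str
    # source large-object id -> (start, end) byte ranges of valid data
    valid_ranges: Dict[str, List[Tuple[int, int]]]


class InMemoryObjectStore:
    """Toy stand-in for the object storage system (assumption)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytearray] = {}

    def handle_move(self, instr: MoveInstruction) -> List[Tuple[str, int, int]]:
        # Create the new large object only if it does not exist yet.
        target = self.objects.setdefault(instr.new_object_id, bytearray())
        moved: List[Tuple[str, int, int]] = []
        for src_id, ranges in instr.valid_ranges.items():
            src = self.objects[src_id]
            for start, end in ranges:
                # Record (source id, old offset, new offset) for each range.
                moved.append((src_id, start, len(target)))
                target += src[start:end]  # in-place append, no client read
        return moved
```

The returned triples are one possible input for building the pointers after valid data movement; the source text leaves that bookkeeping to the backup server.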
- Compared with the method for saving valid data described in steps 510 to 540 above, in the method of FIG. 6 the backup server 130 instructs the object storage system 140 to save the valid data by the multi-segment copy technique. After determining, according to the plurality of deletion logs, all the large objects that contain both invalid data and valid data, the backup server 130 does not request to read those large objects from the object storage system 140. Instead, it instructs the object storage system 140 to save the valid data in those large objects into at least one new large object of the object storage system 140, which reduces the interaction between the backup server 130 and the object storage system 140 and improves the processing performance of the backup system.
- After step 620 shown in FIG. 6, that is, after the object storage system 140 saves the valid data of all the large objects that contain both invalid data and valid data into the at least one new large object, steps 541 to 560 are executed. The backup server 130 creates a pointer after valid data movement, which indicates the position, within the new large object, of the valid data after it is saved to the new large object.
- After creating the pointer after valid data movement, the backup server 130 records the correspondence between the pointers before and after the valid data movement, and deletes the large objects that contain both invalid data and valid data, as determined according to the plurality of deletion logs, thereby deleting the expired data.
- For details of steps 541 to 560, refer to the description above; details are not described herein again.
- After the backup system backs up the virtual machine disk data, if the client 110 needs to recover the virtual machine disk data, the client 110 sends a data recovery request to the backup server 130. After receiving the data recovery request, the backup server 130 performs a recovery process. The backup server 130 obtains the virtual machine disk data corresponding to the backup to be restored according to the to-be-restored backup identifier included in the data recovery request, and restores the virtual machine disk data from it. The backup to be restored is an unexpired backup.
- Virtual machine disk data is restored infrequently after it has been backed up. Therefore, in the expired backup processing method provided by the embodiment of the present invention, each time an expired backup is processed and the valid data in the expired backup is moved to a new large object, the backup metadata corresponding to all the unexpired backups is not updated. Instead, a move log is created, and the pointers after movement of the valid data in the expired backup are saved in the move log. When virtual machine disk data is later restored from a backup to be restored among the unexpired backups, the backup metadata corresponding to the backup to be restored can be determined according to the post-move pointers recorded in the move log.
- The virtual machine disk data corresponding to the backup to be restored is then obtained from the object storage system 140 according to the backup metadata corresponding to the backup to be restored.
- A method for recovering virtual machine disk data by using the move log according to an embodiment of the present invention is described below with reference to FIG. 7.
- FIG. 7 is a flowchart of a method for restoring virtual machine disk data according to an embodiment of the present invention. As shown in FIG. 7, the method includes the following steps.
- the backup server 130 receives the data recovery request sent by the client 110.
- the data recovery request includes a target disk identifier, a virtual machine disk identifier, and a to-be-restored backup identifier corresponding to the to-be-recovered backup of the virtual machine disk data.
- the data recovery request is used to indicate that the virtual machine disk data is restored to the target disk based on the to-be-recovered backup corresponding to the backup identifier to be restored.
- the virtual machine disk data includes a plurality of consecutive data fragments.
- the backup of the virtual machine disk data corresponding to the to-be-restored backup identifier in the data recovery request is a backup to be restored.
- The target disk identifier may be, for example, Chinese characters such as "production data" or "business data", or may be letters, numbers, or other symbols, or a combination of letters, numbers, and other symbols. The specific form is not limited in this embodiment.
- The target disk may be the virtual machine disk that previously stored the virtual machine disk data, or it may be another disk.
- The target disk may be deployed in the storage node 120 where the virtual machine disk is located or in the backup server 130, or may be deployed in another physical device, such as another storage device, which may be a storage array.
- the backup server 130 may receive a data recovery request sent by the client 110.
- the backup identifier to be restored in the data recovery request can be implemented by using the backup time identifier or the backup version identifier or the backup number identifier.
- The backup server 130 creates a metadata acquisition request for the backup to be restored.
- The metadata acquisition request for the backup to be restored includes the virtual machine disk identifier and the to-be-restored backup identifier.
- The backup server 130 sends the metadata acquisition request for the backup to be restored to the object storage system 140.
- The backup server 130 pre-stores the correspondence between the to-be-restored backup identifier and the metadata object identifier corresponding to the backup metadata of the backup to be restored. Before sending the metadata acquisition request, the backup server 130 looks up, according to this correspondence, the metadata object identifier corresponding to the backup metadata of the backup to be restored, and then creates the metadata acquisition request according to that metadata object identifier.
- The metadata acquisition request includes the metadata object identifier corresponding to the backup metadata of the backup to be restored, and is used to instruct the object storage system 140 to obtain the backup metadata of the backup to be restored according to that metadata object identifier.
- The backup server 130 sends a move log acquisition request to the object storage system 140.
- The move log acquisition request includes an object identifier of the move log.
- The move log acquisition request is used to instruct the object storage system 140 to send all the move logs of the virtual machine disk.
- The move log is saved in the object storage system 140 as an object.
- The backup server 130 pre-stores the correspondence between the virtual machine disk identifier and the object identifier of the move log.
- The backup server 130 looks up the object identifier of the move log according to the correspondence between the virtual machine disk identifier and the object identifier of the move log.
- Steps 703 and 702 may be performed in any order.
- The object storage system 140 looks up the backup metadata of the backup to be restored according to the metadata acquisition request for the backup to be restored.
- The object storage system 140 looks up the move log corresponding to the object identifier of the move log.
- Steps 711 and 710 may be performed in any order.
- The object storage system 140 sends the move log corresponding to the object identifier of the move log to the backup server 130.
- Steps 712 and 710 may be performed in any order.
- The object storage system 140 sends the backup metadata of the backup to be restored to the backup server 130.
- Steps 713 and 712 may be performed in any order, as may steps 713 and 711.
- In step 720, after receiving the backup metadata of the backup to be restored and all the move logs of the virtual machine disk from the object storage system 140, the backup server 130 updates the backup metadata of the backup to be restored according to all the move logs and obtains the modified backup metadata.
- The move logs received by the backup server 130 are all the move logs created by the expired backup processing method provided by the embodiment of the present invention, so the backup metadata of the backup to be restored needs to be updated according to the post-move pointers recorded in all the move logs.
- Specifically, the backup server 130 finds each pointer in the backup metadata that is identical to a pointer before valid data movement recorded in a move log, and replaces it with the pointer after valid data movement that the move log records for that pre-move pointer.
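The pointer update described above amounts to a lookup-and-replace pass over the backup metadata. The sketch below is illustrative only: representing a pointer as a `(large_object_id, offset)` tuple, representing a move log as a dictionary from pre-move pointer to post-move pointer, and applying the logs in chronological order are all assumptions not stated in the source.

```python
from typing import Dict, List, Tuple

# Hypothetical pointer form: (large object identifier, byte offset).
Pointer = Tuple[str, int]
MoveLog = Dict[Pointer, Pointer]  # pre-move pointer -> post-move pointer


def apply_move_logs(pointers: List[Pointer], move_logs: List[MoveLog]) -> List[Pointer]:
    """Return the modified backup metadata pointers.

    Logs are applied oldest first, so data that was moved more than once
    ends up at its latest recorded location.
    """
    current = list(pointers)
    for log in move_logs:
        # Pointers not mentioned in this log are left unchanged.
        current = [log.get(p, p) for p in current]
    return current
```

The order of the pointers is preserved, so the result still describes the objects in their on-disk order.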
- After obtaining the modified backup metadata corresponding to the backup to be restored, the backup server 130 creates a to-be-recovered backup object acquisition request according to the modified backup metadata.
- The to-be-recovered backup object acquisition request includes pointers recorded in the modified backup metadata.
- After updating the backup metadata of the backup to be restored to obtain the modified backup metadata, and before creating the to-be-recovered backup object acquisition request, the backup server 130 first confirms, following the order of the pointers recorded in the modified backup metadata, a plurality of consecutive pointers whose corresponding objects belong to the same large object. The backup server 130 then creates a to-be-recovered backup object acquisition request that includes the plurality of consecutive pointers. This request instructs the object storage system 140 to look up, with a single request, multiple objects of the backup to be restored that are located in the same large object.
- The backup server 130 may need to create multiple to-be-recovered backup object acquisition requests, each of which includes only a plurality of consecutive pointers or a single independent pointer.
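Grouping consecutive pointers by large object can be sketched as a single pass over the modified backup metadata. The `(large_object_id, offset)` pointer form and the function name are assumptions made for illustration.

```python
from typing import List, Tuple

Pointer = Tuple[str, int]  # hypothetical: (large object identifier, offset)


def build_requests(pointers: List[Pointer]) -> List[List[Pointer]]:
    """Split the pointer list into runs that target the same large object.

    Each run becomes one acquisition request; a pointer whose neighbours
    belong to other large objects forms a request on its own.
    """
    requests: List[List[Pointer]] = []
    for ptr in pointers:
        if requests and requests[-1][0][0] == ptr[0]:
            requests[-1].append(ptr)   # same large object as previous pointer
        else:
            requests.append([ptr])     # start a new request
    return requests
```

Because only adjacent pointers are merged, the number of requests depends on how the pointers are ordered in the metadata.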
- The backup server 130 sends the to-be-recovered backup object acquisition request to the object storage system 140.
- After receiving the to-be-recovered backup object acquisition request, the object storage system 140 looks up the objects in the virtual machine disk data corresponding to the backup to be restored according to the pointers recorded in the modified backup metadata.
- The object storage system 140 sends the objects it has found in the virtual machine disk data corresponding to the backup to be restored to the backup server 130.
- After receiving the objects in the virtual machine disk data corresponding to the backup to be restored, the backup server 130 sends a backup recovery indication to the storage node 120 where the target disk corresponding to the target disk identifier is located.
- The backup recovery indication includes the target disk identifier and the objects in the virtual machine disk data corresponding to the backup to be restored.
- The storage node 120 where the target disk is located receives the objects in the virtual machine disk data corresponding to the backup to be restored and saves them to the target disk corresponding to the target disk identifier.
- The backup server 130 may need to create a plurality of to-be-recovered backup object acquisition requests and send them to the object storage system 140.
- An embodiment of the present invention further provides a method for collating the modified backup metadata corresponding to the backup to be restored: the pointers in the modified backup metadata are reordered so that the pointers corresponding to objects belonging to the same large object are arranged together.
- Two methods for collating the modified backup metadata are described separately below.
- In the first implementation of collating the modified backup metadata corresponding to the backup to be restored, the backup server 130 confirms, according to the modified backup metadata, the plurality of pointers corresponding to the plurality of objects that belong to the same large object, and saves those pointers into a first storage space pointed to by a contiguous address. All the pointers in the modified backup metadata are saved into a second storage space pointed to by a contiguous address.
- The second storage space includes the first storage space.
- If there are a plurality of large objects, the backup server 130 confirms in turn the plurality of pointers corresponding to the objects included in each large object, and saves the pointers of each large object into one first storage space. In this way, the pointers corresponding to all the objects in the plurality of large objects are saved in a plurality of first storage spaces.
- The contiguous addresses corresponding to the plurality of first storage spaces may or may not be contiguous with each other.
- If they are contiguous with each other, no independent pointer lies between the pointers stored in any two first storage spaces, that is, there is no pointer whose object does not belong to one of those large objects.
- If they are not contiguous with each other, independent pointers are arranged between the pointers stored in the first storage spaces. An independent pointer is a pointer whose object does not belong to the same large object as the objects of the pointers adjacent to it.
- The backup server 130 can confirm the plurality of pointers corresponding to the objects that belong to the same large object, and confirm the independent pointers corresponding to the independent objects that do not belong to the same large object. The backup server 130 then saves the independent pointers into a third storage space, where the end address of the first storage space and the start address of the third storage space are contiguous, or the start address of the first storage space and the end address of the third storage space are contiguous.
- The second storage space includes the first storage space and the third storage space. If there are a plurality of large objects, the second storage space includes a plurality of first storage spaces.
- In the second implementation of collating the modified backup metadata corresponding to the backup to be restored, the backup server 130 confirms, according to the modified backup metadata, the plurality of pointers corresponding to the objects that belong to the same large object, creates a first index, and stores the correspondence between the first index and those pointers. The backup server 130 also confirms the independent pointers corresponding to the independent objects that do not belong to the same large object, creates a second index for each of them, and stores the correspondence between the second index and the independent pointer.
- If there are a plurality of large objects, there are a plurality of first indexes, and the number of first indexes equals the number of large objects. If there are a plurality of independent pointers, there are a plurality of second indexes, and the number of second indexes equals the number of independent pointers. All the first indexes and all the second indexes are stored in a storage space pointed to by a contiguous address.
- An embodiment of the present invention further provides a different data backup method.
- A specific implementation of backing up the virtual machine disk data corresponding to at least one virtual machine disk to the object storage system 140 includes the following steps.
- After obtaining a plurality of consecutive data fragments of the virtual machine disk that are to be backed up to the object storage system 140, the backup server 130 determines, according to the positions of the data fragments in the virtual machine disk data, data sets that each consist of a predetermined number of consecutive data fragments.
- When the backup server 130 determines the first data set consisting of the predetermined number of consecutive data fragments, it calculates the weak hash value of the data set, creates an identifier of a new large object, and saves the correspondence between the identifier of the new large object and the weak hash value.
- After creating the identifier of the new large object, the backup server 130 sends a first data set save instruction to the object storage system 140.
- The first data set save instruction includes the first data set and the identifier of the new large object.
- The first data set save instruction is used to instruct the object storage system 140 to save the first data set into the new large object corresponding to the identifier of the new large object.
- After the backup server 130 sends the first data set save instruction to the object storage system 140, the object storage system 140 saves the first data set into the new large object corresponding to the identifier of the new large object.
- the weak hash value of the other second data set is calculated first, and then whether the other second data is saved in the object storage system 140 is detected.
- the data set of the weak hash value of the set if any, queries the identifier of the new large object to which the data set similar to the weak hash value of the other data set belongs, and the backup server 130 detects the other Whether the new large object to which the weakly hashed data set of the two data sets belongs reaches a predefined size.
- the backup server 130 sends another second data set save command to The object storage system 140, the another second data set holding instruction includes an identifier of the other second data set and a new large object to which the data set similar to the weak hash value of the other second data set belongs.
- the object storage system 140 After receiving the another second data set save instruction, the object storage system 140 saves the other second data set to a new data set that is similar to the weak hash value of the other second data set.
- the data set that approximates the weak hash value of the other second data set may be the first data set.
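The placement steps above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the weak hash function or define "similar", so Adler-32 and a shared-upper-16-bits rule are stand-ins, and `LargeObjectIndex`, `choose_large_object`, and the identifier format are hypothetical names.

```python
import zlib
from dataclasses import dataclass, field


def weak_hash(data_set: bytes) -> int:
    # Adler-32 stands in for the weak hash; the source does not name one.
    return zlib.adler32(data_set)


def similar(h1: int, h2: int) -> bool:
    # Assumed similarity rule: both hashes share their upper 16 bits.
    return (h1 >> 16) == (h2 >> 16)


@dataclass
class LargeObjectIndex:
    predefined_size: int                                # size at which a large object counts as full
    hash_to_object: dict = field(default_factory=dict)  # weak hash -> large object identifier
    stored_bytes: dict = field(default_factory=dict)    # large object identifier -> bytes saved
    created: int = 0

    def choose_large_object(self, data_set: bytes) -> str:
        """Return the identifier of the large object the data set should be saved to."""
        h = weak_hash(data_set)
        for known_hash, object_id in self.hash_to_object.items():
            if similar(h, known_hash) and self.stored_bytes[object_id] < self.predefined_size:
                break  # a similar data set lives in a large object that is not yet full
        else:
            # No usable large object: create the identifier of a new one.
            self.created += 1
            object_id = f"large-object-{self.created}"
            self.stored_bytes[object_id] = 0
        self.hash_to_object[h] = object_id
        self.stored_bytes[object_id] += len(data_set)
        return object_id
```

In this sketch the returned identifier is what a save instruction would carry together with the data set; sending the instruction to the object storage system is left out.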
- The backup server implements the functions of the backup server 130 in the foregoing system embodiment, and these functions can be implemented by hardware executing corresponding software.
- FIG. 8 is a structural diagram of a backup server according to an embodiment of the present invention.
- the backup server 130 includes a controller 210 and a storage device 220.
- the controller 210 is connected to the storage device 220.
- the backup server 130 shown in FIG. 8 can be applied to the storage system shown in FIG. 1.
- Storage device 220 is used to provide storage services to controller 210.
- The controller 210 is configured to, upon receiving a data backup request sent by the client 110 or determining that a preset time has been reached, perform a full backup or an incremental backup of the data on the disk of the storage node 120 to the object storage system 140, and to create and save backup metadata and backup attribute information.
- the full backup refers to backing up all the data on the disk of the storage node 120 to the storage system.
- the incremental backup refers to backing up the modified data on the disk of the storage node 120 to the object storage system.
- Backup metadata indicates the position, within the disk data, of each object that makes up the disk data.
- The backup metadata can record the identifier and pointer of each object that makes up the disk data, with the pointers recorded in the order in which the objects are arranged on the disk.
- The backup attribute information includes the backup identifier of the backup, the backup time, and the identifier of the backup metadata.
- The data backup request sent by the client 110 includes the disk identifier of the disk to be backed up, and the controller 210 performs a full backup or an incremental backup of the data on the disk indicated by that disk identifier to the object storage system 140.
- For the full backup and the incremental backup, refer to the description of the backup server 130 in the foregoing method embodiment; details are not repeated here.
- the backup server 130 in the backup system provided by the embodiment of the present invention deletes the expired backup as needed.
- The controller 210 is configured to, when detecting that the total number of copies of backup attribute information of the disk exceeds a predetermined value, determine from the backup times in all the backup attribute information that the earliest backup is an expired backup. After determining the expired backup, the controller 210 identifies the invalid data in the expired backup by using the backup metadata of the expired backup and the backup metadata of the next backup adjacent to the expired backup, creates a deletion log, and saves the pointers of the invalid data in the expired backup to the deletion log. After the deletion log is created, the controller 210 is further configured to delete the backup attribute information of the expired backup, and to create and save the correspondence between the disk identifier and the identifier of the deletion log.
- Invalid data refers to an object in the expired backup that includes a data fragment modified in the next backup adjacent to the expired backup.
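As a minimal sketch of how an expired backup and its invalid data might be identified, the following assumes that backup metadata is an ordered list of object pointers (one per disk position), that pointers are plain strings, and that both backups cover the same number of positions. The names `Backup`, `expired_and_next`, and `build_deletion_log` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    backup_id: str
    backup_time: int   # for example, a Unix timestamp
    pointers: list     # backup metadata: one object pointer per disk position


def expired_and_next(backups, max_copies):
    """Return (expired backup, adjacent next backup), or None if nothing has expired."""
    if len(backups) <= max_copies or len(backups) < 2:
        return None
    ordered = sorted(backups, key=lambda b: b.backup_time)
    return ordered[0], ordered[1]


def build_deletion_log(expired, next_backup):
    """Pointers of objects in the expired backup that the next backup replaced."""
    return [old for old, new in zip(expired.pointers, next_backup.pointers) if old != new]
```

A pointer that is unchanged between the two backups refers to valid data and is therefore not written to the deletion log.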
- After detecting that the deletion condition is met, the controller 210 acquires the multiple deletion logs corresponding to the disk data according to the correspondence between the disk identifier and the identifiers of the deletion logs.
- The deletion condition may be that the number of deletion logs corresponding to the disk data reaches a preset deletion threshold, that a preset deletion time is reached, or that a timer started when the deletion condition was last met has expired.
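The three alternative deletion conditions can be expressed as one small check. This is a sketch only: the function name, the parameter names, and the use of plain numbers for times are assumptions, and an implementation would normally enable just one of the three alternatives.

```python
def deletion_condition_met(log_count, now, last_met, *,
                           log_threshold=None, deletion_time=None, timer_interval=None):
    """Return True if any enabled deletion condition holds.

    log_threshold  -- preset deletion threshold on the number of deletion logs
    deletion_time  -- preset deletion time
    timer_interval -- length of the timer started when the condition was last met
    """
    if log_threshold is not None and log_count >= log_threshold:
        return True
    if deletion_time is not None and now >= deletion_time:
        return True
    if timer_interval is not None and now - last_met >= timer_interval:
        return True
    return False
```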
- the controller 210 is further configured to determine, according to the plurality of deletion logs, a target large object that includes the valid data and the invalid data saved in the object storage system 140, and determine a large object that includes only invalid data.
- the valid data refers to an object including an unmodified data fragment in the expired backup with respect to the next backup adjacent to the expired backup.
- The controller 210 is further configured to send a data migration indication and an object deletion indication to the object storage system 140. The data migration indication instructs the object storage system 140 to migrate the valid data in the target large object to another large object, and the object deletion indication instructs the object storage system 140 to delete the target large object.
- For a large object that includes only invalid data, the controller 210 is further configured to send an object deletion indication to the object storage system 140, instructing it to delete that large object.
- After valid data is moved, its pointer in the large object changes. To avoid having to update the pointers of the moved valid data in the backup metadata of every other backup, the correspondence between the pointer before the move and the pointer after the move is recorded, so that when a particular backup later needs to be accessed, its backup metadata can be checked against that correspondence.
- The controller 210 is further configured to create a movement log and save to it the correspondence between the pointer before the valid data was moved and the pointer after the move.
- When the controller 210 accesses moved valid data in the disk data through the backup metadata of an unexpired backup, it updates that backup metadata according to the correspondence saved in the movement log: each pointer in the backup metadata that equals a pre-move pointer recorded in the movement log is replaced with the corresponding post-move pointer.
- When the controller 210 receives a data recovery request sent by the client 110 and has determined the backup to be restored from among all the unexpired backups, it modifies the backup metadata of that backup according to the movement log, obtains all the objects that make up the disk data according to the modified backup metadata, and then restores the disk data to the target disk.
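The pointer rewrite performed at restore time can be sketched as follows. The sketch assumes that backup metadata is an ordered list of pointer strings and that each movement log is a mapping from pre-move pointer to post-move pointer; applying the logs in creation order, so that data moved more than once is followed to its latest position, is also an assumption rather than something the text specifies. `apply_movement_logs` is a hypothetical name.

```python
def apply_movement_logs(pointers, movement_logs):
    """Return modified backup metadata with moved pointers replaced.

    pointers      -- backup metadata of the backup to be restored
    movement_logs -- list of {pre_move_pointer: post_move_pointer}, oldest first
    """
    modified = list(pointers)
    for log in movement_logs:
        # Pointers not recorded in this log are left unmodified.
        modified = [log.get(pointer, pointer) for pointer in modified]
    return modified
```

The stored backup metadata is not changed by this function; only the copy used for the restore carries the post-move pointers.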
- the controller 210 includes a first interface 211, a second interface 212, and a control module 213.
- the control module 213 is connected to the first interface 211 and the second interface 212 respectively.
- the second interface 212 is for communicating with the storage device 220 and the object storage system 140.
- The control module 213 is configured to implement the functions of the controller 210. For details of the specific functions, refer to the function description of the controller 210.
- the control module 213 includes a processor 214 and a memory 215.
- the processor 214 is connected to the first interface 211 and the second interface 212.
- the processor 214 is configured to implement the functions of the controller 210.
- the processor 214 is coupled to a memory 215 that is coupled to the first interface 211 and the second interface 212 for temporarily storing information transmitted from the client or object storage system 140.
- the memory 215 is also used to store software programs as well as application modules.
- The processor 214 implements the various functions of the backup server 130 by running the software programs and application modules stored in the memory 215.
- The processor 214 can be any computing device, for example a general-purpose central processing unit (CPU), a microprocessor, a programmable processor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the above programs.
- processor 214 can include one or more CPUs.
- The memory 215 may include a volatile memory, such as a random-access memory (RAM); the memory 215 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
- FIG. 9 is a structural diagram of another backup server according to an embodiment of the present invention.
- the backup server 900 includes a creation module 910, a detection module 920, a determination module 930, and a first transceiver module 940.
- The modules in the backup server 900 are connected as follows: the detection module 920 is connected to the creation module 910 and the determination module 930 respectively, and the first transceiver module 940 is connected to the determination module 930.
- the creation module 910, the detection module 920, and the determination module 930 can be implemented by the controller 210 or the processor 214 shown in FIG. 8 in a specific implementation.
- the first transceiver module 940 can be implemented by the second interface 212 shown in FIG. 8 during specific implementation.
- the functions of the various modules shown in Figure 9 are described as follows:
- The creation module 910 is configured to, each time the backup server 900 determines an expired backup of the first disk data, create a deletion log of the first disk data and save the pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all the unexpired backups of the first disk data in the object storage system as of the current time.
- For specific implementation details of how the creation module 910 creates the deletion log of the first disk data and saves the pointers of the invalid data in the expired backup to the deletion log, refer to steps 400-421 shown in FIG. 4; details are not repeated here.
- the detecting module 920 is configured to detect whether the deletion condition is met, and if the deletion condition is met, obtain multiple deletion logs corresponding to the first disk data.
- The determination module 930 is configured to determine, according to the multiple deletion logs, a first target large object saved in the object storage system that includes both valid data and invalid data. For specific implementation details, refer to step 501 shown in FIG. 5; details are not repeated here.
- The first transceiver module 940 is configured to send a data migration indication and an object deletion indication to the object storage system. The data migration indication instructs the object storage system to migrate the valid data in the first target large object to another large object, and the object deletion indication instructs the object storage system to delete the first target large object.
- For specific implementation details of how the first transceiver module 940 sends the data migration indication to the object storage system, refer to steps 510-540 shown in FIG. 5 or steps 610-620 shown in FIG. 6. For specific implementation details of sending the object deletion indication, refer to steps 550-560 shown in FIG. 5; details are not repeated here.
- The detection module 920 is further configured to: detect whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, and if it does, the deletion condition is met; or start a timer after the deletion condition was last met and detect whether the timer has expired, and if it has, the deletion condition is met.
- The creation module 910 is further configured to create the post-move pointer of the valid data, create a movement log of the first disk data, and save to the movement log the correspondence between the pre-move pointer and the post-move pointer of the valid data. The post-move pointer indicates the position of the valid data in the other large object after the valid data has been moved there.
- For specific implementation details of how the creation module 910 creates the post-move pointer, creates the movement log, and saves the correspondence to the movement log, refer to steps 541-544 shown in FIG. 5; details are not repeated here.
- FIG. 10 is a structural diagram of another backup server 1000 according to an embodiment of the present invention.
- the backup server 1000 further includes: a second transceiver module 1010 and a processing module 1020.
- the processing module 1020 can be implemented by the controller 210 or the processor 214 shown in FIG. 8 in a specific implementation.
- the second transceiver module 1010 can be implemented by using the first interface 211 shown in FIG. 8 during specific implementation.
- The functions of the modules shown in FIG. 10 that differ from those shown in FIG. 9 are as follows:
- The second transceiver module 1010 is configured to receive a data recovery request sent by the client 110. The data recovery request includes a first disk identifier, a second disk identifier, and the backup identifier of the backup to be restored, and is used to instruct that the first disk data be restored to the second disk based on the backup to be restored that corresponds to that backup identifier. The backup to be restored is any one of the unexpired backups of the first disk data.
- For specific implementation details of receiving the data recovery request sent by the client, refer to step 700 shown in FIG. 7; details are not repeated here.
- The processing module 1020 is configured to obtain the backup metadata of the backup to be restored and all the movement logs of the first disk data. For specific implementation details, refer to steps 701-713 shown in FIG. 7; details are not repeated here.
- The processing module 1020 is further configured to check, against all the movement logs of the first disk data, whether the backup metadata of the backup to be restored contains a pointer identical to a pre-move pointer of valid data recorded in a movement log. If so, it replaces that pointer in the backup metadata with the post-move pointer corresponding to the pre-move pointer, obtaining modified backup metadata.
- The modified backup metadata includes unmodified pointers and modified pointers; each modified pointer is a post-move pointer of valid data recorded in the movement log.
- The first transceiver module 940 is further configured to acquire, according to the modified backup metadata, the first disk data corresponding to the backup to be restored. For specific implementation details, refer to steps 722-731 shown in FIG. 7; details are not repeated here.
- The processing module 1020 is further configured to save the first disk data to the second disk. For specific implementation details, refer to steps 740-750 shown in FIG. 7; details are not repeated here.
- FIG. 11 is a structural diagram of another backup server according to an embodiment of the present invention.
- the creation module 910 is coupled to the processing module 1020.
- The creation module 910 is further configured to create an object identifier of the movement log. For details, refer to step 543 shown in FIG. 5; details are not repeated here.
- The processing module 1020 is further configured to save the correspondence between the first disk identifier and the object identifier of the movement log, and to send a movement log storage request to the object storage system. The movement log storage request includes the object identifier of the movement log and the movement log, and instructs the object storage system to save the movement log to the object corresponding to that object identifier.
- The processing module 1020 is further configured to acquire the object identifier of the movement log according to the first disk identifier.
- The first transceiver module 940 is further configured to send a movement log obtaining request to the object storage system. The request includes the object identifier of the movement log and instructs the object storage system to obtain the movement log from the object corresponding to that object identifier.
- The first transceiver module 940 is further configured to receive the movement log sent by the object storage system. For specific implementation details, refer to steps 710-712 shown in FIG. 7; details are not repeated here.
- FIG. 12 is a structural diagram of another backup server according to an embodiment of the present invention.
- the first transceiver module 940 is connected to the detection module 920.
- The determination module 930 is further configured to determine, according to the multiple deletion logs, a first target large object saved in the object storage system that includes invalid data, and to determine the data volume of all invalid data in the first target large object from the predefined size of a piece of invalid data and the number of pieces of invalid data that the first target large object includes.
- The first transceiver module 940 is further configured to send a data volume determination request to the object storage system. The request includes the identifier of the first target large object and instructs the object storage system to send the data volume of the first target large object.
- The first transceiver module 940 is further configured to receive data volume attribute information, which includes the data volume of the first target large object.
- The detection module 920 is further configured to determine, when the data volume of all invalid data in the first target large object is smaller than the data volume of the first target large object given in the data volume attribute information, that the first target large object saved in the object storage system includes both invalid data and valid data.
- The creation module 910 is configured to, each time the backup server 900 determines an expired backup of the first disk data, create a deletion log of the first disk data and save the pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all the unexpired backups of the first disk data in the object storage system as of the current first moment. For specific implementation details of how the creation module 910 creates the deletion log and saves the pointers to it, refer to steps 400-421 shown in FIG. 4; details are not repeated here.
- the detecting module 920 is configured to detect whether the deletion condition is met, and if the deletion condition is met, obtain multiple deletion logs corresponding to the first disk data;
- The determination module 930 is configured to determine, according to the multiple deletion logs, a large object saved in the object storage system that includes only invalid data. For specific implementation details, refer to step 501 shown in FIG. 5; details are not repeated here.
- the first transceiver module 940 is configured to send an object deletion indication to the object storage system, where the object deletion indication is used to instruct the object storage system to delete the large object that only includes invalid data.
- the implementation details of the first transceiver module 940 for transmitting the object deletion indication to the object storage system may refer to the content described in steps 570 and 580 shown in FIG. 5, and specific implementation details are not described herein.
- The determination module 930 is further configured to determine, according to the multiple deletion logs, a large object that includes invalid data, and to determine the data volume of all invalid data in that large object from the predefined size of a piece of invalid data and the number of pieces of invalid data that the large object includes.
- The first transceiver module 940 is further configured to send a data volume determination request to the object storage system. The request includes the identifier of the large object that includes invalid data and instructs the object storage system to send the data volume of that large object.
- The first transceiver module 940 is further configured to receive data volume attribute information, which includes the data volume of the large object that includes invalid data.
- The detection module 920 is further configured to determine, when the data volume of all invalid data in the large object is the same as the data volume of that large object given in the data volume attribute information, that the large object saved in the object storage system includes only invalid data.
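The size comparison used to tell a mixed large object from one that holds only invalid data can be sketched as below. It assumes that each pointer in a deletion log is a `(large_object_id, offset)` pair, that every piece of invalid data has the same predefined size, and that `large_object_sizes` stands in for the data volume attribute information returned by the object storage system. `classify_large_objects` is a hypothetical name.

```python
from collections import Counter


def classify_large_objects(deletion_logs, invalid_object_size, large_object_sizes):
    """Split large objects into (mixed, only_invalid) lists of identifiers."""
    # The same pointer may appear in more than one deletion log; count it once.
    unique_pointers = {pointer for log in deletion_logs for pointer in log}
    invalid_count = Counter(object_id for object_id, _offset in unique_pointers)

    mixed, only_invalid = [], []
    for object_id, count in sorted(invalid_count.items()):
        invalid_bytes = count * invalid_object_size
        if invalid_bytes == large_object_sizes[object_id]:
            only_invalid.append(object_id)   # can be deleted directly
        elif invalid_bytes < large_object_sizes[object_id]:
            mixed.append(object_id)          # valid data must be migrated first
    return mixed, only_invalid
```

In this sketch, objects in the first list would receive a data migration indication followed by an object deletion indication, and objects in the second list only an object deletion indication.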
- The magnitude of the sequence numbers of the above processes does not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
- The disclosed device, apparatus, and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- The division into units is only a division by logical function.
- In actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated modules can be implemented in the form of hardware or in the form of hardware plus software function modules.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- The computer instructions can be stored in a computer-readable storage medium. The computer-readable storage medium can be any usable medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more usable media.
- The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a Digital Video Disc (DVD)), or a semiconductor medium (for example, a Solid State Disk (SSD)).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Retry When Errors Occur (AREA)
Abstract
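An expired-backup processing method, performed by a backup server (130), includes the following. Each time the backup server (130) determines an expired backup of first disk data, it creates a deletion log of the first disk data and saves pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all the unexpired backups of the first disk data in the object storage system (140) as of the current first moment. Later, after detecting that a deletion condition is met, the backup server (130) obtains the multiple deletion logs corresponding to the first disk data, determines from the deletion logs a large object that includes both invalid data and valid data, moves the valid data to another large object, and instructs the object storage system (140) to delete the large object that includes invalid data. In this way, expired backups are processed for disk data that is incrementally backed up in large-object storage mode.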
Description
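Embodiments of the present invention relate to the storage field, and in particular to an expired-backup processing method and a backup server.
To keep data safe, after data has been saved to a disk, the disk data can be backed up to an object storage system through a backup server in object storage mode. Specifically, the object storage system saves the consecutive data fragments that make up the disk data as at least one object, each object including at least one consecutive data fragment. If the disk backup data is scattered across multiple objects, the backup server must access multiple objects of the object storage system, in multiple accesses, whenever it later needs that backup data. To avoid this, the backup server sends at least two objects and one large object identifier to the object storage system, and the object storage system saves the at least two objects as one large object, so that accessing the large object reduces the number of accesses to the object storage system.
To reduce the amount of data backed up, incremental backup is also commonly used. However, there is currently no technique for incrementally backing up disk data in large-object storage mode.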
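Summary of the Invention
Embodiments of the present invention provide an expired-backup processing method and a backup server, which address how to process expired backups when disk data is incrementally backed up in large-object storage mode.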
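In a first aspect, an embodiment of the present invention provides an expired-backup processing method performed by a backup server. The method includes the following. Each time the backup server determines an expired backup of first disk data, it creates a deletion log of the first disk data and saves pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all the unexpired backups of the first disk data in the object storage system as of the current first moment. Later, after detecting that a deletion condition is met, the backup server obtains the multiple deletion logs corresponding to the first disk data. The purpose of obtaining the multiple deletion logs is to determine, from them, a first target large object saved in the object storage system that includes both valid data and the invalid data. After determining the first target large object, the backup server sends a data migration indication and an object deletion indication to the object storage system. The data migration indication instructs the object storage system to migrate the valid data in the first target large object to another large object, and the object deletion indication instructs the object storage system to delete the first target large object.
In the present invention, a deletion log is created after an expired backup is determined, and pointers of the invalid data in the expired backup are saved to the deletion log. The backup server can later use the deletion logs to determine the large objects that include both invalid data and valid data, and thereby process expired backups for disk data that is incrementally backed up in large-object storage mode.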
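After the backup server determines the first target large object that includes invalid data and valid data, it can send the data migration indication and the object deletion indication to the object storage system in several ways. For example, the two indications may be carried in one instruction or sent separately in two instructions. If they are carried in one instruction, the object storage system, after receiving the instruction, first migrates the valid data in the first target large object to another large object and then deletes the first target large object. If they are sent in two instructions, the backup server may first send the instruction that includes the data migration indication and then send the instruction that includes the object deletion indication. Before sending the instruction that includes the object deletion indication, the backup server may first determine whether the object storage system has finished migrating the valid data; only after confirming that the migration is complete does it send that instruction, instructing the object storage system to delete the first target large object. The backup server may determine whether the migration is complete according to whether it has received a migration completion message returned by the object storage system.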
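The two-instruction ordering described above (migrate first, delete only after the migration is confirmed) can be sketched with a toy in-memory stand-in for the object storage system. Everything here is illustrative: `InMemoryObjectStore`, its `migrate` and `delete` methods, and `reclaim` are hypothetical names, pointers are assumed to be `(large_object_id, slot)` pairs, and the return value of `migrate` plays the role of the migration completion message.

```python
class InMemoryObjectStore:
    """Toy stand-in for an object storage system holding large objects."""

    def __init__(self):
        self.objects = {}  # large object id -> {slot: bytes}

    def migrate(self, source, slots, destination):
        """Copy the given slots into the destination; return pre-move -> post-move pointers."""
        target = self.objects.setdefault(destination, {})
        mapping = {}
        for slot in slots:
            new_slot = len(target)  # append after the data already stored there
            target[new_slot] = self.objects[source][slot]
            mapping[(source, slot)] = (destination, new_slot)
        return mapping

    def delete(self, object_id):
        del self.objects[object_id]


def reclaim(store, target_object, valid_slots, destination):
    """Migrate valid data out of the target large object, then delete it."""
    # The deletion is issued only after migrate() has returned, which models
    # waiting for the migration completion message.
    movement_log = store.migrate(target_object, valid_slots, destination)
    store.delete(target_object)
    return movement_log
```

The returned mapping has the shape of a movement log: it records, for each piece of valid data, the pointer before the move and the pointer after it.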
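Based on the first aspect, in a first implementation, detecting whether the deletion condition is met includes:
detecting whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, and if it does, the deletion condition is met; or
detecting whether a preset deletion time is reached, and if it is, the deletion condition is met; or
starting a timer after the deletion condition was last met and detecting whether the timer has expired, and if it has, the deletion condition is met.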
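Based on the first aspect or the first implementation of the first aspect, in a second implementation, after the first target large object that is saved in the object storage system and includes valid data and the invalid data is determined according to the multiple deletion logs, the method further includes: creating the post-move pointer of the valid data, creating a movement log of the first disk data, and saving to the movement log the correspondence between the pre-move pointer and the post-move pointer of the valid data. The post-move pointer indicates the position of the valid data in the other large object after the valid data has been moved there.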
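Based on the second implementation of the first aspect, in a third implementation, the method further includes the following. A data recovery request sent by a client is received. The data recovery request includes a first disk identifier, a second disk identifier, and the backup identifier of the backup to be restored, and is used to instruct that the first disk data be restored to the second disk based on the backup to be restored that corresponds to that backup identifier. The backup to be restored is any one of the unexpired backups of the first disk data. The backup metadata of the backup to be restored is obtained, together with all the movement logs of the first disk data. According to all the movement logs of the first disk data, it is checked whether the backup metadata of the backup to be restored contains a pointer identical to a pre-move pointer of valid data recorded in a movement log. If it does, that pointer in the backup metadata is replaced with the post-move pointer corresponding to the pre-move pointer, yielding modified backup metadata. The modified backup metadata includes unmodified pointers and modified pointers, each modified pointer being a post-move pointer of valid data recorded in the movement log. The first disk data corresponding to the backup to be restored is obtained according to the modified backup metadata, and the first disk data is saved to the second disk.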
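Based on the third implementation of the first aspect, in a fourth implementation, after the correspondence between the pre-move pointer and the post-move pointer of the valid data is saved to the movement log, the method further includes: creating an object identifier of the movement log; saving the correspondence between the first disk identifier and the object identifier of the movement log; and sending a movement log storage request to the object storage system, where the movement log storage request includes the object identifier of the movement log and the movement log, and instructs the object storage system to save the movement log to the object corresponding to that object identifier. Obtaining all the movement logs of the first disk includes: acquiring the object identifier of the movement log according to the first disk identifier; sending a movement log obtaining request to the object storage system, where the request includes the object identifier of the movement log and instructs the object storage system to obtain the movement log from the object corresponding to that object identifier; and receiving the movement log sent by the object storage system.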
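Based on the first aspect or any one of the first to fourth implementations of the first aspect, in a fifth implementation, determining, according to the multiple deletion logs, the first target large object that is saved in the object storage system and includes valid data and the invalid data includes: determining, according to the multiple deletion logs, a first target large object saved in the object storage system that includes invalid data; determining the data volume of all invalid data in the first target large object from the predefined size of a piece of invalid data and the number of pieces of invalid data that the first target large object includes; sending a data volume determination request to the object storage system, where the request includes the identifier of the first target large object and instructs the object storage system to send the data volume of the first target large object; receiving data volume attribute information, which includes the data volume of the first target large object; and, if the data volume of all invalid data in the first target large object is smaller than the data volume of the first target large object given in the data volume attribute information, determining that the first target large object saved in the object storage system includes both invalid data and valid data.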
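In a second aspect, an expired-backup processing method performed by a backup server is provided, including:
each time an expired backup of first disk data is determined, creating a deletion log of the first disk data and saving pointers of the invalid data in the expired backup to the deletion log, where the expired backup is the earliest of all the unexpired backups of the first disk data in the object storage system as of the current first moment; detecting whether a deletion condition is met and, if it is, obtaining the multiple deletion logs corresponding to the first disk data; determining, according to the multiple deletion logs, a large object saved in the object storage system that includes only invalid data; and sending an object deletion indication to the object storage system, where the object deletion indication instructs the object storage system to delete the large object that includes only invalid data.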
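Based on the second aspect, in a first implementation, detecting whether the deletion condition is met includes:
detecting whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, and if it does, the deletion condition is met; or
detecting whether a preset deletion time is reached, and if it is, the deletion condition is met; or
starting a timer after the deletion condition was last met and detecting whether the timer has expired, and if it has, the deletion condition is met.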
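Based on the second aspect or the first implementation of the second aspect, in a second implementation, determining, according to the multiple deletion logs, the large object that is saved in the object storage system and includes only invalid data includes: determining, according to the multiple deletion logs, a large object that includes invalid data; determining the data volume of all invalid data in that large object from the predefined size of a piece of invalid data and the number of pieces of invalid data that the large object includes; sending a data volume determination request to the object storage system, where the request includes the identifier of the large object that includes invalid data and instructs the object storage system to send the data volume of that large object; receiving data volume attribute information, which includes the data volume of the large object that includes invalid data; and, if the data volume of all invalid data in the large object is the same as the data volume of that large object given in the data volume attribute information, determining that the large object saved in the object storage system includes only invalid data.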
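In a third aspect, an embodiment of the present invention provides a data backup method. Before processing expired backups, the backup server backs up the first disk data. The data fragments in the first disk data are backed up as follows. The backup server receives multiple data fragments of the first disk data that are to be backed up to the object storage system, and determines a data block to be backed up to the object storage system according to the size of the data fragments and the predefined size of a data block; the data block includes at least one data fragment of the first disk data. The backup server calculates the weak hash value of the data block and determines the identifier of another data block already saved in the object storage system whose weak hash value is similar to that of the data block. It then queries the identifier of the other data block, determines from it the identifier of the large object in which the other data block is located, determines that the amount of data already saved in that large object has not reached a predefined size, and sends a data backup request to the object storage system. The data backup request includes the data block and the identifier of the large object in which the other data block is located, and instructs the object storage system to save the data block to that large object. After receiving the request, the object storage system saves the data block to the large object in which the other data block is located. In this way, embodiments of the present invention can store data blocks with similar weak hash values in the same large object when backing up data. The data fragments in the other data block may be data fragments of the first disk data or data fragments of other disk data.
It can be understood that, in the methods of the first and second aspects, the backup server may also back up the data of the first disk by using the method provided in the third aspect before processing expired backups.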
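In a fourth aspect, a backup server is provided, including modules for performing the expired-backup processing method in the first aspect or any possible implementation of the first aspect. The modules may be implemented by hardware, or by hardware executing corresponding software.
In a fifth aspect, a backup server is provided, including modules for performing the expired-backup processing method in the second aspect or any possible implementation of the second aspect. The modules may be implemented by hardware, or by hardware executing corresponding software.
In a sixth aspect, a backup server is provided, including modules for performing the method in the implementation provided in the third aspect. The modules may be implemented by hardware, or by hardware executing corresponding software.
In a seventh aspect, a backup server is provided, including an interface, a memory, and a processor. The interface is used to communicate with the object storage system, the memory is used to store software programs, and the processor performs the expired-backup processing method in the first aspect or any possible implementation of the first aspect by running the software programs stored in the memory.
In an eighth aspect, a backup server is provided, including an interface, a memory, and a processor. The interface is used to communicate with the object storage system, the memory is used to store software programs, and the processor performs the expired-backup processing method in the second aspect or any possible implementation of the second aspect by running the software programs stored in the memory.
In a ninth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the expired-backup processing method in the first aspect or any possible implementation of the first aspect.
In a tenth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the expired-backup processing method in the second aspect or any possible implementation of the second aspect.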
图1为本发明实施例提供的一种备份系统的结构示意图;
图2为本发明实施例提供的虚拟机磁盘数据的全量备份方法流程图;
图3为本发明实施例提供的虚拟机磁盘数据的增量备份方法流程图;
图4为本发明实施例提供的过期备份的处理方法的流程图;
图5为本发明实施例提供的备份服务器根据累计的删除日志删除过期数据的方法流程图;
图6为本发明实施例提供的备份服务器通过多段复制技术指示对象存储系统保存有效数据的方法流程图;
图7为本发明实施例提供的恢复虚拟机磁盘数据的方法流程图;
图8为本发明实施例提供的一种备份服务器的结构图;
图9为本发明实施例提供的另一种备份服务器的结构图;
图10为本发明实施例提供的另一种备份服务器的结构图;
图11为本发明实施例提供的另一种备份服务器的结构图;
图12为本发明实施例提供的另一种备份服务器的结构图。
下面将结合附图,对本申请中的技术方案进行描述。
本发明实施例提供了一种备份系统。请参见图1,图1为本发明实施例提供的一种存储系统的结构示意图。所述存储系统包括客户端110、存储节点120,以及备份系统100。其中,备份系统100包括备份服务器130和对象存储系统140,备份服务器130和对象存储系统140连接。该备份服务器130和客户端110以及存储节点120相连,存储节点120包括一个或多个磁盘,所述磁盘可以为虚拟机磁盘或物理磁盘。
存储节点120,用于将待存储至存储节点120的某一磁盘的数据划分为连续的多个数据分片,将所述连续的多个数据分片保存在存储节点120的磁盘的多个物理块中。
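The backup server 130 is configured to, on receiving a data backup request sent by the client 110 or on determining that a preset time has been reached, perform a full backup or an incremental backup of the data on a disk of the storage node 120 to the object storage system 140, and to create and save backup metadata and backup attribute information. A full backup backs up all data on the disk of the storage node 120 to the object storage system 140; an incremental backup backs up the modified data on the disk of the storage node 120 to the object storage system 140. The backup metadata indicates the position of each object of the disk data within the disk data. It may record the identifier and pointer of each object that makes up the disk data, with the pointers recorded in the order in which the objects are arranged on the disk. The backup attribute information includes the backup identifier of the backup, the backup time, and the identifier of the backup metadata. After creating the backup attribute information, the backup server 130 also saves the correspondence between the disk identifier of the disk of the storage node 120 and the backup attribute information.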
通常,在客户端110发送的数据备份请求中要包含需要备份的磁盘的磁盘标识,备份服务器130将该磁盘标识所指代的磁盘上的数据全量备份或增量备份至对象存储系统140中。具体的全量备份和增量备份的过程请参考下述方法实施例中的描述,在此不再赘述。
对该磁盘的磁盘数据进行全量备份或增量备份后,系统中会产生越来越多的备份。为了节省系统的存储空间,需要对存储系统中的备份进行管理。本发明实施例提供的备份系统中的备份服务器会根据需要删除过期备份。
备份服务器130,用于检测到该磁盘的所有备份属性信息的总份数超过预定值时,根据所有备份属性信息中的备份时间确定最早备份为过期备份。备份服务器130确定过期备份后,通过该过期备份的备份元数据以及与过期备份相邻的下一次备份的备份元数据识别过期备份中的无效数据,创建删除日志,将过期备份中的无效数据的指针保存至删除日志中。创建删除日志后,备份服务器130还用于删除该过期备份的备份属性信息,以及创建并保存磁盘标识和删除日志的标识的对应关系。其中,所述无效数据指的是相对于与所述过期备份相邻的下一次备份,过期备份中的包括被修改的数据分片的对象。
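After detecting that the deletion condition is met, the backup server 130 obtains the multiple deletion logs corresponding to the disk data according to the correspondence between the disk identifier and the deletion log identifiers. The deletion condition may be that the number of deletion logs corresponding to the disk data reaches a preset deletion threshold, that a preset deletion time is reached, or that a timer started when the deletion condition was last met has expired.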
备份服务器130,还用于根据所述多条删除日志确定所述对象存储系统140中保存的包括有效数据和所述无效数据的目标大对象,以及确定只包括无效数据的大对象。其中,所述有效数据指的是相对于与所述过期备份相邻的下一次备份,过期备份中的包括没有修改的数据分片的对象。
在确定所述对象存储系统140中保存的包括有效数据和所述无效数据的目标大对象后,备份服务器130,还用于向所述对象存储系统140发送数据迁移指示和对象删除指示,所述数据迁移指示用于指示所述对象存储系统140将所述目标大对象中的所述有效数据迁移至另一大对象中,所述对象删除指示用于指示所述对象存储系统140删除所述目标大对象。在确定只包括无效数据的大对象后,备份服务器130,还用于向所述对象存储系统140发送对象删除指示,所述对象删除指示用于指示所述对象存储系统140删除该只包括无效数据的大对象。
一段时间后对累计的过期备份进行处理,确定包括无效数据的大对象。如果确定大对象为包括有效数据和无效数据的大对象,将大对象中的有效数据移动至另一大对象,然后删除该大对象以节省系统的存储空间。如果确定大对象为只包括无效数据的大对象,删除该大对象,以节省系统的存储空间。如此,本发明实施例实现了对以大对象存储方式进行增量备份的磁盘数据如何实现过期备份的删除,并且避免每确定一个过期备份后对该过期备份进行处理,而是经过一段时间后对累计的多个过期备份进行处理,简化了备份系统频繁对过期备份进行处理的步骤。
如果将大对象中的有效数据移动至另一大对象,该大对象中的有效数据的指针有修改,为了避免更新其他所有备份分别对应的备份元数据中记录的该大对象中的有效数据的指针,通过移动日志记录该大对象中的有效数据移动前的指针和移动后的指针的对应关系,以便于在后续需要访问某一备份时,确认该某一备份的备份元数据中是否存在与移动日志记录的有效数据移动前的指针相同的指针,如果存在,则将该备份的备份元数据中的该相同的指针修改为该有效数据移动后的指针,避免移动有效数据后更新其他所有备份分别对应的备份元数据。
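After moving the valid data in a large object to another large object, the backup server 130 is further configured to create a move log and save in it the correspondence between the pointers of the valid data before and after the move. Later, when the backup server 130 accesses moved valid data in the disk data through the backup metadata of an unexpired backup, it updates that backup metadata according to the correspondence saved in the move log. To update the backup metadata of the unexpired backup, each pointer in it that matches a pre-move pointer recorded in the move log is replaced with the corresponding post-move pointer recorded in the move log.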
例如,备份服务器130接收到客户端110发送的数据恢复请求,从所有未过期备份中确定待恢复备份后,会根据移动日志修改待恢复备份的备份元数据,根据修改后的备份元数据获取组成磁盘数据的所有对象,然后将磁盘数据恢复至目标磁盘。目标磁盘可以位于另一存储节点120中,可以是物理磁盘或虚拟机磁盘,目标磁盘也可以位于磁盘数据备份前所在的存储节点120或与磁盘数据备份前所在的磁盘为同一磁盘。
相应的,备份系统100中的对象存储系统140,用于存储备份数据。其中,全量备份时对象存储系统140用于保存磁盘数据中的所有数据。增量备份时对象存储系统140用于保存磁盘数据中有修改的数据。
对象存储系统140还可以存储磁盘数据备份时创建的备份元数据。
对象存储系统140还可以存储对磁盘数据的过期备份处理时创建的删除日志或移动日志。
上述备份系统中,备份服务器130和对象存储系统140独立部署,客户端110可以是备份服务器130上部署的虚拟机或与备份服务器130独立部署。客户端110与备份服务器130独立部署时,客户端110可以是与备份服务器130独立的设备或是与备份服务器130独立的设备上部署的虚拟机。存储节点120可以是备份服务器130中的存储设备或与备份服务器130独立部署。存储节点120用于管理物理磁盘或用于管理虚拟机磁盘。其中,客户端110可以为物理服务器或者各种类型的终端设备。本发明实施例的终端设备包括平板电脑、笔记本电脑、移动互联网设备、掌上电脑、台式电脑、手机或者其他产品形态的终端设备。客户端110也可以是软件模块,例如可以是运行在物理设备上的软件模块或者运行在物理服务器上的虚拟机。
下面描述本发明实施例提供的过期数据的删除方法。该方法应用于上面图1所示的备份系统中。在描述本发明实施例提供的过期数据的删除方法之前,以备份虚拟机磁盘数据为例说明备份服务器130通过对象存储方式对虚拟机磁盘数据首次进行全量备份和后续进行增量备份的流程。请参见图2,图2为本发明实施例提供的虚拟机磁盘数据的全量备份方法流程图。该方法包括以下步骤:
200、备份服务器130发送快照通知至存储节点120。
快照通知包括虚拟机磁盘标识。快照通知用于指示存储节点120对与虚拟机磁盘标识对应的虚拟机磁盘进行快照。在本实施方式中,虚拟机磁盘标识对应的磁盘为待备份磁盘。存储节点120接收到快照通知后对虚拟机磁盘进行快照,创建虚拟机磁盘的快照标识,并保存虚拟机磁盘标识和快照标识的对应关系。
备份系统如何启动全量备份可以有多种方式实现。例如,用户可以通过客户端110选择对虚拟机磁盘数据进行全量备份的时间点,在全量备份的时间点到达后,客户端110发送备份请求至备份服务器130,备份请求中包括存储节点120的标识和虚拟机磁盘标识。备份服务器130接收到客户端110发送的备份请求后,发送快照通知至存储节点120。或 者,存储节点120将待存储至虚拟机磁盘中连续的多个数据分片分别保存在虚拟机磁盘的一个物理块后,发送备份请求至备份服务器130启动全量备份。或者,用户在备份服务器130中预先设定存储节点120的全量备份的时间点,在全量备份的时间点到达后,备份服务器130发送快照通知至存储节点120。在其他实例中,备份系统如何启动全量备份的具体实现不受本发明实施例所举实例的限制。
210、存储节点120对虚拟机磁盘数据进行快照后,发送快照信息至备份服务器130。快照信息包括虚拟机磁盘的快照标识和虚拟机磁盘标识。
存储节点120对虚拟机磁盘数据进行快照时,会创建虚拟机磁盘的快照标识。然后根据所述虚拟机磁盘的快照标识和虚拟机磁盘标识生成快照信息。存储节点120通过发送快照信息发送虚拟机磁盘标识和快照标识至备份服务器130。
220、备份服务器130接收到快照信息后,保存快照标识和虚拟机磁盘标识的对应关系,并且,根据虚拟机磁盘标识确认虚拟机磁盘的已有备份情况,并根据已有备份情况确定是否进行全量备份。
在步骤220中,备份服务器130根据虚拟机磁盘标识确认虚拟机磁盘的已有备份情况分为以下两种情况:
第一种情况,备份服务器130根据虚拟机磁盘标识确认对虚拟机磁盘已备份的数量为0。如果已备份的数量为0,则对虚拟机磁盘数据进行全量备份。
第二种情况,备份服务器130根据虚拟机磁盘标识确认对虚拟机磁盘已备份的数量不为0,且最后一次对虚拟机磁盘进行备份时保存的信息与应该备份时需要保存的增量信息不一致。
在第二种情况中,发送给备份服务器的快照信息中还包括存储节点中记录的上一次备份时的快照的数据量。备份服务器根据虚拟机磁盘标识确认对虚拟机磁盘已备份的数量不为0后,通过对比备份服务器中记录的上一次备份时快照的数据量与存储节点中记录的上一次备份时快照的数据量确认最后一次对虚拟机磁盘进行备份时保存的信息与应该备份所要保存的增量信息是否一致,如果备份服务器中记录的上一次备份时的快照与存储节点中记录的上一次备份时快照的数据量不同,则表明最后一次对虚拟机磁盘进行备份时保存的信息与应该备份时需要保存的增量信息不一致。
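Alternatively, in the second case, the snapshot information sent to the backup server also includes the identifier of the previous snapshot. After confirming from the virtual machine disk identifier that the number of existing backups of the virtual machine disk is not 0, the backup server compares the identifier of the snapshot of the previous backup recorded in the backup server with the identifier of the snapshot of the previous backup recorded in the storage node, to confirm whether the information saved by the last backup of the virtual machine disk is consistent with the incremental information that the backup should have saved. If the two identifiers differ, the information saved by the last backup of the virtual machine disk is inconsistent with the incremental information that the backup should have saved.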
225、备份服务器130确定进行全量备份后,发送全量数据获取请求至虚拟机磁盘所在的存储节点120。
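The full data acquisition request includes the virtual machine disk identifier and the position, within the virtual machine disk data, of the data block to be read. The data block to be read includes at least one data slice. The number of data slices in the data block to be read is determined by the size of the data block and the size of a data slice. The size of the data block to be read is predefined in the backup software of the backup server 130, and different backup software may predefine different sizes. The backup server 130 may send multiple full data acquisition requests to obtain all the data slices of the virtual machine disk data.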
其中,待读取的数据块在虚拟机磁盘数据中的位置可以是下面两种:
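The first form: the start position of the data block to be read within the virtual machine disk data, plus the size of the data block to be read. If the first data slice of the data block to be read is the i-th data slice of the virtual machine disk data, where i is an integer greater than 0, the start position of the data block within the virtual machine disk data is the product of (i-1) and the data slice size. For example, if the first data slice of the data block to be read is the 3rd data slice of the virtual machine disk data and the data slice size is 4M, the start position is (3-1) × 4M = 8M. Likewise, if the first data slice of the data block to be read is the 1st data slice of the virtual machine disk data and the data slice size is 4M, the start position is (1-1) × 4M = 0M.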
第二种:待读取的数据块在虚拟机磁盘数据中的起始位置和结束位置。待读取的数据块在虚拟机磁盘数据中的起始位置如上一表现方式,在此不再赘述。待读取的数据块在虚拟机磁盘数据中的结束位置为待读取的数据块中的最后一个数据分片的结束位置,如果待读取的数据块中的最后一个数据分片为虚拟机磁盘数据中的第w个数据分片,则待读取的数据块中的最后一个数据分片的结束位置为所述w与所述数据分片的大小的乘积。
可以理解的是,上述的起始位置和结束位置指的是偏移量。
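The two position forms described above reduce to simple offset arithmetic. The sketch below is illustrative only; the function name `block_range` and the reading of "4M" as 4 MiB are assumptions, not part of the original text.

```python
MIB = 1024 * 1024  # assumption: "4M" in the text is read as 4 MiB


def block_range(first_slice: int, last_slice: int, slice_size: int) -> tuple:
    """Map 1-based slice indices to (start, end) offsets in the disk data.

    start = (i - 1) * slice_size, where i is the first slice of the block
    end   = w * slice_size,       where w is the last slice of the block
    """
    if not 1 <= first_slice <= last_slice:
        raise ValueError("slice indices must satisfy 1 <= first <= last")
    start = (first_slice - 1) * slice_size
    end = last_slice * slice_size
    return start, end


# First form (start position plus size): size is simply end - start.
start, end = block_range(3, 3, 4 * MIB)
size = end - start
```

With a block that starts at the 3rd slice and a 4 MiB slice size, `start` is 8 MiB, matching the worked example in the text.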
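Alternatively, if there are multiple data blocks to be read and they are contiguous, the full data acquisition request includes the virtual machine disk identifier and the position of those data blocks within the virtual machine disk data. That position may be expressed as the start position of the first of the data blocks within the virtual machine disk data plus the size of those data blocks. Alternatively, it may be expressed as the start position of the first of the data blocks and the end position of the last of the data blocks within the virtual machine disk data. Both positions can be determined in the same way as the position of a single data block to be read, described above; the details are not repeated here.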
在步骤225之后,虚拟机磁盘所在的存储节点120接收全量数据获取请求,会根据待读取的数据块在虚拟机磁盘数据中的位置查找虚拟机磁盘数据中的组成数据块的多个数据分片。然后,发送虚拟机磁盘数据中的组成数据块的多个数据分片至备份服务器130。
230、备份服务器130接收虚拟机磁盘数据中的多个数据块后,确定组成大对象的多个数据块。每个数据块为大对象中的一个对象。
备份服务器130接收到的符合预定数量的连续的多个数据块为组成大对象的多个数据块。
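After receiving a predetermined number of contiguous data blocks, the backup server may calculate a weak hash value for each data block and may determine that multiple data blocks with similar weak hash values are the data blocks that make up a large object.
240. After determining the data blocks that make up a large object, the backup server 130 creates an identifier for the large object and sends the data blocks and the large object identifier to the object storage system 140.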
对象存储系统140接收到所述多个数据块和大对象的标识后,将所述多个数据块保存至与所述大对象的标识对应的大对象中。
虚拟机磁盘的所有数据块组成多个大对象,被发送至对象存储系统140。
250、备份服务器130创建备份元数据和元数据对象的标识,将所述备份元数据和元数据对象的标识发送至对象存储系统140。
备份元数据用于表示虚拟机磁盘数据中的每个对象在虚拟机磁盘数据中的位置。备份元数据中可以记录组成虚拟机磁盘数据的每个对象的标识及指针,并且按照对象在虚拟机磁盘中的排列顺序来记录每个对象的指针。
可替换的,对象的标识也可以同时用作指针。也就是说,备份元数据中可以只记录组成虚拟机磁盘数据的每个对象的标识,并且按照对象在虚拟机磁盘中的排列顺序来记录每个对象的标识。其中,备份元数据中记录的对象标识可以包括大对象的标识和对象在其所属的大对象中的位置。
如果一个对象包含在一个大对象中,则该对象的指针表示该对象在其所属的大对象中的位置。例如下表1中表示的备份元数据中记录的磁盘数据中前四个大对象中的部分对象的指针。
表1
260、备份服务器130创建全量备份的备份属性信息,并保存所述全量备份的备份属性信息和虚拟机磁盘标识的对应关系。所述全量备份的备份属性信息包括全量备份的备份标识、备份元数据的元数据对象标识和备份时间。
以上步骤200至260描述了备份系统如何对虚拟机磁盘数据进行全量备份的方法。在本发明实施例中,对虚拟机磁盘数据进行全量备份之后,用户可能对数据做了修改,然后对修改数据进行增量备份。具体增量备份的过程包括图3中的过程。在增量备份后,为了减少备份数据的数据量,备份服务器130会确定过期备份并对过期备份进行处理。下面详细描述备份系统如何对虚拟机磁盘数据进行增量备份的方法。
请参见图3,图3为本发明实施例提供的虚拟机磁盘数据的增量备份方法流程图。如图3所示,本发明实施例提供的虚拟机磁盘数据的增量备份方法包括以下步骤:
300、备份服务器130发送快照通知至存储节点120。
快照通知包括虚拟机磁盘标识。快照通知用于指示存储节点120对与虚拟机磁盘标识对应的虚拟机磁盘进行快照。在本实施方式中,虚拟机磁盘标识对应的磁盘为待备份磁盘。存储节点120接收到快照通知后对虚拟机磁盘进行快照,创建虚拟机磁盘的快照标识,并保存虚拟机磁盘标识和快照标识的对应关系。
备份系统如何启动增量备份可以有多种方式实现。例如,用户可以通过客户端110选择对虚拟机磁盘数据进行增量备份的时间点,在增量备份的时间点到达后,客户端110 发送备份请求至备份服务器130,备份请求中包括存储节点120的标识和虚拟机磁盘标识。备份服务器130接收到客户端110发送的备份请求后,发送快照通知至存储节点120。或者,存储节点120将待存储至虚拟机磁盘中连续的多个数据分片分别保存在虚拟机磁盘的一个物理块后,如果虚拟机磁盘所在的存储节点120对虚拟机磁盘数据中至少一个对象中的至少一个数据分片进行修改,存储节点120会发送备份请求至备份服务器130启动增量备份。或者,用户在备份服务器130中预先设定存储节点120的增量备份的时间点,在增量备份的时间点到达后,备份服务器130发送快照通知至存储节点120。在其他实例中,备份系统如何启动增量备份的具体实现不受本发明实施例所举实例的限制。
310、存储节点120对虚拟机磁盘数据进行快照后,发送快照信息至备份服务器130。快照信息包括当前快照的快照标识、改变块跟踪(Change Block Tracking,CBT)信息和虚拟机磁盘标识。
存储节点120对虚拟机磁盘数据进行快照时,会创建虚拟机磁盘的快照标识和当前快照的CBT信息。存储节点120创建虚拟机磁盘的快照标识后,可以根据所述虚拟机磁盘的快照标识、当前快照的CBT信息和虚拟机磁盘标识生成快照信息。存储节点120通过发送快照信息发送虚拟机磁盘标识、当前快照的CBT信息和快照标识至备份服务器130。
存储节点120创建虚拟机磁盘的快照标识和当前快照的CBT信息后,会保存当前快照的CBT信息、快照标识和虚拟机磁盘标识的对应关系。
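In this implementation, the CBT information of the current snapshot indicates the modification state of each data slice of the virtual machine disk data at the current snapshot relative to the previous snapshot. It includes the CBT information identifier of the current snapshot and a modification state flag for each data slice of the virtual machine disk data. The modification state flags are recorded in the order in which the data slices make up the virtual machine disk data. A modification state flag indicates the modification state of a data slice at the current snapshot relative to the previous snapshot; the state may be modified or unmodified. For example, the digit 0 indicates that the data slice is unmodified relative to the previous snapshot, and the digit 1 indicates that it has been modified. In other implementations, the modification state flag may be, for example, Chinese characters such as “修改” (modified) or “未修改” (unmodified), or letters, digits, or other symbols, or a combination of these. The specific form of the modification state flag is not limited by this embodiment.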
320、备份服务器130根据虚拟机磁盘标识确认虚拟机磁盘所在的存储节点120对虚拟机磁盘创建过快照,则发送增量对比信息获取请求至虚拟机磁盘所在的存储节点120。
增量对比信息获取请求包括虚拟机磁盘的快照标识和虚拟机磁盘标识。
在步骤320之后,虚拟机磁盘所在的存储节点120接收增量对比信息获取请求,发送增量对比信息至备份服务器130。增量对比信息包括虚拟机磁盘的快照标识对应的当前快照的前一次快照所生成的所述前一次快照的CBT信息标识。
虚拟机磁盘所在的存储节点120接收到增量对比信息获取请求后,根据虚拟机磁盘的快照标识查找所述虚拟机磁盘的快照标识对应的当前快照的前一次快照所生成的前一次快照的CBT信息,如果查找到相对于虚拟机磁盘的快照标识对应的当前快照的前一次快照所生成的该前一次快照的CBT信息,则发送该前一次快照的CBT信息中的该前一次快照的CBT信息标识至备份服务器130。如果查找不到相对于虚拟机磁盘的快照标识对应的 当前快照的前一次快照生成的该前一次快照的CBT信息,则发送查找失败的信息至备份服务器130。
如果本次快照是第二次快照,则第一次快照是全量备份时的快照,全量备份创建快照时是没有创建CBT信息的。因此,存储节点120是查找不到相对于虚拟机磁盘的快照标识对应的本次快照的前一次快照生成的该前一次快照的CBT信息,则发送查找失败的信息至备份服务器130。
330、备份服务器130接收到虚拟机磁盘所在的存储节点120发送的增量对比信息后,根据增量对比信息中的所述前一次快照的CBT信息标识和快照信息中的当前快照的CBT信息确认当前快照对应的增量数据在虚拟机磁盘数据中的偏移位置。
备份服务器130确认当前快照对应的增量数据在虚拟机磁盘数据中的偏移位置的具体实现方式包括:备份服务器130首先确认备份服务器130在前一次对虚拟机磁盘数据进行备份时在备份服务器130中记录的CBT信息标识与所述增量对比信息中的前一次快照的CBT信息标识是否相同,如果相同,备份服务器130根据当前快照的CBT信息中记录的修改状态标识,确认当前快照对应的增量数据在虚拟机磁盘数据中的偏移位置。
备份服务器130根据当前快照的CBT信息记录的修改状态标识确认当前快照对应的增量数据在虚拟机磁盘数据中的偏移位置的具体实现方式为:备份服务器130根据当前快照的CBT信息中记录的修改状态标识的排列顺序读取预定数量个修改状态标识,如果所述预定数量个修改状态标识包括表示修改状态为已修改的修改状态标识,则确定所述预定数量个修改状态标识对应的多个数据分片中包括已修改的数据分片。其中,所述预定数量等于一个对象中数据分片的数量。所述预定数量个修改状态标识在当前快照的CBT信息记录的修改状态标识中的偏移位置即为当前快照对应的增量数据在虚拟机磁盘数据中的偏移位置。
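The scan over modification state flags described above can be sketched as follows. This is a minimal illustration under stated assumptions: flags are represented as a list of 0/1 integers, offsets are expressed as slice indices rather than bytes, and the function name `incremental_offsets` is hypothetical.

```python
def incremental_offsets(cbt_flags: list, slices_per_object: int) -> list:
    """Return the slice offset of every object-sized group that contains
    at least one modified slice (flag == 1).

    cbt_flags is ordered like the data slices of the disk data; each group of
    `slices_per_object` consecutive flags corresponds to one object.
    """
    offsets = []
    for start in range(0, len(cbt_flags), slices_per_object):
        group = cbt_flags[start:start + slices_per_object]
        if any(flag == 1 for flag in group):
            offsets.append(start)
    return offsets


# Four objects of four slices each; the 2nd and 4th objects contain changes.
flags = [0, 0, 0, 0,  0, 1, 0, 0,  0, 0, 0, 0,  1, 1, 0, 0]
changed = incremental_offsets(flags, 4)
```

Multiplying each returned offset by the data slice size would give the byte offset of the corresponding incremental data.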
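After the offset positions of multiple pieces of incremental data within the virtual machine disk data have been determined one by one in the manner described above, the backup server 130 sends incremental data acquisition requests in turn to the storage node 120 where the virtual machine disk is located, that is, it proceeds to step 340 below.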
340、备份服务器130发送增量数据获取请求至虚拟机磁盘所在的存储节点120。增量数据获取请求包括虚拟机磁盘标识以及增量数据在虚拟机磁盘数据中的偏移位置。
增量数据在虚拟机磁盘数据中的偏移位置的表现方式有两种,具体实现可以参照图2所示的全量备份过程中对待读取的数据块在虚拟机磁盘数据中的偏移位置的表现方式。在此不再赘述。
在步骤340之后,虚拟机磁盘所在的存储节点120接收增量数据获取请求,根据增量数据在虚拟机磁盘数据中的偏移位置查找虚拟机磁盘数据中的增量数据。然后,虚拟机磁盘所在的存储节点120发送增量数据至备份服务器130。
350、备份服务器130接收到增量数据后,创建新的大对象标识,将新的大对象标识和增量数据发送至对象存储系统140。每个增量数据在对象存储系统140中被保存为一个新的大对象中的一个新的对象。
多个增量数据对应的多个新的对象可以归属于一个新的大对象或多个新的大对象。
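For example, the backup software predefines the number of objects that a large object can actually hold. When the number of pieces of incremental data received reaches that number, the backup server 130 creates an identifier for a new large object and sends the received incremental data together with the new large object identifier to the object storage system 140, so that the object storage system 140 saves the incremental data in the new large object corresponding to that identifier.
After step 350, the object storage system 140 receives the incremental data and, using object storage, saves each piece of incremental data as a new object in the new large object corresponding to the new large object identifier, as the incremental backup. The object storage system 140 then sends an incremental backup completion message to the backup server 130.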
备份服务器130在增量备份时,如果以对象存储方式存储增量备份对应的备份元数据,备份服务器130会创建增量备份的备份标识和元数据对象标识,并保存增量备份的备份标识和元数据对象标识的对应关系。
360、备份服务器130发送元数据获取请求至对象存储系统140。
元数据获取请求包括上一次备份对应的元数据对象标识。
备份服务器130创建元数据获取请求之前,会先查找上一次备份的备份标识,根据上一次备份的备份标识查找与上一次备份的备份标识对应的元数据对象标识。
在步骤360之后,对象存储系统140接收元数据获取请求,发送上一次备份对应的备份元数据至备份服务器130。
370、备份服务器130接收到对象存储系统140发送的上一次备份对应的备份元数据后,修改上一次备份对应的备份元数据,获得修改后的备份元数据作为增量备份的备份元数据。
备份服务器130将增量数据对应的对象的指针进行修改,修改后用于指示对象存储系统140中保存的增量数据对应的对象在其所属的新的大对象中的位置。
例如以修改上表1中涉及的一部分磁盘数据中的个别数据为例,说明备份元数据在修改后的表现形式。依据表1中涉及的一部分磁盘数据,如果备份服务器130对大对象B中的第一个对象B1中的至少一个数据分片、以及大对象D中的第一个对象D1中的至少一个数据分片进行修改,备份系统在增量备份时,会创建一个新的大对象E。备份服务器130会将修改后的数据分片所属的对象保存至新的大对象E中。因此,新的大对象E包括两个有数据分片修改的对象。备份服务器130将修改后的数据分片所属的对象保存至新的大对象E后,备份服务器130会获取表1中涉及的备份元数据,将表1中涉及的备份元数据进行修改,修改后的备份元数据中的一部分如下表2所示。
表2
备份服务器130获得修改后的备份元数据后,会创建增量备份的备份元数据的元数据对象标识,然后创建增量备份的备份属性信息,所述增量备份的备份属性信息包括增量备份的备份标识、元数据对象标识和备份时间。备份服务器130会保存所述增量备份的备份属性信息和虚拟机磁盘标识的对应关系。
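380. The backup server 130 sends the modified backup metadata and the metadata object identifier of the incremental backup to the object storage system 140.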
备份服务器130发送增量备份的备份元数据至对象存储系统140时,会指示对象存储系统140将增量备份的备份元数据保存至与增量备份对应的所述元数据对象标识对应的对象中。
在步骤380之后,对象存储系统140将增量备份时创建的备份元数据保存至与增量备份的所述元数据对象标识对应的对象中,然后发送增量数据的元数据存储完成消息至备份服务器130。
以上步骤300至380描述了备份系统如何通过对象存储方式对虚拟机磁盘数据进行增量备份的方法。对虚拟机磁盘数据完成全量备份或增量备份后,为节省备份系统的存储空间,本发明实施例中,备份服务器130会检测对象存储系统140中对虚拟机磁盘数据进行的所有备份中未过期的备份的总份数。如果对象存储系统140中对磁盘数据进行的所有备份中未过期的备份的总份数超过预定值,则确定对象存储系统140中所有未过期的备份中的最早备份属于过期备份,备份系统进而会对所有未过期的备份中的最早备份进行处理以节省备份系统的存储空间。下面描述一下本发明实施例提供的过期备份的处理方法。
请参见图4,图4为本发明实施例提供的过期备份的处理方法的流程图。如图4所示,本发明实施例提供的过期备份的处理方法包括以下步骤:
400、备份服务器130确定虚拟机磁盘数据的第一过期备份后,发送第一过期备份信息获取请求至对象存储系统140。第一过期备份信息获取请求包括所述第一过期备份的元数据对象标识和与所述第一过期备份相邻的下一次备份的元数据对象标识。
在步骤400之前,备份服务器130确定虚拟机磁盘数据的第一过期备份。
确定第一过期备份的过程包括:备份服务器130会检测备份服务器130中保存的所有备份属性信息的数量,如果备份服务器130中保存的所有备份属性信息的数量超过预定值,根据备份属性信息查询对所述虚拟机磁盘数据进行的所有备份中的第一过期备份的备份标识,所述第一过期备份为截止所述当前第一时刻对象存储系统140中对所述虚拟机磁盘数据进行的所有未过期备份中的最早备份。
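The selection of the expired backup can be sketched as follows. The `BackupAttr` record mirrors the backup attribute information described in the text (backup identifier, backup time, metadata object identifier); the class name, field names, and the function `pick_expired_backup` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BackupAttr:
    """Hypothetical record for one backup's attribute information."""
    backup_id: str
    backup_time: float
    metadata_object_id: str


def pick_expired_backup(
    attrs: list, max_backups: int
) -> Optional[Tuple[BackupAttr, Optional[BackupAttr]]]:
    """If the number of unexpired backups exceeds `max_backups`, return the
    earliest backup (the expired one) and the backup adjacent to it in time;
    otherwise return None."""
    if len(attrs) <= max_backups:
        return None
    ordered = sorted(attrs, key=lambda a: a.backup_time)
    next_backup = ordered[1] if len(ordered) > 1 else None
    return ordered[0], next_backup
```

The two metadata object identifiers carried by the returned records are what the first expired backup information acquisition request would contain.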
备份服务器130确定虚拟机磁盘数据的第一过期备份后,以及备份服务器130发送第一过期备份信息获取请求之前,会先查找所述第一过期备份的备份标识,然后根据所述第一过期备份的备份标识查找与所述第一过期备份的备份标识对应的元数据对象标识。以及,备份服务器130发送第一过期备份信息获取请求之前,会先查找与所述第一过期备份相邻的下一次备份的备份标识,然后根据与所述第一过期备份相邻的下一次备份的备份标识查找与所述第一过期备份相邻的下一次备份的备份标识对应的元数据对象标识。备份服务器130确认所述第一过期备份的元数据对象标识和与所述第一过期备份相邻的下一次备份的元数据对象标识后,发送第一过期备份信息获取请求至对象存储系统140,以指示对象存储系统140将所述第一过期备份的元数据对象标识和与所述第一过期备份相邻的下一次备份的元数据对象标识所对应的备份元数据发送至备份服务器130。
410、对象存储系统140接收到第一过期备份信息获取请求后,根据第一过期备份的元数据对象标识查找第一过期备份的备份元数据,以及根据与所述第一过期备份相邻的下一次备份的元数据对象标识查找与所述第一过期备份相邻的下一次备份的备份元数据。
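411. The object storage system 140 sends the backup metadata of the first expired backup and the backup metadata of the next backup adjacent to the first expired backup to the backup server 130.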
420、备份服务器130接收到第一过期备份对应的备份元数据以及与所述第一过期备份相邻的下一次备份的备份元数据后,根据第一过期备份对应的备份元数据以及与所述第一过期备份相邻的下一次备份的备份元数据,识别出第一过期备份中的有效数据和无效数据。
确认第一过期备份中的无效数据和有效数据的具体实现方式包括:备份服务器130首先对比第一过期备份的备份元数据以及与第一过期备份相邻的下一次备份对应的备份元数据,判断第一过期备份所对应的备份元数据以及所述下一次备份所对应的备份元数据中同一排列位置是否指向同一对象,如果指向同一对象,则表明第一过期备份当中该排列位置指向的对象是有效数据;相反,指向不同的对象,则表明第一过期备份当中该排列位置指向的对象是无效数据。
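The position-by-position comparison described above can be sketched as follows. Backup metadata is modelled here as an ordered list of pointer strings; that representation, the function name `split_valid_invalid`, and the sample pointer layout are assumptions for illustration and are not taken from the tables of the original document.

```python
def split_valid_invalid(expired_meta: list, next_meta: list) -> tuple:
    """Compare two backup metadata pointer lists position by position.

    A pointer of the expired backup is valid if the next backup points to the
    same object at the same position, and invalid otherwise.
    """
    valid, invalid = [], []
    for old_ptr, new_ptr in zip(expired_meta, next_meta):
        if old_ptr == new_ptr:
            valid.append(old_ptr)
        else:
            invalid.append(old_ptr)
    return valid, invalid


# Illustrative layout: objects B1 and D1 were modified and rewritten as E1 and E2.
full_backup = ["A1", "A2", "A3", "B1", "B2", "C1", "D1"]
incremental = ["A1", "A2", "A3", "E1", "B2", "C1", "E2"]
valid, invalid = split_valid_invalid(full_backup, incremental)
```

The `invalid` list is what would be written to the deletion log for the expired backup.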
421、备份服务器130确认第一过期备份中的无效数据后,创建虚拟机磁盘数据的第一删除日志,并将第一过期备份中的无效数据的指针保存至第一删除日志中。
例如基于表1所示的关于全量备份的备份元数据中的部分对象的指针以及表2所示的关于增量备份的备份元数据中的部分对象的指针,如果与表1对应的全量备份为第一过期备份,与表2对应的增量备份为与第一过期备份相邻的备份,则第一过期备份中大对象B中的第一个对象的指针B1对应的对象为无效数据,以及大对象D中的第一个对象的指针D1对应的对象为无效数据,则创建的与第一过期备份对应的第一删除日志中记录的部分无效数据的指针可参考下表3所示的内容。
表3
在本发明实施例中,在上述步骤421之后,即备份服务器130创建第一删除日志后,备份服务器130可以发送第一过期备份元数据删除指令至对象存储系统140。第一过期备份元数据删除指令包括与第一过期备份的备份元数据对应的元数据对象标识。第一过期备份元数据删除指令用于指示删除第一过期备份的备份元数据。备份服务器130发送第一过期备份元数据删除指令至对象存储系统140后,对象存储系统140可以根据所述第一过期备份元数据删除指令删除第一过期备份的备份元数据。
另外,在本发明实施例中,备份服务器130在创建删除日志后,备份服务器130还会删除所述第一过期备份的备份属性信息。备份服务器130删除第一过期备份的备份属性信息后,备份服务器130中保存的剩余的备份属性信息为未过期备份的备份属性信息。所以,删除第一过期备份的备份属性信息后,后续增量备份后,备份服务器130可以检测保存的所有备份属性信息的数量,以检测所有未过期备份的总份数是否超过预定值。
422、备份服务器130发送所述第一删除日志至所述对象存储系统140。
430、对象存储系统140接收到所述第一删除日志后,将第一删除日志保存至对象存储系统140中。
将第一删除日志保存至对象存储系统140后,对象存储系统140可以发送删除日志保存完成消息至备份服务器130,以通知备份服务器130已将第一删除日志保存至对象存储系统140。
后续如果虚拟机磁盘数据再有修改,备份服务器130还可以通过以上图3所示的步骤对虚拟机磁盘数据进行增量备份。
例如,关于表2对应的增量备份后,如果虚拟机磁盘数据中的大对象A中的第一个对象A1、大对象C中的第一个对象C1以及大对象E中的第二个对象E2中的数据分片有修改,则对修改的数据分片所属的对象进行增量备份后,修改后的数据分片所属的对象保存在大对象F中。如此,增量备份的备份元数据如下表4所示。
表4
经过与表4对应的增量备份,如果备份服务器130检测对象存储系统140中对虚拟机磁盘数据进行的所有备份中未过期的备份的总份数超过预定值,则确定对象存储系统140中所有未过期的备份中的最早备份为第二过期备份。确定第二过期备份后,备份服务器130会创建所述虚拟机磁盘数据的第二删除日志,并保存所述第二过期备份中的无效数据的指针至所述第二删除日志。所述第二过期备份为截止当前第二时刻所述对象存储系统140中对所述虚拟机磁盘数据进行的所有未过期备份中的最早备份。基于本发明实施例的过期备份处理方法,每确定一个过期备份后的处理方式可以参照上述图4所示的步骤400至430的细节,具体实现细节不再在此赘述。
例如,经过与表4对应的增量备份后,如果表2对应的增量备份为第二过期备份,则创建的第二删除日志中记录的部分无效数据的指针可参考下表5所示的内容。
表5
又如,关于表4对应的增量备份后,如果虚拟机磁盘数据中的大对象A中的第二个对象A2以及大对象A中的第三个对象A3中的数据分片有修改,则对修改的数据分片所属的对象进行增量备份后,修改后的数据分片所属的对象保存在大对象G中。如此,增量备份的备份元数据如下表6所示。
表6
经过与表6对应的增量备份,如果备份服务器130检测对象存储系统140中对虚拟机磁盘数据进行的所有备份中未过期的备份的总份数超过预定值,则确定对象存储系统140中所有未过期的备份中的最早备份为第三过期备份。确定第三过期备份后,备份服务器130会创建所述虚拟机磁盘数据的第三删除日志,并保存所述第三过期备份中的无效数据的指针至所述第三删除日志。所述第三过期备份为截止当前第三时刻所述对象存储系统140中对所述虚拟机磁盘数据进行的所有未过期备份中的最早备份。基于本发明实施例的过期备份处理方法,每确定一个过期备份后的处理方式可以参照上述图4所示的步骤400至430的细节,具体实现细节不再在此赘述。
例如,经过与表6对应的增量备份后,如果表4对应的增量备份为第三过期备份,则创建的第三删除日志中记录的部分无效数据的指针可参考下表7所示的内容。
表7
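The purpose of saving the deletion logs of the multiple expired backups of the virtual machine disk to the object storage system 140 is that the backup system can later use all the deletion logs to identify every object in the expired backups that contains invalid data, and then process those objects. The objects that contain invalid data may include both large objects that contain invalid and valid data, and large objects that contain only invalid data or objects that are themselves invalid data.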
备份服务器130将删除日志保存至对象存储系统140后,待检测满足删除条件时,备份服务器130根据累计的删除日志启动过期数据删除流程。下面详细描述备份服务器130如何根据累计的删除日志删除过期数据。请参见图5,图5为本发明实施例提供的备份服务器根据累计的删除日志删除过期数据的方法流程图。如图5所示,本发明实施例提供的备份服务器130根据累计的删除日志删除过期数据的方法包括以下步骤。
500、备份服务器130接收对象存储系统140发送的保存在所述对象存储系统140中的与所述第一磁盘数据对应的多条删除日志。
备份服务器130接收对象存储系统140发送的多条删除日志之前,检测是否满足删除条件,如果满足删除条件,从对象存储系统140中获取虚拟机磁盘数据的所有删除日志。检测是否超过预设删除阈值,如果超过预设删除阈值,则满足删除条件;或检测是否达到预设删除时间,如果达到预设删除时间,则满足删除条件;或自上次满足删除条件后启动计时,检测所述计时是否结束,如果所述计时结束则满足删除条件。备份服务器130可以检测与所述第一磁盘数据对应的删除日志的日志对象标识的数量是否达到预设删除阈值,以检测与所述第一磁盘数据对应的多条删除日志的数量是否达到预设删除阈值。保存在所述对象存储系统140中的与所述第一磁盘数据对应的多条删除日志,例如可以是上述示例中描述的第一删除日志、第二删除日志以及第三删除日志。
501、备份服务器130接收对象存储系统140发送的所述多条删除日志后,根据所有删除日志确定所述对象存储系统140中保存的包括无效数据和有效数据的所有大对象,以及确定只包括无效数据的所有大对象。
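In step 501, the backup server 130 determines whether a large object contains both invalid and valid data or only invalid data based on the number of invalid data items in the large object that contains invalid data and the data size of that large object.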
备份服务器130根据所述包括无效数据的大对象中无效数据的数量以及包括无效数据的大对象的实际数据量,确定所述大对象是包括无效数据和有效数据的大对象还是只包括无效数据的大对象的实现方式为,备份服务器130根据所述包括无效数据的大对象中无效数据的数量以及无效数据的大小确定所述包括无效数据的大对象的目标数据量,以及备份服务器130会获取包括无效数据的大对象的实际数据量,如果所述包括无效数据的大对象的目标数据量小于包括无效数据的大对象的实际数据量,则包括无效数据的大对象为包括无效数据和有效数据的大对象。如果所述包括无效数据的大对象的目标数据量等于包括无效数据的大对象的实际数据量,则包括无效数据的大对象为只包括无效数据的大对象。
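The size comparison described above can be sketched as follows. Pointers are modelled as `(large_object_id, position)` tuples and the actual sizes as a dictionary standing in for the data size attribute information returned by the object storage system; these representations, the function name, and the sample sizes are assumptions for illustration.

```python
from collections import Counter


def classify_large_objects(deletion_logs: list, object_size: int, actual_sizes: dict) -> dict:
    """Label every large object named in the deletion logs.

    target size = number of invalid data items * predefined object size
    target == actual size -> the large object contains only invalid data
    target <  actual size -> the large object also contains valid data
    """
    pointers = {ptr for log in deletion_logs for ptr in log}  # de-duplicate pointers
    invalid_counts = Counter(large_id for large_id, _pos in pointers)
    result = {}
    for large_id, count in invalid_counts.items():
        target = count * object_size
        actual = actual_sizes[large_id]
        if target == actual:
            result[large_id] = "invalid_only"
        elif target < actual:
            result[large_id] = "mixed"
        else:
            raise ValueError(f"invalid data exceeds size of large object {large_id}")
    return result


# Three deletion logs in the spirit of the running example (unit sizes assumed).
logs = [
    [("B", 1), ("D", 1)],
    [("A", 1), ("C", 1), ("E", 2)],
    [("A", 2), ("A", 3)],
]
sizes = {"A": 48, "B": 32, "C": 16, "D": 16, "E": 32}
labels = classify_large_objects(logs, 16, sizes)
```

In a real deployment the `sizes` lookup would be one data size determination request per large object rather than a local dictionary.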
备份服务器130获取包括无效数据的大对象的实际数据量的实现方式为,备份服务器130发送数据量确定请求至所述对象存储系统140,所述数据量确定请求中携带包括无效数据的大对象的标识,所述数据量确定请求用于指示所述对象存储系统140发送所述包括无效数据的大对象的实际数据量。对象存储系统140接收到所述数据量确定请求后,发送数据量属性信息至备份服务器130。备份服务器130接收数据量属性信息,所述数据量属性信息中携带包括无效数据的大对象的实际数据量。
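For example, applying the above approach of using the number of invalid data items in a large object and the actual data size of that large object to decide whether it contains both invalid and valid data or only invalid data, the large objects containing invalid data that the backup server 130 identifies from the first, second, and third deletion logs described in the examples above are large object B, large object D, large object A, large object C, and large object E. Large object B contains both invalid and valid data, large object D contains only invalid data, large object A contains only invalid data, large object C contains only invalid data, and large object E contains both invalid and valid data.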
可替换的,备份服务器确定大对象为包括有效数据和无效数据的大对象的方式还有两种,第一种:备份服务器在确定包括无效数据的大对象后,向对象存储系统请求该大对象中所有对象的指针,如果大对象中的部分对象的指针为删除日志中记录的该大对象的无效数据的指针,则确定该大对象为包括有效数据和无效数据的大对象。第二种:备份服务器在确定包括无效数据的大对象后,向对象存储系统请求该大对象中所有对象的指针的数量,如果删除日志中记录的该大对象的所有无效数据的指针的数量小于向对象存储系统请求的大对象中的所有对象的指针的数量,则确定该大对象为包括有效数据和无效数据的大对象。
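Similarly, as an alternative, the backup server can determine that a large object contains only invalid data in two further ways. In the first, after identifying a large object that contains invalid data, the backup server requests from the object storage system the pointers of all objects in that large object; if the pointers of all objects in the large object are pointers of invalid data recorded for that large object in the deletion logs, the large object is determined to contain only invalid data. In the second, after identifying a large object that contains invalid data, the backup server requests from the object storage system the number of pointers of all objects in that large object; if the number of invalid data pointers recorded for that large object in the deletion logs equals the number of pointers of all objects in the large object obtained from the object storage system, the large object is determined to contain only invalid data.
In step 501, the large objects that contain both invalid and valid data and the large objects that contain only invalid data are handled differently, as described below.
After determining from all the deletion logs every large object that contains both invalid and valid data, the backup server 130 performs steps 510 to 560 in order: it obtains those large objects from the object storage system 140, saves their valid data to at least one newly created large object in the object storage system 140, and then instructs the object storage system 140 to delete the large objects that contain both invalid and valid data. See steps 510 to 560 below for details.
Likewise, after determining from all the deletion logs every large object that contains only invalid data, the backup server 130 performs steps 570 to 580 in order: it instructs the object storage system 140 to delete all large objects that contain only invalid data. See steps 570 and 580 below for details.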
在本实施方式中,步骤570和步骤510的执行顺序不分先后。
下面先描述备份服务器130根据多条删除日志确定了包括无效数据和有效数据的所有大对象后的处理方式,具体包括如下步骤510至560。
510、备份服务器130根据所有删除日志确定所述对象存储系统140中至少一个大对象包括无效数据和有效数据后,发送大对象获取请求至对象存储系统140。大对象获取请求包括所述包括无效数据和有效数据的大对象的标识。
例如,所有过期备份中包括无效数据和有效数据的所有大对象中包括第一目标大对象。大对象获取请求包括第一目标大对象的标识。第一目标大对象可以是大对象B和大对象E中的任一大对象。
大对象获取请求包括至少一个包括有效数据和无效数据的大对象的标识。例如,如果备份服务器130根据若干删除日志确定的包括无效数据和有效数据的大对象的数量不止一个,则备份服务器130发送大对象获取请求的实现方式有多种。例如,备份服务器130通过发送多个大对象获取请求至对象存储系统140,以请求获取多个包括无效数据和有效数据的大对象,每个大对象获取请求包括一个包括有效数据和无效数据的大对象的标识。或者,备份服务器130发送一个大对象获取请求,以请求获取多个大对象,所述大对象获取请求包括多个包括有效数据和无效数据的大对象的标识。
511、备份服务器130确定所述对象存储系统140中包括无效数据和有效数据的大对象后,生成至少一个新大对象的标识。至少一个新大对象的标识与至少一个新大对象一一对应。在本实施方式中,步骤510和511的执行顺序不分先后。本实施例中,以至少一个新大对象包括第一新大对象为例。
备份服务器130确定所述对象存储系统140中包括无效数据和有效数据的大对象后,根据包括无效数据和有效数据的至少一个大对象中的有效数据的数量和大对象实际可以保存的多个对象的数量确定创建的新大对象的标识的数量,创建的新大对象的标识的数量为包括无效数据和有效数据的至少一个大对象中的所有有效数据的数量与大对象实际可以保存的多个对象的数量的商进行取整加1运算后获得的数量。例如,包括无效数据和有效数据的大对象有两个,分别是大对象B和大对象E。大对象B中的B2以及大对象E中的E1是有效数据。有效数据的数量为2个,如果新大对象实际可以保存的对象的数量为3个,则创建新大对象的标识的数量为1。
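The count of new large object identifiers described above (integer quotient plus 1) can be written directly. The function names are hypothetical; the second function is not from the original text and is included only as a comparison, since the quotient-plus-1 rule allocates one spare identifier whenever the number of valid data items is an exact multiple of the capacity.

```python
def new_large_object_count(valid_count: int, capacity: int) -> int:
    """Rule stated in the text: floor(valid_count / capacity) + 1."""
    if valid_count < 0 or capacity <= 0:
        raise ValueError("valid_count must be >= 0 and capacity must be > 0")
    return valid_count // capacity + 1


def minimal_large_object_count(valid_count: int, capacity: int) -> int:
    """Comparison only (an assumption, not the source's rule): ceiling division."""
    if valid_count < 0 or capacity <= 0:
        raise ValueError("valid_count must be >= 0 and capacity must be > 0")
    return -(-valid_count // capacity)
```

For the example in the text, two valid data items (B2 and E1) and a capacity of three objects give one new large object identifier under either rule.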
可替换的,备份服务器130在创建新大对象的标识之前,可以先根据确认的有效数据的指针的排列顺序,依次确定待保存至新大对象的多个有效数据的指针,当依次确定的多 个有效数据的指针的数量达到大对象实际可以保存的多个对象的数量时,创建新大对象的标识。对于后续剩余的有效数据,备份服务器130也可以在后续剩余的有效数据中,按照有效数据的指针的排列顺序,依次确定待保存至另一个新大对象的多个有效数据的指针,当依次确定的多个有效数据的指针的数量达到大对象实际可以保存的多个对象的数量时,创建所述另一个新大对象的标识,以此类推以创建多个新大对象的标识。
520、对象存储系统140接收到大对象获取请求后,根据所述大对象获取请求查询大对象的标识对应的大对象。例如,对象存储系统140根据所述大对象获取请求中的第一目标大对象的标识查询对应的第一目标大对象。
在步骤510之后,执行步骤520。例如,对象存储系统140根据所述大对象获取请求中的第一目标大对象的标识查询对应的第一目标大对象可以是大对象B或大对象E。
521. The object storage system 140 sends the large object it found to the backup server 130. Step 521 is performed after step 520.
例如,对象存储系统140发送查询到的第一目标大对象至备份服务器130。比如对象存储系统140发送查询到的大对象B或大对象E至备份服务器130。
530、备份服务器130接收到包括无效数据和有效数据的至少一个大对象后,创建有效数据移动指令。
所述有效数据移动指令包括新大对象的标识、包括所述无效数据和有效数据的至少一个大对象中至少一个有效数据。所述有效数据移动指令用于指示所述对象存储系统140将所述包括无效数据和有效数据的大对象中的至少一个有效数据保存至与所述新大对象的标识对应的所述新大对象中。
所述有效数据移动指令还包括有效数据移动后在新大对象的位置标识。有效数据移动后在新大对象的位置标识用于表示有效数据移动后在新大对象中的位置。
备份服务器130接收到包括无效数据和有效数据的大对象后,解析包括无效数据和有效数据的大对象,确认包括无效数据和有效数据的大对象中的有效数据。备份服务器130确定了至少一个大对象中的至少一个有效数据后,创建有效数据移动指令。
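The at least one valid data item in a valid data move instruction can be composed in several ways. In one optional implementation, it consists of at least one valid data item from each of two or more large objects that contain both invalid and valid data. For example, it may consist of all the valid data in at least one large object that contains both invalid and valid data, together with some or all of the valid data in other such large objects.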
备份服务器130创建有效数据移动指令的数量不受本申请实施例的限制。例如备份服务器130可以创建至少一个有效数据移动指令。即可以创建一个有效数据移动指令,也可以创建多个有效数据移动指令。每个有效数据移动指令中的至少一个有效数据可以由包括所述无效数据和所述有效数据的至少一个大对象中的一部分或全部有效数据组成。
当每个有效数据移动指令中的至少一个有效数据由包括所述无效数据和所述有效数据的至少一个大对象中的一部分有效数据组成时,备份服务器130可以通过发送多个有效数据移动指令,以实现包括所述无效数据和所述有效数据的大对象中的所有有效数据至对象存储系统140的发送。
以上述包括有效数据和无效数据的第一目标大对象为例说明所有有效数据移动指令 的实现方式,例如所述有效数据移动指令包括所述第一新大对象的标识、所述第一目标大对象中至少一个有效数据,所述有效数据移动指令用于指示所述对象存储系统140将所述有效数据移动指令中的所述第一目标大对象中的所述至少一个有效数据保存至所述第一新大对象的标识对应的第一新大对象中。例如第一目标大对象为大对象B时,所述有效数据移动指令包括的所述第一目标大对象中至少一个有效数据可以是B2。例如第一目标大对象为大对象E时,所述有效数据移动指令包括的所述第一目标大对象中至少一个有效数据可以是E1。备份服务器130可以创建两个有效数据移动指令,一个有效数据移动指令包括有效数据B2,另一个有效数据移动指令包括E1,也可以创建一个有效数据移动指令,该有效数据移动指令包括两个有效数据B2和E1。
531、发送有效数据移动指令至对象存储系统140。
540、对象存储系统140接收到有效数据移动指令后,将有效数据移动指令中包括的至少一个有效数据保存至对象存储系统140中与所述新大对象的标识对应的所述新大对象中。
在步骤540之后,即对象存储系统140将有效数据移动指令中包括的至少一个有效数据保存至对象存储系统140中与所述新大对象的标识对应的所述新大对象后,发送有效数据保存完成信息至备份服务器130,以通知备份服务器130有效数据移动指令中包括的所有有效数据保存完成。
备份服务器130根据所有删除日志确定包括无效数据和有效数据的所有大对象后,通过上述步骤510至540将包括无效数据和有效数据的所有大对象中的所有有效数据保存至对象存储系统140中至少一个新大对象中。
基于上述步骤510至540,在对象存储系统140将包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140中的至少一个新大对象后,后续执行步骤541至560,即备份服务器130创建有效数据移动后的指针,有效数据移动后的指针用于表示有效数据保存至新大对象后在新大对象中的位置。备份服务器130创建有效数据移动后的指针后,由备份服务器130记录有效数据移动前和移动后的指针的对应关系,删除备份服务器130根据所有删除日志确定的包括无效数据和有效数据的大对象,以实现过期数据的删除。具体过程详见如下步骤541至560。
541、备份服务器130确定有效数据移动后的指针。步骤541与步骤531的执行顺序不分先后。
备份服务器130接收到对象存储系统140发送的有效数据保存完成信息后,可以确定有效数据移动后的指针。有效数据移动后的指针用于表示有效数据保存至新大对象后在新大对象中的位置。备份服务器130可以根据上述步骤530中创建的有效数据移动指令中的有效数据移动后在新大对象中的位置标识确定有效数据移动后的指针。
542、备份服务器130创建移动日志,并保存有效数据移动前和移动后的指针的对应关系至所述移动日志中。
在步骤542中,备份服务器130创建移动日志之前,预先创建移动日志的对象标识,保存虚拟机磁盘标识和所述移动日志的对象标识的对应关系。
由于备份服务器130对虚拟机磁盘数据进行备份后,客户端110对虚拟机磁盘数据的恢复频率不高,因此本发明实施例提供的过期备份处理方法中,每次处理过期备份时,备 份系统在将过期备份中的有效数据移动至新大对象后,并没有对所有未过期的备份分别对应的备份元数据进行更新,而是通过创建移动日志,将过期备份中有效数据移动后的指针保存至所述移动日志中,以确保后续通过未过期备份中的待恢复备份对虚拟机磁盘数据进行恢复时,可以根据移动日志中记录的有效数据移动后的指针确定待恢复备份对应的备份元数据,进而根据待恢复备份对应的备份元数据从对象存储系统140中获取待恢复备份对应的虚拟机磁盘数据。
移动日志是在备份系统处理过期备份时对过期备份中的有效数据进行移动后创建的,以便后续在恢复未过期的备份时,有针对性的对待恢复备份对应的备份元数据进行更新,简化了过期备份处理备份元数据的复杂度,提高了过期备份的处理效率。
543、备份服务器130发送移动日志至对象存储系统140。
备份服务器130保存有效数据移动前和移动后的指针的对应关系至所述移动日志中后,可以发送移动日志存储请求至对象存储系统140。所述移动日志存储请求包括移动日志的对象标识和移动日志。所述移动日志存储请求用于指示对象存储系统140将所述移动日志存储请求中的所述移动日志保存至所述移动日志的对象标识对应的对象中。
544、对象存储系统140接收到移动日志后,保存移动日志。
对象存储系统140接收到所述移动日志存储请求后,保存移动日志至所述移动日志的对象标识对应的对象中。
在步骤544之后,对象存储系统140发送大对象写完成消息至备份服务器130。对象存储系统140发送大对象写完成消息至备份服务器130的目的是通知备份服务器130已完成移动日志的保存。
550、发送第一对象删除指令至对象存储系统140。第一对象删除指令包括备份服务器130根据多条删除日志确定的包括无效数据和有效数据的每个大对象的标识。
备份服务器130可以在接收到大对象写完成消息后,发送第一对象删除指令至对象存储系统140。
560、对象存储系统140接收到第一对象删除指令后,根据第一对象删除指令删除包括无效数据和有效数据的每个大对象的标识分别对应的大对象。
下面描述备份服务器130根据所有删除日志确定了只包括无效数据的所有大对象后的处理方式,具体包括如下步骤570和580。
备份服务器130根据多条删除日志确定了只包括无效数据的所有大对象后,顺序执行如下步骤570至580,即备份服务器130指示对象存储系统140删除只包括无效数据的所有大对象。具体实现请参见以下步骤570和580。
570、备份服务器130根据多条删除日志确定只包括无效数据的所有大对象后,发送第二对象删除指令至对象存储系统140。第二对象删除指令包括备份服务器130根据多条删除日志确定的只包括无效数据的每个大对象的标识。
580、对象存储系统140接收到第二对象删除指令后,根据第二对象删除指令删除大对象的标识对应的大对象。
在本实施方式中,以上步骤510至540描述了备份系统如何将备份服务器130根据所有删除日志确定的包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140。具体实现方法为备份服务器130从对象存储系统140请求以及获取包括无效数 据和有效数据的所有大对象,然后由备份服务器130将包括无效数据和有效数据的所有大对象中的有效数据发送至对象存储系统140进行保存。
在另一种实现方式中,区别于步骤510至540描述的备份系统对有效数据进行保存的方法,本发明实施例还提供另一种有效数据的保存方法,即备份服务器130根据上述确认无效数据和有效数据的方法确认包含无效数据和有效数据的所有大对象后,通过多段复制技术指示对象存储系统140将包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140中至少一个新创建的新大对象中。与上述步骤510至540描述的有效数据的保存方法相比,本发明实施例提供的这种有效数据的保存方法,备份服务器130根据多条多条删除日志确定的包括无效数据和有效数据的所有大对象后,备份服务器130不用从对象存储系统140中请求读取包含无效数据和有效数据的所有大对象,而是通过多段复制技术指示对象存储系统140将包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140中至少一个新创建的新大对象中,减少了备份服务器130与对象存储系统140的交互流程,提高了备份系统的处理性能。本发明实施例提供的通过多段复制技术指示对象存储系统140保存有效数据的方法的具体实现方式可以参考后续图6描述的备份服务器130通过多段复制技术指示对象存储系统140保存有效数据的流程。
下面详细描述一下备份服务器130通过多段复制技术指示对象存储系统140保存有效数据的方法。请参见图6,图6为本发明实施例提供的备份服务器通过多段复制技术指示对象存储系统保存有效数据的方法流程图。如图6所示,本发明实施例提供的备份服务器130通过多段复制技术指示对象存储系统140保存有效数据的方法包括以下步骤。
610、备份服务器130根据所有删除日志确定包括无效数据和有效数据的所有大对象后,创建至少一个新大对象的标识。本步骤610创建至少一个新大对象的标识的细节可以参考上述步骤511的实现方式,具体实现细节不再在此赘述。
611、备份服务器130创建有效数据移动指令。
所述有效数据移动指令包括所述新大对象的标识和包括有效数据和无效数据的至少一个大对象分别对应的有效数据信息,所述有效数据信息包括所述包括无效数据和有效数据的大对象中至少一个连续的有效数据段中每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置,和/或者所述包括无效数据和有效数据的大对象中至少一个有效数据中每个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置。
在步骤611之前,备份服务器130可以根据步骤501确认所述对象存储系统140中保存的包括无效数据和有效数据的所有大对象的多个有效数据,确认有效数据在包括所述无效数据和有效数据的大对象中的偏移位置。
备份服务器130可以根据上述步骤611之前确认的有效数据在包括所述无效数据和有效数据的大对象中的偏移位置,创建有效数据移动指令。
如果所述有效数据信息包括所述包括无效数据和有效数据的大对象中至少一个有效数据中每个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置,至少一个有效数据可以是所述包括无效数据和有效数据的大对象中非连续的多个有效数据。
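If the valid data information includes the offset position, within the large object that contains both invalid and valid data, of each of at least one contiguous valid data segment, the valid data move instruction instructs the object storage system 140 to save, according to the valid data information, the at least one contiguous valid data segment stored in the object storage system 140 to the new large object corresponding to the new large object identifier.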
每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置的表现方式有两种,下面对每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置的两种表现方式分别进行说明。
每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置的第一种表现方式为,每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的起始位置和每个连续的有效数据段的大小。其中,每个连续的有效数据段的大小是根据每个连续的有效数据段包括的有效数据的数量确定的,每个有效数据分别为一个对象,由于每个对象包括的数据分片的数量是固定的,因此每个对象的大小是固定的。这样备份服务器130可以根据对象的大小以及每个连续的有效数据段包括的有效数据的数量确定每个连续的有效数据段的大小,每个连续的有效数据段的大小为对象的大小以及每个连续的有效数据段包括的有效数据的数量的乘积。每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的起始位置是根据每个连续的有效数据段的第一个有效数据在所述包括无效数据和有效数据的大对象中的起始位置确定的。如果一段连续的有效数据的第一个有效数据为所述包括无效数据和有效数据的大对象中的第j个对象,其中,所述j为大于0的整数,则该段连续的有效数据的第一个有效数据在所述包括无效数据和有效数据的大对象中的起始位置为(j-1)与对象的大小的乘积。例如,一段连续的有效数据的第一个有效数据为所述包括无效数据和有效数据的大对象中的第2个对象,如果对象的大小为16M,则该段连续的有效数据的第一个有效数据的起始位置为(2-1)与16M的乘积,即该段连续的有效数据的第一个有效数据的起始位置为16M。因此,该段连续的有效数据的起始位置为16M。
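In the second form, the offset position of each contiguous valid data segment is expressed as its start position and end position within the large object that contains both invalid and valid data. The start position is determined as described for the first form and is not repeated here. The end position of a contiguous valid data segment is determined by the end position, within the large object, of the last valid data item of the segment. If the last valid data item of a contiguous valid data segment is the n-th object in the large object, where n is an integer greater than 0, the end position of that last valid data item within the large object is the product of n and the object size.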
如果所述有效数据信息包括所述包括无效数据和有效数据的大对象中至少一个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置,至少一个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置有两种表现形式,下面分别进行说明。
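In the first form, the offset position of each valid data item is expressed as its start position within the large object that contains both invalid and valid data, plus the size of the valid data item. A valid data item is one object, so its size is fixed. The start position is determined as follows: if the valid data item is the k-th object in the large object, where k is an integer greater than 0, its start position is the product of (k-1) and the object size. For example, if the valid data item is the 4th object in the large object and the object size is 16M, its start position is (4-1) × 16M = 48M.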
至少一个有效数据中每个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置的第二种表现形式为,每个有效数据在所述包括无效数据和有效数据的大对象中的起始位置和结束位置。每个有效数据在所述包括无效数据和有效数据的大对象中的起始位置如上第一种表现方式中的细节描述,在此不再赘述。每个有效数据在所述包括无效数据和有效数据的大对象中的结束位置的确定方式为,如果该有效数据为所述包括无效数据和有效数据的大对象中的第t个对象,其中,所述t为大于0的整数,则该有效数据的结束位置为t与对象的大小的乘积。例如,该有效数据为所述包括无效数据和有效数据的大对象中的第4个对象,如果对象的大小为16M,则该有效数据的结束位置为4与16M的乘积,即该有效数据的结束位置为64M。
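The offset rules above, for both single valid data items and contiguous valid data segments, can be sketched together. The function name `valid_segments` and the choice to merge adjacent valid objects into one range are assumptions for illustration; sizes are given in the same unit as the object size (for example, M).

```python
def valid_segments(valid_positions: list, object_size: int) -> list:
    """Turn 1-based positions of valid objects inside a large object into
    (start, end) offset ranges, merging neighbours into contiguous segments.

    start of the k-th object = (k - 1) * object_size
    end of the k-th object   = k * object_size
    """
    segments = []
    for pos in sorted(valid_positions):
        start, end = (pos - 1) * object_size, pos * object_size
        if segments and segments[-1][1] == start:
            # Object directly follows the previous segment: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Objects 2 and 3 form one contiguous segment; object 5 stands alone.
ranges = valid_segments([2, 3, 5], 16)
```

Each resulting range corresponds to one piece of valid data information that a valid data move instruction could carry.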
有效数据移动指令中可以包括多个所述包括有效数据和无效数据的大对象分别对应的有效数据信息。每个大对象对应的有效数据信息包括至少一个连续的有效数据段中每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置,和/或者至少一个有效数据中每个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置。
备份服务器130可以创建至少一个有效数据移动指令,每个有效数据移动指令中的有效数据信息包括所述包括无效数据和有效数据的大对象中一部分有效数据的偏移位置。所以备份服务器130可以通过发送多个有效数据移动指令实现包括所述无效数据和所述有效数据的大对象中的所有有效数据的偏移位置至对象存储系统140的发送。所述包括无效数据和有效数据的大对象中一部分有效数据可以是所述包括无效数据和有效数据的大对象中部分连续的有效数据段和/或部分有效数据。所述包括无效数据和有效数据的大对象中一部分有效数据的偏移位置包括部分连续的有效数据段中每个连续的有效数据段在所述包括无效数据和有效数据的大对象中的偏移位置,和/或部分有效数据中每个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置。
612、备份服务器130发送有效数据移动指令至对象存储系统140。
620、对象存储系统140接收到有效数据移动指令后,根据有效数据移动指令中的有效数据信息中至少一个连续的有效数据段和/或至少一个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置,将对象存储系统140中保存的包括无效数据和有效数据的大对象中的至少一个连续的有效数据段和/或至少一个有效数据保存至所述新大对象的标识对应的新大对象中。
对象存储系统140接收到有效数据移动指令后,会先根据有效数据移动指令中所述新大对象的标识查询是否有创建所述新大对象的标识对应的新大对象。如果查询没有创建所述新大对象的标识对应的新大对象,则对象存储系统140会根据所述新的大对象的标识创建一个新的大对象,然后根据有效数据移动指令中的有效数据信息中至少一个连续的有效数据段和/或至少一个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置,将对象存储系统140中保存的包括无效数据和有效数据的大对象中的至少一个连续的有效数据段和/或至少一个有效数据保存至所述新大对象的标识对应的新大对象中。
对象存储系统140接收到有效数据移动指令后,如果根据所述新大对象的标识查询已 创建所述新大对象的标识对应的新大对象,则对象存储系统140会根据有效数据移动指令中的有效数据信息中至少一个连续的有效数据段和/或至少一个有效数据在所述包括无效数据和有效数据的大对象中的偏移位置,将对象存储系统140中保存的包括无效数据和有效数据的大对象中的至少一个连续的有效数据段和/或至少一个有效数据保存至所述新大对象的标识对应的新大对象中。
备份服务器130根据所有删除日志确定包括无效数据和有效数据的所有大对象后,通过上述步骤610至620将包括无效数据和有效数据的所有大对象中的所有有效数据保存至对象存储系统140中至少一个新大对象中。
与上述步骤510至540描述的保存有效数据的方法相比,基于图6所示的备份服务器130通过多段复制技术指示对象存储系统140保存有效数据的方法,备份服务器130根据多条删除日志确定的包括无效数据和有效数据的所有大对象后,备份服务器130不用从对象存储系统140中请求读取包含无效数据和有效数据的所有大对象,而是通过多段复制技术指示对象存储系统140将包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140的至少一个新大对象中,减少了备份服务器130与对象存储系统140的交互流程,提高了备份系统的处理性能。
在图6所示的步骤620之后,即对象存储系统140将包括无效数据和有效数据的所有大对象中的有效数据保存至对象存储系统140中的至少一个新大对象后,后续执行步骤541至560,即备份服务器130创建有效数据移动后的指针,有效数据移动后的指针用于表示有效数据保存至新大对象后在新大对象中的位置。备份服务器130创建有效数据移动后的指针后,由备份服务器130记录有效数据移动前和移动后的指针的对应关系,删除备份服务器130根据多条删除日志确定的包括无效数据和有效数据的大对象,以实现过期数据的删除。具体过程详见步骤541至560,在此不再赘述。
备份系统对虚拟机磁盘数据进行备份后,如果客户端110对虚拟机磁盘数据有恢复的需求,客户端110会发送数据恢复请求至备份服务器130。备份服务器130接收到数据恢复请求后会执行恢复流程对虚拟机磁盘数据进行恢复。备份服务器130对虚拟机磁盘数据进行恢复时,根据数据恢复请求中包括的待恢复备份标识获取待恢复备份对应的虚拟机磁盘数据,通过待恢复备份对应的虚拟机磁盘数据对虚拟机磁盘数据进行恢复,其中待恢复备份属于未过期的备份。
由于备份服务器130对虚拟机磁盘数据进行备份后,客户端110对虚拟机磁盘数据的恢复频率不高,因此本发明实施例提供的过期备份处理方法中,每次处理过期备份时,备份系统在将过期备份中的有效数据移动至新大对象后,并没有对所有未过期的备份分别对应的备份元数据进行更新,而是通过创建移动日志,将过期备份中有效数据移动后的指针保存至所述移动日志中,以确保后续通过未过期备份中的待恢复备份对虚拟机磁盘数据进行恢复时,可以根据移动日志中记录的有效数据移动后的指针确定待恢复备份对应的备份元数据,进而根据待恢复备份对应的备份元数据从对象存储系统140中获取待恢复备份对应的虚拟机磁盘数据。下面通过图7描述本发明实施例提供的如何通过移动日志对虚拟机磁盘数据进行恢复的方法。
请参见图7,图7为本发明实施例提供的恢复虚拟机磁盘数据的方法流程图。如图7所示,本发明实施例提供的虚拟机磁盘数据的恢复方法包括以下步骤。
700、备份服务器130接收客户端110发送的数据恢复请求。
所述数据恢复请求包括目标磁盘标识、虚拟机磁盘标识和与虚拟机磁盘数据的待恢复备份对应的待恢复备份标识。所述数据恢复请求用于指示基于待恢复备份标识对应的待恢复备份,恢复虚拟机磁盘数据至目标磁盘中。所述虚拟机磁盘数据包括连续的若干数据分片。与所述数据恢复请求中的待恢复备份标识对应的虚拟机磁盘数据的备份为待恢复备份。
目标磁盘标识例如可以是“生产数据”或者“业务数据”等中文字符,也可以是字母、数字或其他符号,也可以是字母、数字或其他符号的组合。具体实现不受本实施例的限制。
目标磁盘可以是之前用于存储虚拟机磁盘数据的虚拟机磁盘,也可以是其他磁盘。目标磁盘可以部署于虚拟机磁盘所在的存储节点120中或备份服务器130中,也可以部署于其他物理设备中,例如是其他的存储设备,可以是存储阵列。
在本步骤700中,所述备份服务器130可以接收客户端110发送的数据恢复请求。数据恢复请求中的待恢复备份标识可以通过备份时间标识或者备份版本标识或者备份次数标识实现。
701、所述备份服务器130创建待恢复备份的元数据获取请求。所述待恢复备份的元数据获取请求包括虚拟机磁盘标识和待恢复备份标识。
702、所述备份服务器130发送待恢复备份的元数据获取请求至对象存储系统140。
所述备份服务器130预先保存有所述待恢复备份标识与所述待恢复备份的备份元数据对应的元数据对象标识之间的对应关系。所述备份服务器130发送所述元数据获取请求之前,先根据所述待恢复备份标识与所述待恢复备份的备份元数据对应的元数据对象标识之间的对应关系,查找所述待恢复备份的备份元数据对应的元数据对象标识;然后根据所述待恢复备份的备份元数据对应的元数据对象标识创建所述待恢复备份的元数据获取请求。所述元数据获取请求包括所述待恢复备份的备份元数据对应的元数据对象标识,所述待恢复备份的元数据获取请求用于指示所述对象存储系统140根据与所述待恢复备份的备份元数据对应的元数据对象标识,获取所述待恢复备份的备份元数据。
703、所述备份服务器130发送移动日志获取请求至对象存储系统140。所述移动日志获取请求包括移动日志的对象标识。所述移动日志获取请求用于指示对象存储系统140发送所述虚拟机磁盘的所有移动日志。移动日志以对象存储方式保存在所述对象存储系统140中。
所述备份服务器130预先保存有虚拟机磁盘标识和移动日志的对象标识的对应关系。所述备份服务器130根据虚拟机磁盘标识和移动日志的对象标识的对应关系查找移动日志的对象标识。
步骤703与702的执行顺序不分先后。
710、对象存储系统140根据待恢复备份的元数据获取请求查找待恢复备份的备份元数据。
711、对象存储系统140根据移动日志的对象标识查找移动日志的对象标识对应的移动日志。
步骤711和710的执行顺序不分先后。
712、对象存储系统140发送与移动日志的对象标识对应的移动日志至备份服务器130。
步骤712和710的执行顺序不分先后。
713、对象存储系统140发送待恢复备份的备份元数据至备份服务器130。
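Steps 713 and 712 may be performed in either order, as may steps 713 and 711.
720. After receiving the backup metadata of the backup to be restored and all the move logs of the virtual machine disk from the object storage system 140, the backup server 130 updates the backup metadata of the backup to be restored according to all the move logs, obtaining modified backup metadata.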
由于在本发明实施例提供的过期备份处理方法中,备份系统对备份类型属于增量备份的过期备份处理后,并没有对未过期备份的备份元数据进行更新,所以备份服务器130在对虚拟机磁盘数据进行恢复时,备份服务器130接收到基于本发明实施例提供的过期备份处理方法所创建的所有移动日志之后,需要根据所有移动日志中记录的有效数据移动后的指针对待恢复备份的备份元数据进行更新。
备份服务器130根据所有移动日志更新待恢复备份的备份元数据时,备份服务器130将备份元数据中与移动日志中记录的有效数据移动前的相同指针进行修改,修改后的指针为移动日志中记录的与所述有效数据移动前的相同指针对应的有效数据移动后的指针。
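The pointer rewrite described above can be sketched as follows. Each move log is modelled as a dictionary from pre-move pointer to post-move pointer, and the logs are assumed to be supplied in the order in which they were created; both choices, and the function name, are assumptions for illustration.

```python
def apply_move_logs(metadata: list, move_logs: list) -> list:
    """Return a copy of the backup metadata pointer list in which every pointer
    that matches a pre-move pointer in a move log is replaced by the
    corresponding post-move pointer. Logs are applied in order."""
    updated = list(metadata)
    for log in move_logs:
        updated = [log.get(ptr, ptr) for ptr in updated]
    return updated


# B2 and E1 were moved into a new large object H as H1 and H2.
metadata = ["A1", "B2", "E1", "F1"]
restored_view = apply_move_logs(metadata, [{"B2": "H1", "E1": "H2"}])
```

Applying the logs one after another means a pointer that was moved more than once ends up at its latest location, provided the logs are in chronological order.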
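721. After obtaining the modified backup metadata of the backup to be restored, the backup server 130 creates a to-be-restored backup object acquisition request according to the modified backup metadata. The request includes pointers recorded in the modified backup metadata.
After updating the backup metadata of the backup to be restored to obtain the modified backup metadata, and before creating the to-be-restored backup object acquisition request, the backup server 130 first walks the pointers recorded in the modified backup metadata in order and identifies a run of consecutive pointers whose objects belong to the same large object. It then creates a to-be-restored backup object acquisition request that includes that run of consecutive pointers. The request instructs the object storage system 140 to look up, with a single request, multiple objects of the backup to be restored that belong to the same large object. If the pointers of the objects in one large object are not consecutive in the backup metadata of the backup to be restored, the backup server 130 must create multiple to-be-restored backup object acquisition requests, each containing only one run of consecutive pointers or a single standalone pointer.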
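Grouping consecutive pointers into acquisition requests can be sketched with `itertools.groupby`, which groups only adjacent items with the same key. Pointers are modelled as `(large_object_id, position)` tuples; that representation and the function name are assumptions for illustration.

```python
from itertools import groupby


def build_fetch_requests(pointers: list) -> list:
    """Split an ordered pointer list into runs of consecutive pointers that
    belong to the same large object; each run becomes one request, expressed
    as (large_object_id, [positions])."""
    return [
        (large_id, [pos for _large_id, pos in run])
        for large_id, run in groupby(pointers, key=lambda ptr: ptr[0])
    ]


# H appears twice because its pointers are not contiguous in the metadata.
requests = build_fetch_requests([("H", 1), ("H", 2), ("A", 1), ("H", 3)])
```

The example produces three requests for two large objects, which illustrates why non-contiguous pointers increase the number of interactions with the object storage system.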
722、备份服务器130发送所述待恢复备份对象获取请求至对象存储系统140。
730、对象存储系统140接收到所述待恢复备份对象获取请求后,根据所述修改后的备份元数据中记录的指针查找所述待恢复备份对应的虚拟机磁盘数据中的对象。
731、对象存储系统140发送查找到的所述待恢复备份对应的虚拟机磁盘数据中的对象。
740、备份服务器130接收到所述待恢复备份对应的虚拟机磁盘数据中的对象后,发送待恢复备份恢复指示至目标磁盘标识对应的目标磁盘所在的存储节点120。待恢复备份恢复指示包括所述待恢复备份对应的虚拟机磁盘数据中的对象和目标磁盘标识。
750、目标磁盘所在的存储节点120接收到待恢复备份对应的虚拟机磁盘数据中的对象后,将待恢复备份对应的虚拟机磁盘数据中的对象保存至目标磁盘标识对应的目标磁盘。
由于处理过期备份时,不同大对象中的有效数据会保存至同一个新大对象中,且没有对未过期备份的备份元数据中的指针进行修改。待恢复备份也为一个未过期备份,所以基于图7所示的虚拟机磁盘数据的恢复方法,通过步骤720对待恢复备份的备份元数据更新后,属于同一个新大对象的有效数据移动后的指针在待恢复备份对应的修改后的备份元数 据中不一定连续。所以,按照上述步骤721中根据连续的指针创建待恢复对象获取请求的实现方式,备份服务器130需要创建多个待恢复备份对象获取请求,并发送多个待恢复备份对象获取请求至对象存储系统140,增加了备份服务器130与对象存储系统140的交互次数,消耗了备份服务器130与对象存储系统140之间的传输资源。因此,本发明实施例还提供了一种对待恢复备份对应的修改后的备份元数据进行整理的方法,即对修改后的备份元数据中的所有指针的排列顺序进行整理,将属于同一个大对象的多个对象对应的多个指针排列在一起。本发明实施例对待恢复备份对应的修改后的备份元数据进行整理的具体实现方式有两种,下面对这两种对修改后的备份元数据进行整理的方法分别进行描述。
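In the first implementation of arranging the modified backup metadata of the backup to be restored, the backup server 130 identifies, according to the modified backup metadata, the multiple pointers corresponding to multiple objects that belong to the same large object, and saves those pointers to a first storage space addressed by one contiguous address range. It also saves all pointers in the modified backup metadata to a second storage space addressed by one contiguous address range. The second storage space includes the first storage space.
For example, the backup server 130 identifies, for each of multiple large objects in turn, the pointers corresponding to the objects that the large object contains, and saves the pointers of each large object to its own first storage space, so that multiple first storage spaces hold the pointers of all objects in the multiple large objects. The contiguous address ranges of the multiple first storage spaces may or may not be contiguous with one another. When the address ranges of any two first storage spaces are contiguous, the pointers stored in those two spaces include no independent pointer, that is, no pointer whose object does not share a large object with other objects. When the address ranges of the first storage spaces are not contiguous with one another, independent pointers are arranged between the pointers stored in the first storage spaces; the object of an independent pointer does not belong to the same large object as the objects of its neighboring pointers.
As another example, the backup server 130 may identify the pointers corresponding to objects that belong to the same large object, and identify the independent pointers corresponding to independent objects that do not share a large object. It then saves the former to a first storage space addressed by a contiguous address range and the latter to a third storage space addressed by a contiguous address range, where the end address of the first storage space is contiguous with the start address of the third storage space, or the start address of the first storage space is contiguous with the end address of the third storage space. The second storage space includes the first storage space and the third storage space; if there are multiple large objects, the second storage space includes multiple first storage spaces.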
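The second example of the first implementation (grouped pointers first, independent pointers after them) might look like the sketch below. It models the contiguous storage spaces as one Python list and keeps each pointer's original position so the disk order can be rebuilt after fetching. That bookkeeping, the `(large_object_id, offset)` pointer layout, and the name `arrange_by_large_object` are assumptions; the source does not say how the original order is preserved.

```python
from typing import Dict, List, Tuple

Pointer = Tuple[str, int]      # hypothetical: (large-object identifier, offset)
Entry = Tuple[int, Pointer]    # (original position in the metadata, pointer)


def arrange_by_large_object(pointers: List[Pointer]) -> List[Entry]:
    """Place pointers of the same large object next to each other."""
    groups: Dict[str, List[Entry]] = {}
    for position, ptr in enumerate(pointers):
        groups.setdefault(ptr[0], []).append((position, ptr))

    # "First storage spaces": one contiguous run per shared large object.
    grouped = [entry for run in groups.values() if len(run) > 1 for entry in run]
    # "Third storage space": independent pointers, stored after the runs.
    independent = [run[0] for run in groups.values() if len(run) == 1]
    return grouped + independent


arranged = arrange_by_large_object(
    [("A", 0), ("B", 0), ("A", 1), ("C", 5), ("B", 1)]
)
```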
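In the second implementation of arranging the modified backup metadata of the backup to be restored, the backup server 130 identifies, according to the modified backup metadata, the pointers corresponding to objects that belong to the same large object, creates a first index, and saves the correspondence between the first index and those pointers. It also identifies the independent pointers corresponding to independent objects that do not share a large object, creates a second index, and saves the correspondence between the second index and the independent pointer. If there are multiple large objects, there are multiple first indexes, equal in number to the large objects. If there are multiple independent pointers, there are multiple second indexes, equal in number to the independent pointers. All first indexes and all second indexes are saved in a storage space addressed by one contiguous address range.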
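The index-based arrangement can be sketched as follows. The data shapes are assumptions made for illustration: a first index is modeled as a dictionary key (the large-object identifier) that maps to its pointers, and the second indexes are modeled as positions in a list of independent pointers. The name `build_indexes` is hypothetical.

```python
from typing import Dict, List, Tuple

Pointer = Tuple[str, int]  # hypothetical: (large-object identifier, offset)


def build_indexes(pointers: List[Pointer]) -> Tuple[Dict[str, List[Pointer]], List[Pointer]]:
    """Build one first index per shared large object and one second index
    per independent pointer, without reordering the metadata itself."""
    by_object: Dict[str, List[Pointer]] = {}
    for ptr in pointers:
        by_object.setdefault(ptr[0], []).append(ptr)

    first_indexes = {oid: ptrs for oid, ptrs in by_object.items() if len(ptrs) > 1}
    second_indexes = [ptrs[0] for ptrs in by_object.values() if len(ptrs) == 1]
    return first_indexes, second_indexes


first, second = build_indexes([("A", 0), ("B", 0), ("A", 1), ("C", 5)])
```

Compared with physically reordering the pointers, an index leaves the metadata in disk order, so no extra position bookkeeping is needed to reassemble the disk data.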
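The embodiments of the present invention further provide a different data backup method, in which backing up the virtual machine disk data of at least one virtual machine disk to the object storage system 140 includes the following steps.
After obtaining multiple consecutive data slices of the virtual machine disk to be backed up to the object storage system 140, the backup server 130 determines, according to the positions of the data slices in the virtual machine disk data, a data set composed of a predetermined number of consecutive data slices.
After determining a first data set composed of a predetermined number of consecutive data slices, the backup server 130 calculates the weak hash value of the first data set, creates an identifier for a new large object, and saves the correspondence between the identifier of the new large object and the weak hash value.
After creating the identifier of the new large object, the backup server 130 sends a first data set save instruction to the object storage system 140. The instruction includes the first data set and the identifier of the new large object, and instructs the object storage system 140 to save the first data set to the new large object corresponding to that identifier.
After the backup server 130 sends the first data set save instruction, the object storage system 140 saves the first data set to the new large object corresponding to the identifier of the new large object.
Subsequently, when the backup server 130 needs to save another, second data set to the object storage system 140, it first calculates the weak hash value of the second data set and then checks whether the object storage system 140 holds a data set whose weak hash value is similar to that of the second data set. If so, it looks up the identifier of the new large object to which the similar data set belongs and checks whether that large object has reached a predefined size. If it has not, the backup server 130 sends a second data set save instruction to the object storage system 140. The instruction includes the second data set and the identifier of the new large object to which the similar data set belongs.
After receiving the second data set save instruction, the object storage system 140 saves the second data set to the new large object to which the data set with the similar weak hash value belongs. For example, the data set with the similar weak hash value may be the first data set.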
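The placement decision described above can be sketched as follows. The source does not define the weak hash or what counts as "similar", so both are left to the caller: weak hashes are plain integers and `is_similar` is a caller-supplied predicate. The class names, the generated identifiers, and the fallback of creating a new large object when no similar, non-full object exists are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LargeObject:
    object_id: str
    size: int = 0
    weak_hashes: List[int] = field(default_factory=list)


class Placer:
    """Chooses the large object that a data set should be saved into."""

    def __init__(self, max_size: int, is_similar: Callable[[int, int], bool]):
        self.max_size = max_size        # the "predefined size" of a large object
        self.is_similar = is_similar    # hypothetical similarity test
        self.objects: Dict[str, LargeObject] = {}

    def place(self, weak_hash: int, size: int) -> str:
        for obj in self.objects.values():
            not_full = obj.size < self.max_size
            if not_full and any(self.is_similar(weak_hash, h) for h in obj.weak_hashes):
                target = obj
                break
        else:
            # Assumed fallback: no similar object with room, so create one.
            target = LargeObject(object_id=f"large-{len(self.objects) + 1}")
            self.objects[target.object_id] = target
        target.size += size
        target.weak_hashes.append(weak_hash)
        return target.object_id


placer = Placer(max_size=8, is_similar=lambda a, b: abs(a - b) <= 2)
placements = [placer.place(100, 4), placer.place(101, 4),
              placer.place(102, 4), placer.place(500, 4)]
```

Grouping similar data sets into one large object means that data likely to be invalidated together tends to sit together, which should leave fewer large objects holding a mix of valid and invalid data when backups expire.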
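The structure of the backup server provided by the embodiments of the present invention is described below. The backup server implements the functions of the backup server 130 in the foregoing system embodiments; these functions may be implemented by hardware executing corresponding software.
Refer to FIG. 8, which is a structural diagram of a backup server provided by an embodiment of the present invention. As shown in FIG. 8, the backup server 130 includes a controller 210 and a storage device 220, which are connected to each other. The backup server 130 shown in FIG. 8 can be applied to the storage system shown in FIG. 1. The storage device 220 provides storage services for the controller 210.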
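The controller 210 is configured to, upon receiving a data backup request sent by the client 110 or upon determining that a preset time has been reached, perform a full or incremental backup of the data on a disk of the storage node 120 to the object storage system 140, and to create and save backup metadata and backup attribute information. A full backup backs up all data on the disk of the storage node 120 to the object storage system; an incremental backup backs up the modified data on the disk of the storage node 120 to the object storage system. The backup metadata indicates the position of each object within the disk data. It may record the identifier and pointer of each object that makes up the disk data, with the pointers recorded in the order in which the objects are arranged on the disk. The backup attribute information includes the backup identifier of the backup, the backup time, and the identifier of the backup metadata. After creating the backup attribute information, the controller 210 also saves the correspondence between the disk identifier of the disk of the storage node 120 and the backup attribute information.
Typically, the data backup request sent by the client 110 contains the disk identifier of the disk to be backed up, and the controller 210 performs a full or incremental backup of the data on the disk denoted by that identifier to the object storage system 140. For the detailed full and incremental backup procedures, refer to the description in the foregoing method embodiments of how the backup server 130 performs full and incremental backups; details are not repeated here.
As full and incremental backups of the disk data accumulate, the system holds more and more backups. To save storage space, the backups in the storage system need to be managed. The backup server 130 in the backup system provided by the embodiments of the present invention deletes expired backups as needed.
The controller 210 is configured to, upon detecting that the total number of pieces of backup attribute information for the disk exceeds a predetermined value, determine the earliest backup as the expired backup according to the backup times in all the backup attribute information. After determining the expired backup, the controller 210 identifies the invalid data in the expired backup by using the backup metadata of the expired backup and the backup metadata of the next backup adjacent to it, creates a deletion log, and saves the pointers of the invalid data in the expired backup to the deletion log. After creating the deletion log, the controller 210 is further configured to delete the backup attribute information of the expired backup, and to create and save the correspondence between the disk identifier and the identifier of the deletion log. Invalid data refers to the objects in the expired backup that contain data slices modified relative to the next backup adjacent to the expired backup.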
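One way to read the invalid-data identification above is as a comparison of two pointer lists: a pointer that the expired backup references but the next backup no longer references points at invalid data. The sketch below follows that reading. It is an assumption, since this passage does not spell out the comparison, and the `(large_object_id, offset)` pointer layout and the name `find_invalid_pointers` are hypothetical.

```python
from typing import List, Tuple

Pointer = Tuple[str, int]  # hypothetical: (large-object identifier, offset)


def find_invalid_pointers(expired_meta: List[Pointer], next_meta: List[Pointer]) -> List[Pointer]:
    """Pointers used by the expired backup but not by the next backup.

    The returned list is what would be written into the deletion log.
    """
    still_referenced = set(next_meta)
    return [ptr for ptr in expired_meta if ptr not in still_referenced]


deletion_log = find_invalid_pointers(
    expired_meta=[("A", 0), ("A", 1), ("A", 2)],
    next_meta=[("A", 0), ("B", 0), ("A", 2)],
)
```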
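The controller 210 checks whether a deletion condition is met and, if it is, obtains the multiple deletion logs corresponding to the disk data according to the correspondence between the disk identifier and the identifiers of the deletion logs. The deletion condition may be that the number of deletion logs corresponding to the disk data reaches a preset deletion threshold, that a preset deletion time is reached, or that a timer started when the deletion condition was last met has expired.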
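The three alternative deletion conditions can be expressed as a single check, as in the sketch below. The parameter names and the use of plain numeric timestamps are assumptions made for illustration; a condition that is not configured (left as `None`) is simply ignored.

```python
from typing import Optional


def deletion_condition_met(
    log_count: int,
    now: float,
    *,
    log_threshold: Optional[int] = None,    # preset deletion threshold
    deletion_time: Optional[float] = None,  # preset deletion time
    last_met: Optional[float] = None,       # when the condition was last met
    interval: Optional[float] = None,       # timer length started at last_met
) -> bool:
    """Return True if any configured deletion condition holds."""
    if log_threshold is not None and log_count >= log_threshold:
        return True
    if deletion_time is not None and now >= deletion_time:
        return True
    if last_met is not None and interval is not None and now - last_met >= interval:
        return True
    return False
```

Batching deletion work behind such a condition lets several deletion logs be processed in one pass, rather than touching the object storage system every time a backup expires.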
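The controller 210 is further configured to determine, according to the multiple deletion logs, the target large objects stored in the object storage system 140 that include both valid data and invalid data, and the large objects that include only invalid data. Valid data refers to the objects in the expired backup that contain data slices not modified relative to the next backup adjacent to the expired backup.
After determining a target large object stored in the object storage system 140 that includes both valid data and invalid data, the controller 210 is further configured to send a data migration instruction and an object deletion instruction to the object storage system 140. The data migration instruction instructs the object storage system 140 to migrate the valid data in the target large object to another large object, and the object deletion instruction instructs the object storage system 140 to delete the target large object. After determining a large object that includes only invalid data, the controller 210 is further configured to send an object deletion instruction to the object storage system 140, instructing it to delete that large object.
When valid data in a large object is moved to another large object, the pointers of that valid data change. To avoid updating the pointers to this valid data in the backup metadata of every other backup, a move log records the correspondence between the pre-move and post-move pointers of the valid data. Later, when a backup needs to be accessed, the backup server checks whether the backup metadata of that backup contains a pointer identical to a pre-move pointer recorded in the move log; if so, it changes that pointer in the backup metadata to the post-move pointer of the valid data. This avoids updating the backup metadata of all other backups each time valid data is moved.
After the valid data in a large object has been moved to another large object, the controller 210 is further configured to create a move log and save the correspondence between the pre-move and post-move pointers of the valid data to the move log. Later, when the controller 210 accesses moved valid data in the disk data through the backup metadata of an unexpired backup, it updates that backup metadata according to the correspondence saved in the move log. During the update, each pointer in the backup metadata of the unexpired backup that is identical to a pre-move pointer recorded in the move log is modified, and the modified pointer is the post-move pointer recorded in the move log.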
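Creating a move log during migration can be sketched as follows. The sketch assumes that a pointer is a `(large_object_id, offset)` tuple, that offsets count objects rather than bytes, and that migrated objects are appended one after another in the destination large object. None of these details is specified in the source, and `build_move_log` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Pointer = Tuple[str, int]  # hypothetical: (large-object identifier, offset)


def build_move_log(
    valid_pointers: List[Pointer],
    destination_id: str,
    start_offset: int = 0,
) -> Dict[Pointer, Pointer]:
    """Map each pre-move pointer to its post-move pointer in the destination."""
    return {
        old: (destination_id, start_offset + index)
        for index, old in enumerate(valid_pointers)
    }


move_log = build_move_log([("A", 1), ("B", 3)], destination_id="N")
```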
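For example, after receiving a data restoration request sent by the client 110 and determining the backup to be restored from among all unexpired backups, the controller 210 modifies the backup metadata of the backup to be restored according to the move logs, obtains all objects that make up the disk data according to the modified backup metadata, and then restores the disk data to the target disk.
In the backup server 130 shown in FIG. 8, the controller 210 includes a first interface 211, a second interface 212, and a control module 213. The control module 213 is connected to the first interface 211 and to the second interface 212. The first interface 211 is used to communicate with the client 110. The second interface 212 is used to communicate with the storage device 220 and the object storage system 140.
The control module 213 implements the functions of the controller 210; for implementation details, refer to the functional description of the controller 210 above.
In the backup server 130 shown in FIG. 8, the control module 213 includes a processor 214 and a memory 215. The processor 214 is connected to the first interface 211 and the second interface 212 and implements the functions of the controller 210 described above. The processor 214 is connected to the memory 215, and the memory 215 is connected to the first interface 211 and the second interface 212. The memory 215 temporarily stores information sent by the client or the object storage system 140, and also stores software programs and application modules. The processor 214 implements the various functions of the backup server 130 by running the software programs and application modules stored in the memory 215.
The processor 214 may be any computing device, such as a general-purpose central processing unit (CPU), a microprocessor, a programmable processor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the above solutions. In a specific implementation, as one embodiment, the processor 214 may include one or more CPUs.
The memory 215 may include volatile memory, for example random-access memory (RAM). The memory 215 may also include non-volatile memory, for example read-only memory (ROM), flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or a magnetic disk storage medium, but is not limited to these.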
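Refer to FIG. 9, which is a structural diagram of another backup server provided by an embodiment of the present invention. As shown in FIG. 9, the backup server 900 includes a creation module 910, a detection module 920, a determination module 930, and a first transceiver module 940. The modules are connected as follows: the detection module 920 is connected to the creation module 910 and to the determination module 930, and the first transceiver module 940 is connected to the determination module 930. In a specific implementation, the creation module 910, the detection module 920, and the determination module 930 may be implemented by the controller 210 or the processor 214 shown in FIG. 8, and the first transceiver module 940 may be implemented by the second interface 212 shown in FIG. 8. The functions of the modules shown in FIG. 9 are described below.
The creation module 910 is configured to, each time the backup server 900 determines an expired backup of first disk data, create a deletion log for the first disk data and save the pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all unexpired backups of the first disk data in the object storage system as of the current first moment. For implementation details of creating the deletion log and saving the pointers of the invalid data to it, refer to steps 400-421 shown in FIG. 4; details are not repeated here.
The detection module 920 is configured to detect whether a deletion condition is met and, if it is, obtain the multiple deletion logs corresponding to the first disk data.
The determination module 930 is configured to determine, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes valid data and the invalid data. For implementation details, refer to step 501 shown in FIG. 5; details are not repeated here.
The first transceiver module 940 is configured to send a data migration instruction and an object deletion instruction to the object storage system. The data migration instruction instructs the object storage system to migrate the valid data in the first target large object to another large object, and the object deletion instruction instructs the object storage system to delete the first target large object. For details of sending the data migration instruction, refer to steps 510-540 shown in FIG. 5 or steps 610-620 shown in FIG. 6. For details of sending the object deletion instruction, refer to steps 550-560 shown in FIG. 5. Details are not repeated here.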
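In an optional implementation, the detection module 920 is further configured to detect whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, and if it does, the deletion condition is met; or
to detect whether a preset deletion time is reached, and if it is, the deletion condition is met; or
to start a timer when the deletion condition was last met and detect whether the timer has expired, and if it has, the deletion condition is met.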
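In an optional implementation, the creation module 910 is further configured to create the post-move pointers of the valid data, create a move log for the first disk data, and save the correspondence between the pre-move and post-move pointers of the valid data to the move log. A post-move pointer indicates the position of the valid data in the other large object after the valid data has been moved there. For implementation details, refer to steps 541-544 shown in FIG. 5; details are not repeated here.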
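Based on the example shown in FIG. 9, in an optional implementation, refer to FIG. 10, which is a structural diagram of another backup server 1000 provided by an embodiment of the present invention. As shown in FIG. 10, the backup server 1000 further includes a second transceiver module 1010 and a processing module 1020. In a specific implementation, the processing module 1020 may be implemented by the controller 210 or the processor 214 shown in FIG. 8, and the second transceiver module 1010 may be implemented by the first interface 211 shown in FIG. 8. The functions of the modules in FIG. 10 that differ from those in FIG. 9 are described below.
The second transceiver module 1010 is configured to receive a data restoration request sent by the client 110. The data restoration request includes a first disk identifier, a second disk identifier, and the backup identifier of the backup to be restored, and instructs the backup server to restore the first disk data to the second disk based on the backup to be restored that corresponds to the backup identifier. The backup to be restored is any one of the unexpired backups of the first disk data. For details of how the second transceiver module 1010 receives the data restoration request sent by the client, refer to step 700 shown in FIG. 7; details are not repeated here.
The processing module 1020 is configured to obtain the backup metadata of the backup to be restored and all move logs of the first disk data. For details, refer to steps 701-713 shown in FIG. 7; details are not repeated here.
The processing module 1020 is further configured to check, according to all move logs of the first disk data, whether the backup metadata of the backup to be restored contains a pointer identical to a pre-move pointer of valid data recorded in the move logs, and if so, to change that pointer in the backup metadata to the post-move pointer corresponding to the pre-move pointer, obtaining modified backup metadata. The modified backup metadata includes unmodified pointers and modified pointers, and each modified pointer is a post-move pointer of valid data recorded in the move logs. For details of how the processing module 1020 obtains the modified backup metadata, refer to step 720 shown in FIG. 7; details are not repeated here.
The first transceiver module 940 is further configured to obtain, according to the modified backup metadata, the first disk data corresponding to the backup to be restored. For details, refer to steps 722-731 shown in FIG. 7; details are not repeated here.
The processing module 1020 is further configured to save the first disk data to the second disk. For details, refer to steps 740-750 shown in FIG. 7; details are not repeated here.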
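Based on the embodiment shown in FIG. 10, in an optional embodiment, refer to FIG. 11, which is a structural diagram of another backup server provided by an embodiment of the present invention. As shown in FIG. 11, the creation module 910 is connected to the processing module 1020.
The creation module 910 is further configured to create the object identifier of the move log. For implementation details, refer to step 543 shown in FIG. 5; details are not repeated here.
The processing module 1020 is further configured to save the correspondence between the first disk identifier and the object identifier of the move log, and to send a move log storage request to the object storage system. The move log storage request includes the object identifier of the move log and the move log, and instructs the object storage system to save the move log to the object corresponding to that object identifier. For implementation details, refer to step 543 shown in FIG. 5; details are not repeated here.
The processing module 1020 is further configured to obtain the object identifier of the move log according to the first disk identifier. For implementation details, refer to step 701 in FIG. 7; details are not repeated here.
The first transceiver module 940 is further configured to send a move log acquisition request. The move log acquisition request includes the object identifier of the move log and instructs the object storage system to obtain the move log from the object corresponding to that object identifier. For implementation details, refer to step 703 in FIG. 7; details are not repeated here.
The first transceiver module 940 is further configured to receive the move log sent by the object storage system. For implementation details, refer to steps 710-712 in FIG. 7; details are not repeated here.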
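Based on either embodiment shown in FIG. 9 or FIG. 10, in another optional embodiment, refer to FIG. 12, which is a structural diagram of another backup server provided by an embodiment of the present invention. As shown in FIG. 12, the first transceiver module 940 is connected to the detection module 920.
The determination module 930 is further configured to determine, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes invalid data, and to determine the data volume of all invalid data in the first target large object according to the predefined size of a piece of invalid data and the number of pieces of invalid data in the first target large object.
The first transceiver module 940 is further configured to send a data volume determination request to the object storage system. The data volume determination request includes the identifier of the first target large object and instructs the object storage system to send the data volume of the first target large object.
The first transceiver module 940 is further configured to receive data volume attribute information, which includes the data volume of the first target large object.
The detection module 920 is further configured to, upon detecting that the data volume of all invalid data in the first target large object is smaller than the data volume of the first target large object given in the data volume attribute information, determine that the first target large object stored in the object storage system is a first target large object that includes both invalid data and valid data.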
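In another implementation, based on the structure of the backup server 900 shown in FIG. 9, the creation module 910 is configured to, each time the backup server 900 determines an expired backup of first disk data, create a deletion log for the first disk data and save the pointers of the invalid data in the expired backup to the deletion log. The expired backup is the earliest of all unexpired backups of the first disk data in the object storage system as of the current first moment. For implementation details, refer to steps 400-421 shown in FIG. 4; details are not repeated here.
The detection module 920 is configured to detect whether a deletion condition is met and, if it is, obtain the multiple deletion logs corresponding to the first disk data.
The determination module 930 is configured to determine, according to the multiple deletion logs, a large object that is stored in the object storage system and includes only invalid data. For implementation details, refer to step 501 shown in FIG. 5; details are not repeated here.
The first transceiver module 940 is configured to send an object deletion instruction to the object storage system, instructing the object storage system to delete the large object that includes only invalid data. For implementation details, refer to steps 570 and 580 shown in FIG. 5; details are not repeated here.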
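In another implementation, the determination module 930 is further configured to determine, according to the multiple deletion logs, a large object that includes invalid data, and to determine the data volume of all invalid data in that large object according to the predefined size of a piece of invalid data and the number of pieces of invalid data in the large object.
The first transceiver module 940 is further configured to send a data volume determination request to the object storage system. The data volume determination request includes the identifier of the large object that includes invalid data and instructs the object storage system to send the data volume of that large object.
The first transceiver module 940 is further configured to receive data volume attribute information, which includes the data volume of the large object that includes invalid data.
The detection module 920 is further configured to, upon detecting that the data volume of all invalid data in the large object equals the data volume of the large object given in the data volume attribute information, determine that the large object stored in the object storage system is a large object that includes only invalid data.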
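The two size comparisons described above (smaller means mixed, equal means invalid only) can be combined into one classification pass, sketched below. The `(large_object_id, offset)` pointer layout, the in-memory `object_sizes` mapping standing in for the data volume attribute information returned by the object storage system, and the name `classify_large_objects` are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Pointer = Tuple[str, int]  # hypothetical: (large-object identifier, offset)


def classify_large_objects(
    deletion_logs: Iterable[Iterable[Pointer]],
    invalid_unit_size: int,
    object_sizes: Dict[str, int],
) -> Tuple[List[str], List[str]]:
    """Return (mixed, invalid_only) lists of large-object identifiers."""
    # A pointer may appear in more than one log; count each piece once.
    unique_pointers = {ptr for log in deletion_logs for ptr in log}
    invalid_counts = Counter(ptr[0] for ptr in unique_pointers)

    mixed: List[str] = []
    invalid_only: List[str] = []
    for object_id in sorted(invalid_counts):
        invalid_bytes = invalid_counts[object_id] * invalid_unit_size
        if invalid_bytes < object_sizes[object_id]:
            mixed.append(object_id)          # migrate valid data, then delete
        elif invalid_bytes == object_sizes[object_id]:
            invalid_only.append(object_id)   # delete directly
    return mixed, invalid_only


mixed, invalid_only = classify_large_objects(
    deletion_logs=[[("A", 0), ("A", 1)], [("A", 1), ("B", 0)]],
    invalid_unit_size=4,
    object_sizes={"A": 8, "B": 16},
)
```

Comparing sizes instead of enumerating the contents of each large object keeps the check to one small metadata request per object.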
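It should be understood that, in the various embodiments of the present invention, the sequence numbers of the above processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
In the several embodiments provided by the present invention, it should be understood that the disclosed devices, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another device, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated modules may be implemented in hardware or as hardware combined with software functional modules.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, which may be any usable medium that a computer can read, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid-state drive (SSD)).
The foregoing is merely a specific implementation of the present invention, and the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.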
Claims (20)
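- An expired backup processing method, wherein the method is performed by a backup server and comprises: each time an expired backup of first disk data is determined, creating a deletion log for the first disk data and saving pointers of invalid data in the expired backup to the deletion log, wherein the expired backup is the earliest of all unexpired backups of the first disk data in an object storage system as of a current first moment; detecting whether a deletion condition is met and, if the deletion condition is met, obtaining multiple deletion logs corresponding to the first disk data; determining, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes valid data and the invalid data; and sending a data migration instruction and an object deletion instruction to the object storage system, wherein the data migration instruction instructs the object storage system to migrate the valid data in the first target large object to another large object, and the object deletion instruction instructs the object storage system to delete the first target large object.
- The method according to claim 1, wherein detecting whether the deletion condition is met comprises: detecting whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, wherein the deletion condition is met if the preset deletion threshold is reached; or detecting whether a preset deletion time is reached, wherein the deletion condition is met if the preset deletion time is reached; or starting a timer when the deletion condition was last met and detecting whether the timer has expired, wherein the deletion condition is met if the timer has expired.
- The method according to claim 1 or 2, further comprising, after determining, according to the multiple deletion logs, the first target large object that is stored in the object storage system and includes valid data and the invalid data: creating post-move pointers of the valid data, wherein a post-move pointer indicates the position of the valid data in the other large object after the valid data is moved to the other large object; and creating a move log for the first disk data and saving the correspondence between the pre-move pointers and the post-move pointers of the valid data to the move log.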
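- The method according to claim 3, further comprising: receiving a data restoration request sent by a client, wherein the data restoration request includes a first disk identifier, a second disk identifier, and a backup identifier of a backup to be restored, the data restoration request instructs restoring the first disk data to the second disk based on the backup to be restored that corresponds to the backup identifier, and the backup to be restored is any one of the unexpired backups of the first disk data; obtaining backup metadata of the backup to be restored and all move logs of the first disk data; checking, according to all move logs of the first disk data, whether the backup metadata of the backup to be restored contains a pointer identical to a pre-move pointer of valid data recorded in the move logs and, if so, changing that pointer in the backup metadata to the post-move pointer corresponding to the pre-move pointer, to obtain modified backup metadata, wherein the modified backup metadata includes unmodified pointers and modified pointers, and each modified pointer is a post-move pointer of valid data recorded in the move logs; obtaining, according to the modified backup metadata, the first disk data corresponding to the backup to be restored; and saving the first disk data to the second disk.
- The method according to claim 4, further comprising, after saving the correspondence between the pre-move pointers and the post-move pointers of the valid data to the move log: creating an object identifier of the move log; and saving a correspondence between the first disk identifier and the object identifier of the move log, and sending a move log storage request to the object storage system, wherein the move log storage request includes the object identifier of the move log and the move log, and instructs the object storage system to save the move log to the object corresponding to the object identifier of the move log; wherein obtaining all move logs of the first disk data comprises: obtaining the object identifier of the move log according to the first disk identifier, and sending a move log acquisition request to the object storage system, wherein the move log acquisition request includes the object identifier of the move log and instructs the object storage system to obtain the move log from the object corresponding to the object identifier of the move log; and receiving the move log sent by the object storage system.
- The method according to any one of claims 1 to 5, wherein determining, according to the multiple deletion logs, the first target large object that is stored in the object storage system and includes valid data and the invalid data comprises: determining, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes invalid data; determining the data volume of all invalid data in the first target large object according to a predefined size of a piece of invalid data and the number of pieces of invalid data in the first target large object; sending a data volume determination request to the object storage system, wherein the data volume determination request includes the identifier of the first target large object and instructs the object storage system to send the data volume of the first target large object; receiving data volume attribute information that includes the data volume of the first target large object; and if the data volume of all invalid data in the first target large object is smaller than the data volume of the first target large object in the data volume attribute information, determining that the first target large object stored in the object storage system is a first target large object that includes invalid data and valid data.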
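- An expired backup processing method, wherein the method is performed by a backup server and comprises: each time an expired backup of first disk data is determined, creating a deletion log for the first disk data and saving pointers of invalid data in the expired backup to the deletion log, wherein the expired backup is the earliest of all unexpired backups of the first disk data in an object storage system as of a current first moment; detecting whether a deletion condition is met and, if the deletion condition is met, obtaining multiple deletion logs corresponding to the first disk data; determining, according to the multiple deletion logs, a large object that is stored in the object storage system and includes only invalid data; and sending an object deletion instruction to the object storage system, wherein the object deletion instruction instructs the object storage system to delete the large object that includes only invalid data.
- The method according to claim 7, wherein detecting whether the deletion condition is met comprises: detecting whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, wherein the deletion condition is met if the preset deletion threshold is reached; or detecting whether a preset deletion time is reached, wherein the deletion condition is met if the preset deletion time is reached; or starting a timer when the deletion condition was last met and detecting whether the timer has expired, wherein the deletion condition is met if the timer has expired.
- The method according to claim 7 or 8, wherein determining, according to the multiple deletion logs, the large object that is stored in the object storage system and includes only invalid data comprises: determining, according to the multiple deletion logs, a large object that includes invalid data; determining the data volume of all invalid data in the large object that includes invalid data according to a predefined size of a piece of invalid data and the number of pieces of invalid data in that large object; sending a data volume determination request to the object storage system, wherein the data volume determination request includes the identifier of the large object that includes invalid data and instructs the object storage system to send the data volume of that large object; receiving data volume attribute information that includes the data volume of the large object that includes invalid data; and if the data volume of all invalid data in the large object equals the data volume of the large object in the data volume attribute information, determining that the large object stored in the object storage system is a large object that includes only invalid data.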
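- A backup server, comprising: a creation module, configured to, each time the backup server determines an expired backup of first disk data, create a deletion log for the first disk data and save pointers of invalid data in the expired backup to the deletion log, wherein the expired backup is the earliest of all unexpired backups of the first disk data in an object storage system as of a current first moment; a detection module, configured to detect whether a deletion condition is met and, if the deletion condition is met, obtain multiple deletion logs corresponding to the first disk data; a determination module, configured to determine, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes valid data and the invalid data; and a first transceiver module, configured to send a data migration instruction and an object deletion instruction to the object storage system, wherein the data migration instruction instructs the object storage system to migrate the valid data in the first target large object to another large object, and the object deletion instruction instructs the object storage system to delete the first target large object.
- The backup server according to claim 10, wherein the detection module is further configured to: detect whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, wherein the deletion condition is met if the preset deletion threshold is reached; or detect whether a preset deletion time is reached, wherein the deletion condition is met if the preset deletion time is reached; or start a timer when the deletion condition was last met and detect whether the timer has expired, wherein the deletion condition is met if the timer has expired.
- The backup server according to claim 10 or 11, wherein the creation module is further configured to create post-move pointers of the valid data, create a move log for the first disk data, and save the correspondence between the pre-move pointers and the post-move pointers of the valid data to the move log, wherein a post-move pointer indicates the position of the valid data in the other large object after the valid data is moved to the other large object.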
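- The backup server according to claim 12, further comprising: a second transceiver module, configured to receive a data restoration request sent by a client, wherein the data restoration request includes a first disk identifier, a second disk identifier, and a backup identifier of a backup to be restored, the data restoration request instructs restoring the first disk data to the second disk based on the backup to be restored that corresponds to the backup identifier, and the backup to be restored is any one of the unexpired backups of the first disk data; and a processing module, configured to obtain backup metadata of the backup to be restored and all move logs of the first disk data; wherein the processing module is further configured to check, according to all move logs of the first disk data, whether the backup metadata of the backup to be restored contains a pointer identical to a pre-move pointer of valid data recorded in the move logs and, if so, change that pointer in the backup metadata to the post-move pointer corresponding to the pre-move pointer, to obtain modified backup metadata, wherein the modified backup metadata includes unmodified pointers and modified pointers, and each modified pointer is a post-move pointer of valid data recorded in the move logs; the first transceiver module is further configured to obtain, according to the modified backup metadata, the first disk data corresponding to the backup to be restored; and the processing module is further configured to save the first disk data to the second disk.
- The backup server according to claim 13, wherein the creation module is further configured to create an object identifier of the move log; the processing module is further configured to save a correspondence between the first disk identifier and the object identifier of the move log, and to send a move log storage request to the object storage system, wherein the move log storage request includes the object identifier of the move log and the move log, and instructs the object storage system to save the move log to the object corresponding to the object identifier of the move log; the processing module is further configured to obtain the object identifier of the move log according to the first disk identifier; the first transceiver module is further configured to send a move log acquisition request, wherein the move log acquisition request includes the object identifier of the move log and instructs the object storage system to obtain the move log from the object corresponding to the object identifier of the move log; and the first transceiver module is further configured to receive the move log sent by the object storage system.
- The backup server according to any one of claims 10 to 14, wherein the determination module is further configured to determine, according to the multiple deletion logs, a first target large object that is stored in the object storage system and includes invalid data, and to determine the data volume of all invalid data in the first target large object according to a predefined size of a piece of invalid data and the number of pieces of invalid data in the first target large object; the first transceiver module is further configured to send a data volume determination request to the object storage system, wherein the data volume determination request includes the identifier of the first target large object and instructs the object storage system to send the data volume of the first target large object; the first transceiver module is further configured to receive data volume attribute information that includes the data volume of the first target large object; and the detection module is further configured to, upon detecting that the data volume of all invalid data in the first target large object is smaller than the data volume of the first target large object in the data volume attribute information, determine that the first target large object stored in the object storage system is a first target large object that includes invalid data and valid data.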
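- A backup server, comprising: a creation module, configured to, each time the backup server determines an expired backup of first disk data, create a deletion log for the first disk data and save pointers of invalid data in the expired backup to the deletion log, wherein the expired backup is the earliest of all unexpired backups of the first disk data in an object storage system as of a current first moment; a detection module, configured to detect whether a deletion condition is met and, if the deletion condition is met, obtain multiple deletion logs corresponding to the first disk data; a determination module, configured to determine, according to the multiple deletion logs, a large object that is stored in the object storage system and includes only invalid data; and a transceiver module, configured to send an object deletion instruction to the object storage system, wherein the object deletion instruction instructs the object storage system to delete the large object that includes only invalid data.
- The backup server according to claim 16, wherein the detection module is further configured to: detect whether the number of deletion logs corresponding to the first disk data reaches a preset deletion threshold, wherein the deletion condition is met if the preset deletion threshold is reached; or detect whether a preset deletion time is reached, wherein the deletion condition is met if the preset deletion time is reached; or start a timer when the deletion condition was last met and detect whether the timer has expired, wherein the deletion condition is met if the timer has expired.
- The backup server according to claim 16 or 17, wherein the determination module is further configured to determine, according to the multiple deletion logs, a large object that includes invalid data, and to determine the data volume of all invalid data in that large object according to a predefined size of a piece of invalid data and the number of pieces of invalid data in that large object; the transceiver module is further configured to send a data volume determination request to the object storage system, wherein the data volume determination request includes the identifier of the large object that includes invalid data and instructs the object storage system to send the data volume of that large object; the transceiver module is further configured to receive data volume attribute information that includes the data volume of the large object that includes invalid data; and the detection module is further configured to, upon detecting that the data volume of all invalid data in the large object equals the data volume of the large object in the data volume attribute information, determine that the large object stored in the object storage system is a large object that includes only invalid data.
- A backup server, comprising an interface, a memory, and a processor, wherein the interface is used to communicate with an object storage system, the memory is used to store a software program, and the processor performs the expired backup processing method according to any one of claims 1 to 9 by running the software program stored in the memory.
- A computer-readable storage medium, wherein the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the expired backup processing method according to any one of claims 1 to 9.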
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/118689 WO2019127034A1 (zh) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
CN201780002847.5A CN110998537B (zh) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
JP2019517325A JP6968876B2 (ja) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
EP17933183.0A EP3537302B1 (en) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
US16/908,923 US11615000B2 (en) | 2017-12-26 | 2020-06-23 | Method and backup server for processing expired backups |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/118689 WO2019127034A1 (zh) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/908,923 Continuation US11615000B2 (en) | 2017-12-26 | 2020-06-23 | Method and backup server for processing expired backups |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019127034A1 (zh) | 2019-07-04 |
Family
ID=67064270
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/118689 WO2019127034A1 (zh) | 2017-12-26 | 2017-12-26 | Expired backup processing method and backup server |
Country Status (5)
Country | Link |
---|---|
US (1) | US11615000B2 (zh) |
EP (1) | EP3537302B1 (zh) |
JP (1) | JP6968876B2 (zh) |
CN (1) | CN110998537B (zh) |
WO (1) | WO2019127034A1 (zh) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111597078A (zh) * | 2020-05-15 | 2020-08-28 | 山东汇贸电子口岸有限公司 | Scheduled backup method and system for copying Ceph block storage data to object storage |
US11947493B2 (en) * | 2022-03-16 | 2024-04-02 | Rubrik, Inc. | Techniques for archived log deletion |
US12079094B2 (en) * | 2023-01-17 | 2024-09-03 | Zilliz Inc. | Data backup method, data recovery method, and electronic equipment |
US11876864B1 (en) * | 2023-02-13 | 2024-01-16 | Dell Products L.P. | Using predictive analytics on SFP metrics to influence the target port selection process |
US20240281338A1 (en) * | 2023-02-22 | 2024-08-22 | Bank Of America Corporation | Systems, methods, and apparatuses for determining and applying a backup file attribution to files in an electronic network |
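CN116340732B (zh) * | 2023-05-29 | 2023-08-04 | 天翼云科技有限公司 | Method, apparatus, and electronic device for automatically cleaning up expired data |
CN116560914B (zh) * | 2023-07-10 | 2023-10-13 | 成都云祺科技有限公司 | Incremental backup method, system, and storage medium for when virtual machine CBT is invalid |
CN116661706B (zh) * | 2023-07-26 | 2023-11-14 | 江苏华存电子科技有限公司 | Cache cleanup analysis method and system for a solid-state drive |
CN117349086B (zh) * | 2023-12-04 | 2024-02-23 | 四川精容数安科技有限公司 | Method for permanent incremental backup of an entire Windows machine |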
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101453490A (zh) * | 2008-12-23 | 2009-06-10 | 上海爱数软件有限公司 | Synthetic backup method and apparatus |
US20130173548A1 (en) * | 2012-01-02 | 2013-07-04 | International Business Machines Corporation | Method and system for backup and recovery |
CN103399806A (zh) * | 2013-07-26 | 2013-11-20 | 安徽省徽商集团有限公司 | Network backup update management method and system |
CN103645971A (zh) * | 2013-12-13 | 2014-03-19 | 江苏名通信息科技有限公司 | File backup and transfer method under a Linux system |
CN105740098A (zh) * | 2016-01-26 | 2016-07-06 | 浪潮(北京)电子信息产业有限公司 | Method and system for determining expired data in backup data |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3632539B2 (ja) * | 2000-01-11 | 2005-03-23 | 日本電気株式会社 | Automatic backup device, automatic backup method, and recording medium storing an automatic backup program |
US7756833B2 (en) * | 2004-09-22 | 2010-07-13 | Microsoft Corporation | Method and system for synthetic backup and restore |
US7694088B1 (en) * | 2005-03-31 | 2010-04-06 | Symantec Operating Corporation | System and method for efficient creation of aggregate backup images |
JP4883027B2 (ja) * | 2008-02-27 | 2012-02-22 | 日本電気株式会社 | Backup device, control method therefor, and program |
CN101937377B (zh) * | 2009-06-29 | 2014-10-22 | 百度在线网络技术(北京)有限公司 | Data recovery method and apparatus |
US9367401B2 (en) * | 2014-09-30 | 2016-06-14 | Storagecraft Technology Corporation | Utilizing an incremental backup in a decremental backup system |
US9626250B2 (en) * | 2015-03-16 | 2017-04-18 | International Business Machines Corporation | Data synchronization of block-level backup |
US10942813B2 (en) * | 2015-10-30 | 2021-03-09 | Netapp, Inc. | Cloud object data layout (CODL) |
US10228962B2 (en) * | 2015-12-09 | 2019-03-12 | Commvault Systems, Inc. | Live synchronization and management of virtual machines across computing and virtualization platforms and using live synchronization to support disaster recovery |
-
2017
- 2017-12-26 WO PCT/CN2017/118689 patent/WO2019127034A1/zh unknown
- 2017-12-26 JP JP2019517325A patent/JP6968876B2/ja active Active
- 2017-12-26 CN CN201780002847.5A patent/CN110998537B/zh active Active
- 2017-12-26 EP EP17933183.0A patent/EP3537302B1/en active Active
-
2020
- 2020-06-23 US US16/908,923 patent/US11615000B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3537302A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP3537302A4 (en) | 2020-01-22 |
US20200319976A1 (en) | 2020-10-08 |
CN110998537A (zh) | 2020-04-10 |
JP6968876B2 (ja) | 2021-11-17 |
EP3537302B1 (en) | 2022-01-19 |
CN110998537B (zh) | 2022-09-02 |
JP2020506444A (ja) | 2020-02-27 |
US11615000B2 (en) | 2023-03-28 |
EP3537302A1 (en) | 2019-09-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2019517325 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2017933183 Country of ref document: EP Effective date: 20190604 |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17933183 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |