WO2020082744A1 - Method, device, and system for backing up data - Google Patents

Method, device, and system for backing up data

Info

Publication number
WO2020082744A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage device
lun
backup
snapshot
Prior art date
Application number
PCT/CN2019/090090
Other languages
English (en)
French (fr)
Inventor
张磊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP19874958.2A (EP3862883B1)
Publication of WO2020082744A1
Priority to US17/235,557 (US11907078B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1435 Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1453 Management of the data involved in backup or backup restore using de-duplication of the data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1466 Management of the backup or restore process to make the backup process non-disruptive
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84 Using snapshots, i.e. a logical point-in-time copy of the data

Definitions

  • This application relates to the field of storage technology, and in particular, to a method, device, and system for backing up data.
  • in network attached storage (NAS), file services are provided by connecting file servers to existing networks.
  • the file server is a device dedicated to file services.
  • the file server can be used for storage, retrieval, and file access functions for applications and clients.
  • the data on the file server is usually backed up. If the storage device used for backup is not provided by the manufacturer of the above-mentioned file server, then when doing backup, the following scheme is usually adopted:
  • for the first backup, obtain the data of all files in the shared directory of the file server, and store all the obtained data in the backup storage device.
  • when the data in a file changes, obtain a list of the files that have changed since the previous backup, and then back up the changed files to the backup storage device; that is, the second and all subsequent backups are incremental backups.
  • This application proposes a method, device, and system for backing up data, which are used to reduce unnecessary duplicate data in backup storage.
  • an embodiment of the present application provides a method for backing up data, and the method is executed by a backup server.
  • the method includes: after a data backup is triggered, sending a change information acquisition request to the file server, where the change information acquisition request is used to request the change information of the data on the file server; receiving a data change record between the second snapshot and the first snapshot returned by the file server, where the first snapshot is created before the second snapshot, the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks; acquiring, from the file server according to the data change record, the data of the data blocks identified by the identification information; and storing the acquired data in a backup storage device and establishing a data mapping relationship for this backup, where the data mapping relationship includes identification information of all data blocks in the second snapshot, the storage locations of all data blocks in the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.
  • the file server and the backup storage device are heterogeneous devices.
  • the data mapping relationship may be stored in the backup storage device.
  • the identification information of the data block includes: a data block identification and a file identification of the file to which the data block belongs.
  • the obtained data change records are for data blocks, so in incremental backup, the data blocks that changed between the second snapshot and the first snapshot are obtained according to the data change records, and these data blocks are backed up and stored.
  • this avoids backing up the entire file whenever any data in the file changes, as is done in the prior art.
  • duplicate data during incremental backup is reduced, and the performance of backup storage is improved.
  • the data change record further includes a corresponding operation identifier, and establishing the data mapping relationship includes: copying the data mapping relationship of the last backup in the backup storage device, and modifying the copied data mapping relationship according to the data change record to obtain the data mapping relationship of this backup.
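The copy-and-modify step for the data mapping relationship can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dictionary layout, the function name `build_mapping`, and the string operation identifiers are all assumptions introduced here. The source only states that a record carries a file identifier, a data block identifier, and an operation identifier.

```python
# Hypothetical operation identifiers; the source names only add/delete/modify.
ADD, DELETE, MODIFY = "add", "delete", "modify"


def build_mapping(prev_mapping, change_records, new_locations, snapshot_id):
    """Derive this backup's mapping from the previous backup's mapping.

    prev_mapping:   {"snapshot_id": str, "blocks": {(file_id, block_id): location}}
    change_records: iterable of (file_id, block_id, operation)
    new_locations:  {(file_id, block_id): location} for blocks stored in this backup
    """
    # Copy first, so the previous backup's mapping stays intact for restores.
    blocks = dict(prev_mapping["blocks"])
    for file_id, block_id, op in change_records:
        key = (file_id, block_id)
        if op == DELETE:
            blocks.pop(key, None)
        elif op in (ADD, MODIFY):
            # Point the block at the location written during this backup.
            blocks[key] = new_locations[key]
        else:
            raise ValueError(f"unknown operation identifier: {op!r}")
    return {"snapshot_id": snapshot_id, "blocks": blocks}
```

Unchanged blocks keep pointing at the locations written by earlier backups, which is why only the changed blocks need to be stored again.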
  • before sending the change information acquisition request to the file server, the method further includes: sending a snapshot request to the file server; and receiving a snapshot creation success response returned by the first device, where the response includes the snapshot identifier of the second snapshot. If the backup is for a shared directory, the snapshot request may also include indication information of the shared directory. In this way, when the file server receives the snapshot request, it creates a snapshot of the shared directory, which is more targeted than taking a snapshot of the entire file server.
  • the change information acquisition request includes the snapshot identifier of the second snapshot and the snapshot identifier of the first snapshot.
  • the identifiers of the first snapshot and the second snapshot tell the file server which two snapshots the requested data change information lies between.
  • the above-mentioned first snapshot and second snapshot are both initiated based on the backup, and the data change information between the two snapshots means what changes have occurred in the data between the two backups.
  • the second aspect of the present application provides a method for backing up data performed by a file server.
  • the method includes: receiving a change information acquisition request sent by a backup server, where the change information acquisition request is used to request change information of data on the file server; determining the change information of the data on the file server, and returning to the backup server a data change record between the second snapshot and the first snapshot, where the first snapshot is created before the second snapshot, the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes the identification information of the changed data blocks; receiving a data acquisition request sent by the backup server, where the data acquisition request includes the identification information of the data block to be acquired; and returning the data of the data block identified by the identification information to the backup server.
  • the change information acquisition request includes the snapshot identifier of the second snapshot and the snapshot identifier of the first snapshot.
  • the identification information of the data block includes: a data block identification and a file identification of the file to which the data block belongs.
  • the file server can provide data change records at the data block level and, based on the request of the backup server, return the data of the data blocks that changed between two snapshots (that is, between two backups) to the backup server, so that the backup server can perform incremental backup at the data block level and avoid repeatedly backing up the data in some files.
  • the method further includes: tracking data operations of the client on the file server; and recording data change records between two adjacent snapshots, where the data change record between the two adjacent snapshots includes the identification information of the data blocks that changed between the two adjacent snapshots.
  • the file server can provide accurate data block-level data change records.
  • the method further includes: merging the data change records of every two adjacent snapshots between the second snapshot and the first snapshot to obtain the data change record between the second snapshot and the first snapshot.
  • the method for merging to obtain data change records includes:
  • a data block generated before the first snapshot and deleted between the first snapshot and the second snapshot is marked as deleted;
  • a data block generated after the first snapshot and not deleted before the second snapshot is marked as new;
  • a data block modified after the first snapshot and not deleted before the second snapshot is marked as modified.
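The three merge rules above can be sketched as a fold over the per-interval change records, oldest interval first. This is a sketch under stated assumptions: the function name, the record layout, and the string operation identifiers are hypothetical. Two cases that the rules do not spell out are handled by assumption: a block created and then deleted inside the window is dropped from the merged record, and a block deleted and later re-created is reported as modified.

```python
def merge_change_records(intervals):
    """Merge per-interval change records into one record for the whole window.

    intervals: list (oldest first) of lists of ((file_id, block_id), operation),
               where operation is "add", "delete", or "modify" (assumed names).
    Returns {(file_id, block_id): operation} for the window as a whole.
    """
    merged = {}
    for records in intervals:
        for key, op in records:
            prev = merged.get(key)
            if op == "delete":
                if prev == "add":
                    # Assumption: created and deleted inside the window, so
                    # the block never needs to reach the backup.
                    del merged[key]
                else:
                    merged[key] = "delete"
            elif op == "add":
                # Assumption: re-creating a block deleted earlier in the window
                # replaces older backed-up content, so report it as modified.
                merged[key] = "modify" if prev == "delete" else "add"
            elif op == "modify":
                # A block that is new in this window stays "new" when modified.
                merged[key] = "add" if prev == "add" else "modify"
            else:
                raise ValueError(f"unknown operation identifier: {op!r}")
    return merged
```

For example, a block added in one interval and modified in the next is reported once, as new, so the backup server fetches it a single time.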
  • a backup server includes a block change list reader, a block data reader, and a block map organizer.
  • the block change list reader is used to send a change information acquisition request to the file server after a data backup is triggered, where the change information acquisition request is used to request change information of data on the file server, and to receive a data change record between the second snapshot and the first snapshot returned by the file server, where the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, the first snapshot is created before the second snapshot, and the data change record includes identification information of the changed data blocks;
  • the block data reader is used to obtain, from the file server according to the data change record, the data of the data blocks identified by the identification information;
  • the block map organizer is used to store the acquired data in a backup storage device and create a data mapping relationship for this backup, where the data mapping relationship includes identification information of all data blocks in the second snapshot, the storage locations of all data blocks in the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.
  • the data change record further includes a corresponding operation identifier, and establishing a data mapping relationship by the block map organizer includes: the block map organizer copies the data mapping relationship of the last backup in the backup storage device, and modifies the copied data mapping relationship according to the data change record to obtain the data mapping relationship of this backup.
  • the backup server further includes a snapshot trigger, where the snapshot trigger is used to send a snapshot request to the file server and to receive a snapshot creation success response returned by the first device, where the response includes a snapshot identifier of the second snapshot.
  • a fourth aspect of the present application provides a file server, where the file server is used to provide a file service.
  • the file server includes a file input/output tracker, a block change list provider, and a block data provider.
  • the block change list provider is used to receive a change information acquisition request sent by a backup server and send the change information acquisition request to the file input/output tracker, where the change information acquisition request is used to request change information of data on the file server;
  • the file input/output tracker is used to determine, according to the change information acquisition request, the change information of the data on the file server, and return a data change record between the second snapshot and the first snapshot to the backup server, where the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks;
  • the block data provider is used to receive a data acquisition request sent by the backup server, and return the data of the corresponding data block to the backup server according to the data acquisition request, where the data acquisition request includes the identification information of the data block to be acquired.
  • the file server further includes a snapshot creator and a storage unit, where the snapshot creator is configured to receive a snapshot creation request to create the second snapshot, create the second snapshot according to the snapshot creation request, and notify the file input/output tracker to store the data change record between the second snapshot and the first snapshot in the storage unit, where the first snapshot is the previous snapshot of the second snapshot; the file input/output tracker is also used to track the data operations of the client on the file server and, after receiving the notification sent by the snapshot creator, store the data change record between the second snapshot and the first snapshot in the storage unit according to the data operations.
  • a fifth aspect of the present application provides a server for implementing the above first and second aspects.
  • the server includes a network interface, a processor, and a memory, and the network interface, the processor, and the memory are connected by a bus.
  • the network interface is used to access the network
  • the memory is used to store computer operation instructions.
  • the memory may be a high-speed RAM or a non-volatile memory.
  • the processor is used to execute computer operation instructions stored in the memory.
  • the processor may specifically be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
  • the processor executes the method of the first aspect or the second aspect by executing computer operation instructions stored in the memory.
  • the sixth aspect of the present application provides a backup system.
  • the system includes the aforementioned backup server and the aforementioned file server.
  • the backup system further includes a backup storage device, and the backup storage device is used to store data sent by the backup server.
  • the seventh aspect of the present application provides a storage medium for storing the computer operation instructions mentioned in the fifth aspect.
  • when these operation instructions are executed by a computer, the method of the first aspect or the second aspect described above may be executed.
  • FIG. 1 is a schematic diagram of networking of a backup system provided by an embodiment of the present invention.
  • FIG. 2-1 is a schematic structural diagram of a file server provided by an embodiment of the present invention.
  • FIG. 2-2 is another schematic structural diagram of a file server provided by an embodiment of the present invention.
  • FIG. 2-3 is a schematic structural diagram of a backup server provided by an embodiment of the present invention.
  • FIG. 2-4 is another schematic structural diagram of a backup server provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a method provided by an embodiment of the present invention.
  • FIG. 3-1 is a schematic diagram of data in a shared directory during a backup in an embodiment of the present invention.
  • FIG. 3-2 is a schematic diagram of data in a shared directory during another backup according to an embodiment of the present invention.
  • FIG. 3-3 is a schematic diagram of a data mapping relationship in an embodiment of the present invention.
  • the embodiment of the present invention proposes a method, device, and system for backing up data, which track the change information of data blocks when processing data operations and back up only the changed data blocks during incremental backup. This solves the problems of low efficiency and wasted backup storage caused by backing up a large amount of duplicate data in the prior art.
  • in this document, the data blocks referred to in the Unix NFS protocol and the clusters referred to in the Windows CIFS protocol are collectively referred to as data blocks.
  • FIG. 1 depicts a schematic diagram of a backup system networking provided by an embodiment of the present invention.
  • the backup system includes a file server 200, a backup server 300, and a backup storage device 400. These devices communicate with each other through a network, and also communicate with the client 100 through the network.
  • the file server 200 and the backup storage device 400 may be heterogeneous.
  • the file server 200 is a network-attached storage (NAS) device, and the backup storage device 400 is a backup storage device that is heterogeneous with the NAS device.
  • the heterogeneity with the file server 200 may be due to different manufacturers or different models.
  • the file server 200 includes a network interface 2100, a processor 2102, a memory 2104, a storage interface 2106, and a storage array 2108.
  • the network interface 2100, the processor 2102, the memory 2104, and the storage interface 2106 are connected by a bus, and the storage interface is communicatively connected to the storage array.
  • the network interface 2100 may be provided by one or more network interface cards (NICs) for accessing the network.
  • the storage interface 2106 is used to connect to the storage array 2108.
  • the storage array 2108 is used to store data.
  • the storage array 2108 may also be replaced by other storage devices.
  • the memory 2104 stores some program instructions. When these instructions are executed by the processor 2102, they are used to implement the following functions of the file server.
  • the file server 200 is used to: receive a change information acquisition request sent by the backup server 300, where the change information acquisition request is used to request the change information of the data on the file server 200; determine the change information of the data on the file server 200, and return a data change record between the second snapshot and the first snapshot to the backup server 300, where the first snapshot is created before the second snapshot, the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks; receive a data acquisition request sent by the backup server 300, where the data acquisition request includes identification information of the data block to be acquired; and return the data of the data block identified by the identification information to the backup server 300.
  • the program instructions stored in the memory 2104 may be logically divided into multiple sub-collections, and when a sub-collection is executed by the processor 2102, it may be used to implement a component function included in the file server 200.
  • the file server 200 includes a file input/output tracker 2101 (FileIOTracker), a block change list provider 2103 (BlockChgListProvider), and a block data provider 2105 (BlockDataProvider), where:
  • the block change list provider 2103 is configured to receive a change information acquisition request sent by a backup server and send the change information acquisition request to the file input/output tracker, where the change information acquisition request is used to request information about changes in data on the file server.
  • the file input/output tracker 2101 is configured to determine the change information of the data on the file server according to the change information acquisition request, and return the data change record between the second snapshot and the first snapshot to the backup server, where the data change record is used to record information about the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks.
  • the block data provider 2105 is configured to receive a data acquisition request sent by the backup server, and return the data of the corresponding data block to the backup server according to the data acquisition request, where the data acquisition request includes the identification information of the data block to be acquired.
  • the file server 200 further includes a snapshot creator 2107 and a storage unit 2109, where:
  • the snapshot creator 2107 is configured to receive a snapshot creation request to create the second snapshot, create the second snapshot according to the snapshot creation request, and notify the file input/output tracker to store the data change record between the second snapshot and the first snapshot in the storage unit 2109, where the first snapshot is the previous snapshot of the second snapshot.
  • the snapshot creator 2107 may also return a snapshot creation success response to the sender of the snapshot request (for example, the snapshot trigger of the backup server), where the response includes a snapshot identifier.
  • the response message may further include a snapshot path, and the snapshot path is used to indicate the location of the snapshot in the NAS file system.
  • the file input/output tracker 2101 is also used to track the data operations of the client on the file server 200 and, after receiving the notification sent by the snapshot creator, store the data change record between the second snapshot and the first snapshot in the storage unit 2109 according to the data operations.
  • the storage unit 2109 here may be the storage array 2108 mentioned above.
  • when the file server 200 sets a shared directory for the client, the client can create a file under the shared directory through the network, or access a file in the shared directory and read, write, delete, or modify the data in it. If the data under the shared directory is backed up, the above snapshot request includes indication information indicating the snapshot object, that is, the indication information of the shared directory. After the snapshot creator 2107 receives the snapshot request, it creates a snapshot of the shared directory.
  • the file input/output tracker 2101 can track each file input/output request to the specified shared directory sent by the client 100. If data blocks are added, deleted, or modified under the shared directory, it records the identifier of the changed data block, the file identifier of the file to which the changed data block belongs, and the corresponding operation identifier.
  • when the snapshot creator 2107 creates a snapshot, all change records in the shared directory between the current snapshot and the previous snapshot are stored as data change records in the storage unit. The current snapshot and the previous snapshot are two adjacent snapshots.
  • the data change records include the file identifier, the data block identifier, and the operation identifier (addition, deletion, or modification). It is understandable that during the first data backup, when the snapshot is created, there is no previous snapshot to compare against.
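The tracking behaviour described above can be sketched as a small in-memory class. This is an illustrative sketch only: the class and method names, the tuple record layout, and the in-memory dictionary standing in for the storage unit are assumptions made here, not details given in the source.

```python
class FileIOTracker:
    """Hypothetical in-memory model of the file input/output tracker."""

    VALID_OPS = {"add", "delete", "modify"}  # assumed operation identifiers

    def __init__(self):
        self._pending = []          # changes seen since the last snapshot
        self._stored = {}           # (prev_snapshot_id, snapshot_id) -> records
        self._last_snapshot = None  # identifier of the most recent snapshot

    def track(self, file_id, block_id, op):
        """Record one block-level change caused by a client I/O request."""
        if op not in self.VALID_OPS:
            raise ValueError(f"unknown operation identifier: {op!r}")
        self._pending.append((file_id, block_id, op))

    def on_snapshot_created(self, snapshot_id):
        """Called when the snapshot creator notifies the tracker."""
        records, self._pending = self._pending, []
        if self._last_snapshot is not None:
            # Store the record between the two adjacent snapshots.
            self._stored[(self._last_snapshot, snapshot_id)] = records
        # For the first snapshot there is no previous snapshot to compare
        # against, so nothing is stored.
        self._last_snapshot = snapshot_id

    def records_between(self, first_id, second_id):
        """Return the stored record for two adjacent snapshots."""
        return self._stored[(first_id, second_id)]
```

A typical sequence is `track(...)` calls during client I/O, then `on_snapshot_created("s2")` when a backup starts, after which `records_between("s1", "s2")` yields the block-level change list for that interval.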
  • the change information provided by the block change list provider 2103 to the backup server 300 generally refers to the data change record between the current snapshot and a previous snapshot.
  • the current snapshot corresponds to the current backup, that is, the above-mentioned second snapshot
  • the previous snapshot refers to the first snapshot, which corresponds to the previous backup. It is not difficult to understand that the second snapshot and the first snapshot may be two adjacent snapshots or two non-adjacent snapshots.
  • the block change list provider 2103 may also have two different processing methods.
  • Method 1: The block change list provider 2103 provides the data change record returned by the file input/output tracker 2101 to the backup server as the change information.
  • Method 2: The block change list provider 2103 returns the change information returned by the file input/output tracker 2101 to the backup server.
  • the change information in Method 2 is what the file input/output tracker 2101 determines according to the snapshot identifier carried in the change information acquisition request: the information of the files in the shared directory, and of the data blocks contained in those files, when snapshot 1 was created. That is, in Method 2, the block change list provider 2103 returns the determined file information and the information of the data blocks contained in the files to the backup server 300 as the change information.
  • the backup server 300 includes a network interface 3102, a processor 3104, and a memory 3106, where the network interface 3102, the processor 3104, and the memory 3106 are connected by a bus, and the network interface 3102 may be provided by one or more network interface cards (NICs) for access to the network.
  • the memory 3106 stores some program instructions. When these instructions are executed by the processor 3104, they are used to implement the functions of the backup server 300 described below.
  • the backup server 300 is used to: send a change information acquisition request to the file server 200 after a data backup is triggered, where the change information acquisition request is used to request the change information of the data on the file server 200; receive the data change record between the second snapshot and the first snapshot returned by the file server 200, where the first snapshot is created before the second snapshot, the data change record is used to record the data blocks that have changed between the two snapshots, and the data change record includes identification information of the changed data blocks; acquire, from the file server 200 according to the data change record, the data of the data blocks identified by the identification information and store it in the backup storage device 400; and establish a data mapping relationship for this backup, where the data mapping relationship includes identification information of all data blocks in the second snapshot, the storage locations of the data blocks in the backup storage device, and the snapshot identifier of the second snapshot.
  • the program instructions stored in the memory 3106 may be logically divided into multiple sub-collections, and when each sub-collection is executed by the processor, it is used to implement the functions of the components included in the backup server.
  • the backup server 300 includes a block change list reader 3103 (BlockChgListReader), a block data reader 3105 (BlockDataReader), and a block map organizer 3107 (BlockMapOrganizer).
  • the block change list reader 3103 is used to send a change information acquisition request to the file server 200 after a data backup is triggered, where the change information acquisition request is used to request the change information of the data on the file server, and to receive the data change record between the second snapshot and the first snapshot returned by the file server. The data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, the first snapshot is created before the second snapshot, and the data change record includes identification information of the changed data blocks.
  • the request carries the snapshot identifier of the snapshot created by this backup and the snapshot identifier of the snapshot created during a previous backup. The snapshot created by this backup is the second snapshot, and the snapshot created during the previous backup is the first snapshot.
  • if this is the first backup and there is no information about a previous backup, the change information acquisition request carries only the snapshot identifier of the snapshot created by this backup. If it is not the first backup, the change information acquisition request carries the snapshot identifier of the second snapshot and the snapshot identifier of the first snapshot. It can be understood that the second snapshot and the first snapshot need not be adjacent snapshots.
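The rule for which snapshot identifiers the request carries can be illustrated with a minimal sketch. The dictionary layout and the field names `second_snapshot_id` and `first_snapshot_id` are assumptions made for illustration; the patent does not define a message format.

```python
def build_change_info_request(current_snapshot_id, previous_snapshot_id=None):
    """Build a change information acquisition request (field names are hypothetical)."""
    request = {"second_snapshot_id": current_snapshot_id}
    if previous_snapshot_id is not None:
        # Not the first backup: also name the earlier snapshot to compare against.
        request["first_snapshot_id"] = previous_snapshot_id
    return request


def is_first_backup(request):
    """The receiver can infer a first (full) backup from the missing earlier identifier."""
    return "first_snapshot_id" not in request


first = build_change_info_request("Snap-1")                # first backup
incremental = build_change_info_request("Snap-2", "Snap-1")
skipped = build_change_info_request("Snap-3", "Snap-1")    # non-adjacent snapshots
```

In this sketch the file server side would call `is_first_backup` to decide between returning the full contents of the snapshot and returning a data change record.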
  • the block data reader 3105 is configured to acquire, according to the data change record, the data of the data blocks identified by the identification information from the file server 200.
  • the identification information of the data block may include the identification of the data block and the file identification of the file to which the data block belongs.
  • the block map organizer 3107 is used to store the acquired data in a backup storage device and create a data mapping relationship for this backup, where the data mapping relationship includes the identification information of all data blocks in the second snapshot, the storage locations of all data blocks in the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.
  • optionally, the data change record further includes a corresponding operation identifier, and establishing the data mapping relationship by the block map organizer includes: the block map organizer copies the data mapping relationship of the last backup in the backup storage device and modifies the copy according to the data change record to obtain the data mapping relationship of this backup.
  • the backup server further includes a snapshot trigger 3101, which is used to send a snapshot request to the file server and to receive the snapshot creation success response returned by the file server, where the response includes the snapshot identifier of the second snapshot.
  • the snapshot trigger receives the identifier of the second snapshot and forwards it to the block change list reader, which can save the received snapshot identifier so that it is available later when the data change record needs to be obtained.
  • the snapshot trigger is used to send a snapshot request to the snapshot creator when the preset condition is satisfied. If the backup targets a particular shared directory, the snapshot request includes the indication information of that shared directory.
  • the backup storage device includes a data area for storing data, and an area for storing the data mapping relationship of this backup.
  • the backup storage device is the device mentioned above that is heterogeneous with the file server. It may be a file server from a different manufacturer or a different type of storage device, such as a storage area network (SAN) device.
  • each time the client operates on the file server or on a shared directory in the file server, the change information of the data blocks on the file server or in the shared directory is tracked.
  • during an incremental backup, only the changed data blocks are provided to the backup server, which performs the backup. This avoids repeatedly storing the data of blocks that have not changed, which improves the performance of the backup storage device and also solves the problem of wasted backup storage in the prior art.
  • an embodiment of the present invention also provides a method for backing up data, which is applied in the above-mentioned backup system. Suppose the object to be backed up is a directory on the file server that is shared with the client: the client can access the files in the shared directory, create files in it, and read, write, or delete existing files. Assuming that the shared directory is "//IP/MyShare/", the following takes backing up the data in "//IP/MyShare/" as an example to illustrate the implementation process of the embodiment of the present invention.
  • the file server receives a write input/output (I/O) request sent by the client and, according to the write I/O request, determines to write data to the shared directory //IP/MyShare/.
  • the shared directory includes file 1 and file 2.
  • file 1 includes data block 1 and data block 2
  • the data written in data block 1 is ABC
  • the data written in data block 2 is DEF
  • File 2 includes data block 1, and the data written in data block 1 is XYZ.
  • the FileIOTracker in the file server records every operation (read, write, delete, or change) in the shared directory sent by the client as a data change record.
  • the data change record includes the identifier of the data block being operated, the file identifier of the file to which the data block belongs, and the corresponding operation identifier.
  • Table 1, data change record:
      File identifier | Data block identifier | Operation identifier
      File 1 | Data block 1 | add
      File 1 | Data block 2 | add
      File 2 | Data block 1 | add
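The tracking described above can be sketched as a small in-memory log that is persisted under a label when a snapshot is created. This is only an illustration: the class `FileIOTrackerSketch`, its methods, and the use of a dictionary as the persistent store are assumptions, not the patent's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass(frozen=True)
class ChangeEntry:
    file_id: str
    block_id: str
    op: Op


class FileIOTrackerSketch:
    """Hypothetical tracker; all names here are illustrative only."""

    def __init__(self):
        self._pending = []   # entries recorded since the last snapshot
        self.persisted = {}  # label such as "Snap0-1" -> list of entries

    def record(self, file_id, block_id, op):
        self._pending.append(ChangeEntry(file_id, block_id, op))

    def persist(self, label):
        """Called when a snapshot is created: store the log and start a new one."""
        self.persisted[label] = self._pending
        self._pending = []
        return self.persisted[label]


tracker = FileIOTrackerSketch()
tracker.record("File 1", "Data block 1", Op.ADD)
tracker.record("File 1", "Data block 2", Op.ADD)
tracker.record("File 2", "Data block 1", Op.ADD)
snap0_1 = tracker.persist("Snap0-1")  # the three rows of Table 1
```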
  • the operations on the data in the shared directory before the first backup may not be recorded.
  • the backup server sends a snapshot request to the file server, where the snapshot request includes instruction information for indicating a snapshot object.
  • the above-mentioned preset condition may be set by the user according to needs, for example, the preset condition is set to a certain moment every day, and when the moment comes, the preset condition is satisfied. If it is not a specific shared directory that needs to be backed up, the snapshot request may not include the indication information of the shared directory. It has also been mentioned in the previous embodiment that it may be a backup for the entire file server, then the indication information indicating the snapshot object may not be carried in the snapshot request. In other words, the default snapshot is for the entire file server.
  • after the file server creates the snapshot, it returns a snapshot creation success response to the backup server.
  • the response includes the snapshot identifier.
  • the successful response may further include a snapshot path, where the snapshot path is used to indicate the location of the snapshot in the NAS file system. In this way, when the content of the snapshot is needed, the snapshot can be found through the snapshot path.
  • the SnapCreator returns a snapshot creation success response to the SnapTrigger.
  • the file server includes Snapcreator and FileIOTracker.
  • the process of creating a snapshot by the file server may include: creating Snapshot 1 when Snapcreator receives the snapshot request, and notifying FileIOTracker to persist the data change records in memory. Since this is the first backup, there is no comparable previous snapshot. It can be assumed that the previous snapshot is snapshot 0.
  • after FileIOTracker receives the notification, it stores the data change records recorded in memory to a persistent storage device in the file server, such as a disk array, and marks them as Snap0-1.
  • alternatively, no data change record is kept at the first snapshot and only the snapshot itself is recorded. Understandably, Snap0-1 and Snap-1 both describe the same data.
  • the backup server sends a change information acquisition request to the file server.
  • after creating a snapshot, the backup server determines the range of data to be backed up for this backup. If the data has never been backed up before, all the files in the shared directory are backed up, which is also known as a full backup. In this case, the BlockChgListReader in the backup server sends a change information acquisition request to the BlocklistChgProvider in the file server. In general the request carries the snapshot identifier of the snapshot created by this backup and that of the snapshot created during the last backup; since this is the first backup and there is no information about a last backup, the request carries only the snapshot identifier Snap-1 of the snapshot created by this backup. With reference to the above system embodiment, in another implementation, the change information acquisition request may also carry the data change record Snap0-1 mentioned in step 306.
  • the file server returns change information to the backup server.
  • the BlocklistChgProvider of the file server sends the change information acquisition request to FileIOTracker.
  • FileIOTracker determines the change information to be returned according to the snapshot identifier in the change information acquisition request. Since the received change information acquisition request only carries the snapshot identifier Snap-1, it can be seen that this is the first backup. Then all the files in the directory and the data blocks under the files can be regarded as newly added.
  • FileIOTracker determines, according to Snap-1, the information of the files in the shared directory and of the data blocks contained in those files when snapshot 1 was created, and returns the determined file information and data block information to the backup server as the change information.
  • if the change information acquisition request carries the data change record Snap0-1, then because Snap0-1 and Snap-1 describe the same data, what is finally returned is likewise the information of the files in the shared directory when snapshot 1 was created and of the data blocks contained in those files.
  • the returned change information may be as shown in Table 2 below, including the file identification and the corresponding data block identification.
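For the directory contents described earlier (file 1 with data blocks 1 and 2, file 2 with data block 1), the change information for a first backup could be derived roughly as follows. The function name and the dictionary layout of the snapshot contents are assumptions for illustration.

```python
def full_backup_change_info(snapshot_contents):
    """List every block in the snapshot as a (file_id, block_id) pair.

    snapshot_contents maps a file identifier to the block identifiers the
    file contained when the snapshot was created (a hypothetical layout).
    """
    return [
        (file_id, block_id)
        for file_id, block_ids in snapshot_contents.items()
        for block_id in block_ids
    ]


snapshot_1 = {
    "File 1": ["Data block 1", "Data block 2"],
    "File 2": ["Data block 1"],
}
change_info = full_backup_change_info(snapshot_1)
```

Every block present in the snapshot is listed, which matches the statement that in a first backup all files and their data blocks are treated as newly added.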
  • the backup server obtains the data in the data block to be backed up according to the change information, stores the data in the data block to be backed up in a backup storage device, and establishes a data mapping relationship for this backup.
  • the Blockdatareader in the backup server sends a data read request to the BlockDataProvider in the file server. The data read request carries the identification information of the data blocks indicated by the change information, that is, the data block identifiers in Table 2 above and the file identifiers corresponding to these data blocks.
  • after receiving the data read request sent by the Blockdatareader, the BlockDataProvider in the file server returns, according to the file identifiers and data block identifiers included in the request, the data in three data blocks (data block 1 and data block 2 of file 1, and data block 1 of file 2) to the Blockdatareader.
  • the Blockdatareader sends the received data to the Blockmapognr in the backup server. The Blockmapognr stores all the data into the data area of the backup storage device and creates the data mapping relationship of this backup in the backup storage device.
  • the data mapping relationship includes identifiers of all data blocks in the snapshot of this backup, file identifiers of files to which the data blocks belong, storage locations of the data blocks in backup storage, and snapshot identifiers. It is understandable that every backup will create a snapshot first, so the snapshot identifier can also be used to identify a certain data backup.
  • the storage location may be the name and URL address of a bucket in the object storage.
  • buckets are similar to folders, storage objects, etc., and can contain data and metadata used to describe the data.
  • the file server receives the write I/O requests sent by the client and, according to them, writes data to the shared directory //IP/MyShare/, deletes data from it, or modifies data in it.
  • the data in Block 1 of File 1 is modified to abc; Block 3 is added to File 1, and the data in Block 3 is OPQ.
  • File 2 is deleted.
  • File 3 is newly added, and data MNT is stored in block 1 in file 3.
  • the FileIOTracker in the file server records each operation (read, write, delete, or change) on the shared directory as a data change record. The data change record includes the identifier of the data block being operated on, the file identifier of the file to which the data block belongs, and the corresponding operation identifier.
  • the backup server sends a snapshot request to the file server, where the snapshot request includes indication information for indicating the snapshot object.
  • when the conditions for performing the backup operation are met again, the backup server performs the backup operation again.
  • the SnapTrigger in the backup server sends a snapshot request to the SnapCreator in the file server, and the snapshot request includes the path of the shared directory for which the snapshot needs to be created, that is, NASShare="//IP/MyShare".
  • after the file server creates the snapshot, it returns a snapshot creation success response to the backup server.
  • the response includes the snapshot path and snapshot ID.
  • the FileIOTracker in the file server stores the data change records in the shared directory between this snapshot and the last snapshot in memory.
  • the data change record is used to record data-related information that has been changed between two adjacent snapshots.
  • the data change record includes a file identifier, a data block identifier, and a corresponding operation identifier (which may be modify, add, or delete). In this step, it is the data change record between Snap-1 and Snap-2, as shown in Table 4:
      File identifier | Data block identifier | Operation identifier
      File 1 | Data block 1 | modify
      File 1 | Data block 3 | add
      File 2 | Data block 1 | delete
      File 3 | Data block 1 | add
  • the file server includes Snapcreator and FileIOTracker.
  • the process of creating a snapshot by the file server includes:
  • when Snapcreator receives a snapshot request, it creates Snapshot 2 and notifies FileIOTracker to persist the data change records in memory. After receiving the notification, FileIOTracker stores the data change records recorded in memory to a persistent storage device in the file server, such as a disk array, and marks them as Snap1-2. At this time, the data recorded in snapshot 2 is the data shown in Figure 3-2.
  • the backup server sends a change information acquisition request to the file server.
  • after creating a snapshot, the backup server determines the range of data to be backed up for this backup. Since backups exist before this one, this backup is an incremental backup, and the backup server first needs to determine which data has changed between this backup and the last backup. Since each data backup is based on a snapshot, the data changes between this backup and the last backup are reflected in the difference between Snap-2 and Snap-1.
  • the BlockChgListReader in the backup server sends a change information acquisition request to the BlocklistChgProvider in the file server to request data change information between Snap-1 and Snap-2.
  • the file server returns the change information to the backup server.
  • the BlocklistChgProvider forwards the change information acquisition request to the FileIOTracker in the file server.
  • FileIOTracker obtains the previously stored data change record Snap1-2 from the persistent storage device and returns it to BlocklistChgProvider.
  • the BlocklistChgProvider returns the data change record Snap1-2 to the BlockChglistReader in the backup server.
  • the backup server sends a data read request to the file server according to the acquired change information, where the data read request carries identification information of the data block to be read.
  • the Blockdatareader in the backup server determines from the data change record Snap1-2 that the data in Block-1 and Block-3 of file 1 and in Block-1 of file 3 needs to be backed up. After that, the Blockdatareader sends a data read request to the BlockDataProvider in the file server, and the data read request carries the data block identifiers and the file identifiers corresponding to these data blocks.
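Selecting the blocks to read from a data change record can be sketched as a simple filter: deleted blocks have no data to fetch, so only added and modified blocks go into the read request. The tuple layout and the function name are assumptions for illustration.

```python
def blocks_to_read(change_record):
    """Return the (file_id, block_id) pairs whose data must be fetched.

    change_record is a list of (file_id, block_id, operation) tuples.
    """
    return [
        (file_id, block_id)
        for file_id, block_id, operation in change_record
        if operation in ("add", "modify")
    ]


# The data change record Snap1-2 from the example above.
snap1_2 = [
    ("File 1", "Block-1", "modify"),
    ("File 1", "Block-3", "add"),
    ("File 2", "Block-1", "delete"),
    ("File 3", "Block-1", "add"),
]
read_request = blocks_to_read(snap1_2)
```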
  • the file server returns the data in the data block identified by the identification information of the data block to the backup server according to the identification information of the data block in the data reading request.
  • the BlockDataProvider in the file server returns, according to the file identifiers and data block identifiers included in the data read request, the data in three data blocks (Block-1 and Block-3 of file 1, and Block-1 of file 3) to the Blockdatareader.
  • the backup server stores the obtained data in the backup storage, and establishes the data mapping relationship of this backup.
  • Blockmapognr in the backup server stores the obtained data in the data area for storing data in the backup storage device, and creates a data mapping relationship of this backup in the backup storage device.
  • the data mapping relationship includes identifiers of all data blocks in the snapshot of this backup, file identifiers of files to which the data blocks belong, storage locations of the data blocks in backup storage, and snapshot identifiers.
  • the process of creating the data mapping relationship of this backup includes: copying the data mapping relationship of the last backup in the backup storage device, and modifying the copy according to the data change record. The modified data mapping relationship is the data mapping relationship of this backup.
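The copy-and-modify step can be sketched as follows. The mapping is modeled as a dictionary keyed by (file identifier, block identifier); the storage locations such as `bucket-1/obj-004` are made-up placeholders, and the function name is an assumption.

```python
def build_mapping(previous_mapping, change_record, new_locations):
    """Derive this backup's mapping from the last backup's mapping.

    previous_mapping: {(file_id, block_id): location} for the last backup.
    change_record:    list of (file_id, block_id, operation) tuples.
    new_locations:    where the newly stored blocks of this backup were written.
    """
    current = dict(previous_mapping)  # work on a copy; the old mapping stays intact
    for file_id, block_id, operation in change_record:
        key = (file_id, block_id)
        if operation == "delete":
            current.pop(key, None)             # block no longer exists
        else:
            current[key] = new_locations[key]  # added or modified block
    return current


mapping_snap1 = {
    ("File 1", "Block-1"): "bucket-1/obj-001",
    ("File 1", "Block-2"): "bucket-1/obj-002",
    ("File 2", "Block-1"): "bucket-1/obj-003",
}
snap1_2 = [
    ("File 1", "Block-1", "modify"),
    ("File 1", "Block-3", "add"),
    ("File 2", "Block-1", "delete"),
    ("File 3", "Block-1", "add"),
]
stored = {
    ("File 1", "Block-1"): "bucket-1/obj-004",
    ("File 1", "Block-3"): "bucket-1/obj-005",
    ("File 3", "Block-1"): "bucket-1/obj-006",
}
mapping_snap2 = build_mapping(mapping_snap1, snap1_2, stored)
```

The unchanged block (file 1, Block-2) keeps pointing at the data stored during the first backup, which is why its data does not need to be stored again.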
  • the data block should be uniquely identified by combining the file identifier of the file where the data block is located.
  • the identification of the data block and the file identification of the file where the data block is located are summarized as identification information of the data block.
  • the identification information of the data block may also have other implementation manners, which is not limited in the embodiment of the present invention.
  • in some cases, the data to be backed up is not the data between two consecutive snapshots.
  • for example, if the second backup in the above embodiment fails, then during the third backup a snapshot Snap-3 is created, and the differential data between Snap-1 and Snap-3 needs to be backed up.
  • the BlockChglistReader in the backup server sends the change information acquisition request between Snap-1 and Snap-3 to the BlocklistChgProvider in the file server.
  • in step 322, after receiving the change information acquisition request between Snap-1 and Snap-3 sent by the BlockChglistReader, the BlocklistChgProvider forwards the request for the change information between Snap-1 and Snap-3 to the FileIOTracker in the file server.
  • the FileIOTracker obtains the previously stored data change records Snap1-2 and Snap2-3 from the persistent storage device, and returns the Snap1-2 and Snap2-3 to the BlocklistChgProvider.
  • the BlocklistChgProvider obtains the data change record Snap1-3 by superimposing the data change records Snap1-2 and Snap2-3. Examples of specific stacking methods are as follows:
  • after obtaining Snap1-3, the BlocklistChgProvider returns the data change record Snap1-3 to the BlockChglistReader, and the BlockChglistReader sends the data change record to the Blockdatareader in the backup server.
  • in step 324, if the returned change information is the data change record Snap1-3, the Blockdatareader in the backup server determines which data blocks need to be backed up according to the data change record Snap1-3.
  • the data read request sent to the file server carries the data block identifiers corresponding to these data blocks.
  • the embodiments may take the form of a computer program product, that is, computer-readable program code stored in a computer-readable medium.
  • Computer-readable media include but are not limited to electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, equipment, or devices, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), and optical discs.
  • the processor in the computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step or a combination of steps in the flowchart.
  • the computer-readable program code can be executed entirely on the user's computer, partly on the user's computer, as a separate software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. It should also be noted that, in some alternative implementations, the functions noted in the steps of the flowchart or the blocks of the block diagram may occur out of the order noted in the figures. For example, depending on the functions involved, two steps or two blocks shown in succession may actually be executed approximately simultaneously, or may sometimes be executed in the reverse order.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

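An embodiment of the present invention provides a method for backing up data, applied to a storage system. The storage system includes a first storage device, a second storage device, and a backup storage device. A first logical unit (LUN) is built in the first storage device, a second LUN is built in the second storage device, and the first LUN and the second LUN are configured in an active-active relationship. The method includes: when backing up the first LUN, sending a request message for querying a data consistency point to the second storage device to which the second LUN belongs, where the request message includes the IO data status record of the first LUN, and the IO data status record is used to record the IOs of the first LUN that have been written to disk; receiving information about the data consistency point that the second storage device obtains from the IO data status record of the first LUN and the IO data status record of the second LUN stored in the second storage device; and creating a snapshot of the first LUN according to the data consistency point and providing the difference data between this snapshot and the previous snapshot, where the difference data is written into the backup image created for this backup in the backup storage device. This reduces the amount of backup data stored.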

Description

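Method, apparatus, and system for backing up data
Technical Field
This application relates to the field of storage technologies, and in particular, to a method, an apparatus, and a system for backing up data.
Background
In a network attached storage (NAS) system, a file server is connected to an existing network to provide file services. Currently, a file server is a device dedicated to file services. It can be used to store and retrieve files and to provide file access for applications and clients. To ensure data reliability, the data on the file server is usually backed up. If the storage device used for backup is not provided by the manufacturer of the file server, the following scheme is usually used for backup:
In the first backup, the data of all files in the shared directory of the file server is obtained and stored in the backup storage device. When data in a file changes, a list of the files that have changed since the last backup is obtained, and the changed files are then backed up to the backup storage device; that is, the second and all subsequent backups are incremental backups. As the number of backups grows, the backup data accumulates and occupies more and more storage space, which degrades storage performance.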
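Summary
This application provides a method, a device, and a system for backing up data, to reduce unnecessary duplicate data in backup storage.
According to a first aspect, an embodiment of this application provides a method for backing up data, performed by a backup server. The method includes: after a data backup is triggered, sending a change information acquisition request to a file server, where the change information acquisition request is used to request change information of the data on the file server; receiving a data change record between a second snapshot and a first snapshot returned by the file server, where the first snapshot is created before the second snapshot, the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks; acquiring, from the file server according to the data change record, the data of the data blocks identified by the identification information; and storing the acquired data in a backup storage device and establishing a data mapping relationship for this backup, where the data mapping relationship includes the identification information of all data blocks in the second snapshot, the storage locations of all data blocks in the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.
Optionally, the file server and the backup storage device are heterogeneous devices. The data mapping relationship may be stored in the backup storage device. The identification information of a data block includes a data block identifier and the file identifier of the file to which the data block belongs.
In this method, the data change record is obtained at data block granularity. In an incremental backup, the data blocks that have changed between the second snapshot and the first snapshot are therefore determined from the data change record, and only these data blocks are backed up, instead of backing up an entire file whenever any data in it changes, as in the prior art. In comparison, the solution in the embodiments of the present invention reduces duplicate data in incremental backups and improves the performance of the backup storage.
With reference to the first aspect, in a possible implementation of the first aspect, the data change record further includes a corresponding operation identifier, and establishing the data mapping relationship includes: copying the data mapping relationship of the last backup in the backup storage device, and modifying the copied data mapping relationship according to the data change record to obtain the data mapping relationship of this backup.
With reference to the first aspect, in a second possible implementation of the first aspect, before the change information acquisition request is sent to the file server, the method further includes: sending a snapshot request to the file server; and receiving a snapshot creation success response returned by the file server, where the response includes the snapshot identifier of the second snapshot. If the backup targets a particular shared directory, the snapshot request may further include indication information of the shared directory. In that case, after receiving the snapshot request, the file server creates a snapshot of the shared directory, which is more targeted than taking a snapshot of the entire file server.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the change information acquisition request includes the snapshot identifier of the second snapshot and the snapshot identifier of the first snapshot. These identifiers indicate to the file server which two snapshots the requested data change information lies between. In the embodiments of this application, both the first snapshot and the second snapshot are initiated for backups, so the data change information between the two snapshots describes how the data changed between the two backups.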
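A second aspect of this application provides a method for backing up data, performed by a file server. The method includes: receiving a change information acquisition request sent by a backup server, where the change information acquisition request is used to request change information of the data on the file server; determining the change information of the data on the file server, and returning to the backup server a data change record between a second snapshot and a first snapshot, where the first snapshot is created before the second snapshot, the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks; receiving a data acquisition request sent by the backup server, where the data acquisition request includes identification information of the data blocks to be acquired; and returning to the backup server the data of the data blocks identified by the identification information.
Optionally, the change information acquisition request includes the snapshot identifier of the second snapshot and the snapshot identifier of the first snapshot. The identification information of a data block includes a data block identifier and the file identifier of the file to which the data block belongs.
Because the file server can provide data change records at data block granularity and, at the request of the backup server, returns the data of the data blocks that have changed between two snapshots (that is, between two backups), the backup server can perform incremental backups at data block granularity. This avoids repeatedly backing up unchanged data in files.
With reference to the second aspect, in a first possible implementation of the second aspect, the method further includes: tracking the data operations of the client on the file server; and recording the data change record between two adjacent snapshots, where the data change record between two adjacent snapshots includes identification information of the data blocks that have changed between the two adjacent snapshots.
Because every data operation of the client on the file server is tracked, and the data change record is saved promptly between two adjacent snapshots, the file server can provide accurate data change records at data block granularity.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, when the second snapshot and the first snapshot are not adjacent snapshots, the method further includes: merging the data change records of every two adjacent snapshots between the second snapshot and the first snapshot to obtain the data change record between the second snapshot and the first snapshot.
Optionally, the data change records are merged as follows:
data generated before the first snapshot and deleted between the first snapshot and the second snapshot is marked as deleted;
data generated before the first snapshot and modified between the first snapshot and the second snapshot is marked as modified;
data generated after the first snapshot and deleted before the second snapshot is not recorded;
data generated after the first snapshot and not deleted before the second snapshot is marked as added;
data modified after the first snapshot and not deleted before the second snapshot is marked as modified.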
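The five merge rules above can be sketched as a fold over the per-interval change records in chronological order. This is an illustrative sketch, not the patent's implementation: the dictionary layout and the function name are assumptions, the contents of `snap2_3` are invented for the example, and the handling of a block that is deleted and later re-created (treated here as modified) is an assumption the source does not cover.

```python
def merge_change_records(records):
    """Merge adjacent-snapshot change records into one record.

    records: list of {(file_id, block_id): operation} dictionaries in
    chronological order, with operation in {"add", "modify", "delete"}.
    """
    merged = {}
    for record in records:
        for key, op in record.items():
            previous = merged.get(key)
            if previous is None:
                merged[key] = op  # first change seen for this block
            elif previous == "add":
                if op == "delete":
                    del merged[key]  # created and deleted in range: not recorded
                # add followed by modify is still an addition
            elif previous == "modify":
                if op == "delete":
                    merged[key] = "delete"  # existed before, gone now
            elif previous == "delete" and op == "add":
                merged[key] = "modify"  # assumption: re-created block counts as modified
    return merged


snap1_2 = {
    ("File 1", "Block-1"): "modify",
    ("File 1", "Block-3"): "add",
    ("File 2", "Block-1"): "delete",
    ("File 3", "Block-1"): "add",
}
snap2_3 = {  # hypothetical changes between Snap-2 and Snap-3
    ("File 1", "Block-1"): "modify",
    ("File 1", "Block-3"): "delete",
    ("File 3", "Block-1"): "modify",
    ("File 1", "Block-2"): "delete",
}
snap1_3 = merge_change_records([snap1_2, snap2_3])
```

With these inputs, the block that was added and then deleted (file 1, Block-3) drops out of the merged record, and the block that was added and then modified (file 3, Block-1) remains an addition.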
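A third aspect of this application provides a backup server. The backup server includes a block change list reader, a block data reader, and a block map organizer. The block change list reader is used to send a change information acquisition request to a file server after a data backup is triggered, where the change information acquisition request is used to request change information of the data on the file server, and to receive a data change record between a second snapshot and a first snapshot returned by the file server, where the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, the first snapshot is created before the second snapshot, and the data change record includes identification information of the changed data blocks. The block data reader is used to acquire, from the file server according to the data change record, the data of the data blocks identified by the identification information. The block map organizer is used to store the acquired data in a backup storage device and create a data mapping relationship for this backup, where the data mapping relationship includes the identification information of all data blocks in the second snapshot, the storage locations of all data blocks in the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.
With reference to the third aspect, in a first possible implementation, the data change record further includes a corresponding operation identifier, and establishing the data mapping relationship by the block map organizer includes: the block map organizer copies the data mapping relationship of the last backup in the backup storage device and modifies the copied data mapping relationship according to the data change record to obtain the data mapping relationship of this backup.
With reference to the third aspect, in a second possible implementation, the backup server further includes a snapshot trigger, where the snapshot trigger is used to send a snapshot request to the file server and to receive a snapshot creation success response returned by the file server, where the response includes the snapshot identifier of the second snapshot.
A fourth aspect of this application provides a file server, which is used to provide file services. The file server includes a file input/output tracker, a block change list provider, and a block data provider. The block change list provider is used to receive a change information acquisition request sent by a backup server and send the change information acquisition request to the file input/output tracker, where the change information acquisition request is used to request change information of the data on the file server. The file input/output tracker is used to determine the change information of the data on the file server according to the change information acquisition request and return to the backup server a data change record between a second snapshot and a first snapshot, where the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks. The block data provider is used to receive a data acquisition request sent by the backup server and return the data of the corresponding data blocks to the backup server according to the data acquisition request, where the data acquisition request includes identification information of the data blocks to be acquired.
With reference to the fourth aspect, in a possible implementation, the file server further includes a snapshot creator and a storage unit. The snapshot creator is used to receive a snapshot creation request for creating the second snapshot, create the second snapshot according to the snapshot creation request, and notify the file input/output tracker to store the data change record between the second snapshot and the first snapshot in the storage unit, where the first snapshot is the snapshot immediately preceding the second snapshot. The file input/output tracker is further used to track the data operations of the client on the file server and, after receiving the notification sent by the snapshot creator, store in the storage unit the data change record between the second snapshot and the first snapshot according to the data operations.
A fifth aspect of this application provides a server for implementing the first aspect and the second aspect. The server includes a network interface, a processor, and a memory, which are connected through a bus. The network interface is used to access a network. The memory is used to store computer operation instructions and may specifically be a high-speed RAM or a non-volatile memory. The processor is used to execute the computer operation instructions stored in the memory, and may specifically be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The processor performs the method of the first aspect or the second aspect by executing the computer operation instructions stored in the memory.
A sixth aspect of this application provides a backup system. The system includes the backup server and the file server described above.
Optionally, the backup system further includes a backup storage device, which is used to store the data sent by the backup server.
A seventh aspect of this application provides a storage medium for storing the computer operation instructions mentioned in the fifth aspect. When these operation instructions are executed by a computer, the method of the first aspect or the second aspect can be performed.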
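Brief Description of the Drawings
Figure 1 is a schematic networking diagram of a backup system according to an embodiment of the present invention;
Figure 2-1 is a schematic structural diagram of a file server according to an embodiment of the present invention;
Figure 2-2 is another schematic structural diagram of a file server according to an embodiment of the present invention;
Figure 2-3 is a schematic structural diagram of a backup server according to an embodiment of the present invention;
Figure 2-4 is another schematic structural diagram of a backup server according to an embodiment of the present invention;
Figure 3 is a schematic flowchart of a method according to an embodiment of the present invention;
Figure 3-1 is a schematic diagram of the data in the shared directory at one backup according to an embodiment of the present invention;
Figure 3-2 is a schematic diagram of the data in the shared directory at another backup according to an embodiment of the present invention;
Figure 3-3 is a schematic diagram of the data mapping relationship in an embodiment of the present invention.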
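Detailed Description
The embodiments of the present invention provide a method, an apparatus, and a system for backing up data. Change information of data blocks can be tracked while data operations are processed, and in an incremental backup only the changed data blocks are backed up. This solves the problems in the prior art of low efficiency and wasted backup storage caused by backing up a large amount of duplicate data. It should be noted that, for ease of description, the data block referred to in the Unix NFS protocol and the cluster referred to in the Windows CIFS protocol are collectively called data blocks.
Figure 1 depicts a networking diagram of a backup system according to an embodiment of the present invention. The backup system includes a file server 200, a backup server 300, and a backup storage device 400. These devices communicate with each other through a network and also communicate with a client 100 through the network. The file server 200 and the backup storage device 400 may be heterogeneous. For example, the file server 200 is a network-attached storage (NAS) device, and the backup storage device 400 is a backup storage device heterogeneous with the NAS device. Being heterogeneous with the file server 200 may result from a different manufacturer or from a different model.
Referring to Figure 2-1, the file server 200 includes a network interface 2100, a processor 2102, a memory 2104, a storage interface 2106, and a storage array 2108. The network interface 2100, the processor 2102, the memory 2104, and the storage interface 2106 are connected through a bus, and the storage interface is communicatively connected to the storage array. The network interface 2100 can be provided by one or more network interface cards (Network Interface Card) and is used to access the network. The storage interface 2106 is used to connect to the storage array 2108. The storage array 2108 is used to store data and may also be replaced by another storage device. The memory 2104 stores program instructions that, when executed by the processor 2102, implement the following functions of the file server.
The file server 200 is used to receive a change information acquisition request sent by the backup server 300, where the change information acquisition request is used to request change information of the data on the file server 200; determine the change information of the data on the file server 200, and return to the backup server a data change record between a second snapshot and a first snapshot, where the first snapshot is created before the second snapshot, the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks; receive a data acquisition request sent by the backup server 300, where the data acquisition request includes identification information of the data blocks to be acquired; and return to the backup server 300 the data of the data blocks identified by the identification information.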
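It can be understood that the program instructions stored in the memory 2104 can be logically divided into multiple subsets, and when a subset is executed by the processor 2102, it implements the function of one component of the file server 200. Referring to Figure 2-2, in a specific implementation, the file server 200 includes a file input/output tracker 2101 (FileIOTracker), a block change list provider 2103 (BlockChgListProvider), and a block data provider 2105 (BlockDataProvider).
The block change list provider 2103 is used to receive a change information acquisition request sent by the backup server and send the change information acquisition request to the file input/output tracker, where the change information acquisition request is used to request change information of the data on the file server.
The file input/output tracker 2101 is used to determine the change information of the data on the file server according to the change information acquisition request and return to the backup server a data change record between the second snapshot and the first snapshot, where the data change record is used to record information of the data blocks that have changed between the second snapshot and the first snapshot, and the data change record includes identification information of the changed data blocks.
The block data provider 2105 is used to receive a data acquisition request sent by the backup server and return the data of the corresponding data blocks to the backup server according to the data acquisition request, where the data acquisition request includes identification information of the data blocks to be acquired.
Optionally, the file server 200 further includes a snapshot creator 2107 and a storage unit 2109.
The snapshot creator 2107 is used to receive a snapshot creation request for creating the second snapshot, create the second snapshot according to the snapshot creation request, and notify the file input/output tracker to store the data change record between the second snapshot and the first snapshot in the storage unit 2109, where the first snapshot is the snapshot immediately preceding the second snapshot. The snapshot creator 2107 also returns a snapshot creation success response, which includes the snapshot identifier, to the snapshot trigger in the backup server. In a specific implementation, the response message may further include a snapshot path, which indicates the location of the snapshot in the NAS file system.
The file input/output tracker 2101 is further used to track the data operations of the client on the file server 200 and, after receiving the notification sent by the snapshot creator, store in the storage unit 2109 the data change record between the second snapshot and the first snapshot according to the data operations.
It can be understood that the data blocks provided by the block data provider 2105 are stored in the storage unit 2109. The storage unit 2109 here may be the storage array 2108 mentioned above.
In a specific implementation, when the file server 200 sets up a shared directory for the client, the client can create files in the shared directory through the network, access the files in the shared directory, and read, write, delete, or modify the data in them. If the data in this shared directory is to be backed up, the snapshot request includes indication information indicating the snapshot object, that is, indication information of the shared directory, and the snapshot creator 2107 creates a snapshot of the shared directory after receiving the snapshot request.
The file input/output tracker 2101 can track every file input/output request for the specified shared directory sent by the client 100. If data blocks are added, deleted, or modified in the shared directory, it records the identifier of each changed data block, the file identifier of the file to which the changed data block belongs, and the corresponding operation identifier. When the snapshot creator 2107 creates a snapshot, all change records in the shared directory between this snapshot and the previous snapshot are stored in the storage unit as the data change record. This snapshot and the previous snapshot are two adjacent snapshots. The data change record includes the file identifier, the data block identifier, and the operation identifier (add, delete, or modify). It can be understood that when the snapshot is created during the first data backup, there is no previous snapshot to compare with. There are therefore two options. Option one: assume that the previous snapshot is snapshot 0, compare against it, and derive the data change record. Option two: keep no data change record for the first backup and record only the snapshot. Both options cover the same set of data blocks.
The change information that the block change list provider 2103 provides to the backup server 300 usually refers to the data change record between the current snapshot and some earlier snapshot. The current snapshot corresponds to this backup, that is, the second snapshot mentioned above, and the earlier snapshot is the first snapshot, which corresponds to an earlier backup. The second snapshot and the first snapshot may or may not be adjacent snapshots. When the snapshot is created during the first data backup, there is no earlier snapshot to compare with. Corresponding to the two options of the file input/output tracker above, the block change list provider 2103 can therefore also handle this case in two ways. Method one: the block change list provider 2103 provides the data change record returned by the file input/output tracker 2101 to the backup server as the change information. Method two: the block change list provider 2103 returns the change information returned by the file input/output tracker 2101 to the backup server. In method two, the change information is what the file input/output tracker 2101 determines from the snapshot identifier carried in the change information acquisition request: the information of the files in the shared directory when snapshot 1 was created, and the information of the data blocks contained in those files. That is, in method two, the block change list provider 2103 returns the determined file information and data block information to the backup server 300 as the change information.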
参考图2-3,所述备份服务器300中包括网络接口3102、处理器3104、存储器3106,其中,所述的网络接口3102,处理器3104以及存储器3106之间通过总线连接,所述网络接口3102可以由一个或多个网络接口卡(Network Interface Card)来提供,用于接入网络。存储器3106中存储着一些程序指令,当这些指令被处理器3104执行时,用于实现下述备份服务器300的功能。
所述备份服务器300,用于在备份数据被触发后,向文件服务器200发送变化信息获取请求,所述变化信息获取请求用于请求所述文件服务器200上的数据的变化信息;接收文件服务器200返回的第二快照与第一快照之间的数据更改记录,所述第一快照在所述第二快照之前创建,所述数据更改记录用于记录两次快照之间发生了更改的数据块的相关信息,所述数据更改记录包括所述发生了更改的数据块的标识信息;根据所述数据更改记录从所述文件服务器200获取所述数据块的标识信息所标识的数据块的数据并存储到备份存储设备400中;建立本次备份的数据映射关系,其中,所述数据映射关系包括所述第二快照中的所有数据块的标识信息,所述数据块在备份存储设备中的存储位置以及所述第二快照的快照标识。
可以理解的是,存储器3106中存储的程序指令可以在逻辑上划分为多个子集合,当每个子集合被处理器执行的时候,用于实现备份服务器包括的各组件的功能。参考图2-4,在一个具体的实现中,备份服务器300包括块变化列表读取器3103(BlockChgListReader)、块数据读取器3105(BlockDataReader)以及块地图组织器3107(BlockMapOrganizer)。其中,
块变化列表读取器3103,用于在备份数据被触发后,向文件服务器200发送变化信息获取请求,所述变化信息获取请求用于请求所述文件服务器上的数据的变化信息,接收所述文件服务器返回的第二快照与第一快照之间的数据更改记录,所述数据更改记录用于记录所述第二快照和所述第一快照之间发生了更改的数据块的信息,所述第一快照在所述第二快照之前创建,所述数据更改记录包括所述发生了更改的数据块的标识信息。在一种具体的实现中,所述请求中携带本次备份创建的快照的快照标识以及之前某次备份时创建的快照的快照标识。其中,所述本次备份创建的快照为所述第二快照,所述之前某次备份时创建的快照是所述第一快照。如果是首次备份,并没有之前某次备份的信息,那么在该 变化信息获取请求中,只携带本次备份创建的快照的快照标识。如果非首次备份,那么在变化信息获取请求中携带所述第二快照的快照标识以及所述第一快照的快照标识。可以理解的是,所述第二快照和所述第一块快照可以不是相邻的快照。
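The block data reader 3105 is configured to obtain from the file server 200, according to the data change record, the data of the data blocks identified by the identification information of the data blocks. Optionally, the identification information of a data block may include the identifier of the data block and the file identifier of the file to which the data block belongs.
The block map organizer 3107 is configured to store the obtained data in the backup storage device and to create the data mapping for this backup, where the data mapping includes the identification information of all data blocks in the second snapshot, the storage locations of all data blocks of the second snapshot in the backup storage device, and the snapshot identifier of the second snapshot.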
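Optionally, the data change record further includes the corresponding operation identifiers, and the block map organizer establishes the data mapping as follows: it copies the data mapping of the previous backup in the backup storage device and modifies the copy according to the data change record to obtain the data mapping for this backup.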
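Optionally, the backup server further includes a snapshot trigger 3101, which is configured to send a snapshot request to the file server and to receive a snapshot-creation success response returned by the first device (the file server); the response includes the snapshot identifier of the second snapshot. After receiving the identifier of the second snapshot, the snapshot trigger forwards it to the block change list reader, which can save the received snapshot identifier so that it can be looked up later when the data change record needs to be obtained. The snapshot identifier can be stored in various ways, and the embodiments of the present invention impose no limitation on this. In a specific implementation, the snapshot trigger sends the snapshot request to the snapshot creator when a preset condition is met. If the backup targets a particular shared directory, the snapshot request contains indication information for that shared directory.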
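The backup storage device includes a data area for storing data and an area for storing the data mapping of each backup. The backup storage device is the device described above that is heterogeneous to the file server: it may be a file server made by a different vendor, or it may be a different type of storage device, such as a storage area network (SAN) device.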
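In the backup system of the embodiments of the present invention, each time the client operates on the file server or on a shared directory of the file server, the change information of the data blocks on the file server or in that shared directory is tracked. During an incremental backup, only the changed data blocks are provided to the backup server, which backs them up. This avoids storing again the data of blocks that have not changed, which improves the performance of the backup storage device and also solves the prior-art problem of wasted backup storage.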
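Referring to FIG. 3, an embodiment of the present invention further provides a data backup method that is applied in the backup system described above. Suppose the object to be backed up is a shared directory that the file server shares with the client; the client can access files in that shared directory and can also create files in it, read and write existing files, or delete them. Assume that the shared directory on the file server is "//IP/MyShare/". The implementation of this embodiment is described below using the backup of the data in the shared directory "//IP/MyShare/" as an example.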
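302. The file server receives a write input/output (I/O) request sent by the client and, according to the write I/O request, determines that the data is to be written to the shared directory //IP/MyShare/.
Referring to FIG. 3-1, after a period of I/O operations the shared directory contains file 1 and file 2. File 1 includes data block 1 and data block 2; the data written in data block 1 is ABC, and the data written in data block 2 is DEF. File 2 includes data block 1, in which the data written is XYZ.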
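Specifically, the FileIOTracker in the file server records each operation (read, write, delete, or modify) that the client sends for the shared directory as a data change record. As shown in Table 1, the data change record includes the identifier of the data block operated on, the file identifier of the file to which the data block belongs, and the corresponding operation identifier.
File identifier    Data block identifier    Operation identifier
File 1             Data block 1             Add
File 1             Data block 2             Add
File 2             Data block 1             Add
Table 1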
Alternatively, operations on the data in the shared directory that occur before the first backup need not be recorded.
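304. When a preset condition is met, the backup server sends a snapshot request to the file server. The snapshot request includes indication information that identifies the snapshot object.
Specifically, in this step the SnapTrigger in the backup server sends the snapshot request to the SnapCreator in the file server, and the snapshot request includes indication information for the shared directory for which the snapshot is to be created. That is, the snapshot request includes NASShare=“//IP/MyShare”.
The preset condition can be set by the user as needed. For example, it can be set to a certain time each day, and the condition is met when that time arrives. If the object to be backed up is not a specific shared directory, the snapshot request need not carry indication information for a shared directory. As mentioned in the earlier embodiments, the backup may target the file server as a whole, in which case the indication information that identifies the snapshot object can be omitted from the snapshot request. In other words, by default the snapshot targets the entire file server.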
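306. After creating the snapshot, the file server returns a snapshot-creation success response to the backup server. The response includes the snapshot identifier. The success response may also include a snapshot path, which indicates where the snapshot is located in the NAS file system, so that the snapshot can be found through the snapshot path when its content is needed.
Correspondingly, the SnapCreator returns the snapshot-creation success response to the SnapTrigger. When the snapshot request includes NASShare=“//IP/MyShare”, the success response message may also include the snapshot path (NASShareSnapshot=“//IP/MyShareSnap-1”) and SnapshotID=Snap-1.
At this point, the data recorded by Snap-1 is the data shown in FIG. 3-1.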
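Optionally, as described above, the file server includes the SnapCreator and the FileIOTracker. The process by which the file server creates the snapshot may include the following: when the SnapCreator receives the snapshot request, it creates snapshot 1 and notifies the FileIOTracker to persist the data change record held in memory. Because this is the first backup, there is no previous snapshot to compare against, so the previous snapshot can be assumed to be snapshot 0. After receiving the notification, the FileIOTracker stores the data change record held in memory on persistent disks in the file server, for example in a disk array, and labels it Snap0-1.
Alternatively, no data change record is recorded for the first snapshot, and only the snapshot is recorded. Snap0-1 and Snap-1 both describe the same data.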
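The tracking and persisting behaviour described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class layout, the `Op` enum, the `record`, `on_snapshot`, and `get` method names, and the in-memory dictionary that stands in for the persistent disk are all assumptions made for the example and do not come from the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    ADD = "add"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass(frozen=True)
class ChangeRecord:
    file_id: str
    block_id: str
    op: Op


class FileIOTracker:
    """Hypothetical tracker; a dict stands in for persistent storage."""

    def __init__(self):
        self._pending = []    # changes seen since the previous snapshot
        self._persisted = {}  # label such as "Snap0-1" -> list of records

    def record(self, file_id, block_id, op):
        # Called for every tracked I/O on the shared directory.
        self._pending.append(ChangeRecord(file_id, block_id, op))

    def on_snapshot(self, prev_no, new_no):
        # Called when the snapshot creator notifies the tracker.
        label = f"Snap{prev_no}-{new_no}"
        self._persisted[label] = list(self._pending)
        self._pending.clear()
        return label

    def get(self, label):
        return self._persisted[label]


tracker = FileIOTracker()
tracker.record("file1", "block1", Op.ADD)
tracker.record("file1", "block2", Op.ADD)
tracker.record("file2", "block1", Op.ADD)
first_label = tracker.on_snapshot(0, 1)  # first backup: previous snapshot assumed to be 0
```

After the call to `on_snapshot`, the three records of Table 1 are stored under the label `Snap0-1` and the pending list is empty, ready to collect the changes that follow Snap-1.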
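308. The backup server sends a change-information request to the file server.
After the snapshot is created, the backup server determines the range of data to be backed up this time. If the data has never been backed up before, all files in the shared directory are backed up, which is what is usually called a full backup. In this case, the BlockChgListReader in the backup server sends a change-information request to the BlocklistChgProvider in the file server. In general, the request carries the snapshot identifier of the snapshot created for this backup and the snapshot identifier of the snapshot created for the previous backup. Because this is the first backup and there is no information about a previous backup, the change-information request carries only the snapshot identifier of the snapshot created for this backup, Snap-1. As noted in the system embodiment above, in another implementation the change-information request may instead carry the data change record Snap0-1 mentioned in step 306.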
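As a minimal sketch of the request described above, the function below builds the change-information request as a plain dictionary. The function name and the field names `current_snapshot` and `previous_snapshot` are hypothetical; the text only states which snapshot identifiers the request carries, not its wire format.

```python
def build_change_info_request(current_snapshot_id, previous_snapshot_id=None):
    """Build a change-information request (field names are assumptions).

    For the first backup there is no previous snapshot, so only the
    identifier of the snapshot created for this backup is carried.
    """
    request = {"current_snapshot": current_snapshot_id}
    if previous_snapshot_id is not None:
        request["previous_snapshot"] = previous_snapshot_id
    return request


first_request = build_change_info_request("Snap-1")
later_request = build_change_info_request("Snap-2", "Snap-1")
```

A receiver can then treat the absence of `previous_snapshot` as the signal that this is a first, full backup.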
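310. The file server returns the change information to the backup server.
Specifically, in this step the BlocklistChgProvider of the file server, after receiving the change-information request, sends it to the FileIOTracker. The FileIOTracker determines the change information to return from the snapshot identifier in the request. Because the received request carries only the snapshot identifier Snap-1, this is evidently the first backup, so all files in the directory and all data blocks of those files can be treated as newly added. From Snap-1, the FileIOTracker determines the information about the files in the shared directory at the time snapshot 1 was created and about the data blocks contained in those files, and returns that information to the backup server as the change information.
If the change-information request carries the data change record Snap0-1, then, because Snap0-1 and Snap-1 describe the same data, what is finally returned is likewise the information about the files in the shared directory at the time snapshot 1 was created and about the data blocks contained in those files.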
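In this embodiment, the returned change information may be as shown in Table 2 below and includes file identifiers and the corresponding data block identifiers.
File identifier    Data block identifier
File 1             Data block 1
File 1             Data block 2
File 2             Data block 1
Table 2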
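312. The backup server obtains the data in the data blocks to be backed up according to the change information, stores that data in the backup storage device, and establishes the data mapping for this backup.
Specifically, the BlockDataReader in the backup server sends a data read request to the BlockDataProvider in the file server. The data read request carries the identification information of the data blocks indicated by the change information, that is, the data block identifiers and file identifiers of the data blocks in Table 2 above. After receiving the data read request, the BlockDataProvider in the file server uses the file identifiers and data block identifiers in the request to return the data in three data blocks (data block 1 and data block 2 of file 1, and data block 1 of file 2) to the BlockDataReader. The BlockDataReader sends the received data to the BlockMapOrganizer in the backup server, which stores all of the data of the files in the data area of the backup storage device and creates the data mapping for this backup in the backup storage device. The data mapping includes the identifiers of all data blocks in the snapshot of this backup, the file identifiers of the files to which those data blocks belong, the storage locations of the data blocks in the backup storage, and the snapshot identifier. Because every backup begins by creating a snapshot, the snapshot identifier can also be used to identify a particular data backup.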
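As shown in FIG. 3-3, in this data backup (backup 1) the data in data block 1 and data block 2 of file 1 and the data in data block 1 of file 2, that is, ABC, DEF, and XYZ, are stored in the data area. The data mapping for this backup, shown in Table 3 below, includes:
File identifier    Data block identifier    Storage location    Snapshot identifier
File 1             Data block 1             Location 1          Snap-1
File 1             Data block 2             Location 2          Snap-1
File 2             Data block 1             Location 3          Snap-1
Table 3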
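The first (full) backup in step 312 can be sketched as follows. This is an illustration under stated assumptions: `full_backup` is a hypothetical name, the data area is modelled as an append-only Python list, and a storage location is modelled as a 1-based position in that list so that the result lines up with "Location 1" to "Location 3" in Table 3.

```python
def full_backup(snapshot_id, blocks, data_area):
    """Store every block and build the data mapping for a first backup.

    blocks:    dict mapping (file_id, block_id) -> block data
    data_area: list standing in for the data area of the backup storage
    """
    locations = {}
    for key, data in blocks.items():
        data_area.append(data)
        locations[key] = len(data_area)  # 1-based location, as in Table 3
    return {"snapshot_id": snapshot_id, "blocks": locations}


data_area = []
backup1 = full_backup(
    "Snap-1",
    {
        ("file1", "block1"): "ABC",
        ("file1", "block2"): "DEF",
        ("file2", "block1"): "XYZ",
    },
    data_area,
)
```

The returned mapping ties each (file, block) pair to a location and carries the snapshot identifier, which is the same information as the rows of Table 3.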
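Note that the concrete form of a storage location differs between storage devices. For example, when the backup storage is object storage, the storage location can be the name of a bucket in the object storage together with a URL address. A bucket is similar to a folder or a storage object and can contain data as well as metadata that describes the data.
314. The file server receives a write I/O request sent by the client and, according to the write I/O request, determines that data is to be written to the shared directory //IP/MyShare/, deleted from the shared directory //IP/MyShare/, or modified in the shared directory //IP/MyShare/.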
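After the first backup, the file server continues to receive write I/O operations. After a period of time, as shown in FIG. 3-2, the files in the shared directory have changed as follows. In file 1, the data in block 1 has been modified to abc, and block 3 has been added with the data OPQ. File 2 has been deleted. File 3 has been added, and block 1 of file 3 holds the data MNT.
Specifically, after Snap-1 the FileIOTracker in the file server records each operation (read, write, delete, or modify) that the client sends for the shared directory as a data change record. As with Table 1, the data change record includes the identifier of the data block operated on, the file identifier of the file to which the data block belongs, and the corresponding operation identifier.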
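316. The backup server sends a snapshot request to the file server. The snapshot request includes indication information that identifies the snapshot object.
When the condition for performing a backup is met again, the backup server performs the backup operation again. As before, in this step the SnapTrigger in the backup server sends a snapshot request to the SnapCreator in the file server, and the snapshot request includes the path of the shared directory for which the snapshot is to be created. In this embodiment, that is NASShare=“//IP/MyShare”.
318. After creating the snapshot, the file server returns a snapshot-creation success response to the backup server. The response includes the snapshot path and the snapshot ID.
Correspondingly, when the snapshot request includes NASShare=“//IP/MyShare”, the success response message may also include the snapshot path (NASShareSnapshot=“//IP/MyShareSnap-2”) and SnapshotID=Snap-2.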
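The FileIOTracker in the file server holds in memory the data change record for the shared directory between this snapshot and the previous snapshot. The data change record records information about the data changed between two adjacent snapshots and includes the file identifier, the data block identifier, and the corresponding operation identifier (which may be modify, add, or delete). In this step it is the data change record between Snap-1 and Snap-2, which, as shown in Table 4, includes:
File identifier    Data block identifier    Operation identifier
File 1             Data block 1             Modify
File 1             Data block 3             Add
File 2             Data block 1             Delete
File 3             Data block 1             Add
Table 4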
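As described above, the file server includes the SnapCreator and the FileIOTracker. The process by which the file server creates the snapshot includes the following:
When the SnapCreator receives the snapshot request, it creates snapshot 2 and notifies the FileIOTracker to persist the data change record held in memory. After receiving the notification, the FileIOTracker stores the data change record held in memory in a persistent storage device in the file server, for example a disk array, and labels it Snap1-2. At this point, the data recorded by snapshot 2 is the data shown in FIG. 3-2.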
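320. The backup server sends a change-information request to the file server.
After the snapshot is created, the backup server determines the range of data to be backed up this time. Because a backup record already exists, this data backup is an incremental backup, and the backup server must first determine which data has changed between this backup and the previous one. Because every data backup begins by creating a snapshot, the data changes between this backup and the previous one are reflected in the differential data between Snap-2 and Snap-1.
In one specific implementation, the BlockChgListReader in the backup server sends a change-information request to the BlocklistChgProvider in the file server to request the data change information between Snap-1 and Snap-2.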
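322. The file server returns the change information to the backup server.
Specifically, in this step the BlocklistChgProvider, after receiving the change-information request sent by the BlockChgListReader, forwards the request to the FileIOTracker in the file server. The FileIOTracker obtains the previously stored data change record Snap1-2 from the persistent storage device and returns it to the BlocklistChgProvider. The BlocklistChgProvider returns the data change record Snap1-2 to the BlockChgListReader in the backup server.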
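324. The backup server sends a data read request to the file server according to the obtained change information. The data read request carries the identification information of the data blocks to be read.
Specifically, if the returned change information is the data change record Snap1-2, the BlockDataReader in the backup server determines from Snap1-2 that the data to be backed up is the data in Block-1 and Block-3 of file 1 and in Block-1 of file 3. The BlockDataReader then sends a data read request to the BlockDataProvider in the file server, and the request carries the data block identifiers and file identifiers of those data blocks.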
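Choosing which blocks to read from a data change record amounts to keeping the added and modified entries and skipping the deleted ones. The sketch below shows this on the contents of Table 4; the function name `blocks_to_read` and the tuple representation of a record are assumptions made for illustration.

```python
def blocks_to_read(change_record):
    """Return the (file_id, block_id) pairs whose data must be fetched.

    change_record: iterable of (file_id, block_id, op) tuples, where op is
    "add", "modify", or "delete". Deleted blocks have no data to read.
    """
    return [(f, b) for f, b, op in change_record if op in ("add", "modify")]


snap1_2 = [
    ("file1", "block1", "modify"),
    ("file1", "block3", "add"),
    ("file2", "block1", "delete"),
    ("file3", "block1", "add"),
]
to_read = blocks_to_read(snap1_2)
```

For the record in Table 4 this yields Block-1 and Block-3 of file 1 and Block-1 of file 3, matching the blocks named in step 324.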
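326. According to the identification information of the data blocks in the data read request, the file server returns to the backup server the data in the data blocks identified by that identification information.
Specifically, in this step the BlockDataProvider in the file server, after receiving the data read request sent by the BlockDataReader, uses the file identifiers and data block identifiers in the request to return the data in three data blocks (Block-1 and Block-3 of file 1, and Block-1 of file 3) to the BlockDataReader.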
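328. The backup server stores the obtained data in the backup storage and establishes the data mapping for this backup.
Specifically, in this step the BlockMapOrganizer in the backup server stores the obtained data in the data area of the backup storage device and creates the data mapping for this backup in the backup storage device. The data mapping includes the identifiers of all data blocks in the snapshot of this backup, the file identifiers of the files to which those data blocks belong, the storage locations of the data blocks in the backup storage, and the snapshot identifier.
The process of creating the data mapping for this backup includes copying the data mapping of the previous backup in the backup storage device and modifying the copy according to the data change record; the modified data mapping is the data mapping for this backup.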
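As shown in FIG. 3-3, in this data backup the data in data block 1 and data block 3 of file 1 and the data in data block 1 of file 3, that is, abc, OPQ, and MNT, are stored in the data area. The data mapping for this backup, shown in Table 5 below, includes:
File identifier    Data block identifier    Storage location    Snapshot identifier
File 1             Data block 1             Location 4          Snap-2
File 1             Data block 2             Location 2          Snap-2
File 1             Data block 3             Location 5          Snap-2
File 3             Data block 1             Location 6          Snap-2
Table 5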
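The copy-then-modify construction of the data mapping can be illustrated with the values of Tables 3, 4, and 5. The following is a sketch, not the patented implementation: the function name, the dictionary layout, and the `new_locations` argument (the locations at which the newly read blocks were stored) are assumptions introduced for the example.

```python
import copy


def build_incremental_mapping(prev_mapping, change_record, new_locations, snapshot_id):
    """Copy the previous backup's mapping and apply the data change record."""
    blocks = copy.deepcopy(prev_mapping["blocks"])  # leave the old mapping intact
    for file_id, block_id, op in change_record:
        key = (file_id, block_id)
        if op == "delete":
            blocks.pop(key, None)             # deleted blocks leave the mapping
        else:                                 # "add" or "modify"
            blocks[key] = new_locations[key]  # point at the newly stored data
    return {"snapshot_id": snapshot_id, "blocks": blocks}


# Table 3: mapping of backup 1
backup1_map = {
    "snapshot_id": "Snap-1",
    "blocks": {("file1", "block1"): 1, ("file1", "block2"): 2, ("file2", "block1"): 3},
}
# Table 4: data change record Snap1-2
snap1_2_record = [
    ("file1", "block1", "modify"),
    ("file1", "block3", "add"),
    ("file2", "block1", "delete"),
    ("file3", "block1", "add"),
]
# Locations where abc, OPQ, and MNT were stored during backup 2
stored_at = {("file1", "block1"): 4, ("file1", "block3"): 5, ("file3", "block1"): 6}

backup2_map = build_incremental_mapping(backup1_map, snap1_2_record, stored_at, "Snap-2")
```

The resulting mapping has the four rows of Table 5: data block 2 of file 1 still points at location 2 because it was never re-read, and data block 1 of file 2 is gone. The mapping of backup 1 is left unchanged, so both backups remain readable.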
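Because data block identifiers can be the same in different files, a data block should be uniquely identified by combining its identifier with the file identifier of the file that contains it. The data block identifier and the file identifier of the containing file are together referred to as the identification information of the data block. The identification information of a data block can also be implemented in other ways, and the embodiments of the present invention impose no limitation on this.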
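Referring to the figures, in this embodiment the data in data block 1 and data block 3 of file 1 changed between this backup and the previous backup, so Block-1 and Block-3 are backed up again. Data block 2 of file 1 did not change during this period, so in this incremental backup it is not backed up again and is not transferred between the backup server and the file server. If data block 2 of file 1 needs to be read as of snapshot 2, location 2 can be found from Table 5 above and the data of data block 2 can be read from there. Data block 1 of file 2 has been deleted, so it is not in the snapshot of this backup; and because the operation identifier of that data block is delete, the data mapping for this backup does not include any information about it either.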
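Reading a block as of a given backup is then a lookup in that backup's data mapping followed by a read at the stored location. The sketch below uses the values of Table 5 and FIG. 3-3; `read_block`, the list-based data area, and the 1-based locations are assumptions made for illustration.

```python
def read_block(mapping, data_area, file_id, block_id):
    """Look up a block in a backup's data mapping and read its data.

    Raises KeyError if the block is not part of that backup.
    """
    location = mapping["blocks"][(file_id, block_id)]
    return data_area[location - 1]  # locations are 1-based in this sketch


# Data area after backup 1 and backup 2 (locations 1 to 6)
stored_data = ["ABC", "DEF", "XYZ", "abc", "OPQ", "MNT"]
snap2_mapping = {
    "snapshot_id": "Snap-2",
    "blocks": {
        ("file1", "block1"): 4,
        ("file1", "block2"): 2,
        ("file1", "block3"): 5,
        ("file3", "block1"): 6,
    },
}
unchanged_block = read_block(snap2_mapping, stored_data, "file1", "block2")
```

The unchanged block is served from location 2, where it was written during the first backup, while a lookup of data block 1 of file 2 fails because the deleted block is absent from the mapping of the second backup.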
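There are scenarios in which the data to be backed up is not the data between two consecutive snapshots. For example, if the second backup in the embodiment above fails, then at the third backup a snapshot Snap-3 is created, and what needs to be backed up is the differential data between Snap-1 and Snap-3. In this case, the BlockChgListReader in the backup server sends to the BlocklistChgProvider in the file server a change-information request for the changes between Snap-1 and Snap-3.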
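Correspondingly, in step 322, after the BlocklistChgProvider receives the change-information request for Snap-1 to Snap-3 sent by the BlockChgListReader, it forwards the request to the FileIOTracker in the file server. The FileIOTracker obtains the previously stored data change records Snap1-2 and Snap2-3 from the persistent storage device and returns them to the BlocklistChgProvider. The BlocklistChgProvider overlays the data change records Snap1-2 and Snap2-3 to obtain the data change record Snap1-3. An example of the overlay rules follows:
1) A data block created before Snap-1 and deleted between Snap-2 and Snap-3 is marked as deleted;
2) A data block created before Snap-1 and modified between Snap-2 and Snap-3 is marked as modified;
3) A data block created before Snap-1 and deleted between Snap-1 and Snap-2 is marked as deleted;
4) A data block created before Snap-1 that was modified between Snap-1 and Snap-2 and not deleted between Snap-2 and Snap-3 is marked as modified;
5) A data block created after Snap-1 and deleted between Snap-2 and Snap-3 does not need to be recorded in Snap1-3;
6) A data block created after Snap-1 and not deleted between Snap-2 and Snap-3 is recorded in Snap1-3 as added;
7) A data block changed after Snap-1 and not deleted between Snap-2 and Snap-3 is marked in Snap1-3 as modified, whether or not it changed between Snap-2 and Snap-3.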
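The overlay rules above can be expressed as a small composition table over pairs of operations. The sketch below is one possible reading of those rules: the function name `overlay`, the dictionary representation, and the sample Snap2-3 record are hypothetical. The treatment of a block that is deleted and then added again (recorded here as modified) is not covered by the listed rules and is an assumption of this sketch, as is raising an error for combinations that cannot occur in a consistent history.

```python
# (op in earlier record, op in later record) -> op in the merged record.
# None means the block needs no entry in the merged record.
_COMPOSE = {
    ("add", "modify"): "add",        # rule 6: created after Snap-1, still present
    ("add", "delete"): None,         # rule 5: created and deleted after Snap-1
    ("modify", "modify"): "modify",  # rule 7
    ("modify", "delete"): "delete",  # rule 1
    ("delete", "add"): "modify",     # assumption: not covered by the listed rules
}


def overlay(earlier, later):
    """Merge two adjacent data change records.

    Each record maps (file_id, block_id) -> "add" | "modify" | "delete".
    """
    merged = dict(earlier)  # entries only in `earlier` carry over (rules 3, 4, 6)
    for key, later_op in later.items():
        if key not in merged:
            merged[key] = later_op  # entries only in `later` (rules 1, 2, 6)
            continue
        pair = (merged[key], later_op)
        if pair not in _COMPOSE:
            raise ValueError(f"inconsistent history for {key}: {pair}")
        result = _COMPOSE[pair]
        if result is None:
            del merged[key]
        else:
            merged[key] = result
    return merged


rec_1_2 = {  # Table 4
    ("file1", "block1"): "modify",
    ("file1", "block3"): "add",
    ("file2", "block1"): "delete",
    ("file3", "block1"): "add",
}
rec_2_3 = {  # hypothetical later record, for illustration only
    ("file1", "block1"): "modify",
    ("file1", "block2"): "modify",
    ("file1", "block3"): "delete",
    ("file3", "block1"): "modify",
}
rec_1_3 = overlay(rec_1_2, rec_2_3)
```

With this sample input, data block 3 of file 1 disappears from the merged record because it was both created and deleted after Snap-1, while data block 1 of file 3 stays recorded as added even though it was modified later.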
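After obtaining Snap1-3, the BlocklistChgProvider returns the data change record Snap1-3 to the BlockChgListReader, which sends the data change record to the BlockDataReader in the backup server.
Correspondingly, in step 324, if the returned change information is the data change record Snap1-3, the BlockDataReader in the backup server determines from Snap1-3 which data blocks hold the data to be backed up, and the data read request sent to the file server carries the data block identifiers of those data blocks. With the method described above, the change information of data blocks can be traced during data backup, so an incremental backup backs up only the data blocks that have changed. This greatly reduces the amount of duplicate data that is backed up and thus solves the prior-art problem of wasted backup storage.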
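Note that the embodiments above divide the internal components of the backup server only schematically and not restrictively. Interaction between internal components may involve the transfer of actual data or the transfer of signals. For example, after the BlockDataReader receives the data returned by the BlockDataProvider, the BlockMapOrganizer may establish a mirror with the backup storage device and mirror the data obtained by the BlockDataReader to the backup storage device.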
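A person of ordinary skill in the art will understand that aspects of the present invention, or possible implementations of those aspects, may take the form of a computer program product, meaning computer-readable program code stored in a computer-readable medium. The computer-readable medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or apparatus, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), and optical discs. A processor in a computer reads the computer-readable program code stored in the computer-readable medium so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowcharts. The computer-readable program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that in some alternative implementations the functions noted in the steps of the flowcharts or the blocks of the block diagrams may occur out of the order shown in the figures. For example, depending on the functions involved, two steps or two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.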
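A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered to be beyond the scope of the present invention.
The foregoing is merely a description of specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or replacement that a person of ordinary skill in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.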

Claims (16)

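  1. A data backup method, applied to a storage system, wherein the storage system includes a first storage device, a second storage device, and a backup storage device, a first logical unit (LUN) is built in the first storage device, a second logical unit (LUN) is built in the second storage device, and the first LUN and the second LUN are set in an active-active relationship, characterized in that the method includes:
    when the first storage device initiates a backup of the first LUN, sending a request message for querying a data consistency point to the second storage device to which the second LUN belongs, wherein the request message includes an IO data status record of the first LUN, and the IO data status record is used to record the IOs flushed to disk in the first LUN;
    receiving information about the data consistency point that the second storage device obtains according to the IO data status record of the first LUN and the IO data status record of the second LUN stored in the second storage device;
    creating a snapshot of the first LUN according to the data consistency point, and providing the differential data between this snapshot and the previous snapshot, wherein the differential data is written into a backup image created for this backup in the backup storage device.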
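  2. The method according to claim 1, characterized in that the method includes:
    the path of the backup image in the backup storage device includes a preset path and the WWN of the first LUN.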
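  3. The method according to claim 1, characterized in that the IO data status record includes an initiator identifier and the sequence number of the host IO received by the initiator, and the process of obtaining the data consistency point includes:
    the second storage device determines, according to the IO data status record of the first LUN and the IO data status record of the second LUN, the IOs among those written by the host to the first LUN that have been flushed to disk in the first storage device and have been flushed to disk in the second storage device, and takes the latest sequence number of the determined IOs as a first data consistency point; and
    determines, according to the IO data status record of the first LUN and the IO data status record of the second LUN, the IOs among those written by the host to the second LUN that have been flushed to disk in the second storage device and have been flushed to disk in the first storage device, and takes the latest sequence number of the determined IOs as a second data consistency point; wherein
    the data consistency point includes the first data consistency point and the second data consistency point.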
  4. The method according to claim 3, characterized in that the second storage device creates a snapshot according to the data consistency point.
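  5. The method according to claim 4, characterized in that creating the snapshot according to the data consistency point includes:
    determining, according to the first data consistency point and the second data consistency point, the sequence numbers of the write IOs included in this snapshot;
    creating a snapshot volume, looking up the corresponding data in the IO data status record of the first LUN and the IO data status record of the second LUN according to the sequence numbers of the write IOs, and writing the data found into the snapshot volume.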
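  6. The method according to claim 5, characterized in that the method further includes:
    if the corresponding data cannot be found in the write IO log of the first LUN or the write IO log of the second LUN, looking up the corresponding data in the first storage device or in the second storage device, and writing the data found into the snapshot volume.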
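  7. The method according to claim 1, wherein before the first storage device initiates the backup of the first LUN, the method further includes:
    the first storage device sets a dual-end backup policy, wherein the dual-end backup policy includes a first backup policy that the first storage device executes for the first LUN and a second backup policy that the second storage device executes for the second LUN;
    wherein the first storage device initiates the backup of the first LUN according to the first backup policy.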
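  8. The method according to claim 7, characterized in that the method further includes:
    the first storage device sends the dual-end backup policy to the second storage device;
    when the second storage device detects that the first storage device is operating normally, the second storage device executes the second backup policy;
    when the second storage device detects that the first storage device is faulty, the second storage device executes the first backup policy in place of the first storage device.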
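  9. The method according to claim 8, characterized in that the method further includes:
    in the process of obtaining the differential data between this snapshot and the previous snapshot and writing it into the backup image, the first storage device periodically sends breakpoint information to the backup storage device, wherein the breakpoint information includes a backup task ID, the offset address up to which the data in the first LUN has been backed up, and a backup image ID;
    executing the first backup policy in place of the first storage device includes:
    the second storage device finds, from the task information of the peer end recorded at the local end, the backup task ID of the backup that the first storage device was executing before the fault;
    finding the breakpoint information in the backup storage device according to the backup task ID, and continuing to obtain the differential information for backup according to the breakpoint information.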
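  10. A data backup method, applied to a storage system, wherein the storage system includes a first storage device, a second storage device, and a backup storage device, a first LUN is built in the first storage device, a second LUN is built in the second storage device, and the first LUN and the second LUN are set in an active-active relationship, characterized in that the method includes:
    when the first storage device initiates a backup of the first LUN, suspending host IO and writing the write IOs previously cached in the first storage device to the hard disk;
    while suspending host IO, the first storage device sends a notification message to the second storage device, wherein the notification message instructs the second storage device to suspend host IO and create a snapshot;
    after the snapshot identifier returned by the second storage device is received and the write IOs previously cached in the first storage device have been flushed to disk, the first storage device creates a snapshot of the first LUN, creates a backup image for this backup in the backup storage device, and writes the differential data between this snapshot and the previous snapshot into the backup image.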
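  11. The method according to claim 10, characterized in that the method further includes:
    after receiving the notification message, the second storage device suspends host IO at the local end and flushes to disk the write IOs previously cached in the first storage device.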
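  12. A storage system, wherein the storage system includes a first storage device, a second storage device, and a backup storage device, a first logical unit (LUN) is built in the first storage device, a second logical unit (LUN) is built in the second storage device, and the first LUN and the second LUN are set in an active-active relationship, characterized in that:
    the first storage device is configured to: when initiating a backup of the first LUN, send a request message for querying a data consistency point to the second storage device to which the second LUN belongs, wherein the request message includes an IO data status record of the first LUN, and the IO data status record is used to record the IOs flushed to disk in the first LUN; receive a response message returned by the second storage device that includes information about the data consistency point; and create a snapshot of the first LUN according to the data consistency point;
    the first storage device is further configured to receive a request for querying differential data and return the corresponding differential data according to the request, wherein the differential data is written into a backup image created for this backup in the backup storage device;
    the second storage device is configured to receive the request message for querying the data consistency point, obtain the information about the data consistency point according to the IO data status record of the first LUN and the IO data status record of the second LUN stored in the second storage device, and return the response message to the first storage device;
    the backup storage device is configured to provide the backup image.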
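  13. A storage device, wherein at least one LUN is provided in the storage device, the at least one LUN includes a first LUN, and the first LUN is set in an active-active relationship with a second LUN in another storage device, characterized in that the storage device includes a dual-end consistency snapshot apparatus;
    the dual-end consistency snapshot apparatus is configured to: when a backup of the first LUN is initiated, send a request message for querying a data consistency point to the second storage device to which the second LUN belongs, wherein the request message includes an IO data status record of the first LUN, and the IO data status record is used to record the IOs flushed to disk in the first LUN; receive a response message returned by the second storage device that includes information about the data consistency point; and create a snapshot of the first LUN according to the data consistency point;
    the first LUN is configured to receive a request for querying differential data and return the corresponding differential data according to the request, wherein the differential data is written into a backup image created for this backup in the backup storage device.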
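  14. The storage device according to claim 13, characterized in that the IO data status record includes the identifier of the LUN that received the IO from the host and the sequence number of the IO received from the host;
    the dual-end consistency snapshot apparatus is further configured to receive a request message for querying a data consistency point, wherein the request message comes from the second storage device to which the second LUN belongs, the request message includes an IO data status record of the second LUN, and that IO data status record is used to record the IOs flushed to disk in the second LUN;
    the dual-end consistency snapshot apparatus is further configured to determine, according to the IO data status record of the first LUN stored in the storage device and the IO data status record of the second LUN, the IOs among those written by the host to the first LUN that have been flushed to disk in the first storage device and have been flushed to disk in the second storage device, and take the latest sequence number of the determined IOs as a first data consistency point; and
    determine, according to the IO data status record of the first LUN and the IO data status record of the second LUN, the IOs among those written by the host to the second LUN that have been flushed to disk in the second storage device and have been flushed to disk in the first storage device, and take the latest sequence number of the determined IOs as a second data consistency point; wherein
    the data consistency point includes the first data consistency point and the second data consistency point.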
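  15. The storage device according to claim 13, characterized in that the storage device further includes a dual-end policy management apparatus and a backup apparatus, wherein:
    the backup apparatus is configured to receive a backup policy creation request and forward the backup policy creation request to the dual-end policy management apparatus, wherein the backup policy creation request is used to create a backup policy;
    the dual-end policy management apparatus is configured to set a dual-end backup policy according to the active-active relationship between the first LUN and the second LUN, wherein the dual-end backup policy includes a first backup policy applicable to the first LUN and a second backup policy applicable to the second LUN;
    wherein the dual-end policy management apparatus generates a backup task for the first LUN according to the first backup policy.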
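  16. The storage device according to claim 13, characterized in that:
    the dual-end policy management apparatus is specifically configured to determine, according to the first data consistency point and the second data consistency point, the sequence numbers of the write IOs included in this snapshot, create a snapshot volume, look up the corresponding data in the IO data status record of the first LUN and the IO data status record of the second LUN according to the sequence numbers of the write IOs, and write the data found into the snapshot volume.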
PCT/CN2019/090090 2018-10-22 2019-06-05 一种备份数据的方法、装置和系统 WO2020082744A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19874958.2A EP3862883B1 (en) 2018-10-22 2019-06-05 Data backup method and apparatus, and system
US17/235,557 US11907078B2 (en) 2018-10-22 2021-04-20 Data backup method, apparatus, and system

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201811232224 2018-10-22
CN201811232224.X 2018-10-22
CN201811433033.X 2018-11-28
CN201811433033.XA CN111078464B (zh) 2018-10-22 2018-11-28 一种备份数据的方法、装置和系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/235,557 Continuation US11907078B2 (en) 2018-10-22 2021-04-20 Data backup method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2020082744A1 true WO2020082744A1 (zh) 2020-04-30

Family

ID=70310038

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090090 WO2020082744A1 (zh) 2018-10-22 2019-06-05 一种备份数据的方法、装置和系统

Country Status (4)

Country Link
US (1) US11907078B2 (zh)
EP (1) EP3862883B1 (zh)
CN (1) CN111078464B (zh)
WO (1) WO2020082744A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4300314A3 (en) 2018-12-29 2024-04-10 Huawei Technologies Co., Ltd. Data backup method, apparatus and system
US11126365B2 (en) * 2019-03-11 2021-09-21 Commvault Systems, Inc. Skipping data backed up in prior backup operations
CN112214352B (zh) * 2020-10-16 2023-02-17 天津七所高科技有限公司 一种基于Ethernet/IP的焊机设备数据自动备份方法及装置
CN112650447B (zh) * 2020-12-18 2024-02-13 北京浪潮数据技术有限公司 一种ceph分布式块存储的备份方法、系统及装置
CN113238891A (zh) * 2021-03-19 2021-08-10 浪潮云信息技术股份公司 一种基于备份链的备份删除方法及系统
CN113157699B (zh) * 2021-04-25 2024-10-15 上海淇玥信息技术有限公司 一种业务数据审核方法、装置和电子设备
CN115543695B (zh) * 2022-11-29 2023-08-15 苏州浪潮智能科技有限公司 一种数据备份方法、装置及电子设备和存储介质
CN117421160B (zh) * 2023-11-01 2024-04-30 广州鼎甲计算机科技有限公司 数据备份方法、装置、计算机设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019888A (zh) * 2012-12-21 2013-04-03 华为技术有限公司 备份方法与装置
CN104375904A (zh) * 2014-10-30 2015-02-25 浪潮电子信息产业股份有限公司 一种基于快照差异化数据传输的容灾备份方法
CN107391314A (zh) * 2017-07-31 2017-11-24 郑州云海信息技术有限公司 一种支持双活的数据一致性保持方法与装置
US20180260281A1 (en) * 2017-03-08 2018-09-13 Hewlett Packard Enterprise Development Lp Restoring a storage volume from a backup

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6279011B1 (en) * 1998-06-19 2001-08-21 Network Appliance, Inc. Backup and restore for heterogeneous file server environment
US20040268068A1 (en) * 2003-06-24 2004-12-30 International Business Machines Corporation Efficient method for copying and creating block-level incremental backups of large files and sparse files
JP4741371B2 (ja) * 2006-01-05 2011-08-03 株式会社日立製作所 システム、サーバ装置及びスナップショットの形式変換方法
CN103034566B (zh) * 2012-12-06 2015-07-22 华为技术有限公司 虚拟机还原的方法和装置
US20160125059A1 (en) * 2014-11-04 2016-05-05 Rubrik, Inc. Hybrid cloud data management system
US9904598B2 (en) * 2015-04-21 2018-02-27 Commvault Systems, Inc. Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
CN106095622A (zh) * 2016-06-22 2016-11-09 上海爱数信息技术股份有限公司 数据备份方法及装置
US11663084B2 (en) * 2017-08-08 2023-05-30 Rubrik, Inc. Auto-upgrade of remote data management connectors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3862883A4 *

Also Published As

Publication number Publication date
US11907078B2 (en) 2024-02-20
CN111078464A (zh) 2020-04-28
EP3862883A1 (en) 2021-08-11
EP3862883A4 (en) 2021-12-22
CN111078464B (zh) 2024-06-25
EP3862883B1 (en) 2023-04-19
US20210240578A1 (en) 2021-08-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19874958

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019874958

Country of ref document: EP

Effective date: 20210507