WO2018120844A1 - Differential data backup method and differential data backup device - Google Patents

Differential data backup method and differential data backup device

Publication number
WO2018120844A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
snapshot
numbers
backup
production volume
Prior art date
Application number
PCT/CN2017/096782
Other languages
French (fr)
Chinese (zh)
Inventor
廖基祥
欧阳戟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018120844A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques

Definitions

  • Embodiments of the present invention relate to the field of storage technologies, and in particular, to a differential data backup method and a differential data backup device.
  • a typical data disaster recovery system includes a production center and a disaster recovery center.
  • hosts and storage arrays are deployed for normal service operations.
  • hosts and storage arrays are deployed to take over the services after a disaster occurs in the production center.
  • the storage array of the production center or the disaster recovery center includes multiple data volumes, and the data volume is a logical storage space mapped by physical storage space. After the data generated by the service of the production center is written to the production array, it can be backed up to the disaster recovery center through the DR link and written to the disaster recovery array. To ensure that the data of the disaster recovery center can support the service takeover after the disaster occurs, the data backed up to the disaster recovery array must ensure consistency.
  • Assuring data consistency essentially means that there is a dependency write request, and the dependency needs to be guaranteed.
  • Applications, operating systems, and databases all rely on this logic of write data request dependencies to run their services. For example, write data request 1 is completed first and write data request 2 is completed afterwards, in a fixed order. That is to say, the system ensures that write data request 2 is sent only after write data request 1 has completely returned successfully. Therefore, an inherent method can be relied on to recover the service when a failure interrupts the execution process. Otherwise, the following situation may occur: when reading data, the data stored by write data request 2 can be read but the data stored by write data request 1 cannot, which makes the service unrecoverable.
  • a snapshot is an image of data at a certain point in time (the point in time when the copy begins).
  • the purpose of the snapshot is to create a state view of the data volume at a specific point in time. Through this view, only the data of the data volume as it was at the time of creation can be seen; modifications made to the data volume after this point in time (newly written data) are not reflected in the snapshot view. With this snapshot view, a backup of the data can be made.
  • since the snapshot data is “stationary”, the production center can back up the snapshot data to the disaster recovery center after taking a snapshot of the data at each point in time, and can thus complete remote data backup without affecting the continued execution of write data requests at the production center.
  • data consistency requirements can also be met. For example, the data of the data request 2 is successfully backed up to the disaster recovery center, and the data of the data request 1 is not successfully backed up. The data of the disaster recovery center can be restored to the previous state by using the snapshot data before the data request 2.
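The point-in-time behaviour of a snapshot described above can be illustrated with a minimal sketch. The `Volume` class and its dictionary-based block map are hypothetical names introduced only for illustration; a real array would usually avoid a full copy by using copy-on-write or redirect-on-write techniques.

```python
class Volume:
    """Toy data volume: maps a logical block address (LBA) to its data."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def snapshot(self):
        # Copy the mapping so that later writes do not change the view.
        # (Assumption for illustration: a full copy instead of copy-on-write.)
        return dict(self.blocks)


volume = Volume()
volume.write(1, b"AAAA")
snap = volume.snapshot()   # state view at this point in time
volume.write(1, b"BBBB")   # modification made after the snapshot
volume.write(2, b"CCCC")   # new data written after the snapshot
```

After these calls, `snap` still holds only the data that existed when the snapshot was taken, while `volume.blocks` reflects both later writes.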
  • however, the production array needs to suspend write data requests before creating the snapshot to prevent changed data from being recorded in the snapshot, which would make the data in the production array inconsistent with the data in the disaster recovery array.
  • suspending write data requests often reduces the efficiency with which the production array processes data.
  • the present application proposes a differential data backup method and a differential data backup device, which can avoid suspending write data requests and improve data processing efficiency.
  • the first aspect of the present application provides a differential data backup method, which is applied to a storage system.
  • the storage system includes a processor, a production volume, and a target volume.
  • the processor acquires a record of the difference data between the two numbers.
  • the number is used to identify the time period during which data is written to the production volume.
  • the first number of the two numbers is a number assigned to the data received by the production volume last time before the first snapshot of the production volume is created.
  • the second of the two numbers is the number assigned to the data received last time for the production volume after the second snapshot of the production volume was created.
  • the record of the difference data includes a logical address of the difference data received within a time period identified by a number between the two numbers.
  • the processor reads backup data from the second snapshot according to a logical address of the difference data, the backup data being a subset of the difference data.
  • the processor then sends the backup data to the target volume.
  • the processor acquires a record of the difference data between the two numbers, and reads the backup data from the second snapshot according to the logical address of the difference data. Since the first number of the two numbers is the number assigned to the data most recently received for the production volume before the first snapshot of the production volume is created, and the second number of the two numbers is the number assigned to the data most recently received for the production volume after the second snapshot of the production volume is created, the difference data is more than the backup data. Therefore, in the differential backup process provided by the first aspect, there is no need to suspend write data requests, and the difference data between the two snapshots can still be backed up to the target volume, thereby ensuring data consistency between the production volume and the target volume. Since there is no need to suspend write data requests, the efficiency of data processing can be improved.
  • all numbers between the two numbers are changed according to a set condition, the set condition including the arrival of a preset backup period or the creation of a snapshot of the production volume. In this way, a number identifies the time period in which data is written to the production volume.
  • the second snapshot is the next snapshot of the first snapshot. Because the second snapshot is the next snapshot of the first snapshot, the data of each backup is guaranteed to be slightly more than the difference data between the two adjacent snapshots and less than the difference data between two non-adjacent snapshots, which reduces the amount of data per backup as much as possible.
  • the method further includes the processor transmitting a logical address of the backup data to the target volume.
  • the number between the two numbers does not include the second number. Since the second number is the starting number of the next backup, the second number may not be included in the current backup to reduce the amount of data backed up.
  • the second aspect of the present application provides a differential data backup apparatus for performing the differential data backup method provided by the first aspect.
  • a third aspect of the present application provides a storage system, including a processor, a production volume, and a target volume, where the processor is configured to perform the differential data backup method provided by the first aspect.
  • a fourth aspect of the present application provides a storage system including a processor, a memory, a production volume, and a target volume, the processor invoking a program in the memory to execute a differential data backup method provided by the first aspect.
  • the present application also provides a computer program product comprising a computer readable storage medium storing program code, the program code comprising instructions executable by the storage system of the third aspect or the fourth aspect for performing at least one method of the first aspect above.
  • with the above computer program product provided by the present application, write data requests need not be suspended during the backup process, thereby improving the efficiency of data processing.
  • FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 3 is a structural diagram of a storage device according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a differential data backup method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of changes in numbers during execution of a differential data backup method according to an embodiment of the present invention.
  • FIG. 6 is a structural diagram of a differential data backup apparatus according to an embodiment of the present invention.
  • the embodiment of the invention provides a data backup method and a storage system, which can avoid hanging write data requests, thereby improving the efficiency of data processing.
  • FIG. 1 depicts a composition diagram of a storage system 10 according to an embodiment of the present invention.
  • the storage system 10 shown in FIG. 1 includes one or more hosts 40 and a storage device 20.
  • the host can be a computing device, such as a terminal device such as a server or a desktop computer.
  • the storage device 20 may be a block-based storage device, such as a Storage Area Network (SAN) device, or a storage device including a file system, such as a Network Attached Storage (NAS) device. This embodiment does not limit the type of the storage device.
  • the Network File System (NFS)/Common Internet File System (CIFS) protocol or the Fibre Channel (FC) protocol can be used for communication.
  • the storage device 20 includes at least one controller 21 and a plurality of hard disks 22.
  • Controller 21 can include any computing device such as a server, desktop computer, or the like. Inside the controller, an operating system and other applications are installed.
  • the controller 21 can send an input/output (I/O) request to the hard disk 22. For example, a write data request is sent to the hard disk 22 such that the hard disk 22 writes the data to be written carried in the write data request into its storage medium.
  • the hard disk 22 can be any of several types of hard disk, such as a Solid State Drive (SSD), a Serial Attached SCSI (SAS) or Fibre Channel (FC) Hard Disk Drive (HDD), or a Serial Advanced Technology Attachment (SATA) or Near Line (NL) SAS HDD, where SCSI stands for Small Computer System Interface; the type is not limited here.
  • a Logic Unit (LU) is a logical storage space distributed over one or more hard disks 22, such as production volume 23 and target volume 24 shown in FIG.
  • the host 40 can send a write data request to the storage system 10, the write data request carrying data to be written to the storage system 10, the data can be block data or a file.
  • the controller 21 receives the data and then writes it into the logical unit of the storage device 20.
  • data needs to be backed up. For example, the data in the production volume 23 is backed up to the target volume 24. When the data in the production volume 23 is damaged, the data stored in the target volume 24 can be used for recovery.
  • FIG. 2 depicts a composition diagram of another storage system 10 that includes one or more hosts 40, a storage device 20, and a storage device 30.
  • the storage device 30 is similar to the storage device 20 and includes at least one controller 31 and a plurality of hard disks 32.
  • the structure and function of the controller 31 are similar to those of the controller 21 of FIG. 1.
  • the structure and function of the hard disk 32 are similar to those of the hard disk 22 of FIG. 1, and will not be described herein.
  • the difference from the application scenario described in FIG. 1 is that the backup in FIG. 1 refers to a backup in one storage device, and the backup in FIG. 2 refers to a backup between two storage devices.
  • storage device 20 needs to back up data on its production volume to target volume 33 of storage device 30.
  • when backing up data in one LU (referred to as a production volume) to another LU (referred to as a target volume), the controller 21 may use a full backup or an incremental backup.
  • a full backup backs up all the data on the production volume. An incremental backup backs up the data modified since the last full backup or the last incremental backup (whichever is later). Because it is limited to backing up modified data (also known as difference data), this kind of backup is fast and saves storage space.
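The contrast between the two backup modes can be sketched as follows. This is only an illustration under simplifying assumptions: volumes are modelled as dictionaries keyed by logical block address, and the set of modified addresses is assumed to be tracked elsewhere; the function names are hypothetical.

```python
def full_backup(production):
    """Return every block on the production volume."""
    return dict(production)


def incremental_backup(production, modified_lbas):
    """Return only the blocks modified since the last backup."""
    return {lba: production[lba] for lba in sorted(modified_lbas) if lba in production}


production = {1: "a1", 2: "b1", 3: "c1"}
target = full_backup(production)   # the first backup copies everything

production[2] = "b2"               # modification after the backup
production[4] = "d1"               # new data after the backup
changes = incremental_backup(production, {2, 4})
target.update(changes)             # only two blocks are transferred
```

Here the incremental backup transfers two blocks instead of four, and the target again matches the production volume.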
  • FIG. 3 depicts a composition diagram of the controller 21 provided by the embodiment of the present invention.
  • the controller 21 includes at least an interface 211, a processor 212, and a memory 213.
  • the interface 211 is configured to communicate with the host 40 or the hard disk 22 or the storage device 30.
  • the processor 212 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
  • the processor 212 can be used to process input/output (I/O) requests to the hard disk 22, back up data in the production volume to the target volume, and the like.
  • the controller 21 can implement functions such as IO operation, data backup, and the like.
  • the processor 212 is configured to execute the program 214, and specifically, the related steps in the following method embodiments may be performed.
  • the memory 213 is configured to store the program 214.
  • the memory 213 may include a cache memory, may also include a high speed RAM memory, and may also include a non-volatile memory, such as at least one hard disk memory. It can be understood that the memory 213 can be a random access memory (RAM), a magnetic disk, a hard disk, or a solid state disk (SSD).
  • the memory 213 can also be used to cache data received from the host 40 or data read from the hard disk 22.
  • Program 214 can include an operating system, a file system, and other software modules.
  • FIG. 4 is a flowchart of a data backup method provided by this embodiment, and the steps shown in FIG. 4 are performed by the processor 212 shown in FIG.
  • refer also to FIG. 5, which reflects the change of the number in the following backup process.
  • In step S101, the storage device 20 receives one or more write data requests.
  • Each write data request includes data to be written (referred to as data) and a logical address of the data to be written.
  • the logical address includes an identifier of a volume, a logical block address, and a length.
  • the identifier of the volume indicates the volume to which the data is to be written.
  • in this embodiment, the volume to which the data is to be written is, by way of example, the production volume.
  • the logical block address indicates the location of the data in the volume, and the length represents the size of the data.
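The structure of a write data request described above can be modelled with a small sketch. The class and field names (`LogicalAddress`, `WriteRequest`, `volume_id`, `lba`, `length`) are assumptions chosen for illustration and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalAddress:
    volume_id: str  # identifier of the volume to be written
    lba: int        # logical block address: location of the data in the volume
    length: int     # size of the data in bytes


@dataclass(frozen=True)
class WriteRequest:
    address: LogicalAddress
    data: bytes


# Example: 8 bytes of data destined for volume A at logical block address 1.
request = WriteRequest(LogicalAddress("A", 1, 8), b"12300000")
```

Making the address immutable (`frozen=True`) lets it serve as a dictionary key, which is convenient when recording which addresses were written under which number.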
  • In step S102, the storage device 20 allocates a number to each write data request, and the number is used to identify the time period in which the write data request is received.
  • the storage device 20 includes a number table in which a plurality of numbers are included, and each number is sequentially incremented in ascending order. For example, the plurality of numbers are 0, 1, 2, 3, 4, ..., respectively.
  • Write data requests received during a certain period of time are all assigned the same number. Assume that the initial value of the number is the first number 0. When a specific condition is satisfied, the storage device 20 changes the number from the first number 0 to the second number 1, the second number 1 being the next number after the first number 0. Write data requests received in the next time period are then assigned the second number 1.
  • the storage device 20 maintains a number generator in which the initial value of the number is recorded, assuming that the initial value of the number is the first number 0.
  • the write data request received during a certain period of time is assigned the first number 0.
  • the number initial value in the number generator is subjected to an operation of incrementing by 1, so that the next number of the first number 0 is the second number 1.
  • the write data request received in the next time period is assigned the second number 1.
  • Specific conditions here include the arrival of a preset backup cycle or the creation of a snapshot of the production volume.
  • each write data request is assigned the first number 0 in step S102.
  • the storage device 20 records the correspondence between the first number 0 and the logical address included in each write data request.
  • the storage device 20 receives three write data requests.
  • the data to be written carried by the first write data request is written to volume A; its logical block address is 00001 and its length is 8 bytes. The second write data request also carries data to be written, with a length of 8 bytes.
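The numbering scheme of step S102 can be sketched as below. This is a minimal illustration, not the patented implementation: `NumberGenerator`, `NumberedWriter`, and `on_condition` are hypothetical names, and the logical addresses used in the example are illustrative tuples of (volume, logical block address, length).

```python
class NumberGenerator:
    """Holds the current number; the initial value is assumed to be 0."""

    def __init__(self, initial=0):
        self.current = initial

    def advance(self):
        self.current += 1  # move to the next number
        return self.current


class NumberedWriter:
    def __init__(self):
        self.generator = NumberGenerator()
        self.records = []  # list of (number, logical_address)

    def receive(self, logical_address):
        # Every request received in the same time period gets the same number.
        number = self.generator.current
        self.records.append((number, logical_address))
        return number

    def on_condition(self):
        """Call when a backup period arrives or a snapshot is created."""
        return self.generator.advance()


writer = NumberedWriter()
n1 = writer.receive(("A", 1, 8))
n2 = writer.receive(("A", 2, 8))
writer.on_condition()             # specific condition satisfied: 0 -> 1
n3 = writer.receive(("A", 3, 8))
```

The first two requests share number 0, and the request received after the condition is assigned number 1; `writer.records` keeps the correspondence between numbers and logical addresses.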
  • Step S103: When the preset backup period arrives, the storage device 20 performs a data backup operation.
  • the data in the production volume is sent to the target volume over a number of backup periods, each of a preset length. If it is the first backup period, the storage device 20 needs to send all of the data in the production volume to the target volume, a process also referred to as full backup. If it is not the first backup period, the storage device 20 may send all of the data in the production volume to the target volume, or may send only the difference data to the target volume.
  • because the storage device 20 continues to receive write data requests while it is transmitting data to the target volume, the data in the production volume and the data in the target volume may become inconsistent. To ensure data consistency between the production volume and the target volume, the storage device 20 uses the number to distinguish write data requests received before the backup period arrives from those received after it arrives. In addition, the storage device 20 does not read data directly from the production volume and send it to the target volume; instead, it creates a snapshot of the production volume, reads data from the snapshot, and sends that data to the target volume.
  • the storage device 20 performs the following operations:
  • after creating the snapshot, the storage device 20 modifies the number again. For example, after creating the first snapshot of the production volume, the second number 1 is modified to the third number 2. Write data requests received after the second number is changed to the third number and before the number is next modified (when the next backup period arrives) are assigned the third number 2. However, write data requests received by the storage device 20 after the first snapshot of the production volume is created and before the second number is modified to the third number are still assigned the second number 1, and the data carried by these write data requests is not recorded in the first snapshot. Therefore, the first snapshot includes the data stored in the logical addresses corresponding to the first number 0 and part of the data stored in the logical addresses corresponding to the second number 1.
  • the storage device 20 reads the first snapshot of the production volume, and sends the data included in the first snapshot and the logical address of the data to the target volume.
  • the backed up data includes the data stored in the logical addresses corresponding to the first number 0 and part of the data stored in the logical addresses corresponding to the second number 1.
  • the storage device 20 completes a full backup.
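The order of operations in the full backup (change the number, create the snapshot, change the number again, send the snapshot data) can be sketched as follows. The `StorageDevice` class is hypothetical, and the two parameters of `full_backup` are an assumption used only to simulate write data requests that arrive just before and just after the snapshot, since writes are never suspended.

```python
class StorageDevice:
    def __init__(self):
        self.production = {}  # lba -> data
        self.target = {}
        self.number = 0       # current number (first number 0)
        self.records = []     # list of (number, lba)

    def write(self, lba, data):
        self.production[lba] = data
        self.records.append((self.number, lba))

    def full_backup(self, writes_before_snapshot=(), writes_after_snapshot=()):
        self.number += 1                  # backup period arrives: 0 -> 1
        for lba, data in writes_before_snapshot:
            self.write(lba, data)         # numbered 1, captured by the snapshot
        snapshot = dict(self.production)  # first snapshot (toy full copy)
        for lba, data in writes_after_snapshot:
            self.write(lba, data)         # numbered 1, NOT in the snapshot
        self.number += 1                  # after the snapshot: 1 -> 2
        self.target.update(snapshot)      # send snapshot data and addresses
        return snapshot


device = StorageDevice()
device.write(1, "a")
device.write(2, "b")
snapshot = device.full_backup([(3, "c")], [(4, "d")])
```

In this run the block at address 4 carries number 1 but is absent from both the snapshot and the target, which is why a later incremental backup has to start again from number 1.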
  • the process of incremental backup is described below.
  • In step S103, after the second number is changed to the third number, all the write data requests received by the storage device 20 are assigned the third number 2. These write data requests may include write data requests that modify the data in the production volume, as shown in Table 2.
  • storage device 20 receives three write data requests.
  • the data to be written carried by the first write data request will be written to volume A, with a logical block address of 00001 and a length of 8 bytes. Since the logical block address and length of the data carried by this write data request are the same as those of the data carried by the first write data request shown in Table 1, the data carried by this write data request overwrites the data carried by the first write data request shown in Table 1.
  • the data to be written carried by the second write data request will be written to volume A, with a logical block address of 00002 and a length of 8 bytes. Likewise, the data carried by this write data request overwrites the data carried by the second write data request shown in Table 1.
  • the data to be written carried by the third write data request is newly written data, which will be written to volume A, with a logical block address of 00004 and a length of 8 bytes.
  • After step S103, the following steps are further included.
  • Step S104: When the backup period arrives, the storage device 20 modifies the third number to the fourth number. As described earlier, the number assigned to each write data request in step S103 is the third number 2. Then, when the current backup period arrives, the storage device 20 modifies the third number 2 to the fourth number 3, and the fourth number 3 is the number after the third number 2. Then, the write data request received later will be assigned the fourth number 3.
  • Step S105: Create a snapshot of the production volume.
  • the snapshot here is referred to as a second snapshot.
  • the point in time for each snapshot is the data consistency point for the production and target volumes.
  • the data consistency point is the point in time at which the data of the production volume is consistent with the data of the target volume.
  • Step S106: Modify the fourth number to the fifth number.
  • after creating the snapshot, the storage device 20 modifies the number again. For example, the fourth number 3 is modified to the fifth number 4. Write data requests received after the fourth number 3 is changed to the fifth number 4 and before the number is next modified (when the next backup period arrives) are assigned the fifth number 4. However, write data requests received by the storage device 20 after the second snapshot of the production volume is created and before the fourth number 3 is modified to the fifth number 4 are still assigned the fourth number 3, and the data carried by these write data requests is not recorded in the second snapshot.
  • the second snapshot includes the data stored in the logical addresses corresponding to the first number 0, the data stored in the logical addresses corresponding to the second number 1, the data stored in the logical addresses corresponding to the third number 2, and part of the data stored in the logical addresses corresponding to the fourth number 3.
  • Step S107: Determine the logical address of the difference data after the second number up to and before the fifth number.
  • the difference data includes the difference data received in the time period identified by the second number 1, the difference data received in the time period identified by the third number 2, and the difference data received in the time period identified by the fourth number 3, but The difference data received during the time period identified by the fifth number 4 is not included.
  • the second number is the number most recently assigned to a write data request before the first snapshot is created, and the fifth number is the number most recently assigned to a write data request after the second snapshot is created. Therefore, the difference data after the second number up to and before the fifth number is more than the data written after the first snapshot is created and before the second snapshot is created.
  • the record of the difference data after the second number up to and before the fifth number is as shown in Table 3.
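Step S107 amounts to selecting the logical addresses recorded under a half-open range of numbers. The following sketch assumes the records are (number, logical address) pairs; the function name and the sample addresses under numbers 0 and 4 are hypothetical, while the addresses under numbers 2 and 3 follow the values mentioned in the description.

```python
def difference_addresses(records, first_number, second_number):
    """Addresses recorded under numbers in [first_number, second_number)."""
    seen = set()
    result = []
    for number, address in records:
        in_range = first_number <= number < second_number
        if in_range and address not in seen:
            seen.add(address)  # an address rewritten later is listed once
            result.append(address)
    return result


# (number, (volume, logical block address, length))
records = [
    (0, ("A", 3, 8)),
    (2, ("A", 1, 8)),
    (2, ("A", 2, 8)),
    (2, ("A", 4, 8)),
    (3, ("A", 8, 4)),
    (3, ("A", 4, 8)),
    (4, ("A", 9, 8)),
]
addresses = difference_addresses(records, 1, 4)  # second number 1, fifth number 4
```

The range includes the second number 1 and excludes the fifth number 4, so the records under numbers 0 and 4 are left out and the address written twice appears only once.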
  • Step S108: Read backup data from the second snapshot according to the logical address of the difference data.
  • the backup data read from the second snapshot may be the difference data or a subset of the difference data. Since only part of the data stored in the logical addresses corresponding to the fourth number 3 is recorded in the second snapshot, the storage device 20 may not be able to obtain all the difference data recorded in step S107. For example, assume that within the time period identified by number 3, storage device 20 receives two write data requests. The first write data request is received before the second snapshot is created, and the data it carries has a logical block address of 00008 and a length of 4 bytes. The second write data request is received after the second snapshot is created, and the data it carries has a logical block address of 00004 and a length of 8 bytes.
  • in that case, from the second snapshot, the storage device 20 can obtain only the data carried by the first write data request, together with the data that the second snapshot stores at the logical address of the second write data request. For example, the second write data request carries the data 45600000, with a logical block address of 00004 and a length of 8 bytes, but the data at that logical block address and length in the second snapshot is 12300000. Therefore, storage device 20 still backs up 12300000 to the target volume.
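The effect of reading from the snapshot rather than from the production volume can be shown with a short sketch. The snapshot is modelled as a dictionary keyed by (logical block address, length); the function name, the data value "7890", and the address (16, 8) are hypothetical, while 12300000 and 45600000 are the values used in the example above.

```python
def read_backup_data(snapshot, addresses):
    """Read what the snapshot holds at each difference address."""
    backup = {}
    for address in addresses:
        if address in snapshot:  # absent: first written after the snapshot
            backup[address] = snapshot[address]
    return backup


# Second snapshot: taken before the write of 45600000 to address (4, 8).
second_snapshot = {(8, 4): "7890", (4, 8): "12300000"}
production = {(8, 4): "7890", (4, 8): "45600000", (16, 8): "99999999"}

difference = [(8, 4), (4, 8), (16, 8)]
backup = read_backup_data(second_snapshot, difference)
```

The backup contains the older value 12300000 for address (4, 8) and nothing for (16, 8), so the target volume ends up consistent with the second snapshot rather than with the still-changing production volume.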
  • Step S109: Send the backup data to the target volume.
  • the storage device 20 may further send the logical address of the backup data to the target volume, such that the location where the backup data is saved in the target volume is consistent with the location where the backup data is saved in the production volume.
  • the storage device 20 completes an incremental backup. Since the storage device 20 records more difference data than the backup data, the storage device 20 does not need to suspend write data requests when performing the incremental backup, and can still guarantee that the difference data between the two snapshots is backed up to the target volume, which ensures data consistency between the production volume and the target volume.
  • This embodiment also provides a differential data backup device 66.
  • the device 66 is located in a storage system that includes a production volume and a target volume. As shown in FIG. 6, the device 66 includes a reading module 661 and a transmitting module 662.
  • the reading module 661 is configured to obtain a record of difference data between two numbers, where a number is used to identify a time period in which data is written into the production volume, the first number of the two numbers is the number assigned to the data most recently received for the production volume before the first snapshot of the production volume is created, the second number of the two numbers is the number assigned to the data most recently received for the production volume after the second snapshot of the production volume is created, the record of the difference data includes a logical address of the difference data received within the time period identified by a number between the two numbers, and the numbers between the two numbers do not include the second number; and to read backup data from the second snapshot based on the logical address of the difference data, the backup data being a subset of the difference data.
  • for details of the reading module 661, refer to the description of step S101 to step S108; details are not described herein again.
  • the reading module 661 may be implemented by the processor 212 shown in FIG. 3 calling the program 214 in the memory 213. In this case, the processor 212 is a CPU.
  • the reading module 661 can also be implemented independently by the processor 212. In this case, the processor 212 is a Field-Programmable Gate Array (FPGA) or other processing chip.
  • the sending module 662 is configured to send the backup data to the target volume.
  • the sending module 662 may be implemented by the processor 212 shown in FIG. 3 calling the program 214 in the memory 213. In this case, the processor 212 is a CPU.
  • the sending module 662 can also be implemented independently by the processor 212. In this case, the processor 212 is a Field-Programmable Gate Array (FPGA) or other processing chip.
  • all the numbers between the two numbers are changed according to a set condition, the set condition including that the preset backup period arrives or that a snapshot of the production volume is created.
  • the second snapshot is the next snapshot of the first snapshot.
  • the sending module 662 is further configured to send the logical address of the backup data to the target volume.
  • the number between the two numbers does not include the second number.
  • in the differential data backup apparatus provided in this embodiment, the first number of the two numbers is the number assigned to the data most recently received for the production volume before the first snapshot of the production volume is created, and the second number of the two numbers is the number assigned to the data most recently received for the production volume after the second snapshot of the production volume is created, so the difference data is more than the backup data. Therefore, the differential data backup device provided in this embodiment does not need to suspend write data requests, and can still ensure that the difference data between the two snapshots is backed up to the target volume, thereby ensuring data consistency between the production volume and the target volume. Since there is no need to suspend write data requests, the efficiency of data processing can be improved.
  • aspects of the present invention, or possible implementations of various aspects may be embodied as a system, method, or computer program product.
  • aspects of the invention, or possible implementations of various aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or an embodiment combining software and hardware aspects, which are collectively referred to herein as "circuits", "modules", or "systems".
  • aspects of the invention, or possible implementations of various aspects may take the form of a computer program product, which is a computer readable program code stored in a computer readable medium.
  • Computer readable media include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing, such as random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM), and optical discs.
  • the processor in the computer reads the computer readable program code stored in the computer readable medium such that the processor can perform the functional actions specified in each step or combination of steps in the flowchart.
  • the computer readable program code can execute entirely on the user's computer, partly on the user's computer, as a separate software package, partly on the user's computer and partly on the remote computer, or entirely on the remote computer or server.
  • the functions noted in the various steps in the flowcharts or in the blocks in the block diagrams may not occur in the order noted. For example, two steps, or two blocks, shown in succession may be executed substantially concurrently or the blocks may be executed in the reverse order.

Abstract

Disclosed are a differential data backup method, a differential data backup device and a storage system. The storage system comprises a processor, a production volume and a target volume. The processor acquires a record of differential data between two numbers, wherein the numbers are used to mark a period in which data is written to the production volume. A first number of the two numbers is a number allocated to data most recently received by the production volume before creating a first snapshot of the production volume; a second number of the two numbers is a number allocated to data most recently received by the production volume after creating a second snapshot of the production volume. The record of differential data comprises logical addresses of the differential data received in the periods marked by all numbers between the two numbers. The processor acquires backup data from the second snapshot according to the record of differential data and sends the backup data to the target volume. The present invention can prevent a data-writing request from being suspended, thus increasing the efficiency of data processing in the storage system.

Description

Differential data backup method and differential data backup device
Technical field
Embodiments of the present invention relate to the field of storage technologies, and in particular to a differential data backup method and a differential data backup device.
Background
A typical data disaster recovery system includes a production center and a disaster recovery center. In the production center, hosts and storage arrays are deployed for normal service operation; in the disaster recovery center, hosts and storage arrays are deployed to take over the services after a disaster occurs at the production center. The storage arrays of both the production center and the disaster recovery center contain multiple data volumes, where a data volume is a segment of logical storage space mapped from physical storage space. After data generated by services in the production center is written to the production array, it can be backed up to the disaster recovery center over a disaster recovery link and written to the disaster recovery array. To ensure that the data in the disaster recovery center can support service takeover after a disaster, the data backed up to the disaster recovery array must be consistent. Guaranteeing data consistency essentially means that the dependencies between dependent write data requests must be preserved. Applications, operating systems, and databases all inherently rely on the logic of these write dependencies to run their services. For example, write data request 1 is completed first and then write data request 2, in a fixed order; that is, the system issues write data request 2 only after write data request 1 has returned successfully. This is what allows the service to be recovered by built-in means when a failure interrupts execution. Otherwise, a situation may arise in which, when data is read, the data stored by write data request 2 can be read but the data stored by write data request 1 cannot, which makes the service unrecoverable.
In the prior art, snapshot technology can be used to solve this problem. A snapshot is an image of data at a certain point in time (the point at which the copy begins). The purpose of a snapshot is to create a view of a data volume's state at a specific point in time. Through this view, only the data that the volume held at the moment of creation can be seen; modifications made to the volume after that point (newly written data) are not reflected in the snapshot view. This snapshot view can be used to back up data. For the production center, because snapshot data is "static", the production center can create snapshots of the data at various points in time and then back up the snapshot data to the disaster recovery center, which completes the remote backup without affecting the continued execution of write data requests at the production center. For the disaster recovery center, the data consistency requirement can also be met. For example, if the data of write data request 2 is successfully backed up to the disaster recovery center but the data of write data request 1 is not, the snapshot data from before write data request 2 can be used to restore the disaster recovery center's data to its previous state.
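The snapshot behaviour described above can be illustrated with a minimal sketch. It assumes a volume modelled as a Python dict from logical block address to data; the `Volume` class and its methods are hypothetical names, and the full copy stands in for whatever copy-on-write or redirect-on-write mechanism a real array would use.

```python
from types import MappingProxyType


class Volume:
    """Toy data volume: maps a logical block address to the bytes stored there."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def create_snapshot(self):
        # Full copy for simplicity (an assumption of this sketch); the result
        # is a read-only view of the volume's state at this instant.
        return MappingProxyType(dict(self.blocks))


volume = Volume()
volume.write(1, b"request-1")
snapshot = volume.create_snapshot()
volume.write(1, b"request-1-modified")  # later write, not visible in the snapshot
volume.write(2, b"request-2")           # new data, also absent from the snapshot
```

The snapshot keeps returning `b"request-1"` for block 1 while the volume itself moves on, which is what makes it usable as a stable backup source.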
However, to guarantee data consistency between the production array and the disaster recovery array, the production array must suspend write data requests before creating a snapshot, so that no changed data goes unrecorded in the snapshot and leaves the data in the production array inconsistent with the data in the disaster recovery array. Suspending write data requests, however, tends to reduce the efficiency with which the production array processes data.
Summary of the invention
The present application proposes a differential data backup method and a differential data backup device that avoid suspending write data requests and thereby improve data processing efficiency.
A first aspect of the present application provides a differential data backup method applied to a storage system. The storage system includes a processor, a production volume, and a target volume. The processor acquires a record of the differential data between two numbers. A number identifies a time period during which data is written to the production volume. The first of the two numbers is the number most recently assigned to data received by the production volume before a first snapshot of the production volume is created. The second of the two numbers is the number most recently assigned to data received by the production volume after a second snapshot of the production volume is created. The record of the differential data includes the logical addresses of the differential data received within the time periods identified by the numbers between the two numbers. The processor reads backup data from the second snapshot according to the logical addresses of the differential data, the backup data being a subset of the differential data. The processor then sends the backup data to the target volume.
With the differential data backup method of the first aspect, the processor acquires the record of the differential data between the two numbers and reads the backup data from the second snapshot according to the logical addresses of the differential data. Because the first number is the number most recently assigned to data received by the production volume before the first snapshot is created, and the second number is the number most recently assigned to data received by the production volume after the second snapshot is created, the differential data covers more than the backup data. As a result, the differential backup process of the first aspect does not need to suspend write data requests, yet it still guarantees that the differential data between the two snapshots is backed up to the target volume, which preserves data consistency between the production volume and the target volume. Because write data requests need not be suspended, data processing efficiency improves.
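The method of the first aspect can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the record is modelled as a dict from number to logical block addresses, snapshots and volumes are plain dicts, and `differential_backup` and its `include_second` parameter are hypothetical names. The sample values are invented for the example.

```python
def differential_backup(record, first_number, second_number,
                        second_snapshot, target_volume, include_second=True):
    """Copy the blocks named by the differential record from the second snapshot."""
    last = second_number if include_second else second_number - 1
    addresses = set()
    for number in range(first_number, last + 1):
        addresses.update(record.get(number, ()))
    sent = []
    for lba in sorted(addresses):
        # Addresses written only after the second snapshot are absent from it,
        # so the backup data is a subset of the differential data.
        if lba in second_snapshot:
            target_volume[lba] = second_snapshot[lba]
            sent.append(lba)
    return sent


# Hypothetical state with first_number = 1 and second_number = 3.
record = {1: [5], 2: [1, 2, 4], 3: [6]}          # number -> logical block addresses
second_snapshot = {1: b"a2", 2: b"b2", 3: b"c1", 4: b"d1", 5: b"e1"}
target_volume = {1: b"a1", 2: b"b1", 3: b"c1"}   # contents after the earlier backup

sent = differential_backup(record, 1, 3, second_snapshot, target_volume)
```

Block 6 appears in the record but was written after the second snapshot, so it is skipped here and left for the next backup; the target volume ends up identical to the second snapshot without any write having been suspended.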
With reference to the first aspect, in a first implementation of the first aspect, all numbers between the two numbers change according to a set condition, where the set condition includes the arrival of a preset backup period or the creation of a snapshot of the production volume. In this way, numbers are used to identify the time periods in which data is written to the production volume.
With reference to the first aspect or the first implementation of the first aspect, in a second implementation of the first aspect, the second snapshot is the next snapshot after the first snapshot. Because the second snapshot is the next snapshot after the first snapshot, each backup carries slightly more data than the differential data between two adjacent snapshots and less than the differential data between two non-adjacent snapshots, which keeps the amount of data per backup as small as possible.
With reference to the first aspect or any implementation of the first aspect, in a third implementation of the first aspect, the method further includes the processor sending the logical addresses of the backup data to the target volume.
With reference to the first aspect or any implementation of the first aspect, in a fourth implementation of the first aspect, the numbers between the two numbers do not include the second number. Because the second number is the starting number of the next backup, it may be left out of the current backup to reduce the amount of data backed up.
A second aspect of the present application provides a differential data backup device configured to perform the differential data backup method provided by the first aspect.
The implementations of the second aspect of the present application are similar to those of the first aspect.
A third aspect of the present application provides a storage system that includes a processor, a production volume, and a target volume, the processor being configured to perform the differential data backup method provided by the first aspect.
A fourth aspect of the present application provides a storage system that includes a processor, a memory, a production volume, and a target volume, the processor invoking a program in the memory to perform the differential data backup method provided by the first aspect.
The present application further provides a computer program product including a computer readable storage medium that stores program code, the program code including instructions that can be executed by the storage system of the third or fourth aspect to perform at least one method of the first aspect.
The computer program product provided by the present application likewise allows a backup to proceed without suspending write data requests, which improves data processing efficiency.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below.
FIG. 1 is a diagram of an application scenario according to an embodiment of the present invention;
FIG. 2 is a diagram of another application scenario according to an embodiment of the present invention;
FIG. 3 is a structural diagram of a storage device according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a differential data backup method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of how the numbers change during execution of the differential data backup method according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a differential data backup device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention provide a data backup method and a storage system that avoid suspending write data requests and thereby improve data processing efficiency.
The application scenarios of the embodiments of the present invention are introduced below.
FIG. 1 depicts the composition of a storage system 10 according to an embodiment of the present invention. The storage system 10 shown in FIG. 1 includes one or more hosts 40 and a storage device 20. A host may be a computing device such as a server or a desktop computer. The storage device 20 may be a block-based storage device, such as a storage area network (SAN) device, or a storage device containing a file system, such as a network attached storage (NAS) device. This embodiment does not limit the type of storage device. Communication between the host 40 and the storage device 20, and between storage devices 20, may use the Network File System (NFS)/Common Internet File System (CIFS) protocol or the Fibre Channel (FC) protocol.
The storage device 20 includes at least one controller 21 and several hard disks 22. The controller 21 may include any computing device, such as a server or a desktop computer. An operating system and other applications are installed in the controller. The controller 21 can send input/output (I/O) requests to the hard disks 22. For example, it sends a write data request to a hard disk 22 so that the hard disk 22 writes the data carried in the request to its storage medium.
The hard disks 22 may be of various types, for example a solid state drive (SSD), a Serial Attached SCSI (SAS) or Fibre Channel (FC) hard disk drive (HDD), or a Serial Advanced Technology Attachment (SATA) or near-line (NL) SAS HDD, where SCSI stands for Small Computer System Interface; no limitation is imposed here. A logical unit (LU) is a segment of logical storage space distributed over one or more hard disks 22, such as the production volume 23 and the target volume 24 shown in FIG. 1. The host 40 can send a write data request to the storage system 10; the request carries the data to be written to the storage system 10, which may be block data or a file. The controller 21 receives the data and writes it into a logical unit of the storage device 20. In practice, data often needs to be backed up to ensure reliability. For example, the data in the production volume 23 is backed up to the target volume 24 so that, if the data in the production volume 23 is damaged, it can be restored from the data stored in the target volume 24.
The embodiments of the present invention also apply to another application scenario, shown in FIG. 2. FIG. 2 depicts the composition of another storage system 10, which includes one or more hosts 40, a storage device 20, and a storage device 30. The storage device 30 is similar to the storage device 20 and includes at least one controller 31 and several hard disks 32. The structure and function of the controller 31 are similar to those of the controller 21 in FIG. 1, and the structure and function of the hard disks 32 are similar to those of the hard disks 22 in FIG. 1, so they are not described again here. The difference from the scenario of FIG. 1 is that the backup in FIG. 1 takes place within one storage device, whereas the backup in FIG. 2 takes place between two storage devices. For example, the storage device 20 needs to back up the data on its production volume to the target volume 33 of the storage device 30.
In both the scenario of FIG. 1 and the scenario of FIG. 2, when the controller 21 backs up the data in one LU (called the production volume) to another LU (called the target volume), it can use either full backup or incremental backup.
A full backup is a complete backup of all data on the production volume. An incremental backup backs up the data modified since the last full backup or incremental backup, whichever is later. Because only the modified data (also called differential data) is backed up, this kind of backup is very fast and saves storage space.
The structure of the controller 21 is described below. FIG. 3 depicts the composition of the controller 21 according to an embodiment of the present invention.
The controller 21 includes at least an interface 211, a processor 212, and a memory 213.
The interface 211 is used to communicate with the host 40, the hard disks 22, or the storage device 30.
The processor 212 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The processor 212 can process input/output (I/O) requests to the hard disks 22, back up the data in the production volume to the target volume, and so on, so that the controller 21 can provide functions such as I/O operations and data backup. In the embodiments of the present invention, the processor 212 executes the program 214 and can specifically perform the relevant steps of the method embodiments below.
The memory 213 stores the program 214. The memory 213 may include a cache, high-speed RAM, or non-volatile memory such as at least one hard disk storage. The memory 213 may be, for example, random access memory (RAM), a magnetic disk, a hard disk, or a solid state disk (SSD). The memory 213 can also cache data received from the host 40 or read from the hard disks 22.
The program 214 may include an operating system, a file system, and other software modules.
The data backup process is described below with reference to the two application scenarios above and FIG. 3. FIG. 4 is a flowchart of the data backup method provided by this embodiment; the steps shown in FIG. 4 are performed by the processor 212 shown in FIG. 3. FIG. 5 shows how the numbers change during the backup process described below.
Step S101: the storage device 20 receives one or more write data requests. Each write data request includes the data to be written (referred to simply as the data) and the logical address of that data. The logical address includes a volume identifier, a logical block address, and a length. The volume identifier indicates the volume to which the data is to be written; this embodiment takes the production volume as the example throughout. The logical block address indicates where the data is located in the volume, and the length indicates the size of the data.
Step S102: the storage device 20 assigns a number to each write data request; the number identifies the time period in which the write data request was received. The storage device 20 holds a number table containing multiple numbers that increase in ascending order, for example 0, 1, 2, 3, 4, and so on. All write data requests received within a given time period are assigned the same number. Suppose the initial number is a first number 0. When a specific condition is met, the storage device 20 changes the first number to a second number 1, the next number after 0, so write data requests received in the next time period are all assigned the second number 1. Alternatively, the storage device 20 maintains a number generator that records an initial number, say the first number 0. Write data requests received within a given time period are all assigned the first number 0. When the specific condition is met, the number held by the generator is incremented by 1, so the number after the first number 0 is the second number 1, and write data requests received in the next time period are all assigned the second number 1. The specific condition here includes the arrival of a preset backup period or the creation of a snapshot of the production volume. As an example, assume that every write data request in step S102 is assigned the first number 0. The storage device 20 records the correspondence between the first number 0 and the logical addresses contained in the write data requests.
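The number generator variant of step S102 can be sketched as below. This is a minimal illustration assuming an in-memory record; `LogicalAddress`, `NumberGenerator`, `assign`, and `advance` are hypothetical names, and the addresses are sample values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalAddress:
    """Logical address carried by a write data request (step S101)."""
    volume_id: str
    lba: int      # logical block address
    length: int   # size of the data in bytes


class NumberGenerator:
    """Assigns the current number to each request and records number -> addresses."""

    def __init__(self, initial=0):
        self.current = initial
        self.record = defaultdict(list)

    def assign(self, address):
        self.record[self.current].append(address)
        return self.current

    def advance(self):
        # Called when the specific condition is met: a preset backup period
        # arrives or a snapshot of the production volume is created.
        self.current += 1
        return self.current


gen = NumberGenerator()
for lba in (1, 2, 3):
    gen.assign(LogicalAddress("A", lba, 8))    # all three receive number 0
gen.advance()                                  # condition met: number becomes 1
n = gen.assign(LogicalAddress("A", 4, 8))      # next period receives number 1
```

Keeping only logical addresses in the record, not the data itself, is what lets the later backup read current contents from a snapshot.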
Table 1
Number    Volume identifier    Logical block address    Length (unit: byte)
0         A                    00001                    8
0         A                    00002                    8
0         A                    00003                    8
As shown in Table 1, during the time period identified by number 0, the storage device 20 receives three write data requests. The data carried by the first write data request is to be written to volume A at logical block address 00001 with a length of 8 bytes; the data carried by the second write data request is to be written to volume A at logical block address 00002 with a length of 8 bytes; and the data carried by the third write data request is to be written to volume A at logical block address 00003 with a length of 8 bytes.
Step S103: when the preset backup period arrives, the storage device 20 performs a data backup operation.
The data in the production volume is sent to the target volume over several backup periods, where a backup period is a preset length of time. In the first backup period, the storage device 20 must send all the data in the production volume to the target volume, a process known as full backup. In any later backup period, the storage device 20 may either send all the data in the production volume to the target volume or send only the differential data.
Because the storage device 20 continues to receive write data requests while it sends data to the target volume, the data in the production volume and the data in the target volume can become inconsistent. To keep the two consistent, the storage device 20 uses numbers to distinguish the write data requests received before the backup period arrives from those received after it. In addition, the storage device 20 does not read data directly from the production volume to send to the target volume; instead, it creates a snapshot of the production volume and reads the data to be sent from that snapshot.
Specifically, when the preset backup period arrives, the storage device 20 performs the following operations:
1. Change the first number to the second number. As described above, before the preset backup period arrives, each write data request is assigned the first number 0. After the preset backup period arrives, write data requests received before the number is next changed are all assigned the second number 1.
2. Create a first snapshot of the production volume. In practice, the storage device 20 creates snapshots of the production volume periodically.
3. Change the second number to a third number. As described above, once the first snapshot of the production volume has been created, the storage device 20 changes the number again, for example from the second number 1 to the third number 2. Write data requests received after this change and before the next change of number (when the next backup period arrives) are all assigned the third number 2. However, write data requests received by the storage device 20 after the first snapshot is created but before the second number is changed to the third number are still assigned the second number 1, and the data they carry is not recorded in the first snapshot. Therefore, the first snapshot includes the data stored at the logical addresses corresponding to the first number 0 and the data stored at some of the logical addresses corresponding to the second number 1.
待上述操作执行完毕之后,存储设备20读取所述生产卷的第一快照,将所述第一快照中包括的数据以及所述数据的逻辑地址发送给目标卷。由前面的描述可知,在这次备份操作中,备份的数据包括第一编号0对应的逻辑地址中存储的数据以及部分第二编号1对应的逻辑地址中存储的数据。After the execution of the above operation is completed, the storage device 20 reads the first snapshot of the production volume, and sends the data included in the first snapshot and the logical address of the data to the target volume. As can be seen from the foregoing description, in this backup operation, the backed up data includes data stored in a logical address corresponding to the first number 0 and data stored in a logical address corresponding to a portion of the second number 1.
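The three operations above can be illustrated with a minimal Python sketch. It is not the implementation of the storage device 20: the class `ProductionVolume`, the function `full_backup`, and the use of in-memory dictionaries for the volume, the snapshot, and the target volume are all assumptions made for illustration.

```python
class ProductionVolume:
    """Toy production volume that tags every write with the current number."""

    def __init__(self):
        self.current_number = 0   # the "first number" 0 in the text
        self.blocks = {}          # logical block address -> data
        self.write_log = []       # (number, logical block address, length)

    def write(self, lba, data):
        # Record the number that is current when the request is received.
        self.write_log.append((self.current_number, lba, len(data)))
        self.blocks[lba] = data

    def bump_number(self):
        self.current_number += 1
        return self.current_number

    def snapshot(self):
        # Point-in-time copy; a real array would typically use copy-on-write.
        return dict(self.blocks)


def full_backup(volume, target):
    """Full backup following operations 1 to 3, then copy the snapshot."""
    volume.bump_number()        # 1. first number 0 -> second number 1
    snap = volume.snapshot()    # 2. create the first snapshot
    volume.bump_number()        # 3. second number 1 -> third number 2
    target.update(snap)         # send snapshot data with its logical addresses
    return snap
```

In this sketch, a write that arrives after `full_backup` returns is logged under number 2 and does not appear in the returned snapshot, which matches the behaviour described for the third number.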
Through steps S101 to S103, the storage device 20 completes one full backup. The incremental backup process is described below.
As can be seen from step S103, after the second number is changed to the third number, all write data requests received by the storage device 20 are assigned the third number 2. These may include write data requests that modify data already in the production volume, as shown in Table 2:
Number    Volume ID    Logical block address    Length (unit: byte)
2         A            00001                    8
2         A            00002                    8
2         A            00004                    8
As shown in Table 2, the storage device 20 receives three write data requests during the time period identified by number 2. The data carried by the first write data request is to be written to volume A at logical block address 00001, with a length of 8 bytes. Because this logical block address and length are the same as those of the data carried by the first write data request in Table 1, the data overwrites the data carried by the first write data request in Table 1. The data carried by the second write data request is to be written to volume A at logical block address 00002, with a length of 8 bytes. Because this logical block address and length are the same as those of the data carried by the second write data request in Table 1, the data overwrites the data carried by the second write data request in Table 1. The data carried by the third write data request is newly written data; it is to be written to volume A at logical block address 00004, with a length of 8 bytes.
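The comparison between Table 2 and Table 1 can be sketched as follows. The rows of `table2` are taken from Table 2; `table1` contains only the two rows that the text above attributes to Table 1, and their number 0 is an assumption. The function name `classify_writes` is hypothetical.

```python
# Rows are (number, volume ID, logical block address, length in bytes).
table1 = [(0, "A", "00001", 8), (0, "A", "00002", 8)]   # assumed subset of Table 1
table2 = [(2, "A", "00001", 8), (2, "A", "00002", 8), (2, "A", "00004", 8)]


def classify_writes(earlier, later):
    """Label each later write as an overwrite of earlier data or as new data."""
    seen = {(vol, lba, length) for _, vol, lba, length in earlier}
    return [
        (lba, "overwrite" if (vol, lba, length) in seen else "new")
        for _, vol, lba, length in later
    ]


result = classify_writes(table1, table2)
```

Here a write counts as an overwrite only when volume, logical block address, and length all match, which is the condition stated in the text; partially overlapping ranges are not handled by this sketch.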
After step S103, the method further includes the following steps.
Step S104: When the backup period arrives, the storage device 20 changes the third number to the fourth number. As described above, after step S103 each write data request is assigned the third number 2. When the current backup period arrives, the storage device 20 changes the third number 2 to the fourth number 3, which is the number that follows the third number 2. Write data requests received afterwards are assigned the fourth number 3.
Step S105: Create a snapshot of the production volume. To distinguish it from the first snapshot in step S103, this snapshot is referred to as the second snapshot. The point in time of each snapshot is a data consistency point for the production volume and the target volume, that is, a point in time at which the data of the production volume and the data of the target volume are consistent.
Step S106: Change the fourth number to the fifth number. As described above, after the snapshot of the production volume is created, the storage device 20 changes the number again, for example from the fourth number 3 to the fifth number 4. Every write data request received after this change and before the next change of the number (when the next backup period arrives) is assigned the fifth number 4. However, write data requests that the storage device 20 receives after the second snapshot is created but before the fourth number 3 is changed to the fifth number 4 are still assigned the fourth number 3, and the data they carry is not recorded in the second snapshot. Therefore, the second snapshot includes the data stored at the logical addresses corresponding to the first number 0, the second number 1, and the third number 2, and part of the data stored at the logical addresses corresponding to the fourth number 3.
Step S107: Determine the logical addresses of the difference data from the second number up to, but not including, the fifth number. The difference data includes the difference data received in the time periods identified by the second number 1, the third number 2, and the fourth number 3, but does not include the difference data received in the time period identified by the fifth number 4. The second number is the number most recently assigned to write data requests before the first snapshot was created, and the fifth number is the number most recently assigned to write data requests after the second snapshot was created. Therefore, the difference data from the second number up to the fifth number is more than the data written after the first snapshot was created and before the second snapshot was created. By way of example, the record of the difference data from the second number up to the fifth number is shown in Table 3.
Figure PCTCN2017096782-appb-000003
Figure PCTCN2017096782-appb-000004
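Step S107 amounts to selecting the logged logical addresses whose number lies in a half-open range. The following Python sketch illustrates this; the function name `difference_addresses` and the contents of `write_log` are hypothetical (the rows for numbers 2 and 3 follow the examples in this description, while the rows for numbers 0, 1, and 4 are invented for illustration and are not taken from Table 3).

```python
def difference_addresses(write_log, lower, upper):
    """Return unique (lba, length) pairs with lower <= number < upper.

    write_log is a list of (number, lba, length) records in arrival order.
    Duplicates are removed while the first-seen order is preserved.
    """
    selected = [
        (lba, length)
        for number, lba, length in write_log
        if lower <= number < upper
    ]
    return list(dict.fromkeys(selected))


write_log = [
    (0, "00001", 8), (0, "00002", 8),                    # assumed, number 0
    (1, "00003", 8),                                     # assumed, number 1
    (2, "00001", 8), (2, "00002", 8), (2, "00004", 8),   # Table 2
    (3, "00008", 4), (3, "00004", 8),                    # example in step S108
    (4, "00009", 8),                                     # assumed, number 4
]

# Second number 1 (inclusive) up to the fifth number 4 (exclusive).
addresses = difference_addresses(write_log, 1, 4)
```

Including the lower bound and excluding the upper bound mirrors the text: data tagged with the second number 1 is part of the difference data, and data tagged with the fifth number 4 is not.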
Step S108: Read backup data from the second snapshot according to the logical addresses of the difference data. The backup data read from the second snapshot may be all of the difference data or a subset of it. Because the second snapshot records only part of the data stored at the logical addresses corresponding to the fourth number 3, the storage device 20 may not be able to obtain all of the difference data recorded in step S107. For example, assume that the storage device 20 receives two write data requests during the time period identified by number 3. The first write data request is received before the second snapshot is created; the data it carries has logical block address 00008 and a length of 4 bytes. The second write data request is received after the second snapshot is created; the data it carries has logical block address 00004 and a length of 8 bytes. The storage device 20 can then obtain from the second snapshot only the data carried by the first write data request and the data that the second snapshot stores at the logical address contained in the second write data request. For example, as shown in Table 3, the data carried by the second write data request is 45600000, with logical block address 00004 and a length of 8 bytes, whereas the data stored in the second snapshot at that logical block address and length is 12300000. Therefore, the storage device 20 still backs up 12300000 to the target volume.
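The example in step S108 can be reproduced with a short Python sketch. The function name `read_backup_data` and the dictionary model of the snapshot are assumptions; the value stored at address 00008 is a hypothetical placeholder, while the values 12300000 and 45600000 come from the example above.

```python
def read_backup_data(snapshot, addresses):
    """Read the snapshot's content at each recorded (lba, length) address."""
    backup = {}
    for lba, length in addresses:
        if lba in snapshot:                    # address may postdate the snapshot
            backup[lba] = snapshot[lba][:length]
    return backup


# The second snapshot was taken before the write of 45600000 arrived.
second_snapshot = {"00008": "ABCD", "00004": "12300000"}   # "ABCD" is a placeholder
production_now = {"00008": "ABCD", "00004": "45600000"}

backup = read_backup_data(second_snapshot, [("00008", 4), ("00004", 8)])
```

The backup carries 12300000 for address 00004 because data is read from the snapshot rather than from the live production volume; the later value 45600000 would be picked up by a subsequent incremental backup, since its write was logged under number 3.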
Step S109: Send the backup data to the target volume. In addition, the storage device 20 may send the logical addresses of the backup data to the target volume, so that the backup data is stored at the same location in the target volume as in the production volume.
After the above operations are complete, the storage device 20 has completed one incremental backup. Because the difference data recorded by the storage device 20 is more than the backup data, the storage device 20 does not need to suspend write data requests during the incremental backup and can still ensure that the difference data between the two snapshots is backed up to the target volume, which keeps the data of the production volume and the target volume consistent.
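The consistency claim can be checked with a small end-to-end simulation. This is a sketch under simplifying assumptions (one volume, whole-block writes, dictionaries standing in for volumes and snapshots); the function name `simulate` and all data values are hypothetical.

```python
def simulate():
    """Run one full and one incremental backup without suspending writes."""
    blocks, log, target = {}, [], {}
    number = 0

    def write(lba, data):
        log.append((number, lba))     # tag the write with the current number
        blocks[lba] = data

    write("00001", "old1")            # number 0

    # Full backup: bump the number, snapshot, bump again, copy the snapshot.
    number = 1
    snap1 = dict(blocks)
    write("00002", "late")            # after snapshot 1, still tagged 1
    number = 2
    target.update(snap1)

    write("00001", "new1")            # number 2, overwrites older data

    # Incremental backup.
    number = 3
    write("00008", "pre3")            # tagged 3, before snapshot 2
    snap2 = dict(blocks)
    write("00004", "post")            # tagged 3, after snapshot 2
    number = 4

    # Difference addresses: numbers 1 (inclusive) up to 4 (exclusive).
    lbas = {lba for n, lba in log if 1 <= n < 4}
    backup = {lba: snap2[lba] for lba in lbas if lba in snap2}
    target.update(backup)
    return target, snap2, lbas


target, snap2, lbas = simulate()
```

After the incremental backup, `target` equals `snap2` even though two writes arrived between a snapshot and the following number change. Address 00004 is recorded as difference data but has no content in the second snapshot, so it is not copied in this round.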
This embodiment further provides a differential data backup device 66. The device 66 is located in a storage system that includes a production volume and a target volume. As shown in FIG. 6, the device 66 includes a reading module 661 and a sending module 662.
The reading module 661 is configured to obtain a record of difference data between two numbers, where a number identifies a time period during which data is written to the production volume. The first number of the two numbers is the number most recently assigned to data received by the production volume before the first snapshot of the production volume is created. The second number of the two numbers is the number most recently assigned to data received by the production volume after the second snapshot of the production volume is created. The record of the difference data includes the logical addresses of the difference data received within the time periods identified by the numbers between the two numbers, and the numbers between the two numbers do not include the second number. The reading module 661 is further configured to read backup data from the second snapshot according to the logical addresses of the difference data, where the backup data is a subset of the difference data. For details of the functions of the reading module 661, refer to the description of steps S101 to S108, which is not repeated here. In an actual implementation, the reading module 661 may be implemented by the processor 212 shown in FIG. 3 invoking the program 214 in the memory 213, in which case the processor 212 is a CPU. Alternatively, the reading module 661 may be implemented by the processor 212 alone, in which case the processor 212 is a field-programmable gate array (FPGA) or another processing chip.
The sending module 662 is configured to send the backup data to the target volume. The sending module 662 may be implemented by the processor 212 shown in FIG. 3 invoking the program 214 in the memory 213, in which case the processor 212 is a CPU. Alternatively, the sending module 662 may be implemented by the processor 212 alone, in which case the processor 212 is a field-programmable gate array (FPGA) or another processing chip.
Optionally, all numbers between the two numbers change according to a set condition, where the set condition includes the arrival of a preset backup period or the creation of a snapshot of the production volume.
Optionally, the second snapshot is the next snapshot after the first snapshot.
Optionally, the sending module 662 is further configured to send the logical addresses of the backup data to the target volume.
Optionally, the numbers between the two numbers do not include the second number.
In the differential data backup device provided in this embodiment, the first number of the two numbers is the number most recently assigned to data received by the production volume before the first snapshot of the production volume is created, and the second number of the two numbers is the number most recently assigned to data received by the production volume after the second snapshot of the production volume is created, so the difference data is more than the backup data. The device therefore does not need to suspend write data requests and can still ensure that the difference data between the two snapshots is backed up to the target volume, which keeps the data of the production volume and the target volume consistent. Because write data requests do not need to be suspended, data processing efficiency is improved.
Those of ordinary skill in the art will understand that aspects of the present invention, or possible implementations of those aspects, may be embodied as a system, a method, or a computer program product. Accordingly, aspects of the present invention, or possible implementations of those aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and the like), or an embodiment combining software and hardware aspects, all of which are referred to herein as a "circuit", "module", or "system". In addition, aspects of the present invention, or possible implementations of those aspects, may take the form of a computer program product, which refers to computer-readable program code stored in a computer-readable medium.
Computer-readable media include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices, or apparatuses, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), and optical discs.
A processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, in the flowcharts.
The computer-readable program code may execute entirely on a user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that, in some alternative implementations, the functions noted in the steps of the flowcharts or the blocks of the block diagrams may occur out of the order noted in the figures. For example, depending on the functions involved, two steps or two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Those of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The foregoing describes only specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or replacement that a person of ordinary skill in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

  1. A differential data backup method, wherein the method is applied to a storage system comprising a processor, a production volume, and a target volume, and the method is executed by the processor and comprises:
    obtaining a record of difference data between two numbers, wherein a number identifies a time period during which data is written to the production volume, the first number of the two numbers is the number most recently assigned to data received by the production volume before a first snapshot of the production volume is created, the second number of the two numbers is the number most recently assigned to data received by the production volume after a second snapshot of the production volume is created, and the record of the difference data comprises the logical addresses of the difference data received within the time periods identified by the numbers between the two numbers;
    reading backup data from the second snapshot according to the logical addresses of the difference data, wherein the backup data is a subset of the difference data; and
    sending the backup data to the target volume.
  2. The method according to claim 1, wherein all numbers between the two numbers change according to a set condition, and the set condition comprises the arrival of a preset backup period or the creation of a snapshot of the production volume.
  3. The method according to any one of claims 1-2, wherein the second snapshot is the next snapshot after the first snapshot.
  4. The method according to any one of claims 1-3, further comprising sending the logical addresses of the backup data to the target volume.
  5. The method according to any one of claims 1-4, wherein the numbers between the two numbers do not include the second number.
  6. A differential data backup device, wherein the device is located in a storage system comprising a production volume and a target volume, and the device comprises:
    a reading module, configured to obtain a record of difference data between two numbers, wherein a number identifies a time period during which data is written to the production volume, the first number of the two numbers is the number most recently assigned to data received by the production volume before a first snapshot of the production volume is created, the second number of the two numbers is the number most recently assigned to data received by the production volume after a second snapshot of the production volume is created, the record of the difference data comprises the logical addresses of the difference data received within the time periods identified by the numbers between the two numbers, and the numbers between the two numbers do not include the second number; and to read backup data from the second snapshot according to the logical addresses of the difference data, wherein the backup data is a subset of the difference data; and
    a sending module, configured to send the backup data to the target volume.
  7. The device according to claim 6, wherein all numbers between the two numbers change according to a set condition, and the set condition comprises the arrival of a preset backup period or the creation of a snapshot of the production volume.
  8. The device according to claim 6 or 7, wherein the second snapshot is the next snapshot after the first snapshot.
  9. The device according to any one of claims 6-8, wherein
    the sending module is further configured to send the logical addresses of the backup data to the target volume.
  10. The device according to any one of claims 6-8, wherein the numbers between the two numbers do not include the second number.
PCT/CN2017/096782 2016-12-29 2017-08-10 Differential data backup method and differential data backup device WO2018120844A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611249975.3A CN106776147B (en) 2016-12-29 2016-12-29 Differential data backup method and differential data backup device
CN201611249975.3 2016-12-29

Publications (1)

Publication Number Publication Date
WO2018120844A1 true WO2018120844A1 (en) 2018-07-05

Family

ID=58927958

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/096782 WO2018120844A1 (en) 2016-12-29 2017-08-10 Differential data backup method and differential data backup device

Country Status (2)

Country Link
CN (1) CN106776147B (en)
WO (1) WO2018120844A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442476A (en) * 2019-06-25 2019-11-12 平安科技(深圳)有限公司 Data snapshot method, device, equipment and storage medium
CN111026755A (en) * 2019-12-06 2020-04-17 中国银行股份有限公司 Transaction serial number obtaining method and device based on full-quantity serial number generator
CN113094207A (en) * 2019-12-23 2021-07-09 华为技术有限公司 Data backup method and system
CN114697351A (en) * 2020-12-30 2022-07-01 华为技术有限公司 Storage management method, device and medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776147B (en) * 2016-12-29 2020-10-09 华为技术有限公司 Differential data backup method and differential data backup device
CN108733513A (en) * 2018-05-07 2018-11-02 杭州宏杉科技股份有限公司 A kind of data-updating method and device
CN109614055B (en) * 2018-12-21 2022-11-04 杭州宏杉科技股份有限公司 Snapshot creating method and device, electronic equipment and machine-readable storage medium
CN116010164A (en) * 2018-12-29 2023-04-25 华为技术有限公司 Method, device and system for backing up data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103649901A (en) * 2013-07-26 2014-03-19 华为技术有限公司 Data transmission method, data receiving method and sotring equipment
US8832027B1 (en) * 2014-03-28 2014-09-09 Storagecraft Technology Corporation Change tracking between snapshots of a source storage
CN105607968A (en) * 2015-12-17 2016-05-25 浙江大华技术股份有限公司 Incremental backup method and equipment
US20160283148A1 (en) * 2015-03-24 2016-09-29 Nec Corporation Backup control device, backup control method, and recording medium
CN106776147A (en) * 2016-12-29 2017-05-31 华为技术有限公司 A kind of variance data backup method and variance data back-up device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201227268A (en) * 2010-12-20 2012-07-01 Chunghwa Telecom Co Ltd Data backup system and data backup and retrival method
WO2013074914A1 (en) * 2011-11-18 2013-05-23 Appassure Software, Inc. Method of and system for merging, storing and retrieving incremental backup data
US9218255B2 (en) * 2012-08-27 2015-12-22 International Business Machines Corporation Multi-volume instant virtual copy freeze
CN104572340A (en) * 2013-10-18 2015-04-29 宇宙互联有限公司 Incremental backup system and method
CN103699459A (en) * 2013-12-31 2014-04-02 汉柏科技有限公司 Method and system for incremental backup of virtual machine data based on Qcow2 snapshots
CN104536846A (en) * 2014-12-17 2015-04-22 杭州华为数字技术有限公司 Data backing up method and device
CN105138426B (en) * 2015-08-20 2018-04-13 浪潮(北京)电子信息产业有限公司 A kind of business data method for protecting consistency and device based on snapshot
CN105162869B (en) * 2015-09-18 2019-01-18 久盈世纪(北京)科技有限公司 A kind of method and apparatus for backup data management
CN105389230B (en) * 2015-10-21 2018-06-22 上海爱数信息技术股份有限公司 A kind of continuous data protection system and method for combination snapping technique

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442476A (en) * 2019-06-25 2019-11-12 平安科技(深圳)有限公司 Data snapshot method, device, equipment and storage medium
CN111026755A (en) * 2019-12-06 2020-04-17 中国银行股份有限公司 Transaction serial number obtaining method and device based on full-quantity serial number generator
CN111026755B (en) * 2019-12-06 2023-05-02 中国银行股份有限公司 Transaction sequence number acquisition method and device based on full sequence number generator
CN113094207A (en) * 2019-12-23 2021-07-09 华为技术有限公司 Data backup method and system
CN113094207B (en) * 2019-12-23 2023-11-03 华为技术有限公司 Data backup method and system
CN114697351A (en) * 2020-12-30 2022-07-01 华为技术有限公司 Storage management method, device and medium
CN114697351B (en) * 2020-12-30 2023-03-10 华为技术有限公司 Storage management method, device and medium

Also Published As

Publication number Publication date
CN106776147A (en) 2017-05-31
CN106776147B (en) 2020-10-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17887521

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17887521

Country of ref document: EP

Kind code of ref document: A1