WO2017113059A1 - Discrepant data backup method, storage system and discrepant data backup device - Google Patents

Discrepant data backup method, storage system and discrepant data backup device

Info

Publication number
WO2017113059A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
bitmap
data block
bit
byte
Prior art date
Application number
PCT/CN2015/099213
Other languages
French (fr)
Chinese (zh)
Inventor
邬肖元
黄恒
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2015/099213 (WO2017113059A1)
Priority to CN201580003189.2A (CN107135662B)
Publication of WO2017113059A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication

Definitions

  • Embodiments of the present invention relate to the field of storage technologies, and in particular, to a differential data backup method, a storage system, and a difference bitmap backup device.
  • Data is a key asset of modern enterprises. Providing fast and convenient protection of data is a basic task in daily work. Data backup is a guarantee of data protection. In order to reduce backup time, reducing the amount of data backed up is a good choice. An incremental backup is a solution that backs up only the data that has changed since the last backup, and does not back up the data that has already been backed up. Normally, the changed data block can be obtained by comparing the fingerprint of the data block.
  • The fingerprint of a data block is the hash value of the data block.
  • When the backup software performs its first backup, it calculates the hash value of each data block and saves it. At the next backup, the hash value of each data block read is calculated and compared with the saved hash value of the data block at the corresponding position. If they are the same, the data block has not changed and no backup is required. If they differ, the data block has changed since the last backup and needs to be backed up. In this way, the changed data blocks can be found.
  • However, every backup must read every data block and compare them one by one. Even unchanged data blocks must be read, which increases the computational burden of the backup software and lowers backup efficiency.
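  • The fingerprint comparison described above can be sketched as follows. This is a minimal illustration, not the backup software's actual implementation: the 16 KB block size, the choice of SHA-256 as the hash, and the function names `fingerprints` and `changed_blocks` are all assumptions. It shows why the approach is costly: every block is read and hashed on every backup, whether or not it changed.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # assumed block size in bytes


def fingerprints(volume: bytes) -> list[bytes]:
    """Return one SHA-256 digest per fixed-size block of the volume."""
    return [
        hashlib.sha256(volume[off:off + BLOCK_SIZE]).digest()
        for off in range(0, len(volume), BLOCK_SIZE)
    ]


def changed_blocks(saved: list[bytes], volume: bytes) -> list[int]:
    """Hash every block again and return the numbers of blocks whose hash differs."""
    current = fingerprints(volume)  # reads the whole volume, changed or not
    return [
        i for i, digest in enumerate(current)
        if i >= len(saved) or digest != saved[i]
    ]
```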
  • Embodiments of the invention provide a differential data backup method, a storage system, and a difference data backup device that modify a bitmap according to the log information in a backup period, so that the difference data in a backup period can be found quickly and backup efficiency is improved.
  • A first aspect of the embodiments provides a differential data backup method.
  • The method is applied to a storage system including a processor, a memory, a production volume, and a target volume.
  • The production volume includes a plurality of data blocks of the same size, and each data block is allocated a continuous logical address.
  • A bitmap is stored in the memory. The bitmap includes a plurality of bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one data block in the production volume.
  • The processor acquires a log record within a backup period. The log record includes the start address of a data block to be written to the production volume and the length of the data block.
  • The processor calculates a target value, reads the source value saved in the target byte, performs a bitwise OR operation on the target value and the source value, and writes the result to the target byte to modify the bitmap.
  • The processor obtains difference data according to the modified bitmap and backs up the difference data to the target volume.
  • The first aspect provides a specific manner of modifying a bitmap according to the log information in a backup period, so that the difference data in a backup period can be found quickly and backed up to the target volume, which improves backup efficiency.
  • In one implementation, the log record is generated by combining at least two write data requests whose data blocks have consecutive or repeated logical addresses. If the bitmap were modified separately for each of the two write data requests, it would have to be modified at least twice. After the requests are merged into one log record, the bitmap needs to be modified only once, which improves the efficiency of modifying the bitmap.
  • In another implementation, the log record is generated by combining at least one write data request and at least one delete data request, where the logical address of the data block in the write data request partially overlaps the logical address of the data block in the delete data request.
  • The effect of the second implementation is similar to that of the first. If the bitmap were modified separately for the write data request and the delete data request, it would have to be modified at least twice. After they are merged into one log record, part of the bitmap modification cancels out and the bitmap is modified only once, which improves the efficiency of modifying the bitmap.
  • The processor clears each bit of the bitmap after backing up the difference data to the target volume.
  • The bitmap can then continue to record modified data blocks in the next backup cycle.
  • A second aspect of the embodiments provides another differential data backup method.
  • The method is applied to a storage system including a processor, a memory, a production volume, and a target volume.
  • The production volume includes a plurality of data blocks of the same size, and each data block is allocated a continuous logical address.
  • A bitmap is stored in the memory. The bitmap includes a plurality of bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one data block in the production volume.
  • The processor acquires a log record within a backup period. The log record includes the start address of a data block to be written to the production volume and the length of the data block.
  • The processor searches a preset two-dimensional mapping table for a target value, using n (the number of bits to be modified) as the abscissa and p (the starting bit) as the ordinate. Each value a_np in the two-dimensional mapping table is calculated by a formula.
  • The processor then reads the source value stored in the target byte, performs a bitwise OR operation on the target value and the source value, and writes the result to the target byte to modify the bitmap.
  • Upon determining that the backup period has arrived, the processor obtains difference data according to the modified bitmap and backs up the difference data to the target volume.
  • Compared with the differential data backup method provided by the first aspect, the method provided by the second aspect uses a look-up table to obtain the target value.
  • Each value in the two-dimensional mapping table is calculated in advance by the formula provided in the first aspect and is stored in the storage system in the form of a table.
  • When the processor needs the target value, it can look it up directly in the preset two-dimensional mapping table using n as the abscissa and p as the ordinate, which reduces the amount of calculation performed by the processor.
  • The differential data backup method provided by the second aspect is therefore more efficient in modifying the bitmap.
  • A third aspect of the embodiments of the present invention provides a storage system for performing the differential data backup method provided by the first aspect.
  • A fourth aspect of the embodiments of the present invention provides another storage system for performing the differential data backup method provided by the second aspect.
  • A fifth aspect of the embodiments of the present invention provides a differential data backup apparatus for performing the differential data backup method provided by the first aspect.
  • A sixth aspect of the embodiments of the present invention provides a differential data backup apparatus for performing the differential data backup method provided by the second aspect.
  • The storage system or the differential data backup apparatus provided in the third to sixth aspects can modify the bitmap according to the log information in one backup period, so that the difference data in one backup period can be found quickly and backed up to the target volume, which improves backup efficiency.
  • An embodiment of the invention further provides a computer program product comprising a computer readable storage medium storing program code, the program code comprising instructions executable by the storage system of the third aspect for performing at least one method of the first aspect.
  • An embodiment of the invention further provides a computer program product comprising a computer readable storage medium storing program code, the program code comprising instructions executable by the storage system of the fourth aspect for performing at least one method of the second aspect.
  • The above computer program products can modify the bitmap according to the log information in a backup period, so that the difference data in a backup period can be found quickly and backed up to the target volume, which improves backup efficiency.
  • FIG. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 3 is a structural diagram of a storage system according to an embodiment of the present invention.
  • FIG. 4A is a schematic diagram of a bitmap provided by an embodiment of the present invention.
  • FIG. 4B is a schematic diagram of another bitmap provided by an embodiment of the present invention.
  • FIG. 4C is a schematic diagram of still another bitmap provided by an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a differential data backup method according to an embodiment of the present invention.
  • FIG. 6 is a structural diagram of a differential data backup apparatus according to an embodiment of the present invention.
  • Embodiments of the invention provide a differential data backup method and a storage system that can quickly modify a bitmap to find the difference data in a backup period.
  • FIG. 1 depicts a composition diagram of a storage system 10 according to an embodiment of the present invention.
  • The storage system 10 shown in FIG. 1 includes one or more hosts 40 and a storage device 20.
  • The host 40 can be a computing device such as a server or a desktop computer.
  • The storage device 20 may be a block-based storage device, such as a Storage Area Network (SAN) device, or a storage device that includes a file system, such as a Network Attached Storage (NAS) device.
  • The host 40 and the storage device 20, as well as storage devices with one another, can communicate through the Network File System (NFS) or Common Internet File System (CIFS) protocol, or through the Fibre Channel (FC) protocol.
  • the storage device 20 includes at least one controller 21 and a plurality of disks 22.
  • Controller 21 can include any computing device such as a server, desktop computer, or the like. Inside the controller, an operating system and other applications are installed.
  • the controller 21 can send an input/output (I/O) request to the disk 22. For example, a write data request is sent to the disk 22 such that the disk 22 writes the data to be written carried in the write data request into its storage medium.
  • The disk 22 can be any of several types of disks, such as a Solid State Drive (SSD), a Serial Attached SCSI (SAS) or Fibre Channel (FC) Hard Disk Drive (HDD), or a Serial Advanced Technology Attachment (SATA) or Near Line (NL) SAS HDD, where SCSI stands for Small Computer System Interface. No limitation is imposed here.
  • a logical unit (LU) is a logical storage space distributed over one or more disks 22, such as production volume 23 and target volume 24 shown in FIG.
  • the host 40 can send a write data request to the storage system 10, the write data request carrying data to be written to the storage system 10, the data can be block data or a file.
  • the controller 21 receives the data and then writes it into the logical unit of the storage device 20.
  • data needs to be backed up. For example, the data in the production volume 23 is backed up to the target volume 24. When the data in the production volume 23 is damaged, the data stored in the target volume 24 can be used for recovery.
  • FIG. 2 depicts a composition diagram of another storage system 10 that includes one or more hosts 40, a storage device 20, and a storage device 30.
  • the storage device 30 is similar to the storage device 20 and includes at least one controller 31 and a plurality of disks 32.
  • the structure and function of the controller 31 are similar to those of the controller 21 of FIG. 1.
  • the structure and function of the magnetic disk 32 are similar to those of the magnetic disk 22 of FIG. 1, and will not be described herein.
  • The difference from the application scenario described in FIG. 1 is that the backup in FIG. 1 is a backup within one storage device, while the backup in FIG. 2 is a backup between two storage devices.
  • storage device 20 needs to back up data on its production volume to target volume 33 of storage device 30.
  • When backing up data in one LU (referred to as the production volume) to another LU (referred to as the target volume), the controller 21 may use either full backup or incremental backup.
  • A full backup backs up all the data on the production volume.
  • An incremental backup backs up only the data that has been modified since the last full or incremental backup, whichever is later. Because it is limited to modified data (also known as difference data), this kind of backup is fast and saves storage space. How to obtain the difference data therefore becomes a problem that must be solved.
  • In this embodiment, the difference data is obtained by recording a bitmap, as described in detail below.
  • FIG. 3 depicts a composition diagram of the controller 21 provided by the embodiment of the present invention.
  • the controller 21 includes at least an interface 211, a processor 212, and a memory 213.
  • the interface 211 is configured to communicate with the host 40 or the disk 22 or the storage device 30.
  • The processor 212 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
  • the processor 212 can be used to process input/output (I/O) requests to the disk 22, back up data in the production volume to the target volume, and the like.
  • the controller 21 can implement functions such as IO operation, data backup, and the like.
  • the processor 212 is configured to execute the program 214, and specifically, the related steps in the following method embodiments may be performed.
  • the memory 213 is configured to store the program 214.
  • The memory 213 may include a cache memory, a high speed RAM memory, and a non-volatile memory such as at least one disk memory. It can be understood that the memory 213 can be any non-transitory machine readable medium capable of storing program code, such as a random access memory (RAM), a magnetic disk, a hard disk, a solid state disk (SSD), or a non-volatile memory.
  • the memory 213 can also be used to cache data received from the host 40 or data read from the disk 22.
  • Program 214 can include an operating system, a file system, and other software modules.
  • The bitmap may be generated or modified directly according to IO requests, because an IO request also carries the start address and the length of the data block.
  • In that case, however, the bitmap must be processed for every IO request. Even when multiple IO requests operate on the same data block, so that only the last IO request carries information useful for modifying the bitmap, the bitmap must still be modified according to the other IO requests, which causes unnecessary overhead in generating the bitmap.
  • When the bitmap is generated or modified according to log records, the log records can be sorted out in advance. For example, if multiple log records are write operations on the same data block, they can be merged, and the part of the bitmap corresponding to that data block is modified according to the merged result. Alternatively, if multiple log records are write operations on several data blocks with consecutive addresses, they can also be merged, and the parts of the bitmap corresponding to those data blocks are modified according to the merged result.
  • the storage space in the disk 22 can be logically divided into a plurality of data blocks, each of which corresponds to a continuous logical address.
  • The size of each data block may be 512 bytes to 64 KB, or larger. This embodiment takes a data block size of 16 KB as an example, but imposes no limitation on the size of the data block.
  • The data blocks can be numbered sequentially starting from 0. For example, the logical address of data block 0 is 0KB-15KB, the logical address of data block 1 is 16KB-31KB, the logical address of data block 2 is 32KB-47KB, and so on.
  • Each data block has a logical address (as shown in Table 1). The sequential numbering of the data blocks is only an example of this embodiment. In some cases only the logical address range of each data block needs to be recorded, and the data blocks do not need to be numbered sequentially.
  • the bitmap generated by this embodiment will be described below.
  • the bitmap may be stored in the memory 213.
  • a copy may be saved in the disk 22 as a backup.
  • As shown in FIG. 4A, the bitmap includes a plurality of bytes.
  • The controller 21 numbers the bytes internally, for example byte 0, byte 1, byte 2, and so on.
  • Each byte contains 8 bits, numbered 0, 1, 2, 3, 4, 5, 6, 7.
  • Bit 0 is the lowest bit and bit 7 is the highest bit.
  • Each bit in the bitmap corresponds to one data block.
  • Bit 0 of byte 0 corresponds to the first data block (logical address 0KB-15KB); bit 1 of byte 0 corresponds to the second data block (logical address 16KB-31KB); bit 2 of byte 0 corresponds to the third data block (logical address 32KB-47KB); and so on up to bit 0 of byte 1, which corresponds to the ninth data block (logical address 128KB-143KB).
  • The data blocks corresponding to byte 0 cover the logical addresses 0KB-127KB; the data blocks corresponding to byte 1 cover the logical addresses 128KB-255KB; and so on.
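  • The correspondence between logical addresses, bits, and bytes can be sketched as two small functions. This is an illustration under the text's example values (16 KB blocks, 8 bits per byte, addresses expressed in KB); the names `locate_bit` and `byte_address_range` are assumptions, not part of the described system.

```python
BLOCK_SIZE_KB = 16   # block size used in the text's example
BITS_PER_BYTE = 8


def locate_bit(address_kb: int) -> tuple[int, int]:
    """Map a logical address (in KB) to (byte index, bit index) in the bitmap."""
    block = address_kb // BLOCK_SIZE_KB          # sequential block number
    return block // BITS_PER_BYTE, block % BITS_PER_BYTE


def byte_address_range(byte_index: int) -> tuple[int, int]:
    """First and last logical address (KB) covered by one bitmap byte,
    written in the text's inclusive notation (for example 0KB-127KB)."""
    span = BLOCK_SIZE_KB * BITS_PER_BYTE
    return byte_index * span, (byte_index + 1) * span - 1
```

  • For instance, `locate_bit(128)` yields byte 1, bit 0, matching the ninth data block in the description.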
  • Each bit in the bitmap can carry one of two marks. The mark "1" means that the data block corresponding to the bit was modified during a certain period of time; the mark "0" means that the data block corresponding to the bit was not modified during that period.
  • The bitmap, the correspondence between each bit in the bitmap and the logical address of its data block, and the correspondence between each byte in the bitmap and the logical addresses of its data blocks may all be stored in the memory 213.
  • Because the bitmap serves data backup, it reflects the data blocks that change during one backup cycle. For example, in the application scenario shown in FIG. 1 or FIG. 2, the production volume periodically backs up changed data blocks to the target volume. At the initial moment (for example, time T0), the production volume backs up the data it stores to the target volume and a bitmap is created. When the bitmap has just been created, all of its bits are set to "0". Suppose T1 is the moment when the next backup task is triggered. The bitmap then reflects which data blocks on the production volume were modified between T0 and T1.
  • For example, if bit 1 of byte0 and bit 0 of byte1 in the bitmap are set to "1", then the data block corresponding to bit 1 of byte0 (logical address 16 KB - 31 KB) and the data block corresponding to bit 0 of byte1 (logical address 128 KB - 143 KB) were modified between time T0 and time T1. That is, the data block with logical address 16 KB - 31 KB and the data block with logical address 128 KB - 143 KB are the difference data to be backed up to the target volume this time.
  • After the backup period corresponding to T0-T1 is completed, all bits of the bitmap are reset to "0".
  • From then on, the bitmap reflects which data blocks of the production volume are modified between T1 and the next backup time T2. The difference data is then obtained again and backed up to the target volume, and the cycle repeats.
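  • One backup cycle as described above (collect the bits marked "1", copy the corresponding blocks, reset the bitmap) might look like the sketch below. The dictionaries standing in for the production and target volumes and the names `modified_blocks` and `run_backup` are assumptions made for illustration only.

```python
def modified_blocks(bitmap: bytearray) -> list[int]:
    """Return the block numbers whose bit is marked 1 (bit 0 is the lowest bit)."""
    return [
        byte_i * 8 + bit
        for byte_i, value in enumerate(bitmap)
        for bit in range(8)
        if (value >> bit) & 1
    ]


def run_backup(bitmap: bytearray,
               production: dict[int, bytes],
               target: dict[int, bytes]) -> list[int]:
    """Copy the modified blocks to the target, then reset every bit to 0."""
    blocks = modified_blocks(bitmap)
    for block in blocks:
        target[block] = production[block]
    for i in range(len(bitmap)):
        bitmap[i] = 0          # ready to record changes for the next cycle
    return blocks
```

  • With bit 1 of byte0 and bit 0 of byte1 set, the sketch copies blocks 1 and 8, which are the blocks at 16 KB - 31 KB and 128 KB - 143 KB in the example.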
  • the update of the bitmap for a backup period is done by the information in the log record.
  • the following describes how to modify the bitmap according to the information in the log record in conjunction with FIG. 5.
  • FIG. 5 is a flowchart of the method of this embodiment. The method can be applied to the storage device 20 shown in FIG. 1, or to the storage device 20 and the storage device 30 shown in FIG. 2, and is executed by the processor 212 shown in FIG. 3. It includes the following steps:
  • Step 601: Obtain a log record in a backup period.
  • The log record includes the start address of a data block to be written to the production volume and the length of the data block.
  • The production volume refers to the LU located in the storage device 20, as shown in FIG. 1 or FIG. 2.
  • For example, the information in the first log record is: a data block of length 30 KB is written to the storage space whose start address is 17 KB (where the start address refers to the start address in the production volume).
  • Which bits in the bitmap need to be set to 1 is then determined according to the information in the log record.
  • The log record may be generated by combining at least two write data requests whose data blocks have consecutive logical addresses.
  • For example, the two write data requests are: write a data block of length 15 KB to the storage space whose start address is 17 KB, and write a data block of length 15 KB to the storage space whose start address is 32 KB. The start addresses and lengths of the two requests show that their logical addresses are contiguous, so they can be merged into the first log record.
  • The log record may also be generated by combining at least two write data requests whose data blocks are repeated or partially repeated. For example, two write data requests each write a data block of length 30 KB to the storage space whose start address is 17 KB, and only the contents of the data blocks differ. The second write data request can then overwrite the first, that is, the merged log record records the same content as the second write data request.
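  • Merging write data requests with consecutive or repeated addresses can be sketched as a standard interval merge. This is an illustrative sketch, not the merging logic of the described system: requests are modelled as `(start_kb, length_kb)` tuples and the name `merge_writes` is an assumption.

```python
def merge_writes(requests: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge (start_kb, length_kb) write requests whose ranges touch or overlap."""
    merged: list[list[int]] = []             # each entry is [start, end)
    for start, length in sorted(requests):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Contiguous or repeated range: extend the previous record.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]
```

  • Applied to the two 15 KB writes at 17 KB and 32 KB, the sketch returns the single record (17, 30), which is the first log record used in the example.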
  • The log record can also be generated by combining at least one write data request and at least one delete data request.
  • For example, the write data request writes a data block of length 15 KB to the storage space whose start address is 17 KB.
  • The delete data request deletes the data block whose start address is 17 KB and whose length is 15 KB.
  • The log record generated after the merge then states that the data block whose start address is 17 KB and whose length is 15 KB is not modified. Accordingly, the bitmap does not need to be modified.
  • In other cases, the logical address of the data block in the write data request only partially overlaps the logical address of the data block in the delete data request.
  • For example, the write data request writes a data block of length 15 KB to the storage space whose start address is 17 KB.
  • The delete data request deletes the data block whose start address is 19 KB and whose length is 13 KB.
  • The log record generated after the merge then writes a data block of length 2 KB to the storage space whose start address is 17 KB.
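  • The offsetting of a write by a later delete can be sketched as an interval subtraction. The sketch assumes the delete request follows the write request and models both as `(start_kb, length_kb)` tuples; the name `subtract_delete` is hypothetical.

```python
def subtract_delete(write: tuple[int, int],
                    delete: tuple[int, int]) -> list[tuple[int, int]]:
    """Return the parts of a write (start_kb, length_kb) not covered by a later delete."""
    w_start, w_end = write[0], write[0] + write[1]
    d_start, d_end = delete[0], delete[0] + delete[1]
    remaining = []
    left_end = min(w_end, d_start)           # written range before the deleted range
    if w_start < left_end:
        remaining.append((w_start, left_end - w_start))
    right_start = max(w_start, d_end)        # written range after the deleted range
    if right_start < w_end:
        remaining.append((right_start, w_end - right_start))
    return remaining
```

  • An empty result corresponds to the case where the write is cancelled entirely and the bitmap does not need to be modified.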
  • Step 602: Determine the target byte of the data block in the bitmap according to the start address, the length, and the logical addresses corresponding to each byte.
  • Step 603: Determine the number n of bits to be modified in the bitmap according to the start address, the length, and the size of the data block.
  • In the example, the data written (start address 17 KB, length 30 KB) spans two data blocks. Because each bit in the bitmap corresponds to one data block, two bits in byte0 need to be set to "1", that is, n = 2.
  • Step 604: Determine the starting bit p among the bits to be modified according to the start address, the size of the data block, and the number of bits in each byte (8).
  • In the example, the start address is 17 KB, the size of the data block corresponding to each bit is 16 KB, and each byte includes 8 bits, so the starting bit in byte 0 that needs to be set to "1" is bit 1, that is, p = 1.
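  • Steps 602 to 604 can be sketched as integer arithmetic on the start address and length. The exact expressions are not given in this text, so the ones below are assumptions: in particular, the end address `start + length` is treated as inclusive, which may mark one block more than a half-open range would, and the sketch assumes the range stays within one bitmap byte. The name `locate` is hypothetical.

```python
BLOCK_KB = 16        # data block size in the text's example
BITS_PER_BYTE = 8


def locate(start_kb: int, length_kb: int) -> tuple[int, int, int]:
    """Return (target byte index, n, p) for one log record.

    Assumption: the end address start + length is treated as inclusive.
    """
    first_block = start_kb // BLOCK_KB
    last_block = (start_kb + length_kb) // BLOCK_KB
    target_byte = first_block // BITS_PER_BYTE      # step 602
    n = last_block - first_block + 1                # step 603
    p = first_block % BITS_PER_BYTE                 # step 604
    return target_byte, n, p
```

  • For the first log record (start 17 KB, length 30 KB) the sketch returns byte 0, n = 2, p = 1, in line with the example.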
  • Step 605: Calculate the target value from n, p, and the formula.
  • Here n denotes the number of bits that need to be set to "1", and p denotes the number of the starting bit that needs to be set to "1".
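  • The formula itself does not appear in this text. One expression consistent with the worked values given in this description (n = 2, p = 1 yields 6, and the value 62 corresponds to n = 5, p = 1) is (2^n - 1) × 2^p, that is, n consecutive 1-bits shifted left by p positions. The sketch below uses that expression as an assumption; the name `target_value` is hypothetical.

```python
def target_value(n: int, p: int) -> int:
    """n consecutive 1-bits starting at bit p (bit 0 is the lowest bit).

    Assumed expression: (2**n - 1) * 2**p, chosen because it matches the
    worked values in the text; it is not quoted from the source.
    """
    return ((1 << n) - 1) << p
```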
  • A two-dimensional mapping table may also be set in the memory 213 in advance, as shown in Table 2.
  • The value of each item in the table is calculated by the above formula. After each value is obtained by the formula, it is converted into a hexadecimal value and written into Table 2.
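  • A precomputed table of this kind might be built as sketched below. Table 2 is not reproduced in this text, so the layout (rows indexed by n, columns by p), the zero entries for combinations that do not fit in one byte, and the mask expression are all assumptions; `build_table` and `lookup` are hypothetical names.

```python
def build_table() -> list[list[int]]:
    """table[n][p] holds n consecutive 1-bits starting at bit p.

    Only combinations that fit in a single byte (n + p <= 8) are filled;
    the remaining entries stay 0 in this sketch.
    """
    table = [[0] * 8 for _ in range(9)]      # n = 0..8, p = 0..7
    for n in range(1, 9):
        for p in range(0, 9 - n):
            table[n][p] = ((1 << n) - 1) << p
    return table


TABLE = build_table()   # computed once, then only read


def lookup(n: int, p: int) -> str:
    """Return the table entry as a hexadecimal string, as the text describes."""
    return f"0x{TABLE[n][p]:02X}"
```

  • The trade-off is a small, fixed amount of memory in exchange for replacing the per-record calculation with a single indexed read.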
  • Step 606: Read the source value saved in the target byte.
  • The source value is the value saved in the target byte before the target byte is modified. As can be seen from FIG. 4A, the source value in byte0 is 00000000B.
  • Step 607: Perform a bitwise OR operation on the target value and the source value, and write the result to the target byte.
  • The bitwise OR operation performs an OR on each pair of corresponding bits of the two values. Specifically, the binary data (00000110B) corresponding to the value obtained in the above step (6, or 0x06) is ORed with the source value 00000000B to obtain 00000110B, which is written to the corresponding byte (byte0). In this embodiment, bit 0 is the low bit and the following bits are successively higher, so after 00000110B is written to byte0 the result is as shown in FIG. 4B.
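  • Steps 606 and 607 amount to a read-modify-write on one byte, sketched below with a `bytearray` standing in for the bitmap (the name `apply_mask` is an assumption). Using OR rather than a plain overwrite keeps any bits that earlier log records already set in the same byte.

```python
def apply_mask(bitmap: bytearray, target_byte: int, target_value: int) -> None:
    """Read the source value, OR it with the target value, write the result back."""
    source_value = bitmap[target_byte]                  # step 606
    bitmap[target_byte] = source_value | target_value   # step 607


bitmap = bytearray(2)                   # all bits start at 0
apply_mask(bitmap, 0, 0b00000110)       # first log record: target value 6
apply_mask(bitmap, 0, 0b00111110)       # a later record: target value 62
```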
  • Step 608 may then be performed directly: obtain difference data according to the modified bitmap.
  • When the backup period arrives, it can be seen from the difference bitmap that bit 1 and bit 2 of byte 0 have changed.
  • The logical address of the data block corresponding to bit 1 of byte 0 is 16 KB - 31 KB, and the logical address of the data block corresponding to bit 2 of byte 0 is 32 KB - 47 KB. Therefore, the data (also called difference data) can be read from the disk based on the logical addresses of the two data blocks.
  • Step 609: Back up the difference data to the target volume.
  • The target volume refers to an LU located in the storage device 20 or the storage device 30, as shown in FIG. 1 or FIG. 2.
  • In this embodiment, the difference data is obtained by recording the difference bitmap.
  • When the backup period arrives, only the changed data blocks need to be obtained according to the difference bitmap, which reduces the computational burden and improves backup efficiency.
  • Alternatively, a second log record may be read before step 608. For example, the information in the second log record is: a data block of length 60 KB is written to the storage space whose start address is 20 KB.
  • The value 62 is obtained by calculation.
  • The binary data corresponding to 62 is 00111110B.
  • The binary data 00111110B is ORed with the binary data 00000110B already in byte0 to obtain 00111110B, which is written to byte0.
  • The result after writing is as shown in FIG. 4C.
  • In this way, the bitmap may be modified according to each log record from time T0 to time T1.
  • When the backup period arrives, the data blocks modified between T0 and T1 (that is, the difference data) are obtained according to the bitmap and backed up to the target volume.
  • the embodiment of the invention further provides a difference data backup device 700, where the device 700 is located In a storage system, the storage system includes a memory, a production volume, and a target volume; wherein the production volume includes a plurality of data blocks having the same size, each data block is allocated a continuous logical address; the memory is saved There is a bitmap; the bitmap includes a plurality of bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one of the production volumes; the apparatus 700 includes:
  • the log processing module 701 is configured to obtain a log record in a backup period, where the log record includes a start address of a data block to be written into the production volume and a length of the data block.
  • the backup module 703 is configured to: when the backup period arrives, obtain difference data according to the modified bitmap; and back up the difference data to the target volume.
  • the log record may be generated by combining at least two write data requests, the logical addresses of the data blocks included in the at least two write data requests being consecutive or repeated.
  • the log record may also be generated by combining at least one write data request and at least one delete data request, the logical address of the data block included in the write data request being duplicated with the logical address portion of the data block included in the delete data request .
  • bitmap generation module 702 is further configured to clear each bit of the bitmap after the difference data is backed up into the target volume.
  • the difference data is obtained by recording the difference bitmap.
  • when the backup period arrives, only the changed data blocks need to be obtained according to the difference bitmap, which reduces the computational burden and improves backup efficiency.
  • the foregoing storage medium includes: a USB flash drive, a mobile hard disk, a magnetic disk, an optical disk, a random access memory (RAM), a solid state disk (SSD), or a nonvolatile.

Abstract

A discrepant data backup method, storage system and discrepant data backup device, wherein the discrepant data backup method comprises: modifying a bit map according to a log record within a backup cycle; acquiring, on completion of the backup cycle, discrepant data according to the modified discrepant data bit map, and backing up the discrepant data. The present invention enables the rapid location of discrepant data as well as efficient data backup.

Description

Differential data backup method, storage system and differential data backup device
Technical field
Embodiments of the present invention relate to the field of storage technologies, and in particular to a differential data backup method, a storage system, and a difference bitmap backup device.
Background
Data is a key asset of a modern enterprise, and protecting it quickly and conveniently is a basic everyday task. Data backup is one safeguard for data protection. To shorten backup time, reducing the amount of data backed up is a good option. Incremental backup is such a scheme: it backs up only the data that has changed since the last backup and does not back up data that has already been backed up. Typically, changed data blocks are found by comparing data block fingerprints, where the fingerprint of a data block is the hash value of the data block. During the first full backup, the backup software calculates and saves the hash value of each data block. At the next backup, it calculates the hash value of each data block it reads and compares that value with the saved hash value of the data block at the corresponding position. If the two are the same, the data block has not changed and does not need to be backed up. If they differ, the data block has changed since the last backup and needs to be backed up. Changed data blocks can be found in this way. However, every backup must read and compare every data block in turn, including data blocks that have not changed, which increases the computational burden on the backup software and lowers backup efficiency.
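The fingerprint comparison described above can be illustrated with a minimal sketch. The choice of SHA-256 and the function names `fingerprints` and `changed_by_hash` are assumptions made for illustration; the text does not name a hash function. The sketch shows why every block has to be read and hashed on each backup, even when nothing has changed.

```python
import hashlib


def fingerprints(blocks):
    """Hash every block once, as done at the first full backup."""
    return [hashlib.sha256(block).digest() for block in blocks]


def changed_by_hash(blocks, saved):
    """Re-hash every block and return the indexes whose hash differs."""
    return [
        index
        for index, block in enumerate(blocks)
        if hashlib.sha256(block).digest() != saved[index]
    ]


old_blocks = [b"aaaa", b"bbbb", b"cccc"]
saved = fingerprints(old_blocks)          # kept from the full backup
new_blocks = [b"aaaa", b"BBBB", b"cccc"]  # only block 1 was rewritten
print(changed_by_hash(new_blocks, saved))  # [1]
```

Note that `changed_by_hash` touches all three blocks to find one change, which is the cost the bitmap approach is meant to avoid.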
Summary of the invention
Embodiments of the present invention provide a differential data backup method, a storage system and a differential data backup device that can modify a bitmap according to the log information within a backup period, so that the difference data within the backup period is found quickly and backup efficiency is improved.
A first aspect of the embodiments provides a differential data backup method. The method is applied to a storage system that includes a processor, a memory, a production volume, and a target volume. The production volume includes a plurality of data blocks having the same size, and each data block is allocated a contiguous range of logical addresses. A bitmap is stored in the memory; the bitmap includes a plurality of bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one data block in the production volume. The processor obtains a log record within a backup period, the log record including the start address of a data block to be written to the production volume and the length of the data block. The processor then determines, according to the start address, the length, and the logical addresses corresponding to each byte, the target byte in the bitmap that corresponds to the data block. According to the start address, the length, and the size of a data block, the processor determines the number n of bits to be modified in the bitmap, 0 < n <= 8, and according to the start address, the size of a data block, and the number of bits in each byte (8), it determines the start bit p of the bits to be modified, 0 <= p < 8. From n, p, and the formula
Figure PCTCN2015099213-appb-000001
the processor calculates a target value, reads the source value stored in the target byte, performs a bitwise OR operation on the target value and the source value, and writes the value obtained from the bitwise OR operation to the target byte to modify the bitmap. When it determines that the backup period has arrived, the processor obtains the difference data according to the modified bitmap and backs up the difference data to the target volume.
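The formula itself is an image that is not reproduced in this text. As a hedged sketch only, the closed form below, (2^n - 1) shifted left by p, is an assumption; it is chosen because it reproduces the worked values elsewhere in the document (00111110B, i.e. 62, ORed with 00000110B). The function names and the extra check p + n <= 8 (all bits fall inside one byte) are likewise illustrative assumptions.

```python
def target_value(n: int, p: int) -> int:
    """Byte value with n consecutive bits set, starting at bit p (bit 0 is lowest).

    Assumed closed form: (2**n - 1) << p. The patent's formula image is not
    available, so this is a reconstruction, not the authors' notation.
    """
    if not (0 < n <= 8 and 0 <= p < 8 and p + n <= 8):
        raise ValueError("need 0 < n <= 8, 0 <= p < 8 and p + n <= 8")
    return ((1 << n) - 1) << p


def mark_bits(bitmap: bytearray, byte_index: int, n: int, p: int) -> None:
    """Bitwise OR the target value into the target byte; bits already set stay set."""
    bitmap[byte_index] |= target_value(n, p)


bitmap = bytearray(2)            # two bytes, all bits 0
mark_bits(bitmap, 0, n=2, p=1)   # byte 0 becomes 00000110B
mark_bits(bitmap, 0, n=5, p=1)   # 00111110B OR 00000110B
print(format(bitmap[0], "08b"))  # 00111110
```

Using OR rather than a plain write means a later log record can never clear a bit that an earlier record set within the same backup period.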
The first aspect provides a specific way of modifying the bitmap: the bitmap is modified according to the log information within a backup period, so that the difference data within the backup period is found quickly and backed up to the target volume, which improves backup efficiency.
With reference to the first aspect, in a first implementation of the first aspect, the log record is generated by merging at least two write data requests, where the logical addresses of the data blocks included in the at least two write data requests are consecutive or repeated. If the bitmap were modified separately for each of the two write data requests, it would have to be modified at least twice; merging the requests into one log record allows the bitmap to be operated on only once, which improves the efficiency of modifying the bitmap.
With reference to the first aspect, in a second implementation of the first aspect, the log record is generated by merging at least one write data request and at least one delete data request, where the logical addresses of the data block included in the write data request partially overlap the logical addresses of the data block included in the delete data request. The effect is similar to that of the first implementation: if the bitmap were modified separately for the write data request and the delete data request, it would have to be modified at least twice; after the requests are merged into one log record, part of the bitmap modification can be cancelled out and the bitmap can be operated on only once, which improves the efficiency of modifying the bitmap.
With reference to any of the foregoing implementations, in a second implementation of the first aspect, after the difference data is backed up to the target volume, the processor clears every bit of the bitmap to zero. The bitmap can then go on to record the data blocks modified in the next backup period.
A second aspect of the embodiments provides another differential data backup method. The method is applied to a storage system that includes a processor, a memory, a production volume, and a target volume. The production volume includes a plurality of data blocks having the same size, and each data block is allocated a contiguous range of logical addresses. A bitmap is stored in the memory; the bitmap includes a plurality of bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one data block in the production volume. The processor obtains a log record within a backup period, the log record including the start address of a data block to be written to the production volume and the length of the data block. The processor then determines, according to the start address, the length, and the logical addresses corresponding to each byte, the target byte in the bitmap that corresponds to the data block. According to the start address, the length, and the size of a data block, the processor determines the number n of bits to be modified in the bitmap, 0 < n <= 8, and according to the start address, the size of a data block, and the number of bits in each byte (8), it determines the start bit p of the bits to be modified, 0 <= p < 8. Using n as the abscissa and p as the ordinate, the processor looks up the target value in a preset two-dimensional mapping table, each value a_np in the two-dimensional mapping table being calculated by the formula
Figure PCTCN2015099213-appb-000002
The processor then reads the source value stored in the target byte, performs a bitwise OR operation on the target value and the source value, and writes the value obtained from the bitwise OR operation to the target byte to modify the bitmap. When it determines that the backup period has arrived, the processor obtains the difference data according to the modified bitmap and backs up the difference data to the target volume.
Compared with the differential data backup method of the first aspect, the method of the second aspect obtains the target value by table lookup. Each value in the two-dimensional mapping table is calculated with the formula provided in the first aspect and is pre-stored in the storage system in the form of a table. When the processor needs the target value, it can look it up directly in the preset two-dimensional mapping table, using n as the abscissa and p as the ordinate. This reduces the amount of computation performed by the processor, so the method of the second aspect modifies the bitmap more efficiently than the method of the first aspect.
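A precomputed table of this kind might look like the following sketch. The layout (`TABLE[n][p]`, with row 0 unused) and the closed form used to fill it are assumptions, since the formula image is not reproduced here; entries with p + n > 8 are left at 0 because they would not fit in a single byte.

```python
def build_table():
    """Precompute the target value for every valid (n, p) pair.

    Assumed entry: ((2**n - 1) << p), i.e. n consecutive bits starting at bit p.
    """
    table = [[0] * 8 for _ in range(9)]  # rows n = 0..8, columns p = 0..7
    for n in range(1, 9):
        for p in range(8):
            if p + n <= 8:               # only combinations that fit in one byte
                table[n][p] = ((1 << n) - 1) << p
    return table


TABLE = build_table()                    # built once, e.g. at start-up

# Modifying the bitmap then needs one lookup and one bitwise OR.
byte0 = 0b00000110
byte0 |= TABLE[5][1]
print(format(byte0, "08b"))  # 00111110
```

The table holds at most 72 one-byte entries, so the memory cost is small compared with the shift and subtract it saves on every log record.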
The implementations of the second aspect provided by the embodiments of the present invention are similar to those of the first aspect.
A third aspect of the embodiments of the present invention provides a storage system configured to perform the differential data backup method provided by the first aspect.
A fourth aspect of the embodiments of the present invention provides another storage system configured to perform the differential data backup method provided by the second aspect.
A fifth aspect of the embodiments of the present invention provides a differential data backup device configured to perform the differential data backup method provided by the first aspect.
A sixth aspect of the embodiments of the present invention provides a differential data backup device configured to perform the differential data backup method provided by the second aspect.
The storage systems and differential data backup devices provided by the third to sixth aspects can modify the bitmap according to the log information within a backup period, so that the difference data within the backup period is found quickly and backed up to the target volume, which improves backup efficiency.
An embodiment of the present invention further provides a computer program product that includes a computer readable storage medium storing program code, where the instructions included in the program code can be executed by the storage system of the third aspect to perform at least one method of the first aspect.
An embodiment of the present invention further provides a computer program product that includes a computer readable storage medium storing program code, where the instructions included in the program code can be executed by the storage system of the fourth aspect to perform at least one method of the second aspect.
Each of the above computer program products can modify the bitmap according to the log information within a backup period, so that the difference data within the backup period is found quickly and backed up to the target volume, which improves backup efficiency.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for the prior art or the embodiments are briefly introduced below. The drawings described below show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a diagram of an application scenario according to an embodiment of the present invention;
FIG. 2 is a diagram of another application scenario according to an embodiment of the present invention;
FIG. 3 is a structural diagram of a storage system according to an embodiment of the present invention;
FIG. 4A is a schematic diagram of a bitmap according to an embodiment of the present invention;
FIG. 4B is a schematic diagram of another bitmap according to an embodiment of the present invention;
FIG. 4C is a schematic diagram of still another bitmap according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of a differential data backup method according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a differential data backup device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention provide a differential data backup method and a storage system that can modify a bitmap quickly in order to find the difference data within a backup period.
The application scenarios of the embodiments of the present invention are introduced below.
FIG. 1 depicts the composition of a storage system 10 according to an embodiment of the present invention. The storage system 10 shown in FIG. 1 includes one or more hosts 40 and a storage device 20. A host may be a computing device, such as a server, a desktop computer, or another terminal device. The storage device 20 may be a block-based storage device, such as a storage area network (Storage Area Network, SAN) device, or a storage device that contains a file system, such as a network attached storage (Network Attached Storage, NAS) device. This embodiment does not limit the type of the storage device. The host 40 and the storage device 20, and the storage devices 20 with one another, may communicate over the Network File System (NFS)/Common Internet File System (CIFS) protocol or the Fibre Channel (FC) protocol.
The storage device 20 includes at least one controller 21 and a number of disks 22. The controller 21 may include any computing device, such as a server or a desktop computer. An operating system and other application programs are installed in the controller. The controller 21 can send input/output (I/O) requests to a disk 22. For example, it sends a write data request to the disk 22 so that the disk 22 writes the data carried in the write data request into its storage medium.
The disks 22 may be of various types, for example solid state drives (Solid State Drive, SSD), Serial Attached SCSI (SAS) or Fibre Channel (FC) hard disk drives (Hard Disk Drive, HDD), where SCSI stands for Small Computer System Interface, or Serial Advanced Technology Attachment (SATA) or near-line (Near Line, NL) SAS HDDs; this is not limited here. A logical unit (Logic Unit, LU) is a segment of logical storage space distributed over one or more disks 22, such as the production volume 23 and the target volume 24 shown in FIG. 1. The host 40 can send a write data request to the storage system 10; the write data request carries the data to be written to the storage system 10, and the data may be block data or a file. After receiving the data, the controller 21 writes it into a logical unit of the storage device 20. In practice, data often needs to be backed up to ensure its reliability. For example, the data in the production volume 23 is backed up to the target volume 24, so that if the data in the production volume 23 is damaged, it can be restored from the data stored in the target volume 24.
The embodiments of the present invention also apply to another application scenario, shown in FIG. 2. FIG. 2 depicts the composition of another storage system 10, which includes one or more hosts 40, a storage device 20, and a storage device 30. The storage device 30 is similar to the storage device 20 and includes at least one controller 31 and a number of disks 32. The structure and function of the controller 31 are similar to those of the controller 21 in FIG. 1, and the structure and function of the disks 32 are similar to those of the disks 22 in FIG. 1, so they are not described again here. The difference from the application scenario of FIG. 1 is that the backup in FIG. 1 takes place within one storage device, whereas the backup in FIG. 2 takes place between two storage devices. For example, the storage device 20 needs to back up the data on its production volume to the target volume 33 of the storage device 30.
In both the application scenario of FIG. 1 and that of FIG. 2, when the controller 21 backs up the data in one LU (called the production volume) to another LU (called the target volume), it may use either full backup or incremental backup.
Full backup means making a complete backup of all the data on the production volume. Incremental backup backs up the data modified since the last full backup or incremental backup, whichever is later. Because only the modified data (also called difference data) is backed up, this kind of backup is very fast and also saves storage space. How to obtain the difference data therefore becomes a problem that must be solved. This embodiment obtains the difference data by recording a bitmap, as described in detail below.
The structure of the controller 21 is described below. FIG. 3 depicts the composition of the controller 21 according to an embodiment of the present invention.
The controller 21 includes at least an interface 211, a processor 212, and a memory 213.
The interface 211 is configured to communicate with the host 40, the disks 22, or the storage device 30.
The processor 212 may be a central processing unit (CPU), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The processor 212 may be used to process input/output (Input/Output, I/O) requests to the disks 22, to back up the data in the production volume to the target volume, and so on, so that the controller 21 can implement functions such as I/O operations and data backup. In the embodiments of the present invention, the processor 212 is configured to execute a program 214, and specifically may perform the relevant steps of the method embodiments below.
The memory 213 is configured to store the program 214. The memory 213 may include a cache, may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory. It can be understood that the memory 213 may be any non-transitory machine readable medium that can store program code, such as a random access memory (Random-Access Memory, RAM), a magnetic disk, a hard disk, a solid state disk (Solid State Disk, SSD), or a non-volatile memory. The memory 213 may also be used to cache data received from the host 40 or data read from the disks 22.
The program 214 may include an operating system, a file system, and other software modules.
In the embodiments of the present invention, the bitmap may be generated or modified directly according to I/O requests, because an I/O request also carries the start address and the length of a data block. When the bitmap is modified in this way, it has to be processed once for every I/O request received. Even when several I/O requests operate on the same data block, so that only the last I/O request carries information that is useful for modifying the bitmap, the bitmap still has to be modified according to the information in the other I/O requests, which causes unnecessary overhead while the bitmap is generated.
In this embodiment, when the bitmap is generated or modified according to log records, the log records may be tidied up beforehand. For example, if several log records are write operations on the same data block, they can be merged and the part of the bitmap that corresponds to the data block modified according to the merged result. Likewise, if several log records are write operations on data blocks with consecutive addresses, they can also be merged and the parts of the bitmap that correspond to those data blocks modified in one pass according to the merged result.
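The merging of log records described above can be sketched as a simple interval merge. The record format, a `(start_kb, length_kb)` tuple, and the function name `merge_records` are assumptions for illustration; the text does not define a record layout beyond a start address and a length.

```python
def merge_records(records):
    """Merge log records whose address ranges repeat, overlap, or are adjacent.

    Each record is a hypothetical (start_kb, length_kb) tuple.
    """
    merged = []
    for start, length in sorted(records):
        end = start + length                      # exclusive end address
        if merged and start <= merged[-1][1]:     # touches or overlaps the previous range
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end - start) for start, end in merged]


records = [(16, 16), (32, 16), (16, 16), (128, 16)]
print(merge_records(records))  # [(16, 32), (128, 16)]
```

In this example four records collapse into two, so the bitmap is touched twice instead of four times.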
In this embodiment, the storage space of the disks 22 may be logically divided into a number of data blocks, each of which corresponds to a contiguous range of logical addresses. The size of a data block ranges from 512 bytes to 64KB, or larger. This embodiment uses a data block size of 16KB as an example, but does not limit the data block size in any way. The data blocks may be numbered sequentially starting from 0: for example, the logical addresses of data block 0 are 0KB-15KB, those of data block 1 are 16KB-31KB, those of data block 2 are 32KB-47KB, and so on. The correspondence between the number of each data block and its logical addresses may be stored in the memory 213 (as shown in Table 1). It should be noted that numbering the data blocks sequentially is only an example in this embodiment; in some cases it is enough to record the logical address range of each data block, and the data blocks do not need to be numbered.
Number          | 0        | 1         | 2         | 3         | 4         | ......
Logical address | 0KB-15KB | 16KB-31KB | 32KB-47KB | 48KB-63KB | 64KB-79KB | ......
Table 1
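With equally sized blocks, the correspondence in Table 1 does not have to be stored entry by entry; it can be computed. The sketch below assumes the 16KB example block size and the document's KB-granular address notation, and the function names are illustrative only.

```python
BLOCK_KB = 16  # example block size used in this embodiment


def block_range_kb(number: int):
    """Logical address range (first KB, last KB) of the data block with this number."""
    first = number * BLOCK_KB
    return first, first + BLOCK_KB - 1


def block_number(address_kb: int) -> int:
    """Number of the data block that contains the given logical address."""
    return address_kb // BLOCK_KB


print([block_range_kb(i) for i in range(5)])
# [(0, 15), (16, 31), (32, 47), (48, 63), (64, 79)]
print(block_number(17))  # 1
```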
The bitmap generated in this embodiment is described below. The bitmap may be stored in the memory 213; to prevent data loss if the memory 213 loses power, a copy may also be saved on the disks 22 as a backup.
FIG. 4A is a schematic diagram of the bitmap of an embodiment of the present invention. As shown in FIG. 4A, the bitmap contains a plurality of bytes, which the controller 21 usually numbers sequentially, for example byte0, byte1, byte2, and so on. Each byte contains 8 bits, numbered 0, 1, 2, 3, 4, 5, 6, 7, where bit 0 is the lowest bit and bit 7 is the highest bit. Each bit in the bitmap corresponds to one data block. As shown in FIG. 4A, bit 0 of byte0 corresponds to the 1st data block (logical addresses 0KB-15KB); bit 1 of byte0 corresponds to the 2nd data block (logical addresses 16KB-31KB); bit 2 of byte0 corresponds to the 3rd data block (logical addresses 32KB-47KB); and bit 0 of byte1 corresponds to the 9th data block (logical addresses 128KB-143KB). It follows that the data blocks corresponding to byte0 cover logical addresses 0KB-127KB, the data blocks corresponding to byte1 cover logical addresses 128KB-255KB, and so on.
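The layout of FIG. 4A reduces to integer division and remainder. The following sketch assumes 0-based block indexes (so the "9th data block" is index 8) and the 16KB example block size; the function names are illustrative.

```python
BLOCK_KB = 16
BITS_PER_BYTE = 8


def bit_position(block_index: int):
    """(byte index, bit index) of the bitmap bit that tracks a 0-based block index."""
    return divmod(block_index, BITS_PER_BYTE)


def byte_address_range_kb(byte_index: int):
    """Logical address range (first KB, last KB) covered by one bitmap byte."""
    span = BLOCK_KB * BITS_PER_BYTE       # 128KB per byte in this example
    first = byte_index * span
    return first, first + span - 1


print(bit_position(1))           # (0, 1): the 2nd block is bit 1 of byte0
print(bit_position(8))           # (1, 0): the 9th block is bit 0 of byte1
print(byte_address_range_kb(1))  # (128, 255)
```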
本实施例中,位图中的每个位可以具有两种标记,一个是标记“1”,代表在某一段时间内所述位对应的数据块被修改了;另一个是标记“0”,代表在该段时间内所述位对应的数据块没有被修改。所述位图,以及位图中每个位与数据块的逻辑地址之间的对应关系,位图中每个字节与多个数据块的逻辑地址之间的对应关系都可以保存在存储器213中。In this embodiment, each bit in the bitmap may have two types of marks, one is a mark "1", which means that the data block corresponding to the bit is modified in a certain period of time; the other is the mark "0". Represents that the data block corresponding to the bit has not been modified during this period of time. The bitmap, and the correspondence between each bit in the bitmap and the logical address of the data block, the correspondence between each byte in the bitmap and the logical address of the plurality of data blocks may be stored in the memory 213 in.
Because the bitmap serves data backup, it reflects the data blocks that change within one backup period. For example, in the application scenario shown in FIG. 1 or FIG. 2, the production volume periodically backs up changed data blocks to the target volume. At an initial moment (for example, time T0), the production volume backs up all of its stored data to the target volume and creates the bitmap. When the bitmap is first created, all of its bits are set to "0". Assuming that time T1 is when the next backup task is triggered, the bitmap reflects which data blocks of the production volume were modified between T0 and T1. For example, if at time T1 bit 1 of byte0 and bit 0 of byte1 in the bitmap are set to "1", then the data block corresponding to bit 1 of byte0 (logical address 16KB-31KB) and the data block corresponding to bit 0 of byte1 (logical address 128KB-143KB) were modified between T0 and T1. In other words, the data block with logical address 16KB-31KB and the data block with logical address 128KB-143KB are the differential data to be backed up to the target volume this time. After these two data blocks are sent to the target volume, the backup period from T0 to T1 is complete, and all bits of the bitmap are reset to "0". At time T2, another backup task is triggered, and the bitmap then reflects which data blocks of the production volume were modified between T1 and T2. The differential data is again obtained and backed up to the target volume, and the cycle repeats.
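The correspondence between set bits and logical addresses in the example above can be sketched as follows. This is a minimal illustration that assumes the 16KB data block size and the bit numbering (bit 0 is the lowest block in a byte) used in this embodiment; the function name `modified_ranges_kb` is hypothetical.

```python
from typing import List, Tuple

BLOCK_KB = 16        # data block size used in this embodiment's examples
BITS_PER_BYTE = 8    # each bitmap byte therefore covers 8 * 16KB = 128KB


def modified_ranges_kb(bitmap: bytes) -> List[Tuple[int, int]]:
    """Return the (first_kb, last_kb) range of every block whose bit is 1."""
    ranges = []
    for byte_index, byte in enumerate(bitmap):
        for bit_index in range(BITS_PER_BYTE):
            if (byte >> bit_index) & 1:
                block_no = byte_index * BITS_PER_BYTE + bit_index
                first_kb = block_no * BLOCK_KB
                ranges.append((first_kb, first_kb + BLOCK_KB - 1))
    return ranges


# Bitmap at T1 in the example: bit 1 of byte0 and bit 0 of byte1 are set.
bitmap_at_t1 = bytes([0b00000010, 0b00000001])
print(modified_ranges_kb(bitmap_at_t1))  # [(16, 31), (128, 143)]
```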
In this embodiment, the bitmap is updated within a backup period using the information in log records. The following describes, with reference to FIG. 5, how the bitmap is modified according to the information in a log record.
FIG. 5 is a flowchart of the method of this embodiment. The method may be applied to the storage device 20 shown in FIG. 1, or to the storage device 20 and the storage device 30 shown in FIG. 2, and is executed by the processor 212 in FIG. 3. The method includes the following steps:
Step 601: Obtain a log record within a backup period. The log record includes the start address of a data block to be written to the production volume and the length of the data block.
The production volume is an LU located in the storage device 20, as shown in FIG. 1 or FIG. 2.
Taking FIG. 4A as an example, when the bitmap is first created at time T0, all of its bits are set to "0". Before time T1, each log record from T0 to T1 is read, and the bitmap is modified according to each log record.
For example, the first log record states that a data block with a length of 30KB is written to the storage space whose start address is 17KB (the start address here is the start address in the production volume). This embodiment determines, from the information in this log record, which bits in the bitmap need to be set to 1.
Optionally, the log record may be generated by merging at least two write data requests whose data blocks are contiguous. For example, the two write data requests are: write a 15KB data block to the storage space starting at 17KB, and write a 15KB data block to the storage space starting at 32KB. The start addresses and lengths of the two write requests show that their logical addresses are contiguous, so the requests can be merged into the first log record.
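The contiguity check in this example can be sketched as follows. The representation of a request as a `(start_kb, length_kb)` pair and the function name `merge_contiguous` are assumptions made for illustration; the description does not specify how requests are stored.

```python
from typing import Optional, Tuple

Request = Tuple[int, int]  # (start address in KB, length in KB)


def merge_contiguous(first: Request, second: Request) -> Optional[Request]:
    """Merge two writes if the second starts exactly where the first ends."""
    start_a, len_a = first
    start_b, len_b = second
    if start_a + len_a == start_b:
        return (start_a, len_a + len_b)
    return None  # not contiguous, so the requests stay separate


# 15KB at 17KB followed by 15KB at 32KB becomes one 30KB record at 17KB.
print(merge_contiguous((17, 15), (32, 15)))  # (17, 30)
```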
Alternatively, the log record may be generated by merging at least two write data requests whose data blocks overlap fully or partially. For example, both write data requests write a 30KB data block to the storage space starting at 17KB, and only the contents of the data blocks differ. In that case, the second write data request overwrites the first; that is, the merged log record has the same content as the second write data request.
In addition, the log record may be generated by merging at least one write data request and at least one delete data request. For example, the write data request writes a 15KB data block to the storage space starting at 17KB, and the delete data request deletes the data block whose start address is 17KB and whose length is 15KB. The merged log record then indicates that the data block with start address 17KB and length 15KB is not modified, and accordingly the bitmap does not need to be modified. In some cases, the logical address of the data block in the write data request partially overlaps the logical address of the data block in the delete data request. For example, the write data request writes a 15KB data block to the storage space starting at 17KB, and the delete data request deletes the data block whose start address is 19KB and whose length is 13KB. The merged log record then writes a 2KB data block to the storage space starting at 17KB.
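The two write-plus-delete cases above can be sketched as follows. This is only an illustration under stated assumptions: requests are `(start_kb, length_kb)` pairs, the function name `merge_write_delete` is hypothetical, and only the overlap shapes that appear in the text (delete covers the whole write, or delete covers its tail) are handled.

```python
from typing import Optional, Tuple

Request = Tuple[int, int]  # (start address in KB, length in KB)


def merge_write_delete(write: Request, delete: Request) -> Optional[Request]:
    """Return the part of `write` that survives `delete`, or None if nothing does."""
    w_start, w_len = write
    d_start, d_len = delete
    w_end = w_start + w_len  # exclusive end of the written range
    d_end = d_start + d_len  # exclusive end of the deleted range
    if d_end <= w_start or d_start >= w_end:
        return write  # no overlap: the write is unchanged
    if d_start <= w_start and d_end >= w_end:
        return None  # the delete cancels the whole write
    if d_start > w_start and d_end >= w_end:
        return (w_start, d_start - w_start)  # the tail of the write is deleted
    raise ValueError("overlap shape not covered by this sketch")


print(merge_write_delete((17, 15), (17, 15)))  # None
print(merge_write_delete((17, 15), (19, 13)))  # (17, 2)
```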
Step 602: Determine the target byte in the bitmap corresponding to the data block, according to the start address, the length, and the logical addresses corresponding to each byte.
Specifically, the start address of 17KB and the length of 30KB show that the data block to be modified corresponds to byte0 of the bitmap shown in FIG. 4A. Therefore, some bits in byte0 need to be set to "1".
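Step 602 can be sketched as follows, assuming the 16KB block size of this embodiment, so that each byte of the bitmap covers 8 × 16KB = 128KB of logical address space. The function name `target_byte_span` is hypothetical. The end address is computed as start plus length, which follows the arithmetic used in this description; a byte-accurate implementation might use start + length - 1 instead.

```python
from typing import Tuple

BLOCK_KB = 16
BITS_PER_BYTE = 8


def target_byte_span(start_kb: int, length_kb: int) -> Tuple[int, int]:
    """Return the indexes of the first and last bitmap bytes touched by a write."""
    first_block = start_kb // BLOCK_KB
    last_block = (start_kb + length_kb) // BLOCK_KB  # end address as in the text
    return first_block // BITS_PER_BYTE, last_block // BITS_PER_BYTE


# A 30KB write at 17KB touches blocks 1 and 2, both of which fall in byte0.
print(target_byte_span(17, 30))  # (0, 0)
```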
Step 603: Determine the number n of bits to be modified in the bitmap, according to the start address, the length, and the size of a data block.
Specifically, from the start address of 17KB and the data block size of 16KB, the number of the first data block to be modified is 17/16=1 (the integer part of the quotient). From the start address of 17KB and the length of 30KB, the end address is 17KB+30KB=47KB; from the end address and the data block size of 16KB, the number of the last data block to be modified is 47/16=2 (the integer part of the quotient). From data block 1 to data block 2, a total of (2-1+1)=2 data blocks need to be modified. Because each bit in the bitmap corresponds to one data block, two bits in byte0 need to be set to "1". In this embodiment, the letter n denotes the number of bits that need to be set to "1", where 0<n<=8.
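The arithmetic of step 603 can be sketched as follows, assuming the 16KB block size and the end-address convention (start plus length) used in the text. The function name `bits_to_set` is hypothetical.

```python
BLOCK_KB = 16


def bits_to_set(start_kb: int, length_kb: int) -> int:
    """Number n of consecutive blocks (and therefore bits) covered by a write."""
    first_block = start_kb // BLOCK_KB               # 17 // 16 = 1
    last_block = (start_kb + length_kb) // BLOCK_KB  # 47 // 16 = 2
    return last_block - first_block + 1


print(bits_to_set(17, 30))  # 2
```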
Step 604: Determine the start bit p among the bits to be modified, according to the start address, the size of a data block, and the number of bits in each byte (8).
Specifically, from the start address of 17KB, the 16KB size of the data block corresponding to each bit, and the 8 bits in each byte, the start bit in byte0 that needs to be set to "1" is bit 1. The calculation is (17/16) mod 8 = 1, where 17/16 again takes the integer part of the quotient. In this embodiment, the letter p denotes the number of the start bit that needs to be set to "1", where 0<=p<8.
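The calculation of step 604 can be sketched in one line of integer arithmetic, again assuming 16KB blocks and 8 bits per byte. The function name `start_bit` is hypothetical.

```python
BLOCK_KB = 16
BITS_PER_BYTE = 8


def start_bit(start_kb: int) -> int:
    """Position p, within its byte, of the bit for the first block of a write."""
    return (start_kb // BLOCK_KB) % BITS_PER_BYTE


print(start_bit(17))  # 1
```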
Step 605: Calculate the target value from n, p, and the following formula:
[Formula: Figure PCTCN2015099213-appb-000003]
Here, n is the number of bits that need to be set to "1", and p is the number of the start bit that needs to be set to "1". In this embodiment, the n obtained in step 603 and the p obtained in step 604 are used as the inputs of the formula to obtain an output value. In the example above, n=2 and p=1, which gives value=6.
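The formula is an image in the published text and is not reproduced here. As a hedged illustration, one expression that matches both worked examples in this description (n=2, p=1 gives 6; n=5, p=1 gives 62) is value = (2^n - 1) × 2^p, that is, n consecutive one-bits shifted left by p positions. The closed form and the function name `target_value` are assumptions, not a quotation of the formula in the figure.

```python
def target_value(n: int, p: int) -> int:
    """Byte mask with n consecutive one-bits starting at bit p (assumed form)."""
    if not (0 < n <= 8 and 0 <= p < 8 and p + n <= 8):
        raise ValueError("n and p must describe bits inside a single byte")
    return ((1 << n) - 1) << p


print(target_value(2, 1))  # 6
print(target_value(5, 1))  # 62
```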
Alternatively, a two-dimensional mapping table, shown as Table 2, is set in the memory 213 in advance. The letter n is the abscissa and represents the number of bits that need to be set to "1"; the letter p is the ordinate and represents the number of the start bit that needs to be set to "1". The value of each entry in the table is calculated with the formula above:
[Formula: Figure PCTCN2015099213-appb-000004]
Each value calculated with the formula is converted to hexadecimal before it is written into Table 2.
[Table image: Figure PCTCN2015099213-appb-000005]
Table 2
As in the example above, n=2 and p=1, so looking up the table gives value=0x06.
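A precomputed table of this kind can be sketched as follows. Table 2 itself is an image and is not reproduced here, so the entries below are generated from an assumed closed form, (2^n - 1) × 2^p, which matches the worked values in this description; the dictionary layout and the name `MASK_TABLE` are also assumptions.

```python
# Keys are (n, p); only combinations that stay inside one byte are stored.
MASK_TABLE = {
    (n, p): ((1 << n) - 1) << p
    for p in range(8)
    for n in range(1, 9 - p)
}

# Looking up n=2, p=1 and formatting the entry as hexadecimal.
print(f"0x{MASK_TABLE[(2, 1)]:02X}")  # 0x06
```

Replacing the calculation with a lookup trades a small amount of memory (36 one-byte entries under these assumptions) for avoiding the shift and subtract on every log record.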
Step 606: Read the source value stored in the target byte.
The source value is the value stored in the target byte before the target byte is modified. As FIG. 4A shows, the source value in byte0 is 00000000B.
Step 607: Perform a bitwise OR operation on the target value and the source value, and write the result of the bitwise OR operation into the target byte.
The bitwise OR operation performs an "OR" on each bit of the byte. Specifically, the binary data (00000110B) corresponding to the value obtained in the preceding steps (6 or 0x06) is ORed with the source value 00000000B to obtain 00000110B, which is written into the corresponding byte (that is, byte0). In this embodiment, bit 0 is the least significant bit, and higher-numbered bits are progressively more significant. After 00000110B is written into byte0, the result is as shown in FIG. 4B.
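Steps 606 and 607 amount to a read-modify-write of one byte, which can be sketched as follows. The in-memory `bytearray` and the function name `mark_byte` are assumptions for illustration; the description only states that the bitmap is held in the memory.

```python
def mark_byte(bitmap: bytearray, byte_index: int, target_value: int) -> None:
    """OR the target value into one bitmap byte, keeping bits that are already set."""
    source_value = bitmap[byte_index]                   # step 606: read the source value
    bitmap[byte_index] = source_value | target_value    # step 607: OR and write back


bitmap = bytearray(2)        # all bits 0, as at time T0
mark_byte(bitmap, 0, 0x06)   # target value from the first log record
print(f"{bitmap[0]:08b}")    # 00000110
```

Using OR rather than a plain store means that bits set by earlier log records in the same backup period are preserved.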
The steps above complete the modification of the bitmap according to the first log record. If the first log record is the only log record between T0 and T1, then when the backup period arrives at time T1, step 608 may be performed directly: obtain the differential data according to the modified bitmap.
Specifically, when the backup period arrives, the differential bitmap shows that bit 1 and bit 2 of byte0 have changed. The logical address of the data block corresponding to bit 1 of byte0 is 16KB-31KB, and the logical address of the data block corresponding to bit 2 of byte0 is 32KB-47KB. Therefore, the data (also called differential data) can be read from disk according to the logical addresses of these two data blocks.
Step 609: Back up the differential data to the target volume.
The target volume is an LU located in the storage device 20 or the storage device 30, as shown in FIG. 1 or FIG. 2.
With the method described in this embodiment, the differential data is obtained by recording a differential bitmap. When the backup period arrives, the changed data blocks only need to be obtained according to the differential bitmap, which reduces the computational load and improves backup efficiency.
In the above embodiment, if there are other log records between T0 and T1, a second log record is read after step 607 and before step 608. For example, the second log record states that a data block with a length of 60KB is written to the storage space whose start address is 20KB. Following the steps above, some bits in byte0 need to be set to "1", and the calculation gives value=62, whose binary representation is 00111110B. ORing 00111110B with the binary data 00000110B already in byte0 gives 00111110B, which is written into byte0. The result is as shown in FIG. 4C.
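The two log records of this example can be replayed end to end with the following sketch. It assumes 16KB blocks, the end-address convention (start plus length) used in the text, and a mask of the form (2^n - 1) × 2^p, which is an assumed expression that reproduces the values 6 and 62 given in this description. The function name `apply_record` is hypothetical, and writes that span more than one bitmap byte are deliberately not handled.

```python
BLOCK_KB = 16
BITS_PER_BYTE = 8


def apply_record(bitmap: bytearray, start_kb: int, length_kb: int) -> int:
    """Mark the blocks of one log record in the bitmap and return the mask used."""
    first_block = start_kb // BLOCK_KB
    last_block = (start_kb + length_kb) // BLOCK_KB
    byte_index = first_block // BITS_PER_BYTE            # step 602
    if last_block // BITS_PER_BYTE != byte_index:
        raise ValueError("this sketch handles single-byte records only")
    n = last_block - first_block + 1                     # step 603
    p = first_block % BITS_PER_BYTE                      # step 604
    value = ((1 << n) - 1) << p                          # step 605 (assumed form)
    bitmap[byte_index] |= value                          # steps 606 and 607
    return value


bitmap = bytearray(2)
print(apply_record(bitmap, 17, 30), f"{bitmap[0]:08b}")  # 6 00000110
print(apply_record(bitmap, 20, 60), f"{bitmap[0]:08b}")  # 62 00111110
```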
Following the example above, the bitmap can be modified according to each log record from T0 to T1 in turn. At time T1, when the backup task is triggered, the data blocks that changed between T0 and T1 (that is, the differential data) are obtained according to the bitmap and backed up to the target volume.
In addition, in this embodiment, after the backup is complete, all bits of the bitmap are cleared to zero so that the bitmap can mark the data blocks that change in the next backup period.
An embodiment of the present invention further provides a differential data backup apparatus 700. The apparatus 700 is located in a storage system, and the storage system includes a memory, a production volume, and a target volume. The production volume includes multiple data blocks of the same size, and each data block is allocated a contiguous range of logical addresses. A bitmap is stored in the memory; the bitmap includes multiple bytes, each byte includes 8 bits, and each bit in the bitmap corresponds to one data block in the production volume. The apparatus 700 includes:
A log processing module 701, configured to obtain a log record within a backup period, where the log record includes the start address of a data block to be written to the production volume and the length of the data block.
A bitmap generation module 702, configured to: determine the target byte in the bitmap corresponding to the data block, according to the start address, the length, and the logical addresses corresponding to each byte; determine the number n of bits to be modified in the bitmap, according to the start address, the length, and the size of a data block, where 0<n<=8; determine the start bit p among the bits to be modified, according to the start address, the size of a data block, and the number of bits in each byte (8), where 0<=p<8; calculate a target value from n, p, and the formula
[Formula: Figure PCTCN2015099213-appb-000006]
read the source value stored in the target byte; and perform a bitwise OR operation on the target value and the source value and write the result of the bitwise OR operation into the target byte to modify the bitmap.
A backup module 703, configured to: when it is determined that the backup period has arrived, obtain the differential data according to the modified bitmap; and back up the differential data to the target volume.
The log record may be generated by merging at least two write data requests, where the logical addresses of the data blocks included in the at least two write data requests are contiguous or overlapping.
The log record may also be generated by merging at least one write data request and at least one delete data request, where the logical address of the data block included in the write data request partially overlaps the logical address of the data block included in the delete data request.
In addition, the bitmap generation module 702 is further configured to clear every bit of the bitmap to zero after the differential data is backed up to the target volume.
With the apparatus described in this embodiment, the differential data is obtained by recording a differential bitmap. When the backup period arrives, the changed data blocks only need to be obtained according to the differential bitmap, which reduces the computational load and improves backup efficiency.
A person of ordinary skill in the art can understand that the foregoing storage medium includes any non-transitory machine-readable medium that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a random-access memory (RAM), a solid state disk (SSD), or a non-volatile memory.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them.

Claims (12)

  1. A differential data backup method, wherein the method is applied to a storage system, and the storage system comprises a processor, a memory, a production volume, and a target volume; the production volume comprises multiple data blocks of the same size, and each data block is allocated a contiguous range of logical addresses; a bitmap is stored in the memory; the bitmap comprises multiple bytes, each byte comprises 8 bits, and each bit in the bitmap corresponds to one data block in the production volume; and the method is executed by the processor and comprises:
    obtaining a log record within a backup period, wherein the log record comprises the start address of a data block to be written to the production volume and the length of the data block;
    determining the target byte in the bitmap corresponding to the data block, according to the start address, the length, and the logical addresses corresponding to each byte;
    determining the number n of bits to be modified in the bitmap, according to the start address, the length, and the size of a data block, wherein 0<n<=8;
    determining the start bit p among the bits to be modified, according to the start address, the size of a data block, and the number of bits in each byte (8), wherein 0<=p<8;
    calculating a target value from n, p, and the formula
    [Formula: Figure PCTCN2015099213-appb-100001]
    reading the source value stored in the target byte;
    performing a bitwise OR operation on the target value and the source value, and writing the result of the bitwise OR operation into the target byte to modify the bitmap;
    when it is determined that the backup period has arrived, obtaining differential data according to the modified bitmap; and
    backing up the differential data to the target volume.
  2. The method according to claim 1, wherein the log record is generated by merging at least two write data requests, and the logical addresses of the data blocks included in the at least two write data requests are contiguous or overlapping.
  3. The method according to claim 1, wherein the log record is generated by merging at least one write data request and at least one delete data request, and the logical address of the data block included in the write data request partially overlaps the logical address of the data block included in the delete data request.
  4. The method according to any one of claims 1 to 3, wherein after the differential data is backed up to the target volume, the method further comprises clearing every bit of the bitmap to zero.
  5. A storage system, comprising a processor, a memory, a production volume, and a target volume, wherein
    the production volume comprises multiple data blocks of the same size, and each data block is allocated a contiguous range of logical addresses;
    a bitmap is stored in the memory; the bitmap comprises multiple bytes, each byte comprises 8 bits, and each bit in the bitmap corresponds to one data block in the production volume; and
    the processor is configured to: obtain a log record within a backup period, wherein the log record comprises the start address of a data block to be written to the production volume and the length of the data block;
    determine the target byte in the bitmap corresponding to the data block, according to the start address, the length, and the logical addresses corresponding to each byte;
    determine the number n of bits to be modified in the bitmap, according to the start address, the length, and the size of a data block, wherein 0<n<=8;
    determine the start bit p among the bits to be modified, according to the start address, the size of a data block, and the number of bits in each byte (8), wherein 0<=p<8;
    calculate a target value from n, p, and the formula
    [Formula: Figure PCTCN2015099213-appb-100002]
    read the source value stored in the target byte;
    perform a bitwise OR operation on the target value and the source value, and write the result of the bitwise OR operation into the target byte to modify the bitmap;
    when it is determined that the backup period has arrived, obtain differential data according to the modified bitmap; and
    back up the differential data to the target volume.
  6. The storage system according to claim 5, wherein the log record is generated by merging at least two write data requests, and the logical addresses of the data blocks included in the at least two write data requests are contiguous or overlapping.
  7. The storage system according to claim 5, wherein the log record is generated by merging at least one write data request and at least one delete data request, and the logical address of the data block included in the write data request partially overlaps the logical address of the data block included in the delete data request.
  8. The storage system according to any one of claims 5 to 7, wherein the processor is further configured to clear every bit of the bitmap to zero after the differential data is backed up to the target volume.
  9. A differential data backup apparatus, wherein the apparatus is located in a storage system, and the storage system comprises a memory, a production volume, and a target volume; the production volume comprises multiple data blocks of the same size, and each data block is allocated a contiguous range of logical addresses; a bitmap is stored in the memory; the bitmap comprises multiple bytes, each byte comprises 8 bits, and each bit in the bitmap corresponds to one data block in the production volume; and the apparatus comprises:
    a log processing module, configured to obtain a log record within a backup period, wherein the log record comprises the start address of a data block to be written to the production volume and the length of the data block;
    a bitmap generation module, configured to: determine the target byte in the bitmap corresponding to the data block, according to the start address, the length, and the logical addresses corresponding to each byte; determine the number n of bits to be modified in the bitmap, according to the start address, the length, and the size of a data block, wherein 0<n<=8; determine the start bit p among the bits to be modified, according to the start address, the size of a data block, and the number of bits in each byte (8), wherein 0<=p<8; calculate a target value from n, p, and the formula
    [Formula: Figure PCTCN2015099213-appb-100003]
    read the source value stored in the target byte; and perform a bitwise OR operation on the target value and the source value and write the result of the bitwise OR operation into the target byte to modify the bitmap; and
    a backup module, configured to: when it is determined that the backup period has arrived, obtain differential data according to the modified bitmap; and back up the differential data to the target volume.
  10. The apparatus according to claim 9, wherein the log record is generated by merging at least two write data requests, and the logical addresses of the data blocks included in the at least two write data requests are contiguous or overlapping.
  11. The apparatus according to claim 9, wherein the log record is generated by merging at least one write data request and at least one delete data request, and the logical address of the data block included in the write data request partially overlaps the logical address of the data block included in the delete data request.
  12. The apparatus according to any one of claims 9 to 11, wherein the bitmap generation module is further configured to clear every bit of the bitmap to zero after the differential data is backed up to the target volume.
PCT/CN2015/099213 2015-12-28 2015-12-28 Discrepant data backup method, storage system and discrepant data backup device WO2017113059A1 (en)


WO2016041384A1 (en) Duplicate data deletion method and device
US10891074B2 (en) Key-value storage device supporting snapshot function and operating method thereof
TWI603194B (en) Data storage device and data accessing method
US10824359B2 (en) Optimizing inline deduplication during copies
US11921633B2 (en) Deduplicating data based on recently reading the data
WO2016101145A1 (en) Controller, method for identifying data block stability and storage system
CN109725850B (en) Memory system and memory device
US8966207B1 (en) Virtual defragmentation of a storage
JP7376488B2 (en) Deduplication as an infrastructure to avoid snapshot copy-on-write data movement
US20140219041A1 (en) Storage device and data processing method thereof
CN107273306B (en) Data reading and writing method for solid state disk and solid state disk
WO2016082559A1 (en) Data writing method and storage device
WO2023245942A1 (en) Ssd finite window data deduplication identification method and apparatus, and computer device
US11662949B2 (en) Storage server, a method of operating the same storage server and a data center including the same storage server
US11436092B2 (en) Backup objects for fully provisioned volumes with thin lists of chunk signatures

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 15911691

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 15911691

Country of ref document: EP

Kind code of ref document: A1