CN115509808B - Data backup method, device, computer equipment and storage medium - Google Patents

Publication number
CN115509808B
CN115509808B CN202211134595.0A CN202211134595A CN115509808B CN 115509808 B CN115509808 B CN 115509808B CN 202211134595 A CN202211134595 A CN 202211134595A CN 115509808 B CN115509808 B CN 115509808B
Authority
CN
China
Prior art keywords
snapshot
data block
link
file
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211134595.0A
Other languages
Chinese (zh)
Other versions
CN115509808A (en)
Inventor
余剑
陈建熊
杨宇昊
卫东
王子骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Dingjia Computer Technology Co ltd
Original Assignee
Anhui Dingjia Computer Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Dingjia Computer Technology Co ltd filed Critical Anhui Dingjia Computer Technology Co ltd
Priority to CN202211134595.0A priority Critical patent/CN115509808B/en
Publication of CN115509808A publication Critical patent/CN115509808A/en
Application granted granted Critical
Publication of CN115509808B publication Critical patent/CN115509808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data backup method, a data backup apparatus, a computer device and a storage medium. The method comprises: in response to a data snapshot creation request for a source volume, determining at least one snapshot time at which a snapshot record is required for a data block to be updated, and obtaining a snapshot sequence corresponding to the data block to be updated; compressing the target data block set to which the data block to be updated belongs in the source volume to obtain a compressed data block; storing the compressed data block in a link snapshot file in the snapshot sequence, where the link snapshot file is the snapshot file corresponding to the link snapshot time, and the link snapshot time is earlier than any other snapshot time in the at least one snapshot time; and establishing a link relation between the link snapshot file and the snapshot file corresponding to each snapshot time other than the link snapshot time, so as to back up the data blocks in the source volume. The method can improve the read-write efficiency of a server disk.

Description

Data backup method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of computer technology, and in particular, to a data backup method, apparatus, computer device, storage medium, and computer program product.
Background
With the rapid development of information technology, cloud computing, big data analysis and mass storage are widely used, which brings new challenges to backup and disaster recovery in large-capacity storage scenarios. Snapshot technology can generate a copy image of data in a short time and then generate a full or incremental backup file based on that image, so it is widely used in the backup field.
When the conventional technology implements a storage snapshot, updating data at a given moment requires reading the data about to be overwritten in the source volume, writing it to the resource volume, and only then writing the new data to the source volume. This generates a large number of input/output (IO) requests carrying a large amount of data, which reduces the read-write efficiency of the server disk.
Therefore, the conventional technology has the problem of low read-write efficiency of the server disk.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data backup method, apparatus, computer device, computer readable storage medium, and computer program product that can improve the read/write efficiency of a server disk.
In a first aspect, the present application provides a data backup method. The method comprises the following steps:
Responding to a data snapshot creation request aiming at a source volume, and determining at least one snapshot time needing to carry out snapshot recording aiming at a data block to be updated in the source volume under the condition that the data block to be updated in the source volume is not snapshot recorded, so as to obtain a snapshot sequence corresponding to the data block to be updated; the snapshot sequence consists of snapshot files corresponding to each snapshot time;
compressing the target data block set of the data block to be updated in the source volume to obtain a compressed data block;
storing the compressed data blocks into a linked snapshot file in the snapshot sequence; the link snapshot file is a snapshot file corresponding to the link snapshot time in the at least one snapshot time; the link snapshot time is earlier than any snapshot time except the link snapshot time in the at least one snapshot time;
and establishing a link relation between the snapshot files corresponding to the snapshot times except the link snapshot time and the link snapshot files so as to backup the data blocks in the source volume.
In one embodiment, the establishing a link relationship between the snapshot files corresponding to the snapshot times except the link snapshot time and the link snapshot file to backup the data blocks in the source volume includes:
Updating the data block index information corresponding to the compressed data block in the link snapshot file to obtain link index information;
and linking the data block index information of the compressed data block in the snapshot files corresponding to the snapshot times except the link snapshot time to the link index information in a pointer mode so as to establish the link relation.
In one embodiment, the method further comprises:
in response to a delete instruction for the link snapshot file, determining, as target link index information, the data block index information in the link snapshot file whose link relation points to the local file;
appending a copy of the data block corresponding to the target link index information in the link snapshot file to the priority reference snapshot file to obtain a copied data block; the priority reference snapshot file is the snapshot file corresponding to the priority reference snapshot time among the reference snapshot files; the data block index information in each reference snapshot file is linked to the target link index information; the time interval between the priority reference snapshot time and the link snapshot time is smaller than the time interval between any other reference snapshot time and the link snapshot time; a reference snapshot time is the snapshot time corresponding to a reference snapshot file;
updating the data block index information corresponding to the copied data block in the priority reference snapshot file to index information pointing to the local file, to obtain the local link index information corresponding to the copied data block in the priority reference snapshot file;
and updating the data block index information that is linked to the target link index information in the remaining reference snapshot files, which correspond to the remaining reference snapshot times, so that it is linked to the local link index information, and deleting the link snapshot file.
In one embodiment, the establishing a link relationship between the snapshot files corresponding to the snapshot times except the link snapshot time and the link snapshot file to backup the data blocks in the source volume includes:
responding to a data backup request of the source volume at the full backup time, and reading bitmap information corresponding to the source volume at the full backup time;
reading each data block set according to the data record judgment mark corresponding to each bitmap position in the bitmap information so as to write the data blocks in each data block set into a preset backup file; and each bitmap position has a one-to-one mapping relation with each data block set.
In one embodiment, reading each data block set according to the data record judgment mark corresponding to each bitmap position in the bitmap information, so as to write the data blocks in each data block set into a preset backup file, includes:
judging, according to the data record judgment mark corresponding to the current bitmap position, whether the current data block set corresponding to the current bitmap position has been recorded by a snapshot;
under the condition that the current data block set is judged not to be recorded by the snapshot, reading the data blocks in the current data block set in the source volume to write a full backup file;
under the condition that the current data block set is judged to be recorded by the snapshot, reading the current compressed data block from a snapshot file recorded with the current compressed data block corresponding to the current data block set according to the data block index information corresponding to the current data block set at the full backup time;
decompressing the current compressed data block to write the data blocks in the current data block set into the full backup file;
and returning to the step of judging whether the current data block set corresponding to the current bitmap position is recorded by the snapshot according to the data record judgment mark corresponding to the current bitmap position until the data blocks in each data block set in the source volume are written into the full backup file.
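The loop over bitmap positions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source volume is modeled as a list of data block sets held in memory, the data record judgment mark is modeled as a boolean per bitmap position, the snapshot store is a plain dictionary keyed by bitmap position, and zlib stands in for the unspecified compression algorithm. The names `full_backup`, `source_sets`, `bitmap` and `snapshot_payloads` are hypothetical.

```python
import zlib


def full_backup(source_sets, bitmap, snapshot_payloads):
    """Assemble a full backup image, one data block set per bitmap position.

    source_sets       -- current content of each data block set in the source volume
    bitmap            -- bitmap[i] is True if set i has been recorded by a snapshot
    snapshot_payloads -- compressed copy of each snapshot-recorded set, keyed by position
    """
    backup = bytearray()
    for position, recorded in enumerate(bitmap):
        if recorded:
            # The set was overwritten after the snapshot: take the preserved
            # copy from the snapshot file and decompress it.
            backup += zlib.decompress(snapshot_payloads[position])
        else:
            # The set is unchanged since the snapshot: read it from the source volume.
            backup += source_sets[position]
    return bytes(backup)
```

In this sketch, only the sets marked in the bitmap cost a decompression; every other set is read from the source volume as-is.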
In one embodiment, after the data blocks in each data block set in the source volume have been written into the full backup file, the method further comprises:
in response to a data incremental backup request for the source volume at the incremental backup time, acquiring the bitmap log identifier corresponding to each bitmap position; the bitmap log identifier is used for judging whether the current data block set corresponding to the current bitmap position was updated in the period between the full backup time and the incremental backup time;
in the case that the current data block set was updated, reading the bitmap information corresponding to the source volume at the incremental backup time;
and reading the current data block set according to the data record judgment mark corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time, so as to write the incremental backup file.
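A minimal sketch of the incremental pass, under the same kind of assumptions as a simple in-memory model: the bitmap log identifier is a boolean per bitmap position, the data record judgment mark is a boolean per bitmap position, and zlib stands in for the unspecified compression algorithm. The function name and all parameter names are hypothetical, and the result is returned as a dictionary rather than written to an incremental backup file.

```python
import zlib


def incremental_backup(source_sets, bitmap_log, bitmap, snapshot_payloads):
    """Collect only the data block sets updated since the full backup.

    bitmap_log        -- bitmap_log[i] is True if set i changed since the full backup
    bitmap            -- bitmap[i] is True if set i is recorded by the snapshot
                         taken at the incremental backup time
    snapshot_payloads -- compressed copy of each snapshot-recorded set, keyed by position
    """
    changed = {}
    for position, updated in enumerate(bitmap_log):
        if not updated:
            continue  # unchanged since the full backup, nothing to write
        if bitmap[position]:
            # Recorded by the snapshot: read the preserved copy and decompress it.
            changed[position] = zlib.decompress(snapshot_payloads[position])
        else:
            # Not recorded by the snapshot: read the set from the source volume.
            changed[position] = source_sets[position]
    return changed
```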
In a second aspect, the present application further provides a data backup apparatus. The device comprises:
the determining module is used for responding to a data snapshot creation request aiming at a source volume, and determining at least one snapshot time of snapshot record aiming at the data block to be updated in the source volume under the condition that the data block to be updated in the source volume is not snapshot recorded, so as to obtain a snapshot sequence corresponding to the data block to be updated; the snapshot sequence consists of snapshot files corresponding to each snapshot time;
The compression module is used for compressing the target data block set of the data block to be updated in the source volume to obtain a compressed data block;
the storage module is used for storing the compressed data blocks into a linked snapshot file in the snapshot sequence; the link snapshot file is a snapshot file corresponding to the link snapshot time in the at least one snapshot time; the link snapshot time is earlier than any snapshot time except the link snapshot time in the at least one snapshot time;
and the establishing module is used for establishing a link relation between the snapshot files corresponding to the snapshot times except the link snapshot time and the link snapshot files so as to backup the data blocks in the source volume.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method described above.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the method described above.
According to the data backup method, apparatus, computer device, storage medium and computer program product described above, in response to a data snapshot creation request for the source volume, and in the case that the data block to be updated in the source volume has not been snapshot recorded, at least one snapshot time at which a snapshot record is required for the data block to be updated is determined, and a snapshot sequence corresponding to the data block to be updated is obtained; the snapshot sequence consists of the snapshot files corresponding to each snapshot time. The target data block set of the data block to be updated in the source volume is compressed to obtain a compressed data block, and the compressed data block is stored in the link snapshot file in the snapshot sequence; the link snapshot file is the snapshot file corresponding to the link snapshot time, which is earlier than any other snapshot time in the at least one snapshot time. A link relation is then established between the link snapshot file and the snapshot file corresponding to each snapshot time other than the link snapshot time, so as to back up the data blocks in the source volume. Because the data is compressed first during snapshot generation, the data volume of the input/output (IO) requests and the size of the link snapshot file are both reduced, which improves the read-write efficiency of the server disk. When there are multiple snapshot times, the compressed data block corresponding to the data block to be updated would otherwise have to be written into the snapshot file for every snapshot time; instead, the compressed data block is written only into the link snapshot file, and the snapshot files for the other snapshot times are linked to it. No data write is therefore needed for the remaining snapshot files in the snapshot sequence, which greatly reduces the number and data volume of input/output requests, addresses the drop in input/output performance with multiple snapshots, and further improves the read-write efficiency of the server disk.
Drawings
FIG. 1 is a flow chart of a data backup method according to an embodiment;
FIG. 2 is a flow chart of a method for full backup of data in one embodiment;
FIG. 3 is a flow chart of a method of incremental backup of data according to one embodiment;
FIG. 4 is a flowchart of a data backup method according to another embodiment;
FIG. 5 is a system block diagram of a snapshot backup restoration in one embodiment;
FIG. 6 is a schematic diagram of a disk layout of a source volume in one embodiment;
FIG. 7 (a) is a flow diagram of a method of compressing copy-on-write snapshots in one embodiment;
FIG. 7 (b) is a schematic diagram of a snapshot file when a snapshot is opened at another snapshot time according to an embodiment;
FIG. 7 (c) is a schematic diagram of a snapshot file maintenance process in one embodiment;
FIG. 8 is a block diagram of a snapshot backup device in one embodiment;
FIG. 9 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, a data backup method is provided. This embodiment is described using the example of the method being applied to a server; it is understood that the method may also be applied to a terminal, or to a system including a terminal and a server and implemented through interaction between the two. In this embodiment, the method includes the following steps:
Step S110, in response to a data snapshot creation request for the source volume, and in the case that the data block to be updated in the source volume has not been snapshot recorded, determining at least one snapshot time at which a snapshot record is required for the data block to be updated, to obtain a snapshot sequence corresponding to the data block to be updated.
The snapshot sequence consists of snapshot files corresponding to each snapshot time.
In a specific implementation, the computer device may respond to a data snapshot creation request for the source volume, determine whether a data block to be updated in the source volume has been snapshot recorded by acquiring bitmap information corresponding to the data block in the source volume, and determine at least one snapshot time when the snapshot recording is required for the data block to be updated if the data block to be updated is not snapshot recorded, so as to obtain a snapshot sequence corresponding to the data block to be updated, where the snapshot sequence is composed of snapshot files corresponding to each snapshot time.
Step S120, compressing the target data block set of the data block to be updated in the source volume to obtain a compressed data block.
Wherein the target data block set consists of consecutive data blocks, and comprises data blocks to be updated.
In a specific implementation, the computer device may obtain a target data block set to which the data block to be updated belongs in the source volume, and compress the target data block set by adopting a preset compression algorithm to obtain a compressed data block.
Specifically, the computer device may use a block IO filter driver module in the source volume system to intercept a write operation on the source volume and determine the address of the data block being written. From this it determines the block number (BlockNumber) of the data block to be updated, that is, the data block identifier. The data block identifier gives the bitmap position of the data block to be updated in the bitmap information, and the bitmap position gives the address range of the consecutive data blocks that form the target data block set in the source volume; reading the consecutive data blocks in that address range yields the target data block set. Block numbers in the source volume start from 0.
In practical application, if one bit in the bitmap information represents N consecutive data blocks in the source volume, the bitmap position of the data block to be updated is position = int(BlockNumber / N), that is, the quotient rounded down. Counting data blocks from 1, the target data block set then runs from the (position × N + 1)-th to the ((position + 1) × N)-th data block, which corresponds to block numbers position × N through (position + 1) × N − 1.
For example, if BlockNumber is 16 (that is, the data block to be updated is the 17th data block in the source volume) and N = 16, then position = 1, and the computer device needs to read the 17th to 32nd data blocks in the source volume (block numbers 16 to 31).
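The mapping from block number to bitmap position and block range can be checked with a short sketch. The function names below are hypothetical; the arithmetic follows the formula position = int(BlockNumber / N), with block numbers starting from 0.

```python
def bitmap_position(block_number: int, blocks_per_bit: int) -> int:
    """Bitmap position of a data block when one bit covers blocks_per_bit blocks."""
    return block_number // blocks_per_bit


def target_block_range(block_number: int, blocks_per_bit: int) -> tuple:
    """First and last 0-based block numbers of the target data block set."""
    position = bitmap_position(block_number, blocks_per_bit)
    first = position * blocks_per_bit
    last = (position + 1) * blocks_per_bit - 1
    return first, last


# Worked example from the text: BlockNumber = 16, N = 16.
print(bitmap_position(16, 16))      # 1
print(target_block_range(16, 16))   # (16, 31), i.e. the 17th to 32nd data blocks
```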
In addition, the preset compression algorithm may be a lossless nonlinear compression algorithm, for example a general-purpose algorithm such as zip.
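As an illustration of compressing a target data block set, the sketch below concatenates the consecutive data blocks and compresses them with Python's standard zlib module (DEFLATE, the method commonly used in zip archives). The choice of zlib, the 4096-byte block size and the function names are assumptions made for the example; the text only requires a lossless algorithm.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size in bytes, not specified in the text


def compress_block_set(blocks):
    """Compress the consecutive data blocks of one target data block set."""
    return zlib.compress(b"".join(blocks))


def decompress_block_set(payload, block_size=BLOCK_SIZE):
    """Restore the individual data blocks from a compressed data block."""
    raw = zlib.decompress(payload)
    return [raw[i:i + block_size] for i in range(0, len(raw), block_size)]
```

Because the compression is lossless, decompressing the stored payload returns exactly the blocks that were read from the source volume before they were overwritten.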
Step S130, storing the compressed data block into a linked snapshot file in the snapshot sequence.
The link snapshot file is a snapshot file corresponding to the link snapshot time in at least one snapshot time.
Wherein the link snapshot time is earlier than any snapshot time other than the link snapshot time in the at least one snapshot time.
The snapshot sequence is obtained by sorting the snapshot files from earliest to latest by their corresponding snapshot times; that is, the link snapshot file is the first snapshot file in the snapshot sequence.
Snapshot files may be named CCOW (Compression Copy on Write, compressed copy-on-write) files.
In a specific implementation, the computer device may determine an earliest snapshot time in at least one snapshot time of a snapshot record to be performed on a data block to be updated, take the earliest snapshot time as a link snapshot time, and take a snapshot file corresponding to the link snapshot time as a link snapshot file in a snapshot sequence, so that the compressed data block may be stored in the link snapshot file.
Step S140, establishing a link relation between the link snapshot file and the snapshot file corresponding to each snapshot time other than the link snapshot time, so as to back up the data blocks in the source volume.
In a specific implementation, the computer device may establish a link relationship between the snapshot file corresponding to each snapshot time except the link snapshot time and the link snapshot file according to the index information of the data blocks in the snapshot file, so as to backup the data blocks in the source volume.
In the data backup method described above, in response to a data snapshot creation request for the source volume, and in the case that the data block to be updated in the source volume has not been snapshot recorded, at least one snapshot time at which a snapshot record is required for the data block to be updated is determined, and a snapshot sequence corresponding to the data block to be updated is obtained; the snapshot sequence consists of the snapshot files corresponding to each snapshot time. The target data block set of the data block to be updated in the source volume is compressed to obtain a compressed data block, and the compressed data block is stored in the link snapshot file in the snapshot sequence; the link snapshot file is the snapshot file corresponding to the link snapshot time, which is earlier than any other snapshot time in the at least one snapshot time. A link relation is then established between the link snapshot file and the snapshot file corresponding to each snapshot time other than the link snapshot time, so as to back up the data blocks in the source volume. Because the data is compressed first during snapshot generation, the data volume of the input/output (IO) requests and the size of the link snapshot file are both reduced, which improves the read-write efficiency of the server disk. When there are multiple snapshot times, the compressed data block corresponding to the data block to be updated would otherwise have to be written into the snapshot file for every snapshot time; instead, the compressed data block is written only into the link snapshot file, and the snapshot files for the other snapshot times are linked to it. No data write is therefore needed for the remaining snapshot files in the snapshot sequence, which greatly reduces the number and data volume of input/output requests, addresses the drop in input/output performance with multiple snapshots, and further improves the read-write efficiency of the server disk.
In one embodiment, establishing a link relationship between a snapshot file corresponding to each snapshot time except the link snapshot time and the link snapshot file to backup a data block in the source volume includes: updating the data block index information corresponding to the compressed data block in the link snapshot file to obtain link index information; and linking the data block index information of the compressed data block in the snapshot files corresponding to all snapshot times except the link snapshot time to the link index information in a pointer mode so as to establish a link relation.
Wherein each snapshot file includes a block extension information file and a data file.
The block extension information file of each snapshot file comprises bitmap information corresponding to the data blocks in the source volume at the corresponding snapshot time and data block index information corresponding to the data blocks in the source volume at the corresponding snapshot time.
Wherein the data file is used for storing the compressed data blocks.
The data block index information consists of a bitmap position corresponding to the data block to be updated in the bitmap information, an address range corresponding to a target data block set to which the data block to be updated belongs in a source volume, and an offset and a size of the compressed data block in a data file.
In a specific implementation, in a process of establishing a link relation between a snapshot file corresponding to each snapshot time except the link snapshot time and the link snapshot file by the computer device so as to backup a data block in a source volume, the computer device can update data block index information corresponding to the compressed data block in the link snapshot file to obtain the data block index information corresponding to the compressed data block in the link snapshot file as link index information.
And then, the computer equipment can link the data block index information of the compressed data block to the link index information in an ordered chain pointer mode in the snapshot files corresponding to all snapshot times except the link snapshot time so as to establish a link relation.
According to the technical scheme, the link index information is obtained by updating the data block index information corresponding to the compressed data block in the link snapshot file; linking the data block index information of the compressed data block in the snapshot files corresponding to all snapshot moments except the linked snapshot moment to the link index information in a pointer mode so as to establish a link relation; therefore, based on the link relation, the snapshot file where the compressed data block is located can be quickly queried according to the data block index information corresponding to the compressed data block in any snapshot file, and data writing operation is not needed to be performed on the snapshot files except the link snapshot file, so that the frequency and the data quantity of input and output requests are greatly reduced, and the read-write efficiency of a server disk is effectively improved.
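The storage-plus-linking scheme can be sketched with two small data structures. This is an in-memory illustration under assumptions, not the on-disk CCOW format: `IndexEntry` and `SnapshotFile` are hypothetical names, the block extension information is reduced to a dictionary keyed by bitmap position, and the pointer is modeled as a Python object reference to the snapshot file that holds the data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexEntry:
    position: int                          # bitmap position of the data block set
    block_range: tuple                     # (first, last) block numbers in the source volume
    offset: Optional[int] = None           # offset of the compressed block in the data file
    size: Optional[int] = None             # size of the compressed block in the data file
    link: Optional["SnapshotFile"] = None  # set when the data lives in another snapshot file


@dataclass
class SnapshotFile:
    time: int
    index: dict = field(default_factory=dict)           # block extension information
    data: bytearray = field(default_factory=bytearray)  # data file with compressed blocks


def record_compressed_block(snapshots, position, block_range, payload):
    """Write payload once, into the earliest snapshot file, and link the others to it."""
    ordered = sorted(snapshots, key=lambda s: s.time)
    link_file = ordered[0]
    link_file.index[position] = IndexEntry(
        position, block_range, offset=len(link_file.data), size=len(payload))
    link_file.data += payload
    for snap in ordered[1:]:
        # No data is written here, only an index entry pointing at the link file.
        snap.index[position] = IndexEntry(position, block_range, link=link_file)
    return link_file


def read_compressed_block(snapshot, position):
    """Follow the link, if any, and return the compressed block."""
    entry = snapshot.index[position]
    holder = entry.link if entry.link is not None else snapshot
    local = holder.index[position]
    return bytes(holder.data[local.offset:local.offset + local.size])
```

With three snapshot files, only the earliest one grows; the other two gain one index entry each, which is where the saving in write requests comes from.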
In one embodiment, the method further comprises: in response to a delete instruction for the link snapshot file, determining, as target link index information, the data block index information in the link snapshot file whose link relation points to the local file; appending a copy of the data block corresponding to the target link index information in the link snapshot file to the priority reference snapshot file to obtain a copied data block; updating the data block index information corresponding to the copied data block in the priority reference snapshot file to index information pointing to the local file, to obtain the local link index information corresponding to the copied data block in the priority reference snapshot file; and updating the data block index information that is linked to the target link index information in the remaining reference snapshot files, which correspond to the remaining reference snapshot times, so that it is linked to the local link index information, and deleting the link snapshot file.
The priority reference snapshot file is a snapshot file corresponding to the priority reference snapshot time in the reference snapshot file.
Wherein the data block index information in the reference snapshot file is linked to the target link index information.
The time interval between the priority reference snapshot time and the link snapshot time is smaller than the time interval between any other reference snapshot time except the priority reference snapshot time and the link snapshot time in the reference snapshot time.
The reference snapshot time is the snapshot time corresponding to the reference snapshot file.
In a specific implementation, the computer device may, in response to a delete instruction for the link snapshot file, determine the data block index information in the link snapshot file whose link relation points to the local file, and take it as the target link index information. Specifically, to maintain data integrity, the computer device may traverse the data block index information in the snapshot file corresponding to each snapshot time after the link snapshot time to detect the snapshot files that are linked to the link snapshot file, and thereby determine which data block index information in the link snapshot file points to the local file.
The computer device may then append a copy of the data block corresponding to the target link index information in the link snapshot file to the priority reference snapshot file to obtain the copied data block, and update the data block index information corresponding to the copied data block in the priority reference snapshot file to index information pointing to the local file, thereby obtaining the local link index information corresponding to the copied data block in the priority reference snapshot file.
Finally, the computer device may update the data block index information that is linked to the target link index information in the remaining reference snapshot files, which correspond to the remaining reference snapshot times, so that it is linked to the local link index information, and then delete the link snapshot file.
According to the technical scheme of this embodiment, in response to a deletion instruction for the link snapshot file, the locally pointing data block index information in the link snapshot file that is the target of a link relationship is determined as target link index information; the data block corresponding to the target link index information in the link snapshot file is appended as a copy to the priority reference snapshot file to obtain a copied data block, where the priority reference snapshot file is the reference snapshot file corresponding to the priority reference snapshot time, the data block index information in each reference snapshot file is linked to the target link index information, the reference snapshot time is the snapshot time corresponding to a reference snapshot file, and the time interval between the priority reference snapshot time and the link snapshot time is smaller than the time interval between any other reference snapshot time and the link snapshot time; the index information corresponding to the copied data block in the priority reference snapshot file is updated to index information pointing to local data, yielding local link index information; and, in the remaining reference snapshot files corresponding to the remaining reference snapshot times, the data block index information linked to the target link index information is updated to link to the local link index information, after which the link snapshot file is deleted. In this way, when the link snapshot file needs to be deleted, the data block corresponding to the target link index information is appended to the priority reference snapshot file, and the link relationships between the reference snapshot files and the link snapshot file are redirected to the local link index information of the copied data block in the priority reference snapshot file, so that data integrity is preserved after the link snapshot file is deleted and the reliability of the data snapshot is improved.
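For illustration only, the deletion procedure of this embodiment can be sketched in Python as follows. The in-memory layout (a dictionary keyed by snapshot time, with index entries of the form `("local", offset, size)` or `("link", time)`) and the function name `delete_link_snapshot` are hypothetical and are not taken from the described method; the sketch also assumes that every linked entry in the link snapshot file points to local data.

```python
def delete_link_snapshot(snapshots: dict, link_time: int) -> None:
    """Delete the snapshot at link_time while keeping linked data readable.

    snapshots maps a snapshot time to {"data": bytearray, "index": {...}}.
    Each index entry is ("local", offset, size) or ("link", snapshot_time).
    This layout is an assumption made for the sketch.
    """
    link = snapshots[link_time]

    # Traverse the other snapshots and collect, per bitmap position,
    # the reference snapshot times whose entries link to the link snapshot.
    refs = {}
    for t in sorted(snapshots):
        if t == link_time:
            continue
        for pos, entry in snapshots[t]["index"].items():
            if entry == ("link", link_time):
                refs.setdefault(pos, []).append(t)

    for pos, times in refs.items():
        # Target link index information: assumed to point to local data.
        _, offset, size = link["index"][pos]
        # Priority reference snapshot: the reference time closest to link_time.
        priority = min(times, key=lambda t: abs(t - link_time))
        target = snapshots[priority]
        # Append a copy of the (still compressed) block to the priority file.
        new_offset = len(target["data"])
        target["data"] += link["data"][offset:offset + size]
        target["index"][pos] = ("local", new_offset, size)
        # Redirect the remaining reference snapshots to the priority snapshot.
        for t in times:
            if t != priority:
                snapshots[t]["index"][pos] = ("link", priority)

    del snapshots[link_time]
```

In this sketch the block is copied in its stored form, so no decompression is needed during deletion.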
In one embodiment, establishing a link relationship between a snapshot file corresponding to each snapshot time except the link snapshot time and the link snapshot file to backup a data block in the source volume includes: responding to a data backup request of the source volume at the full-volume backup moment, and reading bitmap information corresponding to the source volume at the full-volume backup moment; and reading each data block set according to the data record judgment mark corresponding to each bitmap position in the bitmap information so as to write the data blocks in each data block set into a preset backup file.
Wherein, each bitmap position has a one-to-one mapping relation with each data block set.
In a specific implementation, in a process of establishing a link relation between a snapshot file corresponding to each snapshot time except a link snapshot time and the link snapshot file so as to backup a data block in a source volume, the computer equipment can respond to a data backup request of a full backup time for the source volume and read bitmap information corresponding to the full backup time of the source volume; and reading each data block set in different modes according to the data record judgment marks corresponding to each bitmap position in the bitmap information so as to write the data blocks in each data block set into a preset backup file.
Specifically, one bit (bit) in the bitmap information characterizes N continuous data blocks in the corresponding source volume, and the N continuous data blocks form a data block set, so that each bitmap position in the bitmap information has a one-to-one mapping relationship with each data block set.
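As a minimal sketch of this mapping, the following Python helpers convert between a data block number and its bitmap position. The function names are hypothetical, positions are 0-based, and the values N = 16 and a 4 KB block size are example values assumed for the sketch.

```python
BLOCK_SIZE = 4 * 1024   # bytes per data block (assumed example value)
BLOCKS_PER_SET = 16     # N: data blocks represented by one bitmap bit (assumed)


def bitmap_position(block_number: int, blocks_per_set: int = BLOCKS_PER_SET) -> int:
    """Return the 0-based bitmap position that covers the given data block."""
    return block_number // blocks_per_set


def block_set_range(position: int, blocks_per_set: int = BLOCKS_PER_SET) -> range:
    """Return the block numbers forming the data block set at a bitmap position."""
    start = position * blocks_per_set
    return range(start, start + blocks_per_set)


def set_byte_range(position: int, block_size: int = BLOCK_SIZE,
                   blocks_per_set: int = BLOCKS_PER_SET) -> tuple:
    """Return the (start, end) byte offsets of a data block set in the source volume."""
    start = position * blocks_per_set * block_size
    return start, start + blocks_per_set * block_size
```

Because every block number maps to exactly one position and every position to exactly one contiguous range, the mapping is one-to-one between bitmap positions and data block sets.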
According to the technical scheme, bitmap information corresponding to the full-volume backup moment of the source volume is read through responding to a data backup request of the full-volume backup moment for the source volume; reading each data block set according to the data record judgment mark corresponding to each bitmap position in the bitmap information so as to write the data blocks in each data block set into a preset backup file; mapping relation exists between each bitmap position and each data block set in a one-to-one correspondence manner; therefore, each data block set in the source volume can be accurately read according to the mapping relation between the bitmap position and the data block set and the data record judgment mark corresponding to each bitmap position in the bitmap information, so that the data blocks in the data block set can be backed up with high reliability.
In one embodiment, reading each data block set according to the data record determination identifier corresponding to each bitmap position in the bitmap information, so as to write the data blocks in each data block set into a preset backup file, including: judging whether the current data block set corresponding to the current bitmap position is recorded by a snapshot or not according to the data record judging mark corresponding to the current bitmap position; under the condition that the current data block set is not recorded by the snapshot, reading the data blocks in the current data block set in the source volume to write the full backup file; under the condition that the current data block set is judged to be recorded by the snapshot, reading the current compressed data block from the snapshot file recorded with the current compressed data block corresponding to the current data block set according to the data block index information corresponding to the current data block set at the full backup time; decompressing the current compressed data blocks to write the data blocks in the current data block set into a full backup file; and returning to the step of judging whether the current data block set corresponding to the current bitmap position is recorded by the snapshot according to the data record judgment mark corresponding to the current bitmap position until the data blocks in each data block set in the source volume are written into the full backup file.
The bitmap information may be named sparse bitmap information in practical application.
In a specific implementation, the computer device reads each data block set according to the data record determination identifier corresponding to each bitmap position in the bitmap information, so as to write the data blocks in each data block set into the preset backup file. For ease of understanding by those skilled in the art, fig. 2 provides a schematic flow chart of a full data backup method. As shown in fig. 2, when the first full backup starts, the computer device may intercept a read operation for the source volume system through the BIO filter driving module and read the bitmap information corresponding to the source volume at the full backup time, so that the computer device may look up the data record determination identifier corresponding to the current bitmap position in the bitmap information and determine whether the current data block set corresponding to the current bitmap position has been snapshot-recorded. For example, if the data record determination identifier is "0", the current data block set has not been snapshot-recorded; if the data record determination identifier is "1", the current data block set has been snapshot-recorded.
When the computer device determines that the current data block set has not been snapshot-recorded, it reads the data blocks in the current data block set from the source volume (i.e., the source volume in fig. 2) and writes them into the full backup file. When the current data block set is determined to have been snapshot-recorded, the computer device determines, according to the corresponding data block index information in the block information table for the current data block set at the full backup time, the position of the current compressed data block within the data file of the snapshot file (CCOW file), reads the current compressed data block from that data file, decompresses it to obtain the data blocks in the current data block set, and writes them into the full backup file.
Then, if the data blocks in every data block set have not yet been written into the backup file, that is, the backup has not been completed, the computer device may return to the step of intercepting the read operation for the source volume system through the BIO filter driving module, so as to determine, according to the data record determination identifier corresponding to the current bitmap position, whether the current data block set has been snapshot-recorded. This continues until the data blocks in every data block set in the source volume have been written into the full backup file; that is, after all data blocks have been traversed, the snapshot file is deleted and the full backup is complete.
According to the technical scheme of the embodiment, whether the current data block set corresponding to the current bitmap position is subjected to snapshot recording is judged according to the data record judging mark corresponding to the current bitmap position; under the condition that the current data block set is not recorded by the snapshot, reading the data blocks in the current data block set in the source volume to write the full backup file; under the condition that the current data block set is judged to be recorded by the snapshot, reading the current compressed data block from the snapshot file recorded with the current compressed data block corresponding to the current data block set according to the data block index information corresponding to the current data block set at the full backup time; decompressing the current compressed data blocks to write the data blocks in the current data block set into a full backup file; returning to the step of judging whether the current data block set corresponding to the current bitmap position is recorded by the snapshot according to the data record judgment mark corresponding to the current bitmap position until the data blocks in each data block set in the source volume are written into the full backup file; therefore, whether the current data block set is recorded in the snapshot is taken as a judging condition, the data blocks in the current data block set are accurately read from the source volume or the snapshot file to be written into the full-volume backup file, and the integrity of the data in the source volume in the full-volume backup process is ensured.
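The full backup loop described above can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: the source volume and the snapshot data file are modeled as in-memory byte strings, `zlib` stands in for the unspecified lossless compression algorithm, and the function name and the `index` layout (bitmap position mapped to an `(offset, size)` pair) are assumptions.

```python
import zlib

SET_SIZE = 64 * 1024  # bytes per data block set (assumed example value)


def full_backup(source: bytes, bitmap: list, index: dict,
                data_file: bytes, set_size: int = SET_SIZE) -> bytes:
    """Build a full backup image by walking the bitmap position by position.

    bitmap[i] == 0: set i was not snapshot-recorded, so it is read from the
                    source volume.
    bitmap[i] == 1: set i was snapshot-recorded, so its compressed copy is
                    read from the snapshot data file via index[i] = (offset,
                    size) and decompressed.
    """
    backup = bytearray()
    for position, recorded in enumerate(bitmap):
        if recorded:
            offset, size = index[position]
            backup += zlib.decompress(data_file[offset:offset + size])
        else:
            start = position * set_size
            backup += source[start:start + set_size]
    return bytes(backup)
```

The branch on the data record determination identifier is what keeps the backup consistent: a set modified after the snapshot is taken is restored from its recorded original rather than from the live volume.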
In one embodiment, after the step of writing the data blocks in each data block set in the source volume into the full backup file, the method further comprises: in response to a data incremental backup request for the source volume at the incremental backup time, acquiring the bitmap log identifier corresponding to each bitmap position; in the case where an update has occurred, reading the bitmap information corresponding to the source volume at the incremental backup time; and reading the current data block set according to the data record determination identifier corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time, so as to write the incremental backup file.
The bitmap log identifier is used for judging whether the current data block set corresponding to the current bitmap position is updated in the period between the full backup time and the incremental backup time.
In a specific implementation, after the computer device has written the data blocks in the data block sets in the source volume into the full backup file at the full backup time, it switches to a log working mode until the next backup of the source volume system is required. In this mode, the computer device only records the bitmap information corresponding to the data block sets updated between the full backup time and the current backup time, and does not read or record the original data blocks in those data block sets.
For ease of understanding by those skilled in the art, FIG. 3 provides a flow chart of a method for incremental backup of data. As shown in fig. 3, the computer device may intercept, by using the BIO filter driver module, a read operation for the source volume system in response to a data incremental backup request for the source volume at an incremental backup time, so as to read a bitmap log identifier corresponding to each bitmap position, where the bitmap log identifier is used to determine whether an update occurs in a current data block set corresponding to a current bitmap position in a period between the full backup time and the incremental backup time. Specifically, each bitmap position corresponds to one sparse bitmap log, and the identifier corresponding to each sparse bitmap log is used as the bitmap log identifier corresponding to each bitmap position.
In practical application, if the bitmap log identifier is "0", the current data block set corresponding to the current bitmap position has not been updated in the period between the full backup time and the incremental backup time, and no processing is needed; if the bitmap log identifier is "1", the current data block set corresponding to the current bitmap position has been updated in the period between the full backup time and the incremental backup time.
Under the condition of updating, the computer equipment can read bitmap information corresponding to the source volume at the incremental backup moment; and reading the current data block set in different modes according to the data record judgment identification corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time so as to write the data blocks in the current data block set into a preset incremental backup file.
Specifically, the computer device may determine, according to the data record determination identifier corresponding to the current bitmap position, whether the current data block set corresponding to the current bitmap position has been snapshot-recorded. For example, if the data record determination identifier is "0", the current data block set has not been snapshot-recorded; if the data record determination identifier is "1", the current data block set has been snapshot-recorded.
When the computer device determines that the current data block set has not been snapshot-recorded, it reads the data blocks in the current data block set from the source volume (i.e., the source volume in FIG. 3) and writes them into the incremental backup file. When the current data block set is determined to have been snapshot-recorded, the computer device determines, according to the corresponding data block index information in the block information table for the current data block set at the incremental backup time, the position of the current compressed data block within the data file of the snapshot file (CCOW file), reads the current compressed data block from that data file, decompresses it to obtain the data blocks in the current data block set, and writes them into the incremental backup file.
Then, the computer equipment can return to the step of intercepting the read operation aiming at the source volume system through the BIO filtering driving module so as to read the bitmap log marks corresponding to the bitmap positions until the bitmap log marks corresponding to all bitmap positions are judged, namely after all data blocks are traversed, the snapshot file is deleted, and incremental backup is finished.
According to the technical scheme, bitmap log identifiers corresponding to the positions of all bitmaps are obtained by responding to a data incremental backup request of a source volume at the incremental backup moment; the bitmap log identifier is used for judging whether the current data block set corresponding to the current bitmap position is updated in the period between the full backup time and the incremental backup time; under the condition of updating, reading bitmap information corresponding to the incremental backup time of the source volume; reading a current data block set according to the data record judging identification corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time so as to write the incremental backup file; in this way, the bitmap log marks and records the current data block set corresponding to the current bitmap position, whether the current data block set is updated in the time period between the full backup time and the incremental backup time or not is judged, and the current data block set is read again to write the incremental backup file under the condition of updating, so that the load on the input and output requests of the system is smaller, and the read-write efficiency of the server disk is improved; and the full backup or incremental backup speed of the next time can be accelerated based on the bitmap log identification, so that the data backup efficiency is improved.
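A minimal sketch of the incremental pass is given below. It assumes the same simplified in-memory model as a byte-string source volume and data file, uses `zlib` as a stand-in for the unspecified compression algorithm, and returns the changed sets as a dictionary; the function name and return shape are hypothetical.

```python
import zlib


def incremental_backup(source: bytes, bitmap_log: list, bitmap: list,
                       index: dict, data_file: bytes, set_size: int) -> dict:
    """Return {bitmap position: set data} for sets updated since the full backup.

    bitmap_log[i] == 0: set i was not updated in the period, so it is skipped.
    bitmap_log[i] == 1: set i was updated; it is read from the snapshot data
                        file if snapshot-recorded (bitmap[i] == 1), otherwise
                        from the source volume.
    """
    changed = {}
    for position, updated in enumerate(bitmap_log):
        if not updated:
            continue  # no update since the full backup, nothing to read
        if bitmap[position]:
            offset, size = index[position]
            changed[position] = zlib.decompress(data_file[offset:offset + size])
        else:
            start = position * set_size
            changed[position] = source[start:start + set_size]
    return changed
```

Only positions whose bitmap log identifier is set trigger any read, which is where the reduced I/O load described above comes from.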
In another embodiment, as shown in fig. 4, a data backup method is provided, which is exemplified as the method applied to a computer device, and includes the following steps:
Step S410, in response to a data snapshot creation request for the source volume, in the case that the data block to be updated in the source volume has not been snapshot-recorded, determining at least one snapshot time at which snapshot recording is required for the data block to be updated, and obtaining a snapshot sequence corresponding to the data block to be updated.
Step S420, compressing the target data block set to which the data block to be updated belongs in the source volume to obtain a compressed data block.
Step S430, storing the compressed data block into the link snapshot file in the snapshot sequence.
Step S440, updating, in the link snapshot file, the data block index information corresponding to the compressed data block to obtain the link index information.
Step S450, linking, by means of a pointer, the data block index information of the compressed data block in the snapshot files corresponding to the snapshot times other than the link snapshot time to the link index information, so as to establish a link relationship.
Step S460, in response to a deletion instruction for the link snapshot file, determining, as the target link index information, the locally pointing data block index information in the link snapshot file that is the target of a link relationship.
Step S470, appending a copy of the data block corresponding to the target link index information in the link snapshot file to the priority reference snapshot file to obtain the copied data block.
Step S480, updating the index information corresponding to the copied data block in the priority reference snapshot file to index information pointing to local data, so as to obtain the local link index information for the copied data block in the priority reference snapshot file.
Step S490, updating, in the remaining reference snapshot files corresponding to the remaining reference snapshot times, the data block index information linked to the target link index information so that it links to the local link index information, and deleting the link snapshot file.
It should be noted that, for the specific limitations of the above steps, reference may be made to the specific limitations of the data backup method described above.
In one embodiment, for convenience of those skilled in the art, fig. 5 provides a system block diagram of a snapshot backup recovery system, which includes a source volume system (in practical application, may be named as a backup source), a snapshot backup service (in practical application, may be named as a snapshot module) and a storage system (in practical application, may be named as a backup storage pool), and all three may be connected through a network.
In the source volume system, the BIO (block I/O) filter driving module (i.e., the BIO driver) obtains the address at which the system operates on a data block of the source data volume. In snapshot mode, the filter driving module intercepts read and write operations for the source volume system and reads the data record determination identifier corresponding to the current bitmap position in the sparse bitmap information, so as to determine whether the data block to be updated corresponding to the current bitmap position has been recorded. If the data block to be updated has been recorded, the source volume system completes the data update operation at the address corresponding to the data block to be updated in the source volume; if not, the filter driving module reads the target data block set to which the data block to be updated belongs, compresses it, and transmits it to the snapshot backup service to be stored in the data file of the CCOW file, and the block expansion information file (i.e., the sparse bitmap information and the data block index information) in the CCOW file is updated at the same time. To obtain the data record determination identifier corresponding to the current bitmap position quickly, a copy of the sparse bitmap information (i.e., sparse bitmap copy information) can generally be kept in the source volume system, and the sparse bitmap copy information is updated at the same time once the snapshot backup service has completed.
The snapshot backup service communicates with the source volume system and the storage system through the communication module and stores a CCOW file, which comprises a block expansion information file and a data file. The block expansion information file consists of sparse bitmap information and data block index information. The sparse bitmap information uses 1 bit to identify whether the data of N blocks/sectors in the corresponding source volume has been updated. For example, if the data block size of the disk file system in the source volume is 4 KB and 1 bit of the sparse bitmap information represents 16 (N=16) continuous data blocks in the source volume system, those 16 continuous data blocks form one data block set, and the data block set corresponding to each bitmap position in the sparse bitmap information is 4×16=64 KB in size. When any data block in a data block set of the source volume system is updated, the 64 KB data block set is compressed and transmitted to the snapshot backup service only on the first such update; after the snapshot is completed, the data record determination identifier corresponding to the bitmap position in the sparse bitmap information is changed from 0 to 1, and if sparse bitmap copy information exists in the source volume system, it must be updated at the same time. Because the compression process typically uses a lossless non-linear compression algorithm, the size of a compressed data block changes from the standard 64 KB to a dynamic size. The data file in the CCOW file is therefore written in append mode: each new compressed data block is written directly to the end of the data file, and the block expansion information file is updated at the same time.
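The relationship between the sparse bitmap, the data block index information and the append-only data file can be illustrated with the following sketch. The class name `CcowFile`, its fields, and the use of `zlib` as the lossless compressor are assumptions made for illustration; the described CCOW file format is not specified at this level of detail.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class CcowFile:
    """Simplified in-memory model of a CCOW file (hypothetical layout)."""
    bitmap: list                                   # one flag per data block set
    index: dict = field(default_factory=dict)      # position -> (offset, size)
    data: bytearray = field(default_factory=bytearray)

    def record(self, position: int, block_set: bytes) -> bool:
        """Record a data block set on its first update; return True if stored."""
        if self.bitmap[position]:
            return False  # already snapshot-recorded, nothing to do
        compressed = zlib.compress(block_set)
        # Compressed size varies, so the offset is wherever the file ends now.
        self.index[position] = (len(self.data), len(compressed))
        self.data += compressed
        self.bitmap[position] = 1
        return True

    def read(self, position: int) -> bytes:
        """Return the original (decompressed) data block set at a position."""
        offset, size = self.index[position]
        return zlib.decompress(bytes(self.data[offset:offset + size]))
```

Keeping the index separate from the data lets variable-size compressed blocks be appended without reserving fixed 64 KB slots.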
The backup recovery proxy service in the snapshot backup service performs full or incremental backup of the data blocks in the source volume through the CCOW file; when the backup file uses the compression mode, a block expansion information file must also be generated. The backup recovery proxy service may retrieve the full or incremental backup file from the backup storage pool and, if the backup file uses the compression mode, decompress each compressed data block according to the data block index information in the block expansion information file, where the decompressed data blocks all have the same size, for example 64 KB.
The source volume system and the snapshot backup service can be deployed in the same server, and the operation of reading the data blocks in the source volume can be realized by directly accessing the disk data in the full or incremental backup working process of the data blocks in the source volume through the CCOW file.
According to the technical scheme of the embodiment, the dynamic characteristics of the compressed data size are combined, and the mode of separating the block expansion information file from the data file is adopted, so that the data file in the snapshot file can be dynamically increased according to the actual change condition of the source volume data, and the self-adaptive growth function of the data file is realized.
In one embodiment, for ease of understanding by those skilled in the art, FIG. 6 provides a flow chart of a compressed copy-on-write (CCOW) snapshot method. As shown in fig. 6, during a snapshot, when a write to a disk block of the source data volume occurs, the write operation for the source volume system may be intercepted by the BIO (block I/O) filter driving module, which reads the local sparse bitmap copy information and checks the binary information (i.e., the data record determination identifier) of the bitmap position corresponding to the data block to be updated. If the binary information is 0, the BIO filter driving module reads from the source volume the target data block set to which the data block to be updated belongs, compresses it to obtain a compressed data block, and updates the data record determination identifier of the corresponding bitmap position in the local sparse bitmap copy information; it then transmits the compressed data block and the corresponding data block index information to the snapshot backup service; finally, the BIO filter driving module completes the data update operation at the address corresponding to the data block to be updated in the source volume. If the binary information is 1, the data update operation at the address corresponding to the data block to be updated in the source volume is completed directly.
After receiving the compressed data blocks and the corresponding data block index information, the backup recovery proxy service in the snapshot backup service writes the compressed data blocks into the data files in the CCOW file in an additional mode, and updates the block expansion information files (namely sparse bitmap information and data block index information) in the CCOW file.
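The write path of this flow can be sketched as follows. This is a user-space illustration only, not a kernel filter driver: the volume is a `bytearray`, the snapshot backup service is represented by a callback, `zlib` stands in for the unspecified compressor, and the sketch assumes a write never spans two data block sets. All names are hypothetical.

```python
import zlib


def intercept_write(volume: bytearray, bitmap_copy: list, offset: int,
                    payload: bytes, send_to_snapshot_service, set_size: int) -> None:
    """Apply a write to the volume, preserving the original set on first update.

    Assumes the write stays inside a single data block set.
    """
    position = offset // set_size
    if bitmap_copy[position] == 0:
        # First update of this set: read the original data before it changes.
        start = position * set_size
        original = bytes(volume[start:start + set_size])
        bitmap_copy[position] = 1
        send_to_snapshot_service(position, zlib.compress(original))
    # In both cases the update itself is then applied to the source volume.
    volume[offset:offset + len(payload)] = payload
```

A second write to the same set finds the identifier already at 1 and goes straight to the volume, so each set is copied at most once per snapshot.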
Figs. 7 (a) to 7 (c) illustrate the generation and updating of CCOW files in multi-snapshot mode in accordance with one or more embodiments, detailing how the block expansion information files and data files are updated at the subsequent snapshot times T1, T2, and so on, after the system completes an initial snapshot at time T0.
Fig. 7 (a) depicts the disk layout of a source volume. Each data block has a size of 4 KB, and every 16 (N=16) consecutive data blocks correspond to one bit in the sparse bitmap information in the CCOW file; all source volume data blocks are traversed in turn, and all bits of the sparse bitmap information are 0 at initialization. If the data of the source volume changes at snapshot time T1, that is, when the BIO filter driving module intercepts a write operation for the source volume system, the block number (i.e., the data block identifier) corresponding to the data block to be updated is obtained first, the corresponding bitmap position in the sparse bitmap information is calculated from it, and the data record determination identifier "0" at that bitmap position is inverted to "1"; if the corresponding data record determination identifier is already "1", it is left unchanged. In fig. 7 (a), data blocks are updated in the first data block set A (consisting of data blocks A1 to A16), the second data block set B (consisting of data blocks B1 to B16), and the tenth data block set C (consisting of data blocks C1 to C16) in the source volume. For example, data block A2 in data block set A is updated to A2', data block B1 in data block set B is updated to B1', and data block C3 in data block set C is updated to C3'; data block set A thus becomes A', data block set B becomes B', and data block set C becomes C'. The data record determination identifiers "0" at the 1st, 2nd and 10th bitmap positions in the sparse bitmap information are inverted to "1", recording the positions of the disk data blocks in the source volume that changed during the snapshot.
In addition, the data blocks in data block set A, data block set B and data block set C need to be read from the source volume, compressed, and stored in the data file in the CCOW file; the size of each compressed block generally differs.
The data block index table (i.e., the block information table in the above embodiment) records the data block index information, which includes the bitmap position (i.e., the bitmap ID) corresponding to the data block to be updated in the bitmap information, the address range in the source volume of the data block set to which the data block to be updated belongs (i.e., the source volume range, SourceBlock), and the offset and size of the compressed data block in the data file (i.e., the data file offset and size). The bitmap ID (BlockBitmapID) records the bitmap position updated in the sparse bitmap information; the source volume range (SourceBlock) records the address range in the source volume of the data blocks corresponding to that bitmap position; and the data file offset and size record the offset location and size of the compressed data block in the data file.
The data file records a data block set corresponding to the compressed data block and the size of the compressed data block.
In the whole snapshot process, the information of the data block to be updated is continuously additionally recorded in the data block index table, and the data file also uses an additional writing mode, so that the size of the CCOW file can be dynamically increased. Furthermore, if a data block set has been snapshot-recorded, when a data block in the data block set is subsequently changed, since the binary information of the corresponding bitmap position is already "1", then the snapshot operation on the data block set does not need to be performed again.
FIG. 7 (b) depicts the structure of the CCOW file when a snapshot is opened at snapshot time T2, after snapshot time T1. At snapshot time T2, the data in the first data block set is updated, so that data block set A' becomes A''; meanwhile, the data in the eleventh data block set D (consisting of data blocks D1 to D16) is also updated. To meet the data consistency requirement of the snapshots, the data in data block set A' needs to be recorded in the snapshot file corresponding to snapshot time T2, and the original data in data block set D needs to be recorded in the snapshot files corresponding to both snapshot time T1 and snapshot time T2. To meet this requirement, the snapshot backup service reads the data in data block set D, compresses it, and appends it to the data file corresponding to snapshot time T1, and updates the data block index information corresponding to snapshot time T1 and the data record determination identifier of the bitmap position corresponding to data block set D. Meanwhile, the snapshot backup service reads the data in data block set A', compresses it, and writes it into the data file corresponding to snapshot time T2; it updates the data record determination identifiers of the bitmap positions corresponding to the first data block set and the eleventh data block set in the sparse bitmap information corresponding to snapshot time T2, and records the data block index information of the compressed data block set A' in the data block index table corresponding to snapshot time T2. The data block index information corresponding to data block set D at snapshot time T2 is linked, by means of a pointer, to the data block index information corresponding to data block set D at snapshot time T1.
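The pointer-style link between the T2 and T1 index entries can be illustrated with a small sketch. The `Snapshot` class, its entry format, and the use of `zlib` are assumptions made for illustration; bitmap positions are 0-based here, so the eleventh data block set is position 10.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Simplified snapshot file: an append-only data file plus an index table."""
    name: str
    data: bytearray = field(default_factory=bytearray)
    # position -> ("local", offset, size) or ("link", other Snapshot)
    index: dict = field(default_factory=dict)

    def store(self, position: int, block_set: bytes) -> None:
        """Compress a data block set and append it to this snapshot's data file."""
        compressed = zlib.compress(block_set)
        self.index[position] = ("local", len(self.data), len(compressed))
        self.data += compressed

    def link(self, position: int, target: "Snapshot") -> None:
        """Point this snapshot's entry at the same position in another snapshot."""
        self.index[position] = ("link", target)

    def read(self, position: int) -> bytes:
        """Resolve an entry, following a link if the data is stored elsewhere."""
        entry = self.index[position]
        if entry[0] == "link":
            return entry[1].read(position)
        _, offset, size = entry
        return zlib.decompress(bytes(self.data[offset:offset + size]))


# Usage mirroring fig. 7 (b): D is stored once in T1 and linked from T2.
t1, t2 = Snapshot("T1"), Snapshot("T2")
t1.store(10, b"original data of set D")
t2.store(0, b"data of set A'")
t2.link(10, t1)
```

Storing set D once and linking to it from T2 avoids writing the same compressed block into both data files.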
FIG. 7(c) depicts the snapshot file maintenance flow that preserves data integrity for the T2 snapshot time and later snapshot times when the snapshot file corresponding to the T1 snapshot time (the link snapshot time) needs to be deleted. Before the T1 snapshot file is deleted, the data block index information of all later snapshot times is traversed. When a link to the T1 snapshot time is detected, the linked data in the data file corresponding to the T1 snapshot time is copied into the data file of the priority reference snapshot file, and the links to the T1 snapshot time held by all snapshot times after the one corresponding to the priority reference snapshot file are modified to point to the snapshot time corresponding to the priority reference snapshot file.
Specifically, in FIG. 7(b), in the data block index table corresponding to the T2 snapshot time, the data block index information at the 11th bitmap position points to the data block index information at the same bitmap position in the data block index table corresponding to the T1 snapshot time. The data for that bitmap position in the data file corresponding to the T1 snapshot time must therefore be copied, in append mode, into the data file corresponding to the T2 snapshot time (that is, the snapshot time corresponding to the priority reference snapshot file), and the data block index information at that bitmap position for the T2 snapshot time must be updated to index information that points locally, as shown in FIG. 7(c). If other snapshot times after the T2 snapshot time also reference the T1 snapshot time, their links to the T1 snapshot time are modified to point to the T2 snapshot time. Finally, the snapshot file corresponding to the T1 snapshot time is deleted.
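The deletion flow of FIG. 7(c) can be sketched as below. The dictionary layout, the function name `delete_link_snapshot`, and the entry shapes `("local", offset, length)` and `("link", snapshot_name)` are hypothetical choices for this illustration; the sketch also assumes that the snapshot list is ordered oldest first.

```python
def delete_link_snapshot(snapshots, victim):
    """Delete the snapshot named `victim` while keeping later snapshots readable.

    snapshots: list of dicts ordered oldest first, each with keys
    'name', 'index' (set_no -> entry) and 'data' (bytearray).
    """
    pos = next(i for i, s in enumerate(snapshots) if s["name"] == victim)
    gone, later = snapshots[pos], snapshots[pos + 1:]
    for set_no, entry in gone["index"].items():
        if entry[0] != "local":
            continue  # nothing stored locally, so nothing to hand over
        # Later snapshots whose index entry links to the victim for this block set.
        referrers = [s for s in later if s["index"].get(set_no) == ("link", victim)]
        if not referrers:
            continue
        # The referrer closest in time to the victim becomes the new owner.
        priority, rest = referrers[0], referrers[1:]
        _, offset, length = entry
        blob = gone["data"][offset:offset + length]
        priority["index"][set_no] = ("local", len(priority["data"]), length)
        priority["data"] += blob  # append-mode copy
        for snap in rest:
            snap["index"][set_no] = ("link", priority["name"])
    del snapshots[pos]
```

Picking the first referrer in time order mirrors the "priority reference snapshot file" in the text: it is the referencing snapshot whose snapshot time is closest to the deleted one.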
In the technical scheme of this embodiment, when multiple snapshot times exist, the compressed data block corresponding to a data block to be updated would otherwise have to be written into the snapshot file of every snapshot time. The scheme instead uses data pointers that form an ordered chain: the compressed data block is written only into the data file corresponding to the first snapshot time in the snapshot sequence, and in the snapshot files of the other snapshot times only the data pointer in the data block index information is updated. The remaining snapshot files in the snapshot sequence therefore need no data write, which addresses the IO performance degradation in the multi-snapshot case.
In addition, data blocks in this scheme are queried through a KV (key-value) table combined with a linked list, so index lookups are fast and data block queries are more efficient.
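One way to read the "KV table plus linked list" lookup is sketched below: a per-snapshot dictionary gives constant-time access to a block set's index entry, and each entry may point to the entry that actually holds the data. The `IndexEntry` fields and the `resolve` function are assumptions for illustration; the patent does not specify this layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexEntry:
    """Hypothetical data block index entry for one block set in one snapshot."""
    snapshot: str
    offset: Optional[int] = None          # set when the data is stored locally
    length: Optional[int] = None
    link: Optional["IndexEntry"] = None   # next node in the chain, None if data is local


def resolve(table, set_no):
    """Look up a block set in a snapshot's KV table and follow links to the data owner."""
    entry = table.get(set_no)  # key lookup in the KV table
    while entry is not None and entry.link is not None:
        entry = entry.link     # walk the pointer chain
    return entry
```

A lookup that lands on a local entry returns immediately; one that lands on a link entry follows the chain until it reaches the snapshot whose data file holds the compressed block.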
It should be understood that, although the steps in the flowcharts of the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited to that order, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed sequentially, as they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application also provides a data backup device for implementing the data backup method described above. The solution provided by the device is implemented in a manner similar to that described for the method, so for the specific limitations in the one or more data backup device embodiments below, reference may be made to the limitations of the data backup method above; they are not repeated here.
In one embodiment, as shown in FIG. 8, there is provided a data backup apparatus including: a determining module 810, a compression module 820, a storage module 830, and an establishing module 840, wherein:
the determining module 810 is configured to, in response to a data snapshot creation request for a source volume, determine, in a case where a data block to be updated in the source volume has not been snapshot-recorded, at least one snapshot time at which the data block to be updated needs to be snapshot-recorded, and obtain a snapshot sequence corresponding to the data block to be updated; the snapshot sequence consists of the snapshot files corresponding to each snapshot time;
the compression module 820 is configured to compress the target data block set to which the data block to be updated belongs in the source volume, so as to obtain a compressed data block;
the storage module 830 is configured to store the compressed data block into a link snapshot file in the snapshot sequence; the link snapshot file is the snapshot file corresponding to the link snapshot time among the at least one snapshot time; the link snapshot time is earlier than any other snapshot time among the at least one snapshot time;
and the establishing module 840 is configured to establish a link relationship between the link snapshot file and the snapshot files corresponding to the snapshot times other than the link snapshot time, so as to back up the data blocks in the source volume.
In one embodiment, the establishing module 840 is specifically configured to update, in the link snapshot file, data block index information corresponding to the compressed data block to obtain link index information; and linking the data block index information of the compressed data block in the snapshot files corresponding to the snapshot times except the link snapshot time to the link index information in a pointer mode so as to establish the link relation.
In one embodiment, the apparatus further comprises:
an index determining module, configured to determine, in response to a deletion instruction for the link snapshot file, the data block index information in the link snapshot file that has a locally pointing link relation as target link index information;
a copying module, configured to append a copy of the data block corresponding to the target link index information in the link snapshot file to a priority reference snapshot file, to obtain a copied data block; the priority reference snapshot file is the snapshot file corresponding to the priority reference snapshot time among the reference snapshot files; the data block index information in each reference snapshot file is linked to the target link index information; the time interval between the priority reference snapshot time and the link snapshot time is smaller than the time interval between any other reference snapshot time and the link snapshot time; a reference snapshot time is the snapshot time corresponding to a reference snapshot file;
an index updating module, configured to update the data block index information corresponding to the copied data block in the priority reference snapshot file to index information that points locally, to obtain local link index information corresponding to the copied data block in the priority reference snapshot file;
and a link updating module, configured to update the data block index information linked to the target link index information, in the remaining reference snapshot files corresponding to the remaining reference snapshot times, so that it is linked to the local link index information, and to delete the link snapshot file.
In one embodiment, the establishing module 840 is specifically configured to read, in response to a data backup request for the source volume at the full backup time, the bitmap information corresponding to the source volume at the full backup time; and to read each data block set according to the data record determination identifier corresponding to each bitmap position in the bitmap information, so as to write the data blocks in each data block set into a preset backup file; the bitmap positions and the data block sets are in a one-to-one mapping relation.
In one embodiment, the establishing module 840 is specifically configured to determine, according to the data record determination identifier corresponding to the current bitmap position, whether the current data block set corresponding to the current bitmap position has been snapshot-recorded; in a case where the current data block set has not been snapshot-recorded, to read the data blocks of the current data block set from the source volume and write them into a full backup file; in a case where the current data block set has been snapshot-recorded, to read the current compressed data block, according to the data block index information corresponding to the current data block set at the full backup time, from the snapshot file in which the current compressed data block corresponding to the current data block set is recorded, and to decompress the current compressed data block so as to write the data blocks of the current data block set into the full backup file; and to return to the step of determining whether the current data block set corresponding to the current bitmap position has been snapshot-recorded, until the data blocks in every data block set in the source volume have been written into the full backup file.
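The full backup loop described above can be sketched as follows. The function name `full_backup`, its parameters, the `(snapshot_name, offset, length)` index shape, and the use of `zlib` are assumptions for illustration; real code would read from a volume and write to a backup file rather than use in-memory lists.

```python
import zlib


def full_backup(source_sets, bitmap, index, snapshot_data):
    """Return the block-set payloads in the order they would be written to the full backup file.

    source_sets:   list of bytes, the current content of each block set in the source volume
    bitmap:        data record determination identifier per bitmap position (0 or 1)
    index:         bitmap position -> (snapshot_name, offset, length) for recorded sets
    snapshot_data: snapshot_name -> bytes of that snapshot's data file
    """
    backup = []
    for pos, recorded in enumerate(bitmap):
        if recorded:
            # The set changed after the snapshot was taken: use the preserved original.
            name, offset, length = index[pos]
            blob = snapshot_data[name][offset:offset + length]
            backup.append(zlib.decompress(blob))
        else:
            # Unchanged since the snapshot: the source volume still holds the right data.
            backup.append(source_sets[pos])
    return backup
```

The bitmap test decides, per block set, whether the consistent data still lives in the source volume or has to be recovered from a snapshot file.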
In one embodiment, the apparatus further comprises:
an acquisition module, configured to acquire, in response to a data incremental backup request for the source volume at the incremental backup time, the bitmap log identifier corresponding to each bitmap position; the bitmap log identifier is used to determine whether the current data block set corresponding to the current bitmap position was updated in the period between the full backup time and the incremental backup time;
a bitmap reading module, configured to read, in a case where an update has occurred, the bitmap information corresponding to the source volume at the incremental backup time;
and a data reading module, configured to read the current data block set according to the data record determination identifier corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time, so as to write it into the incremental backup file.
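A minimal sketch of the incremental backup selection follows, assuming the bitmap log identifier is a per-position flag. The function name `incremental_backup` and the two reader callables are hypothetical; they stand in for reading a block set from the source volume or from a snapshot file.

```python
def incremental_backup(bitmap_log, bitmap, read_from_source, read_from_snapshot):
    """Collect only the block sets that changed since the full backup.

    bitmap_log: per-position flag, truthy if the set was updated between the
                full backup time and the incremental backup time
    bitmap:     data record determination identifier per position at the
                incremental backup time
    read_from_source / read_from_snapshot: callables taking a bitmap position
                and returning that block set's bytes
    """
    incremental = {}
    for pos, changed in enumerate(bitmap_log):
        if not changed:
            continue  # unchanged since the full backup, so it is skipped
        reader = read_from_snapshot if bitmap[pos] else read_from_source
        incremental[pos] = reader(pos)
    return incremental
```

Only positions flagged in the bitmap log are visited for reading, so the cost of an incremental backup scales with the amount of changed data rather than the volume size.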
The modules in the data backup device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in FIG. 9. The computer device includes a processor, a memory, an input/output (I/O) interface, and a communication interface. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the data in the source volume. The input/output interface of the computer device exchanges information between the processor and external devices. The communication interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a data backup method.
It will be appreciated by those skilled in the art that the structure shown in fig. 9 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to comply with the related laws and regulations and standards of the related countries and regions.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program instructing the relevant hardware. The computer program may be stored on a non-volatile computer-readable storage medium and, when executed, may include the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, or data processing logic units based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in relative detail but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art may make various modifications and improvements without departing from the spirit of the present application, and these all fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method of data backup, the method comprising:
responding to a data snapshot creation request aiming at a source volume, and determining at least one snapshot time needing to carry out snapshot recording aiming at a data block to be updated in the source volume under the condition that the data block to be updated in the source volume is not snapshot recorded, so as to obtain a snapshot sequence corresponding to the data block to be updated; the snapshot sequence consists of snapshot files corresponding to each snapshot time;
compressing the target data block set to which the data block to be updated belongs in the source volume, to obtain a compressed data block;
storing the compressed data blocks into a linked snapshot file in the snapshot sequence; the link snapshot file is a snapshot file corresponding to the link snapshot time in the at least one snapshot time; the link snapshot time is earlier than any snapshot time except the link snapshot time in the at least one snapshot time;
establishing a link relation between data block index information aiming at the compressed data block in a snapshot file corresponding to each snapshot time except the link snapshot time and the data block index information aiming at the compressed data block in the link snapshot file by adopting a pointer mode so as to backup the data block in the source volume;
wherein backing up the data blocks in the source volume includes:
writing the data blocks in each data block set in the source volume into a preset backup file according to the data record judgment marks corresponding to each bitmap position in the bitmap information corresponding to the source volume; and each bitmap position has a one-to-one mapping relation with each data block set.
2. The method according to claim 1, wherein the establishing, by means of pointers, a link relationship between the data block index information for the compressed data block in the snapshot file corresponding to each snapshot time except the link snapshot time and the data block index information for the compressed data block in the link snapshot file, so as to backup the data block in the source volume includes:
updating the data block index information corresponding to the compressed data block in the link snapshot file to obtain link index information;
and linking the data block index information of the compressed data block in the snapshot files corresponding to the snapshot times except the link snapshot time to the link index information in a pointer mode so as to establish the link relation.
3. The method according to claim 1, wherein the method further comprises:
in response to a deletion instruction for the link snapshot file, determining the data block index information in the link snapshot file that has a locally pointing link relation as target link index information;
appending a copy of the data block corresponding to the target link index information in the link snapshot file to a priority reference snapshot file, to obtain a copied data block; the priority reference snapshot file is the snapshot file corresponding to the priority reference snapshot time among the reference snapshot files; the data block index information in each reference snapshot file is linked to the target link index information; the time interval between the priority reference snapshot time and the link snapshot time is smaller than the time interval between any other reference snapshot time and the link snapshot time; a reference snapshot time is the snapshot time corresponding to a reference snapshot file;
updating the index information of the data block corresponding to the copied data block in the priority reference snapshot file to index information pointing to the local, and obtaining the local link index information corresponding to the copied data block in the priority reference snapshot file;
and updating the data block index information linked to the target link index information in the rest reference snapshot files corresponding to the rest reference snapshot time to be linked to the local link index information, and deleting the link snapshot file.
4. The method of claim 1, wherein writing the data blocks in the set of data blocks in the source volume into a preset backup file according to the data record determination identifier corresponding to each bitmap position in the bitmap information corresponding to the source volume, includes:
responding to a data backup request of the source volume at the full backup time, and reading bitmap information corresponding to the source volume at the full backup time;
and reading each data block set according to the data record judgment mark corresponding to each bitmap position in the bitmap information so as to write the data blocks in each data block set into the preset backup file.
5. The method according to claim 4, wherein reading each set of data blocks according to the data record determination identifier corresponding to each bitmap position in the bitmap information to write the data blocks in each set of data blocks into the preset backup file includes:
judging whether a current data block set corresponding to a current bitmap position is recorded by a snapshot or not according to a data record judging mark corresponding to the current bitmap position;
under the condition that the current data block set is judged not to be recorded by the snapshot, reading the data blocks in the current data block set in the source volume to write a full backup file;
and returning to the step of judging whether the current data block set corresponding to the current bitmap position is recorded by the snapshot according to the data record judgment mark corresponding to the current bitmap position, until the data blocks in each data block set in the source volume have been written into the full backup file.
6. The method of claim 5, wherein the method further comprises:
under the condition that the current data block set is judged to be recorded by the snapshot, reading the current compressed data block from a snapshot file recorded with the current compressed data block corresponding to the current data block set according to the data block index information corresponding to the current data block set at the full backup time;
and decompressing the current compressed data blocks to write the data blocks in the current data block set into the full backup file.
7. The method of claim 5, wherein, after the data blocks in each data block set in the source volume have been written into the full backup file, the method further comprises:
in response to a data incremental backup request for the source volume at the incremental backup time, acquiring the bitmap log identifier corresponding to each bitmap position; the bitmap log identifier is used for judging whether the current data block set corresponding to the current bitmap position is updated in the period between the full backup time and the incremental backup time;
in the case where an update has occurred, reading the bitmap information corresponding to the source volume at the incremental backup time;
and reading the current data block set according to the data record judgment mark corresponding to the current bitmap position in the bitmap information corresponding to the incremental backup time so as to write the incremental backup file.
8. A data backup apparatus, the apparatus comprising:
the determining module is used for responding to a data snapshot creation request aiming at a source volume, and determining at least one snapshot time of snapshot record aiming at the data block to be updated in the source volume under the condition that the data block to be updated in the source volume is not snapshot recorded, so as to obtain a snapshot sequence corresponding to the data block to be updated; the snapshot sequence consists of snapshot files corresponding to each snapshot time;
the compression module is used for compressing the target data block set to which the data block to be updated belongs in the source volume, to obtain a compressed data block;
the storage module is used for storing the compressed data blocks into a linked snapshot file in the snapshot sequence; the link snapshot file is a snapshot file corresponding to the link snapshot time in the at least one snapshot time; the link snapshot time is earlier than any snapshot time except the link snapshot time in the at least one snapshot time;
the establishing module is used for establishing, by means of pointers, a link relation between the data block index information for the compressed data block in the snapshot files corresponding to the snapshot times other than the link snapshot time and the data block index information for the compressed data block in the link snapshot file, so as to back up the data blocks in the source volume;
the establishing module is specifically configured to write data blocks in each data block set in the source volume into a preset backup file according to a data record determination identifier corresponding to each bitmap position in bitmap information corresponding to the source volume; and each bitmap position has a one-to-one mapping relation with each data block set.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202211134595.0A 2022-09-19 2022-09-19 Data backup method, device, computer equipment and storage medium Active CN115509808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211134595.0A CN115509808B (en) 2022-09-19 2022-09-19 Data backup method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115509808A CN115509808A (en) 2022-12-23
CN115509808B true CN115509808B (en) 2023-07-07

Family

ID=84504722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211134595.0A Active CN115509808B (en) 2022-09-19 2022-09-19 Data backup method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115509808B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116560914B (en) * 2023-07-10 2023-10-13 成都云祺科技有限公司 Incremental backup method, system and storage medium under virtual machine CBT failure

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7870356B1 (en) * 2007-02-22 2011-01-11 Emc Corporation Creation of snapshot copies using a sparse file for keeping a record of changed blocks
CN107544871B (en) * 2017-07-21 2020-10-02 新华三云计算技术有限公司 Virtual machine disk backup method and device
CN109710454A (en) * 2018-11-08 2019-05-03 厦门集微科技有限公司 A kind of cloud host snapshot method and device
CN111338850A (en) * 2020-02-25 2020-06-26 上海英方软件股份有限公司 Method and system for improving backup efficiency based on COW mode multi-snapshot
CN113448774B (en) * 2021-06-04 2023-01-24 山东英信计算机技术有限公司 Method, system, device and medium for optimizing copy-on-write storage snapshot management
CN115033425A (en) * 2022-05-31 2022-09-09 中电信数智科技有限公司 Method for improving success rate of data backup

Also Published As

Publication number Publication date
CN115509808A (en) 2022-12-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant