CN112269681A - Method, device and equipment for continuously protecting virtual machine data - Google Patents
- Publication number
- CN112269681A (application number CN202011112737.4A)
- Authority
- CN
- China
- Prior art keywords
- backup
- data
- log
- incremental
- recovery
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1479—Generic software techniques for error detection or fault masking
- G06F11/1482—Generic software techniques for error detection or fault masking by means of middleware or OS functionality
- G06F11/1484—Generic software techniques for error detection or fault masking by means of middleware or OS functionality involving virtual machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a method, a device and equipment for continuously protecting virtual machine data and a readable storage medium. The method and the device integrate three backup technologies of continuous data protection, full backup and incremental backup, protect recent data at a fine-grained IO level by the continuous data protection technology, and protect early and coarse-grained historical snapshot data by the full backup and the incremental backup to form a sparse-to-dense data protection system. Based on the backup mode, the recent data can be subjected to IO-level data recovery, zero loss of recent data assets is achieved, and RPO indexes are reduced; the early historical data can be subjected to data recovery at an incremental backup level or a full backup level, so that the occupation of storage space is reduced. And when CDP data recovery is carried out, only IO data in a limited range between backup points need to be recovered, so that the IO quantity of data recovery is reduced, and the RTO index is reduced.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a readable storage medium for continuously protecting virtual machine data.
Background
As virtualization technology continues to mature, more and more government departments and enterprises are adopting cloud computing to deploy their own virtual data centers. In the field of cloud computing, data protection of virtual machines is particularly important.
Continuous Data Protection (CDP) is a method that continuously captures or tracks every change to the target data without affecting the operation of the primary data, and can restore the data to any previous point in time. In 2011, the CDP technical group of SNIA (the Storage Networking Industry Association) published a technical document on CDP that explicitly sets out three criteria for CDP: (1) any change to the source data can be captured; (2) the data is backed up to at least one other location (disaster recovery); (3) the data can be recovered to any point in time.
Based on the three criteria defined by SNIA, the industry also uses two metrics to measure CDP data protection and data recovery: the Recovery Time Objective (RTO) and the Recovery Point Objective (RPO). The recovery time objective is the maximum length of time from the occurrence of a disaster to the recovery of the system; the recovery point objective is the maximum span of time for which data may be lost when a disaster occurs.
The continuous data protection technology can reduce the RPO to 0 theoretically based on the IO of each disk. However, when the disk IO is continuously saved for a long time, or the disk IOPS (Input/Output Operations Per Second) is too high, and the amount of IO data saved in the CDP is too large, there are two problems at this time: (1) IO data which are too long in time are stored, and too much storage space is occupied; (2) when the IO data stored for a long time is subjected to data recovery, the recovery time is too long, that is, the RTO is too large.
To solve the above problems, the prior art proposes reading the time baseline near the recovery point into memory and exploiting the fast data processing speed of memory to improve data recovery efficiency. However, the basic premise of this scheme is to trade memory space for recovery efficiency (trading space for time). In practice it consumes a large amount of system memory, and it becomes unusable when the IO data to be recovered is large.
In summary, how to protect virtual machine data while avoiding backup data that occupies too much space and data recovery that takes too long is a problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The application aims to provide a method, a device, equipment and a readable storage medium for continuously protecting virtual machine data, which are used for solving the problems that the current virtual machine data protection scheme has overlarge backup data occupation space or long data recovery time. The specific scheme is as follows:
in a first aspect, the present application provides a method for continuously protecting virtual machine data, including:
writing the IO data into a disk whenever the IO data of the virtual machine is detected; generating an IO log according to the IO data by adopting a continuous data protection technology, and storing the IO log to a first backup area;
performing incremental backup on the disk according to the incremental backup frequency, and storing incremental backup data to a second backup area;
performing full backup on the disk according to full backup frequency, and storing full backup data into a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
when the backup duration of the first backup area exceeds the retention duration of the IO log, clearing the IO log on the first backup area;
and when the backup duration of the second backup area exceeds the retention duration of incremental backup data, removing the incremental backup data on the second backup area, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
Preferably, the incremental backup frequency is once every N hours, and performing incremental backup on the disk according to the incremental backup frequency includes:
performing incremental backup on the disk every N hours every day, wherein N is a factor of 24;
correspondingly, the full backup frequency is once a day, and performing full backup on the disk according to the full backup frequency includes:
and performing full backup on the disk at a target time point every day.
Preferably, before the removing the IO log in the first backup area when the backup duration of the first backup area exceeds the retention duration of the IO log, the method further includes:
traversing the IO logs on the first backup area, and determining the earliest time stamp of all the IO logs; and calculating the difference between the current time and the earliest timestamp to serve as the backup time length of the first backup area.
Preferably, the method further comprises the following steps:
setting a backup policy using a policy management component, wherein the backup policy comprises: the incremental backup frequency, the full backup frequency, the retention duration of the IO log, the retention duration of the incremental backup data, an address of the first backup area, an address of the second backup area and an address of the third backup area.
Preferably, the generating an IO log according to the IO data and storing the IO log in a first backup area includes:
and generating an IO log according to the IO data by using an IO filter, and storing the IO log to a first backup area, wherein the IO log comprises SuperBlock and IO metadata.
Preferably, the method further comprises the following steps:
determining a recovery time point according to the data recovery instruction;
if the difference between the current time and the recovery time point is less than or equal to the retention time of the IO log, performing IO-level data recovery according to the IO log on the first backup area;
if the difference between the current time and the recovery time point is larger than the retention time of the IO log and smaller than or equal to the retention time of the incremental backup data, performing data recovery according to the incremental backup data on the second backup area;
and if the difference between the current time and the recovery time point is greater than the retention time of the incremental backup data, performing data recovery according to the full backup data on the third backup area.
Preferably, the performing of the data recovery at the IO level according to the IO log on the first backup area includes:
determining an incremental backup process closest to the recovery time point, determining the actual backup time of the incremental backup process, and acquiring corresponding incremental backup data;
obtaining an IO log generated between the actual backup time and a recovery time point;
and according to the incremental backup data and the IO log, performing IO-level data recovery.
In a second aspect, the present application provides an apparatus for continuously protecting virtual machine data, including:
an IO backup module: configured to write IO data into a disk whenever the IO data of a virtual machine is detected, to generate an IO log according to the IO data by adopting a continuous data protection technology, and to store the IO log to a first backup area;
an incremental backup module: configured to perform incremental backup on the disk according to the incremental backup frequency and to store incremental backup data to a second backup area;
a full backup module: configured to perform full backup on the disk according to the full backup frequency and to store full backup data into a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
an IO clearing module: configured to clear the IO log on the first backup area when the backup duration of the first backup area exceeds the retention duration of the IO log;
an incremental clearing module: configured to clear the incremental backup data on the second backup area when the backup duration of the second backup area exceeds the retention duration of the incremental backup data, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
In a third aspect, the present application provides an apparatus for continuously protecting virtual machine data, including:
a memory: for storing a computer program;
a processor: for executing the computer program to implement the method for continuously protecting virtual machine data as described above.
In a fourth aspect, the present application provides a readable storage medium having stored thereon a computer program for implementing the method of continuously protecting virtual machine data as described above when executed by a processor.
The application provides a method for continuously protecting virtual machine data, which comprises the following steps: writing the IO data into a disk whenever the IO data of the virtual machine is detected; generating an IO log according to the IO data by adopting a continuous data protection technology, and storing the IO log to a first backup area; performing incremental backup on the disk according to the incremental backup frequency, and storing incremental backup data to a second backup area; performing full backup on the disk according to the full backup frequency, and storing full backup data into a third backup area, wherein the full backup frequency is lower than the incremental backup frequency; when the backup duration of the first backup area exceeds the retention duration of the IO log, clearing the IO log on the first backup area; and when the backup time length of the second backup area exceeds the retention time length of the incremental backup data, removing the incremental backup data on the second backup area, wherein the retention time length of the incremental backup data is greater than that of the IO log.
Therefore, the method integrates three backup technologies of continuous data protection, full backup and incremental backup, the continuous data protection technology protects the data of fine-grained IO level in the near future (defining the near-term time range according to the backup strategy), and the full backup and the incremental backup protect the historical snapshot data of coarse granularity in the early stage (defining the early-stage time range according to the backup strategy), so as to form a sparse-to-dense data protection system. Based on the backup mode, the recent data can be subjected to IO-level data recovery, zero loss of recent data assets is achieved, and RPO indexes are reduced; the early historical data can be subjected to data recovery at an incremental backup level or a full backup level, so that the occupation of storage space is reduced. And when CDP data recovery is carried out, only IO data in a limited range between backup points need to be recovered, so that the IO quantity of data recovery is reduced, and the RTO index is reduced.
In addition, the application also provides a device, equipment and a readable storage medium for continuously protecting the virtual machine data, and the technical effect of the device, the equipment and the readable storage medium corresponds to the technical effect of the method, and the details are not repeated here.
Drawings
To explain the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a first embodiment of a method for continuously protecting virtual machine data according to the present application;
FIG. 2 is a schematic process diagram of a second embodiment of a method for continuously protecting virtual machine data according to the present application;
FIG. 3 is a diagram illustrating a complete data protection cycle for continuously protecting virtual machine data;
FIG. 4 is a schematic diagram of the structure of an IO log;
FIG. 5 is a schematic flowchart of IO data protection performed by CDP;
FIG. 6 is a comparison of RTO metrics between a data protection scheme using full/incremental/CDP and a conventional CDP scheme;
FIG. 7 is a functional block diagram of an embodiment of an apparatus for continuously protecting virtual machine data according to the present application.
Detailed Description
The core of the application is to provide a method, a device, equipment and a readable storage medium for continuously protecting virtual machine data, and combine three backup technologies of continuous data protection, full backup and incremental backup to form a sparse-to-dense data protection system, so that RPO indexes and RTO indexes are effectively reduced.
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a first embodiment of a method for continuously protecting virtual machine data provided by the present application is described below, where the first embodiment includes:
S101, writing IO data into a disk whenever the IO data of the virtual machine is detected; generating an IO log according to the IO data by adopting a continuous data protection technology, and storing the IO log to a first backup area;
S102, performing incremental backup on the disk according to the incremental backup frequency, and storing incremental backup data to a second backup area;
S103, performing full backup on the disk according to the full backup frequency, and storing full backup data into a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
S104, when the backup duration of the first backup area exceeds the retention duration of the IO log, clearing the IO log on the first backup area;
S105, when the backup duration of the second backup area exceeds the retention duration of the incremental backup data, clearing the incremental backup data on the second backup area, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
This embodiment addresses the technical problem of data security in the field of cloud computing, in particular the continuous protection of virtual machine data using continuous data protection technology in the QEMU (Quick Emulator) virtualization environment.
Specifically, before backup is executed, a backup policy is set, which includes the incremental backup frequency, the full backup frequency, the retention duration of the IO log, and the retention duration of the incremental backup data. The incremental backup frequency is higher than the full backup frequency. For example, the full backup frequency may be once a day, with the full backup performed at a fixed time point every day; the incremental backup frequency may be once every N hours, and for ease of calculation N may be a factor of 24.
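As an illustration of this scheduling rule, the following is a minimal Python sketch, not the implementation of the application. The function name `daily_backup_points` and its parameters are hypothetical, and the sketch assumes that the backup at the daily target time point is the full backup and that every other N-hour slot is an incremental backup.

```python
def daily_backup_points(n_hours, full_backup_hour=0):
    """Return (hour_of_day, kind) pairs for one day of scheduled backups.

    Assumption: the slot at the target time point is the full backup and
    all other N-hour slots are incremental backups.
    """
    if n_hours <= 0 or 24 % n_hours != 0:
        raise ValueError("N must be a positive factor of 24")
    points = []
    for offset in range(0, 24, n_hours):
        kind = "full" if offset == 0 else "incremental"
        points.append(((full_backup_hour + offset) % 24, kind))
    return points


schedule = daily_backup_points(2)
print(sum(kind == "full" for _, kind in schedule))         # 1
print(sum(kind == "incremental" for _, kind in schedule))  # 11
```

Requiring N to be a factor of 24 keeps the backup points at the same hours every day, so the daily full backup always falls on one of the N-hour slots.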
The backup data and the IO log are not usually stored in a production storage pool of the virtual machine, but stored in a backup storage pool having backup and disaster recovery functions, so as to facilitate data recovery. Therefore, the backup policy may further include an address of the first backup area, an address of the second backup area, and an address of the third backup area.
The backup duration of the first backup area refers to the difference between the current time and the earliest time at which an IO log was written to the first backup area. Specifically, S104 may include: traversing the IO logs on the first backup area and determining the earliest timestamp among all the IO logs; and calculating the difference between the current time and the earliest timestamp as the backup duration of the first backup area.
Similarly, the backup duration of the second backup area refers to: the difference between the current time and the time at which the incremental backup data was written earliest on the second backup area.
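The backup duration calculation described above can be sketched as follows. This is a hedged illustration only: the helper names `backup_duration` and `retention_exceeded` are hypothetical, and the timestamps are assumed to be available as Python `datetime` values rather than read from a real backup area.

```python
from datetime import datetime, timedelta


def backup_duration(timestamps, now):
    """Difference between `now` and the earliest timestamp in a backup area."""
    if not timestamps:
        return timedelta(0)  # assumption: an empty area has zero duration
    return now - min(timestamps)


def retention_exceeded(timestamps, now, retention):
    """True when the backup duration exceeds the configured retention."""
    return backup_duration(timestamps, now) > retention


now = datetime(2020, 10, 16, 12, 0)
io_log_times = [now - timedelta(days=8),
                now - timedelta(days=3),
                now - timedelta(hours=1)]
print(backup_duration(io_log_times, now))                        # 8 days, 0:00:00
print(retention_exceeded(io_log_times, now, timedelta(days=7)))  # True
```

The same two helpers apply to the second backup area by passing the write times of the incremental backup data and the retention duration of the incremental backup data.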
Based on the backup mode, the data recovery process comprises the following steps:
S201, determining a recovery time point according to a data recovery instruction;
S202, if the difference between the current time and the recovery time point is less than or equal to the retention duration of the IO log, performing IO-level data recovery according to the IO log on the first backup area;
Specifically, the full backup process and the incremental backup process closest to the recovery time point are determined, and the corresponding full backup data and incremental backup data are acquired. The backup time of the incremental backup process closest to the recovery time point is then determined, and the IO log between that backup time and the recovery time point is acquired. Finally, IO-level data recovery is performed according to the acquired full backup data, incremental backup data and IO log.
S203, if the difference between the current time and the recovery time point is greater than the retention duration of the IO log and less than or equal to the retention duration of the incremental backup data, performing data recovery according to the incremental backup data on the second backup area;
Specifically, the full backup process and the incremental backup process closest to the recovery time point are determined, the corresponding full backup data and incremental backup data are acquired, and data recovery at the incremental backup level is performed according to them.
S204, if the difference between the current time and the recovery time point is greater than the retention duration of the incremental backup data, performing data recovery according to the full backup data on the third backup area.
Specifically, a full backup process closest to the recovery time point is determined, corresponding full backup data is acquired, and accordingly, data recovery at the full backup level is achieved.
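The three-way decision in S202 to S204 can be sketched as a small function. This is a minimal sketch under stated assumptions: the function name `choose_recovery_level` and the returned labels are hypothetical, and the example uses retention durations of 7 and 14 days purely for illustration.

```python
from datetime import datetime, timedelta


def choose_recovery_level(now, recovery_point, io_retention, incr_retention):
    """Pick the recovery level from the age of the requested recovery point."""
    age = now - recovery_point
    if age <= io_retention:
        return "io_log"        # S202: IO-level recovery from the first backup area
    if age <= incr_retention:
        return "incremental"   # S203: recovery from the second backup area
    return "full"              # S204: recovery from the third backup area


now = datetime(2020, 10, 16, 12, 0)
io_retention = timedelta(days=7)
incr_retention = timedelta(days=14)
for days_ago in (3, 10, 20):
    level = choose_recovery_level(now, now - timedelta(days=days_ago),
                                  io_retention, incr_retention)
    print(days_ago, level)
```

Because the retention duration of the incremental backup data is greater than that of the IO log, the three ranges never overlap and every recovery point maps to exactly one level.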
The embodiment provides a method for continuously protecting virtual machine data, and provides a data protection mode integrating three backup technologies (CDP, incremental backup, full backup). Compared with a common CDP protection scheme, the embodiment can perform IO protection between two backup points based on a full backup point or an incremental backup point, reduce data recovery time and reduce RTO indexes. Compared with a common backup scheme, the embodiment can realize the protection granularity of the IO level between two backup points by adopting a CDP technology, provide the data recovery of the IO level, and achieve zero data loss, namely, the RPO index approaches to 0.
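The reason the RTO drops is that only the IO records between the nearest backup point and the recovery time point have to be replayed. The sketch below illustrates that selection; it is not the method of the application. The name `plan_io_recovery` is hypothetical, times are plain numbers (hours) for brevity, and "closest" is interpreted here as the latest backup point at or before the recovery time point, which is an assumption.

```python
from bisect import bisect_right


def plan_io_recovery(backup_times, io_log, recovery_point):
    """Select the base backup point and the IO records to replay.

    backup_times: sorted times of full or incremental backup points.
    io_log: list of (timestamp, record) pairs sorted by timestamp.
    """
    idx = bisect_right(backup_times, recovery_point) - 1
    if idx < 0:
        raise ValueError("no backup point at or before the recovery point")
    base = backup_times[idx]
    # Only IO written after the base backup and up to the recovery point
    # needs to be replayed on top of the restored backup data.
    to_replay = [rec for ts, rec in io_log if base < ts <= recovery_point]
    return base, to_replay


backup_times = [0, 2, 4, 6]  # hours since midnight
io_log = [(0.5, "w1"), (2.5, "w2"), (4.2, "w3"), (4.8, "w4"), (5.5, "w5")]
print(plan_io_recovery(backup_times, io_log, 5.0))  # (4, ['w3', 'w4'])
```

With a plain CDP scheme and no intermediate backup points, the same recovery would have to replay every record from the start of the log up to the recovery point.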
A second embodiment of the method for continuously protecting virtual machine data provided by the present application is described in detail below.
As shown in fig. 2, the second embodiment is implemented with four major components: a policy management component, a data protection component, a data recovery component and a data cleaning component. Each component is introduced first.
First, the policy management component
The component mainly completes management of virtual machine protection strategies, and the strategies mainly comprise:
(1) The incremental backup frequency backup_frequency, which may be configured in hours (1 < backup_frequency < 24).
(2) The full backup frequency is set here to once a day, specifically once a day in the early morning.
For example, if the incremental backup frequency is configured as once every 2 hours, a full backup is performed at 0:00 every day and an incremental backup is then performed every 2 hours during that day, forming 1 full backup point and 11 incremental backup points.
(3) The retention duration IO_barrier of the IO log: IO logs recorded within this time range (an IO log is the IO data of a virtual machine captured by the CDP continuous IO protection module in the data protection component) are always kept and can be used for IO-level data recovery. It may be configured in days, where 1 ≤ IO_barrier ≤ 30. Beyond this time range, the IO log may be cleared by the data cleaning component to save storage space.
As shown in fig. 3, assuming that the retention duration of the IO log is configured as 7 days, the IO logs of the last 7 days are always retained, and IO-level data recovery can be performed within the last 7 days. IO logs older than 7 days are cleaned, and IO-level data recovery is no longer possible for them.
(4) The retention duration backup_barrier of the incremental backup data: within this time range, the recorded incremental backup points (i.e., incremental backup data) are always retained and can be used for recovery to fine-grained backup points. It is configured in days by default, and usually backup_barrier > IO_barrier. Beyond this time range, the incremental backup points may be cleared by the data cleaning component to save storage space.
As shown in fig. 3, assuming that the configured retention duration of the incremental backup is 14 days, data from 7 to 14 days ago can be restored at hour-level granularity. Beyond 14 days, only day-level data recovery is possible.
(5) The storage location backup_store of the backup points and IO logs: when the production storage pool where the original virtual machine is located becomes abnormal, the data of the virtual machine can be recovered based on the backup points and the IO logs.
For example, suppose the virtual machine production storage pool is production_store_pool and the configured storage location of the backups and IO logs is back_store_pool. When production_store_pool becomes unavailable because of an unexpected situation such as a hardware failure, the data of the original virtual machine can be recovered from the backups and logs in back_store_pool (providing a certain disaster recovery capability), so that service switchover of the original virtual machine is completed rapidly.
Second, the data protection component
This component performs data protection of the virtual machine, which is divided by nature into full/incremental backup and CDP continuous IO protection:
(1) Full/incremental backup: the backup points are discrete. A full backup is performed at a fixed time, 0:00 every day, and incremental backups are triggered at the backup frequency defined in the policy management component. The backup is based on the drive-backup mechanism in QEMU, so IO write operations can continue in the virtual machine while the backup is in progress, which guarantees service continuity in the virtual machine.
(2) CDP continuous IO protection: the IO protection is continuous and linear. When IO is written in the virtual machine, the IO filter in QEMU backs up the IO to a predefined storage location to form an IO log. The IO log is stored in a defined format so that the data recovery component can restore the IO; the format of the IO log is described in detail below and is not expanded here.
Third, the data recovery component
This component performs data recovery to a user-specified point in time. When the virtual machine fails, for example when the virtual machine is hit by ransomware, data in the virtual machine is deleted by mistake, or an irreparable failure occurs in the original production storage pool of the virtual machine, data recovery can be performed according to a time point specified by the user, which covers the following two cases:
(1) Data recovery at IO level: when the time point specified by the user is within the retention duration of the IO log defined by the policy management component, IO-level data recovery can be performed, achieving zero loss of data assets;
(2) Data recovery between backup points: when the time point specified by the user exceeds the retention duration of the IO log defined by the policy management component but is within the retention duration of the incremental backup, hour-level data recovery can be performed; when the time point specified by the user exceeds the retention duration of the incremental backup defined by the policy management component, day-level data recovery can be performed.
Fourth, the data cleaning component
This component cleans up expired data (expired means older than the retention duration) to save storage space, in the following two cases:
(1) when the retention duration of the IO log defined by the policy management component is exceeded, the data cleaning component deletes the expired IO logs from the backup and IO log storage location defined by the policy management component;
(2) when the retention duration of the incremental backup defined by the policy management component is exceeded, the data cleaning component deletes the expired incremental backups from the backup and IO log storage location defined by the policy management component.
The IO log stores IO data and allows it to be recovered. To support storage, recovery, and consistency of the data, a dedicated data format is required; the IO log format is briefly described here.
As shown in fig. 4, the IO log is mainly composed of a SuperBlock and IO metadata:
(1) the SuperBlock holds the summary information of the IO log, such as the size of the IO log, the total number of IOs, the first IO information (tail block IO block position and tail block IO timestamp), and the last IO information (head block IO block position and head block IO timestamp). The size of the SuperBlock is 512 bytes;
(2) each IO metadata entry is composed, in order, of a DESCRIPTOR BLOCK, an original IO information block, and a COMMIT BLOCK. The DESCRIPTOR BLOCK is the description block (that is, the header) of the IO metadata entry, and its size is fixed at 512 bytes. The COMMIT BLOCK is the commit block (that is, the tail) of the IO metadata entry, and its size is also fixed at 512 bytes. Each DESCRIPTOR BLOCK corresponds to a unique COMMIT BLOCK. The original IO information block (DATA between the DESCRIPTOR BLOCK and the COMMIT BLOCK in fig. 4) holds the IO information captured from the source disk of the virtual machine, and is enclosed by the DESCRIPTOR BLOCK as its header and the COMMIT BLOCK as its trailer.
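The record layout described above can be illustrated with a minimal Python sketch. Only the 512-byte block sizes and the DESCRIPTOR BLOCK / data / COMMIT BLOCK ordering come from the description; the magic values, field order, field widths, and function names are assumptions made for illustration and are not the format shown in fig. 4.

```python
import struct

BLOCK_SIZE = 512  # SuperBlock, DESCRIPTOR BLOCK and COMMIT BLOCK are each 512 bytes

# Hypothetical little-endian field layouts; only the block size is from the text.
SUPERBLOCK_FMT = "<4sQQQdQd"  # magic, log size, IO count, tail pos, tail ts, head pos, head ts
DESCRIPTOR_FMT = "<4sQQId"    # magic, sequence no., disk offset, data length, timestamp
COMMIT_FMT = "<4sQ"           # magic, sequence no. (pairs the commit with its descriptor)


def _pad(raw: bytes) -> bytes:
    """Zero-pad a packed header to exactly one 512-byte block."""
    assert len(raw) <= BLOCK_SIZE
    return raw.ljust(BLOCK_SIZE, b"\0")


def pack_superblock(log_size, io_count, tail_pos, tail_ts, head_pos, head_ts) -> bytes:
    """Serialize the summary information of the IO log into one block."""
    return _pad(struct.pack(SUPERBLOCK_FMT, b"SUPR", log_size, io_count,
                            tail_pos, tail_ts, head_pos, head_ts))


def pack_io_record(seq: int, disk_offset: int, data: bytes, ts: float) -> bytes:
    """Emit DESCRIPTOR BLOCK + original IO data + COMMIT BLOCK, in that order."""
    desc = _pad(struct.pack(DESCRIPTOR_FMT, b"DESC", seq, disk_offset, len(data), ts))
    commit = _pad(struct.pack(COMMIT_FMT, b"CMMT", seq))
    return desc + data + commit


def unpack_io_record(buf: bytes):
    """Parse one record; reject it if the commit block does not match the descriptor."""
    desc_size = struct.calcsize(DESCRIPTOR_FMT)
    magic, seq, disk_offset, length, ts = struct.unpack(DESCRIPTOR_FMT, buf[:desc_size])
    if magic != b"DESC":
        raise ValueError("missing descriptor block")
    data = buf[BLOCK_SIZE:BLOCK_SIZE + length]
    commit_at = BLOCK_SIZE + length
    commit_size = struct.calcsize(COMMIT_FMT)
    c_magic, c_seq = struct.unpack(COMMIT_FMT, buf[commit_at:commit_at + commit_size])
    if c_magic != b"CMMT" or c_seq != seq:
        raise ValueError("record was not committed")
    return seq, disk_offset, data, ts
```

Pairing each descriptor with a commit block carrying the same sequence number is one way to detect a record whose write was interrupted, which is the kind of consistency check the header/trailer structure makes possible.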
Based on the above, the implementation of the second embodiment will be described.
Firstly, a data protection policy is established using the policy management component
(1) Configure the incremental backup frequency backup_frequency; assume backup_frequency is 2 hours;
(2) configure the retention time IO_barrier of the IO log; assume IO_barrier is 7 days;
(3) configure the retention time back_barrier of the incremental backup; assume back_barrier is 14 days;
(4) configure the storage location backup_store for backups and the IO log; assume backup_store is backstore_pool.
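The four policy settings above can be captured in a small configuration object. The following Python sketch is an illustration only: the class name ProtectionPolicy and the validation rules are assumptions, and the field names are lower-cased versions of the parameter names in the text.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    backup_frequency: timedelta  # interval between incremental backups
    io_barrier: timedelta        # retention time of the IO log (IO_barrier)
    back_barrier: timedelta      # retention time of incremental backups (back_barrier)
    backup_store: str            # storage location for backups and the IO log

    def __post_init__(self):
        # The method requires incremental backups to be retained longer than IO logs.
        if self.back_barrier <= self.io_barrier:
            raise ValueError("back_barrier must exceed io_barrier")
        if self.backup_frequency <= timedelta(0):
            raise ValueError("backup_frequency must be positive")


# The example values assumed in the embodiment.
policy = ProtectionPolicy(
    backup_frequency=timedelta(hours=2),
    io_barrier=timedelta(days=7),
    back_barrier=timedelta(days=14),
    backup_store="backstore_pool",
)
```

Rejecting a policy whose incremental-backup retention does not exceed the IO log retention keeps the three protection phases well ordered.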
It can be seen that, in this embodiment, a data protection mode in which full backup, incremental backup, and CDP are integrated is provided, and three time phases of data protection are defined based on a data protection policy, so as to form a complete data protection period of a virtual machine, as shown in fig. 3.
Secondly, the data protection component carries out virtual machine data protection
The incremental backup parameters are as follows:
Step one: capture each IO as it is written to the source disk;
Step two: append the new IO to the IO log image;
Step three: write the new IO to the source disk of the virtual machine.
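The three-step write path above can be sketched as follows. This is a toy Python model under stated assumptions: the class name IOFilter, the in-memory list standing in for the IO log image, and the tuple layout (timestamp, offset, data) are all hypothetical and are not taken from the embodiment.

```python
import io
import time


class IOFilter:
    """Toy write path: record the IO in the log, then apply it to the source disk."""

    def __init__(self, source_disk, io_log):
        self.source_disk = source_disk  # seekable binary file object
        self.io_log = io_log            # list standing in for the IO log image

    def write(self, offset: int, data: bytes, now=None) -> None:
        # Step one: capture the IO on its way to the source disk.
        captured = (time.time() if now is None else now, offset, bytes(data))
        # Step two: append the new IO to the IO log.
        self.io_log.append(captured)
        # Step three: write the new IO to the source disk of the virtual machine.
        self.source_disk.seek(offset)
        self.source_disk.write(data)


disk = io.BytesIO(bytes(16))  # a 16-byte zero-filled "disk"
log = []
IOFilter(disk, log).write(4, b"abcd", now=100.0)
```

Logging before applying the write means every change that reaches the disk also has a timestamped log entry that can be replayed later.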
Thirdly, the data recovery component carries out data recovery
A time point T for data recovery is specified, and the data recovery component restores the data as of time point T, as follows:
(1) IO-level data recovery: when Current_Time - T < IO_barrier, the IO log recorded by the data protection component has not yet been cleared, so IO-level data recovery can be performed. The recovery steps are as follows:
Step 4: the data disk based on backup_io_time.img is the recovered data corresponding to time point T.
(2) Backup-point-level data recovery: when Current_Time - T > IO_barrier, the IO log recorded by the data protection component has already been cleared, so only backup-point-level data recovery can be performed. The recovery steps are as follows:
Step 3: use backup_time.img in the backup_store_pool storage pool as the data disk, which yields the recovered data for that time point.
Therefore, this embodiment can provide IO-level data recovery in the time phase closest to the current time, hour-level data recovery in the intermediate time phase, and day-level data recovery in the time phase farthest from the current time. In short, different time ranges are served with data recovery capabilities of different granularities and levels.
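The choice of recovery level reduces to comparing the age of the requested time point against the two retention times. A minimal Python sketch, assuming the boundary handling of claim 6 (less than or equal to); the function name recovery_level and the returned labels are hypothetical:

```python
from datetime import datetime, timedelta


def recovery_level(current_time: datetime, t: datetime,
                   io_barrier: timedelta, back_barrier: timedelta) -> str:
    """Pick the finest recovery granularity still available for time point t."""
    age = current_time - t
    if age < timedelta(0):
        raise ValueError("recovery time point lies in the future")
    if age <= io_barrier:
        return "io"           # IO log still retained: IO-level recovery
    if age <= back_barrier:
        return "incremental"  # only incremental backups remain: hour-level recovery
    return "full"             # only full backups remain: day-level recovery


now = datetime(2020, 10, 16, 12, 0)
level = recovery_level(now, now - timedelta(days=10),
                       io_barrier=timedelta(days=7), back_barrier=timedelta(days=14))
```

With the example policy (7-day IO log retention, 14-day incremental retention), a time point 10 days in the past falls in the middle phase.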
Fourthly, the data cleaning component cleans up the expired data
This mainly comprises clearing expired IO logs and clearing expired backup points:
(1) Cleaning up expired IO logs comprises the following steps:
Step 4: move on to the next IO log until all IO logs have been traversed.
(2) Cleaning up expired backup points comprises the following steps:
Step 4: move on to the next incremental backup point until all incremental backup points have been traversed.
Therefore, the embodiment provides a data cleaning method, which is used for cleaning out the expired data defined by the data protection policy, and avoiding the waste of storage space in the user environment.
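The cleanup traversal can be sketched in Python as below. The intermediate steps are not spelled out above, so the loop body is an assumption: each entry is modeled as a hypothetical (timestamp, name) pair with timestamps in seconds, and an entry counts as expired when its age exceeds the retention time.

```python
def purge_expired(entries, current_time, retention):
    """Split (timestamp, name) entries into those kept and those deleted as expired."""
    kept, deleted = [], []
    for timestamp, name in entries:  # traverse every entry once
        if current_time - timestamp > retention:
            deleted.append((timestamp, name))  # older than the retention time
        else:
            kept.append((timestamp, name))
    return kept, deleted


DAY = 24 * 3600
now = 20 * DAY
io_logs = [(5 * DAY, "io_log_a"), (15 * DAY, "io_log_b")]
increments = [(5 * DAY, "incr_a"), (15 * DAY, "incr_b")]

io_kept, io_deleted = purge_expired(io_logs, now, 7 * DAY)        # IO_barrier
inc_kept, inc_deleted = purge_expired(increments, now, 14 * DAY)  # back_barrier
```

The same routine serves both cases; only the retention time passed in differs between IO logs and incremental backup points.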
In summary, the method for continuously protecting data of a virtual machine provided in this embodiment forms a complete virtual machine data protection period by fusing full backup, incremental backup, and CDP. Full backups and incremental backups form a series of data protection base points, and any IO data written in the virtual machine between adjacent base points is recorded as an IO log, so that a linear protection range is formed.
In addition, the embodiment defines three protection phases according to the data protection policy: IO-level data recovery is provided in the time phase closest to the current time, hour-level data recovery in the intermediate time phase, and day-level data recovery in the time phase farthest from the current time. Providing recovery capabilities of different granularities and levels according to the time range matches the requirements of data protection in an actual production environment (the closer a time range is to the current time, the denser the recoverable time points are expected to be; the farther it is from the current time, the sparser they are expected to be), and also saves storage space in the environment.
It should be noted that this embodiment is not a simple combination of backup and CDP schemes. On the one hand, each CDP protection phase relies on the backup point of that phase: when a new backup is generated, the CDP protection phase implements IO-level data protection on top of that backup. On the other hand, a backup point separates the IO protection ranges of two different phases: through the fusion of full backup, incremental backup, and CDP, the backups provide the protection base points, and CDP provides the IO log between protection base points. Whenever a data protection base point is generated, a new IO log is started, so the IO data is stored in segments. If the previous IO log segment is unavailable, the integrity of the next IO log segment is not affected, which improves the availability of data protection in the system.
Specifically, the difference between the present embodiment and an ordinary CDP-based continuous IO data protection scheme (hereinafter referred to as the ordinary scheme) is shown in fig. 6. In the ordinary scheme, the IO data on the whole time axis is protected by CDP alone, as shown by the upper time axis in fig. 6. In this embodiment, full backup, incremental backup, and CDP are integrated: disk data is periodically saved through full and incremental backups, and IO protection between backup points is performed using the CDP technology, as shown by the lower time axis in fig. 6.
Therefore, when the recovery time point specified by the user is T1, the ordinary CDP scheme needs to recover the IO data between Base (the CDP service activation time) and T1, whereas this embodiment only needs to recover the IO data between backup point 1 (the incremental or full backup point located before T1 and closest to T1) and T1. Likewise, when the recovery time point specified by the user is T2, the ordinary CDP scheme needs to recover the IO data between Base and T2, whereas this embodiment only needs to recover the IO data between backup point 2 (the incremental or full backup point located before T2 and closest to T2) and T2.
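The saving described above comes from replaying only the IO entries recorded after the nearest preceding backup point. A minimal Python sketch, assuming backup times are kept as a sorted list and IO log entries are hypothetical (timestamp, offset, data) tuples; the function name plan_restore is illustrative:

```python
from bisect import bisect_right


def plan_restore(backup_times, io_log, t):
    """Pick the latest backup point at or before t and the IO entries to replay."""
    idx = bisect_right(backup_times, t)  # backup_times must be sorted ascending
    if idx == 0:
        raise ValueError("no backup point at or before the requested time")
    base = backup_times[idx - 1]
    # Only IO recorded after the base point and up to t has to be replayed.
    replay = [entry for entry in io_log if base < entry[0] <= t]
    return base, replay


backup_times = [0, 120, 240]  # minutes since the first backup
io_log = [(30, 0, b"a"), (130, 8, b"b"), (200, 16, b"c"), (250, 24, b"d")]
base, replay = plan_restore(backup_times, io_log, t=210)
```

In this example the restore starts from the backup at minute 120 and replays two IO entries, whereas replaying from the start of the log would require three.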
A device for continuously protecting virtual machine data according to an embodiment of the present invention is introduced below. The device described below and the method for continuously protecting virtual machine data described above may be read with reference to each other.
As shown in fig. 7, the apparatus for continuously protecting virtual machine data of this embodiment includes:
the IO backup module 701: configured to write IO data into a disk whenever IO data of a virtual machine is detected, generate an IO log according to the IO data by adopting a continuous data protection technology, and store the IO log to a first backup area;
the incremental backup module 702: configured to perform incremental backup on the disk according to the incremental backup frequency, and store incremental backup data to a second backup area;
the full backup module 703: configured to perform full backup on the disk according to the full backup frequency, and store full backup data to a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
the IO clearing module 704: configured to clear the IO log on the first backup area when the backup duration of the first backup area exceeds the retention duration of the IO log;
the incremental clearing module 705: configured to clear the incremental backup data on the second backup area when the backup duration of the second backup area exceeds the retention duration of the incremental backup data, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
For the specific implementation of the apparatus in this embodiment, refer to the method for continuously protecting virtual machine data described above: the IO backup module 701, the incremental backup module 702, the full backup module 703, the IO clearing module 704, and the incremental clearing module 705 respectively implement steps S101, S102, S103, S104, and S105 of that method. Their specific embodiments may therefore be found in the descriptions of the corresponding parts and are not repeated here.
In addition, since the apparatus for continuously protecting virtual machine data of this embodiment is used to implement the foregoing method for continuously protecting virtual machine data, its effects correspond to those of the method, and details are not repeated here.
In addition, the present application further provides an apparatus for continuously protecting virtual machine data, including:
a memory: for storing a computer program;
a processor: for executing the computer program to implement the method for continuously protecting virtual machine data as described above.
Finally, the present application provides a readable storage medium having stored thereon a computer program for implementing the method of continuously protecting virtual machine data as described above when executed by a processor.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above detailed descriptions of the solutions provided in the present application, and the specific examples applied herein are set forth to explain the principles and implementations of the present application, and the above descriptions of the examples are only used to help understand the method and its core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.
Claims (10)
1. A method for continuously protecting virtual machine data, comprising:
writing the IO data into a disk whenever the IO data of the virtual machine is detected; generating an IO log according to the IO data by adopting a continuous data protection technology, and storing the IO log to a first backup area;
performing incremental backup on the disk according to the incremental backup frequency, and storing incremental backup data to a second backup area;
performing full backup on the disk according to full backup frequency, and storing full backup data into a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
when the backup duration of the first backup area exceeds the retention duration of the IO log, clearing the IO log on the first backup area;
and when the backup duration of the second backup area exceeds the retention duration of incremental backup data, removing the incremental backup data on the second backup area, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
2. The method of claim 1, wherein the incremental backup frequency is once every N hours, and the incrementally backing up the disk according to the incremental backup frequency comprises:
performing incremental backup on the disk every N hours every day, wherein N is a factor of 24;
correspondingly, the full backup frequency is once a day, and performing full backup on the disk according to the full backup frequency includes:
performing full backup on the disk at a target time point every day.
3. The method of claim 1, wherein before clearing the IO log on the first backup area when the backup duration of the first backup area exceeds the retention duration of the IO log, further comprising:
traversing the IO logs on the first backup area, and determining the earliest time stamp of all the IO logs; and calculating the difference between the current time and the earliest timestamp to serve as the backup time length of the first backup area.
4. The method of claim 1, further comprising:
setting a backup policy using a policy management component, wherein the backup policy comprises: the incremental backup frequency, the full backup frequency, the retention duration of the IO log, the retention duration of the incremental backup data, an address of the first backup area, an address of the second backup area, and an address of the third backup area.
5. The method of claim 1, wherein the generating an IO log according to the IO data and storing the IO log in a first backup area comprises:
generating an IO log according to the IO data by using an IO filter, and storing the IO log to the first backup area, wherein the IO log comprises a SuperBlock and IO metadata.
6. The method of any one of claims 1-5, further comprising:
determining a recovery time point according to the data recovery instruction;
if the difference between the current time and the recovery time point is less than or equal to the retention time of the IO log, performing IO-level data recovery according to the IO log on the first backup area;
if the difference between the current time and the recovery time point is larger than the retention time of the IO log and smaller than or equal to the retention time of the incremental backup data, performing data recovery according to the incremental backup data on the second backup area;
and if the difference between the current time and the recovery time point is greater than the retention time of the incremental backup data, performing data recovery according to the full backup data on the third backup area.
7. The method of claim 6, wherein the performing IO level data recovery from the IO log on the first backup region comprises:
determining an incremental backup process closest to the recovery time point, determining the actual backup time of the incremental backup process, and acquiring corresponding incremental backup data;
obtaining an IO log generated between the actual backup time and a recovery time point;
and according to the incremental backup data and the IO log, performing IO-level data recovery.
8. An apparatus for continuously protecting virtual machine data, comprising:
an IO backup module: configured to write IO data into a disk whenever IO data of a virtual machine is detected, generate an IO log according to the IO data by adopting a continuous data protection technology, and store the IO log to a first backup area;
an incremental backup module: configured to perform incremental backup on the disk according to the incremental backup frequency, and store incremental backup data to a second backup area;
a full backup module: configured to perform full backup on the disk according to the full backup frequency, and store full backup data to a third backup area, wherein the full backup frequency is lower than the incremental backup frequency;
an IO clearing module: configured to clear the IO log on the first backup area when the backup duration of the first backup area exceeds the retention duration of the IO log;
an incremental clearing module: configured to clear the incremental backup data on the second backup area when the backup duration of the second backup area exceeds the retention duration of the incremental backup data, wherein the retention duration of the incremental backup data is greater than the retention duration of the IO log.
9. An apparatus for continuously protecting data of a virtual machine, comprising:
a memory: for storing a computer program;
a processor: for executing said computer program for implementing a method for continuously protecting virtual machine data according to any of claims 1 to 7.
10. A readable storage medium, having stored thereon a computer program for implementing a method of continuously protecting virtual machine data according to any one of claims 1 to 7 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011112737.4A CN112269681A (en) | 2020-10-16 | 2020-10-16 | Method, device and equipment for continuously protecting virtual machine data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112269681A (en) | 2021-01-26 |
Family
ID=74338256
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011112737.4A Pending CN112269681A (en) | 2020-10-16 | 2020-10-16 | Method, device and equipment for continuously protecting virtual machine data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112269681A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210126 |