CN109992449B - Backup image management system, method, device and medium - Google Patents

Info

Publication number
CN109992449B
Authority
CN
China
Prior art keywords
backup
space
elastic resource
image file
recovery area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711494862.4A
Other languages
Chinese (zh)
Other versions
CN109992449A (en)
Inventor
曾祥洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Sichuan Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Sichuan Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Sichuan Co Ltd
Priority to CN201711494862.4A
Publication of CN109992449A
Application granted
Publication of CN109992449B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1479Generic software techniques for error detection or fault masking
    • G06F11/1482Generic software techniques for error detection or fault masking by means of middleware or OS functionality
    • G06F11/1484Generic software techniques for error detection or fault masking by means of middleware or OS functionality involving virtual machines
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a backup image management system, method, device and medium for starting a virtual machine directly from its backup, without a recovery step. Multiple backup images of the virtual machine are placed on a dedicated emergency recovery platform. When the virtual machine fails, a backup image on the dedicated emergency recovery platform is mounted directly onto the virtualization server, the storage of the virtualization server is redirected to that backup image, and the virtual machine is then created and powered on. The virtual machine is thus started in emergency recovery mode within a short time, the time-consuming traditional recovery process is avoided after a failure, and the reliability, availability and continuity of the virtualization platform are guaranteed. The method can retrieve multiple historical point-in-time copies of the virtual machine and is much faster than traditional recovery. It thereby solves the prior-art problems that only physical errors, not logical errors, can be repaired and that recovery takes a long time.

Description

Backup image management system, method, device and medium
Technical Field
The invention belongs to the technical field of data backup and recovery, and particularly relates to a backup image management system, method, device and medium for starting a virtual machine directly from its backup, without a recovery step.
Background
With IT costs rising, virtualization is increasingly attractive because of its cost-saving advantages. The spread of server virtualization technology has improved the resource utilization of data centers, reduced energy consumption, and given data center administrators an efficient and convenient management experience.
Server virtualization applies system virtualization to servers, so that one physical server can be virtualized into several virtual servers. More and more important service systems are now deployed on virtualization platforms, so the reliability and availability of the virtualization platform directly affect service continuity.
At present there are two main technical means of guaranteeing the reliability, availability and continuity of a virtualization platform.
The first guarantees service continuity through the high availability of the virtualization platform. When the operating system of a single virtual machine, or the physical host on which the virtual machine runs, fails, the virtual machine is dynamically switched and migrated from physical host A to physical host B, which is equivalent to traditional dual-machine failover. In a cross-site deployment, the virtual machine is switched from site A to site B. This approach protects service continuity efficiently: switching completes within a few minutes. Its disadvantage is that it can repair only physical errors, not logical errors. When the virtual machine operating system or the physical host fails, the virtual machine switches over automatically. But for a logical error made inside the virtual machine operating system, such as a file deleted by mistake or a patch containing a vulnerability being applied, even manually switching the virtual machine from physical host A to physical host B does not help, because both hosts use the same shared disk.
The second guarantees service continuity through backup and recovery. Backup software backs the virtual machine up to disk or tape, and when the virtual machine suffers a physical fault or a logical error, the backup software restores an earlier backup of the virtual machine. Because backups exist for multiple points in time, this approach can restore to any of several points in time and can effectively repair logical errors.
The drawback of traditional backup and recovery is that recovery always takes longer than backup. Typically the recovery time is at least 1.2 times the backup time: if backing up a virtual machine operating system takes one hour, restoring it takes at least 1.2 hours. In terms of time efficiency, this approach protects service continuity far less well than the first.
In summary, the prior art has the technical problem that backup-based recovery of a virtual machine takes a long time.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a backup image management system, method, device and medium for starting a virtual machine directly from its backup, without a recovery step. Multiple backup images of a virtual machine are placed on a dedicated emergency recovery platform. When the virtual machine fails, a backup image on the dedicated emergency recovery platform is mounted directly onto the virtualization server, the storage of the virtualization server is redirected to that backup image, and the virtual machine is then created and powered on. The virtual machine is thus started in emergency recovery mode within a short time, and the time-consuming traditional recovery process is avoided after a failure. The method can retrieve multiple historical point-in-time copies of the virtual machine and is also much faster than traditional recovery.
The technical scheme adopted by the invention is as follows:
In a first aspect, an embodiment of the present invention provides a backup image management system that starts a virtual machine without recovery after backup. The system runs on an emergency recovery platform and comprises a storage space dividing module, a backup storage parameter and floating plate rule setting module, a floating plate management module, a virtual machine backup module, a backup image writing module and a backup virtual machine emergency starting module, wherein:
the storage space dividing module is used for dividing the storage space of the emergency recovery platform into a cache and a floating plate;
the backup storage parameter and floating plate rule setting module is used for setting the writing sequence of the backup mirror image, dividing the recovery area of the floating plate, setting the backup strategy and setting the floating plate rule and checking the retention condition of the backup mirror image;
the floating plate management module is used for updating the rules of the floating plate, recording files of the floating plate, checking the floating plate and accessing the floating plate;
the virtual machine backup module is used for generating a snapshot during backup through the virtualization snapshot backup interface, the snapshot storing newly written data while the master disk file of the virtual machine is kept unchanged, and for deleting the snapshot after the backup is finished so that the data in the snapshot is written back into the master disk file;
the backup image writing module is used for transmitting the backup image file of the virtual machine to the emergency recovery platform, controlling the emergency recovery platform to write the backup image file into the cache, and judging how the backup image file is to be stored according to the floating plate rule;
the backup virtual machine emergency starting module is used for, when the virtual machine fails, directly powering on the virtual machine from the latest backup image file retained in the emergency recovery area of the floating plate, so that the virtual machine can be used immediately;
the emergency recovery platform utilizes a storage space dividing module to divide the storage space on the emergency recovery platform, and the emergency recovery platform utilizes a virtual machine backup module to generate a virtual machine snapshot; the emergency recovery platform writes the snapshot generated by the virtual machine backup module into a storage space by using a backup mirror image writing module; when the virtual machine fails, the emergency recovery platform directly opens and uses the latest backup image file reserved in the storage space through the backup virtual machine emergency starting module.
Further, the backup storage parameter and floating plate rule setting module specifically includes:
a backup image writing submodule: used for setting all backup image files to be written into the cache space first;
a floating plate dividing submodule: used for dividing the floating plate into an emergency recovery area and a traditional recovery area, without partitioning the space between them in advance;
a backup strategy setting submodule: used for assigning backup image files or backup image file groups to different backup strategies, each backup strategy setting its own backup frequency and retention period;
a floating plate rule setting submodule: used for separately limiting, for the emergency recovery area and the traditional recovery area, the amount of space and the number of retained copies that each backup image file or backup image file group may occupy, wherein the emergency recovery area retains only the most recent few backup image files, older backup image files are migrated into the traditional recovery area, and when a backup image file expires its space is released and handed back to the floating plate;
a backup image file checking submodule: used for regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file.
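The settings handled by these submodules can be pictured as a few small records plus a periodic expiry check. The following is a minimal sketch only: the class and field names (`AreaRule`, `FloatingPlateRule`, `BackupPolicy`, `BackupImage`, `expired_images`) are hypothetical and do not come from the patent, and an image is treated as expired once its age reaches the retention period of its strategy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AreaRule:
    """Limits for one recovery area (hypothetical field names)."""
    max_copies: int   # number of image copies the area should retain
    max_bytes: int    # space the area may occupy


@dataclass
class FloatingPlateRule:
    """Floating plate rule: one set of limits per recovery area."""
    emergency: AreaRule
    traditional: AreaRule


@dataclass
class BackupPolicy:
    """A backup strategy with its own frequency and retention period."""
    name: str
    frequency: timedelta
    retention: timedelta


@dataclass
class BackupImage:
    vm: str
    size_bytes: int
    created: datetime
    policy: BackupPolicy


def expired_images(images, now):
    """Return the images whose age has reached their retention period."""
    return [img for img in images if now - img.created >= img.policy.retention]


# Usage: a daily strategy that retains images for seven days.
daily = BackupPolicy("daily", timedelta(days=1), timedelta(days=7))
old = BackupImage("vm01", 10, datetime(2024, 1, 1), daily)
new = BackupImage("vm01", 10, datetime(2024, 1, 9), daily)
to_clear = expired_images([old, new], now=datetime(2024, 1, 10))
```

Here `to_clear` contains only the older image; clearing it would hand its space back to the floating plate.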
Further, the floating plate management module specifically includes:
a backup execution submodule: used by the backup management system to execute the virtualization backup process;
a backup image file judgment submodule: used for comparing the size of the backup image file with the remaining cache space: if the backup image file is smaller than or equal to the remaining cache space, the backup image file is written into the cache; otherwise it queues and waits;
a backup image file cache submodule: used for writing the backup image file into the cache;
a floating plate judgment submodule: used for judging according to the backup storage parameters and the floating plate rule. If the number of backup image file copies already retained in the emergency recovery area, COPY2, as recorded in the floating plate record file, is less than or equal to the number of copies to be retained in the emergency recovery area, COPY1, as defined in the floating plate rule file, and the space already occupied by the emergency recovery area, SIZE2, as recorded in the floating plate record file, is less than or equal to the space the emergency recovery area may occupy, SIZE1, as defined in the floating plate rule file, the remaining floating plate space is then compared with the size of the backup image file in the cache. If the backup image file in the cache is smaller than or equal to the remaining floating plate space, a space of the size of that backup image file is carved out of the floating plate and handed to the emergency recovery area, and the backup image file in the cache is written into the emergency recovery area. If the backup image file in the cache is larger than the remaining floating plate space, the remaining floating plate space is insufficient and an insufficient-space alarm is raised;
if either COPY2 ≤ COPY1 or SIZE2 ≤ SIZE1 does not hold, the oldest image file in the emergency recovery area is to be migrated to the traditional recovery area, and a further judgment is made according to the backup storage parameters and the floating plate rule. If the number of backup image file copies already retained in the traditional recovery area, COPY4, as recorded in the floating plate record file, is less than or equal to the number of copies to be retained in the traditional recovery area, COPY3, as defined in the floating plate rule file, and the space already occupied by the traditional recovery area, SIZE4, as recorded in the floating plate record file, is less than or equal to the space the traditional recovery area may occupy, SIZE3, as defined in the floating plate rule file, the remaining floating plate space is compared with the size of the oldest image file in the emergency recovery area. If that oldest image file is smaller than or equal to the remaining floating plate space, a space of its size is carved out of the floating plate and handed to the traditional recovery area, the oldest image file is migrated from the emergency recovery area to the traditional recovery area, and the space it occupied in the emergency recovery area is released back to the floating plate; otherwise the remaining floating plate space is insufficient and an alarm is raised;
if either COPY4 ≤ COPY3 or SIZE4 ≤ SIZE3 does not hold, a retention period limit alarm is raised;
an image migration submodule: used for carving a space of the size of the oldest image file in the emergency recovery area out of the floating plate and handing it to the traditional recovery area, migrating the oldest image file from the emergency recovery area to the traditional recovery area in the floating plate, and releasing the freed space back to the floating plate;
a floating plate space size judgment submodule: used for comparing the remaining floating plate space with the size of the image file in the cache and, if the image file in the cache is smaller than or equal to the remaining floating plate space, carving a space of the size of that image file out of the floating plate and handing it to the emergency recovery area;
an image file writing submodule: used for writing the image file in the cache into the emergency recovery area.
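The cache admission rule used by the backup image file judgment submodule (write when the file fits into the remaining cache, otherwise queue and wait) can be sketched as follows. This is an illustration under assumptions: the `Cache` class and its method names are hypothetical, sizes are plain integers, and the first-in-first-out retry performed when space is released is one possible reading of "queuing and waiting", not something the text specifies.

```python
from collections import deque


class Cache:
    """Hypothetical model of the cache in front of the floating plate."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.files = {}          # image name -> size currently held in the cache
        self.waiting = deque()   # (name, size) pairs queued for admission

    def remaining(self):
        return self.capacity - self.used

    def submit(self, name, size):
        """Write the image into the cache if it fits; otherwise queue it.

        Returns True when the image was written, False when it was queued.
        """
        if size <= self.remaining():
            self.files[name] = size
            self.used += size
            return True
        self.waiting.append((name, size))
        return False

    def release(self, name):
        """Free an image's cache space, then admit queued images in FIFO order.

        Returns the names of the queued images that were admitted.
        """
        self.used -= self.files.pop(name)
        admitted = []
        while self.waiting and self.waiting[0][1] <= self.remaining():
            queued_name, queued_size = self.waiting.popleft()
            self.files[queued_name] = queued_size
            self.used += queued_size
            admitted.append(queued_name)
        return admitted


cache = Cache(capacity=100)
first = cache.submit("vm01-backup", 60)    # fits into the empty cache
second = cache.submit("vm02-backup", 50)   # only 40 left, so it is queued
admitted = cache.release("vm01-backup")    # frees 60, so the queued image fits
```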
Further, the backup virtual machine emergency starting module in the system is specifically configured to:
positioning a required backup image file, and mounting the backup image file to a virtualization server in an emergency recovery platform matched with a backup source in a Network File System (NFS) read-only mode, so that the backup image file is used as an NFS data source of the virtualization server;
by using a virtualized snapshot mechanism, a snapshot, i.e., a recovery log, is created in production storage to store newly written data, and a backup image file in the emergency recovery platform is kept unchanged as a master file of the virtual machine, i.e., the virtual machine can be opened.
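As an illustration of the read-only NFS mount, the sketch below only builds the command line that a generic Linux host would use; it does not run it. The server name, export path and mount point are made-up placeholders, and a real virtualization server would normally attach an NFS data source through its own management tooling, which the text does not name.

```python
import shlex


def nfs_readonly_mount_cmd(server, export_path, mount_point):
    """Build (but do not run) a read-only NFS mount command for a Linux host."""
    source = f"{server}:{export_path}"
    # "-o ro" requests a read-only mount, so the backup image cannot be modified.
    return ["mount", "-t", "nfs", "-o", "ro", source, mount_point]


# Placeholder names, for illustration only.
cmd = nfs_readonly_mount_cmd(
    "recovery-platform.example", "/emergency/vm01/latest", "/mnt/vm01"
)
printable = shlex.join(cmd)
```

Because the mount is read-only, every write issued by the started virtual machine has to land somewhere else, which is the role of the recovery log created in production storage.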
In a second aspect, an embodiment of the present invention provides a method for managing a backup image of a virtual machine that is started without recovery after backup, where the method includes the following steps:
(I) dividing the storage space of the whole emergency recovery platform into a cache and a floating plate;
(II) setting the backup image writing sequence, dividing the floating plate recovery areas, setting the backup strategy, setting the floating plate rule, and checking the backup image retention status;
(III) updating and managing the floating plate rule file and the floating plate record file, and checking and accessing the floating plate;
(IV) generating a snapshot during backup through the virtualization snapshot backup interface, the snapshot storing newly written data while the master disk file of the virtual machine is kept unchanged, and deleting the snapshot after the backup is finished so that the data in the snapshot is written back into the master disk file;
(V) transmitting the backup image file of the virtual machine to the emergency recovery platform, controlling the emergency recovery platform to write the backup image file into the cache, and judging how the backup image file is to be stored according to the floating plate rule;
(VI) when the virtual machine fails, directly powering on and using the virtual machine from the latest backup image file retained in the emergency recovery area of the floating plate.
Further, the method comprises the steps of setting a backup mirror image writing sequence, dividing a floating plate recovery area, setting a backup strategy and setting a floating plate rule; and checking the backup image retention condition, specifically comprising the following steps:
(1) Setting all backup images to be written into the cache of the emergency recovery platform preferentially;
(2) Dividing the floating plate into an emergency recovery area and a traditional recovery area, without partitioning the space between them in advance;
(3) Setting backup strategies: assigning backup image files or backup image file groups to different backup strategies, each backup strategy setting its own backup frequency and retention period;
(4) Setting the floating plate rule: separately limiting, for the emergency recovery area and the traditional recovery area, the amount of space and the number of retained copies that each backup image file or backup image file group may occupy, wherein the emergency recovery area retains only the most recent few backup images, older backup images are migrated into the traditional recovery area, and when an image file expires its space is released and handed back to the floating plate;
(5) Regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file.
Further, updating and managing the floating plate rule file and the floating plate record file, and checking and accessing the floating plate, specifically comprises the following steps:
(1) Executing backup: the backup management system executes the virtualization backup process;
(2) Comparing the backup image file with the cache: if the backup image file is smaller than or equal to the remaining cache space, performing step (3); otherwise queuing and waiting, and returning to step (1);
(3) Writing the backup image file into the cache;
(4) Floating plate judgment: judging according to the backup storage parameters and the floating plate rule; if COPY2 ≤ COPY1 and SIZE2 ≤ SIZE1, executing step (5); if either condition is not met, executing step (a);
wherein COPY1 is the number of image file copies to be retained in the emergency recovery area, as defined in the floating plate rule file; COPY2 is the number of image file copies already retained in the emergency recovery area, as recorded in the floating plate record file; SIZE1 is the space the emergency recovery area may occupy, as defined in the floating plate rule file; SIZE2 is the space already occupied by the emergency recovery area, as recorded in the floating plate record file;
(a) The oldest image file in the emergency recovery area is to be migrated to the traditional recovery area; executing step (b);
(b) Floating plate judgment: judging according to the backup storage parameters and the floating plate rule; if COPY4 ≤ COPY3 and SIZE4 ≤ SIZE3, executing step (c); if either condition is not met, raising a retention period limit alarm and returning to step (1);
wherein COPY3 is the number of image file copies to be retained in the traditional recovery area, as defined in the floating plate rule file; COPY4 is the number of image file copies already retained in the traditional recovery area, as recorded in the floating plate record file; SIZE3 is the space the traditional recovery area may occupy, as defined in the floating plate rule file; SIZE4 is the space already occupied by the traditional recovery area, as recorded in the floating plate record file;
(c) Judging the floating plate space: comparing the remaining floating plate space with the size of the oldest image file in the emergency recovery area; if the oldest image file is smaller than or equal to the remaining floating plate space, executing step (d); otherwise the floating plate space is insufficient, raising an alarm and returning to step (1);
(d) Dividing the floating plate space: carving a space of the size of the oldest image file in the emergency recovery area out of the floating plate, handing it to the traditional recovery area, and executing step (e);
(e) Migrating the oldest image file in the emergency recovery area to the traditional recovery area in the floating plate, releasing the freed space back to the floating plate, and executing step (5);
(5) Judging the floating plate space: comparing the remaining floating plate space with the size of the image file in the cache; if the image file in the cache is smaller than or equal to the remaining floating plate space, executing step (6); otherwise the floating plate space is insufficient, raising an alarm and returning to step (1);
(6) Dividing the floating plate space: carving a space of the size of the image file in the cache out of the floating plate, handing it to the emergency recovery area, and executing step (7);
(7) Writing the image file in the cache into the emergency recovery area.
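Steps (4) to (7), together with the migration branch (a) to (e), can be sketched as a single placement function. This is a minimal sketch under stated assumptions, not the patented implementation: the names `Area`, `FloatingPlate` and `place` are hypothetical, in-memory lists stand in for the floating plate record file, sizes are plain integers, and "returning to step (1)" is left to the caller, who receives a status string.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    max_copies: int   # COPY1 or COPY3 from the rule file
    max_bytes: int    # SIZE1 or SIZE3 from the rule file
    images: list = field(default_factory=list)   # (name, size) pairs, oldest first

    @property
    def copies(self):   # COPY2 or COPY4 from the record file
        return len(self.images)

    @property
    def used(self):     # SIZE2 or SIZE4 from the record file
        return sum(size for _, size in self.images)

    def within_limits(self):
        # The text compares with "less than or equal to", so the check is made
        # on the state before the new image is counted.
        return self.copies <= self.max_copies and self.used <= self.max_bytes


@dataclass
class FloatingPlate:
    free: int           # remaining, unassigned floating plate space
    emergency: Area
    traditional: Area


def place(plate, name, size):
    """Place a cached image into the emergency recovery area."""
    if not plate.emergency.within_limits():              # step (4) fails, go to (a)
        if not plate.traditional.within_limits():        # step (b)
            return "alarm: retention period limit"
        _, old_size = plate.emergency.images[0]          # oldest emergency image
        if old_size > plate.free:                        # step (c)
            return "alarm: insufficient floating plate space"
        plate.free -= old_size                           # step (d): space for traditional area
        plate.traditional.images.append(plate.emergency.images.pop(0))  # step (e): migrate
        plate.free += old_size                           # step (e): release emergency space
    if size > plate.free:                                # step (5)
        return "alarm: insufficient floating plate space"
    plate.free -= size                                   # step (6)
    plate.emergency.images.append((name, size))          # step (7)
    return "written"


plate = FloatingPlate(free=100, emergency=Area(1, 100), traditional=Area(5, 100))
results = [place(plate, f"backup-{i}", 30) for i in range(1, 5)]
```

In this run the first three images are written, with `backup-1` migrated to the traditional recovery area when the third arrives. The fourth triggers the insufficient-space alarm, because the oldest emergency image (30) no longer fits into the 10 units of floating plate space that remain.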
Further, the floating plate rule in the method comprises: the reserved number of the backup image files in the emergency recovery area and the traditional recovery area in the floating plate record file is less than or equal to the number of the backup image files to be reserved in the floating plate rule file; the occupied space of the emergency recovery area and the occupied space of the traditional recovery area in the floating plate record file are less than or equal to the occupied space set in the floating plate rule file.
Further, when the virtual machine fails, directly opening and using the virtual machine by using the latest backup image file reserved in the emergency recovery area in the floating plate specifically includes:
(1) Positioning a required backup image file, and mounting the backup image file to a virtualization server in an emergency recovery platform matched with a backup source in a Network File System (NFS) read-only mode, so that the backup image file is used as an NFS data source of the virtualization server;
(2) By using a virtualized snapshot mechanism, a snapshot, i.e., a recovery log, is created in production storage to store newly written data, and a backup image file in the emergency recovery platform is kept unchanged as a master file of the virtual machine, i.e., the virtual machine can be opened.
Further, the method also comprises: copying the backup image file to production storage by using the virtualization platform's online storage migration technology; after the migration is finished, unmounting the NFS share that holds the backup image file and synchronizing the newly written data in the recovery log to the virtual machine master disk in production storage, which completes solidification.
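The relationship between the read-only backup image, the recovery log and solidification can be modelled with a small copy-on-write sketch. Everything here is illustrative: the `EmergencyDisk` class and its methods are hypothetical, a dictionary of block numbers stands in for a disk image, and a plain copy stands in for online storage migration.

```python
class EmergencyDisk:
    """Read-only backup image plus a recovery log of newly written blocks."""

    def __init__(self, backup_image):
        self._base = backup_image   # master file on the recovery platform, never modified
        self.recovery_log = {}      # block number -> data, kept in production storage

    def write(self, block, data):
        # New writes go to the recovery log only; the backup image stays unchanged.
        self.recovery_log[block] = data

    def read(self, block):
        # Prefer the newest data in the recovery log, fall back to the backup image.
        if block in self.recovery_log:
            return self.recovery_log[block]
        return self._base.get(block)

    def solidify(self):
        """Copy the backup image to production storage, then apply the recovery log."""
        production_disk = dict(self._base)           # stands in for storage migration
        production_disk.update(self.recovery_log)    # synchronize newly written data
        return production_disk


backup = {0: b"boot", 1: b"data"}
disk = EmergencyDisk(backup)
disk.write(1, b"new-data")      # written while running from the backup image
master = disk.solidify()
```

After `solidify()`, `master` holds the original boot block together with the newly written data, while `backup` is untouched and remains available as a historical copy.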
In a third aspect, an embodiment of the present invention provides a backup image management device that starts a virtual machine without restoring after backup, where the backup image management device includes: at least one processor, at least one memory, and computer program instructions stored in the memory which, when executed by the processor, implement the method of the second aspect as in the embodiments described above.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of the second aspect as in the above embodiments.
The invention has the following beneficial effects. The virtual machine emergency recovery platform contains an emergency recovery area and a traditional backup recovery area that coexist, and the floating plate mechanism lets the space occupied by the two areas balance dynamically and automatically, so that resource allocation adjusts flexibly to actual demand and excessive use of resources is avoided. The method inherits the high timeliness of guaranteeing service continuity through the high availability of the virtualization platform, and makes up for the poor timeliness of guaranteeing service continuity through backup and recovery. It can also recover multiple historical point-in-time copies of the virtual machine, so it is a virtual machine protection means that balances timeliness, performance and functionality. It solves the problem of starting and using a virtual machine in an emergency directly after backup, without recovery: the virtual machine is started from the backed-up image, the recovery process is avoided, and the service recovery time of the virtual machine under extreme conditions is greatly shortened.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 shows a schematic diagram of a backup image management system;
FIG. 2 illustrates a backup image diagram of a virtual machine being started without recovery after backup;
FIG. 3 shows a float plate management flow diagram;
FIG. 4 illustrates a float plate management flow diagram;
FIG. 5 illustrates a float plate management implementation;
FIG. 6 illustrates a backup image migration diagram;
FIG. 7 illustrates a backup image migration diagram;
FIG. 8 illustrates a new backup image write diagram;
FIG. 9 illustrates mounting of a backup image from the emergency recovery area;
FIG. 10 illustrates opening a virtual machine by means of a snapshot;
FIG. 11 illustrates migration and solidification of a virtual machine;
FIG. 12 illustrates an image management device implementation.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
A specific embodiment of a backup image management system that starts a virtual machine without recovery after backup is provided below. As shown in fig. 1, the system operates on an emergency recovery platform and includes a storage space dividing module, a backup storage parameter and floating plate rule setting module, a floating plate management module, a virtual machine backup module, a backup mirror image writing module, and a backup virtual machine emergency starting module,
the storage space dividing module is used for dividing the storage space of the emergency recovery platform into a cache and a floating plate;
the backup storage parameter and floating plate rule setting module is used for setting the writing sequence of the backup mirror image, dividing the recovery area of the floating plate, setting the backup strategy and setting the floating plate rule and checking the retention condition of the backup mirror image;
the floating plate management module is used for updating the rules of the floating plate, recording files of the floating plate, checking the floating plate and accessing the floating plate;
the virtual machine backup module is used for generating a snapshot during backup by using a virtualized snapshot backup interface and storing newly written data, a master disc file of the virtual machine is kept unchanged, and the snapshot deletion is executed after the backup is finished, so that the data in the snapshot is written back into the master disc file;
the backup image writing module is used for transmitting the backup image file of the virtual machine to the emergency recovery platform, controlling the emergency recovery platform to write the backup image file into the cache and judging the backup image file to be stored according to the floating plate rule;
the backup virtual machine emergency starting module is used for directly opening the virtual machine by using the latest backup image file reserved in the emergency recovery area in the floating plate and immediately using the virtual machine when the virtual machine fails;
the emergency recovery platform utilizes a storage space dividing module to divide the storage space on the emergency recovery platform, and utilizes a virtual machine backup module to generate a virtual machine snapshot; the emergency recovery platform writes the snapshot generated by the virtual machine backup module into a storage space by using a backup mirror image writing module; when the virtual machine fails, the emergency recovery platform directly opens and uses the latest backup image file reserved in the storage space through the backup virtual machine emergency starting module.
Further, the backup storage parameter and floating plate rule setting module specifically includes:
a backup image writing submodule: used for setting all backup image files to be written into the cache space first;
a floating plate dividing submodule: used for dividing the floating plate into an emergency recovery area and a traditional recovery area without dividing the space between them in advance;
a backup strategy setting submodule: used for assigning backup image files or backup image file groups to different backup strategies, where each backup strategy sets its own backup frequency and retention period;
a floating plate rule setting submodule: used for limiting, separately for the emergency recovery area and the traditional recovery area, the space that each backup image file or backup image file group may occupy and the number of copies it may retain; the emergency recovery area retains only the most recent few backup image files, historical backup image files are migrated into the traditional recovery area, and when a backup image file expires its space is released and returned to the floating plate;
a backup image file checking submodule: used for regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file.
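The periodic retention check performed by the backup image file checking submodule can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `BackupPolicy` and `BackupImage` types, their field names, and the use of gigabytes are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupPolicy:
    frequency: timedelta   # how often a backup is taken (hypothetical field)
    retention: timedelta   # how long a backup image is kept (hypothetical field)


@dataclass(frozen=True)
class BackupImage:
    name: str
    size_gb: int
    created: datetime
    policy: BackupPolicy


def sweep_expired(images, now):
    """Split images into (kept, expired) and report the space freed by expiry."""
    kept, expired = [], []
    for image in images:
        reached = now - image.created >= image.policy.retention
        (expired if reached else kept).append(image)
    freed_gb = sum(image.size_gb for image in expired)
    return kept, expired, freed_gb
```

In this sketch the freed space is only reported; returning it to the floating plate is left to the caller.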
Further, the floating plate management module specifically includes:
a backup execution submodule: the system is used for realizing the execution of a virtualization backup process by a backup management system;
backup image file judgment submodule: the method is used for judging the sizes of the backup image file and the residual cache space: if the backup image file is smaller than or equal to the residual size of the cache, writing the backup image file into the cache, otherwise, queuing for waiting;
backup image file cache submodule: the backup mirror image file is written into the cache;
a floating plate judgment submodule: used for judging according to the backup storage parameters and the floating plate rule. If the number of backup image file copies COPY2 retained in the emergency recovery area, as recorded in the floating plate record file, is less than or equal to the number of copies COPY1 to be retained in the emergency recovery area, as defined in the floating plate rule file, and the occupied space SIZE2 of the emergency recovery area recorded in the floating plate record file is less than or equal to the occupiable space SIZE1 of the emergency recovery area defined in the floating plate rule file, the remaining size of the floating plate space is compared with the size of the backup image file in the cache. If the backup image file in the cache is smaller than or equal to the remaining floating plate space, a space of that size is divided from the floating plate and handed to the emergency recovery area, and the backup image file in the cache is written into the emergency recovery area; if it is larger, the remaining floating plate space is insufficient and an insufficient-space alarm is given;
if either of COPY2 less than or equal to COPY1 and SIZE2 less than or equal to SIZE1 does not hold, the oldest image file in the emergency recovery area is to be migrated to the traditional recovery area, and a judgment is again made according to the backup storage parameters and the floating plate rule. If the number of backup image file copies COPY4 retained in the traditional recovery area, as recorded in the floating plate record file, is less than or equal to the number of copies COPY3 to be retained in the traditional recovery area, as defined in the floating plate rule file, and the occupied space SIZE4 of the traditional recovery area recorded in the floating plate record file is less than or equal to the occupiable space SIZE3 of the traditional recovery area defined in the floating plate rule file, the remaining size of the floating plate space is compared with the size of the oldest image file in the emergency recovery area. If the oldest image file is smaller than or equal to the remaining floating plate space, a space of that size is divided from the floating plate and handed to the traditional recovery area, the oldest image file in the emergency recovery area is migrated into it, and the space it occupied in the emergency recovery area is released back to the floating plate; otherwise the remaining floating plate space is insufficient and an alarm is given;
if either of COPY4 less than or equal to COPY3 and SIZE4 less than or equal to SIZE3 does not hold, a retention period limit alarm is given;
a mirror migration submodule: the space used for dividing the size of the oldest mirror image file in the emergency recovery area from the space of the floating plate is delivered to the traditional recovery area, the oldest mirror image file in the emergency recovery area is transferred to the traditional recovery area in the floating plate, and the space is released to the space of the floating plate;
the floating plate space size judgment submodule: the device is used for comparing the residual size of the space of the floating plate with the size of the mirror image file in the cache, and if the size of the mirror image file in the cache is smaller than or equal to the size of the residual space of the floating plate, the space of the mirror image size in the cache is divided in the floating plate and is delivered to an emergency recovery area;
an image file writing submodule: used for writing the image file in the cache into the emergency recovery area.
Further, the backup virtual machine emergency starting module in the system is specifically configured to:
locating the required backup image file, and mounting the backup image file in the emergency recovery platform, in Network File System (NFS) read-only mode, to a virtualization server that matches the backup source, so that the backup image file is used as an NFS data source of the virtualization server;
by using a virtualized snapshot mechanism, a snapshot, i.e., a recovery log, is created in production storage to store newly written data, and a backup image file in the emergency recovery platform is kept unchanged as a master file of the virtual machine, i.e., the virtual machine can be opened.
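The read-only NFS mount described above can be illustrated with a small helper that only builds the command line. This is a hedged sketch: it assumes a Linux host and the standard `mount -t nfs -o ro` form, whereas a hypervisor would normally attach an NFS data source through its own tooling; the function name, server name, and paths are hypothetical.

```python
import shlex


def nfs_readonly_mount_command(server, export_path, mount_point):
    """Build (but do not run) a read-only NFS mount command for a Linux host."""
    source = f"{server}:{export_path}"
    # "-o ro" makes the mount read-only, so the backup image cannot be modified.
    return ["mount", "-t", "nfs", "-o", "ro", source, mount_point]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cmd = nfs_readonly_mount_command(
        "recovery-platform", "/emergency/vmx", "/mnt/vmx-backup"
    )
    print(shlex.join(cmd))
```

Keeping the mount read-only is what lets the backup image stay unchanged while all new writes go to the recovery log in production storage.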
The following provides a specific embodiment of a backup image management method for starting a virtual machine without recovery after backup, which includes the following steps:
the storage space of the whole emergency recovery platform is divided into a cache and a floating plate.
Setting a backup mirror image writing sequence, dividing a floating plate recovery area, setting a backup strategy and setting a floating plate rule; and checks for backup image retention.
Updating and managing a floating plate rule file and a floating plate record file, and checking and accessing the floating plate.
And generating a snapshot during backup by using a virtualized snapshot backup interface to store newly written data, keeping the master disk file of the virtual machine unchanged, executing snapshot deletion after the backup is finished, and writing the data in the snapshot back into the master disk file.
And transmitting the backup image file in the virtual machine to an emergency recovery platform, and writing the backup image file into a cache by the emergency recovery platform and judging how the backup image file is stored according to the floating plate rule.
When the virtual machine fails, the virtual machine is directly opened for use by using the latest backup image file reserved in the emergency recovery area in the floating plate.
As shown in fig. 2, the storage space of the entire emergency recovery platform is divided into two parts, a cache and a floating plate. All backup images are written into the cache first, and floating plate management divides the floating plate into two functional areas, an emergency recovery area and a conventional recovery area, so that the entire storage space is used in a balanced way. The emergency recovery area serves as the storage space of the virtual machine when it is brought up in an emergency, so no extra storage space needs to be prepared and recovery preparation time is shortened. The conventional recovery area preserves historical copies of the virtual machine for recovery under non-emergency requirements (such as historical data queries). After the backup of the virtual machine is completed, the stored backup image is mounted directly to the virtualization platform to serve as a master disc, which allows the virtual machine to be started directly and avoids the recovery process; after startup, the virtual machine can be solidified to production storage by using the virtualized storage migration function. The floating plate rule automatically and dynamically balances the space between the emergency recovery area and the conventional recovery area.
The floating plate management process is shown in fig. 3: the floating plate rule is formulated by backup management, the floating plate check compares the floating plate rule with the space and number of copies in use, and after the judgment either floating plate space is divided or feedback is given to backup management. The process comprises: setting all backup image files to be written into the cache space first; dividing the floating plate into an emergency recovery area and a traditional recovery area without dividing the space between them in advance; assigning backup image files or backup image file groups to different backup strategies, where each backup strategy sets its own backup frequency and retention period; limiting the space occupied and the number of copies retained by each backup image file or backup image file group in the emergency recovery area and in the traditional recovery area, where the emergency recovery area retains only the most recent few backup image files and historical backup image files are migrated to the traditional recovery area; releasing the space of a backup image file after it expires and returning that space to the floating plate; and regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file, feeding back information, and giving an alarm.
Specifically, an administrator writes a floating plate rule file according to actual requirements and uses it to manage the usage rules of the floating plate: the file defines the space capacity and the number of files that a given backup image file or backup image file group may occupy in the emergency recovery area and in the traditional recovery area respectively. Backup images in the conventional recovery area are not limited in number, only in space capacity, and their expiration clearing time is limited by the retention period of the backup strategy. After a backup image is stored, its storage information is recorded in a floating plate record file, which records the space capacity and the number of file copies that each backup image file or backup image file group currently uses in the emergency recovery area and the conventional recovery area. The rule file and the record file are stored separately, so that the floating plate rule is easy to read or change, and the management difficulty and misoperation risk that would come from frequently writing the rule file and manually maintaining the record file are avoided. All changes in usage and all changes to the floating plate space are registered in the floating plate record file. When a backup image is written, the emergency recovery platform performs a floating plate check, comparing the floating plate record file with the floating plate rule file to decide how the backup image is stored in the backup storage. Floating plate access: floating plate space is handed to the required area as needed.
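The separation between the rule file and the record file can be pictured with the sketch below. The patent does not specify a file format, so the JSON layout, the key names, and the `floating_plate_check` function are assumptions made for illustration; the numbers are those of the VMx example in this description.

```python
import json

# Hypothetical file contents; a null copy limit means "not limited".
RULE_FILE = """
{"VMx": {"emergency":    {"copies": 2,    "space_tb": 1.5},
         "conventional": {"copies": null, "space_tb": 3.0}}}
"""
RECORD_FILE = """
{"VMx": {"emergency":    {"copies": 2, "space_tb": 0.9},
         "conventional": {"copies": 4, "space_tb": 1.9}}}
"""


def floating_plate_check(rules, record, vm, area):
    """Compare the recorded usage of one area with its rule limits."""
    limit, used = rules[vm][area], record[vm][area]
    copies_ok = limit["copies"] is None or used["copies"] <= limit["copies"]
    space_ok = used["space_tb"] <= limit["space_tb"]
    return {"copies_ok": copies_ok, "space_ok": space_ok}


rules = json.loads(RULE_FILE)     # written by the administrator, rarely changed
record = json.loads(RECORD_FILE)  # maintained by the platform on every change

for area in ("emergency", "conventional"):
    print(area, floating_plate_check(rules, record, "VMx", area))
```

The check only reads both files, which reflects the design choice in the text: rules are edited by people, records are updated by the platform, and neither writer touches the other's file.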
Specifically, the floating plate management includes the following steps, as shown in fig. 4:
(1) Executing backup: the backup management system executes a virtualization backup process;
(2) Judging the sizes of the backup image file and the cache: if the backup image file is smaller than or equal to the size of the cache, performing the step (3); otherwise, queuing and waiting, and returning to the step (1);
(3) Writing the backup image file into a cache: the backup mirror image file is written into the cache;
(4) Floating plate judgment: judging according to the backup storage parameters and the floating plate rule; if COPY2 is not more than COPY1 and SIZE2 is not more than SIZE1, executing the step (5), and if either condition is not met, executing the step (a);
wherein, COPY1 is the number of mirror image files to be reserved in the emergency recovery area defined in the floating plate rule file; COPY2 is the number of image files reserved in the emergency recovery area recorded in the floating plate record file; SIZE1 is the SIZE of the space which can be occupied by the emergency recovery area defined in the floating plate rule file; SIZE2 is the SIZE of the occupied space of the emergency recovery area recorded in the floating plate recording file;
(a) Migrating the oldest mirror image file in the emergency recovery area to the traditional recovery area, and executing the step (b);
(b) Floating plate judgment: judging according to the backup storage parameters and the floating plate rule; if COPY4 is not more than COPY3 and SIZE4 is not more than SIZE3, executing the step (c), and if either condition is not met, giving a retention period limit alarm and returning to the step (1);
wherein, COPY3 is the number of image files that should be reserved in the conventional recovery area defined in the floating plate rule file; COPY4 is the number of image file copies that have been reserved in the conventional recovery area recorded in the float record file; SIZE3 is the SIZE of the space that can be occupied by the conventional recovery area defined in the float rule file; SIZE4 is the SIZE of the space occupied by the conventional recovery area recorded in the floating plate recording file;
(c) Judging the space size of the floating plate: comparing the remaining size of the floating plate space with the size of the oldest image file of the emergency recovery area; if the oldest image file of the emergency recovery area is smaller than or equal to the remaining size of the floating plate space, executing the step (d); otherwise the remaining floating plate space is insufficient, giving an alarm, and returning to the step (1);
(d) Dividing the space of the floating plate: dividing a space with the size of the oldest mirror image file of the emergency recovery area in the floating plate, handing the space to the traditional recovery area, and continuing to execute the step (e);
(e) Transferring the oldest mirror image file in the emergency recovery area to a traditional recovery area in the floating plate, releasing the space to the space of the floating plate, and continuously executing the step (5);
(5) Judging the size of the space of the floating plate: comparing the remaining size of the floating plate space with the size of the image file in the cache; if the image file in the cache is smaller than or equal to the remaining size of the floating plate space, executing the step (6); otherwise the remaining floating plate space is insufficient, giving an alarm, and returning to the step (1);
(6) Dividing a floating plate space: dividing a space with the size of the mirror image file in the cache in the floating plate, delivering the space to an emergency recovery area, and continuously executing the step (7);
(7) Writing the image file in the cache into the emergency recovery area.
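Steps (4) to (7) and (a) to (e) above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the patented implementation: the class and method names are hypothetical, sizes are held as whole gigabytes to avoid rounding, and the limit test is applied to the state after the pending write (copies + 1, occupied + size), which matches the worked VMx example, whereas the step text compares the current recorded values.

```python
from dataclasses import dataclass, field


class FloatingPlateAlarm(Exception):
    """Raised wherever the flow says to give an alarm."""


@dataclass
class Area:
    max_copies: float                 # COPY1 / COPY3; float("inf") means unlimited
    max_gb: int                       # SIZE1 / SIZE3
    images: list = field(default_factory=list)   # (name, size_gb), oldest first

    @property
    def used_gb(self):                # SIZE2 / SIZE4
        return sum(size for _, size in self.images)

    def fits(self, size_gb):
        # Assumption: limits are tested against the state after the pending write.
        return (len(self.images) + 1 <= self.max_copies
                and self.used_gb + size_gb <= self.max_gb)


@dataclass
class FloatingPlate:
    free_gb: int                      # unallocated floating plate space
    emergency: Area
    conventional: Area

    def _hand_over(self, area, image):
        # Steps (c)/(d) and (5)/(6): carve image-sized space out of the plate.
        if image[1] > self.free_gb:
            raise FloatingPlateAlarm("floating plate space insufficient")
        self.free_gb -= image[1]
        area.images.append(image)

    def store(self, image):
        """Steps (4)-(7) for an image that is already in the cache."""
        if not self.emergency.fits(image[1]):                 # step (4) fails
            if not self.emergency.images:
                raise FloatingPlateAlarm("image exceeds emergency area rule")
            oldest = self.emergency.images[0]                 # step (a)
            if not self.conventional.fits(oldest[1]):         # step (b)
                raise FloatingPlateAlarm("retention period limit")
            self._hand_over(self.conventional, oldest)        # steps (c), (d)
            self.emergency.images.pop(0)                      # step (e)
            self.free_gb += oldest[1]                         # space returns to the plate
        self._hand_over(self.emergency, image)                # steps (5)-(7)
```

As in the flow itself, at most one image is migrated per write; an implementation that must free more space than one image provides would need to repeat steps (a) to (e).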
The floating plate management mechanism of the invention can automatically complete the allocation of the storage space as required in the emergency recovery area and the traditional recovery area of the floating plate, the utilization rate of the storage space is higher, the use mode is more flexible, and the excessive use of resources by a single virtual machine or a virtual machine set is avoided. The specific implementation mechanism of the floating plate management is shown in fig. 5, the upper layer is a floating plate rule file and a floating plate record file, and the lower layer is a specific mirror image. Arrows indicate comparisons within the content of the connection.
For example, consider the comparison for the emergency recovery area. There are currently two images, c-5 and c-6, in the emergency recovery area, so the copy count in the floating plate record file is 2 and the total size of the two images is 0.5 + 0.4 = 0.9 TB (XX represents the data of the specific backup). The floating plate rule file defines copy = 2 and size = 1.5, so both recorded values are smaller than or equal to the values in the floating plate rule file.
Starting from the state of FIG. 5, when a new backup image c-7 is written, it is first placed in the cache. According to the floating plate rule and mechanism, a space of the same size as c-5 is divided from the floating plate space and handed to the traditional recovery area, and the oldest backup image c-5 in the emergency recovery area is migrated to the traditional recovery area. Because c-5 has been migrated, the space it occupied in the emergency recovery area is released and returned to the floating plate, and the floating plate record file is updated. The backup image migration process is shown in fig. 6; the backup image migration of fig. 7 builds on the state illustrated in fig. 6.
As shown in fig. 8, once the new backup image c-7 has been stored in the cache, a space of the same size as c-7 is divided from the floating plate space and handed to the emergency recovery area according to the lifecycle setting of the backup storage and the floating plate rule, the backup image c-7 in the cache is written into the emergency recovery area, and the floating plate record file is updated. In short, the new backup image is written into the emergency recovery area, and the oldest backup image in the emergency recovery area is migrated to the conventional recovery area. The specific floating plate settings and implementation are as follows:
Illustration of the floating plate rule file:
VM/VM group | Area | Copies to retain | Occupiable space (TB)
VMx | Emergency recovery area | 2 | 1.5
VMx | Conventional recovery area | Unlimited | 3
Illustration of the floating plate record file:
VM/VM group | Area | Copies retained | Occupied space (TB)
VMx | Emergency recovery area | 2 | 0.9
VMx | Conventional recovery area | 4 | 1.9
The emergency recovery platform executes the floating plate inspection:
(a1) The emergency recovery area already retains 2 copies, the number allowed by the rule, so its oldest backup image needs to be migrated to the traditional recovery area;
(b1) The size of the oldest backup image c-5 in the emergency recovery area is checked (0.5 TB in this example); none of the backup images in the conventional recovery area has reached the retention period defined by the backup policy, so none is released, and the conventional recovery area occupies 1.9 TB;
(c1) The floating plate check determines that the traditional recovery area needs [1.9 (occupied) + 0.5 (variable)] TB < 3 TB (rule limit), so a variable space of 0.5 TB is divided from the floating plate to the traditional recovery area and the image migration is completed; if instead occupied + variable > rule limit, the migration of the backup image is blocked and an alarm is fed back to backup management;
(d1) After the image migration is completed, the emergency recovery area returns the vacated 0.5 TB to the floating plate, as shown in FIG. 6, and the floating plate record file changes as follows:
[Table image: floating plate record file after the image migration]
(e1) The floating plate check determines that the emergency recovery area needs [0.4 (occupied) + 0.7 (variable)] TB < 1.5 TB (rule limit), so a variable space of 0.7 TB is divided from the floating plate to the emergency recovery area and the image writing is completed, as shown in FIG. 8; if instead occupied + variable > rule limit, the writing of the backup image is blocked and an alarm is fed back to backup management;
(f1) After the image writing is completed, the floating plate record file changes as follows:
[Table image: floating plate record file after the image writing]
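The two space checks in steps (c1) and (e1) reduce to the following arithmetic, using only the figures given in the example (occupied space plus the size of the incoming image, compared against the rule limit):

```latex
\begin{align*}
\text{conventional recovery area:}\quad & 1.9\,\text{TB} + 0.5\,\text{TB} = 2.4\,\text{TB} < 3\,\text{TB} \\
\text{emergency recovery area:}\quad & 0.4\,\text{TB} + 0.7\,\text{TB} = 1.1\,\text{TB} < 1.5\,\text{TB}
\end{align*}
```

On these figures, the record file would be expected to show 5 copies occupying 2.4 TB in the conventional recovery area and 2 copies occupying 1.1 TB in the emergency recovery area once both steps have completed.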
When the virtual machine fails, it can be opened directly and used immediately from the latest backup image retained in the emergency recovery area.
With this technique of starting and using a virtual machine directly in an emergency after it has been backed up, the virtual machine can be started from the backup image without recovery, so the recovery process is avoided and the service recovery time of the virtual machine under extreme conditions is greatly shortened. As shown in fig. 9-10, this is accomplished as follows:
(A1) Locating a required backup image (backup image 6), mounting the image to a virtualization server matched with a backup source in an NFS read-only manner, so that the backup image is used as a storage space and a virtual file of the virtualization server, as shown in fig. 9;
(A2) By using a virtualized snapshot mechanism, a snapshot (recovery log) is created in the production storage to store the newly written data, and the backup image in the emergency recovery platform remains unchanged as a master file of the virtual machine, i.e., the virtual machine can be opened, as shown in fig. 10.
(A3) Copying the backup image to the production storage by using a virtualized online storage migration technology, unloading the NFS of the backup image after migration is completed, synchronizing the newly written data in the recovery log to the virtual machine master in the production storage, and completing solidification, as shown in fig. 11.
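Steps (A1) to (A3) follow a base-plus-overlay pattern, which the toy model below illustrates. It is a sketch under loud assumptions: a disk is reduced to a dictionary of blocks, `EmergencyDisk` and its methods are hypothetical names, and real hypervisors perform these operations on disk files rather than in memory.

```python
from types import MappingProxyType


class EmergencyDisk:
    """Virtual disk whose base is a read-only backup image plus a recovery log."""

    def __init__(self, backup_image):
        # (A1) The backup image is exposed read-only, like the NFS read-only mount.
        self._base = MappingProxyType(backup_image)
        # (A2) Newly written data goes to the recovery log in production storage.
        self.recovery_log = {}

    def read(self, block):
        # Prefer data written since start-up, else fall back to the backup image.
        if block in self.recovery_log:
            return self.recovery_log[block]
        return self._base[block]

    def write(self, block, data):
        self.recovery_log[block] = data

    def solidify(self):
        """(A3) Copy the base to 'production storage' and replay the recovery log."""
        production = dict(self._base)
        production.update(self.recovery_log)
        return production
```

Because writes never touch the base, the backup image remains a valid historical copy for as long as the virtual machine runs from it.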
Fig. 12 is a schematic diagram illustrating a hardware structure of an image management apparatus according to an embodiment of the present invention.
The image management device may include a processor 401 and a memory 402 storing computer program instructions.
Specifically, the processor 401 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more Integrated circuits implementing embodiments of the present invention.
Memory 402 may include mass storage for data or instructions. By way of example, and not limitation, memory 402 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 402 may include removable or non-removable (or fixed) media, where appropriate. The memory 402 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 402 is a non-volatile solid-state memory. In a particular embodiment, the memory 402 includes Read Only Memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 401 may implement any of the backup image management methods in the above embodiments by reading and executing computer program instructions stored in the memory 402.
In one example, the image management device may also include a communication interface 403 and a bus 410. As shown in fig. 12, the processor 401, the memory 402, and the communication interface 403 are connected by a bus 410 to complete communication therebetween.
The communication interface 403 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present invention.
Bus 410 includes hardware, software, or both to couple the components of the image management device to each other. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus or a combination of two or more of these. Bus 410 may include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.
The image management apparatus may execute the image management method in the embodiment of the present invention, thereby implementing the image management method described in conjunction with fig. 4.
In addition, in combination with the image management method in the foregoing embodiments, the embodiments of the present invention may be implemented by providing a computer-readable storage medium. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the image management methods of the above embodiments.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments can be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include an electronic circuit, a semiconductor memory device, a ROM, a flash memory, an Erasable ROM (EROM), a floppy disk, a CD-ROM, an optical disk, a hard disk, an optical fiber medium, a Radio Frequency (RF) link, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments noted in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
The above are only specific embodiments of the present invention. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. It should be understood that the scope of the present invention is not limited thereto; any person skilled in the art can readily conceive of equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered by the scope of the present invention.

Claims (12)

1. A backup image management system for starting a virtual machine after backup without restoration, characterized in that: the system runs on an emergency recovery platform and comprises a storage space dividing module, a backup storage parameter and elastic resource cache space rule setting module, an elastic resource cache space management module, a virtual machine backup module, a backup image writing module and a backup virtual machine emergency starting module, wherein,
the storage space dividing module is used for dividing the storage space of the emergency recovery platform into a cache space and an elastic resource cache space;
the backup storage parameter and elastic resource cache space rule setting module is used for setting a backup image writing order, dividing the elastic resource cache space into recovery areas, setting a backup strategy, setting an elastic resource cache space rule and checking the backup image retention status;
the elastic resource cache space management module is used for updating the elastic resource cache space rule and the elastic resource cache space record file, and for elastic resource cache space checking and elastic resource cache space access;
the virtual machine backup module is used for generating a snapshot during backup through a virtualized snapshot backup interface, the snapshot storing newly written data while the master disk file of the virtual machine is kept unchanged, and for deleting the snapshot after the backup is finished, so that the data in the snapshot is written back into the master disk file;
the backup image writing module is used for transmitting the backup image file of the virtual machine to the emergency recovery platform, controlling the emergency recovery platform to write the backup image file into a cache, and determining how the backup image file is stored according to the elastic resource cache space rule;
the backup virtual machine emergency starting module is used for, when the virtual machine fails, directly starting and immediately using the virtual machine from the latest backup image file reserved in the emergency recovery area of the elastic resource cache space;
the emergency recovery platform uses the storage space dividing module to divide the storage space on the emergency recovery platform, and uses the virtual machine backup module to generate a virtual machine snapshot; the emergency recovery platform writes the snapshot generated by the virtual machine backup module into the storage space by using the backup image writing module; when the virtual machine fails, the emergency recovery platform directly opens and uses the latest backup image file reserved in the storage space through the backup virtual machine emergency starting module.
2. The system of claim 1, wherein: the backup storage parameter and elastic resource cache space rule setting module specifically comprises:
a backup image writing order submodule: used for setting all backup image files to be written into the cache space first;
an elastic resource cache space division submodule: used for dividing the elastic resource cache space into an emergency recovery area and a traditional recovery area, without allocating space to either area in advance;
a backup strategy setting submodule: used for assigning backup image files or backup image file groups to different backup strategies, where different backup strategies set different backup frequencies and retention periods for the backup images;
an elastic resource cache space rule setting submodule: used for separately limiting the size of the space occupied and the number of copies reserved by each backup image file or backup image file group in the emergency recovery area and in the traditional recovery area, where the emergency recovery area reserves only the most recent several backup image files, historical backup image files are migrated into the traditional recovery area, and after a backup image file expires its space is released and handed back to the elastic resource cache space;
a backup image file checking submodule: used for regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file.
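The backup strategy and the periodic retention check in claim 2 can be illustrated with a minimal sketch. Everything named here (`BackupPolicy`, `BackupImage`, `purge_expired`, the gigabyte sizes and the dates) is a hypothetical illustration and does not come from the patent; the sketch only assumes that each image carries its creation time and the retention period of its strategy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical strategy record: backup frequency and retention period."""
    name: str
    frequency: timedelta   # interval between backups
    retention: timedelta   # how long each backup image is kept


@dataclass(frozen=True)
class BackupImage:
    """Hypothetical record for one stored backup image file."""
    vm: str
    created: datetime
    size_gb: int
    policy: BackupPolicy


def purge_expired(images, now):
    """Split images into (kept, expired) using each image's retention period."""
    kept, expired = [], []
    for image in images:
        if now - image.created >= image.policy.retention:
            expired.append(image)   # retention time reached: clear the image
        else:
            kept.append(image)
    return kept, expired


daily = BackupPolicy("daily", timedelta(days=1), timedelta(days=7))
now = datetime(2018, 1, 10)
images = [
    BackupImage("vm-a", datetime(2018, 1, 1), 40, daily),  # 9 days old
    BackupImage("vm-a", datetime(2018, 1, 8), 42, daily),  # 2 days old
]
kept, expired = purge_expired(images, now)
# Space that would be handed back to the elastic resource cache space.
freed_gb = sum(image.size_gb for image in expired)
```

In this sketch the space of the expired images is simply summed; a real platform would return it to the shared elastic pool so that either recovery area can draw on it again.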
3. The system of claim 1, wherein the elastic resource cache space management module specifically comprises:
a backup execution submodule: used for having the backup management system execute the virtualization backup process;
a backup image file judgment submodule: used for comparing the size of the backup image file with the remaining cache space: if the backup image file is smaller than or equal to the remaining size of the cache, the backup image file is written into the cache; otherwise, it queues and waits;
a backup image file cache submodule: used for writing the backup image file into the cache;
an elastic resource cache space judgment submodule: used for judging according to the backup storage parameters and the elastic resource cache space rule; if the number of backup image file copies reserved in the emergency recovery area, COPY2, as recorded in the elastic resource cache space record file, is less than or equal to the number of backup image file copies to be reserved in the emergency recovery area, COPY1, as defined in the elastic resource cache space rule file, and the occupied space of the emergency recovery area, SIZE2, as recorded in the elastic resource cache space record file, is less than or equal to the space that the emergency recovery area may occupy, SIZE1, as defined in the elastic resource cache space rule file, then the remaining size of the elastic resource cache space is compared with the size of the backup image file in the cache; if the size of the backup image file in the cache is less than or equal to the remaining size of the elastic resource cache space, a space of the size of the backup image file in the cache is carved out of the elastic resource cache space and handed to the emergency recovery area, and the backup image file in the cache is written into the emergency recovery area; if the size of the backup image file in the cache is greater than the remaining size, this indicates that the elastic resource cache space is insufficient;
if either of the conditions COPY2 ≤ COPY1 and SIZE2 ≤ SIZE1 is not satisfied, the oldest image file in the emergency recovery area is to be migrated to the traditional recovery area, and a judgment is made according to the backup storage parameters and the elastic resource cache space rule; if the number of backup image file copies reserved in the traditional recovery area, COPY4, as recorded in the elastic resource cache space record file, is less than or equal to the number of backup image file copies to be reserved in the traditional recovery area, COPY3, as defined in the elastic resource cache space rule file, and the occupied space of the traditional recovery area, SIZE4, as recorded in the elastic resource cache space record file, is less than or equal to the space that the traditional recovery area may occupy, SIZE3, as defined in the elastic resource cache space rule file, then the remaining size of the elastic resource cache space is compared with the size of the oldest image file in the emergency recovery area; if the size of the oldest image file in the emergency recovery area is less than or equal to the remaining size of the elastic resource cache space, a space of the size of that oldest image file is carved out of the elastic resource cache space and handed to the traditional recovery area, the oldest image file is migrated from the emergency recovery area to the traditional recovery area, and the space it occupied is released back to the elastic resource cache space;
if either of the conditions COPY4 ≤ COPY3 and SIZE4 ≤ SIZE3 is not satisfied, a retention period limit alarm is raised;
an image migration submodule: used for carving a space of the size of the oldest image file in the emergency recovery area out of the elastic resource cache space and handing it to the traditional recovery area, migrating the oldest image file in the emergency recovery area to the traditional recovery area within the elastic resource cache space, and releasing the space it occupied back to the elastic resource cache space;
an elastic resource cache space size judgment submodule: used for comparing the remaining size of the elastic resource cache space with the size of the image file in the cache and, if the size of the image file in the cache is smaller than or equal to the remaining size of the elastic resource cache space, carving a space of the size of the image file in the cache out of the elastic resource cache space and handing it to the emergency recovery area;
an image file writing submodule: used for writing the image file in the cache into the emergency recovery area.
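The cache admission test of the backup image file judgment submodule (write into the cache if the image fits in the remaining cache, otherwise queue and wait) might look like the sketch below. The class `StagingCache`, its method names and the gigabyte figures are hypothetical. The claim does not say when queued images are retried, so the sketch assumes they are admitted in arrival order once an earlier image has left the cache.

```python
from collections import deque


class StagingCache:
    """Hypothetical staging cache with a fixed capacity, tracked in GB."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.used_gb = 0
        self.staged = {}          # image name -> size currently held in the cache
        self.waiting = deque()    # (name, size) pairs queued until space frees up

    @property
    def remaining_gb(self):
        return self.capacity_gb - self.used_gb

    def submit(self, name, size_gb):
        """Stage the image if it fits in the remaining cache, otherwise queue it."""
        if size_gb <= self.remaining_gb:
            self.staged[name] = size_gb
            self.used_gb += size_gb
            return True
        self.waiting.append((name, size_gb))
        return False

    def flush(self, name):
        """Remove an image that has been written to a recovery area, then admit
        queued images in arrival order while they still fit (an assumption)."""
        self.used_gb -= self.staged.pop(name)
        while self.waiting and self.waiting[0][1] <= self.remaining_gb:
            queued_name, queued_size = self.waiting.popleft()
            self.staged[queued_name] = queued_size
            self.used_gb += queued_size


cache = StagingCache(capacity_gb=100)
first = cache.submit("vm-a.img", 60)    # fits in the empty cache
second = cache.submit("vm-b.img", 50)   # 50 > 40 remaining, so it queues
cache.flush("vm-a.img")                 # frees 60 GB; vm-b.img is admitted
```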
4. The system of claim 1, wherein: the backup virtual machine emergency starting module is specifically used for:
positioning a required backup image file, and mounting the required backup image file to a virtualization server in an emergency recovery platform matched with a backup source in a Network File System (NFS) read-only mode, so that the required backup image file is used as an NFS data source of the virtualization server;
and creating a snapshot in the production storage by using a virtualized snapshot mechanism to store newly written data, wherein a backup image file in the emergency recovery platform is kept unchanged as a master file of the virtual machine.
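The start-without-restore behaviour of claim 4 can be pictured as a read-only base plus a writable snapshot. The sketch below is an illustration under assumptions: the disk is modelled as a dictionary of blocks, `MappingProxyType` stands in for the read-only NFS mount of the backup image, and the class name `InstantStartDisk` is hypothetical. It does not use any hypervisor API.

```python
from types import MappingProxyType


class InstantStartDisk:
    """Hypothetical block-level view of a VM disk started directly from a backup image."""

    def __init__(self, backup_blocks):
        # Read-only view, standing in for the NFS read-only mount of the backup image.
        self.base = MappingProxyType(backup_blocks)
        # Snapshot kept in production storage; it receives every new write.
        self.snapshot = {}

    def write(self, block_no, data):
        self.snapshot[block_no] = data

    def read(self, block_no):
        # Newest data wins: check the snapshot first, then fall back to the backup image.
        if block_no in self.snapshot:
            return self.snapshot[block_no]
        return self.base[block_no]


backup_image = {0: b"boot", 1: b"data-v1"}
disk = InstantStartDisk(backup_image)
disk.write(1, b"data-v2")
```

Because writes never reach the base, the backup image stays usable as a clean restore point while the virtual machine is already running from it.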
5. A backup mirror image management method for starting a virtual machine without recovery after backup is characterized by comprising the following steps:
(I) dividing the storage space of the whole emergency recovery platform into a cache space and an elastic resource cache space;
(II) setting a backup image writing order, dividing the elastic resource cache space into recovery areas, setting a backup strategy, setting an elastic resource cache space rule, and checking the backup image retention status;
(III) updating an elastic resource cache space rule file and an elastic resource cache space record file, and performing elastic resource cache space checking and elastic resource cache space access;
(IV) generating a snapshot during backup through a virtualized snapshot backup interface, the snapshot storing newly written data while the master disk file of the virtual machine is kept unchanged, and deleting the snapshot after the backup is finished, so that the data in the snapshot is written back into the master disk file;
(V) transmitting the backup image file of the virtual machine to the emergency recovery platform, controlling the emergency recovery platform to write the backup image file into a cache, and determining how the backup image file is stored according to the elastic resource cache space rule;
(VI) when the virtual machine fails, directly starting the virtual machine for use from the latest backup image file reserved in the emergency recovery area of the elastic resource cache space.
6. The method of claim 5, wherein the setting of a backup image writing order, the dividing of the elastic resource cache space into recovery areas, the setting of a backup strategy, the setting of an elastic resource cache space rule and the checking of the backup image retention status specifically comprise the following steps:
(1) Setting all backup images to be written into the cache of the emergency recovery platform first;
(2) Dividing the elastic resource cache space into an emergency recovery area and a traditional recovery area, without allocating space to either area in advance;
(3) Setting a backup strategy: assigning backup image files or backup image file groups to different backup strategies, where different backup strategies set different backup frequencies and retention periods for the backup image files;
(4) Setting an elastic resource cache space rule: separately limiting the space size and the number of reserved copies of the backup image files or backup image file groups in the emergency recovery area and in the traditional recovery area, where the emergency recovery area reserves only the most recent several backup images, historical backup images are migrated into the traditional recovery area, and after an image file expires its space is released and handed back to the elastic resource cache space;
(5) Regularly checking whether each backup image file has reached its required retention time and, if so, clearing the expired backup image file.
7. The method of claim 5, wherein the updating of the elastic resource cache space rule, the elastic resource cache space record file, the elastic resource cache space check and the elastic resource cache space access specifically comprises the following steps:
(1) Executing backup: the backup management system executes a virtualization backup process;
(2) Comparing the size of the backup image file with the size of the cache: if the backup image file is smaller than or equal to the size of the cache, performing step (3); otherwise, queuing and waiting, and returning to step (1);
(3) Writing the backup image file into a cache: the backup mirror image file is written into the cache;
(4) Judging the elastic resource cache space: judging according to the backup storage parameters and the elastic resource cache space rule; if COPY2 is less than or equal to COPY1 and SIZE2 is less than or equal to SIZE1, executing step (5); if either condition is not met, executing step (a);
wherein, COPY1 is the number of mirror image files which should be reserved in the emergency recovery area defined in the elastic resource cache space rule file; COPY2 is the number of image files reserved in the emergency recovery area recorded in the elastic resource cache space recording file; SIZE1 is the SIZE of the space which can be occupied by the emergency recovery area defined in the elastic resource cache space rule file; SIZE2 is the SIZE of the occupied space of the emergency recovery area recorded in the elastic resource cache space recording file;
(a) Migrating the oldest mirror image file in the emergency recovery area to the traditional recovery area, and executing the step (b);
(b) Judging the elastic resource cache space: judging according to the backup storage parameters and the elastic resource cache space rule; if COPY4 is less than or equal to COPY3 and SIZE4 is less than or equal to SIZE3, executing step (c); if either condition is not met, raising a retention time limit alarm and returning to step (1);
wherein, COPY3 is the number of mirror image files that should be reserved in the traditional recovery area defined in the elastic resource cache space rule file; COPY4 is the number of image files reserved in the conventional recovery area recorded in the elastic resource cache space recording file; SIZE3 is the SIZE of the space that can be occupied by the traditional recovery area defined in the elastic resource cache space rule file; SIZE4 is the SIZE of the occupied space of the traditional recovery area recorded in the elastic resource cache space recording file;
(c) Judging the size of the elastic resource cache space: comparing the size of the elastic resource cache space with the size of the oldest mirror image of the emergency recovery area, if the size of the oldest mirror image file of the emergency recovery area is smaller than or equal to the remaining size of the elastic resource cache space, executing the step (d), otherwise, indicating that the cache space is insufficient, giving an alarm, and returning to the step (1);
(d) Dividing an elastic resource cache space: dividing a space with the size of the oldest mirror image file in the emergency recovery area in the elastic resource cache space, handing the space to the traditional recovery area, and continuing to execute the step (e);
(e) Migrating the oldest mirror image file in the emergency recovery area to a traditional recovery area in the elastic resource cache space, releasing the space to the elastic resource cache space, and continuing to execute the step (5);
(5) Judging the size of the elastic resource cache space: comparing the size of the elastic resource cache space with the size of the mirror image file in the cache, if the size of the mirror image file in the cache is smaller than or equal to the remaining size of the elastic resource cache space, executing the step (6), otherwise, indicating that the cache space is insufficient, giving an alarm, and returning to the step (1);
(6) Dividing an elastic resource cache space: dividing a space with the size of the mirror image file in the cache in the elastic resource cache space, delivering the space to an emergency recovery area, and continuously executing the step (7);
(7) Writing the image file in the cache into the emergency recovery area.
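Steps (4) to (7) and (a) to (e) of claim 7 can be traced with a short sketch. It assumes the rule check passes when COPY2 ≤ COPY1 and SIZE2 ≤ SIZE1, as stated in claims 3 and 8, and it follows the steps literally: the check runs before the new image is written, and at most one image is migrated per backup. The names `Area`, `ElasticPool` and `place`, the status strings and the sizes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    """One recovery area: rule-file limits plus the recorded state."""
    max_copies: int                               # COPY1 or COPY3 (rule file)
    max_size: int                                 # SIZE1 or SIZE3 (rule file)
    images: list = field(default_factory=list)    # (name, size), oldest first

    @property
    def copies(self):                             # COPY2 or COPY4 (record file)
        return len(self.images)

    @property
    def size(self):                               # SIZE2 or SIZE4 (record file)
        return sum(size for _, size in self.images)

    def within_rule(self):
        return self.copies <= self.max_copies and self.size <= self.max_size


@dataclass
class ElasticPool:
    free: int            # unallocated elastic resource cache space
    emergency: Area
    traditional: Area


def place(pool, name, size):
    """Place one cached backup image and return a status string."""
    if not pool.emergency.within_rule():               # step (4) not met
        oldest = pool.emergency.images[0]              # step (a): pick the oldest image
        if not pool.traditional.within_rule():         # step (b)
            return "alarm: retention limit"
        if oldest[1] > pool.free:                      # step (c)
            return "alarm: insufficient space"
        pool.free -= oldest[1]                         # step (d): carve space for it
        pool.traditional.images.append(oldest)         # step (e): migrate the image
        pool.emergency.images.pop(0)
        pool.free += oldest[1]                         # release the emergency-area space
    if size > pool.free:                               # step (5)
        return "alarm: insufficient space"
    pool.free -= size                                  # step (6): carve space
    pool.emergency.images.append((name, size))         # step (7): write the image
    return "stored"


pool = ElasticPool(free=150,
                   emergency=Area(max_copies=2, max_size=80),
                   traditional=Area(max_copies=5, max_size=200))
statuses = [place(pool, name, 30) for name in ("a", "b", "c", "d")]
```

Under this literal reading the emergency recovery area can briefly hold one image more than COPY1, because the limit is tested before the write; the next backup then triggers the migration of the oldest image.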
8. The method of claim 5, wherein: the elastic resource cache space rule comprises the following steps: the reserved number of the backup image files in the emergency recovery area and the traditional recovery area in the elastic resource cache space recording file is less than or equal to the number of the backup image files set in the elastic resource cache space rule file; the occupied space of the emergency recovery area and the traditional recovery area in the elastic resource cache space recording file is less than or equal to the occupied space set in the elastic resource cache space rule file.
9. The method of claim 5, wherein: when the virtual machine fails, directly opening and using the virtual machine by using the latest backup image file reserved in the emergency recovery area in the elastic resource cache space, specifically comprising:
(1) Positioning a required backup image file, and mounting the backup image file to a virtualization server in an emergency recovery platform matched with a backup source in a Network File System (NFS) read-only mode, so that the backup image file is used as an NFS data source of the virtualization server;
(2) A snapshot, namely a recovery log, is created in production storage by using a virtualized snapshot mechanism and is used for storing newly written data, and a backup image file in an emergency recovery platform is kept unchanged as a master file of a virtual machine, so that the virtual machine can be opened.
10. The method of claim 5, wherein the method further comprises:
and copying the backup image file to production storage by using a virtualization online storage migration technology, unloading the network file system NFS of the backup image file after migration is completed, synchronizing newly written data in a recovery log to a virtual machine master disc in the production storage, and completing solidification.
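The solidification step of claim 10 amounts to copying the backup image into production storage and then applying the recovery log on top of it. The sketch below models both as dictionaries of blocks; the function name `solidify` and the block contents are hypothetical, and the plain copy only stands in for the online storage migration, which the patent leaves to the virtualization platform.

```python
def solidify(backup_image, recovery_log):
    """Return the production master disk: a copy of the backup image with the log applied."""
    production_disk = dict(backup_image)     # stands in for online storage migration
    production_disk.update(recovery_log)     # synchronize the newly written blocks
    recovery_log.clear()                     # the log is empty once it has been merged
    return production_disk


backup_image = {0: b"boot", 1: b"data-v1"}
recovery_log = {1: b"data-v2", 2: b"new"}
production = solidify(backup_image, recovery_log)
```

The backup image itself is left untouched, so it can be unmounted afterwards and kept in the recovery area as before.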
11. A backup image management device for starting a virtual machine without restoring after backup, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of any one of claims 5-10.
12. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 5-10.
CN201711494862.4A 2017-12-31 2017-12-31 Backup image management system, method, device and medium Active CN109992449B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711494862.4A CN109992449B (en) 2017-12-31 2017-12-31 Backup image management system, method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711494862.4A CN109992449B (en) 2017-12-31 2017-12-31 Backup image management system, method, device and medium

Publications (2)

Publication Number Publication Date
CN109992449A CN109992449A (en) 2019-07-09
CN109992449B true CN109992449B (en) 2023-04-11

Family

ID=67110758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711494862.4A Active CN109992449B (en) 2017-12-31 2017-12-31 Backup image management system, method, device and medium

Country Status (1)

Country Link
CN (1) CN109992449B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115023693A (en) * 2020-05-29 2022-09-06 深圳市欢太科技有限公司 Mirror image updating method and device, electronic equipment and storage medium
CN113010474B (en) * 2021-03-16 2023-10-24 中国联合网络通信集团有限公司 File management method, instant messaging method and storage server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678036A (en) * 2013-11-15 2014-03-26 上海爱数软件有限公司 Backup method based on virtual machine operation information data finding
CN103902407A (en) * 2012-12-31 2014-07-02 华为技术有限公司 Virtual machine recovery method and server
CN104407938A (en) * 2014-11-21 2015-03-11 上海爱数软件有限公司 Recovery method for various granularities after mirror-image-level backup of virtual machine
CN106681858A (en) * 2015-11-10 2017-05-17 中国电信股份有限公司 Virtual machine data disaster tolerance method and management device

Also Published As

Publication number Publication date
CN109992449A (en) 2019-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant